From: Peter Zijlstra <peterz@infradead.org>
To: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>, Kan Liang <kan.liang@intel.com>,
Dmitri Prokhorov <Dmitry.Prohorov@intel.com>,
Valery Cherepennikov <valery.cherepennikov@intel.com>,
Mark Rutland <mark.rutland@arm.com>,
Stephane Eranian <eranian@google.com>,
David Carrillo-Cisneros <davidcc@google.com>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v6 1/3] perf/core: use rb trees for pinned/flexible groups
Date: Fri, 4 Aug 2017 16:36:28 +0200
Message-ID: <20170804143628.34c2xqxl2e6k2arj@hirez.programming.kicks-ass.net>
In-Reply-To: <86cbe0b0-a1ec-4d5f-addc-87bccf2e97d7@linux.intel.com>
On Thu, Aug 03, 2017 at 11:30:09PM +0300, Alexey Budankov wrote:
> On 03.08.2017 16:00, Peter Zijlstra wrote:
> > On Wed, Aug 02, 2017 at 11:13:54AM +0300, Alexey Budankov wrote:
> >> +/*
> >> + * Find group list by a cpu key and rotate it.
> >> + */
> >> +static void
> >> +perf_event_groups_rotate(struct rb_root *groups, int cpu)
> >> +{
> > >> +	struct rb_node *node;
> > >> +	struct perf_event *node_event;
> > >> +
> > >> +	node = groups->rb_node;
> > >> +
> > >> +	while (node) {
> > >> +		node_event = container_of(node,
> > >> +				struct perf_event, group_node);
> > >> +
> > >> +		if (cpu < node_event->cpu) {
> > >> +			node = node->rb_left;
> > >> +		} else if (cpu > node_event->cpu) {
> > >> +			node = node->rb_right;
> > >> +		} else {
> > >> +			list_rotate_left(&node_event->group_list);
> > >> +			break;
> > >> +		}
> > >> +	}
> >> +}
> >
> > Ah, you worry about how to rotate inside a tree?
>
> Exactly.
>
> >
> > You can do that by adding (run)time based ordering, and you'll end up
> > with a runtime based scheduler.
>
> Do you mean replacing a CPU indexed rb_tree of lists with
> a CPU indexed rb_tree of counter indexed rb_trees?

No, a single tree, just with more complicated ordering rules.

> > A trivial variant keeps a simple counter per tree that is incremented
> > for each rotation. That should end up with the events ordered exactly
> > like the list. And if you have that comparator like above, expressing
> > that additional ordering becomes simple ;-)
> >
> > Something like:
> >
> > > struct group {
> > > 	u64 vtime;
> > > 	struct rb_root tree;
> > > };
> > >
> > > bool event_less(left, right)
> > > {
> > > 	if (left->cpu < right->cpu)
> > > 		return true;
> > >
> > > 	if (left->cpu > right->cpu)
> > > 		return false;
> > >
> > > 	if (left->vtime < right->vtime)
> > > 		return true;
> > >
> > > 	return false;
> > > }
> > >
> > > insert_group(group, event, tail)
> > > {
> > > 	if (tail)
> > > 		event->vtime = ++group->vtime;
> > >
> > > 	tree_insert(&group->tree, event);
> > > }
> >
> > Then every time you use insert_group(.tail=1) it goes to the end of that
> > CPU's 'list'.
> >
>
> Could you elaborate more on how to implement rotation?

It's almost all there, but let me write a complete replacement for your
perf_event_groups_rotate() above.

/*
 * Find the leftmost event matching @cpu.
 *
 * XXX not sure how to best parametrise a subtree search,
 * again, C sucks...
 */
struct perf_event *__group_find_cpu(struct group *group, int cpu)
{
	struct rb_node *node = group->tree.rb_node;
	struct perf_event *event, *match = NULL;

	while (node) {
		event = container_of(node, struct perf_event, group_node);

		if (cpu > event->cpu) {
			node = node->rb_right;
		} else if (cpu < event->cpu) {
			node = node->rb_left;
		} else {
			/*
			 * subtree match, try left subtree for a
			 * 'smaller' match.
			 */
			match = event;
			node = node->rb_left;
		}
	}

	return match;
}

void perf_event_group_rotate(struct group *group, int cpu)
{
	struct perf_event *event = __group_find_cpu(group, cpu);

	if (!event)
		return;

	tree_delete(&group->tree, event);
	insert_group(group, event, 1);
}

So we have a tree ordered by {cpu,vtime} and what we do is find the
leftmost {cpu} entry, that is the smallest vtime entry for that cpu. We
then take it out and re-insert it with a vtime number larger than any
other, which places it as the rightmost entry for that cpu.

So given:

            {1,1}
           /     \
       {0,5}     {1,2}
       /   \         \
   {0,1}   {0,6}     {1,4}

__group_find_cpu(.cpu=1) will return {1,1} as being the leftmost entry
with cpu=1. We'll then remove it, update its vtime to 7 and re-insert,
resulting in something like:

            {1,2}
           /     \
       {0,5}     {1,4}
       /   \         \
   {0,1}   {0,6}     {1,7}
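
For experimenting with the rotation outside the kernel, here is a minimal
userspace sketch. It assumes a plain unbalanced binary search tree in place
of the kernel rbtree and unique {cpu,vtime} keys. The names struct event,
tree_insert(), tree_delete(), group_find_cpu() and group_rotate() are
hypothetical stand-ins that only mirror the pseudo-code above; this is not
the actual perf code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; the kernel would use rb_root/rb_node. */
struct event {
	int cpu;
	uint64_t vtime;
	struct event *left, *right;
};

struct group {
	uint64_t vtime;		/* largest vtime handed out so far */
	struct event *root;
};

/* Order by cpu first, then by vtime; keys are assumed to be unique. */
static int event_less(const struct event *l, const struct event *r)
{
	if (l->cpu != r->cpu)
		return l->cpu < r->cpu;
	return l->vtime < r->vtime;
}

static void tree_insert(struct group *g, struct event *e)
{
	struct event **link = &g->root;

	e->left = e->right = NULL;
	while (*link)
		link = event_less(e, *link) ? &(*link)->left : &(*link)->right;
	*link = e;
}

/* Unlink @e, which must be in the tree, from a plain (unbalanced) BST. */
static void tree_delete(struct group *g, struct event *e)
{
	struct event **link = &g->root, **succ, *s;

	while (*link != e)
		link = event_less(e, *link) ? &(*link)->left : &(*link)->right;

	if (!e->left || !e->right) {
		*link = e->left ? e->left : e->right;
		return;
	}
	/* Two children: replace @e with its in-order successor. */
	for (succ = &e->right; (*succ)->left; succ = &(*succ)->left)
		;
	s = *succ;
	*succ = s->right;
	s->left = e->left;
	s->right = e->right;
	*link = s;
}

/* Leftmost event for @cpu, i.e. the one with the smallest vtime. */
static struct event *group_find_cpu(struct group *g, int cpu)
{
	struct event *node = g->root, *match = NULL;

	while (node) {
		if (cpu > node->cpu) {
			node = node->right;
		} else if (cpu < node->cpu) {
			node = node->left;
		} else {
			match = node;	/* keep looking for a smaller match */
			node = node->left;
		}
	}
	return match;
}

/* Move the head of @cpu's 'list' to its tail. */
static void group_rotate(struct group *g, int cpu)
{
	struct event *e = group_find_cpu(g, cpu);

	if (!e)
		return;
	tree_delete(g, e);
	e->vtime = ++g->vtime;
	tree_insert(g, e);
}
```

Inserting {1,1}, {0,5}, {1,2}, {0,1}, {0,6}, {1,4} in that order, with the
group's vtime counter set to 6, builds the first tree shown above. A call to
group_rotate(&g, 1) should then leave {1,2} at the root and {1,7} as the
rightmost cpu=1 entry, matching the second tree.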