linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ian Rogers <irogers@google.com>
To: Riccardo Mancini <rickyman7@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Mark Rutland <mark.rutland@arm.com>, Jiri Olsa <jolsa@redhat.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Andi Kleen <ak@linux.intel.com>
Subject: Re: [RFC PATCH v1 03/37] perf evlist: replace evsel__cpu_iter* functions with evsel__find_cpu
Date: Fri, 10 Dec 2021 16:20:52 -0800	[thread overview]
Message-ID: <CAP-5=fV4aFHf1K+wuqJ19W1GPsj4fnO-EznLf=wL1z7nKf_tqg@mail.gmail.com> (raw)
In-Reply-To: <5bfe13dd0985611285bb697987816ddc36e93e76.1629490974.git.rickyman7@gmail.com>

On Sat, Aug 21, 2021 at 2:19 AM Riccardo Mancini <rickyman7@gmail.com> wrote:
>
> Commit a8cbe40fe9f4ba49 ("perf evsel: Add iterator to iterate over events
> ordered by CPU") introduced an iterator for evsel.core.cpus inside evsel
> which is used to iterate over evlist one CPU at a time.
>
> However, this solution is hacky since it involves a mutable state in the
> evsel which is supposed to be unmodified during iteration.
> Sice checking that a CPU is within the evsel can be done quickly in
> O(logn) time, this patch replaces the aforementioned iterator with a
> simple check.

I like this change. It ties in with the CPU vs index refactoring I'm
doing and expanding upon in:
https://lore.kernel.org/lkml/20211208024607.1784932-1-irogers@google.com/
The O(log(n)) look up of the CPU from the CPU map I think can be
improved, but it is good enough for now. We can improve upon it by
recognizing that the CPU we're looking for in the CPU map is 'cpu /
num_cpus' into the map - ie. don't binary search from the middle, but
start from somewhere near where the CPU will be. Something like
interpolation search [1] would do this in O(log(log(n)) time, but
because CPU maps are so regular in layout I'd expect it would be O(1)
for us.

That said, I think the right thing to do here is to introduce a new
struct which would behave like an STL style iterator. STL style
iterators are returned by begin and end functions. They implement a
next() routine as operator ++. Termination of the loop happens when
the iterator equals the end value. The current evlist affinity
iteration needs two loops. The outer loop is iterating over CPUs and
the caller needs to set affinity to the CPU. An inner loop iterates
over the evlist again, skipping until an appropriate evsel is
encountered for the outer loop's CPU. I think it would be better if we
had a single loop, and this loop is going to maintain two pieces of
state in the iterator, the current CPU and evsel. The loop start, next
and end will change the CPU affinity. The next routine will iterate
over every evsel for a CPU and then advance to the next CPU and so on
until all CPUs and the evsels are done. For something like:
$ perf stat -a -e cycles,power/energy-pkg/ -I 1000
The cpumasks are going to be all CPUs for cycles, and on my machine 0
and 18 for energy-pkg. So the iterator will be on CPU 0 with an evsel
for cycles, then for energy-pkg, next for CPU 1 it will have an evsel
for cycles, and so on until CPU 18. For CPU 18 it will be like CPU.0
and have the evsel for cycles and energy-pkg. For CPU 19 it is just
cycles again until you get to CPU 35.

In the API it'd be something like:

struct evlist_cpu_iterator {
  int cpu;
  struct evsel *evsel;
  struct evlist *container;
  struct affinity saved_affinity;
};

struct evlist_cpu_iterator evlist__cpu_iter_start(struct evlist *evlist);
void evlist_cpu_iterator__next(struct evlist_cpu_iterator *itr);
bool evlist_cpu_iterator__end(const struct evlist_cpu_iterator *itr);

The advantage of this API is that we know the CPU and evsel without a
lookup (as with the current code), the affinity is hidden in the
abstraction and we get rid of the global iterator state from evsel (as
with this patch) that makes parallelization impossible.

I'm working to add something like this into the cpumap refactoring
effort. That isn't looking to do parallelization but merely to get
correctness, as at the moment CPUs and CPU map indices are often
confused leading to crashes or incorrect behavior. I'm aware that this
will mean rebasing this patch series, but I think that will be easier
as the confusion over CPU and index will be ironed out in my changes.

Thanks,
Ian

[1] https://en.wikipedia.org/wiki/Interpolation_search


> Cc: Andi Kleen <ak@linux.intel.com>
> Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
> ---
>  tools/perf/builtin-stat.c | 24 +++++++++--------
>  tools/perf/util/evlist.c  | 54 +++++++++++----------------------------
>  tools/perf/util/evlist.h  |  5 +---
>  tools/perf/util/evsel.h   |  1 -
>  4 files changed, 30 insertions(+), 54 deletions(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 84de61795e67bbb9..62dcc001c8f0a78b 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -407,7 +407,7 @@ static int read_affinity_counters(struct timespec *rs)
>  {
>         struct evsel *counter;
>         struct affinity affinity;
> -       int i, ncpus, cpu;
> +       int i, ncpus, cpu, cpu_idx;
>
>         if (all_counters_use_bpf)
>                 return 0;
> @@ -424,13 +424,14 @@ static int read_affinity_counters(struct timespec *rs)
>                 affinity__set(&affinity, cpu);
>
>                 evlist__for_each_entry(evsel_list, counter) {
> -                       if (evsel__cpu_iter_skip(counter, cpu))
> +                       cpu_idx = evsel__find_cpu(counter, cpu);
> +                       if (cpu_idx < 0)
>                                 continue;
>                         if (evsel__is_bpf(counter))
>                                 continue;
>                         if (!counter->err) {
>                                 counter->err = read_counter_cpu(counter, rs,
> -                                                               counter->cpu_iter - 1);
> +                                                               cpu_idx);
>                         }
>                 }
>         }
> @@ -789,7 +790,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>         const bool forks = (argc > 0);
>         bool is_pipe = STAT_RECORD ? perf_stat.data.is_pipe : false;
>         struct affinity affinity;
> -       int i, cpu, err;
> +       int i, cpu, cpu_idx, err;
>         bool second_pass = false;
>
>         if (forks) {
> @@ -823,7 +824,8 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>                 affinity__set(&affinity, cpu);
>
>                 evlist__for_each_entry(evsel_list, counter) {
> -                       if (evsel__cpu_iter_skip(counter, cpu))
> +                       cpu_idx = evsel__find_cpu(counter, cpu);
> +                       if (cpu_idx < 0)
>                                 continue;
>                         if (counter->reset_group || counter->errored)
>                                 continue;
> @@ -831,7 +833,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>                                 continue;
>  try_again:
>                         if (create_perf_stat_counter(counter, &stat_config, &target,
> -                                                    counter->cpu_iter - 1) < 0) {
> +                                                    cpu_idx) < 0) {
>
>                                 /*
>                                  * Weak group failed. We cannot just undo this here
> @@ -877,22 +879,24 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
>                         evlist__for_each_entry(evsel_list, counter) {
>                                 if (!counter->reset_group && !counter->errored)
>                                         continue;
> -                               if (evsel__cpu_iter_skip_no_inc(counter, cpu))
> +                               cpu_idx = evsel__find_cpu(counter, cpu);
> +                               if (cpu_idx < 0)
>                                         continue;
> -                               perf_evsel__close_cpu(&counter->core, counter->cpu_iter);
> +                               perf_evsel__close_cpu(&counter->core, cpu_idx);
>                         }
>                         /* Now reopen weak */
>                         evlist__for_each_entry(evsel_list, counter) {
>                                 if (!counter->reset_group && !counter->errored)
>                                         continue;
> -                               if (evsel__cpu_iter_skip(counter, cpu))
> +                               cpu_idx = evsel__find_cpu(counter, cpu);
> +                               if (cpu_idx < 0)
>                                         continue;
>                                 if (!counter->reset_group)
>                                         continue;
>  try_again_reset:
>                                 pr_debug2("reopening weak %s\n", evsel__name(counter));
>                                 if (create_perf_stat_counter(counter, &stat_config, &target,
> -                                                            counter->cpu_iter - 1) < 0) {
> +                                                            cpu_idx) < 0) {
>
>                                         switch (stat_handle_error(counter)) {
>                                         case COUNTER_FATAL:
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 47581a237c7a7848..3d55d9a53b9f4875 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -341,36 +341,9 @@ static int evlist__nr_threads(struct evlist *evlist, struct evsel *evsel)
>                 return perf_thread_map__nr(evlist->core.threads);
>  }
>
> -void evlist__cpu_iter_start(struct evlist *evlist)
> +int evsel__find_cpu(struct evsel *ev, int cpu)
>  {
> -       struct evsel *pos;
> -
> -       /*
> -        * Reset the per evsel cpu_iter. This is needed because
> -        * each evsel's cpumap may have a different index space,
> -        * and some operations need the index to modify
> -        * the FD xyarray (e.g. open, close)
> -        */
> -       evlist__for_each_entry(evlist, pos)
> -               pos->cpu_iter = 0;
> -}
> -
> -bool evsel__cpu_iter_skip_no_inc(struct evsel *ev, int cpu)
> -{
> -       if (ev->cpu_iter >= ev->core.cpus->nr)
> -               return true;
> -       if (cpu >= 0 && ev->core.cpus->map[ev->cpu_iter] != cpu)
> -               return true;
> -       return false;
> -}
> -
> -bool evsel__cpu_iter_skip(struct evsel *ev, int cpu)
> -{
> -       if (!evsel__cpu_iter_skip_no_inc(ev, cpu)) {
> -               ev->cpu_iter++;
> -               return false;
> -       }
> -       return true;
> +       return perf_cpu_map__idx(ev->core.cpus, cpu);
>  }
>
>  static int evsel__strcmp(struct evsel *pos, char *evsel_name)
> @@ -400,7 +373,7 @@ static void __evlist__disable(struct evlist *evlist, char *evsel_name)
>  {
>         struct evsel *pos;
>         struct affinity affinity;
> -       int cpu, i, imm = 0;
> +       int cpu, i, imm = 0, cpu_idx;
>         bool has_imm = false;
>
>         if (affinity__setup(&affinity) < 0)
> @@ -414,7 +387,8 @@ static void __evlist__disable(struct evlist *evlist, char *evsel_name)
>                         evlist__for_each_entry(evlist, pos) {
>                                 if (evsel__strcmp(pos, evsel_name))
>                                         continue;
> -                               if (evsel__cpu_iter_skip(pos, cpu))
> +                               cpu_idx = evsel__find_cpu(pos, cpu);
> +                               if (cpu_idx < 0)
>                                         continue;
>                                 if (pos->disabled || !evsel__is_group_leader(pos) || !pos->core.fd)
>                                         continue;
> @@ -422,7 +396,7 @@ static void __evlist__disable(struct evlist *evlist, char *evsel_name)
>                                         has_imm = true;
>                                 if (pos->immediate != imm)
>                                         continue;
> -                               evsel__disable_cpu(pos, pos->cpu_iter - 1);
> +                               evsel__disable_cpu(pos, cpu_idx);
>                         }
>                 }
>                 if (!has_imm)
> @@ -462,7 +436,7 @@ static void __evlist__enable(struct evlist *evlist, char *evsel_name)
>  {
>         struct evsel *pos;
>         struct affinity affinity;
> -       int cpu, i;
> +       int cpu, i, cpu_idx;
>
>         if (affinity__setup(&affinity) < 0)
>                 return;
> @@ -473,11 +447,12 @@ static void __evlist__enable(struct evlist *evlist, char *evsel_name)
>                 evlist__for_each_entry(evlist, pos) {
>                         if (evsel__strcmp(pos, evsel_name))
>                                 continue;
> -                       if (evsel__cpu_iter_skip(pos, cpu))
> +                       cpu_idx = evsel__find_cpu(pos, cpu);
> +                       if (cpu_idx < 0)
>                                 continue;
>                         if (!evsel__is_group_leader(pos) || !pos->core.fd)
>                                 continue;
> -                       evsel__enable_cpu(pos, pos->cpu_iter - 1);
> +                       evsel__enable_cpu(pos, cpu_idx);
>                 }
>         }
>         affinity__cleanup(&affinity);
> @@ -1264,7 +1239,7 @@ void evlist__close(struct evlist *evlist)
>  {
>         struct evsel *evsel;
>         struct affinity affinity;
> -       int cpu, i;
> +       int cpu, i, cpu_idx;
>
>         /*
>          * With perf record core.cpus is usually NULL.
> @@ -1282,9 +1257,10 @@ void evlist__close(struct evlist *evlist)
>                 affinity__set(&affinity, cpu);
>
>                 evlist__for_each_entry_reverse(evlist, evsel) {
> -                       if (evsel__cpu_iter_skip(evsel, cpu))
> -                           continue;
> -                       perf_evsel__close_cpu(&evsel->core, evsel->cpu_iter - 1);
> +                       cpu_idx = evsel__find_cpu(evsel, cpu);
> +                       if (cpu_idx < 0)
> +                               continue;
> +                       perf_evsel__close_cpu(&evsel->core, cpu_idx);
>                 }
>         }
>         affinity__cleanup(&affinity);
> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
> index 5c22383489ae4905..fde893170c7ba6d2 100644
> --- a/tools/perf/util/evlist.h
> +++ b/tools/perf/util/evlist.h
> @@ -310,15 +310,12 @@ void evlist__to_front(struct evlist *evlist, struct evsel *move_evsel);
>         __evlist__for_each_entry_safe(&(evlist)->core.entries, tmp, evsel)
>
>  #define evlist__for_each_cpu(evlist, index, cpu)       \
> -       evlist__cpu_iter_start(evlist);                 \
>         perf_cpu_map__for_each_cpu (cpu, index, (evlist)->core.all_cpus)
>
>  struct evsel *evlist__get_tracking_event(struct evlist *evlist);
>  void evlist__set_tracking_event(struct evlist *evlist, struct evsel *tracking_evsel);
>
> -void evlist__cpu_iter_start(struct evlist *evlist);
> -bool evsel__cpu_iter_skip(struct evsel *ev, int cpu);
> -bool evsel__cpu_iter_skip_no_inc(struct evsel *ev, int cpu);
> +int evsel__find_cpu(struct evsel *ev, int cpu);
>
>  struct evsel *evlist__find_evsel_by_str(struct evlist *evlist, const char *str);
>
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 80383096d51c5f00..eabccce406886320 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -119,7 +119,6 @@ struct evsel {
>         bool                    errored;
>         struct hashmap          *per_pkg_mask;
>         int                     err;
> -       int                     cpu_iter;
>         struct {
>                 evsel__sb_cb_t  *cb;
>                 void            *data;
> --
> 2.31.1
>

  parent reply	other threads:[~2021-12-11  0:21 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-21  9:19 [RFC PATCH v1 00/37] perf: use workqueue for evlist operations Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 01/37] libperf cpumap: improve idx function Riccardo Mancini
2021-08-31 18:46   ` Arnaldo Carvalho de Melo
2021-10-08 14:29   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 02/37] libperf cpumap: improve max function Riccardo Mancini
2021-08-31 18:47   ` Arnaldo Carvalho de Melo
2021-08-31 19:16     ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 03/37] perf evlist: replace evsel__cpu_iter* functions with evsel__find_cpu Riccardo Mancini
2021-10-08 14:38   ` [RFC PATCH v1 03/37] perf evlist: replace evsel__cpu_iter* functions with evsel__find_cpu() Arnaldo Carvalho de Melo
2021-12-11  0:20   ` Ian Rogers [this message]
2021-08-21  9:19 ` [RFC PATCH v1 04/37] perf util: add mmap_cpu_mask__duplicate function Riccardo Mancini
2021-08-31 19:21   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 05/37] perf util/mmap: add missing bitops.h header Riccardo Mancini
2021-08-31 19:22   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 06/37] perf workqueue: add affinities to threadpool Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 07/37] perf workqueue: add support for setting affinities to workers Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 08/37] perf workqueue: add method to execute work on specific CPU Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 09/37] perf python: add workqueue dependency Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 10/37] perf evlist: add multithreading helper Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 11/37] perf evlist: add multithreading to evlist__disable Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 12/37] perf evlist: add multithreading to evlist__enable Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 13/37] perf evlist: add multithreading to evlist__close Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 14/37] perf evsel: remove retry_sample_id goto label Riccardo Mancini
2021-08-31 19:25   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 15/37] perf evsel: separate open preparation from open itself Riccardo Mancini
2021-08-31 19:27   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 16/37] perf evsel: save open flags in evsel Riccardo Mancini
2021-08-31 19:31   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 17/37] perf evsel: separate missing feature disabling from evsel__open_cpu Riccardo Mancini
2021-08-31 19:35   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 18/37] perf evsel: add evsel__prepare_open function Riccardo Mancini
2021-08-31 19:36   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 19/37] perf evsel: separate missing feature detection from evsel__open_cpu Riccardo Mancini
2021-08-31 19:39   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 20/37] perf evsel: separate rlimit increase " Riccardo Mancini
2021-08-31 19:41   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 21/37] perf evsel: move ignore_missing_thread to fallback code Riccardo Mancini
2021-08-31 19:44   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 22/37] perf evsel: move test_attr__open to success path in evsel__open_cpu Riccardo Mancini
2021-08-31 19:47   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 23/37] perf evsel: move bpf_counter__install_pe " Riccardo Mancini
2021-08-31 19:50   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 24/37] perf evsel: handle precise_ip fallback " Riccardo Mancini
2021-08-31 19:52   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 25/37] perf evsel: move event open in evsel__open_cpu to separate function Riccardo Mancini
2021-08-31 19:54   ` Arnaldo Carvalho de Melo
2021-09-03 21:52     ` Riccardo Mancini
2021-09-11 19:10       ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 26/37] perf evsel: add evsel__open_per_cpu_no_fallback function Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 27/37] perf evlist: add evlist__for_each_entry_from macro Riccardo Mancini
2021-08-31 20:06   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 28/37] perf evlist: add multithreading to evlist__open Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 29/37] perf evlist: add custom fallback " Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 30/37] perf record: use evlist__open_custom Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 31/37] tools lib/subcmd: add OPT_UINTEGER_OPTARG option type Riccardo Mancini
2021-08-31 18:44   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 32/37] perf record: add --threads option Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 33/37] perf record: pin threads to monitored cpus if enough threads available Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 34/37] perf record: apply multithreading in init and fini phases Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 35/37] perf test/evlist-open-close: add multithreading Riccardo Mancini
2021-08-21  9:19 ` [RFC PATCH v1 36/37] perf test/evlist-open-close: use inline func to convert timeval to usec Riccardo Mancini
2021-10-08 14:46   ` Arnaldo Carvalho de Melo
2021-08-21  9:19 ` [RFC PATCH v1 37/37] perf test/evlist-open-close: add detailed output mode Riccardo Mancini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAP-5=fV4aFHf1K+wuqJ19W1GPsj4fnO-EznLf=wL1z7nKf_tqg@mail.gmail.com' \
    --to=irogers@google.com \
    --cc=acme@kernel.org \
    --cc=ak@linux.intel.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rickyman7@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).