linux-kernel.vger.kernel.org archive mirror
From: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
To: Andi Kleen <andi@firstfloor.org>
Cc: jolsa@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Optimize perf stat for large number of events/cpus
Date: Wed, 27 Nov 2019 12:16:57 -0300
Message-ID: <20191127151657.GE22719@kernel.org>
In-Reply-To: <20191121001522.180827-1-andi@firstfloor.org>

On Wed, Nov 20, 2019 at 04:15:10PM -0800, Andi Kleen wrote:
> [v8: Address review feedback. Only changes one patch.]
> 
> This patch kit optimizes perf stat for a large number of events 
> on systems with many CPUs and PMUs.
> 
> Profiling shows that most of the overhead comes from sending IPIs to
> all the target CPUs. We can optimize this by using sched_setaffinity
> to set the affinity to a target CPU once and then doing
> the perf operation for all events on that CPU. This requires
> some restructuring, but cuts the setup time quite a bit.
> 
> In theory we could go further by parallelizing these setups
> too, but that would be much more complicated and for now just batching it
> per CPU seems to be sufficient. At some point with many more cores 
> parallelization or a better bulk perf setup API might be needed though.
> 
> In addition, perf does a lot of redundant /sys accesses with
> many PMUs, which can also be expensive. This is also optimized.
> 
> On a large test case (>700 events with many weak groups) on a 94 CPU
> system I go from
> 
> real	0m8.607s
> user	0m0.550s
> sys	0m8.041s
> 
> to 
> 
> real	0m3.269s
> user	0m0.760s
> sys	0m1.694s
> 
> so shaving ~6 seconds of system time, at slightly more cost
> in perf stat itself. On a 4 socket system the savings
> are more dramatic:
> 
> real	0m15.641s
> user	0m0.873s
> sys	0m14.729s
> 
> to 
> 
> real	0m4.493s
> user	0m1.578s
> sys	0m2.444s
> 
> so 11s difference in the user visible set up time.

Applied to my local perf/core branch, now undergoing test builds on all
the containers.

Thanks,

- Arnaldo
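
The per-CPU batching described in the quoted cover letter can be sketched
roughly as follows. This is a minimal illustration of the idea, not the code
from the patch series: the names for_each_cpu_batched, cpu_batch_fn and
count_call are hypothetical, and the callback stands in for "do the perf
operation for all events on that CPU". It assumes Linux with glibc and a
system whose CPU mask fits in a fixed-size cpu_set_t.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical callback: perform every per-event operation for one CPU. */
typedef void (*cpu_batch_fn)(int cpu, void *arg);

/*
 * Visit each CPU in the caller's original affinity mask once. Before calling
 * fn, pin the calling thread to that CPU so the work done in fn is local to
 * it. Returns the number of CPUs visited, or -1 on error.
 */
static int for_each_cpu_batched(cpu_batch_fn fn, void *arg)
{
	cpu_set_t saved, one;
	int cpu, visited = 0;

	assert(fn != NULL);

	/* Save the current affinity so it can be restored afterwards. */
	if (sched_getaffinity(0, sizeof(saved), &saved) < 0)
		return -1;

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, &saved))
			continue;

		CPU_ZERO(&one);
		CPU_SET(cpu, &one);
		/* One affinity change per CPU, not one per event. */
		if (sched_setaffinity(0, sizeof(one), &one) < 0)
			continue;	/* skip CPUs we cannot run on */

		fn(cpu, arg);
		visited++;
	}

	/* Restore the original mask. */
	if (sched_setaffinity(0, sizeof(saved), &saved) < 0)
		return -1;
	return visited;
}

/* Example callback (hypothetical): just count how often it was called. */
static void count_call(int cpu, void *arg)
{
	(void)cpu;
	(*(int *)arg)++;
}
```

A caller would pass a callback that walks its own event list for the given
CPU, for example for_each_cpu_batched(count_call, &calls) with an int calls
initialized to 0. The point of the structure is that the thread moves to each
CPU once and does all of that CPU's work there, rather than reaching every
CPU remotely for every event.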

Thread overview: 37+ messages
2019-11-21  0:15 Andi Kleen
2019-11-21  0:15 ` [PATCH 01/12] perf pmu: Use file system cache to optimize sysfs access Andi Kleen
2019-11-29  6:02   ` [tip: perf/urgent] " tip-bot2 for Andi Kleen
2019-11-21  0:15 ` [PATCH 02/12] perf affinity: Add infrastructure to save/restore affinity Andi Kleen
2019-11-29  6:02   ` [tip: perf/urgent] " tip-bot2 for Andi Kleen
2019-11-21  0:15 ` [PATCH 03/12] perf cpumap: Maintain cpumaps ordered and without dups Andi Kleen
2019-12-04  7:53   ` [tip: perf/urgent] " tip-bot2 for Andi Kleen
2019-11-21  0:15 ` [PATCH 04/12] perf evlist: Maintain evlist->all_cpus Andi Kleen
2019-12-04  7:53   ` [tip: perf/urgent] " tip-bot2 for Andi Kleen
2019-11-21  0:15 ` [PATCH 05/12] perf evsel: Add iterator to iterate over events ordered by CPU Andi Kleen
2019-12-04  7:53   ` [tip: perf/urgent] " tip-bot2 for Andi Kleen
2019-11-21  0:15 ` [PATCH 06/12] perf evsel: Add functions to close evsel on a CPU Andi Kleen
2019-12-04  7:53   ` [tip: perf/urgent] " tip-bot2 for Andi Kleen
2019-11-21  0:15 ` [PATCH 07/12] perf stat: Use affinity for closing file descriptors Andi Kleen
2019-12-04  7:53   ` [tip: perf/urgent] " tip-bot2 for Andi Kleen
2019-11-21  0:15 ` [PATCH 08/12] perf stat: Factor out open error handling Andi Kleen
2019-12-04  7:53   ` [tip: perf/urgent] " tip-bot2 for Andi Kleen
2019-11-21  0:15 ` [PATCH 09/12] perf stat: Use affinity for opening events Andi Kleen
2019-12-04  7:53   ` [tip: perf/urgent] " tip-bot2 for Andi Kleen
2019-12-18  9:29   ` [perf stat] cc9cdf40ae: perf-sanity-tests.Event_times.fail kernel test robot
2019-11-21  0:15 ` [PATCH 10/12] perf stat: Use affinity for reading Andi Kleen
2019-12-04  7:53   ` [tip: perf/urgent] " tip-bot2 for Andi Kleen
2019-11-21  0:15 ` [PATCH 11/12] perf evsel: Add functions to enable/disable for a specific CPU Andi Kleen
2019-12-04  7:53   ` [tip: perf/urgent] " tip-bot2 for Andi Kleen
2019-11-21  0:15 ` [PATCH 12/12] perf stat: Use affinity for enabling/disabling events Andi Kleen
2019-12-04  7:53   ` [tip: perf/urgent] " tip-bot2 for Andi Kleen
2019-11-21 12:47 ` Optimize perf stat for large number of events/cpus Andi Kleen
2019-11-21 14:32   ` Arnaldo Carvalho de Melo
2019-11-27 15:16 ` Arnaldo Carvalho de Melo [this message]
2019-11-27 15:43   ` Arnaldo Carvalho de Melo
2019-11-27 23:26     ` Andi Kleen
2019-11-28  0:01       ` Arnaldo Carvalho de Melo
  -- strict thread matches above, loose matches on Subject: below --
2019-11-16  5:52 Andi Kleen
2019-11-20 15:16 ` Jiri Olsa
2019-11-12  0:59 Andi Kleen
2019-11-07 18:16 Andi Kleen
2019-11-05  0:25 Andi Kleen
