LKML Archive on lore.kernel.org
From: Andi Kleen <ak@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>,
	mingo@kernel.org, acme@kernel.org, mark.rutland@arm.com,
	alexander.shishkin@linux.intel.com, namhyung@kernel.org,
	linux-kernel@vger.kernel.org, eranian@google.com
Subject: Re: [PATCH v2 0/4] perf: Fix perf_event_attr::exclusive rotation
Date: Mon, 2 Nov 2020 18:41:43 -0800
Message-ID: <20201103024143.GK466880@tassilo.jf.intel.com>
In-Reply-To: <20201102141625.GX2594@hirez.programming.kicks-ass.net>

On Mon, Nov 02, 2020 at 03:16:25PM +0100, Peter Zijlstra wrote:
> On Sun, Nov 01, 2020 at 07:52:38PM -0800, Andi Kleen wrote:
> > The main motivation is actually that the "multiple groups" algorithm
> > in perf doesn't work all that great: it has quite a few cases where it
> > starves groups or makes the wrong decisions. That is because it is a very
> > difficult (likely NP-complete) problem, and the kernel takes a lot of
> > shortcuts to avoid spending too much time on it.
> 
> The event scheduling should be starvation free, except in the presence
> of pinned events.
> 
> If you can show starvation without pinned events, it's a bug.
> 
> It will also always do equal or better than exclusive mode wrt PMU
> utilization. Again, if it doesn't it's a bug.

Simple example (I think we've shown that one before):

(on skylake)
$ cat /proc/sys/kernel/nmi_watchdog
0
$ perf stat -e instructions,cycles,frontend_retired.latency_ge_2,frontend_retired.latency_ge_16 -a sleep 2

 Performance counter stats for 'system wide':

       654,514,990      instructions              #    0.34  insn per cycle           (50.67%)
     1,924,297,028      cycles                                                        (74.28%)
        21,708,935      frontend_retired.latency_ge_2                                     (75.01%)
         1,769,952      frontend_retired.latency_ge_16                                     (24.99%)

       2.002426541 seconds time elapsed

Both frontend_retired events should be getting 50%, and the fixed events
(instructions, cycles) should be getting 100%. So several events are starved.
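
As an aside, the "(NN.NN%)" column above is time_running/time_enabled, i.e.
the fraction of the run during which the event actually owned a counter;
tools scale the raw count by the inverse. A minimal sketch of that scaling,
assuming the event was opened with read_format =
PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING:

#include <stdint.h>
#include <unistd.h>

/* Layout of read(2) on a non-group perf fd with the read_format above. */
struct read_value {
	uint64_t value;		/* raw count */
	uint64_t time_enabled;	/* ns the event was enabled */
	uint64_t time_running;	/* ns it actually had a counter */
};

static uint64_t scaled_count(int fd)
{
	struct read_value v;

	if (read(fd, &v, sizeof(v)) != sizeof(v))
		return 0;
	if (!v.time_running)		/* never scheduled at all */
		return 0;
	/* perf stat prints time_running/time_enabled as the percentage */
	return (uint64_t)((double)v.value * v.time_enabled / v.time_running);
}

An event that never gets on the PMU reads back with time_running == 0, which
is the fully starved case.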

Another similar example is trying to schedule the topdown events on Icelake in parallel with
other groups. It works with one extra group, but breaks with two.

(on icelake)
$ cat /proc/sys/kernel/nmi_watchdog
0
$ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring},{branches,branches,branches,branches,branches,branches,branches,branches},{branches,branches,branches,branches,branches,branches,branches,branches}' -a sleep 1

 Performance counter stats for 'system wide':

        71,229,087      slots                                                         (60.65%)
         5,066,320      topdown-bad-spec          #      7.1% bad speculation         (60.65%)
        35,080,387      topdown-be-bound          #     49.2% backend bound           (60.65%)
        22,769,750      topdown-fe-bound          #     32.0% frontend bound          (60.65%)
         8,336,760      topdown-retiring          #     11.7% retiring                (60.65%)
           424,584      branches                                                      (70.00%)
           424,584      branches                                                      (70.00%)
           424,584      branches                                                      (70.00%)
           424,584      branches                                                      (70.00%)
           424,584      branches                                                      (70.00%)
           424,584      branches                                                      (70.00%)
           424,584      branches                                                      (70.00%)
           424,584      branches                                                      (70.00%)
         3,634,075      branches                                                      (30.00%)
         3,634,075      branches                                                      (30.00%)
         3,634,075      branches                                                      (30.00%)
         3,634,075      branches                                                      (30.00%)
         3,634,075      branches                                                      (30.00%)
         3,634,075      branches                                                      (30.00%)
         3,634,075      branches                                                      (30.00%)
         3,634,075      branches                                                      (30.00%)

       1.001312511 seconds time elapsed

A tool using exclusive will hopefully be able to do better than this.
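
For reference, this is roughly what setting it looks like at the syscall
level: perf_event_attr::exclusive on the group leader asks that the group
only be scheduled when it can have the PMU to itself. A minimal sketch (the
event choice is just a placeholder, error handling trimmed):

#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static int open_exclusive_leader(int cpu)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;	/* placeholder event */
	attr.exclusive = 1;	/* only schedule with the PMU to ourselves */
	attr.disabled = 1;	/* enable once the group is assembled */

	/* pid == -1, cpu-wide; group_fd == -1 makes this the group leader.
	 * Siblings would be opened with group_fd set to the returned fd,
	 * then the leader enabled with PERF_EVENT_IOC_ENABLE. */
	return syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
}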

-Andi

Thread overview: 14+ messages
2020-10-29 16:27 Peter Zijlstra
2020-10-29 16:27 ` [PATCH v2 1/4] perf: Simplify group_sched_out() Peter Zijlstra
2020-11-10 12:45   ` [tip: perf/urgent] " tip-bot2 for Peter Zijlstra
2020-10-29 16:27 ` [PATCH v2 2/4] perf: Simplify group_sched_in() Peter Zijlstra
2020-11-10 12:45   ` [tip: perf/urgent] " tip-bot2 for Peter Zijlstra
2020-10-29 16:27 ` [PATCH v2 3/4] perf: Fix event multiplexing for exclusive groups Peter Zijlstra
2020-11-10 12:45   ` [tip: perf/urgent] " tip-bot2 for Peter Zijlstra
2020-10-29 16:27 ` [PATCH v2 4/4] perf: Tweak perf_event_attr::exclusive semantics Peter Zijlstra
2020-11-10 12:45   ` [tip: perf/urgent] " tip-bot2 for Peter Zijlstra
2020-10-31 23:44 ` [PATCH v2 0/4] perf: Fix perf_event_attr::exclusive rotation Jiri Olsa
2020-11-02  3:52   ` Andi Kleen
2020-11-02 14:16     ` Peter Zijlstra
2020-11-03  2:41       ` Andi Kleen [this message]
2020-11-09 11:48         ` Peter Zijlstra
