From: Song Liu <songliubraving@fb.com>
To: Stephane Eranian <eranian@google.com>
Cc: open list <linux-kernel@vger.kernel.org>,
"peterz@infradead.org" <peterz@infradead.org>,
"mingo@elte.hu" <mingo@elte.hu>,
"acme@redhat.com" <acme@redhat.com>,
"jolsa@redhat.com" <jolsa@redhat.com>,
"kan.liang@intel.com" <kan.liang@intel.com>,
"irogers@google.com" <irogers@google.com>
Subject: Re: [PATCH] perf/core: fix multiplexing event scheduling issue
Date: Fri, 18 Oct 2019 06:13:21 +0000 [thread overview]
Message-ID: <7B6B74DF-53E0-42DA-97D6-3F04E84D688B@fb.com> (raw)
In-Reply-To: <20191018002746.149200-1-eranian@google.com>
> On Oct 17, 2019, at 5:27 PM, Stephane Eranian <eranian@google.com> wrote:
>
> This patch complements the following commit:
> 7fa343b7fdc4 ("perf/core: Fix corner case in perf_rotate_context()")
>
> The fix from Song addresses the consequences of the problem but
> not the cause. This patch fixes the causes and can sit on top of
> Song's patch.
>
> This patch fixes a scheduling problem in the core functions of
> perf_events. Under certain conditions, some events would not be
> scheduled even though many counters would be available. This
> is related to multiplexing and is architecture agnostic and
> PMU agnostic (i.e., core or uncore).
>
> This problem can easily be reproduced when you have two perf
> stat sessions. The first session does not cause multiplexing,
> let's say it is measuring 1 event, E1. While it is measuring,
> a second session starts and causes multiplexing. Let's say it
> adds 6 events, B1-B6. Now, 7 events compete and are multiplexed.
> When the second session terminates, all 6 (B1-B6) events are
> removed. Normally, you'd expect the E1 event to continue to run
> with no multiplexing. However, the problem is that depending on
> the state Of E1 when B1-B6 are removed, it may never be scheduled
> again. If E1 was inactive at the time of removal, despite the
> multiplexing hrtimer still firing, it will not find any active
> events and will not try to reschedule. This is what Song's patch
> fixes. It forces the multiplexing code to consider non-active events.
Good analysis! I kinda knew the example I had (with pinned event)
was not the only way to trigger this issue. But I didn't think
about event remove path.
> However, the cause is not addressed. The kernel should not rely on
> the multiplexing hrtimer to unblock inactive events. That timer
> can have abitrary duration in the milliseconds. Until the timer
> fires, counters are available, but no measurable events are using
> them. We do not want to introduce blind spots of arbitrary durations.
>
> This patch addresses the cause of the problem, by checking that,
> when an event is disabled or removed and the context was multiplexing
> events, inactive events gets immediately a chance to be scheduled by
> calling ctx_resched(). The rescheduling is done on event of equal
> or lower priority types. With that in place, as soon as a counter
> is freed, schedulable inactive events may run, thereby eliminating
> a blind spot.
>
> This can be illustrated as follows using Skylake uncore CHA here:
>
> $ perf stat --no-merge -a -I 1000 -C 28 -e uncore_cha_0/event=0x0/
> 42.007856531 2,000,291,322 uncore_cha_0/event=0x0/
> 43.008062166 2,000,399,526 uncore_cha_0/event=0x0/
> 44.008293244 2,000,473,720 uncore_cha_0/event=0x0/
> 45.008501847 2,000,423,420 uncore_cha_0/event=0x0/
> 46.008706558 2,000,411,132 uncore_cha_0/event=0x0/
> 47.008928543 2,000,441,660 uncore_cha_0/event=0x0/
>
> Adding second sessiont with 4 events for 4s
>
> $ perf stat -a -I 1000 -C 28 --no-merge -e uncore_cha_0/event=0x0/,uncore_cha_0/event=0x0/,uncore_cha_0/event=0x0/,uncore_cha_0/event=0x0/ sleep 4
> 48.009114643 1,983,129,830 uncore_cha_0/event=0x0/ (99.71%)
> 49.009279616 1,976,067,751 uncore_cha_0/event=0x0/ (99.30%)
> 50.009428660 1,974,448,006 uncore_cha_0/event=0x0/ (98.92%)
> 51.009524309 1,973,083,237 uncore_cha_0/event=0x0/ (98.55%)
> 52.009673467 1,972,097,678 uncore_cha_0/event=0x0/ (97.11%)
>
> End of 4s, second session is removed. But the first event does not schedule and never will
> unless new events force multiplexing again.
>
> 53.009815999 <not counted> uncore_cha_0/event=0x0/ (95.28%)
> 54.009961809 <not counted> uncore_cha_0/event=0x0/ (93.52%)
> 55.010110972 <not counted> uncore_cha_0/event=0x0/ (91.82%)
> 56.010217579 <not counted> uncore_cha_0/event=0x0/ (90.18%)
Does this still happen after my patch? I was not able to repro this.
> Signed-off-by: Stephane Eranian <eranian@google.com>
Maybe add:
Fixes: 8d5bce0c37fa ("perf/core: Optimize perf_rotate_context() event scheduling")
Thanks,
Song
next prev parent reply other threads:[~2019-10-18 6:14 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-18 0:27 [PATCH] perf/core: fix multiplexing event scheduling issue Stephane Eranian
2019-10-18 6:13 ` Song Liu [this message]
2019-10-18 6:19 ` Stephane Eranian
2019-10-18 6:55 ` Song Liu
2019-10-21 10:05 ` Peter Zijlstra
2019-10-23 7:06 ` Stephane Eranian
2019-10-23 9:37 ` Peter Zijlstra
2019-10-23 15:29 ` Peter Zijlstra
2019-10-21 10:20 ` Peter Zijlstra
2019-10-23 7:30 ` Stephane Eranian
2019-10-23 11:02 ` Peter Zijlstra
2019-10-23 17:44 ` Stephane Eranian
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7B6B74DF-53E0-42DA-97D6-3F04E84D688B@fb.com \
--to=songliubraving@fb.com \
--cc=acme@redhat.com \
--cc=eranian@google.com \
--cc=irogers@google.com \
--cc=jolsa@redhat.com \
--cc=kan.liang@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).