All of lore.kernel.org
 help / color / mirror / Atom feed
From: Song Liu <songliubraving@fb.com>
To: Stephane Eranian <eranian@google.com>
Cc: open list <linux-kernel@vger.kernel.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"mingo@elte.hu" <mingo@elte.hu>,
	"acme@redhat.com" <acme@redhat.com>,
	"jolsa@redhat.com" <jolsa@redhat.com>,
	"kan.liang@intel.com" <kan.liang@intel.com>,
	"irogers@google.com" <irogers@google.com>
Subject: Re: [PATCH] perf/core: fix multiplexing event scheduling issue
Date: Fri, 18 Oct 2019 06:13:21 +0000	[thread overview]
Message-ID: <7B6B74DF-53E0-42DA-97D6-3F04E84D688B@fb.com> (raw)
In-Reply-To: <20191018002746.149200-1-eranian@google.com>



> On Oct 17, 2019, at 5:27 PM, Stephane Eranian <eranian@google.com> wrote:
> 
> This patch complements the following commit:
> 7fa343b7fdc4 ("perf/core: Fix corner case in perf_rotate_context()")
> 
> The fix from Song addresses the consequences of the problem but
> not the cause. This patch fixes the causes and can sit on top of
> Song's patch.
> 
> This patch fixes a scheduling problem in the core functions of
> perf_events. Under certain conditions, some events would not be
> scheduled even though many counters would be available. This
> is related to multiplexing and is architecture agnostic and
> PMU agnostic (i.e., core or uncore).
> 
> This problem can easily be reproduced when you have two perf
> stat sessions. The first session does not cause multiplexing,
> let's say it is measuring 1 event, E1. While it is measuring,
> a second session starts and causes multiplexing. Let's say it
> adds 6 events, B1-B6. Now, 7 events compete and are multiplexed.
> When the second session terminates, all 6 (B1-B6) events are
> removed. Normally, you'd expect the E1 event to continue to run
> with no multiplexing. However, the problem is that depending on
> the state Of E1 when B1-B6 are removed, it may never be scheduled
> again. If E1 was inactive at the time of removal, despite the
> multiplexing hrtimer still firing, it will not find any active
> events and will not try to reschedule. This is what Song's patch
> fixes. It forces the multiplexing code to consider non-active events.

Good analysis! I kinda knew the example I had (with pinned event)
was not the only way to trigger this issue. But I didn't think 
about event remove path. 

> However, the cause is not addressed. The kernel should not rely on
> the multiplexing hrtimer to unblock inactive events. That timer
> can have abitrary duration in the milliseconds. Until the timer
> fires, counters are available, but no measurable events are using
> them. We do not want to introduce blind spots of arbitrary durations.
> 
> This patch addresses the cause of the problem, by checking that,
> when an event is disabled or removed and the context was multiplexing
> events, inactive events gets immediately a chance to be scheduled by
> calling ctx_resched(). The rescheduling is done on  event of equal
> or lower priority types.  With that in place, as soon as a counter
> is freed, schedulable inactive events may run, thereby eliminating
> a blind spot.
> 
> This can be illustrated as follows using Skylake uncore CHA here:
> 
> $ perf stat --no-merge -a -I 1000 -C 28 -e uncore_cha_0/event=0x0/
>    42.007856531      2,000,291,322      uncore_cha_0/event=0x0/
>    43.008062166      2,000,399,526      uncore_cha_0/event=0x0/
>    44.008293244      2,000,473,720      uncore_cha_0/event=0x0/
>    45.008501847      2,000,423,420      uncore_cha_0/event=0x0/
>    46.008706558      2,000,411,132      uncore_cha_0/event=0x0/
>    47.008928543      2,000,441,660      uncore_cha_0/event=0x0/
> 
> Adding second sessiont with 4 events for 4s
> 
> $ perf stat -a -I 1000 -C 28 --no-merge -e uncore_cha_0/event=0x0/,uncore_cha_0/event=0x0/,uncore_cha_0/event=0x0/,uncore_cha_0/event=0x0/ sleep 4
>    48.009114643      1,983,129,830      uncore_cha_0/event=0x0/                                       (99.71%)
>    49.009279616      1,976,067,751      uncore_cha_0/event=0x0/                                       (99.30%)
>    50.009428660      1,974,448,006      uncore_cha_0/event=0x0/                                       (98.92%)
>    51.009524309      1,973,083,237      uncore_cha_0/event=0x0/                                       (98.55%)
>    52.009673467      1,972,097,678      uncore_cha_0/event=0x0/                                       (97.11%)
> 
> End of 4s, second session is removed. But the first event does not schedule and never will
> unless new events force multiplexing again.
> 
>    53.009815999      <not counted>      uncore_cha_0/event=0x0/                                       (95.28%)
>    54.009961809      <not counted>      uncore_cha_0/event=0x0/                                       (93.52%)
>    55.010110972      <not counted>      uncore_cha_0/event=0x0/                                       (91.82%)
>    56.010217579      <not counted>      uncore_cha_0/event=0x0/                                       (90.18%)

Does this still happen after my patch? I was not able to repro this. 

> Signed-off-by: Stephane Eranian <eranian@google.com>

Maybe add:
Fixes: 8d5bce0c37fa ("perf/core: Optimize perf_rotate_context() event scheduling")

Thanks,
Song

  reply	other threads:[~2019-10-18  6:14 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-18  0:27 [PATCH] perf/core: fix multiplexing event scheduling issue Stephane Eranian
2019-10-18  6:13 ` Song Liu [this message]
2019-10-18  6:19   ` Stephane Eranian
2019-10-18  6:55     ` Song Liu
2019-10-21 10:05 ` Peter Zijlstra
2019-10-23  7:06   ` Stephane Eranian
2019-10-23  9:37     ` Peter Zijlstra
2019-10-23 15:29       ` Peter Zijlstra
2019-10-21 10:20 ` Peter Zijlstra
2019-10-23  7:30   ` Stephane Eranian
2019-10-23 11:02     ` Peter Zijlstra
2019-10-23 17:44       ` Stephane Eranian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7B6B74DF-53E0-42DA-97D6-3F04E84D688B@fb.com \
    --to=songliubraving@fb.com \
    --cc=acme@redhat.com \
    --cc=eranian@google.com \
    --cc=irogers@google.com \
    --cc=jolsa@redhat.com \
    --cc=kan.liang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.