From: Wen Yang <wenyang@linux.alibaba.com>
To: Peter Zijlstra <peterz@infradead.org>,
Stephane Eranian <eranian@google.com>
Cc: Stephane Eranian <eranian@google.com>,
Wen Yang <simon.wy@alibaba-inc.com>,
Ingo Molnar <mingo@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
Mark Rutland <mark.rutland@arm.com>, Jiri Olsa <jolsa@redhat.com>,
Namhyung Kim <namhyung@kernel.org>,
Borislav Petkov <bp@alien8.de>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RESEND PATCH 2/2] perf/x86: improve the event scheduling to avoid unnecessary pmu_stop/start
Date: Fri, 18 Mar 2022 01:54:55 +0800 [thread overview]
Message-ID: <05861b8c-2c7c-ae89-613a-41fcace6a174@linux.alibaba.com> (raw)
In-Reply-To: <Yi8fELo+k9gmkJIa@hirez.programming.kicks-ass.net>
On 2022/3/14 6:55 PM, Peter Zijlstra wrote:
> On Thu, Mar 10, 2022 at 11:50:33AM +0800, Wen Yang wrote:
>
>> As you pointed out, some non-compliant rdpmc usage can cause problems. But
>> as you know, Linux is the foundation of cloud servers, and many
>> third-party programs run on it (we don't have any of their code); we can
>> only observe that the monitoring data jitters abnormally (the
>> probability of this issue is not high: dozens of machines out of tens of
>> thousands).
>
> This might be a novel insight, but I *really* don't give a crap about
> any of that. If they're not using it right, they get to keep the pieces.
>
> I'd almost make it reschedule more to force them to fix their stuff.
>
Thank you for your guidance.
We also found a case, across thousands of servers, where the PMU counter
was no longer updated due to frequent x86_pmu_stop/x86_pmu_start calls.
We added logging to the kernel and found that a third-party program
caused the PMU counter to stop/start several times within just a few
seconds, as follows:
[8993460.537776] XXX x86_pmu_stop line=1388 [cpu1] active_mask=100000001
event=ffff880a53411000, state=1, attr.type=0, attr.config=0x0,
attr.pinned=1, hw.idx=3, hw.prev_count=0x802a877ef302,
hw.period_left=0x7fd578810cfe, event.count=0x14db802a877ecab4,
event.prev_count=0x14db802a877ecab4
[8993460.915873] XXX x86_pmu_start line=1312 [cpu1]
active_mask=200000008 event=ffff880a53411000, state=1, attr.type=0,
attr.config=0x0, attr.pinned=1, hw.idx=3,
hw.prev_count=0xffff802a9cf6a166, hw.period_left=0x7fd563095e9a,
event.count=0x14db802a9cf67918, event.prev_count=0x14db802a9cf67918
[8993461.104643] XXX x86_pmu_stop line=1388 [cpu1] active_mask=100000001
event=ffff880a53411000, state=1, attr.type=0, attr.config=0x0,
attr.pinned=1, hw.idx=3, hw.prev_count=0xffff802a9cf6a166,
hw.period_left=0x7fd563095e9a, event.count=0x14db802a9cf67918,
event.prev_count=0x14db802a9cf67918
[8993461.442508] XXX x86_pmu_start line=1312 [cpu1]
active_mask=200000004 event=ffff880a53411000, state=1, attr.type=0,
attr.config=0x0, attr.pinned=1, hw.idx=2,
hw.prev_count=0xffff802a9cf8492e, hw.period_left=0x7fd56307b6d2,
event.count=0x14db802a9cf820e0, event.prev_count=0x14db802a9cf820e0
[8993461.736927] XXX x86_pmu_stop line=1388 [cpu1] active_mask=100000001
event=ffff880a53411000, state=1, attr.type=0, attr.config=0x0,
attr.pinned=1, hw.idx=2, hw.prev_count=0xffff802a9cf8492e,
hw.period_left=0x7fd56307b6d2, event.count=0x14db802a9cf820e0,
event.prev_count=0x14db802a9cf820e0
[8993461.983135] XXX x86_pmu_start line=1312 [cpu1]
active_mask=200000004 event=ffff880a53411000, state=1, attr.type=0,
attr.config=0x0, attr.pinned=1, hw.idx=2,
hw.prev_count=0xffff802a9cfc29ed, hw.period_left=0x7fd56303d613,
event.count=0x14db802a9cfc019f, event.prev_count=0x14db802a9cfc019f
[8993462.274599] XXX x86_pmu_stop line=1388 [cpu1] active_mask=100000001
event=ffff880a53411000, state=1, attr.type=0, attr.config=0x0,
attr.pinned=1, hw.idx=2, hw.prev_count=0x802a9d24040e,
hw.period_left=0x7fd562dbfbf2, event.count=0x14db802a9d23dbc0,
event.prev_count=0x14db802a9d23dbc0
[8993462.519488] XXX x86_pmu_start line=1312 [cpu1]
active_mask=200000004 event=ffff880a53411000, state=1, attr.type=0,
attr.config=0x0, attr.pinned=1, hw.idx=2,
hw.prev_count=0xffff802ab0bb4719, hw.period_left=0x7fd54f44b8e7,
event.count=0x14db802ab0bb1ecb, event.prev_count=0x14db802ab0bb1ecb
[8993462.726929] XXX x86_pmu_stop line=1388 [cpu1] active_mask=100000003
event=ffff880a53411000, state=1, attr.type=0, attr.config=0x0,
attr.pinned=1, hw.idx=2, hw.prev_count=0xffff802ab0bb4719,
hw.period_left=0x7fd54f44b8e7, event.count=0x14db802ab0bb1ecb,
event.prev_count=0x14db802ab0bb1ecb
[8993463.035674] XXX x86_pmu_start line=1312 [cpu1]
active_mask=200000008 event=ffff880a53411000, state=1, attr.type=0,
attr.config=0x0, attr.pinned=1, hw.idx=3,
hw.prev_count=0xffff802ab0bcd328, hw.period_left=0x7fd54f432cd8,
event.count=0x14db802ab0bcaada, event.prev_count=0x14db802ab0bcaada
After that, the PMU counter is no longer updated:
[8993463.333622] x86_perf_event_update, event=ffff880a53411000,
new_raw_count=802abea31354
[8993463.359905] x86_perf_event_update [cpu1] active_mask=30000000f
event=ffff880a53411000, state=1, attr.config=0x0, attr.pinned=1,
hw.idx=3, hw.prev_count=0x802abea31354, hw.period_left=0x7fd5415cecac,
event.count=0x14db802abea2eb06,
[8993463.504783] x86_perf_event_update, event=ffff880a53411000,
new_raw_count=802ad8760160
[8993463.521138] x86_perf_event_update [cpu1] active_mask=30000000f
event=ffff880a53411000, state=1, attr.config=0x0, attr.pinned=1,
hw.idx=3, hw.prev_count=0x802ad8760160, hw.period_left=0x7fd52789fea0,
event.count=0x14db802ad875d912,
[8993463.638337] x86_perf_event_update, event=ffff880a53411000,
new_raw_count=802aecb4747b
[8993463.654441] x86_perf_event_update [cpu1] active_mask=30000000f
event=ffff880a53411000, state=1, attr.config=0x0, attr.pinned=1,
hw.idx=3, hw.prev_count=0x802aecb4747b, hw.period_left=0x7fd5134b8b85,
event.count=0x14db802aecb44c2d,
[8993463.837321] x86_perf_event_update, event=ffff880a53411000,
new_raw_count=802aecb4747b
[8993463.861625] x86_perf_event_update [cpu1] active_mask=30000000f
event=ffff880a53411000, state=1, attr.config=0x0, attr.pinned=1,
hw.idx=3, hw.prev_count=0x802aecb4747b, hw.period_left=0x7fd5134b8b85,
event.count=0x14db802aecb44c2d,
[8993464.012398] x86_perf_event_update, event=ffff880a53411000,
new_raw_count=802aecb4747b
[8993464.012402] x86_perf_event_update [cpu1] active_mask=30000000f
event=ffff880a53411000, state=1, attr.config=0x0, attr.pinned=1,
hw.idx=3, hw.prev_count=0x802aecb4747b, hw.period_left=0x7fd5134b8b85,
event.count=0x14db802aecb44c2d,
[8993464.013676] x86_perf_event_update, event=ffff880a53411000,
new_raw_count=802aecb4747b
[8993464.013678] x86_perf_event_update [cpu1] active_mask=30000000f
event=ffff880a53411000, state=1, attr.config=0x0, attr.pinned=1,
hw.idx=3, hw.prev_count=0x802aecb4747b, hw.period_left=0x7fd5134b8b85,
event.count=0x14db802aecb44c2d,
[8993464.016123] x86_perf_event_update, event=ffff880a53411000,
new_raw_count=802aecb4747b
[8993464.016125] x86_perf_event_update [cpu1] active_mask=30000000f
event=ffff880a53411000, state=1, attr.config=0x0, attr.pinned=1,
hw.idx=3, hw.prev_count=0x802aecb4747b, hw.period_left=0x7fd5134b8b85,
event.count=0x14db802aecb44c2d,
[8993464.016196] x86_perf_event_update, event=ffff880a53411000,
new_raw_count=802aecb4747b
[8993464.016199] x86_perf_event_update [cpu1] active_mask=30000000f
event=ffff880a53411000, state=1, attr.config=0x0, attr.pinned=1,
hw.idx=3, hw.prev_count=0x802aecb4747b, hw.period_left=0x7fd5134b8b85,
event.count=0x14db802aecb44c2d,
......
Only about 6 seconds later is the counter stopped/started again:
[8993470.243959] XXX x86_pmu_stop line=1388 [cpu1] active_mask=100000001
event=ffff880a53411000, state=1, attr.type=0, attr.config=0x0,
attr.pinned=1, hw.idx=3, hw.prev_count=0x802aecb4747b,
hw.period_left=0x7fd5134b8b85, event.count=0x14db802aecb44c2d,
event.prev_count=0x14db802aecb44c2d
[8993470.243998] XXX x86_pmu_start line=1305 [cpu1]
active_mask=200000000 event=ffff880a53411000, state=1, attr.type=0,
attr.config=0x0, attr.pinned=1, hw.idx=3,
hw.prev_count=0xffff802aecb4747b, hw.period_left=0x7fd5134b8b85,
event.count=0x14db802aecb44c2d, event.prev_count=0x14db802aecb44c2d
[8993470.245285] x86_perf_event_update, event=ffff880a53411000,
new_raw_count=802aece1e6f6
...
Such problems can be avoided by eliminating unnecessary x86_pmu_stop()/x86_pmu_start() cycles.
Please take another look. Thanks.
--
Best wishes,
Wen
Thread overview: 16+ messages
2022-03-04 11:03 [RESEND PATCH 1/2] perf/x86: extract code to assign perf events for both core and uncore Wen Yang
2022-03-04 11:03 ` [RESEND PATCH 2/2] perf/x86: improve the event scheduling to avoid unnecessary pmu_stop/start Wen Yang
2022-03-04 15:39 ` Peter Zijlstra
2022-03-06 14:36 ` Wen Yang
2022-03-07 17:14 ` Stephane Eranian
2022-03-08 6:42 ` Wen Yang
2022-03-08 12:53 ` Peter Zijlstra
2022-03-08 12:50 ` Peter Zijlstra
2022-03-10 3:50 ` Wen Yang
2022-03-14 10:55 ` Peter Zijlstra
2022-03-17 17:54 ` Wen Yang [this message]
2022-04-17 15:06 ` Wen Yang
2022-04-19 14:16 ` Wen Yang
2022-04-19 20:57 ` Peter Zijlstra
2022-04-19 21:18 ` Stephane Eranian
2022-04-20 14:44 ` Wen Yang