From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Dave Hansen <dave.hansen@intel.com>,
peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
linux-kernel@vger.kernel.org
Cc: mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
jolsa@redhat.com, eranian@google.com, ak@linux.intel.com,
kirill.shutemov@linux.intel.com, mpe@ellerman.id.au,
benh@kernel.crashing.org, paulus@samba.org
Subject: Re: [PATCH V7 1/4] perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE
Date: Thu, 17 Sep 2020 17:58:26 -0400 [thread overview]
Message-ID: <0e2cd6e4-8016-ccf6-eaff-2b304cf966ee@linux.intel.com> (raw)
In-Reply-To: <04f59f3b-2e53-8774-8333-63dfc5b8e6a9@intel.com>
On 9/17/2020 5:24 PM, Dave Hansen wrote:
> On 9/17/20 2:16 PM, Liang, Kan wrote:
>>> One last concern as I look at this: I wish it was a bit more
>>> future-proof. There are lots of weird things folks are trying to do
>>> with the page tables, like Address Space Isolation. For instance, if
>>> you get a perf NMI when running userspace, current->mm->pgd is
>>> *different* than the PGD that was in use when userspace was running.
>>> It's close enough today, but it might not stay that way. But I can't
>>> think of any great ways to future proof this code, other than spitting
>>> out an error message if too many of the page table walks fail when they
>>> shouldn't.
>>>
>>
>> If the page table walks fail, page size 0 will return. So the worst case
>> is that the page size is not available for users, which is not a fatal
>> error.
>
> Right, it's not a fatal error. It will just more or less silently break
> this feature.
>
>> If my understanding is correct, when the above case happens, there is
>> nothing we can do for now (because we have no idea what it will become),
>> except disabling the page size support and throw an error/warning.
>>
>> From the user's perspective, throwing an error message or marking the
>> page size unavailable should be the same. I think we may leave the code
>> as-is. We can fix the future case later separately.
>
> The only thing I can think of is to record the number of consecutive
> page walk errors without a success. Occasional failures are OK and
> expected, such as if reclaim zeroes a PTE and a later perf event goes
> and looks at it. But a *LOT* of consecutive errors indicates a real
> problem somewhere.
>
> Maybe if you have 10,000 or 1,000,000 successive walk failures, you do a
> WARN_ON_ONCE().
The user space perf tool looks like a better place for this kind of
warning. The perf tool knows the total number of the samples. It also
knows the number of the page size 0 samples. We can set a threshold,
e.g., 90%. If 90% of the samples have the page size 0, perf tool will
throw out a warning message.
The problem is that the warning from the perf tool usually includes some
hints regarding the cause of the warning or possible solution to
workaround/fix the warning. What message should we deliver to the users?
"Warning: Too many error page size. Address space isolation feature may
be enabled, please check"?
Thanks,
Kan
next prev parent reply other threads:[~2020-09-17 21:58 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-17 13:52 [PATCH V7 0/4] Add the page size in the perf record (kernel) kan.liang
2020-09-17 13:52 ` [PATCH V7 1/4] perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE kan.liang
2020-09-17 19:00 ` Dave Hansen
2020-09-17 21:16 ` Liang, Kan
[not found] ` <04f59f3b-2e53-8774-8333-63dfc5b8e6a9@intel.com>
2020-09-17 21:58 ` Liang, Kan [this message]
2020-09-17 22:02 ` Dave Hansen
2020-09-17 22:16 ` Liang, Kan
2020-09-17 13:52 ` [PATCH V7 2/4] perf/x86/intel: Support PERF_SAMPLE_DATA_PAGE_SIZE kan.liang
2020-09-17 13:52 ` [PATCH V7 3/4] powerpc/perf: " kan.liang
2020-09-17 13:52 ` [PATCH V7 4/4] perf/core: Add support for PERF_SAMPLE_CODE_PAGE_SIZE kan.liang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0e2cd6e4-8016-ccf6-eaff-2b304cf966ee@linux.intel.com \
--to=kan.liang@linux.intel.com \
--cc=acme@kernel.org \
--cc=ak@linux.intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=benh@kernel.crashing.org \
--cc=dave.hansen@intel.com \
--cc=eranian@google.com \
--cc=jolsa@redhat.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@redhat.com \
--cc=mpe@ellerman.id.au \
--cc=paulus@samba.org \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).