linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] perf record: fix priv level with branch sampling for paranoid=2
@ 2019-09-20 23:03 Stephane Eranian
  2019-09-23 14:03 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 3+ messages in thread
From: Stephane Eranian @ 2019-09-20 23:03 UTC (permalink / raw)
  To: linux-kernel; +Cc: peterz, mingo, acme, jolsa, namhyung

Now that the default perf_events paranoid level is set to 2, a regular user
cannot monitor kernel level activity anymore. As such, with the following
cmdline:

$ perf record -e cycles date

The perf tool first tries cycles:uk but then falls back to cycles:u
as can be seen in the perf report --header-only output:

  cmdline : /export/hda3/tmp/perf.tip record -e cycles ls
  event : name = cycles:u, , id = { 436186, ... }

This is okay as long as there is way to learn the priv level was changed
internally by the tool.

But consider a similar example:

$ perf record -b -e cycles date
Error:
You may not have permission to collect stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).
...

Why is that treated differently given that the branch sampling inherits the
priv level of the first event in this case, i.e., cycles:u? It turns out
that the branch sampling code is more picky and also checks exclude_hv.

In the fallback path, perf record is setting exclude_kernel = 1, but it
does not change exclude_hv. This does not seem to match the restriction
imposed by paranoid = 2.

This patch fixes the problem by forcing exclude_hv = 1 in the fallback
for paranoid=2. With this in place:

$ perf record -b -e cycles date
  cmdline : /export/hda3/tmp/perf.tip record -b -e cycles ls
  event : name = cycles:u, , id = { 436847, ... }

And the command succeeds as expected.

V2 fix a white space.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/util/evsel.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 85825384f9e8..3cbe06fdf7f7 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2811,9 +2811,11 @@ bool perf_evsel__fallback(struct evsel *evsel, int err,
 		if (evsel->name)
 			free(evsel->name);
 		evsel->name = new_name;
-		scnprintf(msg, msgsize,
-"kernel.perf_event_paranoid=%d, trying to fall back to excluding kernel samples", paranoid);
+		scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
+			  "to fall back to excluding kernel and hypervisor "
+			  " samples", paranoid);
 		evsel->core.attr.exclude_kernel = 1;
+		evsel->core.attr.exclude_hv     = 1;

 		return true;
 	}
-- 
2.23.0.187.g17f5b7556c-goog


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] perf record: fix priv level with branch sampling for paranoid=2
  2019-09-20 23:03 [PATCH v2] perf record: fix priv level with branch sampling for paranoid=2 Stephane Eranian
@ 2019-09-23 14:03 ` Arnaldo Carvalho de Melo
  2019-09-23 14:17   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 3+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-09-23 14:03 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: linux-kernel, peterz, mingo, jolsa, namhyung

Em Fri, Sep 20, 2019 at 04:03:56PM -0700, Stephane Eranian escreveu:
> Now that the default perf_events paranoid level is set to 2, a regular user
> cannot monitor kernel level activity anymore. As such, with the following
> cmdline:
> 
> $ perf record -e cycles date
> 
> The perf tool first tries cycles:uk but then falls back to cycles:u
> as can be seen in the perf report --header-only output:
> 
>   cmdline : /export/hda3/tmp/perf.tip record -e cycles ls
>   event : name = cycles:u, , id = { 436186, ... }
> 
> This is okay as long as there is way to learn the priv level was changed
> internally by the tool.
> 
> But consider a similar example:
> 
> $ perf record -b -e cycles date
> Error:
> You may not have permission to collect stats.
> 
> Consider tweaking /proc/sys/kernel/perf_event_paranoid,
> which controls use of the performance events system by
> unprivileged users (without CAP_SYS_ADMIN).
> ...
> 
> Why is that treated differently given that the branch sampling inherits the
> priv level of the first event in this case, i.e., cycles:u? It turns out
> that the branch sampling code is more picky and also checks exclude_hv.
> 
> In the fallback path, perf record is setting exclude_kernel = 1, but it
> does not change exclude_hv. This does not seem to match the restriction
> imposed by paranoid = 2.
> 
> This patch fixes the problem by forcing exclude_hv = 1 in the fallback
> for paranoid=2. With this in place:
> 
> $ perf record -b -e cycles date
>   cmdline : /export/hda3/tmp/perf.tip record -b -e cycles ls
>   event : name = cycles:u, , id = { 436847, ... }
> 
> And the command succeeds as expected.
> 
> V2 fix a white space.

Thanks, tested added Jiri's Acked-by, applied, added this note:

Committer testing:

After aplying the patch we get:

  [acme@quaco ~]$ perf record -b -e cycles date
  WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
  check /proc/sys/kernel/kptr_restrict and /proc/sys/kernel/perf_event_paranoid.

  Samples in kernel functions may not be resolved if a suitable vmlinux
  file is not found in the buildid cache or in the vmlinux path.

  Samples in kernel modules won't be resolved at all.

  If some relocation was applied (e.g. kexec) symbols may be misresolved
  even with a suitable vmlinux or kallsyms file.

  Mon 23 Sep 2019 11:00:59 AM -03
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.005 MB perf.data (14 samples) ]
  [acme@quaco ~]$ perf evlist -v
  cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
  [acme@quaco ~]$

That warning about restricted kernel maps will be suppressed in a follow
up patch, as perf_event_attr.exclude_kernel is set, i.e. no samples for
the kernel will be taken and thus no need for those maps

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] perf record: fix priv level with branch sampling for paranoid=2
  2019-09-23 14:03 ` Arnaldo Carvalho de Melo
@ 2019-09-23 14:17   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 3+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-09-23 14:17 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: linux-kernel, peterz, mingo, jolsa, namhyung

Em Mon, Sep 23, 2019 at 11:03:45AM -0300, Arnaldo Carvalho de Melo escreveu:
> Thanks, tested added Jiri's Acked-by, applied, added this note:
> 
> Committer testing:
> 
> After aplying the patch we get:
> 
>   [acme@quaco ~]$ perf record -b -e cycles date
>   WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
>   check /proc/sys/kernel/kptr_restrict and /proc/sys/kernel/perf_event_paranoid.
> 
>   Samples in kernel functions may not be resolved if a suitable vmlinux
>   file is not found in the buildid cache or in the vmlinux path.
> 
>   Samples in kernel modules won't be resolved at all.
> 
>   If some relocation was applied (e.g. kexec) symbols may be misresolved
>   even with a suitable vmlinux or kallsyms file.
> 
>   Mon 23 Sep 2019 11:00:59 AM -03
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.005 MB perf.data (14 samples) ]
>   [acme@quaco ~]$ perf evlist -v
>   cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
>   [acme@quaco ~]$
> 
> That warning about restricted kernel maps will be suppressed in a follow
> up patch, as perf_event_attr.exclude_kernel is set, i.e. no samples for
> the kernel will be taken and thus no need for those maps

Its there:

https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=c8b567c8a96a35acff746b1eef6f859e2a9fc195

- Arnaldo

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-09-23 14:17 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-20 23:03 [PATCH v2] perf record: fix priv level with branch sampling for paranoid=2 Stephane Eranian
2019-09-23 14:03 ` Arnaldo Carvalho de Melo
2019-09-23 14:17   ` Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).