perf record: fix priv level with branch sampling for paranoid=2

Message ID 20190904062603.90165-1-eranian@google.com
State Superseded

Commit Message

Stephane Eranian Sept. 4, 2019, 6:26 a.m. UTC
Now that the default perf_events paranoid level is set to 2, a regular user
cannot monitor kernel level activity anymore. As such, with the following
cmdline:

$ perf record -e cycles date

The perf tool first tries cycles:uk but then falls back to cycles:u
as can be seen in the perf report --header-only output:

  cmdline : /export/hda3/tmp/perf.tip record -e cycles ls
  event : name = cycles:u, , id = { 436186, ... }

This is okay as long as there is a way to learn that the priv level was
changed internally by the tool.

But consider a similar example:

$ perf record -b -e cycles date
Error:
You may not have permission to collect stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).
...

Why is that treated differently given that the branch sampling inherits the
priv level of the first event in this case, i.e., cycles:u? It turns out
that the branch sampling code is more picky and also checks exclude_hv.

In the fallback path, perf record is setting exclude_kernel = 1, but it
does not change exclude_hv. This does not seem to match the restriction
imposed by paranoid = 2.

This patch fixes the problem by forcing exclude_hv = 1 in the fallback
for paranoid=2. With this in place:

$ perf record -b -e cycles date
  cmdline : /export/hda3/tmp/perf.tip record -b -e cycles ls
  event : name = cycles:u, , id = { 436847, ... }

And the command succeeds as expected.

Signed-off-by: Stephane Eranian <eranian@google.com>
---
 tools/perf/util/evsel.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
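
The interaction described in the commit message can be illustrated with a
small user-space model. This is a sketch under stated assumptions, not the
kernel or perf tool code: struct attr_model, the BR_* constants and every
function name below are hypothetical stand-ins. The rule being modelled is
that branch sampling with no explicit privilege bits inherits them from the
event's exclude_* flags, and that capturing kernel or hypervisor branches is
then refused for an unprivileged user when paranoid is 2 or higher.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the branch privilege-level bits. */
enum {
	BR_USER    = 1u << 0,
	BR_KERNEL  = 1u << 1,
	BR_HV      = 1u << 2,
	BR_PLM_ALL = BR_USER | BR_KERNEL | BR_HV,
};

/* Only the fields this sketch needs; not the real struct perf_event_attr. */
struct attr_model {
	bool exclude_user;
	bool exclude_kernel;
	bool exclude_hv;
	uint32_t branch_mask;	/* no privilege bits set: inherit from the event */
};

/* Inherit branch privilege levels from the event when none were requested. */
static uint32_t effective_branch_mask(const struct attr_model *a)
{
	uint32_t mask = a->branch_mask;

	if (!(mask & BR_PLM_ALL)) {
		if (!a->exclude_user)
			mask |= BR_USER;
		if (!a->exclude_kernel)
			mask |= BR_KERNEL;
		if (!a->exclude_hv)
			mask |= BR_HV;
	}
	return mask;
}

/* Modelled permission check: kernel or hypervisor branches need privilege. */
static bool branch_sampling_allowed(const struct attr_model *a, int paranoid,
				    bool privileged)
{
	if (privileged || paranoid < 2)
		return true;
	return !(effective_branch_mask(a) & (BR_KERNEL | BR_HV));
}

/* Fallback before the patch: only kernel samples are excluded. */
static void fallback_old(struct attr_model *a)
{
	a->exclude_kernel = true;
}

/* Fallback with the patch: hypervisor samples are excluded as well. */
static void fallback_new(struct attr_model *a)
{
	a->exclude_kernel = true;
	a->exclude_hv = true;
}
```

In this model, an event that went through fallback_old() still carries the
hypervisor bit in its inherited branch mask and is rejected at paranoid=2,
while one that went through fallback_new() is accepted.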

Comments

Stephane Eranian Sept. 13, 2019, 8:14 a.m. UTC | #1
On Tue, Sep 3, 2019 at 11:26 PM Stephane Eranian <eranian@google.com> wrote:
Any comment on this patch?

> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/util/evsel.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 85825384f9e8..3cbe06fdf7f7 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -2811,9 +2811,11 @@ bool perf_evsel__fallback(struct evsel *evsel, int err,
>                 if (evsel->name)
>                         free(evsel->name);
>                 evsel->name = new_name;
> -               scnprintf(msg, msgsize,
> -"kernel.perf_event_paranoid=%d, trying to fall back to excluding kernel samples", paranoid);
> +               scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
> +                         "to fall back to excluding kernel and hypervisor "
> +                         " samples", paranoid);
>                 evsel->core.attr.exclude_kernel = 1;
> +               evsel->core.attr.exclude_hv     = 1;
>
>                 return true;
>         }
> --
> 2.23.0.187.g17f5b7556c-goog
>
Jiri Olsa Sept. 20, 2019, 7:12 p.m. UTC | #2
On Tue, Sep 03, 2019 at 11:26:03PM -0700, Stephane Eranian wrote:
> 
> Signed-off-by: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/util/evsel.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 85825384f9e8..3cbe06fdf7f7 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -2811,9 +2811,11 @@ bool perf_evsel__fallback(struct evsel *evsel, int err,
>  		if (evsel->name)
>  			free(evsel->name);
>  		evsel->name = new_name;
> -		scnprintf(msg, msgsize,
> -"kernel.perf_event_paranoid=%d, trying to fall back to excluding kernel samples", paranoid);
> +		scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
> +			  "to fall back to excluding kernel and hypervisor "
> +			  " samples", paranoid);

extra space in here        ^

	Warning:
	kernel.perf_event_paranoid=2, trying to fall back to excluding kernel and hypervisor  samples

other than that it looks good to me

Acked-by: Jiri Olsa <jolsa@redhat.com>

thanks,
jirka
Stephane Eranian Sept. 20, 2019, 11:05 p.m. UTC | #3
On Fri, Sep 20, 2019 at 12:12 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Tue, Sep 03, 2019 at 11:26:03PM -0700, Stephane Eranian wrote:
> >
> > Signed-off-by: Stephane Eranian <eranian@google.com>
> > ---
> >  tools/perf/util/evsel.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index 85825384f9e8..3cbe06fdf7f7 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -2811,9 +2811,11 @@ bool perf_evsel__fallback(struct evsel *evsel, int err,
> >               if (evsel->name)
> >                       free(evsel->name);
> >               evsel->name = new_name;
> > -             scnprintf(msg, msgsize,
> > -"kernel.perf_event_paranoid=%d, trying to fall back to excluding kernel samples", paranoid);
> > +             scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
> > +                       "to fall back to excluding kernel and hypervisor "
> > +                       " samples", paranoid);
>
> extra space in here        ^
>
>         Warning:
>         kernel.perf_event_paranoid=2, trying to fall back to excluding kernel and hypervisor  samples
>
> other than that it looks good to me
>
Fixed in v2.

> Acked-by: Jiri Olsa <jolsa@redhat.com>
>
> thanks,
> jirka

Patch

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 85825384f9e8..3cbe06fdf7f7 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2811,9 +2811,11 @@  bool perf_evsel__fallback(struct evsel *evsel, int err,
 		if (evsel->name)
 			free(evsel->name);
 		evsel->name = new_name;
-		scnprintf(msg, msgsize,
-"kernel.perf_event_paranoid=%d, trying to fall back to excluding kernel samples", paranoid);
+		scnprintf(msg, msgsize, "kernel.perf_event_paranoid=%d, trying "
+			  "to fall back to excluding kernel and hypervisor "
+			  " samples", paranoid);
 		evsel->core.attr.exclude_kernel = 1;
+		evsel->core.attr.exclude_hv     = 1;
 
 		return true;
 	}
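
The double space noted in review comes from C's concatenation of adjacent
string literals, which joins them with nothing in between. The sketch below
reproduces it in user space. It uses snprintf() as a stand-in for the
scnprintf() call in the patch. The macro and function names are hypothetical,
and MSG_FIXED is only one possible correction, since v2 is not shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same three literals as in the patch: "hypervisor " + " samples". */
#define MSG_V1 "kernel.perf_event_paranoid=%d, trying " \
	       "to fall back to excluding kernel and hypervisor " \
	       " samples"

/* Assumed fix: drop the leading space of the last literal. */
#define MSG_FIXED "kernel.perf_event_paranoid=%d, trying " \
		  "to fall back to excluding kernel and hypervisor " \
		  "samples"

/* Format the message as posted; the result contains a double space. */
static void format_v1(char *buf, size_t size, int paranoid)
{
	snprintf(buf, size, MSG_V1, paranoid);
}

/* Format the message with the assumed fix applied. */
static void format_fixed(char *buf, size_t size, int paranoid)
{
	snprintf(buf, size, MSG_FIXED, paranoid);
}
```

Calling format_v1() with paranoid set to 2 yields the warning text quoted in
the review, including the two spaces before "samples".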