All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU
@ 2021-01-08  0:01 Rob Herring
  2021-01-12 15:32 ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Rob Herring @ 2021-01-08  0:01 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Will Deacon, Andy Lutomirski, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, x86, H. Peter Anvin

Userspace access using rdpmc only makes sense if the event is valid for
the current CPU. However, cap_user_rdpmc is currently set no matter which
CPU the event is associated with. The result is userspace reading another
CPU's event thinks it can use rdpmc to read the counter. In doing so, the
wrong counter will be read.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Rob Herring <robh@kernel.org>
---
I'm working on adding userspace counter access for arm64 along with a
common access routine in libperf. I found this issue when testing per CPU
events. More details are here[1]. 

I'm going to need the same thing for arm64. This could possibly go into
perf_event_update_userpage() instead, but perhaps there could be an arch
where userspace can read other cpu's counters.

What's the ABI between libperf and kernel versions? This change will 
only help the libperf implementation if libperf doesn't need to support 
old kernels.

[1] https://lore.kernel.org/r/CAL_JsqJzeCebq4VP+xBtfh=fbomvaJoVMp35AQQDGTYD-fRWgw@mail.gmail.com
---
 arch/x86/events/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index a88c94d65693..6e6d4c1d03ca 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2490,7 +2490,8 @@ void arch_perf_update_userpage(struct perf_event *event,
 	userpg->cap_user_time = 0;
 	userpg->cap_user_time_zero = 0;
 	userpg->cap_user_rdpmc =
-		!!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED);
+		!!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED) &&
+		(event->oncpu == smp_processor_id());
 	userpg->pmc_width = x86_pmu.cntval_bits;
 
 	if (!using_native_sched_clock() || !sched_clock_stable())
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU
  2021-01-08  0:01 [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU Rob Herring
@ 2021-01-12 15:32 ` Peter Zijlstra
  2021-01-12 16:16   ` Rob Herring
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2021-01-12 15:32 UTC (permalink / raw)
  To: Rob Herring
  Cc: Jiri Olsa, Will Deacon, Andy Lutomirski, linux-kernel,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, x86, H. Peter Anvin

On Thu, Jan 07, 2021 at 05:01:36PM -0700, Rob Herring wrote:
> Userspace access using rdpmc only makes sense if the event is valid for
> the current CPU. However, cap_user_rdpmc is currently set no matter which
> CPU the event is associated with. The result is userspace reading another
> CPU's event thinks it can use rdpmc to read the counter. In doing so, the
> wrong counter will be read.

Don't do that then?

> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index a88c94d65693..6e6d4c1d03ca 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2490,7 +2490,8 @@ void arch_perf_update_userpage(struct perf_event *event,
>  	userpg->cap_user_time = 0;
>  	userpg->cap_user_time_zero = 0;
>  	userpg->cap_user_rdpmc =
> -		!!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED);
> +		!!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED) &&
> +		(event->oncpu == smp_processor_id());
>  	userpg->pmc_width = x86_pmu.cntval_bits;
>  
>  	if (!using_native_sched_clock() || !sched_clock_stable())

Isn't that a nop? That is, from the few sites I checked, we're always
calling this on the event's CPU.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU
  2021-01-12 15:32 ` Peter Zijlstra
@ 2021-01-12 16:16   ` Rob Herring
  2021-01-12 17:03     ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Rob Herring @ 2021-01-12 16:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, Will Deacon, Andy Lutomirski, linux-kernel,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, X86 ML, H. Peter Anvin

On Tue, Jan 12, 2021 at 9:33 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Jan 07, 2021 at 05:01:36PM -0700, Rob Herring wrote:
> > Userspace access using rdpmc only makes sense if the event is valid for
> > the current CPU. However, cap_user_rdpmc is currently set no matter which
> > CPU the event is associated with. The result is userspace reading another
> > CPU's event thinks it can use rdpmc to read the counter. In doing so, the
> > wrong counter will be read.
>
> Don't do that then?

I could check this in userspace I suppose, but then it's yet another
thing the rdpmc loop has to check. I think it's better to not add more
overhead there.

Or we just ignore this. This issue came up in the referenced thread
with Jiri. I'm not all that convinced that per CPU events and
userspace rdpmc is too useful of a combination. It would only reduce
the overhead on 1 out of N cpus. In this case, we'd have to limit the
userspace read to per thread events (and ones on any cpu I suppose).

>
> > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> > index a88c94d65693..6e6d4c1d03ca 100644
> > --- a/arch/x86/events/core.c
> > +++ b/arch/x86/events/core.c
> > @@ -2490,7 +2490,8 @@ void arch_perf_update_userpage(struct perf_event *event,
> >       userpg->cap_user_time = 0;
> >       userpg->cap_user_time_zero = 0;
> >       userpg->cap_user_rdpmc =
> > -             !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED);
> > +             !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED) &&
> > +             (event->oncpu == smp_processor_id());
> >       userpg->pmc_width = x86_pmu.cntval_bits;
> >
> >       if (!using_native_sched_clock() || !sched_clock_stable())
>
> Isn't that a nop? That is, from the few sites I checked, we're always
> calling this on the event's CPU.

If cpu0 opens and mmaps an event for cpu1, then cpu0 will see
cap_user_rdpmc set and think it can use rdpmc.

Rob

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU
  2021-01-12 16:16   ` Rob Herring
@ 2021-01-12 17:03     ` Peter Zijlstra
  2021-01-12 20:09       ` Rob Herring
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2021-01-12 17:03 UTC (permalink / raw)
  To: Rob Herring
  Cc: Jiri Olsa, Will Deacon, Andy Lutomirski, linux-kernel,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, X86 ML, H. Peter Anvin

On Tue, Jan 12, 2021 at 10:16:50AM -0600, Rob Herring wrote:
> On Tue, Jan 12, 2021 at 9:33 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Jan 07, 2021 at 05:01:36PM -0700, Rob Herring wrote:
> > > Userspace access using rdpmc only makes sense if the event is valid for
> > > the current CPU. However, cap_user_rdpmc is currently set no matter which
> > > CPU the event is associated with. The result is userspace reading another
> > > CPU's event thinks it can use rdpmc to read the counter. In doing so, the
> > > wrong counter will be read.
> >
> > Don't do that then?
> 
> I could check this in userspace I suppose, but then it's yet another
> thing the rdpmc loop has to check. I think it's better to not add more
> overhead there.

So all this was designed for self monitoring; attempting rdpmc on an
event not for yourself is out of spec.

> > > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> > > index a88c94d65693..6e6d4c1d03ca 100644
> > > --- a/arch/x86/events/core.c
> > > +++ b/arch/x86/events/core.c
> > > @@ -2490,7 +2490,8 @@ void arch_perf_update_userpage(struct perf_event *event,
> > >       userpg->cap_user_time = 0;
> > >       userpg->cap_user_time_zero = 0;
> > >       userpg->cap_user_rdpmc =
> > > -             !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED);
> > > +             !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED) &&
> > > +             (event->oncpu == smp_processor_id());
> > >       userpg->pmc_width = x86_pmu.cntval_bits;
> > >
> > >       if (!using_native_sched_clock() || !sched_clock_stable())
> >
> > Isn't that a nop? That is, from the few sites I checked, we're always
> > calling this on the event's CPU.
> 
> If cpu0 opens and mmaps an event for cpu1, then cpu0 will see
> cap_user_rdpmc set and think it can use rdpmc.

I don't think your check helps with that. IIRC we always call
arch_perf_update_userpage() on the CPU the event actually runs on. So
it's always true.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU
  2021-01-12 17:03     ` Peter Zijlstra
@ 2021-01-12 20:09       ` Rob Herring
  0 siblings, 0 replies; 5+ messages in thread
From: Rob Herring @ 2021-01-12 20:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jiri Olsa, Will Deacon, Andy Lutomirski, linux-kernel,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Namhyung Kim, Thomas Gleixner,
	Borislav Petkov, X86 ML, H. Peter Anvin

On Tue, Jan 12, 2021 at 11:05 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Jan 12, 2021 at 10:16:50AM -0600, Rob Herring wrote:
> > On Tue, Jan 12, 2021 at 9:33 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Thu, Jan 07, 2021 at 05:01:36PM -0700, Rob Herring wrote:
> > > > Userspace access using rdpmc only makes sense if the event is valid for
> > > > the current CPU. However, cap_user_rdpmc is currently set no matter which
> > > > CPU the event is associated with. The result is userspace reading another
> > > > CPU's event thinks it can use rdpmc to read the counter. In doing so, the
> > > > wrong counter will be read.
> > >
> > > Don't do that then?
> >
> > I could check this in userspace I suppose, but then it's yet another
> > thing the rdpmc loop has to check. I think it's better to not add more
> > overhead there.
>
> So all this was designed for self monitoring; attempting rdpmc on an
> event not for yourself is out of spec.
>
> > > > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> > > > index a88c94d65693..6e6d4c1d03ca 100644
> > > > --- a/arch/x86/events/core.c
> > > > +++ b/arch/x86/events/core.c
> > > > @@ -2490,7 +2490,8 @@ void arch_perf_update_userpage(struct perf_event *event,
> > > >       userpg->cap_user_time = 0;
> > > >       userpg->cap_user_time_zero = 0;
> > > >       userpg->cap_user_rdpmc =
> > > > -             !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED);
> > > > +             !!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED) &&
> > > > +             (event->oncpu == smp_processor_id());
> > > >       userpg->pmc_width = x86_pmu.cntval_bits;
> > > >
> > > >       if (!using_native_sched_clock() || !sched_clock_stable())
> > >
> > > Isn't that a nop? That is, from the few sites I checked, we're always
> > > calling this on the event's CPU.
> >
> > If cpu0 opens and mmaps an event for cpu1, then cpu0 will see
> > cap_user_rdpmc set and think it can use rdpmc.
>
> I don't think your check helps with that. IIRC we always call
> arch_perf_update_userpage() on the CPU the event actually runs on. So
> it's always true.

My testing says otherwise. I tested this change on the arm64 version
of arch_perf_update_userpage, but I don't think x86 should be any
different here.

I'm testing with libperf test_stat_cpu() modified to mmap each cpu
event. Without the change I get the following result:

# taskset 2 test-evsel-a -v
- running test-evsel.c...
mmap base 0xffff9fd77000
userspace counter access enabled on cpu0            <<<<< Reflects
cap_user_rdpmc state
cpu0: count = 0x72cf, ena = 0x1a838, run = 0x1a838  <<<<< count is from rdpmc
mmap base 0xffff9fd76000
userspace counter access enabled on cpu1
cpu1: count = 0xc978, ena = 0x163e6, run = 0x163e6  <<<<< count is from rdpmc

cpu0 is idle here, so we'd expect count to be near zero, but it's not.

Then with the change, I get the following:

# taskset 2 test-evsel-a -v
- running test-evsel.c...
mmap base 0xffffa742d000
cpu0: count = 0xddb, ena = 0x3f8d6, run = 0x3f8d6  <<<<< count is from read()
mmap base 0xffffa742c000
userspace counter access enabled on cpu1
cpu1: count = 0xb538, ena = 0x154f0, run = 0x154f0 <<<<< count is from rdpmc

# taskset 1 test-evsel-a -v
- running test-evsel.c...
mmap base 0xffff8b008000
userspace counter access enabled on cpu0
cpu0: count = 0x7c21, ena = 0x18574, run = 0x18574
mmap base 0xffff8b007000
cpu1: count = 0xb3c, ena = 0x61aa8, run = 0x61aa8

As you can see, count tracks the idle and not idle cpu, and
cap_user_rdpmc is only set for the cpu event matching the cpu we are
running on.

Rob

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-01-12 21:59 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-08  0:01 [RFC] perf/x86: Only expose userspace rdpmc for events on current CPU Rob Herring
2021-01-12 15:32 ` Peter Zijlstra
2021-01-12 16:16   ` Rob Herring
2021-01-12 17:03     ` Peter Zijlstra
2021-01-12 20:09       ` Rob Herring

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.