From: Rob Herring <robh@kernel.org>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Jiri Olsa <jolsa@redhat.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Namhyung Kim <namhyung@kernel.org>,
	Raphael Gault <raphael.gault@arm.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Ian Rogers <irogers@google.com>,
	Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>,
	Itaru Kitayama <itaru.kitayama@gmail.com>
Subject: Re: [PATCH v4 2/9] arm64: perf: Enable pmu counter direct access for perf event on armv8
Date: Thu, 19 Nov 2020 12:35:17 -0600	[thread overview]
Message-ID: <CAL_JsqKM+91Meg+u07VRsD5=O1srQooe1Dd_M3NA+CZgcN4QcQ@mail.gmail.com> (raw)
In-Reply-To: <20201113180633.GE44988@C02TD0UTHF1T.local>

On Fri, Nov 13, 2020 at 12:06 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> Hi Rob,
>
> Thanks for this, and sorry for the long delay since this was last
> reviewed. Overall this is looking pretty good, but I have a couple of
> remaining concerns.
>
> Will, I have a query for you below.
>
> On Thu, Oct 01, 2020 at 09:01:09AM -0500, Rob Herring wrote:
> > From: Raphael Gault <raphael.gault@arm.com>
> >
> > Keep track of events opened with direct access to the hardware
> > counters and modify permissions while they are open.
> >
> > The strategy used here is the same one x86 uses: every time an event
> > is mapped, the permissions are set if required. The atomic field added
> > in the mm_context helps keep track of the different events opened and
> > deactivates the permissions when all are unmapped.
> > We also need to update the permissions in the context switch code so
> > that tasks keep the right permissions.
> >
> > Signed-off-by: Raphael Gault <raphael.gault@arm.com>
> > Signed-off-by: Rob Herring <robh@kernel.org>
> > ---
> > v2:
> >  - Move mapped/unmapped into arm64 code. Fixes arm32.
> >  - Rebase on cap_user_time_short changes
> >
> > Changes from Raphael's v4:
> >   - Drop homogeneous check
> >   - Disable access for chained counters
> >   - Set pmc_width in user page
> > ---
> >  arch/arm64/include/asm/mmu.h         |  5 ++++
> >  arch/arm64/include/asm/mmu_context.h |  2 ++
> >  arch/arm64/include/asm/perf_event.h  | 14 ++++++++++
> >  arch/arm64/kernel/perf_event.c       | 41 ++++++++++++++++++++++++++++
> >  4 files changed, 62 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
> > index a7a5ecaa2e83..52cfdb676f06 100644
> > --- a/arch/arm64/include/asm/mmu.h
> > +++ b/arch/arm64/include/asm/mmu.h
> > @@ -19,6 +19,11 @@
> >
> >  typedef struct {
> >       atomic64_t      id;
> > +     /*
> > +      * non-zero if userspace has access to hardware
> > +      * counters directly.
> > +      */
> > +     atomic_t        pmu_direct_access;
> >  #ifdef CONFIG_COMPAT
> >       void            *sigpage;
> >  #endif
> > diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
> > index f2d7537d6f83..d24589ecb07a 100644
> > --- a/arch/arm64/include/asm/mmu_context.h
> > +++ b/arch/arm64/include/asm/mmu_context.h
> > @@ -21,6 +21,7 @@
> >  #include <asm/proc-fns.h>
> >  #include <asm-generic/mm_hooks.h>
> >  #include <asm/cputype.h>
> > +#include <asm/perf_event.h>
> >  #include <asm/sysreg.h>
> >  #include <asm/tlbflush.h>
> >
> > @@ -224,6 +225,7 @@ static inline void __switch_mm(struct mm_struct *next)
> >       }
> >
> >       check_and_switch_context(next);
> > +     perf_switch_user_access(next);
> >  }
> >
> >  static inline void
> > diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
> > index 2c2d7dbe8a02..a025d9595d51 100644
> > --- a/arch/arm64/include/asm/perf_event.h
> > +++ b/arch/arm64/include/asm/perf_event.h
> > @@ -8,6 +8,7 @@
> >
> >  #include <asm/stack_pointer.h>
> >  #include <asm/ptrace.h>
> > +#include <linux/mm_types.h>
> >
> >  #define      ARMV8_PMU_MAX_COUNTERS  32
> >  #define      ARMV8_PMU_COUNTER_MASK  (ARMV8_PMU_MAX_COUNTERS - 1)
> > @@ -251,4 +252,17 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
> >       (regs)->pstate = PSR_MODE_EL1h; \
> >  }
> >
> > +static inline void perf_switch_user_access(struct mm_struct *mm)
> > +{
> > +     if (!IS_ENABLED(CONFIG_PERF_EVENTS))
> > +             return;
> > +
> > +     if (atomic_read(&mm->context.pmu_direct_access)) {
> > +             write_sysreg(ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR,
> > +                          pmuserenr_el0);
> > +     } else {
> > +             write_sysreg(0, pmuserenr_el0);
> > +     }
> > +}
>
> PMUSERENR.ER gives RW access to PMSELR_EL0. While we no longer use
> PMSELR_EL0 in the kernel, we can preempt and migrate userspace tasks
> between homogeneous CPUs, and presumably need to context-switch this
> with the task (like we do for TPIDR_EL0 and friends), or clear the
> register on context-switch to prevent it becoming an unintended covert
> channel.

Hmm, now that I've read up on PMSELR_EL0, I'm wondering whether I
should be using PMSELR_EL0 in libperf. If you look at patch 7, the
counter read is pretty ugly because there are 32 possible mrs
instructions. If PMSELR_EL0 is used, we can have a single read path:
an msr and an mrs vs. a function pointer load, branch/ret, and an
mrs. There are no guarantees on system register access times, but I'd
guess the former is typically faster? It certainly simplifies the
code, which I'd rather have given the limited number of users.
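
Something like this is what I have in mind for the single read path
(just a sketch, not the actual patch 7 code; the helper name and
inline asm are illustrative):

  #include <stdint.h>

  /*
   * Select counter 'idx' via PMSELR_EL0, then read it indirectly
   * through PMXEVCNTR_EL0: one msr plus one mrs, instead of picking
   * one of 32 possible PMEVCNTR<n>_EL0 mrs instructions.
   */
  static inline uint64_t read_evcnt(uint32_t idx)
  {
          uint64_t val;

          asm volatile("msr pmselr_el0, %0" : : "r" ((uint64_t)idx));
          /* order the counter selection before the indirect read */
          asm volatile("isb" : : : "memory");
          asm volatile("mrs %0, pmxevcntr_el0" : "=r" (val));
          return val;
  }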

If I go that route and we don't context switch PMSELR_EL0, reads of
PMXEVCNTR_EL0 could be stale. But does that matter? No, because a
read of PMEVCNTR<n>_EL0 can already be stale, and the seq counter
will catch that.
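
The userspace side would still do the usual seqlock retry loop
against the mmap page; roughly (a simplified sketch of the documented
perf_event_mmap_page usage, ignoring sign extension to pmc_width):

  #include <stdint.h>
  #include <linux/perf_event.h>

  #define barrier() asm volatile("" : : : "memory")

  /* read_evcnt() is the PMSELR_EL0-based helper sketched above */
  static uint64_t read_count(struct perf_event_mmap_page *pc)
  {
          uint32_t seq, idx;
          uint64_t count;

          do {
                  seq = pc->lock;
                  barrier();
                  idx = pc->index;
                  count = pc->offset;
                  if (pc->cap_user_rdpmc && idx)
                          count += read_evcnt(idx - 1);
                  barrier();
          } while (pc->lock != seq);  /* kernel updated the event: retry */

          return count;
  }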

> These bits also enable AArch32 access. Is there any way an AArch32 task
> can enable this? If so we should probably block that given we do not
> support this interface on 32-bit arm.

I'd assume this works for AArch32, given we don't do anything here to
prevent it. I suppose we could look at the MMCF_AARCH32 flag in
mm_context_t? But is "not implemented for arch/arm/" really a reason
to disable it?
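
If we did want to block it, I'd guess it's a one-line check in the
mapped hook, something like this (sketch only; assumes the check
belongs in the arm64 event_mapped path and elides the rest of it):

  static void armv8pmu_event_mapped(struct perf_event *event,
                                    struct mm_struct *mm)
  {
          /* don't grant direct counter access to AArch32 tasks */
          if (mm->context.flags & MMCF_AARCH32)
                  return;

          atomic_inc(&mm->context.pmu_direct_access);
  }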

Rob

Thread overview: 58+ messages

2020-10-01 14:01 [PATCH v4 0/9] libperf and arm64 userspace counter access support Rob Herring
2020-10-01 14:01 ` [PATCH v4 1/9] arm64: pmu: Add function implementation to update event index in userpage Rob Herring
2020-10-01 14:01 ` [PATCH v4 2/9] arm64: perf: Enable pmu counter direct access for perf event on armv8 Rob Herring
2020-11-13 18:06   ` Mark Rutland
2020-11-19 18:35     ` Rob Herring [this message]
2020-11-19 19:15     ` Will Deacon
2020-11-20 20:03       ` Rob Herring
2020-11-20 22:08         ` Rob Herring
2020-12-02 14:57         ` Rob Herring
2021-01-07  0:17           ` Rob Herring
2020-10-01 14:01 ` [PATCH v4 3/9] tools/include: Add an initial math64.h Rob Herring
2020-10-01 14:01 ` [PATCH v4 4/9] libperf: Add libperf_evsel__mmap() Rob Herring
2020-10-14 11:05   ` Jiri Olsa
2020-10-16 21:39     ` Rob Herring
2020-10-19 20:15       ` Jiri Olsa
2020-10-20 14:38         ` Rob Herring
2020-10-20 15:35           ` Jiri Olsa
2020-10-20 17:11             ` Rob Herring
2020-10-21 11:24               ` Jiri Olsa
2020-11-05 16:19                 ` Rob Herring
2020-11-05 22:41                   ` Jiri Olsa
2020-11-06 21:56                     ` Rob Herring
2020-11-11 12:00                       ` Jiri Olsa
2020-11-11 14:50                         ` Rob Herring
2020-10-01 14:01 ` [PATCH v4 5/9] libperf: tests: Add support for verbose printing Rob Herring
2020-10-01 14:01 ` [PATCH v4 6/9] libperf: Add support for user space counter access Rob Herring
2020-10-01 14:01 ` [PATCH v4 7/9] libperf: Add arm64 support to perf_mmap__read_self() Rob Herring
2020-10-01 14:01 ` [PATCH v4 8/9] perf: arm64: Add test for userspace counter access on heterogeneous systems Rob Herring
2020-10-01 14:01 ` [PATCH v4 9/9] Documentation: arm64: Document PMU counters access from userspace Rob Herring
