From: Mark Brown <broonie@kernel.org>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: psodagud@codeaurora.org, will@kernel.org, Dave.Martin@arm.com,
	amit.kachhap@arm.com, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: sve_user_discard
Date: Fri, 21 May 2021 12:54:41 +0100
Message-ID: <20210521115441.GA5825@sirena.org.uk>
In-Reply-To: <20210521091254.GA6675@arm.com>

On Fri, May 21, 2021 at 10:12:54AM +0100, Catalin Marinas wrote:
> On Thu, May 20, 2021 at 04:02:03PM -0700, psodagud@codeaurora.org wrote:

> > This is regarding sve_user_disable(CPACR_EL1_ZEN_EL0EN) on every system
> > call.  If a userspace task is using SVE instructions and making sys calls in
> > between, it would impact the performance of the thread. On every SVE
> > instructions after SVC/system call, it would trap to EL1.

> > I think by setting CPACR_EL1_ZEN_EL0EN flag,  the processor faults when it
> > runs an SVE instruction. This approach may be taken as part of FPSIMD
> > registers switching optimizations.  Can below portion of the code use
> > thread.fpsimd_cpu and fpsimd_last_state variables to avoid clearing
> > CPACR_EL1_ZEN_EL0EN for this kind of use cases?

This mail hasn't hit the lists yet, so I've only got the quoted portions
and am missing some context, such as the "code below" referenced above.
As a result I don't know exactly what the proposal is.

> There were attempts over the past couple of years to optimise the
> syscall return use-case. I think the latest is this one:

> https://lore.kernel.org/r/20201106193553.22946-2-broonie@kernel.org

There's actually this more recently:

    https://lore.kernel.org/linux-arm-kernel/20210512151131.27877-1-broonie@kernel.org/

which does 90% of the optimisation in a lot less code; people seem a bit
more enthusiastic about that version.

> I'll let Mark comment on his plans for reviving the series. Do you
> happen to have some realistic workload that would be improved by this?
> We can always write a micro-benchmark but I wonder how much this matters
> in the real world.

Yeah, I'm not sure how much meaningful overhead there is from doing
the sve_user_discard() vs testing to see if we need to do it while
maintaining correctness.  Whatever overhead there is with the current
code will only take effect if we're hitting a slow path anyway, so
optimising it feels like it might be more trouble than it's worth.  If
the proposal was to just leave SVE enabled for userspace then the issue
is that we'd have to context switch SVE even if the process isn't using
it; there is nothing other than a syscall that lets us stop doing that.
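
To make the tradeoff above concrete, here is a toy userspace model of
it.  It is only a sketch, not kernel code: struct toy_task and the
toy_*() helpers are hypothetical names invented for illustration, and
the sve_el0_enabled flag merely stands in for the EL0 enable that
sve_user_discard() turns off.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's data structures. */
struct toy_task {
	bool sve_el0_enabled;	/* stand-in for the EL0 SVE enable in CPACR_EL1 */
	bool sve_state_live;	/* full SVE state must be context switched */
	unsigned int sve_traps;	/* EL0 SVE access traps taken so far */
};

/* Syscall entry: drop the SVE state and re-arm the EL0 trap. */
static void toy_syscall(struct toy_task *t)
{
	t->sve_el0_enabled = false;
	t->sve_state_live = false;
}

/* Userspace runs an SVE instruction; the first one after a syscall traps. */
static void toy_sve_insn(struct toy_task *t)
{
	if (!t->sve_el0_enabled) {
		t->sve_traps++;
		t->sve_el0_enabled = true;
		t->sve_state_live = true;
	}
	/* Invariant of the model: enabled at EL0 implies live SVE state. */
	assert(t->sve_state_live);
}

/* Context switch: does the full SVE state need saving for this task? */
static bool toy_switch_needs_sve_save(const struct toy_task *t)
{
	return t->sve_state_live;
}
```

In this model a task that alternates syscalls and SVE instructions
takes one trap per syscall, while a task that never issues a syscall
keeps sve_state_live set and so pays for a full SVE save on every
context switch.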

It will be interesting to look at this stuff as SVE hardware starts to
become more widely available and is used on a wider range of workloads
on systems with various vector sizes.  Lazy restore might be worth
looking at, for example, possibly in conjunction with always allowing
SVE from userspace.


