From: Mark Brown <broonie@kernel.org>
To: Marc Zyngier <maz@kernel.org>
Cc: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
	kvm@vger.kernel.org, James Morse <james.morse@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Oliver Upton <oliver.upton@linux.dev>,
	Zenghui Yu <yuzenghui@huawei.com>,
	James Clark <james.clark@arm.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>
Subject: Re: [PATCH 5/5] KVM: arm64: Exclude FP ownership from kvm_vcpu_arch
Date: Wed, 6 Mar 2024 22:19:03 +0000
Message-ID: <a8416451-011c-4159-b9e4-b492b81f5a2c@sirena.org.uk>
In-Reply-To: <87edcnr8zy.wl-maz@kernel.org>

On Wed, Mar 06, 2024 at 09:43:13AM +0000, Marc Zyngier wrote:
> Mark Brown <broonie@kernel.org> wrote:
> > On Sat, Mar 02, 2024 at 11:19:35AM +0000, Marc Zyngier wrote:

> > > Move the ownership tracking into the host data structure, and
> > > rename it from fp_state to fp_owner, which is a better description
> > > (name suggested by Mark Brown).

> > The SME patch series proposes adding an additional state to this
> > enumeration which would say if the registers are stored in a format
> > suitable for exchange with userspace, that would make this state part of
> > the vCPU state.  With the addition of SME we can have two vector lengths
> > in play so the series proposes picking the larger to be the format for
> > userspace registers.

> What does this addition have anything to do with the ownership of the
> physical register file? Not a lot, it seems.

> Specially as there better be no state resident on the CPU when
> userspace messes up with it.

If we have a situation where the state might be stored in memory in
multiple formats it seems reasonable to consider the metadata which
indicates which format is currently in use as part of the state.

> > We could store this separately to fp_state/owner but it'd still be a
> > value stored in the vCPU.

> I totally disagree.

Where would you expect to see the state stored?

> > Storing in a format suitable for userspace
> > usage all the time when we've got SME would most likely result in
> > performance overhead

> What performance overhead? Why should we care?

When we're not using the larger VL we would need to load and store the
registers using a vector length other than the currently configured
one, so we would not be able to use the architecture's ability to load
and store to a location based on a multiple of the vector length:

   LDR <Zt>, [<Xn|SP>{, #<imm>, MUL VL}]
   LDR <Pt>, [<Xn|SP>{, #<imm>, MUL VL}]
   
   STR <Zt>, [<Xn|SP>{, #<imm>, MUL VL}]
   STR <Pt>, [<Xn|SP>{, #<imm>, MUL VL}]

and would instead need to manually compute the memory locations where
values are stored.  As well as the extra instructions, when using the
smaller vector length we'd also be working with sparser data, likely
spread over more cache lines.
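
As a rough illustration of the addressing difference (a sketch only, not
kernel code: the toy_* names and the 64 byte cache line are assumptions
made up for this example), compare packing the 32 Z registers at the
live VL with placing them in slots sized for the larger VL:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_LINE_BYTES 64u	/* assumed line size, illustration only */
#define TOY_NUM_ZREGS 32u

/*
 * Offset of Zn when registers are packed at the live VL.  This is the
 * arithmetic that "LDR <Zt>, [<Xn>, #<imm>, MUL VL]" does in hardware.
 */
static size_t toy_zreg_offset_native(unsigned int n, size_t live_vl_bytes)
{
	assert(n < TOY_NUM_ZREGS);
	return (size_t)n * live_vl_bytes;
}

/*
 * Offset of Zn in a hypothetical fixed layout with slots sized for the
 * larger VL.  MUL VL cannot express this while the live VL is smaller,
 * so software has to compute the address itself.
 */
static size_t toy_zreg_offset_fixed(unsigned int n, size_t max_vl_bytes)
{
	assert(n < TOY_NUM_ZREGS);
	return (size_t)n * max_vl_bytes;
}

/* Cache lines touched by all Z registers, buffer assumed line aligned. */
static size_t toy_zregs_cache_lines(size_t live_vl_bytes, size_t stride_bytes)
{
	size_t lines = 0, next_uncounted = 0;

	assert(live_vl_bytes >= 16 && live_vl_bytes <= stride_bytes);

	for (unsigned int n = 0; n < TOY_NUM_ZREGS; n++) {
		size_t start = (size_t)n * stride_bytes;
		size_t first = start / TOY_CACHE_LINE_BYTES;
		size_t last = (start + live_vl_bytes - 1) / TOY_CACHE_LINE_BYTES;

		/* Skip lines already counted for an earlier register. */
		if (first < next_uncounted)
			first = next_uncounted;
		if (last >= first)
			lines += last - first + 1;
		next_uncounted = last + 1;
	}
	return lines;
}
```

With a 128 bit live VL and 512 bit slots this model touches one line per
register, where the packed layout shares each line between four
registers.  Real numbers depend on the VLs and the CPU.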

We would also need to consider whether we need to zero the holes in the
data when saving.  We'd only potentially be leaking information from the
guest, but it might cause nasty surprises given that transitioning
to/from streaming mode is expected to zero values.  If we do need to
zero then that would be additional work that would need doing.
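
To make the extra work concrete, here is a minimal sketch of saving one
register into a slot sized for the larger VL and zeroing the hole.  The
helper name and the byte-array representation are hypothetical, chosen
for illustration rather than taken from the kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: store a register that is live_vl_bytes wide into
 * a slot of slot_bytes, clearing the unused tail so that stale data from
 * an earlier save at a larger VL is not left behind.
 */
static void toy_save_reg_padded(unsigned char *slot, size_t slot_bytes,
				const unsigned char *reg,
				size_t live_vl_bytes)
{
	assert(live_vl_bytes <= slot_bytes);

	memcpy(slot, reg, live_vl_bytes);
	/* The memset is the additional per-register cost discussed above. */
	memset(slot + live_vl_bytes, 0, slot_bytes - live_vl_bytes);
}
```

Whether the memset is actually required is the open question; the sketch
only shows what it would cost per register if it is.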

Exactly what the performance hit would be will be system and use case
dependent.  *Hopefully* we aren't needing to save and load the guest
state too often but I would be very surprised if we didn't have people
considering any cost in the guest context switch path worth paying
attention to.

As well as the performance overhead there would be some code complexity
cost: if nothing else we'd not be using the same format as fpsimd_save()
and would need to rearrange how we handle saving the register state.

Spending more effort to implement something which also has more runtime
performance overhead for the case of saving and restoring guest state,
which I expect to be vastly more common than the VMM accessing the guest
registers, just doesn't seem like an appealing choice.

> > if nothing else and feels more complicated than
> > rewriting the data in the relatively unusual case where userspace looks
> > at it.  Trying to convert userspace writes into the current layout would
> > have issues if the current layout uses the smaller vector length and
> > create fragility with ordering issues when loading the guest state.

> What ordering issues? If userspace manipulates the guest state, the
> guest isn't running. If it is, all bets are off.

If we were storing the data in the native format for the guest then that
format would change if streaming mode is changed via a write to SVCR.
This would mean that the host would need to understand that when writing
values SVCR needs to be written before the Z and P registers.  To be
clear, I don't think this is a good idea.
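
A toy model of that ordering dependency, assuming a native layout keyed
off SVCR.SM.  The struct and function names are invented for this
example and do not correspond to KVM code; only the position of the SM
bit (bit 0 of SVCR) comes from the architecture:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SVCR_SM (UINT64_C(1) << 0)	/* SVCR.SM, streaming mode enable */

/* Hypothetical per-vCPU state for the model. */
struct toy_vcpu {
	uint64_t svcr;
	size_t sve_vl_bytes;	/* non-streaming vector length */
	size_t sme_vl_bytes;	/* streaming vector length */
};

/* The VL that defines the native layout depends on the current SVCR.SM. */
static size_t toy_effective_vl(const struct toy_vcpu *v)
{
	return (v->svcr & TOY_SVCR_SM) ? v->sme_vl_bytes : v->sve_vl_bytes;
}

/*
 * Number of bytes of a userspace Z register write that fit the native
 * layout at the time of the write.  Anything beyond that has nowhere
 * to go.
 */
static size_t toy_zreg_write_bytes(const struct toy_vcpu *v, size_t user_bytes)
{
	size_t vl = toy_effective_vl(v);

	assert(vl > 0);
	return user_bytes < vl ? user_bytes : vl;
}
```

In this model a 64 byte Z register write keeps only 16 bytes if it
arrives before the SVCR write that enables streaming mode, which is the
fragility being described.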

> > The proposal is not the most lovely idea ever but given the architecture
> > I think some degree of clunkiness would be unavoidable.

> It is only unavoidable if we decide to make a bad job of it.

I don't think the handling of the vector registers for KVM with SME is
something where there is a clear good and bad job we can do - I don't
see how we can reasonably avoid at some point needing to translate
vector lengths or to/from FPSIMD format (in the case of a system with
SME but not SVE) which is just inherently a sharp edge.  It's just a
question of when and how we do that.
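
For the FPSIMD case the translation itself is simple, since each V
register is the low 128 bits of the corresponding Z register.  A minimal
sketch, assuming flat byte buffers and made-up toy_* helper names (this
is not how the kernel represents the state):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_NUM_VREGS 32u
#define TOY_VREG_BYTES 16u	/* V registers are 128 bits */

/* Extract V0-V31 from Z registers packed at vl_bytes each. */
static void toy_sve_to_fpsimd(unsigned char *vregs,
			      const unsigned char *zregs, size_t vl_bytes)
{
	assert(vl_bytes >= TOY_VREG_BYTES && vl_bytes % TOY_VREG_BYTES == 0);

	for (unsigned int n = 0; n < TOY_NUM_VREGS; n++)
		memcpy(vregs + (size_t)n * TOY_VREG_BYTES,
		       zregs + (size_t)n * vl_bytes, TOY_VREG_BYTES);
}

/* Widen V0-V31 into Z registers at vl_bytes each, zeroing the upper bits. */
static void toy_fpsimd_to_sve(unsigned char *zregs, size_t vl_bytes,
			      const unsigned char *vregs)
{
	assert(vl_bytes >= TOY_VREG_BYTES && vl_bytes % TOY_VREG_BYTES == 0);

	memset(zregs, 0, (size_t)TOY_NUM_VREGS * vl_bytes);
	for (unsigned int n = 0; n < TOY_NUM_VREGS; n++)
		memcpy(zregs + (size_t)n * vl_bytes,
		       vregs + (size_t)n * TOY_VREG_BYTES, TOY_VREG_BYTES);
}
```

The copying is the easy part; the sharp edge is deciding at which point
in the save, restore and userspace access paths it happens.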


