All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sebastian Ene <sebastianene@google.com>
To: Will Deacon <will@kernel.org>
Cc: catalin.marinas@arm.com, james.morse@arm.com,
	jean-philippe@linaro.org, maz@kernel.org, oliver.upton@linux.dev,
	qperret@google.com, qwandor@google.com, sudeep.holla@arm.com,
	suzuki.poulose@arm.com, tabba@google.com, yuzenghui@huawei.com,
	lpieralisi@kernel.org, kvmarm@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH 1/4] KVM: arm64: Trap FFA_VERSION host call in pKVM
Date: Tue, 7 May 2024 09:17:15 +0000	[thread overview]
Message-ID: <Zjnxmym0GLdzl0uR@google.com> (raw)
In-Reply-To: <20240503162114.GA18789@willie-the-truck>

On Fri, May 03, 2024 at 05:21:14PM +0100, Will Deacon wrote:
> On Fri, May 03, 2024 at 03:29:12PM +0000, Sebastian Ene wrote:
> > On Fri, May 03, 2024 at 03:39:38PM +0100, Will Deacon wrote:
> > > On Thu, Apr 18, 2024 at 04:30:23PM +0000, Sebastian Ene wrote:
> > > > The pKVM hypervisor initializes with FF-A version 1.0. Keep the
> > > > supported version inside the host structure and prevent the host
> > > > drivers from overwriting the FF-A version with an increased version.
> > > > Without trapping the call, the host drivers can negotiate a higher
> > > > version number with TEE which can result in a different memory layout
> > > > described during the memory sharing calls.
> > > > 
> > > > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > > > ---
> > > >  arch/arm64/kvm/hyp/nvhe/ffa.c | 43 ++++++++++++++++++++++++++++++++---
> > > >  1 file changed, 40 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/arch/arm64/kvm/hyp/nvhe/ffa.c b/arch/arm64/kvm/hyp/nvhe/ffa.c
> > > > index 320f2eaa14a9..023712e8beeb 100644
> > > > --- a/arch/arm64/kvm/hyp/nvhe/ffa.c
> > > > +++ b/arch/arm64/kvm/hyp/nvhe/ffa.c
> > > > @@ -58,6 +58,7 @@ struct kvm_ffa_buffers {
> > > >  	hyp_spinlock_t lock;
> > > >  	void *tx;
> > > >  	void *rx;
> > > > +	u32 ffa_version;
> > > >  };
> > > 
> > > Why should this be part of 'struct kvm_ffa_buffers'? The host, proxy and
> > > Secure side will end up using the same version, so a simple global
> > > variable would suffice, no?
> > > 
> > I prefer keeping it here as we will have more clients in the future /
> > different VMs and each one of them will have its own version and its own
> > pair of buffers.
> 
> We can add that when we need to. Let's keep it simple for now, as the
> idea of the proxy having to support multiple versions of the spec at
> once sounds terrifying to me. I don't think we're going to want to
> re-marshall the data structures between the 1.0 and 1.1 formats, are we?
> 

I don't think we increase the complexity of the code by keeping this
argument in the structure. The code in nvhe/ffa.c supports marshalling
the structure as of [this
change](https://lore.kernel.org/r/20231005-ffa_v1-1_notif-v4-14-cddd3237809c@arm.com
) and that is why I was in favor of keeping the version where it belongs
to.

> > > > @@ -640,6 +641,39 @@ static bool do_ffa_features(struct arm_smccc_res *res,
> > > >  	return true;
> > > >  }
> > > >  
> > > > +static void do_ffa_version(struct arm_smccc_res *res,
> > > > +			   struct kvm_cpu_context *ctxt)
> > > > +{
> > > > +	DECLARE_REG(u32, ffa_req_version, ctxt, 1);
> > > > +	u32 current_version;
> > > > +
> > > > +	hyp_spin_lock(&host_buffers.lock);
> > > 
> > > Why do you need to take the lock for this?
> > > 
> > 
> > Because we interpret the host buffer content based on the version that we
> > end up setting here and each time we are accessing these buffers we are
> > protected by this lock.
> 
> I think that's indicative of a broader issue, though, which is that you
> allow for the version to be re-negotiated at runtime. The spec doesn't
> allow that and I don't think we should either.
> 

The spec talks about interopeartion in case two versions (x,y) and (a,b)
want to talk: 

- given the pairs (x,y) and (a,b) x=major, y=minor if x == a and y > b
  the versions are incompatible until y downgrades its version such that
  y <= b.

From this I drew the conclusion that the spec allows the re-negotiation
at runtime, please let me know if you see things differently.

> > > > +	/*
> > > > +	 * If the client driver tries to downgrade the version, we need to ask
> > > > +	 * first if TEE supports it.
> > > > +	 */
> > > > +	if (FFA_MINOR_VERSION(ffa_req_version) < FFA_MINOR_VERSION(current_version)) {
> > > 
> > > Similarly here, I don't think 'current_version' is what we should expose.
> > > Rather, we should be returning the highest version that the proxy
> > > supports in the host, which is 1.0 at this point in the patch series.
> > 
> > We already report the highest version that the proxy supports on line:
> > `res->a0 = current_version;`
> > 
> > 'current_version' is assigned during proxy initialization.
> > This check allows us to downgrade the supported ffa_version if the Host
> > requested it and only if TF-A supports it.
> 
> I don't think we want the host negotiating directly with the Secure side,
> though, do we? 'current_version' is initialised to whatever the Secure
> side gives us, so if we had something like:
> 
>   1. Proxy initialises, issues FFA_VERSION(1.0)

This will save 1.0 in host_buffers.ffa_version

>   2. Secure implements 1.2 and so returns 1.2 but remains in 1.0
>      compatability mode for the data structure formats.

Ack.

>   3. The host issues FFA_VERSION(1.1)

The call is trapped and we verify if the requested version
(FFA_VERSION(1.1) is smaller than our current_version saved in step 1.

Given that is not smaller we only reply with our current supported
version which is FFA_VERSION(1.0) and we return to the host.

>   4. The code above then issues FFA_VERSION(1.1) to Secure and it
>      switches over to the 1.1 data structures

This was happening prior to my patch, so in a way this patch is a bugfix.
We get this behavior without trapping and interpretting
of the FFA_VERSION API.

> 
> Did I get that right?
> 
> I really think we need to settle on a single version for the host,
> hypervisor and Secure and then stick with it following a single
> negotiation stage.
> 
> > > > +		arm_smccc_1_1_smc(FFA_VERSION, ffa_req_version, 0,
> > > > +				  0, 0, 0, 0, 0,
> > > > +				  res);
> > > 
> > > Hmm, I'm struggling to see how this is supposed to work per the spec.
> > > The FF-A spec says:
> > > 
> > >   | ... negotiation of the version must happen before an invocation of
> > >   | any other FF-A ABI.
> > 
> > I think that is a bit vague in my opinion but what I get is that the first call
> > executed should always be the get version ff-a call.
> > 
> > > 
> > > and:
> > > 
> > >   | Once the caller invokes any FF-A ABI other than FFA_VERSION, the
> > >   | version negotiation phase is complete.
> > >   |
> > >   | Once an FF-A version has been negotiated between a caller and a
> > >   | callee, the version may not be changed for the lifetime of the
> > >   | calling component. The callee must treat the negotiated version as
> > >   | the only supported version for any subsequent interactions with the
> > >   | caller.> 
> > > So by the time we get here, we've already settled on our version with
> > > the Secure side and the host cannot downgrade.
> > 
> > At this stage I think the spec didn't take into account that there can be a hypervisor
> > in between.
> 
> Well, that's what the spec says and I think we need to follow it. I can
> well imagine that the Secure side won't allow the version to be
> re-negotiated on the fly and I don't think we'd want that in the proxy,
> either.
> 
> > > That's a bit rubbish if you ask me, but I think it means we'll have to
> > > defer some of the proxy initialisation until the host calls FFA_VERSION,
> > > at which point we'll need to negotiate a common version between the host,
> > > the proxy and Secure. Once we've done that, our FFA_VERSION handler will
> > > just return that negotiated version.
> > 
> > We are already doing this when the ARM driver is built as an external
> > module. If it is not as an external module and is builtin we have a
> > bigger issue because it loads before pKVM at subsys_initcall. This means
> > that we won't trap FFA_MAP* and other setup calls.
> 
> Sorry, I don't follow. hyp_ffa_init() issues FFA_FEATURES immediately
> after FFA_VERSION, so we terminate the negotiation right away.

Sorry I confused you, I am afraid I was trying to desribe a different issue
here which is related to how early the ARM FF-A driver initializes when
is builtin - it is before the hypervisor proxy is installed.

> 
> Will
> 
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
> 

Thank you,
Seb


WARNING: multiple messages have this Message-ID (diff)
From: Sebastian Ene <sebastianene@google.com>
To: Will Deacon <will@kernel.org>
Cc: catalin.marinas@arm.com, james.morse@arm.com,
	jean-philippe@linaro.org, maz@kernel.org, oliver.upton@linux.dev,
	qperret@google.com, qwandor@google.com, sudeep.holla@arm.com,
	suzuki.poulose@arm.com, tabba@google.com, yuzenghui@huawei.com,
	lpieralisi@kernel.org, kvmarm@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH 1/4] KVM: arm64: Trap FFA_VERSION host call in pKVM
Date: Tue, 7 May 2024 09:17:15 +0000	[thread overview]
Message-ID: <Zjnxmym0GLdzl0uR@google.com> (raw)
In-Reply-To: <20240503162114.GA18789@willie-the-truck>

On Fri, May 03, 2024 at 05:21:14PM +0100, Will Deacon wrote:
> On Fri, May 03, 2024 at 03:29:12PM +0000, Sebastian Ene wrote:
> > On Fri, May 03, 2024 at 03:39:38PM +0100, Will Deacon wrote:
> > > On Thu, Apr 18, 2024 at 04:30:23PM +0000, Sebastian Ene wrote:
> > > > The pKVM hypervisor initializes with FF-A version 1.0. Keep the
> > > > supported version inside the host structure and prevent the host
> > > > drivers from overwriting the FF-A version with an increased version.
> > > > Without trapping the call, the host drivers can negotiate a higher
> > > > version number with TEE which can result in a different memory layout
> > > > described during the memory sharing calls.
> > > > 
> > > > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > > > ---
> > > >  arch/arm64/kvm/hyp/nvhe/ffa.c | 43 ++++++++++++++++++++++++++++++++---
> > > >  1 file changed, 40 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/arch/arm64/kvm/hyp/nvhe/ffa.c b/arch/arm64/kvm/hyp/nvhe/ffa.c
> > > > index 320f2eaa14a9..023712e8beeb 100644
> > > > --- a/arch/arm64/kvm/hyp/nvhe/ffa.c
> > > > +++ b/arch/arm64/kvm/hyp/nvhe/ffa.c
> > > > @@ -58,6 +58,7 @@ struct kvm_ffa_buffers {
> > > >  	hyp_spinlock_t lock;
> > > >  	void *tx;
> > > >  	void *rx;
> > > > +	u32 ffa_version;
> > > >  };
> > > 
> > > Why should this be part of 'struct kvm_ffa_buffers'? The host, proxy and
> > > Secure side will end up using the same version, so a simple global
> > > variable would suffice, no?
> > > 
> > I prefer keeping it here as we will have more clients in the future /
> > different VMs and each one of them will have its own version and its own
> > pair of buffers.
> 
> We can add that when we need to. Let's keep it simple for now, as the
> idea of the proxy having to support multiple versions of the spec at
> once sounds terrifying to me. I don't think we're going to want to
> re-marshall the data structures between the 1.0 and 1.1 formats, are we?
> 

I don't think we increase the complexity of the code by keeping this
argument in the structure. The code in nvhe/ffa.c supports marshalling
the structure as of [this
change](https://lore.kernel.org/r/20231005-ffa_v1-1_notif-v4-14-cddd3237809c@arm.com
) and that is why I was in favor of keeping the version where it belongs
to.

> > > > @@ -640,6 +641,39 @@ static bool do_ffa_features(struct arm_smccc_res *res,
> > > >  	return true;
> > > >  }
> > > >  
> > > > +static void do_ffa_version(struct arm_smccc_res *res,
> > > > +			   struct kvm_cpu_context *ctxt)
> > > > +{
> > > > +	DECLARE_REG(u32, ffa_req_version, ctxt, 1);
> > > > +	u32 current_version;
> > > > +
> > > > +	hyp_spin_lock(&host_buffers.lock);
> > > 
> > > Why do you need to take the lock for this?
> > > 
> > 
> > Because we interpret the host buffer content based on the version that we
> > end up setting here and each time we are accessing these buffers we are
> > protected by this lock.
> 
> I think that's indicative of a broader issue, though, which is that you
> allow for the version to be re-negotiated at runtime. The spec doesn't
> allow that and I don't think we should either.
> 

The spec talks about interopeartion in case two versions (x,y) and (a,b)
want to talk: 

- given the pairs (x,y) and (a,b) x=major, y=minor if x == a and y > b
  the versions are incompatible until y downgrades its version such that
  y <= b.

From this I drew the conclusion that the spec allows the re-negotiation
at runtime, please let me know if you see things differently.

> > > > +	/*
> > > > +	 * If the client driver tries to downgrade the version, we need to ask
> > > > +	 * first if TEE supports it.
> > > > +	 */
> > > > +	if (FFA_MINOR_VERSION(ffa_req_version) < FFA_MINOR_VERSION(current_version)) {
> > > 
> > > Similarly here, I don't think 'current_version' is what we should expose.
> > > Rather, we should be returning the highest version that the proxy
> > > supports in the host, which is 1.0 at this point in the patch series.
> > 
> > We already report the highest version that the proxy supports on line:
> > `res->a0 = current_version;`
> > 
> > 'current_version' is assigned during proxy initialization.
> > This check allows us to downgrade the supported ffa_version if the Host
> > requested it and only if TF-A supports it.
> 
> I don't think we want the host negotiating directly with the Secure side,
> though, do we? 'current_version' is initialised to whatever the Secure
> side gives us, so if we had something like:
> 
>   1. Proxy initialises, issues FFA_VERSION(1.0)

This will save 1.0 in host_buffers.ffa_version

>   2. Secure implements 1.2 and so returns 1.2 but remains in 1.0
>      compatability mode for the data structure formats.

Ack.

>   3. The host issues FFA_VERSION(1.1)

The call is trapped and we verify if the requested version
(FFA_VERSION(1.1) is smaller than our current_version saved in step 1.

Given that is not smaller we only reply with our current supported
version which is FFA_VERSION(1.0) and we return to the host.

>   4. The code above then issues FFA_VERSION(1.1) to Secure and it
>      switches over to the 1.1 data structures

This was happening prior to my patch, so in a way this patch is a bugfix.
We get this behavior without trapping and interpretting
of the FFA_VERSION API.

> 
> Did I get that right?
> 
> I really think we need to settle on a single version for the host,
> hypervisor and Secure and then stick with it following a single
> negotiation stage.
> 
> > > > +		arm_smccc_1_1_smc(FFA_VERSION, ffa_req_version, 0,
> > > > +				  0, 0, 0, 0, 0,
> > > > +				  res);
> > > 
> > > Hmm, I'm struggling to see how this is supposed to work per the spec.
> > > The FF-A spec says:
> > > 
> > >   | ... negotiation of the version must happen before an invocation of
> > >   | any other FF-A ABI.
> > 
> > I think that is a bit vague in my opinion but what I get is that the first call
> > executed should always be the get version ff-a call.
> > 
> > > 
> > > and:
> > > 
> > >   | Once the caller invokes any FF-A ABI other than FFA_VERSION, the
> > >   | version negotiation phase is complete.
> > >   |
> > >   | Once an FF-A version has been negotiated between a caller and a
> > >   | callee, the version may not be changed for the lifetime of the
> > >   | calling component. The callee must treat the negotiated version as
> > >   | the only supported version for any subsequent interactions with the
> > >   | caller.> 
> > > So by the time we get here, we've already settled on our version with
> > > the Secure side and the host cannot downgrade.
> > 
> > At this stage I think the spec didn't take into account that there can be a hypervisor
> > in between.
> 
> Well, that's what the spec says and I think we need to follow it. I can
> well imagine that the Secure side won't allow the version to be
> re-negotiated on the fly and I don't think we'd want that in the proxy,
> either.
> 
> > > That's a bit rubbish if you ask me, but I think it means we'll have to
> > > defer some of the proxy initialisation until the host calls FFA_VERSION,
> > > at which point we'll need to negotiate a common version between the host,
> > > the proxy and Secure. Once we've done that, our FFA_VERSION handler will
> > > just return that negotiated version.
> > 
> > We are already doing this when the ARM driver is built as an external
> > module. If it is not as an external module and is builtin we have a
> > bigger issue because it loads before pKVM at subsys_initcall. This means
> > that we won't trap FFA_MAP* and other setup calls.
> 
> Sorry, I don't follow. hyp_ffa_init() issues FFA_FEATURES immediately
> after FFA_VERSION, so we terminate the negotiation right away.

Sorry I confused you, I am afraid I was trying to desribe a different issue
here which is related to how early the ARM FF-A driver initializes when
is builtin - it is before the hypervisor proxy is installed.

> 
> Will
> 
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
> 

Thank you,
Seb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2024-05-07  9:17 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-18 16:30 [PATCH 0/4] KVM: arm64: pKVM host proxy FF-A fixes Sebastian Ene
2024-04-18 16:30 ` Sebastian Ene
2024-04-18 16:30 ` [PATCH 1/4] KVM: arm64: Trap FFA_VERSION host call in pKVM Sebastian Ene
2024-04-18 16:30   ` Sebastian Ene
2024-05-03 14:39   ` Will Deacon
2024-05-03 14:39     ` Will Deacon
2024-05-03 15:29     ` Sebastian Ene
2024-05-03 15:29       ` Sebastian Ene
2024-05-03 16:21       ` Will Deacon
2024-05-03 16:21         ` Will Deacon
2024-05-07  9:17         ` Sebastian Ene [this message]
2024-05-07  9:17           ` Sebastian Ene
2024-04-18 16:30 ` [PATCH 2/4] KVM: arm64: Add support for FFA_PARTITION_INFO_GET Sebastian Ene
2024-04-18 16:30   ` Sebastian Ene
2024-05-03 15:01   ` Will Deacon
2024-05-03 15:01     ` Will Deacon
2024-05-15 15:40     ` Sebastian Ene
2024-05-15 15:40       ` Sebastian Ene
2024-04-18 16:30 ` [PATCH 3/4] KVM: arm64: Fix the identification range for the FF-A smcs Sebastian Ene
2024-04-18 16:30   ` Sebastian Ene
2024-05-03 15:07   ` Will Deacon
2024-05-03 15:07     ` Will Deacon
2024-04-18 16:30 ` [PATCH 4/4] KVM: arm64: Use FF-A 1.1 with pKVM Sebastian Ene
2024-04-18 16:30   ` Sebastian Ene

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zjnxmym0GLdzl0uR@google.com \
    --to=sebastianene@google.com \
    --cc=catalin.marinas@arm.com \
    --cc=james.morse@arm.com \
    --cc=jean-philippe@linaro.org \
    --cc=kernel-team@android.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=qperret@google.com \
    --cc=qwandor@google.com \
    --cc=sudeep.holla@arm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=tabba@google.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.