linux-cxl.vger.kernel.org archive mirror
* RFC: Plumbers microconf topic: PCI DOE and related.
@ 2021-07-27 12:06 Jonathan Cameron
  2021-07-27 16:50 ` Vikram Sethi
  0 siblings, 1 reply; 3+ messages in thread
From: Jonathan Cameron @ 2021-07-27 12:06 UTC
  To: Dan Williams, Ben Widawsky, Chris Browy, Linux PCI, linux-cxl,
	Lorenzo Pieralisi, Bjorn Helgaas
  Cc: Krzysztof Wilczyński, linuxarm, Fangjian

Hi All,

There have been several mentions already of discussing some of the topics
around DOE mailboxes and the various things they enable at the upcoming
Plumbers VFIO/IOMMU/PCI microconf.

A few references:
https://lore.kernel.org/linux-pci/CAPcyv4i2ukD4ZQ_KfTaKXLyMakpSk=Y3_QJGV2P_PLHHVkPwFw@mail.gmail.com/
https://lore.kernel.org/linux-pci/20210520092205.000044ee@Huawei.com/

The intent of this email thread is to consolidate those suggestions into
a reasonable list of things to talk about (that I can then put into the
CFP system). Obviously Plumbers is still some time away and we will
"hopefully" resolve some of this stuff on this list before then. Also
open is who will lead this session if accepted.
(Perhaps Dan Williams + myself?)

Note this may be full of inaccuracies and (whilst I've tried not to) some
of my own opinions, so please do poke holes in it!

The latter parts are about CMA / SPDM for which a kernel RFC should be public
shortly (subject to summer holidays etc).

Quick background:

The elephant in the room for this topic is that there is ongoing related
specification work that we cannot discuss due to various confidentiality
rules. Everything in this email is based on published specs / public
discussions on the mailing lists.
Having said that, there is plenty to talk about today; we just might
need a round 2 next year :)

Terms:

DOE - Data Object Exchange (PCI ECN) https://pcisig.com
 * A mailbox in PCI config space.
 
CDAT - Coherent Device Attribute Table (UEFI hosted separate public spec)
 * Uses DOE mailbox to retrieve info on (CXL) EP such as bandwidth and
   latency of access to memory. 

CMA - Component Measurement and Authentication (PCI ECN) https://pcisig.com
 * Uses DMTF SPDM 1.1 based exchanges over DOE to authenticate EPs and
   carry out runtime measurements (kind of IMA for devices).

IDE - Integrity and Data Encryption (PCI ECN) https://pcisig.com
 * Link and selective (through switches) encryption.  Uses DOE / SPDM 1.1
   and builds on top of CMA.
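
As a rough illustration of what "a mailbox in PCI config space" carries, the
sketch below packs and parses a DOE data object header and builds a discovery
request. It is not taken from any kernel code. The field layout (Vendor ID in
bits 15:0 and Data Object Type in bits 23:16 of the first DWORD, length in
DWORDs in bits 17:0 of the second) and the PCI-SIG discovery IDs are stated
here as assumptions from a reading of the DOE ECN, and the function names are
hypothetical.

```python
# Assumed values from the DOE ECN; check against the spec before relying on them.
PCI_SIG_VENDOR_ID = 0x0001   # vendor ID used by the DOE discovery protocol
DOE_TYPE_DISCOVERY = 0x00    # data object type for DOE discovery


def build_doe_object(vendor_id, obj_type, payload_dwords):
    """Return a DOE data object as a list of DWORDs: 2 header DWORDs + payload."""
    length = 2 + len(payload_dwords)   # length counts the header, in DWORDs
    if length >= (1 << 18):
        # The maximum-size encoding is deliberately not handled in this sketch.
        raise ValueError("data object too long for this sketch")
    dw0 = (vendor_id & 0xFFFF) | ((obj_type & 0xFF) << 16)
    dw1 = length & 0x3FFFF
    return [dw0, dw1, *payload_dwords]


def parse_doe_header(dwords):
    """Return (vendor_id, obj_type, length_in_dwords) from the first two DWORDs."""
    vendor_id = dwords[0] & 0xFFFF
    obj_type = (dwords[0] >> 16) & 0xFF
    length = dwords[1] & 0x3FFFF
    return vendor_id, obj_type, length


def discovery_request(index):
    """Build a discovery request asking for the protocol at the given index."""
    return build_doe_object(PCI_SIG_VENDOR_ID, DOE_TYPE_DISCOVERY, [index & 0xFF])
```

CDAT, CMA and IDE key management would each use a different (vendor ID, type)
pair in the same header, which is why a single DOE instance can end up shared
between unrelated users.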

Open Questions / Problems:
1. Control which software entity uses DOE.
   It does not appear to be safe (as in not going to disrupt each other rather
   than security) for multiple software entities (Userspace, Kernel, TEE,
   Firmware) to access an individual DOE instance on a device without
   mediation.  Some DOE protocols have clear reasons for Linux kernel
   access (e.g. CDAT) others are more debatable.
   Even running the discovery protocol could disrupt other users. Hardening
   against such disruption is probably best effort only (no guarantees).
   Question is: How to prevent this?
    a) Userspace vs Kernel. Are there valid reasons for userspace to access
       a DOE? If so, how do we enable that? Does a per protocol approach
       make sense? Potential vendor defined protocols? Do we need to lock
       out 'developer' tools such as setpci - or do we let developers shoot
       themselves in the foot?
    b) OS vs lower levels / TEE. Do we need to propose a means of telling the OS
       to keep its hands off a DOE?  How to do it?
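
To make the mediation question in item 1 concrete, here is a minimal sketch of
one possible shape: a single mediator owns the mailbox, serializes whole
request/response exchanges with a lock, and applies a per-protocol allow list
per caller. This is only an illustration of the idea, not a proposal from this
thread and not a kernel interface. All class names are hypothetical, the
mailbox is a fake that just echoes its payload, and the (vendor ID, type) pairs
used for CDAT and CMA are assumptions.

```python
import threading

# Assumed (vendor ID, data object type) pairs, for illustration only.
CDAT = (0x1E98, 0x02)
CMA = (0x0001, 0x01)


class FakeDoeMailbox:
    """Stand-in for one DOE instance; counts exchanges that overlap."""

    def __init__(self):
        self.in_flight = False
        self.overlaps = 0
        self.completed = 0

    def raw_exchange(self, protocol, payload):
        if self.in_flight:
            self.overlaps += 1      # another user was mid-exchange
        self.in_flight = True
        response = list(payload)    # a real device would produce a response here
        self.completed += 1
        self.in_flight = False
        return response


class DoeMediator:
    """Serializes exchanges and enforces a per-caller protocol allow list."""

    def __init__(self, mailbox, allowed):
        self._mailbox = mailbox
        self._allowed = allowed     # caller name -> set of allowed protocols
        self._lock = threading.Lock()

    def exchange(self, caller, protocol, payload):
        if protocol not in self._allowed.get(caller, frozenset()):
            raise PermissionError(f"{caller} may not use DOE protocol {protocol}")
        with self._lock:            # one complete exchange at a time
            return self._mailbox.raw_exchange(protocol, payload)
```

Note the sketch only helps if every user goes through the mediator. Anything
that writes the config space registers directly (setpci, firmware, a TEE)
bypasses the lock, which is exactly the problem raised in a) and b).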

2. CMA support.
   Use cases for in-kernel CMA support and whether they are strong enough to
   justify native access (e.g. authentication of a VF from a VM, or systems not
   running any suitable lower level software / TEE).
   Key / Certificate management. This is somewhat like IMA, but we probably
   need to manage the certificate chain separately for each CMA/SPDM instance. 
   Understanding provisioning models would be useful to guide this work.

3. IDE support
   Is native kernel support worthwhile? Perhaps good to discuss
   potential usecases + get some idea on priority for this feature.

4. Potential blockers on merging emulation support in QEMU. (I'm less sure
   on this one, but perhaps worth briefly touching on or a separate
   session on emulation if people are interested? Ben, do you think this
   would be worthwhile?)

There are other minor questions we might slip into the discussion, time
allowing, such as the need for async handling support in the kernel DOE code.

For all these features, we have multiple layers on top of underlying PCI
so discussion of 'how' to support this might be useful.
1) Service model - detected at PCI subsystem level, services to drivers.
2) Driver initiated mode - library code, but per driver instantiation etc.

That's what I have come up with this morning, so please poke holes in it and
point out what I've forgotten about.

Note for an actual CFP proposal, I'll probably split this into at least two.
Topic 1: DOE only.  Topic 2: CMA / IDE. As there is a lot here, for some
topics we may be looking at introducing the topic + questions rather than
resolving everything on the day.

Thanks,

Jonathan

p.s. Perhaps it is a little unusual to have this level of 'planning' discussion
explicitly on list, but we are working under some unusual constraints
and inclusiveness and openness are always good anyway!



* RE: RFC: Plumbers microconf topic: PCI DOE and related.
  2021-07-27 12:06 RFC: Plumbers microconf topic: PCI DOE and related Jonathan Cameron
@ 2021-07-27 16:50 ` Vikram Sethi
  2021-07-28  8:56   ` Jonathan Cameron
  0 siblings, 1 reply; 3+ messages in thread
From: Vikram Sethi @ 2021-07-27 16:50 UTC
  To: Jonathan Cameron, Dan Williams, Ben Widawsky, Chris Browy,
	Linux PCI, linux-cxl, Lorenzo Pieralisi, Bjorn Helgaas
  Cc: Krzysztof Wilczyński, linuxarm, Fangjian, Natu, Mahesh,
	Varun Sampath

Hi Jonathan, 

> -----Original Message-----
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>


> Open Questions / Problems:
> 1. Control which software entity uses DOE.
>    It does not appear to be safe (as in not going to disrupt each other rather
>    than security) for multiple software entities (Userspace, Kernel, TEE,
>    Firmware) to access an individual DOE instance on a device without
>    mediation.  Some DOE protocols have clear reasons for Linux kernel
>    access (e.g. CDAT) others are more debatable.
>    Even running the discovery protocol could disrupt other users. Hardening
>    against such disruption is probably best effort only (no guarantees).
>    Question is: How to prevent this?
>     a) Userspace vs Kernel. Are there valid reasons for userspace to access
>        a DOE? If so do how do we enable that? Does a per protocol approach
>        make sense? Potential vendor defined protocols? Do we need to lock
>        out 'developer' tools such as setpci - or do we let developers shoot
>        themselves in the foot?
>     b) OS vs lower levels / TEE. Do we need to propose a means of telling the
>        OS to keep its hands off a DOE?  How to do it?
> 
> 2. CMA support.
>    Usecases for in kernel CMA support and whether strong enough to support
>    native access. (e.g. authentication of VF from a VM, or systems not running
>    any suitable lower level software / TEE)

Any time the device is reset, you'd want to measure again. I'd think every kernel
PF FLR/SBR/CXL reset needs to be followed by a measurement of the device
in kernel. Of course, this needs a bigger discussion on the plumbing/infrastructure
to report the measurement and attest that the post-reset measurements are valid.
Instead of native access, could it be mediated via an ACPI or UEFI runtime service?
It is not clear that ACPI/UEFI would be the appropriate mediator in all cases.

>    Key / Certificate management. This is somewhat like IMA, but we probably
>    need to manage the certificate chain separately for each CMA/SPDM
>    instance.
>    Understanding provisioning models would be useful to guide this work.
> 
> 3. IDE support
>    Is native kernel support worthwhile? Perhaps good to discuss
>    potential usecases + get some idea on priority for this feature.
> 
> 4. Potential blockers on merging emulation support in QEMU. (I'm less sure
>    on this one, but perhaps worth briefly touching on or a separate
>    session on emulation if people are interested? Ben, do you think this
>    would be worthwhile?)
> 
> There are other minor questions we might slip into the discussion, time
> allowing such as need for async support handling in the kernel DOE code.
> 
> For all these features, we have multiple layers on top of underlying PCI so
> discussion of 'how' to support this might be useful.
> 1) Service model - detected at PCI subsystem level, services to drivers.
> 2) Driver initiated mode - library code, but per driver instantiation etc.
> 
> That's what have come up with this morning, so please poke holes in it and
> point out what I've forgotten about.
> 
> Note for an actual CFP proposal, I'll probably split this into at least two.
> Topic 1: DOE only.  Topic 2: CMA / IDE. As there is a lot here, for some topics
> we may be looking at introduce the topic + questions rather than resolving
> everything on the day.
> 
> Thanks,
> 
> Jonathan
> 
> p.s. Perhaps it is a little unusual to have this level of 'planning' discussion
> explicitly on list, but we are working under some unusual constraints and
> inclusiveness and openness always good anyway!



* Re: RFC: Plumbers microconf topic: PCI DOE and related.
  2021-07-27 16:50 ` Vikram Sethi
@ 2021-07-28  8:56   ` Jonathan Cameron
  0 siblings, 0 replies; 3+ messages in thread
From: Jonathan Cameron @ 2021-07-28  8:56 UTC
  To: Vikram Sethi
  Cc: Dan Williams, Ben Widawsky, Chris Browy, Linux PCI, linux-cxl,
	Lorenzo Pieralisi, Bjorn Helgaas, Krzysztof Wilczyński,
	linuxarm, Fangjian, Natu, Mahesh, Varun Sampath

On Tue, 27 Jul 2021 16:50:05 +0000
Vikram Sethi <vsethi@nvidia.com> wrote:

> Hi Jonathan, 
> 
> > -----Original Message-----
> > From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>  
> 
> 
> > Open Questions / Problems:
> > 1. Control which software entity uses DOE.
> >    It does not appear to be safe (as in not going to disrupt each other rather
> >    than security) for multiple software entities (Userspace, Kernel, TEE,
> >    Firmware) to access an individual DOE instance on a device without
> >    mediation.  Some DOE protocols have clear reasons for Linux kernel
> >    access (e.g. CDAT) others are more debatable.
> >    Even running the discovery protocol could disrupt other users. Hardening
> >    against such disruption is probably best effort only (no guarantees).
> >    Question is: How to prevent this?
> >     a) Userspace vs Kernel. Are there valid reasons for userspace to access
> >        a DOE? If so do how do we enable that? Does a per protocol approach
> >        make sense? Potential vendor defined protocols? Do we need to lock
> >        out 'developer' tools such as setpci - or do we let developers shoot
> >        themselves in the foot?
> >     b) OS vs lower levels / TEE. Do we need to propose a means of telling the
> >        OS to keep its hands off a DOE?  How to do it?
> > 
> > 2. CMA support.
> >    Usecases for in kernel CMA support and whether strong enough to support
> >    native access. (e.g. authentication of VF from a VM, or systems not running
> >    any suitable lower level software / TEE)  
> 
> Any time the device is reset, you'd want to measure again. I'd think every kernel
> PF FLR/SBR/CXL reset needs to be followed by a measurement of the device
> In kernel. Of course needs bigger discussion on the plumbing/infrastructure
> to report the measurement and attest that the measurements post reset are valid.
> Instead of native access, could it be mediated via ACPI or UEFI runtime service?
> Not clear that ACPI/UEFI would be the appropriate mediator in all cases. 

Absolutely agree on re-measuring after reset.  There will be cases where it has
to be mediated by firmware of some type, as the same DOE can be in use for IDE
which may well be controlled by an entity other than the kernel.  However,
there are other cases where the kernel will probably want to do it directly
- particularly as CMA can exist for VFs. From the ECN:

"In other use cases it may be desirable to evaluate individual Functions.
 For example, when a Function is directly assigned to a Virtual Machine
 (VM) that VM can use CMA via DOE to confirm that the hardware element
 assigned to it meets the VM’s requirements. For such use cases, the
 security exchange with the individual Function is not required to match
 identically the results received from other Functions."

You also raise the question of measurement management, which is still very
much on the todo list. I'll add that to the CFP proposal as a possible
discussion topic.  My assumption so far is that it will look very much like IMA,
but I've not gotten down to the details.
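
The "measure again after every reset" flow discussed above can be sketched as
follows. This is a toy model under loud assumptions: the device is a fake, a
plain SHA-384 digest of a firmware blob stands in for a signed SPDM measurement
response, and the reference digest is assumed to come from some provisioning
step that is outside the sketch. None of the names correspond to real kernel
or SPDM library interfaces.

```python
import hashlib
import hmac


class FakeDevice:
    """Hypothetical device whose staged firmware only activates on reset."""

    def __init__(self, firmware, pending_firmware=None):
        self.firmware = firmware
        self.pending_firmware = pending_firmware

    def reset(self):
        # A reset may change what is running on the device.
        if self.pending_firmware is not None:
            self.firmware = self.pending_firmware
            self.pending_firmware = None

    def get_measurement(self):
        # Stand-in for an SPDM measurement exchange over DOE.
        return hashlib.sha384(self.firmware).digest()


def measurement_ok(device, reference_digest):
    """Compare the device's current measurement with the expected digest."""
    return hmac.compare_digest(device.get_measurement(), reference_digest)


def reset_and_remeasure(device, reference_digest):
    """Reset the device, then refuse to trust it unless it measures correctly."""
    device.reset()
    return measurement_ok(device, reference_digest)
```

The point of the model is that a measurement taken before the reset says
nothing about the device afterwards, so the check has to sit on the reset path
rather than only at probe time.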

> 
> >    Key / Certificate management. This is somewhat like IMA, but we probably
> >    need to manage the certificate chain separately for each CMA/SPDM
> >    instance.
> >    Understanding provisioning models would be useful to guide this work.
> > 
> > 3. IDE support
> >    Is native kernel support worthwhile? Perhaps good to discuss
> >    potential usecases + get some idea on priority for this feature.
> > 
> > 4. Potential blockers on merging emulation support in QEMU. (I'm less sure
> >    on this one, but perhaps worth briefly touching on or a separate
> >    session on emulation if people are interested? Ben, do you think this
> >    would be worthwhile?)
> > 
> > There are other minor questions we might slip into the discussion, time
> > allowing such as need for async support handling in the kernel DOE code.
> > 
> > For all these features, we have multiple layers on top of underlying PCI so
> > discussion of 'how' to support this might be useful.
> > 1) Service model - detected at PCI subsystem level, services to drivers.
> > 2) Driver initiated mode - library code, but per driver instantiation etc.
> > 
> > That's what have come up with this morning, so please poke holes in it and
> > point out what I've forgotten about.
> > 
> > Note for an actual CFP proposal, I'll probably split this into at least two.
> > Topic 1: DOE only.  Topic 2: CMA / IDE. As there is a lot here, for some topics
> > we may be looking at introduce the topic + questions rather than resolving
> > everything on the day.
> > 
> > Thanks,
> > 
> > Jonathan
> > 
> > p.s. Perhaps it is a little unusual to have this level of 'planning' discussion
> > explicitly on list, but we are working under some unusual constraints and
> > inclusiveness and openness always good anyway!  
> 

