linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "Weiny, Ira" <ira.weiny@intel.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Alison Schofield <alison.schofield@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Ben Widawsky <ben.widawsky@intel.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	linux-cxl@vger.kernel.org, Linux PCI <linux-pci@vger.kernel.org>
Subject: Re: [PATCH 2/5] PCI/DOE: Add Data Object Exchange Aux Driver
Date: Fri, 3 Dec 2021 17:56:17 -0600	[thread overview]
Message-ID: <20211203235617.GA3036259@bhelgaas> (raw)
In-Reply-To: <CAPcyv4hDTitnqasVwLTV4QPJqW_ykoJc+hRVRm8aLzG4xBxVag@mail.gmail.com>

On Fri, Dec 03, 2021 at 12:48:18PM -0800, Dan Williams wrote:
> On Tue, Nov 16, 2021 at 3:48 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Fri, Nov 05, 2021 at 04:50:53PM -0700, ira.weiny@intel.com wrote:
> > > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > >
> > > Introduced in a PCI ECN [1], DOE provides a config space based mailbox
> > > with standard protocol discovery.  Each mailbox is accessed through a
> > > DOE Extended Capability.
> > >
> > > Define an auxiliary device driver which control DOE auxiliary devices
> > > registered on the auxiliary bus.
> >
> > What do we gain by making this an auxiliary driver?
> >
> > This doesn't really feel like a "driver," and apparently it used to be
> > a library.  I'd like to see the rationale and benefits of the driver
> > approach (in the eventual commit log as well as the current email
> > thread).
> 
> I asked Ira to use the auxiliary bus for DOE primarily for the ABI it
> offers for userspace to manage kernel vs userspace access to a device.
> CONFIG_IO_STRICT_DEVMEM set the precedent that userspace can not
> clobber mmio space that is actively claimed by a kernel driver. I
> submit that DOE merits the same protection for DOE instances that the
> kernel consumes.
>
> Unlike other PCI configuration registers that root userspace has no
> reason to touch unless it wants to actively break things, DOE is a
> mechanism that root userspace may need to access directly in some
> cases. There are a few examples that come to mind.

It's useful for root to read/write config registers with setpci, e.g.,
to test ASPM configuration, test power management behavior, etc.  That
can certainly break things and interfere with kernel access (and IMO
should taint the kernel) but we have so far accepted that risk.  I
think the same will be true for DOE.

In addition, I would think you might want a safe userspace interface
via sysfs, e.g., something like the "vpd" file, but I missed that if
it was in this series.

> CXL Compliance Testing (see CXL 2.0 14.16.4 Compliance Mode DOE)
> offers a mechanism to set different test modes for a DOE device. The
> kernel has no reason to ever use that interface, and it has strong
> reasons to want to block access to it in production. However, hardware
> vendors also use debug Linux builds for hardware bringup. So I would
> like to be able to say that the mechanism to gain access to the
> compliance DOE is to detach the aux DOE driver from the right aux DOE
> device. Could we build a custom way to do the same for the DOE
> library, sure, but why re-invent the wheel when udev and the driver
> model can handle this type of policy question already?
> 
> Another use case is SPDM where an agent can establish a secure message
> passing channel to a device, or paravirtualized device to exchange
> protected messages with the hypervisor. My expectation is that in
> addition to the kernel establishing SPDM sessions for PCI IDE and
> CXL.cachemem IDE (link Integrity and Data Encryption) there will be
> use cases for root userspace to establish their own SPDM session. In
> that scenario as well the kernel can be told to give up control of a
> specific DOE instance by detaching the aux device for its driver, but
> otherwise the kernel driver can be assured that userspace will not
> clobber its communications with its own attempts to talk over the DOE.

I assume the kernel needs to control access to DOE in all cases,
doesn't it?  For example, DOE can generate interrupts, and only the
kernel can field them.  Maybe if I saw the userspace interface this
would make more sense to me.  I'm hoping there's a reasonable "send
this query and give me the response" primitive that can be implemented
in the kernel, used by drivers, and exposed safely to userspace.

> Lastly, and perhaps this is minor, the PCI core is a built-in object
> and aux-bus allows for breaking out device library functionality like
> this into a dedicated module. But yes, that's not a good reason unto
> itself because you could "auxify" almost anything past the point of
> reason just to get more modules.
> 
> > > A DOE mailbox is allowed to support any number of protocols while some
> > > DOE protocol specifications apply additional restrictions.
> >
> > This sounds something like a fancy version of VPD, and VPD has been a
> > huge headache.  I hope DOE avoids that ;)
> 
> Please say a bit more, I think DOE is a rather large headache as
> evidenced by us fine folks grappling with how to architect the Linux
> enabling.

VPD is not widely used, so gets poor testing.  The device doesn't tell
us how much VPD it has, so we have to read until the end.  The data is
theoretically self-describing (series of elements, each containing a
type and size), but of course some devices don't format it correctly,
so we read to the absolute limit, which takes a long time.

Typical contents of unimplemented or uninitialized VPD are all 0x00 or
all 0xff, neither of which is defined as a valid "end of VPD" signal,
so we have hacky "this doesn't look like VPD" code.

The PCI core doesn't need VPD itself and should only need to look for
the size of each element and the "end" tag, but it works around some
issues by doing more interpretation of the data.  The spec is a little
ambiguous and leaves room for vendors to use types not mentioned in
the spec.

Some devices share VPD hardware across functions but don't protect it
correctly (a hardware defect, granted).

VPD has address/data registers, so it requires locking, polling for
completion, and timeouts.

Bjorn

  reply	other threads:[~2021-12-03 23:56 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-05 23:50 [PATCH 0/5] CXL: Read CDAT and DSMAS data from the device ira.weiny
2021-11-05 23:50 ` [PATCH 1/5] PCI: Add vendor ID for the PCI SIG ira.weiny
2021-11-17 21:50   ` Bjorn Helgaas
2021-11-05 23:50 ` [PATCH 2/5] PCI/DOE: Add Data Object Exchange Aux Driver ira.weiny
2021-11-08 12:15   ` Jonathan Cameron
2021-11-10  5:45     ` Ira Weiny
2021-11-18 18:48       ` Jonathan Cameron
2021-11-16 23:48   ` Bjorn Helgaas
2021-12-03 20:48     ` Dan Williams
2021-12-03 23:56       ` Bjorn Helgaas [this message]
2021-12-04 15:47         ` Dan Williams
2021-12-06 12:27           ` Jonathan Cameron
2021-11-05 23:50 ` [PATCH 3/5] cxl/pci: Add DOE Auxiliary Devices ira.weiny
2021-11-08 13:09   ` Jonathan Cameron
2021-11-11  1:31     ` Ira Weiny
2021-11-11 11:53       ` Jonathan Cameron
2021-11-16 23:48   ` Bjorn Helgaas
2021-11-17 12:23     ` Jonathan Cameron
2021-11-17 22:15       ` Bjorn Helgaas
2021-11-18 10:51         ` Jonathan Cameron
2021-11-19  6:48         ` Christoph Hellwig
2021-11-29 23:37           ` Dan Williams
2021-11-29 23:59             ` Dan Williams
2021-11-30  6:42               ` Christoph Hellwig
2021-11-05 23:50 ` [PATCH 4/5] cxl/mem: Add CDAT table reading from DOE ira.weiny
2021-11-08 13:21   ` Jonathan Cameron
2021-11-08 23:19     ` Ira Weiny
2021-11-08 15:02   ` Jonathan Cameron
2021-11-08 22:25     ` Ira Weiny
2021-11-09 11:09       ` Jonathan Cameron
2021-11-19 14:40   ` Jonathan Cameron
2021-11-05 23:50 ` [PATCH 5/5] cxl/cdat: Parse out DSMAS data from CDAT table ira.weiny
2021-11-08 14:52   ` Jonathan Cameron
2021-11-11  3:58     ` Ira Weiny
2021-11-11 11:58       ` Jonathan Cameron
2021-11-18 17:02   ` Jonathan Cameron
2021-11-19 14:55   ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211203235617.GA3036259@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=alison.schofield@intel.com \
    --cc=ben.widawsky@intel.com \
    --cc=bhelgaas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).