All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>,
	Ben Widawsky <ben.widawsky@intel.com>,
	linux-cxl@vger.kernel.org, Linux PCI <linux-pci@vger.kernel.org>,
	Alison Schofield <alison.schofield@intel.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Stuart Hayes <Stuart.Hayes@dell.com>
Subject: Re: [PATCH 20/23] cxl/port: Introduce a port driver
Date: Mon, 6 Dec 2021 20:56:35 -0600	[thread overview]
Message-ID: <20211207025635.GA9183@bhelgaas> (raw)
In-Reply-To: <CAPcyv4hU3ZUG_Psx7GKS-fQMf8_Rb-pn8Lubvnqqbr4Q8AzZcA@mail.gmail.com>

On Fri, Dec 03, 2021 at 05:24:34PM -0800, Dan Williams wrote:
> On Fri, Dec 3, 2021 at 2:03 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Thu, Dec 02, 2021 at 05:38:17PM -0800, Dan Williams wrote:

I'm cutting out a bunch of details, not because they're unimportant,
but because I don't know enough yet for them to make sense to me.

> > I guess software uses the CXL DVSEC to distinguish a CXL device
> > from a PCIe device, at least for 00.0.
> 
> driver/cxl/pci.c attaches to: "PCI_DEVICE_CLASS((PCI_CLASS_MEMORY_CXL
> << 8 | CXL_MEMORY_PROGIF), ~0)"
> 
> I am not aware of any restriction for that class code to appear at
> function0?

Maybe it's not an actual restriction; I'm reading CXL r2.0, sec 8.1.3,
where it says:

  The PCIe configuration space of Device 0, Function 0 shall include
  the CXL PCI Express Designated Vendor-Specific Extended Capability
  (DVSEC) as shown in Figure 126. The capability, status and control
  fields in Device 0, Function 0 DVSEC control the CXL functionality
  of the entire CXL device.
  ...
  Software may use the presence of this DVSEC to differentiate between
  a CXL device and a PCIe device. As such, a standard PCIe device must
  not expose this DVSEC.

Sections 9.11.5 and 9.12.2 also talk about looking for CXL DVSEC on
dev 0, func 0 to identify CXL devices.

> > Not sure about non-Dev 0 Func 0 devices; maybe this implies other
> > functions in the same device are part of the same CXL device?
> 
> DVSEC is really only telling us the layout of the MMIO register
> space. ...

And DVSEC also apparently tells us that "this is a CXL device, not
just an ordinary PCIe device"?  It's not clear to me how you identify
other PCIe functions that are also part of the same CXL device.

> > > The fact that the PCI core remains blissfully unaware of all these
> > > interdependencies is a feature because CXL is effectively a super-set
> > > of PCIe for fast-path CXL.mem operation, even though it is derivative
> > > for enumeration and slow-path manageability.
> >
> > I don't know how blissfully unaware PCI can be.  Can a user remove a
> > PCIe device that's part of a CXL device via sysfs?
> 
> Yes. If that PCIe device is an endpoint then the corresponding "mem"
> driver will get a ->remove() event because it is registered as a child
> of that parent PCIe device. That in turn will tear down the relevant
> part of the CXL port topology.

The CXL device may include several PCIe devices.  "mem" is a CXL
driver that's bound to one of them (?)  Is that what you mean by "mem"
being a "child of the the parent PCIe device"?

> However, a gap (deliberate design choice?) in the PCI hotplug
> implementation I currently see is an ability for an endpoint PCI
> driver to dynamically deny hotplug requests based on the state of the
> device. ...

PCI allows surprise remove, so drivers generally can't deny hot
unplugs.  PCIe *does* provide for an Electromechanical Interlock (see
PCI_EXP_SLTCTL_EIC), but we don't currently support it.

> > Is the PCIe device available for drivers to claim?
> 
> drivers/cxl/pci.c *is* a native PCI driver for CXL memory expanders.
> You might be thinking of CXL accelerator devices, where the CXL.cache
> and CXL.mem capabilities are incremental capabilities for the
> accelerator.  ...

No, I'm not nearly sophisticated enough to be thinking of specific
types of CXL things :)

> > Do we need coordination between cxl_driver_register() and
> > pci_register_driver()?
> 
> They do not collide, and I think this goes back to the concern about
> whether drivers/cxl/ is scanning for all CXL DVSECs. ...

Sorry, I don't remember what this concern was, and I don't know why
they don't collide.  I *would* know that if I knew that the set of
things cxl_driver_register() binds to doesn't intersect the set of
pci_devs, but it's not clear to me what those things are.

The PCI core enumerates devices by doing config reads of the Vendor ID
for every possible bus and device number.  It allocs a pci_dev for
each device it finds, and those are the things pci_register_driver()
binds drivers to based on Vendor ID, Device ID, etc.

How does CXL enumeration work?  Do you envision it being integrated
into PCI enumeration?  Does it build a list/tree/etc of cxl_devs?

cxl_driver_register() associates a driver with something.  What
exactly is the thing the driver is associated with?  A pci_dev?  A
cxl_dev?  Some kind of aggregate CXL device composed of several PCIe
devices?

I expected cxl_driver.probe() to take a "struct cxl_dev *" or similar,
but it takes a "struct device *".  I'm trying to apply my knowledge of
how other buses work in Linux, but obviously it's not working very
well.

> > Does CXL impose new constraints on PCI power management?
> 
> Recall that CXL is built such that it could be supported by a legacy
> operating system where platform firmware owns the setup of devices,
> just like DDR memory does not need a driver. This is where CXL 1.1
> played until CXL 2.0 added so much configuration complexity (hotplug,
> interleave, persistent memory) that it started to need OS help. The
> Linux PCIe PM code will not notice a difference, but behind the scenes
> the device, if it is offering coherent memory to the CPU, will be
> coordinating with the CPU like it was part of the CPU package and not
> a discrete device. I do not see new PM software enabling required in
> my reading of "Section 10.0 Power Management" of the CXL
> specification.

So if Linux PM decides to suspend a PCIe device that's part of a CXL
device and put it in D3hot, this is not a problem for CXL?  I guess if
a CXL driver binds to the PCIe device, it can control what PM will do.
But I thought CXL drivers would bind to a CXL thing, not a PCIe thing.

I see lots of mentions of LTR in sec 10, which really scares me
because I'm pretty confident that Linux handling of LTR is broken, and
I have no idea how to fix it.

> > > So, instead the system may be designed to have an NPEM in the
> > > upstream switch port and disable the NPEM instances in the device.
> > > Platform firmware decides which NPEM is in use.
> >
> > Since you didn't mention a firmware interface to communicate which
> > NPEM is in use, I assume firmware does this by preventing other
> > devices from advertising NPEM support?
> 
> That's also my assumption. If the OS sees a disabled NPEM in the
> topology it just needs to assume firmware knew what it was doing when
> it disabled it. I wish NPEM was better specified than "trust firmware
> to do the right thing via an ambiguous signal".

If we enumerate a device with a capability that is disabled, we
normally don't treat that as a signal that it cannot be enabled.
There are lots of enable bits in PCI capabilities, and I don't know of
any cases where Linux assumes "Enable == 0" means "don't use this
feature."  Absent some negotiation like _OSC or some documented
restriction, e.g., in the PCI Firmware spec, Linux normally just
enables features when it decides to use them.

Bjorn

  reply	other threads:[~2021-12-07  2:56 UTC|newest]

Thread overview: 133+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-20  0:02 [PATCH 00/23] Add drivers for CXL ports and mem devices Ben Widawsky
2021-11-20  0:02 ` [PATCH 01/23] cxl: Rename CXL_MEM to CXL_PCI Ben Widawsky
2021-11-22 14:47   ` Jonathan Cameron
2021-11-24  4:15   ` Dan Williams
2021-11-20  0:02 ` [PATCH 02/23] cxl: Flesh out register names Ben Widawsky
2021-11-22 14:49   ` Jonathan Cameron
2021-11-24  4:24   ` Dan Williams
2021-11-20  0:02 ` [PATCH 03/23] cxl/pci: Extract device status check Ben Widawsky
2021-11-22 15:03   ` Jonathan Cameron
2021-11-24 19:30   ` Dan Williams
2021-11-20  0:02 ` [PATCH 04/23] cxl/pci: Implement Interface Ready Timeout Ben Widawsky
2021-11-22 15:02   ` Jonathan Cameron
2021-11-22 17:17     ` Ben Widawsky
2021-11-22 17:53       ` Jonathan Cameron
2021-11-24 19:56         ` Dan Williams
2021-11-25  6:17           ` Ben Widawsky
2021-11-25  7:14             ` Dan Williams
2021-11-20  0:02 ` [PATCH 05/23] cxl/pci: Don't poll doorbell for mailbox access Ben Widawsky
2021-11-22 15:11   ` Jonathan Cameron
2021-11-22 17:24     ` Ben Widawsky
2021-11-24 21:55   ` Dan Williams
2021-11-29 18:33     ` Ben Widawsky
2021-11-29 19:02       ` Dan Williams
2021-11-29 19:11         ` Ben Widawsky
2021-11-29 19:18           ` Dan Williams
2021-11-29 19:31             ` Ben Widawsky
2021-11-29 19:37               ` Dan Williams
2021-11-29 19:50                 ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 06/23] cxl/pci: Don't check media status for mbox access Ben Widawsky
2021-11-22 15:19   ` Jonathan Cameron
2021-11-24 21:58   ` Dan Williams
2021-11-20  0:02 ` [PATCH 07/23] cxl/pci: Add new DVSEC definitions Ben Widawsky
2021-11-22 15:22   ` Jonathan Cameron
2021-11-22 17:32     ` Ben Widawsky
2021-11-24 22:03       ` Dan Williams
2021-11-20  0:02 ` [PATCH 08/23] cxl/acpi: Map component registers for Root Ports Ben Widawsky
2021-11-22 15:51   ` Jonathan Cameron
2021-11-22 19:28     ` Ben Widawsky
2021-11-24 22:18   ` Dan Williams
2021-11-20  0:02 ` [PATCH 09/23] cxl: Introduce module_cxl_driver Ben Widawsky
2021-11-22 15:54   ` Jonathan Cameron
2021-11-24 22:22   ` Dan Williams
2021-11-20  0:02 ` [PATCH 10/23] cxl/core: Convert decoder range to resource Ben Widawsky
2021-11-22 16:08   ` Jonathan Cameron
2021-11-24 22:41   ` Dan Williams
2021-11-20  0:02 ` [PATCH 11/23] cxl/core: Document and tighten up decoder APIs Ben Widawsky
2021-11-22 16:13   ` Jonathan Cameron
2021-11-24 22:55   ` Dan Williams
2021-11-20  0:02 ` [PATCH 12/23] cxl: Introduce endpoint decoders Ben Widawsky
2021-11-22 16:20   ` Jonathan Cameron
2021-11-22 19:37     ` Ben Widawsky
2021-11-25  0:07       ` Dan Williams
2021-11-29 20:05         ` Ben Widawsky
2021-11-29 20:07           ` Dan Williams
2021-11-29 20:12             ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 13/23] cxl/core: Move target population locking to caller Ben Widawsky
2021-11-22 16:33   ` Jonathan Cameron
2021-11-22 21:58     ` Ben Widawsky
2021-11-23 11:05       ` Jonathan Cameron
2021-11-25  0:34   ` Dan Williams
2021-11-20  0:02 ` [PATCH 14/23] cxl: Introduce topology host registration Ben Widawsky
2021-11-22 18:20   ` Jonathan Cameron
2021-11-22 22:30     ` Ben Widawsky
2021-11-25  1:09   ` Dan Williams
2021-11-29 21:23     ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 15/23] cxl/core: Store global list of root ports Ben Widawsky
2021-11-22 18:22   ` Jonathan Cameron
2021-11-22 22:32     ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 16/23] cxl/pci: Cache device DVSEC offset Ben Widawsky
2021-11-22 16:46   ` Jonathan Cameron
2021-11-22 22:34     ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 17/23] cxl: Cache and pass DVSEC ranges Ben Widawsky
2021-11-20  4:29   ` kernel test robot
2021-11-20  4:29     ` kernel test robot
2021-11-22 17:00   ` Jonathan Cameron
2021-11-22 22:50     ` Ben Widawsky
2021-11-26 11:37   ` Jonathan Cameron
2021-11-20  0:02 ` [PATCH 18/23] cxl/pci: Implement wait for media active Ben Widawsky
2021-11-22 17:03   ` Jonathan Cameron
2021-11-22 22:57     ` Ben Widawsky
2021-11-23 11:09       ` Jonathan Cameron
2021-11-23 16:04         ` Ben Widawsky
2021-11-23 17:48           ` Bjorn Helgaas
2021-11-23 19:37             ` Ben Widawsky
2021-11-26 11:36     ` Jonathan Cameron
2021-11-20  0:02 ` [PATCH 19/23] cxl/pci: Store component register base in cxlds Ben Widawsky
2021-11-20  7:28   ` kernel test robot
2021-11-20  7:28     ` kernel test robot
2021-11-22 17:11   ` Jonathan Cameron
2021-11-22 23:01     ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 20/23] cxl/port: Introduce a port driver Ben Widawsky
2021-11-20  3:14   ` kernel test robot
2021-11-20  3:14     ` kernel test robot
2021-11-20  5:38   ` kernel test robot
2021-11-20  5:38     ` kernel test robot
2021-11-22 17:41   ` Jonathan Cameron
2021-11-22 23:38     ` Ben Widawsky
2021-11-23 11:38       ` Jonathan Cameron
2021-11-23 16:14         ` Ben Widawsky
2021-11-23 18:21   ` Bjorn Helgaas
2021-11-23 22:03     ` Ben Widawsky
2021-11-23 22:36       ` Dan Williams
2021-11-23 23:38         ` Ben Widawsky
2021-11-23 23:55         ` Bjorn Helgaas
2021-11-24  0:40           ` Dan Williams
2021-11-24  6:33             ` Christoph Hellwig
2021-11-24  7:17               ` Dan Williams
2021-11-24  7:28                 ` Christoph Hellwig
2021-11-24  7:33                   ` Greg Kroah-Hartman
2021-11-24  7:54                     ` Dan Williams
2021-11-24  8:21                       ` Greg Kroah-Hartman
2021-11-24 18:24                         ` Dan Williams
2021-12-02 21:24                 ` Bjorn Helgaas
2021-12-03  1:38                   ` Dan Williams
2021-12-03 22:03                     ` Bjorn Helgaas
2021-12-04  1:24                       ` Dan Williams
2021-12-07  2:56                         ` Bjorn Helgaas [this message]
2021-12-07  4:48                           ` Dan Williams
2021-11-24 21:31       ` Bjorn Helgaas
2021-11-20  0:02 ` [PATCH 21/23] cxl: Unify port enumeration for decoders Ben Widawsky
2021-11-22 17:48   ` Jonathan Cameron
2021-11-22 23:44     ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 22/23] cxl/mem: Introduce cxl_mem driver Ben Widawsky
2021-11-20  0:40   ` Randy Dunlap
2021-11-21  3:55     ` Ben Widawsky
2021-11-22 18:17   ` Jonathan Cameron
2021-11-23  0:05     ` Ben Widawsky
2021-11-20  0:02 ` [PATCH 23/23] cxl/mem: Disable switch hierarchies for now Ben Widawsky
2021-11-22 18:19   ` Jonathan Cameron
2021-11-22 19:17     ` Ben Widawsky
2021-11-25 21:53 [PATCH 14/23] cxl: Introduce topology host registration kernel test robot
2021-11-29 11:42 ` Dan Carpenter
2021-11-29 11:42 ` Dan Carpenter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211207025635.GA9183@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=Stuart.Hayes@dell.com \
    --cc=alison.schofield@intel.com \
    --cc=ben.widawsky@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hch@lst.de \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=rafael@kernel.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.