From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Ira Weiny <ira.weiny@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
Alison Schofield <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
"Ben Widawsky" <ben.widawsky@intel.com>,
Bjorn Helgaas <bhelgaas@google.com>, <linux-cxl@vger.kernel.org>,
<linux-pci@vger.kernel.org>
Subject: Re: [PATCH 3/5] cxl/pci: Add DOE Auxiliary Devices
Date: Thu, 11 Nov 2021 11:53:19 +0000 [thread overview]
Message-ID: <20211111115319.000014a4@Huawei.com> (raw)
In-Reply-To: <20211111013122.GL3538886@iweiny-DESK2.sc.intel.com>
On Wed, 10 Nov 2021 17:31:23 -0800
Ira Weiny <ira.weiny@intel.com> wrote:
> On Mon, Nov 08, 2021 at 01:09:18PM +0000, Jonathan Cameron wrote:
> > On Fri, 5 Nov 2021 16:50:54 -0700
> > <ira.weiny@intel.com> wrote:
> >
> > > From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > >
> > > CXL devices have DOE mailboxes. Create auxiliary devices which can be
> > > driven by the generic DOE auxiliary driver.
> >
> > I'd like Bjorn's input on the balance here between what is done
> > in cxl/pci.c and what should be in the PCI core code somewhere.
> >
> > The tricky bit preventing this being done entirely as part of
> > PCI device instantiation is the interrupts.
> >
> > >
> > > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >
> > Mostly new code, so not sure I should really be listed on this
> > one but I don't mind either way.
> >
> > A few comments inline but overall this ended up nice and clean.
> >
> > >
> > > ---
> > > Changes from V4:
> > > Make this an Auxiliary Driver rather than library functions
> > > Split this out into it's own patch
> > > Base on the new cxl_dev_state structure
> > >
> > > Changes from Ben
> > > s/CXL_DOE_DEV_NAME/DOE_DEV_NAME/
> > > ---
> > > drivers/cxl/Kconfig | 1 +
> > > drivers/cxl/cxl.h | 13 +++++
> > > drivers/cxl/pci.c | 120 ++++++++++++++++++++++++++++++++++++++++++++
> > > 3 files changed, 134 insertions(+)
> > >
> > > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > > index 67c91378f2dd..9d53720bea07 100644
> > > --- a/drivers/cxl/Kconfig
> > > +++ b/drivers/cxl/Kconfig
> > > @@ -16,6 +16,7 @@ if CXL_BUS
> > > config CXL_MEM
> > > tristate "CXL.mem: Memory Devices"
> > > default CXL_BUS
> > > + select PCI_DOE_DRIVER
> > > help
> > > The CXL.mem protocol allows a device to act as a provider of
> > > "System RAM" and/or "Persistent Memory" that is fully coherent
> > > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > > index 5e2e93451928..f1241a7f2b7b 100644
> > > --- a/drivers/cxl/cxl.h
> > > +++ b/drivers/cxl/cxl.h
> > > @@ -75,6 +75,19 @@ static inline int cxl_hdm_decoder_count(u32 cap_hdr)
> > > #define CXLDEV_MBOX_BG_CMD_STATUS_OFFSET 0x18
> > > #define CXLDEV_MBOX_PAYLOAD_OFFSET 0x20
> > >
> > > +/*
> > > + * Address space properties derived from:
> > > + * CXL 2.0 8.2.5.12.7 CXL HDM Decoder 0 Control Register
> > > + */
> > > +#define CXL_ADDRSPACE_RAM BIT(0)
> > > +#define CXL_ADDRSPACE_PMEM BIT(1)
> > > +#define CXL_ADDRSPACE_TYPE2 BIT(2)
> > > +#define CXL_ADDRSPACE_TYPE3 BIT(3)
> > > +#define CXL_ADDRSPACE_MASK GENMASK(3, 0)
> >
> > Stray.
>
> Not sure what you mean here???
>
> There were a number of defines which were unused but I left them in.
>
> This came right out of your patch 3.
>
> https://lore.kernel.org/linux-cxl/20210524133938.2815206-4-Jonathan.Cameron@huawei.com/
>
> I can remove these defines if you want?
They don't have anything to do with DOE that I can see. Probably a side effect
of a merge that went wrong and I didn't notice!
> >
> > Normally after device_add() suceeded we only ever call device_del()
> > as per the docs for device_add()
> > https://elixir.bootlin.com/linux/latest/source/drivers/base/core.c#L3277
>
> I think you are miss reading that comment. Here auxiliary_device_add() has
> succeeded. Therefore both device_del() and put_device() must be called. In
> the case of auxiliary_device_add() failing we only call
> auxiliary_device_uninit() [put_device()].
>
> So I think this is correct.
>
> The other places I spot checked called device_del() _and_ put_device().
Yeah. I had that wrong. Ref counts will be wrong otherwise.
>
> >
> > > +}
> > > +
> > > +static DEFINE_IDA(cxl_doe_adev_ida);
> > > +static void __doe_dev_release(struct auxiliary_device *adev)
> > > +{
> > > + struct pci_doe_dev *doe_dev = container_of(adev, struct pci_doe_dev,
> > > + adev);
> > > +
> > > + ida_free(&cxl_doe_adev_ida, adev->id);
> > > + kfree(doe_dev);
> > > +}
> > > +
> > > +static void cxl_doe_dev_release(struct device *dev)
> > > +{
> > > + struct auxiliary_device *adev = container_of(dev,
> > > + struct auxiliary_device,
> > > + dev);
> > > + __doe_dev_release(adev);
> > > +}
> > > +
> > > +static int cxl_setup_doe_devices(struct cxl_dev_state *cxlds)
> >
> > Pass in the struct device, or maybe even the struct pci_dev as
> > nothing in here is using the cxl_dev_state.
>
> Ah yea can I leave this per the next patch? Or I can change it then change it
> to cxlds in the next patch. But I would rather leave it.
I think we will end up reworking this anyway, but maybe there will
still be a cxl_setup_doe_devices wrapper involved.
>
> >
> > > +{
> > > + struct device *dev = cxlds->dev;
> > > + struct pci_dev *pdev = to_pci_dev(dev);
> > > + int irqs, rc;
> > > + u16 pos = 0;
> > > +
> > > + /*
> > > + * An implementation of a cxl type3 device may support an unknown
> > > + * number of interrupts. Assume that number is not that large and
> > > + * request them all.
> > > + */
> > > + irqs = pci_msix_vec_count(pdev);
> > > + rc = pci_alloc_irq_vectors(pdev, irqs, irqs, PCI_IRQ_MSIX);
> > > + if (rc != irqs) {
> > > + /* No interrupt available - carry on */
> > > + dev_dbg(dev, "No interrupts available for DOE\n");
> > > + } else {
> > > + /*
> > > + * Enabling bus mastering could be done within the DOE
> > > + * initialization, but as it potentially has other impacts
> > > + * keep it within the driver.
> > > + */
> > > + pci_set_master(pdev);
> > > + rc = devm_add_action_or_reset(dev,
> > > + cxl_mem_free_irq_vectors,
> > > + pdev);
> > > + if (rc)
> > > + return rc;
> > > + }
> > > +
> >
> > Above here is driver specific...
> > Everything from here is is generic so perhaps move it to the PCI core?
> > Alternatively wait until we have users that aren't CXL.
>
> I'm still looking for where in the PCI core this would be appropriate to
> place...
Yeah, this needs Bjorn's input. One option would be to move from a soft
dependency to a hard one on the pci-doe module and just put this in there
as an exported utility function.
>
> >
> > > + pos = pci_find_next_ext_capability(pdev, pos, PCI_EXT_CAP_ID_DOE);
> > > +
> > > + while (pos > 0) {
> > > + struct auxiliary_device *adev;
> > > + struct pci_doe_dev *new_dev;
> > > + int id;
> > > +
> > > + new_dev = kzalloc(sizeof(*new_dev), GFP_KERNEL);
> > > + if (!new_dev)
> > > + return -ENOMEM;
> > > +
> > > + new_dev->pdev = pdev;
> > > + new_dev->cap_offset = pos;
> > > +
> > > + /* Set up struct auxiliary_device */
> > > + adev = &new_dev->adev;
> > > + id = ida_alloc(&cxl_doe_adev_ida, GFP_KERNEL);
> > > + if (id < 0) {
> > > + kfree(new_dev);
> > > + return -ENOMEM;
> > > + }
> > > +
> > > + adev->id = id;
> > > + adev->name = DOE_DEV_NAME;
> > > + adev->dev.release = cxl_doe_dev_release;
> > > + adev->dev.parent = dev;
> > > +
> > > + if (auxiliary_device_init(adev)) {
> > > + __doe_dev_release(adev);
> > > + return -EIO;
> > > + }
> > > +
> > > + if (auxiliary_device_add(adev)) {
> > > + auxiliary_device_uninit(adev);
> > > + return -EIO;
> > > + }
> > > +
> > > + rc = devm_add_action_or_reset(dev, cxl_destroy_doe_device, adev);
> > > + if (rc)
> > > + return rc;
> > > +
> > > + if (device_attach(&adev->dev) != 1)
> > > + dev_err(&adev->dev,
> > > + "Failed to attach a driver to DOE device %d\n",
> > > + adev->id);
> >
> > I wondered about this and how it would happen.
> > Given soft dependency only between the drivers it's possible but error or info?
> > I'd go with dev_info(). It is an error I'd bail out and used deferred probing
> > to try again when it will succeed.
>
> I made this dev_err() on purpose. And I don't know about the deferred probing.
> Maybe deferred probing on the CDAT read but even that I think is going to be a
> pain.
>
> The sequence I can think of is:
>
> cxl_pci loaded
> [finds all devices]
> [soft loads pci_doe]
> [device_attach works]
> Admin unloads pci_doe
> [hot-plug new device]
> [device_attach fails]
> [cdat will fail until driver is loaded]
>
> I spoke with Dan about this and while this is unfortunate it is what the user
> asked for. So I prefer dev_err() above to make sure that there is an
> indication of why this device is potentially not going to work.
I'm fine with it being dev_err(), but make it a hard error if it happens.
I don't like potentially not working, and would rather see definitely not
working in this case - so have the function return an error code.
Jonathan
>
> Thanks for the review,
> Ira
>
next prev parent reply other threads:[~2021-11-11 11:53 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-05 23:50 [PATCH 0/5] CXL: Read CDAT and DSMAS data from the device ira.weiny
2021-11-05 23:50 ` [PATCH 1/5] PCI: Add vendor ID for the PCI SIG ira.weiny
2021-11-17 21:50 ` Bjorn Helgaas
2021-11-05 23:50 ` [PATCH 2/5] PCI/DOE: Add Data Object Exchange Aux Driver ira.weiny
2021-11-08 12:15 ` Jonathan Cameron
2021-11-10 5:45 ` Ira Weiny
2021-11-18 18:48 ` Jonathan Cameron
2021-11-16 23:48 ` Bjorn Helgaas
2021-12-03 20:48 ` Dan Williams
2021-12-03 23:56 ` Bjorn Helgaas
2021-12-04 15:47 ` Dan Williams
2021-12-06 12:27 ` Jonathan Cameron
2021-11-05 23:50 ` [PATCH 3/5] cxl/pci: Add DOE Auxiliary Devices ira.weiny
2021-11-08 13:09 ` Jonathan Cameron
2021-11-11 1:31 ` Ira Weiny
2021-11-11 11:53 ` Jonathan Cameron [this message]
2021-11-16 23:48 ` Bjorn Helgaas
2021-11-17 12:23 ` Jonathan Cameron
2021-11-17 22:15 ` Bjorn Helgaas
2021-11-18 10:51 ` Jonathan Cameron
2021-11-19 6:48 ` Christoph Hellwig
2021-11-29 23:37 ` Dan Williams
2021-11-29 23:59 ` Dan Williams
2021-11-30 6:42 ` Christoph Hellwig
2021-11-05 23:50 ` [PATCH 4/5] cxl/mem: Add CDAT table reading from DOE ira.weiny
2021-11-08 13:21 ` Jonathan Cameron
2021-11-08 23:19 ` Ira Weiny
2021-11-08 15:02 ` Jonathan Cameron
2021-11-08 22:25 ` Ira Weiny
2021-11-09 11:09 ` Jonathan Cameron
2021-11-19 14:40 ` Jonathan Cameron
2021-11-05 23:50 ` [PATCH 5/5] cxl/cdat: Parse out DSMAS data from CDAT table ira.weiny
2021-11-08 14:52 ` Jonathan Cameron
2021-11-11 3:58 ` Ira Weiny
2021-11-11 11:58 ` Jonathan Cameron
2021-11-18 17:02 ` Jonathan Cameron
2021-11-19 14:55 ` Jonathan Cameron
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211111115319.000014a4@Huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=ben.widawsky@intel.com \
--cc=bhelgaas@google.com \
--cc=dan.j.williams@intel.com \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).