From: Dan Williams <dan.j.williams@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Linux PCI <linux-pci@vger.kernel.org>,
linux-cxl@vger.kernel.org, Bjorn Helgaas <helgaas@kernel.org>,
"Weiny, Ira" <ira.weiny@intel.com>,
Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
Ben Widawsky <ben.widawsky@intel.com>,
Chris Browy <cbrowy@avery-design.com>,
Linux ACPI <linux-acpi@vger.kernel.org>,
"Schofield, Alison" <alison.schofield@intel.com>,
Vishal L Verma <vishal.l.verma@intel.com>,
Fangjian <f.fangjian@huawei.com>, Linuxarm <linuxarm@huawei.com>
Subject: Re: [PATCH v4 3/5] cxl/mem: Add CDAT table reading from DOE
Date: Thu, 10 Jun 2021 14:46:08 -0700 [thread overview]
Message-ID: <CAPcyv4hJ3e0jwTbfvg_bWvLdvk=4VO+b3JsQRj=CMezrfoEDbA@mail.gmail.com> (raw)
In-Reply-To: <20210524133938.2815206-4-Jonathan.Cameron@huawei.com>
On Mon, May 24, 2021 at 6:42 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> This patch provides a sysfs binary attribute to allow dumping of the whole
> table.
>
> Binary dumping is modeled on /sys/firmware/ACPI/tables/
>
> The ability to dump this table will be very useful for emulation of real
> devices once they become available as QEMU CXL type 3 device emulation will
> be able to load this file in.
>
> This does not support table updates at runtime. It will always provide
> whatever was there when first cached. Handling of table updates can be
> implemented later.
>
> Once we have more users, this code can move out to driver/cxl/cdat.c or
> similar.
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> drivers/cxl/Kconfig | 1 +
> drivers/cxl/cxl.h | 21 ++++++
> drivers/cxl/mem.c | 174 ++++++++++++++++++++++++++++++++++++++++++++
> drivers/cxl/mem.h | 6 ++
> 4 files changed, 202 insertions(+)
>
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 97dc4d751651..26cad9fa29f7 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -15,6 +15,7 @@ if CXL_BUS
>
> config CXL_MEM
> tristate "CXL.mem: Memory Devices"
> + select PCI_DOE
> help
> The CXL.mem protocol allows a device to act as a provider of
> "System RAM" and/or "Persistent Memory" that is fully coherent
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index d49e0cb679fa..e649a286aace 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -7,6 +7,7 @@
> #include <linux/bitfield.h>
> #include <linux/bitops.h>
> #include <linux/io.h>
> +#include <linux/pci-doe.h>
>
> /* CXL 2.0 8.2.8.1 Device Capabilities Array Register */
> #define CXLDEV_CAP_ARRAY_OFFSET 0x0
> @@ -69,5 +70,25 @@ struct cxl_regs {
> void cxl_setup_device_regs(struct device *dev, void __iomem *base,
> struct cxl_device_regs *regs);
>
> +/*
> + * Address space properties derived from:
> + * CXL 2.0 8.2.5.12.7 CXL HDM Decoder 0 Control Register
> + */
> +#define CXL_ADDRSPACE_RAM BIT(0)
> +#define CXL_ADDRSPACE_PMEM BIT(1)
> +#define CXL_ADDRSPACE_TYPE2 BIT(2)
> +#define CXL_ADDRSPACE_TYPE3 BIT(3)
> +#define CXL_ADDRSPACE_MASK GENMASK(3, 0)
Looks like this got picked up from a rebase... they're now decoder
flags, but no need to include them in the CDAT patch.
> +
> +#define CXL_DOE_PROTOCOL_COMPLIANCE 0
> +#define CXL_DOE_PROTOCOL_TABLE_ACCESS 2
> +
> +/* Common to request and response */
> +#define CXL_DOE_TABLE_ACCESS_3_CODE GENMASK(7, 0)
> +#define CXL_DOE_TABLE_ACCESS_3_CODE_READ 0
> +#define CXL_DOE_TABLE_ACCESS_3_TYPE GENMASK(15, 8)
> +#define CXL_DOE_TABLE_ACCESS_3_TYPE_CDAT 0
> +#define CXL_DOE_TABLE_ACCESS_3_ENTRY_HANDLE GENMASK(31, 16)
> +
> extern struct bus_type cxl_bus_type;
> #endif /* __CXL_H__ */
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index c5fdf2c57181..4224d1de311e 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -14,6 +14,7 @@
> #include "pci.h"
> #include "cxl.h"
> #include "mem.h"
> +#include "cdat.h"
>
> /**
> * DOC: cxl mem
> @@ -926,6 +927,85 @@ static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
> return 0;
> }
>
> +#define CDAT_DOE_REQ(entry_handle) \
> + (FIELD_PREP(CXL_DOE_TABLE_ACCESS_REQ_CODE, \
> + CXL_DOE_TABLE_ACCESS_REQ_CODE_READ) | \
> + FIELD_PREP(CXL_DOE_TABLE_ACCESS_TABLE_TYPE, \
> + CXL_DOE_TABLE_ACCESS_TABLE_TYPE_CDATA) | \
> + FIELD_PREP(CXL_DOE_TABLE_ACCESS_ENTRY_HANDLE, (entry_handle)))
> +
> +static ssize_t cdat_get_length(struct pci_doe *doe)
> +{
> + u32 cdat_request_pl = CDAT_DOE_REQ(0);
> + u32 cdat_response_pl[32];
> + struct pci_doe_exchange ex = {
> + .vid = PCI_DVSEC_VENDOR_ID_CXL,
> + .protocol = CXL_DOE_PROTOCOL_TABLE_ACCESS,
> + .request_pl = &cdat_request_pl,
> + .request_pl_sz = sizeof(cdat_request_pl),
> + .response_pl = cdat_response_pl,
> + .response_pl_sz = sizeof(cdat_response_pl),
> + };
> +
> + ssize_t rc;
> +
> + rc = pci_doe_exchange_sync(doe, &ex);
> + if (rc < 0)
> + return rc;
> + if (rc < 1)
> + return -EIO;
> +
> + return cdat_response_pl[1];
> +}
> +
> +static int cdat_to_buffer(struct pci_doe *doe, u32 *buffer, size_t length)
> +{
> + int entry_handle = 0;
> + int rc;
> +
> + do {
> + u32 cdat_request_pl = CDAT_DOE_REQ(entry_handle);
> + u32 cdat_response_pl[32];
> + struct pci_doe_exchange ex = {
> + .vid = PCI_DVSEC_VENDOR_ID_CXL,
> + .protocol = CXL_DOE_PROTOCOL_TABLE_ACCESS,
> + .request_pl = &cdat_request_pl,
> + .request_pl_sz = sizeof(cdat_request_pl),
> + .response_pl = cdat_response_pl,
> + .response_pl_sz = sizeof(cdat_response_pl),
> + };
> + size_t entry_dw;
> + u32 *entry;
> +
> + rc = pci_doe_exchange_sync(doe, &ex);
> + if (rc < 0)
> + return rc;
> +
> + entry = cdat_response_pl + 1;
I think:
entry = &cdat_response_pl[1];
...is less ambiguous, otherwise you need to backtrack and remember
that cdat_response_pl is an array not a struct. Perhaps a comment or a
symbol name for "1" would help here too?
> + entry_dw = rc / sizeof(u32);
> + /* Skip Header */
> + entry_dw -= 1;
> + entry_dw = min(length / 4, entry_dw);
sometimes sizeof(u32) sometimes "4"?
> + memcpy(buffer, entry, entry_dw * sizeof(u32));
> + length -= entry_dw * sizeof(u32);
Why not keep entry_dw in bytes, it seems this conversion to a word
count is causing more trouble than it is worth above.
> + buffer += entry_dw;
> + entry_handle = FIELD_GET(CXL_DOE_TABLE_ACCESS_ENTRY_HANDLE, cdat_response_pl[0]);
> +
> + } while (entry_handle != 0xFFFF);
Shouldn't this also break out if length goes to zero?
> +
> + return 0;
> +}
> +
> +static void cxl_mem_free_irq_vectors(void *data)
> +{
> + pci_free_irq_vectors(data);
> +}
> +
> +static void cxl_mem_doe_unregister_all(void *data)
> +{
> + pci_doe_unregister_all(data);
> +}
> +
> static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev, u32 reg_lo,
> u32 reg_hi)
> {
> @@ -933,6 +1013,7 @@ static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev, u32 reg_lo,
> struct cxl_mem *cxlm;
> void __iomem *regs;
> u64 offset;
> + int irqs;
> u8 bar;
> int rc;
>
> @@ -971,6 +1052,44 @@ static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev, u32 reg_lo,
> return NULL;
> }
>
> + /*
> + * An implementation of a cxl type3 device may support an unknown
> + * number of interrupts. Assume that number is not that large and
> + * request them all.
> + */
> + irqs = pci_msix_vec_count(pdev);
> + rc = pci_alloc_irq_vectors(pdev, irqs, irqs, PCI_IRQ_MSIX);
> + if (rc != irqs) {
> + /* No interrupt available - carry on */
> + dev_dbg(dev, "No interrupts available for DOE\n");
> + } else {
> + /*
> + * Enabling bus mastering could be done within the DOE
> + * initialization, but as it potentially has other impacts
> + * keep it within the driver.
> + */
> + pci_set_master(pdev);
> + rc = devm_add_action_or_reset(dev, cxl_mem_free_irq_vectors,
> + pdev);
> + if (rc)
> + return NULL;
> + }
> +
> + /*
> + * Find a DOE mailbox that supports CDAT.
> + * Supporting other DOE protocols will require more complexity.
> + */
> + rc = pci_doe_register_all(pdev);
> + if (rc < 0)
> + return NULL;
> +
> + rc = devm_add_action_or_reset(dev, cxl_mem_doe_unregister_all, pdev);
> + if (rc)
> + return NULL;
> +
> + cxlm->table_doe = pci_doe_find(pdev, PCI_DVSEC_VENDOR_ID_CXL,
> + CXL_DOE_PROTOCOL_TABLE_ACCESS);
> +
cxl_mem_create() is about allocating / initializing the @cxlm object.
I think the above belongs in its own hardware init function. Well the
interrupt init belongs in its own init function before
cxl_mem_create() and the DOE / CDAT registration can come sometime
later after cxl_mem_identify() succeeds. I.e. it's not until the
driver has successfully setup mailbox communications should it worry
about talking to the DOE.
> dev_dbg(dev, "Mapped CXL Memory Device resource\n");
> return cxlm;
> }
> @@ -1060,6 +1179,31 @@ static ssize_t pmem_size_show(struct device *dev, struct device_attribute *attr,
> return sysfs_emit(buf, "%#llx\n", len);
> }
>
> +static ssize_t CDAT_read(struct file *filp, struct kobject *kobj,
> + struct bin_attribute *bin_attr, char *buf,
> + loff_t offset, size_t count)
> +{
> + struct device *dev = kobj_to_dev(kobj);
> + struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +
> + return memory_read_from_buffer(buf, count, &offset, cxlmd->cdat_table,
> + cxlmd->cdat_length);
> +}
> +
> +static BIN_ATTR_RO(CDAT, 0);
> +
> +static umode_t cxl_memdev_bin_attr_is_visible(struct kobject *kobj,
> + struct bin_attribute *attr, int i)
> +{
> + struct device *dev = kobj_to_dev(kobj);
> + struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +
> + if ((attr == &bin_attr_CDAT) && cxlmd->cdat_table)
> + return 0400;
> +
> + return 0;
> +}
> +
> static struct device_attribute dev_attr_pmem_size =
> __ATTR(size, 0444, pmem_size_show, NULL);
>
> @@ -1069,6 +1213,11 @@ static struct attribute *cxl_memdev_attributes[] = {
> NULL,
> };
>
> +static struct bin_attribute *cxl_memdev_bin_attributes[] = {
> + &bin_attr_CDAT,
> + NULL,
> +};
> +
> static struct attribute *cxl_memdev_pmem_attributes[] = {
> &dev_attr_pmem_size.attr,
> NULL,
> @@ -1081,6 +1230,8 @@ static struct attribute *cxl_memdev_ram_attributes[] = {
>
> static struct attribute_group cxl_memdev_attribute_group = {
> .attrs = cxl_memdev_attributes,
> + .bin_attrs = cxl_memdev_bin_attributes,
> + .is_bin_visible = cxl_memdev_bin_attr_is_visible,
> };
>
> static struct attribute_group cxl_memdev_ram_attribute_group = {
> @@ -1158,6 +1309,25 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_mem *cxlm)
> return ERR_PTR(rc);
> }
>
> +static int cxl_cache_cdat_table(struct cxl_memdev *cxlmd)
> +{
> + struct cxl_mem *cxlm = cxlmd->cxlm;
> + struct device *dev = &cxlmd->dev;
> + ssize_t cdat_length;
> +
> + if (cxlm->table_doe == NULL)
> + return 0;
> +
> + cdat_length = cdat_get_length(cxlm->table_doe);
> + if (cdat_length < 0)
> + return cdat_length;
> +
> + cxlmd->cdat_length = cdat_length;
> + cxlmd->cdat_table = devm_kzalloc(dev->parent, cdat_length, GFP_KERNEL);
I'm not sure how big these CDATs can get, but I don't think there is
any requirement for the memory to be contiguous, so perhaps this
should be kvzalloc() to be kind to the page allocator.
> +
> + return cdat_to_buffer(cxlm->table_doe, cxlmd->cdat_table, cxlmd->cdat_length);
> +}
> +
> static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
> {
> struct cxl_memdev *cxlmd;
> @@ -1180,6 +1350,10 @@ static int cxl_mem_add_memdev(struct cxl_mem *cxlm)
> */
> cxlmd->cxlm = cxlm;
>
> + rc = cxl_cache_cdat_table(cxlmd);
> + if (rc)
> + goto err;
> +
> cdev = &cxlmd->cdev;
> rc = cdev_device_add(cdev, dev);
> if (rc)
> diff --git a/drivers/cxl/mem.h b/drivers/cxl/mem.h
> index 0a3f70316872..fb26155a8fb3 100644
> --- a/drivers/cxl/mem.h
> +++ b/drivers/cxl/mem.h
> @@ -38,12 +38,16 @@
> * @cdev: char dev core object for ioctl operations
> * @cxlm: pointer to the parent device driver data
> * @id: id number of this memdev instance.
> + * @cdat_table: cache of CDAT table
> + * @cdat_length: length of cached CDAT table
> */
> struct cxl_memdev {
> struct device dev;
> struct cdev cdev;
> struct cxl_mem *cxlm;
> int id;
> + void *cdat_table;
> + size_t cdat_length;
> };
>
> /**
> @@ -51,6 +55,7 @@ struct cxl_memdev {
> * @pdev: The PCI device associated with this CXL device.
> * @base: IO mappings to the device's MMIO
> * @cxlmd: Logical memory device chardev / interface
> + * @table_doe: Data exchange object mailbox used to read tables
> * @regs: Parsed register blocks
> * @payload_size: Size of space for payload
> * (CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register)
> @@ -65,6 +70,7 @@ struct cxl_mem {
> void __iomem *base;
> struct cxl_memdev *cxlmd;
>
> + struct pci_doe *table_doe;
> struct cxl_regs regs;
>
> size_t payload_size;
next prev parent reply other threads:[~2021-06-10 21:47 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-24 13:39 [PATCH v4 0/5] PCI Data Object Exchange support + CXL CDAT Jonathan Cameron
2021-05-24 13:39 ` [PATCH v4 1/5] PCI: Add vendor ID for the PCI SIG Jonathan Cameron
2021-06-10 15:17 ` Dan Williams
2021-06-10 17:39 ` Jonathan Cameron
2021-06-10 20:10 ` Dan Williams
2021-05-24 13:39 ` [PATCH v4 2/5] PCI/DOE: Add Data Object Exchange support Jonathan Cameron
2021-06-10 20:06 ` Dan Williams
2021-07-07 19:54 ` Dan Williams
2021-07-08 8:48 ` Lorenzo Pieralisi
2021-07-08 17:28 ` Jonathan Cameron
2021-05-24 13:39 ` [PATCH v4 3/5] cxl/mem: Add CDAT table reading from DOE Jonathan Cameron
2021-06-10 21:46 ` Dan Williams [this message]
2021-05-24 13:39 ` [PATCH v4 4/5] DONOTMERGE: PCI/DOE: Add per DOE chrdev for ioctl based access Jonathan Cameron
2021-05-25 10:26 ` kernel test robot
2021-05-24 13:39 ` [PATCH v4 5/5] DONOTMERGE: PCI/DOE: Add userspace example program to tools/pci Jonathan Cameron
2021-06-10 14:30 ` [PATCH v4 0/5] PCI Data Object Exchange support + CXL CDAT Jonathan Cameron
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAPcyv4hJ3e0jwTbfvg_bWvLdvk=4VO+b3JsQRj=CMezrfoEDbA@mail.gmail.com' \
--to=dan.j.williams@intel.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=ben.widawsky@intel.com \
--cc=cbrowy@avery-design.com \
--cc=f.fangjian@huawei.com \
--cc=helgaas@kernel.org \
--cc=ira.weiny@intel.com \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linuxarm@huawei.com \
--cc=lorenzo.pieralisi@arm.com \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).