From: Dan Williams <dan.j.williams@intel.com>
To: Ben Widawsky <ben.widawsky@intel.com>
Cc: linux-cxl@vger.kernel.org,
	Alison Schofield <alison.schofield@intel.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Vishal Verma <vishal.l.verma@intel.com>
Subject: Re: [PATCH 07/13] cxl/memdev: Determine CXL.mem capability
Date: Fri, 10 Sep 2021 14:59:29 -0700
Message-ID: <CAPcyv4gKBcdQYAU2qJKDbheaK-SFbOUmNPS+eNB5sbY+B644rQ@mail.gmail.com>
In-Reply-To: <20210902195017.2516472-8-ben.widawsky@intel.com>

On Thu, Sep 2, 2021 at 12:50 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> If the "upstream" port of the endpoint is an enumerated downstream CXL
> port, and the device itself is CXL capable and enabled, the memdev
> driver can bind. This binding is useful for region configuration/creation
> because it provides a clean way for the region code to determine if the
> memdev is actually CXL capable.
>
> A memdev/hostbridge probe race is solved with a full CXL bus rescan at
> the end of ACPI probing (see comment in code for details). Switch
> enumeration will be done as a follow-on patch. As a result, if a switch
> is in the topology the memdev driver will not bind to any devices.
>
> CXL.mem capability is checked lazily at the time a region is bound.
> This is in line with the other configuration parameters.
>
> Below is an example (mem0, and mem1) of CXL memdev devices that now
> exist on the bus.
>
> /sys/bus/cxl/devices/
> ├── decoder0.0 -> ../../../devices/platform/ACPI0017:00/root0/decoder0.0
> ├── mem0 -> ../../../devices/pci0000:34/0000:34:01.0/0000:36:00.0/mem0
> ├── mem1 -> ../../../devices/pci0000:34/0000:34:00.0/0000:35:00.0/mem1

I'm confused, this isn't showing anything that did not already exist
before this patch, is it? What I think would be a useful shortcut is
for memX devices to have an attribute that links back to their cxl
root port after validation completes. For example, an attribute group
that appears when the driver successfully binds and disappears when it
unbinds.

> ├── pmem0 -> ../../../devices/pci0000:34/0000:34:01.0/0000:36:00.0/mem0/pmem0
> ├── pmem1 -> ../../../devices/pci0000:34/0000:34:00.0/0000:35:00.0/mem1/pmem1
> ├── port1 -> ../../../devices/platform/ACPI0017:00/root0/port1
> └── root0 -> ../../../devices/platform/ACPI0017:00/root0
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  drivers/cxl/acpi.c        | 27 +++++++-----------
>  drivers/cxl/core/bus.c    | 60 +++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/memdev.c |  6 ++++
>  drivers/cxl/cxl.h         |  2 ++
>  drivers/cxl/cxlmem.h      |  2 ++
>  drivers/cxl/mem.c         | 55 ++++++++++++++++++++++++++++++++++-
>  drivers/cxl/pci.c         | 23 ---------------
>  drivers/cxl/pci.h         |  7 ++++-
>  8 files changed, 141 insertions(+), 41 deletions(-)
>
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index 7130beffc929..fd14094bdb3f 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -240,21 +240,6 @@ __mock int match_add_root_ports(struct pci_dev *pdev, void *data)
>         return 0;
>  }
>
> -static struct cxl_dport *find_dport_by_dev(struct cxl_port *port, struct device *dev)
> -{
> -       struct cxl_dport *dport;
> -
> -       device_lock(&port->dev);
> -       list_for_each_entry(dport, &port->dports, list)
> -               if (dport->dport == dev) {
> -                       device_unlock(&port->dev);
> -                       return dport;
> -               }
> -
> -       device_unlock(&port->dev);
> -       return NULL;
> -}
> -
>  __mock struct acpi_device *to_cxl_host_bridge(struct device *host,
>                                               struct device *dev)
>  {
> @@ -459,9 +444,19 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>         if (rc)
>                 goto out;
>
> -       if (IS_ENABLED(CONFIG_CXL_PMEM))
> +       if (IS_ENABLED(CONFIG_CXL_PMEM)) {
>                 rc = device_for_each_child(&root_port->dev, root_port,
>                                            add_root_nvdimm_bridge);
> +               if (rc)
> +                       goto out;
> +       }
> +
> +       /*
> +        * While ACPI is scanning hostbridge ports, switches and memory devices
> +        * may have been probed. Those devices will need to know whether the
> +        * hostbridge is CXL capable.
> +        */
> +       rc = bus_rescan_devices(&cxl_bus_type);

I don't think it's a good idea to call bus_rescan_devices() from
probe() context. This now sets up a locking dependency between the
ACPI0017 device-lock and all the device-locks for every device on the
cxl-bus. This is why the nvdimm code punts the rescan outside the lock
to a workqueue.

Lockdep unfortunately won't complain about device-lock entanglements.
One item for the backlog is to add device-lock validation to the cxl
subsystem, a la commit 87a30e1f05d7 ("driver-core, libnvdimm: Let device
subsystems add local lockdep coverage").
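
To make the ordering concern concrete, here is a minimal user-space
model of "punt the rescan to a workqueue". It is not kernel code: every
model_* name is hypothetical, and the "lock" is just a flag standing in
for the device-lock held across probe(). In the kernel the equivalent
primitives would be a work_struct queued with schedule_work() or
queue_work().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these names exist in the kernel. */
typedef void (*model_work_fn)(void);

#define MODEL_MAX_WORK 4

static model_work_fn model_pending[MODEL_MAX_WORK];
static size_t model_npending;
static bool model_probe_lock_held;     /* models the ACPI0017 device-lock */
static bool model_rescan_ran_unlocked; /* set by the rescan for inspection */

/* Queue a work item to run later, outside the caller's locking context. */
static void model_queue_work(model_work_fn fn)
{
	assert(model_npending < MODEL_MAX_WORK);
	model_pending[model_npending++] = fn;
}

/* Models the bus rescan: records whether the probe lock was free. */
static void model_rescan(void)
{
	model_rescan_ran_unlocked = !model_probe_lock_held;
}

/* Models probe(): the lock is held for the whole call, so only queue. */
static int model_probe(void)
{
	model_probe_lock_held = true;
	model_queue_work(model_rescan); /* deferred, not called directly */
	model_probe_lock_held = false;
	return 0;
}

/* Models the worker: drains the queue after probe() has returned. */
static void model_run_pending(void)
{
	size_t i;

	for (i = 0; i < model_npending; i++)
		model_pending[i]();
	model_npending = 0;
}
```

Calling model_probe() and then model_run_pending() runs the rescan only
after the probe "lock" has been dropped, which is the property the
direct call from probe() gives up.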


>
>  out:
>         acpi_put_table(acpi_cedt);
> diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c
> index 256e55dc2a3b..56f57302d27b 100644
> --- a/drivers/cxl/core/bus.c
> +++ b/drivers/cxl/core/bus.c
> @@ -8,6 +8,7 @@
>  #include <linux/idr.h>
>  #include <cxlmem.h>
>  #include <cxl.h>
> +#include <pci.h>
>  #include "core.h"
>
>  /**
> @@ -259,6 +260,12 @@ static const struct device_type cxl_port_type = {
>         .groups = cxl_port_attribute_groups,
>  };
>
> +bool is_cxl_port(struct device *dev)
> +{
> +       return dev->type == &cxl_port_type;
> +}
> +EXPORT_SYMBOL_GPL(is_cxl_port);
> +
>  struct cxl_port *to_cxl_port(struct device *dev)
>  {
>         if (dev_WARN_ONCE(dev, dev->type != &cxl_port_type,
> @@ -266,6 +273,7 @@ struct cxl_port *to_cxl_port(struct device *dev)
>                 return NULL;
>         return container_of(dev, struct cxl_port, dev);
>  }
> +EXPORT_SYMBOL_GPL(to_cxl_port);
>
>  static void unregister_port(void *_port)
>  {
> @@ -424,6 +432,27 @@ static int add_dport(struct cxl_port *port, struct cxl_dport *new)
>         return dup ? -EEXIST : 0;
>  }
>
> +/**
> + * find_dport_by_dev - gets downstream CXL port from a struct device
> + * @port: cxl [upstream] port that "owns" the downstream port is being queried
> + * @dev: The device that is backing the downstream port
> + */
> +struct cxl_dport *find_dport_by_dev(struct cxl_port *port, const struct device *dev)
> +{
> +       struct cxl_dport *dport;
> +
> +       device_lock(&port->dev);
> +       list_for_each_entry(dport, &port->dports, list)
> +               if (dport->dport == dev) {
> +                       device_unlock(&port->dev);
> +                       return dport;
> +               }
> +
> +       device_unlock(&port->dev);
> +       return NULL;
> +}
> +EXPORT_SYMBOL_GPL(find_dport_by_dev);

This wants to move into the "cxl_" prefix symbol namespace if it's now
going to be a public function.

> +
>  /**
>   * cxl_add_dport - append downstream port data to a cxl_port
>   * @port: the cxl_port that references this dport
> @@ -596,6 +625,37 @@ int cxl_decoder_autoremove(struct device *host, struct cxl_decoder *cxld)
>  }
>  EXPORT_SYMBOL_GPL(cxl_decoder_autoremove);
>
> +/**
> + * cxl_pci_dvsec - Gets offset for the given DVSEC id
> + * @pdev: PCI device to search for the DVSEC
> + * @dvsec: DVSEC id to look for
> + *
> + * Return: offset within the PCI header for the given DVSEC id. 0 if not found
> + */
> +int cxl_pci_dvsec(struct pci_dev *pdev, int dvsec)
> +{
> +       int pos;
> +
> +       pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_DVSEC);
> +       if (!pos)
> +               return 0;
> +
> +       while (pos) {
> +               u16 vendor, id;
> +
> +               pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER1, &vendor);
> +               pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER2, &id);
> +               if (vendor == PCI_DVSEC_VENDOR_ID_CXL && dvsec == id)
> +                       return pos;
> +
> +               pos = pci_find_next_ext_capability(pdev, pos,
> +                                                  PCI_EXT_CAP_ID_DVSEC);
> +       }
> +
> +       return 0;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mem_dvsec);

Why not keep this enumeration in cxl_pci and have it record the
component register block base address at cxl_memdev creation time?
This would make it similar to cxl_port creation that takes a
component_register base address argument.
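
For reference, the lookup under discussion amounts to a walk of the
PCIe extended capability list. The following is a user-space sketch
over a plain byte array rather than real config space; the model_*
names are hypothetical, and the layout (extended capabilities start at
0x100, next pointer in bits 31:20 of the header, DVSEC capability ID
0x23, vendor ID in the low 16 bits of DVSEC header 1, DVSEC ID in the
low 16 bits of header 2) is taken from the PCIe specification rather
than from this patch.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_CFG_SIZE        4096
#define MODEL_EXT_CAP_START   0x100  /* first extended capability header */
#define MODEL_EXT_CAP_DVSEC   0x23   /* DVSEC extended capability ID */
#define MODEL_DVSEC_HEADER1   0x4    /* vendor ID in bits 15:0 */
#define MODEL_DVSEC_HEADER2   0x8    /* DVSEC ID in bits 15:0 */
#define MODEL_VENDOR_ID_CXL   0x1E98

/* Read a little-endian dword from the modelled config space. */
static uint32_t model_cfg_read32(const uint8_t *cfg, unsigned int off)
{
	assert(off + 4 <= MODEL_CFG_SIZE);
	return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Write a little-endian dword; only used to build test fixtures. */
static void model_cfg_write32(uint8_t *cfg, unsigned int off, uint32_t val)
{
	assert(off + 4 <= MODEL_CFG_SIZE);
	cfg[off] = val & 0xff;
	cfg[off + 1] = (val >> 8) & 0xff;
	cfg[off + 2] = (val >> 16) & 0xff;
	cfg[off + 3] = (val >> 24) & 0xff;
}

/* Return the offset of the CXL DVSEC with the given ID, or 0 if absent. */
static unsigned int model_find_cxl_dvsec(const uint8_t *cfg, uint16_t dvsec_id)
{
	unsigned int pos = MODEL_EXT_CAP_START;
	int ttl = (MODEL_CFG_SIZE - MODEL_EXT_CAP_START) / 8; /* loop guard */

	while (pos >= MODEL_EXT_CAP_START && pos <= MODEL_CFG_SIZE - 12 &&
	       ttl-- > 0) {
		uint32_t hdr = model_cfg_read32(cfg, pos);

		if ((hdr & 0xffff) == MODEL_EXT_CAP_DVSEC) {
			uint16_t vendor = model_cfg_read32(cfg, pos + MODEL_DVSEC_HEADER1) & 0xffff;
			uint16_t id = model_cfg_read32(cfg, pos + MODEL_DVSEC_HEADER2) & 0xffff;

			if (vendor == MODEL_VENDOR_ID_CXL && id == dvsec_id)
				return pos;
		}
		pos = hdr >> 20; /* next capability offset; 0 ends the list */
	}
	return 0;
}
```

Whichever driver performs this walk, the result can be computed once
and carried along in the created object, so later consumers never need
to touch PCI config space themselves.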

> +
>  /**
>   * __cxl_driver_register - register a driver for the cxl bus
>   * @cxl_drv: cxl driver structure to attach
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index c9dd054bd813..0068b5ff5f3e 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -337,3 +337,9 @@ void cxl_memdev_exit(void)
>  {
>         unregister_chrdev_region(MKDEV(cxl_mem_major, 0), CXL_MEM_MAX_DEVS);
>  }
> +
> +bool is_cxl_mem_capable(struct cxl_memdev *cxlmd)
> +{
> +       return !!cxlmd->dev.driver;
> +}
> +EXPORT_SYMBOL_GPL(is_cxl_mem_capable);

Perhaps:

s/capable/{enabled,routed}/

The device is always capable, it's the hierarchy that will let it down.

> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index b48bdbefd949..a168520d741b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -283,8 +283,10 @@ struct cxl_port *devm_cxl_add_port(struct device *host, struct device *uport,
>                                    resource_size_t component_reg_phys,
>                                    struct cxl_port *parent_port);
>
> +bool is_cxl_port(struct device *dev);
>  int cxl_add_dport(struct cxl_port *port, struct device *dport, int port_id,
>                   resource_size_t component_reg_phys);
> +struct cxl_dport *find_dport_by_dev(struct cxl_port *port, const struct device *dev);
>
>  struct cxl_decoder *to_cxl_decoder(struct device *dev);
>  bool is_root_decoder(struct device *dev);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 811b24451604..88264204c4b9 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -51,6 +51,8 @@ static inline struct cxl_memdev *to_cxl_memdev(struct device *dev)
>  struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>                                        struct cxl_mem *cxlm);
>
> +bool is_cxl_mem_capable(struct cxl_memdev *cxlmd);
> +
>  /**
>   * struct cxl_mbox_cmd - A command to be submitted to hardware.
>   * @opcode: (input) The command set and command submitted to hardware.
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index 978a54b0a51a..b6dc34d18a86 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -2,8 +2,10 @@
>  /* Copyright(c) 2021 Intel Corporation. All rights reserved. */
>  #include <linux/device.h>
>  #include <linux/module.h>
> +#include <linux/pci.h>
>
>  #include "cxlmem.h"
> +#include "pci.h"
>
>  /**
>   * DOC: cxl mem
> @@ -17,9 +19,60 @@
>   * components.
>   */
>
> +static int port_match(struct device *dev, const void *data)
> +{
> +       struct cxl_port *port;
> +
> +       if (!is_cxl_port(dev))
> +               return 0;
> +
> +       port = to_cxl_port(dev);
> +
> +       if (find_dport_by_dev(port, (struct device *)data))
> +               return 1;
> +
> +       return 0;
> +}
> +
> +static bool is_cxl_mem_enabled(struct pci_dev *pdev)
> +{
> +       int pcie_dvsec;
> +       u16 dvsec_ctrl;
> +
> +       pcie_dvsec = cxl_pci_dvsec(pdev, PCI_DVSEC_ID_PCIE_DVSEC_CXL_DVSEC_ID);
> +       if (!pcie_dvsec) {
> +               dev_info(&pdev->dev, "Unable to determine CXL protocol support");
> +               return false;
> +       }
> +
> +       pci_read_config_word(pdev,
> +                            pcie_dvsec + PCI_DVSEC_ID_CXL_PCIE_CTRL_OFFSET,
> +                            &dvsec_ctrl);
> +       if (!(dvsec_ctrl & CXL_PCIE_MEM_ENABLE)) {
> +               dev_info(&pdev->dev, "CXL.mem protocol not supported on device");
> +               return false;
> +       }
> +
> +       return true;
> +}
> +
>  static int cxl_mem_probe(struct device *dev)
>  {
> -       return -EOPNOTSUPP;
> +       struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> +       struct cxl_mem *cxlm = cxlmd->cxlm;
> +       struct device *pdev_parent = cxlm->dev->parent;
> +       struct pci_dev *pdev = to_pci_dev(cxlm->dev);

It's not safe to assume that the parent of a cxlmd is a pci device.

> +       struct device *port_dev;
> +
> +       if (!is_cxl_mem_enabled(pdev))
> +               return -ENODEV;

This isn't sufficient, this needs to walk the entire hierarchy, right?
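
As a rough illustration of what "walk the entire hierarchy" means here,
the sketch below is a user-space model in which an endpoint counts as
routed only if every node between it and the root is enabled. The
struct and function names are hypothetical and do not correspond to any
kernel structure; how each level's enable state would actually be read
is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one level in the topology (endpoint, switch, root). */
struct model_node {
	const struct model_node *parent; /* NULL at the root */
	bool cxl_mem_enabled;            /* this level's enable state */
};

/*
 * Return true only if the endpoint and every ancestor up to the root
 * are enabled. Checking the endpoint alone would miss a disabled
 * switch or root port further up the path.
 */
static bool model_path_mem_enabled(const struct model_node *endpoint)
{
	const struct model_node *n;

	assert(endpoint != NULL);
	for (n = endpoint; n; n = n->parent)
		if (!n->cxl_mem_enabled)
			return false;
	return true;
}
```

With a root, a switch, and an endpoint chained through parent pointers,
clearing cxl_mem_enabled on the switch alone is enough to make
model_path_mem_enabled() return false for the endpoint.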

> +
> +       /* TODO: if parent is a switch, this will fail. */

Won't the parent be a switch in all cases? For example, even in QEMU
today the parent of the CXL device is the switch in the host bridge.

# cat /sys/bus/cxl/devices/port1/decoder1.0/devtype
cxl_decoder_switch

> +       port_dev = bus_find_device(&cxl_bus_type, NULL, pdev_parent, port_match);
> +       if (!port_dev)
> +               return -ENODEV;
> +
> +       return 0;
>  }
>
>  static void cxl_mem_remove(struct device *dev)
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 6931885c83ce..244b99948c40 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -335,29 +335,6 @@ static void cxl_pci_unmap_regblock(struct cxl_mem *cxlm, void __iomem *base)
>         pci_iounmap(to_pci_dev(cxlm->dev), base);
>  }
>
> -static int cxl_pci_dvsec(struct pci_dev *pdev, int dvsec)
> -{
> -       int pos;
> -
> -       pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_DVSEC);
> -       if (!pos)
> -               return 0;
> -
> -       while (pos) {
> -               u16 vendor, id;
> -
> -               pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER1, &vendor);
> -               pci_read_config_word(pdev, pos + PCI_DVSEC_HEADER2, &id);
> -               if (vendor == PCI_DVSEC_VENDOR_ID_CXL && dvsec == id)
> -                       return pos;
> -
> -               pos = pci_find_next_ext_capability(pdev, pos,
> -                                                  PCI_EXT_CAP_ID_DVSEC);
> -       }
> -
> -       return 0;
> -}
> -
>  static int cxl_probe_regs(struct cxl_mem *cxlm, void __iomem *base,
>                           struct cxl_register_map *map)
>  {
> diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> index 8c1a58813816..d6b9978d05b0 100644
> --- a/drivers/cxl/pci.h
> +++ b/drivers/cxl/pci.h
> @@ -11,7 +11,10 @@
>   */
>  #define PCI_DVSEC_HEADER1_LENGTH_MASK  GENMASK(31, 20)
>  #define PCI_DVSEC_VENDOR_ID_CXL                0x1E98
> -#define PCI_DVSEC_ID_CXL               0x0
> +
> +#define PCI_DVSEC_ID_PCIE_DVSEC_CXL_DVSEC_ID   0x0
> +#define PCI_DVSEC_ID_CXL_PCIE_CTRL_OFFSET      0xC
> +#define   CXL_PCIE_MEM_ENABLE                  BIT(2)
>
>  #define PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID       0x8
>  #define PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET  0xC
> @@ -29,4 +32,6 @@
>
>  #define CXL_REGLOC_ADDR_MASK GENMASK(31, 16)
>
> +int cxl_pci_dvsec(struct pci_dev *pdev, int dvsec);
> +
>  #endif /* __CXL_PCI_H__ */
> --
> 2.33.0
>
