From: Ben Widawsky <ben.widawsky@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: linux-cxl@vger.kernel.org,
Alison Schofield <alison.schofield@intel.com>,
Dan Williams <dan.j.williams@intel.com>,
Ira Weiny <ira.weiny@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>
Subject: Re: [PATCH 13/13] cxl/mem: Enumerate switch decoders
Date: Mon, 13 Sep 2021 15:12:48 -0700 [thread overview]
Message-ID: <20210913221248.ghgkrkqhw4mujhpd@intel.com> (raw)
In-Reply-To: <20210903185623.00007135@Huawei.com>
On 21-09-03 18:56:23, Jonathan Cameron wrote:
> On Thu, 2 Sep 2021 12:50:17 -0700
> Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> > Switches work much in the same way as hostbridges. The primary
> > difference is that they are enumerated, and probed via regular PCIe
> > mechanisms. A switch has 1 upstream port, and n downstream ports.
> > Ultimately a memory device attached to a switch can determine if it's in
> > a CXL capable subset of the topology if the switch is CXL capable.
> >
> > The algorithm introduced enables enumerating switches in a CXL topology.
> > It walks up the topology until it finds a root port (which is enumerated
> > by the cxl_acpi driver). Once at the top, it walks back down adding all
> > downstream ports along the way.
> >
> > Note that practically speaking there can be at most 3 levels of switches
> > with the current 2.0 spec. This is because there is a max interleave of
> > 8 defined in the spec. If there is a single hostbridge and only 1 root
> > port was CXL capable, you could have 3 levels of x2 switches, making
> > the x8 interleave. However, as far as the spec is concerned, there can
> > be infinite number of switches since a x1 switch is allowed, and
> > future versions of the spec may allow for a larger total interleave.
>
> Or you could be lazy and rely on the statement in CXL 2.0 that it supports
> only a single level of switching (search for "single level" in 1.4.1)
> Lots of other reasons it's far from infinite... (number of busses etc).
>
> I'll not speculate on what might be supported in the future.
I like lazy, however, there is no statement in the spec that disallows multiple
levels of switches.
>
> A few minor comments below.
>
> Jonathan
>
> >
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> > drivers/cxl/mem.c | 130 +++++++++++++++++++++++++++++++++++++++++++++-
> > drivers/cxl/pci.c | 8 ---
> > drivers/cxl/pci.h | 8 +++
> > 3 files changed, 137 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> > index aba9a07d519f..dc8ca43d5bfc 100644
> > --- a/drivers/cxl/mem.c
> > +++ b/drivers/cxl/mem.c
> > @@ -56,6 +56,133 @@ static bool is_cxl_mem_enabled(struct pci_dev *pdev)
> > return true;
> > }
> >
> > +/* TODO: dedeuplicate this from drivers/cxl/pci.c? */
>
> That seems like a question with an obvious answer...
>
> > +static unsigned long get_component_regs(struct pci_dev *pdev)
> > +{
> > + unsigned long component_reg_phys = CXL_RESOURCE_NONE;
> > + u32 regloc_size, regblocks;
> > + int regloc, i;
> > +
> > + regloc = cxl_pci_dvsec(pdev, PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
> > + if (!regloc) {
> > + dev_err(&pdev->dev, "register location dvsec not found\n");
> > + return component_reg_phys;
> > + }
> > +
> > + /* Get the size of the Register Locator DVSEC */
> > + pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, ®loc_size);
> > + regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> > +
> > + regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
> > + regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
> > +
> > + for (i = 0; i < regblocks; i++, regloc += 8) {
> > + u32 reg_lo, reg_hi;
> > + u8 reg_type;
> > + u64 offset;
> > + u8 bar;
> > +
> > + pci_read_config_dword(pdev, regloc, ®_lo);
> > + pci_read_config_dword(pdev, regloc + 4, ®_hi);
> > +
> > + cxl_decode_register_block(reg_lo, reg_hi, &bar, &offset,
> > + ®_type);
> > +
> > + if (reg_type != CXL_REGLOC_RBI_COMPONENT)
> > + continue;
> > +
> > + component_reg_phys = pci_resource_start(pdev, bar) + offset;
> > + }
> > +
> > + return component_reg_phys;
> > +}
> > +
> > +static void enumerate_uport(struct device *dev)
> > +{
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > +
> > + /*
> > + * Parent's parent should be another uport, since we don't have root
> > + * ports here
> > + */
> > + if (dev_WARN_ONCE(dev, !dev->parent->parent, "No grandparent port\n"))
> > + return;
> > +
> > + if (!is_cxl_port(dev->parent->parent)) {
> > + dev_info(dev, "Parent of uport isn't a CXL port (%s)\n",
> > + dev_name(dev->parent->parent));
> > + return;
> > + }
> > +
> > + devm_cxl_add_port(dev, dev, get_component_regs(pdev),
> > + to_cxl_port(dev->parent));
> > +}
> > +
> > +static void enumerate_dport(struct device *dev)
> > +{
> > + struct pci_dev *pdev = to_pci_dev(dev);
> > + u32 port_num, lnkcap;
> > +
> > + if (dev_WARN_ONCE(dev, !dev->parent, "No parent port\n"))
> > + return;
> > +
> > + if (!is_cxl_port(dev->parent)) {
> > + dev_info(dev, "Uport isn't a CXL port %s\n",
> > + dev_name(dev->parent));
> > + return;
> > + }
> > +
> > + /* TODO: deduplicate from drivers/cxl/acpi.c? */
> > + if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
> > + &lnkcap) != PCIBIOS_SUCCESSFUL)
> > + return;
> > + port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> > +
> > + cxl_add_dport(to_cxl_port(dev->parent), dev, port_num,
> > + get_component_regs(pdev));
> > +}
> > +
> > +/*
> > + * Walk up the topology until we get to the root port (ie. parent is a
> > + * cxl port). From there walk back down adding the additional ports. If the
> > + * parent isn't a PCIe switch (upstream or downstream port), the downstream
> > + * endpoint(s) cannot be CXL enabled.
> > + *
> > + * XXX: It's possible that cxl_acpi hasn't yet enumerated the root ports, and
> > + * so that will rescan the CXL bus, thus coming back here.
> > + */
> > +static void enumerate_switches(struct device *dev)
> > +{
> > + struct pci_dev *pdev;
> > + int type;
> > +
> > + if (unlikely(!dev))
>
> Unlikely markings seems unlikely to be necessary. I'm assuming
> this is far from a hot path!
>
> > + return;
> > +
> > + if (unlikely(!dev_is_pci(dev)))
> > + return;
> > +
> > + pdev = to_pci_dev(dev);
> > +
> > + if (unlikely(!pci_is_pcie(pdev)))
> > + return;
> > +
> > + if (!is_cxl_mem_enabled(pdev))
> > + return;
> > +
> > + type = pci_pcie_type(pdev);
> > +
> > + if (type != PCI_EXP_TYPE_UPSTREAM && type != PCI_EXP_TYPE_DOWNSTREAM)
> > + return;
> > +
> > + enumerate_switches(dev->parent);
> > +
> > + if (type == PCI_EXP_TYPE_UPSTREAM)
> > + enumerate_uport(dev);
> > + if (type == PCI_EXP_TYPE_DOWNSTREAM)
> > + enumerate_dport(dev);
> > +}
> > +
> > static int cxl_mem_probe(struct device *dev)
> > {
> > struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > @@ -68,7 +195,8 @@ static int cxl_mem_probe(struct device *dev)
> > if (!is_cxl_mem_enabled(pdev))
> > return -ENODEV;
> >
> > - /* TODO: if parent is a switch, this will fail. */
> > + enumerate_switches(dev->parent);
> > +
> > port_dev = bus_find_device(&cxl_bus_type, NULL, pdev_parent, port_match);
> > if (!port_dev)
> > return -ENODEV;
>
next prev parent reply other threads:[~2021-09-13 22:12 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-02 19:50 [PATCH 00/13] Enumerate midlevel and endpoint decoders Ben Widawsky
2021-09-02 19:50 ` [PATCH 01/13] Documentation/cxl: Add bus internal docs Ben Widawsky
2021-09-03 14:05 ` Jonathan Cameron
2021-09-10 18:20 ` Dan Williams
2021-09-02 19:50 ` [PATCH 02/13] cxl/core/bus: Add kernel docs for decoder ops Ben Widawsky
2021-09-03 14:17 ` Jonathan Cameron
2021-09-10 18:51 ` Dan Williams
2021-09-11 17:25 ` Ben Widawsky
2021-09-02 19:50 ` [PATCH 03/13] cxl/core: Ignore interleave when adding decoders Ben Widawsky
2021-09-03 14:25 ` Jonathan Cameron
2021-09-10 19:00 ` Dan Williams
2021-09-11 17:30 ` Ben Widawsky
2021-09-02 19:50 ` [PATCH 04/13] cxl: Introduce endpoint decoders Ben Widawsky
2021-09-03 14:35 ` Jonathan Cameron
2021-09-13 16:19 ` Ben Widawsky
2021-09-10 19:19 ` Dan Williams
2021-09-13 16:11 ` Ben Widawsky
2021-09-13 22:07 ` Dan Williams
2021-09-13 23:19 ` Ben Widawsky
2021-09-14 21:16 ` Dan Williams
2021-09-02 19:50 ` [PATCH 05/13] cxl/pci: Disambiguate cxl_pci further from cxl_mem Ben Widawsky
2021-09-03 14:45 ` Jonathan Cameron
2021-09-10 19:27 ` Dan Williams
2021-09-02 19:50 ` [PATCH 06/13] cxl/mem: Introduce cxl_mem driver Ben Widawsky
2021-09-03 14:52 ` Jonathan Cameron
2021-09-10 21:32 ` Dan Williams
2021-09-13 16:46 ` Ben Widawsky
2021-09-13 19:37 ` Dan Williams
2021-09-02 19:50 ` [PATCH 07/13] cxl/memdev: Determine CXL.mem capability Ben Widawsky
2021-09-03 15:21 ` Jonathan Cameron
2021-09-13 19:01 ` Ben Widawsky
2021-09-10 21:59 ` Dan Williams
2021-09-13 22:10 ` Ben Widawsky
2021-09-14 22:42 ` Dan Williams
2021-09-14 22:55 ` Ben Widawsky
2021-09-02 19:50 ` [PATCH 08/13] cxl/mem: Add memdev as a port Ben Widawsky
2021-09-03 15:31 ` Jonathan Cameron
2021-09-10 23:09 ` Dan Williams
2021-09-02 19:50 ` [PATCH 09/13] cxl/pci: Retain map information in cxl_mem_probe Ben Widawsky
2021-09-10 23:12 ` Dan Williams
2021-09-10 23:45 ` Dan Williams
2021-09-02 19:50 ` [PATCH 10/13] cxl/core: Map component registers for ports Ben Widawsky
2021-09-02 22:41 ` Ben Widawsky
2021-09-02 22:42 ` Ben Widawsky
2021-09-03 16:14 ` Jonathan Cameron
2021-09-10 23:52 ` Dan Williams
2021-09-13 8:29 ` Jonathan Cameron
2021-09-10 23:44 ` Dan Williams
2021-09-02 19:50 ` [PATCH 11/13] cxl/core: Convert decoder range to resource Ben Widawsky
2021-09-03 16:16 ` Jonathan Cameron
2021-09-11 0:59 ` Dan Williams
2021-09-02 19:50 ` [PATCH 12/13] cxl/core/bus: Enumerate all HDM decoders Ben Widawsky
2021-09-03 17:43 ` Jonathan Cameron
2021-09-11 1:37 ` Dan Williams
2021-09-11 1:13 ` Dan Williams
2021-09-02 19:50 ` [PATCH 13/13] cxl/mem: Enumerate switch decoders Ben Widawsky
2021-09-03 17:56 ` Jonathan Cameron
2021-09-13 22:12 ` Ben Widawsky [this message]
2021-09-14 23:31 ` Dan Williams
2021-09-10 18:15 ` [PATCH 00/13] Enumerate midlevel and endpoint decoders Dan Williams
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210913221248.ghgkrkqhw4mujhpd@intel.com \
--to=ben.widawsky@intel.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=dan.j.williams@intel.com \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).