linux-cxl.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ben Widawsky <ben.widawsky@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: linux-cxl@vger.kernel.org,
	Alison Schofield <alison.schofield@intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>
Subject: Re: [PATCH 13/13] cxl/mem: Enumerate switch decoders
Date: Mon, 13 Sep 2021 15:12:48 -0700	[thread overview]
Message-ID: <20210913221248.ghgkrkqhw4mujhpd@intel.com> (raw)
In-Reply-To: <20210903185623.00007135@Huawei.com>

On 21-09-03 18:56:23, Jonathan Cameron wrote:
> On Thu, 2 Sep 2021 12:50:17 -0700
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > Switches work much in the same way as hostbridges. The primary
> > difference is that they are enumerated, and probed via regular PCIe
> > mechanisms. A switch has 1 upstream port, and n downstream ports.
> > Ultimately a memory device attached to a switch can determine if it's in
> > a CXL capable subset of the topology if the switch is CXL capable.
> > 
> > The algorithm introduced enables enumerating switches in a CXL topology.
> > It walks up the topology until it finds a root port (which is enumerated
> > by the cxl_acpi driver). Once at the top, it walks back down adding all
> > downstream ports along the way.
> > 
> > Note that practically speaking there can be at most 3 levels of switches
> > with the current 2.0 spec. This is because there is a max interleave of
> > 8 defined in the spec. If there is a single hostbridge and only 1 root
> > port was CXL capable, you could have 3 levels of x2 switches, making
> > the x8 interleave. However, as far as the spec is concerned, there can
> > be infinite number of switches since a x1 switch is allowed, and
> > future versions of the spec may allow for a larger total interleave.
> 
> Or you could be lazy and rely on the statement in CXL 2.0 that it supports
> only a single level of switching (search for "single level" in 1.4.1)
> Lots of other reasons it's far from infinite... (number of busses etc).
> 
> I'll not speculate on what might be supported in the future.

I like lazy, however, there is no statement in the spec that disallows multiple
levels of switches.

> 
> A few minor comments below.
> 
> Jonathan
> 
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  drivers/cxl/mem.c | 130 +++++++++++++++++++++++++++++++++++++++++++++-
> >  drivers/cxl/pci.c |   8 ---
> >  drivers/cxl/pci.h |   8 +++
> >  3 files changed, 137 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> > index aba9a07d519f..dc8ca43d5bfc 100644
> > --- a/drivers/cxl/mem.c
> > +++ b/drivers/cxl/mem.c
> > @@ -56,6 +56,133 @@ static bool is_cxl_mem_enabled(struct pci_dev *pdev)
> >  	return true;
> >  }
> >  
> > +/* TODO: dedeuplicate this from drivers/cxl/pci.c? */
> 
> That seems like a question with an obvious answer...
> 
> > +static unsigned long get_component_regs(struct pci_dev *pdev)
> > +{
> > +	unsigned long component_reg_phys = CXL_RESOURCE_NONE;
> > +	u32 regloc_size, regblocks;
> > +	int regloc, i;
> > +
> > +	regloc = cxl_pci_dvsec(pdev, PCI_DVSEC_ID_CXL_REGLOC_DVSEC_ID);
> > +	if (!regloc) {
> > +		dev_err(&pdev->dev, "register location dvsec not found\n");
> > +		return component_reg_phys;
> > +	}
> > +
> > +	/* Get the size of the Register Locator DVSEC */
> > +	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> > +	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> > +
> > +	regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
> > +	regblocks = (regloc_size - PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET) / 8;
> > +
> > +	for (i = 0; i < regblocks; i++, regloc += 8) {
> > +		u32 reg_lo, reg_hi;
> > +		u8 reg_type;
> > +		u64 offset;
> > +		u8 bar;
> > +
> > +		pci_read_config_dword(pdev, regloc, &reg_lo);
> > +		pci_read_config_dword(pdev, regloc + 4, &reg_hi);
> > +
> > +		cxl_decode_register_block(reg_lo, reg_hi, &bar, &offset,
> > +					  &reg_type);
> > +
> > +		if (reg_type != CXL_REGLOC_RBI_COMPONENT)
> > +			continue;
> > +
> > +		component_reg_phys = pci_resource_start(pdev, bar) + offset;
> > +	}
> > +
> > +	return component_reg_phys;
> > +}
> > +
> > +static void enumerate_uport(struct device *dev)
> > +{
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +
> > +	/*
> > +	 * Parent's parent should be another uport, since we don't have root
> > +	 * ports here
> > +	 */
> > +	if (dev_WARN_ONCE(dev, !dev->parent->parent, "No grandparent port\n"))
> > +		return;
> > +
> > +	if (!is_cxl_port(dev->parent->parent)) {
> > +		dev_info(dev, "Parent of uport isn't a CXL port (%s)\n",
> > +			 dev_name(dev->parent->parent));
> > +		return;
> > +	}
> > +
> > +	devm_cxl_add_port(dev, dev, get_component_regs(pdev),
> > +			  to_cxl_port(dev->parent));
> > +}
> > +
> > +static void enumerate_dport(struct device *dev)
> > +{
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	u32 port_num, lnkcap;
> > +
> > +	if (dev_WARN_ONCE(dev, !dev->parent, "No parent port\n"))
> > +		return;
> > +
> > +	if (!is_cxl_port(dev->parent)) {
> > +		dev_info(dev, "Uport isn't a CXL port %s\n",
> > +			 dev_name(dev->parent));
> > +		return;
> > +	}
> > +
> > +	/* TODO: deduplicate from drivers/cxl/acpi.c? */
> > +	if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
> > +				  &lnkcap) != PCIBIOS_SUCCESSFUL)
> > +		return;
> > +	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
> > +
> > +	cxl_add_dport(to_cxl_port(dev->parent), dev, port_num,
> > +		      get_component_regs(pdev));
> > +}
> > +
> > +/*
> > + * Walk up the topology until we get to the root port (ie. parent is a
> > + * cxl port). From there walk back down adding the additional ports. If the
> > + * parent isn't a PCIe switch (upstream or downstream port), the downstream
> > + * endpoint(s) cannot be CXL enabled.
> > + *
> > + * XXX: It's possible that cxl_acpi hasn't yet enumerated the root ports, and
> > + * so that will rescan the CXL bus, thus coming back here.
> > + */
> > +static void enumerate_switches(struct device *dev)
> > +{
> > +	struct pci_dev *pdev;
> > +	int type;
> > +
> > +	if (unlikely(!dev))
> 
> Unlikely markings seems unlikely to be necessary. I'm assuming
> this is far from a hot path!
> 
> > +		return;
> > +
> > +	if (unlikely(!dev_is_pci(dev)))
> > +		return;
> > +
> > +	pdev = to_pci_dev(dev);
> > +
> > +	if (unlikely(!pci_is_pcie(pdev)))
> > +		return;
> > +
> > +	if (!is_cxl_mem_enabled(pdev))
> > +		return;
> > +
> > +	type = pci_pcie_type(pdev);
> > +
> > +	if (type != PCI_EXP_TYPE_UPSTREAM && type != PCI_EXP_TYPE_DOWNSTREAM)
> > +		return;
> > +
> > +	enumerate_switches(dev->parent);
> > +
> > +	if (type == PCI_EXP_TYPE_UPSTREAM)
> > +		enumerate_uport(dev);
> > +	if (type == PCI_EXP_TYPE_DOWNSTREAM)
> > +		enumerate_dport(dev);
> > +}
> > +
> >  static int cxl_mem_probe(struct device *dev)
> >  {
> >  	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> > @@ -68,7 +195,8 @@ static int cxl_mem_probe(struct device *dev)
> >  	if (!is_cxl_mem_enabled(pdev))
> >  		return -ENODEV;
> >  
> > -	/* TODO: if parent is a switch, this will fail. */
> > +	enumerate_switches(dev->parent);
> > +
> >  	port_dev = bus_find_device(&cxl_bus_type, NULL, pdev_parent, port_match);
> >  	if (!port_dev)
> >  		return -ENODEV;
> 

  reply	other threads:[~2021-09-13 22:12 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-02 19:50 [PATCH 00/13] Enumerate midlevel and endpoint decoders Ben Widawsky
2021-09-02 19:50 ` [PATCH 01/13] Documentation/cxl: Add bus internal docs Ben Widawsky
2021-09-03 14:05   ` Jonathan Cameron
2021-09-10 18:20     ` Dan Williams
2021-09-02 19:50 ` [PATCH 02/13] cxl/core/bus: Add kernel docs for decoder ops Ben Widawsky
2021-09-03 14:17   ` Jonathan Cameron
2021-09-10 18:51   ` Dan Williams
2021-09-11 17:25     ` Ben Widawsky
2021-09-02 19:50 ` [PATCH 03/13] cxl/core: Ignore interleave when adding decoders Ben Widawsky
2021-09-03 14:25   ` Jonathan Cameron
2021-09-10 19:00     ` Dan Williams
2021-09-11 17:30       ` Ben Widawsky
2021-09-02 19:50 ` [PATCH 04/13] cxl: Introduce endpoint decoders Ben Widawsky
2021-09-03 14:35   ` Jonathan Cameron
2021-09-13 16:19     ` Ben Widawsky
2021-09-10 19:19   ` Dan Williams
2021-09-13 16:11     ` Ben Widawsky
2021-09-13 22:07       ` Dan Williams
2021-09-13 23:19         ` Ben Widawsky
2021-09-14 21:16           ` Dan Williams
2021-09-02 19:50 ` [PATCH 05/13] cxl/pci: Disambiguate cxl_pci further from cxl_mem Ben Widawsky
2021-09-03 14:45   ` Jonathan Cameron
2021-09-10 19:27   ` Dan Williams
2021-09-02 19:50 ` [PATCH 06/13] cxl/mem: Introduce cxl_mem driver Ben Widawsky
2021-09-03 14:52   ` Jonathan Cameron
2021-09-10 21:32   ` Dan Williams
2021-09-13 16:46     ` Ben Widawsky
2021-09-13 19:37       ` Dan Williams
2021-09-02 19:50 ` [PATCH 07/13] cxl/memdev: Determine CXL.mem capability Ben Widawsky
2021-09-03 15:21   ` Jonathan Cameron
2021-09-13 19:01     ` Ben Widawsky
2021-09-10 21:59   ` Dan Williams
2021-09-13 22:10     ` Ben Widawsky
2021-09-14 22:42       ` Dan Williams
2021-09-14 22:55         ` Ben Widawsky
2021-09-02 19:50 ` [PATCH 08/13] cxl/mem: Add memdev as a port Ben Widawsky
2021-09-03 15:31   ` Jonathan Cameron
2021-09-10 23:09   ` Dan Williams
2021-09-02 19:50 ` [PATCH 09/13] cxl/pci: Retain map information in cxl_mem_probe Ben Widawsky
2021-09-10 23:12   ` Dan Williams
2021-09-10 23:45     ` Dan Williams
2021-09-02 19:50 ` [PATCH 10/13] cxl/core: Map component registers for ports Ben Widawsky
2021-09-02 22:41   ` Ben Widawsky
2021-09-02 22:42     ` Ben Widawsky
2021-09-03 16:14   ` Jonathan Cameron
2021-09-10 23:52     ` Dan Williams
2021-09-13  8:29       ` Jonathan Cameron
2021-09-10 23:44   ` Dan Williams
2021-09-02 19:50 ` [PATCH 11/13] cxl/core: Convert decoder range to resource Ben Widawsky
2021-09-03 16:16   ` Jonathan Cameron
2021-09-11  0:59   ` Dan Williams
2021-09-02 19:50 ` [PATCH 12/13] cxl/core/bus: Enumerate all HDM decoders Ben Widawsky
2021-09-03 17:43   ` Jonathan Cameron
2021-09-11  1:37     ` Dan Williams
2021-09-11  1:13   ` Dan Williams
2021-09-02 19:50 ` [PATCH 13/13] cxl/mem: Enumerate switch decoders Ben Widawsky
2021-09-03 17:56   ` Jonathan Cameron
2021-09-13 22:12     ` Ben Widawsky [this message]
2021-09-14 23:31   ` Dan Williams
2021-09-10 18:15 ` [PATCH 00/13] Enumerate midlevel and endpoint decoders Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210913221248.ghgkrkqhw4mujhpd@intel.com \
    --to=ben.widawsky@intel.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=alison.schofield@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).