* [PATCH 0/4 v3] PCI: Add Secondary Bus Reset (SBR) support for CXL
@ 2024-04-02 23:45 Dave Jiang
  2024-04-02 23:45 ` [PATCH v3 1/4] PCI/cxl: Move PCI CXL vendor Id to a common location from CXL subsystem Dave Jiang
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Dave Jiang @ 2024-04-02 23:45 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	Jonathan.Cameron, dave, bhelgaas, lukas

Hi Bjorn,
Please consider this series for kernel 6.10. This series attempts to
add secondary bus reset (SBR) support to CXL. By default, SBR for CXL is
masked. Per CXL specification r3.1 8.1.5.2, the Port Control Extensions
register bit 0 (Unmask SBR) in the host bridge controls the masking of CXL SBR.
"When 0, SBR bit in Bridge Control register of this Port has no effect. When 1,
the Port shall generate hot reset when SBR in Bridge Control gets set to 1.
Default value of this bit is 0. When the Port is operating in PCIe mode or RCD
mode, this field has no effect on SBR functionality and Port shall follow PCIe
Base Specification."
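
For illustration, the masking rule quoted above can be modelled in a few lines
of userspace C. This is only a sketch, not code from the series: the macro and
function names are hypothetical, and the bit value assumes "Unmask SBR" is
bit 0 of the Port Control Extensions register as stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: bit 0 of the Port Control Extensions register. */
#define PORT_CTL_UNMASK_SBR 0x0001u

/*
 * Model of the quoted rule: in CXL mode the Port generates a hot reset
 * only when Unmask SBR is 1. In PCIe or RCD mode the bit has no effect
 * and SBR follows the PCIe Base Specification.
 */
static bool sbr_generates_hot_reset(uint16_t port_ctl, bool cxl_mode)
{
	if (!cxl_mode)
		return true;

	return (port_ctl & PORT_CTL_UNMASK_SBR) != 0;
}
```

With the default register value (bit 0 clear) a CXL-mode port ignores SBR,
which is the condition patch 2 reports as an error.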

v3:
- Move PCI_DVSEC_VENDOR_ID_CXL to PCI_VENDOR_ID_CXL in pci_ids.h (Bjorn)
- Remove CXL device checking. (Bjorn)
- Rename defines to PCI_DVSEC_CXL_PORT_*. (Bjorn)
- Fix SBR define in commit log. (Bjorn)
- Update comment on dvsec not found. (Dan)
- Check return of dvsec value read for error. (Dan)
- Move cxl_port_dvsec() to an earlier patch. (Dan)
- Add pci_cfg_access_lock() for bridge access. (Dan)
- Change cxl_bus_force method to cxl_bus. (Dan)
- Rename decoder_hw_mismatch() to __cxl_endpoint_decoder_reset_detected(). (Dan)
- Move CXL register access function to core/pci.c. (Dan)
- Add kernel taint to decoder reset warning. (Dan)

v2:
- Use pci_upstream_bridge() instead of dev->bus->self. (Lukas)
- Rename is_cxl_device() to pci_is_cxl(). (Lukas)
- Return -ENOTTY on error. (Lukas)

Patch 1:
Move PCI_DVSEC_VENDOR_ID_CXL to PCI_VENDOR_ID_CXL in pci_ids.h and
adjust related CXL bits.

Patch 2:
Add a check to the PCI bus_reset path for CXL devices and return an error if the
"Unmask SBR" bit is set to 0. This lets the user know that SBR is masked for this
CXL device. However, if the user sets the "Unmask SBR" bit via a tool such as
setpci, then the bus_reset will proceed.

Patch 3:
Add a new PCI reset method "cxl_bus" in order to give the user an
intentional way to perform SBR. The code will set the "Unmask SBR" bit to
1 and proceed with bus_reset. The original value of the bit will be restored
after the reset operation.
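
The set, reset, restore sequence described for patch 3 can be sketched in
userspace as follows. This is a simplified model under stated assumptions, not
the kernel implementation: struct fake_bridge, fake_bus_reset() and
forced_bus_reset() are hypothetical stand-ins, and config-space locking is
omitted.

```c
#include <stdint.h>

#define PORT_CTL_UNMASK_SBR 0x0001u	/* hypothetical name for bit 0 */

/* Stand-in for the bridge's Port Control Extensions register. */
struct fake_bridge {
	uint16_t port_ctl;
	int resets_seen;	/* hot resets actually generated */
};

/* Stand-in for the bus reset: it only takes effect while unmasked. */
static int fake_bus_reset(struct fake_bridge *b)
{
	if (!(b->port_ctl & PORT_CTL_UNMASK_SBR))
		return -1;	/* masked: nothing happens */

	b->resets_seen++;
	return 0;
}

/* Set the bit if needed, perform the reset, then restore the saved value. */
static int forced_bus_reset(struct fake_bridge *b)
{
	uint16_t saved = b->port_ctl;
	int rc;

	b->port_ctl = saved | PORT_CTL_UNMASK_SBR;
	rc = fake_bus_reset(b);
	b->port_ctl = saved;

	return rc;
}
```

Restoring the saved value rather than unconditionally clearing the bit keeps a
user-set "Unmask SBR" intact across the reset.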

Patch 4:
CXL driver change that provides a ->reset_done() callback. It compares the
hardware decoder settings with the existing software configuration and emits a
warning if they differ. A difference indicates that decoders were programmed
before the reset and have been cleared by the reset. There may be onlined system
memory backed by a CXL memory device that is now violently ripped away from the
kernel mapping.
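
The comparison described for patch 4 amounts to "software says enabled, but
hardware says not committed". A minimal userspace sketch of that check is shown
below. The struct, the flag value and the committed-bit position are
hypothetical placeholders chosen for the example, not the driver's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEC_F_ENABLE		0x1u		/* hypothetical software flag */
#define DEC_CTRL_COMMITTED	(1u << 10)	/* hypothetical hardware bit */

struct fake_decoder {
	unsigned long flags;	/* software view of the decoder */
	uint32_t hw_ctrl;	/* control value read back from hardware */
};

/* A decoder enabled in software but not committed in hardware was reset. */
static bool decoder_reset_detected(const struct fake_decoder *d)
{
	if (!(d->flags & DEC_F_ENABLE))
		return false;

	return !(d->hw_ctrl & DEC_CTRL_COMMITTED);
}

/* Report a reset as soon as any decoder shows the mismatch. */
static bool any_decoder_reset_detected(const struct fake_decoder *d, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (decoder_reset_detected(&d[i]))
			return true;

	return false;
}
```

Decoders that were never enabled in software are skipped, so an idle device
does not trigger the warning.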

This patch series stems from patch [1], with comments [2] from Bjorn.

[1]: https://lore.kernel.org/linux-cxl/20240215232307.2793530-1-dave.jiang@intel.com/
[2]: https://lore.kernel.org/linux-cxl/20240220203956.GA1502351@bhelgaas/


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 1/4] PCI/cxl: Move PCI CXL vendor Id to a common location from CXL subsystem
  2024-04-02 23:45 [PATCH 0/4 v3] PCI: Add Secondary Bus Reset (SBR) support for CXL Dave Jiang
@ 2024-04-02 23:45 ` Dave Jiang
  2024-04-02 23:45 ` [PATCH v3 2/4] PCI: Add check for CXL Secondary Bus Reset Dave Jiang
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Dave Jiang @ 2024-04-02 23:45 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	Jonathan.Cameron, dave, bhelgaas, lukas, Bjorn Helgaas

Move PCI_DVSEC_VENDOR_ID_CXL from CXL private code to PCI_VENDOR_ID_CXL in
pci_ids.h so that it can be used by the PCI subsystem.

The response Bjorn received from the PCI SIG was "1E98h is not a VID in our
system, but 1E98 has already been reserved by CXL." He suggested "we should
add '#define PCI_VENDOR_ID_CXL 0x1e98' so that if we ever *do* see such an
assignment, we'll be more likely to flag it as an issue."

Link: https://lore.kernel.org/linux-cxl/20240402172323.GA1818777@bhelgaas/
Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 drivers/cxl/core/pci.c  | 6 +++---
 drivers/cxl/core/regs.c | 2 +-
 drivers/cxl/cxlpci.h    | 1 -
 drivers/cxl/pci.c       | 2 +-
 drivers/perf/cxl_pmu.c  | 2 +-
 include/linux/pci_ids.h | 2 ++
 6 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 0df09bd79408..c496a9710d62 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -525,7 +525,7 @@ static int cxl_cdat_get_length(struct device *dev,
 	__le32 response[2];
 	int rc;
 
-	rc = pci_doe(doe_mb, PCI_DVSEC_VENDOR_ID_CXL,
+	rc = pci_doe(doe_mb, PCI_VENDOR_ID_CXL,
 		     CXL_DOE_PROTOCOL_TABLE_ACCESS,
 		     &request, sizeof(request),
 		     &response, sizeof(response));
@@ -555,7 +555,7 @@ static int cxl_cdat_read_table(struct device *dev,
 		__le32 request = CDAT_DOE_REQ(entry_handle);
 		int rc;
 
-		rc = pci_doe(doe_mb, PCI_DVSEC_VENDOR_ID_CXL,
+		rc = pci_doe(doe_mb, PCI_VENDOR_ID_CXL,
 			     CXL_DOE_PROTOCOL_TABLE_ACCESS,
 			     &request, sizeof(request),
 			     rsp, sizeof(*rsp) + remaining);
@@ -640,7 +640,7 @@ void read_cdat_data(struct cxl_port *port)
 	if (!pdev)
 		return;
 
-	doe_mb = pci_find_doe_mailbox(pdev, PCI_DVSEC_VENDOR_ID_CXL,
+	doe_mb = pci_find_doe_mailbox(pdev, PCI_VENDOR_ID_CXL,
 				      CXL_DOE_PROTOCOL_TABLE_ACCESS);
 	if (!doe_mb) {
 		dev_dbg(dev, "No CDAT mailbox\n");
diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index 372786f80955..da52fc9e234b 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -313,7 +313,7 @@ int cxl_find_regblock_instance(struct pci_dev *pdev, enum cxl_regloc_type type,
 		.resource = CXL_RESOURCE_NONE,
 	};
 
-	regloc = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
+	regloc = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
 					   CXL_DVSEC_REG_LOCATOR);
 	if (!regloc)
 		return -ENXIO;
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 93992a1c8eec..4da07727ab9c 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -13,7 +13,6 @@
  * "DVSEC" redundancies removed. When obvious, abbreviations may be used.
  */
 #define PCI_DVSEC_HEADER1_LENGTH_MASK	GENMASK(31, 20)
-#define PCI_DVSEC_VENDOR_ID_CXL		0x1E98
 
 /* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
 #define CXL_DVSEC_PCIE_DEVICE					0
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 2ff361e756d6..110478573296 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -817,7 +817,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	cxlds->rcd = is_cxl_restricted(pdev);
 	cxlds->serial = pci_get_dsn(pdev);
 	cxlds->cxl_dvsec = pci_find_dvsec_capability(
-		pdev, PCI_DVSEC_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE);
+		pdev, PCI_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE);
 	if (!cxlds->cxl_dvsec)
 		dev_warn(&pdev->dev,
 			 "Device DVSEC not present, skip CXL.mem init\n");
diff --git a/drivers/perf/cxl_pmu.c b/drivers/perf/cxl_pmu.c
index 308c9969642e..a1b742b1a735 100644
--- a/drivers/perf/cxl_pmu.c
+++ b/drivers/perf/cxl_pmu.c
@@ -345,7 +345,7 @@ static ssize_t cxl_pmu_event_sysfs_show(struct device *dev,
 
 /* For CXL spec defined events */
 #define CXL_PMU_EVENT_CXL_ATTR(_name, _gid, _msk)			\
-	CXL_PMU_EVENT_ATTR(_name, PCI_DVSEC_VENDOR_ID_CXL, _gid, _msk)
+	CXL_PMU_EVENT_ATTR(_name, PCI_VENDOR_ID_CXL, _gid, _msk)
 
 static struct attribute *cxl_pmu_event_attrs[] = {
 	CXL_PMU_EVENT_CXL_ATTR(clock_ticks,			CXL_PMU_GID_CLOCK_TICKS, BIT(0)),
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index a0c75e467df3..7dfbf6d96b3d 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2607,6 +2607,8 @@
 
 #define PCI_VENDOR_ID_ALIBABA		0x1ded
 
+#define PCI_VENDOR_ID_CXL		0x1e98
+
 #define PCI_VENDOR_ID_TEHUTI		0x1fc9
 #define PCI_DEVICE_ID_TEHUTI_3009	0x3009
 #define PCI_DEVICE_ID_TEHUTI_3010	0x3010
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 2/4] PCI: Add check for CXL Secondary Bus Reset
  2024-04-02 23:45 [PATCH 0/4 v3] PCI: Add Secondary Bus Reset (SBR) support for CXL Dave Jiang
  2024-04-02 23:45 ` [PATCH v3 1/4] PCI/cxl: Move PCI CXL vendor Id to a common location from CXL subsystem Dave Jiang
@ 2024-04-02 23:45 ` Dave Jiang
  2024-04-03  8:26   ` Lukas Wunner
  2024-04-03 15:01   ` Jonathan Cameron
  2024-04-02 23:45 ` [PATCH v3 3/4] PCI: Create new reset method to force SBR for CXL Dave Jiang
  2024-04-02 23:45 ` [PATCH v3 4/4] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR) Dave Jiang
  3 siblings, 2 replies; 17+ messages in thread
From: Dave Jiang @ 2024-04-02 23:45 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	Jonathan.Cameron, dave, bhelgaas, lukas

Per CXL spec r3.1 8.1.5.2, Secondary Bus Reset (SBR) is masked unless the
"Unmask SBR" bit is set. Add a check to the PCI secondary bus reset
path to fail the CXL SBR request if the "Unmask SBR" bit is clear in
the CXL Port Control Extensions register by returning -ENOTTY.

When the "Unmask SBR" bit is 0 (the default), the bus_reset currently
appears to execute successfully even though the operation is actually
masked. The intention is to inform the user that SBR for the CXL device
is masked and will not go through.

If the "Unmask SBR" bit is set to 1, then the bus reset will execute
successfully.

Link: https://lore.kernel.org/linux-cxl/20240220203956.GA1502351@bhelgaas/
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
v3:
- Move and rename PCI_DVSEC_VENDOR_ID_CXL to PCI_VENDOR_ID_CXL.
  Move to pci_ids.h in a different patch. (Bjorn)
- Remove CXL device checking. (Bjorn)
- Rename defines to PCI_DVSEC_CXL_PORT_*. (Bjorn)
- Fixup SBR define in commit log. (Bjorn)
- Update comment on dvsec not found. (Dan)
- Check return of dvsec value read for error. (Dan)
---
 drivers/pci/pci.c             | 45 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/pci_regs.h |  5 ++++
 2 files changed, 50 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e5f243dd4288..00eddb451102 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4927,10 +4927,55 @@ static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe)
 	return pci_reset_hotplug_slot(dev->slot->hotplug, probe);
 }
 
+static int cxl_port_dvsec(struct pci_dev *dev)
+{
+	return pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
+					 PCI_DVSEC_CXL_PORT);
+}
+
+static bool cxl_sbr_masked(struct pci_dev *dev)
+{
+	u16 dvsec, reg;
+	int rc;
+
+	/*
+	 * No DVSEC found: either this is not a CXL port, or it is not
+	 * connected, in which case the mask state is a nop (CXL r3.1 sec
+	 * 9.12.3 "Enumerating CXL RPs and DSPs").
+	 */
+	dvsec = cxl_port_dvsec(dev);
+	if (!dvsec)
+		return false;
+
+	rc = pci_read_config_word(dev, dvsec + PCI_DVSEC_CXL_PORT_CTL, &reg);
+	if (rc || PCI_POSSIBLE_ERROR(reg))
+		return false;
+
+	/*
+	 * CXL spec r3.1 8.1.5.2
+	 * When 0, SBR bit in Bridge Control register of this Port has no effect.
+	 * When 1, the Port shall generate hot reset when SBR bit in Bridge
+	 * Control gets set to 1.
+	 */
+	if (reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR)
+		return false;
+
+	return true;
+}
+
 static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
 {
+	struct pci_dev *bridge = pci_upstream_bridge(dev);
 	int rc;
 
+	/* If it's a CXL port and the SBR control is masked, fail the SBR */
+	if (bridge && cxl_sbr_masked(bridge)) {
+		if (probe)
+			return 0;
+
+		return -ENOTTY;
+	}
+
 	rc = pci_dev_reset_slot_function(dev, probe);
 	if (rc != -ENOTTY)
 		return rc;
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index a39193213ff2..d61fa43662e3 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -1148,4 +1148,9 @@
 #define PCI_DOE_DATA_OBJECT_DISC_RSP_3_PROTOCOL		0x00ff0000
 #define PCI_DOE_DATA_OBJECT_DISC_RSP_3_NEXT_INDEX	0xff000000
 
+/* Compute Express Link (CXL) */
+#define PCI_DVSEC_CXL_PORT				3
+#define PCI_DVSEC_CXL_PORT_CTL				0x0c
+#define PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR		0x00000001
+
 #endif /* LINUX_PCI_REGS_H */
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 3/4] PCI: Create new reset method to force SBR for CXL
  2024-04-02 23:45 [PATCH 0/4 v3] PCI: Add Secondary Bus Reset (SBR) support for CXL Dave Jiang
  2024-04-02 23:45 ` [PATCH v3 1/4] PCI/cxl: Move PCI CXL vendor Id to a common location from CXL subsystem Dave Jiang
  2024-04-02 23:45 ` [PATCH v3 2/4] PCI: Add check for CXL Secondary Bus Reset Dave Jiang
@ 2024-04-02 23:45 ` Dave Jiang
  2024-04-03 15:09   ` Jonathan Cameron
  2024-04-02 23:45 ` [PATCH v3 4/4] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR) Dave Jiang
  3 siblings, 1 reply; 17+ messages in thread
From: Dave Jiang @ 2024-04-02 23:45 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	Jonathan.Cameron, dave, bhelgaas, lukas

Per CXL spec r3.1 8.1.5.2, Secondary Bus Reset (SBR) is masked for CXL
ports by default. Introduce a new PCI reset method "cxl_bus" to force SBR
on CXL ports by setting the "Unmask SBR" bit in the CXL DVSEC port control
register before performing the bus reset and restoring the original value
of the bit after the reset. The new reset method allows the user to
intentionally perform SBR on a CXL device without needing to set the
"Unmask SBR" bit via a user tool.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
v3:
- Move cxl_port_dvsec() to previous patch. (Dan)
- Add pci_cfg_access_lock() for the bridge. (Dan)
- Change cxl_bus_force method to cxl_bus. (Dan)
---
 drivers/pci/pci.c   | 44 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h |  2 +-
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 00eddb451102..3989c8888813 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4982,6 +4982,49 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
 	return pci_parent_bus_reset(dev, probe);
 }
 
+static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
+{
+	struct pci_dev *bridge;
+	int dvsec;
+	int rc;
+	u16 reg, val;
+
+	bridge = pci_upstream_bridge(dev);
+	if (!bridge)
+		return -ENOTTY;
+
+	dvsec = cxl_port_dvsec(bridge);
+	if (!dvsec)
+		return -ENOTTY;
+
+	if (probe)
+		return 0;
+
+	pci_cfg_access_lock(bridge);
+	rc = pci_read_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL, &reg);
+	if (rc) {
+		rc = -ENOTTY;
+		goto out;
+	}
+
+	if (!(reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR)) {
+		val = reg | PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR;
+		pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL,
+				      val);
+	} else {
+		val = reg;
+	}
+
+	rc = pci_reset_bus_function(dev, probe);
+
+	if (reg != val)
+		pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL, reg);
+
+out:
+	pci_cfg_access_unlock(bridge);
+	return rc;
+}
+
 void pci_dev_lock(struct pci_dev *dev)
 {
 	/* block PM suspend, driver probe, etc. */
@@ -5066,6 +5109,7 @@ static const struct pci_reset_fn_method pci_reset_fn_methods[] = {
 	{ pci_af_flr, .name = "af_flr" },
 	{ pci_pm_reset, .name = "pm" },
 	{ pci_reset_bus_function, .name = "bus" },
+	{ cxl_reset_bus_function, .name = "cxl_bus" },
 };
 
 static ssize_t reset_method_show(struct device *dev,
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 16493426a04f..235f37715a43 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -51,7 +51,7 @@
 			       PCI_STATUS_PARITY)
 
 /* Number of reset methods used in pci_reset_fn_methods array in pci.c */
-#define PCI_NUM_RESET_METHODS 7
+#define PCI_NUM_RESET_METHODS 8
 
 #define PCI_RESET_PROBE		true
 #define PCI_RESET_DO_RESET	false
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 4/4] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR)
  2024-04-02 23:45 [PATCH 0/4 v3] PCI: Add Secondary Bus Reset (SBR) support for CXL Dave Jiang
                   ` (2 preceding siblings ...)
  2024-04-02 23:45 ` [PATCH v3 3/4] PCI: Create new reset method to force SBR for CXL Dave Jiang
@ 2024-04-02 23:45 ` Dave Jiang
  2024-04-03 15:32   ` Jonathan Cameron
  3 siblings, 1 reply; 17+ messages in thread
From: Dave Jiang @ 2024-04-02 23:45 UTC (permalink / raw)
  To: linux-cxl, linux-pci
  Cc: dan.j.williams, ira.weiny, vishal.l.verma, alison.schofield,
	Jonathan.Cameron, dave, bhelgaas, lukas

SBR is equivalent to a device being hot removed and inserted again. Doing
an SBR on a CXL type 3 device is problematic if the exported device memory
is part of system memory that cannot be offlined. The event is equivalent
to violently ripping out that range of memory from the kernel. While the
hardware requires the "Unmask SBR" bit set in the Port Control Extensions
register and the kernel currently does not unmask it, a user can unmask
this bit via setpci or a similar tool.

The driver does not have a way to detect whether a reset coming from the
PCI subsystem is a Function Level Reset (FLR) or an SBR. The only way to
detect an SBR is to check whether a decoder is marked as enabled in
software while the decoder control register indicates it is not committed.

A helper function is added to detect a discrepancy between the decoder
software state and the hardware register state.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
v3:
- Rename decoder_hw_mismatch() to __cxl_endpoint_decoder_reset_detected(). (Dan)
- Move register accessing function to core/pci.c. (Dan)
- Add kernel taint to decoder reset. (Dan)
---
 drivers/cxl/core/pci.c | 31 +++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h      |  2 ++
 drivers/cxl/pci.c      | 20 ++++++++++++++++++++
 3 files changed, 53 insertions(+)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index c496a9710d62..597221f7f19b 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -1045,3 +1045,34 @@ long cxl_pci_get_latency(struct pci_dev *pdev)
 
 	return cxl_flit_size(pdev) * MEGA / bw;
 }
+
+static int __cxl_endpoint_decoder_reset_detected(struct device *dev, void *data)
+{
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_port *port = data;
+	struct cxl_decoder *cxld;
+	struct cxl_hdm *cxlhdm;
+	void __iomem *hdm;
+	u32 ctrl;
+
+	if (!is_endpoint_decoder(dev))
+		return 0;
+
+	cxled = to_cxl_endpoint_decoder(dev);
+	if ((cxled->cxld.flags & CXL_DECODER_F_ENABLE) == 0)
+		return 0;
+
+	cxld = &cxled->cxld;
+	cxlhdm = dev_get_drvdata(&port->dev);
+	hdm = cxlhdm->regs.hdm_decoder;
+	ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(cxld->id));
+
+	return !FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl);
+}
+
+bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port)
+{
+	return device_for_each_child(&port->dev, port,
+				     __cxl_endpoint_decoder_reset_detected);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_reset_detected, CXL);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 534e25e2f0a4..e3c237c50b59 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -895,6 +895,8 @@ void cxl_coordinates_combine(struct access_coordinate *out,
 			     struct access_coordinate *c1,
 			     struct access_coordinate *c2);
 
+bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
+
 /*
  * Unit test builds overrides this to __weak, find the 'strong' version
  * of these symbols in tools/testing/cxl/.
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 110478573296..5dc1f28a031d 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -957,11 +957,31 @@ static void cxl_error_resume(struct pci_dev *pdev)
 		 dev->driver ? "successful" : "failed");
 }
 
+static void cxl_reset_done(struct pci_dev *pdev)
+{
+	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
+	struct cxl_memdev *cxlmd = cxlds->cxlmd;
+	struct device *dev = &pdev->dev;
+
+	/*
+	 * FLR is not expected to touch the HDM decoders and related registers.
+	 * SBR, however, will wipe all device configurations.
+	 * Issue a warning if there was an active decoder before reset that
+	 * is no longer committed.
+	 */
+	if (cxl_endpoint_decoder_reset_detected(cxlmd->endpoint)) {
+		dev_warn(dev, "SBR happened without memory regions removal.\n");
+		dev_warn(dev, "System may be unstable if regions hosted system memory.\n");
+		add_taint(TAINT_USER, LOCKDEP_NOW_UNRELIABLE);
+	}
+}
+
 static const struct pci_error_handlers cxl_error_handlers = {
 	.error_detected	= cxl_error_detected,
 	.slot_reset	= cxl_slot_reset,
 	.resume		= cxl_error_resume,
 	.cor_error_detected	= cxl_cor_error_detected,
+	.reset_done	= cxl_reset_done,
 };
 
 static struct pci_driver cxl_pci_driver = {
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 2/4] PCI: Add check for CXL Secondary Bus Reset
  2024-04-02 23:45 ` [PATCH v3 2/4] PCI: Add check for CXL Secondary Bus Reset Dave Jiang
@ 2024-04-03  8:26   ` Lukas Wunner
  2024-04-04  0:19     ` Dave Jiang
  2024-04-03 15:01   ` Jonathan Cameron
  1 sibling, 1 reply; 17+ messages in thread
From: Lukas Wunner @ 2024-04-03  8:26 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, Jonathan.Cameron, dave, bhelgaas

On Tue, Apr 02, 2024 at 04:45:30PM -0700, Dave Jiang wrote:
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4927,10 +4927,55 @@ static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe)
>  	return pci_reset_hotplug_slot(dev->slot->hotplug, probe);
>  }
>  
> +static int cxl_port_dvsec(struct pci_dev *dev)
> +{
> +	return pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
> +					 PCI_DVSEC_CXL_PORT);
> +}

Hm, seems a bit odd that this returns an int even though
pci_find_dvsec_capability() returns a u16 and all the callers
of cxl_port_dvsec() seem to assign the return value to a u16
as well.  Is the "int" on purpose?
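
For reference, a u16-returning variant of the helper could look like the sketch
below. It is written as a standalone userspace model so it can be compiled on
its own: the u16 typedef, the opaque struct pci_dev and the stub
pci_find_dvsec_capability() (which always reports offset 0x100) are stand-ins
for the kernel definitions, and only the helper's return type is the point.

```c
#include <stdint.h>

typedef uint16_t u16;

struct pci_dev;	/* opaque stand-in for the kernel type */

#define PCI_VENDOR_ID_CXL	0x1e98
#define PCI_DVSEC_CXL_PORT	3

/* Stub with a u16 return type; always reports offset 0x100. */
static u16 pci_find_dvsec_capability(struct pci_dev *dev, u16 vendor, u16 dvsec)
{
	(void)dev;
	(void)vendor;
	(void)dvsec;
	return 0x100;
}

/* Same body as the posted helper, but returning u16 to match its callee. */
static u16 cxl_port_dvsec(struct pci_dev *dev)
{
	return pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
					 PCI_DVSEC_CXL_PORT);
}
```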

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 2/4] PCI: Add check for CXL Secondary Bus Reset
  2024-04-02 23:45 ` [PATCH v3 2/4] PCI: Add check for CXL Secondary Bus Reset Dave Jiang
  2024-04-03  8:26   ` Lukas Wunner
@ 2024-04-03 15:01   ` Jonathan Cameron
  1 sibling, 0 replies; 17+ messages in thread
From: Jonathan Cameron @ 2024-04-03 15:01 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave, bhelgaas, lukas

On Tue, 2 Apr 2024 16:45:30 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> Per CXL spec r3.1 8.1.5.2, Secondary Bus Reset (SBR) is masked unless the
> "Unmask SBR" bit is set. Add a check to the PCI secondary bus reset
> path to fail the CXL SBR request if the "Unmask SBR" bit is clear in
> the CXL Port Control Extensions register by returning -ENOTTY.
> 
> When the "Unmask SBR" bit is set to 0 (default), the bus_reset would
> appear to have executed successfully. However the operation is actually
> masked. The intention is to inform the user that SBR for the CXL device
> is masked and will not go through.
> 
> If the "Unmask SBR" bit is set to 1, then the bus reset will execute
> successfully.
> 
> Link: https://lore.kernel.org/linux-cxl/20240220203956.GA1502351@bhelgaas/
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
LGTM, though Lukas' comment makes sense so I'll assume you'll tidy that up.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> v3:
> - Move and rename PCI_DVSEC_VENDOR_ID_CXL to PCI_VENDOR_ID_CXL.
>   Move to pci_ids.h in a different patch. (Bjorn)
> - Remove of CXL device checking. (Bjorn)
> - Rename defines to PCI_DVSEC_CXL_PORT_*. (Bjorn)
> - Fixup SBR define in commit log. (Bjorn)
> - Update comment on dvsec not found. (Dan)
> - Check return of dvsec value read for error. (Dan)
> ---
>  drivers/pci/pci.c             | 45 +++++++++++++++++++++++++++++++++++
>  include/uapi/linux/pci_regs.h |  5 ++++
>  2 files changed, 50 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e5f243dd4288..00eddb451102 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4927,10 +4927,55 @@ static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe)
>  	return pci_reset_hotplug_slot(dev->slot->hotplug, probe);
>  }
>  
> +static int cxl_port_dvsec(struct pci_dev *dev)
> +{
> +	return pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
> +					 PCI_DVSEC_CXL_PORT);
> +}
> +
> +static bool cxl_sbr_masked(struct pci_dev *dev)
> +{
> +	u16 dvsec, reg;
> +	int rc;
> +
> +	/*
> +	 * No DVSEC found, either is not a CXL port, or not connected in which
> +	 * case mask state is a nop (CXL r3.1 sec 9.12.3 "Enumerating CXL RPs
> +	 * and DSPs"
> +	 */
> +	dvsec = cxl_port_dvsec(dev);
> +	if (!dvsec)
> +		return false;
> +
> +	rc = pci_read_config_word(dev, dvsec + PCI_DVSEC_CXL_PORT_CTL, &reg);
> +	if (rc || PCI_POSSIBLE_ERROR(reg))
> +		return false;
> +
> +	/*
> +	 * CXL spec r3.1 8.1.5.2
> +	 * When 0, SBR bit in Bridge Control register of this Port has no effect.
> +	 * When 1, the Port shall generate hot reset when SBR bit in Bridge
> +	 * Control gets set to 1.
> +	 */
> +	if (reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR)
> +		return false;
> +
> +	return true;
> +}
> +
>  static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
>  {
> +	struct pci_dev *bridge = pci_upstream_bridge(dev);
>  	int rc;
>  
> +	/* If it's a CXL port and the SBR control is masked, fail the SBR */
> +	if (bridge && cxl_sbr_masked(bridge)) {
> +		if (probe)
> +			return 0;
> +
> +		return -ENOTTY;
> +	}
> +
>  	rc = pci_dev_reset_slot_function(dev, probe);
>  	if (rc != -ENOTTY)
>  		return rc;
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index a39193213ff2..d61fa43662e3 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -1148,4 +1148,9 @@
>  #define PCI_DOE_DATA_OBJECT_DISC_RSP_3_PROTOCOL		0x00ff0000
>  #define PCI_DOE_DATA_OBJECT_DISC_RSP_3_NEXT_INDEX	0xff000000
>  
> +/* Compute Express Link (CXL) */
> +#define PCI_DVSEC_CXL_PORT				3
> +#define PCI_DVSEC_CXL_PORT_CTL				0x0c
> +#define PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR		0x00000001
> +
>  #endif /* LINUX_PCI_REGS_H */


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 3/4] PCI: Create new reset method to force SBR for CXL
  2024-04-02 23:45 ` [PATCH v3 3/4] PCI: Create new reset method to force SBR for CXL Dave Jiang
@ 2024-04-03 15:09   ` Jonathan Cameron
  2024-04-04  0:21     ` Dave Jiang
  0 siblings, 1 reply; 17+ messages in thread
From: Jonathan Cameron @ 2024-04-03 15:09 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave, bhelgaas, lukas

On Tue, 2 Apr 2024 16:45:31 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> CXL spec r3.1 8.1.5.2
> By default Secondary Bus Reset (SBR) is masked for CXL ports. Introduce a
> new PCI reset method "cxl_bus" to force SBR on CXL ports by setting
> the unmask SBR bit in the CXL DVSEC port control register before performing
> the bus reset and restore the original value of the bit post reset. The
> new reset method allows the user to intentionally perform SBR on a CXL
> device without needing to set the "Unmask SBR" bit via a user tool.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
A few trivial things inline.  Otherwise looks fine.

FWIW
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> v3:
> - move cxl_port_dvsec() to previous patch. (Dan)
> - add pci_cfg_access_lock() for the bridge. (Dan)
> - Change cxl_bus_force method to cxl_bus. (Dan)
> ---
>  drivers/pci/pci.c   | 44 ++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pci.h |  2 +-
>  2 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 00eddb451102..3989c8888813 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4982,6 +4982,49 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
>  	return pci_parent_bus_reset(dev, probe);
>  }
>  
> +static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
> +{
> +	struct pci_dev *bridge;
> +	int dvsec;

Lukas' comment on previous applies to this as well.

> +	int rc;
> +	u16 reg, val;

Maybe combine lines as appropriate.

> +
> +	bridge = pci_upstream_bridge(dev);
> +	if (!bridge)
> +		return -ENOTTY;
> +
> +	dvsec = cxl_port_dvsec(bridge);
> +	if (!dvsec)
> +		return -ENOTTY;
> +
> +	if (probe)
> +		return 0;
> +
> +	pci_cfg_access_lock(bridge);
> +	rc = pci_read_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL, &reg);
> +	if (rc) {
> +		rc = -ENOTTY;
> +		goto out;
> +	}
> +
> +	if (!(reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR)) {
> +		val = reg | PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR;
> +		pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL,
> +				      val);
> +	} else {
> +		val = reg;
> +	}
> +
> +	rc = pci_reset_bus_function(dev, probe);
> +
> +	if (reg != val)
> +		pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL, reg);
> +
> +out:
> +	pci_cfg_access_unlock(bridge);

Maybe a guard() use case to allow early returns in error paths?

> +	return rc;
> +}
> +
>  void pci_dev_lock(struct pci_dev *dev)
>  {
>  	/* block PM suspend, driver probe, etc. */
> @@ -5066,6 +5109,7 @@ static const struct pci_reset_fn_method pci_reset_fn_methods[] = {
>  	{ pci_af_flr, .name = "af_flr" },
>  	{ pci_pm_reset, .name = "pm" },
>  	{ pci_reset_bus_function, .name = "bus" },
> +	{ cxl_reset_bus_function, .name = "cxl_bus" },
>  };
>  
>  static ssize_t reset_method_show(struct device *dev,
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 16493426a04f..235f37715a43 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -51,7 +51,7 @@
>  			       PCI_STATUS_PARITY)
>  
>  /* Number of reset methods used in pci_reset_fn_methods array in pci.c */
> -#define PCI_NUM_RESET_METHODS 7
> +#define PCI_NUM_RESET_METHODS 8
>  
>  #define PCI_RESET_PROBE		true
>  #define PCI_RESET_DO_RESET	false


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 4/4] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR)
  2024-04-02 23:45 ` [PATCH v3 4/4] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR) Dave Jiang
@ 2024-04-03 15:32   ` Jonathan Cameron
  2024-04-03 16:27     ` Dan Williams
  2024-04-04  8:51     ` Lukas Wunner
  0 siblings, 2 replies; 17+ messages in thread
From: Jonathan Cameron @ 2024-04-03 15:32 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave, bhelgaas, lukas

On Tue, 2 Apr 2024 16:45:32 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> SBR is equivalent to a device being hot removed and inserted again. Doing an
> SBR on a CXL type 3 device is problematic if the exported device memory is
> part of system memory that cannot be offlined. The event is equivalent to
> violently ripping out that range of memory from the kernel. While the
> hardware requires the "Unmask SBR" bit set in the Port Control Extensions
> register and the kernel currently does not unmask it, a user can unmask
> this bit via setpci or a similar tool.
> 
> The driver does not have a way to detect whether a reset coming from the
> PCI subsystem is a Function Level Reset (FLR) or an SBR. The only way to
> detect an SBR is to note whether a decoder is marked as enabled in software
> but the decoder control register indicates it's not committed.
> 
> A helper function is added to find discrepancies between the decoder
> software state and the hardware register state.
> 
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

As I said way back on v1, this smells hacky.

Why not pass the info on what reset was done down from the PCI core?
I see Bjorn commented it would be *possible* to do it in the PCI core
but raised other concerns that needed addressing first (I think you've
dealt with those now).  Doesn't look that hard to me (I've not coded it
up yet though).

The core code knows how far it got down the list reset_methods before
it succeeded in resetting.  So...

Modify __pci_reset_function_locked() to return the index of the reset
method that succeeded. Then pass that to pci_dev_restore().
Finally push it into a reset_done2() that takes that as an extra
parameter and the driver can see if it is FLR or SBR.
The extended reset_done is to avoid modifying lots of drivers.
However a quick grep suggests it's not that heavily used (15ish?)
so maybe just add the parameter.
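
To make the idea above concrete, here is a rough userspace model, not
kernel code. Every name in it (model_dev, model_reset_function(),
model_reset_done() and so on) is hypothetical and only stands in for the
pci_reset_fn_methods walk and an extended reset_done callback; the real
plumbing through pci_dev_restore() is not shown.

```c
#include <stdbool.h>

/* Hypothetical model of the ordered reset-method table. */
enum model_reset_method {
	MODEL_RESET_FLR,
	MODEL_RESET_BUS,	/* SBR */
	MODEL_NUM_RESET_METHODS,
};

struct model_dev {
	bool supports[MODEL_NUM_RESET_METHODS];	/* methods that work here */
	int last_reset_seen;	/* what the driver was told, -1 if nothing */
};

/* Driver callback in the extended form: told which method succeeded. */
static void model_reset_done(struct model_dev *dev, int method)
{
	dev->last_reset_seen = method;
}

/*
 * Walk the method list in order and return the index of the first method
 * that succeeds, or -1 if none does.
 */
static int model_reset_function(struct model_dev *dev)
{
	for (int i = 0; i < MODEL_NUM_RESET_METHODS; i++) {
		if (dev->supports[i])
			return i;
	}
	return -1;
}

/* Reset, then hand the method index on to the driver callback. */
static int model_reset_and_notify(struct model_dev *dev)
{
	int method = model_reset_function(dev);

	if (method >= 0)
		model_reset_done(dev, method);
	return method;
}
```

With this shape the driver can compare the index against the FLR and bus
entries instead of inferring the reset type from device state.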

There are a few other paths, but none look that problematic at
first glance...

So Bjorn, now that the rest of this is hopefully close to what you'll be
happy with, which way do you prefer?



> ---
> v3:
> - Rename decoder_hw_mismatch() to __cxl_endpoint_decoder_reset_detected(). (Dan)
> - Move register accessing function to core/pci.c. (Dan)
> - Add kernel taint to decoder reset. (Dan)
> ---
>  drivers/cxl/core/pci.c | 31 +++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h      |  2 ++
>  drivers/cxl/pci.c      | 20 ++++++++++++++++++++
>  3 files changed, 53 insertions(+)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index c496a9710d62..597221f7f19b 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -1045,3 +1045,34 @@ long cxl_pci_get_latency(struct pci_dev *pdev)
>  
>  	return cxl_flit_size(pdev) * MEGA / bw;
>  }
> +
> +static int __cxl_endpoint_decoder_reset_detected(struct device *dev, void *data)
> +{
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_port *port = data;
> +	struct cxl_decoder *cxld;
> +	struct cxl_hdm *cxlhdm;
> +	void __iomem *hdm;
> +	u32 ctrl;
> +
> +	if (!is_endpoint_decoder(dev))
> +		return 0;
> +
> +	cxled = to_cxl_endpoint_decoder(dev);
> +	if ((cxled->cxld.flags & CXL_DECODER_F_ENABLE) == 0)
> +		return 0;
> +
> +	cxld = &cxled->cxld;
> +	cxlhdm = dev_get_drvdata(&port->dev);
> +	hdm = cxlhdm->regs.hdm_decoder;
> +	ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(cxld->id));
> +
> +	return !FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl);
> +}
> +
> +bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port)
> +{
> +	return device_for_each_child(&port->dev, port,
> +				     __cxl_endpoint_decoder_reset_detected);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_reset_detected, CXL);
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 534e25e2f0a4..e3c237c50b59 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -895,6 +895,8 @@ void cxl_coordinates_combine(struct access_coordinate *out,
>  			     struct access_coordinate *c1,
>  			     struct access_coordinate *c2);
>  
> +bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
> +
>  /*
>   * Unit test builds overrides this to __weak, find the 'strong' version
>   * of these symbols in tools/testing/cxl/.
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 110478573296..5dc1f28a031d 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -957,11 +957,31 @@ static void cxl_error_resume(struct pci_dev *pdev)
>  		 dev->driver ? "successful" : "failed");
>  }
>  
> +static void cxl_reset_done(struct pci_dev *pdev)
> +{
> +	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
> +	struct cxl_memdev *cxlmd = cxlds->cxlmd;
> +	struct device *dev = &pdev->dev;
> +
> +	/*
> +	 * FLR is not expected to touch the HDM decoders and related registers.
> +	 * SBR, however, will wipe all device configuration.
> +	 * Issue a warning if a decoder that was active before the reset is
> +	 * no longer committed.
> +	 */
> +	if (cxl_endpoint_decoder_reset_detected(cxlmd->endpoint)) {
> +		dev_warn(dev, "SBR happened without memory regions removal.\n");
> +		dev_warn(dev, "System may be unstable if regions hosted system memory.\n");
> +		add_taint(TAINT_USER, LOCKDEP_NOW_UNRELIABLE);
> +	}
> +}
> +
>  static const struct pci_error_handlers cxl_error_handlers = {
>  	.error_detected	= cxl_error_detected,
>  	.slot_reset	= cxl_slot_reset,
>  	.resume		= cxl_error_resume,
>  	.cor_error_detected	= cxl_cor_error_detected,
> +	.reset_done	= cxl_reset_done,
>  };
>  
>  static struct pci_driver cxl_pci_driver = {



* Re: [PATCH v3 4/4] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR)
  2024-04-03 15:32   ` Jonathan Cameron
@ 2024-04-03 16:27     ` Dan Williams
  2024-04-04 13:16       ` Jonathan Cameron
  2024-04-04  8:51     ` Lukas Wunner
  1 sibling, 1 reply; 17+ messages in thread
From: Dan Williams @ 2024-04-03 16:27 UTC (permalink / raw)
  To: Jonathan Cameron, Dave Jiang
  Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave, bhelgaas, lukas

Jonathan Cameron wrote:
> On Tue, 2 Apr 2024 16:45:32 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
> > SBR is equivalent to a device being hot removed and inserted again. Doing an
> > SBR on a CXL type 3 device is problematic if the exported device memory is
> > part of system memory that cannot be offlined. The event is equivalent to
> > violently ripping out that range of memory from the kernel. While the
> > hardware requires the "Unmask SBR" bit set in the Port Control Extensions
> > register and the kernel currently does not unmask it, a user can unmask
> > this bit via setpci or a similar tool.
> > 
> > The driver does not have a way to detect whether a reset coming from the
> > PCI subsystem is a Function Level Reset (FLR) or an SBR. The only way to
> > detect an SBR is to note whether a decoder is marked as enabled in software
> > but the decoder control register indicates it's not committed.
> > 
> > A helper function is added to find discrepancies between the decoder
> > software state and the hardware register state.
> > 
> > Suggested-by: Dan Williams <dan.j.williams@intel.com>
> > Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> 
> As I said way back on v1, this smells hacky.
> 
> Why not pass the info on what reset was done down from the PCI core?
> I see Bjorn commented it would be *possible* to do it in the PCI core
> but raised other concerns that needed addressing first (I think you've
> dealt with those now).  Doesn't look that hard to me (I've not coded it
> up yet though).
> 
> The core code knows how far it got down the list reset_methods before
> it succeeded in resetting.  So...
> 
> Modify __pci_reset_function_locked() to return the index of the reset
> method that succeeded. Then pass that to pci_dev_restore().
> Finally push it into a reset_done2() that takes that as an extra
> parameter and the driver can see if it is FLR or SBR.
> The extended reset_done is to avoid modifying lots of drivers.
> However a quick grep suggests it's not that heavily used (15ish?)
> so maybe just add the parameter.
> 
> There are a few other paths, but none look that problematic at
> first glance...
> 
> So Bjorn, now that the rest of this is hopefully close to what you'll be
> happy with, which way do you prefer?

I will defer to Bjorn, but I am not a fan of this reset_done2() proposal.
"Revalidate after reset" is a common driver pattern, and all that
plumbing the effective reset type does is make cxl_reset_done() more
precise for no discernible value.


* Re: [PATCH v3 2/4] PCI: Add check for CXL Secondary Bus Reset
  2024-04-03  8:26   ` Lukas Wunner
@ 2024-04-04  0:19     ` Dave Jiang
  0 siblings, 0 replies; 17+ messages in thread
From: Dave Jiang @ 2024-04-04  0:19 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, Jonathan.Cameron, dave, bhelgaas



On 4/3/24 1:26 AM, Lukas Wunner wrote:
> On Tue, Apr 02, 2024 at 04:45:30PM -0700, Dave Jiang wrote:
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -4927,10 +4927,55 @@ static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe)
>>  	return pci_reset_hotplug_slot(dev->slot->hotplug, probe);
>>  }
>>  
>> +static int cxl_port_dvsec(struct pci_dev *dev)
>> +{
>> +	return pci_find_dvsec_capability(dev, PCI_VENDOR_ID_CXL,
>> +					 PCI_DVSEC_CXL_PORT);
>> +}
> 
> Hm, seems a bit odd that this returns an int even though
> pci_find_dvsec_capability() returns a u16 and all the callers
> of cxl_port_dvsec() seem to assign the return value to a u16
> as well.  Is the "int" on purpose?

Should be u16. Oversight. Thanks.
> 
> Thanks,
> 
> Lukas


* Re: [PATCH v3 3/4] PCI: Create new reset method to force SBR for CXL
  2024-04-03 15:09   ` Jonathan Cameron
@ 2024-04-04  0:21     ` Dave Jiang
  2024-04-04 13:29       ` Jonathan Cameron
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Jiang @ 2024-04-04  0:21 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave, bhelgaas, lukas



On 4/3/24 8:09 AM, Jonathan Cameron wrote:
> On Tue, 2 Apr 2024 16:45:31 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
>> CXL spec r3.1 8.1.5.2
>> By default Secondary Bus Reset (SBR) is masked for CXL ports. Introduce a
>> new PCI reset method "cxl_bus" to force SBR on CXL ports by setting
>> the unmask SBR bit in the CXL DVSEC port control register before performing
>> the bus reset and restore the original value of the bit post reset. The
>> new reset method allows the user to intentionally perform SBR on a CXL
>> device without needing to set the "Unmask SBR" bit via a user tool.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> A few trivial things inline.  Otherwise looks fine.
> 
> FWIW
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
>> ---
>> v3:
>> - move cxl_port_dvsec() to previous patch. (Dan)
>> - add pci_cfg_access_lock() for the bridge. (Dan)
>> - Change cxl_bus_force method to cxl_bus. (Dan)
>> ---
>>  drivers/pci/pci.c   | 44 ++++++++++++++++++++++++++++++++++++++++++++
>>  include/linux/pci.h |  2 +-
>>  2 files changed, 45 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 00eddb451102..3989c8888813 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -4982,6 +4982,49 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
>>  	return pci_parent_bus_reset(dev, probe);
>>  }
>>  
>> +static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
>> +{
>> +	struct pci_dev *bridge;
>> +	int dvsec;
> 
> Lukas' comment on previous applies to this as well.

ok

> 
>> +	int rc;
>> +	u16 reg, val;
> 
> Maybe combine lines as appropriate.

ok

> 
>> +
>> +	bridge = pci_upstream_bridge(dev);
>> +	if (!bridge)
>> +		return -ENOTTY;
>> +
>> +	dvsec = cxl_port_dvsec(bridge);
>> +	if (!dvsec)
>> +		return -ENOTTY;
>> +
>> +	if (probe)
>> +		return 0;
>> +
>> +	pci_cfg_access_lock(bridge);
>> +	rc = pci_read_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL, &reg);
>> +	if (rc) {
>> +		rc = -ENOTTY;
>> +		goto out;
>> +	}
>> +
>> +	if (!(reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR)) {
>> +		val = reg | PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR;
>> +		pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL,
>> +				      val);
>> +	} else {
>> +		val = reg;
>> +	}
>> +
>> +	rc = pci_reset_bus_function(dev, probe);
>> +
>> +	if (reg != val)
>> +		pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL, reg);
>> +
>> +out:
>> +	pci_cfg_access_unlock(bridge);
> 
> Maybe a guard() use case to allow early returns in error paths?

I'm not seeing a good way to do it. pci_cfg_access_lock/unlock() aren't like
your typical lock/unlock. Each of them takes the lock, changes some pci_dev
internal state, and then drops the lock again. The pci_lock isn't held after
the lock() call returns.

> 
>> +	return rc;
>> +}
>> +
>>  void pci_dev_lock(struct pci_dev *dev)
>>  {
>>  	/* block PM suspend, driver probe, etc. */
>> @@ -5066,6 +5109,7 @@ static const struct pci_reset_fn_method pci_reset_fn_methods[] = {
>>  	{ pci_af_flr, .name = "af_flr" },
>>  	{ pci_pm_reset, .name = "pm" },
>>  	{ pci_reset_bus_function, .name = "bus" },
>> +	{ cxl_reset_bus_function, .name = "cxl_bus" },
>>  };
>>  
>>  static ssize_t reset_method_show(struct device *dev,
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 16493426a04f..235f37715a43 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -51,7 +51,7 @@
>>  			       PCI_STATUS_PARITY)
>>  
>>  /* Number of reset methods used in pci_reset_fn_methods array in pci.c */
>> -#define PCI_NUM_RESET_METHODS 7
>> +#define PCI_NUM_RESET_METHODS 8
>>  
>>  #define PCI_RESET_PROBE		true
>>  #define PCI_RESET_DO_RESET	false
> 


* Re: [PATCH v3 4/4] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR)
  2024-04-03 15:32   ` Jonathan Cameron
  2024-04-03 16:27     ` Dan Williams
@ 2024-04-04  8:51     ` Lukas Wunner
  2024-04-04 13:13       ` Jonathan Cameron
  1 sibling, 1 reply; 17+ messages in thread
From: Lukas Wunner @ 2024-04-04  8:51 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dave Jiang, linux-cxl, linux-pci, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, dave, bhelgaas

On Wed, Apr 03, 2024 at 04:32:57PM +0100, Jonathan Cameron wrote:
> On Tue, 2 Apr 2024 16:45:32 -0700 Dave Jiang <dave.jiang@intel.com> wrote:
> Why not pass the info on what reset was done down from the PCI core?
> I see Bjorn commented it would be *possible* to do it in the PCI core
> but raised other concerns that needed addressing first (I think you've
> dealt with those now).  Doesn't look that hard to me (I've not coded it
> up yet though).
> 
> The core code knows how far it got down the list reset_methods before
> it succeeded in resetting.  So...
> 
> Modify __pci_reset_function_locked() to return the index of the reset
> method that succeeded. Then pass that to pci_dev_restore().
> Finally push it into a reset_done2() that takes that as an extra
> parameter and the driver can see if it is FLR or SBR.

The reset types to distinguish per PCIe r6.2 sec 6.6 are
Conventional Reset and Function Level Reset.

Secondary Bus Reset is a Conventional Reset.

The spec subdivides Conventional Reset into Cold, Warm and Hot,
but that distinction is probably irrelevant for the kernel.

I think a more generalized (and therefore better) approach would be
to store the reset type the device has undergone in struct pci_dev,
right next to error_state, so that not just the ->reset_done()
callback benefits from the information.  The reset type applied has
consequences beyond the individual driver:  E.g. an FLR does not
affect CMA-SPDM session state, but a Conventional Reset does.
So there may be consumers of that information in the PCI core as well.

It's worth noting that we already have an enum pcie_reset_state in
<linux/pci.h> which distinguishes between deassert, warm and hot reset.
It is currently only used by PowerPC EEH to convey to the platform
which type of reset it should apply.  It might be possible to extend
the enum so that it can be used to store the reset type that *was*
applied to a device in struct pci_dev.
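
A minimal sketch of what that could look like, as a standalone userspace
model rather than a patch. The enum, the last_reset field and both helpers
are hypothetical names made up for illustration; nothing like them exists
in struct pci_dev today, and the existing enum pcie_reset_state is not
reused here.

```c
#include <stdbool.h>

/* Hypothetical: the kind of reset a device most recently went through. */
enum model_reset_type {
	MODEL_RESET_NONE,
	MODEL_RESET_FLR,		/* Function Level Reset */
	MODEL_RESET_CONVENTIONAL,	/* Cold, Warm or Hot; SBR lands here */
};

/* Stand-in for struct pci_dev; only the fields relevant here. */
struct model_pci_dev {
	int error_state;			/* placeholder for the existing field */
	enum model_reset_type last_reset;	/* hypothetical new field */
};

/* The core records the type once the reset has completed. */
static void model_record_reset(struct model_pci_dev *dev,
			       enum model_reset_type type)
{
	dev->last_reset = type;
}

/*
 * Example consumer outside of ->reset_done(): state that survives an FLR
 * but not a Conventional Reset (e.g. a CMA-SPDM session) has to be
 * re-established only in the latter case.
 */
static bool model_session_lost(const struct model_pci_dev *dev)
{
	return dev->last_reset == MODEL_RESET_CONVENTIONAL;
}
```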

That all being said, checking for the *symptoms* of a Conventional Reset,
as Dave has done here, may actually be more robust than just relying on
what type of reset was applied.  E.g. after an FLR was handled, the device
may experience a DPC-induced Hot Reset.  By checking for the *symptoms*,
the driver may be able to catch that the device has undergone a
Conventional Reset immediately after an FLR.  Also, who knows if all
devices are well-behaved and retain their state during an FLR, as they
should per the spec?  Maybe there are broken devices which do not respect
that rule.  Checking for symptoms of a Conventional Reset would catch
those devices as well.

Thanks,

Lukas


* Re: [PATCH v3 4/4] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR)
  2024-04-04  8:51     ` Lukas Wunner
@ 2024-04-04 13:13       ` Jonathan Cameron
  0 siblings, 0 replies; 17+ messages in thread
From: Jonathan Cameron @ 2024-04-04 13:13 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Dave Jiang, linux-cxl, linux-pci, dan.j.williams, ira.weiny,
	vishal.l.verma, alison.schofield, dave, bhelgaas

On Thu, 4 Apr 2024 10:51:36 +0200
Lukas Wunner <lukas@wunner.de> wrote:

> On Wed, Apr 03, 2024 at 04:32:57PM +0100, Jonathan Cameron wrote:
> > On Tue, 2 Apr 2024 16:45:32 -0700 Dave Jiang <dave.jiang@intel.com> wrote:
> > Why not pass the info on what reset was done down from the PCI core?
> > I see Bjorn commented it would be *possible* to do it in the PCI core
> > but raised other concerns that needed addressing first (I think you've
> > dealt with those now).  Doesn't look that hard to me (I've not coded it
> > up yet though).
> > 
> > The core code knows how far it got down the list reset_methods before
> > it succeeded in resetting.  So...
> > 
> > Modify __pci_reset_function_locked() to return the index of the reset
> > method that succeeded. Then pass that to pci_dev_restore().
> > Finally push it into a reset_done2() that takes that as an extra
> > parameter and the driver can see if it is FLR or SBR.  
> 
> The reset types to distinguish per PCIe r6.2 sec 6.6 are
> Conventional Reset and Function Level Reset.
> 
> Secondary Bus Reset is a Conventional Reset.
> 
> The spec subdivides Conventional Reset into Cold, Warm and Hot,
> but that distinction is probably irrelevant for the kernel.

Agreed. SBR is only called out explicitly here because it's the one
with a handy triggering mechanism.

> 
> I think a more generalized (and therefore better) approach would be
> to store the reset type the device has undergone in struct pci_dev,
> right next to error_state, so that not just the ->reset_done()
> callback benefits from the information.  The reset type applied has
> consequences beyond the individual driver:  E.g. an FLR does not
> affect CMA-SPDM session state, but a Conventional Reset does.
> So there may be consumers of that information in the PCI core as well.

That makes sense if we do go the route of enhancing the information
provided for a reset.

> 
> It's worth noting that we already have an enum pcie_reset_state in
> <linux/pci.h> which distinguishes between deassert, warm and hot reset.
> It is currently only used by PowerPC EEH to convey to the platform
> which type of reset it should apply.  It might be possible to extend
> the enum so that it can be used to store the reset type that *was*
> applied to a device in struct pci_dev.
> 
> That all being said, checking for the *symptoms* of a Conventional Reset,
> as Dave has done here, may actually be more robust than just relying on
> what type of reset was applied.  E.g. after an FLR was handled, the device
> may experience a DPC-induced Hot Reset.  

This sounds like a plausible reason for doing it by symptom checking.

> By checking for the *symptoms*,
> the driver may be able to catch that the device has undergone a
> Conventional Reset immediately after an FLR.  Also, who knows if all
> devices are well-behaved and retain their state during an FLR, as they
> should per the spec?  Maybe there are broken devices which do not respect
> that rule.  Checking for symptoms of a Conventional Reset would catch
> those devices as well.

I'm not particularly keen on adding complexity to the kernel for
possibly broken devices. For CXL devices the rules are very clear
and the HDM decoder must not be reset.  Otherwise, chances are the host OS
will lose memory set up by the BIOS, and that isn't healthy.

Perhaps the key point here is that the patch title is misleading / simplistic.
The patch only warns if a reset happened that caused a configuration mismatch
for the address decoders.  SBR at other times is fine.

So even if we had a reset_type available, the driver would still need
to see if it mattered.

So I've ended up arguing myself into accepting that all this code is needed
anyway. Perhaps change the patch title to

cxl: Add post reset warning if reset results in loss of previously committed HDM decoders.

Or something along those lines.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Jonathan

> 
> Thanks,
> 
> Lukas



* Re: [PATCH v3 4/4] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR)
  2024-04-03 16:27     ` Dan Williams
@ 2024-04-04 13:16       ` Jonathan Cameron
  0 siblings, 0 replies; 17+ messages in thread
From: Jonathan Cameron @ 2024-04-04 13:16 UTC (permalink / raw)
  To: Dan Williams
  Cc: Dave Jiang, linux-cxl, linux-pci, ira.weiny, vishal.l.verma,
	alison.schofield, dave, bhelgaas, lukas

On Wed, 3 Apr 2024 09:27:28 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > On Tue, 2 Apr 2024 16:45:32 -0700
> > Dave Jiang <dave.jiang@intel.com> wrote:
> >   
> > > SBR is equivalent to a device being hot removed and inserted again. Doing an
> > > SBR on a CXL type 3 device is problematic if the exported device memory is
> > > part of system memory that cannot be offlined. The event is equivalent to
> > > violently ripping out that range of memory from the kernel. While the
> > > hardware requires the "Unmask SBR" bit set in the Port Control Extensions
> > > register and the kernel currently does not unmask it, a user can unmask
> > > this bit via setpci or a similar tool.
> > > 
> > > The driver does not have a way to detect whether a reset coming from the
> > > PCI subsystem is a Function Level Reset (FLR) or an SBR. The only way to
> > > detect an SBR is to note whether a decoder is marked as enabled in software
> > > but the decoder control register indicates it's not committed.
> > > 
> > > A helper function is added to find discrepancies between the decoder
> > > software state and the hardware register state.
> > > 
> > > Suggested-by: Dan Williams <dan.j.williams@intel.com>
> > > Signed-off-by: Dave Jiang <dave.jiang@intel.com>  
> > 
> > As I said way back on v1, this smells hacky.
> > 
> > Why not pass the info on what reset was done down from the PCI core?
> > I see Bjorn commented it would be *possible* to do it in the PCI core
> > but raised other concerns that needed addressing first (I think you've
> > dealt with those now).  Doesn't look that hard to me (I've not coded it
> > up yet though).
> > 
> > The core code knows how far it got down the list reset_methods before
> > it succeeded in resetting.  So...
> > 
> > Modify __pci_reset_function_locked() to return the index of the reset
> > method that succeeded. Then pass that to pci_dev_restore().
> > Finally push it into a reset_done2() that takes that as an extra
> > parameter and the driver can see if it is FLR or SBR.
> > The extended reset_done is to avoid modifying lots of drivers.
> > However a quick grep suggests it's not that heavily used (15ish?)
> > so maybe just add the parameter.
> > 
> > There are a few other paths, but none look that problematic at
> > first glance...
> > 
> > So Bjorn, now that the rest of this is hopefully close to what you'll be
> > happy with, which way do you prefer?
> 
> I will defer to Bjorn, but I am not a fan of this reset_done2() proposal.
> "Revalidate after reset" is a common driver pattern, and all that
> plumbing the effective reset type does is make cxl_reset_done() more
> precise for no discernible value.

As per the other thread branch, I think you are right, but the key is that
this is not detecting the SBR at all; it's detecting HDM decoders not being
in the expected state. If they weren't set up before the SBR, then we don't
warn.  So SBR is the cause, but not what is being detected (which is a
subset of SBR results).
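
To spell out what is actually being checked, a small userspace model of
the comparison follows. It is only a sketch: the struct, both functions and
the two bit definitions are hypothetical stand-ins for the decoder flags
and the HDM decoder control register, and the bit positions are assumed
for illustration rather than taken from the driver headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_DECODER_F_ENABLE	(1u << 0)	/* software: decoder in use */
#define MODEL_CTRL_COMMITTED	(1u << 10)	/* hardware: decoder committed */

struct model_decoder {
	uint32_t flags;	/* software state, survives the reset */
	uint32_t ctrl;	/* control register value read back after the reset */
};

/* True if software thinks the decoder is enabled but hardware lost it. */
static bool model_decoder_reset_detected(const struct model_decoder *d)
{
	if (!(d->flags & MODEL_DECODER_F_ENABLE))
		return false;	/* never set up, so nothing to lose */
	return !(d->ctrl & MODEL_CTRL_COMMITTED);
}

/* True if any decoder of the endpoint shows the mismatch. */
static bool model_endpoint_reset_detected(const struct model_decoder *decs,
					  size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (model_decoder_reset_detected(&decs[i]))
			return true;
	}
	return false;
}
```

An endpoint whose decoders were never enabled reports no mismatch even
after a full SBR, which is exactly the case where no warning is issued.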
  
> 



* Re: [PATCH v3 3/4] PCI: Create new reset method to force SBR for CXL
  2024-04-04  0:21     ` Dave Jiang
@ 2024-04-04 13:29       ` Jonathan Cameron
  2024-04-04 14:42         ` Dan Williams
  0 siblings, 1 reply; 17+ messages in thread
From: Jonathan Cameron @ 2024-04-04 13:29 UTC (permalink / raw)
  To: Dave Jiang
  Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave, bhelgaas, lukas


> >   
> >> +
> >> +	bridge = pci_upstream_bridge(dev);
> >> +	if (!bridge)
> >> +		return -ENOTTY;
> >> +
> >> +	dvsec = cxl_port_dvsec(bridge);
> >> +	if (!dvsec)
> >> +		return -ENOTTY;
> >> +
> >> +	if (probe)
> >> +		return 0;
> >> +
> >> +	pci_cfg_access_lock(bridge);
> >> +	rc = pci_read_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL, &reg);
> >> +	if (rc) {
> >> +		rc = -ENOTTY;
> >> +		goto out;
> >> +	}
> >> +
> >> +	if (!(reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR)) {
> >> +		val = reg | PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR;
> >> +		pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL,
> >> +				      val);
> >> +	} else {
> >> +		val = reg;
> >> +	}
> >> +
> >> +	rc = pci_reset_bus_function(dev, probe);
> >> +
> >> +	if (reg != val)
> >> +		pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL, reg);
> >> +
> >> +out:
> >> +	pci_cfg_access_unlock(bridge);  
> > 
> > Maybe a guard() use case to allow early returns in error paths?  
> 
> > I'm not seeing a good way to do it. pci_cfg_access_lock/unlock() aren't like
> > your typical lock/unlock. Each of them takes the lock, changes some pci_dev
> > internal state, and then drops the lock again. The pci_lock isn't held after
> > the lock() call returns.
> 

You've lost me.

Why does guard() care about the internals?

All it does is stash a copy of the argument to '_lock' (here the bridge
struct pci_dev), then call the _unlock on it when the stashed copy of that
pointer goes out of scope.

Those functions don't need to hold a conventional lock.  Though in this case
I believe the lock is effectively pci_dev->block_cfg_access.
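
For illustration, the mechanism can be modelled in plain userspace C with
the GCC/Clang cleanup attribute. This is a sketch only: model_dev,
MODEL_GUARD and the model_* helpers are made-up names, the "lock" is just a
flag, and the kernel's own guard() infrastructure in <linux/cleanup.h> is
not used here.

```c
#include <stdbool.h>

/* Stand-in for struct pci_dev; the "lock" is just a flag. */
struct model_dev {
	bool block_cfg_access;
	bool read_fails;	/* knob: make the config read fail */
};

static void model_cfg_access_lock(struct model_dev *dev)
{
	dev->block_cfg_access = true;
}

static void model_cfg_access_unlock(struct model_dev *dev)
{
	dev->block_cfg_access = false;
}

/* Cleanup handler: receives a pointer to the stashed pointer. */
static void model_guard_release(struct model_dev **p)
{
	if (*p)
		model_cfg_access_unlock(*p);
}

/* Lock now, unlock automatically when the enclosing scope ends. */
#define MODEL_GUARD(dev)						\
	struct model_dev *model_guard_stash				\
		__attribute__((unused, cleanup(model_guard_release))) =	\
		(model_cfg_access_lock(dev), (dev))

static int model_reset(struct model_dev *bridge)
{
	MODEL_GUARD(bridge);

	if (bridge->read_fails)
		return -1;	/* early return, the unlock still runs */

	/* ... unmask SBR, reset, restore ... */
	return 0;
}
```

Nothing in the guard depends on what the lock and unlock functions do
internally; it only needs a matching pair that takes the same pointer.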

FWIW we do something similar in IIO (with a conditional lock for extra fun :)
https://elixir.bootlin.com/linux/v6.9-rc2/source/include/linux/iio/iio.h#L650
That is setting a flag much like this one.  Don't look too closely at that though
as it evolved into a slightly odd form and needs a revisit.

This was a possible nice to have, not something I care that much about
in this patch set so feel free to not do it :)

Jonathan



> >   
> >> +	return rc;
> >> +}
> >> +
> >>  void pci_dev_lock(struct pci_dev *dev)
> >>  {
> >>  	/* block PM suspend, driver probe, etc. */
> >> @@ -5066,6 +5109,7 @@ static const struct pci_reset_fn_method pci_reset_fn_methods[] = {
> >>  	{ pci_af_flr, .name = "af_flr" },
> >>  	{ pci_pm_reset, .name = "pm" },
> >>  	{ pci_reset_bus_function, .name = "bus" },
> >> +	{ cxl_reset_bus_function, .name = "cxl_bus" },
> >>  };
> >>  
> >>  static ssize_t reset_method_show(struct device *dev,
> >> diff --git a/include/linux/pci.h b/include/linux/pci.h
> >> index 16493426a04f..235f37715a43 100644
> >> --- a/include/linux/pci.h
> >> +++ b/include/linux/pci.h
> >> @@ -51,7 +51,7 @@
> >>  			       PCI_STATUS_PARITY)
> >>  
> >>  /* Number of reset methods used in pci_reset_fn_methods array in pci.c */
> >> -#define PCI_NUM_RESET_METHODS 7
> >> +#define PCI_NUM_RESET_METHODS 8
> >>  
> >>  #define PCI_RESET_PROBE		true
> >>  #define PCI_RESET_DO_RESET	false  
> >   
> 



* Re: [PATCH v3 3/4] PCI: Create new reset method to force SBR for CXL
  2024-04-04 13:29       ` Jonathan Cameron
@ 2024-04-04 14:42         ` Dan Williams
  0 siblings, 0 replies; 17+ messages in thread
From: Dan Williams @ 2024-04-04 14:42 UTC (permalink / raw)
  To: Jonathan Cameron, Dave Jiang
  Cc: linux-cxl, linux-pci, dan.j.williams, ira.weiny, vishal.l.verma,
	alison.schofield, dave, bhelgaas, lukas

Jonathan Cameron wrote:
[..]
> > > Maybe a guard() use case to allow early returns in error paths?  
> > 
> > > I'm not seeing a good way to do it. pci_cfg_access_lock/unlock() aren't like
> > > your typical lock/unlock. Each of them takes the lock, changes some pci_dev
> > > internal state, and then drops the lock again. The pci_lock isn't held after
> > > the lock() call returns.
> > 
> 
> You've lost me.
> 
> Why does guard() care about the internals?
> 
> All it does is stash a copy of the argument to '_lock' (here the bridge
> struct pci_dev), then call the _unlock on it when the stashed copy of that
> pointer goes out of scope.

Agree, and I suggested offlist to just use pci_dev_lock() similar to
pci_reset_function(). There is already a guard() for that, and
pci_dev_lock() is amenable to lockdep.


Thread overview: 17+ messages
2024-04-02 23:45 [PATCH 0/4 v3] PCI: Add Secondary Bus Reset (SBR) support for CXL Dave Jiang
2024-04-02 23:45 ` [PATCH v3 1/4] PCI/cxl: Move PCI CXL vendor Id to a common location from CXL subsystem Dave Jiang
2024-04-02 23:45 ` [PATCH v3 2/4] PCI: Add check for CXL Secondary Bus Reset Dave Jiang
2024-04-03  8:26   ` Lukas Wunner
2024-04-04  0:19     ` Dave Jiang
2024-04-03 15:01   ` Jonathan Cameron
2024-04-02 23:45 ` [PATCH v3 3/4] PCI: Create new reset method to force SBR for CXL Dave Jiang
2024-04-03 15:09   ` Jonathan Cameron
2024-04-04  0:21     ` Dave Jiang
2024-04-04 13:29       ` Jonathan Cameron
2024-04-04 14:42         ` Dan Williams
2024-04-02 23:45 ` [PATCH v3 4/4] cxl: Add post reset warning if reset is detected as Secondary Bus Reset (SBR) Dave Jiang
2024-04-03 15:32   ` Jonathan Cameron
2024-04-03 16:27     ` Dan Williams
2024-04-04 13:16       ` Jonathan Cameron
2024-04-04  8:51     ` Lukas Wunner
2024-04-04 13:13       ` Jonathan Cameron
