linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 0/8] Expose and manage PCI device reset
@ 2021-06-08  5:48 Amey Narkhede
  2021-06-08  5:48 ` [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods Amey Narkhede
                   ` (8 more replies)
  0 siblings, 9 replies; 52+ messages in thread
From: Amey Narkhede @ 2021-06-08  5:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki,
	Amey Narkhede

PCI and PCIe devices may support a number of possible reset mechanisms
for example Function Level Reset (FLR) provided via Advanced Feature or
PCIe capabilities, Power Management reset, bus reset, or device specific reset.
Currently the PCI subsystem creates a policy prioritizing these reset methods
which provides neither visibility nor control to userspace.

Expose the reset methods available per device to userspace, via sysfs
and allow an administrative user or device owner to have ability to
manage per device reset method priorities or exclusions.
This feature aims to allow greater control of a device for use cases
as device assignment, where specific device or platform issues may
interact poorly with a given reset method, and for which device specific
quirks have not been developed.

Changes in v7:
	- Fix the pci_dev_acpi_reset() prototype mismatch
	  in case of CONFIG_ACPI=n

Changes in v6:
	- Address Bjorn's and Krzysztof's review comments
	- Add Shanker's updated patches along with new
	  "PCI: Setup ACPI_COMPANION early" patch

Changes in v5:
	- Rebase the series over pci/reset branch of
	  Bjorn's pci tree to avoid merge conflicts
	  caused by recent changes in existing reset
	  sysfs attribute

Changes in v4:
	- Change the order or strlen and strim in reset_method_store
	  function to avoid extra strlen call.
	- Use consistent terminology in new
	  pci_reset_mode enum and rename the probe argument
	  of reset functions.

Changes in v3:
	- Dropped "PCI: merge slot and bus reset implementations" which was
	  already accepted separately
	- Grammar fixes
	- Added Shanker's patches which were rebased on v2 of this series
	- Added "PCI: Change the type of probe argument in reset functions"
	  and additional user input sanitization code in reset_method_store
	  function per review feedback from Krzysztof

Changes in v2:
	- Use byte array instead of bitmap to keep track of
	  ordering of reset methods
	- Fix incorrect use of reset_fn field in octeon driver
	- Allow writing comma separated list of names of supported reset
	  methods to reset_method sysfs attribute
	- Writing empty string instead of "none" to reset_method attribute
	  disables ability of reset the device

Amey Narkhede (5):
  PCI: Add pcie_reset_flr to follow calling convention of other reset
    methods
  PCI: Add new array for keeping track of ordering of reset methods
  PCI: Remove reset_fn field from pci_dev
  PCI/sysfs: Allow userspace to query and set device reset mechanism
  PCI: Change the type of probe argument in reset functions

Shanker Donthineni (3):
  PCI: Setup ACPI_COMPANION early
  PCI: Add support for ACPI _RST reset method
  PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs

 Documentation/ABI/testing/sysfs-bus-pci       |  16 ++
 drivers/crypto/cavium/nitrox/nitrox_main.c    |   4 +-
 .../ethernet/cavium/liquidio/lio_vf_main.c    |   2 +-
 drivers/pci/hotplug/pciehp.h                  |   2 +-
 drivers/pci/hotplug/pciehp_hpc.c              |   4 +-
 drivers/pci/pci-acpi.c                        |  39 ++-
 drivers/pci/pci-sysfs.c                       | 120 ++++++++-
 drivers/pci/pci.c                             | 238 +++++++++++-------
 drivers/pci/pci.h                             |  22 +-
 drivers/pci/pcie/aer.c                        |  12 +-
 drivers/pci/probe.c                           |   6 +-
 drivers/pci/quirks.c                          |  54 ++--
 drivers/pci/remove.c                          |   1 -
 include/linux/pci.h                           |  16 +-
 include/linux/pci_hotplug.h                   |   2 +-
 15 files changed, 393 insertions(+), 145 deletions(-)

--
2.31.1

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods
  2021-06-08  5:48 [PATCH v7 0/8] Expose and manage PCI device reset Amey Narkhede
@ 2021-06-08  5:48 ` Amey Narkhede
  2021-06-10 20:15   ` Shanker R Donthineni
                     ` (2 more replies)
  2021-06-08  5:48 ` [PATCH v7 2/8] PCI: Add new array for keeping track of ordering of " Amey Narkhede
                   ` (7 subsequent siblings)
  8 siblings, 3 replies; 52+ messages in thread
From: Amey Narkhede @ 2021-06-08  5:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki,
	Amey Narkhede

Currently there is separate function pcie_has_flr() to probe if pcie flr is
supported by the device which does not match the calling convention
followed by reset methods which use second function argument to decide
whether to probe or not.  Add new function pcie_reset_flr() that follows
the calling convention of reset methods.

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
---
 drivers/crypto/cavium/nitrox/nitrox_main.c |  4 +-
 drivers/pci/pci.c                          | 62 ++++++++++++----------
 drivers/pci/pcie/aer.c                     | 12 ++---
 drivers/pci/quirks.c                       |  9 ++--
 include/linux/pci.h                        |  2 +-
 5 files changed, 43 insertions(+), 46 deletions(-)

diff --git a/drivers/crypto/cavium/nitrox/nitrox_main.c b/drivers/crypto/cavium/nitrox/nitrox_main.c
index facc8e6bc..15d6c8452 100644
--- a/drivers/crypto/cavium/nitrox/nitrox_main.c
+++ b/drivers/crypto/cavium/nitrox/nitrox_main.c
@@ -306,9 +306,7 @@ static int nitrox_device_flr(struct pci_dev *pdev)
 		return -ENOMEM;
 	}
 
-	/* check flr support */
-	if (pcie_has_flr(pdev))
-		pcie_flr(pdev);
+	pcie_reset_flr(pdev, 0);
 
 	pci_restore_state(pdev);
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 452351025..3bf36924c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4611,32 +4611,12 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev)
 }
 EXPORT_SYMBOL(pci_wait_for_pending_transaction);
 
-/**
- * pcie_has_flr - check if a device supports function level resets
- * @dev: device to check
- *
- * Returns true if the device advertises support for PCIe function level
- * resets.
- */
-bool pcie_has_flr(struct pci_dev *dev)
-{
-	u32 cap;
-
-	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
-		return false;
-
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
-	return cap & PCI_EXP_DEVCAP_FLR;
-}
-EXPORT_SYMBOL_GPL(pcie_has_flr);
-
 /**
  * pcie_flr - initiate a PCIe function level reset
  * @dev: device to reset
  *
- * Initiate a function level reset on @dev.  The caller should ensure the
- * device supports FLR before calling this function, e.g. by using the
- * pcie_has_flr() helper.
+ * Initiate a function level reset unconditionally on @dev without
+ * checking any flags and DEVCAP
  */
 int pcie_flr(struct pci_dev *dev)
 {
@@ -4659,6 +4639,31 @@ int pcie_flr(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pcie_flr);
 
+/**
+ * pcie_reset_flr - initiate a PCIe function level reset
+ * @dev: device to reset
+ * @probe: If set, only check if the device can be reset this way.
+ *
+ * Initiate a function level reset on @dev.
+ */
+int pcie_reset_flr(struct pci_dev *dev, int probe)
+{
+	u32 cap;
+
+	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
+		return -ENOTTY;
+
+	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
+	if (!(cap & PCI_EXP_DEVCAP_FLR))
+		return -ENOTTY;
+
+	if (probe)
+		return 0;
+
+	return pcie_flr(dev);
+}
+EXPORT_SYMBOL_GPL(pcie_reset_flr);
+
 static int pci_af_flr(struct pci_dev *dev, int probe)
 {
 	int pos;
@@ -5139,11 +5144,9 @@ int __pci_reset_function_locked(struct pci_dev *dev)
 	rc = pci_dev_specific_reset(dev, 0);
 	if (rc != -ENOTTY)
 		return rc;
-	if (pcie_has_flr(dev)) {
-		rc = pcie_flr(dev);
-		if (rc != -ENOTTY)
-			return rc;
-	}
+	rc = pcie_reset_flr(dev, 0);
+	if (rc != -ENOTTY)
+		return rc;
 	rc = pci_af_flr(dev, 0);
 	if (rc != -ENOTTY)
 		return rc;
@@ -5174,8 +5177,9 @@ int pci_probe_reset_function(struct pci_dev *dev)
 	rc = pci_dev_specific_reset(dev, 1);
 	if (rc != -ENOTTY)
 		return rc;
-	if (pcie_has_flr(dev))
-		return 0;
+	rc = pcie_reset_flr(dev, 1);
+	if (rc != -ENOTTY)
+		return rc;
 	rc = pci_af_flr(dev, 1);
 	if (rc != -ENOTTY)
 		return rc;
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index ec943cee5..98077595a 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1405,13 +1405,11 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
 	}
 
 	if (type == PCI_EXP_TYPE_RC_EC || type == PCI_EXP_TYPE_RC_END) {
-		if (pcie_has_flr(dev)) {
-			rc = pcie_flr(dev);
-			pci_info(dev, "has been reset (%d)\n", rc);
-		} else {
-			pci_info(dev, "not reset (no FLR support)\n");
-			rc = -ENOTTY;
-		}
+		rc = pcie_reset_flr(dev, 0);
+		if (!rc)
+			pci_info(dev, "has been reset\n");
+		else
+			pci_info(dev, "not reset (no FLR support: %d)\n", rc);
 	} else {
 		rc = pci_bus_error_reset(dev);
 		pci_info(dev, "%s Port link has been reset (%d)\n",
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index d85914afe..f977ba79a 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3819,7 +3819,7 @@ static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
 	u32 cfg;
 
 	if (dev->class != PCI_CLASS_STORAGE_EXPRESS ||
-	    !pcie_has_flr(dev) || !pci_resource_start(dev, 0))
+	    pcie_reset_flr(dev, 1) || !pci_resource_start(dev, 0))
 		return -ENOTTY;
 
 	if (probe)
@@ -3888,13 +3888,10 @@ static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
  */
 static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
 {
-	if (!pcie_has_flr(dev))
-		return -ENOTTY;
+	int ret = pcie_reset_flr(dev, probe);
 
 	if (probe)
-		return 0;
-
-	pcie_flr(dev);
+		return ret;
 
 	msleep(250);
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c20211e59..20b90c205 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1225,7 +1225,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
 			     enum pci_bus_speed *speed,
 			     enum pcie_link_width *width);
 void pcie_print_link_status(struct pci_dev *dev);
-bool pcie_has_flr(struct pci_dev *dev);
+int pcie_reset_flr(struct pci_dev *dev, int probe);
 int pcie_flr(struct pci_dev *dev);
 int __pci_reset_function_locked(struct pci_dev *dev);
 int pci_reset_function(struct pci_dev *dev);
-- 
2.31.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v7 2/8] PCI: Add new array for keeping track of ordering of reset methods
  2021-06-08  5:48 [PATCH v7 0/8] Expose and manage PCI device reset Amey Narkhede
  2021-06-08  5:48 ` [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods Amey Narkhede
@ 2021-06-08  5:48 ` Amey Narkhede
  2021-06-10 20:15   ` Shanker R Donthineni
  2021-06-17 23:13   ` Bjorn Helgaas
  2021-06-08  5:48 ` [PATCH v7 3/8] PCI: Remove reset_fn field from pci_dev Amey Narkhede
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 52+ messages in thread
From: Amey Narkhede @ 2021-06-08  5:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki,
	Amey Narkhede

Introduce a new array reset_methods in struct pci_dev to keep track of
reset mechanisms supported by the device and their ordering.
Also refactor probing and reset functions to take advantage of calling
convention of reset functions.

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
---
 drivers/pci/pci.c   | 108 ++++++++++++++++++++++++++------------------
 drivers/pci/pci.h   |   8 +++-
 drivers/pci/probe.c |   5 +-
 include/linux/pci.h |   7 +++
 4 files changed, 81 insertions(+), 47 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 3bf36924c..39a9ea8bb 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -72,6 +72,14 @@ static void pci_dev_d3_sleep(struct pci_dev *dev)
 		msleep(delay);
 }
 
+bool pci_reset_supported(struct pci_dev *dev)
+{
+	u8 null_reset_methods[PCI_RESET_METHODS_NUM] = { 0 };
+
+	return memcmp(null_reset_methods,
+		      dev->reset_methods, PCI_RESET_METHODS_NUM);
+}
+
 #ifdef CONFIG_PCI_DOMAINS
 int pci_domains_supported = 1;
 #endif
@@ -5107,6 +5115,18 @@ static void pci_dev_restore(struct pci_dev *dev)
 		err_handler->reset_done(dev);
 }
 
+/*
+ * The ordering for functions in pci_reset_fn_methods is required for
+ * reset_methods byte array defined in struct pci_dev.
+ */
+const struct pci_reset_fn_method pci_reset_fn_methods[] = {
+	{ &pci_dev_specific_reset, .name = "device_specific" },
+	{ &pcie_reset_flr, .name = "flr" },
+	{ &pci_af_flr, .name = "af_flr" },
+	{ &pci_pm_reset, .name = "pm" },
+	{ &pci_reset_bus_function, .name = "bus" },
+};
+
 /**
  * __pci_reset_function_locked - reset a PCI device function while holding
  * the @dev mutex lock.
@@ -5129,65 +5149,67 @@ static void pci_dev_restore(struct pci_dev *dev)
  */
 int __pci_reset_function_locked(struct pci_dev *dev)
 {
-	int rc;
+	int i, rc = -ENOTTY;
+	u8 prio;
 
 	might_sleep();
 
-	/*
-	 * A reset method returns -ENOTTY if it doesn't support this device
-	 * and we should try the next method.
-	 *
-	 * If it returns 0 (success), we're finished.  If it returns any
-	 * other error, we're also finished: this indicates that further
-	 * reset mechanisms might be broken on the device.
-	 */
-	rc = pci_dev_specific_reset(dev, 0);
-	if (rc != -ENOTTY)
-		return rc;
-	rc = pcie_reset_flr(dev, 0);
-	if (rc != -ENOTTY)
-		return rc;
-	rc = pci_af_flr(dev, 0);
-	if (rc != -ENOTTY)
-		return rc;
-	rc = pci_pm_reset(dev, 0);
-	if (rc != -ENOTTY)
-		return rc;
-	return pci_reset_bus_function(dev, 0);
+	for (prio = PCI_RESET_METHODS_NUM; prio; prio--) {
+		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
+			if (dev->reset_methods[i] == prio) {
+				/*
+				 * A reset method returns -ENOTTY if it doesn't
+				 * support this device and we should try the
+				 * next method.
+				 *
+				 * If it returns 0 (success), we're finished.
+				 * If it returns any other error, we're also
+				 * finished: this indicates that further reset
+				 * mechanisms might be broken on the device.
+				 */
+				rc = pci_reset_fn_methods[i].reset_fn(dev, 0);
+				if (rc != -ENOTTY)
+					return rc;
+				break;
+			}
+		}
+		if (i == PCI_RESET_METHODS_NUM)
+			break;
+	}
+	return rc;
 }
 EXPORT_SYMBOL_GPL(__pci_reset_function_locked);
 
 /**
- * pci_probe_reset_function - check whether the device can be safely reset
- * @dev: PCI device to reset
+ * pci_init_reset_methods - check whether device can be safely reset
+ * and store supported reset mechanisms.
+ * @dev: PCI device to check for reset mechanisms
  *
  * Some devices allow an individual function to be reset without affecting
  * other functions in the same device.  The PCI device must be responsive
- * to PCI config space in order to use this function.
+ * to reads and writes to its PCI config space in order to use this function.
  *
- * Returns 0 if the device function can be reset or negative if the
- * device doesn't support resetting a single function.
+ * Stores reset mechanisms supported by device in reset_methods byte array
+ * which is a member of struct pci_dev.
  */
-int pci_probe_reset_function(struct pci_dev *dev)
+void pci_init_reset_methods(struct pci_dev *dev)
 {
-	int rc;
+	int i, rc;
+	u8 prio = PCI_RESET_METHODS_NUM;
+	u8 reset_methods[PCI_RESET_METHODS_NUM] = { 0 };
 
-	might_sleep();
+	BUILD_BUG_ON(ARRAY_SIZE(pci_reset_fn_methods) != PCI_RESET_METHODS_NUM);
 
-	rc = pci_dev_specific_reset(dev, 1);
-	if (rc != -ENOTTY)
-		return rc;
-	rc = pcie_reset_flr(dev, 1);
-	if (rc != -ENOTTY)
-		return rc;
-	rc = pci_af_flr(dev, 1);
-	if (rc != -ENOTTY)
-		return rc;
-	rc = pci_pm_reset(dev, 1);
-	if (rc != -ENOTTY)
-		return rc;
+	might_sleep();
 
-	return pci_reset_bus_function(dev, 1);
+	for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
+		rc = pci_reset_fn_methods[i].reset_fn(dev, 1);
+		if (!rc)
+			reset_methods[i] = prio--;
+		else if (rc != -ENOTTY)
+			break;
+	}
+	memcpy(dev->reset_methods, reset_methods, sizeof(reset_methods));
 }
 
 /**
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 37c913bbc..13ec6bd6f 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -33,7 +33,7 @@ enum pci_mmap_api {
 int pci_mmap_fits(struct pci_dev *pdev, int resno, struct vm_area_struct *vmai,
 		  enum pci_mmap_api mmap_api);
 
-int pci_probe_reset_function(struct pci_dev *dev);
+void pci_init_reset_methods(struct pci_dev *dev);
 int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
 int pci_bus_error_reset(struct pci_dev *dev);
 
@@ -606,6 +606,12 @@ struct pci_dev_reset_methods {
 	int (*reset)(struct pci_dev *dev, int probe);
 };
 
+struct pci_reset_fn_method {
+	int (*reset_fn)(struct pci_dev *pdev, int probe);
+	char *name;
+};
+
+extern const struct pci_reset_fn_method pci_reset_fn_methods[];
 #ifdef CONFIG_PCI_QUIRKS
 int pci_dev_specific_reset(struct pci_dev *dev, int probe);
 #else
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 3a62d09b8..8cf532681 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2404,9 +2404,8 @@ static void pci_init_capabilities(struct pci_dev *dev)
 	pci_rcec_init(dev);		/* Root Complex Event Collector */
 
 	pcie_report_downtraining(dev);
-
-	if (pci_probe_reset_function(dev) == 0)
-		dev->reset_fn = 1;
+	pci_init_reset_methods(dev);
+	dev->reset_fn = pci_reset_supported(dev);
 }
 
 /*
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 20b90c205..0955246f8 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -49,6 +49,8 @@
 			       PCI_STATUS_SIG_TARGET_ABORT | \
 			       PCI_STATUS_PARITY)
 
+#define PCI_RESET_METHODS_NUM 5
+
 /*
  * The PCI interface treats multi-function devices as independent
  * devices.  The slot/function address of each device is encoded
@@ -505,6 +507,10 @@ struct pci_dev {
 	char		*driver_override; /* Driver name to force a match */
 
 	unsigned long	priv_flags;	/* Private flags for the PCI driver */
+	/*
+	 * See pci_reset_fn_methods array in pci.c for ordering.
+	 */
+	u8 reset_methods[PCI_RESET_METHODS_NUM];	/* Reset methods ordered by priority */
 };
 
 static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
@@ -1227,6 +1233,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
 void pcie_print_link_status(struct pci_dev *dev);
 int pcie_reset_flr(struct pci_dev *dev, int probe);
 int pcie_flr(struct pci_dev *dev);
+bool pci_reset_supported(struct pci_dev *dev);
 int __pci_reset_function_locked(struct pci_dev *dev);
 int pci_reset_function(struct pci_dev *dev);
 int pci_reset_function_locked(struct pci_dev *dev);
-- 
2.31.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v7 3/8] PCI: Remove reset_fn field from pci_dev
  2021-06-08  5:48 [PATCH v7 0/8] Expose and manage PCI device reset Amey Narkhede
  2021-06-08  5:48 ` [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods Amey Narkhede
  2021-06-08  5:48 ` [PATCH v7 2/8] PCI: Add new array for keeping track of ordering of " Amey Narkhede
@ 2021-06-08  5:48 ` Amey Narkhede
  2021-06-10 20:16   ` Shanker R Donthineni
  2021-06-08  5:48 ` [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism Amey Narkhede
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 52+ messages in thread
From: Amey Narkhede @ 2021-06-08  5:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki,
	Amey Narkhede

reset_fn field is used to indicate whether the device supports any reset
mechanism or not. Remove the use of reset_fn in favor of new reset_methods
array which can be used to keep track of all supported reset mechanisms of
a device and their ordering.

The octeon driver is incorrectly using
reset_fn field to detect if the device supports FLR or not. Use
pcie_reset_flr() to probe whether it supports FLR or not.

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
---
 drivers/net/ethernet/cavium/liquidio/lio_vf_main.c | 2 +-
 drivers/pci/pci-sysfs.c                            | 2 +-
 drivers/pci/pci.c                                  | 6 +++---
 drivers/pci/probe.c                                | 1 -
 drivers/pci/quirks.c                               | 2 +-
 drivers/pci/remove.c                               | 1 -
 include/linux/pci.h                                | 1 -
 7 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
index 516f166ce..336d149ee 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
@@ -526,7 +526,7 @@ static void octeon_destroy_resources(struct octeon_device *oct)
 			oct->irq_name_storage = NULL;
 		}
 		/* Soft reset the octeon device before exiting */
-		if (oct->pci_dev->reset_fn)
+		if (!pcie_reset_flr(oct->pci_dev, 1))
 			octeon_pci_flr(oct);
 		else
 			cn23xx_vf_ask_pf_to_do_flr(oct);
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index beb8d1f4f..316f70c3e 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1367,7 +1367,7 @@ static umode_t pci_dev_reset_attr_is_visible(struct kobject *kobj,
 {
 	struct pci_dev *pdev = to_pci_dev(kobj_to_dev(kobj));
 
-	if (!pdev->reset_fn)
+	if (!pci_reset_supported(pdev))
 		return 0;
 
 	return a->mode;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 39a9ea8bb..2302aa421 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5232,7 +5232,7 @@ int pci_reset_function(struct pci_dev *dev)
 {
 	int rc;
 
-	if (!dev->reset_fn)
+	if (!pci_reset_supported(dev))
 		return -ENOTTY;
 
 	pci_dev_lock(dev);
@@ -5268,7 +5268,7 @@ int pci_reset_function_locked(struct pci_dev *dev)
 {
 	int rc;
 
-	if (!dev->reset_fn)
+	if (!pci_reset_supported(dev))
 		return -ENOTTY;
 
 	pci_dev_save_and_disable(dev);
@@ -5291,7 +5291,7 @@ int pci_try_reset_function(struct pci_dev *dev)
 {
 	int rc;
 
-	if (!dev->reset_fn)
+	if (!pci_reset_supported(dev))
 		return -ENOTTY;
 
 	if (!pci_dev_trylock(dev))
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 8cf532681..90fd4f61f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2405,7 +2405,6 @@ static void pci_init_capabilities(struct pci_dev *dev)
 
 	pcie_report_downtraining(dev);
 	pci_init_reset_methods(dev);
-	dev->reset_fn = pci_reset_supported(dev);
 }
 
 /*
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index f977ba79a..e86cf4a3b 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5589,7 +5589,7 @@ static void quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev)
 
 	if (pdev->subsystem_vendor != PCI_VENDOR_ID_LENOVO ||
 	    pdev->subsystem_device != 0x222e ||
-	    !pdev->reset_fn)
+	    !pci_reset_supported(pdev))
 		return;
 
 	if (pci_enable_device_mem(pdev))
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index dd12c2fcc..4c54c7505 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -19,7 +19,6 @@ static void pci_stop_dev(struct pci_dev *dev)
 	pci_pme_active(dev, false);
 
 	if (pci_dev_is_added(dev)) {
-		dev->reset_fn = 0;
 
 		device_release_driver(&dev->dev);
 		pci_proc_detach_device(dev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 0955246f8..6e9bc4f9c 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -429,7 +429,6 @@ struct pci_dev {
 	unsigned int	state_saved:1;
 	unsigned int	is_physfn:1;
 	unsigned int	is_virtfn:1;
-	unsigned int	reset_fn:1;
 	unsigned int	is_hotplug_bridge:1;
 	unsigned int	shpc_managed:1;		/* SHPC owned by shpchp */
 	unsigned int	is_thunderbolt:1;	/* Thunderbolt controller */
-- 
2.31.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-08  5:48 [PATCH v7 0/8] Expose and manage PCI device reset Amey Narkhede
                   ` (2 preceding siblings ...)
  2021-06-08  5:48 ` [PATCH v7 3/8] PCI: Remove reset_fn field from pci_dev Amey Narkhede
@ 2021-06-08  5:48 ` Amey Narkhede
  2021-06-09 21:57   ` Raphael Norwitz
                     ` (3 more replies)
  2021-06-08  5:48 ` [PATCH v7 5/8] PCI: Setup ACPI_COMPANION early Amey Narkhede
                   ` (4 subsequent siblings)
  8 siblings, 4 replies; 52+ messages in thread
From: Amey Narkhede @ 2021-06-08  5:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki,
	Amey Narkhede

Add reset_method sysfs attribute to enable user to
query and set user preferred device reset methods and
their ordering.

Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
---
 Documentation/ABI/testing/sysfs-bus-pci |  16 ++++
 drivers/pci/pci-sysfs.c                 | 118 ++++++++++++++++++++++++
 2 files changed, 134 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index ef00fada2..cf6dbbb3c 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -121,6 +121,22 @@ Description:
 		child buses, and re-discover devices removed earlier
 		from this part of the device tree.
 
+What:		/sys/bus/pci/devices/.../reset_method
+Date:		March 2021
+Contact:	Amey Narkhede <ameynarkhede03@gmail.com>
+Description:
+		Some devices allow an individual function to be reset
+		without affecting other functions in the same slot.
+		For devices that have this support, a file named reset_method
+		will be present in sysfs. Reading this file will give names
+		of the device supported reset methods and their ordering.
+		Writing the name or comma separated list of names of any of
+		the device supported reset methods to this file will set the
+		reset methods and their ordering to be used when resetting
+		the device. Writing empty string to this file will disable
+		ability to reset the device and writing "default" will return
+		to the original value.
+
 What:		/sys/bus/pci/devices/.../reset
 Date:		July 2009
 Contact:	Michael S. Tsirkin <mst@redhat.com>
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 316f70c3e..52def79aa 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1334,6 +1334,123 @@ static const struct attribute_group pci_dev_rom_attr_group = {
 	.is_bin_visible = pci_dev_rom_attr_is_visible,
 };
 
+static ssize_t reset_method_show(struct device *dev,
+				 struct device_attribute *attr,
+				 char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	ssize_t len = 0;
+	int i, prio;
+
+	for (prio = PCI_RESET_METHODS_NUM; prio; prio--) {
+		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
+			if (prio == pdev->reset_methods[i]) {
+				len += sysfs_emit_at(buf, len, "%s%s",
+						     len ? "," : "",
+						     pci_reset_fn_methods[i].name);
+				break;
+			}
+		}
+
+		if (i == PCI_RESET_METHODS_NUM)
+			break;
+	}
+
+	if (len)
+		len += sysfs_emit_at(buf, len, "\n");
+
+	return len;
+}
+
+static ssize_t reset_method_store(struct device *dev,
+				  struct device_attribute *attr,
+				  const char *buf, size_t count)
+{
+	u8 reset_methods[PCI_RESET_METHODS_NUM];
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u8 prio = PCI_RESET_METHODS_NUM;
+	char *name, *options;
+	int i;
+
+	if (count >= (PAGE_SIZE - 1))
+		return -EINVAL;
+
+	options = kstrndup(buf, count, GFP_KERNEL);
+	if (!options)
+		return -ENOMEM;
+
+	/*
+	 * Initialize reset_method such that 0xff indicates
+	 * supported but not currently enabled reset methods
+	 * as we only use priority values which are within
+	 * the range of PCI_RESET_FN_METHODS array size
+	 */
+	for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
+		reset_methods[i] = pdev->reset_methods[i] ? 0xff : 0;
+
+	if (sysfs_streq(options, "")) {
+		pci_warn(pdev, "All device reset methods disabled by user");
+		goto set_reset_methods;
+	}
+
+	if (sysfs_streq(options, "default")) {
+		for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
+			reset_methods[i] = reset_methods[i] ? prio-- : 0;
+		goto set_reset_methods;
+	}
+
+	while ((name = strsep(&options, ",")) != NULL) {
+		if (sysfs_streq(name, ""))
+			continue;
+
+		name = strim(name);
+
+		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
+			if (reset_methods[i] &&
+			    sysfs_streq(name, pci_reset_fn_methods[i].name)) {
+				reset_methods[i] = prio--;
+				break;
+			}
+		}
+
+		if (i == PCI_RESET_METHODS_NUM) {
+			kfree(options);
+			return -EINVAL;
+		}
+	}
+
+	if (reset_methods[0] &&
+	    reset_methods[0] != PCI_RESET_METHODS_NUM)
+		pci_warn(pdev, "Device specific reset disabled/de-prioritized by user");
+
+set_reset_methods:
+	kfree(options);
+	memcpy(pdev->reset_methods, reset_methods, sizeof(reset_methods));
+	return count;
+}
+static DEVICE_ATTR_RW(reset_method);
+
+static struct attribute *pci_dev_reset_method_attrs[] = {
+	&dev_attr_reset_method.attr,
+	NULL,
+};
+
+static umode_t pci_dev_reset_method_attr_is_visible(struct kobject *kobj,
+						    struct attribute *a, int n)
+{
+	struct pci_dev *pdev = to_pci_dev(kobj_to_dev(kobj));
+
+	if (!pci_reset_supported(pdev))
+		return 0;
+
+	return a->mode;
+}
+
+static const struct attribute_group pci_dev_reset_method_attr_group = {
+	.attrs = pci_dev_reset_method_attrs,
+	.is_visible = pci_dev_reset_method_attr_is_visible,
+};
+
 static ssize_t reset_store(struct device *dev, struct device_attribute *attr,
 			   const char *buf, size_t count)
 {
@@ -1491,6 +1608,7 @@ const struct attribute_group *pci_dev_groups[] = {
 	&pci_dev_config_attr_group,
 	&pci_dev_rom_attr_group,
 	&pci_dev_reset_attr_group,
+	&pci_dev_reset_method_attr_group,
 	&pci_dev_vpd_attr_group,
 #ifdef CONFIG_DMI
 	&pci_dev_smbios_attr_group,
-- 
2.31.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v7 5/8] PCI: Setup ACPI_COMPANION early
  2021-06-08  5:48 [PATCH v7 0/8] Expose and manage PCI device reset Amey Narkhede
                   ` (3 preceding siblings ...)
  2021-06-08  5:48 ` [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism Amey Narkhede
@ 2021-06-08  5:48 ` Amey Narkhede
  2021-06-08  5:48 ` [PATCH v7 6/8] PCI: Add support for ACPI _RST reset method Amey Narkhede
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 52+ messages in thread
From: Amey Narkhede @ 2021-06-08  5:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

From: Shanker Donthineni <sdonthineni@nvidia.com>

Currently, the ACPI_COMPANION is not available until device_add().
The software features which have dependency on ACPI fwnode properties
and needs to be handled before device_add() will not work. One use
case, software has to check the existence of _RST method to support
ACPI based reset mechanism.

This patch adds a new function pci_set_acpi_fwnode() for setting the
ACPI_COMPANION, same code which is available in acpi_pci_bridge_d3().

Call pci_set_acpi_fwnode() from pci_scan_device() to fix the issue.

Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
---
 drivers/pci/pci-acpi.c | 12 ++++++++----
 drivers/pci/pci.h      |  2 ++
 drivers/pci/probe.c    |  2 ++
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 36bc23e21..eaddbf701 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -934,6 +934,13 @@ static pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
 
 static struct acpi_device *acpi_pci_find_companion(struct device *dev);
 
+void pci_set_acpi_fwnode(struct pci_dev *dev)
+{
+	if (!ACPI_COMPANION(&dev->dev) && !pci_dev_is_added(dev))
+		ACPI_COMPANION_SET(&dev->dev,
+				   acpi_pci_find_companion(&dev->dev));
+}
+
 static bool acpi_pci_bridge_d3(struct pci_dev *dev)
 {
 	const struct fwnode_handle *fwnode;
@@ -945,11 +952,8 @@ static bool acpi_pci_bridge_d3(struct pci_dev *dev)
 		return false;
 
 	/* Assume D3 support if the bridge is power-manageable by ACPI. */
+	pci_set_acpi_fwnode(dev);
 	adev = ACPI_COMPANION(&dev->dev);
-	if (!adev && !pci_dev_is_added(dev)) {
-		adev = acpi_pci_find_companion(&dev->dev);
-		ACPI_COMPANION_SET(&dev->dev, adev);
-	}
 
 	if (adev && acpi_device_power_manageable(adev))
 		return true;
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 13ec6bd6f..d22da6d3c 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -703,7 +703,9 @@ static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL
 #ifdef CONFIG_ACPI
 int pci_acpi_program_hp_params(struct pci_dev *dev);
 extern const struct attribute_group pci_dev_acpi_attr_group;
+void pci_set_acpi_fwnode(struct pci_dev *dev);
 #else
+static inline void pci_set_acpi_fwnode(struct pci_dev *dev) {}
 static inline int pci_acpi_program_hp_params(struct pci_dev *dev)
 {
 	return -ENODEV;
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 90fd4f61f..dfefa5ed0 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2359,6 +2359,8 @@ static struct pci_dev *pci_scan_device(struct pci_bus *bus, int devfn)
 		return NULL;
 	}
 
+	pci_set_acpi_fwnode(dev);
+
 	return dev;
 }
 
-- 
2.31.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v7 6/8] PCI: Add support for ACPI _RST reset method
  2021-06-08  5:48 [PATCH v7 0/8] Expose and manage PCI device reset Amey Narkhede
                   ` (4 preceding siblings ...)
  2021-06-08  5:48 ` [PATCH v7 5/8] PCI: Setup ACPI_COMPANION early Amey Narkhede
@ 2021-06-08  5:48 ` Amey Narkhede
  2021-06-08  5:48 ` [PATCH v7 7/8] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs Amey Narkhede
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 52+ messages in thread
From: Amey Narkhede @ 2021-06-08  5:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

From: Shanker Donthineni <sdonthineni@nvidia.com>

The _RST is a standard method specified in the ACPI specification. It
provides a function level reset when it is described in the acpi_device
context associated with PCI-device. Implement a new reset function
pci_dev_acpi_reset() for probing RST method and execute if it is defined
in the firmware.

The default priority of the ACPI reset is set to below device-specific
and above hardware resets.

Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
---
 drivers/pci/pci-acpi.c | 23 +++++++++++++++++++++++
 drivers/pci/pci.c      |  1 +
 drivers/pci/pci.h      |  6 ++++++
 include/linux/pci.h    |  2 +-
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index eaddbf701..40dd24cd3 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -941,6 +941,29 @@ void pci_set_acpi_fwnode(struct pci_dev *dev)
 				   acpi_pci_find_companion(&dev->dev));
 }
 
+/**
+ * pci_dev_acpi_reset - do a function level reset using _RST method
+ * @dev: device to reset
+ * @probe: check if _RST method is included in the acpi_device context.
+ */
+int pci_dev_acpi_reset(struct pci_dev *dev, int probe)
+{
+	acpi_handle handle = ACPI_HANDLE(&dev->dev);
+
+	if (!handle || !acpi_has_method(handle, "_RST"))
+		return -ENOTTY;
+
+	if (probe)
+		return 0;
+
+	if (ACPI_FAILURE(acpi_evaluate_object(handle, "_RST", NULL, NULL))) {
+		pci_warn(dev, "ACPI _RST failed\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static bool acpi_pci_bridge_d3(struct pci_dev *dev)
 {
 	const struct fwnode_handle *fwnode;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2302aa421..2e7efd7e7 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5121,6 +5121,7 @@ static void pci_dev_restore(struct pci_dev *dev)
  */
 const struct pci_reset_fn_method pci_reset_fn_methods[] = {
 	{ &pci_dev_specific_reset, .name = "device_specific" },
+	{ &pci_dev_acpi_reset, .name = "acpi" },
 	{ &pcie_reset_flr, .name = "flr" },
 	{ &pci_af_flr, .name = "af_flr" },
 	{ &pci_pm_reset, .name = "pm" },
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index d22da6d3c..e9cfb7cd6 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -704,7 +704,13 @@ static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL
 int pci_acpi_program_hp_params(struct pci_dev *dev);
 extern const struct attribute_group pci_dev_acpi_attr_group;
 void pci_set_acpi_fwnode(struct pci_dev *dev);
+int pci_dev_acpi_reset(struct pci_dev *dev, int probe);
 #else
+static inline int pci_dev_acpi_reset(struct pci_dev *dev, int probe)
+{
+	return -ENOTTY;
+}
+
 static inline void pci_set_acpi_fwnode(struct pci_dev *dev) {}
 static inline int pci_acpi_program_hp_params(struct pci_dev *dev)
 {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 6e9bc4f9c..a7f063da2 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -49,7 +49,7 @@
 			       PCI_STATUS_SIG_TARGET_ABORT | \
 			       PCI_STATUS_PARITY)
 
-#define PCI_RESET_METHODS_NUM 5
+#define PCI_RESET_METHODS_NUM 6
 
 /*
  * The PCI interface treats multi-function devices as independent
-- 
2.31.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v7 7/8] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs
  2021-06-08  5:48 [PATCH v7 0/8] Expose and manage PCI device reset Amey Narkhede
                   ` (5 preceding siblings ...)
  2021-06-08  5:48 ` [PATCH v7 6/8] PCI: Add support for ACPI _RST reset method Amey Narkhede
@ 2021-06-08  5:48 ` Amey Narkhede
  2021-06-10 23:16   ` Bjorn Helgaas
  2021-06-08  5:48 ` [PATCH v7 8/8] PCI: Change the type of probe argument in reset functions Amey Narkhede
  2021-06-08 10:05 ` [PATCH v7 0/8] Expose and manage PCI device reset Enrico Weigelt, metux IT consult
  8 siblings, 1 reply; 52+ messages in thread
From: Amey Narkhede @ 2021-06-08  5:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

From: Shanker Donthineni <sdonthineni@nvidia.com>

On select platforms, some Nvidia GPU devices do not work with SBR.
Triggering SBR would leave the device inoperable for the current
system boot. It requires a system hard-reboot to get the GPU device
back to normal operating condition post-SBR. For the affected
devices, enable NO_BUS_RESET quirk to fix the issue.

This issue will be fixed in the next generation of hardware.

Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
---
 drivers/pci/quirks.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index e86cf4a3b..45a8c3caa 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3546,6 +3546,18 @@ static void quirk_no_bus_reset(struct pci_dev *dev)
 	dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
 }
 
+/*
+ * Some Nvidia GPU devices do not work with bus reset, SBR needs to be
+ * prevented for those affected devices.
+ */
+static void quirk_nvidia_no_bus_reset(struct pci_dev *dev)
+{
+	if ((dev->device & 0xffc0) == 0x2340)
+		quirk_no_bus_reset(dev);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
+			 quirk_nvidia_no_bus_reset);
+
 /*
  * Some Atheros AR9xxx and QCA988x chips do not behave after a bus reset.
  * The device will throw a Link Down error on AER-capable systems and
-- 
2.31.1


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v7 8/8] PCI: Change the type of probe argument in reset functions
  2021-06-08  5:48 [PATCH v7 0/8] Expose and manage PCI device reset Amey Narkhede
                   ` (6 preceding siblings ...)
  2021-06-08  5:48 ` [PATCH v7 7/8] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs Amey Narkhede
@ 2021-06-08  5:48 ` Amey Narkhede
  2021-06-09 21:40   ` Raphael Norwitz
  2021-06-08 10:05 ` [PATCH v7 0/8] Expose and manage PCI device reset Enrico Weigelt, metux IT consult
  8 siblings, 1 reply; 52+ messages in thread
From: Amey Narkhede @ 2021-06-08  5:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki,
	Amey Narkhede

Introduce a new enum pci_reset_mode_t to make the context of probe argument
in reset functions clear and the code easier to read.
Change the type of probe argument in functions which implement reset
methods from int to pci_reset_mode_t to make the intent clear.

Add a new line in return statement of pci_reset_bus_function().

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Suggested-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
---
 drivers/crypto/cavium/nitrox/nitrox_main.c    |  2 +-
 .../ethernet/cavium/liquidio/lio_vf_main.c    |  2 +-
 drivers/pci/hotplug/pciehp.h                  |  2 +-
 drivers/pci/hotplug/pciehp_hpc.c              |  4 +-
 drivers/pci/pci-acpi.c                        | 10 ++-
 drivers/pci/pci.c                             | 85 ++++++++++++-------
 drivers/pci/pci.h                             | 12 +--
 drivers/pci/pcie/aer.c                        |  2 +-
 drivers/pci/quirks.c                          | 37 ++++----
 include/linux/pci.h                           |  8 +-
 include/linux/pci_hotplug.h                   |  2 +-
 11 files changed, 101 insertions(+), 65 deletions(-)

diff --git a/drivers/crypto/cavium/nitrox/nitrox_main.c b/drivers/crypto/cavium/nitrox/nitrox_main.c
index 15d6c8452..f97fa8e99 100644
--- a/drivers/crypto/cavium/nitrox/nitrox_main.c
+++ b/drivers/crypto/cavium/nitrox/nitrox_main.c
@@ -306,7 +306,7 @@ static int nitrox_device_flr(struct pci_dev *pdev)
 		return -ENOMEM;
 	}

-	pcie_reset_flr(pdev, 0);
+	pcie_reset_flr(pdev, PCI_RESET_DO_RESET);

 	pci_restore_state(pdev);

diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
index 336d149ee..6e666be69 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
@@ -526,7 +526,7 @@ static void octeon_destroy_resources(struct octeon_device *oct)
 			oct->irq_name_storage = NULL;
 		}
 		/* Soft reset the octeon device before exiting */
-		if (!pcie_reset_flr(oct->pci_dev, 1))
+		if (!pcie_reset_flr(oct->pci_dev, PCI_RESET_PROBE))
 			octeon_pci_flr(oct);
 		else
 			cn23xx_vf_ask_pf_to_do_flr(oct);
diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 4fd200d8b..87da03adc 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -181,7 +181,7 @@ void pciehp_release_ctrl(struct controller *ctrl);

 int pciehp_sysfs_enable_slot(struct hotplug_slot *hotplug_slot);
 int pciehp_sysfs_disable_slot(struct hotplug_slot *hotplug_slot);
-int pciehp_reset_slot(struct hotplug_slot *hotplug_slot, int probe);
+int pciehp_reset_slot(struct hotplug_slot *hotplug_slot, pci_reset_mode_t mode);
 int pciehp_get_attention_status(struct hotplug_slot *hotplug_slot, u8 *status);
 int pciehp_set_raw_indicator_status(struct hotplug_slot *h_slot, u8 status);
 int pciehp_get_raw_indicator_status(struct hotplug_slot *h_slot, u8 *status);
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index fb3840e22..24b3c8787 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -834,14 +834,14 @@ void pcie_disable_interrupt(struct controller *ctrl)
  * momentarily, if we see that they could interfere. Also, clear any spurious
  * events after.
  */
-int pciehp_reset_slot(struct hotplug_slot *hotplug_slot, int probe)
+int pciehp_reset_slot(struct hotplug_slot *hotplug_slot, pci_reset_mode_t mode)
 {
 	struct controller *ctrl = to_ctrl(hotplug_slot);
 	struct pci_dev *pdev = ctrl_dev(ctrl);
 	u16 stat_mask = 0, ctrl_mask = 0;
 	int rc;

-	if (probe)
+	if (mode == PCI_RESET_PROBE)
 		return 0;

 	down_write(&ctrl->reset_lock);
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 40dd24cd3..9de334457 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -944,16 +944,20 @@ void pci_set_acpi_fwnode(struct pci_dev *dev)
 /**
  * pci_dev_acpi_reset - do a function level reset using _RST method
  * @dev: device to reset
- * @probe: check if _RST method is included in the acpi_device context.
+ * @probe: If PCI_RESET_PROBE, check whether _RST method is included
+ *         in the acpi_device context.
  */
-int pci_dev_acpi_reset(struct pci_dev *dev, int probe)
+int pci_dev_acpi_reset(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	acpi_handle handle = ACPI_HANDLE(&dev->dev);

+	if (mode >= PCI_RESET_MODE_MAX)
+		return -EINVAL;
+
 	if (!handle || !acpi_has_method(handle, "_RST"))
 		return -ENOTTY;

-	if (probe)
+	if (mode == PCI_RESET_PROBE)
 		return 0;

 	if (ACPI_FAILURE(acpi_evaluate_object(handle, "_RST", NULL, NULL))) {
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2e7efd7e7..e28611d7c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4650,14 +4650,17 @@ EXPORT_SYMBOL_GPL(pcie_flr);
 /**
  * pcie_reset_flr - initiate a PCIe function level reset
  * @dev: device to reset
- * @probe: If set, only check if the device can be reset this way.
+ * @mode: If PCI_RESET_PROBE, only check if the device can be reset this way.
  *
  * Initiate a function level reset on @dev.
  */
-int pcie_reset_flr(struct pci_dev *dev, int probe)
+int pcie_reset_flr(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	u32 cap;

+	if (mode >= PCI_RESET_MODE_MAX)
+		return -EINVAL;
+
 	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
 		return -ENOTTY;

@@ -4665,18 +4668,21 @@ int pcie_reset_flr(struct pci_dev *dev, int probe)
 	if (!(cap & PCI_EXP_DEVCAP_FLR))
 		return -ENOTTY;

-	if (probe)
+	if (mode == PCI_RESET_PROBE)
 		return 0;

 	return pcie_flr(dev);
 }
 EXPORT_SYMBOL_GPL(pcie_reset_flr);

-static int pci_af_flr(struct pci_dev *dev, int probe)
+static int pci_af_flr(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	int pos;
 	u8 cap;

+	if (mode >= PCI_RESET_MODE_MAX)
+		return -EINVAL;
+
 	pos = pci_find_capability(dev, PCI_CAP_ID_AF);
 	if (!pos)
 		return -ENOTTY;
@@ -4688,7 +4694,7 @@ static int pci_af_flr(struct pci_dev *dev, int probe)
 	if (!(cap & PCI_AF_CAP_TP) || !(cap & PCI_AF_CAP_FLR))
 		return -ENOTTY;

-	if (probe)
+	if (mode == PCI_RESET_PROBE)
 		return 0;

 	/*
@@ -4719,7 +4725,7 @@ static int pci_af_flr(struct pci_dev *dev, int probe)
 /**
  * pci_pm_reset - Put device into PCI_D3 and back into PCI_D0.
  * @dev: Device to reset.
- * @probe: If set, only check if the device can be reset this way.
+ * @mode: If PCI_RESET_PROBE, only check if the device can be reset this way.
  *
  * If @dev supports native PCI PM and its PCI_PM_CTRL_NO_SOFT_RESET flag is
  * unset, it will be reinitialized internally when going from PCI_D3hot to
@@ -4731,10 +4737,13 @@ static int pci_af_flr(struct pci_dev *dev, int probe)
  * by default (i.e. unless the @dev's d3hot_delay field has a different value).
  * Moreover, only devices in D0 can be reset by this function.
  */
-static int pci_pm_reset(struct pci_dev *dev, int probe)
+static int pci_pm_reset(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	u16 csr;

+	if (mode >= PCI_RESET_MODE_MAX)
+		return -EINVAL;
+
 	if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
 		return -ENOTTY;

@@ -4742,7 +4751,7 @@ static int pci_pm_reset(struct pci_dev *dev, int probe)
 	if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
 		return -ENOTTY;

-	if (probe)
+	if (mode == PCI_RESET_PROBE)
 		return 0;

 	if (dev->current_state != PCI_D0)
@@ -4991,10 +5000,13 @@ int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_bridge_secondary_bus_reset);

-static int pci_parent_bus_reset(struct pci_dev *dev, int probe)
+static int pci_parent_bus_reset(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	struct pci_dev *pdev;

+	if (mode >= PCI_RESET_MODE_MAX)
+		return -EINVAL;
+
 	if (pci_is_root_bus(dev->bus) || dev->subordinate ||
 	    !dev->bus->self || dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET)
 		return -ENOTTY;
@@ -5003,44 +5015,47 @@ static int pci_parent_bus_reset(struct pci_dev *dev, int probe)
 		if (pdev != dev)
 			return -ENOTTY;

-	if (probe)
+	if (mode == PCI_RESET_PROBE)
 		return 0;

 	return pci_bridge_secondary_bus_reset(dev->bus->self);
 }

-static int pci_reset_hotplug_slot(struct hotplug_slot *hotplug, int probe)
+static int pci_reset_hotplug_slot(struct hotplug_slot *hotplug, pci_reset_mode_t mode)
 {
 	int rc = -ENOTTY;

+	if (mode >= PCI_RESET_MODE_MAX)
+		return -EINVAL;
+
 	if (!hotplug || !try_module_get(hotplug->owner))
 		return rc;

 	if (hotplug->ops->reset_slot)
-		rc = hotplug->ops->reset_slot(hotplug, probe);
+		rc = hotplug->ops->reset_slot(hotplug, mode);

 	module_put(hotplug->owner);

 	return rc;
 }

-static int pci_dev_reset_slot_function(struct pci_dev *dev, int probe)
+static int pci_dev_reset_slot_function(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	if (dev->multifunction || dev->subordinate || !dev->slot ||
 	    dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET)
 		return -ENOTTY;

-	return pci_reset_hotplug_slot(dev->slot->hotplug, probe);
+	return pci_reset_hotplug_slot(dev->slot->hotplug, mode);
 }

-static int pci_reset_bus_function(struct pci_dev *dev, int probe)
+static int pci_reset_bus_function(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	int rc;

-	rc = pci_dev_reset_slot_function(dev, probe);
+	rc = pci_dev_reset_slot_function(dev, mode);
 	if (rc != -ENOTTY)
 		return rc;
-	return pci_parent_bus_reset(dev, probe);
+	return pci_parent_bus_reset(dev, mode);
 }

 static void pci_dev_lock(struct pci_dev *dev)
@@ -5168,7 +5183,7 @@ int __pci_reset_function_locked(struct pci_dev *dev)
 				 * finished: this indicates that further reset
 				 * mechanisms might be broken on the device.
 				 */
-				rc = pci_reset_fn_methods[i].reset_fn(dev, 0);
+				rc = pci_reset_fn_methods[i].reset_fn(dev, PCI_RESET_DO_RESET);
 				if (rc != -ENOTTY)
 					return rc;
 				break;
@@ -5204,7 +5219,7 @@ void pci_init_reset_methods(struct pci_dev *dev)
 	might_sleep();

 	for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
-		rc = pci_reset_fn_methods[i].reset_fn(dev, 1);
+		rc = pci_reset_fn_methods[i].reset_fn(dev, PCI_RESET_PROBE);
 		if (!rc)
 			reset_methods[i] = prio--;
 		else if (rc != -ENOTTY)
@@ -5520,21 +5535,24 @@ static void pci_slot_restore_locked(struct pci_slot *slot)
 	}
 }

-static int pci_slot_reset(struct pci_slot *slot, int probe)
+static int pci_slot_reset(struct pci_slot *slot, pci_reset_mode_t mode)
 {
 	int rc;

+	if (mode >= PCI_RESET_MODE_MAX)
+		return -EINVAL;
+
 	if (!slot || !pci_slot_resetable(slot))
 		return -ENOTTY;

-	if (!probe)
+	if (mode != PCI_RESET_PROBE)
 		pci_slot_lock(slot);

 	might_sleep();

-	rc = pci_reset_hotplug_slot(slot->hotplug, probe);
+	rc = pci_reset_hotplug_slot(slot->hotplug, mode);

-	if (!probe)
+	if (mode != PCI_RESET_PROBE)
 		pci_slot_unlock(slot);

 	return rc;
@@ -5548,7 +5566,7 @@ static int pci_slot_reset(struct pci_slot *slot, int probe)
  */
 int pci_probe_reset_slot(struct pci_slot *slot)
 {
-	return pci_slot_reset(slot, 1);
+	return pci_slot_reset(slot, PCI_RESET_PROBE);
 }
 EXPORT_SYMBOL_GPL(pci_probe_reset_slot);

@@ -5571,14 +5589,14 @@ static int __pci_reset_slot(struct pci_slot *slot)
 {
 	int rc;

-	rc = pci_slot_reset(slot, 1);
+	rc = pci_slot_reset(slot, PCI_RESET_PROBE);
 	if (rc)
 		return rc;

 	if (pci_slot_trylock(slot)) {
 		pci_slot_save_and_disable_locked(slot);
 		might_sleep();
-		rc = pci_reset_hotplug_slot(slot->hotplug, 0);
+		rc = pci_reset_hotplug_slot(slot->hotplug, PCI_RESET_DO_RESET);
 		pci_slot_restore_locked(slot);
 		pci_slot_unlock(slot);
 	} else
@@ -5587,14 +5605,17 @@ static int __pci_reset_slot(struct pci_slot *slot)
 	return rc;
 }

-static int pci_bus_reset(struct pci_bus *bus, int probe)
+static int pci_bus_reset(struct pci_bus *bus, pci_reset_mode_t mode)
 {
 	int ret;

+	if (mode >= PCI_RESET_MODE_MAX)
+		return -EINVAL;
+
 	if (!bus->self || !pci_bus_resetable(bus))
 		return -ENOTTY;

-	if (probe)
+	if (mode == PCI_RESET_PROBE)
 		return 0;

 	pci_bus_lock(bus);
@@ -5633,14 +5654,14 @@ int pci_bus_error_reset(struct pci_dev *bridge)
 			goto bus_reset;

 	list_for_each_entry(slot, &bus->slots, list)
-		if (pci_slot_reset(slot, 0))
+		if (pci_slot_reset(slot, PCI_RESET_DO_RESET))
 			goto bus_reset;

 	mutex_unlock(&pci_slot_mutex);
 	return 0;
 bus_reset:
 	mutex_unlock(&pci_slot_mutex);
-	return pci_bus_reset(bridge->subordinate, 0);
+	return pci_bus_reset(bridge->subordinate, PCI_RESET_DO_RESET);
 }

 /**
@@ -5651,7 +5672,7 @@ int pci_bus_error_reset(struct pci_dev *bridge)
  */
 int pci_probe_reset_bus(struct pci_bus *bus)
 {
-	return pci_bus_reset(bus, 1);
+	return pci_bus_reset(bus, PCI_RESET_PROBE);
 }
 EXPORT_SYMBOL_GPL(pci_probe_reset_bus);

@@ -5665,7 +5686,7 @@ static int __pci_reset_bus(struct pci_bus *bus)
 {
 	int rc;

-	rc = pci_bus_reset(bus, 1);
+	rc = pci_bus_reset(bus, PCI_RESET_PROBE);
 	if (rc)
 		return rc;

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index e9cfb7cd6..9787700f8 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -603,19 +603,19 @@ static inline int pci_enable_ptm(struct pci_dev *dev, u8 *granularity)
 struct pci_dev_reset_methods {
 	u16 vendor;
 	u16 device;
-	int (*reset)(struct pci_dev *dev, int probe);
+	int (*reset)(struct pci_dev *dev, pci_reset_mode_t mode);
 };

 struct pci_reset_fn_method {
-	int (*reset_fn)(struct pci_dev *pdev, int probe);
+	int (*reset_fn)(struct pci_dev *pdev, pci_reset_mode_t mode);
 	char *name;
 };

 extern const struct pci_reset_fn_method pci_reset_fn_methods[];
 #ifdef CONFIG_PCI_QUIRKS
-int pci_dev_specific_reset(struct pci_dev *dev, int probe);
+int pci_dev_specific_reset(struct pci_dev *dev, pci_reset_mode_t mode);
 #else
-static inline int pci_dev_specific_reset(struct pci_dev *dev, int probe)
+static inline int pci_dev_specific_reset(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	return -ENOTTY;
 }
@@ -704,9 +704,9 @@ static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL
 int pci_acpi_program_hp_params(struct pci_dev *dev);
 extern const struct attribute_group pci_dev_acpi_attr_group;
 void pci_set_acpi_fwnode(struct pci_dev *dev);
-int pci_dev_acpi_reset(struct pci_dev *dev, int probe);
+int pci_dev_acpi_reset(struct pci_dev *dev, pci_reset_mode_t mode);
 #else
-static inline int pci_dev_acpi_reset(struct pci_dev *dev, int probe)
+static inline int pci_dev_acpi_reset(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	return -ENOTTY;
 }
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 98077595a..cfa7a1775 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1405,7 +1405,7 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
 	}

 	if (type == PCI_EXP_TYPE_RC_EC || type == PCI_EXP_TYPE_RC_END) {
-		rc = pcie_reset_flr(dev, 0);
+		rc = pcie_reset_flr(dev, PCI_RESET_DO_RESET);
 		if (!rc)
 			pci_info(dev, "has been reset\n");
 		else
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 45a8c3caa..60fd101ac 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3681,7 +3681,7 @@ DECLARE_PCI_FIXUP_SUSPEND_LATE(PCI_VENDOR_ID_INTEL,
  * reset a single function if other methods (e.g. FLR, PM D0->D3) are
  * not available.
  */
-static int reset_intel_82599_sfp_virtfn(struct pci_dev *dev, int probe)
+static int reset_intel_82599_sfp_virtfn(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	/*
 	 * http://www.intel.com/content/dam/doc/datasheet/82599-10-gbe-controller-datasheet.pdf
@@ -3691,7 +3691,7 @@ static int reset_intel_82599_sfp_virtfn(struct pci_dev *dev, int probe)
 	 * Thus we must call pcie_flr() directly without first checking if it is
 	 * supported.
 	 */
-	if (!probe)
+	if (mode == PCI_RESET_DO_RESET)
 		pcie_flr(dev);
 	return 0;
 }
@@ -3703,13 +3703,13 @@ static int reset_intel_82599_sfp_virtfn(struct pci_dev *dev, int probe)
 #define NSDE_PWR_STATE		0xd0100
 #define IGD_OPERATION_TIMEOUT	10000     /* set timeout 10 seconds */

-static int reset_ivb_igd(struct pci_dev *dev, int probe)
+static int reset_ivb_igd(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	void __iomem *mmio_base;
 	unsigned long timeout;
 	u32 val;

-	if (probe)
+	if (mode == PCI_RESET_PROBE)
 		return 0;

 	mmio_base = pci_iomap(dev, 0, 0);
@@ -3746,7 +3746,7 @@ static int reset_ivb_igd(struct pci_dev *dev, int probe)
 }

 /* Device-specific reset method for Chelsio T4-based adapters */
-static int reset_chelsio_generic_dev(struct pci_dev *dev, int probe)
+static int reset_chelsio_generic_dev(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	u16 old_command;
 	u16 msix_flags;
@@ -3762,7 +3762,7 @@ static int reset_chelsio_generic_dev(struct pci_dev *dev, int probe)
 	 * If this is the "probe" phase, return 0 indicating that we can
 	 * reset this device.
 	 */
-	if (probe)
+	if (mode == PCI_RESET_PROBE)
 		return 0;

 	/*
@@ -3824,17 +3824,17 @@ static int reset_chelsio_generic_dev(struct pci_dev *dev, int probe)
  *    Chapter 3: NVMe control registers
  *    Chapter 7.3: Reset behavior
  */
-static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
+static int nvme_disable_and_flr(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	void __iomem *bar;
 	u16 cmd;
 	u32 cfg;

 	if (dev->class != PCI_CLASS_STORAGE_EXPRESS ||
-	    pcie_reset_flr(dev, 1) || !pci_resource_start(dev, 0))
+	    pcie_reset_flr(dev, PCI_RESET_PROBE) || !pci_resource_start(dev, 0))
 		return -ENOTTY;

-	if (probe)
+	if (mode == PCI_RESET_PROBE)
 		return 0;

 	bar = pci_iomap(dev, 0, NVME_REG_CC + sizeof(cfg));
@@ -3898,11 +3898,13 @@ static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
  * device too soon after FLR.  A 250ms delay after FLR has heuristically
  * proven to produce reliably working results for device assignment cases.
  */
-static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
+static int delay_250ms_after_flr(struct pci_dev *dev, pci_reset_mode_t mode)
 {
-	int ret = pcie_reset_flr(dev, probe);
+	int ret;
+
+	ret = pcie_reset_flr(dev, mode);

-	if (probe)
+	if (ret || mode == PCI_RESET_PROBE)
 		return ret;

 	msleep(250);
@@ -3918,13 +3920,13 @@ static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
 #define HINIC_OPERATION_TIMEOUT     15000	/* 15 seconds */

 /* Device-specific reset method for Huawei Intelligent NIC virtual functions */
-static int reset_hinic_vf_dev(struct pci_dev *pdev, int probe)
+static int reset_hinic_vf_dev(struct pci_dev *pdev, pci_reset_mode_t mode)
 {
 	unsigned long timeout;
 	void __iomem *bar;
 	u32 val;

-	if (probe)
+	if (mode == PCI_RESET_PROBE)
 		return 0;

 	bar = pci_iomap(pdev, 0, 0);
@@ -3995,16 +3997,19 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
  * because when a host assigns a device to a guest VM, the host may need
  * to reset the device but probably doesn't have a driver for it.
  */
-int pci_dev_specific_reset(struct pci_dev *dev, int probe)
+int pci_dev_specific_reset(struct pci_dev *dev, pci_reset_mode_t mode)
 {
 	const struct pci_dev_reset_methods *i;

+	if (mode >= PCI_RESET_MODE_MAX)
+		return -EINVAL;
+
 	for (i = pci_dev_reset_methods; i->reset; i++) {
 		if ((i->vendor == dev->vendor ||
 		     i->vendor == (u16)PCI_ANY_ID) &&
 		    (i->device == dev->device ||
 		     i->device == (u16)PCI_ANY_ID))
-			return i->reset(dev, probe);
+			return i->reset(dev, mode);
 	}

 	return -ENOTTY;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index a7f063da2..c46df52e6 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -51,6 +51,12 @@

 #define PCI_RESET_METHODS_NUM 6

+typedef enum pci_reset_mode {
+	PCI_RESET_DO_RESET,
+	PCI_RESET_PROBE,
+	PCI_RESET_MODE_MAX,
+} pci_reset_mode_t;
+
 /*
  * The PCI interface treats multi-function devices as independent
  * devices.  The slot/function address of each device is encoded
@@ -1230,7 +1236,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
 			     enum pci_bus_speed *speed,
 			     enum pcie_link_width *width);
 void pcie_print_link_status(struct pci_dev *dev);
-int pcie_reset_flr(struct pci_dev *dev, int probe);
+int pcie_reset_flr(struct pci_dev *dev, pci_reset_mode_t mode);
 int pcie_flr(struct pci_dev *dev);
 bool pci_reset_supported(struct pci_dev *dev);
 int __pci_reset_function_locked(struct pci_dev *dev);
diff --git a/include/linux/pci_hotplug.h b/include/linux/pci_hotplug.h
index b482e42d7..9e8da46e7 100644
--- a/include/linux/pci_hotplug.h
+++ b/include/linux/pci_hotplug.h
@@ -44,7 +44,7 @@ struct hotplug_slot_ops {
 	int (*get_attention_status)	(struct hotplug_slot *slot, u8 *value);
 	int (*get_latch_status)		(struct hotplug_slot *slot, u8 *value);
 	int (*get_adapter_status)	(struct hotplug_slot *slot, u8 *value);
-	int (*reset_slot)		(struct hotplug_slot *slot, int probe);
+	int (*reset_slot)		(struct hotplug_slot *slot, pci_reset_mode_t mode);
 };

 /**
--
2.31.1

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 0/8] Expose and manage PCI device reset
  2021-06-08  5:48 [PATCH v7 0/8] Expose and manage PCI device reset Amey Narkhede
                   ` (7 preceding siblings ...)
  2021-06-08  5:48 ` [PATCH v7 8/8] PCI: Change the type of probe argument in reset functions Amey Narkhede
@ 2021-06-08 10:05 ` Enrico Weigelt, metux IT consult
  2021-06-08 15:44   ` Amey Narkhede
  8 siblings, 1 reply; 52+ messages in thread
From: Enrico Weigelt, metux IT consult @ 2021-06-08 10:05 UTC (permalink / raw)
  To: Amey Narkhede, Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On 08.06.21 07:48, Amey Narkhede wrote:

Hi,

> PCI and PCIe devices may support a number of possible reset mechanisms
> for example Function Level Reset (FLR) provided via Advanced Feature or
> PCIe capabilities, Power Management reset, bus reset, or device specific reset.
> Currently the PCI subsystem creates a policy prioritizing these reset methods
> which provides neither visibility nor control to userspace.

Since I've got a current use case for that - could you perhaps tell more
about the whole pci device reset mechanisms ?

In my case I've got a board that wires reset lines to the soc's gpios.
Not sure how exactly to qualify this, but I guess it would count as a
bus wide reset.

Now the big question for me is how to implement that in a board specific
platform driver (which already does setup of gpios and other attached
devices), so we can reset the card in slot X in a generic way.

Any help highly appreciated.


--mtx

-- 
---
Hinweis: unverschlüsselte E-Mails können leicht abgehört und manipuliert
werden ! Für eine vertrauliche Kommunikation senden Sie bitte ihren
GPG/PGP-Schlüssel zu.
---
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 0/8] Expose and manage PCI device reset
  2021-06-08 10:05 ` [PATCH v7 0/8] Expose and manage PCI device reset Enrico Weigelt, metux IT consult
@ 2021-06-08 15:44   ` Amey Narkhede
  0 siblings, 0 replies; 52+ messages in thread
From: Amey Narkhede @ 2021-06-08 15:44 UTC (permalink / raw)
  To: Enrico Weigelt, metux IT consult
  Cc: Bjorn Helgaas, Alex Williamson, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Shanker Donthineni, Sinan Kaya, Len Brown,
	Rafael J. Wysocki

On 21/06/08 12:05PM, Enrico Weigelt, metux IT consult wrote:
> On 08.06.21 07:48, Amey Narkhede wrote:
>
> Hi,
>
> > PCI and PCIe devices may support a number of possible reset mechanisms
> > for example Function Level Reset (FLR) provided via Advanced Feature or
> > PCIe capabilities, Power Management reset, bus reset, or device specific reset.
> > Currently the PCI subsystem creates a policy prioritizing these reset methods
> > which provides neither visibility nor control to userspace.
>
> Since I've got a current use case for that - could you perhaps tell more
> about the whole pci device reset mechanisms ?
>
> In my case I've got a board that wires reset lines to the soc's gpios.
> Not sure how exactly to qualify this, but I guess it would count as a
> bus wide reset.
>
> Now the big question for me is how to implement that in a board specific
> platform driver (which already does setup of gpios and other attached
> devices), so we can reset the card in slot X in a generic way.
>
> Any help highly appreciated.
>
>
> --mtx
>
In case of bus reset(pci_reset_secondary_bus()), it uses bridge control
register to assert reset on bus so I think it should out of the box but
not 100% sure about it.

Thanks,
Amey

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 8/8] PCI: Change the type of probe argument in reset functions
  2021-06-08  5:48 ` [PATCH v7 8/8] PCI: Change the type of probe argument in reset functions Amey Narkhede
@ 2021-06-09 21:40   ` Raphael Norwitz
  0 siblings, 0 replies; 52+ messages in thread
From: Raphael Norwitz @ 2021-06-09 21:40 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: Bjorn Helgaas, alex.williamson, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Shanker Donthineni, Sinan Kaya, Len Brown,
	Rafael J . Wysocki

On Tue, Jun 08, 2021 at 11:18:57AM +0530, Amey Narkhede wrote:
> Introduce a new enum pci_reset_mode_t to make the context of probe argument
> in reset functions clear and the code easier to read.
> Change the type of probe argument in functions which implement reset
> methods from int to pci_reset_mode_t to make the intent clear.
> 
> Add a new line in return statement of pci_reset_bus_function().
> 
> Suggested-by: Alex Williamson <alex.williamson@redhat.com>
> Suggested-by: Krzysztof Wilczyński <kw@linux.com>
> Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>

Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>

> ---
>  drivers/crypto/cavium/nitrox/nitrox_main.c    |  2 +-
>  .../ethernet/cavium/liquidio/lio_vf_main.c    |  2 +-
>  drivers/pci/hotplug/pciehp.h                  |  2 +-
>  drivers/pci/hotplug/pciehp_hpc.c              |  4 +-
>  drivers/pci/pci-acpi.c                        | 10 ++-
>  drivers/pci/pci.c                             | 85 ++++++++++++-------
>  drivers/pci/pci.h                             | 12 +--
>  drivers/pci/pcie/aer.c                        |  2 +-
>  drivers/pci/quirks.c                          | 37 ++++----
>  include/linux/pci.h                           |  8 +-
>  include/linux/pci_hotplug.h                   |  2 +-
>  11 files changed, 101 insertions(+), 65 deletions(-)
 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-08  5:48 ` [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism Amey Narkhede
@ 2021-06-09 21:57   ` Raphael Norwitz
  2021-06-09 22:36     ` Shanker R Donthineni
  2021-06-10 20:16   ` Shanker R Donthineni
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 52+ messages in thread
From: Raphael Norwitz @ 2021-06-09 21:57 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: Bjorn Helgaas, alex.williamson, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Shanker Donthineni, Sinan Kaya, Len Brown,
	Rafael J . Wysocki

On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> Add reset_method sysfs attribute to enable user to
> query and set user preferred device reset methods and
> their ordering.
> 
> Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>

Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>

> ---
>  Documentation/ABI/testing/sysfs-bus-pci |  16 ++++
>  drivers/pci/pci-sysfs.c                 | 118 ++++++++++++++++++++++++
>  2 files changed, 134 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> index ef00fada2..cf6dbbb3c 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -121,6 +121,22 @@ Description:
>  		child buses, and re-discover devices removed earlier
>  		from this part of the device tree.
>  
> +What:		/sys/bus/pci/devices/.../reset_method
> +Date:		March 2021
> +Contact:	Amey Narkhede <ameynarkhede03@gmail.com>
> +Description:
> +		Some devices allow an individual function to be reset
> +		without affecting other functions in the same slot.
> +		For devices that have this support, a file named reset_method
> +		will be present in sysfs. Reading this file will give names
> +		of the device supported reset methods and their ordering.
> +		Writing the name or comma separated list of names of any of
> +		the device supported reset methods to this file will set the
> +		reset methods and their ordering to be used when resetting
> +		the device. Writing empty string to this file will disable
> +		ability to reset the device and writing "default" will return
> +		to the original value.
> +
>  What:		/sys/bus/pci/devices/.../reset
>  Date:		July 2009
>  Contact:	Michael S. Tsirkin <mst@redhat.com>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 316f70c3e..52def79aa 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1334,6 +1334,123 @@ static const struct attribute_group pci_dev_rom_attr_group = {
>  	.is_bin_visible = pci_dev_rom_attr_is_visible,
>  };
>  
> +static ssize_t reset_method_show(struct device *dev,
> +				 struct device_attribute *attr,
> +				 char *buf)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	ssize_t len = 0;
> +	int i, prio;
> +
> +	for (prio = PCI_RESET_METHODS_NUM; prio; prio--) {
> +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> +			if (prio == pdev->reset_methods[i]) {
> +				len += sysfs_emit_at(buf, len, "%s%s",
> +						     len ? "," : "",
> +						     pci_reset_fn_methods[i].name);
> +				break;
> +			}
> +		}
> +
> +		if (i == PCI_RESET_METHODS_NUM)
> +			break;
> +	}
> +

Don't you still need to ensure you add the newline even if there are no
reset methods set? If the len is zero why don't we need the newline?

Otherwise looks good.

> +	if (len)
> +		len += sysfs_emit_at(buf, len, "\n");
> +
> +	return len;
> +}
> +
> +static ssize_t reset_method_store(struct device *dev,
> +				  struct device_attribute *attr,
> +				  const char *buf, size_t count)
> +{
> +	u8 reset_methods[PCI_RESET_METHODS_NUM];
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	u8 prio = PCI_RESET_METHODS_NUM;
> +	char *name, *options;
> +	int i;
> +
> +	if (count >= (PAGE_SIZE - 1))
> +		return -EINVAL;
> +
> +	options = kstrndup(buf, count, GFP_KERNEL);
> +	if (!options)
> +		return -ENOMEM;
> +
> +	/*
> +	 * Initialize reset_method such that 0xff indicates
> +	 * supported but not currently enabled reset methods
> +	 * as we only use priority values which are within
> +	 * the range of PCI_RESET_FN_METHODS array size
> +	 */

NIT: missing period in above comment.

> +	for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> +		reset_methods[i] = pdev->reset_methods[i] ? 0xff : 0;
> +
> +	if (sysfs_streq(options, "")) {
> +		pci_warn(pdev, "All device reset methods disabled by user");
> +		goto set_reset_methods;
> +	}
> +
> +	if (sysfs_streq(options, "default")) {
> +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> +			reset_methods[i] = reset_methods[i] ? prio-- : 0;
> +		goto set_reset_methods;
> +	}
> +
> +	while ((name = strsep(&options, ",")) != NULL) {
> +		if (sysfs_streq(name, ""))
> +			continue;
> +
> +		name = strim(name);
> +
> +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> +			if (reset_methods[i] &&
> +			    sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> +				reset_methods[i] = prio--;
> +				break;
> +			}
> +		}
> +
> +		if (i == PCI_RESET_METHODS_NUM) {
> +			kfree(options);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	if (reset_methods[0] &&
> +	    reset_methods[0] != PCI_RESET_METHODS_NUM)
> +		pci_warn(pdev, "Device specific reset disabled/de-prioritized by user");
> +
> +set_reset_methods:
> +	kfree(options);
> +	memcpy(pdev->reset_methods, reset_methods, sizeof(reset_methods));
> +	return count;
> +}
> +static DEVICE_ATTR_RW(reset_method);
> +
> +static struct attribute *pci_dev_reset_method_attrs[] = {
> +	&dev_attr_reset_method.attr,
> +	NULL,
> +};
> +
> +static umode_t pci_dev_reset_method_attr_is_visible(struct kobject *kobj,
> +						    struct attribute *a, int n)
> +{
> +	struct pci_dev *pdev = to_pci_dev(kobj_to_dev(kobj));
> +
> +	if (!pci_reset_supported(pdev))
> +		return 0;
> +
> +	return a->mode;
> +}
> +
> +static const struct attribute_group pci_dev_reset_method_attr_group = {
> +	.attrs = pci_dev_reset_method_attrs,
> +	.is_visible = pci_dev_reset_method_attr_is_visible,
> +};
> +
>  static ssize_t reset_store(struct device *dev, struct device_attribute *attr,
>  			   const char *buf, size_t count)
>  {
> @@ -1491,6 +1608,7 @@ const struct attribute_group *pci_dev_groups[] = {
>  	&pci_dev_config_attr_group,
>  	&pci_dev_rom_attr_group,
>  	&pci_dev_reset_attr_group,
> +	&pci_dev_reset_method_attr_group,
>  	&pci_dev_vpd_attr_group,
>  #ifdef CONFIG_DMI
>  	&pci_dev_smbios_attr_group,
> -- 
> 2.31.1
> 
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-09 21:57   ` Raphael Norwitz
@ 2021-06-09 22:36     ` Shanker R Donthineni
  2021-06-09 22:48       ` Raphael Norwitz
  0 siblings, 1 reply; 52+ messages in thread
From: Shanker R Donthineni @ 2021-06-09 22:36 UTC (permalink / raw)
  To: Raphael Norwitz, Amey Narkhede
  Cc: Bjorn Helgaas, alex.williamson, linux-pci, linux-kernel, kw,
	Sinan Kaya, Len Brown, Rafael J . Wysocki

Hi Raphael,

On 6/9/21 4:57 PM, Raphael Norwitz wrote:
>> +static ssize_t reset_method_show(struct device *dev,
>> +                              struct device_attribute *attr,
>> +                              char *buf)
>> +{
>> +     struct pci_dev *pdev = to_pci_dev(dev);
>> +     ssize_t len = 0;
>> +     int i, prio;
>> +
>> +     for (prio = PCI_RESET_METHODS_NUM; prio; prio--) {
>> +             for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
>> +                     if (prio == pdev->reset_methods[i]) {
>> +                             len += sysfs_emit_at(buf, len, "%s%s",
>> +                                                  len ? "," : "",
>> +                                                  pci_reset_fn_methods[i].name);
>> +                             break;
>> +                     }
>> +             }
>> +
>> +             if (i == PCI_RESET_METHODS_NUM)
>> +                     break;
>> +     }
>> +
> Don't you still need to ensure you add the newline even if there are no
> reset methods set? If the len is zero why don't we need the newline?
>
> Otherwise looks good.
>

sysfs entry 'reset_method' will not be visible if there are no reset methods.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-09 22:36     ` Shanker R Donthineni
@ 2021-06-09 22:48       ` Raphael Norwitz
  0 siblings, 0 replies; 52+ messages in thread
From: Raphael Norwitz @ 2021-06-09 22:48 UTC (permalink / raw)
  To: Shanker R Donthineni
  Cc: Raphael Norwitz, Amey Narkhede, Bjorn Helgaas, alex.williamson,
	linux-pci, linux-kernel, kw, Sinan Kaya, Len Brown,
	Rafael J . Wysocki

Yes - I got this one wrong.

Nevermind, looks good. Just the punctuation NIT.

On Wed, Jun 09, 2021 at 05:36:26PM -0500, Shanker R Donthineni wrote:
> Hi Raphael,
> 
> On 6/9/21 4:57 PM, Raphael Norwitz wrote:
> >> +static ssize_t reset_method_show(struct device *dev,
> >> +                              struct device_attribute *attr,
> >> +                              char *buf)
> >> +{
> >> +     struct pci_dev *pdev = to_pci_dev(dev);
> >> +     ssize_t len = 0;
> >> +     int i, prio;
> >> +
> >> +     for (prio = PCI_RESET_METHODS_NUM; prio; prio--) {
> >> +             for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> >> +                     if (prio == pdev->reset_methods[i]) {
> >> +                             len += sysfs_emit_at(buf, len, "%s%s",
> >> +                                                  len ? "," : "",
> >> +                                                  pci_reset_fn_methods[i].name);
> >> +                             break;
> >> +                     }
> >> +             }
> >> +
> >> +             if (i == PCI_RESET_METHODS_NUM)
> >> +                     break;
> >> +     }
> >> +
> > Don't you still need to ensure you add the newline even if there are no
> > reset methods set? If the len is zero why don't we need the newline?
> >
> > Otherwise looks good.
> >
> 
> sysfs entry 'reset_method' will not be visible if there are no reset methods.
> 
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods
  2021-06-08  5:48 ` [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods Amey Narkhede
@ 2021-06-10 20:15   ` Shanker R Donthineni
  2021-06-17 21:57   ` Bjorn Helgaas
  2021-06-24 12:23   ` Bjorn Helgaas
  2 siblings, 0 replies; 52+ messages in thread
From: Shanker R Donthineni @ 2021-06-10 20:15 UTC (permalink / raw)
  To: Amey Narkhede, Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Sinan Kaya, Len Brown, Rafael J . Wysocki



On 6/8/21 12:48 AM, Amey Narkhede wrote:
> Currently there is separate function pcie_has_flr() to probe if pcie flr is
> supported by the device which does not match the calling convention
> followed by reset methods which use second function argument to decide
> whether to probe or not.  Add new function pcie_reset_flr() that follows
> the calling convention of reset methods.
>
> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
> Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>

Tested-by: Shanker Donthineni <sdonthineni@nvidia.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 2/8] PCI: Add new array for keeping track of ordering of reset methods
  2021-06-08  5:48 ` [PATCH v7 2/8] PCI: Add new array for keeping track of ordering of " Amey Narkhede
@ 2021-06-10 20:15   ` Shanker R Donthineni
  2021-06-17 23:13   ` Bjorn Helgaas
  1 sibling, 0 replies; 52+ messages in thread
From: Shanker R Donthineni @ 2021-06-10 20:15 UTC (permalink / raw)
  To: Amey Narkhede, Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Sinan Kaya, Len Brown, Rafael J . Wysocki



On 6/8/21 12:48 AM, Amey Narkhede wrote:
> Introduce a new array reset_methods in struct pci_dev to keep track of
> reset mechanisms supported by the device and their ordering.
> Also refactor probing and reset functions to take advantage of calling
> convention of reset functions.
>
> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
> Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>

Tested-by: Shanker Donthineni <sdonthineni@nvidia.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 3/8] PCI: Remove reset_fn field from pci_dev
  2021-06-08  5:48 ` [PATCH v7 3/8] PCI: Remove reset_fn field from pci_dev Amey Narkhede
@ 2021-06-10 20:16   ` Shanker R Donthineni
  0 siblings, 0 replies; 52+ messages in thread
From: Shanker R Donthineni @ 2021-06-10 20:16 UTC (permalink / raw)
  To: Amey Narkhede, Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Sinan Kaya, Len Brown, Rafael J . Wysocki



On 6/8/21 12:48 AM, Amey Narkhede wrote:
> reset_fn field is used to indicate whether the device supports any reset
> mechanism or not. Remove the use of reset_fn in favor of new reset_methods
> array which can be used to keep track of all supported reset mechanisms of
> a device and their ordering.
>
> The octeon driver is incorrectly using
> reset_fn field to detect if the device supports FLR or not. Use
> pcie_reset_flr() to probe whether it supports FLR or not.
>
> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
> Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>

Tested-by: Shanker Donthineni <sdonthineni@nvidia.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-08  5:48 ` [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism Amey Narkhede
  2021-06-09 21:57   ` Raphael Norwitz
@ 2021-06-10 20:16   ` Shanker R Donthineni
  2021-06-18 20:00   ` Bjorn Helgaas
  2021-06-24 12:15   ` Bjorn Helgaas
  3 siblings, 0 replies; 52+ messages in thread
From: Shanker R Donthineni @ 2021-06-10 20:16 UTC (permalink / raw)
  To: Amey Narkhede, Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Sinan Kaya, Len Brown, Rafael J . Wysocki



On 6/8/21 12:48 AM, Amey Narkhede wrote:
> Add reset_method sysfs attribute to enable user to
> query and set user preferred device reset methods and
> their ordering.
>
> Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>

Tested-by: Shanker Donthineni <sdonthineni@nvidia.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 7/8] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs
  2021-06-08  5:48 ` [PATCH v7 7/8] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs Amey Narkhede
@ 2021-06-10 23:16   ` Bjorn Helgaas
  2021-06-10 23:33     ` Shanker R Donthineni
  2021-06-10 23:43     ` Shanker R Donthineni
  0 siblings, 2 replies; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-10 23:16 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: Bjorn Helgaas, alex.williamson, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Shanker Donthineni, Sinan Kaya, Len Brown,
	Rafael J . Wysocki

On Tue, Jun 08, 2021 at 11:18:56AM +0530, Amey Narkhede wrote:
> From: Shanker Donthineni <sdonthineni@nvidia.com>
> 
> On select platforms, some Nvidia GPU devices do not work with SBR.

Interesting that you say "on select platforms."  Apparently SBR does
work for some of these GPUs, but not on all platforms?  If you have
any clarification here, I can still update the commit log.

> Triggering SBR would leave the device inoperable for the current
> system boot. It requires a system hard-reboot to get the GPU device
> back to normal operating condition post-SBR. For the affected
> devices, enable NO_BUS_RESET quirk to fix the issue.
> 
> This issue will be fixed in the next generation of hardware.
> 
> Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
> Reviewed-by: Sinan Kaya <okaya@kernel.org>

This patch doesn't seem to have any dependencies or particular
connection to the rest of the reset series, so I applied this patch by
itself to for-linus for v5.13 and marked it for stable.

If that's not right, let me know.

> ---
>  drivers/pci/quirks.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index e86cf4a3b..45a8c3caa 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3546,6 +3546,18 @@ static void quirk_no_bus_reset(struct pci_dev *dev)
>  	dev->dev_flags |= PCI_DEV_FLAGS_NO_BUS_RESET;
>  }
>  
> +/*
> + * Some Nvidia GPU devices do not work with bus reset, SBR needs to be
> + * prevented for those affected devices.
> + */
> +static void quirk_nvidia_no_bus_reset(struct pci_dev *dev)
> +{
> +	if ((dev->device & 0xffc0) == 0x2340)
> +		quirk_no_bus_reset(dev);
> +}
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
> +			 quirk_nvidia_no_bus_reset);
> +
>  /*
>   * Some Atheros AR9xxx and QCA988x chips do not behave after a bus reset.
>   * The device will throw a Link Down error on AER-capable systems and
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 7/8] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs
  2021-06-10 23:16   ` Bjorn Helgaas
@ 2021-06-10 23:33     ` Shanker R Donthineni
  2021-06-10 23:43     ` Shanker R Donthineni
  1 sibling, 0 replies; 52+ messages in thread
From: Shanker R Donthineni @ 2021-06-10 23:33 UTC (permalink / raw)
  To: Bjorn Helgaas, Amey Narkhede
  Cc: Bjorn Helgaas, alex.williamson, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Sinan Kaya, Len Brown, Rafael J . Wysocki

Hi Bjorn,

On 6/10/21 6:16 PM, Bjorn Helgaas wrote:
>> Triggering SBR would leave the device inoperable for the current
>> system boot. It requires a system hard-reboot to get the GPU device
>> back to normal operating condition post-SBR. For the affected
>> devices, enable NO_BUS_RESET quirk to fix the issue.
>>
>> This issue will be fixed in the next generation of hardware.
>>
>> Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
>> Reviewed-by: Sinan Kaya <okaya@kernel.org>
> This patch doesn't seem to have any dependencies or particular
> connection to the rest of the reset series, so I applied this patch by
> itself to for-linus for v5.13 and marked it for stable.
>
> If that's not right, let me know.
>

Yes, you're right this patch no dependency on reset method series.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 7/8] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs
  2021-06-10 23:16   ` Bjorn Helgaas
  2021-06-10 23:33     ` Shanker R Donthineni
@ 2021-06-10 23:43     ` Shanker R Donthineni
  2021-06-10 23:53       ` Bjorn Helgaas
  1 sibling, 1 reply; 52+ messages in thread
From: Shanker R Donthineni @ 2021-06-10 23:43 UTC (permalink / raw)
  To: Bjorn Helgaas, Amey Narkhede
  Cc: Bjorn Helgaas, alex.williamson, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Sinan Kaya, Len Brown, Rafael J . Wysocki

Hi Bjorn,

On 6/10/21 6:16 PM, Bjorn Helgaas wrote:
>> From: Shanker Donthineni <sdonthineni@nvidia.com>
>>
>> On select platforms, some Nvidia GPU devices do not work with SBR.
> Interesting that you say "on select platforms."  Apparently SBR does
> work for some of these GPUs, but not on all platforms?  If you have
> any clarification here, I can still update the commit log.
>
Yes, SBR works for some GPUs but GPUs which are listed in this quirk will
not work and these GPUs are available only on selected server platforms.
I believe commit text reflects the issue but please update if needed. 

-

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 7/8] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs
  2021-06-10 23:43     ` Shanker R Donthineni
@ 2021-06-10 23:53       ` Bjorn Helgaas
  2021-06-11  4:15         ` Shanker R Donthineni
  0 siblings, 1 reply; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-10 23:53 UTC (permalink / raw)
  To: Shanker R Donthineni
  Cc: Amey Narkhede, Bjorn Helgaas, alex.williamson, Raphael Norwitz,
	linux-pci, linux-kernel, kw, Sinan Kaya, Len Brown,
	Rafael J . Wysocki

On Thu, Jun 10, 2021 at 06:43:26PM -0500, Shanker R Donthineni wrote:
> On 6/10/21 6:16 PM, Bjorn Helgaas wrote:
> >> From: Shanker Donthineni <sdonthineni@nvidia.com>
> >>
> >> On select platforms, some Nvidia GPU devices do not work with SBR.
> > Interesting that you say "on select platforms."  Apparently SBR does
> > work for some of these GPUs, but not on all platforms?  If you have
> > any clarification here, I can still update the commit log.
> >
> Yes, SBR works for some GPUs but GPUs which are listed in this quirk will
> not work and these GPUs are available only on selected server platforms.
> I believe commit text reflects the issue but please update if needed. 

It sounds like there is no actual dependency on the platform.  So even
though these GPUs are only available on certain platforms, if one were
to move one of them to a different, non-supported platform, SBR would
still not work.

So I think I'll remove the reference to "select platforms" since it
doesn't add any useful information and might suggest that SBR should
work on some platforms, if you could only find the right ones.

Bjorn

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 7/8] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs
  2021-06-10 23:53       ` Bjorn Helgaas
@ 2021-06-11  4:15         ` Shanker R Donthineni
  0 siblings, 0 replies; 52+ messages in thread
From: Shanker R Donthineni @ 2021-06-11  4:15 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Amey Narkhede, Bjorn Helgaas, alex.williamson, Raphael Norwitz,
	linux-pci, linux-kernel, kw, Sinan Kaya, Len Brown,
	Rafael J . Wysocki

Hi Bjorn,

On 6/10/21 6:53 PM, Bjorn Helgaas wrote:
> It sounds like there is no actual dependency on the platform.  So even
> though these GPUs are only available on certain platforms, if one were
> to move one of them to a different, non-supported platform, SBR would
> still not work.
>
> So I think I'll remove the reference to "select platforms" since it
> doesn't add any useful information and might suggest that SBR should
> work on some platforms, if you could only find the right ones.

Appreciate your time on code review, providing better text, and picking patch
for v5.14. Please let us know if any code improvements or suggestions for the
remaining reset patch series to be considered for v5.14
 


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods
  2021-06-08  5:48 ` [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods Amey Narkhede
  2021-06-10 20:15   ` Shanker R Donthineni
@ 2021-06-17 21:57   ` Bjorn Helgaas
  2021-06-17 22:51     ` Alex Williamson
  2021-06-18 16:32     ` Amey Narkhede
  2021-06-24 12:23   ` Bjorn Helgaas
  2 siblings, 2 replies; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-17 21:57 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: Bjorn Helgaas, alex.williamson, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Shanker Donthineni, Sinan Kaya, Len Brown,
	Rafael J . Wysocki, Christoph Hellwig

[+cc Christoph, since he added pcie_flr()]

On Tue, Jun 08, 2021 at 11:18:50AM +0530, Amey Narkhede wrote:
> Currently there is separate function pcie_has_flr() to probe if pcie flr is
> supported by the device which does not match the calling convention
> followed by reset methods which use second function argument to decide
> whether to probe or not.  Add new function pcie_reset_flr() that follows
> the calling convention of reset methods.

I don't like the fact that we handle FLR differently from other types
of reset, so I do like the fact that this makes them more consistent.

> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
> Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
> ---
>  drivers/crypto/cavium/nitrox/nitrox_main.c |  4 +-
>  drivers/pci/pci.c                          | 62 ++++++++++++----------
>  drivers/pci/pcie/aer.c                     | 12 ++---
>  drivers/pci/quirks.c                       |  9 ++--
>  include/linux/pci.h                        |  2 +-
>  5 files changed, 43 insertions(+), 46 deletions(-)
> 
> diff --git a/drivers/crypto/cavium/nitrox/nitrox_main.c b/drivers/crypto/cavium/nitrox/nitrox_main.c
> index facc8e6bc..15d6c8452 100644
> --- a/drivers/crypto/cavium/nitrox/nitrox_main.c
> +++ b/drivers/crypto/cavium/nitrox/nitrox_main.c
> @@ -306,9 +306,7 @@ static int nitrox_device_flr(struct pci_dev *pdev)
>  		return -ENOMEM;
>  	}
>  
> -	/* check flr support */
> -	if (pcie_has_flr(pdev))
> -		pcie_flr(pdev);
> +	pcie_reset_flr(pdev, 0);
>  
>  	pci_restore_state(pdev);
>  
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 452351025..3bf36924c 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4611,32 +4611,12 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL(pci_wait_for_pending_transaction);
>  
> -/**
> - * pcie_has_flr - check if a device supports function level resets
> - * @dev: device to check
> - *
> - * Returns true if the device advertises support for PCIe function level
> - * resets.
> - */
> -bool pcie_has_flr(struct pci_dev *dev)
> -{
> -	u32 cap;
> -
> -	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
> -		return false;
> -
> -	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
> -	return cap & PCI_EXP_DEVCAP_FLR;
> -}
> -EXPORT_SYMBOL_GPL(pcie_has_flr);
> -
>  /**
>   * pcie_flr - initiate a PCIe function level reset
>   * @dev: device to reset
>   *
> - * Initiate a function level reset on @dev.  The caller should ensure the
> - * device supports FLR before calling this function, e.g. by using the
> - * pcie_has_flr() helper.
> + * Initiate a function level reset unconditionally on @dev without
> + * checking any flags and DEVCAP
>   */
>  int pcie_flr(struct pci_dev *dev)
>  {
> @@ -4659,6 +4639,31 @@ int pcie_flr(struct pci_dev *dev)
>  }
>  EXPORT_SYMBOL_GPL(pcie_flr);
>  
> +/**
> + * pcie_reset_flr - initiate a PCIe function level reset
> + * @dev: device to reset
> + * @probe: If set, only check if the device can be reset this way.
> + *
> + * Initiate a function level reset on @dev.
> + */
> +int pcie_reset_flr(struct pci_dev *dev, int probe)
> +{
> +	u32 cap;
> +
> +	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
> +		return -ENOTTY;
> +
> +	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
> +	if (!(cap & PCI_EXP_DEVCAP_FLR))
> +		return -ENOTTY;
> +
> +	if (probe)
> +		return 0;
> +
> +	return pcie_flr(dev);

Christoph added pcie_flr() with a60a2b73ba69 ("PCI: Export
pcie_flr()"), where the commit log says he split out the probing
because "non-core callers already know their hardware."

It *is* reasonable to expect that drivers know whether their device
supports FLR so they don't need to probe.

But we don't expose the "probe" argument outside the PCI core for any
other reset methods, and I would like to avoid that here.

It seems excessive to have to read PCI_EXP_DEVCAP every time.
PCI_EXP_DEVCAP_FLR is a read-only bit, and we should only need to look
at it once.

What I would really like here is a single bit in the pci_dev that we
could set at enumeration-time, e.g., something like this:

  struct pci_dev {
    ...
    unsigned int has_flr:1;
  };

  void set_pcie_port_type(...)    # during enumeration
  {
    pci_read_config_word(dev, pos + PCI_EXP_DEVCAP, &reg16);
    if (reg16 & PCI_EXP_DEVCAP_FLR)
      dev->has_flr = 1;
  }

  static void quirk_no_flr(...)
  {
    dev->has_flr = 0;             # get rid of PCI_DEV_FLAGS_NO_FLR_RESET
  }

  int pcie_flr(...)
  {
    if (!dev->has_flr)
      return -ENOTTY;

    if (!pci_wait_for_pending_transaction(dev))
      ...
  }

I think this should be enough that we could get rid of pcie_has_flr()
without having to expose the "probe" argument outside drivers/pci/.

Procedural note: if we *do* have to expose the "probe" argument, can
you arrange it to have the correct type before touching the drivers, so
we only have to touch the drivers once?

> +}
> +EXPORT_SYMBOL_GPL(pcie_reset_flr);
> +
>  static int pci_af_flr(struct pci_dev *dev, int probe)
>  {
>  	int pos;
> @@ -5139,11 +5144,9 @@ int __pci_reset_function_locked(struct pci_dev *dev)
>  	rc = pci_dev_specific_reset(dev, 0);
>  	if (rc != -ENOTTY)
>  		return rc;
> -	if (pcie_has_flr(dev)) {
> -		rc = pcie_flr(dev);
> -		if (rc != -ENOTTY)
> -			return rc;
> -	}
> +	rc = pcie_reset_flr(dev, 0);
> +	if (rc != -ENOTTY)
> +		return rc;
>  	rc = pci_af_flr(dev, 0);
>  	if (rc != -ENOTTY)
>  		return rc;
> @@ -5174,8 +5177,9 @@ int pci_probe_reset_function(struct pci_dev *dev)
>  	rc = pci_dev_specific_reset(dev, 1);
>  	if (rc != -ENOTTY)
>  		return rc;
> -	if (pcie_has_flr(dev))
> -		return 0;
> +	rc = pcie_reset_flr(dev, 1);
> +	if (rc != -ENOTTY)
> +		return rc;
>  	rc = pci_af_flr(dev, 1);
>  	if (rc != -ENOTTY)
>  		return rc;
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index ec943cee5..98077595a 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1405,13 +1405,11 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
>  	}
>  
>  	if (type == PCI_EXP_TYPE_RC_EC || type == PCI_EXP_TYPE_RC_END) {
> -		if (pcie_has_flr(dev)) {
> -			rc = pcie_flr(dev);
> -			pci_info(dev, "has been reset (%d)\n", rc);
> -		} else {
> -			pci_info(dev, "not reset (no FLR support)\n");
> -			rc = -ENOTTY;
> -		}
> +		rc = pcie_reset_flr(dev, 0);
> +		if (!rc)
> +			pci_info(dev, "has been reset\n");
> +		else
> +			pci_info(dev, "not reset (no FLR support: %d)\n", rc);
>  	} else {
>  		rc = pci_bus_error_reset(dev);
>  		pci_info(dev, "%s Port link has been reset (%d)\n",
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index d85914afe..f977ba79a 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3819,7 +3819,7 @@ static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
>  	u32 cfg;
>  
>  	if (dev->class != PCI_CLASS_STORAGE_EXPRESS ||
> -	    !pcie_has_flr(dev) || !pci_resource_start(dev, 0))
> +	    pcie_reset_flr(dev, 1) || !pci_resource_start(dev, 0))
>  		return -ENOTTY;
>  
>  	if (probe)
> @@ -3888,13 +3888,10 @@ static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
>   */
>  static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
>  {
> -	if (!pcie_has_flr(dev))
> -		return -ENOTTY;
> +	int ret = pcie_reset_flr(dev, probe);
>  
>  	if (probe)
> -		return 0;
> -
> -	pcie_flr(dev);
> +		return ret;
>  
>  	msleep(250);
>  
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index c20211e59..20b90c205 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1225,7 +1225,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>  			     enum pci_bus_speed *speed,
>  			     enum pcie_link_width *width);
>  void pcie_print_link_status(struct pci_dev *dev);
> -bool pcie_has_flr(struct pci_dev *dev);
> +int pcie_reset_flr(struct pci_dev *dev, int probe);
>  int pcie_flr(struct pci_dev *dev);
>  int __pci_reset_function_locked(struct pci_dev *dev);
>  int pci_reset_function(struct pci_dev *dev);
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods
  2021-06-17 21:57   ` Bjorn Helgaas
@ 2021-06-17 22:51     ` Alex Williamson
  2021-06-18 16:32     ` Amey Narkhede
  1 sibling, 0 replies; 52+ messages in thread
From: Alex Williamson @ 2021-06-17 22:51 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Amey Narkhede, Bjorn Helgaas, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Shanker Donthineni, Sinan Kaya, Len Brown,
	Rafael J . Wysocki, Christoph Hellwig

On Thu, 17 Jun 2021 16:57:34 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> [+cc Christoph, since he added pcie_flr()]
> 
> On Tue, Jun 08, 2021 at 11:18:50AM +0530, Amey Narkhede wrote:
> > Currently there is separate function pcie_has_flr() to probe if pcie flr is
> > supported by the device which does not match the calling convention
> > followed by reset methods which use second function argument to decide
> > whether to probe or not.  Add new function pcie_reset_flr() that follows
> > the calling convention of reset methods.  
> 
> I don't like the fact that we handle FLR differently from other types
> of reset, so I do like the fact that this makes them more consistent.
> 
> > Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
> > Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
> > Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
> > ---
> >  drivers/crypto/cavium/nitrox/nitrox_main.c |  4 +-
> >  drivers/pci/pci.c                          | 62 ++++++++++++----------
> >  drivers/pci/pcie/aer.c                     | 12 ++---
> >  drivers/pci/quirks.c                       |  9 ++--
> >  include/linux/pci.h                        |  2 +-
> >  5 files changed, 43 insertions(+), 46 deletions(-)
> > 
> > diff --git a/drivers/crypto/cavium/nitrox/nitrox_main.c b/drivers/crypto/cavium/nitrox/nitrox_main.c
> > index facc8e6bc..15d6c8452 100644
> > --- a/drivers/crypto/cavium/nitrox/nitrox_main.c
> > +++ b/drivers/crypto/cavium/nitrox/nitrox_main.c
> > @@ -306,9 +306,7 @@ static int nitrox_device_flr(struct pci_dev *pdev)
> >  		return -ENOMEM;
> >  	}
> >  
> > -	/* check flr support */
> > -	if (pcie_has_flr(pdev))
> > -		pcie_flr(pdev);
> > +	pcie_reset_flr(pdev, 0);
> >  
> >  	pci_restore_state(pdev);
> >  
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 452351025..3bf36924c 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -4611,32 +4611,12 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev)
> >  }
> >  EXPORT_SYMBOL(pci_wait_for_pending_transaction);
> >  
> > -/**
> > - * pcie_has_flr - check if a device supports function level resets
> > - * @dev: device to check
> > - *
> > - * Returns true if the device advertises support for PCIe function level
> > - * resets.
> > - */
> > -bool pcie_has_flr(struct pci_dev *dev)
> > -{
> > -	u32 cap;
> > -
> > -	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
> > -		return false;
> > -
> > -	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
> > -	return cap & PCI_EXP_DEVCAP_FLR;
> > -}
> > -EXPORT_SYMBOL_GPL(pcie_has_flr);
> > -
> >  /**
> >   * pcie_flr - initiate a PCIe function level reset
> >   * @dev: device to reset
> >   *
> > - * Initiate a function level reset on @dev.  The caller should ensure the
> > - * device supports FLR before calling this function, e.g. by using the
> > - * pcie_has_flr() helper.
> > + * Initiate a function level reset unconditionally on @dev without
> > + * checking any flags and DEVCAP
> >   */
> >  int pcie_flr(struct pci_dev *dev)
> >  {
> > @@ -4659,6 +4639,31 @@ int pcie_flr(struct pci_dev *dev)
> >  }
> >  EXPORT_SYMBOL_GPL(pcie_flr);
> >  
> > +/**
> > + * pcie_reset_flr - initiate a PCIe function level reset
> > + * @dev: device to reset
> > + * @probe: If set, only check if the device can be reset this way.
> > + *
> > + * Initiate a function level reset on @dev.
> > + */
> > +int pcie_reset_flr(struct pci_dev *dev, int probe)
> > +{
> > +	u32 cap;
> > +
> > +	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
> > +		return -ENOTTY;
> > +
> > +	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
> > +	if (!(cap & PCI_EXP_DEVCAP_FLR))
> > +		return -ENOTTY;
> > +
> > +	if (probe)
> > +		return 0;
> > +
> > +	return pcie_flr(dev);  
> 
> Christoph added pcie_flr() with a60a2b73ba69 ("PCI: Export
> pcie_flr()"), where the commit log says he split out the probing
> because "non-core callers already know their hardware."
> 
> It *is* reasonable to expect that drivers know whether their device
> supports FLR so they don't need to probe.

I don't think it changes your suggestion below, but this statement is a
little troublesome when we look at devices running in VMs where we've
been known to hide various capabilities, or simply quirks where some
combination of a known device feature might be otherwise avoided.  A
more robust driver should try to make fewer assumptions for these
cases, if not simply the inability to predict future changes to the
hardware.  

FLR should be a relatively rare event, but caching for driver API
purposes seems reasonable.  Thanks,

Alex

> But we don't expose the "probe" argument outside the PCI core for any
> other reset methods, and I would like to avoid that here.
> 
> It seems excessive to have to read PCI_EXP_DEVCAP every time.
> PCI_EXP_DEVCAP_FLR is a read-only bit, and we should only need to look
> at it once.
> 
> What I would really like here is a single bit in the pci_dev that we
> could set at enumeration-time, e.g., something like this:
> 
>   struct pci_dev {
>     ...
>     unsigned int has_flr:1;
>   };
> 
>   void set_pcie_port_type(...)    # during enumeration
>   {
>     pci_read_config_word(dev, pos + PCI_EXP_DEVCAP, &reg16);
>     if (reg16 & PCI_EXP_DEVCAP_FLR)
>       dev->has_flr = 1;
>   }
> 
>   static void quirk_no_flr(...)
>   {
>     dev->has_flr = 0;             # get rid of PCI_DEV_FLAGS_NO_FLR_RESET
>   }
> 
>   int pcie_flr(...)
>   {
>     if (!dev->has_flr)
>       return -ENOTTY;
> 
>     if (!pci_wait_for_pending_transaction(dev))
>       ...
>   }
> 
> I think this should be enough that we could get rid of pcie_has_flr()
> without having to expose the "probe" argument outside drivers/pci/.
> 
> Procedural note: if we *do* have to expose the "probe" argument, can
> you arrange it to have the correct type before touching the drivers, so
> we only have to touch the drivers once?
> 
> > +}
> > +EXPORT_SYMBOL_GPL(pcie_reset_flr);
> > +
> >  static int pci_af_flr(struct pci_dev *dev, int probe)
> >  {
> >  	int pos;
> > @@ -5139,11 +5144,9 @@ int __pci_reset_function_locked(struct pci_dev *dev)
> >  	rc = pci_dev_specific_reset(dev, 0);
> >  	if (rc != -ENOTTY)
> >  		return rc;
> > -	if (pcie_has_flr(dev)) {
> > -		rc = pcie_flr(dev);
> > -		if (rc != -ENOTTY)
> > -			return rc;
> > -	}
> > +	rc = pcie_reset_flr(dev, 0);
> > +	if (rc != -ENOTTY)
> > +		return rc;
> >  	rc = pci_af_flr(dev, 0);
> >  	if (rc != -ENOTTY)
> >  		return rc;
> > @@ -5174,8 +5177,9 @@ int pci_probe_reset_function(struct pci_dev *dev)
> >  	rc = pci_dev_specific_reset(dev, 1);
> >  	if (rc != -ENOTTY)
> >  		return rc;
> > -	if (pcie_has_flr(dev))
> > -		return 0;
> > +	rc = pcie_reset_flr(dev, 1);
> > +	if (rc != -ENOTTY)
> > +		return rc;
> >  	rc = pci_af_flr(dev, 1);
> >  	if (rc != -ENOTTY)
> >  		return rc;
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index ec943cee5..98077595a 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -1405,13 +1405,11 @@ static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
> >  	}
> >  
> >  	if (type == PCI_EXP_TYPE_RC_EC || type == PCI_EXP_TYPE_RC_END) {
> > -		if (pcie_has_flr(dev)) {
> > -			rc = pcie_flr(dev);
> > -			pci_info(dev, "has been reset (%d)\n", rc);
> > -		} else {
> > -			pci_info(dev, "not reset (no FLR support)\n");
> > -			rc = -ENOTTY;
> > -		}
> > +		rc = pcie_reset_flr(dev, 0);
> > +		if (!rc)
> > +			pci_info(dev, "has been reset\n");
> > +		else
> > +			pci_info(dev, "not reset (no FLR support: %d)\n", rc);
> >  	} else {
> >  		rc = pci_bus_error_reset(dev);
> >  		pci_info(dev, "%s Port link has been reset (%d)\n",
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index d85914afe..f977ba79a 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -3819,7 +3819,7 @@ static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
> >  	u32 cfg;
> >  
> >  	if (dev->class != PCI_CLASS_STORAGE_EXPRESS ||
> > -	    !pcie_has_flr(dev) || !pci_resource_start(dev, 0))
> > +	    pcie_reset_flr(dev, 1) || !pci_resource_start(dev, 0))
> >  		return -ENOTTY;
> >  
> >  	if (probe)
> > @@ -3888,13 +3888,10 @@ static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
> >   */
> >  static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
> >  {
> > -	if (!pcie_has_flr(dev))
> > -		return -ENOTTY;
> > +	int ret = pcie_reset_flr(dev, probe);
> >  
> >  	if (probe)
> > -		return 0;
> > -
> > -	pcie_flr(dev);
> > +		return ret;
> >  
> >  	msleep(250);
> >  
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index c20211e59..20b90c205 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -1225,7 +1225,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
> >  			     enum pci_bus_speed *speed,
> >  			     enum pcie_link_width *width);
> >  void pcie_print_link_status(struct pci_dev *dev);
> > -bool pcie_has_flr(struct pci_dev *dev);
> > +int pcie_reset_flr(struct pci_dev *dev, int probe);
> >  int pcie_flr(struct pci_dev *dev);
> >  int __pci_reset_function_locked(struct pci_dev *dev);
> >  int pci_reset_function(struct pci_dev *dev);
> > -- 
> > 2.31.1
> >   
> 


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 2/8] PCI: Add new array for keeping track of ordering of reset methods
  2021-06-08  5:48 ` [PATCH v7 2/8] PCI: Add new array for keeping track of ordering of " Amey Narkhede
  2021-06-10 20:15   ` Shanker R Donthineni
@ 2021-06-17 23:13   ` Bjorn Helgaas
  2021-06-18 17:22     ` Amey Narkhede
  1 sibling, 1 reply; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-17 23:13 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: Bjorn Helgaas, alex.williamson, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Shanker Donthineni, Sinan Kaya, Len Brown,
	Rafael J . Wysocki

"Add new" in subject and below is slightly redundant.

On Tue, Jun 08, 2021 at 11:18:51AM +0530, Amey Narkhede wrote:
> Introduce a new array reset_methods in struct pci_dev to keep track of
> reset mechanisms supported by the device and their ordering.
> Also refactor probing and reset functions to take advantage of calling
> convention of reset functions.
> 
> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
> Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
> ---
>  drivers/pci/pci.c   | 108 ++++++++++++++++++++++++++------------------
>  drivers/pci/pci.h   |   8 +++-
>  drivers/pci/probe.c |   5 +-
>  include/linux/pci.h |   7 +++
>  4 files changed, 81 insertions(+), 47 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 3bf36924c..39a9ea8bb 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -72,6 +72,14 @@ static void pci_dev_d3_sleep(struct pci_dev *dev)
>  		msleep(delay);
>  }
>  
> +bool pci_reset_supported(struct pci_dev *dev)
> +{
> +	u8 null_reset_methods[PCI_RESET_METHODS_NUM] = { 0 };
> +
> +	return memcmp(null_reset_methods,
> +		      dev->reset_methods, PCI_RESET_METHODS_NUM);

memcmp() doesn't actually return a bool.  Either just return int
and rely on the C "anything non-zero is true, zero is false" or
convert the memcmp result to bool, i.e., something like:

  if (memcmp(...) == 0)
    return true;
  return false;

> +}
> +
>  #ifdef CONFIG_PCI_DOMAINS
>  int pci_domains_supported = 1;
>  #endif
> @@ -5107,6 +5115,18 @@ static void pci_dev_restore(struct pci_dev *dev)
>  		err_handler->reset_done(dev);
>  }
>  
> +/*
> + * The ordering for functions in pci_reset_fn_methods is required for
> + * reset_methods byte array defined in struct pci_dev.

I'm not quite sure what this comment is telling me.  What breaks if I
change the order?  If I add a new method, how do I know where to put
it?

By reading the code, I infer that:

  - Each dev has dev->reset_methods[PCI_RESET_METHODS_NUM]

  - dev->reset_methods[i] corresponds to pci_reset_fn_methods[i]

  - dev->reset_methods[i] == 0 means dev doesn't support that method

  - Otherwise, dev->reset_methods[i] is a value in the range of
    [1, PCI_RESET_METHODS_NUM], and the higher the number, the higher
    the reset method priority

  - The order in pci_reset_fn_methods[] determines the initial
    priority via pci_init_reset_methods(), but the priority can be
    changed via sysfs

> + */
> +const struct pci_reset_fn_method pci_reset_fn_methods[] = {
> +	{ &pci_dev_specific_reset, .name = "device_specific" },
> +	{ &pcie_reset_flr, .name = "flr" },
> +	{ &pci_af_flr, .name = "af_flr" },
> +	{ &pci_pm_reset, .name = "pm" },
> +	{ &pci_reset_bus_function, .name = "bus" },
> +};
> +
>  /**
>   * __pci_reset_function_locked - reset a PCI device function while holding
>   * the @dev mutex lock.
> @@ -5129,65 +5149,67 @@ static void pci_dev_restore(struct pci_dev *dev)
>   */
>  int __pci_reset_function_locked(struct pci_dev *dev)
>  {
> -	int rc;
> +	int i, rc = -ENOTTY;
> +	u8 prio;
>  
>  	might_sleep();
>  
> -	/*
> -	 * A reset method returns -ENOTTY if it doesn't support this device
> -	 * and we should try the next method.
> -	 *
> -	 * If it returns 0 (success), we're finished.  If it returns any
> -	 * other error, we're also finished: this indicates that further
> -	 * reset mechanisms might be broken on the device.
> -	 */
> -	rc = pci_dev_specific_reset(dev, 0);
> -	if (rc != -ENOTTY)
> -		return rc;
> -	rc = pcie_reset_flr(dev, 0);
> -	if (rc != -ENOTTY)
> -		return rc;
> -	rc = pci_af_flr(dev, 0);
> -	if (rc != -ENOTTY)
> -		return rc;
> -	rc = pci_pm_reset(dev, 0);
> -	if (rc != -ENOTTY)
> -		return rc;
> -	return pci_reset_bus_function(dev, 0);
> +	for (prio = PCI_RESET_METHODS_NUM; prio; prio--) {
> +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> +			if (dev->reset_methods[i] == prio) {
> +				/*
> +				 * A reset method returns -ENOTTY if it doesn't
> +				 * support this device and we should try the
> +				 * next method.
> +				 *
> +				 * If it returns 0 (success), we're finished.
> +				 * If it returns any other error, we're also
> +				 * finished: this indicates that further reset
> +				 * mechanisms might be broken on the device.
> +				 */
> +				rc = pci_reset_fn_methods[i].reset_fn(dev, 0);
> +				if (rc != -ENOTTY)
> +					return rc;

Maybe leave the comment outside the loop where it used to be so the
text lines are longer and it's easier to read.

> +				break;
> +			}
> +		}
> +		if (i == PCI_RESET_METHODS_NUM)
> +			break;
> +	}
> +	return rc;

I wonder if this would be easier if dev->reset_methods[] contained
indices into pci_reset_fn_methods[], highest priority first, with the
priority being determined when dev->reset_methods[] is updated.  For
example:

  const struct pci_reset_fn_method pci_reset_fn_methods[] = {
    { },                                                     # 0
    { &pci_dev_specific_reset, .name = "device_specific" },  # 1
    { &pci_dev_acpi_reset, .name = "acpi" },                 # 2
    { &pcie_reset_flr, .name = "flr" },                      # 3
    { &pci_af_flr, .name = "af_flr" },                       # 4
    { &pci_pm_reset, .name = "pm" },                         # 5
    { &pci_reset_bus_function, .name = "bus" },              # 6
  };

  dev->reset_methods[] = [1, 2, 3, 4, 5, 6]
    means all reset methods are supported, in the default priority
    order

  dev->reset_methods[] = [1, 0, 0, 0, 0, 0]
    means only pci_dev_specific_reset is supported

  dev->reset_methods[] = [3, 5, 0, 0, 0, 0]
    means pcie_reset_flr and pci_pm_reset are supported, in that
    priority order

Then we wouldn't need the nested loop and the return value would be
easier to analyze:

  for (i = 0; i < PCI_RESET_METHODS_NUM && (m = dev->reset_methods[i]); i++) {
    rc = pci_reset_fn_methods[m].reset_fn(dev, 0);
    if (rc == 0)
      return 0;
    if (rc != -ENOTTY)
      return rc;
  }
  return -ENOTTY;

pci_init_reset_methods() would be something like:

  n = 0;
  for (i = 1; i < PCI_RESET_METHODS_NUM; i++) {
    rc = pci_reset_fn_methods[i].reset_fn(dev, 1);
    if (!rc)
      dev->reset_methods[n++] = i;
    if (rc != -ENOTTY)
      return;
  }

>  }
>  EXPORT_SYMBOL_GPL(__pci_reset_function_locked);
>  
>  /**
> - * pci_probe_reset_function - check whether the device can be safely reset
> - * @dev: PCI device to reset
> + * pci_init_reset_methods - check whether device can be safely reset
> + * and store supported reset mechanisms.
> + * @dev: PCI device to check for reset mechanisms
>   *
>   * Some devices allow an individual function to be reset without affecting
>   * other functions in the same device.  The PCI device must be responsive
> - * to PCI config space in order to use this function.
> + * to reads and writes to its PCI config space in order to use this function.
>   *
> - * Returns 0 if the device function can be reset or negative if the
> - * device doesn't support resetting a single function.
> + * Stores reset mechanisms supported by device in reset_methods byte array
> + * which is a member of struct pci_dev.
>   */
> -int pci_probe_reset_function(struct pci_dev *dev)
> +void pci_init_reset_methods(struct pci_dev *dev)
>  {
> -	int rc;
> +	int i, rc;
> +	u8 prio = PCI_RESET_METHODS_NUM;
> +	u8 reset_methods[PCI_RESET_METHODS_NUM] = { 0 };
>  
> -	might_sleep();
> +	BUILD_BUG_ON(ARRAY_SIZE(pci_reset_fn_methods) != PCI_RESET_METHODS_NUM);
>  
> -	rc = pci_dev_specific_reset(dev, 1);
> -	if (rc != -ENOTTY)
> -		return rc;
> -	rc = pcie_reset_flr(dev, 1);
> -	if (rc != -ENOTTY)
> -		return rc;
> -	rc = pci_af_flr(dev, 1);
> -	if (rc != -ENOTTY)
> -		return rc;
> -	rc = pci_pm_reset(dev, 1);
> -	if (rc != -ENOTTY)
> -		return rc;
> +	might_sleep();
>  
> -	return pci_reset_bus_function(dev, 1);
> +	for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> +		rc = pci_reset_fn_methods[i].reset_fn(dev, 1);
> +		if (!rc)
> +			reset_methods[i] = prio--;
> +		else if (rc != -ENOTTY)
> +			break;
> +	}
> +	memcpy(dev->reset_methods, reset_methods, sizeof(reset_methods));
>  }
>  
>  /**
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 37c913bbc..13ec6bd6f 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -33,7 +33,7 @@ enum pci_mmap_api {
>  int pci_mmap_fits(struct pci_dev *pdev, int resno, struct vm_area_struct *vmai,
>  		  enum pci_mmap_api mmap_api);
>  
> -int pci_probe_reset_function(struct pci_dev *dev);
> +void pci_init_reset_methods(struct pci_dev *dev);
>  int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
>  int pci_bus_error_reset(struct pci_dev *dev);
>  
> @@ -606,6 +606,12 @@ struct pci_dev_reset_methods {
>  	int (*reset)(struct pci_dev *dev, int probe);
>  };
>  
> +struct pci_reset_fn_method {
> +	int (*reset_fn)(struct pci_dev *pdev, int probe);
> +	char *name;
> +};
> +
> +extern const struct pci_reset_fn_method pci_reset_fn_methods[];
>  #ifdef CONFIG_PCI_QUIRKS
>  int pci_dev_specific_reset(struct pci_dev *dev, int probe);
>  #else
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 3a62d09b8..8cf532681 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2404,9 +2404,8 @@ static void pci_init_capabilities(struct pci_dev *dev)
>  	pci_rcec_init(dev);		/* Root Complex Event Collector */
>  
>  	pcie_report_downtraining(dev);
> -
> -	if (pci_probe_reset_function(dev) == 0)
> -		dev->reset_fn = 1;
> +	pci_init_reset_methods(dev);
> +	dev->reset_fn = pci_reset_supported(dev);
>  }
>  
>  /*
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 20b90c205..0955246f8 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -49,6 +49,8 @@
>  			       PCI_STATUS_SIG_TARGET_ABORT | \
>  			       PCI_STATUS_PARITY)
>  
> +#define PCI_RESET_METHODS_NUM 5

I'm pretty sure this needs to be kept in sync with something, maybe
ARRAY_SIZE(pci_reset_fn_methods)?  We need some mechanism to enforce
this, or at the very least, a comment.  Oh, I see you have a
BUILD_BUG_ON() in pci_init_reset_methods().  That's good, but a
comment here would help, too.

This name should be something like "PCI_RESET_METHODS" or
"PCI_NUM_RESET_METHODS".  Putting "_NUM" at the end makes it sounds
like we're referring to one specific method.

>  /*
>   * The PCI interface treats multi-function devices as independent
>   * devices.  The slot/function address of each device is encoded
> @@ -505,6 +507,10 @@ struct pci_dev {
>  	char		*driver_override; /* Driver name to force a match */
>  
>  	unsigned long	priv_flags;	/* Private flags for the PCI driver */
> +	/*
> +	 * See pci_reset_fn_methods array in pci.c for ordering.
> +	 */
> +	u8 reset_methods[PCI_RESET_METHODS_NUM];	/* Reset methods ordered by priority */
>  };
>  
>  static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
> @@ -1227,6 +1233,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
>  void pcie_print_link_status(struct pci_dev *dev);
>  int pcie_reset_flr(struct pci_dev *dev, int probe);
>  int pcie_flr(struct pci_dev *dev);
> +bool pci_reset_supported(struct pci_dev *dev);

This function isn't used outside drivers/pci/, so I'd rather have the
prototype in drivers/pci/pci.h.

>  int __pci_reset_function_locked(struct pci_dev *dev);
>  int pci_reset_function(struct pci_dev *dev);
>  int pci_reset_function_locked(struct pci_dev *dev);
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods
  2021-06-17 21:57   ` Bjorn Helgaas
  2021-06-17 22:51     ` Alex Williamson
@ 2021-06-18 16:32     ` Amey Narkhede
  1 sibling, 0 replies; 52+ messages in thread
From: Amey Narkhede @ 2021-06-18 16:32 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J .Wysocki,
	Christoph Hellwig

On 21/06/17 04:57PM, Bjorn Helgaas wrote:
> [+cc Christoph, since he added pcie_flr()]
>
> On Tue, Jun 08, 2021 at 11:18:50AM +0530, Amey Narkhede wrote:
> > Currently there is separate function pcie_has_flr() to probe if pcie flr is
> > supported by the device which does not match the calling convention
> > followed by reset methods which use second function argument to decide
> > whether to probe or not.  Add new function pcie_reset_flr() that follows
> > the calling convention of reset methods.
>
> I don't like the fact that we handle FLR differently from other types
> of reset, so I do like the fact that this makes them more consistent.
>
> > Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
> > Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
> > Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
> > ---
> >  drivers/crypto/cavium/nitrox/nitrox_main.c |  4 +-
> >  drivers/pci/pci.c                          | 62 ++++++++++++----------
> >  drivers/pci/pcie/aer.c                     | 12 ++---
> >  drivers/pci/quirks.c                       |  9 ++--
> >  include/linux/pci.h                        |  2 +-
> >  5 files changed, 43 insertions(+), 46 deletions(-)
> >
> > diff --git a/drivers/crypto/cavium/nitrox/nitrox_main.c b/drivers/crypto/cavium/nitrox/nitrox_main.c
> > index facc8e6bc..15d6c8452 100644
> > --- a/drivers/crypto/cavium/nitrox/nitrox_main.c
> > +++ b/drivers/crypto/cavium/nitrox/nitrox_main.c
> > @@ -306,9 +306,7 @@ static int nitrox_device_flr(struct pci_dev *pdev)
> >  		return -ENOMEM;
> >  	}
> >
> > -	/* check flr support */
> > -	if (pcie_has_flr(pdev))
> > -		pcie_flr(pdev);
> > +	pcie_reset_flr(pdev, 0);
> >
> >  	pci_restore_state(pdev);
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 452351025..3bf36924c 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -4611,32 +4611,12 @@ int pci_wait_for_pending_transaction(struct pci_dev *dev)
> >  }
> >  EXPORT_SYMBOL(pci_wait_for_pending_transaction);
> >
> > -/**
> > - * pcie_has_flr - check if a device supports function level resets
> > - * @dev: device to check
> > - *
> > - * Returns true if the device advertises support for PCIe function level
> > - * resets.
> > - */
> > -bool pcie_has_flr(struct pci_dev *dev)
> > -{
> > -	u32 cap;
> > -
> > -	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
> > -		return false;
> > -
> > -	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
> > -	return cap & PCI_EXP_DEVCAP_FLR;
> > -}
> > -EXPORT_SYMBOL_GPL(pcie_has_flr);
> > -
> >  /**
> >   * pcie_flr - initiate a PCIe function level reset
> >   * @dev: device to reset
> >   *
> > - * Initiate a function level reset on @dev.  The caller should ensure the
> > - * device supports FLR before calling this function, e.g. by using the
> > - * pcie_has_flr() helper.
> > + * Initiate a function level reset unconditionally on @dev without
> > + * checking any flags and DEVCAP
> >   */
> >  int pcie_flr(struct pci_dev *dev)
> >  {
> > @@ -4659,6 +4639,31 @@ int pcie_flr(struct pci_dev *dev)
> >  }
> >  EXPORT_SYMBOL_GPL(pcie_flr);
> >
> > +/**
> > + * pcie_reset_flr - initiate a PCIe function level reset
> > + * @dev: device to reset
> > + * @probe: If set, only check if the device can be reset this way.
> > + *
> > + * Initiate a function level reset on @dev.
> > + */
> > +int pcie_reset_flr(struct pci_dev *dev, int probe)
> > +{
> > +	u32 cap;
> > +
> > +	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
> > +		return -ENOTTY;
> > +
> > +	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
> > +	if (!(cap & PCI_EXP_DEVCAP_FLR))
> > +		return -ENOTTY;
> > +
> > +	if (probe)
> > +		return 0;
> > +
> > +	return pcie_flr(dev);
>
> Christoph added pcie_flr() with a60a2b73ba69 ("PCI: Export
> pcie_flr()"), where the commit log says he split out the probing
> because "non-core callers already know their hardware."
>
> It *is* reasonable to expect that drivers know whether their device
> supports FLR so they don't need to probe.
>
> But we don't expose the "probe" argument outside the PCI core for any
> other reset methods, and I would like to avoid that here.
>
> It seems excessive to have to read PCI_EXP_DEVCAP every time.
> PCI_EXP_DEVCAP_FLR is a read-only bit, and we should only need to look
> at it once.
>
> What I would really like here is a single bit in the pci_dev that we
> could set at enumeration-time, e.g., something like this:
>
>   struct pci_dev {
>     ...
>     unsigned int has_flr:1;
>   };
>
>   void set_pcie_port_type(...)    # during enumeration
>   {
>     pci_read_config_word(dev, pos + PCI_EXP_DEVCAP, &reg16);
>     if (reg16 & PCI_EXP_DEVCAP_FLR)
>       dev->has_flr = 1;
>   }
>
>   static void quirk_no_flr(...)
>   {
>     dev->has_flr = 0;             # get rid of PCI_DEV_FLAGS_NO_FLR_RESET
>   }
>
>   int pcie_flr(...)
>   {
>     if (!dev->has_flr)
>       return -ENOTTY;
>
>     if (!pci_wait_for_pending_transaction(dev))
>       ...
>   }
>
> I think this should be enough that we could get rid of pcie_has_flr()
> without having to expose the "probe" argument outside drivers/pci/.
>
> Procedural note: if we *do* have to expose the "probe" argument, can
> you arrange it to have the correct type before touching the drivers, so
> we only have to touch the drivers once?
>
Thanks for the details. I'll add dev->has_flr check in pcie_reset_flr.
[...]

Thanks,
Amey

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 2/8] PCI: Add new array for keeping track of ordering of reset methods
  2021-06-17 23:13   ` Bjorn Helgaas
@ 2021-06-18 17:22     ` Amey Narkhede
  2021-06-21 15:02       ` Shanker R Donthineni
  0 siblings, 1 reply; 52+ messages in thread
From: Amey Narkhede @ 2021-06-18 17:22 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On 21/06/17 06:13PM, Bjorn Helgaas wrote:
> "Add new" in subject and below is slightly redundant.
>
> On Tue, Jun 08, 2021 at 11:18:51AM +0530, Amey Narkhede wrote:
> > Introduce a new array reset_methods in struct pci_dev to keep track of
> > reset mechanisms supported by the device and their ordering.
> > Also refactor probing and reset functions to take advantage of calling
> > convention of reset functions.
> >
> > Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
> > Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
> > Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
> > ---
> >  drivers/pci/pci.c   | 108 ++++++++++++++++++++++++++------------------
> >  drivers/pci/pci.h   |   8 +++-
> >  drivers/pci/probe.c |   5 +-
> >  include/linux/pci.h |   7 +++
> >  4 files changed, 81 insertions(+), 47 deletions(-)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 3bf36924c..39a9ea8bb 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -72,6 +72,14 @@ static void pci_dev_d3_sleep(struct pci_dev *dev)
> >  		msleep(delay);
> >  }
> >
> > +bool pci_reset_supported(struct pci_dev *dev)
> > +{
> > +	u8 null_reset_methods[PCI_RESET_METHODS_NUM] = { 0 };
> > +
> > +	return memcmp(null_reset_methods,
> > +		      dev->reset_methods, PCI_RESET_METHODS_NUM);
>
> memcmp() doesn't actually return a bool.  Either just return int
> and rely on the C "anything non-zero is true, zero is false" or
> convert the memcmp result to bool, i.e., something like:
>
>   if (memcmp(...) == 0)
>     return true;
>   return false;
>
> > +}
> > +
> >  #ifdef CONFIG_PCI_DOMAINS
> >  int pci_domains_supported = 1;
> >  #endif
> > @@ -5107,6 +5115,18 @@ static void pci_dev_restore(struct pci_dev *dev)
> >  		err_handler->reset_done(dev);
> >  }
> >
> > +/*
> > + * The ordering for functions in pci_reset_fn_methods is required for
> > + * reset_methods byte array defined in struct pci_dev.
>
> I'm not quite sure what this comment is telling me.  What breaks if I
> change the order?  If I add a new method, how do I know where to put
> it?
>
> By reading the code, I infer that:
>
>   - Each dev has dev->reset_methods[PCI_RESET_METHODS_NUM]
>
>   - dev->reset_methods[i] corresponds to pci_reset_fn_methods[i]
>
>   - dev->reset_methods[i] == 0 means dev doesn't support that method
>
>   - Otherwise, dev->reset_methods[i] is a value in the range of
>     [1, PCI_RESET_METHODS_NUM], and the higher the number, the higher
>     the reset method priority
>
>   - The order in pci_reset_fn_methods[] determines the initial
>     priority via pci_init_reset_methods(), but the priority can be
>     changed via sysfs
>
Correct. I agree the comment is not clear. Adding new reset method won't break
anything unless default order is changed and user has some assumptions from
previous versions of kernel.
> > + */
> > +const struct pci_reset_fn_method pci_reset_fn_methods[] = {
> > +	{ &pci_dev_specific_reset, .name = "device_specific" },
> > +	{ &pcie_reset_flr, .name = "flr" },
> > +	{ &pci_af_flr, .name = "af_flr" },
> > +	{ &pci_pm_reset, .name = "pm" },
> > +	{ &pci_reset_bus_function, .name = "bus" },
> > +};
> > +
> >  /**
> >   * __pci_reset_function_locked - reset a PCI device function while holding
> >   * the @dev mutex lock.
> > @@ -5129,65 +5149,67 @@ static void pci_dev_restore(struct pci_dev *dev)
> >   */
> >  int __pci_reset_function_locked(struct pci_dev *dev)
> >  {
> > -	int rc;
> > +	int i, rc = -ENOTTY;
> > +	u8 prio;
> >
> >  	might_sleep();
> >
> > -	/*
> > -	 * A reset method returns -ENOTTY if it doesn't support this device
> > -	 * and we should try the next method.
> > -	 *
> > -	 * If it returns 0 (success), we're finished.  If it returns any
> > -	 * other error, we're also finished: this indicates that further
> > -	 * reset mechanisms might be broken on the device.
> > -	 */
> > -	rc = pci_dev_specific_reset(dev, 0);
> > -	if (rc != -ENOTTY)
> > -		return rc;
> > -	rc = pcie_reset_flr(dev, 0);
> > -	if (rc != -ENOTTY)
> > -		return rc;
> > -	rc = pci_af_flr(dev, 0);
> > -	if (rc != -ENOTTY)
> > -		return rc;
> > -	rc = pci_pm_reset(dev, 0);
> > -	if (rc != -ENOTTY)
> > -		return rc;
> > -	return pci_reset_bus_function(dev, 0);
> > +	for (prio = PCI_RESET_METHODS_NUM; prio; prio--) {
> > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> > +			if (dev->reset_methods[i] == prio) {
> > +				/*
> > +				 * A reset method returns -ENOTTY if it doesn't
> > +				 * support this device and we should try the
> > +				 * next method.
> > +				 *
> > +				 * If it returns 0 (success), we're finished.
> > +				 * If it returns any other error, we're also
> > +				 * finished: this indicates that further reset
> > +				 * mechanisms might be broken on the device.
> > +				 */
> > +				rc = pci_reset_fn_methods[i].reset_fn(dev, 0);
> > +				if (rc != -ENOTTY)
> > +					return rc;
>
> Maybe leave the comment outside the loop where it used to be so the
> text lines are longer and it's easier to read.
>
> > +				break;
> > +			}
> > +		}
> > +		if (i == PCI_RESET_METHODS_NUM)
> > +			break;
> > +	}
> > +	return rc;
>
> I wonder if this would be easier if dev->reset_methods[] contained
> indices into pci_reset_fn_methods[], highest priority first, with the
> priority being determined when dev->reset_methods[] is updated.  For
> example:
>
>   const struct pci_reset_fn_method pci_reset_fn_methods[] = {
>     { },                                                     # 0
>     { &pci_dev_specific_reset, .name = "device_specific" },  # 1
>     { &pci_dev_acpi_reset, .name = "acpi" },                 # 2
>     { &pcie_reset_flr, .name = "flr" },                      # 3
>     { &pci_af_flr, .name = "af_flr" },                       # 4
>     { &pci_pm_reset, .name = "pm" },                         # 5
>     { &pci_reset_bus_function, .name = "bus" },              # 6
>   };
>
>   dev->reset_methods[] = [1, 2, 3, 4, 5, 6]
>     means all reset methods are supported, in the default priority
>     order
>
>   dev->reset_methods[] = [1, 0, 0, 0, 0, 0]
>     means only pci_dev_specific_reset is supported
>
>   dev->reset_methods[] = [3, 5, 0, 0, 0, 0]
>     means pcie_reset_flr and pci_pm_reset are supported, in that
>     priority order
>
> Then we wouldn't need the nested loop and the return value would be
> easier to analyze:
>
>   for (i = 0; i < PCI_RESET_METHODS_NUM && (m = dev->reset_methods[i]); i++) {
>     rc = pci_reset_fn_methods[m].reset_fn(dev, 0);
>     if (rc == 0)
>       return 0;
>     if (rc != -ENOTTY)
>       return rc;
>   }
>   return -ENOTTY;
>
> pci_init_reset_methods() would be something like:
>
>   n = 0;
>   for (i = 1; i < PCI_RESET_METHODS_NUM; i++) {
>     rc = pci_reset_fn_methods[i].reset_fn(dev, 1);
>     if (!rc)
>       dev->reset_methods[n++] = i;
>     if (rc != -ENOTTY)
>       return;
>   }
>
I had similar idea initially but couldn't put it in words nicely
thanks for this. I'll update this.
[...]

Thanks,
Amey

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-08  5:48 ` [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism Amey Narkhede
  2021-06-09 21:57   ` Raphael Norwitz
  2021-06-10 20:16   ` Shanker R Donthineni
@ 2021-06-18 20:00   ` Bjorn Helgaas
  2021-06-19 13:59     ` Amey Narkhede
  2021-06-24 12:15   ` Bjorn Helgaas
  3 siblings, 1 reply; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-18 20:00 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: Bjorn Helgaas, alex.williamson, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Shanker Donthineni, Sinan Kaya, Len Brown,
	Rafael J . Wysocki

On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> Add reset_method sysfs attribute to enable user to
> query and set user preferred device reset methods and
> their ordering.

Rewrap to fill 75 columns (also apply to other patches if applicable,
e.g., 3/8 looks like it could use it).

2/8 looks like it's missing a blank line between paragraphs.

> Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
> ---
>  Documentation/ABI/testing/sysfs-bus-pci |  16 ++++
>  drivers/pci/pci-sysfs.c                 | 118 ++++++++++++++++++++++++
>  2 files changed, 134 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> index ef00fada2..cf6dbbb3c 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -121,6 +121,22 @@ Description:
>  		child buses, and re-discover devices removed earlier
>  		from this part of the device tree.
>  
> +What:		/sys/bus/pci/devices/.../reset_method
> +Date:		March 2021
> +Contact:	Amey Narkhede <ameynarkhede03@gmail.com>
> +Description:
> +		Some devices allow an individual function to be reset
> +		without affecting other functions in the same slot.
> +		For devices that have this support, a file named reset_method
> +		will be present in sysfs. Reading this file will give names
> +		of the device supported reset methods and their ordering.
> +		Writing the name or comma separated list of names of any of
> +		the device supported reset methods to this file will set the
> +		reset methods and their ordering to be used when resetting
> +		the device. Writing empty string to this file will disable
> +		ability to reset the device and writing "default" will return
> +		to the original value.

Rewrap to fill or add a blank line if "For devices ..." is supposed to
start a new paragraph.

My guess is you intend reading to show the *currently enabled* reset
methods, not the entire "supported" set?  So if a user has disabled
one of them, it no longer appears when you read the file?

> +
>  What:		/sys/bus/pci/devices/.../reset
>  Date:		July 2009
>  Contact:	Michael S. Tsirkin <mst@redhat.com>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 316f70c3e..52def79aa 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1334,6 +1334,123 @@ static const struct attribute_group pci_dev_rom_attr_group = {
>  	.is_bin_visible = pci_dev_rom_attr_is_visible,
>  };
>  
> +static ssize_t reset_method_show(struct device *dev,
> +				 struct device_attribute *attr,
> +				 char *buf)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	ssize_t len = 0;
> +	int i, prio;
> +
> +	for (prio = PCI_RESET_METHODS_NUM; prio; prio--) {
> +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> +			if (prio == pdev->reset_methods[i]) {
> +				len += sysfs_emit_at(buf, len, "%s%s",
> +						     len ? "," : "",
> +						     pci_reset_fn_methods[i].name);
> +				break;
> +			}
> +		}
> +
> +		if (i == PCI_RESET_METHODS_NUM)
> +			break;
> +	}

I'm guessing that if you adopt the alternate reset_methods[] encoding,
this nested loop becomes a single loop and "prio" goes away?

> +	if (len)
> +		len += sysfs_emit_at(buf, len, "\n");
> +
> +	return len;
> +}
> +
> +static ssize_t reset_method_store(struct device *dev,
> +				  struct device_attribute *attr,
> +				  const char *buf, size_t count)
> +{
> +	u8 reset_methods[PCI_RESET_METHODS_NUM];
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	u8 prio = PCI_RESET_METHODS_NUM;
> +	char *name, *options;
> +	int i;

Reorder decls with to_pci_dev(dev) first, then in order of use.

> +	if (count >= (PAGE_SIZE - 1))
> +		return -EINVAL;
> +
> +	options = kstrndup(buf, count, GFP_KERNEL);
> +	if (!options)
> +		return -ENOMEM;
> +
> +	/*
> +	 * Initialize reset_method such that 0xff indicates
> +	 * supported but not currently enabled reset methods
> +	 * as we only use priority values which are within
> +	 * the range of PCI_RESET_FN_METHODS array size
> +	 */
> +	for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> +		reset_methods[i] = pdev->reset_methods[i] ? 0xff : 0;

I'm hoping the 0xff trick goes away with the alternate encoding?

> +	if (sysfs_streq(options, "")) {
> +		pci_warn(pdev, "All device reset methods disabled by user");
> +		goto set_reset_methods;
> +	}

I think you can get this case out of the way early with no kstrndup(),
no goto, etc.

> +	if (sysfs_streq(options, "default")) {
> +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> +			reset_methods[i] = reset_methods[i] ? prio-- : 0;
> +		goto set_reset_methods;
> +	}

If you use pci_init_reset_methods() here, you can also get this case
out of the way early.

> +	while ((name = strsep(&options, ",")) != NULL) {
> +		if (sysfs_streq(name, ""))
> +			continue;
> +
> +		name = strim(name);
> +
> +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> +			if (reset_methods[i] &&
> +			    sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> +				reset_methods[i] = prio--;
> +				break;
> +			}
> +		}
> +
> +		if (i == PCI_RESET_METHODS_NUM) {
> +			kfree(options);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	if (reset_methods[0] &&
> +	    reset_methods[0] != PCI_RESET_METHODS_NUM)
> +		pci_warn(pdev, "Device specific reset disabled/de-prioritized by user");

Is there a specific reason for this warning?  Is it just telling the
user that he might have shot himself in the foot?  Not sure that's
necessary.

> +set_reset_methods:
> +	kfree(options);
> +	memcpy(pdev->reset_methods, reset_methods, sizeof(reset_methods));
> +	return count;
> +}
> +static DEVICE_ATTR_RW(reset_method);
> +
> +static struct attribute *pci_dev_reset_method_attrs[] = {
> +	&dev_attr_reset_method.attr,
> +	NULL,
> +};
> +
> +static umode_t pci_dev_reset_method_attr_is_visible(struct kobject *kobj,
> +						    struct attribute *a, int n)
> +{
> +	struct pci_dev *pdev = to_pci_dev(kobj_to_dev(kobj));
> +
> +	if (!pci_reset_supported(pdev))
> +		return 0;

I think this _is_visible method is executed only once, at
device_add()-time.  That means if a device doesn't support any resets
at that time, "reset_method" will not be visible, and there will be no
way to ever enable a reset method at run-time.  I assume that's OK;
just double-checking.

> +
> +	return a->mode;
> +}
> +
> +static const struct attribute_group pci_dev_reset_method_attr_group = {
> +	.attrs = pci_dev_reset_method_attrs,
> +	.is_visible = pci_dev_reset_method_attr_is_visible,
> +};
> +
>  static ssize_t reset_store(struct device *dev, struct device_attribute *attr,
>  			   const char *buf, size_t count)
>  {
> @@ -1491,6 +1608,7 @@ const struct attribute_group *pci_dev_groups[] = {
>  	&pci_dev_config_attr_group,
>  	&pci_dev_rom_attr_group,
>  	&pci_dev_reset_attr_group,
> +	&pci_dev_reset_method_attr_group,
>  	&pci_dev_vpd_attr_group,
>  #ifdef CONFIG_DMI
>  	&pci_dev_smbios_attr_group,
> -- 
> 2.31.1
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-18 20:00   ` Bjorn Helgaas
@ 2021-06-19 13:59     ` Amey Narkhede
  2021-06-21 13:01       ` Bjorn Helgaas
  0 siblings, 1 reply; 52+ messages in thread
From: Amey Narkhede @ 2021-06-19 13:59 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On 21/06/18 03:00PM, Bjorn Helgaas wrote:
> On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > Add reset_method sysfs attribute to enable user to
> > query and set user preferred device reset methods and
> > their ordering.
>
> Rewrap to fill 75 columns (also apply to other patches if applicable,
> e.g., 3/8 looks like it could use it).
>
> 2/8 looks like it's missing a blank line between paragraphs.
>
> > Co-developed-by: Alex Williamson <alex.williamson@redhat.com>
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > Signed-off-by: Amey Narkhede <ameynarkhede03@gmail.com>
> > ---
> >  Documentation/ABI/testing/sysfs-bus-pci |  16 ++++
> >  drivers/pci/pci-sysfs.c                 | 118 ++++++++++++++++++++++++
> >  2 files changed, 134 insertions(+)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> > index ef00fada2..cf6dbbb3c 100644
> > --- a/Documentation/ABI/testing/sysfs-bus-pci
> > +++ b/Documentation/ABI/testing/sysfs-bus-pci
> > @@ -121,6 +121,22 @@ Description:
> >  		child buses, and re-discover devices removed earlier
> >  		from this part of the device tree.
> >
> > +What:		/sys/bus/pci/devices/.../reset_method
> > +Date:		March 2021
> > +Contact:	Amey Narkhede <ameynarkhede03@gmail.com>
> > +Description:
> > +		Some devices allow an individual function to be reset
> > +		without affecting other functions in the same slot.
> > +		For devices that have this support, a file named reset_method
> > +		will be present in sysfs. Reading this file will give names
> > +		of the device supported reset methods and their ordering.
> > +		Writing the name or comma separated list of names of any of
> > +		the device supported reset methods to this file will set the
> > +		reset methods and their ordering to be used when resetting
> > +		the device. Writing empty string to this file will disable
> > +		ability to reset the device and writing "default" will return
> > +		to the original value.
>
> Rewrap to fill or add a blank line if "For devices ..." is supposed to
> start a new paragraph.
>
> My guess is you intend reading to show the *currently enabled* reset
> methods, not the entire "supported" set?  So if a user has disabled
> one of them, it no longer appears when you read the file?
>
> > +
> >  What:		/sys/bus/pci/devices/.../reset
> >  Date:		July 2009
> >  Contact:	Michael S. Tsirkin <mst@redhat.com>
> > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> > index 316f70c3e..52def79aa 100644
> > --- a/drivers/pci/pci-sysfs.c
> > +++ b/drivers/pci/pci-sysfs.c
> > @@ -1334,6 +1334,123 @@ static const struct attribute_group pci_dev_rom_attr_group = {
> >  	.is_bin_visible = pci_dev_rom_attr_is_visible,
> >  };
> >
> > +static ssize_t reset_method_show(struct device *dev,
> > +				 struct device_attribute *attr,
> > +				 char *buf)
> > +{
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	ssize_t len = 0;
> > +	int i, prio;
> > +
> > +	for (prio = PCI_RESET_METHODS_NUM; prio; prio--) {
> > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> > +			if (prio == pdev->reset_methods[i]) {
> > +				len += sysfs_emit_at(buf, len, "%s%s",
> > +						     len ? "," : "",
> > +						     pci_reset_fn_methods[i].name);
> > +				break;
> > +			}
> > +		}
> > +
> > +		if (i == PCI_RESET_METHODS_NUM)
> > +			break;
> > +	}
>
> I'm guessing that if you adopt the alternate reset_methods[] encoding,
> this nested loop becomes a single loop and "prio" goes away?
>
> > +	if (len)
> > +		len += sysfs_emit_at(buf, len, "\n");
> > +
> > +	return len;
> > +}
> > +
> > +static ssize_t reset_method_store(struct device *dev,
> > +				  struct device_attribute *attr,
> > +				  const char *buf, size_t count)
> > +{
> > +	u8 reset_methods[PCI_RESET_METHODS_NUM];
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	u8 prio = PCI_RESET_METHODS_NUM;
> > +	char *name, *options;
> > +	int i;
>
> Reorder decls with to_pci_dev(dev) first, then in order of use.
>
> > +	if (count >= (PAGE_SIZE - 1))
> > +		return -EINVAL;
> > +
> > +	options = kstrndup(buf, count, GFP_KERNEL);
> > +	if (!options)
> > +		return -ENOMEM;
> > +
> > +	/*
> > +	 * Initialize reset_method such that 0xff indicates
> > +	 * supported but not currently enabled reset methods
> > +	 * as we only use priority values which are within
> > +	 * the range of PCI_RESET_FN_METHODS array size
> > +	 */
> > +	for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> > +		reset_methods[i] = pdev->reset_methods[i] ? 0xff : 0;
>
> I'm hoping the 0xff trick goes away with the alternate encoding?
>
> > +	if (sysfs_streq(options, "")) {
> > +		pci_warn(pdev, "All device reset methods disabled by user");
> > +		goto set_reset_methods;
> > +	}
>
> I think you can get this case out of the way early with no kstrndup(),
> no goto, etc.
>
> > +	if (sysfs_streq(options, "default")) {
> > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> > +			reset_methods[i] = reset_methods[i] ? prio-- : 0;
> > +		goto set_reset_methods;
> > +	}
>
> If you use pci_init_reset_methods() here, you can also get this case
> out of the way early.
>
The problem with alternate encoding is we won't be able to know if
one of the reset methods was disabled previously. For example,

# cat reset_methods
flr,bus 			# dev->reset_methods = [3, 5, 0, ...]
# echo bus > reset_methods 	# dev->reset_methods = [5, 0, 0, ...]
# cat reset_methods
bus

Now if an user wants to enable flr

# echo flr > reset_methods 	# dev->reset_methods = [3, 0, 0, ...]
OR
# echo bus,flr > reset_methods 	# dev->reset_methods = [5, 3, 0, ...]

either they need to write "default" first then flr or we will need to
reprobe reset methods each time when user writes to reset_method attribute.


> > +	while ((name = strsep(&options, ",")) != NULL) {
> > +		if (sysfs_streq(name, ""))
> > +			continue;
> > +
> > +		name = strim(name);
> > +
> > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> > +			if (reset_methods[i] &&
> > +			    sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> > +				reset_methods[i] = prio--;
> > +				break;
> > +			}
> > +		}
> > +
> > +		if (i == PCI_RESET_METHODS_NUM) {
> > +			kfree(options);
> > +			return -EINVAL;
> > +		}
> > +	}
> > +
> > +	if (reset_methods[0] &&
> > +	    reset_methods[0] != PCI_RESET_METHODS_NUM)
> > +		pci_warn(pdev, "Device specific reset disabled/de-prioritized by user");
>
> Is there a specific reason for this warning?  Is it just telling the
> user that he might have shot himself in the foot?  Not sure that's
> necessary.
>
I think generally presence of device specific reset method means other
methods are potentially broken. Is it okay to skip this?

> > +set_reset_methods:
> > +	kfree(options);
> > +	memcpy(pdev->reset_methods, reset_methods, sizeof(reset_methods));
> > +	return count;
> > +}
> > +static DEVICE_ATTR_RW(reset_method);
> > +
> > +static struct attribute *pci_dev_reset_method_attrs[] = {
> > +	&dev_attr_reset_method.attr,
> > +	NULL,
> > +};
> > +
> > +static umode_t pci_dev_reset_method_attr_is_visible(struct kobject *kobj,
> > +						    struct attribute *a, int n)
> > +{
> > +	struct pci_dev *pdev = to_pci_dev(kobj_to_dev(kobj));
> > +
> > +	if (!pci_reset_supported(pdev))
> > +		return 0;
>
> I think this _is_visible method is executed only once, at
> device_add()-time.  That means if a device doesn't support any resets
> at that time, "reset_method" will not be visible, and there will be no
> way to ever enable a reset method at run-time.  I assume that's OK;
> just double-checking.
>
Correct. Its similar to exisitng reset_fn bitfield which is removed in this
patch series.
[...]

Thanks,
Amey

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-19 13:59     ` Amey Narkhede
@ 2021-06-21 13:01       ` Bjorn Helgaas
  2021-06-21 17:28         ` Amey Narkhede
  2021-06-23 17:21         ` Alex Williamson
  0 siblings, 2 replies; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-21 13:01 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On Sat, Jun 19, 2021 at 07:29:20PM +0530, Amey Narkhede wrote:
> On 21/06/18 03:00PM, Bjorn Helgaas wrote:
> > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > > Add reset_method sysfs attribute to enable user to
> > > query and set user preferred device reset methods and
> > > their ordering.

> > > +	if (sysfs_streq(options, "default")) {
> > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> > > +			reset_methods[i] = reset_methods[i] ? prio-- : 0;
> > > +		goto set_reset_methods;
> > > +	}
> >
> > If you use pci_init_reset_methods() here, you can also get this case
> > out of the way early.
> >
> The problem with alternate encoding is we won't be able to know if
> one of the reset methods was disabled previously. For example,
> 
> # cat reset_methods
> flr,bus 			# dev->reset_methods = [3, 5, 0, ...]
> # echo bus > reset_methods 	# dev->reset_methods = [5, 0, 0, ...]
> # cat reset_methods
> bus
> 
> Now if an user wants to enable flr
> 
> # echo flr > reset_methods 	# dev->reset_methods = [3, 0, 0, ...]
> OR
> # echo bus,flr > reset_methods 	# dev->reset_methods = [5, 3, 0, ...]
> 
> either they need to write "default" first then flr or we will need to
> reprobe reset methods each time when user writes to reset_method attribute.

Not sure I completely understand the problem here.  I think relying on
previous state that is invisible to the user is a little problematic
because it's hard for the user to predict what will happen.

If the user enables a method that was previously "disabled" because
the probe failed, won't the reset method itself just fail with
-ENOTTY?  Is that a problem?

> > > +	while ((name = strsep(&options, ",")) != NULL) {
> > > +		if (sysfs_streq(name, ""))
> > > +			continue;
> > > +
> > > +		name = strim(name);
> > > +
> > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> > > +			if (reset_methods[i] &&
> > > +			    sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> > > +				reset_methods[i] = prio--;
> > > +				break;
> > > +			}
> > > +		}
> > > +
> > > +		if (i == PCI_RESET_METHODS_NUM) {
> > > +			kfree(options);
> > > +			return -EINVAL;
> > > +		}
> > > +	}
> > > +
> > > +	if (reset_methods[0] &&
> > > +	    reset_methods[0] != PCI_RESET_METHODS_NUM)
> > > +		pci_warn(pdev, "Device specific reset disabled/de-prioritized by user");
> >
> > Is there a specific reason for this warning?  Is it just telling the
> > user that he might have shot himself in the foot?  Not sure that's
> > necessary.
> >
> I think generally presence of device specific reset method means other
> methods are potentially broken. Is it okay to skip this?

We might want a warning at reset-time if all the methods failed,
because that means we may leak state between users.  Maybe we also
want one here, if *all* reset methods are disabled.  I don't really
like special treatment of device-specific methods here because it
depends on the assumption that "device-specific means all other resets
are broken."  That's hard to maintain.

Bjorn

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 2/8] PCI: Add new array for keeping track of ordering of reset methods
  2021-06-18 17:22     ` Amey Narkhede
@ 2021-06-21 15:02       ` Shanker R Donthineni
  2021-06-21 17:15         ` Amey Narkhede
  0 siblings, 1 reply; 52+ messages in thread
From: Shanker R Donthineni @ 2021-06-21 15:02 UTC (permalink / raw)
  To: Amey Narkhede, Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Sinan Kaya, Len Brown, Rafael J . Wysocki

Hi Bjorn,

On 6/18/21 12:22 PM, Amey Narkhede wrote:
> I wonder if this would be easier if dev->reset_methods[] contained
> indices into pci_reset_fn_methods[], highest priority first, with the
> priority being determined when dev->reset_methods[] is updated.  For
> example:
>
>   const struct pci_reset_fn_method pci_reset_fn_methods[] = {
>     { },                                                     # 0
>     { &pci_dev_specific_reset, .name = "device_specific" },  # 1
>     { &pci_dev_acpi_reset, .name = "acpi" },                 # 2
>     { &pcie_reset_flr, .name = "flr" },                      # 3
>     { &pci_af_flr, .name = "af_flr" },                       # 4
>     { &pci_pm_reset, .name = "pm" },                         # 5
>     { &pci_reset_bus_function, .name = "bus" },              # 6
>   };
>
>   dev->reset_methods[] = [1, 2, 3, 4, 5, 6]
>     means all reset methods are supported, in the default priority
>     order
>
>   dev->reset_methods[] = [1, 0, 0, 0, 0, 0]
>     means only pci_dev_specific_reset is supported
>
>   dev->reset_methods[] = [3, 5, 0, 0, 0, 0]
>     means pcie_reset_flr and pci_pm_reset are supported, in that
>     priority order
What about keeping two bitmap fields 'resets_supported' and 'resets_enabled' in
pci_dev object and mange it through sysfs and probe helper function. We can avoid
two loops multiple paces and takes only 2Bytes of memory to keep track resets.

resets_supported  ---> initialized during pci_dev setup
resets_enabled ---> Exposed to userspace through sysfs by default set to resets_supported

include/linux/pci.h:
------------------------
/* Different types of PCI resets possible, lower number is higher priority */
#define PCI_RESET_METHOD_DEVSPEC     0
#define PCI_RESET_METHOD_ACPI            1
#define PCI_RESET_METHOD_FLR              2
#define PCI_RESET_METHOD_Af_FLR         3
#define PCI_RESET_METHOD_PM               4
#define PCI_RESET_METHOD_BUS             5
#define PCI_RESET_METHOD_MAX            6

struct pci_dev {
    ...
        u8              resets_supported;
        u8              resets_enabled;
};

static inline bool pci_reset_supported(struct pci_dev *dev)
{
        return !!(dev->resets_supported);
}


drivers/pci/pci.c:
--------------------
const struct pci_reset_fn_method pci_reset_fn_methods[PCI_RESET_METHOD_MAX] = {
        [PCI_RESET_METHOD_DEVSPEC] = { &pci_dev_specific_reset,
                                                                   .name = "device_specific" },
        [PCI_RESET_METHOD_ACPI] = { &pci_dev_acpi_reset, .name = "acpi" },
        [PCI_RESET_METHOD_FLR] = { &pcie_reset_flr, .name = "flr" },
        [PCI_RESET_METHOD_Af_FLR] = { &pci_af_flr, .name = "af_flr" },
        [PCI_RESET_METHOD_PM] = { &pci_pm_reset, .name = "pm" },
        [PCI_RESET_METHOD_BUS] = { &pci_reset_bus_function, .name = "bus" }
};


void pci_init_reset_methods(struct pci_dev *dev)
{
        int i, rc;

        BUILD_BUG_ON(ARRAY_SIZE(pci_reset_fn_methods) != PCI_RESET_METHOD_MAX);
        might_sleep();

        for (i = 0; i < PCI_RESET_METHOD_MAX; i++) {
                rc = pci_reset_fn_methods[i].reset_fn(dev, PCI_RESET_PROBE);
                if (!rc)
                        dev->resets_supported |= BIT(i);
                else if (rc != -ENOTTY)
                        break;
        }
        dev->resets_enabled = dev->resets_supported;
}

int __pci_reset_function_locked(struct pci_dev *dev)
{
        int i, rc = -ENOTTY;

        might_sleep();

        for (i = 0; i < PCI_RESET_METHOD_MAX; i++) {
                if (dev->resets_enabled & BIT(i)) {
                        rc = pci_reset_fn_methods[i].reset_fn(dev, 0);
                        if (rc != -ENOTTY)
                                return rc;
                }
        }

        if (rc == -ENOTTY)
                pci_warn(dev, "No reset happened reason %s\n",
                         !!dev->resets_supported ?
                         "disabled by user" : "not supported");

        return rc;
}

drivers/pci/pci-sysfs.c
----------------------------
static ssize_t reset_method_store(struct device *dev,
                                  struct device_attribute *attr,
                                  const char *buf, size_t count)
{
        struct pci_dev *pdev = to_pci_dev(dev);
        u8 resets_enabled = 0;
...
        if (sysfs_streq(options, "default")) {
                pdev->resets_enabled = pdev->resets_supported;
                goto set_reset_methods;
        }

        while ((name = strsep(&options, ",")) != NULL) {
                if (sysfs_streq(name, ""))
                        continue;
                name = strim(name);

                for (i = 0; i < PCI_RESET_METHOD_MAX; i++) {
                        if ((pdev->resets_supported & BIT(i)) &&
                            sysfs_streq(name, pci_reset_fn_methods[i].name)) {
                                resets_enabled |= BIT(i);
                                break;
                        }
                }
...
        }

set_reset_methods:
        kfree(options);
        pdev->resets_enabled =  resets_enabled;
        return count;
}

static ssize_t reset_method_show(struct device *dev,
                                 struct device_attribute *attr,
                                 char *buf)
{
        struct pci_dev *pdev = to_pci_dev(dev);
        ssize_t len = 0;
        int i;

        for (i = 0; i < PCI_RESET_METHOD_MAX; i++) {
                if (pdev->resets_enabled & BIT(i))
                        len += sysfs_emit_at(buf, len, "%s%s",
                                             len ? "," : "",
                                             pci_reset_fn_methods[i].name);
        }
        len += sysfs_emit_at(buf, len, len ? "\n" : "");

        return len;
}


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 2/8] PCI: Add new array for keeping track of ordering of reset methods
  2021-06-21 15:02       ` Shanker R Donthineni
@ 2021-06-21 17:15         ` Amey Narkhede
  2021-06-21 18:37           ` Bjorn Helgaas
  0 siblings, 1 reply; 52+ messages in thread
From: Amey Narkhede @ 2021-06-21 17:15 UTC (permalink / raw)
  To: Shanker R Donthineni
  Cc: Bjorn Helgaas, alex.williamson, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Sinan Kaya, Len Brown, Rafael J . Wysocki

On 21/06/21 10:02AM, Shanker R Donthineni wrote:
> Hi Bjorn,
>
> On 6/18/21 12:22 PM, Amey Narkhede wrote:
> > I wonder if this would be easier if dev->reset_methods[] contained
> > indices into pci_reset_fn_methods[], highest priority first, with the
> > priority being determined when dev->reset_methods[] is updated.  For
> > example:
> >
> >   const struct pci_reset_fn_method pci_reset_fn_methods[] = {
> >     { },                                                     # 0
> >     { &pci_dev_specific_reset, .name = "device_specific" },  # 1
> >     { &pci_dev_acpi_reset, .name = "acpi" },                 # 2
> >     { &pcie_reset_flr, .name = "flr" },                      # 3
> >     { &pci_af_flr, .name = "af_flr" },                       # 4
> >     { &pci_pm_reset, .name = "pm" },                         # 5
> >     { &pci_reset_bus_function, .name = "bus" },              # 6
> >   };
> >
> >   dev->reset_methods[] = [1, 2, 3, 4, 5, 6]
> >     means all reset methods are supported, in the default priority
> >     order
> >
> >   dev->reset_methods[] = [1, 0, 0, 0, 0, 0]
> >     means only pci_dev_specific_reset is supported
> >
> >   dev->reset_methods[] = [3, 5, 0, 0, 0, 0]
> >     means pcie_reset_flr and pci_pm_reset are supported, in that
> >     priority order
> What about keeping two bitmap fields 'resets_supported' and 'resets_enabled' in
> pci_dev object and mange it through sysfs and probe helper function. We can avoid
> two loops multiple paces and takes only 2Bytes of memory to keep track resets.
>
> resets_supported  ---> initialized during pci_dev setup
> resets_enabled ---> Exposed to userspace through sysfs by default set to resets_supported
>
> include/linux/pci.h:
> ------------------------
> /* Different types of PCI resets possible, lower number is higher priority */
> #define PCI_RESET_METHOD_DEVSPEC     0
> #define PCI_RESET_METHOD_ACPI            1
> #define PCI_RESET_METHOD_FLR              2
> #define PCI_RESET_METHOD_Af_FLR         3
> #define PCI_RESET_METHOD_PM               4
> #define PCI_RESET_METHOD_BUS             5
> #define PCI_RESET_METHOD_MAX            6
>
> struct pci_dev {
>     ...
>         u8              resets_supported;
>         u8              resets_enabled;
> };
>
> static inline bool pci_reset_supported(struct pci_dev *dev)
> {
>         return !!(dev->resets_supported);
> }
>
>
> drivers/pci/pci.c:
> --------------------
> const struct pci_reset_fn_method pci_reset_fn_methods[PCI_RESET_METHOD_MAX] = {
>         [PCI_RESET_METHOD_DEVSPEC] = { &pci_dev_specific_reset,
>                                                                    .name = "device_specific" },
>         [PCI_RESET_METHOD_ACPI] = { &pci_dev_acpi_reset, .name = "acpi" },
>         [PCI_RESET_METHOD_FLR] = { &pcie_reset_flr, .name = "flr" },
>         [PCI_RESET_METHOD_Af_FLR] = { &pci_af_flr, .name = "af_flr" },
>         [PCI_RESET_METHOD_PM] = { &pci_pm_reset, .name = "pm" },
>         [PCI_RESET_METHOD_BUS] = { &pci_reset_bus_function, .name = "bus" }
> };
>
>
> void pci_init_reset_methods(struct pci_dev *dev)
> {
>         int i, rc;
>
>         BUILD_BUG_ON(ARRAY_SIZE(pci_reset_fn_methods) != PCI_RESET_METHOD_MAX);
>         might_sleep();
>
>         for (i = 0; i < PCI_RESET_METHOD_MAX; i++) {
>                 rc = pci_reset_fn_methods[i].reset_fn(dev, PCI_RESET_PROBE);
>                 if (!rc)
>                         dev->resets_supported |= BIT(i);
>                 else if (rc != -ENOTTY)
>                         break;
>         }
>         dev->resets_enabled = dev->resets_supported;
> }
>
> int __pci_reset_function_locked(struct pci_dev *dev)
> {
>         int i, rc = -ENOTTY;
>
>         might_sleep();
>
>         for (i = 0; i < PCI_RESET_METHOD_MAX; i++) {
>                 if (dev->resets_enabled & BIT(i)) {
>                         rc = pci_reset_fn_methods[i].reset_fn(dev, 0);
>                         if (rc != -ENOTTY)
>                                 return rc;
>                 }
>         }
>
>         if (rc == -ENOTTY)
>                 pci_warn(dev, "No reset happened reason %s\n",
>                          !!dev->resets_supported ?
>                          "disabled by user" : "not supported");
>
>         return rc;
> }
>
> drivers/pci/pci-sysfs.c
> ----------------------------
> static ssize_t reset_method_store(struct device *dev,
>                                   struct device_attribute *attr,
>                                   const char *buf, size_t count)
> {
>         struct pci_dev *pdev = to_pci_dev(dev);
>         u8 resets_enabled = 0;
> ...
>         if (sysfs_streq(options, "default")) {
>                 pdev->resets_enabled = pdev->resets_supported;
>                 goto set_reset_methods;
>         }
>
>         while ((name = strsep(&options, ",")) != NULL) {
>                 if (sysfs_streq(name, ""))
>                         continue;
>                 name = strim(name);
>
>                 for (i = 0; i < PCI_RESET_METHOD_MAX; i++) {
>                         if ((pdev->resets_supported & BIT(i)) &&
>                             sysfs_streq(name, pci_reset_fn_methods[i].name)) {
>                                 resets_enabled |= BIT(i);
>                                 break;
>                         }
>                 }
> ...
>         }
>
> set_reset_methods:
>         kfree(options);
>         pdev->resets_enabled =  resets_enabled;
>         return count;
> }
>
> static ssize_t reset_method_show(struct device *dev,
>                                  struct device_attribute *attr,
>                                  char *buf)
> {
>         struct pci_dev *pdev = to_pci_dev(dev);
>         ssize_t len = 0;
>         int i;
>
>         for (i = 0; i < PCI_RESET_METHOD_MAX; i++) {
>                 if (pdev->resets_enabled & BIT(i))
>                         len += sysfs_emit_at(buf, len, "%s%s",
>                                              len ? "," : "",
>                                              pci_reset_fn_methods[i].name);
>         }
>         len += sysfs_emit_at(buf, len, len ? "\n" : "");
>
>         return len;
> }
>
Thank you for the idea.
Actually that would be coming full circle because Alex, Raphael and I
tried similar approach earlier while prototyping for v2 but this implementation
does look better than what I had at that time.

Thanks,
Amey

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-21 13:01       ` Bjorn Helgaas
@ 2021-06-21 17:28         ` Amey Narkhede
  2021-06-21 19:07           ` Bjorn Helgaas
  2021-06-23 17:21         ` Alex Williamson
  1 sibling, 1 reply; 52+ messages in thread
From: Amey Narkhede @ 2021-06-21 17:28 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On 21/06/21 08:01AM, Bjorn Helgaas wrote:
> On Sat, Jun 19, 2021 at 07:29:20PM +0530, Amey Narkhede wrote:
> > On 21/06/18 03:00PM, Bjorn Helgaas wrote:
> > > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > > > Add reset_method sysfs attribute to enable user to
> > > > query and set user preferred device reset methods and
> > > > their ordering.
>
> > > > +	if (sysfs_streq(options, "default")) {
> > > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> > > > +			reset_methods[i] = reset_methods[i] ? prio-- : 0;
> > > > +		goto set_reset_methods;
> > > > +	}
> > >
> > > If you use pci_init_reset_methods() here, you can also get this case
> > > out of the way early.
> > >
> > The problem with alternate encoding is we won't be able to know if
> > one of the reset methods was disabled previously. For example,
> >
> > # cat reset_methods
> > flr,bus 			# dev->reset_methods = [3, 5, 0, ...]
> > # echo bus > reset_methods 	# dev->reset_methods = [5, 0, 0, ...]
> > # cat reset_methods
> > bus
> >
> > Now if an user wants to enable flr
> >
> > # echo flr > reset_methods 	# dev->reset_methods = [3, 0, 0, ...]
> > OR
> > # echo bus,flr > reset_methods 	# dev->reset_methods = [5, 3, 0, ...]
> >
> > either they need to write "default" first then flr or we will need to
> > reprobe reset methods each time when user writes to reset_method attribute.
>
> Not sure I completely understand the problem here.  I think relying on
> previous state that is invisible to the user is a little problematic
> because it's hard for the user to predict what will happen.
>
> If the user enables a method that was previously "disabled" because
> the probe failed, won't the reset method itself just fail with
> -ENOTTY?  Is that a problem?
>
I think I didn't explain this correctly. With current implementation
its not necessary to explicitly set *order of availabe* reset methods.
User can directly write a single supported reset method only and then perform
the reset. Side effect of that is other methods are disabled if user
writes single or less than available number of supported reset method.
Current implementation is able to handle this case but with new encoding
we'll need to reprobe reset methods everytime because we have no way
of distingushing supported and currently enabled reset method.

Alternate way of doing this is using 2 bitmaps as outlined here by
Shanker https://marc.info/?l=linux-kernel&m=162428773101702&w=2
> > > > +	while ((name = strsep(&options, ",")) != NULL) {
> > > > +		if (sysfs_streq(name, ""))
> > > > +			continue;
> > > > +
> > > > +		name = strim(name);
> > > > +
> > > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> > > > +			if (reset_methods[i] &&
> > > > +			    sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> > > > +				reset_methods[i] = prio--;
> > > > +				break;
> > > > +			}
> > > > +		}
> > > > +
> > > > +		if (i == PCI_RESET_METHODS_NUM) {
> > > > +			kfree(options);
> > > > +			return -EINVAL;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	if (reset_methods[0] &&
> > > > +	    reset_methods[0] != PCI_RESET_METHODS_NUM)
> > > > +		pci_warn(pdev, "Device specific reset disabled/de-prioritized by user");
> > >
> > > Is there a specific reason for this warning?  Is it just telling the
> > > user that he might have shot himself in the foot?  Not sure that's
> > > necessary.
> > >
> > I think generally presence of device specific reset method means other
> > methods are potentially broken. Is it okay to skip this?
>
> We might want a warning at reset-time if all the methods failed,
> because that means we may leak state between users.  Maybe we also
> want one here, if *all* reset methods are disabled.  I don't really
> like special treatment of device-specific methods here because it
> depends on the assumption that "device-specific means all other resets
> are broken."  That's hard to maintain.
>
> Bjorn
Makes sense. I'll update this.

Thanks,
Amey

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 2/8] PCI: Add new array for keeping track of ordering of reset methods
  2021-06-21 17:15         ` Amey Narkhede
@ 2021-06-21 18:37           ` Bjorn Helgaas
  0 siblings, 0 replies; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-21 18:37 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: Shanker R Donthineni, alex.williamson, Raphael Norwitz,
	linux-pci, linux-kernel, kw, Sinan Kaya, Len Brown,
	Rafael J . Wysocki

On Mon, Jun 21, 2021 at 10:45:18PM +0530, Amey Narkhede wrote:
> On 21/06/21 10:02AM, Shanker R Donthineni wrote:
> > On 6/18/21 12:22 PM, Amey Narkhede wrote:
> > > I wonder if this would be easier if dev->reset_methods[] contained
> > > indices into pci_reset_fn_methods[], highest priority first, with the
> > > priority being determined when dev->reset_methods[] is updated.  For
> > > example:
> > >
> > >   const struct pci_reset_fn_method pci_reset_fn_methods[] = {
> > >     { },                                                     # 0
> > >     { &pci_dev_specific_reset, .name = "device_specific" },  # 1
> > >     { &pci_dev_acpi_reset, .name = "acpi" },                 # 2
> > >     { &pcie_reset_flr, .name = "flr" },                      # 3
> > >     { &pci_af_flr, .name = "af_flr" },                       # 4
> > >     { &pci_pm_reset, .name = "pm" },                         # 5
> > >     { &pci_reset_bus_function, .name = "bus" },              # 6
> > >   };
> > >
> > >   dev->reset_methods[] = [1, 2, 3, 4, 5, 6]
> > >     means all reset methods are supported, in the default priority
> > >     order
> > >
> > >   dev->reset_methods[] = [1, 0, 0, 0, 0, 0]
> > >     means only pci_dev_specific_reset is supported
> > >
> > >   dev->reset_methods[] = [3, 5, 0, 0, 0, 0]
> > >     means pcie_reset_flr and pci_pm_reset are supported, in that
> > >     priority order
> >
> > What about keeping two bitmap fields 'resets_supported' and
> > 'resets_enabled' in pci_dev object and mange it through sysfs and
> > probe helper function. We can avoid two loops multiple paces and
> > takes only 2Bytes of memory to keep track resets.
> >
> > resets_supported  ---> initialized during pci_dev setup
> > resets_enabled ---> Exposed to userspace through sysfs by default set to resets_supported
> >
> > include/linux/pci.h:
> > ------------------------
> > /* Different types of PCI resets possible, lower number is higher priority */
> > #define PCI_RESET_METHOD_DEVSPEC     0
> > #define PCI_RESET_METHOD_ACPI            1
> > #define PCI_RESET_METHOD_FLR              2
> > #define PCI_RESET_METHOD_Af_FLR         3
> > #define PCI_RESET_METHOD_PM               4
> > #define PCI_RESET_METHOD_BUS             5
> > #define PCI_RESET_METHOD_MAX            6
> >
> > struct pci_dev {
> >     ...
> >         u8              resets_supported;
> >         u8              resets_enabled;
> > };
> >
> > static inline bool pci_reset_supported(struct pci_dev *dev)
> > {
> >         return !!(dev->resets_supported);
> > }
> >
> >
> > drivers/pci/pci.c:
> > --------------------
> > const struct pci_reset_fn_method pci_reset_fn_methods[PCI_RESET_METHOD_MAX] = {
> >         [PCI_RESET_METHOD_DEVSPEC] = { &pci_dev_specific_reset,
> >                                                                    .name = "device_specific" },
> >         [PCI_RESET_METHOD_ACPI] = { &pci_dev_acpi_reset, .name = "acpi" },
> >         [PCI_RESET_METHOD_FLR] = { &pcie_reset_flr, .name = "flr" },
> >         [PCI_RESET_METHOD_Af_FLR] = { &pci_af_flr, .name = "af_flr" },
> >         [PCI_RESET_METHOD_PM] = { &pci_pm_reset, .name = "pm" },
> >         [PCI_RESET_METHOD_BUS] = { &pci_reset_bus_function, .name = "bus" }
> > };
> >
> >
> > void pci_init_reset_methods(struct pci_dev *dev)
> > {
> >         int i, rc;
> >
> >         BUILD_BUG_ON(ARRAY_SIZE(pci_reset_fn_methods) != PCI_RESET_METHOD_MAX);
> >         might_sleep();
> >
> >         for (i = 0; i < PCI_RESET_METHOD_MAX; i++) {
> >                 rc = pci_reset_fn_methods[i].reset_fn(dev, PCI_RESET_PROBE);
> >                 if (!rc)
> >                         dev->resets_supported |= BIT(i);
> >                 else if (rc != -ENOTTY)
> >                         break;
> >         }
> >         dev->resets_enabled = dev->resets_supported;
> > }
> >
> > int __pci_reset_function_locked(struct pci_dev *dev)
> > {
> >         int i, rc = -ENOTTY;
> >
> >         might_sleep();
> >
> >         for (i = 0; i < PCI_RESET_METHOD_MAX; i++) {
> >                 if (dev->resets_enabled & BIT(i)) {
> >                         rc = pci_reset_fn_methods[i].reset_fn(dev, 0);
> >                         if (rc != -ENOTTY)
> >                                 return rc;
> >                 }
> >         }
> >
> >         if (rc == -ENOTTY)
> >                 pci_warn(dev, "No reset happened reason %s\n",
> >                          !!dev->resets_supported ?
> >                          "disabled by user" : "not supported");
> >
> >         return rc;
> > }
> >
> > drivers/pci/pci-sysfs.c
> > ----------------------------
> > static ssize_t reset_method_store(struct device *dev,
> >                                   struct device_attribute *attr,
> >                                   const char *buf, size_t count)
> > {
> >         struct pci_dev *pdev = to_pci_dev(dev);
> >         u8 resets_enabled = 0;
> > ...
> >         if (sysfs_streq(options, "default")) {
> >                 pdev->resets_enabled = pdev->resets_supported;
> >                 goto set_reset_methods;
> >         }
> >
> >         while ((name = strsep(&options, ",")) != NULL) {
> >                 if (sysfs_streq(name, ""))
> >                         continue;
> >                 name = strim(name);
> >
> >                 for (i = 0; i < PCI_RESET_METHOD_MAX; i++) {
> >                         if ((pdev->resets_supported & BIT(i)) &&
> >                             sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> >                                 resets_enabled |= BIT(i);
> >                                 break;
> >                         }
> >                 }
> > ...
> >         }
> >
> > set_reset_methods:
> >         kfree(options);
> >         pdev->resets_enabled =  resets_enabled;
> >         return count;
> > }
> >
> > static ssize_t reset_method_show(struct device *dev,
> >                                  struct device_attribute *attr,
> >                                  char *buf)
> > {
> >         struct pci_dev *pdev = to_pci_dev(dev);
> >         ssize_t len = 0;
> >         int i;
> >
> >         for (i = 0; i < PCI_RESET_METHOD_MAX; i++) {
> >                 if (pdev->resets_enabled & BIT(i))
> >                         len += sysfs_emit_at(buf, len, "%s%s",
> >                                              len ? "," : "",
> >                                              pci_reset_fn_methods[i].name);
> >         }
> >         len += sysfs_emit_at(buf, len, len ? "\n" : "");
> >
> >         return len;
> > }
> >
> Thank you for the idea.
> Actually that would be coming full circle because Alex, Raphael and
> I tried similar approach earlier while prototyping for v2 but this
> implementation does look better than what I had at that time.

I thought part of the point of this series was to allow the user to
change the *order* of reset types.  I don't think we can control the
ordering if we only keep a bit (or even two) per reset type.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-21 17:28         ` Amey Narkhede
@ 2021-06-21 19:07           ` Bjorn Helgaas
  2021-06-21 19:33             ` Amey Narkhede
  0 siblings, 1 reply; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-21 19:07 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On Mon, Jun 21, 2021 at 10:58:54PM +0530, Amey Narkhede wrote:
> On 21/06/21 08:01AM, Bjorn Helgaas wrote:
> > On Sat, Jun 19, 2021 at 07:29:20PM +0530, Amey Narkhede wrote:
> > > On 21/06/18 03:00PM, Bjorn Helgaas wrote:
> > > > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > > > > Add reset_method sysfs attribute to enable user to
> > > > > query and set user preferred device reset methods and
> > > > > their ordering.
> >
> > > > > +	if (sysfs_streq(options, "default")) {
> > > > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> > > > > +			reset_methods[i] = reset_methods[i] ? prio-- : 0;
> > > > > +		goto set_reset_methods;
> > > > > +	}
> > > >
> > > > If you use pci_init_reset_methods() here, you can also get this case
> > > > out of the way early.
> > > >
> > > The problem with alternate encoding is we won't be able to know if
> > > one of the reset methods was disabled previously. For example,
> > >
> > > # cat reset_methods
> > > flr,bus 			# dev->reset_methods = [3, 5, 0, ...]
> > > # echo bus > reset_methods 	# dev->reset_methods = [5, 0, 0, ...]
> > > # cat reset_methods
> > > bus
> > >
> > > Now if an user wants to enable flr
> > >
> > > # echo flr > reset_methods 	# dev->reset_methods = [3, 0, 0, ...]
> > > OR
> > > # echo bus,flr > reset_methods 	# dev->reset_methods = [5, 3, 0, ...]
> > >
> > > either they need to write "default" first then flr or we will need to
> > > reprobe reset methods each time when user writes to reset_method attribute.
> >
> > Not sure I completely understand the problem here.  I think relying on
> > previous state that is invisible to the user is a little problematic
> > because it's hard for the user to predict what will happen.
> >
> > If the user enables a method that was previously "disabled" because
> > the probe failed, won't the reset method itself just fail with
> > -ENOTTY?  Is that a problem?
> >
> I think I didn't explain this correctly. With current implementation
> its not necessary to explicitly set *order of availabe* reset methods.
> User can directly write a single supported reset method only and then perform
> the reset. Side effect of that is other methods are disabled if user
> writes single or less than available number of supported reset method.
> Current implementation is able to handle this case but with new encoding
> we'll need to reprobe reset methods everytime because we have no way
> of distingushing supported and currently enabled reset method.

I'm confused.  I thought the point of the nested loops to find the
highest priority enabled reset method was to allow the user to control
the order.  The sysfs doc says writing "reset_method" sets the "reset
methods and their ordering."

It seems complicated to track "supported" and "enabled" separately,
and I don't know what the benefit is.  If we write "reset_method" to
enable reset X, can we just probe reset X to see if it's supported?

Bjorn

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-21 19:07           ` Bjorn Helgaas
@ 2021-06-21 19:33             ` Amey Narkhede
  2021-06-23 12:06               ` Bjorn Helgaas
  0 siblings, 1 reply; 52+ messages in thread
From: Amey Narkhede @ 2021-06-21 19:33 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On 21/06/21 02:07PM, Bjorn Helgaas wrote:
> On Mon, Jun 21, 2021 at 10:58:54PM +0530, Amey Narkhede wrote:
> > On 21/06/21 08:01AM, Bjorn Helgaas wrote:
> > > On Sat, Jun 19, 2021 at 07:29:20PM +0530, Amey Narkhede wrote:
> > > > On 21/06/18 03:00PM, Bjorn Helgaas wrote:
> > > > > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > > > > > Add reset_method sysfs attribute to enable user to
> > > > > > query and set user preferred device reset methods and
> > > > > > their ordering.
> > >
> > > > > > +	if (sysfs_streq(options, "default")) {
> > > > > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> > > > > > +			reset_methods[i] = reset_methods[i] ? prio-- : 0;
> > > > > > +		goto set_reset_methods;
> > > > > > +	}
> > > > >
> > > > > If you use pci_init_reset_methods() here, you can also get this case
> > > > > out of the way early.
> > > > >
> > > > The problem with alternate encoding is we won't be able to know if
> > > > one of the reset methods was disabled previously. For example,
> > > >
> > > > # cat reset_methods
> > > > flr,bus 			# dev->reset_methods = [3, 5, 0, ...]
> > > > # echo bus > reset_methods 	# dev->reset_methods = [5, 0, 0, ...]
> > > > # cat reset_methods
> > > > bus
> > > >
> > > > Now if an user wants to enable flr
> > > >
> > > > # echo flr > reset_methods 	# dev->reset_methods = [3, 0, 0, ...]
> > > > OR
> > > > # echo bus,flr > reset_methods 	# dev->reset_methods = [5, 3, 0, ...]
> > > >
> > > > either they need to write "default" first then flr or we will need to
> > > > reprobe reset methods each time when user writes to reset_method attribute.
> > >
> > > Not sure I completely understand the problem here.  I think relying on
> > > previous state that is invisible to the user is a little problematic
> > > because it's hard for the user to predict what will happen.
> > >
> > > If the user enables a method that was previously "disabled" because
> > > the probe failed, won't the reset method itself just fail with
> > > -ENOTTY?  Is that a problem?
> > >
> > I think I didn't explain this correctly. With current implementation
> > its not necessary to explicitly set *order of availabe* reset methods.
> > User can directly write a single supported reset method only and then perform
> > the reset. Side effect of that is other methods are disabled if user
> > writes single or less than available number of supported reset method.
> > Current implementation is able to handle this case but with new encoding
> > we'll need to reprobe reset methods everytime because we have no way
> > of distingushing supported and currently enabled reset method.
>
> I'm confused.  I thought the point of the nested loops to find the
> highest priority enabled reset method was to allow the user to control
> the order.  The sysfs doc says writing "reset_method" sets the "reset
> methods and their ordering."
>
> It seems complicated to track "supported" and "enabled" separately,
> and I don't know what the benefit is.  If we write "reset_method" to
> enable reset X, can we just probe reset X to see if it's supported?
>
> Bjorn
Although final result is same whether user writes a supported reset method or
their ordering that is,
# echo bus > reset_methods
and
# echo bus,flr > reset_methods

are the same but in the first version, users don't have to explicitly
set the ordering if they just want to perform bus reset.
Current implementation allows the flexibility for switching between
first and second option.

Does this address your doubt?

Thanks,
Amey

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-21 19:33             ` Amey Narkhede
@ 2021-06-23 12:06               ` Bjorn Helgaas
  2021-06-23 14:07                 ` Amey Narkhede
  0 siblings, 1 reply; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-23 12:06 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On Tue, Jun 22, 2021 at 01:03:07AM +0530, Amey Narkhede wrote:
> On 21/06/21 02:07PM, Bjorn Helgaas wrote:
> > On Mon, Jun 21, 2021 at 10:58:54PM +0530, Amey Narkhede wrote:
> > > On 21/06/21 08:01AM, Bjorn Helgaas wrote:
> > > > On Sat, Jun 19, 2021 at 07:29:20PM +0530, Amey Narkhede wrote:
> > > > > On 21/06/18 03:00PM, Bjorn Helgaas wrote:
> > > > > > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > > > > > > Add reset_method sysfs attribute to enable user to
> > > > > > > query and set user preferred device reset methods and
> > > > > > > their ordering.
> > > >
> > > > > > > +	if (sysfs_streq(options, "default")) {
> > > > > > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> > > > > > > +			reset_methods[i] = reset_methods[i] ? prio-- : 0;
> > > > > > > +		goto set_reset_methods;
> > > > > > > +	}
> > > > > >
> > > > > > If you use pci_init_reset_methods() here, you can also get this case
> > > > > > out of the way early.
> > > > > >
> > > > > The problem with alternate encoding is we won't be able to know if
> > > > > one of the reset methods was disabled previously. For example,
> > > > >
> > > > > # cat reset_methods
> > > > > flr,bus 			# dev->reset_methods = [3, 5, 0, ...]
> > > > > # echo bus > reset_methods 	# dev->reset_methods = [5, 0, 0, ...]
> > > > > # cat reset_methods
> > > > > bus
> > > > >
> > > > > Now if an user wants to enable flr
> > > > >
> > > > > # echo flr > reset_methods 	# dev->reset_methods = [3, 0, 0, ...]
> > > > > OR
> > > > > # echo bus,flr > reset_methods 	# dev->reset_methods = [5, 3, 0, ...]
> > > > >
> > > > > either they need to write "default" first then flr or we will need to
> > > > > reprobe reset methods each time when user writes to reset_method attribute.
> > > >
> > > > Not sure I completely understand the problem here.  I think relying on
> > > > previous state that is invisible to the user is a little problematic
> > > > because it's hard for the user to predict what will happen.
> > > >
> > > > If the user enables a method that was previously "disabled" because
> > > > the probe failed, won't the reset method itself just fail with
> > > > -ENOTTY?  Is that a problem?
> > > >
> > > I think I didn't explain this correctly. With current implementation
> > > its not necessary to explicitly set *order of availabe* reset methods.
> > > User can directly write a single supported reset method only and then perform
> > > the reset. Side effect of that is other methods are disabled if user
> > > writes single or less than available number of supported reset method.
> > > Current implementation is able to handle this case but with new encoding
> > > we'll need to reprobe reset methods everytime because we have no way
> > > of distingushing supported and currently enabled reset method.
> >
> > I'm confused.  I thought the point of the nested loops to find the
> > highest priority enabled reset method was to allow the user to control
> > the order.  The sysfs doc says writing "reset_method" sets the "reset
> > methods and their ordering."
> >
> > It seems complicated to track "supported" and "enabled" separately,
> > and I don't know what the benefit is.  If we write "reset_method" to
> > enable reset X, can we just probe reset X to see if it's supported?
>
> Although final result is same whether user writes a supported reset method or
> their ordering that is,
> # echo bus > reset_methods
> and
> # echo bus,flr > reset_methods
> 
> are the same but in the first version, users don't have to explicitly
> set the ordering if they just want to perform bus reset.
> Current implementation allows the flexibility for switching between
> first and second option.

Sorry, I can't quite make sense of the above.

Your doc implies the following are different:

  # echo bus,flr > reset_methods
  # echo flr,bus > reset_methods

Are they?  If you don't need to provide control over the order of
trying resets, this can all be simplified quite a bit.

Bjorn

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-23 12:06               ` Bjorn Helgaas
@ 2021-06-23 14:07                 ` Amey Narkhede
  2021-06-23 17:56                   ` Amey Narkhede
  0 siblings, 1 reply; 52+ messages in thread
From: Amey Narkhede @ 2021-06-23 14:07 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On 21/06/23 07:06AM, Bjorn Helgaas wrote:
> On Tue, Jun 22, 2021 at 01:03:07AM +0530, Amey Narkhede wrote:
> > On 21/06/21 02:07PM, Bjorn Helgaas wrote:
> > > On Mon, Jun 21, 2021 at 10:58:54PM +0530, Amey Narkhede wrote:
> > > > On 21/06/21 08:01AM, Bjorn Helgaas wrote:
> > > > > On Sat, Jun 19, 2021 at 07:29:20PM +0530, Amey Narkhede wrote:
> > > > > > On 21/06/18 03:00PM, Bjorn Helgaas wrote:
> > > > > > > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > > > > > > > Add reset_method sysfs attribute to enable user to
> > > > > > > > query and set user preferred device reset methods and
> > > > > > > > their ordering.
> > > > >
> > > > > > > > +	if (sysfs_streq(options, "default")) {
> > > > > > > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> > > > > > > > +			reset_methods[i] = reset_methods[i] ? prio-- : 0;
> > > > > > > > +		goto set_reset_methods;
> > > > > > > > +	}
> > > > > > >
> > > > > > > If you use pci_init_reset_methods() here, you can also get this case
> > > > > > > out of the way early.
> > > > > > >
> > > > > > The problem with alternate encoding is we won't be able to know if
> > > > > > one of the reset methods was disabled previously. For example,
> > > > > >
> > > > > > # cat reset_methods
> > > > > > flr,bus 			# dev->reset_methods = [3, 5, 0, ...]
> > > > > > # echo bus > reset_methods 	# dev->reset_methods = [5, 0, 0, ...]
> > > > > > # cat reset_methods
> > > > > > bus
> > > > > >
> > > > > > Now if an user wants to enable flr
> > > > > >
> > > > > > # echo flr > reset_methods 	# dev->reset_methods = [3, 0, 0, ...]
> > > > > > OR
> > > > > > # echo bus,flr > reset_methods 	# dev->reset_methods = [5, 3, 0, ...]
> > > > > >
> > > > > > either they need to write "default" first then flr or we will need to
> > > > > > reprobe reset methods each time when user writes to reset_method attribute.
> > > > >
> > > > > Not sure I completely understand the problem here.  I think relying on
> > > > > previous state that is invisible to the user is a little problematic
> > > > > because it's hard for the user to predict what will happen.
> > > > >
> > > > > If the user enables a method that was previously "disabled" because
> > > > > the probe failed, won't the reset method itself just fail with
> > > > > -ENOTTY?  Is that a problem?
> > > > >
> > > > I think I didn't explain this correctly. With current implementation
> > > > its not necessary to explicitly set *order of availabe* reset methods.
> > > > User can directly write a single supported reset method only and then perform
> > > > the reset. Side effect of that is other methods are disabled if user
> > > > writes single or less than available number of supported reset method.
> > > > Current implementation is able to handle this case but with new encoding
> > > > we'll need to reprobe reset methods everytime because we have no way
> > > > of distingushing supported and currently enabled reset method.
> > >
> > > I'm confused.  I thought the point of the nested loops to find the
> > > highest priority enabled reset method was to allow the user to control
> > > the order.  The sysfs doc says writing "reset_method" sets the "reset
> > > methods and their ordering."
> > >
> > > It seems complicated to track "supported" and "enabled" separately,
> > > and I don't know what the benefit is.  If we write "reset_method" to
> > > enable reset X, can we just probe reset X to see if it's supported?
> >
> > Although final result is same whether user writes a supported reset method or
> > their ordering that is,
> > # echo bus > reset_methods
> > and
> > # echo bus,flr > reset_methods
> >
> > are the same but in the first version, users don't have to explicitly
> > set the ordering if they just want to perform bus reset.
> > Current implementation allows the flexibility for switching between
> > first and second option.
>
> Sorry, I can't quite make sense of the above.
>
> Your doc implies the following are different:
>
>   # echo bus,flr > reset_methods
>   # echo flr,bus > reset_methods
>
> Are they?  If you don't need to provide control over the order of
> trying resets, this can all be simplified quite a bit.
>
> Bjorn
The v1 of the patch series allowed only single reset method to be
written instead of ordering of all supported reset methods.
With your example, current implementation allows both writing single
value and list of supported reset methods.

# echo bus > reset_methods
and
# echo bus,flr > reset_methods

OR

# echo flr > reset_methods
and
echo flr,bus > reset_methods

Its more of a preference than a functional point. Ultimately the
__pci_reset_function_locked() function will only perform 'bus' reset in
this example because we make sure 'reset_methods' file only contains
valid device supported reset methods all the time so
__pci_reset_function_locked() won't go into -ENOTTY case but with new
encoding theres no way to make sure `reset_methods`sysfs attribute will
contain valid supported reset methods all the time because of the reset
methods can be masked implicitly if user uses first option of writing
only single value.

My point is current implementation allows more flexibility for the user.

Thanks,
Amey

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-21 13:01       ` Bjorn Helgaas
  2021-06-21 17:28         ` Amey Narkhede
@ 2021-06-23 17:21         ` Alex Williamson
  1 sibling, 0 replies; 52+ messages in thread
From: Alex Williamson @ 2021-06-23 17:21 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Amey Narkhede, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On Mon, 21 Jun 2021 08:01:35 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Sat, Jun 19, 2021 at 07:29:20PM +0530, Amey Narkhede wrote:
> > On 21/06/18 03:00PM, Bjorn Helgaas wrote:  
> > > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:  
> > > > +	while ((name = strsep(&options, ",")) != NULL) {
> > > > +		if (sysfs_streq(name, ""))
> > > > +			continue;
> > > > +
> > > > +		name = strim(name);
> > > > +
> > > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> > > > +			if (reset_methods[i] &&
> > > > +			    sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> > > > +				reset_methods[i] = prio--;
> > > > +				break;
> > > > +			}
> > > > +		}
> > > > +
> > > > +		if (i == PCI_RESET_METHODS_NUM) {
> > > > +			kfree(options);
> > > > +			return -EINVAL;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	if (reset_methods[0] &&
> > > > +	    reset_methods[0] != PCI_RESET_METHODS_NUM)
> > > > +		pci_warn(pdev, "Device specific reset disabled/de-prioritized by user");  
> > >
> > > Is there a specific reason for this warning?  Is it just telling the
> > > user that he might have shot himself in the foot?  Not sure that's
> > > necessary.
> > >  
> > I think generally presence of device specific reset method means other
> > methods are potentially broken. Is it okay to skip this?  
> 
> We might want a warning at reset-time if all the methods failed,
> because that means we may leak state between users.  Maybe we also
> want one here, if *all* reset methods are disabled.  I don't really
> like special treatment of device-specific methods here because it
> depends on the assumption that "device-specific means all other resets
> are broken."  That's hard to maintain.

I'd say the device specific reset is special.  The device itself can
support a number of resets and they're theoretically all equivalent,
it's a policy decision which to use.  But the device specific reset is
a software provided reset.  Someone has specifically gone to the
trouble to create a reset mechanism that is in some way better than the
other methods.  Not using that one by default sure feels like something
worthy of leaving a breadcrumb in dmesg for debugging.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-23 14:07                 ` Amey Narkhede
@ 2021-06-23 17:56                   ` Amey Narkhede
  0 siblings, 0 replies; 52+ messages in thread
From: Amey Narkhede @ 2021-06-23 17:56 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On 21/06/23 07:37PM, Amey Narkhede wrote:
> On 21/06/23 07:06AM, Bjorn Helgaas wrote:
> > On Tue, Jun 22, 2021 at 01:03:07AM +0530, Amey Narkhede wrote:
> > > On 21/06/21 02:07PM, Bjorn Helgaas wrote:
> > > > On Mon, Jun 21, 2021 at 10:58:54PM +0530, Amey Narkhede wrote:
> > > > > On 21/06/21 08:01AM, Bjorn Helgaas wrote:
> > > > > > On Sat, Jun 19, 2021 at 07:29:20PM +0530, Amey Narkhede wrote:
> > > > > > > On 21/06/18 03:00PM, Bjorn Helgaas wrote:
> > > > > > > > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > > > > > > > > Add reset_method sysfs attribute to enable user to
> > > > > > > > > query and set user preferred device reset methods and
> > > > > > > > > their ordering.
> > > > > >
> > > > > > > > > +	if (sysfs_streq(options, "default")) {
> > > > > > > > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++)
> > > > > > > > > +			reset_methods[i] = reset_methods[i] ? prio-- : 0;
> > > > > > > > > +		goto set_reset_methods;
> > > > > > > > > +	}
> > > > > > > >
> > > > > > > > If you use pci_init_reset_methods() here, you can also get this case
> > > > > > > > out of the way early.
> > > > > > > >
> > > > > > > The problem with alternate encoding is we won't be able to know if
> > > > > > > one of the reset methods was disabled previously. For example,
> > > > > > >
> > > > > > > # cat reset_methods
> > > > > > > flr,bus 			# dev->reset_methods = [3, 5, 0, ...]
> > > > > > > # echo bus > reset_methods 	# dev->reset_methods = [5, 0, 0, ...]
> > > > > > > # cat reset_methods
> > > > > > > bus
> > > > > > >
> > > > > > > Now if an user wants to enable flr
> > > > > > >
> > > > > > > # echo flr > reset_methods 	# dev->reset_methods = [3, 0, 0, ...]
> > > > > > > OR
> > > > > > > # echo bus,flr > reset_methods 	# dev->reset_methods = [5, 3, 0, ...]
> > > > > > >
> > > > > > > either they need to write "default" first then flr or we will need to
> > > > > > > reprobe reset methods each time when user writes to reset_method attribute.
> > > > > >
> > > > > > Not sure I completely understand the problem here.  I think relying on
> > > > > > previous state that is invisible to the user is a little problematic
> > > > > > because it's hard for the user to predict what will happen.
> > > > > >
> > > > > > If the user enables a method that was previously "disabled" because
> > > > > > the probe failed, won't the reset method itself just fail with
> > > > > > -ENOTTY?  Is that a problem?
> > > > > >
> > > > > I think I didn't explain this correctly. With current implementation
> > > > > its not necessary to explicitly set *order of availabe* reset methods.
> > > > > User can directly write a single supported reset method only and then perform
> > > > > the reset. Side effect of that is other methods are disabled if user
> > > > > writes single or less than available number of supported reset method.
> > > > > Current implementation is able to handle this case but with new encoding
> > > > > we'll need to reprobe reset methods everytime because we have no way
> > > > > of distingushing supported and currently enabled reset method.
> > > >
> > > > I'm confused.  I thought the point of the nested loops to find the
> > > > highest priority enabled reset method was to allow the user to control
> > > > the order.  The sysfs doc says writing "reset_method" sets the "reset
> > > > methods and their ordering."
> > > >
> > > > It seems complicated to track "supported" and "enabled" separately,
> > > > and I don't know what the benefit is.  If we write "reset_method" to
> > > > enable reset X, can we just probe reset X to see if it's supported?
> > >
> > > Although final result is same whether user writes a supported reset method or
> > > their ordering that is,
> > > # echo bus > reset_methods
> > > and
> > > # echo bus,flr > reset_methods
> > >
> > > are the same but in the first version, users don't have to explicitly
> > > set the ordering if they just want to perform bus reset.
> > > Current implementation allows the flexibility for switching between
> > > first and second option.
> >
> > Sorry, I can't quite make sense of the above.
> >
> > Your doc implies the following are different:
> >
> >   # echo bus,flr > reset_methods
> >   # echo flr,bus > reset_methods
> >
> > Are they?  If you don't need to provide control over the order of
> > trying resets, this can all be simplified quite a bit.
> >
> > Bjorn
> The v1 of the patch series allowed only single reset method to be
> written instead of ordering of all supported reset methods.
> With your example, current implementation allows both writing single
> value and list of supported reset methods.
>
> # echo bus > reset_methods
> and
> # echo bus,flr > reset_methods
>
> OR
>
> # echo flr > reset_methods
> and
> echo flr,bus > reset_methods
>
# echo flr,bus and echo bus,flr are different.

> Its more of a preference than a functional point. Ultimately the
> __pci_reset_function_locked() function will only perform 'bus' reset in
> this example because we make sure 'reset_methods' file only contains
> valid device supported reset methods all the time so
> __pci_reset_function_locked() won't go into -ENOTTY case but with new
Oops I'm wrong here. __pci_reset_function_locked() can return -ENOTTY
and follow through if a reset fails.

Rest of the point should hold.
> encoding theres no way to make sure `reset_methods`sysfs attribute will
> contain valid supported reset methods all the time because of the reset
> methods can be masked implicitly if user uses first option of writing
> only single value.
>
> My point is current implementation allows more flexibility for the user.
>
> Thanks,
> Amey

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-08  5:48 ` [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism Amey Narkhede
                     ` (2 preceding siblings ...)
  2021-06-18 20:00   ` Bjorn Helgaas
@ 2021-06-24 12:15   ` Bjorn Helgaas
  2021-06-24 15:12     ` Amey Narkhede
  3 siblings, 1 reply; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-24 12:15 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: Bjorn Helgaas, alex.williamson, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Shanker Donthineni, Sinan Kaya, Len Brown,
	Rafael J . Wysocki

On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> Add reset_method sysfs attribute to enable user to
> query and set user preferred device reset methods and
> their ordering.

> +		Writing the name or comma separated list of names of any of
> +		the device supported reset methods to this file will set the
> +		reset methods and their ordering to be used when resetting
> +		the device.

> +	while ((name = strsep(&options, ",")) != NULL) {
> +		if (sysfs_streq(name, ""))
> +			continue;
> +
> +		name = strim(name);
> +
> +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> +			if (reset_methods[i] &&
> +			    sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> +				reset_methods[i] = prio--;
> +				break;
> +			}
> +		}
> +
> +		if (i == PCI_RESET_METHODS_NUM) {
> +			kfree(options);
> +			return -EINVAL;
> +		}
> +	}

Asking again since we didn't get this clarified before.  The above
tells me that "reset_methods" allows the user to control the *order*
in which we try reset methods.

Consider the following two uses:

  (1) # echo bus,flr > reset_methods

  (2) # echo flr,bus > reset_methods

Do these have the same effect or not?

If "reset_methods" allows control over the order, I expect them to be
different: (1) would try a bus reset and, if the bus reset failed, an
FLR, while (2) would try an FLR and, if the FLR failed, a bus reset.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods
  2021-06-08  5:48 ` [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods Amey Narkhede
  2021-06-10 20:15   ` Shanker R Donthineni
  2021-06-17 21:57   ` Bjorn Helgaas
@ 2021-06-24 12:23   ` Bjorn Helgaas
  2021-06-24 15:28     ` Amey Narkhede
  2 siblings, 1 reply; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-24 12:23 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: Bjorn Helgaas, alex.williamson, Raphael Norwitz, linux-pci,
	linux-kernel, kw, Shanker Donthineni, Sinan Kaya, Len Brown,
	Rafael J . Wysocki

On Tue, Jun 08, 2021 at 11:18:50AM +0530, Amey Narkhede wrote:
> Currently there is separate function pcie_has_flr() to probe if pcie flr is
> supported by the device which does not match the calling convention
> followed by reset methods which use second function argument to decide
> whether to probe or not.  Add new function pcie_reset_flr() that follows
> the calling convention of reset methods.

> +/**
> + * pcie_reset_flr - initiate a PCIe function level reset
> + * @dev: device to reset
> + * @probe: If set, only check if the device can be reset this way.
> + *
> + * Initiate a function level reset on @dev.
> + */
> +int pcie_reset_flr(struct pci_dev *dev, int probe)
> +{
> +	u32 cap;
> +
> +	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
> +		return -ENOTTY;
> +
> +	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
> +	if (!(cap & PCI_EXP_DEVCAP_FLR))
> +		return -ENOTTY;
> +
> +	if (probe)
> +		return 0;
> +
> +	return pcie_flr(dev);
> +}

Tangent: I've been told before, but I can't remember why we need the
"probe" interface.  Since we're looking at this area again, can we add
a comment to clarify this?

Every time I read this, I wonder why we can't just get rid of the
probe and attempt a reset.  If it fails because it's not supported, we
could just try the next one in the list.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-24 12:15   ` Bjorn Helgaas
@ 2021-06-24 15:12     ` Amey Narkhede
  2021-06-24 16:56       ` Bjorn Helgaas
  0 siblings, 1 reply; 52+ messages in thread
From: Amey Narkhede @ 2021-06-24 15:12 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On 21/06/24 07:15AM, Bjorn Helgaas wrote:
> On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > Add reset_method sysfs attribute to enable user to
> > query and set user preferred device reset methods and
> > their ordering.
>
> > +		Writing the name or comma separated list of names of any of
> > +		the device supported reset methods to this file will set the
> > +		reset methods and their ordering to be used when resetting
> > +		the device.
>
> > +	while ((name = strsep(&options, ",")) != NULL) {
> > +		if (sysfs_streq(name, ""))
> > +			continue;
> > +
> > +		name = strim(name);
> > +
> > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> > +			if (reset_methods[i] &&
> > +			    sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> > +				reset_methods[i] = prio--;
> > +				break;
> > +			}
> > +		}
> > +
> > +		if (i == PCI_RESET_METHODS_NUM) {
> > +			kfree(options);
> > +			return -EINVAL;
> > +		}
> > +	}
>
> Asking again since we didn't get this clarified before.  The above
> tells me that "reset_methods" allows the user to control the *order*
> in which we try reset methods.
>
> Consider the following two uses:
>
>   (1) # echo bus,flr > reset_methods
>
>   (2) # echo flr,bus > reset_methods
>
> Do these have the same effect or not?
>
They have different effect.
> If "reset_methods" allows control over the order, I expect them to be
> different: (1) would try a bus reset and, if the bus reset failed, an
> FLR, while (2) would try an FLR and, if the FLR failed, a bus reset.
Exactly you are right.

Now the point I was presenting was with new encoding we have to write
list of *all of the supported reset methods* in order for example, in
above example flr,bus or bus,flr. We can't just write 'flr' or 'bus'
then switch back to writing flr,bus/bus,flr(these have different effect
as mentioned earlier).

Basically with new encoding an user can't write subset of reset methods
they have to write list of *all* supported methods everytime.

Thanks,
Amey

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods
  2021-06-24 12:23   ` Bjorn Helgaas
@ 2021-06-24 15:28     ` Amey Narkhede
  2021-06-24 16:15       ` Bjorn Helgaas
  0 siblings, 1 reply; 52+ messages in thread
From: Amey Narkhede @ 2021-06-24 15:28 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On 21/06/24 07:23AM, Bjorn Helgaas wrote:
> On Tue, Jun 08, 2021 at 11:18:50AM +0530, Amey Narkhede wrote:
> > Currently there is separate function pcie_has_flr() to probe if pcie flr is
> > supported by the device which does not match the calling convention
> > followed by reset methods which use second function argument to decide
> > whether to probe or not.  Add new function pcie_reset_flr() that follows
> > the calling convention of reset methods.
>
> > +/**
> > + * pcie_reset_flr - initiate a PCIe function level reset
> > + * @dev: device to reset
> > + * @probe: If set, only check if the device can be reset this way.
> > + *
> > + * Initiate a function level reset on @dev.
> > + */
> > +int pcie_reset_flr(struct pci_dev *dev, int probe)
> > +{
> > +	u32 cap;
> > +
> > +	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
> > +		return -ENOTTY;
> > +
> > +	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
> > +	if (!(cap & PCI_EXP_DEVCAP_FLR))
> > +		return -ENOTTY;
> > +
> > +	if (probe)
> > +		return 0;
> > +
> > +	return pcie_flr(dev);
> > +}
>
> Tangent: I've been told before, but I can't remember why we need the
> "probe" interface.  Since we're looking at this area again, can we add
> a comment to clarify this?
>
> Every time I read this, I wonder why we can't just get rid of the
> probe and attempt a reset.  If it fails because it's not supported, we
> could just try the next one in the list.

Part of the reason is to have same calling convention as other reset
methods and other reason is devices that run in VMs where various
capabilities can be hidden or have quirks for avoiding known troublesome
combination of device features as Alex explained here
https://lore.kernel.org/linux-pci/20210624151242.ybew2z5rseuusj7v@archlinux/T/#mb67c09a2ce08ce4787652e4c0e7b9e5adf1df57a

On the side note as you suggested earlier to cache flr capability
earlier the PCI_EXP_DEVCAP reading code won't be there in next version
so its just trivial check(dev->has_flr).

Thanks,
Amey

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods
  2021-06-24 15:28     ` Amey Narkhede
@ 2021-06-24 16:15       ` Bjorn Helgaas
  2021-06-24 18:48         ` Alex Williamson
  0 siblings, 1 reply; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-24 16:15 UTC (permalink / raw)
  To: Amey Narkhede, Alex Williamson
  Cc: Raphael Norwitz, linux-pci, linux-kernel, kw, Shanker Donthineni,
	Sinan Kaya, Len Brown, Rafael J . Wysocki

[+to Alex]

On Thu, Jun 24, 2021 at 08:58:09PM +0530, Amey Narkhede wrote:
> On 21/06/24 07:23AM, Bjorn Helgaas wrote:
> > On Tue, Jun 08, 2021 at 11:18:50AM +0530, Amey Narkhede wrote:
> > > Currently there is separate function pcie_has_flr() to probe if pcie flr is
> > > supported by the device which does not match the calling convention
> > > followed by reset methods which use second function argument to decide
> > > whether to probe or not.  Add new function pcie_reset_flr() that follows
> > > the calling convention of reset methods.
> >
> > > +/**
> > > + * pcie_reset_flr - initiate a PCIe function level reset
> > > + * @dev: device to reset
> > > + * @probe: If set, only check if the device can be reset this way.
> > > + *
> > > + * Initiate a function level reset on @dev.
> > > + */
> > > +int pcie_reset_flr(struct pci_dev *dev, int probe)
> > > +{
> > > +	u32 cap;
> > > +
> > > +	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
> > > +		return -ENOTTY;
> > > +
> > > +	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
> > > +	if (!(cap & PCI_EXP_DEVCAP_FLR))
> > > +		return -ENOTTY;
> > > +
> > > +	if (probe)
> > > +		return 0;
> > > +
> > > +	return pcie_flr(dev);
> > > +}
> >
> > Tangent: I've been told before, but I can't remember why we need the
> > "probe" interface.  Since we're looking at this area again, can we add
> > a comment to clarify this?
> >
> > Every time I read this, I wonder why we can't just get rid of the
> > probe and attempt a reset.  If it fails because it's not supported, we
> > could just try the next one in the list.
> 
> Part of the reason is to have same calling convention as other reset
> methods and other reason is devices that run in VMs where various
> capabilities can be hidden or have quirks for avoiding known troublesome
> combination of device features as Alex explained here
> https://lore.kernel.org/linux-pci/20210624151242.ybew2z5rseuusj7v@archlinux/T/#mb67c09a2ce08ce4787652e4c0e7b9e5adf1df57a
> 
> On the side note as you suggested earlier to cache flr capability
> earlier the PCI_EXP_DEVCAP reading code won't be there in next version
> so its just trivial check(dev->has_flr).

Sorry, I didn't make my question clear.  I'm not asking why we're
adding a "probe" argument to pcie_reset_flr() to make it consistent
with pci_af_flr(), pci_pm_reset(), pci_parent_bus_reset(), etc.  I
like making the interfaces consistent.

What I'm asking here is why the "probe" argument exists for *any* of
these interfaces and why pci_probe_reset_function() exists.

This is really more a question for Alex since it's a historical
question, not anything directly related to your series.  I'm not
proposing *removing* the "probe" argument; I know it exists for a
reason because I've asked about it before.  But I forgot the answer,
which makes me think a hint in the code would be useful.

Bjorn

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-24 15:12     ` Amey Narkhede
@ 2021-06-24 16:56       ` Bjorn Helgaas
  2021-06-24 17:20         ` Shanker R Donthineni
  2021-06-24 17:28         ` Amey Narkhede
  0 siblings, 2 replies; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-24 16:56 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On Thu, Jun 24, 2021 at 08:42:42PM +0530, Amey Narkhede wrote:
> On 21/06/24 07:15AM, Bjorn Helgaas wrote:
> > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > > Add reset_method sysfs attribute to enable user to
> > > query and set user preferred device reset methods and
> > > their ordering.
> >
> > > +		Writing the name or comma separated list of names of any of
> > > +		the device supported reset methods to this file will set the
> > > +		reset methods and their ordering to be used when resetting
> > > +		the device.
> >
> > > +	while ((name = strsep(&options, ",")) != NULL) {
> > > +		if (sysfs_streq(name, ""))
> > > +			continue;
> > > +
> > > +		name = strim(name);
> > > +
> > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> > > +			if (reset_methods[i] &&
> > > +			    sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> > > +				reset_methods[i] = prio--;
> > > +				break;
> > > +			}
> > > +		}
> > > +
> > > +		if (i == PCI_RESET_METHODS_NUM) {
> > > +			kfree(options);
> > > +			return -EINVAL;
> > > +		}
> > > +	}
> >
> > Asking again since we didn't get this clarified before.  The above
> > tells me that "reset_methods" allows the user to control the
> > *order* in which we try reset methods.
> >
> > Consider the following two uses:
> >
> >   (1) # echo bus,flr > reset_methods
> >
> >   (2) # echo flr,bus > reset_methods
> >
> > Do these have the same effect or not?
> >
> They have different effect.

I asked about this because Shanker's idea [1] of using two bitmaps
only keeps track of which resets are *enabled*.  It does not keep
track of the *ordering*.  Since you want to control the ordering, I
think we need more state than just the supported/enabled bitmaps.

> > If "reset_methods" allows control over the order, I expect them to
> > be different: (1) would try a bus reset and, if the bus reset
> > failed, an FLR, while (2) would try an FLR and, if the FLR failed,
> > a bus reset.
>
> Exactly you are right.
> 
> Now the point I was presenting was with new encoding we have to
> write list of *all of the supported reset methods* in order for
> example, in above example flr,bus or bus,flr. We can't just write
> 'flr' or 'bus' then switch back to writing flr,bus/bus,flr (these
> have different effect as mentioned earlier).

It sounds like you're saying this sequence can't work:

  # echo flr > reset_methods
  # echo bus,flr > reset_methods

But I'm afraid you'll have to walk me through the reasons why this
can't be made to work.

> Basically with new encoding an user can't write subset of reset
> methods they have to write list of *all* supported methods
> everytime.

Why does the user have to write all supported methods?  Is that to
preserve the fact that "cat reset_methods" always shows all the
supported methods so the user knows what's available?

I'm wondering why we can't do something like this (pidgin code):

  if (option == "default") {
    pci_init_reset_methods(dev);
    return;
  }

  n = 0;
  foreach method in option {
    i = lookup_reset_method(method);
    if (pci_reset_methods[i].reset_fn(dev, PROBE) == 0)
      dev->reset_methods[n++] = i;           # method i supported
  }
  dev->reset_methods[n++] = 0;               # end of supported methods

If we did something like the above, the user could always find the
list of all methods supported by a device by doing this:

  # echo default > reset_methods
  # cat reset_methods

Yes, this does call the "probe" methods several times.  I don't think
that's necessarily a problem.

Bjorn

[1] https://lore.kernel.org/r/1fb0a184-908c-5f98-ef6d-74edc602c2e0@nvidia.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-24 16:56       ` Bjorn Helgaas
@ 2021-06-24 17:20         ` Shanker R Donthineni
  2021-06-24 17:28         ` Amey Narkhede
  1 sibling, 0 replies; 52+ messages in thread
From: Shanker R Donthineni @ 2021-06-24 17:20 UTC (permalink / raw)
  To: Bjorn Helgaas, Amey Narkhede
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Sinan Kaya, Len Brown, Rafael J . Wysocki



On 6/24/21 11:56 AM, Bjorn Helgaas wrote:
> Why does the user have to write all supported methods?  Is that to
> preserve the fact that "cat reset_methods" always shows all the
> supported methods so the user knows what's available?
>
> I'm wondering why we can't do something like this (pidgin code):
>
>   if (option == "default") {
>     pci_init_reset_methods(dev);
>     return;
>   }
>
>   n = 0;
>   foreach method in option {
>     i = lookup_reset_method(method);
>     if (pci_reset_methods[i].reset_fn(dev, PROBE) == 0)
>       dev->reset_methods[n++] = i;           # method i supported
>   }
>   dev->reset_methods[n++] = 0;               # end of supported methods
>
> If we did something like the above, the user could always find the
> list of all methods supported by a device by doing this:
>
>   # echo default > reset_methods
>   # cat reset_methods
>
> Yes, this does call the "probe" methods several times.  I don't think
> that's necessarily a problem.
Agree, I don't think admin/user will change reset methods frequently and no
side effects or performance impacts on probing multiple times.   

We should support enabling partially ordered reset methods and restore
default methods either by re-probing resets or remember supported
resets in pci_dev.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-24 16:56       ` Bjorn Helgaas
  2021-06-24 17:20         ` Shanker R Donthineni
@ 2021-06-24 17:28         ` Amey Narkhede
  2021-06-24 17:59           ` Bjorn Helgaas
  1 sibling, 1 reply; 52+ messages in thread
From: Amey Narkhede @ 2021-06-24 17:28 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On 21/06/24 11:56AM, Bjorn Helgaas wrote:
> On Thu, Jun 24, 2021 at 08:42:42PM +0530, Amey Narkhede wrote:
> > On 21/06/24 07:15AM, Bjorn Helgaas wrote:
> > > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > > > Add reset_method sysfs attribute to enable user to
> > > > query and set user preferred device reset methods and
> > > > their ordering.
> > >
> > > > +		Writing the name or comma separated list of names of any of
> > > > +		the device supported reset methods to this file will set the
> > > > +		reset methods and their ordering to be used when resetting
> > > > +		the device.
> > >
> > > > +	while ((name = strsep(&options, ",")) != NULL) {
> > > > +		if (sysfs_streq(name, ""))
> > > > +			continue;
> > > > +
> > > > +		name = strim(name);
> > > > +
> > > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> > > > +			if (reset_methods[i] &&
> > > > +			    sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> > > > +				reset_methods[i] = prio--;
> > > > +				break;
> > > > +			}
> > > > +		}
> > > > +
> > > > +		if (i == PCI_RESET_METHODS_NUM) {
> > > > +			kfree(options);
> > > > +			return -EINVAL;
> > > > +		}
> > > > +	}
> > >
> > > Asking again since we didn't get this clarified before.  The above
> > > tells me that "reset_methods" allows the user to control the
> > > *order* in which we try reset methods.
> > >
> > > Consider the following two uses:
> > >
> > >   (1) # echo bus,flr > reset_methods
> > >
> > >   (2) # echo flr,bus > reset_methods
> > >
> > > Do these have the same effect or not?
> > >
> > They have different effect.
>
> I asked about this because Shanker's idea [1] of using two bitmaps
> only keeps track of which resets are *enabled*.  It does not keep
> track of the *ordering*.  Since you want to control the ordering, I
> think we need more state than just the supported/enabled bitmaps.
>
> > > If "reset_methods" allows control over the order, I expect them to
> > > be different: (1) would try a bus reset and, if the bus reset
> > > failed, an FLR, while (2) would try an FLR and, if the FLR failed,
> > > a bus reset.
> >
> > Exactly you are right.
> >
> > Now the point I was presenting was with new encoding we have to
> > write list of *all of the supported reset methods* in order for
> > example, in above example flr,bus or bus,flr. We can't just write
> > 'flr' or 'bus' then switch back to writing flr,bus/bus,flr (these
> > have different effect as mentioned earlier).
>
> It sounds like you're saying this sequence can't work:
>
>   # echo flr > reset_methods
# dev->reset_methods = [3, 0, 0, ..]
>   # echo bus,flr > reset_methods
# to get dev->reset_methods = [6, 3, 0, ...]
we'll need to probe reset methods here.
>
> But I'm afraid you'll have to walk me through the reasons why this
> can't be made to work.
I wrote incomplete description. It can work but we'll need to probe
everytime which involves reading different capabilities(PCI_CAP_ID_AF,
PCI_PM_CTRL etc) from device. With current encoding we just have to
probe at the begining.
>
> > Basically with new encoding an user can't write subset of reset
> > methods they have to write list of *all* supported methods
> > everytime.
>
> Why does the user have to write all supported methods?  Is that to
> preserve the fact that "cat reset_methods" always shows all the
> supported methods so the user knows what's available?
>
> I'm wondering why we can't do something like this (pidgin code):
>
>   if (option == "default") {
>     pci_init_reset_methods(dev);
>     return;
>   }
>
>   n = 0;
>   foreach method in option {
>     i = lookup_reset_method(method);
>     if (pci_reset_methods[i].reset_fn(dev, PROBE) == 0)
Repeatedly calling probe might have some impact as it involves reading
device registers as explained earlier.
>       dev->reset_methods[n++] = i;           # method i supported
>   }
>   dev->reset_methods[n++] = 0;               # end of supported methods
>
> If we did something like the above, the user could always find the
> list of all methods supported by a device by doing this:
>
>   # echo default > reset_methods
>   # cat reset_methods
>
This is one solution for current problem with new encoding.
> Yes, this does call the "probe" methods several times.  I don't think
> that's necessarily a problem.
I thought this would be a problem because of your earlier suggestion
of caching flr capability to avoid probing multiple times. In this case
we'll need to read different device registers multiple times. With
current encoding we don't have to do that multiple times.

Thanks,
Amey
>
> Bjorn
>
> [1] https://lore.kernel.org/r/1fb0a184-908c-5f98-ef6d-74edc602c2e0@nvidia.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism
  2021-06-24 17:28         ` Amey Narkhede
@ 2021-06-24 17:59           ` Bjorn Helgaas
  0 siblings, 0 replies; 52+ messages in thread
From: Bjorn Helgaas @ 2021-06-24 17:59 UTC (permalink / raw)
  To: Amey Narkhede
  Cc: alex.williamson, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On Thu, Jun 24, 2021 at 10:58:06PM +0530, Amey Narkhede wrote:
> On 21/06/24 11:56AM, Bjorn Helgaas wrote:
> > On Thu, Jun 24, 2021 at 08:42:42PM +0530, Amey Narkhede wrote:
> > > On 21/06/24 07:15AM, Bjorn Helgaas wrote:
> > > > On Tue, Jun 08, 2021 at 11:18:53AM +0530, Amey Narkhede wrote:
> > > > > Add reset_method sysfs attribute to enable user to
> > > > > query and set user preferred device reset methods and
> > > > > their ordering.
> > > >
> > > > > +		Writing the name or comma separated list of names of any of
> > > > > +		the device supported reset methods to this file will set the
> > > > > +		reset methods and their ordering to be used when resetting
> > > > > +		the device.
> > > >
> > > > > +	while ((name = strsep(&options, ",")) != NULL) {
> > > > > +		if (sysfs_streq(name, ""))
> > > > > +			continue;
> > > > > +
> > > > > +		name = strim(name);
> > > > > +
> > > > > +		for (i = 0; i < PCI_RESET_METHODS_NUM; i++) {
> > > > > +			if (reset_methods[i] &&
> > > > > +			    sysfs_streq(name, pci_reset_fn_methods[i].name)) {
> > > > > +				reset_methods[i] = prio--;
> > > > > +				break;
> > > > > +			}
> > > > > +		}
> > > > > +
> > > > > +		if (i == PCI_RESET_METHODS_NUM) {
> > > > > +			kfree(options);
> > > > > +			return -EINVAL;
> > > > > +		}
> > > > > +	}
> > > >
> > > > Asking again since we didn't get this clarified before.  The above
> > > > tells me that "reset_methods" allows the user to control the
> > > > *order* in which we try reset methods.
> > > >
> > > > Consider the following two uses:
> > > >
> > > >   (1) # echo bus,flr > reset_methods
> > > >
> > > >   (2) # echo flr,bus > reset_methods
> > > >
> > > > Do these have the same effect or not?
> > > >
> > > They have different effect.
> >
> > I asked about this because Shanker's idea [1] of using two bitmaps
> > only keeps track of which resets are *enabled*.  It does not keep
> > track of the *ordering*.  Since you want to control the ordering, I
> > think we need more state than just the supported/enabled bitmaps.
> >
> > > > If "reset_methods" allows control over the order, I expect them to
> > > > be different: (1) would try a bus reset and, if the bus reset
> > > > failed, an FLR, while (2) would try an FLR and, if the FLR failed,
> > > > a bus reset.
> > >
> > > Exactly you are right.
> > >
> > > Now the point I was presenting was with new encoding we have to
> > > write list of *all of the supported reset methods* in order for
> > > example, in above example flr,bus or bus,flr. We can't just write
> > > 'flr' or 'bus' then switch back to writing flr,bus/bus,flr (these
> > > have different effect as mentioned earlier).
> >
> > It sounds like you're saying this sequence can't work:
> >
> >   # echo flr > reset_methods
>
> # dev->reset_methods = [3, 0, 0, ..]
>
> >   # echo bus,flr > reset_methods
>
> # to get dev->reset_methods = [6, 3, 0, ...]
> we'll need to probe reset methods here.
>
> > But I'm afraid you'll have to walk me through the reasons why this
> > can't be made to work.
>
> I wrote incomplete description. It can work but we'll need to probe
> everytime which involves reading different capabilities(PCI_CAP_ID_AF,
> PCI_PM_CTRL etc) from device. With current encoding we just have to
> probe at the begining.
>
> > > Basically with new encoding an user can't write subset of reset
> > > methods they have to write list of *all* supported methods
> > > everytime.
> >
> > Why does the user have to write all supported methods?  Is that to
> > preserve the fact that "cat reset_methods" always shows all the
> > supported methods so the user knows what's available?
> >
> > I'm wondering why we can't do something like this (pidgin code):
> >
> >   if (option == "default") {
> >     pci_init_reset_methods(dev);
> >     return;
> >   }
> >
> >   n = 0;
> >   foreach method in option {
> >     i = lookup_reset_method(method);
> >     if (pci_reset_methods[i].reset_fn(dev, PROBE) == 0)
>
> Repeatedly calling probe might have some impact as it involves reading
> device registers as explained earlier.
>
> >       dev->reset_methods[n++] = i;           # method i supported
> >   }
> >   dev->reset_methods[n++] = 0;               # end of supported methods
> >
> > If we did something like the above, the user could always find the
> > list of all methods supported by a device by doing this:
> >
> >   # echo default > reset_methods
> >   # cat reset_methods
>
> This is one solution for current problem with new encoding.
>
> > Yes, this does call the "probe" methods several times.  I don't think
> > that's necessarily a problem.
>
> I thought this would be a problem because of your earlier suggestion
> of caching flr capability to avoid probing multiple times. In this case
> we'll need to read different device registers multiple times. With
> current encoding we don't have to do that multiple times.

I don't think it's a problem to run "probe" methods when we're setting
the enabled reset methods (either at enumeration-time or when we write
to "reset_methods").  These are low-frequency events and I don't think
there's any performance issue.

I don't think we should have to run "probe" methods every time we call
pci_reset_function().

I suggested a dev->has_flr bit for two reasons:

  1) It avoids reading PCI_EXP_DEVCAP every time a driver calls
     pcie_reset_flr(), and

  2) It gives a nice opportunity for quirks to disable FLR for devices
     where it's broken.

> > [1] https://lore.kernel.org/r/1fb0a184-908c-5f98-ef6d-74edc602c2e0@nvidia.com

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods
  2021-06-24 16:15       ` Bjorn Helgaas
@ 2021-06-24 18:48         ` Alex Williamson
  0 siblings, 0 replies; 52+ messages in thread
From: Alex Williamson @ 2021-06-24 18:48 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Amey Narkhede, Raphael Norwitz, linux-pci, linux-kernel, kw,
	Shanker Donthineni, Sinan Kaya, Len Brown, Rafael J . Wysocki

On Thu, 24 Jun 2021 11:15:59 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:

> [+to Alex]
> 
> On Thu, Jun 24, 2021 at 08:58:09PM +0530, Amey Narkhede wrote:
> > On 21/06/24 07:23AM, Bjorn Helgaas wrote:  
> > > On Tue, Jun 08, 2021 at 11:18:50AM +0530, Amey Narkhede wrote:  
> > > > Currently there is separate function pcie_has_flr() to probe if pcie flr is
> > > > supported by the device which does not match the calling convention
> > > > followed by reset methods which use second function argument to decide
> > > > whether to probe or not.  Add new function pcie_reset_flr() that follows
> > > > the calling convention of reset methods.  
> > >  
> > > > +/**
> > > > + * pcie_reset_flr - initiate a PCIe function level reset
> > > > + * @dev: device to reset
> > > > + * @probe: If set, only check if the device can be reset this way.
> > > > + *
> > > > + * Initiate a function level reset on @dev.
> > > > + */
> > > > +int pcie_reset_flr(struct pci_dev *dev, int probe)
> > > > +{
> > > > +	u32 cap;
> > > > +
> > > > +	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
> > > > +		return -ENOTTY;
> > > > +
> > > > +	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
> > > > +	if (!(cap & PCI_EXP_DEVCAP_FLR))
> > > > +		return -ENOTTY;
> > > > +
> > > > +	if (probe)
> > > > +		return 0;
> > > > +
> > > > +	return pcie_flr(dev);
> > > > +}  
> > >
> > > Tangent: I've been told before, but I can't remember why we need the
> > > "probe" interface.  Since we're looking at this area again, can we add
> > > a comment to clarify this?
> > >
> > > Every time I read this, I wonder why we can't just get rid of the
> > > probe and attempt a reset.  If it fails because it's not supported, we
> > > could just try the next one in the list.  
> > 
> > Part of the reason is to have same calling convention as other reset
> > methods and other reason is devices that run in VMs where various
> > capabilities can be hidden or have quirks for avoiding known troublesome
> > combination of device features as Alex explained here
> > https://lore.kernel.org/linux-pci/20210624151242.ybew2z5rseuusj7v@archlinux/T/#mb67c09a2ce08ce4787652e4c0e7b9e5adf1df57a
> > 
> > On the side note as you suggested earlier to cache flr capability
> > earlier the PCI_EXP_DEVCAP reading code won't be there in next version
> > so its just trivial check(dev->has_flr).  
> 
> Sorry, I didn't make my question clear.  I'm not asking why we're
> adding a "probe" argument to pcie_reset_flr() to make it consistent
> with pci_af_flr(), pci_pm_reset(), pci_parent_bus_reset(), etc.  I
> like making the interfaces consistent.
> 
> What I'm asking here is why the "probe" argument exists for *any* of
> these interfaces and why pci_probe_reset_function() exists.
> 
> This is really more a question for Alex since it's a historical
> question, not anything directly related to your series.  I'm not
> proposing *removing* the "probe" argument; I know it exists for a
> reason because I've asked about it before.  But I forgot the answer,
> which makes me think a hint in the code would be useful.

Heh [1]

That might be what you're recalling, but in that case I was adding
exported symbols that allowed probing bus vs slot reset because the
scope of affected devices is different.  My use case is testing whether
the user owns all the affected devices, so it's really not a
test-by-doing opportunity.

For these single-function scoped resets, as in the reply to [1]
pci_probe_reset_function() isn't exported and the only caller is
internal PCI code to determine whether to create the 'reset' sysfs
attribute.  Sure, as it exists today we could reset the device and test
whether it worked to get that value, that's what vfio-pci does now
before we give the device to the user, but the critical difference is
that in the vfio case we always want to flush any state that might be
leaked to the user and at device init time, doing so only invites
issues.

This series obviously expands the scope of probing, we don't just want
to know that there's at least one method available to us, but precisely
which ones.  It's rather impractical to try to reset a function a half
dozen different ways on boot for the possibility that the admin might
want to manipulate the reset order later.  And oh gosh, if we don't
cache the methods supported and re-test-by-doing when the attribute is
written, let's just not go there.  Thanks,

Alex

[1]https://lore.kernel.org/linux-pci/CAErSpo625CTnxZvy-gmy8VzxT4favF4s=_giU6nGey_N=VwK5A@mail.gmail.com/


^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2021-06-24 18:48 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-08  5:48 [PATCH v7 0/8] Expose and manage PCI device reset Amey Narkhede
2021-06-08  5:48 ` [PATCH v7 1/8] PCI: Add pcie_reset_flr to follow calling convention of other reset methods Amey Narkhede
2021-06-10 20:15   ` Shanker R Donthineni
2021-06-17 21:57   ` Bjorn Helgaas
2021-06-17 22:51     ` Alex Williamson
2021-06-18 16:32     ` Amey Narkhede
2021-06-24 12:23   ` Bjorn Helgaas
2021-06-24 15:28     ` Amey Narkhede
2021-06-24 16:15       ` Bjorn Helgaas
2021-06-24 18:48         ` Alex Williamson
2021-06-08  5:48 ` [PATCH v7 2/8] PCI: Add new array for keeping track of ordering of " Amey Narkhede
2021-06-10 20:15   ` Shanker R Donthineni
2021-06-17 23:13   ` Bjorn Helgaas
2021-06-18 17:22     ` Amey Narkhede
2021-06-21 15:02       ` Shanker R Donthineni
2021-06-21 17:15         ` Amey Narkhede
2021-06-21 18:37           ` Bjorn Helgaas
2021-06-08  5:48 ` [PATCH v7 3/8] PCI: Remove reset_fn field from pci_dev Amey Narkhede
2021-06-10 20:16   ` Shanker R Donthineni
2021-06-08  5:48 ` [PATCH v7 4/8] PCI/sysfs: Allow userspace to query and set device reset mechanism Amey Narkhede
2021-06-09 21:57   ` Raphael Norwitz
2021-06-09 22:36     ` Shanker R Donthineni
2021-06-09 22:48       ` Raphael Norwitz
2021-06-10 20:16   ` Shanker R Donthineni
2021-06-18 20:00   ` Bjorn Helgaas
2021-06-19 13:59     ` Amey Narkhede
2021-06-21 13:01       ` Bjorn Helgaas
2021-06-21 17:28         ` Amey Narkhede
2021-06-21 19:07           ` Bjorn Helgaas
2021-06-21 19:33             ` Amey Narkhede
2021-06-23 12:06               ` Bjorn Helgaas
2021-06-23 14:07                 ` Amey Narkhede
2021-06-23 17:56                   ` Amey Narkhede
2021-06-23 17:21         ` Alex Williamson
2021-06-24 12:15   ` Bjorn Helgaas
2021-06-24 15:12     ` Amey Narkhede
2021-06-24 16:56       ` Bjorn Helgaas
2021-06-24 17:20         ` Shanker R Donthineni
2021-06-24 17:28         ` Amey Narkhede
2021-06-24 17:59           ` Bjorn Helgaas
2021-06-08  5:48 ` [PATCH v7 5/8] PCI: Setup ACPI_COMPANION early Amey Narkhede
2021-06-08  5:48 ` [PATCH v7 6/8] PCI: Add support for ACPI _RST reset method Amey Narkhede
2021-06-08  5:48 ` [PATCH v7 7/8] PCI: Enable NO_BUS_RESET quirk for Nvidia GPUs Amey Narkhede
2021-06-10 23:16   ` Bjorn Helgaas
2021-06-10 23:33     ` Shanker R Donthineni
2021-06-10 23:43     ` Shanker R Donthineni
2021-06-10 23:53       ` Bjorn Helgaas
2021-06-11  4:15         ` Shanker R Donthineni
2021-06-08  5:48 ` [PATCH v7 8/8] PCI: Change the type of probe argument in reset functions Amey Narkhede
2021-06-09 21:40   ` Raphael Norwitz
2021-06-08 10:05 ` [PATCH v7 0/8] Expose and manage PCI device reset Enrico Weigelt, metux IT consult
2021-06-08 15:44   ` Amey Narkhede

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).