LKML Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v3 0/3] PCI: NVMe reset quirks
@ 2018-07-24 16:14 Alex Williamson
  2018-07-24 16:14 ` [PATCH v3 1/3] PCI: Export pcie_has_flr() Alex Williamson
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Alex Williamson @ 2018-07-24 16:14 UTC (permalink / raw)
  To: linux-pci; +Cc: linux-kernel, linux-nvme

v3: Separate quirks, only for the afflicted devices

v2: Add bug link, use Samsung vendor ID, add spec references

Fix two different NVMe device reset issues with device specific
quirks.  The Samsung controller in patch 2 sometimes doesn't like
being reset while enabled, so disable the NVMe controller prior to
FLR.  This quirk is generic to all NVMe class devices, though I've
dropped the additional delay some devices require between disabling
and checking ready status.  This can be added later should any of
those devices need this quirk.  The Intel controller quirk is now
just a simple delay after FLR, which clearly any device needing
similar behavior can also use.  Thanks,

Alex

---

Alex Williamson (3):
      PCI: Export pcie_has_flr()
      PCI: Samsung SM961/PM961 NVMe disable before FLR quirk
      PCI: Intel DC P3700 NVMe delay after FLR quirk


 drivers/pci/pci.c    |    3 +
 drivers/pci/quirks.c |  105 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pci.h  |    1 
 3 files changed, 108 insertions(+), 1 deletion(-)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 1/3] PCI: Export pcie_has_flr()
  2018-07-24 16:14 [PATCH v3 0/3] PCI: NVMe reset quirks Alex Williamson
@ 2018-07-24 16:14 ` Alex Williamson
  2018-07-24 16:14 ` [PATCH v3 2/3] PCI: Samsung SM961/PM961 NVMe disable before FLR quirk Alex Williamson
  2018-07-24 16:14 ` [PATCH v3 3/3] PCI: Intel DC P3700 NVMe delay after " Alex Williamson
  2 siblings, 0 replies; 8+ messages in thread
From: Alex Williamson @ 2018-07-24 16:14 UTC (permalink / raw)
  To: linux-pci; +Cc: linux-kernel, linux-nvme

pcie_flr() suggests pcie_has_flr() to ensure that PCIe FLR support is
present prior to calling.  pcie_flr() is exported while pcie_has_flr()
is not.  Resolve this.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/pci.c   |    3 ++-
 include/linux/pci.h |    1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 2bec76c9d9a7..52fe2d72a99c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4071,7 +4071,7 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
  * Returns true if the device advertises support for PCIe function level
  * resets.
  */
-static bool pcie_has_flr(struct pci_dev *dev)
+bool pcie_has_flr(struct pci_dev *dev)
 {
 	u32 cap;
 
@@ -4081,6 +4081,7 @@ static bool pcie_has_flr(struct pci_dev *dev)
 	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
 	return cap & PCI_EXP_DEVCAP_FLR;
 }
+EXPORT_SYMBOL_GPL(pcie_has_flr);
 
 /**
  * pcie_flr - initiate a PCIe function level reset
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 04c7ea6ed67b..bbe030d7814f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1092,6 +1092,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev,
 			     enum pci_bus_speed *speed,
 			     enum pcie_link_width *width);
 void pcie_print_link_status(struct pci_dev *dev);
+bool pcie_has_flr(struct pci_dev *dev);
 int pcie_flr(struct pci_dev *dev);
 int __pci_reset_function_locked(struct pci_dev *dev);
 int pci_reset_function(struct pci_dev *dev);


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 2/3] PCI: Samsung SM961/PM961 NVMe disable before FLR quirk
  2018-07-24 16:14 [PATCH v3 0/3] PCI: NVMe reset quirks Alex Williamson
  2018-07-24 16:14 ` [PATCH v3 1/3] PCI: Export pcie_has_flr() Alex Williamson
@ 2018-07-24 16:14 ` Alex Williamson
  2018-07-24 19:53   ` Minwoo Im
  2018-07-24 16:14 ` [PATCH v3 3/3] PCI: Intel DC P3700 NVMe delay after " Alex Williamson
  2 siblings, 1 reply; 8+ messages in thread
From: Alex Williamson @ 2018-07-24 16:14 UTC (permalink / raw)
  To: linux-pci; +Cc: linux-kernel, linux-nvme

The Samsung SM961/PM961 (960 EVO) sometimes fails to return from FLR
with the PCI config space reading back as -1.  A reproducible instance
of this behavior is resolved by clearing the enable bit in the NVMe
configuration register and waiting for the ready status to clear
(disabling the NVMe controller) prior to FLR.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/quirks.c |   83 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 83 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index e72c8742aafa..3899cdd2514b 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -28,6 +28,7 @@
 #include <linux/platform_data/x86/apple.h>
 #include <linux/pm_runtime.h>
 #include <linux/switchtec.h>
+#include <linux/nvme.h>
 #include <asm/dma.h>	/* isa_dma_bridge_buggy */
 #include "pci.h"
 
@@ -3669,6 +3670,87 @@ static int reset_chelsio_generic_dev(struct pci_dev *dev, int probe)
 #define PCI_DEVICE_ID_INTEL_IVB_M_VGA      0x0156
 #define PCI_DEVICE_ID_INTEL_IVB_M2_VGA     0x0166
 
+/*
+ * The Samsung SM961/PM961 controller can sometimes enter a fatal state after
+ * FLR where config space reads from the device return -1.  We seem to be
+ * able to avoid this condition if we disable the NVMe controller prior to
+ * FLR.  This quirk is generic for any NVMe class device requiring similar
+ * assistance to quiesce the device prior to FLR.
+ *
+ * NVMe specification: https://nvmexpress.org/resources/specifications/
+ * Revision 1.0e:
+ *    Chapter 2: Required and optional PCI config registers
+ *    Chapter 3: NVMe control registers
+ *    Chapter 7.3: Reset behavior
+ */
+static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
+{
+	void __iomem *bar;
+	u16 cmd;
+	u32 cfg;
+
+	if (dev->class != PCI_CLASS_STORAGE_EXPRESS ||
+	    !pcie_has_flr(dev) || !pci_resource_start(dev, 0))
+		return -ENOTTY;
+
+	if (probe)
+		return 0;
+
+	bar = pci_iomap(dev, 0, NVME_REG_CC + sizeof(cfg));
+	if (!bar)
+		return -ENOTTY;
+
+	pci_read_config_word(dev, PCI_COMMAND, &cmd);
+	pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY);
+
+	cfg = readl(bar + NVME_REG_CC);
+
+	/* Disable controller if enabled */
+	if (cfg & NVME_CC_ENABLE) {
+		u64 cap = readq(bar + NVME_REG_CAP);
+		unsigned long timeout;
+
+		/*
+		 * Per nvme_disable_ctrl() skip shutdown notification as it
+		 * could complete commands to the admin queue.  We only intend
+		 * to quiesce the device before reset.
+		 */
+		cfg &= ~(NVME_CC_SHN_MASK | NVME_CC_ENABLE);
+
+		writel(cfg, bar + NVME_REG_CC);
+
+		/*
+		 * Some controllers require an additional delay here, see
+		 * NVME_QUIRK_DELAY_BEFORE_CHK_RDY.  None of those are yet
+		 * supported by this quirk.
+		 */
+
+		/* Cap register provides max timeout in 500ms increments */
+		timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies;
+
+		for (;;) {
+			u32 status = readl(bar + NVME_REG_CSTS);
+
+			/* Ready status becomes zero on disable complete */
+			if (!(status & NVME_CSTS_RDY))
+				break;
+
+			msleep(100);
+
+			if (time_after(jiffies, timeout)) {
+				pci_warn(dev, "Timeout waiting for NVMe ready status to clear after disable\n");
+				break;
+			}
+		}
+	}
+
+	pci_iounmap(dev, bar);
+
+	pcie_flr(dev);
+
+	return 0;
+}
+
 static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
 	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
 		 reset_intel_82599_sfp_virtfn },
@@ -3676,6 +3758,7 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
 		reset_ivb_igd },
 	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA,
 		reset_ivb_igd },
+	{ PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr },
 	{ PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,
 		reset_chelsio_generic_dev },
 	{ 0 }


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 3/3] PCI: Intel DC P3700 NVMe delay after FLR quirk
  2018-07-24 16:14 [PATCH v3 0/3] PCI: NVMe reset quirks Alex Williamson
  2018-07-24 16:14 ` [PATCH v3 1/3] PCI: Export pcie_has_flr() Alex Williamson
  2018-07-24 16:14 ` [PATCH v3 2/3] PCI: Samsung SM961/PM961 NVMe disable before FLR quirk Alex Williamson
@ 2018-07-24 16:14 ` " Alex Williamson
  2018-07-24 16:18   ` Alex Williamson
  2 siblings, 1 reply; 8+ messages in thread
From: Alex Williamson @ 2018-07-24 16:14 UTC (permalink / raw)
  To: linux-pci; +Cc: linux-kernel, linux-nvme

Add a device specific reset for Intel DC P3700 NVMe device which
exhibits a timeout failure in drivers waiting for the ready status to
update after NVMe enable if the driver interacts with the device too
quickly after FLR.  As this has been observed in device assignment
scenarios, resolve this with a device specific reset quirk to add an
additional, heuristically determined, delay after the FLR completes.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 drivers/pci/quirks.c |   22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 3899cdd2514b..08fafd804588 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3751,6 +3751,27 @@ static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
 	return 0;
 }
 
+/*
+ * Intel DC P3700 NVMe controller will timeout waiting for ready status
+ * to change after NVMe enable if the driver starts interacting with the
+ * device too quickly after FLR.  A 250ms delay after FLR has heuristically
+ * proven to produce reliably working results for device assignment cases.
+ */
+static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
+{
+	if (!pcie_has_flr(dev))
+		return -ENOTTY;
+
+	if (probe)
+		return 0;
+
+	pcie_flr(dev);
+
+	msleep(250);
+
+	return 0;
+}
+
 static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
 	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
 		 reset_intel_82599_sfp_virtfn },
@@ -3759,6 +3780,7 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
 	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA,
 		reset_ivb_igd },
 	{ PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr },
+	{ PCI_VENDOR_ID_INTEL, 0x0953, delay_250ms_after_flr },
 	{ PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,
 		reset_chelsio_generic_dev },
 	{ 0 }


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 3/3] PCI: Intel DC P3700 NVMe delay after FLR quirk
  2018-07-24 16:14 ` [PATCH v3 3/3] PCI: Intel DC P3700 NVMe delay after " Alex Williamson
@ 2018-07-24 16:18   ` Alex Williamson
  2018-08-09 19:35     ` Bjorn Helgaas
  0 siblings, 1 reply; 8+ messages in thread
From: Alex Williamson @ 2018-07-24 16:18 UTC (permalink / raw)
  To: linux-pci; +Cc: linux-kernel, linux-nvme

On Tue, 24 Jul 2018 10:14:46 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> Add a device specific reset for Intel DC P3700 NVMe device which
> exhibits a timeout failure in drivers waiting for the ready status to
> update after NVMe enable if the driver interacts with the device too
> quickly after FLR.  As this has been observed in device assignment
> scenarios, resolve this with a device specific reset quirk to add an
> additional, heuristically determined, delay after the FLR completes.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  drivers/pci/quirks.c |   22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)

I forgot to link the bz in this one, if this somehow becomes the final
version, please add:

Link: https://bugzilla.redhat.com/show_bug.cgi?id=159265

Thanks,
Alex
 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 3899cdd2514b..08fafd804588 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3751,6 +3751,27 @@ static int nvme_disable_and_flr(struct pci_dev *dev, int probe)
>  	return 0;
>  }
>  
> +/*
> + * Intel DC P3700 NVMe controller will timeout waiting for ready status
> + * to change after NVMe enable if the driver starts interacting with the
> + * device too quickly after FLR.  A 250ms delay after FLR has heuristically
> + * proven to produce reliably working results for device assignment cases.
> + */
> +static int delay_250ms_after_flr(struct pci_dev *dev, int probe)
> +{
> +	if (!pcie_has_flr(dev))
> +		return -ENOTTY;
> +
> +	if (probe)
> +		return 0;
> +
> +	pcie_flr(dev);
> +
> +	msleep(250);
> +
> +	return 0;
> +}
> +
>  static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
>  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
>  		 reset_intel_82599_sfp_virtfn },
> @@ -3759,6 +3780,7 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
>  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA,
>  		reset_ivb_igd },
>  	{ PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr },
> +	{ PCI_VENDOR_ID_INTEL, 0x0953, delay_250ms_after_flr },
>  	{ PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,
>  		reset_chelsio_generic_dev },
>  	{ 0 }
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 2/3] PCI: Samsung SM961/PM961 NVMe disable before FLR quirk
  2018-07-24 16:14 ` [PATCH v3 2/3] PCI: Samsung SM961/PM961 NVMe disable before FLR quirk Alex Williamson
@ 2018-07-24 19:53   ` Minwoo Im
  2018-07-24 20:09     ` Alex Williamson
  0 siblings, 1 reply; 8+ messages in thread
From: Minwoo Im @ 2018-07-24 19:53 UTC (permalink / raw)
  To: Alex Williamson, linux-pci; +Cc: linux-kernel, linux-nvme

Hi Alex,

On Tue, 2018-07-24 at 10:14 -0600, Alex Williamson wrote:
> The Samsung SM961/PM961 (960 EVO) sometimes fails to return from FLR
> with the PCI config space reading back as -1.  A reproducible instance
> of this behavior is resolved by clearing the enable bit in the NVMe
> configuration register and waiting for the ready status to clear
> (disabling the NVMe controller) prior to FLR.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  drivers/pci/quirks.c |   83
> ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 83 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index e72c8742aafa..3899cdd2514b 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -28,6 +28,7 @@
>  #include <linux/platform_data/x86/apple.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/switchtec.h>
> +#include <linux/nvme.h>
>  #include <asm/dma.h>	/* isa_dma_bridge_buggy */
>  #include "pci.h"
>  
> @@ -3669,6 +3670,87 @@ static int reset_chelsio_generic_dev(struct pci_dev
> *dev, int probe)
>  #define PCI_DEVICE_ID_INTEL_IVB_M_VGA      0x0156
>  #define PCI_DEVICE_ID_INTEL_IVB_M2_VGA     0x0166
>  
> +/*
> + * The Samsung SM961/PM961 controller can sometimes enter a fatal state after
> + * FLR where config space reads from the device return -1.  We seem to be
> + * able to avoid this condition if we disable the NVMe controller prior to
> + * FLR.  This quirk is generic for any NVMe class device requiring similar
> + * assistance to quiesce the device prior to FLR.
> + *
> + * NVMe specification: https://nvmexpress.org/resources/specifications/
> + * Revision 1.0e:

It seems too old version of NVMe specification.  Do you have any special reason
to comment the specified 1.0 version instead of 1.3 or something newer?

> + *    Chapter 2: Required and optional PCI config registers
> + *    Chapter 3: NVMe control registers
> + *    Chapter 7.3: Reset behavior
> + */
> +static int nvme_disable_and_flr(struct pci_dev *dev, int probe)

The name of this function seems able to be started with 'reset_' prefix just
like other quirks for reset.
What about reset_samsung_pm961 or something?

> +{
> +	void __iomem *bar;
> +	u16 cmd;
> +	u32 cfg;
> +
> +	if (dev->class != PCI_CLASS_STORAGE_EXPRESS ||
> +	    !pcie_has_flr(dev) || !pci_resource_start(dev, 0))
> +		return -ENOTTY;
> +
> +	if (probe)
> +		return 0;
> +
> +	bar = pci_iomap(dev, 0, NVME_REG_CC + sizeof(cfg));
> +	if (!bar)
> +		return -ENOTTY;
> +
> +	pci_read_config_word(dev, PCI_COMMAND, &cmd);
> +	pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY);
> +
> +	cfg = readl(bar + NVME_REG_CC);
> +
> +	/* Disable controller if enabled */
> +	if (cfg & NVME_CC_ENABLE) {
> +		u64 cap = readq(bar + NVME_REG_CAP);
> +		unsigned long timeout;
> +
> +		/*
> +		 * Per nvme_disable_ctrl() skip shutdown notification as it
> +		 * could complete commands to the admin queue.  We only
> intend
> +		 * to quiesce the device before reset.
> +		 */
> +		cfg &= ~(NVME_CC_SHN_MASK | NVME_CC_ENABLE);
> +
> +		writel(cfg, bar + NVME_REG_CC);
> +
> +		/*
> +		 * Some controllers require an additional delay here, see
> +		 * NVME_QUIRK_DELAY_BEFORE_CHK_RDY.  None of those are yet
> +		 * supported by this quirk.
> +		 */
> +
> +		/* Cap register provides max timeout in 500ms increments */
> +		timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies;
> +
> +		for (;;) {
> +			u32 status = readl(bar + NVME_REG_CSTS);
> +
> +			/* Ready status becomes zero on disable complete */
> +			if (!(status & NVME_CSTS_RDY))
> +				break;
> +
> +			msleep(100);
> +
> +			if (time_after(jiffies, timeout)) {
> +				pci_warn(dev, "Timeout waiting for NVMe ready
> status to clear after disable\n");
> +				break;
> +			}
> +		}
> +	}
> +
> +	pci_iounmap(dev, bar);
> +
> +	pcie_flr(dev);
> +
> +	return 0;
> +}
> +
>  static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
>  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
>  		 reset_intel_82599_sfp_virtfn },
> @@ -3676,6 +3758,7 @@ static const struct pci_dev_reset_methods
> pci_dev_reset_methods[] = {
>  		reset_ivb_igd },
>  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA,
>  		reset_ivb_igd },
> +	{ PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr },

Why don't we just define a macro just like other DEVICE_IDs. (e.g.
PCIE_DEVICE_ID_INTEL_82599_SFP_VF).

#define PCI_DEVICE_ID_SAMSUNG_PM961  0xa804

>  	{ PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,
>  		reset_chelsio_generic_dev },
>  	{ 0 }
> 
> 
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 2/3] PCI: Samsung SM961/PM961 NVMe disable before FLR quirk
  2018-07-24 19:53   ` Minwoo Im
@ 2018-07-24 20:09     ` Alex Williamson
  0 siblings, 0 replies; 8+ messages in thread
From: Alex Williamson @ 2018-07-24 20:09 UTC (permalink / raw)
  To: Minwoo Im; +Cc: linux-pci, linux-kernel, linux-nvme

On Wed, 25 Jul 2018 04:53:18 +0900
Minwoo Im <minwoo.im.dev@gmail.com> wrote:

> Hi Alex,
> 
> On Tue, 2018-07-24 at 10:14 -0600, Alex Williamson wrote:
> > The Samsung SM961/PM961 (960 EVO) sometimes fails to return from FLR
> > with the PCI config space reading back as -1.  A reproducible instance
> > of this behavior is resolved by clearing the enable bit in the NVMe
> > configuration register and waiting for the ready status to clear
> > (disabling the NVMe controller) prior to FLR.
> > 
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > ---
> >  drivers/pci/quirks.c |   83
> > ++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 83 insertions(+)
> > 
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index e72c8742aafa..3899cdd2514b 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -28,6 +28,7 @@
> >  #include <linux/platform_data/x86/apple.h>
> >  #include <linux/pm_runtime.h>
> >  #include <linux/switchtec.h>
> > +#include <linux/nvme.h>
> >  #include <asm/dma.h>	/* isa_dma_bridge_buggy */
> >  #include "pci.h"
> >  
> > @@ -3669,6 +3670,87 @@ static int reset_chelsio_generic_dev(struct pci_dev
> > *dev, int probe)
> >  #define PCI_DEVICE_ID_INTEL_IVB_M_VGA      0x0156
> >  #define PCI_DEVICE_ID_INTEL_IVB_M2_VGA     0x0166
> >  
> > +/*
> > + * The Samsung SM961/PM961 controller can sometimes enter a fatal state after
> > + * FLR where config space reads from the device return -1.  We seem to be
> > + * able to avoid this condition if we disable the NVMe controller prior to
> > + * FLR.  This quirk is generic for any NVMe class device requiring similar
> > + * assistance to quiesce the device prior to FLR.
> > + *
> > + * NVMe specification: https://nvmexpress.org/resources/specifications/
> > + * Revision 1.0e:  
> 
> It seems too old version of NVMe specification.  Do you have any special reason
> to comment the specified 1.0 version instead of 1.3 or something newer?

I wanted to show that I'm using NVMe features that have been available
since the initial release and there's no reason to check the version
field for their support.

> > + *    Chapter 2: Required and optional PCI config registers
> > + *    Chapter 3: NVMe control registers
> > + *    Chapter 7.3: Reset behavior
> > + */
> > +static int nvme_disable_and_flr(struct pci_dev *dev, int probe)  
> 
> The name of this function seems able to be started with 'reset_' prefix just
> like other quirks for reset.
> What about reset_samsung_pm961 or something?

I'm fine with any generic prefix, but I'm not fine with obfuscating the
purpose of the function behind a vendor/device specific name.  If
someone else comes along needing this same functionality, they'll
probably be reluctant to even look at what a "reset_samsung_sm961"
function does.  If they do, they might still be reluctant to reuse
something so obviously made for a specific device.  I thought this was
pretty descriptive of what it's doing.  Prefixing with 'reset_' is a
tad redundant.

> > +{
> > +	void __iomem *bar;
> > +	u16 cmd;
> > +	u32 cfg;
> > +
> > +	if (dev->class != PCI_CLASS_STORAGE_EXPRESS ||
> > +	    !pcie_has_flr(dev) || !pci_resource_start(dev, 0))
> > +		return -ENOTTY;
> > +
> > +	if (probe)
> > +		return 0;
> > +
> > +	bar = pci_iomap(dev, 0, NVME_REG_CC + sizeof(cfg));
> > +	if (!bar)
> > +		return -ENOTTY;
> > +
> > +	pci_read_config_word(dev, PCI_COMMAND, &cmd);
> > +	pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY);
> > +
> > +	cfg = readl(bar + NVME_REG_CC);
> > +
> > +	/* Disable controller if enabled */
> > +	if (cfg & NVME_CC_ENABLE) {
> > +		u64 cap = readq(bar + NVME_REG_CAP);
> > +		unsigned long timeout;
> > +
> > +		/*
> > +		 * Per nvme_disable_ctrl() skip shutdown notification as it
> > +		 * could complete commands to the admin queue.  We only
> > intend
> > +		 * to quiesce the device before reset.
> > +		 */
> > +		cfg &= ~(NVME_CC_SHN_MASK | NVME_CC_ENABLE);
> > +
> > +		writel(cfg, bar + NVME_REG_CC);
> > +
> > +		/*
> > +		 * Some controllers require an additional delay here, see
> > +		 * NVME_QUIRK_DELAY_BEFORE_CHK_RDY.  None of those are yet
> > +		 * supported by this quirk.
> > +		 */
> > +
> > +		/* Cap register provides max timeout in 500ms increments */
> > +		timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies;
> > +
> > +		for (;;) {
> > +			u32 status = readl(bar + NVME_REG_CSTS);
> > +
> > +			/* Ready status becomes zero on disable complete */
> > +			if (!(status & NVME_CSTS_RDY))
> > +				break;
> > +
> > +			msleep(100);
> > +
> > +			if (time_after(jiffies, timeout)) {
> > +				pci_warn(dev, "Timeout waiting for NVMe ready
> > status to clear after disable\n");
> > +				break;
> > +			}
> > +		}
> > +	}
> > +
> > +	pci_iounmap(dev, bar);
> > +
> > +	pcie_flr(dev);
> > +
> > +	return 0;
> > +}
> > +
> >  static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
> >  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
> >  		 reset_intel_82599_sfp_virtfn },
> > @@ -3676,6 +3758,7 @@ static const struct pci_dev_reset_methods
> > pci_dev_reset_methods[] = {
> >  		reset_ivb_igd },
> >  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA,
> >  		reset_ivb_igd },
> > +	{ PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr },  
> 
> Why don't we just define a macro just like other DEVICE_IDs. (e.g.
> PCIE_DEVICE_ID_INTEL_82599_SFP_VF).
> 
> #define PCI_DEVICE_ID_SAMSUNG_PM961  0xa804

include/linux/pci_ids.h"
/*
 *      PCI Class, Vendor and Device IDs
 *
 *      Please keep sorted.
 *
 *      Do not add new entries to this file unless the definitions
 *      are shared between multiple drivers.
 */

Those other devices are relatively old, they were probably #define'd
before we started the policy in the header.  Thanks,

Alex

> >  	{ PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,
> >  		reset_chelsio_generic_dev },
> >  	{ 0 }
> > 
> > 
> > _______________________________________________
> > Linux-nvme mailing list
> > Linux-nvme@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-nvme  


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 3/3] PCI: Intel DC P3700 NVMe delay after FLR quirk
  2018-07-24 16:18   ` Alex Williamson
@ 2018-08-09 19:35     ` Bjorn Helgaas
  0 siblings, 0 replies; 8+ messages in thread
From: Bjorn Helgaas @ 2018-08-09 19:35 UTC (permalink / raw)
  To: Alex Williamson; +Cc: linux-pci, linux-kernel, linux-nvme

On Tue, Jul 24, 2018 at 10:18:48AM -0600, Alex Williamson wrote:
> On Tue, 24 Jul 2018 10:14:46 -0600
> Alex Williamson <alex.williamson@redhat.com> wrote:
> 
> > Add a device specific reset for Intel DC P3700 NVMe device which
> > exhibits a timeout failure in drivers waiting for the ready status to
> > update after NVMe enable if the driver interacts with the device too
> > quickly after FLR.  As this has been observed in device assignment
> > scenarios, resolve this with a device specific reset quirk to add an
> > additional, heuristically determined, delay after the FLR completes.
> > 
> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> > ---
> >  drivers/pci/quirks.c |   22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> 
> I forgot to link the bz in this one, if this somehow becomes the final
> version, please add:
> 
> Link: https://bugzilla.redhat.com/show_bug.cgi?id=159265

Corrected link to: https://bugzilla.redhat.com/show_bug.cgi?id=1592654

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, back to index

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-24 16:14 [PATCH v3 0/3] PCI: NVMe reset quirks Alex Williamson
2018-07-24 16:14 ` [PATCH v3 1/3] PCI: Export pcie_has_flr() Alex Williamson
2018-07-24 16:14 ` [PATCH v3 2/3] PCI: Samsung SM961/PM961 NVMe disable before FLR quirk Alex Williamson
2018-07-24 19:53   ` Minwoo Im
2018-07-24 20:09     ` Alex Williamson
2018-07-24 16:14 ` [PATCH v3 3/3] PCI: Intel DC P3700 NVMe delay after " Alex Williamson
2018-07-24 16:18   ` Alex Williamson
2018-08-09 19:35     ` Bjorn Helgaas

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox