LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Minwoo Im <minwoo.im.dev@gmail.com>
To: Alex Williamson <alex.williamson@redhat.com>, linux-pci@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org
Subject: Re: [PATCH v3 2/3] PCI: Samsung SM961/PM961 NVMe disable before FLR quirk
Date: Wed, 25 Jul 2018 04:53:18 +0900
Message-ID: <1532461998.20066.5.camel@gmail.com> (raw)
In-Reply-To: <20180724161440.2729.89835.stgit@gimli.home>

Hi Alex,

On Tue, 2018-07-24 at 10:14 -0600, Alex Williamson wrote:
> The Samsung SM961/PM961 (960 EVO) sometimes fails to return from FLR
> with the PCI config space reading back as -1.  A reproducible instance
> of this behavior is resolved by clearing the enable bit in the NVMe
> configuration register and waiting for the ready status to clear
> (disabling the NVMe controller) prior to FLR.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>  drivers/pci/quirks.c |   83
> ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 83 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index e72c8742aafa..3899cdd2514b 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -28,6 +28,7 @@
>  #include <linux/platform_data/x86/apple.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/switchtec.h>
> +#include <linux/nvme.h>
>  #include <asm/dma.h>	/* isa_dma_bridge_buggy */
>  #include "pci.h"
>  
> @@ -3669,6 +3670,87 @@ static int reset_chelsio_generic_dev(struct pci_dev
> *dev, int probe)
>  #define PCI_DEVICE_ID_INTEL_IVB_M_VGA      0x0156
>  #define PCI_DEVICE_ID_INTEL_IVB_M2_VGA     0x0166
>  
> +/*
> + * The Samsung SM961/PM961 controller can sometimes enter a fatal state after
> + * FLR where config space reads from the device return -1.  We seem to be
> + * able to avoid this condition if we disable the NVMe controller prior to
> + * FLR.  This quirk is generic for any NVMe class device requiring similar
> + * assistance to quiesce the device prior to FLR.
> + *
> + * NVMe specification: https://nvmexpress.org/resources/specifications/
> + * Revision 1.0e:

It seems too old version of NVMe specification.  Do you have any special reason
to comment the specified 1.0 version instead of 1.3 or something newer?

> + *    Chapter 2: Required and optional PCI config registers
> + *    Chapter 3: NVMe control registers
> + *    Chapter 7.3: Reset behavior
> + */
> +static int nvme_disable_and_flr(struct pci_dev *dev, int probe)

The name of this function seems able to be started with 'reset_' prefix just
like other quirks for reset.
What about reset_samsung_pm961 or something?

> +{
> +	void __iomem *bar;
> +	u16 cmd;
> +	u32 cfg;
> +
> +	if (dev->class != PCI_CLASS_STORAGE_EXPRESS ||
> +	    !pcie_has_flr(dev) || !pci_resource_start(dev, 0))
> +		return -ENOTTY;
> +
> +	if (probe)
> +		return 0;
> +
> +	bar = pci_iomap(dev, 0, NVME_REG_CC + sizeof(cfg));
> +	if (!bar)
> +		return -ENOTTY;
> +
> +	pci_read_config_word(dev, PCI_COMMAND, &cmd);
> +	pci_write_config_word(dev, PCI_COMMAND, cmd | PCI_COMMAND_MEMORY);
> +
> +	cfg = readl(bar + NVME_REG_CC);
> +
> +	/* Disable controller if enabled */
> +	if (cfg & NVME_CC_ENABLE) {
> +		u64 cap = readq(bar + NVME_REG_CAP);
> +		unsigned long timeout;
> +
> +		/*
> +		 * Per nvme_disable_ctrl() skip shutdown notification as it
> +		 * could complete commands to the admin queue.  We only
> intend
> +		 * to quiesce the device before reset.
> +		 */
> +		cfg &= ~(NVME_CC_SHN_MASK | NVME_CC_ENABLE);
> +
> +		writel(cfg, bar + NVME_REG_CC);
> +
> +		/*
> +		 * Some controllers require an additional delay here, see
> +		 * NVME_QUIRK_DELAY_BEFORE_CHK_RDY.  None of those are yet
> +		 * supported by this quirk.
> +		 */
> +
> +		/* Cap register provides max timeout in 500ms increments */
> +		timeout = ((NVME_CAP_TIMEOUT(cap) + 1) * HZ / 2) + jiffies;
> +
> +		for (;;) {
> +			u32 status = readl(bar + NVME_REG_CSTS);
> +
> +			/* Ready status becomes zero on disable complete */
> +			if (!(status & NVME_CSTS_RDY))
> +				break;
> +
> +			msleep(100);
> +
> +			if (time_after(jiffies, timeout)) {
> +				pci_warn(dev, "Timeout waiting for NVMe ready
> status to clear after disable\n");
> +				break;
> +			}
> +		}
> +	}
> +
> +	pci_iounmap(dev, bar);
> +
> +	pcie_flr(dev);
> +
> +	return 0;
> +}
> +
>  static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
>  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
>  		 reset_intel_82599_sfp_virtfn },
> @@ -3676,6 +3758,7 @@ static const struct pci_dev_reset_methods
> pci_dev_reset_methods[] = {
>  		reset_ivb_igd },
>  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_IVB_M2_VGA,
>  		reset_ivb_igd },
> +	{ PCI_VENDOR_ID_SAMSUNG, 0xa804, nvme_disable_and_flr },

Why don't we just define a macro just like other DEVICE_IDs. (e.g.
PCIE_DEVICE_ID_INTEL_82599_SFP_VF).

#define PCI_DEVICE_ID_SAMSUNG_PM961  0xa804

>  	{ PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,
>  		reset_chelsio_generic_dev },
>  	{ 0 }
> 
> 
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme

  reply index

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-24 16:14 [PATCH v3 0/3] PCI: NVMe reset quirks Alex Williamson
2018-07-24 16:14 ` [PATCH v3 1/3] PCI: Export pcie_has_flr() Alex Williamson
2018-07-24 16:14 ` [PATCH v3 2/3] PCI: Samsung SM961/PM961 NVMe disable before FLR quirk Alex Williamson
2018-07-24 19:53   ` Minwoo Im [this message]
2018-07-24 20:09     ` Alex Williamson
2018-07-24 16:14 ` [PATCH v3 3/3] PCI: Intel DC P3700 NVMe delay after " Alex Williamson
2018-07-24 16:18   ` Alex Williamson
2018-08-09 19:35     ` Bjorn Helgaas

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1532461998.20066.5.camel@gmail.com \
    --to=minwoo.im.dev@gmail.com \
    --cc=alex.williamson@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=linux-pci@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox