Linux-NVME Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v3] Workaround for discard on non-conformant nvme devices
@ 2019-11-11 22:07 Eduard Hasenleithner
  2019-11-11 22:37 ` Keith Busch
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Eduard Hasenleithner @ 2019-11-11 22:07 UTC (permalink / raw)
  To: linux-nvme

Some devices seem not to strictly adhere to the DSM 'Number of Ranges' 
field when determining how much data to DMA. This behavior causes 
affected devices to incur read access violations with IO-MMU employed in 
modern setups. Logs of such read access violations were reported by 
several users in https://bugzilla.kernel.org/show_bug.cgi?id=202665

This patch
* introduces a new NVME_QUIRK_DSM_PAGEALIGN,
* makes the nvme_setup_discard function always return the
     (page aligned) discard_page for devices with the quirk, and
* adds a list of known to be affected devices gathered
     from said bugzilla.kernel.org entry

List of affected devices was gathered by Vladimir Smirnov. I can only 
confirm Intel 660p (always reads multiple of 512b) and Phison E12 
(always reads 4KB) to be affected.

Signed-off-by: Eduard Hasenleithner <eduard@hasenleithner.at>

--- linux-5.3.7/drivers/nvme/host/core.c.orig	2019-11-04 
21:53:20.758837001 +0100
+++ linux-5.3.7/drivers/nvme/host/core.c	2019-11-11 22:10:15.096817458 +0100
@@ -559,11 +559,13 @@ static blk_status_t nvme_setup_discard(s
  		struct nvme_command *cmnd)
  {
  	unsigned short segments = blk_rq_nr_discard_segments(req), n = 0;
-	struct nvme_dsm_range *range;
+	struct nvme_dsm_range *range = NULL;
  	struct bio *bio;

-	range = kmalloc_array(segments, sizeof(*range),
-				GFP_ATOMIC | __GFP_NOWARN);
+	if (!(ns->ctrl->quirks & NVME_QUIRK_DSM_PAGEALIGN)) {
+		range = kmalloc_array(segments, sizeof(*range),
+					GFP_ATOMIC | __GFP_NOWARN);
+	}
  	if (!range) {
  		/*
  		 * If we fail allocation our range, fallback to the controller
--- linux-5.3.7/drivers/nvme/host/nvme.h.orig	2019-11-10 
18:16:39.097549037 +0100
+++ linux-5.3.7/drivers/nvme/host/nvme.h	2019-11-11 22:12:39.415402403 +0100
@@ -97,6 +97,14 @@ enum nvme_quirks {
  	 * Force simple suspend/resume path.
  	 */
  	NVME_QUIRK_SIMPLE_SUSPEND		= (1 << 10),
+
+	/*
+	 * For devices that do not consider the DSM 'Number of Ranges'
+	 * field when determining how much data to DMA. Page aligned and
+	 * sized is always sufficient as that is the largest a DSM range
+	 * list can be.
+	 */
+	NVME_QUIRK_DSM_PAGEALIGN		= (1 << 11),
  };

  /*
--- linux-5.3.7/drivers/nvme/host/pci.c.orig	2019-11-10 
18:31:23.685680418 +0100
+++ linux-5.3.7/drivers/nvme/host/pci.c	2019-11-11 22:30:47.248248362 +0100
@@ -3012,12 +3012,28 @@ static const struct pci_device_id nvme_i
  				NVME_QUIRK_DEALLOCATE_ZEROES, },
  	{ PCI_VDEVICE(INTEL, 0xf1a5),	/* Intel 600P/P3100 */
  		.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
-				NVME_QUIRK_MEDIUM_PRIO_SQ },
+				NVME_QUIRK_MEDIUM_PRIO_SQ |
+				NVME_QUIRK_DSM_PAGEALIGN, },
  	{ PCI_VDEVICE(INTEL, 0xf1a6),	/* Intel 760p/Pro 7600p */
-		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
+		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN |
+				NVME_QUIRK_DSM_PAGEALIGN, },
+	{ PCI_VDEVICE(INTEL, 0xf1a8),	/* Intel 660p */
+		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
  	{ PCI_VDEVICE(INTEL, 0x5845),	/* Qemu emulated controller */
  		.driver_data = NVME_QUIRK_IDENTIFY_CNS |
  				NVME_QUIRK_DISABLE_WRITE_ZEROES, },
+	{ PCI_DEVICE(0x126f, 0x2260),   /* Silicon Motion SM2260 */
+		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
+	{ PCI_DEVICE(0x126f, 0x2262),   /* Silicon Motion SM2262 */
+		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
+	{ PCI_DEVICE(0x126f, 0x2263),   /* Silicon Motion SM2263 */
+		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
+	{ PCI_DEVICE(0x126f, 0x2265),   /* Silicon Motion SM2265 */
+		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
+	{ PCI_DEVICE(0x1987, 0x5012),   /* Phison E12 */
+		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
+	{ PCI_DEVICE(0x1987, 0x5016),   /* Phison E16 */
+		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
  	{ PCI_DEVICE(0x1bb1, 0x0100),   /* Seagate Nytro Flash Storage */
  		.driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY, },
  	{ PCI_DEVICE(0x1c58, 0x0003),	/* HGST adapter */
@@ -3038,6 +3054,8 @@ static const struct pci_device_id nvme_i
  		.driver_data = NVME_QUIRK_LIGHTNVM, },
  	{ PCI_DEVICE(0x10ec, 0x5762),   /* ADATA SX6000LNP */
  		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
+	{ PCI_DEVICE(0xc0a9, 0x2263),   /* Crucial P1 (SM2263) */
+		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
  	{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
  	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001) },
  	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v3] Workaround for discard on non-conformant nvme devices
  2019-11-11 22:07 [PATCH v3] Workaround for discard on non-conformant nvme devices Eduard Hasenleithner
@ 2019-11-11 22:37 ` Keith Busch
  2019-11-12  1:31 ` Sagi Grimberg
  2019-11-12 16:37 ` Keith Busch
  2 siblings, 0 replies; 4+ messages in thread
From: Keith Busch @ 2019-11-11 22:37 UTC (permalink / raw)
  To: Eduard Hasenleithner; +Cc: linux-nvme

On Mon, Nov 11, 2019 at 11:07:43PM +0100, Eduard Hasenleithner wrote:
>  	{ PCI_VDEVICE(INTEL, 0xf1a5),	/* Intel 600P/P3100 */
>  		.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
> -				NVME_QUIRK_MEDIUM_PRIO_SQ },
> +				NVME_QUIRK_MEDIUM_PRIO_SQ |
> +				NVME_QUIRK_DSM_PAGEALIGN, },
>  	{ PCI_VDEVICE(INTEL, 0xf1a6),	/* Intel 760p/Pro 7600p */
> -		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
> +		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN |
> +				NVME_QUIRK_DSM_PAGEALIGN, },
> +	{ PCI_VDEVICE(INTEL, 0xf1a8),	/* Intel 660p */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
>  	{ PCI_VDEVICE(INTEL, 0x5845),	/* Qemu emulated controller */
>  		.driver_data = NVME_QUIRK_IDENTIFY_CNS |
>  				NVME_QUIRK_DISABLE_WRITE_ZEROES, },
> +	{ PCI_DEVICE(0x126f, 0x2260),   /* Silicon Motion SM2260 */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
> +	{ PCI_DEVICE(0x126f, 0x2262),   /* Silicon Motion SM2262 */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
> +	{ PCI_DEVICE(0x126f, 0x2263),   /* Silicon Motion SM2263 */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
> +	{ PCI_DEVICE(0x126f, 0x2265),   /* Silicon Motion SM2265 */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
>
> +	{ PCI_DEVICE(0x1987, 0x5012),   /* Phison E12 */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
> +	{ PCI_DEVICE(0x1987, 0x5016),   /* Phison E16 */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
>  	{ PCI_DEVICE(0x1bb1, 0x0100),   /* Seagate Nytro Flash Storage */
>  		.driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY, },
>  	{ PCI_DEVICE(0x1c58, 0x0003),	/* HGST adapter */
> @@ -3038,6 +3054,8 @@ static const struct pci_device_id nvme_i
>  		.driver_data = NVME_QUIRK_LIGHTNVM, },
>  	{ PCI_DEVICE(0x10ec, 0x5762),   /* ADATA SX6000LNP */
>  		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
> +	{ PCI_DEVICE(0xc0a9, 0x2263),   /* Crucial P1 (SM2263) */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },

Aha, I'm seeing a pattern here: SMI makes the controllers for at least
8 of the 10 needing this behavior.

Patch looks fine to me. I'll give this a short time to see if there's
additional comments before committing for the next merge window.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v3] Workaround for discard on non-conformant nvme devices
  2019-11-11 22:07 [PATCH v3] Workaround for discard on non-conformant nvme devices Eduard Hasenleithner
  2019-11-11 22:37 ` Keith Busch
@ 2019-11-12  1:31 ` Sagi Grimberg
  2019-11-12 16:37 ` Keith Busch
  2 siblings, 0 replies; 4+ messages in thread
From: Sagi Grimberg @ 2019-11-12  1:31 UTC (permalink / raw)
  To: Eduard Hasenleithner, linux-nvme

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v3] Workaround for discard on non-conformant nvme devices
  2019-11-11 22:07 [PATCH v3] Workaround for discard on non-conformant nvme devices Eduard Hasenleithner
  2019-11-11 22:37 ` Keith Busch
  2019-11-12  1:31 ` Sagi Grimberg
@ 2019-11-12 16:37 ` Keith Busch
  2 siblings, 0 replies; 4+ messages in thread
From: Keith Busch @ 2019-11-12 16:37 UTC (permalink / raw)
  To: Eduard Hasenleithner; +Cc: linux-nvme

On Mon, Nov 11, 2019 at 11:07:43PM +0100, Eduard Hasenleithner wrote:
> @@ -3012,12 +3012,28 @@ static const struct pci_device_id nvme_i
>  				NVME_QUIRK_DEALLOCATE_ZEROES, },
>  	{ PCI_VDEVICE(INTEL, 0xf1a5),	/* Intel 600P/P3100 */
>  		.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
> -				NVME_QUIRK_MEDIUM_PRIO_SQ },
> +				NVME_QUIRK_MEDIUM_PRIO_SQ |
> +				NVME_QUIRK_DSM_PAGEALIGN, },
>  	{ PCI_VDEVICE(INTEL, 0xf1a6),	/* Intel 760p/Pro 7600p */
> -		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
> +		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN |
> +				NVME_QUIRK_DSM_PAGEALIGN, },
> +	{ PCI_VDEVICE(INTEL, 0xf1a8),	/* Intel 660p */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
>  	{ PCI_VDEVICE(INTEL, 0x5845),	/* Qemu emulated controller */
>  		.driver_data = NVME_QUIRK_IDENTIFY_CNS |
>  				NVME_QUIRK_DISABLE_WRITE_ZEROES, },
> +	{ PCI_DEVICE(0x126f, 0x2260),   /* Silicon Motion SM2260 */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
> +	{ PCI_DEVICE(0x126f, 0x2262),   /* Silicon Motion SM2262 */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
> +	{ PCI_DEVICE(0x126f, 0x2263),   /* Silicon Motion SM2263 */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
> +	{ PCI_DEVICE(0x126f, 0x2265),   /* Silicon Motion SM2265 */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
> +	{ PCI_DEVICE(0x1987, 0x5012),   /* Phison E12 */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
> +	{ PCI_DEVICE(0x1987, 0x5016),   /* Phison E16 */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
>  	{ PCI_DEVICE(0x1bb1, 0x0100),   /* Seagate Nytro Flash Storage */
>  		.driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY, },
>  	{ PCI_DEVICE(0x1c58, 0x0003),	/* HGST adapter */
> @@ -3038,6 +3054,8 @@ static const struct pci_device_id nvme_i
>  		.driver_data = NVME_QUIRK_LIGHTNVM, },
>  	{ PCI_DEVICE(0x10ec, 0x5762),   /* ADATA SX6000LNP */
>  		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
> +	{ PCI_DEVICE(0xc0a9, 0x2263),   /* Crucial P1 (SM2263) */
> +		.driver_data = NVME_QUIRK_DSM_PAGEALIGN, },
>  	{ PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
>  	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001) },
>  	{ PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },

I've done a little more digging on who uses these SM22XX, and this list
is incomplete, and would unlikely ever be complete. So on second thought,
you had the right idea initially to just default to a known safe size,
otherwise we'll be chasing this issue for a while.

Let's revisit your original proposal:

> +++ drivers/nvme/host/core.c	2019-11-04 22:05:54.409415849 +0100
> @@ -561,9 +561,9 @@ static blk_status_t nvme_setup_discard(s
>  	unsigned short segments = blk_rq_nr_discard_segments(req), n = 0;
>  	struct nvme_dsm_range *range;
>  	struct bio *bio;
>+	size_t aligned_size = round_up(sizeof *range * segments, 4096);

The max range defined by spec is 256 ranges by virtue of being an 8-byte
value, we can just hard-code 256 * sizeof(struct nvme_dsm_range) being
4k as the allocation size, or just use ctrl->page_size.

>
>-	range = kmalloc_array(segments, sizeof(*range),
>-				GFP_ATOMIC | __GFP_NOWARN);
>+	range = kmalloc(aligned_size, GFP_ATOMIC | __GFP_NOWARN);

Use kzalloc() since we're only going to fill in a partial of the
allocated memory.

Finally, be sure to update req->special_vec.bv_len to ctrl->page_size as
well. This will ensure the driver maps PRP2 in case the allocation has an
offset. The address is usually aligned, but that's not always true.

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-11 22:07 [PATCH v3] Workaround for discard on non-conformant nvme devices Eduard Hasenleithner
2019-11-11 22:37 ` Keith Busch
2019-11-12  1:31 ` Sagi Grimberg
2019-11-12 16:37 ` Keith Busch

Linux-NVME Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nvme/0 linux-nvme/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nvme linux-nvme/ https://lore.kernel.org/linux-nvme \
		linux-nvme@lists.infradead.org
	public-inbox-index linux-nvme

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-nvme


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git