* [PATCHv4] Workaround for discard on non-conformant nvme devices
@ 2019-11-12 20:55 Eduard Hasenleithner
  2019-11-12 21:51 ` Keith Busch
  0 siblings, 1 reply; 2+ messages in thread
From: Eduard Hasenleithner @ 2019-11-12 20:55 UTC (permalink / raw)
  To: linux-nvme; +Cc: Keith Busch

As documented in https://bugzilla.kernel.org/show_bug.cgi?id=202665, many Linux nvme users see IOMMU-related errors when performing discard on nvme devices. Analysis so far suggests that the errors are caused by non-conformant nvme devices which read beyond the end of the buffer containing the segments to be discarded.

So far, two variants of this behavior have been observed: SM22XX controllers round the read size up to a multiple of 512 bytes, and Phison E12 controllers unconditionally read the maximum amount of discard segment data allowed by the spec (256 segments, i.e. 4kB). If the start address of the buffer is not aligned to 512B or 4kB, then even though the actual data fits within the remainder of the buffer page, the over-read may cause the controller to access a page which is not mapped in the IOMMU.
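
To make the over-read concrete, here is a rough, illustration-only model (not part of the patch; DSM_RANGE_SIZE, DSM_SEGMENTS_MAX and the helper names are made up) of how many bytes the two observed controller families appear to DMA for a given number of discard segments, compared with what the unpatched driver actually maps:

#include <stddef.h>
#include <stdio.h>

#define DSM_RANGE_SIZE		16	/* bytes per discard segment descriptor */
#define DSM_SEGMENTS_MAX	256	/* spec maximum per DSM command */

/* What the unpatched driver allocates and maps: only the used segments. */
static size_t mapped_bytes(unsigned short segments)
{
	return (size_t)segments * DSM_RANGE_SIZE;
}

/* Observed SM22XX behavior: read size rounded up to a multiple of 512 bytes. */
static size_t sm22xx_read_bytes(unsigned short segments)
{
	size_t used = (size_t)segments * DSM_RANGE_SIZE;

	return (used + 511) & ~(size_t)511;
}

/* Observed Phison E12 behavior: always reads the spec maximum (256 segments). */
static size_t phison_e12_read_bytes(unsigned short segments)
{
	(void)segments;
	return (size_t)DSM_SEGMENTS_MAX * DSM_RANGE_SIZE;
}

int main(void)
{
	unsigned short segments = 1;	/* e.g. a single discard range */

	printf("mapped: %zu, SM22XX reads: %zu, E12 reads: %zu\n",
	       mapped_bytes(segments), sm22xx_read_bytes(segments),
	       phison_e12_read_bytes(segments));
	return 0;
}

For a single segment this prints mapped: 16, SM22XX reads: 512, E12 reads: 4096; if those extra bytes cross into a page that was never mapped by the IOMMU, the read faults.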

The patch changes nvme_setup_discard() to unconditionally allocate memory for a segment array of the maximum size (256 segments). This prevents the device from reading beyond the end of the IOMMU-mapped buffer.

Signed-off-by: Eduard Hasenleithner <eduard@hasenleithner.at>

--- linux-5.3.7/drivers/nvme/host/core.c.orig	2019-11-12 20:42:16.394800789 +0100
+++ linux-5.3.7/drivers/nvme/host/core.c	2019-11-12 21:23:07.635266361 +0100
@@ -555,15 +555,22 @@ static inline void nvme_setup_flush(stru
 	cmnd->common.nsid = cpu_to_le32(ns->head->ns_id);
 }
 
+#define DSM_SEGMENTS_MAX 256
+
 static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 		struct nvme_command *cmnd)
 {
 	unsigned short segments = blk_rq_nr_discard_segments(req), n = 0;
 	struct nvme_dsm_range *range;
 	struct bio *bio;
+	/* Some devices do not consider the DSM 'Number of Ranges'
+	 * field when determining how much data to DMA. Always allocate
+	 * memory for maximum number of segments to prevent device
+	 * reading beyond end of buffer. */
+	static const size_t alloc_size = sizeof *range * DSM_SEGMENTS_MAX;
 
-	range = kmalloc_array(segments, sizeof(*range),
-				GFP_ATOMIC | __GFP_NOWARN);
+	WARN_ON(segments > DSM_SEGMENTS_MAX);
+	range = kzalloc(alloc_size, GFP_ATOMIC | __GFP_NOWARN);
 	if (!range) {
 		/*
 		 * If we fail allocation our range, fallback to the controller
@@ -603,7 +610,7 @@ static blk_status_t nvme_setup_discard(s
 
 	req->special_vec.bv_page = virt_to_page(range);
 	req->special_vec.bv_offset = offset_in_page(range);
-	req->special_vec.bv_len = sizeof(*range) * segments;
+	req->special_vec.bv_len = alloc_size;
 	req->rq_flags |= RQF_SPECIAL_PAYLOAD;
 
 	return BLK_STS_OK;
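
For reference (again not part of the patch): one DSM range descriptor is 16 bytes, so the fixed-size allocation is 256 * 16 = 4096 bytes, exactly the 4kB worst case a non-conformant device has been seen to read. A standalone sketch of that arithmetic, using a stand-in struct with the spec layout rather than the kernel's struct nvme_dsm_range:

#include <stdint.h>

/* Stand-in for one DSM range descriptor as laid out by the NVMe spec. */
struct dsm_range_example {
	uint32_t cattr;	/* context attributes */
	uint32_t nlb;	/* number of logical blocks */
	uint64_t slba;	/* starting LBA */
};

#define DSM_SEGMENTS_MAX 256

/* One descriptor is 16 bytes... */
_Static_assert(sizeof(struct dsm_range_example) == 16,
	       "one DSM range descriptor is 16 bytes");
/* ...so the maximum payload is exactly one 4kB page, which the patch's
 * fixed-size allocation always covers. */
_Static_assert(DSM_SEGMENTS_MAX * sizeof(struct dsm_range_example) == 4096,
	       "maximum DSM payload is 4kB");

int main(void)
{
	return 0;
}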

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
