All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME
@ 2018-06-26  0:10 Bart Van Assche
  2018-06-26  0:10 ` [PATCH 1/3] block: Fix blk_rq_payload_bytes() Bart Van Assche
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Bart Van Assche @ 2018-06-26  0:10 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Christoph Hellwig, Bart Van Assche

Hello Jens,

This series of three patches fixes a crash in the block layer core that I
encountered while retesting the SRP tests in blktests. Please consider these
patches for kernel v4.19.

Thanks,

Bart.

Bart Van Assche (3):
  block: Fix blk_rq_payload_bytes()
  sd: Remove the __data_len hack for WRITE SAME again
  block: Fix blk_end_request_all() for WRITE SAME requests

 block/blk-core.c       | 31 ++++++++++++++++++++++++-------
 block/blk-mq.c         |  2 +-
 block/blk.h            |  2 ++
 drivers/scsi/sd.c      | 18 +-----------------
 include/linux/blkdev.h | 14 +++++++++-----
 5 files changed, 37 insertions(+), 30 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] block: Fix blk_rq_payload_bytes()
  2018-06-26  0:10 [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME Bart Van Assche
@ 2018-06-26  0:10 ` Bart Van Assche
  2018-06-26  0:10 ` [PATCH 2/3] sd: Remove the __data_len hack for WRITE SAME again Bart Van Assche
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Bart Van Assche @ 2018-06-26  0:10 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Christoph Hellwig, Bart Van Assche, Ming Lei

The SCSI sd driver converts a block layer request into a SCSI WRITE
SAME command in the following cases:
1. REQ_OP_DISCARD if the sd driver has been configured to translate
   this request type into WRITE SAME.
2. REQ_OP_WRITE_SAME.
The SCSI sd driver sets RQF_SPECIAL_PAYLOAD in case (1) but not in
case (2). Make sure that blk_rq_payload_bytes() handles both cases
correctly.

Fixes: 2e3258ecfaeb ("block: add blk_rq_payload_bytes")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
---
 include/linux/blkdev.h | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index dcf5e0990bfa..b7519a5c1002 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1091,15 +1091,19 @@ static inline unsigned int blk_rq_zone_is_seq(struct request *rq)
 #endif /* CONFIG_BLK_DEV_ZONED */
 
 /*
- * Some commands like WRITE SAME have a payload or data transfer size which
- * is different from the size of the request.  Any driver that supports such
- * commands using the RQF_SPECIAL_PAYLOAD flag needs to use this helper to
- * calculate the data transfer size.
+ * Some commands like DISCARD and WRITE SAME have a payload size which is
+ * different from the number of bytes affected on the storage medium. Any
+ * driver that supports such commands needs to use this helper to calculate
+ * the data buffer size.
  */
-static inline unsigned int blk_rq_payload_bytes(struct request *rq)
+static inline unsigned int blk_rq_payload_bytes(const struct request *rq)
 {
 	if (rq->rq_flags & RQF_SPECIAL_PAYLOAD)
 		return rq->special_vec.bv_len;
+	if (req_op(rq) == REQ_OP_WRITE_SAME) {
+		WARN_ON_ONCE(rq->bio->bi_vcnt != 1);
+		return rq->bio->bi_io_vec->bv_len;
+	}
 	return blk_rq_bytes(rq);
 }
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] sd: Remove the __data_len hack for WRITE SAME again
  2018-06-26  0:10 [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME Bart Van Assche
  2018-06-26  0:10 ` [PATCH 1/3] block: Fix blk_rq_payload_bytes() Bart Van Assche
@ 2018-06-26  0:10 ` Bart Van Assche
  2018-06-26  0:10 ` [PATCH 3/3] block: Fix blk_end_request_all() for WRITE SAME requests Bart Van Assche
  2018-06-26 15:53 ` [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME Martin K. Petersen
  3 siblings, 0 replies; 8+ messages in thread
From: Bart Van Assche @ 2018-06-26  0:10 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche,
	Martin K . Petersen, Ming Lei

Now that blk_rq_payload_bytes() has been fixed, remove the __data_len
hack again from sd_setup_write_same_cmnd(), the function that handles
REQ_OP_WRITE_SAME. See also commit 08965c2eba13 ("Revert "sd: remove
__data_len hack for WRITE SAME""). See also commit f80de881d8df ("sd:
remove __data_len hack for WRITE SAME").

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/sd.c | 18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 9421d9877730..89da86308aaf 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -947,8 +947,6 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
 	struct bio *bio = rq->bio;
 	sector_t sector = blk_rq_pos(rq);
 	unsigned int nr_sectors = blk_rq_sectors(rq);
-	unsigned int nr_bytes = blk_rq_bytes(rq);
-	int ret;
 
 	if (sdkp->device->no_write_same)
 		return BLKPREP_INVALID;
@@ -975,21 +973,7 @@ static int sd_setup_write_same_cmnd(struct scsi_cmnd *cmd)
 	cmd->transfersize = sdp->sector_size;
 	cmd->allowed = SD_MAX_RETRIES;
 
-	/*
-	 * For WRITE SAME the data transferred via the DATA OUT buffer is
-	 * different from the amount of data actually written to the target.
-	 *
-	 * We set up __data_len to the amount of data transferred via the
-	 * DATA OUT buffer so that blk_rq_map_sg sets up the proper S/G list
-	 * to transfer a single sector of data first, but then reset it to
-	 * the amount of data to be written right after so that the I/O path
-	 * knows how much to actually write.
-	 */
-	rq->__data_len = sdp->sector_size;
-	ret = scsi_init_io(cmd);
-	rq->__data_len = nr_bytes;
-
-	return ret;
+	return scsi_init_io(cmd);
 }
 
 static int sd_setup_flush_cmnd(struct scsi_cmnd *cmd)
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] block: Fix blk_end_request_all() for WRITE SAME requests
  2018-06-26  0:10 [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME Bart Van Assche
  2018-06-26  0:10 ` [PATCH 1/3] block: Fix blk_rq_payload_bytes() Bart Van Assche
  2018-06-26  0:10 ` [PATCH 2/3] sd: Remove the __data_len hack for WRITE SAME again Bart Van Assche
@ 2018-06-26  0:10 ` Bart Van Assche
  2018-06-26 15:53 ` [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME Martin K. Petersen
  3 siblings, 0 replies; 8+ messages in thread
From: Bart Van Assche @ 2018-06-26  0:10 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, Mike Snitzer, Ming Lei

Most REQ_OP_WRITE_SAME requests have a data buffer size that differs
from the number of bytes affected on the storage medium. Since
blk_update_request() expects that its third argument is the number of
bytes that have been completed from the data out buffer, pass that
number fo blk_update_request(). This patch avoids that removing a
path controlled by the dm-mpath driver while mkfs is running triggers
the following kernel bug:

-----------[ cut here ]------------
kernel BUG at block/blk-core.c:3347!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 20 PID: 24369 Comm: mkfs.ext4 Not tainted 4.18.0-rc1-dbg+ #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-pro
ject.org 04/01/2014
RIP: 0010:blk_end_request_all+0x68/0x70
Call Trace:
 <IRQ>
 dm_softirq_done+0x326/0x3d0 [dm_mod]
 blk_done_softirq+0x19b/0x1e0
 __do_softirq+0x128/0x60d
 irq_exit+0x100/0x110
 smp_call_function_single_interrupt+0x90/0x330
 call_function_single_interrupt+0xf/0x20
 </IRQ>

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
---
 block/blk-core.c | 31 ++++++++++++++++++++++++-------
 block/blk-mq.c   |  2 +-
 block/blk.h      |  2 ++
 3 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 81936b9d6d26..6f9483d4b988 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2686,6 +2686,21 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 }
 EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
 
+/**
+ * blk_rq_bio_bytes - sum of all bytes in all bios associated with a request
+ * @rq: request pointer.
+ */
+unsigned int blk_rq_bio_bytes(const struct request *rq)
+{
+	unsigned int bytes = 0;
+	struct bio *bio;
+
+	for (bio = rq->bio; bio; bio = bio->bi_next)
+		bytes += bio->bi_iter.bi_size;
+
+	return bytes;
+}
+
 /**
  * blk_rq_err_bytes - determine number of bytes till the next failure boundary
  * @rq: request to examine
@@ -3080,8 +3095,8 @@ EXPORT_SYMBOL_GPL(blk_steal_bios);
  *     (e.g. request-based dm) so that they can handle partial completion.
  *     Actual device drivers should use blk_end_request instead.
  *
- *     Passing the result of blk_rq_bytes() as @nr_bytes guarantees
- *     %false return from this function.
+ *     Passing the result of blk_rq_bio_bytes() as @nr_bytes guarantees that
+ *     this function will return %false.
  *
  * Return:
  *     %false - this request doesn't have any more data
@@ -3152,7 +3167,7 @@ bool blk_update_request(struct request *req, blk_status_t error,
 		 * If total number of sectors is less than the first segment
 		 * size, something has gone terribly wrong.
 		 */
-		if (blk_rq_bytes(req) < blk_rq_cur_bytes(req)) {
+		if (blk_rq_bio_bytes(req) < blk_rq_cur_bytes(req)) {
 			blk_dump_rq_flags(req, "request botched");
 			req->__data_len = blk_rq_cur_bytes(req);
 		}
@@ -3341,9 +3356,10 @@ void blk_end_request_all(struct request *rq, blk_status_t error)
 	unsigned int bidi_bytes = 0;
 
 	if (unlikely(blk_bidi_rq(rq)))
-		bidi_bytes = blk_rq_bytes(rq->next_rq);
+		bidi_bytes = blk_rq_bio_bytes(rq->next_rq);
 
-	pending = blk_end_bidi_request(rq, error, blk_rq_bytes(rq), bidi_bytes);
+	pending = blk_end_bidi_request(rq, error, blk_rq_bio_bytes(rq),
+				       bidi_bytes);
 	BUG_ON(pending);
 }
 EXPORT_SYMBOL(blk_end_request_all);
@@ -3388,9 +3404,10 @@ void __blk_end_request_all(struct request *rq, blk_status_t error)
 	WARN_ON_ONCE(rq->q->mq_ops);
 
 	if (unlikely(blk_bidi_rq(rq)))
-		bidi_bytes = blk_rq_bytes(rq->next_rq);
+		bidi_bytes = blk_rq_bio_bytes(rq->next_rq);
 
-	pending = __blk_end_bidi_request(rq, error, blk_rq_bytes(rq), bidi_bytes);
+	pending = __blk_end_bidi_request(rq, error, blk_rq_bio_bytes(rq),
+					 bidi_bytes);
 	BUG_ON(pending);
 }
 EXPORT_SYMBOL(__blk_end_request_all);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8c00fcd300b9..852a87895b90 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -543,7 +543,7 @@ EXPORT_SYMBOL(__blk_mq_end_request);
 
 void blk_mq_end_request(struct request *rq, blk_status_t error)
 {
-	if (blk_update_request(rq, error, blk_rq_bytes(rq)))
+	if (blk_update_request(rq, error, blk_rq_bio_bytes(rq)))
 		BUG();
 	__blk_mq_end_request(rq, error);
 }
diff --git a/block/blk.h b/block/blk.h
index a8f0f7986cfd..37a31d992c83 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -149,6 +149,8 @@ static inline void blk_queue_enter_live(struct request_queue *q)
 	percpu_ref_get(&q->q_usage_counter);
 }
 
+unsigned int blk_rq_bio_bytes(const struct request *rq);
+
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 void blk_flush_integrity(void);
 bool __bio_integrity_endio(struct bio *);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME
  2018-06-26  0:10 [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME Bart Van Assche
                   ` (2 preceding siblings ...)
  2018-06-26  0:10 ` [PATCH 3/3] block: Fix blk_end_request_all() for WRITE SAME requests Bart Van Assche
@ 2018-06-26 15:53 ` Martin K. Petersen
  2018-06-26 16:24   ` Bart Van Assche
  2018-06-29  8:58   ` Christoph Hellwig
  3 siblings, 2 replies; 8+ messages in thread
From: Martin K. Petersen @ 2018-06-26 15:53 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Jens Axboe, linux-block, Christoph Hellwig


Hi Bart,

> This series of three patches fixes a crash in the block layer core
> that I encountered while retesting the SRP tests in blktests. Please
> consider these patches for kernel v4.19.

Patches 1 and 2 look fine. However, I'm not sure I'm so keen on patch
3. It looks like it's papering over a more fundamental issue.

Can you elaborate a bit on why the existing code fails with dm in the
mix?

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME
  2018-06-26 15:53 ` [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME Martin K. Petersen
@ 2018-06-26 16:24   ` Bart Van Assche
  2018-06-29  8:58   ` Christoph Hellwig
  1 sibling, 0 replies; 8+ messages in thread
From: Bart Van Assche @ 2018-06-26 16:24 UTC (permalink / raw)
  To: Martin K. Petersen; +Cc: Jens Axboe, linux-block, Christoph Hellwig

On 06/26/18 08:53, Martin K. Petersen wrote:
>> This series of three patches fixes a crash in the block layer core
>> that I encountered while retesting the SRP tests in blktests. Please
>> consider these patches for kernel v4.19.
> 
> Patches 1 and 2 look fine. However, I'm not sure I'm so keen on patch
> 3. It looks like it's papering over a more fundamental issue.
> 
> Can you elaborate a bit on why the existing code fails with dm in the
> mix?

Hello Martin,

The issue I ran into is not specific to dm. The only role of dm in this 
is that I found a way to trigger the reported bug with dm.

A close look at the block layer core learned me that in several 
functions there is confusion between what is called the Data Out buffer 
size in the SCSI spec and the number of bytes that are affected on the 
medium. As you know these two numbers are identical for most commands 
but not for discard nor for write same requests. In the block layer core 
the number of bytes affected on the medium is __data_len. The size of 
the Data Out buffer is one of the following:
* rq->special_vec.bv_len if RQF_SPECIAL_PAYLOAD has been set.
* zero for discard requests for which RQF_SPECIAL_PAYLOAD has not been
   set.
* The cumulative size of all bio's associated with a request for all
   other request types, including write same.

Discard requests are generated and processed as follows:
* __blkdev_issue_discard() allocates a bio and sets .bi_size of that
   bio but does not attach any pages to the bio.
* generic_make_request() sets rq->__data_len by calling
   blk_rq_bio_prep() indirectly.
* The sd driver decides which SCSI command to translate discard requests
   into - UNMAP, WRITE SAME(10) or WRITE SAME(16).

Write same requests are generated and processed as follows:
* __blkdev_issue_write_same() allocates a bio, attaches a zero page to
   the bio and sets .bi_size.
* generic_make_request() sets rq->__data_len by calling
   blk_rq_bio_prep() indirectly.
* The sd driver translates the request into a WRITE SAME(10) or WRITE
   SAME(16) SCSI command.

Upon request completion scsi_finish_command() executes the following code:

	good_bytes = scsi_bufflen(cmd);
	scsi_io_completion(cmd, good_bytes);

In other words, the SCSI core passes the size of the Data Out buffer to 
blk_update_request(). In case of discard and write same requests, that 
number can differ from the number of bytes affected on the medium.

The loop in blk_update_request() iterates over the bio's attached to a 
request and hence finishes as soon as the value passed as the nr_bytes 
argument equals or exceeds the Data Out buffer size.

This is why I think that all blk_end_request_all() variants should pass 
the Data Out buffer size to blk_update_request() instead of the number 
of bytes affected on the medium.

Bart.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME
  2018-06-26 15:53 ` [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME Martin K. Petersen
  2018-06-26 16:24   ` Bart Van Assche
@ 2018-06-29  8:58   ` Christoph Hellwig
  2018-06-29 15:50     ` Bart Van Assche
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2018-06-29  8:58 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Bart Van Assche, Jens Axboe, linux-block, Christoph Hellwig

Maybe it is time to remove REQ_OP_WRITE_SAME finally.  We don't have
a user except for drbd which is passing it from the other side, and
it creates way too many special cases like this.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME
  2018-06-29  8:58   ` Christoph Hellwig
@ 2018-06-29 15:50     ` Bart Van Assche
  0 siblings, 0 replies; 8+ messages in thread
From: Bart Van Assche @ 2018-06-29 15:50 UTC (permalink / raw)
  To: Christoph Hellwig, Martin K. Petersen; +Cc: Jens Axboe, linux-block

[-- Attachment #1: Type: text/plain, Size: 2235 bytes --]

On 06/29/18 01:58, Christoph Hellwig wrote:
> Maybe it is time to remove REQ_OP_WRITE_SAME finally.  We don't have
> a user except for drbd which is passing it from the other side, and
> it creates way too many special cases like this.

Hello Christoph,

If REQ_OP_WRITE_SAME would be removed from the block layer, how is 
software like e.g. LIO ever expected to pass down WRITE SAME requests to 
block devices? What if in the future we would want to add support for a 
new request type to the block layer for which - just like for write same 
commands - the number of bytes affected on the medium differs from the 
data out buffer size? The SCSI spec already support several such 
commands today. One variant of the SCSI VERIFY command verifies multiple 
logical blocks on the medium but does not have a Data Out buffer. 
Another example is the COMPARE AND WRITE command, for which the size of 
the Data Out buffer is the double of the number of bytes affected on the 
medium. It's not impossible that an equivalent of COMPARE AND WRITE will 
ever be added to the NVMe spec.

Have you considered to modify the block layer such that 
blk_rq_payload_bytes() does not have to be made more complicated than 
returning a single member from the request data structure? How about the 
following approach:
* Add a new member to struct request that represents what is called the 
Data Out buffer size in the SCSI specs and keep using __data_len for the 
number of bytes affected on the medium. Modify blk_rq_bio_prep() and 
__blk_rq_prep_clone() such that these functions set both __data_len and 
the new struct request member instead of only __data_len.
* Remove RQF_SPECIAL_PAYLOAD and request.special_vec. Modify all block 
driver code that sets the RQF_SPECIAL_PAYLOAD and request.special_vec 
such that rq->bio is replaced by a bio with the proper payload and such 
that the original bio is restored before completing the request. As you 
are most likely aware there is already code in the block layer that 
replaces and restores the original request bio, namely the code in 
blk-flush.c.

If you would like to see the patches I came up with to add 
REQ_OP_WRITE_SAME support to LIO, I have attached these to this e-mail.

Thanks,

Bart.

[-- Attachment #2: 0001-block-Add-blkdev_submit_write_same.patch --]
[-- Type: text/x-patch, Size: 3142 bytes --]

>From 44c0b07191629e4d9b1cefb43cc9f84e5491bb81 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@wdc.com>
Date: Thu, 28 Jun 2018 10:44:35 -0700
Subject: [PATCH 1/2] block: Add blkdev_submit_write_same()

Add an asynchronous version of blkdev_issue_write_same().

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
---
 block/blk-lib.c        | 37 ++++++++++++++++++++++++++++++++++++-
 include/linux/blkdev.h |  3 +++
 2 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 8faa70f26fcd..680e2dadb2b8 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -212,7 +212,8 @@ static int __blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
  * @page:	page containing data
  *
  * Description:
- *    Issue a write same request for the sectors in question.
+ *    Issue a write same request for the sectors in question and wait until it
+ *    has finished.
  */
 int blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
 				sector_t nr_sects, gfp_t gfp_mask,
@@ -234,6 +235,40 @@ int blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
 }
 EXPORT_SYMBOL(blkdev_issue_write_same);
 
+/**
+ * blkdev_submit_write_same - queue a write same operation
+ * @bdev:	target blockdev
+ * @sector:	start sector
+ * @nr_sects:	number of sectors to write
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ * @page:	page containing data
+ * @bi_end_io:  will be called upon completion
+ * @bi_private: will be stored in the bio->bi_private field of the bio passed
+ *		to @bi_end_io.
+ *
+ * Description:
+ *    Submit a write same request asynchronously for the sectors in question.
+ *    @bi_end_io will be called upon request completion.
+ */
+int blkdev_submit_write_same(struct block_device *bdev, sector_t sector,
+			     sector_t nr_sects, gfp_t gfp_mask,
+			     struct page *page, bio_end_io_t bi_end_io,
+			     void *bi_private)
+{
+	struct bio *bio = NULL;
+	int ret;
+
+	ret = __blkdev_issue_write_same(bdev, sector, nr_sects, gfp_mask, page,
+					&bio);
+	if (ret)
+		return ret;
+	bio->bi_end_io = bi_end_io;
+	bio->bi_private = bi_private;
+	submit_bio(bio);
+	return 0;
+}
+EXPORT_SYMBOL(blkdev_submit_write_same);
+
 static int __blkdev_issue_write_zeroes(struct block_device *bdev,
 		sector_t sector, sector_t nr_sects, gfp_t gfp_mask,
 		struct bio **biop, unsigned flags)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 9154570edf29..771d37c347ea 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1388,6 +1388,9 @@ static inline struct request *blk_map_queue_find_tag(struct blk_queue_tag *bqt,
 extern int blkdev_issue_flush(struct block_device *, gfp_t, sector_t *);
 extern int blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp_mask, struct page *page);
+extern int blkdev_submit_write_same(struct block_device *bdev, sector_t sector,
+		sector_t nr_sects, gfp_t gfp_mask, struct page *page,
+		bio_end_io_t bi_end_io, void *bi_private);
 
 #define BLKDEV_DISCARD_SECURE	(1 << 0)	/* issue a secure erase */
 
-- 
2.17.1


[-- Attachment #3: 0002-target-Use-REQ_OP_WRITE_SAME-to-implement-WRITE-SAME.patch --]
[-- Type: text/x-patch, Size: 2001 bytes --]

>From 3d8c13d953221411d5d803e76e2e6cabd506d0d4 Mon Sep 17 00:00:00 2001
From: Bart Van Assche <bart.vanassche@wdc.com>
Date: Thu, 28 Jun 2018 10:12:58 -0700
Subject: [PATCH 2/2] target: Use REQ_OP_WRITE_SAME to implement WRITE SAME

Use blkdev_submit_write_same() instead of open-coding it.

Note: as one can see in target_alloc_sgl(), the target core sets the
offset in a scatter/gather list to zero for all allocated pages.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
---
 drivers/target/target_core_iblock.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 1bc9b14236d8..fa4dd4f5c8b3 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -465,6 +465,8 @@ iblock_execute_write_same(struct se_cmd *cmd)
 	sector_t block_lba = target_to_linux_sector(dev, cmd->t_task_lba);
 	sector_t sectors = target_to_linux_sector(dev,
 					sbc_get_write_same_sectors(cmd));
+	struct blk_plug plug;
+	int ret;
 
 	if (cmd->prot_op) {
 		pr_err("WRITE_SAME: Protection information with IBLOCK"
@@ -481,6 +483,7 @@ iblock_execute_write_same(struct se_cmd *cmd)
 		return TCM_INVALID_CDB_FIELD;
 	}
 
+	/* 1. Use REQ_OP_WRITE_ZEROES if supported and if appropriate. */
 	if (bdev_write_zeroes_sectors(bdev)) {
 		if (!iblock_execute_zero_out(bdev, cmd))
 			return 0;
@@ -491,6 +494,20 @@ iblock_execute_write_same(struct se_cmd *cmd)
 		goto fail;
 	cmd->priv = ibr;
 
+	/* 2. Try REQ_OP_WRITE_SAME. */
+	blk_start_plug(&plug);
+	ret = blkdev_submit_write_same(bdev, block_lba, sectors, GFP_KERNEL,
+				       sg_page(sg), iblock_bio_done, cmd);
+	blk_finish_plug(&plug);
+	if (ret == 0)
+		return 0;
+	if (ret != -EOPNOTSUPP)
+		goto fail;
+
+	/*
+	 * 3. If neither REQ_OP_WRITE_SAME nor REQ_OP_WRITE_ZEROES are
+	 * supported, use REQ_OP_WRITE.
+	 */
 	bio = iblock_get_bio(cmd, block_lba, 1, REQ_OP_WRITE, 0);
 	if (!bio)
 		goto fail_free_ibr;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-06-29 15:50 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-26  0:10 [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME Bart Van Assche
2018-06-26  0:10 ` [PATCH 1/3] block: Fix blk_rq_payload_bytes() Bart Van Assche
2018-06-26  0:10 ` [PATCH 2/3] sd: Remove the __data_len hack for WRITE SAME again Bart Van Assche
2018-06-26  0:10 ` [PATCH 3/3] block: Fix blk_end_request_all() for WRITE SAME requests Bart Van Assche
2018-06-26 15:53 ` [PATCH 0/3] Fix blk_mq_end_request() and blk_end_request() for WRITE SAME Martin K. Petersen
2018-06-26 16:24   ` Bart Van Assche
2018-06-29  8:58   ` Christoph Hellwig
2018-06-29 15:50     ` Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.