linux-kernel.vger.kernel.org archive mirror
* [PATCH] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue
@ 2019-02-11  5:41 Jianchao Wang
  2019-02-11 15:59 ` Jens Axboe
From: Jianchao Wang @ 2019-02-11  5:41 UTC
  To: axboe; +Cc: linux-block, linux-kernel

When a request is requeued with RQF_DONTPREP set, it already contains
driver-specific data, so insert it into the hctx dispatch list to
prevent it from being merged. Taking scsi as an example, here is the
trace event log (no io scheduler is used, because RQF_STARTED would
prevent merging):

   kworker/0:1H-339   [000] ...1  2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H]
scsi_inert_test-1987  [000] ....  2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test]
scsi_inert_test-1987  [000] ...2  2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test]
   kworker/0:1H-339   [000] ....  2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H]
scsi_inert_test-1996  [000] ..s1  2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0]
scsi_inert_test-1996  [000] .Ns1  2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0]
   kworker/0:1H-339   [000] ...1  2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
   kworker/0:1H-339   [000] ...1  2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
scsi_inert_test-1986  [000] ..s1  2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0]

(32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP set.
It was then merged with (32776 + 8) and issued. Because of RQF_DONTPREP,
the sdb still covered only (32768 + 8), so only that part was
completed. Fortunately, scsi_io_completion detected this and requeued
the remaining part, so no data was corrupted. However, the requeue of
(32776 + 8) should not have happened.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 block/blk-mq.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8f5b533..2d93eb5 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -737,6 +737,18 @@ static void blk_mq_requeue_work(struct work_struct *work)
 	spin_unlock_irq(&q->requeue_lock);
 
 	list_for_each_entry_safe(rq, next, &rq_list, queuelist) {
+		/*
+		 * If RQF_DONTPREP is set, rq already contains driver
+		 * specific data, so insert it into the hctx dispatch list
+		 * to avoid any merge.
+		 */
+		if (rq->rq_flags & RQF_DONTPREP) {
+			rq->rq_flags &= ~RQF_SOFTBARRIER;
+			list_del_init(&rq->queuelist);
+			blk_mq_request_bypass_insert(rq, false);
+			continue;
+		}
+
 		if (!(rq->rq_flags & RQF_SOFTBARRIER))
 			continue;
 
-- 
2.7.4
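
As a rough illustration of the sequence described in the commit message, here is a minimal user-space sketch in C. It is not kernel code: struct demo_rq, demo_back_merge and demo_complete are hypothetical names, and the prepped_sectors field only stands in for the driver data (the sdb in the scsi case) that is not refreshed while RQF_DONTPREP is set.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a request: a sector range plus the length
 * the driver prepared its data buffer for. This is an assumption made
 * for illustration, not the kernel's struct request.
 */
struct demo_rq {
	unsigned long long sector;	/* start sector */
	unsigned int nr_sectors;	/* current length, grows on merge */
	unsigned int prepped_sectors;	/* length captured at prep time */
};

/*
 * Back-merge a range onto the tail of rq. The prepared length is
 * deliberately left untouched, modelling a request that is not
 * prepared again before it is reissued.
 * Returns 1 if the range was contiguous and merged, 0 otherwise.
 */
int demo_back_merge(struct demo_rq *rq, unsigned long long sector,
		    unsigned int nr_sectors)
{
	assert(rq != 0);
	if (rq->sector + rq->nr_sectors != sector)
		return 0;
	rq->nr_sectors += nr_sectors;
	return 1;
}

/*
 * Complete only what was prepared, advance rq past the completed part
 * and return the number of sectors still outstanding.
 */
unsigned int demo_complete(struct demo_rq *rq)
{
	unsigned int done;

	assert(rq != 0);
	done = rq->prepped_sectors < rq->nr_sectors ?
	       rq->prepped_sectors : rq->nr_sectors;
	rq->sector += done;
	rq->nr_sectors -= done;
	return rq->nr_sectors;
}
```

With the values from the trace, merging (32776 + 8) onto a request prepared for (32768 + 8) gives an issued range of 32768 + 16, and demo_complete then leaves 32776 + 8 outstanding, which corresponds to the unexpected requeue.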



* Re: [PATCH] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue
  2019-02-11  5:41 [PATCH] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue Jianchao Wang
@ 2019-02-11 15:59 ` Jens Axboe
  2019-02-11 23:15   ` Jens Axboe
From: Jens Axboe @ 2019-02-11 15:59 UTC
  To: Jianchao Wang; +Cc: linux-block, linux-kernel

On 2/10/19 10:41 PM, Jianchao Wang wrote:
> When requeue, if RQF_DONTPREP, rq has contained some driver
> specific data, so insert it to hctx dispatch list to avoid any
> merge. Take scsi as example, here is the trace event log (no
> io scheduler, because RQF_STARTED would prevent merging),
> 
>    kworker/0:1H-339   [000] ...1  2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H]
> scsi_inert_test-1987  [000] ....  2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test]
> scsi_inert_test-1987  [000] ...2  2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test]
>    kworker/0:1H-339   [000] ....  2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H]
> scsi_inert_test-1996  [000] ..s1  2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0]
> scsi_inert_test-1996  [000] .Ns1  2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0]
>    kworker/0:1H-339   [000] ...1  2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
>    kworker/0:1H-339   [000] ...1  2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
> scsi_inert_test-1986  [000] ..s1  2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0]
> 
> (32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP.
> Then it was merged with (32776 + 8) and issued. Due to RQF_DONTPREP,
> the sdb only contained the part of (32768 + 8), then only that part
> was completed. The lucky thing was that scsi_io_completion detected
> it and requeued the remaining part. So we didn't get corrupted data.
> However, the requeue of (32776 + 8) is not expected.

Good catch, but how about something like this? Makes it more integrated,
I think that's cleaner.


diff --git a/block/blk-mq.c b/block/blk-mq.c
index 44d471ff8754..4c26bbb4330f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -737,12 +737,20 @@ static void blk_mq_requeue_work(struct work_struct *work)
 	spin_unlock_irq(&q->requeue_lock);
 
 	list_for_each_entry_safe(rq, next, &rq_list, queuelist) {
-		if (!(rq->rq_flags & RQF_SOFTBARRIER))
+		/*
+		 * If RQF_DONTPREP is set, rq may contain some driver
+		 * specific data. Insert it to hctx dispatch list to avoid
+		 * any merge.
+		 */
+		if (!(rq->rq_flags & (RQF_SOFTBARRIER | RQF_DONTPREP)))
 			continue;
 
 		rq->rq_flags &= ~RQF_SOFTBARRIER;
 		list_del_init(&rq->queuelist);
-		blk_mq_sched_insert_request(rq, true, false, false);
+		if (rq->rq_flags & RQF_SOFTBARRIER)
+			blk_mq_sched_insert_request(rq, true, false, false);
+		else
+			blk_mq_request_bypass_insert(rq, false);
 	}
 
 	while (!list_empty(&rq_list)) {


-- 
Jens Axboe



* Re: [PATCH] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue
  2019-02-11 15:59 ` Jens Axboe
@ 2019-02-11 23:15   ` Jens Axboe
  2019-02-11 23:20     ` Jens Axboe
From: Jens Axboe @ 2019-02-11 23:15 UTC
  To: Jianchao Wang; +Cc: linux-block, linux-kernel

On 2/11/19 8:59 AM, Jens Axboe wrote:
> On 2/10/19 10:41 PM, Jianchao Wang wrote:
>> When requeue, if RQF_DONTPREP, rq has contained some driver
>> specific data, so insert it to hctx dispatch list to avoid any
>> merge. Take scsi as example, here is the trace event log (no
>> io scheduler, because RQF_STARTED would prevent merging),
>>
>>    kworker/0:1H-339   [000] ...1  2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H]
>> scsi_inert_test-1987  [000] ....  2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test]
>> scsi_inert_test-1987  [000] ...2  2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test]
>>    kworker/0:1H-339   [000] ....  2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H]
>> scsi_inert_test-1996  [000] ..s1  2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0]
>> scsi_inert_test-1996  [000] .Ns1  2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0]
>>    kworker/0:1H-339   [000] ...1  2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
>>    kworker/0:1H-339   [000] ...1  2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
>> scsi_inert_test-1986  [000] ..s1  2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0]
>>
>> (32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP.
>> Then it was merged with (32776 + 8) and issued. Due to RQF_DONTPREP,
>> the sdb only contained the part of (32768 + 8), then only that part
>> was completed. The lucky thing was that scsi_io_completion detected
>> it and requeued the remaining part. So we didn't get corrupted data.
>> However, the requeue of (32776 + 8) is not expected.
> 
> Good catch, but how about something like this? Makes it more integrated,
> I think that's cleaner.

This is probably better (and safer):


diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8f5b533764ca..b3908eb3881c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -737,12 +737,21 @@ static void blk_mq_requeue_work(struct work_struct *work)
 	spin_unlock_irq(&q->requeue_lock);
 
 	list_for_each_entry_safe(rq, next, &rq_list, queuelist) {
-		if (!(rq->rq_flags & RQF_SOFTBARRIER))
+		if (!(rq->rq_flags & (RQF_SOFTBARRIER | RQF_DONTPREP)))
 			continue;
 
 		rq->rq_flags &= ~RQF_SOFTBARRIER;
 		list_del_init(&rq->queuelist);
-		blk_mq_sched_insert_request(rq, true, false, false);
+
+		/*
+		 * If RQF_DONTPREP is set, rq may contain some driver
+		 * specific data. Insert it to hctx dispatch list to avoid
+		 * any merge.
+		 */
+		if (rq->rq_flags & RQF_DONTPREP)
+			blk_mq_sched_insert_request(rq, true, false, false);
+		else
+			blk_mq_request_bypass_insert(rq, false);
 	}
 
 	while (!list_empty(&rq_list)) {

-- 
Jens Axboe



* Re: [PATCH] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue
  2019-02-11 23:15   ` Jens Axboe
@ 2019-02-11 23:20     ` Jens Axboe
  2019-02-12  1:56       ` jianchao.wang
From: Jens Axboe @ 2019-02-11 23:20 UTC
  To: Jianchao Wang; +Cc: linux-block, linux-kernel

On 2/11/19 4:15 PM, Jens Axboe wrote:
> On 2/11/19 8:59 AM, Jens Axboe wrote:
>> On 2/10/19 10:41 PM, Jianchao Wang wrote:
>>> When requeue, if RQF_DONTPREP, rq has contained some driver
>>> specific data, so insert it to hctx dispatch list to avoid any
>>> merge. Take scsi as example, here is the trace event log (no
>>> io scheduler, because RQF_STARTED would prevent merging),
>>>
>>>    kworker/0:1H-339   [000] ...1  2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H]
>>> scsi_inert_test-1987  [000] ....  2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test]
>>> scsi_inert_test-1987  [000] ...2  2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test]
>>>    kworker/0:1H-339   [000] ....  2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H]
>>> scsi_inert_test-1996  [000] ..s1  2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0]
>>> scsi_inert_test-1996  [000] .Ns1  2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0]
>>>    kworker/0:1H-339   [000] ...1  2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
>>>    kworker/0:1H-339   [000] ...1  2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
>>> scsi_inert_test-1986  [000] ..s1  2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0]
>>>
>>> (32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP.
>>> Then it was merged with (32776 + 8) and issued. Due to RQF_DONTPREP,
>>> the sdb only contained the part of (32768 + 8), then only that part
>>> was completed. The lucky thing was that scsi_io_completion detected
>>> it and requeued the remaining part. So we didn't get corrupted data.
>>> However, the requeue of (32776 + 8) is not expected.
>>
>> Good catch, but how about something like this? Makes it more integrated,
>> I think that's cleaner.
> 
> This is probably better (and safer):

Here's the one I wanted to send, not a half done one. Maybe I'll be
luckier this time around?


diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8f5b533764ca..35e6aba52808 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -737,12 +737,21 @@ static void blk_mq_requeue_work(struct work_struct *work)
 	spin_unlock_irq(&q->requeue_lock);
 
 	list_for_each_entry_safe(rq, next, &rq_list, queuelist) {
-		if (!(rq->rq_flags & RQF_SOFTBARRIER))
+		if (!(rq->rq_flags & (RQF_SOFTBARRIER | RQF_DONTPREP)))
 			continue;
 
 		rq->rq_flags &= ~RQF_SOFTBARRIER;
 		list_del_init(&rq->queuelist);
-		blk_mq_sched_insert_request(rq, true, false, false);
+
+		/*
+		 * If RQF_DONTPREP is set, rq may contain some driver
+		 * specific data. Insert it to hctx dispatch list to avoid
+		 * any merge.
+		 */
+		if (rq->rq_flags & RQF_DONTPREP)
+			blk_mq_request_bypass_insert(rq, false);
+		else
+			blk_mq_sched_insert_request(rq, true, false, false);
 	}
 
 	while (!list_empty(&rq_list)) {

-- 
Jens Axboe
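
To make the routing in the diff above easier to follow, here is a minimal user-space sketch in C. The DEMO_ flag values and the demo_ names are assumptions chosen for illustration and are not taken from kernel headers. For comparison it also models the first proposal in this thread, where RQF_SOFTBARRIER is tested after it has already been cleared.

```c
#include <assert.h>

/* Illustrative bit values only (assumptions, not the kernel's). */
#define DEMO_SOFTBARRIER	(1u << 0)
#define DEMO_DONTPREP		(1u << 1)

enum demo_path {
	DEMO_LEFT_ON_LIST,	/* skipped here, handled by the later loop */
	DEMO_SCHED_INSERT,	/* models blk_mq_sched_insert_request() */
	DEMO_BYPASS_INSERT	/* models blk_mq_request_bypass_insert() */
};

/* Loop body as in the diff above: DONTPREP requests bypass the scheduler. */
enum demo_path demo_route(unsigned int *flags)
{
	assert(flags != 0);
	if (!(*flags & (DEMO_SOFTBARRIER | DEMO_DONTPREP)))
		return DEMO_LEFT_ON_LIST;

	*flags &= ~DEMO_SOFTBARRIER;
	if (*flags & DEMO_DONTPREP)
		return DEMO_BYPASS_INSERT;
	return DEMO_SCHED_INSERT;
}

/*
 * Loop body as in the first proposal: the SOFTBARRIER bit is cleared
 * before it is tested, so the scheduler-insert branch is never taken.
 */
enum demo_path demo_route_first_try(unsigned int *flags)
{
	assert(flags != 0);
	if (!(*flags & (DEMO_SOFTBARRIER | DEMO_DONTPREP)))
		return DEMO_LEFT_ON_LIST;

	*flags &= ~DEMO_SOFTBARRIER;
	if (*flags & DEMO_SOFTBARRIER)
		return DEMO_SCHED_INSERT;
	return DEMO_BYPASS_INSERT;
}
```

In this model a request with only the soft-barrier bit goes through the scheduler insert path, a request with the DONTPREP bit goes to the bypass path, and a request with neither bit stays on the list for the while loop that follows.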



* Re: [PATCH] blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue
  2019-02-11 23:20     ` Jens Axboe
@ 2019-02-12  1:56       ` jianchao.wang
From: jianchao.wang @ 2019-02-12  1:56 UTC
  To: Jens Axboe; +Cc: linux-block, linux-kernel

Hi Jens

Thanks for your kind response.

On 2/12/19 7:20 AM, Jens Axboe wrote:
> On 2/11/19 4:15 PM, Jens Axboe wrote:
>> On 2/11/19 8:59 AM, Jens Axboe wrote:
>>> On 2/10/19 10:41 PM, Jianchao Wang wrote:
>>>> When requeue, if RQF_DONTPREP, rq has contained some driver
>>>> specific data, so insert it to hctx dispatch list to avoid any
>>>> merge. Take scsi as example, here is the trace event log (no
>>>> io scheduler, because RQF_STARTED would prevent merging),
>>>>
>>>>    kworker/0:1H-339   [000] ...1  2037.209289: block_rq_insert: 8,0 R 4096 () 32768 + 8 [kworker/0:1H]
>>>> scsi_inert_test-1987  [000] ....  2037.220465: block_bio_queue: 8,0 R 32776 + 8 [scsi_inert_test]
>>>> scsi_inert_test-1987  [000] ...2  2037.220466: block_bio_backmerge: 8,0 R 32776 + 8 [scsi_inert_test]
>>>>    kworker/0:1H-339   [000] ....  2047.220913: block_rq_issue: 8,0 R 8192 () 32768 + 16 [kworker/0:1H]
>>>> scsi_inert_test-1996  [000] ..s1  2047.221007: block_rq_complete: 8,0 R () 32768 + 8 [0]
>>>> scsi_inert_test-1996  [000] .Ns1  2047.221045: block_rq_requeue: 8,0 R () 32776 + 8 [0]
>>>>    kworker/0:1H-339   [000] ...1  2047.221054: block_rq_insert: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
>>>>    kworker/0:1H-339   [000] ...1  2047.221056: block_rq_issue: 8,0 R 4096 () 32776 + 8 [kworker/0:1H]
>>>> scsi_inert_test-1986  [000] ..s1  2047.221119: block_rq_complete: 8,0 R () 32776 + 8 [0]
>>>>
>>>> (32768 + 8) was requeued by scsi_queue_insert and had RQF_DONTPREP.
>>>> Then it was merged with (32776 + 8) and issued. Due to RQF_DONTPREP,
>>>> the sdb only contained the part of (32768 + 8), then only that part
>>>> was completed. The lucky thing was that scsi_io_completion detected
>>>> it and requeued the remaining part. So we didn't get corrupted data.
>>>> However, the requeue of (32776 + 8) is not expected.
>>>
>>> Good catch, but how about something like this? Makes it more integrated,
>>> I think that's cleaner.
>>
>> This is probably better (and safer):
> 
> Here's the one I wanted to send, not a half done one. Maybe I'll be
> luckier this time around?
> 
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 8f5b533764ca..35e6aba52808 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -737,12 +737,21 @@ static void blk_mq_requeue_work(struct work_struct *work)
>  	spin_unlock_irq(&q->requeue_lock);
>  
>  	list_for_each_entry_safe(rq, next, &rq_list, queuelist) {
> -		if (!(rq->rq_flags & RQF_SOFTBARRIER))
> +		if (!(rq->rq_flags & (RQF_SOFTBARRIER | RQF_DONTPREP)))
>  			continue;
>  
>  		rq->rq_flags &= ~RQF_SOFTBARRIER;
>  		list_del_init(&rq->queuelist);
> -		blk_mq_sched_insert_request(rq, true, false, false);
> +
> +		/*
> +		 * If RQF_DONTPREP is set, rq may contain some driver
> +		 * specific data. Insert it to hctx dispatch list to avoid
> +		 * any merge.
> +		 */
> +		if (rq->rq_flags & RQF_DONTPREP)
> +			blk_mq_request_bypass_insert(rq, false);
> +		else
> +			blk_mq_sched_insert_request(rq, true, false, false);
>  	}
>  
>  	while (!list_empty(&rq_list)) {
> 

The test passes with this version.
I will send out a V2 based on it.

Thanks
Jianchao
