linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] block: export __blk_complete_request
@ 2018-06-15  1:57 Jianchao Wang
  2018-06-15  1:57 ` [PATCH 2/2] scsi_transport_fc: use __blk_complete_request in fc_bsg_job_timeout Jianchao Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Jianchao Wang @ 2018-06-15  1:57 UTC (permalink / raw)
  To: axboe, hch; +Cc: jejb, martin.petersen, linux-block, linux-scsi, linux-kernel

After f6e7d48 (block: remove BLK_EH_HANDLED), LLDD is responsible
to complete the timed out request, however, for blk-legacy, the
'complete' is still marked, blk_complete_request will do nothing,
we export __blk_complete_request for LLDD to complete the request
in timeout path.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 block/blk-softirq.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/blk-softirq.c b/block/blk-softirq.c
index 01e2b35..15c1f5e 100644
--- a/block/blk-softirq.c
+++ b/block/blk-softirq.c
@@ -144,6 +144,7 @@ void __blk_complete_request(struct request *req)
 
 	local_irq_restore(flags);
 }
+EXPORT_SYMBOL(__blk_complete_request);
 
 /**
  * blk_complete_request - end I/O on a request
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] scsi_transport_fc: use __blk_complete_request in fc_bsg_job_timeout
  2018-06-15  1:57 [PATCH 1/2] block: export __blk_complete_request Jianchao Wang
@ 2018-06-15  1:57 ` Jianchao Wang
  2018-06-15  2:17 ` [PATCH 1/2] block: export __blk_complete_request Ming Lei
  2018-06-19 14:09 ` Christoph Hellwig
  2 siblings, 0 replies; 15+ messages in thread
From: Jianchao Wang @ 2018-06-15  1:57 UTC (permalink / raw)
  To: axboe, hch; +Cc: jejb, martin.petersen, linux-block, linux-scsi, linux-kernel

bsg is based on blk-legacy, so we should use blk-legacy interface
here. On the other hand, for blk-legacy, the timed out request
has 'complete' mark, so use __blk_complete_request.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 drivers/scsi/scsi_transport_fc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 1da3d71..1394810 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3592,7 +3592,7 @@ fc_bsg_job_timeout(struct request *req)
 
 	/* the blk_end_sync_io() doesn't check the error */
 	if (inflight)
-		blk_mq_complete_request(req);
+		__blk_complete_request(req);
 	return BLK_EH_DONE;
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-15  1:57 [PATCH 1/2] block: export __blk_complete_request Jianchao Wang
  2018-06-15  1:57 ` [PATCH 2/2] scsi_transport_fc: use __blk_complete_request in fc_bsg_job_timeout Jianchao Wang
@ 2018-06-15  2:17 ` Ming Lei
  2018-06-15  2:22   ` jianchao.wang
  2018-06-19 14:09 ` Christoph Hellwig
  2 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2018-06-15  2:17 UTC (permalink / raw)
  To: Jianchao Wang
  Cc: Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, linux-block, Linux SCSI List,
	Linux Kernel Mailing List

On Fri, Jun 15, 2018 at 9:57 AM, Jianchao Wang
<jianchao.w.wang@oracle.com> wrote:
> After f6e7d48 (block: remove BLK_EH_HANDLED), LLDD is responsible
> to complete the timed out request, however, for blk-legacy, the
> 'complete' is still marked, blk_complete_request will do nothing,
> we export __blk_complete_request for LLDD to complete the request
> in timeout path.
>
> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> ---
>  block/blk-softirq.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/block/blk-softirq.c b/block/blk-softirq.c
> index 01e2b35..15c1f5e 100644
> --- a/block/blk-softirq.c
> +++ b/block/blk-softirq.c
> @@ -144,6 +144,7 @@ void __blk_complete_request(struct request *req)
>
>         local_irq_restore(flags);
>  }
> +EXPORT_SYMBOL(__blk_complete_request);
>
>  /**
>   * blk_complete_request - end I/O on a request
> --
> 2.7.4
>

Looks non-blk-mq timeout code need to convert to ref-counter
based approach too?

Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-15  2:17 ` [PATCH 1/2] block: export __blk_complete_request Ming Lei
@ 2018-06-15  2:22   ` jianchao.wang
  2018-06-15  2:44     ` jianchao.wang
  2018-06-15  2:49     ` Ming Lei
  0 siblings, 2 replies; 15+ messages in thread
From: jianchao.wang @ 2018-06-15  2:22 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, linux-block, Linux SCSI List,
	Linux Kernel Mailing List

Hi Ming

On 06/15/2018 10:17 AM, Ming Lei wrote:
> On Fri, Jun 15, 2018 at 9:57 AM, Jianchao Wang
> <jianchao.w.wang@oracle.com> wrote:
>> After f6e7d48 (block: remove BLK_EH_HANDLED), LLDD is responsible
>> to complete the timed out request, however, for blk-legacy, the
>> 'complete' is still marked, blk_complete_request will do nothing,
>> we export __blk_complete_request for LLDD to complete the request
>> in timeout path.
>>
>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>> ---
>>  block/blk-softirq.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/block/blk-softirq.c b/block/blk-softirq.c
>> index 01e2b35..15c1f5e 100644
>> --- a/block/blk-softirq.c
>> +++ b/block/blk-softirq.c
>> @@ -144,6 +144,7 @@ void __blk_complete_request(struct request *req)
>>
>>         local_irq_restore(flags);
>>  }
>> +EXPORT_SYMBOL(__blk_complete_request);
>>
>>  /**
>>   * blk_complete_request - end I/O on a request
>> --
>> 2.7.4
>>
> 
> Looks non-blk-mq timeout code need to convert to ref-counter
> based approach too?

IMO, ref-counter is just to fix the blk-mq req life recycle issue.
It cannot replace the blk_mark_rq_complete which could avoid the race between
timeout and io completion path.
Or maybe my understanding is wrong ...

Thanks
Jianchao
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-15  2:22   ` jianchao.wang
@ 2018-06-15  2:44     ` jianchao.wang
  2018-06-15  2:56       ` Ming Lei
  2018-06-15  2:49     ` Ming Lei
  1 sibling, 1 reply; 15+ messages in thread
From: jianchao.wang @ 2018-06-15  2:44 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, linux-block, Linux SCSI List,
	Linux Kernel Mailing List



On 06/15/2018 10:22 AM, jianchao.wang wrote:
> Hi Ming
> 
> On 06/15/2018 10:17 AM, Ming Lei wrote:
>> On Fri, Jun 15, 2018 at 9:57 AM, Jianchao Wang
>> <jianchao.w.wang@oracle.com> wrote:
>>> After f6e7d48 (block: remove BLK_EH_HANDLED), LLDD is responsible
>>> to complete the timed out request, however, for blk-legacy, the
>>> 'complete' is still marked, blk_complete_request will do nothing,
>>> we export __blk_complete_request for LLDD to complete the request
>>> in timeout path.
>>>
>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>>> ---
>>>  block/blk-softirq.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/block/blk-softirq.c b/block/blk-softirq.c
>>> index 01e2b35..15c1f5e 100644
>>> --- a/block/blk-softirq.c
>>> +++ b/block/blk-softirq.c
>>> @@ -144,6 +144,7 @@ void __blk_complete_request(struct request *req)
>>>
>>>         local_irq_restore(flags);
>>>  }
>>> +EXPORT_SYMBOL(__blk_complete_request);
>>>
>>>  /**
>>>   * blk_complete_request - end I/O on a request
>>> --
>>> 2.7.4
>>>
>>
>> Looks non-blk-mq timeout code need to convert to ref-counter
>> based approach too?
> 
> IMO, ref-counter is just to fix the blk-mq req life recycle issue.
> It cannot replace the blk_mark_rq_complete which could avoid the race between
> timeout and io completion path.

The .timeout return BLK_EH_DONE doesn't always mean the request has been completed.
Such as scsi-mid layer, its .timeout callback return BLK_EH_DONE but the timed out
request is still in abort or eh process. What if a completion irq come during that ? 

> Or maybe my understanding is wrong ...
> 
> Thanks
> Jianchao
>>
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-15  2:22   ` jianchao.wang
  2018-06-15  2:44     ` jianchao.wang
@ 2018-06-15  2:49     ` Ming Lei
  1 sibling, 0 replies; 15+ messages in thread
From: Ming Lei @ 2018-06-15  2:49 UTC (permalink / raw)
  To: jianchao.wang
  Cc: Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, linux-block, Linux SCSI List,
	Linux Kernel Mailing List

On Fri, Jun 15, 2018 at 10:22 AM, jianchao.wang
<jianchao.w.wang@oracle.com> wrote:
> Hi Ming
>
> On 06/15/2018 10:17 AM, Ming Lei wrote:
>> On Fri, Jun 15, 2018 at 9:57 AM, Jianchao Wang
>> <jianchao.w.wang@oracle.com> wrote:
>>> After f6e7d48 (block: remove BLK_EH_HANDLED), LLDD is responsible
>>> to complete the timed out request, however, for blk-legacy, the
>>> 'complete' is still marked, blk_complete_request will do nothing,
>>> we export __blk_complete_request for LLDD to complete the request
>>> in timeout path.
>>>
>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>>> ---
>>>  block/blk-softirq.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/block/blk-softirq.c b/block/blk-softirq.c
>>> index 01e2b35..15c1f5e 100644
>>> --- a/block/blk-softirq.c
>>> +++ b/block/blk-softirq.c
>>> @@ -144,6 +144,7 @@ void __blk_complete_request(struct request *req)
>>>
>>>         local_irq_restore(flags);
>>>  }
>>> +EXPORT_SYMBOL(__blk_complete_request);
>>>
>>>  /**
>>>   * blk_complete_request - end I/O on a request
>>> --
>>> 2.7.4
>>>
>>
>> Looks non-blk-mq timeout code need to convert to ref-counter
>> based approach too?
>
> IMO, ref-counter is just to fix the blk-mq req life recycle issue.

Just thought of that, it is one blk-mq specific issue.

> It cannot replace the blk_mark_rq_complete which could avoid the race between
> timeout and io completion path.
> Or maybe my understanding is wrong ...

I didn't mean that this patch is unnecessary.

But the question is that given driver has to deal with race between timeout
and normal completion, why don't you follow blk-mq's way to move
the atomic state change into __blk_complete_request()?


Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-15  2:44     ` jianchao.wang
@ 2018-06-15  2:56       ` Ming Lei
  2018-06-15  3:04         ` jianchao.wang
  0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2018-06-15  2:56 UTC (permalink / raw)
  To: jianchao.wang
  Cc: Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, linux-block, Linux SCSI List,
	Linux Kernel Mailing List

On Fri, Jun 15, 2018 at 10:44 AM, jianchao.wang
<jianchao.w.wang@oracle.com> wrote:
>
>
> On 06/15/2018 10:22 AM, jianchao.wang wrote:
>> Hi Ming
>>
>> On 06/15/2018 10:17 AM, Ming Lei wrote:
>>> On Fri, Jun 15, 2018 at 9:57 AM, Jianchao Wang
>>> <jianchao.w.wang@oracle.com> wrote:
>>>> After f6e7d48 (block: remove BLK_EH_HANDLED), LLDD is responsible
>>>> to complete the timed out request, however, for blk-legacy, the
>>>> 'complete' is still marked, blk_complete_request will do nothing,
>>>> we export __blk_complete_request for LLDD to complete the request
>>>> in timeout path.
>>>>
>>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>>>> ---
>>>>  block/blk-softirq.c | 1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/block/blk-softirq.c b/block/blk-softirq.c
>>>> index 01e2b35..15c1f5e 100644
>>>> --- a/block/blk-softirq.c
>>>> +++ b/block/blk-softirq.c
>>>> @@ -144,6 +144,7 @@ void __blk_complete_request(struct request *req)
>>>>
>>>>         local_irq_restore(flags);
>>>>  }
>>>> +EXPORT_SYMBOL(__blk_complete_request);
>>>>
>>>>  /**
>>>>   * blk_complete_request - end I/O on a request
>>>> --
>>>> 2.7.4
>>>>
>>>
>>> Looks non-blk-mq timeout code need to convert to ref-counter
>>> based approach too?
>>
>> IMO, ref-counter is just to fix the blk-mq req life recycle issue.
>> It cannot replace the blk_mark_rq_complete which could avoid the race between
>> timeout and io completion path.
>
> The .timeout return BLK_EH_DONE doesn't always mean the request has been completed.
> Such as scsi-mid layer, its .timeout callback return BLK_EH_DONE but the timed out
> request is still in abort or eh process. What if a completion irq come during that ?

For blk-mq, it is avoided by the atomic state change in
__blk_mq_complete_request(),
that is why I mentioned the question in my last reply.

But what if the timed-out request has been freed by EH? Then seems
req's ref_counter
is still needed for non-mq?


Thanks,
Ming Lei

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-15  2:56       ` Ming Lei
@ 2018-06-15  3:04         ` jianchao.wang
  2018-06-15  3:20           ` Ming Lei
  0 siblings, 1 reply; 15+ messages in thread
From: jianchao.wang @ 2018-06-15  3:04 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, linux-block, Linux SCSI List,
	Linux Kernel Mailing List

Hi Ming

Thanks for your kindly response.

On 06/15/2018 10:56 AM, Ming Lei wrote:
>>> IMO, ref-counter is just to fix the blk-mq req life recycle issue.
>>> It cannot replace the blk_mark_rq_complete which could avoid the race between
>>> timeout and io completion path.
>> The .timeout return BLK_EH_DONE doesn't always mean the request has been completed.
>> Such as scsi-mid layer, its .timeout callback return BLK_EH_DONE but the timed out
>> request is still in abort or eh process. What if a completion irq come during that ?
> For blk-mq, it is avoided by the atomic state change in
> __blk_mq_complete_request(),
> that is why I mentioned the question in my last reply.
> 

but blk_mq_check_expired doesn't do that.
do I miss anything ?

> But what if the timed-out request has been freed by EH? Then seems
> req's ref_counter

Thanks
Jianchao

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-15  3:04         ` jianchao.wang
@ 2018-06-15  3:20           ` Ming Lei
  2018-06-15  3:26             ` jianchao.wang
  2018-06-15 11:58             ` Christoph Hellwig
  0 siblings, 2 replies; 15+ messages in thread
From: Ming Lei @ 2018-06-15  3:20 UTC (permalink / raw)
  To: jianchao.wang
  Cc: Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, linux-block, Linux SCSI List,
	Linux Kernel Mailing List

On Fri, Jun 15, 2018 at 11:04 AM, jianchao.wang
<jianchao.w.wang@oracle.com> wrote:
> Hi Ming
>
> Thanks for your kindly response.
>
> On 06/15/2018 10:56 AM, Ming Lei wrote:
>>>> IMO, ref-counter is just to fix the blk-mq req life recycle issue.
>>>> It cannot replace the blk_mark_rq_complete which could avoid the race between
>>>> timeout and io completion path.
>>> The .timeout return BLK_EH_DONE doesn't always mean the request has been completed.
>>> Such as scsi-mid layer, its .timeout callback return BLK_EH_DONE but the timed out
>>> request is still in abort or eh process. What if a completion irq come during that ?
>> For blk-mq, it is avoided by the atomic state change in
>> __blk_mq_complete_request(),
>> that is why I mentioned the question in my last reply.
>>
>
> but blk_mq_check_expired doesn't do that.
> do I miss anything ?

Right, that is the difference between blk-mq and legacy now, then if scsi-mq
drivers can work well, they should work well with the following change in the
non-mq mode:

diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 4b8a48d48ba1..9fce09d55652 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -88,7 +88,6 @@ static void blk_rq_timed_out(struct request *req)
        switch (ret) {
        case BLK_EH_RESET_TIMER:
                blk_add_timer(req);
-               blk_clear_rq_complete(req);
                break;
        case BLK_EH_DONE:
                /*
@@ -115,8 +114,7 @@ static void blk_rq_check_expired(struct request
*rq, unsigned long *next_timeout
                /*
                 * Check if we raced with end io completion
                 */
-               if (!blk_mark_rq_complete(rq))
-                       blk_rq_timed_out(rq);
+               blk_rq_timed_out(rq);
        } else if (!*next_set || time_after(*next_timeout, deadline)) {
                *next_timeout = deadline;
                *next_set = 1;


Thanks,
Ming Lei

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-15  3:20           ` Ming Lei
@ 2018-06-15  3:26             ` jianchao.wang
  2018-06-15  4:03               ` Ming Lei
  2018-06-15 11:58             ` Christoph Hellwig
  1 sibling, 1 reply; 15+ messages in thread
From: jianchao.wang @ 2018-06-15  3:26 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, linux-block, Linux SCSI List,
	Linux Kernel Mailing List

Hi Ming

On 06/15/2018 11:20 AM, Ming Lei wrote:
> On Fri, Jun 15, 2018 at 11:04 AM, jianchao.wang
> <jianchao.w.wang@oracle.com> wrote:
>> Hi Ming
>>
>> Thanks for your kindly response.
>>
>> On 06/15/2018 10:56 AM, Ming Lei wrote:
>>>>> IMO, ref-counter is just to fix the blk-mq req life recycle issue.
>>>>> It cannot replace the blk_mark_rq_complete which could avoid the race between
>>>>> timeout and io completion path.
>>>> The .timeout return BLK_EH_DONE doesn't always mean the request has been completed.
>>>> Such as scsi-mid layer, its .timeout callback return BLK_EH_DONE but the timed out
>>>> request is still in abort or eh process. What if a completion irq come during that ?
>>> For blk-mq, it is avoided by the atomic state change in
>>> __blk_mq_complete_request(),
>>> that is why I mentioned the question in my last reply.
>>>
>>
>> but blk_mq_check_expired doesn't do that.
>> do I miss anything ?
> 
> Right, that is the difference between blk-mq and legacy now, 

Sorry, I cannot follow your point.
blk_mq_check_expired doesn't do a atmoc state change from IN-FLIGHT to COMPLETE.
__blk_mq_complete_request could still proceed to complete a timed out request
which is in scsi abort or eh process. Is it really OK ?


Thanks
Jianchao


> then if scsi-mq
> drivers can work well, they should work well with the following change in the
> non-mq mode:
> 
> diff --git a/block/blk-timeout.c b/block/blk-timeout.c
> index 4b8a48d48ba1..9fce09d55652 100644
> --- a/block/blk-timeout.c
> +++ b/block/blk-timeout.c
> @@ -88,7 +88,6 @@ static void blk_rq_timed_out(struct request *req)
>         switch (ret) {
>         case BLK_EH_RESET_TIMER:
>                 blk_add_timer(req);
> -               blk_clear_rq_complete(req);
>                 break;
>         case BLK_EH_DONE:
>                 /*
> @@ -115,8 +114,7 @@ static void blk_rq_check_expired(struct request
> *rq, unsigned long *next_timeout
>                 /*
>                  * Check if we raced with end io completion
>                  */
> -               if (!blk_mark_rq_complete(rq))
> -                       blk_rq_timed_out(rq);
> +               blk_rq_timed_out(rq);
>         } else if (!*next_set || time_after(*next_timeout, deadline)) {
>                 *next_timeout = deadline;
>                 *next_set = 1;
> 
> 
> Thanks,
> Ming Lei
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-15  3:26             ` jianchao.wang
@ 2018-06-15  4:03               ` Ming Lei
  2018-06-15  5:10                 ` jianchao.wang
  0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2018-06-15  4:03 UTC (permalink / raw)
  To: jianchao.wang
  Cc: Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, linux-block, Linux SCSI List,
	Linux Kernel Mailing List

On Fri, Jun 15, 2018 at 11:26 AM, jianchao.wang
<jianchao.w.wang@oracle.com> wrote:
> Hi Ming
>
> On 06/15/2018 11:20 AM, Ming Lei wrote:
>> On Fri, Jun 15, 2018 at 11:04 AM, jianchao.wang
>> <jianchao.w.wang@oracle.com> wrote:
>>> Hi Ming
>>>
>>> Thanks for your kindly response.
>>>
>>> On 06/15/2018 10:56 AM, Ming Lei wrote:
>>>>>> IMO, ref-counter is just to fix the blk-mq req life recycle issue.
>>>>>> It cannot replace the blk_mark_rq_complete which could avoid the race between
>>>>>> timeout and io completion path.
>>>>> The .timeout return BLK_EH_DONE doesn't always mean the request has been completed.
>>>>> Such as scsi-mid layer, its .timeout callback return BLK_EH_DONE but the timed out
>>>>> request is still in abort or eh process. What if a completion irq come during that ?
>>>> For blk-mq, it is avoided by the atomic state change in
>>>> __blk_mq_complete_request(),
>>>> that is why I mentioned the question in my last reply.
>>>>
>>>
>>> but blk_mq_check_expired doesn't do that.
>>> do I miss anything ?
>>
>> Right, that is the difference between blk-mq and legacy now,
>
> Sorry, I cannot follow your point.
> blk_mq_check_expired doesn't do a atmoc state change from IN-FLIGHT to COMPLETE.
> __blk_mq_complete_request could still proceed to complete a timed out request
> which is in scsi abort or eh process. Is it really OK ?

That is the idea of Christoph's patchset of 'complete requests from ->timeout',
then drivers need to cover race between timeout and normal completeion.

But at least the request won't be completed twice because of the atomic
state change in __blk_mq_complete_request().

So what is your real concern about blk-mq's timeout?

Thanks,
Ming

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-15  4:03               ` Ming Lei
@ 2018-06-15  5:10                 ` jianchao.wang
  0 siblings, 0 replies; 15+ messages in thread
From: jianchao.wang @ 2018-06-15  5:10 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, linux-block, Linux SCSI List,
	Linux Kernel Mailing List

Hi Ming

Thanks for your kindly response

On 06/15/2018 12:03 PM, Ming Lei wrote:
> On Fri, Jun 15, 2018 at 11:26 AM, jianchao.wang
> <jianchao.w.wang@oracle.com> wrote:
>> Hi Ming
>>
>> On 06/15/2018 11:20 AM, Ming Lei wrote:
>>> On Fri, Jun 15, 2018 at 11:04 AM, jianchao.wang
>>> <jianchao.w.wang@oracle.com> wrote:
>>>> Hi Ming
>>>>
>>>> Thanks for your kindly response.
>>>>
>>>> On 06/15/2018 10:56 AM, Ming Lei wrote:
>>>>>>> IMO, ref-counter is just to fix the blk-mq req life recycle issue.
>>>>>>> It cannot replace the blk_mark_rq_complete which could avoid the race between
>>>>>>> timeout and io completion path.
>>>>>> The .timeout return BLK_EH_DONE doesn't always mean the request has been completed.
>>>>>> Such as scsi-mid layer, its .timeout callback return BLK_EH_DONE but the timed out
>>>>>> request is still in abort or eh process. What if a completion irq come during that ?
>>>>> For blk-mq, it is avoided by the atomic state change in
>>>>> __blk_mq_complete_request(),
>>>>> that is why I mentioned the question in my last reply.
>>>>>
>>>>
>>>> but blk_mq_check_expired doesn't do that.
>>>> do I miss anything ?
>>>
>>> Right, that is the difference between blk-mq and legacy now,
>>
>> Sorry, I cannot follow your point.
>> blk_mq_check_expired doesn't do a atmoc state change from IN-FLIGHT to COMPLETE.
>> __blk_mq_complete_request could still proceed to complete a timed out request
>> which is in scsi abort or eh process. Is it really OK ?
> > That is the idea of Christoph's patchset of 'complete requests from ->timeout',

Yes, I used to read that mail thread.

> then drivers need to cover race between timeout and normal completeion.
> 
> But at least the request won't be completed twice because of the atomic
> state change in __blk_mq_complete_request().

Yes

> 
> So what is your real concern about blk-mq's timeout?
I concern whether the current drivers have bee ready for taking this task currently.
At least, for scsi, if I try to trigger timeout and completion path concurrently, system would crash.
4.17.rc7 or 4.18 with a patch that change state in blk_mq_check_expired will survive.

Thanks
jianchao 

> 
> Thanks,
> Ming
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-15  3:20           ` Ming Lei
  2018-06-15  3:26             ` jianchao.wang
@ 2018-06-15 11:58             ` Christoph Hellwig
  1 sibling, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2018-06-15 11:58 UTC (permalink / raw)
  To: Ming Lei
  Cc: jianchao.wang, Jens Axboe, Christoph Hellwig, James Bottomley,
	Martin K. Petersen, linux-block, Linux SCSI List,
	Linux Kernel Mailing List

On Fri, Jun 15, 2018 at 11:20:40AM +0800, Ming Lei wrote:
> > but blk_mq_check_expired doesn't do that.
> > do I miss anything ?
> 
> Right, that is the difference between blk-mq and legacy now, then if scsi-mq
> drivers can work well, they should work well with the following change in the
> non-mq mode:

We'll still need referene counting against reuse and/or premature
freeing of requests.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-15  1:57 [PATCH 1/2] block: export __blk_complete_request Jianchao Wang
  2018-06-15  1:57 ` [PATCH 2/2] scsi_transport_fc: use __blk_complete_request in fc_bsg_job_timeout Jianchao Wang
  2018-06-15  2:17 ` [PATCH 1/2] block: export __blk_complete_request Ming Lei
@ 2018-06-19 14:09 ` Christoph Hellwig
  2018-06-19 14:52   ` jianchao.wang
  2 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2018-06-19 14:09 UTC (permalink / raw)
  To: Jianchao Wang
  Cc: axboe, hch, jejb, martin.petersen, linux-block, linux-scsi, linux-kernel

Does the patch below fix your FC issue?

---
From 5e5b4fc51c84a0f5c27f2f770be7a4eaed0f6e8c Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Tue, 19 Jun 2018 13:59:52 +0200
Subject: block: fix timeout changes for legacy request drivers

blk_mq_complete_request can only be called for blk-mq drivers, but when
removing the BLK_EH_HANDLED return value, two legacy request timeout
methods incorrectly got switched to call blk_mq_complete_request.
Call __blk_complete_request instead to reinstance the previous behavior.
For that __blk_complete_request needs to be exported.

Fixes: 1fc2b62e ("scsi_transport_fc: complete requests from ->timeout")
Fixes: 0df0bb08 ("null_blk: complete requests from ->timeout")
Reported-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-softirq.c              | 1 +
 drivers/block/null_blk.c         | 2 +-
 drivers/scsi/scsi_transport_fc.c | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/blk-softirq.c b/block/blk-softirq.c
index 01e2b353a2b9..15c1f5e12eb8 100644
--- a/block/blk-softirq.c
+++ b/block/blk-softirq.c
@@ -144,6 +144,7 @@ void __blk_complete_request(struct request *req)
 
 	local_irq_restore(flags);
 }
+EXPORT_SYMBOL(__blk_complete_request);
 
 /**
  * blk_complete_request - end I/O on a request
diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
index 7948049f6c43..042c778e5a4e 100644
--- a/drivers/block/null_blk.c
+++ b/drivers/block/null_blk.c
@@ -1365,7 +1365,7 @@ static blk_qc_t null_queue_bio(struct request_queue *q, struct bio *bio)
 static enum blk_eh_timer_return null_rq_timed_out_fn(struct request *rq)
 {
 	pr_info("null: rq %p timed out\n", rq);
-	blk_mq_complete_request(rq);
+	__blk_complete_request(rq);
 	return BLK_EH_DONE;
 }
 
diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
index 1da3d71e9f61..13948102ca29 100644
--- a/drivers/scsi/scsi_transport_fc.c
+++ b/drivers/scsi/scsi_transport_fc.c
@@ -3592,7 +3592,7 @@ fc_bsg_job_timeout(struct request *req)
 
 	/* the blk_end_sync_io() doesn't check the error */
 	if (inflight)
-		blk_mq_complete_request(req);
+		__blk_complete_request(req);
 	return BLK_EH_DONE;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] block: export __blk_complete_request
  2018-06-19 14:09 ` Christoph Hellwig
@ 2018-06-19 14:52   ` jianchao.wang
  0 siblings, 0 replies; 15+ messages in thread
From: jianchao.wang @ 2018-06-19 14:52 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, jejb, martin.petersen, linux-block, linux-scsi, linux-kernel

Hi Christoph

Thanks for your kindly response.
The patch I posted ('scsi_transport_fc: use __blk_complete_request in fc_bsg_job_timeout')
is just based on code review. I don't have actual issue on it. :)

Thanks
Jianchao

On 06/19/2018 10:09 PM, Christoph Hellwig wrote:
> Does the patch below fix your FC issue?
> 
> ---
>>From 5e5b4fc51c84a0f5c27f2f770be7a4eaed0f6e8c Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Tue, 19 Jun 2018 13:59:52 +0200
> Subject: block: fix timeout changes for legacy request drivers
> 
> blk_mq_complete_request can only be called for blk-mq drivers, but when
> removing the BLK_EH_HANDLED return value, two legacy request timeout
> methods incorrectly got switched to call blk_mq_complete_request.
> Call __blk_complete_request instead to reinstance the previous behavior.
> For that __blk_complete_request needs to be exported.
> 
> Fixes: 1fc2b62e ("scsi_transport_fc: complete requests from ->timeout")
> Fixes: 0df0bb08 ("null_blk: complete requests from ->timeout")
> Reported-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/blk-softirq.c              | 1 +
>  drivers/block/null_blk.c         | 2 +-
>  drivers/scsi/scsi_transport_fc.c | 2 +-
>  3 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-softirq.c b/block/blk-softirq.c
> index 01e2b353a2b9..15c1f5e12eb8 100644
> --- a/block/blk-softirq.c
> +++ b/block/blk-softirq.c
> @@ -144,6 +144,7 @@ void __blk_complete_request(struct request *req)
>  
>  	local_irq_restore(flags);
>  }
> +EXPORT_SYMBOL(__blk_complete_request);
>  
>  /**
>   * blk_complete_request - end I/O on a request
> diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
> index 7948049f6c43..042c778e5a4e 100644
> --- a/drivers/block/null_blk.c
> +++ b/drivers/block/null_blk.c
> @@ -1365,7 +1365,7 @@ static blk_qc_t null_queue_bio(struct request_queue *q, struct bio *bio)
>  static enum blk_eh_timer_return null_rq_timed_out_fn(struct request *rq)
>  {
>  	pr_info("null: rq %p timed out\n", rq);
> -	blk_mq_complete_request(rq);
> +	__blk_complete_request(rq);
>  	return BLK_EH_DONE;
>  }
>  
> diff --git a/drivers/scsi/scsi_transport_fc.c b/drivers/scsi/scsi_transport_fc.c
> index 1da3d71e9f61..13948102ca29 100644
> --- a/drivers/scsi/scsi_transport_fc.c
> +++ b/drivers/scsi/scsi_transport_fc.c
> @@ -3592,7 +3592,7 @@ fc_bsg_job_timeout(struct request *req)
>  
>  	/* the blk_end_sync_io() doesn't check the error */
>  	if (inflight)
> -		blk_mq_complete_request(req);
> +		__blk_complete_request(req);
>  	return BLK_EH_DONE;
>  }
>  
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-06-19 14:52 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-15  1:57 [PATCH 1/2] block: export __blk_complete_request Jianchao Wang
2018-06-15  1:57 ` [PATCH 2/2] scsi_transport_fc: use __blk_complete_request in fc_bsg_job_timeout Jianchao Wang
2018-06-15  2:17 ` [PATCH 1/2] block: export __blk_complete_request Ming Lei
2018-06-15  2:22   ` jianchao.wang
2018-06-15  2:44     ` jianchao.wang
2018-06-15  2:56       ` Ming Lei
2018-06-15  3:04         ` jianchao.wang
2018-06-15  3:20           ` Ming Lei
2018-06-15  3:26             ` jianchao.wang
2018-06-15  4:03               ` Ming Lei
2018-06-15  5:10                 ` jianchao.wang
2018-06-15 11:58             ` Christoph Hellwig
2018-06-15  2:49     ` Ming Lei
2018-06-19 14:09 ` Christoph Hellwig
2018-06-19 14:52   ` jianchao.wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).