From: Chao Leng <lengchao@huawei.com>
To: Hannes Reinecke <hare@suse.de>, <linux-nvme@lists.infradead.org>
Cc: <axboe@kernel.dk>, <linux-block@vger.kernel.org>,
	<sagi@grimberg.me>, <axboe@fb.com>, <kbusch@kernel.org>,
	<hch@lst.de>
Subject: Re: [PATCH v3 3/5] nvme-fabrics: avoid double request completion for nvmf_fail_nonready_command
Date: Fri, 22 Jan 2021 09:48:03 +0800	[thread overview]
Message-ID: <a100b5dd-d38b-3158-d000-b84920a4e274@huawei.com> (raw)
In-Reply-To: <fda1fdb8-8a9d-2e95-4d08-8d8ee1df450d@suse.de>



On 2021/1/21 16:58, Hannes Reinecke wrote:
> On 1/21/21 8:03 AM, Chao Leng wrote:
>> During reconnect, a request may be completed with NVME_SC_HOST_PATH_ERROR
>> in nvmf_fail_nonready_command. The request state is changed to
>> MQ_RQ_IN_FLIGHT before nvme_complete_rq is called. If the request is
>> freed asynchronously, such as in nvme_submit_user_cmd, in an extreme
>> scenario the request may be completed again in the teardown process.
>> nvmf_fail_nonready_command does not need to call blk_mq_start_request
>> before completing the request; it should instead set the request state
>> to MQ_RQ_COMPLETE before completing it.
>>
> 
> So what you are saying is that there is a race condition between
> blk_mq_start_request()
> and
> nvme_complete_request()
Yes. The race is:
process1: error recovery -> tear down -> quiesce queue (wait for dispatch
    to finish)
process2: dispatch -> queue_rq -> nvmf_fail_nonready_command ->
    nvme_complete_rq (if the request is freed asynchronously, this wakes
    nvme_submit_user_cmd, for example, but it gets no chance to run yet).
process1: continue -> cancel the pending request; its state is neither
    MQ_RQ_IDLE nor MQ_RQ_COMPLETE, so the request is completed (freed).
process3: nvme_submit_user_cmd now gets a chance to run and frees the
    request again.
Test injection method: insert an msleep before the call to
blk_mq_free_request in nvme_submit_user_cmd.
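
For reference, a minimal sketch of the helpers from patches 1/5 and 2/5
that this change relies on (based on the series' intent; the exact final
form may differ):

/*
 * blk-mq helper (patch 1/5): move the request straight to
 * MQ_RQ_COMPLETE without going through blk_mq_start_request(),
 * so it never becomes MQ_RQ_IN_FLIGHT.
 */
static inline void blk_mq_set_request_complete(struct request *rq)
{
	WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
}

/*
 * nvme-core helper (patch 2/5): fail a request on the host side.
 * Because the state is already MQ_RQ_COMPLETE, the teardown path's
 * cancel iteration skips it, so an asynchronous completer such as
 * nvme_submit_user_cmd cannot free the request a second time.
 */
static inline void nvme_complete_failed_req(struct request *rq)
{
	nvme_req(rq)->status = NVME_SC_HOST_PATH_ERROR;
	blk_mq_set_request_complete(rq);
	nvme_complete_rq(rq);
}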
> 
>> Signed-off-by: Chao Leng <lengchao@huawei.com>
>> ---
>>   drivers/nvme/host/fabrics.c | 4 +---
>>   1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
>> index 72ac00173500..874e4320e214 100644
>> --- a/drivers/nvme/host/fabrics.c
>> +++ b/drivers/nvme/host/fabrics.c
>> @@ -553,9 +553,7 @@ blk_status_t nvmf_fail_nonready_command(struct nvme_ctrl *ctrl,
>>           !blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
>>           return BLK_STS_RESOURCE;
>> -    nvme_req(rq)->status = NVME_SC_HOST_PATH_ERROR;
>> -    blk_mq_start_request(rq);
>> -    nvme_complete_rq(rq);
>> +    nvme_complete_failed_req(rq);
>>       return BLK_STS_OK;
>>   }
>>   EXPORT_SYMBOL_GPL(nvmf_fail_nonready_command);
>>
> I'd rather have 'nvme_complete_failed_req()' accept the status as
> argument, like
> 
> nvme_complete_failed_request(rq, NVME_SC_HOST_PATH_ERROR)
> 
> that way it's obvious what is happening, and the status isn't hidden in the function.
Ok, good idea. Thank you for your suggestion.
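Something like this, then (a sketch of the variant with the status passed
in by the caller; not the final patch):

static inline void nvme_complete_failed_req(struct request *rq, u16 status)
{
	nvme_req(rq)->status = status;
	blk_mq_set_request_complete(rq);
	nvme_complete_rq(rq);
}

/* ... and in nvmf_fail_nonready_command(): */
	nvme_complete_failed_req(rq, NVME_SC_HOST_PATH_ERROR);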
> 
> Cheers,
> 
> Hannes

Thread overview:
2021-01-21  7:03 [PATCH v3 0/5] avoid double request completion and IO error Chao Leng
2021-01-21  7:03 ` [PATCH v3 1/5] blk-mq: introduce blk_mq_set_request_complete Chao Leng
2021-01-21  8:40   ` Christoph Hellwig
2021-01-22  1:46     ` Chao Leng
2021-01-21  7:03 ` [PATCH v3 2/5] nvme-core: introduce complete failed request Chao Leng
2021-01-21  8:41   ` Christoph Hellwig
2021-01-22  1:46     ` Chao Leng
2021-01-21  7:03 ` [PATCH v3 3/5] nvme-fabrics: avoid double request completion for nvmf_fail_nonready_command Chao Leng
2021-01-21  8:58   ` Hannes Reinecke
2021-01-21  9:00     ` Christoph Hellwig
2021-01-21  9:27       ` Hannes Reinecke
2021-01-22  1:50         ` Chao Leng
2021-01-22  1:48     ` Chao Leng [this message]
2021-01-21  7:03 ` [PATCH v3 4/5] nvme-rdma: avoid IO error for nvme native multipath Chao Leng
2021-01-21  7:03 ` [PATCH v3 5/5] nvme-fc: " Chao Leng