linux-kernel.vger.kernel.org archive mirror
From: Yu Kuai <yukuai3@huawei.com>
To: Josef Bacik <josef@toxicpanda.com>
Cc: <axboe@kernel.dk>, <ming.lei@redhat.com>,
	<linux-block@vger.kernel.org>, <nbd@other.debian.org>,
	<linux-kernel@vger.kernel.org>, <yi.zhang@huawei.com>
Subject: Re: [PATCH -next v3 3/6] nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed
Date: Tue, 24 May 2022 09:51:08 +0800
Message-ID: <ee5ee5f2-74ea-cac1-00e1-0645c04893ee@huawei.com>
In-Reply-To: <6a549193-909b-6f6e-532b-99cd2898ad80@huawei.com>

On 2022/05/24 9:07, Yu Kuai wrote:
> On 2022/05/23 22:12, Josef Bacik wrote:
>> On Sat, May 21, 2022 at 03:37:46PM +0800, Yu Kuai wrote:
>>> Otherwise io will hang, because a request is only completed if the
>>> cmd has the flag 'NBD_CMD_INFLIGHT' set.
>>>
>>> Fixes: 07175cb1baf4 ("nbd: make sure request completion won't concurrent")
>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>> ---
>>>   drivers/block/nbd.c | 18 ++++++++++++++----
>>>   1 file changed, 14 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>> index 2ee1e376d5c4..a0d0910dae2a 100644
>>> --- a/drivers/block/nbd.c
>>> +++ b/drivers/block/nbd.c
>>> @@ -403,13 +403,14 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
>>>       if (!mutex_trylock(&cmd->lock))
>>>           return BLK_EH_RESET_TIMER;
>>> -    if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>>> +    if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>>>           mutex_unlock(&cmd->lock);
>>>           return BLK_EH_DONE;
>>>       }
>>>       if (!refcount_inc_not_zero(&nbd->config_refs)) {
>>>           cmd->status = BLK_STS_TIMEOUT;
>>> +        __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
>>>           mutex_unlock(&cmd->lock);
>>>           goto done;
>>>       }
>>> @@ -478,6 +479,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
>>>       dev_err_ratelimited(nbd_to_dev(nbd), "Connection timed out\n");
>>>       set_bit(NBD_RT_TIMEDOUT, &config->runtime_flags);
>>>       cmd->status = BLK_STS_IOERR;
>>> +    __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
>>>       mutex_unlock(&cmd->lock);
>>>       sock_shutdown(nbd);
>>>       nbd_config_put(nbd);
>>> @@ -745,7 +747,7 @@ static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
>>>       cmd = blk_mq_rq_to_pdu(req);
>>>       mutex_lock(&cmd->lock);
>>> -    if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>>> +    if (!test_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>>>           dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
>>>               tag, cmd->status, cmd->flags);
>>>           ret = -ENOENT;
>>> @@ -854,8 +856,16 @@ static void recv_work(struct work_struct *work)
>>>           }
>>>           rq = blk_mq_rq_from_pdu(cmd);
>>> -        if (likely(!blk_should_fake_timeout(rq->q)))
>>> -            blk_mq_complete_request(rq);
>>> +        if (likely(!blk_should_fake_timeout(rq->q))) {
>>> +            bool complete;
>>> +
>>> +            mutex_lock(&cmd->lock);
>>> +            complete = __test_and_clear_bit(NBD_CMD_INFLIGHT,
>>> +                            &cmd->flags);
>>> +            mutex_unlock(&cmd->lock);
>>> +            if (complete)
>>> +                blk_mq_complete_request(rq);
>>> +        }
>>
>> I'd rather this be handled in nbd_handle_reply.  We should return with it
>> cleared if it's ready to be completed.  Thanks,
> Hi,
> 
> Thanks for your advice, I'll do that in the next version. I'll still have
> to hold the lock to set the bit again in case blk_should_fake_timeout()
> returns true...

Hi, Josef

I just found out that this approach is problematic:
t1:			t2:
recv_work
  nbd_handle_reply
   __clear_bit
			nbd_xmit_timeout
			 test_bit(NBD_CMD_INFLIGHT, &cmd->flags) -> fail
			 return BLK_EH_DONE -> rq can't complete
  blk_should_fake_timeout -> true
  __set_bit

Doing __clear_bit and then __set_bit from recv_work leaves a window. A
concurrent nbd_xmit_timeout() can see the flag cleared and return
BLK_EH_DONE, so the request can be completed neither through the timeout
path nor through recv_work().
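
The interleaving above can be replayed in a small userspace model. This is
only a sketch, not kernel code: every model_* name is hypothetical, the
mutex is left out because the steps run sequentially, and it assumes that
the timeout handler is not called again once it has run and that it
completes the request itself whenever it sees the flag set.

```c
#include <stdbool.h>

/*
 * Userspace model only, not kernel code. All model_* names are hypothetical.
 * "inflight" stands in for NBD_CMD_INFLIGHT; "timer_armed" models whether
 * the timeout handler can still be called for this request.
 */
struct model_cmd {
	bool inflight;
	bool timer_armed;
	bool completed;
};

/* Simplified nbd_xmit_timeout(): it only acts while the flag is set. */
static void model_timeout(struct model_cmd *cmd)
{
	cmd->timer_armed = false;	/* assumption: timer is not re-armed */
	if (!cmd->inflight)
		return;			/* BLK_EH_DONE, nothing completed */
	cmd->inflight = false;
	cmd->completed = true;		/* assumption: timeout completes rq */
}

/* Clear in nbd_handle_reply, set again if the fake timeout triggers. */
static bool model_run_clear_early(bool fake_timeout, bool timeout_in_window)
{
	struct model_cmd cmd = { .inflight = true, .timer_armed = true };

	cmd.inflight = false;		/* nbd_handle_reply: __clear_bit */
	if (timeout_in_window)
		model_timeout(&cmd);	/* t2 sees the flag cleared */
	if (fake_timeout)
		cmd.inflight = true;	/* recv_work: __set_bit */
	else
		cmd.completed = true;	/* recv_work completes the request */

	/* true: nobody completed the request and no timeout will fire */
	return !cmd.completed && !cmd.timer_armed;
}

/* As in the v3 patch: only test in nbd_handle_reply, clear at completion. */
static bool model_run_clear_late(bool fake_timeout, bool timeout_in_window)
{
	struct model_cmd cmd = { .inflight = true, .timer_armed = true };

	if (timeout_in_window)
		model_timeout(&cmd);	/* t2 sees the flag still set */
	if (!fake_timeout && cmd.inflight) {
		cmd.inflight = false;	/* __test_and_clear_bit succeeded */
		cmd.completed = true;
	}

	return !cmd.completed && !cmd.timer_armed;
}
```

In this model only model_run_clear_early(true, true) reports a stuck
request; model_run_clear_late() never does, because the flag is cleared
only by the path that actually completes the request.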

Do you think it's ok to keep the current implementation with some
comments to explain the above scenario?

Thanks,
Kuai
