From: Ming Lei <ming.lei@redhat.com>
To: "yukuai (C)" <yukuai3@huawei.com>
Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, yi.zhang@huawei.com,
	ming.lei@redhat.com
Subject: Re: [PATCH -next v2] blk-mq: fix panic during blk_mq_run_work_fn()
Date: Fri, 20 May 2022 21:56:35 +0800
Message-ID: <YoeeEw4SFvWtXNRk@T590>
In-Reply-To: <8e6a806b-f42e-319b-e6c8-de1f07befce2@huawei.com>

On Fri, May 20, 2022 at 08:01:31PM +0800, yukuai (C) wrote:
> On 2022/05/20 19:39, Ming Lei wrote:
> 
> > 
> > In short:
> > 
> > 1) run queue can be in-progress during cleanup queue, or returns from
> > cleanup queue; we drain it in both blk_cleanup_queue() and
> > disk_release_mq(), see commit 2a19b28f7929 ("blk-mq: cancel blk-mq dispatch
> > work in both blk_cleanup_queue and disk_release()")
> I understand that; however, there is no guarantee that new 'hctx->run_work'
> won't be queued after 'drain it'. For this crash, I think this is how

No, run queue activity will be shut down after both disk_release_mq()
and blk_cleanup_queue() are done.

disk_release_mq() is called after all FS IOs are done, so no run queue
can come from the FS IO code path, either sync or async.

In blk_cleanup_queue(), we only focus on passthrough requests. A
passthrough request is always explicitly allocated & freed by its
caller, so once the queue is frozen, all sync dispatch activity for
passthrough requests has finished. It is then enough to just cancel
the dispatch work to avoid any further dispatch activity.

That is why both the request queue and the hctx can be released safely
after both functions are done.

> it triggered:
> 
> assume that there is no io, while some bfq_queue is still busy:
> 
> blk_cleanup_queue
>  blk_freeze_queue
>  blk_mq_cancel_work_sync
>  cancel_delayed_work_sync(hctx1)
> 				blk_mq_run_work_fn -> hctx2
> 				 __blk_mq_run_hw_queue
> 				  blk_mq_sched_dispatch_requests
> 				   __blk_mq_do_dispatch_sched
> 				    blk_mq_delay_run_hw_queues
> 				     blk_mq_delay_run_hw_queue
> 				      -> add hctx1->run_work again
>  cancel_delayed_work_sync(hctx2)

Yes, blk_mq_delay_run_hw_queues() can even be called after all
hctx->run_work are canceled, since __blk_mq_run_hw_queue() could be
running in the sync io code path, not via ->run_work.

And my patch will fix the issue, won't it?


Thanks,
Ming

