linux-block.vger.kernel.org archive mirror
From: Ming Lei <ming.lei@redhat.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	Daniel Wagner <dwagner@suse.de>,
	Khazhismel Kumykov <khazhy@google.com>,
	Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	Hannes Reinecke <hare@suse.de>,
	Johannes Thumshirn <johannes.thumshirn@wdc.com>,
	John Garry <john.garry@huawei.com>,
	linux-scsi@vger.kernel.org
Subject: Re: [PATCH v7 3/5] blk-mq: Fix races between iterating over requests and freeing requests
Date: Fri, 23 Apr 2021 11:52:35 +0800
Message-ID: <YIJEg9DLWoOJ06Kc@T590>
In-Reply-To: <32a121b7-2444-ac19-420d-4961f2a18129@acm.org>

On Thu, Apr 22, 2021 at 08:51:06AM -0700, Bart Van Assche wrote:
> On 4/22/21 12:13 AM, Ming Lei wrote:
> > On Wed, Apr 21, 2021 at 08:54:30PM -0700, Bart Van Assche wrote:
> >> On 4/21/21 8:15 PM, Ming Lei wrote:
> >>> On Tue, Apr 20, 2021 at 05:02:33PM -0700, Bart Van Assche wrote:
> >>>> +static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
> >>>> +{
> >>>> +	struct bt_tags_iter_data *iter_data = data;
> >>>> +	struct blk_mq_tags *tags = iter_data->tags;
> >>>> +	bool res;
> >>>> +
> >>>> +	if (iter_data->flags & BT_TAG_ITER_MAY_SLEEP) {
> >>>> +		down_read(&tags->iter_rwsem);
> >>>> +		res = __bt_tags_iter(bitmap, bitnr, data);
> >>>> +		up_read(&tags->iter_rwsem);
> >>>> +	} else {
> >>>> +		rcu_read_lock();
> >>>> +		res = __bt_tags_iter(bitmap, bitnr, data);
> >>>> +		rcu_read_unlock();
> >>>> +	}
> >>>> +
> >>>> +	return res;
> >>>> +}
> >>>
> >>> Holding one rwsem or rcu read lock won't avoid the issue completely
> >>> because request may be completed remotely in iter_data->fn(), such as
> >>> nbd_clear_req(), nvme_cancel_request(), complete_all_cmds_iter(),
> >>> mtip_no_dev_cleanup(), because blk_mq_complete_request() may complete
> >>> request in softirq, remote IPI, even wq, and the request is still
> >>> referenced in these contexts after bt_tags_iter() returns.
> >>
> >> The rwsem and RCU read lock are used to serialize iterating over
> >> requests against blk_mq_sched_free_requests() calls. I don't think it
> >> matters for this patch from which context requests are freed.
> > 
> > Requests still can be referred in other context after blk_mq_wait_for_tag_iter()
> > returns, then follows freeing request pool. And use-after-free exists too, doesn't it?
> 
> The request pool should only be freed after it has been guaranteed that
> all pending requests have finished and also that no new requests will be
> started. This patch series adds two blk_mq_wait_for_tag_iter() calls.
> Both calls happen while the queue is frozen so I don't think that the
> issue mentioned in your email can happen.

For example, consider scsi aacraid normal completion racing with reset and an
elevator switch. aacraid is a single-queue HBA, and its requests are completed
asynchronously via IPI or softirq; that is, a request isn't really completed
when blk_mq_complete_request() returns.
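
To make the "not really completed" point concrete, here is a minimal
user-space sketch of the deferred case. It is not kernel code: model_rq,
pending_list, model_complete_request() and model_run_deferred() are
hypothetical names, and the model only covers the path where completion is
handed off to IPI/softirq rather than run inline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct request. */
struct model_rq {
	bool done;              /* set only when the deferred handler runs */
	struct model_rq *next;  /* link on the pending-completion list */
};

/* Models the IPI/softirq completion list. */
static struct model_rq *pending_list;

/* Models the deferred case of blk_mq_complete_request(): queue only. */
static void model_complete_request(struct model_rq *rq)
{
	rq->next = pending_list;
	pending_list = rq;
}

/* Models the IPI/softirq handler; returns how many requests it finished. */
static int model_run_deferred(void)
{
	int n = 0;

	while (pending_list) {
		struct model_rq *rq = pending_list;

		pending_list = rq->next;
		rq->done = true;
		n++;
	}
	return n;
}
```

In this model a request is still not done right after
model_complete_request() returns; it only becomes done once
model_run_deferred() has walked the list.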

1) an interrupt comes in, and request A is completed via blk_mq_complete_request(),
called from aacraid's interrupt handler via ->scsi_done()

2) _aac_reset_adapter() runs because of a reset event, which can be
triggered by a sysfs store or whatever. The irq is drained in
_aac_reset_adapter(), so blk_mq_complete_request(request A) from aacraid irq
context has returned, but request A is only scheduled to be completed via IPI
or softirq asynchronously; it is not really done yet.

3) scsi_host_complete_all_commands() is called from _aac_reset_adapter() to
fail all pending requests. Request A is still visible in
scsi_host_complete_all_commands() because its tag isn't freed yet. But the
tag & request A can be completed & freed right after scsi_host_complete_all_commands()
reads ->rqs[bitnr] in bt_tags_iter(), which calls complete_all_cmds_iter()
-> .scsi_done() -> blk_mq_complete_request(), so the same request A is scheduled via
IPI or softirq again and is added to the ipi or softirq list.

4) meanwhile request A is freed by the normal completion triggered by the interrupt, so a
pending elevator switch can move on since request A drops the last reference; and
bt_tags_iter() returns in the reset path, so blk_mq_wait_for_tag_iter() can return
too, and then the whole scheduler request pool is freed.

5) request A, on the ipi/softirq list where _aac_reset_adapter() scheduled it, is read, and UAF
is triggered.
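
The ordering in steps 1-5 can be replayed in a small user-space model. This
is only a sketch under stated assumptions, not the kernel's data structures:
sim_pool, sim_rq, sim_deferred and the sim_*() helpers are all hypothetical,
the two deferred slots stand in for the two IPI/softirq list entries, and the
"freed" pool lives in static storage so the model itself never touches freed
memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the scheduler request pool. */
struct sim_pool {
	bool freed;
};

/* Hypothetical stand-in for struct request. */
struct sim_rq {
	struct sim_pool *pool;   /* pool that owns this request */
	int refs;                /* dropping the last one unblocks the switch */
};

/* Two deferred-completion slots: one per completing context. */
struct sim_deferred {
	struct sim_rq *from_irq;
	struct sim_rq *from_reset;
};

/* Models the deferred handler: drop a reference on the request. */
static void sim_finish(struct sim_rq *rq)
{
	rq->refs--;
}

/* Models the elevator switch: frees the pool once the request is unused. */
static bool sim_elevator_switch(struct sim_rq *rq)
{
	if (rq->refs > 0)
		return false;
	rq->pool->freed = true;
	return true;
}

/* Replays steps 1-5; returns true if step 5 reads from a freed pool. */
static bool sim_replay_race(void)
{
	/* Static storage keeps the model itself free of a real UAF. */
	static struct sim_pool pool;
	static struct sim_rq a;
	struct sim_deferred d = { NULL, NULL };

	pool.freed = false;
	a.pool = &pool;
	a.refs = 1;

	d.from_irq = &a;          /* 1) irq path queues A */
	                          /* 2) reset drains the irq; A is still queued */
	d.from_reset = &a;        /* 3) iterator sees A's tag, queues A again */
	sim_finish(d.from_irq);   /* 4) normal completion drops the last ref */
	if (!sim_elevator_switch(&a))
		return false;     /*    ... so the switch frees the pool */
	return d.from_reset->pool->freed;   /* 5) stale entry is read */
}
```

In the model, sim_replay_race() reports that the entry queued from the reset
path still points into the pool after the switch has marked it freed, which
is the window described above.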

The driver is supposed to cover normal completion vs. error handling, but wrt.
remote completion, I am not sure the driver is capable of covering that.

Thanks,
Ming



Thread overview: 21+ messages
2021-04-21  0:02 [PATCH v7 0/5] blk-mq: Fix a race between iterating over requests and freeing requests Bart Van Assche
2021-04-21  0:02 ` [PATCH v7 1/5] blk-mq: Move the elevator_exit() definition Bart Van Assche
2021-04-21  0:02 ` [PATCH v7 2/5] blk-mq: Introduce atomic variants of blk_mq_(all_tag|tagset_busy)_iter Bart Van Assche
2021-04-21  0:02 ` [PATCH v7 3/5] blk-mq: Fix races between iterating over requests and freeing requests Bart Van Assche
2021-04-22  2:25   ` Ming Lei
2021-04-22  4:01     ` Bart Van Assche
2021-04-22  7:23       ` Ming Lei
2021-04-22  3:15   ` Ming Lei
2021-04-22  3:54     ` Bart Van Assche
2021-04-22  7:13       ` Ming Lei
2021-04-22 15:51         ` Bart Van Assche
2021-04-23  3:52           ` Ming Lei [this message]
2021-04-23 17:52             ` Bart Van Assche
2021-04-25  0:09               ` Ming Lei
2021-04-25 21:01                 ` Bart Van Assche
2021-04-26  0:55                   ` Ming Lei
2021-04-26 16:29                 ` Bart Van Assche
2021-04-27  0:11                   ` Ming Lei
2021-04-21  0:02 ` [PATCH v7 4/5] blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list Bart Van Assche
2021-04-21  0:02 ` [PATCH v7 5/5] blk-mq: Fix races between blk_mq_update_nr_hw_queues() and iterating over tags Bart Van Assche
2021-04-21 14:40 ` [PATCH v7 0/5] blk-mq: Fix a race between iterating over requests and freeing requests Jens Axboe
