All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bart Van Assche <bvanassche@acm.org>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	Daniel Wagner <dwagner@suse.de>,
	Khazhismel Kumykov <khazhy@google.com>,
	Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	Hannes Reinecke <hare@suse.de>,
	Johannes Thumshirn <johannes.thumshirn@wdc.com>,
	John Garry <john.garry@huawei.com>,
	linux-scsi@vger.kernel.org
Subject: Re: [PATCH v7 3/5] blk-mq: Fix races between iterating over requests and freeing requests
Date: Fri, 23 Apr 2021 10:52:43 -0700	[thread overview]
Message-ID: <28607d75-042f-7a6a-f5d0-2ee03754917e@acm.org> (raw)
In-Reply-To: <YIJEg9DLWoOJ06Kc@T590>

On 4/22/21 8:52 PM, Ming Lei wrote:
> For example, scsi aacraid normal completion vs. reset together with elevator
> switch, aacraid is one single queue HBA, and the request will be completed
> via IPI or softirq asynchronously, that said request isn't really completed
> after blk_mq_complete_request() returns.
> 
> 1) interrupt comes, and request A is completed via blk_mq_complete_request()
> from aacraid's interrupt handler via ->scsi_done()
> 
> 2) _aac_reset_adapter() comes because of reset event which can be
> triggered by sysfs store or whatever, irq is drained in 
> _aac_reset_adpter(), so blk_mq_complete_request(request A) from aacraid irq
> context is done, but request A is just scheduled to be completed via IPI
> or softirq asynchronously, not really done yet.
> 
> 3) scsi_host_complete_all_commands() is called from _aac_reset_adapter() for
> failing all pending requests. request A is still visible in
> scsi_host_complete_all_commands, because its tag isn't freed yet. But the
> tag & request A can be completed & freed exactly after scsi_host_complete_all_commands()
> reads ->rqs[bitnr] in bt_tags_iter(), which calls complete_all_cmds_iter()
> -> .scsi_done() -> blk_mq_complete_request(), and same request A is scheduled via
> IPI or softirq, and request A is addded in ipi or softirq list.
> 
> 4) meantime request A is freed from normal completion triggered by interrupt, one
> pending elevator switch can move on since request A drops the last reference; and
> bt_tags_iter() returns from reset path, so blk_mq_wait_for_tag_iter() can return
> too, then the whole scheduler request pool is freed now.
> 
> 5) request A in ipi/softirq list scheduled from _aac_reset_adapter is read , UAF
> is triggered.
> 
> It is supposed that driver covers normal completion vs. error handling, but wrt.
> remove completion, not sure driver is capable of covering that.

Hi Ming,

I agree that the scenario above can happen and also that a fix is
needed. However, I do not agree that this should be fixed by modifying
the tag iteration functions. I see scsi_host_complete_all_commands() as
a workaround for broken storage controllers - storage controllers that
do not have a facility for terminating all pending commands. NVMe
controllers can be told to terminate all pending I/O commands by e.g.
deleting all I/O submission queues. Many SCSI controllers also provide a
facility for aborting all pending commands. For SCSI controllers that do
not provide such a facility I propose to fix races like the race
described above in the SCSI LLD instead of modifying the block layer tag
iteration functions.

Thanks,

Bart.

  reply	other threads:[~2021-04-23 17:52 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-21  0:02 [PATCH v7 0/5] blk-mq: Fix a race between iterating over requests and freeing requests Bart Van Assche
2021-04-21  0:02 ` [PATCH v7 1/5] blk-mq: Move the elevator_exit() definition Bart Van Assche
2021-04-21  0:02 ` [PATCH v7 2/5] blk-mq: Introduce atomic variants of blk_mq_(all_tag|tagset_busy)_iter Bart Van Assche
2021-04-21  0:02 ` [PATCH v7 3/5] blk-mq: Fix races between iterating over requests and freeing requests Bart Van Assche
2021-04-22  2:25   ` Ming Lei
2021-04-22  4:01     ` Bart Van Assche
2021-04-22  7:23       ` Ming Lei
2021-04-22  3:15   ` Ming Lei
2021-04-22  3:54     ` Bart Van Assche
2021-04-22  7:13       ` Ming Lei
2021-04-22 15:51         ` Bart Van Assche
2021-04-23  3:52           ` Ming Lei
2021-04-23 17:52             ` Bart Van Assche [this message]
2021-04-25  0:09               ` Ming Lei
2021-04-25 21:01                 ` Bart Van Assche
2021-04-26  0:55                   ` Ming Lei
2021-04-26 16:29                 ` Bart Van Assche
2021-04-27  0:11                   ` Ming Lei
2021-04-21  0:02 ` [PATCH v7 4/5] blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list Bart Van Assche
2021-04-21  0:02 ` [PATCH v7 5/5] blk-mq: Fix races between blk_mq_update_nr_hw_queues() and iterating over tags Bart Van Assche
2021-04-21 14:40 ` [PATCH v7 0/5] blk-mq: Fix a race between iterating over requests and freeing requests Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=28607d75-042f-7a6a-f5d0-2ee03754917e@acm.org \
    --to=bvanassche@acm.org \
    --cc=axboe@kernel.dk \
    --cc=dwagner@suse.de \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=johannes.thumshirn@wdc.com \
    --cc=john.garry@huawei.com \
    --cc=khazhy@google.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=ming.lei@redhat.com \
    --cc=shinichiro.kawasaki@wdc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.