From: Bart Van Assche <bart.vanassche@sandisk.com>
To: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>,
    James Bottomley <jejb@linux.vnet.ibm.com>,
    "Martin K. Petersen" <martin.petersen@oracle.com>,
    Mike Snitzer <snitzer@redhat.com>,
    Doug Ledford <dledford@redhat.com>,
    Keith Busch <keith.busch@intel.com>,
    Ming Lei <tom.leiming@gmail.com>,
    Laurence Oberman <loberman@redhat.com>,
    "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
    "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
    "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
    "linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Subject: [PATCH 08/12] dm: Fix a race condition related to stopping and starting queues
Date: Wed, 26 Oct 2016 15:54:34 -0700
Message-ID: <28b3e91c-018a-0dbd-8ca9-0a7994a97a5d@sandisk.com>
In-Reply-To: <b22edafc-725f-0419-d074-34d35d57d126@sandisk.com>

Ensure that all ongoing dm_mq_queue_rq() and dm_mq_requeue_request()
calls have stopped before setting the "queue stopped" flag. This
allows to remove the "queue stopped" test from dm_mq_queue_rq() and
dm_mq_requeue_request(). This patch fixes a race condition because
dm_mq_queue_rq() is called without holding the queue lock and hence
BLK_MQ_S_STOPPED can be set at any time while dm_mq_queue_rq() is
in progress. This patch prevents that the following hang occurs
sporadically when using dm-mq:

INFO: task systemd-udevd:10111 blocked for more than 480 seconds.
Call Trace:
 [<ffffffff8161f397>] schedule+0x37/0x90
 [<ffffffff816239ef>] schedule_timeout+0x27f/0x470
 [<ffffffff8161e76f>] io_schedule_timeout+0x9f/0x110
 [<ffffffff8161fb36>] bit_wait_io+0x16/0x60
 [<ffffffff8161f929>] __wait_on_bit_lock+0x49/0xa0
 [<ffffffff8114fe69>] __lock_page+0xb9/0xc0
 [<ffffffff81165d90>] truncate_inode_pages_range+0x3e0/0x760
 [<ffffffff81166120>] truncate_inode_pages+0x10/0x20
 [<ffffffff81212a20>] kill_bdev+0x30/0x40
 [<ffffffff81213d41>] __blkdev_put+0x71/0x360
 [<ffffffff81214079>] blkdev_put+0x49/0x170
 [<ffffffff812141c0>] blkdev_close+0x20/0x30
 [<ffffffff811d48e8>] __fput+0xe8/0x1f0
 [<ffffffff811d4a29>] ____fput+0x9/0x10
 [<ffffffff810842d3>] task_work_run+0x83/0xb0
 [<ffffffff8106606e>] do_exit+0x3ee/0xc40
 [<ffffffff8106694b>] do_group_exit+0x4b/0xc0
 [<ffffffff81073d9a>] get_signal+0x2ca/0x940
 [<ffffffff8101bf43>] do_signal+0x23/0x660
 [<ffffffff810022b3>] exit_to_usermode_loop+0x73/0xb0
 [<ffffffff81002cb0>] syscall_return_slowpath+0xb0/0xc0
 [<ffffffff81624e33>] entry_SYSCALL_64_fastpath+0xa6/0xa8

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-rq.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index d47a504..107ed19 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -105,6 +105,8 @@ static void dm_mq_stop_queue(struct request_queue *q)
 	/* Avoid that requeuing could restart the queue. */
 	blk_mq_cancel_requeue_work(q);
 	blk_mq_stop_hw_queues(q);
+	/* Wait until dm_mq_queue_rq() has finished. */
+	blk_mq_quiesce_queue(q);
 }
 
 void dm_stop_queue(struct request_queue *q)
@@ -887,17 +889,6 @@ static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 		dm_put_live_table(md, srcu_idx);
 	}
 
-	/*
-	 * On suspend dm_stop_queue() handles stopping the blk-mq
-	 * request_queue BUT: even though the hw_queues are marked
-	 * BLK_MQ_S_STOPPED at that point there is still a race that
-	 * is allowing block/blk-mq.c to call ->queue_rq against a
-	 * hctx that it really shouldn't. The following check guards
-	 * against this rarity (albeit _not_ race-free).
-	 */
-	if (unlikely(test_bit(BLK_MQ_S_STOPPED, &hctx->state)))
-		return BLK_MQ_RQ_QUEUE_BUSY;
-
 	if (ti->type->busy && ti->type->busy(ti))
 		return BLK_MQ_RQ_QUEUE_BUSY;
 
-- 
2.10.1
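
The commit message argues that testing a "stopped" flag inside the dispatch path cannot be race-free on its own, and that the stop path must also wait for dispatch calls that are already running. A minimal userspace sketch of that idea follows. It is an analogue built on C11 atomics and pthreads, not the kernel implementation of blk_mq_quiesce_queue(), and every name in it (toy_queue, toy_dispatch(), toy_stop_and_quiesce(), toy_demo()) is hypothetical.

```c
/* Userspace analogue only: NOT how the kernel implements
 * blk_mq_quiesce_queue(). All toy_* names are hypothetical. */
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_queue {
	atomic_bool stopped;    /* plays the role of BLK_MQ_S_STOPPED */
	atomic_int  in_flight;  /* dispatch calls currently running */
	atomic_long dispatched; /* work items handled so far */
};

enum toy_status { TOY_OK, TOY_BUSY };

/* Analogue of ->queue_rq(): called without holding any queue lock. */
static enum toy_status toy_dispatch(struct toy_queue *q)
{
	atomic_fetch_add(&q->in_flight, 1);  /* announce entry first ... */
	if (atomic_load(&q->stopped)) {      /* ... then look at the flag */
		atomic_fetch_sub(&q->in_flight, 1);
		return TOY_BUSY;
	}
	atomic_fetch_add(&q->dispatched, 1); /* stands in for the real work */
	atomic_fetch_sub(&q->in_flight, 1);
	return TOY_OK;
}

/* Analogue of "stop, then quiesce": set the flag, then wait until
 * every dispatch call that was already running has returned. */
static void toy_stop_and_quiesce(struct toy_queue *q)
{
	atomic_store(&q->stopped, true);
	while (atomic_load(&q->in_flight) != 0)
		sched_yield();
}

/* Worker thread: dispatches until the queue reports busy. */
static void *toy_worker(void *arg)
{
	struct toy_queue *q = arg;

	while (toy_dispatch(q) == TOY_OK)
		;
	return NULL;
}

/* Returns 0 if nothing was dispatched after quiescing, 1 if something
 * was, and -1 if the worker thread could not be created. */
static int toy_demo(void)
{
	struct toy_queue q;
	pthread_t t;
	long before, after;

	atomic_init(&q.stopped, false);
	atomic_init(&q.in_flight, 0);
	atomic_init(&q.dispatched, 0);

	if (pthread_create(&t, NULL, toy_worker, &q) != 0)
		return -1;
	while (atomic_load(&q.dispatched) == 0)
		sched_yield(); /* let the worker get going */

	toy_stop_and_quiesce(&q);
	before = atomic_load(&q.dispatched);
	pthread_join(t, NULL);
	after = atomic_load(&q.dispatched);
	return before == after ? 0 : 1;
}
```

Calling toy_demo() from a main() built with -pthread exercises the sketch. The ordering is the point of the design: the dispatcher increments in_flight before it reads the flag, and the stopper sets the flag before it reads in_flight, so with the default sequentially consistent atomics either the dispatcher sees the flag or the stopper sees the dispatcher and waits for it.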