From: Jianchao Wang <jianchao.w.wang@oracle.com>
To: axboe@kernel.dk
Cc: hch@lst.de, jthumshirn@suse.de, hare@suse.de, josef@toxicpanda.com,
    bvanassche@acm.org, sagi@grimberg.me, keith.busch@intel.com,
    jsmart2021@gmail.com, linux-block@vger.kernel.org,
    linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [PATCH 0/8]: blk-mq: use static_rqs to iterate busy tags
Date: Fri, 15 Mar 2019 16:57:36 +0800
Message-ID: <1552640264-26101-1-git-send-email-jianchao.w.wang@oracle.com>

Hi Jens

As we know, there is a risk of accessing stale requests when iterating
in-flight requests with tags->rqs[]. This has been discussed in the
following threads:
[1] https://marc.info/?l=linux-scsi&m=154511693912752&w=2
[2] https://marc.info/?l=linux-block&m=154526189023236&w=2

A typical scenario could be:

blk_mq_get_request         blk_mq_queue_tag_busy_iter
  -> blk_mq_get_tag
                             -> bt_for_each
                               -> bt_iter
                                 -> rq = tags->rqs[]
                                 -> rq->q
  -> blk_mq_rq_ctx_init
    -> data->hctx->tags->rqs[rq->tag] = rq;

The root cause is that there is a window between setting the bit in the
tag sbitmap and setting tags->rqs[].

This patch set fixes the issue by iterating requests with
tags->static_rqs[] instead of tags->rqs[], which is changed dynamically.
Moreover, we try to get a non-zero q_usage_counter before accessing hctxs
and tags, and thus avoid racing with updating nr_hw_queues, switching the
io scheduler and even queue cleanup, all of which happen under a frozen
and drained queue.

The 1st patch gets rid of the useless synchronize_rcu in
__blk_mq_update_nr_hw_queues.

The 2nd patch modifies blk_mq_queue_tag_busy_iter to use
tags->static_rqs[] instead of tags->rqs[] to iterate the busy tags.

The 3rd ~ 7th patches change the users of blk_mq_tagset_busy_iter to
blk_mq_queue_tag_busy_iter, which is safer.

The 8th patch gets rid of blk_mq_tagset_busy_iter.
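The window described above can be modelled in plain userspace C. This is only a single-threaded sketch under stated assumptions, not code from the series: the struct layouts, the `started` and `retired` flags, and every helper name (tags_init, alloc_set_bit, alloc_publish, iter_rqs_stale_hits, iter_static_rqs_busy) are hypothetical stand-ins. A `retired` request models memory the kernel would already have freed, and skipping requests that are not yet started is an assumed filter, not necessarily the check the patches use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TAGS 4

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct request {
	int tag;
	bool started;   /* set once the request is fully initialised */
	bool retired;   /* models memory the kernel would already have freed */
};

struct tags {
	unsigned long bitmap;                 /* stands in for the tag sbitmap */
	struct request *rqs[NR_TAGS];         /* assigned only after the bit is set */
	struct request *static_rqs[NR_TAGS];  /* fixed for the life of the tag set */
};

static struct request pool[NR_TAGS];          /* backing store for static_rqs[] */
static struct request old_rq = { .tag = 0, .retired = true };

static void tags_init(struct tags *t)
{
	t->bitmap = 0;
	for (int i = 0; i < NR_TAGS; i++) {
		pool[i] = (struct request){ .tag = i };
		t->static_rqs[i] = &pool[i];
		t->rqs[i] = NULL;
	}
}

/* Step 1 of allocation: only the tag bit becomes visible. */
static void alloc_set_bit(struct tags *t, int tag)
{
	t->bitmap |= 1UL << tag;
}

/* Step 2 of allocation: the request is started and rqs[] is published. */
static void alloc_publish(struct tags *t, int tag)
{
	t->static_rqs[tag]->started = true;
	t->rqs[tag] = t->static_rqs[tag];
}

/* Old scheme: trust rqs[] for every set bit. Returns how many retired
 * (stale) requests the iterator would have dereferenced. */
static int iter_rqs_stale_hits(const struct tags *t)
{
	int stale = 0;

	for (int tag = 0; tag < NR_TAGS; tag++) {
		if (!(t->bitmap & (1UL << tag)))
			continue;
		const struct request *rq = t->rqs[tag];
		if (rq && rq->retired)
			stale++;
	}
	return stale;
}

/* Proposed scheme: look only at static_rqs[], which never points at
 * retired memory, and skip requests that have not been started yet.
 * Returns the number of busy requests seen. */
static int iter_static_rqs_busy(const struct tags *t)
{
	int busy = 0;

	for (int tag = 0; tag < NR_TAGS; tag++) {
		if (!(t->bitmap & (1UL << tag)))
			continue;
		if (t->static_rqs[tag]->started)
			busy++;
	}
	return busy;
}
```

To replay the scenario: after tags_init(), point rqs[0] at old_rq (a leftover from an earlier user of tag 0) and call alloc_set_bit() for tag 0 without alloc_publish(). In that state the rqs[] walk touches the retired request, while the static_rqs[] walk reports nothing busy. Once alloc_publish() runs, both walks agree on one busy request.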
Jianchao Wang(8)
 blk-mq: get rid of the synchronize_rcu in
 blk-mq: change the method of iterating busy tags of a
 blk-mq: use blk_mq_queue_tag_busy_iter in debugfs
 mtip32xx: use blk_mq_queue_tag_busy_iter
 nbd: use blk_mq_queue_tag_busy_iter
 skd: use blk_mq_queue_tag_busy_iter
 nvme: use blk_mq_queue_tag_busy_iter
 blk-mq: remove blk_mq_tagset_busy_iter

diff stat
 block/blk-mq-debugfs.c            |   4 +-
 block/blk-mq-tag.c                | 173 +++++++++++++++++++++++++------------------------------------------------------------
 block/blk-mq-tag.h                |   2 -
 block/blk-mq.c                    |  35 ++++++------------
 drivers/block/mtip32xx/mtip32xx.c |   8 ++--
 drivers/block/nbd.c               |   2 +-
 drivers/block/skd_main.c          |   4 +-
 drivers/nvme/host/core.c          |  12 ++++++
 drivers/nvme/host/fc.c            |  12 +++---
 drivers/nvme/host/nvme.h          |   2 +
 drivers/nvme/host/pci.c           |   5 ++-
 drivers/nvme/host/rdma.c          |   6 +--
 drivers/nvme/host/tcp.c           |   5 ++-
 drivers/nvme/target/loop.c        |   6 +--
 include/linux/blk-mq.h            |   7 ++--
 15 files changed, 105 insertions(+), 178 deletions(-)

Thanks
Jianchao