From: Jianchao Wang <jianchao.w.wang@oracle.com>
To: axboe@kernel.dk
Cc: hch@lst.de, jthumshirn@suse.de, hare@suse.de,
	josef@toxicpanda.com, bvanassche@acm.org, sagi@grimberg.me,
	keith.busch@intel.com, jsmart2021@gmail.com,
	linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH 0/8]: blk-mq: use static_rqs to iterate busy tags
Date: Fri, 15 Mar 2019 16:57:36 +0800
Message-ID: <1552640264-26101-1-git-send-email-jianchao.w.wang@oracle.com>

Hi Jens,

As we know, there is a risk of accessing stale requests when iterating
in-flight requests with tags->rqs[]. This has been discussed in the
following threads:
[1] https://marc.info/?l=linux-scsi&m=154511693912752&w=2
[2] https://marc.info/?l=linux-block&m=154526189023236&w=2

A typical scenario could be:
blk_mq_get_request         blk_mq_queue_tag_busy_iter
  -> blk_mq_get_tag
                             -> bt_for_each
                               -> bt_iter
                                 -> rq = tags->rqs[]
                                 -> rq->q
  -> blk_mq_rq_ctx_init
    -> data->hctx->tags->rqs[rq->tag] = rq;

The root cause is that there is a window between setting the bit in the
tag sbitmap and setting tags->rqs[].
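
To make the window concrete, here is a minimal single-threaded userspace
sketch that replays the interleaving above step by step. It is not kernel
code: struct tags, get_tag, rq_ctx_init, put_tag, iter_peek_queue and POISON
are simplified stand-ins invented for illustration, and the sketch assumes
that releasing a tag clears only the bit and leaves the rqs[] slot alone.

```c
#include <assert.h>
#include <stddef.h>

#define NR_TAGS 4
#define POISON  (-99)   /* marks a request whose owner already released it */

struct request {
    int tag;
    int queue_id;       /* stands in for rq->q */
};

struct tags {
    unsigned long bitmap;           /* stands in for the tag sbitmap */
    struct request *rqs[NR_TAGS];   /* filled in late, after the bit is set */
};

/* Allocation step 1: claim a free bit (the role of blk_mq_get_tag). */
static int get_tag(struct tags *t)
{
    for (int tag = 0; tag < NR_TAGS; tag++) {
        if (!(t->bitmap & (1UL << tag))) {
            t->bitmap |= 1UL << tag;
            return tag;
        }
    }
    return -1;
}

/* Allocation step 2: publish the request (the role of blk_mq_rq_ctx_init). */
static void rq_ctx_init(struct tags *t, struct request *rq, int tag, int queue_id)
{
    assert(tag >= 0 && tag < NR_TAGS);
    rq->tag = tag;
    rq->queue_id = queue_id;
    t->rqs[tag] = rq;
}

/* Release: clear the bit only; the rqs[] slot keeps its old pointer
 * (an assumption of this sketch). */
static void put_tag(struct tags *t, int tag)
{
    assert(tag >= 0 && tag < NR_TAGS);
    t->bitmap &= ~(1UL << tag);
}

/* Iterator in the style of bt_iter: for a busy tag, dereference rqs[tag]. */
static int iter_peek_queue(const struct tags *t, int tag)
{
    if (!(t->bitmap & (1UL << tag)) || t->rqs[tag] == NULL)
        return -1;
    return t->rqs[tag]->queue_id;
}

/* Replays the interleaving from the diagram; returns what the iterator saw. */
static int replay_race_window(void)
{
    struct tags t = { 0 };
    struct request old_rq, new_rq;

    int tag = get_tag(&t);
    rq_ctx_init(&t, &old_rq, tag, 1);
    put_tag(&t, tag);
    old_rq.queue_id = POISON;           /* owner tears the old request down */

    tag = get_tag(&t);                  /* bit set, rqs[tag] still points at old_rq */
    int seen = iter_peek_queue(&t, tag);
    rq_ctx_init(&t, &new_rq, tag, 2);   /* too late for the iterator */
    return seen;
}
```

In this model replay_race_window() returns POISON: the iterator trusted the
set bit and read through a pointer left over from the previous user of the
tag.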

This patch set fixes the issue by iterating requests with tags->static_rqs[]
instead of tags->rqs[], which changes dynamically. Moreover, we try to get
a non-zero q_usage_counter before accessing hctxs and tags, and thus avoid
racing with updating nr_hw_queues, switching the io scheduler, and even queue
cleanup, all of which run under a frozen and drained queue.
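
The two ideas in this paragraph can be sketched in userspace as follows. This
is an illustration under stated assumptions, not the code in the patches:
enum rq_state, struct queue, queue_tryget, queue_put, queue_busy_iter and the
demo helpers are hypothetical names, the plain int counter stands in for
q_usage_counter, and filtering on a per-request state is only one possible
way to decide which static_rqs[] entries are in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TAGS 4

/* Hypothetical, simplified request states for this sketch. */
enum rq_state { RQ_IDLE = 0, RQ_IN_FLIGHT };

struct request {
    int tag;
    enum rq_state state;
};

struct queue {
    int usage_counter;                   /* stands in for q_usage_counter */
    struct request static_rqs[NR_TAGS];  /* fixed array, never repointed */
};

/* Take a reference only if the counter is already non-zero. */
static bool queue_tryget(struct queue *q)
{
    if (q->usage_counter == 0)
        return false;                    /* frozen and drained: keep out */
    q->usage_counter++;
    return true;
}

static void queue_put(struct queue *q)
{
    assert(q->usage_counter > 0);
    q->usage_counter--;
}

typedef void (*busy_fn)(struct request *rq, void *priv);

/* Walk static_rqs[] and call fn for each in-flight request.
 * Returns the number visited, or -1 if no reference could be taken. */
static int queue_busy_iter(struct queue *q, busy_fn fn, void *priv)
{
    int visited = 0;

    if (!queue_tryget(q))
        return -1;
    for (int i = 0; i < NR_TAGS; i++) {
        struct request *rq = &q->static_rqs[i];

        if (rq->state != RQ_IN_FLIGHT)
            continue;
        fn(rq, priv);
        visited++;
    }
    queue_put(q);
    return visited;
}

/* Example callback: count the requests handed to it. */
static void count_rq(struct request *rq, void *priv)
{
    (void)rq;
    (*(int *)priv)++;
}

static int demo_live_queue(void)
{
    struct queue q = { 0 };
    int busy = 0;

    q.usage_counter = 1;                 /* live queue holds its base reference */
    for (int i = 0; i < NR_TAGS; i++)
        q.static_rqs[i].tag = i;
    q.static_rqs[1].state = RQ_IN_FLIGHT;
    q.static_rqs[3].state = RQ_IN_FLIGHT;

    queue_busy_iter(&q, count_rq, &busy);
    assert(q.usage_counter == 1);        /* reference dropped again */
    return busy;
}

static int demo_frozen_queue(void)
{
    struct queue q = { 0 };              /* counter is zero: frozen and drained */

    return queue_busy_iter(&q, count_rq, NULL);
}
```

Because static_rqs[] entries are never repointed, the iterator cannot follow
a stale pointer, and the tryget keeps it away from a queue whose hctxs and
tags may be changing underneath it.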

The 1st patch gets rid of the useless synchronize_rcu in __blk_mq_update_nr_hw_queues.

The 2nd patch modifies blk_mq_queue_tag_busy_iter to use tags->static_rqs[]
instead of tags->rqs[] to iterate the busy tags.

The 3rd ~ 7th patches change users of blk_mq_tagset_busy_iter to
blk_mq_queue_tag_busy_iter, which is safer.

The 8th patch gets rid of blk_mq_tagset_busy_iter.

Jianchao Wang (8):
	blk-mq: get rid of the synchronize_rcu in
	blk-mq: change the method of iterating busy tags of a
	blk-mq: use blk_mq_queue_tag_busy_iter in debugfs
	mtip32xx: use blk_mq_queue_tag_busy_iter
	nbd: use blk_mq_queue_tag_busy_iter
	skd: use blk_mq_queue_tag_busy_iter
	nvme: use blk_mq_queue_tag_busy_iter
	blk-mq: remove blk_mq_tagset_busy_iter

diff stat
 block/blk-mq-debugfs.c            |   4 +-
 block/blk-mq-tag.c                | 173 +++++++++++++++++++++++++-------------------------------------------------------------
 block/blk-mq-tag.h                |   2 -
 block/blk-mq.c                    |  35 ++++++------------
 drivers/block/mtip32xx/mtip32xx.c |   8 ++--
 drivers/block/nbd.c               |   2 +-
 drivers/block/skd_main.c          |   4 +-
 drivers/nvme/host/core.c          |  12 ++++++
 drivers/nvme/host/fc.c            |  12 +++---
 drivers/nvme/host/nvme.h          |   2 +
 drivers/nvme/host/pci.c           |   5 ++-
 drivers/nvme/host/rdma.c          |   6 +--
 drivers/nvme/host/tcp.c           |   5 ++-
 drivers/nvme/target/loop.c        |   6 +--
 include/linux/blk-mq.h            |   7 ++--
 15 files changed, 105 insertions(+), 178 deletions(-)

Thanks
Jianchao

