linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sagi Grimberg <sagi@grimberg.me>
To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-rdma@vger.kernel.org, target-devel@vger.kernel.org
Subject: [PATCH rfc 00/10] non selective polling block interface
Date: Thu,  9 Mar 2017 15:16:32 +0200	[thread overview]
Message-ID: <1489065402-14757-1-git-send-email-sagi@grimberg.me> (raw)

Today, our only polling interface is selective in the sense that
it polls for a specific tag (cookie). blk_mq_poll will not complete
until the specific tag has completed (given that the block driver
implements it obviously).

target mode drivers like our nvme and scsi target, can benefit
from opportunistically polling the block device when we submit
a bio to it, but it doesn't make sense to use a selective
polling interface (like nvmet does at the moment) for it
because we don't care about specific I/O for the time being.

Instead, allow to poll for batch of completions and return if
we don't have any completions or we exhausted our budget (batch).

This set also adds poll_batch support for nvme-pci and nvme-rdma,
and converts nvmet and scsi target to use it. Note that I couldn't
come up with a hero value for the batch size, so I left it at
magic 4 for now, perhaps someone can have a better idea for this.

In addition, I'd like to see if we can hook this with frontend
context (nvmet-rdma, srpt or isert) to avoid scheduling for interrupt
if we have pending block IO that we can poll for.

I would also like to somehow allow aio-dio user-space reap to also have
access to this in the future, but I have yet to come up with something
good for it.

I experimented with this code on nvmet-rdma with a strong initiator
bombarding small 512B IOs (4k block size saturates my network) against
a 4 cpu-core nvmet-rdma target system.

Without this patchset I got:
590K/590K read/write IOPs

With this patchset applied I got:
680K/680K read/write IOPs

The canonical read latency (QD=1) did not have a noticeable
change (29-30 usec).

Hopefully if this is appealing, people can experiment with this
and report back their results.

Sagi Grimberg (10):
  nvme-pci: Split __nvme_process_cq to poll and handle
  nvme-pci: Add budget to __nvme_process_cq
  nvme-pci: open-code polling logic in nvme_poll
  block: Add a non-selective polling interface
  nvme-pci: Support blk_poll_batch
  IB/cq: Don't force IB_POLL_DIRECT poll context for
    ib_process_cq_direct
  nvme-rdma: Don't rearm the CQ when polling directly
  nvme-rdma: Support blk_poll_batch
  nvmet: Use non-selective polling
  target: Use non-selective polling

 block/blk-mq.c                      |  14 ++++
 drivers/infiniband/core/cq.c        |   2 -
 drivers/nvme/host/pci.c             | 146 +++++++++++++++++++++++-------------
 drivers/nvme/host/rdma.c            |   9 ++-
 drivers/nvme/target/io-cmd.c        |   8 +-
 drivers/target/target_core_iblock.c |   1 +
 include/linux/blk-mq.h              |   2 +
 include/linux/blkdev.h              |   1 +
 8 files changed, 125 insertions(+), 58 deletions(-)

-- 
2.7.4


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

             reply	other threads:[~2017-03-09 13:16 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-09 13:16 Sagi Grimberg [this message]
2017-03-09 13:16 ` [PATCH rfc 01/10] nvme-pci: Split __nvme_process_cq to poll and handle Sagi Grimberg
2017-03-09 13:57   ` Johannes Thumshirn
2017-03-22 19:07   ` Christoph Hellwig
2017-03-09 13:16 ` [PATCH rfc 02/10] nvme-pci: Add budget to __nvme_process_cq Sagi Grimberg
2017-03-09 13:46   ` Johannes Thumshirn
2017-03-22 19:08   ` Christoph Hellwig
2017-03-09 13:16 ` [PATCH rfc 03/10] nvme-pci: open-code polling logic in nvme_poll Sagi Grimberg
2017-03-09 13:56   ` Johannes Thumshirn
2017-03-22 19:09   ` Christoph Hellwig
2017-03-09 13:16 ` [PATCH rfc 04/10] block: Add a non-selective polling interface Sagi Grimberg
2017-03-09 13:44   ` Johannes Thumshirn
2017-03-10  3:04     ` Damien Le Moal
2017-03-13  8:26       ` Sagi Grimberg
2017-03-09 16:25   ` Bart Van Assche
2017-03-13  8:15     ` Sagi Grimberg
2017-03-14 21:21       ` Bart Van Assche
2017-03-09 13:16 ` [PATCH rfc 05/10] nvme-pci: Support blk_poll_batch Sagi Grimberg
2017-03-09 13:16 ` [PATCH rfc 06/10] IB/cq: Don't force IB_POLL_DIRECT poll context for ib_process_cq_direct Sagi Grimberg
2017-03-09 16:30   ` Bart Van Assche
2017-03-13  8:24     ` Sagi Grimberg
2017-03-14 21:32       ` Bart Van Assche
2017-03-09 13:16 ` [PATCH rfc 07/10] nvme-rdma: Don't rearm the CQ when polling directly Sagi Grimberg
2017-03-09 13:52   ` Johannes Thumshirn
2017-03-09 13:16 ` [PATCH rfc 08/10] nvme-rdma: Support blk_poll_batch Sagi Grimberg
2017-03-09 13:16 ` [PATCH rfc 09/10] nvmet: Use non-selective polling Sagi Grimberg
2017-03-09 13:54   ` Johannes Thumshirn
2017-03-09 13:16 ` [PATCH rfc 10/10] target: " Sagi Grimberg
2017-03-18 23:58   ` Nicholas A. Bellinger
2017-03-21 11:35     ` Sagi Grimberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1489065402-14757-1-git-send-email-sagi@grimberg.me \
    --to=sagi@grimberg.me \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=target-devel@vger.kernel.org \
    --subject='Re: [PATCH rfc 00/10] non selective polling block interface' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
on how to clone and mirror all data and code used for this inbox