io-uring.vger.kernel.org archive mirror
* [PATCH 0/5] dma mapping optimisations
From: Keith Busch @ 2022-07-26 17:38 UTC
  To: linux-nvme, linux-block, io-uring, linux-fsdevel
  Cc: axboe, hch, Alexander Viro, Keith Busch

From: Keith Busch <kbusch@kernel.org>

A user address typically passes through several representations on its
way to a block device for a read or write, and it does so for every IO.
Each representation consumes memory and CPU cycles. When the backing
storage is NVMe, the sequence looks something like the following:

  __user void *
  struct iov_iter
  struct pages[]
  struct bio_vec[]
  struct scatterlist[]
  __le64[]

Applications will often use the same buffer for many IOs, though, so
repeating these per-IO transformations to reach the exact same hardware
descriptor is unnecessary.

The io_uring interface already provides a way for users to register
buffers to get to the 'struct bio_vec[]'. That still leaves the
scatterlist needed for the repeated dma_map_sg(), and then the
transformation to nvme's PRP list format.
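
To make the last step of that sequence concrete, here is a minimal
userspace sketch of how one contiguous DMA range turns into PRP-style
entries. It is not code from this series: build_prps() and
NVME_PAGE_SIZE are hypothetical names, a 4096-byte controller page is
assumed, and PRP list chaining and little-endian conversion are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NVME_PAGE_SIZE 4096u /* assumed controller memory page size */

/*
 * Hypothetical helper: fill prps[] with one entry per controller page
 * touched by the contiguous DMA range [dma_addr, dma_addr + len).
 * Returns the number of entries written, or 0 if len is 0 or prps[]
 * is too small to hold them all.
 */
static size_t build_prps(uint64_t dma_addr, size_t len,
                         uint64_t *prps, size_t max_entries)
{
    uint64_t offset = dma_addr & (NVME_PAGE_SIZE - 1);
    size_t first_len = NVME_PAGE_SIZE - offset;
    size_t n = 0;

    assert(prps != NULL);
    if (len == 0 || max_entries == 0)
        return 0;

    /* The first entry may carry an offset into its page. */
    prps[n++] = dma_addr;
    if (len <= first_len)
        return n;
    len -= first_len;
    dma_addr += first_len;

    /* Every following entry is page aligned. */
    while (len > 0) {
        if (n == max_entries)
            return 0;
        prps[n++] = dma_addr;
        dma_addr += NVME_PAGE_SIZE;
        len = len > NVME_PAGE_SIZE ? len - NVME_PAGE_SIZE : 0;
    }
    return n;
}
```

Under these assumptions a page-aligned 128k buffer needs 32 entries,
and without a premapped buffer that work is repeated for every IO.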

This series takes the registered buffers a step further. A block driver
can implement a new .dma_map() callback to complete the transformation
to the hardware's DMA mapped address representation, and return a cookie
so a user can reference it later for any given IO. When used, the block
stack can skip significant amounts of code, improving CPU utilization
and, if not bandwidth limited, IOPS. The larger the IO, the more
significant the improvement.
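
The map-once, reuse-many idea can be modelled in userspace as below.
This is only an illustration under stated assumptions, not the
interface added by the patches: struct dma_tag, dma_tag_create(),
dma_tag_lookup(), dma_tag_destroy() and the fake bus offset are all
made up, and the buffer is assumed to be page aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size for this model */

/* Hypothetical cookie: per-page device addresses computed once. */
struct dma_tag {
    uint64_t *page_addrs;
    size_t nr_pages;
    unsigned long map_calls; /* counts the mapping work performed */
};

/* Stand-in for the expensive per-page mapping step. */
static uint64_t model_map_page(struct dma_tag *tag, uint64_t cpu_page)
{
    tag->map_calls++;
    return cpu_page + 0x100000000ull; /* made-up bus offset */
}

/* Registration time: map every page of the buffer exactly once. */
static struct dma_tag *dma_tag_create(uint64_t buf, size_t len)
{
    struct dma_tag *tag = calloc(1, sizeof(*tag));
    size_t i;

    if (!tag)
        return NULL;
    tag->nr_pages = (len + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
    tag->page_addrs = calloc(tag->nr_pages, sizeof(*tag->page_addrs));
    if (!tag->page_addrs) {
        free(tag);
        return NULL;
    }
    for (i = 0; i < tag->nr_pages; i++)
        tag->page_addrs[i] =
            model_map_page(tag, buf + (uint64_t)i * MODEL_PAGE_SIZE);
    return tag;
}

/* Per-IO path: a table lookup, with no mapping work at all. */
static uint64_t dma_tag_lookup(const struct dma_tag *tag, size_t offset)
{
    assert(offset / MODEL_PAGE_SIZE < tag->nr_pages);
    return tag->page_addrs[offset / MODEL_PAGE_SIZE] +
           offset % MODEL_PAGE_SIZE;
}

static void dma_tag_destroy(struct dma_tag *tag)
{
    free(tag->page_addrs);
    free(tag);
}
```

In this model, map_calls stops growing once the tag exists, no matter
how many IOs look addresses up through it.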

The implementation is currently limited to mapping a registered buffer
to a single block device.

Here is some perf profiling from 128k random read tests demonstrating
the CPU savings:

With premapped bvec:

  --46.84%--blk_mq_submit_bio
            |
            |--31.67%--blk_mq_try_issue_directly
                       |
                        --31.57%--__blk_mq_try_issue_directly
                                  |
                                   --31.39%--nvme_queue_rq
                                             |
                                             |--25.35%--nvme_prep_rq.part.68

With premapped DMA:

  --25.86%--blk_mq_submit_bio
            |
            |--12.95%--blk_mq_try_issue_directly
                       |
                        --12.84%--__blk_mq_try_issue_directly
                                  |
                                   --12.53%--nvme_queue_rq
                                             |
                                             |--5.01%--nvme_prep_rq.part.68

Keith Busch (5):
  blk-mq: add ops to dma map bvec
  iov_iter: introduce type for preregistered dma tags
  block: add dma tag bio type
  io_uring: add support for dma pre-mapping
  nvme-pci: implement dma_map support

 block/bdev.c                  |  20 +++
 block/bio.c                   |  25 ++-
 block/blk-merge.c             |  18 +++
 drivers/nvme/host/pci.c       | 291 +++++++++++++++++++++++++++++++++-
 include/linux/bio.h           |  21 +--
 include/linux/blk-mq.h        |  25 +++
 include/linux/blk_types.h     |   6 +-
 include/linux/blkdev.h        |  16 ++
 include/linux/uio.h           |   9 ++
 include/uapi/linux/io_uring.h |  12 ++
 io_uring/io_uring.c           | 129 +++++++++++++++
 io_uring/net.c                |   2 +-
 io_uring/rsrc.c               |  13 +-
 io_uring/rsrc.h               |  16 +-
 io_uring/rw.c                 |   2 +-
 lib/iov_iter.c                |  25 ++-
 16 files changed, 600 insertions(+), 30 deletions(-)

-- 
2.30.2


Thread overview: 19+ messages
2022-07-26 17:38 [PATCH 0/5] dma mapping optimisations Keith Busch
2022-07-26 17:38 ` [PATCH 1/5] blk-mq: add ops to dma map bvec Keith Busch
2022-07-26 17:38 ` [PATCH 2/5] iov_iter: introduce type for preregistered dma tags Keith Busch
2022-07-26 23:10   ` Al Viro
2022-07-27 13:52     ` Keith Busch
2022-07-26 17:38 ` [PATCH 3/5] block: add dma tag bio type Keith Busch
2022-07-26 17:38 ` [PATCH 4/5] io_uring: add support for dma pre-mapping Keith Busch
2022-07-26 23:12   ` Al Viro
2022-07-27 13:58     ` Keith Busch
2022-07-27 14:04       ` Al Viro
2022-07-27 15:04         ` Keith Busch
2022-07-27 22:32           ` Dave Chinner
2022-07-27 23:00             ` Keith Busch
2022-07-28  2:35               ` Dave Chinner
2022-07-28 13:25                 ` Keith Busch
2022-07-27 14:11   ` Al Viro
2022-07-27 14:48     ` Keith Busch
2022-07-27 15:26       ` Al Viro
2022-07-26 17:38 ` [PATCH 5/5] nvme-pci: implement dma_map support Keith Busch
