[PATCHSET v3 0/18] Support for polled aio

From: Jens Axboe @ 2018-11-26 16:45 UTC
  To: linux-block, linux-aio, linux-fsdevel

For the grand introduction to this feature, see my original posting
here:

https://lore.kernel.org/linux-block/20181117235317.7366-1-axboe@kernel.dk/

The patchset continues to evolve, and has grown some optimizations that
benefit non-polled aio as well. One such example is the user mapped
iocbs, which avoid copying a full iocb for each IO. If you look at
memory performance, we copy 72 bytes for each IO at submission time (8
byte pointer, 64 byte iocb), then copy back 16 bytes at completion for
the io_event. That's 88 bytes of memory copies, for something that's
potentially a 512 byte IO.
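
To put the copy overhead in proportion, here is a minimal sketch of the
arithmetic in C. The sizes are the ones quoted above, hard-coded rather
than taken from kernel headers, and the names (IOCB_PTR_BYTES,
copy_bytes_per_io(), copy_overhead_percent()) are made up for this
illustration and are not part of the patchset.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */

/* Sizes as quoted in the text above; illustrative constants only. */
enum {
	IOCB_PTR_BYTES = 8,	/* pointer to the iocb, copied at submission */
	IOCB_BYTES     = 64,	/* the iocb itself, copied at submission */
	IO_EVENT_BYTES = 16,	/* io_event copied back at completion */
};

static_assert(IOCB_PTR_BYTES + IOCB_BYTES == 72,
	      "submission side copies 72 bytes");

/* Total bytes copied for one IO: submission plus completion. */
size_t copy_bytes_per_io(void)
{
	return IOCB_PTR_BYTES + IOCB_BYTES + IO_EVENT_BYTES;
}

/*
 * Copy overhead as a percentage of the IO payload size, rounded down.
 * Returns 0 for a zero-sized IO to avoid dividing by zero.
 */
unsigned int copy_overhead_percent(size_t io_bytes)
{
	if (io_bytes == 0)
		return 0;
	return (unsigned int)(copy_bytes_per_io() * 100 / io_bytes);
}
```

With these numbers, copy_overhead_percent(512) comes out at 17, while
copy_overhead_percent(4096) is only 2, which is why the copies matter
most for small IOs.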

In terms of observed performance, I've been able to do 1070K 4k IOPS
with this on a single device, using a single thread for both IO
submissions and completions. For real-life use cases, performance on a
flash storage box is up 25% in terms of IOPS/throughput.

As before, patches are on top of my mq-perf branch.

Changes since v2:

- Fix an off-by-one causing QD=1 aio polls to sometimes stall.
- Add fget_many/fput_many optimization.
- Ensure that io_cancel() is never called on a polled iocb.
- Add support for user mapped iocbs.
- IOCTX_FLAG_IOPOLL is now (1 << 1), since it was reordered after
  the user iocb support.
- Kill full aio_kiocb zero on alloc.
- Move fops->iopoll() storage back to the iocb (->ki_cookie). The usage
  in the dio private storage was inherently racy, making the store of
  the submission cookie open to use-after-free.
- Fix narrow use-before-submit race in fops->iopoll() access.
- Batch move submitted iocbs to poll_submitted list.
- Pass back EAGAIN at submission time, don't punt to getevents() time.
- Split some patches up.
- Address review concerns.
- Various little fixes and optimizations.

 Documentation/filesystems/vfs.txt      |   3 +
 arch/x86/entry/syscalls/syscall_64.tbl |   1 +
 fs/aio.c                               | 841 +++++++++++++++++++++----
 fs/block_dev.c                         |  20 +-
 fs/file.c                              |  15 +-
 fs/file_table.c                        |  10 +-
 fs/gfs2/file.c                         |   2 +
 fs/iomap.c                             |  51 +-
 fs/xfs/xfs_file.c                      |   1 +
 include/linux/file.h                   |   2 +
 include/linux/fs.h                     |   5 +-
 include/linux/iomap.h                  |   1 +
 include/linux/syscalls.h               |   2 +
 include/uapi/asm-generic/unistd.h      |   4 +-
 include/uapi/linux/aio_abi.h           |   5 +
 kernel/sys_ni.c                        |   1 +
 16 files changed, 819 insertions(+), 145 deletions(-)

-- 
Jens Axboe
