linux-fsdevel.vger.kernel.org archive mirror
* [PATCHSET v4] Support for polled aio
@ 2018-11-30 16:56 Jens Axboe
  2018-11-30 16:56 ` [PATCH 01/27] aio: fix failure to put the file pointer Jens Axboe
                   ` (26 more replies)
  0 siblings, 27 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch

For the grand introduction to this feature, see my original posting
here:

https://lore.kernel.org/linux-block/20181117235317.7366-1-axboe@kernel.dk/

and refer to the previous postings of this patchset for the features
added in those revisions.

Outside of "just" supporting polled IO, it also adds support for user
mapped IOCBs, so we don't have to copy those for every IO. A new
addition in this version is support for pre-mapped buffers as well.
If an application uses fixed IO buffers, it can set IOCTX_FLAG_FIXEDBUFS
along with IOCTX_FLAG_USERIOCB. The iocbs that are mapped in should have
the maximum length already set, and the buffer field pointing to the
right location. That eliminates the need to do get_user_pages() for
every IO.
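
As a rough illustration (not part of the patches themselves), setting up
a context in this mode from userspace could look like the sketch below.
It assumes the io_setup2() signature added later in this series, and that
__NR_io_setup2 and the IOCTX_FLAG_* values are visible through updated
headers; treat it as a sketch, not a definitive example:

#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/aio_abi.h>

#define QD	32
#define BS	4096

static int setup_fixed_ctx(aio_context_t *ctx, struct iocb **iocbs_out)
{
	struct iocb *iocbs;
	char *buf;
	int i;

	/* the iocb array must be page aligned, the kernel maps it directly */
	if (posix_memalign((void **) &iocbs, 4096, QD * sizeof(*iocbs)))
		return -1;
	if (posix_memalign((void **) &buf, 4096, QD * BS))
		return -1;

	memset(iocbs, 0, QD * sizeof(*iocbs));
	for (i = 0; i < QD; i++) {
		/* buffer and max length set up front, mapped once by the kernel */
		iocbs[i].aio_buf = (uint64_t) (uintptr_t) (buf + i * BS);
		iocbs[i].aio_nbytes = BS;
	}

	*iocbs_out = iocbs;
	*ctx = 0;
	return syscall(__NR_io_setup2, QD,
		       IOCTX_FLAG_USERIOCB | IOCTX_FLAG_FIXEDBUFS, iocbs, ctx);
}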

Everything has been solid for me in all the testing I have done: no
problems observed, no crashes, no corruptions, etc.

For the testing below, 'Mainline' refers to current -git from Linus,
'aio-poll' is the aio-poll branch but with none of the new features
enabled, and finally 'aio-poll-all' is the aio-poll branch with
useriocbs turned on, polling turned on, and user mapped buffers
turned on. In other words, mainline and aio-poll are running the
exact same workload, and aio-poll-all is running that workload,
but with the new features turned on.

All testing done with fio. Latencies quoted are 50th percentile.

All testing is done with a single thread, using a maximum of one
core in the system. Testing is run on two devices - one that supports
high peak IOPS, and one that is low latency.


Peak IOPS testing on an NVMe device that supports high IOPS:

Depth      Mainline           aio-poll        aio-poll-all
============================================================
1            77K                80K              132K
2           145K               163K              262K
4           287K               342K              514K
8           560K               582K              824K
16          616K               727K             1013K
32          636K               773K             1155K
64          635K               776K             1230K


Low latency testing on low latency device:

Depth      Mainline	      aio-poll        aio-poll-all
============================================================
1        84K / 8.5 usec    87K / 8.3 usec    168K / 5.0 usec
2       201K / 7.4 usec   208K / 7.1 usec    330K / 5.0 usec
4       389K / 7.7 usec   398K / 7.2 usec    547K / 6.1 usec

It's worth noting that the average IO submission time for
'aio-poll-all' is 660 nsec, with aio-poll at 1.8 - 2.0 usec, and finally
mainline at 1.8 - 2.1 usec.

As before, patches are against my 'mq-perf' branch, and can also
be found in my aio-poll branch.

 Documentation/filesystems/vfs.txt      |    3 +
 arch/x86/entry/syscalls/syscall_64.tbl |    1 +
 block/bio.c                            |   36 +-
 fs/aio.c                               | 1055 +++++++++++++++++++++---
 fs/block_dev.c                         |   36 +-
 fs/file.c                              |   15 +-
 fs/file_table.c                        |   10 +-
 fs/gfs2/file.c                         |    2 +
 fs/iomap.c                             |   56 +-
 fs/xfs/xfs_file.c                      |    1 +
 include/linux/bio.h                    |    1 +
 include/linux/blk_types.h              |    1 +
 include/linux/file.h                   |    2 +
 include/linux/fs.h                     |    5 +-
 include/linux/iomap.h                  |    1 +
 include/linux/syscalls.h               |    2 +
 include/linux/uio.h                    |    3 +
 include/uapi/asm-generic/unistd.h      |    4 +-
 include/uapi/linux/aio_abi.h           |    6 +
 kernel/sys_ni.c                        |    1 +
 lib/iov_iter.c                         |   35 +-
 21 files changed, 1109 insertions(+), 167 deletions(-)

-- 
Jens Axboe

* [PATCH 01/27] aio: fix failure to put the file pointer
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 17:07   ` Bart Van Assche
  2018-11-30 16:56 ` [PATCH 02/27] aio: clear IOCB_HIPRI Jens Axboe
                   ` (25 subsequent siblings)
  26 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

If the ioprio capability check fails, we return without putting
the file pointer.

Fixes: d9a08a9e616b ("fs: Add aio iopriority support")
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/aio.c b/fs/aio.c
index b984918be4b7..205390c0c1bb 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1436,6 +1436,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
 		ret = ioprio_check_cap(iocb->aio_reqprio);
 		if (ret) {
 			pr_debug("aio ioprio check cap error: %d\n", ret);
+			fput(req->ki_filp);
 			return ret;
 		}
 
-- 
2.17.1

* [PATCH 02/27] aio: clear IOCB_HIPRI
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
  2018-11-30 16:56 ` [PATCH 01/27] aio: fix failure to put the file pointer Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 17:13   ` Christoph Hellwig
  2018-11-30 16:56 ` [PATCH 03/27] fs: add an iopoll method to struct file_operations Jens Axboe
                   ` (24 subsequent siblings)
  26 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

From: Christoph Hellwig <hch@lst.de>

No one is going to poll for aio (yet), so we must clear the HIPRI
flag, as we would otherwise send it down the poll queues, where no
one will be polling for completions.

Signed-off-by: Christoph Hellwig <hch@lst.de>

IOCB_HIPRI, not RWF_HIPRI.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 205390c0c1bb..05647d352bf3 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1436,8 +1436,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
 		ret = ioprio_check_cap(iocb->aio_reqprio);
 		if (ret) {
 			pr_debug("aio ioprio check cap error: %d\n", ret);
-			fput(req->ki_filp);
-			return ret;
+			goto out_fput;
 		}
 
 		req->ki_ioprio = iocb->aio_reqprio;
@@ -1446,7 +1445,13 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
 
 	ret = kiocb_set_rw_flags(req, iocb->aio_rw_flags);
 	if (unlikely(ret))
-		fput(req->ki_filp);
+		goto out_fput;
+
+	req->ki_flags &= ~IOCB_HIPRI; /* no one is going to poll for this I/O */
+	return 0;
+
+out_fput:
+	fput(req->ki_filp);
 	return ret;
 }
 
-- 
2.17.1

* [PATCH 03/27] fs: add an iopoll method to struct file_operations
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
  2018-11-30 16:56 ` [PATCH 01/27] aio: fix failure to put the file pointer Jens Axboe
  2018-11-30 16:56 ` [PATCH 02/27] aio: clear IOCB_HIPRI Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 04/27] block: wire up block device iopoll method Jens Axboe
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

From: Christoph Hellwig <hch@lst.de>

This new method is used to explicitly poll for I/O completion for an
iocb.  It must be called for any iocb submitted asynchronously (that
is with a non-null ki_complete) which has the IOCB_HIPRI flag set.

The method is assisted by a new ki_cookie field in struct iocb to store
the polling cookie.

TODO: we can probably union ki_cookie with the existing hint and I/O
priority fields to avoid struct kiocb growth.
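
As a rough sketch of the caller-side contract (not part of this patch),
reaping a single polled iocb might look like the below, assuming the
submitter keeps the kiocb alive until it has been reaped:

static int reap_iocb(struct kiocb *kiocb, bool spin)
{
	struct file *file = kiocb->ki_filp;

	if (!(kiocb->ki_flags & IOCB_HIPRI) || !file->f_op->iopoll)
		return 0;

	/*
	 * Drives completion; the iocb itself still completes through
	 * ->ki_complete() as usual.
	 */
	return file->f_op->iopoll(kiocb, spin);
}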

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 Documentation/filesystems/vfs.txt | 3 +++
 include/linux/fs.h                | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 5f71a252e2e0..d9dc5e4d82b9 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -857,6 +857,7 @@ struct file_operations {
 	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
 	ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
 	ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
+	int (*iopoll)(struct kiocb *kiocb, bool spin);
 	int (*iterate) (struct file *, struct dir_context *);
 	int (*iterate_shared) (struct file *, struct dir_context *);
 	__poll_t (*poll) (struct file *, struct poll_table_struct *);
@@ -902,6 +903,8 @@ otherwise noted.
 
   write_iter: possibly asynchronous write with iov_iter as source
 
+  iopoll: called when aio wants to poll for completions on HIPRI iocbs
+
   iterate: called when the VFS needs to read the directory contents
 
   iterate_shared: called when the VFS needs to read the directory contents
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a1ab233e6469..6a5f71f8ae06 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -310,6 +310,7 @@ struct kiocb {
 	int			ki_flags;
 	u16			ki_hint;
 	u16			ki_ioprio; /* See linux/ioprio.h */
+	unsigned int		ki_cookie; /* for ->iopoll */
 } __randomize_layout;
 
 static inline bool is_sync_kiocb(struct kiocb *kiocb)
@@ -1781,6 +1782,7 @@ struct file_operations {
 	ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
 	ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
 	ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
+	int (*iopoll)(struct kiocb *kiocb, bool spin);
 	int (*iterate) (struct file *, struct dir_context *);
 	int (*iterate_shared) (struct file *, struct dir_context *);
 	__poll_t (*poll) (struct file *, struct poll_table_struct *);
-- 
2.17.1

* [PATCH 04/27] block: wire up block device iopoll method
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (2 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 03/27] fs: add an iopoll method to struct file_operations Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 05/27] block: ensure that async polled IO is marked REQ_NOWAIT Jens Axboe
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

From: Christoph Hellwig <hch@lst.de>

Just call blk_poll on the iocb cookie; we can derive the block device
from the inode trivially.

Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/block_dev.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index e1886cc7048f..6de8d35f6e41 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -281,6 +281,14 @@ struct blkdev_dio {
 
 static struct bio_set blkdev_dio_pool;
 
+static int blkdev_iopoll(struct kiocb *kiocb, bool wait)
+{
+	struct block_device *bdev = I_BDEV(kiocb->ki_filp->f_mapping->host);
+	struct request_queue *q = bdev_get_queue(bdev);
+
+	return blk_poll(q, READ_ONCE(kiocb->ki_cookie), wait);
+}
+
 static void blkdev_bio_end_io(struct bio *bio)
 {
 	struct blkdev_dio *dio = bio->bi_private;
@@ -398,6 +406,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 				bio->bi_opf |= REQ_HIPRI;
 
 			qc = submit_bio(bio);
+			WRITE_ONCE(iocb->ki_cookie, qc);
 			break;
 		}
 
@@ -2070,6 +2079,7 @@ const struct file_operations def_blk_fops = {
 	.llseek		= block_llseek,
 	.read_iter	= blkdev_read_iter,
 	.write_iter	= blkdev_write_iter,
+	.iopoll		= blkdev_iopoll,
 	.mmap		= generic_file_mmap,
 	.fsync		= blkdev_fsync,
 	.unlocked_ioctl	= block_ioctl,
-- 
2.17.1

* [PATCH 05/27] block: ensure that async polled IO is marked REQ_NOWAIT
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (3 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 04/27] block: wire up block device iopoll method Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 17:12   ` Bart Van Assche
  2018-11-30 16:56 ` [PATCH 06/27] iomap: wire up the iopoll method Jens Axboe
                   ` (21 subsequent siblings)
  26 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

We can't wait for polled events to complete, as they may require active
polling from the task that submitted them. If that is the same task that
is submitting new IO, we could deadlock waiting for IO that this task
itself is supposed to be reaping.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/block_dev.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 6de8d35f6e41..ebc3d5a0f424 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -402,8 +402,16 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 
 		nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES);
 		if (!nr_pages) {
-			if (iocb->ki_flags & IOCB_HIPRI)
+			if (iocb->ki_flags & IOCB_HIPRI) {
 				bio->bi_opf |= REQ_HIPRI;
+				/*
+				 * For async polled IO, we can't wait for
+				 * requests to complete, as they may also be
+				 * polled and require active reaping.
+				 */
+				if (!is_sync)
+					bio->bi_opf |= REQ_NOWAIT;
+			}
 
 			qc = submit_bio(bio);
 			WRITE_ONCE(iocb->ki_cookie, qc);
-- 
2.17.1

* [PATCH 06/27] iomap: wire up the iopoll method
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (4 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 05/27] block: ensure that async polled IO is marked REQ_NOWAIT Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 07/27] iomap: ensure that async polled IO is marked REQ_NOWAIT Jens Axboe
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

From: Christoph Hellwig <hch@lst.de>

Store the request queue the last bio was submitted to in the iocb
private data in addition to the cookie so that we find the right block
device.  Also refactor the common direct I/O bio submission code into a
nice little helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/gfs2/file.c        |  2 ++
 fs/iomap.c            | 43 ++++++++++++++++++++++++++++---------------
 fs/xfs/xfs_file.c     |  1 +
 include/linux/iomap.h |  1 +
 4 files changed, 32 insertions(+), 15 deletions(-)

diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 45a17b770d97..358157efc5b7 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -1280,6 +1280,7 @@ const struct file_operations gfs2_file_fops = {
 	.llseek		= gfs2_llseek,
 	.read_iter	= gfs2_file_read_iter,
 	.write_iter	= gfs2_file_write_iter,
+	.iopoll		= iomap_dio_iopoll,
 	.unlocked_ioctl	= gfs2_ioctl,
 	.mmap		= gfs2_mmap,
 	.open		= gfs2_open,
@@ -1310,6 +1311,7 @@ const struct file_operations gfs2_file_fops_nolock = {
 	.llseek		= gfs2_llseek,
 	.read_iter	= gfs2_file_read_iter,
 	.write_iter	= gfs2_file_write_iter,
+	.iopoll		= iomap_dio_iopoll,
 	.unlocked_ioctl	= gfs2_ioctl,
 	.mmap		= gfs2_mmap,
 	.open		= gfs2_open,
diff --git a/fs/iomap.c b/fs/iomap.c
index 74c1f37f0fd6..16787b3b09fd 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1436,6 +1436,28 @@ struct iomap_dio {
 	};
 };
 
+int iomap_dio_iopoll(struct kiocb *kiocb, bool spin)
+{
+	struct request_queue *q = READ_ONCE(kiocb->private);
+
+	if (!q)
+		return 0;
+	return blk_poll(q, READ_ONCE(kiocb->ki_cookie), spin);
+}
+EXPORT_SYMBOL_GPL(iomap_dio_iopoll);
+
+static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
+		struct bio *bio)
+{
+	atomic_inc(&dio->ref);
+
+	if (dio->iocb->ki_flags & IOCB_HIPRI)
+		bio->bi_opf |= REQ_HIPRI;
+
+	dio->submit.last_queue = bdev_get_queue(iomap->bdev);
+	dio->submit.cookie = submit_bio(bio);
+}
+
 static ssize_t iomap_dio_complete(struct iomap_dio *dio)
 {
 	struct kiocb *iocb = dio->iocb;
@@ -1548,7 +1570,7 @@ static void iomap_dio_bio_end_io(struct bio *bio)
 	}
 }
 
-static blk_qc_t
+static void
 iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 		unsigned len)
 {
@@ -1562,15 +1584,10 @@ iomap_dio_zero(struct iomap_dio *dio, struct iomap *iomap, loff_t pos,
 	bio->bi_private = dio;
 	bio->bi_end_io = iomap_dio_bio_end_io;
 
-	if (dio->iocb->ki_flags & IOCB_HIPRI)
-		flags |= REQ_HIPRI;
-
 	get_page(page);
 	__bio_add_page(bio, page, len, 0);
 	bio_set_op_attrs(bio, REQ_OP_WRITE, flags);
-
-	atomic_inc(&dio->ref);
-	return submit_bio(bio);
+	iomap_dio_submit_bio(dio, iomap, bio);
 }
 
 static loff_t
@@ -1666,9 +1683,6 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 				bio_set_pages_dirty(bio);
 		}
 
-		if (dio->iocb->ki_flags & IOCB_HIPRI)
-			bio->bi_opf |= REQ_HIPRI;
-
 		iov_iter_advance(dio->submit.iter, n);
 
 		dio->size += n;
@@ -1676,11 +1690,7 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 		copied += n;
 
 		nr_pages = iov_iter_npages(&iter, BIO_MAX_PAGES);
-
-		atomic_inc(&dio->ref);
-
-		dio->submit.last_queue = bdev_get_queue(iomap->bdev);
-		dio->submit.cookie = submit_bio(bio);
+		iomap_dio_submit_bio(dio, iomap, bio);
 	} while (nr_pages);
 
 	if (need_zeroout) {
@@ -1883,6 +1893,9 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	if (dio->flags & IOMAP_DIO_WRITE_FUA)
 		dio->flags &= ~IOMAP_DIO_NEED_SYNC;
 
+	WRITE_ONCE(iocb->ki_cookie, dio->submit.cookie);
+	WRITE_ONCE(iocb->private, dio->submit.last_queue);
+
 	if (!atomic_dec_and_test(&dio->ref)) {
 		if (!dio->wait_for_completion)
 			return -EIOCBQUEUED;
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 53c9ab8fb777..603e705781a4 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1203,6 +1203,7 @@ const struct file_operations xfs_file_operations = {
 	.write_iter	= xfs_file_write_iter,
 	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
+	.iopoll		= iomap_dio_iopoll,
 	.unlocked_ioctl	= xfs_file_ioctl,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= xfs_file_compat_ioctl,
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 9a4258154b25..0fefb5455bda 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -162,6 +162,7 @@ typedef int (iomap_dio_end_io_t)(struct kiocb *iocb, ssize_t ret,
 		unsigned flags);
 ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		const struct iomap_ops *ops, iomap_dio_end_io_t end_io);
+int iomap_dio_iopoll(struct kiocb *kiocb, bool spin);
 
 #ifdef CONFIG_SWAP
 struct file;
-- 
2.17.1

* [PATCH 07/27] iomap: ensure that async polled IO is marked REQ_NOWAIT
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (5 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 06/27] iomap: wire up the iopoll method Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 08/27] aio: use assigned completion handler Jens Axboe
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

We can't wait for polled events to complete, as they may require active
polling from the task that submitted them. If that is the same task that
is submitting new IO, we could deadlock waiting for IO that this task
itself is supposed to be reaping.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/iomap.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/iomap.c b/fs/iomap.c
index 16787b3b09fd..96d60b9b2bea 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1451,8 +1451,16 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
 {
 	atomic_inc(&dio->ref);
 
-	if (dio->iocb->ki_flags & IOCB_HIPRI)
+	if (dio->iocb->ki_flags & IOCB_HIPRI) {
 		bio->bi_opf |= REQ_HIPRI;
+		/*
+		 * For async polled IO, we can't wait for requests to
+		 * complete, as they may also be polled and require active
+		 * reaping.
+		 */
+		if (!dio->wait_for_completion)
+			bio->bi_opf |= REQ_NOWAIT;
+	}
 
 	dio->submit.last_queue = bdev_get_queue(iomap->bdev);
 	dio->submit.cookie = submit_bio(bio);
-- 
2.17.1

* [PATCH 08/27] aio: use assigned completion handler
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (6 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 07/27] iomap: ensure that async polled IO is marked REQ_NOWAIT Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 09/27] aio: separate out ring reservation from req allocation Jens Axboe
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

We know this is a read/write request, but in preparation for
having different kinds of those, ensure that we call the assigned
handler instead of assuming it's aio_complete_rw().

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/aio.c b/fs/aio.c
index 05647d352bf3..cf0de61743e8 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1490,7 +1490,7 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
 		ret = -EINTR;
 		/*FALLTHRU*/
 	default:
-		aio_complete_rw(req, ret, 0);
+		req->ki_complete(req, ret, 0);
 	}
 }
 
-- 
2.17.1

* [PATCH 09/27] aio: separate out ring reservation from req allocation
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (7 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 08/27] aio: use assigned completion handler Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 10/27] aio: don't zero entire aio_kiocb aio_get_req() Jens Axboe
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

This is in preparation for certain types of IO not needing a ring
reservation.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index cf0de61743e8..eaceb40e6cf5 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -901,7 +901,7 @@ static void put_reqs_available(struct kioctx *ctx, unsigned nr)
 	local_irq_restore(flags);
 }
 
-static bool get_reqs_available(struct kioctx *ctx)
+static bool __get_reqs_available(struct kioctx *ctx)
 {
 	struct kioctx_cpu *kcpu;
 	bool ret = false;
@@ -993,6 +993,14 @@ static void user_refill_reqs_available(struct kioctx *ctx)
 	spin_unlock_irq(&ctx->completion_lock);
 }
 
+static bool get_reqs_available(struct kioctx *ctx)
+{
+	if (__get_reqs_available(ctx))
+		return true;
+	user_refill_reqs_available(ctx);
+	return __get_reqs_available(ctx);
+}
+
 /* aio_get_req
  *	Allocate a slot for an aio request.
  * Returns NULL if no requests are free.
@@ -1001,24 +1009,15 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx)
 {
 	struct aio_kiocb *req;
 
-	if (!get_reqs_available(ctx)) {
-		user_refill_reqs_available(ctx);
-		if (!get_reqs_available(ctx))
-			return NULL;
-	}
-
 	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO);
 	if (unlikely(!req))
-		goto out_put;
+		return NULL;
 
 	percpu_ref_get(&ctx->reqs);
 	INIT_LIST_HEAD(&req->ki_list);
 	refcount_set(&req->ki_refcnt, 0);
 	req->ki_ctx = ctx;
 	return req;
-out_put:
-	put_reqs_available(ctx, 1);
-	return NULL;
 }
 
 static struct kioctx *lookup_ioctx(unsigned long ctx_id)
@@ -1805,9 +1804,13 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		return -EINVAL;
 	}
 
+	if (!get_reqs_available(ctx))
+		return -EAGAIN;
+
+	ret = -EAGAIN;
 	req = aio_get_req(ctx);
 	if (unlikely(!req))
-		return -EAGAIN;
+		goto out_put_reqs_available;
 
 	if (iocb.aio_flags & IOCB_FLAG_RESFD) {
 		/*
@@ -1870,11 +1873,12 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		goto out_put_req;
 	return 0;
 out_put_req:
-	put_reqs_available(ctx, 1);
 	percpu_ref_put(&ctx->reqs);
 	if (req->ki_eventfd)
 		eventfd_ctx_put(req->ki_eventfd);
 	kmem_cache_free(kiocb_cachep, req);
+out_put_reqs_available:
+	put_reqs_available(ctx, 1);
 	return ret;
 }
 
-- 
2.17.1

* [PATCH 10/27] aio: don't zero entire aio_kiocb aio_get_req()
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (8 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 09/27] aio: separate out ring reservation from req allocation Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-12-04 14:49   ` Christoph Hellwig
  2018-11-30 16:56 ` [PATCH 11/27] aio: only use blk plugs for > 2 depth submissions Jens Axboe
                   ` (16 subsequent siblings)
  26 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

It's 192 bytes, fairly substantial. Most items don't need to be cleared,
especially not upfront. Clear the ones we do need to clear, and leave
the other ones for setup when the iocb is prepared and submitted.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index eaceb40e6cf5..681f2072f81b 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1009,14 +1009,15 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx)
 {
 	struct aio_kiocb *req;
 
-	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO);
-	if (unlikely(!req))
-		return NULL;
+	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL);
+	if (req) {
+		percpu_ref_get(&ctx->reqs);
+		req->ki_ctx = ctx;
+		INIT_LIST_HEAD(&req->ki_list);
+		refcount_set(&req->ki_refcnt, 0);
+		req->ki_eventfd = NULL;
+	}
 
-	percpu_ref_get(&ctx->reqs);
-	INIT_LIST_HEAD(&req->ki_list);
-	refcount_set(&req->ki_refcnt, 0);
-	req->ki_ctx = ctx;
 	return req;
 }
 
@@ -1730,6 +1731,10 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb)
 	if (unlikely(!req->file))
 		return -EBADF;
 
+	req->head = NULL;
+	req->woken = false;
+	req->cancelled = false;
+
 	apt.pt._qproc = aio_poll_queue_proc;
 	apt.pt._key = req->events;
 	apt.iocb = aiocb;
-- 
2.17.1

* [PATCH 11/27] aio: only use blk plugs for > 2 depth submissions
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (9 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 10/27] aio: don't zero entire aio_kiocb aio_get_req() Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-12-04 14:50   ` Christoph Hellwig
  2018-11-30 16:56 ` [PATCH 12/27] aio: use iocb_put() instead of open coding it Jens Axboe
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

Plugging is meant to optimize submission of a string of IOs. If we don't
have more than 2 being submitted, don't bother setting up a plug.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 681f2072f81b..533cb7b1112f 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1887,6 +1887,12 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 	return ret;
 }
 
+/*
+ * Plugging is meant to work with larger batches of IOs. If we don't
+ * have more than the below, then don't bother setting up a plug.
+ */
+#define AIO_PLUG_THRESHOLD	2
+
 /* sys_io_submit:
  *	Queue the nr iocbs pointed to by iocbpp for processing.  Returns
  *	the number of iocbs queued.  May return -EINVAL if the aio_context
@@ -1919,7 +1925,8 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 	if (nr > ctx->nr_events)
 		nr = ctx->nr_events;
 
-	blk_start_plug(&plug);
+	if (nr > AIO_PLUG_THRESHOLD)
+		blk_start_plug(&plug);
 	for (i = 0; i < nr; i++) {
 		struct iocb __user *user_iocb;
 
@@ -1932,7 +1939,8 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 		if (ret)
 			break;
 	}
-	blk_finish_plug(&plug);
+	if (nr > AIO_PLUG_THRESHOLD)
+		blk_finish_plug(&plug);
 
 	percpu_ref_put(&ctx->users);
 	return i ? i : ret;
@@ -1959,7 +1967,8 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
 	if (nr > ctx->nr_events)
 		nr = ctx->nr_events;
 
-	blk_start_plug(&plug);
+	if (nr > AIO_PLUG_THRESHOLD)
+		blk_start_plug(&plug);
 	for (i = 0; i < nr; i++) {
 		compat_uptr_t user_iocb;
 
@@ -1972,7 +1981,8 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
 		if (ret)
 			break;
 	}
-	blk_finish_plug(&plug);
+	if (nr > AIO_PLUG_THRESHOLD)
+		blk_finish_plug(&plug);
 
 	percpu_ref_put(&ctx->users);
 	return i ? i : ret;
-- 
2.17.1

* [PATCH 12/27] aio: use iocb_put() instead of open coding it
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (10 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 11/27] aio: only use blk plugs for > 2 depth submissions Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-12-04 14:50   ` Christoph Hellwig
  2018-11-30 16:56 ` [PATCH 13/27] aio: split out iocb copy from io_submit_one() Jens Axboe
                   ` (14 subsequent siblings)
  26 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

Replace the percpu_ref_put() + kmem_cache_free() with a call to
iocb_put() instead.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 533cb7b1112f..e8457f9486e3 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1878,10 +1878,9 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		goto out_put_req;
 	return 0;
 out_put_req:
-	percpu_ref_put(&ctx->reqs);
 	if (req->ki_eventfd)
 		eventfd_ctx_put(req->ki_eventfd);
-	kmem_cache_free(kiocb_cachep, req);
+	iocb_put(req);
 out_put_reqs_available:
 	put_reqs_available(ctx, 1);
 	return ret;
-- 
2.17.1

* [PATCH 13/27] aio: split out iocb copy from io_submit_one()
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (11 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 12/27] aio: use iocb_put() instead of open coding it Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 14/27] aio: abstract out io_event filler helper Jens Axboe
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

In preparation for handing in iocbs in a different fashion as well. Also
make it clear that the iocb being passed in isn't modified, by marking
it const throughout.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 68 +++++++++++++++++++++++++++++++-------------------------
 1 file changed, 38 insertions(+), 30 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index e8457f9486e3..ba5758c854e8 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1414,7 +1414,7 @@ static void aio_complete_rw(struct kiocb *kiocb, long res, long res2)
 	aio_complete(iocb, res, res2);
 }
 
-static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
+static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
 {
 	int ret;
 
@@ -1455,7 +1455,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
 	return ret;
 }
 
-static int aio_setup_rw(int rw, struct iocb *iocb, struct iovec **iovec,
+static int aio_setup_rw(int rw, const struct iocb *iocb, struct iovec **iovec,
 		bool vectored, bool compat, struct iov_iter *iter)
 {
 	void __user *buf = (void __user *)(uintptr_t)iocb->aio_buf;
@@ -1494,8 +1494,8 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
 	}
 }
 
-static ssize_t aio_read(struct kiocb *req, struct iocb *iocb, bool vectored,
-		bool compat)
+static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb,
+			bool vectored, bool compat)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
 	struct iov_iter iter;
@@ -1527,8 +1527,8 @@ static ssize_t aio_read(struct kiocb *req, struct iocb *iocb, bool vectored,
 	return ret;
 }
 
-static ssize_t aio_write(struct kiocb *req, struct iocb *iocb, bool vectored,
-		bool compat)
+static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb,
+			 bool vectored, bool compat)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
 	struct iov_iter iter;
@@ -1583,7 +1583,8 @@ static void aio_fsync_work(struct work_struct *work)
 	aio_complete(container_of(req, struct aio_kiocb, fsync), ret, 0);
 }
 
-static int aio_fsync(struct fsync_iocb *req, struct iocb *iocb, bool datasync)
+static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
+		     bool datasync)
 {
 	if (unlikely(iocb->aio_buf || iocb->aio_offset || iocb->aio_nbytes ||
 			iocb->aio_rw_flags))
@@ -1711,7 +1712,7 @@ aio_poll_queue_proc(struct file *file, struct wait_queue_head *head,
 	add_wait_queue(head, &pt->iocb->poll.wait);
 }
 
-static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb)
+static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 {
 	struct kioctx *ctx = aiocb->ki_ctx;
 	struct poll_iocb *req = &aiocb->poll;
@@ -1783,27 +1784,23 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, struct iocb *iocb)
 	return 0;
 }
 
-static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
-			 bool compat)
+static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
+			   struct iocb __user *user_iocb, bool compat)
 {
 	struct aio_kiocb *req;
-	struct iocb iocb;
 	ssize_t ret;
 
-	if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb))))
-		return -EFAULT;
-
 	/* enforce forwards compatibility on users */
-	if (unlikely(iocb.aio_reserved2)) {
+	if (unlikely(iocb->aio_reserved2)) {
 		pr_debug("EINVAL: reserve field set\n");
 		return -EINVAL;
 	}
 
 	/* prevent overflows */
 	if (unlikely(
-	    (iocb.aio_buf != (unsigned long)iocb.aio_buf) ||
-	    (iocb.aio_nbytes != (size_t)iocb.aio_nbytes) ||
-	    ((ssize_t)iocb.aio_nbytes < 0)
+	    (iocb->aio_buf != (unsigned long)iocb->aio_buf) ||
+	    (iocb->aio_nbytes != (size_t)iocb->aio_nbytes) ||
+	    ((ssize_t)iocb->aio_nbytes < 0)
 	   )) {
 		pr_debug("EINVAL: overflow check\n");
 		return -EINVAL;
@@ -1817,14 +1814,14 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 	if (unlikely(!req))
 		goto out_put_reqs_available;
 
-	if (iocb.aio_flags & IOCB_FLAG_RESFD) {
+	if (iocb->aio_flags & IOCB_FLAG_RESFD) {
 		/*
 		 * If the IOCB_FLAG_RESFD flag of aio_flags is set, get an
 		 * instance of the file* now. The file descriptor must be
 		 * an eventfd() fd, and will be signaled for each completed
 		 * event using the eventfd_signal() function.
 		 */
-		req->ki_eventfd = eventfd_ctx_fdget((int) iocb.aio_resfd);
+		req->ki_eventfd = eventfd_ctx_fdget((int) iocb->aio_resfd);
 		if (IS_ERR(req->ki_eventfd)) {
 			ret = PTR_ERR(req->ki_eventfd);
 			req->ki_eventfd = NULL;
@@ -1839,32 +1836,32 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 	}
 
 	req->ki_user_iocb = user_iocb;
-	req->ki_user_data = iocb.aio_data;
+	req->ki_user_data = iocb->aio_data;
 
-	switch (iocb.aio_lio_opcode) {
+	switch (iocb->aio_lio_opcode) {
 	case IOCB_CMD_PREAD:
-		ret = aio_read(&req->rw, &iocb, false, compat);
+		ret = aio_read(&req->rw, iocb, false, compat);
 		break;
 	case IOCB_CMD_PWRITE:
-		ret = aio_write(&req->rw, &iocb, false, compat);
+		ret = aio_write(&req->rw, iocb, false, compat);
 		break;
 	case IOCB_CMD_PREADV:
-		ret = aio_read(&req->rw, &iocb, true, compat);
+		ret = aio_read(&req->rw, iocb, true, compat);
 		break;
 	case IOCB_CMD_PWRITEV:
-		ret = aio_write(&req->rw, &iocb, true, compat);
+		ret = aio_write(&req->rw, iocb, true, compat);
 		break;
 	case IOCB_CMD_FSYNC:
-		ret = aio_fsync(&req->fsync, &iocb, false);
+		ret = aio_fsync(&req->fsync, iocb, false);
 		break;
 	case IOCB_CMD_FDSYNC:
-		ret = aio_fsync(&req->fsync, &iocb, true);
+		ret = aio_fsync(&req->fsync, iocb, true);
 		break;
 	case IOCB_CMD_POLL:
-		ret = aio_poll(req, &iocb);
+		ret = aio_poll(req, iocb);
 		break;
 	default:
-		pr_debug("invalid aio operation %d\n", iocb.aio_lio_opcode);
+		pr_debug("invalid aio operation %d\n", iocb->aio_lio_opcode);
 		ret = -EINVAL;
 		break;
 	}
@@ -1886,6 +1883,17 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 	return ret;
 }
 
+static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
+			 bool compat)
+{
+	struct iocb iocb;
+
+	if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb))))
+		return -EFAULT;
+
+	return __io_submit_one(ctx, &iocb, user_iocb, compat);
+}
+
 /*
  * Plugging is meant to work with larger batches of IOs. If we don't
  * have more than the below, then don't bother setting up a plug.
-- 
2.17.1

* [PATCH 14/27] aio: abstract out io_event filler helper
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (12 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 13/27] aio: split out iocb copy from io_submit_one() Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 15/27] aio: add io_setup2() system call Jens Axboe
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index ba5758c854e8..12859ea1cb64 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1057,6 +1057,15 @@ static inline void iocb_put(struct aio_kiocb *iocb)
 	}
 }
 
+static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb,
+			   long res, long res2)
+{
+	ev->obj = (u64)(unsigned long)iocb->ki_user_iocb;
+	ev->data = iocb->ki_user_data;
+	ev->res = res;
+	ev->res2 = res2;
+}
+
 /* aio_complete
  *	Called when the io request on the given iocb is complete.
  */
@@ -1084,10 +1093,7 @@ static void aio_complete(struct aio_kiocb *iocb, long res, long res2)
 	ev_page = kmap_atomic(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
 	event = ev_page + pos % AIO_EVENTS_PER_PAGE;
 
-	event->obj = (u64)(unsigned long)iocb->ki_user_iocb;
-	event->data = iocb->ki_user_data;
-	event->res = res;
-	event->res2 = res2;
+	aio_fill_event(event, iocb, res, res2);
 
 	kunmap_atomic(ev_page);
 	flush_dcache_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
-- 
2.17.1

* [PATCH 15/27] aio: add io_setup2() system call
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (13 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 14/27] aio: abstract out io_event filler helper Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 16/27] aio: add support for having user mapped iocbs Jens Axboe
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

This is just like io_setup(), except it adds a flags argument to let the
caller control/define some of the io_context behavior. Outside of that,
we also pass in an iocb array for future use.
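
For illustration, a minimal invocation might look like the sketch below.
There is no libc wrapper yet, so the raw syscall() use and the
__NR_io_setup2 definition are assumptions:

#include <unistd.h>
#include <sys/syscall.h>
#include <linux/aio_abi.h>

static int setup_ctx(unsigned int nr_events, aio_context_t *ctx)
{
	*ctx = 0;	/* the kernel requires the passed-in context to be zero */

	/* no flags and no iocb array yet, both are used by later patches */
	return syscall(__NR_io_setup2, nr_events, 0, NULL, ctx);
}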

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 arch/x86/entry/syscalls/syscall_64.tbl |  1 +
 fs/aio.c                               | 70 ++++++++++++++++----------
 include/linux/syscalls.h               |  2 +
 include/uapi/asm-generic/unistd.h      |  4 +-
 kernel/sys_ni.c                        |  1 +
 5 files changed, 50 insertions(+), 28 deletions(-)

diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index f0b1709a5ffb..67c357225fb0 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -343,6 +343,7 @@
 332	common	statx			__x64_sys_statx
 333	common	io_pgetevents		__x64_sys_io_pgetevents
 334	common	rseq			__x64_sys_rseq
+335	common	io_setup2		__x64_sys_io_setup2
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/fs/aio.c b/fs/aio.c
index 12859ea1cb64..74831ce2185e 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -94,6 +94,8 @@ struct kioctx {
 
 	unsigned long		user_id;
 
+	unsigned int		flags;
+
 	struct __percpu kioctx_cpu *cpu;
 
 	/*
@@ -680,21 +682,24 @@ static void aio_nr_sub(unsigned nr)
 	spin_unlock(&aio_nr_lock);
 }
 
-/* ioctx_alloc
- *	Allocates and initializes an ioctx.  Returns an ERR_PTR if it failed.
- */
-static struct kioctx *ioctx_alloc(unsigned nr_events)
+static struct kioctx *io_setup_flags(unsigned long ctxid,
+				     unsigned int nr_events, unsigned int flags)
 {
 	struct mm_struct *mm = current->mm;
 	struct kioctx *ctx;
 	int err = -ENOMEM;
-
 	/*
 	 * Store the original nr_events -- what userspace passed to io_setup(),
 	 * for counting against the global limit -- before it changes.
 	 */
 	unsigned int max_reqs = nr_events;
 
+	if (unlikely(ctxid || nr_events == 0)) {
+		pr_debug("EINVAL: ctx %lu nr_events %u\n",
+		         ctxid, nr_events);
+		return ERR_PTR(-EINVAL);
+	}
+
 	/*
 	 * We keep track of the number of available ringbuffer slots, to prevent
 	 * overflow (reqs_available), and we also use percpu counters for this.
@@ -720,6 +725,7 @@ static struct kioctx *ioctx_alloc(unsigned nr_events)
 	if (!ctx)
 		return ERR_PTR(-ENOMEM);
 
+	ctx->flags = flags;
 	ctx->max_reqs = max_reqs;
 
 	spin_lock_init(&ctx->ctx_lock);
@@ -1275,6 +1281,33 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr,
 	return ret;
 }
 
+SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user,
+		iocbs, aio_context_t __user *, ctxp)
+{
+	struct kioctx *ioctx;
+	unsigned long ctx;
+	long ret;
+
+	if (flags)
+		return -EINVAL;
+
+	ret = get_user(ctx, ctxp);
+	if (unlikely(ret))
+		goto out;
+
+	ioctx = io_setup_flags(ctx, nr_events, flags);
+	ret = PTR_ERR(ioctx);
+	if (IS_ERR(ioctx))
+		goto out;
+
+	ret = put_user(ioctx->user_id, ctxp);
+	if (ret)
+		kill_ioctx(current->mm, ioctx, NULL);
+	percpu_ref_put(&ioctx->users);
+out:
+	return ret;
+}
+
 /* sys_io_setup:
  *	Create an aio_context capable of receiving at least nr_events.
  *	ctxp must not point to an aio_context that already exists, and
@@ -1290,7 +1323,7 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr,
  */
 SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp)
 {
-	struct kioctx *ioctx = NULL;
+	struct kioctx *ioctx;
 	unsigned long ctx;
 	long ret;
 
@@ -1298,14 +1331,7 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp)
 	if (unlikely(ret))
 		goto out;
 
-	ret = -EINVAL;
-	if (unlikely(ctx || nr_events == 0)) {
-		pr_debug("EINVAL: ctx %lu nr_events %u\n",
-		         ctx, nr_events);
-		goto out;
-	}
-
-	ioctx = ioctx_alloc(nr_events);
+	ioctx = io_setup_flags(ctx, nr_events, 0);
 	ret = PTR_ERR(ioctx);
 	if (!IS_ERR(ioctx)) {
 		ret = put_user(ioctx->user_id, ctxp);
@@ -1313,7 +1339,6 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp)
 			kill_ioctx(current->mm, ioctx, NULL);
 		percpu_ref_put(&ioctx->users);
 	}
-
 out:
 	return ret;
 }
@@ -1321,7 +1346,7 @@ SYSCALL_DEFINE2(io_setup, unsigned, nr_events, aio_context_t __user *, ctxp)
 #ifdef CONFIG_COMPAT
 COMPAT_SYSCALL_DEFINE2(io_setup, unsigned, nr_events, u32 __user *, ctx32p)
 {
-	struct kioctx *ioctx = NULL;
+	struct kioctx *ioctx;
 	unsigned long ctx;
 	long ret;
 
@@ -1329,23 +1354,14 @@ COMPAT_SYSCALL_DEFINE2(io_setup, unsigned, nr_events, u32 __user *, ctx32p)
 	if (unlikely(ret))
 		goto out;
 
-	ret = -EINVAL;
-	if (unlikely(ctx || nr_events == 0)) {
-		pr_debug("EINVAL: ctx %lu nr_events %u\n",
-		         ctx, nr_events);
-		goto out;
-	}
-
-	ioctx = ioctx_alloc(nr_events);
+	ioctx = io_setup_flags(ctx, nr_events, 0);
 	ret = PTR_ERR(ioctx);
 	if (!IS_ERR(ioctx)) {
-		/* truncating is ok because it's a user address */
-		ret = put_user((u32)ioctx->user_id, ctx32p);
+		ret = put_user(ioctx->user_id, ctx32p);
 		if (ret)
 			kill_ioctx(current->mm, ioctx, NULL);
 		percpu_ref_put(&ioctx->users);
 	}
-
 out:
 	return ret;
 }
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 2ac3d13a915b..b661e78717e6 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -287,6 +287,8 @@ static inline void addr_limit_user_check(void)
  */
 #ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
 asmlinkage long sys_io_setup(unsigned nr_reqs, aio_context_t __user *ctx);
+asmlinkage long sys_io_setup2(unsigned, unsigned, struct iocb __user *,
+				aio_context_t __user *);
 asmlinkage long sys_io_destroy(aio_context_t ctx);
 asmlinkage long sys_io_submit(aio_context_t, long,
 			struct iocb __user * __user *);
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 538546edbfbd..b4527ed373b0 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -738,9 +738,11 @@ __SYSCALL(__NR_statx,     sys_statx)
 __SC_COMP(__NR_io_pgetevents, sys_io_pgetevents, compat_sys_io_pgetevents)
 #define __NR_rseq 293
 __SYSCALL(__NR_rseq, sys_rseq)
+#define __NR_io_setup2 294
+__SYSCALL(__NR_io_setup2, sys_io_setup2)
 
 #undef __NR_syscalls
-#define __NR_syscalls 294
+#define __NR_syscalls 295
 
 /*
  * 32 bit systems traditionally used different
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index df556175be50..17c8b4393669 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -37,6 +37,7 @@ asmlinkage long sys_ni_syscall(void)
  */
 
 COND_SYSCALL(io_setup);
+COND_SYSCALL(io_setup2);
 COND_SYSCALL_COMPAT(io_setup);
 COND_SYSCALL(io_destroy);
 COND_SYSCALL(io_submit);
-- 
2.17.1

* [PATCH 16/27] aio: add support for having user mapped iocbs
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (14 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 15/27] aio: add io_setup2() system call Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 17/27] aio: support for IO polling Jens Axboe
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

For io_submit(), we have to first copy each pointer to an iocb, then
copy the iocb. The latter is 64 bytes in size, and that's a lot of
copying for a single IO.

Add support for setting IOCTX_FLAG_USERIOCB through the new io_setup2()
system call, which allows the iocbs to reside in userspace. If this flag
is used, then io_submit() doesn't take pointers to iocbs anymore, it
takes an index value into the array of iocbs instead. Similarly, for
io_getevents(), the io_event ->obj will be the index, not the pointer to
the iocb.

See the change made to fio to support this feature; it's pretty trivial
to adapt to. For applications, like fio, that previously embedded the
iocb inside an application-private structure, some sort of lookup
table/structure is needed to find the private IO structure from the
index at io_getevents() time.

http://git.kernel.dk/cgit/fio/commit/?id=3c3168e91329c83880c91e5abc28b9d6b940fd95
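
As an illustration (a sketch, not taken from the fio change above),
submitting by index could look roughly like this:

/* with IOCTX_FLAG_USERIOCB, the io_submit() "pointer" carries an index */
static int submit_index(aio_context_t ctx, unsigned long index)
{
	struct iocb *iocbpp[1] = { (struct iocb *) index };

	return syscall(__NR_io_submit, ctx, 1, iocbpp);
}

On the completion side, io_event ->obj then carries the same index back,
which the application can use to look up its private per-IO state.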

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c                     | 111 +++++++++++++++++++++++++++++++----
 include/uapi/linux/aio_abi.h |   2 +
 2 files changed, 101 insertions(+), 12 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 74831ce2185e..380e6fe8c429 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -121,6 +121,9 @@ struct kioctx {
 	struct page		**ring_pages;
 	long			nr_pages;
 
+	struct page		**iocb_pages;
+	long			iocb_nr_pages;
+
 	struct rcu_work		free_rwork;	/* see free_ioctx() */
 
 	/*
@@ -216,6 +219,11 @@ static struct vfsmount *aio_mnt;
 static const struct file_operations aio_ring_fops;
 static const struct address_space_operations aio_ctx_aops;
 
+static const unsigned int iocb_page_shift =
+				ilog2(PAGE_SIZE / sizeof(struct iocb));
+
+static void aio_useriocb_free(struct kioctx *);
+
 static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages)
 {
 	struct file *file;
@@ -572,6 +580,7 @@ static void free_ioctx(struct work_struct *work)
 					  free_rwork);
 	pr_debug("freeing %p\n", ctx);
 
+	aio_useriocb_free(ctx);
 	aio_free_ring(ctx);
 	free_percpu(ctx->cpu);
 	percpu_ref_exit(&ctx->reqs);
@@ -1281,6 +1290,61 @@ static long read_events(struct kioctx *ctx, long min_nr, long nr,
 	return ret;
 }
 
+static struct iocb *aio_iocb_from_index(struct kioctx *ctx, int index)
+{
+	unsigned int page_index;
+	struct iocb *iocb;
+
+	page_index = index >> iocb_page_shift;
+	index &= ((1 << iocb_page_shift) - 1);
+	iocb = page_address(ctx->iocb_pages[page_index]);
+
+	return iocb + index;
+}
+
+static void aio_useriocb_free(struct kioctx *ctx)
+{
+	int i;
+
+	if (!ctx->iocb_nr_pages)
+		return;
+
+	for (i = 0; i < ctx->iocb_nr_pages; i++)
+		put_page(ctx->iocb_pages[i]);
+
+	kfree(ctx->iocb_pages);
+	ctx->iocb_pages = NULL;
+	ctx->iocb_nr_pages = 0;
+}
+
+static int aio_useriocb_map(struct kioctx *ctx, struct iocb __user *iocbs)
+{
+	int nr_pages, ret;
+
+	if ((unsigned long) iocbs & ~PAGE_MASK)
+		return -EINVAL;
+
+	nr_pages = sizeof(struct iocb) * ctx->max_reqs;
+	nr_pages = (nr_pages + PAGE_SIZE - 1) >> PAGE_SHIFT;
+
+	ctx->iocb_pages = kzalloc(nr_pages * sizeof(struct page *), GFP_KERNEL);
+	if (!ctx->iocb_pages)
+		return -ENOMEM;
+
+	down_write(&current->mm->mmap_sem);
+	ret = get_user_pages((unsigned long) iocbs, nr_pages, 0,
+				ctx->iocb_pages, NULL);
+	up_write(&current->mm->mmap_sem);
+
+	if (ret < nr_pages) {
+		kfree(ctx->iocb_pages);
+		return -ENOMEM;
+	}
+
+	ctx->iocb_nr_pages = nr_pages;
+	return 0;
+}
+
 SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user,
 		iocbs, aio_context_t __user *, ctxp)
 {
@@ -1288,7 +1352,7 @@ SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user,
 	unsigned long ctx;
 	long ret;
 
-	if (flags)
+	if (flags & ~IOCTX_FLAG_USERIOCB)
 		return -EINVAL;
 
 	ret = get_user(ctx, ctxp);
@@ -1300,9 +1364,17 @@ SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user,
 	if (IS_ERR(ioctx))
 		goto out;
 
+	if (flags & IOCTX_FLAG_USERIOCB) {
+		ret = aio_useriocb_map(ioctx, iocbs);
+		if (ret)
+			goto err;
+	}
+
 	ret = put_user(ioctx->user_id, ctxp);
-	if (ret)
+	if (ret) {
+err:
 		kill_ioctx(current->mm, ioctx, NULL);
+	}
 	percpu_ref_put(&ioctx->users);
 out:
 	return ret;
@@ -1851,10 +1923,13 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 		}
 	}
 
-	ret = put_user(KIOCB_KEY, &user_iocb->aio_key);
-	if (unlikely(ret)) {
-		pr_debug("EFAULT: aio_key\n");
-		goto out_put_req;
+	/* Don't support cancel on user mapped iocbs */
+	if (!(ctx->flags & IOCTX_FLAG_USERIOCB)) {
+		ret = put_user(KIOCB_KEY, &user_iocb->aio_key);
+		if (unlikely(ret)) {
+			pr_debug("EFAULT: aio_key\n");
+			goto out_put_req;
+		}
 	}
 
 	req->ki_user_iocb = user_iocb;
@@ -1908,12 +1983,22 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 			 bool compat)
 {
-	struct iocb iocb;
+	struct iocb iocb, *iocbp;
 
-	if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb))))
-		return -EFAULT;
+	if (ctx->flags & IOCTX_FLAG_USERIOCB) {
+		unsigned long iocb_index = (unsigned long) user_iocb;
+
+		if (iocb_index >= ctx->max_reqs)
+			return -EINVAL;
+
+		iocbp = aio_iocb_from_index(ctx, iocb_index);
+	} else {
+		if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb))))
+			return -EFAULT;
+		iocbp = &iocb;
+	}
 
-	return __io_submit_one(ctx, &iocb, user_iocb, compat);
+	return __io_submit_one(ctx, iocbp, user_iocb, compat);
 }
 
 /*
@@ -2063,6 +2148,9 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
 	if (unlikely(!ctx))
 		return -EINVAL;
 
+	if (ctx->flags & IOCTX_FLAG_USERIOCB)
+		goto err;
+
 	spin_lock_irq(&ctx->ctx_lock);
 	kiocb = lookup_kiocb(ctx, iocb);
 	if (kiocb) {
@@ -2079,9 +2167,8 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
 		 */
 		ret = -EINPROGRESS;
 	}
-
+err:
 	percpu_ref_put(&ctx->users);
-
 	return ret;
 }
 
diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h
index 8387e0af0f76..814e6606c413 100644
--- a/include/uapi/linux/aio_abi.h
+++ b/include/uapi/linux/aio_abi.h
@@ -106,6 +106,8 @@ struct iocb {
 	__u32	aio_resfd;
 }; /* 64 bytes */
 
+#define IOCTX_FLAG_USERIOCB	(1 << 0)	/* iocbs are user mapped */
+
 #undef IFBIG
 #undef IFLITTLE
 
-- 
2.17.1

* [PATCH 17/27] aio: support for IO polling
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (15 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 16/27] aio: add support for having user mapped iocbs Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 18/27] aio: add submission side request cache Jens Axboe
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

Add polled variants of PREAD/PREADV and PWRITE/PWRITEV. These act
like their non-polled counterparts, except we expect to poll for
completion of them. The polling happens at io_getevents() time, and
works just like non-polled IO.

To setup an io_context for polled IO, the application must call
io_setup2() with IOCTX_FLAG_IOPOLL as one of the flags. It is illegal
to mix and match polled and non-polled IO on an io_context.

Polled IO doesn't support the user mapped completion ring. Events
must be reaped through the io_getevents() system call. For non-irq
driven poll devices, there's no way to support completion reaping
from userspace by just looking at the ring. The application itself
is the one that pulls completion entries.
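
A rough sketch of a userspace reap loop for a polled context follows
(illustrative only; it assumes a context set up with IOCTX_FLAG_IOPOLL
via io_setup2() and plain io_getevents() for reaping):

static int poll_reap(aio_context_t ctx, int nr_inflight)
{
	struct io_event events[32];
	int done = 0;

	while (done < nr_inflight) {
		/* with IOCTX_FLAG_IOPOLL, this actively polls the device(s) */
		int ret = syscall(__NR_io_getevents, ctx, 1, 32, events, NULL);

		if (ret < 0)
			return ret;
		done += ret;
	}

	return done;
}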

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c                     | 381 +++++++++++++++++++++++++++++++----
 include/uapi/linux/aio_abi.h |   3 +
 2 files changed, 348 insertions(+), 36 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 380e6fe8c429..f7a49abc7694 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -143,6 +143,18 @@ struct kioctx {
 		atomic_t	reqs_available;
 	} ____cacheline_aligned_in_smp;
 
+	/* iopoll submission state */
+	struct {
+		spinlock_t poll_lock;
+		struct list_head poll_submitted;
+	} ____cacheline_aligned_in_smp;
+
+	/* iopoll completion state */
+	struct {
+		struct list_head poll_completing;
+		struct mutex getevents_lock;
+	} ____cacheline_aligned_in_smp;
+
 	struct {
 		spinlock_t	ctx_lock;
 		struct list_head active_reqs;	/* used for cancellation */
@@ -195,14 +207,27 @@ struct aio_kiocb {
 	__u64			ki_user_data;	/* user's data for completion */
 
 	struct list_head	ki_list;	/* the aio core uses this
-						 * for cancellation */
+						 * for cancellation, or for
+						 * polled IO */
+
+	unsigned long		ki_flags;
+#define IOCB_POLL_COMPLETED	0
+#define IOCB_POLL_BUSY		1
+
 	refcount_t		ki_refcnt;
 
-	/*
-	 * If the aio_resfd field of the userspace iocb is not zero,
-	 * this is the underlying eventfd context to deliver events to.
-	 */
-	struct eventfd_ctx	*ki_eventfd;
+	union {
+		/*
+		 * If the aio_resfd field of the userspace iocb is not zero,
+		 * this is the underlying eventfd context to deliver events to.
+		 */
+		struct eventfd_ctx	*ki_eventfd;
+
+		/*
+		 * For polled IO, stash completion info here
+		 */
+		struct io_event		ki_ev;
+	};
 };
 
 /*------ sysctl variables----*/
@@ -223,6 +248,7 @@ static const unsigned int iocb_page_shift =
 				ilog2(PAGE_SIZE / sizeof(struct iocb));
 
 static void aio_useriocb_free(struct kioctx *);
+static void aio_iopoll_reap_events(struct kioctx *);
 
 static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages)
 {
@@ -461,11 +487,15 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events)
 	int i;
 	struct file *file;
 
-	/* Compensate for the ring buffer's head/tail overlap entry */
-	nr_events += 2;	/* 1 is required, 2 for good luck */
-
+	/*
+	 * Compensate for the ring buffer's head/tail overlap entry.
+	 * IO polling doesn't require any io event entries
+	 */
 	size = sizeof(struct aio_ring);
-	size += sizeof(struct io_event) * nr_events;
+	if (!(ctx->flags & IOCTX_FLAG_IOPOLL)) {
+		nr_events += 2;	/* 1 is required, 2 for good luck */
+		size += sizeof(struct io_event) * nr_events;
+	}
 
 	nr_pages = PFN_UP(size);
 	if (nr_pages < 0)
@@ -747,6 +777,11 @@ static struct kioctx *io_setup_flags(unsigned long ctxid,
 
 	INIT_LIST_HEAD(&ctx->active_reqs);
 
+	spin_lock_init(&ctx->poll_lock);
+	INIT_LIST_HEAD(&ctx->poll_submitted);
+	INIT_LIST_HEAD(&ctx->poll_completing);
+	mutex_init(&ctx->getevents_lock);
+
 	if (percpu_ref_init(&ctx->users, free_ioctx_users, 0, GFP_KERNEL))
 		goto err;
 
@@ -818,11 +853,15 @@ static int kill_ioctx(struct mm_struct *mm, struct kioctx *ctx,
 {
 	struct kioctx_table *table;
 
+	mutex_lock(&ctx->getevents_lock);
 	spin_lock(&mm->ioctx_lock);
 	if (atomic_xchg(&ctx->dead, 1)) {
 		spin_unlock(&mm->ioctx_lock);
+		mutex_unlock(&ctx->getevents_lock);
 		return -EINVAL;
 	}
+	aio_iopoll_reap_events(ctx);
+	mutex_unlock(&ctx->getevents_lock);
 
 	table = rcu_dereference_raw(mm->ioctx_table);
 	WARN_ON(ctx != rcu_access_pointer(table->table[ctx->id]));
@@ -1029,6 +1068,7 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx)
 		percpu_ref_get(&ctx->reqs);
 		req->ki_ctx = ctx;
 		INIT_LIST_HEAD(&req->ki_list);
+		req->ki_flags = 0;
 		refcount_set(&req->ki_refcnt, 0);
 		req->ki_eventfd = NULL;
 	}
@@ -1072,6 +1112,15 @@ static inline void iocb_put(struct aio_kiocb *iocb)
 	}
 }
 
+static void iocb_put_many(struct kioctx *ctx, void **iocbs, int *nr)
+{
+	if (*nr) {
+		percpu_ref_put_many(&ctx->reqs, *nr);
+		kmem_cache_free_bulk(kiocb_cachep, *nr, iocbs);
+		*nr = 0;
+	}
+}
+
 static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb,
 			   long res, long res2)
 {
@@ -1261,6 +1310,170 @@ static bool aio_read_events(struct kioctx *ctx, long min_nr, long nr,
 	return ret < 0 || *i >= min_nr;
 }
 
+#define AIO_IOPOLL_BATCH	8
+
+/*
+ * Process completed iocb iopoll entries, copying the result to userspace.
+ */
+static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs,
+			    unsigned int *nr_events, long max)
+{
+	void *iocbs[AIO_IOPOLL_BATCH];
+	struct aio_kiocb *iocb, *n;
+	int to_free = 0, ret = 0;
+
+	/* Shouldn't happen... */
+	if (*nr_events >= max)
+		return 0;
+
+	list_for_each_entry_safe(iocb, n, &ctx->poll_completing, ki_list) {
+		if (*nr_events == max)
+			break;
+		if (!test_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags))
+			continue;
+		if (to_free == AIO_IOPOLL_BATCH)
+			iocb_put_many(ctx, iocbs, &to_free);
+
+		list_del(&iocb->ki_list);
+		iocbs[to_free++] = iocb;
+
+		fput(iocb->rw.ki_filp);
+
+		if (evs && copy_to_user(evs + *nr_events, &iocb->ki_ev,
+		    sizeof(iocb->ki_ev))) {
+			ret = -EFAULT;
+			break;
+		}
+		(*nr_events)++;
+	}
+
+	if (to_free)
+		iocb_put_many(ctx, iocbs, &to_free);
+
+	return ret;
+}
+
+static int __aio_iopoll_check(struct kioctx *ctx, struct io_event __user *event,
+			      unsigned int *nr_events, long min, long max)
+{
+	struct aio_kiocb *iocb;
+	int to_poll, polled, ret;
+
+	/*
+	 * Check if we already have done events that satisfy what we need
+	 */
+	if (!list_empty(&ctx->poll_completing)) {
+		ret = aio_iopoll_reap(ctx, event, nr_events, max);
+		if (ret < 0)
+			return ret;
+		if (*nr_events >= min)
+			return 0;
+	}
+
+	/*
+	 * Take in a new working set from the submitted list, if possible.
+	 */
+	if (!list_empty_careful(&ctx->poll_submitted)) {
+		spin_lock(&ctx->poll_lock);
+		list_splice_init(&ctx->poll_submitted, &ctx->poll_completing);
+		spin_unlock(&ctx->poll_lock);
+	}
+
+	if (list_empty(&ctx->poll_completing))
+		return 0;
+
+	/*
+	 * Check again now that we have a new batch.
+	 */
+	ret = aio_iopoll_reap(ctx, event, nr_events, max);
+	if (ret < 0)
+		return ret;
+	if (*nr_events >= min)
+		return 0;
+
+	/*
+	 * Find up to 'max' worth of events to poll for, including the
+	 * events we already successfully polled
+	 */
+	polled = to_poll = 0;
+	list_for_each_entry(iocb, &ctx->poll_completing, ki_list) {
+		/*
+		 * Poll for needed events with spin == true, anything after
+		 * that we just check if we have more, up to max.
+		 */
+		bool spin = polled + *nr_events >= min;
+		struct kiocb *kiocb = &iocb->rw;
+
+		if (test_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags))
+			break;
+		if (++to_poll + *nr_events > max)
+			break;
+
+		ret = kiocb->ki_filp->f_op->iopoll(kiocb, spin);
+		if (ret < 0)
+			return ret;
+
+		polled += ret;
+		if (polled + *nr_events >= max)
+			break;
+	}
+
+	ret = aio_iopoll_reap(ctx, event, nr_events, max);
+	if (ret < 0)
+		return ret;
+	if (*nr_events >= min)
+		return 0;
+	return to_poll;
+}
+
+/*
+ * We can't just wait for polled events to come to us, we have to actively
+ * find and complete them.
+ */
+static void aio_iopoll_reap_events(struct kioctx *ctx)
+{
+	if (!(ctx->flags & IOCTX_FLAG_IOPOLL))
+		return;
+
+	while (!list_empty_careful(&ctx->poll_submitted) ||
+	       !list_empty(&ctx->poll_completing)) {
+		unsigned int nr_events = 0;
+
+		__aio_iopoll_check(ctx, NULL, &nr_events, 1, UINT_MAX);
+	}
+}
+
+static int aio_iopoll_check(struct kioctx *ctx, long min_nr, long nr,
+			    struct io_event __user *event)
+{
+	unsigned int nr_events = 0;
+	int ret = 0;
+
+	/* Only allow one thread polling at a time */
+	if (!mutex_trylock(&ctx->getevents_lock))
+		return -EBUSY;
+	if (unlikely(atomic_read(&ctx->dead))) {
+		ret = -EINVAL;
+		goto err;
+	}
+
+	while (!nr_events || !need_resched()) {
+		int tmin = 0;
+
+		if (nr_events < min_nr)
+			tmin = min_nr - nr_events;
+
+		ret = __aio_iopoll_check(ctx, event, &nr_events, tmin, nr);
+		if (ret <= 0)
+			break;
+		ret = 0;
+	}
+
+err:
+	mutex_unlock(&ctx->getevents_lock);
+	return nr_events ? nr_events : ret;
+}
+
 static long read_events(struct kioctx *ctx, long min_nr, long nr,
 			struct io_event __user *event,
 			ktime_t until)
@@ -1352,7 +1565,7 @@ SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user,
 	unsigned long ctx;
 	long ret;
 
-	if (flags & ~IOCTX_FLAG_USERIOCB)
+	if (flags & ~(IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL))
 		return -EINVAL;
 
 	ret = get_user(ctx, ctxp);
@@ -1485,13 +1698,8 @@ static void aio_remove_iocb(struct aio_kiocb *iocb)
 	spin_unlock_irqrestore(&ctx->ctx_lock, flags);
 }
 
-static void aio_complete_rw(struct kiocb *kiocb, long res, long res2)
+static void kiocb_end_write(struct kiocb *kiocb)
 {
-	struct aio_kiocb *iocb = container_of(kiocb, struct aio_kiocb, rw);
-
-	if (!list_empty_careful(&iocb->ki_list))
-		aio_remove_iocb(iocb);
-
 	if (kiocb->ki_flags & IOCB_WRITE) {
 		struct inode *inode = file_inode(kiocb->ki_filp);
 
@@ -1503,19 +1711,48 @@ static void aio_complete_rw(struct kiocb *kiocb, long res, long res2)
 			__sb_writers_acquired(inode->i_sb, SB_FREEZE_WRITE);
 		file_end_write(kiocb->ki_filp);
 	}
+}
+
+static void aio_complete_rw(struct kiocb *kiocb, long res, long res2)
+{
+	struct aio_kiocb *iocb = container_of(kiocb, struct aio_kiocb, rw);
+
+	if (!list_empty_careful(&iocb->ki_list))
+		aio_remove_iocb(iocb);
+
+	kiocb_end_write(kiocb);
 
 	fput(kiocb->ki_filp);
 	aio_complete(iocb, res, res2);
 }
 
-static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
+static void aio_complete_rw_poll(struct kiocb *kiocb, long res, long res2)
 {
+	struct aio_kiocb *iocb = container_of(kiocb, struct aio_kiocb, rw);
+
+	kiocb_end_write(kiocb);
+
+	/*
+	 * Handle EAGAIN from resource limits with polled IO inline, don't
+	 * pass the event back to userspace.
+	 */
+	if (unlikely(res == -EAGAIN))
+		set_bit(IOCB_POLL_BUSY, &iocb->ki_flags);
+	else {
+		aio_fill_event(&iocb->ki_ev, iocb, res, res2);
+		set_bit(IOCB_POLL_COMPLETED, &iocb->ki_flags);
+	}
+}
+
+static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb)
+{
+	struct kioctx *ctx = kiocb->ki_ctx;
+	struct kiocb *req = &kiocb->rw;
 	int ret;
 
 	req->ki_filp = fget(iocb->aio_fildes);
 	if (unlikely(!req->ki_filp))
 		return -EBADF;
-	req->ki_complete = aio_complete_rw;
 	req->ki_pos = iocb->aio_offset;
 	req->ki_flags = iocb_flags(req->ki_filp);
 	if (iocb->aio_flags & IOCB_FLAG_RESFD)
@@ -1541,9 +1778,35 @@ static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
 	if (unlikely(ret))
 		goto out_fput;
 
-	req->ki_flags &= ~IOCB_HIPRI; /* no one is going to poll for this I/O */
-	return 0;
+	if (iocb->aio_flags & IOCB_FLAG_HIPRI) {
+		/* shares space in the union, and is rather pointless.. */
+		ret = -EINVAL;
+		if (iocb->aio_flags & IOCB_FLAG_RESFD)
+			goto out_fput;
+
+		/* can't submit polled IO to a non-polled ctx */
+		if (!(ctx->flags & IOCTX_FLAG_IOPOLL))
+			goto out_fput;
+
+		ret = -EOPNOTSUPP;
+		if (!(req->ki_flags & IOCB_DIRECT) ||
+		    !req->ki_filp->f_op->iopoll)
+			goto out_fput;
+
+		req->ki_flags |= IOCB_HIPRI;
+		req->ki_complete = aio_complete_rw_poll;
+	} else {
+		/* can't submit non-polled IO to a polled ctx */
+		ret = -EINVAL;
+		if (ctx->flags & IOCTX_FLAG_IOPOLL)
+			goto out_fput;
+
+		/* no one is going to poll for this I/O */
+		req->ki_flags &= ~IOCB_HIPRI;
+		req->ki_complete = aio_complete_rw;
+	}
 
+	return 0;
 out_fput:
 	fput(req->ki_filp);
 	return ret;
@@ -1588,15 +1851,40 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
 	}
 }
 
-static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb,
+/*
+ * After the iocb has been issued, it's safe to be found on the poll list.
+ * Adding the kiocb to the list AFTER submission ensures that we don't
+ * find it from a io_getevents() thread before the issuer is done accessing
+ * the kiocb cookie.
+ */
+static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb)
+{
+	/*
+	 * For fast devices, IO may have already completed. If it has, add
+	 * it to the front so we find it first. We can't add to the poll_done
+	 * list as that's unlocked from the completion side.
+	 */
+	const int front_add = test_bit(IOCB_POLL_COMPLETED, &kiocb->ki_flags);
+	struct kioctx *ctx = kiocb->ki_ctx;
+
+	spin_lock(&ctx->poll_lock);
+	if (front_add)
+		list_add(&kiocb->ki_list, &ctx->poll_submitted);
+	else
+		list_add_tail(&kiocb->ki_list, &ctx->poll_submitted);
+	spin_unlock(&ctx->poll_lock);
+}
+
+static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
 			bool vectored, bool compat)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
+	struct kiocb *req = &kiocb->rw;
 	struct iov_iter iter;
 	struct file *file;
 	ssize_t ret;
 
-	ret = aio_prep_rw(req, iocb);
+	ret = aio_prep_rw(kiocb, iocb);
 	if (ret)
 		return ret;
 	file = req->ki_filp;
@@ -1621,15 +1909,16 @@ static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb,
 	return ret;
 }
 
-static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb,
+static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb,
 			 bool vectored, bool compat)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
+	struct kiocb *req = &kiocb->rw;
 	struct iov_iter iter;
 	struct file *file;
 	ssize_t ret;
 
-	ret = aio_prep_rw(req, iocb);
+	ret = aio_prep_rw(kiocb, iocb);
 	if (ret)
 		return ret;
 	file = req->ki_filp;
@@ -1900,7 +2189,8 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 		return -EINVAL;
 	}
 
-	if (!get_reqs_available(ctx))
+	/* Poll IO doesn't need ring reservations */
+	if (!(ctx->flags & IOCTX_FLAG_IOPOLL) && !get_reqs_available(ctx))
 		return -EAGAIN;
 
 	ret = -EAGAIN;
@@ -1923,8 +2213,8 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 		}
 	}
 
-	/* Don't support cancel on user mapped iocbs */
-	if (!(ctx->flags & IOCTX_FLAG_USERIOCB)) {
+	/* Don't support cancel on user mapped iocbs or polled context */
+	if (!(ctx->flags & (IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL))) {
 		ret = put_user(KIOCB_KEY, &user_iocb->aio_key);
 		if (unlikely(ret)) {
 			pr_debug("EFAULT: aio_key\n");
@@ -1935,26 +2225,33 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 	req->ki_user_iocb = user_iocb;
 	req->ki_user_data = iocb->aio_data;
 
+	ret = -EINVAL;
 	switch (iocb->aio_lio_opcode) {
 	case IOCB_CMD_PREAD:
-		ret = aio_read(&req->rw, iocb, false, compat);
+		ret = aio_read(req, iocb, false, compat);
 		break;
 	case IOCB_CMD_PWRITE:
-		ret = aio_write(&req->rw, iocb, false, compat);
+		ret = aio_write(req, iocb, false, compat);
 		break;
 	case IOCB_CMD_PREADV:
-		ret = aio_read(&req->rw, iocb, true, compat);
+		ret = aio_read(req, iocb, true, compat);
 		break;
 	case IOCB_CMD_PWRITEV:
-		ret = aio_write(&req->rw, iocb, true, compat);
+		ret = aio_write(req, iocb, true, compat);
 		break;
 	case IOCB_CMD_FSYNC:
+		if (ctx->flags & IOCTX_FLAG_IOPOLL)
+			break;
 		ret = aio_fsync(&req->fsync, iocb, false);
 		break;
 	case IOCB_CMD_FDSYNC:
+		if (ctx->flags & IOCTX_FLAG_IOPOLL)
+			break;
 		ret = aio_fsync(&req->fsync, iocb, true);
 		break;
 	case IOCB_CMD_POLL:
+		if (ctx->flags & IOCTX_FLAG_IOPOLL)
+			break;
 		ret = aio_poll(req, iocb);
 		break;
 	default:
@@ -1970,13 +2267,21 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 	 */
 	if (ret)
 		goto out_put_req;
+	if (ctx->flags & IOCTX_FLAG_IOPOLL) {
+		if (test_bit(IOCB_POLL_BUSY, &req->ki_flags)) {
+			ret = -EAGAIN;
+			goto out_put_req;
+		}
+		aio_iopoll_iocb_issued(req);
+	}
 	return 0;
 out_put_req:
 	if (req->ki_eventfd)
 		eventfd_ctx_put(req->ki_eventfd);
 	iocb_put(req);
 out_put_reqs_available:
-	put_reqs_available(ctx, 1);
+	if (!(ctx->flags & IOCTX_FLAG_IOPOLL))
+		put_reqs_available(ctx, 1);
 	return ret;
 }
 
@@ -2148,7 +2453,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
 	if (unlikely(!ctx))
 		return -EINVAL;
 
-	if (ctx->flags & IOCTX_FLAG_USERIOCB)
+	if (ctx->flags & (IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL))
 		goto err;
 
 	spin_lock_irq(&ctx->ctx_lock);
@@ -2183,8 +2488,12 @@ static long do_io_getevents(aio_context_t ctx_id,
 	long ret = -EINVAL;
 
 	if (likely(ioctx)) {
-		if (likely(min_nr <= nr && min_nr >= 0))
-			ret = read_events(ioctx, min_nr, nr, events, until);
+		if (likely(min_nr <= nr && min_nr >= 0)) {
+			if (ioctx->flags & IOCTX_FLAG_IOPOLL)
+				ret = aio_iopoll_check(ioctx, min_nr, nr, events);
+			else
+				ret = read_events(ioctx, min_nr, nr, events, until);
+		}
 		percpu_ref_put(&ioctx->users);
 	}
 
diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h
index 814e6606c413..ea0b9a19f4df 100644
--- a/include/uapi/linux/aio_abi.h
+++ b/include/uapi/linux/aio_abi.h
@@ -52,9 +52,11 @@ enum {
  *                   is valid.
  * IOCB_FLAG_IOPRIO - Set if the "aio_reqprio" member of the "struct iocb"
  *                    is valid.
+ * IOCB_FLAG_HIPRI - Use IO completion polling
  */
 #define IOCB_FLAG_RESFD		(1 << 0)
 #define IOCB_FLAG_IOPRIO	(1 << 1)
+#define IOCB_FLAG_HIPRI		(1 << 2)
 
 /* read() from /dev/aio returns these structures. */
 struct io_event {
@@ -107,6 +109,7 @@ struct iocb {
 }; /* 64 bytes */
 
 #define IOCTX_FLAG_USERIOCB	(1 << 0)	/* iocbs are user mapped */
+#define IOCTX_FLAG_IOPOLL	(1 << 1)	/* io_context is polled */
 
 #undef IFBIG
 #undef IFLITTLE
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 18/27] aio: add submission side request cache
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (16 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 17/27] aio: support for IO polling Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 19/27] fs: add fget_many() and fput_many() Jens Axboe
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

We have to add each submitted polled request to the io_context
poll_submitted list, which means grabbing the poll_lock for every
request. We already use the block plug to batch IO submissions; extend
that to cover the poll requests internally as well, so the lock is
only taken once per batch.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 136 +++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 113 insertions(+), 23 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index f7a49abc7694..182e2fc6ec82 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -230,6 +230,21 @@ struct aio_kiocb {
 	};
 };
 
+struct aio_submit_state {
+	struct kioctx *ctx;
+
+	struct blk_plug plug;
+#ifdef CONFIG_BLOCK
+	struct blk_plug_cb plug_cb;
+#endif
+
+	/*
+	 * Polled iocbs that have been submitted, but not added to the ctx yet
+	 */
+	struct list_head req_list;
+	unsigned int req_count;
+};
+
 /*------ sysctl variables----*/
 static DEFINE_SPINLOCK(aio_nr_lock);
 unsigned long aio_nr;		/* current system wide number of aio requests */
@@ -247,6 +262,15 @@ static const struct address_space_operations aio_ctx_aops;
 static const unsigned int iocb_page_shift =
 				ilog2(PAGE_SIZE / sizeof(struct iocb));
 
+/*
+ * We rely on block level unplugs to flush pending requests, if we schedule
+ */
+#ifdef CONFIG_BLOCK
+static const bool aio_use_state_req_list = true;
+#else
+static const bool aio_use_state_req_list = false;
+#endif
+
 static void aio_useriocb_free(struct kioctx *);
 static void aio_iopoll_reap_events(struct kioctx *);
 
@@ -1851,13 +1875,28 @@ static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
 	}
 }
 
+/*
+ * Called either at the end of IO submission, or through a plug callback
+ * because we're going to schedule. Moves out local batch of requests to
+ * the ctx poll list, so they can be found for polling + reaping.
+ */
+static void aio_flush_state_reqs(struct kioctx *ctx,
+				 struct aio_submit_state *state)
+{
+	spin_lock(&ctx->poll_lock);
+	list_splice_tail_init(&state->req_list, &ctx->poll_submitted);
+	spin_unlock(&ctx->poll_lock);
+	state->req_count = 0;
+}
+
 /*
  * After the iocb has been issued, it's safe to be found on the poll list.
  * Adding the kiocb to the list AFTER submission ensures that we don't
  * find it from a io_getevents() thread before the issuer is done accessing
  * the kiocb cookie.
  */
-static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb)
+static void aio_iopoll_iocb_issued(struct aio_submit_state *state,
+				   struct aio_kiocb *kiocb)
 {
 	/*
 	 * For fast devices, IO may have already completed. If it has, add
@@ -1867,12 +1906,21 @@ static void aio_iopoll_iocb_issued(struct aio_kiocb *kiocb)
 	const int front_add = test_bit(IOCB_POLL_COMPLETED, &kiocb->ki_flags);
 	struct kioctx *ctx = kiocb->ki_ctx;
 
-	spin_lock(&ctx->poll_lock);
-	if (front_add)
-		list_add(&kiocb->ki_list, &ctx->poll_submitted);
-	else
-		list_add_tail(&kiocb->ki_list, &ctx->poll_submitted);
-	spin_unlock(&ctx->poll_lock);
+	if (!state || !aio_use_state_req_list) {
+		spin_lock(&ctx->poll_lock);
+		if (front_add)
+			list_add(&kiocb->ki_list, &ctx->poll_submitted);
+		else
+			list_add_tail(&kiocb->ki_list, &ctx->poll_submitted);
+		spin_unlock(&ctx->poll_lock);
+	} else {
+		if (front_add)
+			list_add(&kiocb->ki_list, &state->req_list);
+		else
+			list_add_tail(&kiocb->ki_list, &state->req_list);
+		if (++state->req_count >= AIO_IOPOLL_BATCH)
+			aio_flush_state_reqs(ctx, state);
+	}
 }
 
 static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
@@ -2168,7 +2216,8 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 }
 
 static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
-			   struct iocb __user *user_iocb, bool compat)
+			   struct iocb __user *user_iocb,
+			   struct aio_submit_state *state, bool compat)
 {
 	struct aio_kiocb *req;
 	ssize_t ret;
@@ -2272,7 +2321,7 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 			ret = -EAGAIN;
 			goto out_put_req;
 		}
-		aio_iopoll_iocb_issued(req);
+		aio_iopoll_iocb_issued(state, req);
 	}
 	return 0;
 out_put_req:
@@ -2286,7 +2335,7 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 }
 
 static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
-			 bool compat)
+			 struct aio_submit_state *state, bool compat)
 {
 	struct iocb iocb, *iocbp;
 
@@ -2303,7 +2352,44 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		iocbp = &iocb;
 	}
 
-	return __io_submit_one(ctx, iocbp, user_iocb, compat);
+	return __io_submit_one(ctx, iocbp, user_iocb, state, compat);
+}
+
+#ifdef CONFIG_BLOCK
+static void aio_state_unplug(struct blk_plug_cb *cb, bool from_schedule)
+{
+	struct aio_submit_state *state;
+
+	state = container_of(cb, struct aio_submit_state, plug_cb);
+	if (!list_empty(&state->req_list))
+		aio_flush_state_reqs(state->ctx, state);
+}
+#endif
+
+/*
+ * Batched submission is done, ensure local IO is flushed out.
+ */
+static void aio_submit_state_end(struct aio_submit_state *state)
+{
+	blk_finish_plug(&state->plug);
+	if (!list_empty(&state->req_list))
+		aio_flush_state_reqs(state->ctx, state);
+}
+
+/*
+ * Start submission side cache.
+ */
+static void aio_submit_state_start(struct aio_submit_state *state,
+				   struct kioctx *ctx)
+{
+	state->ctx = ctx;
+	INIT_LIST_HEAD(&state->req_list);
+	state->req_count = 0;
+#ifdef CONFIG_BLOCK
+	state->plug_cb.callback = aio_state_unplug;
+	blk_start_plug(&state->plug);
+	list_add(&state->plug_cb.list, &state->plug.cb_list);
+#endif
 }
 
 /*
@@ -2327,10 +2413,10 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 		struct iocb __user * __user *, iocbpp)
 {
+	struct aio_submit_state state, *statep = NULL;
 	struct kioctx *ctx;
 	long ret = 0;
 	int i = 0;
-	struct blk_plug plug;
 
 	if (unlikely(nr < 0))
 		return -EINVAL;
@@ -2344,8 +2430,10 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 	if (nr > ctx->nr_events)
 		nr = ctx->nr_events;
 
-	if (nr > AIO_PLUG_THRESHOLD)
-		blk_start_plug(&plug);
+	if (nr > AIO_PLUG_THRESHOLD) {
+		aio_submit_state_start(&state, ctx);
+		statep = &state;
+	}
 	for (i = 0; i < nr; i++) {
 		struct iocb __user *user_iocb;
 
@@ -2354,12 +2442,12 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 			break;
 		}
 
-		ret = io_submit_one(ctx, user_iocb, false);
+		ret = io_submit_one(ctx, user_iocb, statep, false);
 		if (ret)
 			break;
 	}
-	if (nr > AIO_PLUG_THRESHOLD)
-		blk_finish_plug(&plug);
+	if (statep)
+		aio_submit_state_end(statep);
 
 	percpu_ref_put(&ctx->users);
 	return i ? i : ret;
@@ -2369,10 +2457,10 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
 		       int, nr, compat_uptr_t __user *, iocbpp)
 {
+	struct aio_submit_state state, *statep = NULL;
 	struct kioctx *ctx;
 	long ret = 0;
 	int i = 0;
-	struct blk_plug plug;
 
 	if (unlikely(nr < 0))
 		return -EINVAL;
@@ -2386,8 +2474,10 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
 	if (nr > ctx->nr_events)
 		nr = ctx->nr_events;
 
-	if (nr > AIO_PLUG_THRESHOLD)
-		blk_start_plug(&plug);
+	if (nr > AIO_PLUG_THRESHOLD) {
+		aio_submit_state_start(&state, ctx);
+		statep = &state;
+	}
 	for (i = 0; i < nr; i++) {
 		compat_uptr_t user_iocb;
 
@@ -2396,12 +2486,12 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
 			break;
 		}
 
-		ret = io_submit_one(ctx, compat_ptr(user_iocb), true);
+		ret = io_submit_one(ctx, compat_ptr(user_iocb), statep, true);
 		if (ret)
 			break;
 	}
-	if (nr > AIO_PLUG_THRESHOLD)
-		blk_finish_plug(&plug);
+	if (statep)
+		aio_submit_state_end(statep);
 
 	percpu_ref_put(&ctx->users);
 	return i ? i : ret;
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 19/27] fs: add fget_many() and fput_many()
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (17 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 18/27] aio: add submission side request cache Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 20/27] aio: use fget/fput_many() for file references Jens Axboe
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

Some use cases repeatedly get and put references to the same file, but
the only exposed interface does this one at a time. As each of
these entails an atomic inc or dec on a shared structure, that cost can
add up.

Add fget_many(), which works just like fget(), except it takes an
argument for how many references to get on the file. Ditto fput_many(),
which can drop an arbitrary number of references to a file.
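
A minimal, hypothetical kernel-side sketch of the intended usage --
take one reference per queued IO up front, then hand back whatever
went unused in a single call:

	/*
	 * Illustration only, not part of this patch.  Each successfully
	 * submitted IO consumes one of the references and drops it itself
	 * at completion time via fput().
	 */
	static int submit_batch(unsigned int fd, unsigned int nr)
	{
		struct file *file = fget_many(fd, nr);
		unsigned int used = 0;

		if (!file)
			return -EBADF;

		/* ... submit IOs against 'file', bumping 'used' ... */

		if (used < nr)
			fput_many(file, nr - used);
		return 0;
	}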

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/file.c            | 15 ++++++++++-----
 fs/file_table.c      | 10 ++++++++--
 include/linux/file.h |  2 ++
 include/linux/fs.h   |  3 ++-
 4 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index 7ffd6e9d103d..ad9870edfd51 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -676,7 +676,7 @@ void do_close_on_exec(struct files_struct *files)
 	spin_unlock(&files->file_lock);
 }
 
-static struct file *__fget(unsigned int fd, fmode_t mask)
+static struct file *__fget(unsigned int fd, fmode_t mask, unsigned int refs)
 {
 	struct files_struct *files = current->files;
 	struct file *file;
@@ -691,7 +691,7 @@ static struct file *__fget(unsigned int fd, fmode_t mask)
 		 */
 		if (file->f_mode & mask)
 			file = NULL;
-		else if (!get_file_rcu(file))
+		else if (!get_file_rcu_many(file, refs))
 			goto loop;
 	}
 	rcu_read_unlock();
@@ -699,15 +699,20 @@ static struct file *__fget(unsigned int fd, fmode_t mask)
 	return file;
 }
 
+struct file *fget_many(unsigned int fd, unsigned int refs)
+{
+	return __fget(fd, FMODE_PATH, refs);
+}
+
 struct file *fget(unsigned int fd)
 {
-	return __fget(fd, FMODE_PATH);
+	return fget_many(fd, 1);
 }
 EXPORT_SYMBOL(fget);
 
 struct file *fget_raw(unsigned int fd)
 {
-	return __fget(fd, 0);
+	return __fget(fd, 0, 1);
 }
 EXPORT_SYMBOL(fget_raw);
 
@@ -738,7 +743,7 @@ static unsigned long __fget_light(unsigned int fd, fmode_t mask)
 			return 0;
 		return (unsigned long)file;
 	} else {
-		file = __fget(fd, mask);
+		file = __fget(fd, mask, 1);
 		if (!file)
 			return 0;
 		return FDPUT_FPUT | (unsigned long)file;
diff --git a/fs/file_table.c b/fs/file_table.c
index e49af4caf15d..6a3964df33e4 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -326,9 +326,9 @@ void flush_delayed_fput(void)
 
 static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput);
 
-void fput(struct file *file)
+void fput_many(struct file *file, unsigned int refs)
 {
-	if (atomic_long_dec_and_test(&file->f_count)) {
+	if (atomic_long_sub_and_test(refs, &file->f_count)) {
 		struct task_struct *task = current;
 
 		if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
@@ -347,6 +347,12 @@ void fput(struct file *file)
 	}
 }
 
+void fput(struct file *file)
+{
+	fput_many(file, 1);
+}
+
+
 /*
  * synchronous analog of fput(); for kernel threads that might be needed
  * in some umount() (and thus can't use flush_delayed_fput() without
diff --git a/include/linux/file.h b/include/linux/file.h
index 6b2fb032416c..3fcddff56bc4 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -13,6 +13,7 @@
 struct file;
 
 extern void fput(struct file *);
+extern void fput_many(struct file *, unsigned int);
 
 struct file_operations;
 struct vfsmount;
@@ -44,6 +45,7 @@ static inline void fdput(struct fd fd)
 }
 
 extern struct file *fget(unsigned int fd);
+extern struct file *fget_many(unsigned int fd, unsigned int refs);
 extern struct file *fget_raw(unsigned int fd);
 extern unsigned long __fdget(unsigned int fd);
 extern unsigned long __fdget_raw(unsigned int fd);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6a5f71f8ae06..dc54a65c401a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -952,7 +952,8 @@ static inline struct file *get_file(struct file *f)
 	atomic_long_inc(&f->f_count);
 	return f;
 }
-#define get_file_rcu(x) atomic_long_inc_not_zero(&(x)->f_count)
+#define get_file_rcu_many(x, cnt) atomic_long_add_unless(&(x)->f_count, (cnt), 0)
+#define get_file_rcu(x) get_file_rcu_many((x), 1)
 #define fput_atomic(x)	atomic_long_add_unless(&(x)->f_count, -1, 1)
 #define file_count(x)	atomic_long_read(&(x)->f_count)
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 20/27] aio: use fget/fput_many() for file references
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (18 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 19/27] fs: add fget_many() and fput_many() Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 21/27] aio: split iocb init from allocation Jens Axboe
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

On the submission side, add file reference batching to the
aio_submit_state. We get as many references as the number of iocbs we
are submitting, and drop unused ones if we end up switching files. The
assumption here is that we're usually only dealing with one fd, and if
there are multiple, hopefully they are at least somewhat ordered. This
could trivially be extended to cover multiple fds, if needed.

On the completion side we do the same thing, except this is trivially
done just locally in aio_iopoll_reap().

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 91 insertions(+), 15 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 182e2fc6ec82..291bbc62b2a8 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -243,6 +243,15 @@ struct aio_submit_state {
 	 */
 	struct list_head req_list;
 	unsigned int req_count;
+
+	/*
+	 * File reference cache
+	 */
+	struct file *file;
+	unsigned int fd;
+	unsigned int has_refs;
+	unsigned int used_refs;
+	unsigned int ios_left;
 };
 
 /*------ sysctl variables----*/
@@ -1344,7 +1353,8 @@ static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs,
 {
 	void *iocbs[AIO_IOPOLL_BATCH];
 	struct aio_kiocb *iocb, *n;
-	int to_free = 0, ret = 0;
+	int file_count, to_free = 0, ret = 0;
+	struct file *file = NULL;
 
 	/* Shouldn't happen... */
 	if (*nr_events >= max)
@@ -1361,7 +1371,20 @@ static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs,
 		list_del(&iocb->ki_list);
 		iocbs[to_free++] = iocb;
 
-		fput(iocb->rw.ki_filp);
+		/*
+		 * Batched puts of the same file, to avoid dirtying the
+		 * file usage count multiple times, if avoidable.
+		 */
+		if (!file) {
+			file = iocb->rw.ki_filp;
+			file_count = 1;
+		} else if (file == iocb->rw.ki_filp) {
+			file_count++;
+		} else {
+			fput_many(file, file_count);
+			file = iocb->rw.ki_filp;
+			file_count = 1;
+		}
 
 		if (evs && copy_to_user(evs + *nr_events, &iocb->ki_ev,
 		    sizeof(iocb->ki_ev))) {
@@ -1371,6 +1394,9 @@ static long aio_iopoll_reap(struct kioctx *ctx, struct io_event __user *evs,
 		(*nr_events)++;
 	}
 
+	if (file)
+		fput_many(file, file_count);
+
 	if (to_free)
 		iocb_put_many(ctx, iocbs, &to_free);
 
@@ -1768,13 +1794,58 @@ static void aio_complete_rw_poll(struct kiocb *kiocb, long res, long res2)
 	}
 }
 
-static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb)
+static void aio_file_put(struct aio_submit_state *state)
+{
+	if (state->file) {
+		int diff = state->has_refs - state->used_refs;
+
+		if (diff)
+			fput_many(state->file, diff);
+		state->file = NULL;
+	}
+}
+
+/*
+ * Get as many references to a file as we have IOs left in this submission,
+ * assuming most submissions are for one file, or at least that each file
+ * has more than one submission.
+ */
+static struct file *aio_file_get(struct aio_submit_state *state, int fd)
+{
+	if (!state)
+		return fget(fd);
+
+	if (!state->file) {
+get_file:
+		state->file = fget_many(fd, state->ios_left);
+		if (!state->file)
+			return NULL;
+
+		state->fd = fd;
+		state->has_refs = state->ios_left;
+		state->used_refs = 1;
+		state->ios_left--;
+		return state->file;
+	}
+
+	if (state->fd == fd) {
+		state->used_refs++;
+		state->ios_left--;
+		return state->file;
+	}
+
+	aio_file_put(state);
+	goto get_file;
+}
+
+static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb,
+		       struct aio_submit_state *state)
 {
 	struct kioctx *ctx = kiocb->ki_ctx;
 	struct kiocb *req = &kiocb->rw;
 	int ret;
 
-	req->ki_filp = fget(iocb->aio_fildes);
+	req->ki_filp = aio_file_get(state, iocb->aio_fildes);
 	if (unlikely(!req->ki_filp))
 		return -EBADF;
 	req->ki_pos = iocb->aio_offset;
@@ -1924,7 +1995,8 @@ static void aio_iopoll_iocb_issued(struct aio_submit_state *state,
 }
 
 static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
-			bool vectored, bool compat)
+			struct aio_submit_state *state, bool vectored,
+			bool compat)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
 	struct kiocb *req = &kiocb->rw;
@@ -1932,7 +2004,7 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
 	struct file *file;
 	ssize_t ret;
 
-	ret = aio_prep_rw(kiocb, iocb);
+	ret = aio_prep_rw(kiocb, iocb, state);
 	if (ret)
 		return ret;
 	file = req->ki_filp;
@@ -1958,7 +2030,8 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
 }
 
 static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb,
-			 bool vectored, bool compat)
+			 struct aio_submit_state *state, bool vectored,
+			 bool compat)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
 	struct kiocb *req = &kiocb->rw;
@@ -1966,7 +2039,7 @@ static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb,
 	struct file *file;
 	ssize_t ret;
 
-	ret = aio_prep_rw(kiocb, iocb);
+	ret = aio_prep_rw(kiocb, iocb, state);
 	if (ret)
 		return ret;
 	file = req->ki_filp;
@@ -2277,16 +2350,16 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 	ret = -EINVAL;
 	switch (iocb->aio_lio_opcode) {
 	case IOCB_CMD_PREAD:
-		ret = aio_read(req, iocb, false, compat);
+		ret = aio_read(req, iocb, state, false, compat);
 		break;
 	case IOCB_CMD_PWRITE:
-		ret = aio_write(req, iocb, false, compat);
+		ret = aio_write(req, iocb, state, false, compat);
 		break;
 	case IOCB_CMD_PREADV:
-		ret = aio_read(req, iocb, true, compat);
+		ret = aio_read(req, iocb, state, true, compat);
 		break;
 	case IOCB_CMD_PWRITEV:
-		ret = aio_write(req, iocb, true, compat);
+		ret = aio_write(req, iocb, state, true, compat);
 		break;
 	case IOCB_CMD_FSYNC:
 		if (ctx->flags & IOCTX_FLAG_IOPOLL)
@@ -2374,17 +2447,20 @@ static void aio_submit_state_end(struct aio_submit_state *state)
 	blk_finish_plug(&state->plug);
 	if (!list_empty(&state->req_list))
 		aio_flush_state_reqs(state->ctx, state);
+	aio_file_put(state);
 }
 
 /*
  * Start submission side cache.
  */
 static void aio_submit_state_start(struct aio_submit_state *state,
-				   struct kioctx *ctx)
+				   struct kioctx *ctx, int max_ios)
 {
 	state->ctx = ctx;
 	INIT_LIST_HEAD(&state->req_list);
 	state->req_count = 0;
+	state->file = NULL;
+	state->ios_left = max_ios;
 #ifdef CONFIG_BLOCK
 	state->plug_cb.callback = aio_state_unplug;
 	blk_start_plug(&state->plug);
@@ -2431,7 +2507,7 @@ SYSCALL_DEFINE3(io_submit, aio_context_t, ctx_id, long, nr,
 		nr = ctx->nr_events;
 
 	if (nr > AIO_PLUG_THRESHOLD) {
-		aio_submit_state_start(&state, ctx);
+		aio_submit_state_start(&state, ctx, nr);
 		statep = &state;
 	}
 	for (i = 0; i < nr; i++) {
@@ -2475,7 +2551,7 @@ COMPAT_SYSCALL_DEFINE3(io_submit, compat_aio_context_t, ctx_id,
 		nr = ctx->nr_events;
 
 	if (nr > AIO_PLUG_THRESHOLD) {
-		aio_submit_state_start(&state, ctx);
+		aio_submit_state_start(&state, ctx, nr);
 		statep = &state;
 	}
 	for (i = 0; i < nr; i++) {
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 21/27] aio: split iocb init from allocation
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (19 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 20/27] aio: use fget/fput_many() for file references Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 22/27] aio: batch aio_kiocb allocation Jens Axboe
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 291bbc62b2a8..341eb1b19319 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1088,6 +1088,16 @@ static bool get_reqs_available(struct kioctx *ctx)
 	return __get_reqs_available(ctx);
 }
 
+static void aio_iocb_init(struct kioctx *ctx, struct aio_kiocb *req)
+{
+	percpu_ref_get(&ctx->reqs);
+	req->ki_ctx = ctx;
+	INIT_LIST_HEAD(&req->ki_list);
+	req->ki_flags = 0;
+	refcount_set(&req->ki_refcnt, 0);
+	req->ki_eventfd = NULL;
+}
+
 /* aio_get_req
  *	Allocate a slot for an aio request.
  * Returns NULL if no requests are free.
@@ -1097,14 +1107,8 @@ static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx)
 	struct aio_kiocb *req;
 
 	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL);
-	if (req) {
-		percpu_ref_get(&ctx->reqs);
-		req->ki_ctx = ctx;
-		INIT_LIST_HEAD(&req->ki_list);
-		req->ki_flags = 0;
-		refcount_set(&req->ki_refcnt, 0);
-		req->ki_eventfd = NULL;
-	}
+	if (req)
+		aio_iocb_init(ctx, req);
 
 	return req;
 }
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 22/27] aio: batch aio_kiocb allocation
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (20 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 21/27] aio: split iocb init from allocation Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 23/27] block: add BIO_HOLD_PAGES flag Jens Axboe
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

Similar to how we use state->ios_left to know how many references
to get to a file, we can use it to allocate the aio_kiocbs we need in
bulk.
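
For reference, kmem_cache_alloc_bulk() returns the number of objects it
actually managed to allocate, which may be fewer than requested (or 0),
so the caller has to work off the returned count.  A tiny hypothetical
sketch of the pattern:

	/*
	 * Illustration only: grab up to 'want' iocbs in one slab call.
	 * Leftovers can later be handed back with kmem_cache_free_bulk().
	 */
	static int alloc_iocb_batch(void **objs, unsigned int want)
	{
		int nr = kmem_cache_alloc_bulk(kiocb_cachep, GFP_KERNEL,
						want, objs);

		return nr ? nr : -ENOMEM;
	}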

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c | 42 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 37 insertions(+), 5 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 341eb1b19319..426939f1dae9 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -230,6 +230,8 @@ struct aio_kiocb {
 	};
 };
 
+#define AIO_IOPOLL_BATCH	8
+
 struct aio_submit_state {
 	struct kioctx *ctx;
 
@@ -244,6 +246,13 @@ struct aio_submit_state {
 	struct list_head req_list;
 	unsigned int req_count;
 
+	/*
+	 * aio_kiocb alloc cache
+	 */
+	void *iocbs[AIO_IOPOLL_BATCH];
+	unsigned int free_iocbs;
+	unsigned int cur_iocb;
+
 	/*
 	 * File reference cache
 	 */
@@ -1102,11 +1111,32 @@ static void aio_iocb_init(struct kioctx *ctx, struct aio_kiocb *req)
  *	Allocate a slot for an aio request.
  * Returns NULL if no requests are free.
  */
-static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx)
+static struct aio_kiocb *aio_get_req(struct kioctx *ctx,
+				     struct aio_submit_state *state)
 {
 	struct aio_kiocb *req;
 
-	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL);
+	if (!state)
+		req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL);
+	else if (!state->free_iocbs) {
+		size_t size;
+
+		size = min_t(size_t, state->ios_left, ARRAY_SIZE(state->iocbs));
+		size = kmem_cache_alloc_bulk(kiocb_cachep, GFP_KERNEL, size,
+						state->iocbs);
+		if (size < 0)
+			return ERR_PTR(size);
+		else if (!size)
+			return ERR_PTR(-ENOMEM);
+		state->free_iocbs = size - 1;
+		state->cur_iocb = 1;
+		req = state->iocbs[0];
+	} else {
+		req = state->iocbs[state->cur_iocb];
+		state->free_iocbs--;
+		state->cur_iocb++;
+	}
+
 	if (req)
 		aio_iocb_init(ctx, req);
 
@@ -1347,8 +1377,6 @@ static bool aio_read_events(struct kioctx *ctx, long min_nr, long nr,
 	return ret < 0 || *i >= min_nr;
 }
 
-#define AIO_IOPOLL_BATCH	8
-
 /*
  * Process completed iocb iopoll entries, copying the result to userspace.
  */
@@ -2320,7 +2348,7 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 		return -EAGAIN;
 
 	ret = -EAGAIN;
-	req = aio_get_req(ctx);
+	req = aio_get_req(ctx, state);
 	if (unlikely(!req))
 		goto out_put_reqs_available;
 
@@ -2452,6 +2480,9 @@ static void aio_submit_state_end(struct aio_submit_state *state)
 	if (!list_empty(&state->req_list))
 		aio_flush_state_reqs(state->ctx, state);
 	aio_file_put(state);
+	if (state->free_iocbs)
+		kmem_cache_free_bulk(kiocb_cachep, state->free_iocbs,
+					&state->iocbs[state->cur_iocb]);
 }
 
 /*
@@ -2463,6 +2494,7 @@ static void aio_submit_state_start(struct aio_submit_state *state,
 	state->ctx = ctx;
 	INIT_LIST_HEAD(&state->req_list);
 	state->req_count = 0;
+	state->free_iocbs = 0;
 	state->file = NULL;
 	state->ios_left = max_ios;
 #ifdef CONFIG_BLOCK
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 23/27] block: add BIO_HOLD_PAGES flag
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (21 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 22/27] aio: batch aio_kiocb allocation Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio Jens Axboe
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

For user mapped IO, we do get_user_pages() upfront, and then do a
put_page() on each page at end_io time to release the page reference. In
preparation for having permanently mapped pages, add a BIO_HOLD_PAGES
flag that tells us not to release the pages; the caller will do that.
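
A hypothetical caller-side sketch (not part of this patch, error
handling omitted): a submitter that pinned the pages itself, and keeps
them pinned beyond this one bio, tags the bio so completion doesn't
drop the page references:

	/*
	 * pages[] were pinned by the caller (e.g. via get_user_pages())
	 * and stay pinned after this bio completes, so the caller remains
	 * responsible for the final put_page() calls.
	 */
	static void submit_held_pages(struct bio *bio, struct page **pages,
				      unsigned int nr_pages)
	{
		unsigned int i;

		for (i = 0; i < nr_pages; i++)
			bio_add_page(bio, pages[i], PAGE_SIZE, 0);

		bio_set_flag(bio, BIO_HOLD_PAGES);
		submit_bio(bio);
	}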

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/bio.c               | 6 ++++--
 include/linux/blk_types.h | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 03895cc0d74a..ab174bce5436 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1635,7 +1635,8 @@ static void bio_dirty_fn(struct work_struct *work)
 		next = bio->bi_private;
 
 		bio_set_pages_dirty(bio);
-		bio_release_pages(bio);
+		if (!bio_flagged(bio, BIO_HOLD_PAGES))
+			bio_release_pages(bio);
 		bio_put(bio);
 	}
 }
@@ -1651,7 +1652,8 @@ void bio_check_pages_dirty(struct bio *bio)
 			goto defer;
 	}
 
-	bio_release_pages(bio);
+	if (!bio_flagged(bio, BIO_HOLD_PAGES))
+		bio_release_pages(bio);
 	bio_put(bio);
 	return;
 defer:
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index c0ba1a038ff3..78aaf7442688 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -227,6 +227,7 @@ struct bio {
 #define BIO_TRACE_COMPLETION 10	/* bio_endio() should trace the final completion
 				 * of this bio. */
 #define BIO_QUEUE_ENTERED 11	/* can use blk_queue_enter_live() */
+#define BIO_HOLD_PAGES	12	/* don't put O_DIRECT pages */
 
 /* See BVEC_POOL_OFFSET below before adding new flags */
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (22 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 23/27] block: add BIO_HOLD_PAGES flag Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 19:21   ` Al Viro
  2018-11-30 16:56 ` [PATCH 25/27] fs: add support for mapping an ITER_KVEC for O_DIRECT Jens Axboe
                   ` (2 subsequent siblings)
  26 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

For an ITER_KVEC, we can just iterate the iov and add the pages
to the bio directly.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/bio.c         | 30 ++++++++++++++++++++++++++++++
 include/linux/bio.h |  1 +
 2 files changed, 31 insertions(+)

diff --git a/block/bio.c b/block/bio.c
index ab174bce5436..7e59ef547ed4 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -903,6 +903,36 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 }
 EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);
 
+/**
+ * bio_iov_kvec_add_pages - add pages from an ITER_KVEC to a bio
+ * @bio: bio to add pages to
+ * @iter: iov iterator describing the region to be added
+ *
+ * Iterate pages in the @iter and add them to the bio. We flag the
+ * @bio with BIO_HOLD_PAGES, telling IO completion not to free them.
+ */
+int bio_iov_kvec_add_pages(struct bio *bio, struct iov_iter *iter)
+{
+	unsigned short orig_vcnt = bio->bi_vcnt;
+	const struct kvec *kv;
+
+	do {
+		struct page *page;
+		size_t size;
+
+		kv = iter->kvec + iter->iov_offset;
+		page = virt_to_page(kv->iov_base);
+		size = bio_add_page(bio, page, kv->iov_len,
+					offset_in_page(kv->iov_base));
+		if (size != kv->iov_len)
+			break;
+		iov_iter_advance(iter, size);
+	} while (iov_iter_count(iter) && !bio_full(bio));
+
+	bio_set_flag(bio, BIO_HOLD_PAGES);
+	return bio->bi_vcnt > orig_vcnt ? 0 : -EINVAL;
+}
+
 static void submit_bio_wait_endio(struct bio *bio)
 {
 	complete(bio->bi_private);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 056fb627edb3..23ae8fb66b1e 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -434,6 +434,7 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
 void __bio_add_page(struct bio *bio, struct page *page,
 		unsigned int len, unsigned int off);
 int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
+int bio_iov_kvec_add_pages(struct bio *bio, struct iov_iter *iter);
 struct rq_map_data;
 extern struct bio *bio_map_user_iov(struct request_queue *,
 				    struct iov_iter *, gfp_t);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 25/27] fs: add support for mapping an ITER_KVEC for O_DIRECT
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (23 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 16:56 ` [PATCH 26/27] iov_iter: add import_kvec() Jens Axboe
  2018-11-30 16:56 ` [PATCH 27/27] aio: add support for pre-mapped user IO buffers Jens Axboe
  26 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

This adds support for using a kvec type iter with sync/async O_DIRECT,
both for bdev access and for iomap.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/block_dev.c | 16 ++++++++++++----
 fs/iomap.c     |  5 ++++-
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index ebc3d5a0f424..b926f03de55e 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -219,7 +219,10 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 	bio.bi_end_io = blkdev_bio_end_io_simple;
 	bio.bi_ioprio = iocb->ki_ioprio;
 
-	ret = bio_iov_iter_get_pages(&bio, iter);
+	if (iov_iter_is_kvec(iter))
+		ret = bio_iov_kvec_add_pages(&bio, iter);
+	else
+		ret = bio_iov_iter_get_pages(&bio, iter);
 	if (unlikely(ret))
 		goto out;
 	ret = bio.bi_iter.bi_size;
@@ -326,8 +329,9 @@ static void blkdev_bio_end_io(struct bio *bio)
 		struct bio_vec *bvec;
 		int i;
 
-		bio_for_each_segment_all(bvec, bio, i)
-			put_page(bvec->bv_page);
+		if (!bio_flagged(bio, BIO_HOLD_PAGES))
+			bio_for_each_segment_all(bvec, bio, i)
+				put_page(bvec->bv_page);
 		bio_put(bio);
 	}
 }
@@ -381,7 +385,11 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 		bio->bi_end_io = blkdev_bio_end_io;
 		bio->bi_ioprio = iocb->ki_ioprio;
 
-		ret = bio_iov_iter_get_pages(bio, iter);
+		if (iov_iter_is_kvec(iter))
+			ret = bio_iov_kvec_add_pages(bio, iter);
+		else
+			ret = bio_iov_iter_get_pages(bio, iter);
+
 		if (unlikely(ret)) {
 			bio->bi_status = BLK_STS_IOERR;
 			bio_endio(bio);
diff --git a/fs/iomap.c b/fs/iomap.c
index 96d60b9b2bea..72f58d604fab 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1671,7 +1671,10 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 		bio->bi_private = dio;
 		bio->bi_end_io = iomap_dio_bio_end_io;
 
-		ret = bio_iov_iter_get_pages(bio, &iter);
+		if (iov_iter_is_kvec(&iter))
+			ret = bio_iov_kvec_add_pages(bio, &iter);
+		else
+			ret = bio_iov_iter_get_pages(bio, &iter);
 		if (unlikely(ret)) {
 			bio_put(bio);
 			return copied ? copied : ret;
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 26/27] iov_iter: add import_kvec()
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (24 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 25/27] fs: add support for mapping an ITER_KVEC for O_DIRECT Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 19:17   ` Al Viro
  2018-11-30 16:56 ` [PATCH 27/27] aio: add support for pre-mapped user IO buffers Jens Axboe
  26 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

This adds import_kvec(), which explicitly sets up an ITER_KVEC from a
set of kvecs describing already mapped kernel ranges.
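
A hypothetical caller (names invented for illustration) would wrap an
already kernel-mapped buffer in a kvec and import it, after which the
resulting iov_iter is used like any other:

	/*
	 * Illustration only: build an ITER_KVEC over a single kernel
	 * buffer.  'dir' follows the usual iov_iter READ/WRITE convention.
	 */
	static void kbuf_to_iter(void *kbuf, size_t len, int dir,
				 struct kvec *kv, struct iov_iter *iter)
	{
		kv->iov_base = kbuf;
		kv->iov_len = len;
		import_kvec(dir, kv, 1, len, iter);
	}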

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 include/linux/uio.h |  3 +++
 lib/iov_iter.c      | 35 ++++++++++++++++++++++++++---------
 2 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 55ce99ddb912..bbefdb421f6d 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -284,6 +284,9 @@ int compat_import_iovec(int type, const struct compat_iovec __user * uvector,
 int import_single_range(int type, void __user *buf, size_t len,
 		 struct iovec *iov, struct iov_iter *i);
 
+int import_kvec(int type, const struct kvec *kvecs, unsigned nr_segs,
+		size_t bytes, struct iov_iter *iter);
+
 int iov_iter_for_each_range(struct iov_iter *i, size_t bytes,
 			    int (*f)(struct kvec *vec, void *context),
 			    void *context);
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 7ebccb5c1637..a5b6dd691f37 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -431,25 +431,33 @@ int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
 }
 EXPORT_SYMBOL(iov_iter_fault_in_readable);
 
-void iov_iter_init(struct iov_iter *i, unsigned int direction,
-			const struct iovec *iov, unsigned long nr_segs,
-			size_t count)
+static void iov_iter_init_type(struct iov_iter *i, int type,
+			unsigned int direction, const struct iovec *iov,
+			unsigned long nr_segs, size_t count)
 {
 	WARN_ON(direction & ~(READ | WRITE));
 	direction &= READ | WRITE;
 
-	/* It will get better.  Eventually... */
-	if (uaccess_kernel()) {
-		i->type = ITER_KVEC | direction;
+	i->type = type | direction;
+	if (i->type == ITER_KVEC)
 		i->kvec = (struct kvec *)iov;
-	} else {
-		i->type = ITER_IOVEC | direction;
+	else
 		i->iov = iov;
-	}
+
 	i->nr_segs = nr_segs;
 	i->iov_offset = 0;
 	i->count = count;
 }
+
+void iov_iter_init(struct iov_iter *i, unsigned int direction,
+			const struct iovec *iov, unsigned long nr_segs,
+			size_t count)
+{
+	/* It will get better.  Eventually... */
+	int type = uaccess_kernel() ? ITER_KVEC : ITER_IOVEC;
+
+	iov_iter_init_type(i, type, direction, iov, nr_segs, count);
+}
 EXPORT_SYMBOL(iov_iter_init);
 
 static void memcpy_from_page(char *to, struct page *page, size_t offset, size_t len)
@@ -1582,6 +1590,15 @@ int import_iovec(int type, const struct iovec __user * uvector,
 }
 EXPORT_SYMBOL(import_iovec);
 
+int import_kvec(int type, const struct kvec *kvecs, unsigned nr_segs,
+		size_t bytes, struct iov_iter *iter)
+{
+	const struct iovec *p = (const struct iovec *) kvecs;
+
+	iov_iter_init_type(iter, ITER_KVEC, type, p, nr_segs, bytes);
+	return 0;
+}
+
 #ifdef CONFIG_COMPAT
 #include <linux/compat.h>
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH 27/27] aio: add support for pre-mapped user IO buffers
  2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
                   ` (25 preceding siblings ...)
  2018-11-30 16:56 ` [PATCH 26/27] iov_iter: add import_kvec() Jens Axboe
@ 2018-11-30 16:56 ` Jens Axboe
  2018-11-30 21:44   ` Jeff Moyer
  26 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 16:56 UTC (permalink / raw)
  To: linux-block, linux-fsdevel, linux-aio; +Cc: hch, Jens Axboe

If we have fixed user buffers, we can map them into the kernel when we
setup the io_context. That avoids the need to do get_user_pages() for
each and every IO.

To utilize this feature, the application must set both
IOCTX_FLAG_USERIOCB, to provide iocbs in userspace, and
IOCTX_FLAG_FIXEDBUFS. The latter tells aio that the mapped iocbs
already contain valid destinations and sizes. These buffers can then
be mapped into the kernel for the lifetime of the io_context, as
opposed to just the duration of each single IO.

Only works with non-vectored read/write commands for now, not with
PREADV/PWRITEV.

A limit of 4M is imposed as the largest buffer we currently support.
There's nothing preventing us from going larger, but we need some cap,
and 4M seemed like it would definitely be big enough.

See the fio change for how to utilize this feature:

http://git.kernel.dk/cgit/fio/commit/?id=2041bd343da1c1e955253f62374588718c64f0f3
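
Roughly, and with the usual caveats (headers and raw syscall wrappers
assumed, error handling and cleanup omitted, 'fd' already opened),
setting this up from an application would look like:

	struct iocb *iocbs, *submit_idx;
	aio_context_t ctx = 0;
	struct io_event ev;
	unsigned int i, nr_reqs = 32;
	void *buf;

	/* the mapped iocb array doubles as the fixed buffer description */
	iocbs = calloc(nr_reqs, sizeof(*iocbs));
	for (i = 0; i < nr_reqs; i++) {
		posix_memalign(&buf, 4096, 4096);
		iocbs[i].aio_buf = (unsigned long) buf;	/* fixed buffer */
		iocbs[i].aio_nbytes = 4096;		/* max IO size */
	}

	io_setup2(nr_reqs, IOCTX_FLAG_USERIOCB | IOCTX_FLAG_FIXEDBUFS,
		  iocbs, &ctx);

	/* with USERIOCB, the "pointers" passed to io_submit() are indexes */
	iocbs[0].aio_fildes = fd;
	iocbs[0].aio_lio_opcode = IOCB_CMD_PREAD;
	submit_idx = (struct iocb *) 0UL;	/* iocb index 0 */
	io_submit(ctx, 1, &submit_idx);

	io_getevents(ctx, 1, 1, &ev, NULL);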

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/aio.c                     | 185 +++++++++++++++++++++++++++++++----
 include/uapi/linux/aio_abi.h |   1 +
 2 files changed, 169 insertions(+), 17 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index 426939f1dae9..f735967488a5 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -42,6 +42,7 @@
 #include <linux/ramfs.h>
 #include <linux/percpu-refcount.h>
 #include <linux/mount.h>
+#include <linux/sizes.h>
 
 #include <asm/kmap_types.h>
 #include <linux/uaccess.h>
@@ -86,6 +87,11 @@ struct ctx_rq_wait {
 	atomic_t count;
 };
 
+struct aio_mapped_ubuf {
+	struct kvec *kvec;
+	unsigned int nr_kvecs;
+};
+
 struct kioctx {
 	struct percpu_ref	users;
 	atomic_t		dead;
@@ -124,6 +130,8 @@ struct kioctx {
 	struct page		**iocb_pages;
 	long			iocb_nr_pages;
 
+	struct aio_mapped_ubuf	*user_bufs;
+
 	struct rcu_work		free_rwork;	/* see free_ioctx() */
 
 	/*
@@ -290,6 +298,7 @@ static const bool aio_use_state_req_list = false;
 #endif
 
 static void aio_useriocb_free(struct kioctx *);
+static void aio_iocb_buffer_unmap(struct kioctx *);
 static void aio_iopoll_reap_events(struct kioctx *);
 
 static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages)
@@ -652,6 +661,7 @@ static void free_ioctx(struct work_struct *work)
 					  free_rwork);
 	pr_debug("freeing %p\n", ctx);
 
+	aio_iocb_buffer_unmap(ctx);
 	aio_useriocb_free(ctx);
 	aio_free_ring(ctx);
 	free_percpu(ctx->cpu);
@@ -1597,6 +1607,115 @@ static struct iocb *aio_iocb_from_index(struct kioctx *ctx, int index)
 	return iocb + index;
 }
 
+static void aio_iocb_buffer_unmap(struct kioctx *ctx)
+{
+	int i, j;
+
+	if (!ctx->user_bufs)
+		return;
+
+	for (i = 0; i < ctx->max_reqs; i++) {
+		struct aio_mapped_ubuf *amu = &ctx->user_bufs[i];
+
+		for (j = 0; j < amu->nr_kvecs; j++) {
+			struct page *page;
+
+			page = virt_to_page(amu->kvec[j].iov_base);
+			put_page(page);
+		}
+		kfree(amu->kvec);
+		amu->nr_kvecs = 0;
+	}
+
+	kfree(ctx->user_bufs);
+	ctx->user_bufs = NULL;
+}
+
+static int aio_iocb_buffer_map(struct kioctx *ctx)
+{
+	struct page **pages = NULL;
+	int i, j, got_pages = 0;
+	struct iocb *iocb;
+	int ret = -EINVAL;
+
+	ctx->user_bufs = kzalloc(ctx->max_reqs * sizeof(struct aio_mapped_ubuf),
+					GFP_KERNEL);
+	if (!ctx->user_bufs)
+		return -ENOMEM;
+
+	for (i = 0; i < ctx->max_reqs; i++) {
+		struct aio_mapped_ubuf *amu = &ctx->user_bufs[i];
+		unsigned long off, start, end, ubuf;
+		int pret, nr_pages;
+		size_t size;
+
+		iocb = aio_iocb_from_index(ctx, i);
+
+		/*
+		 * Don't impose further limits on the size and buffer
+		 * constraints here, we'll -EINVAL later when IO is
+		 * submitted if they are wrong.
+		 */
+		ret = -EFAULT;
+		if (!iocb->aio_buf)
+			goto err;
+
+		/* arbitrary limit, but we need something */
+		if (iocb->aio_nbytes > SZ_4M)
+			goto err;
+
+		ubuf = iocb->aio_buf;
+		end = (ubuf + iocb->aio_nbytes + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		start = ubuf >> PAGE_SHIFT;
+		nr_pages = end - start;
+
+		if (!pages || nr_pages > got_pages) {
+			kfree(pages);
+			pages = kmalloc(nr_pages * sizeof(struct page *),
+					GFP_KERNEL);
+			if (!pages) {
+				ret = -ENOMEM;
+				goto err;
+			}
+			got_pages = nr_pages;
+		}
+
+		amu->kvec = kmalloc(nr_pages * sizeof(struct kvec), GFP_KERNEL);
+		if (!amu->kvec)
+			goto err;
+
+		down_write(&current->mm->mmap_sem);
+		pret = get_user_pages((unsigned long) iocb->aio_buf, nr_pages,
+					1, pages, NULL);
+		up_write(&current->mm->mmap_sem);
+
+		if (pret < nr_pages) {
+			if (pret < 0)
+				ret = pret;
+			goto err;
+		}
+
+		off = ubuf & ~PAGE_MASK;
+		size = iocb->aio_nbytes;
+		for (j = 0; j < nr_pages; j++) {
+			size_t vec_len;
+
+			vec_len = min_t(size_t, size, PAGE_SIZE - off);
+			amu->kvec[j].iov_base = page_address(pages[j]) + off;
+			amu->kvec[j].iov_len = vec_len;
+			off = 0;
+			size -= vec_len;
+		}
+		amu->nr_kvecs = nr_pages;
+	}
+	kfree(pages);
+	return 0;
+err:
+	kfree(pages);
+	aio_iocb_buffer_unmap(ctx);
+	return ret;
+}
+
 static void aio_useriocb_free(struct kioctx *ctx)
 {
 	int i;
@@ -1647,7 +1766,8 @@ SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user,
 	unsigned long ctx;
 	long ret;
 
-	if (flags & ~(IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL))
+	if (flags & ~(IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL |
+		      IOCTX_FLAG_FIXEDBUFS))
 		return -EINVAL;
 
 	ret = get_user(ctx, ctxp);
@@ -1663,6 +1783,15 @@ SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user,
 		ret = aio_useriocb_map(ioctx, iocbs);
 		if (ret)
 			goto err;
+		if (flags & IOCTX_FLAG_FIXEDBUFS) {
+			ret = aio_iocb_buffer_map(ioctx);
+			if (ret)
+				goto err;
+		}
+	} else if (flags & IOCTX_FLAG_FIXEDBUFS) {
+		/* can only support fixed bufs with user mapped iocbs */
+		ret = -EINVAL;
+		goto err;
 	}
 
 	ret = put_user(ioctx->user_id, ctxp);
@@ -1939,23 +2068,38 @@ static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb,
 	return ret;
 }
 
-static int aio_setup_rw(int rw, const struct iocb *iocb, struct iovec **iovec,
-		bool vectored, bool compat, struct iov_iter *iter)
+static int aio_setup_rw(int rw, struct aio_kiocb *kiocb,
+		const struct iocb *iocb, struct iovec **iovec, bool vectored,
+		bool compat, bool kvecs, struct iov_iter *iter)
 {
-	void __user *buf = (void __user *)(uintptr_t)iocb->aio_buf;
+	void __user *ubuf = (void __user *)(uintptr_t)iocb->aio_buf;
 	size_t len = iocb->aio_nbytes;
 
 	if (!vectored) {
-		ssize_t ret = import_single_range(rw, buf, len, *iovec, iter);
+		ssize_t ret;
+
+		if (!kvecs) {
+			ret = import_single_range(rw, ubuf, len, *iovec, iter);
+		} else {
+			long index = (long) kiocb->ki_user_iocb;
+			struct aio_mapped_ubuf *amu;
+
+			/* __io_submit_one() already validated the index */
+			amu = &kiocb->ki_ctx->user_bufs[index];
+			ret = import_kvec(rw, amu->kvec, amu->nr_kvecs,
+						len, iter);
+		}
 		*iovec = NULL;
 		return ret;
 	}
+	if (kvecs)
+		return -EINVAL;
 #ifdef CONFIG_COMPAT
 	if (compat)
-		return compat_import_iovec(rw, buf, len, UIO_FASTIOV, iovec,
+		return compat_import_iovec(rw, ubuf, len, UIO_FASTIOV, iovec,
 				iter);
 #endif
-	return import_iovec(rw, buf, len, UIO_FASTIOV, iovec, iter);
+	return import_iovec(rw, ubuf, len, UIO_FASTIOV, iovec, iter);
 }
 
 static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
@@ -2028,7 +2172,7 @@ static void aio_iopoll_iocb_issued(struct aio_submit_state *state,
 
 static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
 			struct aio_submit_state *state, bool vectored,
-			bool compat)
+			bool compat, bool kvecs)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
 	struct kiocb *req = &kiocb->rw;
@@ -2048,9 +2192,11 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
 	if (unlikely(!file->f_op->read_iter))
 		goto out_fput;
 
-	ret = aio_setup_rw(READ, iocb, &iovec, vectored, compat, &iter);
+	ret = aio_setup_rw(READ, kiocb, iocb, &iovec, vectored, compat, kvecs,
+				&iter);
 	if (ret)
 		goto out_fput;
+
 	ret = rw_verify_area(READ, file, &req->ki_pos, iov_iter_count(&iter));
 	if (!ret)
 		aio_rw_done(req, call_read_iter(file, req, &iter));
@@ -2063,7 +2209,7 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
 
 static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb,
 			 struct aio_submit_state *state, bool vectored,
-			 bool compat)
+			 bool compat, bool kvecs)
 {
 	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
 	struct kiocb *req = &kiocb->rw;
@@ -2083,7 +2229,8 @@ static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb,
 	if (unlikely(!file->f_op->write_iter))
 		goto out_fput;
 
-	ret = aio_setup_rw(WRITE, iocb, &iovec, vectored, compat, &iter);
+	ret = aio_setup_rw(WRITE, kiocb, iocb, &iovec, vectored, compat, kvecs,
+				&iter);
 	if (ret)
 		goto out_fput;
 	ret = rw_verify_area(WRITE, file, &req->ki_pos, iov_iter_count(&iter));
@@ -2322,7 +2469,8 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 
 static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 			   struct iocb __user *user_iocb,
-			   struct aio_submit_state *state, bool compat)
+			   struct aio_submit_state *state, bool compat,
+			   bool kvecs)
 {
 	struct aio_kiocb *req;
 	ssize_t ret;
@@ -2382,16 +2530,16 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
 	ret = -EINVAL;
 	switch (iocb->aio_lio_opcode) {
 	case IOCB_CMD_PREAD:
-		ret = aio_read(req, iocb, state, false, compat);
+		ret = aio_read(req, iocb, state, false, compat, kvecs);
 		break;
 	case IOCB_CMD_PWRITE:
-		ret = aio_write(req, iocb, state, false, compat);
+		ret = aio_write(req, iocb, state, false, compat, kvecs);
 		break;
 	case IOCB_CMD_PREADV:
-		ret = aio_read(req, iocb, state, true, compat);
+		ret = aio_read(req, iocb, state, true, compat, kvecs);
 		break;
 	case IOCB_CMD_PWRITEV:
-		ret = aio_write(req, iocb, state, true, compat);
+		ret = aio_write(req, iocb, state, true, compat, kvecs);
 		break;
 	case IOCB_CMD_FSYNC:
 		if (ctx->flags & IOCTX_FLAG_IOPOLL)
@@ -2443,6 +2591,7 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 			 struct aio_submit_state *state, bool compat)
 {
 	struct iocb iocb, *iocbp;
+	bool kvecs;
 
 	if (ctx->flags & IOCTX_FLAG_USERIOCB) {
 		unsigned long iocb_index = (unsigned long) user_iocb;
@@ -2450,14 +2599,16 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
 		if (iocb_index >= ctx->max_reqs)
 			return -EINVAL;
 
+		kvecs = (ctx->flags & IOCTX_FLAG_FIXEDBUFS) != 0;
 		iocbp = aio_iocb_from_index(ctx, iocb_index);
 	} else {
 		if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb))))
 			return -EFAULT;
+		kvecs = false;
 		iocbp = &iocb;
 	}
 
-	return __io_submit_one(ctx, iocbp, user_iocb, state, compat);
+	return __io_submit_one(ctx, iocbp, user_iocb, state, compat, kvecs);
 }
 
 #ifdef CONFIG_BLOCK
diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h
index ea0b9a19f4df..05d72cf86bd3 100644
--- a/include/uapi/linux/aio_abi.h
+++ b/include/uapi/linux/aio_abi.h
@@ -110,6 +110,7 @@ struct iocb {
 
 #define IOCTX_FLAG_USERIOCB	(1 << 0)	/* iocbs are user mapped */
 #define IOCTX_FLAG_IOPOLL	(1 << 1)	/* io_context is polled */
+#define IOCTX_FLAG_FIXEDBUFS	(1 << 2)	/* IO buffers are fixed */
 
 #undef IFBIG
 #undef IFLITTLE
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH 01/27] aio: fix failure to put the file pointer
  2018-11-30 16:56 ` [PATCH 01/27] aio: fix failure to put the file pointer Jens Axboe
@ 2018-11-30 17:07   ` Bart Van Assche
  2018-11-30 17:08     ` Jens Axboe
  0 siblings, 1 reply; 59+ messages in thread
From: Bart Van Assche @ 2018-11-30 17:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-fsdevel, linux-aio; +Cc: hch

On Fri, 2018-11-30 at 09:56 -0700, Jens Axboe wrote:
> If the ioprio capability check fails, we return without putting
> the file pointer.
> 
> Fixes: d9a08a9e616b ("fs: Add aio iopriority support")
> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  fs/aio.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/aio.c b/fs/aio.c
> index b984918be4b7..205390c0c1bb 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -1436,6 +1436,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
>  		ret = ioprio_check_cap(iocb->aio_reqprio);
>  		if (ret) {
>  			pr_debug("aio ioprio check cap error: %d\n", ret);
> +			fput(req->ki_filp);
>  			return ret;
>  		}

Since this patch fixes a bug that was introduced in kernel v4.18, does this
patch need a "Cc: stable" tag?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 01/27] aio: fix failure to put the file pointer
  2018-11-30 17:07   ` Bart Van Assche
@ 2018-11-30 17:08     ` Jens Axboe
  2018-11-30 17:24       ` Bart Van Assche
  0 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 17:08 UTC (permalink / raw)
  To: Bart Van Assche, linux-block, linux-fsdevel, linux-aio; +Cc: hch

On 11/30/18 10:07 AM, Bart Van Assche wrote:
> On Fri, 2018-11-30 at 09:56 -0700, Jens Axboe wrote:
>> If the ioprio capability check fails, we return without putting
>> the file pointer.
>>
>> Fixes: d9a08a9e616b ("fs: Add aio iopriority support")
>> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  fs/aio.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/aio.c b/fs/aio.c
>> index b984918be4b7..205390c0c1bb 100644
>> --- a/fs/aio.c
>> +++ b/fs/aio.c
>> @@ -1436,6 +1436,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
>>  		ret = ioprio_check_cap(iocb->aio_reqprio);
>>  		if (ret) {
>>  			pr_debug("aio ioprio check cap error: %d\n", ret);
>> +			fput(req->ki_filp);
>>  			return ret;
>>  		}
> 
> Since this patch fixes a bug that was introduced in kernel v4.18, does this
> patch need a "Cc: stable" tag?

The Fixes tag should take care of that by itself, I hope.


-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 05/27] block: ensure that async polled IO is marked REQ_NOWAIT
  2018-11-30 16:56 ` [PATCH 05/27] block: ensure that async polled IO is marked REQ_NOWAIT Jens Axboe
@ 2018-11-30 17:12   ` Bart Van Assche
  2018-11-30 17:17     ` Jens Axboe
  0 siblings, 1 reply; 59+ messages in thread
From: Bart Van Assche @ 2018-11-30 17:12 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-fsdevel, linux-aio; +Cc: hch

On Fri, 2018-11-30 at 09:56 -0700, Jens Axboe wrote:
> We can't wait for polled events to complete, as they may require active
> polling from whoever submitted it. If that is the same task that is
> submitting new IO, we could deadlock waiting for IO to complete that
> this task is supposed to be completing itself.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  fs/block_dev.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 6de8d35f6e41..ebc3d5a0f424 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -402,8 +402,16 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
>  
>  		nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES);
>  		if (!nr_pages) {
> -			if (iocb->ki_flags & IOCB_HIPRI)
> +			if (iocb->ki_flags & IOCB_HIPRI) {
>  				bio->bi_opf |= REQ_HIPRI;
> +				/*
> +				 * For async polled IO, we can't wait for
> +				 * requests to complete, as they may also be
> +				 * polled and require active reaping.
> +				 */
> +				if (!is_sync)
> +					bio->bi_opf |= REQ_NOWAIT;
> +			}
>  
>  			qc = submit_bio(bio);
>  			WRITE_ONCE(iocb->ki_cookie, qc);

Setting REQ_NOWAIT from inside the block layer will make the code that
submits requests harder to review. Have you considered making this code
fail I/O if REQ_NOWAIT has not been set, and requiring that the context
that submits I/O sets REQ_NOWAIT?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 02/27] aio: clear IOCB_HIPRI
  2018-11-30 16:56 ` [PATCH 02/27] aio: clear IOCB_HIPRI Jens Axboe
@ 2018-11-30 17:13   ` Christoph Hellwig
  2018-11-30 17:14     ` Jens Axboe
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2018-11-30 17:13 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-fsdevel, linux-aio, hch

I think we'll need to queue this up for 4.21 ASAP independent of the
rest, given that with separate poll queues userspace could otherwise
submit I/O that will never get polled for anywhere.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 02/27] aio: clear IOCB_HIPRI
  2018-11-30 17:13   ` Christoph Hellwig
@ 2018-11-30 17:14     ` Jens Axboe
  2018-12-04 14:46       ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 17:14 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block, linux-fsdevel, linux-aio

On 11/30/18 10:13 AM, Christoph Hellwig wrote:
> I think we'll need to queue this up for 4.21 ASAP independent of the
> rest, given that with separate poll queues userspace could otherwise
> submit I/O that will never get polled for anywhere.

Probably a good idea, I can just move it to my 4.21 branch, it's not
strictly dependent on the series.


-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 05/27] block: ensure that async polled IO is marked REQ_NOWAIT
  2018-11-30 17:12   ` Bart Van Assche
@ 2018-11-30 17:17     ` Jens Axboe
  2018-12-04 14:48       ` Christoph Hellwig
  0 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 17:17 UTC (permalink / raw)
  To: Bart Van Assche, linux-block, linux-fsdevel, linux-aio; +Cc: hch

On 11/30/18 10:12 AM, Bart Van Assche wrote:
> On Fri, 2018-11-30 at 09:56 -0700, Jens Axboe wrote:
>> We can't wait for polled events to complete, as they may require active
>> polling from whoever submitted it. If that is the same task that is
>> submitting new IO, we could deadlock waiting for IO to complete that
>> this task is supposed to be completing itself.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  fs/block_dev.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/block_dev.c b/fs/block_dev.c
>> index 6de8d35f6e41..ebc3d5a0f424 100644
>> --- a/fs/block_dev.c
>> +++ b/fs/block_dev.c
>> @@ -402,8 +402,16 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
>>  
>>  		nr_pages = iov_iter_npages(iter, BIO_MAX_PAGES);
>>  		if (!nr_pages) {
>> -			if (iocb->ki_flags & IOCB_HIPRI)
>> +			if (iocb->ki_flags & IOCB_HIPRI) {
>>  				bio->bi_opf |= REQ_HIPRI;
>> +				/*
>> +				 * For async polled IO, we can't wait for
>> +				 * requests to complete, as they may also be
>> +				 * polled and require active reaping.
>> +				 */
>> +				if (!is_sync)
>> +					bio->bi_opf |= REQ_NOWAIT;
>> +			}
>>  
>>  			qc = submit_bio(bio);
>>  			WRITE_ONCE(iocb->ki_cookie, qc);
> 
> Setting REQ_NOWAIT from inside the block layer will make the code that
> submits requests harder to review. Have you considered to make this code
> fail I/O if REQ_NOWAIT has not been set and to require that the context
> that submits I/O sets REQ_NOWAIT?

It's technically still feasible to do for sync polled IO, it's only
the async case that makes it a potential deadlock.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 01/27] aio: fix failure to put the file pointer
  2018-11-30 17:08     ` Jens Axboe
@ 2018-11-30 17:24       ` Bart Van Assche
  0 siblings, 0 replies; 59+ messages in thread
From: Bart Van Assche @ 2018-11-30 17:24 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-fsdevel, linux-aio; +Cc: hch

On Fri, 2018-11-30 at 10:08 -0700, Jens Axboe wrote:
> On 11/30/18 10:07 AM, Bart Van Assche wrote:
> > On Fri, 2018-11-30 at 09:56 -0700, Jens Axboe wrote:
> > > If the ioprio capability check fails, we return without putting
> > > the file pointer.
> > > 
> > > Fixes: d9a08a9e616b ("fs: Add aio iopriority support")
> > > Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
> > > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > > Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > > ---
> > >  fs/aio.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/fs/aio.c b/fs/aio.c
> > > index b984918be4b7..205390c0c1bb 100644
> > > --- a/fs/aio.c
> > > +++ b/fs/aio.c
> > > @@ -1436,6 +1436,7 @@ static int aio_prep_rw(struct kiocb *req, struct iocb *iocb)
> > >  		ret = ioprio_check_cap(iocb->aio_reqprio);
> > >  		if (ret) {
> > >  			pr_debug("aio ioprio check cap error: %d\n", ret);
> > > +			fput(req->ki_filp);
> > >  			return ret;
> > >  		}
> > 
> > Since this patch fixes a bug that was introduced in kernel v4.18, does this
> > patch need a "Cc: stable" tag?
> 
> The fixes should take care of that by itself, I hope.

Hi Jens,

My understanding is that patches that have a "Cc: stable" tag are guaranteed to
be integrated in a stable kernel sooner or later. Without that tag it depends on
the stable kernel maintainer whether or not these patches get picked up. I think
the "AUTOSEL" tag Sasha Levin uses indicates that a patch was picked up for one
of his stable kernels and that it did not have a "Cc: stable" tag. I'm not sure
Greg KH picks up patches that only have a "Fixes:" tag but no "Cc: stable" tag.
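
I.e., to be safe, something like this in the tag block (the version
range is just an example):

    Fixes: d9a08a9e616b ("fs: Add aio iopriority support")
    Cc: stable@vger.kernel.org # v4.18+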

Bart.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 26/27] iov_iter: add import_kvec()
  2018-11-30 16:56 ` [PATCH 26/27] iov_iter: add import_kvec() Jens Axboe
@ 2018-11-30 19:17   ` Al Viro
  2018-11-30 20:15     ` Jens Axboe
  0 siblings, 1 reply; 59+ messages in thread
From: Al Viro @ 2018-11-30 19:17 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On Fri, Nov 30, 2018 at 09:56:45AM -0700, Jens Axboe wrote:
> This explicitly sets up an ITER_KVEC from an iovec with kernel ranges
> mapped.

> +int import_kvec(int type, const struct kvec *kvecs, unsigned nr_segs,
> +		size_t bytes, struct iov_iter *iter)
> +{
> +	const struct iovec *p = (const struct iovec *) kvecs;
> +
> +	iov_iter_init_type(iter, ITER_KVEC, type, p, nr_segs, bytes);
> +	return 0;
> +}

What the hell is wrong with existing iov_iter_kvec()?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio
  2018-11-30 16:56 ` [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio Jens Axboe
@ 2018-11-30 19:21   ` Al Viro
  2018-11-30 20:15     ` Jens Axboe
  2018-12-04 14:55     ` Christoph Hellwig
  0 siblings, 2 replies; 59+ messages in thread
From: Al Viro @ 2018-11-30 19:21 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On Fri, Nov 30, 2018 at 09:56:43AM -0700, Jens Axboe wrote:
> For an ITER_KVEC, we can just iterate the iov and add the pages
> to the bio directly.

> +		page = virt_to_page(kv->iov_base);
> +		size = bio_add_page(bio, page, kv->iov_len,
> +					offset_in_page(kv->iov_base));

Who said that you *can* do virt_to_page() on those?  E.g. vmalloc()'ed
addresses are fine for ITER_KVEC, etc.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 26/27] iov_iter: add import_kvec()
  2018-11-30 19:17   ` Al Viro
@ 2018-11-30 20:15     ` Jens Axboe
  0 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 20:15 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On 11/30/18 12:17 PM, Al Viro wrote:
> On Fri, Nov 30, 2018 at 09:56:45AM -0700, Jens Axboe wrote:
>> This explicitly sets up an ITER_KVEC from an iovec with kernel ranges
>> mapped.
> 
>> +int import_kvec(int type, const struct kvec *kvecs, unsigned nr_segs,
>> +		size_t bytes, struct iov_iter *iter)
>> +{
>> +	const struct iovec *p = (const struct iovec *) kvecs;
>> +
>> +	iov_iter_init_type(iter, ITER_KVEC, type, p, nr_segs, bytes);
>> +	return 0;
>> +}
> 
> What the hell is wrong with existing iov_iter_kvec()?

Hah, looks like I overlooked that. Not sure how anyone could look at
lib/iov_iter.c and not get lost in the beauty of it.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio
  2018-11-30 19:21   ` Al Viro
@ 2018-11-30 20:15     ` Jens Axboe
  2018-11-30 20:32       ` Jens Axboe
  2018-12-04 14:55     ` Christoph Hellwig
  1 sibling, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 20:15 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On 11/30/18 12:21 PM, Al Viro wrote:
> On Fri, Nov 30, 2018 at 09:56:43AM -0700, Jens Axboe wrote:
>> For an ITER_KVEC, we can just iterate the iov and add the pages
>> to the bio directly.
> 
>> +		page = virt_to_page(kv->iov_base);
>> +		size = bio_add_page(bio, page, kv->iov_len,
>> +					offset_in_page(kv->iov_base));
> 
> Who said that you *can* do virt_to_page() on those?  E.g. vmalloc()'ed
> addresses are fine for ITER_KVEC, etc.

Then how do you set up a kvec based iter with memory you can safely
DMA to/from?

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio
  2018-11-30 20:15     ` Jens Axboe
@ 2018-11-30 20:32       ` Jens Axboe
  2018-11-30 21:11         ` Al Viro
  0 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 20:32 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On 11/30/18 1:15 PM, Jens Axboe wrote:
> On 11/30/18 12:21 PM, Al Viro wrote:
>> On Fri, Nov 30, 2018 at 09:56:43AM -0700, Jens Axboe wrote:
>>> For an ITER_KVEC, we can just iterate the iov and add the pages
>>> to the bio directly.
>>
>>> +		page = virt_to_page(kv->iov_base);
>>> +		size = bio_add_page(bio, page, kv->iov_len,
>>> +					offset_in_page(kv->iov_base));
>>
>> Who said that you *can* do virt_to_page() on those?  E.g. vmalloc()'ed
>> addresses are fine for ITER_KVEC, etc.
> 
> Then how do you set up a kvec based iter with memory you can safely
> DMA to/from?

Would this make you happy:

if (!is_vmalloc_addr(kv->iov_base))
        page = virt_to_page(kv->iov_base);
else
        page = vmalloc_to_page(kv->iov_base);

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio
  2018-11-30 20:32       ` Jens Axboe
@ 2018-11-30 21:11         ` Al Viro
  2018-11-30 21:16           ` Jens Axboe
  0 siblings, 1 reply; 59+ messages in thread
From: Al Viro @ 2018-11-30 21:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On Fri, Nov 30, 2018 at 01:32:21PM -0700, Jens Axboe wrote:
> On 11/30/18 1:15 PM, Jens Axboe wrote:
> > On 11/30/18 12:21 PM, Al Viro wrote:
> >> On Fri, Nov 30, 2018 at 09:56:43AM -0700, Jens Axboe wrote:
> >>> For an ITER_KVEC, we can just iterate the iov and add the pages
> >>> to the bio directly.
> >>
> >>> +		page = virt_to_page(kv->iov_base);
> >>> +		size = bio_add_page(bio, page, kv->iov_len,
> >>> +					offset_in_page(kv->iov_base));
> >>
> >> Who said that you *can* do virt_to_page() on those?  E.g. vmalloc()'ed
> >> addresses are fine for ITER_KVEC, etc.
> > 
> > Then how do you set up a kvec based iter with memory you can safely
> > DMA to/from?
> 
> Would this make you happy:
> 
> if (!is_vmalloc_addr(kv->iov_base))
>         page = virt_to_page(kv->iov_base);
> else
>         page = vmalloc_to_page(kv->iov_base);

Free advice: don't ever let Linus see anything along those lines.  Results
tend to be colourful...

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio
  2018-11-30 21:11         ` Al Viro
@ 2018-11-30 21:16           ` Jens Axboe
  2018-11-30 21:25             ` Al Viro
  0 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 21:16 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On 11/30/18 2:11 PM, Al Viro wrote:
> On Fri, Nov 30, 2018 at 01:32:21PM -0700, Jens Axboe wrote:
>> On 11/30/18 1:15 PM, Jens Axboe wrote:
>>> On 11/30/18 12:21 PM, Al Viro wrote:
>>>> On Fri, Nov 30, 2018 at 09:56:43AM -0700, Jens Axboe wrote:
>>>>> For an ITER_KVEC, we can just iterate the iov and add the pages
>>>>> to the bio directly.
>>>>
>>>>> +		page = virt_to_page(kv->iov_base);
>>>>> +		size = bio_add_page(bio, page, kv->iov_len,
>>>>> +					offset_in_page(kv->iov_base));
>>>>
>>>> Who said that you *can* do virt_to_page() on those?  E.g. vmalloc()'ed
>>>> addresses are fine for ITER_KVEC, etc.
>>>
>>> Then how do you set up a kvec based iter with memory you can safely
>>> DMA to/from?
>>
>> Would this make you happy:
>>
>> if (!is_vmalloc_addr(kv->iov_base))
>>         page = virt_to_page(kv->iov_base);
>> else
>>         page = vmalloc_to_page(kv->iov_base);
> 
> Free advice: don't ever let Linus see anything along those lines.  Results
> tend to be colourful...

We already have those lines in the kernel, XFS for instance. Al, could you
please try to be helpful instead of being deliberately obtuse?

Examples of being helpful:

1) Informing the sender of why something is a bad idea, instead of just
   saying it's a bad idea.

2) Making helpful suggestions to improve the current situation.


-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio
  2018-11-30 21:16           ` Jens Axboe
@ 2018-11-30 21:25             ` Al Viro
  2018-11-30 21:34               ` Jens Axboe
  0 siblings, 1 reply; 59+ messages in thread
From: Al Viro @ 2018-11-30 21:25 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On Fri, Nov 30, 2018 at 02:16:38PM -0700, Jens Axboe wrote:
> >> Would this make you happy:
> >>
> >> if (!is_vmalloc_addr(kv->iov_base))
> >>         page = virt_to_page(kv->iov_base);
> >> else
> >>         page = vmalloc_to_page(kv->iov_base);
> > 
> > Free advice: don't ever let Linus see anything along those lines.  Results
> > tend to be colourful...
> 
> We already have those lines in the kernel, XFS for instance. Al, could you
> please try to be helpful instead of being deliberately obtuse?

Again, the last time something like that had been suggested, Linus had replied
with a very impressive rant.  I *did* propose pretty much that, and reaction
was basically "hell no, not in general-purpose primitives".  Precisely about
iov_iter stuff.  A part of that was due to touching page refcounts, but quite
a bit wasn't.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio
  2018-11-30 21:25             ` Al Viro
@ 2018-11-30 21:34               ` Jens Axboe
  2018-11-30 22:06                 ` Jens Axboe
  0 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 21:34 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On 11/30/18 2:25 PM, Al Viro wrote:
> On Fri, Nov 30, 2018 at 02:16:38PM -0700, Jens Axboe wrote:
>>>> Would this make you happy:
>>>>
>>>> if (!is_vmalloc_addr(kv->iov_base))
>>>>         page = virt_to_page(kv->iov_base);
>>>> else
>>>>         page = vmalloc_to_page(kv->iov_base);
>>>
>>> Free advice: don't ever let Linus see anything along those lines.  Results
>>> tend to be colourful...
>>
>> We already have those lines in the kernel, XFS for instance. Al, could you
>> please try to be helpful instead of being deliberately obtuse?
> 
> Again, the last time something like that had been suggested, Linus had replied
> with a very impressive rant.  I *did* propose pretty much that, and reaction
> was basically "hell no, not in general-purpose primitives".  Precisely about
> iov_iter stuff.  A part of that was due to touching page refcounts, but quite
> a bit wasn't.

Nobody is touching the page count here, and for the aio user mapped IO,
nobody is touching them at the end either.

As far as I can tell, the above is fine. It's either a vmalloc'ed
address and should be treated specially, or we can do virt_to_page() on
it.

Do you have a link to said rant?

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 27/27] aio: add support for pre-mapped user IO buffers
  2018-11-30 16:56 ` [PATCH 27/27] aio: add support for pre-mapped user IO buffers Jens Axboe
@ 2018-11-30 21:44   ` Jeff Moyer
  2018-11-30 21:57     ` Jens Axboe
  0 siblings, 1 reply; 59+ messages in thread
From: Jeff Moyer @ 2018-11-30 21:44 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-fsdevel, linux-aio, hch

Hi, Jens,

Jens Axboe <axboe@kernel.dk> writes:

> If we have fixed user buffers, we can map them into the kernel when we
> setup the io_context. That avoids the need to do get_user_pages() for
> each and every IO.
>
> To utilize this feature, the application must set both
> IOCTX_FLAG_USERIOCB, to provide iocb's in userspace, and then
> IOCTX_FLAG_FIXEDBUFS. The latter tells aio that the iocbs that are
> mapped already contain valid destination and sizes. These buffers can
> then be mapped into the kernel for the life time of the io_context, as
> opposed to just the duration of the each single IO.
>
> Only works with non-vectored read/write commands for now, not with
> PREADV/PWRITEV.
>
> A limit of 4M is imposed as the largest buffer we currently support.
> There's nothing preventing us from going larger, but we need some cap,
> and 4M seemed like it would definitely be big enough.

Doesn't this mean that a user can pin a bunch of memory?  Something like
4MB * aio_max_nr?

$ sysctl fs.aio-max-nr
fs.aio-max-nr = 1048576

If so, it may be a good idea to account the memory under RLIMIT_MEMLOCK.
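
Roughly this kind of thing at buffer map time (only a sketch of the
idea - it assumes a locked_vm counter on the user_struct, like perf
uses, and leaves out the unaccounting on teardown):

	static int aio_account_mem(struct user_struct *user, unsigned long nr_pages)
	{
		unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
		unsigned long cur, new;

		do {
			cur = atomic_long_read(&user->locked_vm);
			new = cur + nr_pages;
			if (new > limit)
				return -ENOMEM;
		} while (atomic_long_cmpxchg(&user->locked_vm, cur, new) != cur);

		return 0;
	}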

I'm not sure how close you are to proposing this patch set for realz.
If it's soon (now?), then CC-ing linux-api and writing man pages would
be a good idea.  I can help out with the libaio bits if you'd like.  I
haven't yet had time to take this stuff for a spin, sorry.  I'll try to
get to that soonish.

The speedups are pretty impressive!

Cheers,
Jeff


> See the fio change for how to utilize this feature:
>
> http://git.kernel.dk/cgit/fio/commit/?id=2041bd343da1c1e955253f62374588718c64f0f3
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  fs/aio.c                     | 185 +++++++++++++++++++++++++++++++----
>  include/uapi/linux/aio_abi.h |   1 +
>  2 files changed, 169 insertions(+), 17 deletions(-)
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 426939f1dae9..f735967488a5 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -42,6 +42,7 @@
>  #include <linux/ramfs.h>
>  #include <linux/percpu-refcount.h>
>  #include <linux/mount.h>
> +#include <linux/sizes.h>
>  
>  #include <asm/kmap_types.h>
>  #include <linux/uaccess.h>
> @@ -86,6 +87,11 @@ struct ctx_rq_wait {
>  	atomic_t count;
>  };
>  
> +struct aio_mapped_ubuf {
> +	struct kvec *kvec;
> +	unsigned int nr_kvecs;
> +};
> +
>  struct kioctx {
>  	struct percpu_ref	users;
>  	atomic_t		dead;
> @@ -124,6 +130,8 @@ struct kioctx {
>  	struct page		**iocb_pages;
>  	long			iocb_nr_pages;
>  
> +	struct aio_mapped_ubuf	*user_bufs;
> +
>  	struct rcu_work		free_rwork;	/* see free_ioctx() */
>  
>  	/*
> @@ -290,6 +298,7 @@ static const bool aio_use_state_req_list = false;
>  #endif
>  
>  static void aio_useriocb_free(struct kioctx *);
> +static void aio_iocb_buffer_unmap(struct kioctx *);
>  static void aio_iopoll_reap_events(struct kioctx *);
>  
>  static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages)
> @@ -652,6 +661,7 @@ static void free_ioctx(struct work_struct *work)
>  					  free_rwork);
>  	pr_debug("freeing %p\n", ctx);
>  
> +	aio_iocb_buffer_unmap(ctx);
>  	aio_useriocb_free(ctx);
>  	aio_free_ring(ctx);
>  	free_percpu(ctx->cpu);
> @@ -1597,6 +1607,115 @@ static struct iocb *aio_iocb_from_index(struct kioctx *ctx, int index)
>  	return iocb + index;
>  }
>  
> +static void aio_iocb_buffer_unmap(struct kioctx *ctx)
> +{
> +	int i, j;
> +
> +	if (!ctx->user_bufs)
> +		return;
> +
> +	for (i = 0; i < ctx->max_reqs; i++) {
> +		struct aio_mapped_ubuf *amu = &ctx->user_bufs[i];
> +
> +		for (j = 0; j < amu->nr_kvecs; j++) {
> +			struct page *page;
> +
> +			page = virt_to_page(amu->kvec[j].iov_base);
> +			put_page(page);
> +		}
> +		kfree(amu->kvec);
> +		amu->nr_kvecs = 0;
> +	}
> +
> +	kfree(ctx->user_bufs);
> +	ctx->user_bufs = NULL;
> +}
> +
> +static int aio_iocb_buffer_map(struct kioctx *ctx)
> +{
> +	struct page **pages = NULL;
> +	int i, j, got_pages = 0;
> +	struct iocb *iocb;
> +	int ret = -EINVAL;
> +
> +	ctx->user_bufs = kzalloc(ctx->max_reqs * sizeof(struct aio_mapped_ubuf),
> +					GFP_KERNEL);
> +	if (!ctx->user_bufs)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < ctx->max_reqs; i++) {
> +		struct aio_mapped_ubuf *amu = &ctx->user_bufs[i];
> +		unsigned long off, start, end, ubuf;
> +		int pret, nr_pages;
> +		size_t size;
> +
> +		iocb = aio_iocb_from_index(ctx, i);
> +
> +		/*
> +		 * Don't impose further limits on the size and buffer
> +		 * constraints here, we'll -EINVAL later when IO is
> +		 * submitted if they are wrong.
> +		 */
> +		ret = -EFAULT;
> +		if (!iocb->aio_buf)
> +			goto err;
> +
> +		/* arbitrary limit, but we need something */
> +		if (iocb->aio_nbytes > SZ_4M)
> +			goto err;
> +
> +		ubuf = iocb->aio_buf;
> +		end = (ubuf + iocb->aio_nbytes + PAGE_SIZE - 1) >> PAGE_SHIFT;
> +		start = ubuf >> PAGE_SHIFT;
> +		nr_pages = end - start;
> +
> +		if (!pages || nr_pages > got_pages) {
> +			kfree(pages);
> +			pages = kmalloc(nr_pages * sizeof(struct page *),
> +					GFP_KERNEL);
> +			if (!pages) {
> +				ret = -ENOMEM;
> +				goto err;
> +			}
> +			got_pages = nr_pages;
> +		}
> +
> +		amu->kvec = kmalloc(nr_pages * sizeof(struct kvec), GFP_KERNEL);
> +		if (!amu->kvec)
> +			goto err;
> +
> +		down_write(&current->mm->mmap_sem);
> +		pret = get_user_pages((unsigned long) iocb->aio_buf, nr_pages,
> +					1, pages, NULL);
> +		up_write(&current->mm->mmap_sem);
> +
> +		if (pret < nr_pages) {
> +			if (pret < 0)
> +				ret = pret;
> +			goto err;
> +		}
> +
> +		off = ubuf & ~PAGE_MASK;
> +		size = iocb->aio_nbytes;
> +		for (j = 0; j < nr_pages; j++) {
> +			size_t vec_len;
> +
> +			vec_len = min_t(size_t, size, PAGE_SIZE - off);
> +			amu->kvec[j].iov_base = page_address(pages[j]) + off;
> +			amu->kvec[j].iov_len = vec_len;
> +			off = 0;
> +			size -= vec_len;
> +		}
> +		amu->nr_kvecs = nr_pages;
> +	}
> +	kfree(pages);
> +	return 0;
> +err:
> +	kfree(pages);
> +	aio_iocb_buffer_unmap(ctx);
> +	return ret;
> +}
> +
>  static void aio_useriocb_free(struct kioctx *ctx)
>  {
>  	int i;
> @@ -1647,7 +1766,8 @@ SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user,
>  	unsigned long ctx;
>  	long ret;
>  
> -	if (flags & ~(IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL))
> +	if (flags & ~(IOCTX_FLAG_USERIOCB | IOCTX_FLAG_IOPOLL |
> +		      IOCTX_FLAG_FIXEDBUFS))
>  		return -EINVAL;
>  
>  	ret = get_user(ctx, ctxp);
> @@ -1663,6 +1783,15 @@ SYSCALL_DEFINE4(io_setup2, u32, nr_events, u32, flags, struct iocb * __user,
>  		ret = aio_useriocb_map(ioctx, iocbs);
>  		if (ret)
>  			goto err;
> +		if (flags & IOCTX_FLAG_FIXEDBUFS) {
> +			ret = aio_iocb_buffer_map(ioctx);
> +			if (ret)
> +				goto err;
> +		}
> +	} else if (flags & IOCTX_FLAG_FIXEDBUFS) {
> +		/* can only support fixed bufs with user mapped iocbs */
> +		ret = -EINVAL;
> +		goto err;
>  	}
>  
>  	ret = put_user(ioctx->user_id, ctxp);
> @@ -1939,23 +2068,38 @@ static int aio_prep_rw(struct aio_kiocb *kiocb, const struct iocb *iocb,
>  	return ret;
>  }
>  
> -static int aio_setup_rw(int rw, const struct iocb *iocb, struct iovec **iovec,
> -		bool vectored, bool compat, struct iov_iter *iter)
> +static int aio_setup_rw(int rw, struct aio_kiocb *kiocb,
> +		const struct iocb *iocb, struct iovec **iovec, bool vectored,
> +		bool compat, bool kvecs, struct iov_iter *iter)
>  {
> -	void __user *buf = (void __user *)(uintptr_t)iocb->aio_buf;
> +	void __user *ubuf = (void __user *)(uintptr_t)iocb->aio_buf;
>  	size_t len = iocb->aio_nbytes;
>  
>  	if (!vectored) {
> -		ssize_t ret = import_single_range(rw, buf, len, *iovec, iter);
> +		ssize_t ret;
> +
> +		if (!kvecs) {
> +			ret = import_single_range(rw, ubuf, len, *iovec, iter);
> +		} else {
> +			long index = (long) kiocb->ki_user_iocb;
> +			struct aio_mapped_ubuf *amu;
> +
> +			/* __io_submit_one() already validated the index */
> +			amu = &kiocb->ki_ctx->user_bufs[index];
> +			ret = import_kvec(rw, amu->kvec, amu->nr_kvecs,
> +						len, iter);
> +		}
>  		*iovec = NULL;
>  		return ret;
>  	}
> +	if (kvecs)
> +		return -EINVAL;
>  #ifdef CONFIG_COMPAT
>  	if (compat)
> -		return compat_import_iovec(rw, buf, len, UIO_FASTIOV, iovec,
> +		return compat_import_iovec(rw, ubuf, len, UIO_FASTIOV, iovec,
>  				iter);
>  #endif
> -	return import_iovec(rw, buf, len, UIO_FASTIOV, iovec, iter);
> +	return import_iovec(rw, ubuf, len, UIO_FASTIOV, iovec, iter);
>  }
>  
>  static inline void aio_rw_done(struct kiocb *req, ssize_t ret)
> @@ -2028,7 +2172,7 @@ static void aio_iopoll_iocb_issued(struct aio_submit_state *state,
>  
>  static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
>  			struct aio_submit_state *state, bool vectored,
> -			bool compat)
> +			bool compat, bool kvecs)
>  {
>  	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
>  	struct kiocb *req = &kiocb->rw;
> @@ -2048,9 +2192,11 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
>  	if (unlikely(!file->f_op->read_iter))
>  		goto out_fput;
>  
> -	ret = aio_setup_rw(READ, iocb, &iovec, vectored, compat, &iter);
> +	ret = aio_setup_rw(READ, kiocb, iocb, &iovec, vectored, compat, kvecs,
> +				&iter);
>  	if (ret)
>  		goto out_fput;
> +
>  	ret = rw_verify_area(READ, file, &req->ki_pos, iov_iter_count(&iter));
>  	if (!ret)
>  		aio_rw_done(req, call_read_iter(file, req, &iter));
> @@ -2063,7 +2209,7 @@ static ssize_t aio_read(struct aio_kiocb *kiocb, const struct iocb *iocb,
>  
>  static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb,
>  			 struct aio_submit_state *state, bool vectored,
> -			 bool compat)
> +			 bool compat, bool kvecs)
>  {
>  	struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
>  	struct kiocb *req = &kiocb->rw;
> @@ -2083,7 +2229,8 @@ static ssize_t aio_write(struct aio_kiocb *kiocb, const struct iocb *iocb,
>  	if (unlikely(!file->f_op->write_iter))
>  		goto out_fput;
>  
> -	ret = aio_setup_rw(WRITE, iocb, &iovec, vectored, compat, &iter);
> +	ret = aio_setup_rw(WRITE, kiocb, iocb, &iovec, vectored, compat, kvecs,
> +				&iter);
>  	if (ret)
>  		goto out_fput;
>  	ret = rw_verify_area(WRITE, file, &req->ki_pos, iov_iter_count(&iter));
> @@ -2322,7 +2469,8 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
>  
>  static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
>  			   struct iocb __user *user_iocb,
> -			   struct aio_submit_state *state, bool compat)
> +			   struct aio_submit_state *state, bool compat,
> +			   bool kvecs)
>  {
>  	struct aio_kiocb *req;
>  	ssize_t ret;
> @@ -2382,16 +2530,16 @@ static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
>  	ret = -EINVAL;
>  	switch (iocb->aio_lio_opcode) {
>  	case IOCB_CMD_PREAD:
> -		ret = aio_read(req, iocb, state, false, compat);
> +		ret = aio_read(req, iocb, state, false, compat, kvecs);
>  		break;
>  	case IOCB_CMD_PWRITE:
> -		ret = aio_write(req, iocb, state, false, compat);
> +		ret = aio_write(req, iocb, state, false, compat, kvecs);
>  		break;
>  	case IOCB_CMD_PREADV:
> -		ret = aio_read(req, iocb, state, true, compat);
> +		ret = aio_read(req, iocb, state, true, compat, kvecs);
>  		break;
>  	case IOCB_CMD_PWRITEV:
> -		ret = aio_write(req, iocb, state, true, compat);
> +		ret = aio_write(req, iocb, state, true, compat, kvecs);
>  		break;
>  	case IOCB_CMD_FSYNC:
>  		if (ctx->flags & IOCTX_FLAG_IOPOLL)
> @@ -2443,6 +2591,7 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
>  			 struct aio_submit_state *state, bool compat)
>  {
>  	struct iocb iocb, *iocbp;
> +	bool kvecs;
>  
>  	if (ctx->flags & IOCTX_FLAG_USERIOCB) {
>  		unsigned long iocb_index = (unsigned long) user_iocb;
> @@ -2450,14 +2599,16 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
>  		if (iocb_index >= ctx->max_reqs)
>  			return -EINVAL;
>  
> +		kvecs = (ctx->flags & IOCTX_FLAG_FIXEDBUFS) != 0;
>  		iocbp = aio_iocb_from_index(ctx, iocb_index);
>  	} else {
>  		if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb))))
>  			return -EFAULT;
> +		kvecs = false;
>  		iocbp = &iocb;
>  	}
>  
> -	return __io_submit_one(ctx, iocbp, user_iocb, state, compat);
> +	return __io_submit_one(ctx, iocbp, user_iocb, state, compat, kvecs);
>  }
>  
>  #ifdef CONFIG_BLOCK
> diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h
> index ea0b9a19f4df..05d72cf86bd3 100644
> --- a/include/uapi/linux/aio_abi.h
> +++ b/include/uapi/linux/aio_abi.h
> @@ -110,6 +110,7 @@ struct iocb {
>  
>  #define IOCTX_FLAG_USERIOCB	(1 << 0)	/* iocbs are user mapped */
>  #define IOCTX_FLAG_IOPOLL	(1 << 1)	/* io_context is polled */
> +#define IOCTX_FLAG_FIXEDBUFS	(1 << 2)	/* IO buffers are fixed */
>  
>  #undef IFBIG
>  #undef IFLITTLE

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 27/27] aio: add support for pre-mapped user IO buffers
  2018-11-30 21:44   ` Jeff Moyer
@ 2018-11-30 21:57     ` Jens Axboe
  2018-11-30 22:04       ` Jeff Moyer
  0 siblings, 1 reply; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 21:57 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On 11/30/18 2:44 PM, Jeff Moyer wrote:
> Hi, Jens,
> 
> Jens Axboe <axboe@kernel.dk> writes:
> 
>> If we have fixed user buffers, we can map them into the kernel when we
>> setup the io_context. That avoids the need to do get_user_pages() for
>> each and every IO.
>>
>> To utilize this feature, the application must set both
>> IOCTX_FLAG_USERIOCB, to provide iocb's in userspace, and then
>> IOCTX_FLAG_FIXEDBUFS. The latter tells aio that the iocbs that are
>> mapped already contain valid destination and sizes. These buffers can
>> then be mapped into the kernel for the life time of the io_context, as
>> opposed to just the duration of the each single IO.
>>
>> Only works with non-vectored read/write commands for now, not with
>> PREADV/PWRITEV.
>>
>> A limit of 4M is imposed as the largest buffer we currently support.
>> There's nothing preventing us from going larger, but we need some cap,
>> and 4M seemed like it would definitely be big enough.
> 
> Doesn't this mean that a user can pin a bunch of memory?  Something like
> 4MB * aio_max_nr?
> 
> $ sysctl fs.aio-max-nr
> fs.aio-max-nr = 1048576
> 
> If so, it may be a good idea to account the memory under RLIMIT_MEMLOCK.

Yes, it'll need some kind of limiting, right now the limit would indeed
be aio-max-nr * 4MB. 4G isn't terrible, but...

RLIMIT_MEMLOCK isn't a bad idea.

> I'm not sure how close you are to proposing this patch set for realz.
> If it's soon (now?), then CC-ing linux-api and writing man pages would
> be a good idea.  I can help out with the libaio bits if you'd like.  I
> haven't yet had time to take this stuff for a spin, sorry.  I'll try to
> get to that soonish.

I am proposing it for real, not sure how long it'll take to get it
reviewed and moved forward. Unless I get lucky. 4.22 seems like a more
viable version than 4.21.

I'll take any help I can get on the API/man page parts. And/or testing!

> The speedups are pretty impressive!

That's why I put them in there, maybe that'd get peoples attention :-)

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 27/27] aio: add support for pre-mapped user IO buffers
  2018-11-30 21:57     ` Jens Axboe
@ 2018-11-30 22:04       ` Jeff Moyer
  2018-11-30 22:11         ` Jens Axboe
  0 siblings, 1 reply; 59+ messages in thread
From: Jeff Moyer @ 2018-11-30 22:04 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-fsdevel, linux-aio, hch

Jens Axboe <axboe@kernel.dk> writes:

>>> A limit of 4M is imposed as the largest buffer we currently support.
>>> There's nothing preventing us from going larger, but we need some cap,
>>> and 4M seemed like it would definitely be big enough.
>> 
>> Doesn't this mean that a user can pin a bunch of memory?  Something like
>> 4MB * aio_max_nr?
>> 
>> $ sysctl fs.aio-max-nr
>> fs.aio-max-nr = 1048576
>> 
>> If so, it may be a good idea to account the memory under RLIMIT_MEMLOCK.
>
> Yes, it'll need some kind of limiting, right now the limit would indeed
> be aio-max-nr * 4MB. 4G isn't terrible, but...

Unless my math's wrong, that's 4TiB on my system.  ;-)

> RLIMIT_MEMLOCK isn't a bad idea.
>
>> I'm not sure how close you are to proposing this patch set for realz.
>> If it's soon (now?), then CC-ing linux-api and writing man pages would
>> be a good idea.  I can help out with the libaio bits if you'd like.  I
>> haven't yet had time to take this stuff for a spin, sorry.  I'll try to
>> get to that soonish.
>
> I am proposing it for real, not sure how long it'll take to get it
> reviewed and moved forward. Unless I get lucky. 4.22 seems like a more
> viable version than 4.21.
>
> I'll take any help I can get on the API/man page parts. And/or testing!

OK, I'll add libaio support (including unit tests), write the man page,
and I'll definitely do some testing.  I'll start on all that probably in
the latter half of next week.

>> The speedups are pretty impressive!
>
> That's why I put them in there, maybe that'd get peoples attention :-)

Indeed.  :)

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio
  2018-11-30 21:34               ` Jens Axboe
@ 2018-11-30 22:06                 ` Jens Axboe
  0 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 22:06 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On 11/30/18 2:34 PM, Jens Axboe wrote:
> On 11/30/18 2:25 PM, Al Viro wrote:
>> On Fri, Nov 30, 2018 at 02:16:38PM -0700, Jens Axboe wrote:
>>>>> Would this make you happy:
>>>>>
>>>>> if (!is_vmalloc_addr(kv->iov_base))
>>>>>         page = virt_to_page(kv->iov_base);
>>>>> else
>>>>>         page = vmalloc_to_page(kv->iov_base);
>>>>
>>>> Free advice: don't ever let Linus see anything along those lines.  Results
>>>> tend to be colourful...
>>>
>>> We already have those lines in the kernel, XFS for instance. Al, could you
>>> please try to be helpful instead of being deliberately obtuse?
>>
>> Again, the last time something like that had been suggested, Linus had replied
>> with a very impressive rant.  I *did* propose pretty much that, and reaction
>> was basically "hell no, not in general-purpose primitives".  Precisely about
>> iov_iter stuff.  A part of that was due to touching page refcounts, but quite
>> a bit wasn't.
> 
> Nobody is touching the page count here, and for the aio user mapped IO,
> nobody is touching them at the end either.
> 
> As far as I can tell, the above is fine. It's either a vmalloc'ed
> address and should be treated specially, or we can do virt_to_page() on
> it.
> 
> Do you have a link to said rant?

I found the rant. I think the solution here is to switch it to using
ITER_BVEC instead. With ITER_KVEC, we don't necessarily know if we can
map it; with a bvec we already have the pages. And from the aio point of
view, we know the pages are sane.
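
Roughly, instead of building kvecs at registration time, something like
this (sketch only, modulo the exact iov_iter_bvec() signature):

	struct aio_mapped_ubuf {
		struct bio_vec	*bvec;
		unsigned int	nr_bvecs;
	};

	/* at buffer map time, keep the pinned pages as a bvec table */
	for (j = 0; j < nr_pages; j++) {
		size_t vec_len = min_t(size_t, size, PAGE_SIZE - off);

		amu->bvec[j].bv_page = pages[j];
		amu->bvec[j].bv_len = vec_len;
		amu->bvec[j].bv_offset = off;
		off = 0;
		size -= vec_len;
	}
	amu->nr_bvecs = nr_pages;

	/* and at submit time */
	iov_iter_bvec(iter, rw, amu->bvec, amu->nr_bvecs, len);

That also gets rid of the virt_to_page() games entirely.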

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 27/27] aio: add support for pre-mapped user IO buffers
  2018-11-30 22:04       ` Jeff Moyer
@ 2018-11-30 22:11         ` Jens Axboe
  0 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-11-30 22:11 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On 11/30/18 3:04 PM, Jeff Moyer wrote:
> Jens Axboe <axboe@kernel.dk> writes:
> 
>>>> A limit of 4M is imposed as the largest buffer we currently support.
>>>> There's nothing preventing us from going larger, but we need some cap,
>>>> and 4M seemed like it would definitely be big enough.
>>>
>>> Doesn't this mean that a user can pin a bunch of memory?  Something like
>>> 4MB * aio_max_nr?
>>>
>>> $ sysctl fs.aio-max-nr
>>> fs.aio-max-nr = 1048576
>>>
>>> If so, it may be a good idea to account the memory under RLIMIT_MEMLOCK.
>>
>> Yes, it'll need some kind of limiting, right now the limit would indeed
>> be aio-max-nr * 4MB. 4G isn't terrible, but...
> 
> Unless my math's wrong, that's 4TiB on my system.  ;-)

I guess that's a little more terrible ;-)

>> RLIMIT_MEMLOCK isn't a bad idea.
>>
>>> I'm not sure how close you are to proposing this patch set for realz.
>>> If it's soon (now?), then CC-ing linux-api and writing man pages would
>>> be a good idea.  I can help out with the libaio bits if you'd like.  I
>>> haven't yet had time to take this stuff for a spin, sorry.  I'll try to
>>> get to that soonish.
>>
>> I am proposing it for real, not sure how long it'll take to get it
>> reviewed and moved forward. Unless I get lucky. 4.22 seems like a more
>> viable version than 4.21.
>>
>> I'll take any help I can get on the API/man page parts. And/or testing!
> 
> OK, I'll add libaio support (including unit tests), write the man page,
> and I'll definitely do some testing.  I'll start on all that probably in
> the latter half of next week.

Awesome, that's much appreciated!

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 02/27] aio: clear IOCB_HIPRI
  2018-11-30 17:14     ` Jens Axboe
@ 2018-12-04 14:46       ` Christoph Hellwig
  2018-12-04 16:40         ` Jens Axboe
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2018-12-04 14:46 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Christoph Hellwig, linux-block, linux-fsdevel, linux-aio

On Fri, Nov 30, 2018 at 10:14:31AM -0700, Jens Axboe wrote:
> On 11/30/18 10:13 AM, Christoph Hellwig wrote:
> > I think we'll need to queue this up for 4.21 ASAP independent of the
> > rest, given that with separate poll queues userspace could otherwise
> > submit I/O that will never get polled for anywhere.
> 
> Probably a good idea, I can just move it to my 4.21 branch, it's not
> strictly dependent on the series.

So, can you add it to the 4.21 branch?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 05/27] block: ensure that async polled IO is marked REQ_NOWAIT
  2018-11-30 17:17     ` Jens Axboe
@ 2018-12-04 14:48       ` Christoph Hellwig
  2018-12-04 18:13         ` Jens Axboe
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2018-12-04 14:48 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Bart Van Assche, linux-block, linux-fsdevel, linux-aio, hch

On Fri, Nov 30, 2018 at 10:17:49AM -0700, Jens Axboe wrote:
> > Setting REQ_NOWAIT from inside the block layer will make the code that
> > submits requests harder to review. Have you considered to make this code
> > fail I/O if REQ_NOWAIT has not been set and to require that the context
> > that submits I/O sets REQ_NOWAIT?
> 
> It's technically still feasible to do for sync polled IO, it's only
> the async case that makes it a potential deadlock.

I wonder if we want a REQ_ASYNC_POLL compound flag #define that sets
REQ_POLL and REQ_NOWAIT to make this blindingly obvious.
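
Something along these lines, with REQ_HIPRI being what the series uses
for the polled flag today (sketch only):

	#define REQ_ASYNC_POLL	(REQ_HIPRI | REQ_NOWAIT)

	...
	if (iocb->ki_flags & IOCB_HIPRI)
		bio->bi_opf |= is_sync ? REQ_HIPRI : REQ_ASYNC_POLL;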

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 10/27] aio: don't zero entire aio_kiocb aio_get_req()
  2018-11-30 16:56 ` [PATCH 10/27] aio: don't zero entire aio_kiocb aio_get_req() Jens Axboe
@ 2018-12-04 14:49   ` Christoph Hellwig
  2018-12-04 15:27     ` Jens Axboe
  0 siblings, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2018-12-04 14:49 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-fsdevel, linux-aio, hch

> -	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO);
> -	if (unlikely(!req))
> -		return NULL;
> +	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL);
> +	if (req) {
> +		percpu_ref_get(&ctx->reqs);
> +		req->ki_ctx = ctx;
> +		INIT_LIST_HEAD(&req->ki_list);
> +		refcount_set(&req->ki_refcnt, 0);
> +		req->ki_eventfd = NULL;
> +	}
>  
> -	percpu_ref_get(&ctx->reqs);
> -	INIT_LIST_HEAD(&req->ki_list);
> -	refcount_set(&req->ki_refcnt, 0);
> -	req->ki_ctx = ctx;
>  	return req;

Why the reformatting?  Otherwise this looks fine to me:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 11/27] aio: only use blk plugs for > 2 depth submissions
  2018-11-30 16:56 ` [PATCH 11/27] aio: only use blk plugs for > 2 depth submissions Jens Axboe
@ 2018-12-04 14:50   ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2018-12-04 14:50 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On Fri, Nov 30, 2018 at 09:56:30AM -0700, Jens Axboe wrote:
> Plugging is meant to optimize submission of a string of IOs, if we don't
> have more than 2 being submitted, don't bother setting up a plug.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 12/27] aio: use iocb_put() instead of open coding it
  2018-11-30 16:56 ` [PATCH 12/27] aio: use iocb_put() instead of open coding it Jens Axboe
@ 2018-12-04 14:50   ` Christoph Hellwig
  0 siblings, 0 replies; 59+ messages in thread
From: Christoph Hellwig @ 2018-12-04 14:50 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-fsdevel, linux-aio, hch

On Fri, Nov 30, 2018 at 09:56:31AM -0700, Jens Axboe wrote:
> Replace the percpu_ref_put() + kmem_cache_free() with a call to
> iocb_put() instead.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio
  2018-11-30 19:21   ` Al Viro
  2018-11-30 20:15     ` Jens Axboe
@ 2018-12-04 14:55     ` Christoph Hellwig
  2018-12-04 15:25       ` Jens Axboe
  1 sibling, 1 reply; 59+ messages in thread
From: Christoph Hellwig @ 2018-12-04 14:55 UTC (permalink / raw)
  To: Al Viro; +Cc: Jens Axboe, linux-block, linux-fsdevel, linux-aio, hch

On Fri, Nov 30, 2018 at 07:21:02PM +0000, Al Viro wrote:
> On Fri, Nov 30, 2018 at 09:56:43AM -0700, Jens Axboe wrote:
> > For an ITER_KVEC, we can just iterate the iov and add the pages
> > to the bio directly.
> 
> > +		page = virt_to_page(kv->iov_base);
> > +		size = bio_add_page(bio, page, kv->iov_len,
> > +					offset_in_page(kv->iov_base));
> 
> Who said that you *can* do virt_to_page() on those?  E.g. vmalloc()'ed
> addresses are fine for ITER_KVEC, etc.

In this particular case it seems the caller knows what kind of pages
we have, but we need to properly document this at the very least.

Note that in the completely generic case iov_base could also point to
memory without a struct page at all (e.g. ioremap()ed memory).
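
(To make that constraint concrete, a defensive variant of the quoted loop
could look like the sketch below; the iteration over iter->kvec/nr_segs is
simplified, and the real patch relies on callers passing kmalloc-backed
memory rather than checking for it.)

	const struct kvec *kv = iter->kvec;
	unsigned long i;

	for (i = 0; i < iter->nr_segs; i++, kv++) {
		void *addr = kv->iov_base;

		/*
		 * vmalloc()ed and ioremap()ed addresses have no usable
		 * direct-map struct page, so virt_to_page() would return
		 * garbage for them.
		 */
		if (!virt_addr_valid(addr))
			return -EINVAL;

		if (bio_add_page(bio, virt_to_page(addr), kv->iov_len,
				 offset_in_page(addr)) != kv->iov_len)
			return -EINVAL;
	}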


* Re: [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio
  2018-12-04 14:55     ` Christoph Hellwig
@ 2018-12-04 15:25       ` Jens Axboe
  0 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-12-04 15:25 UTC (permalink / raw)
  To: Christoph Hellwig, Al Viro; +Cc: linux-block, linux-fsdevel, linux-aio

On 12/4/18 7:55 AM, Christoph Hellwig wrote:
> On Fri, Nov 30, 2018 at 07:21:02PM +0000, Al Viro wrote:
>> On Fri, Nov 30, 2018 at 09:56:43AM -0700, Jens Axboe wrote:
>>> For an ITER_KVEC, we can just iterate the iov and add the pages
>>> to the bio directly.
>>
>>> +		page = virt_to_page(kv->iov_base);
>>> +		size = bio_add_page(bio, page, kv->iov_len,
>>> +					offset_in_page(kv->iov_base));
>>
>> Who said that you *can* do virt_to_page() on those?  E.g. vmalloc()'ed
>> addresses are fine for ITER_KVEC, etc.
> 
> In this particular case it seems the caller knows what kind of pages
> we have, but we need to properly document this at the very least.
> 
> Note that in the completely generic case iov_base could also point to
> memory without a struct page at all (e.g. ioremap()ed memory).

That's why I went to ITER_BVEC instead; that's much saner for this
use case.
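
(For illustration, the bvec-based path boils down to something like the
sketch below; bvecs and nr_segs are placeholder names here, not the patch's
actual variables.)

	/*
	 * With ITER_BVEC the caller describes struct pages it already owns,
	 * so the bio is built from real pages instead of kernel virtual
	 * addresses run through virt_to_page().
	 */
	for (i = 0; i < nr_segs; i++) {
		const struct bio_vec *bv = &bvecs[i];

		if (bio_add_page(bio, bv->bv_page, bv->bv_len,
				 bv->bv_offset) != bv->bv_len)
			return -EINVAL;
	}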

-- 
Jens Axboe


* Re: [PATCH 10/27] aio: don't zero entire aio_kiocb aio_get_req()
  2018-12-04 14:49   ` Christoph Hellwig
@ 2018-12-04 15:27     ` Jens Axboe
  0 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-12-04 15:27 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block, linux-fsdevel, linux-aio

On 12/4/18 7:49 AM, Christoph Hellwig wrote:
>> -	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL|__GFP_ZERO);
>> -	if (unlikely(!req))
>> -		return NULL;
>> +	req = kmem_cache_alloc(kiocb_cachep, GFP_KERNEL);
>> +	if (req) {
>> +		percpu_ref_get(&ctx->reqs);
>> +		req->ki_ctx = ctx;
>> +		INIT_LIST_HEAD(&req->ki_list);
>> +		refcount_set(&req->ki_refcnt, 0);
>> +		req->ki_eventfd = NULL;
>> +	}
>>  
>> -	percpu_ref_get(&ctx->reqs);
>> -	INIT_LIST_HEAD(&req->ki_list);
>> -	refcount_set(&req->ki_refcnt, 0);
>> -	req->ki_ctx = ctx;
>>  	return req;
> 
> Why the reformatting?  Otherwise this looks fine to me:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Probably just the (over) abuse of likely/unlikely in aio.c. I can get
rid of it.

-- 
Jens Axboe


* Re: [PATCH 02/27] aio: clear IOCB_HIPRI
  2018-12-04 14:46       ` Christoph Hellwig
@ 2018-12-04 16:40         ` Jens Axboe
  0 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-12-04 16:40 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block, linux-fsdevel, linux-aio

On 12/4/18 7:46 AM, Christoph Hellwig wrote:
> On Fri, Nov 30, 2018 at 10:14:31AM -0700, Jens Axboe wrote:
>> On 11/30/18 10:13 AM, Christoph Hellwig wrote:
>>> I think we'll need to queue this up for 4.21 ASAP independent of the
>>> rest, given that with separate poll queues userspace could otherwise
>>> submit I/O that will never get polled for anywhere.
>>
>> Probably a good idea, I can just move it to my 4.21 branch, it's not
>> strictly dependent on the series.
> 
> So, can you add it to the 4.21 branch?

Done

-- 
Jens Axboe


* Re: [PATCH 05/27] block: ensure that async polled IO is marked REQ_NOWAIT
  2018-12-04 14:48       ` Christoph Hellwig
@ 2018-12-04 18:13         ` Jens Axboe
  0 siblings, 0 replies; 59+ messages in thread
From: Jens Axboe @ 2018-12-04 18:13 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Bart Van Assche, linux-block, linux-fsdevel, linux-aio

On 12/4/18 7:48 AM, Christoph Hellwig wrote:
> On Fri, Nov 30, 2018 at 10:17:49AM -0700, Jens Axboe wrote:
>>> Setting REQ_NOWAIT from inside the block layer will make the code that
>>> submits requests harder to review. Have you considered to make this code
>>> fail I/O if REQ_NOWAIT has not been set and to require that the context
>>> that submits I/O sets REQ_NOWAIT?
>>
>> It's technically still feasible to do for sync polled IO, it's only
>> the async case that makes it a potential deadlock.
> 
> I wonder if we want a REQ_ASYNC_POLL compound flag #define that sets
> REQ_POLL and REQ_NOWAIT to make this blindingly obvious.

Yeah, that might make sense; all the async cases should certainly use it,
and sync can keep using REQ_POLL. I'll add that and fold it in where I can.

-- 
Jens Axboe


end of thread, other threads:[~2018-12-04 18:13 UTC | newest]

Thread overview: 59+ messages
2018-11-30 16:56 [PATCHSET v4] Support for polled aio Jens Axboe
2018-11-30 16:56 ` [PATCH 01/27] aio: fix failure to put the file pointer Jens Axboe
2018-11-30 17:07   ` Bart Van Assche
2018-11-30 17:08     ` Jens Axboe
2018-11-30 17:24       ` Bart Van Assche
2018-11-30 16:56 ` [PATCH 02/27] aio: clear IOCB_HIPRI Jens Axboe
2018-11-30 17:13   ` Christoph Hellwig
2018-11-30 17:14     ` Jens Axboe
2018-12-04 14:46       ` Christoph Hellwig
2018-12-04 16:40         ` Jens Axboe
2018-11-30 16:56 ` [PATCH 03/27] fs: add an iopoll method to struct file_operations Jens Axboe
2018-11-30 16:56 ` [PATCH 04/27] block: wire up block device iopoll method Jens Axboe
2018-11-30 16:56 ` [PATCH 05/27] block: ensure that async polled IO is marked REQ_NOWAIT Jens Axboe
2018-11-30 17:12   ` Bart Van Assche
2018-11-30 17:17     ` Jens Axboe
2018-12-04 14:48       ` Christoph Hellwig
2018-12-04 18:13         ` Jens Axboe
2018-11-30 16:56 ` [PATCH 06/27] iomap: wire up the iopoll method Jens Axboe
2018-11-30 16:56 ` [PATCH 07/27] iomap: ensure that async polled IO is marked REQ_NOWAIT Jens Axboe
2018-11-30 16:56 ` [PATCH 08/27] aio: use assigned completion handler Jens Axboe
2018-11-30 16:56 ` [PATCH 09/27] aio: separate out ring reservation from req allocation Jens Axboe
2018-11-30 16:56 ` [PATCH 10/27] aio: don't zero entire aio_kiocb aio_get_req() Jens Axboe
2018-12-04 14:49   ` Christoph Hellwig
2018-12-04 15:27     ` Jens Axboe
2018-11-30 16:56 ` [PATCH 11/27] aio: only use blk plugs for > 2 depth submissions Jens Axboe
2018-12-04 14:50   ` Christoph Hellwig
2018-11-30 16:56 ` [PATCH 12/27] aio: use iocb_put() instead of open coding it Jens Axboe
2018-12-04 14:50   ` Christoph Hellwig
2018-11-30 16:56 ` [PATCH 13/27] aio: split out iocb copy from io_submit_one() Jens Axboe
2018-11-30 16:56 ` [PATCH 14/27] aio: abstract out io_event filler helper Jens Axboe
2018-11-30 16:56 ` [PATCH 15/27] aio: add io_setup2() system call Jens Axboe
2018-11-30 16:56 ` [PATCH 16/27] aio: add support for having user mapped iocbs Jens Axboe
2018-11-30 16:56 ` [PATCH 17/27] aio: support for IO polling Jens Axboe
2018-11-30 16:56 ` [PATCH 18/27] aio: add submission side request cache Jens Axboe
2018-11-30 16:56 ` [PATCH 19/27] fs: add fget_many() and fput_many() Jens Axboe
2018-11-30 16:56 ` [PATCH 20/27] aio: use fget/fput_many() for file references Jens Axboe
2018-11-30 16:56 ` [PATCH 21/27] aio: split iocb init from allocation Jens Axboe
2018-11-30 16:56 ` [PATCH 22/27] aio: batch aio_kiocb allocation Jens Axboe
2018-11-30 16:56 ` [PATCH 23/27] block: add BIO_HOLD_PAGES flag Jens Axboe
2018-11-30 16:56 ` [PATCH 24/27] block: implement bio helper to add iter kvec pages to bio Jens Axboe
2018-11-30 19:21   ` Al Viro
2018-11-30 20:15     ` Jens Axboe
2018-11-30 20:32       ` Jens Axboe
2018-11-30 21:11         ` Al Viro
2018-11-30 21:16           ` Jens Axboe
2018-11-30 21:25             ` Al Viro
2018-11-30 21:34               ` Jens Axboe
2018-11-30 22:06                 ` Jens Axboe
2018-12-04 14:55     ` Christoph Hellwig
2018-12-04 15:25       ` Jens Axboe
2018-11-30 16:56 ` [PATCH 25/27] fs: add support for mapping an ITER_KVEC for O_DIRECT Jens Axboe
2018-11-30 16:56 ` [PATCH 26/27] iov_iter: add import_kvec() Jens Axboe
2018-11-30 19:17   ` Al Viro
2018-11-30 20:15     ` Jens Axboe
2018-11-30 16:56 ` [PATCH 27/27] aio: add support for pre-mapped user IO buffers Jens Axboe
2018-11-30 21:44   ` Jeff Moyer
2018-11-30 21:57     ` Jens Axboe
2018-11-30 22:04       ` Jeff Moyer
2018-11-30 22:11         ` Jens Axboe
