* [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
From: Dan Williams @ 2017-10-12  0:47 UTC
  To: linux-nvdimm
  Cc: J. Bruce Fields, Jan Kara, Andrew Morton, Arnd Bergmann,
	Darrick J. Wong, linux-api, Dave Chinner, linux-xfs, linux-mm,
	Al Viro, Andy Lutomirski, Jeff Layton, linux-fsdevel,
	Linus Torvalds, Christoph Hellwig

Changes since v8 [1]:
* Move MAP_SHARED_VALIDATE definition next to MAP_SHARED in all arch
  headers (Jan)

* Include xfs_layout.h directly in all the files that call
  xfs_break_layouts() (Dave)

* Clarify / add more comments to the MAP_DIRECT checks at fault time
  (Dave)

* Rename iomap_can_allocate() to break_layout_nowait() to make it plain
  why we are bailing out of iomap_begin().

* Defer the lease_direct mechanism and RDMA core changes to a later
  patch series.

* EXT4 support is in the works and will be rebased on Jan's MAP_SYNC
  patches.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2017-October/012772.html

---

MAP_DIRECT is a mechanism that allows an application to establish a
mapping where the kernel will not change the file's block map, or
otherwise dirty its block-map metadata, without first notifying the
application. It supports a "flush from userspace" model in which
persistent memory applications can bypass the overhead of ongoing write
coordination with the filesystem, and it provides safety for RDMA
operations involving DAX mappings.

The kernel always retains the ability to revoke access and convert the
file back to normal operation after performing a "lease break". As with
fcntl leases, there is no way for userspace to cancel the lease-break
process once it has started; it can only be delayed via the
/proc/sys/fs/lease-break-time setting.
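
For reference, MAP_DIRECT revocation follows the same flow as an
existing fcntl(2) write lease. The sketch below uses only the stock
F_SETLEASE API, nothing MAP_DIRECT-specific, and the /mnt/dax/file path
is purely illustrative: the holder is notified via SIGIO and must give
the lease up before lease-break-time expires.

#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t lease_broken;

static void on_lease_break(int sig)
{
        (void)sig;
        lease_broken = 1;       /* the kernel has started reclaiming the lease */
}

int main(void)
{
        int fd = open("/mnt/dax/file", O_RDWR); /* hypothetical path */

        if (fd < 0)
                return 1;
        signal(SIGIO, on_lease_break);          /* default lease-break signal */
        if (fcntl(fd, F_SETLEASE, F_WRLCK) < 0)
                return 1;

        while (!lease_broken)
                pause();        /* do work until the kernel breaks the lease */

        /*
         * The break cannot be cancelled; the holder has at most
         * /proc/sys/fs/lease-break-time seconds to release the lease.
         */
        fcntl(fd, F_SETLEASE, F_UNLCK);
        close(fd);
        return 0;
}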

MAP_DIRECT enables XFS to supplant the device-dax interface for
mmap-write access to persistent memory, with no ongoing fsync/msync
coordination with the filesystem.

The MAP_DIRECT mechanism is complementary to MAP_SYNC. Here are some
scenarios where you would choose one over the other (a usage sketch
follows the list):

* 3rd party DMA / RDMA to DAX with hardware that does not support
  on-demand paging (shared virtual memory) => MAP_DIRECT

* Support for reflinked inodes, fallocate-punch-hole, truncate, or any
  other operation that mutates the block map of an actively
  mapped file => MAP_SYNC

* Userspace flush => MAP_SYNC or MAP_DIRECT

* Assurances that the file's block-map metadata is stable, i.e. minimizing
  worst-case fault latency by locking out updates => MAP_DIRECT
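
A rough usage sketch (illustrative only; the MAP_DIRECT value below is a
placeholder and the map_pmem() helper is hypothetical, a real build
would take the definitions from this series' uapi headers): request a
MAP_DIRECT mapping via MAP_SHARED_VALIDATE and fall back to a plain
MAP_SHARED mapping plus fsync/msync if the kernel or filesystem rejects
the flag.

#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_DIRECT
#define MAP_DIRECT 0x100000     /* placeholder value, for illustration only */
#endif

static void *map_pmem(int fd, size_t len, int *direct)
{
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_SHARED_VALIDATE | MAP_DIRECT, fd, 0);

        if (p != MAP_FAILED) {
                *direct = 1;    /* block map is pinned; flush stores from userspace */
                return p;
        }
        if (errno != EOPNOTSUPP && errno != EINVAL)
                return MAP_FAILED;

        /* Older kernel / unsupported fs: keep using fsync()/msync() */
        *direct = 0;
        return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}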

---

Dan Williams (6):
      mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags
      fs, mm: pass fd to ->mmap_validate()
      fs: MAP_DIRECT core
      xfs: prepare xfs_break_layouts() for reuse with MAP_DIRECT
      fs, xfs, iomap: introduce break_layout_nowait()
      xfs: wire up MAP_DIRECT


 arch/alpha/include/uapi/asm/mman.h           |    1 
 arch/mips/include/uapi/asm/mman.h            |    1 
 arch/mips/kernel/vdso.c                      |    2 
 arch/parisc/include/uapi/asm/mman.h          |    1 
 arch/tile/mm/elf.c                           |    3 
 arch/x86/mm/mpx.c                            |    3 
 arch/xtensa/include/uapi/asm/mman.h          |    1 
 fs/Kconfig                                   |    1 
 fs/Makefile                                  |    2 
 fs/aio.c                                     |    2 
 fs/mapdirect.c                               |  237 ++++++++++++++++++++++++++
 fs/xfs/Kconfig                               |    4 
 fs/xfs/Makefile                              |    1 
 fs/xfs/xfs_file.c                            |  108 ++++++++++++
 fs/xfs/xfs_ioctl.c                           |    1 
 fs/xfs/xfs_iomap.c                           |    3 
 fs/xfs/xfs_iops.c                            |    1 
 fs/xfs/xfs_layout.c                          |   45 +++++
 fs/xfs/xfs_layout.h                          |   13 +
 fs/xfs/xfs_pnfs.c                            |   31 ---
 fs/xfs/xfs_pnfs.h                            |    8 -
 include/linux/fs.h                           |   11 +
 include/linux/mapdirect.h                    |   40 ++++
 include/linux/mm.h                           |    9 +
 include/linux/mman.h                         |   42 +++++
 include/uapi/asm-generic/mman-common.h       |    1 
 include/uapi/asm-generic/mman.h              |    1 
 ipc/shm.c                                    |    3 
 mm/internal.h                                |    2 
 mm/mmap.c                                    |   28 ++-
 mm/nommu.c                                   |    5 -
 mm/util.c                                    |    7 -
 tools/include/uapi/asm-generic/mman-common.h |    1 
 33 files changed, 557 insertions(+), 62 deletions(-)
 create mode 100644 fs/mapdirect.c
 create mode 100644 fs/xfs/xfs_layout.c
 create mode 100644 fs/xfs/xfs_layout.h
 create mode 100644 include/linux/mapdirect.h

* [PATCH v9 1/6] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags
From: Dan Williams @ 2017-10-12  0:47 UTC
  To: linux-nvdimm
  Cc: Jan Kara, Arnd Bergmann, linux-api, linux-xfs, linux-mm,
	Andy Lutomirski, linux-fsdevel, Andrew Morton, Linus Torvalds,
	Christoph Hellwig

The mmap(2) syscall suffers from the ABI anti-pattern of not validating
unknown flags. However, proposals like MAP_SYNC and MAP_DIRECT need a
mechanism to define new behavior that is known to fail on older kernels
that lack support for it. Define a new MAP_SHARED_VALIDATE flag pattern
that is guaranteed to fail on all legacy mmap implementations.
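
A minimal userspace sketch of what that guarantee buys (the helper below
is illustrative, not part of this patch): on a kernel with this change,
MAP_SHARED_VALIDATE with only legacy flags behaves like MAP_SHARED,
while a legacy kernel rejects the unknown map type with EINVAL, so
support can be probed before relying on any new validated flag.

#include <errno.h>
#include <sys/mman.h>

#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif

/* Returns 1 if MAP_SHARED_VALIDATE is understood, 0 on a legacy kernel. */
static int have_map_shared_validate(int fd)
{
        void *p = mmap(NULL, 4096, PROT_READ, MAP_SHARED_VALIDATE, fd, 0);

        if (p == MAP_FAILED)
                return errno == EINVAL ? 0 : -1;        /* EINVAL => legacy kernel */
        munmap(p, 4096);
        return 1;
}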

It is worth noting that the original proposal was for a standalone
MAP_VALIDATE flag. However, when that could not be supported by all
archs, Linus observed:

    I see why you *think* you want a bitmap. You think you want
    a bitmap because you want to make MAP_VALIDATE be part of MAP_SYNC
    etc, so that people can do

    ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED
		    | MAP_SYNC, fd, 0);

    and "know" that MAP_SYNC actually takes.

    And I'm saying that whole wish is bogus. You're fundamentally
    depending on special semantics, just make it explicit. It's already
    not portable, so don't try to make it so.

    Rename that MAP_VALIDATE as MAP_SHARED_VALIDATE, make it have a value
    of 0x3, and make people do

    ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED_VALIDATE
		    | MAP_SYNC, fd, 0);

    and then the kernel side is easier too (none of that random garbage
    playing games with looking at the "MAP_VALIDATE bit", but just another
    case statement in that map type thing.

    Boom. Done.

Similar to ->fallocate(), we also want the ability to validate support
for new flags on a per 'struct file_operations' instance basis. Towards
that end, arrange for flags to be validated by a new ->mmap_validate()
operation in 'struct file_operations'. By default all existing flags are
implicitly supported, but new flags require MAP_SHARED_VALIDATE and a
per-instance opt-in.
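
For reference, a hedged sketch of the opt-in side for a driver or
filesystem (MAP_FOO and the foo_* identifiers are hypothetical and not
part of this series): unknown flags are refused with -EOPNOTSUPP, and
recognized ones are handled before falling back to the regular ->mmap()
path.

#include <linux/errno.h>
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/mman.h>

#define MAP_FOO 0x200000        /* hypothetical new mmap flag */

static int foo_mmap(struct file *file, struct vm_area_struct *vma)
{
        /* normal ->mmap() setup would go here */
        return 0;
}

static int foo_mmap_validate(struct file *file, struct vm_area_struct *vma,
                unsigned long map_flags)
{
        if (map_flags & ~(LEGACY_MAP_MASK | MAP_FOO))
                return -EOPNOTSUPP;     /* unknown flag: refuse the mapping */
        if (map_flags & MAP_FOO) {
                /* arm whatever per-vma semantics MAP_FOO implies */
        }
        return foo_mmap(file, vma);
}

static const struct file_operations foo_fops = {
        .mmap           = foo_mmap,
        .mmap_validate  = foo_mmap_validate,
};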

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/alpha/include/uapi/asm/mman.h           |    1 +
 arch/mips/include/uapi/asm/mman.h            |    1 +
 arch/mips/kernel/vdso.c                      |    2 +
 arch/parisc/include/uapi/asm/mman.h          |    1 +
 arch/tile/mm/elf.c                           |    3 +-
 arch/xtensa/include/uapi/asm/mman.h          |    1 +
 include/linux/fs.h                           |    2 +
 include/linux/mm.h                           |    2 +
 include/linux/mman.h                         |   39 ++++++++++++++++++++++++++
 include/uapi/asm-generic/mman-common.h       |    1 +
 mm/mmap.c                                    |   21 ++++++++++++--
 tools/include/uapi/asm-generic/mman-common.h |    1 +
 12 files changed, 69 insertions(+), 6 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
index 3b26cc62dadb..f85f18ffbf8c 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -11,6 +11,7 @@
 
 #define MAP_SHARED	0x01		/* Share changes */
 #define MAP_PRIVATE	0x02		/* Changes are private */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 #define MAP_TYPE	0x0f		/* Mask for type of mapping (OSF/1 is _wrong_) */
 #define MAP_FIXED	0x100		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x10		/* don't use a file */
diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
index da3216007fe0..054314bb062a 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -28,6 +28,7 @@
  */
 #define MAP_SHARED	0x001		/* Share changes */
 #define MAP_PRIVATE	0x002		/* Changes are private */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 #define MAP_TYPE	0x00f		/* Mask for type of mapping */
 #define MAP_FIXED	0x010		/* Interpret addr exactly */
 
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 019035d7225c..cf10654477a9 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -110,7 +110,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	base = mmap_region(NULL, STACK_TOP, PAGE_SIZE,
 			   VM_READ|VM_WRITE|VM_EXEC|
 			   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
-			   0, NULL);
+			   0, NULL, 0);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
index 775b5d5e41a1..a66fdb9c4b6d 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -11,6 +11,7 @@
 
 #define MAP_SHARED	0x01		/* Share changes */
 #define MAP_PRIVATE	0x02		/* Changes are private */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 #define MAP_TYPE	0x03		/* Mask for type of mapping */
 #define MAP_FIXED	0x04		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x10		/* don't use a file */
diff --git a/arch/tile/mm/elf.c b/arch/tile/mm/elf.c
index 889901824400..5ffcbe76aef9 100644
--- a/arch/tile/mm/elf.c
+++ b/arch/tile/mm/elf.c
@@ -143,7 +143,8 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
 		unsigned long addr = MEM_USER_INTRPT;
 		addr = mmap_region(NULL, addr, INTRPT_SIZE,
 				   VM_READ|VM_EXEC|
-				   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0, NULL);
+				   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0,
+				   NULL, 0);
 		if (addr > (unsigned long) -PAGE_SIZE)
 			retval = (int) addr;
 	}
diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index b15b278aa314..875b0e6f7499 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -35,6 +35,7 @@
  */
 #define MAP_SHARED	0x001		/* Share changes */
 #define MAP_PRIVATE	0x002		/* Changes are private */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 #define MAP_TYPE	0x00f		/* Mask for type of mapping */
 #define MAP_FIXED	0x010		/* Interpret addr exactly */
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 13dab191a23e..5aee97d64cae 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1701,6 +1701,8 @@ struct file_operations {
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
+	int (*mmap_validate) (struct file *, struct vm_area_struct *,
+			unsigned long);
 	int (*open) (struct inode *, struct file *);
 	int (*flush) (struct file *, fl_owner_t id);
 	int (*release) (struct inode *, struct file *);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 065d99deb847..38f6ed954dde 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2133,7 +2133,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
 
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-	struct list_head *uf);
+	struct list_head *uf, unsigned long map_flags);
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
diff --git a/include/linux/mman.h b/include/linux/mman.h
index c8367041fafd..94b63b4d71ff 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -7,6 +7,45 @@
 #include <linux/atomic.h>
 #include <uapi/linux/mman.h>
 
+/*
+ * Arrange for legacy / undefined architecture specific flags to be
+ * ignored by default in LEGACY_MAP_MASK.
+ */
+#ifndef MAP_32BIT
+#define MAP_32BIT 0
+#endif
+#ifndef MAP_HUGE_2MB
+#define MAP_HUGE_2MB 0
+#endif
+#ifndef MAP_HUGE_1GB
+#define MAP_HUGE_1GB 0
+#endif
+#ifndef MAP_UNINITIALIZED
+#define MAP_UNINITIALIZED 0
+#endif
+
+/*
+ * The historical set of flags that all mmap implementations implicitly
+ * support when a ->mmap_validate() op is not provided in file_operations.
+ */
+#define LEGACY_MAP_MASK (MAP_SHARED \
+		| MAP_PRIVATE \
+		| MAP_FIXED \
+		| MAP_ANONYMOUS \
+		| MAP_DENYWRITE \
+		| MAP_EXECUTABLE \
+		| MAP_UNINITIALIZED \
+		| MAP_GROWSDOWN \
+		| MAP_LOCKED \
+		| MAP_NORESERVE \
+		| MAP_POPULATE \
+		| MAP_NONBLOCK \
+		| MAP_STACK \
+		| MAP_HUGETLB \
+		| MAP_32BIT \
+		| MAP_HUGE_2MB \
+		| MAP_HUGE_1GB)
+
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
 extern unsigned long sysctl_overcommit_kbytes;
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 203268f9231e..debd98c2eb83 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -16,6 +16,7 @@
 
 #define MAP_SHARED	0x01		/* Share changes */
 #define MAP_PRIVATE	0x02		/* Changes are private */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 #define MAP_TYPE	0x0f		/* Mask for type of mapping */
 #define MAP_FIXED	0x10		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x20		/* don't use a file */
diff --git a/mm/mmap.c b/mm/mmap.c
index 680506faceae..2649c00581a0 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1389,6 +1389,18 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		struct inode *inode = file_inode(file);
 
 		switch (flags & MAP_TYPE) {
+		case MAP_SHARED_VALIDATE:
+			if ((flags & ~LEGACY_MAP_MASK) == 0) {
+				/*
+				 * If all legacy mmap flags, downgrade
+				 * to MAP_SHARED, i.e. invoke ->mmap()
+				 * instead of ->mmap_validate()
+				 */
+				flags &= ~MAP_TYPE;
+				flags |= MAP_SHARED;
+			} else if (!file->f_op->mmap_validate)
+				return -EOPNOTSUPP;
+			/* fall through */
 		case MAP_SHARED:
 			if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
 				return -EACCES;
@@ -1465,7 +1477,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, flags);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -1602,7 +1614,7 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
 
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf)
+		struct list_head *uf, unsigned long map_flags)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
@@ -1687,7 +1699,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		 * new file must not have been exposed to user-space, yet.
 		 */
 		vma->vm_file = get_file(file);
-		error = call_mmap(file, vma);
+		if ((map_flags & MAP_TYPE) == MAP_SHARED_VALIDATE)
+			error = file->f_op->mmap_validate(file, vma, map_flags);
+		else
+			error = call_mmap(file, vma);
 		if (error)
 			goto unmap_and_free_vma;
 
diff --git a/tools/include/uapi/asm-generic/mman-common.h b/tools/include/uapi/asm-generic/mman-common.h
index 203268f9231e..debd98c2eb83 100644
--- a/tools/include/uapi/asm-generic/mman-common.h
+++ b/tools/include/uapi/asm-generic/mman-common.h
@@ -16,6 +16,7 @@
 
 #define MAP_SHARED	0x01		/* Share changes */
 #define MAP_PRIVATE	0x02		/* Changes are private */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 #define MAP_TYPE	0x0f		/* Mask for type of mapping */
 #define MAP_FIXED	0x10		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x20		/* don't use a file */

@@ -7,6 +7,45 @@
 #include <linux/atomic.h>
 #include <uapi/linux/mman.h>
 
+/*
+ * Arrange for legacy / undefined architecture specific flags to be
+ * ignored by default in LEGACY_MAP_MASK.
+ */
+#ifndef MAP_32BIT
+#define MAP_32BIT 0
+#endif
+#ifndef MAP_HUGE_2MB
+#define MAP_HUGE_2MB 0
+#endif
+#ifndef MAP_HUGE_1GB
+#define MAP_HUGE_1GB 0
+#endif
+#ifndef MAP_UNINITIALIZED
+#define MAP_UNINITIALIZED 0
+#endif
+
+/*
+ * The historical set of flags that all mmap implementations implicitly
+ * support when a ->mmap_validate() op is not provided in file_operations.
+ */
+#define LEGACY_MAP_MASK (MAP_SHARED \
+		| MAP_PRIVATE \
+		| MAP_FIXED \
+		| MAP_ANONYMOUS \
+		| MAP_DENYWRITE \
+		| MAP_EXECUTABLE \
+		| MAP_UNINITIALIZED \
+		| MAP_GROWSDOWN \
+		| MAP_LOCKED \
+		| MAP_NORESERVE \
+		| MAP_POPULATE \
+		| MAP_NONBLOCK \
+		| MAP_STACK \
+		| MAP_HUGETLB \
+		| MAP_32BIT \
+		| MAP_HUGE_2MB \
+		| MAP_HUGE_1GB)
+
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
 extern unsigned long sysctl_overcommit_kbytes;
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 203268f9231e..debd98c2eb83 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -16,6 +16,7 @@
 
 #define MAP_SHARED	0x01		/* Share changes */
 #define MAP_PRIVATE	0x02		/* Changes are private */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 #define MAP_TYPE	0x0f		/* Mask for type of mapping */
 #define MAP_FIXED	0x10		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x20		/* don't use a file */
diff --git a/mm/mmap.c b/mm/mmap.c
index 680506faceae..2649c00581a0 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1389,6 +1389,18 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		struct inode *inode = file_inode(file);
 
 		switch (flags & MAP_TYPE) {
+		case MAP_SHARED_VALIDATE:
+			if ((flags & ~LEGACY_MAP_MASK) == 0) {
+				/*
+				 * If all legacy mmap flags, downgrade
+				 * to MAP_SHARED, i.e. invoke ->mmap()
+				 * instead of ->mmap_validate()
+				 */
+				flags &= ~MAP_TYPE;
+				flags |= MAP_SHARED;
+			} else if (!file->f_op->mmap_validate)
+				return -EOPNOTSUPP;
+			/* fall through */
 		case MAP_SHARED:
 			if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
 				return -EACCES;
@@ -1465,7 +1477,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, flags);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -1602,7 +1614,7 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
 
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf)
+		struct list_head *uf, unsigned long map_flags)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
@@ -1687,7 +1699,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		 * new file must not have been exposed to user-space, yet.
 		 */
 		vma->vm_file = get_file(file);
-		error = call_mmap(file, vma);
+		if ((map_flags & MAP_TYPE) == MAP_SHARED_VALIDATE)
+			error = file->f_op->mmap_validate(file, vma, map_flags);
+		else
+			error = call_mmap(file, vma);
 		if (error)
 			goto unmap_and_free_vma;
 
diff --git a/tools/include/uapi/asm-generic/mman-common.h b/tools/include/uapi/asm-generic/mman-common.h
index 203268f9231e..debd98c2eb83 100644
--- a/tools/include/uapi/asm-generic/mman-common.h
+++ b/tools/include/uapi/asm-generic/mman-common.h
@@ -16,6 +16,7 @@
 
 #define MAP_SHARED	0x01		/* Share changes */
 #define MAP_PRIVATE	0x02		/* Changes are private */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 #define MAP_TYPE	0x0f		/* Mask for type of mapping */
 #define MAP_FIXED	0x10		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x20		/* don't use a file */

^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [PATCH v9 2/6] fs, mm: pass fd to ->mmap_validate()
  2017-10-12  0:47 ` Dan Williams
@ 2017-10-12  0:47   ` Dan Williams
  -1 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12  0:47 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jan Kara, Darrick J. Wong, linux-api, Dave Chinner,
	Christoph Hellwig, linux-xfs, linux-mm, Jeff Moyer,
	linux-fsdevel, Andrew Morton, Ross Zwisler

The MAP_DIRECT mechanism for mmap intends to use a file lease to prevent
block map changes while the file is mapped. It requires the fd in order
to set up an fasync_struct for signalling lease-break events to the
lease holder.
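
For context, a purely illustrative userspace sketch of the other side of
that notification: the lease holder is expected to catch the SIGIO
delivered against the fd it passed to mmap() and stop depending on the
MAP_DIRECT guarantees. The handler body and the absence of any siginfo
decoding below are assumptions, not part of this patch:

    #include <signal.h>
    #include <string.h>

    static volatile sig_atomic_t lease_broken;

    static void sigio_handler(int sig, siginfo_t *info, void *ctx)
    {
    	/* the kernel is breaking the lease; stop trusting MAP_DIRECT */
    	lease_broken = 1;
    }

    static int watch_lease_break(void)
    {
    	struct sigaction sa;

    	memset(&sa, 0, sizeof(sa));
    	sigemptyset(&sa.sa_mask);
    	sa.sa_sigaction = sigio_handler;
    	sa.sa_flags = SA_SIGINFO;

    	return sigaction(SIGIO, &sa, NULL);
    }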

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/mips/kernel/vdso.c |    2 +-
 arch/tile/mm/elf.c      |    2 +-
 arch/x86/mm/mpx.c       |    3 ++-
 fs/aio.c                |    2 +-
 include/linux/fs.h      |    2 +-
 include/linux/mm.h      |    9 +++++----
 ipc/shm.c               |    3 ++-
 mm/internal.h           |    2 +-
 mm/mmap.c               |   13 +++++++------
 mm/nommu.c              |    5 +++--
 mm/util.c               |    7 ++++---
 11 files changed, 28 insertions(+), 22 deletions(-)

diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index cf10654477a9..ab26c7ac0316 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -110,7 +110,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	base = mmap_region(NULL, STACK_TOP, PAGE_SIZE,
 			   VM_READ|VM_WRITE|VM_EXEC|
 			   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
-			   0, NULL, 0);
+			   0, NULL, 0, -1);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
diff --git a/arch/tile/mm/elf.c b/arch/tile/mm/elf.c
index 5ffcbe76aef9..61a9588e141a 100644
--- a/arch/tile/mm/elf.c
+++ b/arch/tile/mm/elf.c
@@ -144,7 +144,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
 		addr = mmap_region(NULL, addr, INTRPT_SIZE,
 				   VM_READ|VM_EXEC|
 				   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0,
-				   NULL, 0);
+				   NULL, 0, -1);
 		if (addr > (unsigned long) -PAGE_SIZE)
 			retval = (int) addr;
 	}
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 9ceaa955d2ba..a8baa94a496b 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -52,7 +52,8 @@ static unsigned long mpx_mmap(unsigned long len)
 
 	down_write(&mm->mmap_sem);
 	addr = do_mmap(NULL, 0, len, PROT_READ | PROT_WRITE,
-		       MAP_ANONYMOUS | MAP_PRIVATE, VM_MPX, 0, &populate, NULL);
+			MAP_ANONYMOUS | MAP_PRIVATE, VM_MPX, 0, &populate,
+			NULL, -1);
 	up_write(&mm->mmap_sem);
 	if (populate)
 		mm_populate(addr, populate);
diff --git a/fs/aio.c b/fs/aio.c
index 5a2487217072..d10ca6db2ee6 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -519,7 +519,7 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events)
 
 	ctx->mmap_base = do_mmap_pgoff(ctx->aio_ring_file, 0, ctx->mmap_size,
 				       PROT_READ | PROT_WRITE,
-				       MAP_SHARED, 0, &unused, NULL);
+				       MAP_SHARED, 0, &unused, NULL, -1);
 	up_write(&mm->mmap_sem);
 	if (IS_ERR((void *)ctx->mmap_base)) {
 		ctx->mmap_size = 0;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5aee97d64cae..17e0e899e184 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1702,7 +1702,7 @@ struct file_operations {
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	int (*mmap_validate) (struct file *, struct vm_area_struct *,
-			unsigned long);
+			unsigned long, int);
 	int (*open) (struct inode *, struct file *);
 	int (*flush) (struct file *, fl_owner_t id);
 	int (*release) (struct inode *, struct file *);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 38f6ed954dde..ec45087348c9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2133,11 +2133,11 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
 
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-	struct list_head *uf, unsigned long map_flags);
+	struct list_head *uf, unsigned long map_flags, int fd);
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
-	struct list_head *uf);
+	struct list_head *uf, int fd);
 extern int do_munmap(struct mm_struct *, unsigned long, size_t,
 		     struct list_head *uf);
 
@@ -2145,9 +2145,10 @@ static inline unsigned long
 do_mmap_pgoff(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	unsigned long pgoff, unsigned long *populate,
-	struct list_head *uf)
+	struct list_head *uf, int fd)
 {
-	return do_mmap(file, addr, len, prot, flags, 0, pgoff, populate, uf);
+	return do_mmap(file, addr, len, prot, flags, 0, pgoff, populate,
+			uf, fd);
 }
 
 #ifdef CONFIG_MMU
diff --git a/ipc/shm.c b/ipc/shm.c
index badac463e2c8..c98f85f6756d 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -1399,7 +1399,8 @@ long do_shmat(int shmid, char __user *shmaddr, int shmflg,
 			goto invalid;
 	}
 
-	addr = do_mmap_pgoff(file, addr, size, prot, flags, 0, &populate, NULL);
+	addr = do_mmap_pgoff(file, addr, size, prot, flags, 0, &populate,
+			NULL, -1);
 	*raddr = addr;
 	err = 0;
 	if (IS_ERR_VALUE(addr))
diff --git a/mm/internal.h b/mm/internal.h
index 1df011f62480..70ed7b06dd85 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -466,7 +466,7 @@ extern u32 hwpoison_filter_enable;
 
 extern unsigned long  __must_check vm_mmap_pgoff(struct file *, unsigned long,
         unsigned long, unsigned long,
-        unsigned long, unsigned long);
+        unsigned long, unsigned long, int);
 
 extern void set_pageblock_order(void);
 unsigned long reclaim_clean_pages_from_list(struct zone *zone,
diff --git a/mm/mmap.c b/mm/mmap.c
index 2649c00581a0..a6794670c9cb 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1322,7 +1322,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			unsigned long len, unsigned long prot,
 			unsigned long flags, vm_flags_t vm_flags,
 			unsigned long pgoff, unsigned long *populate,
-			struct list_head *uf)
+			struct list_head *uf, int fd)
 {
 	struct mm_struct *mm = current->mm;
 	int pkey = 0;
@@ -1477,7 +1477,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, flags);
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, flags, fd);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -1527,7 +1527,7 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
 
 	flags &= ~(MAP_EXECUTABLE | MAP_DENYWRITE);
 
-	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
+	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff, fd);
 out_fput:
 	if (file)
 		fput(file);
@@ -1614,7 +1614,7 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
 
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf, unsigned long map_flags)
+		struct list_head *uf, unsigned long map_flags, int fd)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
@@ -1700,7 +1700,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		 */
 		vma->vm_file = get_file(file);
 		if ((map_flags & MAP_TYPE) == MAP_SHARED_VALIDATE)
-			error = file->f_op->mmap_validate(file, vma, map_flags);
+			error = file->f_op->mmap_validate(file, vma,
+					map_flags, fd);
 		else
 			error = call_mmap(file, vma);
 		if (error)
@@ -2842,7 +2843,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 
 	file = get_file(vma->vm_file);
 	ret = do_mmap_pgoff(vma->vm_file, start, size,
-			prot, flags, pgoff, &populate, NULL);
+			prot, flags, pgoff, &populate, NULL, -1);
 	fput(file);
 out:
 	up_write(&mm->mmap_sem);
diff --git a/mm/nommu.c b/mm/nommu.c
index 17c00d93de2e..952d205d3b66 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1206,7 +1206,8 @@ unsigned long do_mmap(struct file *file,
 			vm_flags_t vm_flags,
 			unsigned long pgoff,
 			unsigned long *populate,
-			struct list_head *uf)
+			struct list_head *uf,
+			int fd)
 {
 	struct vm_area_struct *vma;
 	struct vm_region *region;
@@ -1439,7 +1440,7 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
 
 	flags &= ~(MAP_EXECUTABLE | MAP_DENYWRITE);
 
-	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
+	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff, fd);
 
 	if (file)
 		fput(file);
diff --git a/mm/util.c b/mm/util.c
index 34e57fae959d..dcf48d929185 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -319,7 +319,7 @@ EXPORT_SYMBOL_GPL(get_user_pages_fast);
 
 unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot,
-	unsigned long flag, unsigned long pgoff)
+	unsigned long flag, unsigned long pgoff, int fd)
 {
 	unsigned long ret;
 	struct mm_struct *mm = current->mm;
@@ -331,7 +331,7 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
 		if (down_write_killable(&mm->mmap_sem))
 			return -EINTR;
 		ret = do_mmap_pgoff(file, addr, len, prot, flag, pgoff,
-				    &populate, &uf);
+				    &populate, &uf, fd);
 		up_write(&mm->mmap_sem);
 		userfaultfd_unmap_complete(mm, &uf);
 		if (populate)
@@ -349,7 +349,8 @@ unsigned long vm_mmap(struct file *file, unsigned long addr,
 	if (unlikely(offset_in_page(offset)))
 		return -EINVAL;
 
-	return vm_mmap_pgoff(file, addr, len, prot, flag, offset >> PAGE_SHIFT);
+	return vm_mmap_pgoff(file, addr, len, prot, flag,
+			offset >> PAGE_SHIFT, -1);
 }
 EXPORT_SYMBOL(vm_mmap);
 

^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [PATCH v9 3/6] fs: MAP_DIRECT core
  2017-10-12  0:47 ` Dan Williams
@ 2017-10-12  0:47   ` Dan Williams
  -1 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12  0:47 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: J. Bruce Fields, Jan Kara, Darrick J. Wong, linux-api,
	Dave Chinner, linux-xfs, linux-mm, linux-fsdevel, Jeff Layton,
	Christoph Hellwig

Introduce a set of helper APIs for filesystems to establish FL_LAYOUT
leases that protect against writes and block map updates while a
MAP_DIRECT mapping is established. While the lease protects against the
syscall write path and fallocate, it does not protect against
block-allocating write-faults, so this relies on i_mapdcount to disable
block map updates from write faults.

Like the pNFS case, MAP_DIRECT does its own timeout of the lease since
we need a process context for running map_direct_invalidate().
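
As a rough sketch of the intended consumer (illustrative only; the
example_* names are hypothetical, MAP_DIRECT itself is defined later in
this series, and the real XFS wiring follows in a later patch), a
filesystem's ->mmap_validate() would register the lease and hand the
returned state to the generic vm_operations helpers via
vma->vm_private_data:

    static int example_mmap_validate(struct file *file,
    		struct vm_area_struct *vma, unsigned long map_flags, int fd)
    {
    	if (map_flags & MAP_DIRECT) {
    		struct map_direct_state *mds = map_direct_register(fd, vma);

    		if (IS_ERR(mds))
    			return PTR_ERR(mds);

    		/* generic_map_direct_{open,close} look the state up here */
    		vma->vm_private_data = mds;
    		vma->vm_ops = &example_direct_vm_ops;
    	}
    	return example_mmap(file, vma);
    }

Here example_direct_vm_ops is assumed to point its .open/.close at
generic_map_direct_open()/generic_map_direct_close() so the vma
reference counting provided by these helpers is honored.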

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/Kconfig                |    1 
 fs/Makefile               |    2 
 fs/mapdirect.c            |  237 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/mapdirect.h |   40 ++++++++
 4 files changed, 279 insertions(+), 1 deletion(-)
 create mode 100644 fs/mapdirect.c
 create mode 100644 include/linux/mapdirect.h

diff --git a/fs/Kconfig b/fs/Kconfig
index 7aee6d699fd6..a7b31a96a753 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -37,6 +37,7 @@ source "fs/f2fs/Kconfig"
 config FS_DAX
 	bool "Direct Access (DAX) support"
 	depends on MMU
+	depends on FILE_LOCKING
 	depends on !(ARM || MIPS || SPARC)
 	select FS_IOMAP
 	select DAX
diff --git a/fs/Makefile b/fs/Makefile
index 7bbaca9c67b1..c0e791d235d8 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -29,7 +29,7 @@ obj-$(CONFIG_TIMERFD)		+= timerfd.o
 obj-$(CONFIG_EVENTFD)		+= eventfd.o
 obj-$(CONFIG_USERFAULTFD)	+= userfaultfd.o
 obj-$(CONFIG_AIO)               += aio.o
-obj-$(CONFIG_FS_DAX)		+= dax.o
+obj-$(CONFIG_FS_DAX)		+= dax.o mapdirect.o
 obj-$(CONFIG_FS_ENCRYPTION)	+= crypto/
 obj-$(CONFIG_FILE_LOCKING)      += locks.o
 obj-$(CONFIG_COMPAT)		+= compat.o compat_ioctl.o
diff --git a/fs/mapdirect.c b/fs/mapdirect.c
new file mode 100644
index 000000000000..9f4dd7395dcd
--- /dev/null
+++ b/fs/mapdirect.c
@@ -0,0 +1,237 @@
+/*
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#include <linux/mapdirect.h>
+#include <linux/workqueue.h>
+#include <linux/signal.h>
+#include <linux/mutex.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+
+#define MAPDIRECT_BREAK 0
+#define MAPDIRECT_VALID 1
+
+struct map_direct_state {
+	atomic_t mds_ref;
+	atomic_t mds_vmaref;
+	unsigned long mds_state;
+	struct inode *mds_inode;
+	struct delayed_work mds_work;
+	struct fasync_struct *mds_fa;
+	struct vm_area_struct *mds_vma;
+};
+
+bool test_map_direct_valid(struct map_direct_state *mds)
+{
+	return test_bit(MAPDIRECT_VALID, &mds->mds_state);
+}
+EXPORT_SYMBOL_GPL(test_map_direct_valid);
+
+static void put_map_direct(struct map_direct_state *mds)
+{
+	if (!atomic_dec_and_test(&mds->mds_ref))
+		return;
+	kfree(mds);
+}
+
+static void put_map_direct_vma(struct map_direct_state *mds)
+{
+	struct vm_area_struct *vma = mds->mds_vma;
+	struct file *file = vma->vm_file;
+	struct inode *inode = file_inode(file);
+	void *owner = mds;
+
+	if (!atomic_dec_and_test(&mds->mds_vmaref))
+		return;
+
+	/*
+	 * Flush in-flight+forced lm_break events that may be
+	 * referencing this dying vma.
+	 */
+	mds->mds_vma = NULL;
+	set_bit(MAPDIRECT_BREAK, &mds->mds_state);
+	vfs_setlease(vma->vm_file, F_UNLCK, NULL, &owner);
+	flush_delayed_work(&mds->mds_work);
+	iput(inode);
+
+	put_map_direct(mds);
+}
+
+void generic_map_direct_close(struct vm_area_struct *vma)
+{
+	put_map_direct_vma(vma->vm_private_data);
+}
+EXPORT_SYMBOL_GPL(generic_map_direct_close);
+
+static void get_map_direct_vma(struct map_direct_state *mds)
+{
+	atomic_inc(&mds->mds_vmaref);
+}
+
+void generic_map_direct_open(struct vm_area_struct *vma)
+{
+	get_map_direct_vma(vma->vm_private_data);
+}
+EXPORT_SYMBOL_GPL(generic_map_direct_open);
+
+static void map_direct_invalidate(struct work_struct *work)
+{
+	struct map_direct_state *mds;
+	struct vm_area_struct *vma;
+	struct inode *inode;
+	void *owner;
+
+	mds = container_of(work, typeof(*mds), mds_work.work);
+
+	clear_bit(MAPDIRECT_VALID, &mds->mds_state);
+
+	vma = ACCESS_ONCE(mds->mds_vma);
+	inode = mds->mds_inode;
+	if (vma) {
+		unsigned long len = vma->vm_end - vma->vm_start;
+		loff_t start = (loff_t) vma->vm_pgoff * PAGE_SIZE;
+
+		unmap_mapping_range(inode->i_mapping, start, len, 1);
+	}
+	owner = mds;
+	vfs_setlease(vma->vm_file, F_UNLCK, NULL, &owner);
+
+	put_map_direct(mds);
+}
+
+static bool map_direct_lm_break(struct file_lock *fl)
+{
+	struct map_direct_state *mds = fl->fl_owner;
+
+	/*
+	 * Given that we need to take sleeping locks to invalidate the
+	 * mapping we schedule that work with the original timeout set
+	 * by the file-locks core. Then we tell the core to hold off on
+	 * continuing with the lease break until the delayed work
+	 * completes the invalidation and the lease unlock.
+	 *
+	 * Note that this assumes that i_mapdcount is protecting against
+	 * block-map modifying write-faults since we are unable to use
+	 * leases in that path due to locking constraints.
+	 */
+	if (!test_and_set_bit(MAPDIRECT_BREAK, &mds->mds_state)) {
+		schedule_delayed_work(&mds->mds_work, lease_break_time * HZ);
+		kill_fasync(&fl->fl_fasync, SIGIO, POLL_MSG);
+	}
+
+	/* Tell the core lease code to wait for delayed work completion */
+	fl->fl_break_time = 0;
+
+	return false;
+}
+
+static int map_direct_lm_change(struct file_lock *fl, int arg,
+		struct list_head *dispose)
+{
+	WARN_ON(!(arg & F_UNLCK));
+
+	return lease_modify(fl, arg, dispose);
+}
+
+static void map_direct_lm_setup(struct file_lock *fl, void **priv)
+{
+	struct file *file = fl->fl_file;
+	struct map_direct_state *mds = *priv;
+	struct fasync_struct *fa = mds->mds_fa;
+
+	/*
+	 * Comment copied from lease_setup():
+	 * fasync_insert_entry() returns the old entry if any. If there was no
+	 * old entry, then it used "priv" and inserted it into the fasync list.
+	 * Clear the pointer to indicate that it shouldn't be freed.
+	 */
+	if (!fasync_insert_entry(fa->fa_fd, file, &fl->fl_fasync, fa))
+		*priv = NULL;
+
+	__f_setown(file, task_pid(current), PIDTYPE_PID, 0);
+}
+
+static const struct lock_manager_operations map_direct_lm_ops = {
+	.lm_break = map_direct_lm_break,
+	.lm_change = map_direct_lm_change,
+	.lm_setup = map_direct_lm_setup,
+};
+
+struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma)
+{
+	struct map_direct_state *mds = kzalloc(sizeof(*mds), GFP_KERNEL);
+	struct file *file = vma->vm_file;
+	struct inode *inode = file_inode(file);
+	struct fasync_struct *fa;
+	struct file_lock *fl;
+	void *owner = mds;
+	int rc = -ENOMEM;
+
+	if (!mds)
+		return ERR_PTR(-ENOMEM);
+
+	mds->mds_vma = vma;
+	atomic_set(&mds->mds_ref, 1);
+	atomic_set(&mds->mds_vmaref, 1);
+	set_bit(MAPDIRECT_VALID, &mds->mds_state);
+	mds->mds_inode = inode;
+	ihold(inode);
+	INIT_DELAYED_WORK(&mds->mds_work, map_direct_invalidate);
+
+	fa = fasync_alloc();
+	if (!fa)
+		goto err_fasync_alloc;
+	mds->mds_fa = fa;
+	fa->fa_fd = fd;
+
+	fl = locks_alloc_lock();
+	if (!fl)
+		goto err_lock_alloc;
+
+	locks_init_lock(fl);
+	fl->fl_lmops = &map_direct_lm_ops;
+	fl->fl_flags = FL_LAYOUT;
+	fl->fl_type = F_RDLCK;
+	fl->fl_end = OFFSET_MAX;
+	fl->fl_owner = mds;
+	atomic_inc(&mds->mds_ref);
+	fl->fl_pid = current->tgid;
+	fl->fl_file = file;
+
+	rc = vfs_setlease(file, fl->fl_type, &fl, &owner);
+	if (rc)
+		goto err_setlease;
+	if (fl) {
+		WARN_ON(1);
+		owner = mds;
+		vfs_setlease(file, F_UNLCK, NULL, &owner);
+		owner = NULL;
+		rc = -ENXIO;
+		goto err_setlease;
+	}
+
+	return mds;
+
+err_setlease:
+	locks_free_lock(fl);
+err_lock_alloc:
+	/* if owner is NULL then the lease machinery is responsible for @fa */
+	if (owner)
+		fasync_free(fa);
+err_fasync_alloc:
+	iput(inode);
+	kfree(mds);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_GPL(map_direct_register);
diff --git a/include/linux/mapdirect.h b/include/linux/mapdirect.h
new file mode 100644
index 000000000000..5491aa550e55
--- /dev/null
+++ b/include/linux/mapdirect.h
@@ -0,0 +1,40 @@
+/*
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#ifndef __MAPDIRECT_H__
+#define __MAPDIRECT_H__
+#include <linux/err.h>
+
+struct inode;
+struct work_struct;
+struct vm_area_struct;
+struct map_direct_state;
+
+#if IS_ENABLED(CONFIG_FS_DAX)
+struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma);
+bool test_map_direct_valid(struct map_direct_state *mds);
+void generic_map_direct_open(struct vm_area_struct *vma);
+void generic_map_direct_close(struct vm_area_struct *vma);
+#else
+static inline struct map_direct_state *map_direct_register(int fd,
+		struct vm_area_struct *vma)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+static inline bool test_map_direct_valid(struct map_direct_state *mds)
+{
+	return false;
+}
+#define generic_map_direct_open NULL
+#define generic_map_direct_close NULL
+#endif
+#endif /* __MAPDIRECT_H__ */


^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [PATCH v9 3/6] fs: MAP_DIRECT core
@ 2017-10-12  0:47   ` Dan Williams
  0 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12  0:47 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: linux-xfs, Jan Kara, Darrick J. Wong, linux-api, Dave Chinner,
	Christoph Hellwig, J. Bruce Fields, linux-mm, Jeff Moyer,
	linux-fsdevel, Jeff Layton, Ross Zwisler

Introduce a set of helper apis for filesystems to establish FL_LAYOUT
leases to protect against writes and block map updates while a
MAP_DIRECT mapping is established. While the lease protects against the
syscall write path and fallocate it does not protect against allocating
write-faults, so this relies on i_mapdcount to disable block map updates
from write faults.

Like the pnfs case MAP_DIRECT does its own timeout of the lease since we
need to have a process context for running map_direct_invalidate().

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/Kconfig                |    1 
 fs/Makefile               |    2 
 fs/mapdirect.c            |  237 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/mapdirect.h |   40 ++++++++
 4 files changed, 279 insertions(+), 1 deletion(-)
 create mode 100644 fs/mapdirect.c
 create mode 100644 include/linux/mapdirect.h

diff --git a/fs/Kconfig b/fs/Kconfig
index 7aee6d699fd6..a7b31a96a753 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -37,6 +37,7 @@ source "fs/f2fs/Kconfig"
 config FS_DAX
 	bool "Direct Access (DAX) support"
 	depends on MMU
+	depends on FILE_LOCKING
 	depends on !(ARM || MIPS || SPARC)
 	select FS_IOMAP
 	select DAX
diff --git a/fs/Makefile b/fs/Makefile
index 7bbaca9c67b1..c0e791d235d8 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -29,7 +29,7 @@ obj-$(CONFIG_TIMERFD)		+= timerfd.o
 obj-$(CONFIG_EVENTFD)		+= eventfd.o
 obj-$(CONFIG_USERFAULTFD)	+= userfaultfd.o
 obj-$(CONFIG_AIO)               += aio.o
-obj-$(CONFIG_FS_DAX)		+= dax.o
+obj-$(CONFIG_FS_DAX)		+= dax.o mapdirect.o
 obj-$(CONFIG_FS_ENCRYPTION)	+= crypto/
 obj-$(CONFIG_FILE_LOCKING)      += locks.o
 obj-$(CONFIG_COMPAT)		+= compat.o compat_ioctl.o
diff --git a/fs/mapdirect.c b/fs/mapdirect.c
new file mode 100644
index 000000000000..9f4dd7395dcd
--- /dev/null
+++ b/fs/mapdirect.c
@@ -0,0 +1,237 @@
+/*
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#include <linux/mapdirect.h>
+#include <linux/workqueue.h>
+#include <linux/signal.h>
+#include <linux/mutex.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+
+#define MAPDIRECT_BREAK 0
+#define MAPDIRECT_VALID 1
+
+struct map_direct_state {
+	atomic_t mds_ref;
+	atomic_t mds_vmaref;
+	unsigned long mds_state;
+	struct inode *mds_inode;
+	struct delayed_work mds_work;
+	struct fasync_struct *mds_fa;
+	struct vm_area_struct *mds_vma;
+};
+
+bool test_map_direct_valid(struct map_direct_state *mds)
+{
+	return test_bit(MAPDIRECT_VALID, &mds->mds_state);
+}
+EXPORT_SYMBOL_GPL(test_map_direct_valid);
+
+static void put_map_direct(struct map_direct_state *mds)
+{
+	if (!atomic_dec_and_test(&mds->mds_ref))
+		return;
+	kfree(mds);
+}
+
+static void put_map_direct_vma(struct map_direct_state *mds)
+{
+	struct vm_area_struct *vma = mds->mds_vma;
+	struct file *file = vma->vm_file;
+	struct inode *inode = file_inode(file);
+	void *owner = mds;
+
+	if (!atomic_dec_and_test(&mds->mds_vmaref))
+		return;
+
+	/*
+	 * Flush in-flight+forced lm_break events that may be
+	 * referencing this dying vma.
+	 */
+	mds->mds_vma = NULL;
+	set_bit(MAPDIRECT_BREAK, &mds->mds_state);
+	vfs_setlease(vma->vm_file, F_UNLCK, NULL, &owner);
+	flush_delayed_work(&mds->mds_work);
+	iput(inode);
+
+	put_map_direct(mds);
+}
+
+void generic_map_direct_close(struct vm_area_struct *vma)
+{
+	put_map_direct_vma(vma->vm_private_data);
+}
+EXPORT_SYMBOL_GPL(generic_map_direct_close);
+
+static void get_map_direct_vma(struct map_direct_state *mds)
+{
+	atomic_inc(&mds->mds_vmaref);
+}
+
+void generic_map_direct_open(struct vm_area_struct *vma)
+{
+	get_map_direct_vma(vma->vm_private_data);
+}
+EXPORT_SYMBOL_GPL(generic_map_direct_open);
+
+static void map_direct_invalidate(struct work_struct *work)
+{
+	struct map_direct_state *mds;
+	struct vm_area_struct *vma;
+	struct inode *inode;
+	void *owner;
+
+	mds = container_of(work, typeof(*mds), mds_work.work);
+
+	clear_bit(MAPDIRECT_VALID, &mds->mds_state);
+
+	vma = ACCESS_ONCE(mds->mds_vma);
+	inode = mds->mds_inode;
+	if (vma) {
+		unsigned long len = vma->vm_end - vma->vm_start;
+		loff_t start = (loff_t) vma->vm_pgoff * PAGE_SIZE;
+
+		unmap_mapping_range(inode->i_mapping, start, len, 1);
+	}
+	owner = mds;
+	vfs_setlease(vma->vm_file, F_UNLCK, NULL, &owner);
+
+	put_map_direct(mds);
+}
+
+static bool map_direct_lm_break(struct file_lock *fl)
+{
+	struct map_direct_state *mds = fl->fl_owner;
+
+	/*
+	 * Given that we need to take sleeping locks to invalidate the
+	 * mapping we schedule that work with the original timeout set
+	 * by the file-locks core. Then we tell the core to hold off on
+	 * continuing with the lease break until the delayed work
+	 * completes the invalidation and the lease unlock.
+	 *
+	 * Note that this assumes that i_mapdcount is protecting against
+	 * block-map modifying write-faults since we are unable to use
+	 * leases in that path due to locking constraints.
+	 */
+	if (!test_and_set_bit(MAPDIRECT_BREAK, &mds->mds_state)) {
+		schedule_delayed_work(&mds->mds_work, lease_break_time * HZ);
+		kill_fasync(&fl->fl_fasync, SIGIO, POLL_MSG);
+	}
+
+	/* Tell the core lease code to wait for delayed work completion */
+	fl->fl_break_time = 0;
+
+	return false;
+}
+
+static int map_direct_lm_change(struct file_lock *fl, int arg,
+		struct list_head *dispose)
+{
+	WARN_ON(!(arg & F_UNLCK));
+
+	return lease_modify(fl, arg, dispose);
+}
+
+static void map_direct_lm_setup(struct file_lock *fl, void **priv)
+{
+	struct file *file = fl->fl_file;
+	struct map_direct_state *mds = *priv;
+	struct fasync_struct *fa = mds->mds_fa;
+
+	/*
+	 * Comment copied from lease_setup():
+	 * fasync_insert_entry() returns the old entry if any. If there was no
+	 * old entry, then it used "priv" and inserted it into the fasync list.
+	 * Clear the pointer to indicate that it shouldn't be freed.
+	 */
+	if (!fasync_insert_entry(fa->fa_fd, file, &fl->fl_fasync, fa))
+		*priv = NULL;
+
+	__f_setown(file, task_pid(current), PIDTYPE_PID, 0);
+}
+
+static const struct lock_manager_operations map_direct_lm_ops = {
+	.lm_break = map_direct_lm_break,
+	.lm_change = map_direct_lm_change,
+	.lm_setup = map_direct_lm_setup,
+};
+
+struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma)
+{
+	struct map_direct_state *mds = kzalloc(sizeof(*mds), GFP_KERNEL);
+	struct file *file = vma->vm_file;
+	struct inode *inode = file_inode(file);
+	struct fasync_struct *fa;
+	struct file_lock *fl;
+	void *owner = mds;
+	int rc = -ENOMEM;
+
+	if (!mds)
+		return ERR_PTR(-ENOMEM);
+
+	mds->mds_vma = vma;
+	atomic_set(&mds->mds_ref, 1);
+	atomic_set(&mds->mds_vmaref, 1);
+	set_bit(MAPDIRECT_VALID, &mds->mds_state);
+	mds->mds_inode = inode;
+	ihold(inode);
+	INIT_DELAYED_WORK(&mds->mds_work, map_direct_invalidate);
+
+	fa = fasync_alloc();
+	if (!fa)
+		goto err_fasync_alloc;
+	mds->mds_fa = fa;
+	fa->fa_fd = fd;
+
+	fl = locks_alloc_lock();
+	if (!fl)
+		goto err_lock_alloc;
+
+	locks_init_lock(fl);
+	fl->fl_lmops = &map_direct_lm_ops;
+	fl->fl_flags = FL_LAYOUT;
+	fl->fl_type = F_RDLCK;
+	fl->fl_end = OFFSET_MAX;
+	fl->fl_owner = mds;
+	atomic_inc(&mds->mds_ref);
+	fl->fl_pid = current->tgid;
+	fl->fl_file = file;
+
+	rc = vfs_setlease(file, fl->fl_type, &fl, &owner);
+	if (rc)
+		goto err_setlease;
+	if (fl) {
+		WARN_ON(1);
+		owner = mds;
+		vfs_setlease(file, F_UNLCK, NULL, &owner);
+		owner = NULL;
+		rc = -ENXIO;
+		goto err_setlease;
+	}
+
+	return mds;
+
+err_setlease:
+	locks_free_lock(fl);
+err_lock_alloc:
+	/* if owner is NULL then the lease machinery is responsible for @fa */
+	if (owner)
+		fasync_free(fa);
+err_fasync_alloc:
+	iput(inode);
+	kfree(mds);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_GPL(map_direct_register);
diff --git a/include/linux/mapdirect.h b/include/linux/mapdirect.h
new file mode 100644
index 000000000000..5491aa550e55
--- /dev/null
+++ b/include/linux/mapdirect.h
@@ -0,0 +1,40 @@
+/*
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#ifndef __MAPDIRECT_H__
+#define __MAPDIRECT_H__
+#include <linux/err.h>
+
+struct inode;
+struct work_struct;
+struct vm_area_struct;
+struct map_direct_state;
+
+#if IS_ENABLED(CONFIG_FS_DAX)
+struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma);
+bool test_map_direct_valid(struct map_direct_state *mds);
+void generic_map_direct_open(struct vm_area_struct *vma);
+void generic_map_direct_close(struct vm_area_struct *vma);
+#else
+static inline struct map_direct_state *map_direct_register(int fd,
+		struct vm_area_struct *vma)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+static inline bool test_map_direct_valid(struct map_direct_state *mds)
+{
+	return false;
+}
+#define generic_map_direct_open NULL
+#define generic_map_direct_close NULL
+#endif
+#endif /* __MAPDIRECT_H__ */
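
[ Usage sketch, not part of the patch: a filesystem that supports
  MAP_DIRECT registers the state from its mmap path and points the vma
  at the generic open/close helpers. The names example_dax_vm_ops and
  example_mmap_direct below are hypothetical; the real XFS wiring
  arrives in "xfs: wire up MAP_DIRECT". ]

#include <linux/mapdirect.h>
#include <linux/mm.h>

/* hypothetical vm_operations for a MAP_DIRECT-capable DAX file */
static const struct vm_operations_struct example_dax_vm_ops = {
	.open	= generic_map_direct_open,
	.close	= generic_map_direct_close,
	/* .fault / .huge_fault handlers for the DAX mapping go here */
};

/* sketch of an ->mmap_validate()-style hook wiring up MAP_DIRECT */
static int example_mmap_direct(struct file *filp, struct vm_area_struct *vma,
		int fd)
{
	struct map_direct_state *mds = map_direct_register(fd, vma);

	if (IS_ERR(mds))
		return PTR_ERR(mds);

	/* fault handlers later consult test_map_direct_valid(mds) */
	vma->vm_private_data = mds;
	vma->vm_ops = &example_dax_vm_ops;
	return 0;
}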


^ permalink raw reply related	[flat|nested] 116+ messages in thread


* [PATCH v9 4/6] xfs: prepare xfs_break_layouts() for reuse with MAP_DIRECT
@ 2017-10-12  0:47   ` Dan Williams
  0 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12  0:47 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jan Kara, Darrick J. Wong, linux-api, Dave Chinner, linux-xfs,
	linux-mm, linux-fsdevel, Christoph Hellwig

Move xfs_break_layouts() to its own compilation unit so that it can be
used for both pnfs layouts and MAP_DIRECT mappings.
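
[ Caller sketch, not part of the patch: with xfs_break_layouts() in its
  own object file, a non-pNFS user such as the MAP_DIRECT mmap path can
  drain FL_LAYOUT leases under the IOLOCK in the usual way. The function
  example_break_for_map_direct() is hypothetical; it assumes the usual
  XFS headers, as in xfs_layout.c below. ]

static int example_break_for_map_direct(struct xfs_inode *ip)
{
	uint	iolock = XFS_IOLOCK_EXCL;
	int	error;

	xfs_ilock(ip, iolock);
	/* may cycle the IOLOCK while waiting for lease holders */
	error = xfs_break_layouts(VFS_I(ip), &iolock);
	xfs_iunlock(ip, iolock);
	return error;
}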

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/xfs/Kconfig      |    4 ++++
 fs/xfs/Makefile     |    1 +
 fs/xfs/xfs_file.c   |    1 +
 fs/xfs/xfs_ioctl.c  |    1 +
 fs/xfs/xfs_iops.c   |    1 +
 fs/xfs/xfs_layout.c |   42 ++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_layout.h |   13 +++++++++++++
 fs/xfs/xfs_pnfs.c   |   31 +------------------------------
 fs/xfs/xfs_pnfs.h   |    8 --------
 9 files changed, 64 insertions(+), 38 deletions(-)
 create mode 100644 fs/xfs/xfs_layout.c
 create mode 100644 fs/xfs/xfs_layout.h

diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
index 1b98cfa342ab..f62fc6629abb 100644
--- a/fs/xfs/Kconfig
+++ b/fs/xfs/Kconfig
@@ -109,3 +109,7 @@ config XFS_ASSERT_FATAL
 	  result in warnings.
 
 	  This behavior can be modified at runtime via sysfs.
+
+config XFS_LAYOUT
+	def_bool y
+	depends on EXPORTFS_BLOCK_OPS
diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index a6e955bfead8..d44135107490 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -135,3 +135,4 @@ xfs-$(CONFIG_XFS_POSIX_ACL)	+= xfs_acl.o
 xfs-$(CONFIG_SYSCTL)		+= xfs_sysctl.o
 xfs-$(CONFIG_COMPAT)		+= xfs_ioctl32.o
 xfs-$(CONFIG_EXPORTFS_BLOCK_OPS)	+= xfs_pnfs.o
+xfs-$(CONFIG_XFS_LAYOUT)	+= xfs_layout.o
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 309e26c9dddb..3cc7292b2e9f 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -39,6 +39,7 @@
 #include "xfs_pnfs.h"
 #include "xfs_iomap.h"
 #include "xfs_reflink.h"
+#include "xfs_layout.h"
 
 #include <linux/dcache.h>
 #include <linux/falloc.h>
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index aa75389be8cf..8bfd6db4f06d 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -44,6 +44,7 @@
 #include "xfs_btree.h"
 #include <linux/fsmap.h>
 #include "xfs_fsmap.h"
+#include "xfs_layout.h"
 
 #include <linux/capability.h>
 #include <linux/cred.h>
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 17081c77ef86..4bc2e5ef1a3a 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -39,6 +39,7 @@
 #include "xfs_trans_space.h"
 #include "xfs_pnfs.h"
 #include "xfs_iomap.h"
+#include "xfs_layout.h"
 
 #include <linux/capability.h>
 #include <linux/xattr.h>
diff --git a/fs/xfs/xfs_layout.c b/fs/xfs/xfs_layout.c
new file mode 100644
index 000000000000..71d95e1a910a
--- /dev/null
+++ b/fs/xfs/xfs_layout.c
@@ -0,0 +1,42 @@
+/*
+ * Copyright (c) 2014 Christoph Hellwig.
+ */
+#include "xfs.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_sb.h"
+#include "xfs_mount.h"
+#include "xfs_inode.h"
+
+#include <linux/fs.h>
+
+/*
+ * Ensure that we do not have any outstanding pNFS layouts that can be used by
+ * clients to directly read from or write to this inode.  This must be called
+ * before every operation that can remove blocks from the extent map.
+ * Additionally we call it during the write operation, where we aren't concerned
+ * about exposing unallocated blocks but just want to provide basic
+ * synchronization between a local writer and pNFS clients.  mmap writes would
+ * also benefit from this sort of synchronization, but due to the tricky locking
+ * rules in the page fault path we don't bother.
+ */
+int
+xfs_break_layouts(
+	struct inode		*inode,
+	uint			*iolock)
+{
+	struct xfs_inode	*ip = XFS_I(inode);
+	int			error;
+
+	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL));
+
+	while ((error = break_layout(inode, false)) == -EWOULDBLOCK) {
+		xfs_iunlock(ip, *iolock);
+		error = break_layout(inode, true);
+		*iolock = XFS_IOLOCK_EXCL;
+		xfs_ilock(ip, *iolock);
+	}
+
+	return error;
+}
diff --git a/fs/xfs/xfs_layout.h b/fs/xfs/xfs_layout.h
new file mode 100644
index 000000000000..f848ee78cc93
--- /dev/null
+++ b/fs/xfs/xfs_layout.h
@@ -0,0 +1,13 @@
+#ifndef _XFS_LAYOUT_H
+#define _XFS_LAYOUT_H 1
+
+#ifdef CONFIG_XFS_LAYOUT
+int xfs_break_layouts(struct inode *inode, uint *iolock);
+#else
+static inline int
+xfs_break_layouts(struct inode *inode, uint *iolock)
+{
+	return 0;
+}
+#endif /* CONFIG_XFS_LAYOUT */
+#endif /* _XFS_LAYOUT_H */
diff --git a/fs/xfs/xfs_pnfs.c b/fs/xfs/xfs_pnfs.c
index 4246876df7b7..ee9de16d7672 100644
--- a/fs/xfs/xfs_pnfs.c
+++ b/fs/xfs/xfs_pnfs.c
@@ -18,36 +18,7 @@
 #include "xfs_shared.h"
 #include "xfs_bit.h"
 #include "xfs_pnfs.h"
-
-/*
- * Ensure that we do not have any outstanding pNFS layouts that can be used by
- * clients to directly read from or write to this inode.  This must be called
- * before every operation that can remove blocks from the extent map.
- * Additionally we call it during the write operation, where aren't concerned
- * about exposing unallocated blocks but just want to provide basic
- * synchronization between a local writer and pNFS clients.  mmap writes would
- * also benefit from this sort of synchronization, but due to the tricky locking
- * rules in the page fault path we don't bother.
- */
-int
-xfs_break_layouts(
-	struct inode		*inode,
-	uint			*iolock)
-{
-	struct xfs_inode	*ip = XFS_I(inode);
-	int			error;
-
-	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL));
-
-	while ((error = break_layout(inode, false) == -EWOULDBLOCK)) {
-		xfs_iunlock(ip, *iolock);
-		error = break_layout(inode, true);
-		*iolock = XFS_IOLOCK_EXCL;
-		xfs_ilock(ip, *iolock);
-	}
-
-	return error;
-}
+#include "xfs_layout.h"
 
 /*
  * Get a unique ID including its location so that the client can identify
diff --git a/fs/xfs/xfs_pnfs.h b/fs/xfs/xfs_pnfs.h
index b587cb99b2b7..5a2710dd5478 100644
--- a/fs/xfs/xfs_pnfs.h
+++ b/fs/xfs/xfs_pnfs.h
@@ -7,13 +7,5 @@ int xfs_fs_map_blocks(struct inode *inode, loff_t offset, u64 length,
 		struct iomap *iomap, bool write, u32 *device_generation);
 int xfs_fs_commit_blocks(struct inode *inode, struct iomap *maps, int nr_maps,
 		struct iattr *iattr);
-
-int xfs_break_layouts(struct inode *inode, uint *iolock);
-#else
-static inline int
-xfs_break_layouts(struct inode *inode, uint *iolock)
-{
-	return 0;
-}
 #endif /* CONFIG_EXPORTFS_BLOCK_OPS */
 #endif /* _XFS_PNFS_H */


^ permalink raw reply related	[flat|nested] 116+ messages in thread


* [PATCH v9 5/6] fs, xfs, iomap: introduce break_layout_nowait()
@ 2017-10-12  0:47   ` Dan Williams
  0 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12  0:47 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jan Kara, Darrick J. Wong, linux-api, Dave Chinner, linux-xfs,
	linux-mm, Al Viro, linux-fsdevel, Christoph Hellwig

In preparation for using FL_LAYOUT leases to allow coordination between
the kernel and processes doing userspace flushes / RDMA with DAX
mappings, add this helper that can be used to start the lease break
process in contexts where we can not sleep waiting for the lease break
timeout.

This is targeted to be used in an ->iomap_begin() implementation where
we may have various filesystem locks held and can not synchronously wait
for any FL_LAYOUT leases to be released. In particular an iomap mmap
fault handler running under mmap_sem can not unlock that semaphore and
wait for these leases to be unlocked. Instead, this signals the lease
holder(s) that a break is requested and immediately returns with an
error.
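
[ Sketch, not part of the patch: roughly how a DAX filesystem's
  ->iomap_begin() would use the helper on a block-allocating write
  path. example_iomap_begin() is hypothetical; the error is typically
  -EWOULDBLOCK when a lease is outstanding, and it is returned to the
  caller while the lease holder is signalled to give up the lease. ]

#include <linux/fs.h>
#include <linux/iomap.h>

static int example_iomap_begin(struct inode *inode, loff_t offset,
		loff_t length, unsigned flags, struct iomap *iomap)
{
	int error;

	if ((flags & IOMAP_WRITE) && IS_DAX(inode)) {
		/* cannot sleep here, only kick off the lease break */
		error = break_layout_nowait(inode);
		if (error)
			return error;
	}

	/* ... normal block mapping / allocation continues here ... */
	return 0;
}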

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/xfs/xfs_iomap.c  |    3 +++
 fs/xfs/xfs_layout.c |    5 ++++-
 include/linux/fs.h  |    9 +++++++++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index f179bdf1644d..840e4080afb5 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1055,6 +1055,9 @@ xfs_file_iomap_begin(
 			error = -EAGAIN;
 			goto out_unlock;
 		}
+		error = break_layout_nowait(inode);
+		if (error)
+			goto out_unlock;
 		/*
 		 * We cap the maximum length we map here to MAX_WRITEBACK_PAGES
 		 * pages to keep the chunks of work done where somewhat symmetric
diff --git a/fs/xfs/xfs_layout.c b/fs/xfs/xfs_layout.c
index 71d95e1a910a..7a633b6e9397 100644
--- a/fs/xfs/xfs_layout.c
+++ b/fs/xfs/xfs_layout.c
@@ -19,7 +19,10 @@
  * about exposing unallocated blocks but just want to provide basic
  * synchronization between a local writer and pNFS clients.  mmap writes would
  * also benefit from this sort of synchronization, but due to the tricky locking
- * rules in the page fault path we don't bother.
+ * rules in the page fault path all we can do is start the lease break
+ * timeout. See usage of break_layout_nowait in xfs_file_iomap_begin to
+ * prevent write-faults from allocating blocks or performing extent
+ * conversion.
  */
 int
 xfs_break_layouts(
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 17e0e899e184..2b030a2fccc7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2364,6 +2364,15 @@ static inline int break_layout(struct inode *inode, bool wait)
 
 #endif /* CONFIG_FILE_LOCKING */
 
+/*
+ * For use in paths where we can not wait for the layout to be recalled,
+ * for example when we are holding mmap_sem.
+ */
+static inline int break_layout_nowait(struct inode *inode)
+{
+	return break_layout(inode, false);
+}
+
 /* fs/open.c */
 struct audit_names;
 struct filename {


^ permalink raw reply related	[flat|nested] 116+ messages in thread


* [PATCH v9 5/6] fs, xfs, iomap: introduce break_layout_nowait()
@ 2017-10-12  0:47   ` Dan Williams
  0 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12  0:47 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jan Kara, Darrick J. Wong, linux-api, Dave Chinner, linux-xfs,
	linux-mm, Jeff Moyer, Al Viro, linux-fsdevel, Ross Zwisler,
	Christoph Hellwig

In preparation for using FL_LAYOUT leases to allow coordination between
the kernel and processes doing userspace flushes / RDMA with DAX
mappings, add this helper that can be used to start the lease break
process in contexts where we can not sleep waiting for the lease break
timeout.

This is targeted to be used in an ->iomap_begin() implementation where
we may have various filesystem locks held and can not synchronously wait
for any FL_LAYOUT leases to be released. In particular an iomap mmap
fault handler running under mmap_sem can not unlock that semaphore and
wait for these leases to be unlocked. Instead, this signals the lease
holder(s) that a break is requested and immediately returns with an
error.

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/xfs/xfs_iomap.c  |    3 +++
 fs/xfs/xfs_layout.c |    5 ++++-
 include/linux/fs.h  |    9 +++++++++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index f179bdf1644d..840e4080afb5 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1055,6 +1055,9 @@ xfs_file_iomap_begin(
 			error = -EAGAIN;
 			goto out_unlock;
 		}
+		error = break_layout_nowait(inode);
+		if (error)
+			goto out_unlock;
 		/*
 		 * We cap the maximum length we map here to MAX_WRITEBACK_PAGES
 		 * pages to keep the chunks of work done where somewhat symmetric
diff --git a/fs/xfs/xfs_layout.c b/fs/xfs/xfs_layout.c
index 71d95e1a910a..7a633b6e9397 100644
--- a/fs/xfs/xfs_layout.c
+++ b/fs/xfs/xfs_layout.c
@@ -19,7 +19,10 @@
  * about exposing unallocated blocks but just want to provide basic
  * synchronization between a local writer and pNFS clients.  mmap writes would
  * also benefit from this sort of synchronization, but due to the tricky locking
- * rules in the page fault path we don't bother.
+ * rules in the page fault path all we can do is start the lease break
+ * timeout. See usage of break_layout_nowait in xfs_file_iomap_begin to
+ * prevent write-faults from allocating blocks or performing extent
+ * conversion.
  */
 int
 xfs_break_layouts(
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 17e0e899e184..2b030a2fccc7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2364,6 +2364,15 @@ static inline int break_layout(struct inode *inode, bool wait)
 
 #endif /* CONFIG_FILE_LOCKING */
 
+/*
+ * For use in paths where we can not wait for the layout to be recalled,
+ * for example when we are holding mmap_sem.
+ */
+static inline int break_layout_nowait(struct inode *inode)
+{
+	return break_layout(inode, false);
+}
+
 /* fs/open.c */
 struct audit_names;
 struct filename {


^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [PATCH v9 5/6] fs, xfs, iomap: introduce break_layout_nowait()
@ 2017-10-12  0:47   ` Dan Williams
  0 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12  0:47 UTC (permalink / raw)
  To: linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw
  Cc: Jan Kara, Darrick J. Wong, linux-api-u79uwXL29TY76Z2rM5mHXA,
	Dave Chinner, linux-xfs-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jeff Moyer, Al Viro,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Ross Zwisler,
	Christoph Hellwig

In preparation for using FL_LAYOUT leases to allow coordination between
the kernel and processes doing userspace flushes / RDMA with DAX
mappings, add this helper that can be used to start the lease break
process in contexts where we can not sleep waiting for the lease break
timeout.

This is targeted to be used in an ->iomap_begin() implementation where
we may have various filesystem locks held and can not synchronously wait
for any FL_LAYOUT leases to be released. In particular an iomap mmap
fault handler running under mmap_sem can not unlock that semaphore and
wait for these leases to be unlocked. Instead, this signals the lease
holder(s) that a break is requested and immediately returns with an
error.

Cc: Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
Cc: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Cc: Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
Cc: "Darrick J. Wong" <darrick.wong-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: Ross Zwisler <ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Suggested-by: Dave Chinner <david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org>
Signed-off-by: Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 fs/xfs/xfs_iomap.c  |    3 +++
 fs/xfs/xfs_layout.c |    5 ++++-
 include/linux/fs.h  |    9 +++++++++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index f179bdf1644d..840e4080afb5 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1055,6 +1055,9 @@ xfs_file_iomap_begin(
 			error = -EAGAIN;
 			goto out_unlock;
 		}
+		error = break_layout_nowait(inode);
+		if (error)
+			goto out_unlock;
 		/*
 		 * We cap the maximum length we map here to MAX_WRITEBACK_PAGES
 		 * pages to keep the chunks of work done where somewhat symmetric
diff --git a/fs/xfs/xfs_layout.c b/fs/xfs/xfs_layout.c
index 71d95e1a910a..7a633b6e9397 100644
--- a/fs/xfs/xfs_layout.c
+++ b/fs/xfs/xfs_layout.c
@@ -19,7 +19,10 @@
  * about exposing unallocated blocks but just want to provide basic
  * synchronization between a local writer and pNFS clients.  mmap writes would
  * also benefit from this sort of synchronization, but due to the tricky locking
- * rules in the page fault path we don't bother.
+ * rules in the page fault path all we can do is start the lease break
+ * timeout. See usage of break_layout_nowait in xfs_file_iomap_begin to
+ * prevent write-faults from allocating blocks or performing extent
+ * conversion.
  */
 int
 xfs_break_layouts(
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 17e0e899e184..2b030a2fccc7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2364,6 +2364,15 @@ static inline int break_layout(struct inode *inode, bool wait)
 
 #endif /* CONFIG_FILE_LOCKING */
 
+/*
+ * For use in paths where we can not wait for the layout to be recalled,
+ * for example when we are holding mmap_sem.
+ */
+static inline int break_layout_nowait(struct inode *inode)
+{
+	return break_layout(inode, false);
+}
+
 /* fs/open.c */
 struct audit_names;
 struct filename {

^ permalink raw reply related	[flat|nested] 116+ messages in thread

* [PATCH v9 6/6] xfs: wire up MAP_DIRECT
  2017-10-12  0:47 ` Dan Williams
  (?)
  (?)
@ 2017-10-12  0:47   ` Dan Williams
  -1 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12  0:47 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: J. Bruce Fields, Jan Kara, Arnd Bergmann, Darrick J. Wong,
	linux-api, Dave Chinner, linux-xfs, linux-mm, Alexander Viro,
	linux-fsdevel, Jeff Layton, Christoph Hellwig

MAP_DIRECT is an mmap(2) flag with the following semantics:

  MAP_DIRECT
  When specified with MAP_SHARED_VALIDATE, sets up a file lease with the
  same lifetime as the mapping. Unlike a typical F_RDLCK lease this lease
  is broken when a "lease breaker" attempts to write(2), change the block
  map (fallocate), or change the size of the file. Otherwise the mechanism
  of a lease break is identical to the typical lease break case where the
  lease needs to be removed (munmap) within the number of seconds
  specified by /proc/sys/fs/lease-break-time. If the lease holder fails to
  remove the lease in time the kernel will invalidate the mapping and
  force all future accesses to the mapping to trigger SIGBUS.

  In addition to lease break timeouts causing faults in the mapping to
  result in SIGBUS, other states of the file will trigger SIGBUS at fault
  time:

      * The fault would trigger the filesystem to allocate blocks
      * The fault would trigger the filesystem to perform extent conversion

  In other words, MAP_DIRECT expects and enforces a fully allocated file
  where faults can be satisfied without modifying block map metadata.

  An unprivileged process may establish a MAP_DIRECT mapping on a file
  whose UID (owner) matches the filesystem UID of the process. A process
  with the CAP_LEASE capability may establish a MAP_DIRECT mapping on
  arbitrary files.

  ERRORS
  EACCES Beyond the typical mmap(2) conditions that trigger EACCES,
  MAP_DIRECT also requires permission to set a file lease.

  EOPNOTSUPP The filesystem explicitly does not support the flag.

  EPERM The file does not permit MAP_DIRECT mappings. Potential reasons
  are that DAX access is not available or the file has reflink extents.

  SIGBUS Attempted to write a MAP_DIRECT mapping at a file offset that
         might require block-map updates, or the lease timed out and the
         kernel invalidated the mapping.
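
As a rough userspace sketch (not part of this patch), establishing such
a mapping might look as follows. It assumes headers carrying the
MAP_SHARED_VALIDATE and MAP_DIRECT definitions from this series; the
path, mapping size, and the fallback #define values are hypothetical:

	#include <sys/mman.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <stdio.h>

	#ifndef MAP_SHARED_VALIDATE
	#define MAP_SHARED_VALIDATE 0x03	/* assumed value, see patch 1 of this series */
	#endif
	#ifndef MAP_DIRECT
	#define MAP_DIRECT 0x80000		/* value proposed by this patch (asm-generic) */
	#endif

	int main(void)
	{
		size_t len = 2UL << 20;			/* hypothetical mapping size */
		int fd = open("/mnt/dax/data", O_RDWR);	/* hypothetical DAX file */
		char *p;

		if (fd < 0) {
			perror("open");
			return 1;
		}

		p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_SHARED_VALIDATE | MAP_DIRECT, fd, 0);
		if (p == MAP_FAILED) {
			perror("mmap");	/* EOPNOTSUPP, EPERM, or EACCES as above */
			close(fd);
			return 1;
		}

		/* stores are now flushed from userspace; a SIGBUS handler
		 * (not shown) would deal with lease-break invalidation */
		p[0] = 1;

		munmap(p, len);
		close(fd);
		return 0;
	}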

Cc: Jan Kara <jack@suse.cz>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/xfs/Kconfig                  |    2 -
 fs/xfs/xfs_file.c               |  107 ++++++++++++++++++++++++++++++++++++++-
 include/linux/mman.h            |    3 +
 include/uapi/asm-generic/mman.h |    1 
 4 files changed, 110 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
index f62fc6629abb..f8765653a438 100644
--- a/fs/xfs/Kconfig
+++ b/fs/xfs/Kconfig
@@ -112,4 +112,4 @@ config XFS_ASSERT_FATAL
 
 config XFS_LAYOUT
 	def_bool y
-	depends on EXPORTFS_BLOCK_OPS
+	depends on EXPORTFS_BLOCK_OPS || FS_DAX
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 3cc7292b2e9f..71dbe0307746 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -41,12 +41,22 @@
 #include "xfs_reflink.h"
 #include "xfs_layout.h"
 
+#include <linux/mman.h>
 #include <linux/dcache.h>
 #include <linux/falloc.h>
 #include <linux/pagevec.h>
+#include <linux/mapdirect.h>
 #include <linux/backing-dev.h>
 
 static const struct vm_operations_struct xfs_file_vm_ops;
+static const struct vm_operations_struct xfs_file_vm_direct_ops;
+
+static bool
+xfs_vma_is_direct(
+	struct vm_area_struct	*vma)
+{
+	return vma->vm_ops == &xfs_file_vm_direct_ops;
+}
 
 /*
  * Clear the specified ranges to zero through either the pagecache or DAX.
@@ -1013,6 +1023,25 @@ xfs_file_llseek(
 }
 
 /*
+ * MAP_DIRECT faults can only be serviced while the FL_LAYOUT lease is
+ * valid. See map_direct_invalidate.
+ */
+static bool
+xfs_vma_has_direct_lease(
+	struct vm_area_struct	*vma)
+{
+	/* Non MAP_DIRECT vmas do not require layout leases */
+	if (!xfs_vma_is_direct(vma))
+		return true;
+
+	if (!test_map_direct_valid(vma->vm_private_data))
+		return false;
+
+	/* We have a valid lease */
+	return true;
+}
+
+/*
  * Locking for serialisation of IO during page faults. This results in a lock
  * ordering of:
  *
@@ -1028,7 +1057,8 @@ __xfs_filemap_fault(
 	enum page_entry_size	pe_size,
 	bool			write_fault)
 {
-	struct inode		*inode = file_inode(vmf->vma->vm_file);
+	struct vm_area_struct	*vma = vmf->vma;
+	struct inode		*inode = file_inode(vma->vm_file);
 	struct xfs_inode	*ip = XFS_I(inode);
 	int			ret;
 
@@ -1036,10 +1066,15 @@ __xfs_filemap_fault(
 
 	if (write_fault) {
 		sb_start_pagefault(inode->i_sb);
-		file_update_time(vmf->vma->vm_file);
+		file_update_time(vma->vm_file);
 	}
 
 	xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
+	if (!xfs_vma_has_direct_lease(vma)) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_unlock;
+	}
+
 	if (IS_DAX(inode)) {
 		ret = dax_iomap_fault(vmf, pe_size, &xfs_iomap_ops);
 	} else {
@@ -1048,6 +1083,8 @@ __xfs_filemap_fault(
 		else
 			ret = filemap_fault(vmf);
 	}
+
+out_unlock:
 	xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
 
 	if (write_fault)
@@ -1119,6 +1156,17 @@ xfs_filemap_pfn_mkwrite(
 
 }
 
+static const struct vm_operations_struct xfs_file_vm_direct_ops = {
+	.fault		= xfs_filemap_fault,
+	.huge_fault	= xfs_filemap_huge_fault,
+	.map_pages	= filemap_map_pages,
+	.page_mkwrite	= xfs_filemap_page_mkwrite,
+	.pfn_mkwrite	= xfs_filemap_pfn_mkwrite,
+
+	.open		= generic_map_direct_open,
+	.close		= generic_map_direct_close,
+};
+
 static const struct vm_operations_struct xfs_file_vm_ops = {
 	.fault		= xfs_filemap_fault,
 	.huge_fault	= xfs_filemap_huge_fault,
@@ -1139,6 +1187,60 @@ xfs_file_mmap(
 	return 0;
 }
 
+static int
+xfs_file_mmap_direct(
+	struct file		*filp,
+	struct vm_area_struct	*vma,
+	int			fd)
+{
+	struct inode		*inode = file_inode(filp);
+	struct xfs_inode	*ip = XFS_I(inode);
+	struct map_direct_state	*mds;
+
+	/*
+	 * Not permitted to set up MAP_DIRECT mapping over reflinked or
+	 * non-DAX extents since reflink may cause block moves /
+	 * copy-on-write, and non-DAX is by definition always indirect
+	 * through the page cache.
+	 */
+	if (xfs_is_reflink_inode(ip))
+		return -EPERM;
+	if (!IS_DAX(inode))
+		return -EPERM;
+
+	mds = map_direct_register(fd, vma);
+	if (IS_ERR(mds))
+		return PTR_ERR(mds);
+
+	file_accessed(filp);
+	vma->vm_ops = &xfs_file_vm_direct_ops;
+	vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;
+
+	/*
+	 * generic_map_direct_{open,close} expect ->vm_private_data is
+	 * set to the result of map_direct_register
+	 */
+	vma->vm_private_data = mds;
+	return 0;
+}
+
+#define XFS_MAP_SUPPORTED (LEGACY_MAP_MASK | MAP_DIRECT)
+
+static int
+xfs_file_mmap_validate(
+	struct file		*filp,
+	struct vm_area_struct	*vma,
+	unsigned long		map_flags,
+	int			fd)
+{
+	if (map_flags & ~(XFS_MAP_SUPPORTED))
+		return -EOPNOTSUPP;
+
+	if ((map_flags & MAP_DIRECT) == 0)
+		return xfs_file_mmap(filp, vma);
+	return xfs_file_mmap_direct(filp, vma, fd);
+}
+
 const struct file_operations xfs_file_operations = {
 	.llseek		= xfs_file_llseek,
 	.read_iter	= xfs_file_read_iter,
@@ -1150,6 +1252,7 @@ const struct file_operations xfs_file_operations = {
 	.compat_ioctl	= xfs_file_compat_ioctl,
 #endif
 	.mmap		= xfs_file_mmap,
+	.mmap_validate	= xfs_file_mmap_validate,
 	.open		= xfs_file_open,
 	.release	= xfs_file_release,
 	.fsync		= xfs_file_fsync,
diff --git a/include/linux/mman.h b/include/linux/mman.h
index 94b63b4d71ff..fab393a9dda9 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -20,6 +20,9 @@
 #ifndef MAP_HUGE_1GB
 #define MAP_HUGE_1GB 0
 #endif
+#ifndef MAP_DIRECT
+#define MAP_DIRECT 0
+#endif
 #ifndef MAP_UNINITIALIZED
 #define MAP_UNINITIALIZED 0
 #endif
diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
index 7162cd4cca73..c916f22008e0 100644
--- a/include/uapi/asm-generic/mman.h
+++ b/include/uapi/asm-generic/mman.h
@@ -12,6 +12,7 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define MAP_DIRECT	0x80000		/* leased block map (layout) for DAX */
 
 /* Bits [26:31] are reserved, see mman-common.h for MAP_HUGETLB usage */
 


^ permalink raw reply related	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 2/6] fs, mm: pass fd to ->mmap_validate()
  2017-10-12  0:47   ` Dan Williams
  (?)
  (?)
@ 2017-10-12  1:21     ` Al Viro
  -1 siblings, 0 replies; 116+ messages in thread
From: Al Viro @ 2017-10-12  1:21 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, Darrick J. Wong, linux-api, linux-nvdimm, Dave Chinner,
	linux-xfs, linux-mm, linux-fsdevel, Andrew Morton,
	Christoph Hellwig

On Wed, Oct 11, 2017 at 05:47:18PM -0700, Dan Williams wrote:
> The MAP_DIRECT mechanism for mmap intends to use a file lease to prevent
> block map changes while the file is mapped. It requires the fd to setup
> an fasync_struct for signalling lease break events to the lease holder.

*UGH*

That looks like one hell of a bad API.  You are not even guaranteed that
the descriptor will still be open by the time you pass it down to your
helper, never mind the moment when the event actually happens...

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 2/6] fs, mm: pass fd to ->mmap_validate()
  2017-10-12  1:21     ` Al Viro
  (?)
  (?)
@ 2017-10-12  1:28       ` Dan Williams
  -1 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12  1:28 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, Darrick J. Wong, Linux API, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, linux-fsdevel, Andrew Morton,
	Christoph Hellwig

On Wed, Oct 11, 2017 at 6:21 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Wed, Oct 11, 2017 at 05:47:18PM -0700, Dan Williams wrote:
>> The MAP_DIRECT mechanism for mmap intends to use a file lease to prevent
>> block map changes while the file is mapped. It requires the fd to setup
>> an fasync_struct for signalling lease break events to the lease holder.
>
> *UGH*
>
> That looks like one hell of a bad API.  You are not even guaranteed that
> the descriptor will still be open by the time you pass it down to your
> helper, never mind the moment when the event actually happens...

What am I missing, fcntl(F_SETLEASE) seems to follow a similar pattern?
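
For reference, the existing F_SETLEASE pattern I have in mind looks
roughly like this (a minimal sketch; the path is hypothetical, most
error handling is omitted, and the lease ends on F_UNLCK or when the
descriptor is closed):

	#define _GNU_SOURCE		/* for F_SETLEASE */
	#include <fcntl.h>
	#include <signal.h>
	#include <unistd.h>

	static void lease_break(int sig)
	{
		(void)sig;
		/* relinquish access, then F_UNLCK within lease-break-time */
	}

	int main(void)
	{
		int fd = open("/tmp/leased-file", O_RDONLY);	/* hypothetical */

		if (fd < 0)
			return 1;

		signal(SIGIO, lease_break);	/* SIGIO is the default lease-break signal */
		fcntl(fd, F_SETLEASE, F_RDLCK);	/* read lease on a read-only descriptor */

		pause();			/* work while holding the lease */

		fcntl(fd, F_SETLEASE, F_UNLCK);
		close(fd);
		return 0;
	}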

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 2/6] fs, mm: pass fd to ->mmap_validate()
  2017-10-12  1:28       ` Dan Williams
  (?)
  (?)
@ 2017-10-12  2:17         ` Dan Williams
  -1 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12  2:17 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, Darrick J. Wong, Linux API, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, linux-fsdevel, Andrew Morton,
	Christoph Hellwig

On Wed, Oct 11, 2017 at 6:28 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Wed, Oct 11, 2017 at 6:21 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> On Wed, Oct 11, 2017 at 05:47:18PM -0700, Dan Williams wrote:
>>> The MAP_DIRECT mechanism for mmap intends to use a file lease to prevent
>>> block map changes while the file is mapped. It requires the fd to setup
>>> an fasync_struct for signalling lease break events to the lease holder.
>>
>> *UGH*
>>
>> That looks like one hell of a bad API.  You are not even guaranteed that
>> the descriptor will still be open by the time you pass it down to your
>> helper, never mind the moment when the event actually happens...
>
> What am I missing, fcntl(F_SETLEASE) seems to follow a similar pattern?

Ugh, so I think the difference with F_SETLEASE is that the lease ends
when the fd is closed. In the mmap case the lease follows the lifetime
of the vma. I'll rethink this interface...

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 2/6] fs, mm: pass fd to ->mmap_validate()
@ 2017-10-12  2:17         ` Dan Williams
  0 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12  2:17 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, Darrick J. Wong, Linux API,
	linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw, Dave Chinner,
	linux-xfs-u79uwXL29TY76Z2rM5mHXA, Linux MM, linux-fsdevel,
	Andrew Morton, Christoph Hellwig

On Wed, Oct 11, 2017 at 6:28 PM, Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> On Wed, Oct 11, 2017 at 6:21 PM, Al Viro <viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org> wrote:
>> On Wed, Oct 11, 2017 at 05:47:18PM -0700, Dan Williams wrote:
>>> The MAP_DIRECT mechanism for mmap intends to use a file lease to prevent
>>> block map changes while the file is mapped. It requires the fd to setup
>>> an fasync_struct for signalling lease break events to the lease holder.
>>
>> *UGH*
>>
>> That looks like one hell of a bad API.  You are not even guaranteed that
>> descriptor will remain be still open by the time you pass it down to your
>> helper, nevermind the moment when event actually happens...
>
> What am I missing, fcntl(F_SETLEASE) seems to follow a similar pattern?

Ugh, so I think the difference with F_SETLEASE is that the lease ends
when the fd is closed. In the mmap case the lease follows the lifetime
of the vma. I'll rethink this interface...

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 2/6] fs, mm: pass fd to ->mmap_validate()
  2017-10-12  2:17         ` Dan Williams
  (?)
  (?)
@ 2017-10-12  3:44           ` Dan Williams
  -1 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12  3:44 UTC (permalink / raw)
  To: Al Viro
  Cc: Jan Kara, Darrick J. Wong, Linux API, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, linux-fsdevel, Andrew Morton,
	Christoph Hellwig

On Wed, Oct 11, 2017 at 7:17 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Wed, Oct 11, 2017 at 6:28 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>> On Wed, Oct 11, 2017 at 6:21 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>> On Wed, Oct 11, 2017 at 05:47:18PM -0700, Dan Williams wrote:
>>>> The MAP_DIRECT mechanism for mmap intends to use a file lease to prevent
>>>> block map changes while the file is mapped. It requires the fd to setup
>>>> an fasync_struct for signalling lease break events to the lease holder.
>>>
>>> *UGH*
>>>
>>> That looks like one hell of a bad API.  You are not even guaranteed that
>>> descriptor will still be open by the time you pass it down to your
>>> helper, never mind the moment when the event actually happens...
>>
>> What am I missing, fcntl(F_SETLEASE) seems to follow a similar pattern?
>
> Ugh, so I think the difference with F_SETLEASE is that the lease ends
> when the fd is closed. In the mmap case the lease follows the lifetime
> of the vma. I'll rethink this interface...

I'm not seeing a lot of good options outside of documenting that if
you close the fd that is registered with MAP_DIRECT you may still get
SIGIO notifications with si_fd set to the stale fd.
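
For context, the existing fcntl lease pattern being compared against looks
roughly like this (standard F_SETLEASE / F_SETSIG API, error handling
omitted); with a non-zero F_SETSIG the SIGIO handler sees si_fd, which is
exactly where a stale descriptor would show up if the registered fd had
already been closed:

#define _GNU_SOURCE
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t broken_fd = -1;

static void lease_break(int sig, siginfo_t *si, void *ctx)
{
	broken_fd = si->si_fd;	/* the fd the lease was registered on */
}

int main(int argc, char **argv)
{
	struct sigaction sa = { .sa_sigaction = lease_break,
				.sa_flags = SA_SIGINFO };
	int fd = open(argv[1], O_RDONLY);

	sigaction(SIGIO, &sa, NULL);
	fcntl(fd, F_SETOWN, getpid());	/* route the signal to this process */
	fcntl(fd, F_SETSIG, SIGIO);	/* non-zero F_SETSIG => siginfo carries si_fd */
	fcntl(fd, F_SETLEASE, F_RDLCK);	/* read lease, broken by an open for write */

	pause();			/* wait for the lease break notification */
	printf("lease broken on fd %d\n", (int)broken_fd);

	fcntl(fd, F_SETLEASE, F_UNLCK);	/* closing fd would also drop the lease */
	close(fd);
	return 0;
}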

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 1/6] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags
  2017-10-12  0:47   ` Dan Williams
  (?)
  (?)
@ 2017-10-12 13:51     ` Jan Kara
  -1 siblings, 0 replies; 116+ messages in thread
From: Jan Kara @ 2017-10-12 13:51 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, Arnd Bergmann, linux-nvdimm, linux-api, linux-xfs,
	linux-mm, Andy Lutomirski, linux-fsdevel, Andrew Morton,
	Linus Torvalds, Christoph Hellwig

Hi,

> diff --git a/mm/mmap.c b/mm/mmap.c
> index 680506faceae..2649c00581a0 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1389,6 +1389,18 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>  		struct inode *inode = file_inode(file);
>  
>  		switch (flags & MAP_TYPE) {
> +		case MAP_SHARED_VALIDATE:
> +			if ((flags & ~LEGACY_MAP_MASK) == 0) {
> +				/*
> +				 * If all legacy mmap flags, downgrade
> +				 * to MAP_SHARED, i.e. invoke ->mmap()
> +				 * instead of ->mmap_validate()
> +				 */
> +				flags &= ~MAP_TYPE;
> +				flags |= MAP_SHARED;
> +			} else if (!file->f_op->mmap_validate)
> +				return -EOPNOTSUPP;
> +			/* fall through */
>  		case MAP_SHARED:
>  			if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
>  				return -EACCES;

When thinking a bit more about this I've realized one problem: currently a
user can call mmap() with the MAP_SHARED type and MAP_SYNC or MAP_DIRECT
flags and they will get the new semantics (if the kernel happens to support
them). I think that is undesirable and we should force usage of
MAP_SHARED_VALIDATE when you want to use flags outside of LEGACY_MAP_MASK.
So I'd just mask off non-legacy flags for MAP_SHARED mappings (so they would
be silently ignored, as they have been until now).
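
To make the difference concrete, a rough userspace sketch; MAP_SHARED_VALIDATE
and MAP_SYNC come from this proposed series, so the values below are
assumptions rather than released uapi constants, and the behaviour shown
assumes a kernel with the series applied:

#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/* Assumed values from the proposed series; not yet in uapi headers. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE	0x3
#endif
#ifndef MAP_SYNC
#define MAP_SYNC		0x80000
#endif

void *map_sync(int fd, size_t len, int enforce)
{
	/*
	 * With plain MAP_SHARED an unknown flag is silently ignored, so
	 * MAP_SYNC becomes a best-effort hint.  With MAP_SHARED_VALIDATE
	 * every flag is checked and an unsupported one fails with
	 * EOPNOTSUPP instead of being dropped.
	 */
	int type = enforce ? MAP_SHARED_VALIDATE : MAP_SHARED;
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       type | MAP_SYNC, fd, 0);

	if (p == MAP_FAILED && errno == EOPNOTSUPP)
		fprintf(stderr, "MAP_SYNC not supported on this file\n");
	return p;
}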

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-12  0:47 ` Dan Williams
  (?)
@ 2017-10-12 14:23   ` Christoph Hellwig
  -1 siblings, 0 replies; 116+ messages in thread
From: Christoph Hellwig @ 2017-10-12 14:23 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-nvdimm, linux-xfs, Jan Kara, Arnd Bergmann,
	Darrick J. Wong, linux-api, Dave Chinner, Christoph Hellwig,
	J. Bruce Fields, linux-mm, Jeff Moyer, Al Viro, Andy Lutomirski,
	Ross Zwisler, linux-fsdevel, Jeff Layton, Linus Torvalds,
	Andrew Morton

Sorry for chiming in so late, been extremely busy lately.

From quickly glancing over what the now finally described use case is
(which contradicts the subject btw - it's not about flushing, it's
about not removing block mapping under a MR) and the previous comments
I think that mmap is simply the wrong kind of interface for this.

What we want is support for a new kind of userspace memory registration in the
RDMA code that uses the pnfs export interface, both getting the block (or
rather byte in this case) mapping, and also getting the FL_LAYOUT lease for the
memory registration.

That btw is exactly what I do for the pNFS RDMA layout, just in-kernel.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 1/6] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags
  2017-10-12 13:51     ` Jan Kara
  (?)
@ 2017-10-12 16:32       ` Linus Torvalds
  -1 siblings, 0 replies; 116+ messages in thread
From: Linus Torvalds @ 2017-10-12 16:32 UTC (permalink / raw)
  To: Jan Kara
  Cc: Arnd Bergmann, linux-nvdimm, Linux API, Christoph Hellwig,
	linux-xfs, linux-mm, Andy Lutomirski, linux-fsdevel,
	Andrew Morton

On Thu, Oct 12, 2017 at 6:51 AM, Jan Kara <jack@suse.cz> wrote:
>
> When thinking a bit more about this I've realized one problem: currently a
> user can call mmap() with the MAP_SHARED type and MAP_SYNC or MAP_DIRECT
> flags and they will get the new semantics (if the kernel happens to support
> them). I think that is undesirable [..]

Why?

If you have a performance preference for MAP_DIRECT or something like
that, but you don't want to *enforce* it, you'd use just plain
MAP_SHARED with it.

I.e. there may well be "I want this to work, possibly with downsides" issues.

So it seems to be a reasonable model, and disallowing it seems to
limit people and not really help anything.

                 Linus

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-12 14:23   ` Christoph Hellwig
  (?)
@ 2017-10-12 17:41     ` Dan Williams
  -1 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-12 17:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: J. Bruce Fields, Jan Kara, Andrew Morton, Arnd Bergmann,
	Darrick J. Wong, Linux API, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, Al Viro, Andy Lutomirski, linux-fsdevel,
	Linus Torvalds, Jeff Layton, Jason Gunthorpe

On Thu, Oct 12, 2017 at 7:23 AM, Christoph Hellwig <hch@lst.de> wrote:
> Sorry for chiming in so late, been extremely busy lately.
>
> From quickly glancing over what the now finally described use case is
> (which contradicts the subject btw - it's not about flushing, it's
> about not removing block mapping under a MR) and the previous comments
> I think that mmap is simply the wrong kind of interface for this.
>
> What we want is support for a new kind of userspace memory registration in the
> RDMA code that uses the pnfs export interface, both getting the block (or
> rather byte in this case) mapping, and also getting the FL_LAYOUT lease for the
> memory registration.
>
> That btw is exactly what I do for the pNFS RDMA layout, just in-kernel.

...and this is exactly my plan.

So, you're jumping into this review at v9 where I've split the patches
that take an initial MAP_DIRECT lease out from the patches that take
FL_LAYOUT leases at memory registration time. You can see a previous
attempt in "[PATCH v8 00/14] MAP_DIRECT for DAX RDMA and userspace
flush" which should be in your inbox.

I'm not proposing mmap as the memory registration interface; it's the
"register for notification of lease break" interface. Here's my
proposed sequence:

addr = mmap(..., MAP_DIRECT.., fd); <- register a vma for "direct"
memory registrations with an FL_LAYOUT lease that at a lease break
event sends SIGIO on the fd used for mmap.

ibv_reg_mr(..., addr, ...); <- check for a valid MAP_DIRECT vma, and
take out another FL_LAYOUT lease. This lease force revokes the RDMA
mapping when it expires, and it relies on the process receiving SIGIO
as the 'break' notification.

fallocate(fd, PUNCH_HOLE...) <- breaks all the FL_LAYOUT leases, the
vma owner gets notified by fd.
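
A hedged userspace sketch of that sequence; MAP_DIRECT and MAP_SHARED_VALIDATE
are the flags proposed by this series (the numeric values below are
illustrative placeholders, not real uapi values), while the ibv_* calls are
the standard libibverbs API. Error handling is omitted:

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <infiniband/verbs.h>

/* Flags proposed by this series; values are placeholders for illustration,
 * they are not in any released uapi header. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE	0x3
#endif
#ifndef MAP_DIRECT
#define MAP_DIRECT		0x08
#endif

struct ibv_mr *register_dax_mr(struct ibv_pd *pd, const char *path, size_t len)
{
	/* fd is deliberately kept open: per the discussion above, closing it
	 * before munmap() can leave SIGIO notifications carrying a stale fd */
	int fd = open(path, O_RDWR);

	/* 1) MAP_DIRECT registration: the vma gets an FL_LAYOUT lease, and
	 *    lease break events are signalled as SIGIO on this fd */
	void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_SHARED_VALIDATE | MAP_DIRECT, fd, 0);

	/* 2) memory registration: per the proposal this checks for a valid
	 *    MAP_DIRECT vma and takes another FL_LAYOUT lease */
	struct ibv_mr *mr = ibv_reg_mr(pd, addr, len,
				       IBV_ACCESS_LOCAL_WRITE |
				       IBV_ACCESS_REMOTE_WRITE);

	/* 3) a later fallocate(fd, FALLOC_FL_PUNCH_HOLE, ...) or truncate
	 *    elsewhere breaks the leases and the vma owner gets SIGIO */
	return mr;
}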

Al rightly points out that the fd may be closed by the time the event
fires, since the lease follows the vma lifetime. I see two ways to
solve this: document that the process may get notifications on a stale
fd if close() happens before munmap(), or, similar to how we call
locks_remove_posix() in filp_close(), add a routine to disable any
lease notifiers on close(). I'll investigate the second option because
this seems to be a general problem with leases.

For RDMA I am presently re-working the implementation [1]. Inspired by
a discussion with Jason [2], I am going to add something like
ib_umem_ops to allow drivers to override the default policy of what
happens on a lease that expires. The default action is to invalidate
device access to the memory with iommu_unmap(), but I want to allow
for drivers to do something smarter or choose to not support DAX
mappings at all.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2017-October/012785.html
[2]: https://lists.01.org/pipermail/linux-nvdimm/2017-October/012793.html

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-12 17:41     ` Dan Williams
  (?)
@ 2017-10-13  6:57       ` Christoph Hellwig
  -1 siblings, 0 replies; 116+ messages in thread
From: Christoph Hellwig @ 2017-10-13  6:57 UTC (permalink / raw)
  To: Dan Williams
  Cc: J. Bruce Fields, Jan Kara, Andrew Morton, Arnd Bergmann,
	Darrick J. Wong, Linux API, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, Al Viro, Andy Lutomirski, Jeff Layton,
	linux-fsdevel, Linus Torvalds, Christoph Hellwig,
	Jason Gunthorpe

On Thu, Oct 12, 2017 at 10:41:39AM -0700, Dan Williams wrote:
> So, you're jumping into this review at v9 where I've split the patches
> that take an initial MAP_DIRECT lease out from the patches that take
> FL_LAYOUT leases at memory registration time. You can see a previous
> attempt in "[PATCH v8 00/14] MAP_DIRECT for DAX RDMA and userspace
> flush" which should be in your inbox.

The point is that your problem has absolutely nothing to do with mmap,
and all with get_user_pages.

get_user_pages on DAX doesn't give the same guarantees as on pagecache
or anonymous memory, and that is the problem we need to fix.  In fact
I'm pretty sure if we try hard enough (and we might have to try
very hard) we can see the same problem with plain direct I/O and without
any RDMA involved, e.g. do a larger direct I/O write to memory that is
mmap()ed from a DAX file, then truncate the DAX file and reallocate
the blocks, and we might corrupt that new file.  We'll probably need
a special setup where there is little other chance but to reallocate
those used blocks.
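
A rough, timing-dependent sketch of that scenario; the paths are assumptions
(an fsdax mount at /mnt/dax and a regular filesystem at /mnt/scratch with a
pre-existing source file), and whether corruption actually triggers depends
on the filesystem reallocating the freed blocks while the I/O is in flight,
so treat it purely as an illustration:

/* cc -O2 -pthread dax-gup-race.c */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>

#define LEN	(16UL << 20)

static int dax_fd;

static void *punch(void *arg)
{
	/* free the DAX blocks while the direct I/O may still be in flight,
	 * then let the filesystem hand them out to someone else */
	ftruncate(dax_fd, 0);
	ftruncate(dax_fd, LEN);
	return NULL;
}

int main(void)
{
	int src = open("/mnt/scratch/src", O_RDONLY | O_DIRECT);	/* assumed path */
	pthread_t t;

	dax_fd = open("/mnt/dax/buf", O_RDWR | O_CREAT, 0644);		/* assumed fsdax mount */
	ftruncate(dax_fd, LEN);

	/* buffer backed directly by the DAX file's blocks */
	void *buf = mmap(NULL, LEN, PROT_READ | PROT_WRITE, MAP_SHARED,
			 dax_fd, 0);

	pthread_create(&t, NULL, punch, NULL);

	/* "direct I/O write to memory": get_user_pages pins the DAX pages and
	 * the device DMAs file data into blocks that may no longer belong to
	 * dax_fd's file by the time the I/O completes */
	pread(src, buf, LEN, 0);

	pthread_join(t, NULL);
	return 0;
}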

So what we need to do first is to fix get_user_pages vs unmapping
DAX mmap()ed blocks, be that from a hole punch, truncate, COW
operation, etc.

Then we need to look into the special case of a long-living non-transient
get_user_pages that RDMA does - we can't just reject any truncate or
other operation for that, so that's where something like my layout
lease suggestion comes into play - but the call that should get the
lease is not the mmap - it's the memory registration call that does
the get_user_pages.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-13  6:57       ` Christoph Hellwig
  (?)
@ 2017-10-13 15:14         ` Dan Williams
  -1 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-13 15:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-nvdimm, linux-xfs, Jan Kara, Arnd Bergmann,
	Darrick J. Wong, Linux API, Dave Chinner, J. Bruce Fields,
	Linux MM, Jeff Moyer, Al Viro, Andy Lutomirski, Ross Zwisler,
	linux-fsdevel, Jeff Layton, Linus Torvalds, Andrew Morton,
	Jason Gunthorpe

On Thu, Oct 12, 2017 at 11:57 PM, Christoph Hellwig <hch@lst.de> wrote:
> On Thu, Oct 12, 2017 at 10:41:39AM -0700, Dan Williams wrote:
>> So, you're jumping into this review at v9 where I've split the patches
>> that take an initial MAP_DIRECT lease out from the patches that take
>> FL_LAYOUT leases at memory registration time. You can see a previous
>> attempt in "[PATCH v8 00/14] MAP_DIRECT for DAX RDMA and userspace
>> flush" which should be in your inbox.
>
> The point is that your problem has absolutely nothing to do with mmap,
> and all with get_user_pages.
>
> get_user_pages on DAX doesn't give the same guarantees as on pagecache
> or anonymous memory, and that is the problem we need to fix.  In fact
> I'm pretty sure if we try hard enough (and we might have to try
> very hard) we can see the same problem with plain direct I/O and without
> any RDMA involved, e.g. do a larger direct I/O write to memory that is
> mmap()ed from a DAX file, then truncate the DAX file and reallocate
> the blocks, and we might corrupt that new file.  We'll probably need
> a special setup where there is little other chance but to reallocate
> those used blocks.

I'll take a harder look at this...

> So what we need to do first is to fix get_user_pages vs unmapping
> DAX mmap()ed blocks, be that from a hole punch, truncate, COW
> operation, etc.
>
> Then we need to look into the special case of a long-living non-transient
> get_user_pages that RDMA does - we can't just reject any truncate or
> other operation for that, so that's where something like my layout
> lease suggestion comes into play - but the call that should get the
> lease is not the mmap - it's the memory registration call that does
> the get_user_pages.

Yes, mmap is not the place to get the lease for a later
get_user_pages, and my patches do take an additional lease at
get_user_pages / MR init time. However, the mmap call has the
file descriptor for SIGIO; the MR-init call does not. If we delay all
of the setup to MR time then we need to invent a notification
scheme specific to RDMA, which seems like a waste to me when we can
generically signal an event on the fd for any event that affects any
of the vmas on the file. The FL_LAYOUT lease impacts the entire file,
so as far as I can see delaying the notification until MR-init is too
late, too granular, and too RDMA specific.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-13 15:14         ` Dan Williams
  (?)
  (?)
@ 2017-10-13 16:38           ` Jason Gunthorpe
  -1 siblings, 0 replies; 116+ messages in thread
From: Jason Gunthorpe @ 2017-10-13 16:38 UTC (permalink / raw)
  To: Dan Williams
  Cc: J. Bruce Fields, Jan Kara, Andrew Morton, Arnd Bergmann,
	Darrick J. Wong, Linux API, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, Al Viro, Andy Lutomirski, Jeff Layton,
	linux-fsdevel, Linus Torvalds, Christoph Hellwig

On Fri, Oct 13, 2017 at 08:14:55AM -0700, Dan Williams wrote:

> scheme specific to RDMA which seems like a waste to me when we can
> generically signal an event on the fd for any event that effects any
> of the vma's on the file. The FL_LAYOUT lease impacts the entire file,
> so as far as I can see delaying the notification until MR-init is too
> late, too granular, and too RDMA specific.

But for RDMA a FD is not what we care about - we want the MR handle so
the app knows which MR needs fixing.

Jason

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-13 16:38           ` Jason Gunthorpe
@ 2017-10-13 17:01             ` Dan Williams
  -1 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-13 17:01 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: J. Bruce Fields, Jan Kara, Andrew Morton, Arnd Bergmann,
	Darrick J. Wong, Linux API, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, Al Viro, Andy Lutomirski, Jeff Layton,
	linux-fsdevel, Linus Torvalds, Christoph Hellwig

On Fri, Oct 13, 2017 at 9:38 AM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Fri, Oct 13, 2017 at 08:14:55AM -0700, Dan Williams wrote:
>
>> scheme specific to RDMA which seems like a waste to me when we can
>> generically signal an event on the fd for any event that effects any
>> of the vma's on the file. The FL_LAYOUT lease impacts the entire file,
>> so as far as I can see delaying the notification until MR-init is too
>> late, too granular, and too RDMA specific.
>
> But for RDMA a FD is not what we care about - we want the MR handle so
> the app knows which MR needs fixing.

I'd rather put the onus on userspace to remember where it used a
MAP_DIRECT mapping and be aware that all the mappings of that file are
subject to a lease break. Sure, we could build up a pile of kernel
infrastructure to notify on a per-MR basis, but I think that would
only be worth it if leases were range based. As it is, the entire file
is covered by a lease instance and all MRs that might reference that
file get one notification. That said, we can always arrange for a
per-driver callback at lease-break time so that it can do something
above and beyond the default notification.
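
To make that model concrete, here is a minimal userspace sketch. It assumes
MAP_DIRECT is requested together with MAP_SHARED_VALIDATE and that the lease
break is delivered as a signal on the mapped file's fd; the flag value and
the delivery mechanism shown are illustrative assumptions, not definitions
from this series:

/*
 * Sketch only: userspace remembers which file it mapped with MAP_DIRECT
 * and treats one notification on that fd as "every mapping of this file
 * may have lost its block-map guarantee".
 */
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_DIRECT
#define MAP_DIRECT 0x08     /* hypothetical value, for illustration only */
#endif

static volatile sig_atomic_t lease_broken;

static void on_lease_break(int sig)
{
    lease_broken = 1;
}

int main(void)
{
    size_t len = 2UL << 20;
    int fd = open("/mnt/dax/data", O_RDWR);

    if (fd < 0)
        return 1;

    signal(SIGIO, on_lease_break);  /* assumed notification channel */
    fcntl(fd, F_SETOWN, getpid());

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED_VALIDATE | MAP_DIRECT, fd, 0);
    if (p == MAP_FAILED)
        return 1;

    /* ... flush from userspace, or hand p to an RDMA library ... */

    if (lease_broken) {
        /* all mappings of this fd are suspect; fall back to msync */
        msync(p, len, MS_SYNC);
    }
    munmap(p, len);
    close(fd);
    return 0;
}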

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-13 17:01             ` Dan Williams
@ 2017-10-13 17:31               ` Jason Gunthorpe
  -1 siblings, 0 replies; 116+ messages in thread
From: Jason Gunthorpe @ 2017-10-13 17:31 UTC (permalink / raw)
  To: Dan Williams
  Cc: J. Bruce Fields, Jan Kara, Andrew Morton, Arnd Bergmann,
	Darrick J. Wong, Linux API, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, Al Viro, Andy Lutomirski, Jeff Layton,
	linux-fsdevel, Linus Torvalds, Christoph Hellwig

On Fri, Oct 13, 2017 at 10:01:04AM -0700, Dan Williams wrote:
> On Fri, Oct 13, 2017 at 9:38 AM, Jason Gunthorpe
> <jgunthorpe@obsidianresearch.com> wrote:
> > On Fri, Oct 13, 2017 at 08:14:55AM -0700, Dan Williams wrote:
> >
> >> scheme specific to RDMA which seems like a waste to me when we can
> >> generically signal an event on the fd for any event that effects any
> >> of the vma's on the file. The FL_LAYOUT lease impacts the entire file,
> >> so as far as I can see delaying the notification until MR-init is too
> >> late, too granular, and too RDMA specific.
> >
> > But for RDMA a FD is not what we care about - we want the MR handle so
> > the app knows which MR needs fixing.
> 
> I'd rather put the onus on userspace to remember where it used a
> MAP_DIRECT mapping and be aware that all the mappings of that file are
> subject to a lease break. Sure, we could build up a pile of kernel
> infrastructure to notify on a per-MR basis, but I think that would
> only be worth it if leases were range based. As it is, the entire file
> is covered by a lease instance and all MRs that might reference that
> file get one notification. That said, we can always arrange for a
> per-driver callback at lease-break time so that it can do something
> above and beyond the default notification.

I don't think that really represents how lots of apps actually use
RDMA.

RDMA is often buried down in the software stack (eg in a MPI), and by
the time a mapping gets used for RDMA transfer the link between the
FD, mmap and the MR is totally opaque.

Having a MR specific notification means the low level RDMA libraries
have a chance to deal with everything for the app.

Eg consider a HPC app using MPI that uses some DAX aware library to
get DAX backed mmap's. It then passes memory in those mmaps to the
MPI library to do transfers. The MPI creates the MR on demand.

So, who should be responsible for MR coherency? Today we say the MPI
is responsible. But we can't really expect the MPI
to hook SIGIO and somehow try to reverse engineer what MRs are
impacted from a FD that may not even still be open.

I think, if you want to build a uAPI for notification of MR lease
break, then you need to show how it fits into the above software model:
 - How it can be hidden in a RDMA specific library
 - How lease break can be done hitlessly, so the library user never
   needs to know it is happening or see failed/missed transfers
 - Whatever fast path checking is needed does not kill performance

Jason

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-13 17:31               ` Jason Gunthorpe
@ 2017-10-13 18:22                 ` Dan Williams
  -1 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-13 18:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: J. Bruce Fields, Jan Kara, Andrew Morton, Arnd Bergmann,
	Darrick J. Wong, Linux API, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, Al Viro, Andy Lutomirski, Jeff Layton,
	linux-fsdevel, Linus Torvalds, Christoph Hellwig

On Fri, Oct 13, 2017 at 10:31 AM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Fri, Oct 13, 2017 at 10:01:04AM -0700, Dan Williams wrote:
>> On Fri, Oct 13, 2017 at 9:38 AM, Jason Gunthorpe
>> <jgunthorpe@obsidianresearch.com> wrote:
>> > On Fri, Oct 13, 2017 at 08:14:55AM -0700, Dan Williams wrote:
>> >
>> >> scheme specific to RDMA which seems like a waste to me when we can
>> >> generically signal an event on the fd for any event that effects any
>> >> of the vma's on the file. The FL_LAYOUT lease impacts the entire file,
>> >> so as far as I can see delaying the notification until MR-init is too
>> >> late, too granular, and too RDMA specific.
>> >
>> > But for RDMA a FD is not what we care about - we want the MR handle so
>> > the app knows which MR needs fixing.
>>
>> I'd rather put the onus on userspace to remember where it used a
>> MAP_DIRECT mapping and be aware that all the mappings of that file are
>> subject to a lease break. Sure, we could build up a pile of kernel
>> infrastructure to notify on a per-MR basis, but I think that would
>> only be worth it if leases were range based. As it is, the entire file
>> is covered by a lease instance and all MRs that might reference that
>> file get one notification. That said, we can always arrange for a
>> per-driver callback at lease-break time so that it can do something
>> above and beyond the default notification.
>
> I don't think that really represents how lots of apps actually use
> RDMA.
>
> RDMA is often buried down in the software stack (eg in a MPI), and by
> the time a mapping gets used for RDMA transfer the link between the
> FD, mmap and the MR is totally opaque.
>
> Having a MR specific notification means the low level RDMA libraries
> have a chance to deal with everything for the app.
>
> Eg consider a HPC app using MPI that uses some DAX aware library to
> get DAX backed mmap's. It then passes memory in those mmaps to the
> MPI library to do transfers. The MPI creates the MR on demand.
>
> So, who should be responsible for MR coherency? Today we say the MPI
> is responsible. But we can't really expect the MPI
> to hook SIGIO and somehow try to reverse engineer what MRs are
> impacted from a FD that may not even still be open.

Ok, that's good insight that I didn't have. Userspace needs more help
than just an fd notification.

> I think, if you want to build a uAPI for notification of MR lease
> break, then you need to show how it fits into the above software model:
>  - How it can be hidden in a RDMA specific library

So, here's a strawman: can ibv_poll_cq() start returning ibv_wc_status
== IBV_WC_LOC_PROT_ERR when file coherency is lost? This would make
the solution generic across DAX and non-DAX. What's your feeling for
how well applications are prepared to deal with that status return?
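
To sketch what that would look like from the application (or low level
library) side, the completion loop below uses only existing libibverbs
calls; the single assumption is that outstanding work requests against
the de-coherent MR would be flushed with that status:

#include <infiniband/verbs.h>
#include <stdio.h>

/*
 * Existing libibverbs API; the only assumed new behavior is that work
 * requests touching a de-coherent MR complete with IBV_WC_LOC_PROT_ERR.
 */
static void drain_cq(struct ibv_cq *cq)
{
    struct ibv_wc wc;

    while (ibv_poll_cq(cq, 1, &wc) > 0) {
        if (wc.status == IBV_WC_LOC_PROT_ERR) {
            /* file coherency lost: the MR behind this WR needs
             * to be re-registered before the transfer is retried */
            fprintf(stderr, "wr %llu: %s\n",
                    (unsigned long long)wc.wr_id,
                    ibv_wc_status_str(wc.status));
            continue;
        }
        /* normal completion handling ... */
    }
}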

>  - How lease break can be done hitlessly, so the library user never
>    needs to know it is happening or see failed/missed transfers

iommu redirect should be hitless and behave like the page cache case
where RDMA targets pages that are no longer part of the file.

>  - Whatever fast path checking is needed does not kill performance

What do you consider a fast path? I was assuming that memory
registration is a slow path, and iommu operations are asynchronous so
should not impact performance of ongoing operations beyond typical
iommu overhead.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-13 18:22                 ` Dan Williams
@ 2017-10-14  1:57                   ` Jason Gunthorpe
  -1 siblings, 0 replies; 116+ messages in thread
From: Jason Gunthorpe @ 2017-10-14  1:57 UTC (permalink / raw)
  To: Dan Williams
  Cc: J. Bruce Fields, Jan Kara, Andrew Morton, Arnd Bergmann,
	Darrick J. Wong, Linux API, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, Al Viro, Andy Lutomirski, Jeff Layton,
	linux-fsdevel, Linus Torvalds, Christoph Hellwig

On Fri, Oct 13, 2017 at 11:22:21AM -0700, Dan Williams wrote:
> > So, who should be responsible for MR coherency? Today we say the MPI
> > is responsible. But we can't really expect the MPI
> > to hook SIGIO and somehow try to reverse engineer what MRs are
> > impacted from a FD that may not even still be open.
> 
> Ok, that's good insight that I didn't have. Userspace needs more help
> than just an fd notification.

Glad to help!

> > I think, if you want to build a uAPI for notification of MR lease
> > break, then you need to show how it fits into the above software model:
> >  - How it can be hidden in a RDMA specific library
> 
> So, here's a strawman: can ibv_poll_cq() start returning ibv_wc_status
> == IBV_WC_LOC_PROT_ERR when file coherency is lost? This would make
> the solution generic across DAX and non-DAX. What's your feeling for
> how well applications are prepared to deal with that status return?

Stuffing an entry into the CQ is difficult. The CQ is in user memory
and it is DMA'd from the HCA for several pieces of hardware, so the
kernel can't just stuff something in there. It can be done
with HW support by having the HCA DMA it via an exception path or
something, but even then, you run into questions like CQ overflow and
accounting issues since it is not meant for this.

So, you need a side channel of some kind, either in certain drivers or
generically..

> >  - How lease break can be done hitlessly, so the library user never
> >    needs to know it is happening or see failed/missed transfers
> 
> iommu redirect should be hitless and behave like the page cache case
> where RDMA targets pages that are no longer part of the file.

Yes, if the iommu can be fenced properly it sounds doable.

> >  - Whatever fast path checking is needed does not kill performance
> 
> What do you consider a fast path? I was assuming that memory
> registration is a slow path, and iommu operations are asynchronous so
> should not impact performance of ongoing operations beyond typical
> iommu overhead.

ibv_poll_cq() and ibv_post_send() would be a fast path.

Where this struggled before is that once you create a side channel you
also now have to check that side channel, and checking it at high
performance is quite hard.. Even quiescing things to be able to tear
down the MR
has performance implications on post send...

Now that I see this whole thing in this light it seems very similar
to the MPI-driven user space mmu notification ideas and has similar
challenges. FWIW, RDMA banged its head on this issue for 10 years and
it was ODP that emerged as the solution.

One option might be to use an async event notification 'MR
de-coherence' and rely on a main polling loop to catch it.

This is good enough for DAX because the lease requestor would wait
until the async event was processed. It would also be acceptable for
the general MPI case too, but only if this lease concept was wider
than just DAX, eg an MR leases a piece of a VMA, and if anything
changes that VMA (eg munmap, mmap, mremap, etc) then it has to wait
for the MR to release the lease. ie munmap would block until the
async event is processed. ODP-light in userspace, essentially.

IIRC this sort of suggestion was never explored, something like:

poll(fd)
event = ibv_read_async_event(fd)
if (event == MR_DECOHERENCE) {
    quiesce_network();   /* app/library stops posting to QPs that use the MR */
    ibv_restore_mr(mr);  /* proposed new verb: re-pin the MR over the new pages */
    restore_network();
}

The implementation of ibv_restore_mr would have to make a new MR that
pointed to the same virtual memory addresses, but was backed by the
*new* physical pages. This means it has to unblock the lease, and wait
for the lease requestor to complete executing.
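
Absent such a verb, the closest a library could get with today's verbs
is to drop the stale registration and re-register the same virtual
range once the new pages are in place, as sketched below. Note the new
MR comes back with new lkey/rkey values that would have to be
redistributed, which is part of why a dedicated restore verb matters
for doing this hitlessly:

#include <infiniband/verbs.h>

/*
 * Rough stand-in for the proposed ibv_restore_mr() using existing
 * verbs.  Caller must already have quiesced every QP that references
 * 'mr', and must push the new lkey/rkey back out afterwards.
 */
static struct ibv_mr *restore_mr(struct ibv_pd *pd, struct ibv_mr *mr)
{
    void *addr = mr->addr;
    size_t length = mr->length;

    if (ibv_dereg_mr(mr))
        return NULL;

    return ibv_reg_mr(pd, addr, length,
                      IBV_ACCESS_LOCAL_WRITE |
                      IBV_ACCESS_REMOTE_READ |
                      IBV_ACCESS_REMOTE_WRITE);
}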

Jason

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-13 16:38           ` Jason Gunthorpe
@ 2017-10-16  7:22             ` Christoph Hellwig
  -1 siblings, 0 replies; 116+ messages in thread
From: Christoph Hellwig @ 2017-10-16  7:22 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: J. Bruce Fields, Jan Kara, Andrew Morton, Arnd Bergmann,
	linux-nvdimm, Linux API, Darrick J. Wong, Dave Chinner,
	linux-xfs, Linux MM, Jeff Layton, Al Viro, Andy Lutomirski,
	linux-fsdevel, Linus Torvalds, Christoph Hellwig

On Fri, Oct 13, 2017 at 10:38:22AM -0600, Jason Gunthorpe wrote:
> > scheme specific to RDMA which seems like a waste to me when we can
> > generically signal an event on the fd for any event that effects any
> > of the vma's on the file. The FL_LAYOUT lease impacts the entire file,
> > so as far as I can see delaying the notification until MR-init is too
> > late, too granular, and too RDMA specific.
> 
> But for RDMA a FD is not what we care about - we want the MR handle so
> the app knows which MR needs fixing.

Yes.  Although the fd for the ibX device might be a good handle to
transport that information, unlike the fd for the mapped file.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-13 17:31               ` Jason Gunthorpe
  (?)
@ 2017-10-16  7:26                 ` Christoph Hellwig
  -1 siblings, 0 replies; 116+ messages in thread
From: Christoph Hellwig @ 2017-10-16  7:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Dan Williams, Christoph Hellwig, linux-nvdimm, linux-xfs,
	Jan Kara, Arnd Bergmann, Darrick J. Wong, Linux API,
	Dave Chinner, J. Bruce Fields, Linux MM, Jeff Moyer, Al Viro,
	Andy Lutomirski, Ross Zwisler, linux-fsdevel, Jeff Layton,
	Linus Torvalds, Andrew Morton

On Fri, Oct 13, 2017 at 11:31:45AM -0600, Jason Gunthorpe wrote:
> I don't think that really represents how lots of apps actually use
> RDMA.
> 
> RDMA is often buried down in the software stack (eg in a MPI), and by
> the time a mapping gets used for RDMA transfer the link between the
> FD, mmap and the MR is totally opaque.
> 
> Having a MR specific notification means the low level RDMA libraries
> have a chance to deal with everything for the app.
> 
> Eg consider a HPC app using MPI that uses some DAX aware library to
> get DAX backed mmap's. It then passes memory in those mmaps to the
> MPI library to do transfers. The MPI creates the MR on demand.
> 

I suspect one of the more interesting use cases might be a file server,
for which that's not the case.  But otherwise I agree with the above,
and also think that notifying the MR handle is the only way to go for
another very important reason:  fencing.  What if the application/library
does not react to the notification?  With a per-MR notification we
can unregister the MR in kernel space and have a rock solid fencing
mechanism.  And that is the most important bit here.


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-13 18:22                 ` Dan Williams
  (?)
@ 2017-10-16  7:30                   ` Christoph Hellwig
  -1 siblings, 0 replies; 116+ messages in thread
From: Christoph Hellwig @ 2017-10-16  7:30 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jason Gunthorpe, Christoph Hellwig, linux-nvdimm, linux-xfs,
	Jan Kara, Arnd Bergmann, Darrick J. Wong, Linux API,
	Dave Chinner, J. Bruce Fields, Linux MM, Jeff Moyer, Al Viro,
	Andy Lutomirski, Ross Zwisler, linux-fsdevel, Jeff Layton,
	Linus Torvalds, Andrew Morton

On Fri, Oct 13, 2017 at 11:22:21AM -0700, Dan Williams wrote:
> So, here's a strawman: can ibv_poll_cq() start returning ibv_wc_status
> == IBV_WC_LOC_PROT_ERR when file coherency is lost? This would make
> the solution generic across DAX and non-DAX. What's your feeling for
> how well applications are prepared to deal with that status return?

The problem isn't local protection errors, but remote protection errors
when we modify an MR with an rkey that the remote side accesses.

> >  - How lease break can be done hitlessly, so the library user never
> >    needs to know it is happening or see failed/missed transfers
> 
> iommu redirect should be hitless and behave like the page cache case
> where RDMA targets pages that are no longer part of the file.

But systems that care about performance (e.g. the usual RDMA users) usually
don't use an IOMMU due to the performance impact.  Especially as HCAs
already have their own built-in IOMMUs (aka the MR mechanism).

Note that file systems already have a mechanism like the one you mention
above to keep extents that are busy from being reallocated.  E.g. take a
look at fs/xfs/xfs_extent_busy.c.  The downside is that this could lock down
a massive amount of space in the busy list if we for example have an MR
covering a huge file that is truncated down.  So even if we wanted that
scheme we'd need some sort of ulimit for the number of DAX pages locked
down in get_user_pages.
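
For what it's worth, a minimal sketch of what Dan's strawman would look
like to an application polling its completion queue.  Everything below is
existing libibverbs API; the only new behaviour would be the kernel
forcing such completions when file coherency is lost:

#include <stdio.h>
#include <infiniband/verbs.h>

/* Drain the CQ and report failed work requests; under the strawman a
 * revoked file mapping would surface here as a protection error on the
 * work request that touched the affected MR. */
static void drain_cq(struct ibv_cq *cq)
{
	struct ibv_wc wc;

	while (ibv_poll_cq(cq, 1, &wc) > 0) {
		if (wc.status == IBV_WC_SUCCESS)
			continue;
		fprintf(stderr, "wr %llu failed: %s\n",
			(unsigned long long)wc.wr_id,
			ibv_wc_status_str(wc.status));
		if (wc.status == IBV_WC_LOC_PROT_ERR ||
		    wc.status == IBV_WC_REM_ACCESS_ERR) {
			/* registration no longer valid: kick off whatever
			 * re-registration / reconnect recovery the
			 * application implements */
		}
	}
}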


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 1/6] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags
  2017-10-12 16:32       ` Linus Torvalds
  (?)
@ 2017-10-16  7:38         ` Christoph Hellwig
  -1 siblings, 0 replies; 116+ messages in thread
From: Christoph Hellwig @ 2017-10-16  7:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jan Kara, Arnd Bergmann, linux-nvdimm, Linux API,
	Christoph Hellwig, linux-xfs, linux-mm, Andy Lutomirski,
	linux-fsdevel, Andrew Morton

On Thu, Oct 12, 2017 at 09:32:17AM -0700, Linus Torvalds wrote:
> On Thu, Oct 12, 2017 at 6:51 AM, Jan Kara <jack@suse.cz> wrote:
> >
> > When thinking a bit more about this I've realized one problem: Currently
> > user can call mmap() with MAP_SHARED type and MAP_SYNC or MAP_DIRECT flags
> > and he will get the new semantics (if the kernel happens to support it).  I
> > think that is undesirable [..]
> 
> Why?
> 
> If you have a performance preference for MAP_DIRECT or something like
> that, but you don't want to *enforce* it, you'd use just plain
> MAP_SHARED with it.
> 
> Ie there may well be "I want this to work, possibly with downsides" issues.
> 
> So it seems to be a reasonable model, and disallowing it seems to
> limit people and not really help anything.

I don't think it matters for MAP_DIRECT (and I think we shouldn't have
MAP_DIRECT to start with, see the discussions later in the thread).

But for the main use case, MAP_SYNC, you really want a hard error when you
don't get it.  And while we could tell people that they should only use
MAP_SYNC with MAP_SHARED_VALIDATE instead of MAP_SHARED, the chances that
they get it wrong are extremely high.  On the other hand, if you really only
want the flag as an optimization, calling mmap twice is very little overhead
and a very good documentation of your intent:

	addr = mmap(...., MAP_SHARED_VALIDATE | MAP_DIRECT, ...);
	if (addr == MAP_FAILED && errno == EOPNOTSUPP) {
		/* MAP_DIRECT didn't work, we'll just cope using blah, blah */
		addr = mmap(...., MAP_SHARED, ...);
	}
	if (addr == MAP_FAILED)
		goto handle_error;

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 1/6] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags
  2017-10-12 16:32       ` Linus Torvalds
  (?)
@ 2017-10-16  7:56         ` Jan Kara
  -1 siblings, 0 replies; 116+ messages in thread
From: Jan Kara @ 2017-10-16  7:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jan Kara, Arnd Bergmann, linux-nvdimm, Linux API,
	Christoph Hellwig, linux-xfs, linux-mm, Andy Lutomirski,
	linux-fsdevel, Andrew Morton

On Thu 12-10-17 09:32:17, Linus Torvalds wrote:
> On Thu, Oct 12, 2017 at 6:51 AM, Jan Kara <jack@suse.cz> wrote:
> >
> > When thinking a bit more about this I've realized one problem: Currently
> > user can call mmap() with MAP_SHARED type and MAP_SYNC or MAP_DIRECT flags
> > and he will get the new semantics (if the kernel happens to support it).  I
> > think that is undesirable [..]
> 
> Why?
> 
> If you have a performance preference for MAP_DIRECT or something like
> that, but you don't want to *enforce* it, you'd use just plain
> MAP_SHARED with it.
> 
> Ie there may well be "I want this to work, possibly with downsides" issues.
> 
> So it seems to be a reasonable model, and disallowing it seems to
> limit people and not really help anything.

I have two concerns:

1) IMHO it supports sloppy programming from userspace - if an application
asks e.g. for MAP_DIRECT and doesn't know whether it gets it or not, it
has to be very careful not to assume anything about that in its code. And
frankly I think the most likely scenario is that a programmer will just use
MAP_SHARED | MAP_DIRECT, *assume* he will get the MAP_DIRECT semantics if
the call does not fail, and then complain when his application breaks.

2) In theory there could be an application that inadvertently sets some high
flag bits and would now get confused by getting different mmap(2)
semantics. But I agree this is mostly theoretical.

Overall I think the benefit of being able to say "do MAP_DIRECT if you can"
does not outweigh the risk of bugs in userspace applications. Especially
since userspace can easily implement the same semantics by retrying the
mmap(2) call without MAP_SHARED_VALIDATE | MAP_DIRECT.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-14  1:57                   ` Jason Gunthorpe
@ 2017-10-16 12:02                     ` Sagi Grimberg
  -1 siblings, 0 replies; 116+ messages in thread
From: Sagi Grimberg @ 2017-10-16 12:02 UTC (permalink / raw)
  To: Jason Gunthorpe, Dan Williams
  Cc: J. Bruce Fields, Jan Kara, Andrew Morton, Arnd Bergmann,
	Darrick J. Wong, Linux API, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, Al Viro, Andy Lutomirski, Jeff Layton,
	linux-fsdevel, Linus Torvalds, Christoph Hellwig


Hey folks, (chiming in very late here...)

>>> I think, if you want to build a uAPI for notification of MR lease
>>> break, then you need to show how it fits into the above software model:
>>>   - How it can be hidden in a RDMA specific library
>>
>> So, here's a strawman: can ibv_poll_cq() start returning ibv_wc_status
>> == IBV_WC_LOC_PROT_ERR when file coherency is lost? This would make
>> the solution generic across DAX and non-DAX. What's your feeling for
>> how well applications are prepared to deal with that status return?
> 
> Stuffing an entry into the CQ is difficult. The CQ is in user memory
> and it is DMA'd from the HCA for several pieces of hardware, so the
> kernel can't just stuff something in there. It can be done
> with HW support by having the HCA DMA it via an exception path or
> something, but even then, you run into questions like CQ overflow and
> accounting issues since it is not meant for this.

But why should the kernel ever need to mangle the CQ? If a lease break
deregisters the MR, the device is expected to generate remote
protection errors on its own.

And in that case, I think we need a query mechanism rather than an event
mechanism, so that when the application starts seeing protection errors
it can query the relevant MR (I think most if not all devices have that
information in their internal completion queue entries).

> 
> So, you need a side channel of some kind, either in certain drivers or
> generically..
> 
>>>   - How lease break can be done hitlessly, so the library user never
>>>     needs to know it is happening or see failed/missed transfers

I agree that the application should not be aware of lease breakage, but
seeing failed transfers is perfectly acceptable given that an access
violation is happening (my assumption is that failed transfers are error
completions reported in the user completion queue). What we need is
a framework to help user-space recover sanely, which is to query
which MR had the access violation, restore it, and re-establish the queue
pair.

>>
>> iommu redirect should be hitless and behave like the page cache case
>> where RDMA targets pages that are no longer part of the file.
> 
> Yes, if the iommu can be fenced properly it sounds doable.
> 
>>>   - Whatever fast path checking is needed does not kill performance
>>
>> What do you consider a fast path? I was assuming that memory
>> registration is a slow path, and iommu operations are asynchronous so
>> should not impact performance of ongoing operations beyond typical
>> iommu overhead.
> 
> ibv_poll_cq() and ibv_post_send() would be a fast path.
> 
> Where this struggled before is in creating a side channel you also now
> have to check that side channel, and checking it at high performance
> is quite hard.. Even quiescing things to be able to tear down the MR
> has performance implications on post send...

This is exactly why I think we should not have it, but instead give
building blocks to recover sanely from error completions...
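
A rough sketch of the re-registration building block being argued for,
using only existing verbs calls.  The per-MR "which registration was
revoked" query does not exist today, so this assumes the application can
map a failed wr_id back to the MR it posted with; re-establishing the
queue pair afterwards (disconnect/reconnect, or driving the QP back
through INIT/RTR/RTS) is elided:

#include <infiniband/verbs.h>

/* Drop a revoked registration and register the same virtual range
 * again; after a lease break that range is backed by whatever pages
 * the filesystem left in place. */
static struct ibv_mr *rebuild_mr(struct ibv_pd *pd, struct ibv_mr *old,
				 void *addr, size_t len)
{
	if (old)
		ibv_dereg_mr(old);

	return ibv_reg_mr(pd, addr, len,
			  IBV_ACCESS_LOCAL_WRITE |
			  IBV_ACCESS_REMOTE_READ |
			  IBV_ACCESS_REMOTE_WRITE);
}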


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-16  7:26                 ` Christoph Hellwig
  (?)
@ 2017-10-16 12:07                   ` Sagi Grimberg
  -1 siblings, 0 replies; 116+ messages in thread
From: Sagi Grimberg @ 2017-10-16 12:07 UTC (permalink / raw)
  To: Christoph Hellwig, Jason Gunthorpe
  Cc: J. Bruce Fields, Jan Kara, Andrew Morton, Arnd Bergmann,
	linux-nvdimm, Linux API, Darrick J. Wong, Dave Chinner,
	linux-xfs, Linux MM, Jeff Layton, Al Viro, Andy Lutomirski,
	linux-fsdevel, Linus Torvalds


>> I don't think that really represents how lots of apps actually use
>> RDMA.
>>
>> RDMA is often buried down in the software stack (eg in a MPI), and by
>> the time a mapping gets used for RDMA transfer the link between the
>> FD, mmap and the MR is totally opaque.
>>
>> Having a MR specific notification means the low level RDMA libraries
>> have a chance to deal with everything for the app.
>>
>> Eg consider a HPC app using MPI that uses some DAX aware library to
>> get DAX backed mmap's. It then passes memory in those mmaps to the
>> MPI library to do transfers. The MPI creates the MR on demand.
>>
> 
> I suspect one of the more interesting use cases might be a file server,
> for which that's not the case.  But otherwise I agree with the above,
> and also think that notifying the MR handle is the only way to go for
> another very important reason:  fencing.  What if the application/library
> does not react to the notification?  With a per-MR notification we
> can unregister the MR in kernel space and have a rock solid fencing
> mechanism.  And that is the most important bit here.

I agree we must deregister the MR in kernel space. As said, I think
it's perfectly reasonable to let user-space see error completions and
provide a query mechanism at MR granularity (unfortunately this will
probably need driver assistance, as they know how their device reports
access violations at MR granularity).


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-16  7:26                 ` Christoph Hellwig
  (?)
@ 2017-10-16 17:43                   ` Dan Williams
  -1 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-16 17:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jason Gunthorpe, linux-nvdimm, linux-xfs, Jan Kara,
	Arnd Bergmann, Darrick J. Wong, Linux API, Dave Chinner,
	J. Bruce Fields, Linux MM, Jeff Moyer, Al Viro, Andy Lutomirski,
	Ross Zwisler, linux-fsdevel, Jeff Layton, Linus Torvalds,
	Andrew Morton

On Mon, Oct 16, 2017 at 12:26 AM, Christoph Hellwig <hch@lst.de> wrote:
> On Fri, Oct 13, 2017 at 11:31:45AM -0600, Jason Gunthorpe wrote:
>> I don't think that really represents how lots of apps actually use
>> RDMA.
>>
>> RDMA is often buried down in the software stack (eg in a MPI), and by
>> the time a mapping gets used for RDMA transfer the link between the
>> FD, mmap and the MR is totally opaque.
>>
>> Having a MR specific notification means the low level RDMA libraries
>> have a chance to deal with everything for the app.
>>
>> Eg consider a HPC app using MPI that uses some DAX aware library to
>> get DAX backed mmap's. It then passes memory in those mmaps to the
>> MPI library to do transfers. The MPI creates the MR on demand.
>>
>
> I suspect one of the more interesting use cases might be a file server,
> for which that's not the case.  But otherwise I agree with the above,
> and also think that notifying the MR handle is the only way to go for
> another very important reason:  fencing.  What if the application/library
> does not react to the notification?  With a per-MR notification we
> can unregister the MR in kernel space and have a rock solid fencing
> mechanism.  And that is the most important bit here.

While I agree with the need for a per-MR notification mechanism, one
thing we lose by walking away from MAP_DIRECT is a way for a
hypervisor to coordinate pass-through of a DAX mapping to an RDMA
device in a guest. That will remain a case where we will still need to
use device-dax. I'm fine if that's the answer, but I just want to be
clear about all the places where we need to protect a DAX mapping against
RDMA from a non-ODP device.


^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-16 17:43                   ` Dan Williams
  (?)
@ 2017-10-16 19:44                     ` Dan Williams
  -1 siblings, 0 replies; 116+ messages in thread
From: Dan Williams @ 2017-10-16 19:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-xfs, Jan Kara, Andy Lutomirski, Arnd Bergmann,
	Darrick J. Wong, Linux API, linux-nvdimm, Dave Chinner,
	Andrew Morton, Jason Gunthorpe, Linux MM, Al Viro,
	J. Bruce Fields, linux-fsdevel, Linus Torvalds, Jeff Layton

On Mon, Oct 16, 2017 at 10:43 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Mon, Oct 16, 2017 at 12:26 AM, Christoph Hellwig <hch@lst.de> wrote:
>> On Fri, Oct 13, 2017 at 11:31:45AM -0600, Jason Gunthorpe wrote:
>>> I don't think that really represents how lots of apps actually use
>>> RDMA.
>>>
>>> RDMA is often buried down in the software stack (eg in a MPI), and by
>>> the time a mapping gets used for RDMA transfer the link between the
>>> FD, mmap and the MR is totally opaque.
>>>
>>> Having a MR specific notification means the low level RDMA libraries
>>> have a chance to deal with everything for the app.
>>>
>>> Eg consider a HPC app using MPI that uses some DAX aware library to
>>> get DAX backed mmap's. It then passes memory in those mmaps to the
>>> MPI library to do transfers. The MPI creates the MR on demand.
>>>
>>
>> I suspect one of the more interesting use cases might be a file server,
>> for which that's not the case.  But otherwise I agree with the above,
>> and also think that notifying the MR handle is the only way to go for
>> another very important reason:  fencing.  What if the application/library
>> does not react to the notification?  With a per-MR notification we
>> can unregister the MR in kernel space and have a rock solid fencing
>> mechanism.  And that is the most important bit here.
>
> While I agree with the need for a per-MR notification mechanism, one
> thing we lose by walking away from MAP_DIRECT is a way for a
> hypervisor to coordinate pass through of a DAX mapping to an RDMA
> device in a guest. That will remain a case where we will still need to
> use device-dax. I'm fine if that's the answer, but just want to be
> clear about all the places we need to protect a DAX mapping against
> RDMA from a non-ODP device.

For this specific issue perhaps we could promote FL_LAYOUT to a lease
type that can be set by fcntl().

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-16 19:44                     ` Dan Williams
@ 2017-10-17  6:46                       ` Christoph Hellwig
  -1 siblings, 0 replies; 116+ messages in thread
From: Christoph Hellwig @ 2017-10-17  6:46 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-xfs, Jan Kara, Andy Lutomirski, Arnd Bergmann,
	linux-nvdimm, Linux API, Darrick J. Wong, Dave Chinner,
	Andrew Morton, Jason Gunthorpe, Linux MM, Al Viro,
	J. Bruce Fields, Jeff Layton, linux-fsdevel, Linus Torvalds,
	Christoph Hellwig

On Mon, Oct 16, 2017 at 12:44:31PM -0700, Dan Williams wrote:
> > While I agree with the need for a per-MR notification mechanism, one
> > thing we lose by walking away from MAP_DIRECT is a way for a
> > hypervisor to coordinate pass through of a DAX mapping to an RDMA
> > device in a guest. That will remain a case where we will still need to
> > use device-dax. I'm fine if that's the answer, but just want to be
> > clear about all the places we need to protect a DAX mapping against
> > RDMA from a non-ODP device.
> 
> For this specific issue perhaps we promote FL_LAYOUT as a lease-type
> that can be set by fcntl().

I don't think it is a good userspace interface, mostly because it
is about things that don't matter for userspace (block mappings).

It makes sense as a kernel interface for callers that want to pin
down memory long-term, but for userspace the fact that the block
mapping changes doesn't matter - what matters is that their long-term
pin is broken by something.

^ permalink raw reply	[flat|nested] 116+ messages in thread

* Re: [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush
  2017-10-16 12:02                     ` Sagi Grimberg
@ 2017-10-19  6:02                       ` Jason Gunthorpe
  -1 siblings, 0 replies; 116+ messages in thread
From: Jason Gunthorpe @ 2017-10-19  6:02 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-xfs, Jan Kara, Arnd Bergmann, Darrick J. Wong, Linux API,
	linux-nvdimm, Dave Chinner, Christoph Hellwig, J. Bruce Fields,
	Linux MM, Jeff Layton, Al Viro, Andy Lutomirski, linux-fsdevel,
	Linus Torvalds, Andrew Morton

On Mon, Oct 16, 2017 at 03:02:52PM +0300, Sagi Grimberg wrote:
> But why should the kernel ever need to mangle the CQ? If a lease break
> would deregister the MR, the device is expected to generate remote
> protection errors on its own.

The point is to avoid protection errors - a hitless changeover when the
DAX mapping changes, like ODP does.

The only way to get there is to notify the app before the mappings
change. Dan suggested having ibv_poll_cq return this indication.
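
Roughly, that flow might look like the sketch below. This is purely
illustrative: IBV_WC_LEASE_BREAK is a made-up status standing in for
whatever indication ibv_poll_cq() would actually return before the
mapping changes; only ibv_poll_cq() and ibv_dereg_mr() are real verbs
calls here.

#include <infiniband/verbs.h>

/* hypothetical status value, not part of the verbs API today */
#define IBV_WC_LEASE_BREAK	0x7f

static void poll_loop(struct ibv_cq *cq, struct ibv_mr *mr)
{
	struct ibv_wc wc;
	int n;

	for (;;) {
		n = ibv_poll_cq(cq, 1, &wc);
		if (n < 0)
			break;		/* CQ error */
		if (n == 0)
			continue;	/* nothing completed yet */

		if ((int)wc.status == IBV_WC_LEASE_BREAK) {
			/* quiesce outstanding I/O, then drop the pin */
			ibv_dereg_mr(mr);
			break;
		}

		/* normal completion handling ... */
	}
}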

Jason

^ permalink raw reply	[flat|nested] 116+ messages in thread

end of thread, other threads:[~2017-10-19  6:03 UTC | newest]

Thread overview: 116+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-12  0:47 [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush Dan Williams
2017-10-12  0:47 ` [PATCH v9 1/6] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags Dan Williams
2017-10-12 13:51   ` Jan Kara
2017-10-12 16:32     ` Linus Torvalds
2017-10-16  7:38       ` Christoph Hellwig
2017-10-16  7:56       ` Jan Kara
2017-10-12  0:47 ` [PATCH v9 2/6] fs, mm: pass fd to ->mmap_validate() Dan Williams
2017-10-12  1:21   ` Al Viro
2017-10-12  1:28     ` Dan Williams
2017-10-12  2:17       ` Dan Williams
2017-10-12  3:44         ` Dan Williams
2017-10-12  0:47 ` [PATCH v9 3/6] fs: MAP_DIRECT core Dan Williams
2017-10-12  0:47 ` [PATCH v9 4/6] xfs: prepare xfs_break_layouts() for reuse with MAP_DIRECT Dan Williams
2017-10-12  0:47 ` [PATCH v9 5/6] fs, xfs, iomap: introduce break_layout_nowait() Dan Williams
2017-10-12  0:47 ` [PATCH v9 6/6] xfs: wire up MAP_DIRECT Dan Williams
2017-10-12 14:23 ` [PATCH v9 0/6] MAP_DIRECT for DAX userspace flush Christoph Hellwig
2017-10-12 17:41   ` Dan Williams
2017-10-13  6:57     ` Christoph Hellwig
2017-10-13 15:14       ` Dan Williams
2017-10-13 16:38         ` Jason Gunthorpe
2017-10-13 17:01           ` Dan Williams
2017-10-13 17:31             ` Jason Gunthorpe
2017-10-13 18:22               ` Dan Williams
2017-10-14  1:57                 ` Jason Gunthorpe
2017-10-16 12:02                   ` Sagi Grimberg
2017-10-19  6:02                     ` Jason Gunthorpe
2017-10-16  7:30                 ` Christoph Hellwig
2017-10-16  7:26               ` Christoph Hellwig
2017-10-16 12:07                 ` Sagi Grimberg
2017-10-16 17:43                 ` Dan Williams
2017-10-16 19:44                   ` Dan Williams
2017-10-17  6:46                     ` Christoph Hellwig
2017-10-16  7:22           ` Christoph Hellwig
