* [PATCH v7 00/12] MAP_DIRECT for DAX RDMA and userspace flush
From: Dan Williams @ 2017-10-06 22:35 UTC
  To: linux-nvdimm
  Cc: Jan Kara, Dave Chinner, J. Bruce Fields, linux-mm, Sean Hefty,
	Christoph Hellwig, Marek Szyprowski, Ashok Raj, Darrick J. Wong,
	linux-rdma, Joerg Roedel, Doug Ledford, Linus Torvalds,
	Hal Rosenstock, Arnd Bergmann, Alexander Viro, Andy Lutomirski,
	Jeff Layton, Greg Kroah-Hartman, linux-xfs, linux-api,
	linux-fsdevel, Andrew Morton, David Woodhouse, Robin Murphy

Changes since v6 [1]:
* Abandon the concept of immutable files and rework the implementation
  to reuse the same FL_LAYOUT file lease mechanism that coordinates pnfsd
  layouts with local filesystem changes. This establishes an interface
  where the kernel is always in control of the block-map and is free to
  invalidate MAP_DIRECT mappings when a lease breaker arrives. (Christoph)

* Introduce a new ->mmap_validate() file operation, since we need both
  the original @flags and @fd passed to mmap(2) to set up a MAP_DIRECT
  mapping.

* Introduce a ->lease_direct() vm operation to allow the RDMA core to
  safely register memory against DAX and tear down the mapping when the
  lease is broken. This can be reused by any subsystem that follows a
  memory-registration semantic (a hypothetical sketch follows the
  reference below).

[1]: https://lkml.org/lkml/2017/8/23/754
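
To make that memory-registration pattern concrete, here is a purely
hypothetical consumer-side sketch; neither the ->lease_direct()
signature nor any of these names are confirmed by this posting, it only
illustrates "take a lease, get a callback when it breaks":

    /* hypothetical: assumes ->lease_direct(vma, break_fn, owner) */
    static void example_lease_break(void *owner)
    {
        /* quiesce DMA and unregister memory before the kernel
         * revokes access to the mapping */
    }

    static struct lease_direct *example_register(struct vm_area_struct *vma,
            void *owner)
    {
        if (!vma->vm_ops || !vma->vm_ops->lease_direct)
            return NULL;
        return vma->vm_ops->lease_direct(vma, example_lease_break, owner);
    }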

---

MAP_DIRECT is a mechanism that allows an application to establish a
mapping where the kernel will not change the file's block-map, or
otherwise dirty its block-map metadata, without notification. It
supports a "flush from userspace" model where persistent memory
applications can bypass the overhead of ongoing write coordination with
the filesystem, and it provides safety for RDMA operations involving
DAX mappings.
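
As a rough illustration of the intended usage model (a sketch only: the
MAP_DIRECT value below is hypothetical since the flag is defined by this
series rather than any released uapi header, and the clwb-based flush
assumes an x86 persistent-memory platform):

    #include <sys/mman.h>
    #include <fcntl.h>
    #include <string.h>
    #include <stdint.h>
    #include <unistd.h>

    #ifndef MAP_SHARED_VALIDATE
    #define MAP_SHARED_VALIDATE 0x3  /* from patch 1 of this series */
    #endif
    #ifndef MAP_DIRECT
    #define MAP_DIRECT 0x8           /* hypothetical value, for illustration */
    #endif

    static void flush_range(void *addr, size_t len)
    {
        uintptr_t p;

        /* write back CPU caches so stores are durable in pmem */
        for (p = (uintptr_t)addr & ~63UL; p < (uintptr_t)addr + len; p += 64)
            asm volatile("clwb %0" : "+m" (*(volatile char *)p));
        asm volatile("sfence" ::: "memory");
    }

    int write_persistent(const char *path, const void *buf, size_t len)
    {
        int fd = open(path, O_RDWR);
        void *addr;

        if (fd < 0)
            return -1;
        addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_SHARED_VALIDATE | MAP_DIRECT, fd, 0);
        if (addr == MAP_FAILED)
            return -1;  /* no kernel/fs support: fall back to msync() path */
        memcpy(addr, buf, len);
        flush_range(addr, len);  /* durable with no fsync()/msync() */
        munmap(addr, len);
        close(fd);
        return 0;
    }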

The kernel always retains the ability to revoke access and convert the
file back to normal operation by performing a "lease break". As with
fcntl leases, there is no way for userspace to cancel the lease break
process once it has started; it can only delay it via the
/proc/sys/fs/lease-break-time setting.
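
A lease holder is expected to quiesce and unmap within that window. A
minimal sketch, assuming the notification arrives as SIGIO via the
fasync registration mentioned in patch 2 (the exact signal and delivery
details are illustrative, not a confirmed ABI):

    #include <signal.h>
    #include <sys/mman.h>

    static volatile sig_atomic_t lease_broken;

    static void lease_break_handler(int sig)
    {
        /* kernel started the /proc/sys/fs/lease-break-time countdown */
        lease_broken = 1;
    }

    static void install_lease_break_handler(void)
    {
        struct sigaction sa = { .sa_handler = lease_break_handler };

        sigaction(SIGIO, &sa, NULL);
        /* the workload then polls lease_broken, and must quiesce
         * writes / RDMA and munmap() before the timeout expires */
    }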

MAP_DIRECT enables XFS to supplant the device-dax interface for
mmap-write access to persistent memory, with no ongoing coordination
with the filesystem via fsync/msync syscalls.

---

Dan Williams (12):
      mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags
      fs, mm: pass fd to ->mmap_validate()
      fs: introduce i_mapdcount
      fs: MAP_DIRECT core
      xfs: prepare xfs_break_layouts() for reuse with MAP_DIRECT
      xfs: wire up MAP_DIRECT
      dma-mapping: introduce dma_has_iommu()
      fs, mapdirect: introduce ->lease_direct()
      xfs: wire up ->lease_direct()
      device-dax: wire up ->lease_direct()
      IB/core: use MAP_DIRECT to fix / enable RDMA to DAX mappings
      tools/testing/nvdimm: enable rdma unit tests


 arch/alpha/include/uapi/asm/mman.h           |    1 
 arch/mips/include/uapi/asm/mman.h            |    1 
 arch/mips/kernel/vdso.c                      |    2 
 arch/parisc/include/uapi/asm/mman.h          |    1 
 arch/tile/mm/elf.c                           |    3 
 arch/x86/mm/mpx.c                            |    3 
 arch/xtensa/include/uapi/asm/mman.h          |    1 
 drivers/base/dma-mapping.c                   |   10 +
 drivers/dax/device.c                         |    4 
 drivers/infiniband/core/umem.c               |   90 ++++++-
 drivers/iommu/amd_iommu.c                    |    6 
 drivers/iommu/intel-iommu.c                  |    6 
 fs/Kconfig                                   |    4 
 fs/Makefile                                  |    1 
 fs/aio.c                                     |    2 
 fs/mapdirect.c                               |  349 ++++++++++++++++++++++++++
 fs/xfs/Kconfig                               |    4 
 fs/xfs/Makefile                              |    1 
 fs/xfs/xfs_file.c                            |  130 ++++++++++
 fs/xfs/xfs_iomap.c                           |    9 +
 fs/xfs/xfs_layout.c                          |   42 +++
 fs/xfs/xfs_layout.h                          |   13 +
 fs/xfs/xfs_pnfs.c                            |   30 --
 fs/xfs/xfs_pnfs.h                            |   10 -
 include/linux/dma-mapping.h                  |    3 
 include/linux/fs.h                           |   33 ++
 include/linux/mapdirect.h                    |   68 +++++
 include/linux/mm.h                           |   15 +
 include/linux/mman.h                         |   42 +++
 include/rdma/ib_umem.h                       |    8 +
 include/uapi/asm-generic/mman-common.h       |    1 
 include/uapi/asm-generic/mman.h              |    1 
 ipc/shm.c                                    |    3 
 mm/internal.h                                |    2 
 mm/mmap.c                                    |   28 ++
 mm/nommu.c                                   |    5 
 mm/util.c                                    |    7 -
 tools/include/uapi/asm-generic/mman-common.h |    1 
 tools/testing/nvdimm/Kbuild                  |   31 ++
 tools/testing/nvdimm/config_check.c          |    2 
 tools/testing/nvdimm/test/iomap.c            |    6 
 41 files changed, 906 insertions(+), 73 deletions(-)
 create mode 100644 fs/mapdirect.c
 create mode 100644 fs/xfs/xfs_layout.c
 create mode 100644 fs/xfs/xfs_layout.h
 create mode 100644 include/linux/mapdirect.h

* [PATCH v7 01/12] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags
From: Dan Williams @ 2017-10-06 22:35 UTC
  To: linux-nvdimm
  Cc: Jan Kara, Arnd Bergmann, linux-rdma, linux-api, linux-xfs,
	linux-mm, Andy Lutomirski, linux-fsdevel, Andrew Morton,
	Linus Torvalds, Christoph Hellwig

The mmap(2) syscall suffers from the ABI anti-pattern of not validating
unknown flags. However, proposals like MAP_SYNC and MAP_DIRECT need a
mechanism to define new behavior that is known to fail on older kernels
that lack support for it. Define a new MAP_SHARED_VALIDATE flag pattern
that is guaranteed to fail on all legacy mmap implementations.

It is worth noting that the original proposal was for a standalone
MAP_VALIDATE flag. However, when that could not be supported by all
archs, Linus observed:

    I see why you *think* you want a bitmap. You think you want
    a bitmap because you want to make MAP_VALIDATE be part of MAP_SYNC
    etc, so that people can do

    ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED
		    | MAP_SYNC, fd, 0);

    and "know" that MAP_SYNC actually takes.

    And I'm saying that whole wish is bogus. You're fundamentally
    depending on special semantics, just make it explicit. It's already
    not portable, so don't try to make it so.

    Rename that MAP_VALIDATE as MAP_SHARED_VALIDATE, make it have a value
    of 0x3, and make people do

    ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED_VALIDATE
		    | MAP_SYNC, fd, 0);

    and then the kernel side is easier too (none of that random garbage
    playing games with looking at the "MAP_VALIDATE bit", but just another
    case statement in that map type thing.

    Boom. Done.

Similar to ->fallocate(), we also want the ability to validate the
support for new flags on a per ->mmap() 'struct file_operations'
instance basis. Towards that end, arrange for the original mmap flags
to be passed to a new ->mmap_validate() op in 'struct file_operations'.
By default all existing flags are implicitly supported, but new flags
require MAP_SHARED_VALIDATE and a per-instance opt-in.
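
A practical consequence is that userspace can probe for the new
semantics at runtime, since legacy kernels reject the 0x3 map type. A
minimal sketch (MAP_SHARED_VALIDATE is defined locally for builds
against pre-series headers; the value comes from this patch):

    #include <sys/mman.h>
    #include <errno.h>

    #ifndef MAP_SHARED_VALIDATE
    #define MAP_SHARED_VALIDATE 0x3
    #endif

    /* 1: kernel validates mmap flags, 0: legacy kernel, -1: other error */
    int probe_map_shared_validate(int fd, size_t len)
    {
        void *addr = mmap(NULL, len, PROT_READ, MAP_SHARED_VALIDATE, fd, 0);

        if (addr != MAP_FAILED) {
            /* all-legacy flags downgrade to MAP_SHARED and succeed */
            munmap(addr, len);
            return 1;
        }
        /* pre-series kernels fail the MAP_TYPE switch with EINVAL */
        return errno == EINVAL ? 0 : -1;
    }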

Cc: Jan Kara <jack@suse.cz>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/alpha/include/uapi/asm/mman.h           |    1 +
 arch/mips/include/uapi/asm/mman.h            |    1 +
 arch/mips/kernel/vdso.c                      |    2 +
 arch/parisc/include/uapi/asm/mman.h          |    1 +
 arch/tile/mm/elf.c                           |    3 +-
 arch/xtensa/include/uapi/asm/mman.h          |    1 +
 include/linux/fs.h                           |    2 +
 include/linux/mm.h                           |    2 +
 include/linux/mman.h                         |   39 ++++++++++++++++++++++++++
 include/uapi/asm-generic/mman-common.h       |    1 +
 mm/mmap.c                                    |   21 ++++++++++++--
 tools/include/uapi/asm-generic/mman-common.h |    1 +
 12 files changed, 69 insertions(+), 6 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
index 3b26cc62dadb..92823f24890b 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -14,6 +14,7 @@
 #define MAP_TYPE	0x0f		/* Mask for type of mapping (OSF/1 is _wrong_) */
 #define MAP_FIXED	0x100		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x10		/* don't use a file */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /* not used by linux, but here to make sure we don't clash with OSF/1 defines */
 #define _MAP_HASSEMAPHORE 0x0200
diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
index da3216007fe0..c77689076577 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -30,6 +30,7 @@
 #define MAP_PRIVATE	0x002		/* Changes are private */
 #define MAP_TYPE	0x00f		/* Mask for type of mapping */
 #define MAP_FIXED	0x010		/* Interpret addr exactly */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /* not used by linux, but here to make sure we don't clash with ABI defines */
 #define MAP_RENAME	0x020		/* Assign page to file */
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 019035d7225c..cf10654477a9 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -110,7 +110,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	base = mmap_region(NULL, STACK_TOP, PAGE_SIZE,
 			   VM_READ|VM_WRITE|VM_EXEC|
 			   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
-			   0, NULL);
+			   0, NULL, 0);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
index 775b5d5e41a1..36b688d52de3 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -14,6 +14,7 @@
 #define MAP_TYPE	0x03		/* Mask for type of mapping */
 #define MAP_FIXED	0x04		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x10		/* don't use a file */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 #define MAP_DENYWRITE	0x0800		/* ETXTBSY */
 #define MAP_EXECUTABLE	0x1000		/* mark it as an executable */
diff --git a/arch/tile/mm/elf.c b/arch/tile/mm/elf.c
index 889901824400..5ffcbe76aef9 100644
--- a/arch/tile/mm/elf.c
+++ b/arch/tile/mm/elf.c
@@ -143,7 +143,8 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
 		unsigned long addr = MEM_USER_INTRPT;
 		addr = mmap_region(NULL, addr, INTRPT_SIZE,
 				   VM_READ|VM_EXEC|
-				   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0, NULL);
+				   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0,
+				   NULL, 0);
 		if (addr > (unsigned long) -PAGE_SIZE)
 			retval = (int) addr;
 	}
diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index b15b278aa314..ec597900eec7 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -37,6 +37,7 @@
 #define MAP_PRIVATE	0x002		/* Changes are private */
 #define MAP_TYPE	0x00f		/* Mask for type of mapping */
 #define MAP_FIXED	0x010		/* Interpret addr exactly */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /* not used by linux, but here to make sure we don't clash with ABI defines */
 #define MAP_RENAME	0x020		/* Assign page to file */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 339e73742e73..51538958f7f5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1701,6 +1701,8 @@ struct file_operations {
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
+	int (*mmap_validate) (struct file *, struct vm_area_struct *,
+			unsigned long);
 	int (*open) (struct inode *, struct file *);
 	int (*flush) (struct file *, fl_owner_t id);
 	int (*release) (struct inode *, struct file *);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index f8c10d336e42..5c4c98e4adc9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2133,7 +2133,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
 
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-	struct list_head *uf);
+	struct list_head *uf, unsigned long map_flags);
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
diff --git a/include/linux/mman.h b/include/linux/mman.h
index c8367041fafd..94b63b4d71ff 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -7,6 +7,45 @@
 #include <linux/atomic.h>
 #include <uapi/linux/mman.h>
 
+/*
+ * Arrange for legacy / undefined architecture specific flags to be
+ * ignored by default in LEGACY_MAP_MASK.
+ */
+#ifndef MAP_32BIT
+#define MAP_32BIT 0
+#endif
+#ifndef MAP_HUGE_2MB
+#define MAP_HUGE_2MB 0
+#endif
+#ifndef MAP_HUGE_1GB
+#define MAP_HUGE_1GB 0
+#endif
+#ifndef MAP_UNINITIALIZED
+#define MAP_UNINITIALIZED 0
+#endif
+
+/*
+ * The historical set of flags that all mmap implementations implicitly
+ * support when a ->mmap_validate() op is not provided in file_operations.
+ */
+#define LEGACY_MAP_MASK (MAP_SHARED \
+		| MAP_PRIVATE \
+		| MAP_FIXED \
+		| MAP_ANONYMOUS \
+		| MAP_DENYWRITE \
+		| MAP_EXECUTABLE \
+		| MAP_UNINITIALIZED \
+		| MAP_GROWSDOWN \
+		| MAP_LOCKED \
+		| MAP_NORESERVE \
+		| MAP_POPULATE \
+		| MAP_NONBLOCK \
+		| MAP_STACK \
+		| MAP_HUGETLB \
+		| MAP_32BIT \
+		| MAP_HUGE_2MB \
+		| MAP_HUGE_1GB)
+
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
 extern unsigned long sysctl_overcommit_kbytes;
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 203268f9231e..ac55d1c0ec0f 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -24,6 +24,7 @@
 #else
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /*
  * Flags for mlock
diff --git a/mm/mmap.c b/mm/mmap.c
index 680506faceae..a1bcaa9eff42 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1389,6 +1389,18 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		struct inode *inode = file_inode(file);
 
 		switch (flags & MAP_TYPE) {
+		case (MAP_SHARED_VALIDATE):
+			if ((flags & ~LEGACY_MAP_MASK) == 0) {
+				/*
+				 * If all legacy mmap flags, downgrade
+				 * to MAP_SHARED, i.e. invoke ->mmap()
+				 * instead of ->mmap_validate()
+				 */
+				flags &= ~MAP_TYPE;
+				flags |= MAP_SHARED;
+			} else if (!file->f_op->mmap_validate)
+				return -EOPNOTSUPP;
+			/* fall through */
 		case MAP_SHARED:
 			if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
 				return -EACCES;
@@ -1465,7 +1477,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, flags);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -1602,7 +1614,7 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
 
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf)
+		struct list_head *uf, unsigned long map_flags)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
@@ -1687,7 +1699,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		 * new file must not have been exposed to user-space, yet.
 		 */
 		vma->vm_file = get_file(file);
-		error = call_mmap(file, vma);
+		if ((map_flags & MAP_TYPE) == MAP_SHARED_VALIDATE)
+			error = file->f_op->mmap_validate(file, vma, map_flags);
+		else
+			error = call_mmap(file, vma);
 		if (error)
 			goto unmap_and_free_vma;
 
diff --git a/tools/include/uapi/asm-generic/mman-common.h b/tools/include/uapi/asm-generic/mman-common.h
index 8c27db0c5c08..202bc4277fb5 100644
--- a/tools/include/uapi/asm-generic/mman-common.h
+++ b/tools/include/uapi/asm-generic/mman-common.h
@@ -24,6 +24,7 @@
 #else
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /*
  * Flags for mlock

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v7 01/12] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags
@ 2017-10-06 22:35   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:35 UTC (permalink / raw)
  To: linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw
  Cc: Jan Kara, Arnd Bergmann, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-xfs-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Andy Lutomirski,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton,
	Linus Torvalds, Christoph Hellwig

The mmap(2) syscall suffers from the ABI anti-pattern of not validating
unknown flags. However, proposals like MAP_SYNC and MAP_DIRECT need a
mechanism to define new behavior that is known to fail on older kernels
without the support. Define a new MAP_SHARED_VALIDATE flag pattern that
is guaranteed to fail on all legacy mmap implementations.

It is worth noting that the original proposal was for a standalone
MAP_VALIDATE flag. However, when that  could not be supported by all
archs Linus observed:

    I see why you *think* you want a bitmap. You think you want
    a bitmap because you want to make MAP_VALIDATE be part of MAP_SYNC
    etc, so that people can do

    ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED
		    | MAP_SYNC, fd, 0);

    and "know" that MAP_SYNC actually takes.

    And I'm saying that whole wish is bogus. You're fundamentally
    depending on special semantics, just make it explicit. It's already
    not portable, so don't try to make it so.

    Rename that MAP_VALIDATE as MAP_SHARED_VALIDATE, make it have a value
    of 0x3, and make people do

    ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED_VALIDATE
		    | MAP_SYNC, fd, 0);

    and then the kernel side is easier too (none of that random garbage
    playing games with looking at the "MAP_VALIDATE bit", but just another
    case statement in that map type thing.

    Boom. Done.

Similar to ->fallocate() we also want the ability to validate the
support for new flags on a per ->mmap() 'struct file_operations'
instance basis.  Towards that end arrange for flags to be generically
validated against a mmap_supported_mask exported by 'struct
file_operations'. By default all existing flags are implicitly
supported, but new flags require MAP_SHARED_VALIDATE and
per-instance-opt-in.

Cc: Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
Cc: Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>
Cc: Andy Lutomirski <luto-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Suggested-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Suggested-by: Linus Torvalds <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Signed-off-by: Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 arch/alpha/include/uapi/asm/mman.h           |    1 +
 arch/mips/include/uapi/asm/mman.h            |    1 +
 arch/mips/kernel/vdso.c                      |    2 +
 arch/parisc/include/uapi/asm/mman.h          |    1 +
 arch/tile/mm/elf.c                           |    3 +-
 arch/xtensa/include/uapi/asm/mman.h          |    1 +
 include/linux/fs.h                           |    2 +
 include/linux/mm.h                           |    2 +
 include/linux/mman.h                         |   39 ++++++++++++++++++++++++++
 include/uapi/asm-generic/mman-common.h       |    1 +
 mm/mmap.c                                    |   21 ++++++++++++--
 tools/include/uapi/asm-generic/mman-common.h |    1 +
 12 files changed, 69 insertions(+), 6 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
index 3b26cc62dadb..92823f24890b 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -14,6 +14,7 @@
 #define MAP_TYPE	0x0f		/* Mask for type of mapping (OSF/1 is _wrong_) */
 #define MAP_FIXED	0x100		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x10		/* don't use a file */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /* not used by linux, but here to make sure we don't clash with OSF/1 defines */
 #define _MAP_HASSEMAPHORE 0x0200
diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
index da3216007fe0..c77689076577 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -30,6 +30,7 @@
 #define MAP_PRIVATE	0x002		/* Changes are private */
 #define MAP_TYPE	0x00f		/* Mask for type of mapping */
 #define MAP_FIXED	0x010		/* Interpret addr exactly */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /* not used by linux, but here to make sure we don't clash with ABI defines */
 #define MAP_RENAME	0x020		/* Assign page to file */
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 019035d7225c..cf10654477a9 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -110,7 +110,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	base = mmap_region(NULL, STACK_TOP, PAGE_SIZE,
 			   VM_READ|VM_WRITE|VM_EXEC|
 			   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
-			   0, NULL);
+			   0, NULL, 0);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
index 775b5d5e41a1..36b688d52de3 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -14,6 +14,7 @@
 #define MAP_TYPE	0x03		/* Mask for type of mapping */
 #define MAP_FIXED	0x04		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x10		/* don't use a file */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 #define MAP_DENYWRITE	0x0800		/* ETXTBSY */
 #define MAP_EXECUTABLE	0x1000		/* mark it as an executable */
diff --git a/arch/tile/mm/elf.c b/arch/tile/mm/elf.c
index 889901824400..5ffcbe76aef9 100644
--- a/arch/tile/mm/elf.c
+++ b/arch/tile/mm/elf.c
@@ -143,7 +143,8 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
 		unsigned long addr = MEM_USER_INTRPT;
 		addr = mmap_region(NULL, addr, INTRPT_SIZE,
 				   VM_READ|VM_EXEC|
-				   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0, NULL);
+				   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0,
+				   NULL, 0);
 		if (addr > (unsigned long) -PAGE_SIZE)
 			retval = (int) addr;
 	}
diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index b15b278aa314..ec597900eec7 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -37,6 +37,7 @@
 #define MAP_PRIVATE	0x002		/* Changes are private */
 #define MAP_TYPE	0x00f		/* Mask for type of mapping */
 #define MAP_FIXED	0x010		/* Interpret addr exactly */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /* not used by linux, but here to make sure we don't clash with ABI defines */
 #define MAP_RENAME	0x020		/* Assign page to file */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 339e73742e73..51538958f7f5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1701,6 +1701,8 @@ struct file_operations {
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
+	int (*mmap_validate) (struct file *, struct vm_area_struct *,
+			unsigned long);
 	int (*open) (struct inode *, struct file *);
 	int (*flush) (struct file *, fl_owner_t id);
 	int (*release) (struct inode *, struct file *);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index f8c10d336e42..5c4c98e4adc9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2133,7 +2133,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
 
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-	struct list_head *uf);
+	struct list_head *uf, unsigned long map_flags);
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
diff --git a/include/linux/mman.h b/include/linux/mman.h
index c8367041fafd..94b63b4d71ff 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -7,6 +7,45 @@
 #include <linux/atomic.h>
 #include <uapi/linux/mman.h>
 
+/*
+ * Arrange for legacy / undefined architecture specific flags to be
+ * ignored by default in LEGACY_MAP_MASK.
+ */
+#ifndef MAP_32BIT
+#define MAP_32BIT 0
+#endif
+#ifndef MAP_HUGE_2MB
+#define MAP_HUGE_2MB 0
+#endif
+#ifndef MAP_HUGE_1GB
+#define MAP_HUGE_1GB 0
+#endif
+#ifndef MAP_UNINITIALIZED
+#define MAP_UNINITIALIZED 0
+#endif
+
+/*
+ * The historical set of flags that all mmap implementations implicitly
+ * support when a ->mmap_validate() op is not provided in file_operations.
+ */
+#define LEGACY_MAP_MASK (MAP_SHARED \
+		| MAP_PRIVATE \
+		| MAP_FIXED \
+		| MAP_ANONYMOUS \
+		| MAP_DENYWRITE \
+		| MAP_EXECUTABLE \
+		| MAP_UNINITIALIZED \
+		| MAP_GROWSDOWN \
+		| MAP_LOCKED \
+		| MAP_NORESERVE \
+		| MAP_POPULATE \
+		| MAP_NONBLOCK \
+		| MAP_STACK \
+		| MAP_HUGETLB \
+		| MAP_32BIT \
+		| MAP_HUGE_2MB \
+		| MAP_HUGE_1GB)
+
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
 extern unsigned long sysctl_overcommit_kbytes;
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 203268f9231e..ac55d1c0ec0f 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -24,6 +24,7 @@
 #else
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /*
  * Flags for mlock
diff --git a/mm/mmap.c b/mm/mmap.c
index 680506faceae..a1bcaa9eff42 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1389,6 +1389,18 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		struct inode *inode = file_inode(file);
 
 		switch (flags & MAP_TYPE) {
+		case (MAP_SHARED_VALIDATE):
+			if ((flags & ~LEGACY_MAP_MASK) == 0) {
+				/*
+				 * If all legacy mmap flags, downgrade
+				 * to MAP_SHARED, i.e. invoke ->mmap()
+				 * instead of ->mmap_validate()
+				 */
+				flags &= ~MAP_TYPE;
+				flags |= MAP_SHARED;
+			} else if (!file->f_op->mmap_validate)
+				return -EOPNOTSUPP;
+			/* fall through */
 		case MAP_SHARED:
 			if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
 				return -EACCES;
@@ -1465,7 +1477,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, flags);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -1602,7 +1614,7 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
 
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf)
+		struct list_head *uf, unsigned long map_flags)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
@@ -1687,7 +1699,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		 * new file must not have been exposed to user-space, yet.
 		 */
 		vma->vm_file = get_file(file);
-		error = call_mmap(file, vma);
+		if ((map_flags & MAP_TYPE) == MAP_SHARED_VALIDATE)
+			error = file->f_op->mmap_validate(file, vma, map_flags);
+		else
+			error = call_mmap(file, vma);
 		if (error)
 			goto unmap_and_free_vma;
 
diff --git a/tools/include/uapi/asm-generic/mman-common.h b/tools/include/uapi/asm-generic/mman-common.h
index 8c27db0c5c08..202bc4277fb5 100644
--- a/tools/include/uapi/asm-generic/mman-common.h
+++ b/tools/include/uapi/asm-generic/mman-common.h
@@ -24,6 +24,7 @@
 #else
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /*
  * Flags for mlock

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v7 01/12] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags
@ 2017-10-06 22:35   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:35 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jan Kara, Arnd Bergmann, linux-rdma, linux-api, linux-xfs,
	linux-mm, Andy Lutomirski, linux-fsdevel, Andrew Morton,
	Linus Torvalds, Christoph Hellwig

The mmap(2) syscall suffers from the ABI anti-pattern of not validating
unknown flags. However, proposals like MAP_SYNC and MAP_DIRECT need a
mechanism to define new behavior that is known to fail on older kernels
without the support. Define a new MAP_SHARED_VALIDATE flag pattern that
is guaranteed to fail on all legacy mmap implementations.

It is worth noting that the original proposal was for a standalone
MAP_VALIDATE flag. However, when that  could not be supported by all
archs Linus observed:

    I see why you *think* you want a bitmap. You think you want
    a bitmap because you want to make MAP_VALIDATE be part of MAP_SYNC
    etc, so that people can do

    ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED
		    | MAP_SYNC, fd, 0);

    and "know" that MAP_SYNC actually takes.

    And I'm saying that whole wish is bogus. You're fundamentally
    depending on special semantics, just make it explicit. It's already
    not portable, so don't try to make it so.

    Rename that MAP_VALIDATE as MAP_SHARED_VALIDATE, make it have a value
    of 0x3, and make people do

    ret = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED_VALIDATE
		    | MAP_SYNC, fd, 0);

    and then the kernel side is easier too (none of that random garbage
    playing games with looking at the "MAP_VALIDATE bit", but just another
    case statement in that map type thing.

    Boom. Done.

Similar to ->fallocate() we also want the ability to validate the
support for new flags on a per ->mmap() 'struct file_operations'
instance basis.  Towards that end arrange for flags to be generically
validated against a mmap_supported_mask exported by 'struct
file_operations'. By default all existing flags are implicitly
supported, but new flags require MAP_SHARED_VALIDATE and
per-instance-opt-in.

Cc: Jan Kara <jack@suse.cz>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/alpha/include/uapi/asm/mman.h           |    1 +
 arch/mips/include/uapi/asm/mman.h            |    1 +
 arch/mips/kernel/vdso.c                      |    2 +
 arch/parisc/include/uapi/asm/mman.h          |    1 +
 arch/tile/mm/elf.c                           |    3 +-
 arch/xtensa/include/uapi/asm/mman.h          |    1 +
 include/linux/fs.h                           |    2 +
 include/linux/mm.h                           |    2 +
 include/linux/mman.h                         |   39 ++++++++++++++++++++++++++
 include/uapi/asm-generic/mman-common.h       |    1 +
 mm/mmap.c                                    |   21 ++++++++++++--
 tools/include/uapi/asm-generic/mman-common.h |    1 +
 12 files changed, 69 insertions(+), 6 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/mman.h b/arch/alpha/include/uapi/asm/mman.h
index 3b26cc62dadb..92823f24890b 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -14,6 +14,7 @@
 #define MAP_TYPE	0x0f		/* Mask for type of mapping (OSF/1 is _wrong_) */
 #define MAP_FIXED	0x100		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x10		/* don't use a file */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /* not used by linux, but here to make sure we don't clash with OSF/1 defines */
 #define _MAP_HASSEMAPHORE 0x0200
diff --git a/arch/mips/include/uapi/asm/mman.h b/arch/mips/include/uapi/asm/mman.h
index da3216007fe0..c77689076577 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -30,6 +30,7 @@
 #define MAP_PRIVATE	0x002		/* Changes are private */
 #define MAP_TYPE	0x00f		/* Mask for type of mapping */
 #define MAP_FIXED	0x010		/* Interpret addr exactly */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /* not used by linux, but here to make sure we don't clash with ABI defines */
 #define MAP_RENAME	0x020		/* Assign page to file */
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 019035d7225c..cf10654477a9 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -110,7 +110,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	base = mmap_region(NULL, STACK_TOP, PAGE_SIZE,
 			   VM_READ|VM_WRITE|VM_EXEC|
 			   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
-			   0, NULL);
+			   0, NULL, 0);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h
index 775b5d5e41a1..36b688d52de3 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -14,6 +14,7 @@
 #define MAP_TYPE	0x03		/* Mask for type of mapping */
 #define MAP_FIXED	0x04		/* Interpret addr exactly */
 #define MAP_ANONYMOUS	0x10		/* don't use a file */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 #define MAP_DENYWRITE	0x0800		/* ETXTBSY */
 #define MAP_EXECUTABLE	0x1000		/* mark it as an executable */
diff --git a/arch/tile/mm/elf.c b/arch/tile/mm/elf.c
index 889901824400..5ffcbe76aef9 100644
--- a/arch/tile/mm/elf.c
+++ b/arch/tile/mm/elf.c
@@ -143,7 +143,8 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
 		unsigned long addr = MEM_USER_INTRPT;
 		addr = mmap_region(NULL, addr, INTRPT_SIZE,
 				   VM_READ|VM_EXEC|
-				   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0, NULL);
+				   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0,
+				   NULL, 0);
 		if (addr > (unsigned long) -PAGE_SIZE)
 			retval = (int) addr;
 	}
diff --git a/arch/xtensa/include/uapi/asm/mman.h b/arch/xtensa/include/uapi/asm/mman.h
index b15b278aa314..ec597900eec7 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -37,6 +37,7 @@
 #define MAP_PRIVATE	0x002		/* Changes are private */
 #define MAP_TYPE	0x00f		/* Mask for type of mapping */
 #define MAP_FIXED	0x010		/* Interpret addr exactly */
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /* not used by linux, but here to make sure we don't clash with ABI defines */
 #define MAP_RENAME	0x020		/* Assign page to file */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 339e73742e73..51538958f7f5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1701,6 +1701,8 @@ struct file_operations {
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
+	int (*mmap_validate) (struct file *, struct vm_area_struct *,
+			unsigned long);
 	int (*open) (struct inode *, struct file *);
 	int (*flush) (struct file *, fl_owner_t id);
 	int (*release) (struct inode *, struct file *);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index f8c10d336e42..5c4c98e4adc9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2133,7 +2133,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
 
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-	struct list_head *uf);
+	struct list_head *uf, unsigned long map_flags);
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
diff --git a/include/linux/mman.h b/include/linux/mman.h
index c8367041fafd..94b63b4d71ff 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -7,6 +7,45 @@
 #include <linux/atomic.h>
 #include <uapi/linux/mman.h>
 
+/*
+ * Arrange for legacy / undefined architecture specific flags to be
+ * ignored by default in LEGACY_MAP_MASK.
+ */
+#ifndef MAP_32BIT
+#define MAP_32BIT 0
+#endif
+#ifndef MAP_HUGE_2MB
+#define MAP_HUGE_2MB 0
+#endif
+#ifndef MAP_HUGE_1GB
+#define MAP_HUGE_1GB 0
+#endif
+#ifndef MAP_UNINITIALIZED
+#define MAP_UNINITIALIZED 0
+#endif
+
+/*
+ * The historical set of flags that all mmap implementations implicitly
+ * support when a ->mmap_validate() op is not provided in file_operations.
+ */
+#define LEGACY_MAP_MASK (MAP_SHARED \
+		| MAP_PRIVATE \
+		| MAP_FIXED \
+		| MAP_ANONYMOUS \
+		| MAP_DENYWRITE \
+		| MAP_EXECUTABLE \
+		| MAP_UNINITIALIZED \
+		| MAP_GROWSDOWN \
+		| MAP_LOCKED \
+		| MAP_NORESERVE \
+		| MAP_POPULATE \
+		| MAP_NONBLOCK \
+		| MAP_STACK \
+		| MAP_HUGETLB \
+		| MAP_32BIT \
+		| MAP_HUGE_2MB \
+		| MAP_HUGE_1GB)
+
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
 extern unsigned long sysctl_overcommit_kbytes;
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 203268f9231e..ac55d1c0ec0f 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -24,6 +24,7 @@
 #else
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /*
  * Flags for mlock
diff --git a/mm/mmap.c b/mm/mmap.c
index 680506faceae..a1bcaa9eff42 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1389,6 +1389,18 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		struct inode *inode = file_inode(file);
 
 		switch (flags & MAP_TYPE) {
+		case (MAP_SHARED_VALIDATE):
+			if ((flags & ~LEGACY_MAP_MASK) == 0) {
+				/*
+				 * If only legacy mmap flags are set,
+				 * downgrade to MAP_SHARED, i.e. invoke
+				 * ->mmap() instead of ->mmap_validate()
+				 */
+				flags &= ~MAP_TYPE;
+				flags |= MAP_SHARED;
+			} else if (!file->f_op->mmap_validate)
+				return -EOPNOTSUPP;
+			/* fall through */
 		case MAP_SHARED:
 			if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
 				return -EACCES;
@@ -1465,7 +1477,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, flags);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -1602,7 +1614,7 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
 
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf)
+		struct list_head *uf, unsigned long map_flags)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
@@ -1687,7 +1699,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		 * new file must not have been exposed to user-space, yet.
 		 */
 		vma->vm_file = get_file(file);
-		error = call_mmap(file, vma);
+		if ((map_flags & MAP_TYPE) == MAP_SHARED_VALIDATE)
+			error = file->f_op->mmap_validate(file, vma, map_flags);
+		else
+			error = call_mmap(file, vma);
 		if (error)
 			goto unmap_and_free_vma;
 
diff --git a/tools/include/uapi/asm-generic/mman-common.h b/tools/include/uapi/asm-generic/mman-common.h
index 8c27db0c5c08..202bc4277fb5 100644
--- a/tools/include/uapi/asm-generic/mman-common.h
+++ b/tools/include/uapi/asm-generic/mman-common.h
@@ -24,6 +24,7 @@
 #else
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
+#define MAP_SHARED_VALIDATE 0x3		/* share + validate extension flags */
 
 /*
  * Flags for mlock


^ permalink raw reply related	[flat|nested] 158+ messages in thread
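
To illustrate the uapi semantics above from the userspace side: with
only legacy flags, MAP_SHARED_VALIDATE silently downgrades to
MAP_SHARED; an unknown flag fails with EOPNOTSUPP on filesystems that
lack ->mmap_validate(). A minimal sketch, assuming a hypothetical
MAP_NEWFLAG bit that is not defined by this series:

#include <errno.h>
#include <stdio.h>
#include <sys/mman.h>

#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x3
#endif
#define MAP_NEWFLAG 0x80000	/* hypothetical extension flag */

static void *map_with_new_flag(int fd, size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			MAP_SHARED_VALIDATE | MAP_NEWFLAG, fd, 0);

	/* the fs has no ->mmap_validate() to vet the extra flag */
	if (p == MAP_FAILED && errno == EOPNOTSUPP)
		fprintf(stderr, "MAP_NEWFLAG not supported\n");
	return p;
}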

* [PATCH v7 02/12] fs, mm: pass fd to ->mmap_validate()
@ 2017-10-06 22:35   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:35 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jan Kara, Darrick J. Wong, linux-rdma, linux-api, Dave Chinner,
	linux-xfs, linux-mm, linux-fsdevel, Andrew Morton,
	Christoph Hellwig

The MAP_DIRECT mechanism for mmap intends to use a file lease to prevent
block map changes while the file is mapped. It requires the fd to set up
an fasync_struct for signalling lease break events to the lease holder.

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/mips/kernel/vdso.c |    2 +-
 arch/tile/mm/elf.c      |    2 +-
 arch/x86/mm/mpx.c       |    3 ++-
 fs/aio.c                |    2 +-
 include/linux/fs.h      |    2 +-
 include/linux/mm.h      |    9 +++++----
 ipc/shm.c               |    3 ++-
 mm/internal.h           |    2 +-
 mm/mmap.c               |   13 +++++++------
 mm/nommu.c              |    5 +++--
 mm/util.c               |    7 ++++---
 11 files changed, 28 insertions(+), 22 deletions(-)

diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index cf10654477a9..ab26c7ac0316 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -110,7 +110,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	base = mmap_region(NULL, STACK_TOP, PAGE_SIZE,
 			   VM_READ|VM_WRITE|VM_EXEC|
 			   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
-			   0, NULL, 0);
+			   0, NULL, 0, -1);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
diff --git a/arch/tile/mm/elf.c b/arch/tile/mm/elf.c
index 5ffcbe76aef9..61a9588e141a 100644
--- a/arch/tile/mm/elf.c
+++ b/arch/tile/mm/elf.c
@@ -144,7 +144,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
 		addr = mmap_region(NULL, addr, INTRPT_SIZE,
 				   VM_READ|VM_EXEC|
 				   VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC, 0,
-				   NULL, 0);
+				   NULL, 0, -1);
 		if (addr > (unsigned long) -PAGE_SIZE)
 			retval = (int) addr;
 	}
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index 9ceaa955d2ba..a8baa94a496b 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -52,7 +52,8 @@ static unsigned long mpx_mmap(unsigned long len)
 
 	down_write(&mm->mmap_sem);
 	addr = do_mmap(NULL, 0, len, PROT_READ | PROT_WRITE,
-		       MAP_ANONYMOUS | MAP_PRIVATE, VM_MPX, 0, &populate, NULL);
+			MAP_ANONYMOUS | MAP_PRIVATE, VM_MPX, 0, &populate,
+			NULL, -1);
 	up_write(&mm->mmap_sem);
 	if (populate)
 		mm_populate(addr, populate);
diff --git a/fs/aio.c b/fs/aio.c
index 5a2487217072..d10ca6db2ee6 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -519,7 +519,7 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events)
 
 	ctx->mmap_base = do_mmap_pgoff(ctx->aio_ring_file, 0, ctx->mmap_size,
 				       PROT_READ | PROT_WRITE,
-				       MAP_SHARED, 0, &unused, NULL);
+				       MAP_SHARED, 0, &unused, NULL, -1);
 	up_write(&mm->mmap_sem);
 	if (IS_ERR((void *)ctx->mmap_base)) {
 		ctx->mmap_size = 0;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 51538958f7f5..c2b9bf3dc4e9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1702,7 +1702,7 @@ struct file_operations {
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	int (*mmap_validate) (struct file *, struct vm_area_struct *,
-			unsigned long);
+			unsigned long, int);
 	int (*open) (struct inode *, struct file *);
 	int (*flush) (struct file *, fl_owner_t id);
 	int (*release) (struct inode *, struct file *);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5c4c98e4adc9..0afa19feb755 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2133,11 +2133,11 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
 
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-	struct list_head *uf, unsigned long map_flags);
+	struct list_head *uf, unsigned long map_flags, int fd);
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
-	struct list_head *uf);
+	struct list_head *uf, int fd);
 extern int do_munmap(struct mm_struct *, unsigned long, size_t,
 		     struct list_head *uf);
 
@@ -2145,9 +2145,10 @@ static inline unsigned long
 do_mmap_pgoff(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	unsigned long pgoff, unsigned long *populate,
-	struct list_head *uf)
+	struct list_head *uf, int fd)
 {
-	return do_mmap(file, addr, len, prot, flags, 0, pgoff, populate, uf);
+	return do_mmap(file, addr, len, prot, flags, 0, pgoff, populate,
+			uf, fd);
 }
 
 #ifdef CONFIG_MMU
diff --git a/ipc/shm.c b/ipc/shm.c
index 1e2b1692ba2c..585e05eef40a 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -1399,7 +1399,8 @@ long do_shmat(int shmid, char __user *shmaddr, int shmflg,
 			goto invalid;
 	}
 
-	addr = do_mmap_pgoff(file, addr, size, prot, flags, 0, &populate, NULL);
+	addr = do_mmap_pgoff(file, addr, size, prot, flags, 0, &populate,
+			NULL, -1);
 	*raddr = addr;
 	err = 0;
 	if (IS_ERR_VALUE(addr))
diff --git a/mm/internal.h b/mm/internal.h
index 1df011f62480..70ed7b06dd85 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -466,7 +466,7 @@ extern u32 hwpoison_filter_enable;
 
 extern unsigned long  __must_check vm_mmap_pgoff(struct file *, unsigned long,
         unsigned long, unsigned long,
-        unsigned long, unsigned long);
+        unsigned long, unsigned long, int);
 
 extern void set_pageblock_order(void);
 unsigned long reclaim_clean_pages_from_list(struct zone *zone,
diff --git a/mm/mmap.c b/mm/mmap.c
index a1bcaa9eff42..c2cb6334a7a9 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1322,7 +1322,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			unsigned long len, unsigned long prot,
 			unsigned long flags, vm_flags_t vm_flags,
 			unsigned long pgoff, unsigned long *populate,
-			struct list_head *uf)
+			struct list_head *uf, int fd)
 {
 	struct mm_struct *mm = current->mm;
 	int pkey = 0;
@@ -1477,7 +1477,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, flags);
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, flags, fd);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -1527,7 +1527,7 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
 
 	flags &= ~(MAP_EXECUTABLE | MAP_DENYWRITE);
 
-	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
+	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff, fd);
 out_fput:
 	if (file)
 		fput(file);
@@ -1614,7 +1614,7 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
 
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf, unsigned long map_flags)
+		struct list_head *uf, unsigned long map_flags, int fd)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
@@ -1700,7 +1700,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		 */
 		vma->vm_file = get_file(file);
 		if ((map_flags & MAP_TYPE) == MAP_SHARED_VALIDATE)
-			error = file->f_op->mmap_validate(file, vma, map_flags);
+			error = file->f_op->mmap_validate(file, vma,
+					map_flags, fd);
 		else
 			error = call_mmap(file, vma);
 		if (error)
@@ -2842,7 +2843,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 
 	file = get_file(vma->vm_file);
 	ret = do_mmap_pgoff(vma->vm_file, start, size,
-			prot, flags, pgoff, &populate, NULL);
+			prot, flags, pgoff, &populate, NULL, -1);
 	fput(file);
 out:
 	up_write(&mm->mmap_sem);
diff --git a/mm/nommu.c b/mm/nommu.c
index 17c00d93de2e..952d205d3b66 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1206,7 +1206,8 @@ unsigned long do_mmap(struct file *file,
 			vm_flags_t vm_flags,
 			unsigned long pgoff,
 			unsigned long *populate,
-			struct list_head *uf)
+			struct list_head *uf,
+			int fd)
 {
 	struct vm_area_struct *vma;
 	struct vm_region *region;
@@ -1439,7 +1440,7 @@ SYSCALL_DEFINE6(mmap_pgoff, unsigned long, addr, unsigned long, len,
 
 	flags &= ~(MAP_EXECUTABLE | MAP_DENYWRITE);
 
-	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
+	retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff, fd);
 
 	if (file)
 		fput(file);
diff --git a/mm/util.c b/mm/util.c
index 34e57fae959d..dcf48d929185 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -319,7 +319,7 @@ EXPORT_SYMBOL_GPL(get_user_pages_fast);
 
 unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot,
-	unsigned long flag, unsigned long pgoff)
+	unsigned long flag, unsigned long pgoff, int fd)
 {
 	unsigned long ret;
 	struct mm_struct *mm = current->mm;
@@ -331,7 +331,7 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
 		if (down_write_killable(&mm->mmap_sem))
 			return -EINTR;
 		ret = do_mmap_pgoff(file, addr, len, prot, flag, pgoff,
-				    &populate, &uf);
+				    &populate, &uf, fd);
 		up_write(&mm->mmap_sem);
 		userfaultfd_unmap_complete(mm, &uf);
 		if (populate)
@@ -349,7 +349,8 @@ unsigned long vm_mmap(struct file *file, unsigned long addr,
 	if (unlikely(offset_in_page(offset)))
 		return -EINVAL;
 
-	return vm_mmap_pgoff(file, addr, len, prot, flag, offset >> PAGE_SHIFT);
+	return vm_mmap_pgoff(file, addr, len, prot, flag,
+			offset >> PAGE_SHIFT, -1);
 }
 EXPORT_SYMBOL(vm_mmap);
 


^ permalink raw reply related	[flat|nested] 158+ messages in thread
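
To make the plumbing concrete: fasync_helper() is keyed by the
descriptor number, so the struct file alone is not enough to arm SIGIO
delivery. A hedged sketch of a filesystem-side consumer (the lease
attachment is elided and the function name is a placeholder; the real
consumer is wired up in the xfs patches of this series):

static int example_mmap_validate(struct file *file,
		struct vm_area_struct *vma, unsigned long map_flags, int fd)
{
	struct fasync_struct *fa = NULL;
	int rc;

	/* needs the fd, not just the struct file */
	rc = fasync_helper(fd, file, 1, &fa);
	if (rc < 0)
		return rc;

	/* ... attach @fa to an FL_LAYOUT lease so breaks raise SIGIO ... */

	return generic_file_mmap(file, vma);
}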

* [PATCH v7 03/12] fs: introduce i_mapdcount
@ 2017-10-06 22:35   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:35 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: J. Bruce Fields, Jan Kara, Darrick J. Wong, linux-rdma,
	linux-api, Dave Chinner, linux-xfs, linux-mm, linux-fsdevel,
	Jeff Layton, Christoph Hellwig

When ->iomap_begin() sees a non-zero i_mapdcount and determines that
the block map of the file needs to be modified to satisfy the I/O
request, it will instead return an error. This is needed for MAP_DIRECT
where, due to locking constraints, we can't rely on xfs_break_layouts()
to protect against allocating write-faults, either from the process that
set up the MAP_DIRECT mapping or from other processes that have the file
mapped. xfs_break_layouts() requires XFS_IOLOCK, which is problematic to
mix with the XFS_MMAPLOCK in the fault path.

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/xfs/xfs_iomap.c |    9 +++++++++
 include/linux/fs.h |   31 +++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index a1909bc064e9..6816f8ebbdcf 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1053,6 +1053,15 @@ xfs_file_iomap_begin(
 			goto out_unlock;
 		}
 		/*
+		 * If a file has MAP_DIRECT mappings, disable block map
+		 * updates. This should only affect mmap write faults, as
+		 * other paths are protected by an FL_LAYOUT lease.
+		 */
+		if (i_mapdcount_read(inode)) {
+			error = -ETXTBSY;
+			goto out_unlock;
+		}
+		/*
 		 * We cap the maximum length we map here to MAX_WRITEBACK_PAGES
 		 * pages to keep the chunks of work done where somewhat symmetric
 		 * with the work writeback does. This is a completely arbitrary
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c2b9bf3dc4e9..f83871b188ff 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -642,6 +642,9 @@ struct inode {
 	atomic_t		i_count;
 	atomic_t		i_dio_count;
 	atomic_t		i_writecount;
+#ifdef CONFIG_FS_DAX
+	atomic_t		i_mapdcount;	/* count of MAP_DIRECT vmas */
+#endif
 #ifdef CONFIG_IMA
 	atomic_t		i_readcount; /* struct files open RO */
 #endif
@@ -2784,6 +2787,34 @@ static inline void i_readcount_inc(struct inode *inode)
 	return;
 }
 #endif
+
+#ifdef CONFIG_FS_DAX
+static inline void i_mapdcount_dec(struct inode *inode)
+{
+	BUG_ON(!atomic_read(&inode->i_mapdcount));
+	atomic_dec(&inode->i_mapdcount);
+}
+static inline void i_mapdcount_inc(struct inode *inode)
+{
+	atomic_inc(&inode->i_mapdcount);
+}
+static inline int i_mapdcount_read(struct inode *inode)
+{
+	return atomic_read(&inode->i_mapdcount);
+}
+#else
+static inline void i_mapdcount_dec(struct inode *inode)
+{
+}
+static inline void i_mapdcount_inc(struct inode *inode)
+{
+}
+static inline int i_mapdcount_read(struct inode *inode)
+{
+	return 0;
+}
+#endif
+
 extern int do_pipe_flags(int *, int);
 
 #define __kernel_read_file_id(id) \


^ permalink raw reply related	[flat|nested] 158+ messages in thread
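
For reference, a hedged sketch of the intended calling convention for
these helpers; the function names below are placeholders for the
MAP_DIRECT setup and teardown paths added later in the series:

static int example_begin_map_direct(struct inode *inode)
{
	/* pin the block map: allocating write-faults that reach
	 * ->iomap_begin() now fail with -ETXTBSY */
	i_mapdcount_inc(inode);
	return 0;
}

static void example_end_map_direct(struct inode *inode)
{
	/* drop the pin; block map updates may proceed again */
	i_mapdcount_dec(inode);
}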

* [PATCH v7 04/12] fs: MAP_DIRECT core
@ 2017-10-06 22:35   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:35 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: J. Bruce Fields, Jan Kara, Darrick J. Wong, linux-rdma,
	linux-api, Dave Chinner, linux-xfs, linux-mm, linux-fsdevel,
	Jeff Layton, Christoph Hellwig

Introduce a set of helper APIs for filesystems to establish FL_LAYOUT
leases to protect against writes and block map updates while a
MAP_DIRECT mapping is established. While the lease protects against the
syscall write path and fallocate, it does not protect against allocating
write-faults, so this relies on i_mapdcount to disable block map updates
from write faults.

Like the pnfs case, MAP_DIRECT does its own timeout of the lease, since
we need a process context for running map_direct_invalidate().
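
For example, a filesystem vma holding one of these leases would keep
the map_direct_state live across fork/split and teardown via its
vm_operations, roughly as in the sketch below. The vm_ops names and
the use of vm_private_data are placeholders; the real consumer is
wired up in the xfs patches of this series:

static void example_mdirect_vm_open(struct vm_area_struct *vma)
{
	struct map_direct_state *mds = vma->vm_private_data;

	get_map_direct_vma(mds);	/* new vma reference on fork/split */
}

static void example_mdirect_vm_close(struct vm_area_struct *vma)
{
	struct map_direct_state *mds = vma->vm_private_data;

	put_map_direct_vma(mds);	/* final put unlocks the lease */
}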

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/Makefile               |    2 
 fs/mapdirect.c            |  232 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/mapdirect.h |   45 +++++++++
 3 files changed, 278 insertions(+), 1 deletion(-)
 create mode 100644 fs/mapdirect.c
 create mode 100644 include/linux/mapdirect.h

diff --git a/fs/Makefile b/fs/Makefile
index 7bbaca9c67b1..c0e791d235d8 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -29,7 +29,7 @@ obj-$(CONFIG_TIMERFD)		+= timerfd.o
 obj-$(CONFIG_EVENTFD)		+= eventfd.o
 obj-$(CONFIG_USERFAULTFD)	+= userfaultfd.o
 obj-$(CONFIG_AIO)               += aio.o
-obj-$(CONFIG_FS_DAX)		+= dax.o
+obj-$(CONFIG_FS_DAX)		+= dax.o mapdirect.o
 obj-$(CONFIG_FS_ENCRYPTION)	+= crypto/
 obj-$(CONFIG_FILE_LOCKING)      += locks.o
 obj-$(CONFIG_COMPAT)		+= compat.o compat_ioctl.o
diff --git a/fs/mapdirect.c b/fs/mapdirect.c
new file mode 100644
index 000000000000..9ac7c1d946a2
--- /dev/null
+++ b/fs/mapdirect.c
@@ -0,0 +1,232 @@
+/*
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#include <linux/mapdirect.h>
+#include <linux/workqueue.h>
+#include <linux/signal.h>
+#include <linux/mutex.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+
+#define MAPDIRECT_BREAK 0
+#define MAPDIRECT_VALID 1
+
+struct map_direct_state {
+	atomic_t mds_ref;
+	atomic_t mds_vmaref;
+	unsigned long mds_state;
+	struct inode *mds_inode;
+	struct delayed_work mds_work;
+	struct fasync_struct *mds_fa;
+	struct vm_area_struct *mds_vma;
+};
+
+bool is_map_direct_valid(struct map_direct_state *mds)
+{
+	return test_bit(MAPDIRECT_VALID, &mds->mds_state);
+}
+EXPORT_SYMBOL_GPL(is_map_direct_valid);
+
+static void put_map_direct(struct map_direct_state *mds)
+{
+	if (!atomic_dec_and_test(&mds->mds_ref))
+		return;
+	kfree(mds);
+}
+
+int put_map_direct_vma(struct map_direct_state *mds)
+{
+	struct vm_area_struct *vma = mds->mds_vma;
+	struct file *file = vma->vm_file;
+	struct inode *inode = file_inode(file);
+	void *owner = mds;
+
+	if (!atomic_dec_and_test(&mds->mds_vmaref))
+		return 0;
+
+	/*
+	 * Flush in-flight+forced lm_break events that may be
+	 * referencing this dying vma.
+	 */
+	mds->mds_vma = NULL;
+	set_bit(MAPDIRECT_BREAK, &mds->mds_state);
+	vfs_setlease(vma->vm_file, F_UNLCK, NULL, &owner);
+	flush_delayed_work(&mds->mds_work);
+	iput(inode);
+
+	put_map_direct(mds);
+	return 1;
+}
+EXPORT_SYMBOL_GPL(put_map_direct_vma);
+
+void get_map_direct_vma(struct map_direct_state *mds)
+{
+	atomic_inc(&mds->mds_vmaref);
+}
+EXPORT_SYMBOL_GPL(get_map_direct_vma);
+
+static void map_direct_invalidate(struct work_struct *work)
+{
+	struct map_direct_state *mds;
+	struct vm_area_struct *vma;
+	struct inode *inode;
+	void *owner;
+
+	mds = container_of(work, typeof(*mds), mds_work.work);
+
+	clear_bit(MAPDIRECT_VALID, &mds->mds_state);
+
+	vma = ACCESS_ONCE(mds->mds_vma);
+	inode = mds->mds_inode;
+	if (vma) {
+		unsigned long len = vma->vm_end - vma->vm_start;
+		loff_t start = (loff_t) vma->vm_pgoff * PAGE_SIZE;
+
+		unmap_mapping_range(inode->i_mapping, start, len, 1);
+		owner = mds;
+		vfs_setlease(vma->vm_file, F_UNLCK, NULL, &owner);
+	}
+	/* if @vma is gone, put_map_direct_vma() has already dropped the lease */
+
+	put_map_direct(mds);
+}
+
+static bool map_direct_lm_break(struct file_lock *fl)
+{
+	struct map_direct_state *mds = fl->fl_owner;
+
+	/*
+	 * Given that we need to take sleeping locks to invalidate the
+	 * mapping we schedule that work with the original timeout set
+	 * by the file-locks core. Then we tell the core to hold off on
+	 * continuing with the lease break until the delayed work
+	 * completes the invalidation and the lease unlock.
+	 *
+	 * Note that this assumes that i_mapdcount is protecting against
+	 * block-map modifying write-faults since we are unable to use
+	 * leases in that path due to locking constraints.
+	 */
+	if (!test_and_set_bit(MAPDIRECT_BREAK, &mds->mds_state)) {
+		schedule_delayed_work(&mds->mds_work, lease_break_time * HZ);
+		kill_fasync(&fl->fl_fasync, SIGIO, POLL_MSG);
+	}
+
+	/* Tell the core lease code to wait for delayed work completion */
+	fl->fl_break_time = 0;
+
+	return false;
+}
+
+static int map_direct_lm_change(struct file_lock *fl, int arg,
+		struct list_head *dispose)
+{
+	struct map_direct_state *mds = fl->fl_owner;
+
+	WARN_ON(!(arg & F_UNLCK));
+
+	i_mapdcount_dec(mds->mds_inode);
+	return lease_modify(fl, arg, dispose);
+}
+
+static void map_direct_lm_setup(struct file_lock *fl, void **priv)
+{
+	struct file *file = fl->fl_file;
+	struct map_direct_state *mds = *priv;
+	struct fasync_struct *fa = mds->mds_fa;
+
+	/*
+	 * Comment copied from lease_setup():
+	 * fasync_insert_entry() returns the old entry if any. If there was no
+	 * old entry, then it used "priv" and inserted it into the fasync list.
+	 * Clear the pointer to indicate that it shouldn't be freed.
+	 */
+	if (!fasync_insert_entry(fa->fa_fd, file, &fl->fl_fasync, fa))
+		*priv = NULL;
+
+	__f_setown(file, task_pid(current), PIDTYPE_PID, 0);
+}
+
+static const struct lock_manager_operations map_direct_lm_ops = {
+	.lm_break = map_direct_lm_break,
+	.lm_change = map_direct_lm_change,
+	.lm_setup = map_direct_lm_setup,
+};
+
+struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma)
+{
+	struct map_direct_state *mds = kzalloc(sizeof(*mds), GFP_KERNEL);
+	struct file *file = vma->vm_file;
+	struct inode *inode = file_inode(file);
+	struct fasync_struct *fa;
+	struct file_lock *fl;
+	void *owner = mds;
+	int rc = -ENOMEM;
+
+	if (!mds)
+		return ERR_PTR(-ENOMEM);
+
+	mds->mds_vma = vma;
+	atomic_set(&mds->mds_ref, 1);
+	atomic_set(&mds->mds_vmaref, 1);
+	set_bit(MAPDIRECT_VALID, &mds->mds_state);
+	mds->mds_inode = inode;
+	ihold(inode);
+	INIT_DELAYED_WORK(&mds->mds_work, map_direct_invalidate);
+
+	fa = fasync_alloc();
+	if (!fa)
+		goto err_fasync_alloc;
+	mds->mds_fa = fa;
+	fa->fa_fd = fd;
+
+	fl = locks_alloc_lock();
+	if (!fl)
+		goto err_lock_alloc;
+
+	locks_init_lock(fl);
+	fl->fl_lmops = &map_direct_lm_ops;
+	fl->fl_flags = FL_LAYOUT;
+	fl->fl_type = F_RDLCK;
+	fl->fl_end = OFFSET_MAX;
+	fl->fl_owner = mds;
+	atomic_inc(&mds->mds_ref);
+	fl->fl_pid = current->tgid;
+	fl->fl_file = file;
+
+	rc = vfs_setlease(file, fl->fl_type, &fl, &owner);
+	if (rc)
+		goto err_setlease;
+	if (fl) {
+		WARN_ON(1);
+		owner = mds;
+		vfs_setlease(file, F_UNLCK, NULL, &owner);
+		owner = NULL;
+		rc = -ENXIO;
+		goto err_setlease;
+	}
+
+	i_mapdcount_inc(inode);
+	return mds;
+
+err_setlease:
+	locks_free_lock(fl);
+err_lock_alloc:
+	/* if owner is NULL then the lease machinery is responsible for @fa */
+	if (owner)
+		fasync_free(fa);
+err_fasync_alloc:
+	iput(inode);
+	kfree(mds);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_GPL(map_direct_register);
diff --git a/include/linux/mapdirect.h b/include/linux/mapdirect.h
new file mode 100644
index 000000000000..724e27d8615e
--- /dev/null
+++ b/include/linux/mapdirect.h
@@ -0,0 +1,45 @@
+/*
+ * Copyright(c) 2017 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#ifndef __MAPDIRECT_H__
+#define __MAPDIRECT_H__
+#include <linux/err.h>
+
+struct inode;
+struct work_struct;
+struct vm_area_struct;
+struct map_direct_state;
+
+#if IS_ENABLED(CONFIG_FS_DAX)
+struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma);
+int put_map_direct_vma(struct map_direct_state *mds);
+void get_map_direct_vma(struct map_direct_state *mds);
+bool is_map_direct_valid(struct map_direct_state *mds);
+#else
+static inline struct map_direct_state *map_direct_register(int fd,
+		struct vm_area_struct *vma)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+static inline int put_map_direct_vma(struct map_direct_state *mds)
+{
+	return 0;
+}
+static inline void get_map_direct_vma(struct map_direct_state *mds)
+{
+}
+static inline bool is_map_direct_valid(struct map_direct_state *mds)
+{
+	return false;
+}
+#endif
+#endif /* __MAPDIRECT_H__ */
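
A sketch of how a filesystem might pair these helpers with its vma
operations (illustrative; the actual wiring lands in "xfs: wire up
MAP_DIRECT" later in this series, and stashing the state in
->vm_private_data is an assumption of this sketch):

	static void example_mdirect_open(struct vm_area_struct *vma)
	{
		/* vma duplicated by fork()/split: take a reference */
		get_map_direct_vma(vma->vm_private_data);
	}

	static void example_mdirect_close(struct vm_area_struct *vma)
	{
		/* final unmap drops the lease and flushes invalidation */
		put_map_direct_vma(vma->vm_private_data);
	}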


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v7 05/12] xfs: prepare xfs_break_layouts() for reuse with MAP_DIRECT
@ 2017-10-06 22:35   ` Dan Williams
  -1 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:35 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jan Kara, Darrick J. Wong, linux-rdma, linux-api, Dave Chinner,
	linux-xfs, linux-mm, linux-fsdevel, Christoph Hellwig

Move xfs_break_layouts() to its own compilation unit so that it can be
used for both pnfs layouts and MAP_DIRECT mappings.
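
As the ASSERT in the helper documents, callers must already hold the
inode's IOLOCK; an illustrative caller pattern (not taken from this
patch):

	uint	iolock = XFS_IOLOCK_EXCL;
	int	error;

	xfs_ilock(ip, iolock);
	error = xfs_break_layouts(VFS_I(ip), &iolock);
	if (error) {
		xfs_iunlock(ip, iolock);
		return error;
	}
	/*
	 * note: xfs_break_layouts() may cycle the lock and leave
	 * *iolock upgraded to XFS_IOLOCK_EXCL
	 */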

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/xfs/Kconfig      |    4 ++++
 fs/xfs/Makefile     |    1 +
 fs/xfs/xfs_layout.c |   42 ++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_layout.h |   13 +++++++++++++
 fs/xfs/xfs_pnfs.c   |   30 ------------------------------
 fs/xfs/xfs_pnfs.h   |   10 ++--------
 6 files changed, 62 insertions(+), 38 deletions(-)
 create mode 100644 fs/xfs/xfs_layout.c
 create mode 100644 fs/xfs/xfs_layout.h

diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
index 1b98cfa342ab..f62fc6629abb 100644
--- a/fs/xfs/Kconfig
+++ b/fs/xfs/Kconfig
@@ -109,3 +109,7 @@ config XFS_ASSERT_FATAL
 	  result in warnings.
 
 	  This behavior can be modified at runtime via sysfs.
+
+config XFS_LAYOUT
+	def_bool y
+	depends on EXPORTFS_BLOCK_OPS
diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index a6e955bfead8..d44135107490 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -135,3 +135,4 @@ xfs-$(CONFIG_XFS_POSIX_ACL)	+= xfs_acl.o
 xfs-$(CONFIG_SYSCTL)		+= xfs_sysctl.o
 xfs-$(CONFIG_COMPAT)		+= xfs_ioctl32.o
 xfs-$(CONFIG_EXPORTFS_BLOCK_OPS)	+= xfs_pnfs.o
+xfs-$(CONFIG_XFS_LAYOUT)	+= xfs_layout.o
diff --git a/fs/xfs/xfs_layout.c b/fs/xfs/xfs_layout.c
new file mode 100644
index 000000000000..71d95e1a910a
--- /dev/null
+++ b/fs/xfs/xfs_layout.c
@@ -0,0 +1,42 @@
+/*
+ * Copyright (c) 2014 Christoph Hellwig.
+ */
+#include "xfs.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_sb.h"
+#include "xfs_mount.h"
+#include "xfs_inode.h"
+
+#include <linux/fs.h>
+
+/*
+ * Ensure that we do not have any outstanding pNFS layouts that can be used by
+ * clients to directly read from or write to this inode.  This must be called
+ * before every operation that can remove blocks from the extent map.
+ * Additionally we call it during the write operation, where we aren't concerned
+ * about exposing unallocated blocks but just want to provide basic
+ * synchronization between a local writer and pNFS clients.  mmap writes would
+ * also benefit from this sort of synchronization, but due to the tricky locking
+ * rules in the page fault path we don't bother.
+ */
+int
+xfs_break_layouts(
+	struct inode		*inode,
+	uint			*iolock)
+{
+	struct xfs_inode	*ip = XFS_I(inode);
+	int			error;
+
+	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL));
+
+	while ((error = break_layout(inode, false)) == -EWOULDBLOCK) {
+		xfs_iunlock(ip, *iolock);
+		error = break_layout(inode, true);
+		*iolock = XFS_IOLOCK_EXCL;
+		xfs_ilock(ip, *iolock);
+	}
+
+	return error;
+}
diff --git a/fs/xfs/xfs_layout.h b/fs/xfs/xfs_layout.h
new file mode 100644
index 000000000000..f848ee78cc93
--- /dev/null
+++ b/fs/xfs/xfs_layout.h
@@ -0,0 +1,13 @@
+#ifndef _XFS_LAYOUT_H
+#define _XFS_LAYOUT_H 1
+
+#ifdef CONFIG_XFS_LAYOUT
+int xfs_break_layouts(struct inode *inode, uint *iolock);
+#else
+static inline int
+xfs_break_layouts(struct inode *inode, uint *iolock)
+{
+	return 0;
+}
+#endif /* CONFIG_XFS_LAYOUT */
+#endif /* _XFS_LAYOUT_H */
diff --git a/fs/xfs/xfs_pnfs.c b/fs/xfs/xfs_pnfs.c
index 2f2dc3c09ad0..8ec72220e73b 100644
--- a/fs/xfs/xfs_pnfs.c
+++ b/fs/xfs/xfs_pnfs.c
@@ -20,36 +20,6 @@
 #include "xfs_pnfs.h"
 
 /*
- * Ensure that we do not have any outstanding pNFS layouts that can be used by
- * clients to directly read from or write to this inode.  This must be called
- * before every operation that can remove blocks from the extent map.
- * Additionally we call it during the write operation, where aren't concerned
- * about exposing unallocated blocks but just want to provide basic
- * synchronization between a local writer and pNFS clients.  mmap writes would
- * also benefit from this sort of synchronization, but due to the tricky locking
- * rules in the page fault path we don't bother.
- */
-int
-xfs_break_layouts(
-	struct inode		*inode,
-	uint			*iolock)
-{
-	struct xfs_inode	*ip = XFS_I(inode);
-	int			error;
-
-	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL));
-
-	while ((error = break_layout(inode, false) == -EWOULDBLOCK)) {
-		xfs_iunlock(ip, *iolock);
-		error = break_layout(inode, true);
-		*iolock = XFS_IOLOCK_EXCL;
-		xfs_ilock(ip, *iolock);
-	}
-
-	return error;
-}
-
-/*
  * Get a unique ID including its location so that the client can identify
  * the exported device.
  */
diff --git a/fs/xfs/xfs_pnfs.h b/fs/xfs/xfs_pnfs.h
index b587cb99b2b7..4135b2482697 100644
--- a/fs/xfs/xfs_pnfs.h
+++ b/fs/xfs/xfs_pnfs.h
@@ -1,19 +1,13 @@
 #ifndef _XFS_PNFS_H
 #define _XFS_PNFS_H 1
 
+#include "xfs_layout.h"
+
 #ifdef CONFIG_EXPORTFS_BLOCK_OPS
 int xfs_fs_get_uuid(struct super_block *sb, u8 *buf, u32 *len, u64 *offset);
 int xfs_fs_map_blocks(struct inode *inode, loff_t offset, u64 length,
 		struct iomap *iomap, bool write, u32 *device_generation);
 int xfs_fs_commit_blocks(struct inode *inode, struct iomap *maps, int nr_maps,
 		struct iattr *iattr);
-
-int xfs_break_layouts(struct inode *inode, uint *iolock);
-#else
-static inline int
-xfs_break_layouts(struct inode *inode, uint *iolock)
-{
-	return 0;
-}
 #endif /* CONFIG_EXPORTFS_BLOCK_OPS */
 #endif /* _XFS_PNFS_H */


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v7 06/12] xfs: wire up MAP_DIRECT
@ 2017-10-06 22:35   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:35 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: J. Bruce Fields, Jan Kara, Arnd Bergmann, Darrick J. Wong,
	linux-rdma, linux-api, Dave Chinner, linux-xfs, linux-mm,
	Alexander Viro, linux-fsdevel, Jeff Layton, Christoph Hellwig

MAP_DIRECT is an mmap(2) flag with the following semantics:

  MAP_DIRECT
  When specified with MAP_SHARED_VALIDATE, sets up a file lease with the
  same lifetime as the mapping. Unlike a typical F_RDLCK lease this lease
  is broken when a "lease breaker" attempts to write(2), change the block
  map (fallocate), or change the size of the file. Otherwise the mechanism
  of a lease break is identical to the typical lease break case where the
  lease needs to be removed (munmap) within the number of seconds
  specified by /proc/sys/fs/lease-break-time. If the lease holder fails to
  remove the lease in time the kernel will invalidate the mapping and
  force all future accesses to the mapping to trigger SIGBUS.

  In addition to lease break timeouts causing faults in the mapping to
  result in SIGBUS, other states of the file will trigger SIGBUS at fault
  time:

      * The file is not DAX capable
      * The file has reflinked (copy-on-write) blocks
      * The fault would trigger the filesystem to allocate blocks
      * The fault would trigger the filesystem to perform extent conversion

  In other words, MAP_DIRECT expects and enforces a fully allocated file
  where faults can be satisfied without modifying block map metadata.

  An unprivileged process may establish a MAP_DIRECT mapping on a file
  whose UID (owner) matches the filesystem UID of the process. A process
  with the CAP_LEASE capability may establish a MAP_DIRECT mapping on
  arbitrary files.

  ERRORS
  EACCES In addition to the typical mmap(2) conditions that trigger
  EACCES, MAP_DIRECT also requires permission to set a file lease.

  EOPNOTSUPP The filesystem explicitly does not support the flag

  SIGBUS Attempted to write a MAP_DIRECT mapping at a file offset that
         might require block-map updates, or the lease timed out and the
         kernel invalidated the mapping.
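
  To make the semantics concrete, a userspace sketch follows (not part
  of the patch). MAP_SHARED_VALIDATE (0x3) comes from patch 1 of this
  series and MAP_DIRECT (0x80000) from this patch; the local defines are
  a stopgap until libc headers carry them, and the write(2) loop
  reflects the fully-allocated requirement above, since fallocate(2)
  alone would leave unwritten extents whose conversion at fault time
  triggers SIGBUS:

	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <unistd.h>

	#ifndef MAP_SHARED_VALIDATE
	#define MAP_SHARED_VALIDATE 0x3
	#endif
	#ifndef MAP_DIRECT
	#define MAP_DIRECT 0x80000
	#endif

	int main(int argc, char **argv)
	{
		char buf[4096];
		size_t len = sizeof(buf) * 512, i;
		int fd;
		void *addr;

		if (argc < 2)
			return 1;
		fd = open(argv[1], O_CREAT | O_RDWR, 0600);
		if (fd < 0)
			return 1;
		/* fully allocate written extents before mapping */
		memset(buf, 0, sizeof(buf));
		for (i = 0; i < len; i += sizeof(buf))
			if (write(fd, buf, sizeof(buf)) != sizeof(buf))
				return 1;
		addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
				MAP_SHARED_VALIDATE | MAP_DIRECT, fd, 0);
		if (addr == MAP_FAILED) {
			perror("mmap"); /* e.g. EOPNOTSUPP */
			return 1;
		}
		/* ... flush-from-userspace I/O; SIGBUS if the lease breaks ... */
		munmap(addr, len);
		close(fd);
		return 0;
	}
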

Cc: Jan Kara <jack@suse.cz>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/xfs/Kconfig                  |    2 -
 fs/xfs/xfs_file.c               |  102 +++++++++++++++++++++++++++++++++++++++
 include/linux/mman.h            |    3 +
 include/uapi/asm-generic/mman.h |    1 
 4 files changed, 106 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
index f62fc6629abb..f8765653a438 100644
--- a/fs/xfs/Kconfig
+++ b/fs/xfs/Kconfig
@@ -112,4 +112,4 @@ config XFS_ASSERT_FATAL
 
 config XFS_LAYOUT
 	def_bool y
-	depends on EXPORTFS_BLOCK_OPS
+	depends on EXPORTFS_BLOCK_OPS || FS_DAX
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index ebdd0bd2b261..e35518600e28 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -40,12 +40,22 @@
 #include "xfs_iomap.h"
 #include "xfs_reflink.h"
 
+#include <linux/mman.h>
 #include <linux/dcache.h>
 #include <linux/falloc.h>
 #include <linux/pagevec.h>
+#include <linux/mapdirect.h>
 #include <linux/backing-dev.h>
 
 static const struct vm_operations_struct xfs_file_vm_ops;
+static const struct vm_operations_struct xfs_file_vm_direct_ops;
+
+static inline bool
+is_xfs_map_direct(
+		struct vm_area_struct *vma)
+{
+	return vma->vm_ops == &xfs_file_vm_direct_ops;
+}
 
 /*
  * Clear the specified ranges to zero through either the pagecache or DAX.
@@ -1008,6 +1018,26 @@ xfs_file_llseek(
 	return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
 }
 
+static int
+xfs_vma_checks(
+	struct vm_area_struct	*vma,
+	struct inode		*inode)
+{
+	if (!is_xfs_map_direct(vma))
+		return 0;
+
+	if (!is_map_direct_valid(vma->vm_private_data))
+		return VM_FAULT_SIGBUS;
+
+	if (xfs_is_reflink_inode(XFS_I(inode)))
+		return VM_FAULT_SIGBUS;
+
+	if (!IS_DAX(inode))
+		return VM_FAULT_SIGBUS;
+
+	return 0;
+}
+
 /*
  * Locking for serialisation of IO during page faults. This results in a lock
  * ordering of:
@@ -1024,6 +1054,7 @@ __xfs_filemap_fault(
 	enum page_entry_size	pe_size,
 	bool			write_fault)
 {
+	struct vm_area_struct	*vma = vmf->vma;
 	struct inode		*inode = file_inode(vmf->vma->vm_file);
 	struct xfs_inode	*ip = XFS_I(inode);
 	int			ret;
@@ -1032,10 +1063,14 @@ __xfs_filemap_fault(
 
 	if (write_fault) {
 		sb_start_pagefault(inode->i_sb);
-		file_update_time(vmf->vma->vm_file);
+		file_update_time(vma->vm_file);
 	}
 
 	xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
+	ret = xfs_vma_checks(vma, inode);
+	if (ret)
+		goto out_unlock;
+
 	if (IS_DAX(inode)) {
 		ret = dax_iomap_fault(vmf, pe_size, &xfs_iomap_ops);
 	} else {
@@ -1044,6 +1079,8 @@ __xfs_filemap_fault(
 		else
 			ret = filemap_fault(vmf);
 	}
+
+out_unlock:
 	xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
 
 	if (write_fault)
@@ -1115,6 +1152,31 @@ xfs_filemap_pfn_mkwrite(
 
 }
 
+static void
+xfs_filemap_direct_open(
+	struct vm_area_struct	*vma)
+{
+	get_map_direct_vma(vma->vm_private_data);
+}
+
+static void
+xfs_filemap_direct_close(
+	struct vm_area_struct	*vma)
+{
+	put_map_direct_vma(vma->vm_private_data);
+}
+
+static const struct vm_operations_struct xfs_file_vm_direct_ops = {
+	.fault		= xfs_filemap_fault,
+	.huge_fault	= xfs_filemap_huge_fault,
+	.map_pages	= filemap_map_pages,
+	.page_mkwrite	= xfs_filemap_page_mkwrite,
+	.pfn_mkwrite	= xfs_filemap_pfn_mkwrite,
+
+	.open		= xfs_filemap_direct_open,
+	.close		= xfs_filemap_direct_close,
+};
+
 static const struct vm_operations_struct xfs_file_vm_ops = {
 	.fault		= xfs_filemap_fault,
 	.huge_fault	= xfs_filemap_huge_fault,
@@ -1135,6 +1197,43 @@ xfs_file_mmap(
 	return 0;
 }
 
+#define XFS_MAP_SUPPORTED (LEGACY_MAP_MASK | MAP_DIRECT)
+
+STATIC int
+xfs_file_mmap_validate(
+	struct file		*filp,
+	struct vm_area_struct	*vma,
+	unsigned long		map_flags,
+	int			fd)
+{
+	struct inode		*inode = file_inode(filp);
+	struct xfs_inode	*ip = XFS_I(inode);
+	struct map_direct_state	*mds;
+
+	if (map_flags & ~(XFS_MAP_SUPPORTED))
+		return -EOPNOTSUPP;
+
+	if ((map_flags & MAP_DIRECT) == 0)
+		return xfs_file_mmap(filp, vma);
+
+	file_accessed(filp);
+	vma->vm_ops = &xfs_file_vm_direct_ops;
+	if (IS_DAX(inode))
+		vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;
+
+	mds = map_direct_register(fd, vma);
+	if (IS_ERR(mds))
+		return PTR_ERR(mds);
+
+	/* flush in-flight faults */
+	xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
+	xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
+
+	vma->vm_private_data = mds;
+
+	return 0;
+}
+
 const struct file_operations xfs_file_operations = {
 	.llseek		= xfs_file_llseek,
 	.read_iter	= xfs_file_read_iter,
@@ -1146,6 +1245,7 @@ const struct file_operations xfs_file_operations = {
 	.compat_ioctl	= xfs_file_compat_ioctl,
 #endif
 	.mmap		= xfs_file_mmap,
+	.mmap_validate	= xfs_file_mmap_validate,
 	.open		= xfs_file_open,
 	.release	= xfs_file_release,
 	.fsync		= xfs_file_fsync,
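
For context, a hypothetical sketch of the core-mm dispatch added by the
->mmap_validate() patch earlier in this series (the exact plumbing lives
in that patch; this only illustrates why xfs_file_mmap_validate() above
receives the original mmap flags and the fd):

	static int example_call_mmap(struct file *file,
			struct vm_area_struct *vma,
			unsigned long map_flags, int fd)
	{
		/* prefer the richer hook when the filesystem provides one */
		if (file->f_op->mmap_validate)
			return file->f_op->mmap_validate(file, vma,
					map_flags, fd);
		if (file->f_op->mmap)
			return file->f_op->mmap(file, vma);
		return -ENODEV;
	}
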
diff --git a/include/linux/mman.h b/include/linux/mman.h
index 94b63b4d71ff..fab393a9dda9 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -20,6 +20,9 @@
 #ifndef MAP_HUGE_1GB
 #define MAP_HUGE_1GB 0
 #endif
+#ifndef MAP_DIRECT
+#define MAP_DIRECT 0
+#endif
 #ifndef MAP_UNINITIALIZED
 #define MAP_UNINITIALIZED 0
 #endif
diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
index 7162cd4cca73..c916f22008e0 100644
--- a/include/uapi/asm-generic/mman.h
+++ b/include/uapi/asm-generic/mman.h
@@ -12,6 +12,7 @@
 #define MAP_NONBLOCK	0x10000		/* do not block on IO */
 #define MAP_STACK	0x20000		/* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB	0x40000		/* create a huge page mapping */
+#define MAP_DIRECT	0x80000		/* leased block map (layout) for DAX */
 
 /* Bits [26:31] are reserved, see mman-common.h for MAP_HUGETLB usage */
 


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-06 22:35   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:35 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, Dave Chinner, Robin Murphy,
	linux-xfs, linux-mm, linux-api, linux-fsdevel, David Woodhouse,
	Christoph Hellwig, Marek Szyprowski

Add a helper to determine if the dma mappings set up for a given device
are backed by an iommu. In particular, this lets code paths know that a
dma_unmap operation will revoke access to memory if the device can not
otherwise be quiesced. This knowledge is needed to make RDMA transfers
to DAX mappings safe: if the DAX file's block map changes, we need to
be able to reliably stop accesses to blocks that have been freed or
re-assigned to a new file.

Since PMEM+DAX is currently only enabled for x86, we only update the x86
iommu drivers.
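
As a hypothetical illustration (the real consumer is the IB/core memory
registration patch later in this series, and example_register_dax() is
an invented name), a registration path would gate MAP_DIRECT support on
this helper:

	/* refuse registration when device access is not revocable */
	static int example_register_dax(struct device *dma_device)
	{
		if (!dma_has_iommu(dma_device))
			return -EOPNOTSUPP;
		/*
		 * Safe to proceed: on lease break, dma_unmap will fence
		 * the device even if it cannot otherwise be quiesced.
		 */
		return 0;
	}
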

Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/base/dma-mapping.c  |   10 ++++++++++
 drivers/iommu/amd_iommu.c   |    6 ++++++
 drivers/iommu/intel-iommu.c |    6 ++++++
 include/linux/dma-mapping.h |    3 +++
 4 files changed, 25 insertions(+)

diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
index e584eddef0a7..e1b5f103d90e 100644
--- a/drivers/base/dma-mapping.c
+++ b/drivers/base/dma-mapping.c
@@ -369,3 +369,13 @@ void dma_deconfigure(struct device *dev)
 	of_dma_deconfigure(dev);
 	acpi_dma_deconfigure(dev);
 }
+
+bool dma_has_iommu(struct device *dev)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	if (ops && ops->has_iommu)
+		return ops->has_iommu(dev);
+	return false;
+}
+EXPORT_SYMBOL(dma_has_iommu);
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 51f8215877f5..873f899fcf57 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2271,6 +2271,11 @@ static struct protection_domain *get_domain(struct device *dev)
 	return domain;
 }
 
+static bool amd_dma_has_iommu(struct device *dev)
+{
+	return !IS_ERR(get_domain(dev));
+}
+
 static void update_device_table(struct protection_domain *domain)
 {
 	struct iommu_dev_data *dev_data;
@@ -2689,6 +2694,7 @@ static const struct dma_map_ops amd_iommu_dma_ops = {
 	.unmap_sg	= unmap_sg,
 	.dma_supported	= amd_iommu_dma_supported,
 	.mapping_error	= amd_iommu_mapping_error,
+	.has_iommu	= amd_dma_has_iommu,
 };
 
 static int init_reserved_iova_ranges(void)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 6784a05dd6b2..243ef42fdad4 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3578,6 +3578,11 @@ static int iommu_no_mapping(struct device *dev)
 	return 0;
 }
 
+static bool intel_dma_has_iommu(struct device *dev)
+{
+	return !iommu_no_mapping(dev);
+}
+
 static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
 				     size_t size, int dir, u64 dma_mask)
 {
@@ -3872,6 +3877,7 @@ const struct dma_map_ops intel_dma_ops = {
 	.map_page = intel_map_page,
 	.unmap_page = intel_unmap_page,
 	.mapping_error = intel_mapping_error,
+	.has_iommu = intel_dma_has_iommu,
 #ifdef CONFIG_X86
 	.dma_supported = x86_dma_supported,
 #endif
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 29ce9815da87..659f122c18f5 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -128,6 +128,7 @@ struct dma_map_ops {
 				   enum dma_data_direction dir);
 	int (*mapping_error)(struct device *dev, dma_addr_t dma_addr);
 	int (*dma_supported)(struct device *dev, u64 mask);
+	bool (*has_iommu)(struct device *dev);
 #ifdef ARCH_HAS_DMA_GET_REQUIRED_MASK
 	u64 (*get_required_mask)(struct device *dev);
 #endif
@@ -221,6 +222,8 @@ static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
 }
 #endif
 
+extern bool dma_has_iommu(struct device *dev);
+
 static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
 					      size_t size,
 					      enum dma_data_direction dir,


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v7 08/12] fs, mapdirect: introduce ->lease_direct()
@ 2017-10-06 22:36   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:36 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: J. Bruce Fields, Jan Kara, Darrick J. Wong, linux-rdma,
	linux-api, Dave Chinner, linux-xfs, linux-mm, linux-fsdevel,
	Jeff Layton, Christoph Hellwig

Provide a vma operation that registers a lease that is broken by
break_layout(). This is motivated by a need to stop in-progress RDMA
when the block map of a DAX file changes. That is, since DAX gives
direct access to filesystem blocks, we can not allow those blocks to
move or change state while they are under active RDMA. So, if the
filesystem determines it needs to move blocks, it can revoke device
access before proceeding.
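
To make the contract concrete, here is a hypothetical sketch of a
->lease_direct() subscriber; everything except the vm operation itself
and map_direct_lease_destroy() is an invented stand-in for the IB/core
usage later in this series:

	/* invented registration context for a device mapping */
	struct example_reg {
		struct lease_direct *ld;
		/* ... device / DMA state ... */
	};

	/* called when the filesystem must change the block map */
	static void example_break_fn(void *owner)
	{
		struct example_reg *reg = owner;

		/* quiesce and unmap device access for @reg here */
		(void)reg;
	}

	static int example_subscribe(struct vm_area_struct *vma,
			struct example_reg *reg)
	{
		if (!vma->vm_ops || !vma->vm_ops->lease_direct)
			return -EOPNOTSUPP;
		reg->ld = vma->vm_ops->lease_direct(vma, example_break_fn,
				reg);
		if (IS_ERR(reg->ld))
			return PTR_ERR(reg->ld);
		return 0;
	}

	/* on teardown, drop the lease and wait for any break work */
	static void example_unsubscribe(struct example_reg *reg)
	{
		map_direct_lease_destroy(reg->ld);
	}
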

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/mapdirect.c            |  117 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/mapdirect.h |   23 +++++++++
 include/linux/mm.h        |    6 ++
 3 files changed, 146 insertions(+)

diff --git a/fs/mapdirect.c b/fs/mapdirect.c
index 9ac7c1d946a2..338cbe055fc7 100644
--- a/fs/mapdirect.c
+++ b/fs/mapdirect.c
@@ -16,6 +16,7 @@
 #include <linux/mutex.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/file.h>
 #include <linux/fs.h>
 #include <linux/mm.h>
 
@@ -32,12 +33,26 @@ struct map_direct_state {
 	struct vm_area_struct *mds_vma;
 };
 
+struct lease_direct_state {
+	void *lds_owner;
+	struct file *lds_file;
+	unsigned long lds_state;
+	void (*lds_break_fn)(void *lds_owner);
+	struct work_struct lds_work;
+};
+
 bool is_map_direct_valid(struct map_direct_state *mds)
 {
 	return test_bit(MAPDIRECT_VALID, &mds->mds_state);
 }
 EXPORT_SYMBOL_GPL(is_map_direct_valid);
 
+bool is_map_direct_broken(struct map_direct_state *mds)
+{
+	return test_bit(MAPDIRECT_BREAK, &mds->mds_state);
+}
+EXPORT_SYMBOL_GPL(is_map_direct_broken);
+
 static void put_map_direct(struct map_direct_state *mds)
 {
 	if (!atomic_dec_and_test(&mds->mds_ref))
@@ -162,6 +177,108 @@ static const struct lock_manager_operations map_direct_lm_ops = {
 	.lm_setup = map_direct_lm_setup,
 };
 
+static void lease_direct_invalidate(struct work_struct *work)
+{
+	struct lease_direct_state *lds;
+	void *owner;
+
+	lds = container_of(work, typeof(*lds), lds_work);
+	owner = lds;
+	lds->lds_break_fn(lds->lds_owner);
+	vfs_setlease(lds->lds_file, F_UNLCK, NULL, &owner);
+}
+
+static bool lease_direct_lm_break(struct file_lock *fl)
+{
+	struct lease_direct_state *lds = fl->fl_owner;
+
+	if (!test_and_set_bit(MAPDIRECT_BREAK, &lds->lds_state))
+		schedule_work(&lds->lds_work);
+	return false;
+}
+
+static int lease_direct_lm_change(struct file_lock *fl, int arg,
+		struct list_head *dispose)
+{
+	WARN_ON(!(arg & F_UNLCK));
+	return lease_modify(fl, arg, dispose);
+}
+
+static const struct lock_manager_operations lease_direct_lm_ops = {
+	.lm_break = lease_direct_lm_break,
+	.lm_change = lease_direct_lm_change,
+};
+
+struct lease_direct *map_direct_lease(struct vm_area_struct *vma,
+		void (*lds_break_fn)(void *), void *lds_owner)
+{
+	struct file *file = vma->vm_file;
+	struct lease_direct_state *lds;
+	struct lease_direct *ld;
+	struct file_lock *fl;
+	int rc = -ENOMEM;
+	void *owner;
+
+	ld = kzalloc(sizeof(*ld) + sizeof(*lds), GFP_KERNEL);
+	if (!ld)
+		return ERR_PTR(-ENOMEM);
+	INIT_LIST_HEAD(&ld->list);
+	lds = (struct lease_direct_state *)(ld + 1);
+	owner = lds;
+	ld->lds = lds;
+	lds->lds_break_fn = lds_break_fn;
+	lds->lds_owner = lds_owner;
+	INIT_WORK(&lds->lds_work, lease_direct_invalidate);
+	lds->lds_file = get_file(file);
+
+	fl = locks_alloc_lock();
+	if (!fl)
+		goto err_lock_alloc;
+
+	locks_init_lock(fl);
+	fl->fl_lmops = &lease_direct_lm_ops;
+	fl->fl_flags = FL_LAYOUT;
+	fl->fl_type = F_RDLCK;
+	fl->fl_end = OFFSET_MAX;
+	fl->fl_owner = lds;
+	fl->fl_pid = current->tgid;
+	fl->fl_file = file;
+
+	rc = vfs_setlease(file, fl->fl_type, &fl, &owner);
+	if (rc)
+		goto err_setlease;
+	if (fl) {
+		WARN_ON(1);
+		owner = lds;
+		vfs_setlease(file, F_UNLCK, NULL, &owner);
+		owner = NULL;
+		rc = -ENXIO;
+		goto err_setlease;
+	}
+
+	return ld;
+err_setlease:
+	locks_free_lock(fl);
+err_lock_alloc:
+	kfree(ld);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_GPL(map_direct_lease);
+
+void map_direct_lease_destroy(struct lease_direct *ld)
+{
+	struct lease_direct_state *lds = ld->lds;
+	struct file *file = lds->lds_file;
+	void *owner = lds;
+
+	vfs_setlease(file, F_UNLCK, NULL, &owner);
+	flush_work(&lds->lds_work);
+	fput(file);
+	WARN_ON(!list_empty(&ld->list));
+	kfree(ld);
+}
+EXPORT_SYMBOL_GPL(map_direct_lease_destroy);
+
 struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma)
 {
 	struct map_direct_state *mds = kzalloc(sizeof(*mds), GFP_KERNEL);
diff --git a/include/linux/mapdirect.h b/include/linux/mapdirect.h
index 724e27d8615e..dc4d4ba677d0 100644
--- a/include/linux/mapdirect.h
+++ b/include/linux/mapdirect.h
@@ -13,17 +13,28 @@
 #ifndef __MAPDIRECT_H__
 #define __MAPDIRECT_H__
 #include <linux/err.h>
+#include <linux/list.h>
 
 struct inode;
 struct work_struct;
 struct vm_area_struct;
 struct map_direct_state;
+struct lease_direct_state;
+
+struct lease_direct {
+	struct list_head list;
+	struct lease_direct_state *lds;
+};
 
 #if IS_ENABLED(CONFIG_FS_DAX)
 struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma);
 int put_map_direct_vma(struct map_direct_state *mds);
 void get_map_direct_vma(struct map_direct_state *mds);
 bool is_map_direct_valid(struct map_direct_state *mds);
+bool is_map_direct_broken(struct map_direct_state *mds);
+struct lease_direct *map_direct_lease(struct vm_area_struct *vma,
+		void (*ld_break_fn)(void *), void *ld_owner);
+void map_direct_lease_destroy(struct lease_direct *ld);
 #else
 static inline struct map_direct_state *map_direct_register(int fd,
 		struct vm_area_struct *vma)
@@ -41,5 +52,17 @@ bool is_map_direct_valid(struct map_direct_state *mds)
 {
 	return false;
 }
+static inline bool is_map_direct_broken(struct map_direct_state *mds)
+{
+	return false;
+}
+static inline struct lease_direct *map_direct_lease(struct vm_area_struct *vma,
+		void (*ld_break_fn)(void *), void *ld_owner)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+static inline void map_direct_lease_destroy(struct lease_direct *ld)
+{
+}
 #endif
 #endif /* __MAPDIRECT_H__ */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0afa19feb755..d03953f91ce8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -420,6 +420,12 @@ struct vm_operations_struct {
 	 */
 	struct page *(*find_special_page)(struct vm_area_struct *vma,
 					  unsigned long addr);
+	/*
+	 * Called by rdma memory registration to subscribe for "break"
+	 * events that require any ongoing rdma accesses to quiesce.
+	 */
+	struct lease_direct *(*lease_direct)(struct vm_area_struct *vma,
+			void (*break_fn)(void *), void *owner);
 };
 
 struct mmu_gather;


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v7 08/12] fs, mapdirect: introduce ->lease_direct()
@ 2017-10-06 22:36   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:36 UTC (permalink / raw)
  To: linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw
  Cc: linux-xfs-u79uwXL29TY76Z2rM5mHXA, Jan Kara, Darrick J. Wong,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Dave Chinner,
	Christoph Hellwig, J. Bruce Fields,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jeff Moyer,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Jeff Layton, Ross Zwisler

Provide a vma operation that registers a lease that is broken by
break_layout(). This is motivated by a need to stop in-progress RDMA
when the block-map of a DAX-file changes. I.e. since DAX gives
direct-access to filesystem blocks we can not allow those blocks to move
or change state while they are under active RDMA. So, if the filesystem
determines it needs to move blocks it can revoke device access before
proceeding.

Cc: Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
Cc: Jeff Moyer <jmoyer-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Cc: Dave Chinner <david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org>
Cc: "Darrick J. Wong" <darrick.wong-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: Ross Zwisler <ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Cc: Jeff Layton <jlayton-vpEMnDpepFuMZCB2o+C8xQ@public.gmane.org>
Cc: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
Signed-off-by: Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 fs/mapdirect.c            |  117 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/mapdirect.h |   23 +++++++++
 include/linux/mm.h        |    6 ++
 3 files changed, 146 insertions(+)

diff --git a/fs/mapdirect.c b/fs/mapdirect.c
index 9ac7c1d946a2..338cbe055fc7 100644
--- a/fs/mapdirect.c
+++ b/fs/mapdirect.c
@@ -16,6 +16,7 @@
 #include <linux/mutex.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/file.h>
 #include <linux/fs.h>
 #include <linux/mm.h>
 
@@ -32,12 +33,26 @@ struct map_direct_state {
 	struct vm_area_struct *mds_vma;
 };
 
+struct lease_direct_state {
+	void *lds_owner;
+	struct file *lds_file;
+	unsigned long lds_state;
+	void (*lds_break_fn)(void *lds_owner);
+	struct work_struct lds_work;
+};
+
 bool is_map_direct_valid(struct map_direct_state *mds)
 {
 	return test_bit(MAPDIRECT_VALID, &mds->mds_state);
 }
 EXPORT_SYMBOL_GPL(is_map_direct_valid);
 
+bool is_map_direct_broken(struct map_direct_state *mds)
+{
+	return test_bit(MAPDIRECT_BREAK, &mds->mds_state);
+}
+EXPORT_SYMBOL_GPL(is_map_direct_broken);
+
 static void put_map_direct(struct map_direct_state *mds)
 {
 	if (!atomic_dec_and_test(&mds->mds_ref))
@@ -162,6 +177,108 @@ static const struct lock_manager_operations map_direct_lm_ops = {
 	.lm_setup = map_direct_lm_setup,
 };
 
+static void lease_direct_invalidate(struct work_struct *work)
+{
+	struct lease_direct_state *lds;
+	void *owner;
+
+	lds = container_of(work, typeof(*lds), lds_work);
+	owner = lds;
+	lds->lds_break_fn(lds->lds_owner);
+	vfs_setlease(lds->lds_file, F_UNLCK, NULL, &owner);
+}
+
+static bool lease_direct_lm_break(struct file_lock *fl)
+{
+	struct lease_direct_state *lds = fl->fl_owner;
+
+	if (!test_and_set_bit(MAPDIRECT_BREAK, &lds->lds_state))
+		schedule_work(&lds->lds_work);
+	return false;
+}
+
+static int lease_direct_lm_change(struct file_lock *fl, int arg,
+		struct list_head *dispose)
+{
+	WARN_ON(!(arg & F_UNLCK));
+	return lease_modify(fl, arg, dispose);
+}
+
+static const struct lock_manager_operations lease_direct_lm_ops = {
+	.lm_break = lease_direct_lm_break,
+	.lm_change = lease_direct_lm_change,
+};
+
+struct lease_direct *map_direct_lease(struct vm_area_struct *vma,
+		void (*lds_break_fn)(void *), void *lds_owner)
+{
+	struct file *file = vma->vm_file;
+	struct lease_direct_state *lds;
+	struct lease_direct *ld;
+	struct file_lock *fl;
+	int rc = -ENOMEM;
+	void *owner;
+
+	ld = kzalloc(sizeof(*ld) + sizeof(*lds), GFP_KERNEL);
+	if (!ld)
+		return ERR_PTR(-ENOMEM);
+	INIT_LIST_HEAD(&ld->list);
+	lds = (struct lease_direct_state *)(ld + 1);
+	owner = lds;
+	ld->lds = lds;
+	lds->lds_break_fn = lds_break_fn;
+	lds->lds_owner = lds_owner;
+	INIT_WORK(&lds->lds_work, lease_direct_invalidate);
+	lds->lds_file = get_file(file);
+
+	fl = locks_alloc_lock();
+	if (!fl)
+		goto err_lock_alloc;
+
+	locks_init_lock(fl);
+	fl->fl_lmops = &lease_direct_lm_ops;
+	fl->fl_flags = FL_LAYOUT;
+	fl->fl_type = F_RDLCK;
+	fl->fl_end = OFFSET_MAX;
+	fl->fl_owner = lds;
+	fl->fl_pid = current->tgid;
+	fl->fl_file = file;
+
+	rc = vfs_setlease(file, fl->fl_type, &fl, &owner);
+	if (rc)
+		goto err_setlease;
+	if (fl) {
+		WARN_ON(1);
+		owner = lds;
+		vfs_setlease(file, F_UNLCK, NULL, &owner);
+		owner = NULL;
+		rc = -ENXIO;
+		goto err_setlease;
+	}
+
+	return ld;
+err_setlease:
+	locks_free_lock(fl);
+err_lock_alloc:
+	kfree(lds);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_GPL(map_direct_lease);
+
+void map_direct_lease_destroy(struct lease_direct *ld)
+{
+	struct lease_direct_state *lds = ld->lds;
+	struct file *file = lds->lds_file;
+	void *owner = lds;
+
+	vfs_setlease(file, F_UNLCK, NULL, &owner);
+	flush_work(&lds->lds_work);
+	fput(file);
+	WARN_ON(!list_empty(&ld->list));
+	kfree(ld);
+}
+EXPORT_SYMBOL_GPL(map_direct_lease_destroy);
+
 struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma)
 {
 	struct map_direct_state *mds = kzalloc(sizeof(*mds), GFP_KERNEL);
diff --git a/include/linux/mapdirect.h b/include/linux/mapdirect.h
index 724e27d8615e..dc4d4ba677d0 100644
--- a/include/linux/mapdirect.h
+++ b/include/linux/mapdirect.h
@@ -13,17 +13,28 @@
 #ifndef __MAPDIRECT_H__
 #define __MAPDIRECT_H__
 #include <linux/err.h>
+#include <linux/list.h>
 
 struct inode;
 struct work_struct;
 struct vm_area_struct;
 struct map_direct_state;
+struct list_direct_state;
+
+struct lease_direct {
+	struct list_head list;
+	struct lease_direct_state *lds;
+};
 
 #if IS_ENABLED(CONFIG_FS_DAX)
 struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma);
 int put_map_direct_vma(struct map_direct_state *mds);
 void get_map_direct_vma(struct map_direct_state *mds);
 bool is_map_direct_valid(struct map_direct_state *mds);
+bool is_map_direct_broken(struct map_direct_state *mds);
+struct lease_direct *map_direct_lease(struct vm_area_struct *vma,
+		void (*ld_break_fn)(void *), void *ld_owner);
+void map_direct_lease_destroy(struct lease_direct *ld);
 #else
 static inline struct map_direct_state *map_direct_register(int fd,
 		struct vm_area_struct *vma)
@@ -41,5 +52,17 @@ bool is_map_direct_valid(struct map_direct_state *mds)
 {
 	return false;
 }
+bool is_map_direct_broken(struct map_direct_state *mds)
+{
+	return false;
+}
+struct lease_direct *map_direct_lease(struct vm_area_struct *vma,
+		void (*ld_break_fn)(void *), void *ld_owner)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+void map_direct_lease_destroy(struct lease_direct *ld)
+{
+}
 #endif
 #endif /* __MAPDIRECT_H__ */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0afa19feb755..d03953f91ce8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -420,6 +420,12 @@ struct vm_operations_struct {
 	 */
 	struct page *(*find_special_page)(struct vm_area_struct *vma,
 					  unsigned long addr);
+	/*
+	 * Called by rdma memory registration to subscribe for "break"
+	 * events that require any ongoing rdma accesses to quiesce.
+	 */
+	struct lease_direct *(*lease_direct)(struct vm_area_struct *vma,
+			void (*break_fn)(void *), void *owner);
 };
 
 struct mmu_gather;

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v7 08/12] fs, mapdirect: introduce ->lease_direct()
@ 2017-10-06 22:36   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:36 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: linux-xfs, Jan Kara, Darrick J. Wong, linux-rdma, linux-api,
	Dave Chinner, Christoph Hellwig, J. Bruce Fields, linux-mm,
	Jeff Moyer, linux-fsdevel, Jeff Layton, Ross Zwisler

Provide a vma operation that registers a lease that is broken by
break_layout(). This is motivated by a need to stop in-progress RDMA
when the block-map of a DAX-file changes. I.e. since DAX gives
direct-access to filesystem blocks we can not allow those blocks to move
or change state while they are under active RDMA. So, if the filesystem
determines it needs to move blocks it can revoke device access before
proceeding.

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/mapdirect.c            |  117 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/mapdirect.h |   23 +++++++++
 include/linux/mm.h        |    6 ++
 3 files changed, 146 insertions(+)

diff --git a/fs/mapdirect.c b/fs/mapdirect.c
index 9ac7c1d946a2..338cbe055fc7 100644
--- a/fs/mapdirect.c
+++ b/fs/mapdirect.c
@@ -16,6 +16,7 @@
 #include <linux/mutex.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/file.h>
 #include <linux/fs.h>
 #include <linux/mm.h>
 
@@ -32,12 +33,26 @@ struct map_direct_state {
 	struct vm_area_struct *mds_vma;
 };
 
+struct lease_direct_state {
+	void *lds_owner;
+	struct file *lds_file;
+	unsigned long lds_state;
+	void (*lds_break_fn)(void *lds_owner);
+	struct work_struct lds_work;
+};
+
 bool is_map_direct_valid(struct map_direct_state *mds)
 {
 	return test_bit(MAPDIRECT_VALID, &mds->mds_state);
 }
 EXPORT_SYMBOL_GPL(is_map_direct_valid);
 
+bool is_map_direct_broken(struct map_direct_state *mds)
+{
+	return test_bit(MAPDIRECT_BREAK, &mds->mds_state);
+}
+EXPORT_SYMBOL_GPL(is_map_direct_broken);
+
 static void put_map_direct(struct map_direct_state *mds)
 {
 	if (!atomic_dec_and_test(&mds->mds_ref))
@@ -162,6 +177,108 @@ static const struct lock_manager_operations map_direct_lm_ops = {
 	.lm_setup = map_direct_lm_setup,
 };
 
+static void lease_direct_invalidate(struct work_struct *work)
+{
+	struct lease_direct_state *lds;
+	void *owner;
+
+	lds = container_of(work, typeof(*lds), lds_work);
+	owner = lds;
+	lds->lds_break_fn(lds->lds_owner);
+	vfs_setlease(lds->lds_file, F_UNLCK, NULL, &owner);
+}
+
+static bool lease_direct_lm_break(struct file_lock *fl)
+{
+	struct lease_direct_state *lds = fl->fl_owner;
+
+	if (!test_and_set_bit(MAPDIRECT_BREAK, &lds->lds_state))
+		schedule_work(&lds->lds_work);
+	return false;
+}
+
+static int lease_direct_lm_change(struct file_lock *fl, int arg,
+		struct list_head *dispose)
+{
+	WARN_ON(!(arg & F_UNLCK));
+	return lease_modify(fl, arg, dispose);
+}
+
+static const struct lock_manager_operations lease_direct_lm_ops = {
+	.lm_break = lease_direct_lm_break,
+	.lm_change = lease_direct_lm_change,
+};
+
+struct lease_direct *map_direct_lease(struct vm_area_struct *vma,
+		void (*lds_break_fn)(void *), void *lds_owner)
+{
+	struct file *file = vma->vm_file;
+	struct lease_direct_state *lds;
+	struct lease_direct *ld;
+	struct file_lock *fl;
+	int rc = -ENOMEM;
+	void *owner;
+
+	ld = kzalloc(sizeof(*ld) + sizeof(*lds), GFP_KERNEL);
+	if (!ld)
+		return ERR_PTR(-ENOMEM);
+	INIT_LIST_HEAD(&ld->list);
+	lds = (struct lease_direct_state *)(ld + 1);
+	owner = lds;
+	ld->lds = lds;
+	lds->lds_break_fn = lds_break_fn;
+	lds->lds_owner = lds_owner;
+	INIT_WORK(&lds->lds_work, lease_direct_invalidate);
+	lds->lds_file = get_file(file);
+
+	fl = locks_alloc_lock();
+	if (!fl)
+		goto err_lock_alloc;
+
+	locks_init_lock(fl);
+	fl->fl_lmops = &lease_direct_lm_ops;
+	fl->fl_flags = FL_LAYOUT;
+	fl->fl_type = F_RDLCK;
+	fl->fl_end = OFFSET_MAX;
+	fl->fl_owner = lds;
+	fl->fl_pid = current->tgid;
+	fl->fl_file = file;
+
+	rc = vfs_setlease(file, fl->fl_type, &fl, &owner);
+	if (rc)
+		goto err_setlease;
+	if (fl) {
+		WARN_ON(1);
+		owner = lds;
+		vfs_setlease(file, F_UNLCK, NULL, &owner);
+		owner = NULL;
+		rc = -ENXIO;
+		goto err_setlease;
+	}
+
+	return ld;
+err_setlease:
+	locks_free_lock(fl);
+err_lock_alloc:
+	kfree(ld);
+	return ERR_PTR(rc);
+}
+EXPORT_SYMBOL_GPL(map_direct_lease);
+
+void map_direct_lease_destroy(struct lease_direct *ld)
+{
+	struct lease_direct_state *lds = ld->lds;
+	struct file *file = lds->lds_file;
+	void *owner = lds;
+
+	vfs_setlease(file, F_UNLCK, NULL, &owner);
+	flush_work(&lds->lds_work);
+	fput(file);
+	WARN_ON(!list_empty(&ld->list));
+	kfree(ld);
+}
+EXPORT_SYMBOL_GPL(map_direct_lease_destroy);
+
 struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma)
 {
 	struct map_direct_state *mds = kzalloc(sizeof(*mds), GFP_KERNEL);
diff --git a/include/linux/mapdirect.h b/include/linux/mapdirect.h
index 724e27d8615e..dc4d4ba677d0 100644
--- a/include/linux/mapdirect.h
+++ b/include/linux/mapdirect.h
@@ -13,17 +13,28 @@
 #ifndef __MAPDIRECT_H__
 #define __MAPDIRECT_H__
 #include <linux/err.h>
+#include <linux/list.h>
 
 struct inode;
 struct work_struct;
 struct vm_area_struct;
 struct map_direct_state;
+struct lease_direct_state;
+
+struct lease_direct {
+	struct list_head list;
+	struct lease_direct_state *lds;
+};
 
 #if IS_ENABLED(CONFIG_FS_DAX)
 struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma);
 int put_map_direct_vma(struct map_direct_state *mds);
 void get_map_direct_vma(struct map_direct_state *mds);
 bool is_map_direct_valid(struct map_direct_state *mds);
+bool is_map_direct_broken(struct map_direct_state *mds);
+struct lease_direct *map_direct_lease(struct vm_area_struct *vma,
+		void (*ld_break_fn)(void *), void *ld_owner);
+void map_direct_lease_destroy(struct lease_direct *ld);
 #else
 static inline struct map_direct_state *map_direct_register(int fd,
 		struct vm_area_struct *vma)
@@ -41,5 +52,17 @@ bool is_map_direct_valid(struct map_direct_state *mds)
 {
 	return false;
 }
+static inline bool is_map_direct_broken(struct map_direct_state *mds)
+{
+	return false;
+}
+static inline struct lease_direct *map_direct_lease(struct vm_area_struct *vma,
+		void (*ld_break_fn)(void *), void *ld_owner)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+static inline void map_direct_lease_destroy(struct lease_direct *ld)
+{
+}
 #endif
 #endif /* __MAPDIRECT_H__ */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0afa19feb755..d03953f91ce8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -420,6 +420,12 @@ struct vm_operations_struct {
 	 */
 	struct page *(*find_special_page)(struct vm_area_struct *vma,
 					  unsigned long addr);
+	/*
+	 * Called by RDMA memory registration to subscribe to "break"
+	 * events that require any ongoing RDMA accesses to quiesce.
+	 */
+	struct lease_direct *(*lease_direct)(struct vm_area_struct *vma,
+			void (*break_fn)(void *), void *owner);
 };
 
 struct mmu_gather;


* [PATCH v7 09/12] xfs: wire up ->lease_direct()
@ 2017-10-06 22:36   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:36 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: J. Bruce Fields, Jan Kara, Darrick J. Wong, linux-rdma,
	linux-api, Dave Chinner, linux-xfs, linux-mm, linux-fsdevel,
	Jeff Layton, Christoph Hellwig

A 'lease_direct' lease requires that the vma have a valid MAP_DIRECT
mapping established. For xfs we establish a new lease and then check
whether the MAP_DIRECT mapping has been broken. We want to be sure that
the process will receive notification that the MAP_DIRECT mapping is
being torn down so it knows why other code paths are returning failures.

For example, in the RDMA/ibverbs case we want ibv_reg_mr() to fail if
the MAP_DIRECT mapping is invalid or in the process of being
invalidated.
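
For reference, an editor's sketch of the userspace flow being guarded
(not part of the patch; MAP_DIRECT and MAP_SHARED_VALIDATE are the new
flags from this series, and fd, len, and pd are assumed to be set up
elsewhere):

	void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_SHARED_VALIDATE | MAP_DIRECT, fd, 0);
	if (addr == MAP_FAILED)
		err(1, "mmap");

	struct ibv_mr *mr = ibv_reg_mr(pd, addr, len, IBV_ACCESS_LOCAL_WRITE);
	if (!mr)
		err(1, "ibv_reg_mr"); /* fails once the lease is breaking */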

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/xfs/xfs_file.c |   28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index e35518600e28..823b65f17429 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1166,6 +1166,33 @@ xfs_filemap_direct_close(
 	put_map_direct_vma(vma->vm_private_data);
 }
 
+static struct lease_direct *
+xfs_filemap_direct_lease(
+	struct vm_area_struct	*vma,
+	void			(*break_fn)(void *),
+	void			*owner)
+{
+	struct lease_direct	*ld;
+
+	ld = map_direct_lease(vma, break_fn, owner);
+
+	if (IS_ERR(ld))
+		return ld;
+
+	/*
+	 * If we established this lease while the base MAP_DIRECT lease
+	 * was not yet broken, then we know the "lease holder" will
+	 * receive a SIGIO notification when the lease is broken and can
+	 * take any necessary cleanup actions.
+	 */
+	if (!is_map_direct_broken(vma->vm_private_data))
+		return ld;
+
+	map_direct_lease_destroy(ld);
+
+	return ERR_PTR(-ENXIO);
+}
+
 static const struct vm_operations_struct xfs_file_vm_direct_ops = {
 	.fault		= xfs_filemap_fault,
 	.huge_fault	= xfs_filemap_huge_fault,
@@ -1175,6 +1202,7 @@ static const struct vm_operations_struct xfs_file_vm_direct_ops = {
 
 	.open		= xfs_filemap_direct_open,
 	.close		= xfs_filemap_direct_close,
+	.lease_direct	= xfs_filemap_direct_lease,
 };
 
 static const struct vm_operations_struct xfs_file_vm_ops = {


* [PATCH v7 10/12] device-dax: wire up ->lease_direct()
@ 2017-10-06 22:36   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:36 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jan Kara, linux-rdma, linux-api, linux-xfs, linux-mm,
	linux-fsdevel, Christoph Hellwig

The only event that will break a lease_direct lease in the device-dax
case is the device-shutdown path, where the physical pages may be
reassigned to another device.
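
As a reading aid, an editor's annotation of the teardown ordering in
the change below (the comments are mine, not in the patch): the layout
break must complete before the mappings are zapped so that lease
holders can quiesce DMA first.

	kill_dax(dax_dev);		/* fail new device-dax operations */
	break_layout(inode, true);	/* notify lease holders and wait */
	unmap_mapping_range(inode->i_mapping, 0, 0, 1); /* then zap ptes */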

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dax/device.c      |    4 ++++
 fs/Kconfig                |    4 ++++
 fs/Makefile               |    3 ++-
 include/linux/mapdirect.h |    2 +-
 4 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index e9f3b3e4bbf4..fa75004185c4 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -10,6 +10,7 @@
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  * General Public License for more details.
  */
+#include <linux/mapdirect.h>
 #include <linux/pagemap.h>
 #include <linux/module.h>
 #include <linux/device.h>
@@ -430,6 +431,7 @@ static int dev_dax_fault(struct vm_fault *vmf)
 static const struct vm_operations_struct dax_vm_ops = {
 	.fault = dev_dax_fault,
 	.huge_fault = dev_dax_huge_fault,
+	.lease_direct = map_direct_lease,
 };
 
 static int dax_mmap(struct file *filp, struct vm_area_struct *vma)
@@ -540,8 +542,10 @@ static void kill_dev_dax(struct dev_dax *dev_dax)
 {
 	struct dax_device *dax_dev = dev_dax->dax_dev;
 	struct inode *inode = dax_inode(dax_dev);
+	const bool wait = true;
 
 	kill_dax(dax_dev);
+	break_layout(inode, wait);
 	unmap_mapping_range(inode->i_mapping, 0, 0, 1);
 }
 
diff --git a/fs/Kconfig b/fs/Kconfig
index 7aee6d699fd6..2e3784ae1bc4 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -58,6 +58,10 @@ config FS_DAX_PMD
 	depends on ZONE_DEVICE
 	depends on TRANSPARENT_HUGEPAGE
 
+config DAX_MAP_DIRECT
+	bool
+	default FS_DAX || DEV_DAX
+
 endif # BLOCK
 
 # Posix ACL utility routines
diff --git a/fs/Makefile b/fs/Makefile
index c0e791d235d8..21b8fb104656 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -29,7 +29,8 @@ obj-$(CONFIG_TIMERFD)		+= timerfd.o
 obj-$(CONFIG_EVENTFD)		+= eventfd.o
 obj-$(CONFIG_USERFAULTFD)	+= userfaultfd.o
 obj-$(CONFIG_AIO)               += aio.o
-obj-$(CONFIG_FS_DAX)		+= dax.o mapdirect.o
+obj-$(CONFIG_FS_DAX)		+= dax.o
+obj-$(CONFIG_DAX_MAP_DIRECT)	+= mapdirect.o
 obj-$(CONFIG_FS_ENCRYPTION)	+= crypto/
 obj-$(CONFIG_FILE_LOCKING)      += locks.o
 obj-$(CONFIG_COMPAT)		+= compat.o compat_ioctl.o
diff --git a/include/linux/mapdirect.h b/include/linux/mapdirect.h
index dc4d4ba677d0..bafa78a6085f 100644
--- a/include/linux/mapdirect.h
+++ b/include/linux/mapdirect.h
@@ -26,7 +26,7 @@ struct lease_direct {
 	struct lease_direct_state *lds;
 };
 
-#if IS_ENABLED(CONFIG_FS_DAX)
+#if IS_ENABLED(CONFIG_DAX_MAP_DIRECT)
 struct map_direct_state *map_direct_register(int fd, struct vm_area_struct *vma);
 int put_map_direct_vma(struct map_direct_state *mds);
 void get_map_direct_vma(struct map_direct_state *mds);


* [PATCH v7 11/12] IB/core: use MAP_DIRECT to fix / enable RDMA to DAX mappings
@ 2017-10-06 22:36   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:36 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: J. Bruce Fields, Jan Kara, Darrick J. Wong, linux-rdma,
	linux-api, Dave Chinner, Doug Ledford, Hal Rosenstock, linux-xfs,
	linux-mm, Jeff Layton, linux-fsdevel, Sean Hefty,
	Christoph Hellwig

Currently the ibverbs core in the kernel is completely unaware of the
dangers of filesystem-DAX mappings. Specifically, the filesystem is free
to move file blocks at will. In the case of DAX, this means that RDMA to
a given file offset can dynamically switch to another file offset,
another file, or free space with no notification to the RDMA device to
cease operations. Historically, this lack of communication between the
ibverbs core and the filesystem was not a problem because RDMA always
targeted dynamically allocated page cache, so at least the RDMA device
would have valid memory to target even if the file was being modified.
With DAX we need to add coordination since RDMA bypasses the page cache
and goes directly to the on-media pages of the file. RDMA to DAX can
cause damage if filesystem blocks move or change state.

Use the new ->lease_direct() operation to get a notification when the
filesystem is invalidating the block map of the file and needs RDMA
operations to stop. Given that the kernel cannot be put in a position
where it must wait indefinitely for userspace to stop a device, we need
a mechanism where the kernel can force-revoke access. Towards that end,
use the new dma_has_iommu() helper to determine if ib_dma_unmap_sg() is
sufficient for revoking access. Once we have that assurance and a
->lease_direct() lease we can safely allow RDMA to DAX.
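
One detail worth calling out: the lease-break callback can race
__ib_umem_release(), and both paths want to perform the single
ib_dma_unmap_sg(). The atomic IB_UMEM_MAPPED bit ensures exactly one of
them wins, as in this editor's sketch of the shared logic (the helper
name is hypothetical):

	static void umem_unmap_once(struct ib_device *dev, struct ib_umem *umem)
	{
		/* test_and_clear_bit() is atomic: only the first caller unmaps */
		if (umem->nmap > 0 &&
				test_and_clear_bit(IB_UMEM_MAPPED, &umem->state))
			ib_dma_unmap_sg(dev, umem->sg_head.sgl, umem->npages,
					DMA_BIDIRECTIONAL);
	}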

Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/infiniband/core/umem.c |   90 ++++++++++++++++++++++++++++++++++------
 include/rdma/ib_umem.h         |    8 ++++
 2 files changed, 85 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 21e60b1e2ff4..dc3ae1bee669 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -36,6 +36,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/mm.h>
+#include <linux/mapdirect.h>
 #include <linux/export.h>
 #include <linux/hugetlb.h>
 #include <linux/slab.h>
@@ -46,11 +47,12 @@
 
 static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty)
 {
+	struct lease_direct *ld, *_ld;
 	struct scatterlist *sg;
 	struct page *page;
 	int i;
 
-	if (umem->nmap > 0)
+	if (umem->nmap > 0 && test_and_clear_bit(IB_UMEM_MAPPED, &umem->state))
 		ib_dma_unmap_sg(dev, umem->sg_head.sgl,
 				umem->npages,
 				DMA_BIDIRECTIONAL);
@@ -64,8 +66,22 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
 	}
 
 	sg_free_table(&umem->sg_head);
-	return;
 
+	list_for_each_entry_safe(ld, _ld, &umem->leases, list) {
+		list_del_init(&ld->list);
+		map_direct_lease_destroy(ld);
+	}
+}
+
+static void ib_umem_lease_break(void *__umem)
+{
+	struct ib_umem *umem = __umem;
+	struct ib_device *dev = umem->context->device;
+
+	if (umem->nmap > 0 && test_and_clear_bit(IB_UMEM_MAPPED, &umem->state))
+		ib_dma_unmap_sg(dev, umem->sg_head.sgl,
+				umem->npages,
+				DMA_BIDIRECTIONAL);
 }
 
 /**
@@ -96,7 +112,10 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	struct scatterlist *sg, *sg_list_start;
 	int need_release = 0;
 	unsigned int gup_flags = FOLL_WRITE;
+	struct vm_area_struct *vma_prev = NULL;
+	struct device *dma_dev;
 
+	dma_dev = context->device->dma_device;
 	if (dmasync)
 		dma_attrs |= DMA_ATTR_WRITE_BARRIER;
 
@@ -120,6 +139,8 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	umem->address    = addr;
 	umem->page_shift = PAGE_SHIFT;
 	umem->pid	 = get_task_pid(current, PIDTYPE_PID);
+	INIT_LIST_HEAD(&umem->leases);
+	set_bit(IB_UMEM_MAPPED, &umem->state);
 	/*
 	 * We ask for writable memory if any of the following
 	 * access flags are set.  "Local write" and "remote write"
@@ -147,19 +168,21 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	umem->hugetlb   = 1;
 
 	page_list = (struct page **) __get_free_page(GFP_KERNEL);
-	if (!page_list) {
-		put_pid(umem->pid);
-		kfree(umem);
-		return ERR_PTR(-ENOMEM);
-	}
+	if (!page_list)
+		goto err_pagelist;
 
 	/*
-	 * if we can't alloc the vma_list, it's not so bad;
-	 * just assume the memory is not hugetlb memory
+	 * If DAX is enabled we need the vma to set up a ->lease_direct()
+	 * lease to protect against file modifications, otherwise we can
+	 * tolerate a failure to allocate the vma_list and just assume
+	 * that all vmas are not hugetlb-vmas.
 	 */
 	vma_list = (struct vm_area_struct **) __get_free_page(GFP_KERNEL);
-	if (!vma_list)
+	if (!vma_list) {
+		if (IS_ENABLED(CONFIG_DAX_MAP_DIRECT))
+			goto err_vmalist;
 		umem->hugetlb = 0;
+	}
 
 	npages = ib_umem_num_pages(umem);
 
@@ -199,15 +222,50 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 		if (ret < 0)
 			goto out;
 
-		umem->npages += ret;
 		cur_base += ret * PAGE_SIZE;
 		npages   -= ret;
 
 		for_each_sg(sg_list_start, sg, ret, i) {
-			if (vma_list && !is_vm_hugetlb_page(vma_list[i]))
-				umem->hugetlb = 0;
+			const struct vm_operations_struct *vm_ops;
+			struct vm_area_struct *vma;
+			struct lease_direct *ld;
 
 			sg_set_page(sg, page_list[i], PAGE_SIZE, 0);
+			umem->npages++;
+
+			if (!vma_list)
+				continue;
+			vma = vma_list[i];
+
+			if (vma == vma_prev)
+				continue;
+			vma_prev = vma;
+
+			if (!is_vm_hugetlb_page(vma))
+				umem->hugetlb = 0;
+
+			if (!vma_is_dax(vma))
+				continue;
+
+			vm_ops = vma->vm_ops;
+			if (!vm_ops->lease_direct) {
+				dev_info(dma_dev, "DAX-RDMA needs lease_direct\n");
+				ret = -EOPNOTSUPP;
+				goto out;
+			}
+
+			if (!dma_has_iommu(dma_dev)) {
+				dev_info(dma_dev, "DAX-RDMA needs iommu\n");
+				ret = -EOPNOTSUPP;
+				goto out;
+			}
+			ld = vm_ops->lease_direct(vma, ib_umem_lease_break,
+					umem);
+			if (IS_ERR(ld)) {
+				ret = PTR_ERR(ld);
+				goto out;
+			}
+			list_add(&ld->list, &umem->leases);
 		}
 
 		/* preparing for next loop */
@@ -242,6 +300,12 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	free_page((unsigned long) page_list);
 
 	return ret < 0 ? ERR_PTR(ret) : umem;
+err_vmalist:
+	free_page((unsigned long) page_list);
+err_pagelist:
+	put_pid(umem->pid);
+	kfree(umem);
+	return ERR_PTR(-ENOMEM);
 }
 EXPORT_SYMBOL(ib_umem_get);
 
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 23159dd5be18..796ffe5b8dc3 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -40,6 +40,7 @@
 struct ib_ucontext;
 struct ib_umem_odp;
 
+#define IB_UMEM_MAPPED 0
 struct ib_umem {
 	struct ib_ucontext     *context;
 	size_t			length;
@@ -55,6 +56,13 @@ struct ib_umem {
 	struct sg_table sg_head;
 	int             nmap;
 	int             npages;
+	/*
+	 * Note: no lock protects this list since we assume memory
+	 * registration never races unregistration for a given ib_umem
+	 * instance.
+	 */
+	struct list_head	leases;
+	unsigned long		state;
 };
 
 /* Returns the offset of the umem start relative to the first page. */

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v7 11/12] IB/core: use MAP_DIRECT to fix / enable RDMA to DAX mappings
@ 2017-10-06 22:36   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:36 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Sean Hefty, linux-xfs, Jan Kara, Darrick J. Wong, linux-rdma,
	linux-api, Dave Chinner, Jeff Moyer, Christoph Hellwig,
	J. Bruce Fields, linux-mm, Doug Ledford, Ross Zwisler,
	linux-fsdevel, Jeff Layton, Hal Rosenstock

Currently the ibverbs core in the kernel is completely unaware of the
dangers of filesystem-DAX mappings. Specifically, the filesystem is free
to move file blocks at will. In the case of DAX, it means that RDMA to a
given file offset can dynamically switch to another file offset, another
file, or free space with no notification to RDMA device to cease
operations. Historically, this lack of communication between the ibverbs
core and filesystem was not a problem because RDMA always targeted
dynamically allocated page cache, so at least the RDMA device would have
valid memory to target even if the file was being modified. With DAX we
need to add coordination since RDMA is bypassing page-cache and going
direct to on-media pages of the file. RDMA to DAX can cause damage if
filesystem blocks move / change state.

Use the new ->lease_direct() operation to get a notification when the
filesystem is invalidating the block map of the file and needs RDMA
operations to stop. Given that the kernel can not be in a position where
it needs to wait indefinitely for userspace to stop a device we need a
mechanism where the kernel can force-revoke access. Towards that end, use
the new dma_has_iommu() helper to determine if ib_dma_unmap_sg() is
sufficient for revoking access. Once we have that assurance and a
->lease_direct() lease we can safely allow RDMA to DAX.

Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/infiniband/core/umem.c |   90 ++++++++++++++++++++++++++++++++++------
 include/rdma/ib_umem.h         |    8 ++++
 2 files changed, 85 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 21e60b1e2ff4..dc3ae1bee669 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -36,6 +36,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/mm.h>
+#include <linux/mapdirect.h>
 #include <linux/export.h>
 #include <linux/hugetlb.h>
 #include <linux/slab.h>
@@ -46,11 +47,12 @@
 
 static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty)
 {
+	struct lease_direct *ld, *_ld;
 	struct scatterlist *sg;
 	struct page *page;
 	int i;
 
-	if (umem->nmap > 0)
+	if (umem->nmap > 0 && test_and_clear_bit(IB_UMEM_MAPPED, &umem->state))
 		ib_dma_unmap_sg(dev, umem->sg_head.sgl,
 				umem->npages,
 				DMA_BIDIRECTIONAL);
@@ -64,8 +66,22 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
 	}
 
 	sg_free_table(&umem->sg_head);
-	return;
 
+	list_for_each_entry_safe(ld, _ld, &umem->leases, list) {
+		list_del_init(&ld->list);
+		map_direct_lease_destroy(ld);
+	}
+}
+
+static void ib_umem_lease_break(void *__umem)
+{
+	struct ib_umem *umem = umem;
+	struct ib_device *dev = umem->context->device;
+
+	if (umem->nmap > 0 && test_and_clear_bit(IB_UMEM_MAPPED, &umem->state))
+		ib_dma_unmap_sg(dev, umem->sg_head.sgl,
+				umem->npages,
+				DMA_BIDIRECTIONAL);
 }
 
 /**
@@ -96,7 +112,10 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	struct scatterlist *sg, *sg_list_start;
 	int need_release = 0;
 	unsigned int gup_flags = FOLL_WRITE;
+	struct vm_area_struct *vma_prev = NULL;
+	struct device *dma_dev;
 
+	dma_dev = context->device->dma_device;
 	if (dmasync)
 		dma_attrs |= DMA_ATTR_WRITE_BARRIER;
 
@@ -120,6 +139,8 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	umem->address    = addr;
 	umem->page_shift = PAGE_SHIFT;
 	umem->pid	 = get_task_pid(current, PIDTYPE_PID);
+	INIT_LIST_HEAD(&umem->leases);
+	set_bit(IB_UMEM_MAPPED, &umem->state);
 	/*
 	 * We ask for writable memory if any of the following
 	 * access flags are set.  "Local write" and "remote write"
@@ -147,19 +168,21 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	umem->hugetlb   = 1;
 
 	page_list = (struct page **) __get_free_page(GFP_KERNEL);
-	if (!page_list) {
-		put_pid(umem->pid);
-		kfree(umem);
-		return ERR_PTR(-ENOMEM);
-	}
+	if (!page_list)
+		goto err_pagelist;
 
 	/*
-	 * if we can't alloc the vma_list, it's not so bad;
-	 * just assume the memory is not hugetlb memory
+	 * If DAX is enabled we need the vma to setup a ->lease_direct()
+	 * lease to protect against file modifications, otherwise we can
+	 * tolerate a failure to allocate the vma_list and just assume
+	 * that all vmas are not hugetlb-vmas.
 	 */
 	vma_list = (struct vm_area_struct **) __get_free_page(GFP_KERNEL);
-	if (!vma_list)
+	if (!vma_list) {
+		if (IS_ENABLED(CONFIG_DAX_MAP_DIRECT))
+			goto err_vmalist;
 		umem->hugetlb = 0;
+	}
 
 	npages = ib_umem_num_pages(umem);
 
@@ -199,15 +222,50 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 		if (ret < 0)
 			goto out;
 
-		umem->npages += ret;
 		cur_base += ret * PAGE_SIZE;
 		npages   -= ret;
 
 		for_each_sg(sg_list_start, sg, ret, i) {
-			if (vma_list && !is_vm_hugetlb_page(vma_list[i]))
-				umem->hugetlb = 0;
+			const struct vm_operations_struct *vm_ops;
+			struct vm_area_struct *vma;
+			struct lease_direct *ld;
 
 			sg_set_page(sg, page_list[i], PAGE_SIZE, 0);
+			umem->npages++;
+
+			if (!vma_list)
+				continue;
+			vma = vma_list[i];
+
+			if (vma == vma_prev)
+				continue;
+			vma_prev = vma;
+
+			if (!is_vm_hugetlb_page(vma))
+				umem->hugetlb = 0;
+
+			if (!vma_is_dax(vma))
+				continue;
+
+			vm_ops = vma->vm_ops;
+			if (!vm_ops->lease_direct) {
+				dev_info(dma_dev, "DAX-RDMA needs lease_direct\n");
+				ret = -EOPNOTSUPP;
+				goto out;
+			}
+
+			if (!dma_has_iommu(dma_dev)) {
+				dev_info(dma_dev, "DAX-RDMA needs iommu\n");
+				ret = -EOPNOTSUPP;
+				goto out;
+			}
+			ld = vm_ops->lease_direct(vma, ib_umem_lease_break,
+					umem);
+			if (IS_ERR(ld)) {
+				ret = PTR_ERR(ld);
+				goto out;
+			}
+			list_add(&ld->list, &umem->leases);
 		}
 
 		/* preparing for next loop */
@@ -242,6 +300,12 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	free_page((unsigned long) page_list);
 
 	return ret < 0 ? ERR_PTR(ret) : umem;
+err_vmalist:
+	free_page((unsigned long) page_list);
+err_pagelist:
+	put_pid(umem->pid);
+	kfree(umem);
+	return ERR_PTR(-ENOMEM);
 }
 EXPORT_SYMBOL(ib_umem_get);
 
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 23159dd5be18..796ffe5b8dc3 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -40,6 +40,7 @@
 struct ib_ucontext;
 struct ib_umem_odp;
 
+#define IB_UMEM_MAPPED 0
 struct ib_umem {
 	struct ib_ucontext     *context;
 	size_t			length;
@@ -55,6 +56,13 @@ struct ib_umem {
 	struct sg_table sg_head;
 	int             nmap;
 	int             npages;
+	/*
+	 * Note: no lock protects this list since we assume memory
+	 * registration never races unregistration for a given ib_umem
+	 * instance.
+	 */
+	struct list_head	leases;
+	unsigned long		state;
 };
 
 /* Returns the offset of the umem start relative to the first page. */

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v7 11/12] IB/core: use MAP_DIRECT to fix / enable RDMA to DAX mappings
@ 2017-10-06 22:36   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:36 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Sean Hefty, linux-xfs, Jan Kara, Darrick J. Wong, linux-rdma,
	linux-api, Dave Chinner, Jeff Moyer, Christoph Hellwig,
	J. Bruce Fields, linux-mm, Doug Ledford, Ross Zwisler,
	linux-fsdevel, Jeff Layton, Hal Rosenstock

Currently the ibverbs core in the kernel is completely unaware of the
dangers of filesystem-DAX mappings. Specifically, the filesystem is free
to move file blocks at will. In the case of DAX, it means that RDMA to a
given file offset can dynamically switch to another file offset, another
file, or free space with no notification to RDMA device to cease
operations. Historically, this lack of communication between the ibverbs
core and filesystem was not a problem because RDMA always targeted
dynamically allocated page cache, so at least the RDMA device would have
valid memory to target even if the file was being modified. With DAX we
need to add coordination since RDMA is bypassing page-cache and going
direct to on-media pages of the file. RDMA to DAX can cause damage if
filesystem blocks move / change state.

Use the new ->lease_direct() operation to get a notification when the
filesystem is invalidating the block map of the file and needs RDMA
operations to stop. Given that the kernel can not be in a position where
it needs to wait indefinitely for userspace to stop a device we need a
mechanism where the kernel can force-revoke access. Towards that end, use
the new dma_has_iommu() helper to determine if ib_dma_unmap_sg() is
sufficient for revoking access. Once we have that assurance and a
->lease_direct() lease we can safely allow RDMA to DAX.

Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/infiniband/core/umem.c |   90 ++++++++++++++++++++++++++++++++++------
 include/rdma/ib_umem.h         |    8 ++++
 2 files changed, 85 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 21e60b1e2ff4..dc3ae1bee669 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -36,6 +36,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/mm.h>
+#include <linux/mapdirect.h>
 #include <linux/export.h>
 #include <linux/hugetlb.h>
 #include <linux/slab.h>
@@ -46,11 +47,12 @@
 
 static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty)
 {
+	struct lease_direct *ld, *_ld;
 	struct scatterlist *sg;
 	struct page *page;
 	int i;
 
-	if (umem->nmap > 0)
+	if (umem->nmap > 0 && test_and_clear_bit(IB_UMEM_MAPPED, &umem->state))
 		ib_dma_unmap_sg(dev, umem->sg_head.sgl,
 				umem->npages,
 				DMA_BIDIRECTIONAL);
@@ -64,8 +66,22 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
 	}
 
 	sg_free_table(&umem->sg_head);
-	return;
 
+	list_for_each_entry_safe(ld, _ld, &umem->leases, list) {
+		list_del_init(&ld->list);
+		map_direct_lease_destroy(ld);
+	}
+}
+
+static void ib_umem_lease_break(void *__umem)
+{
+	struct ib_umem *umem = umem;
+	struct ib_device *dev = umem->context->device;
+
+	if (umem->nmap > 0 && test_and_clear_bit(IB_UMEM_MAPPED, &umem->state))
+		ib_dma_unmap_sg(dev, umem->sg_head.sgl,
+				umem->npages,
+				DMA_BIDIRECTIONAL);
 }
 
 /**
@@ -96,7 +112,10 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	struct scatterlist *sg, *sg_list_start;
 	int need_release = 0;
 	unsigned int gup_flags = FOLL_WRITE;
+	struct vm_area_struct *vma_prev = NULL;
+	struct device *dma_dev;
 
+	dma_dev = context->device->dma_device;
 	if (dmasync)
 		dma_attrs |= DMA_ATTR_WRITE_BARRIER;
 
@@ -120,6 +139,8 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	umem->address    = addr;
 	umem->page_shift = PAGE_SHIFT;
 	umem->pid	 = get_task_pid(current, PIDTYPE_PID);
+	INIT_LIST_HEAD(&umem->leases);
+	set_bit(IB_UMEM_MAPPED, &umem->state);
 	/*
 	 * We ask for writable memory if any of the following
 	 * access flags are set.  "Local write" and "remote write"
@@ -147,19 +168,21 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	umem->hugetlb   = 1;
 
 	page_list = (struct page **) __get_free_page(GFP_KERNEL);
-	if (!page_list) {
-		put_pid(umem->pid);
-		kfree(umem);
-		return ERR_PTR(-ENOMEM);
-	}
+	if (!page_list)
+		goto err_pagelist;
 
 	/*
-	 * if we can't alloc the vma_list, it's not so bad;
-	 * just assume the memory is not hugetlb memory
+	 * If DAX is enabled we need the vma to setup a ->lease_direct()
+	 * lease to protect against file modifications, otherwise we can
+	 * tolerate a failure to allocate the vma_list and just assume
+	 * that all vmas are not hugetlb-vmas.
 	 */
 	vma_list = (struct vm_area_struct **) __get_free_page(GFP_KERNEL);
-	if (!vma_list)
+	if (!vma_list) {
+		if (IS_ENABLED(CONFIG_DAX_MAP_DIRECT))
+			goto err_vmalist;
 		umem->hugetlb = 0;
+	}
 
 	npages = ib_umem_num_pages(umem);
 
@@ -199,15 +222,50 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 		if (ret < 0)
 			goto out;
 
-		umem->npages += ret;
 		cur_base += ret * PAGE_SIZE;
 		npages   -= ret;
 
 		for_each_sg(sg_list_start, sg, ret, i) {
-			if (vma_list && !is_vm_hugetlb_page(vma_list[i]))
-				umem->hugetlb = 0;
+			const struct vm_operations_struct *vm_ops;
+			struct vm_area_struct *vma;
+			struct lease_direct *ld;
 
 			sg_set_page(sg, page_list[i], PAGE_SIZE, 0);
+			umem->npages++;
+
+			if (!vma_list)
+				continue;
+			vma = vma_list[i];
+
+			if (vma == vma_prev)
+				continue;
+			vma_prev = vma;
+
+			if (!is_vm_hugetlb_page(vma))
+				umem->hugetlb = 0;
+
+			if (!vma_is_dax(vma))
+				continue;
+
+			vm_ops = vma->vm_ops;
+			if (!vm_ops->lease_direct) {
+				dev_info(dma_dev, "DAX-RDMA needs lease_direct\n");
+				ret = -EOPNOTSUPP;
+				goto out;
+			}
+
+			if (!dma_has_iommu(dma_dev)) {
+				dev_info(dma_dev, "DAX-RDMA needs iommu\n");
+				ret = -EOPNOTSUPP;
+				goto out;
+			}
+			ld = vm_ops->lease_direct(vma, ib_umem_lease_break,
+					umem);
+			if (IS_ERR(ld)) {
+				ret = PTR_ERR(ld);
+				goto out;
+			}
+			list_add(&ld->list, &umem->leases);
 		}
 
 		/* preparing for next loop */
@@ -242,6 +300,12 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	free_page((unsigned long) page_list);
 
 	return ret < 0 ? ERR_PTR(ret) : umem;
+err_vmalist:
+	free_page((unsigned long) page_list);
+err_pagelist:
+	put_pid(umem->pid);
+	kfree(umem);
+	return ERR_PTR(-ENOMEM);
 }
 EXPORT_SYMBOL(ib_umem_get);
 
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 23159dd5be18..796ffe5b8dc3 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -40,6 +40,7 @@
 struct ib_ucontext;
 struct ib_umem_odp;
 
+#define IB_UMEM_MAPPED 0
 struct ib_umem {
 	struct ib_ucontext     *context;
 	size_t			length;
@@ -55,6 +56,13 @@ struct ib_umem {
 	struct sg_table sg_head;
 	int             nmap;
 	int             npages;
+	/*
+	 * Note: no lock protects this list since we assume memory
+	 * registration never races unregistration for a given ib_umem
+	 * instance.
+	 */
+	struct list_head	leases;
+	unsigned long		state;
 };
 
 /* Returns the offset of the umem start relative to the first page. */
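
For reference, the IB_UMEM_MAPPED bit introduced above is what lets the
lease-break callback and the final __ib_umem_release() race without
double-unmapping; both paths reduce to the same idiom:

	/* whichever path clears the bit first performs the unmap */
	if (umem->nmap > 0 && test_and_clear_bit(IB_UMEM_MAPPED, &umem->state))
		ib_dma_unmap_sg(dev, umem->sg_head.sgl, umem->npages,
				DMA_BIDIRECTIONAL);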


^ permalink raw reply related	[flat|nested] 158+ messages in thread

* [PATCH v7 12/12] tools/testing/nvdimm: enable rdma unit tests
  2017-10-06 22:35 ` Dan Williams
@ 2017-10-06 22:36   ` Dan Williams
  -1 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:36 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-xfs, linux-mm, linux-rdma, linux-fsdevel, linux-api

Provide a mock dma_has_iommu() for the ibverbs core so that
ib_umem_get() can satisfy its DAX safety checks in a controlled test.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 tools/testing/nvdimm/Kbuild         |   31 +++++++++++++++++++++++++++++++
 tools/testing/nvdimm/config_check.c |    2 ++
 tools/testing/nvdimm/test/iomap.c   |    6 ++++++
 3 files changed, 39 insertions(+)

diff --git a/tools/testing/nvdimm/Kbuild b/tools/testing/nvdimm/Kbuild
index d870520da68b..e4ee7f482ac0 100644
--- a/tools/testing/nvdimm/Kbuild
+++ b/tools/testing/nvdimm/Kbuild
@@ -15,11 +15,13 @@ ldflags-y += --wrap=insert_resource
 ldflags-y += --wrap=remove_resource
 ldflags-y += --wrap=acpi_evaluate_object
 ldflags-y += --wrap=acpi_evaluate_dsm
+ldflags-y += --wrap=dma_has_iommu
 
 DRIVERS := ../../../drivers
 NVDIMM_SRC := $(DRIVERS)/nvdimm
 ACPI_SRC := $(DRIVERS)/acpi/nfit
 DAX_SRC := $(DRIVERS)/dax
+IBCORE := $(DRIVERS)/infiniband/core
 ccflags-y := -I$(src)/$(NVDIMM_SRC)/
 
 obj-$(CONFIG_LIBNVDIMM) += libnvdimm.o
@@ -33,6 +35,7 @@ obj-$(CONFIG_DAX) += dax.o
 endif
 obj-$(CONFIG_DEV_DAX) += device_dax.o
 obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o
+obj-$(CONFIG_INFINIBAND) += ib_core.o
 
 nfit-y := $(ACPI_SRC)/core.o
 nfit-$(CONFIG_X86_MCE) += $(ACPI_SRC)/mce.o
@@ -75,4 +78,32 @@ libnvdimm-$(CONFIG_NVDIMM_PFN) += $(NVDIMM_SRC)/pfn_devs.o
 libnvdimm-$(CONFIG_NVDIMM_DAX) += $(NVDIMM_SRC)/dax_devs.o
 libnvdimm-y += config_check.o
 
+ib_core-y := $(IBCORE)/packer.o
+ib_core-y += $(IBCORE)/ud_header.o
+ib_core-y += $(IBCORE)/verbs.o
+ib_core-y += $(IBCORE)/cq.o
+ib_core-y += $(IBCORE)/rw.o
+ib_core-y += $(IBCORE)/sysfs.o
+ib_core-y += $(IBCORE)/device.o
+ib_core-y += $(IBCORE)/fmr_pool.o
+ib_core-y += $(IBCORE)/cache.o
+ib_core-y += $(IBCORE)/netlink.o
+ib_core-y += $(IBCORE)/roce_gid_mgmt.o
+ib_core-y += $(IBCORE)/mr_pool.o
+ib_core-y += $(IBCORE)/addr.o
+ib_core-y += $(IBCORE)/sa_query.o
+ib_core-y += $(IBCORE)/multicast.o
+ib_core-y += $(IBCORE)/mad.o
+ib_core-y += $(IBCORE)/smi.o
+ib_core-y += $(IBCORE)/agent.o
+ib_core-y += $(IBCORE)/mad_rmpp.o
+ib_core-y += $(IBCORE)/security.o
+ib_core-y += $(IBCORE)/nldev.o
+
+ib_core-$(CONFIG_INFINIBAND_USER_MEM) += $(IBCORE)/umem.o
+ib_core-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += $(IBCORE)/umem_odp.o
+ib_core-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += $(IBCORE)/umem_rbtree.o
+ib_core-$(CONFIG_CGROUP_RDMA) += $(IBCORE)/cgroup.o
+ib_core-y += config_check.o
+
 obj-m += test/
diff --git a/tools/testing/nvdimm/config_check.c b/tools/testing/nvdimm/config_check.c
index 7dc5a0af9b54..33e7c805bfd6 100644
--- a/tools/testing/nvdimm/config_check.c
+++ b/tools/testing/nvdimm/config_check.c
@@ -14,4 +14,6 @@ void check(void)
 	BUILD_BUG_ON(!IS_MODULE(CONFIG_ACPI_NFIT));
 	BUILD_BUG_ON(!IS_MODULE(CONFIG_DEV_DAX));
 	BUILD_BUG_ON(!IS_MODULE(CONFIG_DEV_DAX_PMEM));
+	BUILD_BUG_ON(!IS_ENABLED(CONFIG_INFINIBAND_USER_MEM));
+	BUILD_BUG_ON(!IS_MODULE(CONFIG_INFINIBAND));
 }
diff --git a/tools/testing/nvdimm/test/iomap.c b/tools/testing/nvdimm/test/iomap.c
index e1f75a1914a1..1c240328ee5b 100644
--- a/tools/testing/nvdimm/test/iomap.c
+++ b/tools/testing/nvdimm/test/iomap.c
@@ -388,4 +388,10 @@ union acpi_object * __wrap_acpi_evaluate_dsm(acpi_handle handle, const guid_t *g
 }
 EXPORT_SYMBOL(__wrap_acpi_evaluate_dsm);
 
+bool __wrap_dma_has_iommu(struct device *dev)
+{
+	return true;
+}
+EXPORT_SYMBOL(__wrap_dma_has_iommu);
+
 MODULE_LICENSE("GPL v2");
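
For background on the mock mechanism: "ldflags-y += --wrap=dma_has_iommu"
uses the GNU ld --wrap option, which redirects every undefined reference
to the named symbol to its __wrap_-prefixed counterpart, so the stub above
unconditionally answers "iommu present" under the test harness. A generic
sketch with a hypothetical symbol foo:

	/* with --wrap=foo, calls to foo() bind to this wrapper */
	bool __wrap_foo(void)
	{
		return true;		/* the mocked answer */
	}

	/* the original implementation stays reachable as __real_foo() */
	extern bool __real_foo(void);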

^ permalink raw reply related	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
  2017-10-06 22:35   ` Dan Williams
@ 2017-10-06 22:45     ` David Woodhouse
  -1 siblings, 0 replies; 158+ messages in thread
From: David Woodhouse @ 2017-10-06 22:45 UTC (permalink / raw)
  To: Dan Williams, linux-nvdimm
  Cc: Jan Kara, Ashok Raj,
	Darrick J. Wong, linux-rdma, Greg Kroah-Hartman, Joerg Roedel,
	Dave Chinner, linux-xfs, linux-mm, Jeff Moyer, linux-api,
	linux-fsdevel, Ross Zwisler, Robin Murphy, Christoph Hellwig,
	Marek Szyprowski

On Fri, 2017-10-06 at 15:35 -0700, Dan Williams wrote:
> Add a helper to determine if the dma mappings set up for a given device
> are backed by an iommu. In particular, this lets code paths know that a
> dma_unmap operation will revoke access to memory if the device can not
> otherwise be quiesced. The need for this knowledge is driven by a need
> to make RDMA transfers to DAX mappings safe. If the DAX file's block map
> changes we need to be able to reliably stop accesses to blocks that have been
> freed or re-assigned to a new file.

"a dma_unmap operation revoke access to memory"... but it's OK that the
next *map* will give the same DMA address to someone else, right?


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-06 22:52       ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 22:52 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, Linux API, linux-fsdevel, Robin Murphy,
	Christoph Hellwig, Marek Szyprowski

On Fri, Oct 6, 2017 at 3:45 PM, David Woodhouse <dwmw2@infradead.org> wrote:
> On Fri, 2017-10-06 at 15:35 -0700, Dan Williams wrote:
>> Add a helper to determine if the dma mappings set up for a given device
>> are backed by an iommu. In particular, this lets code paths know that a
>> dma_unmap operation will revoke access to memory if the device can not
>> otherwise be quiesced. The need for this knowledge is driven by a need
>> to make RDMA transfers to DAX mappings safe. If the DAX file's block map
>> changes we need to be able to reliably stop accesses to blocks that have been
>> freed or re-assigned to a new file.
>
> "a dma_unmap operation revoke access to memory"... but it's OK that the
> next *map* will give the same DMA address to someone else, right?

I'm assuming the next map will be to other physical addresses and a
different requester device since the memory is still registered
exclusively.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-06 23:10         ` David Woodhouse
  0 siblings, 0 replies; 158+ messages in thread
From: David Woodhouse @ 2017-10-06 23:10 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-nvdimm, Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, Dave Chinner, linux-xfs,
	Linux MM, Jeff Moyer, Linux API, linux-fsdevel, Ross Zwisler,
	Robin Murphy, Christoph Hellwig, Marek Szyprowski

On Fri, 2017-10-06 at 15:52 -0700, Dan Williams wrote:
> On Fri, Oct 6, 2017 at 3:45 PM, David Woodhouse <dwmw2@infradead.org> wrote:
> > 
> > On Fri, 2017-10-06 at 15:35 -0700, Dan Williams wrote:
> > > 
> > > Add a helper to determine if the dma mappings set up for a given device
> > > are backed by an iommu. In particular, this lets code paths know that a
> > > dma_unmap operation will revoke access to memory if the device can not
> > > otherwise be quiesced. The need for this knowledge is driven by a need
> > > to make RDMA transfers to DAX mappings safe. If the DAX file's block map
> > > changes we need to be able to reliably stop accesses to blocks that have been
> > > freed or re-assigned to a new file.
> > "a dma_unmap operation revoke access to memory"... but it's OK that the
> > next *map* will give the same DMA address to someone else, right?
>
> I'm assuming the next map will be to other physical addresses and a
> different requester device since the memory is still registered
> exclusively.

I meant the next map for this device/group.

It may well use the same virtual DMA address as the one you just
unmapped, yet actually map to a different physical address. So if the
DMA still occurs to the "old" address, that isn't revoked at all — it's
just going to the wrong physical location.

And if you are sure that the DMA will never happen, why do you need to
revoke the mapping in the first place?


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
  2017-10-06 22:52       ` Dan Williams
@ 2017-10-06 23:12         ` Dan Williams
  -1 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 23:12 UTC (permalink / raw)
  To: David Woodhouse
  Cc: linux-nvdimm, Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, Dave Chinner, linux-xfs,
	Linux MM, Jeff Moyer, Linux API, linux-fsdevel, Ross Zwisler,
	Robin Murphy, Christoph Hellwig, Marek Szyprowski

On Fri, Oct 6, 2017 at 3:52 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Fri, Oct 6, 2017 at 3:45 PM, David Woodhouse <dwmw2@infradead.org> wrote:
>> On Fri, 2017-10-06 at 15:35 -0700, Dan Williams wrote:
>>> Add a helper to determine if the dma mappings set up for a given device
>>> are backed by an iommu. In particular, this lets code paths know that a
>>> dma_unmap operation will revoke access to memory if the device can not
>>> otherwise be quiesced. The need for this knowledge is driven by a need
>>> to make RDMA transfers to DAX mappings safe. If the DAX file's block map
>>> changes we need to be able to reliably stop accesses to blocks that have been
>>> freed or re-assigned to a new file.
>>
>> "a dma_unmap operation revoke access to memory"... but it's OK that the
>> next *map* will give the same DMA address to someone else, right?
>
> I'm assuming the next map will be to other physical addresses and a
> different requester device since the memory is still registered
> exclusively.

[ chatted with Ashok ]

Yes, it seems we need a way to pin that IOVA as in use, but invalidate
it. Then we can wait for the unmap to occur when the memory is
unregistered to avoid this IOVA reuse problem.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-06 23:15           ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-06 23:15 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, Linux API, linux-fsdevel, Robin Murphy,
	Christoph Hellwig, Marek Szyprowski

On Fri, Oct 6, 2017 at 4:10 PM, David Woodhouse <dwmw2@infradead.org> wrote:
> On Fri, 2017-10-06 at 15:52 -0700, Dan Williams wrote:
>> On Fri, Oct 6, 2017 at 3:45 PM, David Woodhouse <dwmw2@infradead.org> wrote:
>> >
>> > On Fri, 2017-10-06 at 15:35 -0700, Dan Williams wrote:
>> > >
>> > > Add a helper to determine if the dma mappings set up for a given device
>> > > are backed by an iommu. In particular, this lets code paths know that a
>> > > dma_unmap operation will revoke access to memory if the device can not
>> > > otherwise be quiesced. The need for this knowledge is driven by a need
>> > > to make RDMA transfers to DAX mappings safe. If the DAX file's block map
>> > > changes we need to be able to reliably stop accesses to blocks that have been
>> > > freed or re-assigned to a new file.
>> > "a dma_unmap operation revoke access to memory"... but it's OK that the
>> > next *map* will give the same DMA address to someone else, right?
>>
>> I'm assuming the next map will be to other physical addresses and a
>> different requester device since the memory is still registered
>> exclusively.
>
> I meant the next map for this device/group.
>
> It may well use the same virtual DMA address as the one you just
> unmapped, yet actually map to a different physical address. So if the
> DMA still occurs to the "old" address, that isn't revoked at all — it's
> just going to the wrong physical location.
>
> And if you are sure that the DMA will never happen, why do you need to
> revoke the mapping in the first place?

Right, crossed mails. The semantic I want is that the IOVA is
invalidated / starts throwing errors to the device because the address
it thought it was talking to has been remapped in the file. Once
userspace wakes up and responds to this invalidation event it can do
the actual unmap to make the IOVA reusable again.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
  2017-10-06 23:15           ` Dan Williams
@ 2017-10-07 11:08             ` David Woodhouse
  -1 siblings, 0 replies; 158+ messages in thread
From: David Woodhouse @ 2017-10-07 11:08 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-nvdimm, Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, Dave Chinner, linux-xfs,
	Linux MM, Jeff Moyer, Linux API, linux-fsdevel, Ross Zwisler,
	Robin Murphy, Christoph Hellwig, Marek Szyprowski

On Fri, 2017-10-06 at 16:15 -0700, Dan Williams wrote:
> 
> Right, crossed mails. The semantic I want is that the IOVA is
> invalidated / starts throwing errors to the device because the address
> it thought it was talking to has been remapped in the file. Once
> userspace wakes up and responds to this invalidation event it can do
> the actual unmap to make the IOVA reusable again.

So basically you want to unmap it by removing it from the page tables
and flushing the IOTLB, but you want the IOVA to still be reserved.

The normal device-facing DMA API doesn't give you that today. You could
do it with the IOMMU API though — that one does let you manage the IOVA
space yourself. You don't want that IOVA used again? Well don't use it
as the IOVA in a subsequent iommu_map() call then :)
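
For illustration, the split being described (a sketch assuming the
iommu_map()/iommu_unmap() signatures of this era):

	/* revoke: tear down the translation so the device now faults */
	iommu_unmap(domain, iova, size);

	/*
	 * "Reserve": with the IOMMU API the caller owns IOVA allocation,
	 * so the address stays dead simply by never being handed to a
	 * subsequent iommu_map(domain, iova, new_paddr, size, prot).
	 */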


^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-07 23:33               ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-07 23:33 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	linux-xfs, Linux MM, Linux API, linux-fsdevel, Robin Murphy,
	Christoph Hellwig, Marek Szyprowski

On Sat, Oct 7, 2017 at 4:08 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> On Fri, 2017-10-06 at 16:15 -0700, Dan Williams wrote:
>>
>> Right, crossed mails. The semantic I want is that the IOVA is
>> invalidated / starts throwing errors to the device because the address
>> it thought it was talking to has been remapped in the file. Once
>> userspace wakes up and responds to this invalidation event it can do
>> the actual unmap to make the IOVA reusable again.
>
> So basically you want to unmap it by removing it from the page tables
> and flushing the IOTLB, but you want the IOVA to still be reserved.
>
> The normal device-facing DMA API doesn't give you that today. You could
> do it with the IOMMU API though — that one does let you manage the IOVA
> space yourself. You don't want that IOVA used again? Well don't use it
> as the IOVA in a subsequent iommu_map() call then :)

Ah, nice. So I think I'll just add a dma_get_iommu_domain() so the
dma_ops implementation can exclude identity-mapped devices, and then
iommu_unmap() does the rest. Thanks for the pointer.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* [PATCH v8] dma-mapping: introduce dma_get_iommu_domain()
  2017-10-06 22:35   ` Dan Williams
@ 2017-10-08  3:45     ` Dan Williams
  -1 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-08  3:45 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, Dave Chinner, Robin Murphy,
	linux-xfs, linux-mm, linux-api, linux-fsdevel, David Woodhouse,
	Christoph Hellwig, Marek Szyprowski

Add a dma-mapping API helper to retrieve the generic iommu_domain for a
device. The motivation for this interface is making RDMA transfers to DAX
mappings safe. If the DAX file's block map changes we need to be able to
reliably stop accesses to blocks that have been freed or re-assigned to a
new file. With the iommu_domain and a callback from the DAX filesystem the
kernel can safely revoke access to a DMA device. The process that performed
the RDMA memory registration is also notified of this revocation event, but
the kernel cannot otherwise be put in the position of waiting for userspace
to quiesce the device.

Since PMEM+DAX is currently only enabled for x86, we only update the x86
iommu drivers.

Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v7:
* retrieve the iommu_domain so that we can later pass the results of
  dma_map_* to iommu_unmap() in advance of the actual dma_unmap_*.

 drivers/base/dma-mapping.c  |   10 ++++++++++
 drivers/iommu/amd_iommu.c   |   10 ++++++++++
 drivers/iommu/intel-iommu.c |   15 +++++++++++++++
 include/linux/dma-mapping.h |    3 +++
 4 files changed, 38 insertions(+)

diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
index e584eddef0a7..fdb9764f95a4 100644
--- a/drivers/base/dma-mapping.c
+++ b/drivers/base/dma-mapping.c
@@ -369,3 +369,13 @@ void dma_deconfigure(struct device *dev)
 	of_dma_deconfigure(dev);
 	acpi_dma_deconfigure(dev);
 }
+
+struct iommu_domain *dma_get_iommu_domain(struct device *dev)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	if (ops && ops->get_iommu)
+		return ops->get_iommu(dev);
+	return NULL;
+}
+EXPORT_SYMBOL(dma_get_iommu_domain);
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 51f8215877f5..c8e1a45af182 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2271,6 +2271,15 @@ static struct protection_domain *get_domain(struct device *dev)
 	return domain;
 }
 
+static struct iommu_domain *amd_dma_get_iommu(struct device *dev)
+{
+	struct protection_domain *domain = get_domain(dev);
+
+	if (IS_ERR(domain))
+		return NULL;
+	return &domain->domain;
+}
+
 static void update_device_table(struct protection_domain *domain)
 {
 	struct iommu_dev_data *dev_data;
@@ -2689,6 +2698,7 @@ static const struct dma_map_ops amd_iommu_dma_ops = {
 	.unmap_sg	= unmap_sg,
 	.dma_supported	= amd_iommu_dma_supported,
 	.mapping_error	= amd_iommu_mapping_error,
+	.get_iommu	= amd_dma_get_iommu,
 };
 
 static int init_reserved_iova_ranges(void)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 6784a05dd6b2..f3f4939cebad 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3578,6 +3578,20 @@ static int iommu_no_mapping(struct device *dev)
 	return 0;
 }
 
+static struct iommu_domain *intel_dma_get_iommu(struct device *dev)
+{
+	struct dmar_domain *domain;
+
+	if (iommu_no_mapping(dev))
+		return NULL;
+
+	domain = get_valid_domain_for_dev(dev);
+	if (!domain)
+		return NULL;
+
+	return &domain->domain;
+}
+
 static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
 				     size_t size, int dir, u64 dma_mask)
 {
@@ -3872,6 +3886,7 @@ const struct dma_map_ops intel_dma_ops = {
 	.map_page = intel_map_page,
 	.unmap_page = intel_unmap_page,
 	.mapping_error = intel_mapping_error,
+	.get_iommu = intel_dma_get_iommu,
 #ifdef CONFIG_X86
 	.dma_supported = x86_dma_supported,
 #endif
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 29ce9815da87..aa62df1d0d72 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -128,6 +128,7 @@ struct dma_map_ops {
 				   enum dma_data_direction dir);
 	int (*mapping_error)(struct device *dev, dma_addr_t dma_addr);
 	int (*dma_supported)(struct device *dev, u64 mask);
+	struct iommu_domain *(*get_iommu)(struct device *dev);
 #ifdef ARCH_HAS_DMA_GET_REQUIRED_MASK
 	u64 (*get_required_mask)(struct device *dev);
 #endif
@@ -221,6 +222,8 @@ static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
 }
 #endif
 
+extern struct iommu_domain *dma_get_iommu_domain(struct device *dev);
+
 static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
 					      size_t size,
 					      enum dma_data_direction dir,

^ permalink raw reply related	[flat|nested] 158+ messages in thread
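
[ For callers of dma_get_iommu_domain(), the NULL return doubles as a
  capability check: a device with no iommu, or one that is identity
  mapped, yields NULL and its access cannot be force-revoked. A short
  sketch under that assumption; the surrounding code is illustrative:

	struct iommu_domain *domain = dma_get_iommu_domain(dev);

	if (!domain)
		return -EOPNOTSUPP; /* access is not revocable */

	/* safe: mappings can later be invalidated with iommu_unmap() */
  ]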

* [PATCH v8 1/2] iommu: up-level sg_num_pages() from amd-iommu
  2017-10-06 22:36   ` Dan Williams
  (?)
@ 2017-10-08  4:02     ` Dan Williams
  -1 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-08  4:02 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: linux-rdma, linux-api, Joerg Roedel, linux-xfs, linux-mm, iommu,
	linux-fsdevel

iommu_sg_num_pages() is a helper that walks a scatterlist and counts
pages, taking segment boundaries and iommu_num_pages() into account.
Up-level it so callers can determine the IOVA range that a dma_map_ops
implementation established at dma_map_sg() time. The intent is to
iommu_unmap() that IOVA range in advance of freeing the IOVA allocation.

Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
New patch in v8.

 drivers/iommu/amd_iommu.c |   30 ++----------------------------
 drivers/iommu/iommu.c     |   27 +++++++++++++++++++++++++++
 include/linux/iommu.h     |    2 ++
 3 files changed, 31 insertions(+), 28 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index c8e1a45af182..4795b0823469 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2459,32 +2459,6 @@ static void unmap_page(struct device *dev, dma_addr_t dma_addr, size_t size,
 	__unmap_single(dma_dom, dma_addr, size, dir);
 }
 
-static int sg_num_pages(struct device *dev,
-			struct scatterlist *sglist,
-			int nelems)
-{
-	unsigned long mask, boundary_size;
-	struct scatterlist *s;
-	int i, npages = 0;
-
-	mask          = dma_get_seg_boundary(dev);
-	boundary_size = mask + 1 ? ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT :
-				   1UL << (BITS_PER_LONG - PAGE_SHIFT);
-
-	for_each_sg(sglist, s, nelems, i) {
-		int p, n;
-
-		s->dma_address = npages << PAGE_SHIFT;
-		p = npages % boundary_size;
-		n = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
-		if (p + n > boundary_size)
-			npages += boundary_size - p;
-		npages += n;
-	}
-
-	return npages;
-}
-
 /*
  * The exported map_sg function for dma_ops (handles scatter-gather
  * lists).
@@ -2507,7 +2481,7 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
 	dma_dom  = to_dma_ops_domain(domain);
 	dma_mask = *dev->dma_mask;
 
-	npages = sg_num_pages(dev, sglist, nelems);
+	npages = iommu_sg_num_pages(dev, sglist, nelems);
 
 	address = dma_ops_alloc_iova(dev, dma_dom, npages, dma_mask);
 	if (address == AMD_IOMMU_MAPPING_ERROR)
@@ -2585,7 +2559,7 @@ static void unmap_sg(struct device *dev, struct scatterlist *sglist,
 
 	startaddr = sg_dma_address(sglist) & PAGE_MASK;
 	dma_dom   = to_dma_ops_domain(domain);
-	npages    = sg_num_pages(dev, sglist, nelems);
+	npages    = iommu_sg_num_pages(dev, sglist, nelems);
 
 	__unmap_single(dma_dom, startaddr, npages << PAGE_SHIFT, dir);
 }
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3de5c0bcb5cc..cfe6eeea3578 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -33,6 +33,7 @@
 #include <linux/bitops.h>
 #include <linux/property.h>
 #include <trace/events/iommu.h>
+#include <linux/iommu-helper.h>
 
 static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
@@ -1631,6 +1632,32 @@ size_t iommu_unmap_fast(struct iommu_domain *domain,
 }
 EXPORT_SYMBOL_GPL(iommu_unmap_fast);
 
+int iommu_sg_num_pages(struct device *dev, struct scatterlist *sglist,
+		int nelems)
+{
+	unsigned long mask, boundary_size;
+	struct scatterlist *s;
+	int i, npages = 0;
+
+	mask = dma_get_seg_boundary(dev);
+	boundary_size = mask + 1 ? ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT
+		: 1UL << (BITS_PER_LONG - PAGE_SHIFT);
+
+	for_each_sg(sglist, s, nelems, i) {
+		int p, n;
+
+		s->dma_address = npages << PAGE_SHIFT;
+		p = npages % boundary_size;
+		n = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
+		if (p + n > boundary_size)
+			npages += boundary_size - p;
+		npages += n;
+	}
+
+	return npages;
+}
+EXPORT_SYMBOL_GPL(iommu_sg_num_pages);
+
 size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 			 struct scatterlist *sg, unsigned int nents, int prot)
 {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a7f2ac689d29..5b2d20e1475a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -303,6 +303,8 @@ extern size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 			  size_t size);
 extern size_t iommu_unmap_fast(struct iommu_domain *domain,
 			       unsigned long iova, size_t size);
+extern int iommu_sg_num_pages(struct device *dev, struct scatterlist *sglist,
+		int nelems);
 extern size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 				struct scatterlist *sg,unsigned int nents,
 				int prot);

^ permalink raw reply related	[flat|nested] 158+ messages in thread
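
[ The intended pairing, per the changelog above: recompute the IOVA
  footprint that dma_map_sg() established for a scatterlist, then revoke
  exactly that range. Note that iommu_unmap() takes a size in bytes, so
  the page count must be shifted. A sketch; dev/domain/sgl/nelems are
  illustrative:

	int npages = iommu_sg_num_pages(dev, sgl, nelems);

	iommu_unmap(domain, sg_dma_address(sgl) & PAGE_MASK,
			(size_t)npages << PAGE_SHIFT);
  ]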

* [PATCH v8 2/2] IB/core: use MAP_DIRECT to fix / enable RDMA to DAX mappings
  2017-10-06 22:36   ` Dan Williams
  (?)
@ 2017-10-08  4:04     ` Dan Williams
  -1 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-08  4:04 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: J. Bruce Fields, Doug Ledford, Jan Kara, Ashok Raj,
	Darrick J. Wong, linux-rdma, linux-api, Joerg Roedel,
	Dave Chinner, iommu, Hal Rosenstock, linux-xfs, linux-mm,
	Jeff Layton, linux-fsdevel, Sean Hefty, David Woodhouse,
	Christoph Hellwig

Currently the ibverbs core in the kernel is completely unaware of the
dangers of filesystem-DAX mappings. Specifically, the filesystem is free
to move file blocks at will. In the case of DAX, it means that RDMA to a
given file offset can dynamically switch to another file offset, another
file, or free space with no notification to the RDMA device to cease
operations. Historically, this lack of communication between the ibverbs
core and filesystem was not a problem because RDMA always targeted
dynamically allocated page cache, so at least the RDMA device would have
valid memory to target even if the file was being modified. With DAX we
need to add coordination since RDMA is bypassing page-cache and going
direct to on-media pages of the file. RDMA to DAX can cause damage if
filesystem blocks move / change state.

Use the new ->lease_direct() operation to get a notification when the
filesystem is invalidating the block map of the file and needs RDMA
operations to stop. Given that the kernel can not be in a position where
it needs to wait indefinitely for userspace to stop a device we need a
mechanism where the kernel can force-revoke access. Towards that end, use
dma_get_iommu_domain() to both check whether the device has domain
mappings that can be invalidated and to retrieve the iommu_domain for use
with iommu_unmap().

Once we have the assurance that we can block in-flight I/O when the
file's block map changes, we can safely allow RDMA to DAX.

Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Changes since v7:
* Switch from dma_has_iommu() to dma_get_iommu_domain().
* Switch from dma_unmap_sg() at lease break time, to iommu_unmap() so
  that the IOVA remains allocated while the device might still be
  sending transactions.

 drivers/infiniband/core/umem.c |   90 +++++++++++++++++++++++++++++++++++-----
 include/rdma/ib_umem.h         |    8 ++++
 2 files changed, 86 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 21e60b1e2ff4..5e4598982359 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -36,6 +36,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/mm.h>
+#include <linux/mapdirect.h>
 #include <linux/export.h>
 #include <linux/hugetlb.h>
 #include <linux/slab.h>
@@ -46,10 +47,16 @@
 
 static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty)
 {
+	struct lease_direct *ld, *_ld;
 	struct scatterlist *sg;
 	struct page *page;
 	int i;
 
+	list_for_each_entry_safe(ld, _ld, &umem->leases, list) {
+		list_del_init(&ld->list);
+		map_direct_lease_destroy(ld);
+	}
+
 	if (umem->nmap > 0)
 		ib_dma_unmap_sg(dev, umem->sg_head.sgl,
 				umem->npages,
@@ -64,10 +71,20 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
 	}
 
 	sg_free_table(&umem->sg_head);
-	return;
 
 }
 
+static void ib_umem_lease_break(void *__umem)
+{
+	struct ib_umem *umem = __umem;
+	struct ib_device *idev = umem->context->device;
+	struct device *dev = idev->dma_device;
+	struct scatterlist *sgl = umem->sg_head.sgl;
+
+	iommu_unmap(umem->iommu, sg_dma_address(sgl) & PAGE_MASK,
+		    iommu_sg_num_pages(dev, sgl, umem->npages) << PAGE_SHIFT);
+}
+
 /**
  * ib_umem_get - Pin and DMA map userspace memory.
  *
@@ -96,7 +113,10 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	struct scatterlist *sg, *sg_list_start;
 	int need_release = 0;
 	unsigned int gup_flags = FOLL_WRITE;
+	struct vm_area_struct *vma_prev = NULL;
+	struct device *dma_dev;
 
+	dma_dev = context->device->dma_device;
 	if (dmasync)
 		dma_attrs |= DMA_ATTR_WRITE_BARRIER;
 
@@ -120,6 +140,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	umem->address    = addr;
 	umem->page_shift = PAGE_SHIFT;
 	umem->pid	 = get_task_pid(current, PIDTYPE_PID);
+	INIT_LIST_HEAD(&umem->leases);
 	/*
 	 * We ask for writable memory if any of the following
 	 * access flags are set.  "Local write" and "remote write"
@@ -147,19 +168,21 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	umem->hugetlb   = 1;
 
 	page_list = (struct page **) __get_free_page(GFP_KERNEL);
-	if (!page_list) {
-		put_pid(umem->pid);
-		kfree(umem);
-		return ERR_PTR(-ENOMEM);
-	}
+	if (!page_list)
+		goto err_pagelist;
 
 	/*
-	 * if we can't alloc the vma_list, it's not so bad;
-	 * just assume the memory is not hugetlb memory
+	 * If DAX is enabled we need the vma to set up a ->lease_direct()
+	 * lease to protect against file modifications, otherwise we can
+	 * tolerate a failure to allocate the vma_list and just assume
+	 * that all vmas are not hugetlb-vmas.
 	 */
 	vma_list = (struct vm_area_struct **) __get_free_page(GFP_KERNEL);
-	if (!vma_list)
+	if (!vma_list) {
+		if (IS_ENABLED(CONFIG_DAX_MAP_DIRECT))
+			goto err_vmalist;
 		umem->hugetlb = 0;
+	}
 
 	npages = ib_umem_num_pages(umem);
 
@@ -199,15 +222,52 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 		if (ret < 0)
 			goto out;
 
-		umem->npages += ret;
 		cur_base += ret * PAGE_SIZE;
 		npages   -= ret;
 
 		for_each_sg(sg_list_start, sg, ret, i) {
-			if (vma_list && !is_vm_hugetlb_page(vma_list[i]))
-				umem->hugetlb = 0;
+			const struct vm_operations_struct *vm_ops;
+			struct vm_area_struct *vma;
+			struct lease_direct *ld;
 
 			sg_set_page(sg, page_list[i], PAGE_SIZE, 0);
+			umem->npages++;
+
+			if (!vma_list)
+				continue;
+			vma = vma_list[i];
+
+			if (vma == vma_prev)
+				continue;
+			vma_prev = vma;
+
+			if (!is_vm_hugetlb_page(vma))
+				umem->hugetlb = 0;
+
+			if (!vma_is_dax(vma))
+				continue;
+
+			vm_ops = vma->vm_ops;
+			if (!vm_ops->lease_direct) {
+				dev_info(dma_dev, "DAX-RDMA requires a MAP_DIRECT mapping\n");
+				ret = -EOPNOTSUPP;
+				goto out;
+			}
+
+			if (!umem->iommu)
+				umem->iommu = dma_get_iommu_domain(dma_dev);
+			if (!umem->iommu) {
+				dev_info(dma_dev, "DAX-RDMA requires an iommu protected device\n");
+				ret = -EOPNOTSUPP;
+				goto out;
+			}
+			ld = vm_ops->lease_direct(vma, ib_umem_lease_break,
+					umem);
+			if (IS_ERR(ld)) {
+				ret = PTR_ERR(ld);
+				goto out;
+			}
+			list_add(&ld->list, &umem->leases);
 		}
 
 		/* preparing for next loop */
@@ -242,6 +302,12 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	free_page((unsigned long) page_list);
 
 	return ret < 0 ? ERR_PTR(ret) : umem;
+err_vmalist:
+	free_page((unsigned long) page_list);
+err_pagelist:
+	put_pid(umem->pid);
+	kfree(umem);
+	return ERR_PTR(-ENOMEM);
 }
 EXPORT_SYMBOL(ib_umem_get);
 
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 23159dd5be18..5048be012f96 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -34,6 +34,7 @@
 #define IB_UMEM_H
 
 #include <linux/list.h>
+#include <linux/iommu.h>
 #include <linux/scatterlist.h>
 #include <linux/workqueue.h>
 
@@ -55,6 +56,13 @@ struct ib_umem {
 	struct sg_table sg_head;
 	int             nmap;
 	int             npages;
+	/*
+	 * Note: no lock protects this list since we assume memory
+	 * registration never races unregistration for a given ib_umem
+	 * instance.
+	 */
+	struct list_head	leases;
+	struct iommu_domain	*iommu;
 };
 
 /* Returns the offset of the umem start relative to the first page. */

^ permalink raw reply related	[flat|nested] 158+ messages in thread
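
[ The userspace half of this contract, roughly: registration only
  succeeds against a vma mapped with MAP_DIRECT (via MAP_SHARED_VALIDATE),
  since that is what gives the kernel a breakable lease. A hedged sketch;
  MAP_DIRECT is introduced by this series and is not an existing uapi
  flag, and pd/fd/len come from the usual verbs and open() setup:

	#include <sys/mman.h>
	#include <err.h>
	#include <infiniband/verbs.h>

	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_SHARED_VALIDATE | MAP_DIRECT, fd, 0);
	if (buf == MAP_FAILED)
		err(1, "mmap");

	/* ib_umem_get() will take a ->lease_direct() lease on this vma */
	struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
			IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);
	if (!mr)
		err(1, "ibv_reg_mr");
  ]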

* Re: [PATCH v8 2/2] IB/core: use MAP_DIRECT to fix / enable RDMA to DAX mappings
@ 2017-10-08  6:45       ` kbuild test robot
  0 siblings, 0 replies; 158+ messages in thread
From: kbuild test robot @ 2017-10-08  6:45 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, Dave Chinner, J. Bruce Fields, linux-mm, Jeff Layton,
	Sean Hefty, Ashok Raj, linux-nvdimm, linux-rdma, Joerg Roedel,
	Christoph Hellwig, Doug Ledford, Hal Rosenstock, linux-api,
	Darrick J. Wong, linux-xfs, iommu, kbuild-all, linux-fsdevel,
	David Woodhouse

Hi Dan,

[auto build test ERROR on rdma/master]
[also build test ERROR on v4.14-rc3 next-20170929]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Dan-Williams/iommu-up-level-sg_num_pages-from-amd-iommu/20171008-133505
base:   https://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git master
config: i386-randconfig-n0-201741 (attached as .config)
compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

>> drivers/infiniband/core/umem.c:39:29: fatal error: linux/mapdirect.h: No such file or directory
    #include <linux/mapdirect.h>
                                ^
   compilation terminated.

vim +39 drivers/infiniband/core/umem.c

  > 39	#include <linux/mapdirect.h>
    40	#include <linux/export.h>
    41	#include <linux/hugetlb.h>
    42	#include <linux/slab.h>
    43	#include <rdma/ib_umem_odp.h>
    44	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v8 2/2] IB/core: use MAP_DIRECT to fix / enable RDMA to DAX mappings
@ 2017-10-08 15:49         ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-08 15:49 UTC (permalink / raw)
  To: kbuild test robot
  Cc: kbuild-all, linux-nvdimm, Sean Hefty, linux-xfs, Jan Kara,
	Ashok Raj, Darrick J. Wong, linux-rdma, Linux API, Joerg Roedel,
	Dave Chinner, Jeff Moyer, iommu, Christoph Hellwig,
	J. Bruce Fields, Linux MM, Doug Ledford, Ross Zwisler,
	linux-fsdevel, Jeff Layton, David Woodhouse, Hal Rosenstock

On Sat, Oct 7, 2017 at 11:45 PM, kbuild test robot <lkp@intel.com> wrote:
> Hi Dan,
>
> [auto build test ERROR on rdma/master]
> [also build test ERROR on v4.14-rc3 next-20170929]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

This was a fixed-up resend of patch [v7 11/12]. It's not clear how to
teach the kbuild robot to be aware of patch-replies to individual
patches in a series, i.e. reworked patches sent without resending the
complete series.

> url:    https://github.com/0day-ci/linux/commits/Dan-Williams/iommu-up-level-sg_num_pages-from-amd-iommu/20171008-133505
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git master
> config: i386-randconfig-n0-201741 (attached as .config)
> compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=i386
>
> All errors (new ones prefixed by >>):
>
>>> drivers/infiniband/core/umem.c:39:29: fatal error: linux/mapdirect.h: No such file or directory
>     #include <linux/mapdirect.h>

mapdirect.h indeed does not exist when the earlier patches in the
series are missing. It would be slick if the 0day-robot read the
"in-reply-to" header and auto-replaced a patch in a series, but that
would be a feature approaching magic.

>    compilation terminated.
>
> vim +39 drivers/infiniband/core/umem.c
>
>   > 39  #include <linux/mapdirect.h>
>     40  #include <linux/export.h>
>     41  #include <linux/hugetlb.h>
>     42  #include <linux/slab.h>
>     43  #include <rdma/ib_umem_odp.h>
>     44
>
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 03/12] fs: introduce i_mapdcount
@ 2017-10-09  3:08     ` Dave Chinner
  0 siblings, 0 replies; 158+ messages in thread
From: Dave Chinner @ 2017-10-09  3:08 UTC (permalink / raw)
  To: Dan Williams
  Cc: J. Bruce Fields, Jan Kara, Darrick J. Wong, linux-rdma,
	linux-api, linux-nvdimm, linux-xfs, linux-mm, linux-fsdevel,
	Jeff Layton, Christoph Hellwig

On Fri, Oct 06, 2017 at 03:35:32PM -0700, Dan Williams wrote:
> When ->iomap_begin() sees this count being non-zero and determines that
> the block map of the file needs to be modified to satisfy the I/O
> request it will instead return an error. This is needed for MAP_DIRECT
> where, due to locking constraints, we can't rely on xfs_break_layouts()
> to protect against allocating write-faults either from the process that
> set up the MAP_DIRECT mapping or from other processes that have the file
> mapped.  xfs_break_layouts() requires XFS_IOLOCK which is problematic to
> mix with the XFS_MMAPLOCK in the fault path.
> 
> Cc: Jan Kara <jack@suse.cz>
> Cc: Jeff Moyer <jmoyer@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> Cc: Jeff Layton <jlayton@poochiereds.net>
> Cc: "J. Bruce Fields" <bfields@fieldses.org>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  fs/xfs/xfs_iomap.c |    9 +++++++++
>  include/linux/fs.h |   31 +++++++++++++++++++++++++++++++
>  2 files changed, 40 insertions(+)
> 
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index a1909bc064e9..6816f8ebbdcf 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -1053,6 +1053,15 @@ xfs_file_iomap_begin(
>  			goto out_unlock;
>  		}
>  		/*
> +		 * If a file has MAP_DIRECT mappings, disable block map
> +		 * updates. This should only affect mmap write faults as
> +		 * other paths are protected by an FL_LAYOUT lease.
> +		 */
> +		if (i_mapdcount_read(inode)) {
> +			error = -ETXTBSY;
> +			goto out_unlock;
> +		}

That looks really fragile. For one, it's going to miss modifications
to reflinked files altogether. Ignoring that, however, I don't want to
have to care one bit about the internals of the MAP_DIRECT
implementation in the filesystem code. Hide it behind something with
an obvious name that returns the appropriate error and the
filesystem code becomes self documenting:

	if ((flags & IOMAP_WRITE) && imap_needs_alloc(inode, &imap, nimaps)) {
		.....
		error = iomap_can_allocate(inode);
		if (error)
			goto out_unlock;

Then you can put all the MAP_DIRECT stuff and the comments
explaining what it does inside the generic function that determines
if we are allowed to allocate on that inode or not.
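
Something like this, say (rough sketch only; iomap_can_allocate() is
the name suggested above, and i_mapdcount_read() is the accessor this
patch adds):

	static inline int iomap_can_allocate(struct inode *inode)
	{
		/*
		 * MAP_DIRECT mappings pin the block map: refuse
		 * anything that would allocate blocks or convert
		 * extents while one is active.
		 */
		if (i_mapdcount_read(inode))
			return -ETXTBSY;
		return 0;
	}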

> +		/*
>  		 * We cap the maximum length we map here to MAX_WRITEBACK_PAGES
>  		 * pages to keep the chunks of work done where somewhat symmetric
>  		 * with the work writeback does. This is a completely arbitrary
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index c2b9bf3dc4e9..f83871b188ff 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -642,6 +642,9 @@ struct inode {
>  	atomic_t		i_count;
>  	atomic_t		i_dio_count;
>  	atomic_t		i_writecount;
> +#ifdef CONFIG_FS_DAX
> +	atomic_t		i_mapdcount;	/* count of MAP_DIRECT vmas */
> +#endif

Is there any way to avoid growing the struct inode for this?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 06/12] xfs: wire up MAP_DIRECT
@ 2017-10-09  3:40     ` Dave Chinner
  -1 siblings, 0 replies; 158+ messages in thread
From: Dave Chinner @ 2017-10-09  3:40 UTC (permalink / raw)
  To: Dan Williams
  Cc: J. Bruce Fields, Jan Kara, Arnd Bergmann, Darrick J. Wong,
	linux-rdma, linux-api, linux-nvdimm, linux-xfs, linux-mm,
	Alexander Viro, linux-fsdevel, Jeff Layton, Christoph Hellwig

On Fri, Oct 06, 2017 at 03:35:49PM -0700, Dan Williams wrote:
> MAP_DIRECT is an mmap(2) flag with the following semantics:
> 
>   MAP_DIRECT
>   When specified with MAP_SHARED_VALIDATE, sets up a file lease with the
>   same lifetime as the mapping. Unlike a typical F_RDLCK lease this lease
>   is broken when a "lease breaker" attempts to write(2), change the block
>   map (fallocate), or change the size of the file. Otherwise the mechanism
>   of a lease break is identical to the typical lease break case where the
>   lease needs to be removed (munmap) within the number of seconds
>   specified by /proc/sys/fs/lease-break-time. If the lease holder fails to
>   remove the lease in time the kernel will invalidate the mapping and
>   force all future accesses to the mapping to trigger SIGBUS.
> 
>   In addition to lease break timeouts causing faults in the mapping to
>   result in SIGBUS, other states of the file will trigger SIGBUS at fault
>   time:
> 
>       * The file is not DAX capable
>       * The file has reflinked (copy-on-write) blocks
>       * The fault would trigger the filesystem to allocate blocks
>       * The fault would trigger the filesystem to perform extent conversion
> 
>   In other words, MAP_DIRECT expects and enforces a fully allocated file
>   where faults can be satisfied without modifying block map metadata.
> 
>   An unprivileged process may establish a MAP_DIRECT mapping on a file
>   whose UID (owner) matches the filesystem UID of the process. A process
>   with the CAP_LEASE capability may establish a MAP_DIRECT mapping on
>   arbitrary files.
> 
>   ERRORS
>   EACCES Beyond the typical mmap(2) conditions that trigger EACCES
>   MAP_DIRECT also requires the permission to set a file lease.
> 
>   EOPNOTSUPP The filesystem explicitly does not support the flag
> 
>   SIGBUS Attempted to write a MAP_DIRECT mapping at a file offset that
>          might require block-map updates, or the lease timed out and the
>          kernel invalidated the mapping.
> 
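(For reference, the userspace sequence these semantics imply is
something like the following - a sketch only, with a hypothetical
path, fd and length, error handling elided, and MAP_DIRECT coming
from the new uapi header:

	int fd = open("/mnt/dax/file", O_RDWR);
	void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			MAP_SHARED_VALIDATE | MAP_DIRECT, fd, 0);
	if (addr == MAP_FAILED)
		/* e.g. EOPNOTSUPP: fall back to MAP_SHARED + msync */;

and the mapping must be removed with munmap() within
lease-break-time once the lease break notification arrives.)
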
> Cc: Jan Kara <jack@suse.cz>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Jeff Moyer <jmoyer@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> Cc: Jeff Layton <jlayton@poochiereds.net>
> Cc: "J. Bruce Fields" <bfields@fieldses.org>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  fs/xfs/Kconfig                  |    2 -
>  fs/xfs/xfs_file.c               |  102 +++++++++++++++++++++++++++++++++++++++
>  include/linux/mman.h            |    3 +
>  include/uapi/asm-generic/mman.h |    1 
>  4 files changed, 106 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
> index f62fc6629abb..f8765653a438 100644
> --- a/fs/xfs/Kconfig
> +++ b/fs/xfs/Kconfig
> @@ -112,4 +112,4 @@ config XFS_ASSERT_FATAL
>  
>  config XFS_LAYOUT
>  	def_bool y
> -	depends on EXPORTFS_BLOCK_OPS
> +	depends on EXPORTFS_BLOCK_OPS || FS_DAX
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index ebdd0bd2b261..e35518600e28 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -40,12 +40,22 @@
>  #include "xfs_iomap.h"
>  #include "xfs_reflink.h"
>  
> +#include <linux/mman.h>
>  #include <linux/dcache.h>
>  #include <linux/falloc.h>
>  #include <linux/pagevec.h>
> +#include <linux/mapdirect.h>
>  #include <linux/backing-dev.h>
>  
>  static const struct vm_operations_struct xfs_file_vm_ops;
> +static const struct vm_operations_struct xfs_file_vm_direct_ops;
> +
> +static inline bool
> +is_xfs_map_direct(
> +		struct vm_area_struct *vma)
> +{
> +	return vma->vm_ops == &xfs_file_vm_direct_ops;
> +}

Namespacing (xfs_vma_is_direct) and whitespace damage.
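
i.e. something like (sketch, same body as the patch, just renamed and
with XFS-style parameter indentation):

	static inline bool
	xfs_vma_is_direct(
		struct vm_area_struct	*vma)
	{
		return vma->vm_ops == &xfs_file_vm_direct_ops;
	}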

>  
>  /*
>   * Clear the specified ranges to zero through either the pagecache or DAX.
> @@ -1008,6 +1018,26 @@ xfs_file_llseek(
>  	return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
>  }
>  
> +static int
> +xfs_vma_checks(
> +	struct vm_area_struct	*vma,
> +	struct inode		*inode)

Exactly what are we checking for? The function name doesn't tell me,
and there are no comments, either?

> +{
> +	if (!is_xfs_map_direct(vma))
> +		return 0;
> +
> +	if (!is_map_direct_valid(vma->vm_private_data))
> +		return VM_FAULT_SIGBUS;
> +
> +	if (xfs_is_reflink_inode(XFS_I(inode)))
> +		return VM_FAULT_SIGBUS;
> +
> +	if (!IS_DAX(inode))
> +		return VM_FAULT_SIGBUS;

And how do we get is_xfs_map_direct() set to true if we don't have a
DAX inode, or if the inode has shared extents?

> +
> +	return 0;
> +}
> +
>  /*
>   * Locking for serialisation of IO during page faults. This results in a lock
>   * ordering of:
> @@ -1024,6 +1054,7 @@ __xfs_filemap_fault(
>  	enum page_entry_size	pe_size,
>  	bool			write_fault)
>  {
> +	struct vm_area_struct	*vma = vmf->vma;
>  	struct inode		*inode = file_inode(vmf->vma->vm_file);

You missed converting this vmf->vma....

.....
>  
> +#define XFS_MAP_SUPPORTED (LEGACY_MAP_MASK | MAP_DIRECT)
> +
> +STATIC int
> +xfs_file_mmap_validate(
> +	struct file		*filp,
> +	struct vm_area_struct	*vma,
> +	unsigned long		map_flags,
> +	int			fd)
> +{
> +	struct inode		*inode = file_inode(filp);
> +	struct xfs_inode	*ip = XFS_I(inode);
> +	struct map_direct_state	*mds;
> +
> +	if (map_flags & ~(XFS_MAP_SUPPORTED))
> +		return -EOPNOTSUPP;
> +
> +	if ((map_flags & MAP_DIRECT) == 0)
> +		return xfs_file_mmap(filp, vma);
> +
> +	file_accessed(filp);
> +	vma->vm_ops = &xfs_file_vm_direct_ops;
> +	if (IS_DAX(inode))
> +		vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;

And if it isn't a DAX inode? What is MAP_DIRECT supposed to do then?

> +	mds = map_direct_register(fd, vma);
> +	if (IS_ERR(mds))
> +		return PTR_ERR(mds);
> +
> +	/* flush in-flight faults */
> +	xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
> +	xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);

Urk. That's nasty. And why is it even necessary? Please explain in
the comment, because it's not at all obvious to me...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 09/12] xfs: wire up ->lease_direct()
@ 2017-10-09  3:45     ` Dave Chinner
  0 siblings, 0 replies; 158+ messages in thread
From: Dave Chinner @ 2017-10-09  3:45 UTC (permalink / raw)
  To: Dan Williams
  Cc: J. Bruce Fields, Jan Kara, Darrick J. Wong, linux-rdma,
	linux-api, linux-nvdimm, linux-xfs, linux-mm, linux-fsdevel,
	Jeff Layton, Christoph Hellwig

On Fri, Oct 06, 2017 at 03:36:06PM -0700, Dan Williams wrote:
> A 'lease_direct' lease requires that the vma have a valid MAP_DIRECT
> mapping established. For xfs we establish a new lease and then check if
> the MAP_DIRECT mapping has been broken. We want to be sure that the
> process will receive notification that the MAP_DIRECT mapping is being
> torn down so it knows why other code paths are throwing failures.
> 
> For example in the RDMA/ibverbs case we want ibv_reg_mr() to fail if the
> MAP_DIRECT mapping is invalid or in the process of being invalidated.
> 
> Cc: Jan Kara <jack@suse.cz>
> Cc: Jeff Moyer <jmoyer@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> Cc: Jeff Layton <jlayton@poochiereds.net>
> Cc: "J. Bruce Fields" <bfields@fieldses.org>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  fs/xfs/xfs_file.c |   28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index e35518600e28..823b65f17429 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1166,6 +1166,33 @@ xfs_filemap_direct_close(
>  	put_map_direct_vma(vma->vm_private_data);
>  }
>  
> +static struct lease_direct *
> +xfs_filemap_direct_lease(
> +	struct vm_area_struct	*vma,
> +	void			(*break_fn)(void *),
> +	void			*owner)
> +{
> +	struct lease_direct	*ld;
> +
> +	ld = map_direct_lease(vma, break_fn, owner);
> +
> +	if (IS_ERR(ld))
> +		return ld;
> +
> +	/*
> +	 * We now have an established lease while the base MAP_DIRECT
> +	 * lease was not broken. So, we know that the "lease holder" will
> +	 * receive a SIGIO notification when the lease is broken and
> +	 * take any necessary cleanup actions.
> +	 */
> +	if (!is_map_direct_broken(vma->vm_private_data))
> +		return ld;
> +
> +	map_direct_lease_destroy(ld);
> +
> +	return ERR_PTR(-ENXIO);

What's any of this got to do with XFS? Shouldn't it be in generic
code, and called generic_filemap_direct_lease()?
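
i.e. lift it more or less verbatim (sketch; body unchanged from the
patch, comment trimmed):

	struct lease_direct *
	generic_filemap_direct_lease(struct vm_area_struct *vma,
			void (*break_fn)(void *), void *owner)
	{
		struct lease_direct *ld;

		ld = map_direct_lease(vma, break_fn, owner);
		if (IS_ERR(ld))
			return ld;
		if (!is_map_direct_broken(vma->vm_private_data))
			return ld;
		map_direct_lease_destroy(ld);
		return ERR_PTR(-ENXIO);
	}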

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v8] dma-mapping: introduce dma_get_iommu_domain()
@ 2017-10-09 10:37       ` Robin Murphy
  0 siblings, 0 replies; 158+ messages in thread
From: Robin Murphy @ 2017-10-09 10:37 UTC (permalink / raw)
  To: Dan Williams, linux-nvdimm
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, Dave Chinner, linux-xfs,
	linux-mm, linux-api, linux-fsdevel, David Woodhouse,
	Christoph Hellwig, Marek Szyprowski

Hi Dan,

On 08/10/17 04:45, Dan Williams wrote:
> Add a dma-mapping api helper to retrieve the generic iommu_domain for a device.
> The motivation for this interface is making RDMA transfers to DAX mappings
> safe. If the DAX file's block map changes we need to be able to reliably
> stop accesses to blocks that have been freed or re-assigned to a new file.

...which is also going to require some way to force the IOMMU drivers
(on x86 at least) to do a fully-synchronous unmap, instead of just
throwing the IOVA onto a flush queue to invalidate the TLBs at some
point in the future. Assuming of course that there's an IOMMU both
present and performing DMA translation in the first place.

> With the
> iommu_domain and a callback from the DAX filesystem the kernel can safely
> revoke access to a DMA device. The process that performed the RDMA memory
> registration is also notified of this revocation event, but the kernel cannot
> otherwise be put in the position of waiting for userspace to quiesce the device.

OK, but why reinvent iommu_get_domain_for_dev()?
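
For reference, the existing generic accessor referred to here is
declared in include/linux/iommu.h as:

	struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);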

> Since PMEM+DAX is currently only enabled for x86, we only update the x86
> iommu drivers.

Note in particular that those two drivers happen to be the *only* place
this approach could work - everyone else is going to have to fall back
to the generic IOMMU API function anyway.

Robin.

> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: David Woodhouse <dwmw2@infradead.org>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Jeff Moyer <jmoyer@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> Changes since v7:
> * retrieve the iommu_domain so that we can later pass the results of
>   dma_map_* to iommu_unmap() in advance of the actual dma_unmap_*.
> 
>  drivers/base/dma-mapping.c  |   10 ++++++++++
>  drivers/iommu/amd_iommu.c   |   10 ++++++++++
>  drivers/iommu/intel-iommu.c |   15 +++++++++++++++
>  include/linux/dma-mapping.h |    3 +++
>  4 files changed, 38 insertions(+)
> 
> diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
> index e584eddef0a7..fdb9764f95a4 100644
> --- a/drivers/base/dma-mapping.c
> +++ b/drivers/base/dma-mapping.c
> @@ -369,3 +369,13 @@ void dma_deconfigure(struct device *dev)
>  	of_dma_deconfigure(dev);
>  	acpi_dma_deconfigure(dev);
>  }
> +
> +struct iommu_domain *dma_get_iommu_domain(struct device *dev)
> +{
> +	const struct dma_map_ops *ops = get_dma_ops(dev);
> +
> +	if (ops && ops->get_iommu)
> +		return ops->get_iommu(dev);
> +	return NULL;
> +}
> +EXPORT_SYMBOL(dma_get_iommu_domain);
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 51f8215877f5..c8e1a45af182 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2271,6 +2271,15 @@ static struct protection_domain *get_domain(struct device *dev)
>  	return domain;
>  }
>  
> +static struct iommu_domain *amd_dma_get_iommu(struct device *dev)
> +{
> +	struct protection_domain *domain = get_domain(dev);
> +
> +	if (IS_ERR(domain))
> +		return NULL;
> +	return &domain->domain;
> +}
> +
>  static void update_device_table(struct protection_domain *domain)
>  {
>  	struct iommu_dev_data *dev_data;
> @@ -2689,6 +2698,7 @@ static const struct dma_map_ops amd_iommu_dma_ops = {
>  	.unmap_sg	= unmap_sg,
>  	.dma_supported	= amd_iommu_dma_supported,
>  	.mapping_error	= amd_iommu_mapping_error,
> +	.get_iommu	= amd_dma_get_iommu,
>  };
>  
>  static int init_reserved_iova_ranges(void)
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 6784a05dd6b2..f3f4939cebad 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -3578,6 +3578,20 @@ static int iommu_no_mapping(struct device *dev)
>  	return 0;
>  }
>  
> +static struct iommu_domain *intel_dma_get_iommu(struct device *dev)
> +{
> +	struct dmar_domain *domain;
> +
> +	if (iommu_no_mapping(dev))
> +		return NULL;
> +
> +	domain = get_valid_domain_for_dev(dev);
> +	if (!domain)
> +		return NULL;
> +
> +	return &domain->domain;
> +}
> +
>  static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
>  				     size_t size, int dir, u64 dma_mask)
>  {
> @@ -3872,6 +3886,7 @@ const struct dma_map_ops intel_dma_ops = {
>  	.map_page = intel_map_page,
>  	.unmap_page = intel_unmap_page,
>  	.mapping_error = intel_mapping_error,
> +	.get_iommu = intel_dma_get_iommu,
>  #ifdef CONFIG_X86
>  	.dma_supported = x86_dma_supported,
>  #endif
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index 29ce9815da87..aa62df1d0d72 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -128,6 +128,7 @@ struct dma_map_ops {
>  				   enum dma_data_direction dir);
>  	int (*mapping_error)(struct device *dev, dma_addr_t dma_addr);
>  	int (*dma_supported)(struct device *dev, u64 mask);
> +	struct iommu_domain *(*get_iommu)(struct device *dev);
>  #ifdef ARCH_HAS_DMA_GET_REQUIRED_MASK
>  	u64 (*get_required_mask)(struct device *dev);
>  #endif
> @@ -221,6 +222,8 @@ static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
>  }
>  #endif
>  
> +extern struct iommu_domain *dma_get_iommu_domain(struct device *dev);
> +
>  static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
>  					      size_t size,
>  					      enum dma_data_direction dir,
> 

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 06/12] xfs: wire up MAP_DIRECT
  2017-10-09  3:40     ` Dave Chinner
@ 2017-10-09 17:08       ` Dan Williams
  -1 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-09 17:08 UTC (permalink / raw)
  To: Dave Chinner
  Cc: linux-nvdimm, linux-xfs, Jan Kara, Arnd Bergmann,
	Darrick J. Wong, linux-rdma, Linux API, Christoph Hellwig,
	J. Bruce Fields, Linux MM, Jeff Moyer, Alexander Viro,
	linux-fsdevel, Jeff Layton, Ross Zwisler

On Sun, Oct 8, 2017 at 8:40 PM, Dave Chinner <david@fromorbit.com> wrote:

Thanks for the review, Dave.

> On Fri, Oct 06, 2017 at 03:35:49PM -0700, Dan Williams wrote:
>> MAP_DIRECT is an mmap(2) flag with the following semantics:
>>
>>   MAP_DIRECT
>>   When specified with MAP_SHARED_VALIDATE, sets up a file lease with the
>>   same lifetime as the mapping. Unlike a typical F_RDLCK lease this lease
>>   is broken when a "lease breaker" attempts to write(2), change the block
>>   map (fallocate), or change the size of the file. Otherwise the mechanism
>>   of a lease break is identical to the typical lease break case where the
>>   lease needs to be removed (munmap) within the number of seconds
>>   specified by /proc/sys/fs/lease-break-time. If the lease holder fails to
>>   remove the lease in time the kernel will invalidate the mapping and
>>   force all future accesses to the mapping to trigger SIGBUS.
>>
>>   In addition to lease break timeouts causing faults in the mapping to
>>   result in SIGBUS, other states of the file will trigger SIGBUS at fault
>>   time:
>>
>>       * The file is not DAX capable
>>       * The file has reflinked (copy-on-write) blocks
>>       * The fault would trigger the filesystem to allocate blocks
>>       * The fault would trigger the filesystem to perform extent conversion
>>
>>   In other words, MAP_DIRECT expects and enforces a fully allocated file
>>   where faults can be satisfied without modifying block map metadata.
>>
>>   An unprivileged process may establish a MAP_DIRECT mapping on a file
>>   whose UID (owner) matches the filesystem UID of the process. A process
>>   with the CAP_LEASE capability may establish a MAP_DIRECT mapping on
>>   arbitrary files.
>>
>>   ERRORS
>>   EACCES Beyond the typical mmap(2) conditions that trigger EACCES
>>   MAP_DIRECT also requires the permission to set a file lease.
>>
>>   EOPNOTSUPP The filesystem explicitly does not support the flag
>>
>>   SIGBUS Attempted to write a MAP_DIRECT mapping at a file offset that
>>          might require block-map updates, or the lease timed out and the
>>          kernel invalidated the mapping.
>>
>> Cc: Jan Kara <jack@suse.cz>
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Cc: Jeff Moyer <jmoyer@redhat.com>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Dave Chinner <david@fromorbit.com>
>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
>> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
>> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
>> Cc: Jeff Layton <jlayton@poochiereds.net>
>> Cc: "J. Bruce Fields" <bfields@fieldses.org>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>  fs/xfs/Kconfig                  |    2 -
>>  fs/xfs/xfs_file.c               |  102 +++++++++++++++++++++++++++++++++++++++
>>  include/linux/mman.h            |    3 +
>>  include/uapi/asm-generic/mman.h |    1
>>  4 files changed, 106 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/xfs/Kconfig b/fs/xfs/Kconfig
>> index f62fc6629abb..f8765653a438 100644
>> --- a/fs/xfs/Kconfig
>> +++ b/fs/xfs/Kconfig
>> @@ -112,4 +112,4 @@ config XFS_ASSERT_FATAL
>>
>>  config XFS_LAYOUT
>>       def_bool y
>> -     depends on EXPORTFS_BLOCK_OPS
>> +     depends on EXPORTFS_BLOCK_OPS || FS_DAX
>> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
>> index ebdd0bd2b261..e35518600e28 100644
>> --- a/fs/xfs/xfs_file.c
>> +++ b/fs/xfs/xfs_file.c
>> @@ -40,12 +40,22 @@
>>  #include "xfs_iomap.h"
>>  #include "xfs_reflink.h"
>>
>> +#include <linux/mman.h>
>>  #include <linux/dcache.h>
>>  #include <linux/falloc.h>
>>  #include <linux/pagevec.h>
>> +#include <linux/mapdirect.h>
>>  #include <linux/backing-dev.h>
>>
>>  static const struct vm_operations_struct xfs_file_vm_ops;
>> +static const struct vm_operations_struct xfs_file_vm_direct_ops;
>> +
>> +static inline bool
>> +is_xfs_map_direct(
>> +             struct vm_area_struct *vma)
>> +{
>> +     return vma->vm_ops == &xfs_file_vm_direct_ops;
>> +}
>
> Namespacing (xfs_vma_is_direct) and whitespace damage.

Will fix.

>
>>
>>  /*
>>   * Clear the specified ranges to zero through either the pagecache or DAX.
>> @@ -1008,6 +1018,26 @@ xfs_file_llseek(
>>       return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
>>  }
>>
>> +static int
>> +xfs_vma_checks(
>> +     struct vm_area_struct   *vma,
>> +     struct inode            *inode)
>
> Exactly what are we checking for - function name doesn't tell me,
> and there's no comments, either?

Ok, I'll improve this.

>
>> +{
>> +     if (!is_xfs_map_direct(vma))
>> +             return 0;
>> +
>> +     if (!is_map_direct_valid(vma->vm_private_data))
>> +             return VM_FAULT_SIGBUS;
>> +
>> +     if (xfs_is_reflink_inode(XFS_I(inode)))
>> +             return VM_FAULT_SIGBUS;
>> +
>> +     if (!IS_DAX(inode))
>> +             return VM_FAULT_SIGBUS;
>
> And how do we get is_xfs_map_direct() set to true if we don't have a
> DAX inode or the inode has shared extents?

So, this was my way of trying to satisfy the request you made here:

    https://lkml.org/lkml/2017/8/11/876

i.e. allow MAP_DIRECT on non-DAX files to enable a use case of
freezing the block-map to examine which file extents are linked. If
you don't want to use MAP_DIRECT for this, we can move these checks to
mmap time.

>
>> +
>> +     return 0;
>> +}
>> +
>>  /*
>>   * Locking for serialisation of IO during page faults. This results in a lock
>>   * ordering of:
>> @@ -1024,6 +1054,7 @@ __xfs_filemap_fault(
>>       enum page_entry_size    pe_size,
>>       bool                    write_fault)
>>  {
>> +     struct vm_area_struct   *vma = vmf->vma;
>>       struct inode            *inode = file_inode(vmf->vma->vm_file);
>
> You missed this vmf->vma....
>
> .....
>>
>> +#define XFS_MAP_SUPPORTED (LEGACY_MAP_MASK | MAP_DIRECT)
>> +
>> +STATIC int
>> +xfs_file_mmap_validate(
>> +     struct file             *filp,
>> +     struct vm_area_struct   *vma,
>> +     unsigned long           map_flags,
>> +     int                     fd)
>> +{
>> +     struct inode            *inode = file_inode(filp);
>> +     struct xfs_inode        *ip = XFS_I(inode);
>> +     struct map_direct_state *mds;
>> +
>> +     if (map_flags & ~(XFS_MAP_SUPPORTED))
>> +             return -EOPNOTSUPP;
>> +
>> +     if ((map_flags & MAP_DIRECT) == 0)
>> +             return xfs_file_mmap(filp, vma);
>> +
>> +     file_accessed(filp);
>> +     vma->vm_ops = &xfs_file_vm_direct_ops;
>> +     if (IS_DAX(inode))
>> +             vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;
>
> And if it isn't a DAX inode? what is MAP_DIRECT supposed to do then?

In the non-DAX case it just takes the FL_LAYOUT file lease... although
we could also just have an fcntl for that purpose. The use case of
just freezing the block map does not need a mapping.
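
For illustration, a minimal userspace consumer under these semantics
would look something like the sketch below. MAP_DIRECT is the flag this
series proposes, so the value here is only a placeholder, and error
handling is abbreviated:

#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_DIRECT
#define MAP_DIRECT 0x08	/* placeholder; the real value comes from the uapi headers */
#endif

int main(void)
{
	size_t len = 2UL << 20;
	void *addr;
	int fd;

	fd = open("/mnt/pmem/data", O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/*
	 * MAP_SHARED_VALIDATE makes the kernel reject unknown mmap
	 * flags instead of silently ignoring them, so a filesystem
	 * without MAP_DIRECT support fails here with EOPNOTSUPP.
	 */
	addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			MAP_SHARED_VALIDATE | MAP_DIRECT, fd, 0);
	if (addr == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/*
	 * Stores to addr can now be flushed from userspace. A lease
	 * breaker triggers SIGIO, and once lease-break-time expires
	 * further faults raise SIGBUS.
	 */

	munmap(addr, len);
	close(fd);
	return 0;
}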

>> +     mds = map_direct_register(fd, vma);
>> +     if (IS_ERR(mds))
>> +             return PTR_ERR(mds);
>> +
>> +     /* flush in-flight faults */
>> +     xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
>> +     xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
>
> Urk. That's nasty. And why is it even necessary? Please explain why
> this is necessary in the comment, because it's not at all obvious to
> me...

This is related to your other observation about i_mapdcount and adding
an iomap_can_allocate() helper. I think I can clean both of these up
by using a call to break_layout(inode, false) and bailing in
->iomap_begin() if it returns EWOULDBLOCK. This would also fix the
current problem that allocating write-faults don't start the lease
break process.
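
Roughly, the ->iomap_begin() side of that plan would be a check along
these lines (a sketch of the intent only, using the existing
break_layout() helper from include/linux/fs.h):

/*
 * Refuse allocating writes while a MAP_DIRECT lease is present.
 * break_layout(inode, false) starts the lease break process and
 * returns -EWOULDBLOCK without waiting for it to complete.
 */
if (flags & IOMAP_WRITE) {
	error = break_layout(inode, false);
	if (error == -EWOULDBLOCK)
		return error;
}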

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 09/12] xfs: wire up ->lease_direct()
  2017-10-09  3:45     ` Dave Chinner
@ 2017-10-09 17:10       ` Dan Williams
  -1 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-09 17:10 UTC (permalink / raw)
  To: Dave Chinner
  Cc: linux-nvdimm, linux-xfs, Jan Kara, Darrick J. Wong, linux-rdma,
	Linux API, Christoph Hellwig, J. Bruce Fields, Linux MM,
	Jeff Moyer, linux-fsdevel, Jeff Layton, Ross Zwisler

On Sun, Oct 8, 2017 at 8:45 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Fri, Oct 06, 2017 at 03:36:06PM -0700, Dan Williams wrote:
>> A 'lease_direct' lease requires that the vma have a valid MAP_DIRECT
>> mapping established. For xfs we establish a new lease and then check if
>> the MAP_DIRECT mapping has been broken. We want to be sure that the
>> process will receive notification that the MAP_DIRECT mapping is being
>> torn down so it knows why other code paths are throwing failures.
>>
>> For example in the RDMA/ibverbs case we want ibv_reg_mr() to fail if the
>> MAP_DIRECT mapping is invalid or in the process of being invalidated.
>>
>> Cc: Jan Kara <jack@suse.cz>
>> Cc: Jeff Moyer <jmoyer@redhat.com>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Dave Chinner <david@fromorbit.com>
>> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
>> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
>> Cc: Jeff Layton <jlayton@poochiereds.net>
>> Cc: "J. Bruce Fields" <bfields@fieldses.org>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>  fs/xfs/xfs_file.c |   28 ++++++++++++++++++++++++++++
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
>> index e35518600e28..823b65f17429 100644
>> --- a/fs/xfs/xfs_file.c
>> +++ b/fs/xfs/xfs_file.c
>> @@ -1166,6 +1166,33 @@ xfs_filemap_direct_close(
>>       put_map_direct_vma(vma->vm_private_data);
>>  }
>>
>> +static struct lease_direct *
>> +xfs_filemap_direct_lease(
>> +     struct vm_area_struct   *vma,
>> +     void                    (*break_fn)(void *),
>> +     void                    *owner)
>> +{
>> +     struct lease_direct     *ld;
>> +
>> +     ld = map_direct_lease(vma, break_fn, owner);
>> +
>> +     if (IS_ERR(ld))
>> +             return ld;
>> +
>> +     /*
>> +      * We now have an established lease while the base MAP_DIRECT
>> +      * lease was not broken. So, we know that the "lease holder" will
>> +      * receive a SIGIO notification when the lease is broken and
>> +      * take any necessary cleanup actions.
>> +      */
>> +     if (!is_map_direct_broken(vma->vm_private_data))
>> +             return ld;
>> +
>> +     map_direct_lease_destroy(ld);
>> +
>> +     return ERR_PTR(-ENXIO);
>
> What's any of this got to do with XFS? Shouldn't it be in generic
> code, and called generic_filemap_direct_lease()?

True, I can move this to generic code. The filesystem is in charge of
where it wants to store the 'struct map_direct_state' context, but for
generic_filemap_direct_lease() it can just assume that it is stored in
->vm_private_data. I'll add comments to this effect on the new
routine.
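
Concretely, the generic helper would be little more than the function
quoted above with the xfs_ prefix dropped; a sketch, assuming the
map_direct_state is kept in ->vm_private_data as described:

struct lease_direct *
generic_filemap_direct_lease(struct vm_area_struct *vma,
		void (*break_fn)(void *), void *owner)
{
	struct lease_direct *ld;

	ld = map_direct_lease(vma, break_fn, owner);
	if (IS_ERR(ld))
		return ld;

	/*
	 * The lease was established while the base MAP_DIRECT lease
	 * was unbroken, so the lease holder will receive a SIGIO
	 * notification when it is broken and can take any necessary
	 * cleanup actions.
	 */
	if (!is_map_direct_broken(vma->vm_private_data))
		return ld;

	map_direct_lease_destroy(ld);
	return ERR_PTR(-ENXIO);
}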

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v8] dma-mapping: introduce dma_get_iommu_domain()
  2017-10-09 10:37       ` Robin Murphy
@ 2017-10-09 17:32         ` Dan Williams
  -1 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-09 17:32 UTC (permalink / raw)
  To: Robin Murphy
  Cc: linux-nvdimm, Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, Dave Chinner, linux-xfs,
	Linux MM, Jeff Moyer, Linux API, linux-fsdevel, Ross Zwisler,
	David Woodhouse, Christoph Hellwig, Marek Szyprowski

On Mon, Oct 9, 2017 at 3:37 AM, Robin Murphy <robin.murphy@arm.com> wrote:
> Hi Dan,
>
> On 08/10/17 04:45, Dan Williams wrote:
>> Add a dma-mapping api helper to retrieve the generic iommu_domain for a device.
>> The motivation for this interface is making RDMA transfers to DAX mappings
>> safe. If the DAX file's block map changes we need to be able to reliably stop
>> accesses to blocks that have been freed or re-assigned to a new file.
>
> ...which is also going to require some way to force the IOMMU drivers
> (on x86 at least) to do a fully-synchronous unmap, instead of just
> throwing the IOVA onto a flush queue to invalidate the TLBs at some
> point in the future.

Isn't that the difference between iommu_unmap() and
iommu_unmap_fast()? As far as I can tell amd-iommu and intel-iommu
both flush iotlbs on iommu_unmap() and don't support fast unmaps.

> Assuming of course that there's an IOMMU both
> present and performing DMA translation in the first place.

That's why I want to call through the dma api to see if the iommu is
being used to satisfy dma mappings.

>> With the
>> iommu_domain and a callback from the DAX filesystem the kernel can safely
>> revoke access to a DMA device. The process that performed the RDMA memory
>> registration is also notified of this revocation event, but the kernel can not
>> otherwise be in the position of waiting for userspace to quiesce the device.
>
> OK, but why reinvent iommu_get_domain_for_dev()?

How do I know if the iommu returned from that routine is the one being
used for dma mapping operations for the device? Specifically, how
would I discover that the result of dma_map_sg() can be passed as an
IOVA range to iommu_unmap()?
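
For concreteness, the pairing in question would look roughly like this
(a sketch built around the dma_get_iommu_domain() helper proposed in
this thread; dev/sgl/count are stand-ins, and whether the dma_map_sg()
addresses are in fact valid IOVAs for iommu_unmap() is exactly the
open question):

	struct iommu_domain *domain = dma_get_iommu_domain(dev);
	struct scatterlist *sg;
	int nents, i;

	nents = dma_map_sg(dev, sgl, count, DMA_BIDIRECTIONAL);

	/* ...later, the block map changes and access must be revoked
	 * ahead of the eventual memory un-registration... */
	for_each_sg(sgl, sg, nents, i)
		iommu_unmap(domain, sg_dma_address(sg), sg_dma_len(sg));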

>> Since PMEM+DAX is currently only enabled for x86, we only update the x86
>> iommu drivers.
>
> Note in particular that those two drivers happen to be the *only* place
> this approach could work - everyone else is going to have to fall back
> to the generic IOMMU API function anyway.

I want to make this functionality generic, but I'm not familiar with
the iommu sub-system. How are dma mapping operations routed to the
iommu driver in those other implementations?

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-09 18:58     ` Jason Gunthorpe
  0 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2017-10-09 18:58 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	Robin Murphy, linux-xfs, linux-mm, linux-api, linux-fsdevel,
	David Woodhouse, Christoph Hellwig, Marek Szyprowski

On Fri, Oct 06, 2017 at 03:35:54PM -0700, Dan Williams wrote:
> otherwise be quiesced. The need for this knowledge is driven by a need
> to make RDMA transfers to DAX mappings safe. If the DAX file's block map
> changes we need to be able to reliably stop accesses to blocks that have been
> freed or re-assigned to a new file.

If RDMA is driving this need, why not invalidate backing RDMA MRs
instead of requiring an IOMMU to do it? RDMA MRs are finer grained and
do not suffer from the re-use problem David W. brought up with IOVAs..

Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
  2017-10-09 18:58     ` Jason Gunthorpe
@ 2017-10-09 19:05       ` Dan Williams
  -1 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-09 19:05 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-nvdimm, Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, Dave Chinner, linux-xfs,
	Linux MM, Jeff Moyer, Linux API, linux-fsdevel, Ross Zwisler,
	David Woodhouse, Robin Murphy, Christoph Hellwig,
	Marek Szyprowski

On Mon, Oct 9, 2017 at 11:58 AM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Fri, Oct 06, 2017 at 03:35:54PM -0700, Dan Williams wrote:
>> otherwise be quiesced. The need for this knowledge is driven by a need
>> to make RDMA transfers to DAX mappings safe. If the DAX file's block map
>> changes we need to be able to reliably stop accesses to blocks that have been
>> freed or re-assigned to a new file.
>
> If RDMA is driving this need, why not invalidate backing RDMA MRs
> instead of requiring an IOMMU to do it? RDMA MRs are finer grained and
> do not suffer from the re-use problem David W. brought up with IOVAs..

Sounds promising. All I want in the end is to be sure that the kernel
is able to stop any in-flight RDMA at will without asking
userspace. Does this require per-RDMA driver opt-in or is there a
common call that can be made?

Outside of that, the re-use problem is already solved by just unmapping
(iommu_unmap()) the IOVA, but keeping it allocated until the eventual
dma_unmap_sg() at memory un-registration time frees it.
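
In sketch form, with domain/iova/size standing in for values saved at
registration time (again assuming the dma_map_sg() result is a usable
IOVA):

	/* lease break: tear down the translation now, so DMA to the
	 * freed blocks faults in the IOMMU instead of hitting re-used
	 * pages */
	iommu_unmap(domain, iova, size);

	/* the IOVA range stays allocated and cannot be handed out
	 * again; it is only released when dma_unmap_sg() runs at
	 * un-registration time */
	dma_unmap_sg(dev, sgl, nents, DMA_BIDIRECTIONAL);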

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-09 19:18         ` Jason Gunthorpe
  0 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2017-10-09 19:18 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	Robin Murphy, linux-xfs, Linux MM, Linux API, linux-fsdevel,
	David Woodhouse, Christoph Hellwig, Marek Szyprowski

On Mon, Oct 09, 2017 at 12:05:30PM -0700, Dan Williams wrote:
> On Mon, Oct 9, 2017 at 11:58 AM, Jason Gunthorpe
> <jgunthorpe@obsidianresearch.com> wrote:
> > On Fri, Oct 06, 2017 at 03:35:54PM -0700, Dan Williams wrote:
> >> otherwise be quiesced. The need for this knowledge is driven by a need
> >> to make RDMA transfers to DAX mappings safe. If the DAX file's block map
> >> changes we need to be able to reliably stop accesses to blocks that have been
> >> freed or re-assigned to a new file.
> >
> > If RDMA is driving this need, why not invalidate backing RDMA MRs
> > instead of requiring an IOMMU to do it? RDMA MRs are finer grained and
> > do not suffer from the re-use problem David W. brought up with IOVAs..
> 
> Sounds promising. All I want in the end is to be sure that the kernel
> is able to stop any in-flight RDMA at will without asking
> userspace. Does this require per-RDMA driver opt-in or is there a
> common call that can be made?

I don't think this has ever come up in the context of an all-device MR
invalidate requirement. Drivers already have code to invalidate
specific MRs, but to find all MRs that touch certain pages and then
invalidate them would be new code.

We also have ODP-aware drivers that can retarget an MR to new
physical pages. If the block map changes, DAX should synchronously
retarget the ODP MR, not halt DMA.

Most likely ODP & DAX would need to be used together to get robust
user applications, as having the user QPs go to an error state at
random times (due to DMA failures) during operation is never going to
be acceptable...

Perhaps you might want to initially only support ODP MR mappings with
DAX and then the DMA fencing issue goes away?
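
For reference, from userspace an ODP registration is just an extra
access flag to libibverbs (a sketch; IBV_ACCESS_ON_DEMAND needs a
device and kernel with ODP support, and the helper name is made up):

#include <stdio.h>
#include <infiniband/verbs.h>

static struct ibv_mr *reg_odp_mr(struct ibv_pd *pd, void *addr,
		size_t length)
{
	/* pages are faulted in on demand and can be re-targeted by
	 * the kernel, so there is no pinned block map to fence */
	struct ibv_mr *mr = ibv_reg_mr(pd, addr, length,
			IBV_ACCESS_LOCAL_WRITE |
			IBV_ACCESS_REMOTE_READ |
			IBV_ACCESS_REMOTE_WRITE |
			IBV_ACCESS_ON_DEMAND);

	if (!mr)
		fprintf(stderr, "ODP registration not supported\n");
	return mr;
}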

Cheers,
Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-09 19:28           ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-09 19:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	Robin Murphy, linux-xfs, Linux MM, Linux API, linux-fsdevel,
	David Woodhouse, Christoph Hellwig, Marek Szyprowski

On Mon, Oct 9, 2017 at 12:18 PM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Mon, Oct 09, 2017 at 12:05:30PM -0700, Dan Williams wrote:
>> On Mon, Oct 9, 2017 at 11:58 AM, Jason Gunthorpe
>> <jgunthorpe@obsidianresearch.com> wrote:
>> > On Fri, Oct 06, 2017 at 03:35:54PM -0700, Dan Williams wrote:
>> >> otherwise be quiesced. The need for this knowledge is driven by a need
>> >> to make RDMA transfers to DAX mappings safe. If the DAX file's block map
>> >> changes we need to be able to reliably stop accesses to blocks that have been
>> >> freed or re-assigned to a new file.
>> >
>> > If RDMA is driving this need, why not invalidate backing RDMA MRs
>> > instead of requiring an IOMMU to do it? RDMA MRs are finer grained and
>> > do not suffer from the re-use problem David W. brought up with IOVAs..
>>
>> Sounds promising. All I want in the end is to be sure that the kernel
>> is able to stop any in-flight RDMA at will without asking
>> userspace. Does this require per-RDMA driver opt-in or is there a
>> common call that can be made?
>
> I don't think this has ever come up in the context of an all-device MR
> invalidate requirement. Drivers already have code to invalidate
> specific MRs, but to find all MRs that touch certain pages and then
> invalidate them would be new code.
>
> We also have ODP-aware drivers that can retarget an MR to new
> physical pages. If the block map changes, DAX should synchronously
> retarget the ODP MR, not halt DMA.

Have a look at the patch [1]; I don't touch the ODP path.

> Most likely ODP & DAX would need to be used together to get robust
> user applications, as having the user QPs go to an error state at
> random times (due to DMA failures) during operation is never going to
> be acceptable...

It's not random. The process that set up the mapping and registered
the memory gets SIGIO when someone else tries to modify the file map.
That process then gets /proc/sys/fs/lease-break-time seconds to fix
the problem before the kernel force-revokes the DMA access.

It's otherwise not acceptable to allow DMA into random locations when
the file map changes.
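
From the application's side that flow is roughly the following (a
sketch: watch_lease() and the cleanup policy are application-specific;
only the SIGIO delivery and the lease-break-time budget come from the
lease code):

#include <signal.h>

static volatile sig_atomic_t lease_broken;

static void sigio_handler(int sig)
{
	/* the kernel has started breaking the MAP_DIRECT lease */
	lease_broken = 1;
}

int watch_lease(void)
{
	struct sigaction sa = { .sa_handler = sigio_handler };

	/* once lease_broken fires, the app must quiesce RDMA,
	 * deregister the MR, and munmap() within
	 * /proc/sys/fs/lease-break-time seconds, or the kernel
	 * revokes DMA access on its own */
	return sigaction(SIGIO, &sa, NULL);
}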

> Perhaps you might want to initially only support ODP MR mappings with
> DAX and then the DMA fencing issue goes away?

I'd rather try to fix the non-ODP DAX case instead of just turning it off.

[1]: https://patchwork.kernel.org/patch/9991681/

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 06/12] xfs: wire up MAP_DIRECT
  2017-10-09 17:08       ` Dan Williams
@ 2017-10-09 22:50         ` Dave Chinner
  -1 siblings, 0 replies; 158+ messages in thread
From: Dave Chinner @ 2017-10-09 22:50 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-nvdimm, linux-xfs, Jan Kara, Arnd Bergmann,
	Darrick J. Wong, linux-rdma, Linux API, Christoph Hellwig,
	J. Bruce Fields, Linux MM, Jeff Moyer, Alexander Viro,
	linux-fsdevel, Jeff Layton, Ross Zwisler

On Mon, Oct 09, 2017 at 10:08:40AM -0700, Dan Williams wrote:
> On Sun, Oct 8, 2017 at 8:40 PM, Dave Chinner <david@fromorbit.com> wrote:
> >>
> >>  /*
> >>   * Clear the specified ranges to zero through either the pagecache or DAX.
> >> @@ -1008,6 +1018,26 @@ xfs_file_llseek(
> >>       return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
> >>  }
> >>
> >> +static int
> >> +xfs_vma_checks(
> >> +     struct vm_area_struct   *vma,
> >> +     struct inode            *inode)
> >
> > Exactly what are we checking for - function name doesn't tell me,
> > and there's no comments, either?
> 
> Ok, I'll improve this.
> 
> >
> >> +{
> >> +     if (!is_xfs_map_direct(vma))
> >> +             return 0;
> >> +
> >> +     if (!is_map_direct_valid(vma->vm_private_data))
> >> +             return VM_FAULT_SIGBUS;
> >> +
> >> +     if (xfs_is_reflink_inode(XFS_I(inode)))
> >> +             return VM_FAULT_SIGBUS;
> >> +
> >> +     if (!IS_DAX(inode))
> >> +             return VM_FAULT_SIGBUS;
> >
> > And how do we get is_xfs_map_direct() set to true if we don't have a
> > DAX inode or the inode has shared extents?
> 
> So, this was my way of trying to satisfy the request you made here:
> 
>     https://lkml.org/lkml/2017/8/11/876
> 
> i.e. allow MAP_DIRECT on non-dax files to enable a use case of
> freezing the block-map to examine which file extents are linked. If
> you don't want to use MAP_DIRECT for this, we can move these checks to
> mmap time.

Ok, but I don't want to use mmap to deal with this, nor do I care
whether DAX is in use or not. So I don't think this is really
necessary for MAP_DIRECT.


> >> +xfs_file_mmap_validate(
> >> +     struct file             *filp,
> >> +     struct vm_area_struct   *vma,
> >> +     unsigned long           map_flags,
> >> +     int                     fd)
> >> +{
> >> +     struct inode            *inode = file_inode(filp);
> >> +     struct xfs_inode        *ip = XFS_I(inode);
> >> +     struct map_direct_state *mds;
> >> +
> >> +     if (map_flags & ~(XFS_MAP_SUPPORTED))
> >> +             return -EOPNOTSUPP;
> >> +
> >> +     if ((map_flags & MAP_DIRECT) == 0)
> >> +             return xfs_file_mmap(filp, vma);
> >> +
> >> +     file_accessed(filp);
> >> +     vma->vm_ops = &xfs_file_vm_direct_ops;
> >> +     if (IS_DAX(inode))
> >> +             vma->vm_flags |= VM_MIXEDMAP | VM_HUGEPAGE;
> >
> > And if it isn't a DAX inode? what is MAP_DIRECT supposed to do then?
> 
> In the non-DAX case it just takes the FL_LAYOUT file lease... although
> we could also just have an fcntl for that purpose. The use case of
> just freezing the block map does not need a mapping.

Right, so I think we should just add an fcntl for the non-DAX case I
have in mind, and not complicate the MAP_DIRECT implementation right
now.  We can always extend the scope of MAP_DIRECT in future if we
actually need to do so.

> >> +     mds = map_direct_register(fd, vma);
> >> +     if (IS_ERR(mds))
> >> +             return PTR_ERR(mds);
> >> +
> >> +     /* flush in-flight faults */
> >> +     xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
> >> +     xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
> >
> > Urk. That's nasty. And why is it even necessary? Please explain why
> > this is necessary in the comment, because it's not at all obvious to
> > me...
> 
> This is related to your other observation about i_mapdcount and adding
> an iomap_can_allocate() helper. I think I can clean both of these up
> by using a call to break_layout(inode, false) and bailing in
> ->iomap_begin() if it returns EWOULDBLOCK. This would also fix the
> current problem that allocating write-faults don't start the lease
> break process.
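
A sketch of that direction (xfs_iomap_check_lease() is a hypothetical
helper name; break_layout() is the existing FL_LAYOUT helper and, with
a false 'wait' argument, returns -EWOULDBLOCK while a lease is still
outstanding):

/* called from ->iomap_begin() before any allocating write */
static int
xfs_iomap_check_lease(struct inode *inode, unsigned flags)
{
	if (!(flags & IOMAP_WRITE))
		return 0;

	/*
	 * Non-blocking lease break: notifies the MAP_DIRECT lease
	 * holder and fails until the lease is gone, so allocating
	 * write-faults both start the lease break and cannot change
	 * the block map underneath the mapping.
	 */
	return break_layout(inode, false);
}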

OK.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v8] dma-mapping: introduce dma_get_iommu_domain()
  2017-10-08  3:45     ` Dan Williams
@ 2017-10-10 14:40       ` Raj, Ashok
  -1 siblings, 0 replies; 158+ messages in thread
From: Raj, Ashok @ 2017-10-10 14:40 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-nvdimm, Jan Kara, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, Dave Chinner, linux-xfs,
	linux-mm, Jeff Moyer, linux-api, linux-fsdevel, Ross Zwisler,
	David Woodhouse, Robin Murphy, Christoph Hellwig,
	Marek Szyprowski

Hi Dan

On Sat, Oct 07, 2017 at 08:45:00PM -0700, Dan Williams wrote:
> Add a dma-mapping api helper to retrieve the generic iommu_domain for a device.
> The motivation for this interface is making RDMA transfers to DAX mappings
> safe. If the DAX file's block map changes we need to be able to reliably stop
> accesses to blocks that have been freed or re-assigned to a new file. With the
> iommu_domain and a callback from the DAX filesystem the kernel can safely
> revoke access to a DMA device. The process that performed the RDMA memory
> registration is also notified of this revocation event, but the kernel can not
> otherwise be in the position of waiting for userspace to quiesce the device.
> 
> Since PMEM+DAX is currently only enabled for x86, we only update the x86
> iommu drivers.
> 
> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: David Woodhouse <dwmw2@infradead.org>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Jeff Moyer <jmoyer@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> Changes since v7:
> * retrieve the iommu_domain so that we can later pass the results of
>   dma_map_* to iommu_unmap() in advance of the actual dma_unmap_*.
> 
>  drivers/base/dma-mapping.c  |   10 ++++++++++
>  drivers/iommu/amd_iommu.c   |   10 ++++++++++
>  drivers/iommu/intel-iommu.c |   15 +++++++++++++++
>  include/linux/dma-mapping.h |    3 +++
>  4 files changed, 38 insertions(+)
> 
> diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
> index e584eddef0a7..fdb9764f95a4 100644
> --- a/drivers/base/dma-mapping.c
> +++ b/drivers/base/dma-mapping.c
> @@ -369,3 +369,13 @@ void dma_deconfigure(struct device *dev)
>  	of_dma_deconfigure(dev);
>  	acpi_dma_deconfigure(dev);
>  }
> +
> +struct iommu_domain *dma_get_iommu_domain(struct device *dev)
> +{
> +	const struct dma_map_ops *ops = get_dma_ops(dev);
> +
> +	if (ops && ops->get_iommu)
> +		return ops->get_iommu(dev);
> +	return NULL;
> +}
> +EXPORT_SYMBOL(dma_get_iommu_domain);
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 51f8215877f5..c8e1a45af182 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2271,6 +2271,15 @@ static struct protection_domain *get_domain(struct device *dev)
>  	return domain;
>  }
>  
> +static struct iommu_domain *amd_dma_get_iommu(struct device *dev)

Minor: Do you want to keep the naming consistent, e.g. amd_dma_get_domain() vs
get_iommu?

> +{
> +	struct protection_domain *domain = get_domain(dev);
> +
> +	if (IS_ERR(domain))
> +		return NULL;
> +	return &domain->domain;
> +}
> +
>  static void update_device_table(struct protection_domain *domain)
>  {
>  	struct iommu_dev_data *dev_data;
> @@ -2689,6 +2698,7 @@ static const struct dma_map_ops amd_iommu_dma_ops = {
>  	.unmap_sg	= unmap_sg,
>  	.dma_supported	= amd_iommu_dma_supported,
>  	.mapping_error	= amd_iommu_mapping_error,
> +	.get_iommu	= amd_dma_get_iommu,

Ditto here and at the other places below:

>  };
>  
>  static int init_reserved_iova_ranges(void)
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 6784a05dd6b2..f3f4939cebad 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -3578,6 +3578,20 @@ static int iommu_no_mapping(struct device *dev)
>  	return 0;
>  }
>  
> +static struct iommu_domain *intel_dma_get_iommu(struct device *dev)
> +{
> +	struct dmar_domain *domain;
> +
> +	if (iommu_no_mapping(dev))
> +		return NULL;
> +
> +	domain = get_valid_domain_for_dev(dev);
> +	if (!domain)
> +		return NULL;
> +
> +	return &domain->domain;
> +}
> +
>  static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
>  				     size_t size, int dir, u64 dma_mask)
>  {
> @@ -3872,6 +3886,7 @@ const struct dma_map_ops intel_dma_ops = {
>  	.map_page = intel_map_page,
>  	.unmap_page = intel_unmap_page,
>  	.mapping_error = intel_mapping_error,
> +	.get_iommu = intel_dma_get_iommu,
>  #ifdef CONFIG_X86
>  	.dma_supported = x86_dma_supported,
>  #endif
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index 29ce9815da87..aa62df1d0d72 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -128,6 +128,7 @@ struct dma_map_ops {
>  				   enum dma_data_direction dir);
>  	int (*mapping_error)(struct device *dev, dma_addr_t dma_addr);
>  	int (*dma_supported)(struct device *dev, u64 mask);
> +	struct iommu_domain *(*get_iommu)(struct device *dev);
>  #ifdef ARCH_HAS_DMA_GET_REQUIRED_MASK
>  	u64 (*get_required_mask)(struct device *dev);
>  #endif
> @@ -221,6 +222,8 @@ static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
>  }
>  #endif
>  
> +extern struct iommu_domain *dma_get_iommu_domain(struct device *dev);
> +
>  static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
>  					      size_t size,
>  					      enum dma_data_direction dir,
> 

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v8] dma-mapping: introduce dma_get_iommu_domain()
@ 2017-10-10 14:40       ` Raj, Ashok
  0 siblings, 0 replies; 158+ messages in thread
From: Raj, Ashok @ 2017-10-10 14:40 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-nvdimm, Jan Kara, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, Dave Chinner, linux-xfs,
	linux-mm, Jeff Moyer, linux-api, linux-fsdevel, Ross Zwisler,
	David Woodhouse, Robin Murphy, Christoph Hellwig,
	Marek Szyprowski

Hi Dan

On Sat, Oct 07, 2017 at 08:45:00PM -0700, Dan Williams wrote:
> Add a dma-mapping api helper to retrieve the generic iommu_domain for a device.
> The motivation for this interface is making RDMA transfers to DAX mappings
> safe. If the DAX file's block map changes we need to be to reliably stop
> accesses to blocks that have been freed or re-assigned to a new file. With the
> iommu_domain and a callback from the DAX filesystem the kernel can safely
> revoke access to a DMA device. The process that performed the RDMA memory
> registration is also notified of this revocation event, but the kernel can not
> otherwise be in the position of waiting for userspace to quiesce the device.
> 
> Since PMEM+DAX is currently only enabled for x86, we only update the x86
> iommu drivers.
> 
> Cc: Marek Szyprowski <m.szyprowski@samsung.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Joerg Roedel <joro@8bytes.org>
> Cc: David Woodhouse <dwmw2@infradead.org>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Jeff Moyer <jmoyer@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> Changes since v7:
> * retrieve the iommu_domain so that we can later pass the results of
>   dma_map_* to iommu_unmap() in advance of the actual dma_unmap_*.
> 
>  drivers/base/dma-mapping.c  |   10 ++++++++++
>  drivers/iommu/amd_iommu.c   |   10 ++++++++++
>  drivers/iommu/intel-iommu.c |   15 +++++++++++++++
>  include/linux/dma-mapping.h |    3 +++
>  4 files changed, 38 insertions(+)
> 
> diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
> index e584eddef0a7..fdb9764f95a4 100644
> --- a/drivers/base/dma-mapping.c
> +++ b/drivers/base/dma-mapping.c
> @@ -369,3 +369,13 @@ void dma_deconfigure(struct device *dev)
>  	of_dma_deconfigure(dev);
>  	acpi_dma_deconfigure(dev);
>  }
> +
> +struct iommu_domain *dma_get_iommu_domain(struct device *dev)
> +{
> +	const struct dma_map_ops *ops = get_dma_ops(dev);
> +
> +	if (ops && ops->get_iommu)
> +		return ops->get_iommu(dev);
> +	return NULL;
> +}
> +EXPORT_SYMBOL(dma_get_iommu_domain);
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index 51f8215877f5..c8e1a45af182 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -2271,6 +2271,15 @@ static struct protection_domain *get_domain(struct device *dev)
>  	return domain;
>  }
>  
> +static struct iommu_domain *amd_dma_get_iommu(struct device *dev)

Minor: Do you want to keep the naming consistent.. amd_dma_get_domain() vs
get_iommu?

> +{
> +	struct protection_domain *domain = get_domain(dev);
> +
> +	if (IS_ERR(domain))
> +		return NULL;
> +	return &domain->domain;
> +}
> +
>  static void update_device_table(struct protection_domain *domain)
>  {
>  	struct iommu_dev_data *dev_data;
> @@ -2689,6 +2698,7 @@ static const struct dma_map_ops amd_iommu_dma_ops = {
>  	.unmap_sg	= unmap_sg,
>  	.dma_supported	= amd_iommu_dma_supported,
>  	.mapping_error	= amd_iommu_mapping_error,
> +	.get_iommu	= amd_dma_get_iommu,

ditto for here and other places below:

>  };
>  
>  static int init_reserved_iova_ranges(void)
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 6784a05dd6b2..f3f4939cebad 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -3578,6 +3578,20 @@ static int iommu_no_mapping(struct device *dev)
>  	return 0;
>  }
>  
> +static struct iommu_domain *intel_dma_get_iommu(struct device *dev)
> +{
> +	struct dmar_domain *domain;
> +
> +	if (iommu_no_mapping(dev))
> +		return NULL;
> +
> +	domain = get_valid_domain_for_dev(dev);
> +	if (!domain)
> +		return NULL;
> +
> +	return &domain->domain;
> +}
> +
>  static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
>  				     size_t size, int dir, u64 dma_mask)
>  {
> @@ -3872,6 +3886,7 @@ const struct dma_map_ops intel_dma_ops = {
>  	.map_page = intel_map_page,
>  	.unmap_page = intel_unmap_page,
>  	.mapping_error = intel_mapping_error,
> +	.get_iommu = intel_dma_get_iommu,
>  #ifdef CONFIG_X86
>  	.dma_supported = x86_dma_supported,
>  #endif
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index 29ce9815da87..aa62df1d0d72 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -128,6 +128,7 @@ struct dma_map_ops {
>  				   enum dma_data_direction dir);
>  	int (*mapping_error)(struct device *dev, dma_addr_t dma_addr);
>  	int (*dma_supported)(struct device *dev, u64 mask);
> +	struct iommu_domain *(*get_iommu)(struct device *dev);
>  #ifdef ARCH_HAS_DMA_GET_REQUIRED_MASK
>  	u64 (*get_required_mask)(struct device *dev);
>  #endif
> @@ -221,6 +222,8 @@ static inline const struct dma_map_ops *get_dma_ops(struct device *dev)
>  }
>  #endif
>  
> +extern struct iommu_domain *dma_get_iommu_domain(struct device *dev);
> +
>  static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
>  					      size_t size,
>  					      enum dma_data_direction dir,
> 

^ permalink raw reply	[flat|nested] 158+ messages in thread
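
For illustration, a minimal sketch of how a consumer such as the RDMA core
might use the new helper to fence DMA at lease-break time, ahead of the
eventual dma_unmap_*; fence_dax_dma() and its error conventions are
assumptions, only dma_get_iommu_domain() and iommu_unmap() are taken as
given:

	#include <linux/dma-mapping.h>
	#include <linux/iommu.h>

	/*
	 * Revoke device access to a range previously set up with dma_map_*().
	 * After this returns the device takes IOMMU faults instead of reading
	 * or writing stale file blocks; the regular dma_unmap_*() call still
	 * runs later to release the IOVA.
	 */
	static int fence_dax_dma(struct device *dev, dma_addr_t dma_addr,
				 size_t size)
	{
		struct iommu_domain *domain = dma_get_iommu_domain(dev);

		if (!domain)
			return -EOPNOTSUPP; /* no IOMMU, nothing to revoke with */

		if (iommu_unmap(domain, (unsigned long)dma_addr, size) != size)
			return -EIO;
		return 0;
	}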

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-10 17:25             ` Jason Gunthorpe
  0 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2017-10-10 17:25 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	Robin Murphy, linux-xfs, Linux MM, Linux API, linux-fsdevel,
	David Woodhouse, Christoph Hellwig, Marek Szyprowski

On Mon, Oct 09, 2017 at 12:28:29PM -0700, Dan Williams wrote:

> > I don't think this has ever come up in the context of an all-device MR
> > invalidate requirement. Drivers already have code to invalidate
> > specific MRs, but to find all MRs that touch certain pages and then
> > invalidate them would be new code.
> >
> > We also have ODP aware drivers that can retarget a MR to new
> > physical pages. If the block map changes DAX should synchronously
> > retarget the ODP MR, not halt DMA.
> 
> Have a look at the patch [1], I don't touch the ODP path.

But, does ODP work OK already? I'm not clear on that..

> > Most likely ODP & DAX would need to be used together to get robust
> > user applications, as having the user QP's go to an error state at
> > random times (due to DMA failures) during operation is never going to
> > be acceptable...
> 
> It's not random. The process that set up the mapping and registered
> the memory gets SIGIO when someone else tries to modify the file map.
> That process then gets /proc/sys/fs/lease-break-time seconds to fix
> the problem before the kernel force revokes the DMA access.

Well, the process can't fix the problem in bounded time, so it is
random whether it will fail or not.

MR lifetime is under the control of the remote side, and the time to
complete the network exchanges required to release the MRs is hard to
bound. So even if I implement SIGIO properly my app will still likely
have random QP failures under various cases and workloads. :(

This is why ODP should be the focus because this cannot work fully
reliably otherwise..

> > Perhaps you might want to initially only support ODP MR mappings with
> > DAX and then the DMA fencing issue goes away?
> 
> I'd rather try to fix the non-ODP DAX case instead of just turning it off.

Well, what about using SIGKILL if the lease-break-time hits? The
kernel will clean up the MRs when the process exits and this will
fence DMA to that memory.

But, still, if you really want to be fine-grained, then I think
invalidating the impacted MRs is a better solution for RDMA than
trying to do it with the IOMMU...

Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread
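
To make the SIGIO flow under discussion concrete, a minimal userspace
sketch; it assumes the MAP_DIRECT lease delivers a plain SIGIO and that the
application's job is to quiesce and deregister its MRs before
/proc/sys/fs/lease-break-time expires. ibv_dereg_mr() is the standard verbs
call, everything else is illustrative:

	#include <signal.h>
	#include <infiniband/verbs.h>

	static volatile sig_atomic_t lease_broken;

	static void on_sigio(int sig)
	{
		lease_broken = 1;	/* async-signal-safe: just set a flag */
	}

	/*
	 * Called from the application's event loop: on a lease break, stop
	 * posting new work, drain outstanding local and remote users of the
	 * MR, then deregister it -- all before the lease-break timeout, or
	 * the kernel revokes DMA access underneath the HCA.
	 */
	static int check_lease(struct ibv_mr *mr)
	{
		if (!lease_broken)
			return 0;
		/* ... quiesce local and remote users of mr here ... */
		return ibv_dereg_mr(mr);
	}

Registration would be a plain signal(SIGIO, on_sigio) at startup.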

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-10 17:39               ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-10 17:39 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	Robin Murphy, linux-xfs, Linux MM, Linux API, linux-fsdevel,
	David Woodhouse, Christoph Hellwig, Marek Szyprowski

On Tue, Oct 10, 2017 at 10:25 AM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Mon, Oct 09, 2017 at 12:28:29PM -0700, Dan Williams wrote:
>
>> > I don't think this has ever come up in the context of an all-device MR
>> > invalidate requirement. Drivers already have code to invalidate
>> > specific MRs, but to find all MRs that touch certain pages and then
>> > invalidate them would be new code.
>> >
>> > We also have ODP aware drivers that can retarget a MR to new
>> > physical pages. If the block map changes DAX should synchronously
>> > retarget the ODP MR, not halt DMA.
>>
>> Have a look at the patch [1], I don't touch the ODP path.
>
> But, does ODP work OK already? I'm not clear on that..

It had better. If the mapping is invalidated I would hope that
generates an I/O fault that gets handled by the driver to set up the new
mapping. I don't see how it can work otherwise.

>> > Most likely ODP & DAX would need to be used together to get robust
>> > user applications, as having the user QP's go to an error state at
>> > random times (due to DMA failures) during operation is never going to
>> > be acceptable...
>>
>> It's not random. The process that set up the mapping and registered
>> the memory gets SIGIO when someone else tries to modify the file map.
>> That process then gets /proc/sys/fs/lease-break-time seconds to fix
>> the problem before the kernel force revokes the DMA access.
>
> Well, the process can't fix the problem in bounded time, so it is
> random whether it will fail or not.
>
> MR lifetime is under the control of the remote side, and the time to
> complete the network exchanges required to release the MRs is hard to
> bound. So even if I implement SIGIO properly my app will still likely
> have random QP failures under various cases and workloads. :(
>
> This is why ODP should be the focus because this cannot work fully
> reliably otherwise..

The lease break time is configurable. If the application can't
respond to a stop request within a timeout of its own choosing, then it
should not be using DAX mappings.

>
>> > Perhaps you might want to initially only support ODP MR mappings with
>> > DAX and then the DMA fencing issue goes away?
>>
>> I'd rather try to fix the non-ODP DAX case instead of just turning it off.
>
> Well, what about using SIGKILL if the lease-break-time hits? The
> kernel will clean up the MRs when the process exits and this will
> fence DMA to that memory.

Can you point me to where the MR cleanup code fences DMA and quiesces
the device?

> But, still, if you really want to be fine-grained, then I think
> invalidating the impacted MRs is a better solution for RDMA than
> trying to do it with the IOMMU...

If there's a better routine for handling ib_umem_lease_break() I'd
love to use it. Right now I'm reaching for the only tool I know for
kernel enforced revocation of DMA access.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-10 18:05                 ` Jason Gunthorpe
  0 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2017-10-10 18:05 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	Robin Murphy, linux-xfs, Linux MM, Linux API, linux-fsdevel,
	David Woodhouse, Christoph Hellwig, Marek Szyprowski

On Tue, Oct 10, 2017 at 10:39:27AM -0700, Dan Williams wrote:
> On Tue, Oct 10, 2017 at 10:25 AM, Jason Gunthorpe

> >> Have a look at the patch [1], I don't touch the ODP path.
> >
> > But, does ODP work OK already? I'm not clear on that..
> 
> It had better. If the mapping is invalidated I would hope that
> generates an I/O fault that gets handled by the driver to set up the new
> mapping. I don't see how it can work otherwise.

I would assume so too...

> > This is why ODP should be the focus because this cannot work fully
> > reliably otherwise..
> 
> The lease break time is configurable. If the application can't
> respond to a stop request within a timeout of its own choosing, then it
> should not be using DAX mappings.

Well, no RDMA application can really do this, unless you set the
timeout to multiple minutes, on par with network timeouts.

Again, these details are why I think this kind of DAX with non-ODP MRs
is probably not practical for a production system. Great
for testing of course, but in that case SIGKILL would be fine too...

> > Well, what about using SIGKILL if the lease-break-time hits? The
> > kernel will clean up the MRs when the process exits and this will
> > fence DMA to that memory.
> 
> Can you point me to where the MR cleanup code fences DMA and quiesces
> the device?

Yes. The MRs are associated with an fd. When the fd is closed
ib_uverbs_close triggers ib_uverbs_cleanup_ucontext which runs through
all the objects, including MRs, and deletes them.

The specification for deleting a MR requires a synchronous fence with
the hardware. After MR deletion the hardware will not DMA to any pages
described by the old MR, and those pages will be unpinned.

> But, still, if you really want to be fine-grained, then I think
> invalidating the impacted MRs is a better solution for RDMA than
> > trying to do it with the IOMMU...
> 
> If there's a better routine for handling ib_umem_lease_break() I'd
> love to use it. Right now I'm reaching for the only tool I know for
> kernel enforced revocation of DMA access.

Well, you'd have to code something in the MR code to keep track of DAX
MRs and issue an out of band invalidate to impacted MRs to create the
fence.

This probably needs some driver work, I'm not sure if all the hardware
can do out of band invalidate to any MR or not..

Generally speaking, in RDMA, when a new feature like this comes along
we have to push a lot of the work down to the driver authors, and the
approach has historically been that new features only work on some
hardware (as much as I dislike this, it is pragmatic)

So, not being able to support DAX on certain RDMA hardware is not
an unreasonable situation in our space.

Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread
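
A hypothetical shape for the out-of-band MR invalidate discussed above; no
such op exists in the tree, the fragment only shows where a per-driver hook
would sit and how core code might call it:

	#include <rdma/ib_verbs.h>

	/*
	 * Hypothetical driver op: fence DMA through one MR without tearing
	 * down the whole ucontext. Hardware that cannot invalidate an
	 * arbitrary MR would leave this NULL and be excluded from DAX.
	 */
	struct ib_dax_ops {
		int (*invalidate_mr)(struct ib_mr *mr);
	};

	static int ib_dax_fence_mr(const struct ib_dax_ops *ops,
				   struct ib_mr *mr)
	{
		if (!ops->invalidate_mr)
			return -EOPNOTSUPP;
		return ops->invalidate_mr(mr); /* synchronous HW fence */
	}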

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-10 20:17                   ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-10 20:17 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	Robin Murphy, linux-xfs, Linux MM, Linux API, linux-fsdevel,
	David Woodhouse, Christoph Hellwig, Marek Szyprowski

On Tue, Oct 10, 2017 at 11:05 AM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Tue, Oct 10, 2017 at 10:39:27AM -0700, Dan Williams wrote:
>> On Tue, Oct 10, 2017 at 10:25 AM, Jason Gunthorpe
>
>> >> Have a look at the patch [1], I don't touch the ODP path.
>> >
>> > But, does ODP work OK already? I'm not clear on that..
>>
>> It had better. If the mapping is invalidated I would hope that
>> generates an I/O fault that gets handled by the driver to set up the new
>> mapping. I don't see how it can work otherwise.
>
> I would assume so too...
>
>> > This is why ODP should be the focus because this cannot work fully
>> > reliably otherwise..
>>
>> The lease break time is configurable. If the application can't
>> respond to a stop request within a timeout of its own choosing, then it
>> should not be using DAX mappings.
>
> Well, no RDMA application can really do this, unless you set the
> timeout to multiple minutes, on par with network timeouts.

The default lease break timeout is 45 seconds on my system, so minutes
does not seem out of the question.

Also keep in mind that what triggers the lease break is another
application trying to write or punch holes in a file that is mapped
for RDMA. So, if the hardware can't handle the iommu mapping getting
invalidated asynchronously and the application can't react within the
lease break timeout period, then the administrator should arrange for
the file not to be written or truncated while it is mapped.

It's already the case that get_user_pages() does not lock down file
associations, so if your application is contending with these types of
file changes it likely already has a problem keeping transactions in
sync with the file state even without DAX.

> Again, these details are why I think this kind of DAX with non-ODP MRs
> is probably not practical for a production system. Great
> for testing of course, but in that case SIGKILL would be fine too...
>
>> > Well, what about using SIGKILL if the lease-break-time hits? The
>> > kernel will clean up the MRs when the process exits and this will
>> > fence DMA to that memory.
>>
>> Can you point me to where the MR cleanup code fences DMA and quiesces
>> the device?
>
> Yes. The MRs are associated with an fd. When the fd is closed
> ib_uverbs_close triggers ib_uverbs_cleanup_ucontext which runs through
> all the objects, including MRs, and deletes them.
>
> The specification for deleting a MR requires a synchronous fence with
> the hardware. After MR deletion the hardware will not DMA to any pages
> described by the old MR, and those pages will be unpinned.
>
>> > But, still, if you really want to be fine-grained, then I think
>> > invalidating the impacted MRs is a better solution for RDMA than
>> > trying to do it with the IOMMU...
>>
>> If there's a better routine for handling ib_umem_lease_break() I'd
>> love to use it. Right now I'm reaching for the only tool I know for
>> kernel enforced revocation of DMA access.
>
> Well, you'd have to code something in the MR code to keep track of DAX
> MRs and issue an out of band invalidate to impacted MRs to create the
> fence.
>
> This probably needs some driver work, I'm not sure if all the hardware
> can do out of band invalidate to any MR or not..

Ok.

>
> Generally speaking, in RDMA, when a new feature like this comes along
> we have to push a lot of the work down to the driver authors, and the
> approach has historically been that new features only work on some
> hardware (as much as I dislike this, it is pragmatic)
>
> So, not being able to support DAX on certain RDMA hardware is not
> an unreasonable situation in our space.

That makes sense, but it still seems to me that this proposed solution
allows more than enough ways to avoid the worst-case scenario where
hardware reacts badly to iommu invalidation. Drivers that can do
better than iommu invalidation can arrange for a callback to do their
driver-specific action at lease break time. Hardware that can't should
be blacklisted from supporting DAX altogether. In other words this is
a starting point to incrementally enhance or disable specific drivers,
but with the assurance that the kernel can always do the safe thing
when/if the driver is missing a finer-grained solution.

^ permalink raw reply	[flat|nested] 158+ messages in thread
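
As a reminder of what starts the clock here, the lease break fires when a
second process modifies the block map, for example by punching a hole. A
minimal sketch of that writer side, assuming a DAX file that another process
has mapped with MAP_DIRECT:

	#define _GNU_SOURCE
	#include <fcntl.h>

	/*
	 * Punching a hole in a MAP_DIRECT-mapped file makes the kernel send
	 * SIGIO to the lease holder; the call completes only after the lease
	 * is surrendered or /proc/sys/fs/lease-break-time expires and the
	 * kernel force-revokes DMA access.
	 */
	static int punch_hole(int fd, off_t off, off_t len)
	{
		return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
				 off, len);
	}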

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-12 18:27                     ` Jason Gunthorpe
  0 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2017-10-12 18:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	Robin Murphy, linux-xfs, Linux MM, Linux API, linux-fsdevel,
	David Woodhouse, Christoph Hellwig, Marek Szyprowski

On Tue, Oct 10, 2017 at 01:17:26PM -0700, Dan Williams wrote:

> Also keep in mind that what triggers the lease break is another
> application trying to write or punch holes in a file that is mapped
> for RDMA. So, if the hardware can't handle the iommu mapping getting
> invalidated asynchronously and the application can't react within the
> lease break timeout period, then the administrator should arrange for
> the file not to be written or truncated while it is mapped.

That makes sense, but why not return ENOSYS or something to the app
trying to alter the file if the RDMA hardware can't support this,
instead of having the RDMA app deal with this lease-break weirdness?

> It's already the case that get_user_pages() does not lock down file
> associations, so if your application is contending with these types of
> file changes it likely already has a problem keeping transactions in
> sync with the file state even without DAX.

Yes, things go weird in non-ODP RDMA cases like this..

Also, just to be clear, I would expect an app using the SIGIO interface
to basically halt ongoing RDMA, wait for MRs to become unused locally
and remotely, destroy the MRs, then somehow, establish new MRs that
cover the same logical map (eg what ODP would do transparently) after
the lease breaker has made their changes, then restart their IO.

Does your SIGIO approach have a race-free way to do those last steps?

> > So, not being able to support DAX on certain RDMA hardware is not
> > an unreasonable situation in our space.
> 
> That makes sense, but it still seems to me that this proposed solution
> allows more than enough ways to avoid that worst case scenario where
> hardware reacts badly to iommu invalidation.

Yes, although I am concerned that returning PCI-E errors is such an
unusual and untested path for some of our RDMA drivers that they may
malfunction badly...

Again, going back to the question of who would ever use this, I would
be very reluctant to deploy a production configuration relying on the iommu
invalidate or SIGIO techniques, when ODP HW is available and works
flawlessly.

> be blacklisted from supporting DAX altogether. In other words this is
> a starting point to incrementally enhance or disable specific drivers,
> but with the assurance that the kernel can always do the safe thing
> when / if the driver is missing a finer grained solution.

Seems reasonable.. I think existing HW will have an easier time adding
invalidate, while new hardware really should implement ODP.

Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
  2017-10-12 18:27                     ` Jason Gunthorpe
@ 2017-10-12 20:10                       ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-12 20:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-nvdimm, Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, Dave Chinner, linux-xfs,
	Linux MM, Jeff Moyer, Linux API, linux-fsdevel, Ross Zwisler,
	David Woodhouse, Robin Murphy, Christoph Hellwig,
	Marek Szyprowski

On Thu, Oct 12, 2017 at 11:27 AM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Tue, Oct 10, 2017 at 01:17:26PM -0700, Dan Williams wrote:
>
>> Also keep in mind that what triggers the lease break is another
>> application trying to write or punch holes in a file that is mapped
>> for RDMA. So, if the hardware can't handle the iommu mapping getting
>> invalidated asynchronously and the application can't react in the
>> lease break timeout period then the administrator should arrange for
>> the file to not be written or truncated while it is mapped.
>
> That makes sense, but why not return ENOSYS or something to the app
> trying to alter the file if the RDMA hardware can't support this
> instead of having the RDMA app deal with this lease break weirdness?

That's where I started, an inode flag that said "hands off, this file
is busy", but Christoph pointed out that we should reuse the same
mechanisms that pnfs is using. The pnfs protection scheme uses file
leases, and once the kernel decides that a lease needs to be broken /
layout needs to be recalled there is no stopping it, only delaying.

>> It's already the case that get_user_pages() does not lock down file
>> associations, so if your application is contending with these types of
>> file changes it likely already has a problem keeping transactions in
>> sync with the file state even without DAX.
>
> Yes, things go weird in non-ODP RDMA cases like this..
>
> Also, just to be clear, I would expect an app using the SIGIO interface
> to basically halt ongoing RDMA, wait for MRs to become unused locally
> and remotely, destroy the MRs, then somehow, establish new MRs that
> cover the same logical map (eg what ODP would do transparently) after
> the lease breaker has made their changes, then restart their IO.
>
> Does your SIGIO approach have a race-free way to do those last steps?

After the SIGIO it becomes a userspace / driver problem to quiesce
the I/O...

However, chatting this over with a few more people I have an alternate
solution that effectively behaves the same as how non-ODP hardware
handles this case of hole punch / truncation today. So, today if this
scenario happens on a page-cache backed mapping, the file blocks are
unmapped and the RDMA continues into pinned pages that are no longer
part of the file. We can achieve the same thing with the iommu, just
re-target the I/O into memory that isn't part of the file. That way
hardware does not see I/O errors and the DAX data consistency model is
no worse than the page-cache case.

>> > So, not being able to support DAX on certain RDMA hardware is not
>> > an unreasonable situation in our space.
>>
>> That makes sense, but it still seems to me that this proposed solution
>> allows more than enough ways to avoid that worst case scenario where
>> hardware reacts badly to iommu invalidation.
>
> Yes, although I am concerned that returning PCI-E errors is such an
> unusual and untested path for some of our RDMA drivers that they may
> malfunction badly...
>
> Again, going back to the question of who would ever use this, I would
> be very reluctant to deploy a production configuration relying on the iommu
> invalidate or SIGIO techniques, when ODP HW is available and works
> flawlessly.

I don't think it is reasonable to tell people you need to throw away
your old hardware just because you want to target a DAX mapping.

>> be blacklisted from supporting DAX altogether. In other words this is
>> a starting point to incrementally enhance or disable specific drivers,
>> but with the assurance that the kernel can always do the safe thing
>> when / if the driver is missing a finer grained solution.
>
> Seems reasonable.. I think existing HW will have an easier time adding
> invalidate, while new hardware really should implement ODP.
>

Yeah, so if we go with 'remap' instead of 'invalidate' does that
address your concerns?

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
  2017-10-12 20:10                       ` Dan Williams
@ 2017-10-13  6:50                         ` Christoph Hellwig
  0 siblings, 0 replies; 158+ messages in thread
From: Christoph Hellwig @ 2017-10-13  6:50 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-xfs, Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	Robin Murphy, Jason Gunthorpe, Linux MM, Linux API,
	linux-fsdevel, David Woodhouse, Christoph Hellwig,
	Marek Szyprowski

On Thu, Oct 12, 2017 at 01:10:33PM -0700, Dan Williams wrote:
> On Thu, Oct 12, 2017 at 11:27 AM, Jason Gunthorpe
> <jgunthorpe@obsidianresearch.com> wrote:
> > On Tue, Oct 10, 2017 at 01:17:26PM -0700, Dan Williams wrote:
> >
> >> Also keep in mind that what triggers the lease break is another
> >> application trying to write or punch holes in a file that is mapped
> >> for RDMA. So, if the hardware can't handle the iommu mapping getting
> >> invalidated asynchronously and the application can't react in the
> >> lease break timeout period then the administrator should arrange for
> >> the file to not be written or truncated while it is mapped.
> >
> > That makes sense, but why not return ENOSYS or something to the app
> > trying to alter the file if the RDMA hardware can't support this
> > instead of having the RDMA app deal with this lease break weirdness?
> 
> That's where I started, an inode flag that said "hands off, this file
> is busy", but Christoph pointed out that we should reuse the same
> mechanisms that pnfs is using. The pnfs protection scheme uses file
> leases, and once the kernel decides that a lease needs to be broken /
> layout needs to be recalled there is no stopping it, only delaying.

That was just a suggestion - the important statement is that a hands-off
flag is just a no-go.

> However, chatting this over with a few more people I have an alternate
> solution that effectively behaves the same as how non-ODP hardware
> handles this case of hole punch / truncation today. So, today if this
> scenario happens on a page-cache backed mapping, the file blocks are
> unmapped and the RDMA continues into pinned pages that are no longer
> part of the file. We can achieve the same thing with the iommu, just
> re-target the I/O into memory that isn't part of the file. That way
> hardware does not see I/O errors and the DAX data consistency model is
> no worse than the page-cache case.

Yikes.

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-13  7:09           ` Christoph Hellwig
  0 siblings, 0 replies; 158+ messages in thread
From: Christoph Hellwig @ 2017-10-13  7:09 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Dan Williams, linux-nvdimm, Jan Kara, Ashok Raj, Darrick J. Wong,
	linux-rdma, Greg Kroah-Hartman, Joerg Roedel, Dave Chinner,
	linux-xfs, Linux MM, Jeff Moyer, Linux API, linux-fsdevel,
	Ross Zwisler, David Woodhouse, Robin Murphy, Christoph Hellwig,
	Marek Szyprowski

On Mon, Oct 09, 2017 at 01:18:20PM -0600, Jason Gunthorpe wrote:
> > > If RDMA is driving this need, why not invalidate backing RDMA MRs
> > > instead of requiring an IOMMU to do it? RDMA MRs are finer grained and
> > > do not suffer from the re-use problem David W. brought up with IOVAs..
> > 
> > Sounds promising. All I want in the end is to be sure that the kernel
> > is enabled to stop any in-flight RDMA at will without asking
> > userspace. Does this require per-RDMA driver opt-in or is there a
> > common call that can be made?
> 
> I don't think this has ever come up in the context of an all-device MR
> invalidate requirement. Drivers already have code to invalidate
> specific MRs, but to find all MRs that touch certain pages and then
> invalidate them would be new code.

The whole point is that we should not need that IFF we provide the
right interface.

If we have a new 'register memory with a lease', the driver (or in fact
probably the umem core for the drivers using it) has the lease associated
with the ib_umem structure, which will just need a backpointer from the
ib_umem to the MR to unregister it.

Which might be a good opportunity to break the user MR from the in-kernel
ones and merge it with ib_umem, but that's a different story..

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-13 15:03                           ` Jason Gunthorpe
  0 siblings, 0 replies; 158+ messages in thread
From: Jason Gunthorpe @ 2017-10-13 15:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	Robin Murphy, linux-xfs, Linux MM, Linux API, linux-fsdevel,
	David Woodhouse, Marek Szyprowski

On Fri, Oct 13, 2017 at 08:50:47AM +0200, Christoph Hellwig wrote:

> > However, chatting this over with a few more people I have an alternate
> > solution that effectively behaves the same as how non-ODP hardware
> > handles this case of hole punch / truncation today. So, today if this
> > scenario happens on a page-cache backed mapping, the file blocks are
> > unmapped and the RDMA continues into pinned pages that are no longer
> > part of the file. We can achieve the same thing with the iommu, just
> > re-target the I/O into memory that isn't part of the file. That way
> > hardware does not see I/O errors and the DAX data consistency model is
> > no worse than the page-cache case.
> 
> Yikes.

Well, as much as you say Yikes, Dan is correct, this does match the
semantics RDMA MR's already have. They become non-coherent if their
underlying object is changed, and there are many ways to get there.
I've never thought about it, but it does sound like ftruncate,
fallocate, etc on a normal file would break the MR coherency too??

There have been efforts in the past driven by the MPI people to
create, essentially, something like a 'lease-break' SIGIO. Except it was
intended to be general, and wanted to solve all the problems related to
MR de-coherence. This was complicated and never became acceptable to
mainline.

Instead ODP was developed, and ODP actually solves all the problems
sanely.

Thinking about it some more, and with your other comments on
get_user_pages in this thread, I tend to agree. It doesn't make sense
to develop a user space lease break API for MR's that is a DAX
specific feature.

Along the same lines, it also doesn't make sense to force-invalidate
MR's linked to DAX regions, while leaving MR's linked to other
regions that have the same problem alone.

If you want to make non-ODP MR's work better, then you need to have a
general overall solution to tell userspace when the MR becomes (or I
guess, is becoming) non-coherent, that covers all the cases that break
MR coherence, not just via DAX.

Otherwise, I think Dan is right, keeping the current semantic of
having MRs just do something wrong, but not corrupt memory, when they
lose coherence, is broadly consistent with how non-ODP MRs work today.

Jason

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-15 15:14                             ` Matan Barak
  0 siblings, 0 replies; 158+ messages in thread
From: Matan Barak @ 2017-10-15 15:14 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Jan Kara, Ashok Raj, Darrick J. Wong, linux-rdma,
	Greg Kroah-Hartman, Joerg Roedel, linux-nvdimm, Dave Chinner,
	Robin Murphy, linux-xfs, Linux MM, Linux API, linux-fsdevel,
	Liran Liss, David Woodhouse, Christoph Hellwig, Marek Szyprowski

On Fri, Oct 13, 2017 at 6:03 PM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Fri, Oct 13, 2017 at 08:50:47AM +0200, Christoph Hellwig wrote:
>
>> > However, chatting this over with a few more people I have an alternate
>> > solution that effectively behaves the same as how non-ODP hardware
>> > handles this case of hole punch / truncation today. So, today if this
>> > scenario happens on a page-cache backed mapping, the file blocks are
>> > unmapped and the RDMA continues into pinned pages that are no longer
>> > part of the file. We can achieve the same thing with the iommu, just
>> > re-target the I/O into memory that isn't part of the file. That way
>> > hardware does not see I/O errors and the DAX data consistency model is
>> > no worse than the page-cache case.
>>
>> Yikes.
>
> Well, as much as you say Yikes, Dan is correct, this does match the
> semantics RDMA MR's already have. They become non-coherent if their
> underlying object is changed, and there are many ways to get there.
> I've never thought about it, but it does sound like ftruncate,
> fallocate, etc on a normal file would break the MR coherency too??
>
> There have been efforts in the past driven by the MPI people to
> create, essentially, something like a 'lease-break' SIGIO. Except it was
> intended to be general, and wanted to solve all the problems related to
> MR de-coherence. This was complicated and never became acceptable to
> mainline.
>
> Instead ODP was developed, and ODP actually solves all the problems
> sanely.
>
> Thinking about it some more, and with your other comments on
> get_user_pages in this thread, I tend to agree. It doesn't make sense
> to develop a user space lease break API for MR's that is a DAX
> specific feature.
>
> Along the same lines, it also doesn't make sense to force-invalidate
> MR's linked to DAX regions, while leaving MR's linked to other
> regions that have the same problem alone.
>
> If you want to make non-ODP MR's work better, then you need to have a
> general overall solution to tell userspace when the MR becomes (or I
> guess, is becoming) non-coherent, that covers all the cases that break
> MR coherence, not just via DAX.
>
> Otherwise, I think Dan is right, keeping the current semantic of
> having MRs just do something wrong, but not corrupt memory, when they
> lose coherence, is broadly consistent with how non-ODP MRs work today.
>

I agree, keeping the current semantics is probably the best thing we
could do. It's a trade-off between breaking existing applications,
having a new lease API for DAX or just failing DAX in particular (as
opposed to other cases). For stable mappings, what we have is probably
sufficient. For mappings which could be changed, it's unclear to me
how you could guarantee non-racy behavior that is bounded by a
pre-defined time and guarantee no user-space errors. On top of that,
ODP should already solve that problem transparently.

IMHO, using iommu for that and causing DMA errors just because the
lease broke isn't the right thing to do.

> Jason

Matan

^ permalink raw reply	[flat|nested] 158+ messages in thread

* Re: [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu()
@ 2017-10-15 15:21                               ` Dan Williams
  0 siblings, 0 replies; 158+ messages in thread
From: Dan Williams @ 2017-10-15 15:21 UTC (permalink / raw)
  To: Matan Barak
  Cc: Jason Gunthorpe, Christoph Hellwig, linux-nvdimm, Jan Kara,
	Ashok Raj, Darrick J. Wong, linux-rdma, Greg Kroah-Hartman,
	Joerg Roedel, Dave Chinner, linux-xfs, Linux MM, Jeff Moyer,
	Linux API, linux-fsdevel, Ross Zwisler, David Woodhouse,
	Robin Murphy, Marek Szyprowski, Liran Liss

On Sun, Oct 15, 2017 at 8:14 AM, Matan Barak <matanb@dev.mellanox.co.il> wrote:
[..]
> IMHO, using iommu for that and causing DMA errors just because the
> lease broke isn't the right thing to do.

Yes, see the current proposal over in this thread:

https://lists.01.org/pipermail/linux-nvdimm/2017-October/012885.html

^ permalink raw reply	[flat|nested] 158+ messages in thread

end of thread

Thread overview: 158+ messages
-- links below jump to the message on this page --
2017-10-06 22:35 [PATCH v7 00/12] MAP_DIRECT for DAX RDMA and userspace flush Dan Williams
2017-10-06 22:35 ` [PATCH v7 01/12] mm: introduce MAP_SHARED_VALIDATE, a mechanism to safely define new mmap flags Dan Williams
2017-10-06 22:35 ` [PATCH v7 02/12] fs, mm: pass fd to ->mmap_validate() Dan Williams
2017-10-06 22:35 ` [PATCH v7 03/12] fs: introduce i_mapdcount Dan Williams
2017-10-09  3:08   ` Dave Chinner
2017-10-06 22:35 ` [PATCH v7 04/12] fs: MAP_DIRECT core Dan Williams
2017-10-06 22:35 ` [PATCH v7 05/12] xfs: prepare xfs_break_layouts() for reuse with MAP_DIRECT Dan Williams
2017-10-06 22:35 ` [PATCH v7 06/12] xfs: wire up MAP_DIRECT Dan Williams
2017-10-09  3:40   ` Dave Chinner
2017-10-09 17:08     ` Dan Williams
2017-10-09 22:50       ` Dave Chinner
2017-10-06 22:35 ` [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu() Dan Williams
2017-10-06 22:45   ` David Woodhouse
2017-10-06 22:52     ` Dan Williams
2017-10-06 23:10       ` David Woodhouse
2017-10-06 23:15         ` Dan Williams
2017-10-07 11:08           ` David Woodhouse
2017-10-07 23:33             ` Dan Williams
2017-10-06 23:12       ` Dan Williams
2017-10-08  3:45   ` [PATCH v8] dma-mapping: introduce dma_get_iommu_domain() Dan Williams
2017-10-09 10:37     ` Robin Murphy
2017-10-09 17:32       ` Dan Williams
2017-10-10 14:40     ` Raj, Ashok
2017-10-09 18:58   ` [PATCH v7 07/12] dma-mapping: introduce dma_has_iommu() Jason Gunthorpe
2017-10-09 19:05     ` Dan Williams
2017-10-09 19:18       ` Jason Gunthorpe
2017-10-09 19:28         ` Dan Williams
2017-10-10 17:25           ` Jason Gunthorpe
2017-10-10 17:39             ` Dan Williams
2017-10-10 18:05               ` Jason Gunthorpe
2017-10-10 20:17                 ` Dan Williams
2017-10-12 18:27                   ` Jason Gunthorpe
2017-10-12 20:10                     ` Dan Williams
2017-10-13  6:50                       ` Christoph Hellwig
2017-10-13 15:03                         ` Jason Gunthorpe
2017-10-15 15:14                           ` Matan Barak
2017-10-15 15:21                             ` Dan Williams
2017-10-13  7:09         ` Christoph Hellwig
2017-10-06 22:36 ` [PATCH v7 08/12] fs, mapdirect: introduce ->lease_direct() Dan Williams
2017-10-06 22:36 ` [PATCH v7 09/12] xfs: wire up ->lease_direct() Dan Williams
2017-10-09  3:45   ` Dave Chinner
2017-10-09 17:10     ` Dan Williams
2017-10-06 22:36 ` [PATCH v7 10/12] device-dax: " Dan Williams
2017-10-06 22:36 ` [PATCH v7 11/12] IB/core: use MAP_DIRECT to fix / enable RDMA to DAX mappings Dan Williams
2017-10-08  4:02   ` [PATCH v8 1/2] iommu: up-level sg_num_pages() from amd-iommu Dan Williams
2017-10-08  4:04   ` [PATCH v8 2/2] IB/core: use MAP_DIRECT to fix / enable RDMA to DAX mappings Dan Williams
2017-10-08  6:45     ` kbuild test robot
2017-10-08 15:49       ` Dan Williams
2017-10-06 22:36 ` [PATCH v7 12/12] tools/testing/nvdimm: enable rdma unit tests Dan Williams
