* [PATCH 00/15] dax: prep work for fixing dax-dma vs truncate collisions
From: Dan Williams @ 2017-10-31 23:21 UTC
  To: linux-nvdimm
  Cc: Michal Hocko, Jan Kara, Peter Zijlstra, Benjamin Herrenschmidt,
	Heiko Carstens, linux-mm, Paul Mackerras, Sean Hefty, hch,
	Matthew Wilcox, linux-rdma, Michael Ellerman, Jeff Moyer,
	Jason Gunthorpe, Doug Ledford, Ingo Molnar, Ross Zwisler,
	Hal Rosenstock, linux-media, linux-fsdevel,
	Jérôme Glisse, Mauro Carvalho Chehab, Gerald Schaefer,
	Jens Axboe, linux-kernel, stable, linux-xfs, Martin Schwidefsky,
	akpm, Kirill A. Shutemov

This is hopefully the uncontroversial lead-in set of changes that lay
the groundwork for solving the dax-dma vs truncate problem. The overview
of the changes is:

1/ Disable DAX when we do not have struct page entries backing dax
   mappings, or otherwise allow limited DAX support for axonram and
   dcssblk. Is anyone actually using the DAX capability of axonram or
   dcssblk?

2/ Disable code paths that establish potentially long lived DMA
   access to a filesystem-dax memory mapping, i.e. RDMA and V4L2. In the
   4.16 timeframe the plan is to introduce a "register memory for DMA
   with a lease" mechanism for userspace to establish mappings but also
   be responsible for tearing down the mapping when the kernel needs to
   invalidate the mapping due to truncate or hole-punch.

3/ Add a wakeup mechanism for waiting for DAX pages to be released
   from DMA access.
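
To illustrate the idea behind item 3/, here is a minimal userspace
sketch, not the kernel implementation. It models a page that a truncate
path must wait on until the last DMA reference is dropped. All names
(model_page, model_page_pin, model_page_unpin, model_wait_idle,
model_dma_complete) are hypothetical, and a pthread condition variable
stands in for the kernel's wait_bit machinery purely as an assumption
for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for a DAX page pinned for DMA; not a kernel type. */
struct model_page {
	pthread_mutex_t lock;
	pthread_cond_t idle;	/* signalled when pins drops to zero */
	int pins;		/* outstanding DMA references */
};

static void model_page_init(struct model_page *p)
{
	pthread_mutex_init(&p->lock, NULL);
	pthread_cond_init(&p->idle, NULL);
	p->pins = 0;
}

/* Roughly analogous to taking a page reference before starting DMA. */
static void model_page_pin(struct model_page *p)
{
	pthread_mutex_lock(&p->lock);
	p->pins++;
	pthread_mutex_unlock(&p->lock);
}

/* Roughly analogous to the final put_page(): wake waiters once idle. */
static void model_page_unpin(struct model_page *p)
{
	pthread_mutex_lock(&p->lock);
	assert(p->pins > 0);
	if (--p->pins == 0)
		pthread_cond_broadcast(&p->idle);
	pthread_mutex_unlock(&p->lock);
}

/* Truncate side: sleep until no DMA references remain on the page. */
static void model_wait_idle(struct model_page *p)
{
	pthread_mutex_lock(&p->lock);
	while (p->pins > 0)
		pthread_cond_wait(&p->idle, &p->lock);
	pthread_mutex_unlock(&p->lock);
}

/* Thread body simulating a DMA completion that arrives 10ms late. */
static void *model_dma_complete(void *arg)
{
	struct model_page *p = arg;
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };

	nanosleep(&delay, NULL);
	model_page_unpin(p);
	return NULL;
}
```

In this model a "truncate" caller would pin nothing itself; it calls
model_wait_idle() and only proceeds to free the blocks once every
model_page_pin() has been balanced by a model_page_unpin().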

This overall effort started when Christoph noted during the review of
the MAP_DIRECT proposal:

    get_user_pages on DAX doesn't give the same guarantees as on
    pagecache or anonymous memory, and that is the problem we need to
    fix. In fact I'm pretty sure if we try hard enough (and we might
    have to try very hard) we can see the same problem with plain direct
    I/O and without any RDMA involved, e.g. do a larger direct I/O write
    to memory that is mmap()ed from a DAX file, then truncate the DAX
    file and reallocate the blocks, and we might corrupt that new file.
    We'll probably need a special setup where there is little other
    chance but to reallocate those used blocks.

    So what we need to do first is to fix get_user_pages vs unmapping
    DAX mmap()ed blocks, be that from a hole punch, truncate, COW
    operation, etc.

Included in the changes is an nfit_test mechanism to trivially trigger
this collision by delaying the put_page() that the block layer performs
after performing direct-I/O to a filesystem-DAX page.

Given the ongoing coordination of this set across multiple sub-systems
and the dax core my proposal is to manage this as a branch in the nvdimm
tree with acks from mm, rdma, v4l2, ext4, and xfs.

---

Dan Williams (15):
      dax: quiet bdev_dax_supported()
      mm, dax: introduce pfn_t_special()
      dax: require 'struct page' by default for filesystem dax
      brd: remove dax support
      dax: stop using VM_MIXEDMAP for dax
      dax: stop using VM_HUGEPAGE for dax
      dax: stop requiring a live device for dax_flush()
      dax: store pfns in the radix
      tools/testing/nvdimm: add 'bio_delay' mechanism
      IB/core: disable memory registration of fileystem-dax vmas
      [media] v4l2: disable filesystem-dax mapping support
      mm, dax: enable filesystems to trigger page-idle callbacks
      mm, devmap: introduce CONFIG_DEVMAP_MANAGED_PAGES
      dax: associate mappings with inodes, and warn if dma collides with truncate
      wait_bit: introduce {wait_on,wake_up}_devmap_idle


 arch/powerpc/platforms/Kconfig            |    1 
 arch/powerpc/sysdev/axonram.c             |    3 -
 drivers/block/Kconfig                     |   12 ---
 drivers/block/brd.c                       |   65 --------------
 drivers/dax/device.c                      |    1 
 drivers/dax/super.c                       |  113 +++++++++++++++++++++----
 drivers/infiniband/core/umem.c            |   49 ++++++++---
 drivers/media/v4l2-core/videobuf-dma-sg.c |   39 ++++++++-
 drivers/nvdimm/pmem.c                     |   13 +++
 drivers/s390/block/Kconfig                |    1 
 drivers/s390/block/dcssblk.c              |    4 +
 fs/Kconfig                                |    8 ++
 fs/dax.c                                  |  131 +++++++++++++++++++----------
 fs/ext2/file.c                            |    1 
 fs/ext2/super.c                           |    6 +
 fs/ext4/file.c                            |    1 
 fs/ext4/super.c                           |    6 +
 fs/xfs/xfs_file.c                         |    2 
 fs/xfs/xfs_super.c                        |   20 ++--
 include/linux/dax.h                       |   17 ++--
 include/linux/memremap.h                  |   24 +++++
 include/linux/mm.h                        |   47 ++++++----
 include/linux/mm_types.h                  |   20 +++-
 include/linux/pfn_t.h                     |   13 +++
 include/linux/vma.h                       |   33 +++++++
 include/linux/wait_bit.h                  |   10 ++
 kernel/memremap.c                         |   36 ++++++--
 kernel/sched/wait_bit.c                   |   64 ++++++++++++--
 mm/Kconfig                                |    5 +
 mm/hmm.c                                  |   13 ---
 mm/huge_memory.c                          |    8 +-
 mm/ksm.c                                  |    3 +
 mm/madvise.c                              |    2 
 mm/memory.c                               |   22 ++++-
 mm/migrate.c                              |    3 -
 mm/mlock.c                                |    5 +
 mm/mmap.c                                 |    8 +-
 mm/swap.c                                 |    3 -
 tools/testing/nvdimm/Kbuild               |    1 
 tools/testing/nvdimm/test/iomap.c         |   62 ++++++++++++++
 tools/testing/nvdimm/test/nfit.c          |   34 ++++++++
 tools/testing/nvdimm/test/nfit_test.h     |    1 
 42 files changed, 650 insertions(+), 260 deletions(-)
 create mode 100644 include/linux/vma.h


