* [PATCH v2 00/20] get_user_pages() for dax mappings
@ 2015-10-10  0:55 Dan Williams
  2015-10-10  0:55 ` [PATCH v2 01/20] block: generic request_queue reference counting Dan Williams
                   ` (20 more replies)
  0 siblings, 21 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:55 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: linux-mips, Dave Hansen, Boaz Harrosh, David Airlie,
	Catalin Marinas, Dave Hansen, Dave Chinner, Keith Busch,
	linux-mm, Paul Mackerras, H. Peter Anvin, hch, Russell King,
	Richard Weinberger, Peter Zijlstra, Jeff Moyer, Ingo Molnar,
	Benjamin Herrenschmidt, Matthew Wilcox, ross.zwisler,
	Gleb Natapov, Marc Zyngier, Will Deacon, Jeff Dike,
	Alexander Viro, Thomas Gleixner, Jens Axboe, linux-kernel,
	Ralf Baechle, Alexander Graf, Paolo Bonzini, Andrew Morton,
	Christoffer Dall

Changes since v1 [1]:
1/ Rebased on the accepted cleanups to the memremap() api and the NUMA
   hints for devm allocations. (see libnvdimm-for-next [2]).

2/ Rebased on DAX fixes from Ross [3], currently in -mm, and Dave [4],
   applied locally for now.

3/ Renamed __pfn_t to pfn_t and converted KVM and UM accordingly (Dave
   Hansen)

4/ Make pfn-to-pfn_t conversions a nop (binary identical) for typical
   mapped pfns (Dave Hansen)

5/ Fixed up the devm_memremap_pages() api to require passing in a
   percpu_ref object.  Addresses a crash reported by Logan.

6/ Moved the back pointer from a page to its hosting 'struct
   dev_pagemap' to share storage with the 'lru' field rather than
   'mapping'.  This enables us to revoke mappings at
   devm_memunmap_pages() time and addresses a crash reported by Logan.

7/ Rework dax_map_bh() into dax_map_atomic() to avoid proliferating
   buffer_head usage deeper into the dax implementation.  Also addresses
   a crash reported by Logan (Dave Chinner)

8/ Include an initial, only lightly tested, implementation of revoking
   usages of ZONE_DEVICE pages when the driver disables the pmem device.
   This coordinates with blk_cleanup_queue() for the pmem gendisk, see
   patch 19.

9/ Include a cleaned up version of the vmem_altmap infrastructure
   allowing the struct page memmap to optionally be allocated from pmem
   itself.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
[2]: https://git.kernel.org/cgit/linux/kernel/git/nvdimm/nvdimm.git/log/?h=libnvdimm-for-next
[3]: https://git.kernel.org/cgit/linux/kernel/git/nvdimm/nvdimm.git/commit/?h=dax-fixes&id=93fdde069dce
[4]: https://lists.01.org/pipermail/linux-nvdimm/2015-October/002286.html

---
To date, we have implemented two I/O usage models for persistent memory,
PMEM (a persistent "ram disk") and DAX (mmap persistent memory into
userspace).  This series adds a third, DAX-GUP, that allows DAX mappings
to be the target of direct-I/O.  It allows userspace to coordinate
DMA/RDMA from/to persistent memory.

The implementation leverages the ZONE_DEVICE mm-zone that went into
4.3-rc1 to flag pages that are owned and dynamically mapped by a device
driver.  The pmem driver, after mapping a persistent memory range into
the system memmap via devm_memremap_pages(), arranges for DAX to
distinguish pfn-only versus page-backed pmem-pfns via flags in the new
__pfn_t type.  The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn,
flags the resulting pte(s) inserted into the process page tables with a
new _PAGE_DEVMAP flag.  Later, when get_user_pages() walks the ptes, it
keys off _PAGE_DEVMAP to pin the device hosting the page range, keeping
it active.  Finally, get_page() and put_page() are modified to take
references against the device-driver-established page mapping.
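
Condensed, the end-to-end flow looks like the sketch below.  This is
illustrative only: the helper names follow patches 13, 14, 17, and 20 of
this series, but the signatures are approximate.

	/* driver (pmem): the pfn is device-backed and has a struct page */
	pfn_t pfn = phys_to_pfn_t(phys, PFN_DEV | PFN_MAP);

	/* dax fault: the inserted pte is tagged with _PAGE_DEVMAP */
	vm_insert_mixed(vma, vaddr, pfn);

	/* gup: a devmap pte pins the hosting device's pagemap */
	if (pte_devmap(pte)) {
		pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
		if (!pgmap)
			return 0;	/* device is being torn down */
	}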

This series is available via git here:

  git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm libnvdimm-pending

---

Dan Williams (20):
      block: generic request_queue reference counting
      dax: increase granularity of dax_clear_blocks() operations
      block, dax: fix lifetime of in-kernel dax mappings with dax_map_atomic()
      mm: introduce __get_dev_pagemap()
      x86, mm: introduce vmem_altmap to augment vmemmap_populate()
      libnvdimm, pfn, pmem: allocate memmap array in persistent memory
      avr32: convert to asm-generic/memory_model.h
      hugetlb: fix compile error on tile
      frv: fix compiler warning from definition of __pmd()
      um: kill pfn_t
      kvm: rename pfn_t to kvm_pfn_t
      mips: fix PAGE_MASK definition
      mm, dax, pmem: introduce pfn_t
      mm, dax, gpu: convert vm_insert_mixed to pfn_t, introduce _PAGE_DEVMAP
      mm, dax: convert vmf_insert_pfn_pmd() to pfn_t
      list: introduce list_poison() and LIST_POISON3
      mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup
      block: notify queue death confirmation
      mm, pmem: devm_memunmap_pages(), truncate and unmap ZONE_DEVICE pages
      mm, x86: get_user_pages() for dax mappings


 arch/alpha/include/asm/pgtable.h        |    1 
 arch/arm/include/asm/kvm_mmu.h          |    5 -
 arch/arm/kvm/mmu.c                      |   10 +
 arch/arm64/include/asm/kvm_mmu.h        |    3 
 arch/avr32/include/asm/page.h           |    8 -
 arch/frv/include/asm/page.h             |    2 
 arch/ia64/include/asm/pgtable.h         |    1 
 arch/m68k/include/asm/page_mm.h         |    1 
 arch/m68k/include/asm/page_no.h         |    1 
 arch/mips/include/asm/kvm_host.h        |    6 -
 arch/mips/include/asm/page.h            |    2 
 arch/mips/kvm/emulate.c                 |    2 
 arch/mips/kvm/tlb.c                     |   14 +
 arch/parisc/include/asm/pgtable.h       |    1 
 arch/powerpc/include/asm/kvm_book3s.h   |    4 
 arch/powerpc/include/asm/kvm_ppc.h      |    2 
 arch/powerpc/include/asm/pgtable.h      |    1 
 arch/powerpc/kvm/book3s.c               |    6 -
 arch/powerpc/kvm/book3s_32_mmu_host.c   |    2 
 arch/powerpc/kvm/book3s_64_mmu_host.c   |    2 
 arch/powerpc/kvm/e500.h                 |    2 
 arch/powerpc/kvm/e500_mmu_host.c        |    8 -
 arch/powerpc/kvm/trace_pr.h             |    2 
 arch/powerpc/sysdev/axonram.c           |    8 -
 arch/sparc/include/asm/pgtable_64.h     |    2 
 arch/tile/include/asm/pgtable.h         |    1 
 arch/um/include/asm/page.h              |    6 -
 arch/um/include/asm/pgtable-3level.h    |    5 -
 arch/um/include/asm/pgtable.h           |    2 
 arch/x86/include/asm/pgtable.h          |   24 ++
 arch/x86/include/asm/pgtable_types.h    |    7 +
 arch/x86/kvm/iommu.c                    |   11 +
 arch/x86/kvm/mmu.c                      |   37 ++--
 arch/x86/kvm/mmu_audit.c                |    2 
 arch/x86/kvm/paging_tmpl.h              |    6 -
 arch/x86/kvm/vmx.c                      |    2 
 arch/x86/kvm/x86.c                      |    2 
 arch/x86/mm/gup.c                       |   56 +++++-
 arch/x86/mm/init_64.c                   |   32 +++
 arch/x86/mm/pat.c                       |    4 
 block/blk-core.c                        |   79 +++++++-
 block/blk-mq-sysfs.c                    |    6 -
 block/blk-mq.c                          |   87 +++------
 block/blk-sysfs.c                       |    3 
 block/blk.h                             |   12 +
 drivers/block/brd.c                     |    4 
 drivers/gpu/drm/exynos/exynos_drm_gem.c |    3 
 drivers/gpu/drm/gma500/framebuffer.c    |    3 
 drivers/gpu/drm/msm/msm_gem.c           |    3 
 drivers/gpu/drm/omapdrm/omap_gem.c      |    6 -
 drivers/gpu/drm/ttm/ttm_bo_vm.c         |    3 
 drivers/nvdimm/pfn_devs.c               |    3 
 drivers/nvdimm/pmem.c                   |  128 +++++++++----
 drivers/s390/block/dcssblk.c            |   10 -
 fs/block_dev.c                          |    2 
 fs/dax.c                                |  199 +++++++++++++--------
 include/asm-generic/pgtable.h           |    6 -
 include/linux/blk-mq.h                  |    1 
 include/linux/blkdev.h                  |   12 +
 include/linux/huge_mm.h                 |    2 
 include/linux/hugetlb.h                 |    1 
 include/linux/io.h                      |   17 --
 include/linux/kvm_host.h                |   37 ++--
 include/linux/kvm_types.h               |    2 
 include/linux/list.h                    |   14 +
 include/linux/memory_hotplug.h          |    3 
 include/linux/mm.h                      |  300 +++++++++++++++++++++++++++++--
 include/linux/mm_types.h                |    5 +
 include/linux/pfn.h                     |    9 +
 include/linux/poison.h                  |    1 
 kernel/memremap.c                       |  187 +++++++++++++++++++
 lib/list_debug.c                        |    2 
 mm/gup.c                                |   11 +
 mm/huge_memory.c                        |   10 +
 mm/hugetlb.c                            |   18 ++
 mm/memory.c                             |   17 +-
 mm/memory_hotplug.c                     |   66 +++++--
 mm/page_alloc.c                         |   10 +
 mm/sparse-vmemmap.c                     |   37 ++++
 mm/sparse.c                             |    8 +
 mm/swap.c                               |   15 ++
 virt/kvm/kvm_main.c                     |   47 ++---
 82 files changed, 1264 insertions(+), 418 deletions(-)


* [PATCH v2 01/20] block: generic request_queue reference counting
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
@ 2015-10-10  0:55 ` Dan Williams
  2015-10-11 12:59   ` Christoph Hellwig
  2015-10-10  0:55 ` [PATCH v2 02/20] dax: increase granularity of dax_clear_blocks() operations Dan Williams
                   ` (19 subsequent siblings)
  20 siblings, 1 reply; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:55 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jens Axboe, linux-kernel, Keith Busch, linux-mm, ross.zwisler, hch

Allow pmem, and other synchronous/bio-based block drivers, to fall back
on a per-cpu reference count managed by the core for tracking queue
live/dead state.

The existing per-cpu reference count for the blk_mq case is promoted to
be used in all block i/o scenarios.  This involves initializing it by
default, waiting for it to drop to zero at exit, and holding a live
reference over the invocation of q->make_request_fn() in
generic_make_request().  The blk_mq code continues to take its own
reference per blk_mq request and retains the ability to freeze the
queue, but the check that the queue is frozen is moved to
generic_make_request().
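
Condensed from the diff below, the new q->q_usage_counter lifecycle is:

	/* blk_alloc_queue_node(): start in atomic mode for fast shutdown */
	percpu_ref_init(&q->q_usage_counter,
			blk_queue_usage_counter_release,
			PERCPU_REF_INIT_ATOMIC, GFP_KERNEL);

	/* generic_make_request(): hold a live reference per bio */
	if (blk_queue_enter(q, __GFP_WAIT) == 0) {
		q->make_request_fn(q, bio);
		blk_queue_exit(q);
	} else
		bio_io_error(bio);	/* queue is dying */

	/* blk_cleanup_queue(): kill the ref and wait for it to drain */
	blk_freeze_queue(q);
	...
	percpu_ref_exit(&q->q_usage_counter);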

This fixes crash signatures like the following:

 BUG: unable to handle kernel paging request at ffff880140000000
 [..]
 Call Trace:
  [<ffffffff8145e8bf>] ? copy_user_handle_tail+0x5f/0x70
  [<ffffffffa004e1e0>] pmem_do_bvec.isra.11+0x70/0xf0 [nd_pmem]
  [<ffffffffa004e331>] pmem_make_request+0xd1/0x200 [nd_pmem]
  [<ffffffff811c3162>] ? mempool_alloc+0x72/0x1a0
  [<ffffffff8141f8b6>] generic_make_request+0xd6/0x110
  [<ffffffff8141f966>] submit_bio+0x76/0x170
  [<ffffffff81286dff>] submit_bh_wbc+0x12f/0x160
  [<ffffffff81286e62>] submit_bh+0x12/0x20
  [<ffffffff813395bd>] jbd2_write_superblock+0x8d/0x170
  [<ffffffff8133974d>] jbd2_mark_journal_empty+0x5d/0x90
  [<ffffffff813399cb>] jbd2_journal_destroy+0x24b/0x270
  [<ffffffff810bc4ca>] ? put_pwq_unlocked+0x2a/0x30
  [<ffffffff810bc6f5>] ? destroy_workqueue+0x225/0x250
  [<ffffffff81303494>] ext4_put_super+0x64/0x360
  [<ffffffff8124ab1a>] generic_shutdown_super+0x6a/0xf0

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 block/blk-core.c       |   71 +++++++++++++++++++++++++++++++++++++------
 block/blk-mq-sysfs.c   |    6 ----
 block/blk-mq.c         |   80 ++++++++++++++----------------------------------
 block/blk-sysfs.c      |    3 +-
 block/blk.h            |   14 ++++++++
 include/linux/blk-mq.h |    1 -
 include/linux/blkdev.h |    2 +
 7 files changed, 102 insertions(+), 75 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 2eb722d48773..9b4d735cb5b8 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -554,13 +554,10 @@ void blk_cleanup_queue(struct request_queue *q)
 	 * Drain all requests queued before DYING marking. Set DEAD flag to
 	 * prevent that q->request_fn() gets invoked after draining finished.
 	 */
-	if (q->mq_ops) {
-		blk_mq_freeze_queue(q);
-		spin_lock_irq(lock);
-	} else {
-		spin_lock_irq(lock);
+	blk_freeze_queue(q);
+	spin_lock_irq(lock);
+	if (!q->mq_ops)
 		__blk_drain_queue(q, true);
-	}
 	queue_flag_set(QUEUE_FLAG_DEAD, q);
 	spin_unlock_irq(lock);
 
@@ -570,6 +567,7 @@ void blk_cleanup_queue(struct request_queue *q)
 
 	if (q->mq_ops)
 		blk_mq_free_queue(q);
+	percpu_ref_exit(&q->q_usage_counter);
 
 	spin_lock_irq(lock);
 	if (q->queue_lock != &q->__queue_lock)
@@ -629,6 +627,40 @@ struct request_queue *blk_alloc_queue(gfp_t gfp_mask)
 }
 EXPORT_SYMBOL(blk_alloc_queue);
 
+int blk_queue_enter(struct request_queue *q, gfp_t gfp)
+{
+	while (true) {
+		int ret;
+
+		if (percpu_ref_tryget_live(&q->q_usage_counter))
+			return 0;
+
+		if (!(gfp & __GFP_WAIT))
+			return -EBUSY;
+
+		ret = wait_event_interruptible(q->mq_freeze_wq,
+				!atomic_read(&q->mq_freeze_depth) ||
+				blk_queue_dying(q));
+		if (blk_queue_dying(q))
+			return -ENODEV;
+		if (ret)
+			return ret;
+	}
+}
+
+void blk_queue_exit(struct request_queue *q)
+{
+	percpu_ref_put(&q->q_usage_counter);
+}
+
+static void blk_queue_usage_counter_release(struct percpu_ref *ref)
+{
+	struct request_queue *q =
+		container_of(ref, struct request_queue, q_usage_counter);
+
+	wake_up_all(&q->mq_freeze_wq);
+}
+
 struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 {
 	struct request_queue *q;
@@ -690,11 +722,22 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 
 	init_waitqueue_head(&q->mq_freeze_wq);
 
-	if (blkcg_init_queue(q))
+	/*
+	 * Init percpu_ref in atomic mode so that it's faster to shutdown.
+	 * See blk_register_queue() for details.
+	 */
+	if (percpu_ref_init(&q->q_usage_counter,
+				blk_queue_usage_counter_release,
+				PERCPU_REF_INIT_ATOMIC, GFP_KERNEL))
 		goto fail_bdi;
 
+	if (blkcg_init_queue(q))
+		goto fail_ref;
+
 	return q;
 
+fail_ref:
+	percpu_ref_exit(&q->q_usage_counter);
 fail_bdi:
 	bdi_destroy(&q->backing_dev_info);
 fail_split:
@@ -1966,9 +2009,19 @@ void generic_make_request(struct bio *bio)
 	do {
 		struct request_queue *q = bdev_get_queue(bio->bi_bdev);
 
-		q->make_request_fn(q, bio);
+		if (likely(blk_queue_enter(q, __GFP_WAIT) == 0)) {
+
+			q->make_request_fn(q, bio);
+
+			blk_queue_exit(q);
 
-		bio = bio_list_pop(current->bio_list);
+			bio = bio_list_pop(current->bio_list);
+		} else {
+			struct bio *bio_next = bio_list_pop(current->bio_list);
+
+			bio_io_error(bio);
+			bio = bio_next;
+		}
 	} while (bio);
 	current->bio_list = NULL; /* deactivate */
 }
diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c
index 788fffd9b409..6f57a110289c 100644
--- a/block/blk-mq-sysfs.c
+++ b/block/blk-mq-sysfs.c
@@ -413,12 +413,6 @@ static void blk_mq_sysfs_init(struct request_queue *q)
 		kobject_init(&ctx->kobj, &blk_mq_ctx_ktype);
 }
 
-/* see blk_register_queue() */
-void blk_mq_finish_init(struct request_queue *q)
-{
-	percpu_ref_switch_to_percpu(&q->mq_usage_counter);
-}
-
 int blk_mq_register_disk(struct gendisk *disk)
 {
 	struct device *dev = disk_to_dev(disk);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7785ae96267a..c371aeda2986 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -77,47 +77,13 @@ static void blk_mq_hctx_clear_pending(struct blk_mq_hw_ctx *hctx,
 	clear_bit(CTX_TO_BIT(hctx, ctx), &bm->word);
 }
 
-static int blk_mq_queue_enter(struct request_queue *q, gfp_t gfp)
-{
-	while (true) {
-		int ret;
-
-		if (percpu_ref_tryget_live(&q->mq_usage_counter))
-			return 0;
-
-		if (!(gfp & __GFP_WAIT))
-			return -EBUSY;
-
-		ret = wait_event_interruptible(q->mq_freeze_wq,
-				!atomic_read(&q->mq_freeze_depth) ||
-				blk_queue_dying(q));
-		if (blk_queue_dying(q))
-			return -ENODEV;
-		if (ret)
-			return ret;
-	}
-}
-
-static void blk_mq_queue_exit(struct request_queue *q)
-{
-	percpu_ref_put(&q->mq_usage_counter);
-}
-
-static void blk_mq_usage_counter_release(struct percpu_ref *ref)
-{
-	struct request_queue *q =
-		container_of(ref, struct request_queue, mq_usage_counter);
-
-	wake_up_all(&q->mq_freeze_wq);
-}
-
 void blk_mq_freeze_queue_start(struct request_queue *q)
 {
 	int freeze_depth;
 
 	freeze_depth = atomic_inc_return(&q->mq_freeze_depth);
 	if (freeze_depth == 1) {
-		percpu_ref_kill(&q->mq_usage_counter);
+		percpu_ref_kill(&q->q_usage_counter);
 		blk_mq_run_hw_queues(q, false);
 	}
 }
@@ -125,18 +91,34 @@ EXPORT_SYMBOL_GPL(blk_mq_freeze_queue_start);
 
 static void blk_mq_freeze_queue_wait(struct request_queue *q)
 {
-	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->mq_usage_counter));
+	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter));
 }
 
 /*
  * Guarantee no request is in use, so we can change any data structure of
  * the queue afterward.
  */
-void blk_mq_freeze_queue(struct request_queue *q)
+void blk_freeze_queue(struct request_queue *q)
 {
+	/*
+	 * In the !blk_mq case we are only calling this to kill the
+	 * q_usage_counter, otherwise this increases the freeze depth
+	 * and waits for it to return to zero.  For this reason there is
+	 * no blk_unfreeze_queue(), and blk_freeze_queue() is not
+	 * exported to drivers as the only user for unfreeze is blk_mq.
+	 */
 	blk_mq_freeze_queue_start(q);
 	blk_mq_freeze_queue_wait(q);
 }
+
+void blk_mq_freeze_queue(struct request_queue *q)
+{
+	/*
+	 * ...just an alias to keep freeze and unfreeze actions balanced
+	 * in the blk_mq_* namespace
+	 */
+	blk_freeze_queue(q);
+}
 EXPORT_SYMBOL_GPL(blk_mq_freeze_queue);
 
 void blk_mq_unfreeze_queue(struct request_queue *q)
@@ -146,7 +128,7 @@ void blk_mq_unfreeze_queue(struct request_queue *q)
 	freeze_depth = atomic_dec_return(&q->mq_freeze_depth);
 	WARN_ON_ONCE(freeze_depth < 0);
 	if (!freeze_depth) {
-		percpu_ref_reinit(&q->mq_usage_counter);
+		percpu_ref_reinit(&q->q_usage_counter);
 		wake_up_all(&q->mq_freeze_wq);
 	}
 }
@@ -255,7 +237,7 @@ struct request *blk_mq_alloc_request(struct request_queue *q, int rw, gfp_t gfp,
 	struct blk_mq_alloc_data alloc_data;
 	int ret;
 
-	ret = blk_mq_queue_enter(q, gfp);
+	ret = blk_queue_enter(q, gfp);
 	if (ret)
 		return ERR_PTR(ret);
 
@@ -278,7 +260,7 @@ struct request *blk_mq_alloc_request(struct request_queue *q, int rw, gfp_t gfp,
 	}
 	blk_mq_put_ctx(ctx);
 	if (!rq) {
-		blk_mq_queue_exit(q);
+		blk_queue_exit(q);
 		return ERR_PTR(-EWOULDBLOCK);
 	}
 	return rq;
@@ -297,7 +279,7 @@ static void __blk_mq_free_request(struct blk_mq_hw_ctx *hctx,
 
 	clear_bit(REQ_ATOM_STARTED, &rq->atomic_flags);
 	blk_mq_put_tag(hctx, tag, &ctx->last_tag);
-	blk_mq_queue_exit(q);
+	blk_queue_exit(q);
 }
 
 void blk_mq_free_hctx_request(struct blk_mq_hw_ctx *hctx, struct request *rq)
@@ -1176,11 +1158,7 @@ static struct request *blk_mq_map_request(struct request_queue *q,
 	int rw = bio_data_dir(bio);
 	struct blk_mq_alloc_data alloc_data;
 
-	if (unlikely(blk_mq_queue_enter(q, GFP_KERNEL))) {
-		bio_io_error(bio);
-		return NULL;
-	}
-
+	blk_queue_enter_live(q);
 	ctx = blk_mq_get_ctx(q);
 	hctx = q->mq_ops->map_queue(q, ctx->cpu);
 
@@ -1989,14 +1967,6 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 		hctxs[i]->queue_num = i;
 	}
 
-	/*
-	 * Init percpu_ref in atomic mode so that it's faster to shutdown.
-	 * See blk_register_queue() for details.
-	 */
-	if (percpu_ref_init(&q->mq_usage_counter, blk_mq_usage_counter_release,
-			    PERCPU_REF_INIT_ATOMIC, GFP_KERNEL))
-		goto err_hctxs;
-
 	setup_timer(&q->timeout, blk_mq_rq_timer, (unsigned long) q);
 	blk_queue_rq_timeout(q, set->timeout ? set->timeout : 30 * HZ);
 
@@ -2077,8 +2047,6 @@ void blk_mq_free_queue(struct request_queue *q)
 
 	blk_mq_exit_hw_queues(q, set, set->nr_hw_queues);
 	blk_mq_free_hw_queues(q, set);
-
-	percpu_ref_exit(&q->mq_usage_counter);
 }
 
 /* Basically redo blk_mq_init_queue with queue frozen */
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 3e44a9da2a13..61fc2633bbea 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -599,9 +599,8 @@ int blk_register_queue(struct gendisk *disk)
 	 */
 	if (!blk_queue_init_done(q)) {
 		queue_flag_set_unlocked(QUEUE_FLAG_INIT_DONE, q);
+		percpu_ref_switch_to_percpu(&q->q_usage_counter);
 		blk_queue_bypass_end(q);
-		if (q->mq_ops)
-			blk_mq_finish_init(q);
 	}
 
 	ret = blk_trace_init_sysfs(dev);
diff --git a/block/blk.h b/block/blk.h
index 98614ad37c81..5b2cd393afbe 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -72,6 +72,20 @@ void blk_dequeue_request(struct request *rq);
 void __blk_queue_free_tags(struct request_queue *q);
 bool __blk_end_bidi_request(struct request *rq, int error,
 			    unsigned int nr_bytes, unsigned int bidi_bytes);
+int blk_queue_enter(struct request_queue *q, gfp_t gfp);
+void blk_queue_exit(struct request_queue *q);
+void blk_freeze_queue(struct request_queue *q);
+
+static inline void blk_queue_enter_live(struct request_queue *q)
+{
+	/*
+	 * Given that running in generic_make_request() context
+	 * guarantees that a live reference against q_usage_counter has
+	 * been established, further references under that same context
+	 * need not check that the queue has been frozen (marked dead).
+	 */
+	percpu_ref_get(&q->q_usage_counter);
+}
 
 void blk_rq_timed_out_timer(unsigned long data);
 unsigned long blk_rq_timeout(unsigned long timeout);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 5e7d43ab61c0..83cc9d4e5455 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -166,7 +166,6 @@ enum {
 struct request_queue *blk_mq_init_queue(struct blk_mq_tag_set *);
 struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 						  struct request_queue *q);
-void blk_mq_finish_init(struct request_queue *q);
 int blk_mq_register_disk(struct gendisk *);
 void blk_mq_unregister_disk(struct gendisk *);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 19c2e947d4d1..dd761b663e6a 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -450,7 +450,7 @@ struct request_queue {
 #endif
 	struct rcu_head		rcu_head;
 	wait_queue_head_t	mq_freeze_wq;
-	struct percpu_ref	mq_usage_counter;
+	struct percpu_ref	q_usage_counter;
 	struct list_head	all_q_node;
 
 	struct blk_mq_tag_set	*tag_set;


* [PATCH v2 02/20] dax: increase granularity of dax_clear_blocks() operations
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
  2015-10-10  0:55 ` [PATCH v2 01/20] block: generic request_queue reference counting Dan Williams
@ 2015-10-10  0:55 ` Dan Williams
  2015-10-10  0:55 ` [PATCH v2 03/20] block, dax: fix lifetime of in-kernel dax mappings with dax_map_atomic() Dan Williams
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:55 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, ross.zwisler, linux-kernel, hch

dax_clear_blocks() is currently performing a cond_resched() after every
PAGE_SIZE memset.  We need not check so frequently; for example, md-raid
only calls cond_resched() at stripe granularity.  Also, in preparation
for introducing a dax_map_atomic() operation that temporarily pins a dax
mapping, move the call to cond_resched() to the outer loop.
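
For scale, with 4K pages a 1GiB clear currently performs 262,144
cond_resched() checks; at the SZ_1M granularity used below that drops to
1,024.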

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/dax.c |   27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index cc9a6e3d7389..7031b0312596 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -28,6 +28,7 @@
 #include <linux/sched.h>
 #include <linux/uio.h>
 #include <linux/vmstat.h>
+#include <linux/sizes.h>
 
 /*
  * dax_clear_blocks() is called from within transaction context from XFS,
@@ -43,24 +44,20 @@ int dax_clear_blocks(struct inode *inode, sector_t block, long size)
 	do {
 		void __pmem *addr;
 		unsigned long pfn;
-		long count;
+		long count, sz;
 
-		count = bdev_direct_access(bdev, sector, &addr, &pfn, size);
+		sz = min_t(long, size, SZ_1M);
+		count = bdev_direct_access(bdev, sector, &addr, &pfn, sz);
 		if (count < 0)
 			return count;
-		BUG_ON(size < count);
-		while (count > 0) {
-			unsigned pgsz = PAGE_SIZE - offset_in_page(addr);
-			if (pgsz > count)
-				pgsz = count;
-			clear_pmem(addr, pgsz);
-			addr += pgsz;
-			size -= pgsz;
-			count -= pgsz;
-			BUG_ON(pgsz & 511);
-			sector += pgsz / 512;
-			cond_resched();
-		}
+		if (count < sz)
+			sz = count;
+		clear_pmem(addr, sz);
+		addr += sz;
+		size -= sz;
+		BUG_ON(sz & 511);
+		sector += sz / 512;
+		cond_resched();
 	} while (size);
 
 	wmb_pmem();


* [PATCH v2 03/20] block, dax: fix lifetime of in-kernel dax mappings with dax_map_atomic()
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
  2015-10-10  0:55 ` [PATCH v2 01/20] block: generic request_queue reference counting Dan Williams
  2015-10-10  0:55 ` [PATCH v2 02/20] dax: increase granularity of dax_clear_blocks() operations Dan Williams
@ 2015-10-10  0:55 ` Dan Williams
  2015-10-10  0:55 ` [PATCH v2 04/20] mm: introduce __get_dev_pagemap() Dan Williams
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:55 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Jens Axboe, Boaz Harrosh, Dave Chinner, linux-kernel, linux-mm,
	ross.zwisler, hch

The DAX implementation needs to protect new calls to ->direct_access()
and usage of its return value against unbind of the underlying block
device.  Use blk_queue_enter()/blk_queue_exit() to either prevent
blk_cleanup_queue() from proceeding, or fail the dax_map_atomic() if the
request_queue is being torn down.
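
Condensed from the diff below, every in-kernel dax access now brackets
the ->direct_access() result with a queue reference:

	addr = dax_map_atomic(bdev, sector, size);
	if (IS_ERR(addr))
		return PTR_ERR(addr);	/* -EIO if the queue is dying */
	/* ...use the persistent memory mapping... */
	dax_unmap_atomic(bdev, addr);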

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Boaz Harrosh <boaz@plexistor.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 block/blk.h            |    2 -
 fs/dax.c               |  165 ++++++++++++++++++++++++++++++++----------------
 include/linux/blkdev.h |    2 +
 3 files changed, 112 insertions(+), 57 deletions(-)

diff --git a/block/blk.h b/block/blk.h
index 5b2cd393afbe..0f8de0dda768 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -72,8 +72,6 @@ void blk_dequeue_request(struct request *rq);
 void __blk_queue_free_tags(struct request_queue *q);
 bool __blk_end_bidi_request(struct request *rq, int error,
 			    unsigned int nr_bytes, unsigned int bidi_bytes);
-int blk_queue_enter(struct request_queue *q, gfp_t gfp);
-void blk_queue_exit(struct request_queue *q);
 void blk_freeze_queue(struct request_queue *q);
 
 static inline void blk_queue_enter_live(struct request_queue *q)
diff --git a/fs/dax.c b/fs/dax.c
index 7031b0312596..9549cd523649 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -30,6 +30,40 @@
 #include <linux/vmstat.h>
 #include <linux/sizes.h>
 
+static void __pmem *__dax_map_atomic(struct block_device *bdev, sector_t sector,
+		long size, unsigned long *pfn, long *len)
+{
+	long rc;
+	void __pmem *addr;
+	struct request_queue *q = bdev->bd_queue;
+
+	if (blk_queue_enter(q, GFP_NOWAIT) != 0)
+		return (void __pmem *) ERR_PTR(-EIO);
+	rc = bdev_direct_access(bdev, sector, &addr, pfn, size);
+	if (len)
+		*len = rc;
+	if (rc < 0) {
+		blk_queue_exit(q);
+		return (void __pmem *) ERR_PTR(rc);
+	}
+	return addr;
+}
+
+static void __pmem *dax_map_atomic(struct block_device *bdev, sector_t sector,
+		long size)
+{
+	unsigned long pfn;
+
+	return __dax_map_atomic(bdev, sector, size, &pfn, NULL);
+}
+
+static void dax_unmap_atomic(struct block_device *bdev, void __pmem *addr)
+{
+	if (IS_ERR(addr))
+		return;
+	blk_queue_exit(bdev->bd_queue);
+}
+
 /*
  * dax_clear_blocks() is called from within transaction context from XFS,
  * and hence this means the stack from this point must follow GFP_NOFS
@@ -47,9 +81,9 @@ int dax_clear_blocks(struct inode *inode, sector_t block, long size)
 		long count, sz;
 
 		sz = min_t(long, size, SZ_1M);
-		count = bdev_direct_access(bdev, sector, &addr, &pfn, sz);
-		if (count < 0)
-			return count;
+		addr = __dax_map_atomic(bdev, sector, size, &pfn, &count);
+		if (IS_ERR(addr))
+			return PTR_ERR(addr);
 		if (count < sz)
 			sz = count;
 		clear_pmem(addr, sz);
@@ -57,6 +91,7 @@ int dax_clear_blocks(struct inode *inode, sector_t block, long size)
 		size -= sz;
 		BUG_ON(sz & 511);
 		sector += sz / 512;
+		dax_unmap_atomic(bdev, addr);
 		cond_resched();
 	} while (size);
 
@@ -65,14 +100,6 @@ int dax_clear_blocks(struct inode *inode, sector_t block, long size)
 }
 EXPORT_SYMBOL_GPL(dax_clear_blocks);
 
-static long dax_get_addr(struct buffer_head *bh, void __pmem **addr,
-		unsigned blkbits)
-{
-	unsigned long pfn;
-	sector_t sector = bh->b_blocknr << (blkbits - 9);
-	return bdev_direct_access(bh->b_bdev, sector, addr, &pfn, bh->b_size);
-}
-
 /* the clear_pmem() calls are ordered by a wmb_pmem() in the caller */
 static void dax_new_buf(void __pmem *addr, unsigned size, unsigned first,
 		loff_t pos, loff_t end)
@@ -102,19 +129,30 @@ static bool buffer_size_valid(struct buffer_head *bh)
 	return bh->b_state != 0;
 }
 
+
+static sector_t to_sector(const struct buffer_head *bh,
+		const struct inode *inode)
+{
+	sector_t sector = bh->b_blocknr << (inode->i_blkbits - 9);
+
+	return sector;
+}
+
 static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
 		      loff_t start, loff_t end, get_block_t get_block,
 		      struct buffer_head *bh)
 {
-	ssize_t retval = 0;
-	loff_t pos = start;
-	loff_t max = start;
-	loff_t bh_max = start;
-	void __pmem *addr;
+	loff_t pos = start, max = start, bh_max = start;
+	struct block_device *bdev = NULL;
+	int rw = iov_iter_rw(iter), rc;
+	long map_len = 0;
+	unsigned long pfn;
+	void __pmem *addr = NULL;
+	void __pmem *kmap = (void __pmem *) ERR_PTR(-EIO);
 	bool hole = false;
 	bool need_wmb = false;
 
-	if (iov_iter_rw(iter) != WRITE)
+	if (rw == READ)
 		end = min(end, i_size_read(inode));
 
 	while (pos < end) {
@@ -129,13 +167,13 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
 			if (pos == bh_max) {
 				bh->b_size = PAGE_ALIGN(end - pos);
 				bh->b_state = 0;
-				retval = get_block(inode, block, bh,
-						   iov_iter_rw(iter) == WRITE);
-				if (retval)
+				rc = get_block(inode, block, bh, rw == WRITE);
+				if (rc)
 					break;
 				if (!buffer_size_valid(bh))
 					bh->b_size = 1 << blkbits;
 				bh_max = pos - first + bh->b_size;
+				bdev = bh->b_bdev;
 			} else {
 				unsigned done = bh->b_size -
 						(bh_max - (pos - first));
@@ -143,21 +181,27 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
 				bh->b_size -= done;
 			}
 
-			hole = iov_iter_rw(iter) != WRITE && !buffer_written(bh);
+			hole = rw == READ && !buffer_written(bh);
 			if (hole) {
 				addr = NULL;
 				size = bh->b_size - first;
 			} else {
-				retval = dax_get_addr(bh, &addr, blkbits);
-				if (retval < 0)
+				dax_unmap_atomic(bdev, kmap);
+				kmap = __dax_map_atomic(bdev,
+						to_sector(bh, inode),
+						bh->b_size, &pfn, &map_len);
+				if (IS_ERR(kmap)) {
+					rc = PTR_ERR(kmap);
 					break;
+				}
+				addr = kmap;
 				if (buffer_unwritten(bh) || buffer_new(bh)) {
-					dax_new_buf(addr, retval, first, pos,
-									end);
+					dax_new_buf(addr, map_len, first, pos,
+							end);
 					need_wmb = true;
 				}
 				addr += first;
-				size = retval - first;
+				size = map_len - first;
 			}
 			max = min(pos + size, end);
 		}
@@ -180,8 +224,9 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
 
 	if (need_wmb)
 		wmb_pmem();
+	dax_unmap_atomic(bdev, kmap);
 
-	return (pos == start) ? retval : pos - start;
+	return (pos == start) ? rc : pos - start;
 }
 
 /**
@@ -270,28 +315,31 @@ static int dax_load_hole(struct address_space *mapping, struct page *page,
 	return VM_FAULT_LOCKED;
 }
 
-static int copy_user_bh(struct page *to, struct buffer_head *bh,
-			unsigned blkbits, unsigned long vaddr)
+static int copy_user_bh(struct page *to, struct inode *inode,
+		struct buffer_head *bh, unsigned long vaddr)
 {
+	struct block_device *bdev = bh->b_bdev;
 	void __pmem *vfrom;
 	void *vto;
 
-	if (dax_get_addr(bh, &vfrom, blkbits) < 0)
-		return -EIO;
+	vfrom = dax_map_atomic(bdev, to_sector(bh, inode), bh->b_size);
+	if (IS_ERR(vfrom))
+		return PTR_ERR(vfrom);
 	vto = kmap_atomic(to);
 	copy_user_page(vto, (void __force *)vfrom, vaddr, to);
 	kunmap_atomic(vto);
+	dax_unmap_atomic(bdev, vfrom);
 	return 0;
 }
 
 static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh,
 			struct vm_area_struct *vma, struct vm_fault *vmf)
 {
-	struct address_space *mapping = inode->i_mapping;
-	sector_t sector = bh->b_blocknr << (inode->i_blkbits - 9);
 	unsigned long vaddr = (unsigned long)vmf->virtual_address;
-	void __pmem *addr;
+	struct address_space *mapping = inode->i_mapping;
+	struct block_device *bdev = bh->b_bdev;
 	unsigned long pfn;
+	void __pmem *addr;
 	pgoff_t size;
 	int error;
 
@@ -310,11 +358,10 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh,
 		goto out;
 	}
 
-	error = bdev_direct_access(bh->b_bdev, sector, &addr, &pfn, bh->b_size);
-	if (error < 0)
-		goto out;
-	if (error < PAGE_SIZE) {
-		error = -EIO;
+	addr = __dax_map_atomic(bdev, to_sector(bh, inode), bh->b_size,
+			&pfn, NULL);
+	if (IS_ERR(addr)) {
+		error = PTR_ERR(addr);
 		goto out;
 	}
 
@@ -322,6 +369,7 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh,
 		clear_pmem(addr, PAGE_SIZE);
 		wmb_pmem();
 	}
+	dax_unmap_atomic(bdev, addr);
 
 	error = vm_insert_mixed(vma, vaddr, pfn);
 
@@ -417,7 +465,7 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf,
 	if (vmf->cow_page) {
 		struct page *new_page = vmf->cow_page;
 		if (buffer_written(&bh))
-			error = copy_user_bh(new_page, &bh, blkbits, vaddr);
+			error = copy_user_bh(new_page, inode, &bh, vaddr);
 		else
 			clear_user_highpage(new_page, vaddr);
 		if (error)
@@ -529,11 +577,9 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	unsigned blkbits = inode->i_blkbits;
 	unsigned long pmd_addr = address & PMD_MASK;
 	bool write = flags & FAULT_FLAG_WRITE;
-	long length;
-	void __pmem *kaddr;
+	struct block_device *bdev;
 	pgoff_t size, pgoff;
-	sector_t block, sector;
-	unsigned long pfn;
+	sector_t block;
 	int result = 0;
 
 	/* Fall back to PTEs if we're going to COW */
@@ -557,9 +603,9 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	block = (sector_t)pgoff << (PAGE_SHIFT - blkbits);
 
 	bh.b_size = PMD_SIZE;
-	length = get_block(inode, block, &bh, write);
-	if (length)
+	if (get_block(inode, block, &bh, write) != 0)
 		return VM_FAULT_SIGBUS;
+	bdev = bh.b_bdev;
 	i_mmap_lock_read(mapping);
 
 	/*
@@ -614,15 +660,20 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 		result = VM_FAULT_NOPAGE;
 		spin_unlock(ptl);
 	} else {
-		sector = bh.b_blocknr << (blkbits - 9);
-		length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
-						bh.b_size);
-		if (length < 0) {
+		long length;
+		unsigned long pfn;
+		void __pmem *kaddr = __dax_map_atomic(bdev,
+				to_sector(&bh, inode), HPAGE_SIZE, &pfn,
+				&length);
+
+		if (IS_ERR(kaddr)) {
 			result = VM_FAULT_SIGBUS;
 			goto out;
 		}
-		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
+		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR)) {
+			dax_unmap_atomic(bdev, kaddr);
 			goto fallback;
+		}
 
 		if (buffer_unwritten(&bh) || buffer_new(&bh)) {
 			clear_pmem(kaddr, HPAGE_SIZE);
@@ -631,6 +682,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 			mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
 			result |= VM_FAULT_MAJOR;
 		}
+		dax_unmap_atomic(bdev, kaddr);
 
 		result |= vmf_insert_pfn_pmd(vma, address, pmd, pfn, write);
 	}
@@ -734,12 +786,15 @@ int dax_zero_page_range(struct inode *inode, loff_t from, unsigned length,
 	if (err < 0)
 		return err;
 	if (buffer_written(&bh)) {
-		void __pmem *addr;
-		err = dax_get_addr(&bh, &addr, inode->i_blkbits);
-		if (err < 0)
-			return err;
+		struct block_device *bdev = bh.b_bdev;
+		void __pmem *addr = dax_map_atomic(bdev, to_sector(&bh, inode),
+				PAGE_CACHE_SIZE);
+
+		if (IS_ERR(addr))
+			return PTR_ERR(addr);
 		clear_pmem(addr + offset, length);
 		wmb_pmem();
+		dax_unmap_atomic(bdev, addr);
 	}
 
 	return 0;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index dd761b663e6a..cd091cb2b96e 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -788,6 +788,8 @@ extern int scsi_cmd_ioctl(struct request_queue *, struct gendisk *, fmode_t,
 extern int sg_scsi_ioctl(struct request_queue *, struct gendisk *, fmode_t,
 			 struct scsi_ioctl_command __user *);
 
+extern int blk_queue_enter(struct request_queue *q, gfp_t gfp);
+extern void blk_queue_exit(struct request_queue *q);
 extern void blk_start_queue(struct request_queue *q);
 extern void blk_stop_queue(struct request_queue *q);
 extern void blk_sync_queue(struct request_queue *q);


* [PATCH v2 04/20] mm: introduce __get_dev_pagemap()
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (2 preceding siblings ...)
  2015-10-10  0:55 ` [PATCH v2 03/20] block, dax: fix lifetime of in-kernel dax mappings with dax_map_atomic() Dan Williams
@ 2015-10-10  0:55 ` Dan Williams
  2015-10-10  0:55 ` [PATCH v2 05/20] x86, mm: introduce vmem_altmap to augment vmemmap_populate() Dan Williams
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:55 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Dave Chinner, linux-kernel, linux-mm, ross.zwisler, Andrew Morton, hch

There are several scenarios where we need to retrieve and update
metadata associated with a given devm_memremap_pages() mapping, and the
only lookup key available is a pfn in the range:

1/ We want to augment vmemmap_populate() (called via arch_add_memory())
   to allocate memmap storage from pre-allocated pages reserved by the
   device driver.  At vmemmap_alloc_block_buf() time it grabs device pages
   rather than page allocator pages.  This is in support of
   devm_memremap_pages() mappings where the memmap is too large to fit in
   main memory (i.e. large persistent memory devices).

2/ Taking a reference against the mapping when inserting device pages
   into the address_space radix tree of a given inode.  This facilitates
   unmap_mapping_range() and truncate_inode_pages() operations when the
   driver is tearing down the mapping.

3/ get_user_pages() operations on ZONE_DEVICE memory require taking a
   reference against the mapping so that the driver teardown path can
   revoke and drain usage of device pages.
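
In each case the lookup follows the same pattern.  A minimal sketch
against the interface added below, which assumes rcu_read_lock() is held
across the walk:

	rcu_read_lock();
	pgmap = __get_dev_pagemap(PFN_PHYS(pfn));
	if (pgmap)
		dev = pgmap->dev;	/* consult the mapping's metadata */
	rcu_read_unlock();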

Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 include/linux/mm.h |   18 ++++++++++++++++++
 kernel/memremap.c  |   40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 80001de019ba..30c3c8764649 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -717,6 +717,24 @@ static inline enum zone_type page_zonenum(const struct page *page)
 	return (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;
 }
 
+/**
+ * struct dev_pagemap - metadata for ZONE_DEVICE mappings
+ * @dev: host device of the mapping for debug
+ */
+struct dev_pagemap {
+	/* TODO: vmem_altmap and percpu_ref count */
+	struct device *dev;
+};
+
+#ifdef CONFIG_ZONE_DEVICE
+struct dev_pagemap *__get_dev_pagemap(resource_size_t phys);
+#else
+static inline struct dev_pagemap *get_dev_pagemap(resource_size_t phys)
+{
+	return NULL;
+}
+#endif
+
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define SECTION_IN_PAGE_FLAGS
 #endif
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 3218e8b1fc28..64bfd9fa93aa 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -10,6 +10,7 @@
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
  * General Public License for more details.
  */
+#include <linux/rculist.h>
 #include <linux/device.h>
 #include <linux/types.h>
 #include <linux/io.h>
@@ -138,18 +139,52 @@ void devm_memunmap(struct device *dev, void *addr)
 EXPORT_SYMBOL(devm_memunmap);
 
 #ifdef CONFIG_ZONE_DEVICE
+static LIST_HEAD(ranges);
+static DEFINE_SPINLOCK(range_lock);
+
 struct page_map {
 	struct resource res;
+	struct dev_pagemap pgmap;
+	struct list_head list;
 };
 
+static void add_page_map(struct page_map *page_map)
+{
+	spin_lock(&range_lock);
+	list_add_rcu(&page_map->list, &ranges);
+	spin_unlock(&range_lock);
+}
+
+static void del_page_map(struct page_map *page_map)
+{
+	spin_lock(&range_lock);
+	list_del_rcu(&page_map->list);
+	spin_unlock(&range_lock);
+}
+
 static void devm_memremap_pages_release(struct device *dev, void *res)
 {
 	struct page_map *page_map = res;
 
+	del_page_map(page_map);
+
 	/* pages are dead and unused, undo the arch mapping */
 	arch_remove_memory(page_map->res.start, resource_size(&page_map->res));
 }
 
+/* assumes rcu_read_lock() held at entry */
+struct dev_pagemap *__get_dev_pagemap(resource_size_t phys)
+{
+	struct page_map *page_map;
+
+	WARN_ON_ONCE(!rcu_read_lock_held());
+
+	list_for_each_entry_rcu(page_map, &ranges, list)
+		if (phys >= page_map->res.start && phys <= page_map->res.end)
+			return &page_map->pgmap;
+	return NULL;
+}
+
 void *devm_memremap_pages(struct device *dev, struct resource *res)
 {
 	int is_ram = region_intersects(res->start, resource_size(res),
@@ -173,12 +208,17 @@ void *devm_memremap_pages(struct device *dev, struct resource *res)
 
 	memcpy(&page_map->res, res, sizeof(*res));
 
+	page_map->pgmap.dev = dev;
+	INIT_LIST_HEAD(&page_map->list);
+	add_page_map(page_map);
+
 	nid = dev_to_node(dev);
 	if (nid < 0)
 		nid = numa_mem_id();
 
 	error = arch_add_memory(nid, res->start, resource_size(res), true);
 	if (error) {
+		del_page_map(page_map);
 		devres_free(page_map);
 		return ERR_PTR(error);
 	}


* [PATCH v2 05/20] x86, mm: introduce vmem_altmap to augment vmemmap_populate()
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (3 preceding siblings ...)
  2015-10-10  0:55 ` [PATCH v2 04/20] mm: introduce __get_dev_pagemap() Dan Williams
@ 2015-10-10  0:55 ` Dan Williams
  2015-10-19 22:53   ` Williams, Dan J
  2015-10-10  0:55 ` [PATCH v2 06/20] libnvdimm, pfn, pmem: allocate memmap array in persistent memory Dan Williams
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:55 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Dave Hansen, linux-kernel, linux-mm, Ingo Molnar, H. Peter Anvin,
	Andrew Morton, ross.zwisler, hch

In support of providing struct page for large persistent memory
capacities, use struct vmem_altmap to change the default policy for
allocating memory for the memmap array.  The default vmemmap_populate()
allocates page table storage from the page allocator.  Given persistent
memory capacities relative to DRAM, it may not be feasible to store the
memmap in 'System Memory'.  Instead, vmem_altmap represents pre-allocated
"device pages" to satisfy vmemmap_alloc_block_buf() requests.
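
A minimal sketch of the intended driver-side usage (the real pmem
enabling follows in patch 6; 'nr_driver_pfns' and 'nr_memmap_pfns' are
hypothetical placeholders):

	struct vmem_altmap altmap = {
		.base_pfn = res->start >> PAGE_SHIFT,
		.reserve = nr_driver_pfns,	/* pages kept back for driver use */
		.free = nr_memmap_pfns,		/* pages set aside for memmap storage */
	};

	addr = devm_memremap_pages(dev, res, &altmap);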

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/m68k/include/asm/page_mm.h |    1 
 arch/m68k/include/asm/page_no.h |    1 
 arch/x86/mm/init_64.c           |   32 ++++++++++---
 drivers/nvdimm/pmem.c           |    6 ++
 include/linux/io.h              |   17 -------
 include/linux/memory_hotplug.h  |    3 +
 include/linux/mm.h              |   95 ++++++++++++++++++++++++++++++++++++++-
 kernel/memremap.c               |   69 +++++++++++++++++++++++++---
 mm/memory_hotplug.c             |   66 +++++++++++++++++++--------
 mm/page_alloc.c                 |   10 ++++
 mm/sparse-vmemmap.c             |   37 ++++++++++++++-
 mm/sparse.c                     |    8 ++-
 12 files changed, 282 insertions(+), 63 deletions(-)

diff --git a/arch/m68k/include/asm/page_mm.h b/arch/m68k/include/asm/page_mm.h
index 5029f73e6294..884f2f7e4caf 100644
--- a/arch/m68k/include/asm/page_mm.h
+++ b/arch/m68k/include/asm/page_mm.h
@@ -125,6 +125,7 @@ static inline void *__va(unsigned long x)
  */
 #define virt_to_pfn(kaddr)	(__pa(kaddr) >> PAGE_SHIFT)
 #define pfn_to_virt(pfn)	__va((pfn) << PAGE_SHIFT)
+#define	__pfn_to_phys(pfn)	PFN_PHYS(pfn)
 
 extern int m68k_virt_to_node_shift;
 
diff --git a/arch/m68k/include/asm/page_no.h b/arch/m68k/include/asm/page_no.h
index ef209169579a..7845eca0b36d 100644
--- a/arch/m68k/include/asm/page_no.h
+++ b/arch/m68k/include/asm/page_no.h
@@ -24,6 +24,7 @@ extern unsigned long memory_end;
 
 #define virt_to_pfn(kaddr)	(__pa(kaddr) >> PAGE_SHIFT)
 #define pfn_to_virt(pfn)	__va((pfn) << PAGE_SHIFT)
+#define	__pfn_to_phys(pfn)	PFN_PHYS(pfn)
 
 #define virt_to_page(addr)	(mem_map + (((unsigned long)(addr)-PAGE_OFFSET) >> PAGE_SHIFT))
 #define page_to_virt(page)	__va(((((page) - mem_map) << PAGE_SHIFT) + PAGE_OFFSET))
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index e5d42f1a2a71..cabf8ceb0a6b 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -714,6 +714,12 @@ static void __meminit free_pagetable(struct page *page, int order)
 {
 	unsigned long magic;
 	unsigned int nr_pages = 1 << order;
+	struct vmem_altmap *altmap = to_vmem_altmap((unsigned long) page);
+
+	if (altmap) {
+		vmem_altmap_free(altmap, nr_pages);
+		return;
+	}
 
 	/* bootmem page has reserved flag */
 	if (PageReserved(page)) {
@@ -1018,13 +1024,19 @@ int __ref arch_remove_memory(u64 start, u64 size)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
+	struct page *page = pfn_to_page(start_pfn);
+	struct vmem_altmap *altmap;
 	struct zone *zone;
 	int ret;
 
-	zone = page_zone(pfn_to_page(start_pfn));
-	kernel_physical_mapping_remove(start, start + size);
+	/* With altmap the first mapped page is offset from @start */
+	altmap = to_vmem_altmap((unsigned long) page);
+	if (altmap)
+		page += vmem_altmap_offset(altmap);
+	zone = page_zone(page);
 	ret = __remove_pages(zone, start_pfn, nr_pages);
 	WARN_ON_ONCE(ret);
+	kernel_physical_mapping_remove(start, start + size);
 
 	return ret;
 }
@@ -1234,7 +1246,7 @@ static void __meminitdata *p_start, *p_end;
 static int __meminitdata node_start;
 
 static int __meminit vmemmap_populate_hugepages(unsigned long start,
-						unsigned long end, int node)
+		unsigned long end, int node, struct vmem_altmap *altmap)
 {
 	unsigned long addr;
 	unsigned long next;
@@ -1257,7 +1269,7 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
 		if (pmd_none(*pmd)) {
 			void *p;
 
-			p = vmemmap_alloc_block_buf(PMD_SIZE, node);
+			p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
 			if (p) {
 				pte_t entry;
 
@@ -1278,7 +1290,8 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
 				addr_end = addr + PMD_SIZE;
 				p_end = p + PMD_SIZE;
 				continue;
-			}
+			} else if (altmap)
+				return -ENOMEM; /* no fallback */
 		} else if (pmd_large(*pmd)) {
 			vmemmap_verify((pte_t *)pmd, node, addr, next);
 			continue;
@@ -1292,11 +1305,16 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 {
+	struct vmem_altmap *altmap = to_vmem_altmap(start);
 	int err;
 
 	if (cpu_has_pse)
-		err = vmemmap_populate_hugepages(start, end, node);
-	else
+		err = vmemmap_populate_hugepages(start, end, node, altmap);
+	else if (altmap) {
+		pr_err_once("%s: no cpu support for altmap allocations\n",
+				__func__);
+		err = -ENOMEM;
+	} else
 		err = vmemmap_populate_basepages(start, end, node);
 	if (!err)
 		sync_global_pgds(start, end - 1, 0);
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 349f03e7ed06..3c5b8f585441 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -151,7 +151,8 @@ static struct pmem_device *pmem_alloc(struct device *dev,
 	}
 
 	if (pmem_should_map_pages(dev))
-		pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, res);
+		pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, res,
+				NULL);
 	else
 		pmem->virt_addr = (void __pmem *) devm_memremap(dev,
 				pmem->phys_addr, pmem->size,
@@ -362,7 +363,8 @@ static int nvdimm_namespace_attach_pfn(struct nd_namespace_common *ndns)
 	/* establish pfn range for lookup, and switch to direct map */
 	pmem = dev_get_drvdata(dev);
 	devm_memunmap(dev, (void __force *) pmem->virt_addr);
-	pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, &nsio->res);
+	pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, &nsio->res,
+			NULL);
 	if (IS_ERR(pmem->virt_addr)) {
 		rc = PTR_ERR(pmem->virt_addr);
 		goto err;
diff --git a/include/linux/io.h b/include/linux/io.h
index de64c1e53612..2f2f8859abd9 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -87,23 +87,6 @@ void *devm_memremap(struct device *dev, resource_size_t offset,
 		size_t size, unsigned long flags);
 void devm_memunmap(struct device *dev, void *addr);
 
-void *__devm_memremap_pages(struct device *dev, struct resource *res);
-
-#ifdef CONFIG_ZONE_DEVICE
-void *devm_memremap_pages(struct device *dev, struct resource *res);
-#else
-static inline void *devm_memremap_pages(struct device *dev, struct resource *res)
-{
-	/*
-	 * Fail attempts to call devm_memremap_pages() without
-	 * ZONE_DEVICE support enabled, this requires callers to fall
-	 * back to plain devm_memremap() based on config
-	 */
-	WARN_ON_ONCE(1);
-	return ERR_PTR(-ENXIO);
-}
-#endif
-
 /*
  * Some systems do not have legacy ISA devices.
  * /dev/port is not a valid interface on these systems.
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 8f60e899b33c..178e000a7983 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -273,7 +273,8 @@ extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern bool is_memblock_offlined(struct memory_block *mem);
 extern void remove_memory(int nid, u64 start, u64 size);
 extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn);
-extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms);
+extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
+		unsigned long map_offset);
 extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
 					  unsigned long pnum);
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 30c3c8764649..b5628cfbf649 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -718,18 +718,106 @@ static inline enum zone_type page_zonenum(const struct page *page)
 }
 
 /**
+ * struct vmem_altmap - pre-allocated storage for vmemmap_populate
+ * @base_pfn: base of the entire dev_pagemap mapping
+ * @reserve: pages mapped, but reserved for driver use (relative to @base)
+ * @free: free pages set aside in the mapping for memmap storage
+ * @align: pages reserved to meet allocation alignments
+ * @alloc: track pages consumed, private to vmemmap_populate()
+ */
+struct vmem_altmap {
+	const unsigned long base_pfn;
+	const unsigned long reserve;
+	unsigned long free;
+	unsigned long align;
+	unsigned long alloc;
+};
+
+static inline unsigned long vmem_altmap_nr_free(struct vmem_altmap *altmap)
+{
+	unsigned long allocated = altmap->alloc + altmap->align;
+
+	if (altmap->free > allocated)
+		return altmap->free - allocated;
+	return 0;
+}
+
+static inline unsigned long vmem_altmap_offset(struct vmem_altmap *altmap)
+{
+	/* number of pfns from base where pfn_to_page() is valid */
+	return altmap->reserve + altmap->free;
+}
+
+static inline unsigned long vmem_altmap_next_pfn(struct vmem_altmap *altmap)
+{
+	return altmap->base_pfn + altmap->reserve + altmap->alloc
+		+ altmap->align;
+}
+
+/**
+ * vmem_altmap_alloc - allocate pages from the vmem_altmap reservation
+ * @altmap - reserved page pool for the allocation
+ * @nr_pfns - size (in pages) of the allocation
+ *
+ * Allocations are aligned to the size of the request
+ */
+static inline unsigned long vmem_altmap_alloc(struct vmem_altmap *altmap,
+		unsigned long nr_pfns)
+{
+	unsigned long pfn = vmem_altmap_next_pfn(altmap);
+	unsigned long nr_align;
+
+	nr_align = 1UL << find_first_bit(&nr_pfns, BITS_PER_LONG);
+	nr_align = ALIGN(pfn, nr_align) - pfn;
+
+	if (nr_pfns + nr_align > vmem_altmap_nr_free(altmap))
+		return ULONG_MAX;
+	altmap->alloc += nr_pfns;
+	altmap->align += nr_align;
+	return pfn + nr_align;
+}
+
+static inline void vmem_altmap_free(struct vmem_altmap *altmap,
+		unsigned long nr_pfns)
+{
+	altmap->alloc -= nr_pfns;
+}
+
+/**
  * struct dev_pagemap - metadata for ZONE_DEVICE mappings
+ * @altmap: pre-allocated/reserved memory for vmemmap allocations
  * @dev: host device of the mapping for debug
  */
 struct dev_pagemap {
-	/* TODO: vmem_altmap and percpu_ref count */
+	struct vmem_altmap *altmap;
+	const struct resource *res;
 	struct device *dev;
 };
 
 #ifdef CONFIG_ZONE_DEVICE
 struct dev_pagemap *__get_dev_pagemap(resource_size_t phys);
+void *devm_memremap_pages(struct device *dev, struct resource *res,
+		struct vmem_altmap *altmap);
+struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start);
 #else
-static inline struct dev_pagemap *get_dev_pagemap(resource_size_t phys)
+static inline struct dev_pagemap *__get_dev_pagemap(resource_size_t phys)
+{
+	return NULL;
+}
+
+static inline void *devm_memremap_pages(struct device *dev, struct resource *res,
+		struct vmem_altmap *altmap)
+{
+	/*
+	 * Fail attempts to call devm_memremap_pages() without
+	 * ZONE_DEVICE support enabled, this requires callers to fall
+	 * back to plain devm_memremap() based on config
+	 */
+	WARN_ON_ONCE(1);
+	return ERR_PTR(-ENXIO);
+}
+
+static inline struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
 {
 	return NULL;
 }
@@ -2245,7 +2333,8 @@ pud_t *vmemmap_pud_populate(pgd_t *pgd, unsigned long addr, int node);
 pmd_t *vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node);
 pte_t *vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node);
 void *vmemmap_alloc_block(unsigned long size, int node);
-void *vmemmap_alloc_block_buf(unsigned long size, int node);
+void *vmemmap_alloc_block_buf(unsigned long size, int node,
+		struct vmem_altmap *altmap);
 void vmemmap_verify(pte_t *, int, unsigned long, unsigned long);
 int vmemmap_populate_basepages(unsigned long start, unsigned long end,
 			       int node);
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 64bfd9fa93aa..75161bb68af1 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -146,6 +146,7 @@ struct page_map {
 	struct resource res;
 	struct dev_pagemap pgmap;
 	struct list_head list;
+	struct vmem_altmap altmap;
 };
 
 static void add_page_map(struct page_map *page_map)
@@ -162,14 +163,17 @@ static void del_page_map(struct page_map *page_map)
 	spin_unlock(&range_lock);
 }
 
-static void devm_memremap_pages_release(struct device *dev, void *res)
+static void devm_memremap_pages_release(struct device *dev, void *data)
 {
-	struct page_map *page_map = res;
-
-	del_page_map(page_map);
+	struct page_map *page_map = data;
+	struct resource *res = &page_map->res;
+	struct dev_pagemap *pgmap = &page_map->pgmap;
 
 	/* pages are dead and unused, undo the arch mapping */
-	arch_remove_memory(page_map->res.start, resource_size(&page_map->res));
+	arch_remove_memory(res->start, resource_size(res));
+	dev_WARN_ONCE(dev, pgmap->altmap && pgmap->altmap->alloc,
+			"%s: failed to free all reserved pages\n", __func__);
+	del_page_map(page_map);
 }
 
 /* assumes rcu_read_lock() held at entry */
@@ -185,10 +189,22 @@ struct dev_pagemap *__get_dev_pagemap(resource_size_t phys)
 	return NULL;
 }
 
-void *devm_memremap_pages(struct device *dev, struct resource *res)
+/**
+ * devm_memremap_pages - remap and provide memmap backing for the given resource
+ * @dev: hosting device for @res
+ * @res: "host memory" address range
+ * @altmap: optional descriptor for allocating the memmap from @res
+ *
+ * Note: the expectation is that @res is a host memory range that could
+ * feasibly be treated as a "System RAM" range, i.e. not a device MMIO
+ * range, but this is not enforced.
+ */
+void *devm_memremap_pages(struct device *dev, struct resource *res,
+		struct vmem_altmap *altmap)
 {
 	int is_ram = region_intersects(res->start, resource_size(res),
 			"System RAM");
+	struct dev_pagemap *pgmap;
 	struct page_map *page_map;
 	int error, nid;
 
@@ -205,10 +221,15 @@ void *devm_memremap_pages(struct device *dev, struct resource *res)
 			sizeof(*page_map), GFP_KERNEL, dev_to_node(dev));
 	if (!page_map)
 		return ERR_PTR(-ENOMEM);
+	pgmap = &page_map->pgmap;
 
 	memcpy(&page_map->res, res, sizeof(*res));
-
-	page_map->pgmap.dev = dev;
+	if (altmap) {
+		memcpy(&page_map->altmap, altmap, sizeof(*altmap));
+		pgmap->altmap = &page_map->altmap;
+	}
+	pgmap->dev = dev;
+	pgmap->res = &page_map->res;
 	INIT_LIST_HEAD(&page_map->list);
 	add_page_map(page_map);
 
@@ -227,4 +248,36 @@ void *devm_memremap_pages(struct device *dev, struct resource *res)
 	return __va(res->start);
 }
 EXPORT_SYMBOL(devm_memremap_pages);
+
+/*
+ * Unconditionally retrieve a dev_pagemap associated with the given physical
+ * address; this is only for use by arch_{add|remove}_memory() when setting
+ * up and tearing down the memmap.
+ */
+static struct dev_pagemap *lookup_dev_pagemap(resource_size_t phys)
+{
+	struct dev_pagemap *pgmap;
+
+	rcu_read_lock();
+	pgmap = __get_dev_pagemap(phys);
+	rcu_read_unlock();
+	return pgmap;
+}
+
+struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
+{
+	/*
+	 * 'memmap_start' is the virtual address for the first "struct
+	 * page" in this range of the vmemmap array.  In the case of
+	 * CONFIG_SPARSE_VMEMMAP a page_to_pfn conversion is simple
+	 * pointer arithmetic, so we can perform this to_vmem_altmap()
+	 * conversion without concern for the initialization state of
+	 * the struct page fields.
+	 */
+	struct page *page = (struct page *) memmap_start;
+	struct dev_pagemap *pgmap;
+
+	pgmap = lookup_dev_pagemap(__pfn_to_phys(page_to_pfn(page)));
+	return pgmap ? pgmap->altmap : NULL;
+}
 #endif /* CONFIG_ZONE_DEVICE */
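
The pointer arithmetic that to_vmem_altmap() relies on is the
CONFIG_SPARSEMEM_VMEMMAP flavor of the generic memory model, which is
(roughly, per asm-generic/memory_model.h) pure offset math against the
vmemmap base:

	/* no struct page fields are read, so this is safe pre-init */
	#define __pfn_to_page(pfn)	(vmemmap + (pfn))
	#define __page_to_pfn(page)	(unsigned long)((page) - vmemmap)
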
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index aa992e2df58a..3521df153de3 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -505,10 +505,25 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
 	unsigned long i;
 	int err = 0;
 	int start_sec, end_sec;
+	struct vmem_altmap *altmap;
+
 	/* when initializing mem_map, align the hot-added range to a section */
 	start_sec = pfn_to_section_nr(phys_start_pfn);
 	end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1);
 
+	altmap = to_vmem_altmap((unsigned long) pfn_to_page(phys_start_pfn));
+	if (altmap) {
+		/*
+		 * Validate altmap is within bounds of the total request
+		 */
+		if (altmap->base_pfn != phys_start_pfn
+				|| vmem_altmap_offset(altmap) > nr_pages) {
+			pr_warn_once("memory add fail, invalid altmap\n");
+			return -EINVAL;
+		}
+		altmap->alloc = 0;
+	}
+
 	for (i = start_sec; i <= end_sec; i++) {
 		err = __add_section(nid, zone, section_nr_to_pfn(i));
 
@@ -730,7 +745,8 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn)
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 }
 
-static int __remove_section(struct zone *zone, struct mem_section *ms)
+static int __remove_section(struct zone *zone, struct mem_section *ms,
+		unsigned long map_offset)
 {
 	unsigned long start_pfn;
 	int scn_nr;
@@ -747,7 +763,7 @@ static int __remove_section(struct zone *zone, struct mem_section *ms)
 	start_pfn = section_nr_to_pfn(scn_nr);
 	__remove_zone(zone, start_pfn);
 
-	sparse_remove_one_section(zone, ms);
+	sparse_remove_one_section(zone, ms, map_offset);
 	return 0;
 }
 
@@ -766,9 +782,32 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
 		 unsigned long nr_pages)
 {
 	unsigned long i;
-	int sections_to_remove;
-	resource_size_t start, size;
-	int ret = 0;
+	unsigned long map_offset = 0;
+	int sections_to_remove, ret = 0;
+
+	/* In the ZONE_DEVICE case the device driver owns the memory region */
+	if (is_dev_zone(zone)) {
+		struct page *page = pfn_to_page(phys_start_pfn);
+		struct vmem_altmap *altmap;
+
+		altmap = to_vmem_altmap((unsigned long) page);
+		if (altmap)
+			map_offset = vmem_altmap_offset(altmap);
+	} else {
+		resource_size_t start, size;
+
+		start = phys_start_pfn << PAGE_SHIFT;
+		size = nr_pages * PAGE_SIZE;
+
+		ret = release_mem_region_adjustable(&iomem_resource, start,
+					size);
+		if (ret) {
+			resource_size_t endres = start + size - 1;
+
+			pr_warn("Unable to release resource <%pa-%pa> (%d)\n",
+					&start, &endres, ret);
+		}
+	}
 
 	/*
 	 * We can only remove entire sections
@@ -776,23 +815,12 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
 	BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
 	BUG_ON(nr_pages % PAGES_PER_SECTION);
 
-	start = phys_start_pfn << PAGE_SHIFT;
-	size = nr_pages * PAGE_SIZE;
-
-	/* in the ZONE_DEVICE case device driver owns the memory region */
-	if (!is_dev_zone(zone))
-		ret = release_mem_region_adjustable(&iomem_resource, start, size);
-	if (ret) {
-		resource_size_t endres = start + size - 1;
-
-		pr_warn("Unable to release resource <%pa-%pa> (%d)\n",
-				&start, &endres, ret);
-	}
-
 	sections_to_remove = nr_pages / PAGES_PER_SECTION;
 	for (i = 0; i < sections_to_remove; i++) {
 		unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION;
-		ret = __remove_section(zone, __pfn_to_section(pfn));
+
+		ret = __remove_section(zone, __pfn_to_section(pfn), map_offset);
+		map_offset = 0;
 		if (ret)
 			break;
 	}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 48aaf7b9f253..9dfc431d6271 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4620,8 +4620,9 @@ static void setup_zone_migrate_reserve(struct zone *zone)
 void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		unsigned long start_pfn, enum memmap_context context)
 {
-	pg_data_t *pgdat = NODE_DATA(nid);
+	struct vmem_altmap *altmap = to_vmem_altmap(__pfn_to_phys(start_pfn));
 	unsigned long end_pfn = start_pfn + size;
+	pg_data_t *pgdat = NODE_DATA(nid);
 	unsigned long pfn;
 	struct zone *z;
 	unsigned long nr_initialised = 0;
@@ -4629,6 +4630,13 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 	if (highest_memmap_pfn < end_pfn - 1)
 		highest_memmap_pfn = end_pfn - 1;
 
+	/*
+	 * Honor reservation requested by the driver for this ZONE_DEVICE
+	 * memory
+	 */
+	if (altmap && start_pfn == altmap->base_pfn)
+		start_pfn += altmap->reserve;
+
 	z = &pgdat->node_zones[zone];
 	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
 		/*
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 4cba9c2783a1..96c1dca4ce6a 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -70,7 +70,7 @@ void * __meminit vmemmap_alloc_block(unsigned long size, int node)
 }
 
 /* need to make sure size is all the same during early stage */
-void * __meminit vmemmap_alloc_block_buf(unsigned long size, int node)
+static void * __meminit __vmemmap_alloc_block_buf(unsigned long size, int node)
 {
 	void *ptr;
 
@@ -87,6 +87,39 @@ void * __meminit vmemmap_alloc_block_buf(unsigned long size, int node)
 	return ptr;
 }
 
+static void * __meminit altmap_alloc_block_buf(unsigned long size,
+		struct vmem_altmap *altmap)
+{
+	unsigned long pfn, nr_pfns;
+	void *ptr;
+
+	if (size & ~PAGE_MASK) {
+		pr_warn_once("%s: allocations must be a multiple of PAGE_SIZE (%ld)\n",
+				__func__, size);
+		return NULL;
+	}
+
+	nr_pfns = size >> PAGE_SHIFT;
+	pfn = vmem_altmap_alloc(altmap, nr_pfns);
+	if (pfn < ULONG_MAX)
+		ptr = __va(__pfn_to_phys(pfn));
+	else
+		ptr = NULL;
+	pr_debug("%s: pfn: %#lx alloc: %ld align: %ld nr: %#lx\n",
+			__func__, pfn, altmap->alloc, altmap->align, nr_pfns);
+
+	return ptr;
+}
+
+/* need to make sure size is all the same during early stage */
+void * __meminit vmemmap_alloc_block_buf(unsigned long size, int node,
+		struct vmem_altmap *altmap)
+{
+	if (altmap)
+		return altmap_alloc_block_buf(size, altmap);
+	return __vmemmap_alloc_block_buf(size, node);
+}
+
 void __meminit vmemmap_verify(pte_t *pte, int node,
 				unsigned long start, unsigned long end)
 {
@@ -103,7 +136,7 @@ pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node)
 	pte_t *pte = pte_offset_kernel(pmd, addr);
 	if (pte_none(*pte)) {
 		pte_t entry;
-		void *p = vmemmap_alloc_block_buf(PAGE_SIZE, node);
+		void *p = __vmemmap_alloc_block_buf(PAGE_SIZE, node);
 		if (!p)
 			return NULL;
 		entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
diff --git a/mm/sparse.c b/mm/sparse.c
index d1b48b691ac8..3717ceed4177 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -748,7 +748,7 @@ static void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
 	if (!memmap)
 		return;
 
-	for (i = 0; i < PAGES_PER_SECTION; i++) {
+	for (i = 0; i < nr_pages; i++) {
 		if (PageHWPoison(&memmap[i])) {
 			atomic_long_sub(1, &num_poisoned_pages);
 			ClearPageHWPoison(&memmap[i]);
@@ -788,7 +788,8 @@ static void free_section_usemap(struct page *memmap, unsigned long *usemap)
 		free_map_bootmem(memmap);
 }
 
-void sparse_remove_one_section(struct zone *zone, struct mem_section *ms)
+void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
+		unsigned long map_offset)
 {
 	struct page *memmap = NULL;
 	unsigned long *usemap = NULL, flags;
@@ -804,7 +805,8 @@ void sparse_remove_one_section(struct zone *zone, struct mem_section *ms)
 	}
 	pgdat_resize_unlock(pgdat, &flags);
 
-	clear_hwpoisoned_pages(memmap, PAGES_PER_SECTION);
+	clear_hwpoisoned_pages(memmap + map_offset,
+			PAGES_PER_SECTION - map_offset);
 	free_section_usemap(memmap, usemap);
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 06/20] libnvdimm, pfn, pmem: allocate memmap array in persistent memory
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (4 preceding siblings ...)
  2015-10-10  0:55 ` [PATCH v2 05/20] x86, mm: introduce vmem_altmap to augment vmemmap_populate() Dan Williams
@ 2015-10-10  0:55 ` Dan Williams
  2015-10-10  0:56 ` [PATCH v2 07/20] avr32: convert to asm-generic/memory_model.h Dan Williams
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:55 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Dave Chinner, linux-kernel, linux-mm, ross.zwisler, Andrew Morton, hch

Use the new vmem_altmap capability to enable the pmem driver to arrange
for a struct page memmap to be established in persistent memory.
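
As a sketch of the resulting layout (hypothetical numbers -- e.g. 4K
pages and a 2M data offset -- not values taken from this patch): the
first 8K of the namespace covers the info block plus padding and is
only reserved, while the remaining space up to the data offset becomes
allocatable memmap capacity:

	struct vmem_altmap __altmap = {
		.base_pfn = __phys_to_pfn(nsio->res.start),
		.reserve  = __phys_to_pfn(SZ_8K),	/* 2 pfns: info block */
	};
	/* e.g. offset == SZ_2M: free = (2M - 8K) / 4K = 510 pfns of
	 * memmap storage carved out of the namespace itself */
	altmap->free = __phys_to_pfn(offset - SZ_8K);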

Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/pfn_devs.c |    3 +--
 drivers/nvdimm/pmem.c     |   19 +++++++++++++++++--
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 71805a1aa0f3..a642cfacee07 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -83,8 +83,7 @@ static ssize_t mode_store(struct device *dev,
 
 		if (strncmp(buf, "pmem\n", n) == 0
 				|| strncmp(buf, "pmem", n) == 0) {
-			/* TODO: allocate from PMEM support */
-			rc = -ENOTTY;
+			nd_pfn->mode = PFN_MODE_PMEM;
 		} else if (strncmp(buf, "ram\n", n) == 0
 				|| strncmp(buf, "ram", n) == 0)
 			nd_pfn->mode = PFN_MODE_RAM;
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 3c5b8f585441..bb66158c0505 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -322,12 +322,16 @@ static int nvdimm_namespace_attach_pfn(struct nd_namespace_common *ndns)
 	struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev);
 	struct nd_pfn *nd_pfn = to_nd_pfn(ndns->claim);
 	struct device *dev = &nd_pfn->dev;
-	struct vmem_altmap *altmap;
 	struct nd_region *nd_region;
+	struct vmem_altmap *altmap;
 	struct nd_pfn_sb *pfn_sb;
 	struct pmem_device *pmem;
 	phys_addr_t offset;
 	int rc;
+	struct vmem_altmap __altmap = {
+		.base_pfn = __phys_to_pfn(nsio->res.start),
+		.reserve = __phys_to_pfn(SZ_8K),
+	};
 
 	if (!nd_pfn->uuid || !nd_pfn->ndns)
 		return -ENODEV;
@@ -355,6 +359,17 @@ static int nvdimm_namespace_attach_pfn(struct nd_namespace_common *ndns)
 			return -EINVAL;
 		nd_pfn->npfns = le64_to_cpu(pfn_sb->npfns);
 		altmap = NULL;
+	} else if (nd_pfn->mode == PFN_MODE_PMEM) {
+		nd_pfn->npfns = (resource_size(&nsio->res) - offset)
+			/ PAGE_SIZE;
+		if (le64_to_cpu(nd_pfn->pfn_sb->npfns) > nd_pfn->npfns)
+			dev_info(&nd_pfn->dev,
+					"number of pfns truncated from %lld to %ld\n",
+					le64_to_cpu(nd_pfn->pfn_sb->npfns),
+					nd_pfn->npfns);
+		altmap = &__altmap;
+		altmap->free = __phys_to_pfn(offset - SZ_8K);
+		altmap->alloc = 0;
 	} else {
 		rc = -ENXIO;
 		goto err;
@@ -364,7 +379,7 @@ static int nvdimm_namespace_attach_pfn(struct nd_namespace_common *ndns)
 	pmem = dev_get_drvdata(dev);
 	devm_memunmap(dev, (void __force *) pmem->virt_addr);
 	pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, &nsio->res,
-			NULL);
+			altmap);
 	if (IS_ERR(pmem->virt_addr)) {
 		rc = PTR_ERR(pmem->virt_addr);
 		goto err;


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 07/20] avr32: convert to asm-generic/memory_model.h
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (5 preceding siblings ...)
  2015-10-10  0:55 ` [PATCH v2 06/20] libnvdimm, pfn, pmem: allocate memmap array in persistent memory Dan Williams
@ 2015-10-10  0:56 ` Dan Williams
  2015-10-10  0:56 ` [PATCH v2 08/20] hugetlb: fix compile error on tile Dan Williams
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:56 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, ross.zwisler, linux-kernel, hch

Switch avr32/include/asm/page.h to use the common definitions for
pfn_to_page(), page_to_pfn(), and ARCH_PFN_OFFSET.
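
For FLATMEM (!CONFIG_NEED_MULTIPLE_NODES) builds,
asm-generic/memory_model.h supplies the same math the arch-local macros
open-coded, roughly:

	#define __pfn_to_page(pfn)	(mem_map + ((pfn) - ARCH_PFN_OFFSET))
	#define __page_to_pfn(page)	((unsigned long)((page) - mem_map) + \
					 ARCH_PFN_OFFSET)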

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/avr32/include/asm/page.h |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/avr32/include/asm/page.h b/arch/avr32/include/asm/page.h
index f805d1cb11bc..c5d2a3e2c62f 100644
--- a/arch/avr32/include/asm/page.h
+++ b/arch/avr32/include/asm/page.h
@@ -83,11 +83,9 @@ static inline int get_order(unsigned long size)
 
 #ifndef CONFIG_NEED_MULTIPLE_NODES
 
-#define PHYS_PFN_OFFSET		(CONFIG_PHYS_OFFSET >> PAGE_SHIFT)
+#define ARCH_PFN_OFFSET		(CONFIG_PHYS_OFFSET >> PAGE_SHIFT)
 
-#define pfn_to_page(pfn)	(mem_map + ((pfn) - PHYS_PFN_OFFSET))
-#define page_to_pfn(page)	((unsigned long)((page) - mem_map) + PHYS_PFN_OFFSET)
-#define pfn_valid(pfn)		((pfn) >= PHYS_PFN_OFFSET && (pfn) < (PHYS_PFN_OFFSET + max_mapnr))
+#define pfn_valid(pfn)		((pfn) >= ARCH_PFN_OFFSET && (pfn) < (ARCH_PFN_OFFSET + max_mapnr))
 #endif /* CONFIG_NEED_MULTIPLE_NODES */
 
 #define virt_to_page(kaddr)	pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
@@ -101,4 +99,6 @@ static inline int get_order(unsigned long size)
  */
 #define HIGHMEM_START		0x20000000UL
 
+#include <asm-generic/memory_model.h>
+
 #endif /* __ASM_AVR32_PAGE_H */


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 08/20] hugetlb: fix compile error on tile
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (6 preceding siblings ...)
  2015-10-10  0:56 ` [PATCH v2 07/20] avr32: convert to asm-generic/memory_model.h Dan Williams
@ 2015-10-10  0:56 ` Dan Williams
  2015-10-10  0:56 ` [PATCH v2 09/20] frv: fix compiler warning from definition of __pmd() Dan Williams
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:56 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, ross.zwisler, linux-kernel, hch

Include asm/pgtable.h to get the definition of pud_t and fix:

include/linux/hugetlb.h:203:29: error: unknown type name 'pud_t'

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 include/linux/hugetlb.h |    1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5e35379f58a5..ad5539cf52bf 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -8,6 +8,7 @@
 #include <linux/cgroup.h>
 #include <linux/list.h>
 #include <linux/kref.h>
+#include <asm/pgtable.h>
 
 struct ctl_table;
 struct user_struct;


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 09/20] frv: fix compiler warning from definition of __pmd()
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (7 preceding siblings ...)
  2015-10-10  0:56 ` [PATCH v2 08/20] hugetlb: fix compile error on tile Dan Williams
@ 2015-10-10  0:56 ` Dan Williams
  2015-10-10  0:56 ` [PATCH v2 10/20] um: kill pfn_t Dan Williams
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:56 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, ross.zwisler, linux-kernel, hch

Take into account that the pmd_t type is an array inside a struct, so it
needs two levels of braces to initialize.  Otherwise, a usage of __pmd()
generates a warning:

include/linux/mm.h:986:2: warning: missing braces around initializer [-Wmissing-braces]
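
A standalone illustration of the initializer rule, using a hypothetical
type that mirrors the shape of frv's pmd_t:

	typedef struct { unsigned long ste[64]; } pmd_t; /* array in a struct */

	pmd_t a = { { 0 } };	/* outer braces: struct, inner braces: array */
	pmd_t b = { 0 };	/* valid C, but trips -Wmissing-braces */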

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/frv/include/asm/page.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/frv/include/asm/page.h b/arch/frv/include/asm/page.h
index 8c97068ac8fc..688d8076a43a 100644
--- a/arch/frv/include/asm/page.h
+++ b/arch/frv/include/asm/page.h
@@ -34,7 +34,7 @@ typedef struct page *pgtable_t;
 #define pgprot_val(x)	((x).pgprot)
 
 #define __pte(x)	((pte_t) { (x) } )
-#define __pmd(x)	((pmd_t) { (x) } )
+#define __pmd(x)	((pmd_t) { { (x) } } )
 #define __pud(x)	((pud_t) { (x) } )
 #define __pgd(x)	((pgd_t) { (x) } )
 #define __pgprot(x)	((pgprot_t) { (x) } )


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 10/20] um: kill pfn_t
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (8 preceding siblings ...)
  2015-10-10  0:56 ` [PATCH v2 09/20] frv: fix compiler warning from definition of __pmd() Dan Williams
@ 2015-10-10  0:56 ` Dan Williams
  2015-10-10  0:56 ` [PATCH v2 11/20] kvm: rename pfn_t to kvm_pfn_t Dan Williams
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:56 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Dave Hansen, Richard Weinberger, Jeff Dike, linux-kernel,
	linux-mm, ross.zwisler, hch

The core has developed a need for a "pfn_t" type [1].  Convert the usage
of pfn_t by usermode-linux to an unsigned long, and update pfn_to_phys()
to drop its expectation of a typed pfn.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html

Cc: Dave Hansen <dave@sr71.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/um/include/asm/page.h           |    6 +++---
 arch/um/include/asm/pgtable-3level.h |    4 ++--
 arch/um/include/asm/pgtable.h        |    2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/um/include/asm/page.h b/arch/um/include/asm/page.h
index 71c5d132062a..fe26a5e06268 100644
--- a/arch/um/include/asm/page.h
+++ b/arch/um/include/asm/page.h
@@ -18,6 +18,7 @@
 
 struct page;
 
+#include <linux/pfn.h>
 #include <linux/types.h>
 #include <asm/vm-flags.h>
 
@@ -76,7 +77,6 @@ typedef struct { unsigned long pmd; } pmd_t;
 #define pte_is_zero(p) (!((p).pte & ~_PAGE_NEWPAGE))
 #define pte_set_val(p, phys, prot) (p).pte = (phys | pgprot_val(prot))
 
-typedef unsigned long pfn_t;
 typedef unsigned long phys_t;
 
 #endif
@@ -109,8 +109,8 @@ extern unsigned long uml_physmem;
 #define __pa(virt) to_phys((void *) (unsigned long) (virt))
 #define __va(phys) to_virt((unsigned long) (phys))
 
-#define phys_to_pfn(p) ((pfn_t) ((p) >> PAGE_SHIFT))
-#define pfn_to_phys(pfn) ((phys_t) ((pfn) << PAGE_SHIFT))
+#define phys_to_pfn(p) ((p) >> PAGE_SHIFT)
+#define pfn_to_phys(pfn) PFN_PHYS(pfn)
 
 #define pfn_valid(pfn) ((pfn) < max_mapnr)
 #define virt_addr_valid(v) pfn_valid(phys_to_pfn(__pa(v)))
diff --git a/arch/um/include/asm/pgtable-3level.h b/arch/um/include/asm/pgtable-3level.h
index 2b4274e7c095..bae8523a162f 100644
--- a/arch/um/include/asm/pgtable-3level.h
+++ b/arch/um/include/asm/pgtable-3level.h
@@ -98,7 +98,7 @@ static inline unsigned long pte_pfn(pte_t pte)
 	return phys_to_pfn(pte_val(pte));
 }
 
-static inline pte_t pfn_pte(pfn_t page_nr, pgprot_t pgprot)
+static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
 {
 	pte_t pte;
 	phys_t phys = pfn_to_phys(page_nr);
@@ -107,7 +107,7 @@ static inline pte_t pfn_pte(pfn_t page_nr, pgprot_t pgprot)
 	return pte;
 }
 
-static inline pmd_t pfn_pmd(pfn_t page_nr, pgprot_t pgprot)
+static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
 {
 	return __pmd((page_nr << PAGE_SHIFT) | pgprot_val(pgprot));
 }
diff --git a/arch/um/include/asm/pgtable.h b/arch/um/include/asm/pgtable.h
index 18eb9924dda3..7485398d0737 100644
--- a/arch/um/include/asm/pgtable.h
+++ b/arch/um/include/asm/pgtable.h
@@ -271,7 +271,7 @@ static inline int pte_same(pte_t pte_a, pte_t pte_b)
 
 #define phys_to_page(phys) pfn_to_page(phys_to_pfn(phys))
 #define __virt_to_page(virt) phys_to_page(__pa(virt))
-#define page_to_phys(page) pfn_to_phys((pfn_t) page_to_pfn(page))
+#define page_to_phys(page) pfn_to_phys(page_to_pfn(page))
 #define virt_to_page(addr) __virt_to_page((const unsigned long) addr)
 
 #define mk_pte(page, pgprot) \


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 11/20] kvm: rename pfn_t to kvm_pfn_t
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (9 preceding siblings ...)
  2015-10-10  0:56 ` [PATCH v2 10/20] um: kill pfn_t Dan Williams
@ 2015-10-10  0:56 ` Dan Williams
  2015-10-10 15:35   ` Christoffer Dall
  2015-10-10 20:35   ` Paolo Bonzini
  2015-10-10  0:56 ` [PATCH v2 12/20] mips: fix PAGE_MASK definition Dan Williams
                   ` (9 subsequent siblings)
  20 siblings, 2 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:56 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Dave Hansen, Russell King, linux-mm, Gleb Natapov,
	Catalin Marinas, Will Deacon, linux-kernel, Ralf Baechle,
	Marc Zyngier, Paul Mackerras, Christoffer Dall,
	Benjamin Herrenschmidt, Paolo Bonzini, ross.zwisler, hch,
	Alexander Graf

The core has developed a need for a "pfn_t" type [1].  Move the existing
pfn_t in KVM to kvm_pfn_t [2].

[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
[2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html

Cc: Dave Hansen <dave@sr71.net>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Alexander Graf <agraf@suse.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/arm/include/asm/kvm_mmu.h        |    5 ++--
 arch/arm/kvm/mmu.c                    |   10 ++++---
 arch/arm64/include/asm/kvm_mmu.h      |    3 +-
 arch/mips/include/asm/kvm_host.h      |    6 ++--
 arch/mips/kvm/emulate.c               |    2 +
 arch/mips/kvm/tlb.c                   |   14 +++++-----
 arch/powerpc/include/asm/kvm_book3s.h |    4 +--
 arch/powerpc/include/asm/kvm_ppc.h    |    2 +
 arch/powerpc/kvm/book3s.c             |    6 ++--
 arch/powerpc/kvm/book3s_32_mmu_host.c |    2 +
 arch/powerpc/kvm/book3s_64_mmu_host.c |    2 +
 arch/powerpc/kvm/e500.h               |    2 +
 arch/powerpc/kvm/e500_mmu_host.c      |    8 +++---
 arch/powerpc/kvm/trace_pr.h           |    2 +
 arch/x86/kvm/iommu.c                  |   11 ++++----
 arch/x86/kvm/mmu.c                    |   37 +++++++++++++-------------
 arch/x86/kvm/mmu_audit.c              |    2 +
 arch/x86/kvm/paging_tmpl.h            |    6 ++--
 arch/x86/kvm/vmx.c                    |    2 +
 arch/x86/kvm/x86.c                    |    2 +
 include/linux/kvm_host.h              |   37 +++++++++++++-------------
 include/linux/kvm_types.h             |    2 +
 virt/kvm/kvm_main.c                   |   47 +++++++++++++++++----------------
 23 files changed, 110 insertions(+), 104 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 405aa1883307..8ebd282dfc2b 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -182,7 +182,8 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
 	return (vcpu->arch.cp15[c1_SCTLR] & 0b101) == 0b101;
 }
 
-static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu, pfn_t pfn,
+static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
+					       kvm_pfn_t pfn,
 					       unsigned long size,
 					       bool ipa_uncached)
 {
@@ -246,7 +247,7 @@ static inline void __kvm_flush_dcache_pte(pte_t pte)
 static inline void __kvm_flush_dcache_pmd(pmd_t pmd)
 {
 	unsigned long size = PMD_SIZE;
-	pfn_t pfn = pmd_pfn(pmd);
+	kvm_pfn_t pfn = pmd_pfn(pmd);
 
 	while (size) {
 		void *va = kmap_atomic_pfn(pfn);
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 6984342da13d..e2dcbfdc4a8c 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -988,9 +988,9 @@ out:
 	return ret;
 }
 
-static bool transparent_hugepage_adjust(pfn_t *pfnp, phys_addr_t *ipap)
+static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
 {
-	pfn_t pfn = *pfnp;
+	kvm_pfn_t pfn = *pfnp;
 	gfn_t gfn = *ipap >> PAGE_SHIFT;
 
 	if (PageTransCompound(pfn_to_page(pfn))) {
@@ -1202,7 +1202,7 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 	kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
 }
 
-static void coherent_cache_guest_page(struct kvm_vcpu *vcpu, pfn_t pfn,
+static void coherent_cache_guest_page(struct kvm_vcpu *vcpu, kvm_pfn_t pfn,
 				      unsigned long size, bool uncached)
 {
 	__coherent_cache_guest_page(vcpu, pfn, size, uncached);
@@ -1219,7 +1219,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
 	struct vm_area_struct *vma;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 	pgprot_t mem_type = PAGE_S2;
 	bool fault_ipa_uncached;
 	bool logging_active = memslot_is_logging(memslot);
@@ -1347,7 +1347,7 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 {
 	pmd_t *pmd;
 	pte_t *pte;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 	bool pfn_valid = false;
 
 	trace_kvm_access_fault(fault_ipa);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 61505676d085..385fc8cef82d 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -230,7 +230,8 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
 	return (vcpu_sys_reg(vcpu, SCTLR_EL1) & 0b101) == 0b101;
 }
 
-static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu, pfn_t pfn,
+static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
+					       kvm_pfn_t pfn,
 					       unsigned long size,
 					       bool ipa_uncached)
 {
diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index 5a1a882e0a75..9c67f05a0a1b 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -101,9 +101,9 @@
 #define CAUSEF_DC			(_ULCAST_(1) << 27)
 
 extern atomic_t kvm_mips_instance;
-extern pfn_t(*kvm_mips_gfn_to_pfn) (struct kvm *kvm, gfn_t gfn);
-extern void (*kvm_mips_release_pfn_clean) (pfn_t pfn);
-extern bool(*kvm_mips_is_error_pfn) (pfn_t pfn);
+extern kvm_pfn_t (*kvm_mips_gfn_to_pfn)(struct kvm *kvm, gfn_t gfn);
+extern void (*kvm_mips_release_pfn_clean)(kvm_pfn_t pfn);
+extern bool (*kvm_mips_is_error_pfn)(kvm_pfn_t pfn);
 
 struct kvm_vm_stat {
 	u32 remote_tlb_flush;
diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c
index d5fa3eaf39a1..476296cf37d3 100644
--- a/arch/mips/kvm/emulate.c
+++ b/arch/mips/kvm/emulate.c
@@ -1525,7 +1525,7 @@ int kvm_mips_sync_icache(unsigned long va, struct kvm_vcpu *vcpu)
 	struct kvm *kvm = vcpu->kvm;
 	unsigned long pa;
 	gfn_t gfn;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 
 	gfn = va >> PAGE_SHIFT;
 
diff --git a/arch/mips/kvm/tlb.c b/arch/mips/kvm/tlb.c
index aed0ac2a4972..570479c03bdc 100644
--- a/arch/mips/kvm/tlb.c
+++ b/arch/mips/kvm/tlb.c
@@ -38,13 +38,13 @@ atomic_t kvm_mips_instance;
 EXPORT_SYMBOL(kvm_mips_instance);
 
 /* These function pointers are initialized once the KVM module is loaded */
-pfn_t (*kvm_mips_gfn_to_pfn)(struct kvm *kvm, gfn_t gfn);
+kvm_pfn_t (*kvm_mips_gfn_to_pfn)(struct kvm *kvm, gfn_t gfn);
 EXPORT_SYMBOL(kvm_mips_gfn_to_pfn);
 
-void (*kvm_mips_release_pfn_clean)(pfn_t pfn);
+void (*kvm_mips_release_pfn_clean)(kvm_pfn_t pfn);
 EXPORT_SYMBOL(kvm_mips_release_pfn_clean);
 
-bool (*kvm_mips_is_error_pfn)(pfn_t pfn);
+bool (*kvm_mips_is_error_pfn)(kvm_pfn_t pfn);
 EXPORT_SYMBOL(kvm_mips_is_error_pfn);
 
 uint32_t kvm_mips_get_kernel_asid(struct kvm_vcpu *vcpu)
@@ -144,7 +144,7 @@ EXPORT_SYMBOL(kvm_mips_dump_guest_tlbs);
 static int kvm_mips_map_page(struct kvm *kvm, gfn_t gfn)
 {
 	int srcu_idx, err = 0;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 
 	if (kvm->arch.guest_pmap[gfn] != KVM_INVALID_PAGE)
 		return 0;
@@ -262,7 +262,7 @@ int kvm_mips_handle_kseg0_tlb_fault(unsigned long badvaddr,
 				    struct kvm_vcpu *vcpu)
 {
 	gfn_t gfn;
-	pfn_t pfn0, pfn1;
+	kvm_pfn_t pfn0, pfn1;
 	unsigned long vaddr = 0;
 	unsigned long entryhi = 0, entrylo0 = 0, entrylo1 = 0;
 	int even;
@@ -313,7 +313,7 @@ EXPORT_SYMBOL(kvm_mips_handle_kseg0_tlb_fault);
 int kvm_mips_handle_commpage_tlb_fault(unsigned long badvaddr,
 	struct kvm_vcpu *vcpu)
 {
-	pfn_t pfn0, pfn1;
+	kvm_pfn_t pfn0, pfn1;
 	unsigned long flags, old_entryhi = 0, vaddr = 0;
 	unsigned long entrylo0 = 0, entrylo1 = 0;
 
@@ -360,7 +360,7 @@ int kvm_mips_handle_mapped_seg_tlb_fault(struct kvm_vcpu *vcpu,
 {
 	unsigned long entryhi = 0, entrylo0 = 0, entrylo1 = 0;
 	struct kvm *kvm = vcpu->kvm;
-	pfn_t pfn0, pfn1;
+	kvm_pfn_t pfn0, pfn1;
 
 	if ((tlb->tlb_hi & VPN2_MASK) == 0) {
 		pfn0 = 0;
diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index 9fac01cb89c1..8f39796c9da8 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -154,8 +154,8 @@ extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
 			   bool upper, u32 val);
 extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
 extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu);
-extern pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
-			bool *writable);
+extern kvm_pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa,
+			bool writing, bool *writable);
 extern void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
 			unsigned long *rmap, long pte_index, int realmode);
 extern void kvmppc_update_rmap_change(unsigned long *rmap, unsigned long psize);
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index c6ef05bd0765..2241d5357129 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -515,7 +515,7 @@ void kvmppc_claim_lpid(long lpid);
 void kvmppc_free_lpid(long lpid);
 void kvmppc_init_lpid(unsigned long nr_lpids);
 
-static inline void kvmppc_mmu_flush_icache(pfn_t pfn)
+static inline void kvmppc_mmu_flush_icache(kvm_pfn_t pfn)
 {
 	struct page *page;
 	/*
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 099c79d8c160..638c6d9be9e0 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -366,7 +366,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvmppc_core_prepare_to_enter);
 
-pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
+kvm_pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
 			bool *writable)
 {
 	ulong mp_pa = vcpu->arch.magic_page_pa & KVM_PAM;
@@ -379,9 +379,9 @@ pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
 	gpa &= ~0xFFFULL;
 	if (unlikely(mp_pa) && unlikely((gpa & KVM_PAM) == mp_pa)) {
 		ulong shared_page = ((ulong)vcpu->arch.shared) & PAGE_MASK;
-		pfn_t pfn;
+		kvm_pfn_t pfn;
 
-		pfn = (pfn_t)virt_to_phys((void*)shared_page) >> PAGE_SHIFT;
+		pfn = (kvm_pfn_t)virt_to_phys((void*)shared_page) >> PAGE_SHIFT;
 		get_page(pfn_to_page(pfn));
 		if (writable)
 			*writable = true;
diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c b/arch/powerpc/kvm/book3s_32_mmu_host.c
index d5c9bfeb0c9c..55c4d51ea3e2 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -142,7 +142,7 @@ extern char etext[];
 int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte,
 			bool iswrite)
 {
-	pfn_t hpaddr;
+	kvm_pfn_t hpaddr;
 	u64 vpn;
 	u64 vsid;
 	struct kvmppc_sid_map *map;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
index 79ad35abd196..913cd2198fa6 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -83,7 +83,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte,
 			bool iswrite)
 {
 	unsigned long vpn;
-	pfn_t hpaddr;
+	kvm_pfn_t hpaddr;
 	ulong hash, hpteg;
 	u64 vsid;
 	int ret;
diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 72920bed3ac6..94f04fcb373e 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -41,7 +41,7 @@ enum vcpu_ftr {
 #define E500_TLB_MAS2_ATTR	(0x7f)
 
 struct tlbe_ref {
-	pfn_t pfn;		/* valid only for TLB0, except briefly */
+	kvm_pfn_t pfn;		/* valid only for TLB0, except briefly */
 	unsigned int flags;	/* E500_TLB_* */
 };
 
diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 4d33e199edcc..8a5bb6dfcc2d 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -163,9 +163,9 @@ void kvmppc_map_magic(struct kvm_vcpu *vcpu)
 	struct kvm_book3e_206_tlb_entry magic;
 	ulong shared_page = ((ulong)vcpu->arch.shared) & PAGE_MASK;
 	unsigned int stid;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 
-	pfn = (pfn_t)virt_to_phys((void *)shared_page) >> PAGE_SHIFT;
+	pfn = (kvm_pfn_t)virt_to_phys((void *)shared_page) >> PAGE_SHIFT;
 	get_page(pfn_to_page(pfn));
 
 	preempt_disable();
@@ -246,7 +246,7 @@ static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe)
 
 static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
 					 struct kvm_book3e_206_tlb_entry *gtlbe,
-					 pfn_t pfn, unsigned int wimg)
+					 kvm_pfn_t pfn, unsigned int wimg)
 {
 	ref->pfn = pfn;
 	ref->flags = E500_TLB_VALID;
@@ -309,7 +309,7 @@ static void kvmppc_e500_setup_stlbe(
 	int tsize, struct tlbe_ref *ref, u64 gvaddr,
 	struct kvm_book3e_206_tlb_entry *stlbe)
 {
-	pfn_t pfn = ref->pfn;
+	kvm_pfn_t pfn = ref->pfn;
 	u32 pr = vcpu->arch.shared->msr & MSR_PR;
 
 	BUG_ON(!(ref->flags & E500_TLB_VALID));
diff --git a/arch/powerpc/kvm/trace_pr.h b/arch/powerpc/kvm/trace_pr.h
index 810507cb688a..d44f324184fb 100644
--- a/arch/powerpc/kvm/trace_pr.h
+++ b/arch/powerpc/kvm/trace_pr.h
@@ -30,7 +30,7 @@ TRACE_EVENT(kvm_book3s_reenter,
 #ifdef CONFIG_PPC_BOOK3S_64
 
 TRACE_EVENT(kvm_book3s_64_mmu_map,
-	TP_PROTO(int rflags, ulong hpteg, ulong va, pfn_t hpaddr,
+	TP_PROTO(int rflags, ulong hpteg, ulong va, kvm_pfn_t hpaddr,
 		 struct kvmppc_pte *orig_pte),
 	TP_ARGS(rflags, hpteg, va, hpaddr, orig_pte),
 
diff --git a/arch/x86/kvm/iommu.c b/arch/x86/kvm/iommu.c
index 5c520ebf6343..a22a488b4622 100644
--- a/arch/x86/kvm/iommu.c
+++ b/arch/x86/kvm/iommu.c
@@ -43,11 +43,11 @@ static int kvm_iommu_unmap_memslots(struct kvm *kvm);
 static void kvm_iommu_put_pages(struct kvm *kvm,
 				gfn_t base_gfn, unsigned long npages);
 
-static pfn_t kvm_pin_pages(struct kvm_memory_slot *slot, gfn_t gfn,
+static kvm_pfn_t kvm_pin_pages(struct kvm_memory_slot *slot, gfn_t gfn,
 			   unsigned long npages)
 {
 	gfn_t end_gfn;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 
 	pfn     = gfn_to_pfn_memslot(slot, gfn);
 	end_gfn = gfn + npages;
@@ -62,7 +62,8 @@ static pfn_t kvm_pin_pages(struct kvm_memory_slot *slot, gfn_t gfn,
 	return pfn;
 }
 
-static void kvm_unpin_pages(struct kvm *kvm, pfn_t pfn, unsigned long npages)
+static void kvm_unpin_pages(struct kvm *kvm, kvm_pfn_t pfn,
+		unsigned long npages)
 {
 	unsigned long i;
 
@@ -73,7 +74,7 @@ static void kvm_unpin_pages(struct kvm *kvm, pfn_t pfn, unsigned long npages)
 int kvm_iommu_map_pages(struct kvm *kvm, struct kvm_memory_slot *slot)
 {
 	gfn_t gfn, end_gfn;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 	int r = 0;
 	struct iommu_domain *domain = kvm->arch.iommu_domain;
 	int flags;
@@ -275,7 +276,7 @@ static void kvm_iommu_put_pages(struct kvm *kvm,
 {
 	struct iommu_domain *domain;
 	gfn_t end_gfn, gfn;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 	u64 phys;
 
 	domain  = kvm->arch.iommu_domain;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index ff606f507913..6ab963ae0427 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -259,7 +259,7 @@ static unsigned get_mmio_spte_access(u64 spte)
 }
 
 static bool set_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
-			  pfn_t pfn, unsigned access)
+			  kvm_pfn_t pfn, unsigned access)
 {
 	if (unlikely(is_noslot_pfn(pfn))) {
 		mark_mmio_spte(vcpu, sptep, gfn, access);
@@ -325,7 +325,7 @@ static int is_last_spte(u64 pte, int level)
 	return 0;
 }
 
-static pfn_t spte_to_pfn(u64 pte)
+static kvm_pfn_t spte_to_pfn(u64 pte)
 {
 	return (pte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
 }
@@ -587,7 +587,7 @@ static bool mmu_spte_update(u64 *sptep, u64 new_spte)
  */
 static int mmu_spte_clear_track_bits(u64 *sptep)
 {
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 	u64 old_spte = *sptep;
 
 	if (!spte_has_volatile_bits(old_spte))
@@ -1369,7 +1369,7 @@ static int kvm_set_pte_rmapp(struct kvm *kvm, unsigned long *rmapp,
 	int need_flush = 0;
 	u64 new_spte;
 	pte_t *ptep = (pte_t *)data;
-	pfn_t new_pfn;
+	kvm_pfn_t new_pfn;
 
 	WARN_ON(pte_huge(*ptep));
 	new_pfn = pte_pfn(*ptep);
@@ -2456,7 +2456,7 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
 	return 0;
 }
 
-static bool kvm_is_mmio_pfn(pfn_t pfn)
+static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
 {
 	if (pfn_valid(pfn))
 		return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn));
@@ -2466,7 +2466,7 @@ static bool kvm_is_mmio_pfn(pfn_t pfn)
 
 static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		    unsigned pte_access, int level,
-		    gfn_t gfn, pfn_t pfn, bool speculative,
+		    gfn_t gfn, kvm_pfn_t pfn, bool speculative,
 		    bool can_unsync, bool host_writable)
 {
 	u64 spte;
@@ -2546,7 +2546,7 @@ done:
 
 static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 			 unsigned pte_access, int write_fault, int *emulate,
-			 int level, gfn_t gfn, pfn_t pfn, bool speculative,
+			 int level, gfn_t gfn, kvm_pfn_t pfn, bool speculative,
 			 bool host_writable)
 {
 	int was_rmapped = 0;
@@ -2606,7 +2606,7 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	kvm_release_pfn_clean(pfn);
 }
 
-static pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
+static kvm_pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
 				     bool no_dirty_log)
 {
 	struct kvm_memory_slot *slot;
@@ -2689,7 +2689,7 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
 }
 
 static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
-			int map_writable, int level, gfn_t gfn, pfn_t pfn,
+			int map_writable, int level, gfn_t gfn, kvm_pfn_t pfn,
 			bool prefault)
 {
 	struct kvm_shadow_walk_iterator iterator;
@@ -2739,7 +2739,7 @@ static void kvm_send_hwpoison_signal(unsigned long address, struct task_struct *
 	send_sig_info(SIGBUS, &info, tsk);
 }
 
-static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, pfn_t pfn)
+static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn)
 {
 	/*
 	 * Do not cache the mmio info caused by writing the readonly gfn
@@ -2759,9 +2759,10 @@ static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, pfn_t pfn)
 }
 
 static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
-					gfn_t *gfnp, pfn_t *pfnp, int *levelp)
+					gfn_t *gfnp, kvm_pfn_t *pfnp,
+					int *levelp)
 {
-	pfn_t pfn = *pfnp;
+	kvm_pfn_t pfn = *pfnp;
 	gfn_t gfn = *gfnp;
 	int level = *levelp;
 
@@ -2800,7 +2801,7 @@ static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
 }
 
 static bool handle_abnormal_pfn(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
-				pfn_t pfn, unsigned access, int *ret_val)
+				kvm_pfn_t pfn, unsigned access, int *ret_val)
 {
 	bool ret = true;
 
@@ -2954,7 +2955,7 @@ exit:
 }
 
 static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
-			 gva_t gva, pfn_t *pfn, bool write, bool *writable);
+			 gva_t gva, kvm_pfn_t *pfn, bool write, bool *writable);
 static void make_mmu_pages_available(struct kvm_vcpu *vcpu);
 
 static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
@@ -2963,7 +2964,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
 	int r;
 	int level;
 	int force_pt_level;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 	unsigned long mmu_seq;
 	bool map_writable, write = error_code & PFERR_WRITE_MASK;
 
@@ -3435,7 +3436,7 @@ static bool can_do_async_pf(struct kvm_vcpu *vcpu)
 }
 
 static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
-			 gva_t gva, pfn_t *pfn, bool write, bool *writable)
+			 gva_t gva, kvm_pfn_t *pfn, bool write, bool *writable)
 {
 	struct kvm_memory_slot *slot;
 	bool async;
@@ -3473,7 +3474,7 @@ check_hugepage_cache_consistency(struct kvm_vcpu *vcpu, gfn_t gfn, int level)
 static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
 			  bool prefault)
 {
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 	int r;
 	int level;
 	int force_pt_level;
@@ -4627,7 +4628,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 	u64 *sptep;
 	struct rmap_iterator iter;
 	int need_tlb_flush = 0;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 	struct kvm_mmu_page *sp;
 
 restart:
diff --git a/arch/x86/kvm/mmu_audit.c b/arch/x86/kvm/mmu_audit.c
index 03d518e499a6..37a4d14115c0 100644
--- a/arch/x86/kvm/mmu_audit.c
+++ b/arch/x86/kvm/mmu_audit.c
@@ -97,7 +97,7 @@ static void audit_mappings(struct kvm_vcpu *vcpu, u64 *sptep, int level)
 {
 	struct kvm_mmu_page *sp;
 	gfn_t gfn;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 	hpa_t hpa;
 
 	sp = page_header(__pa(sptep));
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 736e6ab8784d..9dd02cb74724 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -456,7 +456,7 @@ FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 {
 	unsigned pte_access;
 	gfn_t gfn;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 
 	if (FNAME(prefetch_invalid_gpte)(vcpu, sp, spte, gpte))
 		return false;
@@ -551,7 +551,7 @@ static void FNAME(pte_prefetch)(struct kvm_vcpu *vcpu, struct guest_walker *gw,
 static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
 			 struct guest_walker *gw,
 			 int write_fault, int hlevel,
-			 pfn_t pfn, bool map_writable, bool prefault)
+			 kvm_pfn_t pfn, bool map_writable, bool prefault)
 {
 	struct kvm_mmu_page *sp = NULL;
 	struct kvm_shadow_walk_iterator it;
@@ -696,7 +696,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr, u32 error_code,
 	int user_fault = error_code & PFERR_USER_MASK;
 	struct guest_walker walker;
 	int r;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 	int level = PT_PAGE_TABLE_LEVEL;
 	int force_pt_level;
 	unsigned long mmu_seq;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 06ef4908ba61..d401ed6874bd 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4046,7 +4046,7 @@ out:
 static int init_rmode_identity_map(struct kvm *kvm)
 {
 	int i, idx, r = 0;
-	pfn_t identity_map_pfn;
+	kvm_pfn_t identity_map_pfn;
 	u32 tmp;
 
 	if (!enable_ept)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 92511d4b7236..8fc5ca584edf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4935,7 +4935,7 @@ static bool reexecute_instruction(struct kvm_vcpu *vcpu, gva_t cr2,
 				  int emulation_type)
 {
 	gpa_t gpa = cr2;
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 
 	if (emulation_type & EMULTYPE_NO_REEXECUTE)
 		return false;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 1bef9e21e725..2420b43f3acc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -65,7 +65,7 @@
  * error pfns indicate that the gfn is in slot but failed to
  * translate it to pfn on host.
  */
-static inline bool is_error_pfn(pfn_t pfn)
+static inline bool is_error_pfn(kvm_pfn_t pfn)
 {
 	return !!(pfn & KVM_PFN_ERR_MASK);
 }
@@ -75,13 +75,13 @@ static inline bool is_error_pfn(pfn_t pfn)
  * translated to pfn - it is not in slot or failed to
  * translate it to pfn.
  */
-static inline bool is_error_noslot_pfn(pfn_t pfn)
+static inline bool is_error_noslot_pfn(kvm_pfn_t pfn)
 {
 	return !!(pfn & KVM_PFN_ERR_NOSLOT_MASK);
 }
 
 /* noslot pfn indicates that the gfn is not in slot. */
-static inline bool is_noslot_pfn(pfn_t pfn)
+static inline bool is_noslot_pfn(kvm_pfn_t pfn)
 {
 	return pfn == KVM_PFN_NOSLOT;
 }
@@ -569,19 +569,20 @@ void kvm_release_page_clean(struct page *page);
 void kvm_release_page_dirty(struct page *page);
 void kvm_set_page_accessed(struct page *page);
 
-pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn);
-pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn);
-pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
+kvm_pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn);
+kvm_pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn);
+kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 		      bool *writable);
-pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
-pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn);
-pfn_t __gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn, bool atomic,
-			   bool *async, bool write_fault, bool *writable);
+kvm_pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
+kvm_pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn);
+kvm_pfn_t __gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn,
+			       bool atomic, bool *async, bool write_fault,
+			       bool *writable);
 
-void kvm_release_pfn_clean(pfn_t pfn);
-void kvm_set_pfn_dirty(pfn_t pfn);
-void kvm_set_pfn_accessed(pfn_t pfn);
-void kvm_get_pfn(pfn_t pfn);
+void kvm_release_pfn_clean(kvm_pfn_t pfn);
+void kvm_set_pfn_dirty(kvm_pfn_t pfn);
+void kvm_set_pfn_accessed(kvm_pfn_t pfn);
+void kvm_get_pfn(kvm_pfn_t pfn);
 
 int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 			int len);
@@ -607,8 +608,8 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
 
 struct kvm_memslots *kvm_vcpu_memslots(struct kvm_vcpu *vcpu);
 struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn);
-pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn);
-pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn);
+kvm_pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn);
+kvm_pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn);
 struct page *kvm_vcpu_gfn_to_page(struct kvm_vcpu *vcpu, gfn_t gfn);
 unsigned long kvm_vcpu_gfn_to_hva(struct kvm_vcpu *vcpu, gfn_t gfn);
 unsigned long kvm_vcpu_gfn_to_hva_prot(struct kvm_vcpu *vcpu, gfn_t gfn, bool *writable);
@@ -789,7 +790,7 @@ void kvm_arch_sync_events(struct kvm *kvm);
 int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu);
 void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
 
-bool kvm_is_reserved_pfn(pfn_t pfn);
+bool kvm_is_reserved_pfn(kvm_pfn_t pfn);
 
 struct kvm_irq_ack_notifier {
 	struct hlist_node link;
@@ -940,7 +941,7 @@ static inline gfn_t gpa_to_gfn(gpa_t gpa)
 	return (gfn_t)(gpa >> PAGE_SHIFT);
 }
 
-static inline hpa_t pfn_to_hpa(pfn_t pfn)
+static inline hpa_t pfn_to_hpa(kvm_pfn_t pfn)
 {
 	return (hpa_t)pfn << PAGE_SHIFT;
 }
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index 1b47a185c2f0..8bf259dae9f6 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -53,7 +53,7 @@ typedef unsigned long  hva_t;
 typedef u64            hpa_t;
 typedef u64            hfn_t;
 
-typedef hfn_t pfn_t;
+typedef hfn_t kvm_pfn_t;
 
 struct gfn_to_hva_cache {
 	u64 generation;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8db1d9361993..02cd2eddd3ff 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -111,7 +111,7 @@ static void hardware_disable_all(void);
 
 static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
 
-static void kvm_release_pfn_dirty(pfn_t pfn);
+static void kvm_release_pfn_dirty(kvm_pfn_t pfn);
 static void mark_page_dirty_in_slot(struct kvm_memory_slot *memslot, gfn_t gfn);
 
 __visible bool kvm_rebooting;
@@ -119,7 +119,7 @@ EXPORT_SYMBOL_GPL(kvm_rebooting);
 
 static bool largepages_enabled = true;
 
-bool kvm_is_reserved_pfn(pfn_t pfn)
+bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
 {
 	if (pfn_valid(pfn))
 		return PageReserved(pfn_to_page(pfn));
@@ -1296,7 +1296,7 @@ static inline int check_user_page_hwpoison(unsigned long addr)
  * true indicates success, otherwise false is returned.
  */
 static bool hva_to_pfn_fast(unsigned long addr, bool atomic, bool *async,
-			    bool write_fault, bool *writable, pfn_t *pfn)
+			    bool write_fault, bool *writable, kvm_pfn_t *pfn)
 {
 	struct page *page[1];
 	int npages;
@@ -1329,7 +1329,7 @@ static bool hva_to_pfn_fast(unsigned long addr, bool atomic, bool *async,
  * 1 indicates success, -errno is returned if error is detected.
  */
 static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
-			   bool *writable, pfn_t *pfn)
+			   bool *writable, kvm_pfn_t *pfn)
 {
 	struct page *page[1];
 	int npages = 0;
@@ -1393,11 +1393,11 @@ static bool vma_is_valid(struct vm_area_struct *vma, bool write_fault)
  * 2): @write_fault = false && @writable, @writable will tell the caller
  *     whether the mapping is writable.
  */
-static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
+static kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 			bool write_fault, bool *writable)
 {
 	struct vm_area_struct *vma;
-	pfn_t pfn = 0;
+	kvm_pfn_t pfn = 0;
 	int npages;
 
 	/* we can do it either atomically or asynchronously, not both */
@@ -1438,8 +1438,9 @@ exit:
 	return pfn;
 }
 
-pfn_t __gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn, bool atomic,
-			   bool *async, bool write_fault, bool *writable)
+kvm_pfn_t __gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn,
+			       bool atomic, bool *async, bool write_fault,
+			       bool *writable)
 {
 	unsigned long addr = __gfn_to_hva_many(slot, gfn, NULL, write_fault);
 
@@ -1460,7 +1461,7 @@ pfn_t __gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn, bool atomic,
 }
 EXPORT_SYMBOL_GPL(__gfn_to_pfn_memslot);
 
-pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
+kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 		      bool *writable)
 {
 	return __gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn, false, NULL,
@@ -1468,37 +1469,37 @@ pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_prot);
 
-pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
+kvm_pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
 {
 	return __gfn_to_pfn_memslot(slot, gfn, false, NULL, true, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot);
 
-pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn)
+kvm_pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn)
 {
 	return __gfn_to_pfn_memslot(slot, gfn, true, NULL, true, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic);
 
-pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn)
+kvm_pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn)
 {
 	return gfn_to_pfn_memslot_atomic(gfn_to_memslot(kvm, gfn), gfn);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_atomic);
 
-pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn)
+kvm_pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn)
 {
 	return gfn_to_pfn_memslot_atomic(kvm_vcpu_gfn_to_memslot(vcpu, gfn), gfn);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_pfn_atomic);
 
-pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn)
+kvm_pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn)
 {
 	return gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn);
 
-pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn)
+kvm_pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn)
 {
 	return gfn_to_pfn_memslot(kvm_vcpu_gfn_to_memslot(vcpu, gfn), gfn);
 }
@@ -1521,7 +1522,7 @@ int gfn_to_page_many_atomic(struct kvm_memory_slot *slot, gfn_t gfn,
 }
 EXPORT_SYMBOL_GPL(gfn_to_page_many_atomic);
 
-static struct page *kvm_pfn_to_page(pfn_t pfn)
+static struct page *kvm_pfn_to_page(kvm_pfn_t pfn)
 {
 	if (is_error_noslot_pfn(pfn))
 		return KVM_ERR_PTR_BAD_PAGE;
@@ -1536,7 +1537,7 @@ static struct page *kvm_pfn_to_page(pfn_t pfn)
 
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 {
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 
 	pfn = gfn_to_pfn(kvm, gfn);
 
@@ -1546,7 +1547,7 @@ EXPORT_SYMBOL_GPL(gfn_to_page);
 
 struct page *kvm_vcpu_gfn_to_page(struct kvm_vcpu *vcpu, gfn_t gfn)
 {
-	pfn_t pfn;
+	kvm_pfn_t pfn;
 
 	pfn = kvm_vcpu_gfn_to_pfn(vcpu, gfn);
 
@@ -1562,7 +1563,7 @@ void kvm_release_page_clean(struct page *page)
 }
 EXPORT_SYMBOL_GPL(kvm_release_page_clean);
 
-void kvm_release_pfn_clean(pfn_t pfn)
+void kvm_release_pfn_clean(kvm_pfn_t pfn)
 {
 	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn))
 		put_page(pfn_to_page(pfn));
@@ -1577,13 +1578,13 @@ void kvm_release_page_dirty(struct page *page)
 }
 EXPORT_SYMBOL_GPL(kvm_release_page_dirty);
 
-static void kvm_release_pfn_dirty(pfn_t pfn)
+static void kvm_release_pfn_dirty(kvm_pfn_t pfn)
 {
 	kvm_set_pfn_dirty(pfn);
 	kvm_release_pfn_clean(pfn);
 }
 
-void kvm_set_pfn_dirty(pfn_t pfn)
+void kvm_set_pfn_dirty(kvm_pfn_t pfn)
 {
 	if (!kvm_is_reserved_pfn(pfn)) {
 		struct page *page = pfn_to_page(pfn);
@@ -1594,14 +1595,14 @@ void kvm_set_pfn_dirty(pfn_t pfn)
 }
 EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
 
-void kvm_set_pfn_accessed(pfn_t pfn)
+void kvm_set_pfn_accessed(kvm_pfn_t pfn)
 {
 	if (!kvm_is_reserved_pfn(pfn))
 		mark_page_accessed(pfn_to_page(pfn));
 }
 EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
 
-void kvm_get_pfn(pfn_t pfn)
+void kvm_get_pfn(kvm_pfn_t pfn)
 {
 	if (!kvm_is_reserved_pfn(pfn))
 		get_page(pfn_to_page(pfn));


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 12/20] mips: fix PAGE_MASK definition
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (10 preceding siblings ...)
  2015-10-10  0:56 ` [PATCH v2 11/20] kvm: rename pfn_t to kvm_pfn_t Dan Williams
@ 2015-10-10  0:56 ` Dan Williams
  2015-10-10  0:56 ` [PATCH v2 13/20] mm, dax, pmem: introduce pfn_t Dan Williams
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:56 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: linux-mips, linux-kernel, Ralf Baechle, linux-mm, ross.zwisler, hch

Make PAGE_MASK an unsigned long, like it is on x86, to avoid:

In file included from arch/mips/kernel/asm-offsets.c:14:0:
include/linux/mm.h: In function '__pfn_to_pfn_t':
include/linux/mm.h:1050:2: warning: left shift count >= width of type
  pfn_t pfn_t = { .val = pfn | (flags & PFN_FLAGS_MASK), };

...where PFN_FLAGS_MASK is:

#define PFN_FLAGS_MASK (~PAGE_MASK << (BITS_PER_LONG - PAGE_SHIFT))
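
For illustration, a minimal user-space sketch (not kernel code; PAGE_SHIFT
and BITS_PER_LONG hard-coded to the 64-bit MIPS values) of why the old
definition warns: (1 << PAGE_SHIFT) is a signed int, so the old PAGE_MASK
and ~PAGE_MASK are only 32 bits wide, and shifting by
(BITS_PER_LONG - PAGE_SHIFT) = 48 exceeds the width of int.  Deriving
PAGE_MASK from PAGE_SIZE keeps the whole expression unsigned long:

#include <stdio.h>

#define PAGE_SHIFT	16
#define BITS_PER_LONG	64
#define OLD_PAGE_MASK	(~((1 << PAGE_SHIFT) - 1))	/* int: 32 bits */
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define NEW_PAGE_MASK	(~(PAGE_SIZE - 1))		/* unsigned long */

int main(void)
{
	/* 4 vs 8 on LP64: only the new mask is wide enough to shift by 48 */
	printf("sizeof(old)=%zu sizeof(new)=%zu\n",
	       sizeof(OLD_PAGE_MASK), sizeof(NEW_PAGE_MASK));
	/* ~OLD_PAGE_MASK << 48 is what gcc flags: shift count >= width */
	printf("PFN_FLAGS_MASK=%#lx\n",
	       ~NEW_PAGE_MASK << (BITS_PER_LONG - PAGE_SHIFT));
	return 0;
}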

Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/mips/include/asm/page.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/page.h b/arch/mips/include/asm/page.h
index 89dd7fed1a57..ad1fccdb8d13 100644
--- a/arch/mips/include/asm/page.h
+++ b/arch/mips/include/asm/page.h
@@ -33,7 +33,7 @@
 #define PAGE_SHIFT	16
 #endif
 #define PAGE_SIZE	(_AC(1,UL) << PAGE_SHIFT)
-#define PAGE_MASK	(~((1 << PAGE_SHIFT) - 1))
+#define PAGE_MASK	(~(PAGE_SIZE - 1))
 
 /*
  * This is used for calculating the real page sizes


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 13/20] mm, dax, pmem: introduce pfn_t
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (11 preceding siblings ...)
  2015-10-10  0:56 ` [PATCH v2 12/20] mips: fix PAGE_MASK definition Dan Williams
@ 2015-10-10  0:56 ` Dan Williams
  2015-10-10  0:56 ` [PATCH v2 14/20] mm, dax, gpu: convert vm_insert_mixed to pfn_t, introduce _PAGE_DEVMAP Dan Williams
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:56 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Dave Hansen, linux-kernel, linux-mm, ross.zwisler, Andrew Morton, hch

In preparation for enabling get_user_pages() operations on dax mappings,
introduce a type that encapsulates a page-frame-number and can also
encode other information.  That other information is the historical
"page_link" encoding in a scatterlist, but it can also denote "device
memory": a set of pfns that are not part of the kernel's linear mapping
by default, but are accessed via the same memory controller as ram.  The
motivation for this new type is large-capacity persistent memory that
optionally has struct page entries in the 'memmap'.

When a driver, like pmem, has established a devm_memremap_pages()
mapping, it needs to communicate to upper layers that the pfn has a page
backing.  This property will be leveraged in a later patch to enable
dax-gup.  For now, update all the ->direct_access() implementations to
communicate whether the returned pfn range is mapped.
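
As an aside, a minimal user-space sketch of the encoding (the helpers
mirror the hunks below; PAGE_SHIFT = 12 and 64-bit longs are assumed
here, and this is illustrative rather than kernel code):

#include <stdio.h>
#include <stdbool.h>

#define BITS_PER_LONG	64
#define PAGE_SHIFT	12
#define PAGE_MASK	(~((1UL << PAGE_SHIFT) - 1))
#define PFN_FLAGS_MASK	(~PAGE_MASK << (BITS_PER_LONG - PAGE_SHIFT))
#define PFN_DEV		(1UL << (BITS_PER_LONG - 3))
#define PFN_MAP		(1UL << (BITS_PER_LONG - 4))

typedef struct { unsigned long val; } pfn_t;

static pfn_t __pfn_to_pfn_t(unsigned long pfn, unsigned long flags)
{
	pfn_t pfn_t = { .val = pfn | (flags & PFN_FLAGS_MASK), };

	return pfn_t;
}

static bool pfn_t_has_page(pfn_t pfn)
{
	/* page backed if dynamically mapped (PFN_MAP) or plain ram (!PFN_DEV) */
	return (pfn.val & PFN_MAP) == PFN_MAP || (pfn.val & PFN_DEV) == 0;
}

int main(void)
{
	pfn_t ram = __pfn_to_pfn_t(0x1234, 0);
	pfn_t dev = __pfn_to_pfn_t(0x1234, PFN_DEV);
	pfn_t map = __pfn_to_pfn_t(0x1234, PFN_DEV | PFN_MAP);

	/* prints: ram=1 dev=0 dev+map=1 */
	printf("ram=%d dev=%d dev+map=%d\n", pfn_t_has_page(ram),
	       pfn_t_has_page(dev), pfn_t_has_page(map));
	return 0;
}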

Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave@sr71.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/powerpc/sysdev/axonram.c |    8 ++---
 drivers/block/brd.c           |    4 +--
 drivers/nvdimm/pmem.c         |   25 ++++++----------
 drivers/s390/block/dcssblk.c  |   10 +++---
 fs/block_dev.c                |    2 +
 fs/dax.c                      |   19 ++++++------
 include/linux/blkdev.h        |    4 +--
 include/linux/mm.h            |   65 +++++++++++++++++++++++++++++++++++++++++
 include/linux/pfn.h           |    9 ++++++
 9 files changed, 105 insertions(+), 41 deletions(-)

diff --git a/arch/powerpc/sysdev/axonram.c b/arch/powerpc/sysdev/axonram.c
index d2b79bc336c1..59ca4c0ab529 100644
--- a/arch/powerpc/sysdev/axonram.c
+++ b/arch/powerpc/sysdev/axonram.c
@@ -141,15 +141,13 @@ axon_ram_make_request(struct request_queue *queue, struct bio *bio)
  */
 static long
 axon_ram_direct_access(struct block_device *device, sector_t sector,
-		       void __pmem **kaddr, unsigned long *pfn)
+		       void __pmem **kaddr, pfn_t *pfn)
 {
 	struct axon_ram_bank *bank = device->bd_disk->private_data;
 	loff_t offset = (loff_t)sector << AXON_RAM_SECTOR_SHIFT;
-	void *addr = (void *)(bank->ph_addr + offset);
-
-	*kaddr = (void __pmem *)addr;
-	*pfn = virt_to_phys(addr) >> PAGE_SHIFT;
 
+	*kaddr = (void __pmem __force *) bank->io_addr + offset;
+	*pfn = phys_to_pfn_t(bank->ph_addr + offset, PFN_DEV);
 	return bank->size - offset;
 }
 
diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index b9794aeeb878..0bbc60463779 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -374,7 +374,7 @@ static int brd_rw_page(struct block_device *bdev, sector_t sector,
 
 #ifdef CONFIG_BLK_DEV_RAM_DAX
 static long brd_direct_access(struct block_device *bdev, sector_t sector,
-			void __pmem **kaddr, unsigned long *pfn)
+			void __pmem **kaddr, pfn_t *pfn)
 {
 	struct brd_device *brd = bdev->bd_disk->private_data;
 	struct page *page;
@@ -385,7 +385,7 @@ static long brd_direct_access(struct block_device *bdev, sector_t sector,
 	if (!page)
 		return -ENOSPC;
 	*kaddr = (void __pmem *)page_address(page);
-	*pfn = page_to_pfn(page);
+	*pfn = page_to_pfn_t(page);
 
 	return PAGE_SIZE;
 }
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index bb66158c0505..c950602bbf0b 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -39,6 +39,7 @@ struct pmem_device {
 	phys_addr_t		phys_addr;
 	/* when non-zero this device is hosting a 'pfn' instance */
 	phys_addr_t		data_offset;
+	unsigned long		pfn_flags;
 	void __pmem		*virt_addr;
 	size_t			size;
 };
@@ -100,26 +101,15 @@ static int pmem_rw_page(struct block_device *bdev, sector_t sector,
 }
 
 static long pmem_direct_access(struct block_device *bdev, sector_t sector,
-		      void __pmem **kaddr, unsigned long *pfn)
+		      void __pmem **kaddr, pfn_t *pfn)
 {
 	struct pmem_device *pmem = bdev->bd_disk->private_data;
 	resource_size_t offset = sector * 512 + pmem->data_offset;
-	resource_size_t size;
-
-	if (pmem->data_offset) {
-		/*
-		 * Limit the direct_access() size to what is covered by
-		 * the memmap
-		 */
-		size = (pmem->size - offset) & ~ND_PFN_MASK;
-	} else
-		size = pmem->size - offset;
 
-	/* FIXME convert DAX to comprehend that this mapping has a lifetime */
 	*kaddr = pmem->virt_addr + offset;
-	*pfn = (pmem->phys_addr + offset) >> PAGE_SHIFT;
+	*pfn = phys_to_pfn_t(pmem->phys_addr + offset, pmem->pfn_flags);
 
-	return size;
+	return pmem->size - offset;
 }
 
 static const struct block_device_operations pmem_fops = {
@@ -150,10 +140,12 @@ static struct pmem_device *pmem_alloc(struct device *dev,
 		return ERR_PTR(-EBUSY);
 	}
 
-	if (pmem_should_map_pages(dev))
+	pmem->pfn_flags = PFN_DEV;
+	if (pmem_should_map_pages(dev)) {
 		pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, res,
 				NULL);
-	else
+		pmem->pfn_flags |= PFN_MAP;
+	} else
 		pmem->virt_addr = (void __pmem *) devm_memremap(dev,
 				pmem->phys_addr, pmem->size,
 				ARCH_MEMREMAP_PMEM);
@@ -380,6 +372,7 @@ static int nvdimm_namespace_attach_pfn(struct nd_namespace_common *ndns)
 	devm_memunmap(dev, (void __force *) pmem->virt_addr);
 	pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, &nsio->res,
 			altmap);
+	pmem->pfn_flags |= PFN_MAP;
 	if (IS_ERR(pmem->virt_addr)) {
 		rc = PTR_ERR(pmem->virt_addr);
 		goto err;
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index 5ed44fe21380..e2b2839e4de5 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -29,7 +29,7 @@ static int dcssblk_open(struct block_device *bdev, fmode_t mode);
 static void dcssblk_release(struct gendisk *disk, fmode_t mode);
 static void dcssblk_make_request(struct request_queue *q, struct bio *bio);
 static long dcssblk_direct_access(struct block_device *bdev, sector_t secnum,
-			 void __pmem **kaddr, unsigned long *pfn);
+			 void __pmem **kaddr, pfn_t *pfn);
 
 static char dcssblk_segments[DCSSBLK_PARM_LEN] = "\0";
 
@@ -881,20 +881,18 @@ fail:
 
 static long
 dcssblk_direct_access (struct block_device *bdev, sector_t secnum,
-			void __pmem **kaddr, unsigned long *pfn)
+			void __pmem **kaddr, pfn_t *pfn)
 {
 	struct dcssblk_dev_info *dev_info;
 	unsigned long offset, dev_sz;
-	void *addr;
 
 	dev_info = bdev->bd_disk->private_data;
 	if (!dev_info)
 		return -ENODEV;
 	dev_sz = dev_info->end - dev_info->start;
 	offset = secnum * 512;
-	addr = (void *) (dev_info->start + offset);
-	*pfn = virt_to_phys(addr) >> PAGE_SHIFT;
-	*kaddr = (void __pmem *) addr;
+	*kaddr = (void __pmem *) (dev_info->start + offset);
+	*pfn = phys_to_pfn_t(dev_info->start + offset, PFN_DEV);
 
 	return dev_sz - offset;
 }
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 073bb57adab1..c37a193695d4 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -442,7 +442,7 @@ EXPORT_SYMBOL_GPL(bdev_write_page);
  * accessible at this address.
  */
 long bdev_direct_access(struct block_device *bdev, sector_t sector,
-			void __pmem **addr, unsigned long *pfn, long size)
+			void __pmem **addr, pfn_t *pfn, long size)
 {
 	long avail;
 	const struct block_device_operations *ops = bdev->bd_disk->fops;
diff --git a/fs/dax.c b/fs/dax.c
index 9549cd523649..7496a776e1a6 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -31,7 +31,7 @@
 #include <linux/sizes.h>
 
 static void __pmem *__dax_map_atomic(struct block_device *bdev, sector_t sector,
-		long size, unsigned long *pfn, long *len)
+		long size, pfn_t *pfn, long *len)
 {
 	long rc;
 	void __pmem *addr;
@@ -52,7 +52,7 @@ static void __pmem *__dax_map_atomic(struct block_device *bdev, sector_t sector,
 static void __pmem *dax_map_atomic(struct block_device *bdev, sector_t sector,
 		long size)
 {
-	unsigned long pfn;
+	pfn_t pfn;
 
 	return __dax_map_atomic(bdev, sector, size, &pfn, NULL);
 }
@@ -77,8 +77,8 @@ int dax_clear_blocks(struct inode *inode, sector_t block, long size)
 	might_sleep();
 	do {
 		void __pmem *addr;
-		unsigned long pfn;
 		long count, sz;
+		pfn_t pfn;
 
 		sz = min_t(long, size, SZ_1M);
 		addr = __dax_map_atomic(bdev, sector, size, &pfn, &count);
@@ -146,7 +146,7 @@ static ssize_t dax_io(struct inode *inode, struct iov_iter *iter,
 	struct block_device *bdev = NULL;
 	int rw = iov_iter_rw(iter), rc;
 	long map_len = 0;
-	unsigned long pfn;
+	pfn_t pfn;
 	void __pmem *addr = NULL;
 	void __pmem *kmap = (void __pmem *) ERR_PTR(-EIO);
 	bool hole = false;
@@ -338,9 +338,9 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh,
 	unsigned long vaddr = (unsigned long)vmf->virtual_address;
 	struct address_space *mapping = inode->i_mapping;
 	struct block_device *bdev = bh->b_bdev;
-	unsigned long pfn;
 	void __pmem *addr;
 	pgoff_t size;
+	pfn_t pfn;
 	int error;
 
 	i_mmap_lock_read(mapping);
@@ -371,7 +371,7 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh,
 	}
 	dax_unmap_atomic(bdev, addr);
 
-	error = vm_insert_mixed(vma, vaddr, pfn);
+	error = vm_insert_mixed(vma, vaddr, pfn_t_to_pfn(pfn));
 
  out:
 	i_mmap_unlock_read(mapping);
@@ -660,8 +660,8 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 		result = VM_FAULT_NOPAGE;
 		spin_unlock(ptl);
 	} else {
+		pfn_t pfn;
 		long length;
-		unsigned long pfn;
 		void __pmem *kaddr = __dax_map_atomic(bdev,
 				to_sector(&bh, inode), HPAGE_SIZE, &pfn,
 				&length);
@@ -670,7 +670,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 			result = VM_FAULT_SIGBUS;
 			goto out;
 		}
-		if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR)) {
+		if ((length < PMD_SIZE) || (pfn_t_to_pfn(pfn) & PG_PMD_COLOUR)) {
 			dax_unmap_atomic(bdev, kaddr);
 			goto fallback;
 		}
@@ -684,7 +684,8 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 		}
 		dax_unmap_atomic(bdev, kaddr);
 
-		result |= vmf_insert_pfn_pmd(vma, address, pmd, pfn, write);
+		result |= vmf_insert_pfn_pmd(vma, address, pmd,
+				pfn_t_to_pfn(pfn), write);
 	}
 
  out:
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index cd091cb2b96e..fb3e6886c479 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1624,7 +1624,7 @@ struct block_device_operations {
 	int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
 	int (*compat_ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
 	long (*direct_access)(struct block_device *, sector_t, void __pmem **,
-			unsigned long *pfn);
+			pfn_t *);
 	unsigned int (*check_events) (struct gendisk *disk,
 				      unsigned int clearing);
 	/* ->media_changed() is DEPRECATED, use ->check_events() instead */
@@ -1643,7 +1643,7 @@ extern int bdev_read_page(struct block_device *, sector_t, struct page *);
 extern int bdev_write_page(struct block_device *, sector_t, struct page *,
 						struct writeback_control *);
 extern long bdev_direct_access(struct block_device *, sector_t,
-		void __pmem **addr, unsigned long *pfn, long size);
+		void __pmem **addr, pfn_t *pfn, long size);
 #else /* CONFIG_BLOCK */
 
 struct block_device;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index b5628cfbf649..7045099f1654 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1033,6 +1033,71 @@ static inline void set_page_memcg(struct page *page, struct mem_cgroup *memcg)
 #endif
 
 /*
+ * PFN_FLAGS_MASK - mask of all the possible valid pfn_t flags
+ * PFN_SG_CHAIN - pfn is a pointer to the next scatterlist entry
+ * PFN_SG_LAST - pfn references a page and is the last scatterlist entry
+ * PFN_DEV - pfn is not covered by system memmap by default
+ * PFN_MAP - pfn has a dynamic page mapping established by a device driver
+ */
+#define PFN_FLAGS_MASK (~PAGE_MASK << (BITS_PER_LONG - PAGE_SHIFT))
+#define PFN_SG_CHAIN (1UL << (BITS_PER_LONG - 1))
+#define PFN_SG_LAST (1UL << (BITS_PER_LONG - 2))
+#define PFN_DEV (1UL << (BITS_PER_LONG - 3))
+#define PFN_MAP (1UL << (BITS_PER_LONG - 4))
+
+static inline pfn_t __pfn_to_pfn_t(unsigned long pfn, unsigned long flags)
+{
+	pfn_t pfn_t = { .val = pfn | (flags & PFN_FLAGS_MASK), };
+
+	return pfn_t;
+}
+
+/* a default pfn to pfn_t conversion assumes that @pfn is pfn_valid() */
+static inline pfn_t pfn_to_pfn_t(unsigned long pfn)
+{
+	return __pfn_to_pfn_t(pfn, 0);
+}
+
+static inline pfn_t phys_to_pfn_t(dma_addr_t addr, unsigned long flags)
+{
+	return __pfn_to_pfn_t(addr >> PAGE_SHIFT, flags);
+}
+
+static inline bool pfn_t_has_page(pfn_t pfn)
+{
+	return (pfn.val & PFN_MAP) == PFN_MAP || (pfn.val & PFN_DEV) == 0;
+}
+
+static inline unsigned long pfn_t_to_pfn(pfn_t pfn)
+{
+	return pfn.val & ~PFN_FLAGS_MASK;
+}
+
+static inline struct page *pfn_t_to_page(pfn_t pfn)
+{
+	if (pfn_t_has_page(pfn))
+		return pfn_to_page(pfn_t_to_pfn(pfn));
+	return NULL;
+}
+
+static inline dma_addr_t pfn_t_to_phys(pfn_t pfn)
+{
+	return PFN_PHYS(pfn_t_to_pfn(pfn));
+}
+
+static inline void *pfn_t_to_virt(pfn_t pfn)
+{
+	if (pfn_t_has_page(pfn))
+		return __va(pfn_t_to_phys(pfn));
+	return NULL;
+}
+
+static inline pfn_t page_to_pfn_t(struct page *page)
+{
+	return pfn_to_pfn_t(page_to_pfn(page));
+}
+
+/*
  * Some inline functions in vmstat.h depend on page_zone()
  */
 #include <linux/vmstat.h>
diff --git a/include/linux/pfn.h b/include/linux/pfn.h
index 7646637221f3..96df85985f16 100644
--- a/include/linux/pfn.h
+++ b/include/linux/pfn.h
@@ -3,6 +3,15 @@
 
 #ifndef __ASSEMBLY__
 #include <linux/types.h>
+
+/*
+ * pfn_t: encapsulates a page-frame number that is optionally backed
+ * by memmap (struct page).  Whether a pfn_t has a 'struct page'
+ * backing is indicated by flags in the high bits of the value.
+ */
+typedef struct {
+	unsigned long val;
+} pfn_t;
 #endif
 
 #define PFN_ALIGN(x)	(((unsigned long)(x) + (PAGE_SIZE - 1)) & PAGE_MASK)


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 14/20] mm, dax, gpu: convert vm_insert_mixed to pfn_t, introduce _PAGE_DEVMAP
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (12 preceding siblings ...)
  2015-10-10  0:56 ` [PATCH v2 13/20] mm, dax, pmem: introduce pfn_t Dan Williams
@ 2015-10-10  0:56 ` Dan Williams
  2015-10-10  0:56 ` [PATCH v2 15/20] mm, dax: convert vmf_insert_pfn_pmd() to pfn_t Dan Williams
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:56 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Dave Hansen, David Airlie, linux-kernel, linux-mm, ross.zwisler,
	Andrew Morton, hch

Convert the raw unsigned long 'pfn' argument of vm_insert_mixed() to
pfn_t so that the PFN_MAP and PFN_DEV flags can be evaluated.  When both
are set, the new _PAGE_DEVMAP flag is set in the resulting pte.  This
flag will later be used in the get_user_pages() path to keep the page
mapping, dynamically allocated by devm_memremap_pages(), pinned until
all the resulting pages are released.

There are no functional changes to the gpu drivers as a result of this
conversion.

This uncovered several architectures with no local definition of
pfn_pte(); in response, pfn_t_pte() is only defined when an architecture
opts in via "#define pfn_pte pfn_pte".

Cc: Dave Hansen <dave@sr71.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/alpha/include/asm/pgtable.h        |    1 +
 arch/parisc/include/asm/pgtable.h       |    1 +
 arch/powerpc/include/asm/pgtable.h      |    1 +
 arch/tile/include/asm/pgtable.h         |    1 +
 arch/um/include/asm/pgtable-3level.h    |    1 +
 arch/x86/include/asm/pgtable.h          |   18 ++++++++++++++++++
 arch/x86/include/asm/pgtable_types.h    |    7 ++++++-
 drivers/gpu/drm/exynos/exynos_drm_gem.c |    3 ++-
 drivers/gpu/drm/gma500/framebuffer.c    |    3 ++-
 drivers/gpu/drm/msm/msm_gem.c           |    3 ++-
 drivers/gpu/drm/omapdrm/omap_gem.c      |    6 ++++--
 drivers/gpu/drm/ttm/ttm_bo_vm.c         |    3 ++-
 fs/dax.c                                |    2 +-
 include/linux/mm.h                      |   29 ++++++++++++++++++++++++++++-
 mm/memory.c                             |   15 +++++++++------
 15 files changed, 79 insertions(+), 15 deletions(-)

diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h
index a9a119592372..a54050fe867e 100644
--- a/arch/alpha/include/asm/pgtable.h
+++ b/arch/alpha/include/asm/pgtable.h
@@ -216,6 +216,7 @@ extern unsigned long __zero_page(void);
 })
 #endif
 
+#define pfn_pte pfn_pte
 extern inline pte_t pfn_pte(unsigned long physpfn, pgprot_t pgprot)
 { pte_t pte; pte_val(pte) = (PHYS_TWIDDLE(physpfn) << 32) | pgprot_val(pgprot); return pte; }
 
diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index f93c4a4e6580..dde7dd7200bd 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -377,6 +377,7 @@ static inline pte_t pte_mkspecial(pte_t pte)	{ return pte; }
 
 #define mk_pte(page, pgprot)	pfn_pte(page_to_pfn(page), (pgprot))
 
+#define pfn_pte pfn_pte
 static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
 {
 	pte_t pte;
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 0717693c8428..8448ff1542e0 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -67,6 +67,7 @@ static inline int pte_present(pte_t pte)
  * Even if PTEs can be unsigned long long, a PFN is always an unsigned
  * long for now.
  */
+#define pfn_pte pfn_pte
 static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
 	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
 		     pgprot_val(pgprot)); }
diff --git a/arch/tile/include/asm/pgtable.h b/arch/tile/include/asm/pgtable.h
index 2b05ccbebed9..37c9aa3a3f0c 100644
--- a/arch/tile/include/asm/pgtable.h
+++ b/arch/tile/include/asm/pgtable.h
@@ -275,6 +275,7 @@ static inline unsigned long pte_pfn(pte_t pte)
 extern pgprot_t set_remote_cache_cpu(pgprot_t prot, int cpu);
 extern int get_remote_cache_cpu(pgprot_t prot);
 
+#define pfn_pte pfn_pte
 static inline pte_t pfn_pte(unsigned long pfn, pgprot_t prot)
 {
 	return hv_pte_set_pa(prot, PFN_PHYS(pfn));
diff --git a/arch/um/include/asm/pgtable-3level.h b/arch/um/include/asm/pgtable-3level.h
index bae8523a162f..b7b51db14c2f 100644
--- a/arch/um/include/asm/pgtable-3level.h
+++ b/arch/um/include/asm/pgtable-3level.h
@@ -98,6 +98,7 @@ static inline unsigned long pte_pfn(pte_t pte)
 	return phys_to_pfn(pte_val(pte));
 }
 
+#define pfn_pte pfn_pte
 static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
 {
 	pte_t pte;
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 867da5bbb4a3..02a54e5b7930 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -248,6 +248,11 @@ static inline pte_t pte_mkspecial(pte_t pte)
 	return pte_set_flags(pte, _PAGE_SPECIAL);
 }
 
+static inline pte_t pte_mkdevmap(pte_t pte)
+{
+	return pte_set_flags(pte, _PAGE_SPECIAL|_PAGE_DEVMAP);
+}
+
 static inline pmd_t pmd_set_flags(pmd_t pmd, pmdval_t set)
 {
 	pmdval_t v = native_pmd_val(pmd);
@@ -334,6 +339,7 @@ static inline pgprotval_t massage_pgprot(pgprot_t pgprot)
 	return protval;
 }
 
+#define pfn_pte pfn_pte
 static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
 {
 	return __pte(((phys_addr_t)page_nr << PAGE_SHIFT) |
@@ -446,6 +452,12 @@ static inline int pte_present(pte_t a)
 	return pte_flags(a) & (_PAGE_PRESENT | _PAGE_PROTNONE);
 }
 
+#define pte_devmap pte_devmap
+static inline int pte_devmap(pte_t a)
+{
+	return pte_flags(a) & _PAGE_DEVMAP;
+}
+
 #define pte_accessible pte_accessible
 static inline bool pte_accessible(struct mm_struct *mm, pte_t a)
 {
@@ -464,6 +476,12 @@ static inline int pte_hidden(pte_t pte)
 	return pte_flags(pte) & _PAGE_HIDDEN;
 }
 
+#define pmd_devmap pmd_devmap
+static inline int pmd_devmap(pmd_t pmd)
+{
+	return pmd_flags(pmd) & _PAGE_DEVMAP;
+}
+
 static inline int pmd_present(pmd_t pmd)
 {
 	/*
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 13f310bfc09a..42d34e795123 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -25,7 +25,9 @@
 #define _PAGE_BIT_SPLITTING	_PAGE_BIT_SOFTW2 /* only valid on a PSE pmd */
 #define _PAGE_BIT_HIDDEN	_PAGE_BIT_SOFTW3 /* hidden by kmemcheck */
 #define _PAGE_BIT_SOFT_DIRTY	_PAGE_BIT_SOFTW3 /* software dirty tracking */
-#define _PAGE_BIT_NX           63       /* No execute: only valid after cpuid check */
+#define _PAGE_BIT_SOFTW4	58	/* available for programmer */
+#define _PAGE_BIT_DEVMAP		_PAGE_BIT_SOFTW4
+#define _PAGE_BIT_NX		63	/* No execute: only valid after cpuid check */
 
 /* If _PAGE_BIT_PRESENT is clear, we use these: */
 /* - if the user mapped it with PROT_NONE; pte_present gives true */
@@ -85,8 +87,11 @@
 
 #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
 #define _PAGE_NX	(_AT(pteval_t, 1) << _PAGE_BIT_NX)
+#define _PAGE_DEVMAP	(_AT(pteval_t, 1) << _PAGE_BIT_DEVMAP)
+#define __HAVE_ARCH_PTE_DEVMAP
 #else
 #define _PAGE_NX	(_AT(pteval_t, 0))
+#define _PAGE_DEVMAP	(_AT(pteval_t, 0))
 #endif
 
 #define _PAGE_PROTNONE	(_AT(pteval_t, 1) << _PAGE_BIT_PROTNONE)
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index 407afedb6003..778764bebc00 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -479,7 +479,8 @@ int exynos_drm_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	}
 
 	pfn = page_to_pfn(exynos_gem_obj->pages[page_offset]);
-	ret = vm_insert_mixed(vma, (unsigned long)vmf->virtual_address, pfn);
+	ret = vm_insert_mixed(vma, (unsigned long)vmf->virtual_address,
+			pfn_to_pfn_t(pfn, PFN_DEV));
 
 out:
 	switch (ret) {
diff --git a/drivers/gpu/drm/gma500/framebuffer.c b/drivers/gpu/drm/gma500/framebuffer.c
index 2eaf1b31c7bd..073144f197c5 100644
--- a/drivers/gpu/drm/gma500/framebuffer.c
+++ b/drivers/gpu/drm/gma500/framebuffer.c
@@ -132,7 +132,8 @@ static int psbfb_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	for (i = 0; i < page_num; i++) {
 		pfn = (phys_addr >> PAGE_SHIFT);
 
-		ret = vm_insert_mixed(vma, address, pfn);
+		ret = vm_insert_mixed(vma, address,
+				__pfn_to_pfn_t(pfn, PFN_DEV));
 		if (unlikely((ret == -EBUSY) || (ret != 0 && i > 0)))
 			break;
 		else if (unlikely(ret != 0)) {
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index c76cc853b08a..0f4ed5bfda83 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -222,7 +222,8 @@ int msm_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	VERB("Inserting %p pfn %lx, pa %lx", vmf->virtual_address,
 			pfn, pfn << PAGE_SHIFT);
 
-	ret = vm_insert_mixed(vma, (unsigned long)vmf->virtual_address, pfn);
+	ret = vm_insert_mixed(vma, (unsigned long)vmf->virtual_address,
+			pfn_to_pfn_t(pfn, PFN_DEV));
 
 out_unlock:
 	mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c
index 7ed08fdc4c42..910cb276a7ea 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem.c
@@ -385,7 +385,8 @@ static int fault_1d(struct drm_gem_object *obj,
 	VERB("Inserting %p pfn %lx, pa %lx", vmf->virtual_address,
 			pfn, pfn << PAGE_SHIFT);
 
-	return vm_insert_mixed(vma, (unsigned long)vmf->virtual_address, pfn);
+	return vm_insert_mixed(vma, (unsigned long)vmf->virtual_address,
+			pfn_to_pfn_t(pfn, PFN_DEV));
 }
 
 /* Special handling for the case of faulting in 2d tiled buffers */
@@ -478,7 +479,8 @@ static int fault_2d(struct drm_gem_object *obj,
 			pfn, pfn << PAGE_SHIFT);
 
 	for (i = n; i > 0; i--) {
-		vm_insert_mixed(vma, (unsigned long)vaddr, pfn);
+		vm_insert_mixed(vma, (unsigned long)vaddr,
+				pfn_to_pfn_t(pfn, PFN_DEV));
 		pfn += usergart[fmt].stride_pfn;
 		vaddr += PAGE_SIZE * m;
 	}
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 8fb7213277cc..bab765a7c501 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -229,7 +229,8 @@ static int ttm_bo_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 		}
 
 		if (vma->vm_flags & VM_MIXEDMAP)
-			ret = vm_insert_mixed(&cvma, address, pfn);
+			ret = vm_insert_mixed(&cvma, address,
+					__pfn_to_pfn_t(pfn, PFN_DEV));
 		else
 			ret = vm_insert_pfn(&cvma, address, pfn);
 
diff --git a/fs/dax.c b/fs/dax.c
index 7496a776e1a6..1588edf297a2 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -371,7 +371,7 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh,
 	}
 	dax_unmap_atomic(bdev, addr);
 
-	error = vm_insert_mixed(vma, vaddr, pfn_t_to_pfn(pfn));
+	error = vm_insert_mixed(vma, vaddr, pfn);
 
  out:
 	i_mmap_unlock_read(mapping);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7045099f1654..1d405ca21c9f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1097,6 +1097,33 @@ static inline pfn_t page_to_pfn_t(struct page *page)
 	return pfn_to_pfn_t(page_to_pfn(page));
 }
 
+static inline int pfn_t_valid(pfn_t pfn)
+{
+	return pfn_valid(pfn_t_to_pfn(pfn));
+}
+
+#ifdef pfn_pte
+static inline pte_t pfn_t_pte(pfn_t pfn, pgprot_t pgprot)
+{
+	return pfn_pte(pfn_t_to_pfn(pfn), pgprot);
+}
+#endif
+
+#ifdef __HAVE_ARCH_PTE_DEVICE
+static inline bool pfn_t_has_dev_pagemap(pfn_t pfn)
+{
+	const unsigned long flags = PFN_DEV|PFN_MAP;
+
+	return (pfn.val & flags) == flags;
+}
+#else
+static inline bool pfn_t_has_dev_pagemap(pfn_t pfn)
+{
+	return false;
+}
+pte_t pte_mkdevmap(pte_t pte);
+#endif
+
 /*
  * Some inline functions in vmstat.h depend on page_zone()
  */
@@ -2280,7 +2307,7 @@ int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
 int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			unsigned long pfn);
 int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
-			unsigned long pfn);
+			pfn_t pfn);
 int vm_iomap_memory(struct vm_area_struct *vma, phys_addr_t start, unsigned long len);
 
 
diff --git a/mm/memory.c b/mm/memory.c
index deb679c31f2a..5fc858570f58 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1517,7 +1517,7 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
 EXPORT_SYMBOL(vm_insert_page);
 
 static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
-			unsigned long pfn, pgprot_t prot)
+			pfn_t pfn, pgprot_t prot)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	int retval;
@@ -1533,7 +1533,10 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 		goto out_unlock;
 
 	/* Ok, finally just insert the thing.. */
-	entry = pte_mkspecial(pfn_pte(pfn, prot));
+	if (pfn_t_has_dev_pagemap(pfn))
+		entry = pte_mkdevmap(pfn_t_pte(pfn, prot));
+	else
+		entry = pte_mkspecial(pfn_t_pte(pfn, prot));
 	set_pte_at(mm, addr, pte, entry);
 	update_mmu_cache(vma, addr, pte); /* XXX: why not for insert_page? */
 
@@ -1583,14 +1586,14 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 	if (track_pfn_insert(vma, &pgprot, pfn))
 		return -EINVAL;
 
-	ret = insert_pfn(vma, addr, pfn, pgprot);
+	ret = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot);
 
 	return ret;
 }
 EXPORT_SYMBOL(vm_insert_pfn);
 
 int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
-			unsigned long pfn)
+			pfn_t pfn)
 {
 	BUG_ON(!(vma->vm_flags & VM_MIXEDMAP));
 
@@ -1604,10 +1607,10 @@ int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 	 * than insert_pfn).  If a zero_pfn were inserted into a VM_MIXEDMAP
 	 * without pte special, it would there be refcounted as a normal page.
 	 */
-	if (!HAVE_PTE_SPECIAL && pfn_valid(pfn)) {
+	if (!HAVE_PTE_SPECIAL && pfn_t_valid(pfn)) {
 		struct page *page;
 
-		page = pfn_to_page(pfn);
+		page = pfn_t_to_page(pfn);
 		return insert_page(vma, addr, page, vma->vm_page_prot);
 	}
 	return insert_pfn(vma, addr, pfn, vma->vm_page_prot);


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 15/20] mm, dax: convert vmf_insert_pfn_pmd() to pfn_t
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (13 preceding siblings ...)
  2015-10-10  0:56 ` [PATCH v2 14/20] mm, dax, gpu: convert vm_insert_mixed to pfn_t, introduce _PAGE_DEVMAP Dan Williams
@ 2015-10-10  0:56 ` Dan Williams
  2015-10-10  0:56 ` [PATCH v2 16/20] list: introduce list_poison() and LIST_POISON3 Dan Williams
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:56 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Dave Hansen, linux-kernel, linux-mm, Alexander Viro,
	ross.zwisler, Matthew Wilcox, Andrew Morton, hch

Similar to the vm_insert_mixed() conversion, have vmf_insert_pfn_pmd()
take a pfn_t so the resulting pmd can be tagged with _PAGE_DEVMAP when
the pfn is backed by a devm_memremap_pages() mapping.
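
One detail worth calling out: the patch also adds default pte_devmap()
and pmd_devmap() stubs so generic code can test them unconditionally.  A
small stand-alone sketch of that idiom (values simplified, not kernel
code):

#include <stdio.h>

/* an arch without devmap support simply never defines these */
#ifndef pte_devmap
#define pte_devmap(pte) (0)
#endif
#ifndef pmd_devmap
#define pmd_devmap(pmd) (0)
#endif

int main(void)
{
	/* expands to (0): constant-false, so the compiler elides the branch */
	if (pmd_devmap(0x1234UL))
		printf("devmap pmd\n");
	else
		printf("no devmap support on this configuration\n");
	return 0;
}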

Cc: Dave Hansen <dave@sr71.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/sparc/include/asm/pgtable_64.h     |    2 ++
 arch/x86/include/asm/pgtable.h          |    6 ++++++
 arch/x86/mm/pat.c                       |    4 ++--
 drivers/gpu/drm/exynos/exynos_drm_gem.c |    2 +-
 drivers/gpu/drm/msm/msm_gem.c           |    2 +-
 drivers/gpu/drm/omapdrm/omap_gem.c      |    4 ++--
 fs/dax.c                                |    2 +-
 include/asm-generic/pgtable.h           |    6 ++++--
 include/linux/huge_mm.h                 |    2 +-
 include/linux/mm.h                      |   18 +++++++++++++++++-
 mm/huge_memory.c                        |   10 ++++++----
 mm/memory.c                             |    2 +-
 12 files changed, 44 insertions(+), 16 deletions(-)

diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 131d36fcd07a..496ef783c68c 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -234,6 +234,7 @@ extern struct page *mem_map_zero;
  * the first physical page in the machine is at some huge physical address,
  * such as 4GB.   This is common on a partitioned E10000, for example.
  */
+#define pfn_pte pfn_pte
 static inline pte_t pfn_pte(unsigned long pfn, pgprot_t prot)
 {
 	unsigned long paddr = pfn << PAGE_SHIFT;
@@ -244,6 +245,7 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t prot)
 #define mk_pte(page, pgprot)	pfn_pte(page_to_pfn(page), (pgprot))
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define pfn_pmd pfn_pmd
 static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
 {
 	pte_t pte = pfn_pte(page_nr, pgprot);
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 02a54e5b7930..84d1346e1cda 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -282,6 +282,11 @@ static inline pmd_t pmd_mkdirty(pmd_t pmd)
 	return pmd_set_flags(pmd, _PAGE_DIRTY | _PAGE_SOFT_DIRTY);
 }
 
+static inline pmd_t pmd_mkdevmap(pmd_t pmd)
+{
+	return pmd_set_flags(pmd, _PAGE_DEVMAP);
+}
+
 static inline pmd_t pmd_mkhuge(pmd_t pmd)
 {
 	return pmd_set_flags(pmd, _PAGE_PSE);
@@ -346,6 +351,7 @@ static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
 		     massage_pgprot(pgprot));
 }
 
+#define pfn_pmd pfn_pmd
 static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
 {
 	return __pmd(((phys_addr_t)page_nr << PAGE_SHIFT) |
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 188e3e07eeeb..98efd3c02374 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -949,7 +949,7 @@ int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
 }
 
 int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-		     unsigned long pfn)
+		     pfn_t pfn)
 {
 	enum page_cache_mode pcm;
 
@@ -957,7 +957,7 @@ int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
 		return 0;
 
 	/* Set prot based on lookup */
-	pcm = lookup_memtype((resource_size_t)pfn << PAGE_SHIFT);
+	pcm = lookup_memtype(pfn_t_to_phys(pfn));
 	*prot = __pgprot((pgprot_val(vma->vm_page_prot) & (~_PAGE_CACHE_MASK)) |
 			 cachemode2protval(pcm));
 
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index 778764bebc00..aa7709ed9ae2 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -480,7 +480,7 @@ int exynos_drm_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 
 	pfn = page_to_pfn(exynos_gem_obj->pages[page_offset]);
 	ret = vm_insert_mixed(vma, (unsigned long)vmf->virtual_address,
-			pfn_to_pfn_t(pfn, PFN_DEV));
+			__pfn_to_pfn_t(pfn, PFN_DEV));
 
 out:
 	switch (ret) {
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 0f4ed5bfda83..6509d9b23912 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -223,7 +223,7 @@ int msm_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 			pfn, pfn << PAGE_SHIFT);
 
 	ret = vm_insert_mixed(vma, (unsigned long)vmf->virtual_address,
-			pfn_to_pfn_t(pfn, PFN_DEV));
+			__pfn_to_pfn_t(pfn, PFN_DEV));
 
 out_unlock:
 	mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c
index 910cb276a7ea..94b6d23ec202 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem.c
@@ -386,7 +386,7 @@ static int fault_1d(struct drm_gem_object *obj,
 			pfn, pfn << PAGE_SHIFT);
 
 	return vm_insert_mixed(vma, (unsigned long)vmf->virtual_address,
-			pfn_to_pfn_t(pfn, PFN_DEV));
+			__pfn_to_pfn_t(pfn, PFN_DEV));
 }
 
 /* Special handling for the case of faulting in 2d tiled buffers */
@@ -480,7 +480,7 @@ static int fault_2d(struct drm_gem_object *obj,
 
 	for (i = n; i > 0; i--) {
 		vm_insert_mixed(vma, (unsigned long)vaddr,
-				pfn_to_pfn_t(pfn, PFN_DEV));
+				__pfn_to_pfn_t(pfn, PFN_DEV));
 		pfn += usergart[fmt].stride_pfn;
 		vaddr += PAGE_SIZE * m;
 	}
diff --git a/fs/dax.c b/fs/dax.c
index 1588edf297a2..87a070d6e6dc 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -685,7 +685,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 		dax_unmap_atomic(bdev, kaddr);
 
 		result |= vmf_insert_pfn_pmd(vma, address, pmd,
-				pfn_t_to_pfn(pfn), write);
+				pfn, write);
 	}
 
  out:
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 29c57b2cb344..16d2244c686f 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_GENERIC_PGTABLE_H
 #define _ASM_GENERIC_PGTABLE_H
 
+#include <linux/pfn.h>
+
 #ifndef __ASSEMBLY__
 #ifdef CONFIG_MMU
 
@@ -521,7 +523,7 @@ static inline int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
  * by vm_insert_pfn().
  */
 static inline int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-				   unsigned long pfn)
+				   pfn_t pfn)
 {
 	return 0;
 }
@@ -549,7 +551,7 @@ extern int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
 			   unsigned long pfn, unsigned long addr,
 			   unsigned long size);
 extern int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
-			    unsigned long pfn);
+			    pfn_t pfn);
 extern int track_pfn_copy(struct vm_area_struct *vma);
 extern void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 			unsigned long size);
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index ecb080d6ff42..d218abedfeb9 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -34,7 +34,7 @@ extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 			unsigned long addr, pgprot_t newprot,
 			int prot_numa);
 int vmf_insert_pfn_pmd(struct vm_area_struct *, unsigned long addr, pmd_t *,
-			unsigned long pfn, bool write);
+			pfn_t pfn, bool write);
 
 enum transparent_hugepage_flag {
 	TRANSPARENT_HUGEPAGE_FLAG,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1d405ca21c9f..ce173327215d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1109,7 +1109,14 @@ static inline pte_t pfn_t_pte(pfn_t pfn, pgprot_t pgprot)
 }
 #endif
 
-#ifdef __HAVE_ARCH_PTE_DEVICE
+#ifdef pfn_pmd
+static inline pmd_t pfn_t_pmd(pfn_t pfn, pgprot_t pgprot)
+{
+	return pfn_pmd(pfn_t_to_pfn(pfn), pgprot);
+}
+#endif
+
+#ifdef __HAVE_ARCH_PTE_DEVMAP
 static inline bool pfn_t_has_dev_pagemap(pfn_t pfn)
 {
 	const unsigned long flags = PFN_DEV|PFN_MAP;
@@ -1122,6 +1129,7 @@ static inline bool pfn_t_has_dev_pagemap(pfn_t pfn)
 	return false;
 }
 pte_t pte_mkdevmap(pte_t pte);
+pmd_t pmd_mkdevmap(pmd_t pmd);
 #endif
 
 /*
@@ -1887,6 +1895,14 @@ static inline void pgtable_pmd_page_dtor(struct page *page) {}
 
 #endif
 
+#ifndef pmd_devmap
+#define pmd_devmap(x) (0)
+#endif
+
+#ifndef pte_devmap
+#define pte_devmap(x) (0)
+#endif
+
 static inline spinlock_t *pmd_lock(struct mm_struct *mm, pmd_t *pmd)
 {
 	spinlock_t *ptl = pmd_lockptr(mm, pmd);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 4b06b8db9df2..952b65a55bc9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -870,7 +870,7 @@ int do_huge_pmd_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
 }
 
 static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
-		pmd_t *pmd, unsigned long pfn, pgprot_t prot, bool write)
+		pmd_t *pmd, pfn_t pfn, pgprot_t prot, bool write)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t entry;
@@ -878,7 +878,9 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 
 	ptl = pmd_lock(mm, pmd);
 	if (pmd_none(*pmd)) {
-		entry = pmd_mkhuge(pfn_pmd(pfn, prot));
+		entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
+		if (pfn_t_has_dev_pagemap(pfn))
+			entry = pmd_mkdevmap(entry);
 		if (write) {
 			entry = pmd_mkyoung(pmd_mkdirty(entry));
 			entry = maybe_pmd_mkwrite(entry, vma);
@@ -890,7 +892,7 @@ static void insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 }
 
 int vmf_insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
-			pmd_t *pmd, unsigned long pfn, bool write)
+			pmd_t *pmd, pfn_t pfn, bool write)
 {
 	pgprot_t pgprot = vma->vm_page_prot;
 	/*
@@ -902,7 +904,7 @@ int vmf_insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 	BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) ==
 						(VM_PFNMAP|VM_MIXEDMAP));
 	BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
-	BUG_ON((vma->vm_flags & VM_MIXEDMAP) && pfn_valid(pfn));
+	BUG_ON((vma->vm_flags & VM_MIXEDMAP) && pfn_t_valid(pfn));
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return VM_FAULT_SIGBUS;
diff --git a/mm/memory.c b/mm/memory.c
index 5fc858570f58..06d78ac37343 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1583,7 +1583,7 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return -EFAULT;
-	if (track_pfn_insert(vma, &pgprot, pfn))
+	if (track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV)))
 		return -EINVAL;
 
 	ret = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot);


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 16/20] list: introduce list_poison() and LIST_POISON3
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (14 preceding siblings ...)
  2015-10-10  0:56 ` [PATCH v2 15/20] mm, dax: convert vmf_insert_pfn_pmd() to pfn_t Dan Williams
@ 2015-10-10  0:56 ` Dan Williams
  2015-10-10  0:56 ` [PATCH v2 17/20] mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup Dan Williams
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:56 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: linux-mm, ross.zwisler, linux-kernel, hch

ZONE_DEVICE pages always have an elevated count and will never be on an
lru reclaim list.  That space in 'struct page' can be redirected for
other uses, but for safety introduce a poison value that will reliably
trip an assertion in __list_add().  This allows half of the struct
list_head storage to be reclaimed, with some assurance backing the
assumption that the page count never goes to zero and a list_add() is
never attempted.
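
A stand-alone sketch of the mechanism (user-space C; the
POISON_POINTER_DELTA offset is simplified away and WARN() is replaced by
fprintf, otherwise this mirrors the hunks below):

#include <stdio.h>

#define LIST_POISON3 ((void *) 0x300)

struct list_head {
	struct list_head *next, *prev;
};

static void __list_add(struct list_head *new, struct list_head *prev,
		       struct list_head *next)
{
	if (new->next == LIST_POISON3 || new->prev == LIST_POISON3) {
		fprintf(stderr, "list_add attempted on poisoned entry\n");
		return;
	}
	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
}

static void list_del_poison(struct list_head *entry)
{
	entry->prev->next = entry->next;	/* open-coded __list_del() */
	entry->next->prev = entry->prev;
	entry->next = LIST_POISON3;
	entry->prev = LIST_POISON3;
}

int main(void)
{
	struct list_head head = { &head, &head };
	struct list_head entry = { NULL, NULL };

	__list_add(&entry, &head, head.next);	/* ok */
	list_del_poison(&entry);
	__list_add(&entry, &head, head.next);	/* trips the poison check */
	return 0;
}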

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 include/linux/list.h   |   14 ++++++++++++++
 include/linux/poison.h |    1 +
 lib/list_debug.c       |    2 ++
 3 files changed, 17 insertions(+)

diff --git a/include/linux/list.h b/include/linux/list.h
index 3e3e64a61002..af38cc80ae4c 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -114,6 +114,20 @@ extern void list_del(struct list_head *entry);
 #endif
 
 /**
+ * list_del_poison - poison an entry to always assert on list_add
+ * @entry: the element to delete and poison
+ *
+ * Note: the assertion on list_add() only occurs when CONFIG_DEBUG_LIST=y,
+ * otherwise this is identical to list_del()
+ */
+static inline void list_del_poison(struct list_head *entry)
+{
+	__list_del(entry->prev, entry->next);
+	entry->next = LIST_POISON3;
+	entry->prev = LIST_POISON3;
+}
+
+/**
  * list_replace - replace old entry by new one
  * @old : the element to be replaced
  * @new : the new element to insert
diff --git a/include/linux/poison.h b/include/linux/poison.h
index 317e16de09e5..31d048b3ba06 100644
--- a/include/linux/poison.h
+++ b/include/linux/poison.h
@@ -21,6 +21,7 @@
  */
 #define LIST_POISON1  ((void *) 0x100 + POISON_POINTER_DELTA)
 #define LIST_POISON2  ((void *) 0x200 + POISON_POINTER_DELTA)
+#define LIST_POISON3  ((void *) 0x300 + POISON_POINTER_DELTA)
 
 /********** include/linux/timer.h **********/
 /*
diff --git a/lib/list_debug.c b/lib/list_debug.c
index c24c2f7e296f..ec69e2b8e0fc 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -23,6 +23,8 @@ void __list_add(struct list_head *new,
 			      struct list_head *prev,
 			      struct list_head *next)
 {
+	WARN(new->next == LIST_POISON3 || new->prev == LIST_POISON3,
+		"list_add attempted on poisoned entry\n");
 	WARN(next->prev != prev,
 		"list_add corruption. next->prev should be "
 		"prev (%p), but was %p. (next=%p).\n",


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 17/20] mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (15 preceding siblings ...)
  2015-10-10  0:56 ` [PATCH v2 16/20] list: introduce list_poison() and LIST_POISON3 Dan Williams
@ 2015-10-10  0:56 ` Dan Williams
  2015-10-10  0:57 ` [PATCH v2 18/20] block: notify queue death confirmation Dan Williams
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:56 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Dave Hansen, linux-kernel, hch, linux-mm, Alexander Viro,
	Matthew Wilcox, Andrew Morton, ross.zwisler

get_dev_pagemap() enables paths like get_user_pages() to pin a
dynamically mapped pfn-range (devm_memremap_pages()) while the resulting
struct page objects are in use.  Unlike get_page() it may fail if the
device is, or is in the process of being, disabled.  While the initial
lookup of the range may be an expensive list walk, the result is cached
to speed up subsequent lookups, which are likely to be in the same
mapped range.

devm_memremap_pages() now requires a reference counter to be specified
at init time.  For pmem this means moving request_queue allocation into
pmem_alloc() so the existing queue usage counter can track "device
pages".

Cc: Dave Hansen <dave@sr71.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/pmem.c    |   42 +++++++++++++++++++++++++-----------------
 include/linux/mm.h       |   40 ++++++++++++++++++++++++++++++++++++++--
 include/linux/mm_types.h |    5 +++++
 kernel/memremap.c        |   46 ++++++++++++++++++++++++++++++++++++++++++----
 4 files changed, 110 insertions(+), 23 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index c950602bbf0b..f7acce594fa0 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -123,6 +123,7 @@ static struct pmem_device *pmem_alloc(struct device *dev,
 		struct resource *res, int id)
 {
 	struct pmem_device *pmem;
+	struct request_queue *q;
 
 	pmem = devm_kzalloc(dev, sizeof(*pmem), GFP_KERNEL);
 	if (!pmem)
@@ -140,19 +141,26 @@ static struct pmem_device *pmem_alloc(struct device *dev,
 		return ERR_PTR(-EBUSY);
 	}
 
+	q = blk_alloc_queue_node(GFP_KERNEL, dev_to_node(dev));
+	if (!q)
+		return ERR_PTR(-ENOMEM);
+
 	pmem->pfn_flags = PFN_DEV;
 	if (pmem_should_map_pages(dev)) {
 		pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, res,
-				NULL);
+				&q->q_usage_counter, NULL);
 		pmem->pfn_flags |= PFN_MAP;
 	} else
 		pmem->virt_addr = (void __pmem *) devm_memremap(dev,
 				pmem->phys_addr, pmem->size,
 				ARCH_MEMREMAP_PMEM);
 
-	if (IS_ERR(pmem->virt_addr))
+	if (IS_ERR(pmem->virt_addr)) {
+		blk_cleanup_queue(q);
 		return (void __force *) pmem->virt_addr;
+	}
 
+	pmem->pmem_queue = q;
 	return pmem;
 }
 
@@ -169,20 +177,15 @@ static void pmem_detach_disk(struct pmem_device *pmem)
 static int pmem_attach_disk(struct device *dev,
 		struct nd_namespace_common *ndns, struct pmem_device *pmem)
 {
-	int nid = dev_to_node(dev);
 	struct gendisk *disk;
 
-	pmem->pmem_queue = blk_alloc_queue_node(GFP_KERNEL, nid);
-	if (!pmem->pmem_queue)
-		return -ENOMEM;
-
 	blk_queue_make_request(pmem->pmem_queue, pmem_make_request);
 	blk_queue_physical_block_size(pmem->pmem_queue, PAGE_SIZE);
 	blk_queue_max_hw_sectors(pmem->pmem_queue, UINT_MAX);
 	blk_queue_bounce_limit(pmem->pmem_queue, BLK_BOUNCE_ANY);
 	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, pmem->pmem_queue);
 
-	disk = alloc_disk_node(0, nid);
+	disk = alloc_disk_node(0, dev_to_node(dev));
 	if (!disk) {
 		blk_cleanup_queue(pmem->pmem_queue);
 		return -ENOMEM;
@@ -318,6 +321,7 @@ static int nvdimm_namespace_attach_pfn(struct nd_namespace_common *ndns)
 	struct vmem_altmap *altmap;
 	struct nd_pfn_sb *pfn_sb;
 	struct pmem_device *pmem;
+	struct request_queue *q;
 	phys_addr_t offset;
 	int rc;
 	struct vmem_altmap __altmap = {
@@ -369,9 +373,10 @@ static int nvdimm_namespace_attach_pfn(struct nd_namespace_common *ndns)
 
 	/* establish pfn range for lookup, and switch to direct map */
 	pmem = dev_get_drvdata(dev);
+	q = pmem->pmem_queue;
 	devm_memunmap(dev, (void __force *) pmem->virt_addr);
 	pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, &nsio->res,
-			altmap);
+			&q->q_usage_counter, altmap);
 	pmem->pfn_flags |= PFN_MAP;
 	if (IS_ERR(pmem->virt_addr)) {
 		rc = PTR_ERR(pmem->virt_addr);
@@ -410,19 +415,22 @@ static int nd_pmem_probe(struct device *dev)
 	dev_set_drvdata(dev, pmem);
 	ndns->rw_bytes = pmem_rw_bytes;
 
-	if (is_nd_btt(dev))
+	if (is_nd_btt(dev)) {
+		/* btt allocates its own request_queue */
+		blk_cleanup_queue(pmem->pmem_queue);
+		pmem->pmem_queue = NULL;
 		return nvdimm_namespace_attach_btt(ndns);
+	}
 
 	if (is_nd_pfn(dev))
 		return nvdimm_namespace_attach_pfn(ndns);
 
-	if (nd_btt_probe(ndns, pmem) == 0) {
-		/* we'll come back as btt-pmem */
-		return -ENXIO;
-	}
-
-	if (nd_pfn_probe(ndns, pmem) == 0) {
-		/* we'll come back as pfn-pmem */
+	if (nd_btt_probe(ndns, pmem) == 0 || nd_pfn_probe(ndns, pmem) == 0) {
+		/*
+		 * We'll come back as either btt-pmem, or pfn-pmem, so
+		 * drop the queue allocation for now.
+		 */
+		blk_cleanup_queue(pmem->pmem_queue);
 		return -ENXIO;
 	}
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ce173327215d..8a84bfb6fa6a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -15,12 +15,14 @@
 #include <linux/debug_locks.h>
 #include <linux/mm_types.h>
 #include <linux/range.h>
+#include <linux/percpu-refcount.h>
 #include <linux/pfn.h>
 #include <linux/bit_spinlock.h>
 #include <linux/shrinker.h>
 #include <linux/resource.h>
 #include <linux/page_ext.h>
 #include <linux/err.h>
+#include <linux/ioport.h>
 
 struct mempolicy;
 struct anon_vma;
@@ -786,18 +788,21 @@ static inline void vmem_altmap_free(struct vmem_altmap *altmap,
 /**
  * struct dev_pagemap - metadata for ZONE_DEVICE mappings
  * @altmap: pre-allocated/reserved memory for vmemmap allocations
+ * @res: physical address range covered by @ref
+ * @ref: reference count that pins the devm_memremap_pages() mapping
  * @dev: host device of the mapping for debug
  */
 struct dev_pagemap {
 	struct vmem_altmap *altmap;
 	const struct resource *res;
+	struct percpu_ref *ref;
 	struct device *dev;
 };
 
 #ifdef CONFIG_ZONE_DEVICE
 struct dev_pagemap *__get_dev_pagemap(resource_size_t phys);
 void *devm_memremap_pages(struct device *dev, struct resource *res,
-		struct vmem_altmap *altmap);
+		struct percpu_ref *ref, struct vmem_altmap *altmap);
 struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start);
 #else
 static inline struct dev_pagemap *__get_dev_pagemap(resource_size_t phys)
@@ -806,7 +811,7 @@ static inline struct dev_pagemap *__get_dev_pagemap(resource_size_t phys)
 }
 
 static inline void *devm_memremap_pages(struct device *dev, struct resource *res,
-		struct vmem_altmap *altmap)
+		struct percpu_ref *ref, struct vmem_altmap *altmap)
 {
 	/*
 	 * Fail attempts to call devm_memremap_pages() without
@@ -823,6 +828,37 @@ static inline struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
 }
 #endif
 
+/* get a live reference on the dev_pagemap hosting the given pfn */
+static inline struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
+		struct dev_pagemap *pgmap)
+{
+	resource_size_t phys = __pfn_to_phys(pfn);
+
+	/*
+	 * In the cached case we're already holding a reference so we can
+	 * simply do a blind increment
+	 */
+	if (pgmap && phys >= pgmap->res->start && phys <= pgmap->res->end) {
+		percpu_ref_get(pgmap->ref);
+		return pgmap;
+	}
+
+	/* fall back to slow path lookup */
+	rcu_read_lock();
+	pgmap = __get_dev_pagemap(phys);
+	if (pgmap && !percpu_ref_tryget_live(pgmap->ref))
+		pgmap = NULL;
+	rcu_read_unlock();
+
+	return pgmap;
+}
+
+static inline void put_dev_pagemap(struct dev_pagemap *pgmap)
+{
+	if (pgmap)
+		percpu_ref_put(pgmap->ref);
+}
+
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define SECTION_IN_PAGE_FLAGS
 #endif
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3d6baa7d4534..457c3e8a8f4f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -120,6 +120,11 @@ struct page {
 					 * Can be used as a generic list
 					 * by the page owner.
 					 */
+		struct dev_pagemap *pgmap; /* ZONE_DEVICE pages are never on an
+					    * lru or handled by a slab
+					    * allocator, this points to the
+					    * hosting device page map.
+					    */
 		struct {		/* slub per cpu partial pages */
 			struct page *next;	/* Next partial slab */
 #ifdef CONFIG_64BIT
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 75161bb68af1..246446ba6e2f 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -163,6 +163,28 @@ static void del_page_map(struct page_map *page_map)
 	spin_unlock(&range_lock);
 }
 
+static unsigned long pfn_first(struct dev_pagemap *pgmap)
+{
+	struct vmem_altmap *altmap = pgmap->altmap;
+	const struct resource *res = pgmap->res;
+	unsigned long pfn;
+
+	pfn = res->start >> PAGE_SHIFT;
+	if (altmap)
+		pfn += vmem_altmap_offset(altmap);
+	return pfn;
+}
+
+static unsigned long pfn_end(struct dev_pagemap *pgmap)
+{
+	const struct resource *res = pgmap->res;
+
+	return (res->start + resource_size(res)) >> PAGE_SHIFT;
+}
+
+#define for_each_device_pfn(pfn, pgmap) \
+	for (pfn = pfn_first(pgmap); pfn < pfn_end(pgmap); pfn++)
+
 static void devm_memremap_pages_release(struct device *dev, void *data)
 {
 	struct page_map *page_map = data;
@@ -193,19 +215,25 @@ struct dev_pagemap *__get_dev_pagemap(resource_size_t phys)
  * devm_memremap_pages - remap and provide memmap backing for the given resource
  * @dev: hosting device for @res
  * @res: "host memory" address range
+ * @ref: a live per-cpu reference count
  * @altmap: optional descriptor for allocating the memmap from @res
  *
- * Note, the expectation is that @res is a host memory range that could
- * feasibly be treated as a "System RAM" range, i.e. not a device mmio
- * range, but this is not enforced.
+ * Notes:
+ * 1/ @ref must be 'live' on entry and 'dead' before devm_memunmap_pages() time
+ *    (or devm release event).
+ *
+ * 2/ @res is expected to be a host memory range that could feasibly be
+ *    treated as a "System RAM" range, i.e. not a device mmio range, but
+ *    this is not enforced.
  */
 void *devm_memremap_pages(struct device *dev, struct resource *res,
-		struct vmem_altmap *altmap)
+		struct percpu_ref *ref, struct vmem_altmap *altmap)
 {
 	int is_ram = region_intersects(res->start, resource_size(res),
 			"System RAM");
 	struct dev_pagemap *pgmap;
 	struct page_map *page_map;
+	unsigned long pfn;
 	int error, nid;
 
 	if (is_ram == REGION_MIXED) {
@@ -217,6 +245,9 @@ void *devm_memremap_pages(struct device *dev, struct resource *res,
 	if (is_ram == REGION_INTERSECTS)
 		return __va(res->start);
 
+	if (!ref)
+		return ERR_PTR(-EINVAL);
+
 	page_map = devres_alloc_node(devm_memremap_pages_release,
 			sizeof(*page_map), GFP_KERNEL, dev_to_node(dev));
 	if (!page_map)
@@ -229,6 +260,7 @@ void *devm_memremap_pages(struct device *dev, struct resource *res,
 		pgmap->altmap = &page_map->altmap;
 	}
 	pgmap->dev = dev;
+	pgmap->ref = ref;
 	pgmap->res = &page_map->res;
 	INIT_LIST_HEAD(&page_map->list);
 	add_page_map(page_map);
@@ -244,6 +276,12 @@ void *devm_memremap_pages(struct device *dev, struct resource *res,
 		return ERR_PTR(error);
 	}
 
+	for_each_device_pfn(pfn, pgmap) {
+		struct page *page = pfn_to_page(pfn);
+
+		list_del_poison(&page->lru);
+		page->pgmap = pgmap;
+	}
 	devres_add(dev, page_map);
 	return __va(res->start);
 }
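
For reference, a condensed caller-side sketch of the new api (illustrative
only; error handling is trimmed, and 'my_ref' stands in for whatever
percpu_ref a driver ties page lifetime to):

    /* driver init: @ref must already be live */
    void *addr = devm_memremap_pages(dev, res, &drv->my_ref, NULL);

    if (IS_ERR(addr))
        return PTR_ERR(addr);

    /* consumer: pin the hosting pgmap before trusting pfn_to_page() */
    struct dev_pagemap *pgmap = get_dev_pagemap(pfn, NULL);

    if (pgmap) {
        struct page *page = pfn_to_page(pfn);

        /* ... @page is safe to use while the reference is held ... */
        put_dev_pagemap(pgmap);
    }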


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 18/20] block: notify queue death confirmation
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (16 preceding siblings ...)
  2015-10-10  0:56 ` [PATCH v2 17/20] mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup Dan Williams
@ 2015-10-10  0:57 ` Dan Williams
  2015-10-10  0:57 ` [PATCH v2 19/20] mm, pmem: devm_memunmap_pages(), truncate and unmap ZONE_DEVICE pages Dan Williams
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:57 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Jens Axboe, linux-mm, ross.zwisler, hch, linux-kernel

The pmem driver arranges for references to be taken against the queue
while pages it allocated via devm_memremap_pages() are in use.  At
shutdown time, before those pages can be deallocated, they need to be
truncated, unmapped, and guaranteed to be idle.  Scanning the pages to
initiate truncation can only be done once we are certain no new page
references will be taken.  Once the blk queue percpu_ref is confirmed
dead, __get_dev_pagemap() will cease allowing new references and we can
reclaim these "device" pages.
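
Condensed from the diff below, the handshake is the standard percpu_ref
kill-and-confirm pattern:

    /* freeze side: kill the ref, get a callback once it is truly dead */
    percpu_ref_kill_and_confirm(&q->q_usage_counter,
            blk_confirm_queue_death);

    /* confirm callback: latch death and wake blk_wait_queue_dead() */
    q->q_usage_dead = 1;
    wake_up_all(&q->q_freeze_wq);

    /* pmem shutdown: no new q_usage_counter references past this point */
    blk_wait_queue_dead(q);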

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 block/blk-core.c       |   12 +++++++++---
 block/blk-mq.c         |   19 +++++++++++++++----
 include/linux/blkdev.h |    4 +++-
 3 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 9b4d735cb5b8..74aaa208a8e9 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -516,6 +516,12 @@ void blk_set_queue_dying(struct request_queue *q)
 }
 EXPORT_SYMBOL_GPL(blk_set_queue_dying);
 
+void blk_wait_queue_dead(struct request_queue *q)
+{
+	wait_event(q->q_freeze_wq, q->q_usage_dead);
+}
+EXPORT_SYMBOL(blk_wait_queue_dead);
+
 /**
  * blk_cleanup_queue - shutdown a request queue
  * @q: request queue to shutdown
@@ -638,7 +644,7 @@ int blk_queue_enter(struct request_queue *q, gfp_t gfp)
 		if (!(gfp & __GFP_WAIT))
 			return -EBUSY;
 
-		ret = wait_event_interruptible(q->mq_freeze_wq,
+		ret = wait_event_interruptible(q->q_freeze_wq,
 				!atomic_read(&q->mq_freeze_depth) ||
 				blk_queue_dying(q));
 		if (blk_queue_dying(q))
@@ -658,7 +664,7 @@ static void blk_queue_usage_counter_release(struct percpu_ref *ref)
 	struct request_queue *q =
 		container_of(ref, struct request_queue, q_usage_counter);
 
-	wake_up_all(&q->mq_freeze_wq);
+	wake_up_all(&q->q_freeze_wq);
 }
 
 struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
@@ -720,7 +726,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
 	q->bypass_depth = 1;
 	__set_bit(QUEUE_FLAG_BYPASS, &q->queue_flags);
 
-	init_waitqueue_head(&q->mq_freeze_wq);
+	init_waitqueue_head(&q->q_freeze_wq);
 
 	/*
 	 * Init percpu_ref in atomic mode so that it's faster to shutdown.
diff --git a/block/blk-mq.c b/block/blk-mq.c
index c371aeda2986..d52f9d91f5c1 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -77,13 +77,23 @@ static void blk_mq_hctx_clear_pending(struct blk_mq_hw_ctx *hctx,
 	clear_bit(CTX_TO_BIT(hctx, ctx), &bm->word);
 }
 
+static void blk_confirm_queue_death(struct percpu_ref *ref)
+{
+	struct request_queue *q = container_of(ref, typeof(*q),
+			q_usage_counter);
+
+	q->q_usage_dead = 1;
+	wake_up_all(&q->q_freeze_wq);
+}
+
 void blk_mq_freeze_queue_start(struct request_queue *q)
 {
 	int freeze_depth;
 
 	freeze_depth = atomic_inc_return(&q->mq_freeze_depth);
 	if (freeze_depth == 1) {
-		percpu_ref_kill(&q->q_usage_counter);
+		percpu_ref_kill_and_confirm(&q->q_usage_counter,
+				blk_confirm_queue_death);
 		blk_mq_run_hw_queues(q, false);
 	}
 }
@@ -91,7 +101,7 @@ EXPORT_SYMBOL_GPL(blk_mq_freeze_queue_start);
 
 static void blk_mq_freeze_queue_wait(struct request_queue *q)
 {
-	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter));
+	wait_event(q->q_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter));
 }
 
 /*
@@ -129,7 +139,8 @@ void blk_mq_unfreeze_queue(struct request_queue *q)
 	WARN_ON_ONCE(freeze_depth < 0);
 	if (!freeze_depth) {
 		percpu_ref_reinit(&q->q_usage_counter);
-		wake_up_all(&q->mq_freeze_wq);
+		q->q_usage_dead = 0;
+		wake_up_all(&q->q_freeze_wq);
 	}
 }
 EXPORT_SYMBOL_GPL(blk_mq_unfreeze_queue);
@@ -148,7 +159,7 @@ void blk_mq_wake_waiters(struct request_queue *q)
 	 * dying, we need to ensure that processes currently waiting on
 	 * the queue are notified as well.
 	 */
-	wake_up_all(&q->mq_freeze_wq);
+	wake_up_all(&q->q_freeze_wq);
 }
 
 bool blk_mq_can_queue(struct blk_mq_hw_ctx *hctx)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index fb3e6886c479..a1340654e360 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -427,6 +427,7 @@ struct request_queue {
 	 */
 	unsigned int		flush_flags;
 	unsigned int		flush_not_queueable:1;
+	unsigned int		q_usage_dead:1;
 	struct blk_flush_queue	*fq;
 
 	struct list_head	requeue_list;
@@ -449,7 +450,7 @@ struct request_queue {
 	struct throtl_data *td;
 #endif
 	struct rcu_head		rcu_head;
-	wait_queue_head_t	mq_freeze_wq;
+	wait_queue_head_t	q_freeze_wq;
 	struct percpu_ref	q_usage_counter;
 	struct list_head	all_q_node;
 
@@ -949,6 +950,7 @@ extern struct request_queue *blk_init_queue_node(request_fn_proc *rfn,
 extern struct request_queue *blk_init_queue(request_fn_proc *, spinlock_t *);
 extern struct request_queue *blk_init_allocated_queue(struct request_queue *,
 						      request_fn_proc *, spinlock_t *);
+extern void blk_wait_queue_dead(struct request_queue *q);
 extern void blk_cleanup_queue(struct request_queue *);
 extern void blk_queue_make_request(struct request_queue *, make_request_fn *);
 extern void blk_queue_bounce_limit(struct request_queue *, u64);


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 19/20] mm, pmem: devm_memunmap_pages(), truncate and unmap ZONE_DEVICE pages
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (17 preceding siblings ...)
  2015-10-10  0:57 ` [PATCH v2 18/20] block: notify queue death confirmation Dan Williams
@ 2015-10-10  0:57 ` Dan Williams
  2015-10-10  0:57 ` [PATCH v2 20/20] mm, x86: get_user_pages() for dax mappings Dan Williams
  2015-10-23 21:06 ` [PATCH v2 00/20] " Logan Gunthorpe
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:57 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Dave Hansen, Dave Chinner, linux-kernel, hch, linux-mm,
	Alexander Viro, Matthew Wilcox, ross.zwisler, Andrew Morton

Before we allow ZONE_DEVICE pages to be put into active use outside of
the pmem driver, we need to arrange for them to be reclaimed when the
driver is shut down.  devm_memunmap_pages() must wait for all pages to
return to the idle reference count of 1.  If a given page is mapped by a
process, we truncate it out of its inode mapping and unmap it from the
process vma.

This truncation is done while the dev_pagemap reference count is "dead",
preventing new references from being taken while the truncate+unmap scan
is in progress.
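
The resulting teardown ordering in the pmem driver (condensed from the
diff below):

    del_gendisk(pmem->pmem_disk);
    put_disk(pmem->pmem_disk);

    /* queue draining waits on page references, so clean up async */
    async_schedule_domain(async_blk_cleanup_queue, pmem, &async_pmem);

    if (pmem->pfn_flags & PFN_MAP) {
        blk_wait_queue_dead(q); /* ref dead => no new page pins */
        devm_memunmap_pages(dev, (void __force *) pmem->virt_addr);
    }

    /* wait for blk_cleanup_queue() to finish */
    async_synchronize_full_domain(&async_pmem);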

Cc: Dave Hansen <dave@sr71.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/nvdimm/pmem.c |   42 ++++++++++++++++++++++++++++++++++++------
 fs/dax.c              |    2 ++
 include/linux/mm.h    |    5 +++++
 kernel/memremap.c     |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 91 insertions(+), 6 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index f7acce594fa0..2c9aebbc3fea 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -24,12 +24,15 @@
 #include <linux/memory_hotplug.h>
 #include <linux/moduleparam.h>
 #include <linux/vmalloc.h>
+#include <linux/async.h>
 #include <linux/slab.h>
 #include <linux/pmem.h>
 #include <linux/nd.h>
 #include "pfn.h"
 #include "nd.h"
 
+static ASYNC_DOMAIN_EXCLUSIVE(async_pmem);
+
 struct pmem_device {
 	struct request_queue	*pmem_queue;
 	struct gendisk		*pmem_disk;
@@ -164,14 +167,43 @@ static struct pmem_device *pmem_alloc(struct device *dev,
 	return pmem;
 }
 
-static void pmem_detach_disk(struct pmem_device *pmem)
+
+static void async_blk_cleanup_queue(void *data, async_cookie_t cookie)
+{
+	struct pmem_device *pmem = data;
+
+	blk_cleanup_queue(pmem->pmem_queue);
+}
+
+static void pmem_detach_disk(struct device *dev)
 {
+	struct pmem_device *pmem = dev_get_drvdata(dev);
+	struct request_queue *q = pmem->pmem_queue;
+
 	if (!pmem->pmem_disk)
 		return;
 
 	del_gendisk(pmem->pmem_disk);
 	put_disk(pmem->pmem_disk);
-	blk_cleanup_queue(pmem->pmem_queue);
+	async_schedule_domain(async_blk_cleanup_queue, pmem, &async_pmem);
+
+	if (pmem->pfn_flags & PFN_MAP) {
+		/*
+		 * Wait for queue to go dead so that we know no new
+		 * references will be taken against the pages allocated
+		 * by devm_memremap_pages().
+		 */
+		blk_wait_queue_dead(q);
+
+		/*
+		 * Manually release the page mapping so that
+		 * blk_cleanup_queue() can complete queue draining.
+		 */
+		devm_memunmap_pages(dev, (void __force *) pmem->virt_addr);
+	}
+
+	/* Wait for blk_cleanup_queue() to finish */
+	async_synchronize_full_domain(&async_pmem);
 }
 
 static int pmem_attach_disk(struct device *dev,
@@ -299,11 +331,9 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 static int nvdimm_namespace_detach_pfn(struct nd_namespace_common *ndns)
 {
 	struct nd_pfn *nd_pfn = to_nd_pfn(ndns->claim);
-	struct pmem_device *pmem;
 
 	/* free pmem disk */
-	pmem = dev_get_drvdata(&nd_pfn->dev);
-	pmem_detach_disk(pmem);
+	pmem_detach_disk(&nd_pfn->dev);
 
 	/* release nd_pfn resources */
 	kfree(nd_pfn->pfn_sb);
@@ -446,7 +476,7 @@ static int nd_pmem_remove(struct device *dev)
 	else if (is_nd_pfn(dev))
 		nvdimm_namespace_detach_pfn(pmem->ndns);
 	else
-		pmem_detach_disk(pmem);
+		pmem_detach_disk(dev);
 
 	return 0;
 }
diff --git a/fs/dax.c b/fs/dax.c
index 87a070d6e6dc..208e064fafe5 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -46,6 +46,7 @@ static void __pmem *__dax_map_atomic(struct block_device *bdev, sector_t sector,
 		blk_queue_exit(q);
 		return (void __pmem *) ERR_PTR(rc);
 	}
+	rcu_read_lock();
 	return addr;
 }
 
@@ -62,6 +63,7 @@ static void dax_unmap_atomic(struct block_device *bdev, void __pmem *addr)
 	if (IS_ERR(addr))
 		return;
 	blk_queue_exit(bdev->bd_queue);
+	rcu_read_unlock();
 }
 
 /*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8a84bfb6fa6a..af7597410cb9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -801,6 +801,7 @@ struct dev_pagemap {
 
 #ifdef CONFIG_ZONE_DEVICE
 struct dev_pagemap *__get_dev_pagemap(resource_size_t phys);
+void devm_memunmap_pages(struct device *dev, void *addr);
 void *devm_memremap_pages(struct device *dev, struct resource *res,
 		struct percpu_ref *ref, struct vmem_altmap *altmap);
 struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start);
@@ -810,6 +811,10 @@ static inline struct dev_pagemap *__get_dev_pagemap(resource_size_t phys)
 	return NULL;
 }
 
+static inline void devm_memunmap_pages(struct device *dev, void *addr)
+{
+}
+
 static inline void *devm_memremap_pages(struct device *dev, struct resource *res,
 		struct percpu_ref *ref, struct vmem_altmap *altmap)
 {
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 246446ba6e2f..fa0cf1be2992 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -13,6 +13,7 @@
 #include <linux/rculist.h>
 #include <linux/device.h>
 #include <linux/types.h>
+#include <linux/fs.h>
 #include <linux/io.h>
 #include <linux/mm.h>
 #include <linux/memory_hotplug.h>
@@ -187,10 +188,39 @@ static unsigned long pfn_end(struct dev_pagemap *pgmap)
 
 static void devm_memremap_pages_release(struct device *dev, void *data)
 {
+	unsigned long pfn;
 	struct page_map *page_map = data;
 	struct resource *res = &page_map->res;
+	struct address_space *mapping_prev = NULL;
 	struct dev_pagemap *pgmap = &page_map->pgmap;
 
+	if (percpu_ref_tryget_live(pgmap->ref)) {
+		dev_WARN(dev, "%s: page mapping is still live!\n", __func__);
+		percpu_ref_put(pgmap->ref);
+	}
+
+	/* flush in-flight dax_map_atomic() operations */
+	synchronize_rcu();
+
+	for_each_device_pfn(pfn, pgmap) {
+		struct page *page = pfn_to_page(pfn);
+		struct address_space *mapping = page->mapping;
+		struct inode *inode = mapping ? mapping->host : NULL;
+
+		dev_WARN_ONCE(dev, atomic_read(&page->_count) < 1,
+				"%s: ZONE_DEVICE page was freed!\n", __func__);
+
+		if (!mapping || !inode || mapping == mapping_prev) {
+			dev_WARN_ONCE(dev, atomic_read(&page->_count) > 1,
+					"%s: unexpected elevated page count pfn: %lx\n",
+					__func__, pfn);
+			continue;
+		}
+
+		truncate_pagecache(inode, 0);
+		mapping_prev = mapping;
+	}
+
 	/* pages are dead and unused, undo the arch mapping */
 	arch_remove_memory(res->start, resource_size(res));
 	dev_WARN_ONCE(dev, pgmap->altmap && pgmap->altmap->alloc,
@@ -287,6 +317,24 @@ void *devm_memremap_pages(struct device *dev, struct resource *res,
 }
 EXPORT_SYMBOL(devm_memremap_pages);
 
+static int page_map_match(struct device *dev, void *res, void *match_data)
+{
+	struct page_map *page_map = res;
+	resource_size_t phys = *(resource_size_t *) match_data;
+
+	return page_map->res.start == phys;
+}
+
+void devm_memunmap_pages(struct device *dev, void *addr)
+{
+	resource_size_t start = __pa(addr);
+
+	if (devres_release(dev, devm_memremap_pages_release, page_map_match,
+				&start) != 0)
+		dev_WARN(dev, "failed to find page map to release\n");
+}
+EXPORT_SYMBOL(devm_memunmap_pages);
+
 /*
  * Unconditionally retrieve a dev_pagemap associated with the given physical
  * address; this is only for use in the arch_{add|remove}_memory() for setting


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 20/20] mm, x86: get_user_pages() for dax mappings
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (18 preceding siblings ...)
  2015-10-10  0:57 ` [PATCH v2 19/20] mm, pmem: devm_memunmap_pages(), truncate and unmap ZONE_DEVICE pages Dan Williams
@ 2015-10-10  0:57 ` Dan Williams
  2015-10-23 21:06 ` [PATCH v2 00/20] " Logan Gunthorpe
  20 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-10  0:57 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Dave Hansen, Andrew Morton, Peter Zijlstra, Dave Chinner,
	linux-kernel, linux-mm, Jeff Moyer, Ingo Molnar, Thomas Gleixner,
	Alexander Viro, H. Peter Anvin, Matthew Wilcox, ross.zwisler,
	hch

A dax mapping establishes a pte with _PAGE_DEVMAP set when the driver
has established a devm_memremap_pages() mapping, i.e. when the pfn_t
returned from ->direct_access() has PFN_DEV and PFN_MAP set.  Later, when
encountering _PAGE_DEVMAP during a page table walk, we look up and pin a
struct dev_pagemap instance to keep the result of pfn_to_page() valid
until put_page().
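
Schematically, the fast gup walk becomes (condensed from the x86 diff
below; a cached @pgmap turns repeat lookups into blind increments):

    struct dev_pagemap *pgmap = NULL;

    /* for each devmap pte/pfn in the range */
    pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
    if (unlikely(!pgmap)) {
        undo_dev_pagemap(nr, nr_start, pages); /* teardown raced us */
        return 0;
    }
    get_page(page);         /* also takes a pgmap reference */
    put_dev_pagemap(pgmap); /* the page reference now pins the pgmap */
    pages[(*nr)++] = page;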

Cc: Dave Hansen <dave@sr71.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/ia64/include/asm/pgtable.h |    1 +
 arch/x86/include/asm/pgtable.h  |    2 +
 arch/x86/mm/gup.c               |   56 +++++++++++++++++++++++++++++++++++++--
 include/linux/mm.h              |   40 +++++++++++++++++++---------
 mm/gup.c                        |   11 +++++++-
 mm/hugetlb.c                    |   18 ++++++++++++-
 mm/swap.c                       |   15 ++++++++++
 7 files changed, 124 insertions(+), 19 deletions(-)

diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h
index 9f3ed9ee8f13..81d2af23958f 100644
--- a/arch/ia64/include/asm/pgtable.h
+++ b/arch/ia64/include/asm/pgtable.h
@@ -273,6 +273,7 @@ extern unsigned long VMALLOC_END;
 #define pmd_clear(pmdp)			(pmd_val(*(pmdp)) = 0UL)
 #define pmd_page_vaddr(pmd)		((unsigned long) __va(pmd_val(pmd) & _PFN_MASK))
 #define pmd_page(pmd)			virt_to_page((pmd_val(pmd) + PAGE_OFFSET))
+#define pmd_pfn(pmd)			(pmd_val(pmd) >> PAGE_SHIFT)
 
 #define pud_none(pud)			(!pud_val(pud))
 #define pud_bad(pud)			(!ia64_phys_addr_valid(pud_val(pud)))
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 84d1346e1cda..d29dc7b4924b 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -461,7 +461,7 @@ static inline int pte_present(pte_t a)
 #define pte_devmap pte_devmap
 static inline int pte_devmap(pte_t a)
 {
-	return pte_flags(a) & _PAGE_DEVMAP;
+	return (pte_flags(a) & _PAGE_DEVMAP) == _PAGE_DEVMAP;
 }
 
 #define pte_accessible pte_accessible
diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c
index 81bf3d2af3eb..7254ba4f791d 100644
--- a/arch/x86/mm/gup.c
+++ b/arch/x86/mm/gup.c
@@ -63,6 +63,16 @@ retry:
 #endif
 }
 
+static void undo_dev_pagemap(int *nr, int nr_start, struct page **pages)
+{
+	while ((*nr) - nr_start) {
+		struct page *page = pages[--(*nr)];
+
+		ClearPageReferenced(page);
+		put_page(page);
+	}
+}
+
 /*
  * The performance critical leaf functions are made noinline otherwise gcc
  * inlines everything into a single function which results in too much
@@ -71,7 +81,9 @@ retry:
 static noinline int gup_pte_range(pmd_t pmd, unsigned long addr,
 		unsigned long end, int write, struct page **pages, int *nr)
 {
+	struct dev_pagemap *pgmap = NULL;
 	unsigned long mask;
+	int nr_start = *nr;
 	pte_t *ptep;
 
 	mask = _PAGE_PRESENT|_PAGE_USER;
@@ -89,13 +101,21 @@ static noinline int gup_pte_range(pmd_t pmd, unsigned long addr,
 			return 0;
 		}
 
-		if ((pte_flags(pte) & (mask | _PAGE_SPECIAL)) != mask) {
+		page = pte_page(pte);
+		if (pte_devmap(pte)) {
+			pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
+			if (unlikely(!pgmap)) {
+				undo_dev_pagemap(nr, nr_start, pages);
+				pte_unmap(ptep);
+				return 0;
+			}
+		} else if ((pte_flags(pte) & (mask | _PAGE_SPECIAL)) != mask) {
 			pte_unmap(ptep);
 			return 0;
 		}
 		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
-		page = pte_page(pte);
 		get_page(page);
+		put_dev_pagemap(pgmap);
 		SetPageReferenced(page);
 		pages[*nr] = page;
 		(*nr)++;
@@ -114,6 +134,32 @@ static inline void get_head_page_multiple(struct page *page, int nr)
 	SetPageReferenced(page);
 }
 
+static int __gup_device_huge_pmd(pmd_t pmd, unsigned long addr,
+		unsigned long end, struct page **pages, int *nr)
+{
+	int nr_start = *nr;
+	unsigned long pfn = pmd_pfn(pmd);
+	struct dev_pagemap *pgmap = NULL;
+
+	pfn += (addr & ~PMD_MASK) >> PAGE_SHIFT;
+	do {
+		struct page *page = pfn_to_page(pfn);
+
+		pgmap = get_dev_pagemap(pfn, pgmap);
+		if (unlikely(!pgmap)) {
+			undo_dev_pagemap(nr, nr_start, pages);
+			return 0;
+		}
+		SetPageReferenced(page);
+		pages[*nr] = page;
+		get_page(page);
+		put_dev_pagemap(pgmap);
+		(*nr)++;
+		pfn++;
+	} while (addr += PAGE_SIZE, addr != end);
+	return 1;
+}
+
 static noinline int gup_huge_pmd(pmd_t pmd, unsigned long addr,
 		unsigned long end, int write, struct page **pages, int *nr)
 {
@@ -127,9 +173,13 @@ static noinline int gup_huge_pmd(pmd_t pmd, unsigned long addr,
 		mask |= _PAGE_RW;
 	if ((pte_flags(pte) & mask) != mask)
 		return 0;
+
+	VM_BUG_ON(!pfn_valid(pmd_pfn(pmd)));
+	if (pmd_devmap(pmd))
+		return __gup_device_huge_pmd(pmd, addr, end, pages, nr);
+
 	/* hugepages are never "special" */
 	VM_BUG_ON(pte_flags(pte) & _PAGE_SPECIAL);
-	VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
 
 	refs = 0;
 	head = pte_page(pte);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index af7597410cb9..a8400652ccbf 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -522,19 +522,6 @@ static inline void get_huge_page_tail(struct page *page)
 
 extern bool __get_page_tail(struct page *page);
 
-static inline void get_page(struct page *page)
-{
-	if (unlikely(PageTail(page)))
-		if (likely(__get_page_tail(page)))
-			return;
-	/*
-	 * Getting a normal page or the head of a compound page
-	 * requires to already have an elevated page->_count.
-	 */
-	VM_BUG_ON_PAGE(atomic_read(&page->_count) <= 0, page);
-	atomic_inc(&page->_count);
-}
-
 static inline struct page *virt_to_head_page(const void *x)
 {
 	struct page *page = virt_to_page(x);
@@ -800,12 +787,22 @@ struct dev_pagemap {
 };
 
 #ifdef CONFIG_ZONE_DEVICE
+static inline bool is_zone_device_page(const struct page *page)
+{
+	return page_zonenum(page) == ZONE_DEVICE;
+}
+
 struct dev_pagemap *__get_dev_pagemap(resource_size_t phys);
 void devm_memunmap_pages(struct device *dev, void *addr);
 void *devm_memremap_pages(struct device *dev, struct resource *res,
 		struct percpu_ref *ref, struct vmem_altmap *altmap);
 struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start);
 #else
+static inline bool is_zone_device_page(const struct page *page)
+{
+	return false;
+}
+
 static inline struct dev_pagemap *__get_dev_pagemap(resource_size_t phys)
 {
 	return NULL;
@@ -864,6 +861,23 @@ static inline void put_dev_pagemap(struct dev_pagemap *pgmap)
 		percpu_ref_put(pgmap->ref);
 }
 
+static inline void get_page(struct page *page)
+{
+	if (unlikely(PageTail(page)))
+		if (likely(__get_page_tail(page)))
+			return;
+
+	if (is_zone_device_page(page))
+		percpu_ref_get(page->pgmap->ref);
+
+	/*
+	 * Getting a normal page or the head of a compound page
+	 * requires to already have an elevated page->_count.
+	 */
+	VM_BUG_ON_PAGE(atomic_read(&page->_count) <= 0, page);
+	atomic_inc(&page->_count);
+}
+
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define SECTION_IN_PAGE_FLAGS
 #endif
diff --git a/mm/gup.c b/mm/gup.c
index a798293fc648..1064e9a489a4 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -98,7 +98,16 @@ retry:
 	}
 
 	page = vm_normal_page(vma, address, pte);
-	if (unlikely(!page)) {
+	if (!page && pte_devmap(pte) && (flags & FOLL_GET)) {
+		/*
+		 * Only return device mapping pages in the FOLL_GET case since
+		 * they are only valid while holding the pgmap reference.
+		 */
+		if (get_dev_pagemap(pte_pfn(pte), NULL))
+			page = pte_page(pte);
+		else
+			goto no_page;
+	} else if (unlikely(!page)) {
 		if (flags & FOLL_DUMP) {
 			/* Avoid special (like zero) pages in core dumps */
 			page = ERR_PTR(-EFAULT);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9cc773483624..6bcc7cdee5a2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4229,7 +4229,23 @@ retry:
 	 */
 	if (!pmd_huge(*pmd))
 		goto out;
-	if (pmd_present(*pmd)) {
+	if (pmd_present(*pmd) && pmd_devmap(*pmd)) {
+		unsigned long pfn = pmd_pfn(*pmd);
+		struct dev_pagemap *pgmap;
+
+		/*
+		 * device mapped pages can only be returned if the
+		 * caller will manage the page reference count.
+		 */
+		if (!(flags & FOLL_GET))
+			goto out;
+		pgmap = get_dev_pagemap(pfn, NULL);
+		if (!pgmap)
+			goto out;
+		page = pfn_to_page(pfn);
+		get_page(page);
+		put_dev_pagemap(pgmap);
+	} else if (pmd_present(*pmd)) {
 		page = pmd_page(*pmd) + ((address & ~PMD_MASK) >> PAGE_SHIFT);
 		if (flags & FOLL_GET)
 			get_page(page);
diff --git a/mm/swap.c b/mm/swap.c
index 983f692a47fd..05a8a51c648e 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -230,6 +230,19 @@ out_put_single:
 	}
 }
 
+static bool put_device_page(struct page *page)
+{
+	/*
+	 * ZONE_DEVICE pages are never "onlined" so their reference
+	 * counts never reach zero.  They are always owned by a device
+	 * driver, not the mm core.  I.e. the page is 'idle' when the
+	 * count is 1.
+	 */
+	VM_BUG_ON_PAGE(atomic_read(&page->_count) == 1, page);
+	put_dev_pagemap(page->pgmap);
+	return atomic_dec_return(&page->_count) == 1;
+}
+
 static void put_compound_page(struct page *page)
 {
 	struct page *page_head;
@@ -273,6 +286,8 @@ void put_page(struct page *page)
 {
 	if (unlikely(PageCompound(page)))
 		put_compound_page(page);
+	else if (is_zone_device_page(page))
+		put_device_page(page);
 	else if (put_page_testzero(page))
 		__put_single_page(page);
 }
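
Net effect on the reference lifecycle of a gup-pinned ZONE_DEVICE page (a
summary sketch, not code from the patch):

    struct page *page = pfn_to_page(pfn);

    get_page(page); /* page->_count 1 -> 2, pgmap->ref gains a count */
    /* ... DMA/RDMA against the page ... */
    put_page(page); /* pgmap->ref drops, page->_count 2 -> 1 (idle) */

    /*
     * Once every page is back to the idle count of 1 and pgmap->ref is
     * dead, devm_memunmap_pages() can reclaim the range.
     */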


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 11/20] kvm: rename pfn_t to kvm_pfn_t
  2015-10-10  0:56 ` [PATCH v2 11/20] kvm: rename pfn_t to kvm_pfn_t Dan Williams
@ 2015-10-10 15:35   ` Christoffer Dall
  2015-10-10 20:35   ` Paolo Bonzini
  1 sibling, 0 replies; 37+ messages in thread
From: Christoffer Dall @ 2015-10-10 15:35 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-nvdimm, Dave Hansen, Russell King, linux-mm, Gleb Natapov,
	Catalin Marinas, Will Deacon, linux-kernel, Ralf Baechle,
	Marc Zyngier, Paul Mackerras, Benjamin Herrenschmidt,
	Paolo Bonzini, ross.zwisler, hch, Alexander Graf

On Fri, Oct 09, 2015 at 08:56:22PM -0400, Dan Williams wrote:
> The core has developed a need for a "pfn_t" type [1].  Move the existing
> pfn_t in KVM to kvm_pfn_t [2].
> 
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
> [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html
> 
> Cc: Dave Hansen <dave@sr71.net>
> Cc: Gleb Natapov <gleb@kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Alexander Graf <agraf@suse.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  arch/arm/include/asm/kvm_mmu.h        |    5 ++--
>  arch/arm/kvm/mmu.c                    |   10 ++++---
>  arch/arm64/include/asm/kvm_mmu.h      |    3 +-
>  arch/mips/include/asm/kvm_host.h      |    6 ++--
>  arch/mips/kvm/emulate.c               |    2 +
>  arch/mips/kvm/tlb.c                   |   14 +++++-----
>  arch/powerpc/include/asm/kvm_book3s.h |    4 +--
>  arch/powerpc/include/asm/kvm_ppc.h    |    2 +
>  arch/powerpc/kvm/book3s.c             |    6 ++--
>  arch/powerpc/kvm/book3s_32_mmu_host.c |    2 +
>  arch/powerpc/kvm/book3s_64_mmu_host.c |    2 +
>  arch/powerpc/kvm/e500.h               |    2 +
>  arch/powerpc/kvm/e500_mmu_host.c      |    8 +++---
>  arch/powerpc/kvm/trace_pr.h           |    2 +
>  arch/x86/kvm/iommu.c                  |   11 ++++----
>  arch/x86/kvm/mmu.c                    |   37 +++++++++++++-------------
>  arch/x86/kvm/mmu_audit.c              |    2 +
>  arch/x86/kvm/paging_tmpl.h            |    6 ++--
>  arch/x86/kvm/vmx.c                    |    2 +
>  arch/x86/kvm/x86.c                    |    2 +
>  include/linux/kvm_host.h              |   37 +++++++++++++-------------
>  include/linux/kvm_types.h             |    2 +
>  virt/kvm/kvm_main.c                   |   47 +++++++++++++++++----------------
>  23 files changed, 110 insertions(+), 104 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 405aa1883307..8ebd282dfc2b 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -182,7 +182,8 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
>  	return (vcpu->arch.cp15[c1_SCTLR] & 0b101) == 0b101;
>  }
>  
> -static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu, pfn_t pfn,
> +static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
> +					       kvm_pfn_t pfn,
>  					       unsigned long size,
>  					       bool ipa_uncached)
>  {
> @@ -246,7 +247,7 @@ static inline void __kvm_flush_dcache_pte(pte_t pte)
>  static inline void __kvm_flush_dcache_pmd(pmd_t pmd)
>  {
>  	unsigned long size = PMD_SIZE;
> -	pfn_t pfn = pmd_pfn(pmd);
> +	kvm_pfn_t pfn = pmd_pfn(pmd);
>  
>  	while (size) {
>  		void *va = kmap_atomic_pfn(pfn);
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 6984342da13d..e2dcbfdc4a8c 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -988,9 +988,9 @@ out:
>  	return ret;
>  }
>  
> -static bool transparent_hugepage_adjust(pfn_t *pfnp, phys_addr_t *ipap)
> +static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
>  {
> -	pfn_t pfn = *pfnp;
> +	kvm_pfn_t pfn = *pfnp;
>  	gfn_t gfn = *ipap >> PAGE_SHIFT;
>  
>  	if (PageTransCompound(pfn_to_page(pfn))) {
> @@ -1202,7 +1202,7 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
>  	kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
>  }
>  
> -static void coherent_cache_guest_page(struct kvm_vcpu *vcpu, pfn_t pfn,
> +static void coherent_cache_guest_page(struct kvm_vcpu *vcpu, kvm_pfn_t pfn,
>  				      unsigned long size, bool uncached)
>  {
>  	__coherent_cache_guest_page(vcpu, pfn, size, uncached);
> @@ -1219,7 +1219,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	struct kvm *kvm = vcpu->kvm;
>  	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
>  	struct vm_area_struct *vma;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  	pgprot_t mem_type = PAGE_S2;
>  	bool fault_ipa_uncached;
>  	bool logging_active = memslot_is_logging(memslot);
> @@ -1347,7 +1347,7 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>  {
>  	pmd_t *pmd;
>  	pte_t *pte;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  	bool pfn_valid = false;
>  
>  	trace_kvm_access_fault(fault_ipa);
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 61505676d085..385fc8cef82d 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -230,7 +230,8 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
>  	return (vcpu_sys_reg(vcpu, SCTLR_EL1) & 0b101) == 0b101;
>  }
>  
> -static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu, pfn_t pfn,
> +static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
> +					       kvm_pfn_t pfn,
>  					       unsigned long size,
>  					       bool ipa_uncached)
>  {
[...]

For the arm/arm64 part:
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 11/20] kvm: rename pfn_t to kvm_pfn_t
  2015-10-10  0:56 ` [PATCH v2 11/20] kvm: rename pfn_t to kvm_pfn_t Dan Williams
  2015-10-10 15:35   ` Christoffer Dall
@ 2015-10-10 20:35   ` Paolo Bonzini
  2015-10-10 20:57     ` Dan Williams
  1 sibling, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2015-10-10 20:35 UTC (permalink / raw)
  To: Dan Williams, linux-nvdimm
  Cc: Dave Hansen, Russell King, linux-mm, Gleb Natapov,
	Catalin Marinas, Will Deacon, linux-kernel, Ralf Baechle,
	Marc Zyngier, Paul Mackerras, Christoffer Dall,
	Benjamin Herrenschmidt, ross.zwisler, hch, Alexander Graf

On 10/10/2015 02:56, Dan Williams wrote:
> The core has developed a need for a "pfn_t" type [1].  Move the existing
> pfn_t in KVM to kvm_pfn_t [2].
> 
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
> [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html

Can you please also change the other types in include/linux/kvm_types.h?

Thanks,

Paolo

> Cc: Dave Hansen <dave@sr71.net>
> Cc: Gleb Natapov <gleb@kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: Alexander Graf <agraf@suse.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  arch/arm/include/asm/kvm_mmu.h        |    5 ++--
>  arch/arm/kvm/mmu.c                    |   10 ++++---
>  arch/arm64/include/asm/kvm_mmu.h      |    3 +-
>  arch/mips/include/asm/kvm_host.h      |    6 ++--
>  arch/mips/kvm/emulate.c               |    2 +
>  arch/mips/kvm/tlb.c                   |   14 +++++-----
>  arch/powerpc/include/asm/kvm_book3s.h |    4 +--
>  arch/powerpc/include/asm/kvm_ppc.h    |    2 +
>  arch/powerpc/kvm/book3s.c             |    6 ++--
>  arch/powerpc/kvm/book3s_32_mmu_host.c |    2 +
>  arch/powerpc/kvm/book3s_64_mmu_host.c |    2 +
>  arch/powerpc/kvm/e500.h               |    2 +
>  arch/powerpc/kvm/e500_mmu_host.c      |    8 +++---
>  arch/powerpc/kvm/trace_pr.h           |    2 +
>  arch/x86/kvm/iommu.c                  |   11 ++++----
>  arch/x86/kvm/mmu.c                    |   37 +++++++++++++-------------
>  arch/x86/kvm/mmu_audit.c              |    2 +
>  arch/x86/kvm/paging_tmpl.h            |    6 ++--
>  arch/x86/kvm/vmx.c                    |    2 +
>  arch/x86/kvm/x86.c                    |    2 +
>  include/linux/kvm_host.h              |   37 +++++++++++++-------------
>  include/linux/kvm_types.h             |    2 +
>  virt/kvm/kvm_main.c                   |   47 +++++++++++++++++----------------
>  23 files changed, 110 insertions(+), 104 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 405aa1883307..8ebd282dfc2b 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -182,7 +182,8 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
>  	return (vcpu->arch.cp15[c1_SCTLR] & 0b101) == 0b101;
>  }
>  
> -static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu, pfn_t pfn,
> +static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
> +					       kvm_pfn_t pfn,
>  					       unsigned long size,
>  					       bool ipa_uncached)
>  {
> @@ -246,7 +247,7 @@ static inline void __kvm_flush_dcache_pte(pte_t pte)
>  static inline void __kvm_flush_dcache_pmd(pmd_t pmd)
>  {
>  	unsigned long size = PMD_SIZE;
> -	pfn_t pfn = pmd_pfn(pmd);
> +	kvm_pfn_t pfn = pmd_pfn(pmd);
>  
>  	while (size) {
>  		void *va = kmap_atomic_pfn(pfn);
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 6984342da13d..e2dcbfdc4a8c 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -988,9 +988,9 @@ out:
>  	return ret;
>  }
>  
> -static bool transparent_hugepage_adjust(pfn_t *pfnp, phys_addr_t *ipap)
> +static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t *ipap)
>  {
> -	pfn_t pfn = *pfnp;
> +	kvm_pfn_t pfn = *pfnp;
>  	gfn_t gfn = *ipap >> PAGE_SHIFT;
>  
>  	if (PageTransCompound(pfn_to_page(pfn))) {
> @@ -1202,7 +1202,7 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
>  	kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask);
>  }
>  
> -static void coherent_cache_guest_page(struct kvm_vcpu *vcpu, pfn_t pfn,
> +static void coherent_cache_guest_page(struct kvm_vcpu *vcpu, kvm_pfn_t pfn,
>  				      unsigned long size, bool uncached)
>  {
>  	__coherent_cache_guest_page(vcpu, pfn, size, uncached);
> @@ -1219,7 +1219,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	struct kvm *kvm = vcpu->kvm;
>  	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
>  	struct vm_area_struct *vma;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  	pgprot_t mem_type = PAGE_S2;
>  	bool fault_ipa_uncached;
>  	bool logging_active = memslot_is_logging(memslot);
> @@ -1347,7 +1347,7 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>  {
>  	pmd_t *pmd;
>  	pte_t *pte;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  	bool pfn_valid = false;
>  
>  	trace_kvm_access_fault(fault_ipa);
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 61505676d085..385fc8cef82d 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -230,7 +230,8 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
>  	return (vcpu_sys_reg(vcpu, SCTLR_EL1) & 0b101) == 0b101;
>  }
>  
> -static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu, pfn_t pfn,
> +static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
> +					       kvm_pfn_t pfn,
>  					       unsigned long size,
>  					       bool ipa_uncached)
>  {
> diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
> index 5a1a882e0a75..9c67f05a0a1b 100644
> --- a/arch/mips/include/asm/kvm_host.h
> +++ b/arch/mips/include/asm/kvm_host.h
> @@ -101,9 +101,9 @@
>  #define CAUSEF_DC			(_ULCAST_(1) << 27)
>  
>  extern atomic_t kvm_mips_instance;
> -extern pfn_t(*kvm_mips_gfn_to_pfn) (struct kvm *kvm, gfn_t gfn);
> -extern void (*kvm_mips_release_pfn_clean) (pfn_t pfn);
> -extern bool(*kvm_mips_is_error_pfn) (pfn_t pfn);
> +extern kvm_pfn_t (*kvm_mips_gfn_to_pfn)(struct kvm *kvm, gfn_t gfn);
> +extern void (*kvm_mips_release_pfn_clean)(kvm_pfn_t pfn);
> +extern bool (*kvm_mips_is_error_pfn)(kvm_pfn_t pfn);
>  
>  struct kvm_vm_stat {
>  	u32 remote_tlb_flush;
> diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c
> index d5fa3eaf39a1..476296cf37d3 100644
> --- a/arch/mips/kvm/emulate.c
> +++ b/arch/mips/kvm/emulate.c
> @@ -1525,7 +1525,7 @@ int kvm_mips_sync_icache(unsigned long va, struct kvm_vcpu *vcpu)
>  	struct kvm *kvm = vcpu->kvm;
>  	unsigned long pa;
>  	gfn_t gfn;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  
>  	gfn = va >> PAGE_SHIFT;
>  
> diff --git a/arch/mips/kvm/tlb.c b/arch/mips/kvm/tlb.c
> index aed0ac2a4972..570479c03bdc 100644
> --- a/arch/mips/kvm/tlb.c
> +++ b/arch/mips/kvm/tlb.c
> @@ -38,13 +38,13 @@ atomic_t kvm_mips_instance;
>  EXPORT_SYMBOL(kvm_mips_instance);
>  
>  /* These function pointers are initialized once the KVM module is loaded */
> -pfn_t (*kvm_mips_gfn_to_pfn)(struct kvm *kvm, gfn_t gfn);
> +kvm_pfn_t (*kvm_mips_gfn_to_pfn)(struct kvm *kvm, gfn_t gfn);
>  EXPORT_SYMBOL(kvm_mips_gfn_to_pfn);
>  
> -void (*kvm_mips_release_pfn_clean)(pfn_t pfn);
> +void (*kvm_mips_release_pfn_clean)(kvm_pfn_t pfn);
>  EXPORT_SYMBOL(kvm_mips_release_pfn_clean);
>  
> -bool (*kvm_mips_is_error_pfn)(pfn_t pfn);
> +bool (*kvm_mips_is_error_pfn)(kvm_pfn_t pfn);
>  EXPORT_SYMBOL(kvm_mips_is_error_pfn);
>  
>  uint32_t kvm_mips_get_kernel_asid(struct kvm_vcpu *vcpu)
> @@ -144,7 +144,7 @@ EXPORT_SYMBOL(kvm_mips_dump_guest_tlbs);
>  static int kvm_mips_map_page(struct kvm *kvm, gfn_t gfn)
>  {
>  	int srcu_idx, err = 0;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  
>  	if (kvm->arch.guest_pmap[gfn] != KVM_INVALID_PAGE)
>  		return 0;
> @@ -262,7 +262,7 @@ int kvm_mips_handle_kseg0_tlb_fault(unsigned long badvaddr,
>  				    struct kvm_vcpu *vcpu)
>  {
>  	gfn_t gfn;
> -	pfn_t pfn0, pfn1;
> +	kvm_pfn_t pfn0, pfn1;
>  	unsigned long vaddr = 0;
>  	unsigned long entryhi = 0, entrylo0 = 0, entrylo1 = 0;
>  	int even;
> @@ -313,7 +313,7 @@ EXPORT_SYMBOL(kvm_mips_handle_kseg0_tlb_fault);
>  int kvm_mips_handle_commpage_tlb_fault(unsigned long badvaddr,
>  	struct kvm_vcpu *vcpu)
>  {
> -	pfn_t pfn0, pfn1;
> +	kvm_pfn_t pfn0, pfn1;
>  	unsigned long flags, old_entryhi = 0, vaddr = 0;
>  	unsigned long entrylo0 = 0, entrylo1 = 0;
>  
> @@ -360,7 +360,7 @@ int kvm_mips_handle_mapped_seg_tlb_fault(struct kvm_vcpu *vcpu,
>  {
>  	unsigned long entryhi = 0, entrylo0 = 0, entrylo1 = 0;
>  	struct kvm *kvm = vcpu->kvm;
> -	pfn_t pfn0, pfn1;
> +	kvm_pfn_t pfn0, pfn1;
>  
>  	if ((tlb->tlb_hi & VPN2_MASK) == 0) {
>  		pfn0 = 0;
> diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
> index 9fac01cb89c1..8f39796c9da8 100644
> --- a/arch/powerpc/include/asm/kvm_book3s.h
> +++ b/arch/powerpc/include/asm/kvm_book3s.h
> @@ -154,8 +154,8 @@ extern void kvmppc_set_bat(struct kvm_vcpu *vcpu, struct kvmppc_bat *bat,
>  			   bool upper, u32 val);
>  extern void kvmppc_giveup_ext(struct kvm_vcpu *vcpu, ulong msr);
>  extern int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu);
> -extern pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
> -			bool *writable);
> +extern kvm_pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa,
> +			bool writing, bool *writable);
>  extern void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
>  			unsigned long *rmap, long pte_index, int realmode);
>  extern void kvmppc_update_rmap_change(unsigned long *rmap, unsigned long psize);
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
> index c6ef05bd0765..2241d5357129 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -515,7 +515,7 @@ void kvmppc_claim_lpid(long lpid);
>  void kvmppc_free_lpid(long lpid);
>  void kvmppc_init_lpid(unsigned long nr_lpids);
>  
> -static inline void kvmppc_mmu_flush_icache(pfn_t pfn)
> +static inline void kvmppc_mmu_flush_icache(kvm_pfn_t pfn)
>  {
>  	struct page *page;
>  	/*
> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> index 099c79d8c160..638c6d9be9e0 100644
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -366,7 +366,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  }
>  EXPORT_SYMBOL_GPL(kvmppc_core_prepare_to_enter);
>  
> -pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
> +kvm_pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
>  			bool *writable)
>  {
>  	ulong mp_pa = vcpu->arch.magic_page_pa & KVM_PAM;
> @@ -379,9 +379,9 @@ pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
>  	gpa &= ~0xFFFULL;
>  	if (unlikely(mp_pa) && unlikely((gpa & KVM_PAM) == mp_pa)) {
>  		ulong shared_page = ((ulong)vcpu->arch.shared) & PAGE_MASK;
> -		pfn_t pfn;
> +		kvm_pfn_t pfn;
>  
> -		pfn = (pfn_t)virt_to_phys((void*)shared_page) >> PAGE_SHIFT;
> +		pfn = (kvm_pfn_t)virt_to_phys((void*)shared_page) >> PAGE_SHIFT;
>  		get_page(pfn_to_page(pfn));
>  		if (writable)
>  			*writable = true;
> diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c b/arch/powerpc/kvm/book3s_32_mmu_host.c
> index d5c9bfeb0c9c..55c4d51ea3e2 100644
> --- a/arch/powerpc/kvm/book3s_32_mmu_host.c
> +++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
> @@ -142,7 +142,7 @@ extern char etext[];
>  int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte,
>  			bool iswrite)
>  {
> -	pfn_t hpaddr;
> +	kvm_pfn_t hpaddr;
>  	u64 vpn;
>  	u64 vsid;
>  	struct kvmppc_sid_map *map;
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
> index 79ad35abd196..913cd2198fa6 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_host.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
> @@ -83,7 +83,7 @@ int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *orig_pte,
>  			bool iswrite)
>  {
>  	unsigned long vpn;
> -	pfn_t hpaddr;
> +	kvm_pfn_t hpaddr;
>  	ulong hash, hpteg;
>  	u64 vsid;
>  	int ret;
> diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
> index 72920bed3ac6..94f04fcb373e 100644
> --- a/arch/powerpc/kvm/e500.h
> +++ b/arch/powerpc/kvm/e500.h
> @@ -41,7 +41,7 @@ enum vcpu_ftr {
>  #define E500_TLB_MAS2_ATTR	(0x7f)
>  
>  struct tlbe_ref {
> -	pfn_t pfn;		/* valid only for TLB0, except briefly */
> +	kvm_pfn_t pfn;		/* valid only for TLB0, except briefly */
>  	unsigned int flags;	/* E500_TLB_* */
>  };
>  
> diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
> index 4d33e199edcc..8a5bb6dfcc2d 100644
> --- a/arch/powerpc/kvm/e500_mmu_host.c
> +++ b/arch/powerpc/kvm/e500_mmu_host.c
> @@ -163,9 +163,9 @@ void kvmppc_map_magic(struct kvm_vcpu *vcpu)
>  	struct kvm_book3e_206_tlb_entry magic;
>  	ulong shared_page = ((ulong)vcpu->arch.shared) & PAGE_MASK;
>  	unsigned int stid;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  
> -	pfn = (pfn_t)virt_to_phys((void *)shared_page) >> PAGE_SHIFT;
> +	pfn = (kvm_pfn_t)virt_to_phys((void *)shared_page) >> PAGE_SHIFT;
>  	get_page(pfn_to_page(pfn));
>  
>  	preempt_disable();
> @@ -246,7 +246,7 @@ static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe)
>  
>  static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
>  					 struct kvm_book3e_206_tlb_entry *gtlbe,
> -					 pfn_t pfn, unsigned int wimg)
> +					 kvm_pfn_t pfn, unsigned int wimg)
>  {
>  	ref->pfn = pfn;
>  	ref->flags = E500_TLB_VALID;
> @@ -309,7 +309,7 @@ static void kvmppc_e500_setup_stlbe(
>  	int tsize, struct tlbe_ref *ref, u64 gvaddr,
>  	struct kvm_book3e_206_tlb_entry *stlbe)
>  {
> -	pfn_t pfn = ref->pfn;
> +	kvm_pfn_t pfn = ref->pfn;
>  	u32 pr = vcpu->arch.shared->msr & MSR_PR;
>  
>  	BUG_ON(!(ref->flags & E500_TLB_VALID));
> diff --git a/arch/powerpc/kvm/trace_pr.h b/arch/powerpc/kvm/trace_pr.h
> index 810507cb688a..d44f324184fb 100644
> --- a/arch/powerpc/kvm/trace_pr.h
> +++ b/arch/powerpc/kvm/trace_pr.h
> @@ -30,7 +30,7 @@ TRACE_EVENT(kvm_book3s_reenter,
>  #ifdef CONFIG_PPC_BOOK3S_64
>  
>  TRACE_EVENT(kvm_book3s_64_mmu_map,
> -	TP_PROTO(int rflags, ulong hpteg, ulong va, pfn_t hpaddr,
> +	TP_PROTO(int rflags, ulong hpteg, ulong va, kvm_pfn_t hpaddr,
>  		 struct kvmppc_pte *orig_pte),
>  	TP_ARGS(rflags, hpteg, va, hpaddr, orig_pte),
>  
> diff --git a/arch/x86/kvm/iommu.c b/arch/x86/kvm/iommu.c
> index 5c520ebf6343..a22a488b4622 100644
> --- a/arch/x86/kvm/iommu.c
> +++ b/arch/x86/kvm/iommu.c
> @@ -43,11 +43,11 @@ static int kvm_iommu_unmap_memslots(struct kvm *kvm);
>  static void kvm_iommu_put_pages(struct kvm *kvm,
>  				gfn_t base_gfn, unsigned long npages);
>  
> -static pfn_t kvm_pin_pages(struct kvm_memory_slot *slot, gfn_t gfn,
> +static kvm_pfn_t kvm_pin_pages(struct kvm_memory_slot *slot, gfn_t gfn,
>  			   unsigned long npages)
>  {
>  	gfn_t end_gfn;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  
>  	pfn     = gfn_to_pfn_memslot(slot, gfn);
>  	end_gfn = gfn + npages;
> @@ -62,7 +62,8 @@ static pfn_t kvm_pin_pages(struct kvm_memory_slot *slot, gfn_t gfn,
>  	return pfn;
>  }
>  
> -static void kvm_unpin_pages(struct kvm *kvm, pfn_t pfn, unsigned long npages)
> +static void kvm_unpin_pages(struct kvm *kvm, kvm_pfn_t pfn,
> +		unsigned long npages)
>  {
>  	unsigned long i;
>  
> @@ -73,7 +74,7 @@ static void kvm_unpin_pages(struct kvm *kvm, pfn_t pfn, unsigned long npages)
>  int kvm_iommu_map_pages(struct kvm *kvm, struct kvm_memory_slot *slot)
>  {
>  	gfn_t gfn, end_gfn;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  	int r = 0;
>  	struct iommu_domain *domain = kvm->arch.iommu_domain;
>  	int flags;
> @@ -275,7 +276,7 @@ static void kvm_iommu_put_pages(struct kvm *kvm,
>  {
>  	struct iommu_domain *domain;
>  	gfn_t end_gfn, gfn;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  	u64 phys;
>  
>  	domain  = kvm->arch.iommu_domain;
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index ff606f507913..6ab963ae0427 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -259,7 +259,7 @@ static unsigned get_mmio_spte_access(u64 spte)
>  }
>  
>  static bool set_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
> -			  pfn_t pfn, unsigned access)
> +			  kvm_pfn_t pfn, unsigned access)
>  {
>  	if (unlikely(is_noslot_pfn(pfn))) {
>  		mark_mmio_spte(vcpu, sptep, gfn, access);
> @@ -325,7 +325,7 @@ static int is_last_spte(u64 pte, int level)
>  	return 0;
>  }
>  
> -static pfn_t spte_to_pfn(u64 pte)
> +static kvm_pfn_t spte_to_pfn(u64 pte)
>  {
>  	return (pte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
>  }
> @@ -587,7 +587,7 @@ static bool mmu_spte_update(u64 *sptep, u64 new_spte)
>   */
>  static int mmu_spte_clear_track_bits(u64 *sptep)
>  {
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  	u64 old_spte = *sptep;
>  
>  	if (!spte_has_volatile_bits(old_spte))
> @@ -1369,7 +1369,7 @@ static int kvm_set_pte_rmapp(struct kvm *kvm, unsigned long *rmapp,
>  	int need_flush = 0;
>  	u64 new_spte;
>  	pte_t *ptep = (pte_t *)data;
> -	pfn_t new_pfn;
> +	kvm_pfn_t new_pfn;
>  
>  	WARN_ON(pte_huge(*ptep));
>  	new_pfn = pte_pfn(*ptep);
> @@ -2456,7 +2456,7 @@ static int mmu_need_write_protect(struct kvm_vcpu *vcpu, gfn_t gfn,
>  	return 0;
>  }
>  
> -static bool kvm_is_mmio_pfn(pfn_t pfn)
> +static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
>  {
>  	if (pfn_valid(pfn))
>  		return !is_zero_pfn(pfn) && PageReserved(pfn_to_page(pfn));
> @@ -2466,7 +2466,7 @@ static bool kvm_is_mmio_pfn(pfn_t pfn)
>  
>  static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  		    unsigned pte_access, int level,
> -		    gfn_t gfn, pfn_t pfn, bool speculative,
> +		    gfn_t gfn, kvm_pfn_t pfn, bool speculative,
>  		    bool can_unsync, bool host_writable)
>  {
>  	u64 spte;
> @@ -2546,7 +2546,7 @@ done:
>  
>  static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  			 unsigned pte_access, int write_fault, int *emulate,
> -			 int level, gfn_t gfn, pfn_t pfn, bool speculative,
> +			 int level, gfn_t gfn, kvm_pfn_t pfn, bool speculative,
>  			 bool host_writable)
>  {
>  	int was_rmapped = 0;
> @@ -2606,7 +2606,7 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  	kvm_release_pfn_clean(pfn);
>  }
>  
> -static pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
> +static kvm_pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
>  				     bool no_dirty_log)
>  {
>  	struct kvm_memory_slot *slot;
> @@ -2689,7 +2689,7 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
>  }
>  
>  static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
> -			int map_writable, int level, gfn_t gfn, pfn_t pfn,
> +			int map_writable, int level, gfn_t gfn, kvm_pfn_t pfn,
>  			bool prefault)
>  {
>  	struct kvm_shadow_walk_iterator iterator;
> @@ -2739,7 +2739,7 @@ static void kvm_send_hwpoison_signal(unsigned long address, struct task_struct *
>  	send_sig_info(SIGBUS, &info, tsk);
>  }
>  
> -static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, pfn_t pfn)
> +static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn)
>  {
>  	/*
>  	 * Do not cache the mmio info caused by writing the readonly gfn
> @@ -2759,9 +2759,10 @@ static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, pfn_t pfn)
>  }
>  
>  static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
> -					gfn_t *gfnp, pfn_t *pfnp, int *levelp)
> +					gfn_t *gfnp, kvm_pfn_t *pfnp,
> +					int *levelp)
>  {
> -	pfn_t pfn = *pfnp;
> +	kvm_pfn_t pfn = *pfnp;
>  	gfn_t gfn = *gfnp;
>  	int level = *levelp;
>  
> @@ -2800,7 +2801,7 @@ static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
>  }
>  
>  static bool handle_abnormal_pfn(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
> -				pfn_t pfn, unsigned access, int *ret_val)
> +				kvm_pfn_t pfn, unsigned access, int *ret_val)
>  {
>  	bool ret = true;
>  
> @@ -2954,7 +2955,7 @@ exit:
>  }
>  
>  static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
> -			 gva_t gva, pfn_t *pfn, bool write, bool *writable);
> +			 gva_t gva, kvm_pfn_t *pfn, bool write, bool *writable);
>  static void make_mmu_pages_available(struct kvm_vcpu *vcpu);
>  
>  static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
> @@ -2963,7 +2964,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, u32 error_code,
>  	int r;
>  	int level;
>  	int force_pt_level;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  	unsigned long mmu_seq;
>  	bool map_writable, write = error_code & PFERR_WRITE_MASK;
>  
> @@ -3435,7 +3436,7 @@ static bool can_do_async_pf(struct kvm_vcpu *vcpu)
>  }
>  
>  static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
> -			 gva_t gva, pfn_t *pfn, bool write, bool *writable)
> +			 gva_t gva, kvm_pfn_t *pfn, bool write, bool *writable)
>  {
>  	struct kvm_memory_slot *slot;
>  	bool async;
> @@ -3473,7 +3474,7 @@ check_hugepage_cache_consistency(struct kvm_vcpu *vcpu, gfn_t gfn, int level)
>  static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
>  			  bool prefault)
>  {
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  	int r;
>  	int level;
>  	int force_pt_level;
> @@ -4627,7 +4628,7 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
>  	u64 *sptep;
>  	struct rmap_iterator iter;
>  	int need_tlb_flush = 0;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  	struct kvm_mmu_page *sp;
>  
>  restart:
> diff --git a/arch/x86/kvm/mmu_audit.c b/arch/x86/kvm/mmu_audit.c
> index 03d518e499a6..37a4d14115c0 100644
> --- a/arch/x86/kvm/mmu_audit.c
> +++ b/arch/x86/kvm/mmu_audit.c
> @@ -97,7 +97,7 @@ static void audit_mappings(struct kvm_vcpu *vcpu, u64 *sptep, int level)
>  {
>  	struct kvm_mmu_page *sp;
>  	gfn_t gfn;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  	hpa_t hpa;
>  
>  	sp = page_header(__pa(sptep));
> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> index 736e6ab8784d..9dd02cb74724 100644
> --- a/arch/x86/kvm/paging_tmpl.h
> +++ b/arch/x86/kvm/paging_tmpl.h
> @@ -456,7 +456,7 @@ FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
>  {
>  	unsigned pte_access;
>  	gfn_t gfn;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  
>  	if (FNAME(prefetch_invalid_gpte)(vcpu, sp, spte, gpte))
>  		return false;
> @@ -551,7 +551,7 @@ static void FNAME(pte_prefetch)(struct kvm_vcpu *vcpu, struct guest_walker *gw,
>  static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
>  			 struct guest_walker *gw,
>  			 int write_fault, int hlevel,
> -			 pfn_t pfn, bool map_writable, bool prefault)
> +			 kvm_pfn_t pfn, bool map_writable, bool prefault)
>  {
>  	struct kvm_mmu_page *sp = NULL;
>  	struct kvm_shadow_walk_iterator it;
> @@ -696,7 +696,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr, u32 error_code,
>  	int user_fault = error_code & PFERR_USER_MASK;
>  	struct guest_walker walker;
>  	int r;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  	int level = PT_PAGE_TABLE_LEVEL;
>  	int force_pt_level;
>  	unsigned long mmu_seq;
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 06ef4908ba61..d401ed6874bd 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -4046,7 +4046,7 @@ out:
>  static int init_rmode_identity_map(struct kvm *kvm)
>  {
>  	int i, idx, r = 0;
> -	pfn_t identity_map_pfn;
> +	kvm_pfn_t identity_map_pfn;
>  	u32 tmp;
>  
>  	if (!enable_ept)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 92511d4b7236..8fc5ca584edf 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4935,7 +4935,7 @@ static bool reexecute_instruction(struct kvm_vcpu *vcpu, gva_t cr2,
>  				  int emulation_type)
>  {
>  	gpa_t gpa = cr2;
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  
>  	if (emulation_type & EMULTYPE_NO_REEXECUTE)
>  		return false;
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 1bef9e21e725..2420b43f3acc 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -65,7 +65,7 @@
>   * error pfns indicate that the gfn is in slot but faild to
>   * translate it to pfn on host.
>   */
> -static inline bool is_error_pfn(pfn_t pfn)
> +static inline bool is_error_pfn(kvm_pfn_t pfn)
>  {
>  	return !!(pfn & KVM_PFN_ERR_MASK);
>  }
> @@ -75,13 +75,13 @@ static inline bool is_error_pfn(pfn_t pfn)
>   * translated to pfn - it is not in slot or failed to
>   * translate it to pfn.
>   */
> -static inline bool is_error_noslot_pfn(pfn_t pfn)
> +static inline bool is_error_noslot_pfn(kvm_pfn_t pfn)
>  {
>  	return !!(pfn & KVM_PFN_ERR_NOSLOT_MASK);
>  }
>  
>  /* noslot pfn indicates that the gfn is not in slot. */
> -static inline bool is_noslot_pfn(pfn_t pfn)
> +static inline bool is_noslot_pfn(kvm_pfn_t pfn)
>  {
>  	return pfn == KVM_PFN_NOSLOT;
>  }
> @@ -569,19 +569,20 @@ void kvm_release_page_clean(struct page *page);
>  void kvm_release_page_dirty(struct page *page);
>  void kvm_set_page_accessed(struct page *page);
>  
> -pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn);
> -pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn);
> -pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
> +kvm_pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn);
> +kvm_pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn);
> +kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
>  		      bool *writable);
> -pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
> -pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn);
> -pfn_t __gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn, bool atomic,
> -			   bool *async, bool write_fault, bool *writable);
> +kvm_pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
> +kvm_pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn);
> +kvm_pfn_t __gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn,
> +			       bool atomic, bool *async, bool write_fault,
> +			       bool *writable);
>  
> -void kvm_release_pfn_clean(pfn_t pfn);
> -void kvm_set_pfn_dirty(pfn_t pfn);
> -void kvm_set_pfn_accessed(pfn_t pfn);
> -void kvm_get_pfn(pfn_t pfn);
> +void kvm_release_pfn_clean(kvm_pfn_t pfn);
> +void kvm_set_pfn_dirty(kvm_pfn_t pfn);
> +void kvm_set_pfn_accessed(kvm_pfn_t pfn);
> +void kvm_get_pfn(kvm_pfn_t pfn);
>  
>  int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
>  			int len);
> @@ -607,8 +608,8 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn);
>  
>  struct kvm_memslots *kvm_vcpu_memslots(struct kvm_vcpu *vcpu);
>  struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn);
> -pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn);
> -pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn);
> +kvm_pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn);
> +kvm_pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn);
>  struct page *kvm_vcpu_gfn_to_page(struct kvm_vcpu *vcpu, gfn_t gfn);
>  unsigned long kvm_vcpu_gfn_to_hva(struct kvm_vcpu *vcpu, gfn_t gfn);
>  unsigned long kvm_vcpu_gfn_to_hva_prot(struct kvm_vcpu *vcpu, gfn_t gfn, bool *writable);
> @@ -789,7 +790,7 @@ void kvm_arch_sync_events(struct kvm *kvm);
>  int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu);
>  void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
>  
> -bool kvm_is_reserved_pfn(pfn_t pfn);
> +bool kvm_is_reserved_pfn(kvm_pfn_t pfn);
>  
>  struct kvm_irq_ack_notifier {
>  	struct hlist_node link;
> @@ -940,7 +941,7 @@ static inline gfn_t gpa_to_gfn(gpa_t gpa)
>  	return (gfn_t)(gpa >> PAGE_SHIFT);
>  }
>  
> -static inline hpa_t pfn_to_hpa(pfn_t pfn)
> +static inline hpa_t pfn_to_hpa(kvm_pfn_t pfn)
>  {
>  	return (hpa_t)pfn << PAGE_SHIFT;
>  }
> diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
> index 1b47a185c2f0..8bf259dae9f6 100644
> --- a/include/linux/kvm_types.h
> +++ b/include/linux/kvm_types.h
> @@ -53,7 +53,7 @@ typedef unsigned long  hva_t;
>  typedef u64            hpa_t;
>  typedef u64            hfn_t;
>  
> -typedef hfn_t pfn_t;
> +typedef hfn_t kvm_pfn_t;
>  
>  struct gfn_to_hva_cache {
>  	u64 generation;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 8db1d9361993..02cd2eddd3ff 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -111,7 +111,7 @@ static void hardware_disable_all(void);
>  
>  static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
>  
> -static void kvm_release_pfn_dirty(pfn_t pfn);
> +static void kvm_release_pfn_dirty(kvm_pfn_t pfn);
>  static void mark_page_dirty_in_slot(struct kvm_memory_slot *memslot, gfn_t gfn);
>  
>  __visible bool kvm_rebooting;
> @@ -119,7 +119,7 @@ EXPORT_SYMBOL_GPL(kvm_rebooting);
>  
>  static bool largepages_enabled = true;
>  
> -bool kvm_is_reserved_pfn(pfn_t pfn)
> +bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
>  {
>  	if (pfn_valid(pfn))
>  		return PageReserved(pfn_to_page(pfn));
> @@ -1296,7 +1296,7 @@ static inline int check_user_page_hwpoison(unsigned long addr)
>   * true indicates success, otherwise false is returned.
>   */
>  static bool hva_to_pfn_fast(unsigned long addr, bool atomic, bool *async,
> -			    bool write_fault, bool *writable, pfn_t *pfn)
> +			    bool write_fault, bool *writable, kvm_pfn_t *pfn)
>  {
>  	struct page *page[1];
>  	int npages;
> @@ -1329,7 +1329,7 @@ static bool hva_to_pfn_fast(unsigned long addr, bool atomic, bool *async,
>   * 1 indicates success, -errno is returned if error is detected.
>   */
>  static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
> -			   bool *writable, pfn_t *pfn)
> +			   bool *writable, kvm_pfn_t *pfn)
>  {
>  	struct page *page[1];
>  	int npages = 0;
> @@ -1393,11 +1393,11 @@ static bool vma_is_valid(struct vm_area_struct *vma, bool write_fault)
>   * 2): @write_fault = false && @writable, @writable will tell the caller
>   *     whether the mapping is writable.
>   */
> -static pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
> +static kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
>  			bool write_fault, bool *writable)
>  {
>  	struct vm_area_struct *vma;
> -	pfn_t pfn = 0;
> +	kvm_pfn_t pfn = 0;
>  	int npages;
>  
>  	/* we can do it either atomically or asynchronously, not both */
> @@ -1438,8 +1438,9 @@ exit:
>  	return pfn;
>  }
>  
> -pfn_t __gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn, bool atomic,
> -			   bool *async, bool write_fault, bool *writable)
> +kvm_pfn_t __gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn,
> +			       bool atomic, bool *async, bool write_fault,
> +			       bool *writable)
>  {
>  	unsigned long addr = __gfn_to_hva_many(slot, gfn, NULL, write_fault);
>  
> @@ -1460,7 +1461,7 @@ pfn_t __gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn, bool atomic,
>  }
>  EXPORT_SYMBOL_GPL(__gfn_to_pfn_memslot);
>  
> -pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
> +kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
>  		      bool *writable)
>  {
>  	return __gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn, false, NULL,
> @@ -1468,37 +1469,37 @@ pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
>  }
>  EXPORT_SYMBOL_GPL(gfn_to_pfn_prot);
>  
> -pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
> +kvm_pfn_t gfn_to_pfn_memslot(struct kvm_memory_slot *slot, gfn_t gfn)
>  {
>  	return __gfn_to_pfn_memslot(slot, gfn, false, NULL, true, NULL);
>  }
>  EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot);
>  
> -pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn)
> +kvm_pfn_t gfn_to_pfn_memslot_atomic(struct kvm_memory_slot *slot, gfn_t gfn)
>  {
>  	return __gfn_to_pfn_memslot(slot, gfn, true, NULL, true, NULL);
>  }
>  EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic);
>  
> -pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn)
> +kvm_pfn_t gfn_to_pfn_atomic(struct kvm *kvm, gfn_t gfn)
>  {
>  	return gfn_to_pfn_memslot_atomic(gfn_to_memslot(kvm, gfn), gfn);
>  }
>  EXPORT_SYMBOL_GPL(gfn_to_pfn_atomic);
>  
> -pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn)
> +kvm_pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn)
>  {
>  	return gfn_to_pfn_memslot_atomic(kvm_vcpu_gfn_to_memslot(vcpu, gfn), gfn);
>  }
>  EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_pfn_atomic);
>  
> -pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn)
> +kvm_pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn)
>  {
>  	return gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn);
>  }
>  EXPORT_SYMBOL_GPL(gfn_to_pfn);
>  
> -pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn)
> +kvm_pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn)
>  {
>  	return gfn_to_pfn_memslot(kvm_vcpu_gfn_to_memslot(vcpu, gfn), gfn);
>  }
> @@ -1521,7 +1522,7 @@ int gfn_to_page_many_atomic(struct kvm_memory_slot *slot, gfn_t gfn,
>  }
>  EXPORT_SYMBOL_GPL(gfn_to_page_many_atomic);
>  
> -static struct page *kvm_pfn_to_page(pfn_t pfn)
> +static struct page *kvm_pfn_to_page(kvm_pfn_t pfn)
>  {
>  	if (is_error_noslot_pfn(pfn))
>  		return KVM_ERR_PTR_BAD_PAGE;
> @@ -1536,7 +1537,7 @@ static struct page *kvm_pfn_to_page(pfn_t pfn)
>  
>  struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
>  {
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  
>  	pfn = gfn_to_pfn(kvm, gfn);
>  
> @@ -1546,7 +1547,7 @@ EXPORT_SYMBOL_GPL(gfn_to_page);
>  
>  struct page *kvm_vcpu_gfn_to_page(struct kvm_vcpu *vcpu, gfn_t gfn)
>  {
> -	pfn_t pfn;
> +	kvm_pfn_t pfn;
>  
>  	pfn = kvm_vcpu_gfn_to_pfn(vcpu, gfn);
>  
> @@ -1562,7 +1563,7 @@ void kvm_release_page_clean(struct page *page)
>  }
>  EXPORT_SYMBOL_GPL(kvm_release_page_clean);
>  
> -void kvm_release_pfn_clean(pfn_t pfn)
> +void kvm_release_pfn_clean(kvm_pfn_t pfn)
>  {
>  	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn))
>  		put_page(pfn_to_page(pfn));
> @@ -1577,13 +1578,13 @@ void kvm_release_page_dirty(struct page *page)
>  }
>  EXPORT_SYMBOL_GPL(kvm_release_page_dirty);
>  
> -static void kvm_release_pfn_dirty(pfn_t pfn)
> +static void kvm_release_pfn_dirty(kvm_pfn_t pfn)
>  {
>  	kvm_set_pfn_dirty(pfn);
>  	kvm_release_pfn_clean(pfn);
>  }
>  
> -void kvm_set_pfn_dirty(pfn_t pfn)
> +void kvm_set_pfn_dirty(kvm_pfn_t pfn)
>  {
>  	if (!kvm_is_reserved_pfn(pfn)) {
>  		struct page *page = pfn_to_page(pfn);
> @@ -1594,14 +1595,14 @@ void kvm_set_pfn_dirty(pfn_t pfn)
>  }
>  EXPORT_SYMBOL_GPL(kvm_set_pfn_dirty);
>  
> -void kvm_set_pfn_accessed(pfn_t pfn)
> +void kvm_set_pfn_accessed(kvm_pfn_t pfn)
>  {
>  	if (!kvm_is_reserved_pfn(pfn))
>  		mark_page_accessed(pfn_to_page(pfn));
>  }
>  EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
>  
> -void kvm_get_pfn(pfn_t pfn)
> +void kvm_get_pfn(kvm_pfn_t pfn)
>  {
>  	if (!kvm_is_reserved_pfn(pfn))
>  		get_page(pfn_to_page(pfn));
> 

* Re: [PATCH v2 11/20] kvm: rename pfn_t to kvm_pfn_t
  2015-10-10 20:35   ` Paolo Bonzini
@ 2015-10-10 20:57     ` Dan Williams
  2015-10-12 12:51       ` Paolo Bonzini
  0 siblings, 1 reply; 37+ messages in thread
From: Dan Williams @ 2015-10-10 20:57 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-nvdimm, Dave Hansen, Russell King, Linux MM, Gleb Natapov,
	Catalin Marinas, Will Deacon, linux-kernel, Ralf Baechle,
	Marc Zyngier, Paul Mackerras, Christoffer Dall,
	Benjamin Herrenschmidt, Ross Zwisler, Christoph Hellwig,
	Alexander Graf

On Sat, Oct 10, 2015 at 1:35 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 10/10/2015 02:56, Dan Williams wrote:
>> The core has developed a need for a "pfn_t" type [1].  Move the existing
>> pfn_t in KVM to kvm_pfn_t [2].
>>
>> [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
>> [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html
>
> Can you please change also the other types in include/linux/kvm_types.h?

Hmm, all those seem kvm specific already.  I'd only prefix them with
kvm_ if they collided with a "core" type.


* Re: [PATCH v2 01/20] block: generic request_queue reference counting
  2015-10-10  0:55 ` [PATCH v2 01/20] block: generic request_queue reference counting Dan Williams
@ 2015-10-11 12:59   ` Christoph Hellwig
  2015-10-13  0:09     ` Dan Williams
  0 siblings, 1 reply; 37+ messages in thread
From: Christoph Hellwig @ 2015-10-11 12:59 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-nvdimm, Jens Axboe, linux-kernel, Keith Busch, linux-mm,
	ross.zwisler

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

We could still clean up the draining, or only release the reference on
bio_done, but let's do that separately and get this infrastructure in
ASAP.
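
For readers following along, the infrastructure under review amounts to
guarding request_queue usage with a percpu_ref so that teardown can refuse
new entrants and drain in-flight users.  A minimal sketch of the pattern,
with illustrative names rather than the patch's actual helpers:

#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/kernel.h>
#include <linux/percpu-refcount.h>
#include <linux/wait.h>

/* Illustrative sketch only; not the patch's definitive API. */
struct example_queue {
	struct percpu_ref q_usage_counter;	/* one reference per in-flight user */
	wait_queue_head_t freeze_wq;
};

/* release callback: the last reference was dropped, wake the drainer */
static void example_q_release(struct percpu_ref *ref)
{
	struct example_queue *q =
		container_of(ref, struct example_queue, q_usage_counter);

	wake_up_all(&q->freeze_wq);
}

static int example_queue_init(struct example_queue *q)
{
	init_waitqueue_head(&q->freeze_wq);
	return percpu_ref_init(&q->q_usage_counter, example_q_release,
			       0, GFP_KERNEL);
}

/* take a reference before using the queue; fails once the queue is dying */
static int example_queue_enter(struct example_queue *q)
{
	return percpu_ref_tryget_live(&q->q_usage_counter) ? 0 : -ENODEV;
}

static void example_queue_exit(struct example_queue *q)
{
	percpu_ref_put(&q->q_usage_counter);
}

/* teardown: refuse new users, then wait for the outstanding ones to drain */
static void example_queue_freeze(struct example_queue *q)
{
	percpu_ref_kill(&q->q_usage_counter);
	wait_event(q->freeze_wq, percpu_ref_is_zero(&q->q_usage_counter));
}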


* Re: [PATCH v2 11/20] kvm: rename pfn_t to kvm_pfn_t
  2015-10-10 20:57     ` Dan Williams
@ 2015-10-12 12:51       ` Paolo Bonzini
  2015-10-12 16:16         ` Dan Williams
  0 siblings, 1 reply; 37+ messages in thread
From: Paolo Bonzini @ 2015-10-12 12:51 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-nvdimm@lists.01.org, Dave Hansen, Russell King, Linux MM,
	Gleb Natapov, Catalin Marinas, Will Deacon, linux-kernel,
	Ralf Baechle, Marc Zyngier, Paul Mackerras, Christoffer Dall,
	Benjamin Herrenschmidt, Ross Zwisler, Christoph Hellwig,
	Alexander Graf, KVM list



On 10/10/2015 22:57, Dan Williams wrote:
> On Sat, Oct 10, 2015 at 1:35 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 10/10/2015 02:56, Dan Williams wrote:
>>> The core has developed a need for a "pfn_t" type [1].  Move the existing
>>> pfn_t in KVM to kvm_pfn_t [2].
>>>
>>> [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
>>> [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html
>>
>> Can you please change also the other types in include/linux/kvm_types.h?
> 
> Hmm, all those seem kvm specific already.  I'd only prefix them with
> kvm_ if they collided with a "core" type.

But they are all related and the code becomes uglier if you only prefix
one of them.  If you don't convert all of them, I will do it anyway as
soon as this patch gets in.

Since it touches a lot of KVM files, we should synchronize in order to
avoid conflicts and gnashing of teeth.  What tree is this patch going
in?  You could provide me a commit SHA1 for this patch (well, its
definitive version) based on Linus's tree (so that I can merge it in my
tree as well), or I could commit it and provide the SHA1 to the
maintainer of said tree.

Paolo
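
For concreteness, the broader rename Paolo is suggesting would touch the
whole family of typedefs in include/linux/kvm_types.h, roughly as sketched
below.  This is hypothetical; only the last line corresponds to the rename
the patch actually performs:

/* hypothetical full prefixing of include/linux/kvm_types.h */
typedef unsigned long  kvm_gva_t;	/* was gva_t: guest virtual address */
typedef u64            kvm_gpa_t;	/* was gpa_t: guest physical address */
typedef u64            kvm_gfn_t;	/* was gfn_t: guest frame number */
typedef unsigned long  kvm_hva_t;	/* was hva_t: host virtual address */
typedef u64            kvm_hpa_t;	/* was hpa_t: host physical address */
typedef u64            kvm_hfn_t;	/* was hfn_t: host frame number */
typedef kvm_hfn_t      kvm_pfn_t;	/* the rename this patch performs */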


* Re: [PATCH v2 11/20] kvm: rename pfn_t to kvm_pfn_t
  2015-10-12 12:51       ` Paolo Bonzini
@ 2015-10-12 16:16         ` Dan Williams
  0 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-12 16:16 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-nvdimm@lists.01.org, Dave Hansen, Russell King, Linux MM,
	Gleb Natapov, Catalin Marinas, Will Deacon, linux-kernel,
	Ralf Baechle, Marc Zyngier, Paul Mackerras, Christoffer Dall,
	Benjamin Herrenschmidt, Ross Zwisler, Christoph Hellwig,
	Alexander Graf, KVM list

On Mon, Oct 12, 2015 at 5:51 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 10/10/2015 22:57, Dan Williams wrote:
>> On Sat, Oct 10, 2015 at 1:35 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 10/10/2015 02:56, Dan Williams wrote:
>>>> The core has developed a need for a "pfn_t" type [1].  Move the existing
>>>> pfn_t in KVM to kvm_pfn_t [2].
>>>>
>>>> [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
>>>> [2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html
>>>
>>> Can you please change also the other types in include/linux/kvm_types.h?
>>
>> Hmm, all those seem kvm specific already.  I'd only prefix them with
>> kvm_ if they collided with a "core" type.
>
> But they are all related and the code becomes uglier if you only prefix
> one of them.  If you don't convert all of them, I will do it anyway as
> soon as this patch gets in.

Ok.

> Since it touches a lot of KVM files, we should synchronize in order to
> avoid conflicts and gnashing of teeth.  What tree is this patch going
> in?  You could provide me a commit SHA1 for this patch (well, its
> definitive version) based on Linus's tree (so that I can merge it in my
> tree as well), or I could commit it and provide the SHA1 to the
> maintainer of said tree.

The kvm_pfn_t conversion is only needed if the new pfn_t
infrastructure moves forward, and at this point it still needs some
review feedback.

How about this: care to send conversion patches for the rest? ...based on:

    https://git.kernel.org/cgit/linux/kernel/git/djbw/nvdimm.git/log/?h=libnvdimm-pending

When/if the new pfn_t bits move forward I'll carry them in the same
pull request through the nvdimm.git tree.


* Re: [PATCH v2 01/20] block: generic request_queue reference counting
  2015-10-11 12:59   ` Christoph Hellwig
@ 2015-10-13  0:09     ` Dan Williams
  0 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-10-13  0:09 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: linux-nvdimm, linux-kernel, Keith Busch, Linux MM, Ross Zwisler

On Sun, Oct 11, 2015 at 5:59 AM, Christoph Hellwig <hch@lst.de> wrote:
> Looks good,
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
> We could still clean up draing or only release the reference on
> bio_done, but let's do that separately and get this infrastructure in
> ASAP.

Thanks Christoph.

Jens, do you want to take this, or ok for me to take this through nvdimm.git?


* Re: [PATCH v2 05/20] x86, mm: introduce vmem_altmap to augment vmemmap_populate()
  2015-10-10  0:55 ` [PATCH v2 05/20] x86, mm: introduce vmem_altmap to augment vmemmap_populate() Dan Williams
@ 2015-10-19 22:53   ` Williams, Dan J
  0 siblings, 0 replies; 37+ messages in thread
From: Williams, Dan J @ 2015-10-19 22:53 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: linux-kernel, linux-mm, dave.hansen, hch, akpm, hpa, mingo, ross.zwisler

On Fri, 2015-10-09 at 20:55 -0400, Dan Williams wrote:
> In support of providing struct page for large persistent memory
> capacities, use struct vmem_altmap to change the default policy for
> allocating memory for the memmap array.  The default vmemmap_populate()
> allocates page table storage area from the page allocator.  Given
> persistent memory capacities relative to DRAM it may not be feasible to
> store the memmap in 'System Memory'.  Instead vmem_altmap represents
> pre-allocated "device pages" to satisfy vmemmap_alloc_block_buf()
> requests.
> 
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---

The kbuild test robot reported a crash with this patch when
CONFIG_ZONE_DEVICE=y && CONFIG_SPARSEMEM_VMEMMAP=n.  The ability to
specify an alternate location for the vmemmap needs to be gated on
CONFIG_SPARSEMEM_VMEMMAP=y.

Here's a refreshed patch with ifdef guards and a warning message if the
@altmap arg is passed to devm_memremap_pages() on a
CONFIG_SPARSEMEM_VMEMMAP=n kernel.


8<----
Subject: x86, mm: introduce vmem_altmap to augment vmemmap_populate()

From: Dan Williams <dan.j.williams@intel.com>

In support of providing struct page for large persistent memory
capacities, use struct vmem_altmap to change the default policy for
allocating memory for the memmap array.  The default vmemmap_populate()
allocates page table storage area from the page allocator.  Given
persistent memory capacities relative to DRAM it may not be feasible to
store the memmap in 'System Memory'.  Instead vmem_altmap represents
pre-allocated "device pages" to satisfy vmemmap_alloc_block_buf()
requests.

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/m68k/include/asm/page_mm.h |    1 
 arch/m68k/include/asm/page_no.h |    1 
 arch/mn10300/include/asm/page.h |    1 
 arch/x86/mm/init_64.c           |   32 ++++++++++---
 drivers/nvdimm/pmem.c           |    6 ++
 include/linux/io.h              |   17 -------
 include/linux/memory_hotplug.h  |    3 +
 include/linux/mm.h              |   98 ++++++++++++++++++++++++++++++++++++++-
 kernel/memremap.c               |   77 +++++++++++++++++++++++++++----
 mm/memory_hotplug.c             |   66 +++++++++++++++++++-------
 mm/page_alloc.c                 |   10 ++++
 mm/sparse-vmemmap.c             |   37 ++++++++++++++-
 mm/sparse.c                     |    8 ++-
 13 files changed, 294 insertions(+), 63 deletions(-)

diff --git a/arch/m68k/include/asm/page_mm.h b/arch/m68k/include/asm/page_mm.h
index 5029f73e6294..884f2f7e4caf 100644
--- a/arch/m68k/include/asm/page_mm.h
+++ b/arch/m68k/include/asm/page_mm.h
@@ -125,6 +125,7 @@ static inline void *__va(unsigned long x)
  */
 #define virt_to_pfn(kaddr)	(__pa(kaddr) >> PAGE_SHIFT)
 #define pfn_to_virt(pfn)	__va((pfn) << PAGE_SHIFT)
+#define	__pfn_to_phys(pfn)	PFN_PHYS(pfn)
 
 extern int m68k_virt_to_node_shift;
 
diff --git a/arch/m68k/include/asm/page_no.h b/arch/m68k/include/asm/page_no.h
index ef209169579a..7845eca0b36d 100644
--- a/arch/m68k/include/asm/page_no.h
+++ b/arch/m68k/include/asm/page_no.h
@@ -24,6 +24,7 @@ extern unsigned long memory_end;
 
 #define virt_to_pfn(kaddr)	(__pa(kaddr) >> PAGE_SHIFT)
 #define pfn_to_virt(pfn)	__va((pfn) << PAGE_SHIFT)
+#define	__pfn_to_phys(pfn)	PFN_PHYS(pfn)
 
 #define virt_to_page(addr)	(mem_map + (((unsigned long)(addr)-PAGE_OFFSET) >> PAGE_SHIFT))
 #define page_to_virt(page)	__va(((((page) - mem_map) << PAGE_SHIFT) + PAGE_OFFSET))
diff --git a/arch/mn10300/include/asm/page.h b/arch/mn10300/include/asm/page.h
index 8288e124165b..3810a6f740fd 100644
--- a/arch/mn10300/include/asm/page.h
+++ b/arch/mn10300/include/asm/page.h
@@ -107,6 +107,7 @@ static inline int get_order(unsigned long size)
 #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
 #define pfn_to_page(pfn)	(mem_map + ((pfn) - __pfn_disp))
 #define page_to_pfn(page)	((unsigned long)((page) - mem_map) + __pfn_disp)
+#define __pfn_to_phys(pfn)	PFN_PHYS(pfn)
 
 #define pfn_valid(pfn)					\
 ({							\
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index e5d42f1a2a71..cabf8ceb0a6b 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -714,6 +714,12 @@ static void __meminit free_pagetable(struct page *page, int order)
 {
 	unsigned long magic;
 	unsigned int nr_pages = 1 << order;
+	struct vmem_altmap *altmap = to_vmem_altmap((unsigned long) page);
+
+	if (altmap) {
+		vmem_altmap_free(altmap, nr_pages);
+		return;
+	}
 
 	/* bootmem page has reserved flag */
 	if (PageReserved(page)) {
@@ -1018,13 +1024,19 @@ int __ref arch_remove_memory(u64 start, u64 size)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
+	struct page *page = pfn_to_page(start_pfn);
+	struct vmem_altmap *altmap;
 	struct zone *zone;
 	int ret;
 
-	zone = page_zone(pfn_to_page(start_pfn));
-	kernel_physical_mapping_remove(start, start + size);
+	/* With altmap the first mapped page is offset from @start */
+	altmap = to_vmem_altmap((unsigned long) page);
+	if (altmap)
+		page += vmem_altmap_offset(altmap);
+	zone = page_zone(page);
 	ret = __remove_pages(zone, start_pfn, nr_pages);
 	WARN_ON_ONCE(ret);
+	kernel_physical_mapping_remove(start, start + size);
 
 	return ret;
 }
@@ -1234,7 +1246,7 @@ static void __meminitdata *p_start, *p_end;
 static int __meminitdata node_start;
 
 static int __meminit vmemmap_populate_hugepages(unsigned long start,
-						unsigned long end, int node)
+		unsigned long end, int node, struct vmem_altmap *altmap)
 {
 	unsigned long addr;
 	unsigned long next;
@@ -1257,7 +1269,7 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
 		if (pmd_none(*pmd)) {
 			void *p;
 
-			p = vmemmap_alloc_block_buf(PMD_SIZE, node);
+			p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
 			if (p) {
 				pte_t entry;
 
@@ -1278,7 +1290,8 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
 				addr_end = addr + PMD_SIZE;
 				p_end = p + PMD_SIZE;
 				continue;
-			}
+			} else if (altmap)
+				return -ENOMEM; /* no fallback */
 		} else if (pmd_large(*pmd)) {
 			vmemmap_verify((pte_t *)pmd, node, addr, next);
 			continue;
@@ -1292,11 +1305,16 @@ static int __meminit vmemmap_populate_hugepages(unsigned long start,
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 {
+	struct vmem_altmap *altmap = to_vmem_altmap(start);
 	int err;
 
 	if (cpu_has_pse)
-		err = vmemmap_populate_hugepages(start, end, node);
-	else
+		err = vmemmap_populate_hugepages(start, end, node, altmap);
+	else if (altmap) {
+		pr_err_once("%s: no cpu support for altmap allocations\n",
+				__func__);
+		err = -ENOMEM;
+	} else
 		err = vmemmap_populate_basepages(start, end, node);
 	if (!err)
 		sync_global_pgds(start, end - 1, 0);
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 349f03e7ed06..3c5b8f585441 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -151,7 +151,8 @@ static struct pmem_device *pmem_alloc(struct device *dev,
 	}
 
 	if (pmem_should_map_pages(dev))
-		pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, res);
+		pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, res,
+				NULL);
 	else
 		pmem->virt_addr = (void __pmem *) devm_memremap(dev,
 				pmem->phys_addr, pmem->size,
@@ -362,7 +363,8 @@ static int nvdimm_namespace_attach_pfn(struct nd_namespace_common *ndns)
 	/* establish pfn range for lookup, and switch to direct map */
 	pmem = dev_get_drvdata(dev);
 	devm_memunmap(dev, (void __force *) pmem->virt_addr);
-	pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, &nsio->res);
+	pmem->virt_addr = (void __pmem *) devm_memremap_pages(dev, &nsio->res,
+			NULL);
 	if (IS_ERR(pmem->virt_addr)) {
 		rc = PTR_ERR(pmem->virt_addr);
 		goto err;
diff --git a/include/linux/io.h b/include/linux/io.h
index de64c1e53612..2f2f8859abd9 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -87,23 +87,6 @@ void *devm_memremap(struct device *dev, resource_size_t offset,
 		size_t size, unsigned long flags);
 void devm_memunmap(struct device *dev, void *addr);
 
-void *__devm_memremap_pages(struct device *dev, struct resource *res);
-
-#ifdef CONFIG_ZONE_DEVICE
-void *devm_memremap_pages(struct device *dev, struct resource *res);
-#else
-static inline void *devm_memremap_pages(struct device *dev, struct resource *res)
-{
-	/*
-	 * Fail attempts to call devm_memremap_pages() without
-	 * ZONE_DEVICE support enabled, this requires callers to fall
-	 * back to plain devm_memremap() based on config
-	 */
-	WARN_ON_ONCE(1);
-	return ERR_PTR(-ENXIO);
-}
-#endif
-
 /*
  * Some systems do not have legacy ISA devices.
  * /dev/port is not a valid interface on these systems.
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 8f60e899b33c..178e000a7983 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -273,7 +273,8 @@ extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern bool is_memblock_offlined(struct memory_block *mem);
 extern void remove_memory(int nid, u64 start, u64 size);
 extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn);
-extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms);
+extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
+		unsigned long map_offset);
 extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
 					  unsigned long pnum);
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 30c3c8764649..b8cba7d8ea28 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -718,18 +718,109 @@ static inline enum zone_type page_zonenum(const struct page *page)
 }
 
 /**
+ * struct vmem_altmap - pre-allocated storage for vmemmap_populate
+ * @base_pfn: base of the entire dev_pagemap mapping
+ * @reserve: pages mapped, but reserved for driver use (relative to @base)
+ * @free: free pages set aside in the mapping for memmap storage
+ * @align: pages reserved to meet allocation alignments
+ * @alloc: track pages consumed, private to vmemmap_populate()
+ */
+struct vmem_altmap {
+	const unsigned long base_pfn;
+	const unsigned long reserve;
+	unsigned long free;
+	unsigned long align;
+	unsigned long alloc;
+};
+
+static inline unsigned long vmem_altmap_nr_free(struct vmem_altmap *altmap)
+{
+	unsigned long allocated = altmap->alloc + altmap->align;
+
+	if (altmap->free > allocated)
+		return altmap->free - allocated;
+	return 0;
+}
+
+static inline unsigned long vmem_altmap_offset(struct vmem_altmap *altmap)
+{
+	/* number of pfns from base where pfn_to_page() is valid */
+	return altmap->reserve + altmap->free;
+}
+
+static inline unsigned long vmem_altmap_next_pfn(struct vmem_altmap *altmap)
+{
+	return altmap->base_pfn + altmap->reserve + altmap->alloc
+		+ altmap->align;
+}
+
+/**
+ * vmem_altmap_alloc - allocate pages from the vmem_altmap reservation
+ * @altmap - reserved page pool for the allocation
+ * @nr_pfns - size (in pages) of the allocation
+ *
+ * Allocations are aligned to the size of the request
+ */
+static inline unsigned long vmem_altmap_alloc(struct vmem_altmap *altmap,
+		unsigned long nr_pfns)
+{
+	unsigned long pfn = vmem_altmap_next_pfn(altmap);
+	unsigned long nr_align;
+
+	nr_align = 1UL << find_first_bit(&nr_pfns, BITS_PER_LONG);
+	nr_align = ALIGN(pfn, nr_align) - pfn;
+
+	if (nr_pfns + nr_align > vmem_altmap_nr_free(altmap))
+		return ULONG_MAX;
+	altmap->alloc += nr_pfns;
+	altmap->align += nr_align;
+	return pfn + nr_align;
+}
+
+static inline void vmem_altmap_free(struct vmem_altmap *altmap,
+		unsigned long nr_pfns)
+{
+	altmap->alloc -= nr_pfns;
+}
+
+/**
  * struct dev_pagemap - metadata for ZONE_DEVICE mappings
+ * @altmap: pre-allocated/reserved memory for vmemmap allocations
  * @dev: host device of the mapping for debug
  */
 struct dev_pagemap {
-	/* TODO: vmem_altmap and percpu_ref count */
+	struct vmem_altmap *altmap;
+	const struct resource *res;
 	struct device *dev;
 };
 
 #ifdef CONFIG_ZONE_DEVICE
 struct dev_pagemap *__get_dev_pagemap(resource_size_t phys);
+void *devm_memremap_pages(struct device *dev, struct resource *res,
+		struct vmem_altmap *altmap);
+#else
+static inline struct dev_pagemap *__get_dev_pagemap(resource_size_t phys)
+{
+	return NULL;
+}
+
+static inline void *devm_memremap_pages(struct device *dev, struct resource *res,
+		struct vmem_altmap *altmap)
+{
+	/*
+	 * Fail attempts to call devm_memremap_pages() without
+	 * ZONE_DEVICE support enabled, this requires callers to fall
+	 * back to plain devm_memremap() based on config
+	 */
+	WARN_ON_ONCE(1);
+	return ERR_PTR(-ENXIO);
+}
+#endif
+
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start);
 #else
-static inline struct dev_pagemap *get_dev_pagemap(resource_size_t phys)
+static inline struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
 {
 	return NULL;
 }
@@ -2245,7 +2336,8 @@ pud_t *vmemmap_pud_populate(pgd_t *pgd, unsigned long addr, int node);
 pmd_t *vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node);
 pte_t *vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node);
 void *vmemmap_alloc_block(unsigned long size, int node);
-void *vmemmap_alloc_block_buf(unsigned long size, int node);
+void *vmemmap_alloc_block_buf(unsigned long size, int node,
+		struct vmem_altmap *altmap);
 void vmemmap_verify(pte_t *, int, unsigned long, unsigned long);
 int vmemmap_populate_basepages(unsigned long start, unsigned long end,
 			       int node);
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 64bfd9fa93aa..79bbbea2de6a 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -146,6 +146,7 @@ struct page_map {
 	struct resource res;
 	struct dev_pagemap pgmap;
 	struct list_head list;
+	struct vmem_altmap altmap;
 };
 
 static void add_page_map(struct page_map *page_map)
@@ -162,14 +163,17 @@ static void del_page_map(struct page_map *page_map)
 	spin_unlock(&range_lock);
 }
 
-static void devm_memremap_pages_release(struct device *dev, void *res)
+static void devm_memremap_pages_release(struct device *dev, void *data)
 {
-	struct page_map *page_map = res;
-
-	del_page_map(page_map);
+	struct page_map *page_map = data;
+	struct resource *res = &page_map->res;
+	struct dev_pagemap *pgmap = &page_map->pgmap;
 
 	/* pages are dead and unused, undo the arch mapping */
-	arch_remove_memory(page_map->res.start, resource_size(&page_map->res));
+	arch_remove_memory(res->start, resource_size(res));
+	dev_WARN_ONCE(dev, pgmap->altmap && pgmap->altmap->alloc,
+			"%s: failed to free all reserved pages\n", __func__);
+	del_page_map(page_map);
 }
 
 /* assumes rcu_read_lock() held at entry */
@@ -185,10 +189,22 @@ struct dev_pagemap *__get_dev_pagemap(resource_size_t phys)
 	return NULL;
 }
 
-void *devm_memremap_pages(struct device *dev, struct resource *res)
+/**
+ * devm_memremap_pages - remap and provide memmap backing for the given resource
+ * @dev: hosting device for @res
+ * @res: "host memory" address range
+ * @altmap: optional descriptor for allocating the memmap from @res
+ *
+ * Note, the expectation is that @res is a host memory range that could
+ * feasibly be treated as a "System RAM" range, i.e. not a device mmio
+ * range, but this is not enforced.
+ */
+void *devm_memremap_pages(struct device *dev, struct resource *res,
+		struct vmem_altmap *altmap)
 {
 	int is_ram = region_intersects(res->start, resource_size(res),
 			"System RAM");
+	struct dev_pagemap *pgmap;
 	struct page_map *page_map;
 	int error, nid;
 
@@ -201,14 +217,25 @@ void *devm_memremap_pages(struct device *dev, struct resource *res)
 	if (is_ram == REGION_INTERSECTS)
 		return __va(res->start);
 
+	if (altmap && !IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP)) {
+		dev_err(dev, "%s: support for alternate vmemmap disabled\n",
+				__func__);
+		return ERR_PTR(-ENXIO);
+	}
+
 	page_map = devres_alloc_node(devm_memremap_pages_release,
 			sizeof(*page_map), GFP_KERNEL, dev_to_node(dev));
 	if (!page_map)
 		return ERR_PTR(-ENOMEM);
+	pgmap = &page_map->pgmap;
 
 	memcpy(&page_map->res, res, sizeof(*res));
-
-	page_map->pgmap.dev = dev;
+	if (altmap) {
+		memcpy(&page_map->altmap, altmap, sizeof(*altmap));
+		pgmap->altmap = &page_map->altmap;
+	}
+	pgmap->dev = dev;
+	pgmap->res = &page_map->res;
 	INIT_LIST_HEAD(&page_map->list);
 	add_page_map(page_map);
 
@@ -228,3 +255,37 @@ void *devm_memremap_pages(struct device *dev, struct resource *res)
 }
 EXPORT_SYMBOL(devm_memremap_pages);
 #endif /* CONFIG_ZONE_DEVICE */
+
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+/*
+ * Unconditionally retrieve a dev_pagemap associated with the given physical
+ * address; this is only for use in the arch_{add|remove}_memory() for setting
+ * up and tearing down the memmap.
+ */
+static struct dev_pagemap *lookup_dev_pagemap(resource_size_t phys)
+{
+	struct dev_pagemap *pgmap;
+
+	rcu_read_lock();
+	pgmap = __get_dev_pagemap(phys);
+	rcu_read_unlock();
+	return pgmap;
+}
+
+struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
+{
+	/*
+	 * 'memmap_start' is the virtual address for the first "struct
+	 * page" in this range of the vmemmap array.  In the case of
+	 * CONFIG_SPARSE_VMEMMAP a page_to_pfn conversion is simple
+	 * pointer arithmetic, so we can perform this to_vmem_altmap()
+	 * conversion without concern for the initialization state of
+	 * the struct page fields.
+	 */
+	struct page *page = (struct page *) memmap_start;
+	struct dev_pagemap *pgmap;
+
+	pgmap = lookup_dev_pagemap(__pfn_to_phys(page_to_pfn(page)));
+	return pgmap ? pgmap->altmap : NULL;
+}
+#endif /* CONFIG_SPARSEMEM_VMEMMAP */
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index aa992e2df58a..3521df153de3 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -505,10 +505,25 @@ int __ref __add_pages(int nid, struct zone *zone, unsigned long phys_start_pfn,
 	unsigned long i;
 	int err = 0;
 	int start_sec, end_sec;
+	struct vmem_altmap *altmap;
+
 	/* during initialize mem_map, align hot-added range to section */
 	start_sec = pfn_to_section_nr(phys_start_pfn);
 	end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1);
 
+	altmap = to_vmem_altmap((unsigned long) pfn_to_page(phys_start_pfn));
+	if (altmap) {
+		/*
+		 * Validate altmap is within bounds of the total request
+		 */
+		if (altmap->base_pfn != phys_start_pfn
+				|| vmem_altmap_offset(altmap) > nr_pages) {
+			pr_warn_once("memory add fail, invalid altmap\n");
+			return -EINVAL;
+		}
+		altmap->alloc = 0;
+	}
+
 	for (i = start_sec; i <= end_sec; i++) {
 		err = __add_section(nid, zone, section_nr_to_pfn(i));
 
@@ -730,7 +745,8 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn)
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 }
 
-static int __remove_section(struct zone *zone, struct mem_section *ms)
+static int __remove_section(struct zone *zone, struct mem_section *ms,
+		unsigned long map_offset)
 {
 	unsigned long start_pfn;
 	int scn_nr;
@@ -747,7 +763,7 @@ static int __remove_section(struct zone *zone, struct mem_section *ms)
 	start_pfn = section_nr_to_pfn(scn_nr);
 	__remove_zone(zone, start_pfn);
 
-	sparse_remove_one_section(zone, ms);
+	sparse_remove_one_section(zone, ms, map_offset);
 	return 0;
 }
 
@@ -766,9 +782,32 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
 		 unsigned long nr_pages)
 {
 	unsigned long i;
-	int sections_to_remove;
-	resource_size_t start, size;
-	int ret = 0;
+	unsigned long map_offset = 0;
+	int sections_to_remove, ret = 0;
+
+	/* In the ZONE_DEVICE case device driver owns the memory region */
+	if (is_dev_zone(zone)) {
+		struct page *page = pfn_to_page(phys_start_pfn);
+		struct vmem_altmap *altmap;
+
+		altmap = to_vmem_altmap((unsigned long) page);
+		if (altmap)
+			map_offset = vmem_altmap_offset(altmap);
+	} else {
+		resource_size_t start, size;
+
+		start = phys_start_pfn << PAGE_SHIFT;
+		size = nr_pages * PAGE_SIZE;
+
+		ret = release_mem_region_adjustable(&iomem_resource, start,
+					size);
+		if (ret) {
+			resource_size_t endres = start + size - 1;
+
+			pr_warn("Unable to release resource <%pa-%pa> (%d)\n",
+					&start, &endres, ret);
+		}
+	}
 
 	/*
 	 * We can only remove entire sections
@@ -776,23 +815,12 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
 	BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
 	BUG_ON(nr_pages % PAGES_PER_SECTION);
 
-	start = phys_start_pfn << PAGE_SHIFT;
-	size = nr_pages * PAGE_SIZE;
-
-	/* in the ZONE_DEVICE case device driver owns the memory region */
-	if (!is_dev_zone(zone))
-		ret = release_mem_region_adjustable(&iomem_resource, start, size);
-	if (ret) {
-		resource_size_t endres = start + size - 1;
-
-		pr_warn("Unable to release resource <%pa-%pa> (%d)\n",
-				&start, &endres, ret);
-	}
-
 	sections_to_remove = nr_pages / PAGES_PER_SECTION;
 	for (i = 0; i < sections_to_remove; i++) {
 		unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION;
-		ret = __remove_section(zone, __pfn_to_section(pfn));
+
+		ret = __remove_section(zone, __pfn_to_section(pfn), map_offset);
+		map_offset = 0;
 		if (ret)
 			break;
 	}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 48aaf7b9f253..9dfc431d6271 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4620,8 +4620,9 @@ static void setup_zone_migrate_reserve(struct zone *zone)
 void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		unsigned long start_pfn, enum memmap_context context)
 {
-	pg_data_t *pgdat = NODE_DATA(nid);
+	struct vmem_altmap *altmap = to_vmem_altmap(__pfn_to_phys(start_pfn));
 	unsigned long end_pfn = start_pfn + size;
+	pg_data_t *pgdat = NODE_DATA(nid);
 	unsigned long pfn;
 	struct zone *z;
 	unsigned long nr_initialised = 0;
@@ -4629,6 +4630,13 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 	if (highest_memmap_pfn < end_pfn - 1)
 		highest_memmap_pfn = end_pfn - 1;
 
+	/*
+	 * Honor reservation requested by the driver for this ZONE_DEVICE
+	 * memory
+	 */
+	if (altmap && start_pfn == altmap->base_pfn)
+		start_pfn += altmap->reserve;
+
 	z = &pgdat->node_zones[zone];
 	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
 		/*
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 4cba9c2783a1..96c1dca4ce6a 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -70,7 +70,7 @@ void * __meminit vmemmap_alloc_block(unsigned long size, int node)
 }
 
 /* need to make sure size is all the same during early stage */
-void * __meminit vmemmap_alloc_block_buf(unsigned long size, int node)
+static void * __meminit __vmemmap_alloc_block_buf(unsigned long size, int node)
 {
 	void *ptr;
 
@@ -87,6 +87,39 @@ void * __meminit vmemmap_alloc_block_buf(unsigned long size, int node)
 	return ptr;
 }
 
+static void * __meminit altmap_alloc_block_buf(unsigned long size,
+		struct vmem_altmap *altmap)
+{
+	unsigned long pfn, nr_pfns;
+	void *ptr;
+
+	if (size & ~PAGE_MASK) {
+		pr_warn_once("%s: allocations must be multiple of PAGE_SIZE (%ld)\n",
+				__func__, size);
+		return NULL;
+	}
+
+	nr_pfns = size >> PAGE_SHIFT;
+	pfn = vmem_altmap_alloc(altmap, nr_pfns);
+	if (pfn < ULONG_MAX)
+		ptr = __va(__pfn_to_phys(pfn));
+	else
+		ptr = NULL;
+	pr_debug("%s: pfn: %#lx alloc: %ld align: %ld nr: %#lx\n",
+			__func__, pfn, altmap->alloc, altmap->align, nr_pfns);
+
+	return ptr;
+}
+
+/* need to make sure size is all the same during early stage */
+void * __meminit vmemmap_alloc_block_buf(unsigned long size, int node,
+		struct vmem_altmap *altmap)
+{
+	if (altmap)
+		return altmap_alloc_block_buf(size, altmap);
+	return __vmemmap_alloc_block_buf(size, node);
+}
+
 void __meminit vmemmap_verify(pte_t *pte, int node,
 				unsigned long start, unsigned long end)
 {
@@ -103,7 +136,7 @@ pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node)
 	pte_t *pte = pte_offset_kernel(pmd, addr);
 	if (pte_none(*pte)) {
 		pte_t entry;
-		void *p = vmemmap_alloc_block_buf(PAGE_SIZE, node);
+		void *p = __vmemmap_alloc_block_buf(PAGE_SIZE, node);
 		if (!p)
 			return NULL;
 		entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
diff --git a/mm/sparse.c b/mm/sparse.c
index d1b48b691ac8..3717ceed4177 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -748,7 +748,7 @@ static void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
 	if (!memmap)
 		return;
 
-	for (i = 0; i < PAGES_PER_SECTION; i++) {
+	for (i = 0; i < nr_pages; i++) {
 		if (PageHWPoison(&memmap[i])) {
 			atomic_long_sub(1, &num_poisoned_pages);
 			ClearPageHWPoison(&memmap[i]);
@@ -788,7 +788,8 @@ static void free_section_usemap(struct page *memmap, unsigned long *usemap)
 		free_map_bootmem(memmap);
 }
 
-void sparse_remove_one_section(struct zone *zone, struct mem_section *ms)
+void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
+		unsigned long map_offset)
 {
 	struct page *memmap = NULL;
 	unsigned long *usemap = NULL, flags;
@@ -804,7 +805,8 @@ void sparse_remove_one_section(struct zone *zone, struct mem_section *ms)
 	}
 	pgdat_resize_unlock(pgdat, &flags);
 
-	clear_hwpoisoned_pages(memmap, PAGES_PER_SECTION);
+	clear_hwpoisoned_pages(memmap + map_offset,
+			PAGES_PER_SECTION - map_offset);
 	free_section_usemap(memmap, usemap);
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
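
To make the calling convention concrete, here is a short sketch of how a
hypothetical ZONE_DEVICE driver could carve the front of its range out for
memmap storage via the new @altmap argument.  The reservation sizes below
are invented for illustration; the real consumer is the pmem driver in
patch 6 of this series.

#include <linux/mm.h>
#include <linux/pfn.h>
#include <linux/sizes.h>
#include <linux/device.h>
#include <linux/ioport.h>

static void *example_map_with_altmap(struct device *dev, struct resource *res)
{
	struct vmem_altmap altmap = {
		.base_pfn = PFN_DOWN(res->start),
		.reserve  = 2,			/* pages kept for a driver info block */
		.free     = PFN_DOWN(SZ_128M),	/* device pages for struct page storage */
	};

	/*
	 * vmemmap_populate() now satisfies vmemmap_alloc_block_buf()
	 * requests from altmap.free instead of the page allocator.  The
	 * first pfn with a usable struct page is base_pfn +
	 * vmem_altmap_offset(); devm_memremap_pages() copies the altmap,
	 * so a stack-local descriptor is fine.
	 */
	return devm_memremap_pages(dev, res, &altmap);
}

With such a reservation in place, the memmap_init_zone() hunk above skips
the reserved head pages, and arch_remove_memory() offsets its zone lookup
by vmem_altmap_offset() at teardown time.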


* Re: [PATCH v2 00/20] get_user_pages() for dax mappings
  2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
                   ` (19 preceding siblings ...)
  2015-10-10  0:57 ` [PATCH v2 20/20] mm, x86: get_user_pages() for dax mappings Dan Williams
@ 2015-10-23 21:06 ` Logan Gunthorpe
  2015-11-30 22:15   ` Dan Williams
  20 siblings, 1 reply; 37+ messages in thread
From: Logan Gunthorpe @ 2015-10-23 21:06 UTC (permalink / raw)
  To: Dan Williams, linux-nvdimm
  Cc: linux-mips, Dave Hansen, Boaz Harrosh, David Airlie,
	Catalin Marinas, Dave Hansen, Dave Chinner, Keith Busch,
	linux-mm, Paul Mackerras, H. Peter Anvin, hch, Russell King,
	Richard Weinberger, Peter Zijlstra, Jeff Moyer, Ingo Molnar,
	Benjamin Herrenschmidt, Matthew Wilcox, ross.zwisler,
	Gleb Natapov, Marc Zyngier, Will Deacon, Jeff Dike,
	Alexander Viro, Stephen Bates

Hi Dan,

We've tested this patch series (as pulled from your git repo) with our 
P2P work and everything is working great. The issues we found in v1 have 
been fixed and we have not found any new ones.

Tested-By: Logan Gunthorpe <logang@deltatee.com>

Thanks,

Logan



On 09/10/15 06:55 PM, Dan Williams wrote:
> Changes since v1 [1]:
> 1/ Rebased on the accepted cleanups to the memremap() api and the NUMA
>     hints for devm allocations. (see libnvdimm-for-next [2]).
>
> 2/ Rebased on DAX fixes from Ross [3], currently in -mm, and Dave [4],
>     applied locally for now.
>
> 3/ Renamed __pfn_t to pfn_t and converted KVM and UM accordingly (Dave
>     Hansen)
>
> 4/ Make pfn-to-pfn_t conversions a nop (binary identical) for typical
>     mapped pfns (Dave Hansen)
>
> 5/ Fixed up the devm_memremap_pages() api to require passing in a
>     percpu_ref object.  Addresses a crash reported-by Logan.
>
> 6/ Moved the back pointer from a page to its hosting 'struct
>     dev_pagemap' to share storage with the 'lru' field rather than
>     'mapping'.  Enables us to revoke mappings at devm_memunmap_page()
>     time and addresses a crash reported-by Logan.
>
> 7/ Rework dax_map_bh() into dax_map_atomic() to avoid proliferating
>     buffer_head usage deeper into the dax implementation.  Also addresses
>     a crash reported by Logan (Dave Chinner)
>
> 8/ Include an initial, only lightly tested, implementation of revoking
>     usages of ZONE_DEVICE pages when the driver disables the pmem device.
>     This coordinates with blk_cleanup_queue() for the pmem gendisk, see
>     patch 19.
>
> 9/ Include a cleaned up version of the vmem_altmap infrastructure
>     allowing the struct page memmap to optionally be allocated from pmem
>     itself.
>
> [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
> [2]: https://git.kernel.org/cgit/linux/kernel/git/nvdimm/nvdimm.git/log/?h=libnvdimm-for-next
> [3]: https://git.kernel.org/cgit/linux/kernel/git/nvdimm/nvdimm.git/commit/?h=dax-fixes&id=93fdde069dce
> [4]: https://lists.01.org/pipermail/linux-nvdimm/2015-October/002286.html
>
> ---
> To date, we have implemented two I/O usage models for persistent memory,
> PMEM (a persistent "ram disk") and DAX (mmap persistent memory into
> userspace).  This series adds a third, DAX-GUP, that allows DAX mappings
> to be the target of direct-i/o.  It allows userspace to coordinate
> DMA/RDMA from/to persitent memory.
>
> The implementation leverages the ZONE_DEVICE mm-zone that went into
> 4.3-rc1 to flag pages that are owned and dynamically mapped by a device
> driver.  The pmem driver, after mapping a persistent memory range into
> the system memmap via devm_memremap_pages(), arranges for DAX to
> distinguish pfn-only versus page-backed pmem-pfns via flags in the new
> __pfn_t type.  The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn,
> flags the resulting pte(s) inserted into the process page tables with a
> new _PAGE_DEVMAP flag.  Later, when get_user_pages() is walking ptes it
> keys off _PAGE_DEVMAP to pin the device hosting the page range active.
> Finally, get_page() and put_page() are modified to take references
> against the device driver established page mapping.
>
> This series is available via git here:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm libnvdimm-pending
>
> ---
>
> Dan Williams (20):
>        block: generic request_queue reference counting
>        dax: increase granularity of dax_clear_blocks() operations
>        block, dax: fix lifetime of in-kernel dax mappings with dax_map_atomic()
>        mm: introduce __get_dev_pagemap()
>        x86, mm: introduce vmem_altmap to augment vmemmap_populate()
>        libnvdimm, pfn, pmem: allocate memmap array in persistent memory
>        avr32: convert to asm-generic/memory_model.h
>        hugetlb: fix compile error on tile
>        frv: fix compiler warning from definition of __pmd()
>        um: kill pfn_t
>        kvm: rename pfn_t to kvm_pfn_t
>        mips: fix PAGE_MASK definition
>        mm, dax, pmem: introduce pfn_t
>        mm, dax, gpu: convert vm_insert_mixed to pfn_t, introduce _PAGE_DEVMAP
>        mm, dax: convert vmf_insert_pfn_pmd() to pfn_t
>        list: introduce list_poison() and LIST_POISON3
>        mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup
>        block: notify queue death confirmation
>        mm, pmem: devm_memunmap_pages(), truncate and unmap ZONE_DEVICE pages
>        mm, x86: get_user_pages() for dax mappings
>
>
>   arch/alpha/include/asm/pgtable.h        |    1
>   arch/arm/include/asm/kvm_mmu.h          |    5 -
>   arch/arm/kvm/mmu.c                      |   10 +
>   arch/arm64/include/asm/kvm_mmu.h        |    3
>   arch/avr32/include/asm/page.h           |    8 -
>   arch/frv/include/asm/page.h             |    2
>   arch/ia64/include/asm/pgtable.h         |    1
>   arch/m68k/include/asm/page_mm.h         |    1
>   arch/m68k/include/asm/page_no.h         |    1
>   arch/mips/include/asm/kvm_host.h        |    6 -
>   arch/mips/include/asm/page.h            |    2
>   arch/mips/kvm/emulate.c                 |    2
>   arch/mips/kvm/tlb.c                     |   14 +
>   arch/parisc/include/asm/pgtable.h       |    1
>   arch/powerpc/include/asm/kvm_book3s.h   |    4
>   arch/powerpc/include/asm/kvm_ppc.h      |    2
>   arch/powerpc/include/asm/pgtable.h      |    1
>   arch/powerpc/kvm/book3s.c               |    6 -
>   arch/powerpc/kvm/book3s_32_mmu_host.c   |    2
>   arch/powerpc/kvm/book3s_64_mmu_host.c   |    2
>   arch/powerpc/kvm/e500.h                 |    2
>   arch/powerpc/kvm/e500_mmu_host.c        |    8 -
>   arch/powerpc/kvm/trace_pr.h             |    2
>   arch/powerpc/sysdev/axonram.c           |    8 -
>   arch/sparc/include/asm/pgtable_64.h     |    2
>   arch/tile/include/asm/pgtable.h         |    1
>   arch/um/include/asm/page.h              |    6 -
>   arch/um/include/asm/pgtable-3level.h    |    5 -
>   arch/um/include/asm/pgtable.h           |    2
>   arch/x86/include/asm/pgtable.h          |   24 ++
>   arch/x86/include/asm/pgtable_types.h    |    7 +
>   arch/x86/kvm/iommu.c                    |   11 +
>   arch/x86/kvm/mmu.c                      |   37 ++--
>   arch/x86/kvm/mmu_audit.c                |    2
>   arch/x86/kvm/paging_tmpl.h              |    6 -
>   arch/x86/kvm/vmx.c                      |    2
>   arch/x86/kvm/x86.c                      |    2
>   arch/x86/mm/gup.c                       |   56 +++++-
>   arch/x86/mm/init_64.c                   |   32 +++
>   arch/x86/mm/pat.c                       |    4
>   block/blk-core.c                        |   79 +++++++-
>   block/blk-mq-sysfs.c                    |    6 -
>   block/blk-mq.c                          |   87 +++------
>   block/blk-sysfs.c                       |    3
>   block/blk.h                             |   12 +
>   drivers/block/brd.c                     |    4
>   drivers/gpu/drm/exynos/exynos_drm_gem.c |    3
>   drivers/gpu/drm/gma500/framebuffer.c    |    3
>   drivers/gpu/drm/msm/msm_gem.c           |    3
>   drivers/gpu/drm/omapdrm/omap_gem.c      |    6 -
>   drivers/gpu/drm/ttm/ttm_bo_vm.c         |    3
>   drivers/nvdimm/pfn_devs.c               |    3
>   drivers/nvdimm/pmem.c                   |  128 +++++++++----
>   drivers/s390/block/dcssblk.c            |   10 -
>   fs/block_dev.c                          |    2
>   fs/dax.c                                |  199 +++++++++++++--------
>   include/asm-generic/pgtable.h           |    6 -
>   include/linux/blk-mq.h                  |    1
>   include/linux/blkdev.h                  |   12 +
>   include/linux/huge_mm.h                 |    2
>   include/linux/hugetlb.h                 |    1
>   include/linux/io.h                      |   17 --
>   include/linux/kvm_host.h                |   37 ++--
>   include/linux/kvm_types.h               |    2
>   include/linux/list.h                    |   14 +
>   include/linux/memory_hotplug.h          |    3
>   include/linux/mm.h                      |  300 +++++++++++++++++++++++++++++--
>   include/linux/mm_types.h                |    5 +
>   include/linux/pfn.h                     |    9 +
>   include/linux/poison.h                  |    1
>   kernel/memremap.c                       |  187 +++++++++++++++++++
>   lib/list_debug.c                        |    2
>   mm/gup.c                                |   11 +
>   mm/huge_memory.c                        |   10 +
>   mm/hugetlb.c                            |   18 ++
>   mm/memory.c                             |   17 +-
>   mm/memory_hotplug.c                     |   66 +++++--
>   mm/page_alloc.c                         |   10 +
>   mm/sparse-vmemmap.c                     |   37 ++++
>   mm/sparse.c                             |    8 +
>   mm/swap.c                               |   15 ++
>   virt/kvm/kvm_main.c                     |   47 ++---
>   82 files changed, 1264 insertions(+), 418 deletions(-)
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 00/20] get_user_pages() for dax mappings
  2015-10-23 21:06 ` [PATCH v2 00/20] " Logan Gunthorpe
@ 2015-11-30 22:15   ` Dan Williams
  2015-12-02 22:02     ` Logan Gunthorpe
  0 siblings, 1 reply; 37+ messages in thread
From: Dan Williams @ 2015-11-30 22:15 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: linux-nvdimm, Linux MM

On Fri, Oct 23, 2015 at 2:06 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
> Hi Dan,
>
> We've tested this patch series (as pulled from your git repo) with our P2P
> work and everything is working great. The issues we found in v1 have been
> fixed and we have not found any new ones.
>
> Tested-By: Logan Gunthorpe <logang@deltatee.com>
>
>

Hi Logan,

I appreciate the test report.  I appreciate it so much I wonder if
you'd be willing to re-test the current state of:

git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm libnvdimm-pending

...with the revised approach that I'm proposing for 4.5 inclusion.

The main changes are fixes for supporting huge-page mappings.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 00/20] get_user_pages() for dax mappings
  2015-11-30 22:15   ` Dan Williams
@ 2015-12-02 22:02     ` Logan Gunthorpe
  2015-12-02 22:04       ` Dan Williams
  2015-12-04  2:16       ` Dan Williams
  0 siblings, 2 replies; 37+ messages in thread
From: Logan Gunthorpe @ 2015-12-02 22:02 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm, Linux MM, Stephen Bates

On 30/11/15 03:15 PM, Dan Williams wrote:
> I appreciate the test report.  I appreciate it so much I wonder if
> you'd be willing to re-test the current state of:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm libnvdimm-pending


Hi Dan,

I've had some mixed success with the above branch. Many of my tests are 
working but I have the following two issues which I didn't see previously:

* When trying to do RDMA transfers to an mmapped DAX file I get a kernel 
panic while de-registering the memory region. (The panic message is at 
the end of this email.) addr2line puts it around dax.c:723 for the first 
line in the call trace; the address where the failure occurs doesn't 
seem to map to a line of code.

* Less important: my tests no longer work inside qemu because I'm using 
a region in the PCI BAR space which is not on a section boundary. The 
latest code enforces that restriction, which makes it harder to use with 
PCI memory. (I'm talking about the check around memremap.c:311; a rough 
sketch follows below.) Presently, if I comment out the check, my VM 
tests work fine. This hasn't been a problem on real hardware, as we are 
using a 64-bit address space and thus the BAR addresses are better 
aligned.
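
For reference, the check I'm hitting looks roughly like this (quoting 
from memory, so this is not the exact code in your branch):

	/* in devm_memremap_pages(): reject ranges that do not cover
	 * whole memory sections */
	resource_size_t align_start, align_size;

	align_start = res->start & ~(SECTION_SIZE - 1);
	align_size = ALIGN(resource_size(res), SECTION_SIZE);
	if (align_start != res->start || align_size != resource_size(res))
		return ERR_PTR(-EINVAL);	/* my BAR trips this */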


I don't have much time at the moment to dig into the kernel panic myself 
so hopefully what I've provided will help you find the issue. If you 
need any more information let me know.

Thanks,

Logan





> [ 1542.406591] BUG: unable to handle kernel paging request at 00000000300000d1
> [ 1542.406627] IP: [<ffffffffa033be40>] ext4_end_io_unwritten+0x10/0x60 [ext4]
> [ 1542.406661] PGD 260d27067 PUD 2602aa067 PMD 0
> [ 1542.406701] Oops: 0000 [#1] SMP
> [ 1542.406729] Modules linked in: mem_map(O) mtramonb(O) xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables x_tables br_netfilter nf_nat nf_conntrack bridge stp llc dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc ext2 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass sha256_generic iTCO_wdt iTCO_vendor_support hmac drbg ansi_cprng aesni_intel aes_x86_64 ablk_helper cryptd lrw sb_edac nvme gf128mul glue_helper ipmi_si edac_core psmouse joydev evdev lpc_ich i2c_i801 pcspkr serio_raw mfd_core wmi ipmi_msghandler acpi_cpufreq tpm_tis tpm ioatdma processor shpchp button iw_cxgb4 cxgb4 rdma_ucm ib_uverbs rdma_cm
> [ 1542.407293] iw_cm ib_ipoib ib_cm ib_umad mlx4_ib ib_sa ib_mad ib_core ib_addr msr loop fuse autofs4 ext4 mbcache jbd2 btrfs xor raid6_pq dm_mod md_mod ohci_hcd uhci_hcd xhci_hcd sg sd_mod hid_generic usbhid hid isci ahci libsas libahci igb i2c_algo_bit dca scsi_transport_sas ehci_pci libata ehci_hcd ptp crc32c_intel pps_core scsi_mod usbcore i2c_core usb_common mlx4_core
> [ 1542.407612] CPU: 5 PID: 4740 Comm: client Tainted: G O 4.4.0-rc3+donard2.4+ #78
> [ 1542.407682] Hardware name: Supermicro SYS-7047GR-TRF/X9DRG-QF, BIOS 3.0a 12/05/2013
> [ 1542.407749] task: ffff8802767445c0 ti: ffff8802601fc000 task.ti: ffff8802601fc000
> [ 1542.407816] RIP: 0010:[<ffffffffa033be40>] [<ffffffffa033be40>] ext4_end_io_unwritten+0x10/0x60 [ext4]
> [ 1542.407895] RSP: 0000:ffff8802601ffcf8 EFLAGS: 00010246
> [ 1542.407935] RAX: 00000000300000d1 RBX: 0000000000000800 RCX: 0000000000000000
> [ 1542.407981] RDX: 00000000ffffffff RSI: 0000000000000000 RDI: ffff8802601ffd68
> [ 1542.408025] RBP: 0000000000000000 R08: ffffffffa0342a9c R09: ffffffffa033be30
> [ 1542.408070] R10: 0000000000000001 R11: 0000000000000246 R12: ffff880464dd4230
> [ 1542.408114] R13: ffff88026030b858 R14: ffff880464dd40c8 R15: 0000000000000800
> [ 1542.408160] FS: 00007f5662bde740(0000) GS:ffff88047fc80000(0000) knlGS:0000000000000000
> [ 1542.408228] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1542.408269] CR2: 00000000300000d1 CR3: 0000000260dbb000 CR4: 00000000000406e0
> [ 1542.408314] Stack:
> [ 1542.408347] 0000000000000800 ffff8804716af018 ffff880464dd4230 ffff88026030b858
> [ 1542.408431] ffffffff81169652 000000008113ac16 00007f56617ee000 ffff8802601ffde0
> [ 1542.408516] ffffffff8113df6c ffffffffa033be30 00000024767445c0 ffff880200000041
> [ 1542.408601] Call Trace:
> [ 1542.408641] [<ffffffff81169652>] ? __dax_pmd_fault+0x3c0/0x3eb
> [ 1542.408686] [<ffffffff8113df6c>] ? path_openat+0xb33/0xc16
> [ 1542.408731] [<ffffffffa033be30>] ? ext4_dax_mkwrite+0x13/0x13 [ext4]
> [ 1542.408776] [<ffffffffa041dcbf>] ? ib_uverbs_dereg_mr+0xad/0xbb [ib_uverbs]
> [ 1542.408823] [<ffffffff8110795f>] ? vma_gap_callbacks_propagate+0x16/0x2c
> [ 1542.408868] [<ffffffff81108475>] ? vma_link+0x71/0x7e
> [ 1542.408910] [<ffffffff81109010>] ? vma_set_page_prot+0x33/0x50
> [ 1542.408955] [<ffffffffa033c081>] ? ext4_dax_pmd_fault+0xa7/0xee [ext4]
> [ 1542.409000] [<ffffffff8110560b>] ? handle_mm_fault+0x236/0xe95
> [ 1542.409043] [<ffffffff810f6fdc>] ? vm_mmap_pgoff+0x80/0xab
> [ 1542.409086] [<ffffffff81039c12>] ? __do_page_fault+0x239/0x3f2
> [ 1542.409131] [<ffffffff813e0722>] ? page_fault+0x22/0x30
> [ 1542.409171] Code: 5c 41 5d 41 5e 41 5f c3 48 c7 c1 30 be 33 a0 48 c7 c2 9c 2a 34 a0 e9 86 de e2 e0 41 55 41 54 85 f6 55 53 48 8b 47 58 48 8b 6f 40 <4c> 8b 20 45 8b ac 24 90 00 00 00 74 3c 48 8b 07 48 89 fb f6 c4
> [ 1542.409609] RIP [<ffffffffa033be40>] ext4_end_io_unwritten+0x10/0x60 [ext4]
> [ 1542.409661] RSP <ffff8802601ffcf8>
> [ 1542.409696] CR2: 00000000300000d1
> [ 1542.410353] ---[ end trace c43bed51af8ba585 ]---


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 00/20] get_user_pages() for dax mappings
  2015-12-02 22:02     ` Logan Gunthorpe
@ 2015-12-02 22:04       ` Dan Williams
  2015-12-04  2:16       ` Dan Williams
  1 sibling, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-12-02 22:04 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: linux-nvdimm, Linux MM, Stephen Bates

On Wed, Dec 2, 2015 at 2:02 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
> On 30/11/15 03:15 PM, Dan Williams wrote:
>>
>> I appreciate the test report.  I appreciate it so much I wonder if
>> you'd be willing to re-test the current state of:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm
>> libnvdimm-pending
>
>
>
> Hi Dan,
>
> I've had some mixed success with the above branch. Many of my tests are
> working but I have the following two issues which I didn't see previously:
>
> * When trying to do RDMA transfers to an mmapped DAX file I get a kernel panic
> while de-registering the memory region. (The panic message is at the end of
> this email.) addr2line puts it around dax.c:723 for the first line in the
> call trace; the address where the failure occurs doesn't seem to map to a
> line of code.
>
> * Less important: my tests no longer work inside qemu because I'm using a
> region in the PCI BAR space which is not on a section boundary. The latest
> code enforces that restriction, which makes it harder to use with PCI memory.
> (I'm talking about the check around memremap.c:311.) Presently, if I comment
> out the check, my VM tests work fine. This hasn't been a problem on real
> hardware, as we are using a 64-bit address space and thus the BAR addresses
> are better aligned.
>
>
> I don't have much time at the moment to dig into the kernel panic myself so
> hopefully what I've provided will help you find the issue. If you need any
> more information let me know.

This is great. Thank you!  I'll take a look.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 00/20] get_user_pages() for dax mappings
  2015-12-02 22:02     ` Logan Gunthorpe
  2015-12-02 22:04       ` Dan Williams
@ 2015-12-04  2:16       ` Dan Williams
  2015-12-05  1:58         ` Logan Gunthorpe
  1 sibling, 1 reply; 37+ messages in thread
From: Dan Williams @ 2015-12-04  2:16 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: linux-nvdimm, Linux MM, Stephen Bates

On Wed, Dec 2, 2015 at 2:02 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
> On 30/11/15 03:15 PM, Dan Williams wrote:
>>
>> I appreciate the test report.  I appreciate it so much I wonder if
>> you'd be willing to re-test the current state of:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm
>> libnvdimm-pending
>
>
>
> Hi Dan,
>
> I've had some mixed success with the above branch. Many of my tests are
> working but I have the following two issues which I didn't see previously:
>
> * When trying to do RDMA transfers to an mmapped DAX file I get a kernel panic
> while de-registering the memory region. (The panic message is at the end of
> this email.) addr2line puts it around dax.c:723 for the first line in the
> call trace; the address where the failure occurs doesn't seem to map to a
> line of code.
>
> * Less important: my tests no longer work inside qemu because I'm using a
> region in the PCI BAR space which is not on a section boundary. The latest
> code enforces that restriction, which makes it harder to use with PCI memory.
> (I'm talking about the check around memremap.c:311.) Presently, if I comment
> out the check, my VM tests work fine. This hasn't been a problem on real
> hardware, as we are using a 64-bit address space and thus the BAR addresses
> are better aligned.
>

I could loosen the restriction a bit to allow one unaligned mapping
per section.  However, if another mapping request came along that
tried to map a free part of the section it would fail, because the code
depends on a "1 dev_pagemap per section" relationship.  Seems like an OK
compromise to me...
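
Roughly what I have in mind (a sketch only; section_has_pagemap() is a
hypothetical stand-in for whatever lookup the real code ends up using):

	/* Instead of requiring section alignment, permit one
	 * (possibly unaligned) dev_pagemap per memory section. */
	unsigned long pfn;

	for (pfn = SECTION_ALIGN_DOWN(PFN_DOWN(res->start));
	     pfn <= PFN_DOWN(res->end); pfn += PAGES_PER_SECTION) {
		if (section_has_pagemap(pfn))
			/* section already claimed by another mapping */
			return ERR_PTR(-EBUSY);
	}
	/* the first claimant owns its section(s), aligned or not */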

> I don't have much time at the moment to dig into the kernel panic myself so
> hopefully what I've provided will help you find the issue. If you need any
> more information let me know.

Could you share the test setup for this one so I can try to reproduce?
As far as I can see this looks like an ext4 internals issue.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 00/20] get_user_pages() for dax mappings
  2015-12-04  2:16       ` Dan Williams
@ 2015-12-05  1:58         ` Logan Gunthorpe
  2015-12-08  0:00           ` Logan Gunthorpe
  0 siblings, 1 reply; 37+ messages in thread
From: Logan Gunthorpe @ 2015-12-05  1:58 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm, Linux MM, Stephen Bates

Hey,

On 03/12/15 07:16 PM, Dan Williams wrote:
> I could loosen the restriction a bit to allow one unaligned mapping
> per section.  However, if another mapping request came along that
> tried to map a free part of the section it would fail, because the code
> depends on a "1 dev_pagemap per section" relationship.  Seems like an OK
> compromise to me...

Sure, that would work fine for us. I think it would be very unusual to 
need to map two adjacent BARs in this way.

> Could you share the test setup for this one so I can try to reproduce?
>   As far as I can see this looks like an ext4 internals issue.

Ok, well it's somewhat specialized and I can't run the failing test in a 
VM because it requires InfiniBand hardware. We have a PCI card that has 
a large memory-backed BAR space. To use that with ZONE_DEVICE we have a 
kernel patch that allows doing the ZONE_DEVICE mapping with I/O memory 
that has write combining enabled. Then we have an out-of-tree kernel 
module that creates a block device from the PCI BAR (similar to the pmem 
code).
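
In outline, the module's mapping step is just this (a sketch with 
illustrative names, using the devm_memremap_pages() signature from your 
series; the write-combining patch and all error handling are elided):

	/* Map a memory-backed PCI BAR as ZONE_DEVICE pages. */
	static void *map_pci_bar(struct pci_dev *pdev, int bar,
				 struct percpu_ref *ref)
	{
		struct resource *res = &pdev->resource[bar];

		/* devm_memremap_pages() currently requires 'res' to be
		 * section-aligned, hence the qemu trouble above */
		return devm_memremap_pages(&pdev->dev, res, ref, NULL);
	}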

I could send you all of that, assuming you have a suitable PCI device. 
However, I'm hoping none of the above has anything to do with the failure.

The test that is failing is a very simple RDMA test with an mmapped DAX 
file, so hopefully it has nothing to do with the fact that a PCI device 
backs it. If you have some IB hardware available you could try our 
simple test code from here:

https://github.com/sbates130272/io_peer_mem/tree/master/test

The server must be run with no arguments. Then the client can be run 
with the address of the server as the first argument and a file that's 
in a DAX fs (with a size greater than 4MB). The client and server should 
be able to run on the same node, if necessary.

Let me know if this helps or if there's anything else I can provide. I 
can probably dig into it some more on Monday on our setup.

Logan


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 00/20] get_user_pages() for dax mappings
  2015-12-05  1:58         ` Logan Gunthorpe
@ 2015-12-08  0:00           ` Logan Gunthorpe
  2015-12-08  0:48             ` Dan Williams
  0 siblings, 1 reply; 37+ messages in thread
From: Logan Gunthorpe @ 2015-12-08  0:00 UTC (permalink / raw)
  To: Dan Williams; +Cc: Stephen Bates, Linux MM, linux-nvdimm

Hi Dan,

I've done a bit of digging and here's some more information:

* The crash occurs in ext4_end_io_unwritten when it tries to dereference 
bh->b_assoc_map, which is not necessarily NULL.

* That function is called by __dax_pmd_fault, as the argument 
complete_unwritten.

* Looking in __dax_pmd_fault, the bug occurs if we hit either of the 
first two 'goto fallback' lines. (In my case, it's hitting the first one.)

* After the fallback code, it goes back to 'out', then checks bh 
for the unwritten flag. But bh hasn't been initialized yet and, on my 
setup, the unwritten flag happens to be set. So it then calls 
complete_unwritten with a garbage bh and crashes.

If I move the memset(&bh) up in the code, before the goto fallbacks can 
occur, I can fix the crash (see the sketch below).  I don't know if this 
is really the best way to fix the problem though.
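
The shape of the change, approximately (this is a sketch against my 
tree, not a literal diff, so the context will differ):

	/* in __dax_pmd_fault(), fs/dax.c: */
	struct buffer_head bh;
	...
	memset(&bh, 0, sizeof(bh));	/* hoisted: was below the first
					 * 'goto fallback' sites */
	...
	if (...)
		goto fallback;		/* previously jumped here with
					 * bh still uninitialized */
	...
 out:
	if (buffer_unwritten(&bh))	/* no longer reads stack garbage
					 * on the fallback path */
		complete_unwritten(&bh, ...);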

--

However, unfortunately, fixing the above just uncovered another issue. 
Now the MR de-registration seems to have completed but the task hangs 
when it's trying to munmap the memory. (Stack trace at the end of this 
email.)

It looks like the task is hanging on i_mmap_lock_write in 
unlink_file_vma. I'm not really sure how to go about debugging this lock 
issue. If you have any steps I can try to get you more information, let 
me know. I'm also happy to re-test if you have any other changes you'd 
like me to try.

Thanks,

Logan


> [ 240.520522] INFO: task client:1997 blocked for more than 120 seconds.
> [ 240.520638] Tainted: G O 4.4.0-rc3+donard2.5+ #87
> [ 240.520741] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 240.520847] client D ffff88047fd14800 0 1997 1912 0x00000004
> [ 240.520856] ffff88026bc7b240 0000000000000000 ffff88026bd38000 ffff88026bd37d30
> [ 240.520861] fffffffeffffffff ffff88026bc7b240 00007f4297513000 ffff880473aba240
> [ 240.520866] ffffffff81422896 ffff880470b34e40 ffffffff814242f1 ffff880476deddc0
> [ 240.520871] Call Trace:
> [ 240.520886] [<ffffffff81422896>] ? schedule+0x6c/0x79
> [ 240.520893] [<ffffffff814242f1>] ? rwsem_down_write_failed+0x285/0x2cb
> [ 240.520903] [<ffffffff8124d833>] ? call_rwsem_down_write_failed+0x13/0x20
> [ 240.520907] [<ffffffff8124d833>] ? call_rwsem_down_write_failed+0x13/0x20
> [ 240.520913] [<ffffffff81423b22>] ? down_write+0x24/0x33
> [ 240.520923] [<ffffffff8110836e>] ? unlink_file_vma+0x28/0x4b
> [ 240.520928] [<ffffffff811033e4>] ? free_pgtables+0x3c/0xba
> [ 240.520933] [<ffffffff81107c15>] ? unmap_region+0xa4/0xc1
> [ 240.520941] [<ffffffff8106c60c>] ? pick_next_task_fair+0x11b/0x347
> [ 240.520947] [<ffffffff8110795f>] ? vma_gap_callbacks_propagate+0x16/0x2c
> [ 240.520951] [<ffffffff81108101>] ? vma_rb_erase+0x161/0x18f
> [ 240.520957] [<ffffffff81109524>] ? do_munmap+0x271/0x2e6
> [ 240.520962] [<ffffffff811095d0>] ? vm_munmap+0x37/0x4f
> [ 240.520967] [<ffffffff81109602>] ? SyS_munmap+0x1a/0x1f
> [ 240.520971] [<ffffffff81424d57>] ? entry_SYSCALL_64_fastpath+0x12/0x6a


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 00/20] get_user_pages() for dax mappings
  2015-12-08  0:00           ` Logan Gunthorpe
@ 2015-12-08  0:48             ` Dan Williams
  0 siblings, 0 replies; 37+ messages in thread
From: Dan Williams @ 2015-12-08  0:48 UTC (permalink / raw)
  To: Logan Gunthorpe; +Cc: Stephen Bates, Linux MM, linux-nvdimm

On Mon, Dec 7, 2015 at 4:00 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
> Hi Dan,
>
> I've done a bit of digging and here's some more information:
>
> * The crash occurs in ext4_end_io_unwritten when it tries to dereference
> bh->b_assoc_map, which is not necessarily NULL.
>
> * That function is called by __dax_pmd_fault, as the argument
> complete_unwritten.
>
> * Looking in __dax_pmd_fault, the bug occurs if we hit either of the first
> two 'goto fallback' lines. (In my case, it's hitting the first one.)
>
> * After the fallback code, it goes back to 'out', then checks bh
> for the unwritten flag. But bh hasn't been initialized yet and, on my setup,
> the unwritten flag happens to be set. So it then calls complete_unwritten
> with a garbage bh and crashes.
>
> If I move the memset(&bh) up in the code, before the goto fallbacks can
> occur, I can fix the crash.  I don't know if this is really the best way to
> fix the problem though.

I believe you are hitting the same issue that Matthew hit here:

https://patchwork.kernel.org/patch/7763851/

I have it fixed up in the latest that I pushed out last night to
libnvdimm-pending.  Note the libnvdimm-pending branch is now based on
linux-next as I needed to resolve collisions with transparent huge
page work pending in the -mm tree.

> However, unfortunately, fixing the above just uncovered another issue. Now
> the MR de-registration seems to have completed but the task hangs when it's
> trying to munmap the memory. (Stack trace at the end of this email.)
>
> It looks like the task is hanging on i_mmap_lock_write in unlink_file_vma.
> I'm not really sure how to go about debugging this lock issue. If you have
> any steps I can try to get you more information, let me know. I'm also happy
> to re-test if you have any other changes you'd like me to try.

I worked through a crop of hangs and crashes triggered by Toshi's mmap
test.  Give the latest a try if you get a chance and I'll fix it up if
it still occurs.  I'll be pushing an updated branch again tonight with
fixes for issues uncovered while running the nvml test suite.

Thanks Logan!


^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2015-12-08  0:48 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-10  0:55 [PATCH v2 00/20] get_user_pages() for dax mappings Dan Williams
2015-10-10  0:55 ` [PATCH v2 01/20] block: generic request_queue reference counting Dan Williams
2015-10-11 12:59   ` Christoph Hellwig
2015-10-13  0:09     ` Dan Williams
2015-10-10  0:55 ` [PATCH v2 02/20] dax: increase granularity of dax_clear_blocks() operations Dan Williams
2015-10-10  0:55 ` [PATCH v2 03/20] block, dax: fix lifetime of in-kernel dax mappings with dax_map_atomic() Dan Williams
2015-10-10  0:55 ` [PATCH v2 04/20] mm: introduce __get_dev_pagemap() Dan Williams
2015-10-10  0:55 ` [PATCH v2 05/20] x86, mm: introduce vmem_altmap to augment vmemmap_populate() Dan Williams
2015-10-19 22:53   ` Williams, Dan J
2015-10-10  0:55 ` [PATCH v2 06/20] libnvdimm, pfn, pmem: allocate memmap array in persistent memory Dan Williams
2015-10-10  0:56 ` [PATCH v2 07/20] avr32: convert to asm-generic/memory_model.h Dan Williams
2015-10-10  0:56 ` [PATCH v2 08/20] hugetlb: fix compile error on tile Dan Williams
2015-10-10  0:56 ` [PATCH v2 09/20] frv: fix compiler warning from definition of __pmd() Dan Williams
2015-10-10  0:56 ` [PATCH v2 10/20] um: kill pfn_t Dan Williams
2015-10-10  0:56 ` [PATCH v2 11/20] kvm: rename pfn_t to kvm_pfn_t Dan Williams
2015-10-10 15:35   ` Christoffer Dall
2015-10-10 20:35   ` Paolo Bonzini
2015-10-10 20:57     ` Dan Williams
2015-10-12 12:51       ` Paolo Bonzini
2015-10-12 16:16         ` Dan Williams
2015-10-10  0:56 ` [PATCH v2 12/20] mips: fix PAGE_MASK definition Dan Williams
2015-10-10  0:56 ` [PATCH v2 13/20] mm, dax, pmem: introduce pfn_t Dan Williams
2015-10-10  0:56 ` [PATCH v2 14/20] mm, dax, gpu: convert vm_insert_mixed to pfn_t, introduce _PAGE_DEVMAP Dan Williams
2015-10-10  0:56 ` [PATCH v2 15/20] mm, dax: convert vmf_insert_pfn_pmd() to pfn_t Dan Williams
2015-10-10  0:56 ` [PATCH v2 16/20] list: introduce list_poison() and LIST_POISON3 Dan Williams
2015-10-10  0:56 ` [PATCH v2 17/20] mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gup Dan Williams
2015-10-10  0:57 ` [PATCH v2 18/20] block: notify queue death confirmation Dan Williams
2015-10-10  0:57 ` [PATCH v2 19/20] mm, pmem: devm_memunmap_pages(), truncate and unmap ZONE_DEVICE pages Dan Williams
2015-10-10  0:57 ` [PATCH v2 20/20] mm, x86: get_user_pages() for dax mappings Dan Williams
2015-10-23 21:06 ` [PATCH v2 00/20] " Logan Gunthorpe
2015-11-30 22:15   ` Dan Williams
2015-12-02 22:02     ` Logan Gunthorpe
2015-12-02 22:04       ` Dan Williams
2015-12-04  2:16       ` Dan Williams
2015-12-05  1:58         ` Logan Gunthorpe
2015-12-08  0:00           ` Logan Gunthorpe
2015-12-08  0:48             ` Dan Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).