[PATCH hmm v3 00/14] Consolidate the mmu notifier interval_tree and locking

* [PATCH hmm v3 00/14] Consolidate the mmu notifier interval_tree and locking
@ 2019-11-12 20:22 Jason Gunthorpe
From: Jason Gunthorpe @ 2019-11-12 20:22 UTC
  To: linux-mm, Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: linux-rdma, dri-devel, amd-gfx, Alex Deucher, Ben Skeggs,
	Boris Ostrovsky, Christian König, David Zhou,
	Dennis Dalessandro, Juergen Gross, Mike Marciniszyn,
	Oleksandr Andrushchenko, Petr Cvek, Stefano Stabellini, nouveau,
	xen-devel, Christoph Hellwig, Jason Gunthorpe

From: Jason Gunthorpe <jgg@mellanox.com>

8 of the mmu_notifier-using drivers (i915_gem, radeon_mn, umem_odp, hfi1,
scif_dma, vhost, gntdev, hmm) follow a common pattern: they only use
invalidate_range_start/end and immediately check the invalidating range
against some driver data structure to tell if the driver is interested.
Half of them use an interval_tree, the others use simple linear search
lists.
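
As a rough userspace illustration of the linear-search variant of this
pattern (not code from any of the listed drivers; struct tracked_range,
ranges_overlap() and count_hits() are hypothetical names, and ranges are
assumed to be half-open [start, end)):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical driver-side record of a tracked user range: [start, end). */
struct tracked_range {
	unsigned long start;
	unsigned long end;	/* exclusive */
};

/* True if two half-open ranges share at least one address. */
static bool ranges_overlap(unsigned long a_start, unsigned long a_end,
			   unsigned long b_start, unsigned long b_end)
{
	return a_start < b_end && b_start < a_end;
}

/*
 * Linear-search check a driver might run from its invalidate callback:
 * count how many tracked ranges the invalidation [inv_start, inv_end) hits.
 */
static size_t count_hits(const struct tracked_range *ranges, size_t n,
			 unsigned long inv_start, unsigned long inv_end)
{
	size_t hits = 0;

	assert(inv_start < inv_end);	/* an empty invalidation makes no sense */
	for (size_t i = 0; i < n; i++)
		if (ranges_overlap(ranges[i].start, ranges[i].end,
				   inv_start, inv_end))
			hits++;
	return hits;
}
```

An interval tree answers the same question without visiting every entry,
which is why it is the better fit once a driver tracks many ranges.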

Most of the ones I checked seem to have various kinds of races, bugs and
poor implementation. This is a result of the complexity in how the
notifier interacts with get_user_pages(). It is extremely difficult to
use it correctly.

Consolidate all of this code together into the core mmu_notifier and
provide a locking scheme similar to hmm_mirror that allows the user to
safely use get_user_pages() and reliably know if the page list still
matches the mm.
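
The idea of "reliably know if the page list still matches the mm" can be
modelled with a sequence counter that every overlapping invalidation
bumps. The following is a single-threaded userspace sketch of that idea
only, under the assumption that a plain counter is a fair stand-in; it
omits all locking, and struct range_notifier, read_begin(), read_retry(),
invalidate() and fault_range() are hypothetical names, not the API added
by this series:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for a per-range notifier (hypothetical). */
struct range_notifier {
	unsigned long invalidate_seq;	/* bumped by each overlapping invalidation */
};

/* Reader side: sample the sequence before collecting pages. */
static unsigned long read_begin(const struct range_notifier *rn)
{
	return rn->invalidate_seq;
}

/* Reader side: true if an invalidation ran since read_begin(). */
static bool read_retry(const struct range_notifier *rn, unsigned long seq)
{
	return rn->invalidate_seq != seq;
}

/* Invalidation side: mark any page list collected so far as stale. */
static void invalidate(struct range_notifier *rn)
{
	rn->invalidate_seq++;
}

/*
 * Retry loop: collect pages, then check whether an invalidation raced.
 * 'racing' simulates how many invalidations land while pages are being
 * collected. Returns the number of attempts that were needed.
 */
static unsigned int fault_range(struct range_notifier *rn, unsigned int racing)
{
	unsigned int attempts = 0;
	unsigned long seq;

	assert(rn);
	do {
		seq = read_begin(rn);
		attempts++;
		/* a get_user_pages()-style page collection would run here */
		if (racing) {
			invalidate(rn);	/* simulated concurrent invalidation */
			racing--;
		}
	} while (read_retry(rn, seq));
	return attempts;
}
```

In a real driver the final retry check would have to happen under a lock
that the invalidation path also takes, so that the page list cannot go
stale between the check and its use; the sketch leaves that out.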

This new arrangement plays nicely with the !blockable mode for
OOM. Scanning the interval tree is done such that the intersection test
will always succeed, and since there is no invalidate_range_end exposed to
drivers the scheme safely allows multiple drivers to be subscribed.

Four places are converted as an example of how the new API is used.
Four are left for future patches:
 - i915_gem has complex locking around destruction of a registration,
   needs more study
 - hfi1 (2nd user) needs access to the rbtree
 - scif_dma has a complicated logic flow
 - vhost's mmu notifiers are already being rewritten

This is already in linux-next, a git tree is available here:

 https://github.com/jgunthorpe/linux/commits/mmu_notifier

v3:
- Rename mmu_range_notifier to mmu_interval_notifier for clarity
  Avoids confusion with struct mmu_notifier_range
- Fix bugs in odp, amdgpu and xen gntdev from testing
- Make ops an argument to mmu_interval_notifier_insert() to make it
  harder to misuse
- Update many comments
- Add testing of mm_count during insertion

v2: https://lore.kernel.org/r/20191028201032.6352-1-jgg@ziepe.ca
v1: https://lore.kernel.org/r/20191015181242.8343-1-jgg@ziepe.ca

Absent any new discussion I think this will go to Linus at the next merge
window.

Thanks to everyone who helped!

Jason Gunthorpe (14):
  mm/mmu_notifier: define the header pre-processor parts even if
    disabled
  mm/mmu_notifier: add an interval tree notifier
  mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or
    hmm_mirror
  mm/hmm: define the pre-processor related parts of hmm.h even if
    disabled
  RDMA/odp: Use mmu_interval_notifier_insert()
  RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv
  drm/radeon: use mmu_interval_notifier_insert
  nouveau: use mmu_notifier directly for invalidate_range_start
  nouveau: use mmu_interval_notifier instead of hmm_mirror
  drm/amdgpu: Call find_vma under mmap_sem
  drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror
  drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror
  mm/hmm: remove hmm_mirror and related
  xen/gntdev: use mmu_interval_notifier_insert

 Documentation/vm/hmm.rst                      | 105 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h           |   2 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   9 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |  14 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c        | 443 ++------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h        |  53 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h    |  13 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       | 145 +++--
 drivers/gpu/drm/nouveau/nouveau_svm.c         | 230 ++++---
 drivers/gpu/drm/radeon/radeon.h               |   9 +-
 drivers/gpu/drm/radeon/radeon_mn.c            | 218 ++-----
 drivers/infiniband/core/device.c              |   1 -
 drivers/infiniband/core/umem_odp.c            | 303 ++--------
 drivers/infiniband/hw/hfi1/file_ops.c         |   2 +-
 drivers/infiniband/hw/hfi1/hfi.h              |   2 +-
 drivers/infiniband/hw/hfi1/user_exp_rcv.c     | 146 ++---
 drivers/infiniband/hw/hfi1/user_exp_rcv.h     |   3 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h          |   7 +-
 drivers/infiniband/hw/mlx5/mr.c               |   3 +-
 drivers/infiniband/hw/mlx5/odp.c              |  50 +-
 drivers/xen/gntdev-common.h                   |   8 +-
 drivers/xen/gntdev.c                          | 179 ++----
 include/linux/hmm.h                           | 195 +-----
 include/linux/mmu_notifier.h                  | 147 ++++-
 include/rdma/ib_umem_odp.h                    |  68 +--
 include/rdma/ib_verbs.h                       |   2 -
 kernel/fork.c                                 |   1 -
 mm/Kconfig                                    |   2 +-
 mm/hmm.c                                      | 276 +--------
 mm/mmu_notifier.c                             | 565 +++++++++++++++++-
 31 files changed, 1271 insertions(+), 1931 deletions(-)

-- 
2.24.0
