* [PATCH 00/13] mmu_notifier kill invalidate_page callback
From: Jérôme Glisse @ 2017-08-29 23:54 UTC
  To: linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg
  Cc: Jérôme Glisse, Kirill A . Shutemov, Linus Torvalds,
	Andrew Morton, Andrea Arcangeli, Joerg Roedel, Dan Williams,
	Sudeep Dutt, Ashutosh Dixit, Dimitri Sivanich, Jack Steiner,
	Paolo Bonzini, Radim Krčmář,
	linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	xen-devel-GuqFBffKawtpuQazS67q72D2FQJk+8+b, kvm

(Sorry for cross-posting to so many lists and for the big Cc)

Please help with testing!

The invalidate_page callback suffered from 2 pitfalls. First, it used to
happen after the page table lock was released, and thus a new page might
have been set up for the virtual address before the call to
invalidate_page().

This was fixed, in a weird way, by c7ab0d2fdc840266b39db94538f74207ec2afbf6
which moved the callback under the page table lock. That commit also broke
several existing users of the mmu_notifier API that assumed they could
sleep inside this callback.

The second pitfall was that invalidate_page was the only callback that did
not take a range of addresses to invalidate; instead it was given an
address and a page. A lot of the callback implementers assumed this could
never be a THP and thus failed to invalidate the appropriate range for THP
pages.
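
To illustrate the THP pitfall, here is a minimal userspace sketch of how
the range to invalidate grows with the size of the page backing an
address. The helper range_for_page(), the struct inval_range and the
4 KiB base page size are assumptions made for this example; they are not
taken from the patches.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Hypothetical half-open range [start, end) of virtual addresses. */
struct inval_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Hypothetical helper: compute the range covered by the page backing
 * 'address'. 'order' is log2 of the number of base pages in that page:
 * 0 for a normal page, 9 for a 2 MiB huge page with 4 KiB base pages.
 */
struct inval_range range_for_page(uintptr_t address, unsigned int order)
{
	struct inval_range r;
	uintptr_t size;

	assert(order < 20);	/* keep the shift well defined on 32-bit */
	size = (uintptr_t)1 << (PAGE_SHIFT + order);
	r.start = address & ~(size - 1);	/* align down to page size */
	r.end = r.start + size;
	return r;
}
```

A callback that always behaves as if order were 0 only covers one base
page, which is the kind of under-invalidation described above when the
mapping is really a huge page.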

By killing this callback we unify the mmu_notifier callback API to always
take a virtual address range as input.

There are now 2 clear APIs (I am not mentioning the youngness API, which
is seldom used):
  - invalidate_range_start()/end() callbacks (which allow you to sleep)
  - invalidate_range(), where you can not sleep but which happens right
    after the page table update, under the page table lock


Note that a lot of existing users look broken with respect to
range_start/range_end. Many users only have a range_start() callback, but
nothing prevents them from undoing what was invalidated in their
range_start() callback after it returns but before any CPU page table
update takes place.

The code pattern used in kvm or umem odp is an example of how to properly
avoid such a race. In a nutshell, use some kind of sequence number and an
active range invalidation counter to block anything that might undo what
the range_start() callback did.
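
A minimal single-threaded sketch of the sequence number plus active
invalidation counter idea follows. It is only loosely modeled on that
pattern: struct mirror and all function names are hypothetical, and the
locking and memory barriers a real driver needs are deliberately left
out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical per-device state. In a real driver every access below
 * would happen under the driver's own lock (omitted here).
 */
struct mirror {
	unsigned long seq;	/* bumped each time an invalidation ends */
	unsigned long active;	/* invalidations currently in flight */
};

/* Called from the range_start() callback. */
void mirror_range_start(struct mirror *m)
{
	m->active++;
}

/* Called from the range_end() callback. */
void mirror_range_end(struct mirror *m)
{
	assert(m->active > 0);
	m->seq++;
	m->active--;
}

/* Fault path: snapshot the sequence before looking up the CPU pages. */
unsigned long mirror_fault_begin(const struct mirror *m)
{
	return m->seq;
}

/*
 * Fault path: before installing a device mapping, check that no
 * invalidation is running and that none completed since the snapshot.
 * If this returns true the mapping must not be installed; retry.
 */
bool mirror_fault_must_retry(const struct mirror *m, unsigned long snap)
{
	return m->active != 0 || m->seq != snap;
}
```

The fault path takes a snapshot, does its (possibly sleeping) lookup,
then rechecks before committing, so it can not silently re-establish a
mapping that a concurrent range_start() just tore down.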

If you do not care about staying fully in sync with the CPU page table
(i.e. you can live with the CPU page table pointing to a new, different
page for a given virtual address) then you can take a reference on the
pages inside the range_start callback and drop it in range_end or when
your driver is done with those pages.

The last alternative is to use invalidate_range() if you can do the
invalidation without sleeping, as the invalidate_range() callback happens
under the CPU page table spinlock right after the page table is updated.


Note this is barely tested. I intend to do more testing over the next few
days but I do not have access to all the hardware that makes use of the
mmu_notifier API.


The first 2 patches convert existing calls to mmu_notifier_invalidate_page()
into calls to mmu_notifier_invalidate_range() and bracket those calls with
calls to mmu_notifier_invalidate_range_start()/end().
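
As a rough picture of the resulting call order, the userspace mock below
records the steps of one converted call site. It does not call any
kernel function: the enum, struct trace and clear_one_mapping() are
hypothetical names, and the ordering is inferred from the description
above (range_start/end may sleep, so they sit outside the page table
lock; invalidate_range sits inside it, right after the update).

```c
#include <stddef.h>

/* Steps of a converted call site, in the order they are expected. */
enum step {
	RANGE_START,		/* notifier range_start, may sleep */
	PT_LOCK,		/* take the page table lock */
	PTE_UPDATE,		/* change the page table entry */
	INVALIDATE_RANGE,	/* notifier invalidate_range, no sleeping */
	PT_UNLOCK,		/* drop the page table lock */
	RANGE_END		/* notifier range_end, may sleep */
};

#define MAX_STEPS 8

struct trace {
	enum step steps[MAX_STEPS];
	size_t n;
};

/* Append one step to the trace, ignoring overflow. */
void record(struct trace *t, enum step s)
{
	if (t->n < MAX_STEPS)
		t->steps[t->n++] = s;
}

/* Mock of one converted call site; it only records the ordering. */
void clear_one_mapping(struct trace *t)
{
	record(t, RANGE_START);
	record(t, PT_LOCK);
	record(t, PTE_UPDATE);
	record(t, INVALIDATE_RANGE);
	record(t, PT_UNLOCK);
	record(t, RANGE_END);
}
```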

The next 10 patches remove the existing invalidate_page() callbacks, as
they can no longer be called.

Finally, the last patch removes it completely so it can RIP.

Jérôme Glisse (13):
  dax: update to new mmu_notifier semantic
  mm/rmap: update to new mmu_notifier semantic
  powerpc/powernv: update to new mmu_notifier semantic
  drm/amdgpu: update to new mmu_notifier semantic
  IB/umem: update to new mmu_notifier semantic
  IB/hfi1: update to new mmu_notifier semantic
  iommu/amd: update to new mmu_notifier semantic
  iommu/intel: update to new mmu_notifier semantic
  misc/mic/scif: update to new mmu_notifier semantic
  sgi-gru: update to new mmu_notifier semantic
  xen/gntdev: update to new mmu_notifier semantic
  KVM: update to new mmu_notifier semantic
  mm/mmu_notifier: kill invalidate_page

Cc: Kirill A. Shutemov <kirill.shutemov-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Cc: Linus Torvalds <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: Andrea Arcangeli <aarcange-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Joerg Roedel <jroedel-l3A5Bk7waGM@public.gmane.org>
Cc: Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Sudeep Dutt <sudeep.dutt-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Ashutosh Dixit <ashutosh.dixit-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Dimitri Sivanich <sivanich-sJ/iWh9BUns@public.gmane.org>
Cc: Jack Steiner <steiner-sJ/iWh9BUns@public.gmane.org>
Cc: Paolo Bonzini <pbonzini-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Radim Krčmář <rkrcmar-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Cc: linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
Cc: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Cc: xen-devel-GuqFBffKawtpuQazS67q72D2FQJk+8+b@public.gmane.org
Cc: kvm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org


 arch/powerpc/platforms/powernv/npu-dma.c | 10 --------
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c   | 31 ----------------------
 drivers/infiniband/core/umem_odp.c       | 19 --------------
 drivers/infiniband/hw/hfi1/mmu_rb.c      |  9 -------
 drivers/iommu/amd_iommu_v2.c             |  8 ------
 drivers/iommu/intel-svm.c                |  9 -------
 drivers/misc/mic/scif/scif_dma.c         | 11 --------
 drivers/misc/sgi-gru/grutlbpurge.c       | 12 ---------
 drivers/xen/gntdev.c                     |  8 ------
 fs/dax.c                                 | 19 ++++++++------
 include/linux/mm.h                       |  1 +
 include/linux/mmu_notifier.h             | 25 ------------------
 mm/memory.c                              | 26 +++++++++++++++----
 mm/mmu_notifier.c                        | 14 ----------
 mm/rmap.c                                | 44 +++++++++++++++++++++++++++++---
 virt/kvm/kvm_main.c                      | 42 ------------------------------
 16 files changed, 74 insertions(+), 214 deletions(-)

-- 
2.13.5


