linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/8] mmu notifier provide context informations
@ 2019-03-26 16:47 jglisse
  2019-03-26 16:47 ` [PATCH v6 1/8] mm/mmu_notifier: helper to test if a range invalidation is blockable jglisse
                   ` (9 more replies)
  0 siblings, 10 replies; 18+ messages in thread
From: jglisse @ 2019-03-26 16:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jérôme Glisse, Andrew Morton, linux-mm,
	Christian König, Joonas Lahtinen, Jani Nikula, Rodrigo Vivi,
	Jan Kara, Andrea Arcangeli, Peter Xu, Felix Kuehling,
	Jason Gunthorpe, Ross Zwisler, Dan Williams, Paolo Bonzini,
	Alex Deucher, Radim Krčmář,
	Michal Hocko, Ben Skeggs, Ralph Campbell, John Hubbard, kvm,
	dri-devel, linux-rdma, Arnd Bergmann

From: Jérôme Glisse <jglisse@redhat.com>

(Andrew this apply on top of my HMM patchset as otherwise you will have
 conflict with changes to mm/hmm.c)

Changes since v5:
    - drop KVM bits waiting for KVM people to express interest if they
      do not then i will post patchset to remove change_pte_notify as
      without the changes in v5 change_pte_notify is just useless (it
      it is useless today upstream it is just wasting cpu cycles)
    - rebase on top of lastest Linus tree

Previous cover letter with minor update:


Here i am not posting users of this, they already have been posted to
appropriate mailing list [6] and will be merge through the appropriate
tree once this patchset is upstream.

Note that this serie does not change any behavior for any existing
code. It just pass down more information to mmu notifier listener.

The rational for this patchset:

CPU page table update can happens for many reasons, not only as a
result of a syscall (munmap(), mprotect(), mremap(), madvise(), ...)
but also as a result of kernel activities (memory compression, reclaim,
migration, ...).

This patch introduce a set of enums that can be associated with each
of the events triggering a mmu notifier:

    - UNMAP: munmap() or mremap()
    - CLEAR: page table is cleared (migration, compaction, reclaim, ...)
    - PROTECTION_VMA: change in access protections for the range
    - PROTECTION_PAGE: change in access protections for page in the range
    - SOFT_DIRTY: soft dirtyness tracking

Being able to identify munmap() and mremap() from other reasons why the
page table is cleared is important to allow user of mmu notifier to
update their own internal tracking structure accordingly (on munmap or
mremap it is not longer needed to track range of virtual address as it
becomes invalid). Without this serie, driver are force to assume that
every notification is an munmap which triggers useless trashing within
drivers that associate structure with range of virtual address. Each
driver is force to free up its tracking structure and then restore it
on next device page fault. With this serie we can also optimize device
page table update [6].

More over this can also be use to optimize out some page table updates
like for KVM where we can update the secondary MMU directly from the
callback instead of clearing it.

ACKS AMD/RADEON https://lkml.org/lkml/2019/2/1/395
ACKS RDMA https://lkml.org/lkml/2018/12/6/1473

Cheers,
Jérôme

[1] v1 https://lkml.org/lkml/2018/3/23/1049
[2] v2 https://lkml.org/lkml/2018/12/5/10
[3] v3 https://lkml.org/lkml/2018/12/13/620
[4] v4 https://lkml.org/lkml/2019/1/23/838
[5] v5 https://lkml.org/lkml/2019/2/19/752
[6] patches to use this:
    https://lkml.org/lkml/2019/1/23/833
    https://lkml.org/lkml/2019/1/23/834
    https://lkml.org/lkml/2019/1/23/832
    https://lkml.org/lkml/2019/1/23/831

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Cc: Christian König <christian.koenig@amd.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kvm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-rdma@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>

Jérôme Glisse (8):
  mm/mmu_notifier: helper to test if a range invalidation is blockable
  mm/mmu_notifier: convert user range->blockable to helper function
  mm/mmu_notifier: convert mmu_notifier_range->blockable to a flags
  mm/mmu_notifier: contextual information for event enums
  mm/mmu_notifier: contextual information for event triggering
    invalidation v2
  mm/mmu_notifier: use correct mmu_notifier events for each invalidation
  mm/mmu_notifier: pass down vma and reasons why mmu notifier is
    happening v2
  mm/mmu_notifier: mmu_notifier_range_update_to_read_only() helper

 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c  |  8 ++--
 drivers/gpu/drm/i915/i915_gem_userptr.c |  2 +-
 drivers/gpu/drm/radeon/radeon_mn.c      |  4 +-
 drivers/infiniband/core/umem_odp.c      |  5 +-
 drivers/xen/gntdev.c                    |  6 +--
 fs/proc/task_mmu.c                      |  3 +-
 include/linux/mmu_notifier.h            | 63 +++++++++++++++++++++++--
 kernel/events/uprobes.c                 |  3 +-
 mm/hmm.c                                |  6 +--
 mm/huge_memory.c                        | 14 +++---
 mm/hugetlb.c                            | 12 +++--
 mm/khugepaged.c                         |  3 +-
 mm/ksm.c                                |  6 ++-
 mm/madvise.c                            |  3 +-
 mm/memory.c                             | 25 ++++++----
 mm/migrate.c                            |  5 +-
 mm/mmu_notifier.c                       | 12 ++++-
 mm/mprotect.c                           |  4 +-
 mm/mremap.c                             |  3 +-
 mm/oom_kill.c                           |  3 +-
 mm/rmap.c                               |  6 ++-
 virt/kvm/kvm_main.c                     |  3 +-
 22 files changed, 147 insertions(+), 52 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v6 1/8] mm/mmu_notifier: helper to test if a range invalidation is blockable
  2019-03-26 16:47 [PATCH v6 0/8] mmu notifier provide context informations jglisse
@ 2019-03-26 16:47 ` jglisse
  2019-03-26 16:47 ` [PATCH v6 2/8] mm/mmu_notifier: convert user range->blockable to helper function jglisse
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: jglisse @ 2019-03-26 16:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jérôme Glisse, Andrew Morton, linux-mm,
	Christian König, Joonas Lahtinen, Jani Nikula, Rodrigo Vivi,
	Jan Kara, Andrea Arcangeli, Peter Xu, Felix Kuehling,
	Jason Gunthorpe, Ross Zwisler, Dan Williams, Paolo Bonzini,
	Radim Krčmář,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, linux-fsdevel, Arnd Bergmann

From: Jérôme Glisse <jglisse@redhat.com>

Simple helpers to test if range invalidation is blockable. Latter
patches use cocinnelle to convert all direct dereference of range->
blockable to use this function instead so that we can convert the
blockable field to an unsigned for more flags.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Cc: Christian König <christian.koenig@amd.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kvm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
---
 include/linux/mmu_notifier.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 4050ec1c3b45..e630def131ce 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -226,6 +226,12 @@ extern void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *r,
 extern void __mmu_notifier_invalidate_range(struct mm_struct *mm,
 				  unsigned long start, unsigned long end);
 
+static inline bool
+mmu_notifier_range_blockable(const struct mmu_notifier_range *range)
+{
+	return range->blockable;
+}
+
 static inline void mmu_notifier_release(struct mm_struct *mm)
 {
 	if (mm_has_notifiers(mm))
@@ -455,6 +461,11 @@ static inline void _mmu_notifier_range_init(struct mmu_notifier_range *range,
 #define mmu_notifier_range_init(range, mm, start, end) \
 	_mmu_notifier_range_init(range, start, end)
 
+static inline bool
+mmu_notifier_range_blockable(const struct mmu_notifier_range *range)
+{
+	return true;
+}
 
 static inline int mm_has_notifiers(struct mm_struct *mm)
 {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 2/8] mm/mmu_notifier: convert user range->blockable to helper function
  2019-03-26 16:47 [PATCH v6 0/8] mmu notifier provide context informations jglisse
  2019-03-26 16:47 ` [PATCH v6 1/8] mm/mmu_notifier: helper to test if a range invalidation is blockable jglisse
@ 2019-03-26 16:47 ` jglisse
  2019-03-26 16:47 ` [PATCH v6 3/8] mm/mmu_notifier: convert mmu_notifier_range->blockable to a flags jglisse
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: jglisse @ 2019-03-26 16:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jérôme Glisse, Andrew Morton, linux-mm,
	Christian König, Joonas Lahtinen, Jani Nikula, Rodrigo Vivi,
	Jan Kara, Andrea Arcangeli, Peter Xu, Felix Kuehling,
	Jason Gunthorpe, Ross Zwisler, Dan Williams, Paolo Bonzini,
	Radim Krčmář,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

From: Jérôme Glisse <jglisse@redhat.com>

Use the mmu_notifier_range_blockable() helper function instead of
directly dereferencing the range->blockable field. This is done to
make it easier to change the mmu_notifier range field.

This patch is the outcome of the following coccinelle patch:

%<-------------------------------------------------------------------
@@
identifier I1, FN;
@@
FN(..., struct mmu_notifier_range *I1, ...) {
<...
-I1->blockable
+mmu_notifier_range_blockable(I1)
...>
}
------------------------------------------------------------------->%

spatch --in-place --sp-file blockable.spatch --dir .

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Cc: Christian König <christian.koenig@amd.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kvm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-rdma@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c  | 8 ++++----
 drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
 drivers/gpu/drm/radeon/radeon_mn.c      | 4 ++--
 drivers/infiniband/core/umem_odp.c      | 5 +++--
 drivers/xen/gntdev.c                    | 6 +++---
 mm/hmm.c                                | 6 +++---
 mm/mmu_notifier.c                       | 2 +-
 virt/kvm/kvm_main.c                     | 3 ++-
 8 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 3e6823fdd939..58ed401c5996 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -256,14 +256,14 @@ static int amdgpu_mn_invalidate_range_start_gfx(struct mmu_notifier *mn,
 	/* TODO we should be able to split locking for interval tree and
 	 * amdgpu_mn_invalidate_node
 	 */
-	if (amdgpu_mn_read_lock(amn, range->blockable))
+	if (amdgpu_mn_read_lock(amn, mmu_notifier_range_blockable(range)))
 		return -EAGAIN;
 
 	it = interval_tree_iter_first(&amn->objects, range->start, end);
 	while (it) {
 		struct amdgpu_mn_node *node;
 
-		if (!range->blockable) {
+		if (!mmu_notifier_range_blockable(range)) {
 			amdgpu_mn_read_unlock(amn);
 			return -EAGAIN;
 		}
@@ -299,7 +299,7 @@ static int amdgpu_mn_invalidate_range_start_hsa(struct mmu_notifier *mn,
 	/* notification is exclusive, but interval is inclusive */
 	end = range->end - 1;
 
-	if (amdgpu_mn_read_lock(amn, range->blockable))
+	if (amdgpu_mn_read_lock(amn, mmu_notifier_range_blockable(range)))
 		return -EAGAIN;
 
 	it = interval_tree_iter_first(&amn->objects, range->start, end);
@@ -307,7 +307,7 @@ static int amdgpu_mn_invalidate_range_start_hsa(struct mmu_notifier *mn,
 		struct amdgpu_mn_node *node;
 		struct amdgpu_bo *bo;
 
-		if (!range->blockable) {
+		if (!mmu_notifier_range_blockable(range)) {
 			amdgpu_mn_read_unlock(amn);
 			return -EAGAIN;
 		}
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 1d3f9a31ad61..777b3f8727e7 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -122,7 +122,7 @@ userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
 	while (it) {
 		struct drm_i915_gem_object *obj;
 
-		if (!range->blockable) {
+		if (!mmu_notifier_range_blockable(range)) {
 			ret = -EAGAIN;
 			break;
 		}
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
index b3019505065a..c9bd1278f573 100644
--- a/drivers/gpu/drm/radeon/radeon_mn.c
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -133,7 +133,7 @@ static int radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
 	/* TODO we should be able to split locking for interval tree and
 	 * the tear down.
 	 */
-	if (range->blockable)
+	if (mmu_notifier_range_blockable(range))
 		mutex_lock(&rmn->lock);
 	else if (!mutex_trylock(&rmn->lock))
 		return -EAGAIN;
@@ -144,7 +144,7 @@ static int radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
 		struct radeon_bo *bo;
 		long r;
 
-		if (!range->blockable) {
+		if (!mmu_notifier_range_blockable(range)) {
 			ret = -EAGAIN;
 			goto out_unlock;
 		}
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index e6ec79ad9cc8..59ef912fbe03 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -152,7 +152,7 @@ static int ib_umem_notifier_invalidate_range_start(struct mmu_notifier *mn,
 	struct ib_ucontext_per_mm *per_mm =
 		container_of(mn, struct ib_ucontext_per_mm, mn);
 
-	if (range->blockable)
+	if (mmu_notifier_range_blockable(range))
 		down_read(&per_mm->umem_rwsem);
 	else if (!down_read_trylock(&per_mm->umem_rwsem))
 		return -EAGAIN;
@@ -170,7 +170,8 @@ static int ib_umem_notifier_invalidate_range_start(struct mmu_notifier *mn,
 	return rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, range->start,
 					     range->end,
 					     invalidate_range_start_trampoline,
-					     range->blockable, NULL);
+					     mmu_notifier_range_blockable(range),
+					     NULL);
 }
 
 static int invalidate_range_end_trampoline(struct ib_umem_odp *item, u64 start,
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 7cf9c51318aa..8012ab7f5120 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -526,20 +526,20 @@ static int mn_invl_range_start(struct mmu_notifier *mn,
 	struct gntdev_grant_map *map;
 	int ret = 0;
 
-	if (range->blockable)
+	if (mmu_notifier_range_blockable(range))
 		mutex_lock(&priv->lock);
 	else if (!mutex_trylock(&priv->lock))
 		return -EAGAIN;
 
 	list_for_each_entry(map, &priv->maps, next) {
 		ret = unmap_if_in_range(map, range->start, range->end,
-					range->blockable);
+					mmu_notifier_range_blockable(range));
 		if (ret)
 			goto out_unlock;
 	}
 	list_for_each_entry(map, &priv->freeable_maps, next) {
 		ret = unmap_if_in_range(map, range->start, range->end,
-					range->blockable);
+					mmu_notifier_range_blockable(range));
 		if (ret)
 			goto out_unlock;
 	}
diff --git a/mm/hmm.c b/mm/hmm.c
index fd143251b157..25f372da325a 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -201,9 +201,9 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 	update.start = nrange->start;
 	update.end = nrange->end;
 	update.event = HMM_UPDATE_INVALIDATE;
-	update.blockable = nrange->blockable;
+	update.blockable = mmu_notifier_range_blockable(nrange);
 
-	if (nrange->blockable)
+	if (mmu_notifier_range_blockable(nrange))
 		mutex_lock(&hmm->lock);
 	else if (!mutex_trylock(&hmm->lock)) {
 		ret = -EAGAIN;
@@ -218,7 +218,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 	}
 	mutex_unlock(&hmm->lock);
 
-	if (nrange->blockable)
+	if (mmu_notifier_range_blockable(nrange))
 		down_read(&hmm->mirrors_sem);
 	else if (!down_read_trylock(&hmm->mirrors_sem)) {
 		ret = -EAGAIN;
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 9c884abc7850..abd88c466eb2 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -180,7 +180,7 @@ int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 			if (_ret) {
 				pr_info("%pS callback failed with %d in %sblockable context.\n",
 					mn->ops->invalidate_range_start, _ret,
-					!range->blockable ? "non-" : "");
+					!mmu_notifier_range_blockable(range) ? "non-" : "");
 				ret = _ret;
 			}
 		}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f25aa98a94df..16edfc3e0d1a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -391,7 +391,8 @@ static int kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
 	spin_unlock(&kvm->mmu_lock);
 
 	ret = kvm_arch_mmu_notifier_invalidate_range(kvm, range->start,
-					range->end, range->blockable);
+					range->end,
+					mmu_notifier_range_blockable(range));
 
 	srcu_read_unlock(&kvm->srcu, idx);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 3/8] mm/mmu_notifier: convert mmu_notifier_range->blockable to a flags
  2019-03-26 16:47 [PATCH v6 0/8] mmu notifier provide context informations jglisse
  2019-03-26 16:47 ` [PATCH v6 1/8] mm/mmu_notifier: helper to test if a range invalidation is blockable jglisse
  2019-03-26 16:47 ` [PATCH v6 2/8] mm/mmu_notifier: convert user range->blockable to helper function jglisse
@ 2019-03-26 16:47 ` jglisse
  2019-03-26 16:47 ` [PATCH v6 4/8] mm/mmu_notifier: contextual information for event enums jglisse
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: jglisse @ 2019-03-26 16:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jérôme Glisse, Andrew Morton, linux-mm,
	Christian König, Joonas Lahtinen, Jani Nikula, Rodrigo Vivi,
	Jan Kara, Andrea Arcangeli, Peter Xu, Felix Kuehling,
	Jason Gunthorpe, Ross Zwisler, Dan Williams, Paolo Bonzini,
	Radim Krčmář,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

From: Jérôme Glisse <jglisse@redhat.com>

Use an unsigned field for flags other than blockable and convert
the blockable field to be one of those flags.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Cc: Christian König <christian.koenig@amd.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kvm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-rdma@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
---
 include/linux/mmu_notifier.h | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index e630def131ce..c8672c366f67 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -25,11 +25,13 @@ struct mmu_notifier_mm {
 	spinlock_t lock;
 };
 
+#define MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0)
+
 struct mmu_notifier_range {
 	struct mm_struct *mm;
 	unsigned long start;
 	unsigned long end;
-	bool blockable;
+	unsigned flags;
 };
 
 struct mmu_notifier_ops {
@@ -229,7 +231,7 @@ extern void __mmu_notifier_invalidate_range(struct mm_struct *mm,
 static inline bool
 mmu_notifier_range_blockable(const struct mmu_notifier_range *range)
 {
-	return range->blockable;
+	return (range->flags & MMU_NOTIFIER_RANGE_BLOCKABLE);
 }
 
 static inline void mmu_notifier_release(struct mm_struct *mm)
@@ -275,7 +277,7 @@ static inline void
 mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 {
 	if (mm_has_notifiers(range->mm)) {
-		range->blockable = true;
+		range->flags |= MMU_NOTIFIER_RANGE_BLOCKABLE;
 		__mmu_notifier_invalidate_range_start(range);
 	}
 }
@@ -284,7 +286,7 @@ static inline int
 mmu_notifier_invalidate_range_start_nonblock(struct mmu_notifier_range *range)
 {
 	if (mm_has_notifiers(range->mm)) {
-		range->blockable = false;
+		range->flags &= ~MMU_NOTIFIER_RANGE_BLOCKABLE;
 		return __mmu_notifier_invalidate_range_start(range);
 	}
 	return 0;
@@ -331,6 +333,7 @@ static inline void mmu_notifier_range_init(struct mmu_notifier_range *range,
 	range->mm = mm;
 	range->start = start;
 	range->end = end;
+	range->flags = 0;
 }
 
 #define ptep_clear_flush_young_notify(__vma, __address, __ptep)		\
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 4/8] mm/mmu_notifier: contextual information for event enums
  2019-03-26 16:47 [PATCH v6 0/8] mmu notifier provide context informations jglisse
                   ` (2 preceding siblings ...)
  2019-03-26 16:47 ` [PATCH v6 3/8] mm/mmu_notifier: convert mmu_notifier_range->blockable to a flags jglisse
@ 2019-03-26 16:47 ` jglisse
  2019-03-26 16:47 ` [PATCH v6 5/8] mm/mmu_notifier: contextual information for event triggering invalidation v2 jglisse
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: jglisse @ 2019-03-26 16:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jérôme Glisse, Andrew Morton, linux-mm,
	Christian König, Joonas Lahtinen, Jani Nikula, Rodrigo Vivi,
	Jan Kara, Andrea Arcangeli, Peter Xu, Felix Kuehling,
	Jason Gunthorpe, Ross Zwisler, Dan Williams, Paolo Bonzini,
	Radim Krčmář,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

From: Jérôme Glisse <jglisse@redhat.com>

CPU page table update can happens for many reasons, not only as a result
of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also
as a result of kernel activities (memory compression, reclaim, migration,
...).

This patch introduce a set of enums that can be associated with each of
the events triggering a mmu notifier. Latter patches take advantages of
those enum values.

    - UNMAP: munmap() or mremap()
    - CLEAR: page table is cleared (migration, compaction, reclaim, ...)
    - PROTECTION_VMA: change in access protections for the range
    - PROTECTION_PAGE: change in access protections for page in the range
    - SOFT_DIRTY: soft dirtyness tracking

Being able to identify munmap() and mremap() from other reasons why the
page table is cleared is important to allow user of mmu notifier to
update their own internal tracking structure accordingly (on munmap or
mremap it is not longer needed to track range of virtual address as it
becomes invalid).

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Cc: Christian König <christian.koenig@amd.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kvm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-rdma@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
---
 include/linux/mmu_notifier.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index c8672c366f67..2386e71ac1b8 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -10,6 +10,36 @@
 struct mmu_notifier;
 struct mmu_notifier_ops;
 
+/**
+ * enum mmu_notifier_event - reason for the mmu notifier callback
+ * @MMU_NOTIFY_UNMAP: either munmap() that unmap the range or a mremap() that
+ * move the range
+ *
+ * @MMU_NOTIFY_CLEAR: clear page table entry (many reasons for this like
+ * madvise() or replacing a page by another one, ...).
+ *
+ * @MMU_NOTIFY_PROTECTION_VMA: update is due to protection change for the range
+ * ie using the vma access permission (vm_page_prot) to update the whole range
+ * is enough no need to inspect changes to the CPU page table (mprotect()
+ * syscall)
+ *
+ * @MMU_NOTIFY_PROTECTION_PAGE: update is due to change in read/write flag for
+ * pages in the range so to mirror those changes the user must inspect the CPU
+ * page table (from the end callback).
+ *
+ * @MMU_NOTIFY_SOFT_DIRTY: soft dirty accounting (still same page and same
+ * access flags). User should soft dirty the page in the end callback to make
+ * sure that anyone relying on soft dirtyness catch pages that might be written
+ * through non CPU mappings.
+ */
+enum mmu_notifier_event {
+	MMU_NOTIFY_UNMAP = 0,
+	MMU_NOTIFY_CLEAR,
+	MMU_NOTIFY_PROTECTION_VMA,
+	MMU_NOTIFY_PROTECTION_PAGE,
+	MMU_NOTIFY_SOFT_DIRTY,
+};
+
 #ifdef CONFIG_MMU_NOTIFIER
 
 /*
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 5/8] mm/mmu_notifier: contextual information for event triggering invalidation v2
  2019-03-26 16:47 [PATCH v6 0/8] mmu notifier provide context informations jglisse
                   ` (3 preceding siblings ...)
  2019-03-26 16:47 ` [PATCH v6 4/8] mm/mmu_notifier: contextual information for event enums jglisse
@ 2019-03-26 16:47 ` jglisse
  2019-03-26 16:47 ` [PATCH v6 6/8] mm/mmu_notifier: use correct mmu_notifier events for each invalidation jglisse
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: jglisse @ 2019-03-26 16:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jérôme Glisse, Andrew Morton, linux-mm,
	Christian König, Joonas Lahtinen, Jani Nikula, Rodrigo Vivi,
	Jan Kara, Andrea Arcangeli, Peter Xu, Felix Kuehling,
	Jason Gunthorpe, Ross Zwisler, Dan Williams, Paolo Bonzini,
	Radim Krčmář,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

From: Jérôme Glisse <jglisse@redhat.com>

CPU page table update can happens for many reasons, not only as a result
of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also
as a result of kernel activities (memory compression, reclaim, migration,
...).

Users of mmu notifier API track changes to the CPU page table and take
specific action for them. While current API only provide range of virtual
address affected by the change, not why the changes is happening.

This patchset do the initial mechanical convertion of all the places that
calls mmu_notifier_range_init to also provide the default MMU_NOTIFY_UNMAP
event as well as the vma if it is know (most invalidation happens against
a given vma). Passing down the vma allows the users of mmu notifier to
inspect the new vma page protection.

The MMU_NOTIFY_UNMAP is always the safe default as users of mmu notifier
should assume that every for the range is going away when that event
happens. A latter patch do convert mm call path to use a more appropriate
events for each call.

Changes since v1:
    - add the flags parameter to init range flags

This is done as 2 patches so that no call site is forgotten especialy
as it uses this following coccinelle patch:

%<----------------------------------------------------------------------
@@
identifier I1, I2, I3, I4;
@@
static inline void mmu_notifier_range_init(struct mmu_notifier_range *I1,
+enum mmu_notifier_event event,
+unsigned flags,
+struct vm_area_struct *vma,
struct mm_struct *I2, unsigned long I3, unsigned long I4) { ... }

@@
@@
-#define mmu_notifier_range_init(range, mm, start, end)
+#define mmu_notifier_range_init(range, event, flags, vma, mm, start, end)

@@
expression E1, E3, E4;
identifier I1;
@@
<...
mmu_notifier_range_init(E1,
+MMU_NOTIFY_UNMAP, 0, I1,
I1->vm_mm, E3, E4)
...>

@@
expression E1, E2, E3, E4;
identifier FN, VMA;
@@
FN(..., struct vm_area_struct *VMA, ...) {
<...
mmu_notifier_range_init(E1,
+MMU_NOTIFY_UNMAP, 0, VMA,
E2, E3, E4)
...> }

@@
expression E1, E2, E3, E4;
identifier FN, VMA;
@@
FN(...) {
struct vm_area_struct *VMA;
<...
mmu_notifier_range_init(E1,
+MMU_NOTIFY_UNMAP, 0, VMA,
E2, E3, E4)
...> }

@@
expression E1, E2, E3, E4;
identifier FN;
@@
FN(...) {
<...
mmu_notifier_range_init(E1,
+MMU_NOTIFY_UNMAP, 0, NULL,
E2, E3, E4)
...> }
---------------------------------------------------------------------->%

Applied with:
spatch --all-includes --sp-file mmu-notifier.spatch fs/proc/task_mmu.c --in-place
spatch --sp-file mmu-notifier.spatch --dir kernel/events/ --in-place
spatch --sp-file mmu-notifier.spatch --dir mm --in-place

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Cc: Christian König <christian.koenig@amd.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kvm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-rdma@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
---
 fs/proc/task_mmu.c           |  3 ++-
 include/linux/mmu_notifier.h |  5 ++++-
 kernel/events/uprobes.c      |  3 ++-
 mm/huge_memory.c             | 12 ++++++++----
 mm/hugetlb.c                 | 12 ++++++++----
 mm/khugepaged.c              |  3 ++-
 mm/ksm.c                     |  6 ++++--
 mm/madvise.c                 |  3 ++-
 mm/memory.c                  | 25 ++++++++++++++++---------
 mm/migrate.c                 |  5 ++++-
 mm/mprotect.c                |  3 ++-
 mm/mremap.c                  |  3 ++-
 mm/oom_kill.c                |  3 ++-
 mm/rmap.c                    |  6 ++++--
 14 files changed, 62 insertions(+), 30 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 92a91e7816d8..fcbd0e574917 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1151,7 +1151,8 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
 				break;
 			}
 
-			mmu_notifier_range_init(&range, mm, 0, -1UL);
+			mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0,
+						NULL, mm, 0, -1UL);
 			mmu_notifier_invalidate_range_start(&range);
 		}
 		walk_page_range(0, mm->highest_vm_end, &clear_refs_walk);
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 2386e71ac1b8..62f94cd85455 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -356,6 +356,9 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
 
 
 static inline void mmu_notifier_range_init(struct mmu_notifier_range *range,
+					   enum mmu_notifier_event event,
+					   unsigned flags,
+					   struct vm_area_struct *vma,
 					   struct mm_struct *mm,
 					   unsigned long start,
 					   unsigned long end)
@@ -491,7 +494,7 @@ static inline void _mmu_notifier_range_init(struct mmu_notifier_range *range,
 	range->end = end;
 }
 
-#define mmu_notifier_range_init(range, mm, start, end) \
+#define mmu_notifier_range_init(range,event,flags,vma,mm,start,end)  \
 	_mmu_notifier_range_init(range, start, end)
 
 static inline bool
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index c5cde87329c7..77c3f079c723 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -161,7 +161,8 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	struct mmu_notifier_range range;
 	struct mem_cgroup *memcg;
 
-	mmu_notifier_range_init(&range, mm, addr, addr + PAGE_SIZE);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm, addr,
+				addr + PAGE_SIZE);
 
 	VM_BUG_ON_PAGE(PageTransHuge(old_page), old_page);
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 404acdcd0455..4309939be22d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1184,7 +1184,8 @@ static vm_fault_t do_huge_pmd_wp_page_fallback(struct vm_fault *vmf,
 		cond_resched();
 	}
 
-	mmu_notifier_range_init(&range, vma->vm_mm, haddr,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+				haddr,
 				haddr + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 
@@ -1348,7 +1349,8 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd)
 				    vma, HPAGE_PMD_NR);
 	__SetPageUptodate(new_page);
 
-	mmu_notifier_range_init(&range, vma->vm_mm, haddr,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+				haddr,
 				haddr + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 
@@ -2024,7 +2026,8 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud,
 	spinlock_t *ptl;
 	struct mmu_notifier_range range;
 
-	mmu_notifier_range_init(&range, vma->vm_mm, address & HPAGE_PUD_MASK,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+				address & HPAGE_PUD_MASK,
 				(address & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 	ptl = pud_lock(vma->vm_mm, pud);
@@ -2242,7 +2245,8 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 	spinlock_t *ptl;
 	struct mmu_notifier_range range;
 
-	mmu_notifier_range_init(&range, vma->vm_mm, address & HPAGE_PMD_MASK,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+				address & HPAGE_PMD_MASK,
 				(address & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 	ptl = pmd_lock(vma->vm_mm, pmd);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 97b1e0290c66..c20a8d2de3f3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3247,7 +3247,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	cow = (vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
 
 	if (cow) {
-		mmu_notifier_range_init(&range, src, vma->vm_start,
+		mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, src,
+					vma->vm_start,
 					vma->vm_end);
 		mmu_notifier_invalidate_range_start(&range);
 	}
@@ -3359,7 +3360,8 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	/*
 	 * If sharing possible, alert mmu notifiers of worst case.
 	 */
-	mmu_notifier_range_init(&range, mm, start, end);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm, start,
+				end);
 	adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
 	mmu_notifier_invalidate_range_start(&range);
 	address = start;
@@ -3626,7 +3628,8 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
 			    pages_per_huge_page(h));
 	__SetPageUptodate(new_page);
 
-	mmu_notifier_range_init(&range, mm, haddr, haddr + huge_page_size(h));
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm, haddr,
+				haddr + huge_page_size(h));
 	mmu_notifier_invalidate_range_start(&range);
 
 	/*
@@ -4358,7 +4361,8 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	 * start/end.  Set range.start/range.end to cover the maximum possible
 	 * range if PMD sharing is possible.
 	 */
-	mmu_notifier_range_init(&range, mm, start, end);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm, start,
+				end);
 	adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
 
 	BUG_ON(address >= end);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 449044378782..e7944f5e6258 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1016,7 +1016,8 @@ static void collapse_huge_page(struct mm_struct *mm,
 	pte = pte_offset_map(pmd, address);
 	pte_ptl = pte_lockptr(mm, pmd);
 
-	mmu_notifier_range_init(&range, mm, address, address + HPAGE_PMD_SIZE);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, NULL, mm,
+				address, address + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 	pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */
 	/*
diff --git a/mm/ksm.c b/mm/ksm.c
index fc64874dc6f4..01f5fe2c90cf 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1066,7 +1066,8 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page,
 
 	BUG_ON(PageTransCompound(page));
 
-	mmu_notifier_range_init(&range, mm, pvmw.address,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm,
+				pvmw.address,
 				pvmw.address + PAGE_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 
@@ -1154,7 +1155,8 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 	if (!pmd)
 		goto out;
 
-	mmu_notifier_range_init(&range, mm, addr, addr + PAGE_SIZE);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm, addr,
+				addr + PAGE_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 
 	ptep = pte_offset_map_lock(mm, pmd, addr, &ptl);
diff --git a/mm/madvise.c b/mm/madvise.c
index 21a7881a2db4..c617f53a9c09 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -472,7 +472,8 @@ static int madvise_free_single_vma(struct vm_area_struct *vma,
 	range.end = min(vma->vm_end, end_addr);
 	if (range.end <= vma->vm_start)
 		return -EINVAL;
-	mmu_notifier_range_init(&range, mm, range.start, range.end);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm,
+				range.start, range.end);
 
 	lru_add_drain();
 	tlb_gather_mmu(&tlb, mm, range.start, range.end);
diff --git a/mm/memory.c b/mm/memory.c
index 47fe250307c7..ac6754bb30c8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1010,7 +1010,8 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	is_cow = is_cow_mapping(vma->vm_flags);
 
 	if (is_cow) {
-		mmu_notifier_range_init(&range, src_mm, addr, end);
+		mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma,
+					src_mm, addr, end);
 		mmu_notifier_invalidate_range_start(&range);
 	}
 
@@ -1334,7 +1335,8 @@ void unmap_vmas(struct mmu_gather *tlb,
 {
 	struct mmu_notifier_range range;
 
-	mmu_notifier_range_init(&range, vma->vm_mm, start_addr, end_addr);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+				start_addr, end_addr);
 	mmu_notifier_invalidate_range_start(&range);
 	for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next)
 		unmap_single_vma(tlb, vma, start_addr, end_addr, NULL);
@@ -1356,7 +1358,8 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start,
 	struct mmu_gather tlb;
 
 	lru_add_drain();
-	mmu_notifier_range_init(&range, vma->vm_mm, start, start + size);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+				start, start + size);
 	tlb_gather_mmu(&tlb, vma->vm_mm, start, range.end);
 	update_hiwater_rss(vma->vm_mm);
 	mmu_notifier_invalidate_range_start(&range);
@@ -1382,7 +1385,8 @@ static void zap_page_range_single(struct vm_area_struct *vma, unsigned long addr
 	struct mmu_gather tlb;
 
 	lru_add_drain();
-	mmu_notifier_range_init(&range, vma->vm_mm, address, address + size);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+				address, address + size);
 	tlb_gather_mmu(&tlb, vma->vm_mm, address, range.end);
 	update_hiwater_rss(vma->vm_mm);
 	mmu_notifier_invalidate_range_start(&range);
@@ -2278,7 +2282,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 
 	__SetPageUptodate(new_page);
 
-	mmu_notifier_range_init(&range, mm, vmf->address & PAGE_MASK,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm,
+				vmf->address & PAGE_MASK,
 				(vmf->address & PAGE_MASK) + PAGE_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 
@@ -4103,8 +4108,9 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 			goto out;
 
 		if (range) {
-			mmu_notifier_range_init(range, mm, address & PMD_MASK,
-					     (address & PMD_MASK) + PMD_SIZE);
+			mmu_notifier_range_init(range, MMU_NOTIFY_UNMAP, 0,
+						NULL, mm, address & PMD_MASK,
+						(address & PMD_MASK) + PMD_SIZE);
 			mmu_notifier_invalidate_range_start(range);
 		}
 		*ptlp = pmd_lock(mm, pmd);
@@ -4121,8 +4127,9 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 		goto out;
 
 	if (range) {
-		mmu_notifier_range_init(range, mm, address & PAGE_MASK,
-				     (address & PAGE_MASK) + PAGE_SIZE);
+		mmu_notifier_range_init(range, MMU_NOTIFY_UNMAP, 0, NULL, mm,
+					address & PAGE_MASK,
+					(address & PAGE_MASK) + PAGE_SIZE);
 		mmu_notifier_invalidate_range_start(range);
 	}
 	ptep = pte_offset_map_lock(mm, pmd, address, ptlp);
diff --git a/mm/migrate.c b/mm/migrate.c
index ac6f4939bb59..8f6e4382f0ad 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2351,7 +2351,8 @@ static void migrate_vma_collect(struct migrate_vma *migrate)
 	mm_walk.mm = migrate->vma->vm_mm;
 	mm_walk.private = migrate;
 
-	mmu_notifier_range_init(&range, mm_walk.mm, migrate->start,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, NULL, mm_walk.mm,
+				migrate->start,
 				migrate->end);
 	mmu_notifier_invalidate_range_start(&range);
 	walk_page_range(migrate->start, migrate->end, &mm_walk);
@@ -2759,6 +2760,8 @@ static void migrate_vma_pages(struct migrate_vma *migrate)
 				notified = true;
 
 				mmu_notifier_range_init(&range,
+							MMU_NOTIFY_UNMAP, 0,
+							NULL,
 							migrate->vma->vm_mm,
 							addr, migrate->end);
 				mmu_notifier_invalidate_range_start(&range);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 028c724dcb1a..b10984052ae9 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -185,7 +185,8 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 
 		/* invoke the mmu notifier if the pmd is populated */
 		if (!range.start) {
-			mmu_notifier_range_init(&range, vma->vm_mm, addr, end);
+			mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0,
+						vma, vma->vm_mm, addr, end);
 			mmu_notifier_invalidate_range_start(&range);
 		}
 
diff --git a/mm/mremap.c b/mm/mremap.c
index e3edef6b7a12..fc241d23cd97 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -249,7 +249,8 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 	old_end = old_addr + len;
 	flush_cache_range(vma, old_addr, old_end);
 
-	mmu_notifier_range_init(&range, vma->vm_mm, old_addr, old_end);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+				old_addr, old_end);
 	mmu_notifier_invalidate_range_start(&range);
 
 	for (; old_addr < old_end; old_addr += extent, new_addr += extent) {
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 3a2484884cfd..539c91d0b26a 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -531,7 +531,8 @@ bool __oom_reap_task_mm(struct mm_struct *mm)
 			struct mmu_notifier_range range;
 			struct mmu_gather tlb;
 
-			mmu_notifier_range_init(&range, mm, vma->vm_start,
+			mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0,
+						vma, mm, vma->vm_start,
 						vma->vm_end);
 			tlb_gather_mmu(&tlb, mm, range.start, range.end);
 			if (mmu_notifier_invalidate_range_start_nonblock(&range)) {
diff --git a/mm/rmap.c b/mm/rmap.c
index b30c7c71d1d9..dba724f83ddd 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -896,7 +896,8 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 	 * We have to assume the worse case ie pmd for invalidation. Note that
 	 * the page can not be free from this function.
 	 */
-	mmu_notifier_range_init(&range, vma->vm_mm, address,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+				address,
 				min(vma->vm_end, address +
 				    (PAGE_SIZE << compound_order(page))));
 	mmu_notifier_invalidate_range_start(&range);
@@ -1371,7 +1372,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	 * Note that the page can not be free in this function as call of
 	 * try_to_unmap() must hold a reference on the page.
 	 */
-	mmu_notifier_range_init(&range, vma->vm_mm, address,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+				address,
 				min(vma->vm_end, address +
 				    (PAGE_SIZE << compound_order(page))));
 	if (PageHuge(page)) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 6/8] mm/mmu_notifier: use correct mmu_notifier events for each invalidation
  2019-03-26 16:47 [PATCH v6 0/8] mmu notifier provide context informations jglisse
                   ` (4 preceding siblings ...)
  2019-03-26 16:47 ` [PATCH v6 5/8] mm/mmu_notifier: contextual information for event triggering invalidation v2 jglisse
@ 2019-03-26 16:47 ` jglisse
  2019-03-26 16:47 ` [PATCH v6 7/8] mm/mmu_notifier: pass down vma and reasons why mmu notifier is happening v2 jglisse
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 18+ messages in thread
From: jglisse @ 2019-03-26 16:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jérôme Glisse, Andrew Morton, linux-mm,
	Christian König, Joonas Lahtinen, Jani Nikula, Rodrigo Vivi,
	Jan Kara, Andrea Arcangeli, Peter Xu, Felix Kuehling,
	Jason Gunthorpe, Ross Zwisler, Dan Williams, Paolo Bonzini,
	Radim Krčmář,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

From: Jérôme Glisse <jglisse@redhat.com>

This update each existing invalidation to use the correct mmu notifier
event that represent what is happening to the CPU page table. See the
patch which introduced the events to see the rational behind this.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Cc: Christian König <christian.koenig@amd.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kvm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-rdma@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
---
 fs/proc/task_mmu.c      |  4 ++--
 kernel/events/uprobes.c |  2 +-
 mm/huge_memory.c        | 14 ++++++--------
 mm/hugetlb.c            |  8 ++++----
 mm/khugepaged.c         |  2 +-
 mm/ksm.c                |  4 ++--
 mm/madvise.c            |  2 +-
 mm/memory.c             | 14 +++++++-------
 mm/migrate.c            |  4 ++--
 mm/mprotect.c           |  5 +++--
 mm/rmap.c               |  6 +++---
 11 files changed, 32 insertions(+), 33 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index fcbd0e574917..3b93ce496dd4 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1151,8 +1151,8 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
 				break;
 			}
 
-			mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0,
-						NULL, mm, 0, -1UL);
+			mmu_notifier_range_init(&range, MMU_NOTIFY_SOFT_DIRTY,
+						0, NULL, mm, 0, -1UL);
 			mmu_notifier_invalidate_range_start(&range);
 		}
 		walk_page_range(0, mm->highest_vm_end, &clear_refs_walk);
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 77c3f079c723..79c84bb48ea9 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -161,7 +161,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	struct mmu_notifier_range range;
 	struct mem_cgroup *memcg;
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm, addr,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, addr,
 				addr + PAGE_SIZE);
 
 	VM_BUG_ON_PAGE(PageTransHuge(old_page), old_page);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 4309939be22d..f0ad70c29500 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1184,9 +1184,8 @@ static vm_fault_t do_huge_pmd_wp_page_fallback(struct vm_fault *vmf,
 		cond_resched();
 	}
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
-				haddr,
-				haddr + HPAGE_PMD_SIZE);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
+				haddr, haddr + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 
 	vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
@@ -1349,9 +1348,8 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd)
 				    vma, HPAGE_PMD_NR);
 	__SetPageUptodate(new_page);
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
-				haddr,
-				haddr + HPAGE_PMD_SIZE);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
+				haddr, haddr + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 
 	spin_lock(vmf->ptl);
@@ -2026,7 +2024,7 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud,
 	spinlock_t *ptl;
 	struct mmu_notifier_range range;
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
 				address & HPAGE_PUD_MASK,
 				(address & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
@@ -2245,7 +2243,7 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 	spinlock_t *ptl;
 	struct mmu_notifier_range range;
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
 				address & HPAGE_PMD_MASK,
 				(address & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c20a8d2de3f3..44fe3565ef37 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3247,7 +3247,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	cow = (vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
 
 	if (cow) {
-		mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, src,
+		mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, src,
 					vma->vm_start,
 					vma->vm_end);
 		mmu_notifier_invalidate_range_start(&range);
@@ -3628,7 +3628,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
 			    pages_per_huge_page(h));
 	__SetPageUptodate(new_page);
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm, haddr,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, haddr,
 				haddr + huge_page_size(h));
 	mmu_notifier_invalidate_range_start(&range);
 
@@ -4361,8 +4361,8 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	 * start/end.  Set range.start/range.end to cover the maximum possible
 	 * range if PMD sharing is possible.
 	 */
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm, start,
-				end);
+	mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_VMA,
+				0, vma, mm, start, end);
 	adjust_range_if_pmd_sharing_possible(vma, &range.start, &range.end);
 
 	BUG_ON(address >= end);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index e7944f5e6258..579699d2b347 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1016,7 +1016,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	pte = pte_offset_map(pmd, address);
 	pte_ptl = pte_lockptr(mm, pmd);
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, NULL, mm,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, NULL, mm,
 				address, address + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 	pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */
diff --git a/mm/ksm.c b/mm/ksm.c
index 01f5fe2c90cf..81c20ed57bf6 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1066,7 +1066,7 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page,
 
 	BUG_ON(PageTransCompound(page));
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm,
 				pvmw.address,
 				pvmw.address + PAGE_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
@@ -1155,7 +1155,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 	if (!pmd)
 		goto out;
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm, addr,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm, addr,
 				addr + PAGE_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 
diff --git a/mm/madvise.c b/mm/madvise.c
index c617f53a9c09..a692d2a893b5 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -472,7 +472,7 @@ static int madvise_free_single_vma(struct vm_area_struct *vma,
 	range.end = min(vma->vm_end, end_addr);
 	if (range.end <= vma->vm_start)
 		return -EINVAL;
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm,
 				range.start, range.end);
 
 	lru_add_drain();
diff --git a/mm/memory.c b/mm/memory.c
index ac6754bb30c8..c24c5ffe950f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1010,8 +1010,8 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	is_cow = is_cow_mapping(vma->vm_flags);
 
 	if (is_cow) {
-		mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma,
-					src_mm, addr, end);
+		mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_PAGE,
+					0, vma, src_mm, addr, end);
 		mmu_notifier_invalidate_range_start(&range);
 	}
 
@@ -1358,7 +1358,7 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start,
 	struct mmu_gather tlb;
 
 	lru_add_drain();
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
 				start, start + size);
 	tlb_gather_mmu(&tlb, vma->vm_mm, start, range.end);
 	update_hiwater_rss(vma->vm_mm);
@@ -1385,7 +1385,7 @@ static void zap_page_range_single(struct vm_area_struct *vma, unsigned long addr
 	struct mmu_gather tlb;
 
 	lru_add_drain();
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
 				address, address + size);
 	tlb_gather_mmu(&tlb, vma->vm_mm, address, range.end);
 	update_hiwater_rss(vma->vm_mm);
@@ -2282,7 +2282,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 
 	__SetPageUptodate(new_page);
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, mm,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, mm,
 				vmf->address & PAGE_MASK,
 				(vmf->address & PAGE_MASK) + PAGE_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
@@ -4108,7 +4108,7 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 			goto out;
 
 		if (range) {
-			mmu_notifier_range_init(range, MMU_NOTIFY_UNMAP, 0,
+			mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0,
 						NULL, mm, address & PMD_MASK,
 						(address & PMD_MASK) + PMD_SIZE);
 			mmu_notifier_invalidate_range_start(range);
@@ -4127,7 +4127,7 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 		goto out;
 
 	if (range) {
-		mmu_notifier_range_init(range, MMU_NOTIFY_UNMAP, 0, NULL, mm,
+		mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0, NULL, mm,
 					address & PAGE_MASK,
 					(address & PAGE_MASK) + PAGE_SIZE);
 		mmu_notifier_invalidate_range_start(range);
diff --git a/mm/migrate.c b/mm/migrate.c
index 8f6e4382f0ad..f37b1a4bf9c0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2351,7 +2351,7 @@ static void migrate_vma_collect(struct migrate_vma *migrate)
 	mm_walk.mm = migrate->vma->vm_mm;
 	mm_walk.private = migrate;
 
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, NULL, mm_walk.mm,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, NULL, mm_walk.mm,
 				migrate->start,
 				migrate->end);
 	mmu_notifier_invalidate_range_start(&range);
@@ -2760,7 +2760,7 @@ static void migrate_vma_pages(struct migrate_vma *migrate)
 				notified = true;
 
 				mmu_notifier_range_init(&range,
-							MMU_NOTIFY_UNMAP, 0,
+							MMU_NOTIFY_CLEAR, 0,
 							NULL,
 							migrate->vma->vm_mm,
 							addr, migrate->end);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index b10984052ae9..65242f1e4457 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -185,8 +185,9 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 
 		/* invoke the mmu notifier if the pmd is populated */
 		if (!range.start) {
-			mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0,
-						vma, vma->vm_mm, addr, end);
+			mmu_notifier_range_init(&range,
+				MMU_NOTIFY_PROTECTION_VMA, 0,
+				vma, vma->vm_mm, addr, end);
 			mmu_notifier_invalidate_range_start(&range);
 		}
 
diff --git a/mm/rmap.c b/mm/rmap.c
index dba724f83ddd..c26becc3982c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -896,8 +896,8 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 	 * We have to assume the worse case ie pmd for invalidation. Note that
 	 * the page can not be free from this function.
 	 */
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
-				address,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_PAGE,
+				0, vma, vma->vm_mm, address,
 				min(vma->vm_end, address +
 				    (PAGE_SIZE << compound_order(page))));
 	mmu_notifier_invalidate_range_start(&range);
@@ -1372,7 +1372,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	 * Note that the page can not be free in this function as call of
 	 * try_to_unmap() must hold a reference on the page.
 	 */
-	mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm,
+	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
 				address,
 				min(vma->vm_end, address +
 				    (PAGE_SIZE << compound_order(page))));
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 7/8] mm/mmu_notifier: pass down vma and reasons why mmu notifier is happening v2
  2019-03-26 16:47 [PATCH v6 0/8] mmu notifier provide context informations jglisse
                   ` (5 preceding siblings ...)
  2019-03-26 16:47 ` [PATCH v6 6/8] mm/mmu_notifier: use correct mmu_notifier events for each invalidation jglisse
@ 2019-03-26 16:47 ` jglisse
  2019-04-10 23:41   ` Ira Weiny
  2019-03-26 16:47 ` [PATCH v6 8/8] mm/mmu_notifier: mmu_notifier_range_update_to_read_only() helper jglisse
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 18+ messages in thread
From: jglisse @ 2019-03-26 16:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jérôme Glisse, Andrew Morton, linux-mm,
	Christian König, Joonas Lahtinen, Jani Nikula, Rodrigo Vivi,
	Jan Kara, Andrea Arcangeli, Peter Xu, Felix Kuehling,
	Jason Gunthorpe, Ross Zwisler, Dan Williams, Paolo Bonzini,
	Radim Krčmář,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

From: Jérôme Glisse <jglisse@redhat.com>

CPU page table update can happens for many reasons, not only as a result
of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also
as a result of kernel activities (memory compression, reclaim, migration,
...).

Users of mmu notifier API track changes to the CPU page table and take
specific action for them. While current API only provide range of virtual
address affected by the change, not why the changes is happening

This patch is just passing down the new informations by adding it to the
mmu_notifier_range structure.

Changes since v1:
    - Initialize flags field from mmu_notifier_range_init() arguments

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Cc: Christian König <christian.koenig@amd.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kvm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-rdma@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
---
 include/linux/mmu_notifier.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 62f94cd85455..0379956fff23 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -58,10 +58,12 @@ struct mmu_notifier_mm {
 #define MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0)
 
 struct mmu_notifier_range {
+	struct vm_area_struct *vma;
 	struct mm_struct *mm;
 	unsigned long start;
 	unsigned long end;
 	unsigned flags;
+	enum mmu_notifier_event event;
 };
 
 struct mmu_notifier_ops {
@@ -363,10 +365,12 @@ static inline void mmu_notifier_range_init(struct mmu_notifier_range *range,
 					   unsigned long start,
 					   unsigned long end)
 {
+	range->vma = vma;
+	range->event = event;
 	range->mm = mm;
 	range->start = start;
 	range->end = end;
-	range->flags = 0;
+	range->flags = flags;
 }
 
 #define ptep_clear_flush_young_notify(__vma, __address, __ptep)		\
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v6 8/8] mm/mmu_notifier: mmu_notifier_range_update_to_read_only() helper
  2019-03-26 16:47 [PATCH v6 0/8] mmu notifier provide context informations jglisse
                   ` (6 preceding siblings ...)
  2019-03-26 16:47 ` [PATCH v6 7/8] mm/mmu_notifier: pass down vma and reasons why mmu notifier is happening v2 jglisse
@ 2019-03-26 16:47 ` jglisse
  2019-04-09 14:21 ` [PATCH v6 0/8] mmu notifier provide context informations Jerome Glisse
  2019-04-09 22:08 ` Andrew Morton
  9 siblings, 0 replies; 18+ messages in thread
From: jglisse @ 2019-03-26 16:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jérôme Glisse, Andrew Morton, linux-mm,
	Christian König, Joonas Lahtinen, Jani Nikula, Rodrigo Vivi,
	Jan Kara, Andrea Arcangeli, Peter Xu, Felix Kuehling,
	Jason Gunthorpe, Ross Zwisler, Dan Williams, Paolo Bonzini,
	Radim Krčmář,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

From: Jérôme Glisse <jglisse@redhat.com>

Helper to test if a range is updated to read only (it is still valid
to read from the range). This is useful for device driver or anyone
who wish to optimize out update when they know that they already have
the range map read only.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Cc: Christian König <christian.koenig@amd.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: kvm@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-rdma@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
---
 include/linux/mmu_notifier.h |  4 ++++
 mm/mmu_notifier.c            | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 0379956fff23..b6c004bd9f6a 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -259,6 +259,8 @@ extern void __mmu_notifier_invalidate_range_end(struct mmu_notifier_range *r,
 				  bool only_end);
 extern void __mmu_notifier_invalidate_range(struct mm_struct *mm,
 				  unsigned long start, unsigned long end);
+extern bool
+mmu_notifier_range_update_to_read_only(const struct mmu_notifier_range *range);
 
 static inline bool
 mmu_notifier_range_blockable(const struct mmu_notifier_range *range)
@@ -568,6 +570,8 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
 {
 }
 
+#define mmu_notifier_range_update_to_read_only(r) false
+
 #define ptep_clear_flush_young_notify ptep_clear_flush_young
 #define pmdp_clear_flush_young_notify pmdp_clear_flush_young
 #define ptep_clear_young_notify ptep_test_and_clear_young
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index abd88c466eb2..ee36068077b6 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -395,3 +395,13 @@ void mmu_notifier_unregister_no_release(struct mmu_notifier *mn,
 	mmdrop(mm);
 }
 EXPORT_SYMBOL_GPL(mmu_notifier_unregister_no_release);
+
+bool
+mmu_notifier_range_update_to_read_only(const struct mmu_notifier_range *range)
+{
+	if (!range->vma || range->event != MMU_NOTIFY_PROTECTION_VMA)
+		return false;
+	/* Return true if the vma still have the read flag set. */
+	return range->vma->vm_flags & VM_READ;
+}
+EXPORT_SYMBOL_GPL(mmu_notifier_range_update_to_read_only);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 0/8] mmu notifier provide context informations
  2019-03-26 16:47 [PATCH v6 0/8] mmu notifier provide context informations jglisse
                   ` (7 preceding siblings ...)
  2019-03-26 16:47 ` [PATCH v6 8/8] mm/mmu_notifier: mmu_notifier_range_update_to_read_only() helper jglisse
@ 2019-04-09 14:21 ` Jerome Glisse
  2019-04-09 22:08 ` Andrew Morton
  9 siblings, 0 replies; 18+ messages in thread
From: Jerome Glisse @ 2019-04-09 14:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Christian König, Joonas Lahtinen,
	Jani Nikula, Rodrigo Vivi, Jan Kara, Andrea Arcangeli, Peter Xu,
	Felix Kuehling, Jason Gunthorpe, Ross Zwisler, Dan Williams,
	Paolo Bonzini, Alex Deucher, Radim Krčmář,
	Michal Hocko, Ben Skeggs, Ralph Campbell, John Hubbard, kvm,
	dri-devel, linux-rdma, Arnd Bergmann

Andrew anything blocking this for 5.2 ? Should i ask people (ie the end
user of this) to re-ack v6 (it is the same as previous version just rebase
and dropped kvm bits).



On Tue, Mar 26, 2019 at 12:47:39PM -0400, jglisse@redhat.com wrote:
> From: Jérôme Glisse <jglisse@redhat.com>
> 
> (Andrew this apply on top of my HMM patchset as otherwise you will have
>  conflict with changes to mm/hmm.c)
> 
> Changes since v5:
>     - drop KVM bits waiting for KVM people to express interest if they
>       do not then i will post patchset to remove change_pte_notify as
>       without the changes in v5 change_pte_notify is just useless (it
>       it is useless today upstream it is just wasting cpu cycles)
>     - rebase on top of lastest Linus tree
> 
> Previous cover letter with minor update:
> 
> 
> Here i am not posting users of this, they already have been posted to
> appropriate mailing list [6] and will be merge through the appropriate
> tree once this patchset is upstream.
> 
> Note that this serie does not change any behavior for any existing
> code. It just pass down more information to mmu notifier listener.
> 
> The rational for this patchset:
> 
> CPU page table update can happens for many reasons, not only as a
> result of a syscall (munmap(), mprotect(), mremap(), madvise(), ...)
> but also as a result of kernel activities (memory compression, reclaim,
> migration, ...).
> 
> This patch introduce a set of enums that can be associated with each
> of the events triggering a mmu notifier:
> 
>     - UNMAP: munmap() or mremap()
>     - CLEAR: page table is cleared (migration, compaction, reclaim, ...)
>     - PROTECTION_VMA: change in access protections for the range
>     - PROTECTION_PAGE: change in access protections for page in the range
>     - SOFT_DIRTY: soft dirtyness tracking
> 
> Being able to identify munmap() and mremap() from other reasons why the
> page table is cleared is important to allow user of mmu notifier to
> update their own internal tracking structure accordingly (on munmap or
> mremap it is not longer needed to track range of virtual address as it
> becomes invalid). Without this serie, driver are force to assume that
> every notification is an munmap which triggers useless trashing within
> drivers that associate structure with range of virtual address. Each
> driver is force to free up its tracking structure and then restore it
> on next device page fault. With this serie we can also optimize device
> page table update [6].
> 
> More over this can also be use to optimize out some page table updates
> like for KVM where we can update the secondary MMU directly from the
> callback instead of clearing it.
> 
> ACKS AMD/RADEON https://lkml.org/lkml/2019/2/1/395
> ACKS RDMA https://lkml.org/lkml/2018/12/6/1473
> 
> Cheers,
> Jérôme
> 
> [1] v1 https://lkml.org/lkml/2018/3/23/1049
> [2] v2 https://lkml.org/lkml/2018/12/5/10
> [3] v3 https://lkml.org/lkml/2018/12/13/620
> [4] v4 https://lkml.org/lkml/2019/1/23/838
> [5] v5 https://lkml.org/lkml/2019/2/19/752
> [6] patches to use this:
>     https://lkml.org/lkml/2019/1/23/833
>     https://lkml.org/lkml/2019/1/23/834
>     https://lkml.org/lkml/2019/1/23/832
>     https://lkml.org/lkml/2019/1/23/831
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: linux-mm@kvack.org
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> Cc: Jason Gunthorpe <jgg@mellanox.com>
> Cc: Ross Zwisler <zwisler@kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Ben Skeggs <bskeggs@redhat.com>
> Cc: Ralph Campbell <rcampbell@nvidia.com>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Cc: kvm@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: linux-rdma@vger.kernel.org
> Cc: Arnd Bergmann <arnd@arndb.de>
> 
> Jérôme Glisse (8):
>   mm/mmu_notifier: helper to test if a range invalidation is blockable
>   mm/mmu_notifier: convert user range->blockable to helper function
>   mm/mmu_notifier: convert mmu_notifier_range->blockable to a flags
>   mm/mmu_notifier: contextual information for event enums
>   mm/mmu_notifier: contextual information for event triggering
>     invalidation v2
>   mm/mmu_notifier: use correct mmu_notifier events for each invalidation
>   mm/mmu_notifier: pass down vma and reasons why mmu notifier is
>     happening v2
>   mm/mmu_notifier: mmu_notifier_range_update_to_read_only() helper
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c  |  8 ++--
>  drivers/gpu/drm/i915/i915_gem_userptr.c |  2 +-
>  drivers/gpu/drm/radeon/radeon_mn.c      |  4 +-
>  drivers/infiniband/core/umem_odp.c      |  5 +-
>  drivers/xen/gntdev.c                    |  6 +--
>  fs/proc/task_mmu.c                      |  3 +-
>  include/linux/mmu_notifier.h            | 63 +++++++++++++++++++++++--
>  kernel/events/uprobes.c                 |  3 +-
>  mm/hmm.c                                |  6 +--
>  mm/huge_memory.c                        | 14 +++---
>  mm/hugetlb.c                            | 12 +++--
>  mm/khugepaged.c                         |  3 +-
>  mm/ksm.c                                |  6 ++-
>  mm/madvise.c                            |  3 +-
>  mm/memory.c                             | 25 ++++++----
>  mm/migrate.c                            |  5 +-
>  mm/mmu_notifier.c                       | 12 ++++-
>  mm/mprotect.c                           |  4 +-
>  mm/mremap.c                             |  3 +-
>  mm/oom_kill.c                           |  3 +-
>  mm/rmap.c                               |  6 ++-
>  virt/kvm/kvm_main.c                     |  3 +-
>  22 files changed, 147 insertions(+), 52 deletions(-)
> 
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 0/8] mmu notifier provide context informations
  2019-03-26 16:47 [PATCH v6 0/8] mmu notifier provide context informations jglisse
                   ` (8 preceding siblings ...)
  2019-04-09 14:21 ` [PATCH v6 0/8] mmu notifier provide context informations Jerome Glisse
@ 2019-04-09 22:08 ` Andrew Morton
  2019-04-10 16:06   ` Jerome Glisse
  9 siblings, 1 reply; 18+ messages in thread
From: Andrew Morton @ 2019-04-09 22:08 UTC (permalink / raw)
  To: jglisse
  Cc: linux-kernel, linux-mm, Christian König, Joonas Lahtinen,
	Jani Nikula, Rodrigo Vivi, Jan Kara, Andrea Arcangeli, Peter Xu,
	Felix Kuehling, Jason Gunthorpe, Ross Zwisler, Dan Williams,
	Paolo Bonzini, Alex Deucher, Radim Krčmář,
	Michal Hocko, Ben Skeggs, Ralph Campbell, John Hubbard, kvm,
	dri-devel, linux-rdma, Arnd Bergmann

On Tue, 26 Mar 2019 12:47:39 -0400 jglisse@redhat.com wrote:

> From: Jérôme Glisse <jglisse@redhat.com>
> 
> (Andrew this apply on top of my HMM patchset as otherwise you will have
>  conflict with changes to mm/hmm.c)
> 
> Changes since v5:
>     - drop KVM bits waiting for KVM people to express interest if they
>       do not then i will post patchset to remove change_pte_notify as
>       without the changes in v5 change_pte_notify is just useless (it
>       it is useless today upstream it is just wasting cpu cycles)
>     - rebase on top of lastest Linus tree
> 
> Previous cover letter with minor update:
> 
> 
> Here i am not posting users of this, they already have been posted to
> appropriate mailing list [6] and will be merge through the appropriate
> tree once this patchset is upstream.
> 
> Note that this serie does not change any behavior for any existing
> code. It just pass down more information to mmu notifier listener.
> 
> The rational for this patchset:
> 
> CPU page table update can happens for many reasons, not only as a
> result of a syscall (munmap(), mprotect(), mremap(), madvise(), ...)
> but also as a result of kernel activities (memory compression, reclaim,
> migration, ...).
> 
> This patch introduce a set of enums that can be associated with each
> of the events triggering a mmu notifier:
> 
>     - UNMAP: munmap() or mremap()
>     - CLEAR: page table is cleared (migration, compaction, reclaim, ...)
>     - PROTECTION_VMA: change in access protections for the range
>     - PROTECTION_PAGE: change in access protections for page in the range
>     - SOFT_DIRTY: soft dirtyness tracking
> 
> Being able to identify munmap() and mremap() from other reasons why the
> page table is cleared is important to allow user of mmu notifier to
> update their own internal tracking structure accordingly (on munmap or
> mremap it is not longer needed to track range of virtual address as it
> becomes invalid). Without this serie, driver are force to assume that
> every notification is an munmap which triggers useless trashing within
> drivers that associate structure with range of virtual address. Each
> driver is force to free up its tracking structure and then restore it
> on next device page fault. With this serie we can also optimize device
> page table update [6].
> 
> More over this can also be use to optimize out some page table updates
> like for KVM where we can update the secondary MMU directly from the
> callback instead of clearing it.

We seem to be rather short of review input on this patchset.  ie: there
is none.

> ACKS AMD/RADEON https://lkml.org/lkml/2019/2/1/395

OK, kind of ackish, but not a review.

> ACKS RDMA https://lkml.org/lkml/2018/12/6/1473

This actually acks the infiniband part of a patch which isn't in this
series.


So we have some work to do, please.  Who would be suitable reviewers?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 0/8] mmu notifier provide context informations
  2019-04-09 22:08 ` Andrew Morton
@ 2019-04-10 16:06   ` Jerome Glisse
  0 siblings, 0 replies; 18+ messages in thread
From: Jerome Glisse @ 2019-04-10 16:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Christian König, Joonas Lahtinen,
	Jani Nikula, Rodrigo Vivi, Jan Kara, Andrea Arcangeli, Peter Xu,
	Felix Kuehling, Jason Gunthorpe, Ross Zwisler, Dan Williams,
	Paolo Bonzini, Alex Deucher, Radim Krčmář,
	Michal Hocko, Ben Skeggs, Ralph Campbell, John Hubbard, kvm,
	dri-devel, linux-rdma, Arnd Bergmann

On Tue, Apr 09, 2019 at 03:08:55PM -0700, Andrew Morton wrote:
> On Tue, 26 Mar 2019 12:47:39 -0400 jglisse@redhat.com wrote:
> 
> > From: Jérôme Glisse <jglisse@redhat.com>
> > 
> > (Andrew this apply on top of my HMM patchset as otherwise you will have
> >  conflict with changes to mm/hmm.c)
> > 
> > Changes since v5:
> >     - drop KVM bits waiting for KVM people to express interest if they
> >       do not then i will post patchset to remove change_pte_notify as
> >       without the changes in v5 change_pte_notify is just useless (it
> >       it is useless today upstream it is just wasting cpu cycles)
> >     - rebase on top of lastest Linus tree
> > 
> > Previous cover letter with minor update:
> > 
> > 
> > Here i am not posting users of this, they already have been posted to
> > appropriate mailing list [6] and will be merge through the appropriate
> > tree once this patchset is upstream.
> > 
> > Note that this serie does not change any behavior for any existing
> > code. It just pass down more information to mmu notifier listener.
> > 
> > The rational for this patchset:
> > 
> > CPU page table update can happens for many reasons, not only as a
> > result of a syscall (munmap(), mprotect(), mremap(), madvise(), ...)
> > but also as a result of kernel activities (memory compression, reclaim,
> > migration, ...).
> > 
> > This patch introduce a set of enums that can be associated with each
> > of the events triggering a mmu notifier:
> > 
> >     - UNMAP: munmap() or mremap()
> >     - CLEAR: page table is cleared (migration, compaction, reclaim, ...)
> >     - PROTECTION_VMA: change in access protections for the range
> >     - PROTECTION_PAGE: change in access protections for page in the range
> >     - SOFT_DIRTY: soft dirtyness tracking
> > 
> > Being able to identify munmap() and mremap() from other reasons why the
> > page table is cleared is important to allow user of mmu notifier to
> > update their own internal tracking structure accordingly (on munmap or
> > mremap it is not longer needed to track range of virtual address as it
> > becomes invalid). Without this serie, driver are force to assume that
> > every notification is an munmap which triggers useless trashing within
> > drivers that associate structure with range of virtual address. Each
> > driver is force to free up its tracking structure and then restore it
> > on next device page fault. With this serie we can also optimize device
> > page table update [6].
> > 
> > More over this can also be use to optimize out some page table updates
> > like for KVM where we can update the secondary MMU directly from the
> > callback instead of clearing it.
> 
> We seem to be rather short of review input on this patchset.  ie: there
> is none.

I forgot to update the review tag but Ralph did review v5:
https://lkml.org/lkml/2019/2/22/564
https://lkml.org/lkml/2019/2/22/561
https://lkml.org/lkml/2019/2/22/558
https://lkml.org/lkml/2019/2/22/710
https://lkml.org/lkml/2019/2/22/711
https://lkml.org/lkml/2019/2/22/695
https://lkml.org/lkml/2019/2/22/738
https://lkml.org/lkml/2019/2/22/757

and since this v6 is a rebase just with better comments here and
there i believe those reviews holds.

> 
> > ACKS AMD/RADEON https://lkml.org/lkml/2019/2/1/395
> 
> OK, kind of ackish, but not a review.
> 
> > ACKS RDMA https://lkml.org/lkml/2018/12/6/1473
> 
> This actually acks the infiniband part of a patch which isn't in this
> series.

This to show that they are end user and that those end user are
wanted. Also obviously i will be using this within HMM and thus
it will be use by mlx5, nouveau and amdgpu (which are all the
HMM user that are either upstream or queue up for 5.2 or 5.3).

> So we have some work to do, please.  Who would be suitable reviewers?

Anyone willing to review mmu notifier code. I believe this patchset is
not that hard to review this is about giving contextual informations
on why mmu notifier are happening it does not change the logic of any-
thing. They are no maintainers for the mmu notifier so i don't have a
person i can single out for review, thought given i have been the one
doing most changes in that area it could fall on me ...

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 7/8] mm/mmu_notifier: pass down vma and reasons why mmu notifier is happening v2
  2019-03-26 16:47 ` [PATCH v6 7/8] mm/mmu_notifier: pass down vma and reasons why mmu notifier is happening v2 jglisse
@ 2019-04-10 23:41   ` Ira Weiny
  2019-04-11  5:41     ` Leon Romanovsky
  2019-04-11 14:39     ` Jerome Glisse
  0 siblings, 2 replies; 18+ messages in thread
From: Ira Weiny @ 2019-04-10 23:41 UTC (permalink / raw)
  To: jglisse
  Cc: linux-kernel, Andrew Morton, linux-mm, Christian König,
	Joonas Lahtinen, Jani Nikula, Rodrigo Vivi, Jan Kara,
	Andrea Arcangeli, Peter Xu, Felix Kuehling, Jason Gunthorpe,
	Ross Zwisler, Dan Williams, Paolo Bonzini,
	Radim Krčmář,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

On Tue, Mar 26, 2019 at 12:47:46PM -0400, Jerome Glisse wrote:
> From: Jérôme Glisse <jglisse@redhat.com>
> 
> CPU page table update can happens for many reasons, not only as a result
> of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also
> as a result of kernel activities (memory compression, reclaim, migration,
> ...).
> 
> Users of mmu notifier API track changes to the CPU page table and take
> specific action for them. While current API only provide range of virtual
> address affected by the change, not why the changes is happening
> 
> This patch is just passing down the new informations by adding it to the
> mmu_notifier_range structure.
> 
> Changes since v1:
>     - Initialize flags field from mmu_notifier_range_init() arguments
> 
> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: linux-mm@kvack.org
> Cc: Christian König <christian.koenig@amd.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> Cc: Jason Gunthorpe <jgg@mellanox.com>
> Cc: Ross Zwisler <zwisler@kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Ralph Campbell <rcampbell@nvidia.com>
> Cc: John Hubbard <jhubbard@nvidia.com>
> Cc: kvm@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: linux-rdma@vger.kernel.org
> Cc: Arnd Bergmann <arnd@arndb.de>
> ---
>  include/linux/mmu_notifier.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
> index 62f94cd85455..0379956fff23 100644
> --- a/include/linux/mmu_notifier.h
> +++ b/include/linux/mmu_notifier.h
> @@ -58,10 +58,12 @@ struct mmu_notifier_mm {
>  #define MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0)
>  
>  struct mmu_notifier_range {
> +	struct vm_area_struct *vma;
>  	struct mm_struct *mm;
>  	unsigned long start;
>  	unsigned long end;
>  	unsigned flags;
> +	enum mmu_notifier_event event;
>  };
>  
>  struct mmu_notifier_ops {
> @@ -363,10 +365,12 @@ static inline void mmu_notifier_range_init(struct mmu_notifier_range *range,
>  					   unsigned long start,
>  					   unsigned long end)
>  {
> +	range->vma = vma;
> +	range->event = event;
>  	range->mm = mm;
>  	range->start = start;
>  	range->end = end;
> -	range->flags = 0;
> +	range->flags = flags;

Which of the "user patch sets" uses the new flags?

I'm not seeing that user yet.  In general I don't see anything wrong with the
series and I like the idea of telling drivers why the invalidate has fired.

But is the flags a future feature?

For the series:

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

Ira

>  }
>  
>  #define ptep_clear_flush_young_notify(__vma, __address, __ptep)		\
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 7/8] mm/mmu_notifier: pass down vma and reasons why mmu notifier is happening v2
  2019-04-10 23:41   ` Ira Weiny
@ 2019-04-11  5:41     ` Leon Romanovsky
  2019-04-11 21:50       ` Ira Weiny
  2019-04-11 14:39     ` Jerome Glisse
  1 sibling, 1 reply; 18+ messages in thread
From: Leon Romanovsky @ 2019-04-11  5:41 UTC (permalink / raw)
  To: Ira Weiny
  Cc: jglisse, linux-kernel, Andrew Morton, linux-mm,
	Christian König, Joonas Lahtinen, Jani Nikula, Rodrigo Vivi,
	Jan Kara, Andrea Arcangeli, Peter Xu, Felix Kuehling,
	Jason Gunthorpe, Ross Zwisler, Dan Williams, Paolo Bonzini,
	Radim Krčmář,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

[-- Attachment #1: Type: text/plain, Size: 3560 bytes --]

On Wed, Apr 10, 2019 at 04:41:57PM -0700, Ira Weiny wrote:
> On Tue, Mar 26, 2019 at 12:47:46PM -0400, Jerome Glisse wrote:
> > From: Jérôme Glisse <jglisse@redhat.com>
> >
> > CPU page table update can happens for many reasons, not only as a result
> > of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also
> > as a result of kernel activities (memory compression, reclaim, migration,
> > ...).
> >
> > Users of mmu notifier API track changes to the CPU page table and take
> > specific action for them. While current API only provide range of virtual
> > address affected by the change, not why the changes is happening
> >
> > This patch is just passing down the new informations by adding it to the
> > mmu_notifier_range structure.
> >
> > Changes since v1:
> >     - Initialize flags field from mmu_notifier_range_init() arguments
> >
> > Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: linux-mm@kvack.org
> > Cc: Christian König <christian.koenig@amd.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Jani Nikula <jani.nikula@linux.intel.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Cc: Jan Kara <jack@suse.cz>
> > Cc: Andrea Arcangeli <aarcange@redhat.com>
> > Cc: Peter Xu <peterx@redhat.com>
> > Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > Cc: Ross Zwisler <zwisler@kernel.org>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Cc: Michal Hocko <mhocko@kernel.org>
> > Cc: Christian Koenig <christian.koenig@amd.com>
> > Cc: Ralph Campbell <rcampbell@nvidia.com>
> > Cc: John Hubbard <jhubbard@nvidia.com>
> > Cc: kvm@vger.kernel.org
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > ---
> >  include/linux/mmu_notifier.h | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
> > index 62f94cd85455..0379956fff23 100644
> > --- a/include/linux/mmu_notifier.h
> > +++ b/include/linux/mmu_notifier.h
> > @@ -58,10 +58,12 @@ struct mmu_notifier_mm {
> >  #define MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0)
> >
> >  struct mmu_notifier_range {
> > +	struct vm_area_struct *vma;
> >  	struct mm_struct *mm;
> >  	unsigned long start;
> >  	unsigned long end;
> >  	unsigned flags;
> > +	enum mmu_notifier_event event;
> >  };
> >
> >  struct mmu_notifier_ops {
> > @@ -363,10 +365,12 @@ static inline void mmu_notifier_range_init(struct mmu_notifier_range *range,
> >  					   unsigned long start,
> >  					   unsigned long end)
> >  {
> > +	range->vma = vma;
> > +	range->event = event;
> >  	range->mm = mm;
> >  	range->start = start;
> >  	range->end = end;
> > -	range->flags = 0;
> > +	range->flags = flags;
>
> Which of the "user patch sets" uses the new flags?
>
> I'm not seeing that user yet.  In general I don't see anything wrong with the
> series and I like the idea of telling drivers why the invalidate has fired.
>
> But is the flags a future feature?

It seems that it is used in HMM ODP patch.
https://patchwork.kernel.org/patch/10894281/

Thanks

>
> For the series:
>
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
>
> Ira
>
> >  }
> >
> >  #define ptep_clear_flush_young_notify(__vma, __address, __ptep)		\
> > --
> > 2.20.1
> >

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 7/8] mm/mmu_notifier: pass down vma and reasons why mmu notifier is happening v2
  2019-04-10 23:41   ` Ira Weiny
  2019-04-11  5:41     ` Leon Romanovsky
@ 2019-04-11 14:39     ` Jerome Glisse
  2019-04-11 15:21       ` Weiny, Ira
  1 sibling, 1 reply; 18+ messages in thread
From: Jerome Glisse @ 2019-04-11 14:39 UTC (permalink / raw)
  To: Ira Weiny
  Cc: linux-kernel, Andrew Morton, linux-mm, Christian König,
	Joonas Lahtinen, Jani Nikula, Rodrigo Vivi, Jan Kara,
	Andrea Arcangeli, Peter Xu, Felix Kuehling, Jason Gunthorpe,
	Ross Zwisler, Dan Williams, Paolo Bonzini,
	Radim Krčmář,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

On Wed, Apr 10, 2019 at 04:41:57PM -0700, Ira Weiny wrote:
> On Tue, Mar 26, 2019 at 12:47:46PM -0400, Jerome Glisse wrote:
> > From: Jérôme Glisse <jglisse@redhat.com>
> > 
> > CPU page table update can happens for many reasons, not only as a result
> > of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also
> > as a result of kernel activities (memory compression, reclaim, migration,
> > ...).
> > 
> > Users of mmu notifier API track changes to the CPU page table and take
> > specific action for them. While current API only provide range of virtual
> > address affected by the change, not why the changes is happening
> > 
> > This patch is just passing down the new informations by adding it to the
> > mmu_notifier_range structure.
> > 
> > Changes since v1:
> >     - Initialize flags field from mmu_notifier_range_init() arguments
> > 
> > Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: linux-mm@kvack.org
> > Cc: Christian König <christian.koenig@amd.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Jani Nikula <jani.nikula@linux.intel.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Cc: Jan Kara <jack@suse.cz>
> > Cc: Andrea Arcangeli <aarcange@redhat.com>
> > Cc: Peter Xu <peterx@redhat.com>
> > Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > Cc: Ross Zwisler <zwisler@kernel.org>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Cc: Michal Hocko <mhocko@kernel.org>
> > Cc: Christian Koenig <christian.koenig@amd.com>
> > Cc: Ralph Campbell <rcampbell@nvidia.com>
> > Cc: John Hubbard <jhubbard@nvidia.com>
> > Cc: kvm@vger.kernel.org
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: linux-rdma@vger.kernel.org
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > ---
> >  include/linux/mmu_notifier.h | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
> > index 62f94cd85455..0379956fff23 100644
> > --- a/include/linux/mmu_notifier.h
> > +++ b/include/linux/mmu_notifier.h
> > @@ -58,10 +58,12 @@ struct mmu_notifier_mm {
> >  #define MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0)
> >  
> >  struct mmu_notifier_range {
> > +	struct vm_area_struct *vma;
> >  	struct mm_struct *mm;
> >  	unsigned long start;
> >  	unsigned long end;
> >  	unsigned flags;
> > +	enum mmu_notifier_event event;
> >  };
> >  
> >  struct mmu_notifier_ops {
> > @@ -363,10 +365,12 @@ static inline void mmu_notifier_range_init(struct mmu_notifier_range *range,
> >  					   unsigned long start,
> >  					   unsigned long end)
> >  {
> > +	range->vma = vma;
> > +	range->event = event;
> >  	range->mm = mm;
> >  	range->start = start;
> >  	range->end = end;
> > -	range->flags = 0;
> > +	range->flags = flags;
> 
> Which of the "user patch sets" uses the new flags?
> 
> I'm not seeing that user yet.  In general I don't see anything wrong with the
> series and I like the idea of telling drivers why the invalidate has fired.
> 
> But is the flags a future feature?
> 

I believe the link were in the cover:

https://lkml.org/lkml/2019/1/23/833
https://lkml.org/lkml/2019/1/23/834
https://lkml.org/lkml/2019/1/23/832
https://lkml.org/lkml/2019/1/23/831

I have more coming for HMM but i am waiting after 5.2 once amdgpu
HMM patch are merge upstream as it will change what is passed down
to driver and it would conflict with non merged HMM driver (like
amdgpu today).

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v6 7/8] mm/mmu_notifier: pass down vma and reasons why mmu notifier is happening v2
  2019-04-11 14:39     ` Jerome Glisse
@ 2019-04-11 15:21       ` Weiny, Ira
  2019-04-11 17:00         ` Jerome Glisse
  0 siblings, 1 reply; 18+ messages in thread
From: Weiny, Ira @ 2019-04-11 15:21 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: linux-kernel, Andrew Morton, linux-mm, Christian König,
	Joonas Lahtinen, Jani Nikula, Vivi, Rodrigo, Jan Kara,
	Andrea Arcangeli, Peter Xu, Felix Kuehling, Jason Gunthorpe,
	Ross Zwisler, Williams, Dan J, Paolo Bonzini, Radim Krcmár,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

> On Wed, Apr 10, 2019 at 04:41:57PM -0700, Ira Weiny wrote:
> > On Tue, Mar 26, 2019 at 12:47:46PM -0400, Jerome Glisse wrote:
> > > From: Jérôme Glisse <jglisse@redhat.com>
> > >
> > > CPU page table update can happens for many reasons, not only as a
> > > result of a syscall (munmap(), mprotect(), mremap(), madvise(), ...)
> > > but also as a result of kernel activities (memory compression,
> > > reclaim, migration, ...).
> > >
> > > Users of mmu notifier API track changes to the CPU page table and
> > > take specific action for them. While current API only provide range
> > > of virtual address affected by the change, not why the changes is
> > > happening
> > >
> > > This patch is just passing down the new informations by adding it to
> > > the mmu_notifier_range structure.
> > >
> > > Changes since v1:
> > >     - Initialize flags field from mmu_notifier_range_init()
> > > arguments
> > >
> > > Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Cc: linux-mm@kvack.org
> > > Cc: Christian König <christian.koenig@amd.com>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > Cc: Jani Nikula <jani.nikula@linux.intel.com>
> > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > Cc: Jan Kara <jack@suse.cz>
> > > Cc: Andrea Arcangeli <aarcange@redhat.com>
> > > Cc: Peter Xu <peterx@redhat.com>
> > > Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> > > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > > Cc: Ross Zwisler <zwisler@kernel.org>
> > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > > Cc: Michal Hocko <mhocko@kernel.org>
> > > Cc: Christian Koenig <christian.koenig@amd.com>
> > > Cc: Ralph Campbell <rcampbell@nvidia.com>
> > > Cc: John Hubbard <jhubbard@nvidia.com>
> > > Cc: kvm@vger.kernel.org
> > > Cc: dri-devel@lists.freedesktop.org
> > > Cc: linux-rdma@vger.kernel.org
> > > Cc: Arnd Bergmann <arnd@arndb.de>
> > > ---
> > >  include/linux/mmu_notifier.h | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/mmu_notifier.h
> > > b/include/linux/mmu_notifier.h index 62f94cd85455..0379956fff23
> > > 100644
> > > --- a/include/linux/mmu_notifier.h
> > > +++ b/include/linux/mmu_notifier.h
> > > @@ -58,10 +58,12 @@ struct mmu_notifier_mm {  #define
> > > MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0)
> > >
> > >  struct mmu_notifier_range {
> > > +	struct vm_area_struct *vma;
> > >  	struct mm_struct *mm;
> > >  	unsigned long start;
> > >  	unsigned long end;
> > >  	unsigned flags;
> > > +	enum mmu_notifier_event event;
> > >  };
> > >
> > >  struct mmu_notifier_ops {
> > > @@ -363,10 +365,12 @@ static inline void
> mmu_notifier_range_init (struct mmu_notifier_range *range,
> > >  					   unsigned long start,
> > >  					   unsigned long end)
> > >  {
> > > +	range->vma = vma;
> > > +	range->event = event;
> > >  	range->mm = mm;
> > >  	range->start = start;
> > >  	range->end = end;
> > > -	range->flags = 0;
> > > +	range->flags = flags;
> >
> > Which of the "user patch sets" uses the new flags?
> >
> > I'm not seeing that user yet.  In general I don't see anything wrong
> > with the series and I like the idea of telling drivers why the invalidate has
> fired.
> >
> > But is the flags a future feature?
> >
> 
> I believe the link were in the cover:
> 
> https://lkml.org/lkml/2019/1/23/833
> https://lkml.org/lkml/2019/1/23/834
> https://lkml.org/lkml/2019/1/23/832
> https://lkml.org/lkml/2019/1/23/831
> 
> I have more coming for HMM but i am waiting after 5.2 once amdgpu HMM
> patch are merge upstream as it will change what is passed down to driver
> and it would conflict with non merged HMM driver (like amdgpu today).
> 

Unfortunately this does not answer my question.  Yes I saw the links to the patches which use this in the header.  Furthermore, I checked the links again, I still do not see a use of range->flags nor a use of the new flags parameter to mmu_notifier_range_init().

I still gave a reviewed by because I'm not saying it is wrong I'm just trying to understand what use drivers have of this flag.

So again I'm curious what is the use case of these flags and the use case of exposing it to the users of MMU notifiers?

Ira

> Cheers,
> Jérôme

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 7/8] mm/mmu_notifier: pass down vma and reasons why mmu notifier is happening v2
  2019-04-11 15:21       ` Weiny, Ira
@ 2019-04-11 17:00         ` Jerome Glisse
  0 siblings, 0 replies; 18+ messages in thread
From: Jerome Glisse @ 2019-04-11 17:00 UTC (permalink / raw)
  To: Weiny, Ira
  Cc: linux-kernel, Andrew Morton, linux-mm, Christian König,
	Joonas Lahtinen, Jani Nikula, Vivi, Rodrigo, Jan Kara,
	Andrea Arcangeli, Peter Xu, Felix Kuehling, Jason Gunthorpe,
	Ross Zwisler, Williams, Dan J, Paolo Bonzini, Radim Krcmár,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

On Thu, Apr 11, 2019 at 03:21:08PM +0000, Weiny, Ira wrote:
> > On Wed, Apr 10, 2019 at 04:41:57PM -0700, Ira Weiny wrote:
> > > On Tue, Mar 26, 2019 at 12:47:46PM -0400, Jerome Glisse wrote:
> > > > From: Jérôme Glisse <jglisse@redhat.com>
> > > >
> > > > CPU page table update can happens for many reasons, not only as a
> > > > result of a syscall (munmap(), mprotect(), mremap(), madvise(), ...)
> > > > but also as a result of kernel activities (memory compression,
> > > > reclaim, migration, ...).
> > > >
> > > > Users of mmu notifier API track changes to the CPU page table and
> > > > take specific action for them. While current API only provide range
> > > > of virtual address affected by the change, not why the changes is
> > > > happening
> > > >
> > > > This patch is just passing down the new informations by adding it to
> > > > the mmu_notifier_range structure.
> > > >
> > > > Changes since v1:
> > > >     - Initialize flags field from mmu_notifier_range_init()
> > > > arguments
> > > >
> > > > Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > Cc: linux-mm@kvack.org
> > > > Cc: Christian König <christian.koenig@amd.com>
> > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > Cc: Jani Nikula <jani.nikula@linux.intel.com>
> > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > Cc: Jan Kara <jack@suse.cz>
> > > > Cc: Andrea Arcangeli <aarcange@redhat.com>
> > > > Cc: Peter Xu <peterx@redhat.com>
> > > > Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> > > > Cc: Jason Gunthorpe <jgg@mellanox.com>
> > > > Cc: Ross Zwisler <zwisler@kernel.org>
> > > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > > > Cc: Michal Hocko <mhocko@kernel.org>
> > > > Cc: Christian Koenig <christian.koenig@amd.com>
> > > > Cc: Ralph Campbell <rcampbell@nvidia.com>
> > > > Cc: John Hubbard <jhubbard@nvidia.com>
> > > > Cc: kvm@vger.kernel.org
> > > > Cc: dri-devel@lists.freedesktop.org
> > > > Cc: linux-rdma@vger.kernel.org
> > > > Cc: Arnd Bergmann <arnd@arndb.de>
> > > > ---
> > > >  include/linux/mmu_notifier.h | 6 +++++-
> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/include/linux/mmu_notifier.h
> > > > b/include/linux/mmu_notifier.h index 62f94cd85455..0379956fff23
> > > > 100644
> > > > --- a/include/linux/mmu_notifier.h
> > > > +++ b/include/linux/mmu_notifier.h
> > > > @@ -58,10 +58,12 @@ struct mmu_notifier_mm {  #define
> > > > MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0)
> > > >
> > > >  struct mmu_notifier_range {
> > > > +	struct vm_area_struct *vma;
> > > >  	struct mm_struct *mm;
> > > >  	unsigned long start;
> > > >  	unsigned long end;
> > > >  	unsigned flags;
> > > > +	enum mmu_notifier_event event;
> > > >  };
> > > >
> > > >  struct mmu_notifier_ops {
> > > > @@ -363,10 +365,12 @@ static inline void
> > mmu_notifier_range_init (struct mmu_notifier_range *range,
> > > >  					   unsigned long start,
> > > >  					   unsigned long end)
> > > >  {
> > > > +	range->vma = vma;
> > > > +	range->event = event;
> > > >  	range->mm = mm;
> > > >  	range->start = start;
> > > >  	range->end = end;
> > > > -	range->flags = 0;
> > > > +	range->flags = flags;
> > >
> > > Which of the "user patch sets" uses the new flags?
> > >
> > > I'm not seeing that user yet.  In general I don't see anything wrong
> > > with the series and I like the idea of telling drivers why the invalidate has
> > fired.
> > >
> > > But is the flags a future feature?
> > >
> > 
> > I believe the link were in the cover:
> > 
> > https://lkml.org/lkml/2019/1/23/833
> > https://lkml.org/lkml/2019/1/23/834
> > https://lkml.org/lkml/2019/1/23/832
> > https://lkml.org/lkml/2019/1/23/831
> > 
> > I have more coming for HMM but i am waiting after 5.2 once amdgpu HMM
> > patch are merge upstream as it will change what is passed down to driver
> > and it would conflict with non merged HMM driver (like amdgpu today).
> > 
> 
> Unfortunately this does not answer my question.  Yes I saw the links to the patches which use this in the header.  Furthermore, I checked the links again, I still do not see a use of range->flags nor a use of the new flags parameter to mmu_notifier_range_init().
> 
> I still gave a reviewed by because I'm not saying it is wrong I'm just trying to understand what use drivers have of this flag.
> 
> So again I'm curious what is the use case of these flags and the use case of exposing it to the users of MMU notifiers?

Oh sorry did miss the exact question, not enough coffee. The
flags is use for BLOCKABLE i converted the bool blockable to
an int flags field so that we can have flags in the future.
The first user is probably gonna be for restoring the KVM
change_pte() optimization.

https://lkml.org/lkml/2019/2/19/754
https://lkml.org/lkml/2019/2/20/1087

I droped the CHANGE_PTE flags from this version as i need to
harass the KVM people some more for them to review this as
today the change_pte() optimization does not work so today
the change_pte() callback are useless and just wasting CPU
cycles.

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v6 7/8] mm/mmu_notifier: pass down vma and reasons why mmu notifier is happening v2
  2019-04-11  5:41     ` Leon Romanovsky
@ 2019-04-11 21:50       ` Ira Weiny
  0 siblings, 0 replies; 18+ messages in thread
From: Ira Weiny @ 2019-04-11 21:50 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: jglisse, linux-kernel, Andrew Morton, linux-mm,
	Christian König, Joonas Lahtinen, Jani Nikula, Rodrigo Vivi,
	Jan Kara, Andrea Arcangeli, Peter Xu, Felix Kuehling,
	Jason Gunthorpe, Ross Zwisler, Dan Williams, Paolo Bonzini,
	Radim Krčmář,
	Michal Hocko, Ralph Campbell, John Hubbard, kvm, dri-devel,
	linux-rdma, Arnd Bergmann

On Thu, Apr 11, 2019 at 08:41:30AM +0300, Leon Romanovsky wrote:
> On Wed, Apr 10, 2019 at 04:41:57PM -0700, Ira Weiny wrote:
> > On Tue, Mar 26, 2019 at 12:47:46PM -0400, Jerome Glisse wrote:
> > > From: Jérôme Glisse <jglisse@redhat.com>
> > >

[snip]

> > > diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
> > > index 62f94cd85455..0379956fff23 100644
> > > --- a/include/linux/mmu_notifier.h
> > > +++ b/include/linux/mmu_notifier.h
> > > @@ -58,10 +58,12 @@ struct mmu_notifier_mm {
> > >  #define MMU_NOTIFIER_RANGE_BLOCKABLE (1 << 0)
> > >
> > >  struct mmu_notifier_range {
> > > +	struct vm_area_struct *vma;
> > >  	struct mm_struct *mm;
> > >  	unsigned long start;
> > >  	unsigned long end;
> > >  	unsigned flags;
> > > +	enum mmu_notifier_event event;
> > >  };
> > >
> > >  struct mmu_notifier_ops {
> > > @@ -363,10 +365,12 @@ static inline void mmu_notifier_range_init(struct mmu_notifier_range *range,
> > >  					   unsigned long start,
> > >  					   unsigned long end)
> > >  {
> > > +	range->vma = vma;
> > > +	range->event = event;
> > >  	range->mm = mm;
> > >  	range->start = start;
> > >  	range->end = end;
> > > -	range->flags = 0;
> > > +	range->flags = flags;
> >
> > Which of the "user patch sets" uses the new flags?
> >
> > I'm not seeing that user yet.  In general I don't see anything wrong with the
> > series and I like the idea of telling drivers why the invalidate has fired.
> >
> > But is the flags a future feature?
> 
> It seems that it is used in HMM ODP patch.
> https://patchwork.kernel.org/patch/10894281/

AFAICT the flags in that patch are "hmm_range->flags"  not
"mmu_notifier_range->flags"

They are not the same.

Ira

> 
> Thanks
> 
> >
> > For the series:
> >
> > Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> >
> > Ira
> >
> > >  }
> > >
> > >  #define ptep_clear_flush_young_notify(__vma, __address, __ptep)		\
> > > --
> > > 2.20.1
> > >



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-04-11 21:50 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-26 16:47 [PATCH v6 0/8] mmu notifier provide context informations jglisse
2019-03-26 16:47 ` [PATCH v6 1/8] mm/mmu_notifier: helper to test if a range invalidation is blockable jglisse
2019-03-26 16:47 ` [PATCH v6 2/8] mm/mmu_notifier: convert user range->blockable to helper function jglisse
2019-03-26 16:47 ` [PATCH v6 3/8] mm/mmu_notifier: convert mmu_notifier_range->blockable to a flags jglisse
2019-03-26 16:47 ` [PATCH v6 4/8] mm/mmu_notifier: contextual information for event enums jglisse
2019-03-26 16:47 ` [PATCH v6 5/8] mm/mmu_notifier: contextual information for event triggering invalidation v2 jglisse
2019-03-26 16:47 ` [PATCH v6 6/8] mm/mmu_notifier: use correct mmu_notifier events for each invalidation jglisse
2019-03-26 16:47 ` [PATCH v6 7/8] mm/mmu_notifier: pass down vma and reasons why mmu notifier is happening v2 jglisse
2019-04-10 23:41   ` Ira Weiny
2019-04-11  5:41     ` Leon Romanovsky
2019-04-11 21:50       ` Ira Weiny
2019-04-11 14:39     ` Jerome Glisse
2019-04-11 15:21       ` Weiny, Ira
2019-04-11 17:00         ` Jerome Glisse
2019-03-26 16:47 ` [PATCH v6 8/8] mm/mmu_notifier: mmu_notifier_range_update_to_read_only() helper jglisse
2019-04-09 14:21 ` [PATCH v6 0/8] mmu notifier provide context informations Jerome Glisse
2019-04-09 22:08 ` Andrew Morton
2019-04-10 16:06   ` Jerome Glisse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).