* [PATCH 0/2] [i915] Getting rid of GUP and use HMM for user ptr features.
@ 2018-09-10  0:57 ` jglisse
From: jglisse @ 2018-09-10  0:57 UTC
  To: linux-kernel
  Cc: Jérôme Glisse, dri-devel, David Airlie, Daniel Vetter,
	Chris Wilson, Lionel Landwerlin, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, intel-gfx

From: Jérôme Glisse <jglisse@redhat.com>

[This depends on an HMM patchset queued upstream; see branch [1]]

This is a simple change to switch to using HMM for userptr buffer
objects, which conveniently avoids pinning pages. I have more things
in the pipeline to make HMM more useful for such cases (like sharing
more resources across multiple mirrors of the same process).

Besides avoiding pinning, this is also an attempt to isolate the core
mm from device drivers by having a clearly defined API and boundary
where we can set everyone's expectations, so that mm folks do not have
to read and understand driver code and, conversely, driver folks do
not have to find their way through the mm maze.

This is also part of what I want to discuss during XDC 2018.

Consider this an RFC to start the discussion.

[1] https://cgit.freedesktop.org/~glisse/linux/log/?h=hmm-intel-v00
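
To make the shape of this change concrete, here is a rough
before/after sketch. The HMM calls come from the branch in [1] and
are illustrative of the pending API, not of a stable upstream
interface:

  /* Before: GUP pins every page backing the userptr range. */
  pinned = get_user_pages_remote(task, mm, addr, npages, flags,
                                 pvec, NULL, NULL);
  /* ... pages stay pinned until put_page() at release time ... */

  /* After: HMM mirrors the range without pinning anything. */
  err = hmm_vma_fault(&range, true);
  /* CPU page table updates arrive through the mirror's
   * sync_cpu_device_pagetables() callback, and the faulted snapshot
   * is only committed if hmm_vma_range_done() reports that no
   * invalidation raced with it. */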

Cc: dri-devel@lists.freedesktop.org
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org

Jérôme Glisse (2):
  gpu/i915: use HMM mirror instead of mmu_notifier
  gpu/i915: use HMM mirror for userptr buffer object.

 drivers/gpu/drm/i915/Kconfig            |   4 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c | 395 ++++++++++++------------
 2 files changed, 199 insertions(+), 200 deletions(-)

-- 
2.17.1



* [PATCH 1/2] gpu/i915: use HMM mirror instead of mmu_notifier
@ 2018-09-10  0:57   ` jglisse
From: jglisse @ 2018-09-10  0:57 UTC
  To: linux-kernel
  Cc: Jérôme Glisse, dri-devel, David Airlie, Daniel Vetter,
	Chris Wilson, Lionel Landwerlin, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, intel-gfx

From: Jérôme Glisse <jglisse@redhat.com>

HMM provides a set of helpers to save individual drivers from
re-implementing their own. This patch converts i915 to use an HMM
mirror to track CPU page table updates and invalidate userptr objects
accordingly.
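
In condensed form, the pattern adopted here looks as follows (names
taken from the diff below; the hmm_mirror API is the pre-upstream one
this series depends on, so treat this as a sketch rather than a
stable interface):

  static int i915_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
                                             const struct hmm_update *update)
  {
          /*
           * CPU page tables changed in [update->start, update->end).
           * Find overlapping userptr objects and invalidate their GPU
           * mappings; if update->blockable is false and we would have
           * to sleep, bail out with -EAGAIN instead.
           */
          return 0;
  }

  static const struct hmm_mirror_ops i915_mirror_ops = {
          .sync_cpu_device_pagetables = &i915_sync_cpu_device_pagetables,
          .release = &i915_mirror_release,
  };

  /* One mirror per tracked mm, registered under mmap_sem (write): */
  err = hmm_mirror_register(&mirror->mirror, mm->mm);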

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
---
 drivers/gpu/drm/i915/Kconfig            |   4 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c | 189 ++++++++++++------------
 2 files changed, 97 insertions(+), 96 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 33a458b7f1fc..40bba0bd8124 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -87,10 +87,10 @@ config DRM_I915_COMPRESS_ERROR
 config DRM_I915_USERPTR
 	bool "Always enable userptr support"
 	depends on DRM_I915
-	select MMU_NOTIFIER
+	select HMM_MIRROR
 	default y
 	help
-	  This option selects CONFIG_MMU_NOTIFIER if it isn't already
+	  This option selects CONFIG_HMM_MIRROR if it isn't already
 	  selected to enabled full userptr support.
 
 	  If in doubt, say "Y".
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 2c9b284036d1..5e09b654b5ad 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -28,7 +28,7 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 #include <linux/mmu_context.h>
-#include <linux/mmu_notifier.h>
+#include <linux/hmm.h>
 #include <linux/mempolicy.h>
 #include <linux/swap.h>
 #include <linux/sched/mm.h>
@@ -36,25 +36,25 @@
 struct i915_mm_struct {
 	struct mm_struct *mm;
 	struct drm_i915_private *i915;
-	struct i915_mmu_notifier *mn;
+	struct i915_mirror *mirror;
 	struct hlist_node node;
 	struct kref kref;
 	struct work_struct work;
 };
 
-#if defined(CONFIG_MMU_NOTIFIER)
+#if defined(CONFIG_HMM_MIRROR)
 #include <linux/interval_tree.h>
 
-struct i915_mmu_notifier {
+struct i915_mirror {
 	spinlock_t lock;
 	struct hlist_node node;
-	struct mmu_notifier mn;
+	struct hmm_mirror mirror;
 	struct rb_root_cached objects;
 	struct workqueue_struct *wq;
 };
 
 struct i915_mmu_object {
-	struct i915_mmu_notifier *mn;
+	struct i915_mirror *mirror;
 	struct drm_i915_gem_object *obj;
 	struct interval_tree_node it;
 	struct list_head link;
@@ -99,7 +99,7 @@ static void add_object(struct i915_mmu_object *mo)
 	if (mo->attached)
 		return;
 
-	interval_tree_insert(&mo->it, &mo->mn->objects);
+	interval_tree_insert(&mo->it, &mo->mirror->objects);
 	mo->attached = true;
 }
 
@@ -108,33 +108,29 @@ static void del_object(struct i915_mmu_object *mo)
 	if (!mo->attached)
 		return;
 
-	interval_tree_remove(&mo->it, &mo->mn->objects);
+	interval_tree_remove(&mo->it, &mo->mirror->objects);
 	mo->attached = false;
 }
 
-static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
-						       struct mm_struct *mm,
-						       unsigned long start,
-						       unsigned long end,
-						       bool blockable)
+static int i915_sync_cpu_device_pagetables(struct hmm_mirror *_mirror,
+					   const struct hmm_update *update)
 {
-	struct i915_mmu_notifier *mn =
-		container_of(_mn, struct i915_mmu_notifier, mn);
+	struct i915_mirror *mirror =
+		container_of(_mirror, struct i915_mirror, mirror);
+	/* interval ranges are inclusive, but invalidate range is exclusive */
+	unsigned long end = update->end - 1;
 	struct i915_mmu_object *mo;
 	struct interval_tree_node *it;
 	LIST_HEAD(cancelled);
 
-	if (RB_EMPTY_ROOT(&mn->objects.rb_root))
+	if (RB_EMPTY_ROOT(&mirror->objects.rb_root))
 		return 0;
 
-	/* interval ranges are inclusive, but invalidate range is exclusive */
-	end--;
-
-	spin_lock(&mn->lock);
-	it = interval_tree_iter_first(&mn->objects, start, end);
+	spin_lock(&mirror->lock);
+	it = interval_tree_iter_first(&mirror->objects, update->start, end);
 	while (it) {
-		if (!blockable) {
-			spin_unlock(&mn->lock);
+		if (!update->blockable) {
+			spin_unlock(&mirror->lock);
 			return -EAGAIN;
 		}
 		/* The mmu_object is released late when destroying the
@@ -148,50 +144,56 @@ static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
 		 */
 		mo = container_of(it, struct i915_mmu_object, it);
 		if (kref_get_unless_zero(&mo->obj->base.refcount))
-			queue_work(mn->wq, &mo->work);
+			queue_work(mirror->wq, &mo->work);
 
 		list_add(&mo->link, &cancelled);
-		it = interval_tree_iter_next(it, start, end);
+		it = interval_tree_iter_next(it, update->start, end);
 	}
 	list_for_each_entry(mo, &cancelled, link)
 		del_object(mo);
-	spin_unlock(&mn->lock);
+	spin_unlock(&mirror->lock);
 
 	if (!list_empty(&cancelled))
-		flush_workqueue(mn->wq);
+		flush_workqueue(mirror->wq);
 
 	return 0;
 }
 
-static const struct mmu_notifier_ops i915_gem_userptr_notifier = {
-	.invalidate_range_start = i915_gem_userptr_mn_invalidate_range_start,
+static void
+i915_mirror_release(struct hmm_mirror *mirror)
+{
+}
+
+static const struct hmm_mirror_ops i915_mirror_ops = {
+	.sync_cpu_device_pagetables = &i915_sync_cpu_device_pagetables,
+	.release = &i915_mirror_release,
 };
 
-static struct i915_mmu_notifier *
-i915_mmu_notifier_create(struct mm_struct *mm)
+static struct i915_mirror*
+i915_mirror_create(struct mm_struct *mm)
 {
-	struct i915_mmu_notifier *mn;
+	struct i915_mirror *mirror;
 
-	mn = kmalloc(sizeof(*mn), GFP_KERNEL);
-	if (mn == NULL)
+	mirror = kmalloc(sizeof(*mirror), GFP_KERNEL);
+	if (mirror == NULL)
 		return ERR_PTR(-ENOMEM);
 
-	spin_lock_init(&mn->lock);
-	mn->mn.ops = &i915_gem_userptr_notifier;
-	mn->objects = RB_ROOT_CACHED;
-	mn->wq = alloc_workqueue("i915-userptr-release",
-				 WQ_UNBOUND | WQ_MEM_RECLAIM,
-				 0);
-	if (mn->wq == NULL) {
-		kfree(mn);
+	spin_lock_init(&mirror->lock);
+	mirror->mirror.ops = &i915_mirror_ops;
+	mirror->objects = RB_ROOT_CACHED;
+	mirror->wq = alloc_workqueue("i915-userptr-release",
+				     WQ_UNBOUND | WQ_MEM_RECLAIM,
+				     0);
+	if (mirror->wq == NULL) {
+		kfree(mirror);
 		return ERR_PTR(-ENOMEM);
 	}
 
-	return mn;
+	return mirror;
 }
 
 static void
-i915_gem_userptr_release__mmu_notifier(struct drm_i915_gem_object *obj)
+i915_gem_userptr_release__mirror(struct drm_i915_gem_object *obj)
 {
 	struct i915_mmu_object *mo;
 
@@ -199,38 +201,38 @@ i915_gem_userptr_release__mmu_notifier(struct drm_i915_gem_object *obj)
 	if (mo == NULL)
 		return;
 
-	spin_lock(&mo->mn->lock);
+	spin_lock(&mo->mirror->lock);
 	del_object(mo);
-	spin_unlock(&mo->mn->lock);
+	spin_unlock(&mo->mirror->lock);
 	kfree(mo);
 
 	obj->userptr.mmu_object = NULL;
 }
 
-static struct i915_mmu_notifier *
-i915_mmu_notifier_find(struct i915_mm_struct *mm)
+static struct i915_mirror *
+i915_mirror_find(struct i915_mm_struct *mm)
 {
-	struct i915_mmu_notifier *mn;
+	struct i915_mirror *mirror;
 	int err = 0;
 
-	mn = mm->mn;
-	if (mn)
-		return mn;
+	mirror = mm->mirror;
+	if (mirror)
+		return mirror;
 
-	mn = i915_mmu_notifier_create(mm->mm);
-	if (IS_ERR(mn))
-		err = PTR_ERR(mn);
+	mirror = i915_mirror_create(mm->mm);
+	if (IS_ERR(mirror))
+		err = PTR_ERR(mirror);
 
 	down_write(&mm->mm->mmap_sem);
 	mutex_lock(&mm->i915->mm_lock);
-	if (mm->mn == NULL && !err) {
+	if (mm->mirror == NULL && !err) {
 		/* Protected by mmap_sem (write-lock) */
-		err = __mmu_notifier_register(&mn->mn, mm->mm);
+		err = hmm_mirror_register(&mirror->mirror, mm->mm);
 		if (!err) {
 			/* Protected by mm_lock */
-			mm->mn = fetch_and_zero(&mn);
+			mm->mirror = fetch_and_zero(&mirror);
 		}
-	} else if (mm->mn) {
+	} else if (mm->mirror) {
 		/*
 		 * Someone else raced and successfully installed the mmu
 		 * notifier, we can cancel our own errors.
@@ -240,19 +242,19 @@ i915_mmu_notifier_find(struct i915_mm_struct *mm)
 	mutex_unlock(&mm->i915->mm_lock);
 	up_write(&mm->mm->mmap_sem);
 
-	if (mn && !IS_ERR(mn)) {
-		destroy_workqueue(mn->wq);
-		kfree(mn);
+	if (mirror && !IS_ERR(mirror)) {
+		destroy_workqueue(mirror->wq);
+		kfree(mirror);
 	}
 
-	return err ? ERR_PTR(err) : mm->mn;
+	return err ? ERR_PTR(err) : mm->mirror;
 }
 
 static int
-i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
+i915_gem_userptr_init__mirror(struct drm_i915_gem_object *obj,
 				    unsigned flags)
 {
-	struct i915_mmu_notifier *mn;
+	struct i915_mirror *mirror;
 	struct i915_mmu_object *mo;
 
 	if (flags & I915_USERPTR_UNSYNCHRONIZED)
@@ -261,15 +263,15 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
 	if (WARN_ON(obj->userptr.mm == NULL))
 		return -EINVAL;
 
-	mn = i915_mmu_notifier_find(obj->userptr.mm);
-	if (IS_ERR(mn))
-		return PTR_ERR(mn);
+	mirror = i915_mirror_find(obj->userptr.mm);
+	if (IS_ERR(mirror))
+		return PTR_ERR(mirror);
 
 	mo = kzalloc(sizeof(*mo), GFP_KERNEL);
 	if (mo == NULL)
 		return -ENOMEM;
 
-	mo->mn = mn;
+	mo->mirror = mirror;
 	mo->obj = obj;
 	mo->it.start = obj->userptr.ptr;
 	mo->it.last = obj->userptr.ptr + obj->base.size - 1;
@@ -280,26 +282,25 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
 }
 
 static void
-i915_mmu_notifier_free(struct i915_mmu_notifier *mn,
-		       struct mm_struct *mm)
+i915_mirror_free(struct i915_mirror *mirror, struct mm_struct *mm)
 {
-	if (mn == NULL)
+	if (mirror == NULL)
 		return;
 
-	mmu_notifier_unregister(&mn->mn, mm);
-	destroy_workqueue(mn->wq);
-	kfree(mn);
+	hmm_mirror_unregister(&mirror->mirror);
+	destroy_workqueue(mirror->wq);
+	kfree(mirror);
 }
 
 #else
 
 static void
-i915_gem_userptr_release__mmu_notifier(struct drm_i915_gem_object *obj)
+i915_gem_userptr_release__mirror(struct drm_i915_gem_object *obj)
 {
 }
 
 static int
-i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
+i915_gem_userptr_init__mirror(struct drm_i915_gem_object *obj,
 				    unsigned flags)
 {
 	if ((flags & I915_USERPTR_UNSYNCHRONIZED) == 0)
@@ -312,8 +313,8 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
 }
 
 static void
-i915_mmu_notifier_free(struct i915_mmu_notifier *mn,
-		       struct mm_struct *mm)
+i915_mirror_free(struct i915_mirror *mirror,
+		 struct mm_struct *mm)
 {
 }
 
@@ -364,7 +365,7 @@ i915_gem_userptr_init__mm_struct(struct drm_i915_gem_object *obj)
 		mm->mm = current->mm;
 		mmgrab(current->mm);
 
-		mm->mn = NULL;
+		mm->mirror = NULL;
 
 		/* Protected by dev_priv->mm_lock */
 		hash_add(dev_priv->mm_structs,
@@ -382,7 +383,7 @@ static void
 __i915_mm_struct_free__worker(struct work_struct *work)
 {
 	struct i915_mm_struct *mm = container_of(work, typeof(*mm), work);
-	i915_mmu_notifier_free(mm->mn, mm->mm);
+	i915_mirror_free(mm->mirror, mm->mm);
 	mmdrop(mm->mm);
 	kfree(mm);
 }
@@ -474,14 +475,14 @@ __i915_gem_userptr_set_active(struct drm_i915_gem_object *obj,
 	 * a GTT mmapping (possible with a MAP_FIXED) - then when we have
 	 * to invalidate that mmaping, mm_invalidate_range is called with
 	 * the userptr address *and* the struct_mutex held.  To prevent that
-	 * we set a flag under the i915_mmu_notifier spinlock to indicate
+	 * we set a flag under the i915_mirror spinlock to indicate
 	 * whether this object is valid.
 	 */
-#if defined(CONFIG_MMU_NOTIFIER)
+#if defined(CONFIG_HMM_MIRROR)
 	if (obj->userptr.mmu_object == NULL)
 		return 0;
 
-	spin_lock(&obj->userptr.mmu_object->mn->lock);
+	spin_lock(&obj->userptr.mmu_object->mirror->lock);
 	/* In order to serialise get_pages with an outstanding
 	 * cancel_userptr, we must drop the struct_mutex and try again.
 	 */
@@ -491,7 +492,7 @@ __i915_gem_userptr_set_active(struct drm_i915_gem_object *obj,
 		add_object(obj->userptr.mmu_object);
 	else
 		ret = -EAGAIN;
-	spin_unlock(&obj->userptr.mmu_object->mn->lock);
+	spin_unlock(&obj->userptr.mmu_object->mirror->lock);
 #endif
 
 	return ret;
@@ -625,10 +626,10 @@ static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 	 * the process may not be expecting that a particular piece of
 	 * memory is tied to the GPU.
 	 *
-	 * Fortunately, we can hook into the mmu_notifier in order to
-	 * discard the page references prior to anything nasty happening
-	 * to the vma (discard or cloning) which should prevent the more
-	 * egregious cases from causing harm.
+	 * Fortunately, we can hook into the mirror callback in order to
+	 * discard the page references prior to anything nasty happening to
+	 * the vma (discard or cloning) which should prevent the more
+	 * egregious cases from causing harm.
 	 */
 
 	if (obj->userptr.work) {
@@ -706,7 +707,7 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 static void
 i915_gem_userptr_release(struct drm_i915_gem_object *obj)
 {
-	i915_gem_userptr_release__mmu_notifier(obj);
+	i915_gem_userptr_release__mirror(obj);
 	i915_gem_userptr_release__mm_struct(obj);
 }
 
@@ -716,7 +717,7 @@ i915_gem_userptr_dmabuf_export(struct drm_i915_gem_object *obj)
 	if (obj->userptr.mmu_object)
 		return 0;
 
-	return i915_gem_userptr_init__mmu_notifier(obj, 0);
+	return i915_gem_userptr_init__mirror(obj, 0);
 }
 
 static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
@@ -822,12 +823,12 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
 		i915_gem_object_set_readonly(obj);
 
 	/* And keep a pointer to the current->mm for resolving the user pages
-	 * at binding. This means that we need to hook into the mmu_notifier
-	 * in order to detect if the mmu is destroyed.
+	 * at binding. This means that we need to hook into the mirror in order
+	 * to detect if the mmu is destroyed.
 	 */
 	ret = i915_gem_userptr_init__mm_struct(obj);
 	if (ret == 0)
-		ret = i915_gem_userptr_init__mmu_notifier(obj, args->flags);
+		ret = i915_gem_userptr_init__mirror(obj, args->flags);
 	if (ret == 0)
 		ret = drm_gem_handle_create(file, &obj->base, &handle);
 
-- 
2.17.1



* [PATCH 2/2] gpu/i915: use HMM mirror for userptr buffer object.
@ 2018-09-10  0:57   ` jglisse
From: jglisse @ 2018-09-10  0:57 UTC
  To: linux-kernel
  Cc: Jérôme Glisse, dri-devel, David Airlie, Daniel Vetter,
	Chris Wilson, Lionel Landwerlin, Jani Nikula, Joonas Lahtinen,
	Rodrigo Vivi, intel-gfx

From: Jérôme Glisse <jglisse@redhat.com>

This replaces the existing code that relies on get_user_pages() aka
GUP with code that uses an HMM mirror to mirror a range of virtual
addresses as a buffer object accessible by the GPU. There are no
functional changes from the userspace point of view.

From the kernel point of view we no longer pin pages for userptr
buffer objects, which is a welcome change (I assume everyone dislikes
page pinning as much as I do).

Another change, from the kernel point of view, is that there is no
longer a fast path using get_user_pages_fast(); this can eventually
be added back through HMM.
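
The core of the new flow, condensed from i915_gem_userptr_map() below
(again using the pre-upstream HMM API this series depends on), is:

  /* Ask HMM for valid (and, when a writable mapping is needed,
   * writable) pages; nothing is pinned at any point. */
  for (i = 0; i < npages; ++i)
          range.pfns[i] = range.flags[HMM_PFN_VALID] |
                          (write ? range.flags[HMM_PFN_WRITE] : 0);

  err = hmm_vma_fault(&range, true);
  if (err)
          return err;

  /* ... convert pfns with hmm_pfn_to_page() and build the sg_table ... */

  /*
   * The snapshot is only valid if no invalidation raced with the
   * fault; hmm_vma_range_done() tells us whether to commit or retry.
   */
  if (!hmm_vma_range_done(&range))
          return -EAGAIN;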

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
---
 drivers/gpu/drm/i915/i915_gem_userptr.c | 206 ++++++++++++------------
 1 file changed, 102 insertions(+), 104 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 5e09b654b5ad..378aab438ebd 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -464,7 +464,7 @@ __i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
 
 static int
 __i915_gem_userptr_set_active(struct drm_i915_gem_object *obj,
-			      bool value)
+			      struct hmm_range *range)
 {
 	int ret = 0;
 
@@ -486,86 +486,120 @@ __i915_gem_userptr_set_active(struct drm_i915_gem_object *obj,
 	/* In order to serialise get_pages with an outstanding
 	 * cancel_userptr, we must drop the struct_mutex and try again.
 	 */
-	if (!value)
+	if (range) {
+		if (!hmm_vma_range_done(range))
+			ret = -EAGAIN;
+		else
+			add_object(obj->userptr.mmu_object);
+	} else
 		del_object(obj->userptr.mmu_object);
-	else if (!work_pending(&obj->userptr.mmu_object->work))
-		add_object(obj->userptr.mmu_object);
-	else
-		ret = -EAGAIN;
 	spin_unlock(&obj->userptr.mmu_object->mirror->lock);
 #endif
 
 	return ret;
 }
 
-static void
-__i915_gem_userptr_get_pages_worker(struct work_struct *_work)
+static int
+i915_gem_userptr_map(struct drm_i915_gem_object *obj, bool try)
 {
-	struct get_pages_work *work = container_of(_work, typeof(*work), work);
-	struct drm_i915_gem_object *obj = work->obj;
-	const int npages = obj->base.size >> PAGE_SHIFT;
+#if defined(CONFIG_HMM_MIRROR)
+	static const uint64_t i915_range_flags[HMM_PFN_FLAG_MAX] = {
+		(1 << 0), /* HMM_PFN_VALID */
+		(1 << 1), /* HMM_PFN_WRITE */
+		0 /* HMM_PFN_DEVICE_PRIVATE */
+	};
+	static const uint64_t i915_range_values[HMM_PFN_VALUE_MAX] = {
+		0xfffffffffffffffeUL, /* HMM_PFN_ERROR */
+		0, /* HMM_PFN_NONE */
+		0xfffffffffffffffcUL /* HMM_PFN_SPECIAL */
+	};
+
+	const unsigned long npages = obj->base.size >> PAGE_SHIFT;
+	struct mm_struct *mm = obj->userptr.mm->mm;
+	struct sg_table *pages;
+	struct hmm_range range;
 	struct page **pvec;
-	int pinned, ret;
-
-	ret = -ENOMEM;
-	pinned = 0;
-
-	pvec = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
-	if (pvec != NULL) {
-		struct mm_struct *mm = obj->userptr.mm->mm;
-		unsigned int flags = 0;
-
-		if (!i915_gem_object_is_readonly(obj))
-			flags |= FOLL_WRITE;
-
-		ret = -EFAULT;
-		if (mmget_not_zero(mm)) {
-			down_read(&mm->mmap_sem);
-			while (pinned < npages) {
-				ret = get_user_pages_remote
-					(work->task, mm,
-					 obj->userptr.ptr + pinned * PAGE_SIZE,
-					 npages - pinned,
-					 flags,
-					 pvec + pinned, NULL, NULL);
-				if (ret < 0)
-					break;
-
-				pinned += ret;
-			}
-			up_read(&mm->mmap_sem);
-			mmput(mm);
-		}
+	unsigned long i;
+	bool write = !i915_gem_object_is_readonly(obj);
+	int err;
+
+	range.pfns = kvmalloc_array(npages, sizeof(uint64_t),
+				    try ? GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN : GFP_KERNEL);
+	if (range.pfns == NULL)
+		return try ? -EAGAIN : -ENOMEM;
+
+	range.pfn_shift = 12;
+	range.start = obj->userptr.ptr;
+	range.flags = i915_range_flags;
+	range.values = i915_range_values;
+	range.end = range.start + obj->base.size;
+
+	range.vma = find_vma(mm, range.start);
+	if (!range.vma || range.vma->vm_file || range.vma->vm_end < range.end) {
+		kvfree(range.pfns);
+		return -EFAULT;
 	}
 
-	mutex_lock(&obj->mm.lock);
-	if (obj->userptr.work == &work->work) {
-		struct sg_table *pages = ERR_PTR(ret);
-
-		if (pinned == npages) {
-			pages = __i915_gem_userptr_alloc_pages(obj, pvec,
-							       npages);
-			if (!IS_ERR(pages)) {
-				pinned = 0;
-				pages = NULL;
+again:
+	for (i = 0; i < npages; ++i) {
+		range.pfns[i] = range.flags[HMM_PFN_VALID];
+		range.pfns[i] |= write ? range.flags[HMM_PFN_WRITE] : 0;
+	}
+
+	err = hmm_vma_fault(&range, true);
+	if (err) {
+		kvfree(range.pfns);
+		return err;
+	}
+
+	for (i = 0, pvec = (void *)range.pfns; i < npages; ++i) {
+		struct page *page = hmm_pfn_to_page(&range, range.pfns[i]);
+
+		if (page == NULL) {
+			hmm_vma_range_done(&range);
+
+			if (try) {
+				kvfree(range.pfns);
+				return -EAGAIN;
 			}
+			goto again;
 		}
-
-		obj->userptr.work = ERR_CAST(pages);
-		if (IS_ERR(pages))
-			__i915_gem_userptr_set_active(obj, false);
+		pvec[i] = page;
 	}
-	mutex_unlock(&obj->mm.lock);
 
-	release_pages(pvec, pinned);
-	kvfree(pvec);
+	if (!try)
+		mutex_lock(&obj->mm.lock);
+	pages = __i915_gem_userptr_alloc_pages(obj, pvec, npages);
+	if (!IS_ERR(pages))
+		__i915_gem_userptr_set_active(obj, &range);
+	else
+		hmm_vma_range_done(&range);
+	if (!try)
+		mutex_unlock(&obj->mm.lock);
+
+	kvfree(range.pfns);
+	return PTR_ERR_OR_ZERO(pages);
+#else /* CONFIG_HMM_MIRROR */
+	return -EFAULT;
+#endif
+}
+
+static void
+__i915_gem_userptr_get_pages_worker(struct work_struct *_work)
+{
+	struct get_pages_work *work = container_of(_work, typeof(*work), work);
+	struct drm_i915_gem_object *obj = work->obj;
+	int ret;
+
+	ret = i915_gem_userptr_map(obj, false);
+	obj->userptr.work = ERR_PTR(ret);
 
 	i915_gem_object_put(obj);
 	put_task_struct(work->task);
 	kfree(work);
 }
 
-static struct sg_table *
+static int
 __i915_gem_userptr_get_pages_schedule(struct drm_i915_gem_object *obj)
 {
 	struct get_pages_work *work;
@@ -591,7 +625,7 @@ __i915_gem_userptr_get_pages_schedule(struct drm_i915_gem_object *obj)
 	 */
 	work = kmalloc(sizeof(*work), GFP_KERNEL);
 	if (work == NULL)
-		return ERR_PTR(-ENOMEM);
+		return -ENOMEM;
 
 	obj->userptr.work = &work->work;
 
@@ -603,17 +637,12 @@ __i915_gem_userptr_get_pages_schedule(struct drm_i915_gem_object *obj)
 	INIT_WORK(&work->work, __i915_gem_userptr_get_pages_worker);
 	queue_work(to_i915(obj->base.dev)->mm.userptr_wq, &work->work);
 
-	return ERR_PTR(-EAGAIN);
+	return -EAGAIN;
 }
 
 static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 {
-	const int num_pages = obj->base.size >> PAGE_SHIFT;
-	struct mm_struct *mm = obj->userptr.mm->mm;
-	struct page **pvec;
-	struct sg_table *pages;
-	bool active;
-	int pinned;
+	int ret;
 
 	/* If userspace should engineer that these pages are replaced in
 	 * the vma between us binding this page into the GTT and completion
@@ -640,40 +669,10 @@ static int i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
 			return -EAGAIN;
 	}
 
-	pvec = NULL;
-	pinned = 0;
-
-	if (mm == current->mm) {
-		pvec = kvmalloc_array(num_pages, sizeof(struct page *),
-				      GFP_KERNEL |
-				      __GFP_NORETRY |
-				      __GFP_NOWARN);
-		if (pvec) /* defer to worker if malloc fails */
-			pinned = __get_user_pages_fast(obj->userptr.ptr,
-						       num_pages,
-						       !i915_gem_object_is_readonly(obj),
-						       pvec);
-	}
-
-	active = false;
-	if (pinned < 0) {
-		pages = ERR_PTR(pinned);
-		pinned = 0;
-	} else if (pinned < num_pages) {
-		pages = __i915_gem_userptr_get_pages_schedule(obj);
-		active = pages == ERR_PTR(-EAGAIN);
-	} else {
-		pages = __i915_gem_userptr_alloc_pages(obj, pvec, num_pages);
-		active = !IS_ERR(pages);
-	}
-	if (active)
-		__i915_gem_userptr_set_active(obj, true);
-
-	if (IS_ERR(pages))
-		release_pages(pvec, pinned);
-	kvfree(pvec);
-
-	return PTR_ERR_OR_ZERO(pages);
+	ret = i915_gem_userptr_map(obj, true);
+	if (ret == -EAGAIN)
+		ret = __i915_gem_userptr_get_pages_schedule(obj);
+	return ret;
 }
 
 static void
@@ -684,7 +683,7 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 	struct page *page;
 
 	BUG_ON(obj->userptr.work != NULL);
-	__i915_gem_userptr_set_active(obj, false);
+	__i915_gem_userptr_set_active(obj, NULL);
 
 	if (obj->mm.madv != I915_MADV_WILLNEED)
 		obj->mm.dirty = false;
@@ -696,7 +695,6 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
 			set_page_dirty(page);
 
 		mark_page_accessed(page);
-		put_page(page);
 	}
 	obj->mm.dirty = false;
 
-- 
2.17.1



* ✗ Fi.CI.BAT: failure for Getting rid of GUP and use HMM for user ptr features.
@ 2018-09-10  9:27 ` Patchwork
From: Patchwork @ 2018-09-10  9:27 UTC
  To: jglisse; +Cc: intel-gfx

== Series Details ==

Series: Getting rid of GUP and use HMM for user ptr features.
URL   : https://patchwork.freedesktop.org/series/49395/
State : failure

== Summary ==

  CALL    scripts/checksyscalls.sh
  DESCEND  objtool
  CHK     include/generated/compile.h
  CC [M]  drivers/gpu/drm/i915/i915_gem_userptr.o
drivers/gpu/drm/i915/i915_gem_userptr.c:51:20: error: field ‘mirror’ has incomplete type
  struct hmm_mirror mirror;
                    ^~~~~~
drivers/gpu/drm/i915/i915_gem_userptr.c:116:22: error: ‘struct hmm_update’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
         const struct hmm_update *update)
                      ^~~~~~~~~~
In file included from ./include/linux/export.h:45:0,
                 from ./include/linux/linkage.h:7,
                 from ./include/linux/kernel.h:7,
                 from ./include/linux/list.h:9,
                 from ./include/linux/agp_backend.h:33,
                 from ./include/drm/drmP.h:35,
                 from drivers/gpu/drm/i915/i915_gem_userptr.c:25:
drivers/gpu/drm/i915/i915_gem_userptr.c: In function ‘i915_sync_cpu_device_pagetables’:
./include/linux/kernel.h:997:32: error: dereferencing pointer to incomplete type ‘struct hmm_mirror’
  BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
                                ^~~~~~
./include/linux/compiler.h:335:18: note: in definition of macro ‘__compiletime_assert’
   int __cond = !(condition);    \
                  ^~~~~~~~~
./include/linux/compiler.h:358:2: note: in expansion of macro ‘_compiletime_assert’
  _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
  ^~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:45:37: note: in expansion of macro ‘compiletime_assert’
 #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                     ^~~~~~~~~~~~~~~~~~
./include/linux/kernel.h:997:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
  BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
  ^~~~~~~~~~~~~~~~
./include/linux/kernel.h:997:20: note: in expansion of macro ‘__same_type’
  BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
                    ^~~~~~~~~~~
drivers/gpu/drm/i915/i915_gem_userptr.c:119:3: note: in expansion of macro ‘container_of’
   container_of(_mirror, struct i915_mirror, mirror);
   ^~~~~~~~~~~~
drivers/gpu/drm/i915/i915_gem_userptr.c:121:28: error: dereferencing pointer to incomplete type ‘const struct hmm_update’
  unsigned long end = update->end - 1;
                            ^~
drivers/gpu/drm/i915/i915_gem_userptr.c: At top level:
drivers/gpu/drm/i915/i915_gem_userptr.c:167:21: error: variable ‘i915_mirror_ops’ has initializer but incomplete type
 static const struct hmm_mirror_ops i915_mirror_ops = {
                     ^~~~~~~~~~~~~~
drivers/gpu/drm/i915/i915_gem_userptr.c:168:3: error: ‘const struct hmm_mirror_ops’ has no member named ‘sync_cpu_device_pagetables’
  .sync_cpu_device_pagetables = &i915_sync_cpu_device_pagetables,
   ^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/i915_gem_userptr.c:168:32: error: excess elements in struct initializer [-Werror]
  .sync_cpu_device_pagetables = &i915_sync_cpu_device_pagetables,
                                ^
drivers/gpu/drm/i915/i915_gem_userptr.c:168:32: note: (near initialization for ‘i915_mirror_ops’)
drivers/gpu/drm/i915/i915_gem_userptr.c:169:3: error: ‘const struct hmm_mirror_ops’ has no member named ‘release’
  .release = &i915_mirror_release,
   ^~~~~~~
drivers/gpu/drm/i915/i915_gem_userptr.c:169:13: error: excess elements in struct initializer [-Werror]
  .release = &i915_mirror_release,
             ^
drivers/gpu/drm/i915/i915_gem_userptr.c:169:13: note: (near initialization for ‘i915_mirror_ops’)
drivers/gpu/drm/i915/i915_gem_userptr.c: In function ‘i915_mirror_find’:
drivers/gpu/drm/i915/i915_gem_userptr.c:230:9: error: implicit declaration of function ‘hmm_mirror_register’; did you mean ‘rc_map_register’? [-Werror=implicit-function-declaration]
   err = hmm_mirror_register(&mirror->mirror, mm->mm);
         ^~~~~~~~~~~~~~~~~~~
         rc_map_register
drivers/gpu/drm/i915/i915_gem_userptr.c: In function ‘i915_mirror_free’:
drivers/gpu/drm/i915/i915_gem_userptr.c:290:2: error: implicit declaration of function ‘hmm_mirror_unregister’; did you mean ‘rc_map_unregister’? [-Werror=implicit-function-declaration]
  hmm_mirror_unregister(&mirror->mirror);
  ^~~~~~~~~~~~~~~~~~~~~
  rc_map_unregister
drivers/gpu/drm/i915/i915_gem_userptr.c: At top level:
drivers/gpu/drm/i915/i915_gem_userptr.c:467:17: error: ‘struct hmm_range’ declared inside parameter list will not be visible outside of this definition or declaration [-Werror]
          struct hmm_range *range)
                 ^~~~~~~~~
drivers/gpu/drm/i915/i915_gem_userptr.c: In function ‘__i915_gem_userptr_set_active’:
drivers/gpu/drm/i915/i915_gem_userptr.c:490:8: error: implicit declaration of function ‘hmm_vma_range_done’; did you mean ‘drm_vma_node_size’? [-Werror=implicit-function-declaration]
   if (!hmm_vma_range_done(range))
        ^~~~~~~~~~~~~~~~~~
        drm_vma_node_size
drivers/gpu/drm/i915/i915_gem_userptr.c: In function ‘i915_gem_userptr_map’:
drivers/gpu/drm/i915/i915_gem_userptr.c:506:41: error: ‘HMM_PFN_FLAG_MAX’ undeclared (first use in this function); did you mean ‘FS_XFLAG_DAX’?
  static const uint64_t i915_range_flags[HMM_PFN_FLAG_MAX] = {
                                         ^~~~~~~~~~~~~~~~
                                         FS_XFLAG_DAX
drivers/gpu/drm/i915/i915_gem_userptr.c:506:41: note: each undeclared identifier is reported only once for each function it appears in
drivers/gpu/drm/i915/i915_gem_userptr.c:511:42: error: ‘HMM_PFN_VALUE_MAX’ undeclared (first use in this function); did you mean ‘HMM_PFN_FLAG_MAX’?
  static const uint64_t i915_range_values[HMM_PFN_VALUE_MAX] = {
                                          ^~~~~~~~~~~~~~~~~
                                          HMM_PFN_FLAG_MAX
drivers/gpu/drm/i915/i915_gem_userptr.c:520:19: error: storage size of ‘range’ isn’t known
  struct hmm_range range;
                   ^~~~~
drivers/gpu/drm/i915/i915_gem_userptr.c:545:31: error: ‘HMM_PFN_VALID’ undeclared (first use in this function); did you mean ‘HMM_PFN_VALUE_MAX’?
   range.pfns[i] = range.flags[HMM_PFN_VALID];
                               ^~~~~~~~~~~~~
                               HMM_PFN_VALUE_MAX
drivers/gpu/drm/i915/i915_gem_userptr.c:546:40: error: ‘HMM_PFN_WRITE’ undeclared (first use in this function); did you mean ‘DMA_PTE_WRITE’?
   range.pfns[i] |= write ? range.flags[HMM_PFN_WRITE] : 0;
                                        ^~~~~~~~~~~~~
                                        DMA_PTE_WRITE
drivers/gpu/drm/i915/i915_gem_userptr.c:549:8: error: implicit declaration of function ‘hmm_vma_fault’; did you mean ‘hmm_mm_init’? [-Werror=implicit-function-declaration]
  err = hmm_vma_fault(&range, true);
        ^~~~~~~~~~~~~
        hmm_mm_init
drivers/gpu/drm/i915/i915_gem_userptr.c:556:23: error: implicit declaration of function ‘hmm_pfn_to_page’; did you mean ‘__pfn_to_page’? [-Werror=implicit-function-declaration]
   struct page *page = hmm_pfn_to_page(&range, range.pfns[i]);
                       ^~~~~~~~~~~~~~~
                       __pfn_to_page
drivers/gpu/drm/i915/i915_gem_userptr.c:520:19: error: unused variable ‘range’ [-Werror=unused-variable]
  struct hmm_range range;
                   ^~~~~
drivers/gpu/drm/i915/i915_gem_userptr.c:511:24: error: unused variable ‘i915_range_values’ [-Werror=unused-variable]
  static const uint64_t i915_range_values[HMM_PFN_VALUE_MAX] = {
                        ^~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/i915_gem_userptr.c:506:24: error: unused variable ‘i915_range_flags’ [-Werror=unused-variable]
  static const uint64_t i915_range_flags[HMM_PFN_FLAG_MAX] = {
                        ^~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/i915_gem_userptr.c: At top level:
drivers/gpu/drm/i915/i915_gem_userptr.c:167:36: error: storage size of ‘i915_mirror_ops’ isn’t known
 static const struct hmm_mirror_ops i915_mirror_ops = {
                                    ^~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
scripts/Makefile.build:305: recipe for target 'drivers/gpu/drm/i915/i915_gem_userptr.o' failed
make[4]: *** [drivers/gpu/drm/i915/i915_gem_userptr.o] Error 1
scripts/Makefile.build:546: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:546: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:546: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1060: recipe for target 'drivers' failed
make: *** [drivers] Error 2
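
Every failure above has the same root cause: the tree being compiled
does not carry the HMM mirror infrastructure the patches build on, so
every hmm_* type is incomplete and every hmm_* function is undeclared.
The API surface the driver expects from <linux/hmm.h> can be pieced
together from the log alone; the sketch below is that reconstruction,
with return types and struct contents that do not appear in the log
marked as assumptions:

/* Reconstructed from the errors above -- not copied from any header. */
struct hmm_mirror;       /* embedded in a driver struct (see container_of) */
struct hmm_update;       /* carries at least an 'end' address */
struct hmm_range;        /* carries at least 'pfns', 'flags', 'values' */

struct hmm_mirror_ops {
        /* invoked when CPU page tables change; return type is an assumption */
        int (*sync_cpu_device_pagetables)(struct hmm_mirror *mirror,
                                          const struct hmm_update *update);
        /* invoked when the mirrored mm is torn down */
        void (*release)(struct hmm_mirror *mirror);
};

int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm);
void hmm_mirror_unregister(struct hmm_mirror *mirror);
int hmm_vma_fault(struct hmm_range *range, bool block);
bool hmm_vma_range_done(struct hmm_range *range);
struct page *hmm_pfn_to_page(const struct hmm_range *range, uint64_t pfn);

Without those HMM patches applied the series cannot build, which is
exactly the cascade of incomplete types and implicit declarations the
log shows.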


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] gpu/i915: use HMM mirror for userptr buffer object.
  2018-09-10  0:57   ` jglisse
@ 2018-11-22 13:59     ` Joonas Lahtinen
  -1 siblings, 0 replies; 11+ messages in thread
From: Joonas Lahtinen @ 2018-11-22 13:59 UTC (permalink / raw)
  To: jglisse, linux-kernel
  Cc: Jérôme Glisse, dri-devel, David Airlie, Daniel Vetter,
	Chris Wilson, Lionel Landwerlin, Jani Nikula, Rodrigo Vivi,
	intel-gfx

Hi Jerome,

Bit late reply, but here goes :)

We're working quite hard to avoid pinning any pages unless they're in
the GPU page tables. And when they are in the GPU page tables, they
must be pinned for the whole of that duration, because our GPUs cannot
take a fault. And to avoid thrashing the GPU page tables, we leave
objects in them with the expectation that smart userspace recycles
buffers.

From what I understand of your proposal, it wouldn't really make a
difference for us in the amount of pinned pages (which, I agree, we'd
love to see going down). Because we're unable to take a fault, the
first use effectively forces us to pin the pages and keep them pinned
to avoid thrashing the GPU page tables.

So from the i915 perspective, this mostly seems to be an exchange of
one API for another for getting the pages. You already mentioned that
the fast path is being worked on, which is an obvious difference. But
is there some other improvement one should expect, beyond the page
pinning?

Also, is the requirement for a single non-file-backed VMA planned to
be eliminated, or is that an inherent restriction of the HMM_MIRROR
feature? We're currently not imposing such a limitation.
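
To make "cannot take a fault" concrete, an invalidation of a mirrored
range comes down to roughly the sketch below. All of the helper names
are made up for illustration; this is not the actual i915 code:

/* Hypothetical names throughout -- a sketch, not i915 code. */
static void userptr_invalidate_range(struct userptr_obj *obj,
                                     unsigned long start,
                                     unsigned long end)
{
        if (!userptr_obj_overlaps(obj, start, end))
                return;

        /*
         * No faulting hardware: the invalidation cannot be serviced
         * lazily, so any GPU work using the object must finish first.
         */
        userptr_wait_gpu_idle(obj);

        /* Drop the GPU page table entries and the page pins. */
        userptr_unbind(obj);
        userptr_release_pages(obj);

        /*
         * The next execbuf has to re-pin and rebind the whole range,
         * which is the thrashing we avoid by leaving objects bound.
         */
}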

Regards, Joonas

Quoting jglisse@redhat.com (2018-09-10 03:57:36)
> From: Jérôme Glisse <jglisse@redhat.com>
> 
> This replaces the existing code that relies on get_user_pages() aka
> GUP with code that uses an HMM mirror to mirror a range of virtual
> addresses as a buffer object accessible by the GPU. There are no
> functional changes from the userspace point of view.
>
> From the kernel point of view we no longer pin pages for userptr
> buffer objects, which is a welcome change (I am assuming everyone
> dislikes page pinning as much as I do).
>
> Another change, from the kernel point of view, is that there is no
> longer a fast path using get_user_pages_fast(); this can eventually
> be added back through HMM.
> 
> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: intel-gfx@lists.freedesktop.org
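
For reference, the fault-and-snapshot pattern the quoted description
refers to can be pieced together from the symbols in the CI build log
earlier in the thread. The sketch below is that reconstruction, not
the patch itself; the variable names, the sg-table step and the
locking comment are guesses:

/* A sketch reconstructed from the build log -- not the actual patch.
 * The caller is assumed to hold mmap_sem with the range's VMA set up. */
static int sketch_userptr_map(unsigned long start, unsigned long npages,
                              bool write, uint64_t *pfns)
{
        struct hmm_range range;
        unsigned long i;
        int err;

        range.start = start;
        range.end = start + (npages << PAGE_SHIFT);
        range.pfns = pfns;
        range.flags = i915_range_flags;    /* table sized HMM_PFN_FLAG_MAX */
        range.values = i915_range_values;  /* table sized HMM_PFN_VALUE_MAX */

        /* Request that each page be present (and writable if needed). */
        for (i = 0; i < npages; i++) {
                range.pfns[i] = range.flags[HMM_PFN_VALID];
                range.pfns[i] |= write ? range.flags[HMM_PFN_WRITE] : 0;
        }

        /* Fault the whole range in, blocking until resolved. */
        err = hmm_vma_fault(&range, true);
        if (err)
                return err;

        /* Turn the pfn snapshot into struct pages for the object. */
        for (i = 0; i < npages; i++) {
                struct page *page = hmm_pfn_to_page(&range, range.pfns[i]);

                /* ... build the buffer object's sg table from 'page' ... */
        }

        /*
         * Unlike GUP, nothing is pinned: if the CPU page tables changed
         * underneath us, the snapshot is stale and must be retried.
         */
        if (!hmm_vma_range_done(&range))
                return -EAGAIN;

        return 0;
}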

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] gpu/i915: use HMM mirror for userptr buffer object.
  2018-11-22 13:59     ` Joonas Lahtinen
@ 2018-11-22 14:31       ` Daniel Vetter
  -1 siblings, 0 replies; 11+ messages in thread
From: Daniel Vetter @ 2018-11-22 14:31 UTC (permalink / raw)
  To: Joonas Lahtinen
  Cc: jglisse, linux-kernel, dri-devel, David Airlie, Daniel Vetter,
	Chris Wilson, Lionel Landwerlin, Jani Nikula, Rodrigo Vivi,
	intel-gfx

On Thu, Nov 22, 2018 at 03:59:50PM +0200, Joonas Lahtinen wrote:
> Hi Jerome,
> 
> Bit late reply, but here goes :)
> 
> We're working quite hard to avoid pinning any pages unless they're in
> the GPU page tables. And when they are in the GPU page tables, they
> must be pinned for the whole of that duration, because our GPUs cannot
> take a fault. And to avoid thrashing the GPU page tables, we leave
> objects in them with the expectation that smart userspace recycles
> buffers.
>
> From what I understand of your proposal, it wouldn't really make a
> difference for us in the amount of pinned pages (which, I agree, we'd
> love to see going down). Because we're unable to take a fault, the
> first use effectively forces us to pin the pages and keep them pinned
> to avoid thrashing the GPU page tables.
>
> So from the i915 perspective, this mostly seems to be an exchange of
> one API for another for getting the pages. You already mentioned that
> the fast path is being worked on, which is an obvious difference. But
> is there some other improvement one should expect, beyond the page
> pinning?
>
> Also, is the requirement for a single non-file-backed VMA planned to
> be eliminated, or is that an inherent restriction of the HMM_MIRROR
> feature? We're currently not imposing such a limitation.

I think a clear plus for HMM would be if this helps us fix the deadlocks
and races we're seeing. But I have no idea whether this gets us any closer
here or not.
-Daniel

> 
> Regards, Joonas
> 
> Quoting jglisse@redhat.com (2018-09-10 03:57:36)
> > From: Jérôme Glisse <jglisse@redhat.com>
> > 
> > This replaces the existing code that relies on get_user_pages() aka
> > GUP with code that uses an HMM mirror to mirror a range of virtual
> > addresses as a buffer object accessible by the GPU. There are no
> > functional changes from the userspace point of view.
> >
> > From the kernel point of view we no longer pin pages for userptr
> > buffer objects, which is a welcome change (I am assuming everyone
> > dislikes page pinning as much as I do).
> >
> > Another change, from the kernel point of view, is that there is no
> > longer a fast path using get_user_pages_fast(); this can eventually
> > be added back through HMM.
> > 
> > Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> > Cc: Jani Nikula <jani.nikula@linux.intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Cc: intel-gfx@lists.freedesktop.org

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 11+ messages in thread

Thread overview: 11+ messages
2018-09-10  0:57 [PATCH 0/2] [i915] Getting rid of GUP and use HMM for user ptr features jglisse
2018-09-10  0:57 ` jglisse
2018-09-10  0:57 ` [PATCH 1/2] gpu/i915: use HMM mirror instead of mmu_notifier jglisse
2018-09-10  0:57   ` jglisse
2018-09-10  0:57 ` [PATCH 2/2] gpu/i915: use HMM mirror for userptr buffer object jglisse
2018-09-10  0:57   ` jglisse
2018-11-22 13:59   ` Joonas Lahtinen
2018-11-22 13:59     ` Joonas Lahtinen
2018-11-22 14:31     ` Daniel Vetter
2018-11-22 14:31       ` Daniel Vetter
2018-09-10  9:27 ` ✗ Fi.CI.BAT: failure for Getting rid of GUP and use HMM for user ptr features Patchwork
