* [PATCH v5 0/6] drm/i915: Asynchronous vma unbinding
From: Thomas Hellström @ 2022-01-04 12:51 UTC
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld


This patch series introduces infrastructure for asynchronous vma
unbinding. The only use-case enabled initially is buffer object
migration, where we otherwise wait synchronously when unbinding vmas
before migration. In theory this allows us to pipeline any number of
migrations, but in practice the number is limited by a synchronous wait
when filling the migration context ring. We might want to look at that
going forward if needed.

The other main use-case is to be able to pipeline vma evictions, for
example with softpinning where a new vma wants to reuse the vm range
of an already active vma. We can't support this just yet because we
need dma_resv locking around vma eviction for that, which is under
implementation.

Patch 1 introduces vma resources, initially for error capture purposes.
Patch 2 changes the vm backend interface to take vma resources rather than vmas.
Patch 3 removes an unneeded page pinning.
Patch 4 introduces the async unbinding itself.
Patch 5 introduces a selftest.
Patch 6 removes the vma snapshots, whose functionality is now duplicated by the vma resources.

v2:
-- Some kernel test robot reports addressed.
-- Use a kmem cache for vma resources. See patch 4.
-- Various fixes all over the place. See separate commit messages.
v3:
-- Re-add a missing i915_vma_resource_put()
-- Remove a stray debug printout
v4:
-- Patch series split in two. This is the second part.
-- Take cache coloring into account when searching for vma_resources
   pending unbind. (Matthew Auld)
v5:
-- Add a selftest.
-- Remove page pinning during sync binding.
-- A couple of fixes in i915_vma_resource_bind_dep_await()

Thomas Hellström (6):
  drm/i915: Initial introduction of vma resources
  drm/i915: Use the vma resource as argument for gtt binding / unbinding
  drm/i915: Don't pin the object pages during pending vma binds
  drm/i915: Use vma resources for async unbinding
  drm/i915: Asynchronous migration selftest
  drm/i915: Use struct vma_resource instead of struct vma_snapshot

 drivers/gpu/drm/i915/Makefile                 |   2 +-
 drivers/gpu/drm/i915/display/intel_dpt.c      |  27 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  17 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  12 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   3 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  27 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  |  11 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  37 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c | 192 +++++++-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          |  19 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          |  37 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   9 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |  72 +--
 drivers/gpu/drm/i915/gt/intel_gtt.c           |   4 +
 drivers/gpu/drm/i915/gt/intel_gtt.h           |  19 +-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  22 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      |  13 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h      |   2 +-
 drivers/gpu/drm/i915/i915_debugfs.c           |   3 +-
 drivers/gpu/drm/i915/i915_drv.h               |   1 +
 drivers/gpu/drm/i915/i915_gem.c               |  12 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  87 ++--
 drivers/gpu/drm/i915/i915_module.c            |   3 +
 drivers/gpu/drm/i915/i915_request.c           |  12 +-
 drivers/gpu/drm/i915/i915_request.h           |   6 +-
 drivers/gpu/drm/i915/i915_vma.c               | 240 +++++++++-
 drivers/gpu/drm/i915/i915_vma.h               |  33 +-
 drivers/gpu/drm/i915/i915_vma_resource.c      | 417 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_vma_resource.h      | 235 ++++++++++
 drivers/gpu/drm/i915/i915_vma_snapshot.c      | 134 ------
 drivers/gpu/drm/i915/i915_vma_snapshot.h      | 112 -----
 drivers/gpu/drm/i915/i915_vma_types.h         |   5 +
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 159 ++++---
 drivers/gpu/drm/i915/selftests/mock_gtt.c     |  12 +-
 34 files changed, 1415 insertions(+), 581 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_vma_resource.c
 create mode 100644 drivers/gpu/drm/i915/i915_vma_resource.h
 delete mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.c
 delete mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.h

-- 
2.31.1



* [PATCH v5 1/6] drm/i915: Initial introduction of vma resources
From: Thomas Hellström @ 2022-01-04 12:51 UTC
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

Introduce vma resources, somewhat similar to TTM resources, needed for
asynchronous bind management. Initially we will use them to hold back
completion of unbinding while we capture data from a vma, but they will
be used extensively in upcoming patches for asynchronous vma unbinding.
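
For illustration only (not part of the patch itself), here is a minimal
sketch of how a capture path is expected to use the new interface;
capture_vma_pages_sketch() is a made-up helper, and the vma is assumed
to be currently bound:

static void capture_vma_pages_sketch(struct i915_vma *vma)
{
        struct i915_vma_resource *vma_res;
        bool lockdep_cookie;

        /* Take a reference on the resource backing the current binding. */
        vma_res = i915_vma_get_current_resource(vma);

        /*
         * Hold back signaling of the unbind fence while the backing
         * store is read out. If the hold fails, the binding is already
         * being torn down and there is nothing to capture.
         */
        if (i915_vma_resource_hold(vma_res, &lockdep_cookie)) {
                /* ... snapshot the vma's backing pages here ... */
                i915_vma_resource_unhold(vma_res, lockdep_cookie);
        }

        i915_vma_resource_put(vma_res);
}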

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
 drivers/gpu/drm/i915/i915_vma.c               |  55 +++++++-
 drivers/gpu/drm/i915/i915_vma.h               |  19 ++-
 drivers/gpu/drm/i915/i915_vma_resource.c      | 124 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_vma_resource.h      |  70 ++++++++++
 drivers/gpu/drm/i915/i915_vma_snapshot.c      |  15 +--
 drivers/gpu/drm/i915/i915_vma_snapshot.h      |   7 +-
 drivers/gpu/drm/i915/i915_vma_types.h         |   5 +
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  99 ++++++++------
 10 files changed, 334 insertions(+), 63 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_vma_resource.c
 create mode 100644 drivers/gpu/drm/i915/i915_vma_resource.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 1b62b9f65196..98433ad74194 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -174,6 +174,7 @@ i915-y += \
 	  i915_trace_points.o \
 	  i915_ttm_buddy_manager.o \
 	  i915_vma.o \
+	  i915_vma_resource.o \
 	  i915_vma_snapshot.o \
 	  intel_wopcm.o
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index e9541244027a..72e497745c12 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1422,7 +1422,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 			mutex_lock(&vma->vm->mutex);
 			err = i915_vma_bind(target->vma,
 					    target->vma->obj->cache_level,
-					    PIN_GLOBAL, NULL);
+					    PIN_GLOBAL, NULL, NULL);
 			mutex_unlock(&vma->vm->mutex);
 			reloc_cache_remap(&eb->reloc_cache, ev->vma->obj);
 			if (err)
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index be208a8f1ed0..7097c5016431 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -37,6 +37,7 @@
 #include "i915_sw_fence_work.h"
 #include "i915_trace.h"
 #include "i915_vma.h"
+#include "i915_vma_resource.h"
 
 static struct kmem_cache *slab_vmas;
 
@@ -380,6 +381,8 @@ static int i915_vma_verify_bind_complete(struct i915_vma *vma)
  * @cache_level: mapping cache level
  * @flags: flags like global or local mapping
  * @work: preallocated worker for allocating and binding the PTE
+ * @vma_res: pointer to a preallocated vma resource. The resource is either
+ * consumed or freed.
  *
  * DMA addresses are taken from the scatter-gather table of this object (or of
  * this VMA in case of non-default GGTT views) and PTE entries set up.
@@ -388,7 +391,8 @@ static int i915_vma_verify_bind_complete(struct i915_vma *vma)
 int i915_vma_bind(struct i915_vma *vma,
 		  enum i915_cache_level cache_level,
 		  u32 flags,
-		  struct i915_vma_work *work)
+		  struct i915_vma_work *work,
+		  struct i915_vma_resource *vma_res)
 {
 	u32 bind_flags;
 	u32 vma_flags;
@@ -399,11 +403,15 @@ int i915_vma_bind(struct i915_vma *vma,
 
 	if (GEM_DEBUG_WARN_ON(range_overflows(vma->node.start,
 					      vma->node.size,
-					      vma->vm->total)))
+					      vma->vm->total))) {
+		kfree(vma_res);
 		return -ENODEV;
+	}
 
-	if (GEM_DEBUG_WARN_ON(!flags))
+	if (GEM_DEBUG_WARN_ON(!flags)) {
+		kfree(vma_res);
 		return -EINVAL;
+	}
 
 	bind_flags = flags;
 	bind_flags &= I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
@@ -412,11 +420,21 @@ int i915_vma_bind(struct i915_vma *vma,
 	vma_flags &= I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
 
 	bind_flags &= ~vma_flags;
-	if (bind_flags == 0)
+	if (bind_flags == 0) {
+		kfree(vma_res);
 		return 0;
+	}
 
 	GEM_BUG_ON(!atomic_read(&vma->pages_count));
 
+	if (vma->resource || !vma_res) {
+		/* Rebinding with an additional I915_VMA_*_BIND */
+		GEM_WARN_ON(!vma_flags);
+		kfree(vma_res);
+	} else {
+		i915_vma_resource_init(vma_res);
+		vma->resource = vma_res;
+	}
 	trace_i915_vma_bind(vma, bind_flags);
 	if (work && bind_flags & vma->vm->bind_async_flags) {
 		struct dma_fence *prev;
@@ -1279,6 +1297,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 {
 	struct i915_vma_work *work = NULL;
 	struct dma_fence *moving = NULL;
+	struct i915_vma_resource *vma_res = NULL;
 	intel_wakeref_t wakeref = 0;
 	unsigned int bound;
 	int err;
@@ -1333,6 +1352,12 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 		}
 	}
 
+	vma_res = i915_vma_resource_alloc();
+	if (IS_ERR(vma_res)) {
+		err = PTR_ERR(vma_res);
+		goto err_fence;
+	}
+
 	/*
 	 * Differentiate between user/kernel vma inside the aliasing-ppgtt.
 	 *
@@ -1353,7 +1378,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 	err = mutex_lock_interruptible_nested(&vma->vm->mutex,
 					      !(flags & PIN_GLOBAL));
 	if (err)
-		goto err_fence;
+		goto err_vma_res;
 
 	/* No more allocations allowed now we hold vm->mutex */
 
@@ -1394,7 +1419,8 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 	GEM_BUG_ON(!vma->pages);
 	err = i915_vma_bind(vma,
 			    vma->obj->cache_level,
-			    flags, work);
+			    flags, work, vma_res);
+	vma_res = NULL;
 	if (err)
 		goto err_remove;
 
@@ -1417,6 +1443,8 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 	i915_active_release(&vma->active);
 err_unlock:
 	mutex_unlock(&vma->vm->mutex);
+err_vma_res:
+	kfree(vma_res);
 err_fence:
 	if (work)
 		dma_fence_work_commit_imm(&work->base);
@@ -1567,6 +1595,7 @@ void i915_vma_release(struct kref *ref)
 	i915_vm_put(vma->vm);
 
 	i915_active_fini(&vma->active);
+	GEM_WARN_ON(vma->resource);
 	i915_vma_free(vma);
 }
 
@@ -1715,6 +1744,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
 
 void __i915_vma_evict(struct i915_vma *vma)
 {
+	struct dma_fence *unbind_fence;
+
 	GEM_BUG_ON(i915_vma_is_pinned(vma));
 
 	if (i915_vma_is_map_and_fenceable(vma)) {
@@ -1752,8 +1783,20 @@ void __i915_vma_evict(struct i915_vma *vma)
 	atomic_and(~(I915_VMA_BIND_MASK | I915_VMA_ERROR | I915_VMA_GGTT_WRITE),
 		   &vma->flags);
 
+	unbind_fence = i915_vma_resource_unbind(vma->resource);
+	i915_vma_resource_put(vma->resource);
+	vma->resource = NULL;
+
 	i915_vma_detach(vma);
 	vma_unbind_pages(vma);
+
+	/*
+	 * This uninterruptible wait under the vm mutex is currently
+	 * only ever blocking while the vma is being captured from.
+	 * With async unbinding, this wait here will be removed.
+	 */
+	dma_fence_wait(unbind_fence, false);
+	dma_fence_put(unbind_fence);
 }
 
 int __i915_vma_unbind(struct i915_vma *vma)
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 32719431b3df..de0f3e44cdfa 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -37,6 +37,7 @@
 
 #include "i915_active.h"
 #include "i915_request.h"
+#include "i915_vma_resource.h"
 #include "i915_vma_types.h"
 
 struct i915_vma *
@@ -204,7 +205,8 @@ struct i915_vma_work *i915_vma_work(void);
 int i915_vma_bind(struct i915_vma *vma,
 		  enum i915_cache_level cache_level,
 		  u32 flags,
-		  struct i915_vma_work *work);
+		  struct i915_vma_work *work,
+		  struct i915_vma_resource *vma_res);
 
 bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long color);
 bool i915_vma_misplaced(const struct i915_vma *vma,
@@ -428,6 +430,21 @@ static inline int i915_vma_sync(struct i915_vma *vma)
 	return i915_active_wait(&vma->active);
 }
 
+/**
+ * i915_vma_get_current_resource - Get the current resource of the vma
+ * @vma: The vma to get the current resource from.
+ *
+ * It's illegal to call this function if the vma is not bound.
+ *
+ * Return: A refcounted pointer to the current vma resource
+ * of the vma, assuming the vma is bound.
+ */
+static inline struct i915_vma_resource *
+i915_vma_get_current_resource(struct i915_vma *vma)
+{
+	return i915_vma_resource_get(vma->resource);
+}
+
 void i915_vma_module_exit(void);
 int i915_vma_module_init(void);
 
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.c b/drivers/gpu/drm/i915/i915_vma_resource.c
new file mode 100644
index 000000000000..833e987bed2a
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_vma_resource.c
@@ -0,0 +1,124 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+#include <linux/slab.h>
+
+#include "i915_vma_resource.h"
+
+/* Callbacks for the unbind dma-fence. */
+static const char *get_driver_name(struct dma_fence *fence)
+{
+	return "vma unbind fence";
+}
+
+static const char *get_timeline_name(struct dma_fence *fence)
+{
+	return "unbound";
+}
+
+static struct dma_fence_ops unbind_fence_ops = {
+	.get_driver_name = get_driver_name,
+	.get_timeline_name = get_timeline_name,
+};
+
+/**
+ * i915_vma_resource_init - Initialize a vma resource.
+ * @vma_res: The vma resource to initialize
+ *
+ * Initializes a vma resource allocated using i915_vma_resource_alloc().
+ * The reason for having separate allocation and initialization functions is that
+ * initialization may need to be performed from under a lock where
+ * allocation is not allowed.
+ */
+void i915_vma_resource_init(struct i915_vma_resource *vma_res)
+{
+	spin_lock_init(&vma_res->lock);
+	dma_fence_init(&vma_res->unbind_fence, &unbind_fence_ops,
+		       &vma_res->lock, 0, 0);
+	refcount_set(&vma_res->hold_count, 1);
+}
+
+/**
+ * i915_vma_resource_alloc - Allocate a vma resource
+ *
+ * Return: A pointer to a cleared struct i915_vma_resource or
+ * a -ENOMEM error pointer if allocation fails.
+ */
+struct i915_vma_resource *i915_vma_resource_alloc(void)
+{
+	struct i915_vma_resource *vma_res =
+		kzalloc(sizeof(*vma_res), GFP_KERNEL);
+
+	return vma_res ? vma_res : ERR_PTR(-ENOMEM);
+}
+
+static void __i915_vma_resource_unhold(struct i915_vma_resource *vma_res)
+{
+	if (refcount_dec_and_test(&vma_res->hold_count))
+		dma_fence_signal(&vma_res->unbind_fence);
+}
+
+/**
+ * i915_vma_resource_unhold - Unhold the signaling of the vma resource unbind
+ * fence.
+ * @vma_res: The vma resource.
+ * @lockdep_cookie: The lockdep cookie returned from i915_vma_resource_hold.
+ *
+ * The function may leave a dma_fence critical section.
+ */
+void i915_vma_resource_unhold(struct i915_vma_resource *vma_res,
+			      bool lockdep_cookie)
+{
+	dma_fence_end_signalling(lockdep_cookie);
+
+	if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
+		unsigned long irq_flags;
+
+		/* Inefficient open-coded might_lock_irqsave() */
+		spin_lock_irqsave(&vma_res->lock, irq_flags);
+		spin_unlock_irqrestore(&vma_res->lock, irq_flags);
+	}
+
+	__i915_vma_resource_unhold(vma_res);
+}
+
+/**
+ * i915_vma_resource_hold - Hold the signaling of the vma resource unbind fence.
+ * @vma_res: The vma resource.
+ * @lockdep_cookie: Pointer to a bool serving as a lockdep cookie that should
+ * be given as an argument to the pairing i915_vma_resource_unhold.
+ *
+ * If returning true, the function enters a dma_fence signalling critical
+ * section if not already in one.
+ *
+ * Return: true if holding successful, false if not.
+ */
+bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
+			    bool *lockdep_cookie)
+{
+	bool held = refcount_inc_not_zero(&vma_res->hold_count);
+
+	if (held)
+		*lockdep_cookie = dma_fence_begin_signalling();
+
+	return held;
+}
+
+/**
+ * i915_vma_resource_unbind - Unbind a vma resource
+ * @vma_res: The vma resource to unbind.
+ *
+ * At this point this function does little more than publish a fence that
+ * signals immediately unless signaling is held back.
+ *
+ * Return: A refcounted pointer to a dma-fence that signals when unbinding is
+ * complete.
+ */
+struct dma_fence *
+i915_vma_resource_unbind(struct i915_vma_resource *vma_res)
+{
+	__i915_vma_resource_unhold(vma_res);
+	dma_fence_get(&vma_res->unbind_fence);
+	return &vma_res->unbind_fence;
+}
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.h b/drivers/gpu/drm/i915/i915_vma_resource.h
new file mode 100644
index 000000000000..34744da23072
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_vma_resource.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#ifndef __I915_VMA_RESOURCE_H__
+#define __I915_VMA_RESOURCE_H__
+
+#include <linux/dma-fence.h>
+#include <linux/refcount.h>
+
+/**
+ * struct i915_vma_resource - Snapshotted unbind information.
+ * @unbind_fence: Fence to mark unbinding complete. Note that this fence
+ * is not considered published until unbind is scheduled, and as such it
+ * is illegal to access this fence before scheduled unbind other than
+ * for refcounting.
+ * @lock: The @unbind_fence lock. We're also using it to protect the weak
+ * pointer to the struct i915_vma, @vma during lookup and takedown.
+ * @hold_count: Number of holders blocking the fence from finishing.
+ * The vma itself is keeping a hold, which is released when unbind
+ * is scheduled.
+ *
+ * The lifetime of a struct i915_vma_resource is from a binding request until
+ * the actual, possibly asynchronous, unbind has completed.
+ */
+struct i915_vma_resource {
+	struct dma_fence unbind_fence;
+	/* See above for description of the lock. */
+	spinlock_t lock;
+	refcount_t hold_count;
+};
+
+bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
+			    bool *lockdep_cookie);
+
+void i915_vma_resource_unhold(struct i915_vma_resource *vma_res,
+			      bool lockdep_cookie);
+
+struct i915_vma_resource *i915_vma_resource_alloc(void);
+
+struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res);
+
+/**
+ * i915_vma_resource_get - Take a reference on a vma resource
+ * @vma_res: The vma resource on which to take a reference.
+ *
+ * Return: The @vma_res pointer
+ */
+static inline struct i915_vma_resource
+*i915_vma_resource_get(struct i915_vma_resource *vma_res)
+{
+	dma_fence_get(&vma_res->unbind_fence);
+	return vma_res;
+}
+
+/**
+ * i915_vma_resource_put - Release a reference to a struct i915_vma_resource
+ * @vma_res: The resource
+ */
+static inline void i915_vma_resource_put(struct i915_vma_resource *vma_res)
+{
+	dma_fence_put(&vma_res->unbind_fence);
+}
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+void i915_vma_resource_init(struct i915_vma_resource *vma_res);
+#endif
+
+#endif
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
index 2949ceea9884..f7333c7a2f5e 100644
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.c
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.c
@@ -3,6 +3,7 @@
  * Copyright © 2021 Intel Corporation
  */
 
+#include "i915_vma_resource.h"
 #include "i915_vma_snapshot.h"
 #include "i915_vma_types.h"
 #include "i915_vma.h"
@@ -35,7 +36,7 @@ void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
 		vsnap->pages_rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
 	vsnap->mr = vma->obj->mm.region;
 	kref_init(&vsnap->kref);
-	vsnap->vma_resource = &vma->active;
+	vsnap->vma_resource = i915_vma_get_current_resource(vma);
 	vsnap->onstack = false;
 	vsnap->present = true;
 }
@@ -62,6 +63,7 @@ static void vma_snapshot_release(struct kref *ref)
 		container_of(ref, typeof(*vsnap), kref);
 
 	vsnap->present = false;
+	i915_vma_resource_put(vsnap->vma_resource);
 	if (vsnap->pages_rsgt)
 		i915_refct_sgt_put(vsnap->pages_rsgt);
 	if (!vsnap->onstack)
@@ -109,12 +111,7 @@ void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap)
 bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
 				    bool *lockdep_cookie)
 {
-	bool pinned = i915_active_acquire_if_busy(vsnap->vma_resource);
-
-	if (pinned)
-		*lockdep_cookie = dma_fence_begin_signalling();
-
-	return pinned;
+	return i915_vma_resource_hold(vsnap->vma_resource, lockdep_cookie);
 }
 
 /**
@@ -128,7 +125,5 @@ bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
 void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
 				      bool lockdep_cookie)
 {
-	dma_fence_end_signalling(lockdep_cookie);
-
-	return i915_active_release(vsnap->vma_resource);
+	i915_vma_resource_unhold(vsnap->vma_resource, lockdep_cookie);
 }
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
index 940581df4622..e74588dd676b 100644
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.h
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.h
@@ -31,10 +31,7 @@ struct sg_table;
  * @pages_rsgt: The refcounted sg_table holding the reference for @pages if any.
  * @mr: The memory region pointed for the pages bound.
  * @kref: Reference for this structure.
- * @vma_resource: FIXME: A means to keep the unbind fence from signaling.
- * Temporarily while we have only sync unbinds, and still use the vma
- * active, we use that. With async unbinding we need a signaling refcount
- * for the unbind fence.
+ * @vma_resource: Pointer to the vma resource representing the vma binding.
  * @page_sizes: The vma GTT page sizes information.
  * @onstack: Whether the structure shouldn't be freed on final put.
  * @present: Whether the structure is present and initialized.
@@ -49,7 +46,7 @@ struct i915_vma_snapshot {
 	struct i915_refct_sgt *pages_rsgt;
 	struct intel_memory_region *mr;
 	struct kref kref;
-	struct i915_active *vma_resource;
+	struct i915_vma_resource *vma_resource;
 	u32 page_sizes;
 	bool onstack:1;
 	bool present:1;
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
index ca575e129ced..ac1928d1dc11 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -95,6 +95,8 @@ enum i915_cache_level;
  *
  */
 
+struct i915_vma_resource;
+
 struct intel_remapped_plane_info {
 	/* in gtt pages */
 	u32 offset:31;
@@ -291,6 +293,9 @@ struct i915_vma {
 	struct list_head evict_link;
 
 	struct list_head closed_link;
+
+	/** The async vma resource. Protected by the vm_mutex */
+	struct i915_vma_resource *resource;
 };
 
 #endif
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 575705c3bce9..54be880e55c3 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -32,6 +32,7 @@
 
 #include "i915_random.h"
 #include "i915_selftest.h"
+#include "i915_vma_resource.h"
 
 #include "mock_drm.h"
 #include "mock_gem_device.h"
@@ -1336,6 +1337,33 @@ static int igt_mock_drunk(void *arg)
 	return exercise_mock(ggtt->vm.i915, drunk_hole);
 }
 
+static int reserve_gtt_with_resource(struct i915_vma *vma, u64 offset)
+{
+	struct i915_address_space *vm = vma->vm;
+	struct i915_vma_resource *vma_res;
+	struct drm_i915_gem_object *obj = vma->obj;
+	int err;
+
+	vma_res = i915_vma_resource_alloc();
+	if (IS_ERR(vma_res))
+		return PTR_ERR(vma_res);
+
+	mutex_lock(&vm->mutex);
+	err = i915_gem_gtt_reserve(vm, &vma->node, obj->base.size,
+				   offset,
+				   obj->cache_level,
+				   0);
+	if (!err) {
+		i915_vma_resource_init(vma_res);
+		vma->resource = vma_res;
+	} else {
+		kfree(vma_res);
+	}
+	mutex_unlock(&vm->mutex);
+
+	return err;
+}
+
 static int igt_gtt_reserve(void *arg)
 {
 	struct i915_ggtt *ggtt = arg;
@@ -1370,20 +1398,13 @@ static int igt_gtt_reserve(void *arg)
 		}
 
 		list_add(&obj->st_link, &objects);
-
 		vma = i915_vma_instance(obj, &ggtt->vm, NULL);
 		if (IS_ERR(vma)) {
 			err = PTR_ERR(vma);
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node,
-					   obj->base.size,
-					   total,
-					   obj->cache_level,
-					   0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = reserve_gtt_with_resource(vma, total);
 		if (err) {
 			pr_err("i915_gem_gtt_reserve (pass 1) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
@@ -1429,13 +1450,7 @@ static int igt_gtt_reserve(void *arg)
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node,
-					   obj->base.size,
-					   total,
-					   obj->cache_level,
-					   0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = reserve_gtt_with_resource(vma, total);
 		if (err) {
 			pr_err("i915_gem_gtt_reserve (pass 2) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
@@ -1476,13 +1491,7 @@ static int igt_gtt_reserve(void *arg)
 					   2 * I915_GTT_PAGE_SIZE,
 					   I915_GTT_MIN_ALIGNMENT);
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_reserve(&ggtt->vm, &vma->node,
-					   obj->base.size,
-					   offset,
-					   obj->cache_level,
-					   0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = reserve_gtt_with_resource(vma, offset);
 		if (err) {
 			pr_err("i915_gem_gtt_reserve (pass 3) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
@@ -1509,6 +1518,31 @@ static int igt_gtt_reserve(void *arg)
 	return err;
 }
 
+static int insert_gtt_with_resource(struct i915_vma *vma)
+{
+	struct i915_address_space *vm = vma->vm;
+	struct i915_vma_resource *vma_res;
+	struct drm_i915_gem_object *obj = vma->obj;
+	int err;
+
+	vma_res = i915_vma_resource_alloc();
+	if (IS_ERR(vma_res))
+		return PTR_ERR(vma_res);
+
+	mutex_lock(&vm->mutex);
+	err = i915_gem_gtt_insert(vm, &vma->node, obj->base.size, 0,
+				  obj->cache_level, 0, vm->total, 0);
+	if (!err) {
+		i915_vma_resource_init(vma_res);
+		vma->resource = vma_res;
+	} else {
+		kfree(vma_res);
+	}
+	mutex_unlock(&vm->mutex);
+
+	return err;
+}
+
 static int igt_gtt_insert(void *arg)
 {
 	struct i915_ggtt *ggtt = arg;
@@ -1593,12 +1627,7 @@ static int igt_gtt_insert(void *arg)
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_insert(&ggtt->vm, &vma->node,
-					  obj->base.size, 0, obj->cache_level,
-					  0, ggtt->vm.total,
-					  0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = insert_gtt_with_resource(vma);
 		if (err == -ENOSPC) {
 			/* maxed out the GGTT space */
 			i915_gem_object_put(obj);
@@ -1653,12 +1682,7 @@ static int igt_gtt_insert(void *arg)
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_insert(&ggtt->vm, &vma->node,
-					  obj->base.size, 0, obj->cache_level,
-					  0, ggtt->vm.total,
-					  0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = insert_gtt_with_resource(vma);
 		if (err) {
 			pr_err("i915_gem_gtt_insert (pass 2) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
@@ -1702,12 +1726,7 @@ static int igt_gtt_insert(void *arg)
 			goto out;
 		}
 
-		mutex_lock(&ggtt->vm.mutex);
-		err = i915_gem_gtt_insert(&ggtt->vm, &vma->node,
-					  obj->base.size, 0, obj->cache_level,
-					  0, ggtt->vm.total,
-					  0);
-		mutex_unlock(&ggtt->vm.mutex);
+		err = insert_gtt_with_resource(vma);
 		if (err) {
 			pr_err("i915_gem_gtt_insert (pass 3) failed at %llu/%llu with err=%d\n",
 			       total, ggtt->vm.total, err);
-- 
2.31.1



* [PATCH v5 2/6] drm/i915: Use the vma resource as argument for gtt binding / unbinding
From: Thomas Hellström @ 2022-01-04 12:51 UTC
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

When introducing asynchronous unbinding, the vma itself may no longer
be alive when the actual binding or unbinding takes place.

Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource
instead of a struct i915_vma for the bind_vma() and unbind_vma() ops.
Similarly change the insert_entries() op for struct i915_address_space.

Replace a couple of i915_vma_snapshot members with their newly introduced
i915_vma_resource counterparts, since they have the same lifetime.

Also make sure to avoid changing the struct i915_vma_flags (in particular
the bind flags) asynchronously. That should now only be done synchronously,
under the vm mutex.

v2:
- Update the vma_res::bound_flags when binding to the aliased ggtt

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
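(Not part of the patch; an illustration for review only.) The shape of the
new interface: a backend bind/unbind pair now reads everything it needs from
the vma resource rather than from the vma, so it no longer matters whether
the vma is still alive when the op actually runs. The sketch below simply
mirrors the ggtt_bind_vma()/ggtt_unbind_vma() hunks further down; the
"example_" names are made up for illustration.

static void example_bind_vma(struct i915_address_space *vm,
			     struct i915_vm_pt_stash *stash,
			     struct i915_vma_resource *vma_res,
			     enum i915_cache_level cache_level,
			     u32 flags)
{
	u32 pte_flags = 0;

	/* All bind information is carried by the vma resource... */
	if (vma_res->bi.readonly)
		pte_flags |= PTE_READ_ONLY;
	if (vma_res->bi.lmem)
		pte_flags |= PTE_LM;

	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;

	/* Bound flags now live on the resource, updated under the vm mutex. */
	vma_res->bound_flags |= flags;
}

static void example_unbind_vma(struct i915_address_space *vm,
			       struct i915_vma_resource *vma_res)
{
	/* ...so unbind never dereferences the (possibly already freed) vma. */
	vm->clear_range(vm, vma_res->start, vma_res->vma_size);
}
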
 drivers/gpu/drm/i915/display/intel_dpt.c      | 27 ++---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 27 +----
 .../gpu/drm/i915/gem/selftests/huge_pages.c   | 37 +++----
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 19 ++--
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 37 +++----
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  4 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          | 70 ++++++-------
 drivers/gpu/drm/i915/gt/intel_gtt.h           | 16 +--
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         | 22 +++--
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      | 13 ++-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h      |  2 +-
 drivers/gpu/drm/i915/i915_debugfs.c           |  3 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  6 +-
 drivers/gpu/drm/i915/i915_vma.c               | 24 ++++-
 drivers/gpu/drm/i915/i915_vma.h               | 11 +--
 drivers/gpu/drm/i915/i915_vma_resource.c      |  9 +-
 drivers/gpu/drm/i915/i915_vma_resource.h      | 99 ++++++++++++++++++-
 drivers/gpu/drm/i915/i915_vma_snapshot.c      |  4 -
 drivers/gpu/drm/i915/i915_vma_snapshot.h      |  8 --
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 64 ++++++++----
 drivers/gpu/drm/i915/selftests/mock_gtt.c     | 12 +--
 21 files changed, 308 insertions(+), 206 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
index 8f674745e7e0..63a83d5f85a1 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -48,7 +48,7 @@ static void dpt_insert_page(struct i915_address_space *vm,
 }
 
 static void dpt_insert_entries(struct i915_address_space *vm,
-			       struct i915_vma *vma,
+			       struct i915_vma_resource *vma_res,
 			       enum i915_cache_level level,
 			       u32 flags)
 {
@@ -64,8 +64,8 @@ static void dpt_insert_entries(struct i915_address_space *vm,
 	 * not to allow the user to override access to a read only page.
 	 */
 
-	i = vma->node.start / I915_GTT_PAGE_SIZE;
-	for_each_sgt_daddr(addr, sgt_iter, vma->pages)
+	i = vma_res->start / I915_GTT_PAGE_SIZE;
+	for_each_sgt_daddr(addr, sgt_iter, vma_res->bi.pages)
 		gen8_set_pte(&base[i++], pte_encode | addr);
 }
 
@@ -76,35 +76,38 @@ static void dpt_clear_range(struct i915_address_space *vm,
 
 static void dpt_bind_vma(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash,
-			 struct i915_vma *vma,
+			 struct i915_vma_resource *vma_res,
 			 enum i915_cache_level cache_level,
 			 u32 flags)
 {
-	struct drm_i915_gem_object *obj = vma->obj;
 	u32 pte_flags;
 
+	if (vma_res->bound_flags)
+		return;
+
 	/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
 	pte_flags = 0;
-	if (vma->vm->has_read_only && i915_gem_object_is_readonly(obj))
+	if (vm->has_read_only && vma_res->bi.readonly)
 		pte_flags |= PTE_READ_ONLY;
-	if (i915_gem_object_is_lmem(obj))
+	if (vma_res->bi.lmem)
 		pte_flags |= PTE_LM;
 
-	vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
+	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
 
-	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
 
 	/*
 	 * Without aliasing PPGTT there's no difference between
 	 * GLOBAL/LOCAL_BIND, it's all the same ptes. Hence unconditionally
 	 * upgrade to both bound if we bind either to avoid double-binding.
 	 */
-	atomic_or(I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND, &vma->flags);
+	vma_res->bound_flags = I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
 }
 
-static void dpt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
+static void dpt_unbind_vma(struct i915_address_space *vm,
+			   struct i915_vma_resource *vma_res)
 {
-	vm->clear_range(vm, vma->node.start, vma->size);
+	vm->clear_range(vm, vma_res->start, vma_res->vma_size);
 }
 
 static void dpt_cleanup(struct i915_address_space *vm)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index f9f7e44099fe..f99d260e0684 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -15,6 +15,7 @@
 
 #include "i915_active.h"
 #include "i915_selftest.h"
+#include "i915_vma_resource.h"
 
 struct drm_i915_gem_object;
 struct intel_fronbuffer;
@@ -549,31 +550,7 @@ struct drm_i915_gem_object {
 		struct sg_table *pages;
 		void *mapping;
 
-		struct i915_page_sizes {
-			/**
-			 * The sg mask of the pages sg_table. i.e the mask of
-			 * of the lengths for each sg entry.
-			 */
-			unsigned int phys;
-
-			/**
-			 * The gtt page sizes we are allowed to use given the
-			 * sg mask and the supported page sizes. This will
-			 * express the smallest unit we can use for the whole
-			 * object, as well as the larger sizes we may be able
-			 * to use opportunistically.
-			 */
-			unsigned int sg;
-
-			/**
-			 * The actual gtt page size usage. Since we can have
-			 * multiple vma associated with this object we need to
-			 * prevent any trampling of state, hence a copy of this
-			 * struct also lives in each vma, therefore the gtt
-			 * value here should only be read/write through the vma.
-			 */
-			unsigned int gtt;
-		} page_sizes;
+		struct i915_page_sizes page_sizes;
 
 		I915_SELFTEST_DECLARE(unsigned int page_mask);
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 11f0aa65f8a3..26f997c376a2 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -370,9 +370,9 @@ static int igt_check_page_sizes(struct i915_vma *vma)
 		err = -EINVAL;
 	}
 
-	if (!HAS_PAGE_SIZES(i915, vma->page_sizes.gtt)) {
+	if (!HAS_PAGE_SIZES(i915, vma->resource->page_sizes_gtt)) {
 		pr_err("unsupported page_sizes.gtt=%u, supported=%u\n",
-		       vma->page_sizes.gtt & ~supported, supported);
+		       vma->resource->page_sizes_gtt & ~supported, supported);
 		err = -EINVAL;
 	}
 
@@ -403,15 +403,9 @@ static int igt_check_page_sizes(struct i915_vma *vma)
 	if (i915_gem_object_is_lmem(obj) &&
 	    IS_ALIGNED(vma->node.start, SZ_2M) &&
 	    vma->page_sizes.sg & SZ_2M &&
-	    vma->page_sizes.gtt < SZ_2M) {
+	    vma->resource->page_sizes_gtt < SZ_2M) {
 		pr_err("gtt pages mismatch for LMEM, expected 2M GTT pages, sg(%u), gtt(%u)\n",
-		       vma->page_sizes.sg, vma->page_sizes.gtt);
-		err = -EINVAL;
-	}
-
-	if (obj->mm.page_sizes.gtt) {
-		pr_err("obj->page_sizes.gtt(%u) should never be set\n",
-		       obj->mm.page_sizes.gtt);
+		       vma->page_sizes.sg, vma->resource->page_sizes_gtt);
 		err = -EINVAL;
 	}
 
@@ -547,9 +541,9 @@ static int igt_mock_memory_region_huge_pages(void *arg)
 				goto out_unpin;
 			}
 
-			if (vma->page_sizes.gtt != page_size) {
+			if (vma->resource->page_sizes_gtt != page_size) {
 				pr_err("%s page_sizes.gtt=%u, expected=%u\n",
-				       __func__, vma->page_sizes.gtt,
+				       __func__, vma->resource->page_sizes_gtt,
 				       page_size);
 				err = -EINVAL;
 				goto out_unpin;
@@ -630,9 +624,9 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
 
 		err = igt_check_page_sizes(vma);
 
-		if (vma->page_sizes.gtt != page_size) {
+		if (vma->resource->page_sizes_gtt != page_size) {
 			pr_err("page_sizes.gtt=%u, expected %u\n",
-			       vma->page_sizes.gtt, page_size);
+			       vma->resource->page_sizes_gtt, page_size);
 			err = -EINVAL;
 		}
 
@@ -657,9 +651,10 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
 
 			err = igt_check_page_sizes(vma);
 
-			if (vma->page_sizes.gtt != I915_GTT_PAGE_SIZE_4K) {
+			if (vma->resource->page_sizes_gtt != I915_GTT_PAGE_SIZE_4K) {
 				pr_err("page_sizes.gtt=%u, expected %llu\n",
-				       vma->page_sizes.gtt, I915_GTT_PAGE_SIZE_4K);
+				       vma->resource->page_sizes_gtt,
+				       I915_GTT_PAGE_SIZE_4K);
 				err = -EINVAL;
 			}
 
@@ -805,9 +800,9 @@ static int igt_mock_ppgtt_huge_fill(void *arg)
 			}
 		}
 
-		if (vma->page_sizes.gtt != expected_gtt) {
+		if (vma->resource->page_sizes_gtt != expected_gtt) {
 			pr_err("gtt=%u, expected=%u, size=%zd, single=%s\n",
-			       vma->page_sizes.gtt, expected_gtt,
+			       vma->resource->page_sizes_gtt, expected_gtt,
 			       obj->base.size, yesno(!!single));
 			err = -EINVAL;
 			break;
@@ -961,10 +956,10 @@ static int igt_mock_ppgtt_64K(void *arg)
 				}
 			}
 
-			if (vma->page_sizes.gtt != expected_gtt) {
+			if (vma->resource->page_sizes_gtt != expected_gtt) {
 				pr_err("gtt=%u, expected=%u, i=%d, single=%s\n",
-				       vma->page_sizes.gtt, expected_gtt, i,
-				       yesno(!!single));
+				       vma->resource->page_sizes_gtt,
+				       expected_gtt, i, yesno(!!single));
 				err = -EINVAL;
 				goto out_vma_unpin;
 			}
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 6e9292918bfc..d657ffd6c86a 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -104,17 +104,17 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 }
 
 static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
-				      struct i915_vma *vma,
+				      struct i915_vma_resource *vma_res,
 				      enum i915_cache_level cache_level,
 				      u32 flags)
 {
 	struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
 	struct i915_page_directory * const pd = ppgtt->pd;
-	unsigned int first_entry = vma->node.start / I915_GTT_PAGE_SIZE;
+	unsigned int first_entry = vma_res->start / I915_GTT_PAGE_SIZE;
 	unsigned int act_pt = first_entry / GEN6_PTES;
 	unsigned int act_pte = first_entry % GEN6_PTES;
 	const u32 pte_encode = vm->pte_encode(0, cache_level, flags);
-	struct sgt_dma iter = sgt_dma(vma);
+	struct sgt_dma iter = sgt_dma(vma_res);
 	gen6_pte_t *vaddr;
 
 	GEM_BUG_ON(!pd->entry[act_pt]);
@@ -140,7 +140,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 		}
 	} while (1);
 
-	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
 }
 
 static void gen6_flush_pd(struct gen6_ppgtt *ppgtt, u64 start, u64 end)
@@ -271,13 +271,13 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 
 static void pd_vma_bind(struct i915_address_space *vm,
 			struct i915_vm_pt_stash *stash,
-			struct i915_vma *vma,
+			struct i915_vma_resource *vma_res,
 			enum i915_cache_level cache_level,
 			u32 unused)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
-	struct gen6_ppgtt *ppgtt = vma->private;
-	u32 ggtt_offset = i915_ggtt_offset(vma) / I915_GTT_PAGE_SIZE;
+	struct gen6_ppgtt *ppgtt = vma_res->private;
+	u32 ggtt_offset = vma_res->start / I915_GTT_PAGE_SIZE;
 
 	ppgtt->pp_dir = ggtt_offset * sizeof(gen6_pte_t) << 10;
 	ppgtt->pd_addr = (gen6_pte_t __iomem *)ggtt->gsm + ggtt_offset;
@@ -285,9 +285,10 @@ static void pd_vma_bind(struct i915_address_space *vm,
 	gen6_flush_pd(ppgtt, 0, ppgtt->base.vm.total);
 }
 
-static void pd_vma_unbind(struct i915_address_space *vm, struct i915_vma *vma)
+static void pd_vma_unbind(struct i915_address_space *vm,
+			  struct i915_vma_resource *vma_res)
 {
-	struct gen6_ppgtt *ppgtt = vma->private;
+	struct gen6_ppgtt *ppgtt = vma_res->private;
 	struct i915_page_directory * const pd = ppgtt->base.pd;
 	struct i915_page_table *pt;
 	unsigned int pde;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index b012c50f7ce7..c43e724afa9f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -453,20 +453,21 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 	return idx;
 }
 
-static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
+static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
+				   struct i915_vma_resource *vma_res,
 				   struct sgt_dma *iter,
 				   enum i915_cache_level cache_level,
 				   u32 flags)
 {
 	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
 	unsigned int rem = sg_dma_len(iter->sg);
-	u64 start = vma->node.start;
+	u64 start = vma_res->start;
 
-	GEM_BUG_ON(!i915_vm_is_4lvl(vma->vm));
+	GEM_BUG_ON(!i915_vm_is_4lvl(vm));
 
 	do {
 		struct i915_page_directory * const pdp =
-			gen8_pdp_for_page_address(vma->vm, start);
+			gen8_pdp_for_page_address(vm, start);
 		struct i915_page_directory * const pd =
 			i915_pd_entry(pdp, __gen8_pte_index(start, 2));
 		gen8_pte_t encode = pte_encode;
@@ -475,7 +476,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		gen8_pte_t *vaddr;
 		u16 index;
 
-		if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_2M &&
+		if (vma_res->bi.page_sizes.sg & I915_GTT_PAGE_SIZE_2M &&
 		    IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_2M) &&
 		    rem >= I915_GTT_PAGE_SIZE_2M &&
 		    !__gen8_pte_index(start, 0)) {
@@ -492,7 +493,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			page_size = I915_GTT_PAGE_SIZE;
 
 			if (!index &&
-			    vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K &&
+			    vma_res->bi.page_sizes.sg & I915_GTT_PAGE_SIZE_64K &&
 			    IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) &&
 			    (IS_ALIGNED(rem, I915_GTT_PAGE_SIZE_64K) ||
 			     rem >= (I915_PDES - index) * I915_GTT_PAGE_SIZE))
@@ -541,9 +542,9 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		 */
 		if (maybe_64K != -1 &&
 		    (index == I915_PDES ||
-		     (i915_vm_has_scratch_64K(vma->vm) &&
-		      !iter->sg && IS_ALIGNED(vma->node.start +
-					      vma->node.size,
+		     (i915_vm_has_scratch_64K(vm) &&
+		      !iter->sg && IS_ALIGNED(vma_res->start +
+					      vma_res->node_size,
 					      I915_GTT_PAGE_SIZE_2M)))) {
 			vaddr = px_vaddr(pd);
 			vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
@@ -559,10 +560,10 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			 * instead - which we detect as missing results during
 			 * selftests.
 			 */
-			if (I915_SELFTEST_ONLY(vma->vm->scrub_64K)) {
+			if (I915_SELFTEST_ONLY(vm->scrub_64K)) {
 				u16 i;
 
-				encode = vma->vm->scratch[0]->encode;
+				encode = vm->scratch[0]->encode;
 				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K));
 
 				for (i = 1; i < index; i += 16)
@@ -572,22 +573,22 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			}
 		}
 
-		vma->page_sizes.gtt |= page_size;
+		vma_res->page_sizes_gtt |= page_size;
 	} while (iter->sg && sg_dma_len(iter->sg));
 }
 
 static void gen8_ppgtt_insert(struct i915_address_space *vm,
-			      struct i915_vma *vma,
+			      struct i915_vma_resource *vma_res,
 			      enum i915_cache_level cache_level,
 			      u32 flags)
 {
 	struct i915_ppgtt * const ppgtt = i915_vm_to_ppgtt(vm);
-	struct sgt_dma iter = sgt_dma(vma);
+	struct sgt_dma iter = sgt_dma(vma_res);
 
-	if (vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
-		gen8_ppgtt_insert_huge(vma, &iter, cache_level, flags);
+	if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) {
+		gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
 	} else  {
-		u64 idx = vma->node.start >> GEN8_PTE_SHIFT;
+		u64 idx = vma_res->start >> GEN8_PTE_SHIFT;
 
 		do {
 			struct i915_page_directory * const pdp =
@@ -597,7 +598,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
 						    cache_level, flags);
 		} while (idx);
 
-		vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+		vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 352254e001b4..74aa90587061 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1718,8 +1718,8 @@ static void print_request_ring(struct drm_printer *m, struct i915_request *rq)
 	drm_printf(m,
 		   "[head %04x, postfix %04x, tail %04x, batch 0x%08x_%08x]:\n",
 		   rq->head, rq->postfix, rq->tail,
-		   vsnap ? upper_32_bits(vsnap->gtt_offset) : ~0u,
-		   vsnap ? lower_32_bits(vsnap->gtt_offset) : ~0u);
+		   vsnap ? upper_32_bits(vsnap->vma_resource->start) : ~0u,
+		   vsnap ? lower_32_bits(vsnap->vma_resource->start) : ~0u);
 
 	size = rq->tail - rq->head;
 	if (rq->tail < rq->head)
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 5263dda7f8d5..0137b6af0973 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -235,7 +235,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
 }
 
 static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct i915_vma *vma,
+				     struct i915_vma_resource *vma_res,
 				     enum i915_cache_level level,
 				     u32 flags)
 {
@@ -252,10 +252,10 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 	 */
 
 	gte = (gen8_pte_t __iomem *)ggtt->gsm;
-	gte += vma->node.start / I915_GTT_PAGE_SIZE;
-	end = gte + vma->node.size / I915_GTT_PAGE_SIZE;
+	gte += vma_res->start / I915_GTT_PAGE_SIZE;
+	end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
 
-	for_each_sgt_daddr(addr, iter, vma->pages)
+	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
 		gen8_set_pte(gte++, pte_encode | addr);
 	GEM_BUG_ON(gte > end);
 
@@ -292,7 +292,7 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
  * through the GMADR mapped BAR (i915->mm.gtt->gtt).
  */
 static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct i915_vma *vma,
+				     struct i915_vma_resource *vma_res,
 				     enum i915_cache_level level,
 				     u32 flags)
 {
@@ -303,10 +303,10 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 	dma_addr_t addr;
 
 	gte = (gen6_pte_t __iomem *)ggtt->gsm;
-	gte += vma->node.start / I915_GTT_PAGE_SIZE;
-	end = gte + vma->node.size / I915_GTT_PAGE_SIZE;
+	gte += vma_res->start / I915_GTT_PAGE_SIZE;
+	end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
 
-	for_each_sgt_daddr(addr, iter, vma->pages)
+	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
 		iowrite32(vm->pte_encode(addr, level, flags), gte++);
 	GEM_BUG_ON(gte > end);
 
@@ -389,7 +389,7 @@ static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
 
 struct insert_entries {
 	struct i915_address_space *vm;
-	struct i915_vma *vma;
+	struct i915_vma_resource *vma_res;
 	enum i915_cache_level level;
 	u32 flags;
 };
@@ -398,18 +398,18 @@ static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
 {
 	struct insert_entries *arg = _arg;
 
-	gen8_ggtt_insert_entries(arg->vm, arg->vma, arg->level, arg->flags);
+	gen8_ggtt_insert_entries(arg->vm, arg->vma_res, arg->level, arg->flags);
 	bxt_vtd_ggtt_wa(arg->vm);
 
 	return 0;
 }
 
 static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
-					     struct i915_vma *vma,
+					     struct i915_vma_resource *vma_res,
 					     enum i915_cache_level level,
 					     u32 flags)
 {
-	struct insert_entries arg = { vm, vma, level, flags };
+	struct insert_entries arg = { vm, vma_res, level, flags };
 
 	stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
 }
@@ -448,14 +448,14 @@ static void i915_ggtt_insert_page(struct i915_address_space *vm,
 }
 
 static void i915_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct i915_vma *vma,
+				     struct i915_vma_resource *vma_res,
 				     enum i915_cache_level cache_level,
 				     u32 unused)
 {
 	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
 		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
 
-	intel_gtt_insert_sg_entries(vma->pages, vma->node.start >> PAGE_SHIFT,
+	intel_gtt_insert_sg_entries(vma_res->bi.pages, vma_res->start >> PAGE_SHIFT,
 				    flags);
 }
 
@@ -467,30 +467,32 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
 
 static void ggtt_bind_vma(struct i915_address_space *vm,
 			  struct i915_vm_pt_stash *stash,
-			  struct i915_vma *vma,
+			  struct i915_vma_resource *vma_res,
 			  enum i915_cache_level cache_level,
 			  u32 flags)
 {
-	struct drm_i915_gem_object *obj = vma->obj;
 	u32 pte_flags;
 
-	if (i915_vma_is_bound(vma, ~flags & I915_VMA_BIND_MASK))
+	if (vma_res->bound_flags & (~flags & I915_VMA_BIND_MASK))
 		return;
 
+	vma_res->bound_flags |= flags;
+
 	/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
 	pte_flags = 0;
-	if (i915_gem_object_is_readonly(obj))
+	if (vma_res->bi.readonly)
 		pte_flags |= PTE_READ_ONLY;
-	if (i915_gem_object_is_lmem(obj))
+	if (vma_res->bi.lmem)
 		pte_flags |= PTE_LM;
 
-	vm->insert_entries(vm, vma, cache_level, pte_flags);
-	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
+	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
 }
 
-static void ggtt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
+static void ggtt_unbind_vma(struct i915_address_space *vm,
+			    struct i915_vma_resource *vma_res)
 {
-	vm->clear_range(vm, vma->node.start, vma->size);
+	vm->clear_range(vm, vma_res->start, vma_res->vma_size);
 }
 
 static int ggtt_reserve_guc_top(struct i915_ggtt *ggtt)
@@ -623,7 +625,7 @@ static int init_ggtt(struct i915_ggtt *ggtt)
 
 static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
 				  struct i915_vm_pt_stash *stash,
-				  struct i915_vma *vma,
+				  struct i915_vma_resource *vma_res,
 				  enum i915_cache_level cache_level,
 				  u32 flags)
 {
@@ -631,25 +633,27 @@ static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
 
 	/* Currently applicable only to VLV */
 	pte_flags = 0;
-	if (i915_gem_object_is_readonly(vma->obj))
+	if (vma_res->bi.readonly)
 		pte_flags |= PTE_READ_ONLY;
 
 	if (flags & I915_VMA_LOCAL_BIND)
 		ppgtt_bind_vma(&i915_vm_to_ggtt(vm)->alias->vm,
-			       stash, vma, cache_level, flags);
+			       stash, vma_res, cache_level, flags);
 
 	if (flags & I915_VMA_GLOBAL_BIND)
-		vm->insert_entries(vm, vma, cache_level, pte_flags);
+		vm->insert_entries(vm, vma_res, cache_level, pte_flags);
+
+	vma_res->bound_flags |= flags;
 }
 
 static void aliasing_gtt_unbind_vma(struct i915_address_space *vm,
-				    struct i915_vma *vma)
+				    struct i915_vma_resource *vma_res)
 {
-	if (i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND))
-		vm->clear_range(vm, vma->node.start, vma->size);
+	if (vma_res->bound_flags & I915_VMA_GLOBAL_BIND)
+		vm->clear_range(vm, vma_res->start, vma_res->vma_size);
 
-	if (i915_vma_is_bound(vma, I915_VMA_LOCAL_BIND))
-		ppgtt_unbind_vma(&i915_vm_to_ggtt(vm)->alias->vm, vma);
+	if (vma_res->bound_flags & I915_VMA_LOCAL_BIND)
+		ppgtt_unbind_vma(&i915_vm_to_ggtt(vm)->alias->vm, vma_res);
 }
 
 static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
@@ -1280,7 +1284,7 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
 			atomic_read(&vma->flags) & I915_VMA_BIND_MASK;
 
 		GEM_BUG_ON(!was_bound);
-		vma->ops->bind_vma(vm, NULL, vma,
+		vma->ops->bind_vma(vm, NULL, vma->resource,
 				   obj ? obj->cache_level : 0,
 				   was_bound);
 		if (obj) { /* only used during resume => exclusive access */
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 177b42b935a1..676b839d1a34 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -27,6 +27,7 @@
 
 #include "gt/intel_reset.h"
 #include "i915_selftest.h"
+#include "i915_vma_resource.h"
 #include "i915_vma_types.h"
 
 #define I915_GFP_ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
@@ -200,7 +201,7 @@ struct i915_vma_ops {
 	/* Map an object into an address space with the given cache flags. */
 	void (*bind_vma)(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash,
-			 struct i915_vma *vma,
+			 struct i915_vma_resource *vma_res,
 			 enum i915_cache_level cache_level,
 			 u32 flags);
 	/*
@@ -208,7 +209,8 @@ struct i915_vma_ops {
 	 * setting the valid PTE entries to a reserved scratch page.
 	 */
 	void (*unbind_vma)(struct i915_address_space *vm,
-			   struct i915_vma *vma);
+			   struct i915_vma_resource *vma_res);
+
 };
 
 struct i915_address_space {
@@ -285,7 +287,7 @@ struct i915_address_space {
 			    enum i915_cache_level cache_level,
 			    u32 flags);
 	void (*insert_entries)(struct i915_address_space *vm,
-			       struct i915_vma *vma,
+			       struct i915_vma_resource *vma_res,
 			       enum i915_cache_level cache_level,
 			       u32 flags);
 	void (*cleanup)(struct i915_address_space *vm);
@@ -600,11 +602,11 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt);
 
 void ppgtt_bind_vma(struct i915_address_space *vm,
 		    struct i915_vm_pt_stash *stash,
-		    struct i915_vma *vma,
+		    struct i915_vma_resource *vma_res,
 		    enum i915_cache_level cache_level,
 		    u32 flags);
 void ppgtt_unbind_vma(struct i915_address_space *vm,
-		      struct i915_vma *vma);
+		      struct i915_vma_resource *vma_res);
 
 void gtt_write_workarounds(struct intel_gt *gt);
 
@@ -627,8 +629,8 @@ __vm_create_scratch_for_read_pinned(struct i915_address_space *vm, unsigned long
 static inline struct sgt_dma {
 	struct scatterlist *sg;
 	dma_addr_t dma, max;
-} sgt_dma(struct i915_vma *vma) {
-	struct scatterlist *sg = vma->pages->sgl;
+} sgt_dma(struct i915_vma_resource *vma_res) {
+	struct scatterlist *sg = vma_res->bi.pages->sgl;
 	dma_addr_t addr = sg_dma_address(sg);
 
 	return (struct sgt_dma){ sg, addr, addr + sg_dma_len(sg) };
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 083b3090c69c..48e6e2f87700 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -179,32 +179,34 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt,
 
 void ppgtt_bind_vma(struct i915_address_space *vm,
 		    struct i915_vm_pt_stash *stash,
-		    struct i915_vma *vma,
+		    struct i915_vma_resource *vma_res,
 		    enum i915_cache_level cache_level,
 		    u32 flags)
 {
 	u32 pte_flags;
 
-	if (!test_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma))) {
-		vm->allocate_va_range(vm, stash, vma->node.start, vma->size);
-		set_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma));
+	if (!vma_res->allocated) {
+		vm->allocate_va_range(vm, stash, vma_res->start,
+				      vma_res->vma_size);
+		vma_res->allocated = true;
 	}
 
 	/* Applicable to VLV, and gen8+ */
 	pte_flags = 0;
-	if (i915_gem_object_is_readonly(vma->obj))
+	if (vma_res->bi.readonly)
 		pte_flags |= PTE_READ_ONLY;
-	if (i915_gem_object_is_lmem(vma->obj))
+	if (vma_res->bi.lmem)
 		pte_flags |= PTE_LM;
 
-	vm->insert_entries(vm, vma, cache_level, pte_flags);
+	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
 	wmb();
 }
 
-void ppgtt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
+void ppgtt_unbind_vma(struct i915_address_space *vm,
+		      struct i915_vma_resource *vma_res)
 {
-	if (test_and_clear_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma)))
-		vm->clear_range(vm, vma->node.start, vma->size);
+	if (vma_res->allocated)
+		vm->clear_range(vm, vma_res->start, vma_res->vma_size);
 }
 
 static unsigned long pd_count(u64 size, int shift)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index a5af05bde6f2..777fc6f0ceff 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -448,20 +448,19 @@ static void uc_fw_bind_ggtt(struct intel_uc_fw *uc_fw)
 {
 	struct drm_i915_gem_object *obj = uc_fw->obj;
 	struct i915_ggtt *ggtt = __uc_fw_to_gt(uc_fw)->ggtt;
-	struct i915_vma *dummy = &uc_fw->dummy;
+	struct i915_vma_resource *dummy = &uc_fw->dummy;
 	u32 pte_flags = 0;
 
-	dummy->node.start = uc_fw_ggtt_offset(uc_fw);
-	dummy->node.size = obj->base.size;
-	dummy->pages = obj->mm.pages;
-	dummy->vm = &ggtt->vm;
+	dummy->start = uc_fw_ggtt_offset(uc_fw);
+	dummy->node_size = obj->base.size;
+	dummy->bi.pages = obj->mm.pages;
 
 	GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
-	GEM_BUG_ON(dummy->node.size > ggtt->uc_fw.size);
+	GEM_BUG_ON(dummy->node_size > ggtt->uc_fw.size);
 
 	/* uc_fw->obj cache domains were not controlled across suspend */
 	if (i915_gem_object_has_struct_page(obj))
-		drm_clflush_sg(dummy->pages);
+		drm_clflush_sg(dummy->bi.pages);
 
 	if (i915_gem_object_is_lmem(obj))
 		pte_flags |= PTE_LM;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
index d9d1dc0b4cbb..3229018877d3 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
@@ -85,7 +85,7 @@ struct intel_uc_fw {
 	 * threaded as it done during driver load (inherently single threaded)
 	 * or during a GT reset (mutex guarantees single threaded).
 	 */
-	struct i915_vma dummy;
+	struct i915_vma_resource dummy;
 	struct i915_vma *rsa_data;
 
 	/*
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index e0e052cdf8b8..f7d1feba5aa4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -170,7 +170,8 @@ i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		seq_printf(m, " (%s offset: %08llx, size: %08llx, pages: %s",
 			   stringify_vma_type(vma),
 			   vma->node.start, vma->node.size,
-			   stringify_page_sizes(vma->page_sizes.gtt, NULL, 0));
+			   stringify_page_sizes(vma->resource->page_sizes_gtt,
+						NULL, 0));
 		if (i915_vma_is_ggtt(vma) || i915_vma_is_dpt(vma)) {
 			switch (vma->ggtt_view.type) {
 			case I915_GGTT_VIEW_NORMAL:
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 5ae812d60abe..1af54ff374f9 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1040,9 +1040,9 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	strcpy(dst->name, vsnap->name);
 	dst->next = NULL;
 
-	dst->gtt_offset = vsnap->gtt_offset;
-	dst->gtt_size = vsnap->gtt_size;
-	dst->gtt_page_sizes = vsnap->page_sizes;
+	dst->gtt_offset = vsnap->vma_resource->start;
+	dst->gtt_size = vsnap->vma_resource->node_size;
+	dst->gtt_page_sizes = vsnap->vma_resource->page_sizes_gtt;
 	dst->unused = 0;
 
 	ret = -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 7097c5016431..1d4e448d22d9 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -298,7 +298,7 @@ static void __vma_bind(struct dma_fence_work *work)
 	struct i915_vma *vma = vw->vma;
 
 	vma->ops->bind_vma(vw->vm, &vw->stash,
-			   vma, vw->cache_level, vw->flags);
+			   vma->resource, vw->cache_level, vw->flags);
 }
 
 static void __vma_release(struct dma_fence_work *work)
@@ -375,6 +375,21 @@ static int i915_vma_verify_bind_complete(struct i915_vma *vma)
 #define i915_vma_verify_bind_complete(_vma) 0
 #endif
 
+I915_SELFTEST_EXPORT void
+i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
+				struct i915_vma *vma)
+{
+	struct drm_i915_gem_object *obj = vma->obj;
+
+	i915_vma_resource_init(vma_res, vma->pages, &vma->page_sizes,
+			       i915_gem_object_is_readonly(obj),
+			       i915_gem_object_is_lmem(obj),
+			       vma->private,
+			       vma->node.start,
+			       vma->node.size,
+			       vma->size);
+}
+
 /**
  * i915_vma_bind - Sets up PTEs for an VMA in it's corresponding address space.
  * @vma: VMA to map
@@ -432,7 +447,7 @@ int i915_vma_bind(struct i915_vma *vma,
 		GEM_WARN_ON(!vma_flags);
 		kfree(vma_res);
 	} else {
-		i915_vma_resource_init(vma_res);
+		i915_vma_resource_init_from_vma(vma_res, vma);
 		vma->resource = vma_res;
 	}
 	trace_i915_vma_bind(vma, bind_flags);
@@ -472,7 +487,8 @@ int i915_vma_bind(struct i915_vma *vma,
 			if (ret)
 				return ret;
 		}
-		vma->ops->bind_vma(vma->vm, NULL, vma, cache_level, bind_flags);
+		vma->ops->bind_vma(vma->vm, NULL, vma->resource, cache_level,
+				   bind_flags);
 	}
 
 	atomic_or(bind_flags, &vma->flags);
@@ -1778,7 +1794,7 @@ void __i915_vma_evict(struct i915_vma *vma)
 
 	if (likely(atomic_read(&vma->vm->open))) {
 		trace_i915_vma_unbind(vma);
-		vma->ops->unbind_vma(vma->vm, vma);
+		vma->ops->unbind_vma(vma->vm, vma->resource);
 	}
 	atomic_and(~(I915_VMA_BIND_MASK | I915_VMA_ERROR | I915_VMA_GGTT_WRITE),
 		   &vma->flags);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index de0f3e44cdfa..1df57ec832bd 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -339,12 +339,6 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma);
  */
 void i915_vma_unpin_iomap(struct i915_vma *vma);
 
-static inline struct page *i915_vma_first_page(struct i915_vma *vma)
-{
-	GEM_BUG_ON(!vma->pages);
-	return sg_page(vma->pages->sgl);
-}
-
 /**
  * i915_vma_pin_fence - pin fencing state
  * @vma: vma to pin fencing for
@@ -445,6 +439,11 @@ i915_vma_get_current_resource(struct i915_vma *vma)
 	return i915_vma_resource_get(vma->resource);
 }
 
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+void i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
+				     struct i915_vma *vma);
+#endif
+
 void i915_vma_module_exit(void);
 int i915_vma_module_init(void);
 
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.c b/drivers/gpu/drm/i915/i915_vma_resource.c
index 833e987bed2a..c86db89ab5d2 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.c
+++ b/drivers/gpu/drm/i915/i915_vma_resource.c
@@ -23,15 +23,12 @@ static struct dma_fence_ops unbind_fence_ops = {
 };
 
 /**
- * i915_vma_resource_init - Initialize a vma resource.
+ * __i915_vma_resource_init - Initialize a vma resource.
  * @vma_res: The vma resource to initialize
  *
- * Initializes a vma resource allocated using i915_vma_resource_alloc().
- * The reason for having separate allocate and initialize function is that
- * initialization may need to be performed from under a lock where
- * allocation is not allowed.
+ * Initializes the private members of a vma resource.
  */
-void i915_vma_resource_init(struct i915_vma_resource *vma_res)
+void __i915_vma_resource_init(struct i915_vma_resource *vma_res)
 {
 	spin_lock_init(&vma_res->lock);
 	dma_fence_init(&vma_res->unbind_fence, &unbind_fence_ops,
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.h b/drivers/gpu/drm/i915/i915_vma_resource.h
index 34744da23072..9872de58268b 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.h
+++ b/drivers/gpu/drm/i915/i915_vma_resource.h
@@ -9,6 +9,25 @@
 #include <linux/dma-fence.h>
 #include <linux/refcount.h>
 
+#include "i915_gem.h"
+
+struct i915_page_sizes {
+	/**
+	 * The sg mask of the pages sg_table. i.e the mask of
+	 * the lengths for each sg entry.
+	 */
+	unsigned int phys;
+
+	/**
+	 * The gtt page sizes we are allowed to use given the
+	 * sg mask and the supported page sizes. This will
+	 * express the smallest unit we can use for the whole
+	 * object, as well as the larger sizes we may be able
+	 * to use opportunistically.
+	 */
+	unsigned int sg;
+};
+
 /**
  * struct i915_vma_resource - Snapshotted unbind information.
  * @unbind_fence: Fence to mark unbinding complete. Note that this fence
@@ -20,6 +39,13 @@
  * @hold_count: Number of holders blocking the fence from finishing.
  * The vma itself is keeping a hold, which is released when unbind
  * is scheduled.
+ * @private: Bind backend private info.
+ * @start: Offset into the address space of bind range start.
+ * @node_size: Size of the allocated range manager node.
+ * @vma_size: Bind size.
+ * @page_sizes_gtt: Resulting page sizes from the bind operation.
+ * @bound_flags: Flags indicating binding status.
+ * @allocated: Backend private data. TODO: Should move into @private.
  *
  * The lifetime of a struct i915_vma_resource is from a binding request to
  * the actual possible asynchronous unbind has completed.
@@ -29,6 +55,32 @@ struct i915_vma_resource {
 	/* See above for description of the lock. */
 	spinlock_t lock;
 	refcount_t hold_count;
+
+	/**
+	 * struct i915_vma_bindinfo - Information needed for async bind
+	 * only but that can be dropped after the bind has taken place.
+	 * Consider making this a separate argument to the bind_vma
+	 * op, coalescing with other arguments like vm, stash, cache_level
+	 * and flags
+	 * @pages: The pages sg-table.
+	 * @page_sizes: Page sizes of the pages.
+	 * @readonly: Whether the vma should be bound read-only.
+	 * @lmem: Whether the vma points to lmem.
+	 */
+	struct i915_vma_bindinfo {
+		struct sg_table *pages;
+		struct i915_page_sizes page_sizes;
+		bool readonly:1;
+		bool lmem:1;
+	} bi;
+
+	void *private;
+	unsigned long start;
+	unsigned long node_size;
+	unsigned long vma_size;
+	u32 page_sizes_gtt;
+	u32 bound_flags;
+	bool allocated:1;
 };
 
 bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
@@ -41,6 +93,8 @@ struct i915_vma_resource *i915_vma_resource_alloc(void);
 
 struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res);
 
+void __i915_vma_resource_init(struct i915_vma_resource *vma_res);
+
 /**
  * i915_vma_resource_get - Take a reference on a vma resource
  * @vma_res: The vma resource on which to take a reference.
@@ -63,8 +117,47 @@ static inline void i915_vma_resource_put(struct i915_vma_resource *vma_res)
 	dma_fence_put(&vma_res->unbind_fence);
 }
 
-#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-void i915_vma_resource_init(struct i915_vma_resource *vma_res);
-#endif
+/**
+ * i915_vma_resource_init - Initialize a vma resource.
+ * @vma_res: The vma resource to initialize
+ * @pages: The pages sg-table.
+ * @page_sizes: Page sizes of the pages.
+ * @readonly: Whether the vma should be bound read-only.
+ * @lmem: Whether the vma points to lmem.
+ * @private: Bind backend private info.
+ * @start: Offset into the address space of bind range start.
+ * @node_size: Size of the allocated range manager node.
+ * @size: Bind size.
+ *
+ * Initializes a vma resource allocated using i915_vma_resource_alloc().
+ * The reason for having separate allocate and initialize function is that
+ * initialization may need to be performed from under a lock where
+ * allocation is not allowed.
+ */
+static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
+					  struct sg_table *pages,
+					  const struct i915_page_sizes *page_sizes,
+					  bool readonly,
+					  bool lmem,
+					  void *private,
+					  unsigned long start,
+					  unsigned long node_size,
+					  unsigned long size)
+{
+	__i915_vma_resource_init(vma_res);
+	vma_res->bi.pages = pages;
+	vma_res->bi.page_sizes = *page_sizes;
+	vma_res->bi.readonly = readonly;
+	vma_res->bi.lmem = lmem;
+	vma_res->private = private;
+	vma_res->start = start;
+	vma_res->node_size = node_size;
+	vma_res->vma_size = size;
+}
+
+static inline void i915_vma_resource_fini(struct i915_vma_resource *vma_res)
+{
+	GEM_BUG_ON(refcount_read(&vma_res->hold_count) != 1);
+}
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
index f7333c7a2f5e..69f62c1ca967 100644
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.c
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.c
@@ -24,11 +24,7 @@ void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
 		assert_object_held(vma->obj);
 
 	vsnap->name = name;
-	vsnap->size = vma->size;
 	vsnap->obj_size = vma->obj->base.size;
-	vsnap->gtt_offset = vma->node.start;
-	vsnap->gtt_size = vma->node.size;
-	vsnap->page_sizes = vma->page_sizes.gtt;
 	vsnap->pages = vma->pages;
 	vsnap->pages_rsgt = NULL;
 	vsnap->mr = NULL;
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
index e74588dd676b..1b08ce9f8576 100644
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.h
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.h
@@ -23,31 +23,23 @@ struct sg_table;
 
 /**
  * struct i915_vma_snapshot - Snapshot of vma metadata.
- * @size: The vma size in bytes.
  * @obj_size: The size of the underlying object in bytes.
- * @gtt_offset: The gtt offset the vma is bound to.
- * @gtt_size: The size in bytes allocated for the vma in the GTT.
  * @pages: The struct sg_table pointing to the pages bound.
  * @pages_rsgt: The refcounted sg_table holding the reference for @pages if any.
  * @mr: The memory region pointed for the pages bound.
  * @kref: Reference for this structure.
  * @vma_resource: Pointer to the vma resource representing the vma binding.
- * @page_sizes: The vma GTT page sizes information.
  * @onstack: Whether the structure shouldn't be freed on final put.
  * @present: Whether the structure is present and initialized.
  */
 struct i915_vma_snapshot {
 	const char *name;
-	size_t size;
 	size_t obj_size;
-	size_t gtt_offset;
-	size_t gtt_size;
 	struct sg_table *pages;
 	struct i915_refct_sgt *pages_rsgt;
 	struct intel_memory_region *mr;
 	struct kref kref;
 	struct i915_vma_resource *vma_resource;
-	u32 page_sizes;
 	bool onstack:1;
 	bool present:1;
 };
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 54be880e55c3..70b5c47890b9 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -239,11 +239,11 @@ static int lowlevel_hole(struct i915_address_space *vm,
 			 unsigned long end_time)
 {
 	I915_RND_STATE(seed_prng);
-	struct i915_vma *mock_vma;
+	struct i915_vma_resource *mock_vma_res;
 	unsigned int size;
 
-	mock_vma = kzalloc(sizeof(*mock_vma), GFP_KERNEL);
-	if (!mock_vma)
+	mock_vma_res = kzalloc(sizeof(*mock_vma_res), GFP_KERNEL);
+	if (!mock_vma_res)
 		return -ENOMEM;
 
 	/* Keep creating larger objects until one cannot fit into the hole */
@@ -269,7 +269,7 @@ static int lowlevel_hole(struct i915_address_space *vm,
 				break;
 		} while (count >>= 1);
 		if (!count) {
-			kfree(mock_vma);
+			kfree(mock_vma_res);
 			return -ENOMEM;
 		}
 		GEM_BUG_ON(!order);
@@ -343,12 +343,12 @@ static int lowlevel_hole(struct i915_address_space *vm,
 					break;
 			}
 
-			mock_vma->pages = obj->mm.pages;
-			mock_vma->node.size = BIT_ULL(size);
-			mock_vma->node.start = addr;
+			mock_vma_res->bi.pages = obj->mm.pages;
+			mock_vma_res->node_size = BIT_ULL(size);
+			mock_vma_res->start = addr;
 
 			with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref)
-				vm->insert_entries(vm, mock_vma,
+			  vm->insert_entries(vm, mock_vma_res,
 						   I915_CACHE_NONE, 0);
 		}
 		count = n;
@@ -371,7 +371,7 @@ static int lowlevel_hole(struct i915_address_space *vm,
 		cleanup_freed_objects(vm->i915);
 	}
 
-	kfree(mock_vma);
+	kfree(mock_vma_res);
 	return 0;
 }
 
@@ -1280,6 +1280,7 @@ static void track_vma_bind(struct i915_vma *vma)
 	atomic_set(&vma->pages_count, I915_VMA_PAGES_ACTIVE);
 	__i915_gem_object_pin_pages(obj);
 	vma->pages = obj->mm.pages;
+	vma->resource->bi.pages = vma->pages;
 
 	mutex_lock(&vma->vm->mutex);
 	list_add_tail(&vma->vm_link, &vma->vm->bound_list);
@@ -1354,7 +1355,7 @@ static int reserve_gtt_with_resource(struct i915_vma *vma, u64 offset)
 				   obj->cache_level,
 				   0);
 	if (!err) {
-		i915_vma_resource_init(vma_res);
+		i915_vma_resource_init_from_vma(vma_res, vma);
 		vma->resource = vma_res;
 	} else {
 		kfree(vma_res);
@@ -1533,7 +1534,7 @@ static int insert_gtt_with_resource(struct i915_vma *vma)
 	err = i915_gem_gtt_insert(vm, &vma->node, obj->base.size, 0,
 				  obj->cache_level, 0, vm->total, 0);
 	if (!err) {
-		i915_vma_resource_init(vma_res);
+		i915_vma_resource_init_from_vma(vma_res, vma);
 		vma->resource = vma_res;
 	} else {
 		kfree(vma_res);
@@ -1958,6 +1959,7 @@ static int igt_cs_tlb(void *arg)
 			struct i915_vm_pt_stash stash = {};
 			struct i915_request *rq;
 			struct i915_gem_ww_ctx ww;
+			struct i915_vma_resource *vma_res;
 			u64 offset;
 
 			offset = igt_random_offset(&prng,
@@ -1978,6 +1980,13 @@ static int igt_cs_tlb(void *arg)
 			if (err)
 				goto end;
 
+			vma_res = i915_vma_resource_alloc();
+			if (IS_ERR(vma_res)) {
+				i915_vma_put_pages(vma);
+				err = PTR_ERR(vma_res);
+				goto end;
+			}
+
 			i915_gem_ww_ctx_init(&ww, false);
 retry:
 			err = i915_vm_lock_objects(vm, &ww);
@@ -1999,33 +2008,41 @@ static int igt_cs_tlb(void *arg)
 					goto retry;
 			}
 			i915_gem_ww_ctx_fini(&ww);
-			if (err)
+			if (err) {
+				kfree(vma_res);
 				goto end;
+			}
 
+			i915_vma_resource_init_from_vma(vma_res, vma);
 			/* Prime the TLB with the dummy pages */
 			for (i = 0; i < count; i++) {
-				vma->node.start = offset + i * PAGE_SIZE;
-				vm->insert_entries(vm, vma, I915_CACHE_NONE, 0);
+				vma_res->start = offset + i * PAGE_SIZE;
+				vm->insert_entries(vm, vma_res, I915_CACHE_NONE,
+						   0);
 
-				rq = submit_batch(ce, vma->node.start);
+				rq = submit_batch(ce, vma_res->start);
 				if (IS_ERR(rq)) {
 					err = PTR_ERR(rq);
+					i915_vma_resource_fini(vma_res);
+					kfree(vma_res);
 					goto end;
 				}
 				i915_request_put(rq);
 			}
-
+			i915_vma_resource_fini(vma_res);
 			i915_vma_put_pages(vma);
 
 			err = context_sync(ce);
 			if (err) {
 				pr_err("%s: dummy setup timed out\n",
 				       ce->engine->name);
+				kfree(vma_res);
 				goto end;
 			}
 
 			vma = i915_vma_instance(act, vm, NULL);
 			if (IS_ERR(vma)) {
+				kfree(vma_res);
 				err = PTR_ERR(vma);
 				goto end;
 			}
@@ -2033,19 +2050,22 @@ static int igt_cs_tlb(void *arg)
 			i915_gem_object_lock(act, NULL);
 			err = i915_vma_get_pages(vma);
 			i915_gem_object_unlock(act);
-			if (err)
+			if (err) {
+				kfree(vma_res);
 				goto end;
+			}
 
+			i915_vma_resource_init_from_vma(vma_res, vma);
 			/* Replace the TLB with target batches */
 			for (i = 0; i < count; i++) {
 				struct i915_request *rq;
 				u32 *cs = batch + i * 64 / sizeof(*cs);
 				u64 addr;
 
-				vma->node.start = offset + i * PAGE_SIZE;
-				vm->insert_entries(vm, vma, I915_CACHE_NONE, 0);
+				vma_res->start = offset + i * PAGE_SIZE;
+				vm->insert_entries(vm, vma_res, I915_CACHE_NONE, 0);
 
-				addr = vma->node.start + i * 64;
+				addr = vma_res->start + i * 64;
 				cs[4] = MI_NOOP;
 				cs[6] = lower_32_bits(addr);
 				cs[7] = upper_32_bits(addr);
@@ -2054,6 +2074,8 @@ static int igt_cs_tlb(void *arg)
 				rq = submit_batch(ce, addr);
 				if (IS_ERR(rq)) {
 					err = PTR_ERR(rq);
+					i915_vma_resource_fini(vma_res);
+					kfree(vma_res);
 					goto end;
 				}
 
@@ -2070,6 +2092,8 @@ static int igt_cs_tlb(void *arg)
 			}
 			end_spin(batch, count - 1);
 
+			i915_vma_resource_fini(vma_res);
+			kfree(vma_res);
 			i915_vma_put_pages(vma);
 
 			err = context_sync(ce);
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index 1802baf80a17..d40519e3ca38 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -33,23 +33,23 @@ static void mock_insert_page(struct i915_address_space *vm,
 }
 
 static void mock_insert_entries(struct i915_address_space *vm,
-				struct i915_vma *vma,
+				struct i915_vma_resource *vma_res,
 				enum i915_cache_level level, u32 flags)
 {
 }
 
 static void mock_bind_ppgtt(struct i915_address_space *vm,
 			    struct i915_vm_pt_stash *stash,
-			    struct i915_vma *vma,
+			    struct i915_vma_resource *vma_res,
 			    enum i915_cache_level cache_level,
 			    u32 flags)
 {
 	GEM_BUG_ON(flags & I915_VMA_GLOBAL_BIND);
-	set_bit(I915_VMA_LOCAL_BIND_BIT, __i915_vma_flags(vma));
+	vma_res->bound_flags |= flags;
 }
 
 static void mock_unbind_ppgtt(struct i915_address_space *vm,
-			      struct i915_vma *vma)
+			      struct i915_vma_resource *vma_res)
 {
 }
 
@@ -93,14 +93,14 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
 
 static void mock_bind_ggtt(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash,
-			   struct i915_vma *vma,
+			   struct i915_vma_resource *vma_res,
 			   enum i915_cache_level cache_level,
 			   u32 flags)
 {
 }
 
 static void mock_unbind_ggtt(struct i915_address_space *vm,
-			     struct i915_vma *vma)
+			     struct i915_vma_resource *vma_res)
 {
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Intel-gfx] [PATCH v5 2/6] drm/i915: Use the vma resource as argument for gtt binding / unbinding
@ 2022-01-04 12:51   ` Thomas Hellström
  0 siblings, 0 replies; 32+ messages in thread
From: Thomas Hellström @ 2022-01-04 12:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

When introducing asynchronous unbinding, the vma itself may no longer
be alive when the actual binding or unbinding takes place.

Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource
instead of a struct i915_vma for the bind_vma() and unbind_vma() ops.
Similarly change the insert_entries() op for struct i915_address_space.

Replace a couple of i915_vma_snapshot members with their newly introduced
i915_vma_resource counterparts, since they have the same lifetime.

Also make sure to avoid changing the struct i915_vma_flags (in particular
the bind flags) asynchronously. That should now only be done synchronously,
under the vm mutex.

v2:
- Update the vma_res::bound_flags when binding to the aliased ggtt

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
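(Not part of the patch; an illustration for review only.) A vma resource is
allocated outside the vm mutex and initialized under it, which is the pattern
the selftests in this series follow. A rough sketch, modelled on
insert_gtt_with_resource() in i915_gem_gtt.c (the "example_" name is made up):

static int example_insert_with_resource(struct i915_vma *vma)
{
	struct i915_address_space *vm = vma->vm;
	struct drm_i915_gem_object *obj = vma->obj;
	struct i915_vma_resource *vma_res;
	int err;

	/* Allocation may sleep, so do it before taking the vm mutex. */
	vma_res = i915_vma_resource_alloc();
	if (IS_ERR(vma_res))
		return PTR_ERR(vma_res);

	mutex_lock(&vm->mutex);
	err = i915_gem_gtt_insert(vm, &vma->node, obj->base.size, 0,
				  obj->cache_level, 0, vm->total, 0);
	if (!err) {
		/* Snapshot the vma state the bind backend will need later. */
		i915_vma_resource_init_from_vma(vma_res, vma);
		vma->resource = vma_res;
	} else {
		kfree(vma_res);
	}
	mutex_unlock(&vm->mutex);

	return err;
}
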
 drivers/gpu/drm/i915/display/intel_dpt.c      | 27 ++---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 27 +----
 .../gpu/drm/i915/gem/selftests/huge_pages.c   | 37 +++----
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 19 ++--
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 37 +++----
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  4 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          | 70 ++++++-------
 drivers/gpu/drm/i915/gt/intel_gtt.h           | 16 +--
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         | 22 +++--
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      | 13 ++-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h      |  2 +-
 drivers/gpu/drm/i915/i915_debugfs.c           |  3 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  6 +-
 drivers/gpu/drm/i915/i915_vma.c               | 24 ++++-
 drivers/gpu/drm/i915/i915_vma.h               | 11 +--
 drivers/gpu/drm/i915/i915_vma_resource.c      |  9 +-
 drivers/gpu/drm/i915/i915_vma_resource.h      | 99 ++++++++++++++++++-
 drivers/gpu/drm/i915/i915_vma_snapshot.c      |  4 -
 drivers/gpu/drm/i915/i915_vma_snapshot.h      |  8 --
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 64 ++++++++----
 drivers/gpu/drm/i915/selftests/mock_gtt.c     | 12 +--
 21 files changed, 308 insertions(+), 206 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
index 8f674745e7e0..63a83d5f85a1 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -48,7 +48,7 @@ static void dpt_insert_page(struct i915_address_space *vm,
 }
 
 static void dpt_insert_entries(struct i915_address_space *vm,
-			       struct i915_vma *vma,
+			       struct i915_vma_resource *vma_res,
 			       enum i915_cache_level level,
 			       u32 flags)
 {
@@ -64,8 +64,8 @@ static void dpt_insert_entries(struct i915_address_space *vm,
 	 * not to allow the user to override access to a read only page.
 	 */
 
-	i = vma->node.start / I915_GTT_PAGE_SIZE;
-	for_each_sgt_daddr(addr, sgt_iter, vma->pages)
+	i = vma_res->start / I915_GTT_PAGE_SIZE;
+	for_each_sgt_daddr(addr, sgt_iter, vma_res->bi.pages)
 		gen8_set_pte(&base[i++], pte_encode | addr);
 }
 
@@ -76,35 +76,38 @@ static void dpt_clear_range(struct i915_address_space *vm,
 
 static void dpt_bind_vma(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash,
-			 struct i915_vma *vma,
+			 struct i915_vma_resource *vma_res,
 			 enum i915_cache_level cache_level,
 			 u32 flags)
 {
-	struct drm_i915_gem_object *obj = vma->obj;
 	u32 pte_flags;
 
+	if (vma_res->bound_flags)
+		return;
+
 	/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
 	pte_flags = 0;
-	if (vma->vm->has_read_only && i915_gem_object_is_readonly(obj))
+	if (vm->has_read_only && vma_res->bi.readonly)
 		pte_flags |= PTE_READ_ONLY;
-	if (i915_gem_object_is_lmem(obj))
+	if (vma_res->bi.lmem)
 		pte_flags |= PTE_LM;
 
-	vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
+	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
 
-	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
 
 	/*
 	 * Without aliasing PPGTT there's no difference between
 	 * GLOBAL/LOCAL_BIND, it's all the same ptes. Hence unconditionally
 	 * upgrade to both bound if we bind either to avoid double-binding.
 	 */
-	atomic_or(I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND, &vma->flags);
+	vma_res->bound_flags = I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
 }
 
-static void dpt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
+static void dpt_unbind_vma(struct i915_address_space *vm,
+			   struct i915_vma_resource *vma_res)
 {
-	vm->clear_range(vm, vma->node.start, vma->size);
+	vm->clear_range(vm, vma_res->start, vma_res->vma_size);
 }
 
 static void dpt_cleanup(struct i915_address_space *vm)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index f9f7e44099fe..f99d260e0684 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -15,6 +15,7 @@
 
 #include "i915_active.h"
 #include "i915_selftest.h"
+#include "i915_vma_resource.h"
 
 struct drm_i915_gem_object;
 struct intel_fronbuffer;
@@ -549,31 +550,7 @@ struct drm_i915_gem_object {
 		struct sg_table *pages;
 		void *mapping;
 
-		struct i915_page_sizes {
-			/**
-			 * The sg mask of the pages sg_table. i.e the mask of
-			 * of the lengths for each sg entry.
-			 */
-			unsigned int phys;
-
-			/**
-			 * The gtt page sizes we are allowed to use given the
-			 * sg mask and the supported page sizes. This will
-			 * express the smallest unit we can use for the whole
-			 * object, as well as the larger sizes we may be able
-			 * to use opportunistically.
-			 */
-			unsigned int sg;
-
-			/**
-			 * The actual gtt page size usage. Since we can have
-			 * multiple vma associated with this object we need to
-			 * prevent any trampling of state, hence a copy of this
-			 * struct also lives in each vma, therefore the gtt
-			 * value here should only be read/write through the vma.
-			 */
-			unsigned int gtt;
-		} page_sizes;
+		struct i915_page_sizes page_sizes;
 
 		I915_SELFTEST_DECLARE(unsigned int page_mask);
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 11f0aa65f8a3..26f997c376a2 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -370,9 +370,9 @@ static int igt_check_page_sizes(struct i915_vma *vma)
 		err = -EINVAL;
 	}
 
-	if (!HAS_PAGE_SIZES(i915, vma->page_sizes.gtt)) {
+	if (!HAS_PAGE_SIZES(i915, vma->resource->page_sizes_gtt)) {
 		pr_err("unsupported page_sizes.gtt=%u, supported=%u\n",
-		       vma->page_sizes.gtt & ~supported, supported);
+		       vma->resource->page_sizes_gtt & ~supported, supported);
 		err = -EINVAL;
 	}
 
@@ -403,15 +403,9 @@ static int igt_check_page_sizes(struct i915_vma *vma)
 	if (i915_gem_object_is_lmem(obj) &&
 	    IS_ALIGNED(vma->node.start, SZ_2M) &&
 	    vma->page_sizes.sg & SZ_2M &&
-	    vma->page_sizes.gtt < SZ_2M) {
+	    vma->resource->page_sizes_gtt < SZ_2M) {
 		pr_err("gtt pages mismatch for LMEM, expected 2M GTT pages, sg(%u), gtt(%u)\n",
-		       vma->page_sizes.sg, vma->page_sizes.gtt);
-		err = -EINVAL;
-	}
-
-	if (obj->mm.page_sizes.gtt) {
-		pr_err("obj->page_sizes.gtt(%u) should never be set\n",
-		       obj->mm.page_sizes.gtt);
+		       vma->page_sizes.sg, vma->resource->page_sizes_gtt);
 		err = -EINVAL;
 	}
 
@@ -547,9 +541,9 @@ static int igt_mock_memory_region_huge_pages(void *arg)
 				goto out_unpin;
 			}
 
-			if (vma->page_sizes.gtt != page_size) {
+			if (vma->resource->page_sizes_gtt != page_size) {
 				pr_err("%s page_sizes.gtt=%u, expected=%u\n",
-				       __func__, vma->page_sizes.gtt,
+				       __func__, vma->resource->page_sizes_gtt,
 				       page_size);
 				err = -EINVAL;
 				goto out_unpin;
@@ -630,9 +624,9 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
 
 		err = igt_check_page_sizes(vma);
 
-		if (vma->page_sizes.gtt != page_size) {
+		if (vma->resource->page_sizes_gtt != page_size) {
 			pr_err("page_sizes.gtt=%u, expected %u\n",
-			       vma->page_sizes.gtt, page_size);
+			       vma->resource->page_sizes_gtt, page_size);
 			err = -EINVAL;
 		}
 
@@ -657,9 +651,10 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
 
 			err = igt_check_page_sizes(vma);
 
-			if (vma->page_sizes.gtt != I915_GTT_PAGE_SIZE_4K) {
+			if (vma->resource->page_sizes_gtt != I915_GTT_PAGE_SIZE_4K) {
 				pr_err("page_sizes.gtt=%u, expected %llu\n",
-				       vma->page_sizes.gtt, I915_GTT_PAGE_SIZE_4K);
+				       vma->resource->page_sizes_gtt,
+				       I915_GTT_PAGE_SIZE_4K);
 				err = -EINVAL;
 			}
 
@@ -805,9 +800,9 @@ static int igt_mock_ppgtt_huge_fill(void *arg)
 			}
 		}
 
-		if (vma->page_sizes.gtt != expected_gtt) {
+		if (vma->resource->page_sizes_gtt != expected_gtt) {
 			pr_err("gtt=%u, expected=%u, size=%zd, single=%s\n",
-			       vma->page_sizes.gtt, expected_gtt,
+			       vma->resource->page_sizes_gtt, expected_gtt,
 			       obj->base.size, yesno(!!single));
 			err = -EINVAL;
 			break;
@@ -961,10 +956,10 @@ static int igt_mock_ppgtt_64K(void *arg)
 				}
 			}
 
-			if (vma->page_sizes.gtt != expected_gtt) {
+			if (vma->resource->page_sizes_gtt != expected_gtt) {
 				pr_err("gtt=%u, expected=%u, i=%d, single=%s\n",
-				       vma->page_sizes.gtt, expected_gtt, i,
-				       yesno(!!single));
+				       vma->resource->page_sizes_gtt,
+				       expected_gtt, i, yesno(!!single));
 				err = -EINVAL;
 				goto out_vma_unpin;
 			}
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 6e9292918bfc..d657ffd6c86a 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -104,17 +104,17 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 }
 
 static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
-				      struct i915_vma *vma,
+				      struct i915_vma_resource *vma_res,
 				      enum i915_cache_level cache_level,
 				      u32 flags)
 {
 	struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
 	struct i915_page_directory * const pd = ppgtt->pd;
-	unsigned int first_entry = vma->node.start / I915_GTT_PAGE_SIZE;
+	unsigned int first_entry = vma_res->start / I915_GTT_PAGE_SIZE;
 	unsigned int act_pt = first_entry / GEN6_PTES;
 	unsigned int act_pte = first_entry % GEN6_PTES;
 	const u32 pte_encode = vm->pte_encode(0, cache_level, flags);
-	struct sgt_dma iter = sgt_dma(vma);
+	struct sgt_dma iter = sgt_dma(vma_res);
 	gen6_pte_t *vaddr;
 
 	GEM_BUG_ON(!pd->entry[act_pt]);
@@ -140,7 +140,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 		}
 	} while (1);
 
-	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
 }
 
 static void gen6_flush_pd(struct gen6_ppgtt *ppgtt, u64 start, u64 end)
@@ -271,13 +271,13 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 
 static void pd_vma_bind(struct i915_address_space *vm,
 			struct i915_vm_pt_stash *stash,
-			struct i915_vma *vma,
+			struct i915_vma_resource *vma_res,
 			enum i915_cache_level cache_level,
 			u32 unused)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
-	struct gen6_ppgtt *ppgtt = vma->private;
-	u32 ggtt_offset = i915_ggtt_offset(vma) / I915_GTT_PAGE_SIZE;
+	struct gen6_ppgtt *ppgtt = vma_res->private;
+	u32 ggtt_offset = vma_res->start / I915_GTT_PAGE_SIZE;
 
 	ppgtt->pp_dir = ggtt_offset * sizeof(gen6_pte_t) << 10;
 	ppgtt->pd_addr = (gen6_pte_t __iomem *)ggtt->gsm + ggtt_offset;
@@ -285,9 +285,10 @@ static void pd_vma_bind(struct i915_address_space *vm,
 	gen6_flush_pd(ppgtt, 0, ppgtt->base.vm.total);
 }
 
-static void pd_vma_unbind(struct i915_address_space *vm, struct i915_vma *vma)
+static void pd_vma_unbind(struct i915_address_space *vm,
+			  struct i915_vma_resource *vma_res)
 {
-	struct gen6_ppgtt *ppgtt = vma->private;
+	struct gen6_ppgtt *ppgtt = vma_res->private;
 	struct i915_page_directory * const pd = ppgtt->base.pd;
 	struct i915_page_table *pt;
 	unsigned int pde;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index b012c50f7ce7..c43e724afa9f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -453,20 +453,21 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 	return idx;
 }
 
-static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
+static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
+				   struct i915_vma_resource *vma_res,
 				   struct sgt_dma *iter,
 				   enum i915_cache_level cache_level,
 				   u32 flags)
 {
 	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
 	unsigned int rem = sg_dma_len(iter->sg);
-	u64 start = vma->node.start;
+	u64 start = vma_res->start;
 
-	GEM_BUG_ON(!i915_vm_is_4lvl(vma->vm));
+	GEM_BUG_ON(!i915_vm_is_4lvl(vm));
 
 	do {
 		struct i915_page_directory * const pdp =
-			gen8_pdp_for_page_address(vma->vm, start);
+			gen8_pdp_for_page_address(vm, start);
 		struct i915_page_directory * const pd =
 			i915_pd_entry(pdp, __gen8_pte_index(start, 2));
 		gen8_pte_t encode = pte_encode;
@@ -475,7 +476,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		gen8_pte_t *vaddr;
 		u16 index;
 
-		if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_2M &&
+		if (vma_res->bi.page_sizes.sg & I915_GTT_PAGE_SIZE_2M &&
 		    IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_2M) &&
 		    rem >= I915_GTT_PAGE_SIZE_2M &&
 		    !__gen8_pte_index(start, 0)) {
@@ -492,7 +493,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			page_size = I915_GTT_PAGE_SIZE;
 
 			if (!index &&
-			    vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K &&
+			    vma_res->bi.page_sizes.sg & I915_GTT_PAGE_SIZE_64K &&
 			    IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) &&
 			    (IS_ALIGNED(rem, I915_GTT_PAGE_SIZE_64K) ||
 			     rem >= (I915_PDES - index) * I915_GTT_PAGE_SIZE))
@@ -541,9 +542,9 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 		 */
 		if (maybe_64K != -1 &&
 		    (index == I915_PDES ||
-		     (i915_vm_has_scratch_64K(vma->vm) &&
-		      !iter->sg && IS_ALIGNED(vma->node.start +
-					      vma->node.size,
+		     (i915_vm_has_scratch_64K(vm) &&
+		      !iter->sg && IS_ALIGNED(vma_res->start +
+					      vma_res->node_size,
 					      I915_GTT_PAGE_SIZE_2M)))) {
 			vaddr = px_vaddr(pd);
 			vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
@@ -559,10 +560,10 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			 * instead - which we detect as missing results during
 			 * selftests.
 			 */
-			if (I915_SELFTEST_ONLY(vma->vm->scrub_64K)) {
+			if (I915_SELFTEST_ONLY(vm->scrub_64K)) {
 				u16 i;
 
-				encode = vma->vm->scratch[0]->encode;
+				encode = vm->scratch[0]->encode;
 				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K));
 
 				for (i = 1; i < index; i += 16)
@@ -572,22 +573,22 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			}
 		}
 
-		vma->page_sizes.gtt |= page_size;
+		vma_res->page_sizes_gtt |= page_size;
 	} while (iter->sg && sg_dma_len(iter->sg));
 }
 
 static void gen8_ppgtt_insert(struct i915_address_space *vm,
-			      struct i915_vma *vma,
+			      struct i915_vma_resource *vma_res,
 			      enum i915_cache_level cache_level,
 			      u32 flags)
 {
 	struct i915_ppgtt * const ppgtt = i915_vm_to_ppgtt(vm);
-	struct sgt_dma iter = sgt_dma(vma);
+	struct sgt_dma iter = sgt_dma(vma_res);
 
-	if (vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
-		gen8_ppgtt_insert_huge(vma, &iter, cache_level, flags);
+	if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) {
+		gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
 	} else  {
-		u64 idx = vma->node.start >> GEN8_PTE_SHIFT;
+		u64 idx = vma_res->start >> GEN8_PTE_SHIFT;
 
 		do {
 			struct i915_page_directory * const pdp =
@@ -597,7 +598,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
 						    cache_level, flags);
 		} while (idx);
 
-		vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+		vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 352254e001b4..74aa90587061 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1718,8 +1718,8 @@ static void print_request_ring(struct drm_printer *m, struct i915_request *rq)
 	drm_printf(m,
 		   "[head %04x, postfix %04x, tail %04x, batch 0x%08x_%08x]:\n",
 		   rq->head, rq->postfix, rq->tail,
-		   vsnap ? upper_32_bits(vsnap->gtt_offset) : ~0u,
-		   vsnap ? lower_32_bits(vsnap->gtt_offset) : ~0u);
+		   vsnap ? upper_32_bits(vsnap->vma_resource->start) : ~0u,
+		   vsnap ? lower_32_bits(vsnap->vma_resource->start) : ~0u);
 
 	size = rq->tail - rq->head;
 	if (rq->tail < rq->head)
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 5263dda7f8d5..0137b6af0973 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -235,7 +235,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
 }
 
 static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct i915_vma *vma,
+				     struct i915_vma_resource *vma_res,
 				     enum i915_cache_level level,
 				     u32 flags)
 {
@@ -252,10 +252,10 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 	 */
 
 	gte = (gen8_pte_t __iomem *)ggtt->gsm;
-	gte += vma->node.start / I915_GTT_PAGE_SIZE;
-	end = gte + vma->node.size / I915_GTT_PAGE_SIZE;
+	gte += vma_res->start / I915_GTT_PAGE_SIZE;
+	end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
 
-	for_each_sgt_daddr(addr, iter, vma->pages)
+	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
 		gen8_set_pte(gte++, pte_encode | addr);
 	GEM_BUG_ON(gte > end);
 
@@ -292,7 +292,7 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
  * through the GMADR mapped BAR (i915->mm.gtt->gtt).
  */
 static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct i915_vma *vma,
+				     struct i915_vma_resource *vma_res,
 				     enum i915_cache_level level,
 				     u32 flags)
 {
@@ -303,10 +303,10 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 	dma_addr_t addr;
 
 	gte = (gen6_pte_t __iomem *)ggtt->gsm;
-	gte += vma->node.start / I915_GTT_PAGE_SIZE;
-	end = gte + vma->node.size / I915_GTT_PAGE_SIZE;
+	gte += vma_res->start / I915_GTT_PAGE_SIZE;
+	end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
 
-	for_each_sgt_daddr(addr, iter, vma->pages)
+	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
 		iowrite32(vm->pte_encode(addr, level, flags), gte++);
 	GEM_BUG_ON(gte > end);
 
@@ -389,7 +389,7 @@ static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
 
 struct insert_entries {
 	struct i915_address_space *vm;
-	struct i915_vma *vma;
+	struct i915_vma_resource *vma_res;
 	enum i915_cache_level level;
 	u32 flags;
 };
@@ -398,18 +398,18 @@ static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
 {
 	struct insert_entries *arg = _arg;
 
-	gen8_ggtt_insert_entries(arg->vm, arg->vma, arg->level, arg->flags);
+	gen8_ggtt_insert_entries(arg->vm, arg->vma_res, arg->level, arg->flags);
 	bxt_vtd_ggtt_wa(arg->vm);
 
 	return 0;
 }
 
 static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
-					     struct i915_vma *vma,
+					     struct i915_vma_resource *vma_res,
 					     enum i915_cache_level level,
 					     u32 flags)
 {
-	struct insert_entries arg = { vm, vma, level, flags };
+	struct insert_entries arg = { vm, vma_res, level, flags };
 
 	stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
 }
@@ -448,14 +448,14 @@ static void i915_ggtt_insert_page(struct i915_address_space *vm,
 }
 
 static void i915_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct i915_vma *vma,
+				     struct i915_vma_resource *vma_res,
 				     enum i915_cache_level cache_level,
 				     u32 unused)
 {
 	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
 		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
 
-	intel_gtt_insert_sg_entries(vma->pages, vma->node.start >> PAGE_SHIFT,
+	intel_gtt_insert_sg_entries(vma_res->bi.pages, vma_res->start >> PAGE_SHIFT,
 				    flags);
 }
 
@@ -467,30 +467,32 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
 
 static void ggtt_bind_vma(struct i915_address_space *vm,
 			  struct i915_vm_pt_stash *stash,
-			  struct i915_vma *vma,
+			  struct i915_vma_resource *vma_res,
 			  enum i915_cache_level cache_level,
 			  u32 flags)
 {
-	struct drm_i915_gem_object *obj = vma->obj;
 	u32 pte_flags;
 
-	if (i915_vma_is_bound(vma, ~flags & I915_VMA_BIND_MASK))
+	if (vma_res->bound_flags & (~flags & I915_VMA_BIND_MASK))
 		return;
 
+	vma_res->bound_flags |= flags;
+
 	/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
 	pte_flags = 0;
-	if (i915_gem_object_is_readonly(obj))
+	if (vma_res->bi.readonly)
 		pte_flags |= PTE_READ_ONLY;
-	if (i915_gem_object_is_lmem(obj))
+	if (vma_res->bi.lmem)
 		pte_flags |= PTE_LM;
 
-	vm->insert_entries(vm, vma, cache_level, pte_flags);
-	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
+	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
+	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
 }
 
-static void ggtt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
+static void ggtt_unbind_vma(struct i915_address_space *vm,
+			    struct i915_vma_resource *vma_res)
 {
-	vm->clear_range(vm, vma->node.start, vma->size);
+	vm->clear_range(vm, vma_res->start, vma_res->vma_size);
 }
 
 static int ggtt_reserve_guc_top(struct i915_ggtt *ggtt)
@@ -623,7 +625,7 @@ static int init_ggtt(struct i915_ggtt *ggtt)
 
 static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
 				  struct i915_vm_pt_stash *stash,
-				  struct i915_vma *vma,
+				  struct i915_vma_resource *vma_res,
 				  enum i915_cache_level cache_level,
 				  u32 flags)
 {
@@ -631,25 +633,27 @@ static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
 
 	/* Currently applicable only to VLV */
 	pte_flags = 0;
-	if (i915_gem_object_is_readonly(vma->obj))
+	if (vma_res->bi.readonly)
 		pte_flags |= PTE_READ_ONLY;
 
 	if (flags & I915_VMA_LOCAL_BIND)
 		ppgtt_bind_vma(&i915_vm_to_ggtt(vm)->alias->vm,
-			       stash, vma, cache_level, flags);
+			       stash, vma_res, cache_level, flags);
 
 	if (flags & I915_VMA_GLOBAL_BIND)
-		vm->insert_entries(vm, vma, cache_level, pte_flags);
+		vm->insert_entries(vm, vma_res, cache_level, pte_flags);
+
+	vma_res->bound_flags |= flags;
 }
 
 static void aliasing_gtt_unbind_vma(struct i915_address_space *vm,
-				    struct i915_vma *vma)
+				    struct i915_vma_resource *vma_res)
 {
-	if (i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND))
-		vm->clear_range(vm, vma->node.start, vma->size);
+	if (vma_res->bound_flags & I915_VMA_GLOBAL_BIND)
+		vm->clear_range(vm, vma_res->start, vma_res->vma_size);
 
-	if (i915_vma_is_bound(vma, I915_VMA_LOCAL_BIND))
-		ppgtt_unbind_vma(&i915_vm_to_ggtt(vm)->alias->vm, vma);
+	if (vma_res->bound_flags & I915_VMA_LOCAL_BIND)
+		ppgtt_unbind_vma(&i915_vm_to_ggtt(vm)->alias->vm, vma_res);
 }
 
 static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
@@ -1280,7 +1284,7 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
 			atomic_read(&vma->flags) & I915_VMA_BIND_MASK;
 
 		GEM_BUG_ON(!was_bound);
-		vma->ops->bind_vma(vm, NULL, vma,
+		vma->ops->bind_vma(vm, NULL, vma->resource,
 				   obj ? obj->cache_level : 0,
 				   was_bound);
 		if (obj) { /* only used during resume => exclusive access */
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 177b42b935a1..676b839d1a34 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -27,6 +27,7 @@
 
 #include "gt/intel_reset.h"
 #include "i915_selftest.h"
+#include "i915_vma_resource.h"
 #include "i915_vma_types.h"
 
 #define I915_GFP_ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
@@ -200,7 +201,7 @@ struct i915_vma_ops {
 	/* Map an object into an address space with the given cache flags. */
 	void (*bind_vma)(struct i915_address_space *vm,
 			 struct i915_vm_pt_stash *stash,
-			 struct i915_vma *vma,
+			 struct i915_vma_resource *vma_res,
 			 enum i915_cache_level cache_level,
 			 u32 flags);
 	/*
@@ -208,7 +209,8 @@ struct i915_vma_ops {
 	 * setting the valid PTE entries to a reserved scratch page.
 	 */
 	void (*unbind_vma)(struct i915_address_space *vm,
-			   struct i915_vma *vma);
+			   struct i915_vma_resource *vma_res);
+
 };
 
 struct i915_address_space {
@@ -285,7 +287,7 @@ struct i915_address_space {
 			    enum i915_cache_level cache_level,
 			    u32 flags);
 	void (*insert_entries)(struct i915_address_space *vm,
-			       struct i915_vma *vma,
+			       struct i915_vma_resource *vma_res,
 			       enum i915_cache_level cache_level,
 			       u32 flags);
 	void (*cleanup)(struct i915_address_space *vm);
@@ -600,11 +602,11 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt);
 
 void ppgtt_bind_vma(struct i915_address_space *vm,
 		    struct i915_vm_pt_stash *stash,
-		    struct i915_vma *vma,
+		    struct i915_vma_resource *vma_res,
 		    enum i915_cache_level cache_level,
 		    u32 flags);
 void ppgtt_unbind_vma(struct i915_address_space *vm,
-		      struct i915_vma *vma);
+		      struct i915_vma_resource *vma_res);
 
 void gtt_write_workarounds(struct intel_gt *gt);
 
@@ -627,8 +629,8 @@ __vm_create_scratch_for_read_pinned(struct i915_address_space *vm, unsigned long
 static inline struct sgt_dma {
 	struct scatterlist *sg;
 	dma_addr_t dma, max;
-} sgt_dma(struct i915_vma *vma) {
-	struct scatterlist *sg = vma->pages->sgl;
+} sgt_dma(struct i915_vma_resource *vma_res) {
+	struct scatterlist *sg = vma_res->bi.pages->sgl;
 	dma_addr_t addr = sg_dma_address(sg);
 
 	return (struct sgt_dma){ sg, addr, addr + sg_dma_len(sg) };
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 083b3090c69c..48e6e2f87700 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -179,32 +179,34 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt,
 
 void ppgtt_bind_vma(struct i915_address_space *vm,
 		    struct i915_vm_pt_stash *stash,
-		    struct i915_vma *vma,
+		    struct i915_vma_resource *vma_res,
 		    enum i915_cache_level cache_level,
 		    u32 flags)
 {
 	u32 pte_flags;
 
-	if (!test_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma))) {
-		vm->allocate_va_range(vm, stash, vma->node.start, vma->size);
-		set_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma));
+	if (!vma_res->allocated) {
+		vm->allocate_va_range(vm, stash, vma_res->start,
+				      vma_res->vma_size);
+		vma_res->allocated = true;
 	}
 
 	/* Applicable to VLV, and gen8+ */
 	pte_flags = 0;
-	if (i915_gem_object_is_readonly(vma->obj))
+	if (vma_res->bi.readonly)
 		pte_flags |= PTE_READ_ONLY;
-	if (i915_gem_object_is_lmem(vma->obj))
+	if (vma_res->bi.lmem)
 		pte_flags |= PTE_LM;
 
-	vm->insert_entries(vm, vma, cache_level, pte_flags);
+	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
 	wmb();
 }
 
-void ppgtt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
+void ppgtt_unbind_vma(struct i915_address_space *vm,
+		      struct i915_vma_resource *vma_res)
 {
-	if (test_and_clear_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma)))
-		vm->clear_range(vm, vma->node.start, vma->size);
+	if (vma_res->allocated)
+		vm->clear_range(vm, vma_res->start, vma_res->vma_size);
 }
 
 static unsigned long pd_count(u64 size, int shift)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index a5af05bde6f2..777fc6f0ceff 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -448,20 +448,19 @@ static void uc_fw_bind_ggtt(struct intel_uc_fw *uc_fw)
 {
 	struct drm_i915_gem_object *obj = uc_fw->obj;
 	struct i915_ggtt *ggtt = __uc_fw_to_gt(uc_fw)->ggtt;
-	struct i915_vma *dummy = &uc_fw->dummy;
+	struct i915_vma_resource *dummy = &uc_fw->dummy;
 	u32 pte_flags = 0;
 
-	dummy->node.start = uc_fw_ggtt_offset(uc_fw);
-	dummy->node.size = obj->base.size;
-	dummy->pages = obj->mm.pages;
-	dummy->vm = &ggtt->vm;
+	dummy->start = uc_fw_ggtt_offset(uc_fw);
+	dummy->node_size = obj->base.size;
+	dummy->bi.pages = obj->mm.pages;
 
 	GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
-	GEM_BUG_ON(dummy->node.size > ggtt->uc_fw.size);
+	GEM_BUG_ON(dummy->node_size > ggtt->uc_fw.size);
 
 	/* uc_fw->obj cache domains were not controlled across suspend */
 	if (i915_gem_object_has_struct_page(obj))
-		drm_clflush_sg(dummy->pages);
+		drm_clflush_sg(dummy->bi.pages);
 
 	if (i915_gem_object_is_lmem(obj))
 		pte_flags |= PTE_LM;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
index d9d1dc0b4cbb..3229018877d3 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
@@ -85,7 +85,7 @@ struct intel_uc_fw {
 	 * threaded as it done during driver load (inherently single threaded)
 	 * or during a GT reset (mutex guarantees single threaded).
 	 */
-	struct i915_vma dummy;
+	struct i915_vma_resource dummy;
 	struct i915_vma *rsa_data;
 
 	/*
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index e0e052cdf8b8..f7d1feba5aa4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -170,7 +170,8 @@ i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		seq_printf(m, " (%s offset: %08llx, size: %08llx, pages: %s",
 			   stringify_vma_type(vma),
 			   vma->node.start, vma->node.size,
-			   stringify_page_sizes(vma->page_sizes.gtt, NULL, 0));
+			   stringify_page_sizes(vma->resource->page_sizes_gtt,
+						NULL, 0));
 		if (i915_vma_is_ggtt(vma) || i915_vma_is_dpt(vma)) {
 			switch (vma->ggtt_view.type) {
 			case I915_GGTT_VIEW_NORMAL:
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 5ae812d60abe..1af54ff374f9 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1040,9 +1040,9 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	strcpy(dst->name, vsnap->name);
 	dst->next = NULL;
 
-	dst->gtt_offset = vsnap->gtt_offset;
-	dst->gtt_size = vsnap->gtt_size;
-	dst->gtt_page_sizes = vsnap->page_sizes;
+	dst->gtt_offset = vsnap->vma_resource->start;
+	dst->gtt_size = vsnap->vma_resource->node_size;
+	dst->gtt_page_sizes = vsnap->vma_resource->page_sizes_gtt;
 	dst->unused = 0;
 
 	ret = -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 7097c5016431..1d4e448d22d9 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -298,7 +298,7 @@ static void __vma_bind(struct dma_fence_work *work)
 	struct i915_vma *vma = vw->vma;
 
 	vma->ops->bind_vma(vw->vm, &vw->stash,
-			   vma, vw->cache_level, vw->flags);
+			   vma->resource, vw->cache_level, vw->flags);
 }
 
 static void __vma_release(struct dma_fence_work *work)
@@ -375,6 +375,21 @@ static int i915_vma_verify_bind_complete(struct i915_vma *vma)
 #define i915_vma_verify_bind_complete(_vma) 0
 #endif
 
+I915_SELFTEST_EXPORT void
+i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
+				struct i915_vma *vma)
+{
+	struct drm_i915_gem_object *obj = vma->obj;
+
+	i915_vma_resource_init(vma_res, vma->pages, &vma->page_sizes,
+			       i915_gem_object_is_readonly(obj),
+			       i915_gem_object_is_lmem(obj),
+			       vma->private,
+			       vma->node.start,
+			       vma->node.size,
+			       vma->size);
+}
+
 /**
  * i915_vma_bind - Sets up PTEs for an VMA in it's corresponding address space.
  * @vma: VMA to map
@@ -432,7 +447,7 @@ int i915_vma_bind(struct i915_vma *vma,
 		GEM_WARN_ON(!vma_flags);
 		kfree(vma_res);
 	} else {
-		i915_vma_resource_init(vma_res);
+		i915_vma_resource_init_from_vma(vma_res, vma);
 		vma->resource = vma_res;
 	}
 	trace_i915_vma_bind(vma, bind_flags);
@@ -472,7 +487,8 @@ int i915_vma_bind(struct i915_vma *vma,
 			if (ret)
 				return ret;
 		}
-		vma->ops->bind_vma(vma->vm, NULL, vma, cache_level, bind_flags);
+		vma->ops->bind_vma(vma->vm, NULL, vma->resource, cache_level,
+				   bind_flags);
 	}
 
 	atomic_or(bind_flags, &vma->flags);
@@ -1778,7 +1794,7 @@ void __i915_vma_evict(struct i915_vma *vma)
 
 	if (likely(atomic_read(&vma->vm->open))) {
 		trace_i915_vma_unbind(vma);
-		vma->ops->unbind_vma(vma->vm, vma);
+		vma->ops->unbind_vma(vma->vm, vma->resource);
 	}
 	atomic_and(~(I915_VMA_BIND_MASK | I915_VMA_ERROR | I915_VMA_GGTT_WRITE),
 		   &vma->flags);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index de0f3e44cdfa..1df57ec832bd 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -339,12 +339,6 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma);
  */
 void i915_vma_unpin_iomap(struct i915_vma *vma);
 
-static inline struct page *i915_vma_first_page(struct i915_vma *vma)
-{
-	GEM_BUG_ON(!vma->pages);
-	return sg_page(vma->pages->sgl);
-}
-
 /**
  * i915_vma_pin_fence - pin fencing state
  * @vma: vma to pin fencing for
@@ -445,6 +439,11 @@ i915_vma_get_current_resource(struct i915_vma *vma)
 	return i915_vma_resource_get(vma->resource);
 }
 
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+void i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
+				     struct i915_vma *vma);
+#endif
+
 void i915_vma_module_exit(void);
 int i915_vma_module_init(void);
 
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.c b/drivers/gpu/drm/i915/i915_vma_resource.c
index 833e987bed2a..c86db89ab5d2 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.c
+++ b/drivers/gpu/drm/i915/i915_vma_resource.c
@@ -23,15 +23,12 @@ static struct dma_fence_ops unbind_fence_ops = {
 };
 
 /**
- * i915_vma_resource_init - Initialize a vma resource.
+ * __i915_vma_resource_init - Initialize a vma resource.
  * @vma_res: The vma resource to initialize
  *
- * Initializes a vma resource allocated using i915_vma_resource_alloc().
- * The reason for having separate allocate and initialize function is that
- * initialization may need to be performed from under a lock where
- * allocation is not allowed.
+ * Initializes the private members of a vma resource.
  */
-void i915_vma_resource_init(struct i915_vma_resource *vma_res)
+void __i915_vma_resource_init(struct i915_vma_resource *vma_res)
 {
 	spin_lock_init(&vma_res->lock);
 	dma_fence_init(&vma_res->unbind_fence, &unbind_fence_ops,
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.h b/drivers/gpu/drm/i915/i915_vma_resource.h
index 34744da23072..9872de58268b 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.h
+++ b/drivers/gpu/drm/i915/i915_vma_resource.h
@@ -9,6 +9,25 @@
 #include <linux/dma-fence.h>
 #include <linux/refcount.h>
 
+#include "i915_gem.h"
+
+struct i915_page_sizes {
+	/**
+	 * The sg mask of the pages sg_table. i.e the mask of
+	 * the lengths for each sg entry.
+	 */
+	unsigned int phys;
+
+	/**
+	 * The gtt page sizes we are allowed to use given the
+	 * sg mask and the supported page sizes. This will
+	 * express the smallest unit we can use for the whole
+	 * object, as well as the larger sizes we may be able
+	 * to use opportunistically.
+	 */
+	unsigned int sg;
+};
+
 /**
  * struct i915_vma_resource - Snapshotted unbind information.
  * @unbind_fence: Fence to mark unbinding complete. Note that this fence
@@ -20,6 +39,13 @@
  * @hold_count: Number of holders blocking the fence from finishing.
  * The vma itself is keeping a hold, which is released when unbind
  * is scheduled.
+ * @private: Bind backend private info.
+ * @start: Offset into the address space of bind range start.
+ * @node_size: Size of the allocated range manager node.
+ * @vma_size: Bind size.
+ * @page_sizes_gtt: Resulting page sizes from the bind operation.
+ * @bound_flags: Flags indicating binding status.
+ * @allocated: Backend private data. TODO: Should move into @private.
  *
  * The lifetime of a struct i915_vma_resource is from a binding request to
  * the actual possible asynchronous unbind has completed.
@@ -29,6 +55,32 @@ struct i915_vma_resource {
 	/* See above for description of the lock. */
 	spinlock_t lock;
 	refcount_t hold_count;
+
+	/**
+	 * struct i915_vma_bindinfo - Information needed for async bind
+	 * only but that can be dropped after the bind has taken place.
+	 * Consider making this a separate argument to the bind_vma
+	 * op, coalescing with other arguments like vm, stash, cache_level
+	 * and flags
+	 * @pages: The pages sg-table.
+	 * @page_sizes: Page sizes of the pages.
+	 * @readonly: Whether the vma should be bound read-only.
+	 * @lmem: Whether the vma points to lmem.
+	 */
+	struct i915_vma_bindinfo {
+		struct sg_table *pages;
+		struct i915_page_sizes page_sizes;
+		bool readonly:1;
+		bool lmem:1;
+	} bi;
+
+	void *private;
+	unsigned long start;
+	unsigned long node_size;
+	unsigned long vma_size;
+	u32 page_sizes_gtt;
+	u32 bound_flags;
+	bool allocated:1;
 };
 
 bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
@@ -41,6 +93,8 @@ struct i915_vma_resource *i915_vma_resource_alloc(void);
 
 struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res);
 
+void __i915_vma_resource_init(struct i915_vma_resource *vma_res);
+
 /**
  * i915_vma_resource_get - Take a reference on a vma resource
  * @vma_res: The vma resource on which to take a reference.
@@ -63,8 +117,47 @@ static inline void i915_vma_resource_put(struct i915_vma_resource *vma_res)
 	dma_fence_put(&vma_res->unbind_fence);
 }
 
-#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-void i915_vma_resource_init(struct i915_vma_resource *vma_res);
-#endif
+/**
+ * i915_vma_resource_init - Initialize a vma resource.
+ * @vma_res: The vma resource to initialize
+ * @pages: The pages sg-table.
+ * @page_sizes: Page sizes of the pages.
+ * @readonly: Whether the vma should be bound read-only.
+ * @lmem: Whether the vma points to lmem.
+ * @private: Bind backend private info.
+ * @start: Offset into the address space of bind range start.
+ * @node_size: Size of the allocated range manager node.
+ * @size: Bind size.
+ *
+ * Initializes a vma resource allocated using i915_vma_resource_alloc().
+ * The reason for having separate allocate and initialize function is that
+ * initialization may need to be performed from under a lock where
+ * allocation is not allowed.
+ */
+static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
+					  struct sg_table *pages,
+					  const struct i915_page_sizes *page_sizes,
+					  bool readonly,
+					  bool lmem,
+					  void *private,
+					  unsigned long start,
+					  unsigned long node_size,
+					  unsigned long size)
+{
+	__i915_vma_resource_init(vma_res);
+	vma_res->bi.pages = pages;
+	vma_res->bi.page_sizes = *page_sizes;
+	vma_res->bi.readonly = readonly;
+	vma_res->bi.lmem = lmem;
+	vma_res->private = private;
+	vma_res->start = start;
+	vma_res->node_size = node_size;
+	vma_res->vma_size = size;
+}
+
+static inline void i915_vma_resource_fini(struct i915_vma_resource *vma_res)
+{
+	GEM_BUG_ON(refcount_read(&vma_res->hold_count) != 1);
+}
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
index f7333c7a2f5e..69f62c1ca967 100644
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.c
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.c
@@ -24,11 +24,7 @@ void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
 		assert_object_held(vma->obj);
 
 	vsnap->name = name;
-	vsnap->size = vma->size;
 	vsnap->obj_size = vma->obj->base.size;
-	vsnap->gtt_offset = vma->node.start;
-	vsnap->gtt_size = vma->node.size;
-	vsnap->page_sizes = vma->page_sizes.gtt;
 	vsnap->pages = vma->pages;
 	vsnap->pages_rsgt = NULL;
 	vsnap->mr = NULL;
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
index e74588dd676b..1b08ce9f8576 100644
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.h
+++ b/drivers/gpu/drm/i915/i915_vma_snapshot.h
@@ -23,31 +23,23 @@ struct sg_table;
 
 /**
  * struct i915_vma_snapshot - Snapshot of vma metadata.
- * @size: The vma size in bytes.
  * @obj_size: The size of the underlying object in bytes.
- * @gtt_offset: The gtt offset the vma is bound to.
- * @gtt_size: The size in bytes allocated for the vma in the GTT.
  * @pages: The struct sg_table pointing to the pages bound.
  * @pages_rsgt: The refcounted sg_table holding the reference for @pages if any.
  * @mr: The memory region pointed for the pages bound.
  * @kref: Reference for this structure.
  * @vma_resource: Pointer to the vma resource representing the vma binding.
- * @page_sizes: The vma GTT page sizes information.
  * @onstack: Whether the structure shouldn't be freed on final put.
  * @present: Whether the structure is present and initialized.
  */
 struct i915_vma_snapshot {
 	const char *name;
-	size_t size;
 	size_t obj_size;
-	size_t gtt_offset;
-	size_t gtt_size;
 	struct sg_table *pages;
 	struct i915_refct_sgt *pages_rsgt;
 	struct intel_memory_region *mr;
 	struct kref kref;
 	struct i915_vma_resource *vma_resource;
-	u32 page_sizes;
 	bool onstack:1;
 	bool present:1;
 };
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 54be880e55c3..70b5c47890b9 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -239,11 +239,11 @@ static int lowlevel_hole(struct i915_address_space *vm,
 			 unsigned long end_time)
 {
 	I915_RND_STATE(seed_prng);
-	struct i915_vma *mock_vma;
+	struct i915_vma_resource *mock_vma_res;
 	unsigned int size;
 
-	mock_vma = kzalloc(sizeof(*mock_vma), GFP_KERNEL);
-	if (!mock_vma)
+	mock_vma_res = kzalloc(sizeof(*mock_vma_res), GFP_KERNEL);
+	if (!mock_vma_res)
 		return -ENOMEM;
 
 	/* Keep creating larger objects until one cannot fit into the hole */
@@ -269,7 +269,7 @@ static int lowlevel_hole(struct i915_address_space *vm,
 				break;
 		} while (count >>= 1);
 		if (!count) {
-			kfree(mock_vma);
+			kfree(mock_vma_res);
 			return -ENOMEM;
 		}
 		GEM_BUG_ON(!order);
@@ -343,12 +343,12 @@ static int lowlevel_hole(struct i915_address_space *vm,
 					break;
 			}
 
-			mock_vma->pages = obj->mm.pages;
-			mock_vma->node.size = BIT_ULL(size);
-			mock_vma->node.start = addr;
+			mock_vma_res->bi.pages = obj->mm.pages;
+			mock_vma_res->node_size = BIT_ULL(size);
+			mock_vma_res->start = addr;
 
 			with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref)
-				vm->insert_entries(vm, mock_vma,
+			  vm->insert_entries(vm, mock_vma_res,
 						   I915_CACHE_NONE, 0);
 		}
 		count = n;
@@ -371,7 +371,7 @@ static int lowlevel_hole(struct i915_address_space *vm,
 		cleanup_freed_objects(vm->i915);
 	}
 
-	kfree(mock_vma);
+	kfree(mock_vma_res);
 	return 0;
 }
 
@@ -1280,6 +1280,7 @@ static void track_vma_bind(struct i915_vma *vma)
 	atomic_set(&vma->pages_count, I915_VMA_PAGES_ACTIVE);
 	__i915_gem_object_pin_pages(obj);
 	vma->pages = obj->mm.pages;
+	vma->resource->bi.pages = vma->pages;
 
 	mutex_lock(&vma->vm->mutex);
 	list_add_tail(&vma->vm_link, &vma->vm->bound_list);
@@ -1354,7 +1355,7 @@ static int reserve_gtt_with_resource(struct i915_vma *vma, u64 offset)
 				   obj->cache_level,
 				   0);
 	if (!err) {
-		i915_vma_resource_init(vma_res);
+		i915_vma_resource_init_from_vma(vma_res, vma);
 		vma->resource = vma_res;
 	} else {
 		kfree(vma_res);
@@ -1533,7 +1534,7 @@ static int insert_gtt_with_resource(struct i915_vma *vma)
 	err = i915_gem_gtt_insert(vm, &vma->node, obj->base.size, 0,
 				  obj->cache_level, 0, vm->total, 0);
 	if (!err) {
-		i915_vma_resource_init(vma_res);
+		i915_vma_resource_init_from_vma(vma_res, vma);
 		vma->resource = vma_res;
 	} else {
 		kfree(vma_res);
@@ -1958,6 +1959,7 @@ static int igt_cs_tlb(void *arg)
 			struct i915_vm_pt_stash stash = {};
 			struct i915_request *rq;
 			struct i915_gem_ww_ctx ww;
+			struct i915_vma_resource *vma_res;
 			u64 offset;
 
 			offset = igt_random_offset(&prng,
@@ -1978,6 +1980,13 @@ static int igt_cs_tlb(void *arg)
 			if (err)
 				goto end;
 
+			vma_res = i915_vma_resource_alloc();
+			if (IS_ERR(vma_res)) {
+				i915_vma_put_pages(vma);
+				err = PTR_ERR(vma_res);
+				goto end;
+			}
+
 			i915_gem_ww_ctx_init(&ww, false);
 retry:
 			err = i915_vm_lock_objects(vm, &ww);
@@ -1999,33 +2008,41 @@ static int igt_cs_tlb(void *arg)
 					goto retry;
 			}
 			i915_gem_ww_ctx_fini(&ww);
-			if (err)
+			if (err) {
+				kfree(vma_res);
 				goto end;
+			}
 
+			i915_vma_resource_init_from_vma(vma_res, vma);
 			/* Prime the TLB with the dummy pages */
 			for (i = 0; i < count; i++) {
-				vma->node.start = offset + i * PAGE_SIZE;
-				vm->insert_entries(vm, vma, I915_CACHE_NONE, 0);
+				vma_res->start = offset + i * PAGE_SIZE;
+				vm->insert_entries(vm, vma_res, I915_CACHE_NONE,
+						   0);
 
-				rq = submit_batch(ce, vma->node.start);
+				rq = submit_batch(ce, vma_res->start);
 				if (IS_ERR(rq)) {
 					err = PTR_ERR(rq);
+					i915_vma_resource_fini(vma_res);
+					kfree(vma_res);
 					goto end;
 				}
 				i915_request_put(rq);
 			}
-
+			i915_vma_resource_fini(vma_res);
 			i915_vma_put_pages(vma);
 
 			err = context_sync(ce);
 			if (err) {
 				pr_err("%s: dummy setup timed out\n",
 				       ce->engine->name);
+				kfree(vma_res);
 				goto end;
 			}
 
 			vma = i915_vma_instance(act, vm, NULL);
 			if (IS_ERR(vma)) {
+				kfree(vma_res);
 				err = PTR_ERR(vma);
 				goto end;
 			}
@@ -2033,19 +2050,22 @@ static int igt_cs_tlb(void *arg)
 			i915_gem_object_lock(act, NULL);
 			err = i915_vma_get_pages(vma);
 			i915_gem_object_unlock(act);
-			if (err)
+			if (err) {
+				kfree(vma_res);
 				goto end;
+			}
 
+			i915_vma_resource_init_from_vma(vma_res, vma);
 			/* Replace the TLB with target batches */
 			for (i = 0; i < count; i++) {
 				struct i915_request *rq;
 				u32 *cs = batch + i * 64 / sizeof(*cs);
 				u64 addr;
 
-				vma->node.start = offset + i * PAGE_SIZE;
-				vm->insert_entries(vm, vma, I915_CACHE_NONE, 0);
+				vma_res->start = offset + i * PAGE_SIZE;
+				vm->insert_entries(vm, vma_res, I915_CACHE_NONE, 0);
 
-				addr = vma->node.start + i * 64;
+				addr = vma_res->start + i * 64;
 				cs[4] = MI_NOOP;
 				cs[6] = lower_32_bits(addr);
 				cs[7] = upper_32_bits(addr);
@@ -2054,6 +2074,8 @@ static int igt_cs_tlb(void *arg)
 				rq = submit_batch(ce, addr);
 				if (IS_ERR(rq)) {
 					err = PTR_ERR(rq);
+					i915_vma_resource_fini(vma_res);
+					kfree(vma_res);
 					goto end;
 				}
 
@@ -2070,6 +2092,8 @@ static int igt_cs_tlb(void *arg)
 			}
 			end_spin(batch, count - 1);
 
+			i915_vma_resource_fini(vma_res);
+			kfree(vma_res);
 			i915_vma_put_pages(vma);
 
 			err = context_sync(ce);
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index 1802baf80a17..d40519e3ca38 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -33,23 +33,23 @@ static void mock_insert_page(struct i915_address_space *vm,
 }
 
 static void mock_insert_entries(struct i915_address_space *vm,
-				struct i915_vma *vma,
+				struct i915_vma_resource *vma_res,
 				enum i915_cache_level level, u32 flags)
 {
 }
 
 static void mock_bind_ppgtt(struct i915_address_space *vm,
 			    struct i915_vm_pt_stash *stash,
-			    struct i915_vma *vma,
+			    struct i915_vma_resource *vma_res,
 			    enum i915_cache_level cache_level,
 			    u32 flags)
 {
 	GEM_BUG_ON(flags & I915_VMA_GLOBAL_BIND);
-	set_bit(I915_VMA_LOCAL_BIND_BIT, __i915_vma_flags(vma));
+	vma_res->bound_flags |= flags;
 }
 
 static void mock_unbind_ppgtt(struct i915_address_space *vm,
-			      struct i915_vma *vma)
+			      struct i915_vma_resource *vma_res)
 {
 }
 
@@ -93,14 +93,14 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
 
 static void mock_bind_ggtt(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash,
-			   struct i915_vma *vma,
+			   struct i915_vma_resource *vma_res,
 			   enum i915_cache_level cache_level,
 			   u32 flags)
 {
 }
 
 static void mock_unbind_ggtt(struct i915_address_space *vm,
-			     struct i915_vma *vma)
+			     struct i915_vma_resource *vma_res)
 {
 }
 
-- 
2.31.1


* [PATCH v5 3/6] drm/i915: Don't pin the object pages during pending vma binds
  2022-01-04 12:51 ` [Intel-gfx] " Thomas Hellström
@ 2022-01-04 12:51   ` Thomas Hellström
  -1 siblings, 0 replies; 32+ messages in thread
From: Thomas Hellström @ 2022-01-04 12:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

A pin-count is already held by vma->pages, so taking an additional pin
during async binds is not necessary.

When we introduce async unbinding, we have other means of keeping the
object pages alive.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_vma.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 1d4e448d22d9..8fa3e0b2fe26 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -305,10 +305,8 @@ static void __vma_release(struct dma_fence_work *work)
 {
 	struct i915_vma_work *vw = container_of(work, typeof(*vw), base);
 
-	if (vw->pinned) {
-		__i915_gem_object_unpin_pages(vw->pinned);
+	if (vw->pinned)
 		i915_gem_object_put(vw->pinned);
-	}
 
 	i915_vm_free_pt_stash(vw->vm, &vw->stash);
 	i915_vm_put(vw->vm);
@@ -477,7 +475,6 @@ int i915_vma_bind(struct i915_vma *vma,
 
 		work->base.dma.error = 0; /* enable the queue_work() */
 
-		__i915_gem_object_pin_pages(vma->obj);
 		work->pinned = i915_gem_object_get(vma->obj);
 	} else {
 		if (vma->obj) {
-- 
2.31.1


* [PATCH v5 4/6] drm/i915: Use vma resources for async unbinding
  2022-01-04 12:51 ` [Intel-gfx] " Thomas Hellström
@ 2022-01-04 12:51   ` Thomas Hellström
  -1 siblings, 0 replies; 32+ messages in thread
From: Thomas Hellström @ 2022-01-04 12:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

Implement async (non-blocking) unbinding by not syncing the vma before
calling unbind on the vma_resource.
Add the resulting unbind fence to the object's dma_resv, from where it
is picked up by the TTM migration code.
Ideally these unbind fences should be coalesced with the migration blit
fence, since unbind and blit can certainly go on in parallel, to avoid
stalling the blit while it waits for the unbind. However, since we don't
yet have a reasonable data structure for coalescing fences and attaching
the resulting fence to a timeline, we defer that for now.
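
The intended flow is roughly the following minimal sketch.
sketch_unbind_async() is an illustrative name only; locking, rpm
handling and the drm_mm node removal are left out, and the real
implementation is i915_vma_unbind_async() in the diff below.

static int sketch_unbind_async(struct i915_vma *vma)
{
	struct dma_fence *fence;

	/* Unbind without syncing; the returned fence signals completion. */
	fence = __i915_vma_evict(vma, true);
	if (IS_ERR_OR_NULL(fence))
		return PTR_ERR_OR_ZERO(fence);

	/*
	 * Publish the unbind fence on the object's dma_resv so that the
	 * TTM migration code waits for it before doing the blit.
	 */
	dma_resv_add_shared_fence(vma->obj->base.resv, fence);
	dma_fence_put(fence);

	return 0;
}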

Note that with async unbinding, even though the unbind waits for the
preceding bind to complete before unbinding, the vma itself might have
been destroyed in the meantime, clearing the vma pages. Therefore we can
only allow async unbinding if we have a refcounted sg-list, and keep a
refcount on it so that the vma resource pages stay intact until binding
has taken place. If this condition is not met, a request for an async
unbind is diverted to a sync unbind.
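
The fallback rule is roughly the following simplified sketch.
sketch_object_unbind_one() is an illustrative name; the trylock,
interruptible locking and activity checks of the real
i915_gem_object_unbind() / i915_vma_unbind_async() in the diff below
are omitted.

static int sketch_object_unbind_one(struct i915_vma *vma, unsigned int flags)
{
	struct drm_i915_gem_object *obj = vma->obj;
	int ret = -EBUSY;

	/* Async unbind needs a refcounted sg-list to keep the pages alive. */
	if ((flags & I915_GEM_OBJECT_UNBIND_ASYNC) && obj->mm.rsgt)
		ret = i915_vma_unbind_async(vma, false);

	if (ret == -EBUSY) {
		/* Divert to a synchronous unbind under the vm mutex. */
		mutex_lock(&vma->vm->mutex);
		ret = __i915_vma_unbind(vma);
		mutex_unlock(&vma->vm->mutex);
	}

	return ret;
}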

v2:
- Use a separate kmem_cache for vma resources for now to isolate their
  memory allocation and aid debugging.
- Move the check for vm closed to the actual unbinding thread. Regardless
  of whether the vm is closed, we need the unbind fence to properly wait
  for capture.
- Clear vma_res::vm on unbind and update its documentation.
v4:
- Take cache coloring into account when searching for vma resources
  pending unbind. (Matthew Auld)
v5:
- Fix timeout and error check in i915_vma_resource_bind_dep_await().
- Avoid taking a reference on the object for async binding if
  async unbind capable.
- Fix braces around a single-line if statement.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c |  11 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c         |   2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c          |   4 +
 drivers/gpu/drm/i915/gt/intel_gtt.h          |   3 +
 drivers/gpu/drm/i915/i915_drv.h              |   1 +
 drivers/gpu/drm/i915/i915_gem.c              |  12 +-
 drivers/gpu/drm/i915/i915_module.c           |   3 +
 drivers/gpu/drm/i915/i915_vma.c              | 204 +++++++++--
 drivers/gpu/drm/i915/i915_vma.h              |   3 +-
 drivers/gpu/drm/i915/i915_vma_resource.c     | 354 +++++++++++++++++--
 drivers/gpu/drm/i915/i915_vma_resource.h     |  48 +++
 11 files changed, 578 insertions(+), 67 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
index ee9612a3ee5e..0f514435b9c5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
@@ -142,7 +142,16 @@ int i915_ttm_move_notify(struct ttm_buffer_object *bo)
 	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
 	int ret;
 
-	ret = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE);
+	/*
+	 * Note: The async unbinding here will actually transform the
+	 * blocking wait for unbind into a wait before finally submitting
+	 * evict / migration blit and thus stall the migration timeline
+	 * which may not be good for overall throughput. We should make
+	 * sure we await the unbind fences *after* the migration blit
+	 * instead of *before* as we currently do.
+	 */
+	ret = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE |
+				     I915_GEM_OBJECT_UNBIND_ASYNC);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 0137b6af0973..ae7bbd8914c1 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -142,7 +142,7 @@ void i915_ggtt_suspend_vm(struct i915_address_space *vm)
 			continue;
 
 		if (!i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) {
-			__i915_vma_evict(vma);
+			__i915_vma_evict(vma, false);
 			drm_mm_remove_node(&vma->node);
 		}
 	}
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index a94be0306464..46be4197b93f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -161,6 +161,9 @@ static void __i915_vm_release(struct work_struct *work)
 	struct i915_address_space *vm =
 		container_of(work, struct i915_address_space, release_work);
 
+	/* Synchronize async unbinds. */
+	i915_vma_resource_bind_dep_sync_all(vm);
+
 	vm->cleanup(vm);
 	i915_address_space_fini(vm);
 
@@ -189,6 +192,7 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	if (!kref_read(&vm->resv_ref))
 		kref_init(&vm->resv_ref);
 
+	vm->pending_unbind = RB_ROOT_CACHED;
 	INIT_WORK(&vm->release_work, __i915_vm_release);
 	atomic_set(&vm->open, 1);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 676b839d1a34..8073438b67c8 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -265,6 +265,9 @@ struct i915_address_space {
 	/* Flags used when creating page-table objects for this vm */
 	unsigned long lmem_pt_obj_flags;
 
+	/* Interval tree for pending unbind vma resources */
+	struct rb_root_cached pending_unbind;
+
 	struct drm_i915_gem_object *
 		(*alloc_pt_dma)(struct i915_address_space *vm, int sz);
 	struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index beeb42a14aae..63712b2a729e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1670,6 +1670,7 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 #define I915_GEM_OBJECT_UNBIND_BARRIER BIT(1)
 #define I915_GEM_OBJECT_UNBIND_TEST BIT(2)
 #define I915_GEM_OBJECT_UNBIND_VM_TRYLOCK BIT(3)
+#define I915_GEM_OBJECT_UNBIND_ASYNC BIT(4)
 
 void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 915bf431f320..d6d9b5c13299 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -155,10 +155,16 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 		spin_unlock(&obj->vma.lock);
 
 		if (vma) {
+			bool vm_trylock = !!(flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK);
 			ret = -EBUSY;
-			if (flags & I915_GEM_OBJECT_UNBIND_ACTIVE ||
-			    !i915_vma_is_active(vma)) {
-				if (flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK) {
+			if (flags & I915_GEM_OBJECT_UNBIND_ASYNC) {
+				assert_object_held(vma->obj);
+				ret = i915_vma_unbind_async(vma, vm_trylock);
+			}
+
+			if (ret == -EBUSY && (flags & I915_GEM_OBJECT_UNBIND_ACTIVE ||
+					      !i915_vma_is_active(vma))) {
+				if (vm_trylock) {
 					if (mutex_trylock(&vma->vm->mutex)) {
 						ret = __i915_vma_unbind(vma);
 						mutex_unlock(&vma->vm->mutex);
diff --git a/drivers/gpu/drm/i915/i915_module.c b/drivers/gpu/drm/i915/i915_module.c
index f6bcd2f89257..a8f175960b34 100644
--- a/drivers/gpu/drm/i915/i915_module.c
+++ b/drivers/gpu/drm/i915/i915_module.c
@@ -17,6 +17,7 @@
 #include "i915_scheduler.h"
 #include "i915_selftest.h"
 #include "i915_vma.h"
+#include "i915_vma_resource.h"
 
 static int i915_check_nomodeset(void)
 {
@@ -64,6 +65,8 @@ static const struct {
 	  .exit = i915_scheduler_module_exit },
 	{ .init = i915_vma_module_init,
 	  .exit = i915_vma_module_exit },
+	{ .init = i915_vma_resource_module_init,
+	  .exit = i915_vma_resource_module_exit },
 	{ .init = i915_mock_selftests },
 	{ .init = i915_pmu_init,
 	  .exit = i915_pmu_exit },
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 8fa3e0b2fe26..b886fe649e5c 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -285,9 +285,10 @@ struct i915_vma_work {
 	struct dma_fence_work base;
 	struct i915_address_space *vm;
 	struct i915_vm_pt_stash stash;
-	struct i915_vma *vma;
+	struct i915_vma_resource *vma_res;
 	struct drm_i915_gem_object *pinned;
 	struct i915_sw_dma_fence_cb cb;
+	struct i915_refct_sgt *rsgt;
 	enum i915_cache_level cache_level;
 	unsigned int flags;
 };
@@ -295,10 +296,11 @@ struct i915_vma_work {
 static void __vma_bind(struct dma_fence_work *work)
 {
 	struct i915_vma_work *vw = container_of(work, typeof(*vw), base);
-	struct i915_vma *vma = vw->vma;
+	struct i915_vma_resource *vma_res = vw->vma_res;
+
+	vma_res->ops->bind_vma(vma_res->vm, &vw->stash,
+			       vma_res, vw->cache_level, vw->flags);
 
-	vma->ops->bind_vma(vw->vm, &vw->stash,
-			   vma->resource, vw->cache_level, vw->flags);
 }
 
 static void __vma_release(struct dma_fence_work *work)
@@ -310,6 +312,10 @@ static void __vma_release(struct dma_fence_work *work)
 
 	i915_vm_free_pt_stash(vw->vm, &vw->stash);
 	i915_vm_put(vw->vm);
+	if (vw->vma_res)
+		i915_vma_resource_put(vw->vma_res);
+	if (vw->rsgt)
+		i915_refct_sgt_put(vw->rsgt);
 }
 
 static const struct dma_fence_work_ops bind_ops = {
@@ -379,13 +385,11 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
 {
 	struct drm_i915_gem_object *obj = vma->obj;
 
-	i915_vma_resource_init(vma_res, vma->pages, &vma->page_sizes,
+	i915_vma_resource_init(vma_res, vma->vm, vma->pages, &vma->page_sizes,
 			       i915_gem_object_is_readonly(obj),
 			       i915_gem_object_is_lmem(obj),
-			       vma->private,
-			       vma->node.start,
-			       vma->node.size,
-			       vma->size);
+			       vma->ops, vma->private, vma->node.start,
+			       vma->node.size, vma->size);
 }
 
 /**
@@ -409,6 +413,7 @@ int i915_vma_bind(struct i915_vma *vma,
 {
 	u32 bind_flags;
 	u32 vma_flags;
+	int ret;
 
 	lockdep_assert_held(&vma->vm->mutex);
 	GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
@@ -417,12 +422,12 @@ int i915_vma_bind(struct i915_vma *vma,
 	if (GEM_DEBUG_WARN_ON(range_overflows(vma->node.start,
 					      vma->node.size,
 					      vma->vm->total))) {
-		kfree(vma_res);
+		i915_vma_resource_free(vma_res);
 		return -ENODEV;
 	}
 
 	if (GEM_DEBUG_WARN_ON(!flags)) {
-		kfree(vma_res);
+		i915_vma_resource_free(vma_res);
 		return -EINVAL;
 	}
 
@@ -434,12 +439,30 @@ int i915_vma_bind(struct i915_vma *vma,
 
 	bind_flags &= ~vma_flags;
 	if (bind_flags == 0) {
-		kfree(vma_res);
+		i915_vma_resource_free(vma_res);
 		return 0;
 	}
 
 	GEM_BUG_ON(!atomic_read(&vma->pages_count));
 
+	/* Wait for or await async unbinds touching our range */
+	if (work && bind_flags & vma->vm->bind_async_flags)
+		ret = i915_vma_resource_bind_dep_await(vma->vm,
+						       &work->base.chain,
+						       vma->node.start,
+						       vma->node.size,
+						       true,
+						       GFP_NOWAIT |
+						       __GFP_RETRY_MAYFAIL |
+						       __GFP_NOWARN);
+	else
+		ret = i915_vma_resource_bind_dep_sync(vma->vm, vma->node.start,
+						      vma->node.size, true);
+	if (ret) {
+		i915_vma_resource_free(vma_res);
+		return ret;
+	}
+
 	if (vma->resource || !vma_res) {
 		/* Rebinding with an additional I915_VMA_*_BIND */
 		GEM_WARN_ON(!vma_flags);
@@ -452,9 +475,11 @@ int i915_vma_bind(struct i915_vma *vma,
 	if (work && bind_flags & vma->vm->bind_async_flags) {
 		struct dma_fence *prev;
 
-		work->vma = vma;
+		work->vma_res = i915_vma_resource_get(vma->resource);
 		work->cache_level = cache_level;
 		work->flags = bind_flags;
+		if (vma->obj->mm.rsgt)
+			work->rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
 
 		/*
 		 * Note we only want to chain up to the migration fence on
@@ -475,14 +500,24 @@ int i915_vma_bind(struct i915_vma *vma,
 
 		work->base.dma.error = 0; /* enable the queue_work() */
 
-		work->pinned = i915_gem_object_get(vma->obj);
+		/*
+		 * If we don't have the refcounted pages list, keep a reference
+		 * on the object to avoid waiting for the async bind to
+		 * complete in the object destruction path.
+		 */
+		if (!work->rsgt)
+			work->pinned = i915_gem_object_get(vma->obj);
 	} else {
 		if (vma->obj) {
 			int ret;
 
 			ret = i915_gem_object_wait_moving_fence(vma->obj, true);
-			if (ret)
+			if (ret) {
+				i915_vma_resource_free(vma->resource);
+				vma->resource = NULL;
+
 				return ret;
+			}
 		}
 		vma->ops->bind_vma(vma->vm, NULL, vma->resource, cache_level,
 				   bind_flags);
@@ -1755,8 +1790,9 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
 	return 0;
 }
 
-void __i915_vma_evict(struct i915_vma *vma)
+struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async)
 {
+	struct i915_vma_resource *vma_res = vma->resource;
 	struct dma_fence *unbind_fence;
 
 	GEM_BUG_ON(i915_vma_is_pinned(vma));
@@ -1789,27 +1825,39 @@ void __i915_vma_evict(struct i915_vma *vma)
 	GEM_BUG_ON(vma->fence);
 	GEM_BUG_ON(i915_vma_has_userfault(vma));
 
-	if (likely(atomic_read(&vma->vm->open))) {
-		trace_i915_vma_unbind(vma);
-		vma->ops->unbind_vma(vma->vm, vma->resource);
-	}
+	/* Object backend must be async capable. */
+	GEM_WARN_ON(async && !vma->obj->mm.rsgt);
+
+	/* If vm is not open, unbind is a nop. */
+	vma_res->needs_wakeref = i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND) &&
+		atomic_read(&vma->vm->open);
+	trace_i915_vma_unbind(vma);
+
+	unbind_fence = i915_vma_resource_unbind(vma_res);
+	vma->resource = NULL;
+
 	atomic_and(~(I915_VMA_BIND_MASK | I915_VMA_ERROR | I915_VMA_GGTT_WRITE),
 		   &vma->flags);
 
-	unbind_fence = i915_vma_resource_unbind(vma->resource);
-	i915_vma_resource_put(vma->resource);
-	vma->resource = NULL;
+	/* Object backend must be async capable. */
+	GEM_WARN_ON(async && !vma->obj->mm.rsgt);
 
 	i915_vma_detach(vma);
-	vma_unbind_pages(vma);
+
+	if (!async && unbind_fence) {
+		dma_fence_wait(unbind_fence, false);
+		dma_fence_put(unbind_fence);
+		unbind_fence = NULL;
+	}
 
 	/*
-	 * This uninterruptible wait under the vm mutex is currently
-	 * only ever blocking while the vma is being captured from.
-	 * With async unbinding, this wait here will be removed.
+	 * Binding itself may not have completed until the unbind fence signals,
+	 * so don't drop the pages until that happens, unless the resource is
+	 * async_capable.
 	 */
-	dma_fence_wait(unbind_fence, false);
-	dma_fence_put(unbind_fence);
+
+	vma_unbind_pages(vma);
+	return unbind_fence;
 }
 
 int __i915_vma_unbind(struct i915_vma *vma)
@@ -1836,12 +1884,46 @@ int __i915_vma_unbind(struct i915_vma *vma)
 		return ret;
 
 	GEM_BUG_ON(i915_vma_is_active(vma));
-	__i915_vma_evict(vma);
+	__i915_vma_evict(vma, false);
 
 	drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
 	return 0;
 }
 
+static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma)
+{
+	struct dma_fence *fence;
+
+	lockdep_assert_held(&vma->vm->mutex);
+
+	if (!drm_mm_node_allocated(&vma->node))
+		return NULL;
+
+	if (i915_vma_is_pinned(vma))
+		return ERR_PTR(-EAGAIN);
+
+	/*
+	 * We probably need to replace this with awaiting the fences of the
+	 * object's dma_resv when the vma active goes away. When doing that
+	 * we need to be careful to not add the vma_resource unbind fence
+	 * immediately to the object's dma_resv, because then unbinding
+	 * the next vma from the object, in case there are many, will
+	 * actually await the unbinding of the previous vmas, which is
+	 * undesirable.
+	 */
+	if (i915_sw_fence_await_active(&vma->resource->chain, &vma->active,
+				       I915_ACTIVE_AWAIT_EXCL |
+				       I915_ACTIVE_AWAIT_ACTIVE) < 0) {
+		return ERR_PTR(-EBUSY);
+	}
+
+	fence = __i915_vma_evict(vma, true);
+
+	drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
+
+	return fence;
+}
+
 int i915_vma_unbind(struct i915_vma *vma)
 {
 	struct i915_address_space *vm = vma->vm;
@@ -1878,6 +1960,68 @@ int i915_vma_unbind(struct i915_vma *vma)
 	return err;
 }
 
+int i915_vma_unbind_async(struct i915_vma *vma, bool trylock_vm)
+{
+	struct drm_i915_gem_object *obj = vma->obj;
+	struct i915_address_space *vm = vma->vm;
+	intel_wakeref_t wakeref = 0;
+	struct dma_fence *fence;
+	int err;
+
+	/*
+	 * We need the dma-resv lock since we add the
+	 * unbind fence to the dma-resv object.
+	 */
+	assert_object_held(obj);
+
+	if (!drm_mm_node_allocated(&vma->node))
+		return 0;
+
+	if (i915_vma_is_pinned(vma)) {
+		vma_print_allocator(vma, "is pinned");
+		return -EAGAIN;
+	}
+
+	if (!obj->mm.rsgt)
+		return -EBUSY;
+
+	err = dma_resv_reserve_shared(obj->base.resv, 1);
+	if (err)
+		return -EBUSY;
+
+	/*
+	 * It would be great if we could grab this wakeref from the
+	 * async unbind work if needed, but we can't because it uses
+	 * kmalloc and it's in the dma-fence signalling critical path.
+	 */
+	if (i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND))
+		wakeref = intel_runtime_pm_get(&vm->i915->runtime_pm);
+
+	if (trylock_vm && !mutex_trylock(&vm->mutex)) {
+		err = -EBUSY;
+		goto out_rpm;
+	} else if (!trylock_vm) {
+		err = mutex_lock_interruptible_nested(&vm->mutex, !wakeref);
+		if (err)
+			goto out_rpm;
+	}
+
+	fence = __i915_vma_unbind_async(vma);
+	mutex_unlock(&vm->mutex);
+	if (IS_ERR_OR_NULL(fence)) {
+		err = PTR_ERR_OR_ZERO(fence);
+		goto out_rpm;
+	}
+
+	dma_resv_add_shared_fence(obj->base.resv, fence);
+	dma_fence_put(fence);
+
+out_rpm:
+	if (wakeref)
+		intel_runtime_pm_put(&vm->i915->runtime_pm, wakeref);
+	return err;
+}
+
 struct i915_vma *i915_vma_make_unshrinkable(struct i915_vma *vma)
 {
 	i915_gem_object_make_unshrinkable(vma->obj);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 1df57ec832bd..a560bae04e7e 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -213,9 +213,10 @@ bool i915_vma_misplaced(const struct i915_vma *vma,
 			u64 size, u64 alignment, u64 flags);
 void __i915_vma_set_map_and_fenceable(struct i915_vma *vma);
 void i915_vma_revoke_mmap(struct i915_vma *vma);
-void __i915_vma_evict(struct i915_vma *vma);
+struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async);
 int __i915_vma_unbind(struct i915_vma *vma);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
+int __must_check i915_vma_unbind_async(struct i915_vma *vma, bool trylock_vm);
 void i915_vma_unlink_ctx(struct i915_vma *vma);
 void i915_vma_close(struct i915_vma *vma);
 void i915_vma_reopen(struct i915_vma *vma);
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.c b/drivers/gpu/drm/i915/i915_vma_resource.c
index c86db89ab5d2..3dfb3c6731f8 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.c
+++ b/drivers/gpu/drm/i915/i915_vma_resource.c
@@ -2,39 +2,44 @@
 /*
  * Copyright © 2021 Intel Corporation
  */
+
+#include <linux/interval_tree_generic.h>
 #include <linux/slab.h>
 
+#include "i915_sw_fence.h"
 #include "i915_vma_resource.h"
+#include "i915_drv.h"
 
-/* Callbacks for the unbind dma-fence. */
-static const char *get_driver_name(struct dma_fence *fence)
-{
-	return "vma unbind fence";
-}
+#include "gt/intel_gtt.h"
 
-static const char *get_timeline_name(struct dma_fence *fence)
-{
-	return "unbound";
-}
-
-static struct dma_fence_ops unbind_fence_ops = {
-	.get_driver_name = get_driver_name,
-	.get_timeline_name = get_timeline_name,
-};
+static struct kmem_cache *slab_vma_resources;
 
 /**
- * __i915_vma_resource_init - Initialize a vma resource.
- * @vma_res: The vma resource to initialize
+ * DOC:
+ * We use a per-vm interval tree to keep track of vma_resources
+ * scheduled for unbind but not yet unbound. The tree is protected by
+ * the vm mutex, and nodes are removed just after the unbind fence signals.
+ * The removal takes the vm mutex from a kernel thread which we need to
+ * keep in mind so that we don't grab the mutex and try to wait for all
+ * pending unbinds to complete, because that will temporarily block many
+ * of the workqueue threads, and people will get angry.
  *
- * Initializes the private members of a vma resource.
+ * We should consider using a single ordered fence per VM instead but that
+ * requires ordering the unbinds and might introduce unnecessary waiting
+ * for unrelated unbinds. The amount of code will probably be roughly the same
+ * due to the simplicity of using the interval tree interface.
+ *
+ * Another drawback of this interval tree is that the complexity of insertion
+ * and removal of fences increases as O(ln(pending_unbinds)) instead of
+ * O(1) for a single fence without interval tree.
  */
-void __i915_vma_resource_init(struct i915_vma_resource *vma_res)
-{
-	spin_lock_init(&vma_res->lock);
-	dma_fence_init(&vma_res->unbind_fence, &unbind_fence_ops,
-		       &vma_res->lock, 0, 0);
-	refcount_set(&vma_res->hold_count, 1);
-}
+#define VMA_RES_START(_node) ((_node)->start)
+#define VMA_RES_LAST(_node) ((_node)->start + (_node)->node_size - 1)
+INTERVAL_TREE_DEFINE(struct i915_vma_resource, rb,
+		     unsigned long, __subtree_last,
+		     VMA_RES_START, VMA_RES_LAST, static, vma_res_itree);
+
+/* Callbacks for the unbind dma-fence. */
 
 /**
  * i915_vma_resource_alloc - Allocate a vma resource
@@ -45,15 +50,73 @@ void __i915_vma_resource_init(struct i915_vma_resource *vma_res)
 struct i915_vma_resource *i915_vma_resource_alloc(void)
 {
 	struct i915_vma_resource *vma_res =
-		kzalloc(sizeof(*vma_res), GFP_KERNEL);
+		kmem_cache_zalloc(slab_vma_resources, GFP_KERNEL);
 
 	return vma_res ? vma_res : ERR_PTR(-ENOMEM);
 }
 
+/**
+ * i915_vma_resource_free - Free a vma resource
+ * @vma_res: The vma resource to free.
+ */
+void i915_vma_resource_free(struct i915_vma_resource *vma_res)
+{
+	kmem_cache_free(slab_vma_resources, vma_res);
+}
+
+static const char *get_driver_name(struct dma_fence *fence)
+{
+	return "vma unbind fence";
+}
+
+static const char *get_timeline_name(struct dma_fence *fence)
+{
+	return "unbound";
+}
+
+static void unbind_fence_free_rcu(struct rcu_head *head)
+{
+	struct i915_vma_resource *vma_res =
+		container_of(head, typeof(*vma_res), unbind_fence.rcu);
+
+	i915_vma_resource_free(vma_res);
+}
+
+static void unbind_fence_release(struct dma_fence *fence)
+{
+	struct i915_vma_resource *vma_res =
+		container_of(fence, typeof(*vma_res), unbind_fence);
+
+	i915_sw_fence_fini(&vma_res->chain);
+
+	call_rcu(&fence->rcu, unbind_fence_free_rcu);
+}
+
+static struct dma_fence_ops unbind_fence_ops = {
+	.get_driver_name = get_driver_name,
+	.get_timeline_name = get_timeline_name,
+	.release = unbind_fence_release,
+};
+
 static void __i915_vma_resource_unhold(struct i915_vma_resource *vma_res)
 {
-	if (refcount_dec_and_test(&vma_res->hold_count))
-		dma_fence_signal(&vma_res->unbind_fence);
+	struct i915_address_space *vm;
+
+	if (!refcount_dec_and_test(&vma_res->hold_count))
+		return;
+
+	dma_fence_signal(&vma_res->unbind_fence);
+
+	vm = vma_res->vm;
+	if (vma_res->wakeref)
+		intel_runtime_pm_put(&vm->i915->runtime_pm, vma_res->wakeref);
+
+	vma_res->vm = NULL;
+	if (!RB_EMPTY_NODE(&vma_res->rb)) {
+		mutex_lock(&vm->mutex);
+		vma_res_itree_remove(vma_res, &vm->pending_unbind);
+		mutex_unlock(&vm->mutex);
+	}
 }
 
 /**
@@ -102,6 +165,49 @@ bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
 	return held;
 }
 
+static void i915_vma_resource_unbind_work(struct work_struct *work)
+{
+	struct i915_vma_resource *vma_res =
+		container_of(work, typeof(*vma_res), work);
+	struct i915_address_space *vm = vma_res->vm;
+	bool lockdep_cookie;
+
+	lockdep_cookie = dma_fence_begin_signalling();
+	if (likely(atomic_read(&vm->open)))
+		vma_res->ops->unbind_vma(vm, vma_res);
+
+	dma_fence_end_signalling(lockdep_cookie);
+	__i915_vma_resource_unhold(vma_res);
+	i915_vma_resource_put(vma_res);
+}
+
+static int
+i915_vma_resource_fence_notify(struct i915_sw_fence *fence,
+			       enum i915_sw_fence_notify state)
+{
+	struct i915_vma_resource *vma_res =
+		container_of(fence, typeof(*vma_res), chain);
+	struct dma_fence *unbind_fence =
+		&vma_res->unbind_fence;
+
+	switch (state) {
+	case FENCE_COMPLETE:
+		dma_fence_get(unbind_fence);
+		if (vma_res->immediate_unbind) {
+			i915_vma_resource_unbind_work(&vma_res->work);
+		} else {
+			INIT_WORK(&vma_res->work, i915_vma_resource_unbind_work);
+			queue_work(system_unbound_wq, &vma_res->work);
+		}
+		break;
+	case FENCE_FREE:
+		i915_vma_resource_put(vma_res);
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
 /**
  * i915_vma_resource_unbind - Unbind a vma resource
  * @vma_res: The vma resource to unbind.
@@ -112,10 +218,196 @@ bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
  * Return: A refcounted pointer to a dma-fence that signals when unbinding is
  * complete.
  */
-struct dma_fence *
-i915_vma_resource_unbind(struct i915_vma_resource *vma_res)
+struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res)
 {
-	__i915_vma_resource_unhold(vma_res);
-	dma_fence_get(&vma_res->unbind_fence);
+	struct i915_address_space *vm = vma_res->vm;
+
+	/* Reference for the sw fence */
+	i915_vma_resource_get(vma_res);
+
+	/* Caller must already have a wakeref in this case. */
+	if (vma_res->needs_wakeref)
+		vma_res->wakeref = intel_runtime_pm_get_if_in_use(&vm->i915->runtime_pm);
+
+	if (atomic_read(&vma_res->chain.pending) <= 1) {
+		RB_CLEAR_NODE(&vma_res->rb);
+		vma_res->immediate_unbind = 1;
+	} else {
+		vma_res_itree_insert(vma_res, &vma_res->vm->pending_unbind);
+	}
+
+	i915_sw_fence_commit(&vma_res->chain);
+
 	return &vma_res->unbind_fence;
 }
+
+/**
+ * __i915_vma_resource_init - Initialize a vma resource.
+ * @vma_res: The vma resource to initialize
+ *
+ * Initializes the private members of a vma resource.
+ */
+void __i915_vma_resource_init(struct i915_vma_resource *vma_res)
+{
+	spin_lock_init(&vma_res->lock);
+	dma_fence_init(&vma_res->unbind_fence, &unbind_fence_ops,
+		       &vma_res->lock, 0, 0);
+	refcount_set(&vma_res->hold_count, 1);
+	i915_sw_fence_init(&vma_res->chain, i915_vma_resource_fence_notify);
+}
+
+static void
+i915_vma_resource_color_adjust_range(struct i915_address_space *vm,
+				     unsigned long *start,
+				     unsigned long *end)
+{
+	if (i915_vm_has_cache_coloring(vm)) {
+		if (*start)
+			*start -= I915_GTT_PAGE_SIZE;
+		*end += I915_GTT_PAGE_SIZE;
+	}
+}
+
+/**
+ * i915_vma_resource_bind_dep_sync - Wait for / sync all unbinds touching a
+ * certain vm range.
+ * @vm: The vm to look at.
+ * @offset: The range start.
+ * @size: The range size.
+ * @intr: Whether to wait interruptible.
+ *
+ * The function needs to be called with the vm lock held.
+ *
+ * Return: Zero on success, -ERESTARTSYS if interrupted and @intr==true
+ */
+int i915_vma_resource_bind_dep_sync(struct i915_address_space *vm,
+				    unsigned long offset,
+				    unsigned long size,
+				    bool intr)
+{
+	struct i915_vma_resource *node;
+	unsigned long last = offset + size - 1;
+
+	lockdep_assert_held(&vm->mutex);
+	might_sleep();
+
+	i915_vma_resource_color_adjust_range(vm, &offset, &last);
+	node = vma_res_itree_iter_first(&vm->pending_unbind, offset, last);
+	while (node) {
+		int ret = dma_fence_wait(&node->unbind_fence, intr);
+
+		if (ret)
+			return ret;
+
+		node = vma_res_itree_iter_next(node, offset, last);
+	}
+
+	return 0;
+}
+
+/**
+ * i915_vma_resource_bind_dep_sync_all - Wait for / sync all unbinds of a vm,
+ * releasing the vm lock while waiting.
+ * @vm: The vm to look at.
+ *
+ * The function may not be called with the vm lock held.
+ * Typically this is called at vm destruction to finish any pending
+ * unbind operations. The vm mutex is released while waiting to avoid
+ * stalling kernel workqueues trying to grab the mutex.
+ */
+void i915_vma_resource_bind_dep_sync_all(struct i915_address_space *vm)
+{
+	struct i915_vma_resource *node;
+	struct dma_fence *fence;
+
+	do {
+		fence = NULL;
+		mutex_lock(&vm->mutex);
+		node = vma_res_itree_iter_first(&vm->pending_unbind, 0,
+						ULONG_MAX);
+		if (node)
+			fence = dma_fence_get_rcu(&node->unbind_fence);
+		mutex_unlock(&vm->mutex);
+
+		if (fence) {
+			/*
+			 * The wait makes sure the node eventually removes
+			 * itself from the tree.
+			 */
+			dma_fence_wait(fence, false);
+			dma_fence_put(fence);
+		}
+	} while (node);
+}
+
+/**
+ * i915_vma_resource_bind_dep_await - Have a struct i915_sw_fence await all
+ * pending unbinds in a certain range of a vm.
+ * @vm: The vm to look at.
+ * @sw_fence: The struct i915_sw_fence that will be awaiting the unbinds.
+ * @offset: The range start.
+ * @size: The range size.
+ * @intr: Whether to wait interruptible.
+ * @gfp: Allocation mode for memory allocations.
+ *
+ * The function makes @sw_fence await all pending unbinds in a certain
+ * vm range before calling the complete notifier. To be able to await
+ * each individual unbind, the function needs to allocate memory using
+ * the @gfp allocation mode. If that fails, the function will instead
+ * wait for the unbind fence to signal, using @intr to judge whether to
+ * wait interruptible or not. Note that @gfp should ideally be selected so
+ * as to avoid any expensive memory allocation stalls and rather fail and
+ * synchronize itself. For now the vm mutex is required when calling this
+ * function, which means that @gfp can't call into direct reclaim. In reality
+ * this means that during heavy memory pressure, we will sync in this
+ * function.
+ *
+ * Return: Zero on success, -ERESTARTSYS if interrupted and @intr==true
+ */
+int i915_vma_resource_bind_dep_await(struct i915_address_space *vm,
+				     struct i915_sw_fence *sw_fence,
+				     unsigned long offset,
+				     unsigned long size,
+				     bool intr,
+				     gfp_t gfp)
+{
+	struct i915_vma_resource *node;
+	unsigned long last = offset + size - 1;
+
+	lockdep_assert_held(&vm->mutex);
+	might_alloc(gfp);
+	might_sleep();
+
+	i915_vma_resource_color_adjust_range(vm, &offset, &last);
+	node = vma_res_itree_iter_first(&vm->pending_unbind, offset, last);
+	while (node) {
+		int ret;
+
+		ret = i915_sw_fence_await_dma_fence(sw_fence,
+						    &node->unbind_fence,
+						    0, gfp);
+		if (ret < 0) {
+			ret = dma_fence_wait(&node->unbind_fence, intr);
+			if (ret)
+				return ret;
+		}
+
+		node = vma_res_itree_iter_next(node, offset, last);
+	}
+
+	return 0;
+}
+
+void i915_vma_resource_module_exit(void)
+{
+	kmem_cache_destroy(slab_vma_resources);
+}
+
+int __init i915_vma_resource_module_init(void)
+{
+	slab_vma_resources = KMEM_CACHE(i915_vma_resource, SLAB_HWCACHE_ALIGN);
+	if (!slab_vma_resources)
+		return -ENOMEM;
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.h b/drivers/gpu/drm/i915/i915_vma_resource.h
index 9872de58268b..a89537e83c70 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.h
+++ b/drivers/gpu/drm/i915/i915_vma_resource.h
@@ -10,6 +10,8 @@
 #include <linux/refcount.h>
 
 #include "i915_gem.h"
+#include "i915_sw_fence.h"
+#include "intel_runtime_pm.h"
 
 struct i915_page_sizes {
 	/**
@@ -39,6 +41,13 @@ struct i915_page_sizes {
  * @hold_count: Number of holders blocking the fence from finishing.
  * The vma itself is keeping a hold, which is released when unbind
  * is scheduled.
+ * @work: Work struct for deferred unbind work.
+ * @chain: The struct i915_sw_fence used to await dependencies.
+ * @rb: Rb node for the vm's pending unbind interval tree.
+ * @__subtree_last: Interval tree private member.
+ * @vm: non-refcounted pointer to the vm. This is for internal use only and
+ * this member is cleared after vma_resource unbind.
+ * @ops: Pointer to the backend i915_vma_ops.
  * @private: Bind backend private info.
  * @start: Offset into the address space of bind range start.
  * @node_size: Size of the allocated range manager node.
@@ -46,6 +55,8 @@ struct i915_page_sizes {
  * @page_sizes_gtt: Resulting page sizes from the bind operation.
  * @bound_flags: Flags indicating binding status.
  * @allocated: Backend private data. TODO: Should move into @private.
+ * @immediate_unbind: Unbind can be done immediately and doesn't need to be
+ * deferred to a work item awaiting unsignaled fences.
  *
  * The lifetime of a struct i915_vma_resource is from a binding request to
  * the actual possible asynchronous unbind has completed.
@@ -55,6 +66,12 @@ struct i915_vma_resource {
 	/* See above for description of the lock. */
 	spinlock_t lock;
 	refcount_t hold_count;
+	struct work_struct work;
+	struct i915_sw_fence chain;
+	struct rb_node rb;
+	unsigned long __subtree_last;
+	struct i915_address_space *vm;
+	intel_wakeref_t wakeref;
 
 	/**
 	 * struct i915_vma_bindinfo - Information needed for async bind
@@ -74,13 +91,17 @@ struct i915_vma_resource {
 		bool lmem:1;
 	} bi;
 
+	const struct i915_vma_ops *ops;
 	void *private;
 	unsigned long start;
 	unsigned long node_size;
 	unsigned long vma_size;
 	u32 page_sizes_gtt;
+
 	u32 bound_flags;
 	bool allocated:1;
+	bool immediate_unbind:1;
+	bool needs_wakeref:1;
 };
 
 bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
@@ -91,6 +112,8 @@ void i915_vma_resource_unhold(struct i915_vma_resource *vma_res,
 
 struct i915_vma_resource *i915_vma_resource_alloc(void);
 
+void i915_vma_resource_free(struct i915_vma_resource *vma_res);
+
 struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res);
 
 void __i915_vma_resource_init(struct i915_vma_resource *vma_res);
@@ -120,10 +143,12 @@ static inline void i915_vma_resource_put(struct i915_vma_resource *vma_res)
 /**
  * i915_vma_resource_init - Initialize a vma resource.
  * @vma_res: The vma resource to initialize
+ * @vm: Pointer to the vm.
  * @pages: The pages sg-table.
  * @page_sizes: Page sizes of the pages.
  * @readonly: Whether the vma should be bound read-only.
  * @lmem: Whether the vma points to lmem.
+ * @ops: The backend ops.
  * @private: Bind backend private info.
  * @start: Offset into the address space of bind range start.
  * @node_size: Size of the allocated range manager node.
@@ -135,20 +160,24 @@ static inline void i915_vma_resource_put(struct i915_vma_resource *vma_res)
  * allocation is not allowed.
  */
 static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
+					  struct i915_address_space *vm,
 					  struct sg_table *pages,
 					  const struct i915_page_sizes *page_sizes,
 					  bool readonly,
 					  bool lmem,
+					  const struct i915_vma_ops *ops,
 					  void *private,
 					  unsigned long start,
 					  unsigned long node_size,
 					  unsigned long size)
 {
 	__i915_vma_resource_init(vma_res);
+	vma_res->vm = vm;
 	vma_res->bi.pages = pages;
 	vma_res->bi.page_sizes = *page_sizes;
 	vma_res->bi.readonly = readonly;
 	vma_res->bi.lmem = lmem;
+	vma_res->ops = ops;
 	vma_res->private = private;
 	vma_res->start = start;
 	vma_res->node_size = node_size;
@@ -158,6 +187,25 @@ static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
 static inline void i915_vma_resource_fini(struct i915_vma_resource *vma_res)
 {
 	GEM_BUG_ON(refcount_read(&vma_res->hold_count) != 1);
+	i915_sw_fence_fini(&vma_res->chain);
 }
 
+int i915_vma_resource_bind_dep_sync(struct i915_address_space *vm,
+				    unsigned long first,
+				    unsigned long last,
+				    bool intr);
+
+int i915_vma_resource_bind_dep_await(struct i915_address_space *vm,
+				     struct i915_sw_fence *sw_fence,
+				     unsigned long first,
+				     unsigned long last,
+				     bool intr,
+				     gfp_t gfp);
+
+void i915_vma_resource_bind_dep_sync_all(struct i915_address_space *vm);
+
+void i915_vma_resource_module_exit(void);
+
+int i915_vma_resource_module_init(void);
+
 #endif
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Intel-gfx] [PATCH v5 4/6] drm/i915: Use vma resources for async unbinding
@ 2022-01-04 12:51   ` Thomas Hellström
  0 siblings, 0 replies; 32+ messages in thread
From: Thomas Hellström @ 2022-01-04 12:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

Implement async (non-blocking) unbinding by not syncing the vma before
calling unbind on the vma_resource.
Add the resulting unbind fence to the object's dma_resv from where it is
picked up by the ttm migration code.
Ideally these unbind fences should be coalesced with the migration blit
fence to avoid stalling the migration blit waiting for unbind, as they
can certainly go on in parallel, but since we don't yet have a
reasonable data structure to use to coalesce fences and attach the
resulting fence to a timeline, we defer that for now.

Note that with async unbinding, even while the unbind waits for the
preceding bind to complete before unbinding, the vma itself might have been
destroyed in the process, clearing the vma pages. Therefore we can
only allow async unbinding if we have a refcounted sg-list and keep a
refcount on that for the vma resource pages to stay intact until
binding occurs. If this condition is not met, a request for an async
unbind is diverted to a sync unbind.
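
For reference, the resulting call-site behaviour can be sketched roughly
as follows (simplified from the i915_gem.c hunk below; locking, trylock
and active-vma handling are omitted):

	/*
	 * Sketch only: try the async path first; if the object has no
	 * refcounted sg-list, i915_vma_unbind_async() returns -EBUSY and
	 * we fall back to the synchronous unbind under the vm mutex.
	 */
	ret = -EBUSY;
	if (flags & I915_GEM_OBJECT_UNBIND_ASYNC)
		ret = i915_vma_unbind_async(vma, vm_trylock);
	if (ret == -EBUSY)
		ret = __i915_vma_unbind(vma);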

v2:
- Use a separate kmem_cache for vma resources for now to isolate their
  memory allocation and aid debugging.
- Move the check for vm closed to the actual unbinding thread. Regardless
  of whether the vm is closed, we need the unbind fence to properly wait
  for capture.
- Clear vma_res::vm on unbind and update its documentation.
v4:
- Take cache coloring into account when searching for vma resources
  pending unbind. (Matthew Auld)
v5:
- Fix timeout and error check in i915_vma_resource_bind_dep_await().
- Avoid taking a reference on the object for async binding if
  async unbind capable.
- Fix braces around a single-line if statement.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c |  11 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c         |   2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c          |   4 +
 drivers/gpu/drm/i915/gt/intel_gtt.h          |   3 +
 drivers/gpu/drm/i915/i915_drv.h              |   1 +
 drivers/gpu/drm/i915/i915_gem.c              |  12 +-
 drivers/gpu/drm/i915/i915_module.c           |   3 +
 drivers/gpu/drm/i915/i915_vma.c              | 204 +++++++++--
 drivers/gpu/drm/i915/i915_vma.h              |   3 +-
 drivers/gpu/drm/i915/i915_vma_resource.c     | 354 +++++++++++++++++--
 drivers/gpu/drm/i915/i915_vma_resource.h     |  48 +++
 11 files changed, 578 insertions(+), 67 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
index ee9612a3ee5e..0f514435b9c5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
@@ -142,7 +142,16 @@ int i915_ttm_move_notify(struct ttm_buffer_object *bo)
 	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
 	int ret;
 
-	ret = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE);
+	/*
+	 * Note: The async unbinding here will actually transform the
+	 * blocking wait for unbind into a wait before finally submitting
+	 * evict / migration blit and thus stall the migration timeline
+	 * which may not be good for overall throughput. We should make
+	 * sure we await the unbind fences *after* the migration blit
+	 * instead of *before* as we currently do.
+	 */
+	ret = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE |
+				     I915_GEM_OBJECT_UNBIND_ASYNC);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 0137b6af0973..ae7bbd8914c1 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -142,7 +142,7 @@ void i915_ggtt_suspend_vm(struct i915_address_space *vm)
 			continue;
 
 		if (!i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) {
-			__i915_vma_evict(vma);
+			__i915_vma_evict(vma, false);
 			drm_mm_remove_node(&vma->node);
 		}
 	}
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index a94be0306464..46be4197b93f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -161,6 +161,9 @@ static void __i915_vm_release(struct work_struct *work)
 	struct i915_address_space *vm =
 		container_of(work, struct i915_address_space, release_work);
 
+	/* Synchronize async unbinds. */
+	i915_vma_resource_bind_dep_sync_all(vm);
+
 	vm->cleanup(vm);
 	i915_address_space_fini(vm);
 
@@ -189,6 +192,7 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	if (!kref_read(&vm->resv_ref))
 		kref_init(&vm->resv_ref);
 
+	vm->pending_unbind = RB_ROOT_CACHED;
 	INIT_WORK(&vm->release_work, __i915_vm_release);
 	atomic_set(&vm->open, 1);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 676b839d1a34..8073438b67c8 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -265,6 +265,9 @@ struct i915_address_space {
 	/* Flags used when creating page-table objects for this vm */
 	unsigned long lmem_pt_obj_flags;
 
+	/* Interval tree for pending unbind vma resources */
+	struct rb_root_cached pending_unbind;
+
 	struct drm_i915_gem_object *
 		(*alloc_pt_dma)(struct i915_address_space *vm, int sz);
 	struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index beeb42a14aae..63712b2a729e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1670,6 +1670,7 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 #define I915_GEM_OBJECT_UNBIND_BARRIER BIT(1)
 #define I915_GEM_OBJECT_UNBIND_TEST BIT(2)
 #define I915_GEM_OBJECT_UNBIND_VM_TRYLOCK BIT(3)
+#define I915_GEM_OBJECT_UNBIND_ASYNC BIT(4)
 
 void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 915bf431f320..d6d9b5c13299 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -155,10 +155,16 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 		spin_unlock(&obj->vma.lock);
 
 		if (vma) {
+			bool vm_trylock = !!(flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK);
 			ret = -EBUSY;
-			if (flags & I915_GEM_OBJECT_UNBIND_ACTIVE ||
-			    !i915_vma_is_active(vma)) {
-				if (flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK) {
+			if (flags & I915_GEM_OBJECT_UNBIND_ASYNC) {
+				assert_object_held(vma->obj);
+				ret = i915_vma_unbind_async(vma, vm_trylock);
+			}
+
+			if (ret == -EBUSY && (flags & I915_GEM_OBJECT_UNBIND_ACTIVE ||
+					      !i915_vma_is_active(vma))) {
+				if (vm_trylock) {
 					if (mutex_trylock(&vma->vm->mutex)) {
 						ret = __i915_vma_unbind(vma);
 						mutex_unlock(&vma->vm->mutex);
diff --git a/drivers/gpu/drm/i915/i915_module.c b/drivers/gpu/drm/i915/i915_module.c
index f6bcd2f89257..a8f175960b34 100644
--- a/drivers/gpu/drm/i915/i915_module.c
+++ b/drivers/gpu/drm/i915/i915_module.c
@@ -17,6 +17,7 @@
 #include "i915_scheduler.h"
 #include "i915_selftest.h"
 #include "i915_vma.h"
+#include "i915_vma_resource.h"
 
 static int i915_check_nomodeset(void)
 {
@@ -64,6 +65,8 @@ static const struct {
 	  .exit = i915_scheduler_module_exit },
 	{ .init = i915_vma_module_init,
 	  .exit = i915_vma_module_exit },
+	{ .init = i915_vma_resource_module_init,
+	  .exit = i915_vma_resource_module_exit },
 	{ .init = i915_mock_selftests },
 	{ .init = i915_pmu_init,
 	  .exit = i915_pmu_exit },
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 8fa3e0b2fe26..b886fe649e5c 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -285,9 +285,10 @@ struct i915_vma_work {
 	struct dma_fence_work base;
 	struct i915_address_space *vm;
 	struct i915_vm_pt_stash stash;
-	struct i915_vma *vma;
+	struct i915_vma_resource *vma_res;
 	struct drm_i915_gem_object *pinned;
 	struct i915_sw_dma_fence_cb cb;
+	struct i915_refct_sgt *rsgt;
 	enum i915_cache_level cache_level;
 	unsigned int flags;
 };
@@ -295,10 +296,11 @@ struct i915_vma_work {
 static void __vma_bind(struct dma_fence_work *work)
 {
 	struct i915_vma_work *vw = container_of(work, typeof(*vw), base);
-	struct i915_vma *vma = vw->vma;
+	struct i915_vma_resource *vma_res = vw->vma_res;
+
+	vma_res->ops->bind_vma(vma_res->vm, &vw->stash,
+			       vma_res, vw->cache_level, vw->flags);
 
-	vma->ops->bind_vma(vw->vm, &vw->stash,
-			   vma->resource, vw->cache_level, vw->flags);
 }
 
 static void __vma_release(struct dma_fence_work *work)
@@ -310,6 +312,10 @@ static void __vma_release(struct dma_fence_work *work)
 
 	i915_vm_free_pt_stash(vw->vm, &vw->stash);
 	i915_vm_put(vw->vm);
+	if (vw->vma_res)
+		i915_vma_resource_put(vw->vma_res);
+	if (vw->rsgt)
+		i915_refct_sgt_put(vw->rsgt);
 }
 
 static const struct dma_fence_work_ops bind_ops = {
@@ -379,13 +385,11 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
 {
 	struct drm_i915_gem_object *obj = vma->obj;
 
-	i915_vma_resource_init(vma_res, vma->pages, &vma->page_sizes,
+	i915_vma_resource_init(vma_res, vma->vm, vma->pages, &vma->page_sizes,
 			       i915_gem_object_is_readonly(obj),
 			       i915_gem_object_is_lmem(obj),
-			       vma->private,
-			       vma->node.start,
-			       vma->node.size,
-			       vma->size);
+			       vma->ops, vma->private, vma->node.start,
+			       vma->node.size, vma->size);
 }
 
 /**
@@ -409,6 +413,7 @@ int i915_vma_bind(struct i915_vma *vma,
 {
 	u32 bind_flags;
 	u32 vma_flags;
+	int ret;
 
 	lockdep_assert_held(&vma->vm->mutex);
 	GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
@@ -417,12 +422,12 @@ int i915_vma_bind(struct i915_vma *vma,
 	if (GEM_DEBUG_WARN_ON(range_overflows(vma->node.start,
 					      vma->node.size,
 					      vma->vm->total))) {
-		kfree(vma_res);
+		i915_vma_resource_free(vma_res);
 		return -ENODEV;
 	}
 
 	if (GEM_DEBUG_WARN_ON(!flags)) {
-		kfree(vma_res);
+		i915_vma_resource_free(vma_res);
 		return -EINVAL;
 	}
 
@@ -434,12 +439,30 @@ int i915_vma_bind(struct i915_vma *vma,
 
 	bind_flags &= ~vma_flags;
 	if (bind_flags == 0) {
-		kfree(vma_res);
+		i915_vma_resource_free(vma_res);
 		return 0;
 	}
 
 	GEM_BUG_ON(!atomic_read(&vma->pages_count));
 
+	/* Wait for or await async unbinds touching our range */
+	if (work && bind_flags & vma->vm->bind_async_flags)
+		ret = i915_vma_resource_bind_dep_await(vma->vm,
+						       &work->base.chain,
+						       vma->node.start,
+						       vma->node.size,
+						       true,
+						       GFP_NOWAIT |
+						       __GFP_RETRY_MAYFAIL |
+						       __GFP_NOWARN);
+	else
+		ret = i915_vma_resource_bind_dep_sync(vma->vm, vma->node.start,
+						      vma->node.size, true);
+	if (ret) {
+		i915_vma_resource_free(vma_res);
+		return ret;
+	}
+
 	if (vma->resource || !vma_res) {
 		/* Rebinding with an additional I915_VMA_*_BIND */
 		GEM_WARN_ON(!vma_flags);
@@ -452,9 +475,11 @@ int i915_vma_bind(struct i915_vma *vma,
 	if (work && bind_flags & vma->vm->bind_async_flags) {
 		struct dma_fence *prev;
 
-		work->vma = vma;
+		work->vma_res = i915_vma_resource_get(vma->resource);
 		work->cache_level = cache_level;
 		work->flags = bind_flags;
+		if (vma->obj->mm.rsgt)
+			work->rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
 
 		/*
 		 * Note we only want to chain up to the migration fence on
@@ -475,14 +500,24 @@ int i915_vma_bind(struct i915_vma *vma,
 
 		work->base.dma.error = 0; /* enable the queue_work() */
 
-		work->pinned = i915_gem_object_get(vma->obj);
+		/*
+		 * If we don't have the refcounted pages list, keep a reference
+		 * on the object to avoid waiting for the async bind to
+		 * complete in the object destruction path.
+		 */
+		if (!work->rsgt)
+			work->pinned = i915_gem_object_get(vma->obj);
 	} else {
 		if (vma->obj) {
 			int ret;
 
 			ret = i915_gem_object_wait_moving_fence(vma->obj, true);
-			if (ret)
+			if (ret) {
+				i915_vma_resource_free(vma->resource);
+				vma->resource = NULL;
+
 				return ret;
+			}
 		}
 		vma->ops->bind_vma(vma->vm, NULL, vma->resource, cache_level,
 				   bind_flags);
@@ -1755,8 +1790,9 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
 	return 0;
 }
 
-void __i915_vma_evict(struct i915_vma *vma)
+struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async)
 {
+	struct i915_vma_resource *vma_res = vma->resource;
 	struct dma_fence *unbind_fence;
 
 	GEM_BUG_ON(i915_vma_is_pinned(vma));
@@ -1789,27 +1825,39 @@ void __i915_vma_evict(struct i915_vma *vma)
 	GEM_BUG_ON(vma->fence);
 	GEM_BUG_ON(i915_vma_has_userfault(vma));
 
-	if (likely(atomic_read(&vma->vm->open))) {
-		trace_i915_vma_unbind(vma);
-		vma->ops->unbind_vma(vma->vm, vma->resource);
-	}
+	/* Object backend must be async capable. */
+	GEM_WARN_ON(async && !vma->obj->mm.rsgt);
+
+	/* If vm is not open, unbind is a nop. */
+	vma_res->needs_wakeref = i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND) &&
+		atomic_read(&vma->vm->open);
+	trace_i915_vma_unbind(vma);
+
+	unbind_fence = i915_vma_resource_unbind(vma_res);
+	vma->resource = NULL;
+
 	atomic_and(~(I915_VMA_BIND_MASK | I915_VMA_ERROR | I915_VMA_GGTT_WRITE),
 		   &vma->flags);
 
-	unbind_fence = i915_vma_resource_unbind(vma->resource);
-	i915_vma_resource_put(vma->resource);
-	vma->resource = NULL;
+	/* Object backend must be async capable. */
+	GEM_WARN_ON(async && !vma->obj->mm.rsgt);
 
 	i915_vma_detach(vma);
-	vma_unbind_pages(vma);
+
+	if (!async && unbind_fence) {
+		dma_fence_wait(unbind_fence, false);
+		dma_fence_put(unbind_fence);
+		unbind_fence = NULL;
+	}
 
 	/*
-	 * This uninterruptible wait under the vm mutex is currently
-	 * only ever blocking while the vma is being captured from.
-	 * With async unbinding, this wait here will be removed.
+	 * Binding itself may not have completed until the unbind fence signals,
+	 * so don't drop the pages until that happens, unless the resource is
+	 * async_capable.
 	 */
-	dma_fence_wait(unbind_fence, false);
-	dma_fence_put(unbind_fence);
+
+	vma_unbind_pages(vma);
+	return unbind_fence;
 }
 
 int __i915_vma_unbind(struct i915_vma *vma)
@@ -1836,12 +1884,46 @@ int __i915_vma_unbind(struct i915_vma *vma)
 		return ret;
 
 	GEM_BUG_ON(i915_vma_is_active(vma));
-	__i915_vma_evict(vma);
+	__i915_vma_evict(vma, false);
 
 	drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
 	return 0;
 }
 
+static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma)
+{
+	struct dma_fence *fence;
+
+	lockdep_assert_held(&vma->vm->mutex);
+
+	if (!drm_mm_node_allocated(&vma->node))
+		return NULL;
+
+	if (i915_vma_is_pinned(vma))
+		return ERR_PTR(-EAGAIN);
+
+	/*
+	 * We probably need to replace this with awaiting the fences of the
+	 * object's dma_resv when the vma active goes away. When doing that
+	 * we need to be careful to not add the vma_resource unbind fence
+	 * immediately to the object's dma_resv, because then unbinding
+	 * the next vma from the object, in case there are many, will
+	 * actually await the unbinding of the previous vmas, which is
+	 * undesirable.
+	 */
+	if (i915_sw_fence_await_active(&vma->resource->chain, &vma->active,
+				       I915_ACTIVE_AWAIT_EXCL |
+				       I915_ACTIVE_AWAIT_ACTIVE) < 0) {
+		return ERR_PTR(-EBUSY);
+	}
+
+	fence = __i915_vma_evict(vma, true);
+
+	drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
+
+	return fence;
+}
+
 int i915_vma_unbind(struct i915_vma *vma)
 {
 	struct i915_address_space *vm = vma->vm;
@@ -1878,6 +1960,68 @@ int i915_vma_unbind(struct i915_vma *vma)
 	return err;
 }
 
+int i915_vma_unbind_async(struct i915_vma *vma, bool trylock_vm)
+{
+	struct drm_i915_gem_object *obj = vma->obj;
+	struct i915_address_space *vm = vma->vm;
+	intel_wakeref_t wakeref = 0;
+	struct dma_fence *fence;
+	int err;
+
+	/*
+	 * We need the dma-resv lock since we add the
+	 * unbind fence to the dma-resv object.
+	 */
+	assert_object_held(obj);
+
+	if (!drm_mm_node_allocated(&vma->node))
+		return 0;
+
+	if (i915_vma_is_pinned(vma)) {
+		vma_print_allocator(vma, "is pinned");
+		return -EAGAIN;
+	}
+
+	if (!obj->mm.rsgt)
+		return -EBUSY;
+
+	err = dma_resv_reserve_shared(obj->base.resv, 1);
+	if (err)
+		return -EBUSY;
+
+	/*
+	 * It would be great if we could grab this wakeref from the
+	 * async unbind work if needed, but we can't because it uses
+	 * kmalloc and it's in the dma-fence signalling critical path.
+	 */
+	if (i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND))
+		wakeref = intel_runtime_pm_get(&vm->i915->runtime_pm);
+
+	if (trylock_vm && !mutex_trylock(&vm->mutex)) {
+		err = -EBUSY;
+		goto out_rpm;
+	} else if (!trylock_vm) {
+		err = mutex_lock_interruptible_nested(&vm->mutex, !wakeref);
+		if (err)
+			goto out_rpm;
+	}
+
+	fence = __i915_vma_unbind_async(vma);
+	mutex_unlock(&vm->mutex);
+	if (IS_ERR_OR_NULL(fence)) {
+		err = PTR_ERR_OR_ZERO(fence);
+		goto out_rpm;
+	}
+
+	dma_resv_add_shared_fence(obj->base.resv, fence);
+	dma_fence_put(fence);
+
+out_rpm:
+	if (wakeref)
+		intel_runtime_pm_put(&vm->i915->runtime_pm, wakeref);
+	return err;
+}
+
 struct i915_vma *i915_vma_make_unshrinkable(struct i915_vma *vma)
 {
 	i915_gem_object_make_unshrinkable(vma->obj);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 1df57ec832bd..a560bae04e7e 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -213,9 +213,10 @@ bool i915_vma_misplaced(const struct i915_vma *vma,
 			u64 size, u64 alignment, u64 flags);
 void __i915_vma_set_map_and_fenceable(struct i915_vma *vma);
 void i915_vma_revoke_mmap(struct i915_vma *vma);
-void __i915_vma_evict(struct i915_vma *vma);
+struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async);
 int __i915_vma_unbind(struct i915_vma *vma);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
+int __must_check i915_vma_unbind_async(struct i915_vma *vma, bool trylock_vm);
 void i915_vma_unlink_ctx(struct i915_vma *vma);
 void i915_vma_close(struct i915_vma *vma);
 void i915_vma_reopen(struct i915_vma *vma);
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.c b/drivers/gpu/drm/i915/i915_vma_resource.c
index c86db89ab5d2..3dfb3c6731f8 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.c
+++ b/drivers/gpu/drm/i915/i915_vma_resource.c
@@ -2,39 +2,44 @@
 /*
  * Copyright © 2021 Intel Corporation
  */
+
+#include <linux/interval_tree_generic.h>
 #include <linux/slab.h>
 
+#include "i915_sw_fence.h"
 #include "i915_vma_resource.h"
+#include "i915_drv.h"
 
-/* Callbacks for the unbind dma-fence. */
-static const char *get_driver_name(struct dma_fence *fence)
-{
-	return "vma unbind fence";
-}
+#include "gt/intel_gtt.h"
 
-static const char *get_timeline_name(struct dma_fence *fence)
-{
-	return "unbound";
-}
-
-static struct dma_fence_ops unbind_fence_ops = {
-	.get_driver_name = get_driver_name,
-	.get_timeline_name = get_timeline_name,
-};
+static struct kmem_cache *slab_vma_resources;
 
 /**
- * __i915_vma_resource_init - Initialize a vma resource.
- * @vma_res: The vma resource to initialize
+ * DOC:
+ * We use a per-vm interval tree to keep track of vma_resources
+ * scheduled for unbind but not yet unbound. The tree is protected by
+ * the vm mutex, and nodes are removed just after the unbind fence signals.
+ * The removal takes the vm mutex from a kernel thread which we need to
+ * keep in mind so that we don't grab the mutex and try to wait for all
+ * pending unbinds to complete, because that will temporarily block many
+ * of the workqueue threads, and people will get angry.
  *
- * Initializes the private members of a vma resource.
+ * We should consider using a single ordered fence per VM instead but that
+ * requires ordering the unbinds and might introduce unnecessary waiting
+ * for unrelated unbinds. The amount of code will probably be roughly the same
+ * due to the simplicity of using the interval tree interface.
+ *
+ * Another drawback of this interval tree is that the complexity of insertion
+ * and removal of fences increases as O(ln(pending_unbinds)) instead of
+ * O(1) for a single fence without interval tree.
  */
-void __i915_vma_resource_init(struct i915_vma_resource *vma_res)
-{
-	spin_lock_init(&vma_res->lock);
-	dma_fence_init(&vma_res->unbind_fence, &unbind_fence_ops,
-		       &vma_res->lock, 0, 0);
-	refcount_set(&vma_res->hold_count, 1);
-}
+#define VMA_RES_START(_node) ((_node)->start)
+#define VMA_RES_LAST(_node) ((_node)->start + (_node)->node_size - 1)
+INTERVAL_TREE_DEFINE(struct i915_vma_resource, rb,
+		     unsigned long, __subtree_last,
+		     VMA_RES_START, VMA_RES_LAST, static, vma_res_itree);
+
+/* Callbacks for the unbind dma-fence. */
 
 /**
  * i915_vma_resource_alloc - Allocate a vma resource
@@ -45,15 +50,73 @@ void __i915_vma_resource_init(struct i915_vma_resource *vma_res)
 struct i915_vma_resource *i915_vma_resource_alloc(void)
 {
 	struct i915_vma_resource *vma_res =
-		kzalloc(sizeof(*vma_res), GFP_KERNEL);
+		kmem_cache_zalloc(slab_vma_resources, GFP_KERNEL);
 
 	return vma_res ? vma_res : ERR_PTR(-ENOMEM);
 }
 
+/**
+ * i915_vma_resource_free - Free a vma resource
+ * @vma_res: The vma resource to free.
+ */
+void i915_vma_resource_free(struct i915_vma_resource *vma_res)
+{
+	kmem_cache_free(slab_vma_resources, vma_res);
+}
+
+static const char *get_driver_name(struct dma_fence *fence)
+{
+	return "vma unbind fence";
+}
+
+static const char *get_timeline_name(struct dma_fence *fence)
+{
+	return "unbound";
+}
+
+static void unbind_fence_free_rcu(struct rcu_head *head)
+{
+	struct i915_vma_resource *vma_res =
+		container_of(head, typeof(*vma_res), unbind_fence.rcu);
+
+	i915_vma_resource_free(vma_res);
+}
+
+static void unbind_fence_release(struct dma_fence *fence)
+{
+	struct i915_vma_resource *vma_res =
+		container_of(fence, typeof(*vma_res), unbind_fence);
+
+	i915_sw_fence_fini(&vma_res->chain);
+
+	call_rcu(&fence->rcu, unbind_fence_free_rcu);
+}
+
+static struct dma_fence_ops unbind_fence_ops = {
+	.get_driver_name = get_driver_name,
+	.get_timeline_name = get_timeline_name,
+	.release = unbind_fence_release,
+};
+
 static void __i915_vma_resource_unhold(struct i915_vma_resource *vma_res)
 {
-	if (refcount_dec_and_test(&vma_res->hold_count))
-		dma_fence_signal(&vma_res->unbind_fence);
+	struct i915_address_space *vm;
+
+	if (!refcount_dec_and_test(&vma_res->hold_count))
+		return;
+
+	dma_fence_signal(&vma_res->unbind_fence);
+
+	vm = vma_res->vm;
+	if (vma_res->wakeref)
+		intel_runtime_pm_put(&vm->i915->runtime_pm, vma_res->wakeref);
+
+	vma_res->vm = NULL;
+	if (!RB_EMPTY_NODE(&vma_res->rb)) {
+		mutex_lock(&vm->mutex);
+		vma_res_itree_remove(vma_res, &vm->pending_unbind);
+		mutex_unlock(&vm->mutex);
+	}
 }
 
 /**
@@ -102,6 +165,49 @@ bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
 	return held;
 }
 
+static void i915_vma_resource_unbind_work(struct work_struct *work)
+{
+	struct i915_vma_resource *vma_res =
+		container_of(work, typeof(*vma_res), work);
+	struct i915_address_space *vm = vma_res->vm;
+	bool lockdep_cookie;
+
+	lockdep_cookie = dma_fence_begin_signalling();
+	if (likely(atomic_read(&vm->open)))
+		vma_res->ops->unbind_vma(vm, vma_res);
+
+	dma_fence_end_signalling(lockdep_cookie);
+	__i915_vma_resource_unhold(vma_res);
+	i915_vma_resource_put(vma_res);
+}
+
+static int
+i915_vma_resource_fence_notify(struct i915_sw_fence *fence,
+			       enum i915_sw_fence_notify state)
+{
+	struct i915_vma_resource *vma_res =
+		container_of(fence, typeof(*vma_res), chain);
+	struct dma_fence *unbind_fence =
+		&vma_res->unbind_fence;
+
+	switch (state) {
+	case FENCE_COMPLETE:
+		dma_fence_get(unbind_fence);
+		if (vma_res->immediate_unbind) {
+			i915_vma_resource_unbind_work(&vma_res->work);
+		} else {
+			INIT_WORK(&vma_res->work, i915_vma_resource_unbind_work);
+			queue_work(system_unbound_wq, &vma_res->work);
+		}
+		break;
+	case FENCE_FREE:
+		i915_vma_resource_put(vma_res);
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
 /**
  * i915_vma_resource_unbind - Unbind a vma resource
  * @vma_res: The vma resource to unbind.
@@ -112,10 +218,196 @@ bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
  * Return: A refcounted pointer to a dma-fence that signals when unbinding is
  * complete.
  */
-struct dma_fence *
-i915_vma_resource_unbind(struct i915_vma_resource *vma_res)
+struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res)
 {
-	__i915_vma_resource_unhold(vma_res);
-	dma_fence_get(&vma_res->unbind_fence);
+	struct i915_address_space *vm = vma_res->vm;
+
+	/* Reference for the sw fence */
+	i915_vma_resource_get(vma_res);
+
+	/* Caller must already have a wakeref in this case. */
+	if (vma_res->needs_wakeref)
+		vma_res->wakeref = intel_runtime_pm_get_if_in_use(&vm->i915->runtime_pm);
+
+	if (atomic_read(&vma_res->chain.pending) <= 1) {
+		RB_CLEAR_NODE(&vma_res->rb);
+		vma_res->immediate_unbind = 1;
+	} else {
+		vma_res_itree_insert(vma_res, &vma_res->vm->pending_unbind);
+	}
+
+	i915_sw_fence_commit(&vma_res->chain);
+
 	return &vma_res->unbind_fence;
 }
+
+/**
+ * __i915_vma_resource_init - Initialize a vma resource.
+ * @vma_res: The vma resource to initialize
+ *
+ * Initializes the private members of a vma resource.
+ */
+void __i915_vma_resource_init(struct i915_vma_resource *vma_res)
+{
+	spin_lock_init(&vma_res->lock);
+	dma_fence_init(&vma_res->unbind_fence, &unbind_fence_ops,
+		       &vma_res->lock, 0, 0);
+	refcount_set(&vma_res->hold_count, 1);
+	i915_sw_fence_init(&vma_res->chain, i915_vma_resource_fence_notify);
+}
+
+static void
+i915_vma_resource_color_adjust_range(struct i915_address_space *vm,
+				     unsigned long *start,
+				     unsigned long *end)
+{
+	if (i915_vm_has_cache_coloring(vm)) {
+		if (*start)
+			*start -= I915_GTT_PAGE_SIZE;
+		*end += I915_GTT_PAGE_SIZE;
+	}
+}
+
+/**
+ * i915_vma_resource_bind_dep_sync - Wait for / sync all unbinds touching a
+ * certain vm range.
+ * @vm: The vm to look at.
+ * @offset: The range start.
+ * @size: The range size.
+ * @intr: Whether to wait interruptible.
+ *
+ * The function needs to be called with the vm lock held.
+ *
+ * Return: Zero on success, -ERESTARTSYS if interrupted and @intr==true
+ */
+int i915_vma_resource_bind_dep_sync(struct i915_address_space *vm,
+				    unsigned long offset,
+				    unsigned long size,
+				    bool intr)
+{
+	struct i915_vma_resource *node;
+	unsigned long last = offset + size - 1;
+
+	lockdep_assert_held(&vm->mutex);
+	might_sleep();
+
+	i915_vma_resource_color_adjust_range(vm, &offset, &last);
+	node = vma_res_itree_iter_first(&vm->pending_unbind, offset, last);
+	while (node) {
+		int ret = dma_fence_wait(&node->unbind_fence, intr);
+
+		if (ret)
+			return ret;
+
+		node = vma_res_itree_iter_next(node, offset, last);
+	}
+
+	return 0;
+}
+
+/**
+ * i915_vma_resource_bind_dep_sync_all - Wait for / sync all unbinds of a vm,
+ * releasing the vm lock while waiting.
+ * @vm: The vm to look at.
+ *
+ * The function may not be called with the vm lock held.
+ * Typically this is called at vm destruction to finish any pending
+ * unbind operations. The vm mutex is released while waiting to avoid
+ * stalling kernel workqueues trying to grab the mutex.
+ */
+void i915_vma_resource_bind_dep_sync_all(struct i915_address_space *vm)
+{
+	struct i915_vma_resource *node;
+	struct dma_fence *fence;
+
+	do {
+		fence = NULL;
+		mutex_lock(&vm->mutex);
+		node = vma_res_itree_iter_first(&vm->pending_unbind, 0,
+						ULONG_MAX);
+		if (node)
+			fence = dma_fence_get_rcu(&node->unbind_fence);
+		mutex_unlock(&vm->mutex);
+
+		if (fence) {
+			/*
+			 * The wait makes sure the node eventually removes
+			 * itself from the tree.
+			 */
+			dma_fence_wait(fence, false);
+			dma_fence_put(fence);
+		}
+	} while (node);
+}
+
+/**
+ * i915_vma_resource_bind_dep_await - Have a struct i915_sw_fence await all
+ * pending unbinds in a certain range of a vm.
+ * @vm: The vm to look at.
+ * @sw_fence: The struct i915_sw_fence that will be awaiting the unbinds.
+ * @offset: The range start.
+ * @size: The range size.
+ * @intr: Whether to wait interruptible.
+ * @gfp: Allocation mode for memory allocations.
+ *
+ * The function makes @sw_fence await all pending unbinds in a certain
+ * vm range before calling the complete notifier. To be able to await
+ * each individual unbind, the function needs to allocate memory using
+ * the @gfp allocation mode. If that fails, the function will instead
+ * wait for the unbind fence to signal, using @intr to judge whether to
+ * wait interruptible or not. Note that @gfp should ideally be selected so
+ * as to avoid any expensive memory allocation stalls and rather fail and
+ * synchronize itself. For now the vm mutex is required when calling this
+ * function, which means that @gfp can't call into direct reclaim. In reality
+ * this means that during heavy memory pressure, we will sync in this
+ * function.
+ *
+ * Return: Zero on success, -ERESTARTSYS if interrupted and @intr==true
+ */
+int i915_vma_resource_bind_dep_await(struct i915_address_space *vm,
+				     struct i915_sw_fence *sw_fence,
+				     unsigned long offset,
+				     unsigned long size,
+				     bool intr,
+				     gfp_t gfp)
+{
+	struct i915_vma_resource *node;
+	unsigned long last = offset + size - 1;
+
+	lockdep_assert_held(&vm->mutex);
+	might_alloc(gfp);
+	might_sleep();
+
+	i915_vma_resource_color_adjust_range(vm, &offset, &last);
+	node = vma_res_itree_iter_first(&vm->pending_unbind, offset, last);
+	while (node) {
+		int ret;
+
+		ret = i915_sw_fence_await_dma_fence(sw_fence,
+						    &node->unbind_fence,
+						    0, gfp);
+		if (ret < 0) {
+			ret = dma_fence_wait(&node->unbind_fence, intr);
+			if (ret)
+				return ret;
+		}
+
+		node = vma_res_itree_iter_next(node, offset, last);
+	}
+
+	return 0;
+}
+
+void i915_vma_resource_module_exit(void)
+{
+	kmem_cache_destroy(slab_vma_resources);
+}
+
+int __init i915_vma_resource_module_init(void)
+{
+	slab_vma_resources = KMEM_CACHE(i915_vma_resource, SLAB_HWCACHE_ALIGN);
+	if (!slab_vma_resources)
+		return -ENOMEM;
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.h b/drivers/gpu/drm/i915/i915_vma_resource.h
index 9872de58268b..a89537e83c70 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.h
+++ b/drivers/gpu/drm/i915/i915_vma_resource.h
@@ -10,6 +10,8 @@
 #include <linux/refcount.h>
 
 #include "i915_gem.h"
+#include "i915_sw_fence.h"
+#include "intel_runtime_pm.h"
 
 struct i915_page_sizes {
 	/**
@@ -39,6 +41,13 @@ struct i915_page_sizes {
  * @hold_count: Number of holders blocking the fence from finishing.
  * The vma itself is keeping a hold, which is released when unbind
  * is scheduled.
+ * @work: Work struct for deferred unbind work.
+ * @chain: The struct i915_sw_fence used to await dependencies.
+ * @rb: Rb node for the vm's pending unbind interval tree.
+ * @__subtree_last: Interval tree private member.
+ * @vm: non-refcounted pointer to the vm. This is for internal use only and
+ * this member is cleared after vma_resource unbind.
+ * @ops: Pointer to the backend i915_vma_ops.
  * @private: Bind backend private info.
  * @start: Offset into the address space of bind range start.
  * @node_size: Size of the allocated range manager node.
@@ -46,6 +55,8 @@ struct i915_page_sizes {
  * @page_sizes_gtt: Resulting page sizes from the bind operation.
  * @bound_flags: Flags indicating binding status.
  * @allocated: Backend private data. TODO: Should move into @private.
+ * @immediate_unbind: Unbind can be done immediately and doesn't need to be
+ * deferred to a work item awaiting unsignaled fences.
  *
  * The lifetime of a struct i915_vma_resource is from a binding request to
  * the actual possible asynchronous unbind has completed.
@@ -55,6 +66,12 @@ struct i915_vma_resource {
 	/* See above for description of the lock. */
 	spinlock_t lock;
 	refcount_t hold_count;
+	struct work_struct work;
+	struct i915_sw_fence chain;
+	struct rb_node rb;
+	unsigned long __subtree_last;
+	struct i915_address_space *vm;
+	intel_wakeref_t wakeref;
 
 	/**
 	 * struct i915_vma_bindinfo - Information needed for async bind
@@ -74,13 +91,17 @@ struct i915_vma_resource {
 		bool lmem:1;
 	} bi;
 
+	const struct i915_vma_ops *ops;
 	void *private;
 	unsigned long start;
 	unsigned long node_size;
 	unsigned long vma_size;
 	u32 page_sizes_gtt;
+
 	u32 bound_flags;
 	bool allocated:1;
+	bool immediate_unbind:1;
+	bool needs_wakeref:1;
 };
 
 bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
@@ -91,6 +112,8 @@ void i915_vma_resource_unhold(struct i915_vma_resource *vma_res,
 
 struct i915_vma_resource *i915_vma_resource_alloc(void);
 
+void i915_vma_resource_free(struct i915_vma_resource *vma_res);
+
 struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res);
 
 void __i915_vma_resource_init(struct i915_vma_resource *vma_res);
@@ -120,10 +143,12 @@ static inline void i915_vma_resource_put(struct i915_vma_resource *vma_res)
 /**
  * i915_vma_resource_init - Initialize a vma resource.
  * @vma_res: The vma resource to initialize
+ * @vm: Pointer to the vm.
  * @pages: The pages sg-table.
  * @page_sizes: Page sizes of the pages.
  * @readonly: Whether the vma should be bound read-only.
  * @lmem: Whether the vma points to lmem.
+ * @ops: The backend ops.
  * @private: Bind backend private info.
  * @start: Offset into the address space of bind range start.
  * @node_size: Size of the allocated range manager node.
@@ -135,20 +160,24 @@ static inline void i915_vma_resource_put(struct i915_vma_resource *vma_res)
  * allocation is not allowed.
  */
 static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
+					  struct i915_address_space *vm,
 					  struct sg_table *pages,
 					  const struct i915_page_sizes *page_sizes,
 					  bool readonly,
 					  bool lmem,
+					  const struct i915_vma_ops *ops,
 					  void *private,
 					  unsigned long start,
 					  unsigned long node_size,
 					  unsigned long size)
 {
 	__i915_vma_resource_init(vma_res);
+	vma_res->vm = vm;
 	vma_res->bi.pages = pages;
 	vma_res->bi.page_sizes = *page_sizes;
 	vma_res->bi.readonly = readonly;
 	vma_res->bi.lmem = lmem;
+	vma_res->ops = ops;
 	vma_res->private = private;
 	vma_res->start = start;
 	vma_res->node_size = node_size;
@@ -158,6 +187,25 @@ static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
 static inline void i915_vma_resource_fini(struct i915_vma_resource *vma_res)
 {
 	GEM_BUG_ON(refcount_read(&vma_res->hold_count) != 1);
+	i915_sw_fence_fini(&vma_res->chain);
 }
 
+int i915_vma_resource_bind_dep_sync(struct i915_address_space *vm,
+				    unsigned long first,
+				    unsigned long last,
+				    bool intr);
+
+int i915_vma_resource_bind_dep_await(struct i915_address_space *vm,
+				     struct i915_sw_fence *sw_fence,
+				     unsigned long first,
+				     unsigned long last,
+				     bool intr,
+				     gfp_t gfp);
+
+void i915_vma_resource_bind_dep_sync_all(struct i915_address_space *vm);
+
+void i915_vma_resource_module_exit(void);
+
+int i915_vma_resource_module_init(void);
+
 #endif
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v5 5/6] drm/i915: Asynchronous migration selftest
  2022-01-04 12:51 ` [Intel-gfx] " Thomas Hellström
@ 2022-01-04 12:51   ` Thomas Hellström
  -1 siblings, 0 replies; 32+ messages in thread
From: Thomas Hellström @ 2022-01-04 12:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

Add a selftest to exercise asynchronous migration and unbinding.
Extend the gem_migrate selftest to perform the migrations while
they depend on a spinner, and with a bound vma set up on the
migrated buffer object.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  12 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   3 +
 .../drm/i915/gem/selftests/i915_gem_migrate.c | 192 ++++++++++++++++--
 3 files changed, 192 insertions(+), 15 deletions(-)
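
In outline, the new subtest drives the sequence below. This is a condensed
sketch of the code in the diff that follows (declarations, ww locking and
error handling are elided; the helpers are the ones used there): the spinner
is non-preemptible, so if anything in the migration path waits synchronously
for an unbind the test deadlocks, hangcheck terminates the spinner, and the
signaled spinner fence is then detected as a failure.

	/* Non-preemptible spinner; hangcheck terminates it on deadlock. */
	spin_rq = igt_spinner_create_request(&spin, ce, MI_NOOP);
	i915_deps_init(&deps, GFP_KERNEL);
	i915_deps_add_dependency(&deps, &spin_rq->fence, &ctx);
	spin_fence = dma_fence_get(&spin_rq->fence);
	i915_request_add(spin_rq);

	/* Initial GPU clear depends on the spinner; its fence is installed
	 * as the object's moving fence so later binds queue behind it. */
	intel_migrate_clear(&gt->migrate, &ww, &deps, obj->mm.pages->sgl,
			    obj->cache_level, i915_gem_object_is_lmem(obj),
			    0xdeadbeaf, &rq);
	if (rq) {
		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
		i915_gem_object_set_moving_fence(obj, &rq->fence);
		i915_request_put(rq);
	}

	/* Bind a vma, then migrate back and forth; each migration must
	 * unbind that vma asynchronously, since the migration blit still
	 * depends on the running spinner. */
	for (i = 1; i <= 5; ++i)
		lmem_pages_migrate_one(&ww, obj, vma);

	/* Reaching this point without blocking is the pass criterion. */
	if (dma_fence_is_signaled(spin_fence))
		pr_err("Spinner was terminated by hangcheck.\n");
	igt_spinner_end(&spin);
	i915_gem_object_wait_migration(obj, true);
	i915_vma_wait_for_bind(vma);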

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index d87b508b59b1..1a9e1f940a7d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -756,6 +756,18 @@ i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj)
 	return dma_fence_get(i915_gem_to_ttm(obj)->moving);
 }
 
+void i915_gem_object_set_moving_fence(struct drm_i915_gem_object *obj,
+				      struct dma_fence *fence)
+{
+	struct dma_fence **moving = &i915_gem_to_ttm(obj)->moving;
+
+	if (*moving == fence)
+		return;
+
+	dma_fence_put(*moving);
+	*moving = dma_fence_get(fence);
+}
+
 /**
  * i915_gem_object_wait_moving_fence - Wait for the object's moving fence if any
  * @obj: The object whose moving fence to wait for.
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index f66d46882ea7..1d17ffff8236 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -524,6 +524,9 @@ i915_gem_object_finish_access(struct drm_i915_gem_object *obj)
 struct dma_fence *
 i915_gem_object_get_moving_fence(struct drm_i915_gem_object *obj);
 
+void i915_gem_object_set_moving_fence(struct drm_i915_gem_object *obj,
+				      struct dma_fence *fence);
+
 int i915_gem_object_wait_moving_fence(struct drm_i915_gem_object *obj,
 				      bool intr);
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
index ecb691c81d1e..d534141b2cf7 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_migrate.c
@@ -4,8 +4,13 @@
  */
 
 #include "gt/intel_migrate.h"
+#include "gt/intel_gpu_commands.h"
 #include "gem/i915_gem_ttm_move.h"
 
+#include "i915_deps.h"
+
+#include "selftests/igt_spinner.h"
+
 static int igt_fill_check_buffer(struct drm_i915_gem_object *obj,
 				 bool fill)
 {
@@ -101,7 +106,8 @@ static int igt_same_create_migrate(void *arg)
 }
 
 static int lmem_pages_migrate_one(struct i915_gem_ww_ctx *ww,
-				  struct drm_i915_gem_object *obj)
+				  struct drm_i915_gem_object *obj,
+				  struct i915_vma *vma)
 {
 	int err;
 
@@ -109,6 +115,24 @@ static int lmem_pages_migrate_one(struct i915_gem_ww_ctx *ww,
 	if (err)
 		return err;
 
+	if (vma) {
+		err = i915_vma_pin_ww(vma, ww, obj->base.size, 0,
+				      0UL | PIN_OFFSET_FIXED |
+				      PIN_USER);
+		if (err) {
+			if (err != -EINTR && err != -ERESTARTSYS &&
+			    err != -EDEADLK)
+				pr_err("Failed to pin vma.\n");
+			return err;
+		}
+
+		i915_vma_unpin(vma);
+	}
+
+	/*
+	 * Migration will implicitly unbind (asynchronously) any bound
+	 * vmas.
+	 */
 	if (i915_gem_object_is_lmem(obj)) {
 		err = i915_gem_object_migrate(obj, ww, INTEL_REGION_SMEM);
 		if (err) {
@@ -149,11 +173,15 @@ static int lmem_pages_migrate_one(struct i915_gem_ww_ctx *ww,
 	return err;
 }
 
-static int igt_lmem_pages_migrate(void *arg)
+static int __igt_lmem_pages_migrate(struct intel_gt *gt,
+				    struct i915_address_space *vm,
+				    struct i915_deps *deps,
+				    struct igt_spinner *spin,
+				    struct dma_fence *spin_fence)
 {
-	struct intel_gt *gt = arg;
 	struct drm_i915_private *i915 = gt->i915;
 	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma = NULL;
 	struct i915_gem_ww_ctx ww;
 	struct i915_request *rq;
 	int err;
@@ -165,6 +193,14 @@ static int igt_lmem_pages_migrate(void *arg)
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
+	if (vm) {
+		vma = i915_vma_instance(obj, vm, NULL);
+		if (IS_ERR(vma)) {
+			err = PTR_ERR(vma);
+			goto out_put;
+		}
+	}
+
 	/* Initial GPU fill, sync, CPU initialization. */
 	for_i915_gem_ww(&ww, err, true) {
 		err = i915_gem_object_lock(obj, &ww);
@@ -175,25 +211,23 @@ static int igt_lmem_pages_migrate(void *arg)
 		if (err)
 			continue;
 
-		err = intel_migrate_clear(&gt->migrate, &ww, NULL,
+		err = intel_migrate_clear(&gt->migrate, &ww, deps,
 					  obj->mm.pages->sgl, obj->cache_level,
 					  i915_gem_object_is_lmem(obj),
 					  0xdeadbeaf, &rq);
 		if (rq) {
 			dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
+			i915_gem_object_set_moving_fence(obj, &rq->fence);
 			i915_request_put(rq);
 		}
 		if (err)
 			continue;
 
-		err = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE,
-					   5 * HZ);
-		if (err)
-			continue;
-
-		err = igt_fill_check_buffer(obj, true);
-		if (err)
-			continue;
+		if (!vma) {
+			err = igt_fill_check_buffer(obj, true);
+			if (err)
+				continue;
+		}
 	}
 	if (err)
 		goto out_put;
@@ -204,7 +238,7 @@ static int igt_lmem_pages_migrate(void *arg)
 	 */
 	for (i = 1; i <= 5; ++i) {
 		for_i915_gem_ww(&ww, err, true)
-			err = lmem_pages_migrate_one(&ww, obj);
+			err = lmem_pages_migrate_one(&ww, obj, vma);
 		if (err)
 			goto out_put;
 	}
@@ -213,12 +247,27 @@ static int igt_lmem_pages_migrate(void *arg)
 	if (err)
 		goto out_put;
 
+	if (spin) {
+		if (dma_fence_is_signaled(spin_fence)) {
+			pr_err("Spinner was terminated by hangcheck.\n");
+			err = -EBUSY;
+			goto out_unlock;
+		}
+		igt_spinner_end(spin);
+	}
+
 	/* Finally sync migration and check content. */
 	err = i915_gem_object_wait_migration(obj, true);
 	if (err)
 		goto out_unlock;
 
-	err = igt_fill_check_buffer(obj, false);
+	if (vma) {
+		err = i915_vma_wait_for_bind(vma);
+		if (err)
+			goto out_unlock;
+	} else {
+		err = igt_fill_check_buffer(obj, false);
+	}
 
 out_unlock:
 	i915_gem_object_unlock(obj);
@@ -231,6 +280,7 @@ static int igt_lmem_pages_migrate(void *arg)
 static int igt_lmem_pages_failsafe_migrate(void *arg)
 {
 	int fail_gpu, fail_alloc, ret;
+	struct intel_gt *gt = arg;
 
 	for (fail_gpu = 0; fail_gpu < 2; ++fail_gpu) {
 		for (fail_alloc = 0; fail_alloc < 2; ++fail_alloc) {
@@ -238,7 +288,118 @@ static int igt_lmem_pages_failsafe_migrate(void *arg)
 				fail_gpu, fail_alloc);
 			i915_ttm_migrate_set_failure_modes(fail_gpu,
 							   fail_alloc);
-			ret = igt_lmem_pages_migrate(arg);
+			ret = __igt_lmem_pages_migrate(gt, NULL, NULL, NULL, NULL);
+			if (ret)
+				goto out_err;
+		}
+	}
+
+out_err:
+	i915_ttm_migrate_set_failure_modes(false, false);
+	return ret;
+}
+
+/*
+ * This subtest tests that unbinding at migration is indeed performed
+ * async. We launch a spinner and a number of migrations depending on
+ * that spinner to have terminated. Before each migration we bind a
+ * vma, which should then be async unbound by the migration operation.
+ * If we are able to schedule migrations without blocking while the
+ * spinner is still running, those unbinds are indeed async and non-
+ * blocking.
+ *
+ * Note that each async bind operation is awaiting the previous migration
+ * due to the moving fence resulting from the migration.
+ */
+static int igt_async_migrate(struct intel_gt *gt)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	struct i915_ppgtt *ppgtt;
+	struct igt_spinner spin;
+	int err;
+
+	ppgtt = i915_ppgtt_create(gt, 0);
+	if (IS_ERR(ppgtt))
+		return PTR_ERR(ppgtt);
+
+	if (igt_spinner_init(&spin, gt)) {
+		err = -ENOMEM;
+		goto out_spin;
+	}
+
+	for_each_engine(engine, gt, id) {
+		struct ttm_operation_ctx ctx = {
+			.interruptible = true
+		};
+		struct dma_fence *spin_fence;
+		struct intel_context *ce;
+		struct i915_request *rq;
+		struct i915_deps deps;
+
+		ce = intel_context_create(engine);
+		if (IS_ERR(ce)) {
+			err = PTR_ERR(ce);
+			goto out_ce;
+		}
+
+		/*
+		 * Use MI_NOOP, making the spinner non-preemptible. If there
+		 * is a code path where we fail async operation due to the
+		 * running spinner, we will block and fail to end the
+		 * spinner resulting in a deadlock. But with a non-
+		 * preemptible spinner, hangcheck will terminate the spinner
+		 * for us, and we will later detect that and fail the test.
+		 */
+		rq = igt_spinner_create_request(&spin, ce, MI_NOOP);
+		intel_context_put(ce);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto out_ce;
+		}
+
+		i915_deps_init(&deps, GFP_KERNEL);
+		err = i915_deps_add_dependency(&deps, &rq->fence, &ctx);
+		spin_fence = dma_fence_get(&rq->fence);
+		i915_request_add(rq);
+		if (err)
+			goto out_ce;
+
+		err = __igt_lmem_pages_migrate(gt, &ppgtt->vm, &deps, &spin,
+					       spin_fence);
+		i915_deps_fini(&deps);
+		dma_fence_put(spin_fence);
+		if (err)
+			goto out_ce;
+	}
+
+out_ce:
+	igt_spinner_fini(&spin);
+out_spin:
+	i915_vm_put(&ppgtt->vm);
+
+	return err;
+}
+
+/*
+ * Setting ASYNC_FAIL_ALLOC to 2 will simulate memory allocation failure while
+ * arming the migration error check and block async migration. This
+ * will cause us to deadlock and hangcheck will terminate the spinner
+ * causing the test to fail.
+ */
+#define ASYNC_FAIL_ALLOC 1
+static int igt_lmem_async_migrate(void *arg)
+{
+	int fail_gpu, fail_alloc, ret;
+	struct intel_gt *gt = arg;
+
+	for (fail_gpu = 0; fail_gpu < 2; ++fail_gpu) {
+		for (fail_alloc = 0; fail_alloc < ASYNC_FAIL_ALLOC; ++fail_alloc) {
+			pr_info("Simulated failure modes: gpu: %d, alloc: %d\n",
+				fail_gpu, fail_alloc);
+			i915_ttm_migrate_set_failure_modes(fail_gpu,
+							   fail_alloc);
+			ret = igt_async_migrate(gt);
 			if (ret)
 				goto out_err;
 		}
@@ -256,6 +417,7 @@ int i915_gem_migrate_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(igt_lmem_create_migrate),
 		SUBTEST(igt_same_create_migrate),
 		SUBTEST(igt_lmem_pages_failsafe_migrate),
+		SUBTEST(igt_lmem_async_migrate),
 	};
 
 	if (!HAS_LMEM(i915))
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v5 6/6] drm/i915: Use struct vma_resource instead of struct vma_snapshot
  2022-01-04 12:51 ` [Intel-gfx] " Thomas Hellström
@ 2022-01-04 12:51   ` Thomas Hellström
  -1 siblings, 0 replies; 32+ messages in thread
From: Thomas Hellström @ 2022-01-04 12:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

There is always a struct vma_resource guaranteed to be alive when we
access a corresponding struct vma_snapshot.

So ditch the latter, and instead of allocating vma_snapshots, reference
the already existing vma_resource.

This requires a couple of extra members in struct vma_resource but that's
a small price to pay for the simplification.

v2:
- Fix a missing include and declaration (kernel test robot <lkp@intel.com>)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 -
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  15 +--
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   9 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  87 ++++++------
 drivers/gpu/drm/i915/i915_request.c           |  12 +-
 drivers/gpu/drm/i915/i915_request.h           |   6 +-
 drivers/gpu/drm/i915/i915_vma.c               |  16 +--
 drivers/gpu/drm/i915/i915_vma_resource.c      |   4 +
 drivers/gpu/drm/i915/i915_vma_resource.h      |  28 +++-
 drivers/gpu/drm/i915/i915_vma_snapshot.c      | 125 ------------------
 drivers/gpu/drm/i915/i915_vma_snapshot.h      | 101 --------------
 11 files changed, 90 insertions(+), 314 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.c
 delete mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.h
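
With the snapshots gone, an immediate error capture holds the vma resource
around coredump creation instead. The pattern below is distilled from the
create_vma_coredump() hunk in i915_gpu_error.c further down (surrounding
NULL checks elided):

	struct i915_vma_resource *vma_res = vma->resource;
	struct i915_vma_coredump *ret = NULL;
	bool lockdep_cookie;

	/* Block release of the backing store while dumping it. */
	if (i915_vma_resource_hold(vma_res, &lockdep_cookie)) {
		ret = i915_vma_coredump_create(gt, vma_res, compress, name);
		i915_vma_resource_unhold(vma_res, lockdep_cookie);
	}

Deferred captures (the per-request capture lists and rq->batch_res) instead
take an i915_vma_resource_get() reference at submission time and drop it
with i915_vma_resource_put() once the coredump has been created or the
request is released.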

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 98433ad74194..aa86ac33effc 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -175,7 +175,6 @@ i915-y += \
 	  i915_ttm_buddy_manager.o \
 	  i915_vma.o \
 	  i915_vma_resource.o \
-	  i915_vma_snapshot.o \
 	  intel_wopcm.o
 
 # general-purpose microcontroller (GuC) support
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 72e497745c12..2f85fe557ad2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -29,7 +29,6 @@
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
-#include "i915_vma_snapshot.h"
 
 struct eb_vma {
 	struct i915_vma *vma;
@@ -1952,7 +1951,6 @@ static void eb_capture_stage(struct i915_execbuffer *eb)
 {
 	const unsigned int count = eb->buffer_count;
 	unsigned int i = count, j;
-	struct i915_vma_snapshot *vsnap;
 
 	while (i--) {
 		struct eb_vma *ev = &eb->vma[i];
@@ -1962,11 +1960,6 @@ static void eb_capture_stage(struct i915_execbuffer *eb)
 		if (!(flags & EXEC_OBJECT_CAPTURE))
 			continue;
 
-		vsnap = i915_vma_snapshot_alloc(GFP_KERNEL);
-		if (!vsnap)
-			continue;
-
-		i915_vma_snapshot_init(vsnap, vma, "user");
 		for_each_batch_create_order(eb, j) {
 			struct i915_capture_list *capture;
 
@@ -1975,10 +1968,9 @@ static void eb_capture_stage(struct i915_execbuffer *eb)
 				continue;
 
 			capture->next = eb->capture_lists[j];
-			capture->vma_snapshot = i915_vma_snapshot_get(vsnap);
+			capture->vma_res = i915_vma_resource_get(vma->resource);
 			eb->capture_lists[j] = capture;
 		}
-		i915_vma_snapshot_put(vsnap);
 	}
 }
 
@@ -3281,9 +3273,8 @@ eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence,
 		 * _onstack interface.
 		 */
 		if (eb->batches[i]->vma)
-			i915_vma_snapshot_init_onstack(&eb->requests[i]->batch_snapshot,
-						       eb->batches[i]->vma,
-						       "batch");
+			eb->requests[i]->batch_res =
+				i915_vma_resource_get(eb->batches[i]->vma->resource);
 		if (eb->batch_pool) {
 			GEM_BUG_ON(intel_context_is_parallel(eb->context));
 			intel_gt_buffer_pool_mark_active(eb->batch_pool,
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 74aa90587061..d1daa4cc2895 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1708,18 +1708,15 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
 
 static void print_request_ring(struct drm_printer *m, struct i915_request *rq)
 {
-	struct i915_vma_snapshot *vsnap = &rq->batch_snapshot;
+	struct i915_vma_resource *vma_res = rq->batch_res;
 	void *ring;
 	int size;
 
-	if (!i915_vma_snapshot_present(vsnap))
-		vsnap = NULL;
-
 	drm_printf(m,
 		   "[head %04x, postfix %04x, tail %04x, batch 0x%08x_%08x]:\n",
 		   rq->head, rq->postfix, rq->tail,
-		   vsnap ? upper_32_bits(vsnap->vma_resource->start) : ~0u,
-		   vsnap ? lower_32_bits(vsnap->vma_resource->start) : ~0u);
+		   vma_res ? upper_32_bits(vma_res->start) : ~0u,
+		   vma_res ? lower_32_bits(vma_res->start) : ~0u);
 
 	size = rq->tail - rq->head;
 	if (rq->tail < rq->head)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 1af54ff374f9..f8c4336cba89 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -48,7 +48,6 @@
 #include "i915_gpu_error.h"
 #include "i915_memcpy.h"
 #include "i915_scatterlist.h"
-#include "i915_vma_snapshot.h"
 
 #define ALLOW_FAIL (__GFP_KSWAPD_RECLAIM | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
 #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)
@@ -1013,8 +1012,10 @@ void __i915_gpu_coredump_free(struct kref *error_ref)
 
 static struct i915_vma_coredump *
 i915_vma_coredump_create(const struct intel_gt *gt,
-			 const struct i915_vma_snapshot *vsnap,
-			 struct i915_vma_compress *compress)
+			 const struct i915_vma_resource *vma_res,
+			 struct i915_vma_compress *compress,
+			 const char *name)
+
 {
 	struct i915_ggtt *ggtt = gt->ggtt;
 	const u64 slot = ggtt->error_capture.start;
@@ -1024,7 +1025,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 
 	might_sleep();
 
-	if (!vsnap || !vsnap->pages || !compress)
+	if (!vma_res || !vma_res->bi.pages || !compress)
 		return NULL;
 
 	dst = kmalloc(sizeof(*dst), ALLOW_FAIL);
@@ -1037,12 +1038,12 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	}
 
 	INIT_LIST_HEAD(&dst->page_list);
-	strcpy(dst->name, vsnap->name);
+	strcpy(dst->name, name);
 	dst->next = NULL;
 
-	dst->gtt_offset = vsnap->vma_resource->start;
-	dst->gtt_size = vsnap->vma_resource->node_size;
-	dst->gtt_page_sizes = vsnap->vma_resource->page_sizes_gtt;
+	dst->gtt_offset = vma_res->start;
+	dst->gtt_size = vma_res->node_size;
+	dst->gtt_page_sizes = vma_res->page_sizes_gtt;
 	dst->unused = 0;
 
 	ret = -EINVAL;
@@ -1050,7 +1051,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 		void __iomem *s;
 		dma_addr_t dma;
 
-		for_each_sgt_daddr(dma, iter, vsnap->pages) {
+		for_each_sgt_daddr(dma, iter, vma_res->bi.pages) {
 			mutex_lock(&ggtt->error_mutex);
 			ggtt->vm.insert_page(&ggtt->vm, dma, slot,
 					     I915_CACHE_NONE, 0);
@@ -1068,11 +1069,11 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 			if (ret)
 				break;
 		}
-	} else if (vsnap->mr && vsnap->mr->type != INTEL_MEMORY_SYSTEM) {
-		struct intel_memory_region *mem = vsnap->mr;
+	} else if (vma_res->bi.lmem) {
+		struct intel_memory_region *mem = vma_res->mr;
 		dma_addr_t dma;
 
-		for_each_sgt_daddr(dma, iter, vsnap->pages) {
+		for_each_sgt_daddr(dma, iter, vma_res->bi.pages) {
 			void __iomem *s;
 
 			s = io_mapping_map_wc(&mem->iomap,
@@ -1088,7 +1089,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	} else {
 		struct page *page;
 
-		for_each_sgt_page(page, iter, vsnap->pages) {
+		for_each_sgt_page(page, iter, vma_res->bi.pages) {
 			void *s;
 
 			drm_clflush_pages(&page, 1);
@@ -1324,33 +1325,32 @@ static bool record_context(struct i915_gem_context_coredump *e,
 
 struct intel_engine_capture_vma {
 	struct intel_engine_capture_vma *next;
-	struct i915_vma_snapshot *vsnap;
+	struct i915_vma_resource *vma_res;
 	char name[16];
 	bool lockdep_cookie;
 };
 
 static struct intel_engine_capture_vma *
 capture_vma_snapshot(struct intel_engine_capture_vma *next,
-		     struct i915_vma_snapshot *vsnap,
-		     gfp_t gfp)
+		     struct i915_vma_resource *vma_res,
+		     gfp_t gfp, const char *name)
 {
 	struct intel_engine_capture_vma *c;
 
-	if (!i915_vma_snapshot_present(vsnap))
+	if (!vma_res)
 		return next;
 
 	c = kmalloc(sizeof(*c), gfp);
 	if (!c)
 		return next;
 
-	if (!i915_vma_snapshot_resource_pin(vsnap, &c->lockdep_cookie)) {
+	if (!i915_vma_resource_hold(vma_res, &c->lockdep_cookie)) {
 		kfree(c);
 		return next;
 	}
 
-	strcpy(c->name, vsnap->name);
-	c->vsnap = vsnap;
-	i915_vma_snapshot_get(vsnap);
+	strcpy(c->name, name);
+	c->vma_res = i915_vma_resource_get(vma_res);
 
 	c->next = next;
 	return c;
@@ -1362,8 +1362,6 @@ capture_vma(struct intel_engine_capture_vma *next,
 	    const char *name,
 	    gfp_t gfp)
 {
-	struct i915_vma_snapshot *vsnap;
-
 	if (!vma)
 		return next;
 
@@ -1372,19 +1370,10 @@ capture_vma(struct intel_engine_capture_vma *next,
 	 * to a struct i915_vma_snapshot at command submission time.
 	 * Not here.
 	 */
-	GEM_WARN_ON(!i915_vma_is_pinned(vma));
-	if (!i915_vma_is_pinned(vma))
-		return next;
-
-	vsnap = i915_vma_snapshot_alloc(gfp);
-	if (!vsnap)
+	if (GEM_WARN_ON(!i915_vma_is_pinned(vma)))
 		return next;
 
-	i915_vma_snapshot_init(vsnap, vma, name);
-	next = capture_vma_snapshot(next, vsnap, gfp);
-
-	/* FIXME: Replace on async unbind. */
-	i915_vma_snapshot_put(vsnap);
+	next = capture_vma_snapshot(next, vma->resource, gfp, name);
 
 	return next;
 }
@@ -1397,7 +1386,8 @@ capture_user(struct intel_engine_capture_vma *capture,
 	struct i915_capture_list *c;
 
 	for (c = rq->capture_list; c; c = c->next)
-		capture = capture_vma_snapshot(capture, c->vma_snapshot, gfp);
+		capture = capture_vma_snapshot(capture, c->vma_res, gfp,
+					       "user");
 
 	return capture;
 }
@@ -1415,16 +1405,19 @@ static struct i915_vma_coredump *
 create_vma_coredump(const struct intel_gt *gt, struct i915_vma *vma,
 		    const char *name, struct i915_vma_compress *compress)
 {
-	struct i915_vma_coredump *ret;
-	struct i915_vma_snapshot tmp;
+	struct i915_vma_coredump *ret = NULL;
+	struct i915_vma_resource *vma_res;
+	bool lockdep_cookie;
 
 	if (!vma)
 		return NULL;
 
-	GEM_WARN_ON(!i915_vma_is_pinned(vma));
-	i915_vma_snapshot_init_onstack(&tmp, vma, name);
-	ret = i915_vma_coredump_create(gt, &tmp, compress);
-	i915_vma_snapshot_put_onstack(&tmp);
+	vma_res = vma->resource;
+
+	if (i915_vma_resource_hold(vma_res, &lockdep_cookie)) {
+		ret = i915_vma_coredump_create(gt, vma_res, compress, name);
+		i915_vma_resource_unhold(vma_res, lockdep_cookie);
+	}
 
 	return ret;
 }
@@ -1471,7 +1464,7 @@ intel_engine_coredump_add_request(struct intel_engine_coredump *ee,
 	 * as the simplest method to avoid being overwritten
 	 * by userspace.
 	 */
-	vma = capture_vma_snapshot(vma, &rq->batch_snapshot, gfp);
+	vma = capture_vma_snapshot(vma, rq->batch_res, gfp, "batch");
 	vma = capture_user(vma, rq, gfp);
 	vma = capture_vma(vma, rq->ring->vma, "ring", gfp);
 	vma = capture_vma(vma, rq->context->state, "HW context", gfp);
@@ -1492,14 +1485,14 @@ intel_engine_coredump_add_vma(struct intel_engine_coredump *ee,
 
 	while (capture) {
 		struct intel_engine_capture_vma *this = capture;
-		struct i915_vma_snapshot *vsnap = this->vsnap;
+		struct i915_vma_resource *vma_res = this->vma_res;
 
 		add_vma(ee,
-			i915_vma_coredump_create(engine->gt,
-						 vsnap, compress));
+			i915_vma_coredump_create(engine->gt, vma_res,
+						 compress, this->name));
 
-		i915_vma_snapshot_resource_unpin(vsnap, this->lockdep_cookie);
-		i915_vma_snapshot_put(vsnap);
+		i915_vma_resource_unhold(vma_res, this->lockdep_cookie);
+		i915_vma_resource_put(vma_res);
 
 		capture = this->next;
 		kfree(this);
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 76cf5ac91e94..ba3a70b2cc57 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -116,8 +116,10 @@ static void i915_fence_release(struct dma_fence *fence)
 		   rq->guc_prio != GUC_PRIO_FINI);
 
 	i915_request_free_capture_list(fetch_and_zero(&rq->capture_list));
-	if (i915_vma_snapshot_present(&rq->batch_snapshot))
-		i915_vma_snapshot_put_onstack(&rq->batch_snapshot);
+	if (rq->batch_res) {
+		i915_vma_resource_put(rq->batch_res);
+		rq->batch_res = NULL;
+	}
 
 	/*
 	 * The request is put onto a RCU freelist (i.e. the address
@@ -308,7 +310,7 @@ void i915_request_free_capture_list(struct i915_capture_list *capture)
 	while (capture) {
 		struct i915_capture_list *next = capture->next;
 
-		i915_vma_snapshot_put(capture->vma_snapshot);
+		i915_vma_resource_put(capture->vma_res);
 		kfree(capture);
 		capture = next;
 	}
@@ -854,7 +856,7 @@ static void __i915_request_ctor(void *arg)
 	i915_sw_fence_init(&rq->semaphore, semaphore_notify);
 
 	clear_capture_list(rq);
-	rq->batch_snapshot.present = false;
+	rq->batch_res = NULL;
 
 	init_llist_head(&rq->execute_cb);
 }
@@ -960,7 +962,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 	__rq_init_watchdog(rq);
 	assert_capture_list_is_null(rq);
 	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
-	GEM_BUG_ON(i915_vma_snapshot_present(&rq->batch_snapshot));
+	GEM_BUG_ON(rq->batch_res);
 
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 170ee78c2858..28b1f9db5487 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -40,7 +40,7 @@
 #include "i915_scheduler.h"
 #include "i915_selftest.h"
 #include "i915_sw_fence.h"
-#include "i915_vma_snapshot.h"
+#include "i915_vma_resource.h"
 
 #include <uapi/drm/i915_drm.h>
 
@@ -52,7 +52,7 @@ struct i915_request;
 
 #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
 struct i915_capture_list {
-	struct i915_vma_snapshot *vma_snapshot;
+	struct i915_vma_resource *vma_res;
 	struct i915_capture_list *next;
 };
 
@@ -300,7 +300,7 @@ struct i915_request {
 	/** Batch buffer pointer for selftest internal use. */
 	I915_SELFTEST_DECLARE(struct i915_vma *batch);
 
-	struct i915_vma_snapshot batch_snapshot;
+	struct i915_vma_resource *batch_res;
 
 #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
 	/**
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index b886fe649e5c..18cb7a70cf03 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -288,7 +288,6 @@ struct i915_vma_work {
 	struct i915_vma_resource *vma_res;
 	struct drm_i915_gem_object *pinned;
 	struct i915_sw_dma_fence_cb cb;
-	struct i915_refct_sgt *rsgt;
 	enum i915_cache_level cache_level;
 	unsigned int flags;
 };
@@ -314,8 +313,6 @@ static void __vma_release(struct dma_fence_work *work)
 	i915_vm_put(vw->vm);
 	if (vw->vma_res)
 		i915_vma_resource_put(vw->vma_res);
-	if (vw->rsgt)
-		i915_refct_sgt_put(vw->rsgt);
 }
 
 static const struct dma_fence_work_ops bind_ops = {
@@ -386,8 +383,8 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
 	struct drm_i915_gem_object *obj = vma->obj;
 
 	i915_vma_resource_init(vma_res, vma->vm, vma->pages, &vma->page_sizes,
-			       i915_gem_object_is_readonly(obj),
-			       i915_gem_object_is_lmem(obj),
+			       obj->mm.rsgt, i915_gem_object_is_readonly(obj),
+			       i915_gem_object_is_lmem(obj), obj->mm.region,
 			       vma->ops, vma->private, vma->node.start,
 			       vma->node.size, vma->size);
 }
@@ -478,8 +475,6 @@ int i915_vma_bind(struct i915_vma *vma,
 		work->vma_res = i915_vma_resource_get(vma->resource);
 		work->cache_level = cache_level;
 		work->flags = bind_flags;
-		if (vma->obj->mm.rsgt)
-			work->rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
 
 		/*
 		 * Note we only want to chain up to the migration fence on
@@ -505,7 +500,7 @@ int i915_vma_bind(struct i915_vma *vma,
 		 * on the object to avoid waiting for the async bind to
 		 * complete in the object destruction path.
 		 */
-		if (!work->rsgt)
+		if (!work->vma_res->bi.pages_rsgt)
 			work->pinned = i915_gem_object_get(vma->obj);
 	} else {
 		if (vma->obj) {
@@ -1826,7 +1821,7 @@ struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async)
 	GEM_BUG_ON(i915_vma_has_userfault(vma));
 
 	/* Object backend must be async capable. */
-	GEM_WARN_ON(async && !vma->obj->mm.rsgt);
+	GEM_WARN_ON(async && !vma->resource->bi.pages_rsgt);
 
 	/* If vm is not open, unbind is a nop. */
 	vma_res->needs_wakeref = i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND) &&
@@ -1839,9 +1834,6 @@ struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async)
 	atomic_and(~(I915_VMA_BIND_MASK | I915_VMA_ERROR | I915_VMA_GGTT_WRITE),
 		   &vma->flags);
 
-	/* Object backend must be async capable. */
-	GEM_WARN_ON(async && !vma->obj->mm.rsgt);
-
 	i915_vma_detach(vma);
 
 	if (!async && unbind_fence) {
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.c b/drivers/gpu/drm/i915/i915_vma_resource.c
index 3dfb3c6731f8..0d8da44eccd2 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.c
+++ b/drivers/gpu/drm/i915/i915_vma_resource.c
@@ -9,6 +9,7 @@
 #include "i915_sw_fence.h"
 #include "i915_vma_resource.h"
 #include "i915_drv.h"
+#include "intel_memory_region.h"
 
 #include "gt/intel_gtt.h"
 
@@ -117,6 +118,9 @@ static void __i915_vma_resource_unhold(struct i915_vma_resource *vma_res)
 		vma_res_itree_remove(vma_res, &vm->pending_unbind);
 		mutex_unlock(&vm->mutex);
 	}
+
+	if (vma_res->bi.pages_rsgt)
+		i915_refct_sgt_put(vma_res->bi.pages_rsgt);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.h b/drivers/gpu/drm/i915/i915_vma_resource.h
index a89537e83c70..faecdf3e7eca 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.h
+++ b/drivers/gpu/drm/i915/i915_vma_resource.h
@@ -10,9 +10,12 @@
 #include <linux/refcount.h>
 
 #include "i915_gem.h"
+#include "i915_scatterlist.h"
 #include "i915_sw_fence.h"
 #include "intel_runtime_pm.h"
 
+struct intel_memory_region;
+
 struct i915_page_sizes {
 	/**
 	 * The sg mask of the pages sg_table. i.e the mask of
@@ -47,6 +50,7 @@ struct i915_page_sizes {
  * @__subtree_last: Interval tree private member.
  * @vm: non-refcounted pointer to the vm. This is for internal use only and
  * this member is cleared after vm_resource unbind.
+ * @mr: The memory region of the object pointed to by the vma.
  * @ops: Pointer to the backend i915_vma_ops.
  * @private: Bind backend private info.
  * @start: Offset into the address space of bind range start.
@@ -55,8 +59,10 @@ struct i915_page_sizes {
  * @page_sizes_gtt: Resulting page sizes from the bind operation.
  * @bound_flags: Flags indicating binding status.
  * @allocated: Backend private data. TODO: Should move into @private.
- * @immediate_unbind: Unbind can be done immediately and don't need to be
- * deferred to a work item awaiting unsignaled fences.
+ * @immediate_unbind: Unbind can be done immediately and doesn't need to be
+ * deferred to a work item awaiting unsignaled fences. This is a hack.
+ * (dma_fence_work uses a fence flag for this, but this seems slightly
+ * cleaner).
  *
  * The lifetime of a struct i915_vma_resource is from a binding request to
  * the actual possible asynchronous unbind has completed.
@@ -81,16 +87,22 @@ struct i915_vma_resource {
 	 * and flags
 	 * @pages: The pages sg-table.
 	 * @page_sizes: Page sizes of the pages.
+	 * @pages_rsgt: Refcounted sg-table when delayed object destruction
+	 * is supported. May be NULL.
 	 * @readonly: Whether the vma should be bound read-only.
 	 * @lmem: Whether the vma points to lmem.
 	 */
 	struct i915_vma_bindinfo {
 		struct sg_table *pages;
 		struct i915_page_sizes page_sizes;
+		struct i915_refct_sgt *pages_rsgt;
 		bool readonly:1;
 		bool lmem:1;
 	} bi;
 
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
+	struct intel_memory_region *mr;
+#endif
 	const struct i915_vma_ops *ops;
 	void *private;
 	unsigned long start;
@@ -146,8 +158,11 @@ static inline void i915_vma_resource_put(struct i915_vma_resource *vma_res)
  * @vm: Pointer to the vm.
  * @pages: The pages sg-table.
  * @page_sizes: Page sizes of the pages.
+ * @pages_rsgt: Pointer to a struct i915_refct_sgt of an object with
+ * delayed destruction.
  * @readonly: Whether the vma should be bound read-only.
  * @lmem: Whether the vma points to lmem.
+ * @mr: The memory region of the object the vma points to.
  * @ops: The backend ops.
  * @private: Bind backend private info.
  * @start: Offset into the address space of bind range start.
@@ -163,8 +178,10 @@ static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
 					  struct i915_address_space *vm,
 					  struct sg_table *pages,
 					  const struct i915_page_sizes *page_sizes,
+					  struct i915_refct_sgt *pages_rsgt,
 					  bool readonly,
 					  bool lmem,
+					  struct intel_memory_region *mr,
 					  const struct i915_vma_ops *ops,
 					  void *private,
 					  unsigned long start,
@@ -175,8 +192,13 @@ static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
 	vma_res->vm = vm;
 	vma_res->bi.pages = pages;
 	vma_res->bi.page_sizes = *page_sizes;
+	if (pages_rsgt)
+		vma_res->bi.pages_rsgt = i915_refct_sgt_get(pages_rsgt);
 	vma_res->bi.readonly = readonly;
 	vma_res->bi.lmem = lmem;
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
+	vma_res->mr = mr;
+#endif
 	vma_res->ops = ops;
 	vma_res->private = private;
 	vma_res->start = start;
@@ -187,6 +209,8 @@ static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
 static inline void i915_vma_resource_fini(struct i915_vma_resource *vma_res)
 {
 	GEM_BUG_ON(refcount_read(&vma_res->hold_count) != 1);
+	if (vma_res->bi.pages_rsgt)
+		i915_refct_sgt_put(vma_res->bi.pages_rsgt);
 	i915_sw_fence_fini(&vma_res->chain);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
deleted file mode 100644
index 69f62c1ca967..000000000000
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.c
+++ /dev/null
@@ -1,125 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2021 Intel Corporation
- */
-
-#include "i915_vma_resource.h"
-#include "i915_vma_snapshot.h"
-#include "i915_vma_types.h"
-#include "i915_vma.h"
-
-/**
- * i915_vma_snapshot_init - Initialize a struct i915_vma_snapshot from
- * a struct i915_vma.
- * @vsnap: The i915_vma_snapshot to init.
- * @vma: A struct i915_vma used to initialize @vsnap.
- * @name: Name associated with the snapshot. The character pointer needs to
- * stay alive over the lifitime of the shapsot
- */
-void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
-			    struct i915_vma *vma,
-			    const char *name)
-{
-	if (!i915_vma_is_pinned(vma))
-		assert_object_held(vma->obj);
-
-	vsnap->name = name;
-	vsnap->obj_size = vma->obj->base.size;
-	vsnap->pages = vma->pages;
-	vsnap->pages_rsgt = NULL;
-	vsnap->mr = NULL;
-	if (vma->obj->mm.rsgt)
-		vsnap->pages_rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
-	vsnap->mr = vma->obj->mm.region;
-	kref_init(&vsnap->kref);
-	vsnap->vma_resource = i915_vma_get_current_resource(vma);
-	vsnap->onstack = false;
-	vsnap->present = true;
-}
-
-/**
- * i915_vma_snapshot_init_onstack - Initialize a struct i915_vma_snapshot from
- * a struct i915_vma, but avoid kfreeing it on last put.
- * @vsnap: The i915_vma_snapshot to init.
- * @vma: A struct i915_vma used to initialize @vsnap.
- * @name: Name associated with the snapshot. The character pointer needs to
- * stay alive over the lifitime of the shapsot
- */
-void i915_vma_snapshot_init_onstack(struct i915_vma_snapshot *vsnap,
-				    struct i915_vma *vma,
-				    const char *name)
-{
-	i915_vma_snapshot_init(vsnap, vma, name);
-	vsnap->onstack = true;
-}
-
-static void vma_snapshot_release(struct kref *ref)
-{
-	struct i915_vma_snapshot *vsnap =
-		container_of(ref, typeof(*vsnap), kref);
-
-	vsnap->present = false;
-	i915_vma_resource_put(vsnap->vma_resource);
-	if (vsnap->pages_rsgt)
-		i915_refct_sgt_put(vsnap->pages_rsgt);
-	if (!vsnap->onstack)
-		kfree(vsnap);
-}
-
-/**
- * i915_vma_snapshot_put - Put an i915_vma_snapshot pointer reference
- * @vsnap: The pointer reference
- */
-void i915_vma_snapshot_put(struct i915_vma_snapshot *vsnap)
-{
-	kref_put(&vsnap->kref, vma_snapshot_release);
-}
-
-/**
- * i915_vma_snapshot_put_onstack - Put an onstcak i915_vma_snapshot pointer
- * reference and varify that the structure is released
- * @vsnap: The pointer reference
- *
- * This function is intended to be paired with a i915_vma_init_onstack()
- * and should be called before exiting the scope that declared or
- * freeing the structure that embedded @vsnap to verify that all references
- * have been released.
- */
-void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap)
-{
-	if (!kref_put(&vsnap->kref, vma_snapshot_release))
-		GEM_BUG_ON(1);
-}
-
-/**
- * i915_vma_snapshot_resource_pin - Temporarily block the memory the
- * vma snapshot is pointing to from being released.
- * @vsnap: The vma snapshot.
- * @lockdep_cookie: Pointer to bool needed for lockdep support. This needs
- * to be passed to the paired i915_vma_snapshot_resource_unpin.
- *
- * This function will temporarily try to hold up a fence or similar structure
- * and will therefore enter a fence signaling critical section.
- *
- * Return: true if we succeeded in blocking the memory from being released,
- * false otherwise.
- */
-bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
-				    bool *lockdep_cookie)
-{
-	return i915_vma_resource_hold(vsnap->vma_resource, lockdep_cookie);
-}
-
-/**
- * i915_vma_snapshot_resource_unpin - Unblock vma snapshot memory from
- * being released.
- * @vsnap: The vma snapshot.
- * @lockdep_cookie: Cookie returned from matching i915_vma_resource_pin().
- *
- * Might leave a fence signalling critical section and signal a fence.
- */
-void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
-				      bool lockdep_cookie)
-{
-	i915_vma_resource_unhold(vsnap->vma_resource, lockdep_cookie);
-}
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
deleted file mode 100644
index 1b08ce9f8576..000000000000
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.h
+++ /dev/null
@@ -1,101 +0,0 @@
-/* SPDX-License-Identifier: MIT */
-/*
- * Copyright © 2021 Intel Corporation
- */
-#ifndef _I915_VMA_SNAPSHOT_H_
-#define _I915_VMA_SNAPSHOT_H_
-
-#include <linux/kref.h>
-#include <linux/slab.h>
-#include <linux/types.h>
-
-struct i915_active;
-struct i915_refct_sgt;
-struct i915_vma;
-struct intel_memory_region;
-struct sg_table;
-
-/**
- * DOC: Simple utilities for snapshotting GPU vma metadata, later used for
- * error capture. Vi use a separate header for this to avoid issues due to
- * recursive header includes.
- */
-
-/**
- * struct i915_vma_snapshot - Snapshot of vma metadata.
- * @obj_size: The size of the underlying object in bytes.
- * @pages: The struct sg_table pointing to the pages bound.
- * @pages_rsgt: The refcounted sg_table holding the reference for @pages if any.
- * @mr: The memory region pointed for the pages bound.
- * @kref: Reference for this structure.
- * @vma_resource: Pointer to the vma resource representing the vma binding.
- * @onstack: Whether the structure shouldn't be freed on final put.
- * @present: Whether the structure is present and initialized.
- */
-struct i915_vma_snapshot {
-	const char *name;
-	size_t obj_size;
-	struct sg_table *pages;
-	struct i915_refct_sgt *pages_rsgt;
-	struct intel_memory_region *mr;
-	struct kref kref;
-	struct i915_vma_resource *vma_resource;
-	bool onstack:1;
-	bool present:1;
-};
-
-void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
-			    struct i915_vma *vma,
-			    const char *name);
-
-void i915_vma_snapshot_init_onstack(struct i915_vma_snapshot *vsnap,
-				    struct i915_vma *vma,
-				    const char *name);
-
-void i915_vma_snapshot_put(struct i915_vma_snapshot *vsnap);
-
-void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap);
-
-bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
-				    bool *lockdep_cookie);
-
-void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
-				      bool lockdep_cookie);
-
-/**
- * i915_vma_snapshot_alloc - Allocate a struct i915_vma_snapshot
- * @gfp: Allocation mode.
- *
- * Return: A pointer to a struct i915_vma_snapshot if successful.
- * NULL otherwise.
- */
-static inline struct i915_vma_snapshot *i915_vma_snapshot_alloc(gfp_t gfp)
-{
-	return kmalloc(sizeof(struct i915_vma_snapshot), gfp);
-}
-
-/**
- * i915_vma_snapshot_get - Take a reference on a struct i915_vma_snapshot
- *
- * Return: A pointer to a struct i915_vma_snapshot.
- */
-static inline struct i915_vma_snapshot *
-i915_vma_snapshot_get(struct i915_vma_snapshot *vsnap)
-{
-	kref_get(&vsnap->kref);
-	return vsnap;
-}
-
-/**
- * i915_vma_snapshot_present - Whether a struct i915_vma_snapshot is
- * present and initialized.
- *
- * Return: true if present and initialized; false otherwise.
- */
-static inline bool
-i915_vma_snapshot_present(const struct i915_vma_snapshot *vsnap)
-{
-	return vsnap && vsnap->present;
-}
-
-#endif
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Intel-gfx] [PATCH v5 6/6] drm/i915: Use struct vma_resource instead of struct vma_snapshot
@ 2022-01-04 12:51   ` Thomas Hellström
  0 siblings, 0 replies; 32+ messages in thread
From: Thomas Hellström @ 2022-01-04 12:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

There is always a struct vma_resource guaranteed to be alive when we
access a corresponding struct vma_snapshot.

So ditch the latter, and instead of allocating vma_snapshots, reference
the already existing vma_resource.

This requires a couple of extra members in struct vma_resource but that's
a small price to pay for the simplification.

v2:
- Fix a missing include and declaration (kernel test robot <lkp@intel.com>)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 -
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  15 +--
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   9 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  87 ++++++------
 drivers/gpu/drm/i915/i915_request.c           |  12 +-
 drivers/gpu/drm/i915/i915_request.h           |   6 +-
 drivers/gpu/drm/i915/i915_vma.c               |  16 +--
 drivers/gpu/drm/i915/i915_vma_resource.c      |   4 +
 drivers/gpu/drm/i915/i915_vma_resource.h      |  28 +++-
 drivers/gpu/drm/i915/i915_vma_snapshot.c      | 125 ------------------
 drivers/gpu/drm/i915/i915_vma_snapshot.h      | 101 --------------
 11 files changed, 90 insertions(+), 314 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.c
 delete mode 100644 drivers/gpu/drm/i915/i915_vma_snapshot.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 98433ad74194..aa86ac33effc 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -175,7 +175,6 @@ i915-y += \
 	  i915_ttm_buddy_manager.o \
 	  i915_vma.o \
 	  i915_vma_resource.o \
-	  i915_vma_snapshot.o \
 	  intel_wopcm.o
 
 # general-purpose microcontroller (GuC) support
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 72e497745c12..2f85fe557ad2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -29,7 +29,6 @@
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
-#include "i915_vma_snapshot.h"
 
 struct eb_vma {
 	struct i915_vma *vma;
@@ -1952,7 +1951,6 @@ static void eb_capture_stage(struct i915_execbuffer *eb)
 {
 	const unsigned int count = eb->buffer_count;
 	unsigned int i = count, j;
-	struct i915_vma_snapshot *vsnap;
 
 	while (i--) {
 		struct eb_vma *ev = &eb->vma[i];
@@ -1962,11 +1960,6 @@ static void eb_capture_stage(struct i915_execbuffer *eb)
 		if (!(flags & EXEC_OBJECT_CAPTURE))
 			continue;
 
-		vsnap = i915_vma_snapshot_alloc(GFP_KERNEL);
-		if (!vsnap)
-			continue;
-
-		i915_vma_snapshot_init(vsnap, vma, "user");
 		for_each_batch_create_order(eb, j) {
 			struct i915_capture_list *capture;
 
@@ -1975,10 +1968,9 @@ static void eb_capture_stage(struct i915_execbuffer *eb)
 				continue;
 
 			capture->next = eb->capture_lists[j];
-			capture->vma_snapshot = i915_vma_snapshot_get(vsnap);
+			capture->vma_res = i915_vma_resource_get(vma->resource);
 			eb->capture_lists[j] = capture;
 		}
-		i915_vma_snapshot_put(vsnap);
 	}
 }
 
@@ -3281,9 +3273,8 @@ eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence,
 		 * _onstack interface.
 		 */
 		if (eb->batches[i]->vma)
-			i915_vma_snapshot_init_onstack(&eb->requests[i]->batch_snapshot,
-						       eb->batches[i]->vma,
-						       "batch");
+			eb->requests[i]->batch_res =
+				i915_vma_resource_get(eb->batches[i]->vma->resource);
 		if (eb->batch_pool) {
 			GEM_BUG_ON(intel_context_is_parallel(eb->context));
 			intel_gt_buffer_pool_mark_active(eb->batch_pool,
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 74aa90587061..d1daa4cc2895 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1708,18 +1708,15 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
 
 static void print_request_ring(struct drm_printer *m, struct i915_request *rq)
 {
-	struct i915_vma_snapshot *vsnap = &rq->batch_snapshot;
+	struct i915_vma_resource *vma_res = rq->batch_res;
 	void *ring;
 	int size;
 
-	if (!i915_vma_snapshot_present(vsnap))
-		vsnap = NULL;
-
 	drm_printf(m,
 		   "[head %04x, postfix %04x, tail %04x, batch 0x%08x_%08x]:\n",
 		   rq->head, rq->postfix, rq->tail,
-		   vsnap ? upper_32_bits(vsnap->vma_resource->start) : ~0u,
-		   vsnap ? lower_32_bits(vsnap->vma_resource->start) : ~0u);
+		   vma_res ? upper_32_bits(vma_res->start) : ~0u,
+		   vma_res ? lower_32_bits(vma_res->start) : ~0u);
 
 	size = rq->tail - rq->head;
 	if (rq->tail < rq->head)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 1af54ff374f9..f8c4336cba89 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -48,7 +48,6 @@
 #include "i915_gpu_error.h"
 #include "i915_memcpy.h"
 #include "i915_scatterlist.h"
-#include "i915_vma_snapshot.h"
 
 #define ALLOW_FAIL (__GFP_KSWAPD_RECLAIM | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
 #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)
@@ -1013,8 +1012,10 @@ void __i915_gpu_coredump_free(struct kref *error_ref)
 
 static struct i915_vma_coredump *
 i915_vma_coredump_create(const struct intel_gt *gt,
-			 const struct i915_vma_snapshot *vsnap,
-			 struct i915_vma_compress *compress)
+			 const struct i915_vma_resource *vma_res,
+			 struct i915_vma_compress *compress,
+			 const char *name)
+
 {
 	struct i915_ggtt *ggtt = gt->ggtt;
 	const u64 slot = ggtt->error_capture.start;
@@ -1024,7 +1025,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 
 	might_sleep();
 
-	if (!vsnap || !vsnap->pages || !compress)
+	if (!vma_res || !vma_res->bi.pages || !compress)
 		return NULL;
 
 	dst = kmalloc(sizeof(*dst), ALLOW_FAIL);
@@ -1037,12 +1038,12 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	}
 
 	INIT_LIST_HEAD(&dst->page_list);
-	strcpy(dst->name, vsnap->name);
+	strcpy(dst->name, name);
 	dst->next = NULL;
 
-	dst->gtt_offset = vsnap->vma_resource->start;
-	dst->gtt_size = vsnap->vma_resource->node_size;
-	dst->gtt_page_sizes = vsnap->vma_resource->page_sizes_gtt;
+	dst->gtt_offset = vma_res->start;
+	dst->gtt_size = vma_res->node_size;
+	dst->gtt_page_sizes = vma_res->page_sizes_gtt;
 	dst->unused = 0;
 
 	ret = -EINVAL;
@@ -1050,7 +1051,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 		void __iomem *s;
 		dma_addr_t dma;
 
-		for_each_sgt_daddr(dma, iter, vsnap->pages) {
+		for_each_sgt_daddr(dma, iter, vma_res->bi.pages) {
 			mutex_lock(&ggtt->error_mutex);
 			ggtt->vm.insert_page(&ggtt->vm, dma, slot,
 					     I915_CACHE_NONE, 0);
@@ -1068,11 +1069,11 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 			if (ret)
 				break;
 		}
-	} else if (vsnap->mr && vsnap->mr->type != INTEL_MEMORY_SYSTEM) {
-		struct intel_memory_region *mem = vsnap->mr;
+	} else if (vma_res->bi.lmem) {
+		struct intel_memory_region *mem = vma_res->mr;
 		dma_addr_t dma;
 
-		for_each_sgt_daddr(dma, iter, vsnap->pages) {
+		for_each_sgt_daddr(dma, iter, vma_res->bi.pages) {
 			void __iomem *s;
 
 			s = io_mapping_map_wc(&mem->iomap,
@@ -1088,7 +1089,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
 	} else {
 		struct page *page;
 
-		for_each_sgt_page(page, iter, vsnap->pages) {
+		for_each_sgt_page(page, iter, vma_res->bi.pages) {
 			void *s;
 
 			drm_clflush_pages(&page, 1);
@@ -1324,33 +1325,32 @@ static bool record_context(struct i915_gem_context_coredump *e,
 
 struct intel_engine_capture_vma {
 	struct intel_engine_capture_vma *next;
-	struct i915_vma_snapshot *vsnap;
+	struct i915_vma_resource *vma_res;
 	char name[16];
 	bool lockdep_cookie;
 };
 
 static struct intel_engine_capture_vma *
 capture_vma_snapshot(struct intel_engine_capture_vma *next,
-		     struct i915_vma_snapshot *vsnap,
-		     gfp_t gfp)
+		     struct i915_vma_resource *vma_res,
+		     gfp_t gfp, const char *name)
 {
 	struct intel_engine_capture_vma *c;
 
-	if (!i915_vma_snapshot_present(vsnap))
+	if (!vma_res)
 		return next;
 
 	c = kmalloc(sizeof(*c), gfp);
 	if (!c)
 		return next;
 
-	if (!i915_vma_snapshot_resource_pin(vsnap, &c->lockdep_cookie)) {
+	if (!i915_vma_resource_hold(vma_res, &c->lockdep_cookie)) {
 		kfree(c);
 		return next;
 	}
 
-	strcpy(c->name, vsnap->name);
-	c->vsnap = vsnap;
-	i915_vma_snapshot_get(vsnap);
+	strcpy(c->name, name);
+	c->vma_res = i915_vma_resource_get(vma_res);
 
 	c->next = next;
 	return c;
@@ -1362,8 +1362,6 @@ capture_vma(struct intel_engine_capture_vma *next,
 	    const char *name,
 	    gfp_t gfp)
 {
-	struct i915_vma_snapshot *vsnap;
-
 	if (!vma)
 		return next;
 
@@ -1372,19 +1370,10 @@ capture_vma(struct intel_engine_capture_vma *next,
 	 * to a struct i915_vma_snapshot at command submission time.
 	 * Not here.
 	 */
-	GEM_WARN_ON(!i915_vma_is_pinned(vma));
-	if (!i915_vma_is_pinned(vma))
-		return next;
-
-	vsnap = i915_vma_snapshot_alloc(gfp);
-	if (!vsnap)
+	if (GEM_WARN_ON(!i915_vma_is_pinned(vma)))
 		return next;
 
-	i915_vma_snapshot_init(vsnap, vma, name);
-	next = capture_vma_snapshot(next, vsnap, gfp);
-
-	/* FIXME: Replace on async unbind. */
-	i915_vma_snapshot_put(vsnap);
+	next = capture_vma_snapshot(next, vma->resource, gfp, name);
 
 	return next;
 }
@@ -1397,7 +1386,8 @@ capture_user(struct intel_engine_capture_vma *capture,
 	struct i915_capture_list *c;
 
 	for (c = rq->capture_list; c; c = c->next)
-		capture = capture_vma_snapshot(capture, c->vma_snapshot, gfp);
+		capture = capture_vma_snapshot(capture, c->vma_res, gfp,
+					       "user");
 
 	return capture;
 }
@@ -1415,16 +1405,19 @@ static struct i915_vma_coredump *
 create_vma_coredump(const struct intel_gt *gt, struct i915_vma *vma,
 		    const char *name, struct i915_vma_compress *compress)
 {
-	struct i915_vma_coredump *ret;
-	struct i915_vma_snapshot tmp;
+	struct i915_vma_coredump *ret = NULL;
+	struct i915_vma_resource *vma_res;
+	bool lockdep_cookie;
 
 	if (!vma)
 		return NULL;
 
-	GEM_WARN_ON(!i915_vma_is_pinned(vma));
-	i915_vma_snapshot_init_onstack(&tmp, vma, name);
-	ret = i915_vma_coredump_create(gt, &tmp, compress);
-	i915_vma_snapshot_put_onstack(&tmp);
+	vma_res = vma->resource;
+
+	if (i915_vma_resource_hold(vma_res, &lockdep_cookie)) {
+		ret = i915_vma_coredump_create(gt, vma_res, compress, name);
+		i915_vma_resource_unhold(vma_res, lockdep_cookie);
+	}
 
 	return ret;
 }
@@ -1471,7 +1464,7 @@ intel_engine_coredump_add_request(struct intel_engine_coredump *ee,
 	 * as the simplest method to avoid being overwritten
 	 * by userspace.
 	 */
-	vma = capture_vma_snapshot(vma, &rq->batch_snapshot, gfp);
+	vma = capture_vma_snapshot(vma, rq->batch_res, gfp, "batch");
 	vma = capture_user(vma, rq, gfp);
 	vma = capture_vma(vma, rq->ring->vma, "ring", gfp);
 	vma = capture_vma(vma, rq->context->state, "HW context", gfp);
@@ -1492,14 +1485,14 @@ intel_engine_coredump_add_vma(struct intel_engine_coredump *ee,
 
 	while (capture) {
 		struct intel_engine_capture_vma *this = capture;
-		struct i915_vma_snapshot *vsnap = this->vsnap;
+		struct i915_vma_resource *vma_res = this->vma_res;
 
 		add_vma(ee,
-			i915_vma_coredump_create(engine->gt,
-						 vsnap, compress));
+			i915_vma_coredump_create(engine->gt, vma_res,
+						 compress, this->name));
 
-		i915_vma_snapshot_resource_unpin(vsnap, this->lockdep_cookie);
-		i915_vma_snapshot_put(vsnap);
+		i915_vma_resource_unhold(vma_res, this->lockdep_cookie);
+		i915_vma_resource_put(vma_res);
 
 		capture = this->next;
 		kfree(this);
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 76cf5ac91e94..ba3a70b2cc57 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -116,8 +116,10 @@ static void i915_fence_release(struct dma_fence *fence)
 		   rq->guc_prio != GUC_PRIO_FINI);
 
 	i915_request_free_capture_list(fetch_and_zero(&rq->capture_list));
-	if (i915_vma_snapshot_present(&rq->batch_snapshot))
-		i915_vma_snapshot_put_onstack(&rq->batch_snapshot);
+	if (rq->batch_res) {
+		i915_vma_resource_put(rq->batch_res);
+		rq->batch_res = NULL;
+	}
 
 	/*
 	 * The request is put onto a RCU freelist (i.e. the address
@@ -308,7 +310,7 @@ void i915_request_free_capture_list(struct i915_capture_list *capture)
 	while (capture) {
 		struct i915_capture_list *next = capture->next;
 
-		i915_vma_snapshot_put(capture->vma_snapshot);
+		i915_vma_resource_put(capture->vma_res);
 		kfree(capture);
 		capture = next;
 	}
@@ -854,7 +856,7 @@ static void __i915_request_ctor(void *arg)
 	i915_sw_fence_init(&rq->semaphore, semaphore_notify);
 
 	clear_capture_list(rq);
-	rq->batch_snapshot.present = false;
+	rq->batch_res = NULL;
 
 	init_llist_head(&rq->execute_cb);
 }
@@ -960,7 +962,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 	__rq_init_watchdog(rq);
 	assert_capture_list_is_null(rq);
 	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
-	GEM_BUG_ON(i915_vma_snapshot_present(&rq->batch_snapshot));
+	GEM_BUG_ON(rq->batch_res);
 
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 170ee78c2858..28b1f9db5487 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -40,7 +40,7 @@
 #include "i915_scheduler.h"
 #include "i915_selftest.h"
 #include "i915_sw_fence.h"
-#include "i915_vma_snapshot.h"
+#include "i915_vma_resource.h"
 
 #include <uapi/drm/i915_drm.h>
 
@@ -52,7 +52,7 @@ struct i915_request;
 
 #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
 struct i915_capture_list {
-	struct i915_vma_snapshot *vma_snapshot;
+	struct i915_vma_resource *vma_res;
 	struct i915_capture_list *next;
 };
 
@@ -300,7 +300,7 @@ struct i915_request {
 	/** Batch buffer pointer for selftest internal use. */
 	I915_SELFTEST_DECLARE(struct i915_vma *batch);
 
-	struct i915_vma_snapshot batch_snapshot;
+	struct i915_vma_resource *batch_res;
 
 #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
 	/**
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index b886fe649e5c..18cb7a70cf03 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -288,7 +288,6 @@ struct i915_vma_work {
 	struct i915_vma_resource *vma_res;
 	struct drm_i915_gem_object *pinned;
 	struct i915_sw_dma_fence_cb cb;
-	struct i915_refct_sgt *rsgt;
 	enum i915_cache_level cache_level;
 	unsigned int flags;
 };
@@ -314,8 +313,6 @@ static void __vma_release(struct dma_fence_work *work)
 	i915_vm_put(vw->vm);
 	if (vw->vma_res)
 		i915_vma_resource_put(vw->vma_res);
-	if (vw->rsgt)
-		i915_refct_sgt_put(vw->rsgt);
 }
 
 static const struct dma_fence_work_ops bind_ops = {
@@ -386,8 +383,8 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
 	struct drm_i915_gem_object *obj = vma->obj;
 
 	i915_vma_resource_init(vma_res, vma->vm, vma->pages, &vma->page_sizes,
-			       i915_gem_object_is_readonly(obj),
-			       i915_gem_object_is_lmem(obj),
+			       obj->mm.rsgt, i915_gem_object_is_readonly(obj),
+			       i915_gem_object_is_lmem(obj), obj->mm.region,
 			       vma->ops, vma->private, vma->node.start,
 			       vma->node.size, vma->size);
 }
@@ -478,8 +475,6 @@ int i915_vma_bind(struct i915_vma *vma,
 		work->vma_res = i915_vma_resource_get(vma->resource);
 		work->cache_level = cache_level;
 		work->flags = bind_flags;
-		if (vma->obj->mm.rsgt)
-			work->rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
 
 		/*
 		 * Note we only want to chain up to the migration fence on
@@ -505,7 +500,7 @@ int i915_vma_bind(struct i915_vma *vma,
 		 * on the object to avoid waiting for the async bind to
 		 * complete in the object destruction path.
 		 */
-		if (!work->rsgt)
+		if (!work->vma_res->bi.pages_rsgt)
 			work->pinned = i915_gem_object_get(vma->obj);
 	} else {
 		if (vma->obj) {
@@ -1826,7 +1821,7 @@ struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async)
 	GEM_BUG_ON(i915_vma_has_userfault(vma));
 
 	/* Object backend must be async capable. */
-	GEM_WARN_ON(async && !vma->obj->mm.rsgt);
+	GEM_WARN_ON(async && !vma->resource->bi.pages_rsgt);
 
 	/* If vm is not open, unbind is a nop. */
 	vma_res->needs_wakeref = i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND) &&
@@ -1839,9 +1834,6 @@ struct dma_fence *__i915_vma_evict(struct i915_vma *vma, bool async)
 	atomic_and(~(I915_VMA_BIND_MASK | I915_VMA_ERROR | I915_VMA_GGTT_WRITE),
 		   &vma->flags);
 
-	/* Object backend must be async capable. */
-	GEM_WARN_ON(async && !vma->obj->mm.rsgt);
-
 	i915_vma_detach(vma);
 
 	if (!async && unbind_fence) {
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.c b/drivers/gpu/drm/i915/i915_vma_resource.c
index 3dfb3c6731f8..0d8da44eccd2 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.c
+++ b/drivers/gpu/drm/i915/i915_vma_resource.c
@@ -9,6 +9,7 @@
 #include "i915_sw_fence.h"
 #include "i915_vma_resource.h"
 #include "i915_drv.h"
+#include "intel_memory_region.h"
 
 #include "gt/intel_gtt.h"
 
@@ -117,6 +118,9 @@ static void __i915_vma_resource_unhold(struct i915_vma_resource *vma_res)
 		vma_res_itree_remove(vma_res, &vm->pending_unbind);
 		mutex_unlock(&vm->mutex);
 	}
+
+	if (vma_res->bi.pages_rsgt)
+		i915_refct_sgt_put(vma_res->bi.pages_rsgt);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/i915_vma_resource.h b/drivers/gpu/drm/i915/i915_vma_resource.h
index a89537e83c70..faecdf3e7eca 100644
--- a/drivers/gpu/drm/i915/i915_vma_resource.h
+++ b/drivers/gpu/drm/i915/i915_vma_resource.h
@@ -10,9 +10,12 @@
 #include <linux/refcount.h>
 
 #include "i915_gem.h"
+#include "i915_scatterlist.h"
 #include "i915_sw_fence.h"
 #include "intel_runtime_pm.h"
 
+struct intel_memory_region;
+
 struct i915_page_sizes {
 	/**
 	 * The sg mask of the pages sg_table. i.e the mask of
@@ -47,6 +50,7 @@ struct i915_page_sizes {
  * @__subtree_last: Interval tree private member.
  * @vm: non-refcounted pointer to the vm. This is for internal use only and
  * this member is cleared after vm_resource unbind.
+ * @mr: The memory region of the object pointed to by the vma.
  * @ops: Pointer to the backend i915_vma_ops.
  * @private: Bind backend private info.
  * @start: Offset into the address space of bind range start.
@@ -55,8 +59,10 @@ struct i915_page_sizes {
  * @page_sizes_gtt: Resulting page sizes from the bind operation.
  * @bound_flags: Flags indicating binding status.
  * @allocated: Backend private data. TODO: Should move into @private.
- * @immediate_unbind: Unbind can be done immediately and don't need to be
- * deferred to a work item awaiting unsignaled fences.
+ * @immediate_unbind: Unbind can be done immediately and doesn't need to be
+ * deferred to a work item awaiting unsignaled fences. This is a hack.
+ * (dma_fence_work uses a fence flag for this, but this seems slightly
+ * cleaner).
  *
  * The lifetime of a struct i915_vma_resource is from a binding request to
  * the actual possible asynchronous unbind has completed.
@@ -81,16 +87,22 @@ struct i915_vma_resource {
 	 * and flags
 	 * @pages: The pages sg-table.
 	 * @page_sizes: Page sizes of the pages.
+	 * @pages_rsgt: Refcounted sg-table when delayed object destruction
+	 * is supported. May be NULL.
 	 * @readonly: Whether the vma should be bound read-only.
 	 * @lmem: Whether the vma points to lmem.
 	 */
 	struct i915_vma_bindinfo {
 		struct sg_table *pages;
 		struct i915_page_sizes page_sizes;
+		struct i915_refct_sgt *pages_rsgt;
 		bool readonly:1;
 		bool lmem:1;
 	} bi;
 
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
+	struct intel_memory_region *mr;
+#endif
 	const struct i915_vma_ops *ops;
 	void *private;
 	unsigned long start;
@@ -146,8 +158,11 @@ static inline void i915_vma_resource_put(struct i915_vma_resource *vma_res)
  * @vm: Pointer to the vm.
  * @pages: The pages sg-table.
  * @page_sizes: Page sizes of the pages.
+ * @pages_rsgt: Pointer to a struct i915_refct_sgt of an object with
+ * delayed destruction.
  * @readonly: Whether the vma should be bound read-only.
  * @lmem: Whether the vma points to lmem.
+ * @mr: The memory region of the object the vma points to.
  * @ops: The backend ops.
  * @private: Bind backend private info.
  * @start: Offset into the address space of bind range start.
@@ -163,8 +178,10 @@ static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
 					  struct i915_address_space *vm,
 					  struct sg_table *pages,
 					  const struct i915_page_sizes *page_sizes,
+					  struct i915_refct_sgt *pages_rsgt,
 					  bool readonly,
 					  bool lmem,
+					  struct intel_memory_region *mr,
 					  const struct i915_vma_ops *ops,
 					  void *private,
 					  unsigned long start,
@@ -175,8 +192,13 @@ static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
 	vma_res->vm = vm;
 	vma_res->bi.pages = pages;
 	vma_res->bi.page_sizes = *page_sizes;
+	if (pages_rsgt)
+		vma_res->bi.pages_rsgt = i915_refct_sgt_get(pages_rsgt);
 	vma_res->bi.readonly = readonly;
 	vma_res->bi.lmem = lmem;
+#if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
+	vma_res->mr = mr;
+#endif
 	vma_res->ops = ops;
 	vma_res->private = private;
 	vma_res->start = start;
@@ -187,6 +209,8 @@ static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
 static inline void i915_vma_resource_fini(struct i915_vma_resource *vma_res)
 {
 	GEM_BUG_ON(refcount_read(&vma_res->hold_count) != 1);
+	if (vma_res->bi.pages_rsgt)
+		i915_refct_sgt_put(vma_res->bi.pages_rsgt);
 	i915_sw_fence_fini(&vma_res->chain);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
deleted file mode 100644
index 69f62c1ca967..000000000000
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.c
+++ /dev/null
@@ -1,125 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2021 Intel Corporation
- */
-
-#include "i915_vma_resource.h"
-#include "i915_vma_snapshot.h"
-#include "i915_vma_types.h"
-#include "i915_vma.h"
-
-/**
- * i915_vma_snapshot_init - Initialize a struct i915_vma_snapshot from
- * a struct i915_vma.
- * @vsnap: The i915_vma_snapshot to init.
- * @vma: A struct i915_vma used to initialize @vsnap.
- * @name: Name associated with the snapshot. The character pointer needs to
- * stay alive over the lifitime of the shapsot
- */
-void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
-			    struct i915_vma *vma,
-			    const char *name)
-{
-	if (!i915_vma_is_pinned(vma))
-		assert_object_held(vma->obj);
-
-	vsnap->name = name;
-	vsnap->obj_size = vma->obj->base.size;
-	vsnap->pages = vma->pages;
-	vsnap->pages_rsgt = NULL;
-	vsnap->mr = NULL;
-	if (vma->obj->mm.rsgt)
-		vsnap->pages_rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
-	vsnap->mr = vma->obj->mm.region;
-	kref_init(&vsnap->kref);
-	vsnap->vma_resource = i915_vma_get_current_resource(vma);
-	vsnap->onstack = false;
-	vsnap->present = true;
-}
-
-/**
- * i915_vma_snapshot_init_onstack - Initialize a struct i915_vma_snapshot from
- * a struct i915_vma, but avoid kfreeing it on last put.
- * @vsnap: The i915_vma_snapshot to init.
- * @vma: A struct i915_vma used to initialize @vsnap.
- * @name: Name associated with the snapshot. The character pointer needs to
- * stay alive over the lifitime of the shapsot
- */
-void i915_vma_snapshot_init_onstack(struct i915_vma_snapshot *vsnap,
-				    struct i915_vma *vma,
-				    const char *name)
-{
-	i915_vma_snapshot_init(vsnap, vma, name);
-	vsnap->onstack = true;
-}
-
-static void vma_snapshot_release(struct kref *ref)
-{
-	struct i915_vma_snapshot *vsnap =
-		container_of(ref, typeof(*vsnap), kref);
-
-	vsnap->present = false;
-	i915_vma_resource_put(vsnap->vma_resource);
-	if (vsnap->pages_rsgt)
-		i915_refct_sgt_put(vsnap->pages_rsgt);
-	if (!vsnap->onstack)
-		kfree(vsnap);
-}
-
-/**
- * i915_vma_snapshot_put - Put an i915_vma_snapshot pointer reference
- * @vsnap: The pointer reference
- */
-void i915_vma_snapshot_put(struct i915_vma_snapshot *vsnap)
-{
-	kref_put(&vsnap->kref, vma_snapshot_release);
-}
-
-/**
- * i915_vma_snapshot_put_onstack - Put an onstcak i915_vma_snapshot pointer
- * reference and varify that the structure is released
- * @vsnap: The pointer reference
- *
- * This function is intended to be paired with a i915_vma_init_onstack()
- * and should be called before exiting the scope that declared or
- * freeing the structure that embedded @vsnap to verify that all references
- * have been released.
- */
-void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap)
-{
-	if (!kref_put(&vsnap->kref, vma_snapshot_release))
-		GEM_BUG_ON(1);
-}
-
-/**
- * i915_vma_snapshot_resource_pin - Temporarily block the memory the
- * vma snapshot is pointing to from being released.
- * @vsnap: The vma snapshot.
- * @lockdep_cookie: Pointer to bool needed for lockdep support. This needs
- * to be passed to the paired i915_vma_snapshot_resource_unpin.
- *
- * This function will temporarily try to hold up a fence or similar structure
- * and will therefore enter a fence signaling critical section.
- *
- * Return: true if we succeeded in blocking the memory from being released,
- * false otherwise.
- */
-bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
-				    bool *lockdep_cookie)
-{
-	return i915_vma_resource_hold(vsnap->vma_resource, lockdep_cookie);
-}
-
-/**
- * i915_vma_snapshot_resource_unpin - Unblock vma snapshot memory from
- * being released.
- * @vsnap: The vma snapshot.
- * @lockdep_cookie: Cookie returned from matching i915_vma_resource_pin().
- *
- * Might leave a fence signalling critical section and signal a fence.
- */
-void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
-				      bool lockdep_cookie)
-{
-	i915_vma_resource_unhold(vsnap->vma_resource, lockdep_cookie);
-}
diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
deleted file mode 100644
index 1b08ce9f8576..000000000000
--- a/drivers/gpu/drm/i915/i915_vma_snapshot.h
+++ /dev/null
@@ -1,101 +0,0 @@
-/* SPDX-License-Identifier: MIT */
-/*
- * Copyright © 2021 Intel Corporation
- */
-#ifndef _I915_VMA_SNAPSHOT_H_
-#define _I915_VMA_SNAPSHOT_H_
-
-#include <linux/kref.h>
-#include <linux/slab.h>
-#include <linux/types.h>
-
-struct i915_active;
-struct i915_refct_sgt;
-struct i915_vma;
-struct intel_memory_region;
-struct sg_table;
-
-/**
- * DOC: Simple utilities for snapshotting GPU vma metadata, later used for
- * error capture. Vi use a separate header for this to avoid issues due to
- * recursive header includes.
- */
-
-/**
- * struct i915_vma_snapshot - Snapshot of vma metadata.
- * @obj_size: The size of the underlying object in bytes.
- * @pages: The struct sg_table pointing to the pages bound.
- * @pages_rsgt: The refcounted sg_table holding the reference for @pages if any.
- * @mr: The memory region pointed for the pages bound.
- * @kref: Reference for this structure.
- * @vma_resource: Pointer to the vma resource representing the vma binding.
- * @onstack: Whether the structure shouldn't be freed on final put.
- * @present: Whether the structure is present and initialized.
- */
-struct i915_vma_snapshot {
-	const char *name;
-	size_t obj_size;
-	struct sg_table *pages;
-	struct i915_refct_sgt *pages_rsgt;
-	struct intel_memory_region *mr;
-	struct kref kref;
-	struct i915_vma_resource *vma_resource;
-	bool onstack:1;
-	bool present:1;
-};
-
-void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
-			    struct i915_vma *vma,
-			    const char *name);
-
-void i915_vma_snapshot_init_onstack(struct i915_vma_snapshot *vsnap,
-				    struct i915_vma *vma,
-				    const char *name);
-
-void i915_vma_snapshot_put(struct i915_vma_snapshot *vsnap);
-
-void i915_vma_snapshot_put_onstack(struct i915_vma_snapshot *vsnap);
-
-bool i915_vma_snapshot_resource_pin(struct i915_vma_snapshot *vsnap,
-				    bool *lockdep_cookie);
-
-void i915_vma_snapshot_resource_unpin(struct i915_vma_snapshot *vsnap,
-				      bool lockdep_cookie);
-
-/**
- * i915_vma_snapshot_alloc - Allocate a struct i915_vma_snapshot
- * @gfp: Allocation mode.
- *
- * Return: A pointer to a struct i915_vma_snapshot if successful.
- * NULL otherwise.
- */
-static inline struct i915_vma_snapshot *i915_vma_snapshot_alloc(gfp_t gfp)
-{
-	return kmalloc(sizeof(struct i915_vma_snapshot), gfp);
-}
-
-/**
- * i915_vma_snapshot_get - Take a reference on a struct i915_vma_snapshot
- *
- * Return: A pointer to a struct i915_vma_snapshot.
- */
-static inline struct i915_vma_snapshot *
-i915_vma_snapshot_get(struct i915_vma_snapshot *vsnap)
-{
-	kref_get(&vsnap->kref);
-	return vsnap;
-}
-
-/**
- * i915_vma_snapshot_present - Whether a struct i915_vma_snapshot is
- * present and initialized.
- *
- * Return: true if present and initialized; false otherwise.
- */
-static inline bool
-i915_vma_snapshot_present(const struct i915_vma_snapshot *vsnap)
-{
-	return vsnap && vsnap->present;
-}
-
-#endif
-- 
2.31.1
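
For readers skimming the conversion, the error-capture paths in this patch now follow a common two-phase pattern: hold the vma resource (blocking the unbind fence so the backing pages stay valid) and take a reference at capture time, then write the coredump, unhold and drop the reference afterwards. Below is a minimal sketch of that pattern, assuming the i915_vma_resource_hold()/_unhold() and _get()/_put() helpers introduced earlier in this series; the helper names grab_vma_resource() and emit_vma_dump() are illustrative only and are not part of the patch.

/*
 * Minimal sketch of the two-phase capture pattern used above. The helper
 * names are made up; everything else assumes the i915_vma_resource
 * interface introduced earlier in this series.
 */

/* Phase 1, at request-capture time: pin the pages and take a reference. */
static struct i915_vma_resource *
grab_vma_resource(struct i915_vma_resource *vma_res, bool *lockdep_cookie)
{
	if (!vma_res || !i915_vma_resource_hold(vma_res, lockdep_cookie))
		return NULL;

	/* Keep the resource itself alive until the dump has been written. */
	return i915_vma_resource_get(vma_res);
}

/* Phase 2, when writing the coredump: dump, then unhold and drop the ref. */
static void
emit_vma_dump(struct intel_engine_coredump *ee, const struct intel_gt *gt,
	      struct i915_vma_resource *vma_res, bool lockdep_cookie,
	      const char *name, struct i915_vma_compress *compress)
{
	add_vma(ee, i915_vma_coredump_create(gt, vma_res, compress, name));
	i915_vma_resource_unhold(vma_res, lockdep_cookie);
	i915_vma_resource_put(vma_res);
}

The hold keeps the memory the resource points to from being released (a fence-signalling critical section), while the get/put pair keeps the struct i915_vma_resource itself alive, replacing the separate refcounted snapshot object.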


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Asynchronous vma unbinding (rev5)
  2022-01-04 12:51 ` [Intel-gfx] " Thomas Hellström
                   ` (6 preceding siblings ...)
  (?)
@ 2022-01-04 15:56 ` Patchwork
  -1 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2022-01-04 15:56 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Asynchronous vma unbinding (rev5)
URL   : https://patchwork.freedesktop.org/series/98055/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
c8d84998c84d drm/i915: Initial introduction of vma resources
-:245: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#245: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 626 lines checked
353826018a53 drm/i915: Use the vma resource as argument for gtt binding / unbinding
f6a70ef9f9d6 drm/i915: Don't pin the object pages during pending vma binds
9e4353b6105e drm/i915: Use vma resources for async unbinding
-:588: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_node' - possible side-effects?
#588: FILE: drivers/gpu/drm/i915/i915_vma_resource.c:37:
+#define VMA_RES_LAST(_node) ((_node)->start + (_node)->node_size - 1)

total: 0 errors, 0 warnings, 1 checks, 940 lines checked
f109b0968e2a drm/i915: Asynchronous migration selftest
ffcebda37217 drm/i915: Use struct vma_resource instead of struct vma_snapshot
-:613: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#613: 
deleted file mode 100644

total: 0 errors, 1 warnings, 0 checks, 507 lines checked
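
The single CHECK above refers to the VMA_RES_LAST() macro quoted in the report: because _node appears twice in the expansion, an argument with side effects would be evaluated twice. A hypothetical illustration follows; the "res" variable is made up, and the series itself only ever passes a plain pointer, so the warning is benign here.

/* The macro checkpatch flags, quoted from i915_vma_resource.c: */
#define VMA_RES_LAST(_node) ((_node)->start + (_node)->node_size - 1)

/*
 * Hypothetical misuse (not from the series): both expansions of _node
 * increment res, which is unsequenced (undefined behaviour) and at best
 * advances the pointer by two elements instead of one.
 */
struct i915_vma_resource *res = first_pending_resource; /* made-up variable */
u64 last = VMA_RES_LAST(res++);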



^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915: Asynchronous vma unbinding (rev5)
  2022-01-04 12:51 ` [Intel-gfx] " Thomas Hellström
                   ` (7 preceding siblings ...)
  (?)
@ 2022-01-04 15:58 ` Patchwork
  -1 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2022-01-04 15:58 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Asynchronous vma unbinding (rev5)
URL   : https://patchwork.freedesktop.org/series/98055/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.



^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Asynchronous vma unbinding (rev5)
  2022-01-04 12:51 ` [Intel-gfx] " Thomas Hellström
                   ` (8 preceding siblings ...)
  (?)
@ 2022-01-04 16:10 ` Patchwork
  -1 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2022-01-04 16:10 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 4641 bytes --]

== Series Details ==

Series: drm/i915: Asynchronous vma unbinding (rev5)
URL   : https://patchwork.freedesktop.org/series/98055/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_11046 -> Patchwork_21915
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/index.html

Participating hosts (50 -> 37)
------------------------------

  Missing    (13): fi-ilk-m540 bat-dg1-6 bat-dg1-5 fi-hsw-4200u fi-icl-u2 fi-bsw-cyan bat-adlp-6 bat-adlp-4 fi-ctg-p8600 bat-rpls-1 fi-bdw-samus bat-jsl-2 bat-jsl-1 

Known issues
------------

  Here are the changes found in Patchwork_21915 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@query-info:
    - fi-bsw-kefka:       NOTRUN -> [SKIP][1] ([fdo#109271]) +17 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/fi-bsw-kefka/igt@amdgpu/amd_basic@query-info.html

  * igt@amdgpu/amd_prime@amd-to-i915:
    - fi-pnv-d510:        NOTRUN -> [SKIP][2] ([fdo#109271]) +17 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/fi-pnv-d510/igt@amdgpu/amd_prime@amd-to-i915.html

  * igt@debugfs_test@read_all_entries:
    - fi-apl-guc:         [PASS][3] -> [DMESG-WARN][4] ([i915#1610])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/fi-apl-guc/igt@debugfs_test@read_all_entries.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/fi-apl-guc/igt@debugfs_test@read_all_entries.html

  * igt@gem_flink_basic@bad-flink:
    - fi-skl-6600u:       NOTRUN -> [FAIL][5] ([i915#4547])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/fi-skl-6600u/igt@gem_flink_basic@bad-flink.html

  
#### Possible fixes ####

  * igt@gem_exec_suspend@basic-s0@smem:
    - fi-tgl-1115g4:      [FAIL][6] ([i915#1888]) -> [PASS][7]
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/fi-tgl-1115g4/igt@gem_exec_suspend@basic-s0@smem.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/fi-tgl-1115g4/igt@gem_exec_suspend@basic-s0@smem.html

  * igt@gem_exec_suspend@basic-s3@smem:
    - fi-skl-6600u:       [INCOMPLETE][8] ([i915#4547]) -> [PASS][9]
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/fi-skl-6600u/igt@gem_exec_suspend@basic-s3@smem.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/fi-skl-6600u/igt@gem_exec_suspend@basic-s3@smem.html

  * igt@i915_selftest@live@execlists:
    - fi-bsw-kefka:       [INCOMPLETE][10] ([i915#2940]) -> [PASS][11]
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/fi-bsw-kefka/igt@i915_selftest@live@execlists.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/fi-bsw-kefka/igt@i915_selftest@live@execlists.html

  * igt@i915_selftest@live@requests:
    - fi-pnv-d510:        [DMESG-FAIL][12] ([i915#2927] / [i915#4528]) -> [PASS][13]
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/fi-pnv-d510/igt@i915_selftest@live@requests.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/fi-pnv-d510/igt@i915_selftest@live@requests.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1610]: https://gitlab.freedesktop.org/drm/intel/issues/1610
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#2927]: https://gitlab.freedesktop.org/drm/intel/issues/2927
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940
  [i915#4528]: https://gitlab.freedesktop.org/drm/intel/issues/4528
  [i915#4547]: https://gitlab.freedesktop.org/drm/intel/issues/4547


Build changes
-------------

  * Linux: CI_DRM_11046 -> Patchwork_21915

  CI-20190529: 20190529
  CI_DRM_11046: ee55310525cbff1a700aeeaf08f63a0f7b33c521 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6322: b0b7679b358b300b7b6bf42c6921d0aa1fc14388 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_21915: ffcebda37217b7b7aae07717291e96c37cdafe92 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ffcebda37217 drm/i915: Use struct vma_resource instead of struct vma_snapshot
f109b0968e2a drm/i915: Asynchronous migration selftest
9e4353b6105e drm/i915: Use vma resources for async unbinding
f6a70ef9f9d6 drm/i915: Don't pin the object pages during pending vma binds
353826018a53 drm/i915: Use the vma resource as argument for gtt binding / unbinding
c8d84998c84d drm/i915: Initial introduction of vma resources

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/index.html

[-- Attachment #2: Type: text/html, Size: 5602 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Asynchronous vma unbinding (rev5)
  2022-01-04 12:51 ` [Intel-gfx] " Thomas Hellström
                   ` (9 preceding siblings ...)
  (?)
@ 2022-01-04 17:52 ` Patchwork
  -1 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2022-01-04 17:52 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 30266 bytes --]

== Series Details ==

Series: drm/i915: Asynchronous vma unbinding (rev5)
URL   : https://patchwork.freedesktop.org/series/98055/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11046_full -> Patchwork_21915_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_21915_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_21915_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Participating hosts (13 -> 13)
------------------------------

  No changes in participating hosts

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_21915_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_suspend@basic-s3@smem:
    - shard-skl:          [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-skl7/igt@gem_exec_suspend@basic-s3@smem.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl3/igt@gem_exec_suspend@basic-s3@smem.html

  * igt@i915_pm_dc@dc5-dpms:
    - shard-skl:          NOTRUN -> [INCOMPLETE][3]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl7/igt@i915_pm_dc@dc5-dpms.html

  * igt@kms_frontbuffer_tracking@fbc-suspend:
    - shard-snb:          [PASS][4] -> [DMESG-WARN][5]
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-snb6/igt@kms_frontbuffer_tracking@fbc-suspend.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-snb5/igt@kms_frontbuffer_tracking@fbc-suspend.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@drm_import_export@import-close-race-prime:
    - {shard-rkl}:        [PASS][6] -> [INCOMPLETE][7]
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-rkl-6/igt@drm_import_export@import-close-race-prime.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-rkl-5/igt@drm_import_export@import-close-race-prime.html

  * igt@gem_ctx_engines@independent@all:
    - {shard-dg1}:        NOTRUN -> [FAIL][8] +4 similar issues
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-dg1-17/igt@gem_ctx_engines@independent@all.html

  * igt@gem_ctx_persistence@many-contexts:
    - {shard-tglu}:       [PASS][9] -> [INCOMPLETE][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-tglu-1/igt@gem_ctx_persistence@many-contexts.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglu-7/igt@gem_ctx_persistence@many-contexts.html

  * igt@gem_exec_schedule@u-submit-golden-slice@bcs0:
    - {shard-tglu}:       NOTRUN -> [INCOMPLETE][11] +1 similar issue
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglu-7/igt@gem_exec_schedule@u-submit-golden-slice@bcs0.html

  * igt@gem_mmap_gtt@close-race:
    - {shard-dg1}:        NOTRUN -> [SKIP][12] +1 similar issue
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-dg1-17/igt@gem_mmap_gtt@close-race.html

  * igt@kms_big_fb@linear-16bpp-rotate-0:
    - {shard-tglu}:       [PASS][13] -> [DMESG-WARN][14] +3 similar issues
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-tglu-2/igt@kms_big_fb@linear-16bpp-rotate-0.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglu-5/igt@kms_big_fb@linear-16bpp-rotate-0.html

  * igt@kms_ccs@pipe-a-crc-sprite-planes-basic-y_tiled_gen12_rc_ccs:
    - {shard-tglu}:       [PASS][15] -> [FAIL][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-tglu-8/igt@kms_ccs@pipe-a-crc-sprite-planes-basic-y_tiled_gen12_rc_ccs.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglu-1/igt@kms_ccs@pipe-a-crc-sprite-planes-basic-y_tiled_gen12_rc_ccs.html

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-indfb-plflip-blt:
    - {shard-tglu}:       NOTRUN -> [SKIP][17] +36 similar issues
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglu-5/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-indfb-plflip-blt.html

  * igt@runner@aborted:
    - {shard-tglu}:       [FAIL][18] ([i915#3002] / [i915#4312]) -> ([FAIL][19], [FAIL][20], [FAIL][21], [FAIL][22], [FAIL][23]) ([i915#1436] / [i915#3002] / [i915#3690] / [i915#4312])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-tglu-5/igt@runner@aborted.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglu-3/igt@runner@aborted.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglu-5/igt@runner@aborted.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglu-4/igt@runner@aborted.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglu-7/igt@runner@aborted.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglu-5/igt@runner@aborted.html

  
Known issues
------------

  Here are the changes found in Patchwork_21915_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_create@create-massive:
    - shard-tglb:         NOTRUN -> [DMESG-WARN][24] ([i915#3002])
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb6/igt@gem_create@create-massive.html

  * igt@gem_ctx_persistence@heartbeat-stop:
    - shard-skl:          [PASS][25] -> [DMESG-WARN][26] ([i915#1982])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-skl10/igt@gem_ctx_persistence@heartbeat-stop.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl5/igt@gem_ctx_persistence@heartbeat-stop.html

  * igt@gem_ctx_sseu@invalid-sseu:
    - shard-tglb:         NOTRUN -> [SKIP][27] ([i915#280])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@gem_ctx_sseu@invalid-sseu.html

  * igt@gem_exec_balancer@parallel-out-fence:
    - shard-iclb:         [PASS][28] -> [SKIP][29] ([i915#4525]) +2 similar issues
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-iclb2/igt@gem_exec_balancer@parallel-out-fence.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-iclb6/igt@gem_exec_balancer@parallel-out-fence.html

  * igt@gem_exec_capture@pi@rcs0:
    - shard-skl:          [PASS][30] -> [INCOMPLETE][31] ([i915#4547])
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-skl2/igt@gem_exec_capture@pi@rcs0.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl4/igt@gem_exec_capture@pi@rcs0.html

  * igt@gem_exec_fair@basic-none-rrul@rcs0:
    - shard-tglb:         NOTRUN -> [FAIL][32] ([i915#2842])
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@gem_exec_fair@basic-none-rrul@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs1:
    - shard-iclb:         NOTRUN -> [FAIL][33] ([i915#2842])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-iclb4/igt@gem_exec_fair@basic-none@vcs1.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - shard-glk:          [PASS][34] -> [FAIL][35] ([i915#2842])
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-glk1/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-glk7/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
    - shard-kbl:          [PASS][36] -> [FAIL][37] ([i915#2842])
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-kbl6/igt@gem_exec_fair@basic-pace@vecs0.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-kbl3/igt@gem_exec_fair@basic-pace@vecs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-iclb:         [PASS][38] -> [FAIL][39] ([i915#2849])
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-iclb2/igt@gem_exec_fair@basic-throttle@rcs0.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-iclb2/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_lmem_swapping@heavy-verify-multi:
    - shard-glk:          NOTRUN -> [SKIP][40] ([fdo#109271] / [i915#4613])
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-glk6/igt@gem_lmem_swapping@heavy-verify-multi.html

  * igt@gem_lmem_swapping@parallel-random:
    - shard-skl:          NOTRUN -> [SKIP][41] ([fdo#109271] / [i915#4613]) +5 similar issues
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl9/igt@gem_lmem_swapping@parallel-random.html

  * igt@gem_lmem_swapping@parallel-random-verify:
    - shard-apl:          NOTRUN -> [SKIP][42] ([fdo#109271] / [i915#4613])
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-apl8/igt@gem_lmem_swapping@parallel-random-verify.html

  * igt@gem_render_copy@yf-tiled-to-vebox-linear:
    - shard-kbl:          NOTRUN -> [SKIP][43] ([fdo#109271]) +4 similar issues
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-kbl4/igt@gem_render_copy@yf-tiled-to-vebox-linear.html

  * igt@gem_userptr_blits@unsync-unmap-cycles:
    - shard-tglb:         NOTRUN -> [SKIP][44] ([i915#3297])
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@gem_userptr_blits@unsync-unmap-cycles.html

  * igt@gem_userptr_blits@vma-merge:
    - shard-skl:          NOTRUN -> [FAIL][45] ([i915#3318])
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl6/igt@gem_userptr_blits@vma-merge.html

  * igt@gen3_render_mixed_blits:
    - shard-tglb:         NOTRUN -> [SKIP][46] ([fdo#109289])
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@gen3_render_mixed_blits.html

  * igt@i915_selftest@perf@region:
    - shard-tglb:         NOTRUN -> [DMESG-WARN][47] ([i915#2867]) +1 similar issue
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@i915_selftest@perf@region.html

  * igt@i915_suspend@debugfs-reader:
    - shard-apl:          [PASS][48] -> [DMESG-WARN][49] ([i915#180]) +2 similar issues
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-apl8/igt@i915_suspend@debugfs-reader.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-apl4/igt@i915_suspend@debugfs-reader.html

  * igt@kms_big_fb@linear-16bpp-rotate-90:
    - shard-apl:          NOTRUN -> [SKIP][50] ([fdo#109271]) +71 similar issues
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-apl4/igt@kms_big_fb@linear-16bpp-rotate-90.html

  * igt@kms_big_fb@linear-8bpp-rotate-90:
    - shard-tglb:         NOTRUN -> [SKIP][51] ([fdo#111614])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@kms_big_fb@linear-8bpp-rotate-90.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-hflip:
    - shard-skl:          NOTRUN -> [SKIP][52] ([fdo#109271] / [i915#3777])
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl7/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-hflip.html

  * igt@kms_big_fb@yf-tiled-32bpp-rotate-270:
    - shard-tglb:         NOTRUN -> [SKIP][53] ([fdo#111615])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@kms_big_fb@yf-tiled-32bpp-rotate-270.html

  * igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-180-async-flip:
    - shard-skl:          NOTRUN -> [FAIL][54] ([i915#3743]) +1 similar issue
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl7/igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-180-async-flip.html

  * igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-180-hflip:
    - shard-apl:          NOTRUN -> [SKIP][55] ([fdo#109271] / [i915#3777])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-apl8/igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-180-hflip.html

  * igt@kms_ccs@pipe-a-ccs-on-another-bo-y_tiled_gen12_rc_ccs_cc:
    - shard-skl:          NOTRUN -> [SKIP][56] ([fdo#109271] / [i915#3886]) +13 similar issues
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl7/igt@kms_ccs@pipe-a-ccs-on-another-bo-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-a-crc-primary-rotation-180-y_tiled_gen12_rc_ccs_cc:
    - shard-apl:          NOTRUN -> [SKIP][57] ([fdo#109271] / [i915#3886]) +1 similar issue
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-apl4/igt@kms_ccs@pipe-a-crc-primary-rotation-180-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-b-bad-rotation-90-y_tiled_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][58] ([i915#3689])
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@kms_ccs@pipe-b-bad-rotation-90-y_tiled_ccs.html

  * igt@kms_ccs@pipe-c-bad-pixel-format-y_tiled_gen12_mc_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][59] ([i915#3689] / [i915#3886])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@kms_ccs@pipe-c-bad-pixel-format-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-c-bad-rotation-90-y_tiled_gen12_rc_ccs_cc:
    - shard-glk:          NOTRUN -> [SKIP][60] ([fdo#109271] / [i915#3886])
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-glk6/igt@kms_ccs@pipe-c-bad-rotation-90-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_chamelium@vga-hpd:
    - shard-apl:          NOTRUN -> [SKIP][61] ([fdo#109271] / [fdo#111827]) +7 similar issues
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-apl4/igt@kms_chamelium@vga-hpd.html

  * igt@kms_color_chamelium@pipe-a-ctm-0-25:
    - shard-glk:          NOTRUN -> [SKIP][62] ([fdo#109271] / [fdo#111827]) +2 similar issues
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-glk6/igt@kms_color_chamelium@pipe-a-ctm-0-25.html

  * igt@kms_color_chamelium@pipe-b-ctm-max:
    - shard-skl:          NOTRUN -> [SKIP][63] ([fdo#109271] / [fdo#111827]) +27 similar issues
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl10/igt@kms_color_chamelium@pipe-b-ctm-max.html

  * igt@kms_color_chamelium@pipe-d-ctm-max:
    - shard-tglb:         NOTRUN -> [SKIP][64] ([fdo#109284] / [fdo#111827]) +1 similar issue
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@kms_color_chamelium@pipe-d-ctm-max.html

  * igt@kms_content_protection@atomic:
    - shard-apl:          NOTRUN -> [TIMEOUT][65] ([i915#1319])
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-apl4/igt@kms_content_protection@atomic.html

  * igt@kms_cursor_crc@pipe-a-cursor-32x32-random:
    - shard-tglb:         NOTRUN -> [SKIP][66] ([i915#3319])
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@kms_cursor_crc@pipe-a-cursor-32x32-random.html

  * igt@kms_cursor_crc@pipe-a-cursor-512x512-sliding:
    - shard-tglb:         NOTRUN -> [SKIP][67] ([fdo#109279] / [i915#3359])
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@kms_cursor_crc@pipe-a-cursor-512x512-sliding.html

  * igt@kms_cursor_crc@pipe-b-cursor-32x32-onscreen:
    - shard-skl:          NOTRUN -> [SKIP][68] ([fdo#109271]) +399 similar issues
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl9/igt@kms_cursor_crc@pipe-b-cursor-32x32-onscreen.html

  * igt@kms_cursor_crc@pipe-b-cursor-max-size-random:
    - shard-tglb:         NOTRUN -> [SKIP][69] ([i915#3359])
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@kms_cursor_crc@pipe-b-cursor-max-size-random.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - shard-skl:          NOTRUN -> [FAIL][70] ([i915#2346])
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl9/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size:
    - shard-iclb:         [PASS][71] -> [FAIL][72] ([i915#2346])
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-iclb1/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-iclb7/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html

  * igt@kms_flip@2x-blocking-absolute-wf_vblank:
    - shard-tglb:         NOTRUN -> [SKIP][73] ([fdo#111825]) +3 similar issues
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@kms_flip@2x-blocking-absolute-wf_vblank.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@c-hdmi-a1:
    - shard-glk:          [PASS][74] -> [FAIL][75] ([i915#79])
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-glk6/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-hdmi-a1.html
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-glk1/igt@kms_flip@flip-vs-expired-vblank-interruptible@c-hdmi-a1.html

  * igt@kms_flip@flip-vs-expired-vblank@c-edp1:
    - shard-skl:          NOTRUN -> [FAIL][76] ([i915#79])
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl9/igt@kms_flip@flip-vs-expired-vblank@c-edp1.html

  * igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytileccs-downscaling:
    - shard-skl:          NOTRUN -> [INCOMPLETE][77] ([i915#3701])
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl9/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytileccs-downscaling.html

  * igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytile-downscaling:
    - shard-iclb:         [PASS][78] -> [SKIP][79] ([i915#3701])
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-iclb7/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytile-downscaling.html
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-iclb2/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytile-downscaling.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-offscren-pri-indfb-draw-mmap-wc:
    - shard-glk:          NOTRUN -> [SKIP][80] ([fdo#109271]) +40 similar issues
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-glk6/igt@kms_frontbuffer_tracking@fbcpsr-1p-offscren-pri-indfb-draw-mmap-wc.html

  * igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-shrfb-draw-mmap-wc:
    - shard-iclb:         [PASS][81] -> [FAIL][82] ([i915#2546])
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-iclb8/igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-shrfb-draw-mmap-wc.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-iclb5/igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-shrfb-draw-mmap-wc.html

  * igt@kms_hdr@bpc-switch-suspend:
    - shard-kbl:          [PASS][83] -> [INCOMPLETE][84] ([i915#2828])
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-kbl7/igt@kms_hdr@bpc-switch-suspend.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-kbl4/igt@kms_hdr@bpc-switch-suspend.html

  * igt@kms_multipipe_modeset@basic-max-pipe-crc-check:
    - shard-tglb:         NOTRUN -> [SKIP][85] ([i915#1839])
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@kms_multipipe_modeset@basic-max-pipe-crc-check.html

  * igt@kms_pipe_crc_basic@disable-crc-after-crtc-pipe-d:
    - shard-glk:          NOTRUN -> [SKIP][86] ([fdo#109271] / [i915#533])
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-glk6/igt@kms_pipe_crc_basic@disable-crc-after-crtc-pipe-d.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-d:
    - shard-skl:          NOTRUN -> [SKIP][87] ([fdo#109271] / [i915#533]) +2 similar issues
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl6/igt@kms_pipe_crc_basic@read-crc-pipe-d.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb:
    - shard-skl:          NOTRUN -> [FAIL][88] ([i915#265]) +1 similar issue
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl10/igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb.html

  * igt@kms_plane_alpha_blend@pipe-b-constant-alpha-max:
    - shard-apl:          NOTRUN -> [FAIL][89] ([fdo#108145] / [i915#265])
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-apl1/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-max.html

  * igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min:
    - shard-skl:          NOTRUN -> [FAIL][90] ([fdo#108145] / [i915#265]) +3 similar issues
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl6/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html

  * igt@kms_plane_cursor@pipe-c-viewport-size-128:
    - shard-glk:          [PASS][91] -> [FAIL][92] ([i915#4729])
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-glk8/igt@kms_plane_cursor@pipe-c-viewport-size-128.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-glk7/igt@kms_plane_cursor@pipe-c-viewport-size-128.html

  * igt@kms_plane_lowres@pipe-a-tiling-y:
    - shard-tglb:         NOTRUN -> [SKIP][93] ([i915#3536])
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@kms_plane_lowres@pipe-a-tiling-y.html

  * igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area:
    - shard-skl:          NOTRUN -> [SKIP][94] ([fdo#109271] / [i915#658]) +2 similar issues
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl7/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area.html

  * igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area:
    - shard-apl:          NOTRUN -> [SKIP][95] ([fdo#109271] / [i915#658])
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-apl4/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area.html

  * igt@kms_psr@psr2_primary_mmap_cpu:
    - shard-iclb:         [PASS][96] -> [SKIP][97] ([fdo#109441]) +1 similar issue
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-iclb2/igt@kms_psr@psr2_primary_mmap_cpu.html
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-iclb4/igt@kms_psr@psr2_primary_mmap_cpu.html

  * igt@kms_psr@psr2_sprite_mmap_gtt:
    - shard-tglb:         NOTRUN -> [FAIL][98] ([i915#132] / [i915#3467])
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@kms_psr@psr2_sprite_mmap_gtt.html

  * igt@kms_writeback@writeback-check-output:
    - shard-skl:          NOTRUN -> [SKIP][99] ([fdo#109271] / [i915#2437])
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl1/igt@kms_writeback@writeback-check-output.html

  * igt@kms_writeback@writeback-fb-id:
    - shard-apl:          NOTRUN -> [SKIP][100] ([fdo#109271] / [i915#2437])
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-apl4/igt@kms_writeback@writeback-fb-id.html

  * igt@nouveau_crc@pipe-d-ctx-flip-skip-current-frame:
    - shard-tglb:         NOTRUN -> [SKIP][101] ([i915#2530])
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@nouveau_crc@pipe-d-ctx-flip-skip-current-frame.html

  * igt@perf@polling-small-buf:
    - shard-skl:          [PASS][102] -> [FAIL][103] ([i915#1722])
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-skl4/igt@perf@polling-small-buf.html
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl4/igt@perf@polling-small-buf.html

  * igt@perf_pmu@rc6-suspend:
    - shard-kbl:          [PASS][104] -> [INCOMPLETE][105] ([i915#794])
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-kbl3/igt@perf_pmu@rc6-suspend.html
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-kbl4/igt@perf_pmu@rc6-suspend.html

  * igt@prime_nv_api@nv_i915_import_twice_check_flink_name:
    - shard-tglb:         NOTRUN -> [SKIP][106] ([fdo#109291])
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@prime_nv_api@nv_i915_import_twice_check_flink_name.html

  * igt@prime_vgem@fence-read-hang:
    - shard-tglb:         NOTRUN -> [SKIP][107] ([fdo#109295])
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb7/igt@prime_vgem@fence-read-hang.html

  * igt@sysfs_clients@fair-7:
    - shard-skl:          NOTRUN -> [SKIP][108] ([fdo#109271] / [i915#2994]) +3 similar issues
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-skl9/igt@sysfs_clients@fair-7.html
    - shard-glk:          NOTRUN -> [SKIP][109] ([fdo#109271] / [i915#2994])
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-glk6/igt@sysfs_clients@fair-7.html

  
#### Possible fixes ####

  * igt@gem_exec_fair@basic-none-vip@rcs0:
    - shard-kbl:          [FAIL][110] ([i915#2842]) -> [PASS][111] +3 similar issues
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-kbl7/igt@gem_exec_fair@basic-none-vip@rcs0.html
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-kbl1/igt@gem_exec_fair@basic-none-vip@rcs0.html

  * igt@gem_exec_whisper@basic-contexts-forked-all:
    - shard-glk:          [DMESG-WARN][112] ([i915#118]) -> [PASS][113]
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-glk3/igt@gem_exec_whisper@basic-contexts-forked-all.html
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-glk8/igt@gem_exec_whisper@basic-contexts-forked-all.html

  * igt@gem_huc_copy@huc-copy:
    - shard-tglb:         [SKIP][114] ([i915#2190]) -> [PASS][115]
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-tglb7/igt@gem_huc_copy@huc-copy.html
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglb2/igt@gem_huc_copy@huc-copy.html

  * igt@gem_mmap_wc@read-write:
    - {shard-rkl}:        [INCOMPLETE][116] ([i915#2295]) -> [PASS][117]
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-rkl-5/igt@gem_mmap_wc@read-write.html
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-rkl-5/igt@gem_mmap_wc@read-write.html

  * igt@gem_workarounds@suspend-resume-context:
    - shard-apl:          [DMESG-WARN][118] ([i915#180]) -> [PASS][119] +4 similar issues
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-apl2/igt@gem_workarounds@suspend-resume-context.html
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-apl4/igt@gem_workarounds@suspend-resume-context.html

  * igt@i915_pm_rpm@basic-rte:
    - {shard-rkl}:        [SKIP][120] ([fdo#109308]) -> [PASS][121]
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-rkl-2/igt@i915_pm_rpm@basic-rte.html
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-rkl-6/igt@i915_pm_rpm@basic-rte.html

  * igt@i915_pm_rpm@modeset-lpsp-stress-no-wait:
    - {shard-rkl}:        [SKIP][122] ([i915#1397]) -> [PASS][123]
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-rkl-4/igt@i915_pm_rpm@modeset-lpsp-stress-no-wait.html
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-rkl-6/igt@i915_pm_rpm@modeset-lpsp-stress-no-wait.html

  * igt@i915_pm_rps@min-max-config-idle:
    - {shard-rkl}:        [FAIL][124] ([i915#4016]) -> [PASS][125]
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-rkl-6/igt@i915_pm_rps@min-max-config-idle.html
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-rkl-5/igt@i915_pm_rps@min-max-config-idle.html

  * igt@kms_big_fb@y-tiled-32bpp-rotate-0:
    - {shard-tglu}:       [DMESG-WARN][126] -> [PASS][127]
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-tglu-5/igt@kms_big_fb@y-tiled-32bpp-rotate-0.html
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-tglu-6/igt@kms_big_fb@y-tiled-32bpp-rotate-0.html

  * igt@kms_ccs@pipe-b-bad-pixel-format-y_tiled_gen12_rc_ccs_cc:
    - {shard-rkl}:        [SKIP][128] ([i915#1845] / [i915#4098]) -> [PASS][129] +1 similar issue
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-rkl-2/igt@kms_ccs@pipe-b-bad-pixel-format-y_tiled_gen12_rc_ccs_cc.html
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-rkl-6/igt@kms_ccs@pipe-b-bad-pixel-format-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_color@pipe-a-ctm-negative:
    - {shard-rkl}:        [SKIP][130] ([i915#1149] / [i915#4098]) -> [PASS][131]
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-rkl-4/igt@kms_color@pipe-a-ctm-negative.html
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-rkl-6/igt@kms_color@pipe-a-ctm-negative.html

  * igt@kms_cursor_crc@pipe-a-cursor-256x256-rapid-movement:
    - {shard-rkl}:        [SKIP][132] ([fdo#112022] / [i915#4070]) -> [PASS][133] +1 similar issue
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-rkl-2/igt@kms_cursor_crc@pipe-a-cursor-256x256-rapid-movement.html
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-rkl-6/igt@kms_cursor_crc@pipe-a-cursor-256x256-rapid-movement.html

  * igt@kms_cursor_legacy@cursora-vs-flipa-varying-size:
    - {shard-rkl}:        [SKIP][134] ([fdo#111825]) -> [PASS][135]
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-rkl-5/igt@kms_cursor_legacy@cursora-vs-flipa-varying-size.html
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-rkl-6/igt@kms_cursor_legacy@cursora-vs-flipa-varying-size.html

  * igt@kms_cursor_legacy@short-flip-before-cursor-atomic-transitions:
    - {shard-rkl}:        [SKIP][136] ([fdo#111825] / [i915#4070]) -> [PASS][137]
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11046/shard-rkl-2/igt@kms_cursor_legacy@short-flip-before-cursor-atomic-transitions.html
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/shard-rkl-6/igt@kms_cursor_legacy@short-flip-before-cursor-atomic-transitions.html

  * igt@kms_draw_crc@draw-method-xrgb2101010-pwrit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21915/index.html

[-- Attachment #2: Type: text/html, Size: 33523 bytes --]

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 4/6] drm/i915: Use vma resources for async unbinding
  2022-01-04 12:51   ` [Intel-gfx] " Thomas Hellström
@ 2022-01-05 15:52     ` Matthew Auld
  -1 siblings, 0 replies; 32+ messages in thread
From: Matthew Auld @ 2022-01-05 15:52 UTC (permalink / raw)
  To: Thomas Hellström, intel-gfx, dri-devel

On 04/01/2022 12:51, Thomas Hellström wrote:
> Implement async (non-blocking) unbinding by not syncing the vma before
> calling unbind on the vma_resource.
> Add the resulting unbind fence to the object's dma_resv from where it is
> picked up by the ttm migration code.
> Ideally these unbind fences should be coalesced with the migration blit
> fence to avoid stalling the migration blit waiting for unbind, as they
> can certainly go on in parallel, but since we don't yet have a
> reasonable data structure to use to coalesce fences and attach the
> resulting fence to a timeline, we defer that for now.
> 
> Note that with async unbinding, even while the unbind waits for the
> preceding bind to complete before unbinding, the vma itself might have been
> destroyed in the process, clearing the vma pages. Therefore we can
> only allow async unbinding if we have a refcounted sg-list and keep a
> refcount on that for the vma resource pages to stay intact until
> binding occurs. If this condition is not met, a request for an async
> unbind is diverted to a sync unbind.
> 
> v2:
> - Use a separate kmem_cache for vma resources for now to isolate their
>    memory allocation and aid debugging.
> - Move the check for vm closed to the actual unbinding thread. Regardless
>    of whether the vm is closed, we need the unbind fence to properly wait
>    for capture.
> - Clear vma_res::vm on unbind and update its documentation.
> v4:
> - Take cache coloring into account when searching for vma resources
>    pending unbind. (Matthew Auld)
> v5:
> - Fix timeout and error check in i915_vma_resource_bind_dep_await().
> - Avoid taking a reference on the object for async binding if
>    async unbind capable.
> - Fix braces around a single-line if statement.
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

<snip>

> @@ -434,12 +439,30 @@ int i915_vma_bind(struct i915_vma *vma,
>   
>   	bind_flags &= ~vma_flags;
>   	if (bind_flags == 0) {
> -		kfree(vma_res);
> +		i915_vma_resource_free(vma_res);
>   		return 0;
>   	}
>   
>   	GEM_BUG_ON(!atomic_read(&vma->pages_count));
>   
> +	/* Wait for or await async unbinds touching our range */
> +	if (work && bind_flags & vma->vm->bind_async_flags)
> +		ret = i915_vma_resource_bind_dep_await(vma->vm,
> +						       &work->base.chain,
> +						       vma->node.start,
> +						       vma->node.size,
> +						       true,
> +						       GFP_NOWAIT |
> +						       __GFP_RETRY_MAYFAIL |
> +						       __GFP_NOWARN);
> +	else
> +		ret = i915_vma_resource_bind_dep_sync(vma->vm, vma->node.start,
> +						      vma->node.size, true);
> +	if (ret) {
> +		i915_vma_resource_free(vma_res);
> +		return ret;
> +	}
> +
>   	if (vma->resource || !vma_res) {
>   		/* Rebinding with an additional I915_VMA_*_BIND */
>   		GEM_WARN_ON(!vma_flags);
> @@ -452,9 +475,11 @@ int i915_vma_bind(struct i915_vma *vma,
>   	if (work && bind_flags & vma->vm->bind_async_flags) {
>   		struct dma_fence *prev;
>   
> -		work->vma = vma;
> +		work->vma_res = i915_vma_resource_get(vma->resource);
>   		work->cache_level = cache_level;
>   		work->flags = bind_flags;
> +		if (vma->obj->mm.rsgt)
> +			work->rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);

Hmmm, at a glance I would have expected this to use the vma->pages. I 
think with the GGTT the vma will often create its own sg layout which != 
obj->mm.sgt. IIUC the async unbind will still call vma_unbind_pages 
which might nuke the vma sgt? Or is something else going on here?


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 4/6] drm/i915: Use vma resources for async unbinding
  2022-01-05 15:52     ` [Intel-gfx] " Matthew Auld
@ 2022-01-05 16:03       ` Thomas Hellström
  -1 siblings, 0 replies; 32+ messages in thread
From: Thomas Hellström @ 2022-01-05 16:03 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx, dri-devel

On Wed, 2022-01-05 at 15:52 +0000, Matthew Auld wrote:
> On 04/01/2022 12:51, Thomas Hellström wrote:
> > Implement async (non-blocking) unbinding by not syncing the vma
> > before
> > calling unbind on the vma_resource.
> > Add the resulting unbind fence to the object's dma_resv from where
> > it is
> > picked up by the ttm migration code.
> > Ideally these unbind fences should be coalesced with the migration
> > blit
> > fence to avoid stalling the migration blit waiting for unbind, as
> > they
> > can certainly go on in parallel, but since we don't yet have a
> > reasonable data structure to use to coalesce fences and attach the
> > resulting fence to a timeline, we defer that for now.
> > 
> > Note that with async unbinding, even while the unbind waits for the
> > preceding bind to complete before unbinding, the vma itself might
> > have been
> > destroyed in the process, clearing the vma pages. Therefore we can
> > only allow async unbinding if we have a refcounted sg-list and keep
> > a
> > refcount on that for the vma resource pages to stay intact until
> > binding occurs. If this condition is not met, a request for an
> > async
> > unbind is diverted to a sync unbind.
> > 
> > v2:
> > - Use a separate kmem_cache for vma resources for now to isolate
> > their
> >    memory allocation and aid debugging.
> > - Move the check for vm closed to the actual unbinding thread.
> > Regardless
> >    of whether the vm is closed, we need the unbind fence to
> > properly wait
> >    for capture.
> > - Clear vma_res::vm on unbind and update its documentation.
> > v4:
> > - Take cache coloring into account when searching for vma resources
> >    pending unbind. (Matthew Auld)
> > v5:
> > - Fix timeout and error check in
> > i915_vma_resource_bind_dep_await().
> > - Avoid taking a reference on the object for async binding if
> >    async unbind capable.
> > - Fix braces around a single-line if statement.
> > 
> > Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> 
> <snip>
> 
> > @@ -434,12 +439,30 @@ int i915_vma_bind(struct i915_vma *vma,
> >   
> >         bind_flags &= ~vma_flags;
> >         if (bind_flags == 0) {
> > -               kfree(vma_res);
> > +               i915_vma_resource_free(vma_res);
> >                 return 0;
> >         }
> >   
> >         GEM_BUG_ON(!atomic_read(&vma->pages_count));
> >   
> > +       /* Wait for or await async unbinds touching our range */
> > +       if (work && bind_flags & vma->vm->bind_async_flags)
> > +               ret = i915_vma_resource_bind_dep_await(vma->vm,
> > +                                                      &work->base.chain,
> > +                                                      vma->node.start,
> > +                                                      vma->node.size,
> > +                                                      true,
> > +                                                      GFP_NOWAIT |
> > +                                                      __GFP_RETRY_MAYFAIL |
> > +                                                      __GFP_NOWARN);
> > +       else
> > +               ret = i915_vma_resource_bind_dep_sync(vma->vm, vma->node.start,
> > +                                                     vma->node.size, true);
> > +       if (ret) {
> > +               i915_vma_resource_free(vma_res);
> > +               return ret;
> > +       }
> > +
> >         if (vma->resource || !vma_res) {
> >                 /* Rebinding with an additional I915_VMA_*_BIND */
> >                 GEM_WARN_ON(!vma_flags);
> > @@ -452,9 +475,11 @@ int i915_vma_bind(struct i915_vma *vma,
> >         if (work && bind_flags & vma->vm->bind_async_flags) {
> >                 struct dma_fence *prev;
> >   
> > -               work->vma = vma;
> > +               work->vma_res = i915_vma_resource_get(vma->resource);
> >                 work->cache_level = cache_level;
> >                 work->flags = bind_flags;
> > +               if (vma->obj->mm.rsgt)
> > +                       work->rsgt = i915_refct_sgt_get(vma->obj->mm.rsgt);
> 
> Hmmm, at a glance I would have expected this to use the vma->pages. I
> think with the GGTT the vma will often create its own sg layout which
> != 
> obj->mm.sgt. IIUC the async unbind will still call vma_unbind_pages 
> which might nuke the vma sgt? Or is something else going on here?
> 

Yes, the binding code is only using vma_res->pages, which should have
been copied from vma->pages, and keeps a reference to the rsgt just in
case we do an async unbind.

However, good point. We should refuse async unbind for now if
vma_res->pages != &rsgt->table, because the former might otherwise be
nuked before the async unbind actually happens. Will fix that for the
next version.
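
A minimal sketch of the guard described above, staying close to the
names used in this thread (the helper name and exact call site are
assumptions, not code from the posted series; field names follow the
wording above):

static bool i915_vma_res_can_unbind_async(const struct i915_vma *vma)
{
	/*
	 * Without a refcounted sg-list the backing pages may be freed
	 * before the deferred unbind runs, so divert to a sync unbind.
	 */
	if (!vma->obj->mm.rsgt)
		return false;

	/*
	 * The GGTT may have built its own sg layout for this vma; only
	 * the rsgt's table stays intact until the async unbind runs.
	 */
	return vma->resource->pages == &vma->obj->mm.rsgt->table;
}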

/Thomas





^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 4/6] drm/i915: Use vma resources for async unbinding
  2022-01-04 12:51   ` [Intel-gfx] " Thomas Hellström
@ 2022-01-06 12:13     ` Matthew Auld
  -1 siblings, 0 replies; 32+ messages in thread
From: Matthew Auld @ 2022-01-06 12:13 UTC (permalink / raw)
  To: Thomas Hellström, intel-gfx, dri-devel

On 04/01/2022 12:51, Thomas Hellström wrote:
> Implement async (non-blocking) unbinding by not syncing the vma before
> calling unbind on the vma_resource.
> Add the resulting unbind fence to the object's dma_resv from where it is
> picked up by the ttm migration code.
> Ideally these unbind fences should be coalesced with the migration blit
> fence to avoid stalling the migration blit waiting for unbind, as they
> can certainly go on in parallel, but since we don't yet have a
> reasonable data structure to use to coalesce fences and attach the
> resulting fence to a timeline, we defer that for now.
> 
> Note that with async unbinding, even while the unbind waits for the
> preceding bind to complete before unbinding, the vma itself might have been
> destroyed in the process, clearing the vma pages. Therefore we can
> only allow async unbinding if we have a refcounted sg-list and keep a
> refcount on that for the vma resource pages to stay intact until
> binding occurs. If this condition is not met, a request for an async
> unbind is diverted to a sync unbind.
> 
> v2:
> - Use a separate kmem_cache for vma resources for now to isolate their
>    memory allocation and aid debugging.
> - Move the check for vm closed to the actual unbinding thread. Regardless
>    of whether the vm is closed, we need the unbind fence to properly wait
>    for capture.
> - Clear vma_res::vm on unbind and update its documentation.
> v4:
> - Take cache coloring into account when searching for vma resources
>    pending unbind. (Matthew Auld)
> v5:
> - Fix timeout and error check in i915_vma_resource_bind_dep_await().
> - Avoid taking a reference on the object for async binding if
>    async unbind capable.
> - Fix braces around a single-line if statement.
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---

<snip>

> +
> +static void
> +i915_vma_resource_color_adjust_range(struct i915_address_space *vm,
> +				     unsigned long *start,
> +				     unsigned long *end)

Make these u64, below also? Just in case this is 32b?

> +{
> +	if (i915_vm_has_cache_coloring(vm)) {
> +		if (start)
> +			start -= I915_GTT_PAGE_SIZE;
> +		end += I915_GTT_PAGE_SIZE;

*start *end :)

> +	}

else {
     WARN_ON_ONCE(vm->color_adjust);
}

?

> +}
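
Folding the concrete comments above together (u64 parameters and the
missing dereferences; the WARN_ON_ONCE() suggestion is left open), a
rough sketch of how the helper might end up looking:

static void
i915_vma_resource_color_adjust_range(struct i915_address_space *vm,
				     u64 *start,
				     u64 *end)
{
	if (i915_vm_has_cache_coloring(vm)) {
		/* Widen the search range by one guard page on each side. */
		if (*start)
			*start -= I915_GTT_PAGE_SIZE;
		*end += I915_GTT_PAGE_SIZE;
	}
}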


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 1/6] drm/i915: Initial introduction of vma resources
  2022-01-04 12:51   ` [Intel-gfx] " Thomas Hellström
@ 2022-01-06 15:22     ` Matthew Auld
  -1 siblings, 0 replies; 32+ messages in thread
From: Matthew Auld @ 2022-01-06 15:22 UTC (permalink / raw)
  To: Thomas Hellström, intel-gfx, dri-devel

On 04/01/2022 12:51, Thomas Hellström wrote:
> Introduce vma resources, sort of similar to TTM resources,  needed for
> asynchronous bind management. Initially we will use them to hold
> completion of unbinding when we capture data from a vma, but they will
> be used extensively in upcoming patches for asynchronous vma unbinding.
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/Makefile                 |   1 +
>   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   2 +-
>   drivers/gpu/drm/i915/i915_vma.c               |  55 +++++++-
>   drivers/gpu/drm/i915/i915_vma.h               |  19 ++-
>   drivers/gpu/drm/i915/i915_vma_resource.c      | 124 ++++++++++++++++++
>   drivers/gpu/drm/i915/i915_vma_resource.h      |  70 ++++++++++
>   drivers/gpu/drm/i915/i915_vma_snapshot.c      |  15 +--
>   drivers/gpu/drm/i915/i915_vma_snapshot.h      |   7 +-
>   drivers/gpu/drm/i915/i915_vma_types.h         |   5 +
>   drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  99 ++++++++------
>   10 files changed, 334 insertions(+), 63 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/i915_vma_resource.c
>   create mode 100644 drivers/gpu/drm/i915/i915_vma_resource.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 1b62b9f65196..98433ad74194 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -174,6 +174,7 @@ i915-y += \
>   	  i915_trace_points.o \
>   	  i915_ttm_buddy_manager.o \
>   	  i915_vma.o \
> +	  i915_vma_resource.o \
>   	  i915_vma_snapshot.o \
>   	  intel_wopcm.o
>   
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index e9541244027a..72e497745c12 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -1422,7 +1422,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
>   			mutex_lock(&vma->vm->mutex);
>   			err = i915_vma_bind(target->vma,
>   					    target->vma->obj->cache_level,
> -					    PIN_GLOBAL, NULL);
> +					    PIN_GLOBAL, NULL, NULL);
>   			mutex_unlock(&vma->vm->mutex);
>   			reloc_cache_remap(&eb->reloc_cache, ev->vma->obj);
>   			if (err)
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index be208a8f1ed0..7097c5016431 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -37,6 +37,7 @@
>   #include "i915_sw_fence_work.h"
>   #include "i915_trace.h"
>   #include "i915_vma.h"
> +#include "i915_vma_resource.h"
>   
>   static struct kmem_cache *slab_vmas;
>   
> @@ -380,6 +381,8 @@ static int i915_vma_verify_bind_complete(struct i915_vma *vma)
>    * @cache_level: mapping cache level
>    * @flags: flags like global or local mapping
>    * @work: preallocated worker for allocating and binding the PTE
> + * @vma_res: pointer to a preallocated vma resource. The resource is either
> + * consumed or freed.
>    *
>    * DMA addresses are taken from the scatter-gather table of this object (or of
>    * this VMA in case of non-default GGTT views) and PTE entries set up.
> @@ -388,7 +391,8 @@ static int i915_vma_verify_bind_complete(struct i915_vma *vma)
>   int i915_vma_bind(struct i915_vma *vma,
>   		  enum i915_cache_level cache_level,
>   		  u32 flags,
> -		  struct i915_vma_work *work)
> +		  struct i915_vma_work *work,
> +		  struct i915_vma_resource *vma_res)
>   {
>   	u32 bind_flags;
>   	u32 vma_flags;
> @@ -399,11 +403,15 @@ int i915_vma_bind(struct i915_vma *vma,
>   
>   	if (GEM_DEBUG_WARN_ON(range_overflows(vma->node.start,
>   					      vma->node.size,
> -					      vma->vm->total)))
> +					      vma->vm->total))) {
> +		kfree(vma_res);
>   		return -ENODEV;
> +	}
>   
> -	if (GEM_DEBUG_WARN_ON(!flags))
> +	if (GEM_DEBUG_WARN_ON(!flags)) {
> +		kfree(vma_res);
>   		return -EINVAL;
> +	}
>   
>   	bind_flags = flags;
>   	bind_flags &= I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
> @@ -412,11 +420,21 @@ int i915_vma_bind(struct i915_vma *vma,
>   	vma_flags &= I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
>   
>   	bind_flags &= ~vma_flags;
> -	if (bind_flags == 0)
> +	if (bind_flags == 0) {
> +		kfree(vma_res);
>   		return 0;
> +	}
>   
>   	GEM_BUG_ON(!atomic_read(&vma->pages_count));
>   
> +	if (vma->resource || !vma_res) {
> +		/* Rebinding with an additional I915_VMA_*_BIND */
> +		GEM_WARN_ON(!vma_flags);
> +		kfree(vma_res);
> +	} else {
> +		i915_vma_resource_init(vma_res);
> +		vma->resource = vma_res;
> +	}
>   	trace_i915_vma_bind(vma, bind_flags);
>   	if (work && bind_flags & vma->vm->bind_async_flags) {
>   		struct dma_fence *prev;
> @@ -1279,6 +1297,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>   {
>   	struct i915_vma_work *work = NULL;
>   	struct dma_fence *moving = NULL;
> +	struct i915_vma_resource *vma_res = NULL;
>   	intel_wakeref_t wakeref = 0;
>   	unsigned int bound;
>   	int err;
> @@ -1333,6 +1352,12 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>   		}
>   	}
>   
> +	vma_res = i915_vma_resource_alloc();
> +	if (IS_ERR(vma_res)) {
> +		err = PTR_ERR(vma_res);
> +		goto err_fence;
> +	}
> +
>   	/*
>   	 * Differentiate between user/kernel vma inside the aliasing-ppgtt.
>   	 *
> @@ -1353,7 +1378,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>   	err = mutex_lock_interruptible_nested(&vma->vm->mutex,
>   					      !(flags & PIN_GLOBAL));
>   	if (err)
> -		goto err_fence;
> +		goto err_vma_res;
>   
>   	/* No more allocations allowed now we hold vm->mutex */
>   
> @@ -1394,7 +1419,8 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>   	GEM_BUG_ON(!vma->pages);
>   	err = i915_vma_bind(vma,
>   			    vma->obj->cache_level,
> -			    flags, work);
> +			    flags, work, vma_res);
> +	vma_res = NULL;
>   	if (err)
>   		goto err_remove;
>   
> @@ -1417,6 +1443,8 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
>   	i915_active_release(&vma->active);
>   err_unlock:
>   	mutex_unlock(&vma->vm->mutex);
> +err_vma_res:
> +	kfree(vma_res);
>   err_fence:
>   	if (work)
>   		dma_fence_work_commit_imm(&work->base);
> @@ -1567,6 +1595,7 @@ void i915_vma_release(struct kref *ref)
>   	i915_vm_put(vma->vm);
>   
>   	i915_active_fini(&vma->active);
> +	GEM_WARN_ON(vma->resource);
>   	i915_vma_free(vma);
>   }
>   
> @@ -1715,6 +1744,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
>   
>   void __i915_vma_evict(struct i915_vma *vma)
>   {
> +	struct dma_fence *unbind_fence;
> +
>   	GEM_BUG_ON(i915_vma_is_pinned(vma));
>   
>   	if (i915_vma_is_map_and_fenceable(vma)) {
> @@ -1752,8 +1783,20 @@ void __i915_vma_evict(struct i915_vma *vma)
>   	atomic_and(~(I915_VMA_BIND_MASK | I915_VMA_ERROR | I915_VMA_GGTT_WRITE),
>   		   &vma->flags);
>   
> +	unbind_fence = i915_vma_resource_unbind(vma->resource);
> +	i915_vma_resource_put(vma->resource);
> +	vma->resource = NULL;
> +
>   	i915_vma_detach(vma);
>   	vma_unbind_pages(vma);
> +
> +	/*
> +	 * This uninterruptible wait under the vm mutex is currently
> +	 * only ever blocking while the vma is being captured from.
> +	 * With async unbinding, this wait here will be removed.
> +	 */
> +	dma_fence_wait(unbind_fence, false);
> +	dma_fence_put(unbind_fence);
>   }
>   
>   int __i915_vma_unbind(struct i915_vma *vma)
> diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
> index 32719431b3df..de0f3e44cdfa 100644
> --- a/drivers/gpu/drm/i915/i915_vma.h
> +++ b/drivers/gpu/drm/i915/i915_vma.h
> @@ -37,6 +37,7 @@
>   
>   #include "i915_active.h"
>   #include "i915_request.h"
> +#include "i915_vma_resource.h"
>   #include "i915_vma_types.h"
>   
>   struct i915_vma *
> @@ -204,7 +205,8 @@ struct i915_vma_work *i915_vma_work(void);
>   int i915_vma_bind(struct i915_vma *vma,
>   		  enum i915_cache_level cache_level,
>   		  u32 flags,
> -		  struct i915_vma_work *work);
> +		  struct i915_vma_work *work,
> +		  struct i915_vma_resource *vma_res);
>   
>   bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long color);
>   bool i915_vma_misplaced(const struct i915_vma *vma,
> @@ -428,6 +430,21 @@ static inline int i915_vma_sync(struct i915_vma *vma)
>   	return i915_active_wait(&vma->active);
>   }
>   
> +/**
> + * i915_vma_get_current_resource - Get the current resource of the vma
> + * @vma: The vma to get the current resource from.
> + *
> + * It's illegal to call this function if the vma is not bound.
> + *
> + * Return: A refcounted pointer to the current vma resource
> + * of the vma, assuming the vma is bound.
> + */
> +static inline struct i915_vma_resource *
> +i915_vma_get_current_resource(struct i915_vma *vma)
> +{
> +	return i915_vma_resource_get(vma->resource);
> +}
> +
>   void i915_vma_module_exit(void);
>   int i915_vma_module_init(void);
>   
> diff --git a/drivers/gpu/drm/i915/i915_vma_resource.c b/drivers/gpu/drm/i915/i915_vma_resource.c
> new file mode 100644
> index 000000000000..833e987bed2a
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_vma_resource.c
> @@ -0,0 +1,124 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +#include <linux/slab.h>
> +
> +#include "i915_vma_resource.h"
> +
> +/* Callbacks for the unbind dma-fence. */
> +static const char *get_driver_name(struct dma_fence *fence)
> +{
> +	return "vma unbind fence";
> +}
> +
> +static const char *get_timeline_name(struct dma_fence *fence)
> +{
> +	return "unbound";
> +}
> +
> +static struct dma_fence_ops unbind_fence_ops = {
> +	.get_driver_name = get_driver_name,
> +	.get_timeline_name = get_timeline_name,
> +};
> +
> +/**
> + * i915_vma_resource_init - Initialize a vma resource.
> + * @vma_res: The vma resource to initialize
> + *
> + * Initializes a vma resource allocated using i915_vma_resource_alloc().
> + * The reason for having separate allocate and initialize function is that
> + * initialization may need to be performed from under a lock where
> + * allocation is not allowed.
> + */
> +void i915_vma_resource_init(struct i915_vma_resource *vma_res)
> +{
> +	spin_lock_init(&vma_res->lock);
> +	dma_fence_init(&vma_res->unbind_fence, &unbind_fence_ops,
> +		       &vma_res->lock, 0, 0);
> +	refcount_set(&vma_res->hold_count, 1);
> +}
> +
> +/**
> + * i915_vma_resource_alloc - Allocate a vma resource
> + *
> + * Return: A pointer to a cleared struct i915_vma_resource or
> + * a -ENOMEM error pointer if allocation fails.
> + */
> +struct i915_vma_resource *i915_vma_resource_alloc(void)
> +{
> +	struct i915_vma_resource *vma_res =
> +		kzalloc(sizeof(*vma_res), GFP_KERNEL);
> +
> +	return vma_res ? vma_res : ERR_PTR(-ENOMEM);
> +}
> +
> +static void __i915_vma_resource_unhold(struct i915_vma_resource *vma_res)
> +{
> +	if (refcount_dec_and_test(&vma_res->hold_count))
> +		dma_fence_signal(&vma_res->unbind_fence);
> +}
> +
> +/**
> + * i915_vma_resource_unhold - Unhold the signaling of the vma resource unbind
> + * fence.
> + * @vma_res: The vma resource.
> + * @lockdep_cookie: The lockdep cookie returned from i915_vma_resource_hold.
> + *
> + * The function may leave a dma_fence critical section.
> + */
> +void i915_vma_resource_unhold(struct i915_vma_resource *vma_res,
> +			      bool lockdep_cookie)
> +{
> +	dma_fence_end_signalling(lockdep_cookie);
> +
> +	if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
> +		unsigned long irq_flags;
> +
> +		/* Inefficient open-coded might_lock_irqsave() */
> +		spin_lock_irqsave(&vma_res->lock, irq_flags);
> +		spin_unlock_irqrestore(&vma_res->lock, irq_flags);
> +	}
> +
> +	__i915_vma_resource_unhold(vma_res);
> +}
> +
> +/**
> + * i915_vma_resource_hold - Hold the signaling of the vma resource unbind fence.
> + * @vma_res: The vma resource.
> + * @lockdep_cookie: Pointer to a bool serving as a lockdep cooke that should
> + * be given as an argument to the pairing i915_vma_resource_unhold.
> + *
> + * If returning true, the function enters a dma_fence signalling critical
> + * section is not in one already.

if not?

> + *
> + * Return: true if holding successful, false if not.
> + */
> +bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
> +			    bool *lockdep_cookie)
> +{
> +	bool held = refcount_inc_not_zero(&vma_res->hold_count);
> +
> +	if (held)
> +		*lockdep_cookie = dma_fence_begin_signalling();
> +
> +	return held;
> +}
> +
> +/**
> + * i915_vma_resource_unbind - Unbind a vma resource
> + * @vma_res: The vma resource to unbind.
> + *
> + * At this point this function does little more than publish a fence that
> + * signals immediately unless signaling is held back.
> + *
> + * Return: A refcounted pointer to a dma-fence that signals when unbinding is
> + * complete.
> + */
> +struct dma_fence *
> +i915_vma_resource_unbind(struct i915_vma_resource *vma_res)
> +{
> +	__i915_vma_resource_unhold(vma_res);
> +	dma_fence_get(&vma_res->unbind_fence);
> +	return &vma_res->unbind_fence;
> +}
> diff --git a/drivers/gpu/drm/i915/i915_vma_resource.h b/drivers/gpu/drm/i915/i915_vma_resource.h
> new file mode 100644
> index 000000000000..34744da23072
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_vma_resource.h
> @@ -0,0 +1,70 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +
> +#ifndef __I915_VMA_RESOURCE_H__
> +#define __I915_VMA_RESOURCE_H__
> +
> +#include <linux/dma-fence.h>
> +#include <linux/refcount.h>
> +
> +/**
> + * struct i915_vma_resource - Snapshotted unbind information.
> + * @unbind_fence: Fence to mark unbinding complete. Note that this fence
> + * is not considered published until unbind is scheduled, and as such it
> + * is illegal to access this fence before scheduled unbind other than
> + * for refcounting.
> + * @lock: The @unbind_fence lock. We're also using it to protect the weak
> + * pointer to the struct i915_vma, @vma during lookup and takedown.

Not sure what the @vma here is referring to?

Otherwise,
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
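
A rough sketch of the hold/unhold pairing the kdoc above describes, as
an error-capture path might use it (the capture helper below is a
placeholder, not code from this series):

static void capture_vma_resource(struct i915_vma_resource *vma_res)
{
	bool lockdep_cookie;

	/* Returns false if the unbind fence has already signaled. */
	if (!i915_vma_resource_hold(vma_res, &lockdep_cookie))
		return;

	/*
	 * Unbind completion is now held back, so the backing pages can
	 * be read safely for capture.
	 */
	do_capture_pages(vma_res); /* placeholder */

	i915_vma_resource_unhold(vma_res, lockdep_cookie);
}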

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v5 2/6] drm/i915: Use the vma resource as argument for gtt binding / unbinding
  2022-01-04 12:51   ` [Intel-gfx] " Thomas Hellström
@ 2022-01-06 16:01     ` Matthew Auld
  -1 siblings, 0 replies; 32+ messages in thread
From: Matthew Auld @ 2022-01-06 16:01 UTC (permalink / raw)
  To: Thomas Hellström, intel-gfx, dri-devel

On 04/01/2022 12:51, Thomas Hellström wrote:
> When introducing asynchronous unbinding, the vma itself may no longer
> be alive when the actual binding or unbinding takes place.
> 
> Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource
> instead of a struct i915_vma for the bind_vma() and unbind_vma() ops.
> Similarly change the insert_entries() op for struct i915_address_space.
> 
> Replace a couple of i915_vma_snapshot members with their newly introduced
> i915_vma_resource counterparts, since they have the same lifetime.
> 
> Also make sure to avoid changing the struct i915_vma_flags (in particular
> the bind flags) async. That should now only be done sync under the
> vm mutex.
> 
> v2:
> - Update the vma_res::bound_flags when binding to the aliased ggtt
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/display/intel_dpt.c      | 27 ++---
>   .../gpu/drm/i915/gem/i915_gem_object_types.h  | 27 +----
>   .../gpu/drm/i915/gem/selftests/huge_pages.c   | 37 +++----
>   drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 19 ++--
>   drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 37 +++----
>   drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  4 +-
>   drivers/gpu/drm/i915/gt/intel_ggtt.c          | 70 ++++++-------
>   drivers/gpu/drm/i915/gt/intel_gtt.h           | 16 +--
>   drivers/gpu/drm/i915/gt/intel_ppgtt.c         | 22 +++--
>   drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      | 13 ++-
>   drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h      |  2 +-
>   drivers/gpu/drm/i915/i915_debugfs.c           |  3 +-
>   drivers/gpu/drm/i915/i915_gpu_error.c         |  6 +-
>   drivers/gpu/drm/i915/i915_vma.c               | 24 ++++-
>   drivers/gpu/drm/i915/i915_vma.h               | 11 +--
>   drivers/gpu/drm/i915/i915_vma_resource.c      |  9 +-
>   drivers/gpu/drm/i915/i915_vma_resource.h      | 99 ++++++++++++++++++-
>   drivers/gpu/drm/i915/i915_vma_snapshot.c      |  4 -
>   drivers/gpu/drm/i915/i915_vma_snapshot.h      |  8 --
>   drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 64 ++++++++----
>   drivers/gpu/drm/i915/selftests/mock_gtt.c     | 12 +--
>   21 files changed, 308 insertions(+), 206 deletions(-)
> 
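
Before the per-file hunks, a hedged sketch of the bind callback shape this patch converges on: an editor's illustration distilled from the dpt/ggtt hunks below, not code from the patch, and the name example_bind_vma() is hypothetical. The point is that everything the backend needs now comes from the vma resource rather than the vma.

	static void example_bind_vma(struct i915_address_space *vm,
				     struct i915_vm_pt_stash *stash,
				     struct i915_vma_resource *vma_res,
				     enum i915_cache_level cache_level,
				     u32 flags)
	{
		u32 pte_flags = 0;

		/* Bind-time state is snapshotted in the vma resource. */
		if (vma_res->bi.readonly)
			pte_flags |= PTE_READ_ONLY;
		if (vma_res->bi.lmem)
			pte_flags |= PTE_LM;

		vm->insert_entries(vm, vma_res, cache_level, pte_flags);

		/* Bind flags live in the resource, updated sync under the vm mutex. */
		vma_res->bound_flags |= flags;
		vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
	}
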
> diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
> index 8f674745e7e0..63a83d5f85a1 100644
> --- a/drivers/gpu/drm/i915/display/intel_dpt.c
> +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
> @@ -48,7 +48,7 @@ static void dpt_insert_page(struct i915_address_space *vm,
>   }
>   
>   static void dpt_insert_entries(struct i915_address_space *vm,
> -			       struct i915_vma *vma,
> +			       struct i915_vma_resource *vma_res,
>   			       enum i915_cache_level level,
>   			       u32 flags)
>   {
> @@ -64,8 +64,8 @@ static void dpt_insert_entries(struct i915_address_space *vm,
>   	 * not to allow the user to override access to a read only page.
>   	 */
>   
> -	i = vma->node.start / I915_GTT_PAGE_SIZE;
> -	for_each_sgt_daddr(addr, sgt_iter, vma->pages)
> +	i = vma_res->start / I915_GTT_PAGE_SIZE;
> +	for_each_sgt_daddr(addr, sgt_iter, vma_res->bi.pages)
>   		gen8_set_pte(&base[i++], pte_encode | addr);
>   }
>   
> @@ -76,35 +76,38 @@ static void dpt_clear_range(struct i915_address_space *vm,
>   
>   static void dpt_bind_vma(struct i915_address_space *vm,
>   			 struct i915_vm_pt_stash *stash,
> -			 struct i915_vma *vma,
> +			 struct i915_vma_resource *vma_res,
>   			 enum i915_cache_level cache_level,
>   			 u32 flags)
>   {
> -	struct drm_i915_gem_object *obj = vma->obj;
>   	u32 pte_flags;
>   
> +	if (vma_res->bound_flags)
> +		return;
> +
>   	/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
>   	pte_flags = 0;
> -	if (vma->vm->has_read_only && i915_gem_object_is_readonly(obj))
> +	if (vm->has_read_only && vma_res->bi.readonly)
>   		pte_flags |= PTE_READ_ONLY;
> -	if (i915_gem_object_is_lmem(obj))
> +	if (vma_res->bi.lmem)
>   		pte_flags |= PTE_LM;
>   
> -	vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
> +	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
>   
> -	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
> +	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
>   
>   	/*
>   	 * Without aliasing PPGTT there's no difference between
>   	 * GLOBAL/LOCAL_BIND, it's all the same ptes. Hence unconditionally
>   	 * upgrade to both bound if we bind either to avoid double-binding.
>   	 */
> -	atomic_or(I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND, &vma->flags);
> +	vma_res->bound_flags = I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND;
>   }
>   
> -static void dpt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
> +static void dpt_unbind_vma(struct i915_address_space *vm,
> +			   struct i915_vma_resource *vma_res)
>   {
> -	vm->clear_range(vm, vma->node.start, vma->size);
> +	vm->clear_range(vm, vma_res->start, vma_res->vma_size);
>   }
>   
>   static void dpt_cleanup(struct i915_address_space *vm)
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index f9f7e44099fe..f99d260e0684 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -15,6 +15,7 @@
>   
>   #include "i915_active.h"
>   #include "i915_selftest.h"
> +#include "i915_vma_resource.h"
>   
>   struct drm_i915_gem_object;
>   struct intel_fronbuffer;
> @@ -549,31 +550,7 @@ struct drm_i915_gem_object {
>   		struct sg_table *pages;
>   		void *mapping;
>   
> -		struct i915_page_sizes {
> -			/**
> -			 * The sg mask of the pages sg_table. i.e the mask of
> -			 * of the lengths for each sg entry.
> -			 */
> -			unsigned int phys;
> -
> -			/**
> -			 * The gtt page sizes we are allowed to use given the
> -			 * sg mask and the supported page sizes. This will
> -			 * express the smallest unit we can use for the whole
> -			 * object, as well as the larger sizes we may be able
> -			 * to use opportunistically.
> -			 */
> -			unsigned int sg;
> -
> -			/**
> -			 * The actual gtt page size usage. Since we can have
> -			 * multiple vma associated with this object we need to
> -			 * prevent any trampling of state, hence a copy of this
> -			 * struct also lives in each vma, therefore the gtt
> -			 * value here should only be read/write through the vma.
> -			 */
> -			unsigned int gtt;
> -		} page_sizes;
> +		struct i915_page_sizes page_sizes;
>   
>   		I915_SELFTEST_DECLARE(unsigned int page_mask);
>   
> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> index 11f0aa65f8a3..26f997c376a2 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> @@ -370,9 +370,9 @@ static int igt_check_page_sizes(struct i915_vma *vma)
>   		err = -EINVAL;
>   	}
>   
> -	if (!HAS_PAGE_SIZES(i915, vma->page_sizes.gtt)) {
> +	if (!HAS_PAGE_SIZES(i915, vma->resource->page_sizes_gtt)) {
>   		pr_err("unsupported page_sizes.gtt=%u, supported=%u\n",
> -		       vma->page_sizes.gtt & ~supported, supported);
> +		       vma->resource->page_sizes_gtt & ~supported, supported);
>   		err = -EINVAL;
>   	}
>   
> @@ -403,15 +403,9 @@ static int igt_check_page_sizes(struct i915_vma *vma)
>   	if (i915_gem_object_is_lmem(obj) &&
>   	    IS_ALIGNED(vma->node.start, SZ_2M) &&
>   	    vma->page_sizes.sg & SZ_2M &&
> -	    vma->page_sizes.gtt < SZ_2M) {
> +	    vma->resource->page_sizes_gtt < SZ_2M) {
>   		pr_err("gtt pages mismatch for LMEM, expected 2M GTT pages, sg(%u), gtt(%u)\n",
> -		       vma->page_sizes.sg, vma->page_sizes.gtt);
> -		err = -EINVAL;
> -	}
> -
> -	if (obj->mm.page_sizes.gtt) {
> -		pr_err("obj->page_sizes.gtt(%u) should never be set\n",
> -		       obj->mm.page_sizes.gtt);
> +		       vma->page_sizes.sg, vma->resource->page_sizes_gtt);
>   		err = -EINVAL;
>   	}
>   
> @@ -547,9 +541,9 @@ static int igt_mock_memory_region_huge_pages(void *arg)
>   				goto out_unpin;
>   			}
>   
> -			if (vma->page_sizes.gtt != page_size) {
> +			if (vma->resource->page_sizes_gtt != page_size) {
>   				pr_err("%s page_sizes.gtt=%u, expected=%u\n",
> -				       __func__, vma->page_sizes.gtt,
> +				       __func__, vma->resource->page_sizes_gtt,
>   				       page_size);
>   				err = -EINVAL;
>   				goto out_unpin;
> @@ -630,9 +624,9 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
>   
>   		err = igt_check_page_sizes(vma);
>   
> -		if (vma->page_sizes.gtt != page_size) {
> +		if (vma->resource->page_sizes_gtt != page_size) {
>   			pr_err("page_sizes.gtt=%u, expected %u\n",
> -			       vma->page_sizes.gtt, page_size);
> +			       vma->resource->page_sizes_gtt, page_size);
>   			err = -EINVAL;
>   		}
>   
> @@ -657,9 +651,10 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
>   
>   			err = igt_check_page_sizes(vma);
>   
> -			if (vma->page_sizes.gtt != I915_GTT_PAGE_SIZE_4K) {
> +			if (vma->resource->page_sizes_gtt != I915_GTT_PAGE_SIZE_4K) {
>   				pr_err("page_sizes.gtt=%u, expected %llu\n",
> -				       vma->page_sizes.gtt, I915_GTT_PAGE_SIZE_4K);
> +				       vma->resource->page_sizes_gtt,
> +				       I915_GTT_PAGE_SIZE_4K);
>   				err = -EINVAL;
>   			}
>   
> @@ -805,9 +800,9 @@ static int igt_mock_ppgtt_huge_fill(void *arg)
>   			}
>   		}
>   
> -		if (vma->page_sizes.gtt != expected_gtt) {
> +		if (vma->resource->page_sizes_gtt != expected_gtt) {
>   			pr_err("gtt=%u, expected=%u, size=%zd, single=%s\n",
> -			       vma->page_sizes.gtt, expected_gtt,
> +			       vma->resource->page_sizes_gtt, expected_gtt,
>   			       obj->base.size, yesno(!!single));
>   			err = -EINVAL;
>   			break;
> @@ -961,10 +956,10 @@ static int igt_mock_ppgtt_64K(void *arg)
>   				}
>   			}
>   
> -			if (vma->page_sizes.gtt != expected_gtt) {
> +			if (vma->resource->page_sizes_gtt != expected_gtt) {
>   				pr_err("gtt=%u, expected=%u, i=%d, single=%s\n",
> -				       vma->page_sizes.gtt, expected_gtt, i,
> -				       yesno(!!single));
> +				       vma->resource->page_sizes_gtt,
> +				       expected_gtt, i, yesno(!!single));
>   				err = -EINVAL;
>   				goto out_vma_unpin;
>   			}
> diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> index 6e9292918bfc..d657ffd6c86a 100644
> --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> @@ -104,17 +104,17 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
>   }
>   
>   static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
> -				      struct i915_vma *vma,
> +				      struct i915_vma_resource *vma_res,
>   				      enum i915_cache_level cache_level,
>   				      u32 flags)
>   {
>   	struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
>   	struct i915_page_directory * const pd = ppgtt->pd;
> -	unsigned int first_entry = vma->node.start / I915_GTT_PAGE_SIZE;
> +	unsigned int first_entry = vma_res->start / I915_GTT_PAGE_SIZE;
>   	unsigned int act_pt = first_entry / GEN6_PTES;
>   	unsigned int act_pte = first_entry % GEN6_PTES;
>   	const u32 pte_encode = vm->pte_encode(0, cache_level, flags);
> -	struct sgt_dma iter = sgt_dma(vma);
> +	struct sgt_dma iter = sgt_dma(vma_res);
>   	gen6_pte_t *vaddr;
>   
>   	GEM_BUG_ON(!pd->entry[act_pt]);
> @@ -140,7 +140,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>   		}
>   	} while (1);
>   
> -	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
> +	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
>   }
>   
>   static void gen6_flush_pd(struct gen6_ppgtt *ppgtt, u64 start, u64 end)
> @@ -271,13 +271,13 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
>   
>   static void pd_vma_bind(struct i915_address_space *vm,
>   			struct i915_vm_pt_stash *stash,
> -			struct i915_vma *vma,
> +			struct i915_vma_resource *vma_res,
>   			enum i915_cache_level cache_level,
>   			u32 unused)
>   {
>   	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
> -	struct gen6_ppgtt *ppgtt = vma->private;
> -	u32 ggtt_offset = i915_ggtt_offset(vma) / I915_GTT_PAGE_SIZE;
> +	struct gen6_ppgtt *ppgtt = vma_res->private;
> +	u32 ggtt_offset = vma_res->start / I915_GTT_PAGE_SIZE;
>   
>   	ppgtt->pp_dir = ggtt_offset * sizeof(gen6_pte_t) << 10;
>   	ppgtt->pd_addr = (gen6_pte_t __iomem *)ggtt->gsm + ggtt_offset;
> @@ -285,9 +285,10 @@ static void pd_vma_bind(struct i915_address_space *vm,
>   	gen6_flush_pd(ppgtt, 0, ppgtt->base.vm.total);
>   }
>   
> -static void pd_vma_unbind(struct i915_address_space *vm, struct i915_vma *vma)
> +static void pd_vma_unbind(struct i915_address_space *vm,
> +			  struct i915_vma_resource *vma_res)
>   {
> -	struct gen6_ppgtt *ppgtt = vma->private;
> +	struct gen6_ppgtt *ppgtt = vma_res->private;
>   	struct i915_page_directory * const pd = ppgtt->base.pd;
>   	struct i915_page_table *pt;
>   	unsigned int pde;
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index b012c50f7ce7..c43e724afa9f 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -453,20 +453,21 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>   	return idx;
>   }
>   
> -static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
> +static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
> +				   struct i915_vma_resource *vma_res,
>   				   struct sgt_dma *iter,
>   				   enum i915_cache_level cache_level,
>   				   u32 flags)
>   {
>   	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
>   	unsigned int rem = sg_dma_len(iter->sg);
> -	u64 start = vma->node.start;
> +	u64 start = vma_res->start;
>   
> -	GEM_BUG_ON(!i915_vm_is_4lvl(vma->vm));
> +	GEM_BUG_ON(!i915_vm_is_4lvl(vm));
>   
>   	do {
>   		struct i915_page_directory * const pdp =
> -			gen8_pdp_for_page_address(vma->vm, start);
> +			gen8_pdp_for_page_address(vm, start);
>   		struct i915_page_directory * const pd =
>   			i915_pd_entry(pdp, __gen8_pte_index(start, 2));
>   		gen8_pte_t encode = pte_encode;
> @@ -475,7 +476,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>   		gen8_pte_t *vaddr;
>   		u16 index;
>   
> -		if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_2M &&
> +		if (vma_res->bi.page_sizes.sg & I915_GTT_PAGE_SIZE_2M &&
>   		    IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_2M) &&
>   		    rem >= I915_GTT_PAGE_SIZE_2M &&
>   		    !__gen8_pte_index(start, 0)) {
> @@ -492,7 +493,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>   			page_size = I915_GTT_PAGE_SIZE;
>   
>   			if (!index &&
> -			    vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K &&
> +			    vma_res->bi.page_sizes.sg & I915_GTT_PAGE_SIZE_64K &&
>   			    IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) &&
>   			    (IS_ALIGNED(rem, I915_GTT_PAGE_SIZE_64K) ||
>   			     rem >= (I915_PDES - index) * I915_GTT_PAGE_SIZE))
> @@ -541,9 +542,9 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>   		 */
>   		if (maybe_64K != -1 &&
>   		    (index == I915_PDES ||
> -		     (i915_vm_has_scratch_64K(vma->vm) &&
> -		      !iter->sg && IS_ALIGNED(vma->node.start +
> -					      vma->node.size,
> +		     (i915_vm_has_scratch_64K(vm) &&
> +		      !iter->sg && IS_ALIGNED(vma_res->start +
> +					      vma_res->node_size,
>   					      I915_GTT_PAGE_SIZE_2M)))) {
>   			vaddr = px_vaddr(pd);
>   			vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
> @@ -559,10 +560,10 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>   			 * instead - which we detect as missing results during
>   			 * selftests.
>   			 */
> -			if (I915_SELFTEST_ONLY(vma->vm->scrub_64K)) {
> +			if (I915_SELFTEST_ONLY(vm->scrub_64K)) {
>   				u16 i;
>   
> -				encode = vma->vm->scratch[0]->encode;
> +				encode = vm->scratch[0]->encode;
>   				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K));
>   
>   				for (i = 1; i < index; i += 16)
> @@ -572,22 +573,22 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>   			}
>   		}
>   
> -		vma->page_sizes.gtt |= page_size;
> +		vma_res->page_sizes_gtt |= page_size;
>   	} while (iter->sg && sg_dma_len(iter->sg));
>   }
>   
>   static void gen8_ppgtt_insert(struct i915_address_space *vm,
> -			      struct i915_vma *vma,
> +			      struct i915_vma_resource *vma_res,
>   			      enum i915_cache_level cache_level,
>   			      u32 flags)
>   {
>   	struct i915_ppgtt * const ppgtt = i915_vm_to_ppgtt(vm);
> -	struct sgt_dma iter = sgt_dma(vma);
> +	struct sgt_dma iter = sgt_dma(vma_res);
>   
> -	if (vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
> -		gen8_ppgtt_insert_huge(vma, &iter, cache_level, flags);
> +	if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) {
> +		gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
>   	} else  {
> -		u64 idx = vma->node.start >> GEN8_PTE_SHIFT;
> +		u64 idx = vma_res->start >> GEN8_PTE_SHIFT;
>   
>   		do {
>   			struct i915_page_directory * const pdp =
> @@ -597,7 +598,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
>   						    cache_level, flags);
>   		} while (idx);
>   
> -		vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
> +		vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
>   	}
>   }
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 352254e001b4..74aa90587061 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -1718,8 +1718,8 @@ static void print_request_ring(struct drm_printer *m, struct i915_request *rq)
>   	drm_printf(m,
>   		   "[head %04x, postfix %04x, tail %04x, batch 0x%08x_%08x]:\n",
>   		   rq->head, rq->postfix, rq->tail,
> -		   vsnap ? upper_32_bits(vsnap->gtt_offset) : ~0u,
> -		   vsnap ? lower_32_bits(vsnap->gtt_offset) : ~0u);
> +		   vsnap ? upper_32_bits(vsnap->vma_resource->start) : ~0u,
> +		   vsnap ? lower_32_bits(vsnap->vma_resource->start) : ~0u);
>   
>   	size = rq->tail - rq->head;
>   	if (rq->tail < rq->head)
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 5263dda7f8d5..0137b6af0973 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -235,7 +235,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
>   }
>   
>   static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
> -				     struct i915_vma *vma,
> +				     struct i915_vma_resource *vma_res,
>   				     enum i915_cache_level level,
>   				     u32 flags)
>   {
> @@ -252,10 +252,10 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
>   	 */
>   
>   	gte = (gen8_pte_t __iomem *)ggtt->gsm;
> -	gte += vma->node.start / I915_GTT_PAGE_SIZE;
> -	end = gte + vma->node.size / I915_GTT_PAGE_SIZE;
> +	gte += vma_res->start / I915_GTT_PAGE_SIZE;
> +	end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
>   
> -	for_each_sgt_daddr(addr, iter, vma->pages)
> +	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
>   		gen8_set_pte(gte++, pte_encode | addr);
>   	GEM_BUG_ON(gte > end);
>   
> @@ -292,7 +292,7 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
>    * through the GMADR mapped BAR (i915->mm.gtt->gtt).
>    */
>   static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
> -				     struct i915_vma *vma,
> +				     struct i915_vma_resource *vma_res,
>   				     enum i915_cache_level level,
>   				     u32 flags)
>   {
> @@ -303,10 +303,10 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
>   	dma_addr_t addr;
>   
>   	gte = (gen6_pte_t __iomem *)ggtt->gsm;
> -	gte += vma->node.start / I915_GTT_PAGE_SIZE;
> -	end = gte + vma->node.size / I915_GTT_PAGE_SIZE;
> +	gte += vma_res->start / I915_GTT_PAGE_SIZE;
> +	end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
>   
> -	for_each_sgt_daddr(addr, iter, vma->pages)
> +	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
>   		iowrite32(vm->pte_encode(addr, level, flags), gte++);
>   	GEM_BUG_ON(gte > end);
>   
> @@ -389,7 +389,7 @@ static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
>   
>   struct insert_entries {
>   	struct i915_address_space *vm;
> -	struct i915_vma *vma;
> +	struct i915_vma_resource *vma_res;
>   	enum i915_cache_level level;
>   	u32 flags;
>   };
> @@ -398,18 +398,18 @@ static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
>   {
>   	struct insert_entries *arg = _arg;
>   
> -	gen8_ggtt_insert_entries(arg->vm, arg->vma, arg->level, arg->flags);
> +	gen8_ggtt_insert_entries(arg->vm, arg->vma_res, arg->level, arg->flags);
>   	bxt_vtd_ggtt_wa(arg->vm);
>   
>   	return 0;
>   }
>   
>   static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
> -					     struct i915_vma *vma,
> +					     struct i915_vma_resource *vma_res,
>   					     enum i915_cache_level level,
>   					     u32 flags)
>   {
> -	struct insert_entries arg = { vm, vma, level, flags };
> +	struct insert_entries arg = { vm, vma_res, level, flags };
>   
>   	stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
>   }
> @@ -448,14 +448,14 @@ static void i915_ggtt_insert_page(struct i915_address_space *vm,
>   }
>   
>   static void i915_ggtt_insert_entries(struct i915_address_space *vm,
> -				     struct i915_vma *vma,
> +				     struct i915_vma_resource *vma_res,
>   				     enum i915_cache_level cache_level,
>   				     u32 unused)
>   {
>   	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
>   		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
>   
> -	intel_gtt_insert_sg_entries(vma->pages, vma->node.start >> PAGE_SHIFT,
> +	intel_gtt_insert_sg_entries(vma_res->bi.pages, vma_res->start >> PAGE_SHIFT,
>   				    flags);
>   }
>   
> @@ -467,30 +467,32 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
>   
>   static void ggtt_bind_vma(struct i915_address_space *vm,
>   			  struct i915_vm_pt_stash *stash,
> -			  struct i915_vma *vma,
> +			  struct i915_vma_resource *vma_res,
>   			  enum i915_cache_level cache_level,
>   			  u32 flags)
>   {
> -	struct drm_i915_gem_object *obj = vma->obj;
>   	u32 pte_flags;
>   
> -	if (i915_vma_is_bound(vma, ~flags & I915_VMA_BIND_MASK))
> +	if (vma_res->bound_flags & (~flags & I915_VMA_BIND_MASK))
>   		return;
>   
> +	vma_res->bound_flags |= flags;
> +
>   	/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
>   	pte_flags = 0;
> -	if (i915_gem_object_is_readonly(obj))
> +	if (vma_res->bi.readonly)
>   		pte_flags |= PTE_READ_ONLY;
> -	if (i915_gem_object_is_lmem(obj))
> +	if (vma_res->bi.lmem)
>   		pte_flags |= PTE_LM;
>   
> -	vm->insert_entries(vm, vma, cache_level, pte_flags);
> -	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
> +	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
> +	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
>   }
>   
> -static void ggtt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
> +static void ggtt_unbind_vma(struct i915_address_space *vm,
> +			    struct i915_vma_resource *vma_res)
>   {
> -	vm->clear_range(vm, vma->node.start, vma->size);
> +	vm->clear_range(vm, vma_res->start, vma_res->vma_size);
>   }
>   
>   static int ggtt_reserve_guc_top(struct i915_ggtt *ggtt)
> @@ -623,7 +625,7 @@ static int init_ggtt(struct i915_ggtt *ggtt)
>   
>   static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
>   				  struct i915_vm_pt_stash *stash,
> -				  struct i915_vma *vma,
> +				  struct i915_vma_resource *vma_res,
>   				  enum i915_cache_level cache_level,
>   				  u32 flags)
>   {
> @@ -631,25 +633,27 @@ static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
>   
>   	/* Currently applicable only to VLV */
>   	pte_flags = 0;
> -	if (i915_gem_object_is_readonly(vma->obj))
> +	if (vma_res->bi.readonly)
>   		pte_flags |= PTE_READ_ONLY;
>   
>   	if (flags & I915_VMA_LOCAL_BIND)
>   		ppgtt_bind_vma(&i915_vm_to_ggtt(vm)->alias->vm,
> -			       stash, vma, cache_level, flags);
> +			       stash, vma_res, cache_level, flags);
>   
>   	if (flags & I915_VMA_GLOBAL_BIND)
> -		vm->insert_entries(vm, vma, cache_level, pte_flags);
> +		vm->insert_entries(vm, vma_res, cache_level, pte_flags);
> +
> +	vma_res->bound_flags |= flags;
>   }
>   
>   static void aliasing_gtt_unbind_vma(struct i915_address_space *vm,
> -				    struct i915_vma *vma)
> +				    struct i915_vma_resource *vma_res)
>   {
> -	if (i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND))
> -		vm->clear_range(vm, vma->node.start, vma->size);
> +	if (vma_res->bound_flags & I915_VMA_GLOBAL_BIND)
> +		vm->clear_range(vm, vma_res->start, vma_res->vma_size);
>   
> -	if (i915_vma_is_bound(vma, I915_VMA_LOCAL_BIND))
> -		ppgtt_unbind_vma(&i915_vm_to_ggtt(vm)->alias->vm, vma);
> +	if (vma_res->bound_flags & I915_VMA_LOCAL_BIND)
> +		ppgtt_unbind_vma(&i915_vm_to_ggtt(vm)->alias->vm, vma_res);
>   }
>   
>   static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
> @@ -1280,7 +1284,7 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
>   			atomic_read(&vma->flags) & I915_VMA_BIND_MASK;
>   
>   		GEM_BUG_ON(!was_bound);
> -		vma->ops->bind_vma(vm, NULL, vma,
> +		vma->ops->bind_vma(vm, NULL, vma->resource,
>   				   obj ? obj->cache_level : 0,
>   				   was_bound);
>   		if (obj) { /* only used during resume => exclusive access */
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index 177b42b935a1..676b839d1a34 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -27,6 +27,7 @@
>   
>   #include "gt/intel_reset.h"
>   #include "i915_selftest.h"
> +#include "i915_vma_resource.h"
>   #include "i915_vma_types.h"
>   
>   #define I915_GFP_ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
> @@ -200,7 +201,7 @@ struct i915_vma_ops {
>   	/* Map an object into an address space with the given cache flags. */
>   	void (*bind_vma)(struct i915_address_space *vm,
>   			 struct i915_vm_pt_stash *stash,
> -			 struct i915_vma *vma,
> +			 struct i915_vma_resource *vma_res,
>   			 enum i915_cache_level cache_level,
>   			 u32 flags);
>   	/*
> @@ -208,7 +209,8 @@ struct i915_vma_ops {
>   	 * setting the valid PTE entries to a reserved scratch page.
>   	 */
>   	void (*unbind_vma)(struct i915_address_space *vm,
> -			   struct i915_vma *vma);
> +			   struct i915_vma_resource *vma_res);
> +
>   };
>   
>   struct i915_address_space {
> @@ -285,7 +287,7 @@ struct i915_address_space {
>   			    enum i915_cache_level cache_level,
>   			    u32 flags);
>   	void (*insert_entries)(struct i915_address_space *vm,
> -			       struct i915_vma *vma,
> +			       struct i915_vma_resource *vma_res,
>   			       enum i915_cache_level cache_level,
>   			       u32 flags);
>   	void (*cleanup)(struct i915_address_space *vm);
> @@ -600,11 +602,11 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt);
>   
>   void ppgtt_bind_vma(struct i915_address_space *vm,
>   		    struct i915_vm_pt_stash *stash,
> -		    struct i915_vma *vma,
> +		    struct i915_vma_resource *vma_res,
>   		    enum i915_cache_level cache_level,
>   		    u32 flags);
>   void ppgtt_unbind_vma(struct i915_address_space *vm,
> -		      struct i915_vma *vma);
> +		      struct i915_vma_resource *vma_res);
>   
>   void gtt_write_workarounds(struct intel_gt *gt);
>   
> @@ -627,8 +629,8 @@ __vm_create_scratch_for_read_pinned(struct i915_address_space *vm, unsigned long
>   static inline struct sgt_dma {
>   	struct scatterlist *sg;
>   	dma_addr_t dma, max;
> -} sgt_dma(struct i915_vma *vma) {
> -	struct scatterlist *sg = vma->pages->sgl;
> +} sgt_dma(struct i915_vma_resource *vma_res) {
> +	struct scatterlist *sg = vma_res->bi.pages->sgl;
>   	dma_addr_t addr = sg_dma_address(sg);
>   
>   	return (struct sgt_dma){ sg, addr, addr + sg_dma_len(sg) };
> diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> index 083b3090c69c..48e6e2f87700 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> @@ -179,32 +179,34 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt,
>   
>   void ppgtt_bind_vma(struct i915_address_space *vm,
>   		    struct i915_vm_pt_stash *stash,
> -		    struct i915_vma *vma,
> +		    struct i915_vma_resource *vma_res,
>   		    enum i915_cache_level cache_level,
>   		    u32 flags)
>   {
>   	u32 pte_flags;
>   
> -	if (!test_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma))) {
> -		vm->allocate_va_range(vm, stash, vma->node.start, vma->size);
> -		set_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma));
> +	if (!vma_res->allocated) {
> +		vm->allocate_va_range(vm, stash, vma_res->start,
> +				      vma_res->vma_size);
> +		vma_res->allocated = true;
>   	}
>   
>   	/* Applicable to VLV, and gen8+ */
>   	pte_flags = 0;
> -	if (i915_gem_object_is_readonly(vma->obj))
> +	if (vma_res->bi.readonly)
>   		pte_flags |= PTE_READ_ONLY;
> -	if (i915_gem_object_is_lmem(vma->obj))
> +	if (vma_res->bi.lmem)
>   		pte_flags |= PTE_LM;
>   
> -	vm->insert_entries(vm, vma, cache_level, pte_flags);
> +	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
>   	wmb();
>   }
>   
> -void ppgtt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
> +void ppgtt_unbind_vma(struct i915_address_space *vm,
> +		      struct i915_vma_resource *vma_res)
>   {
> -	if (test_and_clear_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma)))

Can we remove ALLOC_BIT? Or are there still users?

> -		vm->clear_range(vm, vma->node.start, vma->size);
> +	if (vma_res->allocated)
> +		vm->clear_range(vm, vma_res->start, vma_res->vma_size);
>   }
>   
>   static unsigned long pd_count(u64 size, int shift)
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
> index a5af05bde6f2..777fc6f0ceff 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
> @@ -448,20 +448,19 @@ static void uc_fw_bind_ggtt(struct intel_uc_fw *uc_fw)
>   {
>   	struct drm_i915_gem_object *obj = uc_fw->obj;
>   	struct i915_ggtt *ggtt = __uc_fw_to_gt(uc_fw)->ggtt;
> -	struct i915_vma *dummy = &uc_fw->dummy;
> +	struct i915_vma_resource *dummy = &uc_fw->dummy;
>   	u32 pte_flags = 0;
>   
> -	dummy->node.start = uc_fw_ggtt_offset(uc_fw);
> -	dummy->node.size = obj->base.size;
> -	dummy->pages = obj->mm.pages;
> -	dummy->vm = &ggtt->vm;
> +	dummy->start = uc_fw_ggtt_offset(uc_fw);
> +	dummy->node_size = obj->base.size;
> +	dummy->bi.pages = obj->mm.pages;
>   
>   	GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
> -	GEM_BUG_ON(dummy->node.size > ggtt->uc_fw.size);
> +	GEM_BUG_ON(dummy->node_size > ggtt->uc_fw.size);
>   
>   	/* uc_fw->obj cache domains were not controlled across suspend */
>   	if (i915_gem_object_has_struct_page(obj))
> -		drm_clflush_sg(dummy->pages);
> +		drm_clflush_sg(dummy->bi.pages);
>   
>   	if (i915_gem_object_is_lmem(obj))
>   		pte_flags |= PTE_LM;
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
> index d9d1dc0b4cbb..3229018877d3 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
> @@ -85,7 +85,7 @@ struct intel_uc_fw {
>   	 * threaded as it done during driver load (inherently single threaded)
>   	 * or during a GT reset (mutex guarantees single threaded).
>   	 */
> -	struct i915_vma dummy;
> +	struct i915_vma_resource dummy;
>   	struct i915_vma *rsa_data;
>   
>   	/*
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index e0e052cdf8b8..f7d1feba5aa4 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -170,7 +170,8 @@ i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>   		seq_printf(m, " (%s offset: %08llx, size: %08llx, pages: %s",
>   			   stringify_vma_type(vma),
>   			   vma->node.start, vma->node.size,
> -			   stringify_page_sizes(vma->page_sizes.gtt, NULL, 0));
> +			   stringify_page_sizes(vma->resource->page_sizes_gtt,
> +						NULL, 0));
>   		if (i915_vma_is_ggtt(vma) || i915_vma_is_dpt(vma)) {
>   			switch (vma->ggtt_view.type) {
>   			case I915_GGTT_VIEW_NORMAL:
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 5ae812d60abe..1af54ff374f9 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1040,9 +1040,9 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>   	strcpy(dst->name, vsnap->name);
>   	dst->next = NULL;
>   
> -	dst->gtt_offset = vsnap->gtt_offset;
> -	dst->gtt_size = vsnap->gtt_size;
> -	dst->gtt_page_sizes = vsnap->page_sizes;
> +	dst->gtt_offset = vsnap->vma_resource->start;
> +	dst->gtt_size = vsnap->vma_resource->node_size;
> +	dst->gtt_page_sizes = vsnap->vma_resource->page_sizes_gtt;
>   	dst->unused = 0;
>   
>   	ret = -EINVAL;
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 7097c5016431..1d4e448d22d9 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -298,7 +298,7 @@ static void __vma_bind(struct dma_fence_work *work)
>   	struct i915_vma *vma = vw->vma;
>   
>   	vma->ops->bind_vma(vw->vm, &vw->stash,
> -			   vma, vw->cache_level, vw->flags);
> +			   vma->resource, vw->cache_level, vw->flags);
>   }
>   
>   static void __vma_release(struct dma_fence_work *work)
> @@ -375,6 +375,21 @@ static int i915_vma_verify_bind_complete(struct i915_vma *vma)
>   #define i915_vma_verify_bind_complete(_vma) 0
>   #endif
>   
> +I915_SELFTEST_EXPORT void
> +i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
> +				struct i915_vma *vma)
> +{
> +	struct drm_i915_gem_object *obj = vma->obj;
> +
> +	i915_vma_resource_init(vma_res, vma->pages, &vma->page_sizes,
> +			       i915_gem_object_is_readonly(obj),
> +			       i915_gem_object_is_lmem(obj),
> +			       vma->private,
> +			       vma->node.start,
> +			       vma->node.size,
> +			       vma->size);
> +}
> +
>   /**
>    * i915_vma_bind - Sets up PTEs for an VMA in it's corresponding address space.
>    * @vma: VMA to map
> @@ -432,7 +447,7 @@ int i915_vma_bind(struct i915_vma *vma,
>   		GEM_WARN_ON(!vma_flags);
>   		kfree(vma_res);
>   	} else {
> -		i915_vma_resource_init(vma_res);
> +		i915_vma_resource_init_from_vma(vma_res, vma);
>   		vma->resource = vma_res;
>   	}
>   	trace_i915_vma_bind(vma, bind_flags);
> @@ -472,7 +487,8 @@ int i915_vma_bind(struct i915_vma *vma,
>   			if (ret)
>   				return ret;
>   		}
> -		vma->ops->bind_vma(vma->vm, NULL, vma, cache_level, bind_flags);
> +		vma->ops->bind_vma(vma->vm, NULL, vma->resource, cache_level,
> +				   bind_flags);
>   	}
>   
>   	atomic_or(bind_flags, &vma->flags);
> @@ -1778,7 +1794,7 @@ void __i915_vma_evict(struct i915_vma *vma)
>   
>   	if (likely(atomic_read(&vma->vm->open))) {
>   		trace_i915_vma_unbind(vma);
> -		vma->ops->unbind_vma(vma->vm, vma);
> +		vma->ops->unbind_vma(vma->vm, vma->resource);
>   	}
>   	atomic_and(~(I915_VMA_BIND_MASK | I915_VMA_ERROR | I915_VMA_GGTT_WRITE),
>   		   &vma->flags);
> diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
> index de0f3e44cdfa..1df57ec832bd 100644
> --- a/drivers/gpu/drm/i915/i915_vma.h
> +++ b/drivers/gpu/drm/i915/i915_vma.h
> @@ -339,12 +339,6 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma);
>    */
>   void i915_vma_unpin_iomap(struct i915_vma *vma);
>   
> -static inline struct page *i915_vma_first_page(struct i915_vma *vma)
> -{
> -	GEM_BUG_ON(!vma->pages);
> -	return sg_page(vma->pages->sgl);
> -}
> -
>   /**
>    * i915_vma_pin_fence - pin fencing state
>    * @vma: vma to pin fencing for
> @@ -445,6 +439,11 @@ i915_vma_get_current_resource(struct i915_vma *vma)
>   	return i915_vma_resource_get(vma->resource);
>   }
>   
> +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> +void i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
> +				     struct i915_vma *vma);
> +#endif
> +
>   void i915_vma_module_exit(void);
>   int i915_vma_module_init(void);
>   
> diff --git a/drivers/gpu/drm/i915/i915_vma_resource.c b/drivers/gpu/drm/i915/i915_vma_resource.c
> index 833e987bed2a..c86db89ab5d2 100644
> --- a/drivers/gpu/drm/i915/i915_vma_resource.c
> +++ b/drivers/gpu/drm/i915/i915_vma_resource.c
> @@ -23,15 +23,12 @@ static struct dma_fence_ops unbind_fence_ops = {
>   };
>   
>   /**
> - * i915_vma_resource_init - Initialize a vma resource.
> + * __i915_vma_resource_init - Initialize a vma resource.
>    * @vma_res: The vma resource to initialize
>    *
> - * Initializes a vma resource allocated using i915_vma_resource_alloc().
> - * The reason for having separate allocate and initialize function is that
> - * initialization may need to be performed from under a lock where
> - * allocation is not allowed.
> + * Initializes the private members of a vma resource.
>    */
> -void i915_vma_resource_init(struct i915_vma_resource *vma_res)
> +void __i915_vma_resource_init(struct i915_vma_resource *vma_res)
>   {
>   	spin_lock_init(&vma_res->lock);
>   	dma_fence_init(&vma_res->unbind_fence, &unbind_fence_ops,
> diff --git a/drivers/gpu/drm/i915/i915_vma_resource.h b/drivers/gpu/drm/i915/i915_vma_resource.h
> index 34744da23072..9872de58268b 100644
> --- a/drivers/gpu/drm/i915/i915_vma_resource.h
> +++ b/drivers/gpu/drm/i915/i915_vma_resource.h
> @@ -9,6 +9,25 @@
>   #include <linux/dma-fence.h>
>   #include <linux/refcount.h>
>   
> +#include "i915_gem.h"
> +
> +struct i915_page_sizes {
> +	/**
> +	 * The sg mask of the pages sg_table. i.e the mask of
> +	 * the lengths for each sg entry.
> +	 */
> +	unsigned int phys;
> +
> +	/**
> +	 * The gtt page sizes we are allowed to use given the
> +	 * sg mask and the supported page sizes. This will
> +	 * express the smallest unit we can use for the whole
> +	 * object, as well as the larger sizes we may be able
> +	 * to use opportunistically.
> +	 */
> +	unsigned int sg;
> +};
> +
>   /**
>    * struct i915_vma_resource - Snapshotted unbind information.
>    * @unbind_fence: Fence to mark unbinding complete. Note that this fence
> @@ -20,6 +39,13 @@
>    * @hold_count: Number of holders blocking the fence from finishing.
>    * The vma itself is keeping a hold, which is released when unbind
>    * is scheduled.
> + * @private: Bind backend private info.
> + * @start: Offset into the address space of bind range start.
> + * @node_size: Size of the allocated range manager node.
> + * @vma_size: Bind size.
> + * @page_sizes_gtt: Resulting page sizes from the bind operation.
> + * @bound_flags: Flags indicating binding status.
> + * @allocated: Backend private data. TODO: Should move into @private.
>    *
>    * The lifetime of a struct i915_vma_resource is from a binding request to
>    * the actual possible asynchronous unbind has completed.
> @@ -29,6 +55,32 @@ struct i915_vma_resource {
>   	/* See above for description of the lock. */
>   	spinlock_t lock;
>   	refcount_t hold_count;
> +
> +	/**
> +	 * struct i915_vma_bindinfo - Information needed for async bind
> +	 * only but that can be dropped after the bind has taken place.
> +	 * Consider making this a separate argument to the bind_vma
> +	 * op, coalescing with other arguments like vm, stash, cache_level
> +	 * and flags
> +	 * @pages: The pages sg-table.
> +	 * @page_sizes: Page sizes of the pages.
> +	 * @readonly: Whether the vma should be bound read-only.
> +	 * @lmem: Whether the vma points to lmem.
> +	 */
> +	struct i915_vma_bindinfo {
> +		struct sg_table *pages;
> +		struct i915_page_sizes page_sizes;
> +		bool readonly:1;
> +		bool lmem:1;
> +	} bi;
> +
> +	void *private;
> +	unsigned long start;
> +	unsigned long node_size;
> +	unsigned long vma_size;

AFAIK these need to be u64, or at least the node_size & start.

Otherwise,
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> +	u32 page_sizes_gtt;
> +	u32 bound_flags;
> +	bool allocated:1;
>   };
>   
>   bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
> @@ -41,6 +93,8 @@ struct i915_vma_resource *i915_vma_resource_alloc(void);
>   
>   struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res);
>   
> +void __i915_vma_resource_init(struct i915_vma_resource *vma_res);
> +
>   /**
>    * i915_vma_resource_get - Take a reference on a vma resource
>    * @vma_res: The vma resource on which to take a reference.
> @@ -63,8 +117,47 @@ static inline void i915_vma_resource_put(struct i915_vma_resource *vma_res)
>   	dma_fence_put(&vma_res->unbind_fence);
>   }
>   
> -#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> -void i915_vma_resource_init(struct i915_vma_resource *vma_res);
> -#endif
> +/**
> + * i915_vma_resource_init - Initialize a vma resource.
> + * @vma_res: The vma resource to initialize
> + * @pages: The pages sg-table.
> + * @page_sizes: Page sizes of the pages.
> + * @readonly: Whether the vma should be bound read-only.
> + * @lmem: Whether the vma points to lmem.
> + * @private: Bind backend private info.
> + * @start: Offset into the address space of bind range start.
> + * @node_size: Size of the allocated range manager node.
> + * @size: Bind size.
> + *
> + * Initializes a vma resource allocated using i915_vma_resource_alloc().
> + * The reason for having separate allocate and initialize function is that
> + * initialization may need to be performed from under a lock where
> + * allocation is not allowed.
> + */
> +static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
> +					  struct sg_table *pages,
> +					  const struct i915_page_sizes *page_sizes,
> +					  bool readonly,
> +					  bool lmem,
> +					  void *private,
> +					  unsigned long start,
> +					  unsigned long node_size,
> +					  unsigned long size)
> +{
> +	__i915_vma_resource_init(vma_res);
> +	vma_res->bi.pages = pages;
> +	vma_res->bi.page_sizes = *page_sizes;
> +	vma_res->bi.readonly = readonly;
> +	vma_res->bi.lmem = lmem;
> +	vma_res->private = private;
> +	vma_res->start = start;
> +	vma_res->node_size = node_size;
> +	vma_res->vma_size = size;
> +}
> +
> +static inline void i915_vma_resource_fini(struct i915_vma_resource *vma_res)
> +{
> +	GEM_BUG_ON(refcount_read(&vma_res->hold_count) != 1);
> +}
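
As a rough illustration of the alloc/init/fini lifecycle above, here is an editor's sketch modeled on the selftest changes further down in this patch, not code from the patch itself; mock_bind_with_resource() is a made-up name and error unwinding is trimmed.

	static int mock_bind_with_resource(struct i915_address_space *vm,
					   struct drm_i915_gem_object *obj,
					   u64 offset)
	{
		struct i915_vma_resource *vma_res = i915_vma_resource_alloc();

		if (IS_ERR(vma_res))
			return PTR_ERR(vma_res);

		/* Snapshot everything the backend needs to bind this range. */
		i915_vma_resource_init(vma_res, obj->mm.pages, &obj->mm.page_sizes,
				       i915_gem_object_is_readonly(obj),
				       i915_gem_object_is_lmem(obj),
				       NULL /* private */, offset,
				       obj->base.size, obj->base.size);

		vm->insert_entries(vm, vma_res, I915_CACHE_NONE, 0);

		i915_vma_resource_fini(vma_res);
		kfree(vma_res);
		return 0;
	}
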
>   
>   #endif
> diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
> index f7333c7a2f5e..69f62c1ca967 100644
> --- a/drivers/gpu/drm/i915/i915_vma_snapshot.c
> +++ b/drivers/gpu/drm/i915/i915_vma_snapshot.c
> @@ -24,11 +24,7 @@ void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
>   		assert_object_held(vma->obj);
>   
>   	vsnap->name = name;
> -	vsnap->size = vma->size;
>   	vsnap->obj_size = vma->obj->base.size;
> -	vsnap->gtt_offset = vma->node.start;
> -	vsnap->gtt_size = vma->node.size;
> -	vsnap->page_sizes = vma->page_sizes.gtt;
>   	vsnap->pages = vma->pages;
>   	vsnap->pages_rsgt = NULL;
>   	vsnap->mr = NULL;
> diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
> index e74588dd676b..1b08ce9f8576 100644
> --- a/drivers/gpu/drm/i915/i915_vma_snapshot.h
> +++ b/drivers/gpu/drm/i915/i915_vma_snapshot.h
> @@ -23,31 +23,23 @@ struct sg_table;
>   
>   /**
>    * struct i915_vma_snapshot - Snapshot of vma metadata.
> - * @size: The vma size in bytes.
>    * @obj_size: The size of the underlying object in bytes.
> - * @gtt_offset: The gtt offset the vma is bound to.
> - * @gtt_size: The size in bytes allocated for the vma in the GTT.
>    * @pages: The struct sg_table pointing to the pages bound.
>    * @pages_rsgt: The refcounted sg_table holding the reference for @pages if any.
>    * @mr: The memory region pointed for the pages bound.
>    * @kref: Reference for this structure.
>    * @vma_resource: Pointer to the vma resource representing the vma binding.
> - * @page_sizes: The vma GTT page sizes information.
>    * @onstack: Whether the structure shouldn't be freed on final put.
>    * @present: Whether the structure is present and initialized.
>    */
>   struct i915_vma_snapshot {
>   	const char *name;
> -	size_t size;
>   	size_t obj_size;
> -	size_t gtt_offset;
> -	size_t gtt_size;
>   	struct sg_table *pages;
>   	struct i915_refct_sgt *pages_rsgt;
>   	struct intel_memory_region *mr;
>   	struct kref kref;
>   	struct i915_vma_resource *vma_resource;
> -	u32 page_sizes;
>   	bool onstack:1;
>   	bool present:1;
>   };
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> index 54be880e55c3..70b5c47890b9 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> @@ -239,11 +239,11 @@ static int lowlevel_hole(struct i915_address_space *vm,
>   			 unsigned long end_time)
>   {
>   	I915_RND_STATE(seed_prng);
> -	struct i915_vma *mock_vma;
> +	struct i915_vma_resource *mock_vma_res;
>   	unsigned int size;
>   
> -	mock_vma = kzalloc(sizeof(*mock_vma), GFP_KERNEL);
> -	if (!mock_vma)
> +	mock_vma_res = kzalloc(sizeof(*mock_vma_res), GFP_KERNEL);
> +	if (!mock_vma_res)
>   		return -ENOMEM;
>   
>   	/* Keep creating larger objects until one cannot fit into the hole */
> @@ -269,7 +269,7 @@ static int lowlevel_hole(struct i915_address_space *vm,
>   				break;
>   		} while (count >>= 1);
>   		if (!count) {
> -			kfree(mock_vma);
> +			kfree(mock_vma_res);
>   			return -ENOMEM;
>   		}
>   		GEM_BUG_ON(!order);
> @@ -343,12 +343,12 @@ static int lowlevel_hole(struct i915_address_space *vm,
>   					break;
>   			}
>   
> -			mock_vma->pages = obj->mm.pages;
> -			mock_vma->node.size = BIT_ULL(size);
> -			mock_vma->node.start = addr;
> +			mock_vma_res->bi.pages = obj->mm.pages;
> +			mock_vma_res->node_size = BIT_ULL(size);
> +			mock_vma_res->start = addr;
>   
>   			with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref)
> -				vm->insert_entries(vm, mock_vma,
> +			  vm->insert_entries(vm, mock_vma_res,
>   						   I915_CACHE_NONE, 0);
>   		}
>   		count = n;
> @@ -371,7 +371,7 @@ static int lowlevel_hole(struct i915_address_space *vm,
>   		cleanup_freed_objects(vm->i915);
>   	}
>   
> -	kfree(mock_vma);
> +	kfree(mock_vma_res);
>   	return 0;
>   }
>   
> @@ -1280,6 +1280,7 @@ static void track_vma_bind(struct i915_vma *vma)
>   	atomic_set(&vma->pages_count, I915_VMA_PAGES_ACTIVE);
>   	__i915_gem_object_pin_pages(obj);
>   	vma->pages = obj->mm.pages;
> +	vma->resource->bi.pages = vma->pages;
>   
>   	mutex_lock(&vma->vm->mutex);
>   	list_add_tail(&vma->vm_link, &vma->vm->bound_list);
> @@ -1354,7 +1355,7 @@ static int reserve_gtt_with_resource(struct i915_vma *vma, u64 offset)
>   				   obj->cache_level,
>   				   0);
>   	if (!err) {
> -		i915_vma_resource_init(vma_res);
> +		i915_vma_resource_init_from_vma(vma_res, vma);
>   		vma->resource = vma_res;
>   	} else {
>   		kfree(vma_res);
> @@ -1533,7 +1534,7 @@ static int insert_gtt_with_resource(struct i915_vma *vma)
>   	err = i915_gem_gtt_insert(vm, &vma->node, obj->base.size, 0,
>   				  obj->cache_level, 0, vm->total, 0);
>   	if (!err) {
> -		i915_vma_resource_init(vma_res);
> +		i915_vma_resource_init_from_vma(vma_res, vma);
>   		vma->resource = vma_res;
>   	} else {
>   		kfree(vma_res);
> @@ -1958,6 +1959,7 @@ static int igt_cs_tlb(void *arg)
>   			struct i915_vm_pt_stash stash = {};
>   			struct i915_request *rq;
>   			struct i915_gem_ww_ctx ww;
> +			struct i915_vma_resource *vma_res;
>   			u64 offset;
>   
>   			offset = igt_random_offset(&prng,
> @@ -1978,6 +1980,13 @@ static int igt_cs_tlb(void *arg)
>   			if (err)
>   				goto end;
>   
> +			vma_res = i915_vma_resource_alloc();
> +			if (IS_ERR(vma_res)) {
> +				i915_vma_put_pages(vma);
> +				err = PTR_ERR(vma_res);
> +				goto end;
> +			}
> +
>   			i915_gem_ww_ctx_init(&ww, false);
>   retry:
>   			err = i915_vm_lock_objects(vm, &ww);
> @@ -1999,33 +2008,41 @@ static int igt_cs_tlb(void *arg)
>   					goto retry;
>   			}
>   			i915_gem_ww_ctx_fini(&ww);
> -			if (err)
> +			if (err) {
> +				kfree(vma_res);
>   				goto end;
> +			}
>   
> +			i915_vma_resource_init_from_vma(vma_res, vma);
>   			/* Prime the TLB with the dummy pages */
>   			for (i = 0; i < count; i++) {
> -				vma->node.start = offset + i * PAGE_SIZE;
> -				vm->insert_entries(vm, vma, I915_CACHE_NONE, 0);
> +				vma_res->start = offset + i * PAGE_SIZE;
> +				vm->insert_entries(vm, vma_res, I915_CACHE_NONE,
> +						   0);
>   
> -				rq = submit_batch(ce, vma->node.start);
> +				rq = submit_batch(ce, vma_res->start);
>   				if (IS_ERR(rq)) {
>   					err = PTR_ERR(rq);
> +					i915_vma_resource_fini(vma_res);
> +					kfree(vma_res);
>   					goto end;
>   				}
>   				i915_request_put(rq);
>   			}
> -
> +			i915_vma_resource_fini(vma_res);
>   			i915_vma_put_pages(vma);
>   
>   			err = context_sync(ce);
>   			if (err) {
>   				pr_err("%s: dummy setup timed out\n",
>   				       ce->engine->name);
> +				kfree(vma_res);
>   				goto end;
>   			}
>   
>   			vma = i915_vma_instance(act, vm, NULL);
>   			if (IS_ERR(vma)) {
> +				kfree(vma_res);
>   				err = PTR_ERR(vma);
>   				goto end;
>   			}
> @@ -2033,19 +2050,22 @@ static int igt_cs_tlb(void *arg)
>   			i915_gem_object_lock(act, NULL);
>   			err = i915_vma_get_pages(vma);
>   			i915_gem_object_unlock(act);
> -			if (err)
> +			if (err) {
> +				kfree(vma_res);
>   				goto end;
> +			}
>   
> +			i915_vma_resource_init_from_vma(vma_res, vma);
>   			/* Replace the TLB with target batches */
>   			for (i = 0; i < count; i++) {
>   				struct i915_request *rq;
>   				u32 *cs = batch + i * 64 / sizeof(*cs);
>   				u64 addr;
>   
> -				vma->node.start = offset + i * PAGE_SIZE;
> -				vm->insert_entries(vm, vma, I915_CACHE_NONE, 0);
> +				vma_res->start = offset + i * PAGE_SIZE;
> +				vm->insert_entries(vm, vma_res, I915_CACHE_NONE, 0);
>   
> -				addr = vma->node.start + i * 64;
> +				addr = vma_res->start + i * 64;
>   				cs[4] = MI_NOOP;
>   				cs[6] = lower_32_bits(addr);
>   				cs[7] = upper_32_bits(addr);
> @@ -2054,6 +2074,8 @@ static int igt_cs_tlb(void *arg)
>   				rq = submit_batch(ce, addr);
>   				if (IS_ERR(rq)) {
>   					err = PTR_ERR(rq);
> +					i915_vma_resource_fini(vma_res);
> +					kfree(vma_res);
>   					goto end;
>   				}
>   
> @@ -2070,6 +2092,8 @@ static int igt_cs_tlb(void *arg)
>   			}
>   			end_spin(batch, count - 1);
>   
> +			i915_vma_resource_fini(vma_res);
> +			kfree(vma_res);
>   			i915_vma_put_pages(vma);
>   
>   			err = context_sync(ce);
> diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
> index 1802baf80a17..d40519e3ca38 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
> @@ -33,23 +33,23 @@ static void mock_insert_page(struct i915_address_space *vm,
>   }
>   
>   static void mock_insert_entries(struct i915_address_space *vm,
> -				struct i915_vma *vma,
> +				struct i915_vma_resource *vma_res,
>   				enum i915_cache_level level, u32 flags)
>   {
>   }
>   
>   static void mock_bind_ppgtt(struct i915_address_space *vm,
>   			    struct i915_vm_pt_stash *stash,
> -			    struct i915_vma *vma,
> +			    struct i915_vma_resource *vma_res,
>   			    enum i915_cache_level cache_level,
>   			    u32 flags)
>   {
>   	GEM_BUG_ON(flags & I915_VMA_GLOBAL_BIND);
> -	set_bit(I915_VMA_LOCAL_BIND_BIT, __i915_vma_flags(vma));
> +	vma_res->bound_flags |= flags;
>   }
>   
>   static void mock_unbind_ppgtt(struct i915_address_space *vm,
> -			      struct i915_vma *vma)
> +			      struct i915_vma_resource *vma_res)
>   {
>   }
>   
> @@ -93,14 +93,14 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
>   
>   static void mock_bind_ggtt(struct i915_address_space *vm,
>   			   struct i915_vm_pt_stash *stash,
> -			   struct i915_vma *vma,
> +			   struct i915_vma_resource *vma_res,
>   			   enum i915_cache_level cache_level,
>   			   u32 flags)
>   {
>   }
>   
>   static void mock_unbind_ggtt(struct i915_address_space *vm,
> -			     struct i915_vma *vma)
> +			     struct i915_vma_resource *vma_res)
>   {
>   }
>   
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

>   				err = -EINVAL;
>   				goto out_unpin;
> @@ -630,9 +624,9 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
>   
>   		err = igt_check_page_sizes(vma);
>   
> -		if (vma->page_sizes.gtt != page_size) {
> +		if (vma->resource->page_sizes_gtt != page_size) {
>   			pr_err("page_sizes.gtt=%u, expected %u\n",
> -			       vma->page_sizes.gtt, page_size);
> +			       vma->resource->page_sizes_gtt, page_size);
>   			err = -EINVAL;
>   		}
>   
> @@ -657,9 +651,10 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
>   
>   			err = igt_check_page_sizes(vma);
>   
> -			if (vma->page_sizes.gtt != I915_GTT_PAGE_SIZE_4K) {
> +			if (vma->resource->page_sizes_gtt != I915_GTT_PAGE_SIZE_4K) {
>   				pr_err("page_sizes.gtt=%u, expected %llu\n",
> -				       vma->page_sizes.gtt, I915_GTT_PAGE_SIZE_4K);
> +				       vma->resource->page_sizes_gtt,
> +				       I915_GTT_PAGE_SIZE_4K);
>   				err = -EINVAL;
>   			}
>   
> @@ -805,9 +800,9 @@ static int igt_mock_ppgtt_huge_fill(void *arg)
>   			}
>   		}
>   
> -		if (vma->page_sizes.gtt != expected_gtt) {
> +		if (vma->resource->page_sizes_gtt != expected_gtt) {
>   			pr_err("gtt=%u, expected=%u, size=%zd, single=%s\n",
> -			       vma->page_sizes.gtt, expected_gtt,
> +			       vma->resource->page_sizes_gtt, expected_gtt,
>   			       obj->base.size, yesno(!!single));
>   			err = -EINVAL;
>   			break;
> @@ -961,10 +956,10 @@ static int igt_mock_ppgtt_64K(void *arg)
>   				}
>   			}
>   
> -			if (vma->page_sizes.gtt != expected_gtt) {
> +			if (vma->resource->page_sizes_gtt != expected_gtt) {
>   				pr_err("gtt=%u, expected=%u, i=%d, single=%s\n",
> -				       vma->page_sizes.gtt, expected_gtt, i,
> -				       yesno(!!single));
> +				       vma->resource->page_sizes_gtt,
> +				       expected_gtt, i, yesno(!!single));
>   				err = -EINVAL;
>   				goto out_vma_unpin;
>   			}
> diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> index 6e9292918bfc..d657ffd6c86a 100644
> --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
> @@ -104,17 +104,17 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
>   }
>   
>   static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
> -				      struct i915_vma *vma,
> +				      struct i915_vma_resource *vma_res,
>   				      enum i915_cache_level cache_level,
>   				      u32 flags)
>   {
>   	struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
>   	struct i915_page_directory * const pd = ppgtt->pd;
> -	unsigned int first_entry = vma->node.start / I915_GTT_PAGE_SIZE;
> +	unsigned int first_entry = vma_res->start / I915_GTT_PAGE_SIZE;
>   	unsigned int act_pt = first_entry / GEN6_PTES;
>   	unsigned int act_pte = first_entry % GEN6_PTES;
>   	const u32 pte_encode = vm->pte_encode(0, cache_level, flags);
> -	struct sgt_dma iter = sgt_dma(vma);
> +	struct sgt_dma iter = sgt_dma(vma_res);
>   	gen6_pte_t *vaddr;
>   
>   	GEM_BUG_ON(!pd->entry[act_pt]);
> @@ -140,7 +140,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>   		}
>   	} while (1);
>   
> -	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
> +	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
>   }
>   
>   static void gen6_flush_pd(struct gen6_ppgtt *ppgtt, u64 start, u64 end)
> @@ -271,13 +271,13 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
>   
>   static void pd_vma_bind(struct i915_address_space *vm,
>   			struct i915_vm_pt_stash *stash,
> -			struct i915_vma *vma,
> +			struct i915_vma_resource *vma_res,
>   			enum i915_cache_level cache_level,
>   			u32 unused)
>   {
>   	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
> -	struct gen6_ppgtt *ppgtt = vma->private;
> -	u32 ggtt_offset = i915_ggtt_offset(vma) / I915_GTT_PAGE_SIZE;
> +	struct gen6_ppgtt *ppgtt = vma_res->private;
> +	u32 ggtt_offset = vma_res->start / I915_GTT_PAGE_SIZE;
>   
>   	ppgtt->pp_dir = ggtt_offset * sizeof(gen6_pte_t) << 10;
>   	ppgtt->pd_addr = (gen6_pte_t __iomem *)ggtt->gsm + ggtt_offset;
> @@ -285,9 +285,10 @@ static void pd_vma_bind(struct i915_address_space *vm,
>   	gen6_flush_pd(ppgtt, 0, ppgtt->base.vm.total);
>   }
>   
> -static void pd_vma_unbind(struct i915_address_space *vm, struct i915_vma *vma)
> +static void pd_vma_unbind(struct i915_address_space *vm,
> +			  struct i915_vma_resource *vma_res)
>   {
> -	struct gen6_ppgtt *ppgtt = vma->private;
> +	struct gen6_ppgtt *ppgtt = vma_res->private;
>   	struct i915_page_directory * const pd = ppgtt->base.pd;
>   	struct i915_page_table *pt;
>   	unsigned int pde;
> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> index b012c50f7ce7..c43e724afa9f 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
> @@ -453,20 +453,21 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
>   	return idx;
>   }
>   
> -static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
> +static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
> +				   struct i915_vma_resource *vma_res,
>   				   struct sgt_dma *iter,
>   				   enum i915_cache_level cache_level,
>   				   u32 flags)
>   {
>   	const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
>   	unsigned int rem = sg_dma_len(iter->sg);
> -	u64 start = vma->node.start;
> +	u64 start = vma_res->start;
>   
> -	GEM_BUG_ON(!i915_vm_is_4lvl(vma->vm));
> +	GEM_BUG_ON(!i915_vm_is_4lvl(vm));
>   
>   	do {
>   		struct i915_page_directory * const pdp =
> -			gen8_pdp_for_page_address(vma->vm, start);
> +			gen8_pdp_for_page_address(vm, start);
>   		struct i915_page_directory * const pd =
>   			i915_pd_entry(pdp, __gen8_pte_index(start, 2));
>   		gen8_pte_t encode = pte_encode;
> @@ -475,7 +476,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>   		gen8_pte_t *vaddr;
>   		u16 index;
>   
> -		if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_2M &&
> +		if (vma_res->bi.page_sizes.sg & I915_GTT_PAGE_SIZE_2M &&
>   		    IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_2M) &&
>   		    rem >= I915_GTT_PAGE_SIZE_2M &&
>   		    !__gen8_pte_index(start, 0)) {
> @@ -492,7 +493,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>   			page_size = I915_GTT_PAGE_SIZE;
>   
>   			if (!index &&
> -			    vma->page_sizes.sg & I915_GTT_PAGE_SIZE_64K &&
> +			    vma_res->bi.page_sizes.sg & I915_GTT_PAGE_SIZE_64K &&
>   			    IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_64K) &&
>   			    (IS_ALIGNED(rem, I915_GTT_PAGE_SIZE_64K) ||
>   			     rem >= (I915_PDES - index) * I915_GTT_PAGE_SIZE))
> @@ -541,9 +542,9 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>   		 */
>   		if (maybe_64K != -1 &&
>   		    (index == I915_PDES ||
> -		     (i915_vm_has_scratch_64K(vma->vm) &&
> -		      !iter->sg && IS_ALIGNED(vma->node.start +
> -					      vma->node.size,
> +		     (i915_vm_has_scratch_64K(vm) &&
> +		      !iter->sg && IS_ALIGNED(vma_res->start +
> +					      vma_res->node_size,
>   					      I915_GTT_PAGE_SIZE_2M)))) {
>   			vaddr = px_vaddr(pd);
>   			vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
> @@ -559,10 +560,10 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>   			 * instead - which we detect as missing results during
>   			 * selftests.
>   			 */
> -			if (I915_SELFTEST_ONLY(vma->vm->scrub_64K)) {
> +			if (I915_SELFTEST_ONLY(vm->scrub_64K)) {
>   				u16 i;
>   
> -				encode = vma->vm->scratch[0]->encode;
> +				encode = vm->scratch[0]->encode;
>   				vaddr = px_vaddr(i915_pt_entry(pd, maybe_64K));
>   
>   				for (i = 1; i < index; i += 16)
> @@ -572,22 +573,22 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
>   			}
>   		}
>   
> -		vma->page_sizes.gtt |= page_size;
> +		vma_res->page_sizes_gtt |= page_size;
>   	} while (iter->sg && sg_dma_len(iter->sg));
>   }
>   
>   static void gen8_ppgtt_insert(struct i915_address_space *vm,
> -			      struct i915_vma *vma,
> +			      struct i915_vma_resource *vma_res,
>   			      enum i915_cache_level cache_level,
>   			      u32 flags)
>   {
>   	struct i915_ppgtt * const ppgtt = i915_vm_to_ppgtt(vm);
> -	struct sgt_dma iter = sgt_dma(vma);
> +	struct sgt_dma iter = sgt_dma(vma_res);
>   
> -	if (vma->page_sizes.sg > I915_GTT_PAGE_SIZE) {
> -		gen8_ppgtt_insert_huge(vma, &iter, cache_level, flags);
> +	if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) {
> +		gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
>   	} else  {
> -		u64 idx = vma->node.start >> GEN8_PTE_SHIFT;
> +		u64 idx = vma_res->start >> GEN8_PTE_SHIFT;
>   
>   		do {
>   			struct i915_page_directory * const pdp =
> @@ -597,7 +598,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
>   						    cache_level, flags);
>   		} while (idx);
>   
> -		vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
> +		vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
>   	}
>   }
>   
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 352254e001b4..74aa90587061 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -1718,8 +1718,8 @@ static void print_request_ring(struct drm_printer *m, struct i915_request *rq)
>   	drm_printf(m,
>   		   "[head %04x, postfix %04x, tail %04x, batch 0x%08x_%08x]:\n",
>   		   rq->head, rq->postfix, rq->tail,
> -		   vsnap ? upper_32_bits(vsnap->gtt_offset) : ~0u,
> -		   vsnap ? lower_32_bits(vsnap->gtt_offset) : ~0u);
> +		   vsnap ? upper_32_bits(vsnap->vma_resource->start) : ~0u,
> +		   vsnap ? lower_32_bits(vsnap->vma_resource->start) : ~0u);
>   
>   	size = rq->tail - rq->head;
>   	if (rq->tail < rq->head)
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 5263dda7f8d5..0137b6af0973 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -235,7 +235,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
>   }
>   
>   static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
> -				     struct i915_vma *vma,
> +				     struct i915_vma_resource *vma_res,
>   				     enum i915_cache_level level,
>   				     u32 flags)
>   {
> @@ -252,10 +252,10 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
>   	 */
>   
>   	gte = (gen8_pte_t __iomem *)ggtt->gsm;
> -	gte += vma->node.start / I915_GTT_PAGE_SIZE;
> -	end = gte + vma->node.size / I915_GTT_PAGE_SIZE;
> +	gte += vma_res->start / I915_GTT_PAGE_SIZE;
> +	end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
>   
> -	for_each_sgt_daddr(addr, iter, vma->pages)
> +	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
>   		gen8_set_pte(gte++, pte_encode | addr);
>   	GEM_BUG_ON(gte > end);
>   
> @@ -292,7 +292,7 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
>    * through the GMADR mapped BAR (i915->mm.gtt->gtt).
>    */
>   static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
> -				     struct i915_vma *vma,
> +				     struct i915_vma_resource *vma_res,
>   				     enum i915_cache_level level,
>   				     u32 flags)
>   {
> @@ -303,10 +303,10 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
>   	dma_addr_t addr;
>   
>   	gte = (gen6_pte_t __iomem *)ggtt->gsm;
> -	gte += vma->node.start / I915_GTT_PAGE_SIZE;
> -	end = gte + vma->node.size / I915_GTT_PAGE_SIZE;
> +	gte += vma_res->start / I915_GTT_PAGE_SIZE;
> +	end = gte + vma_res->node_size / I915_GTT_PAGE_SIZE;
>   
> -	for_each_sgt_daddr(addr, iter, vma->pages)
> +	for_each_sgt_daddr(addr, iter, vma_res->bi.pages)
>   		iowrite32(vm->pte_encode(addr, level, flags), gte++);
>   	GEM_BUG_ON(gte > end);
>   
> @@ -389,7 +389,7 @@ static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
>   
>   struct insert_entries {
>   	struct i915_address_space *vm;
> -	struct i915_vma *vma;
> +	struct i915_vma_resource *vma_res;
>   	enum i915_cache_level level;
>   	u32 flags;
>   };
> @@ -398,18 +398,18 @@ static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
>   {
>   	struct insert_entries *arg = _arg;
>   
> -	gen8_ggtt_insert_entries(arg->vm, arg->vma, arg->level, arg->flags);
> +	gen8_ggtt_insert_entries(arg->vm, arg->vma_res, arg->level, arg->flags);
>   	bxt_vtd_ggtt_wa(arg->vm);
>   
>   	return 0;
>   }
>   
>   static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
> -					     struct i915_vma *vma,
> +					     struct i915_vma_resource *vma_res,
>   					     enum i915_cache_level level,
>   					     u32 flags)
>   {
> -	struct insert_entries arg = { vm, vma, level, flags };
> +	struct insert_entries arg = { vm, vma_res, level, flags };
>   
>   	stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
>   }
> @@ -448,14 +448,14 @@ static void i915_ggtt_insert_page(struct i915_address_space *vm,
>   }
>   
>   static void i915_ggtt_insert_entries(struct i915_address_space *vm,
> -				     struct i915_vma *vma,
> +				     struct i915_vma_resource *vma_res,
>   				     enum i915_cache_level cache_level,
>   				     u32 unused)
>   {
>   	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
>   		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
>   
> -	intel_gtt_insert_sg_entries(vma->pages, vma->node.start >> PAGE_SHIFT,
> +	intel_gtt_insert_sg_entries(vma_res->bi.pages, vma_res->start >> PAGE_SHIFT,
>   				    flags);
>   }
>   
> @@ -467,30 +467,32 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
>   
>   static void ggtt_bind_vma(struct i915_address_space *vm,
>   			  struct i915_vm_pt_stash *stash,
> -			  struct i915_vma *vma,
> +			  struct i915_vma_resource *vma_res,
>   			  enum i915_cache_level cache_level,
>   			  u32 flags)
>   {
> -	struct drm_i915_gem_object *obj = vma->obj;
>   	u32 pte_flags;
>   
> -	if (i915_vma_is_bound(vma, ~flags & I915_VMA_BIND_MASK))
> +	if (vma_res->bound_flags & (~flags & I915_VMA_BIND_MASK))
>   		return;
>   
> +	vma_res->bound_flags |= flags;
> +
>   	/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
>   	pte_flags = 0;
> -	if (i915_gem_object_is_readonly(obj))
> +	if (vma_res->bi.readonly)
>   		pte_flags |= PTE_READ_ONLY;
> -	if (i915_gem_object_is_lmem(obj))
> +	if (vma_res->bi.lmem)
>   		pte_flags |= PTE_LM;
>   
> -	vm->insert_entries(vm, vma, cache_level, pte_flags);
> -	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
> +	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
> +	vma_res->page_sizes_gtt = I915_GTT_PAGE_SIZE;
>   }
>   
> -static void ggtt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
> +static void ggtt_unbind_vma(struct i915_address_space *vm,
> +			    struct i915_vma_resource *vma_res)
>   {
> -	vm->clear_range(vm, vma->node.start, vma->size);
> +	vm->clear_range(vm, vma_res->start, vma_res->vma_size);
>   }
>   
>   static int ggtt_reserve_guc_top(struct i915_ggtt *ggtt)
> @@ -623,7 +625,7 @@ static int init_ggtt(struct i915_ggtt *ggtt)
>   
>   static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
>   				  struct i915_vm_pt_stash *stash,
> -				  struct i915_vma *vma,
> +				  struct i915_vma_resource *vma_res,
>   				  enum i915_cache_level cache_level,
>   				  u32 flags)
>   {
> @@ -631,25 +633,27 @@ static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
>   
>   	/* Currently applicable only to VLV */
>   	pte_flags = 0;
> -	if (i915_gem_object_is_readonly(vma->obj))
> +	if (vma_res->bi.readonly)
>   		pte_flags |= PTE_READ_ONLY;
>   
>   	if (flags & I915_VMA_LOCAL_BIND)
>   		ppgtt_bind_vma(&i915_vm_to_ggtt(vm)->alias->vm,
> -			       stash, vma, cache_level, flags);
> +			       stash, vma_res, cache_level, flags);
>   
>   	if (flags & I915_VMA_GLOBAL_BIND)
> -		vm->insert_entries(vm, vma, cache_level, pte_flags);
> +		vm->insert_entries(vm, vma_res, cache_level, pte_flags);
> +
> +	vma_res->bound_flags |= flags;
>   }
>   
>   static void aliasing_gtt_unbind_vma(struct i915_address_space *vm,
> -				    struct i915_vma *vma)
> +				    struct i915_vma_resource *vma_res)
>   {
> -	if (i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND))
> -		vm->clear_range(vm, vma->node.start, vma->size);
> +	if (vma_res->bound_flags & I915_VMA_GLOBAL_BIND)
> +		vm->clear_range(vm, vma_res->start, vma_res->vma_size);
>   
> -	if (i915_vma_is_bound(vma, I915_VMA_LOCAL_BIND))
> -		ppgtt_unbind_vma(&i915_vm_to_ggtt(vm)->alias->vm, vma);
> +	if (vma_res->bound_flags & I915_VMA_LOCAL_BIND)
> +		ppgtt_unbind_vma(&i915_vm_to_ggtt(vm)->alias->vm, vma_res);
>   }
>   
>   static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
> @@ -1280,7 +1284,7 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm)
>   			atomic_read(&vma->flags) & I915_VMA_BIND_MASK;
>   
>   		GEM_BUG_ON(!was_bound);
> -		vma->ops->bind_vma(vm, NULL, vma,
> +		vma->ops->bind_vma(vm, NULL, vma->resource,
>   				   obj ? obj->cache_level : 0,
>   				   was_bound);
>   		if (obj) { /* only used during resume => exclusive access */
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index 177b42b935a1..676b839d1a34 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -27,6 +27,7 @@
>   
>   #include "gt/intel_reset.h"
>   #include "i915_selftest.h"
> +#include "i915_vma_resource.h"
>   #include "i915_vma_types.h"
>   
>   #define I915_GFP_ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
> @@ -200,7 +201,7 @@ struct i915_vma_ops {
>   	/* Map an object into an address space with the given cache flags. */
>   	void (*bind_vma)(struct i915_address_space *vm,
>   			 struct i915_vm_pt_stash *stash,
> -			 struct i915_vma *vma,
> +			 struct i915_vma_resource *vma_res,
>   			 enum i915_cache_level cache_level,
>   			 u32 flags);
>   	/*
> @@ -208,7 +209,8 @@ struct i915_vma_ops {
>   	 * setting the valid PTE entries to a reserved scratch page.
>   	 */
>   	void (*unbind_vma)(struct i915_address_space *vm,
> -			   struct i915_vma *vma);
> +			   struct i915_vma_resource *vma_res);
> +
>   };
>   
>   struct i915_address_space {
> @@ -285,7 +287,7 @@ struct i915_address_space {
>   			    enum i915_cache_level cache_level,
>   			    u32 flags);
>   	void (*insert_entries)(struct i915_address_space *vm,
> -			       struct i915_vma *vma,
> +			       struct i915_vma_resource *vma_res,
>   			       enum i915_cache_level cache_level,
>   			       u32 flags);
>   	void (*cleanup)(struct i915_address_space *vm);
> @@ -600,11 +602,11 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt);
>   
>   void ppgtt_bind_vma(struct i915_address_space *vm,
>   		    struct i915_vm_pt_stash *stash,
> -		    struct i915_vma *vma,
> +		    struct i915_vma_resource *vma_res,
>   		    enum i915_cache_level cache_level,
>   		    u32 flags);
>   void ppgtt_unbind_vma(struct i915_address_space *vm,
> -		      struct i915_vma *vma);
> +		      struct i915_vma_resource *vma_res);
>   
>   void gtt_write_workarounds(struct intel_gt *gt);
>   
> @@ -627,8 +629,8 @@ __vm_create_scratch_for_read_pinned(struct i915_address_space *vm, unsigned long
>   static inline struct sgt_dma {
>   	struct scatterlist *sg;
>   	dma_addr_t dma, max;
> -} sgt_dma(struct i915_vma *vma) {
> -	struct scatterlist *sg = vma->pages->sgl;
> +} sgt_dma(struct i915_vma_resource *vma_res) {
> +	struct scatterlist *sg = vma_res->bi.pages->sgl;
>   	dma_addr_t addr = sg_dma_address(sg);
>   
>   	return (struct sgt_dma){ sg, addr, addr + sg_dma_len(sg) };
> diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> index 083b3090c69c..48e6e2f87700 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
> @@ -179,32 +179,34 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt,
>   
>   void ppgtt_bind_vma(struct i915_address_space *vm,
>   		    struct i915_vm_pt_stash *stash,
> -		    struct i915_vma *vma,
> +		    struct i915_vma_resource *vma_res,
>   		    enum i915_cache_level cache_level,
>   		    u32 flags)
>   {
>   	u32 pte_flags;
>   
> -	if (!test_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma))) {
> -		vm->allocate_va_range(vm, stash, vma->node.start, vma->size);
> -		set_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma));
> +	if (!vma_res->allocated) {
> +		vm->allocate_va_range(vm, stash, vma_res->start,
> +				      vma_res->vma_size);
> +		vma_res->allocated = true;
>   	}
>   
>   	/* Applicable to VLV, and gen8+ */
>   	pte_flags = 0;
> -	if (i915_gem_object_is_readonly(vma->obj))
> +	if (vma_res->bi.readonly)
>   		pte_flags |= PTE_READ_ONLY;
> -	if (i915_gem_object_is_lmem(vma->obj))
> +	if (vma_res->bi.lmem)
>   		pte_flags |= PTE_LM;
>   
> -	vm->insert_entries(vm, vma, cache_level, pte_flags);
> +	vm->insert_entries(vm, vma_res, cache_level, pte_flags);
>   	wmb();
>   }
>   
> -void ppgtt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
> +void ppgtt_unbind_vma(struct i915_address_space *vm,
> +		      struct i915_vma_resource *vma_res)
>   {
> -	if (test_and_clear_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma)))

Can we remove ALLOC_BIT? Or are there still users?

> -		vm->clear_range(vm, vma->node.start, vma->size);
> +	if (vma_res->allocated)
> +		vm->clear_range(vm, vma_res->start, vma_res->vma_size);
>   }
>   
>   static unsigned long pd_count(u64 size, int shift)
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
> index a5af05bde6f2..777fc6f0ceff 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
> @@ -448,20 +448,19 @@ static void uc_fw_bind_ggtt(struct intel_uc_fw *uc_fw)
>   {
>   	struct drm_i915_gem_object *obj = uc_fw->obj;
>   	struct i915_ggtt *ggtt = __uc_fw_to_gt(uc_fw)->ggtt;
> -	struct i915_vma *dummy = &uc_fw->dummy;
> +	struct i915_vma_resource *dummy = &uc_fw->dummy;
>   	u32 pte_flags = 0;
>   
> -	dummy->node.start = uc_fw_ggtt_offset(uc_fw);
> -	dummy->node.size = obj->base.size;
> -	dummy->pages = obj->mm.pages;
> -	dummy->vm = &ggtt->vm;
> +	dummy->start = uc_fw_ggtt_offset(uc_fw);
> +	dummy->node_size = obj->base.size;
> +	dummy->bi.pages = obj->mm.pages;
>   
>   	GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
> -	GEM_BUG_ON(dummy->node.size > ggtt->uc_fw.size);
> +	GEM_BUG_ON(dummy->node_size > ggtt->uc_fw.size);
>   
>   	/* uc_fw->obj cache domains were not controlled across suspend */
>   	if (i915_gem_object_has_struct_page(obj))
> -		drm_clflush_sg(dummy->pages);
> +		drm_clflush_sg(dummy->bi.pages);
>   
>   	if (i915_gem_object_is_lmem(obj))
>   		pte_flags |= PTE_LM;
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
> index d9d1dc0b4cbb..3229018877d3 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h
> @@ -85,7 +85,7 @@ struct intel_uc_fw {
>   	 * threaded as it done during driver load (inherently single threaded)
>   	 * or during a GT reset (mutex guarantees single threaded).
>   	 */
> -	struct i915_vma dummy;
> +	struct i915_vma_resource dummy;
>   	struct i915_vma *rsa_data;
>   
>   	/*
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index e0e052cdf8b8..f7d1feba5aa4 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -170,7 +170,8 @@ i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>   		seq_printf(m, " (%s offset: %08llx, size: %08llx, pages: %s",
>   			   stringify_vma_type(vma),
>   			   vma->node.start, vma->node.size,
> -			   stringify_page_sizes(vma->page_sizes.gtt, NULL, 0));
> +			   stringify_page_sizes(vma->resource->page_sizes_gtt,
> +						NULL, 0));
>   		if (i915_vma_is_ggtt(vma) || i915_vma_is_dpt(vma)) {
>   			switch (vma->ggtt_view.type) {
>   			case I915_GGTT_VIEW_NORMAL:
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 5ae812d60abe..1af54ff374f9 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1040,9 +1040,9 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>   	strcpy(dst->name, vsnap->name);
>   	dst->next = NULL;
>   
> -	dst->gtt_offset = vsnap->gtt_offset;
> -	dst->gtt_size = vsnap->gtt_size;
> -	dst->gtt_page_sizes = vsnap->page_sizes;
> +	dst->gtt_offset = vsnap->vma_resource->start;
> +	dst->gtt_size = vsnap->vma_resource->node_size;
> +	dst->gtt_page_sizes = vsnap->vma_resource->page_sizes_gtt;
>   	dst->unused = 0;
>   
>   	ret = -EINVAL;
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 7097c5016431..1d4e448d22d9 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -298,7 +298,7 @@ static void __vma_bind(struct dma_fence_work *work)
>   	struct i915_vma *vma = vw->vma;
>   
>   	vma->ops->bind_vma(vw->vm, &vw->stash,
> -			   vma, vw->cache_level, vw->flags);
> +			   vma->resource, vw->cache_level, vw->flags);
>   }
>   
>   static void __vma_release(struct dma_fence_work *work)
> @@ -375,6 +375,21 @@ static int i915_vma_verify_bind_complete(struct i915_vma *vma)
>   #define i915_vma_verify_bind_complete(_vma) 0
>   #endif
>   
> +I915_SELFTEST_EXPORT void
> +i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
> +				struct i915_vma *vma)
> +{
> +	struct drm_i915_gem_object *obj = vma->obj;
> +
> +	i915_vma_resource_init(vma_res, vma->pages, &vma->page_sizes,
> +			       i915_gem_object_is_readonly(obj),
> +			       i915_gem_object_is_lmem(obj),
> +			       vma->private,
> +			       vma->node.start,
> +			       vma->node.size,
> +			       vma->size);
> +}
> +
>   /**
>    * i915_vma_bind - Sets up PTEs for an VMA in it's corresponding address space.
>    * @vma: VMA to map
> @@ -432,7 +447,7 @@ int i915_vma_bind(struct i915_vma *vma,
>   		GEM_WARN_ON(!vma_flags);
>   		kfree(vma_res);
>   	} else {
> -		i915_vma_resource_init(vma_res);
> +		i915_vma_resource_init_from_vma(vma_res, vma);
>   		vma->resource = vma_res;
>   	}
>   	trace_i915_vma_bind(vma, bind_flags);
> @@ -472,7 +487,8 @@ int i915_vma_bind(struct i915_vma *vma,
>   			if (ret)
>   				return ret;
>   		}
> -		vma->ops->bind_vma(vma->vm, NULL, vma, cache_level, bind_flags);
> +		vma->ops->bind_vma(vma->vm, NULL, vma->resource, cache_level,
> +				   bind_flags);
>   	}
>   
>   	atomic_or(bind_flags, &vma->flags);
> @@ -1778,7 +1794,7 @@ void __i915_vma_evict(struct i915_vma *vma)
>   
>   	if (likely(atomic_read(&vma->vm->open))) {
>   		trace_i915_vma_unbind(vma);
> -		vma->ops->unbind_vma(vma->vm, vma);
> +		vma->ops->unbind_vma(vma->vm, vma->resource);
>   	}
>   	atomic_and(~(I915_VMA_BIND_MASK | I915_VMA_ERROR | I915_VMA_GGTT_WRITE),
>   		   &vma->flags);
> diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
> index de0f3e44cdfa..1df57ec832bd 100644
> --- a/drivers/gpu/drm/i915/i915_vma.h
> +++ b/drivers/gpu/drm/i915/i915_vma.h
> @@ -339,12 +339,6 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma);
>    */
>   void i915_vma_unpin_iomap(struct i915_vma *vma);
>   
> -static inline struct page *i915_vma_first_page(struct i915_vma *vma)
> -{
> -	GEM_BUG_ON(!vma->pages);
> -	return sg_page(vma->pages->sgl);
> -}
> -
>   /**
>    * i915_vma_pin_fence - pin fencing state
>    * @vma: vma to pin fencing for
> @@ -445,6 +439,11 @@ i915_vma_get_current_resource(struct i915_vma *vma)
>   	return i915_vma_resource_get(vma->resource);
>   }
>   
> +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> +void i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
> +				     struct i915_vma *vma);
> +#endif
> +
>   void i915_vma_module_exit(void);
>   int i915_vma_module_init(void);
>   
> diff --git a/drivers/gpu/drm/i915/i915_vma_resource.c b/drivers/gpu/drm/i915/i915_vma_resource.c
> index 833e987bed2a..c86db89ab5d2 100644
> --- a/drivers/gpu/drm/i915/i915_vma_resource.c
> +++ b/drivers/gpu/drm/i915/i915_vma_resource.c
> @@ -23,15 +23,12 @@ static struct dma_fence_ops unbind_fence_ops = {
>   };
>   
>   /**
> - * i915_vma_resource_init - Initialize a vma resource.
> + * __i915_vma_resource_init - Initialize a vma resource.
>    * @vma_res: The vma resource to initialize
>    *
> - * Initializes a vma resource allocated using i915_vma_resource_alloc().
> - * The reason for having separate allocate and initialize function is that
> - * initialization may need to be performed from under a lock where
> - * allocation is not allowed.
> + * Initializes the private members of a vma resource.
>    */
> -void i915_vma_resource_init(struct i915_vma_resource *vma_res)
> +void __i915_vma_resource_init(struct i915_vma_resource *vma_res)
>   {
>   	spin_lock_init(&vma_res->lock);
>   	dma_fence_init(&vma_res->unbind_fence, &unbind_fence_ops,
> diff --git a/drivers/gpu/drm/i915/i915_vma_resource.h b/drivers/gpu/drm/i915/i915_vma_resource.h
> index 34744da23072..9872de58268b 100644
> --- a/drivers/gpu/drm/i915/i915_vma_resource.h
> +++ b/drivers/gpu/drm/i915/i915_vma_resource.h
> @@ -9,6 +9,25 @@
>   #include <linux/dma-fence.h>
>   #include <linux/refcount.h>
>   
> +#include "i915_gem.h"
> +
> +struct i915_page_sizes {
> +	/**
> +	 * The sg mask of the pages sg_table. i.e the mask of
> +	 * the lengths for each sg entry.
> +	 */
> +	unsigned int phys;
> +
> +	/**
> +	 * The gtt page sizes we are allowed to use given the
> +	 * sg mask and the supported page sizes. This will
> +	 * express the smallest unit we can use for the whole
> +	 * object, as well as the larger sizes we may be able
> +	 * to use opportunistically.
> +	 */
> +	unsigned int sg;
> +};
> +
>   /**
>    * struct i915_vma_resource - Snapshotted unbind information.
>    * @unbind_fence: Fence to mark unbinding complete. Note that this fence
> @@ -20,6 +39,13 @@
>    * @hold_count: Number of holders blocking the fence from finishing.
>    * The vma itself is keeping a hold, which is released when unbind
>    * is scheduled.
> + * @private: Bind backend private info.
> + * @start: Offset into the address space of bind range start.
> + * @node_size: Size of the allocated range manager node.
> + * @vma_size: Bind size.
> + * @page_sizes_gtt: Resulting page sizes from the bind operation.
> + * @bound_flags: Flags indicating binding status.
> + * @allocated: Backend private data. TODO: Should move into @private.
>    *
>    * The lifetime of a struct i915_vma_resource is from a binding request to
>    * the actual possible asynchronous unbind has completed.
> @@ -29,6 +55,32 @@ struct i915_vma_resource {
>   	/* See above for description of the lock. */
>   	spinlock_t lock;
>   	refcount_t hold_count;
> +
> +	/**
> +	 * struct i915_vma_bindinfo - Information needed for async bind
> +	 * only but that can be dropped after the bind has taken place.
> +	 * Consider making this a separate argument to the bind_vma
> +	 * op, coalescing with other arguments like vm, stash, cache_level
> +	 * and flags
> +	 * @pages: The pages sg-table.
> +	 * @page_sizes: Page sizes of the pages.
> +	 * @readonly: Whether the vma should be bound read-only.
> +	 * @lmem: Whether the vma points to lmem.
> +	 */
> +	struct i915_vma_bindinfo {
> +		struct sg_table *pages;
> +		struct i915_page_sizes page_sizes;
> +		bool readonly:1;
> +		bool lmem:1;
> +	} bi;
> +
> +	void *private;
> +	unsigned long start;
> +	unsigned long node_size;
> +	unsigned long vma_size;

AFAIK these need to be u64, or at least the node_size & start.
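
For illustration only (not part of the original review), a minimal sketch of the widening being asked for, shown against the hunk above; the added comment and exact placement are assumptions:

	-	unsigned long start;
	-	unsigned long node_size;
	-	unsigned long vma_size;
	+	/* u64 so 32-bit builds can still describe the full GTT range. */
	+	u64 start;
	+	u64 node_size;
	+	u64 vma_size;

The start, node_size and size arguments of i915_vma_resource_init() would presumably need the same treatment.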

Otherwise,
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> +	u32 page_sizes_gtt;
> +	u32 bound_flags;
> +	bool allocated:1;
>   };
>   
>   bool i915_vma_resource_hold(struct i915_vma_resource *vma_res,
> @@ -41,6 +93,8 @@ struct i915_vma_resource *i915_vma_resource_alloc(void);
>   
>   struct dma_fence *i915_vma_resource_unbind(struct i915_vma_resource *vma_res);
>   
> +void __i915_vma_resource_init(struct i915_vma_resource *vma_res);
> +
>   /**
>    * i915_vma_resource_get - Take a reference on a vma resource
>    * @vma_res: The vma resource on which to take a reference.
> @@ -63,8 +117,47 @@ static inline void i915_vma_resource_put(struct i915_vma_resource *vma_res)
>   	dma_fence_put(&vma_res->unbind_fence);
>   }
>   
> -#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> -void i915_vma_resource_init(struct i915_vma_resource *vma_res);
> -#endif
> +/**
> + * i915_vma_resource_init - Initialize a vma resource.
> + * @vma_res: The vma resource to initialize
> + * @pages: The pages sg-table.
> + * @page_sizes: Page sizes of the pages.
> + * @readonly: Whether the vma should be bound read-only.
> + * @lmem: Whether the vma points to lmem.
> + * @private: Bind backend private info.
> + * @start: Offset into the address space of bind range start.
> + * @node_size: Size of the allocated range manager node.
> + * @size: Bind size.
> + *
> + * Initializes a vma resource allocated using i915_vma_resource_alloc().
> + * The reason for having separate allocate and initialize function is that
> + * initialization may need to be performed from under a lock where
> + * allocation is not allowed.
> + */
> +static inline void i915_vma_resource_init(struct i915_vma_resource *vma_res,
> +					  struct sg_table *pages,
> +					  const struct i915_page_sizes *page_sizes,
> +					  bool readonly,
> +					  bool lmem,
> +					  void *private,
> +					  unsigned long start,
> +					  unsigned long node_size,
> +					  unsigned long size)
> +{
> +	__i915_vma_resource_init(vma_res);
> +	vma_res->bi.pages = pages;
> +	vma_res->bi.page_sizes = *page_sizes;
> +	vma_res->bi.readonly = readonly;
> +	vma_res->bi.lmem = lmem;
> +	vma_res->private = private;
> +	vma_res->start = start;
> +	vma_res->node_size = node_size;
> +	vma_res->vma_size = size;
> +}
> +
> +static inline void i915_vma_resource_fini(struct i915_vma_resource *vma_res)
> +{
> +	GEM_BUG_ON(refcount_read(&vma_res->hold_count) != 1);
> +}
>   
>   #endif
> diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.c b/drivers/gpu/drm/i915/i915_vma_snapshot.c
> index f7333c7a2f5e..69f62c1ca967 100644
> --- a/drivers/gpu/drm/i915/i915_vma_snapshot.c
> +++ b/drivers/gpu/drm/i915/i915_vma_snapshot.c
> @@ -24,11 +24,7 @@ void i915_vma_snapshot_init(struct i915_vma_snapshot *vsnap,
>   		assert_object_held(vma->obj);
>   
>   	vsnap->name = name;
> -	vsnap->size = vma->size;
>   	vsnap->obj_size = vma->obj->base.size;
> -	vsnap->gtt_offset = vma->node.start;
> -	vsnap->gtt_size = vma->node.size;
> -	vsnap->page_sizes = vma->page_sizes.gtt;
>   	vsnap->pages = vma->pages;
>   	vsnap->pages_rsgt = NULL;
>   	vsnap->mr = NULL;
> diff --git a/drivers/gpu/drm/i915/i915_vma_snapshot.h b/drivers/gpu/drm/i915/i915_vma_snapshot.h
> index e74588dd676b..1b08ce9f8576 100644
> --- a/drivers/gpu/drm/i915/i915_vma_snapshot.h
> +++ b/drivers/gpu/drm/i915/i915_vma_snapshot.h
> @@ -23,31 +23,23 @@ struct sg_table;
>   
>   /**
>    * struct i915_vma_snapshot - Snapshot of vma metadata.
> - * @size: The vma size in bytes.
>    * @obj_size: The size of the underlying object in bytes.
> - * @gtt_offset: The gtt offset the vma is bound to.
> - * @gtt_size: The size in bytes allocated for the vma in the GTT.
>    * @pages: The struct sg_table pointing to the pages bound.
>    * @pages_rsgt: The refcounted sg_table holding the reference for @pages if any.
>    * @mr: The memory region pointed for the pages bound.
>    * @kref: Reference for this structure.
>    * @vma_resource: Pointer to the vma resource representing the vma binding.
> - * @page_sizes: The vma GTT page sizes information.
>    * @onstack: Whether the structure shouldn't be freed on final put.
>    * @present: Whether the structure is present and initialized.
>    */
>   struct i915_vma_snapshot {
>   	const char *name;
> -	size_t size;
>   	size_t obj_size;
> -	size_t gtt_offset;
> -	size_t gtt_size;
>   	struct sg_table *pages;
>   	struct i915_refct_sgt *pages_rsgt;
>   	struct intel_memory_region *mr;
>   	struct kref kref;
>   	struct i915_vma_resource *vma_resource;
> -	u32 page_sizes;
>   	bool onstack:1;
>   	bool present:1;
>   };
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> index 54be880e55c3..70b5c47890b9 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> @@ -239,11 +239,11 @@ static int lowlevel_hole(struct i915_address_space *vm,
>   			 unsigned long end_time)
>   {
>   	I915_RND_STATE(seed_prng);
> -	struct i915_vma *mock_vma;
> +	struct i915_vma_resource *mock_vma_res;
>   	unsigned int size;
>   
> -	mock_vma = kzalloc(sizeof(*mock_vma), GFP_KERNEL);
> -	if (!mock_vma)
> +	mock_vma_res = kzalloc(sizeof(*mock_vma_res), GFP_KERNEL);
> +	if (!mock_vma_res)
>   		return -ENOMEM;
>   
>   	/* Keep creating larger objects until one cannot fit into the hole */
> @@ -269,7 +269,7 @@ static int lowlevel_hole(struct i915_address_space *vm,
>   				break;
>   		} while (count >>= 1);
>   		if (!count) {
> -			kfree(mock_vma);
> +			kfree(mock_vma_res);
>   			return -ENOMEM;
>   		}
>   		GEM_BUG_ON(!order);
> @@ -343,12 +343,12 @@ static int lowlevel_hole(struct i915_address_space *vm,
>   					break;
>   			}
>   
> -			mock_vma->pages = obj->mm.pages;
> -			mock_vma->node.size = BIT_ULL(size);
> -			mock_vma->node.start = addr;
> +			mock_vma_res->bi.pages = obj->mm.pages;
> +			mock_vma_res->node_size = BIT_ULL(size);
> +			mock_vma_res->start = addr;
>   
>   			with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref)
> -				vm->insert_entries(vm, mock_vma,
> +			  vm->insert_entries(vm, mock_vma_res,
>   						   I915_CACHE_NONE, 0);
>   		}
>   		count = n;
> @@ -371,7 +371,7 @@ static int lowlevel_hole(struct i915_address_space *vm,
>   		cleanup_freed_objects(vm->i915);
>   	}
>   
> -	kfree(mock_vma);
> +	kfree(mock_vma_res);
>   	return 0;
>   }
>   
> @@ -1280,6 +1280,7 @@ static void track_vma_bind(struct i915_vma *vma)
>   	atomic_set(&vma->pages_count, I915_VMA_PAGES_ACTIVE);
>   	__i915_gem_object_pin_pages(obj);
>   	vma->pages = obj->mm.pages;
> +	vma->resource->bi.pages = vma->pages;
>   
>   	mutex_lock(&vma->vm->mutex);
>   	list_add_tail(&vma->vm_link, &vma->vm->bound_list);
> @@ -1354,7 +1355,7 @@ static int reserve_gtt_with_resource(struct i915_vma *vma, u64 offset)
>   				   obj->cache_level,
>   				   0);
>   	if (!err) {
> -		i915_vma_resource_init(vma_res);
> +		i915_vma_resource_init_from_vma(vma_res, vma);
>   		vma->resource = vma_res;
>   	} else {
>   		kfree(vma_res);
> @@ -1533,7 +1534,7 @@ static int insert_gtt_with_resource(struct i915_vma *vma)
>   	err = i915_gem_gtt_insert(vm, &vma->node, obj->base.size, 0,
>   				  obj->cache_level, 0, vm->total, 0);
>   	if (!err) {
> -		i915_vma_resource_init(vma_res);
> +		i915_vma_resource_init_from_vma(vma_res, vma);
>   		vma->resource = vma_res;
>   	} else {
>   		kfree(vma_res);
> @@ -1958,6 +1959,7 @@ static int igt_cs_tlb(void *arg)
>   			struct i915_vm_pt_stash stash = {};
>   			struct i915_request *rq;
>   			struct i915_gem_ww_ctx ww;
> +			struct i915_vma_resource *vma_res;
>   			u64 offset;
>   
>   			offset = igt_random_offset(&prng,
> @@ -1978,6 +1980,13 @@ static int igt_cs_tlb(void *arg)
>   			if (err)
>   				goto end;
>   
> +			vma_res = i915_vma_resource_alloc();
> +			if (IS_ERR(vma_res)) {
> +				i915_vma_put_pages(vma);
> +				err = PTR_ERR(vma_res);
> +				goto end;
> +			}
> +
>   			i915_gem_ww_ctx_init(&ww, false);
>   retry:
>   			err = i915_vm_lock_objects(vm, &ww);
> @@ -1999,33 +2008,41 @@ static int igt_cs_tlb(void *arg)
>   					goto retry;
>   			}
>   			i915_gem_ww_ctx_fini(&ww);
> -			if (err)
> +			if (err) {
> +				kfree(vma_res);
>   				goto end;
> +			}
>   
> +			i915_vma_resource_init_from_vma(vma_res, vma);
>   			/* Prime the TLB with the dummy pages */
>   			for (i = 0; i < count; i++) {
> -				vma->node.start = offset + i * PAGE_SIZE;
> -				vm->insert_entries(vm, vma, I915_CACHE_NONE, 0);
> +				vma_res->start = offset + i * PAGE_SIZE;
> +				vm->insert_entries(vm, vma_res, I915_CACHE_NONE,
> +						   0);
>   
> -				rq = submit_batch(ce, vma->node.start);
> +				rq = submit_batch(ce, vma_res->start);
>   				if (IS_ERR(rq)) {
>   					err = PTR_ERR(rq);
> +					i915_vma_resource_fini(vma_res);
> +					kfree(vma_res);
>   					goto end;
>   				}
>   				i915_request_put(rq);
>   			}
> -
> +			i915_vma_resource_fini(vma_res);
>   			i915_vma_put_pages(vma);
>   
>   			err = context_sync(ce);
>   			if (err) {
>   				pr_err("%s: dummy setup timed out\n",
>   				       ce->engine->name);
> +				kfree(vma_res);
>   				goto end;
>   			}
>   
>   			vma = i915_vma_instance(act, vm, NULL);
>   			if (IS_ERR(vma)) {
> +				kfree(vma_res);
>   				err = PTR_ERR(vma);
>   				goto end;
>   			}
> @@ -2033,19 +2050,22 @@ static int igt_cs_tlb(void *arg)
>   			i915_gem_object_lock(act, NULL);
>   			err = i915_vma_get_pages(vma);
>   			i915_gem_object_unlock(act);
> -			if (err)
> +			if (err) {
> +				kfree(vma_res);
>   				goto end;
> +			}
>   
> +			i915_vma_resource_init_from_vma(vma_res, vma);
>   			/* Replace the TLB with target batches */
>   			for (i = 0; i < count; i++) {
>   				struct i915_request *rq;
>   				u32 *cs = batch + i * 64 / sizeof(*cs);
>   				u64 addr;
>   
> -				vma->node.start = offset + i * PAGE_SIZE;
> -				vm->insert_entries(vm, vma, I915_CACHE_NONE, 0);
> +				vma_res->start = offset + i * PAGE_SIZE;
> +				vm->insert_entries(vm, vma_res, I915_CACHE_NONE, 0);
>   
> -				addr = vma->node.start + i * 64;
> +				addr = vma_res->start + i * 64;
>   				cs[4] = MI_NOOP;
>   				cs[6] = lower_32_bits(addr);
>   				cs[7] = upper_32_bits(addr);
> @@ -2054,6 +2074,8 @@ static int igt_cs_tlb(void *arg)
>   				rq = submit_batch(ce, addr);
>   				if (IS_ERR(rq)) {
>   					err = PTR_ERR(rq);
> +					i915_vma_resource_fini(vma_res);
> +					kfree(vma_res);
>   					goto end;
>   				}
>   
> @@ -2070,6 +2092,8 @@ static int igt_cs_tlb(void *arg)
>   			}
>   			end_spin(batch, count - 1);
>   
> +			i915_vma_resource_fini(vma_res);
> +			kfree(vma_res);
>   			i915_vma_put_pages(vma);
>   
>   			err = context_sync(ce);
> diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
> index 1802baf80a17..d40519e3ca38 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
> @@ -33,23 +33,23 @@ static void mock_insert_page(struct i915_address_space *vm,
>   }
>   
>   static void mock_insert_entries(struct i915_address_space *vm,
> -				struct i915_vma *vma,
> +				struct i915_vma_resource *vma_res,
>   				enum i915_cache_level level, u32 flags)
>   {
>   }
>   
>   static void mock_bind_ppgtt(struct i915_address_space *vm,
>   			    struct i915_vm_pt_stash *stash,
> -			    struct i915_vma *vma,
> +			    struct i915_vma_resource *vma_res,
>   			    enum i915_cache_level cache_level,
>   			    u32 flags)
>   {
>   	GEM_BUG_ON(flags & I915_VMA_GLOBAL_BIND);
> -	set_bit(I915_VMA_LOCAL_BIND_BIT, __i915_vma_flags(vma));
> +	vma_res->bound_flags |= flags;
>   }
>   
>   static void mock_unbind_ppgtt(struct i915_address_space *vm,
> -			      struct i915_vma *vma)
> +			      struct i915_vma_resource *vma_res)
>   {
>   }
>   
> @@ -93,14 +93,14 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
>   
>   static void mock_bind_ggtt(struct i915_address_space *vm,
>   			   struct i915_vm_pt_stash *stash,
> -			   struct i915_vma *vma,
> +			   struct i915_vma_resource *vma_res,
>   			   enum i915_cache_level cache_level,
>   			   u32 flags)
>   {
>   }
>   
>   static void mock_unbind_ggtt(struct i915_address_space *vm,
> -			     struct i915_vma *vma)
> +			     struct i915_vma_resource *vma_res)
>   {
>   }
>   
> 


* Re: [PATCH v5 4/6] drm/i915: Use vma resources for async unbinding
  2022-01-04 12:51   ` [Intel-gfx] " Thomas Hellström
@ 2022-01-06 16:06     ` kernel test robot
  -1 siblings, 0 replies; 32+ messages in thread
From: kernel test robot @ 2022-01-06 16:06 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: llvm, kbuild-all

Hi "Thomas,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on drm-tip/drm-tip drm/drm-next next-20220105]
[cannot apply to drm-exynos/exynos-drm-next tegra-drm/drm/tegra/for-next v5.16-rc8]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Thomas-Hellstr-m/drm-i915-Asynchronous-vma-unbinding/20220104-205238
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-buildonly-randconfig-r001-20220106 (https://download.01.org/0day-ci/archive/20220106/202201062355.yUX5doeJ-lkp@intel.com/config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project ca7ffe09dc6e525109e3cd570cc5182ce568be13)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/b23720e8418513dff7cf35e04fb5af41ffeea98f
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Thomas-Hellstr-m/drm-i915-Asynchronous-vma-unbinding/20220104-205238
        git checkout b23720e8418513dff7cf35e04fb5af41ffeea98f
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/gpu/drm/i915/ fs/nfs/ sound/x86/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/i915/i915_vma_resource.c:262:25: warning: parameter 'end' set but not used [-Wunused-but-set-parameter]
                                        unsigned long *end)
                                                       ^
   1 warning generated.


vim +/end +262 drivers/gpu/drm/i915/i915_vma_resource.c

   258	
   259	static void
   260	i915_vma_resource_color_adjust_range(struct i915_address_space *vm,
   261					     unsigned long *start,
 > 262					     unsigned long *end)
   263	{
   264		if (i915_vm_has_cache_coloring(vm)) {
   265			if (start)
   266				start -= I915_GTT_PAGE_SIZE;
   267			end += I915_GTT_PAGE_SIZE;
   268		}
   269	}
   270	
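
Not part of the robot report, but for context: the warning fires because the function adjusts its local pointer variables rather than the values they point to. A sketch of the presumable intent (whether the underflow guard should test *start or the pointer is an assumption here, and these would become u64 * if the fields are widened as suggested earlier in the thread):

	static void
	i915_vma_resource_color_adjust_range(struct i915_address_space *vm,
					     unsigned long *start,
					     unsigned long *end)
	{
		if (i915_vm_has_cache_coloring(vm)) {
			/* Dereference so the caller-visible range is widened. */
			if (*start)
				*start -= I915_GTT_PAGE_SIZE;
			*end += I915_GTT_PAGE_SIZE;
		}
	}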

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org


* Re: [PATCH v5 3/6] drm/i915: Don't pin the object pages during pending vma binds
  2022-01-04 12:51   ` [Intel-gfx] " Thomas Hellström
@ 2022-01-06 16:08     ` Matthew Auld
  -1 siblings, 0 replies; 32+ messages in thread
From: Matthew Auld @ 2022-01-06 16:08 UTC (permalink / raw)
  To: Thomas Hellström, intel-gfx, dri-devel

On 04/01/2022 12:51, Thomas Hellström wrote:
> A pin-count is already held by vma->pages so taking an additional pin
> during async binds is not necessary.
> 
> When we introduce async unbinding we have other means of keeping the
> object pages alive.
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>


end of thread, other threads:[~2022-01-06 16:12 UTC | newest]

Thread overview: 32+ messages
2022-01-04 12:51 [PATCH v5 0/6] drm/i915: Asynchronous vma unbinding Thomas Hellström
2022-01-04 12:51 ` [Intel-gfx] " Thomas Hellström
2022-01-04 12:51 ` [PATCH v5 1/6] drm/i915: Initial introduction of vma resources Thomas Hellström
2022-01-04 12:51   ` [Intel-gfx] " Thomas Hellström
2022-01-06 15:22   ` Matthew Auld
2022-01-06 15:22     ` [Intel-gfx] " Matthew Auld
2022-01-04 12:51 ` [PATCH v5 2/6] drm/i915: Use the vma resource as argument for gtt binding / unbinding Thomas Hellström
2022-01-04 12:51   ` [Intel-gfx] " Thomas Hellström
2022-01-06 16:01   ` Matthew Auld
2022-01-06 16:01     ` [Intel-gfx] " Matthew Auld
2022-01-04 12:51 ` [PATCH v5 3/6] drm/i915: Don't pin the object pages during pending vma binds Thomas Hellström
2022-01-04 12:51   ` [Intel-gfx] " Thomas Hellström
2022-01-06 16:08   ` Matthew Auld
2022-01-06 16:08     ` [Intel-gfx] " Matthew Auld
2022-01-04 12:51 ` [PATCH v5 4/6] drm/i915: Use vma resources for async unbinding Thomas Hellström
2022-01-04 12:51   ` [Intel-gfx] " Thomas Hellström
2022-01-05 15:52   ` Matthew Auld
2022-01-05 15:52     ` [Intel-gfx] " Matthew Auld
2022-01-05 16:03     ` Thomas Hellström
2022-01-05 16:03       ` [Intel-gfx] " Thomas Hellström
2022-01-06 12:13   ` Matthew Auld
2022-01-06 12:13     ` [Intel-gfx] " Matthew Auld
2022-01-06 16:06   ` kernel test robot
2022-01-06 16:06     ` kernel test robot
2022-01-04 12:51 ` [PATCH v5 5/6] drm/i915: Asynchronous migration selftest Thomas Hellström
2022-01-04 12:51   ` [Intel-gfx] " Thomas Hellström
2022-01-04 12:51 ` [PATCH v5 6/6] drm/i915: Use struct vma_resource instead of struct vma_snapshot Thomas Hellström
2022-01-04 12:51   ` [Intel-gfx] " Thomas Hellström
2022-01-04 15:56 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Asynchronous vma unbinding (rev5) Patchwork
2022-01-04 15:58 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2022-01-04 16:10 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-01-04 17:52 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
