* [PATCH drm-misc-next v3 0/7] [RFC] DRM GPUVA Manager GPU-VM features
@ 2023-09-09 15:31 Danilo Krummrich
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 1/7] drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm Danilo Krummrich
                   ` (6 more replies)
  0 siblings, 7 replies; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-09 15:31 UTC (permalink / raw)
  To: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel, Danilo Krummrich

So far the DRM GPUVA manager offers common infrastructure to track GPU VA
allocations and mappings, generically connect GPU VA mappings to their
backing buffers and perform more complex mapping operations on the GPU VA
space.

However, there are more design patterns commonly used by drivers, which
can potentially be generalized in order to make the DRM GPUVA manager
represent a basic GPU-VM implementation. In this context, this patch series
aims to generalize the following elements.

1) Provide a common dma-resv for GEM objects not being used outside of
   this GPU-VM.

2) Provide tracking of external GEM objects (GEM objects which are
   shared with other GPU-VMs).

3) Provide functions to efficiently lock the dma-resv of all GEM objects the
   GPU-VM contains mappings of (see the sketch after this list).

4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
   of, such that validation of evicted GEM objects is accelerated.

5) Provide some convenience functions for common patterns.
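
Regarding 3), the following is a rough sketch of the per-VA locking pattern
drivers open-code today, modeled on the nouveau exec path touched later in
this series (the uvmm/exec identifiers are taken from there; error handling
is abbreviated):

	drm_exec_until_all_locked(exec) {
		struct drm_gpuva *va;
		int ret;

		drm_gpuvm_for_each_va(va, &uvmm->umgr) {
			if (unlikely(va == &uvmm->umgr.kernel_alloc_node))
				continue;

			/* Lock the dma-resv of every GEM object mapped in the VM. */
			ret = drm_exec_prepare_obj(exec, va->gem.obj, 1);
			drm_exec_retry_on_contention(exec);
			if (ret)
				return ret;
		}
	}

With 1) - 3) in place the idea is that drivers only need to lock the GPU-VM's
common dma-resv plus the tracked external objects, instead of iterating all
mappings.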

The implementation introduces struct drm_gpuvm_bo, which serves as an
abstraction combining a struct drm_gpuvm and struct drm_gem_object, similar to
what amdgpu does with struct amdgpu_bo_vm. While this adds a bit of complexity,
it improves the efficiency of tracking external and evicted GEM objects.
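
Conceptually, such a VM / BO combination boils down to something like the
following sketch; the field names are illustrative only, the actual definition
is introduced in patch 5 of this series:

	struct drm_gpuvm_bo {
		struct drm_gpuvm *vm;		/* the VM this combination belongs to */
		struct drm_gem_object *obj;	/* the backing GEM object */

		struct list_head gpuva;	/* all drm_gpuvas mapping @obj within @vm */
		struct list_head extobj;	/* entry in the VM's external object list */
		struct list_head evict;	/* entry in the VM's evicted object list */
	};

This way external and evicted objects are tracked once per VM and GEM object
combination rather than once per mapping.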

This patch series also renames struct drm_gpuva_manager to struct drm_gpuvm,
along with the corresponding functions. This way the GPUVA manager's structures
align better with the documentation of VM_BIND [1] and VM_BIND locking [2]. It
also provides a better foundation for the naming of data structures and
functions introduced for implementing the features of this patch series.

This patch series is also available at [3].

[1] Documentation/gpu/drm-vm-bind-async.rst
[2] Documentation/gpu/drm-vm-bind-locking.rst
[3] https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/gpuvm-next

Changes in V2:
==============
  - rename 'drm_gpuva_manager' -> 'drm_gpuvm' which generally leads to more
    consistent naming
  - properly separate commits (introduce common dma-resv, drm_gpuvm_bo
    abstraction, etc.)
  - remove maple tree for tracking external objects, use a list of
    drm_gpuvm_bos per drm_gpuvm instead
  - rework dma-resv locking helpers (Thomas)
  - add a locking helper for a given range of the VA space (Christian)
  - make the GPUVA manager buildable as a module (like drm_exec), rather than
    builtin (Christian)

Changes in V3:
==============
  - rename missing functions and files (Boris)
  - warn if vm_obj->obj != obj in drm_gpuva_link() (Boris)
  - don't expose drm_gpuvm_bo_destroy() (Boris)
  - unlink the VM_BO from the GEM in drm_gpuvm_bo_destroy() rather than in
    drm_gpuva_unlink(), and link it within drm_gpuvm_bo_obtain(), to keep
    drm_gpuvm_bo instances unique (see the sketch below)
  - add internal locking to external and evicted object lists to support drivers
    updating the VA space from within the fence signalling critical path (Boris)
  - unlink external objects and evicted objects from the GPUVM's list in
    drm_gpuvm_bo_destroy()
  - add more documentation and fix some kernel doc issues
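
For reference, the intended driver-side usage implied by the linking changes
above is roughly the following; the sketch assumes the signatures introduced
in patch 5 and abbreviates error handling:

	/* Get (or create) the unique VM_BO combination for this GEM object;
	 * per the changes above, the VM_BO is linked to the GEM object here,
	 * not in drm_gpuva_link().
	 */
	vm_bo = drm_gpuvm_bo_obtain(gpuvm, obj);
	if (IS_ERR(vm_bo))
		return PTR_ERR(vm_bo);

	drm_gpuva_map(gpuvm, va, &op->map);
	drm_gpuva_link(va);

	/* Later, drm_gpuvm_bo_destroy() unlinks the VM_BO from the GEM
	 * object again.
	 */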

Danilo Krummrich (7):
  drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm
  drm/gpuvm: allow building as module
  drm/nouveau: uvmm: rename 'umgr' to 'base'
  drm/gpuvm: common dma-resv per struct drm_gpuvm
  drm/gpuvm: add an abstraction for a VM / BO combination
  drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  drm/nouveau: GPUVM dma-resv/extobj handling, GEM validation

 drivers/gpu/drm/Kconfig                       |    7 +
 drivers/gpu/drm/Makefile                      |    2 +-
 drivers/gpu/drm/drm_debugfs.c                 |   16 +-
 .../gpu/drm/{drm_gpuva_mgr.c => drm_gpuvm.c}  | 1209 ++++++++++++++---
 drivers/gpu/drm/nouveau/Kconfig               |    1 +
 drivers/gpu/drm/nouveau/nouveau_bo.c          |    4 +-
 drivers/gpu/drm/nouveau/nouveau_debugfs.c     |    2 +-
 drivers/gpu/drm/nouveau/nouveau_exec.c        |   52 +-
 drivers/gpu/drm/nouveau/nouveau_exec.h        |    4 -
 drivers/gpu/drm/nouveau/nouveau_gem.c         |    4 +-
 drivers/gpu/drm/nouveau/nouveau_sched.h       |    4 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c        |  207 +--
 drivers/gpu/drm/nouveau/nouveau_uvmm.h        |    8 +-
 include/drm/drm_debugfs.h                     |    6 +-
 include/drm/drm_gem.h                         |   32 +-
 include/drm/{drm_gpuva_mgr.h => drm_gpuvm.h}  |  510 +++++--
 16 files changed, 1605 insertions(+), 463 deletions(-)
 rename drivers/gpu/drm/{drm_gpuva_mgr.c => drm_gpuvm.c} (51%)
 rename include/drm/{drm_gpuva_mgr.h => drm_gpuvm.h} (53%)


base-commit: 6bd3d8da51ca1ec97c724016466606aec7739b9f
-- 
2.41.0



* [PATCH drm-misc-next v3 1/7] drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm
  2023-09-09 15:31 [PATCH drm-misc-next v3 0/7] [RFC] DRM GPUVA Manager GPU-VM features Danilo Krummrich
@ 2023-09-09 15:31 ` Danilo Krummrich
  2023-09-09 18:23   ` kernel test robot
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 2/7] drm/gpuvm: allow building as module Danilo Krummrich
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-09 15:31 UTC (permalink / raw)
  To: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel, Danilo Krummrich

Rename struct drm_gpuva_manager to struct drm_gpuvm, along with the
corresponding functions. This way the GPUVA manager's structures align
very well with the documentation of VM_BIND [1] and VM_BIND locking [2].

It also provides a better foundation for the naming of data structures
and functions introduced for implementing a common dma-resv per GPU-VM,
including tracking of external and evicted objects, in subsequent
patches.

[1] Documentation/gpu/drm-vm-bind-async.rst
[2] Documentation/gpu/drm-vm-bind-locking.rst
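
As an example of the mechanical change, taken from the nouveau hunk below,
a driver's initialization call changes from

	drm_gpuva_manager_init(&uvmm->umgr, cli->name,
			       NOUVEAU_VA_SPACE_START,
			       NOUVEAU_VA_SPACE_END,
			       kernel_managed_addr, kernel_managed_size,
			       NULL);

to

	drm_gpuvm_init(&uvmm->umgr, cli->name,
		       NOUVEAU_VA_SPACE_START,
		       NOUVEAU_VA_SPACE_END,
		       kernel_managed_addr, kernel_managed_size,
		       NULL);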

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
 drivers/gpu/drm/Makefile                      |   2 +-
 drivers/gpu/drm/drm_debugfs.c                 |  16 +-
 .../gpu/drm/{drm_gpuva_mgr.c => drm_gpuvm.c}  | 392 +++++++++---------
 drivers/gpu/drm/nouveau/nouveau_exec.c        |   2 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c        |  24 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.h        |   6 +-
 include/drm/drm_debugfs.h                     |   6 +-
 include/drm/{drm_gpuva_mgr.h => drm_gpuvm.h}  | 153 ++++---
 8 files changed, 300 insertions(+), 301 deletions(-)
 rename drivers/gpu/drm/{drm_gpuva_mgr.c => drm_gpuvm.c} (78%)
 rename include/drm/{drm_gpuva_mgr.h => drm_gpuvm.h} (78%)

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 215e78e79125..7a84b3cddeab 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -45,7 +45,7 @@ drm-y := \
 	drm_vblank.o \
 	drm_vblank_work.o \
 	drm_vma_manager.o \
-	drm_gpuva_mgr.o \
+	drm_gpuvm.o \
 	drm_writeback.o
 drm-$(CONFIG_DRM_LEGACY) += \
 	drm_agpsupport.o \
diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
index 34c7d1a580e3..f8c5cf880802 100644
--- a/drivers/gpu/drm/drm_debugfs.c
+++ b/drivers/gpu/drm/drm_debugfs.c
@@ -40,7 +40,7 @@
 #include <drm/drm_file.h>
 #include <drm/drm_gem.h>
 #include <drm/drm_managed.h>
-#include <drm/drm_gpuva_mgr.h>
+#include <drm/drm_gpuvm.h>
 
 #include "drm_crtc_internal.h"
 #include "drm_internal.h"
@@ -187,31 +187,31 @@ static const struct file_operations drm_debugfs_fops = {
 /**
  * drm_debugfs_gpuva_info - dump the given DRM GPU VA space
  * @m: pointer to the &seq_file to write
- * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @gpuvm: the &drm_gpuvm representing the GPU VA space
  *
  * Dumps the GPU VA mappings of a given DRM GPU VA manager.
  *
  * For each DRM GPU VA space drivers should call this function from their
  * &drm_info_list's show callback.
  *
- * Returns: 0 on success, -ENODEV if the &mgr is not initialized
+ * Returns: 0 on success, -ENODEV if the &gpuvm is not initialized
  */
 int drm_debugfs_gpuva_info(struct seq_file *m,
-			   struct drm_gpuva_manager *mgr)
+			   struct drm_gpuvm *gpuvm)
 {
-	struct drm_gpuva *va, *kva = &mgr->kernel_alloc_node;
+	struct drm_gpuva *va, *kva = &gpuvm->kernel_alloc_node;
 
-	if (!mgr->name)
+	if (!gpuvm->name)
 		return -ENODEV;
 
 	seq_printf(m, "DRM GPU VA space (%s) [0x%016llx;0x%016llx]\n",
-		   mgr->name, mgr->mm_start, mgr->mm_start + mgr->mm_range);
+		   gpuvm->name, gpuvm->mm_start, gpuvm->mm_start + gpuvm->mm_range);
 	seq_printf(m, "Kernel reserved node [0x%016llx;0x%016llx]\n",
 		   kva->va.addr, kva->va.addr + kva->va.range);
 	seq_puts(m, "\n");
 	seq_puts(m, " VAs | start              | range              | end                | object             | object offset\n");
 	seq_puts(m, "-------------------------------------------------------------------------------------------------------------\n");
-	drm_gpuva_for_each_va(va, mgr) {
+	drm_gpuvm_for_each_va(va, gpuvm) {
 		if (unlikely(va == kva))
 			continue;
 
diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuvm.c
similarity index 78%
rename from drivers/gpu/drm/drm_gpuva_mgr.c
rename to drivers/gpu/drm/drm_gpuvm.c
index f86bfad74ff8..de1a69bc4a44 100644
--- a/drivers/gpu/drm/drm_gpuva_mgr.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -25,7 +25,7 @@
  *
  */
 
-#include <drm/drm_gpuva_mgr.h>
+#include <drm/drm_gpuvm.h>
 
 #include <linux/interval_tree_generic.h>
 #include <linux/mm.h>
@@ -33,8 +33,8 @@
 /**
  * DOC: Overview
  *
- * The DRM GPU VA Manager, represented by struct drm_gpuva_manager keeps track
- * of a GPU's virtual address (VA) space and manages the corresponding virtual
+ * The DRM GPU VA Manager, represented by struct drm_gpuvm keeps track of a
+ * GPU's virtual address (VA) space and manages the corresponding virtual
  * mappings represented by &drm_gpuva objects. It also keeps track of the
  * mapping's backing &drm_gem_object buffers.
  *
@@ -47,28 +47,28 @@
  * The GPU VA manager internally uses a rb-tree to manage the
  * &drm_gpuva mappings within a GPU's virtual address space.
  *
- * The &drm_gpuva_manager contains a special &drm_gpuva representing the
+ * The &drm_gpuvm structure contains a special &drm_gpuva representing the
  * portion of VA space reserved by the kernel. This node is initialized together
  * with the GPU VA manager instance and removed when the GPU VA manager is
  * destroyed.
  *
- * In a typical application drivers would embed struct drm_gpuva_manager and
+ * In a typical application drivers would embed struct drm_gpuvm and
  * struct drm_gpuva within their own driver specific structures, there won't be
  * any memory allocations of its own nor memory allocations of &drm_gpuva
  * entries.
  *
- * The data structures needed to store &drm_gpuvas within the &drm_gpuva_manager
- * are contained within struct drm_gpuva already. Hence, for inserting
- * &drm_gpuva entries from within dma-fence signalling critical sections it is
- * enough to pre-allocate the &drm_gpuva structures.
+ * The data structures needed to store &drm_gpuvas within the &drm_gpuvm are
+ * contained within struct drm_gpuva already. Hence, for inserting &drm_gpuva
+ * entries from within dma-fence signalling critical sections it is enough to
+ * pre-allocate the &drm_gpuva structures.
  */
 
 /**
  * DOC: Split and Merge
  *
  * Besides its capability to manage and represent a GPU VA space, the
- * &drm_gpuva_manager also provides functions to let the &drm_gpuva_manager
- * calculate a sequence of operations to satisfy a given map or unmap request.
+ * GPU VA manager also provides functions to let the &drm_gpuvm calculate a
+ * sequence of operations to satisfy a given map or unmap request.
  *
  * Therefore the DRM GPU VA manager provides an algorithm implementing splitting
  * and merging of existent GPU VA mappings with the ones that are requested to
@@ -76,16 +76,16 @@
  * implement Vulkan 'Sparse Memory Bindings' - drivers UAPIs often refer to this
  * as VM BIND.
  *
- * Drivers can call drm_gpuva_sm_map() to receive a sequence of callbacks
+ * Drivers can call drm_gpuvm_sm_map() to receive a sequence of callbacks
  * containing map, unmap and remap operations for a given newly requested
  * mapping. The sequence of callbacks represents the set of operations to
  * execute in order to integrate the new mapping cleanly into the current state
  * of the GPU VA space.
  *
  * Depending on how the new GPU VA mapping intersects with the existent mappings
- * of the GPU VA space the &drm_gpuva_fn_ops callbacks contain an arbitrary
- * amount of unmap operations, a maximum of two remap operations and a single
- * map operation. The caller might receive no callback at all if no operation is
+ * of the GPU VA space the &drm_gpuvm_ops callbacks contain an arbitrary amount
+ * of unmap operations, a maximum of two remap operations and a single map
+ * operation. The caller might receive no callback at all if no operation is
  * required, e.g. if the requested mapping already exists in the exact same way.
  *
  * The single map operation represents the original map operation requested by
@@ -95,7 +95,7 @@
  * &drm_gpuva to unmap is physically contiguous with the original mapping
  * request. Optionally, if 'keep' is set, drivers may keep the actual page table
  * entries for this &drm_gpuva, adding the missing page table entries only and
- * update the &drm_gpuva_manager's view of things accordingly.
+ * update the &drm_gpuvm's view of things accordingly.
  *
  * Drivers may do the same optimization, namely delta page table updates, also
  * for remap operations. This is possible since &drm_gpuva_op_remap consists of
@@ -106,34 +106,34 @@
  * the beginning and one at the end of the new mapping, hence there is a
  * maximum of two remap operations.
  *
- * Analogous to drm_gpuva_sm_map() drm_gpuva_sm_unmap() uses &drm_gpuva_fn_ops
- * to call back into the driver in order to unmap a range of GPU VA space. The
+ * Analogous to drm_gpuvm_sm_map() drm_gpuvm_sm_unmap() uses &drm_gpuvm_ops to
+ * call back into the driver in order to unmap a range of GPU VA space. The
  * logic behind this function is way simpler though: For all existent mappings
  * enclosed by the given range unmap operations are created. For mappings which
  * are only partically located within the given range, remap operations are
  * created such that those mappings are split up and re-mapped partically.
  *
- * As an alternative to drm_gpuva_sm_map() and drm_gpuva_sm_unmap(),
- * drm_gpuva_sm_map_ops_create() and drm_gpuva_sm_unmap_ops_create() can be used
+ * As an alternative to drm_gpuvm_sm_map() and drm_gpuvm_sm_unmap(),
+ * drm_gpuvm_sm_map_ops_create() and drm_gpuvm_sm_unmap_ops_create() can be used
  * to directly obtain an instance of struct drm_gpuva_ops containing a list of
  * &drm_gpuva_op, which can be iterated with drm_gpuva_for_each_op(). This list
  * contains the &drm_gpuva_ops analogous to the callbacks one would receive when
- * calling drm_gpuva_sm_map() or drm_gpuva_sm_unmap(). While this way requires
+ * calling drm_gpuvm_sm_map() or drm_gpuvm_sm_unmap(). While this way requires
  * more memory (to allocate the &drm_gpuva_ops), it provides drivers a way to
  * iterate the &drm_gpuva_op multiple times, e.g. once in a context where memory
  * allocations are possible (e.g. to allocate GPU page tables) and once in the
  * dma-fence signalling critical path.
  *
- * To update the &drm_gpuva_manager's view of the GPU VA space
- * drm_gpuva_insert() and drm_gpuva_remove() may be used. These functions can
- * safely be used from &drm_gpuva_fn_ops callbacks originating from
- * drm_gpuva_sm_map() or drm_gpuva_sm_unmap(). However, it might be more
- * convenient to use the provided helper functions drm_gpuva_map(),
- * drm_gpuva_remap() and drm_gpuva_unmap() instead.
+ * To update the &drm_gpuvm's view of the GPU VA space drm_gpuva_insert() and
+ * drm_gpuva_remove() may be used. These functions can safely be used from
+ * &drm_gpuvm_ops callbacks originating from drm_gpuvm_sm_map() or
+ * drm_gpuvm_sm_unmap(). However, it might be more convenient to use the
+ * provided helper functions drm_gpuva_map(), drm_gpuva_remap() and
+ * drm_gpuva_unmap() instead.
  *
  * The following diagram depicts the basic relationships of existent GPU VA
  * mappings, a newly requested mapping and the resulting mappings as implemented
- * by drm_gpuva_sm_map() - it doesn't cover any arbitrary combinations of these.
+ * by drm_gpuvm_sm_map() - it doesn't cover any arbitrary combinations of these.
  *
  * 1) Requested mapping is identical. Replace it, but indicate the backing PTEs
  *    could be kept.
@@ -421,10 +421,10 @@
  *	// Allocates a new &drm_gpuva.
  *	struct drm_gpuva * driver_gpuva_alloc(void);
  *
- *	// Typically drivers would embedd the &drm_gpuva_manager and &drm_gpuva
+ *	// Typically drivers would embedd the &drm_gpuvm and &drm_gpuva
  *	// structure in individual driver structures and lock the dma-resv with
  *	// drm_exec or similar helpers.
- *	int driver_mapping_create(struct drm_gpuva_manager *mgr,
+ *	int driver_mapping_create(struct drm_gpuvm *gpuvm,
  *				  u64 addr, u64 range,
  *				  struct drm_gem_object *obj, u64 offset)
  *	{
@@ -432,7 +432,7 @@
  *		struct drm_gpuva_op *op
  *
  *		driver_lock_va_space();
- *		ops = drm_gpuva_sm_map_ops_create(mgr, addr, range,
+ *		ops = drm_gpuvm_sm_map_ops_create(gpuvm, addr, range,
  *						  obj, offset);
  *		if (IS_ERR(ops))
  *			return PTR_ERR(ops);
@@ -448,7 +448,7 @@
  *					  // free memory and unlock
  *
  *				driver_vm_map();
- *				drm_gpuva_map(mgr, va, &op->map);
+ *				drm_gpuva_map(gpuvm, va, &op->map);
  *				drm_gpuva_link(va);
  *
  *				break;
@@ -504,23 +504,23 @@
  * 2) Receive a callback for each &drm_gpuva_op to create a new mapping::
  *
  *	struct driver_context {
- *		struct drm_gpuva_manager *mgr;
+ *		struct drm_gpuvm *gpuvm;
  *		struct drm_gpuva *new_va;
  *		struct drm_gpuva *prev_va;
  *		struct drm_gpuva *next_va;
  *	};
  *
- *	// ops to pass to drm_gpuva_manager_init()
- *	static const struct drm_gpuva_fn_ops driver_gpuva_ops = {
+ *	// ops to pass to drm_gpuvm_init()
+ *	static const struct drm_gpuvm_ops driver_gpuvm_ops = {
  *		.sm_step_map = driver_gpuva_map,
  *		.sm_step_remap = driver_gpuva_remap,
  *		.sm_step_unmap = driver_gpuva_unmap,
  *	};
  *
- *	// Typically drivers would embedd the &drm_gpuva_manager and &drm_gpuva
+ *	// Typically drivers would embedd the &drm_gpuvm and &drm_gpuva
  *	// structure in individual driver structures and lock the dma-resv with
  *	// drm_exec or similar helpers.
- *	int driver_mapping_create(struct drm_gpuva_manager *mgr,
+ *	int driver_mapping_create(struct drm_gpuvm *gpuvm,
  *				  u64 addr, u64 range,
  *				  struct drm_gem_object *obj, u64 offset)
  *	{
@@ -529,7 +529,7 @@
  *		struct drm_gpuva_op *op;
  *		int ret = 0;
  *
- *		ctx.mgr = mgr;
+ *		ctx.gpuvm = gpuvm;
  *
  *		ctx.new_va = kzalloc(sizeof(*ctx.new_va), GFP_KERNEL);
  *		ctx.prev_va = kzalloc(sizeof(*ctx.prev_va), GFP_KERNEL);
@@ -540,7 +540,7 @@
  *		}
  *
  *		driver_lock_va_space();
- *		ret = drm_gpuva_sm_map(mgr, &ctx, addr, range, obj, offset);
+ *		ret = drm_gpuvm_sm_map(gpuvm, &ctx, addr, range, obj, offset);
  *		driver_unlock_va_space();
  *
  *	out:
@@ -554,7 +554,7 @@
  *	{
  *		struct driver_context *ctx = __ctx;
  *
- *		drm_gpuva_map(ctx->mgr, ctx->new_va, &op->map);
+ *		drm_gpuva_map(ctx->vm, ctx->new_va, &op->map);
  *
  *		drm_gpuva_link(ctx->new_va);
  *
@@ -609,7 +609,7 @@ INTERVAL_TREE_DEFINE(struct drm_gpuva, rb.node, u64, rb.__subtree_last,
 		     GPUVA_START, GPUVA_LAST, static __maybe_unused,
 		     drm_gpuva_it)
 
-static int __drm_gpuva_insert(struct drm_gpuva_manager *mgr,
+static int __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
 			      struct drm_gpuva *va);
 static void __drm_gpuva_remove(struct drm_gpuva *va);
 
@@ -623,121 +623,121 @@ drm_gpuva_check_overflow(u64 addr, u64 range)
 }
 
 static bool
-drm_gpuva_in_mm_range(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
+drm_gpuva_in_mm_range(struct drm_gpuvm *gpuvm, u64 addr, u64 range)
 {
 	u64 end = addr + range;
-	u64 mm_start = mgr->mm_start;
-	u64 mm_end = mm_start + mgr->mm_range;
+	u64 mm_start = gpuvm->mm_start;
+	u64 mm_end = mm_start + gpuvm->mm_range;
 
 	return addr >= mm_start && end <= mm_end;
 }
 
 static bool
-drm_gpuva_in_kernel_node(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
+drm_gpuva_in_kernel_node(struct drm_gpuvm *gpuvm, u64 addr, u64 range)
 {
 	u64 end = addr + range;
-	u64 kstart = mgr->kernel_alloc_node.va.addr;
-	u64 krange = mgr->kernel_alloc_node.va.range;
+	u64 kstart = gpuvm->kernel_alloc_node.va.addr;
+	u64 krange = gpuvm->kernel_alloc_node.va.range;
 	u64 kend = kstart + krange;
 
 	return krange && addr < kend && kstart < end;
 }
 
 static bool
-drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
+drm_gpuva_range_valid(struct drm_gpuvm *gpuvm,
 		      u64 addr, u64 range)
 {
 	return !drm_gpuva_check_overflow(addr, range) &&
-	       drm_gpuva_in_mm_range(mgr, addr, range) &&
-	       !drm_gpuva_in_kernel_node(mgr, addr, range);
+	       drm_gpuva_in_mm_range(gpuvm, addr, range) &&
+	       !drm_gpuva_in_kernel_node(gpuvm, addr, range);
 }
 
 /**
- * drm_gpuva_manager_init() - initialize a &drm_gpuva_manager
- * @mgr: pointer to the &drm_gpuva_manager to initialize
+ * drm_gpuvm_init() - initialize a &drm_gpuvm
+ * @gpuvm: pointer to the &drm_gpuvm to initialize
  * @name: the name of the GPU VA space
  * @start_offset: the start offset of the GPU VA space
  * @range: the size of the GPU VA space
  * @reserve_offset: the start of the kernel reserved GPU VA area
  * @reserve_range: the size of the kernel reserved GPU VA area
- * @ops: &drm_gpuva_fn_ops called on &drm_gpuva_sm_map / &drm_gpuva_sm_unmap
+ * @ops: &drm_gpuvm_ops called on &drm_gpuvm_sm_map / &drm_gpuvm_sm_unmap
  *
- * The &drm_gpuva_manager must be initialized with this function before use.
+ * The &drm_gpuvm must be initialized with this function before use.
  *
- * Note that @mgr must be cleared to 0 before calling this function. The given
+ * Note that @gpuvm must be cleared to 0 before calling this function. The given
  * &name is expected to be managed by the surrounding driver structures.
  */
 void
-drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
-		       const char *name,
-		       u64 start_offset, u64 range,
-		       u64 reserve_offset, u64 reserve_range,
-		       const struct drm_gpuva_fn_ops *ops)
+drm_gpuvm_init(struct drm_gpuvm *gpuvm,
+	       const char *name,
+	       u64 start_offset, u64 range,
+	       u64 reserve_offset, u64 reserve_range,
+	       const struct drm_gpuvm_ops *ops)
 {
-	mgr->rb.tree = RB_ROOT_CACHED;
-	INIT_LIST_HEAD(&mgr->rb.list);
+	gpuvm->rb.tree = RB_ROOT_CACHED;
+	INIT_LIST_HEAD(&gpuvm->rb.list);
 
 	drm_gpuva_check_overflow(start_offset, range);
-	mgr->mm_start = start_offset;
-	mgr->mm_range = range;
+	gpuvm->mm_start = start_offset;
+	gpuvm->mm_range = range;
 
-	mgr->name = name ? name : "unknown";
-	mgr->ops = ops;
+	gpuvm->name = name ? name : "unknown";
+	gpuvm->ops = ops;
 
-	memset(&mgr->kernel_alloc_node, 0, sizeof(struct drm_gpuva));
+	memset(&gpuvm->kernel_alloc_node, 0, sizeof(struct drm_gpuva));
 
 	if (reserve_range) {
-		mgr->kernel_alloc_node.va.addr = reserve_offset;
-		mgr->kernel_alloc_node.va.range = reserve_range;
+		gpuvm->kernel_alloc_node.va.addr = reserve_offset;
+		gpuvm->kernel_alloc_node.va.range = reserve_range;
 
 		if (likely(!drm_gpuva_check_overflow(reserve_offset,
 						     reserve_range)))
-			__drm_gpuva_insert(mgr, &mgr->kernel_alloc_node);
+			__drm_gpuva_insert(gpuvm, &gpuvm->kernel_alloc_node);
 	}
 }
-EXPORT_SYMBOL_GPL(drm_gpuva_manager_init);
+EXPORT_SYMBOL_GPL(drm_gpuvm_init);
 
 /**
- * drm_gpuva_manager_destroy() - cleanup a &drm_gpuva_manager
- * @mgr: pointer to the &drm_gpuva_manager to clean up
+ * drm_gpuvm_destroy() - cleanup a &drm_gpuvm
+ * @gpuvm: pointer to the &drm_gpuvm to clean up
  *
  * Note that it is a bug to call this function on a manager that still
  * holds GPU VA mappings.
  */
 void
-drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr)
+drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
 {
-	mgr->name = NULL;
+	gpuvm->name = NULL;
 
-	if (mgr->kernel_alloc_node.va.range)
-		__drm_gpuva_remove(&mgr->kernel_alloc_node);
+	if (gpuvm->kernel_alloc_node.va.range)
+		__drm_gpuva_remove(&gpuvm->kernel_alloc_node);
 
-	WARN(!RB_EMPTY_ROOT(&mgr->rb.tree.rb_root),
+	WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
 	     "GPUVA tree is not empty, potentially leaking memory.");
 }
-EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy);
+EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
 
 static int
-__drm_gpuva_insert(struct drm_gpuva_manager *mgr,
+__drm_gpuva_insert(struct drm_gpuvm *gpuvm,
 		   struct drm_gpuva *va)
 {
 	struct rb_node *node;
 	struct list_head *head;
 
-	if (drm_gpuva_it_iter_first(&mgr->rb.tree,
+	if (drm_gpuva_it_iter_first(&gpuvm->rb.tree,
 				    GPUVA_START(va),
 				    GPUVA_LAST(va)))
 		return -EEXIST;
 
-	va->mgr = mgr;
+	va->vm = gpuvm;
 
-	drm_gpuva_it_insert(va, &mgr->rb.tree);
+	drm_gpuva_it_insert(va, &gpuvm->rb.tree);
 
 	node = rb_prev(&va->rb.node);
 	if (node)
 		head = &(to_drm_gpuva(node))->rb.entry;
 	else
-		head = &mgr->rb.list;
+		head = &gpuvm->rb.list;
 
 	list_add(&va->rb.entry, head);
 
@@ -746,36 +746,36 @@ __drm_gpuva_insert(struct drm_gpuva_manager *mgr,
 
 /**
  * drm_gpuva_insert() - insert a &drm_gpuva
- * @mgr: the &drm_gpuva_manager to insert the &drm_gpuva in
+ * @gpuvm: the &drm_gpuvm to insert the &drm_gpuva in
  * @va: the &drm_gpuva to insert
  *
  * Insert a &drm_gpuva with a given address and range into a
- * &drm_gpuva_manager.
+ * &drm_gpuvm.
  *
  * It is safe to use this function using the safe versions of iterating the GPU
- * VA space, such as drm_gpuva_for_each_va_safe() and
- * drm_gpuva_for_each_va_range_safe().
+ * VA space, such as drm_gpuvm_for_each_va_safe() and
+ * drm_gpuvm_for_each_va_range_safe().
  *
  * Returns: 0 on success, negative error code on failure.
  */
 int
-drm_gpuva_insert(struct drm_gpuva_manager *mgr,
+drm_gpuva_insert(struct drm_gpuvm *gpuvm,
 		 struct drm_gpuva *va)
 {
 	u64 addr = va->va.addr;
 	u64 range = va->va.range;
 
-	if (unlikely(!drm_gpuva_range_valid(mgr, addr, range)))
+	if (unlikely(!drm_gpuva_range_valid(gpuvm, addr, range)))
 		return -EINVAL;
 
-	return __drm_gpuva_insert(mgr, va);
+	return __drm_gpuva_insert(gpuvm, va);
 }
 EXPORT_SYMBOL_GPL(drm_gpuva_insert);
 
 static void
 __drm_gpuva_remove(struct drm_gpuva *va)
 {
-	drm_gpuva_it_remove(va, &va->mgr->rb.tree);
+	drm_gpuva_it_remove(va, &va->vm->rb.tree);
 	list_del_init(&va->rb.entry);
 }
 
@@ -786,15 +786,15 @@ __drm_gpuva_remove(struct drm_gpuva *va)
  * This removes the given &va from the underlaying tree.
  *
  * It is safe to use this function using the safe versions of iterating the GPU
- * VA space, such as drm_gpuva_for_each_va_safe() and
- * drm_gpuva_for_each_va_range_safe().
+ * VA space, such as drm_gpuvm_for_each_va_safe() and
+ * drm_gpuvm_for_each_va_range_safe().
  */
 void
 drm_gpuva_remove(struct drm_gpuva *va)
 {
-	struct drm_gpuva_manager *mgr = va->mgr;
+	struct drm_gpuvm *gpuvm = va->vm;
 
-	if (unlikely(va == &mgr->kernel_alloc_node)) {
+	if (unlikely(va == &gpuvm->kernel_alloc_node)) {
 		WARN(1, "Can't destroy kernel reserved node.\n");
 		return;
 	}
@@ -853,37 +853,37 @@ EXPORT_SYMBOL_GPL(drm_gpuva_unlink);
 
 /**
  * drm_gpuva_find_first() - find the first &drm_gpuva in the given range
- * @mgr: the &drm_gpuva_manager to search in
+ * @gpuvm: the &drm_gpuvm to search in
  * @addr: the &drm_gpuvas address
  * @range: the &drm_gpuvas range
  *
  * Returns: the first &drm_gpuva within the given range
  */
 struct drm_gpuva *
-drm_gpuva_find_first(struct drm_gpuva_manager *mgr,
+drm_gpuva_find_first(struct drm_gpuvm *gpuvm,
 		     u64 addr, u64 range)
 {
 	u64 last = addr + range - 1;
 
-	return drm_gpuva_it_iter_first(&mgr->rb.tree, addr, last);
+	return drm_gpuva_it_iter_first(&gpuvm->rb.tree, addr, last);
 }
 EXPORT_SYMBOL_GPL(drm_gpuva_find_first);
 
 /**
  * drm_gpuva_find() - find a &drm_gpuva
- * @mgr: the &drm_gpuva_manager to search in
+ * @gpuvm: the &drm_gpuvm to search in
  * @addr: the &drm_gpuvas address
  * @range: the &drm_gpuvas range
  *
  * Returns: the &drm_gpuva at a given &addr and with a given &range
  */
 struct drm_gpuva *
-drm_gpuva_find(struct drm_gpuva_manager *mgr,
+drm_gpuva_find(struct drm_gpuvm *gpuvm,
 	       u64 addr, u64 range)
 {
 	struct drm_gpuva *va;
 
-	va = drm_gpuva_find_first(mgr, addr, range);
+	va = drm_gpuva_find_first(gpuvm, addr, range);
 	if (!va)
 		goto out;
 
@@ -900,7 +900,7 @@ EXPORT_SYMBOL_GPL(drm_gpuva_find);
 
 /**
  * drm_gpuva_find_prev() - find the &drm_gpuva before the given address
- * @mgr: the &drm_gpuva_manager to search in
+ * @gpuvm: the &drm_gpuvm to search in
  * @start: the given GPU VA's start address
  *
  * Find the adjacent &drm_gpuva before the GPU VA with given &start address.
@@ -911,18 +911,18 @@ EXPORT_SYMBOL_GPL(drm_gpuva_find);
  * Returns: a pointer to the found &drm_gpuva or NULL if none was found
  */
 struct drm_gpuva *
-drm_gpuva_find_prev(struct drm_gpuva_manager *mgr, u64 start)
+drm_gpuva_find_prev(struct drm_gpuvm *gpuvm, u64 start)
 {
-	if (!drm_gpuva_range_valid(mgr, start - 1, 1))
+	if (!drm_gpuva_range_valid(gpuvm, start - 1, 1))
 		return NULL;
 
-	return drm_gpuva_it_iter_first(&mgr->rb.tree, start - 1, start);
+	return drm_gpuva_it_iter_first(&gpuvm->rb.tree, start - 1, start);
 }
 EXPORT_SYMBOL_GPL(drm_gpuva_find_prev);
 
 /**
  * drm_gpuva_find_next() - find the &drm_gpuva after the given address
- * @mgr: the &drm_gpuva_manager to search in
+ * @gpuvm: the &drm_gpuvm to search in
  * @end: the given GPU VA's end address
  *
  * Find the adjacent &drm_gpuva after the GPU VA with given &end address.
@@ -933,47 +933,47 @@ EXPORT_SYMBOL_GPL(drm_gpuva_find_prev);
  * Returns: a pointer to the found &drm_gpuva or NULL if none was found
  */
 struct drm_gpuva *
-drm_gpuva_find_next(struct drm_gpuva_manager *mgr, u64 end)
+drm_gpuva_find_next(struct drm_gpuvm *gpuvm, u64 end)
 {
-	if (!drm_gpuva_range_valid(mgr, end, 1))
+	if (!drm_gpuva_range_valid(gpuvm, end, 1))
 		return NULL;
 
-	return drm_gpuva_it_iter_first(&mgr->rb.tree, end, end + 1);
+	return drm_gpuva_it_iter_first(&gpuvm->rb.tree, end, end + 1);
 }
 EXPORT_SYMBOL_GPL(drm_gpuva_find_next);
 
 /**
  * drm_gpuva_interval_empty() - indicate whether a given interval of the VA space
  * is empty
- * @mgr: the &drm_gpuva_manager to check the range for
+ * @gpuvm: the &drm_gpuvm to check the range for
  * @addr: the start address of the range
  * @range: the range of the interval
  *
  * Returns: true if the interval is empty, false otherwise
  */
 bool
-drm_gpuva_interval_empty(struct drm_gpuva_manager *mgr, u64 addr, u64 range)
+drm_gpuva_interval_empty(struct drm_gpuvm *gpuvm, u64 addr, u64 range)
 {
-	return !drm_gpuva_find_first(mgr, addr, range);
+	return !drm_gpuva_find_first(gpuvm, addr, range);
 }
 EXPORT_SYMBOL_GPL(drm_gpuva_interval_empty);
 
 /**
  * drm_gpuva_map() - helper to insert a &drm_gpuva according to a
  * &drm_gpuva_op_map
- * @mgr: the &drm_gpuva_manager
+ * @gpuvm: the &drm_gpuvm
  * @va: the &drm_gpuva to insert
  * @op: the &drm_gpuva_op_map to initialize @va with
  *
- * Initializes the @va from the @op and inserts it into the given @mgr.
+ * Initializes the @va from the @op and inserts it into the given @gpuvm.
  */
 void
-drm_gpuva_map(struct drm_gpuva_manager *mgr,
+drm_gpuva_map(struct drm_gpuvm *gpuvm,
 	      struct drm_gpuva *va,
 	      struct drm_gpuva_op_map *op)
 {
 	drm_gpuva_init_from_op(va, op);
-	drm_gpuva_insert(mgr, va);
+	drm_gpuva_insert(gpuvm, va);
 }
 EXPORT_SYMBOL_GPL(drm_gpuva_map);
 
@@ -993,18 +993,18 @@ drm_gpuva_remap(struct drm_gpuva *prev,
 		struct drm_gpuva_op_remap *op)
 {
 	struct drm_gpuva *curr = op->unmap->va;
-	struct drm_gpuva_manager *mgr = curr->mgr;
+	struct drm_gpuvm *gpuvm = curr->vm;
 
 	drm_gpuva_remove(curr);
 
 	if (op->prev) {
 		drm_gpuva_init_from_op(prev, op->prev);
-		drm_gpuva_insert(mgr, prev);
+		drm_gpuva_insert(gpuvm, prev);
 	}
 
 	if (op->next) {
 		drm_gpuva_init_from_op(next, op->next);
-		drm_gpuva_insert(mgr, next);
+		drm_gpuva_insert(gpuvm, next);
 	}
 }
 EXPORT_SYMBOL_GPL(drm_gpuva_remap);
@@ -1024,7 +1024,7 @@ drm_gpuva_unmap(struct drm_gpuva_op_unmap *op)
 EXPORT_SYMBOL_GPL(drm_gpuva_unmap);
 
 static int
-op_map_cb(const struct drm_gpuva_fn_ops *fn, void *priv,
+op_map_cb(const struct drm_gpuvm_ops *fn, void *priv,
 	  u64 addr, u64 range,
 	  struct drm_gem_object *obj, u64 offset)
 {
@@ -1040,7 +1040,7 @@ op_map_cb(const struct drm_gpuva_fn_ops *fn, void *priv,
 }
 
 static int
-op_remap_cb(const struct drm_gpuva_fn_ops *fn, void *priv,
+op_remap_cb(const struct drm_gpuvm_ops *fn, void *priv,
 	    struct drm_gpuva_op_map *prev,
 	    struct drm_gpuva_op_map *next,
 	    struct drm_gpuva_op_unmap *unmap)
@@ -1058,7 +1058,7 @@ op_remap_cb(const struct drm_gpuva_fn_ops *fn, void *priv,
 }
 
 static int
-op_unmap_cb(const struct drm_gpuva_fn_ops *fn, void *priv,
+op_unmap_cb(const struct drm_gpuvm_ops *fn, void *priv,
 	    struct drm_gpuva *va, bool merge)
 {
 	struct drm_gpuva_op op = {};
@@ -1071,8 +1071,8 @@ op_unmap_cb(const struct drm_gpuva_fn_ops *fn, void *priv,
 }
 
 static int
-__drm_gpuva_sm_map(struct drm_gpuva_manager *mgr,
-		   const struct drm_gpuva_fn_ops *ops, void *priv,
+__drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm,
+		   const struct drm_gpuvm_ops *ops, void *priv,
 		   u64 req_addr, u64 req_range,
 		   struct drm_gem_object *req_obj, u64 req_offset)
 {
@@ -1080,10 +1080,10 @@ __drm_gpuva_sm_map(struct drm_gpuva_manager *mgr,
 	u64 req_end = req_addr + req_range;
 	int ret;
 
-	if (unlikely(!drm_gpuva_range_valid(mgr, req_addr, req_range)))
+	if (unlikely(!drm_gpuva_range_valid(gpuvm, req_addr, req_range)))
 		return -EINVAL;
 
-	drm_gpuva_for_each_va_range_safe(va, next, mgr, req_addr, req_end) {
+	drm_gpuvm_for_each_va_range_safe(va, next, gpuvm, req_addr, req_end) {
 		struct drm_gem_object *obj = va->gem.obj;
 		u64 offset = va->gem.offset;
 		u64 addr = va->va.addr;
@@ -1215,18 +1215,18 @@ __drm_gpuva_sm_map(struct drm_gpuva_manager *mgr,
 }
 
 static int
-__drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr,
-		     const struct drm_gpuva_fn_ops *ops, void *priv,
+__drm_gpuvm_sm_unmap(struct drm_gpuvm *gpuvm,
+		     const struct drm_gpuvm_ops *ops, void *priv,
 		     u64 req_addr, u64 req_range)
 {
 	struct drm_gpuva *va, *next;
 	u64 req_end = req_addr + req_range;
 	int ret;
 
-	if (unlikely(!drm_gpuva_range_valid(mgr, req_addr, req_range)))
+	if (unlikely(!drm_gpuva_range_valid(gpuvm, req_addr, req_range)))
 		return -EINVAL;
 
-	drm_gpuva_for_each_va_range_safe(va, next, mgr, req_addr, req_end) {
+	drm_gpuvm_for_each_va_range_safe(va, next, gpuvm, req_addr, req_end) {
 		struct drm_gpuva_op_map prev = {}, next = {};
 		bool prev_split = false, next_split = false;
 		struct drm_gem_object *obj = va->gem.obj;
@@ -1273,8 +1273,8 @@ __drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr,
 }
 
 /**
- * drm_gpuva_sm_map() - creates the &drm_gpuva_op split/merge steps
- * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * drm_gpuvm_sm_map() - creates the &drm_gpuva_op split/merge steps
+ * @gpuvm: the &drm_gpuvm representing the GPU VA space
  * @req_addr: the start address of the new mapping
  * @req_range: the range of the new mapping
  * @req_obj: the &drm_gem_object to map
@@ -1282,15 +1282,15 @@ __drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr,
  * @priv: pointer to a driver private data structure
  *
  * This function iterates the given range of the GPU VA space. It utilizes the
- * &drm_gpuva_fn_ops to call back into the driver providing the split and merge
+ * &drm_gpuvm_ops to call back into the driver providing the split and merge
  * steps.
  *
  * Drivers may use these callbacks to update the GPU VA space right away within
  * the callback. In case the driver decides to copy and store the operations for
- * later processing neither this function nor &drm_gpuva_sm_unmap is allowed to
- * be called before the &drm_gpuva_manager's view of the GPU VA space was
+ * later processing neither this function nor &drm_gpuvm_sm_unmap is allowed to
+ * be called before the &drm_gpuvm's view of the GPU VA space was
  * updated with the previous set of operations. To update the
- * &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
+ * &drm_gpuvm's view of the GPU VA space drm_gpuva_insert(),
  * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
  * used.
  *
@@ -1305,39 +1305,39 @@ __drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr,
  * Returns: 0 on success or a negative error code
  */
 int
-drm_gpuva_sm_map(struct drm_gpuva_manager *mgr, void *priv,
+drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
 		 u64 req_addr, u64 req_range,
 		 struct drm_gem_object *req_obj, u64 req_offset)
 {
-	const struct drm_gpuva_fn_ops *ops = mgr->ops;
+	const struct drm_gpuvm_ops *ops = gpuvm->ops;
 
 	if (unlikely(!(ops && ops->sm_step_map &&
 		       ops->sm_step_remap &&
 		       ops->sm_step_unmap)))
 		return -EINVAL;
 
-	return __drm_gpuva_sm_map(mgr, ops, priv,
+	return __drm_gpuvm_sm_map(gpuvm, ops, priv,
 				  req_addr, req_range,
 				  req_obj, req_offset);
 }
-EXPORT_SYMBOL_GPL(drm_gpuva_sm_map);
+EXPORT_SYMBOL_GPL(drm_gpuvm_sm_map);
 
 /**
- * drm_gpuva_sm_unmap() - creates the &drm_gpuva_ops to split on unmap
- * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * drm_gpuvm_sm_unmap() - creates the &drm_gpuva_ops to split on unmap
+ * @gpuvm: the &drm_gpuvm representing the GPU VA space
  * @priv: pointer to a driver private data structure
  * @req_addr: the start address of the range to unmap
  * @req_range: the range of the mappings to unmap
  *
  * This function iterates the given range of the GPU VA space. It utilizes the
- * &drm_gpuva_fn_ops to call back into the driver providing the operations to
+ * &drm_gpuvm_ops to call back into the driver providing the operations to
  * unmap and, if required, split existent mappings.
  *
  * Drivers may use these callbacks to update the GPU VA space right away within
  * the callback. In case the driver decides to copy and store the operations for
- * later processing neither this function nor &drm_gpuva_sm_map is allowed to be
- * called before the &drm_gpuva_manager's view of the GPU VA space was updated
- * with the previous set of operations. To update the &drm_gpuva_manager's view
+ * later processing neither this function nor &drm_gpuvm_sm_map is allowed to be
+ * called before the &drm_gpuvm's view of the GPU VA space was updated
+ * with the previous set of operations. To update the &drm_gpuvm's view
  * of the GPU VA space drm_gpuva_insert(), drm_gpuva_destroy_locked() and/or
  * drm_gpuva_destroy_unlocked() should be used.
  *
@@ -1350,24 +1350,24 @@ EXPORT_SYMBOL_GPL(drm_gpuva_sm_map);
  * Returns: 0 on success or a negative error code
  */
 int
-drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr, void *priv,
+drm_gpuvm_sm_unmap(struct drm_gpuvm *gpuvm, void *priv,
 		   u64 req_addr, u64 req_range)
 {
-	const struct drm_gpuva_fn_ops *ops = mgr->ops;
+	const struct drm_gpuvm_ops *ops = gpuvm->ops;
 
 	if (unlikely(!(ops && ops->sm_step_remap &&
 		       ops->sm_step_unmap)))
 		return -EINVAL;
 
-	return __drm_gpuva_sm_unmap(mgr, ops, priv,
+	return __drm_gpuvm_sm_unmap(gpuvm, ops, priv,
 				    req_addr, req_range);
 }
-EXPORT_SYMBOL_GPL(drm_gpuva_sm_unmap);
+EXPORT_SYMBOL_GPL(drm_gpuvm_sm_unmap);
 
 static struct drm_gpuva_op *
-gpuva_op_alloc(struct drm_gpuva_manager *mgr)
+gpuva_op_alloc(struct drm_gpuvm *gpuvm)
 {
-	const struct drm_gpuva_fn_ops *fn = mgr->ops;
+	const struct drm_gpuvm_ops *fn = gpuvm->ops;
 	struct drm_gpuva_op *op;
 
 	if (fn && fn->op_alloc)
@@ -1382,10 +1382,10 @@ gpuva_op_alloc(struct drm_gpuva_manager *mgr)
 }
 
 static void
-gpuva_op_free(struct drm_gpuva_manager *mgr,
+gpuva_op_free(struct drm_gpuvm *gpuvm,
 	      struct drm_gpuva_op *op)
 {
-	const struct drm_gpuva_fn_ops *fn = mgr->ops;
+	const struct drm_gpuvm_ops *fn = gpuvm->ops;
 
 	if (fn && fn->op_free)
 		fn->op_free(op);
@@ -1398,14 +1398,14 @@ drm_gpuva_sm_step(struct drm_gpuva_op *__op,
 		  void *priv)
 {
 	struct {
-		struct drm_gpuva_manager *mgr;
+		struct drm_gpuvm *vm;
 		struct drm_gpuva_ops *ops;
 	} *args = priv;
-	struct drm_gpuva_manager *mgr = args->mgr;
+	struct drm_gpuvm *gpuvm = args->vm;
 	struct drm_gpuva_ops *ops = args->ops;
 	struct drm_gpuva_op *op;
 
-	op = gpuva_op_alloc(mgr);
+	op = gpuva_op_alloc(gpuvm);
 	if (unlikely(!op))
 		goto err;
 
@@ -1444,20 +1444,20 @@ drm_gpuva_sm_step(struct drm_gpuva_op *__op,
 err_free_prev:
 	kfree(op->remap.prev);
 err_free_op:
-	gpuva_op_free(mgr, op);
+	gpuva_op_free(gpuvm, op);
 err:
 	return -ENOMEM;
 }
 
-static const struct drm_gpuva_fn_ops gpuva_list_ops = {
+static const struct drm_gpuvm_ops gpuvm_list_ops = {
 	.sm_step_map = drm_gpuva_sm_step,
 	.sm_step_remap = drm_gpuva_sm_step,
 	.sm_step_unmap = drm_gpuva_sm_step,
 };
 
 /**
- * drm_gpuva_sm_map_ops_create() - creates the &drm_gpuva_ops to split and merge
- * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * drm_gpuvm_sm_map_ops_create() - creates the &drm_gpuva_ops to split and merge
+ * @gpuvm: the &drm_gpuvm representing the GPU VA space
  * @req_addr: the start address of the new mapping
  * @req_range: the range of the new mapping
  * @req_obj: the &drm_gem_object to map
@@ -1476,9 +1476,9 @@ static const struct drm_gpuva_fn_ops gpuva_list_ops = {
  * map operation requested by the caller.
  *
  * Note that before calling this function again with another mapping request it
- * is necessary to update the &drm_gpuva_manager's view of the GPU VA space. The
+ * is necessary to update the &drm_gpuvm's view of the GPU VA space. The
  * previously obtained operations must be either processed or abandoned. To
- * update the &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
+ * update the &drm_gpuvm's view of the GPU VA space drm_gpuva_insert(),
  * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
  * used.
  *
@@ -1488,13 +1488,13 @@ static const struct drm_gpuva_fn_ops gpuva_list_ops = {
  * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
  */
 struct drm_gpuva_ops *
-drm_gpuva_sm_map_ops_create(struct drm_gpuva_manager *mgr,
+drm_gpuvm_sm_map_ops_create(struct drm_gpuvm *gpuvm,
 			    u64 req_addr, u64 req_range,
 			    struct drm_gem_object *req_obj, u64 req_offset)
 {
 	struct drm_gpuva_ops *ops;
 	struct {
-		struct drm_gpuva_manager *mgr;
+		struct drm_gpuvm *vm;
 		struct drm_gpuva_ops *ops;
 	} args;
 	int ret;
@@ -1505,10 +1505,10 @@ drm_gpuva_sm_map_ops_create(struct drm_gpuva_manager *mgr,
 
 	INIT_LIST_HEAD(&ops->list);
 
-	args.mgr = mgr;
+	args.vm = gpuvm;
 	args.ops = ops;
 
-	ret = __drm_gpuva_sm_map(mgr, &gpuva_list_ops, &args,
+	ret = __drm_gpuvm_sm_map(gpuvm, &gpuvm_list_ops, &args,
 				 req_addr, req_range,
 				 req_obj, req_offset);
 	if (ret)
@@ -1517,15 +1517,15 @@ drm_gpuva_sm_map_ops_create(struct drm_gpuva_manager *mgr,
 	return ops;
 
 err_free_ops:
-	drm_gpuva_ops_free(mgr, ops);
+	drm_gpuva_ops_free(gpuvm, ops);
 	return ERR_PTR(ret);
 }
-EXPORT_SYMBOL_GPL(drm_gpuva_sm_map_ops_create);
+EXPORT_SYMBOL_GPL(drm_gpuvm_sm_map_ops_create);
 
 /**
- * drm_gpuva_sm_unmap_ops_create() - creates the &drm_gpuva_ops to split on
+ * drm_gpuvm_sm_unmap_ops_create() - creates the &drm_gpuva_ops to split on
  * unmap
- * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * @gpuvm: the &drm_gpuvm representing the GPU VA space
  * @req_addr: the start address of the range to unmap
  * @req_range: the range of the mappings to unmap
  *
@@ -1540,9 +1540,9 @@ EXPORT_SYMBOL_GPL(drm_gpuva_sm_map_ops_create);
  * remap operations.
  *
  * Note that before calling this function again with another range to unmap it
- * is necessary to update the &drm_gpuva_manager's view of the GPU VA space. The
+ * is necessary to update the &drm_gpuvm's view of the GPU VA space. The
  * previously obtained operations must be processed or abandoned. To update the
- * &drm_gpuva_manager's view of the GPU VA space drm_gpuva_insert(),
+ * &drm_gpuvm's view of the GPU VA space drm_gpuva_insert(),
  * drm_gpuva_destroy_locked() and/or drm_gpuva_destroy_unlocked() should be
  * used.
  *
@@ -1552,12 +1552,12 @@ EXPORT_SYMBOL_GPL(drm_gpuva_sm_map_ops_create);
  * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
  */
 struct drm_gpuva_ops *
-drm_gpuva_sm_unmap_ops_create(struct drm_gpuva_manager *mgr,
+drm_gpuvm_sm_unmap_ops_create(struct drm_gpuvm *gpuvm,
 			      u64 req_addr, u64 req_range)
 {
 	struct drm_gpuva_ops *ops;
 	struct {
-		struct drm_gpuva_manager *mgr;
+		struct drm_gpuvm *vm;
 		struct drm_gpuva_ops *ops;
 	} args;
 	int ret;
@@ -1568,10 +1568,10 @@ drm_gpuva_sm_unmap_ops_create(struct drm_gpuva_manager *mgr,
 
 	INIT_LIST_HEAD(&ops->list);
 
-	args.mgr = mgr;
+	args.vm = gpuvm;
 	args.ops = ops;
 
-	ret = __drm_gpuva_sm_unmap(mgr, &gpuva_list_ops, &args,
+	ret = __drm_gpuvm_sm_unmap(gpuvm, &gpuvm_list_ops, &args,
 				   req_addr, req_range);
 	if (ret)
 		goto err_free_ops;
@@ -1579,14 +1579,14 @@ drm_gpuva_sm_unmap_ops_create(struct drm_gpuva_manager *mgr,
 	return ops;
 
 err_free_ops:
-	drm_gpuva_ops_free(mgr, ops);
+	drm_gpuva_ops_free(gpuvm, ops);
 	return ERR_PTR(ret);
 }
-EXPORT_SYMBOL_GPL(drm_gpuva_sm_unmap_ops_create);
+EXPORT_SYMBOL_GPL(drm_gpuvm_sm_unmap_ops_create);
 
 /**
- * drm_gpuva_prefetch_ops_create() - creates the &drm_gpuva_ops to prefetch
- * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * drm_gpuvm_prefetch_ops_create() - creates the &drm_gpuva_ops to prefetch
+ * @gpuvm: the &drm_gpuvm representing the GPU VA space
  * @addr: the start address of the range to prefetch
  * @range: the range of the mappings to prefetch
  *
@@ -1603,7 +1603,7 @@ EXPORT_SYMBOL_GPL(drm_gpuva_sm_unmap_ops_create);
  * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
  */
 struct drm_gpuva_ops *
-drm_gpuva_prefetch_ops_create(struct drm_gpuva_manager *mgr,
+drm_gpuvm_prefetch_ops_create(struct drm_gpuvm *gpuvm,
 			      u64 addr, u64 range)
 {
 	struct drm_gpuva_ops *ops;
@@ -1618,8 +1618,8 @@ drm_gpuva_prefetch_ops_create(struct drm_gpuva_manager *mgr,
 
 	INIT_LIST_HEAD(&ops->list);
 
-	drm_gpuva_for_each_va_range(va, mgr, addr, end) {
-		op = gpuva_op_alloc(mgr);
+	drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
+		op = gpuva_op_alloc(gpuvm);
 		if (!op) {
 			ret = -ENOMEM;
 			goto err_free_ops;
@@ -1633,14 +1633,14 @@ drm_gpuva_prefetch_ops_create(struct drm_gpuva_manager *mgr,
 	return ops;
 
 err_free_ops:
-	drm_gpuva_ops_free(mgr, ops);
+	drm_gpuva_ops_free(gpuvm, ops);
 	return ERR_PTR(ret);
 }
-EXPORT_SYMBOL_GPL(drm_gpuva_prefetch_ops_create);
+EXPORT_SYMBOL_GPL(drm_gpuvm_prefetch_ops_create);
 
 /**
- * drm_gpuva_gem_unmap_ops_create() - creates the &drm_gpuva_ops to unmap a GEM
- * @mgr: the &drm_gpuva_manager representing the GPU VA space
+ * drm_gpuvm_gem_unmap_ops_create() - creates the &drm_gpuva_ops to unmap a GEM
+ * @gpuvm: the &drm_gpuvm representing the GPU VA space
  * @obj: the &drm_gem_object to unmap
  *
  * This function creates a list of operations to perform unmapping for every
@@ -1658,7 +1658,7 @@ EXPORT_SYMBOL_GPL(drm_gpuva_prefetch_ops_create);
  * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
  */
 struct drm_gpuva_ops *
-drm_gpuva_gem_unmap_ops_create(struct drm_gpuva_manager *mgr,
+drm_gpuvm_gem_unmap_ops_create(struct drm_gpuvm *gpuvm,
 			       struct drm_gem_object *obj)
 {
 	struct drm_gpuva_ops *ops;
@@ -1675,7 +1675,7 @@ drm_gpuva_gem_unmap_ops_create(struct drm_gpuva_manager *mgr,
 	INIT_LIST_HEAD(&ops->list);
 
 	drm_gem_for_each_gpuva(va, obj) {
-		op = gpuva_op_alloc(mgr);
+		op = gpuva_op_alloc(gpuvm);
 		if (!op) {
 			ret = -ENOMEM;
 			goto err_free_ops;
@@ -1689,21 +1689,21 @@ drm_gpuva_gem_unmap_ops_create(struct drm_gpuva_manager *mgr,
 	return ops;
 
 err_free_ops:
-	drm_gpuva_ops_free(mgr, ops);
+	drm_gpuva_ops_free(gpuvm, ops);
 	return ERR_PTR(ret);
 }
-EXPORT_SYMBOL_GPL(drm_gpuva_gem_unmap_ops_create);
+EXPORT_SYMBOL_GPL(drm_gpuvm_gem_unmap_ops_create);
 
 /**
  * drm_gpuva_ops_free() - free the given &drm_gpuva_ops
- * @mgr: the &drm_gpuva_manager the ops were created for
+ * @gpuvm: the &drm_gpuvm the ops were created for
  * @ops: the &drm_gpuva_ops to free
  *
  * Frees the given &drm_gpuva_ops structure including all the ops associated
  * with it.
  */
 void
-drm_gpuva_ops_free(struct drm_gpuva_manager *mgr,
+drm_gpuva_ops_free(struct drm_gpuvm *gpuvm,
 		   struct drm_gpuva_ops *ops)
 {
 	struct drm_gpuva_op *op, *next;
@@ -1717,7 +1717,7 @@ drm_gpuva_ops_free(struct drm_gpuva_manager *mgr,
 			kfree(op->remap.unmap);
 		}
 
-		gpuva_op_free(mgr, op);
+		gpuva_op_free(gpuvm, op);
 	}
 
 	kfree(ops);
diff --git a/drivers/gpu/drm/nouveau/nouveau_exec.c b/drivers/gpu/drm/nouveau/nouveau_exec.c
index a90c4cd8cbb2..c001952cd678 100644
--- a/drivers/gpu/drm/nouveau/nouveau_exec.c
+++ b/drivers/gpu/drm/nouveau/nouveau_exec.c
@@ -106,7 +106,7 @@ nouveau_exec_job_submit(struct nouveau_job *job)
 	drm_exec_until_all_locked(exec) {
 		struct drm_gpuva *va;
 
-		drm_gpuva_for_each_va(va, &uvmm->umgr) {
+		drm_gpuvm_for_each_va(va, &uvmm->umgr) {
 			if (unlikely(va == &uvmm->umgr.kernel_alloc_node))
 				continue;
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
index aae780e4a4aa..c750072cb268 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -444,7 +444,7 @@ op_map_prepare_unwind(struct nouveau_uvma *uvma)
 static void
 op_unmap_prepare_unwind(struct drm_gpuva *va)
 {
-	drm_gpuva_insert(va->mgr, va);
+	drm_gpuva_insert(va->vm, va);
 }
 
 static void
@@ -1194,7 +1194,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 				goto unwind_continue;
 			}
 
-			op->ops = drm_gpuva_sm_unmap_ops_create(&uvmm->umgr,
+			op->ops = drm_gpuvm_sm_unmap_ops_create(&uvmm->umgr,
 								op->va.addr,
 								op->va.range);
 			if (IS_ERR(op->ops)) {
@@ -1240,7 +1240,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 				}
 			}
 
-			op->ops = drm_gpuva_sm_map_ops_create(&uvmm->umgr,
+			op->ops = drm_gpuvm_sm_map_ops_create(&uvmm->umgr,
 							      op->va.addr,
 							      op->va.range,
 							      op->gem.obj,
@@ -1264,7 +1264,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 			break;
 		}
 		case OP_UNMAP:
-			op->ops = drm_gpuva_sm_unmap_ops_create(&uvmm->umgr,
+			op->ops = drm_gpuvm_sm_unmap_ops_create(&uvmm->umgr,
 								op->va.addr,
 								op->va.range);
 			if (IS_ERR(op->ops)) {
@@ -1836,11 +1836,11 @@ nouveau_uvmm_init(struct nouveau_uvmm *uvmm, struct nouveau_cli *cli,
 	uvmm->kernel_managed_addr = kernel_managed_addr;
 	uvmm->kernel_managed_size = kernel_managed_size;
 
-	drm_gpuva_manager_init(&uvmm->umgr, cli->name,
-			       NOUVEAU_VA_SPACE_START,
-			       NOUVEAU_VA_SPACE_END,
-			       kernel_managed_addr, kernel_managed_size,
-			       NULL);
+	drm_gpuvm_init(&uvmm->umgr, cli->name,
+		       NOUVEAU_VA_SPACE_START,
+		       NOUVEAU_VA_SPACE_END,
+		       kernel_managed_addr, kernel_managed_size,
+		       NULL);
 
 	ret = nvif_vmm_ctor(&cli->mmu, "uvmm",
 			    cli->vmm.vmm.object.oclass, RAW,
@@ -1855,7 +1855,7 @@ nouveau_uvmm_init(struct nouveau_uvmm *uvmm, struct nouveau_cli *cli,
 	return 0;
 
 out_free_gpuva_mgr:
-	drm_gpuva_manager_destroy(&uvmm->umgr);
+	drm_gpuvm_destroy(&uvmm->umgr);
 out_unlock:
 	mutex_unlock(&cli->mutex);
 	return ret;
@@ -1877,7 +1877,7 @@ nouveau_uvmm_fini(struct nouveau_uvmm *uvmm)
 	wait_event(entity->job.wq, list_empty(&entity->job.list.head));
 
 	nouveau_uvmm_lock(uvmm);
-	drm_gpuva_for_each_va_safe(va, next, &uvmm->umgr) {
+	drm_gpuvm_for_each_va_safe(va, next, &uvmm->umgr) {
 		struct nouveau_uvma *uvma = uvma_from_va(va);
 		struct drm_gem_object *obj = va->gem.obj;
 
@@ -1910,7 +1910,7 @@ nouveau_uvmm_fini(struct nouveau_uvmm *uvmm)
 
 	mutex_lock(&cli->mutex);
 	nouveau_vmm_fini(&uvmm->vmm);
-	drm_gpuva_manager_destroy(&uvmm->umgr);
+	drm_gpuvm_destroy(&uvmm->umgr);
 	mutex_unlock(&cli->mutex);
 
 	dma_resv_fini(&uvmm->resv);
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.h b/drivers/gpu/drm/nouveau/nouveau_uvmm.h
index fc7f6fd2a4e1..e96c9919d1bd 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.h
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.h
@@ -3,13 +3,13 @@
 #ifndef __NOUVEAU_UVMM_H__
 #define __NOUVEAU_UVMM_H__
 
-#include <drm/drm_gpuva_mgr.h>
+#include <drm/drm_gpuvm.h>
 
 #include "nouveau_drv.h"
 
 struct nouveau_uvmm {
 	struct nouveau_vmm vmm;
-	struct drm_gpuva_manager umgr;
+	struct drm_gpuvm umgr;
 	struct maple_tree region_mt;
 	struct mutex mutex;
 	struct dma_resv resv;
@@ -44,7 +44,7 @@ struct nouveau_uvma {
 #define uvmm_from_mgr(x) container_of((x), struct nouveau_uvmm, umgr)
 #define uvma_from_va(x) container_of((x), struct nouveau_uvma, va)
 
-#define to_uvmm(x) uvmm_from_mgr((x)->va.mgr)
+#define to_uvmm(x) uvmm_from_mgr((x)->va.vm)
 
 struct nouveau_uvmm_bind_job {
 	struct nouveau_job base;
diff --git a/include/drm/drm_debugfs.h b/include/drm/drm_debugfs.h
index 7213ce15e4ff..047a54e46c72 100644
--- a/include/drm/drm_debugfs.h
+++ b/include/drm/drm_debugfs.h
@@ -35,7 +35,7 @@
 #include <linux/types.h>
 #include <linux/seq_file.h>
 
-#include <drm/drm_gpuva_mgr.h>
+#include <drm/drm_gpuvm.h>
 
 /**
  * DRM_DEBUGFS_GPUVA_INFO - &drm_info_list entry to dump a GPU VA space
@@ -152,7 +152,7 @@ void drm_debugfs_add_files(struct drm_device *dev,
 			   const struct drm_debugfs_info *files, int count);
 
 int drm_debugfs_gpuva_info(struct seq_file *m,
-			   struct drm_gpuva_manager *mgr);
+			   struct drm_gpuvm *gpuvm);
 #else
 static inline void drm_debugfs_create_files(const struct drm_info_list *files,
 					    int count, struct dentry *root,
@@ -176,7 +176,7 @@ static inline void drm_debugfs_add_files(struct drm_device *dev,
 {}
 
 static inline int drm_debugfs_gpuva_info(struct seq_file *m,
-					 struct drm_gpuva_manager *mgr)
+					 struct drm_gpuvm *gpuvm)
 {
 	return 0;
 }
diff --git a/include/drm/drm_gpuva_mgr.h b/include/drm/drm_gpuvm.h
similarity index 78%
rename from include/drm/drm_gpuva_mgr.h
rename to include/drm/drm_gpuvm.h
index ed8d50200cc3..0e802676e0a9 100644
--- a/include/drm/drm_gpuva_mgr.h
+++ b/include/drm/drm_gpuvm.h
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 
-#ifndef __DRM_GPUVA_MGR_H__
-#define __DRM_GPUVA_MGR_H__
+#ifndef __DRM_GPUVM_H__
+#define __DRM_GPUVM_H__
 
 /*
  * Copyright (c) 2022 Red Hat.
@@ -31,8 +31,8 @@
 
 #include <drm/drm_gem.h>
 
-struct drm_gpuva_manager;
-struct drm_gpuva_fn_ops;
+struct drm_gpuvm;
+struct drm_gpuvm_ops;
 
 /**
  * enum drm_gpuva_flags - flags for struct drm_gpuva
@@ -62,15 +62,15 @@ enum drm_gpuva_flags {
  * struct drm_gpuva - structure to track a GPU VA mapping
  *
  * This structure represents a GPU VA mapping and is associated with a
- * &drm_gpuva_manager.
+ * &drm_gpuvm.
  *
  * Typically, this structure is embedded in bigger driver structures.
  */
 struct drm_gpuva {
 	/**
-	 * @mgr: the &drm_gpuva_manager this object is associated with
+	 * @vm: the &drm_gpuvm this object is associated with
 	 */
-	struct drm_gpuva_manager *mgr;
+	struct drm_gpuvm *vm;
 
 	/**
 	 * @flags: the &drm_gpuva_flags for this mapping
@@ -137,20 +137,20 @@ struct drm_gpuva {
 	} rb;
 };
 
-int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct drm_gpuva *va);
+int drm_gpuva_insert(struct drm_gpuvm *gpuvm, struct drm_gpuva *va);
 void drm_gpuva_remove(struct drm_gpuva *va);
 
 void drm_gpuva_link(struct drm_gpuva *va);
 void drm_gpuva_unlink(struct drm_gpuva *va);
 
-struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr,
+struct drm_gpuva *drm_gpuva_find(struct drm_gpuvm *gpuvm,
 				 u64 addr, u64 range);
-struct drm_gpuva *drm_gpuva_find_first(struct drm_gpuva_manager *mgr,
+struct drm_gpuva *drm_gpuva_find_first(struct drm_gpuvm *gpuvm,
 				       u64 addr, u64 range);
-struct drm_gpuva *drm_gpuva_find_prev(struct drm_gpuva_manager *mgr, u64 start);
-struct drm_gpuva *drm_gpuva_find_next(struct drm_gpuva_manager *mgr, u64 end);
+struct drm_gpuva *drm_gpuva_find_prev(struct drm_gpuvm *gpuvm, u64 start);
+struct drm_gpuva *drm_gpuva_find_next(struct drm_gpuvm *gpuvm, u64 end);
 
-bool drm_gpuva_interval_empty(struct drm_gpuva_manager *mgr, u64 addr, u64 range);
+bool drm_gpuva_interval_empty(struct drm_gpuvm *gpuvm, u64 addr, u64 range);
 
 static inline void drm_gpuva_init(struct drm_gpuva *va, u64 addr, u64 range,
 				  struct drm_gem_object *obj, u64 offset)
@@ -186,7 +186,7 @@ static inline bool drm_gpuva_invalidated(struct drm_gpuva *va)
 }
 
 /**
- * struct drm_gpuva_manager - DRM GPU VA Manager
+ * struct drm_gpuvm - DRM GPU VA Manager
  *
  * The DRM GPU VA Manager keeps track of a GPU's virtual address space by using
  * &maple_tree structures. Typically, this structure is embedded in bigger
@@ -197,7 +197,7 @@ static inline bool drm_gpuva_invalidated(struct drm_gpuva *va)
  *
  * There should be one manager instance per GPU virtual address space.
  */
-struct drm_gpuva_manager {
+struct drm_gpuvm {
 	/**
 	 * @name: the name of the DRM GPU VA space
 	 */
@@ -237,100 +237,99 @@ struct drm_gpuva_manager {
 	struct drm_gpuva kernel_alloc_node;
 
 	/**
-	 * @ops: &drm_gpuva_fn_ops providing the split/merge steps to drivers
+	 * @ops: &drm_gpuvm_ops providing the split/merge steps to drivers
 	 */
-	const struct drm_gpuva_fn_ops *ops;
+	const struct drm_gpuvm_ops *ops;
 };
 
-void drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
-			    const char *name,
-			    u64 start_offset, u64 range,
-			    u64 reserve_offset, u64 reserve_range,
-			    const struct drm_gpuva_fn_ops *ops);
-void drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr);
+void drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name,
+		    u64 start_offset, u64 range,
+		    u64 reserve_offset, u64 reserve_range,
+		    const struct drm_gpuvm_ops *ops);
+void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
 
 static inline struct drm_gpuva *
 __drm_gpuva_next(struct drm_gpuva *va)
 {
-	if (va && !list_is_last(&va->rb.entry, &va->mgr->rb.list))
+	if (va && !list_is_last(&va->rb.entry, &va->vm->rb.list))
 		return list_next_entry(va, rb.entry);
 
 	return NULL;
 }
 
 /**
- * drm_gpuva_for_each_va_range() - iterate over a range of &drm_gpuvas
+ * drm_gpuvm_for_each_va_range() - iterate over a range of &drm_gpuvas
  * @va__: &drm_gpuva structure to assign to in each iteration step
- * @mgr__: &drm_gpuva_manager to walk over
+ * @gpuvm__: &drm_gpuvm to walk over
  * @start__: starting offset, the first gpuva will overlap this
  * @end__: ending offset, the last gpuva will start before this (but may
  * overlap)
  *
- * This iterator walks over all &drm_gpuvas in the &drm_gpuva_manager that lie
+ * This iterator walks over all &drm_gpuvas in the &drm_gpuvm that lie
  * between @start__ and @end__. It is implemented similarly to list_for_each(),
- * but is using the &drm_gpuva_manager's internal interval tree to accelerate
+ * but is using the &drm_gpuvm's internal interval tree to accelerate
  * the search for the starting &drm_gpuva, and hence isn't safe against removal
  * of elements. It assumes that @end__ is within (or is the upper limit of) the
- * &drm_gpuva_manager. This iterator does not skip over the &drm_gpuva_manager's
+ * &drm_gpuvm. This iterator does not skip over the &drm_gpuvm's
  * @kernel_alloc_node.
  */
-#define drm_gpuva_for_each_va_range(va__, mgr__, start__, end__) \
-	for (va__ = drm_gpuva_find_first((mgr__), (start__), (end__) - (start__)); \
+#define drm_gpuvm_for_each_va_range(va__, gpuvm__, start__, end__) \
+	for (va__ = drm_gpuva_find_first((gpuvm__), (start__), (end__) - (start__)); \
 	     va__ && (va__->va.addr < (end__)); \
 	     va__ = __drm_gpuva_next(va__))
 
 /**
- * drm_gpuva_for_each_va_range_safe() - safely iterate over a range of
+ * drm_gpuvm_for_each_va_range_safe() - safely iterate over a range of
  * &drm_gpuvas
  * @va__: &drm_gpuva to assign to in each iteration step
  * @next__: another &drm_gpuva to use as temporary storage
- * @mgr__: &drm_gpuva_manager to walk over
+ * @gpuvm__: &drm_gpuvm to walk over
  * @start__: starting offset, the first gpuva will overlap this
  * @end__: ending offset, the last gpuva will start before this (but may
  * overlap)
  *
- * This iterator walks over all &drm_gpuvas in the &drm_gpuva_manager that lie
+ * This iterator walks over all &drm_gpuvas in the &drm_gpuvm that lie
  * between @start__ and @end__. It is implemented similarly to
- * list_for_each_safe(), but is using the &drm_gpuva_manager's internal interval
+ * list_for_each_safe(), but is using the &drm_gpuvm's internal interval
  * tree to accelerate the search for the starting &drm_gpuva, and hence is safe
  * against removal of elements. It assumes that @end__ is within (or is the
- * upper limit of) the &drm_gpuva_manager. This iterator does not skip over the
- * &drm_gpuva_manager's @kernel_alloc_node.
+ * upper limit of) the &drm_gpuvm. This iterator does not skip over the
+ * &drm_gpuvm's @kernel_alloc_node.
  */
-#define drm_gpuva_for_each_va_range_safe(va__, next__, mgr__, start__, end__) \
-	for (va__ = drm_gpuva_find_first((mgr__), (start__), (end__) - (start__)), \
+#define drm_gpuvm_for_each_va_range_safe(va__, next__, gpuvm__, start__, end__) \
+	for (va__ = drm_gpuva_find_first((gpuvm__), (start__), (end__) - (start__)), \
 	     next__ = __drm_gpuva_next(va__); \
 	     va__ && (va__->va.addr < (end__)); \
 	     va__ = next__, next__ = __drm_gpuva_next(va__))
 
 /**
- * drm_gpuva_for_each_va() - iterate over all &drm_gpuvas
+ * drm_gpuvm_for_each_va() - iterate over all &drm_gpuvas
  * @va__: &drm_gpuva to assign to in each iteration step
- * @mgr__: &drm_gpuva_manager to walk over
+ * @gpuvm__: &drm_gpuvm to walk over
  *
  * This iterator walks over all &drm_gpuva structures associated with the given
- * &drm_gpuva_manager.
+ * &drm_gpuvm.
  */
-#define drm_gpuva_for_each_va(va__, mgr__) \
-	list_for_each_entry(va__, &(mgr__)->rb.list, rb.entry)
+#define drm_gpuvm_for_each_va(va__, gpuvm__) \
+	list_for_each_entry(va__, &(gpuvm__)->rb.list, rb.entry)
 
 /**
- * drm_gpuva_for_each_va_safe() - safely iterate over all &drm_gpuvas
+ * drm_gpuvm_for_each_va_safe() - safely iterate over all &drm_gpuvas
  * @va__: &drm_gpuva to assign to in each iteration step
  * @next__: another &drm_gpuva to use as temporary storage
- * @mgr__: &drm_gpuva_manager to walk over
+ * @gpuvm__: &drm_gpuvm to walk over
  *
  * This iterator walks over all &drm_gpuva structures associated with the given
- * &drm_gpuva_manager. It is implemented with list_for_each_entry_safe(), and
+ * &drm_gpuvm. It is implemented with list_for_each_entry_safe(), and
  * hence safe against the removal of elements.
  */
-#define drm_gpuva_for_each_va_safe(va__, next__, mgr__) \
-	list_for_each_entry_safe(va__, next__, &(mgr__)->rb.list, rb.entry)
+#define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
+	list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
 
 /**
  * enum drm_gpuva_op_type - GPU VA operation type
  *
- * Operations to alter the GPU VA mappings tracked by the &drm_gpuva_manager.
+ * Operations to alter the GPU VA mappings tracked by the &drm_gpuvm.
  */
 enum drm_gpuva_op_type {
 	/**
@@ -413,7 +412,7 @@ struct drm_gpuva_op_unmap {
 	 *
 	 * Optionally, if &keep is set, drivers may keep the actual page table
 	 * mappings for this &drm_gpuva, adding the missing page table entries
-	 * only and update the &drm_gpuva_manager accordingly.
+	 * only and update the &drm_gpuvm accordingly.
 	 */
 	bool keep;
 };
@@ -584,22 +583,22 @@ struct drm_gpuva_ops {
 #define drm_gpuva_next_op(op) list_next_entry(op, entry)
 
 struct drm_gpuva_ops *
-drm_gpuva_sm_map_ops_create(struct drm_gpuva_manager *mgr,
+drm_gpuvm_sm_map_ops_create(struct drm_gpuvm *gpuvm,
 			    u64 addr, u64 range,
 			    struct drm_gem_object *obj, u64 offset);
 struct drm_gpuva_ops *
-drm_gpuva_sm_unmap_ops_create(struct drm_gpuva_manager *mgr,
+drm_gpuvm_sm_unmap_ops_create(struct drm_gpuvm *gpuvm,
 			      u64 addr, u64 range);
 
 struct drm_gpuva_ops *
-drm_gpuva_prefetch_ops_create(struct drm_gpuva_manager *mgr,
+drm_gpuvm_prefetch_ops_create(struct drm_gpuvm *gpuvm,
 				 u64 addr, u64 range);
 
 struct drm_gpuva_ops *
-drm_gpuva_gem_unmap_ops_create(struct drm_gpuva_manager *mgr,
+drm_gpuvm_gem_unmap_ops_create(struct drm_gpuvm *gpuvm,
 			       struct drm_gem_object *obj);
 
-void drm_gpuva_ops_free(struct drm_gpuva_manager *mgr,
+void drm_gpuva_ops_free(struct drm_gpuvm *gpuvm,
 			struct drm_gpuva_ops *ops);
 
 static inline void drm_gpuva_init_from_op(struct drm_gpuva *va,
@@ -610,15 +609,15 @@ static inline void drm_gpuva_init_from_op(struct drm_gpuva *va,
 }
 
 /**
- * struct drm_gpuva_fn_ops - callbacks for split/merge steps
+ * struct drm_gpuvm_ops - callbacks for split/merge steps
  *
- * This structure defines the callbacks used by &drm_gpuva_sm_map and
- * &drm_gpuva_sm_unmap to provide the split/merge steps for map and unmap
+ * This structure defines the callbacks used by &drm_gpuvm_sm_map and
+ * &drm_gpuvm_sm_unmap to provide the split/merge steps for map and unmap
  * operations to drivers.
  */
-struct drm_gpuva_fn_ops {
+struct drm_gpuvm_ops {
 	/**
-	 * @op_alloc: called when the &drm_gpuva_manager allocates
+	 * @op_alloc: called when the &drm_gpuvm allocates
 	 * a struct drm_gpuva_op
 	 *
 	 * Some drivers may want to embed struct drm_gpuva_op into driver
@@ -630,7 +629,7 @@ struct drm_gpuva_fn_ops {
 	struct drm_gpuva_op *(*op_alloc)(void);
 
 	/**
-	 * @op_free: called when the &drm_gpuva_manager frees a
+	 * @op_free: called when the &drm_gpuvm frees a
 	 * struct drm_gpuva_op
 	 *
 	 * Some drivers may want to embed struct drm_gpuva_op into driver
@@ -642,19 +641,19 @@ struct drm_gpuva_fn_ops {
 	void (*op_free)(struct drm_gpuva_op *op);
 
 	/**
-	 * @sm_step_map: called from &drm_gpuva_sm_map to finally insert the
+	 * @sm_step_map: called from &drm_gpuvm_sm_map to finally insert the
 	 * mapping once all previous steps were completed
 	 *
 	 * The &priv pointer matches the one the driver passed to
-	 * &drm_gpuva_sm_map or &drm_gpuva_sm_unmap, respectively.
+	 * &drm_gpuvm_sm_map or &drm_gpuvm_sm_unmap, respectively.
 	 *
-	 * Can be NULL if &drm_gpuva_sm_map is used.
+	 * Can be NULL if &drm_gpuvm_sm_map is used.
 	 */
 	int (*sm_step_map)(struct drm_gpuva_op *op, void *priv);
 
 	/**
-	 * @sm_step_remap: called from &drm_gpuva_sm_map and
-	 * &drm_gpuva_sm_unmap to split up an existent mapping
+	 * @sm_step_remap: called from &drm_gpuvm_sm_map and
+	 * &drm_gpuvm_sm_unmap to split up an existent mapping
 	 *
 	 * This callback is called when existent mapping needs to be split up.
 	 * This is the case when either a newly requested mapping overlaps or
@@ -662,38 +661,38 @@ struct drm_gpuva_fn_ops {
 	 * mapping is requested.
 	 *
 	 * The &priv pointer matches the one the driver passed to
-	 * &drm_gpuva_sm_map or &drm_gpuva_sm_unmap, respectively.
+	 * &drm_gpuvm_sm_map or &drm_gpuvm_sm_unmap, respectively.
 	 *
-	 * Can be NULL if neither &drm_gpuva_sm_map nor &drm_gpuva_sm_unmap is
+	 * Can be NULL if neither &drm_gpuvm_sm_map nor &drm_gpuvm_sm_unmap is
 	 * used.
 	 */
 	int (*sm_step_remap)(struct drm_gpuva_op *op, void *priv);
 
 	/**
-	 * @sm_step_unmap: called from &drm_gpuva_sm_map and
-	 * &drm_gpuva_sm_unmap to unmap an existent mapping
+	 * @sm_step_unmap: called from &drm_gpuvm_sm_map and
+	 * &drm_gpuvm_sm_unmap to unmap an existent mapping
 	 *
 	 * This callback is called when existent mapping needs to be unmapped.
 	 * This is the case when either a newly requested mapping encloses an
 	 * existent mapping or an unmap of an existent mapping is requested.
 	 *
 	 * The &priv pointer matches the one the driver passed to
-	 * &drm_gpuva_sm_map or &drm_gpuva_sm_unmap, respectively.
+	 * &drm_gpuvm_sm_map or &drm_gpuvm_sm_unmap, respectively.
 	 *
-	 * Can be NULL if neither &drm_gpuva_sm_map nor &drm_gpuva_sm_unmap is
+	 * Can be NULL if neither &drm_gpuvm_sm_map nor &drm_gpuvm_sm_unmap is
 	 * used.
 	 */
 	int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
 };
 
-int drm_gpuva_sm_map(struct drm_gpuva_manager *mgr, void *priv,
+int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
 		     u64 addr, u64 range,
 		     struct drm_gem_object *obj, u64 offset);
 
-int drm_gpuva_sm_unmap(struct drm_gpuva_manager *mgr, void *priv,
+int drm_gpuvm_sm_unmap(struct drm_gpuvm *gpuvm, void *priv,
 		       u64 addr, u64 range);
 
-void drm_gpuva_map(struct drm_gpuva_manager *mgr,
+void drm_gpuva_map(struct drm_gpuvm *gpuvm,
 		   struct drm_gpuva *va,
 		   struct drm_gpuva_op_map *op);
 
@@ -703,4 +702,4 @@ void drm_gpuva_remap(struct drm_gpuva *prev,
 
 void drm_gpuva_unmap(struct drm_gpuva_op_unmap *op);
 
-#endif /* __DRM_GPUVA_MGR_H__ */
+#endif /* __DRM_GPUVM_H__ */
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH drm-misc-next v3 2/7] drm/gpuvm: allow building as module
  2023-09-09 15:31 [PATCH drm-misc-next v3 0/7] [RFC] DRM GPUVA Manager GPU-VM features Danilo Krummrich
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 1/7] drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm Danilo Krummrich
@ 2023-09-09 15:31 ` Danilo Krummrich
  2023-09-11 13:09   ` Christian König
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 3/7] drm/nouveau: uvmm: rename 'umgr' to 'base' Danilo Krummrich
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-09 15:31 UTC (permalink / raw)
  To: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel, Danilo Krummrich

Currently, the DRM GPUVM does not have any core dependencies preventing
a module build.

Also, new features from subsequent patches require helpers (namely
drm_exec) which can be built as a module.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
 drivers/gpu/drm/Kconfig         | 7 +++++++
 drivers/gpu/drm/Makefile        | 2 +-
 drivers/gpu/drm/drm_gpuvm.c     | 3 +++
 drivers/gpu/drm/nouveau/Kconfig | 1 +
 4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index ab9ef1c20349..0f78a03e4e84 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -216,6 +216,13 @@ config DRM_EXEC
 	help
 	  Execution context for command submissions
 
+config DRM_GPUVM
+	tristate
+	depends on DRM && DRM_EXEC
+	help
+	  GPU-VM representation providing helpers to manage a GPU's virtual
+	  address space
+
 config DRM_BUDDY
 	tristate
 	depends on DRM
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 7a84b3cddeab..8e1bde059170 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -45,7 +45,6 @@ drm-y := \
 	drm_vblank.o \
 	drm_vblank_work.o \
 	drm_vma_manager.o \
-	drm_gpuvm.o \
 	drm_writeback.o
 drm-$(CONFIG_DRM_LEGACY) += \
 	drm_agpsupport.o \
@@ -81,6 +80,7 @@ obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += drm_panel_orientation_quirks.o
 #
 #
 obj-$(CONFIG_DRM_EXEC) += drm_exec.o
+obj-$(CONFIG_DRM_GPUVM) += drm_gpuvm.o
 
 obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o
 
diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index de1a69bc4a44..aae086deaa2b 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -1723,3 +1723,6 @@ drm_gpuva_ops_free(struct drm_gpuvm *gpuvm,
 	kfree(ops);
 }
 EXPORT_SYMBOL_GPL(drm_gpuva_ops_free);
+
+MODULE_DESCRIPTION("DRM GPUVM");
+MODULE_LICENSE("GPL");
diff --git a/drivers/gpu/drm/nouveau/Kconfig b/drivers/gpu/drm/nouveau/Kconfig
index c52e8096cca4..1e6aaf95ff7c 100644
--- a/drivers/gpu/drm/nouveau/Kconfig
+++ b/drivers/gpu/drm/nouveau/Kconfig
@@ -11,6 +11,7 @@ config DRM_NOUVEAU
 	select DRM_TTM
 	select DRM_TTM_HELPER
 	select DRM_EXEC
+	select DRM_GPUVM
 	select DRM_SCHED
 	select I2C
 	select I2C_ALGOBIT
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH drm-misc-next v3 3/7] drm/nouveau: uvmm: rename 'umgr' to 'base'
  2023-09-09 15:31 [PATCH drm-misc-next v3 0/7] [RFC] DRM GPUVA Manager GPU-VM features Danilo Krummrich
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 1/7] drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm Danilo Krummrich
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 2/7] drm/gpuvm: allow building as module Danilo Krummrich
@ 2023-09-09 15:31 ` Danilo Krummrich
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 4/7] drm/gpuvm: common dma-resv per struct drm_gpuvm Danilo Krummrich
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-09 15:31 UTC (permalink / raw)
  To: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel, Danilo Krummrich

Rename the embedded struct drm_gpuvm within struct nouveau_uvmm from 'umgr' to 'base'.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
 drivers/gpu/drm/nouveau/nouveau_debugfs.c |  2 +-
 drivers/gpu/drm/nouveau/nouveau_exec.c    |  4 +--
 drivers/gpu/drm/nouveau/nouveau_uvmm.c    | 32 +++++++++++------------
 drivers/gpu/drm/nouveau/nouveau_uvmm.h    |  6 ++---
 4 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_debugfs.c b/drivers/gpu/drm/nouveau/nouveau_debugfs.c
index 053f703f2f68..e83db051e851 100644
--- a/drivers/gpu/drm/nouveau/nouveau_debugfs.c
+++ b/drivers/gpu/drm/nouveau/nouveau_debugfs.c
@@ -231,7 +231,7 @@ nouveau_debugfs_gpuva(struct seq_file *m, void *data)
 			continue;
 
 		nouveau_uvmm_lock(uvmm);
-		drm_debugfs_gpuva_info(m, &uvmm->umgr);
+		drm_debugfs_gpuva_info(m, &uvmm->base);
 		seq_puts(m, "\n");
 		nouveau_debugfs_gpuva_regions(m, uvmm);
 		nouveau_uvmm_unlock(uvmm);
diff --git a/drivers/gpu/drm/nouveau/nouveau_exec.c b/drivers/gpu/drm/nouveau/nouveau_exec.c
index c001952cd678..b4239af29e5a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_exec.c
+++ b/drivers/gpu/drm/nouveau/nouveau_exec.c
@@ -106,8 +106,8 @@ nouveau_exec_job_submit(struct nouveau_job *job)
 	drm_exec_until_all_locked(exec) {
 		struct drm_gpuva *va;
 
-		drm_gpuvm_for_each_va(va, &uvmm->umgr) {
-			if (unlikely(va == &uvmm->umgr.kernel_alloc_node))
+		drm_gpuvm_for_each_va(va, &uvmm->base) {
+			if (unlikely(va == &uvmm->base.kernel_alloc_node))
 				continue;
 
 			ret = drm_exec_prepare_obj(exec, va->gem.obj, 1);
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
index c750072cb268..6c86b64273c3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -329,7 +329,7 @@ nouveau_uvma_region_create(struct nouveau_uvmm *uvmm,
 	struct nouveau_uvma_region *reg;
 	int ret;
 
-	if (!drm_gpuva_interval_empty(&uvmm->umgr, addr, range))
+	if (!drm_gpuva_interval_empty(&uvmm->base, addr, range))
 		return -ENOSPC;
 
 	ret = nouveau_uvma_region_alloc(&reg);
@@ -384,7 +384,7 @@ nouveau_uvma_region_empty(struct nouveau_uvma_region *reg)
 {
 	struct nouveau_uvmm *uvmm = reg->uvmm;
 
-	return drm_gpuva_interval_empty(&uvmm->umgr,
+	return drm_gpuva_interval_empty(&uvmm->base,
 					reg->va.addr,
 					reg->va.range);
 }
@@ -589,7 +589,7 @@ op_map_prepare(struct nouveau_uvmm *uvmm,
 	uvma->region = args->region;
 	uvma->kind = args->kind;
 
-	drm_gpuva_map(&uvmm->umgr, &uvma->va, op);
+	drm_gpuva_map(&uvmm->base, &uvma->va, op);
 
 	/* Keep a reference until this uvma is destroyed. */
 	nouveau_uvma_gem_get(uvma);
@@ -1194,7 +1194,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 				goto unwind_continue;
 			}
 
-			op->ops = drm_gpuvm_sm_unmap_ops_create(&uvmm->umgr,
+			op->ops = drm_gpuvm_sm_unmap_ops_create(&uvmm->base,
 								op->va.addr,
 								op->va.range);
 			if (IS_ERR(op->ops)) {
@@ -1205,7 +1205,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 			ret = nouveau_uvmm_sm_unmap_prepare(uvmm, &op->new,
 							    op->ops);
 			if (ret) {
-				drm_gpuva_ops_free(&uvmm->umgr, op->ops);
+				drm_gpuva_ops_free(&uvmm->base, op->ops);
 				op->ops = NULL;
 				op->reg = NULL;
 				goto unwind_continue;
@@ -1240,7 +1240,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 				}
 			}
 
-			op->ops = drm_gpuvm_sm_map_ops_create(&uvmm->umgr,
+			op->ops = drm_gpuvm_sm_map_ops_create(&uvmm->base,
 							      op->va.addr,
 							      op->va.range,
 							      op->gem.obj,
@@ -1256,7 +1256,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 							  op->va.range,
 							  op->flags & 0xff);
 			if (ret) {
-				drm_gpuva_ops_free(&uvmm->umgr, op->ops);
+				drm_gpuva_ops_free(&uvmm->base, op->ops);
 				op->ops = NULL;
 				goto unwind_continue;
 			}
@@ -1264,7 +1264,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 			break;
 		}
 		case OP_UNMAP:
-			op->ops = drm_gpuvm_sm_unmap_ops_create(&uvmm->umgr,
+			op->ops = drm_gpuvm_sm_unmap_ops_create(&uvmm->base,
 								op->va.addr,
 								op->va.range);
 			if (IS_ERR(op->ops)) {
@@ -1275,7 +1275,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 			ret = nouveau_uvmm_sm_unmap_prepare(uvmm, &op->new,
 							    op->ops);
 			if (ret) {
-				drm_gpuva_ops_free(&uvmm->umgr, op->ops);
+				drm_gpuva_ops_free(&uvmm->base, op->ops);
 				op->ops = NULL;
 				goto unwind_continue;
 			}
@@ -1404,7 +1404,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 			break;
 		}
 
-		drm_gpuva_ops_free(&uvmm->umgr, op->ops);
+		drm_gpuva_ops_free(&uvmm->base, op->ops);
 		op->ops = NULL;
 		op->reg = NULL;
 	}
@@ -1509,7 +1509,7 @@ nouveau_uvmm_bind_job_free_work_fn(struct work_struct *work)
 		}
 
 		if (!IS_ERR_OR_NULL(op->ops))
-			drm_gpuva_ops_free(&uvmm->umgr, op->ops);
+			drm_gpuva_ops_free(&uvmm->base, op->ops);
 
 		if (obj)
 			drm_gem_object_put(obj);
@@ -1836,7 +1836,7 @@ nouveau_uvmm_init(struct nouveau_uvmm *uvmm, struct nouveau_cli *cli,
 	uvmm->kernel_managed_addr = kernel_managed_addr;
 	uvmm->kernel_managed_size = kernel_managed_size;
 
-	drm_gpuvm_init(&uvmm->umgr, cli->name,
+	drm_gpuvm_init(&uvmm->base, cli->name,
 		       NOUVEAU_VA_SPACE_START,
 		       NOUVEAU_VA_SPACE_END,
 		       kernel_managed_addr, kernel_managed_size,
@@ -1855,7 +1855,7 @@ nouveau_uvmm_init(struct nouveau_uvmm *uvmm, struct nouveau_cli *cli,
 	return 0;
 
 out_free_gpuva_mgr:
-	drm_gpuvm_destroy(&uvmm->umgr);
+	drm_gpuvm_destroy(&uvmm->base);
 out_unlock:
 	mutex_unlock(&cli->mutex);
 	return ret;
@@ -1877,11 +1877,11 @@ nouveau_uvmm_fini(struct nouveau_uvmm *uvmm)
 	wait_event(entity->job.wq, list_empty(&entity->job.list.head));
 
 	nouveau_uvmm_lock(uvmm);
-	drm_gpuvm_for_each_va_safe(va, next, &uvmm->umgr) {
+	drm_gpuvm_for_each_va_safe(va, next, &uvmm->base) {
 		struct nouveau_uvma *uvma = uvma_from_va(va);
 		struct drm_gem_object *obj = va->gem.obj;
 
-		if (unlikely(va == &uvmm->umgr.kernel_alloc_node))
+		if (unlikely(va == &uvmm->base.kernel_alloc_node))
 			continue;
 
 		drm_gpuva_remove(va);
@@ -1910,7 +1910,7 @@ nouveau_uvmm_fini(struct nouveau_uvmm *uvmm)
 
 	mutex_lock(&cli->mutex);
 	nouveau_vmm_fini(&uvmm->vmm);
-	drm_gpuvm_destroy(&uvmm->umgr);
+	drm_gpuvm_destroy(&uvmm->base);
 	mutex_unlock(&cli->mutex);
 
 	dma_resv_fini(&uvmm->resv);
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.h b/drivers/gpu/drm/nouveau/nouveau_uvmm.h
index e96c9919d1bd..a308c59760a5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.h
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.h
@@ -8,8 +8,8 @@
 #include "nouveau_drv.h"
 
 struct nouveau_uvmm {
+	struct drm_gpuvm base;
 	struct nouveau_vmm vmm;
-	struct drm_gpuvm umgr;
 	struct maple_tree region_mt;
 	struct mutex mutex;
 	struct dma_resv resv;
@@ -41,10 +41,10 @@ struct nouveau_uvma {
 	u8 kind;
 };
 
-#define uvmm_from_mgr(x) container_of((x), struct nouveau_uvmm, umgr)
+#define uvmm_from_gpuvm(x) container_of((x), struct nouveau_uvmm, base)
 #define uvma_from_va(x) container_of((x), struct nouveau_uvma, va)
 
-#define to_uvmm(x) uvmm_from_mgr((x)->va.vm)
+#define to_uvmm(x) uvmm_from_gpuvm((x)->va.vm)
 
 struct nouveau_uvmm_bind_job {
 	struct nouveau_job base;
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH drm-misc-next v3 4/7] drm/gpuvm: common dma-resv per struct drm_gpuvm
  2023-09-09 15:31 [PATCH drm-misc-next v3 0/7] [RFC] DRM GPUVA Manager GPU-VM features Danilo Krummrich
                   ` (2 preceding siblings ...)
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 3/7] drm/nouveau: uvmm: rename 'umgr' to 'base' Danilo Krummrich
@ 2023-09-09 15:31 ` Danilo Krummrich
  2023-09-11 12:00   ` Boris Brezillon
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination Danilo Krummrich
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-09 15:31 UTC (permalink / raw)
  To: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel, Danilo Krummrich

Provide a common dma-resv for GEM objects not being used outside of this
GPU-VM. This is used in a subsequent patch to generalize dma-resv,
external and evicted object handling and GEM validation.
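
As a quick illustration of the intended usage, a minimal sketch follows; it
is not part of this patch. The my_vm structure, my_gpuvm_ops and the VA range
macros are hypothetical, only the additional struct drm_device argument to
drm_gpuvm_init() and the gpuvm->resv pointer used below are introduced here.

/*
 * Hypothetical driver-side sketch; all my_* names and ranges are made up
 * for illustration and are not part of this patch.
 */
#include <linux/dma-resv.h>
#include <drm/drm_gpuvm.h>

#define MY_VA_START			0x1000000ULL
#define MY_VA_RANGE			(1ULL << 47)
#define MY_KERNEL_RESERVED_ADDR		MY_VA_START
#define MY_KERNEL_RESERVED_RANGE	0x100000ULL

/* Callbacks are optional unless the sm_map()/sm_unmap() helpers are used. */
static const struct drm_gpuvm_ops my_gpuvm_ops;

struct my_vm {
	struct drm_gpuvm base;
	/* driver specific state ... */
};

static void my_vm_init(struct my_vm *vm, struct drm_device *drm)
{
	/* The &drm_device is needed so the GPUVM can set up its own resv. */
	drm_gpuvm_init(&vm->base, drm, "my-vm",
		       MY_VA_START, MY_VA_RANGE,
		       MY_KERNEL_RESERVED_ADDR, MY_KERNEL_RESERVED_RANGE,
		       &my_gpuvm_ops);
}

static void my_vm_revalidate(struct my_vm *vm)
{
	/* GEM objects private to this VM can share vm->base.resv as their
	 * dma-resv; taking the VM-wide lock then covers all of them at once.
	 */
	dma_resv_lock(vm->base.resv, NULL);

	/* ... validate / rebind VM-private BOs ... */

	dma_resv_unlock(vm->base.resv);
}

The dummy GEM object backing this resv is set up in drm_gpuvm_init() and torn
down in drm_gpuvm_destroy(); drivers are only expected to use the gpuvm->resv
pointer.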

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
 drivers/gpu/drm/drm_gpuvm.c            | 10 ++++++++--
 drivers/gpu/drm/nouveau/nouveau_uvmm.c |  2 +-
 include/drm/drm_gpuvm.h                | 15 ++++++++++++++-
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index aae086deaa2b..218204fe5068 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuvm *gpuvm,
 /**
  * drm_gpuvm_init() - initialize a &drm_gpuvm
  * @gpuvm: pointer to the &drm_gpuvm to initialize
+ * @drm: the drivers &drm_device
  * @name: the name of the GPU VA space
  * @start_offset: the start offset of the GPU VA space
  * @range: the size of the GPU VA space
@@ -668,7 +669,7 @@ drm_gpuva_range_valid(struct drm_gpuvm *gpuvm,
  * &name is expected to be managed by the surrounding driver structures.
  */
 void
-drm_gpuvm_init(struct drm_gpuvm *gpuvm,
+drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
 	       const char *name,
 	       u64 start_offset, u64 range,
 	       u64 reserve_offset, u64 reserve_range,
@@ -694,6 +695,9 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
 						     reserve_range)))
 			__drm_gpuva_insert(gpuvm, &gpuvm->kernel_alloc_node);
 	}
+
+	drm_gem_private_object_init(drm, &gpuvm->d_obj, 0);
+	gpuvm->resv = gpuvm->d_obj.resv;
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_init);
 
@@ -713,7 +717,9 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
 		__drm_gpuva_remove(&gpuvm->kernel_alloc_node);
 
 	WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
-	     "GPUVA tree is not empty, potentially leaking memory.");
+	     "GPUVA tree is not empty, potentially leaking memory.\n");
+
+	drm_gem_private_object_fini(&gpuvm->d_obj);
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
index 6c86b64273c3..a80ac8767843 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -1836,7 +1836,7 @@ nouveau_uvmm_init(struct nouveau_uvmm *uvmm, struct nouveau_cli *cli,
 	uvmm->kernel_managed_addr = kernel_managed_addr;
 	uvmm->kernel_managed_size = kernel_managed_size;
 
-	drm_gpuvm_init(&uvmm->base, cli->name,
+	drm_gpuvm_init(&uvmm->base, cli->drm->dev, cli->name,
 		       NOUVEAU_VA_SPACE_START,
 		       NOUVEAU_VA_SPACE_END,
 		       kernel_managed_addr, kernel_managed_size,
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index 0e802676e0a9..4abc753c01eb 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -240,9 +240,22 @@ struct drm_gpuvm {
 	 * @ops: &drm_gpuvm_ops providing the split/merge steps to drivers
 	 */
 	const struct drm_gpuvm_ops *ops;
+
+	/**
+	 * @d_obj: Dummy GEM object; used internally to pass the GPU VMs
+	 * dma-resv to &drm_exec.
+	 */
+	struct drm_gem_object d_obj;
+
+	/**
+	 * @resv: the &dma_resv for &drm_gem_objects mapped in this GPU VA
+	 * space
+	 */
+	struct dma_resv *resv;
 };
 
-void drm_gpuvm_init(struct drm_gpuvm *gpuvm, const char *name,
+void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
+		    const char *name,
 		    u64 start_offset, u64 range,
 		    u64 reserve_offset, u64 reserve_range,
 		    const struct drm_gpuvm_ops *ops);
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination
  2023-09-09 15:31 [PATCH drm-misc-next v3 0/7] [RFC] DRM GPUVA Manager GPU-VM features Danilo Krummrich
                   ` (3 preceding siblings ...)
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 4/7] drm/gpuvm: common dma-resv per struct drm_gpuvm Danilo Krummrich
@ 2023-09-09 15:31 ` Danilo Krummrich
  2023-09-11 17:19   ` Thomas Hellström
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation Danilo Krummrich
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 7/7] drm/nouveau: GPUVM dma-resv/extobj handling, " Danilo Krummrich
  6 siblings, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-09 15:31 UTC (permalink / raw)
  To: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel, Danilo Krummrich

This patch adds an abstraction layer between the drm_gpuva mappings of
a particular drm_gem_object and this GEM object itself. The abstraction
represents a combination of a drm_gem_object and drm_gpuvm. The
drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
representing this abstraction), while each drm_gpuvm_bo contains a list of
mappings of this GEM object.

This has multiple advantages:

1) We can use the drm_gpuvm_bo structure to attach it to various lists
   of the drm_gpuvm. This is useful for tracking external and evicted
   objects per VM, which is introduced in subsequent patches.

2) Finding mappings of a certain drm_gem_object mapped in a certain
   drm_gpuvm becomes much cheaper.

3) Drivers can derive and extend the structure to easily represent
   driver specific states of a BO for a certain GPUVM.

The idea of this abstraction was taken from amdgpu, hence the credit for
this idea goes to the developers of amdgpu.
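
To make the intended lifetime handling concrete, a minimal, hypothetical
usage sketch follows; it is not part of this patch. The surrounding function
and its locking context are made up, only drm_gpuvm_bo_obtain(),
drm_gpuva_link() and drm_gpuvm_bo_put() are the helpers added here.

#include <linux/err.h>
#include <drm/drm_gpuvm.h>

/*
 * Hypothetical driver snippet; it assumes @va has already been initialized
 * (va->gem.obj == obj) and inserted into the VA space.
 */
static int my_vm_map_va(struct drm_gpuvm *gpuvm, struct drm_gpuva *va,
			struct drm_gem_object *obj)
{
	struct drm_gpuvm_bo *vm_bo;
	int ret = 0;

	dma_resv_lock(obj->resv, NULL);

	/* Look up the unique (gpuvm, obj) combination or create it and
	 * add it to the GEM's gpuva list.
	 */
	vm_bo = drm_gpuvm_bo_obtain(gpuvm, obj);
	if (IS_ERR(vm_bo)) {
		ret = PTR_ERR(vm_bo);
		goto out_unlock;
	}

	/* Linking the VA takes its own reference on the vm_bo ... */
	drm_gpuva_link(va, vm_bo);

	/* ... so the obtain reference can be dropped again; the vm_bo goes
	 * away once the last VA mapping this GEM in this VM is unlinked.
	 */
	drm_gpuvm_bo_put(vm_bo);

out_unlock:
	dma_resv_unlock(obj->resv);
	return ret;
}

Nouveau, in the hunks below, keeps the obtained reference around until the
bind job is cleaned up instead of dropping it right after linking; both
patterns work since drm_gpuva_link() takes its own reference on the vm_bo.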

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
 drivers/gpu/drm/drm_gpuvm.c            | 304 ++++++++++++++++++++++---
 drivers/gpu/drm/nouveau/nouveau_uvmm.c |  68 ++++--
 include/drm/drm_gem.h                  |  32 +--
 include/drm/drm_gpuvm.h                | 149 +++++++++++-
 4 files changed, 478 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 218204fe5068..f4411047dbb3 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -61,6 +61,18 @@
  * contained within struct drm_gpuva already. Hence, for inserting &drm_gpuva
  * entries from within dma-fence signalling critical sections it is enough to
  * pre-allocate the &drm_gpuva structures.
+ *
+ * In order to connect a struct drm_gpuva to its backing &drm_gem_object each
+ * &drm_gem_object maintains a list of &drm_gpuvm_bo structures, and each
+ * &drm_gpuvm_bo contains a list of &drm_gpuva structures.
+ *
+ * A &drm_gpuvm_bo is an abstraction that represents a combination of a
+ * &drm_gpuvm and a &drm_gem_object. Every such combination should be unique.
+ * This is ensured by the API through drm_gpuvm_bo_obtain() and
+ * drm_gpuvm_bo_obtain_prealloc() which first look into the corresponding
+ * &drm_gem_object list of &drm_gpuvm_bos for an existing instance of this
+ * particular combination. If not existent a new instance is created and linked
+ * to the &drm_gem_object.
  */
 
 /**
@@ -393,14 +405,21 @@
  * split / merge or prefetch.
  *
  * The GPU VA manager also does not take care of the locking of the backing
- * &drm_gem_object buffers GPU VA lists by itself; drivers are responsible to
- * enforce mutual exclusion using either the GEMs dma_resv lock or alternatively
- * a driver specific external lock. For the latter see also
- * drm_gem_gpuva_set_lock().
+ * &drm_gem_object buffers GPU VA lists and &drm_gpuvm_bo abstractions by
+ * itself; drivers are responsible to enforce mutual exclusion using either the
+ * GEMs dma_resv lock or alternatively a driver specific external lock. For the
+ * latter see also drm_gem_gpuva_set_lock().
  *
  * However, the GPU VA manager contains lockdep checks to ensure callers of its
  * API hold the corresponding lock whenever the &drm_gem_objects GPU VA list is
- * accessed by functions such as drm_gpuva_link() or drm_gpuva_unlink().
+ * accessed by functions such as drm_gpuva_link() or drm_gpuva_unlink(), but
+ * also drm_gpuvm_bo_obtain() and drm_gpuvm_bo_put().
+ *
+ * The latter is required since on creation and destruction of a &drm_gpuvm_bo
+ * the &drm_gpuvm_bo is attached / removed from the &drm_gem_objects gpuva list.
+ * Subsequent calls to drm_gpuvm_bo_obtain() for the same &drm_gpuvm and
+ * &drm_gem_object must be able to observe previous creations and destructions
+ * of &drm_gpuvm_bos in order to keep instances unique.
  */
 
 /**
@@ -430,6 +449,7 @@
  *	{
  *		struct drm_gpuva_ops *ops;
  *		struct drm_gpuva_op *op
+ *		struct drm_gpuvm_bo *vm_bo;
  *
  *		driver_lock_va_space();
  *		ops = drm_gpuvm_sm_map_ops_create(gpuvm, addr, range,
@@ -437,6 +457,10 @@
  *		if (IS_ERR(ops))
  *			return PTR_ERR(ops);
  *
+ *		vm_bo = drm_gpuvm_bo_obtain(gpuvm, obj);
+ *		if (IS_ERR(vm_bo))
+ *			return PTR_ERR(vm_bo);
+ *
  *		drm_gpuva_for_each_op(op, ops) {
  *			struct drm_gpuva *va;
  *
@@ -449,7 +473,7 @@
  *
  *				driver_vm_map();
  *				drm_gpuva_map(gpuvm, va, &op->map);
- *				drm_gpuva_link(va);
+ *				drm_gpuva_link(va, vm_bo);
  *
  *				break;
  *			case DRM_GPUVA_OP_REMAP: {
@@ -476,11 +500,11 @@
  *				driver_vm_remap();
  *				drm_gpuva_remap(prev, next, &op->remap);
  *
- *				drm_gpuva_unlink(va);
  *				if (prev)
- *					drm_gpuva_link(prev);
+ *					drm_gpuva_link(prev, va->vm_bo);
  *				if (next)
- *					drm_gpuva_link(next);
+ *					drm_gpuva_link(next, va->vm_bo);
+ *				drm_gpuva_unlink(va);
  *
  *				break;
  *			}
@@ -496,6 +520,7 @@
  *				break;
  *			}
  *		}
+ *		drm_gpuvm_bo_put(vm_bo);
  *		driver_unlock_va_space();
  *
  *		return 0;
@@ -505,6 +530,7 @@
  *
  *	struct driver_context {
  *		struct drm_gpuvm *gpuvm;
+ *		struct drm_gpuvm_bo *vm_bo;
  *		struct drm_gpuva *new_va;
  *		struct drm_gpuva *prev_va;
  *		struct drm_gpuva *next_va;
@@ -525,6 +551,7 @@
  *				  struct drm_gem_object *obj, u64 offset)
  *	{
  *		struct driver_context ctx;
+ *		struct drm_gpuvm_bo *vm_bo;
  *		struct drm_gpuva_ops *ops;
  *		struct drm_gpuva_op *op;
  *		int ret = 0;
@@ -534,16 +561,23 @@
  *		ctx.new_va = kzalloc(sizeof(*ctx.new_va), GFP_KERNEL);
  *		ctx.prev_va = kzalloc(sizeof(*ctx.prev_va), GFP_KERNEL);
  *		ctx.next_va = kzalloc(sizeof(*ctx.next_va), GFP_KERNEL);
- *		if (!ctx.new_va || !ctx.prev_va || !ctx.next_va) {
+ *		ctx.vm_bo = drm_gpuvm_bo_create(gpuvm, obj);
+ *		if (!ctx.new_va || !ctx.prev_va || !ctx.next_va || !ctx.vm_bo) {
  *			ret = -ENOMEM;
  *			goto out;
  *		}
  *
+ *		// Typically protected with a driver specific GEM gpuva lock
+ *		// used in the fence signaling path for drm_gpuva_link() and
+ *		// drm_gpuva_unlink(), hence pre-allocate.
+ *		ctx.vm_bo = drm_gpuvm_bo_obtain_prealloc(ctx.vm_bo);
+ *
  *		driver_lock_va_space();
  *		ret = drm_gpuvm_sm_map(gpuvm, &ctx, addr, range, obj, offset);
  *		driver_unlock_va_space();
  *
  *	out:
+ *		drm_gpuvm_bo_put(ctx.vm_bo);
  *		kfree(ctx.new_va);
  *		kfree(ctx.prev_va);
  *		kfree(ctx.next_va);
@@ -556,7 +590,7 @@
  *
  *		drm_gpuva_map(ctx->vm, ctx->new_va, &op->map);
  *
- *		drm_gpuva_link(ctx->new_va);
+ *		drm_gpuva_link(ctx->new_va, ctx->vm_bo);
  *
  *		// prevent the new GPUVA from being freed in
  *		// driver_mapping_create()
@@ -568,22 +602,23 @@
  *	int driver_gpuva_remap(struct drm_gpuva_op *op, void *__ctx)
  *	{
  *		struct driver_context *ctx = __ctx;
+ *		struct drm_gpuva *va = op->remap.unmap->va;
  *
  *		drm_gpuva_remap(ctx->prev_va, ctx->next_va, &op->remap);
  *
- *		drm_gpuva_unlink(op->remap.unmap->va);
- *		kfree(op->remap.unmap->va);
- *
  *		if (op->remap.prev) {
- *			drm_gpuva_link(ctx->prev_va);
+ *			drm_gpuva_link(ctx->prev_va, va->vm_bo);
  *			ctx->prev_va = NULL;
  *		}
  *
  *		if (op->remap.next) {
- *			drm_gpuva_link(ctx->next_va);
+ *			drm_gpuva_link(ctx->next_va, va->vm_bo);
  *			ctx->next_va = NULL;
  *		}
  *
+ *		drm_gpuva_unlink(va);
+ *		kfree(va);
+ *
  *		return 0;
  *	}
  *
@@ -723,6 +758,186 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
 
+/**
+ * drm_gpuvm_bo_create() - create a new instance of struct drm_gpuvm_bo
+ * @gpuvm: The &drm_gpuvm the @obj is mapped in.
+ * @obj: The &drm_gem_object being mapped in the @gpuvm.
+ *
+ * If provided by the driver, this function uses the &drm_gpuvm_ops
+ * vm_bo_alloc() callback to allocate.
+ *
+ * Returns: a pointer to the &drm_gpuvm_bo on success, NULL on failure
+ */
+struct drm_gpuvm_bo *
+drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
+		    struct drm_gem_object *obj)
+{
+	const struct drm_gpuvm_ops *ops = gpuvm->ops;
+	struct drm_gpuvm_bo *vm_bo;
+
+	if (ops && ops->vm_bo_alloc)
+		vm_bo = ops->vm_bo_alloc();
+	else
+		vm_bo = kzalloc(sizeof(*vm_bo), GFP_KERNEL);
+
+	if (unlikely(!vm_bo))
+		return NULL;
+
+	vm_bo->vm = gpuvm;
+	vm_bo->obj = obj;
+
+	kref_init(&vm_bo->kref);
+	INIT_LIST_HEAD(&vm_bo->list.gpuva);
+	INIT_LIST_HEAD(&vm_bo->list.entry.gem);
+
+	drm_gem_object_get(obj);
+
+	return vm_bo;
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_bo_create);
+
+static void
+drm_gpuvm_bo_destroy(struct kref *kref)
+{
+	struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo,
+						  kref);
+	struct drm_gpuvm *gpuvm = vm_bo->vm;
+	const struct drm_gpuvm_ops *ops = gpuvm->ops;
+	struct drm_gem_object *obj = vm_bo->obj;
+
+	drm_gem_gpuva_assert_lock_held(vm_bo->obj);
+
+	list_del(&vm_bo->list.entry.gem);
+
+	drm_gem_object_put(obj);
+
+	if (ops && ops->vm_bo_free)
+		ops->vm_bo_free(vm_bo);
+	else
+		kfree(vm_bo);
+}
+
+/**
+ * drm_gpuvm_bo_put() - drop a struct drm_gpuvm_bo reference
+ * @vm_bo: the &drm_gpuvm_bo to release the reference of
+ *
+ * This releases a reference to @vm_bo.
+ */
+void
+drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
+{
+	if (vm_bo)
+		kref_put(&vm_bo->kref, drm_gpuvm_bo_destroy);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
+
+static struct drm_gpuvm_bo *
+__drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
+		    struct drm_gem_object *obj)
+{
+	struct drm_gpuvm_bo *vm_bo;
+
+	drm_gem_gpuva_assert_lock_held(obj);
+
+	drm_gem_for_each_gpuvm_bo(vm_bo, obj)
+		if (vm_bo->vm == gpuvm)
+			return vm_bo;
+
+	return NULL;
+}
+
+/**
+ * drm_gpuvm_bo_find() - find the &drm_gpuvm_bo for the given
+ * &drm_gpuvm and &drm_gem_object
+ * @gpuvm: The &drm_gpuvm the @obj is mapped in.
+ * @obj: The &drm_gem_object being mapped in the @gpuvm.
+ *
+ * Find the &drm_gpuvm_bo representing the combination of the given
+ * &drm_gpuvm and &drm_gem_object. If found, increases the reference
+ * count of the &drm_gpuvm_bo accordingly.
+ *
+ * Returns: a pointer to the &drm_gpuvm_bo on success, NULL on failure
+ */
+struct drm_gpuvm_bo *
+drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
+		  struct drm_gem_object *obj)
+{
+	struct drm_gpuvm_bo *vm_bo = __drm_gpuvm_bo_find(gpuvm, obj);
+
+	return vm_bo ? drm_gpuvm_bo_get(vm_bo) : NULL;
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_bo_find);
+
+/**
+ * drm_gpuvm_bo_obtain() - obtains an instance of the &drm_gpuvm_bo for the
+ * given &drm_gpuvm and &drm_gem_object
+ * @gpuvm: The &drm_gpuvm the @obj is mapped in.
+ * @obj: The &drm_gem_object being mapped in the @gpuvm.
+ *
+ * Find the &drm_gpuvm_bo representing the combination of the given
+ * &drm_gpuvm and &drm_gem_object. If found, increases the reference
+ * count of the &drm_gpuvm_bo accordingly. If not found, allocates a new
+ * &drm_gpuvm_bo.
+ *
+ * A new &drm_gpuvm_bo is added to the GEMs gpuva list.
+ *
+ * Returns: a pointer to the &drm_gpuvm_bo on success, an ERR_PTR on failure
+ */
+struct drm_gpuvm_bo *
+drm_gpuvm_bo_obtain(struct drm_gpuvm *gpuvm,
+		    struct drm_gem_object *obj)
+{
+	struct drm_gpuvm_bo *vm_bo;
+
+	vm_bo = drm_gpuvm_bo_find(gpuvm, obj);
+	if (vm_bo)
+		return vm_bo;
+
+	vm_bo = drm_gpuvm_bo_create(gpuvm, obj);
+	if (!vm_bo)
+		return ERR_PTR(-ENOMEM);
+
+	list_add_tail(&vm_bo->list.entry.gem, &obj->gpuva.list);
+
+	return vm_bo;
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain);
+
+/**
+ * drm_gpuvm_bo_obtain_prealloc() - obtains an instance of the &drm_gpuvm_bo
+ * for the given &drm_gpuvm and &drm_gem_object
+ * @__vm_bo: A pre-allocated struct drm_gpuvm_bo.
+ *
+ * Find the &drm_gpuvm_bo representing the combination of the given
+ * &drm_gpuvm and &drm_gem_object. If found, increases the reference
+ * count of the found &drm_gpuvm_bo accordingly, while the @__vm_bo reference
+ * count is decreased. If not found @__vm_bo is returned without further
+ * increase of the reference count.
+ *
+ * A new &drm_gpuvm_bo is added to the GEMs gpuva list.
+ *
+ * Returns: a pointer to the found &drm_gpuvm_bo or @__vm_bo if no existing
+ * &drm_gpuvm_bo was found
+ */
+struct drm_gpuvm_bo *
+drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *__vm_bo)
+{
+	struct drm_gpuvm *gpuvm = __vm_bo->vm;
+	struct drm_gem_object *obj = __vm_bo->obj;
+	struct drm_gpuvm_bo *vm_bo;
+
+	vm_bo = drm_gpuvm_bo_find(gpuvm, obj);
+	if (vm_bo) {
+		drm_gpuvm_bo_put(__vm_bo);
+		return vm_bo;
+	}
+
+	list_add_tail(&__vm_bo->list.entry.gem, &obj->gpuva.list);
+
+	return __vm_bo;
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
+
 static int
 __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
 		   struct drm_gpuva *va)
@@ -812,24 +1027,33 @@ EXPORT_SYMBOL_GPL(drm_gpuva_remove);
 /**
  * drm_gpuva_link() - link a &drm_gpuva
  * @va: the &drm_gpuva to link
+ * @vm_bo: the &drm_gpuvm_bo to add the &drm_gpuva to
  *
- * This adds the given &va to the GPU VA list of the &drm_gem_object it is
- * associated with.
+ * This adds the given &va to the GPU VA list of the &drm_gpuvm_bo and the
+ * &drm_gpuvm_bo to the &drm_gem_object it is associated with.
+ *
+ * For every &drm_gpuva entry added to the &drm_gpuvm_bo an additional
+ * reference of the latter is taken.
  *
  * This function expects the caller to protect the GEM's GPUVA list against
- * concurrent access using the GEMs dma_resv lock.
+ * concurrent access using either the GEMs dma_resv lock or a driver specific
+ * lock set through drm_gem_gpuva_set_lock().
  */
 void
-drm_gpuva_link(struct drm_gpuva *va)
+drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuvm_bo *vm_bo)
 {
 	struct drm_gem_object *obj = va->gem.obj;
 
 	if (unlikely(!obj))
 		return;
 
+	WARN_ON(obj != vm_bo->obj);
 	drm_gem_gpuva_assert_lock_held(obj);
 
-	list_add_tail(&va->gem.entry, &obj->gpuva.list);
+	drm_gpuvm_bo_get(vm_bo);
+	va->vm_bo = vm_bo;
+
+	list_add_tail(&va->gem.entry, &vm_bo->list.gpuva);
 }
 EXPORT_SYMBOL_GPL(drm_gpuva_link);
 
@@ -840,13 +1064,22 @@ EXPORT_SYMBOL_GPL(drm_gpuva_link);
  * This removes the given &va from the GPU VA list of the &drm_gem_object it is
  * associated with.
  *
+ * This removes the given &va from the GPU VA list of the &drm_gpuvm_bo and
+ * the &drm_gpuvm_bo from the &drm_gem_object it is associated with in case
+ * this call unlinks the last &drm_gpuva from the &drm_gpuvm_bo.
+ *
+ * For every &drm_gpuva entry removed from the &drm_gpuvm_bo a reference of
+ * the latter is dropped.
+ *
  * This function expects the caller to protect the GEM's GPUVA list against
- * concurrent access using the GEMs dma_resv lock.
+ * concurrent access using either the GEMs dma_resv lock or a driver specific
+ * lock set through drm_gem_gpuva_set_lock().
  */
 void
 drm_gpuva_unlink(struct drm_gpuva *va)
 {
 	struct drm_gem_object *obj = va->gem.obj;
+	struct drm_gpuvm_bo *vm_bo = va->vm_bo;
 
 	if (unlikely(!obj))
 		return;
@@ -854,6 +1087,9 @@ drm_gpuva_unlink(struct drm_gpuva *va)
 	drm_gem_gpuva_assert_lock_held(obj);
 
 	list_del_init(&va->gem.entry);
+
+	va->vm_bo = NULL;
+	drm_gpuvm_bo_put(vm_bo);
 }
 EXPORT_SYMBOL_GPL(drm_gpuva_unlink);
 
@@ -998,10 +1234,10 @@ drm_gpuva_remap(struct drm_gpuva *prev,
 		struct drm_gpuva *next,
 		struct drm_gpuva_op_remap *op)
 {
-	struct drm_gpuva *curr = op->unmap->va;
-	struct drm_gpuvm *gpuvm = curr->vm;
+	struct drm_gpuva *va = op->unmap->va;
+	struct drm_gpuvm *gpuvm = va->vm;
 
-	drm_gpuva_remove(curr);
+	drm_gpuva_remove(va);
 
 	if (op->prev) {
 		drm_gpuva_init_from_op(prev, op->prev);
@@ -1645,9 +1881,8 @@ drm_gpuvm_prefetch_ops_create(struct drm_gpuvm *gpuvm,
 EXPORT_SYMBOL_GPL(drm_gpuvm_prefetch_ops_create);
 
 /**
- * drm_gpuvm_gem_unmap_ops_create() - creates the &drm_gpuva_ops to unmap a GEM
- * @gpuvm: the &drm_gpuvm representing the GPU VA space
- * @obj: the &drm_gem_object to unmap
+ * drm_gpuvm_bo_unmap_ops_create() - creates the &drm_gpuva_ops to unmap a GEM
+ * @vm_bo: the &drm_gpuvm_bo abstraction
  *
  * This function creates a list of operations to perform unmapping for every
  * GPUVA attached to a GEM.
@@ -1664,15 +1899,14 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_prefetch_ops_create);
  * Returns: a pointer to the &drm_gpuva_ops on success, an ERR_PTR on failure
  */
 struct drm_gpuva_ops *
-drm_gpuvm_gem_unmap_ops_create(struct drm_gpuvm *gpuvm,
-			       struct drm_gem_object *obj)
+drm_gpuvm_bo_unmap_ops_create(struct drm_gpuvm_bo *vm_bo)
 {
 	struct drm_gpuva_ops *ops;
 	struct drm_gpuva_op *op;
 	struct drm_gpuva *va;
 	int ret;
 
-	drm_gem_gpuva_assert_lock_held(obj);
+	drm_gem_gpuva_assert_lock_held(vm_bo->obj);
 
 	ops = kzalloc(sizeof(*ops), GFP_KERNEL);
 	if (!ops)
@@ -1680,8 +1914,8 @@ drm_gpuvm_gem_unmap_ops_create(struct drm_gpuvm *gpuvm,
 
 	INIT_LIST_HEAD(&ops->list);
 
-	drm_gem_for_each_gpuva(va, obj) {
-		op = gpuva_op_alloc(gpuvm);
+	drm_gpuvm_bo_for_each_va(va, vm_bo) {
+		op = gpuva_op_alloc(vm_bo->vm);
 		if (!op) {
 			ret = -ENOMEM;
 			goto err_free_ops;
@@ -1695,10 +1929,10 @@ drm_gpuvm_gem_unmap_ops_create(struct drm_gpuvm *gpuvm,
 	return ops;
 
 err_free_ops:
-	drm_gpuva_ops_free(gpuvm, ops);
+	drm_gpuva_ops_free(vm_bo->vm, ops);
 	return ERR_PTR(ret);
 }
-EXPORT_SYMBOL_GPL(drm_gpuvm_gem_unmap_ops_create);
+EXPORT_SYMBOL_GPL(drm_gpuvm_bo_unmap_ops_create);
 
 /**
  * drm_gpuva_ops_free() - free the given &drm_gpuva_ops
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
index a80ac8767843..cf709afd2ac7 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -62,6 +62,8 @@ struct bind_job_op {
 	enum vm_bind_op op;
 	u32 flags;
 
+	struct drm_gpuvm_bo *vm_bo;
+
 	struct {
 		u64 addr;
 		u64 range;
@@ -1113,22 +1115,28 @@ bind_validate_region(struct nouveau_job *job)
 }
 
 static void
-bind_link_gpuvas(struct drm_gpuva_ops *ops, struct nouveau_uvma_prealloc *new)
+bind_link_gpuvas(struct bind_job_op *bop)
 {
+	struct nouveau_uvma_prealloc *new = &bop->new;
+	struct drm_gpuvm_bo *vm_bo = bop->vm_bo;
+	struct drm_gpuva_ops *ops = bop->ops;
 	struct drm_gpuva_op *op;
 
 	drm_gpuva_for_each_op(op, ops) {
 		switch (op->op) {
 		case DRM_GPUVA_OP_MAP:
-			drm_gpuva_link(&new->map->va);
+			drm_gpuva_link(&new->map->va, vm_bo);
 			break;
-		case DRM_GPUVA_OP_REMAP:
+		case DRM_GPUVA_OP_REMAP: {
+			struct drm_gpuva *va = op->remap.unmap->va;
+
 			if (op->remap.prev)
-				drm_gpuva_link(&new->prev->va);
+				drm_gpuva_link(&new->prev->va, va->vm_bo);
 			if (op->remap.next)
-				drm_gpuva_link(&new->next->va);
-			drm_gpuva_unlink(op->remap.unmap->va);
+				drm_gpuva_link(&new->next->va, va->vm_bo);
+			drm_gpuva_unlink(va);
 			break;
+		}
 		case DRM_GPUVA_OP_UNMAP:
 			drm_gpuva_unlink(op->unmap.va);
 			break;
@@ -1150,10 +1158,18 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 
 	list_for_each_op(op, &bind_job->ops) {
 		if (op->op == OP_MAP) {
-			op->gem.obj = drm_gem_object_lookup(job->file_priv,
-							    op->gem.handle);
-			if (!op->gem.obj)
+			struct drm_gem_object *obj;
+
+			obj = drm_gem_object_lookup(job->file_priv,
+						    op->gem.handle);
+			if (!(op->gem.obj = obj))
 				return -ENOENT;
+
+			dma_resv_lock(obj->resv, NULL);
+			op->vm_bo = drm_gpuvm_bo_obtain(&uvmm->base, obj);
+			dma_resv_unlock(obj->resv);
+			if (IS_ERR(op->vm_bo))
+				return PTR_ERR(op->vm_bo);
 		}
 
 		ret = bind_validate_op(job, op);
@@ -1364,7 +1380,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 		case OP_UNMAP_SPARSE:
 		case OP_MAP:
 		case OP_UNMAP:
-			bind_link_gpuvas(op->ops, &op->new);
+			bind_link_gpuvas(op);
 			break;
 		default:
 			break;
@@ -1511,6 +1527,12 @@ nouveau_uvmm_bind_job_free_work_fn(struct work_struct *work)
 		if (!IS_ERR_OR_NULL(op->ops))
 			drm_gpuva_ops_free(&uvmm->base, op->ops);
 
+		if (!IS_ERR_OR_NULL(op->vm_bo)) {
+			dma_resv_lock(obj->resv, NULL);
+			drm_gpuvm_bo_put(op->vm_bo);
+			dma_resv_unlock(obj->resv);
+		}
+
 		if (obj)
 			drm_gem_object_put(obj);
 	}
@@ -1776,15 +1798,18 @@ void
 nouveau_uvmm_bo_map_all(struct nouveau_bo *nvbo, struct nouveau_mem *mem)
 {
 	struct drm_gem_object *obj = &nvbo->bo.base;
+	struct drm_gpuvm_bo *vm_bo;
 	struct drm_gpuva *va;
 
 	dma_resv_assert_held(obj->resv);
 
-	drm_gem_for_each_gpuva(va, obj) {
-		struct nouveau_uvma *uvma = uvma_from_va(va);
+	drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
+		drm_gpuvm_bo_for_each_va(va, vm_bo) {
+			struct nouveau_uvma *uvma = uvma_from_va(va);
 
-		nouveau_uvma_map(uvma, mem);
-		drm_gpuva_invalidate(va, false);
+			nouveau_uvma_map(uvma, mem);
+			drm_gpuva_invalidate(va, false);
+		}
 	}
 }
 
@@ -1792,15 +1817,18 @@ void
 nouveau_uvmm_bo_unmap_all(struct nouveau_bo *nvbo)
 {
 	struct drm_gem_object *obj = &nvbo->bo.base;
+	struct drm_gpuvm_bo *vm_bo;
 	struct drm_gpuva *va;
 
 	dma_resv_assert_held(obj->resv);
 
-	drm_gem_for_each_gpuva(va, obj) {
-		struct nouveau_uvma *uvma = uvma_from_va(va);
+	drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
+		drm_gpuvm_bo_for_each_va(va, vm_bo) {
+			struct nouveau_uvma *uvma = uvma_from_va(va);
 
-		nouveau_uvma_unmap(uvma);
-		drm_gpuva_invalidate(va, true);
+			nouveau_uvma_unmap(uvma);
+			drm_gpuva_invalidate(va, true);
+		}
 	}
 }
 
@@ -1847,14 +1875,14 @@ nouveau_uvmm_init(struct nouveau_uvmm *uvmm, struct nouveau_cli *cli,
 			    kernel_managed_addr, kernel_managed_size,
 			    NULL, 0, &cli->uvmm.vmm.vmm);
 	if (ret)
-		goto out_free_gpuva_mgr;
+		goto out_free_gpuvm;
 
 	cli->uvmm.vmm.cli = cli;
 	mutex_unlock(&cli->mutex);
 
 	return 0;
 
-out_free_gpuva_mgr:
+out_free_gpuvm:
 	drm_gpuvm_destroy(&uvmm->base);
 out_unlock:
 	mutex_unlock(&cli->mutex);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index bc9f6aa2f3fe..7147978d82d8 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -571,7 +571,7 @@ int drm_gem_evict(struct drm_gem_object *obj);
  * drm_gem_gpuva_init() - initialize the gpuva list of a GEM object
  * @obj: the &drm_gem_object
  *
- * This initializes the &drm_gem_object's &drm_gpuva list.
+ * This initializes the &drm_gem_object's &drm_gpuvm_bo list.
  *
  * Calling this function is only necessary for drivers intending to support the
  * &drm_driver_feature DRIVER_GEM_GPUVA.
@@ -584,28 +584,28 @@ static inline void drm_gem_gpuva_init(struct drm_gem_object *obj)
 }
 
 /**
- * drm_gem_for_each_gpuva() - iternator to walk over a list of gpuvas
- * @entry__: &drm_gpuva structure to assign to in each iteration step
- * @obj__: the &drm_gem_object the &drm_gpuvas to walk are associated with
+ * drm_gem_for_each_gpuvm_bo() - iterator to walk over a list of &drm_gpuvm_bo
+ * @entry__: &drm_gpuvm_bo structure to assign to in each iteration step
+ * @obj__: the &drm_gem_object the &drm_gpuvm_bo to walk are associated with
  *
- * This iterator walks over all &drm_gpuva structures associated with the
- * &drm_gpuva_manager.
+ * This iterator walks over all &drm_gpuvm_bo structures associated with the
+ * &drm_gem_object.
  */
-#define drm_gem_for_each_gpuva(entry__, obj__) \
-	list_for_each_entry(entry__, &(obj__)->gpuva.list, gem.entry)
+#define drm_gem_for_each_gpuvm_bo(entry__, obj__) \
+	list_for_each_entry(entry__, &(obj__)->gpuva.list, list.entry.gem)
 
 /**
- * drm_gem_for_each_gpuva_safe() - iternator to safely walk over a list of
- * gpuvas
- * @entry__: &drm_gpuva structure to assign to in each iteration step
- * @next__: &next &drm_gpuva to store the next step
- * @obj__: the &drm_gem_object the &drm_gpuvas to walk are associated with
+ * drm_gem_for_each_gpuvm_bo_safe() - iterator to safely walk over a list of
+ * &drm_gpuvm_bo
+ * @entry__: &drm_gpuvm_bo structure to assign to in each iteration step
+ * @next__: &next &drm_gpuvm_bo to store the next step
+ * @obj__: the &drm_gem_object the &drm_gpuvm_bo to walk are associated with
  *
- * This iterator walks over all &drm_gpuva structures associated with the
+ * This iterator walks over all &drm_gpuvm_bo structures associated with the
  * &drm_gem_object. It is implemented with list_for_each_entry_safe(), hence
  * it is save against removal of elements.
  */
-#define drm_gem_for_each_gpuva_safe(entry__, next__, obj__) \
-	list_for_each_entry_safe(entry__, next__, &(obj__)->gpuva.list, gem.entry)
+#define drm_gem_for_each_gpuvm_bo_safe(entry__, next__, obj__) \
+	list_for_each_entry_safe(entry__, next__, &(obj__)->gpuva.list, list.entry.gem)
 
 #endif /* __DRM_GEM_H__ */
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index 4abc753c01eb..afa50b9059a2 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -32,6 +32,7 @@
 #include <drm/drm_gem.h>
 
 struct drm_gpuvm;
+struct drm_gpuvm_bo;
 struct drm_gpuvm_ops;
 
 /**
@@ -72,6 +73,12 @@ struct drm_gpuva {
 	 */
 	struct drm_gpuvm *vm;
 
+	/**
+	 * @vm_bo: the &drm_gpuvm_bo abstraction for the mapped
+	 * &drm_gem_object
+	 */
+	struct drm_gpuvm_bo *vm_bo;
+
 	/**
 	 * @flags: the &drm_gpuva_flags for this mapping
 	 */
@@ -107,7 +114,7 @@ struct drm_gpuva {
 		struct drm_gem_object *obj;
 
 		/**
-		 * @entry: the &list_head to attach this object to a &drm_gem_object
+		 * @entry: the &list_head to attach this object to a &drm_gpuvm_bo
 		 */
 		struct list_head entry;
 	} gem;
@@ -140,7 +147,7 @@ struct drm_gpuva {
 int drm_gpuva_insert(struct drm_gpuvm *gpuvm, struct drm_gpuva *va);
 void drm_gpuva_remove(struct drm_gpuva *va);
 
-void drm_gpuva_link(struct drm_gpuva *va);
+void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuvm_bo *vm_bo);
 void drm_gpuva_unlink(struct drm_gpuva *va);
 
 struct drm_gpuva *drm_gpuva_find(struct drm_gpuvm *gpuvm,
@@ -339,6 +346,117 @@ __drm_gpuva_next(struct drm_gpuva *va)
 #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
 	list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
 
+/**
+ * struct drm_gpuvm_bo - structure representing a &drm_gpuvm and
+ * &drm_gem_object combination
+ *
+ * This structure is an abstraction representing a &drm_gpuvm and
+ * &drm_gem_object combination. It serves as an indirection to accelerate
+ * iterating all &drm_gpuvas within a &drm_gpuvm backed by the same
+ * &drm_gem_object.
+ *
+ * Furthermore it is used to cache evicted GEM objects for a certain GPU-VM to
+ * accelerate validation.
+ *
+ * Typically, drivers want to create an instance of a struct drm_gpuvm_bo once
+ * a GEM object is mapped first in a GPU-VM and release the instance once the
+ * last mapping of the GEM object in this GPU-VM is unmapped.
+ */
+struct drm_gpuvm_bo {
+
+	/**
+	 * @vm: The &drm_gpuvm the @obj is mapped in.
+	 */
+	struct drm_gpuvm *vm;
+
+	/**
+	 * @obj: The &drm_gem_object being mapped in the @vm.
+	 */
+	struct drm_gem_object *obj;
+
+	/**
+	 * @kref: The reference count for this &drm_gpuvm_bo.
+	 */
+	struct kref kref;
+
+	/**
+	 * @list: Structure containing all &list_heads.
+	 */
+	struct {
+		/**
+		 * @gpuva: The list of linked &drm_gpuvas.
+		 */
+		struct list_head gpuva;
+
+		/**
+		 * @entry: Structure containing all &list_heads serving as
+		 * entry.
+		 */
+		struct {
+			/**
+			 * @gem: List entry to attach to the &drm_gem_objects
+			 * gpuva list.
+			 */
+			struct list_head gem;
+		} entry;
+	} list;
+};
+
+struct drm_gpuvm_bo *
+drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
+		    struct drm_gem_object *obj);
+
+struct drm_gpuvm_bo *
+drm_gpuvm_bo_obtain(struct drm_gpuvm *gpuvm,
+		    struct drm_gem_object *obj);
+struct drm_gpuvm_bo *
+drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *vm_bo);
+
+/**
+ * drm_gpuvm_bo_get() - acquire a struct drm_gpuvm_bo reference
+ * @vm_bo: the &drm_gpuvm_bo to acquire the reference of
+ *
+ * This function acquires an additional reference to @vm_bo. It is illegal to
+ * call this without already holding a reference. No locks required.
+ */
+static inline struct drm_gpuvm_bo *
+drm_gpuvm_bo_get(struct drm_gpuvm_bo *vm_bo)
+{
+	kref_get(&vm_bo->kref);
+	return vm_bo;
+}
+
+void drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo);
+
+struct drm_gpuvm_bo *
+drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
+		  struct drm_gem_object *obj);
+
+/**
+ * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of &drm_gpuva
+ * @va__: &drm_gpuva structure to assign to in each iteration step
+ * @vm_bo__: the &drm_gpuvm_bo the &drm_gpuvas to walk are associated with
+ *
+ * This iterator walks over all &drm_gpuva structures associated with the
+ * &drm_gpuvm_bo.
+ */
+#define drm_gpuvm_bo_for_each_va(va__, vm_bo__) \
+	list_for_each_entry(va__, &(vm_bo__)->list.gpuva, gem.entry)
+
+/**
+ * drm_gpuvm_bo_for_each_va_safe() - iterator to safely walk over a list of
+ * &drm_gpuva
+ * @va__: &drm_gpuva structure to assign to in each iteration step
+ * @next__: &next &drm_gpuva to store the next step
+ * @vm_bo__: the &drm_gpuvm_bo the &drm_gpuvas to walk are associated with
+ *
+ * This iterator walks over all &drm_gpuva structures associated with the
+ * &drm_gpuvm_bo. It is implemented with list_for_each_entry_safe(), hence
+ * it is safe against removal of elements.
+ */
+#define drm_gpuvm_bo_for_each_va_safe(va__, next__, vm_bo__) \
+	list_for_each_entry_safe(va__, next__, &(vm_bo__)->list.gpuva, gem.entry)
+
 /**
  * enum drm_gpuva_op_type - GPU VA operation type
  *
@@ -608,8 +726,7 @@ drm_gpuvm_prefetch_ops_create(struct drm_gpuvm *gpuvm,
 				 u64 addr, u64 range);
 
 struct drm_gpuva_ops *
-drm_gpuvm_gem_unmap_ops_create(struct drm_gpuvm *gpuvm,
-			       struct drm_gem_object *obj);
+drm_gpuvm_bo_unmap_ops_create(struct drm_gpuvm_bo *vm_bo);
 
 void drm_gpuva_ops_free(struct drm_gpuvm *gpuvm,
 			struct drm_gpuva_ops *ops);
@@ -653,6 +770,30 @@ struct drm_gpuvm_ops {
 	 */
 	void (*op_free)(struct drm_gpuva_op *op);
 
+	/**
+	 * @vm_bo_alloc: called when the &drm_gpuvm allocates
+	 * a struct drm_gpuvm_bo
+	 *
+	 * Some drivers may want to embed struct drm_gpuvm_bo into driver
+	 * specific structures. By implementing this callback drivers can
+	 * allocate memory accordingly.
+	 *
+	 * This callback is optional.
+	 */
+	struct drm_gpuvm_bo *(*vm_bo_alloc)(void);
+
+	/**
+	 * @vm_bo_free: called when the &drm_gpuvm frees a
+	 * struct drm_gpuvm_bo
+	 *
+	 * Some drivers may want to embed struct drm_gpuvm_bo into driver
+	 * specific structures. By implementing this callback drivers can
+	 * free the previously allocated memory accordingly.
+	 *
+	 * This callback is optional.
+	 */
+	void (*vm_bo_free)(struct drm_gpuvm_bo *vm_bo);
+
 	/**
 	 * @sm_step_map: called from &drm_gpuvm_sm_map to finally insert the
 	 * mapping once all previous steps were completed
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-09 15:31 [PATCH drm-misc-next v3 0/7] [RFC] DRM GPUVA Manager GPU-VM features Danilo Krummrich
                   ` (4 preceding siblings ...)
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination Danilo Krummrich
@ 2023-09-09 15:31 ` Danilo Krummrich
  2023-09-09 20:16   ` kernel test robot
                     ` (5 more replies)
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 7/7] drm/nouveau: GPUVM dma-resv/extobj handling, " Danilo Krummrich
  6 siblings, 6 replies; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-09 15:31 UTC (permalink / raw)
  To: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel, Danilo Krummrich

So far the DRM GPUVA manager offers common infrastructure to track GPU VA
allocations and mappings, generically connect GPU VA mappings to their
backing buffers and perform more complex mapping operations on the GPU VA
space.

However, there are more design patterns commonly used by drivers, which
can potentially be generalized in order to make the DRM GPUVA manager
represent a basic GPU-VM implementation. In this context, this patch aims
at generalizing the following elements.

1) Provide a common dma-resv for GEM objects not being used outside of
   this GPU-VM.

2) Provide tracking of external GEM objects (GEM objects which are
   shared with other GPU-VMs).

3) Provide functions to efficiently lock all GEM objects dma-resv the
   GPU-VM contains mappings of.

4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
   of, such that validation of evicted GEM objects is accelerated.

5) Provide some convenience functions for common patterns.

Rather than being designed as a "framework", the target is to make all
features appear as a collection of optional helper functions, such that
drivers are free to make use of the DRM GPUVA manager's basic
functionality and opt in to other features without setting any feature
flags, just by making use of the corresponding functions.
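
A minimal sketch of how a driver's job submission path might use these
helpers (declared in <drm/drm_gpuvm.h>); the driver-side names, such as
my_job and my_run_job(), as well as the chosen dma-resv usage values, are
placeholders rather than part of this series:

	struct drm_gpuvm_exec vm_exec = { .vm = job->vm };
	int ret;

	/* Lock the VM's common dma-resv plus the dma-resv of all extobjs. */
	ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
	if (ret)
		return ret;

	/* Re-validate all BOs currently on the VM's evicted list. */
	ret = drm_gpuvm_validate(job->vm);
	if (ret)
		goto out_unlock;

	job->fence = my_run_job(job);

	/* Attach the job fence to the private and all extobj dma-resvs. */
	drm_gpuvm_exec_resv_add_fence(&vm_exec, job->fence,
				      DMA_RESV_USAGE_BOOKKEEP,
				      DMA_RESV_USAGE_WRITE);

out_unlock:
	drm_gpuvm_exec_unlock(&vm_exec);
	return ret;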

Big kudos to Boris Brezillon for his help to figure out locking for drivers
updating the GPU VA space within the fence signalling path.

Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
 drivers/gpu/drm/drm_gpuvm.c | 516 ++++++++++++++++++++++++++++++++++++
 include/drm/drm_gpuvm.h     | 197 ++++++++++++++
 2 files changed, 713 insertions(+)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index f4411047dbb3..8e62a043f719 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -73,6 +73,21 @@
  * &drm_gem_object list of &drm_gpuvm_bos for an existing instance of this
  * particular combination. If not existent a new instance is created and linked
  * to the &drm_gem_object.
+ *
+ * &drm_gpuvm_bo structures, since they are unique for a given &drm_gpuvm, are
+ * also used as entries for the &drm_gpuvm's lists of external and evicted
+ * objects. These lists are maintained in order to accelerate locking of
+ * dma-resv locks and validation of evicted objects bound in a &drm_gpuvm. For
+ * instance, all &dma_resv locks of a given &drm_gpuvm's &drm_gem_objects can
+ * be acquired by calling drm_gpuvm_exec_lock(). Once locked, drivers can call
+ * drm_gpuvm_validate() in order to validate all evicted &drm_gem_objects. It
+ * is also possible to lock
+ * additional &drm_gem_objects by providing the corresponding parameters to
+ * drm_gpuvm_exec_lock() as well as open code the &drm_exec loop while making
+ * use of helper functions such as drm_gpuvm_prepare_range() or
+ * drm_gpuvm_prepare_objects().
+ *
+ * Every bound &drm_gem_object is treated as an external object when its
+ * &dma_resv structure differs from the &drm_gpuvm's common &dma_resv
+ * structure.
  */
 
 /**
@@ -420,6 +435,20 @@
  * Subsequent calls to drm_gpuvm_bo_obtain() for the same &drm_gpuvm and
  * &drm_gem_object must be able to observe previous creations and destructions
  * of &drm_gpuvm_bos in order to keep instances unique.
+ *
+ * The &drm_gpuvm's lists for keeping track of external and evicted objects are
+ * protected against concurrent insertion / removal and iteration internally.
+ *
+ * However, drivers still need to ensure that concurrent calls to functions
+ * iterating those lists, such as drm_gpuvm_validate() and
+ * drm_gpuvm_prepare_objects(), are protected. Every such function documents
+ * this requirement with a corresponding comment and, where possible, a
+ * lockdep check.
+ *
+ * Functions adding or removing entries from those lists, such as
+ * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add(), may be called with
+ * external locks being held, e.g. in order to avoid the corresponding list
+ * being (safely) modified while potentially being iterated by other API
+ * functions.
+ * However, this is entirely optional.
  */
 
 /**
@@ -632,6 +661,131 @@
  *	}
  */
 
+/**
+ * get_next_vm_bo_from_list() - get the next vm_bo element
+ * @__gpuvm: The GPU VM
+ * @__list_name: The name of the list we're iterating on
+ * @__local_list: A pointer to the local list used to store already iterated items
+ * @__prev_vm_bo: The previous element we got from get_next_vm_bo_from_list()
+ *
+ * This helper is here to provide lockless list iteration. Lockless as in, the
+ * iterator releases the lock immediately after picking the first element from
+ * the list, so list insertion and deletion can happen concurrently.
+ *
+ * Elements popped from the original list are kept in a local list, so removal
+ * and is_empty checks can still happen while we're iterating the list.
+ */
+#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
+	({										\
+		struct drm_gpuvm_bo *__vm_bo;						\
+											\
+		drm_gpuvm_bo_put(__prev_vm_bo);						\
+											\
+		spin_lock(&(__gpuvm)->__list_name.lock);				\
+		while (!list_empty(&(__gpuvm)->__list_name.list)) {			\
+			__vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,	\
+						   struct drm_gpuvm_bo,			\
+						   list.entry.__list_name);		\
+			if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {			\
+				list_move_tail(&(__vm_bo)->list.entry.__list_name,	\
+					       __local_list);				\
+				break;							\
+			} else {							\
+				list_del_init(&(__vm_bo)->list.entry.__list_name);	\
+				__vm_bo = NULL;						\
+			}								\
+		}									\
+		spin_unlock(&(__gpuvm)->__list_name.lock);				\
+											\
+		__vm_bo;								\
+	})
+
+/**
+ * for_each_vm_bo_in_list() - internal vm_bo list iterator
+ * @__gpuvm: The GPU VM
+ * @__list_name: The name of the list we're iterating on
+ * @__local_list: A pointer to the local list used to store already iterated items
+ * @__vm_bo: The &drm_gpuvm_bo to assign in each iteration step
+ *
+ * This helper is here to provide lockless list iteration. Lockless as in, the
+ * iterator releases the lock immediately after picking the first element from the
+ * list, so list insertion and deletion can happen concurrently.
+ *
+ * Typical use:
+ *
+ *	struct drm_gpuvm_bo *vm_bo;
+ *	LIST_HEAD(my_local_list);
+ *
+ *	ret = 0;
+ *	for_each_vm_bo_in_list(gpuvm, <list_name>, &my_local_list, vm_bo) {
+ *		ret = do_something_with_vm_bo(..., vm_bo);
+ *		if (ret)
+ *			break;
+ *	}
+ *	drm_gpuvm_bo_put(vm_bo);
+ *	restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);
+ *
+ *
+ * Only used for internal list iterations, not meant to be exposed to the outside
+ * world.
+ */
+#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)	\
+	for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
+						__local_list, NULL);		\
+	     __vm_bo;								\
+	     __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
+						__local_list, __vm_bo))		\
+
+/**
+ * restore_vm_bo_list() - move vm_bo elements back to their original list
+ * @__gpuvm: The GPU VM
+ * @__list_name: The name of the list we're iterating on
+ * @__local_list: A pointer to the local list used to store already iterated items
+ *
+ * When we're done iterating a vm_bo list, we should call restore_vm_bo_list()
+ * to restore the original state and let new iterations take place.
+ */
+#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)				\
+	do {										\
+		/* Merge back the two lists, moving local list elements to the		\
+		 * head to preserve previous ordering, in case it matters.		\
+		 */									\
+		spin_lock(&(__gpuvm)->__list_name.lock);				\
+		list_splice(__local_list, &(__gpuvm)->__list_name.list);		\
+		spin_unlock(&(__gpuvm)->__list_name.lock);				\
+	} while (0)
+/**
+ * drm_gpuvm_bo_list_add() - insert a vm_bo into the given list
+ * @__vm_bo: the &drm_gpuvm_bo
+ * @__list_name: the name of the list to insert into
+ *
+ * Inserts the given @__vm_bo into the list specified by @__list_name and
+ * increases the vm_bo's reference count.
+ */
+#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)				\
+	do {									\
+		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
+		if (list_empty(&(__vm_bo)->list.entry.__list_name))		\
+			list_add_tail(&(__vm_bo)->list.entry.__list_name,	\
+				      &(__vm_bo)->vm->__list_name.list);	\
+		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
+	} while (0)
+
+/**
+ * drm_gpuvm_bo_list_del() - remove a vm_bo from the given list
+ * @__vm_bo: the &drm_gpuvm_bo
+ * @__list_name: the name of the list to remove from
+ *
+ * Removes the given @__vm_bo from the list specified by @__list_name and
+ * decreases the vm_bo's reference count.
+ */
+#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)				\
+	do {									\
+		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
+		if (!list_empty(&(__vm_bo)->list.entry.__list_name))		\
+			list_del_init(&(__vm_bo)->list.entry.__list_name);	\
+		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
+	} while (0)
+
+static int __must_check
+drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
+
 #define to_drm_gpuva(__node)	container_of((__node), struct drm_gpuva, rb.node)
 
 #define GPUVA_START(node) ((node)->va.addr)
@@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
 	gpuvm->rb.tree = RB_ROOT_CACHED;
 	INIT_LIST_HEAD(&gpuvm->rb.list);
 
+	INIT_LIST_HEAD(&gpuvm->extobj.list);
+	spin_lock_init(&gpuvm->extobj.lock);
+
+	INIT_LIST_HEAD(&gpuvm->evict.list);
+	spin_lock_init(&gpuvm->evict.lock);
+
 	drm_gpuva_check_overflow(start_offset, range);
 	gpuvm->mm_start = start_offset;
 	gpuvm->mm_range = range;
@@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
 	WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
 	     "GPUVA tree is not empty, potentially leaking memory.\n");
 
+	WARN(!list_empty(&gpuvm->extobj.list), "Extobj list should be empty.\n");
+	WARN(!list_empty(&gpuvm->evict.list), "Evict list should be empty.\n");
+
 	drm_gem_private_object_fini(&gpuvm->d_obj);
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
 
+/**
+ * drm_gpuvm_prepare_objects() - prepare all associated BOs
+ * @gpuvm: the &drm_gpuvm
+ * @exec: the &drm_exec locking context
+ * @num_fences: the amount of &dma_fences to reserve
+ *
+ * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
+ * &drm_gpuvm contains mappings of.
+ *
+ * Using this function directly, it is the driver's responsibility to call
+ * drm_exec_init() and drm_exec_fini() accordingly.
+ *
+ * Note: This function is safe against concurrent insertion and removal of
+ * external objects, however it is not safe against concurrent usage itself.
+ *
+ * Drivers need to make sure to protect this case with either an outer VM lock
+ * or by calling drm_gpuvm_prepare_vm() before this function within the
+ * drm_exec_until_all_locked() loop, such that the GPUVM's dma-resv lock ensures
+ * mutual exclusion.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
+			  struct drm_exec *exec,
+			  unsigned int num_fences)
+{
+	struct drm_gpuvm_bo *vm_bo;
+	LIST_HEAD(extobjs);
+	int ret = 0;
+
+	for_each_vm_bo_in_list(gpuvm, extobj, &extobjs, vm_bo) {
+		ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
+		if (ret)
+			break;
+	}
+	/* Drop ref in case we break out of the loop. */
+	drm_gpuvm_bo_put(vm_bo);
+	restore_vm_bo_list(gpuvm, extobj, &extobjs);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
+
+/**
+ * drm_gpuvm_prepare_range() - prepare all BOs mapped within a given range
+ * @gpuvm: the &drm_gpuvm
+ * @exec: the &drm_exec locking context
+ * @addr: the start address within the VA space
+ * @range: the range to iterate within the VA space
+ * @num_fences: the amount of &dma_fences to reserve
+ *
+ * Calls drm_exec_prepare_obj() for all &drm_gem_objects mapped between @addr
+ * and @addr + @range.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
+			u64 addr, u64 range, unsigned int num_fences)
+{
+	struct drm_gpuva *va;
+	u64 end = addr + range;
+	int ret;
+
+	drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
+		struct drm_gem_object *obj = va->gem.obj;
+
+		ret = drm_exec_prepare_obj(exec, obj, num_fences);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
+
+/**
+ * drm_gpuvm_exec_lock() - lock all dma-resv of all associated BOs
+ * @vm_exec: the &drm_gpuvm_exec abstraction
+ * @num_fences: the amount of &dma_fences to reserve
+ * @interruptible: sleep interruptible if waiting
+ *
+ * Acquires all dma-resv locks of all &drm_gem_objects the given
+ * &drm_gpuvm contains mappings of.
+ *
+ * Additionally, when calling this function with struct drm_gpuvm_exec::extra
+ * being set the driver receives the given @fn callback to lock additional
+ * dma-resv in the context of the &drm_gpuvm_exec instance. Typically, drivers
+ * would call drm_exec_prepare_obj() from within this callback.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
+		    unsigned int num_fences,
+		    bool interruptible)
+{
+	struct drm_gpuvm *gpuvm = vm_exec->vm;
+	struct drm_exec *exec = &vm_exec->exec;
+	uint32_t flags;
+	int ret;
+
+	flags = DRM_EXEC_IGNORE_DUPLICATES |
+		(interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0);
+
+	drm_exec_init(exec, flags);
+
+	drm_exec_until_all_locked(exec) {
+		ret = drm_gpuvm_prepare_vm(gpuvm, exec, num_fences);
+		drm_exec_retry_on_contention(exec);
+		if (ret)
+			goto err;
+
+		ret = drm_gpuvm_prepare_objects(gpuvm, exec, num_fences);
+		drm_exec_retry_on_contention(exec);
+		if (ret)
+			goto err;
+
+		if (vm_exec->extra.fn) {
+			ret = vm_exec->extra.fn(vm_exec, num_fences);
+			drm_exec_retry_on_contention(exec);
+			if (ret)
+				goto err;
+		}
+	}
+
+	return 0;
+
+err:
+	drm_exec_fini(exec);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
+
+static int
+fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int num_fences)
+{
+	struct {
+		struct drm_gem_object **objs;
+		unsigned int num_objs;
+	} *args = vm_exec->extra.priv;
+
+	return drm_exec_prepare_array(&vm_exec->exec, args->objs,
+				      args->num_objs, num_fences);
+}
+
+/**
+ * drm_gpuvm_exec_lock_array() - lock all dma-resv of all associated BOs
+ * @vm_exec: the &drm_gpuvm_exec abstraction
+ * @objs: additional &drm_gem_objects to lock
+ * @num_objs: the number of additional &drm_gem_objects to lock
+ * @num_fences: the amount of &dma_fences to reserve
+ * @interruptible: sleep interruptible if waiting
+ *
+ * Acquires all dma-resv locks of all &drm_gem_objects the given &drm_gpuvm
+ * contains mappings of, plus the ones given through @objs.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
+			  struct drm_gem_object **objs,
+			  unsigned int num_objs,
+			  unsigned int num_fences,
+			  bool interruptible)
+{
+	struct {
+		struct drm_gem_object **objs;
+		unsigned int num_objs;
+	} args;
+
+	args.objs = objs;
+	args.num_objs = num_objs;
+
+	vm_exec->extra.fn = fn_lock_array;
+	vm_exec->extra.priv = &args;
+
+	return drm_gpuvm_exec_lock(vm_exec, num_fences, interruptible);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
+
+/**
+ * drm_gpuvm_exec_lock_range() - prepare all BOs mapped within a given range
+ * @vm_exec: the &drm_gpuvm_exec abstraction
+ * @addr: the start address within the VA space
+ * @range: the range to iterate within the VA space
+ * @num_fences: the amount of &dma_fences to reserve
+ * @interruptible: sleep interruptible if waiting
+ *
+ * Acquires all dma-resv locks of all &drm_gem_objects mapped between @addr and
+ * @addr + @range.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
+			  u64 addr, u64 range,
+			  unsigned int num_fences,
+			  bool interruptible)
+{
+	struct drm_gpuvm *gpuvm = vm_exec->vm;
+	struct drm_exec *exec = &vm_exec->exec;
+	uint32_t flags;
+	int ret;
+
+	flags = DRM_EXEC_IGNORE_DUPLICATES |
+		(interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0);
+
+	drm_exec_init(exec, flags);
+
+	drm_exec_until_all_locked(exec) {
+		ret = drm_gpuvm_prepare_range(gpuvm, exec, addr, range,
+					      num_fences);
+		drm_exec_retry_on_contention(exec);
+		if (ret)
+			goto err;
+	}
+
+	return ret;
+
+err:
+	drm_exec_fini(exec);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
+
+/**
+ * drm_gpuvm_validate() - validate all BOs marked as evicted
+ * @gpuvm: the &drm_gpuvm to validate evicted BOs
+ *
+ * Calls the &drm_gpuvm_ops.bo_validate callback for all evicted buffer
+ * objects being mapped in the given &drm_gpuvm.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
+{
+	const struct drm_gpuvm_ops *ops = gpuvm->ops;
+	struct drm_gpuvm_bo *vm_bo;
+	LIST_HEAD(evict);
+	int ret = 0;
+
+	if (unlikely(!ops || !ops->bo_validate))
+		return -ENOTSUPP;
+
+	for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
+		dma_resv_assert_held(vm_bo->obj->resv);
+		ret = ops->bo_validate(vm_bo->obj);
+		if (ret)
+			break;
+	}
+	/* Drop ref in case we break out of the loop. */
+	drm_gpuvm_bo_put(vm_bo);
+	restore_vm_bo_list(gpuvm, evict, &evict);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
+
+/**
+ * drm_gpuvm_resv_add_fence() - add fence to private and all extobj
+ * dma-resv
+ * @gpuvm: the &drm_gpuvm to add a fence to
+ * @exec: the &drm_exec locking context
+ * @fence: fence to add
+ * @private_usage: private dma-resv usage
+ * @extobj_usage: extobj dma-resv usage
+ */
+void
+drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
+			 struct drm_exec *exec,
+			 struct dma_fence *fence,
+			 enum dma_resv_usage private_usage,
+			 enum dma_resv_usage extobj_usage)
+{
+	struct drm_gem_object *obj;
+	unsigned long index;
+
+	drm_exec_for_each_locked_object(exec, index, obj) {
+		dma_resv_assert_held(obj->resv);
+		dma_resv_add_fence(obj->resv, fence,
+				   drm_gpuvm_is_extobj(gpuvm, obj) ?
+				   extobj_usage : private_usage);
+	}
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
+
 /**
  * drm_gpuvm_bo_create() - create a new instance of struct drm_gpuvm_bo
  * @gpuvm: The &drm_gpuvm the @obj is mapped in.
@@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
 	INIT_LIST_HEAD(&vm_bo->list.gpuva);
 	INIT_LIST_HEAD(&vm_bo->list.entry.gem);
 
+	INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
+	INIT_LIST_HEAD(&vm_bo->list.entry.evict);
+
 	drm_gem_object_get(obj);
 
 	return vm_bo;
@@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
 
 	drm_gem_gpuva_assert_lock_held(vm_bo->obj);
 
+	spin_lock(&gpuvm->extobj.lock);
+	list_del(&vm_bo->list.entry.extobj);
+	spin_unlock(&gpuvm->extobj.lock);
+
+	spin_lock(&gpuvm->evict.lock);
+	list_del(&vm_bo->list.entry.evict);
+	spin_unlock(&gpuvm->evict.lock);
+
 	list_del(&vm_bo->list.entry.gem);
 
 	drm_gem_object_put(obj);
@@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
  * @vm_bo: the &drm_gpuvm_bo to release the reference of
  *
  * This releases a reference to @vm_bo.
+ *
+ * If the reference count drops to zero, the &drm_gpuvm_bo is destroyed, which
+ * includes removing it from the GEM's gpuva list. Hence, if a call to this
+ * function can potentially let the reference count drop to zero, the caller
+ * must hold the dma-resv or driver specific GEM gpuva lock.
  */
 void
 drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
@@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
 
+static int __must_check
+drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
+{
+	return kref_get_unless_zero(&vm_bo->kref);
+}
+
 static struct drm_gpuvm_bo *
 __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
 		    struct drm_gem_object *obj)
@@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *__vm_bo)
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
 
+/**
+ * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its &drm_gpuvm's
+ * extobj list
+ * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's extobj list.
+ *
+ * Adds the given @vm_bo to its &drm_gpuvm's extobj list if it is not on the
+ * list already and if the corresponding &drm_gem_object actually is an
+ * external object.
+ */
+void
+drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
+{
+	struct drm_gpuvm *gpuvm = vm_bo->vm;
+
+	if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
+		drm_gpuvm_bo_list_add(vm_bo, extobj);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
+
+/**
+ * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to / from a
+ * &drm_gpuvm's evicted list
+ * @obj: the &drm_gem_object to add or remove
+ * @evict: indicates whether the object is evicted
+ *
+ * Adds the given &drm_gem_object to, or removes it from, the evicted list of
+ * every &drm_gpuvm containing a mapping of this &drm_gem_object.
+ */
+void
+drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
+{
+	struct drm_gpuvm_bo *vm_bo;
+
+	drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
+		if (evict)
+			drm_gpuvm_bo_list_add(vm_bo, evict);
+		else
+			drm_gpuvm_bo_list_del(vm_bo, evict);
+	}
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
+
 static int
 __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
 		   struct drm_gpuva *va)
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index afa50b9059a2..834bb6d6617e 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -26,10 +26,12 @@
  */
 
 #include <linux/list.h>
+#include <linux/dma-resv.h>
 #include <linux/rbtree.h>
 #include <linux/types.h>
 
 #include <drm/drm_gem.h>
+#include <drm/drm_exec.h>
 
 struct drm_gpuvm;
 struct drm_gpuvm_bo;
@@ -259,6 +261,38 @@ struct drm_gpuvm {
 	 * space
 	 */
 	struct dma_resv *resv;
+
+	/**
+	 * @extobj: structure holding the extobj list
+	 */
+	struct {
+		/**
+		 * @list: &list_head storing &drm_gpuvm_bos serving as
+		 * external objects
+		 */
+		struct list_head list;
+
+		/**
+		 * @lock: spinlock to protect the extobj list
+		 */
+		spinlock_t lock;
+	} extobj;
+
+	/**
+	 * @evict: structure holding the evict list and evict list lock
+	 */
+	struct {
+		/**
+		 * @list: &list_head storing &drm_gpuvm_bos currently being
+		 * evicted
+		 */
+		struct list_head list;
+
+		/**
+		 * @lock: spinlock to protect the evict list
+		 */
+		spinlock_t lock;
+	} evict;
 };
 
 void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
@@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
 		    const struct drm_gpuvm_ops *ops);
 void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
 
+/**
+ * drm_gpuvm_is_extobj() - indicates whether the given &drm_gem_object is an
+ * external object
+ * @gpuvm: the &drm_gpuvm to check
+ * @obj: the &drm_gem_object to check
+ *
+ * Returns: true if the &drm_gem_object's &dma_resv differs from the
+ * &drm_gpuvm's &dma_resv, false otherwise
+ */
+static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
+				       struct drm_gem_object *obj)
+{
+	return obj && obj->resv != gpuvm->resv;
+}
+
 static inline struct drm_gpuva *
 __drm_gpuva_next(struct drm_gpuva *va)
 {
@@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
 #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
 	list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
 
+/**
+ * struct drm_gpuvm_exec - &drm_gpuvm abstraction of &drm_exec
+ *
+ * This structure should be created on the stack as &drm_exec should be.
+ *
+ * Optionally, @extra can be set in order to lock additional &drm_gem_objects.
+ */
+struct drm_gpuvm_exec {
+	/**
+	 * @exec: the &drm_exec structure
+	 */
+	struct drm_exec exec;
+
+	/**
+	 * @vm: the &drm_gpuvm whose DMA reservations are to be locked
+	 */
+	struct drm_gpuvm *vm;
+
+	/**
+	 * @extra: Callback and corresponding private data for the driver to
+	 * lock arbitrary additional &drm_gem_objects.
+	 */
+	struct {
+		/**
+		 * @fn: The driver callback to lock additional &drm_gem_objects.
+		 */
+		int (*fn)(struct drm_gpuvm_exec *vm_exec,
+			  unsigned int num_fences);
+
+		/**
+		 * @priv: driver private data for the @fn callback
+		 */
+		void *priv;
+	} extra;
+};
+
+/**
+ * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
+ * @gpuvm: the &drm_gpuvm
+ * @exec: the &drm_exec context
+ * @num_fences: the amount of &dma_fences to reserve
+ *
+ * Calls drm_exec_prepare_obj() for the GPUVM's dummy &drm_gem_object.
+ *
+ * Using this function directly, it is the driver's responsibility to call
+ * drm_exec_init() and drm_exec_fini() accordingly.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+static inline int
+drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
+		     struct drm_exec *exec,
+		     unsigned int num_fences)
+{
+	return drm_exec_prepare_obj(exec, &gpuvm->d_obj, num_fences);
+}
+
+int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
+			      struct drm_exec *exec,
+			      unsigned int num_fences);
+
+int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
+			    struct drm_exec *exec,
+			    u64 addr, u64 range,
+			    unsigned int num_fences);
+
+int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
+			unsigned int num_fences,
+			bool interruptible);
+
+int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
+			      struct drm_gem_object **objs,
+			      unsigned int num_objs,
+			      unsigned int num_fences,
+			      bool interruptible);
+
+int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
+			      u64 addr, u64 range,
+			      unsigned int num_fences,
+			      bool interruptible);
+
+/**
+ * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
+ * @vm_exec: the &drm_gpuvm_exec abstraction
+ *
+ * Releases all dma-resv locks of all &drm_gem_objects previously acquired
+ * through drm_gpuvm_exec_lock() or its variants.
+ */
+static inline void
+drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
+{
+	drm_exec_fini(&vm_exec->exec);
+}
+
+int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
+void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
+			      struct drm_exec *exec,
+			      struct dma_fence *fence,
+			      enum dma_resv_usage private_usage,
+			      enum dma_resv_usage extobj_usage);
+
+/**
+ * drm_gpuvm_exec_resv_add_fence() - add fence to private and all extobj dma-resv
+ * @vm_exec: the &drm_gpuvm_exec abstraction
+ * @fence: fence to add
+ * @private_usage: private dma-resv usage
+ * @extobj_usage: extobj dma-resv usage
+ *
+ * See drm_gpuvm_resv_add_fence().
+ */
+static inline void
+drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
+			      struct dma_fence *fence,
+			      enum dma_resv_usage private_usage,
+			      enum dma_resv_usage extobj_usage)
+{
+	drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
+				 private_usage, extobj_usage);
+}
+
 /**
  * struct drm_gpuvm_bo - structure representing a &drm_gpuvm and
  * &drm_gem_object combination
@@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
 			 * gpuva list.
 			 */
 			struct list_head gem;
+
+			/**
+			 * @extobj: List entry to attach to the &drm_gpuvm's
+			 * extobj list.
+			 */
+			struct list_head extobj;
+
+			/**
+			 * @evict: List entry to attach to the &drm_gpuvm's evict
+			 * list.
+			 */
+			struct list_head evict;
 		} entry;
 	} list;
 };
@@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
 drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
 		  struct drm_gem_object *obj);
 
+void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
+void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
+
 /**
  * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of &drm_gpuva
  * @va__: &drm_gpuva structure to assign to in each iteration step
@@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
 	 * used.
 	 */
 	int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
+
+	/**
+	 * @bo_validate: called from drm_gpuvm_validate()
+	 *
+	 * Drivers receive this callback for every evicted &drm_gem_object being
+	 * mapped in the corresponding &drm_gpuvm.
+	 *
+	 * Typically, drivers would call their driver specific variant of
+	 * ttm_bo_validate() from within this callback.
+	 */
+	int (*bo_validate)(struct drm_gem_object *obj);
 };
 
 int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH drm-misc-next v3 7/7] drm/nouveau: GPUVM dma-resv/extobj handling, GEM validation
  2023-09-09 15:31 [PATCH drm-misc-next v3 0/7] [RFC] DRM GPUVA Manager GPU-VM features Danilo Krummrich
                   ` (5 preceding siblings ...)
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation Danilo Krummrich
@ 2023-09-09 15:31 ` Danilo Krummrich
  6 siblings, 0 replies; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-09 15:31 UTC (permalink / raw)
  To: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel, Danilo Krummrich

Make use of the DRM GPUVA manager's GPU-VM common dma-resv, external GEM
object tracking, dma-resv locking, evicted GEM object tracking and
validation features.
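
At its core, the driver-side glue amounts to providing a
&drm_gpuvm_ops::bo_validate implementation and reporting buffer moves via
drm_gpuvm_bo_evict(); a condensed, driver-agnostic sketch (helper names such
as my_bo_validate() and my_driver_validate_bo() are placeholders, not
nouveau's actual functions) roughly looks like:

	static int my_bo_validate(struct drm_gem_object *obj)
	{
		/* Driver specific (re-)validation, e.g. a ttm_bo_validate() wrapper. */
		return my_driver_validate_bo(obj);
	}

	static const struct drm_gpuvm_ops my_gpuvm_ops = {
		.bo_validate = my_bo_validate,
	};

	/* From the driver's TTM move callback, keep the evicted lists in sync. */
	drm_gpuvm_bo_evict(&bo->base, evict);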

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
 drivers/gpu/drm/nouveau/nouveau_bo.c    |  4 +-
 drivers/gpu/drm/nouveau/nouveau_exec.c  | 52 +++----------
 drivers/gpu/drm/nouveau/nouveau_exec.h  |  4 -
 drivers/gpu/drm/nouveau/nouveau_gem.c   |  4 +-
 drivers/gpu/drm/nouveau/nouveau_sched.h |  4 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c  | 99 ++++++++++++++++---------
 6 files changed, 82 insertions(+), 85 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 19cab37ac69c..18c91993dae1 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -1060,17 +1060,18 @@ nouveau_bo_move(struct ttm_buffer_object *bo, bool evict,
 {
 	struct nouveau_drm *drm = nouveau_bdev(bo->bdev);
 	struct nouveau_bo *nvbo = nouveau_bo(bo);
+	struct drm_gem_object *obj = &bo->base;
 	struct ttm_resource *old_reg = bo->resource;
 	struct nouveau_drm_tile *new_tile = NULL;
 	int ret = 0;
 
-
 	if (new_reg->mem_type == TTM_PL_TT) {
 		ret = nouveau_ttm_tt_bind(bo->bdev, bo->ttm, new_reg);
 		if (ret)
 			return ret;
 	}
 
+	drm_gpuvm_bo_evict(obj, evict);
 	nouveau_bo_move_ntfy(bo, new_reg);
 	ret = ttm_bo_wait_ctx(bo, ctx);
 	if (ret)
@@ -1135,6 +1136,7 @@ nouveau_bo_move(struct ttm_buffer_object *bo, bool evict,
 out_ntfy:
 	if (ret) {
 		nouveau_bo_move_ntfy(bo, bo->resource);
+		drm_gpuvm_bo_evict(obj, !evict);
 	}
 	return ret;
 }
diff --git a/drivers/gpu/drm/nouveau/nouveau_exec.c b/drivers/gpu/drm/nouveau/nouveau_exec.c
index b4239af29e5a..5f86043046f5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_exec.c
+++ b/drivers/gpu/drm/nouveau/nouveau_exec.c
@@ -1,7 +1,5 @@
 // SPDX-License-Identifier: MIT
 
-#include <drm/drm_exec.h>
-
 #include "nouveau_drv.h"
 #include "nouveau_gem.h"
 #include "nouveau_mem.h"
@@ -91,9 +89,6 @@ nouveau_exec_job_submit(struct nouveau_job *job)
 	struct nouveau_exec_job *exec_job = to_nouveau_exec_job(job);
 	struct nouveau_cli *cli = job->cli;
 	struct nouveau_uvmm *uvmm = nouveau_cli_uvmm(cli);
-	struct drm_exec *exec = &job->exec;
-	struct drm_gem_object *obj;
-	unsigned long index;
 	int ret;
 
 	ret = nouveau_fence_new(&exec_job->fence);
@@ -101,52 +96,29 @@ nouveau_exec_job_submit(struct nouveau_job *job)
 		return ret;
 
 	nouveau_uvmm_lock(uvmm);
-	drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
-			    DRM_EXEC_IGNORE_DUPLICATES);
-	drm_exec_until_all_locked(exec) {
-		struct drm_gpuva *va;
-
-		drm_gpuvm_for_each_va(va, &uvmm->base) {
-			if (unlikely(va == &uvmm->base.kernel_alloc_node))
-				continue;
-
-			ret = drm_exec_prepare_obj(exec, va->gem.obj, 1);
-			drm_exec_retry_on_contention(exec);
-			if (ret)
-				goto err_uvmm_unlock;
-		}
+	job->vm_exec.vm = &uvmm->base;
+	ret = drm_gpuvm_exec_lock(&job->vm_exec, 1, false);
+	if (ret) {
+		nouveau_uvmm_unlock(uvmm);
+		return ret;
 	}
 	nouveau_uvmm_unlock(uvmm);
 
-	drm_exec_for_each_locked_object(exec, index, obj) {
-		struct nouveau_bo *nvbo = nouveau_gem_object(obj);
-
-		ret = nouveau_bo_validate(nvbo, true, false);
-		if (ret)
-			goto err_exec_fini;
+	ret = drm_gpuvm_validate(&uvmm->base);
+	if (ret) {
+		drm_gpuvm_exec_unlock(&job->vm_exec);
+		return ret;
 	}
 
 	return 0;
-
-err_uvmm_unlock:
-	nouveau_uvmm_unlock(uvmm);
-err_exec_fini:
-	drm_exec_fini(exec);
-	return ret;
-
 }
 
 static void
 nouveau_exec_job_armed_submit(struct nouveau_job *job)
 {
-	struct drm_exec *exec = &job->exec;
-	struct drm_gem_object *obj;
-	unsigned long index;
-
-	drm_exec_for_each_locked_object(exec, index, obj)
-		dma_resv_add_fence(obj->resv, job->done_fence, job->resv_usage);
-
-	drm_exec_fini(exec);
+	drm_gpuvm_exec_resv_add_fence(&job->vm_exec, job->done_fence,
+				      job->resv_usage, job->resv_usage);
+	drm_gpuvm_exec_unlock(&job->vm_exec);
 }
 
 static struct dma_fence *
diff --git a/drivers/gpu/drm/nouveau/nouveau_exec.h b/drivers/gpu/drm/nouveau/nouveau_exec.h
index 778cacd90f65..b815de2428f3 100644
--- a/drivers/gpu/drm/nouveau/nouveau_exec.h
+++ b/drivers/gpu/drm/nouveau/nouveau_exec.h
@@ -3,16 +3,12 @@
 #ifndef __NOUVEAU_EXEC_H__
 #define __NOUVEAU_EXEC_H__
 
-#include <drm/drm_exec.h>
-
 #include "nouveau_drv.h"
 #include "nouveau_sched.h"
 
 struct nouveau_exec_job_args {
 	struct drm_file *file_priv;
 	struct nouveau_sched_entity *sched_entity;
-
-	struct drm_exec exec;
 	struct nouveau_channel *chan;
 
 	struct {
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index c0b10d8d3d03..b89b2494af98 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -111,7 +111,7 @@ nouveau_gem_object_open(struct drm_gem_object *gem, struct drm_file *file_priv)
 	if (vmm->vmm.object.oclass < NVIF_CLASS_VMM_NV50)
 		return 0;
 
-	if (nvbo->no_share && uvmm && &uvmm->resv != nvbo->bo.base.resv)
+	if (nvbo->no_share && uvmm && uvmm->base.resv != nvbo->bo.base.resv)
 		return -EPERM;
 
 	ret = ttm_bo_reserve(&nvbo->bo, false, false, NULL);
@@ -245,7 +245,7 @@ nouveau_gem_new(struct nouveau_cli *cli, u64 size, int align, uint32_t domain,
 		if (unlikely(!uvmm))
 			return -EINVAL;
 
-		resv = &uvmm->resv;
+		resv = uvmm->base.resv;
 	}
 
 	if (!(domain & (NOUVEAU_GEM_DOMAIN_VRAM | NOUVEAU_GEM_DOMAIN_GART)))
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.h b/drivers/gpu/drm/nouveau/nouveau_sched.h
index 27ac19792597..54379af6f925 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.h
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.h
@@ -5,7 +5,7 @@
 
 #include <linux/types.h>
 
-#include <drm/drm_exec.h>
+#include <drm/drm_gpuvm.h>
 #include <drm/gpu_scheduler.h>
 
 #include "nouveau_drv.h"
@@ -54,7 +54,7 @@ struct nouveau_job {
 	struct drm_file *file_priv;
 	struct nouveau_cli *cli;
 
-	struct drm_exec exec;
+	struct drm_gpuvm_exec vm_exec;
 	enum dma_resv_usage resv_usage;
 	struct dma_fence *done_fence;
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
index cf709afd2ac7..2fc13afaa624 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -438,8 +438,9 @@ nouveau_uvma_region_complete(struct nouveau_uvma_region *reg)
 static void
 op_map_prepare_unwind(struct nouveau_uvma *uvma)
 {
+	struct drm_gpuva *va = &uvma->va;
 	nouveau_uvma_gem_put(uvma);
-	drm_gpuva_remove(&uvma->va);
+	drm_gpuva_remove(va);
 	nouveau_uvma_free(uvma);
 }
 
@@ -468,6 +469,7 @@ nouveau_uvmm_sm_prepare_unwind(struct nouveau_uvmm *uvmm,
 			break;
 		case DRM_GPUVA_OP_REMAP: {
 			struct drm_gpuva_op_remap *r = &op->remap;
+			struct drm_gpuva *va = r->unmap->va;
 
 			if (r->next)
 				op_map_prepare_unwind(new->next);
@@ -475,7 +477,7 @@ nouveau_uvmm_sm_prepare_unwind(struct nouveau_uvmm *uvmm,
 			if (r->prev)
 				op_map_prepare_unwind(new->prev);
 
-			op_unmap_prepare_unwind(r->unmap->va);
+			op_unmap_prepare_unwind(va);
 			break;
 		}
 		case DRM_GPUVA_OP_UNMAP:
@@ -634,6 +636,7 @@ nouveau_uvmm_sm_prepare(struct nouveau_uvmm *uvmm,
 					goto unwind;
 				}
 			}
+
 			break;
 		}
 		case DRM_GPUVA_OP_REMAP: {
@@ -1146,13 +1149,44 @@ bind_link_gpuvas(struct bind_job_op *bop)
 	}
 }
 
+static int
+bind_lock_extra(struct drm_gpuvm_exec *vm_exec, unsigned int num_fences)
+{
+	struct nouveau_uvmm_bind_job *bind_job = vm_exec->extra.priv;
+	struct drm_exec *exec = &vm_exec->exec;
+	struct bind_job_op *op;
+	int ret;
+
+	list_for_each_op(op, &bind_job->ops) {
+		struct drm_gpuva_op *va_op;
+
+		if (IS_ERR_OR_NULL(op->ops))
+			continue;
+
+		drm_gpuva_for_each_op(va_op, op->ops) {
+			struct drm_gem_object *obj = op_gem_obj(va_op);
+
+			if (unlikely(!obj))
+				continue;
+
+			if (va_op->op != DRM_GPUVA_OP_UNMAP)
+				continue;
+
+			ret = drm_exec_prepare_obj(exec, obj, num_fences);
+			if (ret)
+				return ret;
+		}
+	}
+
+	return 0;
+}
+
 static int
 nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 {
 	struct nouveau_uvmm *uvmm = nouveau_cli_uvmm(job->cli);
 	struct nouveau_uvmm_bind_job *bind_job = to_uvmm_bind_job(job);
 	struct nouveau_sched_entity *entity = job->entity;
-	struct drm_exec *exec = &job->exec;
 	struct bind_job_op *op;
 	int ret;
 
@@ -1170,6 +1204,8 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 			dma_resv_unlock(obj->resv);
 			if (IS_ERR(op->vm_bo))
 				return PTR_ERR(op->vm_bo);
+
+			drm_gpuvm_bo_extobj_add(op->vm_bo);
 		}
 
 		ret = bind_validate_op(job, op);
@@ -1192,6 +1228,7 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 	 * unwind all GPU VA space changes on failure.
 	 */
 	nouveau_uvmm_lock(uvmm);
+
 	list_for_each_op(op, &bind_job->ops) {
 		switch (op->op) {
 		case OP_MAP_SPARSE:
@@ -1303,30 +1340,13 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 		}
 	}
 
-	drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
-			    DRM_EXEC_IGNORE_DUPLICATES);
-	drm_exec_until_all_locked(exec) {
-		list_for_each_op(op, &bind_job->ops) {
-			struct drm_gpuva_op *va_op;
+	job->vm_exec.vm = &uvmm->base;
+	job->vm_exec.extra.fn = bind_lock_extra;
+	job->vm_exec.extra.priv = bind_job;
 
-			if (IS_ERR_OR_NULL(op->ops))
-				continue;
-
-			drm_gpuva_for_each_op(va_op, op->ops) {
-				struct drm_gem_object *obj = op_gem_obj(va_op);
-
-				if (unlikely(!obj))
-					continue;
-
-				ret = drm_exec_prepare_obj(exec, obj, 1);
-				drm_exec_retry_on_contention(exec);
-				if (ret) {
-					op = list_last_op(&bind_job->ops);
-					goto unwind;
-				}
-			}
-		}
-	}
+	ret = drm_gpuvm_exec_lock(&job->vm_exec, 1, false);
+	if (ret)
+		goto unwind_continue;
 
 	list_for_each_op(op, &bind_job->ops) {
 		struct drm_gpuva_op *va_op;
@@ -1426,21 +1446,16 @@ nouveau_uvmm_bind_job_submit(struct nouveau_job *job)
 	}
 
 	nouveau_uvmm_unlock(uvmm);
-	drm_exec_fini(exec);
+	drm_gpuvm_exec_unlock(&job->vm_exec);
 	return ret;
 }
 
 static void
 nouveau_uvmm_bind_job_armed_submit(struct nouveau_job *job)
 {
-	struct drm_exec *exec = &job->exec;
-	struct drm_gem_object *obj;
-	unsigned long index;
-
-	drm_exec_for_each_locked_object(exec, index, obj)
-		dma_resv_add_fence(obj->resv, job->done_fence, job->resv_usage);
-
-	drm_exec_fini(exec);
+	drm_gpuvm_exec_resv_add_fence(&job->vm_exec, job->done_fence,
+				      job->resv_usage, job->resv_usage);
+	drm_gpuvm_exec_unlock(&job->vm_exec);
 }
 
 static struct dma_fence *
@@ -1832,6 +1847,18 @@ nouveau_uvmm_bo_unmap_all(struct nouveau_bo *nvbo)
 	}
 }
 
+static int
+nouveau_uvmm_bo_validate(struct drm_gem_object *obj)
+{
+	struct nouveau_bo *nvbo = nouveau_gem_object(obj);
+
+	return nouveau_bo_validate(nvbo, true, false);
+}
+
+static const struct drm_gpuvm_ops gpuvm_ops = {
+	.bo_validate = nouveau_uvmm_bo_validate,
+};
+
 int
 nouveau_uvmm_init(struct nouveau_uvmm *uvmm, struct nouveau_cli *cli,
 		  u64 kernel_managed_addr, u64 kernel_managed_size)
@@ -1868,7 +1895,7 @@ nouveau_uvmm_init(struct nouveau_uvmm *uvmm, struct nouveau_cli *cli,
 		       NOUVEAU_VA_SPACE_START,
 		       NOUVEAU_VA_SPACE_END,
 		       kernel_managed_addr, kernel_managed_size,
-		       NULL);
+		       &gpuvm_ops);
 
 	ret = nvif_vmm_ctor(&cli->mmu, "uvmm",
 			    cli->vmm.vmm.object.oclass, RAW,
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 1/7] drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 1/7] drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm Danilo Krummrich
@ 2023-09-09 18:23   ` kernel test robot
  0 siblings, 0 replies; 77+ messages in thread
From: kernel test robot @ 2023-09-09 18:23 UTC (permalink / raw)
  To: Danilo Krummrich, airlied, daniel, matthew.brost,
	thomas.hellstrom, sarah.walker, donald.robson, boris.brezillon,
	christian.koenig, faith.ekstrand
  Cc: oe-kbuild-all, nouveau, Danilo Krummrich, linux-kernel, dri-devel

Hi Danilo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 6bd3d8da51ca1ec97c724016466606aec7739b9f]

url:    https://github.com/intel-lab-lkp/linux/commits/Danilo-Krummrich/drm-gpuvm-rename-struct-drm_gpuva_manager-to-struct-drm_gpuvm/20230909-233346
base:   6bd3d8da51ca1ec97c724016466606aec7739b9f
patch link:    https://lore.kernel.org/r/20230909153125.30032-2-dakr%40redhat.com
patch subject: [PATCH drm-misc-next v3 1/7] drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm
config: riscv-defconfig (https://download.01.org/0day-ci/archive/20230910/202309100242.Xp5Sk9EY-lkp@intel.com/config)
compiler: riscv64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230910/202309100242.Xp5Sk9EY-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309100242.Xp5Sk9EY-lkp@intel.com/

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/drm_gpuvm.c: In function '__drm_gpuvm_sm_map':
>> drivers/gpu/drm/drm_gpuvm.c:1079:39: warning: variable 'prev' set but not used [-Wunused-but-set-variable]
    1079 |         struct drm_gpuva *va, *next, *prev = NULL;
         |                                       ^~~~


vim +/prev +1079 drivers/gpu/drm/drm_gpuvm.c

e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1072  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1073  static int
5d2000a9816e19 drivers/gpu/drm/drm_gpuvm.c     Danilo Krummrich 2023-09-09  1074  __drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm,
5d2000a9816e19 drivers/gpu/drm/drm_gpuvm.c     Danilo Krummrich 2023-09-09  1075  		   const struct drm_gpuvm_ops *ops, void *priv,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1076  		   u64 req_addr, u64 req_range,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1077  		   struct drm_gem_object *req_obj, u64 req_offset)
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1078  {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20 @1079  	struct drm_gpuva *va, *next, *prev = NULL;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1080  	u64 req_end = req_addr + req_range;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1081  	int ret;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1082  
5d2000a9816e19 drivers/gpu/drm/drm_gpuvm.c     Danilo Krummrich 2023-09-09  1083  	if (unlikely(!drm_gpuva_range_valid(gpuvm, req_addr, req_range)))
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1084  		return -EINVAL;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1085  
5d2000a9816e19 drivers/gpu/drm/drm_gpuvm.c     Danilo Krummrich 2023-09-09  1086  	drm_gpuvm_for_each_va_range_safe(va, next, gpuvm, req_addr, req_end) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1087  		struct drm_gem_object *obj = va->gem.obj;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1088  		u64 offset = va->gem.offset;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1089  		u64 addr = va->va.addr;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1090  		u64 range = va->va.range;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1091  		u64 end = addr + range;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1092  		bool merge = !!va->gem.obj;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1093  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1094  		if (addr == req_addr) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1095  			merge &= obj == req_obj &&
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1096  				 offset == req_offset;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1097  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1098  			if (end == req_end) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1099  				ret = op_unmap_cb(ops, priv, va, merge);
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1100  				if (ret)
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1101  					return ret;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1102  				break;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1103  			}
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1104  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1105  			if (end < req_end) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1106  				ret = op_unmap_cb(ops, priv, va, merge);
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1107  				if (ret)
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1108  					return ret;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1109  				goto next;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1110  			}
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1111  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1112  			if (end > req_end) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1113  				struct drm_gpuva_op_map n = {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1114  					.va.addr = req_end,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1115  					.va.range = range - req_range,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1116  					.gem.obj = obj,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1117  					.gem.offset = offset + req_range,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1118  				};
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1119  				struct drm_gpuva_op_unmap u = {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1120  					.va = va,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1121  					.keep = merge,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1122  				};
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1123  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1124  				ret = op_remap_cb(ops, priv, NULL, &n, &u);
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1125  				if (ret)
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1126  					return ret;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1127  				break;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1128  			}
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1129  		} else if (addr < req_addr) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1130  			u64 ls_range = req_addr - addr;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1131  			struct drm_gpuva_op_map p = {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1132  				.va.addr = addr,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1133  				.va.range = ls_range,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1134  				.gem.obj = obj,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1135  				.gem.offset = offset,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1136  			};
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1137  			struct drm_gpuva_op_unmap u = { .va = va };
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1138  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1139  			merge &= obj == req_obj &&
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1140  				 offset + ls_range == req_offset;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1141  			u.keep = merge;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1142  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1143  			if (end == req_end) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1144  				ret = op_remap_cb(ops, priv, &p, NULL, &u);
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1145  				if (ret)
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1146  					return ret;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1147  				break;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1148  			}
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1149  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1150  			if (end < req_end) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1151  				ret = op_remap_cb(ops, priv, &p, NULL, &u);
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1152  				if (ret)
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1153  					return ret;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1154  				goto next;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1155  			}
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1156  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1157  			if (end > req_end) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1158  				struct drm_gpuva_op_map n = {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1159  					.va.addr = req_end,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1160  					.va.range = end - req_end,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1161  					.gem.obj = obj,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1162  					.gem.offset = offset + ls_range +
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1163  						      req_range,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1164  				};
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1165  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1166  				ret = op_remap_cb(ops, priv, &p, &n, &u);
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1167  				if (ret)
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1168  					return ret;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1169  				break;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1170  			}
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1171  		} else if (addr > req_addr) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1172  			merge &= obj == req_obj &&
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1173  				 offset == req_offset +
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1174  					   (addr - req_addr);
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1175  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1176  			if (end == req_end) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1177  				ret = op_unmap_cb(ops, priv, va, merge);
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1178  				if (ret)
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1179  					return ret;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1180  				break;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1181  			}
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1182  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1183  			if (end < req_end) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1184  				ret = op_unmap_cb(ops, priv, va, merge);
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1185  				if (ret)
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1186  					return ret;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1187  				goto next;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1188  			}
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1189  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1190  			if (end > req_end) {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1191  				struct drm_gpuva_op_map n = {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1192  					.va.addr = req_end,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1193  					.va.range = end - req_end,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1194  					.gem.obj = obj,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1195  					.gem.offset = offset + req_end - addr,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1196  				};
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1197  				struct drm_gpuva_op_unmap u = {
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1198  					.va = va,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1199  					.keep = merge,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1200  				};
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1201  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1202  				ret = op_remap_cb(ops, priv, NULL, &n, &u);
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1203  				if (ret)
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1204  					return ret;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1205  				break;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1206  			}
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1207  		}
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1208  next:
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1209  		prev = va;
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1210  	}
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1211  
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1212  	return op_map_cb(ops, priv,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1213  			 req_addr, req_range,
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1214  			 req_obj, req_offset);
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1215  }
e6303f323b1ad9 drivers/gpu/drm/drm_gpuva_mgr.c Danilo Krummrich 2023-07-20  1216  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation Danilo Krummrich
@ 2023-09-09 20:16   ` kernel test robot
  2023-09-11 10:35   ` Boris Brezillon
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 77+ messages in thread
From: kernel test robot @ 2023-09-09 20:16 UTC (permalink / raw)
  To: Danilo Krummrich, airlied, daniel, matthew.brost,
	thomas.hellstrom, sarah.walker, donald.robson, boris.brezillon,
	christian.koenig, faith.ekstrand
  Cc: oe-kbuild-all, nouveau, Danilo Krummrich, linux-kernel, dri-devel

Hi Danilo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 6bd3d8da51ca1ec97c724016466606aec7739b9f]

url:    https://github.com/intel-lab-lkp/linux/commits/Danilo-Krummrich/drm-gpuvm-rename-struct-drm_gpuva_manager-to-struct-drm_gpuvm/20230909-233346
base:   6bd3d8da51ca1ec97c724016466606aec7739b9f
patch link:    https://lore.kernel.org/r/20230909153125.30032-7-dakr%40redhat.com
patch subject: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
config: riscv-defconfig (https://download.01.org/0day-ci/archive/20230910/202309100424.uNXGR9d4-lkp@intel.com/config)
compiler: riscv64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230910/202309100424.uNXGR9d4-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309100424.uNXGR9d4-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/drm_gpuvm.c:734: warning: Function parameter or member '__gpuvm' not described in 'for_each_vm_bo_in_list'
>> drivers/gpu/drm/drm_gpuvm.c:734: warning: Function parameter or member '__list_name' not described in 'for_each_vm_bo_in_list'
>> drivers/gpu/drm/drm_gpuvm.c:734: warning: Function parameter or member '__local_list' not described in 'for_each_vm_bo_in_list'
>> drivers/gpu/drm/drm_gpuvm.c:734: warning: Function parameter or member '__vm_bo' not described in 'for_each_vm_bo_in_list'
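
The warnings point at the kernel-doc block of for_each_vm_bo_in_list(), which
documents no parameters. A minimal fix (the wording below is only a suggestion;
the first three descriptions mirror the get_next_vm_bo_from_list()
documentation) would be:

	/**
	 * for_each_vm_bo_in_list() - internal vm_bo list iterator
	 * @__gpuvm: The GPU VM
	 * @__list_name: The name of the list we're iterating on
	 * @__local_list: A pointer to the local list used to store already iterated items
	 * @__vm_bo: The current &drm_gpuvm_bo iteration cursor
	 *
	 * ...
	 */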


vim +734 drivers/gpu/drm/drm_gpuvm.c

    32	
    33	/**
    34	 * DOC: Overview
    35	 *
    36	 * The DRM GPU VA Manager, represented by struct drm_gpuvm keeps track of a
    37	 * GPU's virtual address (VA) space and manages the corresponding virtual
    38	 * mappings represented by &drm_gpuva objects. It also keeps track of the
    39	 * mapping's backing &drm_gem_object buffers.
    40	 *
    41	 * &drm_gem_object buffers maintain a list of &drm_gpuva objects representing
    42	 * all existent GPU VA mappings using this &drm_gem_object as backing buffer.
    43	 *
    44	 * GPU VAs can be flagged as sparse, such that drivers may use GPU VAs to also
    45	 * keep track of sparse PTEs in order to support Vulkan 'Sparse Resources'.
    46	 *
    47	 * The GPU VA manager internally uses a rb-tree to manage the
    48	 * &drm_gpuva mappings within a GPU's virtual address space.
    49	 *
    50	 * The &drm_gpuvm structure contains a special &drm_gpuva representing the
    51	 * portion of VA space reserved by the kernel. This node is initialized together
    52	 * with the GPU VA manager instance and removed when the GPU VA manager is
    53	 * destroyed.
    54	 *
    55	 * In a typical application drivers would embed struct drm_gpuvm and
    56	 * struct drm_gpuva within their own driver specific structures; this way the
    57	 * GPU VA manager does not perform any memory allocations of its own, nor does
    58	 * it allocate &drm_gpuva entries.
    59	 *
    60	 * The data structures needed to store &drm_gpuvas within the &drm_gpuvm are
    61	 * contained within struct drm_gpuva already. Hence, for inserting &drm_gpuva
    62	 * entries from within dma-fence signalling critical sections it is enough to
    63	 * pre-allocate the &drm_gpuva structures.
    64	 *
    65	 * In order to connect a struct drm_gpuva to its backing &drm_gem_object each
    66	 * &drm_gem_object maintains a list of &drm_gpuvm_bo structures, and each
    67	 * &drm_gpuvm_bo contains a list of &drm_gpuva structures.
    68	 *
    69	 * A &drm_gpuvm_bo is an abstraction that represents a combination of a
    70	 * &drm_gpuvm and a &drm_gem_object. Every such combination should be unique.
    71	 * This is ensured by the API through drm_gpuvm_bo_obtain() and
    72	 * drm_gpuvm_bo_obtain_prealloc() which first look into the corresponding
    73	 * &drm_gem_object list of &drm_gpuvm_bos for an existing instance of this
    74	 * particular combination. If no such instance exists, a new one is created and
    75	 * linked to the &drm_gem_object.
    76	 *
    77	 * &drm_gpuvm_bo structures, since unique for a given &drm_gpuvm, are also used
    78	 * as entries for the &drm_gpuvm's lists of external and evicted objects. Those
    79	 * lists are maintained in order to accelerate locking of dma-resv locks and
    80	 * validation of evicted objects bound in a &drm_gpuvm. For instance, all
    81	 * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be locked by calling
    82	 * drm_gpuvm_exec_lock(). Once locked drivers can call drm_gpuvm_validate() in
    83	 * order to validate all evicted &drm_gem_objects. It is also possible to lock
    84	 * additional &drm_gem_objects by providing the corresponding parameters to
    85	 * drm_gpuvm_exec_lock() as well as open code the &drm_exec loop while making
    86	 * use of helper functions such as drm_gpuvm_prepare_range() or
    87	 * drm_gpuvm_prepare_objects().
    88	 *
    89	 * Every bound &drm_gem_object is treated as external object when its &dma_resv
    90	 * structure is different than the &drm_gpuvm's common &dma_resv structure.
    91	 */
    92	
    93	/**
    94	 * DOC: Split and Merge
    95	 *
    96	 * Besides its capability to manage and represent a GPU VA space, the
    97	 * GPU VA manager also provides functions to let the &drm_gpuvm calculate a
    98	 * sequence of operations to satisfy a given map or unmap request.
    99	 *
   100	 * Therefore the DRM GPU VA manager provides an algorithm implementing splitting
   101	 * and merging of existent GPU VA mappings with the ones that are requested to
   102	 * be mapped or unmapped. This feature is required by the Vulkan API to
   103	 * implement Vulkan 'Sparse Memory Bindings' - driver UAPIs often refer to this
   104	 * as VM BIND.
   105	 *
   106	 * Drivers can call drm_gpuvm_sm_map() to receive a sequence of callbacks
   107	 * containing map, unmap and remap operations for a given newly requested
   108	 * mapping. The sequence of callbacks represents the set of operations to
   109	 * execute in order to integrate the new mapping cleanly into the current state
   110	 * of the GPU VA space.
   111	 *
   112	 * Depending on how the new GPU VA mapping intersects with the existent mappings
   113	 * of the GPU VA space the &drm_gpuvm_ops callbacks contain an arbitrary number
   114	 * of unmap operations, a maximum of two remap operations and a single map
   115	 * operation. The caller might receive no callback at all if no operation is
   116	 * required, e.g. if the requested mapping already exists in the exact same way.
   117	 *
   118	 * The single map operation represents the original map operation requested by
   119	 * the caller.
   120	 *
   121	 * &drm_gpuva_op_unmap contains a 'keep' field, which indicates whether the
   122	 * &drm_gpuva to unmap is physically contiguous with the original mapping
   123	 * request. Optionally, if 'keep' is set, drivers may keep the actual page table
   124	 * entries for this &drm_gpuva, adding the missing page table entries only and
   125	 * update the &drm_gpuvm's view of things accordingly.
   126	 *
   127	 * Drivers may do the same optimization, namely delta page table updates, also
   128	 * for remap operations. This is possible since &drm_gpuva_op_remap consists of
   129	 * one unmap operation and one or two map operations, such that drivers can
   130	 * derive the page table update delta accordingly.
   131	 *
   132	 * Note that there can't be more than two existent mappings to split up, one at
   133	 * the beginning and one at the end of the new mapping, hence there is a
   134	 * maximum of two remap operations.
   135	 *
   136	 * Analogous to drm_gpuvm_sm_map() drm_gpuvm_sm_unmap() uses &drm_gpuvm_ops to
   137	 * call back into the driver in order to unmap a range of GPU VA space. The
   138	 * logic behind this function is way simpler though: For all existent mappings
   139	 * enclosed by the given range unmap operations are created. For mappings which
   140	 * are only partially located within the given range, remap operations are
   141	 * created such that those mappings are split up and re-mapped partially.
   142	 *
   143	 * As an alternative to drm_gpuvm_sm_map() and drm_gpuvm_sm_unmap(),
   144	 * drm_gpuvm_sm_map_ops_create() and drm_gpuvm_sm_unmap_ops_create() can be used
   145	 * to directly obtain an instance of struct drm_gpuva_ops containing a list of
   146	 * &drm_gpuva_op, which can be iterated with drm_gpuva_for_each_op(). This list
   147	 * contains the &drm_gpuva_ops analogous to the callbacks one would receive when
   148	 * calling drm_gpuvm_sm_map() or drm_gpuvm_sm_unmap(). While this way requires
   149	 * more memory (to allocate the &drm_gpuva_ops), it provides drivers a way to
   150	 * iterate the &drm_gpuva_op multiple times, e.g. once in a context where memory
   151	 * allocations are possible (e.g. to allocate GPU page tables) and once in the
   152	 * dma-fence signalling critical path.
   153	 *
   154	 * To update the &drm_gpuvm's view of the GPU VA space drm_gpuva_insert() and
   155	 * drm_gpuva_remove() may be used. These functions can safely be used from
   156	 * &drm_gpuvm_ops callbacks originating from drm_gpuvm_sm_map() or
   157	 * drm_gpuvm_sm_unmap(). However, it might be more convenient to use the
   158	 * provided helper functions drm_gpuva_map(), drm_gpuva_remap() and
   159	 * drm_gpuva_unmap() instead.
   160	 *
   161	 * The following diagram depicts the basic relationships of existent GPU VA
   162	 * mappings, a newly requested mapping and the resulting mappings as implemented
   163	 * by drm_gpuvm_sm_map() - it doesn't cover any arbitrary combinations of these.
   164	 *
   165	 * 1) Requested mapping is identical. Replace it, but indicate the backing PTEs
   166	 *    could be kept.
   167	 *
   168	 *    ::
   169	 *
   170	 *	     0     a     1
   171	 *	old: |-----------| (bo_offset=n)
   172	 *
   173	 *	     0     a     1
   174	 *	req: |-----------| (bo_offset=n)
   175	 *
   176	 *	     0     a     1
   177	 *	new: |-----------| (bo_offset=n)
   178	 *
   179	 *
   180	 * 2) Requested mapping is identical, except for the BO offset, hence replace
   181	 *    the mapping.
   182	 *
   183	 *    ::
   184	 *
   185	 *	     0     a     1
   186	 *	old: |-----------| (bo_offset=n)
   187	 *
   188	 *	     0     a     1
   189	 *	req: |-----------| (bo_offset=m)
   190	 *
   191	 *	     0     a     1
   192	 *	new: |-----------| (bo_offset=m)
   193	 *
   194	 *
   195	 * 3) Requested mapping is identical, except for the backing BO, hence replace
   196	 *    the mapping.
   197	 *
   198	 *    ::
   199	 *
   200	 *	     0     a     1
   201	 *	old: |-----------| (bo_offset=n)
   202	 *
   203	 *	     0     b     1
   204	 *	req: |-----------| (bo_offset=n)
   205	 *
   206	 *	     0     b     1
   207	 *	new: |-----------| (bo_offset=n)
   208	 *
   209	 *
   210	 * 4) Existent mapping is a left aligned subset of the requested one, hence
   211	 *    replace the existent one.
   212	 *
   213	 *    ::
   214	 *
   215	 *	     0  a  1
   216	 *	old: |-----|       (bo_offset=n)
   217	 *
   218	 *	     0     a     2
   219	 *	req: |-----------| (bo_offset=n)
   220	 *
   221	 *	     0     a     2
   222	 *	new: |-----------| (bo_offset=n)
   223	 *
   224	 *    .. note::
   225	 *       We expect to see the same result for a request with a different BO
   226	 *       and/or non-contiguous BO offset.
   227	 *
   228	 *
   229	 * 5) Requested mapping's range is a left aligned subset of the existent one,
   230	 *    but backed by a different BO. Hence, map the requested mapping and split
   231	 *    the existent one adjusting its BO offset.
   232	 *
   233	 *    ::
   234	 *
   235	 *	     0     a     2
   236	 *	old: |-----------| (bo_offset=n)
   237	 *
   238	 *	     0  b  1
   239	 *	req: |-----|       (bo_offset=n)
   240	 *
   241	 *	     0  b  1  a' 2
   242	 *	new: |-----|-----| (b.bo_offset=n, a.bo_offset=n+1)
   243	 *
   244	 *    .. note::
   245	 *       We expect to see the same result for a request with a different BO
   246	 *       and/or non-contiguous BO offset.
   247	 *
   248	 *
   249	 * 6) Existent mapping is a superset of the requested mapping. Split it up, but
   250	 *    indicate that the backing PTEs could be kept.
   251	 *
   252	 *    ::
   253	 *
   254	 *	     0     a     2
   255	 *	old: |-----------| (bo_offset=n)
   256	 *
   257	 *	     0  a  1
   258	 *	req: |-----|       (bo_offset=n)
   259	 *
   260	 *	     0  a  1  a' 2
   261	 *	new: |-----|-----| (a.bo_offset=n, a'.bo_offset=n+1)
   262	 *
   263	 *
   264	 * 7) Requested mapping's range is a right aligned subset of the existent one,
   265	 *    but backed by a different BO. Hence, map the requested mapping and split
   266	 *    the existent one, without adjusting the BO offset.
   267	 *
   268	 *    ::
   269	 *
   270	 *	     0     a     2
   271	 *	old: |-----------| (bo_offset=n)
   272	 *
   273	 *	           1  b  2
   274	 *	req:       |-----| (bo_offset=m)
   275	 *
   276	 *	     0  a  1  b  2
   277	 *	new: |-----|-----| (a.bo_offset=n,b.bo_offset=m)
   278	 *
   279	 *
   280	 * 8) Existent mapping is a superset of the requested mapping. Split it up, but
   281	 *    indicate that the backing PTEs could be kept.
   282	 *
   283	 *    ::
   284	 *
   285	 *	     0     a     2
   286	 *	old: |-----------| (bo_offset=n)
   287	 *
   288	 *	           1  a  2
   289	 *	req:       |-----| (bo_offset=n+1)
   290	 *
   291	 *	     0  a' 1  a  2
   292	 *	new: |-----|-----| (a'.bo_offset=n, a.bo_offset=n+1)
   293	 *
   294	 *
   295	 * 9) Existent mapping is overlapped at the end by the requested mapping backed
   296	 *    by a different BO. Hence, map the requested mapping and split up the
   297	 *    existent one, without adjusting the BO offset.
   298	 *
   299	 *    ::
   300	 *
   301	 *	     0     a     2
   302	 *	old: |-----------|       (bo_offset=n)
   303	 *
   304	 *	           1     b     3
   305	 *	req:       |-----------| (bo_offset=m)
   306	 *
   307	 *	     0  a  1     b     3
   308	 *	new: |-----|-----------| (a.bo_offset=n,b.bo_offset=m)
   309	 *
   310	 *
   311	 * 10) Existent mapping is overlapped by the requested mapping, both having the
   312	 *     same backing BO with a contiguous offset. Indicate the backing PTEs of
   313	 *     the old mapping could be kept.
   314	 *
   315	 *     ::
   316	 *
   317	 *	      0     a     2
   318	 *	 old: |-----------|       (bo_offset=n)
   319	 *
   320	 *	            1     a     3
   321	 *	 req:       |-----------| (bo_offset=n+1)
   322	 *
   323	 *	      0  a' 1     a     3
   324	 *	 new: |-----|-----------| (a'.bo_offset=n, a.bo_offset=n+1)
   325	 *
   326	 *
   327	 * 11) Requested mapping's range is a centered subset of the existent one
   328	 *     having a different backing BO. Hence, map the requested mapping and split
   329	 *     up the existent one in two mappings, adjusting the BO offset of the right
   330	 *     one accordingly.
   331	 *
   332	 *     ::
   333	 *
   334	 *	      0        a        3
   335	 *	 old: |-----------------| (bo_offset=n)
   336	 *
   337	 *	            1  b  2
   338	 *	 req:       |-----|       (bo_offset=m)
   339	 *
   340	 *	      0  a  1  b  2  a' 3
   341	 *	 new: |-----|-----|-----| (a.bo_offset=n,b.bo_offset=m,a'.bo_offset=n+2)
   342	 *
   343	 *
   344	 * 12) Requested mapping is a contiguous subset of the existent one. Split it
   345	 *     up, but indicate that the backing PTEs could be kept.
   346	 *
   347	 *     ::
   348	 *
   349	 *	      0        a        3
   350	 *	 old: |-----------------| (bo_offset=n)
   351	 *
   352	 *	            1  a  2
   353	 *	 req:       |-----|       (bo_offset=n+1)
   354	 *
   355	 *	      0  a' 1  a  2 a'' 3
   356	 *	 new: |-----|-----|-----| (a'.bo_offset=n, a.bo_offset=n+1, a''.bo_offset=n+2)
   357	 *
   358	 *
   359	 * 13) Existent mapping is a right aligned subset of the requested one, hence
   360	 *     replace the existent one.
   361	 *
   362	 *     ::
   363	 *
   364	 *	            1  a  2
   365	 *	 old:       |-----| (bo_offset=n+1)
   366	 *
   367	 *	      0     a     2
   368	 *	 req: |-----------| (bo_offset=n)
   369	 *
   370	 *	      0     a     2
   371	 *	 new: |-----------| (bo_offset=n)
   372	 *
   373	 *     .. note::
   374	 *        We expect to see the same result for a request with a different bo
   375	 *        and/or non-contiguous bo_offset.
   376	 *
   377	 *
   378	 * 14) Existent mapping is a centered subset of the requested one, hence
   379	 *     replace the existent one.
   380	 *
   381	 *     ::
   382	 *
   383	 *	            1  a  2
   384	 *	 old:       |-----| (bo_offset=n+1)
   385	 *
   386	 *	      0        a       3
   387	 *	 req: |----------------| (bo_offset=n)
   388	 *
   389	 *	      0        a       3
   390	 *	 new: |----------------| (bo_offset=n)
   391	 *
   392	 *     .. note::
   393	 *        We expect to see the same result for a request with a different bo
   394	 *        and/or non-contiguous bo_offset.
   395	 *
   396	 *
   397	 * 15) Existent mapping is overlapped at the beginning by the requested mapping
   398	 *     backed by a different BO. Hence, map the requested mapping and split up
   399	 *     the existent one, adjusting its BO offset accordingly.
   400	 *
   401	 *     ::
   402	 *
   403	 *	            1     a     3
   404	 *	 old:       |-----------| (bo_offset=n)
   405	 *
   406	 *	      0     b     2
   407	 *	 req: |-----------|       (bo_offset=m)
   408	 *
   409	 *	      0     b     2  a' 3
   410	 *	 new: |-----------|-----| (b.bo_offset=m,a.bo_offset=n+2)
   411	 */
   412	
   413	/**
   414	 * DOC: Locking
   415	 *
   416	 * Generally, the GPU VA manager does not take care of locking itself; it is
   417	 * the driver's responsibility to take care of locking. Drivers might want to
   418	 * protect the following operations: inserting, removing and iterating
   419	 * &drm_gpuva objects as well as generating all kinds of operations, such as
   420	 * split / merge or prefetch.
   421	 *
   422	 * The GPU VA manager also does not take care of the locking of the backing
   423	 * &drm_gem_object buffers GPU VA lists and &drm_gpuvm_bo abstractions by
   424	 * itself; drivers are responsible to enforce mutual exclusion using either the
   425	 * GEMs dma_resv lock or alternatively a driver specific external lock. For the
   426	 * latter see also drm_gem_gpuva_set_lock().
   427	 *
   428	 * However, the GPU VA manager contains lockdep checks to ensure callers of its
   429	 * API hold the corresponding lock whenever the &drm_gem_object's GPU VA list is
   430	 * accessed by functions such as drm_gpuva_link() or drm_gpuva_unlink(), but
   431	 * also drm_gpuvm_bo_obtain() and drm_gpuvm_bo_put().
   432	 *
   433	 * The latter is required since on creation and destruction of a &drm_gpuvm_bo
   434	 * the &drm_gpuvm_bo is attached / removed from the &drm_gem_object's gpuva list.
   435	 * Subsequent calls to drm_gpuvm_bo_obtain() for the same &drm_gpuvm and
   436	 * &drm_gem_object must be able to observe previous creations and destructions
   437	 * of &drm_gpuvm_bos in order to keep instances unique.
   438	 *
   439	 * The &drm_gpuvm's lists for keeping track of external and evicted objects are
   440	 * protected against concurrent insertion / removal and iteration internally.
   441	 *
   442	 * However, drivers still need to protect concurrent calls to functions
   443	 * iterating those lists, such as drm_gpuvm_validate() and
   444	 * drm_gpuvm_prepare_objects(). Every such function contains a particular
   445	 * comment and lockdep checks if possible.
   446	 *
   447	 * Functions adding or removing entries from those lists, such as
   448	 * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be called with external
   449	 * locks being held, e.g. in order to avoid the corresponding list being
   450	 * (safely) modified while potentially being iterated by other API functions.
   451	 * However, this is entirely optional.
   452	 */
   453	
   454	/**
   455	 * DOC: Examples
   456	 *
   457	 * This section gives two examples on how to let the DRM GPUVA Manager generate
   458	 * &drm_gpuva_op in order to satisfy a given map or unmap request and how to
   459	 * make use of them.
   460	 *
   461	 * The below code is strictly limited to illustrate the generic usage pattern.
   462	 * To maintain simplicity, it doesn't make use of any abstractions for common
   463	 * code, different (asynchronous) stages with fence signalling critical paths,
   464	 * any other helpers or error handling in terms of freeing memory and dropping
   465	 * previously taken locks.
   466	 *
   467	 * 1) Obtain a list of &drm_gpuva_op to create a new mapping::
   468	 *
   469	 *	// Allocates a new &drm_gpuva.
   470	 *	struct drm_gpuva * driver_gpuva_alloc(void);
   471	 *
   472	 *	// Typically drivers would embed the &drm_gpuvm and &drm_gpuva
   473	 *	// structure in individual driver structures and lock the dma-resv with
   474	 *	// drm_exec or similar helpers.
   475	 *	int driver_mapping_create(struct drm_gpuvm *gpuvm,
   476	 *				  u64 addr, u64 range,
   477	 *				  struct drm_gem_object *obj, u64 offset)
   478	 *	{
   479	 *		struct drm_gpuva_ops *ops;
   480	 *		struct drm_gpuva_op *op;
   481	 *		struct drm_gpuvm_bo *vm_bo;
   482	 *
   483	 *		driver_lock_va_space();
   484	 *		ops = drm_gpuvm_sm_map_ops_create(gpuvm, addr, range,
   485	 *						  obj, offset);
   486	 *		if (IS_ERR(ops))
   487	 *			return PTR_ERR(ops);
   488	 *
   489	 *		vm_bo = drm_gpuvm_bo_obtain(gpuvm, obj);
   490	 *		if (IS_ERR(vm_bo))
   491	 *			return PTR_ERR(vm_bo);
   492	 *
   493	 *		drm_gpuva_for_each_op(op, ops) {
   494	 *			struct drm_gpuva *va;
   495	 *
   496	 *			switch (op->op) {
   497	 *			case DRM_GPUVA_OP_MAP:
   498	 *				va = driver_gpuva_alloc();
   499	 *				if (!va)
   500	 *					; // unwind previous VA space updates,
   501	 *					  // free memory and unlock
   502	 *
   503	 *				driver_vm_map();
   504	 *				drm_gpuva_map(gpuvm, va, &op->map);
   505	 *				drm_gpuva_link(va, vm_bo);
   506	 *
   507	 *				break;
   508	 *			case DRM_GPUVA_OP_REMAP: {
   509	 *				struct drm_gpuva *prev = NULL, *next = NULL;
   510	 *
   511	 *				va = op->remap.unmap->va;
   512	 *
   513	 *				if (op->remap.prev) {
   514	 *					prev = driver_gpuva_alloc();
   515	 *					if (!prev)
   516	 *						; // unwind previous VA space
   517	 *						  // updates, free memory and
   518	 *						  // unlock
   519	 *				}
   520	 *
   521	 *				if (op->remap.next) {
   522	 *					next = driver_gpuva_alloc();
   523	 *					if (!next)
   524	 *						; // unwind previous VA space
   525	 *						  // updates, free memory and
   526	 *						  // unlock
   527	 *				}
   528	 *
   529	 *				driver_vm_remap();
   530	 *				drm_gpuva_remap(prev, next, &op->remap);
   531	 *
   532	 *				if (prev)
   533	 *					drm_gpuva_link(prev, va->vm_bo);
   534	 *				if (next)
   535	 *					drm_gpuva_link(next, va->vm_bo);
   536	 *				drm_gpuva_unlink(va);
   537	 *
   538	 *				break;
   539	 *			}
   540	 *			case DRM_GPUVA_OP_UNMAP:
   541	 *				va = op->unmap->va;
   542	 *
   543	 *				driver_vm_unmap();
   544	 *				drm_gpuva_unlink(va);
   545	 *				drm_gpuva_unmap(&op->unmap);
   546	 *
   547	 *				break;
   548	 *			default:
   549	 *				break;
   550	 *			}
   551	 *		}
   552	 *		drm_gpuvm_bo_put(vm_bo);
   553	 *		driver_unlock_va_space();
   554	 *
   555	 *		return 0;
   556	 *	}
   557	 *
   558	 * 2) Receive a callback for each &drm_gpuva_op to create a new mapping::
   559	 *
   560	 *	struct driver_context {
   561	 *		struct drm_gpuvm *gpuvm;
   562	 *		struct drm_gpuvm_bo *vm_bo;
   563	 *		struct drm_gpuva *new_va;
   564	 *		struct drm_gpuva *prev_va;
   565	 *		struct drm_gpuva *next_va;
   566	 *	};
   567	 *
   568	 *	// ops to pass to drm_gpuvm_init()
   569	 *	static const struct drm_gpuvm_ops driver_gpuvm_ops = {
   570	 *		.sm_step_map = driver_gpuva_map,
   571	 *		.sm_step_remap = driver_gpuva_remap,
   572	 *		.sm_step_unmap = driver_gpuva_unmap,
   573	 *	};
   574	 *
   575	 *	// Typically drivers would embed the &drm_gpuvm and &drm_gpuva
   576	 *	// structure in individual driver structures and lock the dma-resv with
   577	 *	// drm_exec or similar helpers.
   578	 *	int driver_mapping_create(struct drm_gpuvm *gpuvm,
   579	 *				  u64 addr, u64 range,
   580	 *				  struct drm_gem_object *obj, u64 offset)
   581	 *	{
   582	 *		struct driver_context ctx;
   583	 *		struct drm_gpuvm_bo *vm_bo;
   584	 *		struct drm_gpuva_ops *ops;
   585	 *		struct drm_gpuva_op *op;
   586	 *		int ret = 0;
   587	 *
   588	 *		ctx.gpuvm = gpuvm;
   589	 *
   590	 *		ctx.new_va = kzalloc(sizeof(*ctx.new_va), GFP_KERNEL);
   591	 *		ctx.prev_va = kzalloc(sizeof(*ctx.prev_va), GFP_KERNEL);
   592	 *		ctx.next_va = kzalloc(sizeof(*ctx.next_va), GFP_KERNEL);
   593	 *		ctx.vm_bo = drm_gpuvm_bo_create(gpuvm, obj);
   594	 *		if (!ctx.new_va || !ctx.prev_va || !ctx.next_va || !ctx.vm_bo) {
   595	 *			ret = -ENOMEM;
   596	 *			goto out;
   597	 *		}
   598	 *
   599	 *		// Typically protected with a driver specific GEM gpuva lock
   600	 *		// used in the fence signaling path for drm_gpuva_link() and
   601	 *		// drm_gpuva_unlink(), hence pre-allocate.
   602	 *		ctx.vm_bo = drm_gpuvm_bo_obtain_prealloc(ctx.vm_bo);
   603	 *
   604	 *		driver_lock_va_space();
   605	 *		ret = drm_gpuvm_sm_map(gpuvm, &ctx, addr, range, obj, offset);
   606	 *		driver_unlock_va_space();
   607	 *
   608	 *	out:
   609	 *		drm_gpuvm_bo_put(ctx.vm_bo);
   610	 *		kfree(ctx.new_va);
   611	 *		kfree(ctx.prev_va);
   612	 *		kfree(ctx.next_va);
   613	 *		return ret;
   614	 *	}
   615	 *
   616	 *	int driver_gpuva_map(struct drm_gpuva_op *op, void *__ctx)
   617	 *	{
   618	 *		struct driver_context *ctx = __ctx;
   619	 *
   620	 *		drm_gpuva_map(ctx->gpuvm, ctx->new_va, &op->map);
   621	 *
   622	 *		drm_gpuva_link(ctx->new_va, ctx->vm_bo);
   623	 *
   624	 *		// prevent the new GPUVA from being freed in
   625	 *		// driver_mapping_create()
   626	 *		ctx->new_va = NULL;
   627	 *
   628	 *		return 0;
   629	 *	}
   630	 *
   631	 *	int driver_gpuva_remap(struct drm_gpuva_op *op, void *__ctx)
   632	 *	{
   633	 *		struct driver_context *ctx = __ctx;
   634	 *		struct drm_gpuva *va = op->remap.unmap->va;
   635	 *
   636	 *		drm_gpuva_remap(ctx->prev_va, ctx->next_va, &op->remap);
   637	 *
   638	 *		if (op->remap.prev) {
   639	 *			drm_gpuva_link(ctx->prev_va, va->vm_bo);
   640	 *			ctx->prev_va = NULL;
   641	 *		}
   642	 *
   643	 *		if (op->remap.next) {
   644	 *			drm_gpuva_link(ctx->next_va, va->vm_bo);
   645	 *			ctx->next_va = NULL;
   646	 *		}
   647	 *
   648	 *		drm_gpuva_unlink(va);
   649	 *		kfree(va);
   650	 *
   651	 *		return 0;
   652	 *	}
   653	 *
   654	 *	int driver_gpuva_unmap(struct drm_gpuva_op *op, void *__ctx)
   655	 *	{
   656	 *		drm_gpuva_unlink(op->unmap.va);
   657	 *		drm_gpuva_unmap(&op->unmap);
   658	 *		kfree(op->unmap.va);
   659	 *
   660	 *		return 0;
   661	 *	}
   662	 */
   663	
   664	/**
   665	 * get_next_vm_bo_from_list() - get the next vm_bo element
   666	 * @__gpuvm: The GPU VM
   667	 * @__list_name: The name of the list we're iterating on
   668	 * @__local_list: A pointer to the local list used to store already iterated items
   669	 * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
   670	 *
   671	 * This helper is here to provide lockless list iteration. Lockless as in, the
   672	 * iterator releases the lock immediately after picking the first element from
   673	 * the list, so list insertion deletion can happen concurrently.
   674	 *
   675	 * Elements popped from the original list are kept in a local list, so removal
   676	 * and is_empty checks can still happen while we're iterating the list.
   677	 */
   678	#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
   679		({										\
   680			struct drm_gpuvm_bo *__vm_bo;						\
   681												\
   682			drm_gpuvm_bo_put(__prev_vm_bo);						\
   683												\
   684			spin_lock(&(__gpuvm)->__list_name.lock);				\
   685			while (!list_empty(&(__gpuvm)->__list_name.list)) {			\
   686				__vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,	\
   687							   struct drm_gpuvm_bo,			\
   688							   list.entry.__list_name);		\
   689				if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {			\
   690					list_move_tail(&(__vm_bo)->list.entry.__list_name,	\
   691						       __local_list);				\
   692					break;							\
   693				} else {							\
   694					list_del_init(&(__vm_bo)->list.entry.__list_name);	\
   695					__vm_bo = NULL;						\
   696				}								\
   697			}									\
   698			spin_unlock(&(__gpuvm)->__list_name.lock);				\
   699												\
   700			__vm_bo;								\
   701		})
   702	
   703	/**
   704	 * for_each_vm_bo_in_list() - internal vm_bo list iterator
   705	 *
   706	 * This helper is here to provide lockless list iteration. Lockless as in, the
   707	 * iterator releases the lock immediately after picking the first element from the
   708	 * list, so list insertion and deletion can happen concurrently.
   709	 *
   710	 * Typical use:
   711	 *
   712	 *	struct drm_gpuvm_bo *vm_bo;
   713	 *	LIST_HEAD(my_local_list);
   714	 *
   715	 *	ret = 0;
   716	 *	drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>, &my_local_list, vm_bo) {
   717	 *		ret = do_something_with_vm_bo(..., vm_bo);
   718	 *		if (ret)
   719	 *			break;
   720	 *	}
   721	 *	drm_gpuvm_bo_put(vm_bo);
   722	 *	drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);
   723	 *
   724	 *
   725	 * Only used for internal list iterations, not meant to be exposed to the outside
   726	 * world.
   727	 */
   728	#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)	\
   729		for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
   730							__local_list, NULL);		\
   731		     __vm_bo;								\
   732		     __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
   733							__local_list, __vm_bo))		\
 > 734	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation Danilo Krummrich
  2023-09-09 20:16   ` kernel test robot
@ 2023-09-11 10:35   ` Boris Brezillon
  2023-09-11 16:23     ` Danilo Krummrich
  2023-09-11 12:54   ` Boris Brezillon
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 77+ messages in thread
From: Boris Brezillon @ 2023-09-11 10:35 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

Hello Danilo,

On Sat,  9 Sep 2023 17:31:13 +0200
Danilo Krummrich <dakr@redhat.com> wrote:


> @@ -632,6 +661,131 @@
>   *	}
>   */
>  
> +/**
> + * get_next_vm_bo_from_list() - get the next vm_bo element
> + * @__gpuvm: The GPU VM
> + * @__list_name: The name of the list we're iterating on
> + * @__local_list: A pointer to the local list used to store already iterated items
> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> + *
> + * This helper is here to provide lockless list iteration. Lockless as in, the
> + * iterator releases the lock immediately after picking the first element from
> + * the list, so list insertion deletion can happen concurrently.
> + *
> + * Elements popped from the original list are kept in a local list, so removal
> + * and is_empty checks can still happen while we're iterating the list.
> + */
> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
> +	({										\
> +		struct drm_gpuvm_bo *__vm_bo;						\
> +											\
> +		drm_gpuvm_bo_put(__prev_vm_bo);						\
> +											\
> +		spin_lock(&(__gpuvm)->__list_name.lock);				\

I'm tempted to add a drm_gpuvm::<list_name>::local_list field, so we
can catch concurrent iterations with something like:

		if (!(__gpuvm)->__list_name.local_list)
			(__gpuvm)->__list_name.local_list = __local_list;
		else
			WARN_ON((__gpuvm)->__list_name.local_list != __local_list);

with (__gpuvm)->__list_name.local_list being restored to NULL
in restore_vm_bo_list().
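
The missing pieces of that suggestion (the field declaration and the reset in
restore_vm_bo_list()) would roughly look like this; the field name and its
placement are assumptions on top of the per-list layout this patch already uses:

	/* extra member in the per-list struct of drm_gpuvm, next to .list/.lock */
	struct list_head *local_list;	/* non-NULL while an iteration is in flight */

	/* in restore_vm_bo_list(), while __list_name.lock is held */
	WARN_ON((__gpuvm)->__list_name.local_list != (__local_list));
	(__gpuvm)->__list_name.local_list = NULL;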

> +		while (!list_empty(&(__gpuvm)->__list_name.list)) {			\
> +			__vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,	\
> +						   struct drm_gpuvm_bo,			\
> +						   list.entry.__list_name);		\
> +			if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {			\
> +				list_move_tail(&(__vm_bo)->list.entry.__list_name,	\
> +					       __local_list);				\
> +				break;							\
> +			} else {							\
> +				list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> +				__vm_bo = NULL;						\
> +			}								\
> +		}									\
> +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> +											\
> +		__vm_bo;								\
> +	})
> +
> +/**
> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
> + *
> + * This helper is here to provide lockless list iteration. Lockless as in, the
> + * iterator releases the lock immediately after picking the first element from the
> + * list, so list insertion and deletion can happen concurrently.
> + *
> + * Typical use:
> + *
> + *	struct drm_gpuvm_bo *vm_bo;
> + *	LIST_HEAD(my_local_list);
> + *
> + *	ret = 0;
> + *	drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>, &my_local_list, vm_bo) {
> + *		ret = do_something_with_vm_bo(..., vm_bo);
> + *		if (ret)
> + *			break;
> + *	}
> + *	drm_gpuvm_bo_put(vm_bo);
> + *	drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);

The names in this example and the helper names don't match.
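
That is, the documented example uses drm_gpuvm_for_each_vm_bo() and
drm_gpuvm_restore_vm_bo_list(), while the macros introduced by this patch are
named for_each_vm_bo_in_list() and restore_vm_bo_list(). With the actual names
the example would read:

	struct drm_gpuvm_bo *vm_bo;
	LIST_HEAD(my_local_list);

	ret = 0;
	for_each_vm_bo_in_list(gpuvm, <list_name>, &my_local_list, vm_bo) {
		ret = do_something_with_vm_bo(..., vm_bo);
		if (ret)
			break;
	}
	drm_gpuvm_bo_put(vm_bo);
	restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);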

> + *
> + *
> + * Only used for internal list iterations, not meant to be exposed to the outside
> + * world.
> + */
> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)	\
> +	for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> +						__local_list, NULL);		\
> +	     __vm_bo;								\
> +	     __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> +						__local_list, __vm_bo))		\
> +
> +/**
> + * restore_vm_bo_list() - move vm_bo elements back to their original list
> + * @__gpuvm: The GPU VM
> + * @__list_name: The name of the list we're iterating on
> + * @__local_list: A pointer to the local list used to store already iterated items
> + *
> + * When we're done iterating a vm_bo list, we should call restore_vm_bo_list()
> + * to restore the original state and let new iterations take place.
> + */
> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)				\
> +	do {										\
> +		/* Merge back the two lists, moving local list elements to the		\
> +		 * head to preserve previous ordering, in case it matters.		\
> +		 */									\
> +		spin_lock(&(__gpuvm)->__list_name.lock);				\
> +		list_splice(__local_list, &(__gpuvm)->__list_name.list);		\
> +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> +	} while (0)
> +/**
> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given list
> + * @__vm_bo: the &drm_gpuvm_bo
> + * @__list_name: the name of the list to insert into
> + *
> + * Inserts the given @__vm_bo into the list specified by @__list_name and
> + * increases the vm_bo's reference count.
> + */
> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)				\
> +	do {									\
> +		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
> +		if (list_empty(&(__vm_bo)->list.entry.__list_name))		\
> +			list_add_tail(&(__vm_bo)->list.entry.__list_name,	\
> +				      &(__vm_bo)->vm->__list_name.list);	\
> +		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
> +	} while (0)
> +
> +/**
> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given list
> + * @__vm_bo: the &drm_gpuvm_bo
> + * @__list_name: the name of the list to insert into
> + *
> + * Removes the given @__vm_bo from the list specified by @__list_name and
> + * decreases the vm_bo's reference count.
> + */
> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)				\
> +	do {									\
> +		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
> +		if (!list_empty(&(__vm_bo)->list.entry.__list_name))		\
> +			list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> +		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
> +	} while (0)
> +
> +static int __must_check
> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);

I see no obvious reason to have a forward declaration for this helper,
if we decide to keep it, let's at least move the declaration here.


> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>  
>  	drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>  
> +	spin_lock(&gpuvm->extobj.lock);
> +	list_del(&vm_bo->list.entry.extobj);
> +	spin_unlock(&gpuvm->extobj.lock);
> +
> +	spin_lock(&gpuvm->evict.lock);
> +	list_del(&vm_bo->list.entry.evict);
> +	spin_unlock(&gpuvm->evict.lock);
> +
>  	list_del(&vm_bo->list.entry.gem);
>  
>  	drm_gem_object_put(obj);
> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>   * @vm_bo: the &drm_gpuvm_bo to release the reference of
>   *
>   * This releases a reference to @vm_bo.
> + *
> + * If the reference count drops to zero, the &gpuvm_bo is destroyed, which
> + * includes removing it from the GEMs gpuva list. Hence, if a call to this
> + * function can potentially let the reference count to zero the caller must
> + * hold the dma-resv or driver specific GEM gpuva lock.

Looks like this should have been part of the previous patch. I hate
the fact we have to worry about GEM gpuva lock being held when we call
_put() only if the ref drops to zero though. I think I'd feel more
comfortable if the function was named differently. Maybe _return() or
_release() to match the _obtain() function, where the object is inserted
in the GEM vm_bo list. I would also do the lock_is_held() check
unconditionally, move the list removal in this function with a del_init(),
and have a WARN_ON(!list_empty) in vm_bo_destroy().

>   */
>  void
>  drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>  }
>  EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>  
> +static int __must_check
> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
> +{
> +	return kref_get_unless_zero(&vm_bo->kref);

Not convinced this helper is needed. It's only used once, and I
don't think we'll need it elsewhere.

> +}
> +
>  static struct drm_gpuvm_bo *
>  __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>  		    struct drm_gem_object *obj)


Regards,

Boris

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 4/7] drm/gpuvm: common dma-resv per struct drm_gpuvm
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 4/7] drm/gpuvm: common dma-resv per struct drm_gpuvm Danilo Krummrich
@ 2023-09-11 12:00   ` Boris Brezillon
  2023-09-11 16:16     ` Danilo Krummrich
  0 siblings, 1 reply; 77+ messages in thread
From: Boris Brezillon @ 2023-09-11 12:00 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

On Sat,  9 Sep 2023 17:31:11 +0200
Danilo Krummrich <dakr@redhat.com> wrote:

> @@ -240,9 +240,22 @@ struct drm_gpuvm {
>  	 * @ops: &drm_gpuvm_ops providing the split/merge steps to drivers
>  	 */
>  	const struct drm_gpuvm_ops *ops;
> +
> +	/**
> +	 * @d_obj: Dummy GEM object; used internally to pass the GPU VMs
> +	 * dma-resv to &drm_exec.
> +	 */
> +	struct drm_gem_object d_obj;
> +
> +	/**
> +	 * @resv: the &dma_resv for &drm_gem_objects mapped in this GPU VA
> +	 * space
> +	 */
> +	struct dma_resv *resv;

Hm, I'd be tempted to drop this field and add a drm_gpuvm_resv() helper
returning vm->d_obj.resv;

>  };

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation Danilo Krummrich
  2023-09-09 20:16   ` kernel test robot
  2023-09-11 10:35   ` Boris Brezillon
@ 2023-09-11 12:54   ` Boris Brezillon
  2023-09-11 14:45   ` Boris Brezillon
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 77+ messages in thread
From: Boris Brezillon @ 2023-09-11 12:54 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

On Sat,  9 Sep 2023 17:31:13 +0200
Danilo Krummrich <dakr@redhat.com> wrote:

> +/**
> + * get_next_vm_bo_from_list() - get the next vm_bo element
> + * @__gpuvm: The GPU VM
> + * @__list_name: The name of the list we're iterating on
> + * @__local_list: A pointer to the local list used to store already iterated items
> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> + *
> + * This helper is here to provide lockless list iteration. Lockless as in, the
> + * iterator releases the lock immediately after picking the first element from
> + * the list, so list insertion deletion can happen concurrently.
> + *
> + * Elements popped from the original list are kept in a local list, so removal
> + * and is_empty checks can still happen while we're iterating the list.
> + */
> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
> +	({										\
> +		struct drm_gpuvm_bo *__vm_bo;						\

Missing NULL assignment here.
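
I.e. the declaration would become

	struct drm_gpuvm_bo *__vm_bo = NULL;

so the statement expression doesn't return an uninitialized pointer when the
list is empty.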

> +											\
> +		drm_gpuvm_bo_put(__prev_vm_bo);						\
> +											\
> +		spin_lock(&(__gpuvm)->__list_name.lock);				\
> +		while (!list_empty(&(__gpuvm)->__list_name.list)) {			\
> +			__vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,	\
> +						   struct drm_gpuvm_bo,			\
> +						   list.entry.__list_name);		\
> +			if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {			\
> +				list_move_tail(&(__vm_bo)->list.entry.__list_name,	\
> +					       __local_list);				\
> +				break;							\
> +			} else {							\
> +				list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> +				__vm_bo = NULL;						\
> +			}								\
> +		}									\
> +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> +											\
> +		__vm_bo;								\
> +	})
> +
> +/**
> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
> + *
> + * This helper is here to provide lockless list iteration. Lockless as in, the
> + * iterator releases the lock immediately after picking the first element from the
> + * list, so list insertion and deletion can happen concurrently.
> + *
> + * Typical use:
> + *
> + *	struct drm_gpuvm_bo *vm_bo;
> + *	LIST_HEAD(my_local_list);
> + *
> + *	ret = 0;
> + *	drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>, &my_local_list, vm_bo) {
> + *		ret = do_something_with_vm_bo(..., vm_bo);
> + *		if (ret)
> + *			break;
> + *	}
> + *	drm_gpuvm_bo_put(vm_bo);
> + *	drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);

Might be worth mentioning that the vm_bo pointer shouldn't be
re-assigned from inside for loop, otherwise
the next get_next_vm_bo_from_list() will be passed a wrong prev_vm_bo.
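
For instance, a (hypothetical) misuse like

	for_each_vm_bo_in_list(gpuvm, extobj, &my_local_list, vm_bo) {
		/* ... */
		vm_bo = some_other_vm_bo;
	}

would hand the bogus pointer to the next get_next_vm_bo_from_list() invocation,
which starts by calling drm_gpuvm_bo_put() on it.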

> + *
> + *
> + * Only used for internal list iterations, not meant to be exposed to the outside
> + * world.
> + */
> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)	\
> +	for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> +						__local_list, NULL);		\
> +	     __vm_bo;								\
> +	     __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> +						__local_list, __vm_bo))		\

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 2/7] drm/gpuvm: allow building as module
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 2/7] drm/gpuvm: allow building as module Danilo Krummrich
@ 2023-09-11 13:09   ` Christian König
  0 siblings, 0 replies; 77+ messages in thread
From: Christian König @ 2023-09-11 13:09 UTC (permalink / raw)
  To: Danilo Krummrich, airlied, daniel, matthew.brost,
	thomas.hellstrom, sarah.walker, donald.robson, boris.brezillon,
	faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel

Am 09.09.23 um 17:31 schrieb Danilo Krummrich:
> Currently, the DRM GPUVM does not have any core dependencies preventing
> a module build.
>
> Also, new features from subsequent patches require helpers (namely
> drm_exec) which can be built as module.
>
> Signed-off-by: Danilo Krummrich <dakr@redhat.com>

Reviewed-by: Christian König <christian.koenig@amd.com> for this one here.

I hope that I can get somebody to work on the remaining patches with the 
end goal of using this in amdgpu as well.

Regards,
Christian.

> ---
>   drivers/gpu/drm/Kconfig         | 7 +++++++
>   drivers/gpu/drm/Makefile        | 2 +-
>   drivers/gpu/drm/drm_gpuvm.c     | 3 +++
>   drivers/gpu/drm/nouveau/Kconfig | 1 +
>   4 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index ab9ef1c20349..0f78a03e4e84 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -216,6 +216,13 @@ config DRM_EXEC
>   	help
>   	  Execution context for command submissions
>   
> +config DRM_GPUVM
> +	tristate
> +	depends on DRM && DRM_EXEC
> +	help
> +	  GPU-VM representation providing helpers to manage a GPUs virtual
> +	  address space
> +
>   config DRM_BUDDY
>   	tristate
>   	depends on DRM
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 7a84b3cddeab..8e1bde059170 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -45,7 +45,6 @@ drm-y := \
>   	drm_vblank.o \
>   	drm_vblank_work.o \
>   	drm_vma_manager.o \
> -	drm_gpuvm.o \
>   	drm_writeback.o
>   drm-$(CONFIG_DRM_LEGACY) += \
>   	drm_agpsupport.o \
> @@ -81,6 +80,7 @@ obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += drm_panel_orientation_quirks.o
>   #
>   #
>   obj-$(CONFIG_DRM_EXEC) += drm_exec.o
> +obj-$(CONFIG_DRM_GPUVM) += drm_gpuvm.o
>   
>   obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o
>   
> diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
> index de1a69bc4a44..aae086deaa2b 100644
> --- a/drivers/gpu/drm/drm_gpuvm.c
> +++ b/drivers/gpu/drm/drm_gpuvm.c
> @@ -1723,3 +1723,6 @@ drm_gpuva_ops_free(struct drm_gpuvm *gpuvm,
>   	kfree(ops);
>   }
>   EXPORT_SYMBOL_GPL(drm_gpuva_ops_free);
> +
> +MODULE_DESCRIPTION("DRM GPUVM");
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/gpu/drm/nouveau/Kconfig b/drivers/gpu/drm/nouveau/Kconfig
> index c52e8096cca4..1e6aaf95ff7c 100644
> --- a/drivers/gpu/drm/nouveau/Kconfig
> +++ b/drivers/gpu/drm/nouveau/Kconfig
> @@ -11,6 +11,7 @@ config DRM_NOUVEAU
>   	select DRM_TTM
>   	select DRM_TTM_HELPER
>   	select DRM_EXEC
> +	select DRM_GPUVM
>   	select DRM_SCHED
>   	select I2C
>   	select I2C_ALGOBIT


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation Danilo Krummrich
                     ` (2 preceding siblings ...)
  2023-09-11 12:54   ` Boris Brezillon
@ 2023-09-11 14:45   ` Boris Brezillon
  2023-09-11 16:30     ` Danilo Krummrich
  2023-09-12 16:20   ` Thomas Hellström
  2023-09-14 13:48   ` Thomas Hellström
  5 siblings, 1 reply; 77+ messages in thread
From: Boris Brezillon @ 2023-09-11 14:45 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

On Sat,  9 Sep 2023 17:31:13 +0200
Danilo Krummrich <dakr@redhat.com> wrote:

> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>  
>  	drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>  
> +	spin_lock(&gpuvm->extobj.lock);
> +	list_del(&vm_bo->list.entry.extobj);
> +	spin_unlock(&gpuvm->extobj.lock);
> +
> +	spin_lock(&gpuvm->evict.lock);
> +	list_del(&vm_bo->list.entry.evict);
> +	spin_unlock(&gpuvm->evict.lock);
> +
>  	list_del(&vm_bo->list.entry.gem);
>  
>  	drm_gem_object_put(obj);

I ran into a UAF situation when the drm_gpuvm_bo object is the last
owner of obj, because the lock that's supposed to be held when calling
this function (drm_gem_gpuva_assert_lock_held() call above), belongs to
obj (either obj->resv, or a driver specific lock that's attached to the
driver-specific GEM object). I worked around it by taking a ref to obj
before calling lock()+drm_gpuvm_bo_put()+unlock(), and releasing it
after I'm done with the lock, but that just feels wrong.
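
Roughly, the workaround described above (the driver lock helpers are
placeholders):

	drm_gem_object_get(obj);
	driver_gem_gpuva_lock(obj);	/* or dma_resv_lock(obj->resv, NULL) */
	drm_gpuvm_bo_put(vm_bo);	/* may free vm_bo and drop its obj reference */
	driver_gem_gpuva_unlock(obj);
	drm_gem_object_put(obj);	/* only now may obj (and its lock) go away */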

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 4/7] drm/gpuvm: common dma-resv per struct drm_gpuvm
  2023-09-11 12:00   ` Boris Brezillon
@ 2023-09-11 16:16     ` Danilo Krummrich
  0 siblings, 0 replies; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-11 16:16 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

On Mon, Sep 11, 2023 at 02:00:35PM +0200, Boris Brezillon wrote:
> On Sat,  9 Sep 2023 17:31:11 +0200
> Danilo Krummrich <dakr@redhat.com> wrote:
> 
> > @@ -240,9 +240,22 @@ struct drm_gpuvm {
> >  	 * @ops: &drm_gpuvm_ops providing the split/merge steps to drivers
> >  	 */
> >  	const struct drm_gpuvm_ops *ops;
> > +
> > +	/**
> > +	 * @d_obj: Dummy GEM object; used internally to pass the GPU VMs
> > +	 * dma-resv to &drm_exec.
> > +	 */
> > +	struct drm_gem_object d_obj;
> > +
> > +	/**
> > +	 * @resv: the &dma_resv for &drm_gem_objects mapped in this GPU VA
> > +	 * space
> > +	 */
> > +	struct dma_resv *resv;
> 
> Hm, I'd be tempted to drop this field and add a drm_gpuvm_resv() helper
> returning vm->d_obj.resv;

Makes sense, will do that for V4.

> 
> >  };
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-11 10:35   ` Boris Brezillon
@ 2023-09-11 16:23     ` Danilo Krummrich
  0 siblings, 0 replies; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-11 16:23 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

On Mon, Sep 11, 2023 at 12:35:26PM +0200, Boris Brezillon wrote:
> Hello Danilo,
> 
> On Sat,  9 Sep 2023 17:31:13 +0200
> Danilo Krummrich <dakr@redhat.com> wrote:
> 
> 
> > @@ -632,6 +661,131 @@
> >   *	}
> >   */
> >  
> > +/**
> > + * get_next_vm_bo_from_list() - get the next vm_bo element
> > + * @__gpuvm: The GPU VM
> > + * @__list_name: The name of the list we're iterating on
> > + * @__local_list: A pointer to the local list used to store already iterated items
> > + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> > + *
> > + * This helper is here to provide lockless list iteration. Lockless as in, the
> > + * iterator releases the lock immediately after picking the first element from
> > + * the list, so list insertion deletion can happen concurrently.
> > + *
> > + * Elements popped from the original list are kept in a local list, so removal
> > + * and is_empty checks can still happen while we're iterating the list.
> > + */
> > +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
> > +	({										\
> > +		struct drm_gpuvm_bo *__vm_bo;						\
> > +											\
> > +		drm_gpuvm_bo_put(__prev_vm_bo);						\
> > +											\
> > +		spin_lock(&(__gpuvm)->__list_name.lock);				\
> 
> I'm tempted to add a drm_gpuvm::<list_name>::local_list field, so we
> can catch concurrent iterations with something like:
> 
> 		if (!(__gpuvm)->__list_name.local_list)
> 			(__gpuvm)->__list_name.local_list = __local_list;
> 		else
> 			WARN_ON((__gpuvm)->__list_name.local_list != __local_list);
> 
> with (__gpuvm)->__list_name.local_list being restored to NULL
> in restore_vm_bo_list().
> 
> > +		while (!list_empty(&(__gpuvm)->__list_name.list)) {			\
> > +			__vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,	\
> > +						   struct drm_gpuvm_bo,			\
> > +						   list.entry.__list_name);		\
> > +			if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {			\
> > +				list_move_tail(&(__vm_bo)->list.entry.__list_name,	\
> > +					       __local_list);				\
> > +				break;							\
> > +			} else {							\
> > +				list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> > +				__vm_bo = NULL;						\
> > +			}								\
> > +		}									\
> > +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> > +											\
> > +		__vm_bo;								\
> > +	})
> > +
> > +/**
> > + * for_each_vm_bo_in_list() - internal vm_bo list iterator
> > + *
> > + * This helper is here to provide lockless list iteration. Lockless as in, the
> > + * iterator releases the lock immediately after picking the first element from the
> > + * list, so list insertion and deletion can happen concurrently.
> > + *
> > + * Typical use:
> > + *
> > + *	struct drm_gpuvm_bo *vm_bo;
> > + *	LIST_HEAD(my_local_list);
> > + *
> > + *	ret = 0;
> > + *	drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>, &my_local_list, vm_bo) {
> > + *		ret = do_something_with_vm_bo(..., vm_bo);
> > + *		if (ret)
> > + *			break;
> > + *	}
> > + *	drm_gpuvm_bo_put(vm_bo);
> > + *	drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);
> 
> The names in this example and the helper names don't match.
> 
> > + *
> > + *
> > + * Only used for internal list iterations, not meant to be exposed to the outside
> > + * world.
> > + */
> > +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)	\
> > +	for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> > +						__local_list, NULL);		\
> > +	     __vm_bo;								\
> > +	     __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> > +						__local_list, __vm_bo))		\
> > +
> > +/**
> > + * restore_vm_bo_list() - move vm_bo elements back to their original list
> > + * @__gpuvm: The GPU VM
> > + * @__list_name: The name of the list we're iterating on
> > + * @__local_list: A pointer to the local list used to store already iterated items
> > + *
> > + * When we're done iterating a vm_bo list, we should call restore_vm_bo_list()
> > + * to restore the original state and let new iterations take place.
> > + */
> > +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)				\
> > +	do {										\
> > +		/* Merge back the two lists, moving local list elements to the		\
> > +		 * head to preserve previous ordering, in case it matters.		\
> > +		 */									\
> > +		spin_lock(&(__gpuvm)->__list_name.lock);				\
> > +		list_splice(__local_list, &(__gpuvm)->__list_name.list);		\
> > +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> > +	} while (0)
> > +/**
> > + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given list
> > + * @__vm_bo: the &drm_gpuvm_bo
> > + * @__list_name: the name of the list to insert into
> > + *
> > + * Inserts the given @__vm_bo into the list specified by @__list_name and
> > + * increases the vm_bo's reference count.
> > + */
> > +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)				\
> > +	do {									\
> > +		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
> > +		if (list_empty(&(__vm_bo)->list.entry.__list_name))		\
> > +			list_add_tail(&(__vm_bo)->list.entry.__list_name,	\
> > +				      &(__vm_bo)->vm->__list_name.list);	\
> > +		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
> > +	} while (0)
> > +
> > +/**
> > + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given list
> > + * @__vm_bo: the &drm_gpuvm_bo
> > + * @__list_name: the name of the list to insert into
> > + *
> > + * Removes the given @__vm_bo from the list specified by @__list_name and
> > + * decreases the vm_bo's reference count.
> > + */
> > +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)				\
> > +	do {									\
> > +		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
> > +		if (!list_empty(&(__vm_bo)->list.entry.__list_name))		\
> > +			list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> > +		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
> > +	} while (0)
> > +
> > +static int __must_check
> > +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
> 
> I see no obvious reason to have a forward declaration for this helper,
> if we decide to keep it, let's at least move the declaration here.
> 
> 
> > @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
> >  
> >  	drm_gem_gpuva_assert_lock_held(vm_bo->obj);
> >  
> > +	spin_lock(&gpuvm->extobj.lock);
> > +	list_del(&vm_bo->list.entry.extobj);
> > +	spin_unlock(&gpuvm->extobj.lock);
> > +
> > +	spin_lock(&gpuvm->evict.lock);
> > +	list_del(&vm_bo->list.entry.evict);
> > +	spin_unlock(&gpuvm->evict.lock);
> > +
> >  	list_del(&vm_bo->list.entry.gem);
> >  
> >  	drm_gem_object_put(obj);
> > @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
> >   * @vm_bo: the &drm_gpuvm_bo to release the reference of
> >   *
> >   * This releases a reference to @vm_bo.
> > + *
> > + * If the reference count drops to zero, the &gpuvm_bo is destroyed, which
> > + * includes removing it from the GEMs gpuva list. Hence, if a call to this
> > + * function can potentially let the reference count to zero the caller must
> > + * hold the dma-resv or driver specific GEM gpuva lock.
> 
> Looks like this should have been part of the previous patch. I hate
> the fact we have to worry about GEM gpuva lock being held when we call
> _put() only if the ref drops to zero though. I think I'd feel more
> comfortable if the function was named differently. Maybe _return() or
> _release() to match the _obtain() function, where the object is inserted
> in the GEM vm_bo list. I would also do the lock_is_held() check
> unconditionally, move the list removal in this function with a del_init(),
> and have a WARN_ON(!list_empty) in vm_bo_destroy().
> 

We can't move the list removal to drm_gpuvm_bo_put(), since we need to make
sure we can't create duplicate drm_gpuvm_bo structures. Everything else pretty
much goes away with a dedicated GEM gpuva list lock, as I had in my first patch
series when I introduced the GPUVA manager. At that time it wasn't always
needed, hence the optional driver-specific lock; however, with the VM_BO
abstraction it really makes sense to have a dedicated one.
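
A rough sketch of what such a dedicated lock could look like (field placement
and naming are assumptions, loosely following the initial GPUVA manager
series):

	/* Inside struct drm_gem_object (sketch only): */
	struct {
		/** list of drm_gpuvm_bo structures attached to this GEM */
		struct list_head list;
		/** dedicated lock protecting @list */
		spinlock_t lock;
	} gpuva;

With that, drm_gpuvm_bo_obtain() and drm_gpuvm_bo_destroy() could take
obj->gpuva.lock internally rather than asserting that the dma-resv or a
driver-specific lock is held.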


I agree with the other feedback from this reply and will address it in a V4.

> >   */
> >  void
> >  drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
> > @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
> >  }
> >  EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
> >  
> > +static int __must_check
> > +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
> > +{
> > +	return kref_get_unless_zero(&vm_bo->kref);
> 
> Not convinced this helper is needed. It's only used once, and I
> don't think we'll need it elsewhere.
> 
> > +}
> > +
> >  static struct drm_gpuvm_bo *
> >  __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
> >  		    struct drm_gem_object *obj)
> 
> 
> Regards,
> 
> Boris
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-11 14:45   ` Boris Brezillon
@ 2023-09-11 16:30     ` Danilo Krummrich
  0 siblings, 0 replies; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-11 16:30 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: airlied, daniel, matthew.brost, thomas.hellstrom, sarah.walker,
	donald.robson, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

On Mon, Sep 11, 2023 at 04:45:26PM +0200, Boris Brezillon wrote:
> On Sat,  9 Sep 2023 17:31:13 +0200
> Danilo Krummrich <dakr@redhat.com> wrote:
> 
> > @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
> >  
> >  	drm_gem_gpuva_assert_lock_held(vm_bo->obj);
> >  
> > +	spin_lock(&gpuvm->extobj.lock);
> > +	list_del(&vm_bo->list.entry.extobj);
> > +	spin_unlock(&gpuvm->extobj.lock);
> > +
> > +	spin_lock(&gpuvm->evict.lock);
> > +	list_del(&vm_bo->list.entry.evict);
> > +	spin_unlock(&gpuvm->evict.lock);
> > +
> >  	list_del(&vm_bo->list.entry.gem);
> >  
> >  	drm_gem_object_put(obj);
> 
> I ran into a UAF situation when the drm_gpuvm_bo object is the last
> owner of obj, because the lock that's supposed to be held when calling
> this function (see the drm_gem_gpuva_assert_lock_held() call above)
> belongs to obj (either obj->resv, or a driver specific lock that's
> attached to the driver-specific GEM object). I worked around it by
> taking a ref to obj before calling lock()+drm_gpuvm_bo_put()+unlock(),
> and releasing it after I'm done with the lock, but that just feels wrong.
> 
As mentioned in a previous reply, I think we want to bring the dedicated GEM
gpuva list lock back instead of abusing the dma-resv lock. This way we can
handle locking internally and avoid running into such issues.

There is also no reason for a driver to already hold the GEM gpuva list lock
when calling drm_gpuvm_bo_put(). Drivers would only acquire the lock to
iterate the GEM's list of drm_gpuvm_bos or a drm_gpuvm_bo's list of drm_gpuvas.
And dropping the drm_gpuvm_bo from within such a loop is forbidden anyway.
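
To illustrate, a sketch of how the destroy path could handle the GEM list
locking internally (this is not the patch's current code; it assumes a
dedicated obj->gpuva.lock as sketched in the earlier reply):

	static void
	drm_gpuvm_bo_destroy(struct kref *kref)
	{
		struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo,
							  kref);
		struct drm_gem_object *obj = vm_bo->obj;

		/* ... extobj / evict list removal as in the patch ... */

		spin_lock(&obj->gpuva.lock);	/* dedicated GEM gpuva list lock */
		list_del(&vm_bo->list.entry.gem);
		spin_unlock(&obj->gpuva.lock);

		drm_gem_object_put(obj);
		/* ... free the vm_bo ... */
	}

Callers of drm_gpuvm_bo_put() would then not need to hold any GEM lock
themselves, which also avoids the UAF described above.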


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination Danilo Krummrich
@ 2023-09-11 17:19   ` Thomas Hellström
  2023-09-11 17:49     ` Danilo Krummrich
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-11 17:19 UTC (permalink / raw)
  To: Danilo Krummrich, airlied, daniel, matthew.brost, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel

Hi, Danilo

On 9/9/23 17:31, Danilo Krummrich wrote:
> This patch adds an abstraction layer between the drm_gpuva mappings of
> a particular drm_gem_object and this GEM object itself. The abstraction
> represents a combination of a drm_gem_object and drm_gpuvm. The
> drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
> representing this abstraction), while each drm_gpuvm_bo contains list of
> mappings of this GEM object.
>
> This has multiple advantages:
>
> 1) We can use the drm_gpuvm_bo structure to attach it to various lists
>     of the drm_gpuvm. This is useful for tracking external and evicted
>     objects per VM, which is introduced in subsequent patches.
>
> 2) Finding mappings of a certain drm_gem_object mapped in a certain
>     drm_gpuvm becomes much cheaper.
>
> 3) Drivers can derive and extend the structure to easily represent
>     driver specific states of a BO for a certain GPUVM.
>
> The idea of this abstraction was taken from amdgpu, hence the credit for
> this idea goes to the developers of amdgpu.
>
> Cc: Christian König <christian.koenig@amd.com>
> Signed-off-by: Danilo Krummrich <dakr@redhat.com>

Did you consider having the drivers embed the struct drm_gpuvm_bo in 
their own bo definition? I figure that would mean using the gem bo's 
refcounting and providing a helper to call from the driver's bo release. 
Looks like that could potentially save a lot of code? Or is there 
something that won't work with that approach?
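
For illustration, the embedding described above might look roughly like this
(driver structure names are made up, and drm_gpuvm_bo_release() is a
hypothetical teardown helper, not something the series provides):

	struct my_driver_bo {
		struct drm_gem_object base;
		struct drm_gpuvm_bo vm_bo;	/* embedded, shares the GEM's refcount */
		/* driver specific per-BO state ... */
	};

	static void my_driver_bo_free(struct drm_gem_object *obj)
	{
		struct my_driver_bo *bo = container_of(obj, struct my_driver_bo, base);

		/* hypothetical helper unlinking the embedded vm_bo */
		drm_gpuvm_bo_release(&bo->vm_bo);
		kfree(bo);
	}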

Thanks,

Thomas



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination
  2023-09-11 17:19   ` Thomas Hellström
@ 2023-09-11 17:49     ` Danilo Krummrich
  2023-09-11 18:37       ` Thomas Hellström
  2023-09-12  7:42       ` Thomas Hellström
  0 siblings, 2 replies; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-11 17:49 UTC (permalink / raw)
  To: Thomas Hellström, airlied, daniel, matthew.brost,
	sarah.walker, donald.robson, boris.brezillon, christian.koenig,
	faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel

Hi Thomas,

On 9/11/23 19:19, Thomas Hellström wrote:
> Hi, Danilo
> 
> On 9/9/23 17:31, Danilo Krummrich wrote:
>> This patch adds an abstraction layer between the drm_gpuva mappings of
>> a particular drm_gem_object and this GEM object itself. The abstraction
>> represents a combination of a drm_gem_object and drm_gpuvm. The
>> drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
>> representing this abstraction), while each drm_gpuvm_bo contains list of
>> mappings of this GEM object.
>>
>> This has multiple advantages:
>>
>> 1) We can use the drm_gpuvm_bo structure to attach it to various lists
>>     of the drm_gpuvm. This is useful for tracking external and evicted
>>     objects per VM, which is introduced in subsequent patches.
>>
>> 2) Finding mappings of a certain drm_gem_object mapped in a certain
>>     drm_gpuvm becomes much cheaper.
>>
>> 3) Drivers can derive and extend the structure to easily represent
>>     driver specific states of a BO for a certain GPUVM.
>>
>> The idea of this abstraction was taken from amdgpu, hence the credit for
>> this idea goes to the developers of amdgpu.
>>
>> Cc: Christian König <christian.koenig@amd.com>
>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> 
> Did you consider having the drivers embed the struct drm_gpuvm_bo in their own bo definition? I figure that would mean using the gem bo's refcounting and providing a helper to call from the driver's bo release. Looks like that could potentially save a lot of code? Or is there something that won't work with that approach?

There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free callbacks
for drivers to register for exactly that purpose.
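
A minimal sketch of how a driver might hook those up (the exact callback
signatures aren't shown in the quoted hunks, so the ones below are
assumptions; the driver structure is made up):

	struct my_vm_bo {
		struct drm_gpuvm_bo base;
		/* driver specific per-VM, per-BO state ... */
	};

	static struct drm_gpuvm_bo *my_vm_bo_alloc(void)
	{
		struct my_vm_bo *vm_bo = kzalloc(sizeof(*vm_bo), GFP_KERNEL);

		return vm_bo ? &vm_bo->base : NULL;
	}

	static void my_vm_bo_free(struct drm_gpuvm_bo *vm_bo)
	{
		kfree(container_of(vm_bo, struct my_vm_bo, base));
	}

	static const struct drm_gpuvm_ops my_gpuvm_ops = {
		.vm_bo_alloc	= my_vm_bo_alloc,
		.vm_bo_free	= my_vm_bo_free,
		/* ... split/merge steps etc. ... */
	};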

- Danilo

> 
> Thanks,
> 
> Thomas
> 
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination
  2023-09-11 17:49     ` Danilo Krummrich
@ 2023-09-11 18:37       ` Thomas Hellström
  2023-09-12  7:42       ` Thomas Hellström
  1 sibling, 0 replies; 77+ messages in thread
From: Thomas Hellström @ 2023-09-11 18:37 UTC (permalink / raw)
  To: Danilo Krummrich, airlied, daniel, matthew.brost, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel


On 9/11/23 19:49, Danilo Krummrich wrote:
> Hi Thomas,
>
> On 9/11/23 19:19, Thomas Hellström wrote:
>> Hi, Danilo
>>
>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>> This patch adds an abstraction layer between the drm_gpuva mappings of
>>> a particular drm_gem_object and this GEM object itself. The abstraction
>>> represents a combination of a drm_gem_object and drm_gpuvm. The
>>> drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
>>> representing this abstraction), while each drm_gpuvm_bo contains 
>>> list of
>>> mappings of this GEM object.
>>>
>>> This has multiple advantages:
>>>
>>> 1) We can use the drm_gpuvm_bo structure to attach it to various lists
>>>     of the drm_gpuvm. This is useful for tracking external and evicted
>>>     objects per VM, which is introduced in subsequent patches.
>>>
>>> 2) Finding mappings of a certain drm_gem_object mapped in a certain
>>>     drm_gpuvm becomes much cheaper.
>>>
>>> 3) Drivers can derive and extend the structure to easily represent
>>>     driver specific states of a BO for a certain GPUVM.
>>>
>>> The idea of this abstraction was taken from amdgpu, hence the credit 
>>> for
>>> this idea goes to the developers of amdgpu.
>>>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>
>> Did you consider having the drivers embed the struct drm_gpuvm_bo in 
>> their own bo definition? I figure that would mean using the gem bo's 
>> refcounting and providing a helper to call from the driver's bo 
>> release. Looks like that could potentially save a lot of code? Or is 
>> there something that won't work with that approach?
>
> There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free 
> callback for drivers to register for that purpose.

Ah OK. Thanks, I'll take a deeper look.

/Thomas


>
> - Danilo
>
>>
>> Thanks,
>>
>> Thomas
>>
>>
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination
  2023-09-11 17:49     ` Danilo Krummrich
  2023-09-11 18:37       ` Thomas Hellström
@ 2023-09-12  7:42       ` Thomas Hellström
  2023-09-12 10:06         ` Danilo Krummrich
  1 sibling, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-12  7:42 UTC (permalink / raw)
  To: Danilo Krummrich, airlied, daniel, matthew.brost, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel

Hi, Danilo

On 9/11/23 19:49, Danilo Krummrich wrote:
> Hi Thomas,
>
> On 9/11/23 19:19, Thomas Hellström wrote:
>> Hi, Danilo
>>
>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>> This patch adds an abstraction layer between the drm_gpuva mappings of
>>> a particular drm_gem_object and this GEM object itself. The abstraction
>>> represents a combination of a drm_gem_object and drm_gpuvm. The
>>> drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
>>> representing this abstraction), while each drm_gpuvm_bo contains 
>>> list of
>>> mappings of this GEM object.
>>>
>>> This has multiple advantages:
>>>
>>> 1) We can use the drm_gpuvm_bo structure to attach it to various lists
>>>     of the drm_gpuvm. This is useful for tracking external and evicted
>>>     objects per VM, which is introduced in subsequent patches.
>>>
>>> 2) Finding mappings of a certain drm_gem_object mapped in a certain
>>>     drm_gpuvm becomes much cheaper.
>>>
>>> 3) Drivers can derive and extend the structure to easily represent
>>>     driver specific states of a BO for a certain GPUVM.
>>>
>>> The idea of this abstraction was taken from amdgpu, hence the credit 
>>> for
>>> this idea goes to the developers of amdgpu.
>>>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>
>> Did you consider having the drivers embed the struct drm_gpuvm_bo in 
>> their own bo definition? I figure that would mean using the gem bo's 
>> refcounting and providing a helper to call from the driver's bo 
>> release. Looks like that could potentially save a lot of code? Or is 
>> there something that won't work with that approach?
>
> There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free 
> callback for drivers to register for that purpose.
>
> - Danilo

Now after looking a bit deeper, I think actually the question could be 
rephrased as, why don't we just use the
struct drm_gem_object::gpuva struct as the drm_gpuvm_bo in the spirit of 
keeping things simple? Drivers would then just embed it in their bo 
subclass and we'd avoid unnecessary fields in the struct drm_gem_object 
for drivers that don't do VM_BIND yet.

Sure, this won't be per bo and per vm, but it'd really only make a 
slight difference where we have multiple VMAs per bo, where per-vm 
per-bo state either needs to be duplicated or attached to a single vma 
(as in the case of the external bo list).

To me that looks like a substantial amount of less code / complexity?

/Thomas


>
>>
>> Thanks,
>>
>> Thomas
>>
>>
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination
  2023-09-12  7:42       ` Thomas Hellström
@ 2023-09-12 10:06         ` Danilo Krummrich
  2023-09-12 10:33           ` Thomas Hellström
  0 siblings, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-12 10:06 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

On Tue, Sep 12, 2023 at 09:42:44AM +0200, Thomas Hellström wrote:
> Hi, Danilo
> 
> On 9/11/23 19:49, Danilo Krummrich wrote:
> > Hi Thomas,
> > 
> > On 9/11/23 19:19, Thomas Hellström wrote:
> > > Hi, Danilo
> > > 
> > > On 9/9/23 17:31, Danilo Krummrich wrote:
> > > > This patch adds an abstraction layer between the drm_gpuva mappings of
> > > > a particular drm_gem_object and this GEM object itself. The abstraction
> > > > represents a combination of a drm_gem_object and drm_gpuvm. The
> > > > drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
> > > > representing this abstraction), while each drm_gpuvm_bo contains
> > > > list of
> > > > mappings of this GEM object.
> > > > 
> > > > This has multiple advantages:
> > > > 
> > > > 1) We can use the drm_gpuvm_bo structure to attach it to various lists
> > > >     of the drm_gpuvm. This is useful for tracking external and evicted
> > > >     objects per VM, which is introduced in subsequent patches.
> > > > 
> > > > 2) Finding mappings of a certain drm_gem_object mapped in a certain
> > > >     drm_gpuvm becomes much cheaper.
> > > > 
> > > > 3) Drivers can derive and extend the structure to easily represent
> > > >     driver specific states of a BO for a certain GPUVM.
> > > > 
> > > > The idea of this abstraction was taken from amdgpu, hence the
> > > > credit for
> > > > this idea goes to the developers of amdgpu.
> > > > 
> > > > Cc: Christian König <christian.koenig@amd.com>
> > > > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > > 
> > > Did you consider having the drivers embed the struct drm_gpuvm_bo in
> > > their own bo definition? I figure that would mean using the gem bo's
> > > refcounting and providing a helper to call from the driver's bo
> > > release. Looks like that could potentially save a lot of code? Or is
> > > there something that won't work with that approach?
> > 
> > There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free
> > callback for drivers to register for that purpose.
> > 
> > - Danilo
> 
> Now after looking a bit deeper, I think actually the question could be
> rephrased as, why don't we just use the
> struct drm_gem_object::gpuva struct as the drm_gpuvm_bo in the spirit of
> keeping things simple? Drivers would then just embed it in their bo subclass
> and we'd avoid unnecessary fields in the struct drm_gem_object for drivers
> that don't do VM_BIND yet.

struct drm_gem_object::gpuva is just a container holding a list in order to
(currently) attach drm_gpuva structs to it and, with this patch, attach
drm_gpuvm_bo structs (the combination of BO + VM) to it. Doing the above
basically means "leave everything as it is, but move the list_head of
drm_gpuvas per GEM to the driver-specific BO structure". However, having a
common connection between GEM objects and drm_gpuva structs was one of the
goals of the initial GPUVA manager patch series.

> 
> Sure, this won't be per bo and per vm, but it'd really only make a slight
> difference where we have multiple VMAs per bo, where per-vm per-bo state
> either needs to be duplicated or attached to a single vma (as in the case of
> the external bo list).


Correct, one implication is that we don't get a per VM and BO abstraction, and
hence are left with a list of all drm_gpuva structs having the same backing BO,
regardless of the VM.

For amdgpu this was always a concern. Now that we want to keep track of
external and evicted objects, it's going to be a concern for most drivers I
guess, because without a VM_BO abstraction the only structure left for
tracking external and evicted objects is struct drm_gpuva. But this structure
isn't unique per VM and BO, and we need to consider cases where userspace
allocates rather huge BOs and creates tons of mappings from them. Running the
full list of drm_gpuva structs (even including the ones from other VMs) just
to add an external or evicted object isn't very efficient. Not to mention the
maintenance required when the mapping we've (randomly) picked as the entry for
the external/evicted object list is unmapped while there are still mappings
left in the VM with the same backing BO.

Now, a way to get rid of the VM_BO abstraction would be to use maple trees
instead, since then we could store drm_gem_object structs directly per VM.
However, Xe had concerns about using maple trees and preferred lists, plus
maple trees wouldn't address amdgpu's concern about not having a VM_BO
abstraction for cases with tons of VMs and tons of mappings per BO. Hence,
having a VM_BO abstraction that enables us to track external/evicted objects
with lists seems to satisfy everyone's needs.
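
To make the lookup argument concrete, a rough sketch of the difference (the
helper and iterator names follow the series' naming scheme but are
assumptions, and the body is purely illustrative):

	static void example_for_each_mapping(struct drm_gpuvm *gpuvm,
					     struct drm_gem_object *obj)
	{
		struct drm_gpuvm_bo *vm_bo;
		struct drm_gpuva *va;

		/* With the VM_BO abstraction: obj carries one vm_bo per VM,
		 * so this walk only ever sees mappings belonging to @gpuvm.
		 */
		vm_bo = drm_gpuvm_bo_find(gpuvm, obj);
		if (!vm_bo)
			return;

		drm_gpuvm_bo_for_each_va(va, vm_bo) {
			/* handle a mapping of obj in this VM */
		}

		/* Without it, the only per-GEM structure is a flat gpuva list
		 * containing the mappings of every VM, which would have to be
		 * filtered on every walk.
		 */
	}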

- Danilo

> 
> To me that looks like a substantial amount of less code / complexity?
> 
> /Thomas
> 
> 
> > 
> > > 
> > > Thanks,
> > > 
> > > Thomas
> > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination
  2023-09-12 10:06         ` Danilo Krummrich
@ 2023-09-12 10:33           ` Thomas Hellström
  2023-09-12 11:05             ` Danilo Krummrich
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-12 10:33 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel


On 9/12/23 12:06, Danilo Krummrich wrote:
> On Tue, Sep 12, 2023 at 09:42:44AM +0200, Thomas Hellström wrote:
>> Hi, Danilo
>>
>> On 9/11/23 19:49, Danilo Krummrich wrote:
>>> Hi Thomas,
>>>
>>> On 9/11/23 19:19, Thomas Hellström wrote:
>>>> Hi, Danilo
>>>>
>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>> This patch adds an abstraction layer between the drm_gpuva mappings of
>>>>> a particular drm_gem_object and this GEM object itself. The abstraction
>>>>> represents a combination of a drm_gem_object and drm_gpuvm. The
>>>>> drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
>>>>> representing this abstraction), while each drm_gpuvm_bo contains
>>>>> list of
>>>>> mappings of this GEM object.
>>>>>
>>>>> This has multiple advantages:
>>>>>
>>>>> 1) We can use the drm_gpuvm_bo structure to attach it to various lists
>>>>>      of the drm_gpuvm. This is useful for tracking external and evicted
>>>>>      objects per VM, which is introduced in subsequent patches.
>>>>>
>>>>> 2) Finding mappings of a certain drm_gem_object mapped in a certain
>>>>>      drm_gpuvm becomes much cheaper.
>>>>>
>>>>> 3) Drivers can derive and extend the structure to easily represent
>>>>>      driver specific states of a BO for a certain GPUVM.
>>>>>
>>>>> The idea of this abstraction was taken from amdgpu, hence the
>>>>> credit for
>>>>> this idea goes to the developers of amdgpu.
>>>>>
>>>>> Cc: Christian König <christian.koenig@amd.com>
>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>> Did you consider having the drivers embed the struct drm_gpuvm_bo in
>>>> their own bo definition? I figure that would mean using the gem bo's
>>>> refcounting and providing a helper to call from the driver's bo
>>>> release. Looks like that could potentially save a lot of code? Or is
>>>> there something that won't work with that approach?
>>> There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free
>>> callback for drivers to register for that purpose.
>>>
>>> - Danilo
>> Now after looking a bit deeper, I think actually the question could be
>> rephrased as, why don't we just use the
>> struct drm_gem_object::gpuva struct as the drm_gpuvm_bo in the spirit of
>> keeping things simple? Drivers would then just embed it in their bo subclass
>> and we'd avoid unnecessary fields in the struct drm_gem_object for drivers
>> that don't do VM_BIND yet.
> struct drm_gem_object::gpuva is just a container containing a list in order to
> (currently) attach drm_gpuva structs to it and with this patch attach
> drm_gpuvm_bo structs (combination of BO + VM) to it. Doing the above basically
> means "leave everything as it is, but move the list_head of drm_gpuvs per GEM to
> the driver specific BO structure". Having a common connection between GEM
> objects and drm_gpuva structs was one of the goals of the initial GPUVA manager
> patch series however.
>
>> Sure, this won't be per bo and per vm, but it'd really only make a slight
>> difference where we have multiple VMAs per bo, where per-vm per-bo state
>> either needs to be duplicated or attached to a single vma (as in the case of
>> the external bo list).
>
> Correct, one implication is that we don't get a per VM and BO abstraction, and
> hence are left with a list of all drm_gpuva structs having the same backing BO,
> regardless of the VM.
>
> For amdgpu this was always a concern. Now that we want to keep track of external
> and evicted objects it's going to be a concern for most drivers I guess. Because
> the only structure we could use for tracking external and evicted objects we are
> left with (without having a VM_BO abstraction) is struct drm_gpuva. But this
> structure isn't unique and we need to consider cases where userspace just
> allocates rather huge BOs and creates tons of mappings from it. Running the full
> list of drm_gpuva structs (with even the ones from other VMs included) for
> adding an external or evicted object isn't very efficient. Not to mention that
> the maintenance when the mapping we've (randomly) picked as an entry for the
> external/evicted object list is unmapped, but there are still mappings left in
> the VM with the same backing BO.
For evicted objects it's not much of an issue; we maintain a list of vmas
needing rebinding for each VM rather than a list of evicted objects, so there
is little to no additional overhead there. The extobj list is indeed a problem
if many VMAs are bound to the same bo. Not that the code snippets are
complicated, but the list traversals would be excessive.
>
> Now, a way to get rid of the VM_BO abstraction would be to use maple trees
> instead, since then we can store drm_gem_object structs directly for each VM.
> However, Xe had concerns about using maple trees and preferred lists, plus
> having maple trees wouldn't get rid of the concerns of amdgpu not having a VM_BO
> abstraction for cases with tons of VMs and tons of mappings per BO. Hence,
> having a VM_BO abstraction enabling us to track external/evicted objects with
> lists seems to satisfy everyone's needs.

Indeed this is a tradeoff between a simple implementation that is OK for
situations with few VMs and VMAs per bo versus a more complex implementation
that optimizes for the opposite case.

So if the latter is a case we need to optimize for at this point, then I guess
it's the way to go.
(I'm in the process of adapting the xe driver to this, so I just wanted to
bring up areas where the implementations differ quite a lot and make sure
options are discussed.)

Thanks,

Thomas


>
> - Danilo
>
>> To me that looks like a substantial amount of less code / complexity?
>>
>> /Thomas
>>
>>
>>>> Thanks,
>>>>
>>>> Thomas
>>>>
>>>>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination
  2023-09-12 10:33           ` Thomas Hellström
@ 2023-09-12 11:05             ` Danilo Krummrich
  0 siblings, 0 replies; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-12 11:05 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

On Tue, Sep 12, 2023 at 12:33:14PM +0200, Thomas Hellström wrote:
> 
> On 9/12/23 12:06, Danilo Krummrich wrote:
> > On Tue, Sep 12, 2023 at 09:42:44AM +0200, Thomas Hellström wrote:
> > > Hi, Danilo
> > > 
> > > On 9/11/23 19:49, Danilo Krummrich wrote:
> > > > Hi Thomas,
> > > > 
> > > > On 9/11/23 19:19, Thomas Hellström wrote:
> > > > > Hi, Danilo
> > > > > 
> > > > > On 9/9/23 17:31, Danilo Krummrich wrote:
> > > > > > This patch adds an abstraction layer between the drm_gpuva mappings of
> > > > > > a particular drm_gem_object and this GEM object itself. The abstraction
> > > > > > represents a combination of a drm_gem_object and drm_gpuvm. The
> > > > > > drm_gem_object holds a list of drm_gpuvm_bo structures (the structure
> > > > > > representing this abstraction), while each drm_gpuvm_bo contains
> > > > > > list of
> > > > > > mappings of this GEM object.
> > > > > > 
> > > > > > This has multiple advantages:
> > > > > > 
> > > > > > 1) We can use the drm_gpuvm_bo structure to attach it to various lists
> > > > > >      of the drm_gpuvm. This is useful for tracking external and evicted
> > > > > >      objects per VM, which is introduced in subsequent patches.
> > > > > > 
> > > > > > 2) Finding mappings of a certain drm_gem_object mapped in a certain
> > > > > >      drm_gpuvm becomes much cheaper.
> > > > > > 
> > > > > > 3) Drivers can derive and extend the structure to easily represent
> > > > > >      driver specific states of a BO for a certain GPUVM.
> > > > > > 
> > > > > > The idea of this abstraction was taken from amdgpu, hence the
> > > > > > credit for
> > > > > > this idea goes to the developers of amdgpu.
> > > > > > 
> > > > > > Cc: Christian König <christian.koenig@amd.com>
> > > > > > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > > > > Did you consider having the drivers embed the struct drm_gpuvm_bo in
> > > > > their own bo definition? I figure that would mean using the gem bo's
> > > > > refcounting and providing a helper to call from the driver's bo
> > > > > release. Looks like that could potentially save a lot of code? Or is
> > > > > there something that won't work with that approach?
> > > > There are drm_gpuvm_ops::vm_bo_alloc and drm_gpuvm_ops::vm_bo_free
> > > > callback for drivers to register for that purpose.
> > > > 
> > > > - Danilo
> > > Now after looking a bit deeper, I think actually the question could be
> > > rephrased as, why don't we just use the
> > > struct drm_gem_object::gpuva struct as the drm_gpuvm_bo in the spirit of
> > > keeping things simple? Drivers would then just embed it in their bo subclass
> > > and we'd avoid unnecessary fields in the struct drm_gem_object for drivers
> > > that don't do VM_BIND yet.
> > struct drm_gem_object::gpuva is just a container containing a list in order to
> > (currently) attach drm_gpuva structs to it and with this patch attach
> > drm_gpuvm_bo structs (combination of BO + VM) to it. Doing the above basically
> > means "leave everything as it is, but move the list_head of drm_gpuvs per GEM to
> > the driver specific BO structure". Having a common connection between GEM
> > objects and drm_gpuva structs was one of the goals of the initial GPUVA manager
> > patch series however.
> > 
> > > Sure, this won't be per bo and per vm, but it'd really only make a slight
> > > difference where we have multiple VMAs per bo, where per-vm per-bo state
> > > either needs to be duplicated or attached to a single vma (as in the case of
> > > the external bo list).
> > 
> > Correct, one implication is that we don't get a per VM and BO abstraction, and
> > hence are left with a list of all drm_gpuva structs having the same backing BO,
> > regardless of the VM.
> > 
> > For amdgpu this was always a concern. Now that we want to keep track of external
> > and evicted objects it's going to be a concern for most drivers I guess. Because
> > the only structure we could use for tracking external and evicted objects we are
> > left with (without having a VM_BO abstraction) is struct drm_gpuva. But this
> > structure isn't unique and we need to consider cases where userspace just
> > allocates rather huge BOs and creates tons of mappings from it. Running the full
> > list of drm_gpuva structs (with even the ones from other VMs included) for
> > adding an external or evicted object isn't very efficient. Not to mention that
> > the maintenance when the mapping we've (randomly) picked as an entry for the
> > external/evicted object list is unmapped, but there are still mappings left in
> > the VM with the same backing BO.
> For the evicted object it's not much of an issue; we maintain a list of vmas
> needing rebinding for each VM rather than objects evicted, so there is no or
> very little additional overhead there. The extobj list is indeed a problem
> if many VMAs are bound to the same bo. Not that the code snippets are
> complicated, but the list traversals would be excessive.
> > 
> > Now, a way to get rid of the VM_BO abstraction would be to use maple trees
> > instead, since then we can store drm_gem_object structs directly for each VM.
> > However, Xe had concerns about using maple trees and preferred lists, plus
> > having maple trees wouldn't get rid of the concerns of amdgpu not having a VM_BO
> > abstraction for cases with tons of VMs and tons of mappings per BO. Hence,
> > having a VM_BO abstraction enabling us to track external/evicted objects with
> > lists seems to satisfy everyone's needs.
> 
> Indeed this is a tradeoff between a simple implementation that is OK for
> situations with not many VMs nor VMAs per bo vs a more complex
> implementation that optimizes for the opposite case.
> 
> So if this latter is a case we need to optimize for at this point then I
> guess it's the way to go.
> (I'm in the process of adapting the xe driver to this, so I just wanted to
> bring up areas where the implementations differ quite a lot and make sure
> options are discussed).

Thanks, I appreciate that. Just be aware of the locking issue in V3 that Boris
has pointed out. I don't know if I will get to sending out a V4 today to fix
that, but I'll try to do it by tomorrow.

- Danilo

> 
> Thanks,
> 
> Thomas
> 
> 
> > 
> > - Danilo
> > 
> > > To me that looks like a substantial amount of less code / complexity?
> > > 
> > > /Thomas
> > > 
> > > 
> > > > > Thanks,
> > > > > 
> > > > > Thomas
> > > > > 
> > > > > 
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation Danilo Krummrich
                     ` (3 preceding siblings ...)
  2023-09-11 14:45   ` Boris Brezillon
@ 2023-09-12 16:20   ` Thomas Hellström
  2023-09-12 16:50     ` Danilo Krummrich
  2023-09-13  7:03     ` Boris Brezillon
  2023-09-14 13:48   ` Thomas Hellström
  5 siblings, 2 replies; 77+ messages in thread
From: Thomas Hellström @ 2023-09-12 16:20 UTC (permalink / raw)
  To: Danilo Krummrich, airlied, daniel, matthew.brost, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel

Hi, Danilo,

On 9/9/23 17:31, Danilo Krummrich wrote:
> So far the DRM GPUVA manager offers common infrastructure to track GPU VA
> allocations and mappings, generically connect GPU VA mappings to their
> backing buffers and perform more complex mapping operations on the GPU VA
> space.
>
> However, there are more design patterns commonly used by drivers, which
> can potentially be generalized in order to make the DRM GPUVA manager
> represent a basic GPU-VM implementation. In this context, this patch aims
> at generalizing the following elements.
>
> 1) Provide a common dma-resv for GEM objects not being used outside of
>     this GPU-VM.
>
> 2) Provide tracking of external GEM objects (GEM objects which are
>     shared with other GPU-VMs).
>
> 3) Provide functions to efficiently lock all GEM objects dma-resv the
>     GPU-VM contains mappings of.
>
> 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
>     of, such that validation of evicted GEM objects is accelerated.
>
> 5) Provide some convinience functions for common patterns.
>
> Rather than being designed as a "framework", the target is to make all
> features appear as a collection of optional helper functions, such that
> drivers are free to make use of the DRM GPUVA managers basic
> functionality and opt-in for other features without setting any feature
> flags, just by making use of the corresponding functions.
>
> Big kudos to Boris Brezillon for his help to figure out locking for drivers
> updating the GPU VA space within the fence signalling path.
>
> Suggested-by: Matthew Brost <matthew.brost@intel.com>
> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> ---
>   drivers/gpu/drm/drm_gpuvm.c | 516 ++++++++++++++++++++++++++++++++++++
>   include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>   2 files changed, 713 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
> index f4411047dbb3..8e62a043f719 100644
> --- a/drivers/gpu/drm/drm_gpuvm.c
> +++ b/drivers/gpu/drm/drm_gpuvm.c
> @@ -73,6 +73,21 @@
>    * &drm_gem_object list of &drm_gpuvm_bos for an existing instance of this
>    * particular combination. If not existent a new instance is created and linked
>    * to the &drm_gem_object.
> + *
> + * &drm_gpuvm_bo structures, since unique for a given &drm_gpuvm, are also used
> + * as entry for the &drm_gpuvm's lists of external and evicted objects. Those
> + * list are maintained in order to accelerate locking of dma-resv locks and
> + * validation of evicted objects bound in a &drm_gpuvm. For instance the all
> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be locked by calling
> + * drm_gpuvm_exec_lock(). Once locked drivers can call drm_gpuvm_validate() in
> + * order to validate all evicted &drm_gem_objects. It is also possible to lock
> + * additional &drm_gem_objects by providing the corresponding parameters to
> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec loop while making
> + * use of helper functions such as drm_gpuvm_prepare_range() or
> + * drm_gpuvm_prepare_objects().
> + *
> + * Every bound &drm_gem_object is treated as external object when its &dma_resv
> + * structure is different than the &drm_gpuvm's common &dma_resv structure.
>    */
>   
>   /**
> @@ -420,6 +435,20 @@
>    * Subsequent calls to drm_gpuvm_bo_obtain() for the same &drm_gpuvm and
>    * &drm_gem_object must be able to observe previous creations and destructions
>    * of &drm_gpuvm_bos in order to keep instances unique.
> + *
> + * The &drm_gpuvm's lists for keeping track of external and evicted objects are
> + * protected against concurrent insertion / removal and iteration internally.
> + *
> + * However, drivers still need ensure to protect concurrent calls to functions
> + * iterating those lists, such as drm_gpuvm_validate() and
> + * drm_gpuvm_prepare_objects(). Every such function contains a particular
> + * comment and lockdep checks if possible.
> + *
> + * Functions adding or removing entries from those lists, such as
> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be called with external
> + * locks being held, e.g. in order to avoid the corresponding list to be
> + * (safely) modified while potentially being iternated by other API functions.
> + * However, this is entirely optional.
>    */
>   
>   /**
> @@ -632,6 +661,131 @@
>    *	}
>    */
>   
> +/**
> + * get_next_vm_bo_from_list() - get the next vm_bo element
> + * @__gpuvm: The GPU VM
> + * @__list_name: The name of the list we're iterating on
> + * @__local_list: A pointer to the local list used to store already iterated items
> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> + *
> + * This helper is here to provide lockless list iteration. Lockless as in, the
> + * iterator releases the lock immediately after picking the first element from
> + * the list, so list insertion deletion can happen concurrently.

Are the list spinlocks needed for that async state update from within 
the dma-fence critical section we've discussed previously?

Otherwise it should be sufficient to protect the lists with the gpuvm's 
resv (or for the extobj list with an outer lock).

If those spinlocks are still needed in some situations, perhaps we could
have an option to set them to NULL (like, IIRC, the maple tree allows)?

For such drivers, that would require anybody calling unlink to hold the
vm's resv, though.

It seems that with that, the refcount could also be made non-atomic.

All in the spirit of the drm locking guidelines "use big locks when
possible"; lower level locks only when necessary for performance or locking
inversion?
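
For the sake of discussion, a sketch of what the "big lock" variant could look
like for the extobj list, i.e. protecting it with the VM's dma-resv rather
than a spinlock (the function name appears in the series' documentation, but
the signature and body below are illustrative only):

	static void
	drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
	{
		struct drm_gpuvm *gpuvm = vm_bo->vm;

		/* List mutations only allowed under the VM's dma-resv. */
		dma_resv_assert_held(gpuvm->resv);

		if (list_empty(&vm_bo->list.entry.extobj))
			list_add_tail(&vm_bo->list.entry.extobj,
				      &gpuvm->extobj.list);
	}

With that, drm_gpuvm_prepare_objects() and friends could iterate the list
under the VM's resv without the local-list dance, at the cost of requiring
every extobj/unlink update to hold the VM's resv.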

/Thomas


> + *
> + * Elements popped from the original list are kept in a local list, so removal
> + * and is_empty checks can still happen while we're iterating the list.
> + */
> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
> +	({										\
> +		struct drm_gpuvm_bo *__vm_bo;						\
> +											\
> +		drm_gpuvm_bo_put(__prev_vm_bo);						\
> +											\
> +		spin_lock(&(__gpuvm)->__list_name.lock);				\
> +		while (!list_empty(&(__gpuvm)->__list_name.list)) {			\
> +			__vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,	\
> +						   struct drm_gpuvm_bo,			\
> +						   list.entry.__list_name);		\
> +			if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {			\
> +				list_move_tail(&(__vm_bo)->list.entry.__list_name,	\
> +					       __local_list);				\
> +				break;							\
> +			} else {							\
> +				list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> +				__vm_bo = NULL;						\
> +			}								\
> +		}									\
> +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> +											\
> +		__vm_bo;								\
> +	})
> +
> +/**
> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
> + *
> + * This helper is here to provide lockless list iteration. Lockless as in, the
> + * iterator releases the lock immediately after picking the first element from the
> + * list, so list insertion and deletion can happen concurrently.
> + *
> + * Typical use:
> + *
> + *	struct drm_gpuvm_bo *vm_bo;
> + *	LIST_HEAD(my_local_list);
> + *
> + *	ret = 0;
> + *	drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>, &my_local_list, vm_bo) {
> + *		ret = do_something_with_vm_bo(..., vm_bo);
> + *		if (ret)
> + *			break;
> + *	}
> + *	drm_gpuvm_bo_put(vm_bo);
> + *	drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);
> + *
> + *
> + * Only used for internal list iterations, not meant to be exposed to the outside
> + * world.
> + */
> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)	\
> +	for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> +						__local_list, NULL);		\
> +	     __vm_bo;								\
> +	     __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> +						__local_list, __vm_bo))		\
> +
> +/**
> + * restore_vm_bo_list() - move vm_bo elements back to their original list
> + * @__gpuvm: The GPU VM
> + * @__list_name: The name of the list we're iterating on
> + * @__local_list: A pointer to the local list used to store already iterated items
> + *
> + * When we're done iterating a vm_bo list, we should call restore_vm_bo_list()
> + * to restore the original state and let new iterations take place.
> + */
> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)				\
> +	do {										\
> +		/* Merge back the two lists, moving local list elements to the		\
> +		 * head to preserve previous ordering, in case it matters.		\
> +		 */									\
> +		spin_lock(&(__gpuvm)->__list_name.lock);				\
> +		list_splice(__local_list, &(__gpuvm)->__list_name.list);		\
> +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> +	} while (0)
> +/**
> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given list
> + * @__vm_bo: the &drm_gpuvm_bo
> + * @__list_name: the name of the list to insert into
> + *
> + * Inserts the given @__vm_bo into the list specified by @__list_name and
> + * increases the vm_bo's reference count.
> + */
> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)				\
> +	do {									\
> +		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
> +		if (list_empty(&(__vm_bo)->list.entry.__list_name))		\
> +			list_add_tail(&(__vm_bo)->list.entry.__list_name,	\
> +				      &(__vm_bo)->vm->__list_name.list);	\
> +		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
> +	} while (0)
> +
> +/**
> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given list
> + * @__vm_bo: the &drm_gpuvm_bo
> + * @__list_name: the name of the list to insert into
> + *
> + * Removes the given @__vm_bo from the list specified by @__list_name and
> + * decreases the vm_bo's reference count.
> + */
> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)				\
> +	do {									\
> +		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
> +		if (!list_empty(&(__vm_bo)->list.entry.__list_name))		\
> +			list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> +		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
> +	} while (0)
> +
> +static int __must_check
> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
> +
>   #define to_drm_gpuva(__node)	container_of((__node), struct drm_gpuva, rb.node)
>   
>   #define GPUVA_START(node) ((node)->va.addr)
> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>   	gpuvm->rb.tree = RB_ROOT_CACHED;
>   	INIT_LIST_HEAD(&gpuvm->rb.list);
>   
> +	INIT_LIST_HEAD(&gpuvm->extobj.list);
> +	spin_lock_init(&gpuvm->extobj.lock);
> +
> +	INIT_LIST_HEAD(&gpuvm->evict.list);
> +	spin_lock_init(&gpuvm->evict.lock);
> +
>   	drm_gpuva_check_overflow(start_offset, range);
>   	gpuvm->mm_start = start_offset;
>   	gpuvm->mm_range = range;
> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
>   	WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>   	     "GPUVA tree is not empty, potentially leaking memory.\n");
>   
> +	WARN(!list_empty(&gpuvm->extobj.list), "Extobj list should be empty.\n");
> +	WARN(!list_empty(&gpuvm->evict.list), "Evict list should be empty.\n");
> +
>   	drm_gem_private_object_fini(&gpuvm->d_obj);
>   }
>   EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>   
> +/**
> + * drm_gpuvm_prepare_objects() - prepare all assoiciated BOs
> + * @gpuvm: the &drm_gpuvm
> + * @exec: the &drm_exec locking context
> + * @num_fences: the amount of &dma_fences to reserve
> + *
> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
> + * &drm_gpuvm contains mappings of.
> + *
> + * Using this function directly, it is the drivers responsibility to call
> + * drm_exec_init() and drm_exec_fini() accordingly.
> + *
> + * Note: This function is safe against concurrent insertion and removal of
> + * external objects, however it is not safe against concurrent usage itself.
> + *
> + * Drivers need to make sure to protect this case with either an outer VM lock
> + * or by calling drm_gpuvm_prepare_vm() before this function within the
> + * drm_exec_until_all_locked() loop, such that the GPUVM's dma-resv lock ensures
> + * mutual exclusion.
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +int
> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> +			  struct drm_exec *exec,
> +			  unsigned int num_fences)
> +{
> +	struct drm_gpuvm_bo *vm_bo;
> +	LIST_HEAD(extobjs);
> +	int ret = 0;
> +
> +	for_each_vm_bo_in_list(gpuvm, extobj, &extobjs, vm_bo) {
> +		ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
> +		if (ret)
> +			break;
> +	}
> +	/* Drop ref in case we break out of the loop. */
> +	drm_gpuvm_bo_put(vm_bo);
> +	restore_vm_bo_list(gpuvm, extobj, &extobjs);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
> +
> +/**
> + * drm_gpuvm_prepare_range() - prepare all BOs mapped within a given range
> + * @gpuvm: the &drm_gpuvm
> + * @exec: the &drm_exec locking context
> + * @addr: the start address within the VA space
> + * @range: the range to iterate within the VA space
> + * @num_fences: the amount of &dma_fences to reserve
> + *
> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects mapped between @addr
> + * and @addr + @range.
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +int
> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
> +			u64 addr, u64 range, unsigned int num_fences)
> +{
> +	struct drm_gpuva *va;
> +	u64 end = addr + range;
> +	int ret;
> +
> +	drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
> +		struct drm_gem_object *obj = va->gem.obj;
> +
> +		ret = drm_exec_prepare_obj(exec, obj, num_fences);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
> +
> +/**
> + * drm_gpuvm_exec_lock() - lock all dma-resv of all assoiciated BOs
> + * @vm_exec: the &drm_gpuvm_exec abstraction
> + * @num_fences: the amount of &dma_fences to reserve
> + * @interruptible: sleep interruptible if waiting
> + *
> + * Acquires all dma-resv locks of all &drm_gem_objects the given
> + * &drm_gpuvm contains mappings of.
> + *
> + * Addionally, when calling this function with struct drm_gpuvm_exec::extra
> + * being set the driver receives the given @fn callback to lock additional
> + * dma-resv in the context of the &drm_gpuvm_exec instance. Typically, drivers
> + * would call drm_exec_prepare_obj() from within this callback.
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +int
> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> +		    unsigned int num_fences,
> +		    bool interruptible)
> +{
> +	struct drm_gpuvm *gpuvm = vm_exec->vm;
> +	struct drm_exec *exec = &vm_exec->exec;
> +	uint32_t flags;
> +	int ret;
> +
> +	flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
> +		DRM_EXEC_IGNORE_DUPLICATES;
> +
> +	drm_exec_init(exec, flags);
> +
> +	drm_exec_until_all_locked(exec) {
> +		ret = drm_gpuvm_prepare_vm(gpuvm, exec, num_fences);
> +		drm_exec_retry_on_contention(exec);
> +		if (ret)
> +			goto err;
> +
> +		ret = drm_gpuvm_prepare_objects(gpuvm, exec, num_fences);
> +		drm_exec_retry_on_contention(exec);
> +		if (ret)
> +			goto err;
> +
> +		if (vm_exec->extra.fn) {
> +			ret = vm_exec->extra.fn(vm_exec, num_fences);
> +			drm_exec_retry_on_contention(exec);
> +			if (ret)
> +				goto err;
> +		}
> +	}
> +
> +	return 0;
> +
> +err:
> +	drm_exec_fini(exec);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
> +
> +static int
> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int num_fences)
> +{
> +	struct {
> +		struct drm_gem_object **objs;
> +		unsigned int num_objs;
> +	} *args = vm_exec->extra.priv;
> +
> +	return drm_exec_prepare_array(&vm_exec->exec, args->objs,
> +				      args->num_objs, num_fences);
> +}
> +
> +/**
> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all assoiciated BOs
> + * @vm_exec: the &drm_gpuvm_exec abstraction
> + * @objs: additional &drm_gem_objects to lock
> + * @num_objs: the number of additional &drm_gem_objects to lock
> + * @num_fences: the amount of &dma_fences to reserve
> + * @interruptible: sleep interruptible if waiting
> + *
> + * Acquires all dma-resv locks of all &drm_gem_objects the given &drm_gpuvm
> + * contains mappings of, plus the ones given through @objs.
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +int
> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
> +			  struct drm_gem_object **objs,
> +			  unsigned int num_objs,
> +			  unsigned int num_fences,
> +			  bool interruptible)
> +{
> +	struct {
> +		struct drm_gem_object **objs;
> +		unsigned int num_objs;
> +	} args;
> +
> +	args.objs = objs;
> +	args.num_objs = num_objs;
> +
> +	vm_exec->extra.fn = fn_lock_array;
> +	vm_exec->extra.priv = &args;
> +
> +	return drm_gpuvm_exec_lock(vm_exec, num_fences, interruptible);
> +}
> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
> +
> +/**
> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped within a given range
> + * @vm_exec: the &drm_gpuvm_exec abstraction
> + * @addr: the start address within the VA space
> + * @range: the range to iterate within the VA space
> + * @num_fences: the amount of &dma_fences to reserve
> + * @interruptible: sleep interruptible if waiting
> + *
> + * Acquires all dma-resv locks of all &drm_gem_objects mapped between @addr and
> + * @addr + @range.
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +int
> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
> +			  u64 addr, u64 range,
> +			  unsigned int num_fences,
> +			  bool interruptible)
> +{
> +	struct drm_gpuvm *gpuvm = vm_exec->vm;
> +	struct drm_exec *exec = &vm_exec->exec;
> +	uint32_t flags;
> +	int ret;
> +
> +	flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
> +		DRM_EXEC_IGNORE_DUPLICATES;
> +
> +	drm_exec_init(exec, flags);
> +
> +	drm_exec_until_all_locked(exec) {
> +		ret = drm_gpuvm_prepare_range(gpuvm, exec, addr, range,
> +					      num_fences);
> +		drm_exec_retry_on_contention(exec);
> +		if (ret)
> +			goto err;
> +	}
> +
> +	return 0;
> +
> +err:
> +	drm_exec_fini(exec);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
> +
> +/**
> + * drm_gpuvm_validate() - validate all BOs marked as evicted
> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
> + *
> + * Calls the &drm_gpuvm_ops.bo_validate callback for all evicted buffer
> + * objects being mapped in the given &drm_gpuvm.
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +int
> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
> +{
> +	const struct drm_gpuvm_ops *ops = gpuvm->ops;
> +	struct drm_gpuvm_bo *vm_bo;
> +	LIST_HEAD(evict);
> +	int ret = 0;
> +
> +	if (unlikely(!ops || !ops->bo_validate))
> +		return -ENOTSUPP;
> +
> +	for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
> +		dma_resv_assert_held(vm_bo->obj->resv);
> +		ret = ops->bo_validate(vm_bo->obj);
> +		if (ret)
> +			break;
> +	}
> +	/* Drop ref in case we break out of the loop. */
> +	drm_gpuvm_bo_put(vm_bo);
> +	restore_vm_bo_list(gpuvm, evict, &evict);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
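
For a TTM-based driver the bo_validate callback this relies on would presumably
boil down to a ttm_bo_validate() call with a driver-chosen placement; a rough
sketch, with my_vram_placement being a hypothetical, driver-defined struct
ttm_placement:

static int my_gpuvm_bo_validate(struct drm_gem_object *obj)
{
	struct ttm_buffer_object *bo =
		container_of(obj, struct ttm_buffer_object, base);
	struct ttm_operation_ctx ctx = {
		.interruptible = true,
		.no_wait_gpu = false,
	};

	/* Move the evicted BO back into the driver's preferred placement. */
	return ttm_bo_validate(bo, &my_vram_placement, &ctx);
}
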
> +
> +/**
> + * drm_gpuvm_resv_add_fence - add fence to private and all extobj
> + * dma-resv
> + * @gpuvm: the &drm_gpuvm to add a fence to
> + * @exec: the &drm_exec locking context
> + * @fence: fence to add
> + * @private_usage: private dma-resv usage
> + * @extobj_usage: extobj dma-resv usage
> + */
> +void
> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> +			 struct drm_exec *exec,
> +			 struct dma_fence *fence,
> +			 enum dma_resv_usage private_usage,
> +			 enum dma_resv_usage extobj_usage)
> +{
> +	struct drm_gem_object *obj;
> +	unsigned long index;
> +
> +	drm_exec_for_each_locked_object(exec, index, obj) {
> +		dma_resv_assert_held(obj->resv);
> +		dma_resv_add_fence(obj->resv, fence,
> +				   drm_gpuvm_is_extobj(gpuvm, obj) ?
> +				   extobj_usage : private_usage);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
> +
>   /**
>    * drm_gpuvm_bo_create() - create a new instance of struct drm_gpuvm_bo
>    * @gpuvm: The &drm_gpuvm the @obj is mapped in.
> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
>   	INIT_LIST_HEAD(&vm_bo->list.gpuva);
>   	INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>   
> +	INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
> +	INIT_LIST_HEAD(&vm_bo->list.entry.evict);
> +
>   	drm_gem_object_get(obj);
>   
>   	return vm_bo;
> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>   
>   	drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>   
> +	spin_lock(&gpuvm->extobj.lock);
> +	list_del(&vm_bo->list.entry.extobj);
> +	spin_unlock(&gpuvm->extobj.lock);
> +
> +	spin_lock(&gpuvm->evict.lock);
> +	list_del(&vm_bo->list.entry.evict);
> +	spin_unlock(&gpuvm->evict.lock);
> +
>   	list_del(&vm_bo->list.entry.gem);
>   
>   	drm_gem_object_put(obj);
> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>    * @vm_bo: the &drm_gpuvm_bo to release the reference of
>    *
>    * This releases a reference to @vm_bo.
> + *
> + * If the reference count drops to zero, the &drm_gpuvm_bo is destroyed, which
> + * includes removing it from the GEM's gpuva list. Hence, if a call to this
> + * function can potentially let the reference count drop to zero, the caller
> + * must hold the dma-resv or driver-specific GEM gpuva lock.
>    */
>   void
>   drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>   }
>   EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>   
> +static int __must_check
> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
> +{
> +	return kref_get_unless_zero(&vm_bo->kref);
> +}
> +
>   static struct drm_gpuvm_bo *
>   __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>   		    struct drm_gem_object *obj)
> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *__vm_bo)
>   }
>   EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>   
> +/**
> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its &drm_gpuvm's
> + * extobj list
> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's extobj list.
> + *
> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if it is not on the
> + * list already and if the corresponding &drm_gem_object actually is an
> + * external object.
> + */
> +void
> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
> +{
> +	struct drm_gpuvm *gpuvm = vm_bo->vm;
> +
> +	if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
> +		drm_gpuvm_bo_list_add(vm_bo, extobj);
> +}
> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
> +
> +/**
> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to / from the
> + * &drm_gpuvm's evicted list
> + * @obj: the &drm_gem_object to add or remove
> + * @evict: indicates whether the object is evicted
> + *
> + * Adds a &drm_gem_object to or removes it from the evicted list of all
> + * &drm_gpuvms containing a mapping of this &drm_gem_object.
> + */
> +void
> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
> +{
> +	struct drm_gpuvm_bo *vm_bo;
> +
> +	drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
> +		if (evict)
> +			drm_gpuvm_bo_list_add(vm_bo, evict);
> +		else
> +			drm_gpuvm_bo_list_del(vm_bo, evict);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
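
The counterpart on the eviction side would presumably be a call from the
driver's move / eviction notification while the BO's dma-resv is held, e.g.
(sketch only):

/* Called e.g. from the driver's TTM move path, with bo->base.resv held. */
static void my_bo_move_notify(struct ttm_buffer_object *bo, bool evicted)
{
	drm_gpuvm_bo_evict(&bo->base, evicted);
}
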
> +
>   static int
>   __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>   		   struct drm_gpuva *va)
> diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
> index afa50b9059a2..834bb6d6617e 100644
> --- a/include/drm/drm_gpuvm.h
> +++ b/include/drm/drm_gpuvm.h
> @@ -26,10 +26,12 @@
>    */
>   
>   #include <linux/list.h>
> +#include <linux/dma-resv.h>
>   #include <linux/rbtree.h>
>   #include <linux/types.h>
>   
>   #include <drm/drm_gem.h>
> +#include <drm/drm_exec.h>
>   
>   struct drm_gpuvm;
>   struct drm_gpuvm_bo;
> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>   	 * space
>   	 */
>   	struct dma_resv *resv;
> +
> +	/**
> +	 * @extobj: structure holding the extobj list
> +	 */
> +	struct {
> +		/**
> +		 * @list: &list_head storing &drm_gpuvm_bos serving as
> +		 * external object
> +		 */
> +		struct list_head list;
> +
> +		/**
> +		 * @lock: spinlock to protect the extobj list
> +		 */
> +		spinlock_t lock;
> +	} extobj;
> +
> +	/**
> +	 * @evict: structure holding the evict list and evict list lock
> +	 */
> +	struct {
> +		/**
> +		 * @list: &list_head storing &drm_gpuvm_bos currently being
> +		 * evicted
> +		 */
> +		struct list_head list;
> +
> +		/**
> +		 * @lock: spinlock to protect the evict list
> +		 */
> +		spinlock_t lock;
> +	} evict;
>   };
>   
>   void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>   		    const struct drm_gpuvm_ops *ops);
>   void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>   
> +/**
> + * drm_gpuvm_is_extobj() - indicates whether the given &drm_gem_object is an
> + * external object
> + * @gpuvm: the &drm_gpuvm to check
> + * @obj: the &drm_gem_object to check
> + *
> + * Returns: true if the &drm_gem_object's &dma_resv differs from the
> + * &drm_gpuvm's &dma_resv, false otherwise
> + */
> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
> +				       struct drm_gem_object *obj)
> +{
> +	return obj && obj->resv != gpuvm->resv;
> +}
> +
>   static inline struct drm_gpuva *
>   __drm_gpuva_next(struct drm_gpuva *va)
>   {
> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>   #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
>   	list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
>   
> +/**
> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of &drm_exec
> + *
> + * This structure should be created on the stack as &drm_exec should be.
> + *
> + * Optionally, @extra can be set in order to lock additional &drm_gem_objects.
> + */
> +struct drm_gpuvm_exec {
> +	/**
> +	 * @exec: the &drm_exec structure
> +	 */
> +	struct drm_exec exec;
> +
> +	/**
> +	 * @vm: the &drm_gpuvm to lock its DMA reservations
> +	 */
> +	struct drm_gpuvm *vm;
> +
> +	/**
> +	 * @extra: Callback and corresponding private data for the driver to
> +	 * lock arbitrary additional &drm_gem_objects.
> +	 */
> +	struct {
> +		/**
> +		 * @fn: The driver callback to lock additional &drm_gem_objects.
> +		 */
> +		int (*fn)(struct drm_gpuvm_exec *vm_exec,
> +			  unsigned int num_fences);
> +
> +		/**
> +		 * @priv: driver private data for the @fn callback
> +		 */
> +		void *priv;
> +	} extra;
> +};
> +
> +/**
> + * drm_gpuvm_prepare_vm() - prepare the GPUVM's common dma-resv
> + * @gpuvm: the &drm_gpuvm
> + * @exec: the &drm_exec context
> + * @num_fences: the amount of &dma_fences to reserve
> + *
> + * Calls drm_exec_prepare_obj() for the GPUVM's dummy &drm_gem_object.
> + *
> + * Using this function directly, it is the driver's responsibility to call
> + * drm_exec_init() and drm_exec_fini() accordingly.
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +static inline int
> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
> +		     struct drm_exec *exec,
> +		     unsigned int num_fences)
> +{
> +	return drm_exec_prepare_obj(exec, &gpuvm->d_obj, num_fences);
> +}
> +
> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> +			      struct drm_exec *exec,
> +			      unsigned int num_fences);
> +
> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
> +			    struct drm_exec *exec,
> +			    u64 addr, u64 range,
> +			    unsigned int num_fences);
> +
> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> +			unsigned int num_fences,
> +			bool interruptible);
> +
> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
> +			      struct drm_gem_object **objs,
> +			      unsigned int num_objs,
> +			      unsigned int num_fences,
> +			      bool interruptible);
> +
> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
> +			      u64 addr, u64 range,
> +			      unsigned int num_fences,
> +			      bool interruptible);
> +
> +/**
> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
> + * @vm_exec: the &drm_gpuvm_exec abstraction
> + *
> + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
> + * through drm_gpuvm_exec_lock() or its variants.
> + */
> +static inline void
> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
> +{
> +	drm_exec_fini(&vm_exec->exec);
> +}
> +
> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> +			      struct drm_exec *exec,
> +			      struct dma_fence *fence,
> +			      enum dma_resv_usage private_usage,
> +			      enum dma_resv_usage extobj_usage);
> +
> +/**
> + * drm_gpuvm_exec_resv_add_fence() - add fence to private and all extobj dma-resv
> + * @vm_exec: the &drm_gpuvm_exec abstraction
> + * @fence: fence to add
> + * @private_usage: private dma-resv usage
> + * @extobj_usage: extobj dma-resv usage
> + *
> + * See drm_gpuvm_resv_add_fence().
> + */
> +static inline void
> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
> +			      struct dma_fence *fence,
> +			      enum dma_resv_usage private_usage,
> +			      enum dma_resv_usage extobj_usage)
> +{
> +	drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
> +				 private_usage, extobj_usage);
> +}
> +
>   /**
>    * struct drm_gpuvm_bo - structure representing a &drm_gpuvm and
>    * &drm_gem_object combination
> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>   			 * gpuva list.
>   			 */
>   			struct list_head gem;
> +
> +			/**
> +			 * @extobj: List entry to attach to the &drm_gpuvm's
> +			 * extobj list.
> +			 */
> +			struct list_head extobj;
> +
> +			/**
> +			 * @evict: List entry to attach to the &drm_gpuvm's evict
> +			 * list.
> +			 */
> +			struct list_head evict;
>   		} entry;
>   	} list;
>   };
> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>   drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>   		  struct drm_gem_object *obj);
>   
> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
> +
>   /**
>    * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of &drm_gpuva
>    * @va__: &drm_gpuva structure to assign to in each iteration step
> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>   	 * used.
>   	 */
>   	int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
> +
> +	/**
> +	 * @bo_validate: called from drm_gpuvm_validate()
> +	 *
> +	 * Drivers receive this callback for every evicted &drm_gem_object being
> +	 * mapped in the corresponding &drm_gpuvm.
> +	 *
> +	 * Typically, drivers would call their driver specific variant of
> +	 * ttm_bo_validate() from within this callback.
> +	 */
> +	int (*bo_validate)(struct drm_gem_object *obj);
>   };
>   
>   int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-12 16:20   ` Thomas Hellström
@ 2023-09-12 16:50     ` Danilo Krummrich
  2023-09-12 19:23       ` Thomas Hellström
  2023-09-13  7:03     ` Boris Brezillon
  1 sibling, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-12 16:50 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
> Hi, Danilo,
> 
> On 9/9/23 17:31, Danilo Krummrich wrote:
> > So far the DRM GPUVA manager offers common infrastructure to track GPU VA
> > allocations and mappings, generically connect GPU VA mappings to their
> > backing buffers and perform more complex mapping operations on the GPU VA
> > space.
> > 
> > However, there are more design patterns commonly used by drivers, which
> > can potentially be generalized in order to make the DRM GPUVA manager
> > represent a basic GPU-VM implementation. In this context, this patch aims
> > at generalizing the following elements.
> > 
> > 1) Provide a common dma-resv for GEM objects not being used outside of
> >     this GPU-VM.
> > 
> > 2) Provide tracking of external GEM objects (GEM objects which are
> >     shared with other GPU-VMs).
> > 
> > 3) Provide functions to efficiently lock all GEM objects dma-resv the
> >     GPU-VM contains mappings of.
> > 
> > 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
> >     of, such that validation of evicted GEM objects is accelerated.
> > 
> > 5) Provide some convenience functions for common patterns.
> > 
> > Rather than being designed as a "framework", the target is to make all
> > features appear as a collection of optional helper functions, such that
> > drivers are free to make use of the DRM GPUVA manager's basic
> > functionality and opt-in for other features without setting any feature
> > flags, just by making use of the corresponding functions.
> > 
> > Big kudos to Boris Brezillon for his help to figure out locking for drivers
> > updating the GPU VA space within the fence signalling path.
> > 
> > Suggested-by: Matthew Brost <matthew.brost@intel.com>
> > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > ---
> >   drivers/gpu/drm/drm_gpuvm.c | 516 ++++++++++++++++++++++++++++++++++++
> >   include/drm/drm_gpuvm.h     | 197 ++++++++++++++
> >   2 files changed, 713 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
> > index f4411047dbb3..8e62a043f719 100644
> > --- a/drivers/gpu/drm/drm_gpuvm.c
> > +++ b/drivers/gpu/drm/drm_gpuvm.c
> > @@ -73,6 +73,21 @@
> >    * &drm_gem_object list of &drm_gpuvm_bos for an existing instance of this
> >    * particular combination. If not existent a new instance is created and linked
> >    * to the &drm_gem_object.
> > + *
> > + * &drm_gpuvm_bo structures, since unique for a given &drm_gpuvm, are also used
> > + * as entry for the &drm_gpuvm's lists of external and evicted objects. Those
> > + * list are maintained in order to accelerate locking of dma-resv locks and
> > + * validation of evicted objects bound in a &drm_gpuvm. For instance the all
> > + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be locked by calling
> > + * drm_gpuvm_exec_lock(). Once locked drivers can call drm_gpuvm_validate() in
> > + * order to validate all evicted &drm_gem_objects. It is also possible to lock
> > + * additional &drm_gem_objects by providing the corresponding parameters to
> > + * drm_gpuvm_exec_lock() as well as open code the &drm_exec loop while making
> > + * use of helper functions such as drm_gpuvm_prepare_range() or
> > + * drm_gpuvm_prepare_objects().
> > + *
> > + * Every bound &drm_gem_object is treated as an external object when its
> > + * &dma_resv structure differs from the &drm_gpuvm's common &dma_resv structure.
> >    */
> >   /**
> > @@ -420,6 +435,20 @@
> >    * Subsequent calls to drm_gpuvm_bo_obtain() for the same &drm_gpuvm and
> >    * &drm_gem_object must be able to observe previous creations and destructions
> >    * of &drm_gpuvm_bos in order to keep instances unique.
> > + *
> > + * The &drm_gpuvm's lists for keeping track of external and evicted objects are
> > + * protected against concurrent insertion / removal and iteration internally.
> > + *
> > + * However, drivers still need to ensure that concurrent calls to functions
> > + * iterating those lists, such as drm_gpuvm_validate() and
> > + * drm_gpuvm_prepare_objects(), are protected. Every such function carries a
> > + * corresponding comment and, where possible, lockdep checks.
> > + *
> > + * Functions adding or removing entries from those lists, such as
> > + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add(), may be called with external
> > + * locks being held, e.g. in order to avoid the corresponding list being
> > + * modified while potentially being iterated by other API functions.
> > + * However, this is entirely optional.
> >    */
> >   /**
> > @@ -632,6 +661,131 @@
> >    *	}
> >    */
> > +/**
> > + * get_next_vm_bo_from_list() - get the next vm_bo element
> > + * @__gpuvm: The GPU VM
> > + * @__list_name: The name of the list we're iterating on
> > + * @__local_list: A pointer to the local list used to store already iterated items
> > + * @__prev_vm_bo: The previous element we got from get_next_vm_bo_from_list()
> > + *
> > + * This helper is here to provide lockless list iteration. Lockless as in, the
> > + * iterator releases the lock immediately after picking the first element from
> > + * the list, so list insertion and deletion can happen concurrently.
> 
> Are the list spinlocks needed for that async state update from within the
> dma-fence critical section we've discussed previously?

Yes, but also for other reasons, see below.

> 
> Otherwise it should be sufficient to protect the lists with the gpuvm's resv
> (or for the extobj list with an outer lock).
> 
> If those spinlocks are still needed in some situations, perhaps could we
> have an option to set them to NULL (Like IIRC the maple tree allows for)?

The evict spinlock is needed in any case, since in drm_gpuvm_bo_evict() we're
holding only the dma-resv lock from the BO this function gets called for. Hence,
the spinlock protects concurrent drm_gpuvm_bo_evict() calls with different BOs.

For extobjs an outer lock would be enough in case of Xe, but I really would not
like to add even more complexity just to get the spinlock out of the way in case
the driver already has an outer lock protecting this path.

> 
> For such drivers, that would require anybody calling unlink to hold the vm's
> resv, though.

In V4 I want to go back to having a dedicated lock for the GEM's gpuva list (or
VM_BO list to be more precise). We can't just use the dma-resv lock for that
with VM_BO abstractions, because on destruction of a VM_BO we otherwise wouldn't
be allowed to already hold the dma-resv lock. That's the fix I was referring to
earlier.

> 
> It seems that with that also the refcount could be make non-atomic.
> 
> All in the spirit of the drm locking guidelines "use big locks when
> possible".
> Lower level locks only when necessary for performance or locking inversion?
> 
> /Thomas
> 
> 
> > + *
> > + * Elements popped from the original list are kept in a local list, so removal
> > + * and is_empty checks can still happen while we're iterating the list.
> > + */
> > +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
> > +	({										\
> > +		struct drm_gpuvm_bo *__vm_bo;						\
> > +											\
> > +		drm_gpuvm_bo_put(__prev_vm_bo);						\
> > +											\
> > +		spin_lock(&(__gpuvm)->__list_name.lock);				\
> > +		while (!list_empty(&(__gpuvm)->__list_name.list)) {			\
> > +			__vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,	\
> > +						   struct drm_gpuvm_bo,			\
> > +						   list.entry.__list_name);		\
> > +			if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {			\
> > +				list_move_tail(&(__vm_bo)->list.entry.__list_name,	\
> > +					       __local_list);				\
> > +				break;							\
> > +			} else {							\
> > +				list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> > +				__vm_bo = NULL;						\
> > +			}								\
> > +		}									\
> > +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> > +											\
> > +		__vm_bo;								\
> > +	})
> > +
> > +/**
> > + * for_each_vm_bo_in_list() - internal vm_bo list iterator
> > + *
> > + * This helper is here to provide lockless list iteration. Lockless as in, the
> > + * iterator releases the lock immediately after picking the first element from the
> > + * list, so list insertion and deletion can happen concurrently.
> > + *
> > + * Typical use:
> > + *
> > + *	struct drm_gpuvm_bo *vm_bo;
> > + *	LIST_HEAD(my_local_list);
> > + *
> > + *	ret = 0;
> > + *	drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>, &my_local_list, vm_bo) {
> > + *		ret = do_something_with_vm_bo(..., vm_bo);
> > + *		if (ret)
> > + *			break;
> > + *	}
> > + *	drm_gpuvm_bo_put(vm_bo);
> > + *	drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);
> > + *
> > + *
> > + * Only used for internal list iterations, not meant to be exposed to the outside
> > + * world.
> > + */
> > +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)	\
> > +	for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> > +						__local_list, NULL);		\
> > +	     __vm_bo;								\
> > +	     __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> > +						__local_list, __vm_bo))		\
> > +
> > +/**
> > + * restore_vm_bo_list() - move vm_bo elements back to their original list
> > + * @__gpuvm: The GPU VM
> > + * @__list_name: The name of the list we're iterating on
> > + * @__local_list: A pointer to the local list used to store already iterated items
> > + *
> > + * When we're done iterating a vm_bo list, we should call restore_vm_bo_list()
> > + * to restore the original state and let new iterations take place.
> > + */
> > +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)				\
> > +	do {										\
> > +		/* Merge back the two lists, moving local list elements to the		\
> > +		 * head to preserve previous ordering, in case it matters.		\
> > +		 */									\
> > +		spin_lock(&(__gpuvm)->__list_name.lock);				\
> > +		list_splice(__local_list, &(__gpuvm)->__list_name.list);		\
> > +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> > +	} while (0)
> > +/**
> > + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given list
> > + * @__vm_bo: the &drm_gpuvm_bo
> > + * @__list_name: the name of the list to insert into
> > + *
> > + * Inserts the given @__vm_bo into the list specified by @__list_name and
> > + * increases the vm_bo's reference count.
> > + */
> > +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)				\
> > +	do {									\
> > +		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
> > +		if (list_empty(&(__vm_bo)->list.entry.__list_name))		\
> > +			list_add_tail(&(__vm_bo)->list.entry.__list_name,	\
> > +				      &(__vm_bo)->vm->__list_name.list);	\
> > +		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
> > +	} while (0)
> > +
> > +/**
> > + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given list
> > + * @__vm_bo: the &drm_gpuvm_bo
> > + * @__list_name: the name of the list to insert into
> > + *
> > + * Removes the given @__vm_bo from the list specified by @__list_name and
> > + * decreases the vm_bo's reference count.
> > + */
> > +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)				\
> > +	do {									\
> > +		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
> > +		if (!list_empty(&(__vm_bo)->list.entry.__list_name))		\
> > +			list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> > +		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
> > +	} while (0)
> > +
> > +static int __must_check
> > +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
> > +
> >   #define to_drm_gpuva(__node)	container_of((__node), struct drm_gpuva, rb.node)
> >   #define GPUVA_START(node) ((node)->va.addr)
> > @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
> >   	gpuvm->rb.tree = RB_ROOT_CACHED;
> >   	INIT_LIST_HEAD(&gpuvm->rb.list);
> > +	INIT_LIST_HEAD(&gpuvm->extobj.list);
> > +	spin_lock_init(&gpuvm->extobj.lock);
> > +
> > +	INIT_LIST_HEAD(&gpuvm->evict.list);
> > +	spin_lock_init(&gpuvm->evict.lock);
> > +
> >   	drm_gpuva_check_overflow(start_offset, range);
> >   	gpuvm->mm_start = start_offset;
> >   	gpuvm->mm_range = range;
> > @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
> >   	WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
> >   	     "GPUVA tree is not empty, potentially leaking memory.\n");
> > +	WARN(!list_empty(&gpuvm->extobj.list), "Extobj list should be empty.\n");
> > +	WARN(!list_empty(&gpuvm->evict.list), "Evict list should be empty.\n");
> > +
> >   	drm_gem_private_object_fini(&gpuvm->d_obj);
> >   }
> >   EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
> > +/**
> > + * drm_gpuvm_prepare_objects() - prepare all associated BOs
> > + * @gpuvm: the &drm_gpuvm
> > + * @exec: the &drm_exec locking context
> > + * @num_fences: the amount of &dma_fences to reserve
> > + *
> > + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
> > + * &drm_gpuvm contains mappings of.
> > + *
> > + * Using this function directly, it is the driver's responsibility to call
> > + * drm_exec_init() and drm_exec_fini() accordingly.
> > + *
> > + * Note: This function is safe against concurrent insertion and removal of
> > + * external objects, however it is not safe against concurrent usage itself.
> > + *
> > + * Drivers need to make sure to protect this case with either an outer VM lock
> > + * or by calling drm_gpuvm_prepare_vm() before this function within the
> > + * drm_exec_until_all_locked() loop, such that the GPUVM's dma-resv lock ensures
> > + * mutual exclusion.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + */
> > +int
> > +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> > +			  struct drm_exec *exec,
> > +			  unsigned int num_fences)
> > +{
> > +	struct drm_gpuvm_bo *vm_bo;
> > +	LIST_HEAD(extobjs);
> > +	int ret = 0;
> > +
> > +	for_each_vm_bo_in_list(gpuvm, extobj, &extobjs, vm_bo) {
> > +		ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
> > +		if (ret)
> > +			break;
> > +	}
> > +	/* Drop ref in case we break out of the loop. */
> > +	drm_gpuvm_bo_put(vm_bo);
> > +	restore_vm_bo_list(gpuvm, extobj, &extobjs);
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
> > +
> > +/**
> > + * drm_gpuvm_prepare_range() - prepare all BOs mapped within a given range
> > + * @gpuvm: the &drm_gpuvm
> > + * @exec: the &drm_exec locking context
> > + * @addr: the start address within the VA space
> > + * @range: the range to iterate within the VA space
> > + * @num_fences: the amount of &dma_fences to reserve
> > + *
> > + * Calls drm_exec_prepare_obj() for all &drm_gem_objects mapped between @addr
> > + * and @addr + @range.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + */
> > +int
> > +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
> > +			u64 addr, u64 range, unsigned int num_fences)
> > +{
> > +	struct drm_gpuva *va;
> > +	u64 end = addr + range;
> > +	int ret;
> > +
> > +	drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
> > +		struct drm_gem_object *obj = va->gem.obj;
> > +
> > +		ret = drm_exec_prepare_obj(exec, obj, num_fences);
> > +		if (ret)
> > +			return ret;
> > +	}
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
> > +
> > +/**
> > + * drm_gpuvm_exec_lock() - lock all dma-resv of all assoiciated BOs
> > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > + * @num_fences: the amount of &dma_fences to reserve
> > + * @interruptible: sleep interruptible if waiting
> > + *
> > + * Acquires all dma-resv locks of all &drm_gem_objects the given
> > + * &drm_gpuvm contains mappings of.
> > + *
> > + * Addionally, when calling this function with struct drm_gpuvm_exec::extra
> > + * being set the driver receives the given @fn callback to lock additional
> > + * dma-resv in the context of the &drm_gpuvm_exec instance. Typically, drivers
> > + * would call drm_exec_prepare_obj() from within this callback.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + */
> > +int
> > +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> > +		    unsigned int num_fences,
> > +		    bool interruptible)
> > +{
> > +	struct drm_gpuvm *gpuvm = vm_exec->vm;
> > +	struct drm_exec *exec = &vm_exec->exec;
> > +	uint32_t flags;
> > +	int ret;
> > +
> > +	flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
> > +		DRM_EXEC_IGNORE_DUPLICATES;
> > +
> > +	drm_exec_init(exec, flags);
> > +
> > +	drm_exec_until_all_locked(exec) {
> > +		ret = drm_gpuvm_prepare_vm(gpuvm, exec, num_fences);
> > +		drm_exec_retry_on_contention(exec);
> > +		if (ret)
> > +			goto err;
> > +
> > +		ret = drm_gpuvm_prepare_objects(gpuvm, exec, num_fences);
> > +		drm_exec_retry_on_contention(exec);
> > +		if (ret)
> > +			goto err;
> > +
> > +		if (vm_exec->extra.fn) {
> > +			ret = vm_exec->extra.fn(vm_exec, num_fences);
> > +			drm_exec_retry_on_contention(exec);
> > +			if (ret)
> > +				goto err;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +
> > +err:
> > +	drm_exec_fini(exec);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
> > +
> > +static int
> > +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int num_fences)
> > +{
> > +	struct {
> > +		struct drm_gem_object **objs;
> > +		unsigned int num_objs;
> > +	} *args = vm_exec->extra.priv;
> > +
> > +	return drm_exec_prepare_array(&vm_exec->exec, args->objs,
> > +				      args->num_objs, num_fences);
> > +}
> > +
> > +/**
> > + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all assoiciated BOs
> > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > + * @objs: additional &drm_gem_objects to lock
> > + * @num_objs: the number of additional &drm_gem_objects to lock
> > + * @num_fences: the amount of &dma_fences to reserve
> > + * @interruptible: sleep interruptible if waiting
> > + *
> > + * Acquires all dma-resv locks of all &drm_gem_objects the given &drm_gpuvm
> > + * contains mappings of, plus the ones given through @objs.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + */
> > +int
> > +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
> > +			  struct drm_gem_object **objs,
> > +			  unsigned int num_objs,
> > +			  unsigned int num_fences,
> > +			  bool interruptible)
> > +{
> > +	struct {
> > +		struct drm_gem_object **objs;
> > +		unsigned int num_objs;
> > +	} args;
> > +
> > +	args.objs = objs;
> > +	args.num_objs = num_objs;
> > +
> > +	vm_exec->extra.fn = fn_lock_array;
> > +	vm_exec->extra.priv = &args;
> > +
> > +	return drm_gpuvm_exec_lock(vm_exec, num_fences, interruptible);
> > +}
> > +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
> > +
> > +/**
> > + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped within a given range
> > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > + * @addr: the start address within the VA space
> > + * @range: the range to iterate within the VA space
> > + * @num_fences: the amount of &dma_fences to reserve
> > + * @interruptible: sleep interruptible if waiting
> > + *
> > + * Acquires all dma-resv locks of all &drm_gem_objects mapped between @addr and
> > + * @addr + @range.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + */
> > +int
> > +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
> > +			  u64 addr, u64 range,
> > +			  unsigned int num_fences,
> > +			  bool interruptible)
> > +{
> > +	struct drm_gpuvm *gpuvm = vm_exec->vm;
> > +	struct drm_exec *exec = &vm_exec->exec;
> > +	uint32_t flags;
> > +	int ret;
> > +
> > +	flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
> > +		DRM_EXEC_IGNORE_DUPLICATES;
> > +
> > +	drm_exec_init(exec, flags);
> > +
> > +	drm_exec_until_all_locked(exec) {
> > +		ret = drm_gpuvm_prepare_range(gpuvm, exec, addr, range,
> > +					      num_fences);
> > +		drm_exec_retry_on_contention(exec);
> > +		if (ret)
> > +			goto err;
> > +	}
> > +
> > +	return ret;
> > +
> > +err:
> > +	drm_exec_fini(exec);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
> > +
> > +/**
> > + * drm_gpuvm_validate() - validate all BOs marked as evicted
> > + * @gpuvm: the &drm_gpuvm to validate evicted BOs
> > + *
> > + * Calls the &drm_gpuvm_ops.bo_validate callback for all evicted buffer
> > + * objects being mapped in the given &drm_gpuvm.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + */
> > +int
> > +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
> > +{
> > +	const struct drm_gpuvm_ops *ops = gpuvm->ops;
> > +	struct drm_gpuvm_bo *vm_bo;
> > +	LIST_HEAD(evict);
> > +	int ret = 0;
> > +
> > +	if (unlikely(!ops || !ops->bo_validate))
> > +		return -ENOTSUPP;
> > +
> > +	for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
> > +		dma_resv_assert_held(vm_bo->obj->resv);
> > +		ret = ops->bo_validate(vm_bo->obj);
> > +		if (ret)
> > +			break;
> > +	}
> > +	/* Drop ref in case we break out of the loop. */
> > +	drm_gpuvm_bo_put(vm_bo);
> > +	restore_vm_bo_list(gpuvm, evict, &evict);
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
> > +
> > +/**
> > + * drm_gpuvm_resv_add_fence - add fence to private and all extobj
> > + * dma-resv
> > + * @gpuvm: the &drm_gpuvm to add a fence to
> > + * @exec: the &drm_exec locking context
> > + * @fence: fence to add
> > + * @private_usage: private dma-resv usage
> > + * @extobj_usage: extobj dma-resv usage
> > + */
> > +void
> > +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> > +			 struct drm_exec *exec,
> > +			 struct dma_fence *fence,
> > +			 enum dma_resv_usage private_usage,
> > +			 enum dma_resv_usage extobj_usage)
> > +{
> > +	struct drm_gem_object *obj;
> > +	unsigned long index;
> > +
> > +	drm_exec_for_each_locked_object(exec, index, obj) {
> > +		dma_resv_assert_held(obj->resv);
> > +		dma_resv_add_fence(obj->resv, fence,
> > +				   drm_gpuvm_is_extobj(gpuvm, obj) ?
> > +				   private_usage : extobj_usage);
> > +	}
> > +}
> > +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
> > +
> >   /**
> >    * drm_gpuvm_bo_create() - create a new instance of struct drm_gpuvm_bo
> >    * @gpuvm: The &drm_gpuvm the @obj is mapped in.
> > @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
> >   	INIT_LIST_HEAD(&vm_bo->list.gpuva);
> >   	INIT_LIST_HEAD(&vm_bo->list.entry.gem);
> > +	INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
> > +	INIT_LIST_HEAD(&vm_bo->list.entry.evict);
> > +
> >   	drm_gem_object_get(obj);
> >   	return vm_bo;
> > @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
> >   	drm_gem_gpuva_assert_lock_held(vm_bo->obj);
> > +	spin_lock(&gpuvm->extobj.lock);
> > +	list_del(&vm_bo->list.entry.extobj);
> > +	spin_unlock(&gpuvm->extobj.lock);
> > +
> > +	spin_lock(&gpuvm->evict.lock);
> > +	list_del(&vm_bo->list.entry.evict);
> > +	spin_unlock(&gpuvm->evict.lock);
> > +
> >   	list_del(&vm_bo->list.entry.gem);
> >   	drm_gem_object_put(obj);
> > @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
> >    * @vm_bo: the &drm_gpuvm_bo to release the reference of
> >    *
> >    * This releases a reference to @vm_bo.
> > + *
> > + * If the reference count drops to zero, the &gpuvm_bo is destroyed, which
> > + * includes removing it from the GEMs gpuva list. Hence, if a call to this
> > + * function can potentially let the reference count to zero the caller must
> > + * hold the dma-resv or driver specific GEM gpuva lock.
> >    */
> >   void
> >   drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
> > @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
> >   }
> >   EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
> > +static int __must_check
> > +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
> > +{
> > +	return kref_get_unless_zero(&vm_bo->kref);
> > +}
> > +
> >   static struct drm_gpuvm_bo *
> >   __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
> >   		    struct drm_gem_object *obj)
> > @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *__vm_bo)
> >   }
> >   EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
> > +/**
> > + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its &drm_gpuvm's
> > + * extobj list
> > + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's the extobj list.
> > + *
> > + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if not on the list
> > + * already and if the corresponding &drm_gem_object is an external object,
> > + * actually.
> > + */
> > +void
> > +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
> > +{
> > +	struct drm_gpuvm *gpuvm = vm_bo->vm;
> > +
> > +	if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
> > +		drm_gpuvm_bo_list_add(vm_bo, extobj);
> > +}
> > +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
> > +
> > +/**
> > + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to / from a
> > + * &drm_gpuvms evicted list
> > + * @obj: the &drm_gem_object to add or remove
> > + * @evict: indicates whether the object is evicted
> > + *
> > + * Adds a &drm_gem_object to or removes it from all &drm_gpuvms evicted
> > + * list containing a mapping of this &drm_gem_object.
> > + */
> > +void
> > +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
> > +{
> > +	struct drm_gpuvm_bo *vm_bo;
> > +
> > +	drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
> > +		if (evict)
> > +			drm_gpuvm_bo_list_add(vm_bo, evict);
> > +		else
> > +			drm_gpuvm_bo_list_del(vm_bo, evict);
> > +	}
> > +}
> > +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
> > +
> >   static int
> >   __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
> >   		   struct drm_gpuva *va)
> > diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
> > index afa50b9059a2..834bb6d6617e 100644
> > --- a/include/drm/drm_gpuvm.h
> > +++ b/include/drm/drm_gpuvm.h
> > @@ -26,10 +26,12 @@
> >    */
> >   #include <linux/list.h>
> > +#include <linux/dma-resv.h>
> >   #include <linux/rbtree.h>
> >   #include <linux/types.h>
> >   #include <drm/drm_gem.h>
> > +#include <drm/drm_exec.h>
> >   struct drm_gpuvm;
> >   struct drm_gpuvm_bo;
> > @@ -259,6 +261,38 @@ struct drm_gpuvm {
> >   	 * space
> >   	 */
> >   	struct dma_resv *resv;
> > +
> > +	/**
> > +	 * @extobj: structure holding the extobj list
> > +	 */
> > +	struct {
> > +		/**
> > +		 * @list: &list_head storing &drm_gpuvm_bos serving as
> > +		 * external object
> > +		 */
> > +		struct list_head list;
> > +
> > +		/**
> > +		 * @lock: spinlock to protect the extobj list
> > +		 */
> > +		spinlock_t lock;
> > +	} extobj;
> > +
> > +	/**
> > +	 * @evict: structure holding the evict list and evict list lock
> > +	 */
> > +	struct {
> > +		/**
> > +		 * @list: &list_head storing &drm_gpuvm_bos currently being
> > +		 * evicted
> > +		 */
> > +		struct list_head list;
> > +
> > +		/**
> > +		 * @lock: spinlock to protect the evict list
> > +		 */
> > +		spinlock_t lock;
> > +	} evict;
> >   };
> >   void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
> > @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
> >   		    const struct drm_gpuvm_ops *ops);
> >   void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
> > +/**
> > + * drm_gpuvm_is_extobj() - indicates whether the given &drm_gem_object is an
> > + * external object
> > + * @gpuvm: the &drm_gpuvm to check
> > + * @obj: the &drm_gem_object to check
> > + *
> > + * Returns: true if the &drm_gem_object &dma_resv differs from the
> > + * &drm_gpuvms &dma_resv, false otherwise
> > + */
> > +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
> > +				       struct drm_gem_object *obj)
> > +{
> > +	return obj && obj->resv != gpuvm->resv;
> > +}
> > +
> >   static inline struct drm_gpuva *
> >   __drm_gpuva_next(struct drm_gpuva *va)
> >   {
> > @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
> >   #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
> >   	list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
> > +/**
> > + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of &drm_exec
> > + *
> > + * This structure should be created on the stack as &drm_exec should be.
> > + *
> > + * Optionally, @extra can be set in order to lock additional &drm_gem_objects.
> > + */
> > +struct drm_gpuvm_exec {
> > +	/**
> > +	 * @exec: the &drm_exec structure
> > +	 */
> > +	struct drm_exec exec;
> > +
> > +	/**
> > +	 * @vm: the &drm_gpuvm to lock its DMA reservations
> > +	 */
> > +	struct drm_gpuvm *vm;
> > +
> > +	/**
> > +	 * @extra: Callback and corresponding private data for the driver to
> > +	 * lock arbitrary additional &drm_gem_objects.
> > +	 */
> > +	struct {
> > +		/**
> > +		 * @fn: The driver callback to lock additional &drm_gem_objects.
> > +		 */
> > +		int (*fn)(struct drm_gpuvm_exec *vm_exec,
> > +			  unsigned int num_fences);
> > +
> > +		/**
> > +		 * @priv: driver private data for the @fn callback
> > +		 */
> > +		void *priv;
> > +	} extra;
> > +};
> > +
> > +/**
> > + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
> > + * @gpuvm: the &drm_gpuvm
> > + * @exec: the &drm_exec context
> > + * @num_fences: the amount of &dma_fences to reserve
> > + *
> > + * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object.
> > + *
> > + * Using this function directly, it is the drivers responsibility to call
> > + * drm_exec_init() and drm_exec_fini() accordingly.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + */
> > +static inline int
> > +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
> > +		     struct drm_exec *exec,
> > +		     unsigned int num_fences)
> > +{
> > +	return drm_exec_prepare_obj(exec, &gpuvm->d_obj, num_fences);
> > +}
> > +
> > +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> > +			      struct drm_exec *exec,
> > +			      unsigned int num_fences);
> > +
> > +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
> > +			    struct drm_exec *exec,
> > +			    u64 addr, u64 range,
> > +			    unsigned int num_fences);
> > +
> > +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> > +			unsigned int num_fences,
> > +			bool interruptible);
> > +
> > +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
> > +			      struct drm_gem_object **objs,
> > +			      unsigned int num_objs,
> > +			      unsigned int num_fences,
> > +			      bool interruptible);
> > +
> > +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
> > +			      u64 addr, u64 range,
> > +			      unsigned int num_fences,
> > +			      bool interruptible);
> > +
> > +/**
> > + * drm_gpuvm_lock() - lock all dma-resv of all assoiciated BOs
> > + * @gpuvm: the &drm_gpuvm
> > + *
> > + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
> > + * through drm_gpuvm_lock() or its variants.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + */
> > +static inline void
> > +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
> > +{
> > +	drm_exec_fini(&vm_exec->exec);
> > +}
> > +
> > +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
> > +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> > +			      struct drm_exec *exec,
> > +			      struct dma_fence *fence,
> > +			      enum dma_resv_usage private_usage,
> > +			      enum dma_resv_usage extobj_usage);
> > +
> > +/**
> > + * drm_gpuvm_exec_resv_add_fence()
> > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > + * @fence: fence to add
> > + * @private_usage: private dma-resv usage
> > + * @extobj_usage: extobj dma-resv usage
> > + *
> > + * See drm_gpuvm_resv_add_fence().
> > + */
> > +static inline void
> > +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
> > +			      struct dma_fence *fence,
> > +			      enum dma_resv_usage private_usage,
> > +			      enum dma_resv_usage extobj_usage)
> > +{
> > +	drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
> > +				 private_usage, extobj_usage);
> > +}
> > +
> >   /**
> >    * struct drm_gpuvm_bo - structure representing a &drm_gpuvm and
> >    * &drm_gem_object combination
> > @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
> >   			 * gpuva list.
> >   			 */
> >   			struct list_head gem;
> > +
> > +			/**
> > +			 * @evict: List entry to attach to the &drm_gpuvms
> > +			 * extobj list.
> > +			 */
> > +			struct list_head extobj;
> > +
> > +			/**
> > +			 * @evict: List entry to attach to the &drm_gpuvms evict
> > +			 * list.
> > +			 */
> > +			struct list_head evict;
> >   		} entry;
> >   	} list;
> >   };
> > @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
> >   drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
> >   		  struct drm_gem_object *obj);
> > +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
> > +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
> > +
> >   /**
> >    * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of &drm_gpuva
> >    * @va__: &drm_gpuva structure to assign to in each iteration step
> > @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
> >   	 * used.
> >   	 */
> >   	int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
> > +
> > +	/**
> > +	 * @bo_validate: called from drm_gpuvm_validate()
> > +	 *
> > +	 * Drivers receive this callback for every evicted &drm_gem_object being
> > +	 * mapped in the corresponding &drm_gpuvm.
> > +	 *
> > +	 * Typically, drivers would call their driver specific variant of
> > +	 * ttm_bo_validate() from within this callback.
> > +	 */
> > +	int (*bo_validate)(struct drm_gem_object *obj);
> >   };
> >   int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-12 16:50     ` Danilo Krummrich
@ 2023-09-12 19:23       ` Thomas Hellström
  2023-09-12 23:36         ` Danilo Krummrich
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-12 19:23 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel


On 9/12/23 18:50, Danilo Krummrich wrote:
> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
>> Hi, Danilo,
>>
>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>> So far the DRM GPUVA manager offers common infrastructure to track GPU VA
>>> allocations and mappings, generically connect GPU VA mappings to their
>>> backing buffers and perform more complex mapping operations on the GPU VA
>>> space.
>>>
>>> However, there are more design patterns commonly used by drivers, which
>>> can potentially be generalized in order to make the DRM GPUVA manager
>>> represent a basic GPU-VM implementation. In this context, this patch aims
>>> at generalizing the following elements.
>>>
>>> 1) Provide a common dma-resv for GEM objects not being used outside of
>>>      this GPU-VM.
>>>
>>> 2) Provide tracking of external GEM objects (GEM objects which are
>>>      shared with other GPU-VMs).
>>>
>>> 3) Provide functions to efficiently lock all GEM objects dma-resv the
>>>      GPU-VM contains mappings of.
>>>
>>> 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
>>>      of, such that validation of evicted GEM objects is accelerated.
>>>
>>> 5) Provide some convinience functions for common patterns.
>>>
>>> Rather than being designed as a "framework", the target is to make all
>>> features appear as a collection of optional helper functions, such that
>>> drivers are free to make use of the DRM GPUVA managers basic
>>> functionality and opt-in for other features without setting any feature
>>> flags, just by making use of the corresponding functions.
>>>
>>> Big kudos to Boris Brezillon for his help to figure out locking for drivers
>>> updating the GPU VA space within the fence signalling path.
>>>
>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>> ---
>>>    drivers/gpu/drm/drm_gpuvm.c | 516 ++++++++++++++++++++++++++++++++++++
>>>    include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>    2 files changed, 713 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
>>> index f4411047dbb3..8e62a043f719 100644
>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>> @@ -73,6 +73,21 @@
>>>     * &drm_gem_object list of &drm_gpuvm_bos for an existing instance of this
>>>     * particular combination. If not existent a new instance is created and linked
>>>     * to the &drm_gem_object.
>>> + *
>>> + * &drm_gpuvm_bo structures, since unique for a given &drm_gpuvm, are also used
>>> + * as entry for the &drm_gpuvm's lists of external and evicted objects. Those
>>> + * list are maintained in order to accelerate locking of dma-resv locks and
>>> + * validation of evicted objects bound in a &drm_gpuvm. For instance the all
>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be locked by calling
>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call drm_gpuvm_validate() in
>>> + * order to validate all evicted &drm_gem_objects. It is also possible to lock
>>> + * additional &drm_gem_objects by providing the corresponding parameters to
>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec loop while making
>>> + * use of helper functions such as drm_gpuvm_prepare_range() or
>>> + * drm_gpuvm_prepare_objects().
>>> + *
>>> + * Every bound &drm_gem_object is treated as external object when its &dma_resv
>>> + * structure is different than the &drm_gpuvm's common &dma_resv structure.
>>>     */
>>>    /**
>>> @@ -420,6 +435,20 @@
>>>     * Subsequent calls to drm_gpuvm_bo_obtain() for the same &drm_gpuvm and
>>>     * &drm_gem_object must be able to observe previous creations and destructions
>>>     * of &drm_gpuvm_bos in order to keep instances unique.
>>> + *
>>> + * The &drm_gpuvm's lists for keeping track of external and evicted objects are
>>> + * protected against concurrent insertion / removal and iteration internally.
>>> + *
>>> + * However, drivers still need ensure to protect concurrent calls to functions
>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>> + * drm_gpuvm_prepare_objects(). Every such function contains a particular
>>> + * comment and lockdep checks if possible.
>>> + *
>>> + * Functions adding or removing entries from those lists, such as
>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be called with external
>>> + * locks being held, e.g. in order to avoid the corresponding list to be
>>> + * (safely) modified while potentially being iternated by other API functions.
>>> + * However, this is entirely optional.
>>>     */
>>>    /**
>>> @@ -632,6 +661,131 @@
>>>     *	}
>>>     */
>>> +/**
>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>> + * @__gpuvm: The GPU VM
>>> + * @__list_name: The name of the list we're iterating on
>>> + * @__local_list: A pointer to the local list used to store already iterated items
>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
>>> + *
>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
>>> + * iterator releases the lock immediately after picking the first element from
>>> + * the list, so list insertion deletion can happen concurrently.
>> Are the list spinlocks needed for that async state update from within the
>> dma-fence critical section we've discussed previously?
> Yes, but also for other reasons, see below.
>
>> Otherwise it should be sufficient to protect the lists with the gpuvm's resv
>> (or for the extobj list with an outer lock).
>>
>> If those spinlocks are still needed in some situations, perhaps could we
>> have an option to set them to NULL (Like IIRC the maple tree allows for)?
> The evict spinlock is needed in any case, since in drm_gpuvm_bo_evict() we're
> holding only the dma-resv lock from the BO this function gets called for. Hence,
> the spinlock protects concurrent drm_gpuvm_bo_evict() calls with different BOs.
No. Only if you try to add external objects to the vm's evict list from 
within the evict code. That's not necessary since you loop through all 
external objects anyway when locking them, so an "evicted" bool in the 
vm_bo, protected by the bo resv, would be sufficient. The extobj locking 
loop can then add the bo to the evicted list.
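
Roughly something like this (sketch only, the "evicted" member is made up):

/* Under the BO's dma-resv only: just record the state. */
static void my_gpuvm_bo_set_evicted(struct drm_gpuvm_bo *vm_bo, bool evict)
{
	dma_resv_assert_held(vm_bo->obj->resv);
	vm_bo->evicted = evict;		/* hypothetical member */
}

/* In the extobj locking loop, once the BO's resv is locked, move it over. */
static void my_gpuvm_add_evicted(struct drm_gpuvm *gpuvm,
				 struct drm_gpuvm_bo *vm_bo)
{
	if (vm_bo->evicted)
		list_move_tail(&vm_bo->list.entry.evict, &gpuvm->evict.list);
}
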
>
> For extobjs an outer lock would be enough in case of Xe, but I really would not
> like to add even more complexity just to get the spinlock out of the way in case
> the driver already has an outer lock protecting this path.

I must disagree here. These spinlocks and atomic operations are pretty 
costly, and as discussed earlier this type of locking was the reason (at 
least according to the commit message) that made Christian drop the 
XArray use in drm_exec for the same set of objects: "The locking 
overhead is unnecessary and measurable". IMHO the spinlock is the added 
complexity, and a single wide lock following the drm locking guidelines 
set out by Daniel and David should really be the default choice, with an 
opt-in for a spinlock when it is needed for async updates and pushing 
out to a wq is not an option.

A pretty simple way that would not add much code would be

static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
				 spinlock_t *lock)
{
	if (!gpuvm->resv_protected_lists)
		spin_lock(lock);
}
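
plus the matching unlock helper, with the list add / del / iteration helpers
calling these instead of taking the spinlock unconditionally
(resv_protected_lists being an opt-in flag set at init time):

static void gpuvm_cond_spin_unlock(const struct drm_gpuvm *gpuvm,
				   spinlock_t *lock)
{
	if (!gpuvm->resv_protected_lists)
		spin_unlock(lock);
}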

>> For such drivers, that would require anybody calling unlink to hold the vm's
>> resv, though.
> In V4 I want to go back to having a dedicated lock for the GEMs gpuva list (or
> VM_BO list to be more precise). We can't just use the dma-resv lock for that
> with VM_BO abstractions, because on destruction of a VM_BO we otherwise wouldn't
> be allowed to already hold the dma-resv lock. That's the fix I was referring to
> earlier.

Yeah, I can see the need for a dedicated lock for the GEM's gpuva list, 
but holding the vm's dma-resv lock across the unlink shouldn't be a 
problem. We may free the object and a pointer to the vm's resv during 
unlink, but we don't free the vm's resv. It'd be a matter of ensuring 
that any calls to unlink from *within* drm_gpuvm allow it to be held.
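
In code, roughly (sketch, assuming the caller is outside the fence
signalling path):

dma_resv_lock(gpuvm->resv, NULL);
drm_gpuva_unlink(va);	/* may drop the last vm_bo / GEM reference */
dma_resv_unlock(gpuvm->resv);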

/Thomas


>> It seems that with that also the refcount could be made non-atomic.
>>
>> All in the spirit of the drm locking guidelines "use big locks when
>> possible".
>> Lower level locks only when necessary for performance or locking inversion?
>>
>> /Thomas
>>
>>
>>> + *
>>> + * Elements popped from the original list are kept in a local list, so removal
>>> + * and is_empty checks can still happen while we're iterating the list.
>>> + */
>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
>>> +	({										\
>>> +		struct drm_gpuvm_bo *__vm_bo;						\
>>> +											\
>>> +		drm_gpuvm_bo_put(__prev_vm_bo);						\
>>> +											\
>>> +		spin_lock(&(__gpuvm)->__list_name.lock);				\
>>> +		while (!list_empty(&(__gpuvm)->__list_name.list)) {			\
>>> +			__vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,	\
>>> +						   struct drm_gpuvm_bo,			\
>>> +						   list.entry.__list_name);		\
>>> +			if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {			\
>>> +				list_move_tail(&(__vm_bo)->list.entry.__list_name,	\
>>> +					       __local_list);				\
>>> +				break;							\
>>> +			} else {							\
>>> +				list_del_init(&(__vm_bo)->list.entry.__list_name);	\
>>> +				__vm_bo = NULL;						\
>>> +			}								\
>>> +		}									\
>>> +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
>>> +											\
>>> +		__vm_bo;								\
>>> +	})
>>> +
>>> +/**
>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>> + *
>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
>>> + * iterator releases the lock immediately after picking the first element from the
>>> + * list, so list insertion and deletion can happen concurrently.
>>> + *
>>> + * Typical use:
>>> + *
>>> + *	struct drm_gpuvm_bo *vm_bo;
>>> + *	LIST_HEAD(my_local_list);
>>> + *
>>> + *	ret = 0;
>>> + *	drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>, &my_local_list, vm_bo) {
>>> + *		ret = do_something_with_vm_bo(..., vm_bo);
>>> + *		if (ret)
>>> + *			break;
>>> + *	}
>>> + *	drm_gpuvm_bo_put(vm_bo);
>>> + *	drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);
>>> + *
>>> + *
>>> + * Only used for internal list iterations, not meant to be exposed to the outside
>>> + * world.
>>> + */
>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)	\
>>> +	for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
>>> +						__local_list, NULL);		\
>>> +	     __vm_bo;								\
>>> +	     __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
>>> +						__local_list, __vm_bo))		\
>>> +
>>> +/**
>>> + * restore_vm_bo_list() - move vm_bo elements back to their original list
>>> + * @__gpuvm: The GPU VM
>>> + * @__list_name: The name of the list we're iterating on
>>> + * @__local_list: A pointer to the local list used to store already iterated items
>>> + *
>>> + * When we're done iterating a vm_bo list, we should call restore_vm_bo_list()
>>> + * to restore the original state and let new iterations take place.
>>> + */
>>> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)				\
>>> +	do {										\
>>> +		/* Merge back the two lists, moving local list elements to the		\
>>> +		 * head to preserve previous ordering, in case it matters.		\
>>> +		 */									\
>>> +		spin_lock(&(__gpuvm)->__list_name.lock);				\
>>> +		list_splice(__local_list, &(__gpuvm)->__list_name.list);		\
>>> +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
>>> +	} while (0)
>>> +/**
>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given list
>>> + * @__vm_bo: the &drm_gpuvm_bo
>>> + * @__list_name: the name of the list to insert into
>>> + *
>>> + * Inserts the given @__vm_bo into the list specified by @__list_name and
>>> + * increases the vm_bo's reference count.
>>> + */
>>> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)				\
>>> +	do {									\
>>> +		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
>>> +		if (list_empty(&(__vm_bo)->list.entry.__list_name))		\
>>> +			list_add_tail(&(__vm_bo)->list.entry.__list_name,	\
>>> +				      &(__vm_bo)->vm->__list_name.list);	\
>>> +		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
>>> +	} while (0)
>>> +
>>> +/**
>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given list
>>> + * @__vm_bo: the &drm_gpuvm_bo
>>> + * @__list_name: the name of the list to insert into
>>> + *
>>> + * Removes the given @__vm_bo from the list specified by @__list_name and
>>> + * decreases the vm_bo's reference count.
>>> + */
>>> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)				\
>>> +	do {									\
>>> +		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
>>> +		if (!list_empty(&(__vm_bo)->list.entry.__list_name))		\
>>> +			list_del_init(&(__vm_bo)->list.entry.__list_name);	\
>>> +		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
>>> +	} while (0)
>>> +
>>> +static int __must_check
>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>> +
>>>    #define to_drm_gpuva(__node)	container_of((__node), struct drm_gpuva, rb.node)
>>>    #define GPUVA_START(node) ((node)->va.addr)
>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>>>    	gpuvm->rb.tree = RB_ROOT_CACHED;
>>>    	INIT_LIST_HEAD(&gpuvm->rb.list);
>>> +	INIT_LIST_HEAD(&gpuvm->extobj.list);
>>> +	spin_lock_init(&gpuvm->extobj.lock);
>>> +
>>> +	INIT_LIST_HEAD(&gpuvm->evict.list);
>>> +	spin_lock_init(&gpuvm->evict.lock);
>>> +
>>>    	drm_gpuva_check_overflow(start_offset, range);
>>>    	gpuvm->mm_start = start_offset;
>>>    	gpuvm->mm_range = range;
>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
>>>    	WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>    	     "GPUVA tree is not empty, potentially leaking memory.\n");
>>> +	WARN(!list_empty(&gpuvm->extobj.list), "Extobj list should be empty.\n");
>>> +	WARN(!list_empty(&gpuvm->evict.list), "Evict list should be empty.\n");
>>> +
>>>    	drm_gem_private_object_fini(&gpuvm->d_obj);
>>>    }
>>>    EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>> +/**
>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>> + * @gpuvm: the &drm_gpuvm
>>> + * @exec: the &drm_exec locking context
>>> + * @num_fences: the amount of &dma_fences to reserve
>>> + *
>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
>>> + * &drm_gpuvm contains mappings of.
>>> + *
>>> + * Using this function directly, it is the driver's responsibility to call
>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>> + *
>>> + * Note: This function is safe against concurrent insertion and removal of
>>> + * external objects, however it is not safe against concurrent usage itself.
>>> + *
>>> + * Drivers need to make sure to protect this case with either an outer VM lock
>>> + * or by calling drm_gpuvm_prepare_vm() before this function within the
>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's dma-resv lock ensures
>>> + * mutual exclusion.
>>> + *
>>> + * Returns: 0 on success, negative error code on failure.
>>> + */
>>> +int
>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>> +			  struct drm_exec *exec,
>>> +			  unsigned int num_fences)
>>> +{
>>> +	struct drm_gpuvm_bo *vm_bo;
>>> +	LIST_HEAD(extobjs);
>>> +	int ret = 0;
>>> +
>>> +	for_each_vm_bo_in_list(gpuvm, extobj, &extobjs, vm_bo) {
>>> +		ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
>>> +		if (ret)
>>> +			break;
>>> +	}
>>> +	/* Drop ref in case we break out of the loop. */
>>> +	drm_gpuvm_bo_put(vm_bo);
>>> +	restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>> +
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>> +
>>> +/**
>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped within a given range
>>> + * @gpuvm: the &drm_gpuvm
>>> + * @exec: the &drm_exec locking context
>>> + * @addr: the start address within the VA space
>>> + * @range: the range to iterate within the VA space
>>> + * @num_fences: the amount of &dma_fences to reserve
>>> + *
>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects mapped between @addr
>>> + * and @addr + @range.
>>> + *
>>> + * Returns: 0 on success, negative error code on failure.
>>> + */
>>> +int
>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
>>> +			u64 addr, u64 range, unsigned int num_fences)
>>> +{
>>> +	struct drm_gpuva *va;
>>> +	u64 end = addr + range;
>>> +	int ret;
>>> +
>>> +	drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
>>> +		struct drm_gem_object *obj = va->gem.obj;
>>> +
>>> +		ret = drm_exec_prepare_obj(exec, obj, num_fences);
>>> +		if (ret)
>>> +			return ret;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>> +
>>> +/**
>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all associated BOs
>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>> + * @num_fences: the amount of &dma_fences to reserve
>>> + * @interruptible: sleep interruptible if waiting
>>> + *
>>> + * Acquires all dma-resv locks of all &drm_gem_objects the given
>>> + * &drm_gpuvm contains mappings of.
>>> + *
>>> + * Additionally, when calling this function with struct drm_gpuvm_exec::extra
>>> + * being set the driver receives the given @fn callback to lock additional
>>> + * dma-resv in the context of the &drm_gpuvm_exec instance. Typically, drivers
>>> + * would call drm_exec_prepare_obj() from within this callback.
>>> + *
>>> + * Returns: 0 on success, negative error code on failure.
>>> + */
>>> +int
>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>> +		    unsigned int num_fences,
>>> +		    bool interruptible)
>>> +{
>>> +	struct drm_gpuvm *gpuvm = vm_exec->vm;
>>> +	struct drm_exec *exec = &vm_exec->exec;
>>> +	uint32_t flags;
>>> +	int ret;
>>> +
>>> +	flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
>>> +		DRM_EXEC_IGNORE_DUPLICATES;
>>> +
>>> +	drm_exec_init(exec, flags);
>>> +
>>> +	drm_exec_until_all_locked(exec) {
>>> +		ret = drm_gpuvm_prepare_vm(gpuvm, exec, num_fences);
>>> +		drm_exec_retry_on_contention(exec);
>>> +		if (ret)
>>> +			goto err;
>>> +
>>> +		ret = drm_gpuvm_prepare_objects(gpuvm, exec, num_fences);
>>> +		drm_exec_retry_on_contention(exec);
>>> +		if (ret)
>>> +			goto err;
>>> +
>>> +		if (vm_exec->extra.fn) {
>>> +			ret = vm_exec->extra.fn(vm_exec, num_fences);
>>> +			drm_exec_retry_on_contention(exec);
>>> +			if (ret)
>>> +				goto err;
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +
>>> +err:
>>> +	drm_exec_fini(exec);
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>> +
>>> +static int
>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int num_fences)
>>> +{
>>> +	struct {
>>> +		struct drm_gem_object **objs;
>>> +		unsigned int num_objs;
>>> +	} *args = vm_exec->extra.priv;
>>> +
>>> +	return drm_exec_prepare_array(&vm_exec->exec, args->objs,
>>> +				      args->num_objs, num_fences);
>>> +}
>>> +
>>> +/**
>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all associated BOs
>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>> + * @objs: additional &drm_gem_objects to lock
>>> + * @num_objs: the number of additional &drm_gem_objects to lock
>>> + * @num_fences: the amount of &dma_fences to reserve
>>> + * @interruptible: sleep interruptible if waiting
>>> + *
>>> + * Acquires all dma-resv locks of all &drm_gem_objects the given &drm_gpuvm
>>> + * contains mappings of, plus the ones given through @objs.
>>> + *
>>> + * Returns: 0 on success, negative error code on failure.
>>> + */
>>> +int
>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>> +			  struct drm_gem_object **objs,
>>> +			  unsigned int num_objs,
>>> +			  unsigned int num_fences,
>>> +			  bool interruptible)
>>> +{
>>> +	struct {
>>> +		struct drm_gem_object **objs;
>>> +		unsigned int num_objs;
>>> +	} args;
>>> +
>>> +	args.objs = objs;
>>> +	args.num_objs = num_objs;
>>> +
>>> +	vm_exec->extra.fn = fn_lock_array;
>>> +	vm_exec->extra.priv = &args;
>>> +
>>> +	return drm_gpuvm_exec_lock(vm_exec, num_fences, interruptible);
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>> +
>>> +/**
>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped within a given range
>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>> + * @addr: the start address within the VA space
>>> + * @range: the range to iterate within the VA space
>>> + * @num_fences: the amount of &dma_fences to reserve
>>> + * @interruptible: sleep interruptible if waiting
>>> + *
>>> + * Acquires all dma-resv locks of all &drm_gem_objects mapped between @addr and
>>> + * @addr + @range.
>>> + *
>>> + * Returns: 0 on success, negative error code on failure.
>>> + */
>>> +int
>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>> +			  u64 addr, u64 range,
>>> +			  unsigned int num_fences,
>>> +			  bool interruptible)
>>> +{
>>> +	struct drm_gpuvm *gpuvm = vm_exec->vm;
>>> +	struct drm_exec *exec = &vm_exec->exec;
>>> +	uint32_t flags;
>>> +	int ret;
>>> +
>>> +	flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
>>> +		DRM_EXEC_IGNORE_DUPLICATES;
>>> +
>>> +	drm_exec_init(exec, flags);
>>> +
>>> +	drm_exec_until_all_locked(exec) {
>>> +		ret = drm_gpuvm_prepare_range(gpuvm, exec, addr, range,
>>> +					      num_fences);
>>> +		drm_exec_retry_on_contention(exec);
>>> +		if (ret)
>>> +			goto err;
>>> +	}
>>> +
>>> +	return ret;
>>> +
>>> +err:
>>> +	drm_exec_fini(exec);
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>> +
>>> +/**
>>> + * drm_gpuvm_validate() - validate all BOs marked as evicted
>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>> + *
>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all evicted buffer
>>> + * objects being mapped in the given &drm_gpuvm.
>>> + *
>>> + * Returns: 0 on success, negative error code on failure.
>>> + */
>>> +int
>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>> +{
>>> +	const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>> +	struct drm_gpuvm_bo *vm_bo;
>>> +	LIST_HEAD(evict);
>>> +	int ret = 0;
>>> +
>>> +	if (unlikely(!ops || !ops->bo_validate))
>>> +		return -ENOTSUPP;
>>> +
>>> +	for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>> +		dma_resv_assert_held(vm_bo->obj->resv);
>>> +		ret = ops->bo_validate(vm_bo->obj);
>>> +		if (ret)
>>> +			break;
>>> +	}
>>> +	/* Drop ref in case we break out of the loop. */
>>> +	drm_gpuvm_bo_put(vm_bo);
>>> +	restore_vm_bo_list(gpuvm, evict, &evict);
>>> +
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>> +
>>> +/**
>>> + * drm_gpuvm_resv_add_fence - add fence to private and all extobj
>>> + * dma-resv
>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>> + * @exec: the &drm_exec locking context
>>> + * @fence: fence to add
>>> + * @private_usage: private dma-resv usage
>>> + * @extobj_usage: extobj dma-resv usage
>>> + */
>>> +void
>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>> +			 struct drm_exec *exec,
>>> +			 struct dma_fence *fence,
>>> +			 enum dma_resv_usage private_usage,
>>> +			 enum dma_resv_usage extobj_usage)
>>> +{
>>> +	struct drm_gem_object *obj;
>>> +	unsigned long index;
>>> +
>>> +	drm_exec_for_each_locked_object(exec, index, obj) {
>>> +		dma_resv_assert_held(obj->resv);
>>> +		dma_resv_add_fence(obj->resv, fence,
>>> +				   drm_gpuvm_is_extobj(gpuvm, obj) ?
>>> +				   private_usage : extobj_usage);
>>> +	}
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>> +
>>>    /**
>>>     * drm_gpuvm_bo_create() - create a new instance of struct drm_gpuvm_bo
>>>     * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
>>>    	INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>    	INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>> +	INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>> +	INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>> +
>>>    	drm_gem_object_get(obj);
>>>    	return vm_bo;
>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>    	drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>> +	spin_lock(&gpuvm->extobj.lock);
>>> +	list_del(&vm_bo->list.entry.extobj);
>>> +	spin_unlock(&gpuvm->extobj.lock);
>>> +
>>> +	spin_lock(&gpuvm->evict.lock);
>>> +	list_del(&vm_bo->list.entry.evict);
>>> +	spin_unlock(&gpuvm->evict.lock);
>>> +
>>>    	list_del(&vm_bo->list.entry.gem);
>>>    	drm_gem_object_put(obj);
>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>     * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>     *
>>>     * This releases a reference to @vm_bo.
>>> + *
>>> + * If the reference count drops to zero, the &gpuvm_bo is destroyed, which
>>> + * includes removing it from the GEM's gpuva list. Hence, if a call to this
>>> + * function can potentially let the reference count drop to zero, the caller
>>> + * must hold the dma-resv or driver-specific GEM gpuva lock.
>>>     */
>>>    void
>>>    drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>    }
>>>    EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>> +static int __must_check
>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>> +{
>>> +	return kref_get_unless_zero(&vm_bo->kref);
>>> +}
>>> +
>>>    static struct drm_gpuvm_bo *
>>>    __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>    		    struct drm_gem_object *obj)
>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *__vm_bo)
>>>    }
>>>    EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>> +/**
>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its &drm_gpuvm's
>>> + * extobj list
>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's extobj list.
>>> + *
>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if it is not on the
>>> + * list already and if the corresponding &drm_gem_object actually is an
>>> + * external object.
>>> + */
>>> +void
>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>> +{
>>> +	struct drm_gpuvm *gpuvm = vm_bo->vm;
>>> +
>>> +	if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>> +		drm_gpuvm_bo_list_add(vm_bo, extobj);
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>> +
>>> +/**
>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to / from a
>>> + * &drm_gpuvms evicted list
>>> + * @obj: the &drm_gem_object to add or remove
>>> + * @evict: indicates whether the object is evicted
>>> + *
>>> + * Adds a &drm_gem_object to or removes it from the evicted list of all
>>> + * &drm_gpuvms containing a mapping of this &drm_gem_object.
>>> + */
>>> +void
>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>> +{
>>> +	struct drm_gpuvm_bo *vm_bo;
>>> +
>>> +	drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>> +		if (evict)
>>> +			drm_gpuvm_bo_list_add(vm_bo, evict);
>>> +		else
>>> +			drm_gpuvm_bo_list_del(vm_bo, evict);
>>> +	}
>>> +}
>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>> +
>>>    static int
>>>    __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>    		   struct drm_gpuva *va)
>>> diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
>>> index afa50b9059a2..834bb6d6617e 100644
>>> --- a/include/drm/drm_gpuvm.h
>>> +++ b/include/drm/drm_gpuvm.h
>>> @@ -26,10 +26,12 @@
>>>     */
>>>    #include <linux/list.h>
>>> +#include <linux/dma-resv.h>
>>>    #include <linux/rbtree.h>
>>>    #include <linux/types.h>
>>>    #include <drm/drm_gem.h>
>>> +#include <drm/drm_exec.h>
>>>    struct drm_gpuvm;
>>>    struct drm_gpuvm_bo;
>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>    	 * space
>>>    	 */
>>>    	struct dma_resv *resv;
>>> +
>>> +	/**
>>> +	 * @extobj: structure holding the extobj list
>>> +	 */
>>> +	struct {
>>> +		/**
>>> +		 * @list: &list_head storing &drm_gpuvm_bos serving as
>>> +		 * external object
>>> +		 */
>>> +		struct list_head list;
>>> +
>>> +		/**
>>> +		 * @lock: spinlock to protect the extobj list
>>> +		 */
>>> +		spinlock_t lock;
>>> +	} extobj;
>>> +
>>> +	/**
>>> +	 * @evict: structure holding the evict list and evict list lock
>>> +	 */
>>> +	struct {
>>> +		/**
>>> +		 * @list: &list_head storing &drm_gpuvm_bos currently being
>>> +		 * evicted
>>> +		 */
>>> +		struct list_head list;
>>> +
>>> +		/**
>>> +		 * @lock: spinlock to protect the evict list
>>> +		 */
>>> +		spinlock_t lock;
>>> +	} evict;
>>>    };
>>>    void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>>>    		    const struct drm_gpuvm_ops *ops);
>>>    void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>> +/**
>>> + * drm_gpuvm_is_extobj() - indicates whether the given &drm_gem_object is an
>>> + * external object
>>> + * @gpuvm: the &drm_gpuvm to check
>>> + * @obj: the &drm_gem_object to check
>>> + *
>>> + * Returns: true if the &drm_gem_object &dma_resv differs from the
>>> + * &drm_gpuvms &dma_resv, false otherwise
>>> + */
>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
>>> +				       struct drm_gem_object *obj)
>>> +{
>>> +	return obj && obj->resv != gpuvm->resv;
>>> +}
>>> +
>>>    static inline struct drm_gpuva *
>>>    __drm_gpuva_next(struct drm_gpuva *va)
>>>    {
>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>    #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
>>>    	list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
>>> +/**
>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of &drm_exec
>>> + *
>>> + * This structure should be created on the stack as &drm_exec should be.
>>> + *
>>> + * Optionally, @extra can be set in order to lock additional &drm_gem_objects.
>>> + */
>>> +struct drm_gpuvm_exec {
>>> +	/**
>>> +	 * @exec: the &drm_exec structure
>>> +	 */
>>> +	struct drm_exec exec;
>>> +
>>> +	/**
>>> +	 * @vm: the &drm_gpuvm to lock its DMA reservations
>>> +	 */
>>> +	struct drm_gpuvm *vm;
>>> +
>>> +	/**
>>> +	 * @extra: Callback and corresponding private data for the driver to
>>> +	 * lock arbitrary additional &drm_gem_objects.
>>> +	 */
>>> +	struct {
>>> +		/**
>>> +		 * @fn: The driver callback to lock additional &drm_gem_objects.
>>> +		 */
>>> +		int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>> +			  unsigned int num_fences);
>>> +
>>> +		/**
>>> +		 * @priv: driver private data for the @fn callback
>>> +		 */
>>> +		void *priv;
>>> +	} extra;
>>> +};
>>> +
>>> +/**
>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
>>> + * @gpuvm: the &drm_gpuvm
>>> + * @exec: the &drm_exec context
>>> + * @num_fences: the amount of &dma_fences to reserve
>>> + *
>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object.
>>> + *
>>> + * Using this function directly, it is the driver's responsibility to call
>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>> + *
>>> + * Returns: 0 on success, negative error code on failure.
>>> + */
>>> +static inline int
>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>> +		     struct drm_exec *exec,
>>> +		     unsigned int num_fences)
>>> +{
>>> +	return drm_exec_prepare_obj(exec, &gpuvm->d_obj, num_fences);
>>> +}
>>> +
>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>> +			      struct drm_exec *exec,
>>> +			      unsigned int num_fences);
>>> +
>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>> +			    struct drm_exec *exec,
>>> +			    u64 addr, u64 range,
>>> +			    unsigned int num_fences);
>>> +
>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>> +			unsigned int num_fences,
>>> +			bool interruptible);
>>> +
>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>> +			      struct drm_gem_object **objs,
>>> +			      unsigned int num_objs,
>>> +			      unsigned int num_fences,
>>> +			      bool interruptible);
>>> +
>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>> +			      u64 addr, u64 range,
>>> +			      unsigned int num_fences,
>>> +			      bool interruptible);
>>> +
>>> +/**
>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>> + *
>>> + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
>>> + * through drm_gpuvm_exec_lock() or its variants.
>>> + */
>>> +static inline void
>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>> +{
>>> +	drm_exec_fini(&vm_exec->exec);
>>> +}
>>> +
>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>> +			      struct drm_exec *exec,
>>> +			      struct dma_fence *fence,
>>> +			      enum dma_resv_usage private_usage,
>>> +			      enum dma_resv_usage extobj_usage);
>>> +
>>> +/**
>>> + * drm_gpuvm_exec_resv_add_fence()
>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>> + * @fence: fence to add
>>> + * @private_usage: private dma-resv usage
>>> + * @extobj_usage: extobj dma-resv usage
>>> + *
>>> + * See drm_gpuvm_resv_add_fence().
>>> + */
>>> +static inline void
>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
>>> +			      struct dma_fence *fence,
>>> +			      enum dma_resv_usage private_usage,
>>> +			      enum dma_resv_usage extobj_usage)
>>> +{
>>> +	drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
>>> +				 private_usage, extobj_usage);
>>> +}
>>> +
>>>    /**
>>>     * struct drm_gpuvm_bo - structure representing a &drm_gpuvm and
>>>     * &drm_gem_object combination
>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>    			 * gpuva list.
>>>    			 */
>>>    			struct list_head gem;
>>> +
>>> +			/**
>>> +			 * @extobj: List entry to attach to the &drm_gpuvms
>>> +			 * extobj list.
>>> +			 */
>>> +			struct list_head extobj;
>>> +
>>> +			/**
>>> +			 * @evict: List entry to attach to the &drm_gpuvms evict
>>> +			 * list.
>>> +			 */
>>> +			struct list_head evict;
>>>    		} entry;
>>>    	} list;
>>>    };
>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>    drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>    		  struct drm_gem_object *obj);
>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>> +
>>>    /**
>>>     * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of &drm_gpuva
>>>     * @va__: &drm_gpuva structure to assign to in each iteration step
>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>    	 * used.
>>>    	 */
>>>    	int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
>>> +
>>> +	/**
>>> +	 * @bo_validate: called from drm_gpuvm_validate()
>>> +	 *
>>> +	 * Drivers receive this callback for every evicted &drm_gem_object being
>>> +	 * mapped in the corresponding &drm_gpuvm.
>>> +	 *
>>> +	 * Typically, drivers would call their driver specific variant of
>>> +	 * ttm_bo_validate() from within this callback.
>>> +	 */
>>> +	int (*bo_validate)(struct drm_gem_object *obj);
>>>    };
>>>    int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-12 19:23       ` Thomas Hellström
@ 2023-09-12 23:36         ` Danilo Krummrich
  2023-09-13  9:14           ` Thomas Hellström
  0 siblings, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-12 23:36 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström wrote:
> 
> On 9/12/23 18:50, Danilo Krummrich wrote:
> > On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
> > > Hi, Danilo,
> > > 
> > > On 9/9/23 17:31, Danilo Krummrich wrote:
> > > > So far the DRM GPUVA manager offers common infrastructure to track GPU VA
> > > > allocations and mappings, generically connect GPU VA mappings to their
> > > > backing buffers and perform more complex mapping operations on the GPU VA
> > > > space.
> > > > 
> > > > However, there are more design patterns commonly used by drivers, which
> > > > can potentially be generalized in order to make the DRM GPUVA manager
> > > > represent a basic GPU-VM implementation. In this context, this patch aims
> > > > at generalizing the following elements.
> > > > 
> > > > 1) Provide a common dma-resv for GEM objects not being used outside of
> > > >      this GPU-VM.
> > > > 
> > > > 2) Provide tracking of external GEM objects (GEM objects which are
> > > >      shared with other GPU-VMs).
> > > > 
> > > > 3) Provide functions to efficiently lock all GEM objects dma-resv the
> > > >      GPU-VM contains mappings of.
> > > > 
> > > > 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
> > > >      of, such that validation of evicted GEM objects is accelerated.
> > > > 
> > > > 5) Provide some convenience functions for common patterns.
> > > > 
> > > > Rather than being designed as a "framework", the target is to make all
> > > > features appear as a collection of optional helper functions, such that
> > > > drivers are free to make use of the DRM GPUVA managers basic
> > > > functionality and opt-in for other features without setting any feature
> > > > flags, just by making use of the corresponding functions.
> > > > 
> > > > Big kudos to Boris Brezillon for his help to figure out locking for drivers
> > > > updating the GPU VA space within the fence signalling path.
> > > > 
> > > > Suggested-by: Matthew Brost <matthew.brost@intel.com>
> > > > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > > > ---
> > > >    drivers/gpu/drm/drm_gpuvm.c | 516 ++++++++++++++++++++++++++++++++++++
> > > >    include/drm/drm_gpuvm.h     | 197 ++++++++++++++
> > > >    2 files changed, 713 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
> > > > index f4411047dbb3..8e62a043f719 100644
> > > > --- a/drivers/gpu/drm/drm_gpuvm.c
> > > > +++ b/drivers/gpu/drm/drm_gpuvm.c
> > > > @@ -73,6 +73,21 @@
> > > >     * &drm_gem_object list of &drm_gpuvm_bos for an existing instance of this
> > > >     * particular combination. If not existent a new instance is created and linked
> > > >     * to the &drm_gem_object.
> > > > + *
> > > > + * &drm_gpuvm_bo structures, since unique for a given &drm_gpuvm, are also used
> > > > + * as entry for the &drm_gpuvm's lists of external and evicted objects. Those
> > > > + * lists are maintained in order to accelerate locking of dma-resv locks and
> > > > + * validation of evicted objects bound in a &drm_gpuvm. For instance, all
> > > > + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be locked by calling
> > > > + * drm_gpuvm_exec_lock(). Once locked drivers can call drm_gpuvm_validate() in
> > > > + * order to validate all evicted &drm_gem_objects. It is also possible to lock
> > > > + * additional &drm_gem_objects by providing the corresponding parameters to
> > > > + * drm_gpuvm_exec_lock() as well as open code the &drm_exec loop while making
> > > > + * use of helper functions such as drm_gpuvm_prepare_range() or
> > > > + * drm_gpuvm_prepare_objects().
> > > > + *
> > > > + * Every bound &drm_gem_object is treated as an external object when its
> > > > + * &dma_resv structure differs from the &drm_gpuvm's common &dma_resv structure.
> > > >     */
> > > >    /**
> > > > @@ -420,6 +435,20 @@
> > > >     * Subsequent calls to drm_gpuvm_bo_obtain() for the same &drm_gpuvm and
> > > >     * &drm_gem_object must be able to observe previous creations and destructions
> > > >     * of &drm_gpuvm_bos in order to keep instances unique.
> > > > + *
> > > > + * The &drm_gpuvm's lists for keeping track of external and evicted objects are
> > > > + * protected against concurrent insertion / removal and iteration internally.
> > > > + *
> > > > + * However, drivers still need to ensure that concurrent calls to functions
> > > > + * iterating those lists, such as drm_gpuvm_validate() and
> > > > + * drm_gpuvm_prepare_objects(), are protected. Every such function contains a
> > > > + * particular comment and lockdep checks if possible.
> > > > + *
> > > > + * Functions adding or removing entries from those lists, such as
> > > > + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be called with external
> > > > + * locks being held, e.g. in order to avoid the corresponding list to be
> > > > + * (safely) modified while potentially being iternated by other API functions.
> > > > + * However, this is entirely optional.
> > > >     */
> > > >    /**
> > > > @@ -632,6 +661,131 @@
> > > >     *	}
> > > >     */
> > > > +/**
> > > > + * get_next_vm_bo_from_list() - get the next vm_bo element
> > > > + * @__gpuvm: The GPU VM
> > > > + * @__list_name: The name of the list we're iterating on
> > > > + * @__local_list: A pointer to the local list used to store already iterated items
> > > > + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> > > > + *
> > > > + * This helper is here to provide lockless list iteration. Lockless as in, the
> > > > + * iterator releases the lock immediately after picking the first element from
> > > > + * the list, so list insertion and deletion can happen concurrently.
> > > Are the list spinlocks needed for that async state update from within the
> > > dma-fence critical section we've discussed previously?
> > Yes, but also for other reasons, see below.
> > 
> > > Otherwise it should be sufficient to protect the lists with the gpuvm's resv
> > > (or for the extobj list with an outer lock).
> > > 
> > > If those spinlocks are still needed in some situations, perhaps could we
> > > have an option to set them to NULL (Like IIRC the maple tree allows for)?
> > The evict spinlock is needed in any case, since in drm_gpuvm_bo_evict() we're
> > holding only the dma-resv lock from the BO this function gets called for. Hence,
> > the spinlock protects concurrent drm_gpuvm_bo_evict() calls with different BOs.
> No. Only if you try to add external objects to the vm's evict list from
> within the evict code. That's not necessary since you loop through all
> external objects anyway when locking them so an "evicted" bool in the vm_bo,
> protected by the bo resv would be sufficient. The extobj locking loop can
> then add the bo to the evicted list.

And validate() can remove it while still holding all dma-resv locks, neat!
However, what if two tasks are trying to lock the VA space concurrently? What
do we do when the drm_gpuvm_bo's refcount drops to zero in drm_gpuva_unlink()?
Are we guaranteed that at this point of time the drm_gpuvm_bo is not on the
evicted list? Because otherwise we would call drm_gpuvm_bo_destroy() with the
dma-resv lock held, which wouldn't be allowed, since drm_gpuvm_bo_destroy()
might drop the last reference to the drm_gem_object and hence we'd potentially
free the dma-resv lock while holding it, at least if it's an external object.
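
Spelled out, the chain of calls this is about would roughly be (an
illustration of the concern, not actual code):

	dma_resv_lock(obj->resv, NULL);		/* the extobj's own resv */
	  drm_gpuva_unlink(va);
	    drm_gpuvm_bo_put(vm_bo);		/* refcount drops to zero */
	      drm_gpuvm_bo_destroy(&vm_bo->kref);
	        drm_gem_object_put(obj);	/* last GEM reference gone,
						 * obj and obj->resv are freed */
	dma_resv_unlock(obj->resv);		/* unlock of freed memory */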

> > 
> > For extobjs an outer lock would be enough in case of Xe, but I really would not
> > like to add even more complexity just to get the spinlock out of the way in case
> > the driver already has an outer lock protecting this path.
> 
> I must disagree here. These spinlocks and atomic operations are pretty
> costly and as discussed earlier this type of locking was the reason (at
> least according to the commit message) that made Christian drop the XArray
> use in drm_exec for the same set of objects: "The locking overhead is
> unecessary and measurable". IMHO the spinlock is the added complexity and a
> single wide lock following the drm locking guidelines set out by Daniel and
> David should really be the default choice with an opt-in for a spinlock if
> needed for async and pushing out to a wq is not an option.

For the external object list an outer lock would work as long as it's not the
dma-resv lock of the corresponding GEM object, since here we actually need to
remove the list entry from the external object list on drm_gpuvm_bo_destroy().
It's just a bit weird design-wise that drivers would need to take this outer
lock on:

- drm_gpuvm_bo_extobj_add()
- drm_gpuvm_bo_destroy()	(and hence also drm_gpuvm_bo_put())
- drm_gpuva_unlink() 		(because it needs to call drm_gpuvm_bo_put())
- drm_gpuvm_exec_lock()
- drm_gpuvm_exec_lock_array()
- drm_gpuvm_prepare_range()

Given that it seems reasonable to do all the required locking internally.

In order to at least place lockdep checks, the driver would need to supply the
corresponding lock's lockdep_map, because the GPUVM otherwise doesn't know about
the lock.
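
For illustration, that could look roughly like this (entirely hypothetical;
neither the extobj_lockdep_map field nor the helper below exist):

/* The driver hands the lockdep map of its outer lock to the GPUVM at init
 * time, so list-iterating functions can assert that it is held.
 */
struct drm_gpuvm {
	/* ... */
	struct lockdep_map *extobj_lockdep_map;
};

static inline void
drm_gpuvm_assert_extobj_lock_held(struct drm_gpuvm *gpuvm)
{
#ifdef CONFIG_LOCKDEP
	if (gpuvm->extobj_lockdep_map)
		lockdep_assert(lock_is_held(gpuvm->extobj_lockdep_map));
#endif
}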

Out of curiosity, what is the overhead of a spin_lock() that doesn't need to
spin? 

> 
> A pretty simple way that would not add much code would be
> 
> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
> 				 spinlock_t *lock)
> {
> 	if (!gpuvm->resv_protected_lists)
> 		spin_lock(lock);
> }
> 
> > > For such drivers, that would require anybody calling unlink to hold the vm's
> > > resv, though.
> > In V4 I want to go back to having a dedicated lock for the GEMs gpuva list (or
> > VM_BO list to be more precise). We can't just use the dma-resv lock for that
> > with VM_BO abstractions, because on destruction of a VM_BO we otherwise wouldn't
> > be allowed to already hold the dma-resv lock. That's the fix I was referring to
> > earlier.
> 
> Yeah, I can see the need for a dedicated lock for the GEM's gpuva list, but
> holding the vm's dma-resv lock across the unlink shouldn't be a problem. We
> may free the object and a pointer to the vm's resv during unlink but we
> don't free the vm's resv. It'd be a matter of ensuring that any calls to
> unlink from *within* drm_gpuvm allow it to be held.

Drivers calling unlink() from the fence signaling path can't use the VM's
dma-resv lock.

Also, what if the object is an external object? We can't use the VM's dma-resv
lock here. And we can't have the GEM objs dma-resv lock held when calling
unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the refcount drops
to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might drop the
last reference of the GEM object. All those problems go away with a dedicated
GEM gpuva list lock.
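
For illustration, such a dedicated lock could look roughly like this (sketch;
the field placement and names are made up, and the real drm_gpuvm_bo_destroy()
does more than what is shown):

/* A spinlock next to the GEM's gpuva list, so destroying a vm_bo never
 * requires taking a dma-resv lock that the destruction itself might free.
 */
struct drm_gem_object {
	/* ... */
	struct {
		struct list_head list;	/* list of drm_gpuvm_bos */
		spinlock_t lock;	/* protects @list */
	} gpuva;
};

static void drm_gpuvm_bo_destroy(struct kref *kref)
{
	struct drm_gpuvm_bo *vm_bo = container_of(kref, struct drm_gpuvm_bo,
						  kref);
	struct drm_gem_object *obj = vm_bo->obj;

	/* ... remove from the extobj / evict lists as before ... */

	spin_lock(&obj->gpuva.lock);
	list_del(&vm_bo->list.entry.gem);
	spin_unlock(&obj->gpuva.lock);

	drm_gem_object_put(obj);	/* no resv lock held here */
	kfree(vm_bo);
}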

> 
> /Thomas
> 
> 
> > > It seems that with that also the refcount could be made non-atomic.
> > > 
> > > All in the spirit of the drm locking guidelines "use big locks when
> > > possible".
> > > Lower level locks only when necessary for performance or locking inversion?
> > > 
> > > /Thomas
> > > 
> > > 
> > > > + *
> > > > + * Elements popped from the original list are kept in a local list, so removal
> > > > + * and is_empty checks can still happen while we're iterating the list.
> > > > + */
> > > > +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
> > > > +	({										\
> > > > +		struct drm_gpuvm_bo *__vm_bo;						\
> > > > +											\
> > > > +		drm_gpuvm_bo_put(__prev_vm_bo);						\
> > > > +											\
> > > > +		spin_lock(&(__gpuvm)->__list_name.lock);				\
> > > > +		while (!list_empty(&(__gpuvm)->__list_name.list)) {			\
> > > > +			__vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,	\
> > > > +						   struct drm_gpuvm_bo,			\
> > > > +						   list.entry.__list_name);		\
> > > > +			if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {			\
> > > > +				list_move_tail(&(__vm_bo)->list.entry.__list_name,	\
> > > > +					       __local_list);				\
> > > > +				break;							\
> > > > +			} else {							\
> > > > +				list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> > > > +				__vm_bo = NULL;						\
> > > > +			}								\
> > > > +		}									\
> > > > +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> > > > +											\
> > > > +		__vm_bo;								\
> > > > +	})
> > > > +
> > > > +/**
> > > > + * for_each_vm_bo_in_list() - internal vm_bo list iterator
> > > > + *
> > > > + * This helper is here to provide lockless list iteration. Lockless as in, the
> > > > + * iterator releases the lock immediately after picking the first element from the
> > > > + * list, so list insertion and deletion can happen concurrently.
> > > > + *
> > > > + * Typical use:
> > > > + *
> > > > + *	struct drm_gpuvm_bo *vm_bo;
> > > > + *	LIST_HEAD(my_local_list);
> > > > + *
> > > > + *	ret = 0;
> > > > + *	drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>, &my_local_list, vm_bo) {
> > > > + *		ret = do_something_with_vm_bo(..., vm_bo);
> > > > + *		if (ret)
> > > > + *			break;
> > > > + *	}
> > > > + *	drm_gpuvm_bo_put(vm_bo);
> > > > + *	drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);
> > > > + *
> > > > + *
> > > > + * Only used for internal list iterations, not meant to be exposed to the outside
> > > > + * world.
> > > > + */
> > > > +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)	\
> > > > +	for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> > > > +						__local_list, NULL);		\
> > > > +	     __vm_bo;								\
> > > > +	     __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> > > > +						__local_list, __vm_bo))		\
> > > > +
> > > > +/**
> > > > + * restore_vm_bo_list() - move vm_bo elements back to their original list
> > > > + * @__gpuvm: The GPU VM
> > > > + * @__list_name: The name of the list we're iterating on
> > > > + * @__local_list: A pointer to the local list used to store already iterated items
> > > > + *
> > > > + * When we're done iterating a vm_bo list, we should call restore_vm_bo_list()
> > > > + * to restore the original state and let new iterations take place.
> > > > + */
> > > > +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)				\
> > > > +	do {										\
> > > > +		/* Merge back the two lists, moving local list elements to the		\
> > > > +		 * head to preserve previous ordering, in case it matters.		\
> > > > +		 */									\
> > > > +		spin_lock(&(__gpuvm)->__list_name.lock);				\
> > > > +		list_splice(__local_list, &(__gpuvm)->__list_name.list);		\
> > > > +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> > > > +	} while (0)
> > > > +/**
> > > > + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given list
> > > > + * @__vm_bo: the &drm_gpuvm_bo
> > > > + * @__list_name: the name of the list to insert into
> > > > + *
> > > > + * Inserts the given @__vm_bo into the list specified by @__list_name and
> > > > + * increases the vm_bo's reference count.
> > > > + */
> > > > +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)				\
> > > > +	do {									\
> > > > +		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
> > > > +		if (list_empty(&(__vm_bo)->list.entry.__list_name))		\
> > > > +			list_add_tail(&(__vm_bo)->list.entry.__list_name,	\
> > > > +				      &(__vm_bo)->vm->__list_name.list);	\
> > > > +		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
> > > > +	} while (0)
> > > > +
> > > > +/**
> > > > + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given list
> > > > + * @__vm_bo: the &drm_gpuvm_bo
> > > > + * @__list_name: the name of the list to insert into
> > > > + *
> > > > + * Removes the given @__vm_bo from the list specified by @__list_name and
> > > > + * decreases the vm_bo's reference count.
> > > > + */
> > > > +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)				\
> > > > +	do {									\
> > > > +		spin_lock(&(__vm_bo)->vm->__list_name.lock);			\
> > > > +		if (!list_empty(&(__vm_bo)->list.entry.__list_name))		\
> > > > +			list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> > > > +		spin_unlock(&(__vm_bo)->vm->__list_name.lock);			\
> > > > +	} while (0)
> > > > +
> > > > +static int __must_check
> > > > +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
> > > > +
> > > >    #define to_drm_gpuva(__node)	container_of((__node), struct drm_gpuva, rb.node)
> > > >    #define GPUVA_START(node) ((node)->va.addr)
> > > > @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
> > > >    	gpuvm->rb.tree = RB_ROOT_CACHED;
> > > >    	INIT_LIST_HEAD(&gpuvm->rb.list);
> > > > +	INIT_LIST_HEAD(&gpuvm->extobj.list);
> > > > +	spin_lock_init(&gpuvm->extobj.lock);
> > > > +
> > > > +	INIT_LIST_HEAD(&gpuvm->evict.list);
> > > > +	spin_lock_init(&gpuvm->evict.lock);
> > > > +
> > > >    	drm_gpuva_check_overflow(start_offset, range);
> > > >    	gpuvm->mm_start = start_offset;
> > > >    	gpuvm->mm_range = range;
> > > > @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
> > > >    	WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
> > > >    	     "GPUVA tree is not empty, potentially leaking memory.\n");
> > > > +	WARN(!list_empty(&gpuvm->extobj.list), "Extobj list should be empty.\n");
> > > > +	WARN(!list_empty(&gpuvm->evict.list), "Evict list should be empty.\n");
> > > > +
> > > >    	drm_gem_private_object_fini(&gpuvm->d_obj);
> > > >    }
> > > >    EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
> > > > +/**
> > > > + * drm_gpuvm_prepare_objects() - prepare all assoiciated BOs
> > > > + * @gpuvm: the &drm_gpuvm
> > > > + * @exec: the &drm_exec locking context
> > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > + *
> > > > + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
> > > > + * &drm_gpuvm contains mappings of.
> > > > + *
> > > > + * Using this function directly, it is the drivers responsibility to call
> > > > + * drm_exec_init() and drm_exec_fini() accordingly.
> > > > + *
> > > > + * Note: This function is safe against concurrent insertion and removal of
> > > > + * external objects, however it is not safe against concurrent usage itself.
> > > > + *
> > > > + * Drivers need to make sure to protect this case with either an outer VM lock
> > > > + * or by calling drm_gpuvm_prepare_vm() before this function within the
> > > > + * drm_exec_until_all_locked() loop, such that the GPUVM's dma-resv lock ensures
> > > > + * mutual exclusion.
> > > > + *
> > > > + * Returns: 0 on success, negative error code on failure.
> > > > + */
> > > > +int
> > > > +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> > > > +			  struct drm_exec *exec,
> > > > +			  unsigned int num_fences)
> > > > +{
> > > > +	struct drm_gpuvm_bo *vm_bo;
> > > > +	LIST_HEAD(extobjs);
> > > > +	int ret = 0;
> > > > +
> > > > +	for_each_vm_bo_in_list(gpuvm, extobj, &extobjs, vm_bo) {
> > > > +		ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
> > > > +		if (ret)
> > > > +			break;
> > > > +	}
> > > > +	/* Drop ref in case we break out of the loop. */
> > > > +	drm_gpuvm_bo_put(vm_bo);
> > > > +	restore_vm_bo_list(gpuvm, extobj, &extobjs);
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
> > > > +
> > > > +/**
> > > > + * drm_gpuvm_prepare_range() - prepare all BOs mapped within a given range
> > > > + * @gpuvm: the &drm_gpuvm
> > > > + * @exec: the &drm_exec locking context
> > > > + * @addr: the start address within the VA space
> > > > + * @range: the range to iterate within the VA space
> > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > + *
> > > > + * Calls drm_exec_prepare_obj() for all &drm_gem_objects mapped between @addr
> > > > + * and @addr + @range.
> > > > + *
> > > > + * Returns: 0 on success, negative error code on failure.
> > > > + */
> > > > +int
> > > > +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
> > > > +			u64 addr, u64 range, unsigned int num_fences)
> > > > +{
> > > > +	struct drm_gpuva *va;
> > > > +	u64 end = addr + range;
> > > > +	int ret;
> > > > +
> > > > +	drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
> > > > +		struct drm_gem_object *obj = va->gem.obj;
> > > > +
> > > > +		ret = drm_exec_prepare_obj(exec, obj, num_fences);
> > > > +		if (ret)
> > > > +			return ret;
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
> > > > +
> > > > +/**
> > > > + * drm_gpuvm_exec_lock() - lock all dma-resv of all assoiciated BOs
> > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > + * @interruptible: sleep interruptible if waiting
> > > > + *
> > > > + * Acquires all dma-resv locks of all &drm_gem_objects the given
> > > > + * &drm_gpuvm contains mappings of.
> > > > + *
> > > > + * Addionally, when calling this function with struct drm_gpuvm_exec::extra
> > > > + * being set the driver receives the given @fn callback to lock additional
> > > > + * dma-resv in the context of the &drm_gpuvm_exec instance. Typically, drivers
> > > > + * would call drm_exec_prepare_obj() from within this callback.
> > > > + *
> > > > + * Returns: 0 on success, negative error code on failure.
> > > > + */
> > > > +int
> > > > +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> > > > +		    unsigned int num_fences,
> > > > +		    bool interruptible)
> > > > +{
> > > > +	struct drm_gpuvm *gpuvm = vm_exec->vm;
> > > > +	struct drm_exec *exec = &vm_exec->exec;
> > > > +	uint32_t flags;
> > > > +	int ret;
> > > > +
> > > > +	flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
> > > > +		DRM_EXEC_IGNORE_DUPLICATES;
> > > > +
> > > > +	drm_exec_init(exec, flags);
> > > > +
> > > > +	drm_exec_until_all_locked(exec) {
> > > > +		ret = drm_gpuvm_prepare_vm(gpuvm, exec, num_fences);
> > > > +		drm_exec_retry_on_contention(exec);
> > > > +		if (ret)
> > > > +			goto err;
> > > > +
> > > > +		ret = drm_gpuvm_prepare_objects(gpuvm, exec, num_fences);
> > > > +		drm_exec_retry_on_contention(exec);
> > > > +		if (ret)
> > > > +			goto err;
> > > > +
> > > > +		if (vm_exec->extra.fn) {
> > > > +			ret = vm_exec->extra.fn(vm_exec, num_fences);
> > > > +			drm_exec_retry_on_contention(exec);
> > > > +			if (ret)
> > > > +				goto err;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +
> > > > +err:
> > > > +	drm_exec_fini(exec);
> > > > +	return ret;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
> > > > +
> > > > +static int
> > > > +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int num_fences)
> > > > +{
> > > > +	struct {
> > > > +		struct drm_gem_object **objs;
> > > > +		unsigned int num_objs;
> > > > +	} *args = vm_exec->extra.priv;
> > > > +
> > > > +	return drm_exec_prepare_array(&vm_exec->exec, args->objs,
> > > > +				      args->num_objs, num_fences);
> > > > +}
> > > > +
> > > > +/**
> > > > + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all assoiciated BOs
> > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > + * @objs: additional &drm_gem_objects to lock
> > > > + * @num_objs: the number of additional &drm_gem_objects to lock
> > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > + * @interruptible: sleep interruptible if waiting
> > > > + *
> > > > + * Acquires all dma-resv locks of all &drm_gem_objects the given &drm_gpuvm
> > > > + * contains mappings of, plus the ones given through @objs.
> > > > + *
> > > > + * Returns: 0 on success, negative error code on failure.
> > > > + */
> > > > +int
> > > > +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
> > > > +			  struct drm_gem_object **objs,
> > > > +			  unsigned int num_objs,
> > > > +			  unsigned int num_fences,
> > > > +			  bool interruptible)
> > > > +{
> > > > +	struct {
> > > > +		struct drm_gem_object **objs;
> > > > +		unsigned int num_objs;
> > > > +	} args;
> > > > +
> > > > +	args.objs = objs;
> > > > +	args.num_objs = num_objs;
> > > > +
> > > > +	vm_exec->extra.fn = fn_lock_array;
> > > > +	vm_exec->extra.priv = &args;
> > > > +
> > > > +	return drm_gpuvm_exec_lock(vm_exec, num_fences, interruptible);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
> > > > +
> > > > +/**
> > > > + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped within a given range
> > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > + * @addr: the start address within the VA space
> > > > + * @range: the range to iterate within the VA space
> > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > + * @interruptible: sleep interruptible if waiting
> > > > + *
> > > > + * Acquires all dma-resv locks of all &drm_gem_objects mapped between @addr and
> > > > + * @addr + @range.
> > > > + *
> > > > + * Returns: 0 on success, negative error code on failure.
> > > > + */
> > > > +int
> > > > +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
> > > > +			  u64 addr, u64 range,
> > > > +			  unsigned int num_fences,
> > > > +			  bool interruptible)
> > > > +{
> > > > +	struct drm_gpuvm *gpuvm = vm_exec->vm;
> > > > +	struct drm_exec *exec = &vm_exec->exec;
> > > > +	uint32_t flags;
> > > > +	int ret;
> > > > +
> > > > +	flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
> > > > +		DRM_EXEC_IGNORE_DUPLICATES;
> > > > +
> > > > +	drm_exec_init(exec, flags);
> > > > +
> > > > +	drm_exec_until_all_locked(exec) {
> > > > +		ret = drm_gpuvm_prepare_range(gpuvm, exec, addr, range,
> > > > +					      num_fences);
> > > > +		drm_exec_retry_on_contention(exec);
> > > > +		if (ret)
> > > > +			goto err;
> > > > +	}
> > > > +
> > > > +	return ret;
> > > > +
> > > > +err:
> > > > +	drm_exec_fini(exec);
> > > > +	return ret;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
> > > > +
> > > > +/**
> > > > + * drm_gpuvm_validate() - validate all BOs marked as evicted
> > > > + * @gpuvm: the &drm_gpuvm to validate evicted BOs
> > > > + *
> > > > + * Calls the &drm_gpuvm_ops.bo_validate callback for all evicted buffer
> > > > + * objects being mapped in the given &drm_gpuvm.
> > > > + *
> > > > + * Returns: 0 on success, negative error code on failure.
> > > > + */
> > > > +int
> > > > +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
> > > > +{
> > > > +	const struct drm_gpuvm_ops *ops = gpuvm->ops;
> > > > +	struct drm_gpuvm_bo *vm_bo;
> > > > +	LIST_HEAD(evict);
> > > > +	int ret = 0;
> > > > +
> > > > +	if (unlikely(!ops || !ops->bo_validate))
> > > > +		return -ENOTSUPP;
> > > > +
> > > > +	for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
> > > > +		dma_resv_assert_held(vm_bo->obj->resv);
> > > > +		ret = ops->bo_validate(vm_bo->obj);
> > > > +		if (ret)
> > > > +			break;
> > > > +	}
> > > > +	/* Drop ref in case we break out of the loop. */
> > > > +	drm_gpuvm_bo_put(vm_bo);
> > > > +	restore_vm_bo_list(gpuvm, evict, &evict);
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
> > > > +
> > > > +/**
> > > > + * drm_gpuvm_resv_add_fence - add fence to private and all extobj
> > > > + * dma-resv
> > > > + * @gpuvm: the &drm_gpuvm to add a fence to
> > > > + * @exec: the &drm_exec locking context
> > > > + * @fence: fence to add
> > > > + * @private_usage: private dma-resv usage
> > > > + * @extobj_usage: extobj dma-resv usage
> > > > + */
> > > > +void
> > > > +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> > > > +			 struct drm_exec *exec,
> > > > +			 struct dma_fence *fence,
> > > > +			 enum dma_resv_usage private_usage,
> > > > +			 enum dma_resv_usage extobj_usage)
> > > > +{
> > > > +	struct drm_gem_object *obj;
> > > > +	unsigned long index;
> > > > +
> > > > +	drm_exec_for_each_locked_object(exec, index, obj) {
> > > > +		dma_resv_assert_held(obj->resv);
> > > > +		dma_resv_add_fence(obj->resv, fence,
> > > > +				   drm_gpuvm_is_extobj(gpuvm, obj) ?
> > > > +				   private_usage : extobj_usage);
> > > > +	}
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
> > > > +
> > > >    /**
> > > >     * drm_gpuvm_bo_create() - create a new instance of struct drm_gpuvm_bo
> > > >     * @gpuvm: The &drm_gpuvm the @obj is mapped in.
> > > > @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
> > > >    	INIT_LIST_HEAD(&vm_bo->list.gpuva);
> > > >    	INIT_LIST_HEAD(&vm_bo->list.entry.gem);
> > > > +	INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
> > > > +	INIT_LIST_HEAD(&vm_bo->list.entry.evict);
> > > > +
> > > >    	drm_gem_object_get(obj);
> > > >    	return vm_bo;
> > > > @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
> > > >    	drm_gem_gpuva_assert_lock_held(vm_bo->obj);
> > > > +	spin_lock(&gpuvm->extobj.lock);
> > > > +	list_del(&vm_bo->list.entry.extobj);
> > > > +	spin_unlock(&gpuvm->extobj.lock);
> > > > +
> > > > +	spin_lock(&gpuvm->evict.lock);
> > > > +	list_del(&vm_bo->list.entry.evict);
> > > > +	spin_unlock(&gpuvm->evict.lock);
> > > > +
> > > >    	list_del(&vm_bo->list.entry.gem);
> > > >    	drm_gem_object_put(obj);
> > > > @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
> > > >     * @vm_bo: the &drm_gpuvm_bo to release the reference of
> > > >     *
> > > >     * This releases a reference to @vm_bo.
> > > > + *
> > > > + * If the reference count drops to zero, the &gpuvm_bo is destroyed, which
> > > > + * includes removing it from the GEM's gpuva list. Hence, if a call to this
> > > > + * function can potentially let the reference count drop to zero, the caller
> > > > + * must hold the dma-resv or driver specific GEM gpuva lock.
> > > >     */
> > > >    void
> > > >    drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
> > > > @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
> > > >    }
> > > >    EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
> > > > +static int __must_check
> > > > +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
> > > > +{
> > > > +	return kref_get_unless_zero(&vm_bo->kref);
> > > > +}
> > > > +
> > > >    static struct drm_gpuvm_bo *
> > > >    __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
> > > >    		    struct drm_gem_object *obj)
> > > > @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *__vm_bo)
> > > >    }
> > > >    EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
> > > > +/**
> > > > + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its &drm_gpuvm's
> > > > + * extobj list
> > > > + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's the extobj list.
> > > > + *
> > > > + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if it is not on the
> > > > + * list already and if the corresponding &drm_gem_object actually is an
> > > > + * external object.
> > > > + */
> > > > +void
> > > > +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
> > > > +{
> > > > +	struct drm_gpuvm *gpuvm = vm_bo->vm;
> > > > +
> > > > +	if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
> > > > +		drm_gpuvm_bo_list_add(vm_bo, extobj);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
> > > > +
> > > > +/**
> > > > + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to / from a
> > > > + * &drm_gpuvms evicted list
> > > > + * @obj: the &drm_gem_object to add or remove
> > > > + * @evict: indicates whether the object is evicted
> > > > + *
> > > > + * Adds a &drm_gem_object to or removes it from the evicted list of all
> > > > + * &drm_gpuvms containing a mapping of this &drm_gem_object.
> > > > + */
> > > > +void
> > > > +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
> > > > +{
> > > > +	struct drm_gpuvm_bo *vm_bo;
> > > > +
> > > > +	drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
> > > > +		if (evict)
> > > > +			drm_gpuvm_bo_list_add(vm_bo, evict);
> > > > +		else
> > > > +			drm_gpuvm_bo_list_del(vm_bo, evict);
> > > > +	}
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
> > > > +
> > > >    static int
> > > >    __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
> > > >    		   struct drm_gpuva *va)
> > > > diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
> > > > index afa50b9059a2..834bb6d6617e 100644
> > > > --- a/include/drm/drm_gpuvm.h
> > > > +++ b/include/drm/drm_gpuvm.h
> > > > @@ -26,10 +26,12 @@
> > > >     */
> > > >    #include <linux/list.h>
> > > > +#include <linux/dma-resv.h>
> > > >    #include <linux/rbtree.h>
> > > >    #include <linux/types.h>
> > > >    #include <drm/drm_gem.h>
> > > > +#include <drm/drm_exec.h>
> > > >    struct drm_gpuvm;
> > > >    struct drm_gpuvm_bo;
> > > > @@ -259,6 +261,38 @@ struct drm_gpuvm {
> > > >    	 * space
> > > >    	 */
> > > >    	struct dma_resv *resv;
> > > > +
> > > > +	/**
> > > > +	 * @extobj: structure holding the extobj list
> > > > +	 */
> > > > +	struct {
> > > > +		/**
> > > > +		 * @list: &list_head storing &drm_gpuvm_bos serving as
> > > > +		 * external object
> > > > +		 */
> > > > +		struct list_head list;
> > > > +
> > > > +		/**
> > > > +		 * @lock: spinlock to protect the extobj list
> > > > +		 */
> > > > +		spinlock_t lock;
> > > > +	} extobj;
> > > > +
> > > > +	/**
> > > > +	 * @evict: structure holding the evict list and evict list lock
> > > > +	 */
> > > > +	struct {
> > > > +		/**
> > > > +		 * @list: &list_head storing &drm_gpuvm_bos currently being
> > > > +		 * evicted
> > > > +		 */
> > > > +		struct list_head list;
> > > > +
> > > > +		/**
> > > > +		 * @lock: spinlock to protect the evict list
> > > > +		 */
> > > > +		spinlock_t lock;
> > > > +	} evict;
> > > >    };
> > > >    void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
> > > > @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
> > > >    		    const struct drm_gpuvm_ops *ops);
> > > >    void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
> > > > +/**
> > > > + * drm_gpuvm_is_extobj() - indicates whether the given &drm_gem_object is an
> > > > + * external object
> > > > + * @gpuvm: the &drm_gpuvm to check
> > > > + * @obj: the &drm_gem_object to check
> > > > + *
> > > > + * Returns: true if the &drm_gem_object &dma_resv differs from the
> > > > + * &drm_gpuvms &dma_resv, false otherwise
> > > > + */
> > > > +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
> > > > +				       struct drm_gem_object *obj)
> > > > +{
> > > > +	return obj && obj->resv != gpuvm->resv;
> > > > +}
> > > > +
> > > >    static inline struct drm_gpuva *
> > > >    __drm_gpuva_next(struct drm_gpuva *va)
> > > >    {
> > > > @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
> > > >    #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
> > > >    	list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
> > > > +/**
> > > > + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of &drm_exec
> > > > + *
> > > > + * This structure should be created on the stack as &drm_exec should be.
> > > > + *
> > > > + * Optionally, @extra can be set in order to lock additional &drm_gem_objects.
> > > > + */
> > > > +struct drm_gpuvm_exec {
> > > > +	/**
> > > > +	 * @exec: the &drm_exec structure
> > > > +	 */
> > > > +	struct drm_exec exec;
> > > > +
> > > > +	/**
> > > > +	 * @vm: the &drm_gpuvm to lock its DMA reservations
> > > > +	 */
> > > > +	struct drm_gpuvm *vm;
> > > > +
> > > > +	/**
> > > > +	 * @extra: Callback and corresponding private data for the driver to
> > > > +	 * lock arbitrary additional &drm_gem_objects.
> > > > +	 */
> > > > +	struct {
> > > > +		/**
> > > > +		 * @fn: The driver callback to lock additional &drm_gem_objects.
> > > > +		 */
> > > > +		int (*fn)(struct drm_gpuvm_exec *vm_exec,
> > > > +			  unsigned int num_fences);
> > > > +
> > > > +		/**
> > > > +		 * @priv: driver private data for the @fn callback
> > > > +		 */
> > > > +		void *priv;
> > > > +	} extra;
> > > > +};
> > > > +
> > > > +/**
> > > > + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
> > > > + * @gpuvm: the &drm_gpuvm
> > > > + * @exec: the &drm_exec context
> > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > + *
> > > > + * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object.
> > > > + *
> > > > + * Using this function directly, it is the drivers responsibility to call
> > > > + * drm_exec_init() and drm_exec_fini() accordingly.
> > > > + *
> > > > + * Returns: 0 on success, negative error code on failure.
> > > > + */
> > > > +static inline int
> > > > +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
> > > > +		     struct drm_exec *exec,
> > > > +		     unsigned int num_fences)
> > > > +{
> > > > +	return drm_exec_prepare_obj(exec, &gpuvm->d_obj, num_fences);
> > > > +}
> > > > +
> > > > +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> > > > +			      struct drm_exec *exec,
> > > > +			      unsigned int num_fences);
> > > > +
> > > > +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
> > > > +			    struct drm_exec *exec,
> > > > +			    u64 addr, u64 range,
> > > > +			    unsigned int num_fences);
> > > > +
> > > > +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> > > > +			unsigned int num_fences,
> > > > +			bool interruptible);
> > > > +
> > > > +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
> > > > +			      struct drm_gem_object **objs,
> > > > +			      unsigned int num_objs,
> > > > +			      unsigned int num_fences,
> > > > +			      bool interruptible);
> > > > +
> > > > +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
> > > > +			      u64 addr, u64 range,
> > > > +			      unsigned int num_fences,
> > > > +			      bool interruptible);
> > > > +
> > > > +/**
> > > > + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
> > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > + *
> > > > + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
> > > > + * through drm_gpuvm_exec_lock() or its variants.
> > > > + */
> > > > +static inline void
> > > > +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
> > > > +{
> > > > +	drm_exec_fini(&vm_exec->exec);
> > > > +}
> > > > +
> > > > +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
> > > > +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> > > > +			      struct drm_exec *exec,
> > > > +			      struct dma_fence *fence,
> > > > +			      enum dma_resv_usage private_usage,
> > > > +			      enum dma_resv_usage extobj_usage);
> > > > +
> > > > +/**
> > > > + * drm_gpuvm_exec_resv_add_fence()
> > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > + * @fence: fence to add
> > > > + * @private_usage: private dma-resv usage
> > > > + * @extobj_usage: extobj dma-resv usage
> > > > + *
> > > > + * See drm_gpuvm_resv_add_fence().
> > > > + */
> > > > +static inline void
> > > > +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
> > > > +			      struct dma_fence *fence,
> > > > +			      enum dma_resv_usage private_usage,
> > > > +			      enum dma_resv_usage extobj_usage)
> > > > +{
> > > > +	drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
> > > > +				 private_usage, extobj_usage);
> > > > +}
> > > > +
> > > >    /**
> > > >     * struct drm_gpuvm_bo - structure representing a &drm_gpuvm and
> > > >     * &drm_gem_object combination
> > > > @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
> > > >    			 * gpuva list.
> > > >    			 */
> > > >    			struct list_head gem;
> > > > +
> > > > +			/**
> > > > +			 * @extobj: List entry to attach to the &drm_gpuvms
> > > > +			 * extobj list.
> > > > +			 */
> > > > +			struct list_head extobj;
> > > > +
> > > > +			/**
> > > > +			 * @evict: List entry to attach to the &drm_gpuvms evict
> > > > +			 * list.
> > > > +			 */
> > > > +			struct list_head evict;
> > > >    		} entry;
> > > >    	} list;
> > > >    };
> > > > @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
> > > >    drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
> > > >    		  struct drm_gem_object *obj);
> > > > +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
> > > > +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
> > > > +
> > > >    /**
> > > >     * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of &drm_gpuva
> > > >     * @va__: &drm_gpuva structure to assign to in each iteration step
> > > > @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
> > > >    	 * used.
> > > >    	 */
> > > >    	int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
> > > > +
> > > > +	/**
> > > > +	 * @bo_validate: called from drm_gpuvm_validate()
> > > > +	 *
> > > > +	 * Drivers receive this callback for every evicted &drm_gem_object being
> > > > +	 * mapped in the corresponding &drm_gpuvm.
> > > > +	 *
> > > > +	 * Typically, drivers would call their driver specific variant of
> > > > +	 * ttm_bo_validate() from within this callback.
> > > > +	 */
> > > > +	int (*bo_validate)(struct drm_gem_object *obj);
> > > >    };
> > > >    int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
> 
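
To see how these helpers are meant to compose, here is a minimal,
hypothetical driver submission path (struct my_job, my_driver_exec() and
my_run_job() are made-up names, and the fence usages are picked purely for
illustration, not mandated by the series):

struct my_job {
	struct dma_fence *fence;
	/* driver specific bits */
};

static struct dma_fence *my_run_job(struct my_job *job);

static int my_driver_exec(struct drm_gpuvm *gpuvm, struct my_job *job)
{
	struct drm_gpuvm_exec vm_exec = {
		.vm = gpuvm,
		/* .extra.fn could be set to lock additional driver objects */
	};
	int ret;

	/* Lock the VM's common dma-resv plus all extobj dma-resvs. */
	ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
	if (ret)
		return ret;

	/* Re-validate everything currently on the VM's evict list. */
	ret = drm_gpuvm_validate(gpuvm);
	if (ret)
		goto out_unlock;

	job->fence = my_run_job(job);

	/* Fence all locked objects with the chosen resv usages. */
	drm_gpuvm_exec_resv_add_fence(&vm_exec, job->fence,
				      DMA_RESV_USAGE_BOOKKEEP,
				      DMA_RESV_USAGE_BOOKKEEP);

out_unlock:
	drm_gpuvm_exec_unlock(&vm_exec);
	return ret;
}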


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-12 16:20   ` Thomas Hellström
  2023-09-12 16:50     ` Danilo Krummrich
@ 2023-09-13  7:03     ` Boris Brezillon
  2023-09-13  7:05       ` Dave Airlie
  1 sibling, 1 reply; 77+ messages in thread
From: Boris Brezillon @ 2023-09-13  7:03 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: Danilo Krummrich, airlied, daniel, matthew.brost, sarah.walker,
	donald.robson, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

On Tue, 12 Sep 2023 18:20:32 +0200
Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:

> > +/**
> > + * get_next_vm_bo_from_list() - get the next vm_bo element
> > + * @__gpuvm: The GPU VM
> > + * @__list_name: The name of the list we're iterating on
> > + * @__local_list: A pointer to the local list used to store already iterated items
> > + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> > + *
> > + * This helper is here to provide lockless list iteration. Lockless as in, the
> > + * iterator releases the lock immediately after picking the first element from
> > + * the list, so list insertion deletion can happen concurrently.  
> 
> Are the list spinlocks needed for that async state update from within 
> the dma-fence critical section we've discussed previously?

Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
hook will be in this situation (Panthor at the moment, PowerVR soon). I
get that Xe and Nouveau don't need that because they update the VM
state early (in the ioctl path), but I keep thinking this will hurt us
if we don't think it through from the beginning, because once you've
set this logic to depend only on resv locks, it will be pretty hard to
get back to a solution which lets synchronous VM_BINDs take precedence
over asynchronous requests, and, with vkQueueBindSparse() passing external
deps (plus the fact the VM_BIND queue might be pretty deep), it can
take a long time to get your synchronous VM_BIND executed...

Now, maybe the solution is something different, with early VM state
update for everyone (creation of to-be-[un]mapped drm_gpuva entries,
some of them being shadowed by already existing drm_gpuva that's
encoding the currently mapped region), and VM state patching when a
synchronous VM_BIND kicks in (we need to patch the previously queued
requests too, so they always have enough resources for the map/unmap
operations to succeed).

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13  7:03     ` Boris Brezillon
@ 2023-09-13  7:05       ` Dave Airlie
  2023-09-13  7:19         ` Boris Brezillon
  0 siblings, 1 reply; 77+ messages in thread
From: Dave Airlie @ 2023-09-13  7:05 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Thomas Hellström, Danilo Krummrich, daniel, matthew.brost,
	sarah.walker, donald.robson, christian.koenig, faith.ekstrand,
	dri-devel, nouveau, linux-kernel

On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
<boris.brezillon@collabora.com> wrote:
>
> On Tue, 12 Sep 2023 18:20:32 +0200
> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>
> > > +/**
> > > + * get_next_vm_bo_from_list() - get the next vm_bo element
> > > + * @__gpuvm: The GPU VM
> > > + * @__list_name: The name of the list we're iterating on
> > > + * @__local_list: A pointer to the local list used to store already iterated items
> > > + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> > > + *
> > > + * This helper is here to provide lockless list iteration. Lockless as in, the
> > > + * iterator releases the lock immediately after picking the first element from
> > > + * the list, so list insertion deletion can happen concurrently.
> >
> > Are the list spinlocks needed for that async state update from within
> > the dma-fence critical section we've discussed previously?
>
> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
> hook will be in this situation (Panthor at the moment, PowerVR soon). I
> get that Xe and Nouveau don't need that because they update the VM
> state early (in the ioctl path), but I keep thinking this will hurt us
> if we don't think it through from the beginning, because once you've
> set this logic to depend only on resv locks, it will be pretty hard to
> get back to a solution which lets synchronous VM_BINDs take precedence
> over asynchronous requests, and, with vkQueueBindSparse() passing external
> deps (plus the fact the VM_BIND queue might be pretty deep), it can
> take a long time to get your synchronous VM_BIND executed...

btw what is the use case for this? do we have actual vulkan
applications we know will have problems here?

it feels like a bit of premature optimisation, but maybe we have use cases.

Dave.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13  7:05       ` Dave Airlie
@ 2023-09-13  7:19         ` Boris Brezillon
  2023-09-13 10:39           ` Thomas Hellström
  0 siblings, 1 reply; 77+ messages in thread
From: Boris Brezillon @ 2023-09-13  7:19 UTC (permalink / raw)
  To: Dave Airlie
  Cc: Thomas Hellström, Danilo Krummrich, daniel, matthew.brost,
	sarah.walker, donald.robson, christian.koenig, faith.ekstrand,
	dri-devel, nouveau, linux-kernel

On Wed, 13 Sep 2023 17:05:42 +1000
Dave Airlie <airlied@gmail.com> wrote:

> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
> <boris.brezillon@collabora.com> wrote:
> >
> > On Tue, 12 Sep 2023 18:20:32 +0200
> > Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >  
> > > > +/**
> > > > + * get_next_vm_bo_from_list() - get the next vm_bo element
> > > > + * @__gpuvm: The GPU VM
> > > > + * @__list_name: The name of the list we're iterating on
> > > > + * @__local_list: A pointer to the local list used to store already iterated items
> > > > + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> > > > + *
> > > > + * This helper is here to provide lockless list iteration. Lockless as in, the
> > > > + * iterator releases the lock immediately after picking the first element from
> > > > + * the list, so list insertion deletion can happen concurrently.  
> > >
> > > Are the list spinlocks needed for that async state update from within
> > > the dma-fence critical section we've discussed previously?  
> >
> > Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
> > hook will be in this situation (Panthor at the moment, PowerVR soon). I
> > get that Xe and Nouveau don't need that because they update the VM
> > state early (in the ioctl path), but I keep thinking this will hurt us
> > if we don't think it through from the beginning, because once you've
> > set this logic to depend only on resv locks, it will be pretty hard to
> > get back to a solution which lets synchronous VM_BINDs take precedence
> > over asynchronous requests, and, with vkQueueBindSparse() passing external
> > deps (plus the fact the VM_BIND queue might be pretty deep), it can
> > take a long time to get your synchronous VM_BIND executed...  
> 
> btw what is the use case for this? do we have actual vulkan
> applications we know will have problems here?

I don't, but I think that's a concern Faith raised at some point (dates
back to when I was reading threads describing how VM_BIND on i915
should work, and I was clearly discovering this whole VM_BIND thing at
that time, so maybe I misunderstood).

> 
> it feels like a bit of premature optimisation, but maybe we have use cases.

Might be, but that's the sort of thing that would put us in a corner if
we don't have a plan for when the need arises. Besides, if we don't
want to support that case because it's too complicated, I'd recommend
dropping all the drm_gpuvm APIs that let people think this mode is
valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
confusion.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-12 23:36         ` Danilo Krummrich
@ 2023-09-13  9:14           ` Thomas Hellström
  2023-09-13 12:16             ` Danilo Krummrich
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-13  9:14 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

Hi!

On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström wrote:
> > 
> > On 9/12/23 18:50, Danilo Krummrich wrote:
> > > On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
> > > > Hi, Danilo,
> > > > 
> > > > On 9/9/23 17:31, Danilo Krummrich wrote:
> > > > > So far the DRM GPUVA manager offers common infrastructure to
> > > > > track GPU VA
> > > > > allocations and mappings, generically connect GPU VA mappings
> > > > > to their
> > > > > backing buffers and perform more complex mapping operations
> > > > > on the GPU VA
> > > > > space.
> > > > > 
> > > > > However, there are more design patterns commonly used by
> > > > > drivers, which
> > > > > can potentially be generalized in order to make the DRM GPUVA
> > > > > manager
> > > > > represent a basic GPU-VM implementation. In this context,
> > > > > this patch aims
> > > > > at generalizing the following elements.
> > > > > 
> > > > > 1) Provide a common dma-resv for GEM objects not being used
> > > > > outside of
> > > > >      this GPU-VM.
> > > > > 
> > > > > 2) Provide tracking of external GEM objects (GEM objects
> > > > > which are
> > > > >      shared with other GPU-VMs).
> > > > > 
> > > > > 3) Provide functions to efficiently lock all GEM objects dma-
> > > > > resv the
> > > > >      GPU-VM contains mappings of.
> > > > > 
> > > > > 4) Provide tracking of evicted GEM objects the GPU-VM
> > > > > contains mappings
> > > > >      of, such that validation of evicted GEM objects is
> > > > > accelerated.
> > > > > 
> > > > > 5) Provide some convinience functions for common patterns.
> > > > > 
> > > > > Rather than being designed as a "framework", the target is to
> > > > > make all
> > > > > features appear as a collection of optional helper functions,
> > > > > such that
> > > > > drivers are free to make use of the DRM GPUVA managers basic
> > > > > functionality and opt-in for other features without setting
> > > > > any feature
> > > > > flags, just by making use of the corresponding functions.
> > > > > 
> > > > > Big kudos to Boris Brezillon for his help to figure out
> > > > > locking for drivers
> > > > > updating the GPU VA space within the fence signalling path.
> > > > > 
> > > > > Suggested-by: Matthew Brost <matthew.brost@intel.com>
> > > > > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > > > > ---
> > > > >    drivers/gpu/drm/drm_gpuvm.c | 516
> > > > > ++++++++++++++++++++++++++++++++++++
> > > > >    include/drm/drm_gpuvm.h     | 197 ++++++++++++++
> > > > >    2 files changed, 713 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/drm_gpuvm.c
> > > > > b/drivers/gpu/drm/drm_gpuvm.c
> > > > > index f4411047dbb3..8e62a043f719 100644
> > > > > --- a/drivers/gpu/drm/drm_gpuvm.c
> > > > > +++ b/drivers/gpu/drm/drm_gpuvm.c
> > > > > @@ -73,6 +73,21 @@
> > > > >     * &drm_gem_object list of &drm_gpuvm_bos for an existing
> > > > > instance of this
> > > > >     * particular combination. If not existent a new instance
> > > > > is created and linked
> > > > >     * to the &drm_gem_object.
> > > > > + *
> > > > > + * &drm_gpuvm_bo structures, since unique for a given
> > > > > &drm_gpuvm, are also used
> > > > > + * as entry for the &drm_gpuvm's lists of external and
> > > > > evicted objects. Those
> > > > > + * list are maintained in order to accelerate locking of
> > > > > dma-resv locks and
> > > > > + * validation of evicted objects bound in a &drm_gpuvm. For
> > > > > instance the all
> > > > > + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be
> > > > > locked by calling
> > > > > + * drm_gpuvm_exec_lock(). Once locked drivers can call
> > > > > drm_gpuvm_validate() in
> > > > > + * order to validate all evicted &drm_gem_objects. It is
> > > > > also possible to lock
> > > > > + * additional &drm_gem_objects by providing the
> > > > > corresponding parameters to
> > > > > + * drm_gpuvm_exec_lock() as well as open code the &drm_exec
> > > > > loop while making
> > > > > + * use of helper functions such as drm_gpuvm_prepare_range()
> > > > > or
> > > > > + * drm_gpuvm_prepare_objects().
> > > > > + *
> > > > > + * Every bound &drm_gem_object is treated as external object
> > > > > when its &dma_resv
> > > > > + * structure is different than the &drm_gpuvm's common
> > > > > &dma_resv structure.
> > > > >     */
> > > > >    /**
> > > > > @@ -420,6 +435,20 @@
> > > > >     * Subsequent calls to drm_gpuvm_bo_obtain() for the same
> > > > > &drm_gpuvm and
> > > > >     * &drm_gem_object must be able to observe previous
> > > > > creations and destructions
> > > > >     * of &drm_gpuvm_bos in order to keep instances unique.
> > > > > + *
> > > > > + * The &drm_gpuvm's lists for keeping track of external and
> > > > > evicted objects are
> > > > > + * protected against concurrent insertion / removal and
> > > > > iteration internally.
> > > > > + *
> > > > > + * However, drivers still need to protect concurrent
> > > > > calls to functions
> > > > > + * iterating those lists, such as drm_gpuvm_validate() and
> > > > > + * drm_gpuvm_prepare_objects(). Every such function contains
> > > > > a particular
> > > > > + * comment and lockdep checks if possible.
> > > > > + *
> > > > > + * Functions adding or removing entries from those lists,
> > > > > such as
> > > > > + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be
> > > > > called with external
> > > > > + * locks being held, e.g. in order to avoid the
> > > > > corresponding list to be
> > > > > + * (safely) modified while potentially being iterated by
> > > > > other API functions.
> > > > > + * However, this is entirely optional.
> > > > >     */
> > > > >    /**
> > > > > @@ -632,6 +661,131 @@
> > > > >     *   }
> > > > >     */
> > > > > +/**
> > > > > + * get_next_vm_bo_from_list() - get the next vm_bo element
> > > > > + * @__gpuvm: The GPU VM
> > > > > + * @__list_name: The name of the list we're iterating on
> > > > > + * @__local_list: A pointer to the local list used to store
> > > > > already iterated items
> > > > > + * @__prev_vm_bo: The previous element we got from
> > > > > drm_gpuvm_get_next_cached_vm_bo()
> > > > > + *
> > > > > + * This helper is here to provide lockless list iteration.
> > > > > Lockless as in, the
> > > > > + * iterator releases the lock immediately after picking the
> > > > > first element from
> > > > > + * the list, so list insertion deletion can happen
> > > > > concurrently.
> > > > Are the list spinlocks needed for that async state update from
> > > > within the
> > > > dma-fence critical section we've discussed previously?
> > > Yes, but also for other reasons, see below.
> > > 
> > > > Otherwise it should be sufficient to protect the lists with the
> > > > gpuvm's resv
> > > > (or for the extobj list with an outer lock).
> > > > 
> > > > If those spinlocks are still needed in some situations, perhaps
> > > > could we
> > > > have an option to set them to NULL (Like IIRC the maple tree
> > > > allows for)?
> > > The evict spinlock is needed in any case, since in
> > > drm_gpuvm_bo_evict() we're
> > > holding only the dma-resv lock from the BO this function gets
> > > called for. Hence,
> > > the spinlock protects concurrent drm_gpuvm_bo_evict() calls with
> > > different BOs.
> > No. Only if you try to add external objects to the vm's evict list
> > from
> > within the evict code. That's not necessary since you loop through
> > all
> > external objects anyway when locking them so an "evicted" bool in
> > the vm_bo,
> > protected by the bo resv would be sufficient. The extobj locking
> > loop can
> > then add the bo to the evicted list.
> 
> And validate() can remove it while still holding all dma-resv locks,
> neat!
> However, what if two tasks are trying to lock the VA space
> concurrently? What
> do we do when the drm_gpuvm_bo's refcount drops to zero in
> drm_gpuva_unlink()?
> Are we guaranteed that at this point of time the drm_gpuvm_bo is not
> on the
> evicted list? Because otherwise we would call drm_gpuvm_bo_destroy()
> with the
> dma-resv lock held, which wouldn't be allowed, since
> drm_gpuvm_bo_destroy()
> might drop the last reference to the drm_gem_object and hence we'd
> potentially
> free the dma-resv lock while holding it, at least if it's an external
> object.

The easiest way in this scheme is to think of the lists as being protected
by the vm's resv lock. That means anybody calling unlink() must also
hold the vm's resv lock. (Which is OK from a UAF point of view, but
perhaps not from a locking inversion POV for an async list update).
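
A minimal sketch of that rule, assuming a synchronous unbind path and a
VM-private BO (obj->resv == gpuvm->resv, so the VM's resv doubles as the
GEM's gpuva lock); my_vm_unbind() and its arguments are made-up names:

static void my_vm_unbind(struct drm_gpuvm *gpuvm, struct drm_gpuva *va)
{
	/* In this scheme the VM's resv is what protects the extobj/evict
	 * lists, so hold it across anything that may drop the last
	 * drm_gpuvm_bo reference.
	 */
	dma_resv_lock(gpuvm->resv, NULL);
	drm_gpuva_remove(va);
	drm_gpuva_unlink(va); /* may put the last drm_gpuvm_bo reference */
	dma_resv_unlock(gpuvm->resv);
}
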

> 
> > > 
> > > For extobjs an outer lock would be enough in case of Xe, but I
> > > really would not
> > > like to add even more complexity just to get the spinlock out of
> > > the way in case
> > > the driver already has an outer lock protecting this path.
> > 
> > I must disagree here. These spinlocks and atomic operations are
> > pretty
> > costly and as discussed earlier this type of locking was the reason
> > (at
> > least according to the commit message) that made Christian drop the
> > XArray
> > use in drm_exec for the same set of objects: "The locking overhead
> > is
> > unecessary and measurable". IMHO the spinlock is the added
> > complexity and a
> > single wide lock following the drm locking guidelines set out by
> > Daniel and
> > David should really be the default choice with an opt-in for a
> > spinlock if
> > needed for async and pushing out to a wq is not an option.
> 
> For the external object list an outer lock would work as long as it's
> not the
> dma-resv lock of the corresponding GEM object, since here we actually
> need to
> remove the list entry from the external object list on
> drm_gpuvm_bo_destroy().
> It's just a bit weird design wise that drivers would need to take
> this outer
> lock on:
> 
> - drm_gpuvm_bo_extobj_add()
> - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
> - drm_gpuva_unlink()            (because it needs to call
> drm_gpuvm_bo_put())
> - drm_gpuvm_exec_lock()
> - drm_gpuvm_exec_lock_array()
> - drm_gpuvm_prepare_range()
> 
> Given that it seems reasonable to do all the required locking
> internally.

From a design POV, there has been a clear direction in Xe to make
things similar to mmap() / munmap(), so this outer lock, which in Xe is
an rwsem, is used in a similar way to the mmap_lock. It's protecting
the page-table structures and vma rb tree, the userptr structures and
the extobj list. Basically it's taken early in the exec IOCTL, the
VM_BIND ioctl, the compute rebind worker and the pagefault handler, so
all of the above are just asserting that it is taken in the correct
mode.

But strictly with this scheme one could also use the vm's dma_resv for
the extobj list since with drm_exec, it's locked before traversing the
list.

The whole point of this scheme is to rely on locks that you already are
supposed to be holding for various reasons and is simple to comprehend.
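
As a rough illustration of that assert-only pattern, assuming a driver-side
rwsem in the mmap_lock-like role (struct my_vm and its lock field are
made-up names, not Xe code):

struct my_vm {
	struct drm_gpuvm base;
	struct rw_semaphore lock; /* taken early in exec/bind/fault paths */
};

static void my_vm_assert_held(struct my_vm *vm, bool write)
{
	if (write)
		lockdep_assert_held_write(&vm->lock);
	else
		lockdep_assert_held(&vm->lock);
}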

> 
> In order to at least place lockdep checks, the driver would need to
> supply the
> corresponding lock's lockdep_map, because the GPUVM otherwise doesn't
> know about
> the lock.

Yes, that sounds reasonable. One lockdep map per list.

> 
> Out of curiosity, what is the overhead of a spin_lock() that doesn't
> need to
> spin? 

I guess it's hard to tell exactly, but it is much lower on modern x86
than what it used to be. Not sure about ARM, which is the other
architecture important to us. I figure if there is little cache-line
bouncing the main overhead comes from the implied barriers.

> 
> > 
> > A pretty simple way that would not add much code would be
> > 
> > static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
> > 				 spinlock_t *lock)
> > {
> > 	if (!gpuvm->resv_protected_lists)
> > 		spin_lock(lock);
> > }
> > 
> > > > For such drivers, that would require anybody calling unlink to
> > > > hold the vm's
> > > > resv, though.
> > > In V4 I want to go back to having a dedicated lock for the GEMs
> > > gpuva list (or
> > > VM_BO list to be more precise). We can't just use the dma-resv
> > > lock for that
> > > with VM_BO abstractions, because on destruction of a VM_BO we
> > > otherwise wouldn't
> > > be allowed to already hold the dma-resv lock. That's the fix I
> > > was referring to
> > > earlier.
> > 
> > Yeah, I can see the need for a dedicated lock for the GEM's gpuva
> > list, but
> > holding the vm's dma-resv lock across the unlink shouldn't be a
> > problem. We
> > may free the object and a pointer to the vm's resv during unlink
> > but we
> > don't free the vm's resv.  It'd be a matter of ensuring that any
> > calls to
> > unlink from *within* drm_gpuvm allows it to be held.
> 
> Drivers calling unlink() from the fence signaling path can't use the
> VM's
> dma-resv lock.

Yes, that made me a bit curious, because in the current version the code
requires the object's dma_resv for unlink(), which can't be grabbed from
the fence signaling path either. So are there any drivers actually
wanting to do that? If so, they will either need to resort to the
current spinlock solution or they will need to call unlink from a
workqueue item.
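
A rough sketch of the workqueue variant mentioned above; struct my_vma and
the helpers are hypothetical driver-side code, and whatever lock ends up
protecting the GEM's gpuva list would be taken from the worker, i.e. from
process context:

struct my_vma {
	struct drm_gpuva va;
	struct work_struct unlink_work;
};

static void my_vma_unlink_work(struct work_struct *work)
{
	struct my_vma *vma = container_of(work, struct my_vma, unlink_work);

	/* Process context: taking dma-resv / gpuva list locks is fine here. */
	drm_gpuva_unlink(&vma->va);
}

/* Called from the dma-fence signalling critical path, e.g. run_job(). */
static void my_vma_defer_unlink(struct my_vma *vma)
{
	INIT_WORK(&vma->unlink_work, my_vma_unlink_work);
	schedule_work(&vma->unlink_work);
}
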
> 
> Also, what if the object is an external object? We can't use the VM's
> dma-resv
> lock here.

Why? Typically (sync) unlink is only ever called from an unbind-like
operation where it should be trivial to grab the vm's resv, or, for
that matter, any outer lock protecting the extobj list. The rule would
be that drm_gpuvm_bo::entry::extobj and drm_gpuvm_bo::entry::evict are
protected by either the vm's dma_resv or, in the case of the extobj
list, possibly an outer lock.

>  And we can't have the GEM objs dma-resv lock held when calling
> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
> refcount drops
> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might
> drop the
> last reference of the GEM object.

Yes, but this is a different problem, namely what exactly protects
drm_gpuvm_bo::entry::gem. Either, as you suggest, an internal per-bo list
lock, or, if we want to keep the bo's dma_resv, we need to ensure that
the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
pointer and doesn't implicitly rely on the gpuvm_bo's refcount (I know
Boris didn't like that, but requiring an explicit refcount for a
pointer you dereference, unless you're under a lock that keeps the
object alive, is pretty much required?). But anyway, for the
drm_gpuvm_bo::entry::gem list protection (bo resv or internal spinlock)
I don't have a strong preference.

>  All those problems go away with a dedicated
> GEM gpuva list lock.

I don't think these are real problems.
With the exception of the eviction list "trick", where we currently have
a slightly different approach to collect external bos needing rebinding,
we have this working fine.

TBH I think pretty much the only situation where the spinlock is needed
is for async updates of these lists, unless a wq item can be used for
that, but it doesn't really seem like the current code allows for such
updates anyway? It complicates the code a lot, adds overhead and also
adds the requirement for refcounting during list traversal.

/Thomas

> 
> > 
> > /Thomas
> > 
> > 
> > > > It seems that with that also the refcount could be make non-
> > > > atomic.
> > > > 
> > > > All in the spirit of the drm locking guidelines "use big locks
> > > > when
> > > > possible".
> > > > Lower level locks only when necessary for performance or
> > > > locking inversion?
> > > > 
> > > > /Thomas
> > > > 
> > > > 
> > > > > + *
> > > > > + * Elements popped from the original list are kept in a
> > > > > local list, so removal
> > > > > + * and is_empty checks can still happen while we're
> > > > > iterating the list.
> > > > > + */
> > > > > +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo) \
> > > > > +       ({                                                                      \
> > > > > +               struct drm_gpuvm_bo *__vm_bo;                                   \
> > > > > +                                                                               \
> > > > > +               drm_gpuvm_bo_put(__prev_vm_bo);                                 \
> > > > > +                                                                               \
> > > > > +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
> > > > > +               while (!list_empty(&(__gpuvm)->__list_name.list)) {             \
> > > > > +                       __vm_bo = list_first_entry(&(__gpuvm)->__list_name.list, \
> > > > > +                                                  struct drm_gpuvm_bo,         \
> > > > > +                                                  list.entry.__list_name);     \
> > > > > +                       if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {            \
> > > > > +                               list_move_tail(&(__vm_bo)->list.entry.__list_name, \
> > > > > +                                              __local_list);                   \
> > > > > +                               break;                                          \
> > > > > +                       } else {                                                \
> > > > > +                               list_del_init(&(__vm_bo)->list.entry.__list_name); \
> > > > > +                               __vm_bo = NULL;                                 \
> > > > > +                       }                                                       \
> > > > > +               }                                                               \
> > > > > +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
> > > > > +                                                                               \
> > > > > +               __vm_bo;                                                        \
> > > > > +       })
> > > > > +
> > > > > +/**
> > > > > + * for_each_vm_bo_in_list() - internal vm_bo list iterator
> > > > > + *
> > > > > + * This helper is here to provide lockless list iteration.
> > > > > Lockless as in, the
> > > > > + * iterator releases the lock immediately after picking the
> > > > > first element from the
> > > > > + * list, so list insertion and deletion can happen
> > > > > concurrently.
> > > > > + *
> > > > > + * Typical use:
> > > > > + *
> > > > > + *     struct drm_gpuvm_bo *vm_bo;
> > > > > + *     LIST_HEAD(my_local_list);
> > > > > + *
> > > > > + *     ret = 0;
> > > > > + *     drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>,
> > > > > &my_local_list, vm_bo) {
> > > > > + *             ret = do_something_with_vm_bo(..., vm_bo);
> > > > > + *             if (ret)
> > > > > + *                     break;
> > > > > + *     }
> > > > > + *     drm_gpuvm_bo_put(vm_bo);
> > > > > + *     drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>,
> > > > > &my_local_list);
> > > > > + *
> > > > > + *
> > > > > + * Only used for internal list iterations, not meant to be
> > > > > exposed to the outside
> > > > > + * world.
> > > > > + */
> > > > > +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)    \
> > > > > +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
> > > > > +                                               __local_list, NULL);            \
> > > > > +            __vm_bo;                                                           \
> > > > > +            __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
> > > > > +                                               __local_list, __vm_bo))         \
> > > > > +
> > > > > +/**
> > > > > + * restore_vm_bo_list() - move vm_bo elements back to their
> > > > > original list
> > > > > + * @__gpuvm: The GPU VM
> > > > > + * @__list_name: The name of the list we're iterating on
> > > > > + * @__local_list: A pointer to the local list used to store
> > > > > already iterated items
> > > > > + *
> > > > > + * When we're done iterating a vm_bo list, we should call
> > > > > restore_vm_bo_list()
> > > > > + * to restore the original state and let new iterations take
> > > > > place.
> > > > > + */
> > > > > +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)                 \
> > > > > +       do {                                                                    \
> > > > > +               /* Merge back the two lists, moving local list elements to the  \
> > > > > +                * head to preserve previous ordering, in case it matters.      \
> > > > > +                */                                                             \
> > > > > +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
> > > > > +               list_splice(__local_list, &(__gpuvm)->__list_name.list);        \
> > > > > +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
> > > > > +       } while (0)
> > > > > +/**
> > > > > + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given
> > > > > list
> > > > > + * @__vm_bo: the &drm_gpuvm_bo
> > > > > + * @__list_name: the name of the list to insert into
> > > > > + *
> > > > > + * Inserts the given @__vm_bo into the list specified by
> > > > > @__list_name and
> > > > > + * increases the vm_bo's reference count.
> > > > > + */
> > > > > +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                            \
> > > > > +       do {                                                                    \
> > > > > +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
> > > > > +               if (list_empty(&(__vm_bo)->list.entry.__list_name))             \
> > > > > +                       list_add_tail(&(__vm_bo)->list.entry.__list_name,       \
> > > > > +                                     &(__vm_bo)->vm->__list_name.list);        \
> > > > > +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
> > > > > +       } while (0)
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given
> > > > > list
> > > > > + * @__vm_bo: the &drm_gpuvm_bo
> > > > > + * @__list_name: the name of the list to insert into
> > > > > + *
> > > > > + * Removes the given @__vm_bo from the list specified by
> > > > > @__list_name and
> > > > > + * decreases the vm_bo's reference count.
> > > > > + */
> > > > > +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)                            \
> > > > > +       do {                                                                    \
> > > > > +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
> > > > > +               if (!list_empty(&(__vm_bo)->list.entry.__list_name))            \
> > > > > +                       list_del_init(&(__vm_bo)->list.entry.__list_name);      \
> > > > > +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
> > > > > +       } while (0)
> > > > > +
> > > > > +static int __must_check
> > > > > +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
> > > > > +
> > > > >    #define to_drm_gpuva(__node) container_of((__node), struct
> > > > > drm_gpuva, rb.node)
> > > > >    #define GPUVA_START(node) ((node)->va.addr)
> > > > > @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
> > > > > struct drm_device *drm,
> > > > >         gpuvm->rb.tree = RB_ROOT_CACHED;
> > > > >         INIT_LIST_HEAD(&gpuvm->rb.list);
> > > > > +       INIT_LIST_HEAD(&gpuvm->extobj.list);
> > > > > +       spin_lock_init(&gpuvm->extobj.lock);
> > > > > +
> > > > > +       INIT_LIST_HEAD(&gpuvm->evict.list);
> > > > > +       spin_lock_init(&gpuvm->evict.lock);
> > > > > +
> > > > >         drm_gpuva_check_overflow(start_offset, range);
> > > > >         gpuvm->mm_start = start_offset;
> > > > >         gpuvm->mm_range = range;
> > > > > @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
> > > > > *gpuvm)
> > > > >         WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
> > > > >              "GPUVA tree is not empty, potentially leaking
> > > > > memory.\n");
> > > > > +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
> > > > > should be empty.\n");
> > > > > +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
> > > > > should be empty.\n");
> > > > > +
> > > > >         drm_gem_private_object_fini(&gpuvm->d_obj);
> > > > >    }
> > > > >    EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
> > > > > +/**
> > > > > + * drm_gpuvm_prepare_objects() - prepare all associated BOs
> > > > > + * @gpuvm: the &drm_gpuvm
> > > > > + * @exec: the &drm_exec locking context
> > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > + *
> > > > > + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the
> > > > > given
> > > > > + * &drm_gpuvm contains mappings of.
> > > > > + *
> > > > > + * Using this function directly, it is the drivers
> > > > > responsibility to call
> > > > > + * drm_exec_init() and drm_exec_fini() accordingly.
> > > > > + *
> > > > > + * Note: This function is safe against concurrent insertion
> > > > > and removal of
> > > > > + * external objects, however it is not safe against
> > > > > concurrent usage itself.
> > > > > + *
> > > > > + * Drivers need to make sure to protect this case with
> > > > > either an outer VM lock
> > > > > + * or by calling drm_gpuvm_prepare_vm() before this function
> > > > > within the
> > > > > + * drm_exec_until_all_locked() loop, such that the GPUVM's
> > > > > dma-resv lock ensures
> > > > > + * mutual exclusion.
> > > > > + *
> > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > + */
> > > > > +int
> > > > > +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> > > > > +                         struct drm_exec *exec,
> > > > > +                         unsigned int num_fences)
> > > > > +{
> > > > > +       struct drm_gpuvm_bo *vm_bo;
> > > > > +       LIST_HEAD(extobjs);
> > > > > +       int ret = 0;
> > > > > +
> > > > > +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
> > > > > vm_bo) {
> > > > > +               ret = drm_exec_prepare_obj(exec, vm_bo->obj,
> > > > > num_fences);
> > > > > +               if (ret)
> > > > > +                       break;
> > > > > +       }
> > > > > +       /* Drop ref in case we break out of the loop. */
> > > > > +       drm_gpuvm_bo_put(vm_bo);
> > > > > +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
> > > > > +
> > > > > +       return ret;
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_prepare_range() - prepare all BOs mapped within
> > > > > a given range
> > > > > + * @gpuvm: the &drm_gpuvm
> > > > > + * @exec: the &drm_exec locking context
> > > > > + * @addr: the start address within the VA space
> > > > > + * @range: the range to iterate within the VA space
> > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > + *
> > > > > + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
> > > > > mapped between @addr
> > > > > + * and @addr + @range.
> > > > > + *
> > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > + */
> > > > > +int
> > > > > +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
> > > > > drm_exec *exec,
> > > > > +                       u64 addr, u64 range, unsigned int
> > > > > num_fences)
> > > > > +{
> > > > > +       struct drm_gpuva *va;
> > > > > +       u64 end = addr + range;
> > > > > +       int ret;
> > > > > +
> > > > > +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
> > > > > +               struct drm_gem_object *obj = va->gem.obj;
> > > > > +
> > > > > +               ret = drm_exec_prepare_obj(exec, obj,
> > > > > num_fences);
> > > > > +               if (ret)
> > > > > +                       return ret;
> > > > > +       }
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_exec_lock() - lock all dma-resv of all
> > > > > associated BOs
> > > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > + * @interruptible: sleep interruptible if waiting
> > > > > + *
> > > > > + * Acquires all dma-resv locks of all &drm_gem_objects the
> > > > > given
> > > > > + * &drm_gpuvm contains mappings of.
> > > > > + *
> > > > > + * Additionally, when calling this function with struct
> > > > > drm_gpuvm_exec::extra
> > > > > + * being set the driver receives the given @fn callback to
> > > > > lock additional
> > > > > + * dma-resv in the context of the &drm_gpuvm_exec instance.
> > > > > Typically, drivers
> > > > > + * would call drm_exec_prepare_obj() from within this
> > > > > callback.
> > > > > + *
> > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > + */
> > > > > +int
> > > > > +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> > > > > +                   unsigned int num_fences,
> > > > > +                   bool interruptible)
> > > > > +{
> > > > > +       struct drm_gpuvm *gpuvm = vm_exec->vm;
> > > > > +       struct drm_exec *exec = &vm_exec->exec;
> > > > > +       uint32_t flags;
> > > > > +       int ret;
> > > > > +
> > > > > +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT :
> > > > > 0 |
> > > > > +               DRM_EXEC_IGNORE_DUPLICATES;
> > > > > +
> > > > > +       drm_exec_init(exec, flags);
> > > > > +
> > > > > +       drm_exec_until_all_locked(exec) {
> > > > > +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
> > > > > num_fences);
> > > > > +               drm_exec_retry_on_contention(exec);
> > > > > +               if (ret)
> > > > > +                       goto err;
> > > > > +
> > > > > +               ret = drm_gpuvm_prepare_objects(gpuvm, exec,
> > > > > num_fences);
> > > > > +               drm_exec_retry_on_contention(exec);
> > > > > +               if (ret)
> > > > > +                       goto err;
> > > > > +
> > > > > +               if (vm_exec->extra.fn) {
> > > > > +                       ret = vm_exec->extra.fn(vm_exec,
> > > > > num_fences);
> > > > > +                       drm_exec_retry_on_contention(exec);
> > > > > +                       if (ret)
> > > > > +                               goto err;
> > > > > +               }
> > > > > +       }
> > > > > +
> > > > > +       return 0;
> > > > > +
> > > > > +err:
> > > > > +       drm_exec_fini(exec);
> > > > > +       return ret;
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
> > > > > +
> > > > > +static int
> > > > > +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int
> > > > > num_fences)
> > > > > +{
> > > > > +       struct {
> > > > > +               struct drm_gem_object **objs;
> > > > > +               unsigned int num_objs;
> > > > > +       } *args = vm_exec->extra.priv;
> > > > > +
> > > > > +       return drm_exec_prepare_array(&vm_exec->exec, args-
> > > > > >objs,
> > > > > +                                     args->num_objs,
> > > > > num_fences);
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all
> > > > > associated BOs
> > > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > > + * @objs: additional &drm_gem_objects to lock
> > > > > + * @num_objs: the number of additional &drm_gem_objects to
> > > > > lock
> > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > + * @interruptible: sleep interruptible if waiting
> > > > > + *
> > > > > + * Acquires all dma-resv locks of all &drm_gem_objects the
> > > > > given &drm_gpuvm
> > > > > + * contains mappings of, plus the ones given through @objs.
> > > > > + *
> > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > + */
> > > > > +int
> > > > > +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
> > > > > +                         struct drm_gem_object **objs,
> > > > > +                         unsigned int num_objs,
> > > > > +                         unsigned int num_fences,
> > > > > +                         bool interruptible)
> > > > > +{
> > > > > +       struct {
> > > > > +               struct drm_gem_object **objs;
> > > > > +               unsigned int num_objs;
> > > > > +       } args;
> > > > > +
> > > > > +       args.objs = objs;
> > > > > +       args.num_objs = num_objs;
> > > > > +
> > > > > +       vm_exec->extra.fn = fn_lock_array;
> > > > > +       vm_exec->extra.priv = &args;
> > > > > +
> > > > > +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
> > > > > interruptible);
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
> > > > > within a given range
> > > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > > + * @addr: the start address within the VA space
> > > > > + * @range: the range to iterate within the VA space
> > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > + * @interruptible: sleep interruptible if waiting
> > > > > + *
> > > > > + * Acquires all dma-resv locks of all &drm_gem_objects
> > > > > mapped between @addr and
> > > > > + * @addr + @range.
> > > > > + *
> > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > + */
> > > > > +int
> > > > > +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
> > > > > +                         u64 addr, u64 range,
> > > > > +                         unsigned int num_fences,
> > > > > +                         bool interruptible)
> > > > > +{
> > > > > +       struct drm_gpuvm *gpuvm = vm_exec->vm;
> > > > > +       struct drm_exec *exec = &vm_exec->exec;
> > > > > +       uint32_t flags;
> > > > > +       int ret;
> > > > > +
> > > > > +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT :
> > > > > 0 |
> > > > > +               DRM_EXEC_IGNORE_DUPLICATES;
> > > > > +
> > > > > +       drm_exec_init(exec, flags);
> > > > > +
> > > > > +       drm_exec_until_all_locked(exec) {
> > > > > +               ret = drm_gpuvm_prepare_range(gpuvm, exec,
> > > > > addr, range,
> > > > > +                                             num_fences);
> > > > > +               drm_exec_retry_on_contention(exec);
> > > > > +               if (ret)
> > > > > +                       goto err;
> > > > > +       }
> > > > > +
> > > > > +       return ret;
> > > > > +
> > > > > +err:
> > > > > +       drm_exec_fini(exec);
> > > > > +       return ret;
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_validate() - validate all BOs marked as evicted
> > > > > + * @gpuvm: the &drm_gpuvm to validate evicted BOs
> > > > > + *
> > > > > + * Calls the &drm_gpuvm_ops.bo_validate callback for all
> > > > > evicted buffer
> > > > > + * objects being mapped in the given &drm_gpuvm.
> > > > > + *
> > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > + */
> > > > > +int
> > > > > +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
> > > > > +{
> > > > > +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
> > > > > +       struct drm_gpuvm_bo *vm_bo;
> > > > > +       LIST_HEAD(evict);
> > > > > +       int ret = 0;
> > > > > +
> > > > > +       if (unlikely(!ops || !ops->bo_validate))
> > > > > +               return -ENOTSUPP;
> > > > > +
> > > > > +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
> > > > > +               dma_resv_assert_held(vm_bo->obj->resv);
> > > > > +               ret = ops->bo_validate(vm_bo->obj);
> > > > > +               if (ret)
> > > > > +                       break;
> > > > > +       }
> > > > > +       /* Drop ref in case we break out of the loop. */
> > > > > +       drm_gpuvm_bo_put(vm_bo);
> > > > > +       restore_vm_bo_list(gpuvm, evict, &evict);
> > > > > +
> > > > > +       return ret;
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_resv_add_fence - add fence to private and all
> > > > > extobj
> > > > > + * dma-resv
> > > > > + * @gpuvm: the &drm_gpuvm to add a fence to
> > > > > + * @exec: the &drm_exec locking context
> > > > > + * @fence: fence to add
> > > > > + * @private_usage: private dma-resv usage
> > > > > + * @extobj_usage: extobj dma-resv usage
> > > > > + */
> > > > > +void
> > > > > +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> > > > > +                        struct drm_exec *exec,
> > > > > +                        struct dma_fence *fence,
> > > > > +                        enum dma_resv_usage private_usage,
> > > > > +                        enum dma_resv_usage extobj_usage)
> > > > > +{
> > > > > +       struct drm_gem_object *obj;
> > > > > +       unsigned long index;
> > > > > +
> > > > > +       drm_exec_for_each_locked_object(exec, index, obj) {
> > > > > +               dma_resv_assert_held(obj->resv);
> > > > > +               dma_resv_add_fence(obj->resv, fence,
> > > > > +                                  drm_gpuvm_is_extobj(gpuvm, obj) ?
> > > > > +                                  extobj_usage : private_usage);
> > > > > +       }
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
> > > > > +
> > > > >    /**
> > > > >     * drm_gpuvm_bo_create() - create a new instance of struct
> > > > > drm_gpuvm_bo
> > > > >     * @gpuvm: The &drm_gpuvm the @obj is mapped in.
> > > > > @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
> > > > > *gpuvm,
> > > > >         INIT_LIST_HEAD(&vm_bo->list.gpuva);
> > > > >         INIT_LIST_HEAD(&vm_bo->list.entry.gem);
> > > > > +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
> > > > > +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
> > > > > +
> > > > >         drm_gem_object_get(obj);
> > > > >         return vm_bo;
> > > > > @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
> > > > >         drm_gem_gpuva_assert_lock_held(vm_bo->obj);
> > > > > +       spin_lock(&gpuvm->extobj.lock);
> > > > > +       list_del(&vm_bo->list.entry.extobj);
> > > > > +       spin_unlock(&gpuvm->extobj.lock);
> > > > > +
> > > > > +       spin_lock(&gpuvm->evict.lock);
> > > > > +       list_del(&vm_bo->list.entry.evict);
> > > > > +       spin_unlock(&gpuvm->evict.lock);
> > > > > +
> > > > >         list_del(&vm_bo->list.entry.gem);
> > > > >         drm_gem_object_put(obj);
> > > > > @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
> > > > >     * @vm_bo: the &drm_gpuvm_bo to release the reference of
> > > > >     *
> > > > >     * This releases a reference to @vm_bo.
> > > > > + *
> > > > > + * If the reference count drops to zero, the &gpuvm_bo is destroyed, which
> > > > > + * includes removing it from the GEM's gpuva list. Hence, if a call to this
> > > > > + * function can potentially let the reference count drop to zero, the caller
> > > > > + * must hold the dma-resv or driver specific GEM gpuva lock.
> > > > >     */
> > > > >    void
> > > > >    drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
> > > > > @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
> > > > > *vm_bo)
> > > > >    }
> > > > >    EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
> > > > > +static int __must_check
> > > > > +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
> > > > > +{
> > > > > +       return kref_get_unless_zero(&vm_bo->kref);
> > > > > +}
> > > > > +
> > > > >    static struct drm_gpuvm_bo *
> > > > >    __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
> > > > >                     struct drm_gem_object *obj)
> > > > > @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
> > > > > drm_gpuvm_bo *__vm_bo)
> > > > >    }
> > > > >    EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
> > > > > +/**
> > > > > + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its
> > > > > &drm_gpuvm's
> > > > > + * extobj list
> > > > > + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's extobj list.
> > > > > + *
> > > > > + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if it is not on the
> > > > > + * list already and if the corresponding &drm_gem_object actually is an
> > > > > + * external object.
> > > > > + */
> > > > > +void
> > > > > +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
> > > > > +{
> > > > > +       struct drm_gpuvm *gpuvm = vm_bo->vm;
> > > > > +
> > > > > +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
> > > > > +               drm_gpuvm_bo_list_add(vm_bo, extobj);
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to / from a
> > > > > + * &drm_gpuvm's evicted list
> > > > > + * @obj: the &drm_gem_object to add or remove
> > > > > + * @evict: indicates whether the object is evicted
> > > > > + *
> > > > > + * Adds a &drm_gem_object to or removes it from the evicted list of all
> > > > > + * &drm_gpuvms containing a mapping of this &drm_gem_object.
> > > > > + */
> > > > > +void
> > > > > +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
> > > > > +{
> > > > > +       struct drm_gpuvm_bo *vm_bo;
> > > > > +
> > > > > +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
> > > > > +               if (evict)
> > > > > +                       drm_gpuvm_bo_list_add(vm_bo, evict);
> > > > > +               else
> > > > > +                       drm_gpuvm_bo_list_del(vm_bo, evict);
> > > > > +       }
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
> > > > > +
> > > > >    static int
> > > > >    __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
> > > > >                    struct drm_gpuva *va)
> > > > > diff --git a/include/drm/drm_gpuvm.h
> > > > > b/include/drm/drm_gpuvm.h
> > > > > index afa50b9059a2..834bb6d6617e 100644
> > > > > --- a/include/drm/drm_gpuvm.h
> > > > > +++ b/include/drm/drm_gpuvm.h
> > > > > @@ -26,10 +26,12 @@
> > > > >     */
> > > > >    #include <linux/list.h>
> > > > > +#include <linux/dma-resv.h>
> > > > >    #include <linux/rbtree.h>
> > > > >    #include <linux/types.h>
> > > > >    #include <drm/drm_gem.h>
> > > > > +#include <drm/drm_exec.h>
> > > > >    struct drm_gpuvm;
> > > > >    struct drm_gpuvm_bo;
> > > > > @@ -259,6 +261,38 @@ struct drm_gpuvm {
> > > > >          * space
> > > > >          */
> > > > >         struct dma_resv *resv;
> > > > > +
> > > > > +       /**
> > > > > +        * @extobj: structure holding the extobj list
> > > > > +        */
> > > > > +       struct {
> > > > > +               /**
> > > > > +                * @list: &list_head storing &drm_gpuvm_bos
> > > > > serving as
> > > > > +                * external object
> > > > > +                */
> > > > > +               struct list_head list;
> > > > > +
> > > > > +               /**
> > > > > +                * @lock: spinlock to protect the extobj list
> > > > > +                */
> > > > > +               spinlock_t lock;
> > > > > +       } extobj;
> > > > > +
> > > > > +       /**
> > > > > +        * @evict: structure holding the evict list and evict
> > > > > list lock
> > > > > +        */
> > > > > +       struct {
> > > > > +               /**
> > > > > +                * @list: &list_head storing &drm_gpuvm_bos
> > > > > currently being
> > > > > +                * evicted
> > > > > +                */
> > > > > +               struct list_head list;
> > > > > +
> > > > > +               /**
> > > > > +                * @lock: spinlock to protect the evict list
> > > > > +                */
> > > > > +               spinlock_t lock;
> > > > > +       } evict;
> > > > >    };
> > > > >    void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
> > > > > drm_device *drm,
> > > > > @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
> > > > > *gpuvm, struct drm_device *drm,
> > > > >                     const struct drm_gpuvm_ops *ops);
> > > > >    void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
> > > > > +/**
> > > > > + * drm_gpuvm_is_extobj() - indicates whether the given
> > > > > &drm_gem_object is an
> > > > > + * external object
> > > > > + * @gpuvm: the &drm_gpuvm to check
> > > > > + * @obj: the &drm_gem_object to check
> > > > > + *
> > > > > + * Returns: true if the &drm_gem_object's &dma_resv differs from the
> > > > > + * &drm_gpuvm's &dma_resv, false otherwise
> > > > > + */
> > > > > +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
> > > > > *gpuvm,
> > > > > +                                      struct drm_gem_object
> > > > > *obj)
> > > > > +{
> > > > > +       return obj && obj->resv != gpuvm->resv;
> > > > > +}
> > > > > +
> > > > >    static inline struct drm_gpuva *
> > > > >    __drm_gpuva_next(struct drm_gpuva *va)
> > > > >    {
> > > > > @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
> > > > >    #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__)
> > > > > \
> > > > >         list_for_each_entry_safe(va__, next__, &(gpuvm__)-
> > > > > >rb.list, rb.entry)
> > > > > +/**
> > > > > + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
> > > > > &drm_exec
> > > > > + *
> > > > > + * This structure should be created on the stack as
> > > > > &drm_exec should be.
> > > > > + *
> > > > > + * Optionally, @extra can be set in order to lock additional
> > > > > &drm_gem_objects.
> > > > > + */
> > > > > +struct drm_gpuvm_exec {
> > > > > +       /**
> > > > > +        * @exec: the &drm_exec structure
> > > > > +        */
> > > > > +       struct drm_exec exec;
> > > > > +
> > > > > +       /**
> > > > > +        * @vm: the &drm_gpuvm to lock its DMA reservations
> > > > > +        */
> > > > > +       struct drm_gpuvm *vm;
> > > > > +
> > > > > +       /**
> > > > > +        * @extra: Callback and corresponding private data
> > > > > for the driver to
> > > > > +        * lock arbitrary additional &drm_gem_objects.
> > > > > +        */
> > > > > +       struct {
> > > > > +               /**
> > > > > +                * @fn: The driver callback to lock
> > > > > additional &drm_gem_objects.
> > > > > +                */
> > > > > +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
> > > > > +                         unsigned int num_fences);
> > > > > +
> > > > > +               /**
> > > > > +                * @priv: driver private data for the @fn
> > > > > callback
> > > > > +                */
> > > > > +               void *priv;
> > > > > +       } extra;
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
> > > > > resv
> > > > > + * @gpuvm: the &drm_gpuvm
> > > > > + * @exec: the &drm_exec context
> > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > + *
> > > > > + * Calls drm_exec_prepare_obj() for the GPUVM's dummy &drm_gem_object.
> > > > > + *
> > > > > + * When using this function directly, it is the driver's responsibility to
> > > > > + * call drm_exec_init() and drm_exec_fini() accordingly.
> > > > > + *
> > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > + */
> > > > > +static inline int
> > > > > +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
> > > > > +                    struct drm_exec *exec,
> > > > > +                    unsigned int num_fences)
> > > > > +{
> > > > > +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
> > > > > num_fences);
> > > > > +}
> > > > > +
> > > > > +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> > > > > +                             struct drm_exec *exec,
> > > > > +                             unsigned int num_fences);
> > > > > +
> > > > > +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
> > > > > +                           struct drm_exec *exec,
> > > > > +                           u64 addr, u64 range,
> > > > > +                           unsigned int num_fences);
> > > > > +
> > > > > +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> > > > > +                       unsigned int num_fences,
> > > > > +                       bool interruptible);
> > > > > +
> > > > > +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
> > > > > *vm_exec,
> > > > > +                             struct drm_gem_object **objs,
> > > > > +                             unsigned int num_objs,
> > > > > +                             unsigned int num_fences,
> > > > > +                             bool interruptible);
> > > > > +
> > > > > +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
> > > > > *vm_exec,
> > > > > +                             u64 addr, u64 range,
> > > > > +                             unsigned int num_fences,
> > > > > +                             bool interruptible);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
> > > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > > + *
> > > > > + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
> > > > > + * through drm_gpuvm_exec_lock() or its variants.
> > > > > + */
> > > > > +static inline void
> > > > > +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
> > > > > +{
> > > > > +       drm_exec_fini(&vm_exec->exec);
> > > > > +}
> > > > > +
> > > > > +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
> > > > > +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> > > > > +                             struct drm_exec *exec,
> > > > > +                             struct dma_fence *fence,
> > > > > +                             enum dma_resv_usage
> > > > > private_usage,
> > > > > +                             enum dma_resv_usage
> > > > > extobj_usage);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_exec_resv_add_fence() - add fence to private and all extobj dma-resv
> > > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > > + * @fence: fence to add
> > > > > + * @private_usage: private dma-resv usage
> > > > > + * @extobj_usage: extobj dma-resv usage
> > > > > + *
> > > > > + * See drm_gpuvm_resv_add_fence().
> > > > > + */
> > > > > +static inline void
> > > > > +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
> > > > > *vm_exec,
> > > > > +                             struct dma_fence *fence,
> > > > > +                             enum dma_resv_usage
> > > > > private_usage,
> > > > > +                             enum dma_resv_usage
> > > > > extobj_usage)
> > > > > +{
> > > > > +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec,
> > > > > fence,
> > > > > +                                private_usage,
> > > > > extobj_usage);
> > > > > +}
> > > > > +
> > > > >    /**
> > > > >     * struct drm_gpuvm_bo - structure representing a
> > > > > &drm_gpuvm and
> > > > >     * &drm_gem_object combination
> > > > > @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
> > > > >                          * gpuva list.
> > > > >                          */
> > > > >                         struct list_head gem;
> > > > > +
> > > > > +                       /**
> > > > > +                        * @extobj: List entry to attach to the
> > > > > +                        * &drm_gpuvm's extobj list.
> > > > > +                        */
> > > > > +                       struct list_head extobj;
> > > > > +
> > > > > +                       /**
> > > > > +                        * @evict: List entry to attach to the
> > > > > +                        * &drm_gpuvm's evict list.
> > > > > +                        */
> > > > > +                       struct list_head evict;
> > > > >                 } entry;
> > > > >         } list;
> > > > >    };
> > > > > @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
> > > > >    drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
> > > > >                   struct drm_gem_object *obj);
> > > > > +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
> > > > > evict);
> > > > > +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
> > > > > +
> > > > >    /**
> > > > >     * drm_gpuvm_bo_for_each_va() - iterator to walk over a
> > > > > list of &drm_gpuva
> > > > >     * @va__: &drm_gpuva structure to assign to in each
> > > > > iteration step
> > > > > @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
> > > > >          * used.
> > > > >          */
> > > > >         int (*sm_step_unmap)(struct drm_gpuva_op *op, void
> > > > > *priv);
> > > > > +
> > > > > +       /**
> > > > > +        * @bo_validate: called from drm_gpuvm_validate()
> > > > > +        *
> > > > > +        * Drivers receive this callback for every evicted
> > > > > &drm_gem_object being
> > > > > +        * mapped in the corresponding &drm_gpuvm.
> > > > > +        *
> > > > > +        * Typically, drivers would call their driver
> > > > > specific variant of
> > > > > +        * ttm_bo_validate() from within this callback.
> > > > > +        */
> > > > > +       int (*bo_validate)(struct drm_gem_object *obj);
> > > > >    };
> > > > >    int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
> > 
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13  7:19         ` Boris Brezillon
@ 2023-09-13 10:39           ` Thomas Hellström
  2023-09-13 11:33             ` Boris Brezillon
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-13 10:39 UTC (permalink / raw)
  To: Boris Brezillon, Dave Airlie
  Cc: Danilo Krummrich, daniel, matthew.brost, sarah.walker,
	donald.robson, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

Hi,

On 9/13/23 09:19, Boris Brezillon wrote:
> On Wed, 13 Sep 2023 17:05:42 +1000
> Dave Airlie <airlied@gmail.com> wrote:
>
>> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
>> <boris.brezillon@collabora.com> wrote:
>>> On Tue, 12 Sep 2023 18:20:32 +0200
>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>>>   
>>>>> +/**
>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>> + * @__gpuvm: The GPU VM
>>>>> + * @__list_name: The name of the list we're iterating on
>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
>>>>> + *
>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
>>>>> + * iterator releases the lock immediately after picking the first element from
>>>>> + * the list, so list insertion deletion can happen concurrently.
>>>> Are the list spinlocks needed for that async state update from within
>>>> the dma-fence critical section we've discussed previously?
>>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
>>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
>>> get that Xe and Nouveau don't need that because they update the VM
>>> state early (in the ioctl path), but I keep thinking this will hurt us
>>> if we don't think it through from the beginning, because once you've
>>> set this logic to depend only on resv locks, it will be pretty hard to
>>> get back to a solution which lets synchronous VM_BINDs take precedence
>>> on asynchronous request, and, with vkQueueBindSparse() passing external
>>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
>>> take a long time to get your synchronous VM_BIND executed...

So this would boil down to either (possibly opt-in) keeping the spinlock 
approach or pushing the unlink out to a wq then?
BTW, as also asked in a reply to Danilo, how do you call unlink from 
run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?

>>>   
>> btw what is the use case for this? do we have actual vulkan
>> applications we know will have problems here?
> I don't, but I think that's a concern Faith raised at some point (dates
> back from when I was reading threads describing how VM_BIND on i915
> should work, and I was clearly discovering this whole VM_BIND thing at
> that time, so maybe I misunderstood).
>
>> it feels like a bit of premature optimisation, but maybe we have use cases.
> Might be, but that's the sort of thing that would put us in a corner if
> we don't have a plan for when the needs arise. Besides, if we don't
> want to support that case because it's too complicated, I'd recommend
> dropping all the drm_gpuvm APIs that let people think this mode is
> valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
> drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
> confusion.

Xe allows bypassing the bind-queue with another bind-queue, but to
completely avoid dependencies between queues the operations may not
overlap (the current definition of overlap being that page-table
structure updates may not overlap); no guarantees are made about
priority.

/Thomas




^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 10:39           ` Thomas Hellström
@ 2023-09-13 11:33             ` Boris Brezillon
  2023-09-13 12:01               ` Danilo Krummrich
  2023-09-13 13:22               ` Thomas Hellström
  0 siblings, 2 replies; 77+ messages in thread
From: Boris Brezillon @ 2023-09-13 11:33 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: Dave Airlie, Danilo Krummrich, daniel, matthew.brost,
	sarah.walker, donald.robson, christian.koenig, faith.ekstrand,
	dri-devel, nouveau, linux-kernel

On Wed, 13 Sep 2023 12:39:01 +0200
Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:

> Hi,
> 
> On 9/13/23 09:19, Boris Brezillon wrote:
> > On Wed, 13 Sep 2023 17:05:42 +1000
> > Dave Airlie <airlied@gmail.com> wrote:
> >  
> >> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
> >> <boris.brezillon@collabora.com> wrote:  
> >>> On Tue, 12 Sep 2023 18:20:32 +0200
> >>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >>>     
> >>>>> +/**
> >>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
> >>>>> + * @__gpuvm: The GPU VM
> >>>>> + * @__list_name: The name of the list we're iterating on
> >>>>> + * @__local_list: A pointer to the local list used to store already iterated items
> >>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> >>>>> + *
> >>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
> >>>>> + * iterator releases the lock immediately after picking the first element from
> >>>>> + * the list, so list insertion deletion can happen concurrently.  
> >>>> Are the list spinlocks needed for that async state update from within
> >>>> the dma-fence critical section we've discussed previously?  
> >>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
> >>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
> >>> get that Xe and Nouveau don't need that because they update the VM
> >>> state early (in the ioctl path), but I keep thinking this will hurt us
> >>> if we don't think it through from the beginning, because once you've
> >>> set this logic to depend only on resv locks, it will be pretty hard to
> >>> get back to a solution which lets synchronous VM_BINDs take precedence
> >>> on asynchronous request, and, with vkQueueBindSparse() passing external
> >>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
> >>> take a long time to get your synchronous VM_BIND executed...  
> 
> So this would boil down to either (possibly opt-in) keeping the spinlock 
> approach or pushing the unlink out to a wq then?

Deferred _unlink() would not be an issue, since I already defer the
drm_gpuva destruction to a wq, it would just be a matter of moving the
_unlink() call there as well. But _link() also takes the GEM gpuva list
lock, and that one is a bit tricky, in that sm_map() can trigger 2 more
_link() calls for the prev/next mappings, which we can't guess until we
get to execute the VM update. If we mandate the use of the GEM resv
lock, that simply means async VM updates (AKA calling
drm_gpuvm_sm_[un]map()) are not an option. And if this is what everyone
agrees on, then I'd like the APIs that make this sort of async VM
update possible (drm_gpuvm_sm_[un]map(), the drm_gpuvm_ops::sm_step*
methods, and probably other things) to be dropped, so we don't make it
look like it's something we support.
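
For what it's worth, a rough sketch of what such a wq-deferred unlink could
look like (all my_* names below are hypothetical driver-side structures, not
part of this series; unlink_work is assumed to be INIT_WORK()ed at VM
creation time):

struct my_vm {
	struct drm_gpuvm gpuvm;
	struct llist_head unlink_list;	/* vmas to unlink outside the fence path */
	struct work_struct unlink_work;
};

struct my_vma {
	struct drm_gpuva va;
	struct llist_node unlink_node;
};

/* Fence signalling path (e.g. run_job()): no unlink here, just hand the
 * vma over to the workqueue. */
static void my_vma_defer_unlink(struct my_vm *vm, struct my_vma *vma)
{
	llist_add(&vma->unlink_node, &vm->unlink_list);
	schedule_work(&vm->unlink_work);
}

/* Workqueue context: safe to take the GEM gpuva lock (dma-resv by default)
 * and to drop what might be the last vm_bo / GEM reference. */
static void my_vm_unlink_work(struct work_struct *work)
{
	struct my_vm *vm = container_of(work, struct my_vm, unlink_work);
	struct llist_node *node = llist_del_all(&vm->unlink_list);
	struct my_vma *vma, *next;

	llist_for_each_entry_safe(vma, next, node, unlink_node) {
		struct drm_gem_object *obj = vma->va.gem.obj;

		/* Hold an extra GEM reference so unlink can't free the resv
		 * we're holding. */
		drm_gem_object_get(obj);
		dma_resv_lock(obj->resv, NULL);
		drm_gpuva_unlink(&vma->va);
		dma_resv_unlock(obj->resv);
		drm_gem_object_put(obj);

		/* Assumes the va was already drm_gpuva_remove()d from the VA
		 * space as part of the VM update. */
		kfree(vma);
	}
}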

> BTW, as also asked in a reply to Danilo, how do you call unlink from 
> run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?

_unlink() makes sure the GEM gpuva list lock is taken, but this can be
a custom lock (see drm_gem_gpuva_set_lock()). In panthor we have
panthor_gem_object::gpuva_list_lock that's dedicated to the gpuva list
protection. We make sure we never take this lock while allocating
memory to guarantee the dma-signalling path can't deadlock.
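
As a side note, the opt-in itself is just a one-liner at GEM creation time;
a minimal sketch (my_* names hypothetical):

struct my_gem_object {
	struct drm_gem_object base;
	struct mutex gpuva_list_lock;
};

static int my_gem_object_init(struct drm_device *dev,
			      struct my_gem_object *obj, size_t size)
{
	int ret;

	ret = drm_gem_object_init(dev, &obj->base, size);
	if (ret)
		return ret;

	mutex_init(&obj->gpuva_list_lock);

	/* Declare that this lock, rather than the dma-resv, protects the
	 * GEM's gpuva list. It must never be held while allocating memory,
	 * so it stays usable from the dma-signalling path. */
	drm_gem_gpuva_set_lock(&obj->base, &obj->gpuva_list_lock);

	return 0;
}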

> 
> >>>     
> >> btw what is the use case for this? do we have actual vulkan
> >> applications we know will have problems here?  
> > I don't, but I think that's a concern Faith raised at some point (dates
> > back from when I was reading threads describing how VM_BIND on i915
> > should work, and I was clearly discovering this whole VM_BIND thing at
> > that time, so maybe I misunderstood).
> >  
> >> it feels like a bit of premature optimisation, but maybe we have use cases.  
> > Might be, but that's the sort of thing that would put us in a corner if
> > we don't have a plan for when the needs arise. Besides, if we don't
> > want to support that case because it's too complicated, I'd recommend
> > dropping all the drm_gpuvm APIs that let people think this mode is
> > valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
> > drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
> > confusion.  
> 
> Xe allows bypassing the bind-queue with another bind-queue, but to 
> completely avoid dependencies between queues the Operations may not 
> overlap.

So, you check the VM state with some VM lock held (would be the VM resv
in my case), and if the mapping is new (no overlaps with pre-existing
mappings), you queue it to the fast-track/sync-VM_BIND queue. What would
be missing I guess is a way to know if the mapping is active (MMU has
been updated) or pending (MMU update queued to the bind-queue), so I can
fast-track mapping/unmapping of active mappings. This would leave
overlapping sync/async VM updates, which can't happen in practice
unless userspace is doing something wrong (sparse bindings always go
through vkQueueBindSparse).

I'll give it a try.
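
FWIW, the no-overlap case itself would be a pretty small check; a rough
sketch of that part of the dispatch decision (driver-side naming
hypothetical, relying on the existing drm_gpuva_find_first() range lookup;
the active-vs-pending distinction is not covered here):

/* Called with the VM lock held. Returns true if the requested range doesn't
 * overlap any existing mapping, i.e. the bind can be executed synchronously
 * instead of being queued to the bind-queue. */
static bool my_vm_bind_can_fast_track(struct drm_gpuvm *gpuvm,
				      u64 addr, u64 range)
{
	return !drm_gpuva_find_first(gpuvm, addr, range);
}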

> (And the definition of overlap is currently page-table 
> structure updates may not overlap) but no guarantees are made about 
> priority.
> 
> /Thomas
> 
> 
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 11:33             ` Boris Brezillon
@ 2023-09-13 12:01               ` Danilo Krummrich
  2023-09-13 13:22               ` Thomas Hellström
  1 sibling, 0 replies; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-13 12:01 UTC (permalink / raw)
  To: Boris Brezillon, Thomas Hellström
  Cc: Dave Airlie, daniel, matthew.brost, sarah.walker, donald.robson,
	christian.koenig, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

After some more discussion with Boris on IRC, he seems to be willing to drop GPUVM
updates from the async path. If everyone agrees I'm fine to go ahead and drop this
use case for GPUVM.

@Thomas: I will reply to your last mail only considering GPUVM updates from within
the IOCTL.

- Danilo

On 9/13/23 13:33, Boris Brezillon wrote:
> On Wed, 13 Sep 2023 12:39:01 +0200
> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> 
>> Hi,
>>
>> On 9/13/23 09:19, Boris Brezillon wrote:
>>> On Wed, 13 Sep 2023 17:05:42 +1000
>>> Dave Airlie <airlied@gmail.com> wrote:
>>>   
>>>> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
>>>> <boris.brezillon@collabora.com> wrote:
>>>>> On Tue, 12 Sep 2023 18:20:32 +0200
>>>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>>>>>      
>>>>>>> +/**
>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
>>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
>>>>>>> + *
>>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
>>>>>>> + * iterator releases the lock immediately after picking the first element from
>>>>>>> + * the list, so list insertion deletion can happen concurrently.
>>>>>> Are the list spinlocks needed for that async state update from within
>>>>>> the dma-fence critical section we've discussed previously?
>>>>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
>>>>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
>>>>> get that Xe and Nouveau don't need that because they update the VM
>>>>> state early (in the ioctl path), but I keep thinking this will hurt us
>>>>> if we don't think it through from the beginning, because once you've
>>>>> set this logic to depend only on resv locks, it will be pretty hard to
>>>>> get back to a solution which lets synchronous VM_BINDs take precedence
>>>>> on asynchronous request, and, with vkQueueBindSparse() passing external
>>>>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
>>>>> take a long time to get your synchronous VM_BIND executed...
>>
>> So this would boil down to either (possibly opt-in) keeping the spinlock
>> approach or pushing the unlink out to a wq then?
> 
> Deferred _unlink() would not be an issue, since I already defer the
> drm_gpuva destruction to a wq, it would just a be a matter of moving the
> _unlink() call there as well. But _link() also takes the GEM gpuva list
> lock, and that one is bit tricky, in that sm_map() can trigger 2 more
> _link() calls for the prev/next mappings, which we can't guess until we
> get to execute the VM update. If we mandate the use of the GEM resv
> lock, that simply means async VM updates (AKA calling
> drm_gpuvm_sm_[un]map()) are not an option. And if this is what everyone
> agrees on, then I'd like the APIs that make this sort of async VM
> update possible (drm_gpuvm_sm_[un]map(), the drm_gpuvm_ops::sm_step*
> methods, and probably other things) to be dropped, so we don't make it
> look like it's something we support.
> 
>> BTW, as also asked in a reply to Danilo, how do you call unlink from
>> run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?
> 
> _unlink() makes sure the GEM gpuva list lock is taken, but this can be
> a custom lock (see drm_gem_gpuva_set_lock()). In panthor we have
> panthor_gem_object::gpuva_list_lock that's dedicated the gpuva list
> protection. We make sure we never take this lock while allocating
> memory to guarantee the dma-signalling path can't deadlock.
> 
>>
>>>>>      
>>>> btw what is the use case for this? do we have actual vulkan
>>>> applications we know will have problems here?
>>> I don't, but I think that's a concern Faith raised at some point (dates
>>> back from when I was reading threads describing how VM_BIND on i915
>>> should work, and I was clearly discovering this whole VM_BIND thing at
>>> that time, so maybe I misunderstood).
>>>   
>>>> it feels like a bit of premature optimisation, but maybe we have use cases.
>>> Might be, but that's the sort of thing that would put us in a corner if
>>> we don't have a plan for when the needs arise. Besides, if we don't
>>> want to support that case because it's too complicated, I'd recommend
>>> dropping all the drm_gpuvm APIs that let people think this mode is
>>> valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
>>> drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
>>> confusion.
>>
>> Xe allows bypassing the bind-queue with another bind-queue, but to
>> completely avoid dependencies between queues the Operations may not
>> overlap.
> 
> So, you check the VM state with some VM lock held (would be the VM resv
> in my case), and if the mapping is new (no overlaps with pre-existing
> mappings), you queue it to the fast-track/sync-VM_BIND queue. What would
> be missing I guess is a way to know if the mapping is active (MMU has
> been updated) or pending (MMU update queued to the bind-queue), so I can
> fast-track mapping/unmapping of active mappings. This would leave
> overlapping sync/async VM updates, which can't happen in practice
> unless userspace is doing something wrong (sparse bindings always go
> through vkQueueBindSparse).
> 
> I'll give it a try.
> 
>> (And the definition of overlap is currently page-table
>> structure updates may not overlap) but no guarantees are made about
>> priority.
>>
>> /Thomas
>>
>>
>>
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13  9:14           ` Thomas Hellström
@ 2023-09-13 12:16             ` Danilo Krummrich
  2023-09-13 14:26               ` Christian König
  2023-09-14 10:57               ` [Nouveau] " Danilo Krummrich
  0 siblings, 2 replies; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-13 12:16 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, christian.koenig, faith.ekstrand, dri-devel,
	nouveau, linux-kernel

As mentioned in a different mail thread, the reply is based on the assumption
that we don't support anything other than GPUVM updates from the IOCTL.

On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström wrote:
> Hi!
> 
> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
> > On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström wrote:
> > > 
> > > On 9/12/23 18:50, Danilo Krummrich wrote:
> > > > On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
> > > > > Hi, Danilo,
> > > > > 
> > > > > On 9/9/23 17:31, Danilo Krummrich wrote:
> > > > > > So far the DRM GPUVA manager offers common infrastructure to
> > > > > > track GPU VA
> > > > > > allocations and mappings, generically connect GPU VA mappings
> > > > > > to their
> > > > > > backing buffers and perform more complex mapping operations
> > > > > > on the GPU VA
> > > > > > space.
> > > > > > 
> > > > > > However, there are more design patterns commonly used by
> > > > > > drivers, which
> > > > > > can potentially be generalized in order to make the DRM GPUVA
> > > > > > manager
> > > > > > represent a basic GPU-VM implementation. In this context,
> > > > > > this patch aims
> > > > > > at generalizing the following elements.
> > > > > > 
> > > > > > 1) Provide a common dma-resv for GEM objects not being used
> > > > > > outside of
> > > > > >      this GPU-VM.
> > > > > > 
> > > > > > 2) Provide tracking of external GEM objects (GEM objects
> > > > > > which are
> > > > > >      shared with other GPU-VMs).
> > > > > > 
> > > > > > 3) Provide functions to efficiently lock all GEM objects dma-
> > > > > > resv the
> > > > > >      GPU-VM contains mappings of.
> > > > > > 
> > > > > > 4) Provide tracking of evicted GEM objects the GPU-VM
> > > > > > contains mappings
> > > > > >      of, such that validation of evicted GEM objects is
> > > > > > accelerated.
> > > > > > 
> > > > > > 5) Provide some convinience functions for common patterns.
> > > > > > 
> > > > > > Rather than being designed as a "framework", the target is to
> > > > > > make all
> > > > > > features appear as a collection of optional helper functions,
> > > > > > such that
> > > > > > drivers are free to make use of the DRM GPUVA managers basic
> > > > > > functionality and opt-in for other features without setting
> > > > > > any feature
> > > > > > flags, just by making use of the corresponding functions.
> > > > > > 
> > > > > > Big kudos to Boris Brezillon for his help to figure out
> > > > > > locking for drivers
> > > > > > updating the GPU VA space within the fence signalling path.
> > > > > > 
> > > > > > Suggested-by: Matthew Brost <matthew.brost@intel.com>
> > > > > > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > > > > > ---
> > > > > >    drivers/gpu/drm/drm_gpuvm.c | 516
> > > > > > ++++++++++++++++++++++++++++++++++++
> > > > > >    include/drm/drm_gpuvm.h     | 197 ++++++++++++++
> > > > > >    2 files changed, 713 insertions(+)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/drm_gpuvm.c
> > > > > > b/drivers/gpu/drm/drm_gpuvm.c
> > > > > > index f4411047dbb3..8e62a043f719 100644
> > > > > > --- a/drivers/gpu/drm/drm_gpuvm.c
> > > > > > +++ b/drivers/gpu/drm/drm_gpuvm.c
> > > > > > @@ -73,6 +73,21 @@
> > > > > >     * &drm_gem_object list of &drm_gpuvm_bos for an existing
> > > > > > instance of this
> > > > > >     * particular combination. If not existent a new instance
> > > > > > is created and linked
> > > > > >     * to the &drm_gem_object.
> > > > > > + *
> > > > > > + * &drm_gpuvm_bo structures, since unique for a given
> > > > > > &drm_gpuvm, are also used
> > > > > > + * as entry for the &drm_gpuvm's lists of external and
> > > > > > evicted objects. Those
> > > > > > + * list are maintained in order to accelerate locking of
> > > > > > dma-resv locks and
> > > > > > + * validation of evicted objects bound in a &drm_gpuvm. For instance, all
> > > > > > + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be
> > > > > > locked by calling
> > > > > > + * drm_gpuvm_exec_lock(). Once locked drivers can call
> > > > > > drm_gpuvm_validate() in
> > > > > > + * order to validate all evicted &drm_gem_objects. It is
> > > > > > also possible to lock
> > > > > > + * additional &drm_gem_objects by providing the
> > > > > > corresponding parameters to
> > > > > > + * drm_gpuvm_exec_lock() as well as open code the &drm_exec
> > > > > > loop while making
> > > > > > + * use of helper functions such as drm_gpuvm_prepare_range()
> > > > > > or
> > > > > > + * drm_gpuvm_prepare_objects().
> > > > > > + *
> > > > > > + * Every bound &drm_gem_object is treated as external object
> > > > > > when its &dma_resv
> > > > > > + * structure is different than the &drm_gpuvm's common
> > > > > > &dma_resv structure.
> > > > > >     */
> > > > > >    /**
> > > > > > @@ -420,6 +435,20 @@
> > > > > >     * Subsequent calls to drm_gpuvm_bo_obtain() for the same
> > > > > > &drm_gpuvm and
> > > > > >     * &drm_gem_object must be able to observe previous
> > > > > > creations and destructions
> > > > > >     * of &drm_gpuvm_bos in order to keep instances unique.
> > > > > > + *
> > > > > > + * The &drm_gpuvm's lists for keeping track of external and
> > > > > > evicted objects are
> > > > > > + * protected against concurrent insertion / removal and
> > > > > > iteration internally.
> > > > > > + *
> > > > > > + * However, drivers still need to make sure to protect concurrent
> > > > > > calls to functions
> > > > > > + * iterating those lists, such as drm_gpuvm_validate() and
> > > > > > + * drm_gpuvm_prepare_objects(). Every such function contains
> > > > > > a particular
> > > > > > + * comment and lockdep checks if possible.
> > > > > > + *
> > > > > > + * Functions adding or removing entries from those lists,
> > > > > > such as
> > > > > > + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be
> > > > > > called with external
> > > > > > + * locks being held, e.g. in order to avoid the
> > > > > > corresponding list to be
> > > > > > + * (safely) modified while potentially being iterated by
> > > > > > other API functions.
> > > > > > + * However, this is entirely optional.
> > > > > >     */
> > > > > >    /**
> > > > > > @@ -632,6 +661,131 @@
> > > > > >     *   }
> > > > > >     */
> > > > > > +/**
> > > > > > + * get_next_vm_bo_from_list() - get the next vm_bo element
> > > > > > + * @__gpuvm: The GPU VM
> > > > > > + * @__list_name: The name of the list we're iterating on
> > > > > > + * @__local_list: A pointer to the local list used to store
> > > > > > already iterated items
> > > > > > + * @__prev_vm_bo: The previous element we got from
> > > > > > drm_gpuvm_get_next_cached_vm_bo()
> > > > > > + *
> > > > > > + * This helper is here to provide lockless list iteration.
> > > > > > Lockless as in, the
> > > > > > + * iterator releases the lock immediately after picking the
> > > > > > first element from
> > > > > > + * the list, so list insertion deletion can happen
> > > > > > concurrently.
> > > > > Are the list spinlocks needed for that async state update from
> > > > > within the
> > > > > dma-fence critical section we've discussed previously?
> > > > Yes, but also for other reasons, see below.
> > > > 
> > > > > Otherwise it should be sufficient to protect the lists with the
> > > > > gpuvm's resv
> > > > > (or for the extobj list with an outer lock).
> > > > > 
> > > > > If those spinlocks are still needed in some situations, perhaps
> > > > > could we
> > > > > have an option to set them to NULL (Like IIRC the maple tree
> > > > > allows for)?
> > > > The evict spinlock is needed in any case, since in
> > > > drm_gpuvm_bo_evict() we're
> > > > holding only the dma-resv lock from the BO this function gets
> > > > called for. Hence,
> > > > the spinlock protects concurrent drm_gpuvm_bo_evict() calls with
> > > > different BOs.
> > > No. Only if you try to add external objects to the vm's evict list
> > > from
> > > within the evict code. That's not necessary since you loop through
> > > all
> > > external objects anyway when locking them so an "evicted" bool in
> > > the vm_bo,
> > > protected by the bo resv would be sufficient. The extobj locking
> > > loop can
> > > then add the bo to the evicted list.
> > 
> > And validate() can remove it while still holding all dma-resv locks,
> > neat!
> > However, what if two tasks are trying to lock the VA space
> > concurrently? What
> > do we do when the drm_gpuvm_bo's refcount drops to zero in
> > drm_gpuva_unlink()?
> > Are we guaranteed that at this point of time the drm_gpuvm_bo is not
> > on the
> > evicted list? Because otherwise we would call drm_gpuvm_bo_destroy()
> > with the
> > dma-resv lock held, which wouldn't be allowed, since
> > drm_gpuvm_bo_destroy()
> > might drop the last reference to the drm_gem_object and hence we'd
> > potentially
> > free the dma-resv lock while holding it, at least if it's an external
> > object.
> 
> Easiest way in this scheme is to think of the lists as being protected
> by the vm's resv lock. That means anybody calling unlink() must also
> hold the vm's resv lock. (Which is OK from an UAF point of view, but
> perhaps not from a locking inversion POW from an async list update).

This would mean that on unlink() we'd need to hold the VM's resv lock and the
corresponding GEM's resv lock (in case they're not the same anyways) because the
VM's resv lock would protect the external / evicted object lists and the GEM
objects resv lock protects the GEM's list of drm_gpuvm_bos and the
drm_gpuvm_bo's list of drm_gpuvas.
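
Expressed with drm_exec, the unbind path would then look roughly like the
sketch below (illustration only; my_unbind_unlink() is a hypothetical driver
helper, gpuvm->d_obj is the dummy GEM object backing the VM's resv as used
by drm_gpuvm_prepare_vm()):

static int my_unbind_unlink(struct drm_gpuvm *gpuvm, struct drm_gpuva *va)
{
	struct drm_exec exec;
	int ret = 0;

	drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
			     DRM_EXEC_IGNORE_DUPLICATES);
	drm_exec_until_all_locked(&exec) {
		/* The VM's common resv, protecting the extobj / evict lists. */
		ret = drm_exec_lock_obj(&exec, &gpuvm->d_obj);
		drm_exec_retry_on_contention(&exec);
		if (ret)
			break;

		/* The GEM's resv, protecting its list of drm_gpuvm_bos and the
		 * drm_gpuvm_bo's list of drm_gpuvas. For VM-private BOs this is
		 * the same lock again, hence IGNORE_DUPLICATES. */
		ret = drm_exec_lock_obj(&exec, va->gem.obj);
		drm_exec_retry_on_contention(&exec);
		if (ret)
			break;
	}

	if (!ret)
		drm_gpuva_unlink(va);

	drm_exec_fini(&exec);
	return ret;
}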

> 
> > 
> > > > 
> > > > For extobjs an outer lock would be enough in case of Xe, but I
> > > > really would not
> > > > like to add even more complexity just to get the spinlock out of
> > > > the way in case
> > > > the driver already has an outer lock protecting this path.
> > > 
> > > I must disagree here. These spinlocks and atomic operations are
> > > pretty
> > > costly and as discussed earlier this type of locking was the reason
> > > (at
> > > least according to the commit message) that made Christian drop the
> > > XArray
> > > use in drm_exec for the same set of objects: "The locking overhead
> > > is
> > > unecessary and measurable". IMHO the spinlock is the added
> > > complexity and a
> > > single wide lock following the drm locking guidelines set out by
> > > Daniel and
> > > David should really be the default choice with an opt-in for a
> > > spinlock if
> > > needed for async and pushing out to a wq is not an option.
> > 
> > For the external object list an outer lock would work as long as it's
> > not the
> > dma-resv lock of the corresponding GEM object, since here we actually
> > need to
> > remove the list entry from the external object list on
> > drm_gpuvm_bo_destroy().
> > It's just a bit weird design wise that drivers would need to take
> > this outer
> > lock on:
> > 
> > - drm_gpuvm_bo_extobj_add()
> > - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
> > - drm_gpuva_unlink()            (because it needs to call
> > drm_gpuvm_bo_put())
> > - drm_gpuvm_exec_lock()
> > - drm_gpuvm_exec_lock_array()
> > - drm_gpuvm_prepare_range()
> > 
> > Given that it seems reasonable to do all the required locking
> > internally.
> 
> From a design POW, there has been a clear direction in XE to make
> things similar to mmap() / munmap(), so this outer lock, which in Xe is
> an rwsem, is used in a similar way as the mmap_lock. It's protecting
> the page-table structures and vma rb tree, the userptr structures and
> the extobj list. Basically it's taken early in the exec IOCTL, the
> VM_BIND ioctl, the compute rebind worker and the pagefault handler, so
> all of the above are just asserting that it is taken in the correct
> mode.
> 
> But strictly with this scheme one could also use the vm's dma_resv for
> the extobj list since with drm_exec, it's locked before traversing the
> list.
> 
> The whole point of this scheme is to rely on locks that you already are
> supposed to be holding for various reasons and is simple to comprehend.

I don't agree that we're supposed to hold the VM's resv lock anyways for
functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm fine using it
for that purpose nevertheless.

> 
> > 
> > In order to at least place lockdep checks, the driver would need to
> > supply the
> > corresponding lock's lockdep_map, because the GPUVM otherwise doesn't
> > know about
> > the lock.
> 
> Yes, that sounds reasonable. One lockdep map per list.

I'd really like to avoid that, especially now that everything got simpler. We
should define the actual locks to take instead.

> 
> > 
> > Out of curiosity, what is the overhead of a spin_lock() that doesn't
> > need to
> > spin? 
> 
> I guess it's hard to tell exactly, but it is much lower on modern x86
> than what it used to be. Not sure about ARM, which is the other
> architecture important to us. I figure if there is little cache-line
> bouncing the main overhead comes from the implied barriers.
> 
> > 
> > > 
> > > A pretty simple way that would not add much code would be
> > > 
> > > static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
> > >                                  spinlock_t *lock)
> > > {
> > >     if (!gpuvm->resv_protected_lists)
> > >         spin_lock(lock);
> > > }
> > > 
> > > > > For such drivers, that would require anybody calling unlink to
> > > > > hold the vm's
> > > > > resv, though.
> > > > In V4 I want to go back to having a dedicated lock for the GEMs
> > > > gpuva list (or
> > > > VM_BO list to be more precise). We can't just use the dma-resv
> > > > lock for that
> > > > with VM_BO abstractions, because on destruction of a VM_BO we
> > > > otherwise wouldn't
> > > > be allowed to already hold the dma-resv lock. That's the fix I
> > > > was referring to
> > > > earlier.
> > > 
> > > Yeah, I can see the need for a dedicated lock for the GEM's gpuva
> > > list, but
> > > holding the vm's dma-resv lock across the unlink shouldn't be a
> > > problem. We
> > > may free the object and a pointer to the vm's resv during unlink
> > > but we
> > > don't free the vm's resv.  It'd be a matter of ensuring that any
> > > calls to
> > > unlink from *within* drm_gpuvm allows it to be held.
> > 
> > Drivers calling unlink() from the fence signaling path can't use the
> > VM's
> > dma-resv lock.
> 
> Yes, that made me a bit curious because in the current version the code
> required the object's dma_resv for unlink() which can't be grabbed
> either from the fence signaling path. So are there any drivers actually
> wanting to do that? If so, they will either need to resort to the
> current spinlock solution or they will need to call unlink from a
> workqueue item.

As Boris already mentioned we have the dma-resv lock by default or a driver
specific GEM gpuva lock as opt-in. Now, we can get rid of the latter.

> > 
> > Also, what if the object is an external object? We can't use the VM's
> > dma-resv
> > lock here.
> 
> Why? Typically (sync) unlink is only ever called from an unbind-like
> operation where it should be trivial to grab the vm's resv. Or, for
> that matter any outer lock protecting the extobj list. Rule would be
> the drm_gpuvm_bo::entry::extobj  and drm_gpuvm_bo::entry::evict would
> be protected by either the vm's dma_resv (or possibly an outer lock in
> the case of the extobj list).

Outer lock wouldn't have been working for updates in the async path, but
shouldn't be relevant anymore. We could use the VM's resv for that.

> 
> >  And we can't have the GEM objs dma-resv lock held when calling
> > unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
> > refcount drops
> > to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might
> > drop the
> > last reference of the GEM object.
> 
> Yes, but this is a different problem as to what exactly protects
> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per bo list
> lock, or if we want to keep the bo's dma_resv we need to ensure that
> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount (I know
> Boris didn't like that, but requiring an explicit refcount for a
> pointer you dereference unless you're under a lock that ensures keeping
> the object alive is pretty much required?) But anyway for the
> drm_gpuvm_bo::entry::gem list protection (bo resv or internal spinlock)
> I don't have a strong preference.

We can keep the GEM objects dma-resv lock, however as mentioned above
drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both the VM's resv lock
and the GEM's resv lock in case they differ.

> 
> >  All those problems go away with a dedicated
> > GEM gpuva list lock.
> 
> I don't think these are real problems.
> With the exception of the eviction list "trick" where we currently have
> slightly different approach to collect external bos needing rebinding,
> we have this working fine.
> 
> TBH I think pretty much the only situation where the spinlock is needed
> is for async updates of these lists, unless a wq item can be used for
> that, but it doesn't really seem like the current code allows for such
> updates anyway? It complicates the code a lot, adds overhead and also
> adds the requirement for refcounting during list traversal.
> 
> /Thomas
> 
> > 
> > > 
> > > /Thomas
> > > 
> > > 
> > > > > It seems that with that also the refcount could be make non-
> > > > > atomic.
> > > > > 
> > > > > All in the spirit of the drm locking guidelines "use big locks
> > > > > when
> > > > > possible".
> > > > > Lower level locks only when necessary for performance or
> > > > > locking inversion?
> > > > > 
> > > > > /Thomas
> > > > > 
> > > > > 
> > > > > > + *
> > > > > > + * Elements popped from the original list are kept in a
> > > > > > local list, so removal
> > > > > > + * and is_empty checks can still happen while we're
> > > > > > iterating the list.
> > > > > > + */
> > > > > > +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
> > > > > > +	({										\
> > > > > > +		struct drm_gpuvm_bo *__vm_bo;						\
> > > > > > +											\
> > > > > > +		drm_gpuvm_bo_put(__prev_vm_bo);						\
> > > > > > +											\
> > > > > > +		spin_lock(&(__gpuvm)->__list_name.lock);				\
> > > > > > +		while (!list_empty(&(__gpuvm)->__list_name.list)) {			\
> > > > > > +			__vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,	\
> > > > > > +						   struct drm_gpuvm_bo,			\
> > > > > > +						   list.entry.__list_name);		\
> > > > > > +			if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {			\
> > > > > > +				list_move_tail(&(__vm_bo)->list.entry.__list_name,	\
> > > > > > +					       __local_list);				\
> > > > > > +				break;							\
> > > > > > +			} else {							\
> > > > > > +				list_del_init(&(__vm_bo)->list.entry.__list_name);	\
> > > > > > +				__vm_bo = NULL;						\
> > > > > > +			}								\
> > > > > > +		}									\
> > > > > > +		spin_unlock(&(__gpuvm)->__list_name.lock);				\
> > > > > > +											\
> > > > > > +		__vm_bo;								\
> > > > > > +	})
> > > > > > +
> > > > > > +/**
> > > > > > + * for_each_vm_bo_in_list() - internal vm_bo list iterator
> > > > > > + *
> > > > > > + * This helper is here to provide lockless list iteration.
> > > > > > Lockless as in, the
> > > > > > + * iterator releases the lock immediately after picking the
> > > > > > first element from the
> > > > > > + * list, so list insertion and deletion can happen
> > > > > > concurrently.
> > > > > > + *
> > > > > > + * Typical use:
> > > > > > + *
> > > > > > + *     struct drm_gpuvm_bo *vm_bo;
> > > > > > + *     LIST_HEAD(my_local_list);
> > > > > > + *
> > > > > > + *     ret = 0;
> > > > > > + *     for_each_vm_bo_in_list(gpuvm, <list_name>, &my_local_list, vm_bo) {
> > > > > > + *             ret = do_something_with_vm_bo(..., vm_bo);
> > > > > > + *             if (ret)
> > > > > > + *                     break;
> > > > > > + *     }
> > > > > > + *     drm_gpuvm_bo_put(vm_bo);
> > > > > > + *     restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);
> > > > > > + *
> > > > > > + *
> > > > > > + * Only used for internal list iterations, not meant to be
> > > > > > exposed to the outside
> > > > > > + * world.
> > > > > > + */
> > > > > > +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)	\
> > > > > > +	for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> > > > > > +						__local_list, NULL);		\
> > > > > > +	     __vm_bo;								\
> > > > > > +	     __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,		\
> > > > > > +						__local_list, __vm_bo))
> > > > > > +
> > > > > > +/**
> > > > > > + * restore_vm_bo_list() - move vm_bo elements back to their
> > > > > > original list
> > > > > > + * @__gpuvm: The GPU VM
> > > > > > + * @__list_name: The name of the list we're iterating on
> > > > > > + * @__local_list: A pointer to the local list used to store
> > > > > > already iterated items
> > > > > > + *
> > > > > > + * When we're done iterating a vm_bo list, we should call
> > > > > > restore_vm_bo_list()
> > > > > > + * to restore the original state and let new iterations take
> > > > > > place.
> > > > > > + */
> > > > > > +#define restore_vm_bo_list(__gpuvm, __list_name,
> > > > > > __local_list)                         \
> > > > > > +       do
> > > > > > {                                                            
> > > > > >                 \
> > > > > > +               /* Merge back the two lists, moving local
> > > > > > list elements to the          \
> > > > > > +                * head to preserve previous ordering, in
> > > > > > case it matters.              \
> > > > > > +               
> > > > > > */                                                           
> > > > > >           \
> > > > > > +               spin_lock(&(__gpuvm)-
> > > > > > >__list_name.lock);                                \
> > > > > > +               list_splice(__local_list, &(__gpuvm)-
> > > > > > >__list_name.list);                \
> > > > > > +               spin_unlock(&(__gpuvm)-
> > > > > > >__list_name.lock);                              \
> > > > > > +       } while (0)
> > > > > > +/**
> > > > > > + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given
> > > > > > list
> > > > > > + * @__vm_bo: the &drm_gpuvm_bo
> > > > > > + * @__list_name: the name of the list to insert into
> > > > > > + *
> > > > > > + * Inserts the given @__vm_bo into the list specified by
> > > > > > @__list_name and
> > > > > > + * increases the vm_bo's reference count.
> > > > > > + */
> > > > > > +#define drm_gpuvm_bo_list_add(__vm_bo,
> > > > > > __list_name)                            \
> > > > > > +       do
> > > > > > {                                                            
> > > > > >         \
> > > > > > +               spin_lock(&(__vm_bo)->vm-
> > > > > > >__list_name.lock);                    \
> > > > > > +               if (list_empty(&(__vm_bo)-
> > > > > > >list.entry.__list_name))             \
> > > > > > +                       list_add_tail(&(__vm_bo)-
> > > > > > >list.entry.__list_name,       \
> > > > > > +                                     &(__vm_bo)->vm-
> > > > > > >__list_name.list);        \
> > > > > > +               spin_unlock(&(__vm_bo)->vm-
> > > > > > >__list_name.lock);                  \
> > > > > > +       } while (0)
> > > > > > +
> > > > > > +/**
> > > > > > + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given
> > > > > > list
> > > > > > + * @__vm_bo: the &drm_gpuvm_bo
> > > > > > + * @__list_name: the name of the list to insert into
> > > > > > + *
> > > > > > + * Removes the given @__vm_bo from the list specified by
> > > > > > @__list_name and
> > > > > > + * decreases the vm_bo's reference count.
> > > > > > + */
> > > > > > +#define drm_gpuvm_bo_list_del(__vm_bo,
> > > > > > __list_name)                            \
> > > > > > +       do
> > > > > > {                                                            
> > > > > >         \
> > > > > > +               spin_lock(&(__vm_bo)->vm-
> > > > > > >__list_name.lock);                    \
> > > > > > +               if (!list_empty(&(__vm_bo)-
> > > > > > >list.entry.__list_name))            \
> > > > > > +                       list_del_init(&(__vm_bo)-
> > > > > > >list.entry.__list_name);      \
> > > > > > +               spin_unlock(&(__vm_bo)->vm-
> > > > > > >__list_name.lock);                  \
> > > > > > +       } while (0)
> > > > > > +
> > > > > > +static int __must_check
> > > > > > +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
> > > > > > +
> > > > > >    #define to_drm_gpuva(__node) container_of((__node), struct
> > > > > > drm_gpuva, rb.node)
> > > > > >    #define GPUVA_START(node) ((node)->va.addr)
> > > > > > @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
> > > > > > struct drm_device *drm,
> > > > > >         gpuvm->rb.tree = RB_ROOT_CACHED;
> > > > > >         INIT_LIST_HEAD(&gpuvm->rb.list);
> > > > > > +       INIT_LIST_HEAD(&gpuvm->extobj.list);
> > > > > > +       spin_lock_init(&gpuvm->extobj.lock);
> > > > > > +
> > > > > > +       INIT_LIST_HEAD(&gpuvm->evict.list);
> > > > > > +       spin_lock_init(&gpuvm->evict.lock);
> > > > > > +
> > > > > >         drm_gpuva_check_overflow(start_offset, range);
> > > > > >         gpuvm->mm_start = start_offset;
> > > > > >         gpuvm->mm_range = range;
> > > > > > @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
> > > > > > *gpuvm)
> > > > > >         WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
> > > > > >              "GPUVA tree is not empty, potentially leaking
> > > > > > memory.\n");
> > > > > > +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
> > > > > > should be empty.\n");
> > > > > > +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
> > > > > > should be empty.\n");
> > > > > > +
> > > > > >         drm_gem_private_object_fini(&gpuvm->d_obj);
> > > > > >    }
> > > > > >    EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
> > > > > > +/**
> > > > > > + * drm_gpuvm_prepare_objects() - prepare all assoiciated BOs
> > > > > > + * @gpuvm: the &drm_gpuvm
> > > > > > + * @exec: the &drm_exec locking context
> > > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > > + *
> > > > > > + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the
> > > > > > given
> > > > > > + * &drm_gpuvm contains mappings of.
> > > > > > + *
> > > > > > + * Using this function directly, it is the drivers
> > > > > > responsibility to call
> > > > > > + * drm_exec_init() and drm_exec_fini() accordingly.
> > > > > > + *
> > > > > > + * Note: This function is safe against concurrent insertion
> > > > > > and removal of
> > > > > > + * external objects, however it is not safe against
> > > > > > concurrent usage itself.
> > > > > > + *
> > > > > > + * Drivers need to make sure to protect this case with
> > > > > > either an outer VM lock
> > > > > > + * or by calling drm_gpuvm_prepare_vm() before this function
> > > > > > within the
> > > > > > + * drm_exec_until_all_locked() loop, such that the GPUVM's
> > > > > > dma-resv lock ensures
> > > > > > + * mutual exclusion.
> > > > > > + *
> > > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > > + */
> > > > > > +int
> > > > > > +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> > > > > > +                         struct drm_exec *exec,
> > > > > > +                         unsigned int num_fences)
> > > > > > +{
> > > > > > +       struct drm_gpuvm_bo *vm_bo;
> > > > > > +       LIST_HEAD(extobjs);
> > > > > > +       int ret = 0;
> > > > > > +
> > > > > > +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
> > > > > > vm_bo) {
> > > > > > +               ret = drm_exec_prepare_obj(exec, vm_bo->obj,
> > > > > > num_fences);
> > > > > > +               if (ret)
> > > > > > +                       break;
> > > > > > +       }
> > > > > > +       /* Drop ref in case we break out of the loop. */
> > > > > > +       drm_gpuvm_bo_put(vm_bo);
> > > > > > +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
> > > > > > +
> > > > > > +       return ret;
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
> > > > > > +
> > > > > > +/**
> > > > > > + * drm_gpuvm_prepare_range() - prepare all BOs mapped within
> > > > > > a given range
> > > > > > + * @gpuvm: the &drm_gpuvm
> > > > > > + * @exec: the &drm_exec locking context
> > > > > > + * @addr: the start address within the VA space
> > > > > > + * @range: the range to iterate within the VA space
> > > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > > + *
> > > > > > + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
> > > > > > mapped between @addr
> > > > > > + * and @addr + @range.
> > > > > > + *
> > > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > > + */
> > > > > > +int
> > > > > > +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
> > > > > > drm_exec *exec,
> > > > > > +                       u64 addr, u64 range, unsigned int
> > > > > > num_fences)
> > > > > > +{
> > > > > > +       struct drm_gpuva *va;
> > > > > > +       u64 end = addr + range;
> > > > > > +       int ret;
> > > > > > +
> > > > > > +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
> > > > > > +               struct drm_gem_object *obj = va->gem.obj;
> > > > > > +
> > > > > > +               ret = drm_exec_prepare_obj(exec, obj,
> > > > > > num_fences);
> > > > > > +               if (ret)
> > > > > > +                       return ret;
> > > > > > +       }
> > > > > > +
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
> > > > > > +
> > > > > > +/**
> > > > > > + * drm_gpuvm_exec_lock() - lock all dma-resv of all
> > > > > > assoiciated BOs
> > > > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > > + * @interruptible: sleep interruptible if waiting
> > > > > > + *
> > > > > > + * Acquires all dma-resv locks of all &drm_gem_objects the
> > > > > > given
> > > > > > + * &drm_gpuvm contains mappings of.
> > > > > > + *
> > > > > > + * Addionally, when calling this function with struct
> > > > > > drm_gpuvm_exec::extra
> > > > > > + * being set the driver receives the given @fn callback to
> > > > > > lock additional
> > > > > > + * dma-resv in the context of the &drm_gpuvm_exec instance.
> > > > > > Typically, drivers
> > > > > > + * would call drm_exec_prepare_obj() from within this
> > > > > > callback.
> > > > > > + *
> > > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > > + */
> > > > > > +int
> > > > > > +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> > > > > > +                   unsigned int num_fences,
> > > > > > +                   bool interruptible)
> > > > > > +{
> > > > > > +       struct drm_gpuvm *gpuvm = vm_exec->vm;
> > > > > > +       struct drm_exec *exec = &vm_exec->exec;
> > > > > > +       uint32_t flags;
> > > > > > +       int ret;
> > > > > > +
> > > > > > +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT :
> > > > > > 0 |
> > > > > > +               DRM_EXEC_IGNORE_DUPLICATES;
> > > > > > +
> > > > > > +       drm_exec_init(exec, flags);
> > > > > > +
> > > > > > +       drm_exec_until_all_locked(exec) {
> > > > > > +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
> > > > > > num_fences);
> > > > > > +               drm_exec_retry_on_contention(exec);
> > > > > > +               if (ret)
> > > > > > +                       goto err;
> > > > > > +
> > > > > > +               ret = drm_gpuvm_prepare_objects(gpuvm, exec,
> > > > > > num_fences);
> > > > > > +               drm_exec_retry_on_contention(exec);
> > > > > > +               if (ret)
> > > > > > +                       goto err;
> > > > > > +
> > > > > > +               if (vm_exec->extra.fn) {
> > > > > > +                       ret = vm_exec->extra.fn(vm_exec,
> > > > > > num_fences);
> > > > > > +                       drm_exec_retry_on_contention(exec);
> > > > > > +                       if (ret)
> > > > > > +                               goto err;
> > > > > > +               }
> > > > > > +       }
> > > > > > +
> > > > > > +       return 0;
> > > > > > +
> > > > > > +err:
> > > > > > +       drm_exec_fini(exec);
> > > > > > +       return ret;
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
> > > > > > +
> > > > > > +static int
> > > > > > +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int
> > > > > > num_fences)
> > > > > > +{
> > > > > > +       struct {
> > > > > > +               struct drm_gem_object **objs;
> > > > > > +               unsigned int num_objs;
> > > > > > +       } *args = vm_exec->extra.priv;
> > > > > > +
> > > > > > +       return drm_exec_prepare_array(&vm_exec->exec, args-
> > > > > > >objs,
> > > > > > +                                     args->num_objs,
> > > > > > num_fences);
> > > > > > +}
> > > > > > +
> > > > > > +/**
> > > > > > + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all
> > > > > > assoiciated BOs
> > > > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > > > + * @objs: additional &drm_gem_objects to lock
> > > > > > + * @num_objs: the number of additional &drm_gem_objects to
> > > > > > lock
> > > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > > + * @interruptible: sleep interruptible if waiting
> > > > > > + *
> > > > > > + * Acquires all dma-resv locks of all &drm_gem_objects the
> > > > > > given &drm_gpuvm
> > > > > > + * contains mappings of, plus the ones given through @objs.
> > > > > > + *
> > > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > > + */
> > > > > > +int
> > > > > > +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
> > > > > > +                         struct drm_gem_object **objs,
> > > > > > +                         unsigned int num_objs,
> > > > > > +                         unsigned int num_fences,
> > > > > > +                         bool interruptible)
> > > > > > +{
> > > > > > +       struct {
> > > > > > +               struct drm_gem_object **objs;
> > > > > > +               unsigned int num_objs;
> > > > > > +       } args;
> > > > > > +
> > > > > > +       args.objs = objs;
> > > > > > +       args.num_objs = num_objs;
> > > > > > +
> > > > > > +       vm_exec->extra.fn = fn_lock_array;
> > > > > > +       vm_exec->extra.priv = &args;
> > > > > > +
> > > > > > +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
> > > > > > interruptible);
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
> > > > > > +
> > > > > > +/**
> > > > > > + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
> > > > > > within a given range
> > > > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > > > + * @addr: the start address within the VA space
> > > > > > + * @range: the range to iterate within the VA space
> > > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > > + * @interruptible: sleep interruptible if waiting
> > > > > > + *
> > > > > > + * Acquires all dma-resv locks of all &drm_gem_objects
> > > > > > mapped between @addr and
> > > > > > + * @addr + @range.
> > > > > > + *
> > > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > > + */
> > > > > > +int
> > > > > > +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
> > > > > > +                         u64 addr, u64 range,
> > > > > > +                         unsigned int num_fences,
> > > > > > +                         bool interruptible)
> > > > > > +{
> > > > > > +       struct drm_gpuvm *gpuvm = vm_exec->vm;
> > > > > > +       struct drm_exec *exec = &vm_exec->exec;
> > > > > > +       uint32_t flags;
> > > > > > +       int ret;
> > > > > > +
> > > > > > +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT :
> > > > > > 0 |
> > > > > > +               DRM_EXEC_IGNORE_DUPLICATES;
> > > > > > +
> > > > > > +       drm_exec_init(exec, flags);
> > > > > > +
> > > > > > +       drm_exec_until_all_locked(exec) {
> > > > > > +               ret = drm_gpuvm_prepare_range(gpuvm, exec,
> > > > > > addr, range,
> > > > > > +                                             num_fences);
> > > > > > +               drm_exec_retry_on_contention(exec);
> > > > > > +               if (ret)
> > > > > > +                       goto err;
> > > > > > +       }
> > > > > > +
> > > > > > +       return ret;
> > > > > > +
> > > > > > +err:
> > > > > > +       drm_exec_fini(exec);
> > > > > > +       return ret;
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
> > > > > > +
> > > > > > +/**
> > > > > > + * drm_gpuvm_validate() - validate all BOs marked as evicted
> > > > > > + * @gpuvm: the &drm_gpuvm to validate evicted BOs
> > > > > > + *
> > > > > > + * Calls the &drm_gpuvm_ops.bo_validate callback for all
> > > > > > evicted buffer
> > > > > > + * objects being mapped in the given &drm_gpuvm.
> > > > > > + *
> > > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > > + */
> > > > > > +int
> > > > > > +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
> > > > > > +{
> > > > > > +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
> > > > > > +       struct drm_gpuvm_bo *vm_bo;
> > > > > > +       LIST_HEAD(evict);
> > > > > > +       int ret = 0;
> > > > > > +
> > > > > > +       if (unlikely(!ops || !ops->bo_validate))
> > > > > > +               return -ENOTSUPP;
> > > > > > +
> > > > > > +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
> > > > > > +               dma_resv_assert_held(vm_bo->obj->resv);
> > > > > > +               ret = ops->bo_validate(vm_bo->obj);
> > > > > > +               if (ret)
> > > > > > +                       break;
> > > > > > +       }
> > > > > > +       /* Drop ref in case we break out of the loop. */
> > > > > > +       drm_gpuvm_bo_put(vm_bo);
> > > > > > +       restore_vm_bo_list(gpuvm, evict, &evict);
> > > > > > +
> > > > > > +       return ret;
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
> > > > > > +
> > > > > > +/**
> > > > > > + * drm_gpuvm_resv_add_fence - add fence to private and all
> > > > > > extobj
> > > > > > + * dma-resv
> > > > > > + * @gpuvm: the &drm_gpuvm to add a fence to
> > > > > > + * @exec: the &drm_exec locking context
> > > > > > + * @fence: fence to add
> > > > > > + * @private_usage: private dma-resv usage
> > > > > > + * @extobj_usage: extobj dma-resv usage
> > > > > > + */
> > > > > > +void
> > > > > > +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> > > > > > +                        struct drm_exec *exec,
> > > > > > +                        struct dma_fence *fence,
> > > > > > +                        enum dma_resv_usage private_usage,
> > > > > > +                        enum dma_resv_usage extobj_usage)
> > > > > > +{
> > > > > > +       struct drm_gem_object *obj;
> > > > > > +       unsigned long index;
> > > > > > +
> > > > > > +       drm_exec_for_each_locked_object(exec, index, obj) {
> > > > > > +               dma_resv_assert_held(obj->resv);
> > > > > > +               dma_resv_add_fence(obj->resv, fence,
> > > > > > +                                  drm_gpuvm_is_extobj(gpuvm,
> > > > > > obj) ?
> > > > > > +                                  private_usage :
> > > > > > extobj_usage);
> > > > > > +       }
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
> > > > > > +
> > > > > >    /**
> > > > > >     * drm_gpuvm_bo_create() - create a new instance of struct
> > > > > > drm_gpuvm_bo
> > > > > >     * @gpuvm: The &drm_gpuvm the @obj is mapped in.
> > > > > > @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
> > > > > > *gpuvm,
> > > > > >         INIT_LIST_HEAD(&vm_bo->list.gpuva);
> > > > > >         INIT_LIST_HEAD(&vm_bo->list.entry.gem);
> > > > > > +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
> > > > > > +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
> > > > > > +
> > > > > >         drm_gem_object_get(obj);
> > > > > >         return vm_bo;
> > > > > > @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
> > > > > >         drm_gem_gpuva_assert_lock_held(vm_bo->obj);
> > > > > > +       spin_lock(&gpuvm->extobj.lock);
> > > > > > +       list_del(&vm_bo->list.entry.extobj);
> > > > > > +       spin_unlock(&gpuvm->extobj.lock);
> > > > > > +
> > > > > > +       spin_lock(&gpuvm->evict.lock);
> > > > > > +       list_del(&vm_bo->list.entry.evict);
> > > > > > +       spin_unlock(&gpuvm->evict.lock);
> > > > > > +
> > > > > >         list_del(&vm_bo->list.entry.gem);
> > > > > >         drm_gem_object_put(obj);
> > > > > > @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
> > > > > >     * @vm_bo: the &drm_gpuvm_bo to release the reference of
> > > > > >     *
> > > > > >     * This releases a reference to @vm_bo.
> > > > > > + *
> > > > > > + * If the reference count drops to zero, the &gpuvm_bo is
> > > > > > destroyed, which
> > > > > > + * includes removing it from the GEMs gpuva list. Hence, if
> > > > > > a call to this
> > > > > > + * function can potentially let the reference count to zero
> > > > > > the caller must
> > > > > > + * hold the dma-resv or driver specific GEM gpuva lock.
> > > > > >     */
> > > > > >    void
> > > > > >    drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
> > > > > > @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
> > > > > > *vm_bo)
> > > > > >    }
> > > > > >    EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
> > > > > > +static int __must_check
> > > > > > +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
> > > > > > +{
> > > > > > +       return kref_get_unless_zero(&vm_bo->kref);
> > > > > > +}
> > > > > > +
> > > > > >    static struct drm_gpuvm_bo *
> > > > > >    __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
> > > > > >                     struct drm_gem_object *obj)
> > > > > > @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
> > > > > > drm_gpuvm_bo *__vm_bo)
> > > > > >    }
> > > > > >    EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
> > > > > > +/**
> > > > > > + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its
> > > > > > &drm_gpuvm's
> > > > > > + * extobj list
> > > > > > + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's the
> > > > > > extobj list.
> > > > > > + *
> > > > > > + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if
> > > > > > not on the list
> > > > > > + * already and if the corresponding &drm_gem_object is an
> > > > > > external object,
> > > > > > + * actually.
> > > > > > + */
> > > > > > +void
> > > > > > +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
> > > > > > +{
> > > > > > +       struct drm_gpuvm *gpuvm = vm_bo->vm;
> > > > > > +
> > > > > > +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
> > > > > > +               drm_gpuvm_bo_list_add(vm_bo, extobj);
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
> > > > > > +
> > > > > > +/**
> > > > > > + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to
> > > > > > / from a
> > > > > > + * &drm_gpuvms evicted list
> > > > > > + * @obj: the &drm_gem_object to add or remove
> > > > > > + * @evict: indicates whether the object is evicted
> > > > > > + *
> > > > > > + * Adds a &drm_gem_object to or removes it from all
> > > > > > &drm_gpuvms evicted
> > > > > > + * list containing a mapping of this &drm_gem_object.
> > > > > > + */
> > > > > > +void
> > > > > > +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
> > > > > > +{
> > > > > > +       struct drm_gpuvm_bo *vm_bo;
> > > > > > +
> > > > > > +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
> > > > > > +               if (evict)
> > > > > > +                       drm_gpuvm_bo_list_add(vm_bo, evict);
> > > > > > +               else
> > > > > > +                       drm_gpuvm_bo_list_del(vm_bo, evict);
> > > > > > +       }
> > > > > > +}
> > > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
> > > > > > +
> > > > > >    static int
> > > > > >    __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
> > > > > >                    struct drm_gpuva *va)
> > > > > > diff --git a/include/drm/drm_gpuvm.h
> > > > > > b/include/drm/drm_gpuvm.h
> > > > > > index afa50b9059a2..834bb6d6617e 100644
> > > > > > --- a/include/drm/drm_gpuvm.h
> > > > > > +++ b/include/drm/drm_gpuvm.h
> > > > > > @@ -26,10 +26,12 @@
> > > > > >     */
> > > > > >    #include <linux/list.h>
> > > > > > +#include <linux/dma-resv.h>
> > > > > >    #include <linux/rbtree.h>
> > > > > >    #include <linux/types.h>
> > > > > >    #include <drm/drm_gem.h>
> > > > > > +#include <drm/drm_exec.h>
> > > > > >    struct drm_gpuvm;
> > > > > >    struct drm_gpuvm_bo;
> > > > > > @@ -259,6 +261,38 @@ struct drm_gpuvm {
> > > > > >          * space
> > > > > >          */
> > > > > >         struct dma_resv *resv;
> > > > > > +
> > > > > > +       /**
> > > > > > +        * @extobj: structure holding the extobj list
> > > > > > +        */
> > > > > > +       struct {
> > > > > > +               /**
> > > > > > +                * @list: &list_head storing &drm_gpuvm_bos
> > > > > > serving as
> > > > > > +                * external object
> > > > > > +                */
> > > > > > +               struct list_head list;
> > > > > > +
> > > > > > +               /**
> > > > > > +                * @lock: spinlock to protect the extobj list
> > > > > > +                */
> > > > > > +               spinlock_t lock;
> > > > > > +       } extobj;
> > > > > > +
> > > > > > +       /**
> > > > > > +        * @evict: structure holding the evict list and evict
> > > > > > list lock
> > > > > > +        */
> > > > > > +       struct {
> > > > > > +               /**
> > > > > > +                * @list: &list_head storing &drm_gpuvm_bos
> > > > > > currently being
> > > > > > +                * evicted
> > > > > > +                */
> > > > > > +               struct list_head list;
> > > > > > +
> > > > > > +               /**
> > > > > > +                * @lock: spinlock to protect the evict list
> > > > > > +                */
> > > > > > +               spinlock_t lock;
> > > > > > +       } evict;
> > > > > >    };
> > > > > >    void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
> > > > > > drm_device *drm,
> > > > > > @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
> > > > > > *gpuvm, struct drm_device *drm,
> > > > > >                     const struct drm_gpuvm_ops *ops);
> > > > > >    void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
> > > > > > +/**
> > > > > > + * drm_gpuvm_is_extobj() - indicates whether the given
> > > > > > &drm_gem_object is an
> > > > > > + * external object
> > > > > > + * @gpuvm: the &drm_gpuvm to check
> > > > > > + * @obj: the &drm_gem_object to check
> > > > > > + *
> > > > > > + * Returns: true if the &drm_gem_object &dma_resv differs
> > > > > > from the
> > > > > > + * &drm_gpuvms &dma_resv, false otherwise
> > > > > > + */
> > > > > > +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
> > > > > > *gpuvm,
> > > > > > +                                      struct drm_gem_object
> > > > > > *obj)
> > > > > > +{
> > > > > > +       return obj && obj->resv != gpuvm->resv;
> > > > > > +}
> > > > > > +
> > > > > >    static inline struct drm_gpuva *
> > > > > >    __drm_gpuva_next(struct drm_gpuva *va)
> > > > > >    {
> > > > > > @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
> > > > > >    #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__)
> > > > > > \
> > > > > >         list_for_each_entry_safe(va__, next__, &(gpuvm__)-
> > > > > > >rb.list, rb.entry)
> > > > > > +/**
> > > > > > + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
> > > > > > &drm_exec
> > > > > > + *
> > > > > > + * This structure should be created on the stack as
> > > > > > &drm_exec should be.
> > > > > > + *
> > > > > > + * Optionally, @extra can be set in order to lock additional
> > > > > > &drm_gem_objects.
> > > > > > + */
> > > > > > +struct drm_gpuvm_exec {
> > > > > > +       /**
> > > > > > +        * @exec: the &drm_exec structure
> > > > > > +        */
> > > > > > +       struct drm_exec exec;
> > > > > > +
> > > > > > +       /**
> > > > > > +        * @vm: the &drm_gpuvm to lock its DMA reservations
> > > > > > +        */
> > > > > > +       struct drm_gpuvm *vm;
> > > > > > +
> > > > > > +       /**
> > > > > > +        * @extra: Callback and corresponding private data
> > > > > > for the driver to
> > > > > > +        * lock arbitrary additional &drm_gem_objects.
> > > > > > +        */
> > > > > > +       struct {
> > > > > > +               /**
> > > > > > +                * @fn: The driver callback to lock
> > > > > > additional &drm_gem_objects.
> > > > > > +                */
> > > > > > +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
> > > > > > +                         unsigned int num_fences);
> > > > > > +
> > > > > > +               /**
> > > > > > +                * @priv: driver private data for the @fn
> > > > > > callback
> > > > > > +                */
> > > > > > +               void *priv;
> > > > > > +       } extra;
> > > > > > +};
> > > > > > +
> > > > > > +/**
> > > > > > + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
> > > > > > resv
> > > > > > + * @gpuvm: the &drm_gpuvm
> > > > > > + * @exec: the &drm_exec context
> > > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > > + *
> > > > > > + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
> > > > > > &drm_gem_object.
> > > > > > + *
> > > > > > + * Using this function directly, it is the drivers
> > > > > > responsibility to call
> > > > > > + * drm_exec_init() and drm_exec_fini() accordingly.
> > > > > > + *
> > > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > > + */
> > > > > > +static inline int
> > > > > > +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
> > > > > > +                    struct drm_exec *exec,
> > > > > > +                    unsigned int num_fences)
> > > > > > +{
> > > > > > +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
> > > > > > num_fences);
> > > > > > +}
> > > > > > +
> > > > > > +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> > > > > > +                             struct drm_exec *exec,
> > > > > > +                             unsigned int num_fences);
> > > > > > +
> > > > > > +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
> > > > > > +                           struct drm_exec *exec,
> > > > > > +                           u64 addr, u64 range,
> > > > > > +                           unsigned int num_fences);
> > > > > > +
> > > > > > +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> > > > > > +                       unsigned int num_fences,
> > > > > > +                       bool interruptible);
> > > > > > +
> > > > > > +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
> > > > > > *vm_exec,
> > > > > > +                             struct drm_gem_object **objs,
> > > > > > +                             unsigned int num_objs,
> > > > > > +                             unsigned int num_fences,
> > > > > > +                             bool interruptible);
> > > > > > +
> > > > > > +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
> > > > > > *vm_exec,
> > > > > > +                             u64 addr, u64 range,
> > > > > > +                             unsigned int num_fences,
> > > > > > +                             bool interruptible);
> > > > > > +
> > > > > > +/**
> > > > > > + * drm_gpuvm_lock() - lock all dma-resv of all assoiciated
> > > > > > BOs
> > > > > > + * @gpuvm: the &drm_gpuvm
> > > > > > + *
> > > > > > + * Releases all dma-resv locks of all &drm_gem_objects
> > > > > > previously acquired
> > > > > > + * through drm_gpuvm_lock() or its variants.
> > > > > > + *
> > > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > > + */
> > > > > > +static inline void
> > > > > > +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
> > > > > > +{
> > > > > > +       drm_exec_fini(&vm_exec->exec);
> > > > > > +}
> > > > > > +
> > > > > > +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
> > > > > > +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> > > > > > +                             struct drm_exec *exec,
> > > > > > +                             struct dma_fence *fence,
> > > > > > +                             enum dma_resv_usage
> > > > > > private_usage,
> > > > > > +                             enum dma_resv_usage
> > > > > > extobj_usage);
> > > > > > +
> > > > > > +/**
> > > > > > + * drm_gpuvm_exec_resv_add_fence()
> > > > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > > > + * @fence: fence to add
> > > > > > + * @private_usage: private dma-resv usage
> > > > > > + * @extobj_usage: extobj dma-resv usage
> > > > > > + *
> > > > > > + * See drm_gpuvm_resv_add_fence().
> > > > > > + */
> > > > > > +static inline void
> > > > > > +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
> > > > > > *vm_exec,
> > > > > > +                             struct dma_fence *fence,
> > > > > > +                             enum dma_resv_usage
> > > > > > private_usage,
> > > > > > +                             enum dma_resv_usage
> > > > > > extobj_usage)
> > > > > > +{
> > > > > > +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec,
> > > > > > fence,
> > > > > > +                                private_usage,
> > > > > > extobj_usage);
> > > > > > +}
> > > > > > +
> > > > > >    /**
> > > > > >     * struct drm_gpuvm_bo - structure representing a
> > > > > > &drm_gpuvm and
> > > > > >     * &drm_gem_object combination
> > > > > > @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
> > > > > >                          * gpuva list.
> > > > > >                          */
> > > > > >                         struct list_head gem;
> > > > > > +
> > > > > > +                       /**
> > > > > > +                        * @evict: List entry to attach to
> > > > > > the &drm_gpuvms
> > > > > > +                        * extobj list.
> > > > > > +                        */
> > > > > > +                       struct list_head extobj;
> > > > > > +
> > > > > > +                       /**
> > > > > > +                        * @evict: List entry to attach to
> > > > > > the &drm_gpuvms evict
> > > > > > +                        * list.
> > > > > > +                        */
> > > > > > +                       struct list_head evict;
> > > > > >                 } entry;
> > > > > >         } list;
> > > > > >    };
> > > > > > @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
> > > > > >    drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
> > > > > >                   struct drm_gem_object *obj);
> > > > > > +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
> > > > > > evict);
> > > > > > +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
> > > > > > +
> > > > > >    /**
> > > > > >     * drm_gpuvm_bo_for_each_va() - iterator to walk over a
> > > > > > list of &drm_gpuva
> > > > > >     * @va__: &drm_gpuva structure to assign to in each
> > > > > > iteration step
> > > > > > @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
> > > > > >          * used.
> > > > > >          */
> > > > > >         int (*sm_step_unmap)(struct drm_gpuva_op *op, void
> > > > > > *priv);
> > > > > > +
> > > > > > +       /**
> > > > > > +        * @bo_validate: called from drm_gpuvm_validate()
> > > > > > +        *
> > > > > > +        * Drivers receive this callback for every evicted
> > > > > > &drm_gem_object being
> > > > > > +        * mapped in the corresponding &drm_gpuvm.
> > > > > > +        *
> > > > > > +        * Typically, drivers would call their driver
> > > > > > specific variant of
> > > > > > +        * ttm_bo_validate() from within this callback.
> > > > > > +        */
> > > > > > +       int (*bo_validate)(struct drm_gem_object *obj);
> > > > > >    };
> > > > > >    int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
> > > 
> > 
> 
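
Put together, the intended driver-side flow with the helpers above is: lock the
VM's common dma-resv plus all external objects, re-validate whatever was evicted,
submit, then add the job fence everywhere. A minimal sketch of that flow, using
only the drm_gpuvm_* calls introduced in this patch; struct my_job, my_job_submit()
and the chosen dma_resv usages are placeholders, and error handling is trimmed:

  /* Needs <drm/drm_gpuvm.h>, <drm/drm_exec.h>, <linux/dma-resv.h>. */

  static int my_submit(struct drm_gpuvm *gpuvm, struct my_job *job)
  {
          struct drm_gpuvm_exec vm_exec = {
                  .vm = gpuvm,
                  /* .extra.fn could lock additional, job-local objects. */
          };
          struct dma_fence *fence;
          int ret;

          /* Locks the VM's dma-resv and the dma-resv of all extobjs. */
          ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
          if (ret)
                  return ret;

          /* Ends up in the driver's drm_gpuvm_ops::bo_validate for every
           * BO currently on the evict list. */
          ret = drm_gpuvm_validate(gpuvm);
          if (ret)
                  goto out_unlock;

          fence = my_job_submit(job);     /* placeholder */

          /* The helper picks the private or extobj usage per object. */
          drm_gpuvm_exec_resv_add_fence(&vm_exec, fence,
                                        DMA_RESV_USAGE_BOOKKEEP,
                                        DMA_RESV_USAGE_WRITE);

  out_unlock:
          drm_gpuvm_exec_unlock(&vm_exec);
          return ret;
  }

Note that drm_gpuvm_exec_lock() already handles drm_exec contention and retry
internally, so the drm_exec_until_all_locked() loop only needs to be open coded
when using drm_gpuvm_prepare_vm() / drm_gpuvm_prepare_objects() directly.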



* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 11:33             ` Boris Brezillon
  2023-09-13 12:01               ` Danilo Krummrich
@ 2023-09-13 13:22               ` Thomas Hellström
  2023-09-13 14:01                 ` Boris Brezillon
  2023-09-14  8:20                 ` Boris Brezillon
  1 sibling, 2 replies; 77+ messages in thread
From: Thomas Hellström @ 2023-09-13 13:22 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Dave Airlie, Danilo Krummrich, daniel, matthew.brost,
	sarah.walker, donald.robson, christian.koenig, faith.ekstrand,
	dri-devel, nouveau, linux-kernel


On 9/13/23 13:33, Boris Brezillon wrote:
> On Wed, 13 Sep 2023 12:39:01 +0200
> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>
>> Hi,
>>
>> On 9/13/23 09:19, Boris Brezillon wrote:
>>> On Wed, 13 Sep 2023 17:05:42 +1000
>>> Dave Airlie <airlied@gmail.com> wrote:
>>>   
>>>> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
>>>> <boris.brezillon@collabora.com> wrote:
>>>>> On Tue, 12 Sep 2023 18:20:32 +0200
>>>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>>>>>      
>>>>>>> +/**
>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
>>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
>>>>>>> + *
>>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
>>>>>>> + * iterator releases the lock immediately after picking the first element from
>>>>>>> + * the list, so list insertion deletion can happen concurrently.
>>>>>> Are the list spinlocks needed for that async state update from within
>>>>>> the dma-fence critical section we've discussed previously?
>>>>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
>>>>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
>>>>> get that Xe and Nouveau don't need that because they update the VM
>>>>> state early (in the ioctl path), but I keep thinking this will hurt us
>>>>> if we don't think it through from the beginning, because once you've
>>>>> set this logic to depend only on resv locks, it will be pretty hard to
>>>>> get back to a solution which lets synchronous VM_BINDs take precedence
>>>>> on asynchronous request, and, with vkQueueBindSparse() passing external
>>>>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
>>>>> take a long time to get your synchronous VM_BIND executed...
>> So this would boil down to either (possibly opt-in) keeping the spinlock
>> approach or pushing the unlink out to a wq then?
> Deferred _unlink() would not be an issue, since I already defer the
> drm_gpuva destruction to a wq, it would just a be a matter of moving the
> _unlink() call there as well. But _link() also takes the GEM gpuva list
> lock, and that one is bit tricky, in that sm_map() can trigger 2 more
> _link() calls for the prev/next mappings, which we can't guess until we
> get to execute the VM update. If we mandate the use of the GEM resv
> lock, that simply means async VM updates (AKA calling
> drm_gpuvm_sm_[un]map()) are not an option. And if this is what everyone
> agrees on, then I'd like the APIs that make this sort of async VM
> update possible (drm_gpuvm_sm_[un]map(), the drm_gpuvm_ops::sm_step*
> methods, and probably other things) to be dropped, so we don't make it
> look like it's something we support.
>
>> BTW, as also asked in a reply to Danilo, how do you call unlink from
>> run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?
> _unlink() makes sure the GEM gpuva list lock is taken, but this can be
> a custom lock (see drm_gem_gpuva_set_lock()). In panthor we have
> panthor_gem_object::gpuva_list_lock that's dedicated the gpuva list
> protection. We make sure we never take this lock while allocating
> memory to guarantee the dma-signalling path can't deadlock.
>
>>>>>      
>>>> btw what is the use case for this? do we have actual vulkan
>>>> applications we know will have problems here?
>>> I don't, but I think that's a concern Faith raised at some point (dates
>>> back from when I was reading threads describing how VM_BIND on i915
>>> should work, and I was clearly discovering this whole VM_BIND thing at
>>> that time, so maybe I misunderstood).
>>>   
>>>> it feels like a bit of premature optimisation, but maybe we have use cases.
>>> Might be, but that's the sort of thing that would put us in a corner if
>>> we don't have a plan for when the needs arise. Besides, if we don't
>>> want to support that case because it's too complicated, I'd recommend
>>> dropping all the drm_gpuvm APIs that let people think this mode is
>>> valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
>>> drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
>>> confusion.
>> Xe allows bypassing the bind-queue with another bind-queue, but to
>> completely avoid dependencies between queues the Operations may not
>> overlap.
> So, you check the VM state with some VM lock held (would be the VM resv
> in my case), and if the mapping is new (no overlaps with pre-existing
> mappings), you queue it to the fast-track/sync-VM_BIND queue. What would
> be missing I guess is a way to know if the mapping is active (MMU has
> been updated) or pending (MMU update queued to the bind-queue), so I can
> fast-track mapping/unmapping of active mappings. This would leave
> overlapping sync/async VM updates, which can't happen in practice
> unless userspace is doing something wrong (sparse bindings always go
> through vkQueueBindSparse).

User-space is allowed to create new bind queues at will, and they 
execute independently save for range overlaps.

And the overlapping granularity depends very much on the detail of the 
range tracking.
We drafted this fenced range utility

https://gitlab.freedesktop.org/drm/xe/kernel/-/merge_requests/353

That tracks active ranges which remove themselves when the attached fence 
signals. Not sure if we ended up using it, though. A new binding would 
scan this utility for dma-fences it needs to depend upon. Ranges in Xe 
are actually page-table modification ranges, so they can exceed the actual 
VA range in some situations, but if you can build page-table structures 
asynchronously the granularity indeed becomes better.

/Thomas



>
> I'll give it a try.
>
>> (And the definition of overlap is currently page-table
>> structure updates may not overlap) but no guarantees are made about
>> priority.
>>
>> /Thomas
>>
>>
>>


* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 13:22               ` Thomas Hellström
@ 2023-09-13 14:01                 ` Boris Brezillon
  2023-09-13 14:29                   ` Thomas Hellström
  2023-09-14  8:20                 ` Boris Brezillon
  1 sibling, 1 reply; 77+ messages in thread
From: Boris Brezillon @ 2023-09-13 14:01 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: Dave Airlie, Danilo Krummrich, daniel, matthew.brost,
	sarah.walker, donald.robson, christian.koenig, faith.ekstrand,
	dri-devel, nouveau, linux-kernel

On Wed, 13 Sep 2023 15:22:56 +0200
Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:

> On 9/13/23 13:33, Boris Brezillon wrote:
> > On Wed, 13 Sep 2023 12:39:01 +0200
> > Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >  
> >> Hi,
> >>
> >> On 9/13/23 09:19, Boris Brezillon wrote:  
> >>> On Wed, 13 Sep 2023 17:05:42 +1000
> >>> Dave Airlie <airlied@gmail.com> wrote:
> >>>     
> >>>> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
> >>>> <boris.brezillon@collabora.com> wrote:  
> >>>>> On Tue, 12 Sep 2023 18:20:32 +0200
> >>>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >>>>>        
> >>>>>>> +/**
> >>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
> >>>>>>> + * @__gpuvm: The GPU VM
> >>>>>>> + * @__list_name: The name of the list we're iterating on
> >>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
> >>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> >>>>>>> + *
> >>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
> >>>>>>> + * iterator releases the lock immediately after picking the first element from
> >>>>>>> + * the list, so list insertion deletion can happen concurrently.  
> >>>>>> Are the list spinlocks needed for that async state update from within
> >>>>>> the dma-fence critical section we've discussed previously?  
> >>>>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
> >>>>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
> >>>>> get that Xe and Nouveau don't need that because they update the VM
> >>>>> state early (in the ioctl path), but I keep thinking this will hurt us
> >>>>> if we don't think it through from the beginning, because once you've
> >>>>> set this logic to depend only on resv locks, it will be pretty hard to
> >>>>> get back to a solution which lets synchronous VM_BINDs take precedence
> >>>>> on asynchronous request, and, with vkQueueBindSparse() passing external
> >>>>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
> >>>>> take a long time to get your synchronous VM_BIND executed...  
> >> So this would boil down to either (possibly opt-in) keeping the spinlock
> >> approach or pushing the unlink out to a wq then?  
> > Deferred _unlink() would not be an issue, since I already defer the
> > drm_gpuva destruction to a wq, it would just a be a matter of moving the
> > _unlink() call there as well. But _link() also takes the GEM gpuva list
> > lock, and that one is bit tricky, in that sm_map() can trigger 2 more
> > _link() calls for the prev/next mappings, which we can't guess until we
> > get to execute the VM update. If we mandate the use of the GEM resv
> > lock, that simply means async VM updates (AKA calling
> > drm_gpuvm_sm_[un]map()) are not an option. And if this is what everyone
> > agrees on, then I'd like the APIs that make this sort of async VM
> > update possible (drm_gpuvm_sm_[un]map(), the drm_gpuvm_ops::sm_step*
> > methods, and probably other things) to be dropped, so we don't make it
> > look like it's something we support.
> >  
> >> BTW, as also asked in a reply to Danilo, how do you call unlink from
> >> run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?  
> > _unlink() makes sure the GEM gpuva list lock is taken, but this can be
> > a custom lock (see drm_gem_gpuva_set_lock()). In panthor we have
> > panthor_gem_object::gpuva_list_lock that's dedicated the gpuva list
> > protection. We make sure we never take this lock while allocating
> > memory to guarantee the dma-signalling path can't deadlock.
> >  
> >>>>>        
> >>>> btw what is the use case for this? do we have actual vulkan
> >>>> applications we know will have problems here?  
> >>> I don't, but I think that's a concern Faith raised at some point (dates
> >>> back from when I was reading threads describing how VM_BIND on i915
> >>> should work, and I was clearly discovering this whole VM_BIND thing at
> >>> that time, so maybe I misunderstood).
> >>>     
> >>>> it feels like a bit of premature optimisation, but maybe we have use cases.  
> >>> Might be, but that's the sort of thing that would put us in a corner if
> >>> we don't have a plan for when the needs arise. Besides, if we don't
> >>> want to support that case because it's too complicated, I'd recommend
> >>> dropping all the drm_gpuvm APIs that let people think this mode is
> >>> valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
> >>> drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
> >>> confusion.  
> >> Xe allows bypassing the bind-queue with another bind-queue, but to
> >> completely avoid dependencies between queues the Operations may not
> >> overlap.  
> > So, you check the VM state with some VM lock held (would be the VM resv
> > in my case), and if the mapping is new (no overlaps with pre-existing
> > mappings), you queue it to the fast-track/sync-VM_BIND queue. What would
> > be missing I guess is a way to know if the mapping is active (MMU has
> > been updated) or pending (MMU update queued to the bind-queue), so I can
> > fast-track mapping/unmapping of active mappings. This would leave
> > overlapping sync/async VM updates, which can't happen in practice
> > unless userspace is doing something wrong (sparse bindings always go
> > through vkQueueBindSparse).  
> 
> User-space is allowed to create new bind queues at will, and they 
> execute independently save for range overlaps.

I've limited panthor to just one bind-queue that's automatically
created when the VM is created. I guess letting userspace create more
than one queue is doable, but we'd still be serializing VM
operations anyway and that complicates the whole thing when concurrent
operations to the same VM region happen from different bind queues, so I
figured it'd be simpler to expose just one queue.

> 
> And the overlapping granularity depends very much on the detail of the 
> range tracking.
> We drafted this fenced range utility
> 
> https://gitlab.freedesktop.org/drm/xe/kernel/-/merge_requests/353
> 
> That tracks active ranges that remove themselves when the attached fence 
> signals. Not sure if we ended up using it, though. A new binding would 
> scan this utility for dma-fences it needs to depend upon.

Sounds like implicit deps on VM ranges :D. I'll have a look, thanks
for the pointer! 

> Ranges in Xe 
> are actually page-table modification ranges, so can exceed the actual VA 
> range in some situations, but if you can build page-table structures 
> async the granularity indeed becomes better.

The granularity in Mali is 4k, and we don't build the page table struct
asynchronously, we just update the page table tree from the CPU,
holding a VM lock to serialize such operations (that's done
synchronously in the ::run_job() path, or from the ioctl in case of a
sync-VM_BIND).

> 
> /Thomas
> 
> 
> 
> >
> > I'll give it a try.
> >  
> >> (And the definition of overlap is currently page-table
> >> structure updates may not overlap) but no guarantees are made about
> >> priority.
> >>
> >> /Thomas
> >>
> >>
> >>  
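
For readers who don't want to dig through the MR linked above, the idea can be
sketched in a few lines. This is purely illustrative and not the Xe utility
itself; the struct and function names are made up, the only real APIs used are
the dma_fence callback and the usual list/spinlock primitives, and callers are
assumed to kmalloc() the entries:

  /* Needs <linux/dma-fence.h>, <linux/list.h>, <linux/slab.h>,
   * <linux/spinlock.h>. */

  struct va_range_tracker {
          spinlock_t lock;
          struct list_head ranges;
  };

  struct va_range_fence {
          struct va_range_tracker *tracker;
          struct list_head entry;
          u64 start, last;
          struct dma_fence *fence;
          struct dma_fence_cb cb;
  };

  static void va_range_fence_signaled(struct dma_fence *fence,
                                      struct dma_fence_cb *cb)
  {
          struct va_range_fence *r = container_of(cb, struct va_range_fence, cb);

          /* Fence signalling context: spinlock only, no allocations. */
          spin_lock(&r->tracker->lock);
          list_del(&r->entry);
          spin_unlock(&r->tracker->lock);

          dma_fence_put(r->fence);
          kfree(r);
  }

  /* Publish a pending page-table update covering [start, last]. */
  static void va_range_tracker_add(struct va_range_tracker *t,
                                   struct va_range_fence *r,
                                   u64 start, u64 last,
                                   struct dma_fence *fence)
  {
          r->tracker = t;
          r->start = start;
          r->last = last;
          r->fence = dma_fence_get(fence);

          spin_lock(&t->lock);
          list_add_tail(&r->entry, &t->ranges);
          spin_unlock(&t->lock);

          /* The entry removes itself once the bind job completes. */
          if (dma_fence_add_callback(fence, &r->cb, va_range_fence_signaled))
                  va_range_fence_signaled(fence, &r->cb); /* already signalled */
  }

  /* Return a referenced fence a new bind overlapping [start, last] still
   * depends on, or NULL if the range is idle. */
  static struct dma_fence *
  va_range_tracker_get_dep(struct va_range_tracker *t, u64 start, u64 last)
  {
          struct va_range_fence *r;
          struct dma_fence *fence = NULL;

          spin_lock(&t->lock);
          list_for_each_entry(r, &t->ranges, entry) {
                  if (r->last >= start && r->start <= last) {
                          fence = dma_fence_get(r->fence);
                          break;
                  }
          }
          spin_unlock(&t->lock);

          return fence;
  }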



* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 12:16             ` Danilo Krummrich
@ 2023-09-13 14:26               ` Christian König
  2023-09-13 15:13                 ` Thomas Hellström
  2023-09-13 15:15                 ` Danilo Krummrich
  2023-09-14 10:57               ` [Nouveau] " Danilo Krummrich
  1 sibling, 2 replies; 77+ messages in thread
From: Christian König @ 2023-09-13 14:26 UTC (permalink / raw)
  To: Danilo Krummrich, Thomas Hellström
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
> As mentioned in a different mail thread, the reply is based on the assumption
> that we don't support anything else than GPUVM updates from the IOCTL.

I think that this assumption is incorrect.

Vulkan is just one specific use case, but this here should probably be 
able to handle other use cases as well.

Especially with HMM you get the requirement that you need to be able to 
invalidate GPUVM mappings without grabbing a reservation lock.

See what the eviction lock in amdgpu is doing for example.

Regards,
Christian.
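
For contrast, the eviction tracking in this series assumes the BO's dma-resv is
held whenever a mapping is marked evicted, e.g. roughly like the following in a
TTM based driver (my_* names, the placement and my_copy_and_assign() are
placeholders, not code from any existing driver). HMM style invalidation has no
such lock to rely on:

  /* Needs <drm/drm_gpuvm.h>, <drm/ttm/ttm_bo.h>, <drm/ttm/ttm_placement.h>. */

  static int my_bo_move(struct ttm_buffer_object *bo, bool evict,
                        struct ttm_operation_ctx *ctx,
                        struct ttm_resource *new_mem,
                        struct ttm_place *hop)
  {
          int ret;

          ret = my_copy_and_assign(bo, ctx, new_mem, hop); /* driver specific */
          if (ret)
                  return ret;

          /* TTM holds the BO's dma-resv here; mark/unmark the BO on the
           * evict list of every VM that maps it. */
          drm_gpuvm_bo_evict(&bo->base, evict);
          return 0;
  }

  /* drm_gpuvm_ops::bo_validate, invoked from drm_gpuvm_validate(). */
  static int my_bo_validate(struct drm_gem_object *obj)
  {
          struct ttm_buffer_object *bo =
                  container_of(obj, struct ttm_buffer_object, base);
          struct ttm_operation_ctx ctx = { .interruptible = true };

          return ttm_bo_validate(bo, &my_placement, &ctx);
  }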

>
> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström wrote:
>> Hi!
>>
>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström wrote:
>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
>>>>>> Hi, Danilo,
>>>>>>
>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>> So far the DRM GPUVA manager offers common infrastructure to
>>>>>>> track GPU VA
>>>>>>> allocations and mappings, generically connect GPU VA mappings
>>>>>>> to their
>>>>>>> backing buffers and perform more complex mapping operations
>>>>>>> on the GPU VA
>>>>>>> space.
>>>>>>>
>>>>>>> However, there are more design patterns commonly used by
>>>>>>> drivers, which
>>>>>>> can potentially be generalized in order to make the DRM GPUVA
>>>>>>> manager
>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>> this patch aims
>>>>>>> at generalizing the following elements.
>>>>>>>
>>>>>>> 1) Provide a common dma-resv for GEM objects not being used
>>>>>>> outside of
>>>>>>>       this GPU-VM.
>>>>>>>
>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>> which are
>>>>>>>       shared with other GPU-VMs).
>>>>>>>
>>>>>>> 3) Provide functions to efficiently lock all GEM objects dma-
>>>>>>> resv the
>>>>>>>       GPU-VM contains mappings of.
>>>>>>>
>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>> contains mappings
>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>> accelerated.
>>>>>>>
>>>>>>> 5) Provide some convinience functions for common patterns.
>>>>>>>
>>>>>>> Rather than being designed as a "framework", the target is to
>>>>>>> make all
>>>>>>> features appear as a collection of optional helper functions,
>>>>>>> such that
>>>>>>> drivers are free to make use of the DRM GPUVA managers basic
>>>>>>> functionality and opt-in for other features without setting
>>>>>>> any feature
>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>
>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>> locking for drivers
>>>>>>> updating the GPU VA space within the fence signalling path.
>>>>>>>
>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an existing
>>>>>>> instance of this
>>>>>>>      * particular combination. If not existent a new instance
>>>>>>> is created and linked
>>>>>>>      * to the &drm_gem_object.
>>>>>>> + *
>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>> &drm_gpuvm, are also used
>>>>>>> + * as entry for the &drm_gpuvm's lists of external and
>>>>>>> evicted objects. Those
>>>>>>> + * list are maintained in order to accelerate locking of
>>>>>>> dma-resv locks and
>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For
>>>>>>> instance the all
>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be
>>>>>>> locked by calling
>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>> drm_gpuvm_validate() in
>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>> also possible to lock
>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>> corresponding parameters to
>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec
>>>>>>> loop while making
>>>>>>> + * use of helper functions such as drm_gpuvm_prepare_range()
>>>>>>> or
>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>> + *
>>>>>>> + * Every bound &drm_gem_object is treated as external object
>>>>>>> when its &dma_resv
>>>>>>> + * structure is different than the &drm_gpuvm's common
>>>>>>> &dma_resv structure.
>>>>>>>      */
>>>>>>>     /**
>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for the same
>>>>>>> &drm_gpuvm and
>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>> creations and destructions
>>>>>>>      * of &drm_gpuvm_bos in order to keep instances unique.
>>>>>>> + *
>>>>>>> + * The &drm_gpuvm's lists for keeping track of external and
>>>>>>> evicted objects are
>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>> iteration internally.
>>>>>>> + *
>>>>>>> + * However, drivers still need ensure to protect concurrent
>>>>>>> calls to functions
>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function contains
>>>>>>> a particular
>>>>>>> + * comment and lockdep checks if possible.
>>>>>>> + *
>>>>>>> + * Functions adding or removing entries from those lists,
>>>>>>> such as
>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be
>>>>>>> called with external
>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>> corresponding list to be
>>>>>>> + * (safely) modified while potentially being iternated by
>>>>>>> other API functions.
>>>>>>> + * However, this is entirely optional.
>>>>>>>      */
>>>>>>>     /**
>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>      *   }
>>>>>>>      */
>>>>>>> +/**
>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>> already iterated items
>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>> drm_gpuvm_get_next_cached_vm_bo()
>>>>>>> + *
>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>> Lockless as in, the
>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>> first element from
>>>>>>> + * the list, so list insertion and deletion can happen
>>>>>>> concurrently.
>>>>>> Are the list spinlocks needed for that async state update from
>>>>>> within the
>>>>>> dma-fence critical section we've discussed previously?
>>>>> Yes, but also for other reasons, see below.
>>>>>
>>>>>> Otherwise it should be sufficient to protect the lists with the
>>>>>> gpuvm's resv
>>>>>> (or for the extobj list with an outer lock).
>>>>>>
>>>>>> If those spinlocks are still needed in some situations, perhaps
>>>>>> could we
>>>>>> have an option to set them to NULL (Like IIRC the maple tree
>>>>>> allows for)?
>>>>> The evict spinlock is needed in any case, since in
>>>>> drm_gpuvm_bo_evict() we're
>>>>> holding only the dma-resv lock from the BO this function gets
>>>>> called for. Hence,
>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() calls with
>>>>> different BOs.
>>>> No. Only if you try to add external objects to the vm's evict list
>>>> from
>>>> within the evict code. That's not necessary since you loop through
>>>> all
>>>> external objects anyway when locking them so an "evicted" bool in
>>>> the vm_bo,
>>>> protected by the bo resv would be sufficient. The extobj locking
>>>> loop can
>>>> then add the bo to the evicted list.
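
A minimal sketch of the scheme Thomas suggests above (illustrative only; the
"evicted" flag is hypothetical and not part of the posted patch, the other
field and list names are taken from it):

	/* drm_gpuvm_bo_evict(), called with only the BO's dma-resv held */
	vm_bo->evicted = evict;

	/* extobj locking loop, once all dma-resv locks are acquired */
	list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj) {
		dma_resv_assert_held(vm_bo->obj->resv);
		if (vm_bo->evicted)
			list_move_tail(&vm_bo->list.entry.evict,
				       &gpuvm->evict.list);
	}
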
>>> And validate() can remove it while still holding all dma-resv locks,
>>> neat!
>>> However, what if two tasks are trying to lock the VA space
>>> concurrently? What
>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>> drm_gpuva_unlink()?
>>> Are we guaranteed that at this point of time the drm_gpuvm_bo is not
>>> on the
>>> evicted list? Because otherwise we would call drm_gpuvm_bo_destroy()
>>> with the
>>> dma-resv lock held, which wouldn't be allowed, since
>>> drm_gpuvm_bo_destroy()
>>> might drop the last reference to the drm_gem_object and hence we'd
>>> potentially
>>> free the dma-resv lock while holding it, at least if it's an external
>>> object.
>> Easiest way in this scheme is to think of the lists as being protected
>> by the vm's resv lock. That means anybody calling unlink() must also
>> hold the vm's resv lock. (Which is OK from a UAF point of view, but
>> perhaps not from a locking inversion POV for an async list update).
> This would mean that on unlink() we'd need to hold the VM's resv lock and the
> corresponding GEM's resv lock (in case they're not the same anyways) because the
> VM's resv lock would protect the external / evicted object lists and the GEM
> objects resv lock protects the GEM's list of drm_gpuvm_bos and the
> drm_gpuvm_bo's list of drm_gpuvas.
>
>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>> really would not
>>>>> like to add even more complexity just to get the spinlock out of
>>>>> the way in case
>>>>> the driver already has an outer lock protecting this path.
>>>> I must disagree here. These spinlocks and atomic operations are
>>>> pretty
>>>> costly and as discussed earlier this type of locking was the reason
>>>> (at
>>>> least according to the commit message) that made Christian drop the
>>>> XArray
>>>> use in drm_exec for the same set of objects: "The locking overhead
>>>> is
>>>> unecessary and measurable". IMHO the spinlock is the added
>>>> complexity and a
>>>> single wide lock following the drm locking guidelines set out by
>>>> Daniel and
>>>> David should really be the default choice with an opt-in for a
>>>> spinlock if
>>>> needed for async and pushing out to a wq is not an option.
>>> For the external object list an outer lock would work as long as it's
>>> not the
>>> dma-resv lock of the corresponding GEM object, since here we actually
>>> need to
>>> remove the list entry from the external object list on
>>> drm_gpuvm_bo_destroy().
>>> It's just a bit weird design wise that drivers would need to take
>>> this outer
>>> lock on:
>>>
>>> - drm_gpuvm_bo_extobj_add()
>>> - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
>>> - drm_gpuva_unlink()            (because it needs to call
>>> drm_gpuvm_bo_put())
>>> - drm_gpuvm_exec_lock()
>>> - drm_gpuvm_exec_lock_array()
>>> - drm_gpuvm_prepare_range()
>>>
>>> Given that it seems reasonable to do all the required locking
>>> internally.
>>  From a design POV, there has been a clear direction in Xe to make
>> things similar to mmap() / munmap(), so this outer lock, which in Xe is
>> an rwsem, is used in a similar way as the mmap_lock. It's protecting
>> the page-table structures and vma rb tree, the userptr structures and
>> the extobj list. Basically it's taken early in the exec IOCTL, the
>> VM_BIND ioctl, the compute rebind worker and the pagefault handler, so
>> all of the above are just asserting that it is taken in the correct
>> mode.
>>
>> But strictly with this scheme one could also use the vm's dma_resv for
>> the extobj list since with drm_exec, it's locked before traversing the
>> list.
>>
>> The whole point of this scheme is to rely on locks that you already are
>> supposed to be holding for various reasons and is simple to comprehend.
> I don't agree that we're supposed to hold the VM's resv lock anyways for
> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm fine using it
> for that purpose nevertheless.
>
>>> In order to at least place lockdep checks, the driver would need to
>>> supply the
>>> corresponding lock's lockdep_map, because the GPUVM otherwise doesn't
>>> know about
>>> the lock.
>> Yes, that sounds reasonable. One lockdep map per list.
> I'd really like to avoid that, especially now that everything got simpler. We
> should define the actual locks to take instead.
>
>>> Out of curiosity, what is the overhead of a spin_lock() that doesn't
>>> need to
>>> spin?
>> I guess it's hard to tell exactly, but it is much lower on modern x86
>> than what it used to be. Not sure about ARM, which is the other
>> architecture important to us. I figure if there is little cache-line
>> bouncing the main overhead comes from the implied barriers.
>>
>>>> A pretty simple way that would not add much code would be
>>>>
>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>                                  spinlock_t *lock)
>>>> {
>>>>      if (!gpuvm->resv_protected_lists)
>>>>          spin_lock(lock);
>>>> }
>>>>
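
The list helpers from the patch could then take the lock conditionally, e.g.
(sketch only; "resv_protected_lists" is the opt-in flag proposed above and
gpuvm_cond_spin_unlock() the assumed unlock counterpart, neither is part of
the posted patch):

	#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                       \
		do {                                                              \
			struct drm_gpuvm *__vm = (__vm_bo)->vm;                   \
			gpuvm_cond_spin_lock(__vm, &__vm->__list_name.lock);      \
			if (list_empty(&(__vm_bo)->list.entry.__list_name))       \
				list_add_tail(&(__vm_bo)->list.entry.__list_name, \
					      &__vm->__list_name.list);           \
			gpuvm_cond_spin_unlock(__vm, &__vm->__list_name.lock);    \
		} while (0)
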
>>>>>> For such drivers, that would require anybody calling unlink to
>>>>>> hold the vm's
>>>>>> resv, though.
>>>>> In V4 I want to go back to having a dedicated lock for the GEMs
>>>>> gpuva list (or
>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>> lock for that
>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>> otherwise wouldn't
>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>> was referring to
>>>>> earlier.
>>>> Yeah, I can see the need for a dedicated lock for the GEM's gpuva
>>>> list, but
>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>> problem. We
>>>> may free the object and a pointer to the vm's resv during unlink
>>>> but we
>>>> don't free the vm's resv.  It'd be a matter of ensuring that any
>>>> calls to
>>>> unlink from *within* drm_gpuvm allows it to be held.
>>> Drivers calling unlink() from the fence signaling path can't use the
>>> VM's
>>> dma-resv lock.
>> Yes, that made me a bit curious because in the current version the code
>> required the object's dma_resv for unlink() which can't be grabbed
>> either from the fence signaling path. So are there any drivers actually
>> wanting to do that? If so, they will either need to resort to the
>> current spinlock solution or they will need to call unlink from a
>> workqueue item.
> As Boris already mentioned we have the dma-resv lock by default or a driver
> specific GEM gpuva lock as opt-in. Now, we can get rid of the latter.
>
>>> Also, what if the object is an external object? We can't use the VM's
>>> dma-resv
>>> lock here.
>> Why? Typically (sync) unlink is only ever called from an unbind-like
>> operation where it should be trivial to grab the vm's resv. Or, for
>> that matter any outer lock protecting the extobj list. Rule would be
>> the drm_gpuvm_bo::entry::extobj  and drm_gpuvm_bo::entry::evict would
>> be protected by either the vm's dma_resv (or possibly an outer lock in
>> the case of the extobj list).
> An outer lock wouldn't have worked for updates in the async path, but that
> shouldn't be relevant anymore. We could use the VM's resv for that.
>
>>>   And we can't have the GEM objs dma-resv lock held when calling
>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>> refcount drops
>>> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might
>>> drop the
>>> last reference of the GEM object.
>> Yes, but this is a different problem as to what exactly protects
>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per bo list
>> lock, or if we want to keep the bo's dma_resv we need to ensure that
>> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
>> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount (I know
>> Boris didn't like that, but requiring an explicit refcount for a
>> pointer you dereference unless you're under a lock that ensures keeping
>> the object alive is pretty much required?) But anyway for the
>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal spinlock)
>> I don't have a strong preference.
> We can keep the GEM objects dma-resv lock, however as mentioned above
> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both the VM's resv lock
> and the GEM's resv lock in case they differ.
>
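
In that scheme a synchronous unbind could look roughly like the following
(sketch only, ignoring drm_exec / ww-acquire context handling; the helpers
used are the ones from this patch, "obj" and "va" come from the mapping
being torn down):

	/* Both the VM's and the GEM's dma-resv must be held across the
	 * unlink, since the last vm_bo reference may be dropped here.
	 */
	dma_resv_lock(gpuvm->resv, NULL);
	if (drm_gpuvm_is_extobj(gpuvm, obj))
		dma_resv_lock(obj->resv, NULL);

	drm_gpuva_remove(va);
	drm_gpuva_unlink(va);

	if (drm_gpuvm_is_extobj(gpuvm, obj))
		dma_resv_unlock(obj->resv);
	dma_resv_unlock(gpuvm->resv);
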
>>>   All those problems go away with a dedicated
>>> GEM gpuva list lock.
>> I don't think these are real problems.
>> With the exception of the eviction list "trick" where we currently have
>> slightly different approach to collect external bos needing rebinding,
>> we have this working fine.
>>
>> TBH I think pretty much the only situation where the spinlock is needed
>> is for async updates of these lists, unless a wq item can be used for
>> that, but it doesn't really seem like the current code allows for such
>> updates anyway? It complicates the code a lot, adds overhead and also
>> adds the requirement for refcounting during list traversal.
>>
>> /Thomas
>>
>>>> /Thomas
>>>>
>>>>
>>>>>> It seems that with that also the refcount could be made non-
>>>>>> atomic.
>>>>>>
>>>>>> All in the spirit of the drm locking guidelines "use big locks
>>>>>> when
>>>>>> possible".
>>>>>> Lower level locks only when necessary for performance or
>>>>>> locking inversion?
>>>>>>
>>>>>> /Thomas
>>>>>>
>>>>>>
>>>>>>> + *
>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>> local list, so removal
>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>> iterating the list.
>>>>>>> + */
>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name,
>>>>>>> __local_list, __prev_vm_bo)     \
>>>>>>> +       ({
>>>>>>>                             \
>>>>>>> +               struct drm_gpuvm_bo
>>>>>>> *__vm_bo;                                           \
>>>>>>> +
>>>>>>>                             \
>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);
>>>>>>>                             \
>>>>>>> +
>>>>>>>                             \
>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>> __list_name.lock);                                \
>>>>>>> +               while (!list_empty(&(__gpuvm)-
>>>>>>>> __list_name.list)) {                     \
>>>>>>> +                       __vm_bo =
>>>>>>> list_first_entry(&(__gpuvm)->__list_name.list,        \
>>>>>>> +                                                  struct
>>>>>>> drm_gpuvm_bo,                 \
>>>>>>> +
>>>>>>> list.entry.__list_name);             \
>>>>>>> +                       if
>>>>>>> (drm_gpuvm_bo_get_unless_zero(__vm_bo))
>>>>>>> {                    \
>>>>>>> +                               list_move_tail(&(__vm_bo)-
>>>>>>>> list.entry.__list_name,      \
>>>>>>> +
>>>>>>> __local_list);                           \
>>>>>>> +                               break;
>>>>>>>                             \
>>>>>>> +                       } else
>>>>>>> {                                                        \
>>>>>>> +                               list_del_init(&(__vm_bo)-
>>>>>>>> list.entry.__list_name);      \
>>>>>>> +                               __vm_bo =
>>>>>>> NULL;                                         \
>>>>>>> +                       }
>>>>>>>                             \
>>>>>>> +               }
>>>>>>>                             \
>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>> __list_name.lock);                              \
>>>>>>> +
>>>>>>>                             \
>>>>>>> +               __vm_bo;
>>>>>>>                             \
>>>>>>> +       })
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>>>>>> + *
>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>> Lockless as in, the
>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>> first element from the
>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>> concurrently.
>>>>>>> + *
>>>>>>> + * Typical use:
>>>>>>> + *
>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>> + *
>>>>>>> + *     ret = 0;
>>>>>>> + *     drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>,
>>>>>>> &my_local_list, vm_bo) {
>>>>>>> + *             ret = do_something_with_vm_bo(..., vm_bo);
>>>>>>> + *             if (ret)
>>>>>>> + *                     break;
>>>>>>> + *     }
>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>> + *     drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>,
>>>>>>> &my_local_list);
>>>>>>> + *
>>>>>>> + *
>>>>>>> + * Only used for internal list iterations, not meant to be
>>>>>>> exposed to the outside
>>>>>>> + * world.
>>>>>>> + */
>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name,
>>>>>>> __local_list, __vm_bo)    \
>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm,
>>>>>>> __list_name,           \
>>>>>>> +                                               __local_list,
>>>>>>> NULL);            \
>>>>>>> +
>>>>>>> __vm_bo;
>>>>>>>        \
>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm,
>>>>>>> __list_name,           \
>>>>>>> +                                               __local_list,
>>>>>>> __vm_bo))         \
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to their
>>>>>>> original list
>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>> already iterated items
>>>>>>> + *
>>>>>>> + * When we're done iterating a vm_bo list, we should call
>>>>>>> restore_vm_bo_list()
>>>>>>> + * to restore the original state and let new iterations take
>>>>>>> place.
>>>>>>> + */
>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name,
>>>>>>> __local_list)                         \
>>>>>>> +       do
>>>>>>> {
>>>>>>>                  \
>>>>>>> +               /* Merge back the two lists, moving local
>>>>>>> list elements to the          \
>>>>>>> +                * head to preserve previous ordering, in
>>>>>>> case it matters.              \
>>>>>>> +
>>>>>>> */
>>>>>>>            \
>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>> __list_name.lock);                                \
>>>>>>> +               list_splice(__local_list, &(__gpuvm)-
>>>>>>>> __list_name.list);                \
>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>> __list_name.lock);                              \
>>>>>>> +       } while (0)
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given
>>>>>>> list
>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>> + *
>>>>>>> + * Inserts the given @__vm_bo into the list specified by
>>>>>>> @__list_name and
>>>>>>> + * increases the vm_bo's reference count.
>>>>>>> + */
>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo,
>>>>>>> __list_name)                            \
>>>>>>> +       do
>>>>>>> {
>>>>>>>          \
>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>> __list_name.lock);                    \
>>>>>>> +               if (list_empty(&(__vm_bo)-
>>>>>>>> list.entry.__list_name))             \
>>>>>>> +                       list_add_tail(&(__vm_bo)-
>>>>>>>> list.entry.__list_name,       \
>>>>>>> +                                     &(__vm_bo)->vm-
>>>>>>>> __list_name.list);        \
>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>> __list_name.lock);                  \
>>>>>>> +       } while (0)
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given
>>>>>>> list
>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>> + * @__list_name: the name of the list to remove from
>>>>>>> + *
>>>>>>> + * Removes the given @__vm_bo from the list specified by
>>>>>>> @__list_name and
>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>> + */
>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo,
>>>>>>> __list_name)                            \
>>>>>>> +       do
>>>>>>> {
>>>>>>>          \
>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>> __list_name.lock);                    \
>>>>>>> +               if (!list_empty(&(__vm_bo)-
>>>>>>>> list.entry.__list_name))            \
>>>>>>> +                       list_del_init(&(__vm_bo)-
>>>>>>>> list.entry.__list_name);      \
>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>> __list_name.lock);                  \
>>>>>>> +       } while (0)
>>>>>>> +
>>>>>>> +static int __must_check
>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>> +
>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), struct
>>>>>>> drm_gpuva, rb.node)
>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
>>>>>>> struct drm_device *drm,
>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>> +
>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>> +
>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>          gpuvm->mm_range = range;
>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
>>>>>>> *gpuvm)
>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>               "GPUVA tree is not empty, potentially leaking
>>>>>>> memory.\n");
>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
>>>>>>> should be empty.\n");
>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
>>>>>>> should be empty.\n");
>>>>>>> +
>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>     }
>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>> + *
>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the
>>>>>>> given
>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>> + *
>>>>>>> + * Using this function directly, it is the drivers
>>>>>>> responsibility to call
>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>> + *
>>>>>>> + * Note: This function is safe against concurrent insertion
>>>>>>> and removal of
>>>>>>> + * external objects, however it is not safe against
>>>>>>> concurrent usage itself.
>>>>>>> + *
>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>> either an outer VM lock
>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this function
>>>>>>> within the
>>>>>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's
>>>>>>> dma-resv lock ensures
>>>>>>> + * mutual exclusion.
>>>>>>> + *
>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>> + */
>>>>>>> +int
>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>> +                         struct drm_exec *exec,
>>>>>>> +                         unsigned int num_fences)
>>>>>>> +{
>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>> +       int ret = 0;
>>>>>>> +
>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>> vm_bo) {
>>>>>>> +               ret = drm_exec_prepare_obj(exec, vm_bo->obj,
>>>>>>> num_fences);
>>>>>>> +               if (ret)
>>>>>>> +                       break;
>>>>>>> +       }
>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>> +
>>>>>>> +       return ret;
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped within
>>>>>>> a given range
>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>> + * @addr: the start address within the VA space
>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>> + *
>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>> mapped between @addr
>>>>>>> + * and @addr + @range.
>>>>>>> + *
>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>> + */
>>>>>>> +int
>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>> drm_exec *exec,
>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>> num_fences)
>>>>>>> +{
>>>>>>> +       struct drm_gpuva *va;
>>>>>>> +       u64 end = addr + range;
>>>>>>> +       int ret;
>>>>>>> +
>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>> +
>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>> num_fences);
>>>>>>> +               if (ret)
>>>>>>> +                       return ret;
>>>>>>> +       }
>>>>>>> +
>>>>>>> +       return 0;
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>> associated BOs
>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>> + *
>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>> given
>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>> + *
>>>>>>> + * Additionally, when calling this function with struct
>>>>>>> drm_gpuvm_exec::extra
>>>>>>> + * being set the driver receives the given @fn callback to
>>>>>>> lock additional
>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec instance.
>>>>>>> Typically, drivers
>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>> callback.
>>>>>>> + *
>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>> + */
>>>>>>> +int
>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>> +                   unsigned int num_fences,
>>>>>>> +                   bool interruptible)
>>>>>>> +{
>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>> +       uint32_t flags;
>>>>>>> +       int ret;
>>>>>>> +
>>>>>>> +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>> 0 |
>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>> +
>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>> +
>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>> num_fences);
>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>> +               if (ret)
>>>>>>> +                       goto err;
>>>>>>> +
>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>> num_fences);
>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>> +               if (ret)
>>>>>>> +                       goto err;
>>>>>>> +
>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>> num_fences);
>>>>>>> +                       drm_exec_retry_on_contention(exec);
>>>>>>> +                       if (ret)
>>>>>>> +                               goto err;
>>>>>>> +               }
>>>>>>> +       }
>>>>>>> +
>>>>>>> +       return 0;
>>>>>>> +
>>>>>>> +err:
>>>>>>> +       drm_exec_fini(exec);
>>>>>>> +       return ret;
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>> +
>>>>>>> +static int
>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int
>>>>>>> num_fences)
>>>>>>> +{
>>>>>>> +       struct {
>>>>>>> +               struct drm_gem_object **objs;
>>>>>>> +               unsigned int num_objs;
>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>> +
>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args-
>>>>>>>> objs,
>>>>>>> +                                     args->num_objs,
>>>>>>> num_fences);
>>>>>>> +}
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all
>>>>>>> associated BOs
>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>> + * @num_objs: the number of additional &drm_gem_objects to
>>>>>>> lock
>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>> + *
>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>> given &drm_gpuvm
>>>>>>> + * contains mappings of, plus the ones given through @objs.
>>>>>>> + *
>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>> + */
>>>>>>> +int
>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>> +                         unsigned int num_objs,
>>>>>>> +                         unsigned int num_fences,
>>>>>>> +                         bool interruptible)
>>>>>>> +{
>>>>>>> +       struct {
>>>>>>> +               struct drm_gem_object **objs;
>>>>>>> +               unsigned int num_objs;
>>>>>>> +       } args;
>>>>>>> +
>>>>>>> +       args.objs = objs;
>>>>>>> +       args.num_objs = num_objs;
>>>>>>> +
>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>> +
>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>> interruptible);
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>> within a given range
>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>> + * @addr: the start address within the VA space
>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>> + *
>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>> mapped between @addr and
>>>>>>> + * @addr + @range.
>>>>>>> + *
>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>> + */
>>>>>>> +int
>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>> +                         u64 addr, u64 range,
>>>>>>> +                         unsigned int num_fences,
>>>>>>> +                         bool interruptible)
>>>>>>> +{
>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>> +       uint32_t flags;
>>>>>>> +       int ret;
>>>>>>> +
>>>>>>> +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>> 0 |
>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>> +
>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>> +
>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec,
>>>>>>> addr, range,
>>>>>>> +                                             num_fences);
>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>> +               if (ret)
>>>>>>> +                       goto err;
>>>>>>> +       }
>>>>>>> +
>>>>>>> +       return ret;
>>>>>>> +
>>>>>>> +err:
>>>>>>> +       drm_exec_fini(exec);
>>>>>>> +       return ret;
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as evicted
>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>> + *
>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>> evicted buffer
>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>> + *
>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>> + */
>>>>>>> +int
>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>> +{
>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>> +       LIST_HEAD(evict);
>>>>>>> +       int ret = 0;
>>>>>>> +
>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>> +               return -ENOTSUPP;
>>>>>>> +
>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>> +               if (ret)
>>>>>>> +                       break;
>>>>>>> +       }
>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>> +
>>>>>>> +       return ret;
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and all
>>>>>>> extobj
>>>>>>> + * dma-resv
>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>> + * @fence: fence to add
>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>> + */
>>>>>>> +void
>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>> +                        struct drm_exec *exec,
>>>>>>> +                        struct dma_fence *fence,
>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>> +{
>>>>>>> +       struct drm_gem_object *obj;
>>>>>>> +       unsigned long index;
>>>>>>> +
>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm,
>>>>>>> obj) ?
>>>>>>> +                                  private_usage :
>>>>>>> extobj_usage);
>>>>>>> +       }
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>> +
>>>>>>>     /**
>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of struct
>>>>>>> drm_gpuvm_bo
>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>> *gpuvm,
>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>> +
>>>>>>>          drm_gem_object_get(obj);
>>>>>>>          return vm_bo;
>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>> +
>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>> +
>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>          drm_gem_object_put(obj);
>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>>>>>      *
>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>> + *
>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is
>>>>>>> destroyed, which
>>>>>>> + * includes removing it from the GEMs gpuva list. Hence, if
>>>>>>> a call to this
>>>>>>> + * function can potentially let the reference count to zero
>>>>>>> the caller must
>>>>>>> + * hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>      */
>>>>>>>     void
>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
>>>>>>> *vm_bo)
>>>>>>>     }
>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>> +static int __must_check
>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>> +{
>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>> +}
>>>>>>> +
>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>                      struct drm_gem_object *obj)
>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>     }
>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its
>>>>>>> &drm_gpuvm's
>>>>>>> + * extobj list
>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's
>>>>>>> extobj list.
>>>>>>> + *
>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if
>>>>>>> not on the list
>>>>>>> already and if the corresponding &drm_gem_object actually is
>>>>>>> an external object.
>>>>>>> + */
>>>>>>> +void
>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>> +{
>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>> +
>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to
>>>>>>> / from a
>>>>>>> + * &drm_gpuvms evicted list
>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>> + *
>>>>>>> + * Adds a &drm_gem_object to or removes it from all
>>>>>>> &drm_gpuvms evicted
>>>>>>> + * lists containing a mapping of this &drm_gem_object.
>>>>>>> + */
>>>>>>> +void
>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>> +{
>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>> +
>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>> +               if (evict)
>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, evict);
>>>>>>> +               else
>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, evict);
>>>>>>> +       }
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>> +
>>>>>>>     static int
>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>                     struct drm_gpuva *va)
>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>      */
>>>>>>>     #include <linux/list.h>
>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>     #include <linux/rbtree.h>
>>>>>>>     #include <linux/types.h>
>>>>>>>     #include <drm/drm_gem.h>
>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>     struct drm_gpuvm;
>>>>>>>     struct drm_gpuvm_bo;
>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>           * space
>>>>>>>           */
>>>>>>>          struct dma_resv *resv;
>>>>>>> +
>>>>>>> +       /**
>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>> +        */
>>>>>>> +       struct {
>>>>>>> +               /**
>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>> serving as
>>>>>>> +                * external object
>>>>>>> +                */
>>>>>>> +               struct list_head list;
>>>>>>> +
>>>>>>> +               /**
>>>>>>> +                * @lock: spinlock to protect the extobj list
>>>>>>> +                */
>>>>>>> +               spinlock_t lock;
>>>>>>> +       } extobj;
>>>>>>> +
>>>>>>> +       /**
>>>>>>> +        * @evict: structure holding the evict list and evict
>>>>>>> list lock
>>>>>>> +        */
>>>>>>> +       struct {
>>>>>>> +               /**
>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>> currently being
>>>>>>> +                * evicted
>>>>>>> +                */
>>>>>>> +               struct list_head list;
>>>>>>> +
>>>>>>> +               /**
>>>>>>> +                * @lock: spinlock to protect the evict list
>>>>>>> +                */
>>>>>>> +               spinlock_t lock;
>>>>>>> +       } evict;
>>>>>>>     };
>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>> drm_device *drm,
>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>> &drm_gem_object is an
>>>>>>> + * external object
>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>> + *
>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv differs
>>>>>>> from the
>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>> + */
>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
>>>>>>> *gpuvm,
>>>>>>> +                                      struct drm_gem_object
>>>>>>> *obj)
>>>>>>> +{
>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>> +}
>>>>>>> +
>>>>>>>     static inline struct drm_gpuva *
>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>     {
>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__)
>>>>>>> \
>>>>>>>          list_for_each_entry_safe(va__, next__, &(gpuvm__)-
>>>>>>>> rb.list, rb.entry)
>>>>>>> +/**
>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>> &drm_exec
>>>>>>> + *
>>>>>>> + * This structure should be created on the stack as
>>>>>>> &drm_exec should be.
>>>>>>> + *
>>>>>>> + * Optionally, @extra can be set in order to lock additional
>>>>>>> &drm_gem_objects.
>>>>>>> + */
>>>>>>> +struct drm_gpuvm_exec {
>>>>>>> +       /**
>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>> +        */
>>>>>>> +       struct drm_exec exec;
>>>>>>> +
>>>>>>> +       /**
>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>> +        */
>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>> +
>>>>>>> +       /**
>>>>>>> +        * @extra: Callback and corresponding private data
>>>>>>> for the driver to
>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>> +        */
>>>>>>> +       struct {
>>>>>>> +               /**
>>>>>>> +                * @fn: The driver callback to lock
>>>>>>> additional &drm_gem_objects.
>>>>>>> +                */
>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>> +                         unsigned int num_fences);
>>>>>>> +
>>>>>>> +               /**
>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>> callback
>>>>>>> +                */
>>>>>>> +               void *priv;
>>>>>>> +       } extra;
>>>>>>> +};
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
>>>>>>> resv
>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>> + * @exec: the &drm_exec context
>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>> + *
>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>> &drm_gem_object.
>>>>>>> + *
>>>>>>> + * Using this function directly, it is the drivers
>>>>>>> responsibility to call
>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>> + *
>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>> + */
>>>>>>> +static inline int
>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>> +                    struct drm_exec *exec,
>>>>>>> +                    unsigned int num_fences)
>>>>>>> +{
>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>> num_fences);
>>>>>>> +}
>>>>>>> +
>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>> +                             struct drm_exec *exec,
>>>>>>> +                             unsigned int num_fences);
>>>>>>> +
>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>> +                           struct drm_exec *exec,
>>>>>>> +                           u64 addr, u64 range,
>>>>>>> +                           unsigned int num_fences);
>>>>>>> +
>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>> +                       unsigned int num_fences,
>>>>>>> +                       bool interruptible);
>>>>>>> +
>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>> *vm_exec,
>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>> +                             unsigned int num_objs,
>>>>>>> +                             unsigned int num_fences,
>>>>>>> +                             bool interruptible);
>>>>>>> +
>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>> *vm_exec,
>>>>>>> +                             u64 addr, u64 range,
>>>>>>> +                             unsigned int num_fences,
>>>>>>> +                             bool interruptible);
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all
>>>>>>> associated BOs
>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>> + *
>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects
>>>>>>> previously acquired
>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>> + */
>>>>>>> +static inline void
>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>> +{
>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>> +}
>>>>>>> +
>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>> +                             struct drm_exec *exec,
>>>>>>> +                             struct dma_fence *fence,
>>>>>>> +                             enum dma_resv_usage
>>>>>>> private_usage,
>>>>>>> +                             enum dma_resv_usage
>>>>>>> extobj_usage);
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_gpuvm_exec_resv_add_fence()
>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>> + * @fence: fence to add
>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>> + *
>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>> + */
>>>>>>> +static inline void
>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>> *vm_exec,
>>>>>>> +                             struct dma_fence *fence,
>>>>>>> +                             enum dma_resv_usage
>>>>>>> private_usage,
>>>>>>> +                             enum dma_resv_usage
>>>>>>> extobj_usage)
>>>>>>> +{
>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec,
>>>>>>> fence,
>>>>>>> +                                private_usage,
>>>>>>> extobj_usage);
>>>>>>> +}
>>>>>>> +
>>>>>>>     /**
>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>> &drm_gpuvm and
>>>>>>>      * &drm_gem_object combination
>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>                           * gpuva list.
>>>>>>>                           */
>>>>>>>                          struct list_head gem;
>>>>>>> +
>>>>>>> +                       /**
>>>>>>> +                        * @extobj: List entry to attach to
>>>>>>> the &drm_gpuvms
>>>>>>> +                        * extobj list.
>>>>>>> +                        */
>>>>>>> +                       struct list_head extobj;
>>>>>>> +
>>>>>>> +                       /**
>>>>>>> +                        * @evict: List entry to attach to
>>>>>>> the &drm_gpuvms evict
>>>>>>> +                        * list.
>>>>>>> +                        */
>>>>>>> +                       struct list_head evict;
>>>>>>>                  } entry;
>>>>>>>          } list;
>>>>>>>     };
>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>                    struct drm_gem_object *obj);
>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
>>>>>>> evict);
>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>> +
>>>>>>>     /**
>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a
>>>>>>> list of &drm_gpuva
>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>> iteration step
>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>           * used.
>>>>>>>           */
>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void
>>>>>>> *priv);
>>>>>>> +
>>>>>>> +       /**
>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>> +        *
>>>>>>> +        * Drivers receive this callback for every evicted
>>>>>>> &drm_gem_object being
>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>> +        *
>>>>>>> +        * Typically, drivers would call their driver
>>>>>>> specific variant of
>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>> +        */
>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>     };
>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 14:01                 ` Boris Brezillon
@ 2023-09-13 14:29                   ` Thomas Hellström
  2023-09-13 15:17                     ` Boris Brezillon
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-13 14:29 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Dave Airlie, Danilo Krummrich, daniel, matthew.brost,
	sarah.walker, donald.robson, christian.koenig, faith.ekstrand,
	dri-devel, nouveau, linux-kernel


On 9/13/23 16:01, Boris Brezillon wrote:
> On Wed, 13 Sep 2023 15:22:56 +0200
> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>
>> On 9/13/23 13:33, Boris Brezillon wrote:
>>> On Wed, 13 Sep 2023 12:39:01 +0200
>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>>>   
>>>> Hi,
>>>>
>>>> On 9/13/23 09:19, Boris Brezillon wrote:
>>>>> On Wed, 13 Sep 2023 17:05:42 +1000
>>>>> Dave Airlie <airlied@gmail.com> wrote:
>>>>>      
>>>>>> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
>>>>>> <boris.brezillon@collabora.com> wrote:
>>>>>>> On Tue, 12 Sep 2023 18:20:32 +0200
>>>>>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>>>>>>>         
>>>>>>>>> +/**
>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
>>>>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>> + *
>>>>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
>>>>>>>>> + * iterator releases the lock immediately after picking the first element from
>>>>>>>>> + * the list, so list insertion and deletion can happen concurrently.
>>>>>>>> Are the list spinlocks needed for that async state update from within
>>>>>>>> the dma-fence critical section we've discussed previously?
>>>>>>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
>>>>>>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
>>>>>>> get that Xe and Nouveau don't need that because they update the VM
>>>>>>> state early (in the ioctl path), but I keep thinking this will hurt us
>>>>>>> if we don't think it through from the beginning, because once you've
>>>>>>> set this logic to depend only on resv locks, it will be pretty hard to
>>>>>>> get back to a solution which lets synchronous VM_BINDs take precedence
>>>>>>> on asynchronous request, and, with vkQueueBindSparse() passing external
>>>>>>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
>>>>>>> take a long time to get your synchronous VM_BIND executed...
>>>> So this would boil down to either (possibly opt-in) keeping the spinlock
>>>> approach or pushing the unlink out to a wq then?
>>> Deferred _unlink() would not be an issue, since I already defer the
>>> drm_gpuva destruction to a wq, it would just be a matter of moving the
>>> _unlink() call there as well. But _link() also takes the GEM gpuva list
>>> lock, and that one is a bit tricky, in that sm_map() can trigger 2 more
>>> _link() calls for the prev/next mappings, which we can't guess until we
>>> get to execute the VM update. If we mandate the use of the GEM resv
>>> lock, that simply means async VM updates (AKA calling
>>> drm_gpuvm_sm_[un]map()) are not an option. And if this is what everyone
>>> agrees on, then I'd like the APIs that make this sort of async VM
>>> update possible (drm_gpuvm_sm_[un]map(), the drm_gpuvm_ops::sm_step*
>>> methods, and probably other things) to be dropped, so we don't make it
>>> look like it's something we support.
>>>   
>>>> BTW, as also asked in a reply to Danilo, how do you call unlink from
>>>> run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?
>>> _unlink() makes sure the GEM gpuva list lock is taken, but this can be
>>> a custom lock (see drm_gem_gpuva_set_lock()). In panthor we have
>>> panthor_gem_object::gpuva_list_lock that's dedicated to the gpuva list
>>> protection. We make sure we never take this lock while allocating
>>> memory to guarantee the dma-signalling path can't deadlock.
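
For reference, a sketch of that setup (names illustrative, not the actual
panthor code): a driver-private lock dedicated to the GEM's gpuva list,
registered via drm_gem_gpuva_set_lock() so the lockdep assertions in
drm_gpuva_[un]link() check the right lock, and never taken while allocating
memory so it stays safe to take from the dma-fence signalling path:

	/* at GEM object creation time */
	mutex_init(&bo->gpuva_list_lock);
	drm_gem_gpuva_set_lock(&bo->base, &bo->gpuva_list_lock);
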
>>>   
>>>>>>>         
>>>>>> btw what is the use case for this? do we have actual vulkan
>>>>>> applications we know will have problems here?
>>>>> I don't, but I think that's a concern Faith raised at some point (dates
>>>>> back from when I was reading threads describing how VM_BIND on i915
>>>>> should work, and I was clearly discovering this whole VM_BIND thing at
>>>>> that time, so maybe I misunderstood).
>>>>>      
>>>>>> it feels like a bit of premature optimisation, but maybe we have use cases.
>>>>> Might be, but that's the sort of thing that would put us in a corner if
>>>>> we don't have a plan for when the needs arise. Besides, if we don't
>>>>> want to support that case because it's too complicated, I'd recommend
>>>>> dropping all the drm_gpuvm APIs that let people think this mode is
>>>>> valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
>>>>> drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
>>>>> confusion.
>>>> Xe allows bypassing the bind-queue with another bind-queue, but to
>>>> completely avoid dependencies between queues the Operations may not
>>>> overlap.
>>> So, you check the VM state with some VM lock held (would be the VM resv
>>> in my case), and if the mapping is new (no overlaps with pre-existing
>>> mappings), you queue it to the fast-track/sync-VM_BIND queue. What would
>>> be missing I guess is a way to know if the mapping is active (MMU has
>>> been updated) or pending (MMU update queued to the bind-queue), so I can
>>> fast-track mapping/unmapping of active mappings. This would leave
>>> overlapping sync/async VM updates, which can't happen in practice
>>> unless userspace is doing something wrong (sparse bindings always go
>>> through vkQueueBindSparse).
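
A rough sketch of that fast-track decision (illustrative only;
do_sync_vm_bind() and queue_to_bind_queue() are hypothetical driver helpers,
and the "pending on the bind queue" state mentioned above would still need
extra tracking beyond the VA tree):

	struct drm_gpuva *va;
	int ret;

	/* The VM lock (the VM's resv in this proposal) serializes the
	 * overlap check and the subsequent update.
	 */
	dma_resv_lock(gpuvm->resv, NULL);

	va = drm_gpuva_find_first(gpuvm, addr, range);
	if (!va)
		ret = do_sync_vm_bind(gpuvm, op);	/* no overlap: fast-track */
	else
		ret = queue_to_bind_queue(gpuvm, op);	/* overlap: async path */

	dma_resv_unlock(gpuvm->resv);
	return ret;
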
>> User-space is allowed to create new bind queues at will, and they
>> execute independently save for range overlaps.
> I've limited panthor to just one bind-queue that's automatically
> created when the VM is created. I guess letting userspace create more
> than one queue is doable, but we'd still be serializing VM
> operations anyway and that complicates the whole thing when concurrent
> operations to the same VM region happen from different bind queues, so I
> figured it'd be simpler to expose just one queue.
>
>> And the overlapping granularity depends very much on the detail of the
>> range tracking.
>> We drafted this fenced range utility
>>
>> https://gitlab.freedesktop.org/drm/xe/kernel/-/merge_requests/353
>>
>> That tracks active ranges that remove themselves when the attached fence
>> signals. Not sure if we ended up using it, though. A new binding would
>> scan this utility for dma-fences it needs to depend upon.
> Sounds like implicit deps on VM ranges :D. I'll have a look, thanks
> for the pointer!
>
>> Ranges in Xe
>> are actually page-table modification ranges, so can exceed the actual VA
>> range in some situations, but if you can build page-table structures
>> async the granularity indeed becomes better.
> The granularity in Mali is 4k, and we don't build the page table struct
> asynchronously, we just update the page table tree from the CPU,
> holding a VM lock to serialize such operations (that's done
> synchronously in the ::run_job() path, or from the ioctl in case of a
> sync-VM_BIND).

OK, yeah we have something similar although we build the page-table tree 
in the IOCTL and update entries using the GPU unless there are no 
dependencies, in which case we do it sync in the ioctl as well.

The drawback here is that if one op adds a pagetable tree node near the 
root (spanning say 1G) and the next op adds an entry to that node, the 
granularity can become pretty large...

/Thomas


>
>> /Thomas
>>
>>
>>
>>> I'll give it a try.
>>>   
>>>> (And the definition of overlap is currently page-table
>>>> structure updates may not overlap) but no guarantees are made about
>>>> priority.
>>>>
>>>> /Thomas
>>>>
>>>>
>>>>   

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 14:26               ` Christian König
@ 2023-09-13 15:13                 ` Thomas Hellström
  2023-09-13 15:26                   ` Christian König
  2023-09-13 15:15                 ` Danilo Krummrich
  1 sibling, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-13 15:13 UTC (permalink / raw)
  To: Christian König, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

Hi Christian

On 9/13/23 16:26, Christian König wrote:
> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>> As mentioned in a different mail thread, the reply is based on the 
>> assumption
>> that we don't support anything else than GPUVM updates from the IOCTL.
>
> I think that this assumption is incorrect.
>
> Vulkan is just one specific use case, but this here should probably 
> be able to handle other use cases as well.
>
> Especially with HMM you get the requirement that you need to be able 
> to invalidate GPUVM mappings without grabbing a reservation lock.

Are you referring to the MMU range invalidation notifiers here?

>
> See what the eviction lock in amdgpu is doing for example.

IMO the statement regarding GPUVM updates from the IOCTL mostly refers 
to the need to protect the evicted- and extobj lists with additional 
spinlocks. Supporting userptr and faulting will ofc require additional 
locks / locking mechanisms. But this code doesn't do that yet. Is your 
concern that these particular spinlocks for these lists are indeed needed?

/Thomas


>
> Regards,
> Christian.
>
>>
>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström wrote:
>>> Hi!
>>>
>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström wrote:
>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
>>>>>>> Hi, Danilo,
>>>>>>>
>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>> So far the DRM GPUVA manager offers common infrastructure to
>>>>>>>> track GPU VA
>>>>>>>> allocations and mappings, generically connect GPU VA mappings
>>>>>>>> to their
>>>>>>>> backing buffers and perform more complex mapping operations
>>>>>>>> on the GPU VA
>>>>>>>> space.
>>>>>>>>
>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>> drivers, which
>>>>>>>> can potentially be generalized in order to make the DRM GPUVA
>>>>>>>> manager
>>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>>> this patch aims
>>>>>>>> at generalizing the following elements.
>>>>>>>>
>>>>>>>> 1) Provide a common dma-resv for GEM objects not being used
>>>>>>>> outside of
>>>>>>>>       this GPU-VM.
>>>>>>>>
>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>> which are
>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>
>>>>>>>> 3) Provide functions to efficiently lock all GEM objects dma-
>>>>>>>> resv the
>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>
>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>> contains mappings
>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>> accelerated.
>>>>>>>>
>>>>>>>> 5) Provide some convinience functions for common patterns.
>>>>>>>>
>>>>>>>> Rather than being designed as a "framework", the target is to
>>>>>>>> make all
>>>>>>>> features appear as a collection of optional helper functions,
>>>>>>>> such that
>>>>>>>> drivers are free to make use of the DRM GPUVA managers basic
>>>>>>>> functionality and opt-in for other features without setting
>>>>>>>> any feature
>>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>>
>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>> locking for drivers
>>>>>>>> updating the GPU VA space within the fence signalling path.
>>>>>>>>
>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>> ---
>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an existing
>>>>>>>> instance of this
>>>>>>>>      * particular combination. If not existent a new instance
>>>>>>>> is created and linked
>>>>>>>>      * to the &drm_gem_object.
>>>>>>>> + *
>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>> &drm_gpuvm, are also used
>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and
>>>>>>>> evicted objects. Those
>>>>>>>> + * list are maintained in order to accelerate locking of
>>>>>>>> dma-resv locks and
>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For
>>>>>>>> instance the all
>>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be
>>>>>>>> locked by calling
>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>> drm_gpuvm_validate() in
>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>>> also possible to lock
>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>> corresponding parameters to
>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec
>>>>>>>> loop while making
>>>>>>>> + * use of helper functions such as drm_gpuvm_prepare_range()
>>>>>>>> or
>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>> + *
>>>>>>>> + * Every bound &drm_gem_object is treated as external object
>>>>>>>> when its &dma_resv
>>>>>>>> + * structure is different than the &drm_gpuvm's common
>>>>>>>> &dma_resv structure.
>>>>>>>>      */
>>>>>>>>     /**
>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for the same
>>>>>>>> &drm_gpuvm and
>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>> creations and destructions
>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances unique.
>>>>>>>> + *
>>>>>>>> + * The &drm_gpuvm's lists for keeping track of external and
>>>>>>>> evicted objects are
>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>> iteration internally.
>>>>>>>> + *
>>>>>>>> + * However, drivers still need ensure to protect concurrent
>>>>>>>> calls to functions
>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function contains
>>>>>>>> a particular
>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>> + *
>>>>>>>> + * Functions adding or removing entries from those lists,
>>>>>>>> such as
>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be
>>>>>>>> called with external
>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>> corresponding list to be
>>>>>>>> + * (safely) modified while potentially being iternated by
>>>>>>>> other API functions.
>>>>>>>> + * However, this is entirely optional.
>>>>>>>>      */
>>>>>>>>     /**
>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>      *   }
>>>>>>>>      */
>>>>>>>> +/**
>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>> already iterated items
>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>> drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>> + *
>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>> Lockless as in, the
>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>> first element from
>>>>>>>> + * the list, so list insertion deletion can happen
>>>>>>>> concurrently.
>>>>>>> Are the list spinlocks needed for that async state update from
>>>>>>> within the
>>>>>>> dma-fence critical section we've discussed previously?
>>>>>> Yes, but also for other reasons, see below.
>>>>>>
>>>>>>> Otherwise it should be sufficient to protect the lists with the
>>>>>>> gpuvm's resv
>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>
>>>>>>> If those spinlocks are still needed in some situations, perhaps
>>>>>>> could we
>>>>>>> have an option to set them to NULL (Like IIRC the maple tree
>>>>>>> allows for)?
>>>>>> The evict spinlock is needed in any case, since in
>>>>>> drm_gpuvm_bo_evict() we're
>>>>>> holding only the dma-resv lock from the BO this function gets
>>>>>> called for. Hence,
>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() calls with
>>>>>> different BOs.
>>>>> No. Only if you try to add external objects to the vm's evict list
>>>>> from
>>>>> within the evict code. That's not necessary since you loop through
>>>>> all
>>>>> external objects anyway when locking them so an "evicted" bool in
>>>>> the vm_bo,
>>>>> protected by the bo resv would be sufficient. The extobj locking
>>>>> loop can
>>>>> then add the bo to the evicted list.
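
Something like this, I guess (hypothetical sketch; the 'evicted' flag does
not exist in the posted patch):

/* Eviction path, called with the BO's resv held: */
static void drm_gpuvm_bo_mark_evicted(struct drm_gpuvm_bo *vm_bo, bool evict)
{
	dma_resv_assert_held(vm_bo->obj->resv);
	vm_bo->evicted = evict;
}

/* Extobj locking loop, with all resv locks held: */
static void drm_gpuvm_collect_evicted(struct drm_gpuvm *gpuvm)
{
	struct drm_gpuvm_bo *vm_bo;

	list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj)
		if (vm_bo->evicted)
			list_move_tail(&vm_bo->list.entry.evict,
				       &gpuvm->evict.list);
}
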
>>>> And validate() can remove it while still holding all dma-resv locks,
>>>> neat!
>>>> However, what if two tasks are trying to lock the VA space
>>>> concurrently? What
>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>> drm_gpuva_unlink()?
>>>> Are we guaranteed that at this point of time the drm_gpuvm_bo is not
>>>> on the
>>>> evicted list? Because otherwise we would call drm_gpuvm_bo_destroy()
>>>> with the
>>>> dma-resv lock held, which wouldn't be allowed, since
>>>> drm_gpuvm_bo_destroy()
>>>> might drop the last reference to the drm_gem_object and hence we'd
>>>> potentially
>>>> free the dma-resv lock while holding it, at least if it's an external
>>>> object.
>>> Easiest way in this scheme is to think of the lists as being protected
>>> by the vm's resv lock. That means anybody calling unlink() must also
>>> hold the vm's resv lock. (Which is OK from an UAF point of view, but
>>> perhaps not from a locking inversion POV from an async list update).
>> This would mean that on unlink() we'd need to hold the VM's resv lock 
>> and the
>> corresponding GEM's resv lock (in case they're not the same anyways) 
>> because the
>> VM's resv lock would protect the external / evicted object lists and 
>> the GEM
>> objects resv lock protects the GEM's list of drm_gpuvm_bos and the
>> drm_gpuvm_bo's list of drm_gpuvas.
>>
>>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>>> really would not
>>>>>> like to add even more complexity just to get the spinlock out of
>>>>>> the way in case
>>>>>> the driver already has an outer lock protecting this path.
>>>>> I must disagree here. These spinlocks and atomic operations are
>>>>> pretty
>>>>> costly and as discussed earlier this type of locking was the reason
>>>>> (at
>>>>> least according to the commit message) that made Christian drop the
>>>>> XArray
>>>>> use in drm_exec for the same set of objects: "The locking overhead
>>>>> is
>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>> complexity and a
>>>>> single wide lock following the drm locking guidelines set out by
>>>>> Daniel and
>>>>> David should really be the default choice with an opt-in for a
>>>>> spinlock if
>>>>> needed for async and pushing out to a wq is not an option.
>>>> For the external object list an outer lock would work as long as it's
>>>> not the
>>>> dma-resv lock of the corresponding GEM object, since here we actually
>>>> need to
>>>> remove the list entry from the external object list on
>>>> drm_gpuvm_bo_destroy().
>>>> It's just a bit weird design wise that drivers would need to take
>>>> this outer
>>>> lock on:
>>>>
>>>> - drm_gpuvm_bo_extobj_add()
>>>> - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
>>>> - drm_gpuva_unlink()            (because it needs to call
>>>> drm_gpuvm_bo_put())
>>>> - drm_gpuvm_exec_lock()
>>>> - drm_gpuvm_exec_lock_array()
>>>> - drm_gpuvm_prepare_range()
>>>>
>>>> Given that it seems reasonable to do all the required locking
>>>> internally.
>>>  From a design POV, there has been a clear direction in XE to make
>>> things similar to mmap() / munmap(), so this outer lock, which in Xe is
>>> an rwsem, is used in a similar way as the mmap_lock. It's protecting
>>> the page-table structures and vma rb tree, the userptr structures and
>>> the extobj list. Basically it's taken early in the exec IOCTL, the
>>> VM_BIND ioctl, the compute rebind worker and the pagefault handler, so
>>> all of the above are just asserting that it is taken in the correct
>>> mode.
>>>
>>> But strictly with this scheme one could also use the vm's dma_resv for
>>> the extobj list since with drm_exec, it's locked before traversing the
>>> list.
>>>
>>> The whole point of this scheme is to rely on locks that you already are
>>> supposed to be holding for various reasons and is simple to comprehend.
>> I don't agree that we're supposed to hold the VM's resv lock anyways for
>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm fine 
>> using it
>> for that purpose nevertheless.
>>
>>>> In order to at least place lockdep checks, the driver would need to
>>>> supply the
>>>> corresponding lock's lockdep_map, because the GPUVM otherwise doesn't
>>>> know about
>>>> the lock.
>>> Yes, that sounds reasonable. One lockdep map per list.
>> I'd really like to avoid that, especially now that everything got 
>> simpler. We
>> should define the actual locks to take instead.
>>
>>>> Out of curiosity, what is the overhead of a spin_lock() that doesn't
>>>> need to
>>>> spin?
>>> I guess it's hard to tell exactly, but it is much lower on modern x86
>>> than what it used to be. Not sure about ARM, which is the other
>>> architecture important to us. I figure if there is little cache-line
>>> bouncing the main overhead comes from the implied barriers.
>>>
>>>>> A pretty simple way that would not add much code would be
>>>>>
>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>> spinlock_t
>>>>> *lock)
>>>>>
>>>>> {
>>>>>
>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>          spin_lock(lock);
>>>>>
>>>>> }
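
And the matching unlock, plus how it would be used in, say,
drm_gpuvm_bo_list_add() (sketch only; 'resv_protected_lists' is from the
snippet above, not the posted patch):

static void gpuvm_cond_spin_unlock(const struct drm_gpuvm *gpuvm,
				   spinlock_t *lock)
{
	if (!gpuvm->resv_protected_lists)
		spin_unlock(lock);
}

	/* e.g. in drm_gpuvm_bo_list_add(): */
	gpuvm_cond_spin_lock(vm_bo->vm, &vm_bo->vm->extobj.lock);
	if (list_empty(&vm_bo->list.entry.extobj))
		list_add_tail(&vm_bo->list.entry.extobj,
			      &vm_bo->vm->extobj.list);
	gpuvm_cond_spin_unlock(vm_bo->vm, &vm_bo->vm->extobj.lock);

In the resv-protected case the spinlock is simply skipped and the list is
protected by the vm's resv, which the caller is then expected to hold
(ideally backed by a lockdep assert).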
>>>>>
>>>>>>> For such drivers, that would require anybody calling unlink to
>>>>>>> hold the vm's
>>>>>>> resv, though.
>>>>>> In V4 I want to go back to having a dedicated lock for the GEMs
>>>>>> gpuva list (or
>>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>>> lock for that
>>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>>> otherwise wouldn't
>>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>>> was referring to
>>>>>> earlier.
>>>>> Yeah, I can see the need for a dedicated lock for the GEM's gpuva
>>>>> list, but
>>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>>> problem. We
>>>>> may free the object and a pointer to the vm's resv during unlink
>>>>> but we
>>>>> don't free the vm's resv.  It'd be a matter of ensuring that any
>>>>> calls to
>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>> Drivers calling unlink() from the fence signaling path can't use the
>>>> VM's
>>>> dma-resv lock.
>>> Yes, that made me a bit curious because in the current version the code
>>> required the object's dma_resv for unlink() which can't be grabbed
>>> either from the fence signaling path. So are there any drivers actually
>>> wanting to do that? If so, they will either need to resort to the
>>> current spinlock solution or they will need to call unlink from a
>>> workqueue item.
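
For completeness, the workqueue variant could look roughly like this
(hypothetical sketch, 'struct my_vma' is made up; the worker runs outside the
fence signalling path, so it may take the resv lock, and it holds its own GEM
reference so dropping the last vm_bo reference can't free the lock it is
sitting on):

struct my_vma {
	struct drm_gpuva va;
	struct work_struct unlink_work;
};

static void my_vma_unlink_worker(struct work_struct *work)
{
	struct my_vma *vma = container_of(work, struct my_vma, unlink_work);
	struct drm_gem_object *obj = vma->va.gem.obj;

	drm_gem_object_get(obj);
	dma_resv_lock(obj->resv, NULL);
	drm_gpuva_unlink(&vma->va);
	dma_resv_unlock(obj->resv);
	drm_gem_object_put(obj);

	kfree(vma);
}
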
>> As Boris already mentioned we have the dma-resv lock by default or a 
>> driver
>> specific GEM gpuva lock as opt-in. Now, we can get rid of the latter.
>>
>>>> Also, what if the object is an external object? We can't use the VM's
>>>> dma-resv
>>>> lock here.
>>> Why? Typically (sync) unlink is only ever called from an unbind-like
>>> operation where it should be trivial to grab the vm's resv. Or, for
>>> that matter any outer lock protecting the extobj list. Rule would be
>>> the drm_gpuvm_bo::entry::extobj  and drm_gpuvm_bo::entry::evict would
>>> be protected by either the vm's dma_resv (or possibly an outer lock in
>>> the case of the extobj list).
>> Outer lock wouldn't have been working for updates in the async path, but
>> shouldn't be relevant anymore. We could use the VM's resv for that.
>>
>>>>   And we can't have the GEM objs dma-resv lock held when calling
>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>>> refcount drops
>>>> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might
>>>> drop the
>>>> last reference of the GEM object.
>>> Yes, but this is a different problem as to what exactly protects
>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per bo list
>>> lock, or if we want to keep the bo's dma_resv we need to ensure that
>>> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
>>> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount (I know
>>> Boris didn't like that, but requiring an explicit refcount for a
>>> pointer you dereference unless you're under a lock that ensures keeping
>>> the object alive is pretty much required?) But anyway for the
>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal spinlock)
>>> I don't have a strong preference.
>> We can keep the GEM objects dma-resv lock, however as mentioned above
>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both the VM's 
>> resv lock
>> and the GEM's resv lock in case they differ.
>>
>>>>   All those problems go away with a dedicated
>>>> GEM gpuva list lock.
>>> I don't think these are real problems.
>>> With the exception of the eviction list "trick" where we currently have
>>> slightly different approach to collect external bos needing rebinding,
>>> we have this working fine.
>>>
>>> TBH I think pretty much the only situation where the spinlock is needed
>>> is for async updates of these lists, unless a wq item can be used for
>>> that, but it doesn't really seem like the current code allows for such
>>> updates anyway? It complicates the code a lot, adds overhead and also
>>> adds the requirement for refcounting during list traversal.
>>>
>>> /Thomas
>>>
>>>>> /Thomas
>>>>>
>>>>>
>>>>>>> It seems that with that also the refcount could be make non-
>>>>>>> atomic.
>>>>>>>
>>>>>>> All in the spirit of the drm locking guidelines "use big locks
>>>>>>> when
>>>>>>> possible".
>>>>>>> Lower level locks only when necessary for performance or
>>>>>>> locking inversion?
>>>>>>>
>>>>>>> /Thomas
>>>>>>>
>>>>>>>
>>>>>>>> + *
>>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>>> local list, so removal
>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>> iterating the list.
>>>>>>>> + */
>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)     \
>>>>>>>> +       ({                                                                              \
>>>>>>>> +               struct drm_gpuvm_bo *__vm_bo;                                           \
>>>>>>>> +                                                                                       \
>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);                                         \
>>>>>>>> +                                                                                       \
>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                                \
>>>>>>>> +               while (!list_empty(&(__gpuvm)->__list_name.list)) {                     \
>>>>>>>> +                       __vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,        \
>>>>>>>> +                                                  struct drm_gpuvm_bo,                 \
>>>>>>>> +                                                  list.entry.__list_name);             \
>>>>>>>> +                       if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {                    \
>>>>>>>> +                               list_move_tail(&(__vm_bo)->list.entry.__list_name,      \
>>>>>>>> +                                              __local_list);                           \
>>>>>>>> +                               break;                                                  \
>>>>>>>> +                       } else {                                                        \
>>>>>>>> +                               list_del_init(&(__vm_bo)->list.entry.__list_name);      \
>>>>>>>> +                               __vm_bo = NULL;                                         \
>>>>>>>> +                       }                                                               \
>>>>>>>> +               }                                                                       \
>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                              \
>>>>>>>> +                                                                                       \
>>>>>>>> +               __vm_bo;                                                                \
>>>>>>>> +       })
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>>>>>>> + *
>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>> Lockless as in, the
>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>> first element from the
>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>> concurrently.
>>>>>>>> + *
>>>>>>>> + * Typical use:
>>>>>>>> + *
>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>> + *
>>>>>>>> + *     ret = 0;
>>>>>>>> + *     drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>,
>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>> + *             ret = do_something_with_vm_bo(..., vm_bo);
>>>>>>>> + *             if (ret)
>>>>>>>> + *                     break;
>>>>>>>> + *     }
>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>> + *     drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>,
>>>>>>>> &my_local_list);
>>>>>>>> + *
>>>>>>>> + *
>>>>>>>> + * Only used for internal list iterations, not meant to be
>>>>>>>> exposed to the outside
>>>>>>>> + * world.
>>>>>>>> + */
>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)    \
>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>> +                                               __local_list, NULL);            \
>>>>>>>> +            __vm_bo;                                                            \
>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>> +                                               __local_list, __vm_bo))         \
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to their
>>>>>>>> original list
>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>> already iterated items
>>>>>>>> + *
>>>>>>>> + * When we're done iterating a vm_bo list, we should call
>>>>>>>> restore_vm_bo_list()
>>>>>>>> + * to restore the original state and let new iterations take
>>>>>>>> place.
>>>>>>>> + */
>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)                 \
>>>>>>>> +       do {                                                                    \
>>>>>>>> +               /* Merge back the two lists, moving local list elements to the  \
>>>>>>>> +                * head to preserve previous ordering, in case it matters.      \
>>>>>>>> +                */                                                             \
>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
>>>>>>>> +               list_splice(__local_list, &(__gpuvm)->__list_name.list);        \
>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
>>>>>>>> +       } while (0)
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given
>>>>>>>> list
>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>> + *
>>>>>>>> + * Inserts the given @__vm_bo into the list specified by
>>>>>>>> @__list_name and
>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>> + */
>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                            \
>>>>>>>> +       do {                                                                    \
>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>> +               if (list_empty(&(__vm_bo)->list.entry.__list_name))             \
>>>>>>>> +                       list_add_tail(&(__vm_bo)->list.entry.__list_name,       \
>>>>>>>> +                                     &(__vm_bo)->vm->__list_name.list);        \
>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>> +       } while (0)
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given
>>>>>>>> list
>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>> + *
>>>>>>>> + * Removes the given @__vm_bo from the list specified by
>>>>>>>> @__list_name and
>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>> + */
>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)                            \
>>>>>>>> +       do {                                                                    \
>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>> +               if (!list_empty(&(__vm_bo)->list.entry.__list_name))            \
>>>>>>>> +                       list_del_init(&(__vm_bo)->list.entry.__list_name);      \
>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>> +       } while (0)
>>>>>>>> +
>>>>>>>> +static int __must_check
>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>>> +
>>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
>>>>>>>> struct drm_device *drm,
>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>> +
>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>> +
>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
>>>>>>>> *gpuvm)
>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>               "GPUVA tree is not empty, potentially leaking
>>>>>>>> memory.\n");
>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
>>>>>>>> should be empty.\n");
>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
>>>>>>>> should be empty.\n");
>>>>>>>> +
>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>     }
>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all assoiciated BOs
>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>> + *
>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the
>>>>>>>> given
>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>> + *
>>>>>>>> + * Using this function directly, it is the drivers
>>>>>>>> responsibility to call
>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>> + *
>>>>>>>> + * Note: This function is safe against concurrent insertion
>>>>>>>> and removal of
>>>>>>>> + * external objects, however it is not safe against
>>>>>>>> concurrent usage itself.
>>>>>>>> + *
>>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>>> either an outer VM lock
>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this function
>>>>>>>> within the
>>>>>>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's
>>>>>>>> dma-resv lock ensures
>>>>>>>> + * mutual exclusion.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +int
>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>> +                         unsigned int num_fences)
>>>>>>>> +{
>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>> +       int ret = 0;
>>>>>>>> +
>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>> vm_bo) {
>>>>>>>> +               ret = drm_exec_prepare_obj(exec, vm_bo->obj,
>>>>>>>> num_fences);
>>>>>>>> +               if (ret)
>>>>>>>> +                       break;
>>>>>>>> +       }
>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>> +
>>>>>>>> +       return ret;
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped within
>>>>>>>> a given range
>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>> + *
>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>>> mapped between @addr
>>>>>>>> + * and @addr + @range.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +int
>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>> drm_exec *exec,
>>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>>> num_fences)
>>>>>>>> +{
>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>> +       u64 end = addr + range;
>>>>>>>> +       int ret;
>>>>>>>> +
>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
>>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>>> +
>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>> num_fences);
>>>>>>>> +               if (ret)
>>>>>>>> +                       return ret;
>>>>>>>> +       }
>>>>>>>> +
>>>>>>>> +       return 0;
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>> assoiciated BOs
>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>> + *
>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>> given
>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>> + *
>>>>>>>> + * Addionally, when calling this function with struct
>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>> + * being set the driver receives the given @fn callback to
>>>>>>>> lock additional
>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec instance.
>>>>>>>> Typically, drivers
>>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>>> callback.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +int
>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>> +                   unsigned int num_fences,
>>>>>>>> +                   bool interruptible)
>>>>>>>> +{
>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>> +       uint32_t flags;
>>>>>>>> +       int ret;
>>>>>>>> +
>>>>>>>> +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>> +
>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>> +
>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>> num_fences);
>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>> +               if (ret)
>>>>>>>> +                       goto err;
>>>>>>>> +
>>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>> num_fences);
>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>> +               if (ret)
>>>>>>>> +                       goto err;
>>>>>>>> +
>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>> num_fences);
>>>>>>>> +                       drm_exec_retry_on_contention(exec);
>>>>>>>> +                       if (ret)
>>>>>>>> +                               goto err;
>>>>>>>> +               }
>>>>>>>> +       }
>>>>>>>> +
>>>>>>>> +       return 0;
>>>>>>>> +
>>>>>>>> +err:
>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>> +       return ret;
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>> +
>>>>>>>> +static int
>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int
>>>>>>>> num_fences)
>>>>>>>> +{
>>>>>>>> +       struct {
>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>> +               unsigned int num_objs;
>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>> +
>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args->objs,
>>>>>>>> +                                     args->num_objs, num_fences);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all
>>>>>>>> assoiciated BOs
>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>> + * @num_objs: the number of additional &drm_gem_objects to
>>>>>>>> lock
>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>> + *
>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>> given &drm_gpuvm
>>>>>>>> + * contains mappings of, plus the ones given through @objs.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +int
>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>> +                         unsigned int num_objs,
>>>>>>>> +                         unsigned int num_fences,
>>>>>>>> +                         bool interruptible)
>>>>>>>> +{
>>>>>>>> +       struct {
>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>> +               unsigned int num_objs;
>>>>>>>> +       } args;
>>>>>>>> +
>>>>>>>> +       args.objs = objs;
>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>> +
>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>> +
>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>> interruptible);
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>> within a given range
>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>> + *
>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>> mapped between @addr and
>>>>>>>> + * @addr + @range.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +int
>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>> +                         unsigned int num_fences,
>>>>>>>> +                         bool interruptible)
>>>>>>>> +{
>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>> +       uint32_t flags;
>>>>>>>> +       int ret;
>>>>>>>> +
>>>>>>>> +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>> +
>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>> +
>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec, addr, range,
>>>>>>>> +                                             num_fences);
>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>> +               if (ret)
>>>>>>>> +                       goto err;
>>>>>>>> +       }
>>>>>>>> +
>>>>>>>> +       return ret;
>>>>>>>> +
>>>>>>>> +err:
>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>> +       return ret;
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as evicted
>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>> + *
>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>>> evicted buffer
>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +int
>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>> +{
>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>> +       int ret = 0;
>>>>>>>> +
>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>> +               return -ENOTSUPP;
>>>>>>>> +
>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>> +               if (ret)
>>>>>>>> +                       break;
>>>>>>>> +       }
>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>> +
>>>>>>>> +       return ret;
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and all
>>>>>>>> extobj
>>>>>>>> + * dma-resv
>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>> + * @fence: fence to add
>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>> + */
>>>>>>>> +void
>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>> +{
>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>> +       unsigned long index;
>>>>>>>> +
>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm, obj) ?
>>>>>>>> +                                  private_usage : extobj_usage);
>>>>>>>> +       }
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>> +
>>>>>>>>     /**
>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of struct
>>>>>>>> drm_gpuvm_bo
>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>>> *gpuvm,
>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>> +
>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>          return vm_bo;
>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>> +
>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>> +
>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>>>>>>      *
>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>> + *
>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is
>>>>>>>> destroyed, which
>>>>>>>> + * includes removing it from the GEMs gpuva list. Hence, if
>>>>>>>> a call to this
>>>>>>>> + * function can potentially let the reference count to zero
>>>>>>>> the caller must
>>>>>>>> + * hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>>      */
>>>>>>>>     void
>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
>>>>>>>> *vm_bo)
>>>>>>>>     }
>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>> +static int __must_check
>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>> +{
>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>     }
>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its
>>>>>>>> &drm_gpuvm's
>>>>>>>> + * extobj list
>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's the
>>>>>>>> extobj list.
>>>>>>>> + *
>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if
>>>>>>>> not on the list
>>>>>>>> + * already and if the corresponding &drm_gem_object is an
>>>>>>>> external object,
>>>>>>>> + * actually.
>>>>>>>> + */
>>>>>>>> +void
>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>> +{
>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>> +
>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to
>>>>>>>> / from a
>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>> + *
>>>>>>>> + * Adds a &drm_gem_object to or removes it from all
>>>>>>>> &drm_gpuvms evicted
>>>>>>>> + * list containing a mapping of this &drm_gem_object.
>>>>>>>> + */
>>>>>>>> +void
>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>>> +{
>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>> +
>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>> +               if (evict)
>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, evict);
>>>>>>>> +               else
>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, evict);
>>>>>>>> +       }
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>> +
>>>>>>>>     static int
>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>      */
>>>>>>>>     #include <linux/list.h>
>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>     #include <linux/types.h>
>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>     struct drm_gpuvm;
>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>           * space
>>>>>>>>           */
>>>>>>>>          struct dma_resv *resv;
>>>>>>>> +
>>>>>>>> +       /**
>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>> +        */
>>>>>>>> +       struct {
>>>>>>>> +               /**
>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>> serving as
>>>>>>>> +                * external object
>>>>>>>> +                */
>>>>>>>> +               struct list_head list;
>>>>>>>> +
>>>>>>>> +               /**
>>>>>>>> +                * @lock: spinlock to protect the extobj list
>>>>>>>> +                */
>>>>>>>> +               spinlock_t lock;
>>>>>>>> +       } extobj;
>>>>>>>> +
>>>>>>>> +       /**
>>>>>>>> +        * @evict: structure holding the evict list and evict
>>>>>>>> list lock
>>>>>>>> +        */
>>>>>>>> +       struct {
>>>>>>>> +               /**
>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>> currently being
>>>>>>>> +                * evicted
>>>>>>>> +                */
>>>>>>>> +               struct list_head list;
>>>>>>>> +
>>>>>>>> +               /**
>>>>>>>> +                * @lock: spinlock to protect the evict list
>>>>>>>> +                */
>>>>>>>> +               spinlock_t lock;
>>>>>>>> +       } evict;
>>>>>>>>     };
>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>> drm_device *drm,
>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>>> &drm_gem_object is an
>>>>>>>> + * external object
>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>> + *
>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv differs
>>>>>>>> from the
>>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>>> + */
>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
>>>>>>>> *gpuvm,
>>>>>>>> +                                      struct drm_gem_object
>>>>>>>> *obj)
>>>>>>>> +{
>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>     {
>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
>>>>>>>>          list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
>>>>>>>> +/**
>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>> &drm_exec
>>>>>>>> + *
>>>>>>>> + * This structure should be created on the stack as
>>>>>>>> &drm_exec should be.
>>>>>>>> + *
>>>>>>>> + * Optionally, @extra can be set in order to lock additional
>>>>>>>> &drm_gem_objects.
>>>>>>>> + */
>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>> +       /**
>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>> +        */
>>>>>>>> +       struct drm_exec exec;
>>>>>>>> +
>>>>>>>> +       /**
>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>>> +        */
>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>> +
>>>>>>>> +       /**
>>>>>>>> +        * @extra: Callback and corresponding private data
>>>>>>>> for the driver to
>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>> +        */
>>>>>>>> +       struct {
>>>>>>>> +               /**
>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>> additional &drm_gem_objects.
>>>>>>>> +                */
>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>> +                         unsigned int num_fences);
>>>>>>>> +
>>>>>>>> +               /**
>>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>>> callback
>>>>>>>> +                */
>>>>>>>> +               void *priv;
>>>>>>>> +       } extra;
>>>>>>>> +};
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
>>>>>>>> resv
>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>> + *
>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>>> &drm_gem_object.
>>>>>>>> + *
>>>>>>>> + * Using this function directly, it is the drivers
>>>>>>>> responsibility to call
>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +static inline int
>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>> +                    unsigned int num_fences)
>>>>>>>> +{
>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>> num_fences);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>> +                             unsigned int num_fences);
>>>>>>>> +
>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>> +                           unsigned int num_fences);
>>>>>>>> +
>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>> +                       unsigned int num_fences,
>>>>>>>> +                       bool interruptible);
>>>>>>>> +
>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>> *vm_exec,
>>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>>> +                             unsigned int num_objs,
>>>>>>>> +                             unsigned int num_fences,
>>>>>>>> +                             bool interruptible);
>>>>>>>> +
>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>> *vm_exec,
>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>> +                             unsigned int num_fences,
>>>>>>>> +                             bool interruptible);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_lock() - lock all dma-resv of all assoiciated
>>>>>>>> BOs
>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>> + *
>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects
>>>>>>>> previously acquired
>>>>>>>> + * through drm_gpuvm_lock() or its variants.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +static inline void
>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>> +{
>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>> +                             enum dma_resv_usage
>>>>>>>> private_usage,
>>>>>>>> +                             enum dma_resv_usage
>>>>>>>> extobj_usage);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_exec_resv_add_fence()
>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>> + * @fence: fence to add
>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>> + *
>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>> + */
>>>>>>>> +static inline void
>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>>> *vm_exec,
>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>> +                             enum dma_resv_usage
>>>>>>>> private_usage,
>>>>>>>> +                             enum dma_resv_usage
>>>>>>>> extobj_usage)
>>>>>>>> +{
>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec,
>>>>>>>> fence,
>>>>>>>> +                                private_usage,
>>>>>>>> extobj_usage);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>     /**
>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>> &drm_gpuvm and
>>>>>>>>      * &drm_gem_object combination
>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>                           * gpuva list.
>>>>>>>>                           */
>>>>>>>>                          struct list_head gem;
>>>>>>>> +
>>>>>>>> +                       /**
>>>>>>>> +                        * @evict: List entry to attach to
>>>>>>>> the &drm_gpuvms
>>>>>>>> +                        * extobj list.
>>>>>>>> +                        */
>>>>>>>> +                       struct list_head extobj;
>>>>>>>> +
>>>>>>>> +                       /**
>>>>>>>> +                        * @evict: List entry to attach to
>>>>>>>> the &drm_gpuvms evict
>>>>>>>> +                        * list.
>>>>>>>> +                        */
>>>>>>>> +                       struct list_head evict;
>>>>>>>>                  } entry;
>>>>>>>>          } list;
>>>>>>>>     };
>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
>>>>>>>> evict);
>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>> +
>>>>>>>>     /**
>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a
>>>>>>>> list of &drm_gpuva
>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>> iteration step
>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>           * used.
>>>>>>>>           */
>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void
>>>>>>>> *priv);
>>>>>>>> +
>>>>>>>> +       /**
>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>> +        *
>>>>>>>> +        * Drivers receive this callback for every evicted
>>>>>>>> &drm_gem_object being
>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>> +        *
>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>> specific variant of
>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>> +        */
>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>     };
>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 14:26               ` Christian König
  2023-09-13 15:13                 ` Thomas Hellström
@ 2023-09-13 15:15                 ` Danilo Krummrich
  2023-09-13 15:33                   ` Christian König
  1 sibling, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-13 15:15 UTC (permalink / raw)
  To: Christian König, Thomas Hellström
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

On 9/13/23 16:26, Christian König wrote:
> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>> As mentioned in a different mail thread, the reply is based on the assumption
>> that we don't support anything else than GPUVM updates from the IOCTL.
> 
> I think that this assumption is incorrect.

Well, more precisely I should have said "don't support GPUVM updates from within
fence signalling critical sections". And looking at the code, that doesn't seem to
be what you're doing there.

> 
> Vulkan is just one specific use case, but this here should probably be able to handle other use cases as well.
> 
> Especially with HMM you get the requirement that you need to be able to invalidate GPUVM mappings without grabbing a reservation lock.

What do you mean by "invalidate GPUVM mappings" in this context? drm_gpuvm_bo_evict()
should only be called from a ttm_device_funcs::move callback, where we should already
hold the dma-resv lock.
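
Roughly the call site I have in mind (sketch only, not taken from any driver;
ttm_buffer_object::base is the embedded GEM object):

static int my_bo_move(struct ttm_buffer_object *bo, bool evict,
		      struct ttm_operation_ctx *ctx,
		      struct ttm_resource *new_mem,
		      struct ttm_place *hop)
{
	int ret;

	dma_resv_assert_held(bo->base.resv);

	ret = ttm_bo_move_memcpy(bo, ctx, new_mem);
	if (ret)
		return ret;

	/* Mark / unmark the mappings of this BO as evicted in all VMs. */
	drm_gpuvm_bo_evict(&bo->base, evict);

	return 0;
}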

> 
> See what the eviction lock in amdgpu is doing for example.

The eviction_lock seems to protect a VM state "evicting", i.e. whether any BO
associated with the VM is currently being evicted. At the same time amdgpu protects
the evicted list of the VM with a different lock. So this seems to be entirely
unrelated. Tracking a "currently evicting" state is not part of the GPUVM
implementation currently, hence nothing would change for amdgpu there.

> 
> Regards,
> Christian.
> 
>>
>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström wrote:
>>> Hi!
>>>
>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström wrote:
>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
>>>>>>> Hi, Danilo,
>>>>>>>
>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>> So far the DRM GPUVA manager offers common infrastructure to
>>>>>>>> track GPU VA
>>>>>>>> allocations and mappings, generically connect GPU VA mappings
>>>>>>>> to their
>>>>>>>> backing buffers and perform more complex mapping operations
>>>>>>>> on the GPU VA
>>>>>>>> space.
>>>>>>>>
>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>> drivers, which
>>>>>>>> can potentially be generalized in order to make the DRM GPUVA
>>>>>>>> manager
>>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>>> this patch aims
>>>>>>>> at generalizing the following elements.
>>>>>>>>
>>>>>>>> 1) Provide a common dma-resv for GEM objects not being used
>>>>>>>> outside of
>>>>>>>>       this GPU-VM.
>>>>>>>>
>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>> which are
>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>
>>>>>>>> 3) Provide functions to efficiently lock all GEM objects dma-
>>>>>>>> resv the
>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>
>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>> contains mappings
>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>> accelerated.
>>>>>>>>
>>>>>>>> 5) Provide some convenience functions for common patterns.
>>>>>>>>
>>>>>>>> Rather than being designed as a "framework", the target is to
>>>>>>>> make all
>>>>>>>> features appear as a collection of optional helper functions,
>>>>>>>> such that
>>>>>>>> drivers are free to make use of the DRM GPUVA managers basic
>>>>>>>> functionality and opt-in for other features without setting
>>>>>>>> any feature
>>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>>
>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>> locking for drivers
>>>>>>>> updating the GPU VA space within the fence signalling path.
>>>>>>>>
>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>> ---
>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an existing
>>>>>>>> instance of this
>>>>>>>>      * particular combination. If not existent a new instance
>>>>>>>> is created and linked
>>>>>>>>      * to the &drm_gem_object.
>>>>>>>> + *
>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>> &drm_gpuvm, are also used
>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and
>>>>>>>> evicted objects. Those
>>>>>>>> + * lists are maintained in order to accelerate locking of
>>>>>>>> dma-resv locks and
>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For
>>>>>>>> instance, all the
>>>>>>>> + * &drm_gem_objects' &dma_resv of a given &drm_gpuvm can be
>>>>>>>> locked by calling
>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>> drm_gpuvm_validate() in
>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>>> also possible to lock
>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>> corresponding parameters to
>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec
>>>>>>>> loop while making
>>>>>>>> + * use of helper functions such as drm_gpuvm_prepare_range()
>>>>>>>> or
>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>> + *
>>>>>>>> + * Every bound &drm_gem_object is treated as external object
>>>>>>>> when its &dma_resv
>>>>>>>> + * structure is different than the &drm_gpuvm's common
>>>>>>>> &dma_resv structure.
>>>>>>>>      */
>>>>>>>>     /**
>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for the same
>>>>>>>> &drm_gpuvm and
>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>> creations and destructions
>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances unique.
>>>>>>>> + *
>>>>>>>> + * The &drm_gpuvm's lists for keeping track of external and
>>>>>>>> evicted objects are
>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>> iteration internally.
>>>>>>>> + *
>>>>>>>> + * However, drivers still need to protect concurrent
>>>>>>>> calls to functions
>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function contains
>>>>>>>> a particular
>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>> + *
>>>>>>>> + * Functions adding or removing entries from those lists,
>>>>>>>> such as
>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be
>>>>>>>> called with external
>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>> corresponding list to be
>>>>>>>> + * (safely) modified while potentially being iternated by
>>>>>>>> other API functions.
>>>>>>>> + * However, this is entirely optional.
>>>>>>>>      */
>>>>>>>>     /**
>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>      *   }
>>>>>>>>      */
>>>>>>>> +/**
>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>> already iterated items
>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>> drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>> + *
>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>> Lockless as in, the
>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>> first element from
>>>>>>>> + * the list, so list insertion and deletion can happen
>>>>>>>> concurrently.
>>>>>>> Are the list spinlocks needed for that async state update from
>>>>>>> within the
>>>>>>> dma-fence critical section we've discussed previously?
>>>>>> Yes, but also for other reasons, see below.
>>>>>>
>>>>>>> Otherwise it should be sufficient to protect the lists with the
>>>>>>> gpuvm's resv
>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>
>>>>>>> If those spinlocks are still needed in some situations, perhaps
>>>>>>> could we
>>>>>>> have an option to set them to NULL (Like IIRC the maple tree
>>>>>>> allows for)?
>>>>>> The evict spinlock is needed in any case, since in
>>>>>> drm_gpuvm_bo_evict() we're
>>>>>> holding only the dma-resv lock from the BO this function gets
>>>>>> called for. Hence,
>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() calls with
>>>>>> different BOs.
>>>>> No. Only if you try to add external objects to the vm's evict list
>>>>> from
>>>>> within the evict code. That's not necessary since you loop through
>>>>> all
>>>>> external objects anyway when locking them so an "evicted" bool in
>>>>> the vm_bo,
>>>>> protected by the bo resv would be sufficient. The extobj locking
>>>>> loop can
>>>>> then add the bo to the evicted list.
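
For illustration, a rough sketch of that scheme (the "evicted" flag is
hypothetical, the list fields are from the patch above):

    /* drm_gpuvm_bo_evict(), called with the BO's resv held, would only
     * flip a flag in the vm_bo. */
    vm_bo->evicted = evict;

    /* The extobj locking loop, which holds each BO's resv anyway, then
     * moves flagged vm_bos onto the VM's evict list. */
    list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj) {
            ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
            if (ret)
                    return ret;

            if (vm_bo->evicted)
                    list_move_tail(&vm_bo->list.entry.evict,
                                   &gpuvm->evict.list);
    }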
>>>> And validate() can remove it while still holding all dma-resv locks,
>>>> neat!
>>>> However, what if two tasks are trying to lock the VA space
>>>> concurrently? What
>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>> drm_gpuva_unlink()?
>>>> Are we guaranteed that at this point of time the drm_gpuvm_bo is not
>>>> on the
>>>> evicted list? Because otherwise we would call drm_gpuvm_bo_destroy()
>>>> with the
>>>> dma-resv lock held, which wouldn't be allowed, since
>>>> drm_gpuvm_bo_destroy()
>>>> might drop the last reference to the drm_gem_object and hence we'd
>>>> potentially
>>>> free the dma-resv lock while holding it, at least if it's an external
>>>> object.
>>> Easiest way in this scheme is to think of the lists as being protected
>>> by the vm's resv lock. That means anybody calling unlink() must also
>>> hold the vm's resv lock. (Which is OK from an UAF point of view, but
>>> perhaps not from a locking inversion POW from an async list update).
>> This would mean that on unlink() we'd need to hold the VM's resv lock and the
>> corresponding GEM's resv lock (in case they're not the same anyways) because the
>> VM's resv lock would protect the external / evicted object lists and the GEM
>> object's resv lock protects the GEM's list of drm_gpuvm_bos and the
>> drm_gpuvm_bo's list of drm_gpuvas.
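
A minimal sketch of what a synchronous unbind path would then look like under
that scheme (error handling omitted, the surrounding driver code is hypothetical):

    dma_resv_lock(gpuvm->resv, NULL);               /* extobj / evict lists */
    if (drm_gpuvm_is_extobj(gpuvm, obj))
            dma_resv_lock(obj->resv, NULL);         /* GEM's vm_bo / gpuva lists */

    drm_gpuva_remove(va);
    drm_gpuva_unlink(va);                           /* may drop the last vm_bo ref */

    if (drm_gpuvm_is_extobj(gpuvm, obj))
            dma_resv_unlock(obj->resv);
    dma_resv_unlock(gpuvm->resv);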
>>
>>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>>> really would not
>>>>>> like to add even more complexity just to get the spinlock out of
>>>>>> the way in case
>>>>>> the driver already has an outer lock protecting this path.
>>>>> I must disagree here. These spinlocks and atomic operations are
>>>>> pretty
>>>>> costly and as discussed earlier this type of locking was the reason
>>>>> (at
>>>>> least according to the commit message) that made Christian drop the
>>>>> XArray
>>>>> use in drm_exec for the same set of objects: "The locking overhead
>>>>> is
>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>> complexity and a
>>>>> single wide lock following the drm locking guidelines set out by
>>>>> Daniel and
>>>>> David should really be the default choice with an opt-in for a
>>>>> spinlock if
>>>>> needed for async and pushing out to a wq is not an option.
>>>> For the external object list an outer lock would work as long as it's
>>>> not the
>>>> dma-resv lock of the corresponding GEM object, since here we actually
>>>> need to
>>>> remove the list entry from the external object list on
>>>> drm_gpuvm_bo_destroy().
>>>> It's just a bit weird design wise that drivers would need to take
>>>> this outer
>>>> lock on:
>>>>
>>>> - drm_gpuvm_bo_extobj_add()
>>>> - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
>>>> - drm_gpuva_unlink()            (because it needs to call
>>>> drm_gpuvm_bo_put())
>>>> - drm_gpuvm_exec_lock()
>>>> - drm_gpuvm_exec_lock_array()
>>>> - drm_gpuvm_prepare_range()
>>>>
>>>> Given that it seems reasonable to do all the required locking
>>>> internally.
>>>  From a design POW, there has been a clear direction in XE to make
>>> things similar to mmap() / munmap(), so this outer lock, which in Xe is
>>> an rwsem, is used in a similar way as the mmap_lock. It's protecting
>>> the page-table structures and vma rb tree, the userptr structures and
>>> the extobj list. Basically it's taken early in the exec IOCTL, the
>>> VM_BIND ioctl, the compute rebind worker and the pagefault handler, so
>>> all of the above are just asserting that it is taken in the correct
>>> mode.
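
For reference, a bare-bones sketch of that pattern (structure and function
names are made up, only the GPUVM bits are from the series):

    struct my_vm {
            struct drm_gpuvm base;
            struct rw_semaphore lock; /* PTs, VA tree, userptr, extobj list */
    };

    static void my_vm_add_extobj(struct my_vm *vm, struct drm_gpuvm_bo *vm_bo)
    {
            /* The exec/VM_BIND ioctls, rebind worker and fault handler
             * take vm->lock up front; helpers only assert it. */
            lockdep_assert_held_write(&vm->lock);

            drm_gpuvm_bo_extobj_add(vm_bo);
    }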
>>>
>>> But strictly with this scheme one could also use the vm's dma_resv for
>>> the extobj list since with drm_exec, it's locked before traversing the
>>> list.
>>>
>>> The whole point of this scheme is to rely on locks that you already are
>>> supposed to be holding for various reasons and is simple to comprehend.
>> I don't agree that we're supposed to hold the VM's resv lock anyways for
>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm fine using it
>> for that purpose nevertheless.
>>
>>>> In order to at least place lockdep checks, the driver would need to
>>>> supply the
>>>> corresponding lock's lockdep_map, because the GPUVM otherwise doesn't
>>>> know about
>>>> the lock.
>>> Yes, that sounds reasonable. One lockdep map per list.
>> I'd really like to avoid that, especially now that everything got simpler. We
>> should define the actual locks to take instead.
>>
>>>> Out of curiosity, what is the overhead of a spin_lock() that doesn't
>>>> need to
>>>> spin?
>>> I guess it's hard to tell exactly, but it is much lower on modern x86
>>> than what it used to be. Not sure about ARM, which is the other
>>> architecture important to us. I figure if there is little cache-line
>>> bouncing the main overhead comes from the implied barriers.
>>>
>>>>> A pretty simple way that would not add much code would be
>>>>>
>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>                                  spinlock_t *lock)
>>>>> {
>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>          spin_lock(lock);
>>>>> }
>>>>>
>>>>>>> For such drivers, that would require anybody calling unlink to
>>>>>>> hold the vm's
>>>>>>> resv, though.
>>>>>> In V4 I want to go back to having a dedicated lock for the GEMs
>>>>>> gpuva list (or
>>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>>> lock for that
>>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>>> otherwise wouldn't
>>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>>> was referring to
>>>>>> earlier.
>>>>> Yeah, I can see the need for a dedicated lock for the GEM's gpuva
>>>>> list, but
>>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>>> problem. We
>>>>> may free the object and a pointer to the vm's resv during unlink
>>>>> but we
>>>>> don't free the vm's resv.  It'd be a matter of ensuring that any
>>>>> calls to
>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>> Drivers calling unlink() from the fence signaling path can't use the
>>>> VM's
>>>> dma-resv lock.
>>> Yes, that made me a bit curious because in the current version the code
>>> required the object's dma_resv for unlink() which can't be grabbed
>>> either from the fence signaling path. So are there any drivers actually
>>> wanting to do that? If so, they will either need to resort to the
>>> current spinlock solution or they will need to call unlink from a
>>> workqueue item.
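
For completeness, the workqueue variant could look roughly like this (the
driver structures are hypothetical, and the drm_gpuva is assumed to have been
removed from the VA tree already):

    struct my_vma {
            struct drm_gpuva va;
            struct work_struct destroy_work;
    };

    static void my_vma_destroy_work(struct work_struct *work)
    {
            struct my_vma *vma = container_of(work, struct my_vma, destroy_work);
            struct drm_gem_object *obj = vma->va.gem.obj;

            /* Own reference, so unlink() dropping the vm_bo's GEM reference
             * can't free obj (and its resv) while we hold the lock. */
            drm_gem_object_get(obj);

            dma_resv_lock(obj->resv, NULL);
            drm_gpuva_unlink(&vma->va);
            dma_resv_unlock(obj->resv);

            drm_gem_object_put(obj);
            kfree(vma);
    }

    /* From the dma-fence signalling path, only queue the work: */
    queue_work(system_wq, &vma->destroy_work);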
>> As Boris already mentioned we have the dma-resv lock by default or a driver
>> specific GEM gpuva lock as opt-in. Now, we can get rid of the latter.
>>
>>>> Also, what if the object is an external object? We can't use the VM's
>>>> dma-resv
>>>> lock here.
>>> Why? Typically (sync) unlink is only ever called from an unbind-like
>>> operation where it should be trivial to grab the vm's resv. Or, for
>>> that matter any outer lock protecting the extobj list. Rule would be
>>> the drm_gpuvm_bo::entry::extobj  and drm_gpuvm_bo::entry::evict would
>>> be protected by either the vm's dma_resv (or possibly an outer lock in
>>> the case of the extobj list).
>> An outer lock wouldn't have worked for updates in the async path, but that
>> shouldn't be relevant anymore. We could use the VM's resv for that.
>>
>>>>   And we can't have the GEM objs dma-resv lock held when calling
>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>>> refcount drops
>>>> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might
>>>> drop the
>>>> last reference of the GEM object.
>>> Yes, but this is a different problem as to what exactly protects
>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per bo list
>>> lock, or if we want to keep the bo's dma_resv we need to ensure that
>>> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
>>> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount (I know
>>> Boris didn't like that, but requiring an explicit refcount for a
>>> pointer you dereference unless you're under a lock that ensures keeping
>>> the object alive is pretty much required?) But anyway for the
>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal spinlock)
>>> I don't have a strong preference.
>> We can keep the GEM objects dma-resv lock, however as mentioned above
>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both the VM's resv lock
>> and the GEM's resv lock in case they differ.
>>
>>>>   All those problems go away with a dedicated
>>>> GEM gpuva list lock.
>>> I don't think these are real problems.
>>> With the exception of the eviction list "trick" where we currently have
>>> slightly different approach to collect external bos needing rebinding,
>>> we have this working fine.
>>>
>>> TBH I think pretty much the only situation where the spinlock is needed
>>> is for async updates of these lists, unless a wq item can be used for
>>> that, but it doesn't really seem like the current code allows for such
>>> updates anyway? It complicates the code a lot, adds overhead and also
>>> adds the requirement for refcounting during list traversal.
>>>
>>> /Thomas
>>>
>>>>> /Thomas
>>>>>
>>>>>
>>>>>>> It seems that with that also the refcount could be make non-
>>>>>>> atomic.
>>>>>>>
>>>>>>> All in the spirit of the drm locking guidelines "use big locks
>>>>>>> when
>>>>>>> possible".
>>>>>>> Lower level locks only when necessary for performance or
>>>>>>> locking inversion?
>>>>>>>
>>>>>>> /Thomas
>>>>>>>
>>>>>>>
>>>>>>>> + *
>>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>>> local list, so removal
>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>> iterating the list.
>>>>>>>> + */
>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name,
>>>>>>>> __local_list, __prev_vm_bo)     \
>>>>>>>> +       ({
>>>>>>>>                             \
>>>>>>>> +               struct drm_gpuvm_bo
>>>>>>>> *__vm_bo;                                           \
>>>>>>>> +
>>>>>>>>                             \
>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);
>>>>>>>>                             \
>>>>>>>> +
>>>>>>>>                             \
>>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>>> __list_name.lock);                                \
>>>>>>>> +               while (!list_empty(&(__gpuvm)-
>>>>>>>>> __list_name.list)) {                     \
>>>>>>>> +                       __vm_bo =
>>>>>>>> list_first_entry(&(__gpuvm)->__list_name.list,        \
>>>>>>>> +                                                  struct
>>>>>>>> drm_gpuvm_bo,                 \
>>>>>>>> +
>>>>>>>> list.entry.__list_name);             \
>>>>>>>> +                       if
>>>>>>>> (drm_gpuvm_bo_get_unless_zero(__vm_bo))
>>>>>>>> {                    \
>>>>>>>> +                               list_move_tail(&(__vm_bo)-
>>>>>>>>> list.entry.__list_name,      \
>>>>>>>> +
>>>>>>>> __local_list);                           \
>>>>>>>> +                               break;
>>>>>>>>                             \
>>>>>>>> +                       } else
>>>>>>>> {                                                        \
>>>>>>>> +                               list_del_init(&(__vm_bo)-
>>>>>>>>> list.entry.__list_name);      \
>>>>>>>> +                               __vm_bo =
>>>>>>>> NULL;                                         \
>>>>>>>> +                       }
>>>>>>>>                             \
>>>>>>>> +               }
>>>>>>>>                             \
>>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>>> __list_name.lock);                              \
>>>>>>>> +
>>>>>>>>                             \
>>>>>>>> +               __vm_bo;
>>>>>>>>                             \
>>>>>>>> +       })
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>>>>>>> + *
>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>> Lockless as in, the
>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>> first element from the
>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>> concurrently.
>>>>>>>> + *
>>>>>>>> + * Typical use:
>>>>>>>> + *
>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>> + *
>>>>>>>> + *     ret = 0;
>>>>>>>> + *     drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>,
>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>> + *             ret = do_something_with_vm_bo(..., vm_bo);
>>>>>>>> + *             if (ret)
>>>>>>>> + *                     break;
>>>>>>>> + *     }
>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>> + *     drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>,
>>>>>>>> &my_local_list);
>>>>>>>> + *
>>>>>>>> + *
>>>>>>>> + * Only used for internal list iterations, not meant to be
>>>>>>>> exposed to the outside
>>>>>>>> + * world.
>>>>>>>> + */
>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name,
>>>>>>>> __local_list, __vm_bo)    \
>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm,
>>>>>>>> __list_name,           \
>>>>>>>> +                                               __local_list,
>>>>>>>> NULL);            \
>>>>>>>> +
>>>>>>>> __vm_bo;
>>>>>>>>        \
>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm,
>>>>>>>> __list_name,           \
>>>>>>>> +                                               __local_list,
>>>>>>>> __vm_bo))         \
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to their
>>>>>>>> original list
>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>> already iterated items
>>>>>>>> + *
>>>>>>>> + * When we're done iterating a vm_bo list, we should call
>>>>>>>> restore_vm_bo_list()
>>>>>>>> + * to restore the original state and let new iterations take
>>>>>>>> place.
>>>>>>>> + */
>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name,
>>>>>>>> __local_list)                         \
>>>>>>>> +       do
>>>>>>>> {
>>>>>>>>                  \
>>>>>>>> +               /* Merge back the two lists, moving local
>>>>>>>> list elements to the          \
>>>>>>>> +                * head to preserve previous ordering, in
>>>>>>>> case it matters.              \
>>>>>>>> +
>>>>>>>> */
>>>>>>>>            \
>>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>>> __list_name.lock);                                \
>>>>>>>> +               list_splice(__local_list, &(__gpuvm)-
>>>>>>>>> __list_name.list);                \
>>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>>> __list_name.lock);                              \
>>>>>>>> +       } while (0)
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given
>>>>>>>> list
>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>> + *
>>>>>>>> + * Inserts the given @__vm_bo into the list specified by
>>>>>>>> @__list_name and
>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>> + */
>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo,
>>>>>>>> __list_name)                            \
>>>>>>>> +       do
>>>>>>>> {
>>>>>>>>          \
>>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>>> __list_name.lock);                    \
>>>>>>>> +               if (list_empty(&(__vm_bo)-
>>>>>>>>> list.entry.__list_name))             \
>>>>>>>> +                       list_add_tail(&(__vm_bo)-
>>>>>>>>> list.entry.__list_name,       \
>>>>>>>> +                                     &(__vm_bo)->vm-
>>>>>>>>> __list_name.list);        \
>>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>>> __list_name.lock);                  \
>>>>>>>> +       } while (0)
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given
>>>>>>>> list
>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>> + * @__list_name: the name of the list to remove from
>>>>>>>> + *
>>>>>>>> + * Removes the given @__vm_bo from the list specified by
>>>>>>>> @__list_name and
>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>> + */
>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo,
>>>>>>>> __list_name)                            \
>>>>>>>> +       do
>>>>>>>> {
>>>>>>>>          \
>>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>>> __list_name.lock);                    \
>>>>>>>> +               if (!list_empty(&(__vm_bo)-
>>>>>>>>> list.entry.__list_name))            \
>>>>>>>> +                       list_del_init(&(__vm_bo)-
>>>>>>>>> list.entry.__list_name);      \
>>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>>> __list_name.lock);                  \
>>>>>>>> +       } while (0)
>>>>>>>> +
>>>>>>>> +static int __must_check
>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>>> +
>>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
>>>>>>>> struct drm_device *drm,
>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>> +
>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>> +
>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
>>>>>>>> *gpuvm)
>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>               "GPUVA tree is not empty, potentially leaking
>>>>>>>> memory.\n");
>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
>>>>>>>> should be empty.\n");
>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
>>>>>>>> should be empty.\n");
>>>>>>>> +
>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>     }
>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>> + *
>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the
>>>>>>>> given
>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>> + *
>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>> responsibility to call
>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>> + *
>>>>>>>> + * Note: This function is safe against concurrent insertion
>>>>>>>> and removal of
>>>>>>>> + * external objects, however it is not safe against
>>>>>>>> concurrent usage itself.
>>>>>>>> + *
>>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>>> either an outer VM lock
>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this function
>>>>>>>> within the
>>>>>>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's
>>>>>>>> dma-resv lock ensures
>>>>>>>> + * mutual exclusion.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +int
>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>> +                         unsigned int num_fences)
>>>>>>>> +{
>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>> +       int ret = 0;
>>>>>>>> +
>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>> vm_bo) {
>>>>>>>> +               ret = drm_exec_prepare_obj(exec, vm_bo->obj,
>>>>>>>> num_fences);
>>>>>>>> +               if (ret)
>>>>>>>> +                       break;
>>>>>>>> +       }
>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>> +
>>>>>>>> +       return ret;
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped within
>>>>>>>> a given range
>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>> + *
>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>>> mapped between @addr
>>>>>>>> + * and @addr + @range.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +int
>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>> drm_exec *exec,
>>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>>> num_fences)
>>>>>>>> +{
>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>> +       u64 end = addr + range;
>>>>>>>> +       int ret;
>>>>>>>> +
>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
>>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>>> +
>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>> num_fences);
>>>>>>>> +               if (ret)
>>>>>>>> +                       return ret;
>>>>>>>> +       }
>>>>>>>> +
>>>>>>>> +       return 0;
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>> associated BOs
>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>> + *
>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>> given
>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>> + *
>>>>>>>> + * Additionally, when calling this function with struct
>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>> + * being set, the driver receives the given @fn callback to
>>>>>>>> lock additional
>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec instance.
>>>>>>>> Typically, drivers
>>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>>> callback.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +int
>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>> +                   unsigned int num_fences,
>>>>>>>> +                   bool interruptible)
>>>>>>>> +{
>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>> +       uint32_t flags;
>>>>>>>> +       int ret;
>>>>>>>> +
>>>>>>>> +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>> 0 |
>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>> +
>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>> +
>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>> num_fences);
>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>> +               if (ret)
>>>>>>>> +                       goto err;
>>>>>>>> +
>>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>> num_fences);
>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>> +               if (ret)
>>>>>>>> +                       goto err;
>>>>>>>> +
>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>> num_fences);
>>>>>>>> +                       drm_exec_retry_on_contention(exec);
>>>>>>>> +                       if (ret)
>>>>>>>> +                               goto err;
>>>>>>>> +               }
>>>>>>>> +       }
>>>>>>>> +
>>>>>>>> +       return 0;
>>>>>>>> +
>>>>>>>> +err:
>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>> +       return ret;
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>> +
>>>>>>>> +static int
>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int
>>>>>>>> num_fences)
>>>>>>>> +{
>>>>>>>> +       struct {
>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>> +               unsigned int num_objs;
>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>> +
>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args-
>>>>>>>>> objs,
>>>>>>>> +                                     args->num_objs,
>>>>>>>> num_fences);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all
>>>>>>>> associated BOs
>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>> + * @num_objs: the number of additional &drm_gem_objects to
>>>>>>>> lock
>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>> + *
>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>> given &drm_gpuvm
>>>>>>>> + * contains mappings of, plus the ones given through @objs.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +int
>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>> +                         unsigned int num_objs,
>>>>>>>> +                         unsigned int num_fences,
>>>>>>>> +                         bool interruptible)
>>>>>>>> +{
>>>>>>>> +       struct {
>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>> +               unsigned int num_objs;
>>>>>>>> +       } args;
>>>>>>>> +
>>>>>>>> +       args.objs = objs;
>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>> +
>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>> +
>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>> interruptible);
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>> within a given range
>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>> + *
>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>> mapped between @addr and
>>>>>>>> + * @addr + @range.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +int
>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>> +                         unsigned int num_fences,
>>>>>>>> +                         bool interruptible)
>>>>>>>> +{
>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>> +       uint32_t flags;
>>>>>>>> +       int ret;
>>>>>>>> +
>>>>>>>> +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>> 0 |
>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>> +
>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>> +
>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec,
>>>>>>>> addr, range,
>>>>>>>> +                                             num_fences);
>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>> +               if (ret)
>>>>>>>> +                       goto err;
>>>>>>>> +       }
>>>>>>>> +
>>>>>>>> +       return ret;
>>>>>>>> +
>>>>>>>> +err:
>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>> +       return ret;
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as evicted
>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>> + *
>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>>> evicted buffer
>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +int
>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>> +{
>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>> +       int ret = 0;
>>>>>>>> +
>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>> +               return -ENOTSUPP;
>>>>>>>> +
>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>> +               if (ret)
>>>>>>>> +                       break;
>>>>>>>> +       }
>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>> +
>>>>>>>> +       return ret;
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and all
>>>>>>>> extobj
>>>>>>>> + * dma-resv
>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>> + * @fence: fence to add
>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>> + */
>>>>>>>> +void
>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>> +{
>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>> +       unsigned long index;
>>>>>>>> +
>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm,
>>>>>>>> obj) ?
>>>>>>>> +                                  private_usage :
>>>>>>>> extobj_usage);
>>>>>>>> +       }
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>> +
>>>>>>>>     /**
>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of struct
>>>>>>>> drm_gpuvm_bo
>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>>> *gpuvm,
>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>> +
>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>          return vm_bo;
>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>> +
>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>> +
>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>>>>>>      *
>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>> + *
>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is
>>>>>>>> destroyed, which
>>>>>>>> + * includes removing it from the GEM's gpuva list. Hence, if
>>>>>>>> a call to this
>>>>>>>> + * function can potentially let the reference count drop to zero,
>>>>>>>> the caller must
>>>>>>>> + * hold the dma-resv or driver-specific GEM gpuva lock.
>>>>>>>>      */
>>>>>>>>     void
>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
>>>>>>>> *vm_bo)
>>>>>>>>     }
>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>> +static int __must_check
>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>> +{
>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>     }
>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its
>>>>>>>> &drm_gpuvm's
>>>>>>>> + * extobj list
>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's the
>>>>>>>> extobj list.
>>>>>>>> + *
>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if
>>>>>>>> not on the list
>>>>>>>> + * already and if the corresponding &drm_gem_object actually is
>>>>>>>> an external
>>>>>>>> + * object.
>>>>>>>> + */
>>>>>>>> +void
>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>> +{
>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>> +
>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to
>>>>>>>> / from a
>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>> + *
>>>>>>>> + * Adds a &drm_gem_object to or removes it from the
>>>>>>>> evicted list of every
>>>>>>>> + * &drm_gpuvm containing a mapping of this &drm_gem_object.
>>>>>>>> + */
>>>>>>>> +void
>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>>> +{
>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>> +
>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>> +               if (evict)
>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, evict);
>>>>>>>> +               else
>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, evict);
>>>>>>>> +       }
>>>>>>>> +}
>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>> +
>>>>>>>>     static int
>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>      */
>>>>>>>>     #include <linux/list.h>
>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>     #include <linux/types.h>
>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>     struct drm_gpuvm;
>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>           * space
>>>>>>>>           */
>>>>>>>>          struct dma_resv *resv;
>>>>>>>> +
>>>>>>>> +       /**
>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>> +        */
>>>>>>>> +       struct {
>>>>>>>> +               /**
>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>> serving as
>>>>>>>> +                * external object
>>>>>>>> +                */
>>>>>>>> +               struct list_head list;
>>>>>>>> +
>>>>>>>> +               /**
>>>>>>>> +                * @lock: spinlock to protect the extobj list
>>>>>>>> +                */
>>>>>>>> +               spinlock_t lock;
>>>>>>>> +       } extobj;
>>>>>>>> +
>>>>>>>> +       /**
>>>>>>>> +        * @evict: structure holding the evict list and evict
>>>>>>>> list lock
>>>>>>>> +        */
>>>>>>>> +       struct {
>>>>>>>> +               /**
>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>> currently being
>>>>>>>> +                * evicted
>>>>>>>> +                */
>>>>>>>> +               struct list_head list;
>>>>>>>> +
>>>>>>>> +               /**
>>>>>>>> +                * @lock: spinlock to protect the evict list
>>>>>>>> +                */
>>>>>>>> +               spinlock_t lock;
>>>>>>>> +       } evict;
>>>>>>>>     };
>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>> drm_device *drm,
>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>>> &drm_gem_object is an
>>>>>>>> + * external object
>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>> + *
>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv differs
>>>>>>>> from the
>>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>>> + */
>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
>>>>>>>> *gpuvm,
>>>>>>>> +                                      struct drm_gem_object
>>>>>>>> *obj)
>>>>>>>> +{
>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>     {
>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__)
>>>>>>>> \
>>>>>>>>          list_for_each_entry_safe(va__, next__, &(gpuvm__)-
>>>>>>>>> rb.list, rb.entry)
>>>>>>>> +/**
>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>> &drm_exec
>>>>>>>> + *
>>>>>>>> + * This structure should be created on the stack as
>>>>>>>> &drm_exec should be.
>>>>>>>> + *
>>>>>>>> + * Optionally, @extra can be set in order to lock additional
>>>>>>>> &drm_gem_objects.
>>>>>>>> + */
>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>> +       /**
>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>> +        */
>>>>>>>> +       struct drm_exec exec;
>>>>>>>> +
>>>>>>>> +       /**
>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>>> +        */
>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>> +
>>>>>>>> +       /**
>>>>>>>> +        * @extra: Callback and corresponding private data
>>>>>>>> for the driver to
>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>> +        */
>>>>>>>> +       struct {
>>>>>>>> +               /**
>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>> additional &drm_gem_objects.
>>>>>>>> +                */
>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>> +                         unsigned int num_fences);
>>>>>>>> +
>>>>>>>> +               /**
>>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>>> callback
>>>>>>>> +                */
>>>>>>>> +               void *priv;
>>>>>>>> +       } extra;
>>>>>>>> +};
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
>>>>>>>> resv
>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>> + *
>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>>> &drm_gem_object.
>>>>>>>> + *
>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>> responsibility to call
>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>> + *
>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>> + */
>>>>>>>> +static inline int
>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>> +                    unsigned int num_fences)
>>>>>>>> +{
>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>> num_fences);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>> +                             unsigned int num_fences);
>>>>>>>> +
>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>> +                           unsigned int num_fences);
>>>>>>>> +
>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>> +                       unsigned int num_fences,
>>>>>>>> +                       bool interruptible);
>>>>>>>> +
>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>> *vm_exec,
>>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>>> +                             unsigned int num_objs,
>>>>>>>> +                             unsigned int num_fences,
>>>>>>>> +                             bool interruptible);
>>>>>>>> +
>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>> *vm_exec,
>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>> +                             unsigned int num_fences,
>>>>>>>> +                             bool interruptible);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all
>>>>>>>> associated BOs
>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>> + *
>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects
>>>>>>>> previously acquired
>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>> + */
>>>>>>>> +static inline void
>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>> +{
>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>> +                             enum dma_resv_usage
>>>>>>>> private_usage,
>>>>>>>> +                             enum dma_resv_usage
>>>>>>>> extobj_usage);
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> + * drm_gpuvm_exec_resv_add_fence()
>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>> + * @fence: fence to add
>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>> + *
>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>> + */
>>>>>>>> +static inline void
>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>>> *vm_exec,
>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>> +                             enum dma_resv_usage
>>>>>>>> private_usage,
>>>>>>>> +                             enum dma_resv_usage
>>>>>>>> extobj_usage)
>>>>>>>> +{
>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec,
>>>>>>>> fence,
>>>>>>>> +                                private_usage,
>>>>>>>> extobj_usage);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>     /**
>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>> &drm_gpuvm and
>>>>>>>>      * &drm_gem_object combination
>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>                           * gpuva list.
>>>>>>>>                           */
>>>>>>>>                          struct list_head gem;
>>>>>>>> +
>>>>>>>> +                       /**
>>>>>>>> +                        * @extobj: List entry to attach to
>>>>>>>> the &drm_gpuvms
>>>>>>>> +                        * extobj list.
>>>>>>>> +                        */
>>>>>>>> +                       struct list_head extobj;
>>>>>>>> +
>>>>>>>> +                       /**
>>>>>>>> +                        * @evict: List entry to attach to
>>>>>>>> the &drm_gpuvms evict
>>>>>>>> +                        * list.
>>>>>>>> +                        */
>>>>>>>> +                       struct list_head evict;
>>>>>>>>                  } entry;
>>>>>>>>          } list;
>>>>>>>>     };
>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
>>>>>>>> evict);
>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>> +
>>>>>>>>     /**
>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a
>>>>>>>> list of &drm_gpuva
>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>> iteration step
>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>           * used.
>>>>>>>>           */
>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void
>>>>>>>> *priv);
>>>>>>>> +
>>>>>>>> +       /**
>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>> +        *
>>>>>>>> +        * Drivers receive this callback for every evicted
>>>>>>>> &drm_gem_object being
>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>> +        *
>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>> specific variant of
>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>> +        */
>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>     };
>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
> 
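
Putting the helpers from the patch above together, a driver's submission path
could look roughly like the following (the function name and the chosen dma-resv
usages are just an example, error handling trimmed):

    static int my_submit(struct drm_gpuvm *gpuvm, struct dma_fence *job_fence)
    {
            struct drm_gpuvm_exec vm_exec = {
                    .vm = gpuvm,
                    /* .extra.fn / .extra.priv could lock further objects */
            };
            int ret;

            ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
            if (ret)
                    return ret;

            ret = drm_gpuvm_validate(gpuvm);        /* revalidate evicted BOs */
            if (ret)
                    goto out_unlock;

            /* ... push the job to the scheduler ... */

            drm_gpuvm_exec_resv_add_fence(&vm_exec, job_fence,
                                          DMA_RESV_USAGE_BOOKKEEP,
                                          DMA_RESV_USAGE_WRITE);
    out_unlock:
            drm_gpuvm_exec_unlock(&vm_exec);
            return ret;
    }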


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 14:29                   ` Thomas Hellström
@ 2023-09-13 15:17                     ` Boris Brezillon
  0 siblings, 0 replies; 77+ messages in thread
From: Boris Brezillon @ 2023-09-13 15:17 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: Dave Airlie, Danilo Krummrich, daniel, matthew.brost,
	sarah.walker, donald.robson, christian.koenig, faith.ekstrand,
	dri-devel, nouveau, linux-kernel

On Wed, 13 Sep 2023 16:29:30 +0200
Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:

> On 9/13/23 16:01, Boris Brezillon wrote:
> > On Wed, 13 Sep 2023 15:22:56 +0200
> > Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >  
> >> On 9/13/23 13:33, Boris Brezillon wrote:  
> >>> On Wed, 13 Sep 2023 12:39:01 +0200
> >>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >>>     
> >>>> Hi,
> >>>>
> >>>> On 9/13/23 09:19, Boris Brezillon wrote:  
> >>>>> On Wed, 13 Sep 2023 17:05:42 +1000
> >>>>> Dave Airlie <airlied@gmail.com> wrote:
> >>>>>        
> >>>>>> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
> >>>>>> <boris.brezillon@collabora.com> wrote:  
> >>>>>>> On Tue, 12 Sep 2023 18:20:32 +0200
> >>>>>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >>>>>>>           
> >>>>>>>>> +/**
> >>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
> >>>>>>>>> + * @__gpuvm: The GPU VM
> >>>>>>>>> + * @__list_name: The name of the list we're iterating on
> >>>>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
> >>>>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> >>>>>>>>> + *
> >>>>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
> >>>>>>>>> + * iterator releases the lock immediately after picking the first element from
> >>>>>>>>> + * the list, so list insertion and deletion can happen concurrently.  
> >>>>>>>> Are the list spinlocks needed for that async state update from within
> >>>>>>>> the dma-fence critical section we've discussed previously?  
> >>>>>>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
> >>>>>>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
> >>>>>>> get that Xe and Nouveau don't need that because they update the VM
> >>>>>>> state early (in the ioctl path), but I keep thinking this will hurt us
> >>>>>>> if we don't think it through from the beginning, because once you've
> >>>>>>> set this logic to depend only on resv locks, it will be pretty hard to
> >>>>>>> get back to a solution which lets synchronous VM_BINDs take precedence
> >>>>>>> on asynchronous request, and, with vkQueueBindSparse() passing external
> >>>>>>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
> >>>>>>> take a long time to get your synchronous VM_BIND executed...  
> >>>> So this would boil down to either (possibly opt-in) keeping the spinlock
> >>>> approach or pushing the unlink out to a wq then?  
> >>> Deferred _unlink() would not be an issue, since I already defer the
> >>> drm_gpuva destruction to a wq, it would just be a matter of moving the
> >>> _unlink() call there as well. But _link() also takes the GEM gpuva list
> >>> lock, and that one is bit tricky, in that sm_map() can trigger 2 more
> >>> _link() calls for the prev/next mappings, which we can't guess until we
> >>> get to execute the VM update. If we mandate the use of the GEM resv
> >>> lock, that simply means async VM updates (AKA calling
> >>> drm_gpuvm_sm_[un]map()) are not an option. And if this is what everyone
> >>> agrees on, then I'd like the APIs that make this sort of async VM
> >>> update possible (drm_gpuvm_sm_[un]map(), the drm_gpuvm_ops::sm_step*
> >>> methods, and probably other things) to be dropped, so we don't make it
> >>> look like it's something we support.
> >>>     
> >>>> BTW, as also asked in a reply to Danilo, how do you call unlink from
> >>>> run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?  
> >>> _unlink() makes sure the GEM gpuva list lock is taken, but this can be
> >>> a custom lock (see drm_gem_gpuva_set_lock()). In panthor we have
> >>> panthor_gem_object::gpuva_list_lock that's dedicated to the gpuva list
> >>> protection. We make sure we never take this lock while allocating
> >>> memory to guarantee the dma-signalling path can't deadlock.
> >>>     
> >>>>>>>           
> >>>>>> btw what is the use case for this? do we have actual vulkan
> >>>>>> applications we know will have problems here?  
> >>>>> I don't, but I think that's a concern Faith raised at some point (dates
> >>>>> back from when I was reading threads describing how VM_BIND on i915
> >>>>> should work, and I was clearly discovering this whole VM_BIND thing at
> >>>>> that time, so maybe I misunderstood).
> >>>>>        
> >>>>>> it feels like a bit of premature optimisation, but maybe we have use cases.  
> >>>>> Might be, but that's the sort of thing that would put us in a corner if
> >>>>> we don't have a plan for when the needs arise. Besides, if we don't
> >>>>> want to support that case because it's too complicated, I'd recommend
> >>>>> dropping all the drm_gpuvm APIs that let people think this mode is
> >>>>> valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
> >>>>> drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
> >>>>> confusion.  
> >>>> Xe allows bypassing the bind-queue with another bind-queue, but to
> >>>> completely avoid dependencies between queues the Operations may not
> >>>> overlap.  
> >>> So, you check the VM state with some VM lock held (would be the VM resv
> >>> in my case), and if the mapping is new (no overlaps with pre-existing
> >>> mappings), you queue it to the fast-track/sync-VM_BIND queue. What would
> >>> be missing I guess is a way to know if the mapping is active (MMU has
> >>> been updated) or pending (MMU update queued to the bind-queue), so I can
> >>> fast-track mapping/unmapping of active mappings. This would leave
> >>> overlapping sync/async VM updates, which can't happen in practice
> >>> unless userspace is doing something wrong (sparse bindings always go
> >>> through vkQueueBindSparse).  
> >> User-space is allowed to create new bind queues at will, and they
> >> execute independently save for range overlaps.  
> > I've limited panthor to just one bind-queue that's automatically
> > created when the VM is created. I guess letting userspace create more
> > than one queue is doable, but we'd still be serializing VM
> > operations anyway and that complicates the whole thing when concurrent
> > operations to the same VM region happen from different bind queues, so I
> > figured it'd be simpler to expose just one queue.
> >  
> >> And the overlapping granularity depends very much on the detail of the
> >> range tracking.
> >> We drafted this fenced range utility
> >>
> >> https://gitlab.freedesktop.org/drm/xe/kernel/-/merge_requests/353
> >>
> >> That tracks active ranges that remove themselves when the attached fence
> >> signals. Not sure if we ended up using it, though. A new binding would
> >> scan this utility for dma-fences it needs to depend upon.  
> > Sounds like implicit deps on VM ranges :D. I'll have a look, thanks
> > for the pointer!
> >  
> >> Ranges in Xe
> >> are actually page-table modification ranges, so can exceed the actual VA
> >> range in some situations, but if you can build page-table structures
> >> async the granularity indeed becomes better.  
> > The granularity in Mali is 4k, and we don't build the page table struct
> > asynchronously, we just update the page table tree from the CPU,
> > holding a VM lock to serialize such operations (that's done
> > synchronously in the ::run_job() path, or from the ioctl in case of a
> > sync-VM_BIND).  
> 
> OK, yeah we have something similar although we build the page-table tree 
> in the IOCTL and update entries using the GPU unless there are no 
> dependencies,

We can't do that since we defer pgtable updates to the io-pgtable
framework, which handles the pgtable tree update (we can't pass a
pre-built version of the tree). What we did though, is extend the
framework so we're in control of the page table allocations. In order
to avoid allocations in the dma-signalling path, we pre-allocate page
tables for the range we want to map (or unmap), and then pick from these
pre-allocated pages when the io-pgtable framework asks us to allocate a
page table. Until now, we were provisioning for the worst case scenario
(all levels of the page table tree are missing, except for the root
level, which is allocated when the io-pgtable is instantiated). With
per-range operation tracking, we could potentially avoid this
over-provisioning by checking the queue of operations touching a
specific range, and making sure unmaps don't tear down page tables if we
know they're going to be needed later on (we just want to update PTEs
in that case).
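
To make the worst-case provisioning above a bit more concrete, the math
boils down to something like the kernel-style sketch below (hypothetical
helper and constants, assuming a 4k granule with 512 entries per level
and a root level that is already allocated - not the actual io-pgtable
or panthor code):

static unsigned int pt_prealloc_count(u64 iova, u64 size, int levels)
{
	/* VA range covered by a single last-level table: 512 * 4k. */
	u64 span = 512ULL * 4096;
	unsigned int count = 0;
	int lvl;

	/* Root level is always present; provision everything below it. */
	for (lvl = levels - 1; lvl > 0; lvl--) {
		u64 start = iova & ~(span - 1);
		u64 end = (iova + size + span - 1) & ~(span - 1);

		/* One table per span overlapping [iova, iova + size). */
		count += (end - start) / span;

		/* Each level up covers 512 times more VA per table. */
		span *= 512;
	}

	return count;
}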

> in which case we do it sync in the ioctl as well.
> 
> The drawback here is that if one op adds a pagetable tree node near the 
> root (spanning say 1G) and the next op adds an entry to that node, the 
> granularity can become pretty large...

I know nothing about the intel GPU MMU page table format, but I guess
you're talking about adding one or more levels to the pgtable tree
because some big physically contiguous mapping is split, which indeed
might require allocating page tables and filling a lot of PTEs. This
change of granularity indeed has a cost, and avoiding repeated changes
would indeed be preferable, but it's not the end of the world for
Panthor, where we only use 4k and 2M granules (only the last level is
optional in our implementation).
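
To illustrate what that granularity change costs on our end, the
expensive case is basically replacing a 2M block entry with a full
last-level table of 4k PTEs, along the lines of the simplified sketch
below (made-up names and PTE layout, not the actual io-pgtable code):

#define PTES_PER_TABLE	512
#define SMALL_PAGE	4096ULL
#define PTE_VALID	(1ULL << 0)

/* Fill a (pre-allocated) last-level table so it maps the same physical
 * range the 2M block entry used to cover, 4k at a time. */
static void split_block_to_pages(u64 *table, u64 block_pa, u64 attrs)
{
	unsigned int i;

	for (i = 0; i < PTES_PER_TABLE; i++)
		table[i] = (block_pa + i * SMALL_PAGE) | attrs | PTE_VALID;

	/* The parent entry is then rewritten to point at 'table',
	 * followed by the required TLB invalidation. */
}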

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 15:13                 ` Thomas Hellström
@ 2023-09-13 15:26                   ` Christian König
  0 siblings, 0 replies; 77+ messages in thread
From: Christian König @ 2023-09-13 15:26 UTC (permalink / raw)
  To: Thomas Hellström, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

Am 13.09.23 um 17:13 schrieb Thomas Hellström:
> Hi Christian
>
> On 9/13/23 16:26, Christian König wrote:
>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>> As mentioned in a different mail thread, the reply is based on the 
>>> assumption
>>> that we don't support anything other than GPUVM updates from the IOCTL.
>>
>> I think that this assumption is incorrect.
>>
>> Vulkan is just one specific use case, but this here should probably
>> be able to handle other use cases as well.
>>
>> Especially with HMM you get the requirement that you need to be able 
>> to invalidate GPUVM mappings without grabbing a reservation lock.
>
> Are you referring to the MMU range invalidation notifiers here?

Yes, but you need to ping Felix and Philip for the details.

>
>>
>> See what the eviction lock in amdgpu is doing for example.
>
> IMO the statement regarding GPUVM updates from the IOCTL mostly refers 
> to the need to protect the evicted- and extobj lists with additional 
> spinlocks. Supporting userptr and faulting will ofc require additional 
> locks / locking mechanisms. But this code doesn't do that yet. Is your 
> concern that these particular spinlocks for these lists are indeed 
> needed?

More or less yes. My main concern is that both Dave and Danilo mentioned 
that they work with the assumption that they only need to handle 
Vulkan/IOCTL based use cases.

Regards,
Christian.

>
> /Thomas
>
>
>>
>> Regards,
>> Christian.
>>
>>>
>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström wrote:
>>>> Hi!
>>>>
>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström wrote:
>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
>>>>>>>> Hi, Danilo,
>>>>>>>>
>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>> So far the DRM GPUVA manager offers common infrastructure to
>>>>>>>>> track GPU VA
>>>>>>>>> allocations and mappings, generically connect GPU VA mappings
>>>>>>>>> to their
>>>>>>>>> backing buffers and perform more complex mapping operations
>>>>>>>>> on the GPU VA
>>>>>>>>> space.
>>>>>>>>>
>>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>>> drivers, which
>>>>>>>>> can potentially be generalized in order to make the DRM GPUVA
>>>>>>>>> manager
>>>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>>>> this patch aims
>>>>>>>>> at generalizing the following elements.
>>>>>>>>>
>>>>>>>>> 1) Provide a common dma-resv for GEM objects not being used
>>>>>>>>> outside of
>>>>>>>>>       this GPU-VM.
>>>>>>>>>
>>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>>> which are
>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>
>>>>>>>>> 3) Provide functions to efficiently lock all GEM objects dma-
>>>>>>>>> resv the
>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>
>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>> contains mappings
>>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>>> accelerated.
>>>>>>>>>
>>>>>>>>> 5) Provide some convenience functions for common patterns.
>>>>>>>>>
>>>>>>>>> Rather than being designed as a "framework", the target is to
>>>>>>>>> make all
>>>>>>>>> features appear as a collection of optional helper functions,
>>>>>>>>> such that
>>>>>>>>> drivers are free to make use of the DRM GPUVA managers basic
>>>>>>>>> functionality and opt-in for other features without setting
>>>>>>>>> any feature
>>>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>>>
>>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>>> locking for drivers
>>>>>>>>> updating the GPU VA space within the fence signalling path.
>>>>>>>>>
>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>> ---
>>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an existing
>>>>>>>>> instance of this
>>>>>>>>>      * particular combination. If not existent a new instance
>>>>>>>>> is created and linked
>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>> + *
>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and
>>>>>>>>> evicted objects. Those
>>>>>>>>> + * lists are maintained in order to accelerate locking of
>>>>>>>>> dma-resv locks and
>>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For
>>>>>>>>> instance, all
>>>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be
>>>>>>>>> locked by calling
>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>>>> also possible to lock
>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>> corresponding parameters to
>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec
>>>>>>>>> loop while making
>>>>>>>>> + * use of helper functions such as drm_gpuvm_prepare_range()
>>>>>>>>> or
>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>> + *
>>>>>>>>> + * Every bound &drm_gem_object is treated as external object
>>>>>>>>> when its &dma_resv
>>>>>>>>> + * structure is different than the &drm_gpuvm's common
>>>>>>>>> &dma_resv structure.
>>>>>>>>>      */
>>>>>>>>>     /**
>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for the same
>>>>>>>>> &drm_gpuvm and
>>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>>> creations and destructions
>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances unique.
>>>>>>>>> + *
>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of external and
>>>>>>>>> evicted objects are
>>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>>> iteration internally.
>>>>>>>>> + *
>>>>>>>>> + * However, drivers still need to protect concurrent
>>>>>>>>> calls to functions
>>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function contains
>>>>>>>>> a particular
>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>> + *
>>>>>>>>> + * Functions adding or removing entries from those lists,
>>>>>>>>> such as
>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be
>>>>>>>>> called with external
>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>> corresponding list to be
>>>>>>>>> (safely) modified while potentially being iterated by
>>>>>>>>> other API functions.
>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>      */
>>>>>>>>>     /**
>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>      *   }
>>>>>>>>>      */
>>>>>>>>> +/**
>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>> already iterated items
>>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>>> drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>> + *
>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>> Lockless as in, the
>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>> first element from
>>>>>>>>> + * the list, so list insertion and deletion can happen
>>>>>>>>> concurrently.
>>>>>>>> Are the list spinlocks needed for that async state update from
>>>>>>>> within the
>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>
>>>>>>>> Otherwise it should be sufficient to protect the lists with the
>>>>>>>> gpuvm's resv
>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>
>>>>>>>> If those spinlocks are still needed in some situations, perhaps
>>>>>>>> could we
>>>>>>>> have an option to set them to NULL (Like IIRC the maple tree
>>>>>>>> allows for)?
>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>> holding only the dma-resv lock from the BO this function gets
>>>>>>> called for. Hence,
>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() calls with
>>>>>>> different BOs.
>>>>>> No. Only if you try to add external objects to the vm's evict list
>>>>>> from
>>>>>> within the evict code. That's not necessary since you loop through
>>>>>> all
>>>>>> external objects anyway when locking them so an "evicted" bool in
>>>>>> the vm_bo,
>>>>>> protected by the bo resv would be sufficient. The extobj locking
>>>>>> loop can
>>>>>> then add the bo to the evicted list.
>>>>> And validate() can remove it while still holding all dma-resv locks,
>>>>> neat!
>>>>> However, what if two tasks are trying to lock the VA space
>>>>> concurrently? What
>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>> drm_gpuva_unlink()?
>>>>> Are we guaranteed that at this point of time the drm_gpuvm_bo is not
>>>>> on the
>>>>> evicted list? Because otherwise we would call drm_gpuvm_bo_destroy()
>>>>> with the
>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>> drm_gpuvm_bo_destroy()
>>>>> might drop the last reference to the drm_gem_object and hence we'd
>>>>> potentially
>>>>> free the dma-resv lock while holding it, at least if it's an external
>>>>> object.
>>>> Easiest way in this scheme is to think of the lists as being protected
>>>> by the vm's resv lock. That means anybody calling unlink() must also
>>>> hold the vm's resv lock. (Which is OK from an UAF point of view, but
>>>> perhaps not from a locking inversion POV from an async list update).
>>> This would mean that on unlink() we'd need to hold the VM's resv 
>>> lock and the
>>> corresponding GEM's resv lock (in case they're not the same anyways) 
>>> because the
>>> VM's resv lock would protect the external / evicted object lists and 
>>> the GEM
>>> objects resv lock protects the GEM's list of drm_gpuvm_bos and the
>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>
>>>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>>>> really would not
>>>>>>> like to add even more complexity just to get the spinlock out of
>>>>>>> the way in case
>>>>>>> the driver already has an outer lock protecting this path.
>>>>>> I must disagree here. These spinlocks and atomic operations are
>>>>>> pretty
>>>>>> costly and as discussed earlier this type of locking was the reason
>>>>>> (at
>>>>>> least according to the commit message) that made Christian drop the
>>>>>> XArray
>>>>>> use in drm_exec for the same set of objects: "The locking overhead
>>>>>> is
>>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>>> complexity and a
>>>>>> single wide lock following the drm locking guidelines set out by
>>>>>> Daniel and
>>>>>> David should really be the default choice with an opt-in for a
>>>>>> spinlock if
>>>>>> needed for async and pushing out to a wq is not an option.
>>>>> For the external object list an outer lock would work as long as it's
>>>>> not the
>>>>> dma-resv lock of the corresponding GEM object, since here we actually
>>>>> need to
>>>>> remove the list entry from the external object list on
>>>>> drm_gpuvm_bo_destroy().
>>>>> It's just a bit weird design wise that drivers would need to take
>>>>> this outer
>>>>> lock on:
>>>>>
>>>>> - drm_gpuvm_bo_extobj_add()
>>>>> - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>> drm_gpuvm_bo_put())
>>>>> - drm_gpuvm_exec_lock()
>>>>> - drm_gpuvm_exec_lock_array()
>>>>> - drm_gpuvm_prepare_range()
>>>>>
>>>>> Given that it seems reasonable to do all the required locking
>>>>> internally.
>>>> From a design POV, there has been a clear direction in XE to make
>>>> things similar to mmap() / munmap(), so this outer lock, which in 
>>>> Xe is
>>>> an rwsem, is used in a similar way as the mmap_lock. It's protecting
>>>> the page-table structures and vma rb tree, the userptr structures and
>>>> the extobj list. Basically it's taken early in the exec IOCTL, the
>>>> VM_BIND ioctl, the compute rebind worker and the pagefault handler, so
>>>> all of the above are just asserting that it is taken in the correct
>>>> mode.
>>>>
>>>> But strictly with this scheme one could also use the vm's dma_resv for
>>>> the extobj list since with drm_exec, it's locked before traversing the
>>>> list.
>>>>
>>>> The whole point of this scheme is to rely on locks that you already 
>>>> are
>>>> supposed to be holding for various reasons and is simple to 
>>>> comprehend.
>>> I don't agree that we're supposed to hold the VM's resv lock anyways 
>>> for
>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm 
>>> fine using it
>>> for that purpose nevertheless.
>>>
>>>>> In order to at least place lockdep checks, the driver would need to
>>>>> supply the
>>>>> corresponding lock's lockdep_map, because the GPUVM otherwise doesn't
>>>>> know about
>>>>> the lock.
>>>> Yes, that sounds reasonable. One lockdep map per list.
>>> I'd really like to avoid that, especially now that everything got 
>>> simpler. We
>>> should define the actual locks to take instead.
>>>
>>>>> Out of curiosity, what is the overhead of a spin_lock() that doesn't
>>>>> need to
>>>>> spin?
>>>> I guess it's hard to tell exactly, but it is much lower on modern x86
>>>> than what it used to be. Not sure about ARM, which is the other
>>>> architecture important to us. I figure if there is little cache-line
>>>> bouncing the main overhead comes from the implied barriers.
>>>>
>>>>>> A pretty simple way that would not add much code would be
>>>>>>
>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>>                                  spinlock_t *lock)
>>>>>> {
>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>          spin_lock(lock);
>>>>>> }
>>>>>>
>>>>>>>> For such drivers, that would require anybody calling unlink to
>>>>>>>> hold the vm's
>>>>>>>> resv, though.
>>>>>>> In V4 I want to go back to having a dedicated lock for the GEMs
>>>>>>> gpuva list (or
>>>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>>>> lock for that
>>>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>>>> otherwise wouldn't
>>>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>>>> was referring to
>>>>>>> earlier.
>>>>>> Yeah, I can see the need for a dedicated lock for the GEM's gpuva
>>>>>> list, but
>>>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>>>> problem. We
>>>>>> may free the object and a pointer to the vm's resv during unlink
>>>>>> but we
>>>>>> don't free the vm's resv.  It'd be a matter of ensuring that any
>>>>>> calls to
>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>> Drivers calling unlink() from the fence signaling path can't use the
>>>>> VM's
>>>>> dma-resv lock.
>>>> Yes, that made me a bit curious because in the current version the 
>>>> code
>>>> required the object's dma_resv for unlink() which can't be grabbed
>>>> either from the fence signaling path. So are there any drivers 
>>>> actually
>>>> wanting to do that? If so, they will either need to resort to the
>>>> current spinlock solution or they will need to call unlink from a
>>>> workqueue item.
>>> As Boris already mentioned we have the dma-resv lock by default or a 
>>> driver
>>> specific GEM gpuva lock as opt-in. Now, we can get rid of the latter.
>>>
>>>>> Also, what if the object is an external object? We can't use the VM's
>>>>> dma-resv
>>>>> lock here.
>>>> Why? Typically (sync) unlink is only ever called from an unbind-like
>>>> operation where it should be trivial to grab the vm's resv. Or, for
>>>> that matter any outer lock protecting the extobj list. Rule would be
>>>> the drm_gpuvm_bo::entry::extobj  and drm_gpuvm_bo::entry::evict would
>>>> be protected by either the vm's dma_resv (or possibly an outer lock in
>>>> the case of the extobj list).
>>> Outer lock wouldn't have been working for updates in the async path, 
>>> but
>>> shouldn't be relevant anymore. We could use the VM's resv for that.
>>>
>>>>>   And we can't have the GEM objs dma-resv lock held when calling
>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>>>> refcount drops
>>>>> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might
>>>>> drop the
>>>>> last reference of the GEM object.
>>>> Yes, but this is a different problem as to what exactly protects
>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per bo 
>>>> list
>>>> lock, or if we want to keep the bo's dma_resv we need to ensure that
>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount (I 
>>>> know
>>>> Boris didn't like that, but requiring an explicit refcount for a
>>>> pointer you dereference unless you're under a lock that ensures 
>>>> keeping
>>>> the object alive is pretty much required?) But anyway for the
>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal 
>>>> spinlock)
>>>> I don't have a strong preference.
>>> We can keep the GEM objects dma-resv lock, however as mentioned above
>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both the 
>>> VM's resv lock
>>> and the GEM's resv lock in case they differ.
>>>
>>>>>   All those problems go away with a dedicated
>>>>> GEM gpuva list lock.
>>>> I don't think these are real problems.
>>>> With the exception of the eviction list "trick" where we currently have
>>>> slightly different approach to collect external bos needing rebinding,
>>>> we have this working fine.
>>>>
>>>> TBH I think pretty much the only situation where the spinlock is 
>>>> needed
>>>> is for async updates of these lists, unless a wq item can be used for
>>>> that, but it doesn't really seem like the current code allows for such
>>>> updates anyway? It complicates the code a lot, adds overhead and also
>>>> adds the requirement for refcounting during list traversal.
>>>>
>>>> /Thomas
>>>>
>>>>>> /Thomas
>>>>>>
>>>>>>
>>>>>>>> It seems that with that also the refcount could be make non-
>>>>>>>> atomic.
>>>>>>>>
>>>>>>>> All in the spirit of the drm locking guidelines "use big locks
>>>>>>>> when
>>>>>>>> possible".
>>>>>>>> Lower level locks only when necessary for performance or
>>>>>>>> locking inversion?
>>>>>>>>
>>>>>>>> /Thomas
>>>>>>>>
>>>>>>>>
>>>>>>>>> + *
>>>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>>>> local list, so removal
>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>> iterating the list.
>>>>>>>>> + */
>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo) \
>>>>>>>>> +       ({ \
>>>>>>>>> +               struct drm_gpuvm_bo *__vm_bo; \
>>>>>>>>> + \
>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo); \
>>>>>>>>> + \
>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock); \
>>>>>>>>> +               while (!list_empty(&(__gpuvm)->__list_name.list)) { \
>>>>>>>>> +                       __vm_bo = list_first_entry(&(__gpuvm)->__list_name.list, \
>>>>>>>>> +                                                  struct drm_gpuvm_bo, \
>>>>>>>>> +                                                  list.entry.__list_name); \
>>>>>>>>> +                       if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) { \
>>>>>>>>> +                               list_move_tail(&(__vm_bo)->list.entry.__list_name, \
>>>>>>>>> +                                              __local_list); \
>>>>>>>>> +                               break; \
>>>>>>>>> +                       } else { \
>>>>>>>>> +                               list_del_init(&(__vm_bo)->list.entry.__list_name); \
>>>>>>>>> +                               __vm_bo = NULL; \
>>>>>>>>> +                       } \
>>>>>>>>> +               } \
>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock); \
>>>>>>>>> + \
>>>>>>>>> +               __vm_bo; \
>>>>>>>>> +       })
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>>>>>>>> + *
>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>> Lockless as in, the
>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>> first element from the
>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>> concurrently.
>>>>>>>>> + *
>>>>>>>>> + * Typical use:
>>>>>>>>> + *
>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>> + *
>>>>>>>>> + *     ret = 0;
>>>>>>>>> + *     drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>,
>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>> + *             ret = do_something_with_vm_bo(..., vm_bo);
>>>>>>>>> + *             if (ret)
>>>>>>>>> + *                     break;
>>>>>>>>> + *     }
>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>> + *     drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>,
>>>>>>>>> &my_local_list);
>>>>>>>>> + *
>>>>>>>>> + *
>>>>>>>>> + * Only used for internal list iterations, not meant to be
>>>>>>>>> exposed to the outside
>>>>>>>>> + * world.
>>>>>>>>> + */
>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo) \
>>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name, \
>>>>>>>>> +                                               __local_list, NULL); \
>>>>>>>>> +            __vm_bo; \
>>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name, \
>>>>>>>>> +                                               __local_list, __vm_bo)) \
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to their
>>>>>>>>> original list
>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>> already iterated items
>>>>>>>>> + *
>>>>>>>>> + * When we're done iterating a vm_bo list, we should call
>>>>>>>>> restore_vm_bo_list()
>>>>>>>>> + * to restore the original state and let new iterations take
>>>>>>>>> place.
>>>>>>>>> + */
>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list) \
>>>>>>>>> +       do { \
>>>>>>>>> +               /* Merge back the two lists, moving local list elements to the \
>>>>>>>>> +                * head to preserve previous ordering, in case it matters. \
>>>>>>>>> +                */ \
>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock); \
>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)->__list_name.list); \
>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock); \
>>>>>>>>> +       } while (0)
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given
>>>>>>>>> list
>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>> + *
>>>>>>>>> + * Inserts the given @__vm_bo into the list specified by
>>>>>>>>> @__list_name and
>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>> + */
>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name) \
>>>>>>>>> +       do { \
>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock); \
>>>>>>>>> +               if (list_empty(&(__vm_bo)->list.entry.__list_name)) \
>>>>>>>>> +                       list_add_tail(&(__vm_bo)->list.entry.__list_name, \
>>>>>>>>> +                                     &(__vm_bo)->vm->__list_name.list); \
>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock); \
>>>>>>>>> +       } while (0)
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given
>>>>>>>>> list
>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>> + *
>>>>>>>>> + * Removes the given @__vm_bo from the list specified by
>>>>>>>>> @__list_name and
>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>> + */
>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name) \
>>>>>>>>> +       do { \
>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock); \
>>>>>>>>> +               if (!list_empty(&(__vm_bo)->list.entry.__list_name)) \
>>>>>>>>> +                       list_del_init(&(__vm_bo)->list.entry.__list_name); \
>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock); \
>>>>>>>>> +       } while (0)
>>>>>>>>> +
>>>>>>>>> +static int __must_check
>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>> +
>>>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
>>>>>>>>> struct drm_device *drm,
>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>> +
>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>> +
>>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
>>>>>>>>> *gpuvm)
>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>               "GPUVA tree is not empty, potentially leaking
>>>>>>>>> memory.\n");
>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
>>>>>>>>> should be empty.\n");
>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
>>>>>>>>> should be empty.\n");
>>>>>>>>> +
>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>     }
>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>> + *
>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the
>>>>>>>>> given
>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>> + *
>>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>>> responsibility to call
>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>> + *
>>>>>>>>> + * Note: This function is safe against concurrent insertion
>>>>>>>>> and removal of
>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>> concurrent usage itself.
>>>>>>>>> + *
>>>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>>>> either an outer VM lock
>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this function
>>>>>>>>> within the
>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's
>>>>>>>>> dma-resv lock ensures
>>>>>>>>> + * mutual exclusion.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +int
>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>>> +                         unsigned int num_fences)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>> +       int ret = 0;
>>>>>>>>> +
>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>>> vm_bo) {
>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, vm_bo->obj,
>>>>>>>>> num_fences);
>>>>>>>>> +               if (ret)
>>>>>>>>> +                       break;
>>>>>>>>> +       }
>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>> +
>>>>>>>>> +       return ret;
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped within
>>>>>>>>> a given range
>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>> + *
>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>>>> mapped between @addr
>>>>>>>>> + * and @addr + @range.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +int
>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>>> drm_exec *exec,
>>>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>>>> num_fences)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>> +       int ret;
>>>>>>>>> +
>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
>>>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>>>> +
>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>> num_fences);
>>>>>>>>> +               if (ret)
>>>>>>>>> +                       return ret;
>>>>>>>>> +       }
>>>>>>>>> +
>>>>>>>>> +       return 0;
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>>> associated BOs
>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>> + *
>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>> given
>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>> + *
>>>>>>>>> + * Additionally, when calling this function with struct
>>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>>> + * being set the driver receives the given @fn callback to
>>>>>>>>> lock additional
>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec instance.
>>>>>>>>> Typically, drivers
>>>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>>>> callback.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +int
>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>> +                   bool interruptible)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>> +       uint32_t flags;
>>>>>>>>> +       int ret;
>>>>>>>>> +
>>>>>>>>> +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>> +
>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>> +
>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>>> num_fences);
>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>> +               if (ret)
>>>>>>>>> +                       goto err;
>>>>>>>>> +
>>>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>>> num_fences);
>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>> +               if (ret)
>>>>>>>>> +                       goto err;
>>>>>>>>> +
>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>>> num_fences);
>>>>>>>>> +                       drm_exec_retry_on_contention(exec);
>>>>>>>>> +                       if (ret)
>>>>>>>>> +                               goto err;
>>>>>>>>> +               }
>>>>>>>>> +       }
>>>>>>>>> +
>>>>>>>>> +       return 0;
>>>>>>>>> +
>>>>>>>>> +err:
>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>> +       return ret;
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>> +
>>>>>>>>> +static int
>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int num_fences)
>>>>>>>>> +{
>>>>>>>>> +       struct {
>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>> +
>>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args->objs,
>>>>>>>>> +                                     args->num_objs, num_fences);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all
>>>>>>>>> associated BOs
>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>> + * @num_objs: the number of additional &drm_gem_objects to
>>>>>>>>> lock
>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>> + *
>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>> given &drm_gpuvm
>>>>>>>>> + * contains mappings of, plus the ones given through @objs.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +int
>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>>> +                         unsigned int num_objs,
>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>> +                         bool interruptible)
>>>>>>>>> +{
>>>>>>>>> +       struct {
>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>> +       } args;
>>>>>>>>> +
>>>>>>>>> +       args.objs = objs;
>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>> +
>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>> +
>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>>> interruptible);
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>>> within a given range
>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>> + *
>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>>> mapped between @addr and
>>>>>>>>> + * @addr + @range.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +int
>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>> +                         bool interruptible)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>> +       uint32_t flags;
>>>>>>>>> +       int ret;
>>>>>>>>> +
>>>>>>>>> +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>> +
>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>> +
>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec,
>>>>>>>>> addr, range,
>>>>>>>>> + num_fences);
>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>> +               if (ret)
>>>>>>>>> +                       goto err;
>>>>>>>>> +       }
>>>>>>>>> +
>>>>>>>>> +       return ret;
>>>>>>>>> +
>>>>>>>>> +err:
>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>> +       return ret;
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as evicted
>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>> + *
>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>>>> evicted buffer
>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +int
>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>> +{
>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>> +       int ret = 0;
>>>>>>>>> +
>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>> +
>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>> +               if (ret)
>>>>>>>>> +                       break;
>>>>>>>>> +       }
>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>> +
>>>>>>>>> +       return ret;
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and all
>>>>>>>>> extobj
>>>>>>>>> + * dma-resv
>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>> + * @fence: fence to add
>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>> + */
>>>>>>>>> +void
>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>> +       unsigned long index;
>>>>>>>>> +
>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm, obj) ?
>>>>>>>>> +                                  private_usage : extobj_usage);
>>>>>>>>> +       }
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>> +
>>>>>>>>>     /**
>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of struct
>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>>>> *gpuvm,
>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>> +
>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>          return vm_bo;
>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>> +
>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>> +
>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>>>>>>>      *
>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>> + *
>>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is
>>>>>>>>> destroyed, which
>>>>>>>>> + * includes removing it from the GEMs gpuva list. Hence, if
>>>>>>>>> a call to this
>>>>>>>>> + * function can potentially let the reference count drop to zero,
>>>>>>>>> the caller must
>>>>>>>>> + * hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>>>      */
>>>>>>>>>     void
>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
>>>>>>>>> *vm_bo)
>>>>>>>>>     }
>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>> +static int __must_check
>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>> +{
>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>     }
>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its
>>>>>>>>> &drm_gpuvm's
>>>>>>>>> + * extobj list
>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's the
>>>>>>>>> extobj list.
>>>>>>>>> + *
>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if
>>>>>>>>> not on the list
>>>>>>>>> + * already and if the corresponding &drm_gem_object is an
>>>>>>>>> external object,
>>>>>>>>> + * actually.
>>>>>>>>> + */
>>>>>>>>> +void
>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>> +
>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to
>>>>>>>>> / from a
>>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>> + *
>>>>>>>>> + * Adds a &drm_gem_object to or removes it from all
>>>>>>>>> &drm_gpuvms evicted
>>>>>>>>> + * list containing a mapping of this &drm_gem_object.
>>>>>>>>> + */
>>>>>>>>> +void
>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>> +
>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>> +               if (evict)
>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, evict);
>>>>>>>>> +               else
>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, evict);
>>>>>>>>> +       }
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>> +
>>>>>>>>>     static int
>>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>      */
>>>>>>>>>     #include <linux/list.h>
>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>           * space
>>>>>>>>>           */
>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>> +
>>>>>>>>> +       /**
>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>> +        */
>>>>>>>>> +       struct {
>>>>>>>>> +               /**
>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>> serving as
>>>>>>>>> +                * external object
>>>>>>>>> +                */
>>>>>>>>> +               struct list_head list;
>>>>>>>>> +
>>>>>>>>> +               /**
>>>>>>>>> +                * @lock: spinlock to protect the extobj list
>>>>>>>>> +                */
>>>>>>>>> +               spinlock_t lock;
>>>>>>>>> +       } extobj;
>>>>>>>>> +
>>>>>>>>> +       /**
>>>>>>>>> +        * @evict: structure holding the evict list and evict
>>>>>>>>> list lock
>>>>>>>>> +        */
>>>>>>>>> +       struct {
>>>>>>>>> +               /**
>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>> currently being
>>>>>>>>> +                * evicted
>>>>>>>>> +                */
>>>>>>>>> +               struct list_head list;
>>>>>>>>> +
>>>>>>>>> +               /**
>>>>>>>>> +                * @lock: spinlock to protect the evict list
>>>>>>>>> +                */
>>>>>>>>> +               spinlock_t lock;
>>>>>>>>> +       } evict;
>>>>>>>>>     };
>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>>> drm_device *drm,
>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>>>> &drm_gem_object is an
>>>>>>>>> + * external object
>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>> + *
>>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv differs
>>>>>>>>> from the
>>>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>>>> + */
>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
>>>>>>>>> *gpuvm,
>>>>>>>>> +                                      struct drm_gem_object
>>>>>>>>> *obj)
>>>>>>>>> +{
>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>     {
>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__)
>>>>>>>>> \
>>>>>>>>>          list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
>>>>>>>>> +/**
>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>>> &drm_exec
>>>>>>>>> + *
>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>> &drm_exec should be.
>>>>>>>>> + *
>>>>>>>>> + * Optionally, @extra can be set in order to lock additional
>>>>>>>>> &drm_gem_objects.
>>>>>>>>> + */
>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>> +       /**
>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>> +        */
>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>> +
>>>>>>>>> +       /**
>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>>>> +        */
>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>> +
>>>>>>>>> +       /**
>>>>>>>>> +        * @extra: Callback and corresponding private data
>>>>>>>>> for the driver to
>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>> +        */
>>>>>>>>> +       struct {
>>>>>>>>> +               /**
>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>> +                */
>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>> +                         unsigned int num_fences);
>>>>>>>>> +
>>>>>>>>> +               /**
>>>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>>>> callback
>>>>>>>>> +                */
>>>>>>>>> +               void *priv;
>>>>>>>>> +       } extra;
>>>>>>>>> +};
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
>>>>>>>>> resv
>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>> + *
>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>>>> &drm_gem_object.
>>>>>>>>> + *
>>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>>> responsibility to call
>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +static inline int
>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>> +{
>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>>> num_fences);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>> +                             unsigned int num_fences);
>>>>>>>>> +
>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>> +                           unsigned int num_fences);
>>>>>>>>> +
>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>> +                       bool interruptible);
>>>>>>>>> +
>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>>> *vm_exec,
>>>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>>>> +                             unsigned int num_objs,
>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>> +                             bool interruptible);
>>>>>>>>> +
>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>>> *vm_exec,
>>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>> +                             bool interruptible);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated
>>>>>>>>> BOs
>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>> + *
>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects
>>>>>>>>> previously acquired
>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>> + */
>>>>>>>>> +static inline void
>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>> +{
>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>> private_usage,
>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>> extobj_usage);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence() - add fence to private and all extobj dma-resv
>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>> + * @fence: fence to add
>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>> + *
>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>> + */
>>>>>>>>> +static inline void
>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>>>> *vm_exec,
>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>> private_usage,
>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>> extobj_usage)
>>>>>>>>> +{
>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec,
>>>>>>>>> fence,
>>>>>>>>> +                                private_usage,
>>>>>>>>> extobj_usage);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>     /**
>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>>> &drm_gpuvm and
>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>                           * gpuva list.
>>>>>>>>>                           */
>>>>>>>>>                          struct list_head gem;
>>>>>>>>> +
>>>>>>>>> +                       /**
>>>>>>>>> +                        * @extobj: List entry to attach to
>>>>>>>>> the &drm_gpuvm's
>>>>>>>>> +                        * extobj list.
>>>>>>>>> +                        */
>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>> +
>>>>>>>>> +                       /**
>>>>>>>>> +                        * @evict: List entry to attach to
>>>>>>>>> the &drm_gpuvms evict
>>>>>>>>> +                        * list.
>>>>>>>>> +                        */
>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>                  } entry;
>>>>>>>>>          } list;
>>>>>>>>>     };
>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
>>>>>>>>> evict);
>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>> +
>>>>>>>>>     /**
>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a
>>>>>>>>> list of &drm_gpuva
>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>>> iteration step
>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>           * used.
>>>>>>>>>           */
>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void
>>>>>>>>> *priv);
>>>>>>>>> +
>>>>>>>>> +       /**
>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>> +        *
>>>>>>>>> +        * Drivers receive this callback for every evicted
>>>>>>>>> &drm_gem_object being
>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>> +        *
>>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>>> specific variant of
>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>> +        */
>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>     };
>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
>>
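For illustration, the intended use of the locking helpers added above is
roughly the following sketch (based on this revision of the series; error
handling trimmed, my_run_job() and the chosen dma-resv usage values are
placeholders, not part of the patch):

    static int my_submit(struct drm_gpuvm *gpuvm)
    {
            struct drm_gpuvm_exec vm_exec = { .vm = gpuvm };
            struct dma_fence *fence;
            int ret;

            /* Locks the VM's common resv plus all external objects. */
            ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
            if (ret)
                    return ret;

            /* Re-validate whatever got evicted since the last submission;
             * calls ops->bo_validate() for each evicted BO. */
            ret = drm_gpuvm_validate(gpuvm);
            if (ret)
                    goto out_unlock;

            fence = my_run_job(gpuvm);      /* placeholder for job submission */

            drm_gpuvm_exec_resv_add_fence(&vm_exec, fence,
                                          DMA_RESV_USAGE_BOOKKEEP,
                                          DMA_RESV_USAGE_WRITE);
            dma_fence_put(fence);

    out_unlock:
            drm_gpuvm_exec_unlock(&vm_exec);
            return ret;
    }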


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 15:15                 ` Danilo Krummrich
@ 2023-09-13 15:33                   ` Christian König
  2023-09-13 15:46                     ` Danilo Krummrich
  0 siblings, 1 reply; 77+ messages in thread
From: Christian König @ 2023-09-13 15:33 UTC (permalink / raw)
  To: Danilo Krummrich, Thomas Hellström
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

Am 13.09.23 um 17:15 schrieb Danilo Krummrich:
> On 9/13/23 16:26, Christian König wrote:
>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>> As mentioned in a different mail thread, the reply is based on the 
>>> assumption
>>> that we don't support anything else than GPUVM updates from the IOCTL.
>>
>> I think that this assumption is incorrect.
>
> Well, more precisely I should have said "don't support GPUVM updates 
> from within
> fence signaling critical sections". And looking at the code, that 
> doesn't seem to be what
> you're doing there.
>
>>
>> Vulkan is just one specific use case, but this here should probably 
>> be able to handle other use cases as well.
>>
>> Especially with HMM you get the requirement that you need to be able 
>> to invalidate GPUVM mappings without grabbing a reservation lock.
>
> What do you mean with "invalidate GPUVM mappings" in this context? 
> drm_gpuvm_bo_evict()
> should only be called from a ttm_device_funcs::move callback, we 
> should hold the dma-resv
> lock there.

Well the question is which dma-resv lock do we hold?

In the move callback we only hold the dma-resv lock of the BO which is 
moved, but when that is a shared BO then that's not the same as the one 
for the VM.
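
For illustration, a minimal sketch of such a move callback (the driver
specifics are hypothetical; only drm_gpuvm_bo_evict() is from the series):

    static int my_bo_move(struct ttm_buffer_object *bo, bool evict,
                          struct ttm_operation_ctx *ctx,
                          struct ttm_resource *new_mem,
                          struct ttm_place *hop)
    {
            /* Only the moved BO's dma-resv is held here; for a shared BO
             * this is not the same lock as the VM's dma-resv. */
            dma_resv_assert_held(bo->base.resv);

            /* Mark the GEM object evicted in every VM that maps it. */
            if (evict)
                    drm_gpuvm_bo_evict(&bo->base, true);

            /* ... perform the actual move ... */
            return 0;
    }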

>
>>
>> See what the eviction lock in amdgpu is doing for example.
>
> The eviction_lock seems to protect a VM state "evicting" of whether 
> any BO that
> is associated with the VM is currently evicting. At the same time 
> amdgpu protects
> the evicted list of the VM with a different lock. So this seems to be 
> entirely
> unrelated. Tracking a "currently evicting" state is not part of the GPUVM
> implementation currently and hence nothing would change for amdgpu there.

Sorry for the confusion we use different terminology in amdgpu.

The eviction lock and evicted state is for the VM page tables, e.g. if 
the whole VM is currently not used and swapped out or even de-allocated.

This is necessary because we have cases where we need to access the VM 
data without holding the dma-resv lock of this VM. Especially figuring 
out which parts of an address space contain mappings and which don't.

This is a requirement which comes with HMM handling, you won't see this 
with Vulkan (or OpenGL, VAAPI etc..).


The invalidation lock on the other hand is what in this discussion is 
called the eviction lock. This one is needed because of what I wrote above: 
during the move callback only the dma-resv of the BO which is moved is 
locked, but not necessarily the dma-resv of the VM.
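
For illustration, a rough sketch of that kind of access (hypothetical
driver code; only drm_gpuvm_for_each_va_range() is from the series):

    struct my_vm {
            struct drm_gpuvm gpuvm;
            struct mutex invalidate_lock;   /* covers the VA space on this path */
    };

    /* Invalidation path (e.g. triggered by an MMU notifier): the VM's
     * dma-resv must not be taken here, so a separate driver lock is
     * used to walk the mappings instead. */
    static void my_vm_invalidate_range(struct my_vm *vm, u64 addr, u64 range)
    {
            struct drm_gpuva *va;

            mutex_lock(&vm->invalidate_lock);
            drm_gpuvm_for_each_va_range(va, &vm->gpuvm, addr, addr + range)
                    my_invalidate_one(vm, va);      /* hypothetical per-mapping teardown */
            mutex_unlock(&vm->invalidate_lock);
    }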

Regards,
Christian.

>
>>
>> Regards,
>> Christian.
>>
>>>
>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström wrote:
>>>> Hi!
>>>>
>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström wrote:
>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
>>>>>>>> Hi, Danilo,
>>>>>>>>
>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>> So far the DRM GPUVA manager offers common infrastructure to
>>>>>>>>> track GPU VA
>>>>>>>>> allocations and mappings, generically connect GPU VA mappings
>>>>>>>>> to their
>>>>>>>>> backing buffers and perform more complex mapping operations
>>>>>>>>> on the GPU VA
>>>>>>>>> space.
>>>>>>>>>
>>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>>> drivers, which
>>>>>>>>> can potentially be generalized in order to make the DRM GPUVA
>>>>>>>>> manager
>>>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>>>> this patch aims
>>>>>>>>> at generalizing the following elements.
>>>>>>>>>
>>>>>>>>> 1) Provide a common dma-resv for GEM objects not being used
>>>>>>>>> outside of
>>>>>>>>>       this GPU-VM.
>>>>>>>>>
>>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>>> which are
>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>
>>>>>>>>> 3) Provide functions to efficiently lock all GEM objects dma-
>>>>>>>>> resv the
>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>
>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>> contains mappings
>>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>>> accelerated.
>>>>>>>>>
>>>>>>>>> 5) Provide some convenience functions for common patterns.
>>>>>>>>>
>>>>>>>>> Rather than being designed as a "framework", the target is to
>>>>>>>>> make all
>>>>>>>>> features appear as a collection of optional helper functions,
>>>>>>>>> such that
>>>>>>>>> drivers are free to make use of the DRM GPUVA managers basic
>>>>>>>>> functionality and opt-in for other features without setting
>>>>>>>>> any feature
>>>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>>>
>>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>>> locking for drivers
>>>>>>>>> updating the GPU VA space within the fence signalling path.
>>>>>>>>>
>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>> ---
>>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an existing
>>>>>>>>> instance of this
>>>>>>>>>      * particular combination. If not existent a new instance
>>>>>>>>> is created and linked
>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>> + *
>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and
>>>>>>>>> evicted objects. Those
>>>>>>>>> + * lists are maintained in order to accelerate locking of
>>>>>>>>> dma-resv locks and
>>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For
>>>>>>>>> instance, all
>>>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be
>>>>>>>>> locked by calling
>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>>>> also possible to lock
>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>> corresponding parameters to
>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec
>>>>>>>>> loop while making
>>>>>>>>> + * use of helper functions such as drm_gpuvm_prepare_range()
>>>>>>>>> or
>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>> + *
>>>>>>>>> + * Every bound &drm_gem_object is treated as an external object
>>>>>>>>> when its &dma_resv
>>>>>>>>> + * structure is different from the &drm_gpuvm's common
>>>>>>>>> &dma_resv structure.
>>>>>>>>>      */
>>>>>>>>>     /**
>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for the same
>>>>>>>>> &drm_gpuvm and
>>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>>> creations and destructions
>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances unique.
>>>>>>>>> + *
>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of external and
>>>>>>>>> evicted objects are
>>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>>> iteration internally.
>>>>>>>>> + *
>>>>>>>>> + * However, drivers still need to protect concurrent
>>>>>>>>> calls to functions
>>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function contains
>>>>>>>>> a particular
>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>> + *
>>>>>>>>> + * Functions adding or removing entries from those lists,
>>>>>>>>> such as
>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be
>>>>>>>>> called with external
>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>> corresponding list being
>>>>>>>>> + * (safely) modified while potentially being iterated by
>>>>>>>>> other API functions.
>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>      */
>>>>>>>>>     /**
>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>      *   }
>>>>>>>>>      */
>>>>>>>>> +/**
>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>> already iterated items
>>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>>> get_next_vm_bo_from_list()
>>>>>>>>> + *
>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>> Lockless as in, the
>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>> first element from
>>>>>>>>> + * the list, so list insertion deletion can happen
>>>>>>>>> concurrently.
>>>>>>>> Are the list spinlocks needed for that async state update from
>>>>>>>> within the
>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>
>>>>>>>> Otherwise it should be sufficient to protect the lists with the
>>>>>>>> gpuvm's resv
>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>
>>>>>>>> If those spinlocks are still needed in some situations, perhaps
>>>>>>>> could we
>>>>>>>> have an option to set them to NULL (Like IIRC the maple tree
>>>>>>>> allows for)?
>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>> holding only the dma-resv lock from the BO this function gets
>>>>>>> called for. Hence,
>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() calls with
>>>>>>> different BOs.
>>>>>> No. Only if you try to add external objects to the vm's evict list
>>>>>> from
>>>>>> within the evict code. That's not necessary since you loop through
>>>>>> all
>>>>>> external objects anyway when locking them so an "evicted" bool in
>>>>>> the vm_bo,
>>>>>> protected by the bo resv would be sufficient. The extobj locking
>>>>>> loop can
>>>>>> then add the bo to the evicted list.
>>>>> And validate() can remove it while still holding all dma-resv locks,
>>>>> neat!
>>>>> However, what if two tasks are trying to lock the VA space
>>>>> concurrently? What
>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>> drm_gpuva_unlink()?
>>>>> Are we guaranteed that at this point of time the drm_gpuvm_bo is not
>>>>> on the
>>>>> evicted list? Because otherwise we would call drm_gpuvm_bo_destroy()
>>>>> with the
>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>> drm_gpuvm_bo_destroy()
>>>>> might drop the last reference to the drm_gem_object and hence we'd
>>>>> potentially
>>>>> free the dma-resv lock while holding it, at least if it's an external
>>>>> object.
>>>> Easiest way in this scheme is to think of the lists as being protected
>>>> by the vm's resv lock. That means anybody calling unlink() must also
>>>> hold the vm's resv lock. (Which is OK from a UAF point of view, but
>>>> perhaps not from a locking inversion POV from an async list update).
>>> This would mean that on unlink() we'd need to hold the VM's resv 
>>> lock and the
>>> corresponding GEM's resv lock (in case they're not the same anyways) 
>>> because the
>>> VM's resv lock would protect the external / evicted object lists and 
>>> the GEM
>>> objects resv lock protects the GEM's list of drm_gpuvm_bos and the
>>> drm_gpuvm_bo's list of drm_gpuvas.
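
For illustration, a rough sketch of what an unbind/unlink path looks like
under that scheme (hypothetical driver code; lock ordering / ww context
omitted for brevity):

    static void my_unbind_one(struct drm_gpuvm *gpuvm, struct drm_gpuva *va)
    {
            struct drm_gem_object *obj = va->gem.obj;

            dma_resv_lock(gpuvm->resv, NULL);       /* extobj + evict lists */
            if (obj->resv != gpuvm->resv)
                    dma_resv_lock(obj->resv, NULL); /* obj's gpuvm_bo / gpuva lists */

            drm_gpuva_remove(va);
            drm_gpuva_unlink(va);                   /* may drop the last vm_bo reference */

            if (obj->resv != gpuvm->resv)
                    dma_resv_unlock(obj->resv);
            dma_resv_unlock(gpuvm->resv);
    }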
>>>
>>>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>>>> really would not
>>>>>>> like to add even more complexity just to get the spinlock out of
>>>>>>> the way in case
>>>>>>> the driver already has an outer lock protecting this path.
>>>>>> I must disagree here. These spinlocks and atomic operations are
>>>>>> pretty
>>>>>> costly and as discussed earlier this type of locking was the reason
>>>>>> (at
>>>>>> least according to the commit message) that made Christian drop the
>>>>>> XArray
>>>>>> use in drm_exec for the same set of objects: "The locking overhead
>>>>>> is
>>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>>> complexity and a
>>>>>> single wide lock following the drm locking guidelines set out by
>>>>>> Daniel and
>>>>>> David should really be the default choice with an opt-in for a
>>>>>> spinlock if
>>>>>> needed for async and pushing out to a wq is not an option.
>>>>> For the external object list an outer lock would work as long as it's
>>>>> not the
>>>>> dma-resv lock of the corresponding GEM object, since here we actually
>>>>> need to
>>>>> remove the list entry from the external object list on
>>>>> drm_gpuvm_bo_destroy().
>>>>> It's just a bit weird design wise that drivers would need to take
>>>>> this outer
>>>>> lock on:
>>>>>
>>>>> - drm_gpuvm_bo_extobj_add()
>>>>> - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>> drm_gpuvm_bo_put())
>>>>> - drm_gpuvm_exec_lock()
>>>>> - drm_gpuvm_exec_lock_array()
>>>>> - drm_gpuvm_prepare_range()
>>>>>
>>>>> Given that it seems reasonable to do all the required locking
>>>>> internally.
>>>>  From a design POV, there has been a clear direction in XE to make
>>>> things similar to mmap() / munmap(), so this outer lock, which in 
>>>> Xe is
>>>> an rwsem, is used in a similar way as the mmap_lock. It's protecting
>>>> the page-table structures and vma rb tree, the userptr structures and
>>>> the extobj list. Basically it's taken early in the exec IOCTL, the
>>>> VM_BIND ioctl, the compute rebind worker and the pagefault handler, so
>>>> all of the above are just asserting that it is taken in the correct
>>>> mode.
>>>>
>>>> But strictly with this scheme one could also use the vm's dma_resv for
>>>> the extobj list since with drm_exec, it's locked before traversing the
>>>> list.
>>>>
>>>> The whole point of this scheme is to rely on locks that you already 
>>>> are
>>>> supposed to be holding for various reasons and it is simple to 
>>>> comprehend.
>>> I don't agree that we're supposed to hold the VM's resv lock anyways 
>>> for
>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm 
>>> fine using it
>>> for that purpose nevertheless.
>>>
>>>>> In order to at least place lockdep checks, the driver would need to
>>>>> supply the
>>>>> corresponding lock's lockdep_map, because the GPUVM otherwise doesn't
>>>>> know about
>>>>> the lock.
>>>> Yes, that sounds reasonable. One lockdep map per list.
>>> I'd really like to avoid that, especially now that everything got 
>>> simpler. We
>>> should define the actual locks to take instead.
>>>
>>>>> Out of curiosity, what is the overhead of a spin_lock() that doesn't
>>>>> need to
>>>>> spin?
>>>> I guess it's hard to tell exactly, but it is much lower on modern x86
>>>> than what it used to be. Not sure about ARM, which is the other
>>>> architecture important to us. I figure if there is little cache-line
>>>> bouncing the main overhead comes from the implied barriers.
>>>>
>>>>>> A pretty simple way that would not add much code would be
>>>>>>
>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>>                                  spinlock_t *lock)
>>>>>> {
>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>          spin_lock(lock);
>>>>>> }
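
For illustration, with a matching unlock helper the list macros from the
patch could then be wrapped like the sketch below (resv_protected_lists
and the cond-lock helpers are the suggestion above, not part of the patch):

    static void gpuvm_cond_spin_unlock(const struct drm_gpuvm *gpuvm,
                                       spinlock_t *lock)
    {
            if (!gpuvm->resv_protected_lists)
                    spin_unlock(lock);
    }

    #define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                        \
            do {                                                               \
                    gpuvm_cond_spin_lock((__vm_bo)->vm,                        \
                                         &(__vm_bo)->vm->__list_name.lock);    \
                    if (list_empty(&(__vm_bo)->list.entry.__list_name))        \
                            list_add_tail(&(__vm_bo)->list.entry.__list_name,  \
                                          &(__vm_bo)->vm->__list_name.list);   \
                    gpuvm_cond_spin_unlock((__vm_bo)->vm,                      \
                                           &(__vm_bo)->vm->__list_name.lock);  \
            } while (0)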
>>>>>>
>>>>>>>> For such drivers, that would require anybody calling unlink to
>>>>>>>> hold the vm's
>>>>>>>> resv, though.
>>>>>>> In V4 I want to go back to having a dedicated lock for the GEMs
>>>>>>> gpuva list (or
>>>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>>>> lock for that
>>>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>>>> otherwise wouldn't
>>>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>>>> was referring to
>>>>>>> earlier.
>>>>>> Yeah, I can see the need for a dedicated lock for the GEM's gpuva
>>>>>> list, but
>>>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>>>> problem. We
>>>>>> may free the object and a pointer to the vm's resv during unlink
>>>>>> but we
>>>>>> don't free the vm's resv.  It'd be a matter of ensuring that any
>>>>>> calls to
>>>>>> unlink from *within* drm_gpuvm allow it to be held.
>>>>> Drivers calling unlink() from the fence signaling path can't use the
>>>>> VM's
>>>>> dma-resv lock.
>>>> Yes, that made me a bit curious because in the current version the 
>>>> code
>>>> required the object's dma_resv for unlink() which can't be grabbed
>>>> either from the fence signaling path. So are there any drivers 
>>>> actually
>>>> wanting to do that? If so, they will either need to resort to the
>>>> current spinlock solution or they will need to call unlink from a
>>>> workqueue item.
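
For illustration, the workqueue variant could look roughly like this
(hypothetical driver code; only drm_gpuva_unlink() is from the series,
and whether the GEM's or the VM's resv is the right lock here is exactly
the open question above):

    struct my_unlink_work {
            struct work_struct work;
            struct drm_gpuva *va;
    };

    /* Queued via queue_work() from the dma-fence callback; the work item
     * holds a GEM reference taken when it was queued. */
    static void my_unlink_worker(struct work_struct *work)
    {
            struct my_unlink_work *w = container_of(work, typeof(*w), work);
            struct drm_gem_object *obj = w->va->gem.obj;

            /* Process context: taking the dma-resv is fine here. */
            dma_resv_lock(obj->resv, NULL);
            drm_gpuva_unlink(w->va);        /* may drop the last vm_bo reference */
            dma_resv_unlock(obj->resv);

            drm_gem_object_put(obj);        /* reference taken at queue time */
            kfree(w->va);
            kfree(w);
    }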
>>> As Boris already mentioned we have the dma-resv lock by default or a 
>>> driver
>>> specific GEM gpuva lock as opt-in. Now, we can get rid of the latter.
>>>
>>>>> Also, what if the object is an external object? We can't use the VM's
>>>>> dma-resv
>>>>> lock here.
>>>> Why? Typically (sync) unlink is only ever called from an unbind-like
>>>> operation where it should be trivial to grab the vm's resv. Or, for
>>>> that matter any outer lock protecting the extobj list. Rule would be
>>>> the drm_gpuvm_bo::entry::extobj  and drm_gpuvm_bo::entry::evict would
>>>> be protected by either the vm's dma_resv (or possibly an outer lock in
>>>> the case of the extobj list).
>>> Outer lock wouldn't have been working for updates in the async path, 
>>> but
>>> shouldn't be relevant anymore. We could use the VM's resv for that.
>>>
>>>>>   And we can't have the GEM objs dma-resv lock held when calling
>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>>>> refcount drops
>>>>> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might
>>>>> drop the
>>>>> last reference of the GEM object.
>>>> Yes, but this is a different problem as to what exactly protects
>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per bo 
>>>> list
>>>> lock, or if we want to keep the bo's dma_resv we need to ensure that
>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount (I 
>>>> know
>>>> Boris didn't like that, but requiring an explicit refcount for a
>>>> pointer you dereference unless you're under a lock that ensures 
>>>> keeping
>>>> the object alive is pretty much required?) But anyway for the
>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal 
>>>> spinlock)
>>>> I don't have a strong preference.
>>> We can keep the GEM objects dma-resv lock, however as mentioned above
>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both the 
>>> VM's resv lock
>>> and the GEM's resv lock in case they differ.
>>>
>>>>>   All those problems go away with a dedicated
>>>>> GEM gpuva list lock.
>>>> I don't think these are real problems.
>>>> With the excepton of the eviction list "trick" where we currently have
>>>> slightly different approach to collect external bos needing rebinding,
>>>> we have this working fine.
>>>>
>>>> TBH I think pretty much the only situation where the spinlock is 
>>>> needed
>>>> is for async updates of these lists, unless a wq item can be used for
>>>> that, but it doesn't really seem like the current code allows for such
>>>> updates anyway? It complicates the code a lot, adds overhead and also
>>>> adds the requirement for refcounting during list traversal.
>>>>
>>>> /Thomas
>>>>
>>>>>> /Thomas
>>>>>>
>>>>>>
>>>>>>>> It seems that with that also the refcount could be make non-
>>>>>>>> atomic.
>>>>>>>>
>>>>>>>> All in the spirit of the drm locking guidelines "use big locks
>>>>>>>> when
>>>>>>>> possible".
>>>>>>>> Lower level locks only when necessary for performance or
>>>>>>>> locking inversion?
>>>>>>>>
>>>>>>>> /Thomas
>>>>>>>>
>>>>>>>>
>>>>>>>>> + *
>>>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>>>> local list, so removal
>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>> iterating the list.
>>>>>>>>> + */
>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name,
>>>>>>>>> __local_list, __prev_vm_bo)     \
>>>>>>>>> +       ({
>>>>>>>>>                             \
>>>>>>>>> +               struct drm_gpuvm_bo
>>>>>>>>> *__vm_bo;                                           \
>>>>>>>>> +
>>>>>>>>>                             \
>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);
>>>>>>>>>                             \
>>>>>>>>> +
>>>>>>>>>                             \
>>>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>>>> __list_name.lock);                                \
>>>>>>>>> +               while (!list_empty(&(__gpuvm)-
>>>>>>>>>> __list_name.list)) {                     \
>>>>>>>>> +                       __vm_bo =
>>>>>>>>> list_first_entry(&(__gpuvm)->__list_name.list,        \
>>>>>>>>> + struct
>>>>>>>>> drm_gpuvm_bo,                 \
>>>>>>>>> +
>>>>>>>>> list.entry.__list_name);             \
>>>>>>>>> +                       if
>>>>>>>>> (drm_gpuvm_bo_get_unless_zero(__vm_bo))
>>>>>>>>> {                    \
>>>>>>>>> +                               list_move_tail(&(__vm_bo)-
>>>>>>>>>> list.entry.__list_name,      \
>>>>>>>>> +
>>>>>>>>> __local_list);                           \
>>>>>>>>> +                               break;
>>>>>>>>>                             \
>>>>>>>>> +                       } else
>>>>>>>>> {                                                        \
>>>>>>>>> +                               list_del_init(&(__vm_bo)-
>>>>>>>>>> list.entry.__list_name);      \
>>>>>>>>> +                               __vm_bo =
>>>>>>>>> NULL;                                         \
>>>>>>>>> +                       }
>>>>>>>>>                             \
>>>>>>>>> +               }
>>>>>>>>>                             \
>>>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>>>> __list_name.lock);                              \
>>>>>>>>> +
>>>>>>>>>                             \
>>>>>>>>> +               __vm_bo;
>>>>>>>>>                             \
>>>>>>>>> +       })
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>>>>>>>> + *
>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>> Lockless as in, the
>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>> first element from the
>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>> concurrently.
>>>>>>>>> + *
>>>>>>>>> + * Typical use:
>>>>>>>>> + *
>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>> + *
>>>>>>>>> + *     ret = 0;
>>>>>>>>> + *     for_each_vm_bo_in_list(gpuvm, <list_name>,
>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>> + *             ret = do_something_with_vm_bo(..., vm_bo);
>>>>>>>>> + *             if (ret)
>>>>>>>>> + *                     break;
>>>>>>>>> + *     }
>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>> + *     restore_vm_bo_list(gpuvm, <list_name>,
>>>>>>>>> &my_local_list);
>>>>>>>>> + *
>>>>>>>>> + *
>>>>>>>>> + * Only used for internal list iterations, not meant to be
>>>>>>>>> exposed to the outside
>>>>>>>>> + * world.
>>>>>>>>> + */
>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name,
>>>>>>>>> __local_list, __vm_bo)    \
>>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm,
>>>>>>>>> __list_name,           \
>>>>>>>>> +                                               __local_list,
>>>>>>>>> NULL);            \
>>>>>>>>> +
>>>>>>>>> __vm_bo;
>>>>>>>>>        \
>>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm,
>>>>>>>>> __list_name,           \
>>>>>>>>> +                                               __local_list,
>>>>>>>>> __vm_bo))         \
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to their
>>>>>>>>> original list
>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>> already iterated items
>>>>>>>>> + *
>>>>>>>>> + * When we're done iterating a vm_bo list, we should call
>>>>>>>>> restore_vm_bo_list()
>>>>>>>>> + * to restore the original state and let new iterations take
>>>>>>>>> place.
>>>>>>>>> + */
>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name,
>>>>>>>>> __local_list)                         \
>>>>>>>>> +       do
>>>>>>>>> {
>>>>>>>>>                  \
>>>>>>>>> +               /* Merge back the two lists, moving local
>>>>>>>>> list elements to the          \
>>>>>>>>> +                * head to preserve previous ordering, in
>>>>>>>>> case it matters.              \
>>>>>>>>> +
>>>>>>>>> */
>>>>>>>>>            \
>>>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>>>> __list_name.lock);                                \
>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)-
>>>>>>>>>> __list_name.list);                \
>>>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>>>> __list_name.lock);                              \
>>>>>>>>> +       } while (0)
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given
>>>>>>>>> list
>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>> + *
>>>>>>>>> + * Inserts the given @__vm_bo into the list specified by
>>>>>>>>> @__list_name and
>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>> + */
>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo,
>>>>>>>>> __list_name)                            \
>>>>>>>>> +       do
>>>>>>>>> {
>>>>>>>>>          \
>>>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>>>> __list_name.lock);                    \
>>>>>>>>> +               if (list_empty(&(__vm_bo)-
>>>>>>>>>> list.entry.__list_name))             \
>>>>>>>>> +                       list_add_tail(&(__vm_bo)-
>>>>>>>>>> list.entry.__list_name,       \
>>>>>>>>> + &(__vm_bo)->vm-
>>>>>>>>>> __list_name.list);        \
>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>>>> __list_name.lock);                  \
>>>>>>>>> +       } while (0)
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given
>>>>>>>>> list
>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>> + * @__list_name: the name of the list to remove from
>>>>>>>>> + *
>>>>>>>>> + * Removes the given @__vm_bo from the list specified by
>>>>>>>>> @__list_name and
>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>> + */
>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo,
>>>>>>>>> __list_name)                            \
>>>>>>>>> +       do
>>>>>>>>> {
>>>>>>>>>          \
>>>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>>>> __list_name.lock);                    \
>>>>>>>>> +               if (!list_empty(&(__vm_bo)-
>>>>>>>>>> list.entry.__list_name))            \
>>>>>>>>> +                       list_del_init(&(__vm_bo)-
>>>>>>>>>> list.entry.__list_name);      \
>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>>>> __list_name.lock);                  \
>>>>>>>>> +       } while (0)
>>>>>>>>> +
>>>>>>>>> +static int __must_check
>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>> +
>>>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
>>>>>>>>> struct drm_device *drm,
>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>> +
>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>> +
>>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
>>>>>>>>> *gpuvm)
>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>               "GPUVA tree is not empty, potentially leaking
>>>>>>>>> memory.\n");
>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
>>>>>>>>> should be empty.\n");
>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
>>>>>>>>> should be empty.\n");
>>>>>>>>> +
>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>     }
>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>> + *
>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the
>>>>>>>>> given
>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>> + *
>>>>>>>>> + * Using this function directly, it is the drivers
>>>>>>>>> responsibility to call
>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>> + *
>>>>>>>>> + * Note: This function is safe against concurrent insertion
>>>>>>>>> and removal of
>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>> concurrent usage itself.
>>>>>>>>> + *
>>>>>>>>> + * Drivers need to make sure to protect this case either with
>>>>>>>>> an outer VM lock
>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this function
>>>>>>>>> within the
>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's
>>>>>>>>> dma-resv lock ensures
>>>>>>>>> + * mutual exclusion.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +int
>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>>> +                         unsigned int num_fences)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>> +       int ret = 0;
>>>>>>>>> +
>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>>> vm_bo) {
>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, vm_bo->obj,
>>>>>>>>> num_fences);
>>>>>>>>> +               if (ret)
>>>>>>>>> +                       break;
>>>>>>>>> +       }
>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>> +
>>>>>>>>> +       return ret;
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped within
>>>>>>>>> a given range
>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>> + *
>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>>>> mapped between @addr
>>>>>>>>> + * and @addr + @range.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +int
>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>>> drm_exec *exec,
>>>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>>>> num_fences)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>> +       int ret;
>>>>>>>>> +
>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
>>>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>>>> +
>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>> num_fences);
>>>>>>>>> +               if (ret)
>>>>>>>>> +                       return ret;
>>>>>>>>> +       }
>>>>>>>>> +
>>>>>>>>> +       return 0;
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>>> associated BOs
>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>> + *
>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>> given
>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>> + *
>>>>>>>>> + * Additionally, when calling this function with struct
>>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>>> + * being set the driver receives the given @fn callback to
>>>>>>>>> lock additional
>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec instance.
>>>>>>>>> Typically, drivers
>>>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>>>> callback.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +int
>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>> +                   bool interruptible)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>> +       uint32_t flags;
>>>>>>>>> +       int ret;
>>>>>>>>> +
>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>>> 0) |
>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>> +
>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>> +
>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>>> num_fences);
>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>> +               if (ret)
>>>>>>>>> +                       goto err;
>>>>>>>>> +
>>>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>>> num_fences);
>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>> +               if (ret)
>>>>>>>>> +                       goto err;
>>>>>>>>> +
>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>>> num_fences);
>>>>>>>>> +                       drm_exec_retry_on_contention(exec);
>>>>>>>>> +                       if (ret)
>>>>>>>>> +                               goto err;
>>>>>>>>> +               }
>>>>>>>>> +       }
>>>>>>>>> +
>>>>>>>>> +       return 0;
>>>>>>>>> +
>>>>>>>>> +err:
>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>> +       return ret;
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>> +
>>>>>>>>> +static int
>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int
>>>>>>>>> num_fences)
>>>>>>>>> +{
>>>>>>>>> +       struct {
>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>> +
>>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args-
>>>>>>>>>> objs,
>>>>>>>>> + args->num_objs,
>>>>>>>>> num_fences);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all
>>>>>>>>> associated BOs
>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>> + * @num_objs: the number of additional &drm_gem_objects to
>>>>>>>>> lock
>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>> + *
>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>> given &drm_gpuvm
>>>>>>>>> + * contains mappings of, plus the ones given through @objs.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +int
>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>>> +                         unsigned int num_objs,
>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>> +                         bool interruptible)
>>>>>>>>> +{
>>>>>>>>> +       struct {
>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>> +       } args;
>>>>>>>>> +
>>>>>>>>> +       args.objs = objs;
>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>> +
>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>> +
>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>>> interruptible);
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>>> within a given range
>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>> + *
>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>>> mapped between @addr and
>>>>>>>>> + * @addr + @range.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +int
>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>> +                         bool interruptible)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>> +       uint32_t flags;
>>>>>>>>> +       int ret;
>>>>>>>>> +
>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>>> 0) |
>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>> +
>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>> +
>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec,
>>>>>>>>> addr, range,
>>>>>>>>> + num_fences);
>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>> +               if (ret)
>>>>>>>>> +                       goto err;
>>>>>>>>> +       }
>>>>>>>>> +
>>>>>>>>> +       return ret;
>>>>>>>>> +
>>>>>>>>> +err:
>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>> +       return ret;
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as evicted
>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>> + *
>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>>>> evicted buffer
>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +int
>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>> +{
>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>> +       int ret = 0;
>>>>>>>>> +
>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>> +
>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>> +               if (ret)
>>>>>>>>> +                       break;
>>>>>>>>> +       }
>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>> +
>>>>>>>>> +       return ret;
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and all
>>>>>>>>> extobj
>>>>>>>>> + * dma-resv
>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>> + * @fence: fence to add
>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>> + */
>>>>>>>>> +void
>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>> +       unsigned long index;
>>>>>>>>> +
>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>> + drm_gpuvm_is_extobj(gpuvm,
>>>>>>>>> obj) ?
>>>>>>>>> +                                  extobj_usage :
>>>>>>>>> private_usage);
>>>>>>>>> +       }
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>> +
>>>>>>>>>     /**
>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of struct
>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>>>> *gpuvm,
>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>> +
>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>          return vm_bo;
>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>> +
>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>> +
>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>>>>>>>      *
>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>> + *
>>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is
>>>>>>>>> destroyed, which
>>>>>>>>> + * includes removing it from the GEM's gpuva list. Hence, if
>>>>>>>>> a call to this
>>>>>>>>> + * function can potentially let the reference count drop to zero,
>>>>>>>>> the caller must
>>>>>>>>> + * hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>>>      */
>>>>>>>>>     void
>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
>>>>>>>>> *vm_bo)
>>>>>>>>>     }
>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>> +static int __must_check
>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>> +{
>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>     }
>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its
>>>>>>>>> &drm_gpuvm's
>>>>>>>>> + * extobj list
>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's
>>>>>>>>> extobj list.
>>>>>>>>> + *
>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if
>>>>>>>>> not on the list
>>>>>>>>> + * already and if the corresponding &drm_gem_object actually is an
>>>>>>>>> external object.
>>>>>>>>> + */
>>>>>>>>> +void
>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>> +
>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to
>>>>>>>>> / from a
>>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>> + *
>>>>>>>>> + * Adds a &drm_gem_object to or removes it from the
>>>>>>>>> evicted list of all
>>>>>>>>> + * &drm_gpuvms containing a mapping of this &drm_gem_object.
>>>>>>>>> + */
>>>>>>>>> +void
>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>>>> +{
>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>> +
>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>> +               if (evict)
>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, evict);
>>>>>>>>> +               else
>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, evict);
>>>>>>>>> +       }
>>>>>>>>> +}
>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>> +
>>>>>>>>>     static int
>>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>      */
>>>>>>>>>     #include <linux/list.h>
>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>           * space
>>>>>>>>>           */
>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>> +
>>>>>>>>> +       /**
>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>> +        */
>>>>>>>>> +       struct {
>>>>>>>>> +               /**
>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>> serving as
>>>>>>>>> +                * external object
>>>>>>>>> +                */
>>>>>>>>> +               struct list_head list;
>>>>>>>>> +
>>>>>>>>> +               /**
>>>>>>>>> +                * @lock: spinlock to protect the extobj list
>>>>>>>>> +                */
>>>>>>>>> +               spinlock_t lock;
>>>>>>>>> +       } extobj;
>>>>>>>>> +
>>>>>>>>> +       /**
>>>>>>>>> +        * @evict: structure holding the evict list and evict
>>>>>>>>> list lock
>>>>>>>>> +        */
>>>>>>>>> +       struct {
>>>>>>>>> +               /**
>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>> currently being
>>>>>>>>> +                * evicted
>>>>>>>>> +                */
>>>>>>>>> +               struct list_head list;
>>>>>>>>> +
>>>>>>>>> +               /**
>>>>>>>>> +                * @lock: spinlock to protect the evict list
>>>>>>>>> +                */
>>>>>>>>> +               spinlock_t lock;
>>>>>>>>> +       } evict;
>>>>>>>>>     };
>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>>> drm_device *drm,
>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>>>> &drm_gem_object is an
>>>>>>>>> + * external object
>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>> + *
>>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv differs
>>>>>>>>> from the
>>>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>>>> + */
>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
>>>>>>>>> *gpuvm,
>>>>>>>>> +                                      struct drm_gem_object
>>>>>>>>> *obj)
>>>>>>>>> +{
>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>     {
>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__)
>>>>>>>>> \
>>>>>>>>>          list_for_each_entry_safe(va__, next__, &(gpuvm__)-
>>>>>>>>>> rb.list, rb.entry)
>>>>>>>>> +/**
>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>>> &drm_exec
>>>>>>>>> + *
>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>> &drm_exec should be.
>>>>>>>>> + *
>>>>>>>>> + * Optionally, @extra can be set in order to lock additional
>>>>>>>>> &drm_gem_objects.
>>>>>>>>> + */
>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>> +       /**
>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>> +        */
>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>> +
>>>>>>>>> +       /**
>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>>>> +        */
>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>> +
>>>>>>>>> +       /**
>>>>>>>>> +        * @extra: Callback and corresponding private data
>>>>>>>>> for the driver to
>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>> +        */
>>>>>>>>> +       struct {
>>>>>>>>> +               /**
>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>> +                */
>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>> +                         unsigned int num_fences);
>>>>>>>>> +
>>>>>>>>> +               /**
>>>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>>>> callback
>>>>>>>>> +                */
>>>>>>>>> +               void *priv;
>>>>>>>>> +       } extra;
>>>>>>>>> +};
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
>>>>>>>>> resv
>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>> + *
>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>>>> &drm_gem_object.
>>>>>>>>> + *
>>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>>> responsibility to call
>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>> + *
>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>> + */
>>>>>>>>> +static inline int
>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>> +{
>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>>> num_fences);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>> +                             unsigned int num_fences);
>>>>>>>>> +
>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>> +                           unsigned int num_fences);
>>>>>>>>> +
>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>> +                       bool interruptible);
>>>>>>>>> +
>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>>> *vm_exec,
>>>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>>>> +                             unsigned int num_objs,
>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>> +                             bool interruptible);
>>>>>>>>> +
>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>>> *vm_exec,
>>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>> +                             bool interruptible);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all
>>>>>>>>> associated BOs
>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>> + *
>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects
>>>>>>>>> previously acquired
>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>> + */
>>>>>>>>> +static inline void
>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>> +{
>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>> private_usage,
>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>> extobj_usage);
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence()
>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>> + * @fence: fence to add
>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>> + *
>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>> + */
>>>>>>>>> +static inline void
>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>>>> *vm_exec,
>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>> private_usage,
>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>> extobj_usage)
>>>>>>>>> +{
>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec,
>>>>>>>>> fence,
>>>>>>>>> +                                private_usage,
>>>>>>>>> extobj_usage);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>     /**
>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>>> &drm_gpuvm and
>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>                           * gpuva list.
>>>>>>>>>                           */
>>>>>>>>>                          struct list_head gem;
>>>>>>>>> +
>>>>>>>>> +                       /**
>>>>>>>>> +                        * @extobj: List entry to attach to
>>>>>>>>> the &drm_gpuvms
>>>>>>>>> +                        * extobj list.
>>>>>>>>> +                        */
>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>> +
>>>>>>>>> +                       /**
>>>>>>>>> +                        * @evict: List entry to attach to
>>>>>>>>> the &drm_gpuvms evict
>>>>>>>>> +                        * list.
>>>>>>>>> +                        */
>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>                  } entry;
>>>>>>>>>          } list;
>>>>>>>>>     };
>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
>>>>>>>>> evict);
>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>> +
>>>>>>>>>     /**
>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a
>>>>>>>>> list of &drm_gpuva
>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>>> iteration step
>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>           * used.
>>>>>>>>>           */
>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void
>>>>>>>>> *priv);
>>>>>>>>> +
>>>>>>>>> +       /**
>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>> +        *
>>>>>>>>> +        * Drivers receive this callback for every evicted
>>>>>>>>> &drm_gem_object being
>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>> +        *
>>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>>> specific variant of
>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>> +        */
>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>     };
>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
>>
>


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 15:33                   ` Christian König
@ 2023-09-13 15:46                     ` Danilo Krummrich
  2023-09-19 12:07                       ` Christian König
  0 siblings, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-13 15:46 UTC (permalink / raw)
  To: Christian König, Thomas Hellström
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

On 9/13/23 17:33, Christian König wrote:
> Am 13.09.23 um 17:15 schrieb Danilo Krummrich:
>> On 9/13/23 16:26, Christian König wrote:
>>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>>> As mentioned in a different mail thread, the reply is based on the assumption
>>>> that we don't support anything else than GPUVM updates from the IOCTL.
>>>
>>> I think that this assumption is incorrect.
>>
>> Well, more precisely I should have said "don't support GPUVM updated from within
>> fence signaling critical sections". And looking at the code, that doesn't seem what
>> you're doing there.
>>
>>>
>>> Vulkan is just once specific use case, but this here should probably be able to handle other use cases as well.
>>>
>>> Especially with HMM you get the requirement that you need to be able to invalidate GPUVM mappings without grabbing a reservation lock.
>>
>> What do you mean with "invalidate GPUVM mappings" in this context? drm_gpuvm_bo_evict()
>> should only be called from a ttm_device_funcs::move callback, where we should hold
>> the dma-resv lock.
> 
> Well the question is which dma-resv lock do we hold?
> 
> In the move callback we only hold the dma-resv lock of the BO which is moved, but when that is a shared BO then that's not the same as the one for the VM.

Correct, Thomas' idea was to use the GEM's dma_resv lock to protect drm_gpuvm_bo::evicted
and then actually move the drm_gpuvm_bo to the VM's evicted list once we've grabbed all
dma-resv locks while locking the VM's BOs using drm_exec. We can remove them from the evicted
list again on validate(). This way we never touch the evicted list without holding at least
the VM's dma-resv lock.
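
In (pseudo-)code the idea would look roughly like this - note that the "evicted"
flag and the two helpers below are made up just for illustration, they're not
what the next revision will literally look like:

/* Called from the ttm_device_funcs::move callback, BO's dma-resv held. */
void vm_bo_set_evicted(struct drm_gpuvm_bo *vm_bo, bool evict)
{
	dma_resv_assert_held(vm_bo->obj->resv);
	vm_bo->evicted = evict;
}

/* Called from the drm_exec loop locking all of the VM's BOs, hence with the
 * VM's dma-resv (and the BO's dma-resv) held.
 */
static void vm_bo_collect_evicted(struct drm_gpuvm_bo *vm_bo)
{
	struct drm_gpuvm *gpuvm = vm_bo->vm;

	dma_resv_assert_held(gpuvm->resv);
	if (vm_bo->evicted)
		list_move_tail(&vm_bo->list.entry.evict, &gpuvm->evict.list);
}

validate() would then do the list_del_init() while all those locks are still held.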

Do you have any concerns about that?

> 
>>
>>>
>>> See what the eviction lock in amdgpu is doing for example.
>>
>> The eviction_lock seems to protect a per-VM "evicting" state, i.e. whether any BO that
>> is associated with the VM is currently being evicted. At the same time amdgpu protects
>> the evicted list of the VM with a different lock. So this seems to be entirely
>> unrelated. Tracking a "currently evicting" state is not part of the GPUVM
>> implementation currently and hence nothing would change for amdgpu there.
> 
> Sorry for the confusion we use different terminology in amdgpu.
> 
> The eviction lock and evicted state is for the VM page tables, e.g. if the whole VM is currently not used and swapped out or even de-allocated.
> 
> This is necessary because we have cases where we need to access the VM data without holding the dma-resv lock of this VM. Especially figuring out which parts of an address space contain mappings and which doesn't.

I think this is fine; this has nothing to do with lists of evicted GEM objects or external GEM
objects, right? Marking mappings (drm_gpuva) as invalidated (DRM_GPUVA_INVALIDATED) or accessing
the VA space does not require any dma-resv locks.
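
For example (just a sketch, assuming the existing drm_gpuva_invalidate() helper),
an invalidation path could simply do:

	struct drm_gpuva *va;

	/* e.g. from an MMU notifier / HMM invalidation, no dma-resv held */
	drm_gpuvm_for_each_va_range(va, gpuvm, addr, addr + range)
		drm_gpuva_invalidate(va, true);

serialized by whatever driver-specific lock protects the VA space there.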

> 
> This is a requirement which comes with HMM handling, you won't see this with Vulkan (or OpenGL, VAAPI etc..).
> 
> 
> The invalidation lock on the other hand is what in this discussion is called eviction lock. This one is needed because what I wrote above, during the move callback only the dma-resv of the BO which is moved is locked, but not necessarily the dma-resv of the VM.

That's yet another thing, right? This is used to track whether *any* BO that belongs to the VM is
currently being evicted, correct? As mentioned, as of now this is not supported in GPUVM and hence
it would be the same driver-specific code with the same driver-specific lock.

> 
> Regards,
> Christian.
> 
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström wrote:
>>>>> Hi!
>>>>>
>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström wrote:
>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
>>>>>>>>> Hi, Danilo,
>>>>>>>>>
>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>> So far the DRM GPUVA manager offers common infrastructure to
>>>>>>>>>> track GPU VA
>>>>>>>>>> allocations and mappings, generically connect GPU VA mappings
>>>>>>>>>> to their
>>>>>>>>>> backing buffers and perform more complex mapping operations
>>>>>>>>>> on the GPU VA
>>>>>>>>>> space.
>>>>>>>>>>
>>>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>>>> drivers, which
>>>>>>>>>> can potentially be generalized in order to make the DRM GPUVA
>>>>>>>>>> manager
>>>>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>>>>> this patch aims
>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>
>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not being used
>>>>>>>>>> outside of
>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>
>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>>>> which are
>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>
>>>>>>>>>> 3) Provide functions to efficiently lock all GEM objects dma-
>>>>>>>>>> resv the
>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>
>>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>>> contains mappings
>>>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>>>> accelerated.
>>>>>>>>>>
>>>>>>>>>> 5) Provide some convenience functions for common patterns.
>>>>>>>>>>
>>>>>>>>>> Rather than being designed as a "framework", the target is to
>>>>>>>>>> make all
>>>>>>>>>> features appear as a collection of optional helper functions,
>>>>>>>>>> such that
>>>>>>>>>> drivers are free to make use of the DRM GPUVA managers basic
>>>>>>>>>> functionality and opt-in for other features without setting
>>>>>>>>>> any feature
>>>>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>>>>
>>>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>>>> locking for drivers
>>>>>>>>>> updating the GPU VA space within the fence signalling path.
>>>>>>>>>>
>>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>> ---
>>>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an existing
>>>>>>>>>> instance of this
>>>>>>>>>>      * particular combination. If not existent a new instance
>>>>>>>>>> is created and linked
>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>> + *
>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>>> + * as entries for the &drm_gpuvm's lists of external and
>>>>>>>>>> evicted objects. Those
>>>>>>>>>> + * lists are maintained in order to accelerate locking of
>>>>>>>>>> dma-resv locks and
>>>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For
>>>>>>>>>> instance, all the
>>>>>>>>>> + * &drm_gem_objects' &dma_resv of a given &drm_gpuvm can be
>>>>>>>>>> locked by calling
>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>>>>> also possible to lock
>>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>>> corresponding parameters to
>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec
>>>>>>>>>> loop while making
>>>>>>>>>> + * use of helper functions such as drm_gpuvm_prepare_range()
>>>>>>>>>> or
>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>> + *
>>>>>>>>>> + * Every bound &drm_gem_object is treated as external object
>>>>>>>>>> when its &dma_resv
>>>>>>>>>> + * structure is different than the &drm_gpuvm's common
>>>>>>>>>> &dma_resv structure.
>>>>>>>>>>      */
>>>>>>>>>>     /**
>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for the same
>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>>>> creations and destructions
>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances unique.
>>>>>>>>>> + *
>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of external and
>>>>>>>>>> evicted objects are
>>>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>>>> iteration internally.
>>>>>>>>>> + *
>>>>>>>>>> + * However, drivers still need to protect concurrent
>>>>>>>>>> calls to functions
>>>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function contains
>>>>>>>>>> a particular
>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>> + *
>>>>>>>>>> + * Functions adding or removing entries from those lists,
>>>>>>>>>> such as
>>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be
>>>>>>>>>> called with external
>>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>>> corresponding list being
>>>>>>>>>> + * (safely) modified while potentially being iterated by
>>>>>>>>>> other API functions.
>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>      */
>>>>>>>>>>     /**
>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>      *   }
>>>>>>>>>>      */
>>>>>>>>>> +/**
>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>>> already iterated items
>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>>>> drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>>> + *
>>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>>> Lockless as in, the
>>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>>> first element from
>>>>>>>>>> + * the list, so list insertion deletion can happen
>>>>>>>>>> concurrently.
>>>>>>>>> Are the list spinlocks needed for that async state update from
>>>>>>>>> within the
>>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>
>>>>>>>>> Otherwise it should be sufficient to protect the lists with the
>>>>>>>>> gpuvm's resv
>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>
>>>>>>>>> If those spinlocks are still needed in some situations, perhaps
>>>>>>>>> could we
>>>>>>>>> have an option to set them to NULL (Like IIRC the maple tree
>>>>>>>>> allows for)?
>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>> holding only the dma-resv lock from the BO this function gets
>>>>>>>> called for. Hence,
>>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() calls with
>>>>>>>> different BOs.
>>>>>>> No. Only if you try to add external objects to the vm's evict list
>>>>>>> from
>>>>>>> within the evict code. That's not necessary since you loop through
>>>>>>> all
>>>>>>> external objects anyway when locking them so an "evicted" bool in
>>>>>>> the vm_bo,
>>>>>>> protected by the bo resv would be sufficient. The extobj locking
>>>>>>> loop can
>>>>>>> then add the bo to the evicted list.
>>>>>> And validate() can remove it while still holding all dma-resv locks,
>>>>>> neat!
>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>> concurrently? What
>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>> drm_gpuva_unlink()?
>>>>>> Are we guaranteed that at this point of time the drm_gpuvm_bo is not
>>>>>> on the
>>>>>> evicted list? Because otherwise we would call drm_gpuvm_bo_destroy()
>>>>>> with the
>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>> drm_gpuvm_bo_destroy()
>>>>>> might drop the last reference to the drm_gem_object and hence we'd
>>>>>> potentially
>>>>>> free the dma-resv lock while holding it, at least if it's an external
>>>>>> object.
>>>>> Easiest way in this scheme is to think of the lists as being protected
>>>>> by the vm's resv lock. That means anybody calling unlink() must also
>>>>> hold the vm's resv lock. (Which is OK from an UAF point of view, but
>>>>> perhaps not from a locking inversion POW from an async list update).
>>>> This would mean that on unlink() we'd need to hold the VM's resv lock and the
>>>> corresponding GEM's resv lock (in case they're not the same anyways) because the
>>>> VM's resv lock would protect the external / evicted object lists and the GEM
>>>> objects resv lock protects the GEM's list of drm_gpuvm_bos and the
>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>
>>>>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>>>>> really would not
>>>>>>>> like to add even more complexity just to get the spinlock out of
>>>>>>>> the way in case
>>>>>>>> the driver already has an outer lock protecting this path.
>>>>>>> I must disagree here. These spinlocks and atomic operations are
>>>>>>> pretty
>>>>>>> costly and as discussed earlier this type of locking was the reason
>>>>>>> (at
>>>>>>> least according to the commit message) that made Christian drop the
>>>>>>> XArray
>>>>>>> use in drm_exec for the same set of objects: "The locking overhead
>>>>>>> is
>>>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>>>> complexity and a
>>>>>>> single wide lock following the drm locking guidelines set out by
>>>>>>> Daniel and
>>>>>>> David should really be the default choice with an opt-in for a
>>>>>>> spinlock if
>>>>>>> needed for async and pushing out to a wq is not an option.
>>>>>> For the external object list an outer lock would work as long as it's
>>>>>> not the
>>>>>> dma-resv lock of the corresponding GEM object, since here we actually
>>>>>> need to
>>>>>> remove the list entry from the external object list on
>>>>>> drm_gpuvm_bo_destroy().
>>>>>> It's just a bit weird design wise that drivers would need to take
>>>>>> this outer
>>>>>> lock on:
>>>>>>
>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>> - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>> drm_gpuvm_bo_put())
>>>>>> - drm_gpuvm_exec_lock()
>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>> - drm_gpuvm_prepare_range()
>>>>>>
>>>>>> Given that it seems reasonable to do all the required locking
>>>>>> internally.
>>>>>  From a design POW, there has been a clear direction in XE to make
>>>>> things similar to mmap() / munmap(), so this outer lock, which in Xe is
>>>>> an rwsem, is used in a similar way as the mmap_lock. It's protecting
>>>>> the page-table structures and vma rb tree, the userptr structures and
>>>>> the extobj list. Basically it's taken early in the exec IOCTL, the
>>>>> VM_BIND ioctl, the compute rebind worker and the pagefault handler, so
>>>>> all of the above are just asserting that it is taken in the correct
>>>>> mode.
>>>>>
>>>>> But strictly with this scheme one could also use the vm's dma_resv for
>>>>> the extobj list since with drm_exec, it's locked before traversing the
>>>>> list.
>>>>>
>>>>> The whole point of this scheme is to rely on locks that you already are
>>>>> supposed to be holding for various reasons and is simple to comprehend.
>>>> I don't agree that we're supposed to hold the VM's resv lock anyways for
>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm fine using it
>>>> for that purpose nevertheless.
>>>>
>>>>>> In order to at least place lockdep checks, the driver would need to
>>>>>> supply the
>>>>>> corresponding lock's lockdep_map, because the GPUVM otherwise doesn't
>>>>>> know about
>>>>>> the lock.
>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>> I'd really like to avoid that, especially now that everything got simpler. We
>>>> should define the actual locks to take instead.
>>>>
>>>>>> Out of curiosity, what is the overhead of a spin_lock() that doesn't
>>>>>> need to
>>>>>> spin?
>>>>> I guess it's hard to tell exactly, but it is much lower on modern x86
>>>>> than what it used to be. Not sure about ARM, which is the other
>>>>> architecture important to us. I figure if there is little cache-line
>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>
>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>
>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>>>                                  spinlock_t *lock)
>>>>>>> {
>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>          spin_lock(lock);
>>>>>>> }
>>>>>>>
>>>>>>>>> For such drivers, that would require anybody calling unlink to
>>>>>>>>> hold the vm's
>>>>>>>>> resv, though.
>>>>>>>> In V4 I want to go back to having a dedicated lock for the GEMs
>>>>>>>> gpuva list (or
>>>>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>>>>> lock for that
>>>>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>>>>> otherwise wouldn't
>>>>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>>>>> was referring to
>>>>>>>> earlier.
>>>>>>> Yeah, I can see the need for a dedicated lock for the GEM's gpuva
>>>>>>> list, but
>>>>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>>>>> problem. We
>>>>>>> may free the object and a pointer to the vm's resv during unlink
>>>>>>> but we
>>>>>>> don't free the vm's resv.  It'd be a matter of ensuring that any
>>>>>>> calls to
>>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>> Drivers calling unlink() from the fence signaling path can't use the
>>>>>> VM's
>>>>>> dma-resv lock.
>>>>> Yes, that made me a bit curious because in the current version the code
>>>>> required the object's dma_resv for unlink() which can't be grabbed
>>>>> either from the fence signaling path. So are there any drivers actually
>>>>> wanting to do that? If so, they will either need to resort to the
>>>>> current spinlock solution or they will need to call unlink from a
>>>>> workqueue item.
>>>> As Boris already mentioned we have the dma-resv lock by default or a driver
>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of the latter.
>>>>
>>>>>> Also, what if the object is an external object? We can't use the VM's
>>>>>> dma-resv
>>>>>> lock here.
>>>>> Why? Typically (sync) unlink is only ever called from an unbind-like
>>>>> operation where it should be trivial to grab the vm's resv. Or, for
>>>>> that matter any outer lock protecting the extobj list. Rule would be
>>>>> the drm_gpuvm_bo::entry::extobj  and drm_gpuvm_bo::entry::evict would
>>>>> be protected by either the vm's dma_resv (or possibly an outer lock in
>>>>> the case of the extobj list).
>>>> Outer lock wouldn't have been working for updates in the async path, but
>>>> shouldn't be relevant anymore. We could use the VM's resv for that.
>>>>
>>>>>>   And we can't have the GEM objs dma-resv lock held when calling
>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>>>>> refcount drops
>>>>>> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might
>>>>>> drop the
>>>>>> last reference of the GEM object.
>>>>> Yes, but this is a different problem as to what exactly protects
>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per bo list
>>>>> lock, or if we want to keep the bo's dma_resv we need to ensure that
>>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount (I know
>>>>> Boris didn't like that, but requiring an explicit refcount for a
>>>>> pointer you dereference unless you're under a lock that ensures keeping
>>>>> the object alive is pretty much required?) But anyway for the
>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal spinlock)
>>>>> I don't have a strong preference.
>>>> We can keep the GEM objects dma-resv lock, however as mentioned above
>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both the VM's resv lock
>>>> and the GEM's resv lock in case they differ.
>>>>
>>>>>>   All those problems go away with a dedicated
>>>>>> GEM gpuva list lock.
>>>>> I don't think these are real problems.
>>>>> With the excepton of the eviction list "trick" where we currently have
>>>>> slightly different approach to collect external bos needing rebinding,
>>>>> we have this working fine.
>>>>>
>>>>> TBH I think pretty much the only situation where the spinlock is needed
>>>>> is for async updates of these lists, unless a wq item can be used for
>>>>> that, but it doesn't really seem like the current code allows for such
>>>>> updates anyway? It complicates the code a lot, adds overhead and also
>>>>> adds the requirement for refcounting during list traversal.
>>>>>
>>>>> /Thomas
>>>>>
>>>>>>> /Thomas
>>>>>>>
>>>>>>>
>>>>>>>>> It seems that with that also the refcount could be make non-
>>>>>>>>> atomic.
>>>>>>>>>
>>>>>>>>> All in the spirit of the drm locking guidelines "use big locks
>>>>>>>>> when
>>>>>>>>> possible".
>>>>>>>>> Lower level locks only when necessary for performance or
>>>>>>>>> locking inversion?
>>>>>>>>>
>>>>>>>>> /Thomas
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> + *
>>>>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>>>>> local list, so removal
>>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>>> iterating the list.
>>>>>>>>>> + */
>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)     \
>>>>>>>>>> +       ({                                                                              \
>>>>>>>>>> +               struct drm_gpuvm_bo *__vm_bo;                                           \
>>>>>>>>>> +                                                                                       \
>>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);                                         \
>>>>>>>>>> +                                                                                       \
>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                                \
>>>>>>>>>> +               while (!list_empty(&(__gpuvm)->__list_name.list)) {                     \
>>>>>>>>>> +                       __vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,        \
>>>>>>>>>> +                                                  struct drm_gpuvm_bo,                 \
>>>>>>>>>> +                                                  list.entry.__list_name);             \
>>>>>>>>>> +                       if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {                    \
>>>>>>>>>> +                               list_move_tail(&(__vm_bo)->list.entry.__list_name,      \
>>>>>>>>>> +                                              __local_list);                           \
>>>>>>>>>> +                               break;                                                  \
>>>>>>>>>> +                       } else {                                                        \
>>>>>>>>>> +                               list_del_init(&(__vm_bo)->list.entry.__list_name);      \
>>>>>>>>>> +                               __vm_bo = NULL;                                         \
>>>>>>>>>> +                       }                                                               \
>>>>>>>>>> +               }                                                                       \
>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                              \
>>>>>>>>>> +                                                                                       \
>>>>>>>>>> +               __vm_bo;                                                                \
>>>>>>>>>> +       })
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>>>>>>>>> + *
>>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>>> Lockless as in, the
>>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>>> first element from the
>>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>>> concurrently.
>>>>>>>>>> + *
>>>>>>>>>> + * Typical use:
>>>>>>>>>> + *
>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>> + *
>>>>>>>>>> + *     ret = 0;
>>>>>>>>>> + *     drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>,
>>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>>> + *             ret = do_something_with_vm_bo(..., vm_bo);
>>>>>>>>>> + *             if (ret)
>>>>>>>>>> + *                     break;
>>>>>>>>>> + *     }
>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>> + *     drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>,
>>>>>>>>>> &my_local_list);
>>>>>>>>>> + *
>>>>>>>>>> + *
>>>>>>>>>> + * Only used for internal list iterations, not meant to be
>>>>>>>>>> exposed to the outside
>>>>>>>>>> + * world.
>>>>>>>>>> + */
>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)    \
>>>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>> +                                               __local_list, NULL);            \
>>>>>>>>>> +            __vm_bo;                                                            \
>>>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>> +                                               __local_list, __vm_bo))
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to their
>>>>>>>>>> original list
>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>>> already iterated items
>>>>>>>>>> + *
>>>>>>>>>> + * When we're done iterating a vm_bo list, we should call
>>>>>>>>>> restore_vm_bo_list()
>>>>>>>>>> + * to restore the original state and let new iterations take
>>>>>>>>>> place.
>>>>>>>>>> + */
>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)                 \
>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>> +               /* Merge back the two lists, moving local list elements to the \
>>>>>>>>>> +                * head to preserve previous ordering, in case it matters.     \
>>>>>>>>>> +                */                                                             \
>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
>>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)->__list_name.list);        \
>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
>>>>>>>>>> +       } while (0)
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given
>>>>>>>>>> list
>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>> + *
>>>>>>>>>> + * Inserts the given @__vm_bo into the list specified by
>>>>>>>>>> @__list_name and
>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>> + */
>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                            \
>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>> +               if (list_empty(&(__vm_bo)->list.entry.__list_name))             \
>>>>>>>>>> +                       list_add_tail(&(__vm_bo)->list.entry.__list_name,       \
>>>>>>>>>> +                                     &(__vm_bo)->vm->__list_name.list);        \
>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>> +       } while (0)
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given
>>>>>>>>>> list
>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>> + *
>>>>>>>>>> + * Removes the given @__vm_bo from the list specified by
>>>>>>>>>> @__list_name and
>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>> + */
>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)                            \
>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>> +               if (!list_empty(&(__vm_bo)->list.entry.__list_name))            \
>>>>>>>>>> +                       list_del_init(&(__vm_bo)->list.entry.__list_name);      \
>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>> +       } while (0)
>>>>>>>>>> +
>>>>>>>>>> +static int __must_check
>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>> +
>>>>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
>>>>>>>>>> struct drm_device *drm,
>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>> +
>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>> +
>>>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
>>>>>>>>>> *gpuvm)
>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>>               "GPUVA tree is not empty, potentially leaking
>>>>>>>>>> memory.\n");
>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
>>>>>>>>>> should be empty.\n");
>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
>>>>>>>>>> should be empty.\n");
>>>>>>>>>> +
>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>     }
>>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>> + *
>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the
>>>>>>>>>> given
>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>> + *
>>>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>>>> responsibility to call
>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>> + *
>>>>>>>>>> + * Note: This function is safe against concurrent insertion
>>>>>>>>>> and removal of
>>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>>> concurrent usage itself.
>>>>>>>>>> + *
>>>>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>>>>> either an outer VM lock
>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this function
>>>>>>>>>> within the
>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's
>>>>>>>>>> dma-resv lock ensures
>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>> + *
>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>> + */
>>>>>>>>>> +int
>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>>>> +                         unsigned int num_fences)
>>>>>>>>>> +{
>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>> +       int ret = 0;
>>>>>>>>>> +
>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>>>> vm_bo) {
>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, vm_bo->obj,
>>>>>>>>>> num_fences);
>>>>>>>>>> +               if (ret)
>>>>>>>>>> +                       break;
>>>>>>>>>> +       }
>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>> +
>>>>>>>>>> +       return ret;
>>>>>>>>>> +}
>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped within
>>>>>>>>>> a given range
>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>> + *
>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>>>>> mapped between @addr
>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>> + *
>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>> + */
>>>>>>>>>> +int
>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>> drm_exec *exec,
>>>>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>>>>> num_fences)
>>>>>>>>>> +{
>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>> +       int ret;
>>>>>>>>>> +
>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
>>>>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>>>>> +
>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>>> num_fences);
>>>>>>>>>> +               if (ret)
>>>>>>>>>> +                       return ret;
>>>>>>>>>> +       }
>>>>>>>>>> +
>>>>>>>>>> +       return 0;
>>>>>>>>>> +}
>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>>>> associated BOs
>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>> + *
>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>>> given
>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>> + *
>>>>>>>>>> + * Additionally, when calling this function with struct
>>>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>>>> + * being set the driver receives the given @fn callback to
>>>>>>>>>> lock additional
>>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec instance.
>>>>>>>>>> Typically, drivers
>>>>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>>>>> callback.
>>>>>>>>>> + *
>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>> + */
>>>>>>>>>> +int
>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>> +{
>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>> +       int ret;
>>>>>>>>>> +
>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>> +
>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>> +
>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>>>> num_fences);
>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>> +               if (ret)
>>>>>>>>>> +                       goto err;
>>>>>>>>>> +
>>>>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>>>> num_fences);
>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>> +               if (ret)
>>>>>>>>>> +                       goto err;
>>>>>>>>>> +
>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>>>> num_fences);
>>>>>>>>>> +                       drm_exec_retry_on_contention(exec);
>>>>>>>>>> +                       if (ret)
>>>>>>>>>> +                               goto err;
>>>>>>>>>> +               }
>>>>>>>>>> +       }
>>>>>>>>>> +
>>>>>>>>>> +       return 0;
>>>>>>>>>> +
>>>>>>>>>> +err:
>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>> +       return ret;
>>>>>>>>>> +}
>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>>> +
>>>>>>>>>> +static int
>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int
>>>>>>>>>> num_fences)
>>>>>>>>>> +{
>>>>>>>>>> +       struct {
>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>> +
>>>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args->objs,
>>>>>>>>>> +                                     args->num_objs, num_fences);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all
>>>>>>>>>> associated BOs
>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>> + * @num_objs: the number of additional &drm_gem_objects to
>>>>>>>>>> lock
>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>> + *
>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>>> given &drm_gpuvm
>>>>>>>>>> + * contains mappings of, plus the ones given through @objs.
>>>>>>>>>> + *
>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>> + */
>>>>>>>>>> +int
>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>>>> +                         unsigned int num_objs,
>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>> +{
>>>>>>>>>> +       struct {
>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>> +       } args;
>>>>>>>>>> +
>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>> +
>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>> +
>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>>>> interruptible);
>>>>>>>>>> +}
>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>>>> within a given range
>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>> + *
>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>>>> mapped between @addr and
>>>>>>>>>> + * @addr + @range.
>>>>>>>>>> + *
>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>> + */
>>>>>>>>>> +int
>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>> +{
>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>> +       int ret;
>>>>>>>>>> +
>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>> +
>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>> +
>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec,
>>>>>>>>>> addr, range,
>>>>>>>>>> + num_fences);
>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>> +               if (ret)
>>>>>>>>>> +                       goto err;
>>>>>>>>>> +       }
>>>>>>>>>> +
>>>>>>>>>> +       return ret;
>>>>>>>>>> +
>>>>>>>>>> +err:
>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>> +       return ret;
>>>>>>>>>> +}
>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as evicted
>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>> + *
>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>>>>> evicted buffer
>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>> + *
>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>> + */
>>>>>>>>>> +int
>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>> +{
>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>> +       int ret = 0;
>>>>>>>>>> +
>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>> +
>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>> +               if (ret)
>>>>>>>>>> +                       break;
>>>>>>>>>> +       }
>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>> +
>>>>>>>>>> +       return ret;
>>>>>>>>>> +}
>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and all
>>>>>>>>>> extobj
>>>>>>>>>> + * dma-resv
>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>> + */
>>>>>>>>>> +void
>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>>>> +{
>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>> +       unsigned long index;
>>>>>>>>>> +
>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm, obj) ?
>>>>>>>>>> +                                  extobj_usage : private_usage);
>>>>>>>>>> +       }
>>>>>>>>>> +}
>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>> +
>>>>>>>>>>     /**
>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of struct
>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>>>>> *gpuvm,
>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>> +
>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>          return vm_bo;
>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>> +
>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>> +
>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>>>>>>>>      *
>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>> + *
>>>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is
>>>>>>>>>> destroyed, which
>>>>>>>>>> + * includes removing it from the GEM's gpuva list. Hence, if
>>>>>>>>>> a call to this
>>>>>>>>>> + * function can potentially drop the reference count to zero,
>>>>>>>>>> the caller must
>>>>>>>>>> + * hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>>>>      */
>>>>>>>>>>     void
>>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
>>>>>>>>>> *vm_bo)
>>>>>>>>>>     }
>>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>> +static int __must_check
>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>> +{
>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>     }
>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its
>>>>>>>>>> &drm_gpuvm's
>>>>>>>>>> + * extobj list
>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's
>>>>>>>>>> extobj list.
>>>>>>>>>> + *
>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if
>>>>>>>>>> not on the list
>>>>>>>>>> + * already and if the corresponding &drm_gem_object actually is
>>>>>>>>>> an external object.
>>>>>>>>>> + */
>>>>>>>>>> +void
>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>> +{
>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>> +
>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>>> +}
>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to
>>>>>>>>>> / from a
>>>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>> + *
>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from the evicted
>>>>>>>>>> lists of all
>>>>>>>>>> + * &drm_gpuvms containing a mapping of this &drm_gem_object.
>>>>>>>>>> + */
>>>>>>>>>> +void
>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>>>>> +{
>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>> +
>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>> +               if (evict)
>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, evict);
>>>>>>>>>> +               else
>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, evict);
>>>>>>>>>> +       }
>>>>>>>>>> +}
>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>>> +
>>>>>>>>>>     static int
>>>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>      */
>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>           * space
>>>>>>>>>>           */
>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>> +
>>>>>>>>>> +       /**
>>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>>> +        */
>>>>>>>>>> +       struct {
>>>>>>>>>> +               /**
>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>>> serving as
>>>>>>>>>> +                * external object
>>>>>>>>>> +                */
>>>>>>>>>> +               struct list_head list;
>>>>>>>>>> +
>>>>>>>>>> +               /**
>>>>>>>>>> +                * @lock: spinlock to protect the extobj list
>>>>>>>>>> +                */
>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>> +       } extobj;
>>>>>>>>>> +
>>>>>>>>>> +       /**
>>>>>>>>>> +        * @evict: structure holding the evict list and evict
>>>>>>>>>> list lock
>>>>>>>>>> +        */
>>>>>>>>>> +       struct {
>>>>>>>>>> +               /**
>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>>> currently being
>>>>>>>>>> +                * evicted
>>>>>>>>>> +                */
>>>>>>>>>> +               struct list_head list;
>>>>>>>>>> +
>>>>>>>>>> +               /**
>>>>>>>>>> +                * @lock: spinlock to protect the evict list
>>>>>>>>>> +                */
>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>> +       } evict;
>>>>>>>>>>     };
>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>> drm_device *drm,
>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>>>>> &drm_gem_object is an
>>>>>>>>>> + * external object
>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>> + *
>>>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv differs
>>>>>>>>>> from the
>>>>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>>>>> + */
>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
>>>>>>>>>> *gpuvm,
>>>>>>>>>> +                                      struct drm_gem_object
>>>>>>>>>> *obj)
>>>>>>>>>> +{
>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>     {
>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__)
>>>>>>>>>> \
>>>>>>>>>>          list_for_each_entry_safe(va__, next__, &(gpuvm__)-
>>>>>>>>>>> rb.list, rb.entry)
>>>>>>>>>> +/**
>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>>>> &drm_exec
>>>>>>>>>> + *
>>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>>> &drm_exec should be.
>>>>>>>>>> + *
>>>>>>>>>> + * Optionally, @extra can be set in order to lock additional
>>>>>>>>>> &drm_gem_objects.
>>>>>>>>>> + */
>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>> +       /**
>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>> +        */
>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>> +
>>>>>>>>>> +       /**
>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>>>>> +        */
>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>> +
>>>>>>>>>> +       /**
>>>>>>>>>> +        * @extra: Callback and corresponding private data
>>>>>>>>>> for the driver to
>>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>>> +        */
>>>>>>>>>> +       struct {
>>>>>>>>>> +               /**
>>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>>> +                */
>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>> +                         unsigned int num_fences);
>>>>>>>>>> +
>>>>>>>>>> +               /**
>>>>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>>>>> callback
>>>>>>>>>> +                */
>>>>>>>>>> +               void *priv;
>>>>>>>>>> +       } extra;
>>>>>>>>>> +};
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
>>>>>>>>>> resv
>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>> + *
>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>>>>> &drm_gem_object.
>>>>>>>>>> + *
>>>>>>>>>> + * Using this function directly, it is the drivers
>>>>>>>>>> responsibility to call
>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>> + *
>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>> + */
>>>>>>>>>> +static inline int
>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>>> +{
>>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>>>> num_fences);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>> +                             unsigned int num_fences);
>>>>>>>>>> +
>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>>> +                           unsigned int num_fences);
>>>>>>>>>> +
>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>> +
>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>>>> *vm_exec,
>>>>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>>>>> +                             unsigned int num_objs,
>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>> +
>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>>>> *vm_exec,
>>>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all
>>>>>>>>>> associated BOs
>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec wrapper
>>>>>>>>>> + *
>>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects
>>>>>>>>>> previously acquired
>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>> + */
>>>>>>>>>> +static inline void
>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>> +{
>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>> private_usage,
>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>> extobj_usage);
>>>>>>>>>> +
>>>>>>>>>> +/**
>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence()
>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>> + *
>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>> + */
>>>>>>>>>> +static inline void
>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>>>>> *vm_exec,
>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>> private_usage,
>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>> extobj_usage)
>>>>>>>>>> +{
>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec,
>>>>>>>>>> fence,
>>>>>>>>>> +                                private_usage,
>>>>>>>>>> extobj_usage);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>>     /**
>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>                           * gpuva list.
>>>>>>>>>>                           */
>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>> +
>>>>>>>>>> +                       /**
>>>>>>>>>> +                        * @extobj: List entry to attach to
>>>>>>>>>> the &drm_gpuvm's
>>>>>>>>>> +                        * extobj list.
>>>>>>>>>> +                        */
>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>> +
>>>>>>>>>> +                       /**
>>>>>>>>>> +                        * @evict: List entry to attach to
>>>>>>>>>> the &drm_gpuvms evict
>>>>>>>>>> +                        * list.
>>>>>>>>>> +                        */
>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>                  } entry;
>>>>>>>>>>          } list;
>>>>>>>>>>     };
>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
>>>>>>>>>> evict);
>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>> +
>>>>>>>>>>     /**
>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a
>>>>>>>>>> list of &drm_gpuva
>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>>>> iteration step
>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>           * used.
>>>>>>>>>>           */
>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void
>>>>>>>>>> *priv);
>>>>>>>>>> +
>>>>>>>>>> +       /**
>>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>>> +        *
>>>>>>>>>> +        * Drivers receive this callback for every evicted
>>>>>>>>>> &drm_gem_object being
>>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>>> +        *
>>>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>>>> specific variant of
>>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>>> +        */
>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>>     };
>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
>>>
>>
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 13:22               ` Thomas Hellström
  2023-09-13 14:01                 ` Boris Brezillon
@ 2023-09-14  8:20                 ` Boris Brezillon
  2023-09-14 10:45                   ` Thomas Hellström
  1 sibling, 1 reply; 77+ messages in thread
From: Boris Brezillon @ 2023-09-14  8:20 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: Dave Airlie, Danilo Krummrich, daniel, matthew.brost,
	sarah.walker, donald.robson, christian.koenig, faith.ekstrand,
	dri-devel, nouveau, linux-kernel

On Wed, 13 Sep 2023 15:22:56 +0200
Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:

> On 9/13/23 13:33, Boris Brezillon wrote:
> > On Wed, 13 Sep 2023 12:39:01 +0200
> > Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >  
> >> Hi,
> >>
> >> On 9/13/23 09:19, Boris Brezillon wrote:  
> >>> On Wed, 13 Sep 2023 17:05:42 +1000
> >>> Dave Airlie <airlied@gmail.com> wrote:
> >>>     
> >>>> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
> >>>> <boris.brezillon@collabora.com> wrote:  
> >>>>> On Tue, 12 Sep 2023 18:20:32 +0200
> >>>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >>>>>        
> >>>>>>> +/**
> >>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
> >>>>>>> + * @__gpuvm: The GPU VM
> >>>>>>> + * @__list_name: The name of the list we're iterating on
> >>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
> >>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> >>>>>>> + *
> >>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
> >>>>>>> + * iterator releases the lock immediately after picking the first element from
> >>>>>>> + * the list, so list insertion deletion can happen concurrently.  
> >>>>>> Are the list spinlocks needed for that async state update from within
> >>>>>> the dma-fence critical section we've discussed previously?  
> >>>>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
> >>>>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
> >>>>> get that Xe and Nouveau don't need that because they update the VM
> >>>>> state early (in the ioctl path), but I keep thinking this will hurt us
> >>>>> if we don't think it through from the beginning, because once you've
> >>>>> set this logic to depend only on resv locks, it will be pretty hard to
> >>>>> get back to a solution which lets synchronous VM_BINDs take precedence
> >>>>> on asynchronous request, and, with vkQueueBindSparse() passing external
> >>>>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
> >>>>> take a long time to get your synchronous VM_BIND executed...  
> >> So this would boil down to either (possibly opt-in) keeping the spinlock
> >> approach or pushing the unlink out to a wq then?  
> > Deferred _unlink() would not be an issue, since I already defer the
> > drm_gpuva destruction to a wq, it would just a be a matter of moving the
> > _unlink() call there as well. But _link() also takes the GEM gpuva list
> > lock, and that one is bit tricky, in that sm_map() can trigger 2 more
> > _link() calls for the prev/next mappings, which we can't guess until we
> > get to execute the VM update. If we mandate the use of the GEM resv
> > lock, that simply means async VM updates (AKA calling
> > drm_gpuvm_sm_[un]map()) are not an option. And if this is what everyone
> > agrees on, then I'd like the APIs that make this sort of async VM
> > update possible (drm_gpuvm_sm_[un]map(), the drm_gpuvm_ops::sm_step*
> > methods, and probably other things) to be dropped, so we don't make it
> > look like it's something we support.
> >  
> >> BTW, as also asked in a reply to Danilo, how do you call unlink from
> >> run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?  
> > _unlink() makes sure the GEM gpuva list lock is taken, but this can be
> > a custom lock (see drm_gem_gpuva_set_lock()). In panthor we have
> > panthor_gem_object::gpuva_list_lock that's dedicated the gpuva list
> > protection. We make sure we never take this lock while allocating
> > memory to guarantee the dma-signalling path can't deadlock.
> >  
> >>>>>        
> >>>> btw what is the use case for this? do we have actual vulkan
> >>>> applications we know will have problems here?  
> >>> I don't, but I think that's a concern Faith raised at some point (dates
> >>> back from when I was reading threads describing how VM_BIND on i915
> >>> should work, and I was clearly discovering this whole VM_BIND thing at
> >>> that time, so maybe I misunderstood).
> >>>     
> >>>> it feels like a bit of premature optimisation, but maybe we have use cases.  
> >>> Might be, but that's the sort of thing that would put us in a corner if
> >>> we don't have a plan for when the needs arise. Besides, if we don't
> >>> want to support that case because it's too complicated, I'd recommend
> >>> dropping all the drm_gpuvm APIs that let people think this mode is
> >>> valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
> >>> drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
> >>> confusion.  
> >> Xe allows bypassing the bind-queue with another bind-queue, but to
> >> completely avoid dependencies between queues the Operations may not
> >> overlap.  
> > So, you check the VM state with some VM lock held (would be the VM resv
> > in my case), and if the mapping is new (no overlaps with pre-existing
> > mappings), you queue it to the fast-track/sync-VM_BIND queue. What would
> > be missing I guess is a way to know if the mapping is active (MMU has
> > been updated) or pending (MMU update queued to the bind-queue), so I can
> > fast-track mapping/unmapping of active mappings.

Ok, so I started modifying the implementation, and quickly realized the
overlap test can't be done without your xe_range_fence tree because of
unmaps. Since we call drm_gpuva_unmap() early/in the IOCTL path (IOW,
before the mapping teardown is effective), we lose track of this
yet-to-be-executed-unmap operation, and if we do our
va_range_overlaps_with_existing_mappings() test after such an unmap has
been queued using just the drm_gpuvm tree, we might get false even if
the mapping still exists and is expected to be torn down when the
VM_BIND(unmap) job is executed on the bind-queue. As a result, this
might execute the VM_BIND(map,sync) immediately (because the dependency
went undetected), and then the vm_bind_run_job() function kicks in and
undoes what the synchronous VM_BIND(map) did. Am I missing something?
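
To make that concrete, here's roughly what such a tree-only overlap test
would look like (just a sketch; va_range_overlaps_with_existing_mappings()
is the hypothetical helper mentioned above, and I'm assuming
drm_gpuva_find_first() as the lookup, whose exact name may differ after the
rename):

static bool
va_range_overlaps_with_existing_mappings(struct drm_gpuvm *gpuvm,
					 u64 addr, u64 range)
{
	/*
	 * Only reflects the VA tree, i.e. the *future* VM state. A mapping
	 * whose drm_gpuva_unmap() already ran in the IOCTL path, but whose
	 * teardown is still queued on the bind-queue, is no longer in the
	 * tree, so the overlap goes undetected.
	 */
	return drm_gpuva_find_first(gpuvm, addr, range) != NULL;
}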

If I'm correct, that means I'm back to having synchronous VM_BIND ops
queued after all asynchronous ones unless I use something like your
xe_range_fence solution (which I was hoping I could postpone until we
decide to expose multiple bind queues).

I'm still a bit skeptical about this 'update VM mappings tree early,
defer MMU page table updates' approach, where the VM state and the
actual page table tree are temporarily out of sync until all operations
have been flushed on all queues targeting a VM. This means any test we
do on the gpuvm, like, 'give me the BO mapped at VA xxx', is subject to
'is this the current state or the future state?' questioning. Note that
we can't even get the current VM state anymore, because all that
drm_gpuvm::tree stores with this solution is the future state, and
to-be-unmapped mappings are lost during the transitioning period (when
vm_bind jobs are queued but not executed yet).

> > This would leave
> > overlapping sync/async VM updates, which can't happen in practice
> > unless userspace is doing something wrong (sparse bindings always go
> > through vkQueueBindSparse).  
> 
> User-space is allowed to create new bind queues at will, and they 
> execute independently save for range overlaps.
> 
> And the overlapping granularity depends very much on the detail of the 
> range tracking.
> We drafted this fenced range utility
>
> https://gitlab.freedesktop.org/drm/xe/kernel/-/merge_requests/353

I'll try to see if there's a way we can have something generic shared
at the gpuvm level.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14  8:20                 ` Boris Brezillon
@ 2023-09-14 10:45                   ` Thomas Hellström
  2023-09-14 11:54                     ` Boris Brezillon
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-14 10:45 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Dave Airlie, Danilo Krummrich, daniel, matthew.brost,
	sarah.walker, donald.robson, christian.koenig, faith.ekstrand,
	dri-devel, nouveau, linux-kernel


On 9/14/23 10:20, Boris Brezillon wrote:
> On Wed, 13 Sep 2023 15:22:56 +0200
> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>
>> On 9/13/23 13:33, Boris Brezillon wrote:
>>> On Wed, 13 Sep 2023 12:39:01 +0200
>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>>>   
>>>> Hi,
>>>>
>>>> On 9/13/23 09:19, Boris Brezillon wrote:
>>>>> On Wed, 13 Sep 2023 17:05:42 +1000
>>>>> Dave Airlie <airlied@gmail.com> wrote:
>>>>>      
>>>>>> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
>>>>>> <boris.brezillon@collabora.com> wrote:
>>>>>>> On Tue, 12 Sep 2023 18:20:32 +0200
>>>>>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>>>>>>>         
>>>>>>>>> +/**
>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
>>>>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>> + *
>>>>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
>>>>>>>>> + * iterator releases the lock immediately after picking the first element from
>>>>>>>>> + * the list, so list insertion deletion can happen concurrently.
>>>>>>>> Are the list spinlocks needed for that async state update from within
>>>>>>>> the dma-fence critical section we've discussed previously?
>>>>>>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
>>>>>>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
>>>>>>> get that Xe and Nouveau don't need that because they update the VM
>>>>>>> state early (in the ioctl path), but I keep thinking this will hurt us
>>>>>>> if we don't think it through from the beginning, because once you've
>>>>>>> set this logic to depend only on resv locks, it will be pretty hard to
>>>>>>> get back to a solution which lets synchronous VM_BINDs take precedence
>>>>>>> on asynchronous request, and, with vkQueueBindSparse() passing external
>>>>>>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
>>>>>>> take a long time to get your synchronous VM_BIND executed...
>>>> So this would boil down to either (possibly opt-in) keeping the spinlock
>>>> approach or pushing the unlink out to a wq then?
>>> Deferred _unlink() would not be an issue, since I already defer the
>>> drm_gpuva destruction to a wq, it would just a be a matter of moving the
>>> _unlink() call there as well. But _link() also takes the GEM gpuva list
>>> lock, and that one is bit tricky, in that sm_map() can trigger 2 more
>>> _link() calls for the prev/next mappings, which we can't guess until we
>>> get to execute the VM update. If we mandate the use of the GEM resv
>>> lock, that simply means async VM updates (AKA calling
>>> drm_gpuvm_sm_[un]map()) are not an option. And if this is what everyone
>>> agrees on, then I'd like the APIs that make this sort of async VM
>>> update possible (drm_gpuvm_sm_[un]map(), the drm_gpuvm_ops::sm_step*
>>> methods, and probably other things) to be dropped, so we don't make it
>>> look like it's something we support.
>>>   
>>>> BTW, as also asked in a reply to Danilo, how do you call unlink from
>>>> run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?
>>> _unlink() makes sure the GEM gpuva list lock is taken, but this can be
>>> a custom lock (see drm_gem_gpuva_set_lock()). In panthor we have
>>> panthor_gem_object::gpuva_list_lock that's dedicated the gpuva list
>>> protection. We make sure we never take this lock while allocating
>>> memory to guarantee the dma-signalling path can't deadlock.
>>>   
>>>>>>>         
>>>>>> btw what is the use case for this? do we have actual vulkan
>>>>>> applications we know will have problems here?
>>>>> I don't, but I think that's a concern Faith raised at some point (dates
>>>>> back from when I was reading threads describing how VM_BIND on i915
>>>>> should work, and I was clearly discovering this whole VM_BIND thing at
>>>>> that time, so maybe I misunderstood).
>>>>>      
>>>>>> it feels like a bit of premature optimisation, but maybe we have use cases.
>>>>> Might be, but that's the sort of thing that would put us in a corner if
>>>>> we don't have a plan for when the needs arise. Besides, if we don't
>>>>> want to support that case because it's too complicated, I'd recommend
>>>>> dropping all the drm_gpuvm APIs that let people think this mode is
>>>>> valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
>>>>> drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
>>>>> confusion.
>>>> Xe allows bypassing the bind-queue with another bind-queue, but to
>>>> completely avoid dependencies between queues the Operations may not
>>>> overlap.
>>> So, you check the VM state with some VM lock held (would be the VM resv
>>> in my case), and if the mapping is new (no overlaps with pre-existing
>>> mappings), you queue it to the fast-track/sync-VM_BIND queue. What would
>>> be missing I guess is a way to know if the mapping is active (MMU has
>>> been updated) or pending (MMU update queued to the bind-queue), so I can
>>> fast-track mapping/unmapping of active mappings.
> Ok, so I started modifying the implementation, and quickly realized the
> overlap test can't be done without your xe_range_fence tree because of
> unmaps. Since we call drm_gpuva_unmap() early/in the IOCTL path (IOW,
> before the mapping teardown is effective), we lose track of this
> yet-to-be-executed-unmap operation, and if we do our
> va_range_overlaps_with_existing_mappings() test after such an unmap has
> been queued using just the drm_gpuvm tree, we might get false even if
> the mapping still exists and is expected to be torn down when the
> VM_BIND(unmap) job is executed on the bind-queue. As a result, this
> might execute the VM_BIND(map,sync) immediately (because the dependency
> went undetected), and then the vm_bind_run_job() function kicks in and
> undoes what the synchronous VM_BIND(map) did. Am I missing something?
>
> If I'm correct, that means I'm back to having synchronous VM_BIND ops
> queued after all asynchronous ones unless I use something like your
> xe_range_fence solution (which I was hoping I could postpone until we
> decide to expose multiple bind queues).

Yes, unfortunately fine-grained async range-tracking comes with a cost.
Still, if you are doing page-table updates solely with the CPU, you 
could probably short-circuit the fence part of the fenced ranges?


>
> I'm still a bit skeptical about this 'update VM mappings tree early,
> defer MMU page table updates' approach, where the VM state and the
> actual page table tree are temporarily out of sync until all operations
> have been flushed on all queues targeting a VM. This means any test we
> do on the gpuvm, like, 'give me the BO mapped at VA xxx', is subject to
> 'is this the current state or the future state?' questioning. Note that
> we can't even get the current VM state anymore, because all the
> drm_gpuvm::tree stores with this solution is the future state, and
> to-be-unmapped mappings are lost during the transitioning period (when
> vm_bind jobs are queued but not executed yet).

Understandable. But this is the way we have historically been doing
things (I think the whole async atomic page-flipping uses the same
concept), but rather than referring to it as current state and future
state, I'd like to think of it as synchronous CPU state (what an API user
sees) vs GPU state (what the GPU sees where it's currently executing).
To bring them in sync you need to wait for fences. And ideally the async
work should never fail.
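
For example, something like this (sketch only, assuming the fences of all
pending VM updates end up in the VM's common dma-resv, and the helper name
is made up):

static void vm_sync_pending_updates(struct drm_gpuvm *gpuvm)
{
	/* Wait for all pending (async) VM updates before inspecting the VM. */
	dma_resv_wait_timeout(gpuvm->resv, DMA_RESV_USAGE_BOOKKEEP,
			      true, MAX_SCHEDULE_TIMEOUT);
}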

If one wants to push async work out to be handled solely by the GPU,
this is the way things must be done, since the GPU can't take locks or
allocate memory. But since part or all of the async work is sometimes done
using the CPU, it might make sense to challenge that to some extent.
There are indeed pros and cons to both approaches.

/Thomas

>
>>> This would leave
>>> overlapping sync/async VM updates, which can't happen in practice
>>> unless userspace is doing something wrong (sparse bindings always go
>>> through vkQueueBindSparse).
>> User-space is allowed to create new bind queues at will, and they
>> execute independently save for range overlaps.
>>
>> And the overlapping granularity depends very much on the detail of the
>> range tracking.
>> We drafted this fenced range utility
>>
>> https://gitlab.freedesktop.org/drm/xe/kernel/-/merge_requests/353
> I'll try to see if there's a way we can have something generic shared
> at the gpuvm level.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [Nouveau] [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 12:16             ` Danilo Krummrich
  2023-09-13 14:26               ` Christian König
@ 2023-09-14 10:57               ` Danilo Krummrich
  2023-09-14 11:32                 ` Thomas Hellström
  1 sibling, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-14 10:57 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: matthew.brost, sarah.walker, nouveau, linux-kernel, dri-devel,
	boris.brezillon, donald.robson, daniel, christian.koenig,
	faith.ekstrand

On 9/13/23 14:16, Danilo Krummrich wrote:

<snip>

>>> And validate() can remove it while still holding all dma-resv locks,
>>> neat!
>>> However, what if two tasks are trying to lock the VA space
>>> concurrently? What
>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>> drm_gpuva_unlink()?
>>> Are we guaranteed that at this point of time the drm_gpuvm_bo is not
>>> on the
>>> evicted list? Because otherwise we would call drm_gpuvm_bo_destroy()
>>> with the
>>> dma-resv lock held, which wouldn't be allowed, since
>>> drm_gpuvm_bo_destroy()
>>> might drop the last reference to the drm_gem_object and hence we'd
>>> potentially
>>> free the dma-resv lock while holding it, at least if it's an external
>>> object.
>>
>> Easiest way in this scheme is to think of the lists as being protected
>> by the vm's resv lock. That means anybody calling unlink() must also
>> hold the vm's resv lock. (Which is OK from an UAF point of view, but
>> perhaps not from a locking inversion POW from an async list update).
> 
> This would mean that on unlink() we'd need to hold the VM's resv lock and the
> corresponding GEM's resv lock (in case they're not the same anyways) because the
> VM's resv lock would protect the external / evicted object lists and the GEM
> objects resv lock protects the GEM's list of drm_gpuvm_bos and the
> drm_gpuvm_bo's list of drm_gpuvas.

As mentioned below, the same applies to drm_gpuvm_bo_put(), since it might
destroy the vm_bo, which includes removing the vm_bo from the external / evicted
object lists and the GEM's list of vm_bos.

As mentioned, if the GEM's dma-resv is different from the VM's dma-resv we need
to take both locks. Ultimately, this would mean we need a drm_exec loop, because
we can't know the order in which to take these locks. Doing a full drm_exec loop
just to put() a vm_bo doesn't sound reasonable to me.
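
To illustrate, the drm_exec dance we'd need just for a put() would look
roughly like this (sketch only; vm_bo_put_locked() is a made-up name, and
the VM's dummy GEM object from this series is used to get at the VM's resv):

static void
vm_bo_put_locked(struct drm_gpuvm *gpuvm, struct drm_gpuvm_bo *vm_bo)
{
	struct drm_exec exec;
	int ret = 0;

	drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
	drm_exec_until_all_locked(&exec) {
		/* The VM's resv, via the VM's dummy GEM object. */
		ret = drm_exec_lock_obj(&exec, &gpuvm->d_obj);
		drm_exec_retry_on_contention(&exec);
		if (ret)
			break;

		/* The GEM's resv, in case it differs from the VM's. */
		ret = drm_exec_lock_obj(&exec, vm_bo->obj);
		drm_exec_retry_on_contention(&exec);
		if (ret)
			break;
	}
	if (!ret)
		drm_gpuvm_bo_put(vm_bo);
	drm_exec_fini(&exec);
}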

Can we instead just have an internal mutex for locking the lists, such that we
avoid the taking and dropping of the spinlocks we currently do in a loop?

- Danilo

> 
>>
>>>
>>>>>
>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>> really would not
>>>>> like to add even more complexity just to get the spinlock out of
>>>>> the way in case
>>>>> the driver already has an outer lock protecting this path.
>>>>
>>>> I must disagree here. These spinlocks and atomic operations are
>>>> pretty
>>>> costly and as discussed earlier this type of locking was the reason
>>>> (at
>>>> least according to the commit message) that made Christian drop the
>>>> XArray
>>>> use in drm_exec for the same set of objects: "The locking overhead
>>>> is
>>>> unecessary and measurable". IMHO the spinlock is the added
>>>> complexity and a
>>>> single wide lock following the drm locking guidelines set out by
>>>> Daniel and
>>>> David should really be the default choice with an opt-in for a
>>>> spinlock if
>>>> needed for async and pushing out to a wq is not an option.
>>>
>>> For the external object list an outer lock would work as long as it's
>>> not the
>>> dma-resv lock of the corresponding GEM object, since here we actually
>>> need to
>>> remove the list entry from the external object list on
>>> drm_gpuvm_bo_destroy().
>>> It's just a bit weird design wise that drivers would need to take
>>> this outer
>>> lock on:
>>>
>>> - drm_gpuvm_bo_extobj_add()
>>> - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
>>> - drm_gpuva_unlink()            (because it needs to call
>>> drm_gpuvm_bo_put())
>>> - drm_gpuvm_exec_lock()
>>> - drm_gpuvm_exec_lock_array()
>>> - drm_gpuvm_prepare_range()
>>>
>>> Given that it seems reasonable to do all the required locking
>>> internally.
>>
>>  From a design POW, there has been a clear direction in XE to make
>> things similar to mmap() / munmap(), so this outer lock, which in Xe is
>> an rwsem, is used in a similar way as the mmap_lock. It's protecting
>> the page-table structures and vma rb tree, the userptr structures and
>> the extobj list. Basically it's taken early in the exec IOCTL, the
>> VM_BIND ioctl, the compute rebind worker and the pagefault handler, so
>> all of the above are just asserting that it is taken in the correct
>> mode.
>>
>> But strictly with this scheme one could also use the vm's dma_resv for
>> the extobj list since with drm_exec, it's locked before traversing the
>> list.
>>
>> The whole point of this scheme is to rely on locks that you already are
>> supposed to be holding for various reasons and is simple to comprehend.
> 
> I don't agree that we're supposed to hold the VM's resv lock anyways for
> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm fine using it
> for that purpose nevertheless.
> 
>>
>>>
>>> In order to at least place lockdep checks, the driver would need to
>>> supply the
>>> corresponding lock's lockdep_map, because the GPUVM otherwise doesn't
>>> know about
>>> the lock.
>>
>> Yes, that sounds reasonable. One lockdep map per list.
> 
> I'd really like to avoid that, especially now that everything got simpler. We
> should define the actual locks to take instead.
> 
>>
>>>
>>> Out of curiosity, what is the overhead of a spin_lock() that doesn't
>>> need to
>>> spin?
>>
>> I guess it's hard to tell exactly, but it is much lower on modern x86
>> than what it used to be. Not sure about ARM, which is the other
>> architecture important to us. I figure if there is little cache-line
>> bouncing the main overhead comes from the implied barriers.
>>
>>>
>>>>
>>>> A pretty simple way that would not add much code would be
>>>>
>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>> spinlock_t
>>>> *lock)
>>>>
>>>> {
>>>>
>>>>      if (!gpuvm->resv_protected_lists)
>>>>          spin_lock(lock);
>>>>
>>>> }
>>>>
>>>>>> For such drivers, that would require anybody calling unlink to
>>>>>> hold the vm's
>>>>>> resv, though.
>>>>> In V4 I want to go back to having a dedicated lock for the GEMs
>>>>> gpuva list (or
>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>> lock for that
>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>> otherwise wouldn't
>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>> was referring to
>>>>> earlier.
>>>>
>>>> Yeah, I can see the need for a dedicated lock for the GEM's gpuva
>>>> list, but
>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>> problem. We
>>>> may free the object and a pointer to the vm's resv during unlink
>>>> but we
>>>> don't free the vm's resv.  It'd be a matter of ensuring that any
>>>> calls to
>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>
>>> Drivers calling unlink() from the fence signaling path can't use the
>>> VM's
>>> dma-resv lock.
>>
>> Yes, that made me a bit curious because in the current version the code
>> required the object's dma_resv for unlink() which can't be grabbed
>> either from the fence signaling path. So are there any drivers actually
>> wanting to do that? If so, they will either need to resort to the
>> current spinlock solution or they will need to call unlink from a
>> workqueue item.
> 
> As Boris already mentioned we have the dma-resv lock by default or a driver
> specific GEM gpuva lock as opt-in. Now, we can get rid of the latter.
> 
>>>
>>> Also, what if the object is an external object? We can't use the VM's
>>> dma-resv
>>> lock here.
>>
>> Why? Typically (sync) unlink is only ever called from an unbind-like
>> operation where it should be trivial to grab the vm's resv. Or, for
>> that matter any outer lock protecting the extobj list. Rule would be
>> the drm_gpuvm_bo::entry::extobj  and drm_gpuvm_bo::entry::evict would
>> be protected by either the vm's dma_resv (or possibly an outer lock in
>> the case of the extobj list).
> 
> Outer lock wouldn't have been working for updates in the async path, but
> shouldn't be relevant anymore. We could use the VM's resv for that.
> 
>>
>>>   And we can't have the GEM objs dma-resv lock held when calling
>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>> refcount drops
>>> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might
>>> drop the
>>> last reference of the GEM object.
>>
>> Yes, but this is a different problem as to what exactly protects
>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per bo list
>> lock, or if we want to keep the bo's dma_resv we need to ensure that
>> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
>> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount (I know
>> Boris didn't like that, but requiring an explicit refcount for a
>> pointer you dereference unless you're under a lock that ensures keeping
>> the object alive is pretty much required?) But anyway for the
>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal spinlock)
>> I don't have a strong preference.
> 
> We can keep the GEM objects dma-resv lock, however as mentioned above
> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both the VM's resv lock
> and the GEM's resv lock in case they differ.
> 

>>>>


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [Nouveau] [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14 10:57               ` [Nouveau] " Danilo Krummrich
@ 2023-09-14 11:32                 ` Thomas Hellström
  2023-09-14 15:27                   ` Danilo Krummrich
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-14 11:32 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: matthew.brost, sarah.walker, nouveau, linux-kernel, dri-devel,
	boris.brezillon, donald.robson, daniel, christian.koenig,
	faith.ekstrand


On 9/14/23 12:57, Danilo Krummrich wrote:
> On 9/13/23 14:16, Danilo Krummrich wrote:
>
> <snip>
>
>>>> And validate() can remove it while still holding all dma-resv locks,
>>>> neat!
>>>> However, what if two tasks are trying to lock the VA space
>>>> concurrently? What
>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>> drm_gpuva_unlink()?
>>>> Are we guaranteed that at this point of time the drm_gpuvm_bo is not
>>>> on the
>>>> evicted list? Because otherwise we would call drm_gpuvm_bo_destroy()
>>>> with the
>>>> dma-resv lock held, which wouldn't be allowed, since
>>>> drm_gpuvm_bo_destroy()
>>>> might drop the last reference to the drm_gem_object and hence we'd
>>>> potentially
>>>> free the dma-resv lock while holding it, at least if it's an external
>>>> object.
>>>
>>> Easiest way in this scheme is to think of the lists as being protected
>>> by the vm's resv lock. That means anybody calling unlink() must also
>>> hold the vm's resv lock. (Which is OK from an UAF point of view, but
>>> perhaps not from a locking inversion POW from an async list update).
>>
>> This would mean that on unlink() we'd need to hold the VM's resv lock 
>> and the
>> corresponding GEM's resv lock (in case they're not the same anyways) 
>> because the
>> VM's resv lock would protect the external / evicted object lists and 
>> the GEM
>> objects resv lock protects the GEM's list of drm_gpuvm_bos and the
>> drm_gpuvm_bo's list of drm_gpuvas.
>
> As mentioned below the same applies for drm_gpuvm_bo_put() since it might
> destroy the vm_bo, which includes removing the vm_bo from external / 
> evicted
> object lists and the GEMs list of vm_bos.
>
> As mentioned, if the GEM's dma-resv is different from the VM's 
> dma-resv we need
> to take both locks. Ultimately, this would mean we need a drm_exec 
> loop, because
> we can't know the order in which to take these locks. Doing a full 
> drm_exec loop
> just to put() a vm_bo doesn't sound reasonable to me.
>
> Can we instead just have an internal mutex for locking the lists such 
> that we
> avoid taking and dropping the spinlocks, which we use currently, in a 
> loop?

You'd have the same locking inversion problem with a mutex, right? In
the eviction path you have resv->mutex, while from exec you have
resv->mutex->resv, because validate would attempt to grab the resv.

That said, xe currently indeed does the vm+bo exec dance on vma put.

One reason why that seemingly horrible construct is good is that when
evicting an extobj you need to access individual VMAs to zap page-table
entries or do a TLB flush, and those VMAs are not allowed to go away (we're
not refcounting them). Holding the bo resv on gpuva put prevents that
from happening. Possibly one could use another mutex to protect the
gem->vm_bo list to achieve the same, but we'd need to hold it on gpuva put.

/Thomas


>
> - Danilo
>
>>
>>>
>>>>
>>>>>>
>>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>>> really would not
>>>>>> like to add even more complexity just to get the spinlock out of
>>>>>> the way in case
>>>>>> the driver already has an outer lock protecting this path.
>>>>>
>>>>> I must disagree here. These spinlocks and atomic operations are
>>>>> pretty
>>>>> costly and as discussed earlier this type of locking was the reason
>>>>> (at
>>>>> least according to the commit message) that made Christian drop the
>>>>> XArray
>>>>> use in drm_exec for the same set of objects: "The locking overhead
>>>>> is
>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>> complexity and a
>>>>> single wide lock following the drm locking guidelines set out by
>>>>> Daniel and
>>>>> David should really be the default choice with an opt-in for a
>>>>> spinlock if
>>>>> needed for async and pushing out to a wq is not an option.
>>>>
>>>> For the external object list an outer lock would work as long as it's
>>>> not the
>>>> dma-resv lock of the corresponding GEM object, since here we actually
>>>> need to
>>>> remove the list entry from the external object list on
>>>> drm_gpuvm_bo_destroy().
>>>> It's just a bit weird design wise that drivers would need to take
>>>> this outer
>>>> lock on:
>>>>
>>>> - drm_gpuvm_bo_extobj_add()
>>>> - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
>>>> - drm_gpuva_unlink()            (because it needs to call
>>>> drm_gpuvm_bo_put())
>>>> - drm_gpuvm_exec_lock()
>>>> - drm_gpuvm_exec_lock_array()
>>>> - drm_gpuvm_prepare_range()
>>>>
>>>> Given that it seems reasonable to do all the required locking
>>>> internally.
>>>
>>>  From a design POW, there has been a clear direction in XE to make
>>> things similar to mmap() / munmap(), so this outer lock, which in Xe is
>>> an rwsem, is used in a similar way as the mmap_lock. It's protecting
>>> the page-table structures and vma rb tree, the userptr structures and
>>> the extobj list. Basically it's taken early in the exec IOCTL, the
>>> VM_BIND ioctl, the compute rebind worker and the pagefault handler, so
>>> all of the above are just asserting that it is taken in the correct
>>> mode.
>>>
>>> But strictly with this scheme one could also use the vm's dma_resv for
>>> the extobj list since with drm_exec, it's locked before traversing the
>>> list.
>>>
>>> The whole point of this scheme is to rely on locks that you already are
>>> supposed to be holding for various reasons and is simple to comprehend.
>>
>> I don't agree that we're supposed to hold the VM's resv lock anyways for
>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm fine 
>> using it
>> for that purpose nevertheless.
>>
>>>
>>>>
>>>> In order to at least place lockdep checks, the driver would need to
>>>> supply the
>>>> corresponding lock's lockdep_map, because the GPUVM otherwise doesn't
>>>> know about
>>>> the lock.
>>>
>>> Yes, that sounds reasonable. One lockdep map per list.
>>
>> I'd really like to avoid that, especially now that everything got 
>> simpler. We
>> should define the actual locks to take instead.
>>
>>>
>>>>
>>>> Out of curiosity, what is the overhead of a spin_lock() that doesn't
>>>> need to
>>>> spin?
>>>
>>> I guess it's hard to tell exactly, but it is much lower on modern x86
>>> than what it used to be. Not sure about ARM, which is the other
>>> architecture important to us. I figure if there is little cache-line
>>> bouncing the main overhead comes from the implied barriers.
>>>
>>>>
>>>>>
>>>>> A pretty simple way that would not add much code would be
>>>>>
>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>> spinlock_t
>>>>> *lock)
>>>>>
>>>>> {
>>>>>
>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>          spin_lock(lock);
>>>>>
>>>>> }
>>>>>
>>>>>>> For such drivers, that would require anybody calling unlink to
>>>>>>> hold the vm's
>>>>>>> resv, though.
>>>>>> In V4 I want to go back to having a dedicated lock for the GEMs
>>>>>> gpuva list (or
>>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>>> lock for that
>>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>>> otherwise wouldn't
>>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>>> was referring to
>>>>>> earlier.
>>>>>
>>>>> Yeah, I can see the need for a dedicated lock for the GEM's gpuva
>>>>> list, but
>>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>>> problem. We
>>>>> may free the object and a pointer to the vm's resv during unlink
>>>>> but we
>>>>> don't free the vm's resv.  It'd be a matter of ensuring that any
>>>>> calls to
>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>
>>>> Drivers calling unlink() from the fence signaling path can't use the
>>>> VM's
>>>> dma-resv lock.
>>>
>>> Yes, that made me a bit curious because in the current version the code
>>> required the object's dma_resv for unlink() which can't be grabbed
>>> either from the fence signaling path. So are there any drivers actually
>>> wanting to do that? If so, they will either need to resort to the
>>> current spinlock solution or they will need to call unlink from a
>>> workqueue item.
>>
>> As Boris already mentioned we have the dma-resv lock by default or a 
>> driver
>> specific GEM gpuva lock as opt-in. Now, we can get rid of the latter.
>>
>>>>
>>>> Also, what if the object is an external object? We can't use the VM's
>>>> dma-resv
>>>> lock here.
>>>
>>> Why? Typically (sync) unlink is only ever called from an unbind-like
>>> operation where it should be trivial to grab the vm's resv. Or, for
>>> that matter any outer lock protecting the extobj list. Rule would be
>>> the drm_gpuvm_bo::entry::extobj  and drm_gpuvm_bo::entry::evict would
>>> be protected by either the vm's dma_resv (or possibly an outer lock in
>>> the case of the extobj list).
>>
>> Outer lock wouldn't have been working for updates in the async path, but
>> shouldn't be relevant anymore. We could use the VM's resv for that.
>>
>>>
>>>>   And we can't have the GEM objs dma-resv lock held when calling
>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>>> refcount drops
>>>> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might
>>>> drop the
>>>> last reference of the GEM object.
>>>
>>> Yes, but this is a different problem as to what exactly protects
>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per bo list
>>> lock, or if we want to keep the bo's dma_resv we need to ensure that
>>> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
>>> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount (I know
>>> Boris didn't like that, but requiring an explicit refcount for a
>>> pointer you dereference unless you're under a lock that ensures keeping
>>> the object alive is pretty much required?) But anyway for the
>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal spinlock)
>>> I don't have a strong preference.
>>
>> We can keep the GEM objects dma-resv lock, however as mentioned above
>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both the VM's 
>> resv lock
>> and the GEM's resv lock in case they differ.
>>
>
>>>>>
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14 10:45                   ` Thomas Hellström
@ 2023-09-14 11:54                     ` Boris Brezillon
  2023-09-14 13:33                       ` Thomas Hellström
  0 siblings, 1 reply; 77+ messages in thread
From: Boris Brezillon @ 2023-09-14 11:54 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: Dave Airlie, Danilo Krummrich, daniel, matthew.brost,
	sarah.walker, donald.robson, christian.koenig, faith.ekstrand,
	dri-devel, nouveau, linux-kernel

On Thu, 14 Sep 2023 12:45:44 +0200
Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:

> On 9/14/23 10:20, Boris Brezillon wrote:
> > On Wed, 13 Sep 2023 15:22:56 +0200
> > Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >  
> >> On 9/13/23 13:33, Boris Brezillon wrote:  
> >>> On Wed, 13 Sep 2023 12:39:01 +0200
> >>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >>>     
> >>>> Hi,
> >>>>
> >>>> On 9/13/23 09:19, Boris Brezillon wrote:  
> >>>>> On Wed, 13 Sep 2023 17:05:42 +1000
> >>>>> Dave Airlie <airlied@gmail.com> wrote:
> >>>>>        
> >>>>>> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
> >>>>>> <boris.brezillon@collabora.com> wrote:  
> >>>>>>> On Tue, 12 Sep 2023 18:20:32 +0200
> >>>>>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >>>>>>>           
> >>>>>>>>> +/**
> >>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
> >>>>>>>>> + * @__gpuvm: The GPU VM
> >>>>>>>>> + * @__list_name: The name of the list we're iterating on
> >>>>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
> >>>>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> >>>>>>>>> + *
> >>>>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
> >>>>>>>>> + * iterator releases the lock immediately after picking the first element from
> >>>>>>>>> + * the list, so list insertion deletion can happen concurrently.  
> >>>>>>>> Are the list spinlocks needed for that async state update from within
> >>>>>>>> the dma-fence critical section we've discussed previously?  
> >>>>>>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
> >>>>>>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
> >>>>>>> get that Xe and Nouveau don't need that because they update the VM
> >>>>>>> state early (in the ioctl path), but I keep thinking this will hurt us
> >>>>>>> if we don't think it through from the beginning, because once you've
> >>>>>>> set this logic to depend only on resv locks, it will be pretty hard to
> >>>>>>> get back to a solution which lets synchronous VM_BINDs take precedence
> >>>>>>> on asynchronous request, and, with vkQueueBindSparse() passing external
> >>>>>>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
> >>>>>>> take a long time to get your synchronous VM_BIND executed...  
> >>>> So this would boil down to either (possibly opt-in) keeping the spinlock
> >>>> approach or pushing the unlink out to a wq then?  
> >>> Deferred _unlink() would not be an issue, since I already defer the
> >>> drm_gpuva destruction to a wq, it would just a be a matter of moving the
> >>> _unlink() call there as well. But _link() also takes the GEM gpuva list
> >>> lock, and that one is bit tricky, in that sm_map() can trigger 2 more
> >>> _link() calls for the prev/next mappings, which we can't guess until we
> >>> get to execute the VM update. If we mandate the use of the GEM resv
> >>> lock, that simply means async VM updates (AKA calling
> >>> drm_gpuvm_sm_[un]map()) are not an option. And if this is what everyone
> >>> agrees on, then I'd like the APIs that make this sort of async VM
> >>> update possible (drm_gpuvm_sm_[un]map(), the drm_gpuvm_ops::sm_step*
> >>> methods, and probably other things) to be dropped, so we don't make it
> >>> look like it's something we support.
> >>>     
> >>>> BTW, as also asked in a reply to Danilo, how do you call unlink from
> >>>> run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?  
> >>> _unlink() makes sure the GEM gpuva list lock is taken, but this can be
> >>> a custom lock (see drm_gem_gpuva_set_lock()). In panthor we have
> >>> panthor_gem_object::gpuva_list_lock that's dedicated the gpuva list
> >>> protection. We make sure we never take this lock while allocating
> >>> memory to guarantee the dma-signalling path can't deadlock.
> >>>     
> >>>>>>>           
> >>>>>> btw what is the use case for this? do we have actual vulkan
> >>>>>> applications we know will have problems here?  
> >>>>> I don't, but I think that's a concern Faith raised at some point (dates
> >>>>> back from when I was reading threads describing how VM_BIND on i915
> >>>>> should work, and I was clearly discovering this whole VM_BIND thing at
> >>>>> that time, so maybe I misunderstood).
> >>>>>        
> >>>>>> it feels like a bit of premature optimisation, but maybe we have use cases.  
> >>>>> Might be, but that's the sort of thing that would put us in a corner if
> >>>>> we don't have a plan for when the needs arise. Besides, if we don't
> >>>>> want to support that case because it's too complicated, I'd recommend
> >>>>> dropping all the drm_gpuvm APIs that let people think this mode is
> >>>>> valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
> >>>>> drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
> >>>>> confusion.  
> >>>> Xe allows bypassing the bind-queue with another bind-queue, but to
> >>>> completely avoid dependencies between queues the Operations may not
> >>>> overlap.  
> >>> So, you check the VM state with some VM lock held (would be the VM resv
> >>> in my case), and if the mapping is new (no overlaps with pre-existing
> >>> mappings), you queue it to the fast-track/sync-VM_BIND queue. What would
> >>> be missing I guess is a way to know if the mapping is active (MMU has
> >>> been updated) or pending (MMU update queued to the bind-queue), so I can
> >>> fast-track mapping/unmapping of active mappings.  
> > Ok, so I started modifying the implementation, and quickly realized the
> > overlap test can't be done without your xe_range_fence tree because of
> > unmaps. Since we call drm_gpuva_unmap() early/in the IOCTL path (IOW,
> > before the mapping teardown is effective), we lose track of this
> > yet-to-be-executed-unmap operation, and if we do our
> > va_range_overlaps_with_existing_mappings() test after such an unmap has
> > been queued using just the drm_gpuvm tree, we might get false even if
> > the mapping still exists and is expected to be torn down when the
> > VM_BIND(unmap) job is executed on the bind-queue. As a result, this
> > might execute the VM_BIND(map,sync) immediately (because the dependency
> > went undetected), and then the vm_bind_run_job() function kicks in and
> > undoes what the synchronous VM_BIND(map) did. Am I missing something?
> >
> > If I'm correct, that means I'm back to having synchronous VM_BIND ops
> > queued after all asynchronous ones unless I use something like your
> > xe_range_fence solution (which I was hoping I could postpone until we
> > decide to expose multiple bind queues).  
> 
> Yes, unfortunately fine-granular async range-tracking comes with a cost. 
> Still, if you are doing page-table updates solely with the CPU, you 
> could probably short-circuit the fence part of the fenced ranges?

I'm doing it with the CPU, but asynchronously (bind-queue), so I'm
facing pretty much the same problems, I think.

> 
> 
> >
> > I'm still a bit skeptical about this 'update VM mappings tree early,
> > defer MMU page table updates' approach, where the VM state and the
> > actual page table tree are temporarily out of sync until all operations
> > have been flushed on all queues targeting a VM. This means any test we
> > do on the gpuvm, like, 'give me the BO mapped at VA xxx', is subject to
> > 'is this the current state or the future state?' questioning. Note that
> > we can't even get the current VM state anymore, because all the
> > drm_gpuvm::tree stores with this solution is the future state, and
> > to-be-unmapped mappings are lost during the transitioning period (when
> > vm_bind jobs are queued but not executed yet).  
> 
> Understandable. But this is the way we historically have been doing 
> things, (I think the whole async atomic page-flipping is using the same 
> concept), but rather than refering to it as current state and future 
> state, I'd like to think it as Synchronous CPU state (What an API user 
> sees) vs GPU state (What the GPU sees where it's currently executing).

Actually, the latency incurred by the fact that page table updates are
done by the GPU is one thing, and I guess I could agree with you if that
was the only difference between the GPU and CPU view. But the fact that
VM_BIND jobs can have external dependencies makes things a lot more
confusing. I might be wrong, but I think atomic page-flip is simpler:
yes you can have implicit deps on your scanout buffer, and yes the HW
will wait for these fences to signal before updating the plane pointer,
but that's still just a simple pipeline with one resource to deal with.
A VM is a whole address range with virtual memory regions attached to
physical memory chunks, possibly with each range having its own
lifecycle, etc. It'd make more sense to me to have a way to know both
the current state and the future state.

Just one example: say you have a GPU job that triggers some fault
that's supposed to be handled by the kernel driver to unblock the
situation. In order to have some context, the kernel driver needs to
read a GPU buffer that's passed back as a virtual address by the GPU/FW,
so it calls drm_gpuvm_bo_find(), and now it might potentially get a BO
that's not the current BO mapped at this address, but the future BO
after some asynchronous VM_BIND(map) has been executed. And of course,
the VM_BIND job leading to this future state could have a dependency on
the GPU job, because this GPU job was using the old mapping. It might
sound completely hypothetical, but that's actually the sort of thing
the Mali FW does on a few occasions.
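
To make this concrete, a minimal sketch of the lookup I have in mind
(simplified; drm_gpuva_find_first() stands in for whatever VA lookup
helper the driver actually uses, locking and error handling omitted):

static struct drm_gem_object *
fault_handler_find_bo(struct drm_gpuvm *gpuvm, u64 fault_addr)
{
	struct drm_gpuva *va;

	/*
	 * Returns whatever the gpuvm tree currently stores for this VA,
	 * which with the 'update the tree early' model is the *future*
	 * mapping, not necessarily the one the faulting job was using.
	 */
	va = drm_gpuva_find_first(gpuvm, fault_addr, 1);
	if (!va)
		return NULL;

	return va->gem.obj;
}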

So yeah, I'm still not convinced we can always get away with just the
future representation of the VM. Sometimes you have to know what's
mapped at the moment.

> To bring them in sync you need to wait for fences.

Wouldn't solve the case I mentioned above, AFAICT.

> And ideally the async 
> work should never fail.

Sure, I took that for granted. If async VM_BIND fails, we just flag
the VM as unusable and cancel any GPU job submission happening on the
VM. The user then has to recreate the VM to get a fresh start
(DEVICE_LOST situation).

It's a bit tricky when we want to clean things up after a failure,
because we might have lost track of some of the mappings (early
gpuva_unmap(), but the MMU page tables are still lying around). In our
case (Panthor) that's not really an issue though, because
free_io_pgtable_ops() will take care of that for us.
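
Roughly, the 'unusable VM' handling I'm thinking of looks like the
sketch below (all names are made up for illustration, this is not
actual Panthor code):

struct my_vm {
	struct drm_gpuvm base;
	bool unusable;	/* set when an async VM_BIND has failed */
};

/* Called from the bind-queue error path when an async VM_BIND fails. */
static void my_vm_set_unusable(struct my_vm *vm)
{
	vm->unusable = true;
}

/* Checked in the GPU job submission path. */
static int my_vm_check_usable(struct my_vm *vm)
{
	if (vm->unusable)
		return -EINVAL;	/* userspace must recreate the VM (DEVICE_LOST) */

	return 0;
}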

> 
> If one wants to push async work out to be handled solely by the GPU, 
> this is the way things must be done since the GPU can't take locks or 
> allocate memory, but as part or all of async work is sometimes done 
> using the CPU, it might make sense to challenge that to some extent. 

I think updating the VM state in the run_job() with drm_gpuva_[un]map()
would still account for the GPU-is-executing-pgtable-updates latency,
and that's not really the sort of desynchronization I'm worried about,
because when you get to submit your VM_BIND job, you know all the job
deps are met, and the VM update is about to happen. What I'm worried
about is the desynchronization incurred by complex VM_BIND job deps
that make it hard to know what's the diff between the drm_gpuvm state
(predicting the future) and the VM state a GPU job expects (the
present).

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14 11:54                     ` Boris Brezillon
@ 2023-09-14 13:33                       ` Thomas Hellström
  2023-09-14 15:37                         ` Boris Brezillon
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-14 13:33 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Dave Airlie, Danilo Krummrich, daniel, matthew.brost,
	sarah.walker, donald.robson, christian.koenig, faith.ekstrand,
	dri-devel, nouveau, linux-kernel

Hi,

On 9/14/23 13:54, Boris Brezillon wrote:
> On Thu, 14 Sep 2023 12:45:44 +0200
> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>
>> On 9/14/23 10:20, Boris Brezillon wrote:
>>> On Wed, 13 Sep 2023 15:22:56 +0200
>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>>>   
>>>> On 9/13/23 13:33, Boris Brezillon wrote:
>>>>> On Wed, 13 Sep 2023 12:39:01 +0200
>>>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>>>>>      
>>>>>> Hi,
>>>>>>
>>>>>> On 9/13/23 09:19, Boris Brezillon wrote:
>>>>>>> On Wed, 13 Sep 2023 17:05:42 +1000
>>>>>>> Dave Airlie <airlied@gmail.com> wrote:
>>>>>>>         
>>>>>>>> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
>>>>>>>> <boris.brezillon@collabora.com> wrote:
>>>>>>>>> On Tue, 12 Sep 2023 18:20:32 +0200
>>>>>>>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
>>>>>>>>>            
>>>>>>>>>>> +/**
>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>>>> + *
>>>>>>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
>>>>>>>>>>> + * iterator releases the lock immediately after picking the first element from
>>>>>>>>>>> + * the list, so list insertion deletion can happen concurrently.
>>>>>>>>>> Are the list spinlocks needed for that async state update from within
>>>>>>>>>> the dma-fence critical section we've discussed previously?
>>>>>>>>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
>>>>>>>>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
>>>>>>>>> get that Xe and Nouveau don't need that because they update the VM
>>>>>>>>> state early (in the ioctl path), but I keep thinking this will hurt us
>>>>>>>>> if we don't think it through from the beginning, because once you've
>>>>>>>>> set this logic to depend only on resv locks, it will be pretty hard to
>>>>>>>>> get back to a solution which lets synchronous VM_BINDs take precedence
>>>>>>>>> on asynchronous request, and, with vkQueueBindSparse() passing external
>>>>>>>>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
>>>>>>>>> take a long time to get your synchronous VM_BIND executed...
>>>>>> So this would boil down to either (possibly opt-in) keeping the spinlock
>>>>>> approach or pushing the unlink out to a wq then?
>>>>> Deferred _unlink() would not be an issue, since I already defer the
>>>>> drm_gpuva destruction to a wq, it would just a be a matter of moving the
>>>>> _unlink() call there as well. But _link() also takes the GEM gpuva list
>>>>> lock, and that one is bit tricky, in that sm_map() can trigger 2 more
>>>>> _link() calls for the prev/next mappings, which we can't guess until we
>>>>> get to execute the VM update. If we mandate the use of the GEM resv
>>>>> lock, that simply means async VM updates (AKA calling
>>>>> drm_gpuvm_sm_[un]map()) are not an option. And if this is what everyone
>>>>> agrees on, then I'd like the APIs that make this sort of async VM
>>>>> update possible (drm_gpuvm_sm_[un]map(), the drm_gpuvm_ops::sm_step*
>>>>> methods, and probably other things) to be dropped, so we don't make it
>>>>> look like it's something we support.
>>>>>      
>>>>>> BTW, as also asked in a reply to Danilo, how do you call unlink from
>>>>>> run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?
>>>>> _unlink() makes sure the GEM gpuva list lock is taken, but this can be
>>>>> a custom lock (see drm_gem_gpuva_set_lock()). In panthor we have
>>>>> panthor_gem_object::gpuva_list_lock that's dedicated the gpuva list
>>>>> protection. We make sure we never take this lock while allocating
>>>>> memory to guarantee the dma-signalling path can't deadlock.
>>>>>      
>>>>>>>>>            
>>>>>>>> btw what is the use case for this? do we have actual vulkan
>>>>>>>> applications we know will have problems here?
>>>>>>> I don't, but I think that's a concern Faith raised at some point (dates
>>>>>>> back from when I was reading threads describing how VM_BIND on i915
>>>>>>> should work, and I was clearly discovering this whole VM_BIND thing at
>>>>>>> that time, so maybe I misunderstood).
>>>>>>>         
>>>>>>>> it feels like a bit of premature optimisation, but maybe we have use cases.
>>>>>>> Might be, but that's the sort of thing that would put us in a corner if
>>>>>>> we don't have a plan for when the needs arise. Besides, if we don't
>>>>>>> want to support that case because it's too complicated, I'd recommend
>>>>>>> dropping all the drm_gpuvm APIs that let people think this mode is
>>>>>>> valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
>>>>>>> drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
>>>>>>> confusion.
>>>>>> Xe allows bypassing the bind-queue with another bind-queue, but to
>>>>>> completely avoid dependencies between queues the Operations may not
>>>>>> overlap.
>>>>> So, you check the VM state with some VM lock held (would be the VM resv
>>>>> in my case), and if the mapping is new (no overlaps with pre-existing
>>>>> mappings), you queue it to the fast-track/sync-VM_BIND queue. What would
>>>>> be missing I guess is a way to know if the mapping is active (MMU has
>>>>> been updated) or pending (MMU update queued to the bind-queue), so I can
>>>>> fast-track mapping/unmapping of active mappings.
>>> Ok, so I started modifying the implementation, and quickly realized the
>>> overlap test can't be done without your xe_range_fence tree because of
>>> unmaps. Since we call drm_gpuva_unmap() early/in the IOCTL path (IOW,
>>> before the mapping teardown is effective), we lose track of this
>>> yet-to-be-executed-unmap operation, and if we do our
>>> va_range_overlaps_with_existing_mappings() test after such an unmap has
>>> been queued using just the drm_gpuvm tree, we might get false even if
>>> the mapping still exists and is expected to be torn down when the
>>> VM_BIND(unmap) job is executed on the bind-queue. As a result, this
>>> might execute the VM_BIND(map,sync) immediately (because the dependency
>>> went undetected), and then the vm_bind_run_job() function kicks in and
>>> undoes what the synchronous VM_BIND(map) did. Am I missing something?
>>>
>>> If I'm correct, that means I'm back to having synchronous VM_BIND ops
>>> queued after all asynchronous ones unless I use something like your
>>> xe_range_fence solution (which I was hoping I could postpone until we
>>> decide to expose multiple bind queues).
>> Yes, unfortunately fine-granular async range-tracking comes with a cost.
>> Still, if you are doing page-table updates solely with the CPU, you
>> could probably short-circuit the fence part of the fenced ranges?
> I'm doing it with the CPU, but asynchronously (bind-queue), so I'm
> facing pretty much the same problems, I think.
>
>>
>>> I'm still a bit skeptical about this 'update VM mappings tree early,
>>> defer MMU page table updates' approach, where the VM state and the
>>> actual page table tree are temporarily out of sync until all operations
>>> have been flushed on all queues targeting a VM. This means any test we
>>> do on the gpuvm, like, 'give me the BO mapped at VA xxx', is subject to
>>> 'is this the current state or the future state?' questioning. Note that
>>> we can't even get the current VM state anymore, because all the
>>> drm_gpuvm::tree stores with this solution is the future state, and
>>> to-be-unmapped mappings are lost during the transitioning period (when
>>> vm_bind jobs are queued but not executed yet).
>> Understandable. But this is the way we historically have been doing
>> things, (I think the whole async atomic page-flipping is using the same
>> concept), but rather than refering to it as current state and future
>> state, I'd like to think it as Synchronous CPU state (What an API user
>> sees) vs GPU state (What the GPU sees where it's currently executing).
> Actually, the latency incurred by the fact the page table updates are
> done by the GPU is one thing, and I guess I could agree with you if that
> was the only difference between the GPU and CPU view. But the fact
> VM_BIND jobs can have external dependencies makes things a lot more
> confusing. I might be wrong, but I think atomic page-flip is simpler.
> Yes you can have implicit deps on your scanout buffer, and yes the HW
> will wait for these fences to signal before updating the plane pointer,
> but that's still just a simple pipeline with one resource to deal with.
> A VM is a whole range with virtual memory regions being attached
> physical mem chunks, possibly with each range having its own lifecycle,
> etc. It'd make more sense to me to have a way to know the current
> state, and the future state.

Yeah, so in Xe we support async bind jobs solely to be able to do deep 
pipelining, and it's not only the pagetable jobs: you could have multiple 
bind-evict-restore-exec-unbind-bind-evict-restore-exec sequences all 
pipelined, with only the available memory resources setting the limit. 
In fact you can even have physical VRAM assigned to a bo which won't be 
used until exec #5 in the pipeline and released in exec #4, since TTM is 
aware of async memory management.

So something needs to absorb the state discrepancy between what you 
refer to as the current state and the future state. The question is what 
should absorb it? Should it be the gpuvm or some associated driver state 
tracking?

Now let's say that you have a deferred bind state-update pending and 
track the *current* state in the gpuvm, so that a number of vma unmaps 
and maps aren't yet visible to gpuvm, and then you submit an exec ioctl. 
How does the exec ioctl know the gpuvm state: which external bos to 
validate, which bos have become evicted, which userptr vmas have been 
invalidated? Does the exec need to block waiting for the bind fence to 
complete so that it can assess the VM state that UMD intended to be there?

>
> Just one example, say you have a GPU job that triggers some fault
> that's supposed to be handled by the kernel driver to unblock the
> situation. In order to have some context, the kernel driver needs to
> read a GPU buffer that's passed back as a virtual address by the GPU/FW,
> so it calls drm_gpuvm_bo_find(), and now it might potentially get a BO
> that's not the current BO being mapped at this address, but the future
> BO after some asynchronous VM_BIND(map) has been executed, and of
> course, the VM_BIND job leading to this future state, could have a
> dependency on the GPU job, because this GPU job was using the old
> mapping. It might sound completely hypothetical, but that's actually
> the sort of things the Mali FW does in a few occasions.

Recoverable faults typically require some sort of memory operation 
that needs the dma_resv or an outer lock, like validation or 
get_user_pages(), and can thus not be performed in the fence signalling 
critical path; on Xe they are reserved for Long-Running VMs. On 
those, pipelining is not really needed and is disallowed in Xe to avoid 
having to deal with the state discrepancy.

But as to the actual problem you mention: let's say it's a fault that 
triggers a need to dump bo contents. Then yes, in order to be able to do 
deep pipelining in this way the driver needs to track some state 
discrepancy, and that's an additional overhead.

>
> So yeah, I'm still not convinced we can always get away with just the
> future representation of the VM. Sometimes you have to know what's
> mapped at the moment.
>
>> To bring them in sync you need to wait for fences.
> Wouldn't solve the case I mentioned above, AFAICT.
>
>> And ideally the async
>> work should never fail.
> Sure, that I considered for granted. If async VM_BIND fails, we just
> flag the VM as unusable, and cancel any GPU job submission happening on
> the VM. The user then has to recreate the VM to take a fresh start
> (DEVICE_LOST situation).
>
> It a bit tricky when we want to clean things up after a failure,
> because we might have lost track of some of mappings (early
> gpuva_unmap(), but the MMU page tables are still lying around). In our
> case (Panthor) that's not really an issue though, because
> free_io_pgtable_ops() will take care of that for us.
>
>> If one wants to push async work out to be handled solely by the GPU,
>> this is the way things must be done since the GPU can't take locks or
>> allocate memory, but as part or all of async work is sometimes done
>> using the CPU, it might make sense to challenge that to some extent.
> I think updating the VM state in the run_job() with drm_gpuva_[un]map()
> would still account for the GPU-is-executing-pgtable-updates latency,
> and that's not really the sort of desynchronization I'm worried about,
> because when you get to submit your VM_BIND job, you know all the job
> deps are met, and the VM update is about to happen. What I'm worried
> about is the desynchronization incurred by complex VM_BIND job deps
> that make it hard to know what's the diff between the drm_gpuvm state
> (predicting the future) and the VM state a GPU job expects (the
> present).

Yes, that sort of deep pipelining requires additional "current" state 
tracking for some situations, but waiting in exec for the current state 
to catch up with the future state, which it seems is a consequence of 
async state updates, isn't really an option for us.

Now if you think the decision to remove those spinlocks from drm_gpuvm 
was premature, I'm fully OK to have them in there again, but opt-in so 
that we have helpers that fit all purposes.

/Thomas




^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-09 15:31 ` [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation Danilo Krummrich
                     ` (4 preceding siblings ...)
  2023-09-12 16:20   ` Thomas Hellström
@ 2023-09-14 13:48   ` Thomas Hellström
  2023-09-14 16:36     ` Danilo Krummrich
  5 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-14 13:48 UTC (permalink / raw)
  To: Danilo Krummrich, airlied, daniel, matthew.brost, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel

Hi, Danilo

Some additional minor comments as xe conversion progresses.

On 9/9/23 17:31, Danilo Krummrich wrote:
> So far the DRM GPUVA manager offers common infrastructure to track GPU VA
> allocations and mappings, generically connect GPU VA mappings to their
> backing buffers and perform more complex mapping operations on the GPU VA
> space.
>
> However, there are more design patterns commonly used by drivers, which
> can potentially be generalized in order to make the DRM GPUVA manager
> represent a basic GPU-VM implementation. In this context, this patch aims
> at generalizing the following elements.
>
> 1) Provide a common dma-resv for GEM objects not being used outside of
>     this GPU-VM.
>
> 2) Provide tracking of external GEM objects (GEM objects which are
>     shared with other GPU-VMs).
>
> 3) Provide functions to efficiently lock all GEM objects dma-resv the
>     GPU-VM contains mappings of.
>
> 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
>     of, such that validation of evicted GEM objects is accelerated.
>
> 5) Provide some convinience functions for common patterns.
>
> Rather than being designed as a "framework", the target is to make all
> features appear as a collection of optional helper functions, such that
> drivers are free to make use of the DRM GPUVA managers basic
> functionality and opt-in for other features without setting any feature
> flags, just by making use of the corresponding functions.
>
> Big kudos to Boris Brezillon for his help to figure out locking for drivers
> updating the GPU VA space within the fence signalling path.
>
> Suggested-by: Matthew Brost <matthew.brost@intel.com>
> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> ---
>
> +/**
> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to / from a
> + * &drm_gpuvms evicted list
> + * @obj: the &drm_gem_object to add or remove
> + * @evict: indicates whether the object is evicted
> + *
> + * Adds a &drm_gem_object to or removes it from all &drm_gpuvms evicted
> + * list containing a mapping of this &drm_gem_object.
> + */
> +void
> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
> +{
> +	struct drm_gpuvm_bo *vm_bo;
> +
> +	drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
> +		if (evict)
> +			drm_gpuvm_bo_list_add(vm_bo, evict);
> +		else
> +			drm_gpuvm_bo_list_del(vm_bo, evict);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
> +

We need a drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, ...) that puts 
a single gpuvm_bo on the list; the above function could perhaps be 
renamed drm_gpuvm_gem_obj_evict(obj, ...).

The reason is that some VMs are faulting VMs which don't have an evict 
list, but validate from the pagefault handler. Also, evict == false is 
dangerous because, if called from within an exec, it might remove the 
obj from other VMs' evict lists before they've had a chance to rebind 
their VMAs.
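
A rough sketch of the split I have in mind (signatures and exact
semantics are just a suggestion, not a final interface):

/* Put a single vm_bo on / remove it from its gpuvm's evicted list. */
void drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, bool evict)
{
	if (evict)
		drm_gpuvm_bo_list_add(vm_bo, evict);
	else
		drm_gpuvm_bo_list_del(vm_bo, evict);
}

/* Convenience wrapper iterating all VMs that map the object. */
void drm_gpuvm_gem_obj_evict(struct drm_gem_object *obj, bool evict)
{
	struct drm_gpuvm_bo *vm_bo;

	drm_gem_for_each_gpuvm_bo(vm_bo, obj)
		drm_gpuvm_bo_evict(vm_bo, evict);
}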

>   static int
>   __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>   		   struct drm_gpuva *va)
> diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
> index afa50b9059a2..834bb6d6617e 100644
> --- a/include/drm/drm_gpuvm.h
> +++ b/include/drm/drm_gpuvm.h
> @@ -26,10 +26,12 @@
>    */
>   
>   #include <linux/list.h>
> +#include <linux/dma-resv.h>
>   #include <linux/rbtree.h>
>   #include <linux/types.h>
>   
>   #include <drm/drm_gem.h>
> +#include <drm/drm_exec.h>
>   
>   struct drm_gpuvm;
>   struct drm_gpuvm_bo;
> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>   	 * space
>   	 */
>   	struct dma_resv *resv;
> +
> +	/**
> +	 * @extobj: structure holding the extobj list
> +	 */
> +	struct {
> +		/**
> +		 * @list: &list_head storing &drm_gpuvm_bos serving as
> +		 * external object
> +		 */
> +		struct list_head list;
> +
> +		/**
> +		 * @lock: spinlock to protect the extobj list
> +		 */
> +		spinlock_t lock;
> +	} extobj;
> +
> +	/**
> +	 * @evict: structure holding the evict list and evict list lock
> +	 */
> +	struct {
> +		/**
> +		 * @list: &list_head storing &drm_gpuvm_bos currently being
> +		 * evicted
> +		 */
> +		struct list_head list;
> +
> +		/**
> +		 * @lock: spinlock to protect the evict list
> +		 */
> +		spinlock_t lock;
> +	} evict;
>   };
>   
>   void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>   		    const struct drm_gpuvm_ops *ops);
>   void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>   
> +/**
> + * drm_gpuvm_is_extobj() - indicates whether the given &drm_gem_object is an
> + * external object
> + * @gpuvm: the &drm_gpuvm to check
> + * @obj: the &drm_gem_object to check
> + *
> + * Returns: true if the &drm_gem_object &dma_resv differs from the
> + * &drm_gpuvms &dma_resv, false otherwise
> + */
> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
> +				       struct drm_gem_object *obj)
> +{
> +	return obj && obj->resv != gpuvm->resv;
> +}
> +
>   static inline struct drm_gpuva *
>   __drm_gpuva_next(struct drm_gpuva *va)
>   {
> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>   #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
>   	list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
>   
> +/**
> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of &drm_exec
> + *
> + * This structure should be created on the stack as &drm_exec should be.
> + *
> + * Optionally, @extra can be set in order to lock additional &drm_gem_objects.
> + */
> +struct drm_gpuvm_exec {
> +	/**
> +	 * @exec: the &drm_exec structure
> +	 */
> +	struct drm_exec exec;
> +
> +	/**
> +	 * @vm: the &drm_gpuvm to lock its DMA reservations
> +	 */
> +	struct drm_gpuvm *vm;
> +
> +	/**
> +	 * @extra: Callback and corresponding private data for the driver to
> +	 * lock arbitrary additional &drm_gem_objects.
> +	 */
> +	struct {
> +		/**
> +		 * @fn: The driver callback to lock additional &drm_gem_objects.
> +		 */
> +		int (*fn)(struct drm_gpuvm_exec *vm_exec,
> +			  unsigned int num_fences);
> +
> +		/**
> +		 * @priv: driver private data for the @fn callback
> +		 */
> +		void *priv;
> +	} extra;
> +};
> +
> +/**
> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
> + * @gpuvm: the &drm_gpuvm
> + * @exec: the &drm_exec context
> + * @num_fences: the amount of &dma_fences to reserve
> + *
> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object.
> + *
> + * Using this function directly, it is the drivers responsibility to call
> + * drm_exec_init() and drm_exec_fini() accordingly.
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +static inline int
> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
> +		     struct drm_exec *exec,
> +		     unsigned int num_fences)
> +{
> +	return drm_exec_prepare_obj(exec, &gpuvm->d_obj, num_fences);
> +}
> +
> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> +			      struct drm_exec *exec,
> +			      unsigned int num_fences);
> +
> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
> +			    struct drm_exec *exec,
> +			    u64 addr, u64 range,
> +			    unsigned int num_fences);
> +
> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> +			unsigned int num_fences,
> +			bool interruptible);
> +
> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
> +			      struct drm_gem_object **objs,
> +			      unsigned int num_objs,
> +			      unsigned int num_fences,
> +			      bool interruptible);
> +
> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
> +			      u64 addr, u64 range,
> +			      unsigned int num_fences,
> +			      bool interruptible);
> +
> +/**
> + * drm_gpuvm_lock() - lock all dma-resv of all assoiciated BOs
> + * @gpuvm: the &drm_gpuvm
> + *
> + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
> + * through drm_gpuvm_lock() or its variants.
> + *
> + * Returns: 0 on success, negative error code on failure.
> + */
> +static inline void
> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
> +{
> +	drm_exec_fini(&vm_exec->exec);
> +}
> +
> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> +			      struct drm_exec *exec,
> +			      struct dma_fence *fence,
> +			      enum dma_resv_usage private_usage,
> +			      enum dma_resv_usage extobj_usage);
> +
> +/**
> + * drm_gpuvm_exec_resv_add_fence()
> + * @vm_exec: the &drm_gpuvm_exec abstraction
> + * @fence: fence to add
> + * @private_usage: private dma-resv usage
> + * @extobj_usage: extobj dma-resv usage
> + *
> + * See drm_gpuvm_resv_add_fence().
> + */
> +static inline void
> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
> +			      struct dma_fence *fence,
> +			      enum dma_resv_usage private_usage,
> +			      enum dma_resv_usage extobj_usage)
> +{
> +	drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
> +				 private_usage, extobj_usage);
> +}
> +
>   /**
>    * struct drm_gpuvm_bo - structure representing a &drm_gpuvm and
>    * &drm_gem_object combination
> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>   			 * gpuva list.
>   			 */
>   			struct list_head gem;
> +
> +			/**
> +			 * @evict: List entry to attach to the &drm_gpuvms
> +			 * extobj list.
> +			 */
> +			struct list_head extobj;
> +
> +			/**
> +			 * @evict: List entry to attach to the &drm_gpuvms evict
> +			 * list.
> +			 */
> +			struct list_head evict;
>   		} entry;
>   	} list;
>   };
> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>   drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>   		  struct drm_gem_object *obj);
>   
> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
> +
>   /**
>    * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of &drm_gpuva
>    * @va__: &drm_gpuva structure to assign to in each iteration step
> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>   	 * used.
>   	 */
>   	int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
> +
> +	/**
> +	 * @bo_validate: called from drm_gpuvm_validate()
> +	 *
> +	 * Drivers receive this callback for every evicted &drm_gem_object being
> +	 * mapped in the corresponding &drm_gpuvm.
> +	 *
> +	 * Typically, drivers would call their driver specific variant of
> +	 * ttm_bo_validate() from within this callback.
> +	 */
> +	int (*bo_validate)(struct drm_gem_object *obj);

Same here. Could we have a vm_bo as an argument instead, so that the 
callback knows what gpuvm we're targeting and can mark all its gpu_vas 
for revalidation? Or is that intended to be done elsewhere?
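
Something along these lines is what I'm picturing (just a sketch; the
my_* names are made up, and I'm assuming the vm_bo gives access to both
the vm and the obj):

	/* Suggested prototype change: */
	int (*bo_validate)(struct drm_gpuvm_bo *vm_bo);

A driver implementation could then look roughly like:

static int my_bo_validate(struct drm_gpuvm_bo *vm_bo)
{
	struct my_bo *bo = to_my_bo(vm_bo->obj);

	/*
	 * vm_bo tells us which gpuvm is being validated, so the driver
	 * can mark all of vm_bo's gpuvas for revalidation here before
	 * validating the underlying buffer.
	 */
	return my_bo_validate_placement(bo);
}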

>   };
>   
>   int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,

Thanks,

Thomas



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [Nouveau] [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14 11:32                 ` Thomas Hellström
@ 2023-09-14 15:27                   ` Danilo Krummrich
  2023-09-14 17:13                     ` Thomas Hellström
  0 siblings, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-14 15:27 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: matthew.brost, sarah.walker, nouveau, linux-kernel, dri-devel,
	boris.brezillon, donald.robson, daniel, christian.koenig,
	faith.ekstrand

On 9/14/23 13:32, Thomas Hellström wrote:
> 
> On 9/14/23 12:57, Danilo Krummrich wrote:
>> On 9/13/23 14:16, Danilo Krummrich wrote:
>>
>> <snip>
>>
>>>>> And validate() can remove it while still holding all dma-resv locks,
>>>>> neat!
>>>>> However, what if two tasks are trying to lock the VA space
>>>>> concurrently? What
>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>> drm_gpuva_unlink()?
>>>>> Are we guaranteed that at this point of time the drm_gpuvm_bo is not
>>>>> on the
>>>>> evicted list? Because otherwise we would call drm_gpuvm_bo_destroy()
>>>>> with the
>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>> drm_gpuvm_bo_destroy()
>>>>> might drop the last reference to the drm_gem_object and hence we'd
>>>>> potentially
>>>>> free the dma-resv lock while holding it, at least if it's an external
>>>>> object.
>>>>
>>>> Easiest way in this scheme is to think of the lists as being protected
>>>> by the vm's resv lock. That means anybody calling unlink() must also
>>>> hold the vm's resv lock. (Which is OK from an UAF point of view, but
>>>> perhaps not from a locking inversion POW from an async list update).
>>>
>>> This would mean that on unlink() we'd need to hold the VM's resv lock and the
>>> corresponding GEM's resv lock (in case they're not the same anyways) because the
>>> VM's resv lock would protect the external / evicted object lists and the GEM
>>> objects resv lock protects the GEM's list of drm_gpuvm_bos and the
>>> drm_gpuvm_bo's list of drm_gpuvas.
>>
>> As mentioned below the same applies for drm_gpuvm_bo_put() since it might
>> destroy the vm_bo, which includes removing the vm_bo from external / evicted
>> object lists and the GEMs list of vm_bos.
>>
>> As mentioned, if the GEM's dma-resv is different from the VM's dma-resv we need
>> to take both locks. Ultimately, this would mean we need a drm_exec loop, because
>> we can't know the order in which to take these locks. Doing a full drm_exec loop
>> just to put() a vm_bo doesn't sound reasonable to me.
>>
>> Can we instead just have an internal mutex for locking the lists such that we
>> avoid taking and dropping the spinlocks, which we use currently, in a loop?
> 
> You'd have the same locking inversion problem with a mutex, right? Since in the eviction path you have resv->mutex, from exec you have resv->mutex->resv because validate would attempt to grab resv.

Both lists, evict and extobj, would need to have a separate mutex, not a common one.
We'd also need a dedicated GEM gpuva lock. Then the only rule would be that you can't
hold the dma-resv lock when calling put(), which I admit is not that nice.

With the current spinlock solution drivers wouldn't need to worry about anything
locking-related though. So maybe I'll come back to your proposal of having a switch for
external locking with dma-resv locks entirely, such that with external dma-resv locking
I skip all the spinlocks and add lockdep checks instead.

I think that makes the most sense in terms of taking advantage of external dma-resv locking
where possible, while having a self-contained solution if not. This should get all concerns
out of the way - yours, Christian's and Boris'.
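
To illustrate, a minimal sketch of what such a switch could look like
(the resv_protected flag and the helper names are made up for
illustration):

static void drm_gpuvm_extobj_lock(struct drm_gpuvm *gpuvm)
{
	if (gpuvm->resv_protected)
		dma_resv_assert_held(gpuvm->resv); /* external locking, lockdep only */
	else
		spin_lock(&gpuvm->extobj.lock);    /* internal locking */
}

static void drm_gpuvm_extobj_unlock(struct drm_gpuvm *gpuvm)
{
	if (!gpuvm->resv_protected)
		spin_unlock(&gpuvm->extobj.lock);
}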

> 
> That said, xe currently indeed does the vm+bo exec dance on vma put.
> 
> One reason why that seemingly horrible construct is good, is that when evicting an extobj and you need to access individual vmas to Zap page table entries or TLB flush, those VMAs are not allowed to go away (we're not refcounting them). Holding the bo resv on gpuva put prevents that from happening. Possibly one could use another mutex to protect the gem->vm_bo list to achieve the same, but we'd need to hold it on gpuva put.
> 
> /Thomas
> 
> 
>>
>> - Danilo
>>
>>>
>>>>
>>>>>
>>>>>>>
>>>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>>>> really would not
>>>>>>> like to add even more complexity just to get the spinlock out of
>>>>>>> the way in case
>>>>>>> the driver already has an outer lock protecting this path.
>>>>>>
>>>>>> I must disagree here. These spinlocks and atomic operations are
>>>>>> pretty
>>>>>> costly and as discussed earlier this type of locking was the reason
>>>>>> (at
>>>>>> least according to the commit message) that made Christian drop the
>>>>>> XArray
>>>>>> use in drm_exec for the same set of objects: "The locking overhead
>>>>>> is
>>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>>> complexity and a
>>>>>> single wide lock following the drm locking guidelines set out by
>>>>>> Daniel and
>>>>>> David should really be the default choice with an opt-in for a
>>>>>> spinlock if
>>>>>> needed for async and pushing out to a wq is not an option.
>>>>>
>>>>> For the external object list an outer lock would work as long as it's
>>>>> not the
>>>>> dma-resv lock of the corresponding GEM object, since here we actually
>>>>> need to
>>>>> remove the list entry from the external object list on
>>>>> drm_gpuvm_bo_destroy().
>>>>> It's just a bit weird design wise that drivers would need to take
>>>>> this outer
>>>>> lock on:
>>>>>
>>>>> - drm_gpuvm_bo_extobj_add()
>>>>> - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>> drm_gpuvm_bo_put())
>>>>> - drm_gpuvm_exec_lock()
>>>>> - drm_gpuvm_exec_lock_array()
>>>>> - drm_gpuvm_prepare_range()
>>>>>
>>>>> Given that it seems reasonable to do all the required locking
>>>>> internally.
>>>>
>>>>  From a design POW, there has been a clear direction in XE to make
>>>> things similar to mmap() / munmap(), so this outer lock, which in Xe is
>>>> an rwsem, is used in a similar way as the mmap_lock. It's protecting
>>>> the page-table structures and vma rb tree, the userptr structures and
>>>> the extobj list. Basically it's taken early in the exec IOCTL, the
>>>> VM_BIND ioctl, the compute rebind worker and the pagefault handler, so
>>>> all of the above are just asserting that it is taken in the correct
>>>> mode.
>>>>
>>>> But strictly with this scheme one could also use the vm's dma_resv for
>>>> the extobj list since with drm_exec, it's locked before traversing the
>>>> list.
>>>>
>>>> The whole point of this scheme is to rely on locks that you already are
>>>> supposed to be holding for various reasons and is simple to comprehend.
>>>
>>> I don't agree that we're supposed to hold the VM's resv lock anyways for
>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm fine using it
>>> for that purpose nevertheless.
>>>
>>>>
>>>>>
>>>>> In order to at least place lockdep checks, the driver would need to
>>>>> supply the
>>>>> corresponding lock's lockdep_map, because the GPUVM otherwise doesn't
>>>>> know about
>>>>> the lock.
>>>>
>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>
>>> I'd really like to avoid that, especially now that everything got simpler. We
>>> should define the actual locks to take instead.
>>>
>>>>
>>>>>
>>>>> Out of curiosity, what is the overhead of a spin_lock() that doesn't
>>>>> need to
>>>>> spin?
>>>>
>>>> I guess it's hard to tell exactly, but it is much lower on modern x86
>>>> than what it used to be. Not sure about ARM, which is the other
>>>> architecture important to us. I figure if there is little cache-line
>>>> bouncing the main overhead comes from the implied barriers.
>>>>
>>>>>
>>>>>>
>>>>>> A pretty simple way that would not add much code would be
>>>>>>
>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>> spinlock_t
>>>>>> *lock)
>>>>>>
>>>>>> {
>>>>>>
>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>          spin_lock(lock);
>>>>>>
>>>>>> }
>>>>>>
>>>>>>>> For such drivers, that would require anybody calling unlink to
>>>>>>>> hold the vm's
>>>>>>>> resv, though.
>>>>>>> In V4 I want to go back to having a dedicated lock for the GEMs
>>>>>>> gpuva list (or
>>>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>>>> lock for that
>>>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>>>> otherwise wouldn't
>>>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>>>> was referring to
>>>>>>> earlier.
>>>>>>
>>>>>> Yeah, I can see the need for a dedicated lock for the GEM's gpuva
>>>>>> list, but
>>>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>>>> problem. We
>>>>>> may free the object and a pointer to the vm's resv during unlink
>>>>>> but we
>>>>>> don't free the vm's resv.  It'd be a matter of ensuring that any
>>>>>> calls to
>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>
>>>>> Drivers calling unlink() from the fence signaling path can't use the
>>>>> VM's
>>>>> dma-resv lock.
>>>>
>>>> Yes, that made me a bit curious because in the current version the code
>>>> required the object's dma_resv for unlink() which can't be grabbed
>>>> either from the fence signaling path. So are there any drivers actually
>>>> wanting to do that? If so, they will either need to resort to the
>>>> current spinlock solution or they will need to call unlink from a
>>>> workqueue item.
>>>
>>> As Boris already mentioned we have the dma-resv lock by default or a driver
>>> specific GEM gpuva lock as opt-in. Now, we can get rid of the latter.
>>>
>>>>>
>>>>> Also, what if the object is an external object? We can't use the VM's
>>>>> dma-resv
>>>>> lock here.
>>>>
>>>> Why? Typically (sync) unlink is only ever called from an unbind-like
>>>> operation where it should be trivial to grab the vm's resv. Or, for
>>>> that matter any outer lock protecting the extobj list. Rule would be
>>>> the drm_gpuvm_bo::entry::extobj  and drm_gpuvm_bo::entry::evict would
>>>> be protected by either the vm's dma_resv (or possibly an outer lock in
>>>> the case of the extobj list).
>>>
>>> Outer lock wouldn't have been working for updates in the async path, but
>>> shouldn't be relevant anymore. We could use the VM's resv for that.
>>>
>>>>
>>>>>   And we can't have the GEM objs dma-resv lock held when calling
>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>>>> refcount drops
>>>>> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might
>>>>> drop the
>>>>> last reference of the GEM object.
>>>>
>>>> Yes, but this is a different problem as to what exactly protects
>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per bo list
>>>> lock, or if we want to keep the bo's dma_resv we need to ensure that
>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount (I know
>>>> Boris didn't like that, but requiring an explicit refcount for a
>>>> pointer you dereference unless you're under a lock that ensures keeping
>>>> the object alive is pretty much required?) But anyway for the
>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal spinlock)
>>>> I don't have a strong preference.
>>>
>>> We can keep the GEM objects dma-resv lock, however as mentioned above
>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both the VM's resv lock
>>> and the GEM's resv lock in case they differ.
>>>
>>
>>>>>>
>>
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14 13:33                       ` Thomas Hellström
@ 2023-09-14 15:37                         ` Boris Brezillon
  0 siblings, 0 replies; 77+ messages in thread
From: Boris Brezillon @ 2023-09-14 15:37 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: Dave Airlie, Danilo Krummrich, daniel, matthew.brost,
	sarah.walker, donald.robson, christian.koenig, faith.ekstrand,
	dri-devel, nouveau, linux-kernel

On Thu, 14 Sep 2023 15:33:50 +0200
Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:

> Hi,
> 
> On 9/14/23 13:54, Boris Brezillon wrote:
> > On Thu, 14 Sep 2023 12:45:44 +0200
> > Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >  
> >> On 9/14/23 10:20, Boris Brezillon wrote:  
> >>> On Wed, 13 Sep 2023 15:22:56 +0200
> >>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >>>     
> >>>> On 9/13/23 13:33, Boris Brezillon wrote:  
> >>>>> On Wed, 13 Sep 2023 12:39:01 +0200
> >>>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >>>>>        
> >>>>>> Hi,
> >>>>>>
> >>>>>> On 9/13/23 09:19, Boris Brezillon wrote:  
> >>>>>>> On Wed, 13 Sep 2023 17:05:42 +1000
> >>>>>>> Dave Airlie <airlied@gmail.com> wrote:
> >>>>>>>           
> >>>>>>>> On Wed, 13 Sept 2023 at 17:03, Boris Brezillon
> >>>>>>>> <boris.brezillon@collabora.com> wrote:  
> >>>>>>>>> On Tue, 12 Sep 2023 18:20:32 +0200
> >>>>>>>>> Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> >>>>>>>>>              
> >>>>>>>>>>> +/**
> >>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
> >>>>>>>>>>> + * @__gpuvm: The GPU VM
> >>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
> >>>>>>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
> >>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
> >>>>>>>>>>> + *
> >>>>>>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
> >>>>>>>>>>> + * iterator releases the lock immediately after picking the first element from
> >>>>>>>>>>> + * the list, so list insertion deletion can happen concurrently.  
> >>>>>>>>>> Are the list spinlocks needed for that async state update from within
> >>>>>>>>>> the dma-fence critical section we've discussed previously?  
> >>>>>>>>> Any driver calling _[un]link() from its drm_gpu_scheduler::run_job()
> >>>>>>>>> hook will be in this situation (Panthor at the moment, PowerVR soon). I
> >>>>>>>>> get that Xe and Nouveau don't need that because they update the VM
> >>>>>>>>> state early (in the ioctl path), but I keep thinking this will hurt us
> >>>>>>>>> if we don't think it through from the beginning, because once you've
> >>>>>>>>> set this logic to depend only on resv locks, it will be pretty hard to
> >>>>>>>>> get back to a solution which lets synchronous VM_BINDs take precedence
> >>>>>>>>> on asynchronous request, and, with vkQueueBindSparse() passing external
> >>>>>>>>> deps (plus the fact the VM_BIND queue might be pretty deep), it can
> >>>>>>>>> take a long time to get your synchronous VM_BIND executed...  
> >>>>>> So this would boil down to either (possibly opt-in) keeping the spinlock
> >>>>>> approach or pushing the unlink out to a wq then?  
> >>>>> Deferred _unlink() would not be an issue, since I already defer the
> >>>>> drm_gpuva destruction to a wq, it would just a be a matter of moving the
> >>>>> _unlink() call there as well. But _link() also takes the GEM gpuva list
> >>>>> lock, and that one is bit tricky, in that sm_map() can trigger 2 more
> >>>>> _link() calls for the prev/next mappings, which we can't guess until we
> >>>>> get to execute the VM update. If we mandate the use of the GEM resv
> >>>>> lock, that simply means async VM updates (AKA calling
> >>>>> drm_gpuvm_sm_[un]map()) are not an option. And if this is what everyone
> >>>>> agrees on, then I'd like the APIs that make this sort of async VM
> >>>>> update possible (drm_gpuvm_sm_[un]map(), the drm_gpuvm_ops::sm_step*
> >>>>> methods, and probably other things) to be dropped, so we don't make it
> >>>>> look like it's something we support.
> >>>>>        
> >>>>>> BTW, as also asked in a reply to Danilo, how do you call unlink from
> >>>>>> run_job() when it was requiring the obj->dma_resv lock, or was that a WIP?  
> >>>>> _unlink() makes sure the GEM gpuva list lock is taken, but this can be
> >>>>> a custom lock (see drm_gem_gpuva_set_lock()). In panthor we have
> >>>>> panthor_gem_object::gpuva_list_lock that's dedicated the gpuva list
> >>>>> protection. We make sure we never take this lock while allocating
> >>>>> memory to guarantee the dma-signalling path can't deadlock.
> >>>>>        
> >>>>>>>>>              
> >>>>>>>> btw what is the use case for this? do we have actual vulkan
> >>>>>>>> applications we know will have problems here?  
> >>>>>>> I don't, but I think that's a concern Faith raised at some point (dates
> >>>>>>> back from when I was reading threads describing how VM_BIND on i915
> >>>>>>> should work, and I was clearly discovering this whole VM_BIND thing at
> >>>>>>> that time, so maybe I misunderstood).
> >>>>>>>           
> >>>>>>>> it feels like a bit of premature optimisation, but maybe we have use cases.  
> >>>>>>> Might be, but that's the sort of thing that would put us in a corner if
> >>>>>>> we don't have a plan for when the needs arise. Besides, if we don't
> >>>>>>> want to support that case because it's too complicated, I'd recommend
> >>>>>>> dropping all the drm_gpuvm APIs that let people think this mode is
> >>>>>>> valid/supported (map/remap/unmap hooks in drm_gpuvm_ops,
> >>>>>>> drm_gpuvm_sm_[un]map helpers, etc). Keeping them around just adds to the
> >>>>>>> confusion.  
> >>>>>> Xe allows bypassing the bind-queue with another bind-queue, but to
> >>>>>> completely avoid dependencies between queues the Operations may not
> >>>>>> overlap.  
> >>>>> So, you check the VM state with some VM lock held (would be the VM resv
> >>>>> in my case), and if the mapping is new (no overlaps with pre-existing
> >>>>> mappings), you queue it to the fast-track/sync-VM_BIND queue. What would
> >>>>> be missing I guess is a way to know if the mapping is active (MMU has
> >>>>> been updated) or pending (MMU update queued to the bind-queue), so I can
> >>>>> fast-track mapping/unmapping of active mappings.  
> >>> Ok, so I started modifying the implementation, and quickly realized the
> >>> overlap test can't be done without your xe_range_fence tree because of
> >>> unmaps. Since we call drm_gpuva_unmap() early/in the IOCTL path (IOW,
> >>> before the mapping teardown is effective), we lose track of this
> >>> yet-to-be-executed-unmap operation, and if we do our
> >>> va_range_overlaps_with_existing_mappings() test after such an unmap has
> >>> been queued using just the drm_gpuvm tree, we might get false even if
> >>> the mapping still exists and is expected to be torn down when the
> >>> VM_BIND(unmap) job is executed on the bind-queue. As a result, this
> >>> might execute the VM_BIND(map,sync) immediately (because the dependency
> >>> went undetected), and then the vm_bind_run_job() function kicks in and
> >>> undoes what the synchronous VM_BIND(map) did. Am I missing something?
> >>>
> >>> If I'm correct, that means I'm back to having synchronous VM_BIND ops
> >>> queued after all asynchronous ones unless I use something like your
> >>> xe_range_fence solution (which I was hoping I could postpone until we
> >>> decide to expose multiple bind queues).  
> >> Yes, unfortunately fine-granular async range-tracking comes with a cost.
> >> Still, if you are doing page-table updates solely with the CPU, you
> >> could probably short-circuit the fence part of the fenced ranges?  
> > I'm doing it with the CPU, but asynchronously (bind-queue), so I'm
> > facing pretty much the same problems, I think.
> >  
> >>  
> >>> I'm still a bit skeptical about this 'update VM mappings tree early,
> >>> defer MMU page table updates' approach, where the VM state and the
> >>> actual page table tree are temporarily out of sync until all operations
> >>> have been flushed on all queues targeting a VM. This means any test we
> >>> do on the gpuvm, like, 'give me the BO mapped at VA xxx', is subject to
> >>> 'is this the current state or the future state?' questioning. Note that
> >>> we can't even get the current VM state anymore, because all the
> >>> drm_gpuvm::tree stores with this solution is the future state, and
> >>> to-be-unmapped mappings are lost during the transitioning period (when
> >>> vm_bind jobs are queued but not executed yet).  
> >> Understandable. But this is the way we historically have been doing
> >> things, (I think the whole async atomic page-flipping is using the same
> >> concept), but rather than refering to it as current state and future
> >> state, I'd like to think it as Synchronous CPU state (What an API user
> >> sees) vs GPU state (What the GPU sees where it's currently executing).  
> > Actually, the latency incurred by the fact the page table updates are
> > done by the GPU is one thing, and I guess I could agree with you if that
> > was the only difference between the GPU and CPU view. But the fact
> > VM_BIND jobs can have external dependencies makes things a lot more
> > confusing. I might be wrong, but I think atomic page-flip is simpler.
> > Yes you can have implicit deps on your scanout buffer, and yes the HW
> > will wait for these fences to signal before updating the plane pointer,
> > but that's still just a simple pipeline with one resource to deal with.
> > A VM is a whole range with virtual memory regions being attached
> > physical mem chunks, possibly with each range having its own lifecycle,
> > etc. It'd make more sense to me to have a way to know the current
> > state, and the future state.  
> 
> Yeah so in Xe we support async bind jobs solely to be able to do deep 
> pipelining and it's not only the pagetable jobs, You could have multiple 
> bind-evict-restore-exec-unbind-bind-evict-restore-exec all piplelined 
> and only the available memory resources sets the limit. In fact you can 
> even have physical VRAM assigned to a bo which won't be used until exec 
> #5 in the pipeline and released in exec #4 since TTM is aware of async 
> memory management.
> 
> So something needs to absorb the state discrepancy between what you 
> refer to as the current state and the future state. The question is what 
> should absorb it? Should it be the gpuvm or some associated driver state 
> tracking?

That's exactly what I'd like to sort out.

> 
> Now let's say that you have a deferred bind state-update pending and 
> track the *current* state in the gpuvm so that a number of vma unmaps 
> and maps aren't yet visible to gpuvm and then you submit an exec ioctl. 
> How does the exec ioctl know the gpuvm state?

A tree of pending VM ops, ordered by VA range, with overlapping
allowed (to support pipelining), each assigned a fence object? With the
fence + explicit deps passed to a job, we should know which extra
extobjs to add to the currently mapped extobjs. But I wasn't even
considering something that complex. Extobjs can be added early, even
before the mapping is active (that's what I was doing in my previous
PoC, using the async VM update model). Same goes for evicted BOs: they
can be re-pinned early. The only downside is that we might force BO
residency to last longer than strictly needed (because we'd be adding
our GPU job fence to extobjs we don't necessarily need, if these
extobjs end up being mapped after the GPU job is executed, which can
happen if the deps passed to VM_BIND prevent its execution).
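
For the record, the kind of structure I mean is only a very rough
sketch; none of this exists in drm_gpuvm today and all names are
hypothetical:

struct pending_vm_op {
	struct rb_node node;		/* ordered by VA range */
	u64 addr;
	u64 range;
	struct dma_fence *fence;	/* signals once the op has executed */
	/* plus enough info to describe the map/unmap being applied */
};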

> Like external bos to 
> validate or bos that become evicted, userptr vmas that have been 
> invalidated? Does the exec need to block waiting for the bind fence to 
> complete so that it can assess the VM state that UMD intended to be there?

I'd say no, given the GPU job added its fence to the VM resv and all
extobjs' resvs. If a mapping update is queued, it should be waiting for
the job using the previous mapping to complete, thus making BO
retrieval from an exception path okay (and when I say exception path, I
intentionally exclude any allocation requests, because those would need
extra precautions, like non-blocking allocation, and I don't even want
to think about that at the moment).

> 
> >
> > Just one example, say you have a GPU job that triggers some fault
> > that's supposed to be handled by the kernel driver to unblock the
> > situation. In order to have some context, the kernel driver needs to
> > read a GPU buffer that's passed back as a virtual address by the GPU/FW,
> > so it calls drm_gpuvm_bo_find(), and now it might potentially get a BO
> > that's not the current BO being mapped at this address, but the future
> > BO after some asynchronous VM_BIND(map) has been executed, and of
> > course, the VM_BIND job leading to this future state, could have a
> > dependency on the GPU job, because this GPU job was using the old
> > mapping. It might sound completely hypothetical, but that's actually
> > the sort of things the Mali FW does in a few occasions.  
> 
> Recoverable faults are typically requiring some sort of memory operation 
> that requires the dma_resv or outer lock, like validation or 
> get_user_pages(), and can thus not be performed in the fence signalling 
> critical path and on Xe they are reserved for Long-Running VMs. On 
> those, pipelining is not really needed and is disallowed in Xe to avoid 
> having to deal with the state discrepancy.

I intentionally didn't take the map/alloc-on-fault example, because
that one is a bit complicated, and we don't need it (at least not yet).

> 
> But to the actual problem you mention, let's say its a fault that 
> triggers a need to dump bo contents, then yes in order to be able to do 
> deep pipelining in this way the driver needs to track some state 
> discrepancy, and that's an additional overhead.

Yes, more something like that.

> 
> >
> > So yeah, I'm still not convinced we can always get away with just the
> > future representation of the VM. Sometimes you have to know what's
> > mapped at the moment.
> >  
> >> To bring them in sync you need to wait for fences.  
> > Wouldn't solve the case I mentioned above, AFAICT.
> >  
> >> And ideally the async
> >> work should never fail.  
> > Sure, that I considered for granted. If async VM_BIND fails, we just
> > flag the VM as unusable, and cancel any GPU job submission happening on
> > the VM. The user then has to recreate the VM to take a fresh start
> > (DEVICE_LOST situation).
> >
> > It a bit tricky when we want to clean things up after a failure,
> > because we might have lost track of some of mappings (early
> > gpuva_unmap(), but the MMU page tables are still lying around). In our
> > case (Panthor) that's not really an issue though, because
> > free_io_pgtable_ops() will take care of that for us.
> >  
> >> If one wants to push async work out to be handled solely by the GPU,
> >> this is the way things must be done since the GPU can't take locks or
> >> allocate memory, but as part or all of async work is sometimes done
> >> using the CPU, it might make sense to challenge that to some extent.  
> > I think updating the VM state in the run_job() with drm_gpuva_[un]map()
> > would still account for the GPU-is-executing-pgtable-updates latency,
> > and that's not really the sort of desynchronization I'm worried about,
> > because when you get to submit your VM_BIND job, you know all the job
> > deps are met, and the VM update is about to happen. What I'm worried
> > about is the desynchronization incurred by complex VM_BIND job deps
> > that make it hard to know what's the diff between the drm_gpuvm state
> > (predicting the future) and the VM state a GPU job expects (the
> > present).  
> 
> Yes, that sort of deep pipelining requires additional "current" state
> tracking for some situations, but waiting in exec for the current state 
> to catch up with future state, which it seems is a consequence of async 
> state updates, isn't really an option for us.

I wasn't really considering waits as an option, and I do intend to
pipeline VM_BIND and GPU jobs, with deps taking care of further
ordering constraints. What I had in mind was more a way to retrieve the
future state from the current state + a list of diffs. Actually I don't
even mind doing it the other way around (retrieving the current state
from the future state plus a list of pending operations reverted),
because such exceptions are rare enough that we can accept the extra
cost. My point was, having just the future state doesn't always work,
and there's currently no way we can have a list of diffs to revert,
because we lose track of some operations, like unmaps.
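
Purely to illustrate what I mean by "a list of diffs": something like the
sketch below, which does not exist in drm_gpuvm today and where all names
are made up.

/* Hypothetical per-VM journal of VM_BIND operations not yet applied. */
struct vm_pending_op {
	struct list_head entry;
	bool is_map;			/* map vs. unmap */
	u64 addr;
	u64 range;
	struct drm_gem_object *obj;	/* NULL for unmaps */
};

/*
 * drm_gpuvm would keep encoding the future state; walking such a list in
 * reverse from an exception path would reconstruct the present state,
 * including the unmaps we currently lose track of.
 */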

> 
> Now if you think the decision to remove those spinlocks from drm_gpuvm 
> was premature, I'm fully OK to have them in there again, but opt-in so 
> that we have helpers that fit all purposes.

Well, with the dual-mode APIs, at least the driver could decide what
the drm_gpuvm state was encoding (current if you call
drm_gpuva_[un]map() from the run_job() path, or future if you do it
from the IOCTL/submit path). With the new model, that's no longer
an option. But even the new model could work out if I have a way to get
the current state from the future state; I was just hoping we could
make this logic generic...


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14 13:48   ` Thomas Hellström
@ 2023-09-14 16:36     ` Danilo Krummrich
  2023-09-14 17:21       ` Thomas Hellström
  0 siblings, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-14 16:36 UTC (permalink / raw)
  To: Thomas Hellström, airlied, daniel, matthew.brost,
	sarah.walker, donald.robson, boris.brezillon, christian.koenig,
	faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel

On 9/14/23 15:48, Thomas Hellström wrote:
> Hi, Danilo
> 
> Some additional minor comments as xe conversion progresses.
> 
> On 9/9/23 17:31, Danilo Krummrich wrote:
>> So far the DRM GPUVA manager offers common infrastructure to track GPU VA
>> allocations and mappings, generically connect GPU VA mappings to their
>> backing buffers and perform more complex mapping operations on the GPU VA
>> space.
>>
>> However, there are more design patterns commonly used by drivers, which
>> can potentially be generalized in order to make the DRM GPUVA manager
>> represent a basic GPU-VM implementation. In this context, this patch aims
>> at generalizing the following elements.
>>
>> 1) Provide a common dma-resv for GEM objects not being used outside of
>>     this GPU-VM.
>>
>> 2) Provide tracking of external GEM objects (GEM objects which are
>>     shared with other GPU-VMs).
>>
>> 3) Provide functions to efficiently lock all GEM objects dma-resv the
>>     GPU-VM contains mappings of.
>>
>> 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
>>     of, such that validation of evicted GEM objects is accelerated.
>>
>> 5) Provide some convenience functions for common patterns.
>>
>> Rather than being designed as a "framework", the target is to make all
>> features appear as a collection of optional helper functions, such that
>> drivers are free to make use of the DRM GPUVA managers basic
>> functionality and opt-in for other features without setting any feature
>> flags, just by making use of the corresponding functions.
>>
>> Big kudos to Boris Brezillon for his help to figure out locking for drivers
>> updating the GPU VA space within the fence signalling path.
>>
>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>> ---
>>
>> +/**
>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to / from a
>> + * &drm_gpuvms evicted list
>> + * @obj: the &drm_gem_object to add or remove
>> + * @evict: indicates whether the object is evicted
>> + *
>> + * Adds a &drm_gem_object to or removes it from all &drm_gpuvms evicted
>> + * list containing a mapping of this &drm_gem_object.
>> + */
>> +void
>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>> +{
>> +    struct drm_gpuvm_bo *vm_bo;
>> +
>> +    drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>> +        if (evict)
>> +            drm_gpuvm_bo_list_add(vm_bo, evict);
>> +        else
>> +            drm_gpuvm_bo_list_del(vm_bo, evict);
>> +    }
>> +}
>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>> +
> 
> We need a drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, ...) that puts a single gpuvm_bo on the list, the above function could perhaps be renamed as drm_gpuvm_gem_obj_evict(obj, ....).

Makes sense - gonna change that.
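
Something along these lines, I guess (just a sketch, not the final API):

/*
 * Sketch only: per-vm_bo variant that only touches the evict list of the
 * one VM the vm_bo belongs to.
 */
void
drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, bool evict)
{
	if (evict)
		drm_gpuvm_bo_list_add(vm_bo, evict);
	else
		drm_gpuvm_bo_list_del(vm_bo, evict);
}

The object-wide helper above would then become something like
drm_gpuvm_gem_obj_evict(obj, evict), iterating all vm_bos of the object.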

> 
> Reason is some vm's are faulting vms which don't have an evict list, but validate from the pagefault handler. Also evict == false is dangerous because if called from within an exec, it might remove the obj from other vm's evict list before they've had a chance to rebind their VMAs.
> 
>>   static int
>>   __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>              struct drm_gpuva *va)
>> diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
>> index afa50b9059a2..834bb6d6617e 100644
>> --- a/include/drm/drm_gpuvm.h
>> +++ b/include/drm/drm_gpuvm.h
>> @@ -26,10 +26,12 @@
>>    */
>>   #include <linux/list.h>
>> +#include <linux/dma-resv.h>
>>   #include <linux/rbtree.h>
>>   #include <linux/types.h>
>>   #include <drm/drm_gem.h>
>> +#include <drm/drm_exec.h>
>>   struct drm_gpuvm;
>>   struct drm_gpuvm_bo;
>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>        * space
>>        */
>>       struct dma_resv *resv;
>> +
>> +    /**
>> +     * @extobj: structure holding the extobj list
>> +     */
>> +    struct {
>> +        /**
>> +         * @list: &list_head storing &drm_gpuvm_bos serving as
>> +         * external object
>> +         */
>> +        struct list_head list;
>> +
>> +        /**
>> +         * @lock: spinlock to protect the extobj list
>> +         */
>> +        spinlock_t lock;
>> +    } extobj;
>> +
>> +    /**
>> +     * @evict: structure holding the evict list and evict list lock
>> +     */
>> +    struct {
>> +        /**
>> +         * @list: &list_head storing &drm_gpuvm_bos currently being
>> +         * evicted
>> +         */
>> +        struct list_head list;
>> +
>> +        /**
>> +         * @lock: spinlock to protect the evict list
>> +         */
>> +        spinlock_t lock;
>> +    } evict;
>>   };
>>   void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>>               const struct drm_gpuvm_ops *ops);
>>   void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>> +/**
>> + * drm_gpuvm_is_extobj() - indicates whether the given &drm_gem_object is an
>> + * external object
>> + * @gpuvm: the &drm_gpuvm to check
>> + * @obj: the &drm_gem_object to check
>> + *
>> + * Returns: true if the &drm_gem_object &dma_resv differs from the
>> + * &drm_gpuvms &dma_resv, false otherwise
>> + */
>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
>> +                       struct drm_gem_object *obj)
>> +{
>> +    return obj && obj->resv != gpuvm->resv;
>> +}
>> +
>>   static inline struct drm_gpuva *
>>   __drm_gpuva_next(struct drm_gpuva *va)
>>   {
>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>   #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
>>       list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
>> +/**
>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of &drm_exec
>> + *
>> + * This structure should be created on the stack as &drm_exec should be.
>> + *
>> + * Optionally, @extra can be set in order to lock additional &drm_gem_objects.
>> + */
>> +struct drm_gpuvm_exec {
>> +    /**
>> +     * @exec: the &drm_exec structure
>> +     */
>> +    struct drm_exec exec;
>> +
>> +    /**
>> +     * @vm: the &drm_gpuvm to lock its DMA reservations
>> +     */
>> +    struct drm_gpuvm *vm;
>> +
>> +    /**
>> +     * @extra: Callback and corresponding private data for the driver to
>> +     * lock arbitrary additional &drm_gem_objects.
>> +     */
>> +    struct {
>> +        /**
>> +         * @fn: The driver callback to lock additional &drm_gem_objects.
>> +         */
>> +        int (*fn)(struct drm_gpuvm_exec *vm_exec,
>> +              unsigned int num_fences);
>> +
>> +        /**
>> +         * @priv: driver private data for the @fn callback
>> +         */
>> +        void *priv;
>> +    } extra;
>> +};
>> +
>> +/**
>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
>> + * @gpuvm: the &drm_gpuvm
>> + * @exec: the &drm_exec context
>> + * @num_fences: the amount of &dma_fences to reserve
>> + *
>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object.
>> + *
>> + * Using this function directly, it is the drivers responsibility to call
>> + * drm_exec_init() and drm_exec_fini() accordingly.
>> + *
>> + * Returns: 0 on success, negative error code on failure.
>> + */
>> +static inline int
>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>> +             struct drm_exec *exec,
>> +             unsigned int num_fences)
>> +{
>> +    return drm_exec_prepare_obj(exec, &gpuvm->d_obj, num_fences);
>> +}
>> +
>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>> +                  struct drm_exec *exec,
>> +                  unsigned int num_fences);
>> +
>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>> +                struct drm_exec *exec,
>> +                u64 addr, u64 range,
>> +                unsigned int num_fences);
>> +
>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>> +            unsigned int num_fences,
>> +            bool interruptible);
>> +
>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>> +                  struct drm_gem_object **objs,
>> +                  unsigned int num_objs,
>> +                  unsigned int num_fences,
>> +                  bool interruptible);
>> +
>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>> +                  u64 addr, u64 range,
>> +                  unsigned int num_fences,
>> +                  bool interruptible);
>> +
>> +/**
>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
>> + * @vm_exec: the &drm_gpuvm_exec wrapper
>> + *
>> + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
>> + * through drm_gpuvm_exec_lock() or its variants.
>> + */
>> +static inline void
>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>> +{
>> +    drm_exec_fini(&vm_exec->exec);
>> +}
>> +
>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>> +                  struct drm_exec *exec,
>> +                  struct dma_fence *fence,
>> +                  enum dma_resv_usage private_usage,
>> +                  enum dma_resv_usage extobj_usage);
>> +
>> +/**
>> + * drm_gpuvm_exec_resv_add_fence()
>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>> + * @fence: fence to add
>> + * @private_usage: private dma-resv usage
>> + * @extobj_usage: extobj dma-resv usage
>> + *
>> + * See drm_gpuvm_resv_add_fence().
>> + */
>> +static inline void
>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
>> +                  struct dma_fence *fence,
>> +                  enum dma_resv_usage private_usage,
>> +                  enum dma_resv_usage extobj_usage)
>> +{
>> +    drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
>> +                 private_usage, extobj_usage);
>> +}
>> +
>>   /**
>>    * struct drm_gpuvm_bo - structure representing a &drm_gpuvm and
>>    * &drm_gem_object combination
>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>                * gpuva list.
>>                */
>>               struct list_head gem;
>> +
>> +            /**
>> +             * @extobj: List entry to attach to the &drm_gpuvms
>> +             * extobj list.
>> +             */
>> +            struct list_head extobj;
>> +
>> +            /**
>> +             * @evict: List entry to attach to the &drm_gpuvms evict
>> +             * list.
>> +             */
>> +            struct list_head evict;
>>           } entry;
>>       } list;
>>   };
>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>   drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>             struct drm_gem_object *obj);
>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>> +
>>   /**
>>    * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of &drm_gpuva
>>    * @va__: &drm_gpuva structure to assign to in each iteration step
>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>        * used.
>>        */
>>       int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
>> +
>> +    /**
>> +     * @bo_validate: called from drm_gpuvm_validate()
>> +     *
>> +     * Drivers receive this callback for every evicted &drm_gem_object being
>> +     * mapped in the corresponding &drm_gpuvm.
>> +     *
>> +     * Typically, drivers would call their driver specific variant of
>> +     * ttm_bo_validate() from within this callback.
>> +     */
>> +    int (*bo_validate)(struct drm_gem_object *obj);
> 
> Same here. Could we have a vm_bo as an argument instead, so that the callback knows what gpuvm we're targeting and can mark all its gpu_vas for revalidation? Or is that intended to be done elsewhere?

Makes sense as well. I'll change that too.
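
I.e. something like this (sketch, not the final signature):

	/*
	 * Sketch only: pass the vm_bo so the callback knows which GPUVM is
	 * being validated and can mark its gpuvas for rebind.
	 */
	int (*bo_validate)(struct drm_gpuvm_bo *vm_bo);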

> 
>>   };
>>   int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
> 
> Thanks,
> 
> Thomas
> 
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [Nouveau] [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14 15:27                   ` Danilo Krummrich
@ 2023-09-14 17:13                     ` Thomas Hellström
  2023-09-14 17:15                       ` Danilo Krummrich
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-14 17:13 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: matthew.brost, sarah.walker, nouveau, linux-kernel, dri-devel,
	boris.brezillon, donald.robson, daniel, christian.koenig,
	faith.ekstrand

On Thu, 2023-09-14 at 17:27 +0200, Danilo Krummrich wrote:
> On 9/14/23 13:32, Thomas Hellström wrote:
> > 
> > On 9/14/23 12:57, Danilo Krummrich wrote:
> > > On 9/13/23 14:16, Danilo Krummrich wrote:
> > > 
> > > <snip>
> > > 
> > > > > > And validate() can remove it while still holding all dma-
> > > > > > resv locks,
> > > > > > neat!
> > > > > > However, what if two tasks are trying to lock the VA space
> > > > > > concurrently? What
> > > > > > do we do when the drm_gpuvm_bo's refcount drops to zero in
> > > > > > drm_gpuva_unlink()?
> > > > > > Are we guaranteed that at this point of time the
> > > > > > drm_gpuvm_bo is not
> > > > > > on the
> > > > > > evicted list? Because otherwise we would call
> > > > > > drm_gpuvm_bo_destroy()
> > > > > > with the
> > > > > > dma-resv lock held, which wouldn't be allowed, since
> > > > > > drm_gpuvm_bo_destroy()
> > > > > > might drop the last reference to the drm_gem_object and
> > > > > > hence we'd
> > > > > > potentially
> > > > > > free the dma-resv lock while holding it, at least if it's
> > > > > > an external
> > > > > > object.
> > > > > 
> > > > > Easiest way in this scheme is to think of the lists as being
> > > > > protected
> > > > > by the vm's resv lock. That means anybody calling unlink()
> > > > > must also
> > > > > hold the vm's resv lock. (Which is OK from an UAF point of
> > > > > view, but
> > > > > perhaps not from a locking inversion POW from an async list
> > > > > update).
> > > > 
> > > > This would mean that on unlink() we'd need to hold the VM's
> > > > resv lock and the
> > > > corresponding GEM's resv lock (in case they're not the same
> > > > anyways) because the
> > > > VM's resv lock would protect the external / evicted object
> > > > lists and the GEM
> > > > objects resv lock protects the GEM's list of drm_gpuvm_bos and
> > > > the
> > > > drm_gpuvm_bo's list of drm_gpuvas.
> > > 
> > > As mentioned below the same applies for drm_gpuvm_bo_put() since
> > > it might
> > > destroy the vm_bo, which includes removing the vm_bo from
> > > external / evicted
> > > object lists and the GEMs list of vm_bos.
> > > 
> > > As mentioned, if the GEM's dma-resv is different from the VM's
> > > dma-resv we need
> > > to take both locks. Ultimately, this would mean we need a
> > > drm_exec loop, because
> > > we can't know the order in which to take these locks. Doing a
> > > full drm_exec loop
> > > just to put() a vm_bo doesn't sound reasonable to me.
> > > 
> > > Can we instead just have an internal mutex for locking the lists
> > > such that we
> > > avoid taking and dropping the spinlocks, which we use currently,
> > > in a loop?
> > 
> > You'd have the same locking inversion problem with a mutex, right?
> > Since in the eviction path you have resv->mutex, from exec you have
> > resv->mutex->resv because validate would attempt to grab resv.
> 
> Both lists, evict and extobj, would need to have a separate mutex,
> not a common one.
> We'd also need a dedicated GEM gpuva lock. Then the only rule would
> be that you can't
> hold the dma-resv lock when calling put(). Which I admit is not that
> nice.
> 
> With the current spinlock solution drivers wouldn't need to worry
> about anything locking
> related though. So maybe I come back to your proposal of having a
> switch for external
> locking with dma-resv locks entirely. Such that with external dma-
> resv locking I skip
> all the spinlocks and add lockdep checks instead.
> 
> I think that makes the most sense in terms of taking advantage of
> external dma-resv locking
> where possible and on the other hand having a self-contained solution
> if not. This should
> get all concerns out of the way, yours, Christian's and Boris'.

If we need additional locks yes, I'd prefer the opt-in/opt-out spinlock
solution, and check back after a while to see if we can remove either
option once most pitfalls are hit.
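
In the resv-protected mode the list helpers could then just assert the
lock instead of taking the spinlock, roughly like below (sketch only, the
flag name is a placeholder):

static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
				 spinlock_t *lock)
{
	if (!gpuvm->resv_protected_lists)
		spin_lock(lock);
	else
		dma_resv_assert_held(gpuvm->resv);
}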

Thanks,
/Thomas


> 
> > 
> > That said, xe currently indeed does the vm+bo exec dance on vma
> > put.
> > 
> > One reason why that seemingly horrible construct is good, is that
> > when evicting an extobj and you need to access individual vmas to
> > Zap page table entries or TLB flush, those VMAs are not allowed to
> > go away (we're not refcounting them). Holding the bo resv on gpuva
> > put prevents that from happening. Possibly one could use another
> > mutex to protect the gem->vm_bo list to achieve the same, but we'd
> > need to hold it on gpuva put.
> > 
> > /Thomas
> > 
> > 
> > > 
> > > - Danilo
> > > 
> > > > 
> > > > > 
> > > > > > 
> > > > > > > > 
> > > > > > > > For extobjs an outer lock would be enough in case of
> > > > > > > > Xe, but I
> > > > > > > > really would not
> > > > > > > > like to add even more complexity just to get the
> > > > > > > > spinlock out of
> > > > > > > > the way in case
> > > > > > > > the driver already has an outer lock protecting this
> > > > > > > > path.
> > > > > > > 
> > > > > > > I must disagree here. These spinlocks and atomic
> > > > > > > operations are
> > > > > > > pretty
> > > > > > > costly and as discussed earlier this type of locking was
> > > > > > > the reason
> > > > > > > (at
> > > > > > > least according to the commit message) that made
> > > > > > > Christian drop the
> > > > > > > XArray
> > > > > > > use in drm_exec for the same set of objects: "The locking
> > > > > > > overhead
> > > > > > > is
> > > > > > > unecessary and measurable". IMHO the spinlock is the
> > > > > > > added
> > > > > > > complexity and a
> > > > > > > single wide lock following the drm locking guidelines set
> > > > > > > out by
> > > > > > > Daniel and
> > > > > > > David should really be the default choice with an opt-in
> > > > > > > for a
> > > > > > > spinlock if
> > > > > > > needed for async and pushing out to a wq is not an
> > > > > > > option.
> > > > > > 
> > > > > > For the external object list an outer lock would work as
> > > > > > long as it's
> > > > > > not the
> > > > > > dma-resv lock of the corresponding GEM object, since here
> > > > > > we actually
> > > > > > need to
> > > > > > remove the list entry from the external object list on
> > > > > > drm_gpuvm_bo_destroy().
> > > > > > It's just a bit weird design wise that drivers would need
> > > > > > to take
> > > > > > this outer
> > > > > > lock on:
> > > > > > 
> > > > > > - drm_gpuvm_bo_extobj_add()
> > > > > > - drm_gpuvm_bo_destroy()        (and hence also
> > > > > > drm_gpuvm_bo_put())
> > > > > > - drm_gpuva_unlink()            (because it needs to call
> > > > > > drm_gpuvm_bo_put())
> > > > > > - drm_gpuvm_exec_lock()
> > > > > > - drm_gpuvm_exec_lock_array()
> > > > > > - drm_gpuvm_prepare_range()
> > > > > > 
> > > > > > Given that it seems reasonable to do all the required
> > > > > > locking
> > > > > > internally.
> > > > > 
> > > > >  From a design POW, there has been a clear direction in XE to
> > > > > make
> > > > > things similar to mmap() / munmap(), so this outer lock,
> > > > > which in Xe is
> > > > > an rwsem, is used in a similar way as the mmap_lock. It's
> > > > > protecting
> > > > > the page-table structures and vma rb tree, the userptr
> > > > > structures and
> > > > > the extobj list. Basically it's taken early in the exec
> > > > > IOCTL, the
> > > > > VM_BIND ioctl, the compute rebind worker and the pagefault
> > > > > handler, so
> > > > > all of the above are just asserting that it is taken in the
> > > > > correct
> > > > > mode.
> > > > > 
> > > > > But strictly with this scheme one could also use the vm's
> > > > > dma_resv for
> > > > > the extobj list since with drm_exec, it's locked before
> > > > > traversing the
> > > > > list.
> > > > > 
> > > > > The whole point of this scheme is to rely on locks that you
> > > > > already are
> > > > > supposed to be holding for various reasons and is simple to
> > > > > comprehend.
> > > > 
> > > > I don't agree that we're supposed to hold the VM's resv lock
> > > > anyways for
> > > > functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but
> > > > I'm fine using it
> > > > for that purpose nevertheless.
> > > > 
> > > > > 
> > > > > > 
> > > > > > In order to at least place lockdep checks, the driver would
> > > > > > need to
> > > > > > supply the
> > > > > > corresponding lock's lockdep_map, because the GPUVM
> > > > > > otherwise doesn't
> > > > > > know about
> > > > > > the lock.
> > > > > 
> > > > > Yes, that sounds reasonable. One lockdep map per list.
> > > > 
> > > > I'd really like to avoid that, especially now that everything
> > > > got simpler. We
> > > > should define the actual locks to take instead.
> > > > 
> > > > > 
> > > > > > 
> > > > > > Out of curiosity, what is the overhead of a spin_lock()
> > > > > > that doesn't
> > > > > > need to
> > > > > > spin?
> > > > > 
> > > > > I guess it's hard to tell exactly, but it is much lower on
> > > > > modern x86
> > > > > than what it used to be. Not sure about ARM, which is the
> > > > > other
> > > > > architecture important to us. I figure if there is little
> > > > > cache-line
> > > > > bouncing the main overhead comes from the implied barriers.
> > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > A pretty simple way that would not add much code would be
> > > > > > > 
> > > > > > > static void gpuvm_cond_spin_lock(const struct drm_gpuvm
> > > > > > > *gpuvm,
> > > > > > > spinlock_t
> > > > > > > *lock)
> > > > > > > 
> > > > > > > {
> > > > > > > 
> > > > > > >      if (!gpuvm->resv_protected_lists)
> > > > > > >          spin_lock(lock);
> > > > > > > 
> > > > > > > }
> > > > > > > 
> > > > > > > > > For such drivers, that would require anybody calling
> > > > > > > > > unlink to
> > > > > > > > > hold the vm's
> > > > > > > > > resv, though.
> > > > > > > > In V4 I want to go back to having a dedicated lock for
> > > > > > > > the GEMs
> > > > > > > > gpuva list (or
> > > > > > > > VM_BO list to be more precise). We can't just use the
> > > > > > > > dma-resv
> > > > > > > > lock for that
> > > > > > > > with VM_BO abstractions, because on destruction of a
> > > > > > > > VM_BO we
> > > > > > > > otherwise wouldn't
> > > > > > > > be allowed to already hold the dma-resv lock. That's
> > > > > > > > the fix I
> > > > > > > > was referring to
> > > > > > > > earlier.
> > > > > > > 
> > > > > > > Yeah, I can see the need for a dedicated lock for the
> > > > > > > GEM's gpuva
> > > > > > > list, but
> > > > > > > holding the vm's dma-resv lock across the unlink
> > > > > > > shouldn't be a
> > > > > > > problem. We
> > > > > > > may free the object and a pointer to the vm's resv during
> > > > > > > unlink
> > > > > > > but we
> > > > > > > don't free the vm's resv.  It'd be a matter of ensuring
> > > > > > > that any
> > > > > > > calls to
> > > > > > > unlink from *within* drm_gpuvm allows it to be held.
> > > > > > 
> > > > > > Drivers calling unlink() from the fence signaling path
> > > > > > can't use the
> > > > > > VM's
> > > > > > dma-resv lock.
> > > > > 
> > > > > Yes, that made me a bit curious because in the current
> > > > > version the code
> > > > > required the object's dma_resv for unlink() which can't be
> > > > > grabbed
> > > > > either from the fence signaling path. So are there any
> > > > > drivers actually
> > > > > wanting to do that? If so, they will either need to resort to
> > > > > the
> > > > > current spinlock solution or they will need to call unlink
> > > > > from a
> > > > > workqueue item.
> > > > 
> > > > As Boris already mentioned we have the dma-resv lock by default
> > > > or a driver
> > > > specific GEM gpuva lock as opt-in. Now, we can get rid of the
> > > > latter.
> > > > 
> > > > > > 
> > > > > > Also, what if the object is an external object? We can't
> > > > > > use the VM's
> > > > > > dma-resv
> > > > > > lock here.
> > > > > 
> > > > > Why? Typically (sync) unlink is only ever called from an
> > > > > unbind-like
> > > > > operation where it should be trivial to grab the vm's resv.
> > > > > Or, for
> > > > > that matter any outer lock protecting the extobj list. Rule
> > > > > would be
> > > > > the drm_gpuvm_bo::entry::extobj  and
> > > > > drm_gpuvm_bo::entry::evict would
> > > > > be protected by either the vm's dma_resv (or possibly an
> > > > > outer lock in
> > > > > the case of the extobj list).
> > > > 
> > > > Outer lock wouldn't have been working for updates in the async
> > > > path, but
> > > > shouldn't be relevant anymore. We could use the VM's resv for
> > > > that.
> > > > 
> > > > > 
> > > > > >   And we can't have the GEM objs dma-resv lock held when
> > > > > > calling
> > > > > > unlink(), since unlink() calls drm_gpuvm_bo_put(), which if
> > > > > > the
> > > > > > refcount drops
> > > > > > to zero calls drm_gpuvm_bo_destroy() and
> > > > > > drm_gpuvm_bo_destroy() might
> > > > > > drop the
> > > > > > last reference of the GEM object.
> > > > > 
> > > > > Yes, but this is a different problem as to what exactly
> > > > > protects
> > > > > drm_gpuvm_bo::entry::gem. Either as you suggest an internal
> > > > > per bo list
> > > > > lock, or if we want to keep the bo's dma_resv we need to
> > > > > ensure that
> > > > > the caller of dma_resv_unlock(obj->resv) actually refcounts
> > > > > its obj
> > > > > pointer, and doesn't implicitly rely on the gpuvm_bo's
> > > > > refcount (I know
> > > > > Boris didn't like that, but requiring an explicit refcount
> > > > > for a
> > > > > pointer you dereference unless you're under a lock that
> > > > > ensures keeping
> > > > > the object alive is pretty much required?) But anyway for the
> > > > > drm_gpuvm_bo::entry::gem list protection (bo resv or internal
> > > > > spinlock)
> > > > > I don't have a strong preference.
> > > > 
> > > > We can keep the GEM objects dma-resv lock, however as mentioned
> > > > above
> > > > drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both
> > > > the VM's resv lock
> > > > and the GEM's resv lock in case they differ.
> > > > 
> > > 
> > > > > > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [Nouveau] [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14 17:13                     ` Thomas Hellström
@ 2023-09-14 17:15                       ` Danilo Krummrich
  2023-09-18 11:21                         ` Danilo Krummrich
  0 siblings, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-14 17:15 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: matthew.brost, sarah.walker, nouveau, linux-kernel, dri-devel,
	boris.brezillon, donald.robson, daniel, christian.koenig,
	faith.ekstrand

On 9/14/23 19:13, Thomas Hellström wrote:
> On Thu, 2023-09-14 at 17:27 +0200, Danilo Krummrich wrote:
>> On 9/14/23 13:32, Thomas Hellström wrote:
>>>
>>> On 9/14/23 12:57, Danilo Krummrich wrote:
>>>> On 9/13/23 14:16, Danilo Krummrich wrote:
>>>>
>>>> <snip>
>>>>
>>>>>>> And validate() can remove it while still holding all dma-
>>>>>>> resv locks,
>>>>>>> neat!
>>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>>> concurrently? What
>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>>> drm_gpuva_unlink()?
>>>>>>> Are we guaranteed that at this point of time the
>>>>>>> drm_gpuvm_bo is not
>>>>>>> on the
>>>>>>> evicted list? Because otherwise we would call
>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>> with the
>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>> might drop the last reference to the drm_gem_object and
>>>>>>> hence we'd
>>>>>>> potentially
>>>>>>> free the dma-resv lock while holding it, at least if it's
>>>>>>> an external
>>>>>>> object.
>>>>>>
>>>>>> Easiest way in this scheme is to think of the lists as being
>>>>>> protected
>>>>>> by the vm's resv lock. That means anybody calling unlink()
>>>>>> must also
>>>>>> hold the vm's resv lock. (Which is OK from an UAF point of
>>>>>> view, but
>>>>>> perhaps not from a locking inversion POW from an async list
>>>>>> update).
>>>>>
>>>>> This would mean that on unlink() we'd need to hold the VM's
>>>>> resv lock and the
>>>>> corresponding GEM's resv lock (in case they're not the same
>>>>> anyways) because the
>>>>> VM's resv lock would protect the external / evicted object
>>>>> lists and the GEM
>>>>> objects resv lock protects the GEM's list of drm_gpuvm_bos and
>>>>> the
>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>
>>>> As mentioned below the same applies for drm_gpuvm_bo_put() since
>>>> it might
>>>> destroy the vm_bo, which includes removing the vm_bo from
>>>> external / evicted
>>>> object lists and the GEMs list of vm_bos.
>>>>
>>>> As mentioned, if the GEM's dma-resv is different from the VM's
>>>> dma-resv we need
>>>> to take both locks. Ultimately, this would mean we need a
>>>> drm_exec loop, because
>>>> we can't know the order in which to take these locks. Doing a
>>>> full drm_exec loop
>>>> just to put() a vm_bo doesn't sound reasonable to me.
>>>>
>>>> Can we instead just have an internal mutex for locking the lists
>>>> such that we
>>>> avoid taking and dropping the spinlocks, which we use currently,
>>>> in a loop?
>>>
>>> You'd have the same locking inversion problem with a mutex, right?
>>> Since in the eviction path you have resv->mutex, from exec you have
>>> resv->mutex->resv because validate would attempt to grab resv.
>>
>> Both lists, evict and extobj, would need to have a separate mutex,
>> not a common one.
>> We'd also need a dedicated GEM gpuva lock. Then the only rule would
>> be that you can't
>> hold the dma-resv lock when calling put(). Which I admit is not that
>> nice.
>>
>> With the current spinlock solution drivers wouldn't need to worry
>> about anything locking
>> related though. So maybe I come back to your proposal of having a
>> switch for external
>> locking with dma-resv locks entirely. Such that with external dma-
>> resv locking I skip
>> all the spinlocks and add lockdep checks instead.
>>
>> I think that makes the most sense in terms of taking advantage of
>> external dma-resv locking
>> where possible and on the other hand having a self-contained solution
>> if not. This should
>> get all concerns out of the way, yours, Christian's and Boris'.
> 
> If we need additional locks yes, I'd prefer the opt-in/opt-out spinlock
> solution, and check back after a while to see if we can remove either
> option once most pitfalls are hit.

Sounds good, I'll prepare this for a V4.

- Danilo

> 
> Thanks,
> /Thomas
> 
> 
>>
>>>
>>> That said, xe currently indeed does the vm+bo exec dance on vma
>>> put.
>>>
>>> One reason why that seemingly horrible construct is good, is that
>>> when evicting an extobj and you need to access individual vmas to
>>> Zap page table entries or TLB flush, those VMAs are not allowed to
>>> go away (we're not refcounting them). Holding the bo resv on gpuva
>>> put prevents that from happening. Possibly one could use another
>>> mutex to protect the gem->vm_bo list to achieve the same, but we'd
>>> need to hold it on gpuva put.
>>>
>>> /Thomas
>>>
>>>
>>>>
>>>> - Danilo
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> For extobjs an outer lock would be enough in case of
>>>>>>>>> Xe, but I
>>>>>>>>> really would not
>>>>>>>>> like to add even more complexity just to get the
>>>>>>>>> spinlock out of
>>>>>>>>> the way in case
>>>>>>>>> the driver already has an outer lock protecting this
>>>>>>>>> path.
>>>>>>>>
>>>>>>>> I must disagree here. These spinlocks and atomic
>>>>>>>> operations are
>>>>>>>> pretty
>>>>>>>> costly and as discussed earlier this type of locking was
>>>>>>>> the reason
>>>>>>>> (at
>>>>>>>> least according to the commit message) that made
>>>>>>>> Christian drop the
>>>>>>>> XArray
>>>>>>>> use in drm_exec for the same set of objects: "The locking
>>>>>>>> overhead
>>>>>>>> is
>>>>>>>> unecessary and measurable". IMHO the spinlock is the
>>>>>>>> added
>>>>>>>> complexity and a
>>>>>>>> single wide lock following the drm locking guidelines set
>>>>>>>> out by
>>>>>>>> Daniel and
>>>>>>>> David should really be the default choice with an opt-in
>>>>>>>> for a
>>>>>>>> spinlock if
>>>>>>>> needed for async and pushing out to a wq is not an
>>>>>>>> option.
>>>>>>>
>>>>>>> For the external object list an outer lock would work as
>>>>>>> long as it's
>>>>>>> not the
>>>>>>> dma-resv lock of the corresponding GEM object, since here
>>>>>>> we actually
>>>>>>> need to
>>>>>>> remove the list entry from the external object list on
>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>> It's just a bit weird design wise that drivers would need
>>>>>>> to take
>>>>>>> this outer
>>>>>>> lock on:
>>>>>>>
>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also
>>>>>>> drm_gpuvm_bo_put())
>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>> drm_gpuvm_bo_put())
>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>
>>>>>>> Given that it seems reasonable to do all the required
>>>>>>> locking
>>>>>>> internally.
>>>>>>
>>>>>>   From a design POW, there has been a clear direction in XE to
>>>>>> make
>>>>>> things similar to mmap() / munmap(), so this outer lock,
>>>>>> which in Xe is
>>>>>> an rwsem, is used in a similar way as the mmap_lock. It's
>>>>>> protecting
>>>>>> the page-table structures and vma rb tree, the userptr
>>>>>> structures and
>>>>>> the extobj list. Basically it's taken early in the exec
>>>>>> IOCTL, the
>>>>>> VM_BIND ioctl, the compute rebind worker and the pagefault
>>>>>> handler, so
>>>>>> all of the above are just asserting that it is taken in the
>>>>>> correct
>>>>>> mode.
>>>>>>
>>>>>> But strictly with this scheme one could also use the vm's
>>>>>> dma_resv for
>>>>>> the extobj list since with drm_exec, it's locked before
>>>>>> traversing the
>>>>>> list.
>>>>>>
>>>>>> The whole point of this scheme is to rely on locks that you
>>>>>> already are
>>>>>> supposed to be holding for various reasons and is simple to
>>>>>> comprehend.
>>>>>
>>>>> I don't agree that we're supposed to hold the VM's resv lock
>>>>> anyways for
>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but
>>>>> I'm fine using it
>>>>> for that purpose nevertheless.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> In order to at least place lockdep checks, the driver would
>>>>>>> need to
>>>>>>> supply the
>>>>>>> corresponding lock's lockdep_map, because the GPUVM
>>>>>>> otherwise doesn't
>>>>>>> know about
>>>>>>> the lock.
>>>>>>
>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>
>>>>> I'd really like to avoid that, especially now that everything
>>>>> got simpler. We
>>>>> should define the actual locks to take instead.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Out of curiosity, what is the overhead of a spin_lock()
>>>>>>> that doesn't
>>>>>>> need to
>>>>>>> spin?
>>>>>>
>>>>>> I guess it's hard to tell exactly, but it is much lower on
>>>>>> modern x86
>>>>>> than what it used to be. Not sure about ARM, which is the
>>>>>> other
>>>>>> architecture important to us. I figure if there is little
>>>>>> cache-line
>>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>>
>>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm
>>>>>>>> *gpuvm,
>>>>>>>> spinlock_t
>>>>>>>> *lock)
>>>>>>>>
>>>>>>>> {
>>>>>>>>
>>>>>>>>       if (!gpuvm->resv_protected_lists)
>>>>>>>>           spin_lock(lock);
>>>>>>>>
>>>>>>>> }
>>>>>>>>
>>>>>>>>>> For such drivers, that would require anybody calling
>>>>>>>>>> unlink to
>>>>>>>>>> hold the vm's
>>>>>>>>>> resv, though.
>>>>>>>>> In V4 I want to go back to having a dedicated lock for
>>>>>>>>> the GEMs
>>>>>>>>> gpuva list (or
>>>>>>>>> VM_BO list to be more precise). We can't just use the
>>>>>>>>> dma-resv
>>>>>>>>> lock for that
>>>>>>>>> with VM_BO abstractions, because on destruction of a
>>>>>>>>> VM_BO we
>>>>>>>>> otherwise wouldn't
>>>>>>>>> be allowed to already hold the dma-resv lock. That's
>>>>>>>>> the fix I
>>>>>>>>> was referring to
>>>>>>>>> earlier.
>>>>>>>>
>>>>>>>> Yeah, I can see the need for a dedicated lock for the
>>>>>>>> GEM's gpuva
>>>>>>>> list, but
>>>>>>>> holding the vm's dma-resv lock across the unlink
>>>>>>>> shouldn't be a
>>>>>>>> problem. We
>>>>>>>> may free the object and a pointer to the vm's resv during
>>>>>>>> unlink
>>>>>>>> but we
>>>>>>>> don't free the vm's resv.  It'd be a matter of ensuring
>>>>>>>> that any
>>>>>>>> calls to
>>>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>>>
>>>>>>> Drivers calling unlink() from the fence signaling path
>>>>>>> can't use the
>>>>>>> VM's
>>>>>>> dma-resv lock.
>>>>>>
>>>>>> Yes, that made me a bit curious because in the current
>>>>>> version the code
>>>>>> required the object's dma_resv for unlink() which can't be
>>>>>> grabbed
>>>>>> either from the fence signaling path. So are there any
>>>>>> drivers actually
>>>>>> wanting to do that? If so, they will either need to resort to
>>>>>> the
>>>>>> current spinlock solution or they will need to call unlink
>>>>>> from a
>>>>>> workqueue item.
>>>>>
>>>>> As Boris already mentioned we have the dma-resv lock by default
>>>>> or a driver
>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of the
>>>>> latter.
>>>>>
>>>>>>>
>>>>>>> Also, what if the object is an external object? We can't
>>>>>>> use the VM's
>>>>>>> dma-resv
>>>>>>> lock here.
>>>>>>
>>>>>> Why? Typically (sync) unlink is only ever called from an
>>>>>> unbind-like
>>>>>> operation where it should be trivial to grab the vm's resv.
>>>>>> Or, for
>>>>>> that matter any outer lock protecting the extobj list. Rule
>>>>>> would be
>>>>>> the drm_gpuvm_bo::entry::extobj  and
>>>>>> drm_gpuvm_bo::entry::evict would
>>>>>> be protected by either the vm's dma_resv (or possibly an
>>>>>> outer lock in
>>>>>> the case of the extobj list).
>>>>>
>>>>> Outer lock wouldn't have been working for updates in the async
>>>>> path, but
>>>>> shouldn't be relevant anymore. We could use the VM's resv for
>>>>> that.
>>>>>
>>>>>>
>>>>>>>    And we can't have the GEM objs dma-resv lock held when
>>>>>>> calling
>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if
>>>>>>> the
>>>>>>> refcount drops
>>>>>>> to zero calls drm_gpuvm_bo_destroy() and
>>>>>>> drm_gpuvm_bo_destroy() might
>>>>>>> drop the
>>>>>>> last reference of the GEM object.
>>>>>>
>>>>>> Yes, but this is a different problem as to what exactly
>>>>>> protects
>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal
>>>>>> per bo list
>>>>>> lock, or if we want to keep the bo's dma_resv we need to
>>>>>> ensure that
>>>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts
>>>>>> its obj
>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's
>>>>>> refcount (I know
>>>>>> Boris didn't like that, but requiring an explicit refcount
>>>>>> for a
>>>>>> pointer you dereference unless you're under a lock that
>>>>>> ensures keeping
>>>>>> the object alive is pretty much required?) But anyway for the
>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal
>>>>>> spinlock)
>>>>>> I don't have a strong preference.
>>>>>
>>>>> We can keep the GEM objects dma-resv lock, however as mentioned
>>>>> above
>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both
>>>>> the VM's resv lock
>>>>> and the GEM's resv lock in case they differ.
>>>>>
>>>>
>>>>>>>>
>>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14 16:36     ` Danilo Krummrich
@ 2023-09-14 17:21       ` Thomas Hellström
  2023-09-14 17:25         ` Danilo Krummrich
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-14 17:21 UTC (permalink / raw)
  To: Danilo Krummrich, airlied, daniel, matthew.brost, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel

On Thu, 2023-09-14 at 18:36 +0200, Danilo Krummrich wrote:
> On 9/14/23 15:48, Thomas Hellström wrote:
> > Hi, Danilo
> > 
> > Some additional minor comments as xe conversion progresses.
> > 
> > On 9/9/23 17:31, Danilo Krummrich wrote:
> > > So far the DRM GPUVA manager offers common infrastructure to
> > > track GPU VA
> > > allocations and mappings, generically connect GPU VA mappings to
> > > their
> > > backing buffers and perform more complex mapping operations on
> > > the GPU VA
> > > space.
> > > 
> > > However, there are more design patterns commonly used by drivers,
> > > which
> > > can potentially be generalized in order to make the DRM GPUVA
> > > manager
> > > represent a basic GPU-VM implementation. In this context, this
> > > patch aims
> > > at generalizing the following elements.
> > > 
> > > 1) Provide a common dma-resv for GEM objects not being used
> > > outside of
> > >     this GPU-VM.
> > > 
> > > 2) Provide tracking of external GEM objects (GEM objects which
> > > are
> > >     shared with other GPU-VMs).
> > > 
> > > 3) Provide functions to efficiently lock all GEM objects dma-resv
> > > the
> > >     GPU-VM contains mappings of.
> > > 
> > > 4) Provide tracking of evicted GEM objects the GPU-VM contains
> > > mappings
> > >     of, such that validation of evicted GEM objects is
> > > accelerated.
> > > 
> > > 5) Provide some convenience functions for common patterns.
> > > 
> > > Rather than being designed as a "framework", the target is to
> > > make all
> > > features appear as a collection of optional helper functions,
> > > such that
> > > drivers are free to make use of the DRM GPUVA managers basic
> > > functionality and opt-in for other features without setting any
> > > feature
> > > flags, just by making use of the corresponding functions.
> > > 
> > > Big kudos to Boris Brezillon for his help to figure out locking
> > > for drivers
> > > updating the GPU VA space within the fence signalling path.
> > > 
> > > Suggested-by: Matthew Brost <matthew.brost@intel.com>
> > > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > > ---
> > > 
> > > +/**
> > > + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to /
> > > from a
> > > + * &drm_gpuvms evicted list
> > > + * @obj: the &drm_gem_object to add or remove
> > > + * @evict: indicates whether the object is evicted
> > > + *
> > > + * Adds a &drm_gem_object to or removes it from all &drm_gpuvms
> > > evicted
> > > + * list containing a mapping of this &drm_gem_object.
> > > + */
> > > +void
> > > +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
> > > +{
> > > +    struct drm_gpuvm_bo *vm_bo;
> > > +
> > > +    drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
> > > +        if (evict)
> > > +            drm_gpuvm_bo_list_add(vm_bo, evict);
> > > +        else
> > > +            drm_gpuvm_bo_list_del(vm_bo, evict);
> > > +    }
> > > +}
> > > +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
> > > +
> > 
> > We need a drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, ...) that
> > puts a single gpuvm_bo on the list, the above function could
> > perhaps be renamed as drm_gpuvm_gem_obj_evict(obj, ....).
> 
> Makes sense - gonna change that.
> 
> > 
> > Reason is some vm's are faulting vms which don't have an evict
> > list, but validate from the pagefault handler. Also evict == false
> > is dangerous because if called from within an exec, it might remove
> > the obj from other vm's evict list before they've had a chance to
> > rebind their VMAs.
> > 
> > >   static int
> > >   __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
> > >              struct drm_gpuva *va)
> > > diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
> > > index afa50b9059a2..834bb6d6617e 100644
> > > --- a/include/drm/drm_gpuvm.h
> > > +++ b/include/drm/drm_gpuvm.h
> > > @@ -26,10 +26,12 @@
> > >    */
> > >   #include <linux/list.h>
> > > +#include <linux/dma-resv.h>
> > >   #include <linux/rbtree.h>
> > >   #include <linux/types.h>
> > >   #include <drm/drm_gem.h>
> > > +#include <drm/drm_exec.h>
> > >   struct drm_gpuvm;
> > >   struct drm_gpuvm_bo;
> > > @@ -259,6 +261,38 @@ struct drm_gpuvm {
> > >        * space
> > >        */
> > >       struct dma_resv *resv;
> > > +
> > > +    /**
> > > +     * @extobj: structure holding the extobj list
> > > +     */
> > > +    struct {
> > > +        /**
> > > +         * @list: &list_head storing &drm_gpuvm_bos serving as
> > > +         * external object
> > > +         */
> > > +        struct list_head list;
> > > +
> > > +        /**
> > > +         * @lock: spinlock to protect the extobj list
> > > +         */
> > > +        spinlock_t lock;
> > > +    } extobj;
> > > +
> > > +    /**
> > > +     * @evict: structure holding the evict list and evict list
> > > lock
> > > +     */
> > > +    struct {
> > > +        /**
> > > +         * @list: &list_head storing &drm_gpuvm_bos currently
> > > being
> > > +         * evicted
> > > +         */
> > > +        struct list_head list;
> > > +
> > > +        /**
> > > +         * @lock: spinlock to protect the evict list
> > > +         */
> > > +        spinlock_t lock;
> > > +    } evict;
> > >   };
> > >   void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device
> > > *drm,
> > > @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm,
> > > struct drm_device *drm,
> > >               const struct drm_gpuvm_ops *ops);
> > >   void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
> > > +/**
> > > + * drm_gpuvm_is_extobj() - indicates whether the given
> > > &drm_gem_object is an
> > > + * external object
> > > + * @gpuvm: the &drm_gpuvm to check
> > > + * @obj: the &drm_gem_object to check
> > > + *
> > > + * Returns: true if the &drm_gem_object &dma_resv differs from
> > > the
> > > + * &drm_gpuvms &dma_resv, false otherwise
> > > + */
> > > +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
> > > +                       struct drm_gem_object *obj)
> > > +{
> > > +    return obj && obj->resv != gpuvm->resv;
> > > +}
> > > +
> > >   static inline struct drm_gpuva *
> > >   __drm_gpuva_next(struct drm_gpuva *va)
> > >   {
> > > @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
> > >   #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
> > >       list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list,
> > > rb.entry)
> > > +/**
> > > + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of &drm_exec
> > > + *
> > > + * This structure should be created on the stack as &drm_exec
> > > should be.
> > > + *
> > > + * Optionally, @extra can be set in order to lock additional
> > > &drm_gem_objects.
> > > + */
> > > +struct drm_gpuvm_exec {
> > > +    /**
> > > +     * @exec: the &drm_exec structure
> > > +     */
> > > +    struct drm_exec exec;
> > > +
> > > +    /**
> > > +     * @vm: the &drm_gpuvm to lock its DMA reservations
> > > +     */
> > > +    struct drm_gpuvm *vm;
> > > +
> > > +    /**
> > > +     * @extra: Callback and corresponding private data for the
> > > driver to
> > > +     * lock arbitrary additional &drm_gem_objects.
> > > +     */
> > > +    struct {
> > > +        /**
> > > +         * @fn: The driver callback to lock additional
> > > &drm_gem_objects.
> > > +         */
> > > +        int (*fn)(struct drm_gpuvm_exec *vm_exec,
> > > +              unsigned int num_fences);
> > > +
> > > +        /**
> > > +         * @priv: driver private data for the @fn callback
> > > +         */
> > > +        void *priv;
> > > +    } extra;
> > > +};
> > > +
> > > +/**
> > > + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
> > > + * @gpuvm: the &drm_gpuvm
> > > + * @exec: the &drm_exec context
> > > + * @num_fences: the amount of &dma_fences to reserve
> > > + *
> > > + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
> > > &drm_gem_object.
> > > + *
> > > + * Using this function directly, it is the drivers
> > > responsibility to call
> > > + * drm_exec_init() and drm_exec_fini() accordingly.
> > > + *
> > > + * Returns: 0 on success, negative error code on failure.
> > > + */
> > > +static inline int
> > > +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
> > > +             struct drm_exec *exec,
> > > +             unsigned int num_fences)
> > > +{
> > > +    return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
> > > num_fences);
> > > +}
> > > +
> > > +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> > > +                  struct drm_exec *exec,
> > > +                  unsigned int num_fences);
> > > +
> > > +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
> > > +                struct drm_exec *exec,
> > > +                u64 addr, u64 range,
> > > +                unsigned int num_fences);
> > > +
> > > +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> > > +            unsigned int num_fences,
> > > +            bool interruptible);
> > > +
> > > +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
> > > +                  struct drm_gem_object **objs,
> > > +                  unsigned int num_objs,
> > > +                  unsigned int num_fences,
> > > +                  bool interruptible);
> > > +
> > > +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
> > > +                  u64 addr, u64 range,
> > > +                  unsigned int num_fences,
> > > +                  bool interruptible);
> > > +
> > > +/**
> > > + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
> > > + * @vm_exec: the &drm_gpuvm_exec wrapper
> > > + *
> > > + * Releases all dma-resv locks of all &drm_gem_objects
> > > previously acquired
> > > + * through drm_gpuvm_exec_lock() or its variants.
> > > + */
> > > +static inline void
> > > +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
> > > +{
> > > +    drm_exec_fini(&vm_exec->exec);
> > > +}
> > > +
> > > +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
> > > +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> > > +                  struct drm_exec *exec,
> > > +                  struct dma_fence *fence,
> > > +                  enum dma_resv_usage private_usage,
> > > +                  enum dma_resv_usage extobj_usage);
> > > +
> > > +/**
> > > + * drm_gpuvm_exec_resv_add_fence()
> > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > + * @fence: fence to add
> > > + * @private_usage: private dma-resv usage
> > > + * @extobj_usage: extobj dma-resv usage
> > > + *
> > > + * See drm_gpuvm_resv_add_fence().
> > > + */
> > > +static inline void
> > > +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
> > > +                  struct dma_fence *fence,
> > > +                  enum dma_resv_usage private_usage,
> > > +                  enum dma_resv_usage extobj_usage)
> > > +{
> > > +    drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
> > > +                 private_usage, extobj_usage);
> > > +}
> > > +
> > >   /**
> > >    * struct drm_gpuvm_bo - structure representing a &drm_gpuvm
> > > and
> > >    * &drm_gem_object combination
> > > @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
> > >                * gpuva list.
> > >                */
> > >               struct list_head gem;
> > > +
> > > +            /**
> > > +             * @extobj: List entry to attach to the &drm_gpuvms
> > > +             * extobj list.
> > > +             */
> > > +            struct list_head extobj;
> > > +
> > > +            /**
> > > +             * @evict: List entry to attach to the &drm_gpuvms
> > > evict
> > > +             * list.
> > > +             */
> > > +            struct list_head evict;
> > >           } entry;
> > >       } list;
> > >   };
> > > @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
> > >   drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
> > >             struct drm_gem_object *obj);
> > > +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
> > > +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
> > > +
> > >   /**
> > >    * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of
> > > &drm_gpuva
> > >    * @va__: &drm_gpuva structure to assign to in each iteration
> > > step
> > > @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
> > >        * used.
> > >        */
> > >       int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
> > > +
> > > +    /**
> > > +     * @bo_validate: called from drm_gpuvm_validate()
> > > +     *
> > > +     * Drivers receive this callback for every evicted
> > > &drm_gem_object being
> > > +     * mapped in the corresponding &drm_gpuvm.
> > > +     *
> > > +     * Typically, drivers would call their driver specific
> > > variant of
> > > +     * ttm_bo_validate() from within this callback.
> > > +     */
> > > +    int (*bo_validate)(struct drm_gem_object *obj);
> > 
> > Same here. Could we have a vm_bo as an argument instead, so that
> > the callback knows what gpuvm we're targeting and can mark all its
> > gpu_vas for revalidation? Or is that intended to be done elsewhere?
> 
> Makes sense as well. I'll change that too.

I forgot, drm_gpuvm_validate() would preferably take a drm_gpuvm_exec
argument because we need it in the validate callback. It's also easy
for the driver to subclass further if needed, to pass even more
arguments to its validate callback.
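
For illustration, the signatures under discussion might end up looking
roughly like this (hypothetical sketch, not part of the posted patch):

/* validate helper gets the drm_gpuvm_exec so it can hand it down to the
 * driver's validate callback: */
int drm_gpuvm_validate(struct drm_gpuvm *gpuvm,
		       struct drm_gpuvm_exec *vm_exec);

/* in struct drm_gpuvm_ops, a vm_bo instead of the bare GEM object, so
 * the callback knows which VM is being validated: */
int (*bo_validate)(struct drm_gpuvm_bo *vm_bo,
		   struct drm_gpuvm_exec *vm_exec);

A driver wanting extra context in its validate callback could then embed
struct drm_gpuvm_exec in a larger driver structure and container_of() its
way back to it.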

/Thomas


> 
> > 
> > >   };
> > >   int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
> > 
> > Thanks,
> > 
> > Thomas
> > 
> > 
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14 17:21       ` Thomas Hellström
@ 2023-09-14 17:25         ` Danilo Krummrich
  2023-09-14 19:14           ` Thomas Hellström
  0 siblings, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-14 17:25 UTC (permalink / raw)
  To: Thomas Hellström, airlied, daniel, matthew.brost,
	sarah.walker, donald.robson, boris.brezillon, christian.koenig,
	faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel

On 9/14/23 19:21, Thomas Hellström wrote:
> On Thu, 2023-09-14 at 18:36 +0200, Danilo Krummrich wrote:
>> On 9/14/23 15:48, Thomas Hellström wrote:
>>> Hi, Danilo
>>>
>>> Some additional minor comments as xe conversion progresses.
>>>
>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>> So far the DRM GPUVA manager offers common infrastructure to
>>>> track GPU VA
>>>> allocations and mappings, generically connect GPU VA mappings to
>>>> their
>>>> backing buffers and perform more complex mapping operations on
>>>> the GPU VA
>>>> space.
>>>>
>>>> However, there are more design patterns commonly used by drivers,
>>>> which
>>>> can potentially be generalized in order to make the DRM GPUVA
>>>> manager
>>>> represent a basic GPU-VM implementation. In this context, this
>>>> patch aims
>>>> at generalizing the following elements.
>>>>
>>>> 1) Provide a common dma-resv for GEM objects not being used
>>>> outside of
>>>>      this GPU-VM.
>>>>
>>>> 2) Provide tracking of external GEM objects (GEM objects which
>>>> are
>>>>      shared with other GPU-VMs).
>>>>
>>>> 3) Provide functions to efficiently lock all GEM objects dma-resv
>>>> the
>>>>      GPU-VM contains mappings of.
>>>>
>>>> 4) Provide tracking of evicted GEM objects the GPU-VM contains
>>>> mappings
>>>>      of, such that validation of evicted GEM objects is
>>>> accelerated.
>>>>
>>>> 5) Provide some convenience functions for common patterns.
>>>>
>>>> Rather than being designed as a "framework", the target is to
>>>> make all
>>>> features appear as a collection of optional helper functions,
>>>> such that
>>>> drivers are free to make use of the DRM GPUVA managers basic
>>>> functionality and opt-in for other features without setting any
>>>> feature
>>>> flags, just by making use of the corresponding functions.
>>>>
>>>> Big kudos to Boris Brezillon for his help to figure out locking
>>>> for drivers
>>>> updating the GPU VA space within the fence signalling path.
>>>>
>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>> ---
>>>>
>>>> +/**
>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to /
>>>> from a
>>>> + * &drm_gpuvms evicted list
>>>> + * @obj: the &drm_gem_object to add or remove
>>>> + * @evict: indicates whether the object is evicted
>>>> + *
>>>> + * Adds a &drm_gem_object to or removes it from all &drm_gpuvms
>>>> evicted
>>>> + * list containing a mapping of this &drm_gem_object.
>>>> + */
>>>> +void
>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>> +{
>>>> +    struct drm_gpuvm_bo *vm_bo;
>>>> +
>>>> +    drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>> +        if (evict)
>>>> +            drm_gpuvm_bo_list_add(vm_bo, evict);
>>>> +        else
>>>> +            drm_gpuvm_bo_list_del(vm_bo, evict);
>>>> +    }
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>> +
>>>
>>> We need a drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, ...) that
>>> puts a single gpuvm_bo on the list, the above function could
>>> perhaps be renamed as drm_gpuvm_gem_obj_evict(obj, ....).
>>
>> Makes sense - gonna change that.
>>
>>>
>>> Reason is some vm's are faulting vms which don't have an evict
>>> list, but validate from the pagefault handler. Also evict == false
>>> is dangerous because if called from within an exec, it might remove
>>> the obj from other vm's evict list before they've had a chance to
>>> rebind their VMAs.
>>>
>>>>    static int
>>>>    __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>               struct drm_gpuva *va)
>>>> diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
>>>> index afa50b9059a2..834bb6d6617e 100644
>>>> --- a/include/drm/drm_gpuvm.h
>>>> +++ b/include/drm/drm_gpuvm.h
>>>> @@ -26,10 +26,12 @@
>>>>     */
>>>>    #include <linux/list.h>
>>>> +#include <linux/dma-resv.h>
>>>>    #include <linux/rbtree.h>
>>>>    #include <linux/types.h>
>>>>    #include <drm/drm_gem.h>
>>>> +#include <drm/drm_exec.h>
>>>>    struct drm_gpuvm;
>>>>    struct drm_gpuvm_bo;
>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>         * space
>>>>         */
>>>>        struct dma_resv *resv;
>>>> +
>>>> +    /**
>>>> +     * @extobj: structure holding the extobj list
>>>> +     */
>>>> +    struct {
>>>> +        /**
>>>> +         * @list: &list_head storing &drm_gpuvm_bos serving as
>>>> +         * external object
>>>> +         */
>>>> +        struct list_head list;
>>>> +
>>>> +        /**
>>>> +         * @lock: spinlock to protect the extobj list
>>>> +         */
>>>> +        spinlock_t lock;
>>>> +    } extobj;
>>>> +
>>>> +    /**
>>>> +     * @evict: structure holding the evict list and evict list
>>>> lock
>>>> +     */
>>>> +    struct {
>>>> +        /**
>>>> +         * @list: &list_head storing &drm_gpuvm_bos currently
>>>> being
>>>> +         * evicted
>>>> +         */
>>>> +        struct list_head list;
>>>> +
>>>> +        /**
>>>> +         * @lock: spinlock to protect the evict list
>>>> +         */
>>>> +        spinlock_t lock;
>>>> +    } evict;
>>>>    };
>>>>    void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device
>>>> *drm,
>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm,
>>>> struct drm_device *drm,
>>>>                const struct drm_gpuvm_ops *ops);
>>>>    void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>> +/**
>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>> &drm_gem_object is an
>>>> + * external object
>>>> + * @gpuvm: the &drm_gpuvm to check
>>>> + * @obj: the &drm_gem_object to check
>>>> + *
>>>> + * Returns: true if the &drm_gem_object &dma_resv differs from
>>>> the
>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>> + */
>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
>>>> +                       struct drm_gem_object *obj)
>>>> +{
>>>> +    return obj && obj->resv != gpuvm->resv;
>>>> +}
>>>> +
>>>>    static inline struct drm_gpuva *
>>>>    __drm_gpuva_next(struct drm_gpuva *va)
>>>>    {
>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>>    #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
>>>>        list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list,
>>>> rb.entry)
>>>> +/**
>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of &drm_exec
>>>> + *
>>>> + * This structure should be created on the stack as &drm_exec
>>>> should be.
>>>> + *
>>>> + * Optionally, @extra can be set in order to lock additional
>>>> &drm_gem_objects.
>>>> + */
>>>> +struct drm_gpuvm_exec {
>>>> +    /**
>>>> +     * @exec: the &drm_exec structure
>>>> +     */
>>>> +    struct drm_exec exec;
>>>> +
>>>> +    /**
>>>> +     * @vm: the &drm_gpuvm to lock its DMA reservations
>>>> +     */
>>>> +    struct drm_gpuvm *vm;
>>>> +
>>>> +    /**
>>>> +     * @extra: Callback and corresponding private data for the
>>>> driver to
>>>> +     * lock arbitrary additional &drm_gem_objects.
>>>> +     */
>>>> +    struct {
>>>> +        /**
>>>> +         * @fn: The driver callback to lock additional
>>>> &drm_gem_objects.
>>>> +         */
>>>> +        int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>> +              unsigned int num_fences);
>>>> +
>>>> +        /**
>>>> +         * @priv: driver private data for the @fn callback
>>>> +         */
>>>> +        void *priv;
>>>> +    } extra;
>>>> +};
>>>> +
>>>> +/**
>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
>>>> + * @gpuvm: the &drm_gpuvm
>>>> + * @exec: the &drm_exec context
>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>> + *
>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>> &drm_gem_object.
>>>> + *
>>>> + * Using this function directly, it is the drivers
>>>> responsibility to call
>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>> + *
>>>> + * Returns: 0 on success, negative error code on failure.
>>>> + */
>>>> +static inline int
>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>> +             struct drm_exec *exec,
>>>> +             unsigned int num_fences)
>>>> +{
>>>> +    return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>> num_fences);
>>>> +}
>>>> +
>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>> +                  struct drm_exec *exec,
>>>> +                  unsigned int num_fences);
>>>> +
>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>> +                struct drm_exec *exec,
>>>> +                u64 addr, u64 range,
>>>> +                unsigned int num_fences);
>>>> +
>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>> +            unsigned int num_fences,
>>>> +            bool interruptible);
>>>> +
>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>> +                  struct drm_gem_object **objs,
>>>> +                  unsigned int num_objs,
>>>> +                  unsigned int num_fences,
>>>> +                  bool interruptible);
>>>> +
>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>> +                  u64 addr, u64 range,
>>>> +                  unsigned int num_fences,
>>>> +                  bool interruptible);
>>>> +
>>>> +/**
>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>> + *
>>>> + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>> + */
>>>> +static inline void
>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>> +{
>>>> +    drm_exec_fini(&vm_exec->exec);
>>>> +}
>>>> +
>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>> +                  struct drm_exec *exec,
>>>> +                  struct dma_fence *fence,
>>>> +                  enum dma_resv_usage private_usage,
>>>> +                  enum dma_resv_usage extobj_usage);
>>>> +
>>>> +/**
>>>> + * drm_gpuvm_exec_resv_add_fence()
>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>> + * @fence: fence to add
>>>> + * @private_usage: private dma-resv usage
>>>> + * @extobj_usage: extobj dma-resv usage
>>>> + *
>>>> + * See drm_gpuvm_resv_add_fence().
>>>> + */
>>>> +static inline void
>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
>>>> +                  struct dma_fence *fence,
>>>> +                  enum dma_resv_usage private_usage,
>>>> +                  enum dma_resv_usage extobj_usage)
>>>> +{
>>>> +    drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
>>>> +                 private_usage, extobj_usage);
>>>> +}
>>>> +
>>>>    /**
>>>>     * struct drm_gpuvm_bo - structure representing a &drm_gpuvm
>>>> and
>>>>     * &drm_gem_object combination
>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>                 * gpuva list.
>>>>                 */
>>>>                struct list_head gem;
>>>> +
>>>> +            /**
>>>> +             * @extobj: List entry to attach to the &drm_gpuvms
>>>> +             * extobj list.
>>>> +             */
>>>> +            struct list_head extobj;
>>>> +
>>>> +            /**
>>>> +             * @evict: List entry to attach to the &drm_gpuvms
>>>> evict
>>>> +             * list.
>>>> +             */
>>>> +            struct list_head evict;
>>>>            } entry;
>>>>        } list;
>>>>    };
>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>    drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>              struct drm_gem_object *obj);
>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>> +
>>>>    /**
>>>>     * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of
>>>> &drm_gpuva
>>>>     * @va__: &drm_gpuva structure to assign to in each iteration
>>>> step
>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>         * used.
>>>>         */
>>>>        int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
>>>> +
>>>> +    /**
>>>> +     * @bo_validate: called from drm_gpuvm_validate()
>>>> +     *
>>>> +     * Drivers receive this callback for every evicted
>>>> &drm_gem_object being
>>>> +     * mapped in the corresponding &drm_gpuvm.
>>>> +     *
>>>> +     * Typically, drivers would call their driver specific
>>>> variant of
>>>> +     * ttm_bo_validate() from within this callback.
>>>> +     */
>>>> +    int (*bo_validate)(struct drm_gem_object *obj);
>>>
>>> Same here. Could we have a vm_bo as an argument instead, so that
>>> the callback knows what gpuvm we're targeting and can mark all its
>>> gpu_vas for revalidation? Or is that intended to be done elsewhere?
>>
>> Makes sense as well. I'll change that too.
> 
> I forgot, drm_gpuvm_validate() would preferably take a drm_gpuvm_exec
> argument because we need it in the validate callback. It's also easy
> for the driver to subclass further if needed, to pass even more
> arguments to its validate callback.

Hm.. that implies that a driver open-coding the drm_exec loop still needs
to use a struct drm_gpuvm_exec rather than just a struct drm_exec. What is
this needed for in Xe? Do we expect other drivers needing it? Might a priv
void pointer maybe make more sense?
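
Spelled out, that alternative would be something like the following
(purely hypothetical, neither variant is part of the posted patch):

/* open-coded drm_exec loop in the driver; GPUVM only sees the bare
 * drm_exec plus an opaque cookie it passes through to bo_validate(): */
int drm_gpuvm_validate(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
		       void *priv);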

> 
> /Thomas
> 
> 
>>
>>>
>>>>    };
>>>>    int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
>>>
>>> Thanks,
>>>
>>> Thomas
>>>
>>>
>>
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14 17:25         ` Danilo Krummrich
@ 2023-09-14 19:14           ` Thomas Hellström
  0 siblings, 0 replies; 77+ messages in thread
From: Thomas Hellström @ 2023-09-14 19:14 UTC (permalink / raw)
  To: Danilo Krummrich, airlied, daniel, matthew.brost, sarah.walker,
	donald.robson, boris.brezillon, christian.koenig, faith.ekstrand
  Cc: dri-devel, nouveau, linux-kernel

On Thu, 2023-09-14 at 19:25 +0200, Danilo Krummrich wrote:
> On 9/14/23 19:21, Thomas Hellström wrote:
> > On Thu, 2023-09-14 at 18:36 +0200, Danilo Krummrich wrote:
> > > On 9/14/23 15:48, Thomas Hellström wrote:
> > > > Hi, Danilo
> > > > 
> > > > Some additional minor comments as xe conversion progresses.
> > > > 
> > > > On 9/9/23 17:31, Danilo Krummrich wrote:
> > > > > So far the DRM GPUVA manager offers common infrastructure to
> > > > > track GPU VA
> > > > > allocations and mappings, generically connect GPU VA mappings
> > > > > to
> > > > > their
> > > > > backing buffers and perform more complex mapping operations
> > > > > on
> > > > > the GPU VA
> > > > > space.
> > > > > 
> > > > > However, there are more design patterns commonly used by
> > > > > drivers,
> > > > > which
> > > > > can potentially be generalized in order to make the DRM GPUVA
> > > > > manager
> > > > > represent a basic GPU-VM implementation. In this context,
> > > > > this
> > > > > patch aims
> > > > > at generalizing the following elements.
> > > > > 
> > > > > 1) Provide a common dma-resv for GEM objects not being used
> > > > > outside of
> > > > >      this GPU-VM.
> > > > > 
> > > > > 2) Provide tracking of external GEM objects (GEM objects
> > > > > which
> > > > > are
> > > > >      shared with other GPU-VMs).
> > > > > 
> > > > > 3) Provide functions to efficiently lock all GEM objects dma-
> > > > > resv
> > > > > the
> > > > >      GPU-VM contains mappings of.
> > > > > 
> > > > > 4) Provide tracking of evicted GEM objects the GPU-VM
> > > > > contains
> > > > > mappings
> > > > >      of, such that validation of evicted GEM objects is
> > > > > accelerated.
> > > > > 
> > > > > 5) Provide some convenience functions for common patterns.
> > > > > 
> > > > > Rather than being designed as a "framework", the target is to
> > > > > make all
> > > > > features appear as a collection of optional helper functions,
> > > > > such that
> > > > > drivers are free to make use of the DRM GPUVA managers basic
> > > > > functionality and opt-in for other features without setting
> > > > > any
> > > > > feature
> > > > > flags, just by making use of the corresponding functions.
> > > > > 
> > > > > Big kudos to Boris Brezillon for his help to figure out
> > > > > locking
> > > > > for drivers
> > > > > updating the GPU VA space within the fence signalling path.
> > > > > 
> > > > > Suggested-by: Matthew Brost <matthew.brost@intel.com>
> > > > > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > > > > ---
> > > > > 
> > > > > +/**
> > > > > + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to
> > > > > /
> > > > > from a
> > > > > + * &drm_gpuvms evicted list
> > > > > + * @obj: the &drm_gem_object to add or remove
> > > > > + * @evict: indicates whether the object is evicted
> > > > > + *
> > > > > + * Adds a &drm_gem_object to or removes it from all
> > > > > &drm_gpuvms
> > > > > evicted
> > > > > + * list containing a mapping of this &drm_gem_object.
> > > > > + */
> > > > > +void
> > > > > +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
> > > > > +{
> > > > > +    struct drm_gpuvm_bo *vm_bo;
> > > > > +
> > > > > +    drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
> > > > > +        if (evict)
> > > > > +            drm_gpuvm_bo_list_add(vm_bo, evict);
> > > > > +        else
> > > > > +            drm_gpuvm_bo_list_del(vm_bo, evict);
> > > > > +    }
> > > > > +}
> > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
> > > > > +
> > > > 
> > > > We need a drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, ...)
> > > > that
> > > > puts a single gpuvm_bo on the list, the above function could
> > > > perhaps be renamed as drm_gpuvm_gem_obj_evict(obj, ....).
> > > 
> > > Makes sense - gonna change that.
> > > 
> > > > 
> > > > Reason is some vm's are faulting vms which don't have an evict
> > > > list, but validate from the pagefault handler. Also evict ==
> > > > false
> > > > is dangerous because if called from within an exec, it might
> > > > remove
> > > > the obj from other vm's evict list before they've had a chance
> > > > to
> > > > rebind their VMAs.
> > > > 
> > > > >    static int
> > > > >    __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
> > > > >               struct drm_gpuva *va)
> > > > > diff --git a/include/drm/drm_gpuvm.h
> > > > > b/include/drm/drm_gpuvm.h
> > > > > index afa50b9059a2..834bb6d6617e 100644
> > > > > --- a/include/drm/drm_gpuvm.h
> > > > > +++ b/include/drm/drm_gpuvm.h
> > > > > @@ -26,10 +26,12 @@
> > > > >     */
> > > > >    #include <linux/list.h>
> > > > > +#include <linux/dma-resv.h>
> > > > >    #include <linux/rbtree.h>
> > > > >    #include <linux/types.h>
> > > > >    #include <drm/drm_gem.h>
> > > > > +#include <drm/drm_exec.h>
> > > > >    struct drm_gpuvm;
> > > > >    struct drm_gpuvm_bo;
> > > > > @@ -259,6 +261,38 @@ struct drm_gpuvm {
> > > > >         * space
> > > > >         */
> > > > >        struct dma_resv *resv;
> > > > > +
> > > > > +    /**
> > > > > +     * @extobj: structure holding the extobj list
> > > > > +     */
> > > > > +    struct {
> > > > > +        /**
> > > > > +         * @list: &list_head storing &drm_gpuvm_bos serving
> > > > > as
> > > > > +         * external object
> > > > > +         */
> > > > > +        struct list_head list;
> > > > > +
> > > > > +        /**
> > > > > +         * @lock: spinlock to protect the extobj list
> > > > > +         */
> > > > > +        spinlock_t lock;
> > > > > +    } extobj;
> > > > > +
> > > > > +    /**
> > > > > +     * @evict: structure holding the evict list and evict
> > > > > list
> > > > > lock
> > > > > +     */
> > > > > +    struct {
> > > > > +        /**
> > > > > +         * @list: &list_head storing &drm_gpuvm_bos
> > > > > currently
> > > > > being
> > > > > +         * evicted
> > > > > +         */
> > > > > +        struct list_head list;
> > > > > +
> > > > > +        /**
> > > > > +         * @lock: spinlock to protect the evict list
> > > > > +         */
> > > > > +        spinlock_t lock;
> > > > > +    } evict;
> > > > >    };
> > > > >    void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
> > > > > drm_device
> > > > > *drm,
> > > > > @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
> > > > > *gpuvm,
> > > > > struct drm_device *drm,
> > > > >                const struct drm_gpuvm_ops *ops);
> > > > >    void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
> > > > > +/**
> > > > > + * drm_gpuvm_is_extobj() - indicates whether the given
> > > > > &drm_gem_object is an
> > > > > + * external object
> > > > > + * @gpuvm: the &drm_gpuvm to check
> > > > > + * @obj: the &drm_gem_object to check
> > > > > + *
> > > > > + * Returns: true if the &drm_gem_object &dma_resv differs
> > > > > from
> > > > > the
> > > > > + * &drm_gpuvms &dma_resv, false otherwise
> > > > > + */
> > > > > +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
> > > > > *gpuvm,
> > > > > +                       struct drm_gem_object *obj)
> > > > > +{
> > > > > +    return obj && obj->resv != gpuvm->resv;
> > > > > +}
> > > > > +
> > > > >    static inline struct drm_gpuva *
> > > > >    __drm_gpuva_next(struct drm_gpuva *va)
> > > > >    {
> > > > > @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
> > > > >    #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__)
> > > > > \
> > > > >        list_for_each_entry_safe(va__, next__, &(gpuvm__)-
> > > > > >rb.list,
> > > > > rb.entry)
> > > > > +/**
> > > > > + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
> > > > > &drm_exec
> > > > > + *
> > > > > + * This structure should be created on the stack as
> > > > > &drm_exec
> > > > > should be.
> > > > > + *
> > > > > + * Optionally, @extra can be set in order to lock additional
> > > > > &drm_gem_objects.
> > > > > + */
> > > > > +struct drm_gpuvm_exec {
> > > > > +    /**
> > > > > +     * @exec: the &drm_exec structure
> > > > > +     */
> > > > > +    struct drm_exec exec;
> > > > > +
> > > > > +    /**
> > > > > +     * @vm: the &drm_gpuvm to lock its DMA reservations
> > > > > +     */
> > > > > +    struct drm_gpuvm *vm;
> > > > > +
> > > > > +    /**
> > > > > +     * @extra: Callback and corresponding private data for
> > > > > the
> > > > > driver to
> > > > > +     * lock arbitrary additional &drm_gem_objects.
> > > > > +     */
> > > > > +    struct {
> > > > > +        /**
> > > > > +         * @fn: The driver callback to lock additional
> > > > > &drm_gem_objects.
> > > > > +         */
> > > > > +        int (*fn)(struct drm_gpuvm_exec *vm_exec,
> > > > > +              unsigned int num_fences);
> > > > > +
> > > > > +        /**
> > > > > +         * @priv: driver private data for the @fn callback
> > > > > +         */
> > > > > +        void *priv;
> > > > > +    } extra;
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
> > > > > resv
> > > > > + * @gpuvm: the &drm_gpuvm
> > > > > + * @exec: the &drm_exec context
> > > > > + * @num_fences: the amount of &dma_fences to reserve
> > > > > + *
> > > > > + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
> > > > > &drm_gem_object.
> > > > > + *
> > > > > + * Using this function directly, it is the drivers
> > > > > responsibility to call
> > > > > + * drm_exec_init() and drm_exec_fini() accordingly.
> > > > > + *
> > > > > + * Returns: 0 on success, negative error code on failure.
> > > > > + */
> > > > > +static inline int
> > > > > +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
> > > > > +             struct drm_exec *exec,
> > > > > +             unsigned int num_fences)
> > > > > +{
> > > > > +    return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
> > > > > num_fences);
> > > > > +}
> > > > > +
> > > > > +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
> > > > > +                  struct drm_exec *exec,
> > > > > +                  unsigned int num_fences);
> > > > > +
> > > > > +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
> > > > > +                struct drm_exec *exec,
> > > > > +                u64 addr, u64 range,
> > > > > +                unsigned int num_fences);
> > > > > +
> > > > > +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
> > > > > +            unsigned int num_fences,
> > > > > +            bool interruptible);
> > > > > +
> > > > > +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
> > > > > *vm_exec,
> > > > > +                  struct drm_gem_object **objs,
> > > > > +                  unsigned int num_objs,
> > > > > +                  unsigned int num_fences,
> > > > > +                  bool interruptible);
> > > > > +
> > > > > +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
> > > > > *vm_exec,
> > > > > +                  u64 addr, u64 range,
> > > > > +                  unsigned int num_fences,
> > > > > +                  bool interruptible);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
> > > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > > + *
> > > > > + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
> > > > > + * through drm_gpuvm_exec_lock() or its variants.
> > > > > + */
> > > > > +static inline void
> > > > > +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
> > > > > +{
> > > > > +    drm_exec_fini(&vm_exec->exec);
> > > > > +}
> > > > > +
> > > > > +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
> > > > > +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
> > > > > +                  struct drm_exec *exec,
> > > > > +                  struct dma_fence *fence,
> > > > > +                  enum dma_resv_usage private_usage,
> > > > > +                  enum dma_resv_usage extobj_usage);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuvm_exec_resv_add_fence()
> > > > > + * @vm_exec: the &drm_gpuvm_exec abstraction
> > > > > + * @fence: fence to add
> > > > > + * @private_usage: private dma-resv usage
> > > > > + * @extobj_usage: extobj dma-resv usage
> > > > > + *
> > > > > + * See drm_gpuvm_resv_add_fence().
> > > > > + */
> > > > > +static inline void
> > > > > +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
> > > > > *vm_exec,
> > > > > +                  struct dma_fence *fence,
> > > > > +                  enum dma_resv_usage private_usage,
> > > > > +                  enum dma_resv_usage extobj_usage)
> > > > > +{
> > > > > +    drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec,
> > > > > fence,
> > > > > +                 private_usage, extobj_usage);
> > > > > +}
> > > > > +
> > > > >    /**
> > > > >     * struct drm_gpuvm_bo - structure representing a
> > > > > &drm_gpuvm
> > > > > and
> > > > >     * &drm_gem_object combination
> > > > > @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
> > > > >                 * gpuva list.
> > > > >                 */
> > > > >                struct list_head gem;
> > > > > +
> > > > > +            /**
> > > > > +             * @extobj: List entry to attach to the
> > > > > &drm_gpuvms
> > > > > +             * extobj list.
> > > > > +             */
> > > > > +            struct list_head extobj;
> > > > > +
> > > > > +            /**
> > > > > +             * @evict: List entry to attach to the
> > > > > &drm_gpuvms
> > > > > evict
> > > > > +             * list.
> > > > > +             */
> > > > > +            struct list_head evict;
> > > > >            } entry;
> > > > >        } list;
> > > > >    };
> > > > > @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
> > > > >    drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
> > > > >              struct drm_gem_object *obj);
> > > > > +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
> > > > > evict);
> > > > > +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
> > > > > +
> > > > >    /**
> > > > >     * drm_gpuvm_bo_for_each_va() - iterator to walk over a
> > > > > list of
> > > > > &drm_gpuva
> > > > >     * @va__: &drm_gpuva structure to assign to in each
> > > > > iteration
> > > > > step
> > > > > @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
> > > > >         * used.
> > > > >         */
> > > > >        int (*sm_step_unmap)(struct drm_gpuva_op *op, void
> > > > > *priv);
> > > > > +
> > > > > +    /**
> > > > > +     * @bo_validate: called from drm_gpuvm_validate()
> > > > > +     *
> > > > > +     * Drivers receive this callback for every evicted
> > > > > &drm_gem_object being
> > > > > +     * mapped in the corresponding &drm_gpuvm.
> > > > > +     *
> > > > > +     * Typically, drivers would call their driver specific
> > > > > variant of
> > > > > +     * ttm_bo_validate() from within this callback.
> > > > > +     */
> > > > > +    int (*bo_validate)(struct drm_gem_object *obj);
> > > > 
> > > > Same here. Could we have a vm_bo as an argument instead, so
> > > > that
> > > > the callback knows what gpuvm we're targeting and can mark all
> > > > its
> > > > gpu_vas for revalidation? Or is that intended to be done
> > > > elsewhere?
> > > 
> > > Makes sense as well. I'll change that too.
> > 
> > I forgot, drm_gpuvm_validate() would preferably take a
> > drm_gpuvm_exec
> > argument because we need it in the validate callback. It's also
> > easy
> > for the driver to subclass further if needed, to pass even more
> > arguments to its validate callback.
> 
> Hm.. that implies that a driver open-coding the drm_exec loop still
> needs
> to use a struct drm_gpuvm_exec rather than just a struct drm_exec.
> What is
> this needed for in Xe? Do we expect other drivers needing it? Might a
> priv
> void pointer maybe make more sense?

It's for sleeping locks during eviction rather than trylocks. TTM
currently fishes out the struct ww_acquire_context used for locking
from the lock itself, but I'd expect that to be more explicit in the
near future with a variant of ttm_bo_validate() that actually
explicitly takes a drm_exec as argument.

So we would probably also like to try to find a way to encourage
drivers to include the validate() in the until_all_locked() loop,
because if TTM resorts to a sleeping lock *after* that loop, the
following warning will be hit:

https://elixir.bootlin.com/linux/latest/source/kernel/locking/ww_mutex.h#L195

So not sure what's best, but perhaps then a struct drm_exec * or a 
(void *)
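
For illustration, keeping the validation inside the drm_exec transaction
could look roughly like this in a driver (hypothetical driver-side sketch;
it uses the drm_exec API plus the helpers from this patch, with
drm_gpuvm_validate() as currently posted):

static int driver_lock_and_validate(struct drm_gpuvm *gpuvm,
				    unsigned int num_fences)
{
	struct drm_gpuvm_exec vm_exec = { .vm = gpuvm };
	struct drm_exec *exec = &vm_exec.exec;
	int ret = 0;

	drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
	drm_exec_until_all_locked(exec) {
		/* lock the VM's resv and all external object resvs */
		ret = drm_gpuvm_prepare_objects(gpuvm, exec, num_fences);
		drm_exec_retry_on_contention(exec);
		if (ret)
			break;

		/* validate evicted BOs while backing off is still possible,
		 * so any sleeping ww_mutex TTM takes stays inside the
		 * drm_exec transaction and the warning above isn't hit */
		ret = drm_gpuvm_validate(gpuvm);
		drm_exec_retry_on_contention(exec);
		if (ret)
			break;
	}
	if (ret)
		drm_exec_fini(exec);

	return ret;
}

On success the resvs stay locked; the driver would add its fences and call
drm_gpuvm_exec_unlock() afterwards.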

/Thomas


> 
> > 
> > /Thomas
> > 
> > 
> > > 
> > > > 
> > > > >    };
> > > > >    int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
> > > > 
> > > > Thanks,
> > > > 
> > > > Thomas
> > > > 
> > > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [Nouveau] [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-14 17:15                       ` Danilo Krummrich
@ 2023-09-18 11:21                         ` Danilo Krummrich
  0 siblings, 0 replies; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-18 11:21 UTC (permalink / raw)
  To: Thomas Hellström
  Cc: matthew.brost, sarah.walker, nouveau, linux-kernel, dri-devel,
	boris.brezillon, donald.robson, daniel, christian.koenig,
	faith.ekstrand

On 9/14/23 19:15, Danilo Krummrich wrote:
> On 9/14/23 19:13, Thomas Hellström wrote:
>> On Thu, 2023-09-14 at 17:27 +0200, Danilo Krummrich wrote:
>>> On 9/14/23 13:32, Thomas Hellström wrote:
>>>>
>>>> On 9/14/23 12:57, Danilo Krummrich wrote:
>>>>> On 9/13/23 14:16, Danilo Krummrich wrote:
>>>>>
>>>>> <snip>
>>>>>
>>>>>>>> And validate() can remove it while still holding all dma-
>>>>>>>> resv locks,
>>>>>>>> neat!
>>>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>>>> concurrently? What
>>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>>>> drm_gpuva_unlink()?
>>>>>>>> Are we guaranteed that at this point of time the
>>>>>>>> drm_gpuvm_bo is not
>>>>>>>> on the
>>>>>>>> evicted list? Because otherwise we would call
>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>> with the
>>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>> might drop the last reference to the drm_gem_object and
>>>>>>>> hence we'd
>>>>>>>> potentially
>>>>>>>> free the dma-resv lock while holding it, at least if it's
>>>>>>>> an external
>>>>>>>> object.
>>>>>>>
>>>>>>> Easiest way in this scheme is to think of the lists as being
>>>>>>> protected
>>>>>>> by the vm's resv lock. That means anybody calling unlink()
>>>>>>> must also
>>>>>>> hold the vm's resv lock. (Which is OK from an UAF point of
>>>>>>> view, but
>>>>>>> perhaps not from a locking inversion POW from an async list
>>>>>>> update).
>>>>>>
>>>>>> This would mean that on unlink() we'd need to hold the VM's
>>>>>> resv lock and the
>>>>>> corresponding GEM's resv lock (in case they're not the same
>>>>>> anyways) because the
>>>>>> VM's resv lock would protect the external / evicted object
>>>>>> lists and the GEM
>>>>>> objects resv lock protects the GEM's list of drm_gpuvm_bos and
>>>>>> the
>>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>>
>>>>> As mentioned below the same applies for drm_gpuvm_bo_put() since
>>>>> it might
>>>>> destroy the vm_bo, which includes removing the vm_bo from
>>>>> external / evicted
>>>>> object lists and the GEMs list of vm_bos.
>>>>>
>>>>> As mentioned, if the GEM's dma-resv is different from the VM's
>>>>> dma-resv we need
>>>>> to take both locks. Ultimately, this would mean we need a
>>>>> drm_exec loop, because
>>>>> we can't know the order in which to take these locks. Doing a
>>>>> full drm_exec loop
>>>>> just to put() a vm_bo doesn't sound reasonable to me.
>>>>>
>>>>> Can we instead just have an internal mutex for locking the lists
>>>>> such that we
>>>>> avoid taking and dropping the spinlocks, which we use currently,
>>>>> in a loop?
>>>>
>>>> You'd have the same locking inversion problem with a mutex, right?
>>>> Since in the eviction path you have resv->mutex, from exec you have
>>>> resv->mutex->resv because validate would attempt to grab resv.
>>>
>>> Both lists, evict and extobj, would need to have a separate mutex,
>>> not a common one.
>>> We'd also need a dedicated GEM gpuva lock. Then the only rule would
>>> be that you can't
>>> hold the dma-resv lock when calling put(). Which I admit is not that
>>> nice.
>>>
>>> With the current spinlock solution drivers wouldn't need to worry
>>> about anything locking
>>> related though. So maybe I come back to your proposal of having a
>>> switch for external
>>> locking with dma-resv locks entirely. Such that with external dma-
>>> resv locking I skip
>>> all the spinlocks and add lockdep checks instead.
>>>
>>> I think that makes the most sense in terms of taking advantage of
>>> external dma-resv locking
>>> where possible and on the other hand having a self-contained solution
>>> if not. This should
>>> get all concerns out of the way, yours, Christian's and Boris'.
>>
>> If we need additional locks yes, I'd prefer the opt-in/opt-out spinlock
>> solution, and check back after a while to see if we can remove either
>> option once most pitfalls are hit.
> 
> Sounds good, I'll prepare this for a V4.

I was considering getting rid of the spinlocks using srcu for both external and
evicted objects instead. This would get rid of taking/dropping the spinlock in
every iteration step of the lists, limiting it to a single srcu_read_{lock,unlock}
call per list walk. Plus, obviously the list_add_rcu() and list_del_rcu() variants
as accessors. The accessors would probably still need a spinlock to protect against
concurrent list_add_rcu()/list_del_rcu() calls, but I think those are not a concern.
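
A minimal sketch of how the read side could then look (hypothetical; it
assumes a struct srcu_struct (extobj_srcu) were added to struct drm_gpuvm,
the other field names follow the current patch):

static int gpuvm_prepare_extobjs_srcu(struct drm_gpuvm *gpuvm,
				      struct drm_exec *exec,
				      unsigned int num_fences)
{
	struct drm_gpuvm_bo *vm_bo;
	int idx, ret = 0;

	/* SRCU readers may sleep, so taking the (sleeping) dma-resv locks
	 * within the read section is fine. */
	idx = srcu_read_lock(&gpuvm->extobj_srcu);
	list_for_each_entry_rcu(vm_bo, &gpuvm->extobj.list,
				list.entry.extobj) {
		ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
		if (ret)
			break;
	}
	srcu_read_unlock(&gpuvm->extobj_srcu, idx);

	return ret;
}

The add/del side would still take the list spinlock around list_add_rcu() /
list_del_rcu(), and freeing a vm_bo would have to wait for an SRCU grace
period (synchronize_srcu() or call_srcu()).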

Any concerns from your side with this variant?

> 
> - Danilo
> 
>>
>> Thanks,
>> /Thomas
>>
>>
>>>
>>>>
>>>> That said, xe currently indeed does the vm+bo exec dance on vma
>>>> put.
>>>>
>>>> One reason why that seemingly horrible construct is good, is that
>>>> when evicting an extobj and you need to access individual vmas to
>>>> zap page table entries or TLB flush, those VMAs are not allowed to
>>>> go away (we're not refcounting them). Holding the bo resv on gpuva
>>>> put prevents that from happening. Possibly one could use another
>>>> mutex to protect the gem->vm_bo list to achieve the same, but we'd
>>>> need to hold it on gpuva put.
>>>>
>>>> /Thomas
>>>>
>>>>
>>>>>
>>>>> - Danilo
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> For extobjs an outer lock would be enough in case of
>>>>>>>>>> Xe, but I
>>>>>>>>>> really would not
>>>>>>>>>> like to add even more complexity just to get the
>>>>>>>>>> spinlock out of
>>>>>>>>>> the way in case
>>>>>>>>>> the driver already has an outer lock protecting this
>>>>>>>>>> path.
>>>>>>>>>
>>>>>>>>> I must disagree here. These spinlocks and atomic
>>>>>>>>> operations are
>>>>>>>>> pretty
>>>>>>>>> costly and as discussed earlier this type of locking was
>>>>>>>>> the reason
>>>>>>>>> (at
>>>>>>>>> least according to the commit message) that made
>>>>>>>>> Christian drop the
>>>>>>>>> XArray
>>>>>>>>> use in drm_exec for the same set of objects: "The locking
>>>>>>>>> overhead
>>>>>>>>> is
>>>>>>>>> unecessary and measurable". IMHO the spinlock is the
>>>>>>>>> added
>>>>>>>>> complexity and a
>>>>>>>>> single wide lock following the drm locking guidelines set
>>>>>>>>> out by
>>>>>>>>> Daniel and
>>>>>>>>> David should really be the default choice with an opt-in
>>>>>>>>> for a
>>>>>>>>> spinlock if
>>>>>>>>> needed for async and pushing out to a wq is not an
>>>>>>>>> option.
>>>>>>>>
>>>>>>>> For the external object list an outer lock would work as
>>>>>>>> long as it's
>>>>>>>> not the
>>>>>>>> dma-resv lock of the corresponding GEM object, since here
>>>>>>>> we actually
>>>>>>>> need to
>>>>>>>> remove the list entry from the external object list on
>>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>>> It's just a bit weird design wise that drivers would need
>>>>>>>> to take
>>>>>>>> this outer
>>>>>>>> lock on:
>>>>>>>>
>>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also
>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>>
>>>>>>>> Given that it seems reasonable to do all the required
>>>>>>>> locking
>>>>>>>> internally.
>>>>>>>
>>>>>>>   From a design POV, there has been a clear direction in XE to
>>>>>>> make
>>>>>>> things similar to mmap() / munmap(), so this outer lock,
>>>>>>> which in Xe is
>>>>>>> an rwsem, is used in a similar way as the mmap_lock. It's
>>>>>>> protecting
>>>>>>> the page-table structures and vma rb tree, the userptr
>>>>>>> structures and
>>>>>>> the extobj list. Basically it's taken early in the exec
>>>>>>> IOCTL, the
>>>>>>> VM_BIND ioctl, the compute rebind worker and the pagefault
>>>>>>> handler, so
>>>>>>> all of the above are just asserting that it is taken in the
>>>>>>> correct
>>>>>>> mode.
>>>>>>>
>>>>>>> But strictly with this scheme one could also use the vm's
>>>>>>> dma_resv for
>>>>>>> the extobj list since with drm_exec, it's locked before
>>>>>>> traversing the
>>>>>>> list.
>>>>>>>
>>>>>>> The whole point of this scheme is to rely on locks that you
>>>>>>> already are
>>>>>>> supposed to be holding for various reasons and is simple to
>>>>>>> comprehend.
>>>>>>
>>>>>> I don't agree that we're supposed to hold the VM's resv lock
>>>>>> anyways for
>>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but
>>>>>> I'm fine using it
>>>>>> for that purpose nevertheless.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> In order to at least place lockdep checks, the driver would
>>>>>>>> need to
>>>>>>>> supply the
>>>>>>>> corresponding lock's lockdep_map, because the GPUVM
>>>>>>>> otherwise doesn't
>>>>>>>> know about
>>>>>>>> the lock.
>>>>>>>
>>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>>
>>>>>> I'd really like to avoid that, especially now that everything
>>>>>> got simpler. We
>>>>>> should define the actual locks to take instead.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Out of curiosity, what is the overhead of a spin_lock()
>>>>>>>> that doesn't
>>>>>>>> need to
>>>>>>>> spin?
>>>>>>>
>>>>>>> I guess it's hard to tell exactly, but it is much lower on
>>>>>>> modern x86
>>>>>>> than what it used to be. Not sure about ARM, which is the
>>>>>>> other
>>>>>>> architecture important to us. I figure if there is little
>>>>>>> cache-line
>>>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>>>
>>>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm
>>>>>>>>> *gpuvm,
>>>>>>>>> spinlock_t
>>>>>>>>> *lock)
>>>>>>>>>
>>>>>>>>> {
>>>>>>>>>
>>>>>>>>>       if (!gpuvm->resv_protected_lists)
>>>>>>>>>           spin_lock(lock);
>>>>>>>>>
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>>>> For such drivers, that would require anybody calling
>>>>>>>>>>> unlink to
>>>>>>>>>>> hold the vm's
>>>>>>>>>>> resv, though.
>>>>>>>>>> In V4 I want to go back to having a dedicated lock for
>>>>>>>>>> the GEMs
>>>>>>>>>> gpuva list (or
>>>>>>>>>> VM_BO list to be more precise). We can't just use the
>>>>>>>>>> dma-resv
>>>>>>>>>> lock for that
>>>>>>>>>> with VM_BO abstractions, because on destruction of a
>>>>>>>>>> VM_BO we
>>>>>>>>>> otherwise wouldn't
>>>>>>>>>> be allowed to already hold the dma-resv lock. That's
>>>>>>>>>> the fix I
>>>>>>>>>> was referring to
>>>>>>>>>> earlier.
>>>>>>>>>
>>>>>>>>> Yeah, I can see the need for a dedicated lock for the
>>>>>>>>> GEM's gpuva
>>>>>>>>> list, but
>>>>>>>>> holding the vm's dma-resv lock across the unlink
>>>>>>>>> shouldn't be a
>>>>>>>>> problem. We
>>>>>>>>> may free the object and a pointer to the vm's resv during
>>>>>>>>> unlink
>>>>>>>>> but we
>>>>>>>>> don't free the vm's resv.  It'd be a matter of ensuring
>>>>>>>>> that any
>>>>>>>>> calls to
>>>>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>>>>
>>>>>>>> Drivers calling unlink() from the fence signaling path
>>>>>>>> can't use the
>>>>>>>> VM's
>>>>>>>> dma-resv lock.
>>>>>>>
>>>>>>> Yes, that made me a bit curious because in the current
>>>>>>> version the code
>>>>>>> required the object's dma_resv for unlink() which can't be
>>>>>>> grabbed
>>>>>>> either from the fence signaling path. So are there any
>>>>>>> drivers actually
>>>>>>> wanting to do that? If so, they will either need to resort to
>>>>>>> the
>>>>>>> current spinlock solution or they will need to call unlink
>>>>>>> from a
>>>>>>> workqueue item.
>>>>>>
>>>>>> As Boris already mentioned we have the dma-resv lock by default
>>>>>> or a driver
>>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of the
>>>>>> latter.
>>>>>>
>>>>>>>>
>>>>>>>> Also, what if the object is an external object? We can't
>>>>>>>> use the VM's
>>>>>>>> dma-resv
>>>>>>>> lock here.
>>>>>>>
>>>>>>> Why? Typically (sync) unlink is only ever called from an
>>>>>>> unbind-like
>>>>>>> operation where it should be trivial to grab the vm's resv.
>>>>>>> Or, for
>>>>>>> that matter any outer lock protecting the extobj list. Rule
>>>>>>> would be
>>>>>>> the drm_gpuvm_bo::entry::extobj  and
>>>>>>> drm_gpuvm_bo::entry::evict would
>>>>>>> be protected by either the vm's dma_resv (or possibly an
>>>>>>> outer lock in
>>>>>>> the case of the extobj list).
>>>>>>
>>>>>> Outer lock wouldn't have been working for updates in the async
>>>>>> path, but
>>>>>> shouldn't be relevant anymore. We could use the VM's resv for
>>>>>> that.
>>>>>>
>>>>>>>
>>>>>>>>    And we can't have the GEM objs dma-resv lock held when
>>>>>>>> calling
>>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if
>>>>>>>> the
>>>>>>>> refcount drops
>>>>>>>> to zero calls drm_gpuvm_bo_destroy() and
>>>>>>>> drm_gpuvm_bo_destroy() might
>>>>>>>> drop the
>>>>>>>> last reference of the GEM object.
>>>>>>>
>>>>>>> Yes, but this is a different problem as to what exactly
>>>>>>> protects
>>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal
>>>>>>> per bo list
>>>>>>> lock, or if we want to keep the bo's dma_resv we need to
>>>>>>> ensure that
>>>>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts
>>>>>>> its obj
>>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's
>>>>>>> refcount (I know
>>>>>>> Boris didn't like that, but requiring an explicit refcount
>>>>>>> for a
>>>>>>> pointer you dereference unless you're under a lock that
>>>>>>> ensures keeping
>>>>>>> the object alive is pretty much required?) But anyway for the
>>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal
>>>>>>> spinlock)
>>>>>>> I don't have a strong preference.
>>>>>>
>>>>>> We can keep the GEM objects dma-resv lock, however as mentioned
>>>>>> above
>>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both
>>>>>> the VM's resv lock
>>>>>> and the GEM's resv lock in case they differ.
>>>>>>
>>>>>
>>>>>>>>>
>>>>>
>>>>
>>>
>>


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-13 15:46                     ` Danilo Krummrich
@ 2023-09-19 12:07                       ` Christian König
  2023-09-19 12:21                         ` Thomas Hellström
  0 siblings, 1 reply; 77+ messages in thread
From: Christian König @ 2023-09-19 12:07 UTC (permalink / raw)
  To: Danilo Krummrich, Thomas Hellström
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

On 9/13/23 17:46, Danilo Krummrich wrote:
> On 9/13/23 17:33, Christian König wrote:
>> On 9/13/23 17:15, Danilo Krummrich wrote:
>>> On 9/13/23 16:26, Christian König wrote:
>>>> On 9/13/23 14:16, Danilo Krummrich wrote:
>>>>> As mentioned in a different mail thread, the reply is based on the 
>>>>> assumption
>>>>> that we don't support anything other than GPUVM updates from the 
>>>>> IOCTL.
>>>>
>>>> I think that this assumption is incorrect.
>>>
>>> Well, more precisely I should have said "don't support GPUVM updates 
>>> from within
>>> fence signaling critical sections". And looking at the code, that 
>>> doesn't seem what
>>> you're doing there.
>>>
>>>>
>>>> Vulkan is just one specific use case, but this here should 
>>>> probably be able to handle other use cases as well.
>>>>
>>>> Especially with HMM you get the requirement that you need to be 
>>>> able to invalidate GPUVM mappings without grabbing a reservation lock.
>>>
>>> What do you mean with "invalidate GPUVM mappings" in this context? 
>>> drm_gpuvm_bo_evict()
>>> should only be called from a ttm_device_funcs::move\x0f callback, we 
>>> should hold the dma-resv
>>> lock there.
>>
>> Well the question is which dma-resv lock do we hold?
>>
>> In the move callback we only hold the dma-resv lock of the BO which 
>> is moved, but when that is a shared BO then that's not the same as 
>> the one for the VM.
>
> Correct, Thomas' idea was to use the GEM's dma_resv lock to protect 
> drm_gpuvm_bo::evicted
> and then actually move the drm_gpuvm_bo to the VM's evicted list once 
> we grabbed all
> dma-resv locks when locking the VM's BOs using drm_exec. We can remove 
> them from the evicted
> list on validate(). This way we never touch the evicted list without 
> holding at least the VM's
> dma-resv lock.
>
> Do you have any concerns about that?

Scratching my head a bit how that is supposed to work.

This implies that you go over all the evicted BOs during validation and 
not just the one mentioned in the CS.

That might work for Vulkan, but is pretty much a no-go for OpenGL.

>
>>
>>>
>>>>
>>>> See what the eviction lock in amdgpu is doing for example.
>>>
>>> The eviction_lock seems to protect a VM state "evicting", i.e. whether 
>>> any BO that
>>> is associated with the VM is currently evicting. At the same time 
>>> amdgpu protects
>>> the evicted list of the VM with a different lock. So this seems to 
>>> be entirely
>>> unrelated. Tracking a "currently evicting" state is not part of the 
>>> GPUVM
>>> implementation currently and hence nothing would change for amdgpu 
>>> there.
>>
>> Sorry for the confusion we use different terminology in amdgpu.
>>
>> The eviction lock and evicted state are for the VM page tables, e.g. 
>> if the whole VM is currently not used and swapped out or even 
>> de-allocated.
>>
>> This is necessary because we have cases where we need to access the 
>> VM data without holding the dma-resv lock of this VM. Especially 
>> figuring out which parts of an address space contain mappings and 
>> which doesn't.
>
> I think this is fine, this has nothing to do with lists of evicted GEM 
> objects or external GEM
> objects, right? Marking mappings (drm_gpuva) as invalidated 
> (DRM_GPUVA_INVALIDATED) or accessing
> the VA space does not require any dma-resv locks.

I hope so, but I'm not 100% sure.

>
>>
>> This is a requirement which comes with HMM handling, you won't see 
>> this with Vulkan (or OpenGL, VAAPI etc..).
>>
>>
>> The invalidation lock on the other hand is what in this discussion is 
>> called eviction lock. This one is needed because of what I wrote above: 
>> during the move callback only the dma-resv of the BO which is moved 
>> is locked, but not necessarily the dma-resv of the VM.
>
> That's yet another thing, right? This is used to track whether *any* 
> BO that belongs to the VM is
> currently being evicted, correct? As mentioned, as of now this is not 
> supported in GPUVM and hence
> would be the same driver specific code with the same driver specific lock.

That is most likely a show stopper for using this with OpenGL based workloads
as far as I can see. For those you need to be able to figure out which
non-VM BOs have been evicted and which parts of the VM need updates.

BTW: Did I get it right that you put the dma_resv object into the VM and 
not into the first GEM object associated with the VM? If yes then that 
would be a circular dependency.

Regards,
Christian.



>
>>
>> Regards,
>> Christian.
>>
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström wrote:
>>>>>> Hi!
>>>>>>
>>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström wrote:
>>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
>>>>>>>>>> Hi, Danilo,
>>>>>>>>>>
>>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>>> So far the DRM GPUVA manager offers common infrastructure to
>>>>>>>>>>> track GPU VA
>>>>>>>>>>> allocations and mappings, generically connect GPU VA mappings
>>>>>>>>>>> to their
>>>>>>>>>>> backing buffers and perform more complex mapping operations
>>>>>>>>>>> on the GPU VA
>>>>>>>>>>> space.
>>>>>>>>>>>
>>>>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>>>>> drivers, which
>>>>>>>>>>> can potentially be generalized in order to make the DRM GPUVA
>>>>>>>>>>> manager
>>>>>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>>>>>> this patch aims
>>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>>
>>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not being used
>>>>>>>>>>> outside of
>>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>>
>>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>>>>> which are
>>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>>
>>>>>>>>>>> 3) Provide functions to efficiently lock all GEM objects dma-
>>>>>>>>>>> resv the
>>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>>
>>>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>>>> contains mappings
>>>>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>>>>> accelerated.
>>>>>>>>>>>
>>>>>>>>>>> 5) Provide some convenience functions for common patterns.
>>>>>>>>>>>
>>>>>>>>>>> Rather than being designed as a "framework", the target is to
>>>>>>>>>>> make all
>>>>>>>>>>> features appear as a collection of optional helper functions,
>>>>>>>>>>> such that
>>>>>>>>>>> drivers are free to make use of the DRM GPUVA managers basic
>>>>>>>>>>> functionality and opt-in for other features without setting
>>>>>>>>>>> any feature
>>>>>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>>>>>
>>>>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>>>>> locking for drivers
>>>>>>>>>>> updating the GPU VA space within the fence signalling path.
>>>>>>>>>>>
>>>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>>> ---
>>>>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an existing
>>>>>>>>>>> instance of this
>>>>>>>>>>>      * particular combination. If not existent a new instance
>>>>>>>>>>> is created and linked
>>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>>> + *
>>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and
>>>>>>>>>>> evicted objects. Those
>>>>>>>>>>> + * lists are maintained in order to accelerate locking of
>>>>>>>>>>> dma-resv locks and
>>>>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For
>>>>>>>>>>> instance all
>>>>>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be
>>>>>>>>>>> locked by calling
>>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>>>>>> also possible to lock
>>>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>>>> corresponding parameters to
>>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec
>>>>>>>>>>> loop while making
>>>>>>>>>>> + * use of helper functions such as drm_gpuvm_prepare_range()
>>>>>>>>>>> or
>>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>>> + *
>>>>>>>>>>> + * Every bound &drm_gem_object is treated as external object
>>>>>>>>>>> when its &dma_resv
>>>>>>>>>>> + * structure is different than the &drm_gpuvm's common
>>>>>>>>>>> &dma_resv structure.
>>>>>>>>>>>      */
>>>>>>>>>>>     /**
>>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for the same
>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>>>>> creations and destructions
>>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances unique.
>>>>>>>>>>> + *
>>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of external and
>>>>>>>>>>> evicted objects are
>>>>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>>>>> iteration internally.
>>>>>>>>>>> + *
>>>>>>>>>>> + * However, drivers still need to protect concurrent
>>>>>>>>>>> calls to functions
>>>>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function contains
>>>>>>>>>>> a particular
>>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Functions adding or removing entries from those lists,
>>>>>>>>>>> such as
>>>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add(), may be
>>>>>>>>>>> called with external
>>>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>>>> corresponding list being
>>>>>>>>>>> + * modified while it is potentially being iterated by
>>>>>>>>>>> other API functions.
>>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>>      */
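
As a rough, untested sketch of how such a caller can get that mutual
exclusion without a driver-side outer lock, by taking the VM's dma-resv
first (via drm_gpuvm_prepare_vm()) inside the drm_exec loop:

        struct drm_exec exec;
        int ret;

        drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);

        drm_exec_until_all_locked(&exec) {
                /* Taking the VM's resv first serializes the list iteration below. */
                ret = drm_gpuvm_prepare_vm(gpuvm, &exec, 1);
                drm_exec_retry_on_contention(&exec);
                if (ret)
                        goto err;

                ret = drm_gpuvm_prepare_objects(gpuvm, &exec, 1);
                drm_exec_retry_on_contention(&exec);
                if (ret)
                        goto err;
        }

        /* ... validate, submit, add fences ... */

        drm_exec_fini(&exec);
        return 0;

err:
        drm_exec_fini(&exec);
        return ret;
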
>>>>>>>>>>>     /**
>>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>>      *   }
>>>>>>>>>>>      */
>>>>>>>>>>> +/**
>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>>>> already iterated items
>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>>>>> get_next_vm_bo_from_list()
>>>>>>>>>>> + *
>>>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>>>> first element from
>>>>>>>>>>> + * the list, so list insertion and deletion can happen
>>>>>>>>>>> concurrently.
>>>>>>>>>> Are the list spinlocks needed for that async state update from
>>>>>>>>>> within the
>>>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>>
>>>>>>>>>> Otherwise it should be sufficient to protect the lists with the
>>>>>>>>>> gpuvm's resv
>>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>>
>>>>>>>>>> If those spinlocks are still needed in some situations, perhaps
>>>>>>>>>> could we
>>>>>>>>>> have an option to set them to NULL (Like IIRC the maple tree
>>>>>>>>>> allows for)?
>>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>>> holding only the dma-resv lock from the BO this function gets
>>>>>>>>> called for. Hence,
>>>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() calls with
>>>>>>>>> different BOs.
>>>>>>>> No. Only if you try to add external objects to the vm's evict list
>>>>>>>> from
>>>>>>>> within the evict code. That's not necessary since you loop through
>>>>>>>> all
>>>>>>>> external objects anyway when locking them so an "evicted" bool in
>>>>>>>> the vm_bo,
>>>>>>>> protected by the bo resv would be sufficient. The extobj locking
>>>>>>>> loop can
>>>>>>>> then add the bo to the evicted list.
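
For illustration, a rough and untested sketch of that scheme; the
vm_bo->evicted flag and drm_gpuvm_for_each_extobj() below are made up for
the example and are not part of this patch:

        /* Driver's eviction path, called with the BO's resv held. */
        void driver_gpuvm_bo_mark_evicted(struct drm_gem_object *obj)
        {
                struct drm_gpuvm_bo *vm_bo;

                dma_resv_assert_held(obj->resv);
                drm_gem_for_each_gpuvm_bo(vm_bo, obj)
                        vm_bo->evicted = true;
        }

        /* Exec path, after all extobj resvs have been locked. */
        drm_gpuvm_for_each_extobj(gpuvm, vm_bo) {
                if (vm_bo->evicted)
                        list_move_tail(&vm_bo->list.entry.evict,
                                       &gpuvm->evict.list);
        }
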
>>>>>>> And validate() can remove it while still holding all dma-resv 
>>>>>>> locks,
>>>>>>> neat!
>>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>>> concurrently? What
>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>>> drm_gpuva_unlink()?
>>>>>>> Are we guaranteed that at this point of time the drm_gpuvm_bo is 
>>>>>>> not
>>>>>>> on the
>>>>>>> evicted list? Because otherwise we would call 
>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>> with the
>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>> might drop the last reference to the drm_gem_object and hence we'd
>>>>>>> potentially
>>>>>>> free the dma-resv lock while holding it, at least if it's an 
>>>>>>> external
>>>>>>> object.
>>>>>> Easiest way in this scheme is to think of the lists as being 
>>>>>> protected
>>>>>> by the vm's resv lock. That means anybody calling unlink() must also
>>>>>> hold the vm's resv lock. (Which is OK from an UAF point of view, but
>>>>>> perhaps not from a locking inversion POV from an async list update).
>>>>> This would mean that on unlink() we'd need to hold the VM's resv 
>>>>> lock and the
>>>>> corresponding GEM's resv lock (in case they're not the same 
>>>>> anyways) because the
>>>>> VM's resv lock would protect the external / evicted object lists 
>>>>> and the GEM
>>>>> objects resv lock protects the GEM's list of drm_gpuvm_bos and the
>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>>
>>>>>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>>>>>> really would not
>>>>>>>>> like to add even more complexity just to get the spinlock out of
>>>>>>>>> the way in case
>>>>>>>>> the driver already has an outer lock protecting this path.
>>>>>>>> I must disagree here. These spinlocks and atomic operations are
>>>>>>>> pretty
>>>>>>>> costly and as discussed earlier this type of locking was the 
>>>>>>>> reason
>>>>>>>> (at
>>>>>>>> least according to the commit message) that made Christian drop 
>>>>>>>> the
>>>>>>>> XArray
>>>>>>>> use in drm_exec for the same set of objects: "The locking overhead
>>>>>>>> is
>>>>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>>>>> complexity and a
>>>>>>>> single wide lock following the drm locking guidelines set out by
>>>>>>>> Daniel and
>>>>>>>> David should really be the default choice with an opt-in for a
>>>>>>>> spinlock if
>>>>>>>> needed for async and pushing out to a wq is not an option.
>>>>>>> For the external object list an outer lock would work as long as 
>>>>>>> it's
>>>>>>> not the
>>>>>>> dma-resv lock of the corresponding GEM object, since here we 
>>>>>>> actually
>>>>>>> need to
>>>>>>> remove the list entry from the external object list on
>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>> It's just a bit weird design wise that drivers would need to take
>>>>>>> this outer
>>>>>>> lock on:
>>>>>>>
>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>> drm_gpuvm_bo_put())
>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>
>>>>>>> Given that it seems reasonable to do all the required locking
>>>>>>> internally.
>>>>>>  From a design POV, there has been a clear direction in XE to make
>>>>>> things similar to mmap() / munmap(), so this outer lock, which in 
>>>>>> Xe is
>>>>>> an rwsem, is used in a similar way as the mmap_lock. It's protecting
>>>>>> the page-table structures and vma rb tree, the userptr structures 
>>>>>> and
>>>>>> the extobj list. Basically it's taken early in the exec IOCTL, the
>>>>>> VM_BIND ioctl, the compute rebind worker and the pagefault 
>>>>>> handler, so
>>>>>> all of the above are just asserting that it is taken in the correct
>>>>>> mode.
>>>>>>
>>>>>> But strictly with this scheme one could also use the vm's 
>>>>>> dma_resv for
>>>>>> the extobj list since with drm_exec, it's locked before 
>>>>>> traversing the
>>>>>> list.
>>>>>>
>>>>>> The whole point of this scheme is to rely on locks that you 
>>>>>> already are
>>>>>> supposed to be holding for various reasons and is simple to 
>>>>>> comprehend.
>>>>> I don't agree that we're supposed to hold the VM's resv lock 
>>>>> anyways for
>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm 
>>>>> fine using it
>>>>> for that purpose nevertheless.
>>>>>
>>>>>>> In order to at least place lockdep checks, the driver would need to
>>>>>>> supply the
>>>>>>> corresponding lock's lockdep_map, because the GPUVM otherwise 
>>>>>>> doesn't
>>>>>>> know about
>>>>>>> the lock.
>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>> I'd really like to avoid that, especially now that everything got 
>>>>> simpler. We
>>>>> should define the actual locks to take instead.
>>>>>
>>>>>>> Out of curiosity, what is the overhead of a spin_lock() that 
>>>>>>> doesn't
>>>>>>> need to
>>>>>>> spin?
>>>>>> I guess it's hard to tell exactly, but it is much lower on modern 
>>>>>> x86
>>>>>> than what it used to be. Not sure about ARM, which is the other
>>>>>> architecture important to us. I figure if there is little cache-line
>>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>>
>>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>>
>>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>>>>                                  spinlock_t *lock)
>>>>>>>> {
>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>          spin_lock(lock);
>>>>>>>> }
>>>>>>>>
>>>>>>>>>> For such drivers, that would require anybody calling unlink to
>>>>>>>>>> hold the vm's
>>>>>>>>>> resv, though.
>>>>>>>>> In V4 I want to go back to having a dedicated lock for the GEMs
>>>>>>>>> gpuva list (or
>>>>>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>>>>>> lock for that
>>>>>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>>>>>> otherwise wouldn't
>>>>>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>>>>>> was referring to
>>>>>>>>> earlier.
>>>>>>>> Yeah, I can see the need for a dedicated lock for the GEM's gpuva
>>>>>>>> list, but
>>>>>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>>>>>> problem. We
>>>>>>>> may free the object and a pointer to the vm's resv during unlink
>>>>>>>> but we
>>>>>>>> don't free the vm's resv.  It'd be a matter of ensuring that any
>>>>>>>> calls to
>>>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>>> Drivers calling unlink() from the fence signaling path can't use 
>>>>>>> the
>>>>>>> VM's
>>>>>>> dma-resv lock.
>>>>>> Yes, that made me a bit curious because in the current version 
>>>>>> the code
>>>>>> required the object's dma_resv for unlink() which can't be grabbed
>>>>>> either from the fence signaling path. So are there any drivers 
>>>>>> actually
>>>>>> wanting to do that? If so, they will either need to resort to the
>>>>>> current spinlock solution or they will need to call unlink from a
>>>>>> workqueue item.
>>>>> As Boris already mentioned we have the dma-resv lock by default or 
>>>>> a driver
>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of the latter.
>>>>>
>>>>>>> Also, what if the object is an external object? We can't use the 
>>>>>>> VM's
>>>>>>> dma-resv
>>>>>>> lock here.
>>>>>> Why? Typically (sync) unlink is only ever called from an unbind-like
>>>>>> operation where it should be trivial to grab the vm's resv. Or, for
>>>>>> that matter any outer lock protecting the extobj list. Rule would be
>>>>>> the drm_gpuvm_bo::entry::extobj  and drm_gpuvm_bo::entry::evict 
>>>>>> would
>>>>>> be protected by either the vm's dma_resv (or possibly an outer 
>>>>>> lock in
>>>>>> the case of the extobj list).
>>>>> Outer lock wouldn't have been working for updates in the async 
>>>>> path, but
>>>>> shouldn't be relevant anymore. We could use the VM's resv for that.
>>>>>
>>>>>>>   And we can't have the GEM objs dma-resv lock held when calling
>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>>>>>> refcount drops
>>>>>>> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() 
>>>>>>> might
>>>>>>> drop the
>>>>>>> last reference of the GEM object.
>>>>>> Yes, but this is a different problem as to what exactly protects
>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per 
>>>>>> bo list
>>>>>> lock, or if we want to keep the bo's dma_resv we need to ensure that
>>>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount 
>>>>>> (I know
>>>>>> Boris didn't like that, but requiring an explicit refcount for a
>>>>>> pointer you dereference unless you're under a lock that ensures 
>>>>>> keeping
>>>>>> the object alive is pretty much required?) But anyway for the
>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal 
>>>>>> spinlock)
>>>>>> I don't have a strong preference.
>>>>> We can keep the GEM objects dma-resv lock, however as mentioned above
>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both the 
>>>>> VM's resv lock
>>>>> and the GEM's resv lock in case they differ.
>>>>>
>>>>>>>   All those problems go away with a dedicated
>>>>>>> GEM gpuva list lock.
>>>>>> I don't think these are real problems.
>>>>>> With the exception of the eviction list "trick" where we currently
>>>>>> have
>>>>>> slightly different approach to collect external bos needing 
>>>>>> rebinding,
>>>>>> we have this working fine.
>>>>>>
>>>>>> TBH I think pretty much the only situation where the spinlock is 
>>>>>> needed
>>>>>> is for async updates of these lists, unless a wq item can be used 
>>>>>> for
>>>>>> that, but it doesn't really seem like the current code allows for 
>>>>>> such
>>>>>> updates anyway? It complicates the code a lot, adds overhead and 
>>>>>> also
>>>>>> adds the requirement for refcounting during list traversal.
>>>>>>
>>>>>> /Thomas
>>>>>>
>>>>>>>> /Thomas
>>>>>>>>
>>>>>>>>
>>>>>>>>>> It seems that with that also the refcount could be made non-
>>>>>>>>>> atomic.
>>>>>>>>>>
>>>>>>>>>> All in the spirit of the drm locking guidelines "use big locks
>>>>>>>>>> when
>>>>>>>>>> possible".
>>>>>>>>>> Lower level locks only when necessary for performance or
>>>>>>>>>> locking inversion?
>>>>>>>>>>
>>>>>>>>>> /Thomas
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> + *
>>>>>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>>>>>> local list, so removal
>>>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>>>> iterating the list.
>>>>>>>>>>> + */
>>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo) \
>>>>>>>>>>> +       ({                                                                      \
>>>>>>>>>>> +               struct drm_gpuvm_bo *__vm_bo;                                   \
>>>>>>>>>>> +                                                                               \
>>>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);                                 \
>>>>>>>>>>> +                                                                               \
>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
>>>>>>>>>>> +               while (!list_empty(&(__gpuvm)->__list_name.list)) {             \
>>>>>>>>>>> +                       __vm_bo = list_first_entry(&(__gpuvm)->__list_name.list, \
>>>>>>>>>>> +                                                  struct drm_gpuvm_bo,         \
>>>>>>>>>>> +                                                  list.entry.__list_name);     \
>>>>>>>>>>> +                       if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {            \
>>>>>>>>>>> +                               list_move_tail(&(__vm_bo)->list.entry.__list_name, \
>>>>>>>>>>> +                                              __local_list);                   \
>>>>>>>>>>> +                               break;                                          \
>>>>>>>>>>> +                       } else {                                                \
>>>>>>>>>>> +                               list_del_init(&(__vm_bo)->list.entry.__list_name); \
>>>>>>>>>>> +                               __vm_bo = NULL;                                 \
>>>>>>>>>>> +                       }                                                       \
>>>>>>>>>>> +               }                                                               \
>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
>>>>>>>>>>> +                                                                               \
>>>>>>>>>>> +               __vm_bo;                                                        \
>>>>>>>>>>> +       })
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>>>>>>>>>> + *
>>>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>>>> first element from the
>>>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>>>> concurrently.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Typical use:
>>>>>>>>>>> + *
>>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>>> + *
>>>>>>>>>>> + *     ret = 0;
>>>>>>>>>>> + *     for_each_vm_bo_in_list(gpuvm, <list_name>,
>>>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>>>> + *             ret = do_something_with_vm_bo(..., vm_bo);
>>>>>>>>>>> + *             if (ret)
>>>>>>>>>>> + *                     break;
>>>>>>>>>>> + *     }
>>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>> + *     restore_vm_bo_list(gpuvm, <list_name>,
>>>>>>>>>>> &my_local_list);
>>>>>>>>>>> + *
>>>>>>>>>>> + *
>>>>>>>>>>> + * Only used for internal list iterations, not meant to be
>>>>>>>>>>> exposed to the outside
>>>>>>>>>>> + * world.
>>>>>>>>>>> + */
>>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)    \
>>>>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>> +                                               __local_list, NULL);            \
>>>>>>>>>>> +            __vm_bo;                                                           \
>>>>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>> +                                               __local_list, __vm_bo))
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to their
>>>>>>>>>>> original list
>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>>>> already iterated items
>>>>>>>>>>> + *
>>>>>>>>>>> + * When we're done iterating a vm_bo list, we should call
>>>>>>>>>>> restore_vm_bo_list()
>>>>>>>>>>> + * to restore the original state and let new iterations take
>>>>>>>>>>> place.
>>>>>>>>>>> + */
>>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)                 \
>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>> +               /* Merge back the two lists, moving local list elements to the  \
>>>>>>>>>>> +                * head to preserve previous ordering, in case it matters.      \
>>>>>>>>>>> +                */                                                             \
>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
>>>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)->__list_name.list);        \
>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
>>>>>>>>>>> +       } while (0)
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given
>>>>>>>>>>> list
>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>> + *
>>>>>>>>>>> + * Inserts the given @__vm_bo into the list specified by
>>>>>>>>>>> @__list_name and
>>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>>> + */
>>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                            \
>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>> +               if (list_empty(&(__vm_bo)->list.entry.__list_name))             \
>>>>>>>>>>> +                       list_add_tail(&(__vm_bo)->list.entry.__list_name,       \
>>>>>>>>>>> +                                     &(__vm_bo)->vm->__list_name.list);        \
>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>> +       } while (0)
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given
>>>>>>>>>>> list
>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>> + * @__list_name: the name of the list to remove from
>>>>>>>>>>> + *
>>>>>>>>>>> + * Removes the given @__vm_bo from the list specified by
>>>>>>>>>>> @__list_name and
>>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>>> + */
>>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)                            \
>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>> +               if (!list_empty(&(__vm_bo)->list.entry.__list_name))            \
>>>>>>>>>>> +                       list_del_init(&(__vm_bo)->list.entry.__list_name);      \
>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>> +       } while (0)
>>>>>>>>>>> +
>>>>>>>>>>> +static int __must_check
>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>> +
>>>>>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
>>>>>>>>>>> struct drm_device *drm,
>>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>>> +
>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>>> +
>>>>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
>>>>>>>>>>> *gpuvm)
>>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>>>               "GPUVA tree is not empty, potentially leaking
>>>>>>>>>>> memory.\n");
>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>> +
>>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>>     }
>>>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>> + *
>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the
>>>>>>>>>>> given
>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>>>>> responsibility to call
>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Note: This function is safe against concurrent insertion
>>>>>>>>>>> and removal of
>>>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>>>> concurrent usage itself.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>>>>>> either an outer VM lock
>>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this function
>>>>>>>>>>> within the
>>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's
>>>>>>>>>>> dma-resv lock ensures
>>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>> + */
>>>>>>>>>>> +int
>>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>>>>> +                         unsigned int num_fences)
>>>>>>>>>>> +{
>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>> +
>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>>>>> vm_bo) {
>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, vm_bo->obj,
>>>>>>>>>>> num_fences);
>>>>>>>>>>> +               if (ret)
>>>>>>>>>>> +                       break;
>>>>>>>>>>> +       }
>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>>> +
>>>>>>>>>>> +       return ret;
>>>>>>>>>>> +}
>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped within
>>>>>>>>>>> a given range
>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>> + *
>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>>>>>> mapped between @addr
>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>> + */
>>>>>>>>>>> +int
>>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>> drm_exec *exec,
>>>>>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>>>>>> num_fences)
>>>>>>>>>>> +{
>>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>>> +       int ret;
>>>>>>>>>>> +
>>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
>>>>>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>>>>>> +
>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>>>> num_fences);
>>>>>>>>>>> +               if (ret)
>>>>>>>>>>> +                       return ret;
>>>>>>>>>>> +       }
>>>>>>>>>>> +
>>>>>>>>>>> +       return 0;
>>>>>>>>>>> +}
>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>>>>> associated BOs
>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>> + *
>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>>>> given
>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Additionally, when calling this function with struct
>>>>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>>>>> + * being set the driver receives the given @fn callback to
>>>>>>>>>>> lock additional
>>>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec instance.
>>>>>>>>>>> Typically, drivers
>>>>>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>>>>>> callback.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>> + */
>>>>>>>>>>> +int
>>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>>> +{
>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>> +       int ret;
>>>>>>>>>>> +
>>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>> +
>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>> +
>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>>>>> num_fences);
>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>> +               if (ret)
>>>>>>>>>>> +                       goto err;
>>>>>>>>>>> +
>>>>>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>>>>> num_fences);
>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>> +               if (ret)
>>>>>>>>>>> +                       goto err;
>>>>>>>>>>> +
>>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>>>>> num_fences);
>>>>>>>>>>> +                       drm_exec_retry_on_contention(exec);
>>>>>>>>>>> +                       if (ret)
>>>>>>>>>>> +                               goto err;
>>>>>>>>>>> +               }
>>>>>>>>>>> +       }
>>>>>>>>>>> +
>>>>>>>>>>> +       return 0;
>>>>>>>>>>> +
>>>>>>>>>>> +err:
>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>> +       return ret;
>>>>>>>>>>> +}
>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>>>> +
>>>>>>>>>>> +static int
>>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int
>>>>>>>>>>> num_fences)
>>>>>>>>>>> +{
>>>>>>>>>>> +       struct {
>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>>> +
>>>>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args->objs,
>>>>>>>>>>> +                                     args->num_objs, num_fences);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all
>>>>>>>>>>> associated BOs
>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>>> + * @num_objs: the number of additional &drm_gem_objects to
>>>>>>>>>>> lock
>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>> + *
>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>>>> given &drm_gpuvm
>>>>>>>>>>> + * contains mappings of, plus the ones given through @objs.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>> + */
>>>>>>>>>>> +int
>>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>>>>> +                         unsigned int num_objs,
>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>> +{
>>>>>>>>>>> +       struct {
>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>> +       } args;
>>>>>>>>>>> +
>>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>>> +
>>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>>> +
>>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>>>>> interruptible);
>>>>>>>>>>> +}
>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
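
As a usage sketch (untested), a driver that additionally needs to lock a
couple of BOs handed in by userspace could do something like the following,
where obj_a and obj_b stand in for whatever GEM objects were looked up for
the job:

        struct drm_gpuvm_exec vm_exec = { .vm = gpuvm };
        struct drm_gem_object *extra[] = { obj_a, obj_b };
        int ret;

        ret = drm_gpuvm_exec_lock_array(&vm_exec, extra, ARRAY_SIZE(extra),
                                        1, true);
        if (ret)
                return ret;

        /* ... validate, submit, add fences ... */

        drm_gpuvm_exec_unlock(&vm_exec);
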
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>>>>> within a given range
>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>> + *
>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>>>>> mapped between @addr and
>>>>>>>>>>> + * @addr + @range.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>> + */
>>>>>>>>>>> +int
>>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>> +{
>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>> +       int ret;
>>>>>>>>>>> +
>>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>> +
>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>> +
>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec, addr, range,
>>>>>>>>>>> +                                             num_fences);
>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>> +               if (ret)
>>>>>>>>>>> +                       goto err;
>>>>>>>>>>> +       }
>>>>>>>>>>> +
>>>>>>>>>>> +       return ret;
>>>>>>>>>>> +
>>>>>>>>>>> +err:
>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>> +       return ret;
>>>>>>>>>>> +}
>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as evicted
>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>>> + *
>>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>>>>>> evicted buffer
>>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>> + */
>>>>>>>>>>> +int
>>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>>> +{
>>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>> +
>>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>>> +
>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>>> +               if (ret)
>>>>>>>>>>> +                       break;
>>>>>>>>>>> +       }
>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>>> +
>>>>>>>>>>> +       return ret;
>>>>>>>>>>> +}
>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and all
>>>>>>>>>>> extobj
>>>>>>>>>>> + * dma-resv
>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>> + */
>>>>>>>>>>> +void
>>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>>>>> +{
>>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>>> +       unsigned long index;
>>>>>>>>>>> +
>>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm, obj) ?
>>>>>>>>>>> +                                  extobj_usage : private_usage);
>>>>>>>>>>> +       }
>>>>>>>>>>> +}
>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>>> +
>>>>>>>>>>>     /**
>>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of struct
>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>>> +
>>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>>          return vm_bo;
>>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>>> +
>>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>>> +
>>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>>>>>>>>>      *
>>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>>> + *
>>>>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is
>>>>>>>>>>> destroyed, which
>>>>>>>>>>> + * includes removing it from the GEM's gpuva list. Hence, if
>>>>>>>>>>> a call to this
>>>>>>>>>>> + * function can potentially let the reference count drop to zero,
>>>>>>>>>>> the caller must
>>>>>>>>>>> + * hold the dma-resv or driver-specific GEM gpuva lock.
>>>>>>>>>>>      */
>>>>>>>>>>>     void
>>>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>     }
>>>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>>> +static int __must_check
>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>> +{
>>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>>     }
>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its
>>>>>>>>>>> &drm_gpuvm's
>>>>>>>>>>> + * extobj list
>>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's
>>>>>>>>>>> extobj list.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if it is
>>>>>>>>>>> not on the list
>>>>>>>>>>> + * already and if the corresponding &drm_gem_object actually is an
>>>>>>>>>>> external object.
>>>>>>>>>>> + */
>>>>>>>>>>> +void
>>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>> +{
>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>>> +
>>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>>>> +}
>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to
>>>>>>>>>>> / from a
>>>>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>>> + *
>>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from the evicted list
>>>>>>>>>>> of all &drm_gpuvms
>>>>>>>>>>> + * containing a mapping of this &drm_gem_object.
>>>>>>>>>>> + */
>>>>>>>>>>> +void
>>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>>>>>> +{
>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>> +
>>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>>> +               if (evict)
>>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, evict);
>>>>>>>>>>> +               else
>>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, evict);
>>>>>>>>>>> +       }
>>>>>>>>>>> +}
>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
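
For illustration, a TTM-based driver would typically flip the evicted state
from its move path while the BO's resv is held; a rough, untested sketch,
with driver_is_evicted_placement() being a made-up helper:

        static void driver_bo_move_notify(struct ttm_buffer_object *bo,
                                          struct ttm_resource *new_mem)
        {
                struct drm_gem_object *obj = &bo->base;

                dma_resv_assert_held(obj->resv);

                /* Put the BO on / remove it from the evict list of all its VMs. */
                drm_gpuvm_bo_evict(obj, driver_is_evicted_placement(new_mem));
        }
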
>>>>>>>>>>> +
>>>>>>>>>>>     static int
>>>>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>>      */
>>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>>           * space
>>>>>>>>>>>           */
>>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>>> +
>>>>>>>>>>> +       /**
>>>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>>>> +        */
>>>>>>>>>>> +       struct {
>>>>>>>>>>> +               /**
>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>>>> serving as
>>>>>>>>>>> +                * external object
>>>>>>>>>>> +                */
>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>> +
>>>>>>>>>>> +               /**
>>>>>>>>>>> +                * @lock: spinlock to protect the extobj list
>>>>>>>>>>> +                */
>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>> +       } extobj;
>>>>>>>>>>> +
>>>>>>>>>>> +       /**
>>>>>>>>>>> +        * @evict: structure holding the evict list and evict
>>>>>>>>>>> list lock
>>>>>>>>>>> +        */
>>>>>>>>>>> +       struct {
>>>>>>>>>>> +               /**
>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>>>> currently being
>>>>>>>>>>> +                * evicted
>>>>>>>>>>> +                */
>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>> +
>>>>>>>>>>> +               /**
>>>>>>>>>>> +                * @lock: spinlock to protect the evict list
>>>>>>>>>>> +                */
>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>> +       } evict;
>>>>>>>>>>>     };
>>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>> drm_device *drm,
>>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>>>>>> &drm_gem_object is an
>>>>>>>>>>> + * external object
>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>>> + *
>>>>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv differs
>>>>>>>>>>> from the
>>>>>>>>>>> + * &drm_gpuvm's &dma_resv, false otherwise
>>>>>>>>>>> + */
>>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
>>>>>>>>>>> *gpuvm,
>>>>>>>>>>> +                                      struct drm_gem_object
>>>>>>>>>>> *obj)
>>>>>>>>>>> +{
>>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>     {
>>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__)
>>>>>>>>>>> \
>>>>>>>>>>>          list_for_each_entry_safe(va__, next__, &(gpuvm__)-
>>>>>>>>>>>> rb.list, rb.entry)
>>>>>>>>>>> +/**
>>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>>>>> &drm_exec
>>>>>>>>>>> + *
>>>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>>>> &drm_exec should be.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Optionally, @extra can be set in order to lock additional
>>>>>>>>>>> &drm_gem_objects.
>>>>>>>>>>> + */
>>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>>> +       /**
>>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>>> +        */
>>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>>> +
>>>>>>>>>>> +       /**
>>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>>>>>> +        */
>>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>>> +
>>>>>>>>>>> +       /**
>>>>>>>>>>> +        * @extra: Callback and corresponding private data
>>>>>>>>>>> for the driver to
>>>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>>>> +        */
>>>>>>>>>>> +       struct {
>>>>>>>>>>> +               /**
>>>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>>>> +                */
>>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>> +                         unsigned int num_fences);
>>>>>>>>>>> +
>>>>>>>>>>> +               /**
>>>>>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>>>>>> callback
>>>>>>>>>>> +                */
>>>>>>>>>>> +               void *priv;
>>>>>>>>>>> +       } extra;
>>>>>>>>>>> +};
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
>>>>>>>>>>> resv
>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>> + *
>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>>>>>> &drm_gem_object.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>>>>> responsibility to call
>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>> + *
>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>> + */
>>>>>>>>>>> +static inline int
>>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>>>> +{
>>>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>>>>> num_fences);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>> +                             unsigned int num_fences);
>>>>>>>>>>> +
>>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>>>> +                           unsigned int num_fences);
>>>>>>>>>>> +
>>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>>> +
>>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>>>>> *vm_exec,
>>>>>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>>>>>> +                             unsigned int num_objs,
>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>> +
>>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>>>>> *vm_exec,
>>>>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated
>>>>>>>>>>> BOs
>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>> + *
>>>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects
>>>>>>>>>>> previously acquired
>>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>>> + */
>>>>>>>>>>> +static inline void
>>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>>> +{
>>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>> private_usage,
>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>> extobj_usage);
>>>>>>>>>>> +
>>>>>>>>>>> +/**
>>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence()
>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>> + *
>>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>>> + */
>>>>>>>>>>> +static inline void
>>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>>>>>> *vm_exec,
>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>> private_usage,
>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>> extobj_usage)
>>>>>>>>>>> +{
>>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
>>>>>>>>>>> +                                private_usage, extobj_usage);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>>     /**
>>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>>                           * gpuva list.
>>>>>>>>>>>                           */
>>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>>> +
>>>>>>>>>>> +                       /**
>>>>>>>>>>> +                        * @extobj: List entry to attach to
>>>>>>>>>>> the &drm_gpuvm's
>>>>>>>>>>> +                        * extobj list.
>>>>>>>>>>> +                        */
>>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>>> +
>>>>>>>>>>> +                       /**
>>>>>>>>>>> +                        * @evict: List entry to attach to
>>>>>>>>>>> the &drm_gpuvm's evict
>>>>>>>>>>> +                        * list.
>>>>>>>>>>> +                        */
>>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>>                  } entry;
>>>>>>>>>>>          } list;
>>>>>>>>>>>     };
>>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
>>>>>>>>>>> evict);
>>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>> +
>>>>>>>>>>>     /**
>>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a
>>>>>>>>>>> list of &drm_gpuva
>>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>>>>> iteration step
>>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>>           * used.
>>>>>>>>>>>           */
>>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void
>>>>>>>>>>> *priv);
>>>>>>>>>>> +
>>>>>>>>>>> +       /**
>>>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>>>> +        *
>>>>>>>>>>> +        * Drivers receive this callback for every evicted
>>>>>>>>>>> &drm_gem_object being
>>>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>>>> +        *
>>>>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>>>>> specific variant of
>>>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>>>> +        */
>>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>>>     };
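
A driver-side implementation of this hook could be as simple as the
following untested sketch, with &driver_placement standing in for whatever
placement the driver wants evicted BOs validated into:

        static int driver_gpuvm_bo_validate(struct drm_gem_object *obj)
        {
                struct ttm_buffer_object *bo =
                        container_of(obj, struct ttm_buffer_object, base);
                struct ttm_operation_ctx ctx = {
                        .interruptible = true,
                        .no_wait_gpu = false,
                };

                return ttm_bo_validate(bo, &driver_placement, &ctx);
        }
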
>>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
>>>>
>>>
>>
>


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-19 12:07                       ` Christian König
@ 2023-09-19 12:21                         ` Thomas Hellström
  2023-09-19 15:16                           ` Danilo Krummrich
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-19 12:21 UTC (permalink / raw)
  To: Christian König, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

Hi Christian

On 9/19/23 14:07, Christian König wrote:
> Am 13.09.23 um 17:46 schrieb Danilo Krummrich:
>> On 9/13/23 17:33, Christian König wrote:
>>> Am 13.09.23 um 17:15 schrieb Danilo Krummrich:
>>>> On 9/13/23 16:26, Christian König wrote:
>>>>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>>>>> As mentioned in a different mail thread, the reply is based on 
>>>>>> the assumption
>>>>>> that we don't support anything else than GPUVM updates from the 
>>>>>> IOCTL.
>>>>>
>>>>> I think that this assumption is incorrect.
>>>>
>>>> Well, more precisely I should have said "don't support GPUVM 
>>>> updates from within
>>>> fence signaling critical sections". And looking at the code, that 
>>>> doesn't seem what
>>>> you're doing there.
>>>>
>>>>>
>>>>> Vulkan is just one specific use case, but this here should 
>>>>> probably be able to handle other use cases as well.
>>>>>
>>>>> Especially with HMM you get the requirement that you need to be 
>>>>> able to invalidate GPUVM mappings without grabbing a reservation 
>>>>> lock.
>>>>
>>>> What do you mean with "invalidate GPUVM mappings" in this context? 
>>>> drm_gpuvm_bo_evict()
>>>> should only be called from a ttm_device_funcs::move callback, we 
>>>> should hold the dma-resv
>>>> lock there.
>>>
>>> Well the question is which dma-resv lock do we hold?
>>>
>>> In the move callback we only hold the dma-resv lock of the BO which 
>>> is moved, but when that is a shared BO then that's not the same as 
>>> the one for the VM.
>>
>> Correct, Thomas' idea was to use the GEM's dma_resv lock to protect 
>> drm_gpuvm_bo::evicted
>> and then actually move the drm_gpuvm_bo to the VM's evicted list once 
>> we grabbed all
>> dma-resv locks when locking the VM's BOs using drm_exec. We can 
>> remove them from the evicted
>> list on validate(). This way we never touch the evicted list without 
>> holding at least the VM's
>> dma-resv lock.
>>
>> Do you have any concerns about that?
>
> Scratching my head a bit how that is supposed to work.
>
> This implies that you go over all the evicted BOs during validation 
> and not just the one mentioned in the CS.
>
> That might work for Vulkan, but is pretty much a no-go for OpenGL.
>
>>
>>>
>>>>
>>>>>
>>>>> See what the eviction lock in amdgpu is doing for example.
>>>>
>>>> The eviction_lock seems to protect a VM state "evicting" of whether 
>>>> any BO that
>>>> is associated with the VM is currently evicting. At the same time 
>>>> amdgpu protects
>>>> the evicted list of the VM with a different lock. So this seems to 
>>>> be entirely
>>>> unrelated. Tracking a "currently evicting" state is not part of the 
>>>> GPUVM
>>>> implementation currently and hence nothing would change for amdgpu 
>>>> there.
>>>
>>> Sorry for the confusion we use different terminology in amdgpu.
>>>
>>> The eviction lock and evicted state is for the VM page tables, e.g. 
>>> if the whole VM is currently not used and swapped out or even 
>>> de-allocated.
>>>
>>> This is necessary because we have cases where we need to access the 
>>> VM data without holding the dma-resv lock of this VM. Especially 
>>> figuring out which parts of an address space contain mappings and 
>>> which doesn't.
>>
>> I think this is fine, this has nothing to do with lists of evicted 
>> GEM objects or external GEM
>> objects, right? Marking mappings (drm_gpuva) as invalidated 
>> (DRM_GPUVA_INVALIDATED) or accessing
>> the VA space does not require any dma-resv locks.
>
> I hope so, but I'm not 100% sure.
>
>>
>>>
>>> This is a requirement which comes with HMM handling, you won't see 
>>> this with Vulkan (or OpenGL, VAAPI etc..).
>>>
>>>
>>> The invalidation lock on the other hand is what in this discussion 
>>> is called eviction lock. This one is needed because what I wrote 
>>> above, during the move callback only the dma-resv of the BO which is 
>>> moved is locked, but not necessarily the dma-resv of the VM.
>>
>> That's yet another thing, right? This is used to track whether *any*
>> BO that belongs to the VM is currently being evicted, correct? As
>> mentioned, as of now this is not supported in GPUVM and hence it would
>> be the same driver specific code with the same driver specific lock.
>
> That is most likely a show stopper for using this for OpenGL based
> workloads as far as I can see. For those you need to be able to figure
> out which non-VM BOs have been evicted and which parts of the VM need
> updates.

We identify those with a bool in the gpuvm_bo, and that bool is 
protected by the bo_resv. In essence, the "evicted" list must be made 
up-to-date with all relevant locks held before traversing in the next exec.

If you mean that we need to unbind all vmas of all vms of evicted bos
before evicting, we don't do that, at least not in Xe, since when evicting
we wait for VM idle, and it can't access anything through the stale vmas
until they have been revalidated and rebound.

/Thomas



>>
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>>
>>>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström wrote:
>>>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström 
>>>>>>>>>> wrote:
>>>>>>>>>>> Hi, Danilo,
>>>>>>>>>>>
>>>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>>>> So far the DRM GPUVA manager offers common infrastructure to
>>>>>>>>>>>> track GPU VA
>>>>>>>>>>>> allocations and mappings, generically connect GPU VA mappings
>>>>>>>>>>>> to their
>>>>>>>>>>>> backing buffers and perform more complex mapping operations
>>>>>>>>>>>> on the GPU VA
>>>>>>>>>>>> space.
>>>>>>>>>>>>
>>>>>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>>>>>> drivers, which
>>>>>>>>>>>> can potentially be generalized in order to make the DRM GPUVA
>>>>>>>>>>>> manager
>>>>>>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>>>>>>> this patch aims
>>>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>>>
>>>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not being used
>>>>>>>>>>>> outside of
>>>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>>>
>>>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>>>>>> which are
>>>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>>>
>>>>>>>>>>>> 3) Provide functions to efficiently lock all GEM objects dma-
>>>>>>>>>>>> resv the
>>>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>>>
>>>>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>>>>> contains mappings
>>>>>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>>>>>> accelerated.
>>>>>>>>>>>>
>>>>>>>>>>>> 5) Provide some convenience functions for common patterns.
>>>>>>>>>>>>
>>>>>>>>>>>> Rather than being designed as a "framework", the target is to
>>>>>>>>>>>> make all
>>>>>>>>>>>> features appear as a collection of optional helper functions,
>>>>>>>>>>>> such that
>>>>>>>>>>>> drivers are free to make use of the DRM GPUVA managers basic
>>>>>>>>>>>> functionality and opt-in for other features without setting
>>>>>>>>>>>> any feature
>>>>>>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>>>>>>
>>>>>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>>>>>> locking for drivers
>>>>>>>>>>>> updating the GPU VA space within the fence signalling path.
>>>>>>>>>>>>
>>>>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an existing
>>>>>>>>>>>> instance of this
>>>>>>>>>>>>      * particular combination. If not existent a new instance
>>>>>>>>>>>> is created and linked
>>>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and
>>>>>>>>>>>> evicted objects. Those
>>>>>>>>>>>> + * lists are maintained in order to accelerate locking of
>>>>>>>>>>>> dma-resv locks and
>>>>>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For
>>>>>>>>>>>> instance all
>>>>>>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be
>>>>>>>>>>>> locked by calling
>>>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>>>>>>> also possible to lock
>>>>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>>>>> corresponding parameters to
>>>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec
>>>>>>>>>>>> loop while making
>>>>>>>>>>>> + * use of helper functions such as drm_gpuvm_prepare_range()
>>>>>>>>>>>> or
>>>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Every bound &drm_gem_object is treated as an external object
>>>>>>>>>>>> when its &dma_resv
>>>>>>>>>>>> + * structure is different from the &drm_gpuvm's common
>>>>>>>>>>>> &dma_resv structure.
>>>>>>>>>>>>      */
>>>>>>>>>>>>     /**
>>>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for the same
>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>>>>>> creations and destructions
>>>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances unique.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of external and
>>>>>>>>>>>> evicted objects are
>>>>>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>>>>>> iteration internally.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * However, drivers still need to protect concurrent
>>>>>>>>>>>> calls to functions
>>>>>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function contains
>>>>>>>>>>>> a particular
>>>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Functions adding or removing entries from those lists,
>>>>>>>>>>>> such as
>>>>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be
>>>>>>>>>>>> called with external
>>>>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>>>>> corresponding list being
>>>>>>>>>>>> + * modified while potentially being iterated by
>>>>>>>>>>>> other API functions.
>>>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>>>      */
>>>>>>>>>>>>     /**
>>>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>>>      *   }
>>>>>>>>>>>>      */
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>>>>> already iterated items
>>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>>>>>> get_next_vm_bo_from_list()
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>>>>> first element from
>>>>>>>>>>>> + * the list, so list insertion and deletion can happen
>>>>>>>>>>>> concurrently.
>>>>>>>>>>> Are the list spinlocks needed for that async state update from
>>>>>>>>>>> within the
>>>>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>>>
>>>>>>>>>>> Otherwise it should be sufficient to protect the lists with the
>>>>>>>>>>> gpuvm's resv
>>>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>>>
>>>>>>>>>>> If those spinlocks are still needed in some situations, perhaps
>>>>>>>>>>> could we
>>>>>>>>>>> have an option to set them to NULL (Like IIRC the maple tree
>>>>>>>>>>> allows for)?
>>>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>>>> holding only the dma-resv lock from the BO this function gets
>>>>>>>>>> called for. Hence,
>>>>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() calls with
>>>>>>>>>> different BOs.
>>>>>>>>> No. Only if you try to add external objects to the vm's evict 
>>>>>>>>> list
>>>>>>>>> from
>>>>>>>>> within the evict code. That's not necessary since you loop 
>>>>>>>>> through
>>>>>>>>> all
>>>>>>>>> external objects anyway when locking them so an "evicted" bool in
>>>>>>>>> the vm_bo,
>>>>>>>>> protected by the bo resv would be sufficient. The extobj locking
>>>>>>>>> loop can
>>>>>>>>> then add the bo to the evicted list.
>>>>>>>> And validate() can remove it while still holding all dma-resv 
>>>>>>>> locks,
>>>>>>>> neat!
>>>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>>>> concurrently? What
>>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>>>> drm_gpuva_unlink()?
>>>>>>>> Are we guaranteed that at this point of time the drm_gpuvm_bo 
>>>>>>>> is not
>>>>>>>> on the
>>>>>>>> evicted list? Because otherwise we would call 
>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>> with the
>>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>> might drop the last reference to the drm_gem_object and hence we'd
>>>>>>>> potentially
>>>>>>>> free the dma-resv lock while holding it, at least if it's an 
>>>>>>>> external
>>>>>>>> object.
>>>>>>> Easiest way in this scheme is to think of the lists as being 
>>>>>>> protected
>>>>>>> by the vm's resv lock. That means anybody calling unlink() must 
>>>>>>> also
>>>>>>> hold the vm's resv lock. (Which is OK from an UAF point of view, 
>>>>>>> but
>>>>>>> perhaps not from a locking inversion POV from an async list
>>>>>>> update).
>>>>>> This would mean that on unlink() we'd need to hold the VM's resv 
>>>>>> lock and the
>>>>>> corresponding GEM's resv lock (in case they're not the same 
>>>>>> anyways) because the
>>>>>> VM's resv lock would protect the external / evicted object lists 
>>>>>> and the GEM
>>>>>> objects resv lock protects the GEM's list of drm_gpuvm_bos and the
>>>>>> drm_gpuvm_bo's list of drm_gpuvas.
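As a rough sketch (not taken from the patch), such a synchronous unlink
path could take both reservations through drm_exec, which also keeps a
reference on each locked object, so dropping the last vm_bo and GEM
reference inside drm_gpuva_unlink() is harmless here:

/* Illustrative only: synchronous unbind taking the VM's and the GEM's
 * dma-resv before unlinking. &gpuvm->d_obj is the VM's dummy GEM carrying
 * the common resv, as used by drm_gpuvm_prepare_vm() in the patch.
 */
static int sketch_sync_unbind(struct drm_gpuvm *gpuvm, struct drm_gpuva *va)
{
	struct drm_gem_object *obj = va->gem.obj;
	struct drm_exec exec;
	int ret = 0;

	drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
			     DRM_EXEC_IGNORE_DUPLICATES);
	drm_exec_until_all_locked(&exec) {
		ret = drm_exec_lock_obj(&exec, &gpuvm->d_obj);	/* VM resv */
		drm_exec_retry_on_contention(&exec);
		if (ret)
			break;

		ret = drm_exec_lock_obj(&exec, obj);		/* GEM resv */
		drm_exec_retry_on_contention(&exec);
		if (ret)
			break;
	}

	if (!ret)
		drm_gpuva_unlink(va);

	drm_exec_fini(&exec);
	return ret;
}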
>>>>>>
>>>>>>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>>>>>>> really would not
>>>>>>>>>> like to add even more complexity just to get the spinlock out of
>>>>>>>>>> the way in case
>>>>>>>>>> the driver already has an outer lock protecting this path.
>>>>>>>>> I must disagree here. These spinlocks and atomic operations are
>>>>>>>>> pretty
>>>>>>>>> costly and as discussed earlier this type of locking was the 
>>>>>>>>> reason
>>>>>>>>> (at
>>>>>>>>> least according to the commit message) that made Christian 
>>>>>>>>> drop the
>>>>>>>>> XArray
>>>>>>>>> use in drm_exec for the same set of objects: "The locking 
>>>>>>>>> overhead
>>>>>>>>> is
>>>>>>>>> unnecessary and measurable". IMHO the spinlock is the added
>>>>>>>>> complexity and a
>>>>>>>>> single wide lock following the drm locking guidelines set out by
>>>>>>>>> Daniel and
>>>>>>>>> David should really be the default choice with an opt-in for a
>>>>>>>>> spinlock if
>>>>>>>>> needed for async and pushing out to a wq is not an option.
>>>>>>>> For the external object list an outer lock would work as long 
>>>>>>>> as it's
>>>>>>>> not the
>>>>>>>> dma-resv lock of the corresponding GEM object, since here we 
>>>>>>>> actually
>>>>>>>> need to
>>>>>>>> remove the list entry from the external object list on
>>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>>> It's just a bit weird design wise that drivers would need to take
>>>>>>>> this outer
>>>>>>>> lock on:
>>>>>>>>
>>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also 
>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>>
>>>>>>>> Given that it seems reasonable to do all the required locking
>>>>>>>> internally.
>>>>>>>  From a design POV, there has been a clear direction in XE to make
>>>>>>> things similar to mmap() / munmap(), so this outer lock, which 
>>>>>>> in Xe is
>>>>>>> an rwsem, is used in a similar way as the mmap_lock. It's 
>>>>>>> protecting
>>>>>>> the page-table structures and vma rb tree, the userptr 
>>>>>>> structures and
>>>>>>> the extobj list. Basically it's taken early in the exec IOCTL, the
>>>>>>> VM_BIND ioctl, the compute rebind worker and the pagefault 
>>>>>>> handler, so
>>>>>>> all of the above are just asserting that it is taken in the correct
>>>>>>> mode.
>>>>>>>
>>>>>>> But strictly with this scheme one could also use the vm's 
>>>>>>> dma_resv for
>>>>>>> the extobj list since with drm_exec, it's locked before 
>>>>>>> traversing the
>>>>>>> list.
>>>>>>>
>>>>>>> The whole point of this scheme is to rely on locks that you 
>>>>>>> already are
>>>>>>> supposed to be holding for various reasons and is simple to 
>>>>>>> comprehend.
>>>>>> I don't agree that we're supposed to hold the VM's resv lock 
>>>>>> anyways for
>>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm 
>>>>>> fine using it
>>>>>> for that purpose nevertheless.
>>>>>>
>>>>>>>> In order to at least place lockdep checks, the driver would 
>>>>>>>> need to
>>>>>>>> supply the
>>>>>>>> corresponding lock's lockdep_map, because the GPUVM otherwise 
>>>>>>>> doesn't
>>>>>>>> know about
>>>>>>>> the lock.
>>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>> I'd really like to avoid that, especially now that everything got 
>>>>>> simpler. We
>>>>>> should define the actual locks to take instead.
>>>>>>
>>>>>>>> Out of curiosity, what is the overhead of a spin_lock() that 
>>>>>>>> doesn't
>>>>>>>> need to
>>>>>>>> spin?
>>>>>>> I guess it's hard to tell exactly, but it is much lower on 
>>>>>>> modern x86
>>>>>>> than what it used to be. Not sure about ARM, which is the other
>>>>>>> architecture important to us. I figure if there is little 
>>>>>>> cache-line
>>>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>>>
>>>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>>>
>>>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>>>>>                                  spinlock_t *lock)
>>>>>>>>> {
>>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>>          spin_lock(lock);
>>>>>>>>> }
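Wired into the internal list helpers, such an opt-out could look roughly
like this; the resv_protected_lists flag is the hypothetical field from the
snippet above, everything else exists in the posted patch:

/* Illustrative counterpart for adding a vm_bo to the extobj list. */
static void sketch_extobj_add(struct drm_gpuvm_bo *vm_bo)
{
	struct drm_gpuvm *gpuvm = vm_bo->vm;

	if (gpuvm->resv_protected_lists)	/* hypothetical flag */
		dma_resv_assert_held(gpuvm->resv);
	else
		spin_lock(&gpuvm->extobj.lock);

	if (list_empty(&vm_bo->list.entry.extobj))
		list_add_tail(&vm_bo->list.entry.extobj, &gpuvm->extobj.list);

	if (!gpuvm->resv_protected_lists)
		spin_unlock(&gpuvm->extobj.lock);
}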
>>>>>>>>>
>>>>>>>>>>> For such drivers, that would require anybody calling unlink to
>>>>>>>>>>> hold the vm's
>>>>>>>>>>> resv, though.
>>>>>>>>>> In V4 I want to go back to having a dedicated lock for the GEMs
>>>>>>>>>> gpuva list (or
>>>>>>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>>>>>>> lock for that
>>>>>>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>>>>>>> otherwise wouldn't
>>>>>>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>>>>>>> was referring to
>>>>>>>>>> earlier.
>>>>>>>>> Yeah, I can see the need for a dedicated lock for the GEM's gpuva
>>>>>>>>> list, but
>>>>>>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>>>>>>> problem. We
>>>>>>>>> may free the object and a pointer to the vm's resv during unlink
>>>>>>>>> but we
>>>>>>>>> don't free the vm's resv.  It'd be a matter of ensuring that any
>>>>>>>>> calls to
>>>>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>>>> Drivers calling unlink() from the fence signaling path can't 
>>>>>>>> use the
>>>>>>>> VM's
>>>>>>>> dma-resv lock.
>>>>>>> Yes, that made me a bit curious because in the current version 
>>>>>>> the code
>>>>>>> required the object's dma_resv for unlink() which can't be grabbed
>>>>>>> either from the fence signaling path. So are there any drivers 
>>>>>>> actually
>>>>>>> wanting to do that? If so, they will either need to resort to the
>>>>>>> current spinlock solution or they will need to call unlink from a
>>>>>>> workqueue item.
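For drivers that do want to trigger unlink from that path, the workqueue
deferral could look roughly like this; the work item layout and helper
names are illustrative only, and the work struct would have to be
pre-allocated outside the signalling path:

/* Illustrative only: defer drm_gpuva_unlink() to process context. */
struct sketch_unlink_work {
	struct work_struct work;
	struct drm_gpuva *va;
};

static void sketch_unlink_worker(struct work_struct *work)
{
	struct sketch_unlink_work *uw =
		container_of(work, struct sketch_unlink_work, work);
	struct drm_gem_object *obj = uw->va->gem.obj;

	/*
	 * Hold an explicit GEM reference: drm_gpuva_unlink() may drop the
	 * last vm_bo reference and with it the GEM reference backing
	 * obj->resv.
	 */
	drm_gem_object_get(obj);
	dma_resv_lock(obj->resv, NULL);
	drm_gpuva_unlink(uw->va);
	dma_resv_unlock(obj->resv);
	drm_gem_object_put(obj);

	kfree(uw);
}

/* Called from the fence signalling critical path; uw is pre-allocated. */
static void sketch_defer_unlink(struct sketch_unlink_work *uw,
				struct drm_gpuva *va)
{
	uw->va = va;
	INIT_WORK(&uw->work, sketch_unlink_worker);
	queue_work(system_wq, &uw->work);
}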
>>>>>> As Boris already mentioned we have the dma-resv lock by default 
>>>>>> or a driver
>>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of the 
>>>>>> latter.
>>>>>>
>>>>>>>> Also, what if the object is an external object? We can't use 
>>>>>>>> the VM's
>>>>>>>> dma-resv
>>>>>>>> lock here.
>>>>>>> Why? Typically (sync) unlink is only ever called from an 
>>>>>>> unbind-like
>>>>>>> operation where it should be trivial to grab the vm's resv. Or, for
>>>>>>> that matter any outer lock protecting the extobj list. Rule 
>>>>>>> would be
>>>>>>> the drm_gpuvm_bo::entry::extobj  and drm_gpuvm_bo::entry::evict 
>>>>>>> would
>>>>>>> be protected by either the vm's dma_resv (or possibly an outer 
>>>>>>> lock in
>>>>>>> the case of the extobj list).
>>>>>> Outer lock wouldn't have been working for updates in the async 
>>>>>> path, but
>>>>>> shouldn't be relevant anymore. We could use the VM's resv for that.
>>>>>>
>>>>>>>>   And we can't have the GEM objs dma-resv lock held when calling
>>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>>>>>>> refcount drops
>>>>>>>> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() 
>>>>>>>> might
>>>>>>>> drop the
>>>>>>>> last reference of the GEM object.
>>>>>>> Yes, but this is a different problem as to what exactly protects
>>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per 
>>>>>>> bo list
>>>>>>> lock, or if we want to keep the bo's dma_resv we need to ensure 
>>>>>>> that
>>>>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
>>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount 
>>>>>>> (I know
>>>>>>> Boris didn't like that, but requiring an explicit refcount for a
>>>>>>> pointer you dereference unless you're under a lock that ensures 
>>>>>>> keeping
>>>>>>> the object alive is pretty much required?) But anyway for the
>>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal 
>>>>>>> spinlock)
>>>>>>> I don't have a strong preference.
>>>>>> We can keep the GEM objects dma-resv lock, however as mentioned 
>>>>>> above
>>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both the 
>>>>>> VM's resv lock
>>>>>> and the GEM's resv lock in case they differ.
>>>>>>
>>>>>>>>   All those problems go away with a dedicated
>>>>>>>> GEM gpuva list lock.
>>>>>>> I don't think these are real problems.
>>>>>>> With the exception of the eviction list "trick" where we
>>>>>>> currently have
>>>>>>> slightly different approach to collect external bos needing 
>>>>>>> rebinding,
>>>>>>> we have this working fine.
>>>>>>>
>>>>>>> TBH I think pretty much the only situation where the spinlock is 
>>>>>>> needed
>>>>>>> is for async updates of these lists, unless a wq item can be 
>>>>>>> used for
>>>>>>> that, but it doesn't really seem like the current code allows 
>>>>>>> for such
>>>>>>> updates anyway? It complicates the code a lot, adds overhead and 
>>>>>>> also
>>>>>>> adds the requirement for refcounting during list traversal.
>>>>>>>
>>>>>>> /Thomas
>>>>>>>
>>>>>>>>> /Thomas
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> It seems that with that also the refcount could be made non-
>>>>>>>>>>> atomic.
>>>>>>>>>>>
>>>>>>>>>>> All in the spirit of the drm locking guidelines "use big locks
>>>>>>>>>>> when
>>>>>>>>>>> possible".
>>>>>>>>>>> Lower level locks only when necessary for performance or
>>>>>>>>>>> locking inversion?
>>>>>>>>>>>
>>>>>>>>>>> /Thomas
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>>>>>>> local list, so removal
>>>>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>>>>> iterating the list.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo) \
>>>>>>>>>>>> +       ({                                                                      \
>>>>>>>>>>>> +               struct drm_gpuvm_bo *__vm_bo;                                   \
>>>>>>>>>>>> +                                                                               \
>>>>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);                                 \
>>>>>>>>>>>> +                                                                               \
>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
>>>>>>>>>>>> +               while (!list_empty(&(__gpuvm)->__list_name.list)) {             \
>>>>>>>>>>>> +                       __vm_bo = list_first_entry(&(__gpuvm)->__list_name.list, \
>>>>>>>>>>>> +                                                  struct drm_gpuvm_bo,         \
>>>>>>>>>>>> +                                                  list.entry.__list_name);     \
>>>>>>>>>>>> +                       if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {            \
>>>>>>>>>>>> +                               list_move_tail(&(__vm_bo)->list.entry.__list_name, \
>>>>>>>>>>>> +                                              __local_list);                   \
>>>>>>>>>>>> +                               break;                                           \
>>>>>>>>>>>> +                       } else {                                                 \
>>>>>>>>>>>> +                               list_del_init(&(__vm_bo)->list.entry.__list_name); \
>>>>>>>>>>>> +                               __vm_bo = NULL;                                  \
>>>>>>>>>>>> +                       }                                                        \
>>>>>>>>>>>> +               }                                                                \
>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                       \
>>>>>>>>>>>> +                                                                               \
>>>>>>>>>>>> +               __vm_bo;                                                         \
>>>>>>>>>>>> +       })
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>>>>> first element from the
>>>>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>>>>> concurrently.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Typical use:
>>>>>>>>>>>> + *
>>>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>>>> + *
>>>>>>>>>>>> + *     ret = 0;
>>>>>>>>>>>> + *     for_each_vm_bo_in_list(gpuvm, <list_name>,
>>>>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>>>>> + *             ret = do_something_with_vm_bo(..., vm_bo);
>>>>>>>>>>>> + *             if (ret)
>>>>>>>>>>>> + *                     break;
>>>>>>>>>>>> + *     }
>>>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>> + *     restore_vm_bo_list(gpuvm, <list_name>,
>>>>>>>>>>>> &my_local_list);
>>>>>>>>>>>> + *
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Only used for internal list iterations, not meant to be
>>>>>>>>>>>> exposed to the outside
>>>>>>>>>>>> + * world.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)    \
>>>>>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>>> +                                               __local_list, NULL);            \
>>>>>>>>>>>> +            __vm_bo;                                                            \
>>>>>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,            \
>>>>>>>>>>>> +                                               __local_list, __vm_bo))          \
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to their
>>>>>>>>>>>> original list
>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>>>>> already iterated items
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * When we're done iterating a vm_bo list, we should call
>>>>>>>>>>>> restore_vm_bo_list()
>>>>>>>>>>>> + * to restore the original state and let new iterations take
>>>>>>>>>>>> place.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)                 \
>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>> +               /* Merge back the two lists, moving local list elements to the  \
>>>>>>>>>>>> +                * head to preserve previous ordering, in case it matters.      \
>>>>>>>>>>>> +                */                                                             \
>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
>>>>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)->__list_name.list);        \
>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given
>>>>>>>>>>>> list
>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Inserts the given @__vm_bo into the list specified by
>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                            \
>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>>> +               if (list_empty(&(__vm_bo)->list.entry.__list_name))             \
>>>>>>>>>>>> +                       list_add_tail(&(__vm_bo)->list.entry.__list_name,       \
>>>>>>>>>>>> +                                     &(__vm_bo)->vm->__list_name.list);        \
>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given
>>>>>>>>>>>> list
>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>> + * @__list_name: the name of the list to remove from
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Removes the given @__vm_bo from the list specified by
>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)                            \
>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>>> +               if (!list_empty(&(__vm_bo)->list.entry.__list_name))            \
>>>>>>>>>>>> +                       list_del_init(&(__vm_bo)->list.entry.__list_name);      \
>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>> +
>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>> +
>>>>>>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>> struct drm_device *drm,
>>>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>>>> +
>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>>>> +
>>>>>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
>>>>>>>>>>>> *gpuvm)
>>>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>>>>               "GPUVA tree is not empty, potentially leaking
>>>>>>>>>>>> memory.\n");
>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>> +
>>>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>>>     }
>>>>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the
>>>>>>>>>>>> given
>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Using this function directly, it is the drivers
>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Note: This function is safe against concurrent insertion
>>>>>>>>>>>> and removal of
>>>>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>>>>> concurrent usage itself.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>>>>>>> either an outer VM lock
>>>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this function
>>>>>>>>>>>> within the
>>>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's
>>>>>>>>>>>> dma-resv lock ensures
>>>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +int
>>>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>>>>>> +                         unsigned int num_fences)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, vm_bo->obj,
>>>>>>>>>>>> num_fences);
>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>> +                       break;
>>>>>>>>>>>> +       }
>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>>>> +
>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>> +}
>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped within
>>>>>>>>>>>> a given range
>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>>>>>>> mapped between @addr
>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +int
>>>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>> drm_exec *exec,
>>>>>>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>>>>>>> num_fences)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
>>>>>>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>>>>>>> +
>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>>>>> num_fences);
>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>> +                       return ret;
>>>>>>>>>>>> +       }
>>>>>>>>>>>> +
>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>> +}
>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>>>>>> associated BOs
>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>>>>> given
>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Additionally, when calling this function with struct
>>>>>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>>>>>> + * being set the driver receives the given @fn callback to
>>>>>>>>>>>> lock additional
>>>>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec instance.
>>>>>>>>>>>> Typically, drivers
>>>>>>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>>>>>>> callback.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +int
>>>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>>>>>> 0 |
>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>> +
>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>>>>>> num_fences);
>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>> +
>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>>>>>> num_fences);
>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>> +
>>>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>>>>>> num_fences);
>>>>>>>>>>>> +                       drm_exec_retry_on_contention(exec);
>>>>>>>>>>>> +                       if (ret)
>>>>>>>>>>>> +                               goto err;
>>>>>>>>>>>> +               }
>>>>>>>>>>>> +       }
>>>>>>>>>>>> +
>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>> +
>>>>>>>>>>>> +err:
>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>> +}
>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>>>>> +
>>>>>>>>>>>> +static int
>>>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int
>>>>>>>>>>>> num_fences)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       struct {
>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args->objs,
>>>>>>>>>>>> +                                     args->num_objs, num_fences);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all
>>>>>>>>>>>> associated BOs
>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>>>> + * @num_objs: the number of additional &drm_gem_objects to
>>>>>>>>>>>> lock
>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>>>>> given &drm_gpuvm
>>>>>>>>>>>> + * contains mappings of, plus the ones given through @objs.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +int
>>>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>>>>>> +                         unsigned int num_objs,
>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       struct {
>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>> +       } args;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>>>>>> interruptible);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>>>>>> within a given range
>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>>>>>> mapped between @addr and
>>>>>>>>>>>> + * @addr + @range.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +int
>>>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>>>>>> 0 |
>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>> +
>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec,
>>>>>>>>>>>> addr, range,
>>>>>>>>>>>> + num_fences);
>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>> +       }
>>>>>>>>>>>> +
>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>> +
>>>>>>>>>>>> +err:
>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>> +}
>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as evicted
>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>>>>>>> evicted buffer
>>>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +int
>>>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>> +                       break;
>>>>>>>>>>>> +       }
>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>>>> +
>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>> +}
>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and all
>>>>>>>>>>>> extobj
>>>>>>>>>>>> + * dma-resv
>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>> + */
>>>>>>>>>>>> +void
>>>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>>>> +       unsigned long index;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm, obj) ?
>>>>>>>>>>>> +                                  private_usage : extobj_usage);
>>>>>>>>>>>> +       }
>>>>>>>>>>>> +}
>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>>>> +
>>>>>>>>>>>>     /**
>>>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of struct
>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>>>> +
>>>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>>>          return vm_bo;
>>>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>>>> +
>>>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>>>> +
>>>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>>>>>>>>>>      *
>>>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is
>>>>>>>>>>>> destroyed, which
>>>>>>>>>>>> + * includes removing it from the GEM's gpuva list. Hence, if
>>>>>>>>>>>> a call to this
>>>>>>>>>>>> + * function can potentially let the reference count drop to zero
>>>>>>>>>>>> the caller must
>>>>>>>>>>>> + * hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>>>>>>      */
>>>>>>>>>>>>     void
>>>>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
>>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>>     }
>>>>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>>>     }
>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its
>>>>>>>>>>>> &drm_gpuvm's
>>>>>>>>>>>> + * extobj list
>>>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's
>>>>>>>>>>>> extobj list.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if
>>>>>>>>>>>> it is not on the list
>>>>>>>>>>>> + * already and if the corresponding &drm_gem_object actually is an
>>>>>>>>>>>> + * external object.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +void
>>>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to
>>>>>>>>>>>> / from a
>>>>>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from the evicted
>>>>>>>>>>>> list of all
>>>>>>>>>>>> + * &drm_gpuvms containing a mapping of this &drm_gem_object.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +void
>>>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>>>> +               if (evict)
>>>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, evict);
>>>>>>>>>>>> +               else
>>>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, evict);
>>>>>>>>>>>> +       }
>>>>>>>>>>>> +}
>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>>>>> +
>>>>>>>>>>>>     static int
>>>>>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>>>      */
>>>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>>>           * space
>>>>>>>>>>>>           */
>>>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       /**
>>>>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>>>>> +        */
>>>>>>>>>>>> +       struct {
>>>>>>>>>>>> +               /**
>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>>>>> serving as
>>>>>>>>>>>> +                * external object
>>>>>>>>>>>> +                */
>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>> +
>>>>>>>>>>>> +               /**
>>>>>>>>>>>> +                * @lock: spinlock to protect the extobj list
>>>>>>>>>>>> +                */
>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>> +       } extobj;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       /**
>>>>>>>>>>>> +        * @evict: structure holding the evict list and evict
>>>>>>>>>>>> list lock
>>>>>>>>>>>> +        */
>>>>>>>>>>>> +       struct {
>>>>>>>>>>>> +               /**
>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>>>>> currently being
>>>>>>>>>>>> +                * evicted
>>>>>>>>>>>> +                */
>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>> +
>>>>>>>>>>>> +               /**
>>>>>>>>>>>> +                * @lock: spinlock to protect the evict list
>>>>>>>>>>>> +                */
>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>> +       } evict;
>>>>>>>>>>>>     };
>>>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>> drm_device *drm,
>>>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>>>>>>> &drm_gem_object is an
>>>>>>>>>>>> + * external object
>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv differs
>>>>>>>>>>>> from the
>>>>>>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>>>>>>> + */
>>>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>> +                                      struct drm_gem_object *obj)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>     {
>>>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__)
>>>>>>>>>>>> \
>>>>>>>>>>>>          list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>>>>> &drm_exec should be.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Optionally, @extra can be set in order to lock additional
>>>>>>>>>>>> &drm_gem_objects.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>>>> +       /**
>>>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>>>> +        */
>>>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       /**
>>>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>>>>>>> +        */
>>>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>>>> +
>>>>>>>>>>>> +       /**
>>>>>>>>>>>> +        * @extra: Callback and corresponding private data
>>>>>>>>>>>> for the driver to
>>>>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>>>>> +        */
>>>>>>>>>>>> +       struct {
>>>>>>>>>>>> +               /**
>>>>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>>>>> +                */
>>>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>> +                         unsigned int num_fences);
>>>>>>>>>>>> +
>>>>>>>>>>>> +               /**
>>>>>>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>>>>>>> callback
>>>>>>>>>>>> +                */
>>>>>>>>>>>> +               void *priv;
>>>>>>>>>>>> +       } extra;
>>>>>>>>>>>> +};
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
>>>>>>>>>>>> resv
>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>>>>>>> &drm_gem_object.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Using this function directly, it is the drivers
>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +static inline int
>>>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>>>>>> num_fences);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>> +                             unsigned int num_fences);
>>>>>>>>>>>> +
>>>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>>>>> +                           unsigned int num_fences);
>>>>>>>>>>>> +
>>>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>>>> +
>>>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>>>>>>> +                             unsigned int num_objs,
>>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>> +
>>>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all
>>>>>>>>>>>> associated BOs
>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects
>>>>>>>>>>>> previously acquired
>>>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>>>> + */
>>>>>>>>>>>> +static inline void
>>>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>> private_usage,
>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>> +
>>>>>>>>>>>> +/**
>>>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence() - add fence to private and all
>>>>>>>>>>>> + * extobj dma-resv
>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>> + *
>>>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>>>> + */
>>>>>>>>>>>> +static inline void
>>>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>> private_usage,
>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>> extobj_usage)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
>>>>>>>>>>>> +                                private_usage, extobj_usage);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>>     /**
>>>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>>>                           * gpuva list.
>>>>>>>>>>>>                           */
>>>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>>>> +
>>>>>>>>>>>> +                       /**
>>>>>>>>>>>> +                        * @extobj: List entry to attach to the &drm_gpuvm's
>>>>>>>>>>>> +                        * extobj list.
>>>>>>>>>>>> +                        */
>>>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>>>> +
>>>>>>>>>>>> +                       /**
>>>>>>>>>>>> +                        * @evict: List entry to attach to
>>>>>>>>>>>> the &drm_gpuvms evict
>>>>>>>>>>>> +                        * list.
>>>>>>>>>>>> +                        */
>>>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>>>                  } entry;
>>>>>>>>>>>>          } list;
>>>>>>>>>>>>     };
>>>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
>>>>>>>>>>>> evict);
>>>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>> +
>>>>>>>>>>>>     /**
>>>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a
>>>>>>>>>>>> list of &drm_gpuva
>>>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>>>>>> iteration step
>>>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>>>           * used.
>>>>>>>>>>>>           */
>>>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void
>>>>>>>>>>>> *priv);
>>>>>>>>>>>> +
>>>>>>>>>>>> +       /**
>>>>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>>>>> +        *
>>>>>>>>>>>> +        * Drivers receive this callback for every evicted
>>>>>>>>>>>> &drm_gem_object being
>>>>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>>>>> +        *
>>>>>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>>>>>> specific variant of
>>>>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>>>>> +        */
>>>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>>>>     };
>>>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
>>>>>
>>>>
>>>
>>
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-19 12:21                         ` Thomas Hellström
@ 2023-09-19 15:16                           ` Danilo Krummrich
  2023-09-19 15:23                             ` Thomas Hellström
  0 siblings, 1 reply; 77+ messages in thread
From: Danilo Krummrich @ 2023-09-19 15:16 UTC (permalink / raw)
  To: Thomas Hellström, Christian König
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

On 9/19/23 14:21, Thomas Hellström wrote:
> Hi Christian
> 
> On 9/19/23 14:07, Christian König wrote:
>> On 9/13/23 17:46, Danilo Krummrich wrote:
>>> On 9/13/23 17:33, Christian König wrote:
>>>> On 9/13/23 17:15, Danilo Krummrich wrote:
>>>>> On 9/13/23 16:26, Christian König wrote:
>>>>>> On 9/13/23 14:16, Danilo Krummrich wrote:
>>>>>>> As mentioned in a different mail thread, the reply is based on the assumption
>>>>>>> that we don't support anything other than GPUVM updates from the IOCTL.
>>>>>>
>>>>>> I think that this assumption is incorrect.
>>>>>
>>>>> Well, more precisely I should have said "don't support GPUVM updates from within
>>>>> fence signaling critical sections". And looking at the code, that doesn't seem to be what
>>>>> you're doing there.
>>>>>
>>>>>>
>>>>>> Vulkan is just one specific use case, but this here should probably be able to handle other use cases as well.
>>>>>>
>>>>>> Especially with HMM you get the requirement that you need to be able to invalidate GPUVM mappings without grabbing a reservation lock.
>>>>>
>>>>> What do you mean with "invalidate GPUVM mappings" in this context? drm_gpuvm_bo_evict()
>>>>> should only be called from a ttm_device_funcs::move callback, we should hold the dma-resv
>>>>> lock there.
>>>>
>>>> Well the question is which dma-resv lock do we hold?
>>>>
>>>> In the move callback we only hold the dma-resv lock of the BO which is moved, but when that is a shared BO then that's not the same as the one for the VM.
>>>
>>> Correct, Thomas' idea was to use the GEM's dma_resv lock to protect drm_gpuvm_bo::evicted
>>> and then actually move the drm_gpuvm_bo to the VM's evicted list once we grabbed all
>>> dma-resv locks when locking the VM's BOs using drm_exec. We can remove them from the evicted
>>> list on validate(). This way we never touch the evicted list without holding at least the VM's
>>> dma-resv lock.
>>>
>>> Do you have any concerns about that?
>>
>> Scratching my head a bit how that is supposed to work.
>>
>> This implies that you go over all the evicted BOs during validation and not just the one mentioned in the CS.
>>
>> That might work for Vulkan, but is pretty much a no-go for OpenGL.
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> See what the eviction lock in amdgpu is doing for example.
>>>>>
>>>>> The eviction_lock seems to protect a VM state "evicting" of whether any BO that
>>>>> is associated with the VM is currently evicting. At the same time amdgpu protects
>>>>> the evicted list of the VM with a different lock. So this seems to be entirely
>>>>> unrelated. Tracking a "currently evicting" state is not part of the GPUVM
>>>>> implementation currently and hence nothing would change for amdgpu there.
>>>>
>>>> Sorry for the confusion we use different terminology in amdgpu.
>>>>
>>>> The eviction lock and evicted state is for the VM page tables, e.g. if the whole VM is currently not used and swapped out or even de-allocated.
>>>>
>>>> This is necessary because we have cases where we need to access the VM data without holding the dma-resv lock of this VM. Especially figuring out which parts of an address space contain mappings and which don't.
>>>
>>> I think this is fine, this has nothing to do with lists of evicted GEM objects or external GEM
>>> objects, right? Marking mappings (drm_gpuva) as invalidated (DRM_GPUVA_INVALIDATED) or accessing
>>> the VA space does not require any dma-resv locks.
>>
>> I hope so, but I'm not 100% sure.
>>
>>>
>>>>
>>>> This is a requirement which comes with HMM handling, you won't see this with Vulkan (or OpenGL, VAAPI etc..).
>>>>
>>>>
>>>> The invalidation lock on the other hand is what in this discussion is called eviction lock. This one is needed because what I wrote above, during the move callback only the dma-resv of the BO which is moved is locked, but not necessarily the dma-resv of the VM.
>>>
>>> That's yet another thing, right? This is used to track whether *any* BO that belongs to the VM is
>>> currently being evicted, correct? As mentioned, as of now this is not supported in GPUVM and hence
>>> would be the same driver-specific code with the same driver-specific lock.
>>
>> That is most likely a show stopper for using this with OpenGL based workloads as far as I can see. For those you need to be able to figure out which non-VM BOs have been evicted and which parts of the VM need updates.
> 
> We identify those with a bool in the gpuvm_bo, and that bool is protected by the bo_resv. In essence, the "evicted" list must be made up-to-date with all relevant locks held before traversing in the next exec.

What I still miss with this idea is how do we find all the drm_gpuvm_bo structures with the evicted bool set to true? When doing the drm_exec dance we come across all external ones and can add them to the list if needed, but what about the BOs having the VM's dma-resv?
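
Just to make sure we're talking about the same scheme, below is a rough sketch
of how I read the proposal. The "evicted" flag and both helpers are made up for
illustration only, they don't exist in this series:

/* Eviction path, e.g. a ttm_device_funcs::move callback; only the BO's resv is held. */
static void driver_gem_evict_notify(struct drm_gem_object *obj)
{
	struct drm_gpuvm_bo *vm_bo;

	drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
		dma_resv_assert_held(obj->resv);
		vm_bo->evicted = true; /* hypothetical flag, protected by obj->resv */
	}
}

/* Exec path, after drm_gpuvm_exec_lock() locked the VM's resv and all extobjs. */
static void driver_collect_evicted(struct drm_gpuvm *gpuvm)
{
	struct drm_gpuvm_bo *vm_bo;

	/* We iterate the external objects anyway while locking them ... */
	list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj)
		if (vm_bo->evicted)
			list_move_tail(&vm_bo->list.entry.evict, &gpuvm->evict.list);

	/* ... but BOs sharing the VM's dma-resv are not on the extobj list, so
	 * this loop never sees them - which is exactly the gap I'm asking about.
	 */
}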

> 
> If you mean that we need to unbind all vmas of all vms of evicted bos before evicting, we don't do that, at least not in Xe, since when evicting we wait for VM idle, and it can't access anything through the stale vmas until they have been revalidated and rebound.
> 
> /Thomas
> 
> 
> 
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>>
>>>>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström wrote:
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström wrote:
>>>>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström wrote:
>>>>>>>>>>>> Hi, Danilo,
>>>>>>>>>>>>
>>>>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>>>>> So far the DRM GPUVA manager offers common infrastructure to
>>>>>>>>>>>>> track GPU VA
>>>>>>>>>>>>> allocations and mappings, generically connect GPU VA mappings
>>>>>>>>>>>>> to their
>>>>>>>>>>>>> backing buffers and perform more complex mapping operations
>>>>>>>>>>>>> on the GPU VA
>>>>>>>>>>>>> space.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>>>>>>> drivers, which
>>>>>>>>>>>>> can potentially be generalized in order to make the DRM GPUVA
>>>>>>>>>>>>> manager
>>>>>>>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>>>>>>>> this patch aims
>>>>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not being used
>>>>>>>>>>>>> outside of
>>>>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>>>>>>> which are
>>>>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3) Provide functions to efficiently lock all GEM objects dma-
>>>>>>>>>>>>> resv the
>>>>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>>>>>> contains mappings
>>>>>>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>>>>>>> accelerated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 5) Provide some convenience functions for common patterns.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Rather than being designed as a "framework", the target is to
>>>>>>>>>>>>> make all
>>>>>>>>>>>>> features appear as a collection of optional helper functions,
>>>>>>>>>>>>> such that
>>>>>>>>>>>>> drivers are free to make use of the DRM GPUVA managers basic
>>>>>>>>>>>>> functionality and opt-in for other features without setting
>>>>>>>>>>>>> any feature
>>>>>>>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>>>>>>> locking for drivers
>>>>>>>>>>>>> updating the GPU VA space within the fence signalling path.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an existing
>>>>>>>>>>>>> instance of this
>>>>>>>>>>>>>      * particular combination. If not existent a new instance
>>>>>>>>>>>>> is created and linked
>>>>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and
>>>>>>>>>>>>> evicted objects. Those
>>>>>>>>>>>>> + * lists are maintained in order to accelerate locking of
>>>>>>>>>>>>> dma-resv locks and
>>>>>>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For
>>>>>>>>>>>>> instance, all
>>>>>>>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be
>>>>>>>>>>>>> locked by calling
>>>>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>>>>>>>> also possible to lock
>>>>>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>>>>>> corresponding parameters to
>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec
>>>>>>>>>>>>> loop while making
>>>>>>>>>>>>> + * use of helper functions such as drm_gpuvm_prepare_range()
>>>>>>>>>>>>> or
>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Every bound &drm_gem_object is treated as an external object
>>>>>>>>>>>>> when its &dma_resv
>>>>>>>>>>>>> + * structure is different than the &drm_gpuvm's common
>>>>>>>>>>>>> &dma_resv structure.
>>>>>>>>>>>>>      */
>>>>>>>>>>>>>     /**
>>>>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for the same
>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>>>>>>> creations and destructions
>>>>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances unique.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of external and
>>>>>>>>>>>>> evicted objects are
>>>>>>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>>>>>>> iteration internally.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * However, drivers still need to protect concurrent
>>>>>>>>>>>>> calls to functions
>>>>>>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function contains
>>>>>>>>>>>>> a particular
>>>>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Functions adding or removing entries from those lists,
>>>>>>>>>>>>> such as
>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be
>>>>>>>>>>>>> called with external
>>>>>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>>>>>> corresponding list being
>>>>>>>>>>>>> (safely) modified while potentially being iterated by
>>>>>>>>>>>>> other API functions.
>>>>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>>>>      */
>>>>>>>>>>>>>     /**
>>>>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>>>>      *   }
>>>>>>>>>>>>>      */
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>>>>>>> drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>>>>>> first element from
>>>>>>>>>>>>> the list, so list insertion and deletion can happen
>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>> Are the list spinlocks needed for that async state update from
>>>>>>>>>>>> within the
>>>>>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>>>>
>>>>>>>>>>>> Otherwise it should be sufficient to protect the lists with the
>>>>>>>>>>>> gpuvm's resv
>>>>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>>>>
>>>>>>>>>>>> If those spinlocks are still needed in some situations, perhaps
>>>>>>>>>>>> could we
>>>>>>>>>>>> have an option to set them to NULL (Like IIRC the maple tree
>>>>>>>>>>>> allows for)?
>>>>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>>>>> holding only the dma-resv lock from the BO this function gets
>>>>>>>>>>> called for. Hence,
>>>>>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() calls with
>>>>>>>>>>> different BOs.
>>>>>>>>>> No. Only if you try to add external objects to the vm's evict list
>>>>>>>>>> from
>>>>>>>>>> within the evict code. That's not necessary since you loop through
>>>>>>>>>> all
>>>>>>>>>> external objects anyway when locking them so an "evicted" bool in
>>>>>>>>>> the vm_bo,
>>>>>>>>>> protected by the bo resv would be sufficient. The extobj locking
>>>>>>>>>> loop can
>>>>>>>>>> then add the bo to the evicted list.
>>>>>>>>> And validate() can remove it while still holding all dma-resv locks,
>>>>>>>>> neat!
>>>>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>>>>> concurrently? What
>>>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>>>>> drm_gpuva_unlink()?
>>>>>>>>> Are we guaranteed that at this point of time the drm_gpuvm_bo is not
>>>>>>>>> on the
>>>>>>>>> evicted list? Because otherwise we would call drm_gpuvm_bo_destroy()
>>>>>>>>> with the
>>>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>> might drop the last reference to the drm_gem_object and hence we'd
>>>>>>>>> potentially
>>>>>>>>> free the dma-resv lock while holding it, at least if it's an external
>>>>>>>>> object.
>>>>>>>> Easiest way in this scheme is to think of the lists as being protected
>>>>>>>> by the vm's resv lock. That means anybody calling unlink() must also
>>>>>>>> hold the vm's resv lock. (Which is OK from a UAF point of view, but
>>>>>>>> perhaps not from a locking inversion POV with an async list update).
>>>>>>> This would mean that on unlink() we'd need to hold the VM's resv lock and the
>>>>>>> corresponding GEM's resv lock (in case they're not the same anyways) because the
>>>>>>> VM's resv lock would protect the external / evicted object lists and the GEM
>>>>>>> objects resv lock protects the GEM's list of drm_gpuvm_bos and the
>>>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>>>>
>>>>>>>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>>>>>>>> really would not
>>>>>>>>>>> like to add even more complexity just to get the spinlock out of
>>>>>>>>>>> the way in case
>>>>>>>>>>> the driver already has an outer lock protecting this path.
>>>>>>>>>> I must disagree here. These spinlocks and atomic operations are
>>>>>>>>>> pretty
>>>>>>>>>> costly and as discussed earlier this type of locking was the reason
>>>>>>>>>> (at
>>>>>>>>>> least according to the commit message) that made Christian drop the
>>>>>>>>>> XArray
>>>>>>>>>> use in drm_exec for the same set of objects: "The locking overhead
>>>>>>>>>> is
>>>>>>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>>>>>>> complexity and a
>>>>>>>>>> single wide lock following the drm locking guidelines set out by
>>>>>>>>>> Daniel and
>>>>>>>>>> David should really be the default choice with an opt-in for a
>>>>>>>>>> spinlock if
>>>>>>>>>> needed for async and pushing out to a wq is not an option.
>>>>>>>>> For the external object list an outer lock would work as long as it's
>>>>>>>>> not the
>>>>>>>>> dma-resv lock of the corresponding GEM object, since here we actually
>>>>>>>>> need to
>>>>>>>>> remove the list entry from the external object list on
>>>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>>>> It's just a bit weird design wise that drivers would need to take
>>>>>>>>> this outer
>>>>>>>>> lock on:
>>>>>>>>>
>>>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also drm_gpuvm_bo_put())
>>>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>>>
>>>>>>>>> Given that it seems reasonable to do all the required locking
>>>>>>>>> internally.
>>>>>>>>  From a design POV, there has been a clear direction in XE to make
>>>>>>>> things similar to mmap() / munmap(), so this outer lock, which in Xe is
>>>>>>>> an rwsem, is used in a similar way as the mmap_lock. It's protecting
>>>>>>>> the page-table structures and vma rb tree, the userptr structures and
>>>>>>>> the extobj list. Basically it's taken early in the exec IOCTL, the
>>>>>>>> VM_BIND ioctl, the compute rebind worker and the pagefault handler, so
>>>>>>>> all of the above are just asserting that it is taken in the correct
>>>>>>>> mode.
>>>>>>>>
>>>>>>>> But strictly with this scheme one could also use the vm's dma_resv for
>>>>>>>> the extobj list since with drm_exec, it's locked before traversing the
>>>>>>>> list.
>>>>>>>>
>>>>>>>> The whole point of this scheme is to rely on locks that you already are
>>>>>>>> supposed to be holding for various reasons and is simple to comprehend.
>>>>>>> I don't agree that we're supposed to hold the VM's resv lock anyways for
>>>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but I'm fine using it
>>>>>>> for that purpose nevertheless.
>>>>>>>
>>>>>>>>> In order to at least place lockdep checks, the driver would need to
>>>>>>>>> supply the
>>>>>>>>> corresponding lock's lockdep_map, because the GPUVM otherwise doesn't
>>>>>>>>> know about
>>>>>>>>> the lock.
>>>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>>> I'd really like to avoid that, especially now that everything got simpler. We
>>>>>>> should define the actual locks to take instead.
>>>>>>>
>>>>>>>>> Out of curiosity, what is the overhead of a spin_lock() that doesn't
>>>>>>>>> need to
>>>>>>>>> spin?
>>>>>>>> I guess it's hard to tell exactly, but it is much lower on modern x86
>>>>>>>> than what it used to be. Not sure about ARM, which is the other
>>>>>>>> architecture important to us. I figure if there is little cache-line
>>>>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>>>>
>>>>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>>>>
>>>>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>>>>>>                                  spinlock_t *lock)
>>>>>>>>>> {
>>>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>>>          spin_lock(lock);
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>>>> For such drivers, that would require anybody calling unlink to
>>>>>>>>>>>> hold the vm's
>>>>>>>>>>>> resv, though.
>>>>>>>>>>> In V4 I want to go back to having a dedicated lock for the GEMs
>>>>>>>>>>> gpuva list (or
>>>>>>>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>>>>>>>> lock for that
>>>>>>>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>>>>>>>> otherwise wouldn't
>>>>>>>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>>>>>>>> was referring to
>>>>>>>>>>> earlier.
>>>>>>>>>> Yeah, I can see the need for a dedicated lock for the GEM's gpuva
>>>>>>>>>> list, but
>>>>>>>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>>>>>>>> problem. We
>>>>>>>>>> may free the object and a pointer to the vm's resv during unlink
>>>>>>>>>> but we
>>>>>>>>>> don't free the vm's resv.  It'd be a matter of ensuring that any
>>>>>>>>>> calls to
>>>>>>>>>> unlink from *within* drm_gpuvm allow it to be held.
>>>>>>>>> Drivers calling unlink() from the fence signaling path can't use the
>>>>>>>>> VM's
>>>>>>>>> dma-resv lock.
>>>>>>>> Yes, that made me a bit curious because in the current version the code
>>>>>>>> required the object's dma_resv for unlink() which can't be grabbed
>>>>>>>> either from the fence signaling path. So are there any drivers actually
>>>>>>>> wanting to do that? If so, they will either need to resort to the
>>>>>>>> current spinlock solution or they will need to call unlink from a
>>>>>>>> workqueue item.
>>>>>>> As Boris already mentioned we have the dma-resv lock by default or a driver
>>>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of the latter.
>>>>>>>
>>>>>>>>> Also, what if the object is an external object? We can't use the VM's
>>>>>>>>> dma-resv
>>>>>>>>> lock here.
>>>>>>>> Why? Typically (sync) unlink is only ever called from an unbind-like
>>>>>>>> operation where it should be trivial to grab the vm's resv. Or, for
>>>>>>>> that matter any outer lock protecting the extobj list. The rule would be
>>>>>>>> that drm_gpuvm_bo::entry::extobj and drm_gpuvm_bo::entry::evict would
>>>>>>>> be protected by either the vm's dma_resv (or possibly an outer lock in
>>>>>>>> the case of the extobj list).
>>>>>>> An outer lock wouldn't have worked for updates in the async path, but that
>>>>>>> shouldn't be relevant anymore. We could use the VM's resv for that.
>>>>>>>
>>>>>>>>>   And we can't have the GEM object's dma-resv lock held when calling
>>>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>>>>>>>> refcount drops
>>>>>>>>> to zero calls drm_gpuvm_bo_destroy() and drm_gpuvm_bo_destroy() might
>>>>>>>>> drop the
>>>>>>>>> last reference of the GEM object.
>>>>>>>> Yes, but this is a different problem as to what exactly protects
>>>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal per bo list
>>>>>>>> lock, or if we want to keep the bo's dma_resv we need to ensure that
>>>>>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts its obj
>>>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's refcount (I know
>>>>>>>> Boris didn't like that, but requiring an explicit refcount for a
>>>>>>>> pointer you dereference unless you're under a lock that ensures keeping
>>>>>>>> the object alive is pretty much required?) But anyway for the
>>>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal spinlock)
>>>>>>>> I don't have a strong preference.
>>>>>>> We can keep the GEM object's dma-resv lock, however as mentioned above
>>>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then require both the VM's resv lock
>>>>>>> and the GEM's resv lock in case they differ.
>>>>>>>
>>>>>>>>>   All those problems go away with a dedicated
>>>>>>>>> GEM gpuva list lock.
>>>>>>>> I don't think these are real problems.
>>>>>>>> With the exception of the eviction list "trick" where we currently have
>>>>>>>> slightly different approach to collect external bos needing rebinding,
>>>>>>>> we have this working fine.
>>>>>>>>
>>>>>>>> TBH I think pretty much the only situation where the spinlock is needed
>>>>>>>> is for async updates of these lists, unless a wq item can be used for
>>>>>>>> that, but it doesn't really seem like the current code allows for such
>>>>>>>> updates anyway? It complicates the code a lot, adds overhead and also
>>>>>>>> adds the requirement for refcounting during list traversal.
>>>>>>>>
>>>>>>>> /Thomas
>>>>>>>>
>>>>>>>>>> /Thomas
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> It seems that with that also the refcount could be make non-
>>>>>>>>>>>> atomic.
>>>>>>>>>>>>
>>>>>>>>>>>> All in the spirit of the drm locking guidelines "use big locks
>>>>>>>>>>>> when
>>>>>>>>>>>> possible".
>>>>>>>>>>>> Lower level locks only when necessary for performance or
>>>>>>>>>>>> locking inversion?
>>>>>>>>>>>>
>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>>>>>>>> local list, so removal
>>>>>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>>>>>> iterating the list.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)     \
>>>>>>>>>>>>> +       ({                                                                              \
>>>>>>>>>>>>> +               struct drm_gpuvm_bo *__vm_bo;                                           \
>>>>>>>>>>>>> +                                                                                       \
>>>>>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);                                         \
>>>>>>>>>>>>> +                                                                                       \
>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                                \
>>>>>>>>>>>>> +               while (!list_empty(&(__gpuvm)->__list_name.list)) {                     \
>>>>>>>>>>>>> +                       __vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,        \
>>>>>>>>>>>>> +                                                  struct drm_gpuvm_bo,                 \
>>>>>>>>>>>>> +                                                  list.entry.__list_name);             \
>>>>>>>>>>>>> +                       if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {                    \
>>>>>>>>>>>>> +                               list_move_tail(&(__vm_bo)->list.entry.__list_name,      \
>>>>>>>>>>>>> +                                              __local_list);                           \
>>>>>>>>>>>>> +                               break;                                                  \
>>>>>>>>>>>>> +                       } else {                                                        \
>>>>>>>>>>>>> +                               list_del_init(&(__vm_bo)->list.entry.__list_name);      \
>>>>>>>>>>>>> +                               __vm_bo = NULL;                                         \
>>>>>>>>>>>>> +                       }                                                               \
>>>>>>>>>>>>> +               }                                                                       \
>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                              \
>>>>>>>>>>>>> +                                                                                       \
>>>>>>>>>>>>> +               __vm_bo;                                                                \
>>>>>>>>>>>>> +       })
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>>>>>> first element from the
>>>>>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Typical use:
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + *     ret = 0;
>>>>>>>>>>>>> + *     drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>,
>>>>>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>>>>>> + *             ret = do_something_with_vm_bo(..., vm_bo);
>>>>>>>>>>>>> + *             if (ret)
>>>>>>>>>>>>> + *                     break;
>>>>>>>>>>>>> + *     }
>>>>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>> + *     drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>,
>>>>>>>>>>>>> &my_local_list);
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Only used for internal list iterations, not meant to be
>>>>>>>>>>>>> exposed to the outside
>>>>>>>>>>>>> + * world.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)    \
>>>>>>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>>>> +                                               __local_list, NULL);            \
>>>>>>>>>>>>> +            __vm_bo;                                                           \
>>>>>>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>>>> +                                               __local_list, __vm_bo))         \
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to their
>>>>>>>>>>>>> original list
>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * When we're done iterating a vm_bo list, we should call
>>>>>>>>>>>>> restore_vm_bo_list()
>>>>>>>>>>>>> + * to restore the original state and let new iterations take
>>>>>>>>>>>>> place.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)                         \
>>>>>>>>>>>>> +       do {                                                                            \
>>>>>>>>>>>>> +               /* Merge back the two lists, moving local list elements to the          \
>>>>>>>>>>>>> +                * head to preserve previous ordering, in case it matters.              \
>>>>>>>>>>>>> +                */                                                                     \
>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                                \
>>>>>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)->__list_name.list);                \
>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                              \
>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given
>>>>>>>>>>>>> list
>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Inserts the given @__vm_bo into the list specified by
>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                            \
>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>>>> +               if (list_empty(&(__vm_bo)->list.entry.__list_name))             \
>>>>>>>>>>>>> +                       list_add_tail(&(__vm_bo)->list.entry.__list_name,       \
>>>>>>>>>>>>> +                                     &(__vm_bo)->vm->__list_name.list);        \
>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given
>>>>>>>>>>>>> list
>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Removes the given @__vm_bo from the list specified by
>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)                            \
>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>>>> +               if (!list_empty(&(__vm_bo)->list.entry.__list_name))            \
>>>>>>>>>>>>> +                       list_del_init(&(__vm_bo)->list.entry.__list_name);      \
>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>> +
>>>>>>>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>> struct drm_device *drm,
>>>>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>>>>> +
>>>>>>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
>>>>>>>>>>>>> *gpuvm)
>>>>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>>>>>               "GPUVA tree is not empty, potentially leaking
>>>>>>>>>>>>> memory.\n");
>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>> +
>>>>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the
>>>>>>>>>>>>> given
>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Note: This function is safe against concurrent insertion
>>>>>>>>>>>>> and removal of
>>>>>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>>>>>> concurrent usage itself.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>>>>>>>> either an outer VM lock
>>>>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this function
>>>>>>>>>>>>> within the
>>>>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's
>>>>>>>>>>>>> dma-resv lock ensures
>>>>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +int
>>>>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>>>>>>> +                         unsigned int num_fences)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, vm_bo->obj,
>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>> +       }
>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped within
>>>>>>>>>>>>> a given range
>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>>>>>>>> mapped between @addr
>>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +int
>>>>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>>> drm_exec *exec,
>>>>>>>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
>>>>>>>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>> +                       return ret;
>>>>>>>>>>>>> +       }
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>>>>>> given
>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Additionally, when calling this function with struct
>>>>>>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>>>>>>> + * being set the driver receives the given @fn callback to
>>>>>>>>>>>>> lock additional
>>>>>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec instance.
>>>>>>>>>>>>> Typically, drivers
>>>>>>>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>>>>>>>> callback.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +int
>>>>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>> +                       drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>> +                       if (ret)
>>>>>>>>>>>>> +                               goto err;
>>>>>>>>>>>>> +               }
>>>>>>>>>>>>> +       }
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +err:
>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +static int
>>>>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int
>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args->objs,
>>>>>>>>>>>>> +                                     args->num_objs, num_fences);
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all
>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>>>>> + * @num_objs: the number of additional &drm_gem_objects to
>>>>>>>>>>>>> lock
>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>>>>>> given &drm_gpuvm
>>>>>>>>>>>>> + * contains mappings of, plus the ones given through @objs.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +int
>>>>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>>>>>>> +                         unsigned int num_objs,
>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>> +       } args;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>>>>>>> interruptible);
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>>>>>>> within a given range
>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>>>>>>> mapped between @addr and
>>>>>>>>>>>>> + * @addr + @range.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +int
>>>>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec,
>>>>>>>>>>>>> addr, range,
>>>>>>>>>>>>> + num_fences);
>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>> +       }
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +err:
>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as evicted
>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>>>>>>>> evicted buffer
>>>>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +int
>>>>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>>>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>> +       }
>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and all
>>>>>>>>>>>>> extobj
>>>>>>>>>>>>> + * dma-resv
>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +void
>>>>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>>>>> +       unsigned long index;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm, obj) ?
>>>>>>>>>>>>> +                                  private_usage : extobj_usage);
>>>>>>>>>>>>> +       }
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>>>>> +
>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of struct
>>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>>>>> +
>>>>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>>>>          return vm_bo;
>>>>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>>>>> +
>>>>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>>>>>>>>>>>      *
>>>>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is
>>>>>>>>>>>>> destroyed, which
>>>>>>>>>>>>> + * includes removing it from the GEM's gpuva list. Hence, if a call to this
>>>>>>>>>>>>> + * function can potentially let the reference count drop to zero, the caller must
>>>>>>>>>>>>> + * hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>>>>>>>      */
>>>>>>>>>>>>>     void
>>>>>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
>>>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +
>>>>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>>>>     }
>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its
>>>>>>>>>>>>> &drm_gpuvm's
>>>>>>>>>>>>> + * extobj list
>>>>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's
>>>>>>>>>>>>> extobj list.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if
>>>>>>>>>>>>> not on the list
>>>>>>>>>>>>> + * already and if the corresponding &drm_gem_object actually is an
>>>>>>>>>>>>> + * external object.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +void
>>>>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to
>>>>>>>>>>>>> / from a
>>>>>>>>>>>>> + * &drm_gpuvm's evicted list
>>>>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from the evicted list of all
>>>>>>>>>>>>> + * &drm_gpuvms containing a mapping of this &drm_gem_object.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +void
>>>>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>>>>> +               if (evict)
>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, evict);
>>>>>>>>>>>>> +               else
>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, evict);
>>>>>>>>>>>>> +       }
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>>>>>> +
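For orientation, the two helpers above are typically driven from the bind path and from the
driver's eviction path respectively. A minimal sketch follows; my_vm_bind_bo() and
my_bo_move_notify() are hypothetical driver helpers, and it assumes drm_gpuvm_bo_obtain()
(introduced earlier in this series) returns an ERR_PTR on failure:

    /* Sketch only: hypothetical driver-side usage of the helpers above. */
    static int my_vm_bind_bo(struct drm_gpuvm *gpuvm, struct drm_gem_object *obj)
    {
            struct drm_gpuvm_bo *vm_bo;

            vm_bo = drm_gpuvm_bo_obtain(gpuvm, obj);
            if (IS_ERR(vm_bo))
                    return PTR_ERR(vm_bo);

            /* ... create and link the drm_gpuvas backing this mapping ... */

            /* Track the BO as external if it does not share the VM's dma-resv. */
            drm_gpuvm_bo_extobj_add(vm_bo);

            return 0;
    }

    /* Called with the BO's dma-resv held, e.g. from a TTM move callback. */
    static void my_bo_move_notify(struct drm_gem_object *obj, bool evicted)
    {
            drm_gpuvm_bo_evict(obj, evicted);
    }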
>>>>>>>>>>>>>     static int
>>>>>>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>>>>      */
>>>>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>>>>           * space
>>>>>>>>>>>>>           */
>>>>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>>>>>> +        */
>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>>>>>> serving as
>>>>>>>>>>>>> +                * external object
>>>>>>>>>>>>> +                */
>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>> +                * @lock: spinlock to protect the extobj list
>>>>>>>>>>>>> +                */
>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>> +       } extobj;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>> +        * @evict: structure holding the evict list and evict
>>>>>>>>>>>>> list lock
>>>>>>>>>>>>> +        */
>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>>>>>> currently being
>>>>>>>>>>>>> +                * evicted
>>>>>>>>>>>>> +                */
>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>> +                * @lock: spinlock to protect the evict list
>>>>>>>>>>>>> +                */
>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>> +       } evict;
>>>>>>>>>>>>>     };
>>>>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>>> drm_device *drm,
>>>>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given &drm_gem_object is an
>>>>>>>>>>>>> + * external object
>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Returns: true if the &drm_gem_object's &dma_resv differs from the
>>>>>>>>>>>>> + * &drm_gpuvm's &dma_resv, false otherwise
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>> +                                       struct drm_gem_object *obj)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +
>>>>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>     {
>>>>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__)
>>>>>>>>>>>>> \
>>>>>>>>>>>>>          list_for_each_entry_safe(va__, next__, &(gpuvm__)-
>>>>>>>>>>>>>> rb.list, rb.entry)
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>>>>>> &drm_exec should be.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Optionally, @extra can be set in order to lock additional
>>>>>>>>>>>>> &drm_gem_objects.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>>>>> +        */
>>>>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>>>>>>>> +        */
>>>>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>> +        * @extra: Callback and corresponding private data
>>>>>>>>>>>>> for the driver to
>>>>>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>>>>>> +        */
>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>>>>>> +                */
>>>>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>> +                         unsigned int num_fences);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>>>>>>>> callback
>>>>>>>>>>>>> +                */
>>>>>>>>>>>>> +               void *priv;
>>>>>>>>>>>>> +       } extra;
>>>>>>>>>>>>> +};
>>>>>>>>>>>>> +
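The @extra hook is easiest to see in use. A sketch, assuming a driver wants to lock one
additional, driver-private BO (my_fw_bo, made up here) on top of the VM's objects:

    static int my_lock_extra(struct drm_gpuvm_exec *vm_exec, unsigned int num_fences)
    {
            struct drm_gem_object *fw_bo = vm_exec->extra.priv;

            return drm_exec_prepare_obj(&vm_exec->exec, fw_bo, num_fences);
    }

    struct drm_gpuvm_exec vm_exec = {
            .vm = gpuvm,
            .extra = {
                    .fn = my_lock_extra,
                    .priv = my_fw_bo,
            },
    };

    ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);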
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVM's common dma-resv
>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVM's dummy &drm_gem_object.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Using this function directly, it is the driver's responsibility to call
>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +static inline int
>>>>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>>> +                             unsigned int num_fences);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>>>>>> +                           unsigned int num_fences);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>>>>>>>> +                             unsigned int num_objs,
>>>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
>>>>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +/**
>>>>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence() - add a fence to the VM's private and all
>>>>>>>>>>>>> + * extobj dma-resv
>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>> + *
>>>>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>>>>> + */
>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>>> +                             enum dma_resv_usage private_usage,
>>>>>>>>>>>>> +                             enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
>>>>>>>>>>>>> +                                private_usage, extobj_usage);
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +
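Taken together, the convenience helpers above suggest a submission flow roughly like the
following sketch; the job submission itself (my_run_job()) is hypothetical and the dma-resv
usages are just an example:

    struct drm_gpuvm_exec vm_exec = { .vm = gpuvm };
    struct dma_fence *fence;
    int ret;

    ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
    if (ret)
            return ret;

    ret = drm_gpuvm_validate(gpuvm);        /* revalidate evicted BOs */
    if (ret)
            goto out_unlock;

    fence = my_run_job(gpuvm);              /* hypothetical job submission */
    if (IS_ERR(fence)) {
            ret = PTR_ERR(fence);
            goto out_unlock;
    }

    drm_gpuvm_exec_resv_add_fence(&vm_exec, fence,
                                  DMA_RESV_USAGE_BOOKKEEP,
                                  DMA_RESV_USAGE_BOOKKEEP);
    dma_fence_put(fence);

out_unlock:
    drm_gpuvm_exec_unlock(&vm_exec);
    return ret;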
>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>>>>                           * gpuva list.
>>>>>>>>>>>>>                           */
>>>>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>> +                        * @extobj: List entry to attach to the &drm_gpuvm's
>>>>>>>>>>>>> +                        * extobj list.
>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>> +                        * @evict: List entry to attach to the &drm_gpuvm's
>>>>>>>>>>>>> +                        * evict list.
>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>>>>                  } entry;
>>>>>>>>>>>>>          } list;
>>>>>>>>>>>>>     };
>>>>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
>>>>>>>>>>>>> evict);
>>>>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>> +
>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a
>>>>>>>>>>>>> list of &drm_gpuva
>>>>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>>>>>>> iteration step
>>>>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>>>>           * used.
>>>>>>>>>>>>>           */
>>>>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void
>>>>>>>>>>>>> *priv);
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>>>>>> +        *
>>>>>>>>>>>>> +        * Drivers receive this callback for every evicted
>>>>>>>>>>>>> &drm_gem_object being
>>>>>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>>>>>> +        *
>>>>>>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>>>>>>> specific variant of
>>>>>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>>>>>> +        */
>>>>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>>>>>     };
>>>>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
>>>>>>
>>>>>
>>>>
>>>
>>
> 
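To make the bo_validate() hook at the end of the quoted patch concrete, a driver-side
sketch; my_driver_validate_bo() stands in for the driver's ttm_bo_validate() wrapper and is
not part of the series:

    static int my_bo_validate(struct drm_gem_object *obj)
    {
            /* Move the BO back to a GPU-accessible placement. */
            return my_driver_validate_bo(obj);
    }

    static const struct drm_gpuvm_ops my_gpuvm_ops = {
            .bo_validate = my_bo_validate,
            /* .sm_step_map / .sm_step_remap / .sm_step_unmap as needed */
    };

    /* With all dma-resv locks held, e.g. after drm_gpuvm_exec_lock(): */
    ret = drm_gpuvm_validate(gpuvm);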


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-19 15:16                           ` Danilo Krummrich
@ 2023-09-19 15:23                             ` Thomas Hellström
  2023-09-20  5:37                               ` Christian König
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-19 15:23 UTC (permalink / raw)
  To: Danilo Krummrich, Christian König
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel


On 9/19/23 17:16, Danilo Krummrich wrote:
> On 9/19/23 14:21, Thomas Hellström wrote:
>> Hi Christian
>>
>> On 9/19/23 14:07, Christian König wrote:
>>> Am 13.09.23 um 17:46 schrieb Danilo Krummrich:
>>>> On 9/13/23 17:33, Christian König wrote:
>>>>> Am 13.09.23 um 17:15 schrieb Danilo Krummrich:
>>>>>> On 9/13/23 16:26, Christian König wrote:
>>>>>>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>>>>>>> As mentioned in a different mail thread, the reply is based on 
>>>>>>>> the assumption
>>>>>>>> that we don't support anything else than GPUVM updates from the 
>>>>>>>> IOCTL.
>>>>>>>
>>>>>>> I think that this assumption is incorrect.
>>>>>>
>>>>>> Well, more precisely I should have said "don't support GPUVM 
>>>>>> updated from within
>>>>>> fence signaling critical sections". And looking at the code, that 
>>>>>> doesn't seem what
>>>>>> you're doing there.
>>>>>>
>>>>>>>
>>>>>>> Vulkan is just once specific use case, but this here should 
>>>>>>> probably be able to handle other use cases as well.
>>>>>>>
>>>>>>> Especially with HMM you get the requirement that you need to be 
>>>>>>> able to invalidate GPUVM mappings without grabbing a reservation 
>>>>>>> lock.
>>>>>>
>>>>>> What do you mean with "invalidate GPUVM mappings" in this 
>>>>>> context? drm_gpuvm_bo_evict()
>>>>>> should only be called from a ttm_device_funcs::move callback, we 
>>>>>> should hold the dma-resv
>>>>>> lock there.
>>>>>
>>>>> Well the question is which dma-resv lock do we hold?
>>>>>
>>>>> In the move callback we only hold the dma-resv lock of the BO 
>>>>> which is moved, but when that is a shared BO then that's not the 
>>>>> same as the one for the VM.
>>>>
>>>> Correct, Thomas' idea was to use the GEM's dma_resv lock to protect 
>>>> drm_gpuvm_bo::evicted
>>>> and then actually move the drm_gpuvm_bo to the VM's evicted list 
>>>> once we grabbed all
>>>> dma-resv locks when locking the VM's BOs using drm_exec. We can 
>>>> remove them from the evicted
>>>> list on validate(). This way we never touch the evicted list 
>>>> without holding at least the VM's
>>>> dma-resv lock.
>>>>
>>>> Do you have any concerns about that?
>>>
>>> Scratching my head a bit how that is supposed to work.
>>>
>>> This implies that you go over all the evicted BOs during validation 
>>> and not just the one mentioned in the CS.
>>>
>>> That might work for Vulkan, but is pretty much a no-go for OpenGL.
>>>
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> See what the eviction lock in amdgpu is doing for example.
>>>>>>
>>>>>> The eviction_lock seems to protect a VM state "evicting" of 
>>>>>> whether any BO that
>>>>>> is associated with the VM is currently evicting. At the same time 
>>>>>> amdgpu protects
>>>>>> the evicted list of the VM with a different lock. So this seems 
>>>>>> to be entirely
>>>>>> unrelated. Tracking a "currently evicting" state is not part of 
>>>>>> the GPUVM
>>>>>> implementation currently and hence nothing would change for 
>>>>>> amdgpu there.
>>>>>
>>>>> Sorry for the confusion we use different terminology in amdgpu.
>>>>>
>>>>> The eviction lock and evicted state is for the VM page tables, 
>>>>> e.g. if the whole VM is currently not used and swapped out or even 
>>>>> de-allocated.
>>>>>
>>>>> This is necessary because we have cases where we need to access 
>>>>> the VM data without holding the dma-resv lock of this VM. 
>>>>> Especially figuring out which parts of an address space contain 
>>>>> mappings and which doesn't.
>>>>
>>>> I think this is fine, this has nothing to do with lists of evicted 
>>>> GEM objects or external GEM
>>>> objects, right? Marking mappings (drm_gpuva) as invalidated 
>>>> (DRM_GPUVA_INVALIDATED) or accessing
>>>> the VA space does not require any dma-resv locks.
>>>
>>> I hope so, but I'm not 100% sure.
>>>
>>>>
>>>>>
>>>>> This is a requirement which comes with HMM handling, you won't see 
>>>>> this with Vulkan (or OpenGL, VAAPI etc..).
>>>>>
>>>>>
>>>>> The invalidation lock on the other hand is what in this discussion 
>>>>> is called eviction lock. This one is needed because what I wrote 
>>>>> above, during the move callback only the dma-resv of the BO which 
>>>>> is moved is locked, but not necessarily the dma-resv of the VM.
>>>>
>>>> That's yet another thing, right? This is used to track whether 
>>>> *any* BO that belongs to the VM is
>>>> currently being evicted, correct? As mentioned, as by now this is 
>>>> not supported in GPUVM and hence
>>>> would be the same driver specific code with the same driver specifc 
>>>> lock.
>>>
>>> That is most likely a show stopper using this for OpenGL based 
>>> workloads as far as I can see. For those you need to able to figure 
>>> out which non-VM BOs have been evicted and which parts of the VM 
>>> needs updates.
>>
>> We identify those with a bool in the gpuvm_bo, and that bool is 
>> protected by the bo_resv. In essence, the "evicted" list must be made 
>> up-to-date with all relevant locks held before traversing in the next 
>> exec.
>
> What I still miss with this idea is how do we find all the 
> drm_gpuvm_bo structures with the evicted bool set to true? When doing 
> the drm_exec dance we come across all external ones and can add them 
> to the list if needed, but what about the BOs having the VM's dma-resv?

Oh, they can be added to the evict list directly (no bool needed) in the 
eviction code, like in v3. For those we indeed hold the VM's 
dma_resv, since it's aliased with the object's dma-resv.
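A sketch of the split being discussed here, purely illustrative: external BOs would flip a
proposed drm_gpuvm_bo::evicted flag (which does not exist in the posted patch) under their
own dma-resv, while BOs sharing the VM's dma-resv can go onto the evict list directly:

    static void my_gpuvm_bo_mark_evicted(struct drm_gpuvm_bo *vm_bo)
    {
            dma_resv_assert_held(vm_bo->obj->resv);

            if (drm_gpuvm_is_extobj(vm_bo->vm, vm_bo->obj)) {
                    /* Only the BO's resv is held; defer the list move until the
                     * extobj locking loop holds the VM's resv as well.
                     */
                    vm_bo->evicted = true;  /* proposed flag, not in the patch */
            } else {
                    /* obj->resv is the VM's resv here, so the evict list is
                     * already protected; add the entry directly.
                     */
                    list_move_tail(&vm_bo->list.entry.evict, &vm_bo->vm->evict.list);
            }
    }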

/Thomas



>
>>
>> If you mean that we need to unbind all vmas of all vms of evicted bos 
>> before evicting, we don't do that, at least not in Xe, since when evicting 
>> we wait for VM idle, and it can't access anything through the stale 
>> vmas until they have been revalidated and rebound.
>>
>> /Thomas
>>
>>
>>
>>>>
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström wrote:
>>>>>>>>> Hi!
>>>>>>>>>
>>>>>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström 
>>>>>>>>>> wrote:
>>>>>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> Hi, Danilo,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>>>>>> So far the DRM GPUVA manager offers common infrastructure to
>>>>>>>>>>>>>> track GPU VA
>>>>>>>>>>>>>> allocations and mappings, generically connect GPU VA 
>>>>>>>>>>>>>> mappings
>>>>>>>>>>>>>> to their
>>>>>>>>>>>>>> backing buffers and perform more complex mapping operations
>>>>>>>>>>>>>> on the GPU VA
>>>>>>>>>>>>>> space.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>>>>>>>> drivers, which
>>>>>>>>>>>>>> can potentially be generalized in order to make the DRM 
>>>>>>>>>>>>>> GPUVA
>>>>>>>>>>>>>> manager
>>>>>>>>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>>>>>>>>> this patch aims
>>>>>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not being used
>>>>>>>>>>>>>> outside of
>>>>>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>>>>>>>> which are
>>>>>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3) Provide functions to efficiently lock all GEM objects 
>>>>>>>>>>>>>> dma-
>>>>>>>>>>>>>> resv the
>>>>>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>>>>>>> contains mappings
>>>>>>>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>>>>>>>> accelerated.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 5) Provide some convinience functions for common patterns.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Rather than being designed as a "framework", the target 
>>>>>>>>>>>>>> is to
>>>>>>>>>>>>>> make all
>>>>>>>>>>>>>> features appear as a collection of optional helper 
>>>>>>>>>>>>>> functions,
>>>>>>>>>>>>>> such that
>>>>>>>>>>>>>> drivers are free to make use of the DRM GPUVA managers basic
>>>>>>>>>>>>>> functionality and opt-in for other features without setting
>>>>>>>>>>>>>> any feature
>>>>>>>>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>>>>>>>> locking for drivers
>>>>>>>>>>>>>> updating the GPU VA space within the fence signalling path.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an 
>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>> instance of this
>>>>>>>>>>>>>>      * particular combination. If not existent a new 
>>>>>>>>>>>>>> instance
>>>>>>>>>>>>>> is created and linked
>>>>>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>>>>>>> + * as entries for the &drm_gpuvm's lists of external and
>>>>>>>>>>>>>> evicted objects. Those
>>>>>>>>>>>>>> + * lists are maintained in order to accelerate locking of
>>>>>>>>>>>>>> dma-resv locks and
>>>>>>>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For
>>>>>>>>>>>>>> instance, all
>>>>>>>>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be
>>>>>>>>>>>>>> locked by calling
>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>>>>>>>>> also possible to lock
>>>>>>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>>>>>>> corresponding parameters to
>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec
>>>>>>>>>>>>>> loop while making
>>>>>>>>>>>>>> + * use of helper functions such as 
>>>>>>>>>>>>>> drm_gpuvm_prepare_range()
>>>>>>>>>>>>>> or
>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Every bound &drm_gem_object is treated as external 
>>>>>>>>>>>>>> object
>>>>>>>>>>>>>> when its &dma_resv
>>>>>>>>>>>>>> + * structure is different than the &drm_gpuvm's common
>>>>>>>>>>>>>> &dma_resv structure.
>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for the 
>>>>>>>>>>>>>> same
>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>>>>>>>> creations and destructions
>>>>>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances unique.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of external and
>>>>>>>>>>>>>> evicted objects are
>>>>>>>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>>>>>>>> iteration internally.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * However, drivers still need to ensure mutual exclusion of concurrent
>>>>>>>>>>>>>> calls to functions
>>>>>>>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function 
>>>>>>>>>>>>>> contains
>>>>>>>>>>>>>> a particular
>>>>>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Functions adding or removing entries from those lists,
>>>>>>>>>>>>>> such as
>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be
>>>>>>>>>>>>>> called with external
>>>>>>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>>>>>>> + * modified while potentially being iterated by
>>>>>>>>>>>>>> + * (safely) modified while potentially being iternated by
>>>>>>>>>>>>>> other API functions.
>>>>>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>>>>>      *   }
>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from get_next_vm_bo_from_list()
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>>>>>>> first element from
>>>>>>>>>>>>>> + * the list, so list insertion and deletion can happen
>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>> Are the list spinlocks needed for that async state update 
>>>>>>>>>>>>> from
>>>>>>>>>>>>> within the
>>>>>>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>>>>>
>>>>>>>>>>>>> Otherwise it should be sufficient to protect the lists 
>>>>>>>>>>>>> with the
>>>>>>>>>>>>> gpuvm's resv
>>>>>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>>>>>
>>>>>>>>>>>>> If those spinlocks are still needed in some situations, 
>>>>>>>>>>>>> perhaps
>>>>>>>>>>>>> could we
>>>>>>>>>>>>> have an option to set them to NULL (Like IIRC the maple tree
>>>>>>>>>>>>> allows for)?
>>>>>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>>>>>> holding only the dma-resv lock from the BO this function gets
>>>>>>>>>>>> called for. Hence,
>>>>>>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() calls 
>>>>>>>>>>>> with
>>>>>>>>>>>> different BOs.
>>>>>>>>>>> No. Only if you try to add external objects to the vm's 
>>>>>>>>>>> evict list
>>>>>>>>>>> from
>>>>>>>>>>> within the evict code. That's not necessary since you loop 
>>>>>>>>>>> through
>>>>>>>>>>> all
>>>>>>>>>>> external objects anyway when locking them so an "evicted" 
>>>>>>>>>>> bool in
>>>>>>>>>>> the vm_bo,
>>>>>>>>>>> protected by the bo resv would be sufficient. The extobj 
>>>>>>>>>>> locking
>>>>>>>>>>> loop can
>>>>>>>>>>> then add the bo to the evicted list.
>>>>>>>>>> And validate() can remove it while still holding all dma-resv 
>>>>>>>>>> locks,
>>>>>>>>>> neat!
>>>>>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>>>>>> concurrently? What
>>>>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>>>>>> drm_gpuva_unlink()?
>>>>>>>>>> Are we guaranteed that at this point of time the drm_gpuvm_bo 
>>>>>>>>>> is not
>>>>>>>>>> on the
>>>>>>>>>> evicted list? Because otherwise we would call 
>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>> with the
>>>>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>> might drop the last reference to the drm_gem_object and hence 
>>>>>>>>>> we'd
>>>>>>>>>> potentially
>>>>>>>>>> free the dma-resv lock while holding it, at least if it's an 
>>>>>>>>>> external
>>>>>>>>>> object.
>>>>>>>>> Easiest way in this scheme is to think of the lists as being 
>>>>>>>>> protected
>>>>>>>>> by the vm's resv lock. That means anybody calling unlink() 
>>>>>>>>> must also
>>>>>>>>> hold the vm's resv lock. (Which is OK from a UAF point of 
>>>>>>>>> view, but
>>>>>>>>> perhaps not from a locking inversion POV from an async list 
>>>>>>>>> update).
>>>>>>>> This would mean that on unlink() we'd need to hold the VM's 
>>>>>>>> resv lock and the
>>>>>>>> corresponding GEM's resv lock (in case they're not the same 
>>>>>>>> anyways) because the
>>>>>>>> VM's resv lock would protect the external / evicted object 
>>>>>>>> lists and the GEM
>>>>>>>> objects resv lock protects the GEM's list of drm_gpuvm_bos and the
>>>>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>>>>>
>>>>>>>>>>>> For extobjs an outer lock would be enough in case of Xe, but I
>>>>>>>>>>>> really would not
>>>>>>>>>>>> like to add even more complexity just to get the spinlock 
>>>>>>>>>>>> out of
>>>>>>>>>>>> the way in case
>>>>>>>>>>>> the driver already has an outer lock protecting this path.
>>>>>>>>>>> I must disagree here. These spinlocks and atomic operations are
>>>>>>>>>>> pretty
>>>>>>>>>>> costly and as discussed earlier this type of locking was the 
>>>>>>>>>>> reason
>>>>>>>>>>> (at
>>>>>>>>>>> least according to the commit message) that made Christian 
>>>>>>>>>>> drop the
>>>>>>>>>>> XArray
>>>>>>>>>>> use in drm_exec for the same set of objects: "The locking 
>>>>>>>>>>> overhead
>>>>>>>>>>> is
>>>>>>>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>>>>>>>> complexity and a
>>>>>>>>>>> single wide lock following the drm locking guidelines set 
>>>>>>>>>>> out by
>>>>>>>>>>> Daniel and
>>>>>>>>>>> David should really be the default choice with an opt-in for a
>>>>>>>>>>> spinlock if
>>>>>>>>>>> needed for async and pushing out to a wq is not an option.
>>>>>>>>>> For the external object list an outer lock would work as long 
>>>>>>>>>> as it's
>>>>>>>>>> not the
>>>>>>>>>> dma-resv lock of the corresponding GEM object, since here we 
>>>>>>>>>> actually
>>>>>>>>>> need to
>>>>>>>>>> remove the list entry from the external object list on
>>>>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>>>>> It's just a bit weird design wise that drivers would need to 
>>>>>>>>>> take
>>>>>>>>>> this outer
>>>>>>>>>> lock on:
>>>>>>>>>>
>>>>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also 
>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>>>>
>>>>>>>>>> Given that it seems reasonable to do all the required locking
>>>>>>>>>> internally.
>>>>>>>>>  From a design POV, there has been a clear direction in XE to 
>>>>>>>>> make
>>>>>>>>> things similar to mmap() / munmap(), so this outer lock, which 
>>>>>>>>> in Xe is
>>>>>>>>> an rwsem, is used in a similar way as the mmap_lock. It's 
>>>>>>>>> protecting
>>>>>>>>> the page-table structures and vma rb tree, the userptr 
>>>>>>>>> structures and
>>>>>>>>> the extobj list. Basically it's taken early in the exec IOCTL, 
>>>>>>>>> the
>>>>>>>>> VM_BIND ioctl, the compute rebind worker and the pagefault 
>>>>>>>>> handler, so
>>>>>>>>> all of the above are just asserting that it is taken in the 
>>>>>>>>> correct
>>>>>>>>> mode.
>>>>>>>>>
>>>>>>>>> But strictly with this scheme one could also use the vm's 
>>>>>>>>> dma_resv for
>>>>>>>>> the extobj list since with drm_exec, it's locked before 
>>>>>>>>> traversing the
>>>>>>>>> list.
>>>>>>>>>
>>>>>>>>> The whole point of this scheme is to rely on locks that you 
>>>>>>>>> already are
>>>>>>>>> supposed to be holding for various reasons and is simple to 
>>>>>>>>> comprehend.
>>>>>>>> I don't agree that we're supposed to hold the VM's resv lock 
>>>>>>>> anyways for
>>>>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but 
>>>>>>>> I'm fine using it
>>>>>>>> for that purpose nevertheless.
>>>>>>>>
>>>>>>>>>> In order to at least place lockdep checks, the driver would 
>>>>>>>>>> need to
>>>>>>>>>> supply the
>>>>>>>>>> corresponding lock's lockdep_map, because the GPUVM otherwise 
>>>>>>>>>> doesn't
>>>>>>>>>> know about
>>>>>>>>>> the lock.
>>>>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>>>> I'd really like to avoid that, especially now that everything 
>>>>>>>> got simpler. We
>>>>>>>> should define the actual locks to take instead.
>>>>>>>>
>>>>>>>>>> Out of curiosity, what is the overhead of a spin_lock() that 
>>>>>>>>>> doesn't
>>>>>>>>>> need to
>>>>>>>>>> spin?
>>>>>>>>> I guess it's hard to tell exactly, but it is much lower on 
>>>>>>>>> modern x86
>>>>>>>>> than what it used to be. Not sure about ARM, which is the other
>>>>>>>>> architecture important to us. I figure if there is little 
>>>>>>>>> cache-line
>>>>>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>>>>>
>>>>>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>>>>>
>>>>>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>>>>>>>                                  spinlock_t *lock)
>>>>>>>>>>> {
>>>>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>>>>          spin_lock(lock);
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>>>> For such drivers, that would require anybody calling 
>>>>>>>>>>>>> unlink to
>>>>>>>>>>>>> hold the vm's
>>>>>>>>>>>>> resv, though.
>>>>>>>>>>>> In V4 I want to go back to having a dedicated lock for the 
>>>>>>>>>>>> GEMs
>>>>>>>>>>>> gpuva list (or
>>>>>>>>>>>> VM_BO list to be more precise). We can't just use the dma-resv
>>>>>>>>>>>> lock for that
>>>>>>>>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>>>>>>>>> otherwise wouldn't
>>>>>>>>>>>> be allowed to already hold the dma-resv lock. That's the fix I
>>>>>>>>>>>> was referring to
>>>>>>>>>>>> earlier.
>>>>>>>>>>> Yeah, I can see the need for a dedicated lock for the GEM's 
>>>>>>>>>>> gpuva
>>>>>>>>>>> list, but
>>>>>>>>>>> holding the vm's dma-resv lock across the unlink shouldn't be a
>>>>>>>>>>> problem. We
>>>>>>>>>>> may free the object and a pointer to the vm's resv during 
>>>>>>>>>>> unlink
>>>>>>>>>>> but we
>>>>>>>>>>> don't free the vm's resv.  It'd be a matter of ensuring that 
>>>>>>>>>>> any
>>>>>>>>>>> calls to
>>>>>>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>>>>>> Drivers calling unlink() from the fence signaling path can't 
>>>>>>>>>> use the
>>>>>>>>>> VM's
>>>>>>>>>> dma-resv lock.
>>>>>>>>> Yes, that made me a bit curious because in the current version 
>>>>>>>>> the code
>>>>>>>>> required the object's dma_resv for unlink() which can't be 
>>>>>>>>> grabbed
>>>>>>>>> either from the fence signaling path. So are there any drivers 
>>>>>>>>> actually
>>>>>>>>> wanting to do that? If so, they will either need to resort to the
>>>>>>>>> current spinlock solution or they will need to call unlink from a
>>>>>>>>> workqueue item.
>>>>>>>> As Boris already mentioned we have the dma-resv lock by default 
>>>>>>>> or a driver
>>>>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of the 
>>>>>>>> latter.
>>>>>>>>
>>>>>>>>>> Also, what if the object is an external object? We can't use 
>>>>>>>>>> the VM's
>>>>>>>>>> dma-resv
>>>>>>>>>> lock here.
>>>>>>>>> Why? Typically (sync) unlink is only ever called from an 
>>>>>>>>> unbind-like
>>>>>>>>> operation where it should be trivial to grab the vm's resv. 
>>>>>>>>> Or, for
>>>>>>>>> that matter any outer lock protecting the extobj list. Rule 
>>>>>>>>> would be
>>>>>>>>> the drm_gpuvm_bo::entry::extobj  and 
>>>>>>>>> drm_gpuvm_bo::entry::evict would
>>>>>>>>> be protected by either the vm's dma_resv (or possibly an outer 
>>>>>>>>> lock in
>>>>>>>>> the case of the extobj list).
>>>>>>>> Outer lock wouldn't have been working for updates in the async 
>>>>>>>> path, but
>>>>>>>> shouldn't be relevant anymore. We could use the VM's resv for 
>>>>>>>> that.
>>>>>>>>
>>>>>>>>>>   And we can't have the GEM objs dma-resv lock held when calling
>>>>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>>>>>>>>> refcount drops
>>>>>>>>>> to zero calls drm_gpuvm_bo_destroy() and 
>>>>>>>>>> drm_gpuvm_bo_destroy() might
>>>>>>>>>> drop the
>>>>>>>>>> last reference of the GEM object.
>>>>>>>>> Yes, but this is a different problem as to what exactly protects
>>>>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal 
>>>>>>>>> per bo list
>>>>>>>>> lock, or if we want to keep the bo's dma_resv we need to 
>>>>>>>>> ensure that
>>>>>>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts 
>>>>>>>>> its obj
>>>>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's 
>>>>>>>>> refcount (I know
>>>>>>>>> Boris didn't like that, but requiring an explicit refcount for a
>>>>>>>>> pointer you dereference unless you're under a lock that 
>>>>>>>>> ensures keeping
>>>>>>>>> the object alive is pretty much required?) But anyway for the
>>>>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal 
>>>>>>>>> spinlock)
>>>>>>>>> I don't have a strong preference.
>>>>>>>> We can keep the GEM objects dma-resv lock, however as mentioned 
>>>>>>>> above
>>>>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both 
>>>>>>>> the VM's resv lock
>>>>>>>> and the GEM's resv lock in case they differ.
>>>>>>>>
>>>>>>>>>>   All those problems go away with a dedicated
>>>>>>>>>> GEM gpuva list lock.
>>>>>>>>> I don't think these are real problems.
>>>>>>>>> With the exception of the eviction list "trick" where we 
>>>>>>>>> currently have
>>>>>>>>> slightly different approach to collect external bos needing 
>>>>>>>>> rebinding,
>>>>>>>>> we have this working fine.
>>>>>>>>>
>>>>>>>>> TBH I think pretty much the only situation where the spinlock 
>>>>>>>>> is needed
>>>>>>>>> is for async updates of these lists, unless a wq item can be 
>>>>>>>>> used for
>>>>>>>>> that, but it doesn't really seem like the current code allows 
>>>>>>>>> for such
>>>>>>>>> updates anyway? It complicates the code a lot, adds overhead 
>>>>>>>>> and also
>>>>>>>>> adds the requirement for refcounting during list traversal.
>>>>>>>>>
>>>>>>>>> /Thomas
>>>>>>>>>
>>>>>>>>>>> /Thomas
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> It seems that with that also the refcount could be make non-
>>>>>>>>>>>>> atomic.
>>>>>>>>>>>>>
>>>>>>>>>>>>> All in the spirit of the drm locking guidelines "use big 
>>>>>>>>>>>>> locks
>>>>>>>>>>>>> when
>>>>>>>>>>>>> possible".
>>>>>>>>>>>>> Lower level locks only when necessary for performance or
>>>>>>>>>>>>> locking inversion?
>>>>>>>>>>>>>
>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>>>>>>>>> local list, so removal
>>>>>>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>>>>>>> iterating the list.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo) \
>>>>>>>>>>>>>> +       ({                                                                      \
>>>>>>>>>>>>>> +               struct drm_gpuvm_bo *__vm_bo;                                   \
>>>>>>>>>>>>>> +                                                                               \
>>>>>>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);                                 \
>>>>>>>>>>>>>> +                                                                               \
>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
>>>>>>>>>>>>>> +               while (!list_empty(&(__gpuvm)->__list_name.list)) {             \
>>>>>>>>>>>>>> +                       __vm_bo = list_first_entry(&(__gpuvm)->__list_name.list, \
>>>>>>>>>>>>>> +                                                  struct drm_gpuvm_bo,         \
>>>>>>>>>>>>>> +                                                  list.entry.__list_name);     \
>>>>>>>>>>>>>> +                       if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {            \
>>>>>>>>>>>>>> +                               list_move_tail(&(__vm_bo)->list.entry.__list_name, \
>>>>>>>>>>>>>> +                                              __local_list);                   \
>>>>>>>>>>>>>> +                               break;                                          \
>>>>>>>>>>>>>> +                       } else {                                                \
>>>>>>>>>>>>>> +                               list_del_init(&(__vm_bo)->list.entry.__list_name); \
>>>>>>>>>>>>>> +                               __vm_bo = NULL;                                 \
>>>>>>>>>>>>>> +                       }                                                       \
>>>>>>>>>>>>>> +               }                                                               \
>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
>>>>>>>>>>>>>> +                                                                               \
>>>>>>>>>>>>>> +               __vm_bo;                                                        \
>>>>>>>>>>>>>> +       })
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>> + * iterator releases the lock immediately after picking the
>>>>>>>>>>>>>> first element from the
>>>>>>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Typical use:
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + *     ret = 0;
>>>>>>>>>>>>>> + *     for_each_vm_bo_in_list(gpuvm, <list_name>, &my_local_list, vm_bo) {
>>>>>>>>>>>>>> + *             ret = do_something_with_vm_bo(..., vm_bo);
>>>>>>>>>>>>>> + *             if (ret)
>>>>>>>>>>>>>> + *                     break;
>>>>>>>>>>>>>> + *     }
>>>>>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>> + *     restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Only used for internal list iterations, not meant to be
>>>>>>>>>>>>>> exposed to the outside
>>>>>>>>>>>>>> + * world.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)    \
>>>>>>>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>>>>> +                                               __local_list, NULL);            \
>>>>>>>>>>>>>> +            __vm_bo;                                                           \
>>>>>>>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>>>>> +                                               __local_list, __vm_bo))
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to their
>>>>>>>>>>>>>> original list
>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to store
>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * When we're done iterating a vm_bo list, we should call
>>>>>>>>>>>>>> restore_vm_bo_list()
>>>>>>>>>>>>>> + * to restore the original state and let new iterations 
>>>>>>>>>>>>>> take
>>>>>>>>>>>>>> place.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)                 \
>>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>>> +               /* Merge back the two lists, moving local list elements to the  \
>>>>>>>>>>>>>> +                * head to preserve previous ordering, in case it matters.      \
>>>>>>>>>>>>>> +                */                                                             \
>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
>>>>>>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)->__list_name.list);        \
>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given list
>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Inserts the given @__vm_bo into the list specified by @__list_name and
>>>>>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                            \
>>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>>>>> +               if (list_empty(&(__vm_bo)->list.entry.__list_name))             \
>>>>>>>>>>>>>> +                       list_add_tail(&(__vm_bo)->list.entry.__list_name,       \
>>>>>>>>>>>>>> +                                     &(__vm_bo)->vm->__list_name.list);        \
>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given list
>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>> + * @__list_name: the name of the list to remove from
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Removes the given @__vm_bo from the list specified by @__list_name and
>>>>>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)                            \
>>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>>>>> +               if (!list_empty(&(__vm_bo)->list.entry.__list_name))            \
>>>>>>>>>>>>>> +                       list_del_init(&(__vm_bo)->list.entry.__list_name);      \
>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), 
>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>> struct drm_device *drm,
>>>>>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
>>>>>>>>>>>>>> *gpuvm)
>>>>>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>>>>>>               "GPUVA tree is not empty, potentially leaking
>>>>>>>>>>>>>> memory.\n");
>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects 
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> given
>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Using this function directly, it is the driver's responsibility to call
>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Note: This function is safe against concurrent insertion
>>>>>>>>>>>>>> and removal of
>>>>>>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>>>>>>> concurrent usage itself.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>>>>>>>>> either an outer VM lock
>>>>>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this 
>>>>>>>>>>>>>> function
>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's
>>>>>>>>>>>>>> dma-resv lock ensures
>>>>>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>>>>>>>> +                         unsigned int num_fences)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, vm_bo->obj,
>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped 
>>>>>>>>>>>>>> within
>>>>>>>>>>>>>> a given range
>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>>>>>>>>> mapped between @addr
>>>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>>>> drm_exec *exec,
>>>>>>>>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
>>>>>>>>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>> +                       return ret;
>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>>>>>> +
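For the open-coded variant mentioned above, a range-based lock loop might look roughly like
this sketch; gpuvm, addr and range are assumed to be provided by the caller:

    struct drm_exec exec;
    int ret;

    drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
    drm_exec_until_all_locked(&exec) {
            ret = drm_gpuvm_prepare_range(gpuvm, &exec, addr, range, 1);
            drm_exec_retry_on_contention(&exec);
            if (ret)
                    goto err;
    }

    /* ... update the VA range, add fences ... */

    drm_exec_fini(&exec);
    return 0;

err:
    drm_exec_fini(&exec);
    return ret;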
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all associated BOs
>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>>>>>>> given
>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Additionally, when calling this function with struct drm_gpuvm_exec::extra
>>>>>>>>>>>>>> + * being set the driver receives the given @fn callback to lock additional
>>>>>>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec instance. Typically, drivers
>>>>>>>>>>>>>> + * would call drm_exec_prepare_obj() from within this callback.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>> +                       drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>> +                       if (ret)
>>>>>>>>>>>>>> +                               goto err;
>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>>>>>>> +
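
As a rough usage sketch (hypothetical driver code, assuming the structures
introduced by this patch), the common case without the extra callback would
look like:

    struct drm_gpuvm_exec vm_exec = {
            .vm = gpuvm,
    };
    int ret;

    /* Locks the VM's common dma-resv and all external objects' dma-resv. */
    ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
    if (ret)
            return ret;

    /* ... validate evicted BOs, submit the job, add fences ... */

    drm_gpuvm_exec_unlock(&vm_exec);

Note that drm_gpuvm_exec_lock() already calls drm_exec_fini() in its error
path, so the caller only unlocks on success.
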
>>>>>>>>>>>>>> +static int
>>>>>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int
>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args->objs,
>>>>>>>>>>>>>> +                                     args->num_objs, num_fences);
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all associated BOs
>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>>>>>> + * @num_objs: the number of additional &drm_gem_objects to
>>>>>>>>>>>>>> lock
>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>>>>>>> given &drm_gpuvm
>>>>>>>>>>>>>> + * contains mappings of, plus the ones given through @objs.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>>>>>>>> +                         unsigned int num_objs,
>>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>> +       } args;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>>>>>>>> interruptible);
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>>>>>>> +
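
Reusing the vm_exec from the sketch above, locking additional driver BOs
(bo_a and bo_b are hypothetical) boils down to:

    struct drm_gem_object *extra[] = { bo_a, bo_b };

    ret = drm_gpuvm_exec_lock_array(&vm_exec, extra, ARRAY_SIZE(extra),
                                    1, true);
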
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>>>>>>>> within a given range
>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>>>>>>>> mapped between @addr and
>>>>>>>>>>>>>> + * @addr + @range.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec, addr, range,
>>>>>>>>>>>>>> +                                             num_fences);
>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as 
>>>>>>>>>>>>>> evicted
>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>>>>>>>>> evicted buffer
>>>>>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>>>>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>>>>>>> +
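
To make use of drm_gpuvm_validate(), a TTM based driver would hook up the
bo_validate callback roughly as sketched below; my_bo_validate, my_gpuvm_ops
and my_placement() are hypothetical driver names:

    static int my_bo_validate(struct drm_gem_object *obj)
    {
            struct ttm_buffer_object *bo =
                    container_of(obj, struct ttm_buffer_object, base);
            struct ttm_operation_ctx ctx = { .interruptible = true };

            /* Move the evicted BO back to a GPU accessible placement. */
            return ttm_bo_validate(bo, my_placement(bo), &ctx);
    }

    static const struct drm_gpuvm_ops my_gpuvm_ops = {
            .bo_validate = my_bo_validate,
    };

The ops structure is passed to drm_gpuvm_init(), and drm_gpuvm_validate() is
then called with all relevant dma-resv locks held, e.g. between
drm_gpuvm_exec_lock() and drm_gpuvm_exec_unlock().
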
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and all
>>>>>>>>>>>>>> extobj
>>>>>>>>>>>>>> + * dma-resv
>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>>>>>> +       unsigned long index;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm, obj) ?
>>>>>>>>>>>>>> +                                  extobj_usage : private_usage);
>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>>>>>> +
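
For illustration, after pushing a job a driver might publish the job's fence
to all locked dma-resv like this (job_fence and the particular usage values
are placeholders; which dma_resv_usage to pick for private vs. external
objects is driver policy):

    drm_gpuvm_resv_add_fence(gpuvm, &vm_exec.exec, job_fence,
                             DMA_RESV_USAGE_BOOKKEEP,
                             DMA_RESV_USAGE_WRITE);
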
>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of 
>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>>>>>          return vm_bo;
>>>>>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref 
>>>>>>>>>>>>>> *kref)
>>>>>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref 
>>>>>>>>>>>>>> *kref)
>>>>>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>>>>>>>>>>>>      *
>>>>>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * If the reference count drops to zero, the &drm_gpuvm_bo is destroyed, which
>>>>>>>>>>>>>> + * includes removing it from the GEM's gpuva list. Hence, if a call to this
>>>>>>>>>>>>>> + * function can potentially let the reference count drop to zero, the caller
>>>>>>>>>>>>>> + * must hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>     void
>>>>>>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
>>>>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>     EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its &drm_gpuvm's
>>>>>>>>>>>>>> + * extobj list
>>>>>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's extobj list.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if it is not on the
>>>>>>>>>>>>>> + * list already and if the corresponding &drm_gem_object actually is an
>>>>>>>>>>>>>> + * external object.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to
>>>>>>>>>>>>>> / from a
>>>>>>>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from the evicted list of all
>>>>>>>>>>>>>> + * &drm_gpuvms containing a mapping of this &drm_gem_object.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>>>>>> +               if (evict)
>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, evict);
>>>>>>>>>>>>>> +               else
>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, evict);
>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>>>>>>> +
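
A sketch of how a TTM based driver might call this from its move callback
(simplified; whether the move really invalidates the GPU mappings depends on
the new placement, and the prototype shown assumes the current
ttm_device_funcs::move signature):

    static int my_bo_move(struct ttm_buffer_object *bo, bool evict,
                          struct ttm_operation_ctx *ctx,
                          struct ttm_resource *new_mem,
                          struct ttm_place *hop)
    {
            /* ... perform the actual move ... */

            /* The BO's dma-resv is held here, as required. */
            drm_gpuvm_bo_evict(&bo->base, evict);
            return 0;
    }
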
>>>>>>>>>>>>>>     static int
>>>>>>>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>>>>>           * space
>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos serving as
>>>>>>>>>>>>>> +                * external objects
>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>> +                * @lock: spinlock to protect the extobj 
>>>>>>>>>>>>>> list
>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>> +       } extobj;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>> +        * @evict: structure holding the evict list and 
>>>>>>>>>>>>>> evict
>>>>>>>>>>>>>> list lock
>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>>>>>>> currently being
>>>>>>>>>>>>>> +                * evicted
>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>> +                * @lock: spinlock to protect the evict list
>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>> +       } evict;
>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>>>> drm_device *drm,
>>>>>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>>>>>>>>> &drm_gem_object is an
>>>>>>>>>>>>>> + * external object
>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv differs
>>>>>>>>>>>>>> from the
>>>>>>>>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>> +                                       struct drm_gem_object *obj)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, 
>>>>>>>>>>>>>> gpuvm__)
>>>>>>>>>>>>>> \
>>>>>>>>>>>>>>          list_for_each_entry_safe(va__, next__, &(gpuvm__)-
>>>>>>>>>>>>>>> rb.list, rb.entry)
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>>>>>>> &drm_exec should be.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Optionally, @extra can be set in order to lock 
>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>> &drm_gem_objects.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>> +        * @extra: Callback and corresponding private data
>>>>>>>>>>>>>> for the driver to
>>>>>>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>> +                         unsigned int num_fences);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>>>>>>>>> callback
>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>> +               void *priv;
>>>>>>>>>>>>>> +       } extra;
>>>>>>>>>>>>>> +};
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
>>>>>>>>>>>>>> resv
>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>>>>>>>>> &drm_gem_object.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Using this function directly, it is the driver's responsibility to call
>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +static inline int
>>>>>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>>>> +                             unsigned int num_fences);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>>>>>>> +                           unsigned int num_fences);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>>>>>>>>> +                             unsigned int num_objs,
>>>>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
>>>>>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence() - add fence to private and all extobj dma-resv
>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>> extobj_usage)
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
>>>>>>>>>>>>>> +                                private_usage, extobj_usage);
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>>>>>                           * gpuva list.
>>>>>>>>>>>>>>                           */
>>>>>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>> +                        * @extobj: List entry to attach to the &drm_gpuvm's
>>>>>>>>>>>>>> +                        * extobj list.
>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>> +                        * @evict: List entry to attach to
>>>>>>>>>>>>>> the &drm_gpuvms evict
>>>>>>>>>>>>>> +                        * list.
>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>>>>>                  } entry;
>>>>>>>>>>>>>>          } list;
>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a
>>>>>>>>>>>>>> list of &drm_gpuva
>>>>>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>>>>>>>> iteration step
>>>>>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>>>>>           * used.
>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void
>>>>>>>>>>>>>> *priv);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>> +        * Drivers receive this callback for every evicted
>>>>>>>>>>>>>> &drm_gem_object being
>>>>>>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>>>>>>>> specific variant of
>>>>>>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void 
>>>>>>>>>>>>>> *priv,
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-19 15:23                             ` Thomas Hellström
@ 2023-09-20  5:37                               ` Christian König
  2023-09-20  7:44                                 ` Thomas Hellström
  0 siblings, 1 reply; 77+ messages in thread
From: Christian König @ 2023-09-20  5:37 UTC (permalink / raw)
  To: Thomas Hellström, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

On 19.09.23 at 17:23, Thomas Hellström wrote:
>
> On 9/19/23 17:16, Danilo Krummrich wrote:
>> On 9/19/23 14:21, Thomas Hellström wrote:
>>> Hi Christian
>>>
>>> On 9/19/23 14:07, Christian König wrote:
>>>> On 13.09.23 at 17:46, Danilo Krummrich wrote:
>>>>> On 9/13/23 17:33, Christian König wrote:
>>>>>> On 13.09.23 at 17:15, Danilo Krummrich wrote:
>>>>>>> On 9/13/23 16:26, Christian König wrote:
>>>>>>>> On 13.09.23 at 14:16, Danilo Krummrich wrote:
>>>>>>>>> As mentioned in a different mail thread, the reply is based on 
>>>>>>>>> the assumption
>>>>>>>>> that we don't support anything else than GPUVM updates from 
>>>>>>>>> the IOCTL.
>>>>>>>>
>>>>>>>> I think that this assumption is incorrect.
>>>>>>>
>>>>>>> Well, more precisely I should have said "don't support GPUVM 
>>>>>>> updated from within
>>>>>>> fence signaling critical sections". And looking at the code, 
>>>>>>> that doesn't seem what
>>>>>>> you're doing there.
>>>>>>>
>>>>>>>>
>>>>>>>> Vulkan is just one specific use case, but this here should 
>>>>>>>> probably be able to handle other use cases as well.
>>>>>>>>
>>>>>>>> Especially with HMM you get the requirement that you need to be 
>>>>>>>> able to invalidate GPUVM mappings without grabbing a 
>>>>>>>> reservation lock.
>>>>>>>
>>>>>>> What do you mean with "invalidate GPUVM mappings" in this 
>>>>>>> context? drm_gpuvm_bo_evict()
>>>>>>> should only be called from a ttm_device_funcs::move callback, 
>>>>>>> we should hold the dma-resv
>>>>>>> lock there.
>>>>>>
>>>>>> Well the question is which dma-resv lock do we hold?
>>>>>>
>>>>>> In the move callback we only hold the dma-resv lock of the BO 
>>>>>> which is moved, but when that is a shared BO then that's not the 
>>>>>> same as the one for the VM.
>>>>>
>>>>> Correct, Thomas' idea was to use the GEM's dma_resv lock to 
>>>>> protect drm_gpuvm_bo::evicted
>>>>> and then actually move the drm_gpuvm_bo to the VM's evicted list 
>>>>> once we grabbed all
>>>>> dma-resv locks when locking the VM's BOs using drm_exec. We can 
>>>>> remove them from the evicted
>>>>> list on validate(). This way we never touch the evicted list 
>>>>> without holding at least the VM's
>>>>> dma-resv lock.
>>>>>
>>>>> Do you have any concerns about that?
>>>>
>>>> Scratching my head a bit how that is supposed to work.
>>>>
>>>> This implies that you go over all the evicted BOs during validation 
>>>> and not just the one mentioned in the CS.
>>>>
>>>> That might work for Vulkan, but is pretty much a no-go for OpenGL.
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> See what the eviction lock in amdgpu is doing for example.
>>>>>>>
>>>>>>> The eviction_lock seems to protect a VM state "evicting" of 
>>>>>>> whether any BO that
>>>>>>> is associated with the VM is currently evicting. At the same 
>>>>>>> time amdgpu protects
>>>>>> the evicted list of the VM with a different lock. So this seems 
>>>>>>> to be entirely
>>>>>>> unrelated. Tracking a "currently evicting" state is not part of 
>>>>>>> the GPUVM
>>>>>>> implementation currently and hence nothing would change for 
>>>>>>> amdgpu there.
>>>>>>
>>>>>> Sorry for the confusion we use different terminology in amdgpu.
>>>>>>
>>>>>> The eviction lock and evicted state is for the VM page tables, 
>>>>>> e.g. if the whole VM is currently not used and swapped out or 
>>>>>> even de-allocated.
>>>>>>
>>>>>> This is necessary because we have cases where we need to access 
>>>>>> the VM data without holding the dma-resv lock of this VM. 
>>>>>> Especially figuring out which parts of an address space contain 
>>>>>> mappings and which doesn't.
>>>>>
>>>>> I think this is fine, this has nothing to do with lists of evicted 
>>>>> GEM objects or external GEM
>>>>> objects, right? Marking mappings (drm_gpuva) as invalidated 
>>>>> (DRM_GPUVA_INVALIDATED) or accessing
>>>>> the VA space does not require any dma-resv locks.
>>>>
>>>> I hope so, but I'm not 100% sure.
>>>>
>>>>>
>>>>>>
>>>>>> This is a requirement which comes with HMM handling, you won't 
>>>>>> see this with Vulkan (or OpenGL, VAAPI etc..).
>>>>>>
>>>>>>
>>>>>> The invalidation lock on the other hand is what in this 
>>>>>> discussion is called eviction lock. This one is needed because 
>>>>>> what I wrote above, during the move callback only the dma-resv of 
>>>>>> the BO which is moved is locked, but not necessarily the dma-resv 
>>>>>> of the VM.
>>>>>
>>>>> That's yet another thing, right? This is used to track whether 
>>>>> *any* BO that belongs to the VM is
>>>>> currently being evicted, correct? As mentioned, as by now this is 
>>>>> not supported in GPUVM and hence
>>>>> would be the same driver specific code with the same driver 
>>>>> specifc lock.
>>>>
>>>> That is most likely a show stopper using this for OpenGL based 
>>>> workloads as far as I can see. For those you need to be able to figure 
>>>> out which non-VM BOs have been evicted and which parts of the VM 
>>>> needs updates.
>>>
>>> We identify those with a bool in the gpuvm_bo, and that bool is 
>>> protected by the bo_resv. In essence, the "evicted" list must be 
>>> made up-to-date with all relevant locks held before traversing in 
>>> the next exec.
>>
>> What I still miss with this idea is how do we find all the 
>> drm_gpuvm_bo structures with the evicted bool set to true? When doing 
>> the drm_exec dance we come across all external ones and can add them 
>> to the list if needed, but what about the BOs having the VM's dma-resv?
>
> Oh, they can be added to the evict list directly (no bool needed) in 
> the eviction code, like in v3. Since for those we indeed hold the VM's 
> dma_resv since it's aliased with the object's dma-resv.

Yeah, I wanted to note what Danilo seems to think about as well. How do 
we figure out the non-VM BOs evicted?

We can't walk over the list of all non-VM BOs on every submission, 
that's too much overhead for cases with lots of non-VM BOs.

And we can't rely on userspace sending the full list of non-VM BOs in use down 
to the kernel with each submission.

Regards,
Christian.

>
> /Thomas
>
>
>
>>
>>>
>>> If you mean that we need to unbind all vmas of all vms of evicted 
>>> bos before evicting, we don't do that, at least not in Xe, since when 
>>> evicting we wait for VM idle, and it can't access anything through 
>>> the stale vmas until they have been revalidated and rebound.
>>>
>>> /Thomas
>>>
>>>
>>>
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström wrote:
>>>>>>>>>> Hi!
>>>>>>>>>>
>>>>>>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>>>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström 
>>>>>>>>>>> wrote:
>>>>>>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas Hellström 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Hi, Danilo,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>>>>>>> So far the DRM GPUVA manager offers common 
>>>>>>>>>>>>>>> infrastructure to
>>>>>>>>>>>>>>> track GPU VA
>>>>>>>>>>>>>>> allocations and mappings, generically connect GPU VA 
>>>>>>>>>>>>>>> mappings
>>>>>>>>>>>>>>> to their
>>>>>>>>>>>>>>> backing buffers and perform more complex mapping operations
>>>>>>>>>>>>>>> on the GPU VA
>>>>>>>>>>>>>>> space.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>>>>>>>>> drivers, which
>>>>>>>>>>>>>>> can potentially be generalized in order to make the DRM 
>>>>>>>>>>>>>>> GPUVA
>>>>>>>>>>>>>>> manager
>>>>>>>>>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>>>>>>>>>> this patch aims
>>>>>>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not being used
>>>>>>>>>>>>>>> outside of
>>>>>>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>>>>>>>>> which are
>>>>>>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3) Provide functions to efficiently lock all GEM objects 
>>>>>>>>>>>>>>> dma-
>>>>>>>>>>>>>>> resv the
>>>>>>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>>>>>>>> contains mappings
>>>>>>>>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>>>>>>>>> accelerated.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 5) Provide some convenience functions for common patterns.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Rather than being designed as a "framework", the target 
>>>>>>>>>>>>>>> is to
>>>>>>>>>>>>>>> make all
>>>>>>>>>>>>>>> features appear as a collection of optional helper 
>>>>>>>>>>>>>>> functions,
>>>>>>>>>>>>>>> such that
>>>>>>>>>>>>>>> drivers are free to make use of the DRM GPUVA managers 
>>>>>>>>>>>>>>> basic
>>>>>>>>>>>>>>> functionality and opt-in for other features without setting
>>>>>>>>>>>>>>> any feature
>>>>>>>>>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>>>>>>>>> locking for drivers
>>>>>>>>>>>>>>> updating the GPU VA space within the fence signalling path.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an 
>>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>>> instance of this
>>>>>>>>>>>>>>>      * particular combination. If not existent a new 
>>>>>>>>>>>>>>> instance
>>>>>>>>>>>>>>> is created and linked
>>>>>>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and
>>>>>>>>>>>>>>> evicted objects. Those
>>>>>>>>>>>>>>> + * lists are maintained in order to accelerate locking of
>>>>>>>>>>>>>>> dma-resv locks and
>>>>>>>>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. 
>>>>>>>>>>>>>>> For
>>>>>>>>>>>>>>> instance the all
>>>>>>>>>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm 
>>>>>>>>>>>>>>> can be
>>>>>>>>>>>>>>> locked by calling
>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>>>>>>>>>> also possible to lock
>>>>>>>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>>>>>>>> corresponding parameters to
>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the 
>>>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>>>> loop while making
>>>>>>>>>>>>>>> + * use of helper functions such as 
>>>>>>>>>>>>>>> drm_gpuvm_prepare_range()
>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Every bound &drm_gem_object is treated as external 
>>>>>>>>>>>>>>> object
>>>>>>>>>>>>>>> when its &dma_resv
>>>>>>>>>>>>>>> + * structure is different than the &drm_gpuvm's common
>>>>>>>>>>>>>>> &dma_resv structure.
>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for the 
>>>>>>>>>>>>>>> same
>>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>>>>>>>>> creations and destructions
>>>>>>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances unique.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of external 
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> evicted objects are
>>>>>>>>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>>>>>>>>> iteration internally.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * However, drivers still need ensure to protect 
>>>>>>>>>>>>>>> concurrent
>>>>>>>>>>>>>>> calls to functions
>>>>>>>>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function 
>>>>>>>>>>>>>>> contains
>>>>>>>>>>>>>>> a particular
>>>>>>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Functions adding or removing entries from those lists,
>>>>>>>>>>>>>>> such as
>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() 
>>>>>>>>>>>>>>> may be
>>>>>>>>>>>>>>> called with external
>>>>>>>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>>>>>>>> corresponding list to be
>>>>>>>>>>>>>>> + * (safely) modified while potentially being iterated by
>>>>>>>>>>>>>>> other API functions.
>>>>>>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>>>>>>      *   }
>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to 
>>>>>>>>>>>>>>> store
>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>>>>>>>>> drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>> + * iterator releases the lock immediately after picking 
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> first element from
>>>>>>>>>>>>>>> + * the list, so list insertion deletion can happen
>>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>>> Are the list spinlocks needed for that async state update 
>>>>>>>>>>>>>> from
>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Otherwise it should be sufficient to protect the lists 
>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>> gpuvm's resv
>>>>>>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If those spinlocks are still needed in some situations, 
>>>>>>>>>>>>>> perhaps
>>>>>>>>>>>>>> could we
>>>>>>>>>>>>>> have an option to set them to NULL (Like IIRC the maple tree
>>>>>>>>>>>>>> allows for)?
>>>>>>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>>>>>>> holding only the dma-resv lock from the BO this function gets
>>>>>>>>>>>>> called for. Hence,
>>>>>>>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() 
>>>>>>>>>>>>> calls with
>>>>>>>>>>>>> different BOs.
>>>>>>>>>>>> No. Only if you try to add external objects to the vm's 
>>>>>>>>>>>> evict list
>>>>>>>>>>>> from
>>>>>>>>>>>> within the evict code. That's not necessary since you loop 
>>>>>>>>>>>> through
>>>>>>>>>>>> all
>>>>>>>>>>>> external objects anyway when locking them so an "evicted" 
>>>>>>>>>>>> bool in
>>>>>>>>>>>> the vm_bo,
>>>>>>>>>>>> protected by the bo resv would be sufficient. The extobj 
>>>>>>>>>>>> locking
>>>>>>>>>>>> loop can
>>>>>>>>>>>> then add the bo to the evicted list.
>>>>>>>>>>> And validate() can remove it while still holding all 
>>>>>>>>>>> dma-resv locks,
>>>>>>>>>>> neat!
>>>>>>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>>>>>>> concurrently? What
>>>>>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>>>>>>> drm_gpuva_unlink()?
>>>>>>>>>>> Are we guaranteed that at this point of time the 
>>>>>>>>>>> drm_gpuvm_bo is not
>>>>>>>>>>> on the
>>>>>>>>>>> evicted list? Because otherwise we would call 
>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>> with the
>>>>>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>> might drop the last reference to the drm_gem_object and 
>>>>>>>>>>> hence we'd
>>>>>>>>>>> potentially
>>>>>>>>>>> free the dma-resv lock while holding it, at least if it's an 
>>>>>>>>>>> external
>>>>>>>>>>> object.
>>>>>>>>>> Easiest way in this scheme is to think of the lists as being 
>>>>>>>>>> protected
>>>>>>>>>> by the vm's resv lock. That means anybody calling unlink() 
>>>>>>>>>> must also
>>>>>>>>>> hold the vm's resv lock. (Which is OK from an UAF point of 
>>>>>>>>>> view, but
>>>>>>>>>> perhaps not from a locking inversion POV from an async list 
>>>>>>>>>> update).
>>>>>>>>> This would mean that on unlink() we'd need to hold the VM's 
>>>>>>>>> resv lock and the
>>>>>>>>> corresponding GEM's resv lock (in case they're not the same 
>>>>>>>>> anyways) because the
>>>>>>>>> VM's resv lock would protect the external / evicted object 
>>>>>>>>> lists and the GEM
>>>>>>>>> objects resv lock protects the GEM's list of drm_gpuvm_bos and 
>>>>>>>>> the
>>>>>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>>>>>>
>>>>>>>>>>>>> For extobjs an outer lock would be enough in case of Xe, 
>>>>>>>>>>>>> but I
>>>>>>>>>>>>> really would not
>>>>>>>>>>>>> like to add even more complexity just to get the spinlock 
>>>>>>>>>>>>> out of
>>>>>>>>>>>>> the way in case
>>>>>>>>>>>>> the driver already has an outer lock protecting this path.
>>>>>>>>>>>> I must disagree here. These spinlocks and atomic operations 
>>>>>>>>>>>> are
>>>>>>>>>>>> pretty
>>>>>>>>>>>> costly and as discussed earlier this type of locking was 
>>>>>>>>>>>> the reason
>>>>>>>>>>>> (at
>>>>>>>>>>>> least according to the commit message) that made Christian 
>>>>>>>>>>>> drop the
>>>>>>>>>>>> XArray
>>>>>>>>>>>> use in drm_exec for the same set of objects: "The locking 
>>>>>>>>>>>> overhead
>>>>>>>>>>>> is
>>>>>>>>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>>>>>>>>> complexity and a
>>>>>>>>>>>> single wide lock following the drm locking guidelines set 
>>>>>>>>>>>> out by
>>>>>>>>>>>> Daniel and
>>>>>>>>>>>> David should really be the default choice with an opt-in for a
>>>>>>>>>>>> spinlock if
>>>>>>>>>>>> needed for async and pushing out to a wq is not an option.
>>>>>>>>>>> For the external object list an outer lock would work as 
>>>>>>>>>>> long as it's
>>>>>>>>>>> not the
>>>>>>>>>>> dma-resv lock of the corresponding GEM object, since here we 
>>>>>>>>>>> actually
>>>>>>>>>>> need to
>>>>>>>>>>> remove the list entry from the external object list on
>>>>>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>>>>>> It's just a bit weird design wise that drivers would need to 
>>>>>>>>>>> take
>>>>>>>>>>> this outer
>>>>>>>>>>> lock on:
>>>>>>>>>>>
>>>>>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also 
>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>>>>>
>>>>>>>>>>> Given that it seems reasonable to do all the required locking
>>>>>>>>>>> internally.
>>>>>>>>>> From a design POV, there has been a clear direction in XE to 
>>>>>>>>>> make
>>>>>>>>>> things similar to mmap() / munmap(), so this outer lock, 
>>>>>>>>>> which in Xe is
>>>>>>>>>> an rwsem, is used in a similar way as the mmap_lock. It's 
>>>>>>>>>> protecting
>>>>>>>>>> the page-table structures and vma rb tree, the userptr 
>>>>>>>>>> structures and
>>>>>>>>>> the extobj list. Basically it's taken early in the exec 
>>>>>>>>>> IOCTL, the
>>>>>>>>>> VM_BIND ioctl, the compute rebind worker and the pagefault 
>>>>>>>>>> handler, so
>>>>>>>>>> all of the above are just asserting that it is taken in the 
>>>>>>>>>> correct
>>>>>>>>>> mode.
>>>>>>>>>>
>>>>>>>>>> But strictly with this scheme one could also use the vm's 
>>>>>>>>>> dma_resv for
>>>>>>>>>> the extobj list since with drm_exec, it's locked before 
>>>>>>>>>> traversing the
>>>>>>>>>> list.
>>>>>>>>>>
>>>>>>>>>> The whole point of this scheme is to rely on locks that you 
>>>>>>>>>> already are
>>>>>>>>>> supposed to be holding for various reasons and is simple to 
>>>>>>>>>> comprehend.
>>>>>>>>> I don't agree that we're supposed to hold the VM's resv lock 
>>>>>>>>> anyways for
>>>>>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but 
>>>>>>>>> I'm fine using it
>>>>>>>>> for that purpose nevertheless.
>>>>>>>>>
>>>>>>>>>>> In order to at least place lockdep checks, the driver would 
>>>>>>>>>>> need to
>>>>>>>>>>> supply the
>>>>>>>>>>> corresponding lock's lockdep_map, because the GPUVM 
>>>>>>>>>>> otherwise doesn't
>>>>>>>>>>> know about
>>>>>>>>>>> the lock.
>>>>>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>>>>> I'd really like to avoid that, especially now that everything 
>>>>>>>>> got simpler. We
>>>>>>>>> should define the actual locks to take instead.
>>>>>>>>>
>>>>>>>>>>> Out of curiosity, what is the overhead of a spin_lock() that 
>>>>>>>>>>> doesn't
>>>>>>>>>>> need to
>>>>>>>>>>> spin?
>>>>>>>>>> I guess it's hard to tell exactly, but it is much lower on 
>>>>>>>>>> modern x86
>>>>>>>>>> than what it used to be. Not sure about ARM, which is the other
>>>>>>>>>> architecture important to us. I figure if there is little 
>>>>>>>>>> cache-line
>>>>>>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>>>>>>
>>>>>>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>>>>>>
>>>>>>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>                                  spinlock_t *lock)
>>>>>>>>>>>> {
>>>>>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>>>>>          spin_lock(lock);
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>>>> For such drivers, that would require anybody calling 
>>>>>>>>>>>>>> unlink to
>>>>>>>>>>>>>> hold the vm's
>>>>>>>>>>>>>> resv, though.
>>>>>>>>>>>>> In V4 I want to go back to having a dedicated lock for the 
>>>>>>>>>>>>> GEMs
>>>>>>>>>>>>> gpuva list (or
>>>>>>>>>>>>> VM_BO list to be more precise). We can't just use the 
>>>>>>>>>>>>> dma-resv
>>>>>>>>>>>>> lock for that
>>>>>>>>>>>>> with VM_BO abstractions, because on destruction of a VM_BO we
>>>>>>>>>>>>> otherwise wouldn't
>>>>>>>>>>>>> be allowed to already hold the dma-resv lock. That's the 
>>>>>>>>>>>>> fix I
>>>>>>>>>>>>> was referring to
>>>>>>>>>>>>> earlier.
>>>>>>>>>>>> Yeah, I can see the need for a dedicated lock for the GEM's 
>>>>>>>>>>>> gpuva
>>>>>>>>>>>> list, but
>>>>>>>>>>>> holding the vm's dma-resv lock across the unlink shouldn't 
>>>>>>>>>>>> be a
>>>>>>>>>>>> problem. We
>>>>>>>>>>>> may free the object and a pointer to the vm's resv during 
>>>>>>>>>>>> unlink
>>>>>>>>>>>> but we
>>>>>>>>>>>> don't free the vm's resv.  It'd be a matter of ensuring 
>>>>>>>>>>>> that any
>>>>>>>>>>>> calls to
>>>>>>>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>>>>>>> Drivers calling unlink() from the fence signaling path can't 
>>>>>>>>>>> use the
>>>>>>>>>>> VM's
>>>>>>>>>>> dma-resv lock.
>>>>>>>>>> Yes, that made me a bit curious because in the current 
>>>>>>>>>> version the code
>>>>>>>>>> required the object's dma_resv for unlink() which can't be 
>>>>>>>>>> grabbed
>>>>>>>>>> either from the fence signaling path. So are there any 
>>>>>>>>>> drivers actually
>>>>>>>>>> wanting to do that? If so, they will either need to resort to 
>>>>>>>>>> the
>>>>>>>>>> current spinlock solution or they will need to call unlink 
>>>>>>>>>> from a
>>>>>>>>>> workqueue item.
>>>>>>>>> As Boris already mentioned we have the dma-resv lock by 
>>>>>>>>> default or a driver
>>>>>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of the 
>>>>>>>>> latter.
>>>>>>>>>
>>>>>>>>>>> Also, what if the object is an external object? We can't use 
>>>>>>>>>>> the VM's
>>>>>>>>>>> dma-resv
>>>>>>>>>>> lock here.
>>>>>>>>>> Why? Typically (sync) unlink is only ever called from an 
>>>>>>>>>> unbind-like
>>>>>>>>>> operation where it should be trivial to grab the vm's resv. 
>>>>>>>>>> Or, for
>>>>>>>>>> that matter any outer lock protecting the extobj list. Rule 
>>>>>>>>>> would be
>>>>>>>>>> the drm_gpuvm_bo::entry::extobj  and 
>>>>>>>>>> drm_gpuvm_bo::entry::evict would
>>>>>>>>>> be protected by either the vm's dma_resv (or possibly an 
>>>>>>>>>> outer lock in
>>>>>>>>>> the case of the extobj list).
>>>>>>>>> Outer lock wouldn't have been working for updates in the async 
>>>>>>>>> path, but
>>>>>>>>> shouldn't be relevant anymore. We could use the VM's resv for 
>>>>>>>>> that.
>>>>>>>>>
>>>>>>>>>>>   And we can't have the GEM objs dma-resv lock held when 
>>>>>>>>>>> calling
>>>>>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if the
>>>>>>>>>>> refcount drops
>>>>>>>>>>> to zero calls drm_gpuvm_bo_destroy() and 
>>>>>>>>>>> drm_gpuvm_bo_destroy() might
>>>>>>>>>>> drop the
>>>>>>>>>>> last reference of the GEM object.
>>>>>>>>>> Yes, but this is a different problem as to what exactly protects
>>>>>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal 
>>>>>>>>>> per bo list
>>>>>>>>>> lock, or if we want to keep the bo's dma_resv we need to 
>>>>>>>>>> ensure that
>>>>>>>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts 
>>>>>>>>>> its obj
>>>>>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's 
>>>>>>>>>> refcount (I know
>>>>>>>>>> Boris didn't like that, but requiring an explicit refcount for a
>>>>>>>>>> pointer you dereference unless you're under a lock that 
>>>>>>>>>> ensures keeping
>>>>>>>>>> the object alive is pretty much required?) But anyway for the
>>>>>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or internal 
>>>>>>>>>> spinlock)
>>>>>>>>>> I don't have a strong preference.
>>>>>>>>> We can keep the GEM objects dma-resv lock, however as 
>>>>>>>>> mentioned above
>>>>>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both 
>>>>>>>>> the VM's resv lock
>>>>>>>>> and the GEM's resv lock in case they differ.
>>>>>>>>>
>>>>>>>>>>>   All those problems go away with a dedicated
>>>>>>>>>>> GEM gpuva list lock.
>>>>>>>>>> I don't think these are real problems.
>>>>>>>>>> With the excepton of the eviction list "trick" where we 
>>>>>>>>>> currently have
>>>>>>>>>> slightly different approach to collect external bos needing 
>>>>>>>>>> rebinding,
>>>>>>>>>> we have this working fine.
>>>>>>>>>>
>>>>>>>>>> TBH I think pretty much the only situation where the spinlock 
>>>>>>>>>> is needed
>>>>>>>>>> is for async updates of these lists, unless a wq item can be 
>>>>>>>>>> used for
>>>>>>>>>> that, but it doesn't really seem like the current code allows 
>>>>>>>>>> for such
>>>>>>>>>> updates anyway? It complicates the code a lot, adds overhead 
>>>>>>>>>> and also
>>>>>>>>>> adds the requirement for refcounting during list traversal.
>>>>>>>>>>
>>>>>>>>>> /Thomas
>>>>>>>>>>
>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>> It seems that with that also the refcount could be make non-
>>>>>>>>>>>>>> atomic.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> All in the spirit of the drm locking guidelines "use big 
>>>>>>>>>>>>>> locks
>>>>>>>>>>>>>> when
>>>>>>>>>>>>>> possible".
>>>>>>>>>>>>>> Lower level locks only when necessary for performance or
>>>>>>>>>>>>>> locking inversion?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>>>>>>>>>> local list, so removal
>>>>>>>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>>>>>>>> iterating the list.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)	\
>>>>>>>>>>>>>>> +	({									\
>>>>>>>>>>>>>>> +		struct drm_gpuvm_bo *__vm_bo;					\
>>>>>>>>>>>>>>> +										\
>>>>>>>>>>>>>>> +		drm_gpuvm_bo_put(__prev_vm_bo);					\
>>>>>>>>>>>>>>> +										\
>>>>>>>>>>>>>>> +		spin_lock(&(__gpuvm)->__list_name.lock);			\
>>>>>>>>>>>>>>> +		while (!list_empty(&(__gpuvm)->__list_name.list)) {		\
>>>>>>>>>>>>>>> +			__vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,	\
>>>>>>>>>>>>>>> +						   struct drm_gpuvm_bo,		\
>>>>>>>>>>>>>>> +						   list.entry.__list_name);	\
>>>>>>>>>>>>>>> +			if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {		\
>>>>>>>>>>>>>>> +				list_move_tail(&(__vm_bo)->list.entry.__list_name,	\
>>>>>>>>>>>>>>> +					       __local_list);			\
>>>>>>>>>>>>>>> +				break;						\
>>>>>>>>>>>>>>> +			} else {						\
>>>>>>>>>>>>>>> +				list_del_init(&(__vm_bo)->list.entry.__list_name);	\
>>>>>>>>>>>>>>> +				__vm_bo = NULL;					\
>>>>>>>>>>>>>>> +			}							\
>>>>>>>>>>>>>>> +		}								\
>>>>>>>>>>>>>>> +		spin_unlock(&(__gpuvm)->__list_name.lock);			\
>>>>>>>>>>>>>>> +										\
>>>>>>>>>>>>>>> +		__vm_bo;							\
>>>>>>>>>>>>>>> +	})
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * This helper is here to provide lockless list iteration.
>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>> + * iterator releases the lock immediately after picking 
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> first element from the
>>>>>>>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Typical use:
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + *     ret = 0;
>>>>>>>>>>>>>>> + *     for_each_vm_bo_in_list(gpuvm, <list_name>,
>>>>>>>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>>>>>>>> + *             ret = do_something_with_vm_bo(..., vm_bo);
>>>>>>>>>>>>>>> + *             if (ret)
>>>>>>>>>>>>>>> + *                     break;
>>>>>>>>>>>>>>> + *     }
>>>>>>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>> + *     restore_vm_bo_list(gpuvm, <list_name>,
>>>>>>>>>>>>>>> &my_local_list);
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Only used for internal list iterations, not meant to be
>>>>>>>>>>>>>>> exposed to the outside
>>>>>>>>>>>>>>> + * world.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)    \
>>>>>>>>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>>>>>> +                                               __local_list, NULL);            \
>>>>>>>>>>>>>>> +            __vm_bo;                                                           \
>>>>>>>>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>>>>>> +                                               __local_list, __vm_bo))         \
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to 
>>>>>>>>>>>>>>> their
>>>>>>>>>>>>>>> original list
>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to 
>>>>>>>>>>>>>>> store
>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * When we're done iterating a vm_bo list, we should call
>>>>>>>>>>>>>>> restore_vm_bo_list()
>>>>>>>>>>>>>>> + * to restore the original state and let new iterations 
>>>>>>>>>>>>>>> take
>>>>>>>>>>>>>>> place.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)                 \
>>>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>>>> +               /* Merge back the two lists, moving local list elements to the  \
>>>>>>>>>>>>>>> +                * head to preserve previous ordering, in case it matters.      \
>>>>>>>>>>>>>>> +                */                                                             \
>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
>>>>>>>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)->__list_name.list);        \
>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given
>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Inserts the given @__vm_bo into the list specified by
>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                            \
>>>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>>>>>> +               if (list_empty(&(__vm_bo)->list.entry.__list_name))             \
>>>>>>>>>>>>>>> +                       list_add_tail(&(__vm_bo)->list.entry.__list_name,       \
>>>>>>>>>>>>>>> +                                     &(__vm_bo)->vm->__list_name.list);        \
>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given
>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>> + * @__list_name: the name of the list to remove from
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Removes the given @__vm_bo from the list specified by
>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)                            \
>>>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>>>>>> +               if (!list_empty(&(__vm_bo)->list.entry.__list_name))            \
>>>>>>>>>>>>>>> +                       list_del_init(&(__vm_bo)->list.entry.__list_name);      \
>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), 
>>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm 
>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>> struct drm_device *drm,
>>>>>>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm
>>>>>>>>>>>>>>> *gpuvm)
>>>>>>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>>>>>>>               "GPUVA tree is not empty, potentially leaking
>>>>>>>>>>>>>>> memory.\n");
>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list
>>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list
>>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all 
>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all 
>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Note: This function is safe against concurrent 
>>>>>>>>>>>>>>> insertion
>>>>>>>>>>>>>>> and removal of
>>>>>>>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>>>>>>>> concurrent usage itself.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>>>>>>>>>> either an outer VM lock
>>>>>>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this 
>>>>>>>>>>>>>>> function
>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's
>>>>>>>>>>>>>>> dma-resv lock ensures
>>>>>>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>>>>>>>>> +                         unsigned int num_fences)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, 
>>>>>>>>>>>>>>> vm_bo->obj,
>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped 
>>>>>>>>>>>>>>> within
>>>>>>>>>>>>>>> a given range
>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>>>>>>>>>> mapped between @addr
>>>>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>>>>> drm_exec *exec,
>>>>>>>>>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, end) {
>>>>>>>>>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>> +                       return ret;
>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Additionally, when calling this function with struct
>>>>>>>>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>>>>>>>>> + * being set the driver receives the given @fn callback to
>>>>>>>>>>>>>>> lock additional
>>>>>>>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec 
>>>>>>>>>>>>>>> instance.
>>>>>>>>>>>>>>> Typically, drivers
>>>>>>>>>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>>>>>>>>>> callback.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       flags = interruptible ? 
>>>>>>>>>>>>>>> DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>>>>>>>>> 0 |
>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, 
>>>>>>>>>>>>>>> exec,
>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>> +                       drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>> +                       if (ret)
>>>>>>>>>>>>>>> +                               goto err;
>>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +static int
>>>>>>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int
>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args->objs,
>>>>>>>>>>>>>>> +                                     args->num_objs, num_fences);
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all
>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>>>>>>> + * @num_objs: the number of additional &drm_gem_objects to
>>>>>>>>>>>>>>> lock
>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the
>>>>>>>>>>>>>>> given &drm_gpuvm
>>>>>>>>>>>>>>> + * contains mappings of, plus the ones given through 
>>>>>>>>>>>>>>> @objs.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>>>>>>>>> +                         unsigned int num_objs,
>>>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>> +       } args;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>>>>>>>>> interruptible);
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>>>>>>>>> within a given range
>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>>>>>>>>> mapped between @addr and
>>>>>>>>>>>>>>> + * @addr + @range.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       flags = interruptible ? 
>>>>>>>>>>>>>>> DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>>>>>>>>> 0 |
>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec,
>>>>>>>>>>>>>>> addr, range,
>>>>>>>>>>>>>>> + num_fences);
>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as 
>>>>>>>>>>>>>>> evicted
>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>>>>>>>>>> evicted buffer
>>>>>>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, 
>>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and all
>>>>>>>>>>>>>>> extobj
>>>>>>>>>>>>>>> + * dma-resv
>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>>>>>>> +       unsigned long index;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm, obj) ?
>>>>>>>>>>>>>>> +                                  extobj_usage : private_usage);
>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of 
>>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>>>>>>          return vm_bo;
>>>>>>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref 
>>>>>>>>>>>>>>> *kref)
>>>>>>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref 
>>>>>>>>>>>>>>> *kref)
>>>>>>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the 
>>>>>>>>>>>>>>> reference of
>>>>>>>>>>>>>>>      *
>>>>>>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is
>>>>>>>>>>>>>>> destroyed, which
>>>>>>>>>>>>>>> + * includes removing it from the GEM's gpuva list. Hence, if a call to this
>>>>>>>>>>>>>>> + * function can potentially let the reference count drop to zero, the caller
>>>>>>>>>>>>>>> + * must hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>     void
>>>>>>>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
>>>>>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo 
>>>>>>>>>>>>>>> to its
>>>>>>>>>>>>>>> &drm_gpuvm's
>>>>>>>>>>>>>>> + * extobj list
>>>>>>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> extobj list.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj 
>>>>>>>>>>>>>>> list if
>>>>>>>>>>>>>>> not on the list
>>>>>>>>>>>>>>> + * already and if the corresponding &drm_gem_object is an
>>>>>>>>>>>>>>> external object,
>>>>>>>>>>>>>>> + * actually.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a 
>>>>>>>>>>>>>>> &drm_gem_object to
>>>>>>>>>>>>>>> / from a
>>>>>>>>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from all
>>>>>>>>>>>>>>> &drm_gpuvms evicted
>>>>>>>>>>>>>>> + * list containing a mapping of this &drm_gem_object.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>>>>>>> +               if (evict)
>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, 
>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>> +               else
>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, 
>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>     static int
>>>>>>>>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>>>>>>           * space
>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>>>>>>>> serving as
>>>>>>>>>>>>>>> +                * external object
>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the extobj 
>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>> +       } extobj;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>> +        * @evict: structure holding the evict list and 
>>>>>>>>>>>>>>> evict
>>>>>>>>>>>>>>> list lock
>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos
>>>>>>>>>>>>>>> currently being
>>>>>>>>>>>>>>> +                * evicted
>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the evict 
>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>> +       } evict;
>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>>>>> drm_device *drm,
>>>>>>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>>>>>>>>>> &drm_gem_object is an
>>>>>>>>>>>>>>> + * external object
>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv differs
>>>>>>>>>>>>>>> from the
>>>>>>>>>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>> + struct drm_gem_object
>>>>>>>>>>>>>>> *obj)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva 
>>>>>>>>>>>>>>> *va)
>>>>>>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
>>>>>>>>>>>>>>>          list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>>>>>>>> &drm_exec should be.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Optionally, @extra can be set in order to lock 
>>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>>> &drm_gem_objects.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>> +        * @extra: Callback and corresponding private data
>>>>>>>>>>>>>>> for the driver to
>>>>>>>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>> +                         unsigned int num_fences);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>>>>>>>>>> callback
>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>> +               void *priv;
>>>>>>>>>>>>>>> +       } extra;
>>>>>>>>>>>>>>> +};
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-
>>>>>>>>>>>>>>> resv
>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>>>>>>>>>> &drm_gem_object.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +static inline int
>>>>>>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>>>>> +                             unsigned int num_fences);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>>>>>>>> +                           unsigned int num_fences);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>>>>>>>>>> +                             unsigned int num_objs,
>>>>>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all
>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects
>>>>>>>>>>>>>>> previously acquired
>>>>>>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence()
>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>>> extobj_usage)
>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, 
>>>>>>>>>>>>>>> &vm_exec->exec,
>>>>>>>>>>>>>>> fence,
>>>>>>>>>>>>>>> + private_usage,
>>>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>>>>>>                           * gpuva list.
>>>>>>>>>>>>>>>                           */
>>>>>>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>> +                        * @extobj: List entry to attach to
>>>>>>>>>>>>>>> the &drm_gpuvms
>>>>>>>>>>>>>>> +                        * extobj list.
>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>> +                        * @evict: List entry to attach to
>>>>>>>>>>>>>>> the &drm_gpuvms evict
>>>>>>>>>>>>>>> +                        * list.
>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>>>>>>                  } entry;
>>>>>>>>>>>>>>>          } list;
>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a
>>>>>>>>>>>>>>> list of &drm_gpuva
>>>>>>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>>>>>>>>> iteration step
>>>>>>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>>>>>>           * used.
>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void
>>>>>>>>>>>>>>> *priv);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>> +        * Drivers receive this callback for every evicted
>>>>>>>>>>>>>>> &drm_gem_object being
>>>>>>>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>>>>>>>>> specific variant of
>>>>>>>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void 
>>>>>>>>>>>>>>> *priv,
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-20  5:37                               ` Christian König
@ 2023-09-20  7:44                                 ` Thomas Hellström
  2023-09-20  8:29                                   ` Thomas Hellström
  2023-09-20 10:51                                   ` Christian König
  0 siblings, 2 replies; 77+ messages in thread
From: Thomas Hellström @ 2023-09-20  7:44 UTC (permalink / raw)
  To: Christian König, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

Hi,

On 9/20/23 07:37, Christian König wrote:
> Am 19.09.23 um 17:23 schrieb Thomas Hellström:
>>
>> On 9/19/23 17:16, Danilo Krummrich wrote:
>>> On 9/19/23 14:21, Thomas Hellström wrote:
>>>> Hi Christian
>>>>
>>>> On 9/19/23 14:07, Christian König wrote:
>>>>> Am 13.09.23 um 17:46 schrieb Danilo Krummrich:
>>>>>> On 9/13/23 17:33, Christian König wrote:
>>>>>>> Am 13.09.23 um 17:15 schrieb Danilo Krummrich:
>>>>>>>> On 9/13/23 16:26, Christian König wrote:
>>>>>>>>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>>>>>>>>> As mentioned in a different mail thread, the reply is based 
>>>>>>>>>> on the assumption
>>>>>>>>>> that we don't support anything else than GPUVM updates from 
>>>>>>>>>> the IOCTL.
>>>>>>>>>
>>>>>>>>> I think that this assumption is incorrect.
>>>>>>>>
>>>>>>>> Well, more precisely I should have said "don't support GPUVM 
>>>>>>>> updates from within
>>>>>>>> fence signaling critical sections". And looking at the code, 
>>>>>>>> that doesn't seem what
>>>>>>>> you're doing there.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Vulkan is just one specific use case, but this here should 
>>>>>>>>> probably be able to handle other use cases as well.
>>>>>>>>>
>>>>>>>>> Especially with HMM you get the requirement that you need to 
>>>>>>>>> be able to invalidate GPUVM mappings without grabbing a 
>>>>>>>>> reservation lock.
>>>>>>>>
>>>>>>>> What do you mean with "invalidate GPUVM mappings" in this 
>>>>>>>> context? drm_gpuvm_bo_evict()
>>>>>>>> should only be called from a ttm_device_funcs::move callback, 
>>>>>>>> we should hold the dma-resv
>>>>>>>> lock there.
>>>>>>>
>>>>>>> Well the question is which dma-resv lock do we hold?
>>>>>>>
>>>>>>> In the move callback we only hold the dma-resv lock of the BO 
>>>>>>> which is moved, but when that is a shared BO then that's not the 
>>>>>>> same as the one for the VM.
>>>>>>
>>>>>> Correct, Thomas' idea was to use the GEM's dma_resv lock to 
>>>>>> protect drm_gpuvm_bo::evicted
>>>>>> and then actually move the drm_gpuvm_bo to the VM's evicted list 
>>>>>> once we grabbed all
>>>>>> dma-resv locks when locking the VM's BOs using drm_exec. We can 
>>>>>> remove them from the evicted
>>>>>> list on validate(). This way we never touch the evicted list 
>>>>>> without holding at least the VM's
>>>>>> dma-resv lock.
>>>>>>
>>>>>> Do you have any concerns about that?
>>>>>
>>>>> Scratching my head a bit how that is supposed to work.
>>>>>
>>>>> This implies that you go over all the evicted BOs during 
>>>>> validation and not just the one mentioned in the CS.
>>>>>
>>>>> That might work for Vulkan, but is pretty much a no-go for OpenGL.
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> See what the eviction lock in amdgpu is doing for example.
>>>>>>>>
>>>>>>>> The eviction_lock seems to protect a VM state "evicting" of 
>>>>>>>> whether any BO that
>>>>>>>> is associated with the VM is currently evicting. At the same 
>>>>>>>> time amdgpu protects
>>>>>>>> the evicted list of the VM with a different lock. So this 
>>>>>>>> seems to be entirely
>>>>>>>> unrelated. Tracking a "currently evicting" state is not part of 
>>>>>>>> the GPUVM
>>>>>>>> implementation currently and hence nothing would change for 
>>>>>>>> amdgpu there.
>>>>>>>
>>>>>>> Sorry for the confusion we use different terminology in amdgpu.
>>>>>>>
>>>>>>> The eviction lock and evicted state is for the VM page tables, 
>>>>>>> e.g. if the whole VM is currently not used and swapped out or 
>>>>>>> even de-allocated.
>>>>>>>
>>>>>>> This is necessary because we have cases where we need to access 
>>>>>>> the VM data without holding the dma-resv lock of this VM. 
>>>>>>> Especially figuring out which parts of an address space contain 
>>>>>>> mappings and which don't.
>>>>>>
>>>>>> I think this is fine, this has nothing to do with lists of 
>>>>>> evicted GEM objects or external GEM
>>>>>> objects, right? Marking mappings (drm_gpuva) as invalidated 
>>>>>> (DRM_GPUVA_INVALIDATED) or accessing
>>>>>> the VA space does not require any dma-resv locks.
>>>>>
>>>>> I hope so, but I'm not 100% sure.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> This is a requirement which comes with HMM handling, you won't 
>>>>>>> see this with Vulkan (or OpenGL, VAAPI etc..).
>>>>>>>
>>>>>>>
>>>>>>> The invalidation lock on the other hand is what in this 
>>>>>>> discussion is called eviction lock. This one is needed because 
>>>>>>> what I wrote above, during the move callback only the dma-resv 
>>>>>>> of the BO which is moved is locked, but not necessarily the 
>>>>>>> dma-resv of the VM.
>>>>>>
>>>>>> That's yet another thing, right? This is used to track whether 
>>>>>> *any* BO that belongs to the VM is
>>>>>> currently being evicted, correct? As mentioned, as by now this is 
>>>>>> not supported in GPUVM and hence
>>>>>> would be the same driver specific code with the same driver 
>>>>>> specific lock.
>>>>>
>>>>> That is most likely a show stopper using this for OpenGL based 
>>>>> workloads as far as I can see. For those you need to be able to 
>>>>> figure out which non-VM BOs have been evicted and which parts of 
>>>>> the VM needs updates.
>>>>
>>>> We identify those with a bool in the gpuvm_bo, and that bool is 
>>>> protected by the bo_resv. In essence, the "evicted" list must be 
>>>> made up-to-date with all relevant locks held before traversing in 
>>>> the next exec.
>>>
>>> What I still miss with this idea is how do we find all the 
>>> drm_gpuvm_bo structures with the evicted bool set to true? When 
>>> doing the drm_exec dance we come across all external ones and can 
>>> add them to the list if needed, but what about the BOs having the 
>>> VM's dma-resv?
>>
>> Oh, they can be added to the evict list directly (no bool needed) in 
>> the eviction code, like in v3. Since for those we indeed hold the 
>> VM's dma_resv since it's aliased with the object's dma-resv.
>
> Yeah, I wanted to note what Danilo seems to think about as well. How 
> do we figure out the non-VM BOs evicted?
>
> We can't walk over the list of all non-VM BOs on every submission, 
> that's too much overhead for cases with lots of non-VM BOs.
>
> And we can't rely on userspace sending all non-VM BOs as used list 
> down to the kernel with each submission.
>
> Regards,
> Christian.

No, that's not needed: Mechanism below.

1) We maintain an evicted list. Typically protected by the vm resv.
2) Each gpuvm_bo has a bool "evicted". Protected by the bo resv.

a) Evicting a vm bo: The vm resv is held by the eviction code. Just put 
it on the evicted list.
b) Evicting a shared/external bo: The bo resv is held by the eviction 
code. Set the "evicted" bool
c) Validating the evicted list on exec: Loop through all 
*external/shared* bos. Lock them. After locking, check the "evicted" 
bool, if it's true. put the bo on the evicted list (we hold the VM resv 
at this point) and clear the "evicted" bool. Note that other vms will 
have their own gpuvm_bo which is marked evicted.

I have this coded up in a patch for Xe and it seems to be working properly.
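
As a rough illustration of steps a) - c) above (with a hypothetical
"evicted" flag added to struct drm_gpuvm_bo, made-up function names, and
ignoring the list spinlocks the current series still carries), the flow
could look like this:

	/* a) + b) Called from the eviction path, e.g. a
	 *    ttm_device_funcs::move callback, with the BO's dma-resv held.
	 */
	static void sketch_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo)
	{
		struct drm_gpuvm *gpuvm = vm_bo->vm;

		dma_resv_assert_held(vm_bo->obj->resv);

		if (!drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
			/* VM-local BO: obj->resv is the VM's resv, which is
			 * hence already held; put it on the evicted list
			 * directly.
			 */
			list_move_tail(&vm_bo->list.entry.evict,
				       &gpuvm->evict.list);
		else
			/* External BO: only the BO's resv is held, so just
			 * mark it (hypothetical flag, protected by obj->resv).
			 */
			vm_bo->evicted = true;
	}

	/* c) At exec time, once drm_exec has locked the VM's resv and all
	 *    external BOs: collect the marked external vm_bos.
	 */
	static void sketch_gpuvm_collect_evicted(struct drm_gpuvm *gpuvm)
	{
		struct drm_gpuvm_bo *vm_bo;

		dma_resv_assert_held(gpuvm->resv);

		list_for_each_entry(vm_bo, &gpuvm->extobj.list,
				    list.entry.extobj) {
			dma_resv_assert_held(vm_bo->obj->resv);
			if (vm_bo->evicted) {
				list_move_tail(&vm_bo->list.entry.evict,
					       &gpuvm->evict.list);
				vm_bo->evicted = false;
			}
		}
	}

drm_gpuvm_validate() would then remove entries from the evicted list while
all those resv locks are still held, so the evicted list is never touched
without the VM's resv.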

/Thomas


>
>>
>> /Thomas
>>
>>
>>
>>>
>>>>
>>>> If you mean that we need to unbind all vmas of all vms of evicted 
>>>> bos before evicting, we don't do that, at least not in Xe, since 
>>>> when evicting we wait for VM idle, and it can't access anything through 
>>>> the stale vmas until they have been revalidated and rebound.
>>>>
>>>> /Thomas
>>>>
>>>>
>>>>
>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström 
>>>>>>>>>> wrote:
>>>>>>>>>>> Hi!
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>>>>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas 
>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>> Hi, Danilo,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>> So far the DRM GPUVA manager offers common 
>>>>>>>>>>>>>>>> infrastructure to
>>>>>>>>>>>>>>>> track GPU VA
>>>>>>>>>>>>>>>> allocations and mappings, generically connect GPU VA 
>>>>>>>>>>>>>>>> mappings
>>>>>>>>>>>>>>>> to their
>>>>>>>>>>>>>>>> backing buffers and perform more complex mapping 
>>>>>>>>>>>>>>>> operations
>>>>>>>>>>>>>>>> on the GPU VA
>>>>>>>>>>>>>>>> space.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>>>>>>>>>> drivers, which
>>>>>>>>>>>>>>>> can potentially be generalized in order to make the DRM 
>>>>>>>>>>>>>>>> GPUVA
>>>>>>>>>>>>>>>> manager
>>>>>>>>>>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>>>>>>>>>>> this patch aims
>>>>>>>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not being 
>>>>>>>>>>>>>>>> used
>>>>>>>>>>>>>>>> outside of
>>>>>>>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>>>>>>>>>> which are
>>>>>>>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 3) Provide functions to efficiently lock all GEM 
>>>>>>>>>>>>>>>> objects dma-
>>>>>>>>>>>>>>>> resv the
>>>>>>>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>>>>>>>>> contains mappings
>>>>>>>>>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>>>>>>>>>> accelerated.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 5) Provide some convenience functions for common patterns.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Rather than being designed as a "framework", the target 
>>>>>>>>>>>>>>>> is to
>>>>>>>>>>>>>>>> make all
>>>>>>>>>>>>>>>> features appear as a collection of optional helper 
>>>>>>>>>>>>>>>> functions,
>>>>>>>>>>>>>>>> such that
>>>>>>>>>>>>>>>> drivers are free to make use of the DRM GPUVA managers 
>>>>>>>>>>>>>>>> basic
>>>>>>>>>>>>>>>> functionality and opt-in for other features without 
>>>>>>>>>>>>>>>> setting
>>>>>>>>>>>>>>>> any feature
>>>>>>>>>>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>>>>>>>>>> locking for drivers
>>>>>>>>>>>>>>>> updating the GPU VA space within the fence signalling 
>>>>>>>>>>>>>>>> path.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an 
>>>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>>>> instance of this
>>>>>>>>>>>>>>>>      * particular combination. If not existent a new 
>>>>>>>>>>>>>>>> instance
>>>>>>>>>>>>>>>> is created and linked
>>>>>>>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and
>>>>>>>>>>>>>>>> evicted objects. Those
>>>>>>>>>>>>>>>> + * lists are maintained in order to accelerate locking of
>>>>>>>>>>>>>>>> dma-resv locks and
>>>>>>>>>>>>>>>> + * validation of evicted objects bound in a 
>>>>>>>>>>>>>>>> &drm_gpuvm. For
>>>>>>>>>>>>>>>> instance, all
>>>>>>>>>>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm 
>>>>>>>>>>>>>>>> can be
>>>>>>>>>>>>>>>> locked by calling
>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>>>>>>>>>>> also possible to lock
>>>>>>>>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>>>>>>>>> corresponding parameters to
>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the 
>>>>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>>>>> loop while making
>>>>>>>>>>>>>>>> + * use of helper functions such as 
>>>>>>>>>>>>>>>> drm_gpuvm_prepare_range()
>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Every bound &drm_gem_object is treated as external 
>>>>>>>>>>>>>>>> object
>>>>>>>>>>>>>>>> when its &dma_resv
>>>>>>>>>>>>>>>> + * structure is different than the &drm_gpuvm's common
>>>>>>>>>>>>>>>> &dma_resv structure.
>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for 
>>>>>>>>>>>>>>>> the same
>>>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>>>>>>>>>> creations and destructions
>>>>>>>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances 
>>>>>>>>>>>>>>>> unique.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of 
>>>>>>>>>>>>>>>> external and
>>>>>>>>>>>>>>>> evicted objects are
>>>>>>>>>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>>>>>>>>>> iteration internally.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * However, drivers still need to protect 
>>>>>>>>>>>>>>>> concurrent
>>>>>>>>>>>>>>>> calls to functions
>>>>>>>>>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() 
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function 
>>>>>>>>>>>>>>>> contains
>>>>>>>>>>>>>>>> a particular
>>>>>>>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Functions adding or removing entries from those lists,
>>>>>>>>>>>>>>>> such as
>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() 
>>>>>>>>>>>>>>>> may be
>>>>>>>>>>>>>>>> called with external
>>>>>>>>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>>>>>>>>> corresponding list to be
>>>>>>>>>>>>>>>> + * (safely) modified while potentially being iterated by
>>>>>>>>>>>>>>>> other API functions.
>>>>>>>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>>>>>>>      *   }
>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
>>>>>>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after picking the first element from
>>>>>>>>>>>>>>>> + * the list, so list insertion and deletion can happen concurrently.
>>>>>>>>>>>>>>> Are the list spinlocks needed for that async state 
>>>>>>>>>>>>>>> update from
>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Otherwise it should be sufficient to protect the lists 
>>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>>> gpuvm's resv
>>>>>>>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If those spinlocks are still needed in some situations, 
>>>>>>>>>>>>>>> perhaps
>>>>>>>>>>>>>>> could we
>>>>>>>>>>>>>>> have an option to set them to NULL (Like IIRC the maple 
>>>>>>>>>>>>>>> tree
>>>>>>>>>>>>>>> allows for)?
>>>>>>>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>>>>>>>> holding only the dma-resv lock from the BO this function 
>>>>>>>>>>>>>> gets
>>>>>>>>>>>>>> called for. Hence,
>>>>>>>>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() 
>>>>>>>>>>>>>> calls with
>>>>>>>>>>>>>> different BOs.
>>>>>>>>>>>>> No. Only if you try to add external objects to the vm's 
>>>>>>>>>>>>> evict list
>>>>>>>>>>>>> from
>>>>>>>>>>>>> within the evict code. That's not necessary since you loop 
>>>>>>>>>>>>> through
>>>>>>>>>>>>> all
>>>>>>>>>>>>> external objects anyway when locking them so an "evicted" 
>>>>>>>>>>>>> bool in
>>>>>>>>>>>>> the vm_bo,
>>>>>>>>>>>>> protected by the bo resv would be sufficient. The extobj 
>>>>>>>>>>>>> locking
>>>>>>>>>>>>> loop can
>>>>>>>>>>>>> then add the bo to the evicted list.
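
A rough sketch (not part of the posted series) of what this suggested alternative might look like: the evicted flag is a hypothetical addition to struct drm_gpuvm_bo, set under the BO's dma-resv, and the extobj locking loop, which already holds all dma-resv locks, moves flagged vm_bos onto the VM's evict list.

#include <drm/drm_gpuvm.h>

/* Sketch only; the 'evicted' member does not exist in the posted patch. */
static void my_gpuvm_bo_mark_evicted(struct drm_gpuvm_bo *vm_bo, bool evict)
{
	dma_resv_assert_held(vm_bo->obj->resv);
	vm_bo->evicted = evict;	/* hypothetical flag in struct drm_gpuvm_bo */
}

/* Called from the drm_exec loop once all extobj dma-resv locks are held. */
static void my_gpuvm_collect_evicted(struct drm_gpuvm *gpuvm)
{
	struct drm_gpuvm_bo *vm_bo;

	list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj)
		if (vm_bo->evicted)
			list_move_tail(&vm_bo->list.entry.evict, &gpuvm->evict.list);
}
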
>>>>>>>>>>>> And validate() can remove it while still holding all 
>>>>>>>>>>>> dma-resv locks,
>>>>>>>>>>>> neat!
>>>>>>>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>>>>>>>> concurrently? What
>>>>>>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>>>>>>>> drm_gpuva_unlink()?
>>>>>>>>>>>> Are we guaranteed that at this point of time the 
>>>>>>>>>>>> drm_gpuvm_bo is not
>>>>>>>>>>>> on the
>>>>>>>>>>>> evicted list? Because otherwise we would call 
>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>> with the
>>>>>>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>> might drop the last reference to the drm_gem_object and 
>>>>>>>>>>>> hence we'd
>>>>>>>>>>>> potentially
>>>>>>>>>>>> free the dma-resv lock while holding it, at least if it's 
>>>>>>>>>>>> an external
>>>>>>>>>>>> object.
>>>>>>>>>>> Easiest way in this scheme is to think of the lists as being 
>>>>>>>>>>> protected
>>>>>>>>>>> by the vm's resv lock. That means anybody calling unlink() 
>>>>>>>>>>> must also
>>>>>>>>>>> hold the vm's resv lock. (Which is OK from an UAF point of 
>>>>>>>>>>> view, but
>>>>>>>>>>> perhaps not from a locking inversion POW from an async list 
>>>>>>>>>>> update).
>>>>>>>>>> This would mean that on unlink() we'd need to hold the VM's 
>>>>>>>>>> resv lock and the
>>>>>>>>>> corresponding GEM's resv lock (in case they're not the same 
>>>>>>>>>> anyways) because the
>>>>>>>>>> VM's resv lock would protect the external / evicted object 
>>>>>>>>>> lists and the GEM
>>>>>>>>>> objects resv lock protects the GEM's list of drm_gpuvm_bos 
>>>>>>>>>> and the
>>>>>>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>>>>>>>
>>>>>>>>>>>>>> For extobjs an outer lock would be enough in case of Xe, 
>>>>>>>>>>>>>> but I
>>>>>>>>>>>>>> really would not
>>>>>>>>>>>>>> like to add even more complexity just to get the spinlock 
>>>>>>>>>>>>>> out of
>>>>>>>>>>>>>> the way in case
>>>>>>>>>>>>>> the driver already has an outer lock protecting this path.
>>>>>>>>>>>>> I must disagree here. These spinlocks and atomic 
>>>>>>>>>>>>> operations are
>>>>>>>>>>>>> pretty
>>>>>>>>>>>>> costly and as discussed earlier this type of locking was 
>>>>>>>>>>>>> the reason
>>>>>>>>>>>>> (at
>>>>>>>>>>>>> least according to the commit message) that made Christian 
>>>>>>>>>>>>> drop the
>>>>>>>>>>>>> XArray
>>>>>>>>>>>>> use in drm_exec for the same set of objects: "The locking 
>>>>>>>>>>>>> overhead
>>>>>>>>>>>>> is
>>>>>>>>>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>>>>>>>>>> complexity and a
>>>>>>>>>>>>> single wide lock following the drm locking guidelines set 
>>>>>>>>>>>>> out by
>>>>>>>>>>>>> Daniel and
>>>>>>>>>>>>> David should really be the default choice with an opt-in 
>>>>>>>>>>>>> for a
>>>>>>>>>>>>> spinlock if
>>>>>>>>>>>>> needed for async and pushing out to a wq is not an option.
>>>>>>>>>>>> For the external object list an outer lock would work as 
>>>>>>>>>>>> long as it's
>>>>>>>>>>>> not the
>>>>>>>>>>>> dma-resv lock of the corresponding GEM object, since here 
>>>>>>>>>>>> we actually
>>>>>>>>>>>> need to
>>>>>>>>>>>> remove the list entry from the external object list on
>>>>>>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>>>>>>> It's just a bit weird design wise that drivers would need 
>>>>>>>>>>>> to take
>>>>>>>>>>>> this outer
>>>>>>>>>>>> lock on:
>>>>>>>>>>>>
>>>>>>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also 
>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>>>>>>
>>>>>>>>>>>> Given that it seems reasonable to do all the required locking
>>>>>>>>>>>> internally.
>>>>>>>>>>>  From a design POW, there has been a clear direction in XE 
>>>>>>>>>>> to make
>>>>>>>>>>> things similar to mmap() / munmap(), so this outer lock, 
>>>>>>>>>>> which in Xe is
>>>>>>>>>>> an rwsem, is used in a similar way as the mmap_lock. It's 
>>>>>>>>>>> protecting
>>>>>>>>>>> the page-table structures and vma rb tree, the userptr 
>>>>>>>>>>> structures and
>>>>>>>>>>> the extobj list. Basically it's taken early in the exec 
>>>>>>>>>>> IOCTL, the
>>>>>>>>>>> VM_BIND ioctl, the compute rebind worker and the pagefault 
>>>>>>>>>>> handler, so
>>>>>>>>>>> all of the above are just asserting that it is taken in the 
>>>>>>>>>>> correct
>>>>>>>>>>> mode.
>>>>>>>>>>>
>>>>>>>>>>> But strictly with this scheme one could also use the vm's 
>>>>>>>>>>> dma_resv for
>>>>>>>>>>> the extobj list since with drm_exec, it's locked before 
>>>>>>>>>>> traversing the
>>>>>>>>>>> list.
>>>>>>>>>>>
>>>>>>>>>>> The whole point of this scheme is to rely on locks that you 
>>>>>>>>>>> already are
>>>>>>>>>>> supposed to be holding for various reasons and is simple to 
>>>>>>>>>>> comprehend.
>>>>>>>>>> I don't agree that we're supposed to hold the VM's resv lock 
>>>>>>>>>> anyways for
>>>>>>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but 
>>>>>>>>>> I'm fine using it
>>>>>>>>>> for that purpose nevertheless.
>>>>>>>>>>
>>>>>>>>>>>> In order to at least place lockdep checks, the driver would 
>>>>>>>>>>>> need to
>>>>>>>>>>>> supply the
>>>>>>>>>>>> corresponding lock's lockdep_map, because the GPUVM 
>>>>>>>>>>>> otherwise doesn't
>>>>>>>>>>>> know about
>>>>>>>>>>>> the lock.
>>>>>>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>>>>>> I'd really like to avoid that, especially now that everything 
>>>>>>>>>> got simpler. We
>>>>>>>>>> should define the actual locks to take instead.
>>>>>>>>>>
>>>>>>>>>>>> Out of curiosity, what is the overhead of a spin_lock() 
>>>>>>>>>>>> that doesn't
>>>>>>>>>>>> need to
>>>>>>>>>>>> spin?
>>>>>>>>>>> I guess it's hard to tell exactly, but it is much lower on 
>>>>>>>>>>> modern x86
>>>>>>>>>>> than what it used to be. Not sure about ARM, which is the other
>>>>>>>>>>> architecture important to us. I figure if there is little 
>>>>>>>>>>> cache-line
>>>>>>>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>>>>>>>
>>>>>>>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>>>>>>>
>>>>>>>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm, spinlock_t *lock)
>>>>>>>>>>>>> {
>>>>>>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>>>>>>          spin_lock(lock);
>>>>>>>>>>>>> }
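
A minimal sketch of how such a conditional helper pair could look and be used in the list helpers; the resv_protected_lists flag and the lockdep assert are hypothetical and not part of the posted series.

static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm, spinlock_t *lock)
{
	if (!gpuvm->resv_protected_lists)
		spin_lock(lock);
	else
		dma_resv_assert_held(gpuvm->resv);
}

static void gpuvm_cond_spin_unlock(const struct drm_gpuvm *gpuvm, spinlock_t *lock)
{
	if (!gpuvm->resv_protected_lists)
		spin_unlock(lock);
}

/* e.g. drm_gpuvm_bo_list_add() could then become: */
#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                                    \
	do {                                                                           \
		gpuvm_cond_spin_lock((__vm_bo)->vm, &(__vm_bo)->vm->__list_name.lock); \
		if (list_empty(&(__vm_bo)->list.entry.__list_name))                    \
			list_add_tail(&(__vm_bo)->list.entry.__list_name,              \
				      &(__vm_bo)->vm->__list_name.list);               \
		gpuvm_cond_spin_unlock((__vm_bo)->vm, &(__vm_bo)->vm->__list_name.lock); \
	} while (0)
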
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For such drivers, that would require anybody calling 
>>>>>>>>>>>>>>> unlink to
>>>>>>>>>>>>>>> hold the vm's
>>>>>>>>>>>>>>> resv, though.
>>>>>>>>>>>>>> In V4 I want to go back to having a dedicated lock for 
>>>>>>>>>>>>>> the GEMs
>>>>>>>>>>>>>> gpuva list (or
>>>>>>>>>>>>>> VM_BO list to be more precise). We can't just use the 
>>>>>>>>>>>>>> dma-resv
>>>>>>>>>>>>>> lock for that
>>>>>>>>>>>>>> with VM_BO abstractions, because on destruction of a 
>>>>>>>>>>>>>> VM_BO we
>>>>>>>>>>>>>> otherwise wouldn't
>>>>>>>>>>>>>> be allowed to already hold the dma-resv lock. That's the 
>>>>>>>>>>>>>> fix I
>>>>>>>>>>>>>> was referring to
>>>>>>>>>>>>>> earlier.
>>>>>>>>>>>>> Yeah, I can see the need for a dedicated lock for the 
>>>>>>>>>>>>> GEM's gpuva
>>>>>>>>>>>>> list, but
>>>>>>>>>>>>> holding the vm's dma-resv lock across the unlink shouldn't 
>>>>>>>>>>>>> be a
>>>>>>>>>>>>> problem. We
>>>>>>>>>>>>> may free the object and a pointer to the vm's resv during 
>>>>>>>>>>>>> unlink
>>>>>>>>>>>>> but we
>>>>>>>>>>>>> don't free the vm's resv.  It'd be a matter of ensuring 
>>>>>>>>>>>>> that any
>>>>>>>>>>>>> calls to
>>>>>>>>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>>>>>>>> Drivers calling unlink() from the fence signaling path 
>>>>>>>>>>>> can't use the
>>>>>>>>>>>> VM's
>>>>>>>>>>>> dma-resv lock.
>>>>>>>>>>> Yes, that made me a bit curious because in the current 
>>>>>>>>>>> version the code
>>>>>>>>>>> required the object's dma_resv for unlink() which can't be 
>>>>>>>>>>> grabbed
>>>>>>>>>>> either from the fence signaling path. So are there any 
>>>>>>>>>>> drivers actually
>>>>>>>>>>> wanting to do that? If so, they will either need to resort 
>>>>>>>>>>> to the
>>>>>>>>>>> current spinlock solution or they will need to call unlink 
>>>>>>>>>>> from a
>>>>>>>>>>> workqueue item.
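
For drivers that really do want to drop the last reference from the fence signalling path, deferring the unlink to a worker along these lines might be an option; everything prefixed my_ is hypothetical driver code, and which lock the worker takes depends on the protection scheme that ends up being chosen.

#include <linux/dma-resv.h>
#include <linux/llist.h>
#include <linux/workqueue.h>
#include <drm/drm_gpuvm.h>

struct my_gpuva {
	struct drm_gpuva base;
	struct llist_node unlink_node;
};

struct my_vm {
	struct drm_gpuvm base;
	struct llist_head deferred_unlink;
	struct work_struct unlink_work;
};

/* Safe from the dma-fence critical section: no sleeping locks taken. */
static void my_gpuva_unlink_defer(struct my_vm *vm, struct my_gpuva *va)
{
	llist_add(&va->unlink_node, &vm->deferred_unlink);
	schedule_work(&vm->unlink_work);
}

/* Process context: here the VM's resv is taken, assuming that is what ends
 * up protecting the lists; adjust to the chosen locking scheme.
 */
static void my_vm_unlink_work(struct work_struct *work)
{
	struct my_vm *vm = container_of(work, struct my_vm, unlink_work);
	struct llist_node *node = llist_del_all(&vm->deferred_unlink);
	struct my_gpuva *va, *next;

	dma_resv_lock(vm->base.resv, NULL);
	llist_for_each_entry_safe(va, next, node, unlink_node)
		drm_gpuva_unlink(&va->base);
	dma_resv_unlock(vm->base.resv);
}
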
>>>>>>>>>> As Boris already mentioned we have the dma-resv lock by 
>>>>>>>>>> default or a driver
>>>>>>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of the 
>>>>>>>>>> latter.
>>>>>>>>>>
>>>>>>>>>>>> Also, what if the object is an external object? We can't 
>>>>>>>>>>>> use the VM's
>>>>>>>>>>>> dma-resv
>>>>>>>>>>>> lock here.
>>>>>>>>>>> Why? Typically (sync) unlink is only ever called from an 
>>>>>>>>>>> unbind-like
>>>>>>>>>>> operation where it should be trivial to grab the vm's resv. 
>>>>>>>>>>> Or, for
>>>>>>>>>>> that matter any outer lock protecting the extobj list. Rule 
>>>>>>>>>>> would be
>>>>>>>>>>> the drm_gpuvm_bo::entry::extobj  and 
>>>>>>>>>>> drm_gpuvm_bo::entry::evict would
>>>>>>>>>>> be protected by either the vm's dma_resv (or possibly an 
>>>>>>>>>>> outer lock in
>>>>>>>>>>> the case of the extobj list).
>>>>>>>>>> Outer lock wouldn't have been working for updates in the 
>>>>>>>>>> async path, but
>>>>>>>>>> shouldn't be relevant anymore. We could use the VM's resv for 
>>>>>>>>>> that.
>>>>>>>>>>
>>>>>>>>>>>>   And we can't have the GEM objs dma-resv lock held when 
>>>>>>>>>>>> calling
>>>>>>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which if 
>>>>>>>>>>>> the
>>>>>>>>>>>> refcount drops
>>>>>>>>>>>> to zero calls drm_gpuvm_bo_destroy() and 
>>>>>>>>>>>> drm_gpuvm_bo_destroy() might
>>>>>>>>>>>> drop the
>>>>>>>>>>>> last reference of the GEM object.
>>>>>>>>>>> Yes, but this is a different problem as to what exactly 
>>>>>>>>>>> protects
>>>>>>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal 
>>>>>>>>>>> per bo list
>>>>>>>>>>> lock, or if we want to keep the bo's dma_resv we need to 
>>>>>>>>>>> ensure that
>>>>>>>>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts 
>>>>>>>>>>> its obj
>>>>>>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's 
>>>>>>>>>>> refcount (I know
>>>>>>>>>>> Boris didn't like that, but requiring an explicit refcount 
>>>>>>>>>>> for a
>>>>>>>>>>> pointer you dereference unless you're under a lock that 
>>>>>>>>>>> ensures keeping
>>>>>>>>>>> the object alive is pretty much required?) But anyway for the
>>>>>>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or 
>>>>>>>>>>> internal spinlock)
>>>>>>>>>>> I don't have a strong preference.
>>>>>>>>>> We can keep the GEM objects dma-resv lock, however as 
>>>>>>>>>> mentioned above
>>>>>>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both 
>>>>>>>>>> the VM's resv lock
>>>>>>>>>> and the GEM's resv lock in case they differ.
>>>>>>>>>>
>>>>>>>>>>>>   All those problems go away with a dedicated
>>>>>>>>>>>> GEM gpuva list lock.
>>>>>>>>>>> I don't think these are real problems.
>>>>>>>>>>> With the exception of the eviction list "trick" where we currently have
>>>>>>>>>>> currently have
>>>>>>>>>>> slightly different approach to collect external bos needing 
>>>>>>>>>>> rebinding,
>>>>>>>>>>> we have this working fine.
>>>>>>>>>>>
>>>>>>>>>>> TBH I think pretty much the only situation where the 
>>>>>>>>>>> spinlock is needed
>>>>>>>>>>> is for async updates of these lists, unless a wq item can be 
>>>>>>>>>>> used for
>>>>>>>>>>> that, but it doesn't really seem like the current code 
>>>>>>>>>>> allows for such
>>>>>>>>>>> updates anyway? It complicates the code a lot, adds overhead 
>>>>>>>>>>> and also
>>>>>>>>>>> adds the requirement for refcounting during list traversal.
>>>>>>>>>>>
>>>>>>>>>>> /Thomas
>>>>>>>>>>>
>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It seems that with that also the refcount could be make 
>>>>>>>>>>>>>>> non-
>>>>>>>>>>>>>>> atomic.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> All in the spirit of the drm locking guidelines "use big 
>>>>>>>>>>>>>>> locks
>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>> possible".
>>>>>>>>>>>>>>> Lower level locks only when necessary for performance or
>>>>>>>>>>>>>>> locking inversion?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Elements popped from the original list are kept in a local list, so removal
>>>>>>>>>>>>>>>> + * and is_empty checks can still happen while we're iterating the list.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo) \
>>>>>>>>>>>>>>>> +       ({                                                                      \
>>>>>>>>>>>>>>>> +               struct drm_gpuvm_bo *__vm_bo;                                   \
>>>>>>>>>>>>>>>> +                                                                               \
>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);                                 \
>>>>>>>>>>>>>>>> +                                                                               \
>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
>>>>>>>>>>>>>>>> +               while (!list_empty(&(__gpuvm)->__list_name.list)) {             \
>>>>>>>>>>>>>>>> +                       __vm_bo = list_first_entry(&(__gpuvm)->__list_name.list, \
>>>>>>>>>>>>>>>> +                                                  struct drm_gpuvm_bo,         \
>>>>>>>>>>>>>>>> +                                                  list.entry.__list_name);     \
>>>>>>>>>>>>>>>> +                       if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {            \
>>>>>>>>>>>>>>>> +                               list_move_tail(&(__vm_bo)->list.entry.__list_name, \
>>>>>>>>>>>>>>>> +                                              __local_list);                   \
>>>>>>>>>>>>>>>> +                               break;                                          \
>>>>>>>>>>>>>>>> +                       } else {                                                \
>>>>>>>>>>>>>>>> +                               list_del_init(&(__vm_bo)->list.entry.__list_name); \
>>>>>>>>>>>>>>>> +                               __vm_bo = NULL;                                 \
>>>>>>>>>>>>>>>> +                       }                                                       \
>>>>>>>>>>>>>>>> +               }                                                               \
>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
>>>>>>>>>>>>>>>> +                                                                               \
>>>>>>>>>>>>>>>> +               __vm_bo;                                                        \
>>>>>>>>>>>>>>>> +       })
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list iterator
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after picking the first element from the
>>>>>>>>>>>>>>>> + * list, so list insertion and deletion can happen concurrently.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Typical use:
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + *     ret = 0;
>>>>>>>>>>>>>>>> + *     drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>, &my_local_list, vm_bo) {
>>>>>>>>>>>>>>>> + *             ret = do_something_with_vm_bo(..., vm_bo);
>>>>>>>>>>>>>>>> + *             if (ret)
>>>>>>>>>>>>>>>> + *                     break;
>>>>>>>>>>>>>>>> + *     }
>>>>>>>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>> + *     drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>, &my_local_list);
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Only used for internal list iterations, not meant to be exposed to the outside
>>>>>>>>>>>>>>>> + * world.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)    \
>>>>>>>>>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>>>>>>> +                                               __local_list, NULL);            \
>>>>>>>>>>>>>>>> +            __vm_bo;                                                           \
>>>>>>>>>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>>>>>>> +                                               __local_list, __vm_bo))         \
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to their original list
>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * When we're done iterating a vm_bo list, we should call restore_vm_bo_list()
>>>>>>>>>>>>>>>> + * to restore the original state and let new iterations take place.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)                  \
>>>>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>>>>> +               /* Merge back the two lists, moving local list elements to the  \
>>>>>>>>>>>>>>>> +                * head to preserve previous ordering, in case it matters.      \
>>>>>>>>>>>>>>>> +                */                                                             \
>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
>>>>>>>>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)->__list_name.list);        \
>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given list
>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Inserts the given @__vm_bo into the list specified by @__list_name and
>>>>>>>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                             \
>>>>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>>>>>>> +               if (list_empty(&(__vm_bo)->list.entry.__list_name))             \
>>>>>>>>>>>>>>>> +                       list_add_tail(&(__vm_bo)->list.entry.__list_name,       \
>>>>>>>>>>>>>>>> +                                     &(__vm_bo)->vm->__list_name.list);        \
>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given list
>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to remove from
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Removes the given @__vm_bo from the list specified by @__list_name and
>>>>>>>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)                             \
>>>>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>>>>>>> +               if (!list_empty(&(__vm_bo)->list.entry.__list_name))            \
>>>>>>>>>>>>>>>> +                       list_del_init(&(__vm_bo)->list.entry.__list_name);      \
>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), struct drm_gpuva, rb.node)
>>>>>>>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>>>>>>>>               "GPUVA tree is not empty, potentially leaking memory.\n");
>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list should be empty.\n");
>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list should be empty.\n");
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Using this function directly, it is the driver's responsibility to call
>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Note: This function is safe against concurrent insertion and removal of
>>>>>>>>>>>>>>>> + * external objects, however it is not safe against concurrent usage itself.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Drivers need to make sure to protect this case with either an outer VM lock
>>>>>>>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this function within the
>>>>>>>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the GPUVM's dma-resv lock ensures
>>>>>>>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>>>>>>>>>> +                         unsigned int num_fences)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped 
>>>>>>>>>>>>>>>> within
>>>>>>>>>>>>>>>> a given range
>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>>>>>>>>>>> mapped between @addr
>>>>>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>>>>>> drm_exec *exec,
>>>>>>>>>>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, 
>>>>>>>>>>>>>>>> end) {
>>>>>>>>>>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>> +                       return ret;
>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all associated BOs
>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects the given
>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Additionally, when calling this function with struct drm_gpuvm_exec::extra
>>>>>>>>>>>>>>>> + * being set the driver receives the given @fn callback to lock additional
>>>>>>>>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec instance. Typically, drivers
>>>>>>>>>>>>>>>> + * would call drm_exec_prepare_obj() from within this callback.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       flags = interruptible ? 
>>>>>>>>>>>>>>>> DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>>>>>>>>>> 0 |
>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, 
>>>>>>>>>>>>>>>> exec,
>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>> +                       drm_exec_retry_on_contention(exec); 
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> +                       if (ret)
>>>>>>>>>>>>>>>> +                               goto err;
>>>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +static int
>>>>>>>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned 
>>>>>>>>>>>>>>>> int
>>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, 
>>>>>>>>>>>>>>>> args-
>>>>>>>>>>>>>>>>> objs,
>>>>>>>>>>>>>>>> + args->num_objs,
>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all associated BOs
>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>>>>>>>> + * @num_objs: the number of additional 
>>>>>>>>>>>>>>>> &drm_gem_objects to
>>>>>>>>>>>>>>>> lock
>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects 
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> given &drm_gpuvm
>>>>>>>>>>>>>>>> + * contains mappings of, plus the ones given through 
>>>>>>>>>>>>>>>> @objs.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>>>>>>>>>> +                         unsigned int num_objs,
>>>>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>> +       } args;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>>>>>>>>>> interruptible);
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>>>>>>>>>> within a given range
>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>>>>>>>>>> mapped between @addr and
>>>>>>>>>>>>>>>> + * @addr + @range.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       flags = interruptible ? 
>>>>>>>>>>>>>>>> DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>>>>>>>>>> 0 |
>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec,
>>>>>>>>>>>>>>>> addr, range,
>>>>>>>>>>>>>>>> + num_fences);
>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as 
>>>>>>>>>>>>>>>> evicted
>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>>>>>>>>>>> evicted buffer
>>>>>>>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, 
>>>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and 
>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>> extobj
>>>>>>>>>>>>>>>> + * dma-resv
>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>>>>>>>>> +                        enum dma_resv_usage 
>>>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>>>>>>>> +       unsigned long index;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, 
>>>>>>>>>>>>>>>> obj) {
>>>>>>>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>>>>>>>> + drm_gpuvm_is_extobj(gpuvm,
>>>>>>>>>>>>>>>> obj) ?
>>>>>>>>>>>>>>>> + private_usage :
>>>>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of 
>>>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>>>>>>>          return vm_bo;
>>>>>>>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref 
>>>>>>>>>>>>>>>> *kref)
>>>>>>>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref 
>>>>>>>>>>>>>>>> *kref)
>>>>>>>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the 
>>>>>>>>>>>>>>>> reference of
>>>>>>>>>>>>>>>>      *
>>>>>>>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is destroyed, which
>>>>>>>>>>>>>>>> + * includes removing it from the GEMs gpuva list. Hence, if a call to this
>>>>>>>>>>>>>>>> + * function can potentially let the reference count drop to zero the caller must
>>>>>>>>>>>>>>>> + * hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>     void
>>>>>>>>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo
>>>>>>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo 
>>>>>>>>>>>>>>>> to its
>>>>>>>>>>>>>>>> &drm_gpuvm's
>>>>>>>>>>>>>>>> + * extobj list
>>>>>>>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its 
>>>>>>>>>>>>>>>> &drm_gpuvm's the
>>>>>>>>>>>>>>>> extobj list.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj 
>>>>>>>>>>>>>>>> list if
>>>>>>>>>>>>>>>> not on the list
>>>>>>>>>>>>>>>> + * already and if the corresponding &drm_gem_object is an
>>>>>>>>>>>>>>>> external object,
>>>>>>>>>>>>>>>> + * actually.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a 
>>>>>>>>>>>>>>>> &drm_gem_object to
>>>>>>>>>>>>>>>> / from a
>>>>>>>>>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from all
>>>>>>>>>>>>>>>> &drm_gpuvms evicted
>>>>>>>>>>>>>>>> + * list containing a mapping of this &drm_gem_object.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool 
>>>>>>>>>>>>>>>> evict)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>>>>>>>> +               if (evict)
>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, 
>>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>>> +               else
>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, 
>>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>     static int
>>>>>>>>>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>>>>>>>           * space
>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>> +                * @list: &list_head storing 
>>>>>>>>>>>>>>>> &drm_gpuvm_bos
>>>>>>>>>>>>>>>> serving as
>>>>>>>>>>>>>>>> +                * external object
>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the 
>>>>>>>>>>>>>>>> extobj list
>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>> +       } extobj;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>> +        * @evict: structure holding the evict list and 
>>>>>>>>>>>>>>>> evict
>>>>>>>>>>>>>>>> list lock
>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>> +                * @list: &list_head storing 
>>>>>>>>>>>>>>>> &drm_gpuvm_bos
>>>>>>>>>>>>>>>> currently being
>>>>>>>>>>>>>>>> +                * evicted
>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the evict 
>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>> +       } evict;
>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>>>>>> drm_device *drm,
>>>>>>>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>>>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>>>>>>>>>>> &drm_gem_object is an
>>>>>>>>>>>>>>>> + * external object
>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv differs
>>>>>>>>>>>>>>>> from the
>>>>>>>>>>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>> + struct drm_gem_object
>>>>>>>>>>>>>>>> *obj)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva 
>>>>>>>>>>>>>>>> *va)
>>>>>>>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, 
>>>>>>>>>>>>>>>> gpuvm__)
>>>>>>>>>>>>>>>> \
>>>>>>>>>>>>>>>>          list_for_each_entry_safe(va__, next__, 
>>>>>>>>>>>>>>>> &(gpuvm__)-
>>>>>>>>>>>>>>>>> rb.list, rb.entry)
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>>>>>>>>> &drm_exec should be.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Optionally, @extra can be set in order to lock 
>>>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>>>> &drm_gem_objects.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA 
>>>>>>>>>>>>>>>> reservations
>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>> +        * @extra: Callback and corresponding private data
>>>>>>>>>>>>>>>> for the driver to
>>>>>>>>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>> +                         unsigned int num_fences);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>>>>>>>>>>> callback
>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>> +               void *priv;
>>>>>>>>>>>>>>>> +       } extra;
>>>>>>>>>>>>>>>> +};
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common 
>>>>>>>>>>>>>>>> dma-
>>>>>>>>>>>>>>>> resv
>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>>>>>>>>>>> &drm_gem_object.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Using this function directly, it is the drivers
>>>>>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +static inline int
>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>>>>>> + unsigned int num_fences);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>>>>>>>>> +                           unsigned int num_fences);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>> +                             struct drm_gem_object 
>>>>>>>>>>>>>>>> **objs,
>>>>>>>>>>>>>>>> + unsigned int num_objs,
>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
>>>>>>>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence()
>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>>>>>> +                             enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>>> +                             enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
>>>>>>>>>>>>>>>> +                                private_usage, extobj_usage);
>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>>>>>>>                           * gpuva list.
>>>>>>>>>>>>>>>>                           */
>>>>>>>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>> +                        * @extobj: List entry to attach to the
>>>>>>>>>>>>>>>> +                        * &drm_gpuvm's extobj list.
>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>> +                        * @evict: List entry to attach to the
>>>>>>>>>>>>>>>> +                        * &drm_gpuvm's evict list.
>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>>>>>>>                  } entry;
>>>>>>>>>>>>>>>>          } list;
>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk 
>>>>>>>>>>>>>>>> over a
>>>>>>>>>>>>>>>> list of &drm_gpuva
>>>>>>>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>>>>>>>>>> iteration step
>>>>>>>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>>>>>>>           * used.
>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>> +        * Drivers receive this callback for every evicted
>>>>>>>>>>>>>>>> &drm_gem_object being
>>>>>>>>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>>>>>>>>>> specific variant of
>>>>>>>>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-20  7:44                                 ` Thomas Hellström
@ 2023-09-20  8:29                                   ` Thomas Hellström
  2023-09-20 10:51                                   ` Christian König
  1 sibling, 0 replies; 77+ messages in thread
From: Thomas Hellström @ 2023-09-20  8:29 UTC (permalink / raw)
  To: Christian König, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 7327 bytes --]


On 9/20/23 09:44, Thomas Hellström wrote:
> Hi,
>
> On 9/20/23 07:37, Christian König wrote:
>> Am 19.09.23 um 17:23 schrieb Thomas Hellström:
>>>
>>> On 9/19/23 17:16, Danilo Krummrich wrote:
>>>> On 9/19/23 14:21, Thomas Hellström wrote:
>>>>> Hi Christian
>>>>>
>>>>> On 9/19/23 14:07, Christian König wrote:
>>>>>> Am 13.09.23 um 17:46 schrieb Danilo Krummrich:
>>>>>>> On 9/13/23 17:33, Christian König wrote:
>>>>>>>> Am 13.09.23 um 17:15 schrieb Danilo Krummrich:
>>>>>>>>> On 9/13/23 16:26, Christian König wrote:
>>>>>>>>>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>>>>>>>>>> As mentioned in a different mail thread, the reply is based 
>>>>>>>>>>> on the assumption
>>>>>>>>>>> that we don't support anything else than GPUVM updates from 
>>>>>>>>>>> the IOCTL.
>>>>>>>>>>
>>>>>>>>>> I think that this assumption is incorrect.
>>>>>>>>>
>>>>>>>>> Well, more precisely I should have said "don't support GPUVM 
>>>>>>>>> updated from within
>>>>>>>>> fence signaling critical sections". And looking at the code, 
>>>>>>>>> that doesn't seem what
>>>>>>>>> you're doing there.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Vulkan is just one specific use case, but this here should 
>>>>>>>>>> probably be able to handle other use cases as well.
>>>>>>>>>>
>>>>>>>>>> Especially with HMM you get the requirement that you need to 
>>>>>>>>>> be able to invalidate GPUVM mappings without grabbing a 
>>>>>>>>>> reservation lock.
>>>>>>>>>
>>>>>>>>> What do you mean with "invalidate GPUVM mappings" in this 
>>>>>>>>> context? drm_gpuvm_bo_evict()
>>>>>>>>> should only be called from a ttm_device_funcs::move callback, 
>>>>>>>>> we should hold the dma-resv
>>>>>>>>> lock there.
>>>>>>>>
>>>>>>>> Well the question is which dma-resv lock do we hold?
>>>>>>>>
>>>>>>>> In the move callback we only hold the dma-resv lock of the BO 
>>>>>>>> which is moved, but when that is a shared BO then that's not 
>>>>>>>> the same as the one for the VM.
>>>>>>>
>>>>>>> Correct, Thomas' idea was to use the GEM's dma_resv lock to 
>>>>>>> protect drm_gpuvm_bo::evicted
>>>>>>> and then actually move the drm_gpuvm_bo to the VM's evicted list 
>>>>>>> once we grabbed all
>>>>>>> dma-resv locks when locking the VM's BOs using drm_exec. We can 
>>>>>>> remove them from the evicted
>>>>>>> list on validate(). This way we never touch the evicted list 
>>>>>>> without holding at least the VM's
>>>>>>> dma-resv lock.
>>>>>>>
>>>>>>> Do you have any concerns about that?
>>>>>>
>>>>>> Scratching my head a bit how that is supposed to work.
>>>>>>
>>>>>> This implies that you go over all the evicted BOs during 
>>>>>> validation and not just the one mentioned in the CS.
>>>>>>
>>>>>> That might work for Vulkan, but is pretty much a no-go for OpenGL.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> See what the eviction lock in amdgpu is doing for example.
>>>>>>>>>
>>>>>>>>> The eviction_lock seems to protect a VM state "evicting" of 
>>>>>>>>> whether any BO that
>>>>>>>>> is associated with the VM is currently evicting. At the same 
>>>>>>>>> time amdgpu protects
>>>>>>>>> the evicted list of the VM with a different lock. So this 
>>>>>>>>> seems to be entirely
>>>>>>>>> unrelated. Tracking a "currently evicting" state is not part 
>>>>>>>>> of the GPUVM
>>>>>>>>> implementation currently and hence nothing would change for 
>>>>>>>>> amdgpu there.
>>>>>>>>
>>>>>>>> Sorry for the confusion we use different terminology in amdgpu.
>>>>>>>>
>>>>>>>> The eviction lock and evicted state is for the VM page tables, 
>>>>>>>> e.g. if the whole VM is currently not used and swapped out or 
>>>>>>>> even de-allocated.
>>>>>>>>
>>>>>>>> This is necessary because we have cases where we need to access 
>>>>>>>> the VM data without holding the dma-resv lock of this VM. 
>>>>>>>> Especially figuring out which parts of an address space contain 
>>>>>>>> mappings and which doesn't.
>>>>>>>
>>>>>>> I think this is fine, this has nothing to do with lists of 
>>>>>>> evicted GEM objects or external GEM
>>>>>>> objects, right? Marking mappings (drm_gpuva) as invalidated 
>>>>>>> (DRM_GPUVA_INVALIDATED) or accessing
>>>>>>> the VA space does not require any dma-resv locks.
>>>>>>
>>>>>> I hope so, but I'm not 100% sure.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> This is a requirement which comes with HMM handling, you won't 
>>>>>>>> see this with Vulkan (or OpenGL, VAAPI etc..).
>>>>>>>>
>>>>>>>>
>>>>>>>> The invalidation lock on the other hand is what in this 
>>>>>>>> discussion is called eviction lock. This one is needed because 
>>>>>>>> what I wrote above, during the move callback only the dma-resv 
>>>>>>>> of the BO which is moved is locked, but not necessarily the 
>>>>>>>> dma-resv of the VM.
>>>>>>>
>>>>>>> That's yet another thing, right? This is used to track whether 
>>>>>>> *any* BO that belongs to the VM is
>>>>>>> currently being evicted, correct? As mentioned, as by now this 
>>>>>>> is not supported in GPUVM and hence
>>>>>>> would be the same driver specific code with the same driver 
>>>>>>> specific lock.
>>>>>>
>>>>>> That is most likely a show stopper using this for OpenGL based 
>>>>>> workloads as far as I can see. For those you need to be able to 
>>>>>> figure out which non-VM BOs have been evicted and which parts of 
>>>>>> the VM needs updates.
>>>>>
>>>>> We identify those with a bool in the gpuvm_bo, and that bool is 
>>>>> protected by the bo_resv. In essence, the "evicted" list must be 
>>>>> made up-to-date with all relevant locks held before traversing in 
>>>>> the next exec.
>>>>
>>>> What I still miss with this idea is how do we find all the 
>>>> drm_gpuvm_bo structures with the evicted bool set to true? When 
>>>> doing the drm_exec dance we come across all external ones and can 
>>>> add them to the list if needed, but what about the BOs having the 
>>>> VM's dma-resv?
>>>
>>> Oh, they can be added to the evict list directly (no bool needed) in 
>>> the eviction code, like in v3. Since for those we indeed hold the 
>>> VM's dma_resv since it's aliased with the object's dma-resv.
>>
>> Yeah, I wanted to note what Danilo seems to think about as well. How 
>> do we figure out the non-VM BOs evicted?
>>
>> We can't walk over the list of all non-VM BOs on every submission, 
>> that's too much overhead for cases with lots of non-VM BOs. 
>>
>> And we can't rely on userspace sending all non-VM BOs as used list 
>> down to the kernel with each submission.
>>
>> Regards,
>> Christian.
>
> No, that's not needed: Mechanism below.
>
> 1) We maintain an evicted list. Typically protected by the vm resv.
> 2) Each gpuvm_bo has a bool "evicted". Protected by the bo resv.
>
> a) Evicting a vm bo: The vm resv is held by the eviction code. Just 
> put it on the evicted list.
> b) Evicting a shared/external bo: The bo resv is held by the eviction 
> code. Set the "evicted" bool
> c) Validating the evicted list on exec: Loop through all 
> *external/shared* bos. Lock them. After locking, check the "evicted" 
> bool, if it's true. put the bo on the evicted list (we hold the VM 
> resv at this point) and clear the "evicted" bool. Note that other vms 
> will have their own gpuvm_bo which is marked evicted.
>
> I have this coded up in a patch for Xe and it seems to be working 
> properly.
>
> /Thomas
>
Something along the lines of the attached patch.


[-- Attachment #2: 0001-drm-gpuvm-Adjustment-for-extobj-eviction.patch --]
[-- Type: text/x-patch, Size: 3065 bytes --]

From 12778a3f1b2ca055ff658864c538f944550c9adf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Thomas=20Hellstr=C3=B6m?= <thomas.hellstrom@linux.intel.com>
Date: Thu, 14 Sep 2023 10:23:52 +0200
Subject: [PATCH] drm/gpuvm: Adjustment for extobj eviction.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/drm_gpuvm.c | 32 ++++++++++++++++++++++++--------
 include/drm/drm_gpuvm.h     |  7 ++++++-
 2 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 11a0aee1c038..029c38d7fa4d 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -956,6 +956,11 @@ drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
 		ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
 		if (ret)
 			break;
+
+		if (vm_bo->evicted) {
+			drm_gpuvm_bo_list_add(vm_bo, evict);
+			vm_bo->evicted = false;
+		}
 	}
 	/* Drop ref in case we break out of the loop. */
 	drm_gpuvm_bo_put(vm_bo);
@@ -1431,6 +1436,21 @@ drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
 
+void
+drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, bool evict)
+{
+	if (drm_gpuvm_is_extobj(vm_bo->vm, vm_bo->obj)) {
+		vm_bo->evicted = evict;
+		return;
+	}
+
+	if (evict)
+		drm_gpuvm_bo_list_add(vm_bo, evict);
+	else
+		drm_gpuvm_bo_list_del(vm_bo, evict);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
+
 /**
  * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to / from a
  * &drm_gpuvms evicted list
@@ -1441,18 +1461,14 @@ EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
  * list containing a mapping of this &drm_gem_object.
  */
 void
-drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
+drm_gpuvm_gem_evict(struct drm_gem_object *obj, bool evict)
 {
 	struct drm_gpuvm_bo *vm_bo;
 
-	drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
-		if (evict)
-			drm_gpuvm_bo_list_add(vm_bo, evict);
-		else
-			drm_gpuvm_bo_list_del(vm_bo, evict);
-	}
+	drm_gem_for_each_gpuvm_bo(vm_bo, obj)
+		drm_gpuvm_bo_evict(vm_bo, evict);
 }
-EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
+EXPORT_SYMBOL_GPL(drm_gpuvm_gem_evict);
 
 static int
 __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index dce26a923d5d..c2216f18243f 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -550,6 +550,9 @@ struct drm_gpuvm_bo {
 	 */
 	struct kref kref;
 
+	/** @evicted: Whether the bo needs revalidation and rebinding. */
+	bool evicted;
+
 	/**
 	 * @list: Structure containing all &list_heads.
 	 */
@@ -615,7 +618,9 @@ struct drm_gpuvm_bo *
 drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
 		  struct drm_gem_object *obj);
 
-void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
+void drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, bool evict);
+
+void drm_gpuvm_gem_evict(struct drm_gem_object *obj, bool evict);
 void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
 
 /**
-- 
2.41.0
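
For reference, a minimal driver-side sketch of how the helpers discussed in
this thread could be wired together in an exec/submit path, assuming the v3
interfaces plus the adjustment from the attached patch, and assuming the
&drm_gpuvm_exec has already been set up by the driver; the function name
my_driver_lock_and_validate() is made up for illustration:

#include <drm/drm_exec.h>
#include <drm/drm_gpuvm.h>

/* Sketch only: lock the VM's dma-resv plus all extobj dma-resvs, then
 * validate whatever ended up on the evict list. With the adjustment above,
 * drm_gpuvm_prepare_objects() moves extobj vm_bos whose "evicted" bool is
 * set onto the evict list while their locks are held, so
 * drm_gpuvm_validate() sees both VM-local and shared evictions.
 */
static int my_driver_lock_and_validate(struct drm_gpuvm_exec *vm_exec)
{
	int ret;

	/* Reserve one fence slot per object, interruptible waits. */
	ret = drm_gpuvm_exec_lock(vm_exec, 1, true);
	if (ret)
		return ret;

	/* Calls the driver's ops->bo_validate() for every evicted object,
	 * typically a ttm_bo_validate() wrapper.
	 */
	ret = drm_gpuvm_validate(vm_exec->vm);
	if (ret)
		drm_gpuvm_exec_unlock(vm_exec);

	return ret;
}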


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-20  7:44                                 ` Thomas Hellström
  2023-09-20  8:29                                   ` Thomas Hellström
@ 2023-09-20 10:51                                   ` Christian König
  2023-09-20 12:06                                     ` Thomas Hellström
  1 sibling, 1 reply; 77+ messages in thread
From: Christian König @ 2023-09-20 10:51 UTC (permalink / raw)
  To: Thomas Hellström, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

Am 20.09.23 um 09:44 schrieb Thomas Hellström:
> Hi,
>
> On 9/20/23 07:37, Christian König wrote:
>> Am 19.09.23 um 17:23 schrieb Thomas Hellström:
>>>
>>> On 9/19/23 17:16, Danilo Krummrich wrote:
>>>> On 9/19/23 14:21, Thomas Hellström wrote:
>>>>> Hi Christian
>>>>>
>>>>> On 9/19/23 14:07, Christian König wrote:
>>>>>> Am 13.09.23 um 17:46 schrieb Danilo Krummrich:
>>>>>>> On 9/13/23 17:33, Christian König wrote:
>>>>>>>> Am 13.09.23 um 17:15 schrieb Danilo Krummrich:
>>>>>>>>> On 9/13/23 16:26, Christian König wrote:
>>>>>>>>>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>>>>>>>>>> As mentioned in a different mail thread, the reply is based 
>>>>>>>>>>> on the assumption
>>>>>>>>>>> that we don't support anything else than GPUVM updates from 
>>>>>>>>>>> the IOCTL.
>>>>>>>>>>
>>>>>>>>>> I think that this assumption is incorrect.
>>>>>>>>>
>>>>>>>>> Well, more precisely I should have said "don't support GPUVM 
>>>>>>>>> updated from within
>>>>>>>>> fence signaling critical sections". And looking at the code, 
>>>>>>>>> that doesn't seem what
>>>>>>>>> you're doing there.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Vulkan is just one specific use case, but this here should 
>>>>>>>>>> probably be able to handle other use cases as well.
>>>>>>>>>>
>>>>>>>>>> Especially with HMM you get the requirement that you need to 
>>>>>>>>>> be able to invalidate GPUVM mappings without grabbing a 
>>>>>>>>>> reservation lock.
>>>>>>>>>
>>>>>>>>> What do you mean with "invalidate GPUVM mappings" in this 
>>>>>>>>> context? drm_gpuvm_bo_evict()
>>>>>>>>> should only be called from a ttm_device_funcs::move callback, 
>>>>>>>>> we should hold the dma-resv
>>>>>>>>> lock there.
>>>>>>>>
>>>>>>>> Well the question is which dma-resv lock do we hold?
>>>>>>>>
>>>>>>>> In the move callback we only hold the dma-resv lock of the BO 
>>>>>>>> which is moved, but when that is a shared BO then that's not 
>>>>>>>> the same as the one for the VM.
>>>>>>>
>>>>>>> Correct, Thomas' idea was to use the GEM's dma_resv lock to 
>>>>>>> protect drm_gpuvm_bo::evicted
>>>>>>> and then actually move the drm_gpuvm_bo to the VM's evicted list 
>>>>>>> once we grabbed all
>>>>>>> dma-resv locks when locking the VM's BOs using drm_exec. We can 
>>>>>>> remove them from the evicted
>>>>>>> list on validate(). This way we never touch the evicted list 
>>>>>>> without holding at least the VM's
>>>>>>> dma-resv lock.
>>>>>>>
>>>>>>> Do you have any concerns about that?
>>>>>>
>>>>>> Scratching my head a bit how that is supposed to work.
>>>>>>
>>>>>> This implies that you go over all the evicted BOs during 
>>>>>> validation and not just the one mentioned in the CS.
>>>>>>
>>>>>> That might work for Vulkan, but is pretty much a no-go for OpenGL.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> See what the eviction lock in amdgpu is doing for example.
>>>>>>>>>
>>>>>>>>> The eviction_lock seems to protect a VM state "evicting" of 
>>>>>>>>> whether any BO that
>>>>>>>>> is associated with the VM is currently evicting. At the same 
>>>>>>>>> time amdgpu protects
>>>>>>>>> the evicted list of the VM with a different lock. So this 
>>>>>>>>> seems to be entirely
>>>>>>>>> unrelated. Tracking a "currently evicting" state is not part 
>>>>>>>>> of the GPUVM
>>>>>>>>> implementation currently and hence nothing would change for 
>>>>>>>>> amdgpu there.
>>>>>>>>
>>>>>>>> Sorry for the confusion we use different terminology in amdgpu.
>>>>>>>>
>>>>>>>> The eviction lock and evicted state is for the VM page tables, 
>>>>>>>> e.g. if the whole VM is currently not used and swapped out or 
>>>>>>>> even de-allocated.
>>>>>>>>
>>>>>>>> This is necessary because we have cases where we need to access 
>>>>>>>> the VM data without holding the dma-resv lock of this VM. 
>>>>>>>> Especially figuring out which parts of an address space contain 
>>>>>>>> mappings and which doesn't.
>>>>>>>
>>>>>>> I think this is fine, this has nothing to do with lists of 
>>>>>>> evicted GEM objects or external GEM
>>>>>>> objects, right? Marking mappings (drm_gpuva) as invalidated 
>>>>>>> (DRM_GPUVA_INVALIDATED) or accessing
>>>>>>> the VA space does not require any dma-resv locks.
>>>>>>
>>>>>> I hope so, but I'm not 100% sure.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> This is a requirement which comes with HMM handling, you won't 
>>>>>>>> see this with Vulkan (or OpenGL, VAAPI etc..).
>>>>>>>>
>>>>>>>>
>>>>>>>> The invalidation lock on the other hand is what in this 
>>>>>>>> discussion is called eviction lock. This one is needed because 
>>>>>>>> what I wrote above, during the move callback only the dma-resv 
>>>>>>>> of the BO which is moved is locked, but not necessarily the 
>>>>>>>> dma-resv of the VM.
>>>>>>>
>>>>>>> That's yet another thing, right? This is used to track whether 
>>>>>>> *any* BO that belongs to the VM is
>>>>>>> currently being evicted, correct? As mentioned, as by now this 
>>>>>>> is not supported in GPUVM and hence
>>>>>>> would be the same driver specific code with the same driver 
>>>>>>> specific lock.
>>>>>>
>>>>>> That is most likely a show stopper using this for OpenGL based 
>>>>>> workloads as far as I can see. For those you need to be able to 
>>>>>> figure out which non-VM BOs have been evicted and which parts of 
>>>>>> the VM needs updates.
>>>>>
>>>>> We identify those with a bool in the gpuvm_bo, and that bool is 
>>>>> protected by the bo_resv. In essence, the "evicted" list must be 
>>>>> made up-to-date with all relevant locks held before traversing in 
>>>>> the next exec.
>>>>
>>>> What I still miss with this idea is how do we find all the 
>>>> drm_gpuvm_bo structures with the evicted bool set to true? When 
>>>> doing the drm_exec dance we come across all external ones and can 
>>>> add them to the list if needed, but what about the BOs having the 
>>>> VM's dma-resv?
>>>
>>> Oh, they can be added to the evict list directly (no bool needed) in 
>>> the eviction code, like in v3. Since for those we indeed hold the 
>>> VM's dma_resv since it's aliased with the object's dma-resv.
>>
>> Yeah, I wanted to note what Danilo seems to think about as well. How 
>> do we figure out the non-VM BOs evicted?
>>
>> We can't walk over the list of all non-VM BOs on every submission, 
>> that's too much overhead for cases with lots of non-VM BOs. 
>>
>> And we can't rely on userspace sending all non-VM BOs as used list 
>> down to the kernel with each submission.
>>
>> Regards,
>> Christian.
>
> No, that's not needed: Mechanism below.
>
> 1) We maintain an evicted list. Typically protected by the vm resv.
> 2) Each gpuvm_bo has a bool "evicted". Protected by the bo resv.
>
> a) Evicting a vm bo: The vm resv is held by the eviction code. Just 
> put it on the evicted list.
> b) Evicting a shared/external bo: The bo resv is held by the eviction 
> code. Set the "evicted" bool
> c) Validating the evicted list on exec:


> Loop through all *external/shared* bos.

And this is what you can't do. For Vulkan it probably doesn't matter, 
but for OpenGL and especially multimedia we have many more BOs on the 
shared list than what's allocated for the VM.

Regards,
Christian.

> Lock them. After locking, check the "evicted" bool, if it's true. put 
> the bo on the evicted list (we hold the VM resv at this point) and 
> clear the "evicted" bool. Note that other vms will have their own 
> gpuvm_bo which is marked evicted.
>
> I have this coded up in a patch for Xe and it seems to be working 
> properly.
>
> /Thomas
>
>
>>
>>>
>>> /Thomas
>>>
>>>
>>>
>>>>
>>>>>
>>>>> If you mean that we need to unbind all vmas of all vms of evicted 
>>>>> bos before evicting, we don't do that, at least not in Xe, since 
>>>>> evicting we wait for VM idle, and it can't access anything through 
>>>>> the stale vmas until they have been revalidated and rebound.
>>>>>
>>>>> /Thomas
>>>>>
>>>>>
>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström 
>>>>>>>>>>> wrote:
>>>>>>>>>>>> Hi!
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas Hellström 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas 
>>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>>> Hi, Danilo,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>>> So far the DRM GPUVA manager offers common 
>>>>>>>>>>>>>>>>> infrastructure to
>>>>>>>>>>>>>>>>> track GPU VA
>>>>>>>>>>>>>>>>> allocations and mappings, generically connect GPU VA 
>>>>>>>>>>>>>>>>> mappings
>>>>>>>>>>>>>>>>> to their
>>>>>>>>>>>>>>>>> backing buffers and perform more complex mapping 
>>>>>>>>>>>>>>>>> operations
>>>>>>>>>>>>>>>>> on the GPU VA
>>>>>>>>>>>>>>>>> space.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>>>>>>>>>>> drivers, which
>>>>>>>>>>>>>>>>> can potentially be generalized in order to make the 
>>>>>>>>>>>>>>>>> DRM GPUVA
>>>>>>>>>>>>>>>>> manager
>>>>>>>>>>>>>>>>> represent a basic GPU-VM implementation. In this context,
>>>>>>>>>>>>>>>>> this patch aims
>>>>>>>>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not being 
>>>>>>>>>>>>>>>>> used
>>>>>>>>>>>>>>>>> outside of
>>>>>>>>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>>>>>>>>>>> which are
>>>>>>>>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 3) Provide functions to efficiently lock all GEM 
>>>>>>>>>>>>>>>>> objects dma-
>>>>>>>>>>>>>>>>> resv the
>>>>>>>>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>>>>>>>>>> contains mappings
>>>>>>>>>>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>>>>>>>>>>> accelerated.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 5) Provide some convenience functions for common 
>>>>>>>>>>>>>>>>> patterns.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Rather than being designed as a "framework", the 
>>>>>>>>>>>>>>>>> target is to
>>>>>>>>>>>>>>>>> make all
>>>>>>>>>>>>>>>>> features appear as a collection of optional helper 
>>>>>>>>>>>>>>>>> functions,
>>>>>>>>>>>>>>>>> such that
>>>>>>>>>>>>>>>>> drivers are free to make use of the DRM GPUVA managers 
>>>>>>>>>>>>>>>>> basic
>>>>>>>>>>>>>>>>> functionality and opt-in for other features without 
>>>>>>>>>>>>>>>>> setting
>>>>>>>>>>>>>>>>> any feature
>>>>>>>>>>>>>>>>> flags, just by making use of the corresponding functions.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>>>>>>>>>>> locking for drivers
>>>>>>>>>>>>>>>>> updating the GPU VA space within the fence signalling 
>>>>>>>>>>>>>>>>> path.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an 
>>>>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>>>>> instance of this
>>>>>>>>>>>>>>>>>      * particular combination. If not existent a new 
>>>>>>>>>>>>>>>>> instance
>>>>>>>>>>>>>>>>> is created and linked
>>>>>>>>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>>>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>>>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and
>>>>>>>>>>>>>>>>> evicted objects. Those
>>>>>>>>>>>>>>>>> + * list are maintained in order to accelerate locking of
>>>>>>>>>>>>>>>>> dma-resv locks and
>>>>>>>>>>>>>>>>> + * validation of evicted objects bound in a 
>>>>>>>>>>>>>>>>> &drm_gpuvm. For
>>>>>>>>>>>>>>>>> instance all
>>>>>>>>>>>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm 
>>>>>>>>>>>>>>>>> can be
>>>>>>>>>>>>>>>>> locked by calling
>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>>>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>>>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is
>>>>>>>>>>>>>>>>> also possible to lock
>>>>>>>>>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>>>>>>>>>> corresponding parameters to
>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the 
>>>>>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>>>>>> loop while making
>>>>>>>>>>>>>>>>> + * use of helper functions such as 
>>>>>>>>>>>>>>>>> drm_gpuvm_prepare_range()
>>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Every bound &drm_gem_object is treated as external 
>>>>>>>>>>>>>>>>> object
>>>>>>>>>>>>>>>>> when its &dma_resv
>>>>>>>>>>>>>>>>> + * structure is different than the &drm_gpuvm's common
>>>>>>>>>>>>>>>>> &dma_resv structure.
>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for 
>>>>>>>>>>>>>>>>> the same
>>>>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>>>>>>>>>>> creations and destructions
>>>>>>>>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances 
>>>>>>>>>>>>>>>>> unique.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of 
>>>>>>>>>>>>>>>>> external and
>>>>>>>>>>>>>>>>> evicted objects are
>>>>>>>>>>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>>>>>>>>>>> iteration internally.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * However, drivers still need to ensure to protect 
>>>>>>>>>>>>>>>>> concurrent
>>>>>>>>>>>>>>>>> calls to functions
>>>>>>>>>>>>>>>>> + * iterating those lists, such as 
>>>>>>>>>>>>>>>>> drm_gpuvm_validate() and
>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function 
>>>>>>>>>>>>>>>>> contains
>>>>>>>>>>>>>>>>> a particular
>>>>>>>>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Functions adding or removing entries from those 
>>>>>>>>>>>>>>>>> lists,
>>>>>>>>>>>>>>>>> such as
>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() 
>>>>>>>>>>>>>>>>> may be
>>>>>>>>>>>>>>>>> called with external
>>>>>>>>>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>>>>>>>>>> corresponding list to be
>>>>>>>>>>>>>>>>> + * (safely) modified while potentially being 
>>>>>>>>>>>>>>>>> iterated by
>>>>>>>>>>>>>>>>> other API functions.
>>>>>>>>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>>>>>>>>      *   }
>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo 
>>>>>>>>>>>>>>>>> element
>>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to 
>>>>>>>>>>>>>>>>> store
>>>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>>>>>>>>>>> drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list 
>>>>>>>>>>>>>>>>> iteration.
>>>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after 
>>>>>>>>>>>>>>>>> picking the
>>>>>>>>>>>>>>>>> first element from
>>>>>>>>>>>>>>>>> + * the list, so list insertion deletion can happen
>>>>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>>>>> Are the list spinlocks needed for that async state 
>>>>>>>>>>>>>>>> update from
>>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>>>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Otherwise it should be sufficient to protect the lists 
>>>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>>>> gpuvm's resv
>>>>>>>>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If those spinlocks are still needed in some situations, 
>>>>>>>>>>>>>>>> perhaps
>>>>>>>>>>>>>>>> could we
>>>>>>>>>>>>>>>> have an option to set them to NULL (Like IIRC the maple 
>>>>>>>>>>>>>>>> tree
>>>>>>>>>>>>>>>> allows for)?
>>>>>>>>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>>>>>>>>> holding only the dma-resv lock from the BO this function 
>>>>>>>>>>>>>>> gets
>>>>>>>>>>>>>>> called for. Hence,
>>>>>>>>>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() 
>>>>>>>>>>>>>>> calls with
>>>>>>>>>>>>>>> different BOs.
>>>>>>>>>>>>>> No. Only if you try to add external objects to the vm's 
>>>>>>>>>>>>>> evict list
>>>>>>>>>>>>>> from
>>>>>>>>>>>>>> within the evict code. That's not necessary since you 
>>>>>>>>>>>>>> loop through
>>>>>>>>>>>>>> all
>>>>>>>>>>>>>> external objects anyway when locking them so an "evicted" 
>>>>>>>>>>>>>> bool in
>>>>>>>>>>>>>> the vm_bo,
>>>>>>>>>>>>>> protected by the bo resv would be sufficient. The extobj 
>>>>>>>>>>>>>> locking
>>>>>>>>>>>>>> loop can
>>>>>>>>>>>>>> then add the bo to the evicted list.
>>>>>>>>>>>>> And validate() can remove it while still holding all 
>>>>>>>>>>>>> dma-resv locks,
>>>>>>>>>>>>> neat!
>>>>>>>>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>>>>>>>>> concurrently? What
>>>>>>>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>>>>>>>>> drm_gpuva_unlink()?
>>>>>>>>>>>>> Are we guaranteed that at this point of time the 
>>>>>>>>>>>>> drm_gpuvm_bo is not
>>>>>>>>>>>>> on the
>>>>>>>>>>>>> evicted list? Because otherwise we would call 
>>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>>> with the
>>>>>>>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>>> might drop the last reference to the drm_gem_object and 
>>>>>>>>>>>>> hence we'd
>>>>>>>>>>>>> potentially
>>>>>>>>>>>>> free the dma-resv lock while holding it, at least if it's 
>>>>>>>>>>>>> an external
>>>>>>>>>>>>> object.
>>>>>>>>>>>> Easiest way in this scheme is to think of the lists as 
>>>>>>>>>>>> being protected
>>>>>>>>>>>> by the vm's resv lock. That means anybody calling unlink() 
>>>>>>>>>>>> must also
>>>>>>>>>>>> hold the vm's resv lock. (Which is OK from an UAF point of 
>>>>>>>>>>>> view, but
>>>>>>>>>>>> perhaps not from a locking inversion POV from an async list 
>>>>>>>>>>>> update).
>>>>>>>>>>> This would mean that on unlink() we'd need to hold the VM's 
>>>>>>>>>>> resv lock and the
>>>>>>>>>>> corresponding GEM's resv lock (in case they're not the same 
>>>>>>>>>>> anyways) because the
>>>>>>>>>>> VM's resv lock would protect the external / evicted object 
>>>>>>>>>>> lists and the GEM
>>>>>>>>>>> objects resv lock protects the GEM's list of drm_gpuvm_bos 
>>>>>>>>>>> and the
>>>>>>>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>>>>>>>>
>>>>>>>>>>>>>>> For extobjs an outer lock would be enough in case of Xe, 
>>>>>>>>>>>>>>> but I
>>>>>>>>>>>>>>> really would not
>>>>>>>>>>>>>>> like to add even more complexity just to get the 
>>>>>>>>>>>>>>> spinlock out of
>>>>>>>>>>>>>>> the way in case
>>>>>>>>>>>>>>> the driver already has an outer lock protecting this path.
>>>>>>>>>>>>>> I must disagree here. These spinlocks and atomic 
>>>>>>>>>>>>>> operations are
>>>>>>>>>>>>>> pretty
>>>>>>>>>>>>>> costly and as discussed earlier this type of locking was 
>>>>>>>>>>>>>> the reason
>>>>>>>>>>>>>> (at
>>>>>>>>>>>>>> least according to the commit message) that made 
>>>>>>>>>>>>>> Christian drop the
>>>>>>>>>>>>>> XArray
>>>>>>>>>>>>>> use in drm_exec for the same set of objects: "The locking 
>>>>>>>>>>>>>> overhead
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>>>>>>>>>>> complexity and a
>>>>>>>>>>>>>> single wide lock following the drm locking guidelines set 
>>>>>>>>>>>>>> out by
>>>>>>>>>>>>>> Daniel and
>>>>>>>>>>>>>> David should really be the default choice with an opt-in 
>>>>>>>>>>>>>> for a
>>>>>>>>>>>>>> spinlock if
>>>>>>>>>>>>>> needed for async and pushing out to a wq is not an option.
>>>>>>>>>>>>> For the external object list an outer lock would work as 
>>>>>>>>>>>>> long as it's
>>>>>>>>>>>>> not the
>>>>>>>>>>>>> dma-resv lock of the corresponding GEM object, since here 
>>>>>>>>>>>>> we actually
>>>>>>>>>>>>> need to
>>>>>>>>>>>>> remove the list entry from the external object list on
>>>>>>>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>>>>>>>> It's just a bit weird design wise that drivers would need 
>>>>>>>>>>>>> to take
>>>>>>>>>>>>> this outer
>>>>>>>>>>>>> lock on:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also 
>>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>>>>>>>
>>>>>>>>>>>>> Given that it seems reasonable to do all the required locking
>>>>>>>>>>>>> internally.
>>>>>>>>>>>>  From a design POV, there has been a clear direction in XE 
>>>>>>>>>>>> to make
>>>>>>>>>>>> things similar to mmap() / munmap(), so this outer lock, 
>>>>>>>>>>>> which in Xe is
>>>>>>>>>>>> an rwsem, is used in a similar way as the mmap_lock. It's 
>>>>>>>>>>>> protecting
>>>>>>>>>>>> the page-table structures and vma rb tree, the userptr 
>>>>>>>>>>>> structures and
>>>>>>>>>>>> the extobj list. Basically it's taken early in the exec 
>>>>>>>>>>>> IOCTL, the
>>>>>>>>>>>> VM_BIND ioctl, the compute rebind worker and the pagefault 
>>>>>>>>>>>> handler, so
>>>>>>>>>>>> all of the above are just asserting that it is taken in the 
>>>>>>>>>>>> correct
>>>>>>>>>>>> mode.
>>>>>>>>>>>>
>>>>>>>>>>>> But strictly with this scheme one could also use the vm's 
>>>>>>>>>>>> dma_resv for
>>>>>>>>>>>> the extobj list since with drm_exec, it's locked before 
>>>>>>>>>>>> traversing the
>>>>>>>>>>>> list.
>>>>>>>>>>>>
>>>>>>>>>>>> The whole point of this scheme is to rely on locks that you 
>>>>>>>>>>>> already are
>>>>>>>>>>>> supposed to be holding for various reasons and is simple to 
>>>>>>>>>>>> comprehend.
>>>>>>>>>>> I don't agree that we're supposed to hold the VM's resv lock 
>>>>>>>>>>> anyways for
>>>>>>>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), but 
>>>>>>>>>>> I'm fine using it
>>>>>>>>>>> for that purpose nevertheless.
>>>>>>>>>>>
>>>>>>>>>>>>> In order to at least place lockdep checks, the driver 
>>>>>>>>>>>>> would need to
>>>>>>>>>>>>> supply the
>>>>>>>>>>>>> corresponding lock's lockdep_map, because the GPUVM 
>>>>>>>>>>>>> otherwise doesn't
>>>>>>>>>>>>> know about
>>>>>>>>>>>>> the lock.
>>>>>>>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>>>>>>> I'd really like to avoid that, especially now that 
>>>>>>>>>>> everything got simpler. We
>>>>>>>>>>> should define the actual locks to take instead.
>>>>>>>>>>>
>>>>>>>>>>>>> Out of curiosity, what is the overhead of a spin_lock() 
>>>>>>>>>>>>> that doesn't
>>>>>>>>>>>>> need to
>>>>>>>>>>>>> spin?
>>>>>>>>>>>> I guess it's hard to tell exactly, but it is much lower on 
>>>>>>>>>>>> modern x86
>>>>>>>>>>>> than what it used to be. Not sure about ARM, which is the 
>>>>>>>>>>>> other
>>>>>>>>>>>> architecture important to us. I figure if there is little 
>>>>>>>>>>>> cache-line
>>>>>>>>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>>>>>>>>
>>>>>>>>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>                                  spinlock_t *lock)
>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>>>>>>>          spin_lock(lock);
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For such drivers, that would require anybody calling 
>>>>>>>>>>>>>>>> unlink to
>>>>>>>>>>>>>>>> hold the vm's
>>>>>>>>>>>>>>>> resv, though.
>>>>>>>>>>>>>>> In V4 I want to go back to having a dedicated lock for 
>>>>>>>>>>>>>>> the GEMs
>>>>>>>>>>>>>>> gpuva list (or
>>>>>>>>>>>>>>> VM_BO list to be more precise). We can't just use the 
>>>>>>>>>>>>>>> dma-resv
>>>>>>>>>>>>>>> lock for that
>>>>>>>>>>>>>>> with VM_BO abstractions, because on destruction of a 
>>>>>>>>>>>>>>> VM_BO we
>>>>>>>>>>>>>>> otherwise wouldn't
>>>>>>>>>>>>>>> be allowed to already hold the dma-resv lock. That's the 
>>>>>>>>>>>>>>> fix I
>>>>>>>>>>>>>>> was referring to
>>>>>>>>>>>>>>> earlier.
>>>>>>>>>>>>>> Yeah, I can see the need for a dedicated lock for the 
>>>>>>>>>>>>>> GEM's gpuva
>>>>>>>>>>>>>> list, but
>>>>>>>>>>>>>> holding the vm's dma-resv lock across the unlink 
>>>>>>>>>>>>>> shouldn't be a
>>>>>>>>>>>>>> problem. We
>>>>>>>>>>>>>> may free the object and a pointer to the vm's resv during 
>>>>>>>>>>>>>> unlink
>>>>>>>>>>>>>> but we
>>>>>>>>>>>>>> don't free the vm's resv.  It'd be a matter of ensuring 
>>>>>>>>>>>>>> that any
>>>>>>>>>>>>>> calls to
>>>>>>>>>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>>>>>>>>> Drivers calling unlink() from the fence signaling path 
>>>>>>>>>>>>> can't use the
>>>>>>>>>>>>> VM's
>>>>>>>>>>>>> dma-resv lock.
>>>>>>>>>>>> Yes, that made me a bit curious because in the current 
>>>>>>>>>>>> version the code
>>>>>>>>>>>> required the object's dma_resv for unlink() which can't be 
>>>>>>>>>>>> grabbed
>>>>>>>>>>>> either from the fence signaling path. So are there any 
>>>>>>>>>>>> drivers actually
>>>>>>>>>>>> wanting to do that? If so, they will either need to resort 
>>>>>>>>>>>> to the
>>>>>>>>>>>> current spinlock solution or they will need to call unlink 
>>>>>>>>>>>> from a
>>>>>>>>>>>> workqueue item.
>>>>>>>>>>> As Boris already mentioned we have the dma-resv lock by 
>>>>>>>>>>> default or a driver
>>>>>>>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of 
>>>>>>>>>>> the latter.
>>>>>>>>>>>
>>>>>>>>>>>>> Also, what if the object is an external object? We can't 
>>>>>>>>>>>>> use the VM's
>>>>>>>>>>>>> dma-resv
>>>>>>>>>>>>> lock here.
>>>>>>>>>>>> Why? Typically (sync) unlink is only ever called from an 
>>>>>>>>>>>> unbind-like
>>>>>>>>>>>> operation where it should be trivial to grab the vm's resv. 
>>>>>>>>>>>> Or, for
>>>>>>>>>>>> that matter any outer lock protecting the extobj list. Rule 
>>>>>>>>>>>> would be
>>>>>>>>>>>> the drm_gpuvm_bo::entry::extobj  and 
>>>>>>>>>>>> drm_gpuvm_bo::entry::evict would
>>>>>>>>>>>> be protected by either the vm's dma_resv (or possibly an 
>>>>>>>>>>>> outer lock in
>>>>>>>>>>>> the case of the extobj list).
>>>>>>>>>>> Outer lock wouldn't have been working for updates in the 
>>>>>>>>>>> async path, but
>>>>>>>>>>> shouldn't be relevant anymore. We could use the VM's resv 
>>>>>>>>>>> for that.
>>>>>>>>>>>
>>>>>>>>>>>>>   And we can't have the GEM objs dma-resv lock held when 
>>>>>>>>>>>>> calling
>>>>>>>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which 
>>>>>>>>>>>>> if the
>>>>>>>>>>>>> refcount drops
>>>>>>>>>>>>> to zero calls drm_gpuvm_bo_destroy() and 
>>>>>>>>>>>>> drm_gpuvm_bo_destroy() might
>>>>>>>>>>>>> drop the
>>>>>>>>>>>>> last reference of the GEM object.
>>>>>>>>>>>> Yes, but this is a different problem as to what exactly 
>>>>>>>>>>>> protects
>>>>>>>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an internal 
>>>>>>>>>>>> per bo list
>>>>>>>>>>>> lock, or if we want to keep the bo's dma_resv we need to 
>>>>>>>>>>>> ensure that
>>>>>>>>>>>> the caller of dma_resv_unlock(obj->resv) actually refcounts 
>>>>>>>>>>>> its obj
>>>>>>>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's 
>>>>>>>>>>>> refcount (I know
>>>>>>>>>>>> Boris didn't like that, but requiring an explicit refcount 
>>>>>>>>>>>> for a
>>>>>>>>>>>> pointer you dereference unless you're under a lock that 
>>>>>>>>>>>> ensures keeping
>>>>>>>>>>>> the object alive is pretty much required?) But anyway for the
>>>>>>>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or 
>>>>>>>>>>>> internal spinlock)
>>>>>>>>>>>> I don't have a strong preference.
>>>>>>>>>>> We can keep the GEM objects dma-resv lock, however as 
>>>>>>>>>>> mentioned above
>>>>>>>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires both 
>>>>>>>>>>> the VM's resv lock
>>>>>>>>>>> and the GEM's resv lock in case they differ.
>>>>>>>>>>>
>>>>>>>>>>>>>   All those problems go away with a dedicated
>>>>>>>>>>>>> GEM gpuva list lock.
>>>>>>>>>>>> I don't think these are real problems.
>>>>>>>>>>>> With the excepton of the eviction list "trick" where we 
>>>>>>>>>>>> currently have
>>>>>>>>>>>> slightly different approach to collect external bos needing 
>>>>>>>>>>>> rebinding,
>>>>>>>>>>>> we have this working fine.
>>>>>>>>>>>>
>>>>>>>>>>>> TBH I think pretty much the only situation where the 
>>>>>>>>>>>> spinlock is needed
>>>>>>>>>>>> is for async updates of these lists, unless a wq item can 
>>>>>>>>>>>> be used for
>>>>>>>>>>>> that, but it doesn't really seem like the current code 
>>>>>>>>>>>> allows for such
>>>>>>>>>>>> updates anyway? It complicates the code a lot, adds 
>>>>>>>>>>>> overhead and also
>>>>>>>>>>>> adds the requirement for refcounting during list traversal.
>>>>>>>>>>>>
>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>
>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It seems that with that also the refcount could be make 
>>>>>>>>>>>>>>>> non-
>>>>>>>>>>>>>>>> atomic.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> All in the spirit of the drm locking guidelines "use 
>>>>>>>>>>>>>>>> big locks
>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>> possible".
>>>>>>>>>>>>>>>> Lower level locks only when necessary for performance or
>>>>>>>>>>>>>>>> locking inversion?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>>>>>>>>>>>> local list, so removal
>>>>>>>>>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>>>>>>>>>> iterating the list.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)  \
>>>>>>>>>>>>>>>>> +       ({                                                                           \
>>>>>>>>>>>>>>>>> +               struct drm_gpuvm_bo *__vm_bo;                                        \
>>>>>>>>>>>>>>>>> +                                                                                    \
>>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);                                      \
>>>>>>>>>>>>>>>>> +                                                                                    \
>>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                             \
>>>>>>>>>>>>>>>>> +               while (!list_empty(&(__gpuvm)->__list_name.list)) {                  \
>>>>>>>>>>>>>>>>> +                       __vm_bo = list_first_entry(&(__gpuvm)->__list_name.list,     \
>>>>>>>>>>>>>>>>> +                                                  struct drm_gpuvm_bo,              \
>>>>>>>>>>>>>>>>> +                                                  list.entry.__list_name);          \
>>>>>>>>>>>>>>>>> +                       if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {                 \
>>>>>>>>>>>>>>>>> +                               list_move_tail(&(__vm_bo)->list.entry.__list_name,   \
>>>>>>>>>>>>>>>>> +                                              __local_list);                        \
>>>>>>>>>>>>>>>>> +                               break;                                               \
>>>>>>>>>>>>>>>>> +                       } else {                                                     \
>>>>>>>>>>>>>>>>> +                               list_del_init(&(__vm_bo)->list.entry.__list_name);   \
>>>>>>>>>>>>>>>>> +                               __vm_bo = NULL;                                      \
>>>>>>>>>>>>>>>>> +                       }                                                            \
>>>>>>>>>>>>>>>>> +               }                                                                    \
>>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                           \
>>>>>>>>>>>>>>>>> +                                                                                    \
>>>>>>>>>>>>>>>>> +               __vm_bo;                                                             \
>>>>>>>>>>>>>>>>> +       })
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list 
>>>>>>>>>>>>>>>>> iterator
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list 
>>>>>>>>>>>>>>>>> iteration.
>>>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after 
>>>>>>>>>>>>>>>>> picking the
>>>>>>>>>>>>>>>>> first element from the
>>>>>>>>>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Typical use:
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + *     ret = 0;
>>>>>>>>>>>>>>>>> + *     drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>,
>>>>>>>>>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>>>>>>>>>> + *             ret = do_something_with_vm_bo(..., 
>>>>>>>>>>>>>>>>> vm_bo);
>>>>>>>>>>>>>>>>> + *             if (ret)
>>>>>>>>>>>>>>>>> + *                     break;
>>>>>>>>>>>>>>>>> + *     }
>>>>>>>>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>> + *     drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>,
>>>>>>>>>>>>>>>>> &my_local_list);
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Only used for internal list iterations, not meant 
>>>>>>>>>>>>>>>>> to be
>>>>>>>>>>>>>>>>> exposed to the outside
>>>>>>>>>>>>>>>>> + * world.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)  \
>>>>>>>>>>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,         \
>>>>>>>>>>>>>>>>> +                                               __local_list, NULL);           \
>>>>>>>>>>>>>>>>> +            __vm_bo;                                                          \
>>>>>>>>>>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,          \
>>>>>>>>>>>>>>>>> +                                               __local_list, __vm_bo))        \
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back to 
>>>>>>>>>>>>>>>>> their
>>>>>>>>>>>>>>>>> original list
>>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to 
>>>>>>>>>>>>>>>>> store
>>>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * When we're done iterating a vm_bo list, we should 
>>>>>>>>>>>>>>>>> call
>>>>>>>>>>>>>>>>> restore_vm_bo_list()
>>>>>>>>>>>>>>>>> + * to restore the original state and let new 
>>>>>>>>>>>>>>>>> iterations take
>>>>>>>>>>>>>>>>> place.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)               \
>>>>>>>>>>>>>>>>> +       do {                                                                   \
>>>>>>>>>>>>>>>>> +               /* Merge back the two lists, moving local list elements to the \
>>>>>>>>>>>>>>>>> +                * head to preserve previous ordering, in case it matters.     \
>>>>>>>>>>>>>>>>> +                */                                                            \
>>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                       \
>>>>>>>>>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)->__list_name.list);       \
>>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                     \
>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the given list
>>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Inserts the given @__vm_bo into the list specified by
>>>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                          \
>>>>>>>>>>>>>>>>> +       do {                                                                   \
>>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                   \
>>>>>>>>>>>>>>>>> +               if (list_empty(&(__vm_bo)->list.entry.__list_name))            \
>>>>>>>>>>>>>>>>> +                       list_add_tail(&(__vm_bo)->list.entry.__list_name,      \
>>>>>>>>>>>>>>>>> +                                     &(__vm_bo)->vm->__list_name.list);       \
>>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                 \
>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the given list
>>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to remove from
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Removes the given @__vm_bo from the list specified by
>>>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)                          \
>>>>>>>>>>>>>>>>> +       do {                                                                   \
>>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                   \
>>>>>>>>>>>>>>>>> +               if (!list_empty(&(__vm_bo)->list.entry.__list_name))           \
>>>>>>>>>>>>>>>>> +                       list_del_init(&(__vm_bo)->list.entry.__list_name);     \
>>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                 \
>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>     #define to_drm_gpuva(__node) container_of((__node), struct drm_gpuva, rb.node)
>>>>>>>>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>>>>>>>>>               "GPUVA tree is not empty, potentially leaking memory.\n");
>>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), "Extobj list should be empty.\n");
>>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict list should be empty.\n");
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all 
>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Using this function directly, it is the driver's responsibility to call
>>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Note: This function is safe against concurrent 
>>>>>>>>>>>>>>>>> insertion
>>>>>>>>>>>>>>>>> and removal of
>>>>>>>>>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>>>>>>>>>> concurrent usage itself.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>>>>>>>>>>>> either an outer VM lock
>>>>>>>>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this 
>>>>>>>>>>>>>>>>> function
>>>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the 
>>>>>>>>>>>>>>>>> GPUVM's
>>>>>>>>>>>>>>>>> dma-resv lock ensures
>>>>>>>>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>>>>>>>>>>> +                         unsigned int num_fences)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, 
>>>>>>>>>>>>>>>>> vm_bo->obj,
>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs mapped 
>>>>>>>>>>>>>>>>> within
>>>>>>>>>>>>>>>>> a given range
>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all &drm_gem_objects
>>>>>>>>>>>>>>>>> mapped between @addr
>>>>>>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>>>>>>> drm_exec *exec,
>>>>>>>>>>>>>>>>> +                       u64 addr, u64 range, unsigned int
>>>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, 
>>>>>>>>>>>>>>>>> end) {
>>>>>>>>>>>>>>>>> +               struct drm_gem_object *obj = va->gem.obj;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>> +                       return ret;
>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all associated BOs
>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Additionally, when calling this function with struct drm_gpuvm_exec::extra
>>>>>>>>>>>>>>>>> + * being set, the driver receives the given @fn callback to lock additional
>>>>>>>>>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec instance. Typically, drivers
>>>>>>>>>>>>>>>>> + * would call drm_exec_prepare_obj() from within this callback.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_objects(gpuvm, 
>>>>>>>>>>>>>>>>> exec,
>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>> +                       drm_exec_retry_on_contention(exec); 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +                       if (ret)
>>>>>>>>>>>>>>>>> +                               goto err;
>>>>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +static int
>>>>>>>>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int num_fences)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args->objs,
>>>>>>>>>>>>>>>>> +                                     args->num_objs, num_fences);
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all associated BOs
>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>>>>>>>>> + * @num_objs: the number of additional 
>>>>>>>>>>>>>>>>> &drm_gem_objects to
>>>>>>>>>>>>>>>>> lock
>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>> given &drm_gpuvm
>>>>>>>>>>>>>>>>> + * contains mappings of, plus the ones given through 
>>>>>>>>>>>>>>>>> @objs.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>>>>>>>>>>> +                         unsigned int num_objs,
>>>>>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>>> +       } args;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>>>>>>>>>>> interruptible);
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>>>>>>>>>>> within a given range
>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>>>>>>>>>>> mapped between @addr and
>>>>>>>>>>>>>>>>> + * @addr + @range.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, 
>>>>>>>>>>>>>>>>> exec,
>>>>>>>>>>>>>>>>> addr, range,
>>>>>>>>>>>>>>>>> + num_fences);
>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as 
>>>>>>>>>>>>>>>>> evicted
>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all
>>>>>>>>>>>>>>>>> evicted buffer
>>>>>>>>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, 
>>>>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_resv_add_fence() - add fence to private and all extobj dma-resv
>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>>>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>>>>>>>>> +       unsigned long index;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>>>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm, obj) ?
>>>>>>>>>>>>>>>>> +                                  extobj_usage : private_usage);
>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance 
>>>>>>>>>>>>>>>>> of struct
>>>>>>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm
>>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>>>>>>>>          return vm_bo;
>>>>>>>>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref 
>>>>>>>>>>>>>>>>> *kref)
>>>>>>>>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref 
>>>>>>>>>>>>>>>>> *kref)
>>>>>>>>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the 
>>>>>>>>>>>>>>>>> reference of
>>>>>>>>>>>>>>>>>      *
>>>>>>>>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is destroyed, which
>>>>>>>>>>>>>>>>> + * includes removing it from the GEM's gpuva list. Hence, if a call to this
>>>>>>>>>>>>>>>>> + * function can potentially let the reference count drop to zero, the caller
>>>>>>>>>>>>>>>>> + * must hold the dma-resv or driver-specific GEM gpuva lock.
>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>     void
>>>>>>>>>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct 
>>>>>>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>>>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its &drm_gpuvm's
>>>>>>>>>>>>>>>>> + * extobj list
>>>>>>>>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's extobj list.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if it is not on the
>>>>>>>>>>>>>>>>> + * list already and the corresponding &drm_gem_object actually is an external
>>>>>>>>>>>>>>>>> + * object.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to / from a
>>>>>>>>>>>>>>>>> + * &drm_gpuvm's evicted list
>>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from the evicted list of all
>>>>>>>>>>>>>>>>> + * &drm_gpuvms that contain a mapping of this &drm_gem_object.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool 
>>>>>>>>>>>>>>>>> evict)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>>>>>>>>> +               if (evict)
>>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, 
>>>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>>>> +               else
>>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, 
>>>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>     static int
>>>>>>>>>>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>>>>>>>>           * space
>>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos serving as
>>>>>>>>>>>>>>>>> +                * external objects
>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the 
>>>>>>>>>>>>>>>>> extobj list
>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>>> +       } extobj;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>> +        * @evict: structure holding the evict list 
>>>>>>>>>>>>>>>>> and evict
>>>>>>>>>>>>>>>>> list lock
>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>> +                * @list: &list_head storing 
>>>>>>>>>>>>>>>>> &drm_gpuvm_bos
>>>>>>>>>>>>>>>>> currently being
>>>>>>>>>>>>>>>>> +                * evicted
>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the 
>>>>>>>>>>>>>>>>> evict list
>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>>> +       } evict;
>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>>>>>>> drm_device *drm,
>>>>>>>>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm
>>>>>>>>>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>>>>>>>>>>>> &drm_gem_object is an
>>>>>>>>>>>>>>>>> + * external object
>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Returns: true if the &drm_gem_object's &dma_resv differs from the
>>>>>>>>>>>>>>>>> + * &drm_gpuvm's &dma_resv, false otherwise
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
>>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>> + struct drm_gem_object
>>>>>>>>>>>>>>>>> *obj)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct 
>>>>>>>>>>>>>>>>> drm_gpuva *va)
>>>>>>>>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, 
>>>>>>>>>>>>>>>>> gpuvm__)
>>>>>>>>>>>>>>>>> \
>>>>>>>>>>>>>>>>>          list_for_each_entry_safe(va__, next__, 
>>>>>>>>>>>>>>>>> &(gpuvm__)-
>>>>>>>>>>>>>>>>>> rb.list, rb.entry)
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>>>>>>>>>> &drm_exec should be.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Optionally, @extra can be set in order to lock 
>>>>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>>>>> &drm_gem_objects.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA 
>>>>>>>>>>>>>>>>> reservations
>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>> +        * @extra: Callback and corresponding private 
>>>>>>>>>>>>>>>>> data
>>>>>>>>>>>>>>>>> for the driver to
>>>>>>>>>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>> +                         unsigned int num_fences);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>> +                * @priv: driver private data for the @fn
>>>>>>>>>>>>>>>>> callback
>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>> +               void *priv;
>>>>>>>>>>>>>>>>> +       } extra;
>>>>>>>>>>>>>>>>> +};
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVM's common dma-resv
>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVM's dummy &drm_gem_object.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Using this function directly, it is the driver's responsibility to call
>>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +static inline int
>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>> + unsigned int num_fences);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>>>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>>>>>>>>>> + unsigned int num_fences);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>> + struct drm_gem_object **objs,
>>>>>>>>>>>>>>>>> + unsigned int num_objs,
>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
>>>>>>>>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>> + struct dma_fence *fence,
>>>>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence()
>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>> + struct dma_fence *fence,
>>>>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>>>>> +                             enum dma_resv_usage
>>>>>>>>>>>>>>>>> extobj_usage)
>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, 
>>>>>>>>>>>>>>>>> &vm_exec->exec,
>>>>>>>>>>>>>>>>> fence,
>>>>>>>>>>>>>>>>> + private_usage,
>>>>>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>>>>>>>>                           * gpuva list.
>>>>>>>>>>>>>>>>>                           */
>>>>>>>>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>>> +                        * @extobj: List entry to attach to the &drm_gpuvm's
>>>>>>>>>>>>>>>>> +                        * extobj list.
>>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>>> +                        * @evict: List entry to attach to the &drm_gpuvm's
>>>>>>>>>>>>>>>>> +                        * evict list.
>>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>>>>>>>>                  } entry;
>>>>>>>>>>>>>>>>>          } list;
>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool
>>>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo 
>>>>>>>>>>>>>>>>> *vm_bo);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk 
>>>>>>>>>>>>>>>>> over a
>>>>>>>>>>>>>>>>> list of &drm_gpuva
>>>>>>>>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>>>>>>>>>>> iteration step
>>>>>>>>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>>>>>>>>           * used.
>>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, 
>>>>>>>>>>>>>>>>> void
>>>>>>>>>>>>>>>>> *priv);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>>> +        * Drivers receive this callback for every 
>>>>>>>>>>>>>>>>> evicted
>>>>>>>>>>>>>>>>> &drm_gem_object being
>>>>>>>>>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>>>>>>>>>>> specific variant of
>>>>>>>>>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void 
>>>>>>>>>>>>>>>>> *priv,
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>


^ permalink raw reply	[flat|nested] 77+ messages in thread
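
A minimal usage sketch of the locking helpers introduced by the patch quoted
above may help readers follow the discussion. The my_exec_locked() wrapper and
the point at which the job fence is produced are hypothetical; only the
drm_gpuvm_*()/drm_exec calls, the drm_gpuvm_exec fields and the dma-resv usage
flags are taken from the series as posted:

  static int my_exec_locked(struct drm_gpuvm *gpuvm, struct dma_fence *fence)
  {
          struct drm_gpuvm_exec vm_exec = {
                  .vm = gpuvm,
                  /* No driver-specific extra objects to lock in this sketch. */
          };
          int ret;

          /* Lock the VM's dma-resv plus the dma-resv of all external objects. */
          ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
          if (ret)
                  return ret;

          /* Re-validate all BOs currently on the VM's evicted list. */
          ret = drm_gpuvm_validate(gpuvm);
          if (ret)
                  goto out_unlock;

          /* The driver's job submission (not shown) produced @fence. */

          /* Attach the job fence to the private and all external dma-resvs. */
          drm_gpuvm_exec_resv_add_fence(&vm_exec, fence,
                                        DMA_RESV_USAGE_BOOKKEEP,
                                        DMA_RESV_USAGE_BOOKKEEP);

  out_unlock:
          drm_gpuvm_exec_unlock(&vm_exec);
          return ret;
  }
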

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-20 10:51                                   ` Christian König
@ 2023-09-20 12:06                                     ` Thomas Hellström
  2023-09-20 13:06                                       ` Christian König
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-20 12:06 UTC (permalink / raw)
  To: Christian König, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel


On 9/20/23 12:51, Christian König wrote:
> Am 20.09.23 um 09:44 schrieb Thomas Hellström:
>> Hi,
>>
>> On 9/20/23 07:37, Christian König wrote:
>>> Am 19.09.23 um 17:23 schrieb Thomas Hellström:
>>>>
>>>> On 9/19/23 17:16, Danilo Krummrich wrote:
>>>>> On 9/19/23 14:21, Thomas Hellström wrote:
>>>>>> Hi Christian
>>>>>>
>>>>>> On 9/19/23 14:07, Christian König wrote:
>>>>>>> Am 13.09.23 um 17:46 schrieb Danilo Krummrich:
>>>>>>>> On 9/13/23 17:33, Christian König wrote:
>>>>>>>>> Am 13.09.23 um 17:15 schrieb Danilo Krummrich:
>>>>>>>>>> On 9/13/23 16:26, Christian König wrote:
>>>>>>>>>>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>>>>>>>>>>> As mentioned in a different mail thread, the reply is based 
>>>>>>>>>>>> on the assumption
>>>>>>>>>>>> that we don't support anything other than GPUVM updates from 
>>>>>>>>>>>> the IOCTL.
>>>>>>>>>>>
>>>>>>>>>>> I think that this assumption is incorrect.
>>>>>>>>>>
>>>>>>>>>> Well, more precisely I should have said "don't support GPUVM 
>>>>>>>>>> updates from within
>>>>>>>>>> fence signaling critical sections". And looking at the code, 
>>>>>>>>>> that doesn't seem what
>>>>>>>>>> you're doing there.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Vulkan is just one specific use case, but this here should 
>>>>>>>>>>> probably be able to handle other use cases as well.
>>>>>>>>>>>
>>>>>>>>>>> Especially with HMM you get the requirement that you need to 
>>>>>>>>>>> be able to invalidate GPUVM mappings without grabbing a 
>>>>>>>>>>> reservation lock.
>>>>>>>>>>
>>>>>>>>>> What do you mean with "invalidate GPUVM mappings" in this 
>>>>>>>>>> context? drm_gpuvm_bo_evict()
>>>>>>>>>> should only be called from a ttm_device_funcs::move 
>>>>>>>>>> callback, we should hold the dma-resv
>>>>>>>>>> lock there.
>>>>>>>>>
>>>>>>>>> Well the question is which dma-resv lock do we hold?
>>>>>>>>>
>>>>>>>>> In the move callback we only hold the dma-resv lock of the BO 
>>>>>>>>> which is moved, but when that is a shared BO then that's not 
>>>>>>>>> the same as the one for the VM.
>>>>>>>>
>>>>>>>> Correct, Thomas' idea was to use the GEM's dma_resv lock to 
>>>>>>>> protect drm_gpuvm_bo::evicted
>>>>>>>> and then actually move the drm_gpuvm_bo to the VM's evicted 
>>>>>>>> list once we grabbed all
>>>>>>>> dma-resv locks when locking the VM's BOs using drm_exec. We can 
>>>>>>>> remove them from the evicted
>>>>>>>> list on validate(). This way we never touch the evicted list 
>>>>>>>> without holding at least the VM's
>>>>>>>> dma-resv lock.
>>>>>>>>
>>>>>>>> Do you have any concerns about that?
>>>>>>>
>>>>>>> Scratching my head a bit how that is supposed to work.
>>>>>>>
>>>>>>> This implies that you go over all the evicted BOs during 
>>>>>>> validation and not just the one mentioned in the CS.
>>>>>>>
>>>>>>> That might work for Vulkan, but is pretty much a no-go for OpenGL.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> See what the eviction lock in amdgpu is doing for example.
>>>>>>>>>>
>>>>>>>>>> The eviction_lock seems to protect a VM state "evicting" of 
>>>>>>>>>> whether any BO that
>>>>>>>>>> is associated with the VM is currently evicting. At the same 
>>>>>>>>>> time amdgpu protects
>>>>>>>>>> the evicted list of the VM with a different lock. So this 
>>>>>>>>>> seems to be entirely
>>>>>>>>>> unrelated. Tracking a "currently evicting" state is not part 
>>>>>>>>>> of the GPUVM
>>>>>>>>>> implementation currently and hence nothing would change for 
>>>>>>>>>> amdgpu there.
>>>>>>>>>
>>>>>>>>> Sorry for the confusion we use different terminology in amdgpu.
>>>>>>>>>
>>>>>>>>> The eviction lock and evicted state is for the VM page tables, 
>>>>>>>>> e.g. if the whole VM is currently not used and swapped out or 
>>>>>>>>> even de-allocated.
>>>>>>>>>
>>>>>>>>> This is necessary because we have cases where we need to 
>>>>>>>>> access the VM data without holding the dma-resv lock of this 
>>>>>>>>> VM. Especially figuring out which parts of an address space 
>>>>>>>>> contain mappings and which doesn't.
>>>>>>>>
>>>>>>>> I think this is fine, this has nothing to do with lists of 
>>>>>>>> evicted GEM objects or external GEM
>>>>>>>> objects, right? Marking mappings (drm_gpuva) as invalidated 
>>>>>>>> (DRM_GPUVA_INVALIDATED) or accessing
>>>>>>>> the VA space does not require any dma-resv locks.
>>>>>>>
>>>>>>> I hope so, but I'm not 100% sure.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> This is a requirement which comes with HMM handling, you won't 
>>>>>>>>> see this with Vulkan (or OpenGL, VAAPI etc..).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The invalidation lock on the other hand is what in this 
>>>>>>>>> discussion is called eviction lock. This one is needed because 
>>>>>>>>> what I wrote above, during the move callback only the dma-resv 
>>>>>>>>> of the BO which is moved is locked, but not necessarily the 
>>>>>>>>> dma-resv of the VM.
>>>>>>>>
>>>>>>>> That's yet another thing, right? This is used to track whether 
>>>>>>>> *any* BO that belongs to the VM is
>>>>>>>> currently being evicted, correct? As mentioned, as by now this 
>>>>>>>> is not supported in GPUVM and hence
>>>>>>>> would be the same driver specific code with the same driver 
>>>>>>>> specific lock.
>>>>>>>
>>>>>>> That is most likely a show stopper using this for OpenGL based 
>>>>>>> workloads as far as I can see. For those you need to able to 
>>>>>>> figure out which non-VM BOs have been evicted and which parts of 
>>>>>>> the VM needs updates.
>>>>>>
>>>>>> We identify those with a bool in the gpuvm_bo, and that bool is 
>>>>>> protected by the bo_resv. In essence, the "evicted" list must be 
>>>>>> made up-to-date with all relevant locks held before traversing in 
>>>>>> the next exec.
>>>>>
>>>>> What I still miss with this idea is how do we find all the 
>>>>> drm_gpuvm_bo structures with the evicted bool set to true? When 
>>>>> doing the drm_exec dance we come across all external ones and can 
>>>>> add them to the list if needed, but what about the BOs having the 
>>>>> VM's dma-resv?
>>>>
>>>> Oh, they can be added to the evict list directly (no bool needed) 
>>>> in the eviction code, like in v3. Since for those we indeed hold 
>>>> the VM's dma_resv since it's aliased with the object's dma-resv.
>>>
>>> Yeah, I wanted to note what Danilo seems to think about as well. How 
>>> do we figure out the non-VM BOs evicted?
>>>
>>> We can't walk over the list of all non-VM BOs on every submission, 
>>> that's too much overhead for cases with lots of non-VM BOs.
>>>
>>> And we can't rely on userspace sending all non-VM BOs as used list 
>>> down to the kernel with each submission.
>>>
>>> Regards,
>>> Christian.
>>
>> No, that's not needed: Mechanism below.
>>
>> 1) We maintain an evicted list. Typically protected by the vm resv.
>> 2) Each gpuvm_bo has a bool "evicted". Protected by the bo resv.
>>
>> a) Evicting a vm bo: The vm resv is held by the eviction code. Just 
>> put it on the evicted list.
>> b) Evicting a shared/external bo: The bo resv is held by the eviction 
>> code. Set the "evicted" bool
>> c) Validating the evicted list on exec:
>
>
>> Loop through all *external/shared* bos.
>
> And this is what you can't do. For Vulkan it probably doesn't matter, 
> but for OpenGL and especially multimedia we have much more BOs on the 
> shared list than what's allocated for the VM.

But you need to lock and fence all of those, so you need to loop through 
them anyway; we're still O(n_shared)? Or is there some clever 
optimization in amdgpu?

I think with some UMDs, xe might end up with similar large lists...

/Thomas
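
A rough sketch of the scheme described above, assuming drm_gpuvm_bo grows an
"evicted" bool protected by the object's resv, that the extobj and evict lists
are protected by the VM's resv, and that this helper runs inside the drm_exec
loop after the VM's own resv has already been locked. None of this is in the
posted v3; the helper name and the "evicted" member are illustrative only:

  static int my_lock_extobjs(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
                             unsigned int num_fences)
  {
          struct drm_gpuvm_bo *vm_bo;
          int ret;

          list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj) {
                  /* O(n_shared): every external object gets locked anyway. */
                  ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
                  if (ret)
                          return ret;

                  /* The object's resv is held now; test and clear the flag. */
                  if (vm_bo->evicted) {
                          /* The VM's resv is held too, so the list is stable. */
                          list_move_tail(&vm_bo->list.entry.evict,
                                         &gpuvm->evict.list);
                          vm_bo->evicted = false;
                  }
          }

          return 0;
  }
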


>
> Regards,
> Christian.
>
>> Lock them. After locking, check the "evicted" bool, if it's true. put 
>> the bo on the evicted list (we hold the VM resv at this point) and 
>> clear the "evicted" bool. Note that other vms will have their own 
>> gpuvm_bo which is marked evicted.
>>
>> I have this coded up in a patch for Xe and it seems to be working 
>> properly.
>>
>> /Thomas
>>
>>
>>>
>>>>
>>>> /Thomas
>>>>
>>>>
>>>>
>>>>>
>>>>>>
>>>>>> If you mean that we need to unbind all vmas of all vms of evicted 
>>>>>> bos before evicting, we don't do that, at least not in Xe, since when 
>>>>>> evicting we wait for VM idle, and it can't access anything through 
>>>>>> the stale vmas until they have been revalidated and rebound.
>>>>>>
>>>>>> /Thomas
>>>>>>
>>>>>>
>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Christian.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas 
>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas 
>>>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>>>> Hi, Danilo,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>>>> So far the DRM GPUVA manager offers common 
>>>>>>>>>>>>>>>>>> infrastructure to
>>>>>>>>>>>>>>>>>> track GPU VA
>>>>>>>>>>>>>>>>>> allocations and mappings, generically connect GPU VA 
>>>>>>>>>>>>>>>>>> mappings
>>>>>>>>>>>>>>>>>> to their
>>>>>>>>>>>>>>>>>> backing buffers and perform more complex mapping 
>>>>>>>>>>>>>>>>>> operations
>>>>>>>>>>>>>>>>>> on the GPU VA
>>>>>>>>>>>>>>>>>> space.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> However, there are more design patterns commonly used by
>>>>>>>>>>>>>>>>>> drivers, which
>>>>>>>>>>>>>>>>>> can potentially be generalized in order to make the 
>>>>>>>>>>>>>>>>>> DRM GPUVA
>>>>>>>>>>>>>>>>>> manager
>>>>>>>>>>>>>>>>>> represent a basic GPU-VM implementation. In this 
>>>>>>>>>>>>>>>>>> context,
>>>>>>>>>>>>>>>>>> this patch aims
>>>>>>>>>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not 
>>>>>>>>>>>>>>>>>> being used
>>>>>>>>>>>>>>>>>> outside of
>>>>>>>>>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM objects
>>>>>>>>>>>>>>>>>> which are
>>>>>>>>>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 3) Provide functions to efficiently lock all GEM 
>>>>>>>>>>>>>>>>>> objects dma-
>>>>>>>>>>>>>>>>>> resv the
>>>>>>>>>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>>>>>>>>>>> contains mappings
>>>>>>>>>>>>>>>>>>       of, such that validation of evicted GEM objects is
>>>>>>>>>>>>>>>>>> accelerated.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 5) Provide some convenience functions for common 
>>>>>>>>>>>>>>>>>> patterns.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Rather than being designed as a "framework", the 
>>>>>>>>>>>>>>>>>> target is to
>>>>>>>>>>>>>>>>>> make all
>>>>>>>>>>>>>>>>>> features appear as a collection of optional helper 
>>>>>>>>>>>>>>>>>> functions,
>>>>>>>>>>>>>>>>>> such that
>>>>>>>>>>>>>>>>>> drivers are free to make use of the DRM GPUVA 
>>>>>>>>>>>>>>>>>> managers basic
>>>>>>>>>>>>>>>>>> functionality and opt-in for other features without 
>>>>>>>>>>>>>>>>>> setting
>>>>>>>>>>>>>>>>>> any feature
>>>>>>>>>>>>>>>>>> flags, just by making use of the corresponding 
>>>>>>>>>>>>>>>>>> functions.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>>>>>>>>>>>> locking for drivers
>>>>>>>>>>>>>>>>>> updating the GPU VA space within the fence signalling 
>>>>>>>>>>>>>>>>>> path.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>>>>>>>>     include/drm/drm_gpuvm.h | 197 ++++++++++++++
>>>>>>>>>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an 
>>>>>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>>>>>> instance of this
>>>>>>>>>>>>>>>>>>      * particular combination. If not existent a new 
>>>>>>>>>>>>>>>>>> instance
>>>>>>>>>>>>>>>>>> is created and linked
>>>>>>>>>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given
>>>>>>>>>>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>>>>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and
>>>>>>>>>>>>>>>>>> evicted objects. Those
>>>>>>>>>>>>>>>>>> + * lists are maintained in order to accelerate locking of dma-resv locks and
>>>>>>>>>>>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For instance, all
>>>>>>>>>>>>>>>>>> + * &drm_gem_objects' &dma_resv of a given &drm_gpuvm can be locked by calling
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call
>>>>>>>>>>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>>>>>>>>>>> + * order to validate all evicted &drm_gem_objects. 
>>>>>>>>>>>>>>>>>> It is
>>>>>>>>>>>>>>>>>> also possible to lock
>>>>>>>>>>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>>>>>>>>>>> corresponding parameters to
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the 
>>>>>>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>>>>>>> loop while making
>>>>>>>>>>>>>>>>>> + * use of helper functions such as 
>>>>>>>>>>>>>>>>>> drm_gpuvm_prepare_range()
>>>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Every bound &drm_gem_object is treated as 
>>>>>>>>>>>>>>>>>> external object
>>>>>>>>>>>>>>>>>> when its &dma_resv
>>>>>>>>>>>>>>>>>> + * structure is different than the &drm_gpuvm's common
>>>>>>>>>>>>>>>>>> &dma_resv structure.
>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for 
>>>>>>>>>>>>>>>>>> the same
>>>>>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>>>>>>>>>>>> creations and destructions
>>>>>>>>>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances 
>>>>>>>>>>>>>>>>>> unique.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of 
>>>>>>>>>>>>>>>>>> external and
>>>>>>>>>>>>>>>>>> evicted objects are
>>>>>>>>>>>>>>>>>> + * protected against concurrent insertion / removal and
>>>>>>>>>>>>>>>>>> iteration internally.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * However, drivers still need to protect concurrent calls to functions
>>>>>>>>>>>>>>>>>> + * iterating those lists, such as 
>>>>>>>>>>>>>>>>>> drm_gpuvm_validate() and
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function 
>>>>>>>>>>>>>>>>>> contains
>>>>>>>>>>>>>>>>>> a particular
>>>>>>>>>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Functions adding or removing entries from those 
>>>>>>>>>>>>>>>>>> lists,
>>>>>>>>>>>>>>>>>> such as
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() 
>>>>>>>>>>>>>>>>>> may be
>>>>>>>>>>>>>>>>>> called with external
>>>>>>>>>>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>>>>>>>>>>> corresponding list to be
>>>>>>>>>>>>>>>>>> + * (safely) modified while potentially being iterated by other API functions.
>>>>>>>>>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>>>>>>>>>      *   }
>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo 
>>>>>>>>>>>>>>>>>> element
>>>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're 
>>>>>>>>>>>>>>>>>> iterating on
>>>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used 
>>>>>>>>>>>>>>>>>> to store
>>>>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>>>>>>>>>>>> drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list 
>>>>>>>>>>>>>>>>>> iteration.
>>>>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after 
>>>>>>>>>>>>>>>>>> picking the
>>>>>>>>>>>>>>>>>> first element from
>>>>>>>>>>>>>>>>>> + * the list, so list insertion / deletion can happen concurrently.
>>>>>>>>>>>>>>>>> Are the list spinlocks needed for that async state 
>>>>>>>>>>>>>>>>> update from
>>>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>>>>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Otherwise it should be sufficient to protect the lists 
>>>>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>>>>> gpuvm's resv
>>>>>>>>>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If those spinlocks are still needed in some 
>>>>>>>>>>>>>>>>> situations, perhaps
>>>>>>>>>>>>>>>>> could we
>>>>>>>>>>>>>>>>> have an option to set them to NULL (Like IIRC the 
>>>>>>>>>>>>>>>>> maple tree
>>>>>>>>>>>>>>>>> allows for)?
>>>>>>>>>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>>>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>>>>>>>>>> holding only the dma-resv lock from the BO this 
>>>>>>>>>>>>>>>> function gets
>>>>>>>>>>>>>>>> called for. Hence,
>>>>>>>>>>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() 
>>>>>>>>>>>>>>>> calls with
>>>>>>>>>>>>>>>> different BOs.
>>>>>>>>>>>>>>> No. Only if you try to add external objects to the vm's 
>>>>>>>>>>>>>>> evict list
>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>> within the evict code. That's not necessary since you 
>>>>>>>>>>>>>>> loop through
>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>> external objects anyway when locking them so an 
>>>>>>>>>>>>>>> "evicted" bool in
>>>>>>>>>>>>>>> the vm_bo,
>>>>>>>>>>>>>>> protected by the bo resv would be sufficient. The extobj 
>>>>>>>>>>>>>>> locking
>>>>>>>>>>>>>>> loop can
>>>>>>>>>>>>>>> then add the bo to the evicted list.
>>>>>>>>>>>>>> And validate() can remove it while still holding all 
>>>>>>>>>>>>>> dma-resv locks,
>>>>>>>>>>>>>> neat!
>>>>>>>>>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>>>>>>>>>> concurrently? What
>>>>>>>>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>>>>>>>>>> drm_gpuva_unlink()?
>>>>>>>>>>>>>> Are we guaranteed that at this point of time the 
>>>>>>>>>>>>>> drm_gpuvm_bo is not
>>>>>>>>>>>>>> on the
>>>>>>>>>>>>>> evicted list? Because otherwise we would call 
>>>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>>>> might drop the last reference to the drm_gem_object and 
>>>>>>>>>>>>>> hence we'd
>>>>>>>>>>>>>> potentially
>>>>>>>>>>>>>> free the dma-resv lock while holding it, at least if it's 
>>>>>>>>>>>>>> an external
>>>>>>>>>>>>>> object.
>>>>>>>>>>>>> Easiest way in this scheme is to think of the lists as 
>>>>>>>>>>>>> being protected
>>>>>>>>>>>>> by the vm's resv lock. That means anybody calling unlink() 
>>>>>>>>>>>>> must also
>>>>>>>>>>>>> hold the vm's resv lock. (Which is OK from an UAF point of 
>>>>>>>>>>>>> view, but
>>>>>>>>>>>>> perhaps not from a locking inversion POV from an async 
>>>>>>>>>>>>> list update).
>>>>>>>>>>>> This would mean that on unlink() we'd need to hold the VM's 
>>>>>>>>>>>> resv lock and the
>>>>>>>>>>>> corresponding GEM's resv lock (in case they're not the same 
>>>>>>>>>>>> anyways) because the
>>>>>>>>>>>> VM's resv lock would protect the external / evicted object 
>>>>>>>>>>>> lists and the GEM
>>>>>>>>>>>> objects resv lock protects the GEM's list of drm_gpuvm_bos 
>>>>>>>>>>>> and the
>>>>>>>>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For extobjs an outer lock would be enough in case of 
>>>>>>>>>>>>>>>> Xe, but I
>>>>>>>>>>>>>>>> really would not
>>>>>>>>>>>>>>>> like to add even more complexity just to get the 
>>>>>>>>>>>>>>>> spinlock out of
>>>>>>>>>>>>>>>> the way in case
>>>>>>>>>>>>>>>> the driver already has an outer lock protecting this path.
>>>>>>>>>>>>>>> I must disagree here. These spinlocks and atomic 
>>>>>>>>>>>>>>> operations are
>>>>>>>>>>>>>>> pretty
>>>>>>>>>>>>>>> costly and as discussed earlier this type of locking was 
>>>>>>>>>>>>>>> the reason
>>>>>>>>>>>>>>> (at
>>>>>>>>>>>>>>> least according to the commit message) that made 
>>>>>>>>>>>>>>> Christian drop the
>>>>>>>>>>>>>>> XArray
>>>>>>>>>>>>>>> use in drm_exec for the same set of objects: "The 
>>>>>>>>>>>>>>> locking overhead
>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>>>>>>>>>>>> complexity and a
>>>>>>>>>>>>>>> single wide lock following the drm locking guidelines 
>>>>>>>>>>>>>>> set out by
>>>>>>>>>>>>>>> Daniel and
>>>>>>>>>>>>>>> David should really be the default choice with an opt-in 
>>>>>>>>>>>>>>> for a
>>>>>>>>>>>>>>> spinlock if
>>>>>>>>>>>>>>> needed for async and pushing out to a wq is not an option.
>>>>>>>>>>>>>> For the external object list an outer lock would work as 
>>>>>>>>>>>>>> long as it's
>>>>>>>>>>>>>> not the
>>>>>>>>>>>>>> dma-resv lock of the corresponding GEM object, since here 
>>>>>>>>>>>>>> we actually
>>>>>>>>>>>>>> need to
>>>>>>>>>>>>>> remove the list entry from the external object list on
>>>>>>>>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>>>>>>>>> It's just a bit weird design wise that drivers would need 
>>>>>>>>>>>>>> to take
>>>>>>>>>>>>>> this outer
>>>>>>>>>>>>>> lock on:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>>>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also 
>>>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>>>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>>>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Given that it seems reasonable to do all the required 
>>>>>>>>>>>>>> locking
>>>>>>>>>>>>>> internally.
>>>>>>>>>>>>>  From a design POV, there has been a clear direction in XE 
>>>>>>>>>>>>> to make
>>>>>>>>>>>>> things similar to mmap() / munmap(), so this outer lock, 
>>>>>>>>>>>>> which in Xe is
>>>>>>>>>>>>> an rwsem, is used in a similar way as the mmap_lock. It's 
>>>>>>>>>>>>> protecting
>>>>>>>>>>>>> the page-table structures and vma rb tree, the userptr 
>>>>>>>>>>>>> structures and
>>>>>>>>>>>>> the extobj list. Basically it's taken early in the exec 
>>>>>>>>>>>>> IOCTL, the
>>>>>>>>>>>>> VM_BIND ioctl, the compute rebind worker and the pagefault 
>>>>>>>>>>>>> handler, so
>>>>>>>>>>>>> all of the above are just asserting that it is taken in 
>>>>>>>>>>>>> the correct
>>>>>>>>>>>>> mode.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But strictly with this scheme one could also use the vm's 
>>>>>>>>>>>>> dma_resv for
>>>>>>>>>>>>> the extobj list since with drm_exec, it's locked before 
>>>>>>>>>>>>> traversing the
>>>>>>>>>>>>> list.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The whole point of this scheme is to rely on locks that 
>>>>>>>>>>>>> you already are
>>>>>>>>>>>>> supposed to be holding for various reasons and is simple 
>>>>>>>>>>>>> to comprehend.
>>>>>>>>>>>> I don't agree that we're supposed to hold the VM's resv 
>>>>>>>>>>>> lock anyways for
>>>>>>>>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), 
>>>>>>>>>>>> but I'm fine using it
>>>>>>>>>>>> for that purpose nevertheless.
>>>>>>>>>>>>
>>>>>>>>>>>>>> In order to at least place lockdep checks, the driver 
>>>>>>>>>>>>>> would need to
>>>>>>>>>>>>>> supply the
>>>>>>>>>>>>>> corresponding lock's lockdep_map, because the GPUVM 
>>>>>>>>>>>>>> otherwise doesn't
>>>>>>>>>>>>>> know about
>>>>>>>>>>>>>> the lock.
>>>>>>>>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>>>>>>>> I'd really like to avoid that, especially now that 
>>>>>>>>>>>> everything got simpler. We
>>>>>>>>>>>> should define the actual locks to take instead.
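>>>>>>>>>>>> E.g. if we settle on the VM's resv protecting the extobj list, the
>>>>>>>>>>>> list helpers could simply assert that, instead of taking a driver
>>>>>>>>>>>> supplied lockdep_map (sketch, helper name made up):
>>>>>>>>>>>>
>>>>>>>>>>>> static void drm_gpuvm_extobj_assert_held(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>> {
>>>>>>>>>>>>         dma_resv_assert_held(gpuvm->resv);
>>>>>>>>>>>> }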
>>>>>>>>>>>>
>>>>>>>>>>>>>> Out of curiosity, what is the overhead of a spin_lock() 
>>>>>>>>>>>>>> that doesn't
>>>>>>>>>>>>>> need to
>>>>>>>>>>>>>> spin?
>>>>>>>>>>>>> I guess it's hard to tell exactly, but it is much lower on 
>>>>>>>>>>>>> modern x86
>>>>>>>>>>>>> than what it used to be. Not sure about ARM, which is the 
>>>>>>>>>>>>> other
>>>>>>>>>>>>> architecture important to us. I figure if there is little 
>>>>>>>>>>>>> cache-line
>>>>>>>>>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>                                  spinlock_t *lock)
>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>>>>>>>>          spin_lock(lock);
>>>>>>>>>>>>>>> }
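>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (and presumably the matching unlock side, same condition:)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> static void gpuvm_cond_spin_unlock(const struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>                                    spinlock_t *lock)
>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>>>>>>>>          spin_unlock(lock);
>>>>>>>>>>>>>>> }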
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For such drivers, that would require anybody calling 
>>>>>>>>>>>>>>>>> unlink to
>>>>>>>>>>>>>>>>> hold the vm's
>>>>>>>>>>>>>>>>> resv, though.
>>>>>>>>>>>>>>>> In V4 I want to go back to having a dedicated lock for 
>>>>>>>>>>>>>>>> the GEMs
>>>>>>>>>>>>>>>> gpuva list (or
>>>>>>>>>>>>>>>> VM_BO list to be more precise). We can't just use the 
>>>>>>>>>>>>>>>> dma-resv
>>>>>>>>>>>>>>>> lock for that
>>>>>>>>>>>>>>>> with VM_BO abstractions, because on destruction of a 
>>>>>>>>>>>>>>>> VM_BO we
>>>>>>>>>>>>>>>> otherwise wouldn't
>>>>>>>>>>>>>>>> be allowed to already hold the dma-resv lock. That's 
>>>>>>>>>>>>>>>> the fix I
>>>>>>>>>>>>>>>> was referring to
>>>>>>>>>>>>>>>> earlier.
>>>>>>>>>>>>>>> Yeah, I can see the need for a dedicated lock for the 
>>>>>>>>>>>>>>> GEM's gpuva
>>>>>>>>>>>>>>> list, but
>>>>>>>>>>>>>>> holding the vm's dma-resv lock across the unlink 
>>>>>>>>>>>>>>> shouldn't be a
>>>>>>>>>>>>>>> problem. We
>>>>>>>>>>>>>>> may free the object and a pointer to the vm's resv 
>>>>>>>>>>>>>>> during unlink
>>>>>>>>>>>>>>> but we
>>>>>>>>>>>>>>> don't free the vm's resv.  It'd be a matter of ensuring 
>>>>>>>>>>>>>>> that any
>>>>>>>>>>>>>>> calls to
>>>>>>>>>>>>>>> unlink from *within* drm_gpuvm allow it to be held.
>>>>>>>>>>>>>> Drivers calling unlink() from the fence signaling path 
>>>>>>>>>>>>>> can't use the
>>>>>>>>>>>>>> VM's
>>>>>>>>>>>>>> dma-resv lock.
>>>>>>>>>>>>> Yes, that made me a bit curious because in the current 
>>>>>>>>>>>>> version the code
>>>>>>>>>>>>> required the object's dma_resv for unlink() which can't be 
>>>>>>>>>>>>> grabbed
>>>>>>>>>>>>> either from the fence signaling path. So are there any 
>>>>>>>>>>>>> drivers actually
>>>>>>>>>>>>> wanting to do that? If so, they will either need to resort 
>>>>>>>>>>>>> to the
>>>>>>>>>>>>> current spinlock solution or they will need to call unlink 
>>>>>>>>>>>>> from a
>>>>>>>>>>>>> workqueue item.
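>>>>>>>>>>>>> Something along these lines (rough sketch; the wrapper struct is
>>>>>>>>>>>>> made up and it simply takes the VM's resv in the worker):
>>>>>>>>>>>>>
>>>>>>>>>>>>> struct unlink_work {
>>>>>>>>>>>>>         struct work_struct work;
>>>>>>>>>>>>>         struct drm_gpuvm *vm;
>>>>>>>>>>>>>         struct drm_gpuva *va;
>>>>>>>>>>>>> };
>>>>>>>>>>>>>
>>>>>>>>>>>>> static void unlink_worker(struct work_struct *work)
>>>>>>>>>>>>> {
>>>>>>>>>>>>>         struct unlink_work *uw = container_of(work, struct unlink_work, work);
>>>>>>>>>>>>>
>>>>>>>>>>>>>         /* NULL ww ctx, so this can't return -EDEADLK */
>>>>>>>>>>>>>         dma_resv_lock(uw->vm->resv, NULL);
>>>>>>>>>>>>>         drm_gpuva_unlink(uw->va);   /* may drop the last vm_bo ref */
>>>>>>>>>>>>>         dma_resv_unlock(uw->vm->resv);
>>>>>>>>>>>>>         kfree(uw);
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> i.e. the fence signalling path only does INIT_WORK() + queue_work()
>>>>>>>>>>>>> and never touches any resv itself.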
>>>>>>>>>>>> As Boris already mentioned we have the dma-resv lock by 
>>>>>>>>>>>> default or a driver
>>>>>>>>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of 
>>>>>>>>>>>> the latter.
>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, what if the object is an external object? We can't 
>>>>>>>>>>>>>> use the VM's
>>>>>>>>>>>>>> dma-resv
>>>>>>>>>>>>>> lock here.
>>>>>>>>>>>>> Why? Typically (sync) unlink is only ever called from an 
>>>>>>>>>>>>> unbind-like
>>>>>>>>>>>>> operation where it should be trivial to grab the vm's 
>>>>>>>>>>>>> resv. Or, for
>>>>>>>>>>>>> that matter any outer lock protecting the extobj list. 
>>>>>>>>>>>>> Rule would be
>>>>>>>>>>>>> the drm_gpuvm_bo::entry::extobj  and 
>>>>>>>>>>>>> drm_gpuvm_bo::entry::evict would
>>>>>>>>>>>>> be protected by either the vm's dma_resv (or possibly an 
>>>>>>>>>>>>> outer lock in
>>>>>>>>>>>>> the case of the extobj list).
>>>>>>>>>>>> Outer lock wouldn't have been working for updates in the 
>>>>>>>>>>>> async path, but
>>>>>>>>>>>> shouldn't be relevant anymore. We could use the VM's resv 
>>>>>>>>>>>> for that.
>>>>>>>>>>>>
>>>>>>>>>>>>>>   And we can't have the GEM objs dma-resv lock held when 
>>>>>>>>>>>>>> calling
>>>>>>>>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which 
>>>>>>>>>>>>>> if the
>>>>>>>>>>>>>> refcount drops
>>>>>>>>>>>>>> to zero calls drm_gpuvm_bo_destroy() and 
>>>>>>>>>>>>>> drm_gpuvm_bo_destroy() might
>>>>>>>>>>>>>> drop the
>>>>>>>>>>>>>> last reference of the GEM object.
>>>>>>>>>>>>> Yes, but this is a different problem as to what exactly 
>>>>>>>>>>>>> protects
>>>>>>>>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an 
>>>>>>>>>>>>> internal per bo list
>>>>>>>>>>>>> lock, or if we want to keep the bo's dma_resv we need to 
>>>>>>>>>>>>> ensure that
>>>>>>>>>>>>> the caller of dma_resv_unlock(obj->resv) actually 
>>>>>>>>>>>>> refcounts its obj
>>>>>>>>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's 
>>>>>>>>>>>>> refcount (I know
>>>>>>>>>>>>> Boris didn't like that, but requiring an explicit refcount 
>>>>>>>>>>>>> for a
>>>>>>>>>>>>> pointer you dereference unless you're under a lock that 
>>>>>>>>>>>>> ensures keeping
>>>>>>>>>>>>> the object alive is pretty much required?) But anyway for the
>>>>>>>>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or 
>>>>>>>>>>>>> internal spinlock)
>>>>>>>>>>>>> I don't have a strong preference.
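>>>>>>>>>>>>>
>>>>>>>>>>>>> (The "caller refcounts its obj pointer" variant above would be
>>>>>>>>>>>>> something like this, sketch:)
>>>>>>>>>>>>>
>>>>>>>>>>>>> struct drm_gem_object *obj = va->gem.obj;
>>>>>>>>>>>>>
>>>>>>>>>>>>> drm_gem_object_get(obj);      /* own ref, independent of the vm_bo */
>>>>>>>>>>>>> dma_resv_lock(obj->resv, NULL);
>>>>>>>>>>>>> drm_gpuva_unlink(va);         /* may drop the last vm_bo / GEM ref */
>>>>>>>>>>>>> dma_resv_unlock(obj->resv);   /* obj, and thus its resv, still alive */
>>>>>>>>>>>>> drm_gem_object_put(obj);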
>>>>>>>>>>>> We can keep the GEM objects dma-resv lock, however as 
>>>>>>>>>>>> mentioned above
>>>>>>>>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then require 
>>>>>>>>>>>> both the VM's resv lock
>>>>>>>>>>>> and the GEM's resv lock in case they differ.
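>>>>>>>>>>>>
>>>>>>>>>>>> So an unbind path would have to lock both up front, e.g. via drm_exec
>>>>>>>>>>>> (sketch; 'obj' and 'va' assumed from context):
>>>>>>>>>>>>
>>>>>>>>>>>> struct drm_exec exec;
>>>>>>>>>>>> int ret = 0;
>>>>>>>>>>>>
>>>>>>>>>>>> drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES);
>>>>>>>>>>>> drm_exec_until_all_locked(&exec) {
>>>>>>>>>>>>         ret = drm_exec_lock_obj(&exec, &gpuvm->d_obj);  /* VM resv */
>>>>>>>>>>>>         drm_exec_retry_on_contention(&exec);
>>>>>>>>>>>>         if (ret)
>>>>>>>>>>>>                 break;
>>>>>>>>>>>>
>>>>>>>>>>>>         ret = drm_exec_lock_obj(&exec, obj);            /* GEM resv */
>>>>>>>>>>>>         drm_exec_retry_on_contention(&exec);
>>>>>>>>>>>>         if (ret)
>>>>>>>>>>>>                 break;
>>>>>>>>>>>> }
>>>>>>>>>>>> if (!ret)
>>>>>>>>>>>>         drm_gpuva_unlink(va);
>>>>>>>>>>>> drm_exec_fini(&exec);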
>>>>>>>>>>>>
>>>>>>>>>>>>>>   All those problems go away with a dedicated
>>>>>>>>>>>>>> GEM gpuva list lock.
>>>>>>>>>>>>> I don't think these are real problems.
>>>>>>>>>>>>> With the exception of the eviction list "trick" where we 
>>>>>>>>>>>>> currently have
>>>>>>>>>>>>> slightly different approach to collect external bos 
>>>>>>>>>>>>> needing rebinding,
>>>>>>>>>>>>> we have this working fine.
>>>>>>>>>>>>>
>>>>>>>>>>>>> TBH I think pretty much the only situation where the 
>>>>>>>>>>>>> spinlock is needed
>>>>>>>>>>>>> is for async updates of these lists, unless a wq item can 
>>>>>>>>>>>>> be used for
>>>>>>>>>>>>> that, but it doesn't really seem like the current code 
>>>>>>>>>>>>> allows for such
>>>>>>>>>>>>> updates anyway? It complicates the code a lot, adds 
>>>>>>>>>>>>> overhead and also
>>>>>>>>>>>>> adds the requirement for refcounting during list traversal.
>>>>>>>>>>>>>
>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It seems that with that also the refcount could be 
>>>>>>>>>>>>>>>>> made non-
>>>>>>>>>>>>>>>>> atomic.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> All in the spirit of the drm locking guidelines "use 
>>>>>>>>>>>>>>>>> big locks
>>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>>> possible".
>>>>>>>>>>>>>>>>> Lower level locks only when necessary for performance or
>>>>>>>>>>>>>>>>> locking inversion?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Elements popped from the original list are kept in a
>>>>>>>>>>>>>>>>>> local list, so removal
>>>>>>>>>>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>>>>>>>>>>> iterating the list.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name,
>>>>>>>>>>>>>>>>>> __local_list, __prev_vm_bo)     \
>>>>>>>>>>>>>>>>>> +       ({
>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>> +               struct drm_gpuvm_bo
>>>>>>>>>>>>>>>>>> *__vm_bo;                                           \
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);
>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>> __list_name.lock);                                \
>>>>>>>>>>>>>>>>>> +               while (!list_empty(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>> __list_name.list)) {                     \
>>>>>>>>>>>>>>>>>> +                       __vm_bo =
>>>>>>>>>>>>>>>>>> list_first_entry(&(__gpuvm)->__list_name.list,        \
>>>>>>>>>>>>>>>>>> + struct
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo,                 \
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> list.entry.__list_name);             \
>>>>>>>>>>>>>>>>>> +                       if
>>>>>>>>>>>>>>>>>> (drm_gpuvm_bo_get_unless_zero(__vm_bo))
>>>>>>>>>>>>>>>>>> {                    \
>>>>>>>>>>>>>>>>>> +                               list_move_tail(&(__vm_bo)- 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> list.entry.__list_name,      \
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> __local_list);                           \
>>>>>>>>>>>>>>>>>> +                               break;
>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>> +                       } else
>>>>>>>>>>>>>>>>>> {                                                        \ 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> +                               list_del_init(&(__vm_bo)- 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> list.entry.__list_name);      \
>>>>>>>>>>>>>>>>>> +                               __vm_bo =
>>>>>>>>>>>>>>>>>> NULL;                                         \
>>>>>>>>>>>>>>>>>> +                       }
>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>> __list_name.lock);                              \
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>> +               __vm_bo;
>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>> +       })
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list 
>>>>>>>>>>>>>>>>>> iterator
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list 
>>>>>>>>>>>>>>>>>> iteration.
>>>>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after 
>>>>>>>>>>>>>>>>>> picking the
>>>>>>>>>>>>>>>>>> first element from the
>>>>>>>>>>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Typical use:
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + *     ret = 0;
>>>>>>>>>>>>>>>>>> + *     drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>,
>>>>>>>>>>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>>>>>>>>>>> + *             ret = do_something_with_vm_bo(..., 
>>>>>>>>>>>>>>>>>> vm_bo);
>>>>>>>>>>>>>>>>>> + *             if (ret)
>>>>>>>>>>>>>>>>>> + *                     break;
>>>>>>>>>>>>>>>>>> + *     }
>>>>>>>>>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>> + *     drm_gpuvm_restore_vm_bo_list(gpuvm, <list_name>,
>>>>>>>>>>>>>>>>>> &my_local_list);
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Only used for internal list iterations, not meant 
>>>>>>>>>>>>>>>>>> to be
>>>>>>>>>>>>>>>>>> exposed to the outside
>>>>>>>>>>>>>>>>>> + * world.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name,
>>>>>>>>>>>>>>>>>> __local_list, __vm_bo)    \
>>>>>>>>>>>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm,
>>>>>>>>>>>>>>>>>> __list_name,           \
>>>>>>>>>>>>>>>>>> +                                               __local_list, 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> NULL);            \
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> __vm_bo;
>>>>>>>>>>>>>>>>>>        \
>>>>>>>>>>>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm,
>>>>>>>>>>>>>>>>>> __list_name,           \
>>>>>>>>>>>>>>>>>> +                                               __local_list, 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> __vm_bo))         \
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back 
>>>>>>>>>>>>>>>>>> to their
>>>>>>>>>>>>>>>>>> original list
>>>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're 
>>>>>>>>>>>>>>>>>> iterating on
>>>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used 
>>>>>>>>>>>>>>>>>> to store
>>>>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * When we're done iterating a vm_bo list, we should 
>>>>>>>>>>>>>>>>>> call
>>>>>>>>>>>>>>>>>> restore_vm_bo_list()
>>>>>>>>>>>>>>>>>> + * to restore the original state and let new 
>>>>>>>>>>>>>>>>>> iterations take
>>>>>>>>>>>>>>>>>> place.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name,
>>>>>>>>>>>>>>>>>> __local_list)                         \
>>>>>>>>>>>>>>>>>> +       do
>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>                  \
>>>>>>>>>>>>>>>>>> +               /* Merge back the two lists, moving 
>>>>>>>>>>>>>>>>>> local
>>>>>>>>>>>>>>>>>> list elements to the          \
>>>>>>>>>>>>>>>>>> +                * head to preserve previous 
>>>>>>>>>>>>>>>>>> ordering, in
>>>>>>>>>>>>>>>>>> case it matters.              \
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> */
>>>>>>>>>>>>>>>>>>            \
>>>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>> __list_name.lock);                                \
>>>>>>>>>>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)-
>>>>>>>>>>>>>>>>>>> __list_name.list);                \
>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>> __list_name.lock);                              \
>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into the 
>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Inserts the given @__vm_bo into the list 
>>>>>>>>>>>>>>>>>> specified by
>>>>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo,
>>>>>>>>>>>>>>>>>> __list_name)                            \
>>>>>>>>>>>>>>>>>> +       do
>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>          \
>>>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>> __list_name.lock);                    \
>>>>>>>>>>>>>>>>>> +               if (list_empty(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>> list.entry.__list_name))             \
>>>>>>>>>>>>>>>>>> +                       list_add_tail(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>> list.entry.__list_name,       \
>>>>>>>>>>>>>>>>>> + &(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>> __list_name.list);        \
>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>> __list_name.lock);                  \
>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from the 
>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Removes the given @__vm_bo from the list 
>>>>>>>>>>>>>>>>>> specified by
>>>>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo,
>>>>>>>>>>>>>>>>>> __list_name)                            \
>>>>>>>>>>>>>>>>>> +       do
>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>          \
>>>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>> __list_name.lock);                    \
>>>>>>>>>>>>>>>>>> +               if (!list_empty(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>> list.entry.__list_name))            \
>>>>>>>>>>>>>>>>>> +                       list_del_init(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>> list.entry.__list_name);      \
>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>> __list_name.lock);                  \
>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo 
>>>>>>>>>>>>>>>>>> *vm_bo);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>     #define 
>>>>>>>>>>>>>>>>>> to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>>>>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm 
>>>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>>> struct drm_device *drm,
>>>>>>>>>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct 
>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>> *gpuvm)
>>>>>>>>>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>>>>>>>>>>               "GPUVA tree is not empty, potentially 
>>>>>>>>>>>>>>>>>> leaking
>>>>>>>>>>>>>>>>>> memory.\n");
>>>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), 
>>>>>>>>>>>>>>>>>> "Extobj list
>>>>>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict 
>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all 
>>>>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all 
>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Using this function directly, it is the drivers
>>>>>>>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Note: This function is safe against concurrent 
>>>>>>>>>>>>>>>>>> insertion
>>>>>>>>>>>>>>>>>> and removal of
>>>>>>>>>>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>>>>>>>>>>> concurrent usage itself.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>>>>>>>>>>>>> either an outer VM lock
>>>>>>>>>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this 
>>>>>>>>>>>>>>>>>> function
>>>>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the 
>>>>>>>>>>>>>>>>>> GPUVM's
>>>>>>>>>>>>>>>>>> dma-resv lock ensures
>>>>>>>>>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>> +                         struct drm_exec *exec,
>>>>>>>>>>>>>>>>>> + unsigned int num_fences)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, 
>>>>>>>>>>>>>>>>>> vm_bo->obj,
>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs 
>>>>>>>>>>>>>>>>>> mapped within
>>>>>>>>>>>>>>>>>> a given range
>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all 
>>>>>>>>>>>>>>>>>> &drm_gem_objects
>>>>>>>>>>>>>>>>>> mapped between @addr
>>>>>>>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>>>>>>>> drm_exec *exec,
>>>>>>>>>>>>>>>>>> +                       u64 addr, u64 range, unsigned 
>>>>>>>>>>>>>>>>>> int
>>>>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, 
>>>>>>>>>>>>>>>>>> end) {
>>>>>>>>>>>>>>>>>> +               struct drm_gem_object *obj = 
>>>>>>>>>>>>>>>>>> va->gem.obj;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>> +                       return ret;
>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Additionally, when calling this function with struct
>>>>>>>>>>>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>>>>>>>>>>>> + * being set the driver receives the given @fn 
>>>>>>>>>>>>>>>>>> callback to
>>>>>>>>>>>>>>>>>> lock additional
>>>>>>>>>>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>> instance.
>>>>>>>>>>>>>>>>>> Typically, drivers
>>>>>>>>>>>>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>>>>>>>>>>>>> callback.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       flags = interruptible ? 
>>>>>>>>>>>>>>>>>> DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>>>>>>>>>>>> 0 |
>>>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +               ret = 
>>>>>>>>>>>>>>>>>> drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>>>>>>>>>> +                       ret = vm_exec->extra.fn(vm_exec,
>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>> +                       drm_exec_retry_on_contention(exec); 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> +                       if (ret)
>>>>>>>>>>>>>>>>>> +                               goto err;
>>>>>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +static int
>>>>>>>>>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, 
>>>>>>>>>>>>>>>>>> unsigned int
>>>>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, 
>>>>>>>>>>>>>>>>>> args-
>>>>>>>>>>>>>>>>>>> objs,
>>>>>>>>>>>>>>>>>> + args->num_objs,
>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv 
>>>>>>>>>>>>>>>>>> of all
>>>>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>>>>>>>>>> + * @num_objs: the number of additional 
>>>>>>>>>>>>>>>>>> &drm_gem_objects to
>>>>>>>>>>>>>>>>>> lock
>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>> given &drm_gpuvm
>>>>>>>>>>>>>>>>>> + * contains mappings of, plus the ones given through 
>>>>>>>>>>>>>>>>>> @objs.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>>>>>>>>>>>> + unsigned int num_objs,
>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>>>> +       } args;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>>>>>>>>>>>> interruptible);
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped
>>>>>>>>>>>>>>>>>> within a given range
>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects
>>>>>>>>>>>>>>>>>> mapped between @addr and
>>>>>>>>>>>>>>>>>> + * @addr + @range.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       flags = interruptible ? 
>>>>>>>>>>>>>>>>>> DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>>>>>>>>>>>> 0 |
>>>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, 
>>>>>>>>>>>>>>>>>> exec,
>>>>>>>>>>>>>>>>>> addr, range,
>>>>>>>>>>>>>>>>>> + num_fences);
>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as 
>>>>>>>>>>>>>>>>>> evicted
>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for 
>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>> evicted buffer
>>>>>>>>>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, 
>>>>>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private 
>>>>>>>>>>>>>>>>>> and all
>>>>>>>>>>>>>>>>>> extobj
>>>>>>>>>>>>>>>>>> + * dma-resv
>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>>>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>>>>>>>>>>> +                        enum dma_resv_usage 
>>>>>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>>>>>> +                        enum dma_resv_usage 
>>>>>>>>>>>>>>>>>> extobj_usage)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>>>>>>>>>> +       unsigned long index;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, 
>>>>>>>>>>>>>>>>>> obj) {
>>>>>>>>>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>>>>>>>>>> + drm_gpuvm_is_extobj(gpuvm,
>>>>>>>>>>>>>>>>>> obj) ?
>>>>>>>>>>>>>>>>>> + private_usage :
>>>>>>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance 
>>>>>>>>>>>>>>>>>> of struct
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct 
>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>>>>>>>>>          return vm_bo;
>>>>>>>>>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct 
>>>>>>>>>>>>>>>>>> kref *kref)
>>>>>>>>>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct 
>>>>>>>>>>>>>>>>>> kref *kref)
>>>>>>>>>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the 
>>>>>>>>>>>>>>>>>> reference of
>>>>>>>>>>>>>>>>>>      *
>>>>>>>>>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * If the reference count drops to zero, the 
>>>>>>>>>>>>>>>>>> &gpuvm_bo is
>>>>>>>>>>>>>>>>>> destroyed, which
>>>>>>>>>>>>>>>>>> + * includes removing it from the GEMs gpuva list. 
>>>>>>>>>>>>>>>>>> Hence, if
>>>>>>>>>>>>>>>>>> a call to this
>>>>>>>>>>>>>>>>>> + * function can potentially let the reference count drop 
>>>>>>>>>>>>>>>>>> to zero,
>>>>>>>>>>>>>>>>>> the caller must
>>>>>>>>>>>>>>>>>> + * hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>     void
>>>>>>>>>>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct 
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo 
>>>>>>>>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>>>>>>>>>>> @@ -938,6 +1412,48 @@ 
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the 
>>>>>>>>>>>>>>>>>> &drm_gpuvm_bo to its
>>>>>>>>>>>>>>>>>> &drm_gpuvm's
>>>>>>>>>>>>>>>>>> + * extobj list
>>>>>>>>>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its 
>>>>>>>>>>>>>>>>>> &drm_gpuvm's the
>>>>>>>>>>>>>>>>>> extobj list.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj 
>>>>>>>>>>>>>>>>>> list if
>>>>>>>>>>>>>>>>>> not on the list
>>>>>>>>>>>>>>>>>> + * already and if the corresponding &drm_gem_object 
>>>>>>>>>>>>>>>>>> is an
>>>>>>>>>>>>>>>>>> external object,
>>>>>>>>>>>>>>>>>> + * actually.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a 
>>>>>>>>>>>>>>>>>> &drm_gem_object to
>>>>>>>>>>>>>>>>>> / from a
>>>>>>>>>>>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from all
>>>>>>>>>>>>>>>>>> &drm_gpuvms evicted
>>>>>>>>>>>>>>>>>> + * list containing a mapping of this &drm_gem_object.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool 
>>>>>>>>>>>>>>>>>> evict)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>>>>>>>>>> +               if (evict)
>>>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, 
>>>>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>>>>> +               else
>>>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, 
>>>>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>     static int
>>>>>>>>>>>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>>>>>>>>>           * space
>>>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>> +                * @list: &list_head storing 
>>>>>>>>>>>>>>>>>> &drm_gpuvm_bos
>>>>>>>>>>>>>>>>>> serving as
>>>>>>>>>>>>>>>>>> +                * external object
>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the 
>>>>>>>>>>>>>>>>>> extobj list
>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>>>> +       } extobj;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>> +        * @evict: structure holding the evict list 
>>>>>>>>>>>>>>>>>> and evict
>>>>>>>>>>>>>>>>>> list lock
>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>> +                * @list: &list_head storing 
>>>>>>>>>>>>>>>>>> &drm_gpuvm_bos
>>>>>>>>>>>>>>>>>> currently being
>>>>>>>>>>>>>>>>>> +                * evicted
>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the 
>>>>>>>>>>>>>>>>>> evict list
>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>>>> +       } evict;
>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct
>>>>>>>>>>>>>>>>>> drm_device *drm,
>>>>>>>>>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct 
>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given
>>>>>>>>>>>>>>>>>> &drm_gem_object is an
>>>>>>>>>>>>>>>>>> + * external object
>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv 
>>>>>>>>>>>>>>>>>> differs
>>>>>>>>>>>>>>>>>> from the
>>>>>>>>>>>>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm
>>>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>>> + struct drm_gem_object
>>>>>>>>>>>>>>>>>> *obj)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct 
>>>>>>>>>>>>>>>>>> drm_gpuva *va)
>>>>>>>>>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, 
>>>>>>>>>>>>>>>>>> gpuvm__)
>>>>>>>>>>>>>>>>>> \
>>>>>>>>>>>>>>>>>>          list_for_each_entry_safe(va__, next__, 
>>>>>>>>>>>>>>>>>> &(gpuvm__)-
>>>>>>>>>>>>>>>>>>> rb.list, rb.entry)
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>>>>>>>>>>> &drm_exec should be.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Optionally, @extra can be set in order to lock 
>>>>>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>>>>>> &drm_gem_objects.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA 
>>>>>>>>>>>>>>>>>> reservations
>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>> +        * @extra: Callback and corresponding private 
>>>>>>>>>>>>>>>>>> data
>>>>>>>>>>>>>>>>>> for the driver to
>>>>>>>>>>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>>>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>> + unsigned int num_fences);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>> +                * @priv: driver private data for the 
>>>>>>>>>>>>>>>>>> @fn
>>>>>>>>>>>>>>>>>> callback
>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>> +               void *priv;
>>>>>>>>>>>>>>>>>> +       } extra;
>>>>>>>>>>>>>>>>>> +};
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs 
>>>>>>>>>>>>>>>>>> common dma-
>>>>>>>>>>>>>>>>>> resv
>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>>>>>>>>>>>>> &drm_gem_object.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Using this function directly, it is the drivers
>>>>>>>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +static inline int
>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>>>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj,
>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>> + unsigned int num_fences);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>>>>>>>>>>> + unsigned int num_fences);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>> + struct drm_gem_object **objs,
>>>>>>>>>>>>>>>>>> + unsigned int num_objs,
>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>> + bool interruptible);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>> + u64 addr, u64 range,
>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>> + bool interruptible);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all 
>>>>>>>>>>>>>>>>>> associated
>>>>>>>>>>>>>>>>>> BOs
>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects
>>>>>>>>>>>>>>>>>> previously acquired
>>>>>>>>>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>> + struct dma_fence *fence,
>>>>>>>>>>>>>>>>>> + enum dma_resv_usage
>>>>>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>>>>>> + enum dma_resv_usage
>>>>>>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence()
>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>> + struct dma_fence *fence,
>>>>>>>>>>>>>>>>>> + enum dma_resv_usage
>>>>>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>>>>>> + enum dma_resv_usage
>>>>>>>>>>>>>>>>>> extobj_usage)
>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, 
>>>>>>>>>>>>>>>>>> &vm_exec->exec,
>>>>>>>>>>>>>>>>>> fence,
>>>>>>>>>>>>>>>>>> + private_usage,
>>>>>>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>>>>>>>>>                           * gpuva list.
>>>>>>>>>>>>>>>>>>                           */
>>>>>>>>>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>>>> +                        * @extobj: List entry to 
>>>>>>>>>>>>>>>>>> attach to
>>>>>>>>>>>>>>>>>> the &drm_gpuvms
>>>>>>>>>>>>>>>>>> +                        * extobj list.
>>>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>>>> +                        * @evict: List entry to 
>>>>>>>>>>>>>>>>>> attach to
>>>>>>>>>>>>>>>>>> the &drm_gpuvms evict
>>>>>>>>>>>>>>>>>> +                        * list.
>>>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>>>>>>>>>                  } entry;
>>>>>>>>>>>>>>>>>>          } list;
>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
>>>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk 
>>>>>>>>>>>>>>>>>> over a
>>>>>>>>>>>>>>>>>> list of &drm_gpuva
>>>>>>>>>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each
>>>>>>>>>>>>>>>>>> iteration step
>>>>>>>>>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>>>>>>>>>           * used.
>>>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op 
>>>>>>>>>>>>>>>>>> *op, void
>>>>>>>>>>>>>>>>>> *priv);
>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>>>> +        * Drivers receive this callback for every evicted &drm_gem_object being
>>>>>>>>>>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>>>> +        * Typically, drivers would call their driver specific variant of
>>>>>>>>>>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, 
>>>>>>>>>>>>>>>>>> void *priv,
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-20 12:06                                     ` Thomas Hellström
@ 2023-09-20 13:06                                       ` Christian König
  2023-09-20 13:38                                         ` Thomas Hellström
  0 siblings, 1 reply; 77+ messages in thread
From: Christian König @ 2023-09-20 13:06 UTC (permalink / raw)
  To: Thomas Hellström, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel



Am 20.09.23 um 14:06 schrieb Thomas Hellström:
>
> On 9/20/23 12:51, Christian König wrote:
>> Am 20.09.23 um 09:44 schrieb Thomas Hellström:
>>> Hi,
>>>
>>> On 9/20/23 07:37, Christian König wrote:
>>>> Am 19.09.23 um 17:23 schrieb Thomas Hellström:
>>>>>
>>>>> On 9/19/23 17:16, Danilo Krummrich wrote:
>>>>>> On 9/19/23 14:21, Thomas Hellström wrote:
>>>>>>> Hi Christian
>>>>>>>
>>>>>>> On 9/19/23 14:07, Christian König wrote:
>>>>>>>> Am 13.09.23 um 17:46 schrieb Danilo Krummrich:
>>>>>>>>> On 9/13/23 17:33, Christian König wrote:
>>>>>>>>>> Am 13.09.23 um 17:15 schrieb Danilo Krummrich:
>>>>>>>>>>> On 9/13/23 16:26, Christian König wrote:
>>>>>>>>>>>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>>>>>>>>>>>> As mentioned in a different mail thread, the reply is 
>>>>>>>>>>>>> based on the assumption
>>>>>>>>>>>>> that we don't support anything else than GPUVM updates 
>>>>>>>>>>>>> from the IOCTL.
>>>>>>>>>>>>
>>>>>>>>>>>> I think that this assumption is incorrect.
>>>>>>>>>>>
>>>>>>>>>>> Well, more precisely I should have said "don't support GPUVM updates
>>>>>>>>>>> from within fence signaling critical sections". And looking at the
>>>>>>>>>>> code, that doesn't seem to be what you're doing there.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Vulkan is just one specific use case, but this here should 
>>>>>>>>>>>> probably be able to handle other use cases as well.
>>>>>>>>>>>>
>>>>>>>>>>>> Especially with HMM you get the requirement that you need 
>>>>>>>>>>>> to be able to invalidate GPUVM mappings without grabbing a 
>>>>>>>>>>>> reservation lock.
>>>>>>>>>>>
>>>>>>>>>>> What do you mean with "invalidate GPUVM mappings" in this 
>>>>>>>>>>> context? drm_gpuvm_bo_evict()
>>>>>>>>>>> should only be called from a ttm_device_funcs::move 
>>>>>>>>>>> callback, we should hold the dma-resv
>>>>>>>>>>> lock there.
>>>>>>>>>>
>>>>>>>>>> Well the question is which dma-resv lock do we hold?
>>>>>>>>>>
>>>>>>>>>> In the move callback we only hold the dma-resv lock of the BO 
>>>>>>>>>> which is moved, but when that is a shared BO then that's not 
>>>>>>>>>> the same as the one for the VM.
>>>>>>>>>
>>>>>>>>> Correct, Thomas' idea was to use the GEM's dma_resv lock to 
>>>>>>>>> protect drm_gpuvm_bo::evicted
>>>>>>>>> and then actually move the drm_gpuvm_bo to the VM's evicted 
>>>>>>>>> list once we grabbed all
>>>>>>>>> dma-resv locks when locking the VM's BOs using drm_exec. We 
>>>>>>>>> can remove them from the evicted
>>>>>>>>> list on validate(). This way we never touch the evicted list 
>>>>>>>>> without holding at least the VM's
>>>>>>>>> dma-resv lock.
>>>>>>>>>
>>>>>>>>> Do you have any concerns about that?
>>>>>>>>
>>>>>>>> Scratching my head a bit how that is supposed to work.
>>>>>>>>
>>>>>>>> This implies that you go over all the evicted BOs during 
>>>>>>>> validation and not just the one mentioned in the CS.
>>>>>>>>
>>>>>>>> That might work for Vulkan, but is pretty much a no-go for OpenGL.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> See what the eviction lock in amdgpu is doing for example.
>>>>>>>>>>>
>>>>>>>>>>> The eviction_lock seems to protect a VM state "evicting" of 
>>>>>>>>>>> whether any BO that
>>>>>>>>>>> is associated with the VM is currently evicting. At the same 
>>>>>>>>>>> time amdgpu protects
>>>>>>>>>>> the evicted list of the VM with a different lock. So this 
>>>>>>>>>>> seems to be entirely
>>>>>>>>>>> unrelated. Tracking a "currently evicting" state is not part 
>>>>>>>>>>> of the GPUVM
>>>>>>>>>>> implementation currently and hence nothing would change for 
>>>>>>>>>>> amdgpu there.
>>>>>>>>>>
>>>>>>>>>> Sorry for the confusion we use different terminology in amdgpu.
>>>>>>>>>>
>>>>>>>>>> The eviction lock and evicted state is for the VM page 
>>>>>>>>>> tables, e.g. if the whole VM is currently not used and 
>>>>>>>>>> swapped out or even de-allocated.
>>>>>>>>>>
>>>>>>>>>> This is necessary because we have cases where we need to 
>>>>>>>>>> access the VM data without holding the dma-resv lock of this 
>>>>>>>>>> VM. Especially figuring out which parts of an address space 
>>>>>>>>>> contain mappings and which doesn't.
>>>>>>>>>
>>>>>>>>> I think this is fine, this has nothing to do with lists of 
>>>>>>>>> evicted GEM objects or external GEM
>>>>>>>>> objects, right? Marking mappings (drm_gpuva) as invalidated 
>>>>>>>>> (DRM_GPUVA_INVALIDATED) or accessing
>>>>>>>>> the VA space does not require any dma-resv locks.
>>>>>>>>
>>>>>>>> I hope so, but I'm not 100% sure.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This is a requirement which comes with HMM handling, you 
>>>>>>>>>> won't see this with Vulkan (or OpenGL, VAAPI etc..).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The invalidation lock on the other hand is what in this 
>>>>>>>>>> discussion is called eviction lock. This one is needed 
>>>>>>>>>> because what I wrote above, during the move callback only the 
>>>>>>>>>> dma-resv of the BO which is moved is locked, but not 
>>>>>>>>>> necessarily the dma-resv of the VM.
>>>>>>>>>
>>>>>>>>> That's yet another thing, right? This is used to track whether 
>>>>>>>>> *any* BO that belongs to the VM is
>>>>>>>>> currently being evicted, correct? As mentioned, as by now this 
>>>>>>>>> is not supported in GPUVM and hence
>>>>>>>>> would be the same driver specific code with the same driver 
>>>>>>>>> specific lock.
>>>>>>>>
>>>>>>>> That is most likely a show stopper using this for OpenGL based 
>>>>>>>>>> workloads as far as I can see. For those you need to be able to 
>>>>>>>> figure out which non-VM BOs have been evicted and which parts 
>>>>>>>> of the VM needs updates.
>>>>>>>
>>>>>>> We identify those with a bool in the gpuvm_bo, and that bool is 
>>>>>>> protected by the bo_resv. In essence, the "evicted" list must be 
>>>>>>> made up-to-date with all relevant locks held before traversing 
>>>>>>> in the next exec.
>>>>>>
>>>>>> What I still miss with this idea is how do we find all the 
>>>>>> drm_gpuvm_bo structures with the evicted bool set to true? When 
>>>>>> doing the drm_exec dance we come across all external ones and can 
>>>>>> add them to the list if needed, but what about the BOs having the 
>>>>>> VM's dma-resv?
>>>>>
>>>>> Oh, they can be added to the evict list directly (no bool needed) 
>>>>> in the eviction code, like in v3. Since for those we indeed hold 
>>>>> the VM's dma_resv since it's aliased with the object's dma-resv.
>>>>
>>>> Yeah, I wanted to note what Danilo seems to think about as well. 
>>>> How do we figure out the non-VM BOs evicted?
>>>>
>>>> We can't walk over the list of all non-VM BOs on every submission, 
>>>> that's too much overhead for cases with lots of non-VM BOs.
>>>>
>>>> And we can't rely on userspace sending all non-VM BOs as used list 
>>>> down to the kernel with each submission.
>>>>
>>>> Regards,
>>>> Christian.
>>>
>>> No, that's not needed: Mechanism below.
>>>
>>> 1) We maintain an evicted list. Typically protected by the vm resv.
>>> 2) Each gpuvm_bo has a bool "evicted". Protected by the bo resv.
>>>
>>> a) Evicting a vm bo: The vm resv is held by the eviction code. Just 
>>> put it on the evicted list.
>>> b) Evicting a shared/external bo: The bo resv is held by the 
>>> eviction code. Set the "evicted" bool
>>> c) Validating the evicted list on exec:
>>
>>
>>> Loop through all *external/shared* bos.
>>
>> And this is what you can't do. For Vulkan it probably doesn't matter, 
>> but for OpenGL and especially multimedia we have many more BOs on the 
>> shared list than what's allocated for the VM.
>
> But you need to lock- and fence all those so you need to loop through 
> them anyway, so we're still O(n_shared)? Or is there some clever 
> optimization in amdgpu?

Why should I lock and fence them? Only the BOs in the relocation list 
are locked and fenced.

Regards,
Christian.

>
> I think with some UMDs, xe might end up with similar large lists...
>
> /Thomas
>
>
>>
>> Regards,
>> Christian.
>>
>>> Lock them. After locking, check the "evicted" bool; if it's true, 
>>> put the bo on the evicted list (we hold the VM resv at this point) 
>>> and clear the "evicted" bool. Note that other vms will have their 
>>> own gpuvm_bo which is marked evicted.
>>>
>>> I have this coded up in a patch for Xe and it seems to be working 
>>> properly.
>>>
>>> /Thomas
>>>
>>>
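For reference, a rough sketch of what the scheme above boils down to; the
"evicted" member and the two driver_* iterators are purely illustrative and
not the actual GPUVM API of this series:

/* b) eviction of a shared/external BO: only the BO's dma-resv is held. */
static void driver_gem_evicted(struct drm_gem_object *obj)
{
        struct drm_gpuvm_bo *vm_bo;

        dma_resv_assert_held(obj->resv);

        driver_gem_for_each_vm_bo(vm_bo, obj)   /* hypothetical iterator */
                vm_bo->evicted = true;          /* bool protected by the BO resv */
}

/* c) exec: while locking the external BOs, collect the marked ones. */
static int driver_collect_evicted(struct drm_gpuvm *gpuvm, struct drm_exec *exec)
{
        struct drm_gpuvm_bo *vm_bo;
        int ret;

        driver_vm_for_each_extobj(vm_bo, gpuvm) {       /* hypothetical iterator */
                ret = drm_exec_prepare_obj(exec, vm_bo->obj, 1);
                if (ret)
                        return ret;

                /* BO resv locked now; VM resv was locked when preparing the VM. */
                if (vm_bo->evicted) {
                        list_move_tail(&vm_bo->list.entry.evict, &gpuvm->evict.list);
                        vm_bo->evicted = false;
                }
        }

        return 0;
}

drm_gpuvm_validate() would then walk gpuvm->evict.list with all resv locks
held and drop entries again; case a) covers BOs sharing the VM's resv, which
the eviction code can put on the evict list directly.
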
>>>>
>>>>>
>>>>> /Thomas
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> If you mean that we need to unbind all vmas of all vms of 
>>>>>>> evicted bos before evicting, we don't do that, at least not in 
>>>>>>> Xe, since when evicting we wait for VM idle, and it can't access 
>>>>>>> anything through the stale vmas until they have been revalidated 
>>>>>>> and rebound.
>>>>>>>
>>>>>>> /Thomas
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Christian.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas Hellström 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas 
>>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas 
>>>>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>>>>> Hi, Danilo,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>>>>> So far the DRM GPUVA manager offers common 
>>>>>>>>>>>>>>>>>>> infrastructure to
>>>>>>>>>>>>>>>>>>> track GPU VA
>>>>>>>>>>>>>>>>>>> allocations and mappings, generically connect GPU VA 
>>>>>>>>>>>>>>>>>>> mappings
>>>>>>>>>>>>>>>>>>> to their
>>>>>>>>>>>>>>>>>>> backing buffers and perform more complex mapping 
>>>>>>>>>>>>>>>>>>> operations
>>>>>>>>>>>>>>>>>>> on the GPU VA
>>>>>>>>>>>>>>>>>>> space.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> However, there are more design patterns commonly 
>>>>>>>>>>>>>>>>>>> used by
>>>>>>>>>>>>>>>>>>> drivers, which
>>>>>>>>>>>>>>>>>>> can potentially be generalized in order to make the 
>>>>>>>>>>>>>>>>>>> DRM GPUVA
>>>>>>>>>>>>>>>>>>> manager
>>>>>>>>>>>>>>>>>>> represent a basic GPU-VM implementation. In this 
>>>>>>>>>>>>>>>>>>> context,
>>>>>>>>>>>>>>>>>>> this patch aims
>>>>>>>>>>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not 
>>>>>>>>>>>>>>>>>>> being used
>>>>>>>>>>>>>>>>>>> outside of
>>>>>>>>>>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM 
>>>>>>>>>>>>>>>>>>> objects
>>>>>>>>>>>>>>>>>>> which are
>>>>>>>>>>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 3) Provide functions to efficiently lock all GEM 
>>>>>>>>>>>>>>>>>>> objects dma-
>>>>>>>>>>>>>>>>>>> resv the
>>>>>>>>>>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>>>>>>>>>>>> contains mappings
>>>>>>>>>>>>>>>>>>>       of, such that validation of evicted GEM 
>>>>>>>>>>>>>>>>>>> objects is
>>>>>>>>>>>>>>>>>>> accelerated.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 5) Provide some convinience functions for common 
>>>>>>>>>>>>>>>>>>> patterns.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Rather than being designed as a "framework", the 
>>>>>>>>>>>>>>>>>>> target is to
>>>>>>>>>>>>>>>>>>> make all
>>>>>>>>>>>>>>>>>>> features appear as a collection of optional helper 
>>>>>>>>>>>>>>>>>>> functions,
>>>>>>>>>>>>>>>>>>> such that
>>>>>>>>>>>>>>>>>>> drivers are free to make use of the DRM GPUVA 
>>>>>>>>>>>>>>>>>>> managers basic
>>>>>>>>>>>>>>>>>>> functionality and opt-in for other features without 
>>>>>>>>>>>>>>>>>>> setting
>>>>>>>>>>>>>>>>>>> any feature
>>>>>>>>>>>>>>>>>>> flags, just by making use of the corresponding 
>>>>>>>>>>>>>>>>>>> functions.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Big kudos to Boris Brezillon for his help to figure out
>>>>>>>>>>>>>>>>>>> locking for drivers
>>>>>>>>>>>>>>>>>>> updating the GPU VA space within the fence 
>>>>>>>>>>>>>>>>>>> signalling path.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>>>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>> drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>>>>>>>>>     include/drm/drm_gpuvm.h | 197 ++++++++++++++
>>>>>>>>>>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an 
>>>>>>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>>>>>>> instance of this
>>>>>>>>>>>>>>>>>>>      * particular combination. If not existent a new 
>>>>>>>>>>>>>>>>>>> instance
>>>>>>>>>>>>>>>>>>> is created and linked
>>>>>>>>>>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given &drm_gpuvm, are also used
>>>>>>>>>>>>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and evicted objects. Those
>>>>>>>>>>>>>>>>>>> + * lists are maintained in order to accelerate locking of dma-resv locks and
>>>>>>>>>>>>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For instance, all
>>>>>>>>>>>>>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be locked by calling
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call drm_gpuvm_validate() in
>>>>>>>>>>>>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is also possible to lock
>>>>>>>>>>>>>>>>>>> + * additional &drm_gem_objects by providing the corresponding parameters to
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec loop while making
>>>>>>>>>>>>>>>>>>> + * use of helper functions such as drm_gpuvm_prepare_range() or
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Every bound &drm_gem_object is treated as external object when its &dma_resv
>>>>>>>>>>>>>>>>>>> + * structure is different than the &drm_gpuvm's common &dma_resv structure.
>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for 
>>>>>>>>>>>>>>>>>>> the same
>>>>>>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>>>>>>      * &drm_gem_object must be able to observe previous
>>>>>>>>>>>>>>>>>>> creations and destructions
>>>>>>>>>>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances 
>>>>>>>>>>>>>>>>>>> unique.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of external and evicted objects are
>>>>>>>>>>>>>>>>>>> + * protected against concurrent insertion / removal and iteration internally.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * However, drivers still need to protect concurrent calls to functions
>>>>>>>>>>>>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function contains a particular
>>>>>>>>>>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Functions adding or removing entries from those lists, such as
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add(), may be called with
>>>>>>>>>>>>>>>>>>> + * external locks being held, e.g. in order to avoid the corresponding list
>>>>>>>>>>>>>>>>>>> + * being modified while potentially being iterated by other API functions.
>>>>>>>>>>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>>>>>>>>>>      *   }
>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo 
>>>>>>>>>>>>>>>>>>> element
>>>>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're 
>>>>>>>>>>>>>>>>>>> iterating on
>>>>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used 
>>>>>>>>>>>>>>>>>>> to store
>>>>>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>>>>>>>>>>>>> drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list 
>>>>>>>>>>>>>>>>>>> iteration.
>>>>>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after 
>>>>>>>>>>>>>>>>>>> picking the
>>>>>>>>>>>>>>>>>>> first element from
>>>>>>>>>>>>>>>>>>> + * the list, so list insertion and deletion can happen
>>>>>>>>>>>>>>>>>>> + * concurrently.
>>>>>>>>>>>>>>>>>> Are the list spinlocks needed for that async state 
>>>>>>>>>>>>>>>>>> update from
>>>>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>>>>>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Otherwise it should be sufficient to protect the 
>>>>>>>>>>>>>>>>>> lists with the
>>>>>>>>>>>>>>>>>> gpuvm's resv
>>>>>>>>>>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If those spinlocks are still needed in some 
>>>>>>>>>>>>>>>>>> situations, perhaps
>>>>>>>>>>>>>>>>>> could we
>>>>>>>>>>>>>>>>>> have an option to set them to NULL (Like IIRC the 
>>>>>>>>>>>>>>>>>> maple tree
>>>>>>>>>>>>>>>>>> allows for)?
>>>>>>>>>>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>>>>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>>>>>>>>>>> holding only the dma-resv lock from the BO this 
>>>>>>>>>>>>>>>>> function gets
>>>>>>>>>>>>>>>>> called for. Hence,
>>>>>>>>>>>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() 
>>>>>>>>>>>>>>>>> calls with
>>>>>>>>>>>>>>>>> different BOs.
>>>>>>>>>>>>>>>> No. Only if you try to add external objects to the vm's 
>>>>>>>>>>>>>>>> evict list
>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>> within the evict code. That's not necessary since you 
>>>>>>>>>>>>>>>> loop through
>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>> external objects anyway when locking them so an 
>>>>>>>>>>>>>>>> "evicted" bool in
>>>>>>>>>>>>>>>> the vm_bo,
>>>>>>>>>>>>>>>> protected by the bo resv would be sufficient. The 
>>>>>>>>>>>>>>>> extobj locking
>>>>>>>>>>>>>>>> loop can
>>>>>>>>>>>>>>>> then add the bo to the evicted list.
>>>>>>>>>>>>>>> And validate() can remove it while still holding all 
>>>>>>>>>>>>>>> dma-resv locks,
>>>>>>>>>>>>>>> neat!
>>>>>>>>>>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>>>>>>>>>>> concurrently? What
>>>>>>>>>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>>>>>>>>>>> drm_gpuva_unlink()?
>>>>>>>>>>>>>>> Are we guaranteed that at this point of time the 
>>>>>>>>>>>>>>> drm_gpuvm_bo is not
>>>>>>>>>>>>>>> on the
>>>>>>>>>>>>>>> evicted list? Because otherwise we would call 
>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>>>>> might drop the last reference to the drm_gem_object and 
>>>>>>>>>>>>>>> hence we'd
>>>>>>>>>>>>>>> potentially
>>>>>>>>>>>>>>> free the dma-resv lock while holding it, at least if 
>>>>>>>>>>>>>>> it's an external
>>>>>>>>>>>>>>> object.
>>>>>>>>>>>>>> Easiest way in this scheme is to think of the lists as 
>>>>>>>>>>>>>> being protected
>>>>>>>>>>>>>> by the vm's resv lock. That means anybody calling 
>>>>>>>>>>>>>> unlink() must also
>>>>>>>>>>>>>> hold the vm's resv lock. (Which is OK from an UAF point 
>>>>>>>>>>>>>> of view, but
>>>>>>>>>>>>>> perhaps not from a locking inversion POV for an async 
>>>>>>>>>>>>>> list update).
>>>>>>>>>>>>> This would mean that on unlink() we'd need to hold the 
>>>>>>>>>>>>> VM's resv lock and the
>>>>>>>>>>>>> corresponding GEM's resv lock (in case they're not the 
>>>>>>>>>>>>> same anyways) because the
>>>>>>>>>>>>> VM's resv lock would protect the external / evicted object 
>>>>>>>>>>>>> lists and the GEM
>>>>>>>>>>>>> objects resv lock protects the GEM's list of drm_gpuvm_bos 
>>>>>>>>>>>>> and the
>>>>>>>>>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For extobjs an outer lock would be enough in case of 
>>>>>>>>>>>>>>>>> Xe, but I
>>>>>>>>>>>>>>>>> really would not
>>>>>>>>>>>>>>>>> like to add even more complexity just to get the 
>>>>>>>>>>>>>>>>> spinlock out of
>>>>>>>>>>>>>>>>> the way in case
>>>>>>>>>>>>>>>>> the driver already has an outer lock protecting this 
>>>>>>>>>>>>>>>>> path.
>>>>>>>>>>>>>>>> I must disagree here. These spinlocks and atomic 
>>>>>>>>>>>>>>>> operations are
>>>>>>>>>>>>>>>> pretty
>>>>>>>>>>>>>>>> costly and as discussed earlier this type of locking 
>>>>>>>>>>>>>>>> was the reason
>>>>>>>>>>>>>>>> (at
>>>>>>>>>>>>>>>> least according to the commit message) that made 
>>>>>>>>>>>>>>>> Christian drop the
>>>>>>>>>>>>>>>> XArray
>>>>>>>>>>>>>>>> use in drm_exec for the same set of objects: "The 
>>>>>>>>>>>>>>>> locking overhead
>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>> unecessary and measurable". IMHO the spinlock is the added
>>>>>>>>>>>>>>>> complexity and a
>>>>>>>>>>>>>>>> single wide lock following the drm locking guidelines 
>>>>>>>>>>>>>>>> set out by
>>>>>>>>>>>>>>>> Daniel and
>>>>>>>>>>>>>>>> David should really be the default choice with an 
>>>>>>>>>>>>>>>> opt-in for a
>>>>>>>>>>>>>>>> spinlock if
>>>>>>>>>>>>>>>> needed for async and pushing out to a wq is not an option.
>>>>>>>>>>>>>>> For the external object list an outer lock would work as 
>>>>>>>>>>>>>>> long as it's
>>>>>>>>>>>>>>> not the
>>>>>>>>>>>>>>> dma-resv lock of the corresponding GEM object, since 
>>>>>>>>>>>>>>> here we actually
>>>>>>>>>>>>>>> need to
>>>>>>>>>>>>>>> remove the list entry from the external object list on
>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>>>>>>>>>> It's just a bit weird design wise that drivers would 
>>>>>>>>>>>>>>> need to take
>>>>>>>>>>>>>>> this outer
>>>>>>>>>>>>>>> lock on:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>>>>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also 
>>>>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>>>>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>>>>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Given that it seems reasonable to do all the required 
>>>>>>>>>>>>>>> locking
>>>>>>>>>>>>>>> internally.
>>>>>>>>>>>>>>  From a design POV, there has been a clear direction in 
>>>>>>>>>>>>>> XE to make
>>>>>>>>>>>>>> things similar to mmap() / munmap(), so this outer lock, 
>>>>>>>>>>>>>> which in Xe is
>>>>>>>>>>>>>> an rwsem, is used in a similar way as the mmap_lock. It's 
>>>>>>>>>>>>>> protecting
>>>>>>>>>>>>>> the page-table structures and vma rb tree, the userptr 
>>>>>>>>>>>>>> structures and
>>>>>>>>>>>>>> the extobj list. Basically it's taken early in the exec 
>>>>>>>>>>>>>> IOCTL, the
>>>>>>>>>>>>>> VM_BIND ioctl, the compute rebind worker and the 
>>>>>>>>>>>>>> pagefault handler, so
>>>>>>>>>>>>>> all of the above are just asserting that it is taken in 
>>>>>>>>>>>>>> the correct
>>>>>>>>>>>>>> mode.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But strictly with this scheme one could also use the vm's 
>>>>>>>>>>>>>> dma_resv for
>>>>>>>>>>>>>> the extobj list since with drm_exec, it's locked before 
>>>>>>>>>>>>>> traversing the
>>>>>>>>>>>>>> list.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The whole point of this scheme is to rely on locks that 
>>>>>>>>>>>>>> you already are
>>>>>>>>>>>>>> supposed to be holding for various reasons and is simple 
>>>>>>>>>>>>>> to comprehend.
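As a rough illustration of that pattern (struct and lock names are made up
here, not Xe's actual ones), the GPUVM entry points would then merely assert
that the outer lock is held in the right mode:

static void driver_vm_assert_held(struct driver_vm *vm, bool write)
{
        if (write)
                lockdep_assert_held_write(&vm->lock);   /* VM_BIND / pagefault updates */
        else
                lockdep_assert_held(&vm->lock);         /* exec only needs read mode */
}
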
>>>>>>>>>>>>> I don't agree that we're supposed to hold the VM's resv 
>>>>>>>>>>>>> lock anyways for
>>>>>>>>>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), 
>>>>>>>>>>>>> but I'm fine using it
>>>>>>>>>>>>> for that purpose nevertheless.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In order to at least place lockdep checks, the driver 
>>>>>>>>>>>>>>> would need to
>>>>>>>>>>>>>>> supply the
>>>>>>>>>>>>>>> corresponding lock's lockdep_map, because the GPUVM 
>>>>>>>>>>>>>>> otherwise doesn't
>>>>>>>>>>>>>>> know about
>>>>>>>>>>>>>>> the lock.
>>>>>>>>>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>>>>>>>>> I'd really like to avoid that, especially now that 
>>>>>>>>>>>>> everything got simpler. We
>>>>>>>>>>>>> should define the actual locks to take instead.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Out of curiosity, what is the overhead of a spin_lock() 
>>>>>>>>>>>>>>> that doesn't
>>>>>>>>>>>>>>> need to
>>>>>>>>>>>>>>> spin?
>>>>>>>>>>>>>> I guess it's hard to tell exactly, but it is much lower 
>>>>>>>>>>>>>> on modern x86
>>>>>>>>>>>>>> than what it used to be. Not sure about ARM, which is the 
>>>>>>>>>>>>>> other
>>>>>>>>>>>>>> architecture important to us. I figure if there is little 
>>>>>>>>>>>>>> cache-line
>>>>>>>>>>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>                                  spinlock_t *lock)
>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>>>>>>>>>          spin_lock(lock);
>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For such drivers, that would require anybody calling 
>>>>>>>>>>>>>>>>>> unlink to
>>>>>>>>>>>>>>>>>> hold the vm's
>>>>>>>>>>>>>>>>>> resv, though.
>>>>>>>>>>>>>>>>> In V4 I want to go back to having a dedicated lock for 
>>>>>>>>>>>>>>>>> the GEMs
>>>>>>>>>>>>>>>>> gpuva list (or
>>>>>>>>>>>>>>>>> VM_BO list to be more precise). We can't just use the 
>>>>>>>>>>>>>>>>> dma-resv
>>>>>>>>>>>>>>>>> lock for that
>>>>>>>>>>>>>>>>> with VM_BO abstractions, because on destruction of a 
>>>>>>>>>>>>>>>>> VM_BO we
>>>>>>>>>>>>>>>>> otherwise wouldn't
>>>>>>>>>>>>>>>>> be allowed to already hold the dma-resv lock. That's 
>>>>>>>>>>>>>>>>> the fix I
>>>>>>>>>>>>>>>>> was referring to
>>>>>>>>>>>>>>>>> earlier.
>>>>>>>>>>>>>>>> Yeah, I can see the need for a dedicated lock for the 
>>>>>>>>>>>>>>>> GEM's gpuva
>>>>>>>>>>>>>>>> list, but
>>>>>>>>>>>>>>>> holding the vm's dma-resv lock across the unlink 
>>>>>>>>>>>>>>>> shouldn't be a
>>>>>>>>>>>>>>>> problem. We
>>>>>>>>>>>>>>>> may free the object and a pointer to the vm's resv 
>>>>>>>>>>>>>>>> during unlink
>>>>>>>>>>>>>>>> but we
>>>>>>>>>>>>>>>> don't free the vm's resv.  It'd be a matter of ensuring 
>>>>>>>>>>>>>>>> that any
>>>>>>>>>>>>>>>> calls to
>>>>>>>>>>>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>>>>>>>>>>> Drivers calling unlink() from the fence signaling path 
>>>>>>>>>>>>>>> can't use the
>>>>>>>>>>>>>>> VM's
>>>>>>>>>>>>>>> dma-resv lock.
>>>>>>>>>>>>>> Yes, that made me a bit curious because in the current 
>>>>>>>>>>>>>> version the code
>>>>>>>>>>>>>> required the object's dma_resv for unlink() which can't 
>>>>>>>>>>>>>> be grabbed
>>>>>>>>>>>>>> either from the fence signaling path. So are there any 
>>>>>>>>>>>>>> drivers actually
>>>>>>>>>>>>>> wanting to do that? If so, they will either need to 
>>>>>>>>>>>>>> resort to the
>>>>>>>>>>>>>> current spinlock solution or they will need to call 
>>>>>>>>>>>>>> unlink from a
>>>>>>>>>>>>>> workqueue item.
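A minimal sketch of that workqueue variant (struct and function names made up
for illustration):

struct driver_gpuva {
        struct drm_gpuva base;
        struct work_struct cleanup_work;
};

static void driver_gpuva_cleanup(struct work_struct *work)
{
        struct driver_gpuva *va = container_of(work, struct driver_gpuva,
                                               cleanup_work);

        /* Process context: taking whatever lock drm_gpuva_unlink() requires
         * (VM resv, GEM resv or a dedicated gpuva list lock) is fine here.
         */
        drm_gpuva_unlink(&va->base);
        kfree(va);
}

/* Called from the fence signalling critical path instead of unlinking directly. */
static void driver_gpuva_defer_unlink(struct driver_gpuva *va)
{
        INIT_WORK(&va->cleanup_work, driver_gpuva_cleanup);
        queue_work(system_wq, &va->cleanup_work);
}
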
>>>>>>>>>>>>> As Boris already mentioned we have the dma-resv lock by 
>>>>>>>>>>>>> default or a driver
>>>>>>>>>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of 
>>>>>>>>>>>>> the latter.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, what if the object is an external object? We can't 
>>>>>>>>>>>>>>> use the VM's
>>>>>>>>>>>>>>> dma-resv
>>>>>>>>>>>>>>> lock here.
>>>>>>>>>>>>>> Why? Typically (sync) unlink is only ever called from an 
>>>>>>>>>>>>>> unbind-like
>>>>>>>>>>>>>> operation where it should be trivial to grab the vm's 
>>>>>>>>>>>>>> resv. Or, for
>>>>>>>>>>>>>> that matter any outer lock protecting the extobj list. 
>>>>>>>>>>>>>> Rule would be
>>>>>>>>>>>>>> the drm_gpuvm_bo::entry::extobj  and 
>>>>>>>>>>>>>> drm_gpuvm_bo::entry::evict would
>>>>>>>>>>>>>> be protected by either the vm's dma_resv (or possibly an 
>>>>>>>>>>>>>> outer lock in
>>>>>>>>>>>>>> the case of the extobj list).
>>>>>>>>>>>>> Outer lock wouldn't have been working for updates in the 
>>>>>>>>>>>>> async path, but
>>>>>>>>>>>>> shouldn't be relevant anymore. We could use the VM's resv 
>>>>>>>>>>>>> for that.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   And we can't have the GEM objs dma-resv lock held when 
>>>>>>>>>>>>>>> calling
>>>>>>>>>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), which 
>>>>>>>>>>>>>>> if the
>>>>>>>>>>>>>>> refcount drops
>>>>>>>>>>>>>>> to zero calls drm_gpuvm_bo_destroy() and 
>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy() might
>>>>>>>>>>>>>>> drop the
>>>>>>>>>>>>>>> last reference of the GEM object.
>>>>>>>>>>>>>> Yes, but this is a different problem as to what exactly 
>>>>>>>>>>>>>> protects
>>>>>>>>>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an 
>>>>>>>>>>>>>> internal per bo list
>>>>>>>>>>>>>> lock, or if we want to keep the bo's dma_resv we need to 
>>>>>>>>>>>>>> ensure that
>>>>>>>>>>>>>> the caller of dma_resv_unlock(obj->resv) actually 
>>>>>>>>>>>>>> refcounts its obj
>>>>>>>>>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's 
>>>>>>>>>>>>>> refcount (I know
>>>>>>>>>>>>>> Boris didn't like that, but requiring an explicit 
>>>>>>>>>>>>>> refcount for a
>>>>>>>>>>>>>> pointer you dereference unless you're under a lock that 
>>>>>>>>>>>>>> ensures keeping
>>>>>>>>>>>>>> the object alive is pretty much required?) But anyway for 
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or 
>>>>>>>>>>>>>> internal spinlock)
>>>>>>>>>>>>>> I don't have a strong preference.
>>>>>>>>>>>>> We can keep the GEM objects dma-resv lock, however as 
>>>>>>>>>>>>> mentioned above
>>>>>>>>>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires 
>>>>>>>>>>>>> both the VM's resv lock
>>>>>>>>>>>>> and the GEM's resv lock in case they differ.
>>>>>>>>>>>>>
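Spelled out, that would look roughly like the following; simplified in that a
real driver would use drm_exec or a ww_acquire_ctx when taking more than one
resv lock, and the VM's common resv is reached through this series' d_obj:

static void driver_va_unlink(struct drm_gpuva *va)
{
        struct drm_gem_object *obj = va->gem.obj;
        struct dma_resv *vm_resv = va->vm->d_obj.resv;

        dma_resv_lock(vm_resv, NULL);           /* protects extobj / evict lists */
        if (obj->resv != vm_resv)
                dma_resv_lock(obj->resv, NULL); /* protects the GEM's vm_bo list */

        drm_gpuva_unlink(va);                   /* may drop the last vm_bo reference */

        if (obj->resv != vm_resv)
                dma_resv_unlock(obj->resv);
        dma_resv_unlock(vm_resv);
}
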
>>>>>>>>>>>>>>>   All those problems go away with a dedicated
>>>>>>>>>>>>>>> GEM gpuva list lock.
>>>>>>>>>>>>>> I don't think these are real problems.
>>>>>>>>>>>>>> With the exception of the eviction list "trick", where we 
>>>>>>>>>>>>>> currently have a slightly different approach to collect 
>>>>>>>>>>>>>> external bos needing rebinding, we have this working fine.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> TBH I think pretty much the only situation where the 
>>>>>>>>>>>>>> spinlock is needed
>>>>>>>>>>>>>> is for async updates of these lists, unless a wq item can 
>>>>>>>>>>>>>> be used for
>>>>>>>>>>>>>> that, but it doesn't really seem like the current code 
>>>>>>>>>>>>>> allows for such
>>>>>>>>>>>>>> updates anyway? It complicates the code a lot, adds 
>>>>>>>>>>>>>> overhead and also
>>>>>>>>>>>>>> adds the requirement for refcounting during list traversal.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It seems that with that also the refcount could be 
>>>>>>>>>>>>>>>>>> made non-atomic.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> All in the spirit of the drm locking guidelines "use 
>>>>>>>>>>>>>>>>>> big locks
>>>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>>>> possible".
>>>>>>>>>>>>>>>>>> Lower level locks only when necessary for performance or
>>>>>>>>>>>>>>>>>> locking inversion?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Elements popped from the original list are kept 
>>>>>>>>>>>>>>>>>>> in a
>>>>>>>>>>>>>>>>>>> local list, so removal
>>>>>>>>>>>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>>>>>>>>>>>> iterating the list.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, __list_name, __local_list, __prev_vm_bo)  \
>>>>>>>>>>>>>>>>>>> +       ({                                                                       \
>>>>>>>>>>>>>>>>>>> +               struct drm_gpuvm_bo *__vm_bo;                                    \
>>>>>>>>>>>>>>>>>>> +                                                                                \
>>>>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);                                  \
>>>>>>>>>>>>>>>>>>> +                                                                                \
>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                         \
>>>>>>>>>>>>>>>>>>> +               while (!list_empty(&(__gpuvm)->__list_name.list)) {              \
>>>>>>>>>>>>>>>>>>> +                       __vm_bo = list_first_entry(&(__gpuvm)->__list_name.list, \
>>>>>>>>>>>>>>>>>>> +                                                  struct drm_gpuvm_bo,          \
>>>>>>>>>>>>>>>>>>> +                                                  list.entry.__list_name);      \
>>>>>>>>>>>>>>>>>>> +                       if (drm_gpuvm_bo_get_unless_zero(__vm_bo)) {             \
>>>>>>>>>>>>>>>>>>> +                               list_move_tail(&(__vm_bo)->list.entry.__list_name, \
>>>>>>>>>>>>>>>>>>> +                                              __local_list);                    \
>>>>>>>>>>>>>>>>>>> +                               break;                                           \
>>>>>>>>>>>>>>>>>>> +                       } else {                                                 \
>>>>>>>>>>>>>>>>>>> +                               list_del_init(&(__vm_bo)->list.entry.__list_name); \
>>>>>>>>>>>>>>>>>>> +                               __vm_bo = NULL;                                  \
>>>>>>>>>>>>>>>>>>> +                       }                                                        \
>>>>>>>>>>>>>>>>>>> +               }                                                                \
>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                       \
>>>>>>>>>>>>>>>>>>> +                                                                                \
>>>>>>>>>>>>>>>>>>> +               __vm_bo;                                                         \
>>>>>>>>>>>>>>>>>>> +       })
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list 
>>>>>>>>>>>>>>>>>>> iterator
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list 
>>>>>>>>>>>>>>>>>>> iteration.
>>>>>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after 
>>>>>>>>>>>>>>>>>>> picking the
>>>>>>>>>>>>>>>>>>> first element from the
>>>>>>>>>>>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Typical use:
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + *     ret = 0;
>>>>>>>>>>>>>>>>>>> + *     drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>,
>>>>>>>>>>>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>>>>>>>>>>>> + *             ret = do_something_with_vm_bo(..., 
>>>>>>>>>>>>>>>>>>> vm_bo);
>>>>>>>>>>>>>>>>>>> + *             if (ret)
>>>>>>>>>>>>>>>>>>> + *                     break;
>>>>>>>>>>>>>>>>>>> + *     }
>>>>>>>>>>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>>> + *     drm_gpuvm_restore_vm_bo_list(gpuvm, 
>>>>>>>>>>>>>>>>>>> <list_name>,
>>>>>>>>>>>>>>>>>>> &my_local_list);
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Only used for internal list iterations, not 
>>>>>>>>>>>>>>>>>>> meant to be
>>>>>>>>>>>>>>>>>>> exposed to the outside
>>>>>>>>>>>>>>>>>>> + * world.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name, __local_list, __vm_bo)    \
>>>>>>>>>>>>>>>>>>> +       for (__vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>>>>>>>>>> +                                               __local_list, NULL);            \
>>>>>>>>>>>>>>>>>>> +            __vm_bo;                                                           \
>>>>>>>>>>>>>>>>>>> +            __vm_bo = get_next_vm_bo_from_list(__gpuvm, __list_name,           \
>>>>>>>>>>>>>>>>>>> +                                               __local_list, __vm_bo))         \
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back 
>>>>>>>>>>>>>>>>>>> to their
>>>>>>>>>>>>>>>>>>> original list
>>>>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're 
>>>>>>>>>>>>>>>>>>> iterating on
>>>>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used 
>>>>>>>>>>>>>>>>>>> to store
>>>>>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * When we're done iterating a vm_bo list, we 
>>>>>>>>>>>>>>>>>>> should call
>>>>>>>>>>>>>>>>>>> restore_vm_bo_list()
>>>>>>>>>>>>>>>>>>> + * to restore the original state and let new 
>>>>>>>>>>>>>>>>>>> iterations take
>>>>>>>>>>>>>>>>>>> place.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name, __local_list)                 \
>>>>>>>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>>>>>>>> +               /* Merge back the two lists, moving local list elements to the  \
>>>>>>>>>>>>>>>>>>> +                * head to preserve previous ordering, in case it matters.      \
>>>>>>>>>>>>>>>>>>> +                */                                                             \
>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)->__list_name.lock);                        \
>>>>>>>>>>>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)->__list_name.list);        \
>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)->__list_name.lock);                      \
>>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into 
>>>>>>>>>>>>>>>>>>> the given
>>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Inserts the given @__vm_bo into the list 
>>>>>>>>>>>>>>>>>>> specified by
>>>>>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo, __list_name)                            \
>>>>>>>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>>>>>>>>>> +               if (list_empty(&(__vm_bo)->list.entry.__list_name))             \
>>>>>>>>>>>>>>>>>>> +                       list_add_tail(&(__vm_bo)->list.entry.__list_name,       \
>>>>>>>>>>>>>>>>>>> +                                     &(__vm_bo)->vm->__list_name.list);        \
>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from 
>>>>>>>>>>>>>>>>>>> the given
>>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to remove from
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Removes the given @__vm_bo from the list 
>>>>>>>>>>>>>>>>>>> specified by
>>>>>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo, __list_name)                            \
>>>>>>>>>>>>>>>>>>> +       do {                                                                    \
>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm->__list_name.lock);                    \
>>>>>>>>>>>>>>>>>>> +               if (!list_empty(&(__vm_bo)->list.entry.__list_name))            \
>>>>>>>>>>>>>>>>>>> +                       list_del_init(&(__vm_bo)->list.entry.__list_name);      \
>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm->__list_name.lock);                  \
>>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo 
>>>>>>>>>>>>>>>>>>> *vm_bo);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>     #define 
>>>>>>>>>>>>>>>>>>> to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>>>>>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct drm_gpuvm 
>>>>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>>>> struct drm_device *drm,
>>>>>>>>>>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>          drm_gpuva_check_overflow(start_offset, range);
>>>>>>>>>>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct 
>>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>>> *gpuvm)
>>>>>>>>>>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>>>>>>>>>>>               "GPUVA tree is not empty, potentially 
>>>>>>>>>>>>>>>>>>> leaking
>>>>>>>>>>>>>>>>>>> memory.\n");
>>>>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), 
>>>>>>>>>>>>>>>>>>> "Extobj list
>>>>>>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), "Evict 
>>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all associated BOs
>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all 
>>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Using this function directly, it is the drivers
>>>>>>>>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Note: This function is safe against concurrent 
>>>>>>>>>>>>>>>>>>> insertion
>>>>>>>>>>>>>>>>>>> and removal of
>>>>>>>>>>>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>>>>>>>>>>>> concurrent usage itself.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Drivers need to make sure to protect this case with
>>>>>>>>>>>>>>>>>>> either an outer VM lock
>>>>>>>>>>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before this 
>>>>>>>>>>>>>>>>>>> function
>>>>>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the 
>>>>>>>>>>>>>>>>>>> GPUVM's
>>>>>>>>>>>>>>>>>>> dma-resv lock ensures
>>>>>>>>>>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>> + unsigned int num_fences)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, &extobjs,
>>>>>>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, 
>>>>>>>>>>>>>>>>>>> vm_bo->obj,
>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the 
>>>>>>>>>>>>>>>>>>> loop. */
>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs 
>>>>>>>>>>>>>>>>>>> mapped within
>>>>>>>>>>>>>>>>>>> a given range
>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all 
>>>>>>>>>>>>>>>>>>> &drm_gem_objects
>>>>>>>>>>>>>>>>>>> mapped between @addr
>>>>>>>>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, 
>>>>>>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>>>>>>> drm_exec *exec,
>>>>>>>>>>>>>>>>>>> +                       u64 addr, u64 range, 
>>>>>>>>>>>>>>>>>>> unsigned int
>>>>>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, addr, 
>>>>>>>>>>>>>>>>>>> end) {
>>>>>>>>>>>>>>>>>>> +               struct drm_gem_object *obj = 
>>>>>>>>>>>>>>>>>>> va->gem.obj;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>> +                       return ret;
>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all associated BOs
>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Additionally, when calling this function with struct
>>>>>>>>>>>>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>>>>>>>>>>>>> + * being set the driver receives the given @fn 
>>>>>>>>>>>>>>>>>>> callback to
>>>>>>>>>>>>>>>>>>> lock additional
>>>>>>>>>>>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>>> instance.
>>>>>>>>>>>>>>>>>>> Typically, drivers
>>>>>>>>>>>>>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>>>>>>>>>>>>>> callback.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>>>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       flags = DRM_EXEC_IGNORE_DUPLICATES |
>>>>>>>>>>>>>>>>>>> +               (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, exec,
>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +               ret = 
>>>>>>>>>>>>>>>>>>> drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>>>>>>>>>>> +                       ret = 
>>>>>>>>>>>>>>>>>>> vm_exec->extra.fn(vm_exec,
>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>> +                       drm_exec_retry_on_contention(exec); 
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> +                       if (ret)
>>>>>>>>>>>>>>>>>>> +                               goto err;
>>>>>>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +static int
>>>>>>>>>>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, 
>>>>>>>>>>>>>>>>>>> unsigned int
>>>>>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       return 
>>>>>>>>>>>>>>>>>>> drm_exec_prepare_array(&vm_exec->exec, args-
>>>>>>>>>>>>>>>>>>>> objs,
>>>>>>>>>>>>>>>>>>> + args->num_objs,
>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all associated BOs
>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>>>>>>>>>>> + * @num_objs: the number of additional 
>>>>>>>>>>>>>>>>>>> &drm_gem_objects to
>>>>>>>>>>>>>>>>>>> lock
>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>>> given &drm_gpuvm
>>>>>>>>>>>>>>>>>>> + * contains mappings of, plus the ones given 
>>>>>>>>>>>>>>>>>>> through @objs.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>>> + struct drm_gem_object **objs,
>>>>>>>>>>>>>>>>>>> + unsigned int num_objs,
>>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>>>>> +       } args;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences,
>>>>>>>>>>>>>>>>>>> interruptible);
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped within a given range
>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all &drm_gem_objects mapped between @addr
>>>>>>>>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       flags = interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0 |
>>>>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec, addr, range,
>>>>>>>>>>>>>>>>>>> +                                             num_fences);
>>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as evicted
>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all evicted buffer
>>>>>>>>>>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>>>>>>>>>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>>>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private and all extobj
>>>>>>>>>>>>>>>>>>> + * dma-resv
>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>>>>>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>>>>>>>>>>> +       unsigned long index;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>>>>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>>>>>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm, obj) ?
>>>>>>>>>>>>>>>>>>> +                                  extobj_usage : private_usage);
>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of struct drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>>>>>>>>>>          return vm_bo;
>>>>>>>>>>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>>>>>>>>>>>>>>>>>      *
>>>>>>>>>>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is destroyed, which
>>>>>>>>>>>>>>>>>>> + * includes removing it from the GEM's gpuva list. Hence, if a call to this
>>>>>>>>>>>>>>>>>>> + * function can potentially let the reference count drop to zero the caller
>>>>>>>>>>>>>>>>>>> + * must hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>     void
>>>>>>>>>>>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>>>>>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its &drm_gpuvm's
>>>>>>>>>>>>>>>>>>> + * extobj list
>>>>>>>>>>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's extobj list.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if it is not on the
>>>>>>>>>>>>>>>>>>> + * list already and the corresponding &drm_gem_object is actually an
>>>>>>>>>>>>>>>>>>> + * external object.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to / from the
>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm's evicted lists
>>>>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from the evicted list of all
>>>>>>>>>>>>>>>>>>> + * &drm_gpuvms containing a mapping of this &drm_gem_object.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>>>>>>>>>>> +               if (evict)
>>>>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, evict);
>>>>>>>>>>>>>>>>>>> +               else
>>>>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, evict);
>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>     static int
>>>>>>>>>>>>>>>>>>>     __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>>>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>>>>>>>>>>           * space
>>>>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos serving as
>>>>>>>>>>>>>>>>>>> +                * external object
>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the extobj list
>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>>>>> +       } extobj;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>> +        * @evict: structure holding the evict list and evict list lock
>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos currently being
>>>>>>>>>>>>>>>>>>> +                * evicted
>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the evict list
>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>>>>> +       } evict;
>>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given &drm_gem_object is an
>>>>>>>>>>>>>>>>>>> + * external object
>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv differs from the
>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm's &dma_resv, false otherwise
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>> +                                      struct drm_gem_object *obj)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) \
>>>>>>>>>>>>>>>>>>>          list_for_each_entry_safe(va__, next__, &(gpuvm__)->rb.list, rb.entry)
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of &drm_exec
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * This structure should be created on the stack as &drm_exec should be.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Optionally, @extra can be set in order to lock additional &drm_gem_objects.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>> +        * @extra: Callback and corresponding private data for the driver to
>>>>>>>>>>>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>> +                * @fn: The driver callback to lock additional &drm_gem_objects.
>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>> +                         unsigned int num_fences);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>> +                * @priv: driver private data for the @fn callback
>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>> +               void *priv;
>>>>>>>>>>>>>>>>>>> +       } extra;
>>>>>>>>>>>>>>>>>>> +};
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma-resv
>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy &drm_gem_object.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Using this function directly, it is the drivers responsibility to call
>>>>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +static inline int
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj, num_fences);
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>> +                             unsigned int num_fences);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>>>>>>>>>>>> +                           unsigned int num_fences);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>>>>>>>>>>>>>> +                             unsigned int num_objs,
>>>>>>>>>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>>>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
>>>>>>>>>>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>>>>>>>>> +                             enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>>>>>> +                             enum dma_resv_usage extobj_usage);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence() - add fence to private and all extobj dma-resv
>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>>>>>>>>> +                             enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>>>>>> +                             enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
>>>>>>>>>>>>>>>>>>> +                                private_usage, extobj_usage);
>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a &drm_gpuvm and
>>>>>>>>>>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>>>>>>>>>>                           * gpuva list.
>>>>>>>>>>>>>>>>>>>                           */
>>>>>>>>>>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>>>>> +                        * @extobj: List entry to attach to the &drm_gpuvms
>>>>>>>>>>>>>>>>>>> +                        * extobj list.
>>>>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>>>>> +                        * @evict: List entry to attach to the &drm_gpuvms
>>>>>>>>>>>>>>>>>>> +                        * evict list.
>>>>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>>>>>>>>>>                  } entry;
>>>>>>>>>>>>>>>>>>>          } list;
>>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
>>>>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of &drm_gpuva
>>>>>>>>>>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each iteration step
>>>>>>>>>>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>>>>>>>>>>           * used.
>>>>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>>>>> +        * Drivers receive this callback for every evicted &drm_gem_object
>>>>>>>>>>>>>>>>>>> +        * being mapped in the corresponding &drm_gpuvm.
>>>>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>>>>> +        * Typically, drivers would call their driver specific variant of
>>>>>>>>>>>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-20 13:06                                       ` Christian König
@ 2023-09-20 13:38                                         ` Thomas Hellström
  2023-09-20 13:48                                           ` Christian König
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-20 13:38 UTC (permalink / raw)
  To: Christian König, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel


On 9/20/23 15:06, Christian König wrote:
>
>
> Am 20.09.23 um 14:06 schrieb Thomas Hellström:
>>
>> On 9/20/23 12:51, Christian König wrote:
>>> Am 20.09.23 um 09:44 schrieb Thomas Hellström:
>>>> Hi,
>>>>
>>>> On 9/20/23 07:37, Christian König wrote:
>>>>> Am 19.09.23 um 17:23 schrieb Thomas Hellström:
>>>>>>
>>>>>> On 9/19/23 17:16, Danilo Krummrich wrote:
>>>>>>> On 9/19/23 14:21, Thomas Hellström wrote:
>>>>>>>> Hi Christian
>>>>>>>>
>>>>>>>> On 9/19/23 14:07, Christian König wrote:
>>>>>>>>> Am 13.09.23 um 17:46 schrieb Danilo Krummrich:
>>>>>>>>>> On 9/13/23 17:33, Christian König wrote:
>>>>>>>>>>> Am 13.09.23 um 17:15 schrieb Danilo Krummrich:
>>>>>>>>>>>> On 9/13/23 16:26, Christian König wrote:
>>>>>>>>>>>>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>>>>>>>>>>>>> As mentioned in a different mail thread, the reply is 
>>>>>>>>>>>>>> based on the assumption
>>>>>>>>>>>>>> that we don't support anything else than GPUVM updates 
>>>>>>>>>>>>>> from the IOCTL.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think that this assumption is incorrect.
>>>>>>>>>>>>
>>>>>>>>>>>> Well, more precisely I should have said "don't support 
>>>>>>>>>>>> GPUVM updates from within
>>>>>>>>>>>> fence signaling critical sections". And looking at the 
>>>>>>>>>>>> code, that doesn't seem what
>>>>>>>>>>>> you're doing there.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vulkan is just once specific use case, but this here 
>>>>>>>>>>>>> should probably be able to handle other use cases as well.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Especially with HMM you get the requirement that you need 
>>>>>>>>>>>>> to be able to invalidate GPUVM mappings without grabbing a 
>>>>>>>>>>>>> reservation lock.
>>>>>>>>>>>>
>>>>>>>>>>>> What do you mean with "invalidate GPUVM mappings" in this 
>>>>>>>>>>>> context? drm_gpuvm_bo_evict()
>>>>>>>>>>>> should only be called from a ttm_device_funcs::move 
>>>>>>>>>>>> callback, we should hold the dma-resv
>>>>>>>>>>>> lock there.
>>>>>>>>>>>
>>>>>>>>>>> Well the question is which dma-resv lock do we hold?
>>>>>>>>>>>
>>>>>>>>>>> In the move callback we only hold the dma-resv lock of the 
>>>>>>>>>>> BO which is moved, but when that is a shared BO then that's 
>>>>>>>>>>> not the same as the one for the VM.
>>>>>>>>>>
>>>>>>>>>> Correct, Thomas' idea was to use the GEM's dma_resv lock to 
>>>>>>>>>> protect drm_gpuvm_bo::evicted
>>>>>>>>>> and then actually move the drm_gpuvm_bo to the VM's evicted 
>>>>>>>>>> list once we grabbed all
>>>>>>>>>> dma-resv locks when locking the VM's BOs using drm_exec. We 
>>>>>>>>>> can remove them from the evicted
>>>>>>>>>> list on validate(). This way we never touch the evicted list 
>>>>>>>>>> without holding at least the VM's
>>>>>>>>>> dma-resv lock.
>>>>>>>>>>
>>>>>>>>>> Do you have any concerns about that?
>>>>>>>>>
>>>>>>>>> Scratching my head a bit how that is supposed to work.
>>>>>>>>>
>>>>>>>>> This implies that you go over all the evicted BOs during 
>>>>>>>>> validation and not just the one mentioned in the CS.
>>>>>>>>>
>>>>>>>>> That might work for Vulkan, but is pretty much a no-go for 
>>>>>>>>> OpenGL.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> See what the eviction lock in amdgpu is doing for example.
>>>>>>>>>>>>
>>>>>>>>>>>> The eviction_lock seems to protect a VM state "evicting" of 
>>>>>>>>>>>> whether any BO that
>>>>>>>>>>>> is associated with the VM is currently evicting. At the 
>>>>>>>>>>>> same time amdgpu protects
>>>>>>>>>>>> the evicted list of the VM with a different lock. So this 
>>>>>>>>>>>> seems to be entirely
>>>>>>>>>>>> unrelated. Tracking a "currently evicting" state is not 
>>>>>>>>>>>> part of the GPUVM
>>>>>>>>>>>> implementation currently and hence nothing would change for 
>>>>>>>>>>>> amdgpu there.
>>>>>>>>>>>
>>>>>>>>>>> Sorry for the confusion we use different terminology in amdgpu.
>>>>>>>>>>>
>>>>>>>>>>> The eviction lock and evicted state is for the VM page 
>>>>>>>>>>> tables, e.g. if the whole VM is currently not used and 
>>>>>>>>>>> swapped out or even de-allocated.
>>>>>>>>>>>
>>>>>>>>>>> This is necessary because we have cases where we need to 
>>>>>>>>>>> access the VM data without holding the dma-resv lock of this 
>>>>>>>>>>> VM. Especially figuring out which parts of an address space 
>>>>>>>>>>> contain mappings and which doesn't.
>>>>>>>>>>
>>>>>>>>>> I think this is fine, this has nothing to do with lists of 
>>>>>>>>>> evicted GEM objects or external GEM
>>>>>>>>>> objects, right? Marking mappings (drm_gpuva) as invalidated 
>>>>>>>>>> (DRM_GPUVA_INVALIDATED) or accessing
>>>>>>>>>> the VA space does not require any dma-resv locks.
>>>>>>>>>
>>>>>>>>> I hope so, but I'm not 100% sure.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This is a requirement which comes with HMM handling, you 
>>>>>>>>>>> won't see this with Vulkan (or OpenGL, VAAPI etc..).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The invalidation lock on the other hand is what in this 
>>>>>>>>>>> discussion is called eviction lock. This one is needed 
>>>>>>>>>>> because what I wrote above, during the move callback only 
>>>>>>>>>>> the dma-resv of the BO which is moved is locked, but not 
>>>>>>>>>>> necessarily the dma-resv of the VM.
>>>>>>>>>>
>>>>>>>>>> That's yet another thing, right? This is used to track 
>>>>>>>>>> whether *any* BO that belongs to the VM is
>>>>>>>>>> currently being evicted, correct? As mentioned, as by now 
>>>>>>>>>> this is not supported in GPUVM and hence
>>>>>>>>>> would be the same driver specific code with the same driver 
>>>>>>>>>> specific lock.
>>>>>>>>>
>>>>>>>>> That is most likely a show stopper using this for OpenGL based 
>>>>>>>>> workloads as far as I can see. For those you need to be able to 
>>>>>>>>> figure out which non-VM BOs have been evicted and which parts 
>>>>>>>>> of the VM need updates.
>>>>>>>>
>>>>>>>> We identify those with a bool in the gpuvm_bo, and that bool is 
>>>>>>>> protected by the bo_resv. In essence, the "evicted" list must 
>>>>>>>> be made up-to-date with all relevant locks held before 
>>>>>>>> traversing in the next exec.
>>>>>>>
>>>>>>> What I still miss with this idea is how do we find all the 
>>>>>>> drm_gpuvm_bo structures with the evicted bool set to true? When 
>>>>>>> doing the drm_exec dance we come across all external ones and 
>>>>>>> can add them to the list if needed, but what about the BOs 
>>>>>>> having the VM's dma-resv?
>>>>>>
>>>>>> Oh, they can be added to the evict list directly (no bool needed) 
>>>>>> in the eviction code, like in v3. Since for those we indeed hold 
>>>>>> the VM's dma_resv since it's aliased with the object's dma-resv.
>>>>>
>>>>> Yeah, I wanted to note what Danilo seems to think about as well. 
>>>>> How do we figure out the non-VM BOs evicted?
>>>>>
>>>>> We can't walk over the list of all non-VM BOs on every submission, 
>>>>> that's too much overhead for cases with lots of non-VM BOs.
>>>>>
>>>>> And we can't rely on userspace sending all non-VM BOs as used list 
>>>>> down to the kernel with each submission.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>
>>>> No, that's not needed: Mechanism below.
>>>>
>>>> 1) We maintain an evicted list. Typically protected by the vm resv.
>>>> 2) Each gpuvm_bo has a bool "evicted". Protected by the bo resv.
>>>>
>>>> a) Evicting a vm bo: The vm resv is held by the eviction code. Just 
>>>> put it on the evicted list.
>>>> b) Evicting a shared/external bo: The bo resv is held by the 
>>>> eviction code. Set the "evicted" bool
>>>> c) Validating the evicted list on exec:
>>>
>>>
>>>> Loop through all *external/shared* bos.
>>>
>>> And this is what you can't do. For Vulkan it probably doesn't 
>>> matter, but for OpenGL and especially multimedia we have much more 
>>> BOs on the shared list than what's allocated for the VM.
>>
>> But you need to lock- and fence all those so you need to loop through 
>> them anyway, so we're still O(n_shared)? Or is there some clever 
>> optimization in amdgpu?
>
> Why should I lock and fence them? Only the BOs in the relocation list 
> are locked and fenced.

By "relocation" list, do you refer to what gpuvm calls the "evict" list or 
something else? Like the relocation/validation list that used to be sent 
from user-space for non-VM_BIND vms?

The vm bos plus the external/shared bos bound to the VM (the external 
list) are the bos being referenced by the current batch. So the bos on 
the VM's external list are the ones being locked, fenced and checked 
for eviction. If they weren't, they could be evicted before the current 
batch completes?
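
For reference, with the helpers introduced in this patch that exec path
boils down to roughly the following sketch (error handling trimmed;
submit_job() and the dma-resv usage values are placeholders, not part of
the series):

	struct drm_gpuvm_exec vm_exec = { .vm = gpuvm };
	struct dma_fence *fence;
	int ret;

	/* Locks the VM's dummy GEM resv plus the resv of every object on
	 * the external object list, retrying internally on contention. */
	ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
	if (ret)
		return ret;

	/* Re-validate everything currently on the VM's evicted list. */
	ret = drm_gpuvm_validate(gpuvm);
	if (ret)
		goto out_unlock;

	fence = submit_job();	/* driver specific, not part of this series */

	/* Fence the VM resv and all external object resvs. */
	drm_gpuvm_exec_resv_add_fence(&vm_exec, fence,
				      DMA_RESV_USAGE_BOOKKEEP,
				      DMA_RESV_USAGE_BOOKKEEP);
	dma_fence_put(fence);

out_unlock:
	drm_gpuvm_exec_unlock(&vm_exec);
	return ret;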

Thanks,

Thomas


>
> Regards,
> Christian.
>
>>
>> I think with some UMDs, xe might end up with similar large lists...
>>
>> /Thomas
>>
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>>> Lock them. After locking, check the "evicted" bool; if it's true,
>>>> put the bo on the evicted list (we hold the VM resv at this point)
>>>> and clear the "evicted" bool, roughly as in the sketch below. Note
>>>> that other vms will have their own gpuvm_bo which is marked evicted.
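>>>>
>>>> Illustrative sketch only; the "evicted" bool is the one proposed
>>>> above (not in the series yet), the list members are the ones this
>>>> patch adds:
>>>>
>>>> 	/* All extobj resvs and the VM resv are locked at this point. */
>>>> 	list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj) {
>>>> 		if (vm_bo->evicted) {	/* protected by the bo's resv */
>>>> 			list_move_tail(&vm_bo->list.entry.evict,
>>>> 				       &gpuvm->evict.list);
>>>> 			vm_bo->evicted = false;
>>>> 		}
>>>> 	}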
>>>>
>>>> I have this coded up in a patch for Xe and it seems to be working 
>>>> properly.
>>>>
>>>> /Thomas
>>>>
>>>>
>>>>>
>>>>>>
>>>>>> /Thomas
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> If you mean that we need to unbind all vmas of all vms of 
>>>>>>>> evicted bos before evicting, We don't do that, at least not in 
>>>>>>>> Xe, since evicting we wait for VM idle, and it cant access 
>>>>>>>> anything through the stale vmas until they have been 
>>>>>>>> revalidated and rebound.
>>>>>>>>
>>>>>>>> /Thomas
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Christian.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas 
>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas 
>>>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas 
>>>>>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>>>>>> Hi, Danilo,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>>>>>> So far the DRM GPUVA manager offers common 
>>>>>>>>>>>>>>>>>>>> infrastructure to
>>>>>>>>>>>>>>>>>>>> track GPU VA
>>>>>>>>>>>>>>>>>>>> allocations and mappings, generically connect GPU 
>>>>>>>>>>>>>>>>>>>> VA mappings
>>>>>>>>>>>>>>>>>>>> to their
>>>>>>>>>>>>>>>>>>>> backing buffers and perform more complex mapping 
>>>>>>>>>>>>>>>>>>>> operations
>>>>>>>>>>>>>>>>>>>> on the GPU VA
>>>>>>>>>>>>>>>>>>>> space.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> However, there are more design patterns commonly 
>>>>>>>>>>>>>>>>>>>> used by
>>>>>>>>>>>>>>>>>>>> drivers, which
>>>>>>>>>>>>>>>>>>>> can potentially be generalized in order to make the 
>>>>>>>>>>>>>>>>>>>> DRM GPUVA
>>>>>>>>>>>>>>>>>>>> manager
>>>>>>>>>>>>>>>>>>>> represent a basic GPU-VM implementation. In this 
>>>>>>>>>>>>>>>>>>>> context,
>>>>>>>>>>>>>>>>>>>> this patch aims
>>>>>>>>>>>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not 
>>>>>>>>>>>>>>>>>>>> being used
>>>>>>>>>>>>>>>>>>>> outside of
>>>>>>>>>>>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM 
>>>>>>>>>>>>>>>>>>>> objects
>>>>>>>>>>>>>>>>>>>> which are
>>>>>>>>>>>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 3) Provide functions to efficiently lock all GEM 
>>>>>>>>>>>>>>>>>>>> objects dma-
>>>>>>>>>>>>>>>>>>>> resv the
>>>>>>>>>>>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>>>>>>>>>>>>> contains mappings
>>>>>>>>>>>>>>>>>>>>       of, such that validation of evicted GEM 
>>>>>>>>>>>>>>>>>>>> objects is
>>>>>>>>>>>>>>>>>>>> accelerated.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 5) Provide some convenience functions for common 
>>>>>>>>>>>>>>>>>>>> patterns.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Rather than being designed as a "framework", the 
>>>>>>>>>>>>>>>>>>>> target is to
>>>>>>>>>>>>>>>>>>>> make all
>>>>>>>>>>>>>>>>>>>> features appear as a collection of optional helper 
>>>>>>>>>>>>>>>>>>>> functions,
>>>>>>>>>>>>>>>>>>>> such that
>>>>>>>>>>>>>>>>>>>> drivers are free to make use of the DRM GPUVA 
>>>>>>>>>>>>>>>>>>>> managers basic
>>>>>>>>>>>>>>>>>>>> functionality and opt-in for other features without 
>>>>>>>>>>>>>>>>>>>> setting
>>>>>>>>>>>>>>>>>>>> any feature
>>>>>>>>>>>>>>>>>>>> flags, just by making use of the corresponding 
>>>>>>>>>>>>>>>>>>>> functions.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Big kudos to Boris Brezillon for his help to figure 
>>>>>>>>>>>>>>>>>>>> out
>>>>>>>>>>>>>>>>>>>> locking for drivers
>>>>>>>>>>>>>>>>>>>> updating the GPU VA space within the fence 
>>>>>>>>>>>>>>>>>>>> signalling path.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>>>>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>>>     drivers/gpu/drm/drm_gpuvm.c | 516 ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>>>>>>>>>>     include/drm/drm_gpuvm.h     | 197 ++++++++++++++
>>>>>>>>>>>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for an existing instance of this
>>>>>>>>>>>>>>>>>>>>      * particular combination. If not existent a new instance is created and linked
>>>>>>>>>>>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a given &drm_gpuvm, are also used
>>>>>>>>>>>>>>>>>>>> + * as entry for the &drm_gpuvm's lists of external and evicted objects. Those
>>>>>>>>>>>>>>>>>>>> + * lists are maintained in order to accelerate locking of dma-resv locks and
>>>>>>>>>>>>>>>>>>>> + * validation of evicted objects bound in a &drm_gpuvm. For instance all
>>>>>>>>>>>>>>>>>>>> + * &drm_gem_object's &dma_resv of a given &drm_gpuvm can be locked by calling
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can call drm_gpuvm_validate() in
>>>>>>>>>>>>>>>>>>>> + * order to validate all evicted &drm_gem_objects. It is also possible to lock
>>>>>>>>>>>>>>>>>>>> + * additional &drm_gem_objects by providing the corresponding parameters to
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the &drm_exec loop while making
>>>>>>>>>>>>>>>>>>>> + * use of helper functions such as drm_gpuvm_prepare_range() or
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Every bound &drm_gem_object is treated as external object when its &dma_resv
>>>>>>>>>>>>>>>>>>>> + * structure is different than the &drm_gpuvm's common &dma_resv structure.
>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() for the same &drm_gpuvm and
>>>>>>>>>>>>>>>>>>>>      * &drm_gem_object must be able to observe previous creations and destructions
>>>>>>>>>>>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep instances unique.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of external and evicted objects are
>>>>>>>>>>>>>>>>>>>> + * protected against concurrent insertion / removal and iteration internally.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * However, drivers still need to protect concurrent calls to functions
>>>>>>>>>>>>>>>>>>>> + * iterating those lists, such as drm_gpuvm_validate() and
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such function contains a particular
>>>>>>>>>>>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Functions adding or removing entries from those lists, such as
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() or drm_gpuvm_bo_extobj_add() may be called with external
>>>>>>>>>>>>>>>>>>>> + * locks being held, e.g. in order to avoid the corresponding list to be
>>>>>>>>>>>>>>>>>>>> + * (safely) modified while potentially being iterated by other API functions.
>>>>>>>>>>>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>>>>>>>>>>>      *   }
>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next vm_bo element
>>>>>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're iterating on
>>>>>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used to store already iterated items
>>>>>>>>>>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list iteration. Lockless as in, the
>>>>>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after picking the first element from
>>>>>>>>>>>>>>>>>>>> + * the list, so list insertion deletion can happen concurrently.
>>>>>>>>>>>>>>>>>>> Are the list spinlocks needed for that async state 
>>>>>>>>>>>>>>>>>>> update from
>>>>>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>>>>>>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Otherwise it should be sufficient to protect the 
>>>>>>>>>>>>>>>>>>> lists with the
>>>>>>>>>>>>>>>>>>> gpuvm's resv
>>>>>>>>>>>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> If those spinlocks are still needed in some 
>>>>>>>>>>>>>>>>>>> situations, perhaps
>>>>>>>>>>>>>>>>>>> could we
>>>>>>>>>>>>>>>>>>> have an option to set them to NULL (Like IIRC the 
>>>>>>>>>>>>>>>>>>> maple tree
>>>>>>>>>>>>>>>>>>> allows for)?
>>>>>>>>>>>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>>>>>>>>>>>> holding only the dma-resv lock from the BO this 
>>>>>>>>>>>>>>>>>> function gets
>>>>>>>>>>>>>>>>>> called for. Hence,
>>>>>>>>>>>>>>>>>> the spinlock protects concurrent drm_gpuvm_bo_evict() 
>>>>>>>>>>>>>>>>>> calls with
>>>>>>>>>>>>>>>>>> different BOs.
>>>>>>>>>>>>>>>>> No. Only if you try to add external objects to the 
>>>>>>>>>>>>>>>>> vm's evict list
>>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>>> within the evict code. That's not necessary since you 
>>>>>>>>>>>>>>>>> loop through
>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>> external objects anyway when locking them so an 
>>>>>>>>>>>>>>>>> "evicted" bool in
>>>>>>>>>>>>>>>>> the vm_bo,
>>>>>>>>>>>>>>>>> protected by the bo resv would be sufficient. The 
>>>>>>>>>>>>>>>>> extobj locking
>>>>>>>>>>>>>>>>> loop can
>>>>>>>>>>>>>>>>> then add the bo to the evicted list.
>>>>>>>>>>>>>>>> And validate() can remove it while still holding all 
>>>>>>>>>>>>>>>> dma-resv locks,
>>>>>>>>>>>>>>>> neat!
>>>>>>>>>>>>>>>> However, what if two tasks are trying to lock the VA space
>>>>>>>>>>>>>>>> concurrently? What
>>>>>>>>>>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to zero in
>>>>>>>>>>>>>>>> drm_gpuva_unlink()?
>>>>>>>>>>>>>>>> Are we guaranteed that at this point of time the 
>>>>>>>>>>>>>>>> drm_gpuvm_bo is not
>>>>>>>>>>>>>>>> on the
>>>>>>>>>>>>>>>> evicted list? Because otherwise we would call 
>>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>>>>>> might drop the last reference to the drm_gem_object and 
>>>>>>>>>>>>>>>> hence we'd
>>>>>>>>>>>>>>>> potentially
>>>>>>>>>>>>>>>> free the dma-resv lock while holding it, at least if 
>>>>>>>>>>>>>>>> it's an external
>>>>>>>>>>>>>>>> object.
>>>>>>>>>>>>>>> Easiest way in this scheme is to think of the lists as 
>>>>>>>>>>>>>>> being protected
>>>>>>>>>>>>>>> by the vm's resv lock. That means anybody calling 
>>>>>>>>>>>>>>> unlink() must also
>>>>>>>>>>>>>>> hold the vm's resv lock. (Which is OK from an UAF point 
>>>>>>>>>>>>>>> of view, but
>>>>>>>>>>>>>>> perhaps not from a locking inversion POW from an async 
>>>>>>>>>>>>>>> list update).
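>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Purely illustrative, assuming the vm's resv ends up being what
>>>>>>>>>>>>>>> protects the lists:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 	dma_resv_lock(gpuvm->resv, NULL);
>>>>>>>>>>>>>>> 	drm_gpuva_unlink(va);
>>>>>>>>>>>>>>> 	dma_resv_unlock(gpuvm->resv);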
>>>>>>>>>>>>>> This would mean that on unlink() we'd need to hold the 
>>>>>>>>>>>>>> VM's resv lock and the
>>>>>>>>>>>>>> corresponding GEM's resv lock (in case they're not the 
>>>>>>>>>>>>>> same anyways) because the
>>>>>>>>>>>>>> VM's resv lock would protect the external / evicted 
>>>>>>>>>>>>>> object lists and the GEM
>>>>>>>>>>>>>> objects resv lock protects the GEM's list of 
>>>>>>>>>>>>>> drm_gpuvm_bos and the
>>>>>>>>>>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For extobjs an outer lock would be enough in case of 
>>>>>>>>>>>>>>>>>> Xe, but I
>>>>>>>>>>>>>>>>>> really would not
>>>>>>>>>>>>>>>>>> like to add even more complexity just to get the 
>>>>>>>>>>>>>>>>>> spinlock out of
>>>>>>>>>>>>>>>>>> the way in case
>>>>>>>>>>>>>>>>>> the driver already has an outer lock protecting this 
>>>>>>>>>>>>>>>>>> path.
>>>>>>>>>>>>>>>>> I must disagree here. These spinlocks and atomic 
>>>>>>>>>>>>>>>>> operations are
>>>>>>>>>>>>>>>>> pretty
>>>>>>>>>>>>>>>>> costly and as discussed earlier this type of locking 
>>>>>>>>>>>>>>>>> was the reason
>>>>>>>>>>>>>>>>> (at
>>>>>>>>>>>>>>>>> least according to the commit message) that made 
>>>>>>>>>>>>>>>>> Christian drop the
>>>>>>>>>>>>>>>>> XArray
>>>>>>>>>>>>>>>>> use in drm_exec for the same set of objects: "The 
>>>>>>>>>>>>>>>>> locking overhead
>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>> unecessary and measurable". IMHO the spinlock is the 
>>>>>>>>>>>>>>>>> added
>>>>>>>>>>>>>>>>> complexity and a
>>>>>>>>>>>>>>>>> single wide lock following the drm locking guidelines 
>>>>>>>>>>>>>>>>> set out by
>>>>>>>>>>>>>>>>> Daniel and
>>>>>>>>>>>>>>>>> David should really be the default choice with an 
>>>>>>>>>>>>>>>>> opt-in for a
>>>>>>>>>>>>>>>>> spinlock if
>>>>>>>>>>>>>>>>> needed for async and pushing out to a wq is not an 
>>>>>>>>>>>>>>>>> option.
>>>>>>>>>>>>>>>> For the external object list an outer lock would work 
>>>>>>>>>>>>>>>> as long as it's
>>>>>>>>>>>>>>>> not the
>>>>>>>>>>>>>>>> dma-resv lock of the corresponding GEM object, since 
>>>>>>>>>>>>>>>> here we actually
>>>>>>>>>>>>>>>> need to
>>>>>>>>>>>>>>>> remove the list entry from the external object list on
>>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>>>>>>>>>>> It's just a bit weird design wise that drivers would 
>>>>>>>>>>>>>>>> need to take
>>>>>>>>>>>>>>>> this outer
>>>>>>>>>>>>>>>> lock on:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>>>>>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also 
>>>>>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>>>>>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>>>>>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Given that it seems reasonable to do all the required 
>>>>>>>>>>>>>>>> locking
>>>>>>>>>>>>>>>> internally.
>>>>>>>>>>>>>>>  From a design POW, there has been a clear direction in 
>>>>>>>>>>>>>>> XE to make
>>>>>>>>>>>>>>> things similar to mmap() / munmap(), so this outer lock, 
>>>>>>>>>>>>>>> which in Xe is
>>>>>>>>>>>>>>> an rwsem, is used in a similar way as the mmap_lock. 
>>>>>>>>>>>>>>> It's protecting
>>>>>>>>>>>>>>> the page-table structures and vma rb tree, the userptr 
>>>>>>>>>>>>>>> structures and
>>>>>>>>>>>>>>> the extobj list. Basically it's taken early in the exec 
>>>>>>>>>>>>>>> IOCTL, the
>>>>>>>>>>>>>>> VM_BIND ioctl, the compute rebind worker and the 
>>>>>>>>>>>>>>> pagefault handler, so
>>>>>>>>>>>>>>> all of the above are just asserting that it is taken in 
>>>>>>>>>>>>>>> the correct
>>>>>>>>>>>>>>> mode.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> But strictly with this scheme one could also use the 
>>>>>>>>>>>>>>> vm's dma_resv for
>>>>>>>>>>>>>>> the extobj list since with drm_exec, it's locked before 
>>>>>>>>>>>>>>> traversing the
>>>>>>>>>>>>>>> list.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The whole point of this scheme is to rely on locks that 
>>>>>>>>>>>>>>> you already are
>>>>>>>>>>>>>>> supposed to be holding for various reasons and is simple 
>>>>>>>>>>>>>>> to comprehend.
>>>>>>>>>>>>>> I don't agree that we're supposed to hold the VM's resv 
>>>>>>>>>>>>>> lock anyways for
>>>>>>>>>>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), 
>>>>>>>>>>>>>> but I'm fine using it
>>>>>>>>>>>>>> for that purpose nevertheless.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In order to at least place lockdep checks, the driver 
>>>>>>>>>>>>>>>> would need to
>>>>>>>>>>>>>>>> supply the
>>>>>>>>>>>>>>>> corresponding lock's lockdep_map, because the GPUVM 
>>>>>>>>>>>>>>>> otherwise doesn't
>>>>>>>>>>>>>>>> know about
>>>>>>>>>>>>>>>> the lock.
>>>>>>>>>>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>>>>>>>>>> I'd really like to avoid that, especially now that 
>>>>>>>>>>>>>> everything got simpler. We
>>>>>>>>>>>>>> should define the actual locks to take instead.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Out of curiosity, what is the overhead of a spin_lock() 
>>>>>>>>>>>>>>>> that doesn't
>>>>>>>>>>>>>>>> need to
>>>>>>>>>>>>>>>> spin?
>>>>>>>>>>>>>>> I guess it's hard to tell exactly, but it is much lower 
>>>>>>>>>>>>>>> on modern x86
>>>>>>>>>>>>>>> than what it used to be. Not sure about ARM, which is 
>>>>>>>>>>>>>>> the other
>>>>>>>>>>>>>>> architecture important to us. I figure if there is 
>>>>>>>>>>>>>>> little cache-line
>>>>>>>>>>>>>>> bouncing the main overhead comes from the implied barriers.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> A pretty simple way that would not add much code would be
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> static void gpuvm_cond_spin_lock(const struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>                                  spinlock_t *lock)
>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>>>>>>>>>>          spin_lock(lock);
>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> For such drivers, that would require anybody calling 
>>>>>>>>>>>>>>>>>>> unlink to
>>>>>>>>>>>>>>>>>>> hold the vm's
>>>>>>>>>>>>>>>>>>> resv, though.
>>>>>>>>>>>>>>>>>> In V4 I want to go back to having a dedicated lock 
>>>>>>>>>>>>>>>>>> for the GEMs
>>>>>>>>>>>>>>>>>> gpuva list (or
>>>>>>>>>>>>>>>>>> VM_BO list to be more precise). We can't just use the 
>>>>>>>>>>>>>>>>>> dma-resv
>>>>>>>>>>>>>>>>>> lock for that
>>>>>>>>>>>>>>>>>> with VM_BO abstractions, because on destruction of a 
>>>>>>>>>>>>>>>>>> VM_BO we
>>>>>>>>>>>>>>>>>> otherwise wouldn't
>>>>>>>>>>>>>>>>>> be allowed to already hold the dma-resv lock. That's 
>>>>>>>>>>>>>>>>>> the fix I
>>>>>>>>>>>>>>>>>> was referring to
>>>>>>>>>>>>>>>>>> earlier.
>>>>>>>>>>>>>>>>> Yeah, I can see the need for a dedicated lock for the 
>>>>>>>>>>>>>>>>> GEM's gpuva
>>>>>>>>>>>>>>>>> list, but
>>>>>>>>>>>>>>>>> holding the vm's dma-resv lock across the unlink 
>>>>>>>>>>>>>>>>> shouldn't be a
>>>>>>>>>>>>>>>>> problem. We
>>>>>>>>>>>>>>>>> may free the object and a pointer to the vm's resv 
>>>>>>>>>>>>>>>>> during unlink
>>>>>>>>>>>>>>>>> but we
>>>>>>>>>>>>>>>>> don't free the vm's resv.  It'd be a matter of 
>>>>>>>>>>>>>>>>> ensuring that any
>>>>>>>>>>>>>>>>> calls to
>>>>>>>>>>>>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>>>>>>>>>>>> Drivers calling unlink() from the fence signaling path 
>>>>>>>>>>>>>>>> can't use the
>>>>>>>>>>>>>>>> VM's
>>>>>>>>>>>>>>>> dma-resv lock.
>>>>>>>>>>>>>>> Yes, that made me a bit curious because in the current 
>>>>>>>>>>>>>>> version the code
>>>>>>>>>>>>>>> required the object's dma_resv for unlink() which can't 
>>>>>>>>>>>>>>> be grabbed
>>>>>>>>>>>>>>> either from the fence signaling path. So are there any 
>>>>>>>>>>>>>>> drivers actually
>>>>>>>>>>>>>>> wanting to do that? If so, they will either need to 
>>>>>>>>>>>>>>> resort to the
>>>>>>>>>>>>>>> current spinlock solution or they will need to call 
>>>>>>>>>>>>>>> unlink from a
>>>>>>>>>>>>>>> workqueue item.
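
For drivers that do end up on the fence signalling path, the workqueue
variant mentioned above could look roughly like the following hedged sketch;
struct my_unlink_work, my_unlink_worker() and my_wq are made-up driver names
and error handling is elided.

#include <linux/slab.h>
#include <linux/workqueue.h>

struct my_unlink_work {
        struct work_struct work;
        struct drm_gpuva *va;
};

static void my_unlink_worker(struct work_struct *work)
{
        struct my_unlink_work *uw = container_of(work, typeof(*uw), work);

        /* Process context: take whatever lock protects the GEM's gpuva
         * list (dma-resv or a driver lock) around the unlink; elided in
         * this sketch. */
        drm_gpuva_unlink(uw->va);
        kfree(uw);
}

/* Called from the fence signalling critical path; ideally the work item
 * would be pre-allocated rather than allocated with GFP_ATOMIC here. */
static void my_defer_unlink(struct workqueue_struct *my_wq,
                            struct drm_gpuva *va)
{
        struct my_unlink_work *uw = kzalloc(sizeof(*uw), GFP_ATOMIC);

        if (!uw)
                return;

        uw->va = va;
        INIT_WORK(&uw->work, my_unlink_worker);
        queue_work(my_wq, &uw->work);
}
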
>>>>>>>>>>>>>> As Boris already mentioned we have the dma-resv lock by 
>>>>>>>>>>>>>> default or a driver
>>>>>>>>>>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid of 
>>>>>>>>>>>>>> the latter.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Also, what if the object is an external object? We 
>>>>>>>>>>>>>>>> can't use the VM's
>>>>>>>>>>>>>>>> dma-resv
>>>>>>>>>>>>>>>> lock here.
>>>>>>>>>>>>>>> Why? Typically (sync) unlink is only ever called from an 
>>>>>>>>>>>>>>> unbind-like
>>>>>>>>>>>>>>> operation where it should be trivial to grab the vm's 
>>>>>>>>>>>>>>> resv. Or, for
>>>>>>>>>>>>>>> that matter any outer lock protecting the extobj list. 
>>>>>>>>>>>>>>> The rule would be that
>>>>>>>>>>>>>>> drm_gpuvm_bo::entry::extobj and
>>>>>>>>>>>>>>> drm_gpuvm_bo::entry::evict are
>>>>>>>>>>>>>>> protected by either the vm's dma_resv (or possibly an
>>>>>>>>>>>>>>> outer lock in
>>>>>>>>>>>>>>> the case of the extobj list).
>>>>>>>>>>>>>> An outer lock wouldn't have worked for updates in the
>>>>>>>>>>>>>> async path, but that
>>>>>>>>>>>>>> shouldn't be relevant anymore. We could use the VM's resv
>>>>>>>>>>>>>> for that.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>   And we can't have the GEM obj's dma-resv lock held 
>>>>>>>>>>>>>>>> when calling
>>>>>>>>>>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), 
>>>>>>>>>>>>>>>> which if the
>>>>>>>>>>>>>>>> refcount drops
>>>>>>>>>>>>>>>> to zero calls drm_gpuvm_bo_destroy() and 
>>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy() might
>>>>>>>>>>>>>>>> drop the
>>>>>>>>>>>>>>>> last reference of the GEM object.
>>>>>>>>>>>>>>> Yes, but this is a different problem as to what exactly 
>>>>>>>>>>>>>>> protects
>>>>>>>>>>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an 
>>>>>>>>>>>>>>> internal per bo list
>>>>>>>>>>>>>>> lock, or if we want to keep the bo's dma_resv we need to 
>>>>>>>>>>>>>>> ensure that
>>>>>>>>>>>>>>> the caller of dma_resv_unlock(obj->resv) actually 
>>>>>>>>>>>>>>> refcounts its obj
>>>>>>>>>>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's 
>>>>>>>>>>>>>>> refcount (I know
>>>>>>>>>>>>>>> Boris didn't like that, but requiring an explicit 
>>>>>>>>>>>>>>> refcount for a
>>>>>>>>>>>>>>> pointer you dereference unless you're under a lock that 
>>>>>>>>>>>>>>> ensures keeping
>>>>>>>>>>>>>>> the object alive is pretty much required?) But anyway 
>>>>>>>>>>>>>>> for the
>>>>>>>>>>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or 
>>>>>>>>>>>>>>> internal spinlock)
>>>>>>>>>>>>>>> I don't have a strong preference.
>>>>>>>>>>>>>> We can keep the GEM object's dma-resv lock, however as
>>>>>>>>>>>>>> mentioned above
>>>>>>>>>>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then require
>>>>>>>>>>>>>> both the VM's resv lock
>>>>>>>>>>>>>> and the GEM's resv lock in case they differ.
>>>>>>>>>>>>>>
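A hedged sketch of what holding both reservation locks around the unlink
could look like, reusing drm_exec for the ww-mutex ordering. The function
name is illustrative; gpuvm->d_obj and va->gem.obj are the fields used
elsewhere in this patch.

static int example_unlink_locked(struct drm_gpuvm *gpuvm,
                                 struct drm_gpuva *va)
{
        struct drm_exec exec;
        int ret = 0;

        drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT |
                             DRM_EXEC_IGNORE_DUPLICATES);
        drm_exec_until_all_locked(&exec) {
                /* The VM's resv, via the VM's dummy GEM object ... */
                ret = drm_exec_prepare_obj(&exec, &gpuvm->d_obj, 1);
                drm_exec_retry_on_contention(&exec);
                if (ret)
                        goto out;

                /* ... plus the mapping's backing GEM object. */
                ret = drm_exec_prepare_obj(&exec, va->gem.obj, 1);
                drm_exec_retry_on_contention(&exec);
                if (ret)
                        goto out;
        }

        drm_gpuva_unlink(va);
out:
        drm_exec_fini(&exec);
        return ret;
}
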
>>>>>>>>>>>>>>>>   All those problems go away with a dedicated
>>>>>>>>>>>>>>>> GEM gpuva list lock.
>>>>>>>>>>>>>>> I don't think these are real problems.
>>>>>>>>>>>>>>> With the exception of the eviction list "trick" where we 
>>>>>>>>>>>>>>> currently have
>>>>>>>>>>>>>>> slightly different approach to collect external bos 
>>>>>>>>>>>>>>> needing rebinding,
>>>>>>>>>>>>>>> we have this working fine.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> TBH I think pretty much the only situation where the 
>>>>>>>>>>>>>>> spinlock is needed
>>>>>>>>>>>>>>> is for async updates of these lists, unless a wq item 
>>>>>>>>>>>>>>> can be used for
>>>>>>>>>>>>>>> that, but it doesn't really seem like the current code 
>>>>>>>>>>>>>>> allows for such
>>>>>>>>>>>>>>> updates anyway? It complicates the code a lot, adds 
>>>>>>>>>>>>>>> overhead and also
>>>>>>>>>>>>>>> adds the requirement for refcounting during list traversal.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It seems that with that also the refcount could be 
>>>>>>>>>>>>>>>>>>> made non-
>>>>>>>>>>>>>>>>>>> atomic.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> All in the spirit of the drm locking guidelines "use 
>>>>>>>>>>>>>>>>>>> big locks
>>>>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>>>>> possible".
>>>>>>>>>>>>>>>>>>> Lower level locks only when necessary for 
>>>>>>>>>>>>>>>>>>> performance or
>>>>>>>>>>>>>>>>>>> locking inversion?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Elements popped from the original list are kept 
>>>>>>>>>>>>>>>>>>>> in a
>>>>>>>>>>>>>>>>>>>> local list, so removal
>>>>>>>>>>>>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>>>>>>>>>>>>> iterating the list.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, 
>>>>>>>>>>>>>>>>>>>> __list_name,
>>>>>>>>>>>>>>>>>>>> __local_list, __prev_vm_bo)     \
>>>>>>>>>>>>>>>>>>>> +       ({
>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>> +               struct drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>> *__vm_bo;                                           \
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);
>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                                \
>>>>>>>>>>>>>>>>>>>> +               while (!list_empty(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>> __list_name.list)) {                     \
>>>>>>>>>>>>>>>>>>>> +                       __vm_bo =
>>>>>>>>>>>>>>>>>>>> list_first_entry(&(__gpuvm)->__list_name.list,        \ 
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> + struct
>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo,                 \
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> list.entry.__list_name);             \
>>>>>>>>>>>>>>>>>>>> +                       if
>>>>>>>>>>>>>>>>>>>> (drm_gpuvm_bo_get_unless_zero(__vm_bo))
>>>>>>>>>>>>>>>>>>>> {                    \
>>>>>>>>>>>>>>>>>>>> +                               list_move_tail(&(__vm_bo)- 
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name,      \
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> __local_list);                           \
>>>>>>>>>>>>>>>>>>>> +                               break;
>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>> +                       } else
>>>>>>>>>>>>>>>>>>>> {                                                        \ 
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> +                               list_del_init(&(__vm_bo)- 
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name);      \
>>>>>>>>>>>>>>>>>>>> +                               __vm_bo =
>>>>>>>>>>>>>>>>>>>> NULL;                                         \
>>>>>>>>>>>>>>>>>>>> +                       }
>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                              \
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>> +               __vm_bo;
>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>> +       })
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list 
>>>>>>>>>>>>>>>>>>>> iterator
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list 
>>>>>>>>>>>>>>>>>>>> iteration.
>>>>>>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after 
>>>>>>>>>>>>>>>>>>>> picking the
>>>>>>>>>>>>>>>>>>>> first element from the
>>>>>>>>>>>>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Typical use:
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + *     ret = 0;
>>>>>>>>>>>>>>>>>>>> + *     for_each_vm_bo_in_list(gpuvm, <list_name>,
>>>>>>>>>>>>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>>>>>>>>>>>>> + *             ret = do_something_with_vm_bo(..., 
>>>>>>>>>>>>>>>>>>>> vm_bo);
>>>>>>>>>>>>>>>>>>>> + *             if (ret)
>>>>>>>>>>>>>>>>>>>> + *                     break;
>>>>>>>>>>>>>>>>>>>> + *     }
>>>>>>>>>>>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>>>> + *     restore_vm_bo_list(gpuvm, 
>>>>>>>>>>>>>>>>>>>> <list_name>,
>>>>>>>>>>>>>>>>>>>> &my_local_list);
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Only used for internal list iterations, not 
>>>>>>>>>>>>>>>>>>>> meant to be
>>>>>>>>>>>>>>>>>>>> exposed to the outside
>>>>>>>>>>>>>>>>>>>> + * world.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name,
>>>>>>>>>>>>>>>>>>>> __local_list, __vm_bo)    \
>>>>>>>>>>>>>>>>>>>> +       for (__vm_bo = 
>>>>>>>>>>>>>>>>>>>> get_next_vm_bo_from_list(__gpuvm,
>>>>>>>>>>>>>>>>>>>> __list_name,           \
>>>>>>>>>>>>>>>>>>>> +                                               __local_list, 
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> NULL);            \
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> __vm_bo;
>>>>>>>>>>>>>>>>>>>>        \
>>>>>>>>>>>>>>>>>>>> +            __vm_bo = 
>>>>>>>>>>>>>>>>>>>> get_next_vm_bo_from_list(__gpuvm,
>>>>>>>>>>>>>>>>>>>> __list_name,           \
>>>>>>>>>>>>>>>>>>>> +                                               __local_list, 
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> __vm_bo))         \
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements back 
>>>>>>>>>>>>>>>>>>>> to their
>>>>>>>>>>>>>>>>>>>> original list
>>>>>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're 
>>>>>>>>>>>>>>>>>>>> iterating on
>>>>>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list used 
>>>>>>>>>>>>>>>>>>>> to store
>>>>>>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * When we're done iterating a vm_bo list, we 
>>>>>>>>>>>>>>>>>>>> should call
>>>>>>>>>>>>>>>>>>>> restore_vm_bo_list()
>>>>>>>>>>>>>>>>>>>> + * to restore the original state and let new 
>>>>>>>>>>>>>>>>>>>> iterations take
>>>>>>>>>>>>>>>>>>>> place.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name,
>>>>>>>>>>>>>>>>>>>> __local_list)                         \
>>>>>>>>>>>>>>>>>>>> +       do
>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>                  \
>>>>>>>>>>>>>>>>>>>> +               /* Merge back the two lists, moving 
>>>>>>>>>>>>>>>>>>>> local
>>>>>>>>>>>>>>>>>>>> list elements to the          \
>>>>>>>>>>>>>>>>>>>> +                * head to preserve previous 
>>>>>>>>>>>>>>>>>>>> ordering, in
>>>>>>>>>>>>>>>>>>>> case it matters.              \
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> */
>>>>>>>>>>>>>>>>>>>>            \
>>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                                \
>>>>>>>>>>>>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>> __list_name.list);                \
>>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                              \
>>>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into 
>>>>>>>>>>>>>>>>>>>> the given
>>>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Inserts the given @__vm_bo into the list 
>>>>>>>>>>>>>>>>>>>> specified by
>>>>>>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo,
>>>>>>>>>>>>>>>>>>>> __list_name)                            \
>>>>>>>>>>>>>>>>>>>> +       do
>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>          \
>>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                    \
>>>>>>>>>>>>>>>>>>>> +               if (list_empty(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name))             \
>>>>>>>>>>>>>>>>>>>> +                       list_add_tail(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name,       \
>>>>>>>>>>>>>>>>>>>> + &(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>> __list_name.list);        \
>>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                  \
>>>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from 
>>>>>>>>>>>>>>>>>>>> the given
>>>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to remove from
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Removes the given @__vm_bo from the list 
>>>>>>>>>>>>>>>>>>>> specified by
>>>>>>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo,
>>>>>>>>>>>>>>>>>>>> __list_name)                            \
>>>>>>>>>>>>>>>>>>>> +       do
>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>          \
>>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                    \
>>>>>>>>>>>>>>>>>>>> +               if (!list_empty(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name))            \
>>>>>>>>>>>>>>>>>>>> +                       list_del_init(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name);      \
>>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                  \
>>>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo 
>>>>>>>>>>>>>>>>>>>> *vm_bo);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>     #define 
>>>>>>>>>>>>>>>>>>>> to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>>>>>>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>>>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>>>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct 
>>>>>>>>>>>>>>>>>>>> drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>> struct drm_device *drm,
>>>>>>>>>>>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>          drm_gpuva_check_overflow(start_offset, 
>>>>>>>>>>>>>>>>>>>> range);
>>>>>>>>>>>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>>>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct 
>>>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>>>> *gpuvm)
>>>>>>>>>>>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root),
>>>>>>>>>>>>>>>>>>>>               "GPUVA tree is not empty, potentially 
>>>>>>>>>>>>>>>>>>>> leaking
>>>>>>>>>>>>>>>>>>>> memory.\n");
>>>>>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), 
>>>>>>>>>>>>>>>>>>>> "Extobj list
>>>>>>>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), 
>>>>>>>>>>>>>>>>>>>> "Evict list
>>>>>>>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all 
>>>>>>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all 
>>>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Using this function directly, it is the drivers
>>>>>>>>>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Note: This function is safe against concurrent 
>>>>>>>>>>>>>>>>>>>> insertion
>>>>>>>>>>>>>>>>>>>> and removal of
>>>>>>>>>>>>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>>>>>>>>>>>>> concurrent usage itself.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Drivers need to make sure to protect this case 
>>>>>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>> either an outer VM lock
>>>>>>>>>>>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before 
>>>>>>>>>>>>>>>>>>>> this function
>>>>>>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that the 
>>>>>>>>>>>>>>>>>>>> GPUVM's
>>>>>>>>>>>>>>>>>>>> dma-resv lock ensures
>>>>>>>>>>>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, 
>>>>>>>>>>>>>>>>>>>> &extobjs,
>>>>>>>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, 
>>>>>>>>>>>>>>>>>>>> vm_bo->obj,
>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the 
>>>>>>>>>>>>>>>>>>>> loop. */
>>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs 
>>>>>>>>>>>>>>>>>>>> mapped within
>>>>>>>>>>>>>>>>>>>> a given range
>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all 
>>>>>>>>>>>>>>>>>>>> &drm_gem_objects
>>>>>>>>>>>>>>>>>>>> mapped between @addr
>>>>>>>>>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, 
>>>>>>>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>>>>>>>> drm_exec *exec,
>>>>>>>>>>>>>>>>>>>> +                       u64 addr, u64 range, 
>>>>>>>>>>>>>>>>>>>> unsigned int
>>>>>>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>>>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, 
>>>>>>>>>>>>>>>>>>>> addr, end) {
>>>>>>>>>>>>>>>>>>>> +               struct drm_gem_object *obj = 
>>>>>>>>>>>>>>>>>>>> va->gem.obj;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>> +                       return ret;
>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Additionally, when calling this function with struct
>>>>>>>>>>>>>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>>>>>>>>>>>>>> + * being set, the driver receives the given @fn 
>>>>>>>>>>>>>>>>>>>> callback to
>>>>>>>>>>>>>>>>>>>> lock additional
>>>>>>>>>>>>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>>>> instance.
>>>>>>>>>>>>>>>>>>>> Typically, drivers
>>>>>>>>>>>>>>>>>>>> + * would call drm_exec_prepare_obj() from within this
>>>>>>>>>>>>>>>>>>>> callback.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>> +                   unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       flags = (interruptible ? 
>>>>>>>>>>>>>>>>>>>> DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>>>>>>>>>>>>>> 0) |
>>>>>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, 
>>>>>>>>>>>>>>>>>>>> exec,
>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +               ret = 
>>>>>>>>>>>>>>>>>>>> drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>>>>>>>>>>>> +                       ret = 
>>>>>>>>>>>>>>>>>>>> vm_exec->extra.fn(vm_exec,
>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>> +                       drm_exec_retry_on_contention(exec); 
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> +                       if (ret)
>>>>>>>>>>>>>>>>>>>> +                               goto err;
>>>>>>>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +static int
>>>>>>>>>>>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, 
>>>>>>>>>>>>>>>>>>>> unsigned int
>>>>>>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       return 
>>>>>>>>>>>>>>>>>>>> drm_exec_prepare_array(&vm_exec->exec, args-
>>>>>>>>>>>>>>>>>>>>> objs,
>>>>>>>>>>>>>>>>>>>> + args->num_objs,
>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv 
>>>>>>>>>>>>>>>>>>>> of all
>>>>>>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>>>>>>>>>>>> + * @num_objs: the number of additional 
>>>>>>>>>>>>>>>>>>>> &drm_gem_objects to
>>>>>>>>>>>>>>>>>>>> lock
>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>>>> given &drm_gpuvm
>>>>>>>>>>>>>>>>>>>> + * contains mappings of, plus the ones given 
>>>>>>>>>>>>>>>>>>>> through @objs.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>>>> + struct drm_gem_object **objs,
>>>>>>>>>>>>>>>>>>>> + unsigned int num_objs,
>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>> + bool interruptible)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>>>>>> +       } args;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>>>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>>>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, 
>>>>>>>>>>>>>>>>>>>> num_fences,
>>>>>>>>>>>>>>>>>>>> interruptible);
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs 
>>>>>>>>>>>>>>>>>>>> mapped
>>>>>>>>>>>>>>>>>>>> within a given range
>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>>>>> &drm_gem_objects
>>>>>>>>>>>>>>>>>>>> mapped between @addr and
>>>>>>>>>>>>>>>>>>>> + * @addr + @range.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>>>> + u64 addr, u64 range,
>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>> + bool interruptible)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       flags = (interruptible ? 
>>>>>>>>>>>>>>>>>>>> DRM_EXEC_INTERRUPTIBLE_WAIT :
>>>>>>>>>>>>>>>>>>>> 0) |
>>>>>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>>>>>> +               ret = 
>>>>>>>>>>>>>>>>>>>> drm_gpuvm_prepare_range(gpuvm, exec,
>>>>>>>>>>>>>>>>>>>> addr, range,
>>>>>>>>>>>>>>>>>>>> + num_fences);
>>>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked 
>>>>>>>>>>>>>>>>>>>> as evicted
>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback 
>>>>>>>>>>>>>>>>>>>> for all
>>>>>>>>>>>>>>>>>>>> evicted buffer
>>>>>>>>>>>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>>>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, 
>>>>>>>>>>>>>>>>>>>> &evict, vm_bo) {
>>>>>>>>>>>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv); 
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the 
>>>>>>>>>>>>>>>>>>>> loop. */
>>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to private 
>>>>>>>>>>>>>>>>>>>> and all
>>>>>>>>>>>>>>>>>>>> extobj
>>>>>>>>>>>>>>>>>>>> + * dma-resv
>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>> + struct dma_fence *fence,
>>>>>>>>>>>>>>>>>>>> + enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>>>>>>> + enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>>>>>>>>>>>> +       unsigned long index;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, 
>>>>>>>>>>>>>>>>>>>> index, obj) {
>>>>>>>>>>>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>>>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>>>>>>>>>>>> + drm_gpuvm_is_extobj(gpuvm,
>>>>>>>>>>>>>>>>>>>> obj) ?
>>>>>>>>>>>>>>>>>>>> + extobj_usage :
>>>>>>>>>>>>>>>>>>>> private_usage);
>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new 
>>>>>>>>>>>>>>>>>>>> instance of struct
>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>>>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct 
>>>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>>>>>>>>>>>          return vm_bo;
>>>>>>>>>>>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct 
>>>>>>>>>>>>>>>>>>>> kref *kref)
>>>>>>>>>>>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>>>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct 
>>>>>>>>>>>>>>>>>>>> kref *kref)
>>>>>>>>>>>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the 
>>>>>>>>>>>>>>>>>>>> reference of
>>>>>>>>>>>>>>>>>>>>      *
>>>>>>>>>>>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * If the reference count drops to zero, the 
>>>>>>>>>>>>>>>>>>>> &gpuvm_bo is
>>>>>>>>>>>>>>>>>>>> destroyed, which
>>>>>>>>>>>>>>>>>>>> + * includes removing it from the GEM's gpuva list. 
>>>>>>>>>>>>>>>>>>>> Hence, if
>>>>>>>>>>>>>>>>>>>> a call to this
>>>>>>>>>>>>>>>>>>>> + * function can potentially let the reference 
>>>>>>>>>>>>>>>>>>>> count drop to zero,
>>>>>>>>>>>>>>>>>>>> the caller must
>>>>>>>>>>>>>>>>>>>> + * hold the dma-resv or driver specific GEM gpuva 
>>>>>>>>>>>>>>>>>>>> lock.
>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>     void
>>>>>>>>>>>>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct 
>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo 
>>>>>>>>>>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>>>>> __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>                      struct drm_gem_object *obj)
>>>>>>>>>>>>>>>>>>>> @@ -938,6 +1412,48 @@ 
>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the 
>>>>>>>>>>>>>>>>>>>> &drm_gpuvm_bo to its
>>>>>>>>>>>>>>>>>>>> &drm_gpuvm's
>>>>>>>>>>>>>>>>>>>> + * extobj list
>>>>>>>>>>>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its 
>>>>>>>>>>>>>>>>>>>> &drm_gpuvm's
>>>>>>>>>>>>>>>>>>>> extobj list.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's 
>>>>>>>>>>>>>>>>>>>> extobj list if it is
>>>>>>>>>>>>>>>>>>>> not on the list
>>>>>>>>>>>>>>>>>>>> + * already and if the corresponding 
>>>>>>>>>>>>>>>>>>>> &drm_gem_object actually is an
>>>>>>>>>>>>>>>>>>>> external object.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a 
>>>>>>>>>>>>>>>>>>>> &drm_gem_object to
>>>>>>>>>>>>>>>>>>>> / from a
>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>>>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from all
>>>>>>>>>>>>>>>>>>>> &drm_gpuvms evicted
>>>>>>>>>>>>>>>>>>>> + * list containing a mapping of this &drm_gem_object.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, 
>>>>>>>>>>>>>>>>>>>> bool evict)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>>>>>>>>>>>> +               if (evict)
>>>>>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, 
>>>>>>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>>>>>>> +               else
>>>>>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, 
>>>>>>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>>>>>>>>>>>>> +
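
A hedged sketch of how a TTM-based driver might call drm_gpuvm_bo_evict()
from its &ttm_device_funcs.move callback, as the surrounding discussion
assumes. The memcpy move is a placeholder for the driver's real move logic,
and passing the callback's evict flag straight through is a simplification.

#include <drm/ttm/ttm_bo.h>

static int example_move(struct ttm_buffer_object *bo, bool evict,
                        struct ttm_operation_ctx *ctx,
                        struct ttm_resource *new_mem,
                        struct ttm_place *hop)
{
        struct drm_gem_object *obj = &bo->base;
        int ret;

        /* The BO's dma-resv is held here, as the gpuva list requires. */
        ret = ttm_bo_move_memcpy(bo, ctx, new_mem);
        if (ret)
                return ret;

        /* Track the eviction state in every VM mapping this object. */
        drm_gpuvm_bo_evict(obj, evict);

        return 0;
}
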
>>>>>>>>>>>>>>>>>>>>     static int
>>>>>>>>>>>>>>>>>>>> __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>>>>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>>>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>>>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>>>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>>>>>>>>>>>           * space
>>>>>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>> +                * @list: &list_head storing 
>>>>>>>>>>>>>>>>>>>> &drm_gpuvm_bos
>>>>>>>>>>>>>>>>>>>> serving as
>>>>>>>>>>>>>>>>>>>> +                * external object
>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the 
>>>>>>>>>>>>>>>>>>>> extobj list
>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>>>>>> +       } extobj;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>> +        * @evict: structure holding the evict list 
>>>>>>>>>>>>>>>>>>>> and evict
>>>>>>>>>>>>>>>>>>>> list lock
>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>> +                * @list: &list_head storing 
>>>>>>>>>>>>>>>>>>>> &drm_gpuvm_bos
>>>>>>>>>>>>>>>>>>>> currently being
>>>>>>>>>>>>>>>>>>>> +                * evicted
>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the 
>>>>>>>>>>>>>>>>>>>> evict list
>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>>>>>> +       } evict;
>>>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, 
>>>>>>>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>>>>>>>> drm_device *drm,
>>>>>>>>>>>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct 
>>>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>>>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the 
>>>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>>>> &drm_gem_object is an
>>>>>>>>>>>>>>>>>>>> + * external object
>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Returns: true if the &drm_gem_object &dma_resv 
>>>>>>>>>>>>>>>>>>>> differs
>>>>>>>>>>>>>>>>>>>> from the
>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct 
>>>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>>>>> + struct drm_gem_object
>>>>>>>>>>>>>>>>>>>> *obj)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>>>>>>>>>>>     __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct 
>>>>>>>>>>>>>>>>>>>> drm_gpuva *va)
>>>>>>>>>>>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, 
>>>>>>>>>>>>>>>>>>>> next__, gpuvm__)
>>>>>>>>>>>>>>>>>>>> \
>>>>>>>>>>>>>>>>>>>>          list_for_each_entry_safe(va__, next__, 
>>>>>>>>>>>>>>>>>>>> &(gpuvm__)-
>>>>>>>>>>>>>>>>>>>>> rb.list, rb.entry)
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of
>>>>>>>>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>>>>>>>>>>>>> &drm_exec should be.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Optionally, @extra can be set in order to lock 
>>>>>>>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>>>>>>>> &drm_gem_objects.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA 
>>>>>>>>>>>>>>>>>>>> reservations
>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>> +        * @extra: Callback and corresponding 
>>>>>>>>>>>>>>>>>>>> private data
>>>>>>>>>>>>>>>>>>>> for the driver to
>>>>>>>>>>>>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>>>>>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>> +                * @priv: driver private data for 
>>>>>>>>>>>>>>>>>>>> the @fn
>>>>>>>>>>>>>>>>>>>> callback
>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>> +               void *priv;
>>>>>>>>>>>>>>>>>>>> +       } extra;
>>>>>>>>>>>>>>>>>>>> +};
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs 
>>>>>>>>>>>>>>>>>>>> common dma-
>>>>>>>>>>>>>>>>>>>> resv
>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs dummy
>>>>>>>>>>>>>>>>>>>> &drm_gem_object.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Using this function directly, it is the drivers
>>>>>>>>>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +static inline int
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       return drm_exec_prepare_obj(exec, 
>>>>>>>>>>>>>>>>>>>> &gpuvm->d_obj,
>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm 
>>>>>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>> + u64 addr, u64 range,
>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec
>>>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>>>> + struct drm_gem_object **objs,
>>>>>>>>>>>>>>>>>>>> + unsigned int num_objs,
>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>> + bool interruptible);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec
>>>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>>>> + u64 addr, u64 range,
>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>> + bool interruptible);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all 
>>>>>>>>>>>>>>>>>>>> associated
>>>>>>>>>>>>>>>>>>>> BOs
>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * Releases all dma-resv locks of all 
>>>>>>>>>>>>>>>>>>>> &drm_gem_objects
>>>>>>>>>>>>>>>>>>>> previously acquired
>>>>>>>>>>>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm 
>>>>>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>> + struct dma_fence *fence,
>>>>>>>>>>>>>>>>>>>> + enum dma_resv_usage
>>>>>>>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>>>>>>>> + enum dma_resv_usage
>>>>>>>>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence()
>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec
>>>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>>>> + struct dma_fence *fence,
>>>>>>>>>>>>>>>>>>>> + enum dma_resv_usage
>>>>>>>>>>>>>>>>>>>> private_usage,
>>>>>>>>>>>>>>>>>>>> + enum dma_resv_usage
>>>>>>>>>>>>>>>>>>>> extobj_usage)
>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, 
>>>>>>>>>>>>>>>>>>>> &vm_exec->exec,
>>>>>>>>>>>>>>>>>>>> fence,
>>>>>>>>>>>>>>>>>>>> + private_usage,
>>>>>>>>>>>>>>>>>>>> extobj_usage);
>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>> +
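
To illustrate how the helpers above chain together in a driver's exec path,
here is a hedged sketch. my_run_job() and the fence handling around it are
made-up placeholders, the dma-resv usage values are an arbitrary illustrative
choice, and it assumes the driver set &drm_gpuvm_ops.bo_validate.

static int example_exec(struct drm_gpuvm *gpuvm)
{
        struct drm_gpuvm_exec vm_exec = {
                .vm = gpuvm,
        };
        struct dma_fence *fence;
        int ret;

        /* Lock the VM's resv and all extobj resvs, reserving one fence slot. */
        ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
        if (ret)
                return ret;

        /* Re-validate everything currently on the evicted list. */
        ret = drm_gpuvm_validate(gpuvm);
        if (ret)
                goto out_unlock;

        fence = my_run_job();   /* hypothetical job submission */
        if (IS_ERR(fence)) {
                ret = PTR_ERR(fence);
                goto out_unlock;
        }

        drm_gpuvm_exec_resv_add_fence(&vm_exec, fence,
                                      DMA_RESV_USAGE_BOOKKEEP,
                                      DMA_RESV_USAGE_BOOKKEEP);
        dma_fence_put(fence);

out_unlock:
        drm_gpuvm_exec_unlock(&vm_exec);
        return ret;
}
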
>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a
>>>>>>>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>>>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>>>>>>>>>>>                           * gpuva list.
>>>>>>>>>>>>>>>>>>>>                           */
>>>>>>>>>>>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>>>>>> +                        * @extobj: List entry to 
>>>>>>>>>>>>>>>>>>>> attach to
>>>>>>>>>>>>>>>>>>>> the &drm_gpuvms
>>>>>>>>>>>>>>>>>>>> +                        * extobj list.
>>>>>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>>>>>> +                        * @evict: List entry to 
>>>>>>>>>>>>>>>>>>>> attach to
>>>>>>>>>>>>>>>>>>>> the &drm_gpuvms evict
>>>>>>>>>>>>>>>>>>>> +                        * list.
>>>>>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>>>>>>>>>>>                  } entry;
>>>>>>>>>>>>>>>>>>>>          } list;
>>>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>>>>>     drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object 
>>>>>>>>>>>>>>>>>>>> *obj, bool
>>>>>>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo 
>>>>>>>>>>>>>>>>>>>> *vm_bo);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to 
>>>>>>>>>>>>>>>>>>>> walk over a
>>>>>>>>>>>>>>>>>>>> list of &drm_gpuva
>>>>>>>>>>>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in 
>>>>>>>>>>>>>>>>>>>> each
>>>>>>>>>>>>>>>>>>>> iteration step
>>>>>>>>>>>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>>>>>>>>>>>           * used.
>>>>>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op 
>>>>>>>>>>>>>>>>>>>> *op, void
>>>>>>>>>>>>>>>>>>>> *priv);
>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>> +        * @bo_validate: called from 
>>>>>>>>>>>>>>>>>>>> drm_gpuvm_validate()
>>>>>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>>>>>> +        * Drivers receive this callback for every 
>>>>>>>>>>>>>>>>>>>> evicted
>>>>>>>>>>>>>>>>>>>> &drm_gem_object being
>>>>>>>>>>>>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>>>>>> +        * Typically, drivers would call their driver
>>>>>>>>>>>>>>>>>>>> specific variant of
>>>>>>>>>>>>>>>>>>>> +        * ttm_bo_validate() from within this 
>>>>>>>>>>>>>>>>>>>> callback.
>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object 
>>>>>>>>>>>>>>>>>>>> *obj);
>>>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, 
>>>>>>>>>>>>>>>>>>>> void *priv,
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-20 13:38                                         ` Thomas Hellström
@ 2023-09-20 13:48                                           ` Christian König
  2023-09-20 14:02                                             ` Thomas Hellström
  0 siblings, 1 reply; 77+ messages in thread
From: Christian König @ 2023-09-20 13:48 UTC (permalink / raw)
  To: Thomas Hellström, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

On 20.09.23 at 15:38, Thomas Hellström wrote:
>
> On 9/20/23 15:06, Christian König wrote:
>>
>>
>> Am 20.09.23 um 14:06 schrieb Thomas Hellström:
>>>
>>> On 9/20/23 12:51, Christian König wrote:
>>>> Am 20.09.23 um 09:44 schrieb Thomas Hellström:
>>>>> Hi,
>>>>>
>>>>> On 9/20/23 07:37, Christian König wrote:
>>>>>> Am 19.09.23 um 17:23 schrieb Thomas Hellström:
>>>>>>>
>>>>>>> On 9/19/23 17:16, Danilo Krummrich wrote:
>>>>>>>> On 9/19/23 14:21, Thomas Hellström wrote:
>>>>>>>>> Hi Christian
>>>>>>>>>
>>>>>>>>> On 9/19/23 14:07, Christian König wrote:
>>>>>>>>>> Am 13.09.23 um 17:46 schrieb Danilo Krummrich:
>>>>>>>>>>> On 9/13/23 17:33, Christian König wrote:
>>>>>>>>>>>> Am 13.09.23 um 17:15 schrieb Danilo Krummrich:
>>>>>>>>>>>>> On 9/13/23 16:26, Christian König wrote:
>>>>>>>>>>>>>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>>>>>>>>>>>>>> As mentioned in a different mail thread, the reply is 
>>>>>>>>>>>>>>> based on the assumption
>>>>>>>>>>>>>>> that we don't support anything else than GPUVM updates 
>>>>>>>>>>>>>>> from the IOCTL.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think that this assumption is incorrect.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Well, more precisely I should have said "don't support 
>>>>>>>>>>>>> GPUVM updated from within
>>>>>>>>>>>>> fence signaling critical sections". And looking at the 
>>>>>>>>>>>>> code, that doesn't seem what
>>>>>>>>>>>>> you're doing there.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Vulkan is just one specific use case, but this here
>>>>>>>>>>>>>> should probably be able to handle other use cases as well.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Especially with HMM you get the requirement that you need 
>>>>>>>>>>>>>> to be able to invalidate GPUVM mappings without grabbing 
>>>>>>>>>>>>>> a reservation lock.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What do you mean with "invalidate GPUVM mappings" in this 
>>>>>>>>>>>>> context? drm_gpuvm_bo_evict()
>>>>>>>>>>>>> should only be called from a ttm_device_funcs::move
>>>>>>>>>>>>> callback, we should hold the dma-resv
>>>>>>>>>>>>> lock there.
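
For illustration, a minimal sketch of the call site discussed here, assuming a
TTM based driver that embeds its GEM object in the ttm_buffer_object; the
driver_* names are hypothetical, only drm_gpuvm_bo_evict() is from this series:

static int driver_bo_move(struct ttm_buffer_object *bo, bool evict,
			  struct ttm_operation_ctx *ctx,
			  struct ttm_resource *new_mem,
			  struct ttm_place *hop)
{
	struct drm_gem_object *obj = &bo->base;
	int ret;

	/* Driver specific copy / clear of the backing storage. */
	ret = driver_move_backing_store(bo, ctx, new_mem);
	if (ret)
		return ret;

	/* TTM holds the BO's dma-resv here; update the evicted state of
	 * every VM_BO this GEM object is associated with. */
	drm_gpuvm_bo_evict(obj, evict);

	return 0;
}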
>>>>>>>>>>>>
>>>>>>>>>>>> Well the question is which dma-resv lock do we hold?
>>>>>>>>>>>>
>>>>>>>>>>>> In the move callback we only hold the dma-resv lock of the 
>>>>>>>>>>>> BO which is moved, but when that is a shared BO then that's 
>>>>>>>>>>>> not the same as the one for the VM.
>>>>>>>>>>>
>>>>>>>>>>> Correct, Thomas' idea was to use the GEM's dma_resv lock to 
>>>>>>>>>>> protect drm_gpuvm_bo::evicted
>>>>>>>>>>> and then actually move the drm_gpuvm_bo to the VM's evicted 
>>>>>>>>>>> list once we grabbed all
>>>>>>>>>>> dma-resv locks when locking the VM's BOs using drm_exec. We 
>>>>>>>>>>> can remove them from the evicted
>>>>>>>>>>> list on validate(). This way we never touch the evicted list 
>>>>>>>>>>> without holding at least the VM's
>>>>>>>>>>> dma-resv lock.
>>>>>>>>>>>
>>>>>>>>>>> Do you have any concerns about that?
>>>>>>>>>>
>>>>>>>>>> Scratching my head a bit how that is supposed to work.
>>>>>>>>>>
>>>>>>>>>> This implies that you go over all the evicted BOs during 
>>>>>>>>>> validation and not just the one mentioned in the CS.
>>>>>>>>>>
>>>>>>>>>> That might work for Vulkan, but is pretty much a no-go for 
>>>>>>>>>> OpenGL.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> See what the eviction lock in amdgpu is doing for example.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The eviction_lock seems to protect a VM state "evicting",
>>>>>>>>>>>>> i.e. whether any BO that
>>>>>>>>>>>>> is associated with the VM is currently evicting. At the
>>>>>>>>>>>>> same time amdgpu protects
>>>>>>>>>>>>> the evicted list of the VM with a different lock. So this
>>>>>>>>>>>>> seems to be entirely
>>>>>>>>>>>>> unrelated. Tracking a "currently evicting" state is not 
>>>>>>>>>>>>> part of the GPUVM
>>>>>>>>>>>>> implementation currently and hence nothing would change 
>>>>>>>>>>>>> for amdgpu there.
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry for the confusion we use different terminology in 
>>>>>>>>>>>> amdgpu.
>>>>>>>>>>>>
>>>>>>>>>>>> The eviction lock and evicted state is for the VM page 
>>>>>>>>>>>> tables, e.g. if the whole VM is currently not used and 
>>>>>>>>>>>> swapped out or even de-allocated.
>>>>>>>>>>>>
>>>>>>>>>>>> This is necessary because we have cases where we need to 
>>>>>>>>>>>> access the VM data without holding the dma-resv lock of 
>>>>>>>>>>>> this VM. Especially figuring out which parts of an address 
>>>>>>>>>>>> space contain mappings and which doesn't.
>>>>>>>>>>>
>>>>>>>>>>> I think this is fine, this has nothing to do with lists of 
>>>>>>>>>>> evicted GEM objects or external GEM
>>>>>>>>>>> objects, right? Marking mappings (drm_gpuva) as invalidated 
>>>>>>>>>>> (DRM_GPUVA_INVALIDATED) or accessing
>>>>>>>>>>> the VA space does not require any dma-resv locks.
>>>>>>>>>>
>>>>>>>>>> I hope so, but I'm not 100% sure.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This is a requirement which comes with HMM handling, you 
>>>>>>>>>>>> won't see this with Vulkan (or OpenGL, VAAPI etc..).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The invalidation lock on the other hand is what in this 
>>>>>>>>>>>> discussion is called eviction lock. This one is needed 
>>>>>>>>>>>> because what I wrote above, during the move callback only 
>>>>>>>>>>>> the dma-resv of the BO which is moved is locked, but not 
>>>>>>>>>>>> necessarily the dma-resv of the VM.
>>>>>>>>>>>
>>>>>>>>>>> That's yet another thing, right? This is used to track 
>>>>>>>>>>> whether *any* BO that belongs to the VM is
>>>>>>>>>>> currently being evicted, correct? As mentioned, as by now 
>>>>>>>>>>> this is not supported in GPUVM and hence
>>>>>>>>>>> would be the same driver specific code with the same driver 
>>>>>>>>>>> specific lock.
>>>>>>>>>>
>>>>>>>>>> That is most likely a show stopper using this for OpenGL 
>>>>>>>>>> based workloads as far as I can see. For those you need to 
>>>>>>>>>> able to figure out which non-VM BOs have been evicted and 
>>>>>>>>>> which parts of the VM needs updates.
>>>>>>>>>
>>>>>>>>> We identify those with a bool in the gpuvm_bo, and that bool 
>>>>>>>>> is protected by the bo_resv. In essence, the "evicted" list 
>>>>>>>>> must be made up-to-date with all relevant locks held before 
>>>>>>>>> traversing in the next exec.
>>>>>>>>
>>>>>>>> What I still miss with this idea is how do we find all the 
>>>>>>>> drm_gpuvm_bo structures with the evicted bool set to true? When 
>>>>>>>> doing the drm_exec dance we come across all external ones and 
>>>>>>>> can add them to the list if needed, but what about the BOs 
>>>>>>>> having the VM's dma-resv?
>>>>>>>
>>>>>>> Oh, they can be added to the evict list directly (no bool 
>>>>>>> needed) in the eviction code, like in v3. Since for those we 
>>>>>>> indeed hold the VM's dma_resv since it's aliased with the 
>>>>>>> object's dma-resv.
>>>>>>
>>>>>> Yeah, I wanted to note what Danilo seems to think about as well. 
>>>>>> How do we figure out the non-VM BOs evicted?
>>>>>>
>>>>>> We can't walk over the list of all non-VM BOs on every 
>>>>>> submission, that's too much overhead for cases with lots of non-VM
>>>>>> BOs.
>>>>>>
>>>>>> And we can't rely on userspace sending all non-VM BOs as used 
>>>>>> list down to the kernel with each submission.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>
>>>>> No, that's not needed: Mechanism below.
>>>>>
>>>>> 1) We maintain an evicted list. Typically protected by the vm resv.
>>>>> 2) Each gpuvm_bo has a bool "evicted". Protected by the bo resv.
>>>>>
>>>>> a) Evicting a vm bo: The vm resv is held by the eviction code. 
>>>>> Just put it on the evicted list.
>>>>> b) Evicting a shared/external bo: The bo resv is held by the 
>>>>> eviction code. Set the "evicted" bool.
>>>>> c) Validating the evicted list on exec:
>>>>
>>>>
>>>>> Loop through all *external/shared* bos.
>>>>
>>>> And this is what you can't do. For Vulkan it probably doesn't 
>>>> matter, but for OpenGL and especially multimedia we have much more 
>>>> BOs on the shared list than what's allocated for the VM.
>>>
>>> But you need to lock- and fence all those so you need to loop 
>>> through them anyway, so we're still O(n_shared)? Or is there some 
>>> clever optimization in amdgpu?
>>
>> Why should I lock and fence them? Only the BOs in the relocation list 
>> are locked and fenced.
>
> Do you by "relocation" list refer to what gpuvm calls "evict" list or 
> something else? Like the relocaton/validation list that used to be 
> sent from user-space for non-VM_BIND vms?

The BOs sent into the kernel with each command submission on the classic
IOCTLs.

>
> The vm bos plus the external/shared bos bound to the VM (the external 
> list) are the bos being referenced by the current batch. So the bos on 
> the VM's external list are the ones being locked and fenced and 
> checked for eviction. If they weren't they could be evicted before the 
> current batch completes?

That only applies to a certain use case, e.g. Vulkan or user mode queues.

Multimedia APIs and especially OpenGL work differently; here only the
BOs mentioned in the relocation list are guaranteed to not be evicted.

This is intentional because those APIs tend to over-allocate memory all
the time, so for good performance you need to be able to evict BOs from 
the VM while other parts of the VM are currently in use.

Without that especially OpenGL performance would be completely crippled 
at least on amdgpu.

Regards,
Christian.

>
> Thanks,
>
> Thomas
>
>
>>
>> Regards,
>> Christian.
>>
>>>
>>> I think with some UMDs, xe might end up with similar large lists...
>>>
>>> /Thomas
>>>
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> Lock them. After locking, check the "evicted" bool; if it's true,
>>>>> put the bo on the evicted list (we hold the VM resv at this point) 
>>>>> and clear the "evicted" bool. Note that other vms will have their 
>>>>> own gpuvm_bo which is marked evicted.
>>>>>
>>>>> I have this coded up in a patch for Xe and it seems to be working 
>>>>> properly.
>>>>>
>>>>> /Thomas
>>>>>
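
For reference, a rough sketch of the scheme described above, assuming the extobj
and evict lists end up protected by the VM's dma-resv as proposed (rather than
the v3 spinlocks); the per-VM_BO "evicted" bool does not exist in v3 and the
function name is made up, the remaining helpers are taken from this series and
drm_exec:

static int driver_vm_lock_and_validate(struct drm_gpuvm *gpuvm,
				       struct drm_exec *exec,
				       unsigned int num_fences)
{
	struct drm_gpuvm_bo *vm_bo;
	int ret;

	drm_exec_until_all_locked(exec) {
		/* VM-local BOs share the VM's dma-resv, so the eviction code
		 * already put their VM_BO on gpuvm->evict.list directly. */
		ret = drm_gpuvm_prepare_vm(gpuvm, exec, num_fences);
		drm_exec_retry_on_contention(exec);
		if (ret)
			return ret;

		/* Lock every external BO; with both the VM's and the BO's
		 * dma-resv held, transfer the "evicted" state (set under the
		 * BO's dma-resv by the eviction code) onto the evict list. */
		list_for_each_entry(vm_bo, &gpuvm->extobj.list,
				    list.entry.extobj) {
			ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
			drm_exec_retry_on_contention(exec);
			if (ret)
				return ret;

			if (vm_bo->evicted) {
				list_move_tail(&vm_bo->list.entry.evict,
					       &gpuvm->evict.list);
				vm_bo->evicted = false;
			}
		}
	}

	/* Drains the evict list again, calling ops->bo_validate() for each
	 * entry while all relevant dma-resv locks are held. */
	return drm_gpuvm_validate(gpuvm);
}

With that, the only writers of the evict list run under the VM's dma-resv, so
neither the spinlock nor the local-list dance in get_next_vm_bo_from_list() is
needed on this path.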
>>>>>
>>>>>>
>>>>>>>
>>>>>>> /Thomas
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> If you mean that we need to unbind all vmas of all vms of
>>>>>>>>> evicted bos before evicting, we don't do that, at least not in
>>>>>>>>> Xe, since when evicting we wait for VM idle, and it can't access
>>>>>>>>> anything through the stale vmas until they have been
>>>>>>>>> revalidated and rebound.
>>>>>>>>>
>>>>>>>>> /Thomas
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Christian.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas 
>>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas 
>>>>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas 
>>>>>>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>>>>>>> Hi, Danilo,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>>>>>>> So far the DRM GPUVA manager offers common 
>>>>>>>>>>>>>>>>>>>>> infrastructure to
>>>>>>>>>>>>>>>>>>>>> track GPU VA
>>>>>>>>>>>>>>>>>>>>> allocations and mappings, generically connect GPU 
>>>>>>>>>>>>>>>>>>>>> VA mappings
>>>>>>>>>>>>>>>>>>>>> to their
>>>>>>>>>>>>>>>>>>>>> backing buffers and perform more complex mapping 
>>>>>>>>>>>>>>>>>>>>> operations
>>>>>>>>>>>>>>>>>>>>> on the GPU VA
>>>>>>>>>>>>>>>>>>>>> space.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> However, there are more design patterns commonly 
>>>>>>>>>>>>>>>>>>>>> used by
>>>>>>>>>>>>>>>>>>>>> drivers, which
>>>>>>>>>>>>>>>>>>>>> can potentially be generalized in order to make 
>>>>>>>>>>>>>>>>>>>>> the DRM GPUVA
>>>>>>>>>>>>>>>>>>>>> manager
>>>>>>>>>>>>>>>>>>>>> represent a basic GPU-VM implementation. In this 
>>>>>>>>>>>>>>>>>>>>> context,
>>>>>>>>>>>>>>>>>>>>> this patch aims
>>>>>>>>>>>>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not 
>>>>>>>>>>>>>>>>>>>>> being used
>>>>>>>>>>>>>>>>>>>>> outside of
>>>>>>>>>>>>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM 
>>>>>>>>>>>>>>>>>>>>> objects
>>>>>>>>>>>>>>>>>>>>> which are
>>>>>>>>>>>>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 3) Provide functions to efficiently lock all GEM 
>>>>>>>>>>>>>>>>>>>>> objects dma-
>>>>>>>>>>>>>>>>>>>>> resv the
>>>>>>>>>>>>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 4) Provide tracking of evicted GEM objects the GPU-VM
>>>>>>>>>>>>>>>>>>>>> contains mappings
>>>>>>>>>>>>>>>>>>>>>       of, such that validation of evicted GEM 
>>>>>>>>>>>>>>>>>>>>> objects is
>>>>>>>>>>>>>>>>>>>>> accelerated.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 5) Provide some convenience functions for common
>>>>>>>>>>>>>>>>>>>>> patterns.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Rather than being designed as a "framework", the 
>>>>>>>>>>>>>>>>>>>>> target is to
>>>>>>>>>>>>>>>>>>>>> make all
>>>>>>>>>>>>>>>>>>>>> features appear as a collection of optional helper 
>>>>>>>>>>>>>>>>>>>>> functions,
>>>>>>>>>>>>>>>>>>>>> such that
>>>>>>>>>>>>>>>>>>>>> drivers are free to make use of the DRM GPUVA 
>>>>>>>>>>>>>>>>>>>>> managers basic
>>>>>>>>>>>>>>>>>>>>> functionality and opt-in for other features 
>>>>>>>>>>>>>>>>>>>>> without setting
>>>>>>>>>>>>>>>>>>>>> any feature
>>>>>>>>>>>>>>>>>>>>> flags, just by making use of the corresponding 
>>>>>>>>>>>>>>>>>>>>> functions.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Big kudos to Boris Brezillon for his help to 
>>>>>>>>>>>>>>>>>>>>> figure out
>>>>>>>>>>>>>>>>>>>>> locking for drivers
>>>>>>>>>>>>>>>>>>>>> updating the GPU VA space within the fence 
>>>>>>>>>>>>>>>>>>>>> signalling path.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Suggested-by: Matthew Brost <matthew.brost@intel.com>
>>>>>>>>>>>>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>>>> drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>>>>>>>>>>> include/drm/drm_gpuvm.h | 197 ++++++++++++++
>>>>>>>>>>>>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>>>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for 
>>>>>>>>>>>>>>>>>>>>> an existing
>>>>>>>>>>>>>>>>>>>>> instance of this
>>>>>>>>>>>>>>>>>>>>>      * particular combination. If not existent a 
>>>>>>>>>>>>>>>>>>>>> new instance
>>>>>>>>>>>>>>>>>>>>> is created and linked
>>>>>>>>>>>>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a 
>>>>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>>>>>>>>>>>>>> + * as entry for the &drm_gpuvm's lists of 
>>>>>>>>>>>>>>>>>>>>> external and
>>>>>>>>>>>>>>>>>>>>> evicted objects. Those
>>>>>>>>>>>>>>>>>>>>> + * lists are maintained in order to accelerate
>>>>>>>>>>>>>>>>>>>>> locking of
>>>>>>>>>>>>>>>>>>>>> dma-resv locks and
>>>>>>>>>>>>>>>>>>>>> + * validation of evicted objects bound in a 
>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm. For
>>>>>>>>>>>>>>>>>>>>> instance, all
>>>>>>>>>>>>>>>>>>>>> + * &drm_gem_object's &dma_resv of a given 
>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm can be
>>>>>>>>>>>>>>>>>>>>> locked by calling
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers can 
>>>>>>>>>>>>>>>>>>>>> call
>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>>>>>>>>>>>>>> + * order to validate all evicted 
>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects. It is
>>>>>>>>>>>>>>>>>>>>> also possible to lock
>>>>>>>>>>>>>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>>>>>>>>>>>>>> corresponding parameters to
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code the 
>>>>>>>>>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>>>>>>>>>> loop while making
>>>>>>>>>>>>>>>>>>>>> + * use of helper functions such as 
>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_prepare_range()
>>>>>>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Every bound &drm_gem_object is treated as 
>>>>>>>>>>>>>>>>>>>>> external object
>>>>>>>>>>>>>>>>>>>>> when its &dma_resv
>>>>>>>>>>>>>>>>>>>>> + * structure is different than the &drm_gpuvm's 
>>>>>>>>>>>>>>>>>>>>> common
>>>>>>>>>>>>>>>>>>>>> &dma_resv structure.
>>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>>>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() 
>>>>>>>>>>>>>>>>>>>>> for the same
>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>>>>>>>>      * &drm_gem_object must be able to observe 
>>>>>>>>>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>> creations and destructions
>>>>>>>>>>>>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep 
>>>>>>>>>>>>>>>>>>>>> instances unique.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of 
>>>>>>>>>>>>>>>>>>>>> external and
>>>>>>>>>>>>>>>>>>>>> evicted objects are
>>>>>>>>>>>>>>>>>>>>> + * protected against concurrent insertion / 
>>>>>>>>>>>>>>>>>>>>> removal and
>>>>>>>>>>>>>>>>>>>>> iteration internally.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * However, drivers still need to protect
>>>>>>>>>>>>>>>>>>>>> concurrent
>>>>>>>>>>>>>>>>>>>>> calls to functions
>>>>>>>>>>>>>>>>>>>>> + * iterating those lists, such as 
>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_validate() and
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such 
>>>>>>>>>>>>>>>>>>>>> function contains
>>>>>>>>>>>>>>>>>>>>> a particular
>>>>>>>>>>>>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Functions adding or removing entries from 
>>>>>>>>>>>>>>>>>>>>> those lists,
>>>>>>>>>>>>>>>>>>>>> such as
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() or 
>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_extobj_add() may be
>>>>>>>>>>>>>>>>>>>>> called with external
>>>>>>>>>>>>>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>>>>>>>>>>>>>> corresponding list to be
>>>>>>>>>>>>>>>>>>>>> + * (safely) modified while potentially being 
>>>>>>>>>>>>>>>>>>>>> iterated by
>>>>>>>>>>>>>>>>>>>>> other API functions.
>>>>>>>>>>>>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>>>>>>>>>>>>      *   }
>>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next 
>>>>>>>>>>>>>>>>>>>>> vm_bo element
>>>>>>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're 
>>>>>>>>>>>>>>>>>>>>> iterating on
>>>>>>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list 
>>>>>>>>>>>>>>>>>>>>> used to store
>>>>>>>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>>>>>>>>>>>>>>> get_next_vm_bo_from_list()
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list 
>>>>>>>>>>>>>>>>>>>>> iteration.
>>>>>>>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after 
>>>>>>>>>>>>>>>>>>>>> picking the
>>>>>>>>>>>>>>>>>>>>> first element from
>>>>>>>>>>>>>>>>>>>>> + * the list, so list insertion and deletion can happen
>>>>>>>>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>>>>>>>>> Are the list spinlocks needed for that async state 
>>>>>>>>>>>>>>>>>>>> update from
>>>>>>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>>>>>>> dma-fence critical section we've discussed previously?
>>>>>>>>>>>>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Otherwise it should be sufficient to protect the 
>>>>>>>>>>>>>>>>>>>> lists with the
>>>>>>>>>>>>>>>>>>>> gpuvm's resv
>>>>>>>>>>>>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> If those spinlocks are still needed in some 
>>>>>>>>>>>>>>>>>>>> situations, perhaps
>>>>>>>>>>>>>>>>>>>> could we
>>>>>>>>>>>>>>>>>>>> have an option to set them to NULL (Like IIRC the 
>>>>>>>>>>>>>>>>>>>> maple tree
>>>>>>>>>>>>>>>>>>>> allows for)?
>>>>>>>>>>>>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>>>>>>>>>>>>> holding only the dma-resv lock from the BO this 
>>>>>>>>>>>>>>>>>>> function gets
>>>>>>>>>>>>>>>>>>> called for. Hence,
>>>>>>>>>>>>>>>>>>> the spinlock protects concurrent 
>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_evict() calls with
>>>>>>>>>>>>>>>>>>> different BOs.
>>>>>>>>>>>>>>>>>> No. Only if you try to add external objects to the 
>>>>>>>>>>>>>>>>>> vm's evict list
>>>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>>>> within the evict code. That's not necessary since you 
>>>>>>>>>>>>>>>>>> loop through
>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>> external objects anyway when locking them so an 
>>>>>>>>>>>>>>>>>> "evicted" bool in
>>>>>>>>>>>>>>>>>> the vm_bo,
>>>>>>>>>>>>>>>>>> protected by the bo resv would be sufficient. The 
>>>>>>>>>>>>>>>>>> extobj locking
>>>>>>>>>>>>>>>>>> loop can
>>>>>>>>>>>>>>>>>> then add the bo to the evicted list.
>>>>>>>>>>>>>>>>> And validate() can remove it while still holding all 
>>>>>>>>>>>>>>>>> dma-resv locks,
>>>>>>>>>>>>>>>>> neat!
>>>>>>>>>>>>>>>>> However, what if two tasks are trying to lock the VA 
>>>>>>>>>>>>>>>>> space
>>>>>>>>>>>>>>>>> concurrently? What
>>>>>>>>>>>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to 
>>>>>>>>>>>>>>>>> zero in
>>>>>>>>>>>>>>>>> drm_gpuva_unlink()?
>>>>>>>>>>>>>>>>> Are we guaranteed that at this point of time the 
>>>>>>>>>>>>>>>>> drm_gpuvm_bo is not
>>>>>>>>>>>>>>>>> on the
>>>>>>>>>>>>>>>>> evicted list? Because otherwise we would call 
>>>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>>>>>>> might drop the last reference to the drm_gem_object 
>>>>>>>>>>>>>>>>> and hence we'd
>>>>>>>>>>>>>>>>> potentially
>>>>>>>>>>>>>>>>> free the dma-resv lock while holding it, at least if 
>>>>>>>>>>>>>>>>> it's an external
>>>>>>>>>>>>>>>>> object.
>>>>>>>>>>>>>>>> Easiest way in this scheme is to think of the lists as 
>>>>>>>>>>>>>>>> being protected
>>>>>>>>>>>>>>>> by the vm's resv lock. That means anybody calling 
>>>>>>>>>>>>>>>> unlink() must also
>>>>>>>>>>>>>>>> hold the vm's resv lock. (Which is OK from an UAF point 
>>>>>>>>>>>>>>>> of view, but
>>>>>>>>>>>>>>>> perhaps not from a locking inversion POV from an async
>>>>>>>>>>>>>>>> list update).
>>>>>>>>>>>>>>> This would mean that on unlink() we'd need to hold the 
>>>>>>>>>>>>>>> VM's resv lock and the
>>>>>>>>>>>>>>> corresponding GEM's resv lock (in case they're not the 
>>>>>>>>>>>>>>> same anyways) because the
>>>>>>>>>>>>>>> VM's resv lock would protect the external / evicted 
>>>>>>>>>>>>>>> object lists and the GEM
>>>>>>>>>>>>>>> objects resv lock protects the GEM's list of 
>>>>>>>>>>>>>>> drm_gpuvm_bos and the
>>>>>>>>>>>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> For extobjs an outer lock would be enough in case of 
>>>>>>>>>>>>>>>>>>> Xe, but I
>>>>>>>>>>>>>>>>>>> really would not
>>>>>>>>>>>>>>>>>>> like to add even more complexity just to get the 
>>>>>>>>>>>>>>>>>>> spinlock out of
>>>>>>>>>>>>>>>>>>> the way in case
>>>>>>>>>>>>>>>>>>> the driver already has an outer lock protecting this 
>>>>>>>>>>>>>>>>>>> path.
>>>>>>>>>>>>>>>>>> I must disagree here. These spinlocks and atomic 
>>>>>>>>>>>>>>>>>> operations are
>>>>>>>>>>>>>>>>>> pretty
>>>>>>>>>>>>>>>>>> costly and as discussed earlier this type of locking 
>>>>>>>>>>>>>>>>>> was the reason
>>>>>>>>>>>>>>>>>> (at
>>>>>>>>>>>>>>>>>> least according to the commit message) that made 
>>>>>>>>>>>>>>>>>> Christian drop the
>>>>>>>>>>>>>>>>>> XArray
>>>>>>>>>>>>>>>>>> use in drm_exec for the same set of objects: "The 
>>>>>>>>>>>>>>>>>> locking overhead
>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>> unecessary and measurable". IMHO the spinlock is the 
>>>>>>>>>>>>>>>>>> added
>>>>>>>>>>>>>>>>>> complexity and a
>>>>>>>>>>>>>>>>>> single wide lock following the drm locking guidelines 
>>>>>>>>>>>>>>>>>> set out by
>>>>>>>>>>>>>>>>>> Daniel and
>>>>>>>>>>>>>>>>>> David should really be the default choice with an 
>>>>>>>>>>>>>>>>>> opt-in for a
>>>>>>>>>>>>>>>>>> spinlock if
>>>>>>>>>>>>>>>>>> needed for async and pushing out to a wq is not an 
>>>>>>>>>>>>>>>>>> option.
>>>>>>>>>>>>>>>>> For the external object list an outer lock would work 
>>>>>>>>>>>>>>>>> as long as it's
>>>>>>>>>>>>>>>>> not the
>>>>>>>>>>>>>>>>> dma-resv lock of the corresponding GEM object, since 
>>>>>>>>>>>>>>>>> here we actually
>>>>>>>>>>>>>>>>> need to
>>>>>>>>>>>>>>>>> remove the list entry from the external object list on
>>>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>>>>>>>>>>>> It's just a bit weird design wise that drivers would 
>>>>>>>>>>>>>>>>> need to take
>>>>>>>>>>>>>>>>> this outer
>>>>>>>>>>>>>>>>> lock on:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>>>>>>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also 
>>>>>>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>>>>>>> - drm_gpuva_unlink()            (because it needs to call
>>>>>>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>>>>>>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>>>>>>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Given that it seems reasonable to do all the required 
>>>>>>>>>>>>>>>>> locking
>>>>>>>>>>>>>>>>> internally.
>>>>>>>>>>>>>>>> From a design POV, there has been a clear direction in
>>>>>>>>>>>>>>>> XE to make
>>>>>>>>>>>>>>>> things similar to mmap() / munmap(), so this outer 
>>>>>>>>>>>>>>>> lock, which in Xe is
>>>>>>>>>>>>>>>> an rwsem, is used in a similar way as the mmap_lock. 
>>>>>>>>>>>>>>>> It's protecting
>>>>>>>>>>>>>>>> the page-table structures and vma rb tree, the userptr 
>>>>>>>>>>>>>>>> structures and
>>>>>>>>>>>>>>>> the extobj list. Basically it's taken early in the exec 
>>>>>>>>>>>>>>>> IOCTL, the
>>>>>>>>>>>>>>>> VM_BIND ioctl, the compute rebind worker and the 
>>>>>>>>>>>>>>>> pagefault handler, so
>>>>>>>>>>>>>>>> all of the above are just asserting that it is taken in 
>>>>>>>>>>>>>>>> the correct
>>>>>>>>>>>>>>>> mode.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> But strictly with this scheme one could also use the 
>>>>>>>>>>>>>>>> vm's dma_resv for
>>>>>>>>>>>>>>>> the extobj list since with drm_exec, it's locked before 
>>>>>>>>>>>>>>>> traversing the
>>>>>>>>>>>>>>>> list.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The whole point of this scheme is to rely on locks that 
>>>>>>>>>>>>>>>> you already are
>>>>>>>>>>>>>>>> supposed to be holding for various reasons and is 
>>>>>>>>>>>>>>>> simple to comprehend.
>>>>>>>>>>>>>>> I don't agree that we're supposed to hold the VM's resv 
>>>>>>>>>>>>>>> lock anyways for
>>>>>>>>>>>>>>> functions like drm_gpuvm_bo_put() or drm_gpuva_unlink(), 
>>>>>>>>>>>>>>> but I'm fine using it
>>>>>>>>>>>>>>> for that purpose nevertheless.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In order to at least place lockdep checks, the driver 
>>>>>>>>>>>>>>>>> would need to
>>>>>>>>>>>>>>>>> supply the
>>>>>>>>>>>>>>>>> corresponding lock's lockdep_map, because the GPUVM 
>>>>>>>>>>>>>>>>> otherwise doesn't
>>>>>>>>>>>>>>>>> know about
>>>>>>>>>>>>>>>>> the lock.
>>>>>>>>>>>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>>>>>>>>>>> I'd really like to avoid that, especially now that 
>>>>>>>>>>>>>>> everything got simpler. We
>>>>>>>>>>>>>>> should define the actual locks to take instead.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Out of curiosity, what is the overhead of a 
>>>>>>>>>>>>>>>>> spin_lock() that doesn't
>>>>>>>>>>>>>>>>> need to
>>>>>>>>>>>>>>>>> spin?
>>>>>>>>>>>>>>>> I guess it's hard to tell exactly, but it is much lower 
>>>>>>>>>>>>>>>> on modern x86
>>>>>>>>>>>>>>>> than what it used to be. Not sure about ARM, which is 
>>>>>>>>>>>>>>>> the other
>>>>>>>>>>>>>>>> architecture important to us. I figure if there is 
>>>>>>>>>>>>>>>> little cache-line
>>>>>>>>>>>>>>>> bouncing the main overhead comes from the implied 
>>>>>>>>>>>>>>>> barriers.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> A pretty simple way that would not add much code 
>>>>>>>>>>>>>>>>>> would be
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> static void gpuvm_cond_spin_lock(const struct 
>>>>>>>>>>>>>>>>>> drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>> spinlock_t
>>>>>>>>>>>>>>>>>> *lock)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>>>>>>>>>>>          spin_lock(lock);
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> For such drivers, that would require anybody 
>>>>>>>>>>>>>>>>>>>> calling unlink to
>>>>>>>>>>>>>>>>>>>> hold the vm's
>>>>>>>>>>>>>>>>>>>> resv, though.
>>>>>>>>>>>>>>>>>>> In V4 I want to go back to having a dedicated lock 
>>>>>>>>>>>>>>>>>>> for the GEMs
>>>>>>>>>>>>>>>>>>> gpuva list (or
>>>>>>>>>>>>>>>>>>> VM_BO list to be more precise). We can't just use 
>>>>>>>>>>>>>>>>>>> the dma-resv
>>>>>>>>>>>>>>>>>>> lock for that
>>>>>>>>>>>>>>>>>>> with VM_BO abstractions, because on destruction of a 
>>>>>>>>>>>>>>>>>>> VM_BO we
>>>>>>>>>>>>>>>>>>> otherwise wouldn't
>>>>>>>>>>>>>>>>>>> be allowed to already hold the dma-resv lock. That's 
>>>>>>>>>>>>>>>>>>> the fix I
>>>>>>>>>>>>>>>>>>> was referring to
>>>>>>>>>>>>>>>>>>> earlier.
>>>>>>>>>>>>>>>>>> Yeah, I can see the need for a dedicated lock for the 
>>>>>>>>>>>>>>>>>> GEM's gpuva
>>>>>>>>>>>>>>>>>> list, but
>>>>>>>>>>>>>>>>>> holding the vm's dma-resv lock across the unlink 
>>>>>>>>>>>>>>>>>> shouldn't be a
>>>>>>>>>>>>>>>>>> problem. We
>>>>>>>>>>>>>>>>>> may free the object and a pointer to the vm's resv 
>>>>>>>>>>>>>>>>>> during unlink
>>>>>>>>>>>>>>>>>> but we
>>>>>>>>>>>>>>>>>> don't free the vm's resv.  It'd be a matter of 
>>>>>>>>>>>>>>>>>> ensuring that any
>>>>>>>>>>>>>>>>>> calls to
>>>>>>>>>>>>>>>>>> unlink from *within* drm_gpuvm allows it to be held.
>>>>>>>>>>>>>>>>> Drivers calling unlink() from the fence signaling path 
>>>>>>>>>>>>>>>>> can't use the
>>>>>>>>>>>>>>>>> VM's
>>>>>>>>>>>>>>>>> dma-resv lock.
>>>>>>>>>>>>>>>> Yes, that made me a bit curious because in the current 
>>>>>>>>>>>>>>>> version the code
>>>>>>>>>>>>>>>> required the object's dma_resv for unlink() which can't 
>>>>>>>>>>>>>>>> be grabbed
>>>>>>>>>>>>>>>> either from the fence signaling path. So are there any 
>>>>>>>>>>>>>>>> drivers actually
>>>>>>>>>>>>>>>> wanting to do that? If so, they will either need to 
>>>>>>>>>>>>>>>> resort to the
>>>>>>>>>>>>>>>> current spinlock solution or they will need to call 
>>>>>>>>>>>>>>>> unlink from a
>>>>>>>>>>>>>>>> workqueue item.
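
For completeness, a sketch of the deferred variant mentioned here, for drivers
that have to trigger the unlink from the fence signalling path; struct
driver_unlink_work and the driver_* names are hypothetical, and the lock taken
around drm_gpuva_unlink() is the GEM's dma-resv as required by v3 (it may become
the VM's resv or a dedicated lock in v4):

struct driver_unlink_work {
	struct work_struct work;
	struct drm_gpuva *va;
};

static void driver_unlink_worker(struct work_struct *w)
{
	struct driver_unlink_work *uw = container_of(w, typeof(*uw), work);
	struct drm_gem_object *obj = uw->va->gem.obj;

	/* Hold a GEM reference of our own, so that dropping the last VM_BO
	 * reference in drm_gpuva_unlink() cannot free the object (and its
	 * dma-resv) while we still hold the lock. */
	drm_gem_object_get(obj);
	dma_resv_lock(obj->resv, NULL);
	drm_gpuva_unlink(uw->va);
	dma_resv_unlock(obj->resv);
	drm_gem_object_put(obj);

	kfree(uw);
}

/* Called from the fence signalling critical path. */
static void driver_defer_unlink(struct drm_gpuva *va)
{
	struct driver_unlink_work *uw = kzalloc(sizeof(*uw), GFP_ATOMIC);

	if (!uw)
		return;	/* a real driver needs a proper fallback here */

	uw->va = va;
	INIT_WORK(&uw->work, driver_unlink_worker);
	queue_work(system_unbound_wq, &uw->work);
}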
>>>>>>>>>>>>>>> As Boris already mentioned we have the dma-resv lock by 
>>>>>>>>>>>>>>> default or a driver
>>>>>>>>>>>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid 
>>>>>>>>>>>>>>> of the latter.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Also, what if the object is an external object? We 
>>>>>>>>>>>>>>>>> can't use the VM's
>>>>>>>>>>>>>>>>> dma-resv
>>>>>>>>>>>>>>>>> lock here.
>>>>>>>>>>>>>>>> Why? Typically (sync) unlink is only ever called from 
>>>>>>>>>>>>>>>> an unbind-like
>>>>>>>>>>>>>>>> operation where it should be trivial to grab the vm's 
>>>>>>>>>>>>>>>> resv. Or, for
>>>>>>>>>>>>>>>> that matter any outer lock protecting the extobj list. 
>>>>>>>>>>>>>>>> Rule would be
>>>>>>>>>>>>>>>> the drm_gpuvm_bo::entry::extobj  and 
>>>>>>>>>>>>>>>> drm_gpuvm_bo::entry::evict would
>>>>>>>>>>>>>>>> be protected by either the vm's dma_resv (or possibly 
>>>>>>>>>>>>>>>> an outer lock in
>>>>>>>>>>>>>>>> the case of the extobj list).
>>>>>>>>>>>>>>> Outer lock wouldn't have been working for updates in the 
>>>>>>>>>>>>>>> async path, but
>>>>>>>>>>>>>>> shouldn't be relevant anymore. We could use the VM's 
>>>>>>>>>>>>>>> resv for that.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   And we can't have the GEM objs dma-resv lock held 
>>>>>>>>>>>>>>>>> when calling
>>>>>>>>>>>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), 
>>>>>>>>>>>>>>>>> which if the
>>>>>>>>>>>>>>>>> refcount drops
>>>>>>>>>>>>>>>>> to zero calls drm_gpuvm_bo_destroy() and 
>>>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy() might
>>>>>>>>>>>>>>>>> drop the
>>>>>>>>>>>>>>>>> last reference of the GEM object.
>>>>>>>>>>>>>>>> Yes, but this is a different problem as to what exactly 
>>>>>>>>>>>>>>>> protects
>>>>>>>>>>>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an 
>>>>>>>>>>>>>>>> internal per bo list
>>>>>>>>>>>>>>>> lock, or if we want to keep the bo's dma_resv we need 
>>>>>>>>>>>>>>>> to ensure that
>>>>>>>>>>>>>>>> the caller of dma_resv_unlock(obj->resv) actually 
>>>>>>>>>>>>>>>> refcounts its obj
>>>>>>>>>>>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's 
>>>>>>>>>>>>>>>> refcount (I know
>>>>>>>>>>>>>>>> Boris didn't like that, but requiring an explicit 
>>>>>>>>>>>>>>>> refcount for a
>>>>>>>>>>>>>>>> pointer you dereference unless you're under a lock that 
>>>>>>>>>>>>>>>> ensures keeping
>>>>>>>>>>>>>>>> the object alive is pretty much required?) But anyway 
>>>>>>>>>>>>>>>> for the
>>>>>>>>>>>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or 
>>>>>>>>>>>>>>>> internal spinlock)
>>>>>>>>>>>>>>>> I don't have a strong preference.
>>>>>>>>>>>>>>> We can keep the GEM objects dma-resv lock, however as 
>>>>>>>>>>>>>>> mentioned above
>>>>>>>>>>>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires 
>>>>>>>>>>>>>>> both the VM's resv lock
>>>>>>>>>>>>>>> and the GEM's resv lock in case they differ.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>   All those problems go away with a dedicated
>>>>>>>>>>>>>>>>> GEM gpuva list lock.
>>>>>>>>>>>>>>>> I don't think these are real problems.
>>>>>>>>>>>>>>>> With the exception of the eviction list "trick" where we
>>>>>>>>>>>>>>>> currently have
>>>>>>>>>>>>>>>> slightly different approach to collect external bos 
>>>>>>>>>>>>>>>> needing rebinding,
>>>>>>>>>>>>>>>> we have this working fine.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> TBH I think pretty much the only situation where the 
>>>>>>>>>>>>>>>> spinlock is needed
>>>>>>>>>>>>>>>> is for async updates of these lists, unless a wq item 
>>>>>>>>>>>>>>>> can be used for
>>>>>>>>>>>>>>>> that, but it doesn't really seem like the current code 
>>>>>>>>>>>>>>>> allows for such
>>>>>>>>>>>>>>>> updates anyway? It complicates the code a lot, adds 
>>>>>>>>>>>>>>>> overhead and also
>>>>>>>>>>>>>>>> adds the requirement for refcounting during list 
>>>>>>>>>>>>>>>> traversal.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> It seems that with that also the refcount could be 
>>>>>>>>>>>>>>>>>>>> make non-
>>>>>>>>>>>>>>>>>>>> atomic.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> All in the spirit of the drm locking guidelines 
>>>>>>>>>>>>>>>>>>>> "use big locks
>>>>>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>>>>>> possible".
>>>>>>>>>>>>>>>>>>>> Lower level locks only when necessary for 
>>>>>>>>>>>>>>>>>>>> performance or
>>>>>>>>>>>>>>>>>>>> locking inversion?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Elements popped from the original list are 
>>>>>>>>>>>>>>>>>>>>> kept in a
>>>>>>>>>>>>>>>>>>>>> local list, so removal
>>>>>>>>>>>>>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>>>>>>>>>>>>>> iterating the list.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, 
>>>>>>>>>>>>>>>>>>>>> __list_name,
>>>>>>>>>>>>>>>>>>>>> __local_list, __prev_vm_bo)     \
>>>>>>>>>>>>>>>>>>>>> +       ({
>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>> +               struct drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>>> *__vm_bo;                                           \
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);
>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                                \
>>>>>>>>>>>>>>>>>>>>> +               while (!list_empty(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>>> __list_name.list)) {                     \
>>>>>>>>>>>>>>>>>>>>> +                       __vm_bo =
>>>>>>>>>>>>>>>>>>>>> list_first_entry(&(__gpuvm)->__list_name.list,        \ 
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> + struct
>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo,                 \
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name);             \
>>>>>>>>>>>>>>>>>>>>> +                       if
>>>>>>>>>>>>>>>>>>>>> (drm_gpuvm_bo_get_unless_zero(__vm_bo))
>>>>>>>>>>>>>>>>>>>>> {                    \
>>>>>>>>>>>>>>>>>>>>> +                               list_move_tail(&(__vm_bo)- 
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name,      \
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> __local_list);                           \
>>>>>>>>>>>>>>>>>>>>> +                               break;
>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>> +                       } else
>>>>>>>>>>>>>>>>>>>>> {                                                        \ 
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> +                               list_del_init(&(__vm_bo)- 
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name);      \
>>>>>>>>>>>>>>>>>>>>> +                               __vm_bo =
>>>>>>>>>>>>>>>>>>>>> NULL;                                         \
>>>>>>>>>>>>>>>>>>>>> +                       }
>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                              \
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>> +               __vm_bo;
>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>> +       })
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo list 
>>>>>>>>>>>>>>>>>>>>> iterator
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list 
>>>>>>>>>>>>>>>>>>>>> iteration.
>>>>>>>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after 
>>>>>>>>>>>>>>>>>>>>> picking the
>>>>>>>>>>>>>>>>>>>>> first element from the
>>>>>>>>>>>>>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Typical use:
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + *     ret = 0;
>>>>>>>>>>>>>>>>>>>>> + *     drm_gpuvm_for_each_vm_bo(gpuvm, <list_name>,
>>>>>>>>>>>>>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>>>>>>>>>>>>>> + *             ret = do_something_with_vm_bo(..., 
>>>>>>>>>>>>>>>>>>>>> vm_bo);
>>>>>>>>>>>>>>>>>>>>> + *             if (ret)
>>>>>>>>>>>>>>>>>>>>> + *                     break;
>>>>>>>>>>>>>>>>>>>>> + *     }
>>>>>>>>>>>>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>>>>> + *     drm_gpuvm_restore_vm_bo_list(gpuvm, 
>>>>>>>>>>>>>>>>>>>>> <list_name>,
>>>>>>>>>>>>>>>>>>>>> &my_local_list);
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Only used for internal list iterations, not 
>>>>>>>>>>>>>>>>>>>>> meant to be
>>>>>>>>>>>>>>>>>>>>> exposed to the outside
>>>>>>>>>>>>>>>>>>>>> + * world.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, __list_name,
>>>>>>>>>>>>>>>>>>>>> __local_list, __vm_bo)    \
>>>>>>>>>>>>>>>>>>>>> +       for (__vm_bo = 
>>>>>>>>>>>>>>>>>>>>> get_next_vm_bo_from_list(__gpuvm,
>>>>>>>>>>>>>>>>>>>>> __list_name,           \
>>>>>>>>>>>>>>>>>>>>> +                                               __local_list, 
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> NULL);            \
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> __vm_bo;
>>>>>>>>>>>>>>>>>>>>>        \
>>>>>>>>>>>>>>>>>>>>> +            __vm_bo = 
>>>>>>>>>>>>>>>>>>>>> get_next_vm_bo_from_list(__gpuvm,
>>>>>>>>>>>>>>>>>>>>> __list_name,           \
>>>>>>>>>>>>>>>>>>>>> +                                               __local_list, 
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> __vm_bo))         \
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements 
>>>>>>>>>>>>>>>>>>>>> back to their
>>>>>>>>>>>>>>>>>>>>> original list
>>>>>>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're 
>>>>>>>>>>>>>>>>>>>>> iterating on
>>>>>>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list 
>>>>>>>>>>>>>>>>>>>>> used to store
>>>>>>>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * When we're done iterating a vm_bo list, we 
>>>>>>>>>>>>>>>>>>>>> should call
>>>>>>>>>>>>>>>>>>>>> restore_vm_bo_list()
>>>>>>>>>>>>>>>>>>>>> + * to restore the original state and let new 
>>>>>>>>>>>>>>>>>>>>> iterations take
>>>>>>>>>>>>>>>>>>>>> place.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name,
>>>>>>>>>>>>>>>>>>>>> __local_list)                         \
>>>>>>>>>>>>>>>>>>>>> +       do
>>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>>                  \
>>>>>>>>>>>>>>>>>>>>> +               /* Merge back the two lists, 
>>>>>>>>>>>>>>>>>>>>> moving local
>>>>>>>>>>>>>>>>>>>>> list elements to the          \
>>>>>>>>>>>>>>>>>>>>> +                * head to preserve previous 
>>>>>>>>>>>>>>>>>>>>> ordering, in
>>>>>>>>>>>>>>>>>>>>> case it matters.              \
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> */
>>>>>>>>>>>>>>>>>>>>>            \
>>>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                                \
>>>>>>>>>>>>>>>>>>>>> +               list_splice(__local_list, &(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>>> __list_name.list);                \
>>>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                              \
>>>>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into 
>>>>>>>>>>>>>>>>>>>>> the given
>>>>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Inserts the given @__vm_bo into the list 
>>>>>>>>>>>>>>>>>>>>> specified by
>>>>>>>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo,
>>>>>>>>>>>>>>>>>>>>> __list_name)                            \
>>>>>>>>>>>>>>>>>>>>> +       do
>>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>>          \
>>>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                    \
>>>>>>>>>>>>>>>>>>>>> +               if (list_empty(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name))             \
>>>>>>>>>>>>>>>>>>>>> +                       list_add_tail(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name,       \
>>>>>>>>>>>>>>>>>>>>> + &(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>>> __list_name.list);        \
>>>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                  \
>>>>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from 
>>>>>>>>>>>>>>>>>>>>> the given
>>>>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to insert into
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Removes the given @__vm_bo from the list 
>>>>>>>>>>>>>>>>>>>>> specified by
>>>>>>>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo,
>>>>>>>>>>>>>>>>>>>>> __list_name)                            \
>>>>>>>>>>>>>>>>>>>>> +       do
>>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>>          \
>>>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                    \
>>>>>>>>>>>>>>>>>>>>> +               if (!list_empty(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name))            \
>>>>>>>>>>>>>>>>>>>>> +                       list_del_init(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name);      \
>>>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                  \
>>>>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo 
>>>>>>>>>>>>>>>>>>>>> *vm_bo);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>     #define 
>>>>>>>>>>>>>>>>>>>>> to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>>>>>>>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>>>>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>>>>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct 
>>>>>>>>>>>>>>>>>>>>> drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>> struct drm_device *drm,
>>>>>>>>>>>>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>          drm_gpuva_check_overflow(start_offset,
>>>>>>>>>>>>>>>>>>>>> range);
>>>>>>>>>>>>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>>>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>>>>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct 
>>>>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>>>>> *gpuvm)
>>>>>>>>>>>>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root), 
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>               "GPUVA tree is not empty, 
>>>>>>>>>>>>>>>>>>>>> potentially leaking
>>>>>>>>>>>>>>>>>>>>> memory.\n");
>>>>>>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), 
>>>>>>>>>>>>>>>>>>>>> "Extobj list
>>>>>>>>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), 
>>>>>>>>>>>>>>>>>>>>> "Evict list
>>>>>>>>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all 
>>>>>>>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all 
>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>>>>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Note: This function is safe against concurrent 
>>>>>>>>>>>>>>>>>>>>> insertion
>>>>>>>>>>>>>>>>>>>>> and removal of
>>>>>>>>>>>>>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>>>>>>>>>>>>>> concurrent usage itself.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Drivers need to make sure to protect this case 
>>>>>>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>> either an outer VM lock
>>>>>>>>>>>>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before 
>>>>>>>>>>>>>>>>>>>>> this function
>>>>>>>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that 
>>>>>>>>>>>>>>>>>>>>> the GPUVM's
>>>>>>>>>>>>>>>>>>>>> dma-resv lock ensures
>>>>>>>>>>>>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, 
>>>>>>>>>>>>>>>>>>>>> &extobjs,
>>>>>>>>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, 
>>>>>>>>>>>>>>>>>>>>> vm_bo->obj,
>>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the 
>>>>>>>>>>>>>>>>>>>>> loop. */
>>>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs 
>>>>>>>>>>>>>>>>>>>>> mapped within
>>>>>>>>>>>>>>>>>>>>> a given range
>>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all 
>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects
>>>>>>>>>>>>>>>>>>>>> mapped between @addr
>>>>>>>>>>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, 
>>>>>>>>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>>>>>>>>> drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>> +                       u64 addr, u64 range, 
>>>>>>>>>>>>>>>>>>>>> unsigned int
>>>>>>>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>>>>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, 
>>>>>>>>>>>>>>>>>>>>> addr, end) {
>>>>>>>>>>>>>>>>>>>>> +               struct drm_gem_object *obj = 
>>>>>>>>>>>>>>>>>>>>> va->gem.obj;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, obj,
>>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>>> +                       return ret;
>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Additionally, when calling this function with
>>>>>>>>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>>>>>>>>>>>>>>> + * being set the driver receives the given @fn 
>>>>>>>>>>>>>>>>>>>>> callback to
>>>>>>>>>>>>>>>>>>>>> lock additional
>>>>>>>>>>>>>>>>>>>>> + * dma-resv in the context of the &drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>>>>> instance.
>>>>>>>>>>>>>>>>>>>>> Typically, drivers
>>>>>>>>>>>>>>>>>>>>> + * would call drm_exec_prepare_obj() from within 
>>>>>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>>> callback.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, 
>>>>>>>>>>>>>>>>>>>>> exec,
>>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +               ret = 
>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>>>>>>>>>>>>> +                       ret = 
>>>>>>>>>>>>>>>>>>>>> vm_exec->extra.fn(vm_exec,
>>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>>> +                       drm_exec_retry_on_contention(exec); 
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> +                       if (ret)
>>>>>>>>>>>>>>>>>>>>> +                               goto err;
>>>>>>>>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
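
As an aside, a minimal driver-side sketch of how the helper above is meant to be used together with the @extra callback could look as follows; my_job, my_lock_extra() and extra_obj are hypothetical driver names, only the drm_gpuvm_exec_*() and drm_exec_prepare_obj() calls are taken from the patch:

static int my_lock_extra(struct drm_gpuvm_exec *vm_exec, unsigned int num_fences)
{
	struct my_job *job = vm_exec->extra.priv;

	/* Lock one additional BO which is not mapped in the VM. */
	return drm_exec_prepare_obj(&vm_exec->exec, job->extra_obj, num_fences);
}

static int my_submit_locked_section(struct my_job *job)
{
	struct drm_gpuvm_exec vm_exec = {
		.vm = job->vm,
		.extra.fn = my_lock_extra,
		.extra.priv = job,
	};
	int ret;

	ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
	if (ret)
		return ret;

	/* ... validate / submit / add fences while everything is locked ... */

	drm_gpuvm_exec_unlock(&vm_exec);
	return 0;
}

The struct drm_gpuvm_exec lives on the stack, exactly as a plain struct drm_exec would.
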
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +static int
>>>>>>>>>>>>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, unsigned int num_fences)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       return drm_exec_prepare_array(&vm_exec->exec, args->objs,
>>>>>>>>>>>>>>>>>>>>> +                                     args->num_objs, num_fences);
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all dma-resv of all associated BOs
>>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>>>>>>>>>>>>> + * @num_objs: the number of additional &drm_gem_objects to lock
>>>>>>>>>>>>>>>>>>>>> + * @num_fences: the number of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Acquires the dma-resv locks of all &drm_gem_objects the given
>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of, plus the ones given through @objs.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>> +                         struct drm_gem_object **objs,
>>>>>>>>>>>>>>>>>>>>> +                         unsigned int num_objs,
>>>>>>>>>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>>>>>>> +       } args;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>>>>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>>>>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, num_fences, interruptible);
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
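
Assuming two extra GEM objects obj_a and obj_b (hypothetical names), the array variant would be used roughly like this:

	struct drm_gem_object *extra[] = { obj_a, obj_b };
	struct drm_gpuvm_exec vm_exec = { .vm = gpuvm };
	int ret;

	ret = drm_gpuvm_exec_lock_array(&vm_exec, extra, ARRAY_SIZE(extra),
					1, true);
	if (ret)
		return ret;

	/* ... work under the locks ... */

	drm_gpuvm_exec_unlock(&vm_exec);
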
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs mapped within a given range
>>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>>>>>>> + * @num_fences: the number of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Acquires the dma-resv locks of all &drm_gem_objects mapped between @addr
>>>>>>>>>>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>> +                         u64 addr, u64 range,
>>>>>>>>>>>>>>>>>>>>> +                         unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>>> +                         bool interruptible)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_range(gpuvm, exec, addr, range,
>>>>>>>>>>>>>>>>>>>>> +                                             num_fences);
>>>>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs marked as evicted
>>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback for all evicted buffer
>>>>>>>>>>>>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = gpuvm->ops;
>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>>>>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, &evict, vm_bo) {
>>>>>>>>>>>>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv);
>>>>>>>>>>>>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the loop. */
>>>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
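
Put together with the locking helpers above, a typical exec path would presumably look roughly like the sketch below; the chosen fence usages and my_run_job() are hypothetical, only the drm_gpuvm_*() calls come from this patch:

	ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
	if (ret)
		return ret;

	/* Re-validate whatever has been evicted since the last submission. */
	ret = drm_gpuvm_validate(gpuvm);
	if (ret)
		goto err_unlock;

	fence = my_run_job(job);		/* hypothetical */
	drm_gpuvm_exec_resv_add_fence(&vm_exec, fence,
				      DMA_RESV_USAGE_BOOKKEEP,
				      DMA_RESV_USAGE_BOOKKEEP);

	drm_gpuvm_exec_unlock(&vm_exec);
	return 0;

err_unlock:
	drm_gpuvm_exec_unlock(&vm_exec);
	return ret;
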
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_resv_add_fence() - add fence to private and all extobj dma-resv
>>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>> +                        struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>> +                        struct dma_fence *fence,
>>>>>>>>>>>>>>>>>>>>> +                        enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>>>>>>>> +                        enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>>>>>>>>>>>>> +       unsigned long index;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, index, obj) {
>>>>>>>>>>>>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>>>>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>>>>>>>>>>>>> +                                  drm_gpuvm_is_extobj(gpuvm, obj) ?
>>>>>>>>>>>>>>>>>>>>> +                                  extobj_usage : private_usage);
>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new instance of struct drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>>>>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>>>>>>>>>>>>          return vm_bo;
>>>>>>>>>>>>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>>>>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>>>>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct kref *kref)
>>>>>>>>>>>>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the reference of
>>>>>>>>>>>>>>>>>>>>>      *
>>>>>>>>>>>>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * If the reference count drops to zero, the &gpuvm_bo is destroyed, which
>>>>>>>>>>>>>>>>>>>>> + * includes removing it from the GEM's gpuva list. Hence, if a call to this
>>>>>>>>>>>>>>>>>>>>> + * function can potentially let the reference count drop to zero, the caller
>>>>>>>>>>>>>>>>>>>>> + * must hold the dma-resv or driver specific GEM gpuva lock.
>>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>>     void
>>>>>>>>>>>>>>>>>>>>>     drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>>>>>>     __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>                         struct drm_gem_object *obj)
>>>>>>>>>>>>>>>>>>>>> @@ -938,6 +1412,48 @@ drm_gpuvm_bo_obtain_prealloc(struct drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the &drm_gpuvm_bo to its &drm_gpuvm's
>>>>>>>>>>>>>>>>>>>>> + * extobj list
>>>>>>>>>>>>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's extobj list.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if it is not on the
>>>>>>>>>>>>>>>>>>>>> + * list already and if the corresponding &drm_gem_object actually is an
>>>>>>>>>>>>>>>>>>>>> + * external object.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, extobj);
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
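
For context, in a driver's VM_BIND/map path this would presumably be paired with obtaining the &drm_gpuvm_bo first, e.g. along these lines (error unwinding trimmed; everything except the drm_gpuvm_bo_*() calls is hypothetical):

	vm_bo = drm_gpuvm_bo_obtain(gpuvm, obj);
	if (IS_ERR(vm_bo))
		return PTR_ERR(vm_bo);

	/* No-op if the BO shares the VM's dma-resv, i.e. is not external. */
	drm_gpuvm_bo_extobj_add(vm_bo);
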
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to / from a
>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm's evicted list
>>>>>>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>>>>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from the evicted list of all
>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvms containing a mapping of this &drm_gem_object.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>>>>>>>>>>>>> +               if (evict)
>>>>>>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, evict);
>>>>>>>>>>>>>>>>>>>>> +               else
>>>>>>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, evict);
>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
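
As discussed elsewhere in this thread, the expectation is that this helper is called from the driver's ttm_device_funcs::move callback, where the BO's dma-resv lock is already held. A rough, hypothetical sketch (all my_* names invented, only drm_gpuvm_bo_evict() comes from the patch):

static int my_bo_move(struct ttm_buffer_object *bo, bool evict,
		      struct ttm_operation_ctx *ctx,
		      struct ttm_resource *new_mem,
		      struct ttm_place *hop)
{
	struct drm_gem_object *obj = &my_bo_from_ttm(bo)->base;
	int ret;

	ret = my_move_memory(bo, ctx, new_mem);	/* hypothetical */
	if (ret)
		return ret;

	/* Tell all GPU-VMs mapping this BO that it needs revalidation. */
	drm_gpuvm_bo_evict(obj, evict);
	return 0;
}
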
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>     static int
>>>>>>>>>>>>>>>>>>>>> __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>                     struct drm_gpuva *va)
>>>>>>>>>>>>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>>>>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>>>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>>>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>>>>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>>>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>>>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>>>>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>>>>>>>>>>>>           * space
>>>>>>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>>> +        * @extobj: structure holding the extobj list
>>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos serving as
>>>>>>>>>>>>>>>>>>>>> +                * external object
>>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the extobj list
>>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>>>>>>> +       } extobj;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>>> +        * @evict: structure holding the evict list and evict list lock
>>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>>> +                * @list: &list_head storing &drm_gpuvm_bos currently being
>>>>>>>>>>>>>>>>>>>>> +                * evicted
>>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the evict list
>>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>>>>>>> +       } evict;
>>>>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>>>>>>>>                      const struct drm_gpuvm_ops *ops);
>>>>>>>>>>>>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the given &drm_gem_object is an
>>>>>>>>>>>>>>>>>>>>> + * external object
>>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Returns: true if the &drm_gem_object's &dma_resv differs from the
>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm's &dma_resv, false otherwise.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>> +                                      struct drm_gem_object *obj)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>>>>>>>>>>>> __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct 
>>>>>>>>>>>>>>>>>>>>> drm_gpuva *va)
>>>>>>>>>>>>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, 
>>>>>>>>>>>>>>>>>>>>> next__, gpuvm__)
>>>>>>>>>>>>>>>>>>>>> \
>>>>>>>>>>>>>>>>>>>>>          list_for_each_entry_safe(va__, next__, 
>>>>>>>>>>>>>>>>>>>>> &(gpuvm__)-
>>>>>>>>>>>>>>>>>>>>>> rb.list, rb.entry)
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of &drm_exec
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * This structure should be created on the stack as &drm_exec should be.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Optionally, @extra can be set in order to lock additional &drm_gem_objects.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA reservations
>>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>>> +        * @extra: Callback and corresponding private data for the driver to
>>>>>>>>>>>>>>>>>>>>> +        * lock arbitrary additional &drm_gem_objects.
>>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>>> +                * @fn: The driver callback to lock additional &drm_gem_objects.
>>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>> +                         unsigned int num_fences);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>>> +                * @priv: driver private data for the @fn callback
>>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>>> +               void *priv;
>>>>>>>>>>>>>>>>>>>>> +       } extra;
>>>>>>>>>>>>>>>>>>>>> +};
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVM's common dma-resv
>>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>>>>>>>>>>>>> + * @num_fences: the number of &dma_fences to reserve
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVM's dummy &drm_gem_object.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Using this function directly, it is the driver's responsibility to call
>>>>>>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on failure.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +static inline int
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>> +                    struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>> +                    unsigned int num_fences)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       return drm_exec_prepare_obj(exec, &gpuvm->d_obj, num_fences);
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>> +                             unsigned int num_fences);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>> +                           struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>> +                           u64 addr, u64 range,
>>>>>>>>>>>>>>>>>>>>> +                           unsigned int num_fences);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>> +                             struct drm_gem_object **objs,
>>>>>>>>>>>>>>>>>>>>> +                             unsigned int num_objs,
>>>>>>>>>>>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>> +                             u64 addr, u64 range,
>>>>>>>>>>>>>>>>>>>>> +                             unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>>> +                             bool interruptible);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
>>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
>>>>>>>>>>>>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>> +                             struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>>>>>>>>>>> +                             enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>>>>>>>> +                             enum dma_resv_usage extobj_usage);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence() - add fence to private and all extobj
>>>>>>>>>>>>>>>>>>>>> + * dma-resv
>>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>> +                             struct dma_fence *fence,
>>>>>>>>>>>>>>>>>>>>> +                             enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>>>>>>>> +                             enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
>>>>>>>>>>>>>>>>>>>>> +                                private_usage, extobj_usage);
>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a &drm_gpuvm and
>>>>>>>>>>>>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>>>>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>>>>>>>>>>>>                          * gpuva list.
>>>>>>>>>>>>>>>>>>>>>                          */
>>>>>>>>>>>>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>>>>>>> +                        * @extobj: List entry to attach to the &drm_gpuvm's
>>>>>>>>>>>>>>>>>>>>> +                        * extobj list.
>>>>>>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>>>>>>> +                        * @evict: List entry to attach to the &drm_gpuvm's
>>>>>>>>>>>>>>>>>>>>> +                        * evict list.
>>>>>>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>>>>>>>>>>>>                  } entry;
>>>>>>>>>>>>>>>>>>>>>          } list;
>>>>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
>>>>>>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to 
>>>>>>>>>>>>>>>>>>>>> walk over a
>>>>>>>>>>>>>>>>>>>>> list of &drm_gpuva
>>>>>>>>>>>>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in 
>>>>>>>>>>>>>>>>>>>>> each
>>>>>>>>>>>>>>>>>>>>> iteration step
>>>>>>>>>>>>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>>>>>>>>>>>>           * used.
>>>>>>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op 
>>>>>>>>>>>>>>>>>>>>> *op, void
>>>>>>>>>>>>>>>>>>>>> *priv);
>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>>>>>>> +        * Drivers receive this callback for every evicted &drm_gem_object
>>>>>>>>>>>>>>>>>>>>> +        * being mapped in the corresponding &drm_gpuvm.
>>>>>>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>>>>>>> +        * Typically, drivers would call their driver specific variant of
>>>>>>>>>>>>>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>>>>>>>>>>>>>     };
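
A driver's bo_validate() hook would then presumably be a thin wrapper around its TTM validation, along these lines (the my_* names and the cast helper are hypothetical, placement selection omitted):

static int my_gpuvm_bo_validate(struct drm_gem_object *obj)
{
	struct ttm_operation_ctx ctx = { .interruptible = true };
	struct my_bo *bo = to_my_bo(obj);	/* hypothetical */

	/* Move the BO back into a GPU-accessible placement. */
	return ttm_bo_validate(&bo->ttm_bo, &bo->placement, &ctx);
}

static const struct drm_gpuvm_ops my_gpuvm_ops = {
	.bo_validate = my_gpuvm_bo_validate,
};
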
>>>>>>>>>>>>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, 
>>>>>>>>>>>>>>>>>>>>> void *priv,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-20 13:48                                           ` Christian König
@ 2023-09-20 14:02                                             ` Thomas Hellström
  2023-09-20 14:11                                               ` Christian König
  0 siblings, 1 reply; 77+ messages in thread
From: Thomas Hellström @ 2023-09-20 14:02 UTC (permalink / raw)
  To: Christian König, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

Hi

On 9/20/23 15:48, Christian König wrote:
> Am 20.09.23 um 15:38 schrieb Thomas Hellström:
>>
>> On 9/20/23 15:06, Christian König wrote:
>>>
>>>
>>> Am 20.09.23 um 14:06 schrieb Thomas Hellström:
>>>>
>>>> On 9/20/23 12:51, Christian König wrote:
>>>>> Am 20.09.23 um 09:44 schrieb Thomas Hellström:
>>>>>> Hi,
>>>>>>
>>>>>> On 9/20/23 07:37, Christian König wrote:
>>>>>>> Am 19.09.23 um 17:23 schrieb Thomas Hellström:
>>>>>>>>
>>>>>>>> On 9/19/23 17:16, Danilo Krummrich wrote:
>>>>>>>>> On 9/19/23 14:21, Thomas Hellström wrote:
>>>>>>>>>> Hi Christian
>>>>>>>>>>
>>>>>>>>>> On 9/19/23 14:07, Christian König wrote:
>>>>>>>>>>> Am 13.09.23 um 17:46 schrieb Danilo Krummrich:
>>>>>>>>>>>> On 9/13/23 17:33, Christian König wrote:
>>>>>>>>>>>>> Am 13.09.23 um 17:15 schrieb Danilo Krummrich:
>>>>>>>>>>>>>> On 9/13/23 16:26, Christian König wrote:
>>>>>>>>>>>>>>> Am 13.09.23 um 14:16 schrieb Danilo Krummrich:
>>>>>>>>>>>>>>>> As mentioned in a different mail thread, the reply is 
>>>>>>>>>>>>>>>> based on the assumption
>>>>>>>>>>>>>>>> that we don't support anything else than GPUVM updates 
>>>>>>>>>>>>>>>> from the IOCTL.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think that this assumption is incorrect.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Well, more precisely I should have said "don't support 
>>>>>>>>>>>>>> GPUVM updated from within
>>>>>>>>>>>>>> fence signaling critical sections". And looking at the 
>>>>>>>>>>>>>> code, that doesn't seem what
>>>>>>>>>>>>>> you're doing there.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Vulkan is just once specific use case, but this here 
>>>>>>>>>>>>>>> should probably be able to handle other use cases as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Especially with HMM you get the requirement that you 
>>>>>>>>>>>>>>> need to be able to invalidate GPUVM mappings without 
>>>>>>>>>>>>>>> grabbing a reservation lock.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What do you mean with "invalidate GPUVM mappings" in this 
>>>>>>>>>>>>>> context? drm_gpuvm_bo_evict()
>>>>>>>>>>>>>> should only be called from a ttm_device_funcs::move 
>>>>>>>>>>>>>> callback, we should hold the dma-resv
>>>>>>>>>>>>>> lock there.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Well the question is which dma-resv lock do we hold?
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the move callback we only hold the dma-resv lock of the 
>>>>>>>>>>>>> BO which is moved, but when that is a shared BO then 
>>>>>>>>>>>>> that's not the same as the one for the VM.
>>>>>>>>>>>>
>>>>>>>>>>>> Correct, Thomas' idea was to use the GEM's dma_resv lock to 
>>>>>>>>>>>> protect drm_gpuvm_bo::evicted
>>>>>>>>>>>> and then actually move the drm_gpuvm_bo to the VM's evicted 
>>>>>>>>>>>> list once we grabbed all
>>>>>>>>>>>> dma-resv locks when locking the VM's BOs using drm_exec. We 
>>>>>>>>>>>> can remove them from the evicted
>>>>>>>>>>>> list on validate(). This way we never touch the evicted 
>>>>>>>>>>>> list without holding at least the VM's
>>>>>>>>>>>> dma-resv lock.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you have any concerns about that?
>>>>>>>>>>>
>>>>>>>>>>> Scratching my head a bit how that is supposed to work.
>>>>>>>>>>>
>>>>>>>>>>> This implies that you go over all the evicted BOs during 
>>>>>>>>>>> validation and not just the one mentioned in the CS.
>>>>>>>>>>>
>>>>>>>>>>> That might work for Vulkan, but is pretty much a no-go for 
>>>>>>>>>>> OpenGL.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> See what the eviction lock in amdgpu is doing for example.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The eviction_lock seems to protect a VM state "evicting" 
>>>>>>>>>>>>>> of whether any BO that
>>>>>>>>>>>>>> is associated with the VM is currently evicting. At the 
>>>>>>>>>>>>>> same time amdgpu protects
>>>>>>>>>>>>>> the eviceted list of the VM with a different lock. So 
>>>>>>>>>>>>>> this seems to be entirely
>>>>>>>>>>>>>> unrelated. Tracking a "currently evicting" state is not 
>>>>>>>>>>>>>> part of the GPUVM
>>>>>>>>>>>>>> implementation currently and hence nothing would change 
>>>>>>>>>>>>>> for amdgpu there.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry for the confusion we use different terminology in 
>>>>>>>>>>>>> amdgpu.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The eviction lock and evicted state is for the VM page 
>>>>>>>>>>>>> tables, e.g. if the whole VM is currently not used and 
>>>>>>>>>>>>> swapped out or even de-allocated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is necessary because we have cases where we need to 
>>>>>>>>>>>>> access the VM data without holding the dma-resv lock of 
>>>>>>>>>>>>> this VM. Especially figuring out which parts of an address 
>>>>>>>>>>>>> space contain mappings and which doesn't.
>>>>>>>>>>>>
>>>>>>>>>>>> I think this is fine, this has nothing to do with lists of 
>>>>>>>>>>>> evicted GEM objects or external GEM
>>>>>>>>>>>> objects, right? Marking mappings (drm_gpuva) as invalidated 
>>>>>>>>>>>> (DRM_GPUVA_INVALIDATED) or accessing
>>>>>>>>>>>> the VA space does not require any dma-resv locks.
>>>>>>>>>>>
>>>>>>>>>>> I hope so, but I'm not 100% sure.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is a requirement which comes with HMM handling, you 
>>>>>>>>>>>>> won't see this with Vulkan (or OpenGL, VAAPI etc..).
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The invalidation lock on the other hand is what in this 
>>>>>>>>>>>>> discussion is called eviction lock. This one is needed 
>>>>>>>>>>>>> because what I wrote above, during the move callback only 
>>>>>>>>>>>>> the dma-resv of the BO which is moved is locked, but not 
>>>>>>>>>>>>> necessarily the dma-resv of the VM.
>>>>>>>>>>>>
>>>>>>>>>>>> That's yet another thing, right? This is used to track 
>>>>>>>>>>>> whether *any* BO that belongs to the VM is
>>>>>>>>>>>> currently being evicted, correct? As mentioned, as by now 
>>>>>>>>>>>> this is not supported in GPUVM and hence
>>>>>>>>>>>> would be the same driver specific code with the same driver 
>>>>>>>>>>>> specifc lock.
>>>>>>>>>>>
>>>>>>>>>>> That is most likely a show stopper using this for OpenGL 
>>>>>>>>>>> based workloads as far as I can see. For those you need to 
>>>>>>>>>>> able to figure out which non-VM BOs have been evicted and 
>>>>>>>>>>> which parts of the VM needs updates.
>>>>>>>>>>
>>>>>>>>>> We identify those with a bool in the gpuvm_bo, and that bool 
>>>>>>>>>> is protected by the bo_resv. In essence, the "evicted" list 
>>>>>>>>>> must be made up-to-date with all relevant locks held before 
>>>>>>>>>> traversing in the next exec.
>>>>>>>>>
>>>>>>>>> What I still miss with this idea is how do we find all the 
>>>>>>>>> drm_gpuvm_bo structures with the evicted bool set to true? 
>>>>>>>>> When doing the drm_exec dance we come across all external ones 
>>>>>>>>> and can add them to the list if needed, but what about the BOs 
>>>>>>>>> having the VM's dma-resv?
>>>>>>>>
>>>>>>>> Oh, they can be added to the evict list directly (no bool 
>>>>>>>> needed) in the eviction code, like in v3. Since for those we 
>>>>>>>> indeed hold the VM's dma_resv since it's aliased with the 
>>>>>>>> object's dma-resv.
>>>>>>>
>>>>>>> Yeah, I wanted to note what Danilo seems to think about as well. 
>>>>>>> How do we figure out the non-VM BOs evicted?
>>>>>>>
>>>>>>> We can't walk over the list of all non-VM BOs on every 
>>>>>>> submission, that's to much overhead for cases with lots of 
>>>>>>> non-VM BOs.
>>>>>>>
>>>>>>> And we can't rely on userspace sending all non-VM BOs as used 
>>>>>>> list down to the kernel with each submission.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>
>>>>>> No, that's not needed: Mechanism below.
>>>>>>
>>>>>> 1) We maintain an evicted list. Typically protected by the vm resv.
>>>>>> 2) Each gpuvm_bo has a bool "evicted". Protected by the bo resv.
>>>>>>
>>>>>> a) Evicting a vm bo: The vm resv is held by the eviction code. 
>>>>>> Just put it on the evicted list.
>>>>>> b) Evicting a shared/external bo: The bo resv is held by the 
>>>>>> eviction code. Set the "evicted" bool
>>>>>> c) Validating the evicted list on exec:
>>>>>
>>>>>
>>>>>> Loop through all *external/shared* bos.
>>>>>
>>>>> And this is what you can't do. For Vulkan it probably doesn't 
>>>>> matter, but for OpenGL and especially multimedia we have much more 
>>>>> BOs on the shared list than what's allocated for the VM.
>>>>
>>>> But you need to lock- and fence all those so you need to loop 
>>>> through them anyway, so we're still O(n_shared)? Or is there some 
>>>> clever optimization in amdgpu?
>>>
>>> Why should I lock and fence them? Only the BOs in the relocation 
>>> list are locked and fenced.
>>
>> Do you by "relocation" list refer to what gpuvm calls "evict" list or 
>> something else? Like the relocaton/validation list that used to be 
>> sent from user-space for non-VM_BIND vms?
>
> The BOs send into the kernel with each command submission on the 
> classic IOCTLs.
>
>>
>> The vm bos plus the external/shared bos bound to the VM (the external 
>> list) are the bos being referenced by the current batch. So the bos 
>> on the VM's external list are the ones being locked and fenced and 
>> checked for eviction. If they weren't they could be evicted before 
>> the current batch completes?
>
> That only applies to a certain use case, e.g. Vulkan or user mode queues.
>
> Multimedia APIs and especially OpenGL work differently, here only the 
> BOs mentioned in the relocation list are guaranteed to not be evicted.
>
> This is intentional because those APIs tend to over allocate memory 
> all the time, so for good performance you need to be able to evict BOs 
> from the VM while other parts of the VM are currently in use.
>
> Without that especially OpenGL performance would be completely 
> crippled at least on amdgpu.

OK, I've always wondered how overcommitting a local VM would be handled 
on VM_BIND, where we don't have the relocation list, at least not in xe, 
so we have what you refer to as the user mode queues.

I figure those APIs that suffer from overcommitting would maintain a 
"current working set" in user-space and send changes as deltas to the 
kernel as unbinds/binds. Or at least "can be unbound / can no longer be 
unbound" advice.

This may turn out interesting.
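
To make the scheme quoted above a bit more concrete, the exec-time step (c) could look roughly like the pseudo code below; this assumes the gpuvm_bo grows an "evicted" bool protected by the BO's resv, and drm_exec retry handling is omitted:

	/* VM resv is locked, so the evicted list may be modified. */
	list_for_each_entry(vm_bo, &gpuvm->extobj.list, list.entry.extobj) {
		ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
		if (ret)
			return ret;

		/* BO resv now held as well, so "evicted" is stable. */
		if (vm_bo->evicted) {
			list_move_tail(&vm_bo->list.entry.evict,
				       &gpuvm->evict.list);
			vm_bo->evicted = false;
		}
	}
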

/Thomas




>
>
> Regards,
> Christian.
>
>>
>> Thanks,
>>
>> Thomas
>>
>>
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> I think with some UMDs, xe might end up with similar large lists...
>>>>
>>>> /Thomas
>>>>
>>>>
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> Lock them. After locking, check the "evicted" bool, if it's true. 
>>>>>> put the bo on the evicted list (we hold the VM resv at this 
>>>>>> point) and clear the "evicted" bool. Note that other vms will 
>>>>>> have their own gpuvm_bo which is marked evicted.
>>>>>>
>>>>>> I have this coded up in a patch for Xe and it seems to be working 
>>>>>> properly.
>>>>>>
>>>>>> /Thomas
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> /Thomas
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> If you mean that we need to unbind all vmas of all vms of 
>>>>>>>>>> evicted bos before evicting, We don't do that, at least not 
>>>>>>>>>> in Xe, since evicting we wait for VM idle, and it cant access 
>>>>>>>>>> anything through the stale vmas until they have been 
>>>>>>>>>> revalidated and rebound.
>>>>>>>>>>
>>>>>>>>>> /Thomas
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Sep 13, 2023 at 11:14:46AM +0200, Thomas 
>>>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, 2023-09-13 at 01:36 +0200, Danilo Krummrich 
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 09:23:08PM +0200, Thomas 
>>>>>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>>>>>> On 9/12/23 18:50, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>>>>>> On Tue, Sep 12, 2023 at 06:20:32PM +0200, Thomas 
>>>>>>>>>>>>>>>>>>>> Hellström wrote:
>>>>>>>>>>>>>>>>>>>>> Hi, Danilo,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On 9/9/23 17:31, Danilo Krummrich wrote:
>>>>>>>>>>>>>>>>>>>>>> So far the DRM GPUVA manager offers common 
>>>>>>>>>>>>>>>>>>>>>> infrastructure to
>>>>>>>>>>>>>>>>>>>>>> track GPU VA
>>>>>>>>>>>>>>>>>>>>>> allocations and mappings, generically connect GPU 
>>>>>>>>>>>>>>>>>>>>>> VA mappings
>>>>>>>>>>>>>>>>>>>>>> to their
>>>>>>>>>>>>>>>>>>>>>> backing buffers and perform more complex mapping 
>>>>>>>>>>>>>>>>>>>>>> operations
>>>>>>>>>>>>>>>>>>>>>> on the GPU VA
>>>>>>>>>>>>>>>>>>>>>> space.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> However, there are more design patterns commonly 
>>>>>>>>>>>>>>>>>>>>>> used by
>>>>>>>>>>>>>>>>>>>>>> drivers, which
>>>>>>>>>>>>>>>>>>>>>> can potentially be generalized in order to make 
>>>>>>>>>>>>>>>>>>>>>> the DRM GPUVA
>>>>>>>>>>>>>>>>>>>>>> manager
>>>>>>>>>>>>>>>>>>>>>> represent a basic GPU-VM implementation. In this 
>>>>>>>>>>>>>>>>>>>>>> context,
>>>>>>>>>>>>>>>>>>>>>> this patch aims
>>>>>>>>>>>>>>>>>>>>>> at generalizing the following elements.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 1) Provide a common dma-resv for GEM objects not 
>>>>>>>>>>>>>>>>>>>>>> being used
>>>>>>>>>>>>>>>>>>>>>> outside of
>>>>>>>>>>>>>>>>>>>>>>       this GPU-VM.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 2) Provide tracking of external GEM objects (GEM 
>>>>>>>>>>>>>>>>>>>>>> objects
>>>>>>>>>>>>>>>>>>>>>> which are
>>>>>>>>>>>>>>>>>>>>>>       shared with other GPU-VMs).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 3) Provide functions to efficiently lock all GEM 
>>>>>>>>>>>>>>>>>>>>>> objects dma-
>>>>>>>>>>>>>>>>>>>>>> resv the
>>>>>>>>>>>>>>>>>>>>>>       GPU-VM contains mappings of.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 4) Provide tracking of evicted GEM objects the 
>>>>>>>>>>>>>>>>>>>>>> GPU-VM
>>>>>>>>>>>>>>>>>>>>>> contains mappings
>>>>>>>>>>>>>>>>>>>>>>       of, such that validation of evicted GEM 
>>>>>>>>>>>>>>>>>>>>>> objects is
>>>>>>>>>>>>>>>>>>>>>> accelerated.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 5) Provide some convinience functions for common 
>>>>>>>>>>>>>>>>>>>>>> patterns.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Rather than being designed as a "framework", the 
>>>>>>>>>>>>>>>>>>>>>> target is to
>>>>>>>>>>>>>>>>>>>>>> make all
>>>>>>>>>>>>>>>>>>>>>> features appear as a collection of optional 
>>>>>>>>>>>>>>>>>>>>>> helper functions,
>>>>>>>>>>>>>>>>>>>>>> such that
>>>>>>>>>>>>>>>>>>>>>> drivers are free to make use of the DRM GPUVA 
>>>>>>>>>>>>>>>>>>>>>> managers basic
>>>>>>>>>>>>>>>>>>>>>> functionality and opt-in for other features 
>>>>>>>>>>>>>>>>>>>>>> without setting
>>>>>>>>>>>>>>>>>>>>>> any feature
>>>>>>>>>>>>>>>>>>>>>> flags, just by making use of the corresponding 
>>>>>>>>>>>>>>>>>>>>>> functions.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Big kudos to Boris Brezillon for his help to 
>>>>>>>>>>>>>>>>>>>>>> figure out
>>>>>>>>>>>>>>>>>>>>>> locking for drivers
>>>>>>>>>>>>>>>>>>>>>> updating the GPU VA space within the fence 
>>>>>>>>>>>>>>>>>>>>>> signalling path.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Suggested-by: Matthew Brost 
>>>>>>>>>>>>>>>>>>>>>> <matthew.brost@intel.com>
>>>>>>>>>>>>>>>>>>>>>> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
>>>>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>>>>> drivers/gpu/drm/drm_gpuvm.c | 516
>>>>>>>>>>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>>>>>>>>>>>> include/drm/drm_gpuvm.h | 197 ++++++++++++++
>>>>>>>>>>>>>>>>>>>>>>     2 files changed, 713 insertions(+)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>>>>> b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>>>>> index f4411047dbb3..8e62a043f719 100644
>>>>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_gpuvm.c
>>>>>>>>>>>>>>>>>>>>>> @@ -73,6 +73,21 @@
>>>>>>>>>>>>>>>>>>>>>>      * &drm_gem_object list of &drm_gpuvm_bos for 
>>>>>>>>>>>>>>>>>>>>>> an existing
>>>>>>>>>>>>>>>>>>>>>> instance of this
>>>>>>>>>>>>>>>>>>>>>>      * particular combination. If not existent a 
>>>>>>>>>>>>>>>>>>>>>> new instance
>>>>>>>>>>>>>>>>>>>>>> is created and linked
>>>>>>>>>>>>>>>>>>>>>>      * to the &drm_gem_object.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm_bo structures, since unique for a 
>>>>>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm, are also used
>>>>>>>>>>>>>>>>>>>>>> + * as entry for the &drm_gpuvm's lists of 
>>>>>>>>>>>>>>>>>>>>>> external and
>>>>>>>>>>>>>>>>>>>>>> evicted objects. Those
>>>>>>>>>>>>>>>>>>>>>> + * list are maintained in order to accelerate 
>>>>>>>>>>>>>>>>>>>>>> locking of
>>>>>>>>>>>>>>>>>>>>>> dma-resv locks and
>>>>>>>>>>>>>>>>>>>>>> + * validation of evicted objects bound in a 
>>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm. For
>>>>>>>>>>>>>>>>>>>>>> instance the all
>>>>>>>>>>>>>>>>>>>>>> + * &drm_gem_object's &dma_resv of a given 
>>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm can be
>>>>>>>>>>>>>>>>>>>>>> locked by calling
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock(). Once locked drivers 
>>>>>>>>>>>>>>>>>>>>>> can call
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_validate() in
>>>>>>>>>>>>>>>>>>>>>> + * order to validate all evicted 
>>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects. It is
>>>>>>>>>>>>>>>>>>>>>> also possible to lock
>>>>>>>>>>>>>>>>>>>>>> + * additional &drm_gem_objects by providing the
>>>>>>>>>>>>>>>>>>>>>> corresponding parameters to
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() as well as open code 
>>>>>>>>>>>>>>>>>>>>>> the &drm_exec
>>>>>>>>>>>>>>>>>>>>>> loop while making
>>>>>>>>>>>>>>>>>>>>>> + * use of helper functions such as 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_prepare_range()
>>>>>>>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects().
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Every bound &drm_gem_object is treated as 
>>>>>>>>>>>>>>>>>>>>>> external object
>>>>>>>>>>>>>>>>>>>>>> when its &dma_resv
>>>>>>>>>>>>>>>>>>>>>> + * structure is different than the &drm_gpuvm's 
>>>>>>>>>>>>>>>>>>>>>> common
>>>>>>>>>>>>>>>>>>>>>> &dma_resv structure.
>>>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>>> @@ -420,6 +435,20 @@
>>>>>>>>>>>>>>>>>>>>>>      * Subsequent calls to drm_gpuvm_bo_obtain() 
>>>>>>>>>>>>>>>>>>>>>> for the same
>>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm and
>>>>>>>>>>>>>>>>>>>>>>      * &drm_gem_object must be able to observe 
>>>>>>>>>>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>>> creations and destructions
>>>>>>>>>>>>>>>>>>>>>>      * of &drm_gpuvm_bos in order to keep 
>>>>>>>>>>>>>>>>>>>>>> instances unique.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * The &drm_gpuvm's lists for keeping track of 
>>>>>>>>>>>>>>>>>>>>>> external and
>>>>>>>>>>>>>>>>>>>>>> evicted objects are
>>>>>>>>>>>>>>>>>>>>>> + * protected against concurrent insertion / 
>>>>>>>>>>>>>>>>>>>>>> removal and
>>>>>>>>>>>>>>>>>>>>>> iteration internally.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * However, drivers still need ensure to protect 
>>>>>>>>>>>>>>>>>>>>>> concurrent
>>>>>>>>>>>>>>>>>>>>>> calls to functions
>>>>>>>>>>>>>>>>>>>>>> + * iterating those lists, such as 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_validate() and
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects(). Every such 
>>>>>>>>>>>>>>>>>>>>>> function contains
>>>>>>>>>>>>>>>>>>>>>> a particular
>>>>>>>>>>>>>>>>>>>>>> + * comment and lockdep checks if possible.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Functions adding or removing entries from 
>>>>>>>>>>>>>>>>>>>>>> those lists,
>>>>>>>>>>>>>>>>>>>>>> such as
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() or 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_extobj_add() may be
>>>>>>>>>>>>>>>>>>>>>> called with external
>>>>>>>>>>>>>>>>>>>>>> + * locks being held, e.g. in order to avoid the
>>>>>>>>>>>>>>>>>>>>>> corresponding list to be
>>>>>>>>>>>>>>>>>>>>>> + * (safely) modified while potentially being 
>>>>>>>>>>>>>>>>>>>>>> iternated by
>>>>>>>>>>>>>>>>>>>>>> other API functions.
>>>>>>>>>>>>>>>>>>>>>> + * However, this is entirely optional.
>>>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>>> @@ -632,6 +661,131 @@
>>>>>>>>>>>>>>>>>>>>>>      *   }
>>>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * get_next_vm_bo_from_list() - get the next 
>>>>>>>>>>>>>>>>>>>>>> vm_bo element
>>>>>>>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're 
>>>>>>>>>>>>>>>>>>>>>> iterating on
>>>>>>>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list 
>>>>>>>>>>>>>>>>>>>>>> used to store
>>>>>>>>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>>>>>>>>> + * @__prev_vm_bo: The previous element we got from
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_get_next_cached_vm_bo()
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list 
>>>>>>>>>>>>>>>>>>>>>> iteration.
>>>>>>>>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after 
>>>>>>>>>>>>>>>>>>>>>> picking the
>>>>>>>>>>>>>>>>>>>>>> first element from
>>>>>>>>>>>>>>>>>>>>>> + * the list, so list insertion deletion can happen
>>>>>>>>>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>>>>>>>>>> Are the list spinlocks needed for that async state 
>>>>>>>>>>>>>>>>>>>>> update from
>>>>>>>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>>>>>>>> dma-fence critical section we've discussed 
>>>>>>>>>>>>>>>>>>>>> previously?
>>>>>>>>>>>>>>>>>>>> Yes, but also for other reasons, see below.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Otherwise it should be sufficient to protect the 
>>>>>>>>>>>>>>>>>>>>> lists with the
>>>>>>>>>>>>>>>>>>>>> gpuvm's resv
>>>>>>>>>>>>>>>>>>>>> (or for the extobj list with an outer lock).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> If those spinlocks are still needed in some 
>>>>>>>>>>>>>>>>>>>>> situations, perhaps
>>>>>>>>>>>>>>>>>>>>> could we
>>>>>>>>>>>>>>>>>>>>> have an option to set them to NULL (Like IIRC the 
>>>>>>>>>>>>>>>>>>>>> maple tree
>>>>>>>>>>>>>>>>>>>>> allows for)?
>>>>>>>>>>>>>>>>>>>> The evict spinlock is needed in any case, since in
>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_evict() we're
>>>>>>>>>>>>>>>>>>>> holding only the dma-resv lock from the BO this 
>>>>>>>>>>>>>>>>>>>> function gets
>>>>>>>>>>>>>>>>>>>> called for. Hence,
>>>>>>>>>>>>>>>>>>>> the spinlock protects concurrent 
>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_evict() calls with
>>>>>>>>>>>>>>>>>>>> different BOs.
>>>>>>>>>>>>>>>>>>> No. Only if you try to add external objects to the 
>>>>>>>>>>>>>>>>>>> vm's evict list
>>>>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>>>>> within the evict code. That's not necessary since 
>>>>>>>>>>>>>>>>>>> you loop through
>>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>>> external objects anyway when locking them so an 
>>>>>>>>>>>>>>>>>>> "evicted" bool in
>>>>>>>>>>>>>>>>>>> the vm_bo,
>>>>>>>>>>>>>>>>>>> protected by the bo resv would be sufficient. The 
>>>>>>>>>>>>>>>>>>> extobj locking
>>>>>>>>>>>>>>>>>>> loop can
>>>>>>>>>>>>>>>>>>> then add the bo to the evicted list.
>>>>>>>>>>>>>>>>>> And validate() can remove it while still holding all 
>>>>>>>>>>>>>>>>>> dma-resv locks,
>>>>>>>>>>>>>>>>>> neat!
>>>>>>>>>>>>>>>>>> However, what if two tasks are trying to lock the VA 
>>>>>>>>>>>>>>>>>> space
>>>>>>>>>>>>>>>>>> concurrently? What
>>>>>>>>>>>>>>>>>> do we do when the drm_gpuvm_bo's refcount drops to 
>>>>>>>>>>>>>>>>>> zero in
>>>>>>>>>>>>>>>>>> drm_gpuva_unlink()?
>>>>>>>>>>>>>>>>>> Are we guaranteed that at this point of time the 
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo is not
>>>>>>>>>>>>>>>>>> on the
>>>>>>>>>>>>>>>>>> evicted list? Because otherwise we would call 
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>>>>>>>> with the
>>>>>>>>>>>>>>>>>> dma-resv lock held, which wouldn't be allowed, since
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy()
>>>>>>>>>>>>>>>>>> might drop the last reference to the drm_gem_object 
>>>>>>>>>>>>>>>>>> and hence we'd
>>>>>>>>>>>>>>>>>> potentially
>>>>>>>>>>>>>>>>>> free the dma-resv lock while holding it, at least if 
>>>>>>>>>>>>>>>>>> it's an external
>>>>>>>>>>>>>>>>>> object.
>>>>>>>>>>>>>>>>> Easiest way in this scheme is to think of the lists as 
>>>>>>>>>>>>>>>>> being protected
>>>>>>>>>>>>>>>>> by the vm's resv lock. That means anybody calling 
>>>>>>>>>>>>>>>>> unlink() must also
>>>>>>>>>>>>>>>>> hold the vm's resv lock. (Which is OK from an UAF 
>>>>>>>>>>>>>>>>> point of view, but
>>>>>>>>>>>>>>>>> perhaps not from a locking inversion POW from an async 
>>>>>>>>>>>>>>>>> list update).
>>>>>>>>>>>>>>>> This would mean that on unlink() we'd need to hold the 
>>>>>>>>>>>>>>>> VM's resv lock and the
>>>>>>>>>>>>>>>> corresponding GEM's resv lock (in case they're not the 
>>>>>>>>>>>>>>>> same anyways) because the
>>>>>>>>>>>>>>>> VM's resv lock would protect the external / evicted 
>>>>>>>>>>>>>>>> object lists and the GEM
>>>>>>>>>>>>>>>> objects resv lock protects the GEM's list of 
>>>>>>>>>>>>>>>> drm_gpuvm_bos and the
>>>>>>>>>>>>>>>> drm_gpuvm_bo's list of drm_gpuvas.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> For extobjs an outer lock would be enough in case 
>>>>>>>>>>>>>>>>>>>> of Xe, but I
>>>>>>>>>>>>>>>>>>>> really would not
>>>>>>>>>>>>>>>>>>>> like to add even more complexity just to get the 
>>>>>>>>>>>>>>>>>>>> spinlock out of
>>>>>>>>>>>>>>>>>>>> the way in case
>>>>>>>>>>>>>>>>>>>> the driver already has an outer lock protecting 
>>>>>>>>>>>>>>>>>>>> this path.
>>>>>>>>>>>>>>>>>>> I must disagree here. These spinlocks and atomic 
>>>>>>>>>>>>>>>>>>> operations are
>>>>>>>>>>>>>>>>>>> pretty
>>>>>>>>>>>>>>>>>>> costly and as discussed earlier this type of locking 
>>>>>>>>>>>>>>>>>>> was the reason
>>>>>>>>>>>>>>>>>>> (at
>>>>>>>>>>>>>>>>>>> least according to the commit message) that made 
>>>>>>>>>>>>>>>>>>> Christian drop the
>>>>>>>>>>>>>>>>>>> XArray
>>>>>>>>>>>>>>>>>>> use in drm_exec for the same set of objects: "The 
>>>>>>>>>>>>>>>>>>> locking overhead
>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>> unnecessary and measurable". IMHO the spinlock is the 
>>>>>>>>>>>>>>>>>>> added
>>>>>>>>>>>>>>>>>>> complexity and a
>>>>>>>>>>>>>>>>>>> single wide lock following the drm locking 
>>>>>>>>>>>>>>>>>>> guidelines set out by
>>>>>>>>>>>>>>>>>>> Daniel and
>>>>>>>>>>>>>>>>>>> David should really be the default choice with an 
>>>>>>>>>>>>>>>>>>> opt-in for a
>>>>>>>>>>>>>>>>>>> spinlock if
>>>>>>>>>>>>>>>>>>> needed for async and pushing out to a wq is not an 
>>>>>>>>>>>>>>>>>>> option.
>>>>>>>>>>>>>>>>>> For the external object list an outer lock would work 
>>>>>>>>>>>>>>>>>> as long as it's
>>>>>>>>>>>>>>>>>> not the
>>>>>>>>>>>>>>>>>> dma-resv lock of the corresponding GEM object, since 
>>>>>>>>>>>>>>>>>> here we actually
>>>>>>>>>>>>>>>>>> need to
>>>>>>>>>>>>>>>>>> remove the list entry from the external object list on
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy().
>>>>>>>>>>>>>>>>>> It's just a bit weird design wise that drivers would 
>>>>>>>>>>>>>>>>>> need to take
>>>>>>>>>>>>>>>>>> this outer
>>>>>>>>>>>>>>>>>> lock on:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - drm_gpuvm_bo_extobj_add()
>>>>>>>>>>>>>>>>>> - drm_gpuvm_bo_destroy()        (and hence also 
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>>>>>>>> - drm_gpuva_unlink()            (because it needs to 
>>>>>>>>>>>>>>>>>> call
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_put())
>>>>>>>>>>>>>>>>>> - drm_gpuvm_exec_lock()
>>>>>>>>>>>>>>>>>> - drm_gpuvm_exec_lock_array()
>>>>>>>>>>>>>>>>>> - drm_gpuvm_prepare_range()
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Given that it seems reasonable to do all the required 
>>>>>>>>>>>>>>>>>> locking
>>>>>>>>>>>>>>>>>> internally.
>>>>>>>>>>>>>>>>>  From a design POV, there has been a clear direction 
>>>>>>>>>>>>>>>>> in XE to make
>>>>>>>>>>>>>>>>> things similar to mmap() / munmap(), so this outer 
>>>>>>>>>>>>>>>>> lock, which in Xe is
>>>>>>>>>>>>>>>>> an rwsem, is used in a similar way as the mmap_lock. 
>>>>>>>>>>>>>>>>> It's protecting
>>>>>>>>>>>>>>>>> the page-table structures and vma rb tree, the userptr 
>>>>>>>>>>>>>>>>> structures and
>>>>>>>>>>>>>>>>> the extobj list. Basically it's taken early in the 
>>>>>>>>>>>>>>>>> exec IOCTL, the
>>>>>>>>>>>>>>>>> VM_BIND ioctl, the compute rebind worker and the 
>>>>>>>>>>>>>>>>> pagefault handler, so
>>>>>>>>>>>>>>>>> all of the above are just asserting that it is taken 
>>>>>>>>>>>>>>>>> in the correct
>>>>>>>>>>>>>>>>> mode.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> But strictly with this scheme one could also use the 
>>>>>>>>>>>>>>>>> vm's dma_resv for
>>>>>>>>>>>>>>>>> the extobj list since with drm_exec, it's locked 
>>>>>>>>>>>>>>>>> before traversing the
>>>>>>>>>>>>>>>>> list.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The whole point of this scheme is to rely on locks 
>>>>>>>>>>>>>>>>> that you already are
>>>>>>>>>>>>>>>>> supposed to be holding for various reasons and is 
>>>>>>>>>>>>>>>>> simple to comprehend.
>>>>>>>>>>>>>>>> I don't agree that we're supposed to hold the VM's resv 
>>>>>>>>>>>>>>>> lock anyways for
>>>>>>>>>>>>>>>> functions like drm_gpuvm_bo_put() or 
>>>>>>>>>>>>>>>> drm_gpuva_unlink(), but I'm fine using it
>>>>>>>>>>>>>>>> for that purpose nevertheless.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In order to at least place lockdep checks, the driver 
>>>>>>>>>>>>>>>>>> would need to
>>>>>>>>>>>>>>>>>> supply the
>>>>>>>>>>>>>>>>>> corresponding lock's lockdep_map, because the GPUVM 
>>>>>>>>>>>>>>>>>> otherwise doesn't
>>>>>>>>>>>>>>>>>> know about
>>>>>>>>>>>>>>>>>> the lock.
>>>>>>>>>>>>>>>>> Yes, that sounds reasonable. One lockdep map per list.
>>>>>>>>>>>>>>>> I'd really like to avoid that, especially now that 
>>>>>>>>>>>>>>>> everything got simpler. We
>>>>>>>>>>>>>>>> should define the actual locks to take instead.
>>>>>>>>>>>>>>>>
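If such lockdep annotations were wanted anyway, a driver-supplied check
could look roughly like the following sketch (the extobj_lockdep_map
pointer is purely hypothetical and not something this series provides):

/* Sketch: the driver hands GPUVM the lockdep map of its outer lock. */
static void sketch_gpuvm_assert_extobj_lock_held(struct drm_gpuvm *gpuvm)
{
#ifdef CONFIG_LOCKDEP
	if (gpuvm->extobj_lockdep_map)
		lockdep_assert(lock_is_held(gpuvm->extobj_lockdep_map));
#endif
}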
>>>>>>>>>>>>>>>>>> Out of curiosity, what is the overhead of a 
>>>>>>>>>>>>>>>>>> spin_lock() that doesn't
>>>>>>>>>>>>>>>>>> need to
>>>>>>>>>>>>>>>>>> spin?
>>>>>>>>>>>>>>>>> I guess it's hard to tell exactly, but it is much 
>>>>>>>>>>>>>>>>> lower on modern x86
>>>>>>>>>>>>>>>>> than what it used to be. Not sure about ARM, which is 
>>>>>>>>>>>>>>>>> the other
>>>>>>>>>>>>>>>>> architecture important to us. I figure if there is 
>>>>>>>>>>>>>>>>> little cache-line
>>>>>>>>>>>>>>>>> bouncing the main overhead comes from the implied 
>>>>>>>>>>>>>>>>> barriers.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> A pretty simple way that would not add much code 
>>>>>>>>>>>>>>>>>>> would be
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> static void gpuvm_cond_spin_lock(const struct 
>>>>>>>>>>>>>>>>>>> drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>> spinlock_t
>>>>>>>>>>>>>>>>>>> *lock)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>      if (!gpuvm->resv_protected_lists)
>>>>>>>>>>>>>>>>>>>          spin_lock(lock);
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>
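The matching unlock side would presumably be the mirror image, under the
same assumption of a gpuvm->resv_protected_lists flag as in the snippet
above:

static void gpuvm_cond_spin_unlock(const struct drm_gpuvm *gpuvm,
				   spinlock_t *lock)
{
	if (!gpuvm->resv_protected_lists)
		spin_unlock(lock);
}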
>>>>>>>>>>>>>>>>>>>>> For such drivers, that would require anybody 
>>>>>>>>>>>>>>>>>>>>> calling unlink to
>>>>>>>>>>>>>>>>>>>>> hold the vm's
>>>>>>>>>>>>>>>>>>>>> resv, though.
>>>>>>>>>>>>>>>>>>>> In V4 I want to go back to having a dedicated lock 
>>>>>>>>>>>>>>>>>>>> for the GEMs
>>>>>>>>>>>>>>>>>>>> gpuva list (or
>>>>>>>>>>>>>>>>>>>> VM_BO list to be more precise). We can't just use 
>>>>>>>>>>>>>>>>>>>> the dma-resv
>>>>>>>>>>>>>>>>>>>> lock for that
>>>>>>>>>>>>>>>>>>>> with VM_BO abstractions, because on destruction of 
>>>>>>>>>>>>>>>>>>>> a VM_BO we
>>>>>>>>>>>>>>>>>>>> otherwise wouldn't
>>>>>>>>>>>>>>>>>>>> be allowed to already hold the dma-resv lock. 
>>>>>>>>>>>>>>>>>>>> That's the fix I
>>>>>>>>>>>>>>>>>>>> was referring to
>>>>>>>>>>>>>>>>>>>> earlier.
>>>>>>>>>>>>>>>>>>> Yeah, I can see the need for a dedicated lock for 
>>>>>>>>>>>>>>>>>>> the GEM's gpuva
>>>>>>>>>>>>>>>>>>> list, but
>>>>>>>>>>>>>>>>>>> holding the vm's dma-resv lock across the unlink 
>>>>>>>>>>>>>>>>>>> shouldn't be a
>>>>>>>>>>>>>>>>>>> problem. We
>>>>>>>>>>>>>>>>>>> may free the object and a pointer to the vm's resv 
>>>>>>>>>>>>>>>>>>> during unlink
>>>>>>>>>>>>>>>>>>> but we
>>>>>>>>>>>>>>>>>>> don't free the vm's resv. It'd be a matter of 
>>>>>>>>>>>>>>>>>>> ensuring that any
>>>>>>>>>>>>>>>>>>> calls to
>>>>>>>>>>>>>>>>>>> unlink from *within* drm_gpuvm allow it to be held.
>>>>>>>>>>>>>>>>>> Drivers calling unlink() from the fence signaling 
>>>>>>>>>>>>>>>>>> path can't use the
>>>>>>>>>>>>>>>>>> VM's
>>>>>>>>>>>>>>>>>> dma-resv lock.
>>>>>>>>>>>>>>>>> Yes, that made me a bit curious because in the current 
>>>>>>>>>>>>>>>>> version the code
>>>>>>>>>>>>>>>>> required the object's dma_resv for unlink() which 
>>>>>>>>>>>>>>>>> can't be grabbed
>>>>>>>>>>>>>>>>> either from the fence signaling path. So are there any 
>>>>>>>>>>>>>>>>> drivers actually
>>>>>>>>>>>>>>>>> wanting to do that? If so, they will either need to 
>>>>>>>>>>>>>>>>> resort to the
>>>>>>>>>>>>>>>>> current spinlock solution or they will need to call 
>>>>>>>>>>>>>>>>> unlink from a
>>>>>>>>>>>>>>>>> workqueue item.
>>>>>>>>>>>>>>>> As Boris already mentioned we have the dma-resv lock by 
>>>>>>>>>>>>>>>> default or a driver
>>>>>>>>>>>>>>>> specific GEM gpuva lock as opt-in. Now, we can get rid 
>>>>>>>>>>>>>>>> of the latter.
>>>>>>>>>>>>>>>>
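For drivers that only learn about unmaps from the fence signalling path,
the workqueue route mentioned above could be sketched as follows
(hypothetical code, not part of the series):

struct sketch_unlink_work {
	struct work_struct work;
	struct drm_gpuva *va;
};

static void sketch_unlink_work_fn(struct work_struct *work)
{
	struct sketch_unlink_work *uw =
		container_of(work, struct sketch_unlink_work, work);
	struct drm_gem_object *obj = uw->va->gem.obj;

	/* Process context, hence taking the dma-resv lock is fine here. */
	dma_resv_lock(obj->resv, NULL);
	drm_gpuva_unlink(uw->va);
	dma_resv_unlock(obj->resv);

	kfree(uw);
}

The fence callback would then merely queue_work() such an item instead of
calling drm_gpuva_unlink() directly.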
>>>>>>>>>>>>>>>>>> Also, what if the object is an external object? We 
>>>>>>>>>>>>>>>>>> can't use the VM's
>>>>>>>>>>>>>>>>>> dma-resv
>>>>>>>>>>>>>>>>>> lock here.
>>>>>>>>>>>>>>>>> Why? Typically (sync) unlink is only ever called from 
>>>>>>>>>>>>>>>>> an unbind-like
>>>>>>>>>>>>>>>>> operation where it should be trivial to grab the vm's 
>>>>>>>>>>>>>>>>> resv. Or, for
>>>>>>>>>>>>>>>>> that matter any outer lock protecting the extobj list. 
>>>>>>>>>>>>>>>>> Rule would be
>>>>>>>>>>>>>>>>> the drm_gpuvm_bo::entry::extobj and 
>>>>>>>>>>>>>>>>> drm_gpuvm_bo::entry::evict would
>>>>>>>>>>>>>>>>> be protected by either the vm's dma_resv (or possibly 
>>>>>>>>>>>>>>>>> an outer lock in
>>>>>>>>>>>>>>>>> the case of the extobj list).
>>>>>>>>>>>>>>>> Outer lock wouldn't have been working for updates in 
>>>>>>>>>>>>>>>> the async path, but
>>>>>>>>>>>>>>>> shouldn't be relevant anymore. We could use the VM's 
>>>>>>>>>>>>>>>> resv for that.
>>>>>>>>>>>>>>>>
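As a sketch of that rule (again hypothetical, not what this version of the
patch does), the spinlock-free variant of the extobj list insertion would
simply assert the VM's resv instead:

static void sketch_extobj_add_resv_protected(struct drm_gpuvm_bo *vm_bo)
{
	struct drm_gpuvm *gpuvm = vm_bo->vm;

	dma_resv_assert_held(gpuvm->resv);

	if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj) &&
	    list_empty(&vm_bo->list.entry.extobj))
		list_add_tail(&vm_bo->list.entry.extobj, &gpuvm->extobj.list);
}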
>>>>>>>>>>>>>>>>>>   And we can't have the GEM objs dma-resv lock held 
>>>>>>>>>>>>>>>>>> when calling
>>>>>>>>>>>>>>>>>> unlink(), since unlink() calls drm_gpuvm_bo_put(), 
>>>>>>>>>>>>>>>>>> which if the
>>>>>>>>>>>>>>>>>> refcount drops
>>>>>>>>>>>>>>>>>> to zero calls drm_gpuvm_bo_destroy() and 
>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_destroy() might
>>>>>>>>>>>>>>>>>> drop the
>>>>>>>>>>>>>>>>>> last reference of the GEM object.
>>>>>>>>>>>>>>>>> Yes, but this is a different problem as to what 
>>>>>>>>>>>>>>>>> exactly protects
>>>>>>>>>>>>>>>>> drm_gpuvm_bo::entry::gem. Either as you suggest an 
>>>>>>>>>>>>>>>>> internal per bo list
>>>>>>>>>>>>>>>>> lock, or if we want to keep the bo's dma_resv we need 
>>>>>>>>>>>>>>>>> to ensure that
>>>>>>>>>>>>>>>>> the caller of dma_resv_unlock(obj->resv) actually 
>>>>>>>>>>>>>>>>> refcounts its obj
>>>>>>>>>>>>>>>>> pointer, and doesn't implicitly rely on the gpuvm_bo's 
>>>>>>>>>>>>>>>>> refcount (I know
>>>>>>>>>>>>>>>>> Boris didn't like that, but requiring an explicit 
>>>>>>>>>>>>>>>>> refcount for a
>>>>>>>>>>>>>>>>> pointer you dereference unless you're under a lock 
>>>>>>>>>>>>>>>>> that ensures keeping
>>>>>>>>>>>>>>>>> the object alive is pretty much required?) But anyway 
>>>>>>>>>>>>>>>>> for the
>>>>>>>>>>>>>>>>> drm_gpuvm_bo::entry::gem list protection (bo resv or 
>>>>>>>>>>>>>>>>> internal spinlock)
>>>>>>>>>>>>>>>>> I don't have a strong preference.
>>>>>>>>>>>>>>>> We can keep the GEM objects dma-resv lock, however as 
>>>>>>>>>>>>>>>> mentioned above
>>>>>>>>>>>>>>>> drm_gpuva_unlink() and drm_gpuvm_bo_put() then requires 
>>>>>>>>>>>>>>>> both the VM's resv lock
>>>>>>>>>>>>>>>> and the GEM's resv lock in case they differ.
>>>>>>>>>>>>>>>>
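A sketch of what a synchronous unbind path taking both reservation objects
could look like when they differ, using drm_exec so the ww locking stays
correct (illustration only, under the assumptions discussed here):

static int sketch_unbind_va_locked(struct drm_gpuvm *gpuvm,
				   struct drm_gpuva *va)
{
	struct drm_exec exec;
	int ret = 0;

	drm_exec_init(&exec, DRM_EXEC_IGNORE_DUPLICATES);
	drm_exec_until_all_locked(&exec) {
		/* The VM's resv protects the extobj / evict lists ... */
		ret = drm_exec_prepare_obj(&exec, &gpuvm->d_obj, 1);
		drm_exec_retry_on_contention(&exec);
		if (ret)
			break;

		/* ... the GEM's resv protects its vm_bo and gpuva lists. */
		ret = drm_exec_prepare_obj(&exec, va->gem.obj, 1);
		drm_exec_retry_on_contention(&exec);
		if (ret)
			break;
	}

	if (!ret)
		drm_gpuva_unlink(va);

	drm_exec_fini(&exec);
	return ret;
}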
>>>>>>>>>>>>>>>>>>   All those problems go away with a dedicated
>>>>>>>>>>>>>>>>>> GEM gpuva list lock.
>>>>>>>>>>>>>>>>> I don't think these are real problems.
>>>>>>>>>>>>>>>>> With the exception of the eviction list "trick" where 
>>>>>>>>>>>>>>>>> we currently have
>>>>>>>>>>>>>>>>> slightly different approach to collect external bos 
>>>>>>>>>>>>>>>>> needing rebinding,
>>>>>>>>>>>>>>>>> we have this working fine.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> TBH I think pretty much the only situation where the 
>>>>>>>>>>>>>>>>> spinlock is needed
>>>>>>>>>>>>>>>>> is for async updates of these lists, unless a wq item 
>>>>>>>>>>>>>>>>> can be used for
>>>>>>>>>>>>>>>>> that, but it doesn't really seem like the current code 
>>>>>>>>>>>>>>>>> allows for such
>>>>>>>>>>>>>>>>> updates anyway? It complicates the code a lot, adds 
>>>>>>>>>>>>>>>>> overhead and also
>>>>>>>>>>>>>>>>> adds the requirement for refcounting during list 
>>>>>>>>>>>>>>>>> traversal.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> It seems that with that also the refcount could be 
>>>>>>>>>>>>>>>>>>>>> made non-
>>>>>>>>>>>>>>>>>>>>> atomic.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> All in the spirit of the drm locking guidelines 
>>>>>>>>>>>>>>>>>>>>> "use big locks
>>>>>>>>>>>>>>>>>>>>> when
>>>>>>>>>>>>>>>>>>>>> possible".
>>>>>>>>>>>>>>>>>>>>> Lower level locks only when necessary for 
>>>>>>>>>>>>>>>>>>>>> performance or
>>>>>>>>>>>>>>>>>>>>> locking inversion?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> /Thomas
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Elements popped from the original list are 
>>>>>>>>>>>>>>>>>>>>>> kept in a
>>>>>>>>>>>>>>>>>>>>>> local list, so removal
>>>>>>>>>>>>>>>>>>>>>> + * and is_empty checks can still happen while we're
>>>>>>>>>>>>>>>>>>>>>> iterating the list.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +#define get_next_vm_bo_from_list(__gpuvm, 
>>>>>>>>>>>>>>>>>>>>>> __list_name,
>>>>>>>>>>>>>>>>>>>>>> __local_list, __prev_vm_bo)     \
>>>>>>>>>>>>>>>>>>>>>> +       ({
>>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>>> +               struct drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>>>> *__vm_bo;                                           \ 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_put(__prev_vm_bo);
>>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                                \
>>>>>>>>>>>>>>>>>>>>>> +               while (!list_empty(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>>>> __list_name.list)) {                     \
>>>>>>>>>>>>>>>>>>>>>> +                       __vm_bo =
>>>>>>>>>>>>>>>>>>>>>> list_first_entry(&(__gpuvm)->__list_name.list,        \ 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> + struct
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo,                 \
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name);             \
>>>>>>>>>>>>>>>>>>>>>> +                       if
>>>>>>>>>>>>>>>>>>>>>> (drm_gpuvm_bo_get_unless_zero(__vm_bo))
>>>>>>>>>>>>>>>>>>>>>> {                    \
>>>>>>>>>>>>>>>>>>>>>> +                               list_move_tail(&(__vm_bo)- 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name,      \
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> __local_list);                           \
>>>>>>>>>>>>>>>>>>>>>> +                               break;
>>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>>> +                       } else
>>>>>>>>>>>>>>>>>>>>>> {                                                        \ 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> +                               list_del_init(&(__vm_bo)- 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name);      \
>>>>>>>>>>>>>>>>>>>>>> +                               __vm_bo =
>>>>>>>>>>>>>>>>>>>>>> NULL;                                         \
>>>>>>>>>>>>>>>>>>>>>> +                       }
>>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                              \
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>>> +               __vm_bo;
>>>>>>>>>>>>>>>>>>>>>>                             \
>>>>>>>>>>>>>>>>>>>>>> +       })
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * for_each_vm_bo_in_list() - internal vm_bo 
>>>>>>>>>>>>>>>>>>>>>> list iterator
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * This helper is here to provide lockless list 
>>>>>>>>>>>>>>>>>>>>>> iteration.
>>>>>>>>>>>>>>>>>>>>>> Lockless as in, the
>>>>>>>>>>>>>>>>>>>>>> + * iterator releases the lock immediately after 
>>>>>>>>>>>>>>>>>>>>>> picking the
>>>>>>>>>>>>>>>>>>>>>> first element from the
>>>>>>>>>>>>>>>>>>>>>> + * list, so list insertion and deletion can happen
>>>>>>>>>>>>>>>>>>>>>> concurrently.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Typical use:
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + *     struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>>>>> + *     LIST_HEAD(my_local_list);
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + *     ret = 0;
>>>>>>>>>>>>>>>>>>>>>> + *     for_each_vm_bo_in_list(gpuvm, <list_name>,
>>>>>>>>>>>>>>>>>>>>>> &my_local_list, vm_bo) {
>>>>>>>>>>>>>>>>>>>>>> + *             ret = 
>>>>>>>>>>>>>>>>>>>>>> do_something_with_vm_bo(..., vm_bo);
>>>>>>>>>>>>>>>>>>>>>> + *             if (ret)
>>>>>>>>>>>>>>>>>>>>>> + *                     break;
>>>>>>>>>>>>>>>>>>>>>> + *     }
>>>>>>>>>>>>>>>>>>>>>> + *     drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>>>>>> + *     restore_vm_bo_list(gpuvm, 
>>>>>>>>>>>>>>>>>>>>>> <list_name>,
>>>>>>>>>>>>>>>>>>>>>> &my_local_list);
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Only used for internal list iterations, not 
>>>>>>>>>>>>>>>>>>>>>> meant to be
>>>>>>>>>>>>>>>>>>>>>> exposed to the outside
>>>>>>>>>>>>>>>>>>>>>> + * world.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +#define for_each_vm_bo_in_list(__gpuvm, 
>>>>>>>>>>>>>>>>>>>>>> __list_name,
>>>>>>>>>>>>>>>>>>>>>> __local_list, __vm_bo)    \
>>>>>>>>>>>>>>>>>>>>>> +       for (__vm_bo = 
>>>>>>>>>>>>>>>>>>>>>> get_next_vm_bo_from_list(__gpuvm,
>>>>>>>>>>>>>>>>>>>>>> __list_name,           \
>>>>>>>>>>>>>>>>>>>>>> +                                               __local_list, 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> NULL);            \
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> __vm_bo;
>>>>>>>>>>>>>>>>>>>>>>        \
>>>>>>>>>>>>>>>>>>>>>> +            __vm_bo = 
>>>>>>>>>>>>>>>>>>>>>> get_next_vm_bo_from_list(__gpuvm,
>>>>>>>>>>>>>>>>>>>>>> __list_name,           \
>>>>>>>>>>>>>>>>>>>>>> +                                               __local_list, 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> __vm_bo))         \
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * restore_vm_bo_list() - move vm_bo elements 
>>>>>>>>>>>>>>>>>>>>>> back to their
>>>>>>>>>>>>>>>>>>>>>> original list
>>>>>>>>>>>>>>>>>>>>>> + * @__gpuvm: The GPU VM
>>>>>>>>>>>>>>>>>>>>>> + * @__list_name: The name of the list we're 
>>>>>>>>>>>>>>>>>>>>>> iterating on
>>>>>>>>>>>>>>>>>>>>>> + * @__local_list: A pointer to the local list 
>>>>>>>>>>>>>>>>>>>>>> used to store
>>>>>>>>>>>>>>>>>>>>>> already iterated items
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * When we're done iterating a vm_bo list, we 
>>>>>>>>>>>>>>>>>>>>>> should call
>>>>>>>>>>>>>>>>>>>>>> restore_vm_bo_list()
>>>>>>>>>>>>>>>>>>>>>> + * to restore the original state and let new 
>>>>>>>>>>>>>>>>>>>>>> iterations take
>>>>>>>>>>>>>>>>>>>>>> place.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +#define restore_vm_bo_list(__gpuvm, __list_name,
>>>>>>>>>>>>>>>>>>>>>> __local_list)                         \
>>>>>>>>>>>>>>>>>>>>>> +       do
>>>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>>>                  \
>>>>>>>>>>>>>>>>>>>>>> +               /* Merge back the two lists, 
>>>>>>>>>>>>>>>>>>>>>> moving local
>>>>>>>>>>>>>>>>>>>>>> list elements to the          \
>>>>>>>>>>>>>>>>>>>>>> +                * head to preserve previous 
>>>>>>>>>>>>>>>>>>>>>> ordering, in
>>>>>>>>>>>>>>>>>>>>>> case it matters.              \
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> */
>>>>>>>>>>>>>>>>>>>>>>            \
>>>>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                                \
>>>>>>>>>>>>>>>>>>>>>> +               list_splice(__local_list, 
>>>>>>>>>>>>>>>>>>>>>> &(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>>>> __list_name.list);                \
>>>>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__gpuvm)-
>>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                              \
>>>>>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_add() - insert a vm_bo into 
>>>>>>>>>>>>>>>>>>>>>> the given
>>>>>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to insert 
>>>>>>>>>>>>>>>>>>>>>> into
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Inserts the given @__vm_bo into the list 
>>>>>>>>>>>>>>>>>>>>>> specified by
>>>>>>>>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>>>>>>>>> + * increases the vm_bo's reference count.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_add(__vm_bo,
>>>>>>>>>>>>>>>>>>>>>> __list_name)                            \
>>>>>>>>>>>>>>>>>>>>>> +       do
>>>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>>>          \
>>>>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                    \
>>>>>>>>>>>>>>>>>>>>>> +               if (list_empty(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name))             \
>>>>>>>>>>>>>>>>>>>>>> +                       list_add_tail(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name,       \
>>>>>>>>>>>>>>>>>>>>>> + &(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>>>> __list_name.list);        \
>>>>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                  \
>>>>>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_list_del() - remove a vm_bo from 
>>>>>>>>>>>>>>>>>>>>>> the given
>>>>>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>>>>>> + * @__vm_bo: the &drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>>>> + * @__list_name: the name of the list to insert 
>>>>>>>>>>>>>>>>>>>>>> into
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Removes the given @__vm_bo from the list 
>>>>>>>>>>>>>>>>>>>>>> specified by
>>>>>>>>>>>>>>>>>>>>>> @__list_name and
>>>>>>>>>>>>>>>>>>>>>> + * decreases the vm_bo's reference count.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +#define drm_gpuvm_bo_list_del(__vm_bo,
>>>>>>>>>>>>>>>>>>>>>> __list_name)                            \
>>>>>>>>>>>>>>>>>>>>>> +       do
>>>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>>>          \
>>>>>>>>>>>>>>>>>>>>>> +               spin_lock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                    \
>>>>>>>>>>>>>>>>>>>>>> +               if (!list_empty(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name))            \
>>>>>>>>>>>>>>>>>>>>>> +                       list_del_init(&(__vm_bo)-
>>>>>>>>>>>>>>>>>>>>>>> list.entry.__list_name);      \
>>>>>>>>>>>>>>>>>>>>>> +               spin_unlock(&(__vm_bo)->vm-
>>>>>>>>>>>>>>>>>>>>>>> __list_name.lock);                  \
>>>>>>>>>>>>>>>>>>>>>> +       } while (0)
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo 
>>>>>>>>>>>>>>>>>>>>>> *vm_bo);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>>     #define 
>>>>>>>>>>>>>>>>>>>>>> to_drm_gpuva(__node) container_of((__node), struct
>>>>>>>>>>>>>>>>>>>>>> drm_gpuva, rb.node)
>>>>>>>>>>>>>>>>>>>>>>     #define GPUVA_START(node) ((node)->va.addr)
>>>>>>>>>>>>>>>>>>>>>> @@ -713,6 +867,12 @@ drm_gpuvm_init(struct 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>> struct drm_device *drm,
>>>>>>>>>>>>>>>>>>>>>>          gpuvm->rb.tree = RB_ROOT_CACHED;
>>>>>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&gpuvm->rb.list);
>>>>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->extobj.list);
>>>>>>>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&gpuvm->evict.list);
>>>>>>>>>>>>>>>>>>>>>> +       spin_lock_init(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>>          drm_gpuva_check_overflow(start_offset, 
>>>>>>>>>>>>>>>>>>>>>> range);
>>>>>>>>>>>>>>>>>>>>>>          gpuvm->mm_start = start_offset;
>>>>>>>>>>>>>>>>>>>>>>          gpuvm->mm_range = range;
>>>>>>>>>>>>>>>>>>>>>> @@ -754,10 +914,302 @@ drm_gpuvm_destroy(struct 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>>>>>> *gpuvm)
>>>>>>>>>>>>>>>>>>>>>>          WARN(!RB_EMPTY_ROOT(&gpuvm->rb.tree.rb_root), 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>               "GPUVA tree is not empty, 
>>>>>>>>>>>>>>>>>>>>>> potentially leaking
>>>>>>>>>>>>>>>>>>>>>> memory.\n");
>>>>>>>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->extobj.list), 
>>>>>>>>>>>>>>>>>>>>>> "Extobj list
>>>>>>>>>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>>>>>>>>>> +       WARN(!list_empty(&gpuvm->evict.list), 
>>>>>>>>>>>>>>>>>>>>>> "Evict list
>>>>>>>>>>>>>>>>>>>>>> should be empty.\n");
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>>          drm_gem_private_object_fini(&gpuvm->d_obj);
>>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_destroy);
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_objects() - prepare all 
>>>>>>>>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to 
>>>>>>>>>>>>>>>>>>>>>> reserve
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all 
>>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>>>>>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Note: This function is safe against 
>>>>>>>>>>>>>>>>>>>>>> concurrent insertion
>>>>>>>>>>>>>>>>>>>>>> and removal of
>>>>>>>>>>>>>>>>>>>>>> + * external objects, however it is not safe against
>>>>>>>>>>>>>>>>>>>>>> concurrent usage itself.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Drivers need to make sure to protect this 
>>>>>>>>>>>>>>>>>>>>>> case with
>>>>>>>>>>>>>>>>>>>>>> either an outer VM lock
>>>>>>>>>>>>>>>>>>>>>> + * or by calling drm_gpuvm_prepare_vm() before 
>>>>>>>>>>>>>>>>>>>>>> this function
>>>>>>>>>>>>>>>>>>>>>> within the
>>>>>>>>>>>>>>>>>>>>>> + * drm_exec_until_all_locked() loop, such that 
>>>>>>>>>>>>>>>>>>>>>> the GPUVM's
>>>>>>>>>>>>>>>>>>>>>> dma-resv lock ensures
>>>>>>>>>>>>>>>>>>>>>> + * mutual exclusion.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>>>>> +       LIST_HEAD(extobjs);
>>>>>>>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, extobj, 
>>>>>>>>>>>>>>>>>>>>>> &extobjs,
>>>>>>>>>>>>>>>>>>>>>> vm_bo) {
>>>>>>>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, 
>>>>>>>>>>>>>>>>>>>>>> vm_bo->obj,
>>>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the 
>>>>>>>>>>>>>>>>>>>>>> loop. */
>>>>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, extobj, &extobjs);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_objects);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_range() - prepare all BOs 
>>>>>>>>>>>>>>>>>>>>>> mapped within
>>>>>>>>>>>>>>>>>>>>>> a given range
>>>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to 
>>>>>>>>>>>>>>>>>>>>>> reserve
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for all 
>>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects
>>>>>>>>>>>>>>>>>>>>>> mapped between @addr
>>>>>>>>>>>>>>>>>>>>>> + * and @addr + @range.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, 
>>>>>>>>>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>>>>>>>>>> drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>>> +                       u64 addr, u64 range, 
>>>>>>>>>>>>>>>>>>>>>> unsigned int
>>>>>>>>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuva *va;
>>>>>>>>>>>>>>>>>>>>>> +       u64 end = addr + range;
>>>>>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_for_each_va_range(va, gpuvm, 
>>>>>>>>>>>>>>>>>>>>>> addr, end) {
>>>>>>>>>>>>>>>>>>>>>> +               struct drm_gem_object *obj = 
>>>>>>>>>>>>>>>>>>>>>> va->gem.obj;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +               ret = drm_exec_prepare_obj(exec, 
>>>>>>>>>>>>>>>>>>>>>> obj,
>>>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>>>> +                       return ret;
>>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_prepare_range);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock() - lock all dma-resv of all
>>>>>>>>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to 
>>>>>>>>>>>>>>>>>>>>>> reserve
>>>>>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvm contains mappings of.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Additionally, when calling this function with 
>>>>>>>>>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_exec::extra
>>>>>>>>>>>>>>>>>>>>>> + * being set the driver receives the given @fn 
>>>>>>>>>>>>>>>>>>>>>> callback to
>>>>>>>>>>>>>>>>>>>>>> lock additional
>>>>>>>>>>>>>>>>>>>>>> + * dma-resv in the context of the 
>>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm_exec instance.
>>>>>>>>>>>>>>>>>>>>>> Typically, drivers
>>>>>>>>>>>>>>>>>>>>>> + * would call drm_exec_prepare_obj() from within 
>>>>>>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>>>> callback.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>>>> +                   bool interruptible)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>>>>>>>> +               ret = drm_gpuvm_prepare_vm(gpuvm, 
>>>>>>>>>>>>>>>>>>>>>> exec,
>>>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +               ret = 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_prepare_objects(gpuvm, exec,
>>>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +               if (vm_exec->extra.fn) {
>>>>>>>>>>>>>>>>>>>>>> +                       ret = 
>>>>>>>>>>>>>>>>>>>>>> vm_exec->extra.fn(vm_exec,
>>>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>>>> +                       drm_exec_retry_on_contention(exec); 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> +                       if (ret)
>>>>>>>>>>>>>>>>>>>>>> +                               goto err;
>>>>>>>>>>>>>>>>>>>>>> +               }
>>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +static int
>>>>>>>>>>>>>>>>>>>>>> +fn_lock_array(struct drm_gpuvm_exec *vm_exec, 
>>>>>>>>>>>>>>>>>>>>>> unsigned int
>>>>>>>>>>>>>>>>>>>>>> num_fences)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>>>>>>>> +       } *args = vm_exec->extra.priv;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       return 
>>>>>>>>>>>>>>>>>>>>>> drm_exec_prepare_array(&vm_exec->exec, args-
>>>>>>>>>>>>>>>>>>>>>>> objs,
>>>>>>>>>>>>>>>>>>>>>> + args->num_objs,
>>>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_array() - lock all 
>>>>>>>>>>>>>>>>>>>>>> dma-resv of all
>>>>>>>>>>>>>>>>>>>>>> associated BOs
>>>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>>>> + * @objs: additional &drm_gem_objects to lock
>>>>>>>>>>>>>>>>>>>>>> + * @num_objs: the number of additional 
>>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects to
>>>>>>>>>>>>>>>>>>>>>> lock
>>>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to 
>>>>>>>>>>>>>>>>>>>>>> reserve
>>>>>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects the
>>>>>>>>>>>>>>>>>>>>>> given &drm_gpuvm
>>>>>>>>>>>>>>>>>>>>>> + * contains mappings of, plus the ones given 
>>>>>>>>>>>>>>>>>>>>>> through @objs.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>>>>>> + struct drm_gem_object **objs,
>>>>>>>>>>>>>>>>>>>>>> + unsigned int num_objs,
>>>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>>>> + bool interruptible)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>>>> +               struct drm_gem_object **objs;
>>>>>>>>>>>>>>>>>>>>>> +               unsigned int num_objs;
>>>>>>>>>>>>>>>>>>>>>> +       } args;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       args.objs = objs;
>>>>>>>>>>>>>>>>>>>>>> +       args.num_objs = num_objs;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       vm_exec->extra.fn = fn_lock_array;
>>>>>>>>>>>>>>>>>>>>>> +       vm_exec->extra.priv = &args;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       return drm_gpuvm_exec_lock(vm_exec, 
>>>>>>>>>>>>>>>>>>>>>> num_fences,
>>>>>>>>>>>>>>>>>>>>>> interruptible);
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_array);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_lock_range() - prepare all BOs 
>>>>>>>>>>>>>>>>>>>>>> mapped
>>>>>>>>>>>>>>>>>>>>>> within a given range
>>>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>>>> + * @addr: the start address within the VA space
>>>>>>>>>>>>>>>>>>>>>> + * @range: the range to iterate within the VA space
>>>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to 
>>>>>>>>>>>>>>>>>>>>>> reserve
>>>>>>>>>>>>>>>>>>>>>> + * @interruptible: sleep interruptible if waiting
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Acquires all dma-resv locks of all 
>>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects
>>>>>>>>>>>>>>>>>>>>>> mapped between @addr and
>>>>>>>>>>>>>>>>>>>>>> + * @addr + @range.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>>>>>> + u64 addr, u64 range,
>>>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>>>> + bool interruptible)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_exec->vm;
>>>>>>>>>>>>>>>>>>>>>> +       struct drm_exec *exec = &vm_exec->exec;
>>>>>>>>>>>>>>>>>>>>>> +       uint32_t flags;
>>>>>>>>>>>>>>>>>>>>>> +       int ret;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       flags = (interruptible ? DRM_EXEC_INTERRUPTIBLE_WAIT : 0) |
>>>>>>>>>>>>>>>>>>>>>> +               DRM_EXEC_IGNORE_DUPLICATES;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       drm_exec_init(exec, flags);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       drm_exec_until_all_locked(exec) {
>>>>>>>>>>>>>>>>>>>>>> +               ret = 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_prepare_range(gpuvm, exec,
>>>>>>>>>>>>>>>>>>>>>> addr, range,
>>>>>>>>>>>>>>>>>>>>>> + num_fences);
>>>>>>>>>>>>>>>>>>>>>> +               drm_exec_retry_on_contention(exec);
>>>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>>>> +                       goto err;
>>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +err:
>>>>>>>>>>>>>>>>>>>>>> +       drm_exec_fini(exec);
>>>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_exec_lock_range);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_validate() - validate all BOs 
>>>>>>>>>>>>>>>>>>>>>> marked as evicted
>>>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to validate evicted BOs
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Calls the &drm_gpuvm_ops.bo_validate callback 
>>>>>>>>>>>>>>>>>>>>>> for all
>>>>>>>>>>>>>>>>>>>>>> evicted buffer
>>>>>>>>>>>>>>>>>>>>>> + * objects being mapped in the given &drm_gpuvm.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +int
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_validate(struct drm_gpuvm *gpuvm)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       const struct drm_gpuvm_ops *ops = 
>>>>>>>>>>>>>>>>>>>>>> gpuvm->ops;
>>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>>>>> +       LIST_HEAD(evict);
>>>>>>>>>>>>>>>>>>>>>> +       int ret = 0;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       if (unlikely(!ops || !ops->bo_validate))
>>>>>>>>>>>>>>>>>>>>>> +               return -ENOTSUPP;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       for_each_vm_bo_in_list(gpuvm, evict, 
>>>>>>>>>>>>>>>>>>>>>> &evict, vm_bo) {
>>>>>>>>>>>>>>>>>>>>>> +               dma_resv_assert_held(vm_bo->obj->resv); 
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> +               ret = ops->bo_validate(vm_bo->obj);
>>>>>>>>>>>>>>>>>>>>>> +               if (ret)
>>>>>>>>>>>>>>>>>>>>>> +                       break;
>>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>>> +       /* Drop ref in case we break out of the 
>>>>>>>>>>>>>>>>>>>>>> loop. */
>>>>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_bo_put(vm_bo);
>>>>>>>>>>>>>>>>>>>>>> +       restore_vm_bo_list(gpuvm, evict, &evict);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       return ret;
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_validate);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_resv_add_fence - add fence to 
>>>>>>>>>>>>>>>>>>>>>> private and all
>>>>>>>>>>>>>>>>>>>>>> extobj
>>>>>>>>>>>>>>>>>>>>>> + * dma-resv
>>>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to add a fence to
>>>>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec locking context
>>>>>>>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>>> + struct dma_fence *fence,
>>>>>>>>>>>>>>>>>>>>>> + enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>>>>>>>>> + enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       struct drm_gem_object *obj;
>>>>>>>>>>>>>>>>>>>>>> +       unsigned long index;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       drm_exec_for_each_locked_object(exec, 
>>>>>>>>>>>>>>>>>>>>>> index, obj) {
>>>>>>>>>>>>>>>>>>>>>> +               dma_resv_assert_held(obj->resv);
>>>>>>>>>>>>>>>>>>>>>> +               dma_resv_add_fence(obj->resv, fence,
>>>>>>>>>>>>>>>>>>>>>> + drm_gpuvm_is_extobj(gpuvm, obj) ?
>>>>>>>>>>>>>>>>>>>>>> + extobj_usage : private_usage);
>>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_resv_add_fence);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_create() - create a new 
>>>>>>>>>>>>>>>>>>>>>> instance of struct
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>>>>      * @gpuvm: The &drm_gpuvm the @obj is mapped in.
>>>>>>>>>>>>>>>>>>>>>> @@ -790,6 +1242,9 @@ drm_gpuvm_bo_create(struct 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.gpuva);
>>>>>>>>>>>>>>>>>>>>>>          INIT_LIST_HEAD(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>>>>>>>> +       INIT_LIST_HEAD(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>>          drm_gem_object_get(obj);
>>>>>>>>>>>>>>>>>>>>>>          return vm_bo;
>>>>>>>>>>>>>>>>>>>>>> @@ -807,6 +1262,14 @@ drm_gpuvm_bo_destroy(struct 
>>>>>>>>>>>>>>>>>>>>>> kref *kref)
>>>>>>>>>>>>>>>>>>>>>>          drm_gem_gpuva_assert_lock_held(vm_bo->obj);
>>>>>>>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.extobj);
>>>>>>>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->extobj.lock);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       spin_lock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>>>>>> +       list_del(&vm_bo->list.entry.evict);
>>>>>>>>>>>>>>>>>>>>>> +       spin_unlock(&gpuvm->evict.lock);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>>          list_del(&vm_bo->list.entry.gem);
>>>>>>>>>>>>>>>>>>>>>>          drm_gem_object_put(obj);
>>>>>>>>>>>>>>>>>>>>>> @@ -822,6 +1285,11 @@ drm_gpuvm_bo_destroy(struct 
>>>>>>>>>>>>>>>>>>>>>> kref *kref)
>>>>>>>>>>>>>>>>>>>>>>      * @vm_bo: the &drm_gpuvm_bo to release the 
>>>>>>>>>>>>>>>>>>>>>> reference of
>>>>>>>>>>>>>>>>>>>>>>      *
>>>>>>>>>>>>>>>>>>>>>>      * This releases a reference to @vm_bo.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * If the reference count drops to zero, the &drm_gpuvm_bo is destroyed,
>>>>>>>>>>>>>>>>>>>>>> + * which includes removing it from the GEM's gpuva list. Hence, if a call
>>>>>>>>>>>>>>>>>>>>>> + * to this function can potentially let the reference count drop to zero
>>>>>>>>>>>>>>>>>>>>>> + * the caller must hold the dma-resv or driver-specific GEM gpuva lock.
>>>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>>>     void
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_put(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>>>>>> @@ -831,6 +1299,12 @@ drm_gpuvm_bo_put(struct 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo
>>>>>>>>>>>>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_put);
>>>>>>>>>>>>>>>>>>>>>> +static int __must_check
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_get_unless_zero(struct drm_gpuvm_bo 
>>>>>>>>>>>>>>>>>>>>>> *vm_bo)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       return kref_get_unless_zero(&vm_bo->kref);
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>>     static struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>>>>>>> __drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>> struct drm_gem_object *obj)
>>>>>>>>>>>>>>>>>>>>>> @@ -938,6 +1412,48 @@ 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_obtain_prealloc(struct
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo *__vm_bo)
>>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>> EXPORT_SYMBOL_GPL(drm_gpuvm_bo_obtain_prealloc);
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_extobj_add() - adds the 
>>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm_bo to its
>>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm's
>>>>>>>>>>>>>>>>>>>>>> + * extobj list
>>>>>>>>>>>>>>>>>>>>>> + * @vm_bo: The &drm_gpuvm_bo to add to its &drm_gpuvm's extobj list.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Adds the given @vm_bo to its &drm_gpuvm's extobj list if it is not on the
>>>>>>>>>>>>>>>>>>>>>> + * list already and if the corresponding &drm_gem_object actually is an
>>>>>>>>>>>>>>>>>>>>>> + * external object.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *gpuvm = vm_bo->vm;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       if (drm_gpuvm_is_extobj(gpuvm, vm_bo->obj))
>>>>>>>>>>>>>>>>>>>>>> +               drm_gpuvm_bo_list_add(vm_bo, 
>>>>>>>>>>>>>>>>>>>>>> extobj);
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_extobj_add);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_bo_evict() - add / remove a 
>>>>>>>>>>>>>>>>>>>>>> &drm_gem_object to
>>>>>>>>>>>>>>>>>>>>>> / from a
>>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvms evicted list
>>>>>>>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to add or remove
>>>>>>>>>>>>>>>>>>>>>> + * @evict: indicates whether the object is evicted
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Adds a &drm_gem_object to or removes it from the evicted list of all
>>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvms containing a mapping of this &drm_gem_object.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +void
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_bo_evict(struct drm_gem_object *obj, 
>>>>>>>>>>>>>>>>>>>>>> bool evict)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm_bo *vm_bo;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       drm_gem_for_each_gpuvm_bo(vm_bo, obj) {
>>>>>>>>>>>>>>>>>>>>>> +               if (evict)
>>>>>>>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_add(vm_bo, 
>>>>>>>>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>>>>>>>>> +               else
>>>>>>>>>>>>>>>>>>>>>> +                       drm_gpuvm_bo_list_del(vm_bo, 
>>>>>>>>>>>>>>>>>>>>>> evict);
>>>>>>>>>>>>>>>>>>>>>> +       }
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>>     static int
>>>>>>>>>>>>>>>>>>>>>> __drm_gpuva_insert(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>> struct drm_gpuva *va)
>>>>>>>>>>>>>>>>>>>>>> diff --git a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>>>>> b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>>>>> index afa50b9059a2..834bb6d6617e 100644
>>>>>>>>>>>>>>>>>>>>>> --- a/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>>>>> +++ b/include/drm/drm_gpuvm.h
>>>>>>>>>>>>>>>>>>>>>> @@ -26,10 +26,12 @@
>>>>>>>>>>>>>>>>>>>>>>      */
>>>>>>>>>>>>>>>>>>>>>>     #include <linux/list.h>
>>>>>>>>>>>>>>>>>>>>>> +#include <linux/dma-resv.h>
>>>>>>>>>>>>>>>>>>>>>>     #include <linux/rbtree.h>
>>>>>>>>>>>>>>>>>>>>>>     #include <linux/types.h>
>>>>>>>>>>>>>>>>>>>>>>     #include <drm/drm_gem.h>
>>>>>>>>>>>>>>>>>>>>>> +#include <drm/drm_exec.h>
>>>>>>>>>>>>>>>>>>>>>>     struct drm_gpuvm;
>>>>>>>>>>>>>>>>>>>>>>     struct drm_gpuvm_bo;
>>>>>>>>>>>>>>>>>>>>>> @@ -259,6 +261,38 @@ struct drm_gpuvm {
>>>>>>>>>>>>>>>>>>>>>>           * space
>>>>>>>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>>>>>>>          struct dma_resv *resv;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>>>> +        * @extobj: structure holding the extobj 
>>>>>>>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>>>> +                * @list: &list_head storing 
>>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm_bos
>>>>>>>>>>>>>>>>>>>>>> serving as
>>>>>>>>>>>>>>>>>>>>>> +                * external object
>>>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the 
>>>>>>>>>>>>>>>>>>>>>> extobj list
>>>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>>>>>>>> +       } extobj;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>>>> +        * @evict: structure holding the evict 
>>>>>>>>>>>>>>>>>>>>>> list and evict
>>>>>>>>>>>>>>>>>>>>>> list lock
>>>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>>>> +                * @list: &list_head storing 
>>>>>>>>>>>>>>>>>>>>>> &drm_gpuvm_bos
>>>>>>>>>>>>>>>>>>>>>> currently being
>>>>>>>>>>>>>>>>>>>>>> +                * evicted
>>>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>>>> +               struct list_head list;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>>>> +                * @lock: spinlock to protect the 
>>>>>>>>>>>>>>>>>>>>>> evict list
>>>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>>>> +               spinlock_t lock;
>>>>>>>>>>>>>>>>>>>>>> +       } evict;
>>>>>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>>>>>>     void drm_gpuvm_init(struct drm_gpuvm *gpuvm, 
>>>>>>>>>>>>>>>>>>>>>> struct
>>>>>>>>>>>>>>>>>>>>>> drm_device *drm,
>>>>>>>>>>>>>>>>>>>>>> @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>>>>>> *gpuvm, struct drm_device *drm,
>>>>>>>>>>>>>>>>>>>>>> const struct drm_gpuvm_ops *ops);
>>>>>>>>>>>>>>>>>>>>>>     void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_is_extobj() - indicates whether the 
>>>>>>>>>>>>>>>>>>>>>> given
>>>>>>>>>>>>>>>>>>>>>> &drm_gem_object is an
>>>>>>>>>>>>>>>>>>>>>> + * external object
>>>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm to check
>>>>>>>>>>>>>>>>>>>>>> + * @obj: the &drm_gem_object to check
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Returns: true if the &drm_gem_object 
>>>>>>>>>>>>>>>>>>>>>> &dma_resv differs
>>>>>>>>>>>>>>>>>>>>>> from the
>>>>>>>>>>>>>>>>>>>>>> + * &drm_gpuvms &dma_resv, false otherwise
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +static inline bool drm_gpuvm_is_extobj(struct 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm
>>>>>>>>>>>>>>>>>>>>>> *gpuvm,
>>>>>>>>>>>>>>>>>>>>>> + struct drm_gem_object
>>>>>>>>>>>>>>>>>>>>>> *obj)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       return obj && obj->resv != gpuvm->resv;
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>>     static inline struct drm_gpuva *
>>>>>>>>>>>>>>>>>>>>>> __drm_gpuva_next(struct drm_gpuva *va)
>>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>> @@ -346,6 +395,128 @@ __drm_gpuva_next(struct 
>>>>>>>>>>>>>>>>>>>>>> drm_gpuva *va)
>>>>>>>>>>>>>>>>>>>>>>     #define drm_gpuvm_for_each_va_safe(va__, 
>>>>>>>>>>>>>>>>>>>>>> next__, gpuvm__)
>>>>>>>>>>>>>>>>>>>>>> \
>>>>>>>>>>>>>>>>>>>>>>          list_for_each_entry_safe(va__, next__, 
>>>>>>>>>>>>>>>>>>>>>> &(gpuvm__)-
>>>>>>>>>>>>>>>>>>>>>>> rb.list, rb.entry)
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * struct drm_gpuvm_exec - &drm_gpuvm 
>>>>>>>>>>>>>>>>>>>>>> abstraction of
>>>>>>>>>>>>>>>>>>>>>> &drm_exec
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * This structure should be created on the stack as
>>>>>>>>>>>>>>>>>>>>>> &drm_exec should be.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Optionally, @extra can be set in order to 
>>>>>>>>>>>>>>>>>>>>>> lock additional
>>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +struct drm_gpuvm_exec {
>>>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>>>> +        * @exec: the &drm_exec structure
>>>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>>>> +       struct drm_exec exec;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>>>> +        * @vm: the &drm_gpuvm to lock its DMA 
>>>>>>>>>>>>>>>>>>>>>> reservations
>>>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>>>> +       struct drm_gpuvm *vm;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>>>> +        * @extra: Callback and corresponding 
>>>>>>>>>>>>>>>>>>>>>> private data
>>>>>>>>>>>>>>>>>>>>>> for the driver to
>>>>>>>>>>>>>>>>>>>>>> +        * lock arbitrary additional 
>>>>>>>>>>>>>>>>>>>>>> &drm_gem_objects.
>>>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>>>> +       struct {
>>>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>>>> +                * @fn: The driver callback to lock
>>>>>>>>>>>>>>>>>>>>>> additional &drm_gem_objects.
>>>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>>>> +               int (*fn)(struct drm_gpuvm_exec 
>>>>>>>>>>>>>>>>>>>>>> *vm_exec,
>>>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +               /**
>>>>>>>>>>>>>>>>>>>>>> +                * @priv: driver private data for 
>>>>>>>>>>>>>>>>>>>>>> the @fn
>>>>>>>>>>>>>>>>>>>>>> callback
>>>>>>>>>>>>>>>>>>>>>> +                */
>>>>>>>>>>>>>>>>>>>>>> +               void *priv;
>>>>>>>>>>>>>>>>>>>>>> +       } extra;
>>>>>>>>>>>>>>>>>>>>>> +};
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_prepare_vm() - prepare the GPUVMs 
>>>>>>>>>>>>>>>>>>>>>> common dma-
>>>>>>>>>>>>>>>>>>>>>> resv
>>>>>>>>>>>>>>>>>>>>>> + * @gpuvm: the &drm_gpuvm
>>>>>>>>>>>>>>>>>>>>>> + * @exec: the &drm_exec context
>>>>>>>>>>>>>>>>>>>>>> + * @num_fences: the amount of &dma_fences to 
>>>>>>>>>>>>>>>>>>>>>> reserve
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Calls drm_exec_prepare_obj() for the GPUVMs 
>>>>>>>>>>>>>>>>>>>>>> dummy
>>>>>>>>>>>>>>>>>>>>>> &drm_gem_object.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Using this function directly, it is the driver's
>>>>>>>>>>>>>>>>>>>>>> responsibility to call
>>>>>>>>>>>>>>>>>>>>>> + * drm_exec_init() and drm_exec_fini() accordingly.
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Returns: 0 on success, negative error code on 
>>>>>>>>>>>>>>>>>>>>>> failure.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +static inline int
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>> + struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>>> + unsigned int num_fences)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       return drm_exec_prepare_obj(exec, 
>>>>>>>>>>>>>>>>>>>>>> &gpuvm->d_obj,
>>>>>>>>>>>>>>>>>>>>>> num_fences);
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +
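For context, a minimal sketch of driving drm_gpuvm_prepare_vm() and drm_gpuvm_prepare_objects() with a manually managed drm_exec context, as the kerneldoc above requires. This assumes the generic drm_exec helpers from <drm/drm_exec.h>; the DRM_EXEC_INTERRUPTIBLE_WAIT flag, the function name and the error-handling shape are assumptions for illustration, not taken from this patch:

	static int my_drv_lock_vm(struct drm_gpuvm *gpuvm, unsigned int num_fences)
	{
		struct drm_exec exec;
		int ret = 0;

		drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
		drm_exec_until_all_locked(&exec) {
			/* Lock the GPUVM's common dma-resv first ... */
			ret = drm_gpuvm_prepare_vm(gpuvm, &exec, num_fences);
			drm_exec_retry_on_contention(&exec);
			if (ret)
				goto out;

			/* ... then the dma-resvs of all external objects. */
			ret = drm_gpuvm_prepare_objects(gpuvm, &exec, num_fences);
			drm_exec_retry_on_contention(&exec);
			if (ret)
				goto out;
		}

		/* ... use the locked objects here ... */
	out:
		drm_exec_fini(&exec);
		return ret;
	}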
>>>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>> +                              struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>>> +                              unsigned int num_fences);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>> +                            struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>>> +                            u64 addr, u64 range,
>>>>>>>>>>>>>>>>>>>>>> +                            unsigned int num_fences);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>>> +                       unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>>>> +                       bool interruptible);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>>> +                              struct drm_gem_object **objs,
>>>>>>>>>>>>>>>>>>>>>> +                              unsigned int num_objs,
>>>>>>>>>>>>>>>>>>>>>> +                              unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>>>> +                              bool interruptible);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>>> +                              u64 addr, u64 range,
>>>>>>>>>>>>>>>>>>>>>> +                              unsigned int num_fences,
>>>>>>>>>>>>>>>>>>>>>> +                              bool interruptible);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_unlock() - unlock all dma-resv of all associated BOs
>>>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * Releases all dma-resv locks of all &drm_gem_objects previously acquired
>>>>>>>>>>>>>>>>>>>>>> + * through drm_gpuvm_exec_lock() or its variants.
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       drm_exec_fini(&vm_exec->exec);
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm);
>>>>>>>>>>>>>>>>>>>>>> +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>> +                              struct drm_exec *exec,
>>>>>>>>>>>>>>>>>>>>>> +                              struct dma_fence *fence,
>>>>>>>>>>>>>>>>>>>>>> +                              enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>>>>>>>>> +                              enum dma_resv_usage extobj_usage);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>>>>>>>>> + * drm_gpuvm_exec_resv_add_fence() - add a fence to the VM's private and extobj dma-resvs
>>>>>>>>>>>>>>>>>>>>>> + * @vm_exec: the &drm_gpuvm_exec abstraction
>>>>>>>>>>>>>>>>>>>>>> + * @fence: fence to add
>>>>>>>>>>>>>>>>>>>>>> + * @private_usage: private dma-resv usage
>>>>>>>>>>>>>>>>>>>>>> + * @extobj_usage: extobj dma-resv usage
>>>>>>>>>>>>>>>>>>>>>> + *
>>>>>>>>>>>>>>>>>>>>>> + * See drm_gpuvm_resv_add_fence().
>>>>>>>>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>>>>>>>>> +static inline void
>>>>>>>>>>>>>>>>>>>>>> +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec *vm_exec,
>>>>>>>>>>>>>>>>>>>>>> +                              struct dma_fence *fence,
>>>>>>>>>>>>>>>>>>>>>> +                              enum dma_resv_usage private_usage,
>>>>>>>>>>>>>>>>>>>>>> +                              enum dma_resv_usage extobj_usage)
>>>>>>>>>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>>>>>>>>>> +       drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, fence,
>>>>>>>>>>>>>>>>>>>>>> +                                private_usage, extobj_usage);
>>>>>>>>>>>>>>>>>>>>>> +}
>>>>>>>>>>>>>>>>>>>>>> +
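The drm_gpuvm_exec helpers above combine into a fairly compact submission path. The sketch below shows the intended lock / validate / add-fence / unlock sequence; the my_drv_* names, the job structure, the extra-callback wiring and the concrete dma_resv usage values are illustrative assumptions, not defined by this patch:

	struct my_drv_job;	/* hypothetical driver job, carries extra_obj */

	/* Hypothetical extra callback locking one driver-private object. */
	static int my_drv_lock_extra(struct drm_gpuvm_exec *vm_exec,
				     unsigned int num_fences)
	{
		struct my_drv_job *job = vm_exec->extra.priv;

		return drm_exec_prepare_obj(&vm_exec->exec, job->extra_obj,
					    num_fences);
	}

	static int my_drv_submit(struct drm_gpuvm *gpuvm, struct my_drv_job *job)
	{
		struct drm_gpuvm_exec vm_exec = {
			.vm = gpuvm,
			.extra.fn = my_drv_lock_extra,
			.extra.priv = job,
		};
		struct dma_fence *fence;
		int ret;

		/* Lock the VM's dma-resv, all extobj dma-resvs and whatever
		 * the extra callback prepares. */
		ret = drm_gpuvm_exec_lock(&vm_exec, 1, true);
		if (ret)
			return ret;

		/* Revalidate evicted BOs via the bo_validate() op. */
		ret = drm_gpuvm_validate(gpuvm);
		if (ret)
			goto out_unlock;

		fence = my_drv_run_job(job);	/* hypothetical */

		drm_gpuvm_exec_resv_add_fence(&vm_exec, fence,
					      DMA_RESV_USAGE_BOOKKEEP,
					      DMA_RESV_USAGE_WRITE);
		dma_fence_put(fence);

	out_unlock:
		drm_gpuvm_exec_unlock(&vm_exec);
		return ret;
	}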
>>>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>>>      * struct drm_gpuvm_bo - structure representing a &drm_gpuvm and
>>>>>>>>>>>>>>>>>>>>>>      * &drm_gem_object combination
>>>>>>>>>>>>>>>>>>>>>> @@ -398,6 +569,18 @@ struct drm_gpuvm_bo {
>>>>>>>>>>>>>>>>>>>>>>                          * gpuva list.
>>>>>>>>>>>>>>>>>>>>>>                          */
>>>>>>>>>>>>>>>>>>>>>>                          struct list_head gem;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>>>>>>>> +                        * @extobj: List entry to attach to the &drm_gpuvm's
>>>>>>>>>>>>>>>>>>>>>> +                        * extobj list.
>>>>>>>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>>>>>>>> +                       struct list_head extobj;
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +                       /**
>>>>>>>>>>>>>>>>>>>>>> +                        * @evict: List entry to attach to the &drm_gpuvm's
>>>>>>>>>>>>>>>>>>>>>> +                        * evict list.
>>>>>>>>>>>>>>>>>>>>>> +                        */
>>>>>>>>>>>>>>>>>>>>>> +                       struct list_head evict;
>>>>>>>>>>>>>>>>>>>>>>                  } entry;
>>>>>>>>>>>>>>>>>>>>>>          } list;
>>>>>>>>>>>>>>>>>>>>>>     };
>>>>>>>>>>>>>>>>>>>>>> @@ -432,6 +615,9 @@ struct drm_gpuvm_bo *
>>>>>>>>>>>>>>>>>>>>>> drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm,
>>>>>>>>>>>>>>>>>>>>>>                    struct drm_gem_object *obj);
>>>>>>>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict);
>>>>>>>>>>>>>>>>>>>>>> +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo);
>>>>>>>>>>>>>>>>>>>>>> +
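The two helpers above are meant to be driven from driver code; a minimal sketch of where such calls might live follows. The my_drv_* names are illustrative assumptions, not part of the patch:

	/* Hypothetical: after obtaining a vm_bo for a GEM object that is shared
	 * with other VMs, track it as an external object of this VM. */
	static void my_drv_track_extobj(struct drm_gpuvm_bo *vm_bo)
	{
		drm_gpuvm_bo_extobj_add(vm_bo);
	}

	/* Hypothetical: from the driver's eviction/move notification, mark (or
	 * clear) the object on the evict lists of the VMs mapping it, so that
	 * drm_gpuvm_validate() revalidates it on the next submission. */
	static void my_drv_move_notify(struct drm_gem_object *obj, bool evicted)
	{
		drm_gpuvm_bo_evict(obj, evicted);
	}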
>>>>>>>>>>>>>>>>>>>>>>     /**
>>>>>>>>>>>>>>>>>>>>>>      * drm_gpuvm_bo_for_each_va() - iterator to walk over a list of &drm_gpuva
>>>>>>>>>>>>>>>>>>>>>>      * @va__: &drm_gpuva structure to assign to in each iteration step
>>>>>>>>>>>>>>>>>>>>>> @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops {
>>>>>>>>>>>>>>>>>>>>>>           * used.
>>>>>>>>>>>>>>>>>>>>>>           */
>>>>>>>>>>>>>>>>>>>>>>          int (*sm_step_unmap)(struct drm_gpuva_op *op, void *priv);
>>>>>>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>>>>>> +       /**
>>>>>>>>>>>>>>>>>>>>>> +        * @bo_validate: called from drm_gpuvm_validate()
>>>>>>>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>>>>>>>> +        * Drivers receive this callback for every evicted &drm_gem_object being
>>>>>>>>>>>>>>>>>>>>>> +        * mapped in the corresponding &drm_gpuvm.
>>>>>>>>>>>>>>>>>>>>>> +        *
>>>>>>>>>>>>>>>>>>>>>> +        * Typically, drivers would call their driver-specific variant of
>>>>>>>>>>>>>>>>>>>>>> +        * ttm_bo_validate() from within this callback.
>>>>>>>>>>>>>>>>>>>>>> +        */
>>>>>>>>>>>>>>>>>>>>>> +       int (*bo_validate)(struct drm_gem_object *obj);
>>>>>>>>>>>>>>>>>>>>>>     };
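As a sketch of the bo_validate() hook described above, a TTM-based driver might wire it up roughly as follows; the my_drv_ prefix and the placement helper are assumptions, not defined by this patch:

	static int my_drv_bo_validate(struct drm_gem_object *obj)
	{
		struct ttm_buffer_object *bo =
			container_of(obj, struct ttm_buffer_object, base);
		struct ttm_operation_ctx ctx = {
			.interruptible = true,
			.no_wait_gpu = false,
		};

		/* my_drv_preferred_placement() is a hypothetical helper returning
		 * a GPU-accessible struct ttm_placement for the evicted BO. */
		return ttm_bo_validate(bo, my_drv_preferred_placement(bo), &ctx);
	}

The driver would then set this op in its &drm_gpuvm_ops alongside the sm_step callbacks.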
>>>>>>>>>>>>>>>>>>>>>>     int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation
  2023-09-20 14:02                                             ` Thomas Hellström
@ 2023-09-20 14:11                                               ` Christian König
  0 siblings, 0 replies; 77+ messages in thread
From: Christian König @ 2023-09-20 14:11 UTC (permalink / raw)
  To: Thomas Hellström, Danilo Krummrich
  Cc: airlied, daniel, matthew.brost, sarah.walker, donald.robson,
	boris.brezillon, faith.ekstrand, dri-devel, nouveau,
	linux-kernel

On 20.09.23 at 16:02, Thomas Hellström wrote:
> [SNIP]
>>> Do you by "relocation" list refer to what gpuvm calls "evict" list 
>>> or something else? Like the relocation/validation list that used to 
>>> be sent from user-space for non-VM_BIND vms?
>>
>> The BOs send into the kernel with each command submission on the 
>> classic IOCTLs.
>>
>>>
>>> The vm bos plus the external/shared bos bound to the VM (the 
>>> external list) are the bos being referenced by the current batch. So 
>>> the bos on the VM's external list are the ones being locked and 
>>> fenced and checked for eviction. If they weren't they could be 
>>> evicted before the current batch completes?
>>
>> That only applies to a certain use case, e.g. Vulkan or user mode 
>> queues.
>>
>> Multimedia APIs and especially OpenGL work differently, here only the 
>> BOs mentioned in the relocation list are guaranteed to not be evicted.
>>
>> This is intentional because those APIs tend to over allocate memory 
>> all the time, so for good performance you need to be able to evict 
>> BOs from the VM while other parts of the VM are currently in use.
>>
>> Without that especially OpenGL performance would be completely 
>> crippled at least on amdgpu.
>
> OK, I've always wondered how overcommitting a local VM would be handled 
> on VM_BIND, where we don't have the relocation list, at least not in 
> xe, so we have what you refer to as the user mode queues.
>
> I figure those APIs that suffer from overcommitting would maintain a 
> "current working set" in user-space and send changes as deltas to the 
> kernel as unbinds/binds. Or at least "can be unbound / can no longer 
> be unbound" advice.
>
> This may turn out interesting.

Essentially this is how Windows used to work till (I think) Windows 8.

Basically the kernel is responsible for figuring out which BOs to move 
in/out of VRAM for each submission an application makes. And it is 
perfectly acceptable for an application to allocate 8GiB of VRAM when 
only 4GiB is physically available.

To be honest I think it's one of the worst things ever invented, but we 
somehow have to support it for some use cases.

Christian.

>
> /Thomas


^ permalink raw reply	[flat|nested] 77+ messages in thread

end of thread, other threads:[~2023-09-20 14:11 UTC | newest]

Thread overview: 77+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-09 15:31 [PATCH drm-misc-next v3 0/7] [RFC] DRM GPUVA Manager GPU-VM features Danilo Krummrich
2023-09-09 15:31 ` [PATCH drm-misc-next v3 1/7] drm/gpuvm: rename struct drm_gpuva_manager to struct drm_gpuvm Danilo Krummrich
2023-09-09 18:23   ` kernel test robot
2023-09-09 15:31 ` [PATCH drm-misc-next v3 2/7] drm/gpuvm: allow building as module Danilo Krummrich
2023-09-11 13:09   ` Christian König
2023-09-09 15:31 ` [PATCH drm-misc-next v3 3/7] drm/nouveau: uvmm: rename 'umgr' to 'base' Danilo Krummrich
2023-09-09 15:31 ` [PATCH drm-misc-next v3 4/7] drm/gpuvm: common dma-resv per struct drm_gpuvm Danilo Krummrich
2023-09-11 12:00   ` Boris Brezillon
2023-09-11 16:16     ` Danilo Krummrich
2023-09-09 15:31 ` [PATCH drm-misc-next v3 5/7] drm/gpuvm: add an abstraction for a VM / BO combination Danilo Krummrich
2023-09-11 17:19   ` Thomas Hellström
2023-09-11 17:49     ` Danilo Krummrich
2023-09-11 18:37       ` Thomas Hellström
2023-09-12  7:42       ` Thomas Hellström
2023-09-12 10:06         ` Danilo Krummrich
2023-09-12 10:33           ` Thomas Hellström
2023-09-12 11:05             ` Danilo Krummrich
2023-09-09 15:31 ` [PATCH drm-misc-next v3 6/7] drm/gpuvm: generalize dma_resv/extobj handling and GEM validation Danilo Krummrich
2023-09-09 20:16   ` kernel test robot
2023-09-11 10:35   ` Boris Brezillon
2023-09-11 16:23     ` Danilo Krummrich
2023-09-11 12:54   ` Boris Brezillon
2023-09-11 14:45   ` Boris Brezillon
2023-09-11 16:30     ` Danilo Krummrich
2023-09-12 16:20   ` Thomas Hellström
2023-09-12 16:50     ` Danilo Krummrich
2023-09-12 19:23       ` Thomas Hellström
2023-09-12 23:36         ` Danilo Krummrich
2023-09-13  9:14           ` Thomas Hellström
2023-09-13 12:16             ` Danilo Krummrich
2023-09-13 14:26               ` Christian König
2023-09-13 15:13                 ` Thomas Hellström
2023-09-13 15:26                   ` Christian König
2023-09-13 15:15                 ` Danilo Krummrich
2023-09-13 15:33                   ` Christian König
2023-09-13 15:46                     ` Danilo Krummrich
2023-09-19 12:07                       ` Christian König
2023-09-19 12:21                         ` Thomas Hellström
2023-09-19 15:16                           ` Danilo Krummrich
2023-09-19 15:23                             ` Thomas Hellström
2023-09-20  5:37                               ` Christian König
2023-09-20  7:44                                 ` Thomas Hellström
2023-09-20  8:29                                   ` Thomas Hellström
2023-09-20 10:51                                   ` Christian König
2023-09-20 12:06                                     ` Thomas Hellström
2023-09-20 13:06                                       ` Christian König
2023-09-20 13:38                                         ` Thomas Hellström
2023-09-20 13:48                                           ` Christian König
2023-09-20 14:02                                             ` Thomas Hellström
2023-09-20 14:11                                               ` Christian König
2023-09-14 10:57               ` [Nouveau] " Danilo Krummrich
2023-09-14 11:32                 ` Thomas Hellström
2023-09-14 15:27                   ` Danilo Krummrich
2023-09-14 17:13                     ` Thomas Hellström
2023-09-14 17:15                       ` Danilo Krummrich
2023-09-18 11:21                         ` Danilo Krummrich
2023-09-13  7:03     ` Boris Brezillon
2023-09-13  7:05       ` Dave Airlie
2023-09-13  7:19         ` Boris Brezillon
2023-09-13 10:39           ` Thomas Hellström
2023-09-13 11:33             ` Boris Brezillon
2023-09-13 12:01               ` Danilo Krummrich
2023-09-13 13:22               ` Thomas Hellström
2023-09-13 14:01                 ` Boris Brezillon
2023-09-13 14:29                   ` Thomas Hellström
2023-09-13 15:17                     ` Boris Brezillon
2023-09-14  8:20                 ` Boris Brezillon
2023-09-14 10:45                   ` Thomas Hellström
2023-09-14 11:54                     ` Boris Brezillon
2023-09-14 13:33                       ` Thomas Hellström
2023-09-14 15:37                         ` Boris Brezillon
2023-09-14 13:48   ` Thomas Hellström
2023-09-14 16:36     ` Danilo Krummrich
2023-09-14 17:21       ` Thomas Hellström
2023-09-14 17:25         ` Danilo Krummrich
2023-09-14 19:14           ` Thomas Hellström
2023-09-09 15:31 ` [PATCH drm-misc-next v3 7/7] drm/nouveau: GPUVM dma-resv/extobj handling, " Danilo Krummrich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).