* [PATCH v6 00/20] drm/i915/vm_bind: Add VM_BIND functionality
@ 2022-11-07  8:51 ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

The DRM_I915_GEM_VM_BIND/UNBIND ioctls allow a UMD to bind/unbind GEM
buffer objects (BOs), or sections of BOs, at specified GPU virtual
addresses on a specified address space (VM). Multiple mappings can map
to the same physical pages of an object (aliasing). These mappings (also
referred to as persistent mappings) persist across multiple GPU
submissions (execbuf calls) issued by the UMD, without the user having
to provide a list of all required mappings during each submission (as
required by the older execbuf mode).

This patch series supports VM_BIND version 1, as described by the
I915_PARAM_VM_BIND_VERSION param.

Add a new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
vm_bind mode, and vm_bind mode only works with this new execbuf3 ioctl.
The new execbuf3 ioctl does not have any execlist support, and all
legacy features such as relocations are removed.
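
For illustration, below is a minimal userspace sketch of the bind/unbind
flow. The field names (vm_id, handle, start, offset, length) match the
kernel side of this series; the DRM_IOCTL_I915_GEM_VM_BIND/UNBIND request
macros, the libdrm drmIoctl() wrapper and the concrete values are
assumptions for the example, not a copy of the final uapi header.

  /* Bind a section of a BO at a fixed, page-aligned GPU VA. */
  struct drm_i915_gem_vm_bind bind = {
          .vm_id  = vm_id,        /* VM created with the vm_bind opt-in */
          .handle = bo_handle,    /* GEM handle of the object to map */
          .start  = 0x100000,     /* GPU virtual address */
          .offset = 0,            /* offset into the object */
          .length = bo_size,      /* aligned to the object's max page size */
  };
  if (drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind))
          return -errno;

  /* ... submit work referencing the mapping via the execbuf3 ioctl ... */

  /* Unbind must match an existing mapping exactly (same start/length). */
  struct drm_i915_gem_vm_unbind unbind = {
          .vm_id  = vm_id,
          .start  = bind.start,
          .length = bind.length,
  };
  if (drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_UNBIND, &unbind))
          return -errno;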

Notes:
* It is based on the VM_BIND design + uapi RFC below:
  Documentation/gpu/rfc/i915_vm_bind.rst

* The IGT RFC series is posted as:
  [PATCH i-g-t v5 0/12] vm_bind: Add VM_BIND validation support

v2: Address various review comments
v3: Address review comments and other fixes
v4: Remove vm_unbind out fence uapi which is not supported yet,
    replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()
v5: Render kernel-doc, use PIN_NOEVICT, limit vm_bind support to
    non-recoverable faults
v6: Rebased, minor fixes, add reserved fields to drm_i915_gem_vm_bind,
    add new patch for async vm_unbind support

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>

Niranjana Vishwanathapura (20):
  drm/i915/vm_bind: Expose vm lookup function
  drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
  drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
  drm/i915/vm_bind: Add support to create persistent vma
  drm/i915/vm_bind: Implement bind and unbind of object
  drm/i915/vm_bind: Support for VM private BOs
  drm/i915/vm_bind: Add support to handle object evictions
  drm/i915/vm_bind: Support persistent vma activeness tracking
  drm/i915/vm_bind: Add out fence support
  drm/i915/vm_bind: Abstract out common execbuf functions
  drm/i915/vm_bind: Use common execbuf functions in execbuf path
  drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
  drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
  drm/i915/vm_bind: Expose i915_request_await_bind()
  drm/i915/vm_bind: Handle persistent vmas in execbuf3
  drm/i915/vm_bind: userptr dma-resv changes
  drm/i915/vm_bind: Limit vm_bind mode to non-recoverable contexts
  drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode
  drm/i915/vm_bind: Render VM_BIND documentation
  drm/i915/vm_bind: Async vm_unbind support

 Documentation/gpu/i915.rst                    |  78 +-
 drivers/gpu/drm/i915/Makefile                 |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  43 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |  17 +
 drivers/gpu/drm/i915/gem/i915_gem_create.c    |  72 +-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |   6 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 516 +----------
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 871 ++++++++++++++++++
 .../drm/i915/gem/i915_gem_execbuffer_common.c | 666 +++++++++++++
 .../drm/i915/gem/i915_gem_execbuffer_common.h |  74 ++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h    |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   2 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 +
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  19 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  30 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c    | 449 +++++++++
 drivers/gpu/drm/i915/gt/intel_gtt.c           |  17 +
 drivers/gpu/drm/i915/gt/intel_gtt.h           |  21 +
 drivers/gpu/drm/i915/i915_driver.c            |   4 +
 drivers/gpu/drm/i915/i915_drv.h               |   2 +
 drivers/gpu/drm/i915/i915_gem_gtt.c           |  39 +
 drivers/gpu/drm/i915/i915_gem_gtt.h           |   3 +
 drivers/gpu/drm/i915/i915_getparam.c          |   3 +
 drivers/gpu/drm/i915/i915_sw_fence.c          |  28 +-
 drivers/gpu/drm/i915/i915_sw_fence.h          |  23 +-
 drivers/gpu/drm/i915/i915_vma.c               | 186 +++-
 drivers/gpu/drm/i915/i915_vma.h               |  68 +-
 drivers/gpu/drm/i915/i915_vma_types.h         |  39 +
 include/uapi/drm/i915_drm.h                   | 264 +++++-
 30 files changed, 3008 insertions(+), 546 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [Intel-gfx] [PATCH v6 01/20] drm/i915/vm_bind: Expose vm lookup function
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:51   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Make the i915_gem_vm_lookup() function non-static, as it will be
used by the vm_bind feature.
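
As a sketch of the intended caller (mirroring the vm_bind ioctl added
later in this series), the lookup pairs with i915_vm_put() once the
reference is no longer needed:

	struct i915_address_space *vm;

	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
	if (unlikely(!vm))
		return -ENOENT;

	ret = i915_gem_vm_bind_obj(vm, args, file);

	i915_vm_put(vm);	/* drop the reference taken by the lookup */
	return ret;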

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 11 ++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_context.h |  3 +++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 01402f3c58f6..6bed0633f744 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -346,7 +346,16 @@ static int proto_context_register(struct drm_i915_file_private *fpriv,
 	return ret;
 }
 
-static struct i915_address_space *
+/**
+ * i915_gem_vm_lookup() - look up the VM reference for a given vm id
+ * @file_priv: the private data associated with the user's file
+ * @id: the VM id
+ *
+ * Finds the VM reference associated with a specific id.
+ *
+ * Returns the VM pointer on success, NULL in case of failure.
+ */
+struct i915_address_space *
 i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id)
 {
 	struct i915_address_space *vm;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index e5b0f66ea1fe..899fa8f1e0fe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -139,6 +139,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
 				       struct drm_file *file);
 
+struct i915_address_space *
+i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id);
+
 struct i915_gem_context *
 i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
 
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [Intel-gfx] [PATCH v6 02/20] drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:51   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Add function __i915_sw_fence_await_reservation() for
asynchronous wait on a dma-resv object with specified
dma_resv_usage. This is required for async vma unbind
with vm_bind.
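
A minimal sketch of how the async unbind path might use the new
interface; only the call signature below comes from this patch, the
surrounding worker/fence names are placeholders:

	/* Wait on all fences of the object, including bookkeeping ones,
	 * before tearing down the binding.
	 */
	ret = __i915_sw_fence_await_reservation(&work->unbind_fence,
						obj->base.resv,
						DMA_RESV_USAGE_BOOKKEEP,
						MAX_SCHEDULE_TIMEOUT,
						GFP_KERNEL);
	if (ret < 0)
		return ret;
	/* ret == 0: nothing to wait on; ret > 0: waits were queued. */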

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/i915_sw_fence.c | 28 +++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_sw_fence.h | 23 +++++++++++++++++------
 2 files changed, 38 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index cc2a8821d22a..ae06d35db056 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -7,7 +7,6 @@
 #include <linux/slab.h>
 #include <linux/dma-fence.h>
 #include <linux/irq_work.h>
-#include <linux/dma-resv.h>
 
 #include "i915_sw_fence.h"
 #include "i915_selftest.h"
@@ -569,11 +568,26 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
 	return ret;
 }
 
-int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
-				    struct dma_resv *resv,
-				    bool write,
-				    unsigned long timeout,
-				    gfp_t gfp)
+/**
+ * __i915_sw_fence_await_reservation() - Set up a fence to wait on a dma-resv
+ * object with specified usage.
+ * @fence: the fence that needs to wait
+ * @resv: dma-resv object
+ * @usage: dma_resv_usage (See enum dma_resv_usage)
+ * @timeout: how long to wait in jiffies
+ * @gfp: allocation mode
+ *
+ * Set up the @fence to asynchronously wait on dma-resv object @resv for
+ * @usage to complete before signaling.
+ *
+ * Returns 0 if there is nothing to wait on, -ve error code upon error
+ * and >0 upon successfully setting up the wait.
+ */
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+				      struct dma_resv *resv,
+				      enum dma_resv_usage usage,
+				      unsigned long timeout,
+				      gfp_t gfp)
 {
 	struct dma_resv_iter cursor;
 	struct dma_fence *f;
@@ -582,7 +596,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 	debug_fence_assert(fence);
 	might_sleep_if(gfpflags_allow_blocking(gfp));
 
-	dma_resv_iter_begin(&cursor, resv, dma_resv_usage_rw(write));
+	dma_resv_iter_begin(&cursor, resv, usage);
 	dma_resv_for_each_fence_unlocked(&cursor, f) {
 		pending = i915_sw_fence_await_dma_fence(fence, f, timeout,
 							gfp);
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h b/drivers/gpu/drm/i915/i915_sw_fence.h
index f752bfc7c6e1..9c4859dc4c0d 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -10,13 +10,13 @@
 #define _I915_SW_FENCE_H_
 
 #include <linux/dma-fence.h>
+#include <linux/dma-resv.h>
 #include <linux/gfp.h>
 #include <linux/kref.h>
 #include <linux/notifier.h> /* for NOTIFY_DONE */
 #include <linux/wait.h>
 
 struct completion;
-struct dma_resv;
 struct i915_sw_fence;
 
 enum i915_sw_fence_notify {
@@ -89,11 +89,22 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
 				  unsigned long timeout,
 				  gfp_t gfp);
 
-int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
-				    struct dma_resv *resv,
-				    bool write,
-				    unsigned long timeout,
-				    gfp_t gfp);
+int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+				      struct dma_resv *resv,
+				      enum dma_resv_usage usage,
+				      unsigned long timeout,
+				      gfp_t gfp);
+
+static inline int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
+						  struct dma_resv *resv,
+						  bool write,
+						  unsigned long timeout,
+						  gfp_t gfp)
+{
+	return __i915_sw_fence_await_reservation(fence, resv,
+						 dma_resv_usage_rw(write),
+						 timeout, gfp);
+}
 
 bool i915_sw_fence_await(struct i915_sw_fence *fence);
 void i915_sw_fence_complete(struct i915_sw_fence *fence);
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 03/20] drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:51   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

Expose the i915_gem_object_max_page_size() function (make it
non-static) so that it can be used by the vm_bind feature.
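
As a sketch of the intended use (as done in the vm_bind path later in
this series), the bind offset and length are validated against the
object's largest minimum page size:

	/* Ensure offset and length are aligned to object's max page size */
	if (!IS_ALIGNED(va->offset | va->length,
			i915_gem_object_max_page_size(obj->mm.placements,
						      obj->mm.n_placements)))
		return -EINVAL;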

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c | 18 +++++++++++++-----
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  2 ++
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 33673fe7ee0a..5c6e396ab74d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -15,10 +15,18 @@
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
-static u32 object_max_page_size(struct intel_memory_region **placements,
-				unsigned int n_placements)
+/**
+ * i915_gem_object_max_page_size() - max of min_page_size of the regions
+ * @placements:  list of regions
+ * @n_placements: number of the placements
+ *
+ * Returns the largest of min_page_size of the @placements,
+ * or I915_GTT_PAGE_SIZE_4K if @n_placements is 0.
+ */
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+				  unsigned int n_placements)
 {
-	u32 max_page_size = 0;
+	u32 max_page_size = I915_GTT_PAGE_SIZE_4K;
 	int i;
 
 	for (i = 0; i < n_placements; i++) {
@@ -28,7 +36,6 @@ static u32 object_max_page_size(struct intel_memory_region **placements,
 		max_page_size = max_t(u32, max_page_size, mr->min_page_size);
 	}
 
-	GEM_BUG_ON(!max_page_size);
 	return max_page_size;
 }
 
@@ -99,7 +106,8 @@ __i915_gem_object_create_user_ext(struct drm_i915_private *i915, u64 size,
 
 	i915_gem_flush_free_objects(i915);
 
-	size = round_up(size, object_max_page_size(placements, n_placements));
+	size = round_up(size, i915_gem_object_max_page_size(placements,
+							    n_placements));
 	if (size == 0)
 		return ERR_PTR(-EINVAL);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 6b9ecff42bb5..db3dd0e285c5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -47,6 +47,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
 }
 
 void i915_gem_init__objects(struct drm_i915_private *i915);
+u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
+				  unsigned int n_placements);
 
 void i915_objects_module_exit(void);
 int i915_objects_module_init(void);
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [Intel-gfx] [PATCH v6 04/20] drm/i915/vm_bind: Add support to create persistent vma
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:51   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Add i915_vma_create_persistent() to create persistent vmas.
Persistent vmas will use i915_gtt_view to support partial binding.

vma_lookup is tied to a segment of the object instead of a section
of the VA space. Hence, it does not support aliasing, i.e., multiple
mappings (at different VAs) pointing to the same gtt_view of an object.
Skip vma_lookup for persistent vmas to support aliasing.
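
A sketch of the caller side (as used by the vm_bind path later in this
series): a partial gtt_view describes the bound section of the object,
and the resulting persistent vma is keyed by its GPU VA range instead of
the object's vma tree:

	struct i915_gtt_view view;
	struct i915_vma *vma;

	view.type = I915_GTT_VIEW_PARTIAL;
	view.partial.offset = va->offset >> PAGE_SHIFT;
	view.partial.size = va->length >> PAGE_SHIFT;
	vma = i915_vma_create_persistent(obj, vm, &view);
	if (IS_ERR(vma))
		return vma;

	vma->start = va->start;
	vma->last = va->start + va->length - 1;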

v2: Remove unused I915_VMA_PERSISTENT definition,
    update validity check in i915_vma_compare(),
    remove unwanted is_persistent check in release_references().

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_vma.c       | 36 +++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_vma.h       | 17 ++++++++++++-
 drivers/gpu/drm/i915/i915_vma_types.h |  6 +++++
 3 files changed, 56 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index c39488eb9eeb..529d97318f00 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -109,7 +109,8 @@ static void __i915_vma_retire(struct i915_active *ref)
 static struct i915_vma *
 vma_create(struct drm_i915_gem_object *obj,
 	   struct i915_address_space *vm,
-	   const struct i915_gtt_view *view)
+	   const struct i915_gtt_view *view,
+	   bool skip_lookup_cache)
 {
 	struct i915_vma *pos = ERR_PTR(-E2BIG);
 	struct i915_vma *vma;
@@ -196,6 +197,9 @@ vma_create(struct drm_i915_gem_object *obj,
 		__set_bit(I915_VMA_GGTT_BIT, __i915_vma_flags(vma));
 	}
 
+	if (skip_lookup_cache)
+		goto skip_rb_insert;
+
 	rb = NULL;
 	p = &obj->vma.tree.rb_node;
 	while (*p) {
@@ -220,6 +224,7 @@ vma_create(struct drm_i915_gem_object *obj,
 	rb_link_node(&vma->obj_node, rb, p);
 	rb_insert_color(&vma->obj_node, &obj->vma.tree);
 
+skip_rb_insert:
 	if (i915_vma_is_ggtt(vma))
 		/*
 		 * We put the GGTT vma at the start of the vma-list, followed
@@ -299,7 +304,34 @@ i915_vma_instance(struct drm_i915_gem_object *obj,
 
 	/* vma_create() will resolve the race if another creates the vma */
 	if (unlikely(!vma))
-		vma = vma_create(obj, vm, view);
+		vma = vma_create(obj, vm, view, false);
+
+	GEM_BUG_ON(!IS_ERR(vma) && i915_vma_compare(vma, vm, view));
+	return vma;
+}
+
+/**
+ * i915_vma_create_persistent - create a persistent VMA
+ * @obj: parent &struct drm_i915_gem_object to be mapped
+ * @vm: address space in which the mapping is located
+ * @view: additional mapping requirements
+ *
+ * Creates a persistent vma.
+ *
+ * Returns the vma, or an error pointer.
+ */
+struct i915_vma *
+i915_vma_create_persistent(struct drm_i915_gem_object *obj,
+			   struct i915_address_space *vm,
+			   const struct i915_gtt_view *view)
+{
+	struct i915_vma *vma;
+
+	GEM_BUG_ON(!kref_read(&vm->ref));
+
+	vma = vma_create(obj, vm, view, true);
+	if (!IS_ERR(vma))
+		i915_vma_set_persistent(vma);
 
 	GEM_BUG_ON(!IS_ERR(vma) && i915_vma_compare(vma, vm, view));
 	return vma;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index aecd9c64486b..c5378ec2f70a 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -44,6 +44,10 @@ struct i915_vma *
 i915_vma_instance(struct drm_i915_gem_object *obj,
 		  struct i915_address_space *vm,
 		  const struct i915_gtt_view *view);
+struct i915_vma *
+i915_vma_create_persistent(struct drm_i915_gem_object *obj,
+			   struct i915_address_space *vm,
+			   const struct i915_gtt_view *view);
 
 void i915_vma_unpin_and_release(struct i915_vma **p_vma, unsigned int flags);
 #define I915_VMA_RELEASE_MAP BIT(0)
@@ -138,6 +142,16 @@ static inline u32 i915_ggtt_pin_bias(struct i915_vma *vma)
 	return i915_vm_to_ggtt(vma->vm)->pin_bias;
 }
 
+static inline bool i915_vma_is_persistent(const struct i915_vma *vma)
+{
+	return test_bit(I915_VMA_PERSISTENT_BIT, __i915_vma_flags(vma));
+}
+
+static inline void i915_vma_set_persistent(struct i915_vma *vma)
+{
+	set_bit(I915_VMA_PERSISTENT_BIT, __i915_vma_flags(vma));
+}
+
 static inline struct i915_vma *i915_vma_get(struct i915_vma *vma)
 {
 	i915_gem_object_get(vma->obj);
@@ -164,7 +178,8 @@ i915_vma_compare(struct i915_vma *vma,
 {
 	ptrdiff_t cmp;
 
-	GEM_BUG_ON(view && !i915_is_ggtt_or_dpt(vm));
+	GEM_BUG_ON(view && !(i915_is_ggtt_or_dpt(vm) ||
+			     i915_vma_is_persistent(vma)));
 
 	cmp = ptrdiff(vma->vm, vm);
 	if (cmp)
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
index ec0f6c9f57d0..3144d71a0c3e 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -264,6 +264,12 @@ struct i915_vma {
 #define I915_VMA_SCANOUT_BIT	17
 #define I915_VMA_SCANOUT	((int)BIT(I915_VMA_SCANOUT_BIT))
 
+/**
+ * I915_VMA_PERSISTENT_BIT:
+ * The vma is persistent (created with VM_BIND call).
+ */
+#define I915_VMA_PERSISTENT_BIT	19
+
 	struct i915_active active;
 
 #define I915_VMA_PAGES_BIAS 24
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [Intel-gfx] [PATCH v6 05/20] drm/i915/vm_bind: Implement bind and unbind of object
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:51   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Add uapi and implement support for bind and unbind of an
object at the specified GPU virtual addresses.

The vm_bind mode is not supported in legacy execbuf2 ioctl.
It will be supported only in the newer execbuf3 ioctl.

v2: On older platforms ctx->vm is not set, check for it.
    In vm_bind call, add vma to vm_bind_list.
    Add more input validity checks.
    Update some documentation.
v3: In vm_bind call, add vma to vm_bound_list as user can
    request a fence and pass to execbuf3 as input fence.
    Remove short term pinning with PIN_VALIDATE flag.
v4: Replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode().
v5: Ensure all reserved fields are 0, use PIN_NOEVICT.
v6: Add reserved fields to drm_i915_gem_vm_bind.
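
As a pointer to the unbind semantics implemented below, the ioctl looks
up the persistent vma by its exact start address and requires the length
to match the original bind (sketch extracted from the unbind path in this
patch):

	va->start = gen8_noncanonical_addr(va->start);
	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
	if (!vma)
		ret = -ENOENT;		/* no mapping at this address */
	else if (vma->size != va->length)
		ret = -EINVAL;		/* length must match the original bind */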

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |  15 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   5 +
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
 .../drm/i915/gem/i915_gem_vm_bind_object.c    | 324 ++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gtt.c           |  10 +
 drivers/gpu/drm/i915/gt/intel_gtt.h           |   9 +
 drivers/gpu/drm/i915/i915_driver.c            |   3 +
 drivers/gpu/drm/i915/i915_vma.c               |   1 +
 drivers/gpu/drm/i915/i915_vma_types.h         |  14 +
 include/uapi/drm/i915_drm.h                   |  99 ++++++
 11 files changed, 507 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 51704b54317c..b731f3ac80da 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -166,6 +166,7 @@ gem-y += \
 	gem/i915_gem_ttm_move.o \
 	gem/i915_gem_ttm_pm.o \
 	gem/i915_gem_userptr.o \
+	gem/i915_gem_vm_bind_object.o \
 	gem/i915_gem_wait.o \
 	gem/i915_gemfs.o
 i915-y += \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index 899fa8f1e0fe..e8b41aa8f8c4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -139,6 +139,21 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
 				       struct drm_file *file);
 
+/**
+ * i915_gem_vm_is_vm_bind_mode() - Check if address space is in vm_bind mode
+ * @vm: the address space
+ *
+ * Returns:
+ * true: @vm is in vm_bind mode; allows only vm_bind method of binding.
+ * false: @vm is not in vm_bind mode; allows only legacy execbuf method
+ *        of binding.
+ */
+static inline bool i915_gem_vm_is_vm_bind_mode(struct i915_address_space *vm)
+{
+	/* No support to enable vm_bind mode yet */
+	return false;
+}
+
 struct i915_address_space *
 i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 1160723c9d2d..c5bc9f6e887f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
 	if (unlikely(IS_ERR(ctx)))
 		return PTR_ERR(ctx);
 
+	if (ctx->vm && i915_gem_vm_is_vm_bind_mode(ctx->vm)) {
+		i915_gem_context_put(ctx);
+		return -EOPNOTSUPP;
+	}
+
 	eb->gem_context = ctx;
 	if (i915_gem_context_has_full_ppgtt(ctx))
 		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
new file mode 100644
index 000000000000..36262a6357b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_VM_BIND_H
+#define __I915_GEM_VM_BIND_H
+
+#include <linux/types.h>
+
+struct drm_device;
+struct drm_file;
+struct i915_address_space;
+struct i915_vma;
+
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
+
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file);
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+			     struct drm_file *file);
+
+void i915_gem_vm_unbind_all(struct i915_address_space *vm);
+
+#endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
new file mode 100644
index 000000000000..6f299806bee1
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -0,0 +1,324 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include <uapi/drm/i915_drm.h>
+
+#include <linux/interval_tree_generic.h>
+
+#include "gem/i915_gem_context.h"
+#include "gem/i915_gem_vm_bind.h"
+
+#include "gt/intel_gpu_commands.h"
+
+#define START(node) ((node)->start)
+#define LAST(node) ((node)->last)
+
+/* Not all defined functions are used, hence use __maybe_unused */
+INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
+		     START, LAST, __maybe_unused static inline, i915_vm_bind_it)
+
+#undef START
+#undef LAST
+
+/**
+ * DOC: VM_BIND/UNBIND ioctls
+ *
+ * The DRM_I915_GEM_VM_BIND/UNBIND ioctls allow UMDs to bind/unbind GEM buffer
+ * objects (BOs), or sections of BOs, at specified GPU virtual addresses on a
+ * specified address space (VM). Multiple mappings can map to the same physical
+ * pages of an object (aliasing). These mappings (also referred to as persistent
+ * mappings) will be persistent across multiple GPU submissions (execbuf calls)
+ * issued by the UMD, without the user having to provide a list of all required
+ * mappings during each submission (as required by the older execbuf mode).
+ *
+ * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
+ * signaling the completion of bind/unbind operation.
+ *
+ * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
+ * User has to opt-in for VM_BIND mode of binding for an address space (VM)
+ * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
+ *
+ * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
+ * are not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be
+ * done asynchronously, when valid out fence is specified.
+ *
+ * VM_BIND locking order is as below.
+ *
+ * 1) vm_bind_lock mutex will protect vm_bind lists. This lock is taken in
+ *    vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing the
+ *    mapping.
+ *
+ *    In future, when GPU page faults are supported, we can potentially use a
+ *    rwsem instead, so that multiple page fault handlers can take the read
+ *    side lock to lookup the mapping and hence can run in parallel.
+ *    The older execbuf mode of binding does not need this lock.
+ *
+ * 2) The object's dma-resv lock will protect i915_vma state and needs
+ *    to be held while binding/unbinding a vma in the async worker and while
+ *    updating dma-resv fence list of an object. Note that private BOs of a VM
+ *    will all share a dma-resv object.
+ *
+ * 3) Spinlock/s to protect some of the VM's lists like the list of
+ *    invalidated vmas (due to eviction and userptr invalidation) etc.
+ */
+
+/**
+ * i915_gem_vm_bind_lookup_vma() - look up the persistent vma mapped at a
+ * specified address
+ * @vm: virtual address space to look for persistent vma
+ * @va: starting address where vma is mapped
+ *
+ * Retrieves the persistent vma mapped at address @va from the @vm's vma tree.
+ *
+ * Returns vma pointer on success, NULL on failure.
+ */
+struct i915_vma *
+i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va)
+{
+	lockdep_assert_held(&vm->vm_bind_lock);
+
+	return i915_vm_bind_it_iter_first(&vm->va, va, va);
+}
+
+static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj)
+{
+	lockdep_assert_held(&vma->vm->vm_bind_lock);
+
+	list_del_init(&vma->vm_bind_link);
+	i915_vm_bind_it_remove(vma, &vma->vm->va);
+
+	/* Release object */
+	if (release_obj)
+		i915_gem_object_put(vma->obj);
+}
+
+static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
+				  struct drm_i915_gem_vm_unbind *va)
+{
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	int ret;
+
+	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
+	if (ret)
+		return ret;
+
+	va->start = gen8_noncanonical_addr(va->start);
+	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
+
+	if (!vma)
+		ret = -ENOENT;
+	else if (vma->size != va->length)
+		ret = -EINVAL;
+
+	if (ret) {
+		mutex_unlock(&vm->vm_bind_lock);
+		return ret;
+	}
+
+	i915_gem_vm_bind_remove(vma, false);
+
+	mutex_unlock(&vm->vm_bind_lock);
+
+	/*
+	 * Destroy the vma and then release the object.
+	 * As persistent vma holds object reference, it can only be destroyed
+	 * either by vm_unbind ioctl or when VM is being released. As we are
+	 * holding VM reference here, it is safe accessing the vma here.
+	 */
+	obj = vma->obj;
+	i915_gem_object_lock(obj, NULL);
+	i915_vma_destroy(vma);
+	i915_gem_object_unlock(obj);
+
+	i915_gem_object_put(obj);
+
+	return 0;
+}
+
+/**
+ * i915_gem_vm_unbind_all() - unbind all persistent mappings from an
+ * address space
+ * @vm: Address space to remove persistent mappings from
+ *
+ * Unbind all userspace requested vm_bind mappings from @vm.
+ */
+void i915_gem_vm_unbind_all(struct i915_address_space *vm)
+{
+	struct i915_vma *vma, *t;
+
+	mutex_lock(&vm->vm_bind_lock);
+	list_for_each_entry_safe(vma, t, &vm->vm_bind_list, vm_bind_link)
+		i915_gem_vm_bind_remove(vma, true);
+	list_for_each_entry_safe(vma, t, &vm->vm_bound_list, vm_bind_link)
+		i915_gem_vm_bind_remove(vma, true);
+	mutex_unlock(&vm->vm_bind_lock);
+}
+
+static struct i915_vma *vm_bind_get_vma(struct i915_address_space *vm,
+					struct drm_i915_gem_object *obj,
+					struct drm_i915_gem_vm_bind *va)
+{
+	struct i915_gtt_view view;
+	struct i915_vma *vma;
+
+	va->start = gen8_noncanonical_addr(va->start);
+	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
+	if (vma)
+		return ERR_PTR(-EEXIST);
+
+	view.type = I915_GTT_VIEW_PARTIAL;
+	view.partial.offset = va->offset >> PAGE_SHIFT;
+	view.partial.size = va->length >> PAGE_SHIFT;
+	vma = i915_vma_create_persistent(obj, vm, &view);
+	if (IS_ERR(vma))
+		return vma;
+
+	vma->start = va->start;
+	vma->last = va->start + va->length - 1;
+
+	return vma;
+}
+
+static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
+				struct drm_i915_gem_vm_bind *va,
+				struct drm_file *file)
+{
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma = NULL;
+	struct i915_gem_ww_ctx ww;
+	u64 pin_flags;
+	int ret = 0;
+
+	if (!i915_gem_vm_is_vm_bind_mode(vm))
+		return -EOPNOTSUPP;
+
+	/* Ensure start and length fields are valid */
+	if (!va->length || !IS_ALIGNED(va->start, I915_GTT_PAGE_SIZE))
+		ret = -EINVAL;
+
+	obj = i915_gem_object_lookup(file, va->handle);
+	if (!obj)
+		return -ENOENT;
+
+	/* Ensure offset and length are aligned to object's max page size */
+	if (!IS_ALIGNED(va->offset | va->length,
+			i915_gem_object_max_page_size(obj->mm.placements,
+						      obj->mm.n_placements)))
+		ret = -EINVAL;
+
+	/* Check for mapping range overflow */
+	if (range_overflows_t(u64, va->offset, va->length, obj->base.size))
+		ret = -EINVAL;
+
+	if (ret)
+		goto put_obj;
+
+	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
+	if (ret)
+		goto put_obj;
+
+	vma = vm_bind_get_vma(vm, obj, va);
+	if (IS_ERR(vma)) {
+		ret = PTR_ERR(vma);
+		goto unlock_vm;
+	}
+
+	pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER |
+		    PIN_VALIDATE | PIN_NOEVICT;
+
+	for_i915_gem_ww(&ww, ret, true) {
+		ret = i915_gem_object_lock(vma->obj, &ww);
+		if (ret)
+			continue;
+
+		ret = i915_vma_pin_ww(vma, &ww, 0, 0, pin_flags);
+		if (ret)
+			continue;
+
+		list_add_tail(&vma->vm_bind_link, &vm->vm_bound_list);
+		i915_vm_bind_it_insert(vma, &vm->va);
+
+		/* Hold object reference until vm_unbind */
+		i915_gem_object_get(vma->obj);
+	}
+
+	if (ret)
+		i915_vma_destroy(vma);
+unlock_vm:
+	mutex_unlock(&vm->vm_bind_lock);
+put_obj:
+	i915_gem_object_put(obj);
+
+	return ret;
+}
+
+/**
+ * i915_gem_vm_bind_ioctl() - ioctl function for binding a section of object
+ * at a specified virtual address
+ * @dev: drm_device pointer
+ * @data: ioctl data structure
+ * @file: drm_file pointer
+ *
+ * Adds the specified persistent mapping (virtual address to a section of an
+ * object) and binds it in the device page table.
+ *
+ * Returns 0 on success, error code on failure.
+ */
+int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file)
+{
+	struct drm_i915_gem_vm_bind *args = data;
+	struct i915_address_space *vm;
+	int ret;
+
+	/* Reserved fields must be 0 */
+	if (args->rsvd[0] || args->rsvd[1] || args->rsvd[2] || args->extensions)
+		return -EINVAL;
+
+	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
+	if (unlikely(!vm))
+		return -ENOENT;
+
+	ret = i915_gem_vm_bind_obj(vm, args, file);
+
+	i915_vm_put(vm);
+	return ret;
+}
+
+/**
+ * i915_gem_vm_unbind_ioctl() - ioctl function for unbinding a mapping at a
+ * specified virtual address
+ * @dev: drm_device pointer
+ * @data: ioctl data structure
+ * @file: drm_file pointer
+ *
+ * Removes the persistent mapping at the specified address and unbinds it
+ * from the device page table.
+ *
+ * Returns 0 on success, error code on failure. -ENOENT is returned if the
+ * specified mapping is not found.
+ */
+int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
+			     struct drm_file *file)
+{
+	struct drm_i915_gem_vm_unbind *args = data;
+	struct i915_address_space *vm;
+	int ret;
+
+	/* Reserved fields must be 0 */
+	if (args->rsvd || args->rsvd2[0] || args->rsvd2[1] ||
+	    args->rsvd2[2] || args->extensions)
+		return -EINVAL;
+
+	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
+	if (unlikely(!vm))
+		return -ENOENT;
+
+	ret = i915_gem_vm_unbind_vma(vm, args);
+
+	i915_vm_put(vm);
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index e82a9d763e57..412368c67c46 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -12,6 +12,7 @@
 
 #include "gem/i915_gem_internal.h"
 #include "gem/i915_gem_lmem.h"
+#include "gem/i915_gem_vm_bind.h"
 #include "i915_trace.h"
 #include "i915_utils.h"
 #include "intel_gt.h"
@@ -177,6 +178,8 @@ int i915_vm_lock_objects(struct i915_address_space *vm,
 void i915_address_space_fini(struct i915_address_space *vm)
 {
 	drm_mm_takedown(&vm->mm);
+	GEM_BUG_ON(!RB_EMPTY_ROOT(&vm->va.rb_root));
+	mutex_destroy(&vm->vm_bind_lock);
 }
 
 /**
@@ -203,6 +206,8 @@ static void __i915_vm_release(struct work_struct *work)
 	struct i915_address_space *vm =
 		container_of(work, struct i915_address_space, release_work);
 
+	i915_gem_vm_unbind_all(vm);
+
 	__i915_vm_close(vm);
 
 	/* Synchronize async unbinds. */
@@ -279,6 +284,11 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
 
 	INIT_LIST_HEAD(&vm->bound_list);
 	INIT_LIST_HEAD(&vm->unbound_list);
+
+	vm->va = RB_ROOT_CACHED;
+	INIT_LIST_HEAD(&vm->vm_bind_list);
+	INIT_LIST_HEAD(&vm->vm_bound_list);
+	mutex_init(&vm->vm_bind_lock);
 }
 
 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 4d75ba4bb41d..3a9bee1b9d03 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -260,6 +260,15 @@ struct i915_address_space {
 	 */
 	struct list_head unbound_list;
 
+	/** @vm_bind_lock: Mutex to protect @vm_bind_list and @vm_bound_list */
+	struct mutex vm_bind_lock;
+	/** @vm_bind_list: List of vm_bind mappings with binding in progress */
+	struct list_head vm_bind_list;
+	/** @vm_bound_list: List of vm_bind mappings with binding completed */
+	struct list_head vm_bound_list;
+	/** @va: tree of persistent vmas */
+	struct rb_root_cached va;
+
 	/* Global GTT */
 	bool is_ggtt:1;
 
diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
index c3d43f9b1e45..cf41b96ac485 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -69,6 +69,7 @@
 #include "gem/i915_gem_ioctls.h"
 #include "gem/i915_gem_mman.h"
 #include "gem/i915_gem_pm.h"
+#include "gem/i915_gem_vm_bind.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_rc6.h"
@@ -1892,6 +1893,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_VM_BIND, i915_gem_vm_bind_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_VM_UNBIND, i915_gem_vm_unbind_ioctl, DRM_RENDER_ALLOW),
 };
 
 /*
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 529d97318f00..6a64a130dbcd 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -239,6 +239,7 @@ vma_create(struct drm_i915_gem_object *obj,
 	spin_unlock(&obj->vma.lock);
 	mutex_unlock(&vm->mutex);
 
+	INIT_LIST_HEAD(&vma->vm_bind_link);
 	return vma;
 
 err_unlock:
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
index 3144d71a0c3e..db786d2d1530 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -295,6 +295,20 @@ struct i915_vma {
 	/** This object's place on the active/inactive lists */
 	struct list_head vm_link;
 
+	/** @vm_bind_link: node for the vm_bind related lists of vm */
+	struct list_head vm_bind_link;
+
+	/** Interval tree structures for persistent vma */
+
+	/** @rb: node for the interval tree of vm for persistent vmas */
+	struct rb_node rb;
+	/** @start: start VA of the mapped range (interval tree key) */
+	u64 start;
+	/** @last: last VA of the mapped range (interval tree key) */
+	u64 last;
+	/** @__subtree_last: largest @last in this subtree (interval tree bookkeeping) */
+	u64 __subtree_last;
+
 	struct list_head obj_link; /* Link in the object's VMA list */
 	struct rb_node obj_node;
 	struct hlist_node obj_hash;
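
The @start/@last pair above is the interval tree key: i915_gem_vm_bind_object.c
instantiates INTERVAL_TREE_DEFINE() over these fields so that persistent
mappings can be looked up by VA range under vm_bind_lock. As a minimal sketch
(not code from this patch; the generated iterators are static to that file,
and the single-address form is what i915_gem_vm_bind_lookup_vma() uses), a
range query could look like:

  /* Find every persistent mapping overlapping [addr, addr + len - 1]. */
  mutex_lock(&vm->vm_bind_lock);
  for (vma = i915_vm_bind_it_iter_first(&vm->va, addr, addr + len - 1);
       vma;
       vma = i915_vm_bind_it_iter_next(vma, addr, addr + len - 1)) {
          /* [vma->start, vma->last] overlaps the queried range */
  }
  mutex_unlock(&vm->vm_bind_lock);
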
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 8df261c5ab9b..f06a09f1db2d 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -470,6 +470,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_GEM_VM_CREATE		0x3a
 #define DRM_I915_GEM_VM_DESTROY		0x3b
 #define DRM_I915_GEM_CREATE_EXT		0x3c
+#define DRM_I915_GEM_VM_BIND		0x3d
+#define DRM_I915_GEM_VM_UNBIND		0x3e
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -534,6 +536,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
 #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
 #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
+#define DRM_IOCTL_I915_GEM_VM_BIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
+#define DRM_IOCTL_I915_GEM_VM_UNBIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -3727,6 +3731,101 @@ struct drm_i915_gem_create_ext_protected_content {
 /* ID of the protected content session managed by i915 when PXP is active */
 #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
 
+/**
+ * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
+ *
+ * This structure is passed to the VM_BIND ioctl and specifies the mapping of
+ * a GPU virtual address (VA) range to the section of an object that should be
+ * bound in the device page table of the specified address space (VM).
+ * The VA range specified must be unique (i.e., not currently bound) and can
+ * be mapped to the whole object or to a section of the object (partial
+ * binding). Multiple VA mappings can be created to the same section of the
+ * object (aliasing).
+ *
+ * The @start, @offset and @length must be 4K page aligned. However, DG2 and
+ * XEHPSDV have a 64K page size for device local memory and a compact page
+ * table. On those platforms, for binding device local-memory objects, the
+ * @start, @offset and @length must be 64K aligned.
+ *
+ * Error code -EINVAL will be returned if @start, @offset and @length are not
+ * properly aligned. In version 1 (see I915_PARAM_VM_BIND_VERSION), error code
+ * -ENOSPC will be returned if the VA range specified can't be reserved.
+ *
+ * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
+ * are not ordered. Furthermore, parts of the VM_BIND operation can be done
+ * asynchronously, if a valid out fence is specified.
+ */
+struct drm_i915_gem_vm_bind {
+	/** @vm_id: VM (address space) id to bind */
+	__u32 vm_id;
+
+	/** @handle: Object handle */
+	__u32 handle;
+
+	/** @start: Virtual Address start to bind */
+	__u64 start;
+
+	/** @offset: Offset in object to bind */
+	__u64 offset;
+
+	/** @length: Length of mapping to bind */
+	__u64 length;
+
+	/** @rsvd: Reserved, MBZ */
+	__u64 rsvd[3];
+
+	/** @rsvd2: Reserved for timeline fence */
+	__u64 rsvd2[2];
+
+	/**
+	 * @extensions: Zero-terminated chain of extensions.
+	 *
+	 * For future extensions. See struct i915_user_extension.
+	 */
+	__u64 extensions;
+};
+
+/**
+ * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
+ *
+ * This structure is passed to the VM_UNBIND ioctl and specifies the GPU
+ * virtual address (VA) range that should be unbound from the device page
+ * table of the specified address space (VM). VM_UNBIND will forcefully unbind
+ * the specified range from the device page table without waiting for any GPU
+ * job to complete. It is the UMD's responsibility to ensure the mapping is no
+ * longer in use before calling VM_UNBIND.
+ *
+ * If the specified mapping is not found, the ioctl will simply return without
+ * any error.
+ *
+ * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
+ * are not ordered. Furthermore, parts of the VM_UNBIND operation can be done
+ * asynchronously, if a valid out fence is specified.
+ */
+struct drm_i915_gem_vm_unbind {
+	/** @vm_id: VM (address space) id to unbind from */
+	__u32 vm_id;
+
+	/** @rsvd: Reserved, MBZ */
+	__u32 rsvd;
+
+	/** @start: Virtual Address start to unbind */
+	__u64 start;
+
+	/** @length: Length of mapping to unbind */
+	__u64 length;
+
+	/** @rsvd2: Reserved, MBZ */
+	__u64 rsvd2[3];
+
+	/**
+	 * @extensions: Zero-terminated chain of extensions.
+	 *
+	 * For future extensions. See struct i915_user_extension.
+	 */
+	__u64 extensions;
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.21.0.rc0.32.g243a4c7e27
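
As an illustration of the uapi added above (a sketch, not part of the patch):
binding a section of a BO at a fixed GPU VA and unbinding it again from
userspace. It assumes a DRM fd, a vm_id for a VM created in vm_bind mode and
a GEM handle already exist; the VA, offset and length values are arbitrary
but must respect the alignment rules documented above.

  #include <xf86drm.h>
  #include <drm/i915_drm.h>  /* uapi header carrying the structs added above */

  /* Bind a 64KiB section of a BO at a fixed GPU VA, then unbind it again. */
  static int bind_and_unbind(int fd, __u32 vm_id, __u32 handle)
  {
          struct drm_i915_gem_vm_bind bind = {
                  .vm_id = vm_id,
                  .handle = handle,
                  .start = 0x100000,  /* 4K (64K for lmem) aligned GPU VA */
                  .offset = 0,
                  .length = 0x10000,  /* 64KiB, aligned likewise */
          };
          struct drm_i915_gem_vm_unbind unbind = {
                  .vm_id = vm_id,
                  .start = bind.start,
                  .length = bind.length,
          };
          int ret;

          ret = drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind);
          if (ret)
                  return ret;

          /* ... submit GPU work referencing the bound VA (execbuf3) ... */

          return drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_UNBIND, &unbind);
  }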


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [Intel-gfx] [PATCH v6 06/20] drm/i915/vm_bind: Support for VM private BOs
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:51   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Each VM creates a root_obj and shares it with all of its private objects
to use as their dma_resv object. This has a performance advantage in the
execbuf path: only a single dma_resv object needs updating for all private
BOs, instead of one update per dma_resv object for shared BOs.

VM private BOs can only be mapped on the specified VM and cannot be
dma-buf exported. Also, they are supported only in vm_bind mode.

v2: Pad struct drm_i915_gem_create_ext_vm_private for 64bit alignment,
    add input validity checks.
v3: Create root_obj only for ppgtt.
v4: Fix releasing of obj->priv_root. Do not create vm->root_obj yet.
    Allow vm private object creation only in vm_bind mode.
    Replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode().

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_create.c    | 54 ++++++++++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |  6 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  4 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  3 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  6 +++
 .../drm/i915/gem/i915_gem_vm_bind_object.c    |  9 ++++
 drivers/gpu/drm/i915/gt/intel_gtt.c           |  1 +
 drivers/gpu/drm/i915/gt/intel_gtt.h           |  4 ++
 drivers/gpu/drm/i915/i915_vma.c               |  1 +
 drivers/gpu/drm/i915/i915_vma_types.h         |  2 +
 include/uapi/drm/i915_drm.h                   | 33 ++++++++++++
 12 files changed, 122 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 6bed0633f744..1630a52f387d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -83,6 +83,7 @@
 
 #include "i915_file_private.h"
 #include "i915_gem_context.h"
+#include "i915_gem_internal.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 5c6e396ab74d..62648341780b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -11,6 +11,7 @@
 #include "pxp/intel_pxp.h"
 
 #include "i915_drv.h"
+#include "i915_gem_context.h"
 #include "i915_gem_create.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
@@ -251,6 +252,7 @@ struct create_ext {
 	unsigned int n_placements;
 	unsigned int placement_mask;
 	unsigned long flags;
+	u32 vm_id;
 };
 
 static void repr_placements(char *buf, size_t size,
@@ -400,9 +402,32 @@ static int ext_set_protected(struct i915_user_extension __user *base, void *data
 	return 0;
 }
 
+static int ext_set_vm_private(struct i915_user_extension __user *base,
+			      void *data)
+{
+	struct drm_i915_gem_create_ext_vm_private ext;
+	struct create_ext *ext_data = data;
+
+	if (copy_from_user(&ext, base, sizeof(ext)))
+		return -EFAULT;
+
+	/* Reserved fields must be 0 */
+	if (ext.rsvd)
+		return -EINVAL;
+
+	/* vm_id 0 is reserved */
+	if (!ext.vm_id)
+		return -ENOENT;
+
+	ext_data->vm_id = ext.vm_id;
+
+	return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
 	[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
 	[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
+	[I915_GEM_CREATE_EXT_VM_PRIVATE] = ext_set_vm_private,
 };
 
 /**
@@ -418,6 +443,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
 	struct drm_i915_private *i915 = to_i915(dev);
 	struct drm_i915_gem_create_ext *args = data;
 	struct create_ext ext_data = { .i915 = i915 };
+	struct i915_address_space *vm = NULL;
 	struct drm_i915_gem_object *obj;
 	int ret;
 
@@ -431,6 +457,17 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		return ret;
 
+	if (ext_data.vm_id) {
+		vm = i915_gem_vm_lookup(file->driver_priv, ext_data.vm_id);
+		if (unlikely(!vm))
+			return -ENOENT;
+
+		if (!i915_gem_vm_is_vm_bind_mode(vm)) {
+			ret = -EINVAL;
+			goto vm_put;
+		}
+	}
+
 	if (!ext_data.n_placements) {
 		ext_data.placements[0] =
 			intel_memory_region_by_type(i915, INTEL_MEMORY_SYSTEM);
@@ -457,8 +494,21 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
 						ext_data.placements,
 						ext_data.n_placements,
 						ext_data.flags);
-	if (IS_ERR(obj))
-		return PTR_ERR(obj);
+	if (IS_ERR(obj)) {
+		ret = PTR_ERR(obj);
+		goto vm_put;
+	}
+
+	if (vm) {
+		obj->base.resv = vm->root_obj->base.resv;
+		obj->priv_root = i915_gem_object_get(vm->root_obj);
+		i915_vm_put(vm);
+	}
 
 	return i915_gem_publish(obj, file, &args->size, &args->handle);
+vm_put:
+	if (vm)
+		i915_vm_put(vm);
+
+	return ret;
 }
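
The benefit of sharing the reservation object can be sketched as follows
(illustrative fragment only; the real execbuf3 consumer lands later in the
series, and the request rq and the WRITE usage below are assumptions): one
fence update on vm->root_obj covers every private BO, while shared BOs still
need one update each.

  /* All VM private BOs share root_obj's dma-resv: one update covers them. */
  dma_resv_add_fence(vm->root_obj->base.resv, &rq->fence,
                     DMA_RESV_USAGE_WRITE);

  /* Shared BOs keep their own dma-resv and are updated one by one. */
  list_for_each_entry(vma, &vm->non_priv_vm_bind_list, non_priv_vm_bind_link)
          dma_resv_add_fence(vma->obj->base.resv, &rq->fence,
                             DMA_RESV_USAGE_WRITE);
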
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index ec6f7ae47783..a1c900a0c8ef 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -219,6 +219,12 @@ struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags)
 	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
 	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
 
+	if (obj->priv_root) {
+		drm_dbg(obj->base.dev,
+			"Exporting VM private objects is not allowed\n");
+		return ERR_PTR(-EINVAL);
+	}
+
 	exp_info.ops = &i915_dmabuf_ops;
 	exp_info.size = gem_obj->size;
 	exp_info.flags = flags;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index c5bc9f6e887f..43f29acfbec9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -864,6 +864,10 @@ static struct i915_vma *eb_lookup_vma(struct i915_execbuffer *eb, u32 handle)
 		if (unlikely(!obj))
 			return ERR_PTR(-ENOENT);
 
+		/* VM private objects are not supported here */
+		if (obj->priv_root)
+			return ERR_PTR(-EINVAL);
+
 		/*
 		 * If the user has opted-in for protected-object tracking, make
 		 * sure the object encryption can be used.
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 733696057761..2abef7e5af81 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -111,6 +111,9 @@ void __i915_gem_object_fini(struct drm_i915_gem_object *obj)
 	mutex_destroy(&obj->mm.get_page.lock);
 	mutex_destroy(&obj->mm.get_dma_page.lock);
 	dma_resv_fini(&obj->base._resv);
+
+	if (obj->priv_root)
+		i915_gem_object_put(obj->priv_root);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index d0d6772e6f36..80a09d55b855 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -242,6 +242,12 @@ struct drm_i915_gem_object {
 
 	const struct drm_i915_gem_object_ops *ops;
 
+	/**
+	 * @priv_root: pointer to vm->root_obj if object is private,
+	 * NULL otherwise.
+	 */
+	struct drm_i915_gem_object *priv_root;
+
 	struct {
 		/**
 		 * @vma.lock: protect the list/tree of vmas
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 6f299806bee1..19f29fa76c19 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -87,6 +87,7 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj)
 	lockdep_assert_held(&vma->vm->vm_bind_lock);
 
 	list_del_init(&vma->vm_bind_link);
+	list_del_init(&vma->non_priv_vm_bind_link);
 	i915_vm_bind_it_remove(vma, &vma->vm->va);
 
 	/* Release object */
@@ -216,6 +217,11 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
 	if (ret)
 		goto put_obj;
 
+	if (obj->priv_root && obj->priv_root != vm->root_obj) {
+		ret = -EINVAL;
+		goto put_obj;
+	}
+
 	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
 	if (ret)
 		goto put_obj;
@@ -240,6 +246,9 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
 
 		list_add_tail(&vma->vm_bind_link, &vm->vm_bound_list);
 		i915_vm_bind_it_insert(vma, &vm->va);
+		if (!obj->priv_root)
+			list_add_tail(&vma->non_priv_vm_bind_link,
+				      &vm->non_priv_vm_bind_list);
 
 		/* Hold object reference until vm_unbind */
 		i915_gem_object_get(vma->obj);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 412368c67c46..74c3557e5bc4 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -289,6 +289,7 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	INIT_LIST_HEAD(&vm->vm_bind_list);
 	INIT_LIST_HEAD(&vm->vm_bound_list);
 	mutex_init(&vm->vm_bind_lock);
+	INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
 }
 
 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 3a9bee1b9d03..3d0a452567e4 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -268,6 +268,10 @@ struct i915_address_space {
 	struct list_head vm_bound_list;
 	/** @va: tree of persistent vmas */
 	struct rb_root_cached va;
+	/** @non_priv_vm_bind_list: list of non-private object mappings */
+	struct list_head non_priv_vm_bind_list;
+	/** @root_obj: root object for dma-resv sharing by private objects */
+	struct drm_i915_gem_object *root_obj;
 
 	/* Global GTT */
 	bool is_ggtt:1;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 6a64a130dbcd..0ffa24bc0954 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -240,6 +240,7 @@ vma_create(struct drm_i915_gem_object *obj,
 	mutex_unlock(&vm->mutex);
 
 	INIT_LIST_HEAD(&vma->vm_bind_link);
+	INIT_LIST_HEAD(&vma->non_priv_vm_bind_link);
 	return vma;
 
 err_unlock:
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
index db786d2d1530..9cd055738997 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -297,6 +297,8 @@ struct i915_vma {
 
 	/** @vm_bind_link: node for the vm_bind related lists of vm */
 	struct list_head vm_bind_link;
+	/** @non_priv_vm_bind_link: Link in non-private persistent VMA list */
+	struct list_head non_priv_vm_bind_link;
 
 	/** Interval tree structures for persistent vma */
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index f06a09f1db2d..d3d709035b0d 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -3611,9 +3611,13 @@ struct drm_i915_gem_create_ext {
 	 *
 	 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
 	 * struct drm_i915_gem_create_ext_protected_content.
+	 *
+	 * For I915_GEM_CREATE_EXT_VM_PRIVATE usage see
+	 * struct drm_i915_gem_create_ext_vm_private.
 	 */
 #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
 #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
+#define I915_GEM_CREATE_EXT_VM_PRIVATE 2
 	__u64 extensions;
 };
 
@@ -3731,6 +3735,35 @@ struct drm_i915_gem_create_ext_protected_content {
 /* ID of the protected content session managed by i915 when PXP is active */
 #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
 
+/**
+ * struct drm_i915_gem_create_ext_vm_private - Extension to make the object
+ * private to the specified VM.
+ *
+ * See struct drm_i915_gem_create_ext.
+ *
+ * By default, BOs can be mapped on multiple VMs and can also be dma-buf
+ * exported. Hence, these BOs are referred to as Shared BOs.
+ * During each execbuf3 submission, the request fence must be added to the
+ * dma-resv fence list of all shared BOs mapped on the VM.
+ *
+ * Unlike Shared BOs, VM private BOs can only be mapped on the VM they are
+ * private to and can't be dma-buf exported. All private BOs of a VM share
+ * the dma-resv object. Hence, during each execbuf3 submission, only one
+ * dma-resv fence list needs to be updated. Thus, the fast path (where the
+ * required mappings are already bound) submission latency is O(1) w.r.t. the
+ * number of VM private BOs.
+ */
+struct drm_i915_gem_create_ext_vm_private {
+	/** @base: Extension link. See struct i915_user_extension. */
+	struct i915_user_extension base;
+
+	/** @vm_id: Id of the VM to which the object is private */
+	__u32 vm_id;
+
+	/** @rsvd: Reserved, MBZ */
+	__u32 rsvd;
+};
+
 /**
  * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
  *
-- 
2.21.0.rc0.32.g243a4c7e27
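
As an illustration of the extension added above (a sketch, not part of the
patch): creating a BO that is private to an existing vm_bind-mode VM by
chaining I915_GEM_CREATE_EXT_VM_PRIVATE into gem_create_ext. It assumes a
DRM fd and a vm_id for a VM created in vm_bind mode already exist; the size
is arbitrary.

  #include <stdint.h>
  #include <xf86drm.h>
  #include <drm/i915_drm.h>  /* uapi header carrying the extension above */

  /* Create a 1MiB BO private to the given VM. */
  static int create_vm_private_bo(int fd, __u32 vm_id, __u32 *handle)
  {
          struct drm_i915_gem_create_ext_vm_private vm_priv = {
                  .base.name = I915_GEM_CREATE_EXT_VM_PRIVATE,
                  .vm_id = vm_id,
          };
          struct drm_i915_gem_create_ext create = {
                  .size = 1024 * 1024,
                  .extensions = (uintptr_t)&vm_priv,
          };
          int ret;

          ret = drmIoctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create);
          if (ret)
                  return ret;

          *handle = create.handle;
          return 0;
  }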


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 06/20] drm/i915/vm_bind: Support for VM private BOs
@ 2022-11-07  8:51   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

Each VM creates a root_obj and shares it with all of its private objects
to use it as dma_resv object. This has a performance advantage as it
requires a single dma_resv object update for all private BOs vs list of
dma_resv objects update for shared BOs, in the execbuf path.

VM private BOs can be only mapped on specified VM and cannot be dmabuf
exported. Also, they are supported only in vm_bind mode.

v2: Pad struct drm_i915_gem_create_ext_vm_private for 64bit alignment,
    add input validity checks.
v3: Create root_obj only for ppgtt.
v4: Fix releasing of obj->priv_root. Do not create vm->root_obj yet.
    Allow vm private object creation only in vm_bind mode.
    Replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode().

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_create.c    | 54 ++++++++++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |  6 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  4 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  3 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  6 +++
 .../drm/i915/gem/i915_gem_vm_bind_object.c    |  9 ++++
 drivers/gpu/drm/i915/gt/intel_gtt.c           |  1 +
 drivers/gpu/drm/i915/gt/intel_gtt.h           |  4 ++
 drivers/gpu/drm/i915/i915_vma.c               |  1 +
 drivers/gpu/drm/i915/i915_vma_types.h         |  2 +
 include/uapi/drm/i915_drm.h                   | 33 ++++++++++++
 12 files changed, 122 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 6bed0633f744..1630a52f387d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -83,6 +83,7 @@
 
 #include "i915_file_private.h"
 #include "i915_gem_context.h"
+#include "i915_gem_internal.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 5c6e396ab74d..62648341780b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -11,6 +11,7 @@
 #include "pxp/intel_pxp.h"
 
 #include "i915_drv.h"
+#include "i915_gem_context.h"
 #include "i915_gem_create.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
@@ -251,6 +252,7 @@ struct create_ext {
 	unsigned int n_placements;
 	unsigned int placement_mask;
 	unsigned long flags;
+	u32 vm_id;
 };
 
 static void repr_placements(char *buf, size_t size,
@@ -400,9 +402,32 @@ static int ext_set_protected(struct i915_user_extension __user *base, void *data
 	return 0;
 }
 
+static int ext_set_vm_private(struct i915_user_extension __user *base,
+			      void *data)
+{
+	struct drm_i915_gem_create_ext_vm_private ext;
+	struct create_ext *ext_data = data;
+
+	if (copy_from_user(&ext, base, sizeof(ext)))
+		return -EFAULT;
+
+	/* Reserved fields must be 0 */
+	if (ext.rsvd)
+		return -EINVAL;
+
+	/* vm_id 0 is reserved */
+	if (!ext.vm_id)
+		return -ENOENT;
+
+	ext_data->vm_id = ext.vm_id;
+
+	return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
 	[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
 	[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
+	[I915_GEM_CREATE_EXT_VM_PRIVATE] = ext_set_vm_private,
 };
 
 /**
@@ -418,6 +443,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
 	struct drm_i915_private *i915 = to_i915(dev);
 	struct drm_i915_gem_create_ext *args = data;
 	struct create_ext ext_data = { .i915 = i915 };
+	struct i915_address_space *vm = NULL;
 	struct drm_i915_gem_object *obj;
 	int ret;
 
@@ -431,6 +457,17 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		return ret;
 
+	if (ext_data.vm_id) {
+		vm = i915_gem_vm_lookup(file->driver_priv, ext_data.vm_id);
+		if (unlikely(!vm))
+			return -ENOENT;
+
+		if (!i915_gem_vm_is_vm_bind_mode(vm)) {
+			ret = -EINVAL;
+			goto vm_put;
+		}
+	}
+
 	if (!ext_data.n_placements) {
 		ext_data.placements[0] =
 			intel_memory_region_by_type(i915, INTEL_MEMORY_SYSTEM);
@@ -457,8 +494,21 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
 						ext_data.placements,
 						ext_data.n_placements,
 						ext_data.flags);
-	if (IS_ERR(obj))
-		return PTR_ERR(obj);
+	if (IS_ERR(obj)) {
+		ret = PTR_ERR(obj);
+		goto vm_put;
+	}
+
+	if (vm) {
+		obj->base.resv = vm->root_obj->base.resv;
+		obj->priv_root = i915_gem_object_get(vm->root_obj);
+		i915_vm_put(vm);
+	}
 
 	return i915_gem_publish(obj, file, &args->size, &args->handle);
+vm_put:
+	if (vm)
+		i915_vm_put(vm);
+
+	return ret;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index ec6f7ae47783..a1c900a0c8ef 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -219,6 +219,12 @@ struct dma_buf *i915_gem_prime_export(struct drm_gem_object *gem_obj, int flags)
 	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
 	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
 
+	if (obj->priv_root) {
+		drm_dbg(obj->base.dev,
+			"Exporting VM private objects is not allowed\n");
+		return ERR_PTR(-EINVAL);
+	}
+
 	exp_info.ops = &i915_dmabuf_ops;
 	exp_info.size = gem_obj->size;
 	exp_info.flags = flags;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index c5bc9f6e887f..43f29acfbec9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -864,6 +864,10 @@ static struct i915_vma *eb_lookup_vma(struct i915_execbuffer *eb, u32 handle)
 		if (unlikely(!obj))
 			return ERR_PTR(-ENOENT);
 
+		/* VM private objects are not supported here */
+		if (obj->priv_root)
+			return ERR_PTR(-EINVAL);
+
 		/*
 		 * If the user has opted-in for protected-object tracking, make
 		 * sure the object encryption can be used.
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 733696057761..2abef7e5af81 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -111,6 +111,9 @@ void __i915_gem_object_fini(struct drm_i915_gem_object *obj)
 	mutex_destroy(&obj->mm.get_page.lock);
 	mutex_destroy(&obj->mm.get_dma_page.lock);
 	dma_resv_fini(&obj->base._resv);
+
+	if (obj->priv_root)
+		i915_gem_object_put(obj->priv_root);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index d0d6772e6f36..80a09d55b855 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -242,6 +242,12 @@ struct drm_i915_gem_object {
 
 	const struct drm_i915_gem_object_ops *ops;
 
+	/**
+	 * @priv_root: pointer to vm->root_obj if object is private,
+	 * NULL otherwise.
+	 */
+	struct drm_i915_gem_object *priv_root;
+
 	struct {
 		/**
 		 * @vma.lock: protect the list/tree of vmas
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 6f299806bee1..19f29fa76c19 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -87,6 +87,7 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj)
 	lockdep_assert_held(&vma->vm->vm_bind_lock);
 
 	list_del_init(&vma->vm_bind_link);
+	list_del_init(&vma->non_priv_vm_bind_link);
 	i915_vm_bind_it_remove(vma, &vma->vm->va);
 
 	/* Release object */
@@ -216,6 +217,11 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
 	if (ret)
 		goto put_obj;
 
+	if (obj->priv_root && obj->priv_root != vm->root_obj) {
+		ret = -EINVAL;
+		goto put_obj;
+	}
+
 	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
 	if (ret)
 		goto put_obj;
@@ -240,6 +246,9 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
 
 		list_add_tail(&vma->vm_bind_link, &vm->vm_bound_list);
 		i915_vm_bind_it_insert(vma, &vm->va);
+		if (!obj->priv_root)
+			list_add_tail(&vma->non_priv_vm_bind_link,
+				      &vm->non_priv_vm_bind_list);
 
 		/* Hold object reference until vm_unbind */
 		i915_gem_object_get(vma->obj);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 412368c67c46..74c3557e5bc4 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -289,6 +289,7 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	INIT_LIST_HEAD(&vm->vm_bind_list);
 	INIT_LIST_HEAD(&vm->vm_bound_list);
 	mutex_init(&vm->vm_bind_lock);
+	INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
 }
 
 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 3a9bee1b9d03..3d0a452567e4 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -268,6 +268,10 @@ struct i915_address_space {
 	struct list_head vm_bound_list;
 	/** @va: tree of persistent vmas */
 	struct rb_root_cached va;
+	/** @non_priv_vm_bind_list: list of non-private object mappings */
+	struct list_head non_priv_vm_bind_list;
+	/** @root_obj: root object for dma-resv sharing by private objects */
+	struct drm_i915_gem_object *root_obj;
 
 	/* Global GTT */
 	bool is_ggtt:1;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 6a64a130dbcd..0ffa24bc0954 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -240,6 +240,7 @@ vma_create(struct drm_i915_gem_object *obj,
 	mutex_unlock(&vm->mutex);
 
 	INIT_LIST_HEAD(&vma->vm_bind_link);
+	INIT_LIST_HEAD(&vma->non_priv_vm_bind_link);
 	return vma;
 
 err_unlock:
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
index db786d2d1530..9cd055738997 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -297,6 +297,8 @@ struct i915_vma {
 
 	/** @vm_bind_link: node for the vm_bind related lists of vm */
 	struct list_head vm_bind_link;
+	/** @non_priv_vm_bind_link: Link in non-private persistent VMA list */
+	struct list_head non_priv_vm_bind_link;
 
 	/** Interval tree structures for persistent vma */
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index f06a09f1db2d..d3d709035b0d 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -3611,9 +3611,13 @@ struct drm_i915_gem_create_ext {
 	 *
 	 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
 	 * struct drm_i915_gem_create_ext_protected_content.
+	 *
+	 * For I915_GEM_CREATE_EXT_VM_PRIVATE usage see
+	 * struct drm_i915_gem_create_ext_vm_private.
 	 */
 #define I915_GEM_CREATE_EXT_MEMORY_REGIONS 0
 #define I915_GEM_CREATE_EXT_PROTECTED_CONTENT 1
+#define I915_GEM_CREATE_EXT_VM_PRIVATE 2
 	__u64 extensions;
 };
 
@@ -3731,6 +3735,35 @@ struct drm_i915_gem_create_ext_protected_content {
 /* ID of the protected content session managed by i915 when PXP is active */
 #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
 
+/**
+ * struct drm_i915_gem_create_ext_vm_private - Extension to make the object
+ * private to the specified VM.
+ *
+ * See struct drm_i915_gem_create_ext.
+ *
+ * By default, BOs can be mapped on multiple VMs and can also be dma-buf
+ * exported. Hence these BOs are referred to as Shared BOs.
+ * During each execbuf3 submission, the request fence must be added to the
+ * dma-resv fence list of all shared BOs mapped on the VM.
+ *
+ * Unlike Shared BOs, these VM private BOs can only be mapped on the VM they
+ * are private to and can't be dma-buf exported. All private BOs of a VM share
+ * the dma-resv object. Hence during each execbuf3 submission, only one
+ * dma-resv fence list needs to be updated. Thus, the fast path (where the
+ * required mappings are already bound) submission latency is O(1) w.r.t. the
+ * number of VM private BOs.
+ */
+struct drm_i915_gem_create_ext_vm_private {
+	/** @base: Extension link. See struct i915_user_extension. */
+	struct i915_user_extension base;
+
+	/** @vm_id: Id of the VM to which Object is private */
+	__u32 vm_id;
+
+	/** @rsvd: Reserved, MBZ */
+	__u32 rsvd;
+};
+
 /**
  * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
  *
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread
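
For reference, the I915_GEM_CREATE_EXT_VM_PRIVATE extension above is consumed
from userspace by chaining it into the gem_create_ext ioctl. Below is a
minimal C sketch, assuming the uapi from this series and a vm_id obtained
earlier via DRM_IOCTL_I915_GEM_VM_CREATE; the helper name and the lack of
error handling are illustrative only.

/* Hypothetical sketch: create a BO that is private to an existing VM. */
#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static uint32_t create_vm_private_bo(int fd, uint32_t vm_id, uint64_t size)
{
	struct drm_i915_gem_create_ext_vm_private vm_priv = {
		.base.name = I915_GEM_CREATE_EXT_VM_PRIVATE,
		.vm_id = vm_id,		/* from DRM_IOCTL_I915_GEM_VM_CREATE */
	};
	struct drm_i915_gem_create_ext create = {
		.size = size,
		.extensions = (uintptr_t)&vm_priv,
	};

	/* The BO can only ever be vm_bound on @vm_id and cannot be exported. */
	if (ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create))
		return 0;

	return create.handle;
}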

* [Intel-gfx] [PATCH v6 07/20] drm/i915/vm_bind: Add support to handle object evictions
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:51   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Support eviction by maintaining a list of evicted persistent vmas
for rebinding during the next submission. Ensure the list does not
include persistent vmas that are being purged.

v2: Remove unused I915_VMA_PURGED definition.
v3: Properly handle __i915_vma_unbind_async() case.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 .../drm/i915/gem/i915_gem_vm_bind_object.c    |  6 ++++
 drivers/gpu/drm/i915/gt/intel_gtt.c           |  2 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h           |  4 +++
 drivers/gpu/drm/i915/i915_vma.c               | 31 +++++++++++++++++--
 drivers/gpu/drm/i915/i915_vma.h               | 10 ++++++
 drivers/gpu/drm/i915/i915_vma_types.h         |  8 +++++
 6 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 19f29fa76c19..8cc78f954b97 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -86,6 +86,12 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj)
 {
 	lockdep_assert_held(&vma->vm->vm_bind_lock);
 
+	spin_lock(&vma->vm->vm_rebind_lock);
+	if (!list_empty(&vma->vm_rebind_link))
+		list_del_init(&vma->vm_rebind_link);
+	i915_vma_set_purged(vma);
+	spin_unlock(&vma->vm->vm_rebind_lock);
+
 	list_del_init(&vma->vm_bind_link);
 	list_del_init(&vma->non_priv_vm_bind_link);
 	i915_vm_bind_it_remove(vma, &vma->vm->va);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 74c3557e5bc4..ebf8fc3a4603 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -290,6 +290,8 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	INIT_LIST_HEAD(&vm->vm_bound_list);
 	mutex_init(&vm->vm_bind_lock);
 	INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
+	INIT_LIST_HEAD(&vm->vm_rebind_list);
+	spin_lock_init(&vm->vm_rebind_lock);
 }
 
 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 3d0a452567e4..b5a5b68adb32 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -266,6 +266,10 @@ struct i915_address_space {
 	struct list_head vm_bind_list;
 	/** @vm_bound_list: List of vm_binding completed */
 	struct list_head vm_bound_list;
+	/** @vm_rebind_list: list of vmas to be rebound */
+	struct list_head vm_rebind_list;
+	/** @vm_rebind_lock: protects vm_rebind_list */
+	spinlock_t vm_rebind_lock;
 	/** @va: tree of persistent vmas */
 	struct rb_root_cached va;
 	/** @non_priv_vm_bind_list: list of non-private object mappings */
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 0ffa24bc0954..249697ae1186 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -241,6 +241,7 @@ vma_create(struct drm_i915_gem_object *obj,
 
 	INIT_LIST_HEAD(&vma->vm_bind_link);
 	INIT_LIST_HEAD(&vma->non_priv_vm_bind_link);
+	INIT_LIST_HEAD(&vma->vm_rebind_link);
 	return vma;
 
 err_unlock:
@@ -1681,6 +1682,14 @@ static void force_unbind(struct i915_vma *vma)
 	if (!drm_mm_node_allocated(&vma->node))
 		return;
 
+	/*
+	 * Persistent vma should have been purged by now.
+	 * If not, issue a warning and purge it.
+	 */
+	if (GEM_WARN_ON(i915_vma_is_persistent(vma) &&
+			!i915_vma_is_purged(vma)))
+		i915_vma_set_purged(vma);
+
 	atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
 	WARN_ON(__i915_vma_unbind(vma));
 	GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
@@ -2042,6 +2051,16 @@ int __i915_vma_unbind(struct i915_vma *vma)
 	__i915_vma_evict(vma, false);
 
 	drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
+
+	if (i915_vma_is_persistent(vma)) {
+		spin_lock(&vma->vm->vm_rebind_lock);
+		if (list_empty(&vma->vm_rebind_link) &&
+		    !i915_vma_is_purged(vma))
+			list_add_tail(&vma->vm_rebind_link,
+				      &vma->vm->vm_rebind_list);
+		spin_unlock(&vma->vm->vm_rebind_lock);
+	}
+
 	return 0;
 }
 
@@ -2054,8 +2073,7 @@ static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma)
 	if (!drm_mm_node_allocated(&vma->node))
 		return NULL;
 
-	if (i915_vma_is_pinned(vma) ||
-	    &vma->obj->mm.rsgt->table != vma->resource->bi.pages)
+	if (i915_vma_is_pinned(vma))
 		return ERR_PTR(-EAGAIN);
 
 	/*
@@ -2077,6 +2095,15 @@ static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma)
 
 	drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
 
+	if (i915_vma_is_persistent(vma)) {
+		spin_lock(&vma->vm->vm_rebind_lock);
+		if (list_empty(&vma->vm_rebind_link) &&
+		    !i915_vma_is_purged(vma))
+			list_add_tail(&vma->vm_rebind_link,
+				      &vma->vm->vm_rebind_list);
+		spin_unlock(&vma->vm->vm_rebind_lock);
+	}
+
 	return fence;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index c5378ec2f70a..9a4a7a8dfe5b 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -152,6 +152,16 @@ static inline void i915_vma_set_persistent(struct i915_vma *vma)
 	set_bit(I915_VMA_PERSISTENT_BIT, __i915_vma_flags(vma));
 }
 
+static inline bool i915_vma_is_purged(const struct i915_vma *vma)
+{
+	return test_bit(I915_VMA_PURGED_BIT, __i915_vma_flags(vma));
+}
+
+static inline void i915_vma_set_purged(struct i915_vma *vma)
+{
+	set_bit(I915_VMA_PURGED_BIT, __i915_vma_flags(vma));
+}
+
 static inline struct i915_vma *i915_vma_get(struct i915_vma *vma)
 {
 	i915_gem_object_get(vma->obj);
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
index 9cd055738997..61d0ec1a4e18 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -267,8 +267,14 @@ struct i915_vma {
 /**
  * I915_VMA_PERSISTENT_BIT:
  * The vma is persistent (created with VM_BIND call).
+ *
+ * I915_VMA_PURGED_BIT:
+ * The persistent vma is force unbound either due to a VM_UNBIND call
+ * from the UMD or because the VM is released. Do not check/wait for VM
+ * activeness in i915_vma_is_active() and i915_vma_sync() calls.
  */
 #define I915_VMA_PERSISTENT_BIT	19
+#define I915_VMA_PURGED_BIT	20
 
 	struct i915_active active;
 
@@ -299,6 +305,8 @@ struct i915_vma {
 	struct list_head vm_bind_link;
 	/** @non_priv_vm_bind_link: Link in non-private persistent VMA list */
 	struct list_head non_priv_vm_bind_link;
+	/** @vm_rebind_link: link to vm_rebind_list and protected by vm_rebind_lock */
+	struct list_head vm_rebind_link; /* Link in vm_rebind_list */
 
 	/** Interval tree structures for persistent vma */
 
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread
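
The vm_rebind_list added in this patch is only populated here; it is drained
by the execbuf3 submission path added later in the series. As a rough
illustration of that consumer side (a sketch only, not code from this series;
the helper name and the exact locking scheme are assumptions):

/*
 * Illustrative sketch only: how a submission path could drain
 * vm_rebind_list so evicted persistent vmas get re-bound before the
 * next execbuf3. Assumes the fields added in this patch and that the
 * caller already holds vm->vm_bind_lock.
 */
static void vm_drain_rebind_list(struct i915_address_space *vm)
{
	struct i915_vma *vma, *next;

	lockdep_assert_held(&vm->vm_bind_lock);

	spin_lock(&vm->vm_rebind_lock);
	list_for_each_entry_safe(vma, next, &vm->vm_rebind_list,
				 vm_rebind_link) {
		list_del_init(&vma->vm_rebind_link);
		/* Queue the vma so the bind step picks it up again. */
		list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
	}
	spin_unlock(&vm->vm_rebind_lock);
}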

* [Intel-gfx] [PATCH v6 08/20] drm/i915/vm_bind: Support persistent vma activeness tracking
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:51   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Do not use i915_vma activeness tracking for persistent vmas.

As persistent vmas are part of the working set for each execbuf
submission on that address space (VM), a persistent vma is
active if the VM is active. As vm->root_obj->base.resv will be
updated for each submission on that VM, it correctly
represents whether the VM is active or not.

Add i915_vm_is_active() and i915_vm_sync() functions based
on vm->root_obj->base.resv with DMA_RESV_USAGE_BOOKKEEP
usage. dma-resv fence list will be updated with this usage
during each submission with this VM in the new execbuf3
ioctl path.

Update i915_vma_is_active(), i915_vma_sync() and the
__i915_vma_unbind_async() functions to properly handle
persistent vmas.

v2: Ensure lvalue of dma_resv_wait_timeout() call is long.

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 39 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h |  3 +++
 drivers/gpu/drm/i915/i915_vma.c     | 28 +++++++++++++++++++++
 drivers/gpu/drm/i915/i915_vma.h     | 25 +++++++++---------
 4 files changed, 83 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7bd1861ddbdf..1d8506548d4a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -25,6 +25,45 @@
 #include "i915_trace.h"
 #include "i915_vgpu.h"
 
+/**
+ * i915_vm_sync() - Wait until address space is not in use
+ * @vm: address space
+ *
+ * Waits until all requests using the address space are complete.
+ *
+ * Returns: 0 if success, -ve err code upon failure
+ */
+int i915_vm_sync(struct i915_address_space *vm)
+{
+	long ret;
+
+	/* Wait for all requests under this vm to finish */
+	ret = dma_resv_wait_timeout(vm->root_obj->base.resv,
+				    DMA_RESV_USAGE_BOOKKEEP, false,
+				    MAX_SCHEDULE_TIMEOUT);
+	if (ret < 0)
+		return ret;
+	else if (ret > 0)
+		return 0;
+	else
+		return -ETIMEDOUT;
+}
+
+/**
+ * i915_vm_is_active() - Check if address space is being used
+ * @vm: address space
+ *
+ * Check if any request using the specified address space is
+ * active.
+ *
+ * Returns: true if address space is active, false otherwise.
+ */
+bool i915_vm_is_active(const struct i915_address_space *vm)
+{
+	return !dma_resv_test_signaled(vm->root_obj->base.resv,
+				       DMA_RESV_USAGE_BOOKKEEP);
+}
+
 int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
 			       struct sg_table *pages)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 8c2f57eb5dda..a5bbdc59d9df 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -51,4 +51,7 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 
 #define PIN_OFFSET_MASK		I915_GTT_PAGE_MASK
 
+int i915_vm_sync(struct i915_address_space *vm);
+bool i915_vm_is_active(const struct i915_address_space *vm);
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 249697ae1186..04abdb92c2b2 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -420,6 +420,24 @@ int i915_vma_wait_for_bind(struct i915_vma *vma)
 	return err;
 }
 
+/**
+ * i915_vma_sync() - Wait for the vma to be idle
+ * @vma: vma to be tested
+ *
+ * Returns 0 on success and error code on failure
+ */
+int i915_vma_sync(struct i915_vma *vma)
+{
+	int ret;
+
+	/* Wait for the asynchronous bindings and pending GPU reads */
+	ret = i915_active_wait(&vma->active);
+	if (ret || !i915_vma_is_persistent(vma) || i915_vma_is_purged(vma))
+		return ret;
+
+	return i915_vm_sync(vma->vm);
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
 static int i915_vma_verify_bind_complete(struct i915_vma *vma)
 {
@@ -1882,6 +1900,8 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
 	int err;
 
 	assert_object_held(obj);
+	if (i915_vma_is_persistent(vma))
+		return -EINVAL;
 
 	GEM_BUG_ON(!vma->pages);
 
@@ -2091,6 +2111,14 @@ static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma)
 		return ERR_PTR(-EBUSY);
 	}
 
+	if (i915_vma_is_persistent(vma) &&
+	    __i915_sw_fence_await_reservation(&vma->resource->chain,
+					      vma->vm->root_obj->base.resv,
+					      DMA_RESV_USAGE_BOOKKEEP,
+					      i915_fence_timeout(vma->vm->i915),
+					      GFP_NOWAIT | __GFP_NOWARN) < 0)
+		return ERR_PTR(-EBUSY);
+
 	fence = __i915_vma_evict(vma, true);
 
 	drm_mm_remove_node(&vma->node); /* pairs with i915_vma_release() */
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 9a4a7a8dfe5b..1cadbf8fdedf 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -51,12 +51,6 @@ i915_vma_create_persistent(struct drm_i915_gem_object *obj,
 
 void i915_vma_unpin_and_release(struct i915_vma **p_vma, unsigned int flags);
 #define I915_VMA_RELEASE_MAP BIT(0)
-
-static inline bool i915_vma_is_active(const struct i915_vma *vma)
-{
-	return !i915_active_is_idle(&vma->active);
-}
-
 /* do not reserve memory to prevent deadlocks */
 #define __EXEC_OBJECT_NO_RESERVE BIT(31)
 
@@ -162,6 +156,18 @@ static inline void i915_vma_set_purged(struct i915_vma *vma)
 	set_bit(I915_VMA_PURGED_BIT, __i915_vma_flags(vma));
 }
 
+static inline bool i915_vma_is_active(const struct i915_vma *vma)
+{
+	if (i915_vma_is_persistent(vma)) {
+		if (i915_vma_is_purged(vma))
+			return false;
+
+		return i915_vm_is_active(vma->vm);
+	}
+
+	return !i915_active_is_idle(&vma->active);
+}
+
 static inline struct i915_vma *i915_vma_get(struct i915_vma *vma)
 {
 	i915_gem_object_get(vma->obj);
@@ -433,12 +439,7 @@ void i915_vma_make_shrinkable(struct i915_vma *vma);
 void i915_vma_make_purgeable(struct i915_vma *vma);
 
 int i915_vma_wait_for_bind(struct i915_vma *vma);
-
-static inline int i915_vma_sync(struct i915_vma *vma)
-{
-	/* Wait for the asynchronous bindings and pending GPU reads */
-	return i915_active_wait(&vma->active);
-}
+int i915_vma_sync(struct i915_vma *vma);
 
 /**
  * i915_vma_get_current_resource - Get the current resource of the vma
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread
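
The i915_vm_is_active()/i915_vm_sync() helpers above only observe activity
once the submission path attaches each request fence to
vm->root_obj->base.resv with DMA_RESV_USAGE_BOOKKEEP, which the execbuf3
patches later in the series wire up. A minimal sketch of that step using the
generic dma-resv API (the helper itself is hypothetical):

/*
 * Minimal sketch (hypothetical helper): publish a request's fence on
 * the VM's shared dma-resv so i915_vm_is_active() and i915_vm_sync()
 * can observe it. The dma-resv calls are the standard kernel API; the
 * caller must hold the reservation lock of vm->root_obj.
 */
static int vm_publish_request_fence(struct i915_address_space *vm,
				    struct dma_fence *fence)
{
	struct dma_resv *resv = vm->root_obj->base.resv;
	int err;

	err = dma_resv_reserve_fences(resv, 1);
	if (err)
		return err;

	dma_resv_add_fence(resv, fence, DMA_RESV_USAGE_BOOKKEEP);
	return 0;
}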

* [Intel-gfx] [PATCH v6 09/20] drm/i915/vm_bind: Add out fence support
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:51   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:51 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Add support for handling an out fence for the vm_bind call.

v2: Reset vma->vm_bind_fence.syncobj to NULL at the end
    of vm_bind call.
v3: Remove vm_unbind out fence uapi which is not supported yet.
v4: Return error if I915_TIMELINE_FENCE_WAIT fence flag is set.
    Wait for bind to complete iff I915_TIMELINE_FENCE_SIGNAL is
    not specified.
v5: Ensure __I915_TIMELINE_FENCE_UNKNOWN_FLAGS are not set.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  4 +
 .../drm/i915/gem/i915_gem_vm_bind_object.c    | 95 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_vma.c               |  7 +-
 drivers/gpu/drm/i915/i915_vma_types.h         |  7 ++
 include/uapi/drm/i915_drm.h                   | 49 +++++++++-
 5 files changed, 159 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
index 36262a6357b5..b70e900e35ab 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
@@ -8,6 +8,7 @@
 
 #include <linux/types.h>
 
+struct dma_fence;
 struct drm_device;
 struct drm_file;
 struct i915_address_space;
@@ -23,4 +24,7 @@ int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
 
 void i915_gem_vm_unbind_all(struct i915_address_space *vm);
 
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+			       struct dma_fence * const fence);
+
 #endif /* __I915_GEM_VM_BIND_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 8cc78f954b97..6396fd1dc520 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -7,6 +7,8 @@
 
 #include <linux/interval_tree_generic.h>
 
+#include <drm/drm_syncobj.h>
+
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_vm_bind.h"
 
@@ -101,6 +103,77 @@ static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj)
 		i915_gem_object_put(vma->obj);
 }
 
+static int i915_vm_bind_add_fence(struct drm_file *file, struct i915_vma *vma,
+				  u32 handle, u64 point)
+{
+	struct drm_syncobj *syncobj;
+
+	syncobj = drm_syncobj_find(file, handle);
+	if (!syncobj) {
+		drm_dbg(&vma->vm->i915->drm,
+			"Invalid syncobj handle provided\n");
+		return -ENOENT;
+	}
+
+	/*
+	 * For timeline syncobjs we need to preallocate chains for
+	 * later signaling.
+	 */
+	if (point) {
+		vma->vm_bind_fence.chain_fence = dma_fence_chain_alloc();
+		if (!vma->vm_bind_fence.chain_fence) {
+			drm_syncobj_put(syncobj);
+			return -ENOMEM;
+		}
+	} else {
+		vma->vm_bind_fence.chain_fence = NULL;
+	}
+	vma->vm_bind_fence.syncobj = syncobj;
+	vma->vm_bind_fence.value = point;
+
+	return 0;
+}
+
+static void i915_vm_bind_put_fence(struct i915_vma *vma)
+{
+	if (!vma->vm_bind_fence.syncobj)
+		return;
+
+	drm_syncobj_put(vma->vm_bind_fence.syncobj);
+	dma_fence_chain_free(vma->vm_bind_fence.chain_fence);
+	vma->vm_bind_fence.syncobj = NULL;
+}
+
+/**
+ * i915_vm_bind_signal_fence() - Add fence to vm_bind syncobj
+ * @vma: vma mapping requiring signaling
+ * @fence: fence to be added
+ *
+ * Associate specified @fence with the @vma's syncobj to be
+ * signaled after the @fence work completes.
+ */
+void i915_vm_bind_signal_fence(struct i915_vma *vma,
+			       struct dma_fence * const fence)
+{
+	struct drm_syncobj *syncobj = vma->vm_bind_fence.syncobj;
+
+	if (!syncobj)
+		return;
+
+	if (vma->vm_bind_fence.chain_fence) {
+		drm_syncobj_add_point(syncobj,
+				      vma->vm_bind_fence.chain_fence,
+				      fence, vma->vm_bind_fence.value);
+		/*
+		 * The chain's ownership is transferred to the
+		 * timeline.
+		 */
+		vma->vm_bind_fence.chain_fence = NULL;
+	} else {
+		drm_syncobj_replace_fence(syncobj, fence);
+	}
+}
+
 static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
 				  struct drm_i915_gem_vm_unbind *va)
 {
@@ -206,6 +279,11 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
 	if (!va->length || !IS_ALIGNED(va->start, I915_GTT_PAGE_SIZE))
 		ret = -EINVAL;
 
+	/* In fences are not supported */
+	if ((va->fence.flags & I915_TIMELINE_FENCE_WAIT) ||
+	    (va->fence.flags & __I915_TIMELINE_FENCE_UNKNOWN_FLAGS))
+		ret = -EINVAL;
+
 	obj = i915_gem_object_lookup(file, va->handle);
 	if (!obj)
 		return -ENOENT;
@@ -238,6 +316,13 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
 		goto unlock_vm;
 	}
 
+	if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL) {
+		ret = i915_vm_bind_add_fence(file, vma, va->fence.handle,
+					     va->fence.value);
+		if (ret)
+			goto put_vma;
+	}
+
 	pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER |
 		    PIN_VALIDATE | PIN_NOEVICT;
 
@@ -250,6 +335,13 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
 		if (ret)
 			continue;
 
+		/* If out fence is not requested, wait for bind to complete */
+		if (!(va->fence.flags & I915_TIMELINE_FENCE_SIGNAL)) {
+			ret = i915_vma_wait_for_bind(vma);
+			if (ret)
+				continue;
+		}
+
 		list_add_tail(&vma->vm_bind_link, &vm->vm_bound_list);
 		i915_vm_bind_it_insert(vma, &vm->va);
 		if (!obj->priv_root)
@@ -260,6 +352,9 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
 		i915_gem_object_get(vma->obj);
 	}
 
+	if (va->fence.flags & I915_TIMELINE_FENCE_SIGNAL)
+		i915_vm_bind_put_fence(vma);
+put_vma:
 	if (ret)
 		i915_vma_destroy(vma);
 unlock_vm:
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 04abdb92c2b2..eaa13e9ba966 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -29,6 +29,7 @@
 #include "display/intel_frontbuffer.h"
 #include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_tiling.h"
+#include "gem/i915_gem_vm_bind.h"
 #include "gt/intel_engine.h"
 #include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_gt.h"
@@ -1567,8 +1568,12 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
 err_vma_res:
 	i915_vma_resource_free(vma_res);
 err_fence:
-	if (work)
+	if (work) {
+		if (i915_vma_is_persistent(vma))
+			i915_vm_bind_signal_fence(vma, &work->base.dma);
+
 		dma_fence_work_commit_imm(&work->base);
+	}
 err_rpm:
 	if (wakeref)
 		intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref);
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
index 61d0ec1a4e18..7c8c293ddfcb 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -308,6 +308,13 @@ struct i915_vma {
 	/** @vm_rebind_link: link to vm_rebind_list and protected by vm_rebind_lock */
 	struct list_head vm_rebind_link; /* Link in vm_rebind_list */
 
+	/** Timeline fence for vm_bind completion notification */
+	struct {
+		struct dma_fence_chain *chain_fence;
+		struct drm_syncobj *syncobj;
+		u64 value;
+	} vm_bind_fence;
+
 	/** Interval tree structures for persistent vma */
 
 	/** @rb: node for the interval tree of vm for persistent vmas */
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index d3d709035b0d..9ac913606d40 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1533,6 +1533,41 @@ struct drm_i915_gem_execbuffer2 {
 #define i915_execbuffer2_get_context_id(eb2) \
 	((eb2).rsvd1 & I915_EXEC_CONTEXT_ID_MASK)
 
+/**
+ * struct drm_i915_gem_timeline_fence - An input or output timeline fence.
+ *
+ * The operation will wait for the input fence to signal.
+ *
+ * The returned output fence will be signaled after the completion of the
+ * operation.
+ */
+struct drm_i915_gem_timeline_fence {
+	/** @handle: User's handle for a drm_syncobj to wait on or signal. */
+	__u32 handle;
+
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_TIMELINE_FENCE_WAIT:
+	 * Wait for the input fence before the operation.
+	 *
+	 * I915_TIMELINE_FENCE_SIGNAL:
+	 * Return operation completion fence as output.
+	 */
+	__u32 flags;
+#define I915_TIMELINE_FENCE_WAIT            (1 << 0)
+#define I915_TIMELINE_FENCE_SIGNAL          (1 << 1)
+#define __I915_TIMELINE_FENCE_UNKNOWN_FLAGS (-(I915_TIMELINE_FENCE_SIGNAL << 1))
+
+	/**
+	 * @value: A point in the timeline.
+	 * Value must be 0 for a binary drm_syncobj. A value of 0 for a
+	 * timeline drm_syncobj is invalid as it turns a drm_syncobj into a
+	 * binary one.
+	 */
+	__u64 value;
+};
+
 struct drm_i915_gem_pin {
 	/** Handle of the buffer to be pinned. */
 	__u32 handle;
@@ -3807,8 +3842,18 @@ struct drm_i915_gem_vm_bind {
 	/** @rsvd: Reserved, MBZ */
 	__u64 rsvd[3];
 
-	/** @rsvd2: Reserved for timeline fence */
-	__u64 rsvd2[2];
+	/**
+	 * @fence: Timeline fence for bind completion signaling.
+	 *
+	 * The timeline fence has the format of struct drm_i915_gem_timeline_fence.
+	 *
+	 * It is an out fence, hence using the I915_TIMELINE_FENCE_WAIT flag
+	 * is invalid and an error will be returned.
+	 *
+	 * If the I915_TIMELINE_FENCE_SIGNAL flag is not set, no out fence
+	 * is requested and the binding is completed synchronously.
+	 */
+	struct drm_i915_gem_timeline_fence fence;
 
 	/**
 	 * @extensions: Zero-terminated chain of extensions.
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread
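
From userspace, the out fence is an ordinary drm_syncobj. A hypothetical C
sketch of an asynchronous bind followed by a wait on the out fence, assuming
the uapi proposed in this series (DRM_IOCTL_I915_GEM_VM_BIND and the struct
layout shown above) plus libdrm's syncobj helpers; the vm_id, handle and
addresses are placeholders:

/* Hypothetical userspace sketch: asynchronous vm_bind with an out fence. */
#include <stdint.h>
#include <sys/ioctl.h>
#include <xf86drm.h>
#include <drm/i915_drm.h>

static int vm_bind_async(int fd, uint32_t vm_id, uint32_t bo_handle,
			 uint64_t va, uint64_t size)
{
	uint32_t syncobj;
	int ret;

	ret = drmSyncobjCreate(fd, 0, &syncobj);
	if (ret)
		return ret;

	struct drm_i915_gem_vm_bind bind = {
		.vm_id = vm_id,
		.handle = bo_handle,
		.start = va,
		.offset = 0,
		.length = size,
		.fence = {
			.handle = syncobj,
			.flags = I915_TIMELINE_FENCE_SIGNAL,
			.value = 0,	/* binary syncobj */
		},
	};

	ret = ioctl(fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind);
	if (!ret) {
		/* Block until the bind work has actually completed. */
		ret = drmSyncobjWait(fd, &syncobj, 1, INT64_MAX,
				     DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL, NULL);
	}

	drmSyncobjDestroy(fd, syncobj);
	return ret;
}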

* [PATCH v6 10/20] drm/i915/vm_bind: Abstract out common execbuf functions
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

The new execbuf3 ioctl path and the legacy execbuf ioctl
path have a lot of functionality in common.
Abstract the common execbuf functionality out into a
separate file where possible, allowing the code to be shared.

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 .../drm/i915/gem/i915_gem_execbuffer_common.c | 666 ++++++++++++++++++
 .../drm/i915/gem/i915_gem_execbuffer_common.h |  74 ++
 3 files changed, 741 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index b731f3ac80da..35636c6bf856 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -148,6 +148,7 @@ gem-y += \
 	gem/i915_gem_create.o \
 	gem/i915_gem_dmabuf.o \
 	gem/i915_gem_domain.o \
+	gem/i915_gem_execbuffer_common.o \
 	gem/i915_gem_execbuffer.o \
 	gem/i915_gem_internal.o \
 	gem/i915_gem_object.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
new file mode 100644
index 000000000000..4d1c9ce154b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
@@ -0,0 +1,666 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include <linux/dma-fence-array.h>
+
+#include <drm/drm_syncobj.h>
+
+#include "gt/intel_context.h"
+#include "gt/intel_gt.h"
+#include "gt/intel_gt_pm.h"
+#include "gt/intel_ring.h"
+
+#include "i915_gem_execbuffer_common.h"
+
+#define __EXEC_COMMON_FENCE_WAIT	BIT(0)
+#define __EXEC_COMMON_FENCE_SIGNAL	BIT(1)
+
+static struct i915_request *eb_throttle(struct intel_context *ce)
+{
+	struct intel_ring *ring = ce->ring;
+	struct intel_timeline *tl = ce->timeline;
+	struct i915_request *rq;
+
+	/*
+	 * Completely unscientific finger-in-the-air estimates for suitable
+	 * maximum user request size (to avoid blocking) and then backoff.
+	 */
+	if (intel_ring_update_space(ring) >= PAGE_SIZE)
+		return NULL;
+
+	/*
+	 * Find a request that after waiting upon, there will be at least half
+	 * the ring available. The hysteresis allows us to compete for the
+	 * shared ring and should mean that we sleep less often prior to
+	 * claiming our resources, but not so long that the ring completely
+	 * drains before we can submit our next request.
+	 */
+	list_for_each_entry(rq, &tl->requests, link) {
+		if (rq->ring != ring)
+			continue;
+
+		if (__intel_ring_space(rq->postfix,
+				       ring->emit, ring->size) > ring->size / 2)
+			break;
+	}
+	if (&rq->link == &tl->requests)
+		return NULL; /* weird, we will check again later for real */
+
+	return i915_request_get(rq);
+}
+
+static int eb_pin_timeline(struct intel_context *ce, bool throttle,
+			   bool nonblock)
+{
+	struct intel_timeline *tl;
+	struct i915_request *rq = NULL;
+
+	/*
+	 * Take a local wakeref for preparing to dispatch the execbuf as
+	 * we expect to access the hardware fairly frequently in the
+	 * process, and require the engine to be kept awake between accesses.
+	 * Upon dispatch, we acquire another prolonged wakeref that we hold
+	 * until the timeline is idle, which in turn releases the wakeref
+	 * taken on the engine, and the parent device.
+	 */
+	tl = intel_context_timeline_lock(ce);
+	if (IS_ERR(tl))
+		return PTR_ERR(tl);
+
+	intel_context_enter(ce);
+	if (throttle)
+		rq = eb_throttle(ce);
+	intel_context_timeline_unlock(tl);
+
+	if (rq) {
+		long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT;
+
+		if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE,
+				      timeout) < 0) {
+			i915_request_put(rq);
+
+			/*
+			 * Error path, cannot use intel_context_timeline_lock as
+			 * that is user interruptable and this clean up step
+			 * must be done.
+			 */
+			mutex_lock(&ce->timeline->mutex);
+			intel_context_exit(ce);
+			mutex_unlock(&ce->timeline->mutex);
+
+			if (nonblock)
+				return -EWOULDBLOCK;
+			else
+				return -EINTR;
+		}
+		i915_request_put(rq);
+	}
+
+	return 0;
+}
+
+/**
+ * i915_eb_pin_engine() - Pin the engine
+ * @ce: the context
+ * @ww: optional locking context or NULL
+ * @throttle: throttle to ensure enough ring space
+ * @nonblock: do not block during throttle
+ *
+ * Pin the @ce timeline. If @throttle is set, enable throttling to ensure
+ * enough ring space is available either by waiting for requests to complete
+ * (if @nonblock is not set) or by returning error -EWOULDBLOCK (if @nonblock
+ * is set).
+ *
+ * Returns 0 upon success, -ve error code upon error.
+ */
+int i915_eb_pin_engine(struct intel_context *ce, struct i915_gem_ww_ctx *ww,
+		       bool throttle, bool nonblock)
+{
+	struct intel_context *child;
+	int err;
+	int i = 0, j = 0;
+
+	if (unlikely(intel_context_is_banned(ce)))
+		return -EIO;
+
+	/*
+	 * Pinning the contexts may generate requests in order to acquire
+	 * GGTT space, so do this first before we reserve a seqno for
+	 * ourselves.
+	 */
+	err = intel_context_pin_ww(ce, ww);
+	if (err)
+		return err;
+
+	for_each_child(ce, child) {
+		err = intel_context_pin_ww(child, ww);
+		GEM_BUG_ON(err);	/* perma-pinned should incr a counter */
+	}
+
+	for_each_child(ce, child) {
+		err = eb_pin_timeline(child, throttle, nonblock);
+		if (err)
+			goto unwind;
+		++i;
+	}
+	err = eb_pin_timeline(ce, throttle, nonblock);
+	if (err)
+		goto unwind;
+
+	return 0;
+
+unwind:
+	for_each_child(ce, child) {
+		if (j++ < i) {
+			mutex_lock(&child->timeline->mutex);
+			intel_context_exit(child);
+			mutex_unlock(&child->timeline->mutex);
+		}
+	}
+	for_each_child(ce, child)
+		intel_context_unpin(child);
+	intel_context_unpin(ce);
+	return err;
+}
+
+/**
+ * i915_eb_unpin_engine() - Unpin the engine
+ * @ce: the context
+ *
+ * Unpin the @ce timeline.
+ */
+void i915_eb_unpin_engine(struct intel_context *ce)
+{
+	struct intel_context *child;
+
+	for_each_child(ce, child) {
+		mutex_lock(&child->timeline->mutex);
+		intel_context_exit(child);
+		mutex_unlock(&child->timeline->mutex);
+
+		intel_context_unpin(child);
+	}
+
+	mutex_lock(&ce->timeline->mutex);
+	intel_context_exit(ce);
+	mutex_unlock(&ce->timeline->mutex);
+
+	intel_context_unpin(ce);
+}
+
+/**
+ * i915_eb_find_context() - Find the context
+ * @context: the context
+ * @context_number: required context index
+ *
+ * Returns the @context_number'th child of specified @context,
+ * or NULL if the child context is not found.
+ * If @context_number is 0, return the specified @context.
+ */
+struct intel_context *
+i915_eb_find_context(struct intel_context *context, unsigned int context_number)
+{
+	struct intel_context *child;
+
+	if (likely(context_number == 0))
+		return context;
+
+	for_each_child(context, child)
+		if (!--context_number)
+			return child;
+
+	GEM_BUG_ON("Context not found");
+
+	return NULL;
+}
+
+static void __free_fence_array(struct eb_fence *fences, u64 n)
+{
+	while (n--) {
+		drm_syncobj_put(ptr_mask_bits(fences[n].syncobj, 2));
+		dma_fence_put(fences[n].dma_fence);
+		dma_fence_chain_free(fences[n].chain_fence);
+	}
+	kvfree(fences);
+}
+
+/**
+ * i915_eb_put_fence_array() - Free Execbuffer fence array
+ * @fences: Pointer to array of Execbuffer fences (see struct eb_fence)
+ * @num_fences: Number of fences in @fences array
+ *
+ * Free the Execbuffer fences in @fences array.
+ */
+void i915_eb_put_fence_array(struct eb_fence *fences, u64 num_fences)
+{
+	if (fences)
+		__free_fence_array(fences, num_fences);
+}
+
+/**
+ * i915_eb_add_timeline_fence() - Add a fence to the specified Execbuffer fence
+ * array.
+ * @file: drm file pointer
+ * @handle: drm_syncobj handle
+ * @point: point in the timeline
+ * @f: Execbuffer fence
+ * @wait: wait for the specified fence
+ * @signal: signal the specified fence
+ *
+ * Add the fence specified by drm_syncobj @handle at specified @point in the
+ * timeline to the Execbuffer fence array @f. If @wait is specified, it is an
+ * input fence and if @signal is specified it is an output fence.
+ *
+ * Returns 0 if the timeline fence to be added is already signaled (@f is not
+ * updated). Returns 1 upon successfully adding the timeline fence (@f is
+ * updated) and -ve error code upon failure.
+ */
+int i915_eb_add_timeline_fence(struct drm_file *file, u32 handle, u64 point,
+			       struct eb_fence *f, bool wait, bool signal)
+{
+	struct drm_syncobj *syncobj;
+	struct dma_fence *fence = NULL;
+	u32 flags = 0;
+	int err = 0;
+
+	syncobj = drm_syncobj_find(file, handle);
+	if (!syncobj) {
+		DRM_DEBUG("Invalid syncobj handle provided\n");
+		return -ENOENT;
+	}
+
+	fence = drm_syncobj_fence_get(syncobj);
+
+	if (!fence && wait && !signal) {
+		DRM_DEBUG("Syncobj handle has no fence\n");
+		drm_syncobj_put(syncobj);
+		return -EINVAL;
+	}
+
+	if (fence)
+		err = dma_fence_chain_find_seqno(&fence, point);
+
+	if (err && !signal) {
+		DRM_DEBUG("Syncobj handle missing requested point %llu\n", point);
+		dma_fence_put(fence);
+		drm_syncobj_put(syncobj);
+		return err;
+	}
+
+	/*
+	 * A point might have been signaled already and
+	 * garbage collected from the timeline. In this case
+	 * just ignore the point and carry on.
+	 */
+	if (!fence && !signal) {
+		drm_syncobj_put(syncobj);
+		return 0;
+	}
+
+	/*
+	 * For timeline syncobjs we need to preallocate chains for
+	 * later signaling.
+	 */
+	if (point != 0 && signal) {
+		/*
+		 * Waiting and signaling the same point (when point !=
+		 * 0) would break the timeline.
+		 */
+		if (wait) {
+			DRM_DEBUG("Trying to wait & signal the same timeline point.\n");
+			dma_fence_put(fence);
+			drm_syncobj_put(syncobj);
+			return -EINVAL;
+		}
+
+		f->chain_fence = dma_fence_chain_alloc();
+		if (!f->chain_fence) {
+			drm_syncobj_put(syncobj);
+			dma_fence_put(fence);
+			return -ENOMEM;
+		}
+	} else {
+		f->chain_fence = NULL;
+	}
+
+	flags |= wait ? __EXEC_COMMON_FENCE_WAIT : 0;
+	flags |= signal ? __EXEC_COMMON_FENCE_SIGNAL : 0;
+
+	f->syncobj = ptr_pack_bits(syncobj, flags, 2);
+	f->dma_fence = fence;
+	f->value = point;
+	return 1;
+}
+
+/**
+ * i915_eb_await_fence_array() - Set up a request to asynchronously
+ * wait for fences in the specified Execbuffer fence array.
+ * @fences: pointer to Execbuffer fence array
+ * @num_fences: number of fences in @fences array
+ * @rq: the i915_request that should wait for fences in @fences array
+ *
+ * Set up the request @rq to asynchronously wait for fences specified in
+ * @fences array to signal before starting execution.
+ *
+ * Returns 0 upon success, -ve error upon failure.
+ */
+int i915_eb_await_fence_array(struct eb_fence *fences, u64 num_fences,
+			      struct i915_request *rq)
+{
+	unsigned int n;
+
+	for (n = 0; n < num_fences; n++) {
+		int err;
+
+		if (!fences[n].dma_fence)
+			continue;
+
+		err = i915_request_await_dma_fence(rq, fences[n].dma_fence);
+		if (err < 0)
+			return err;
+	}
+
+	return 0;
+}
+
+/**
+ * i915_eb_signal_fence_array() - Attach a dma-fence to all out fences of
+ * Execbuffer fence array.
+ * @fences: pointer to Execbuffer fence array
+ * @num_fences: number of fences in @fences array
+ * @fence: the dma-fence to attach to all out fences in @fences array
+ *
+ * Attach the specified @fence to all out fences of the Execbuffer fence array
+ * @fences, at their respective timeline points. Thus, the out fences get
+ * signaled when the specified @fence gets signaled.
+ */
+void i915_eb_signal_fence_array(struct eb_fence *fences, u64 num_fences,
+				struct dma_fence * const fence)
+{
+	unsigned int n;
+
+	for (n = 0; n < num_fences; n++) {
+		struct drm_syncobj *syncobj;
+		unsigned int flags;
+
+		syncobj = ptr_unpack_bits(fences[n].syncobj, &flags, 2);
+		if (!(flags & __EXEC_COMMON_FENCE_SIGNAL))
+			continue;
+
+		if (fences[n].chain_fence) {
+			drm_syncobj_add_point(syncobj,
+					      fences[n].chain_fence,
+					      fence,
+					      fences[n].value);
+			/*
+			 * The chain's ownership is transferred to the
+			 * timeline.
+			 */
+			fences[n].chain_fence = NULL;
+		} else {
+			drm_syncobj_replace_fence(syncobj, fence);
+		}
+	}
+}
+
+/*
+ * Two helper loops control the order in which requests / batches are created
+ * and added to the backend. Requests are created in order from the parent to
+ * the last child. Requests are added in the reverse order, from the last child
+ * to parent. This is done for locking reasons as the timeline lock is acquired
+ * during request creation and released when the request is added to the
+ * backend. To make lockdep happy (see intel_context_timeline_lock) this must be
+ * the ordering.
+ */
+#define for_each_batch_create_order(_num_batches) \
+	for (unsigned int i = 0; i < (_num_batches); ++i)
+#define for_each_batch_add_order(_num_batches) \
+	for (int i = (_num_batches) - 1; i >= 0; --i)
+
+static void retire_requests(struct intel_timeline *tl, struct i915_request *end)
+{
+	struct i915_request *rq, *rn;
+
+	list_for_each_entry_safe(rq, rn, &tl->requests, link)
+		if (rq == end || !i915_request_retire(rq))
+			break;
+}
+
+static int eb_request_add(struct intel_context *context,
+			  struct i915_request *rq,
+			  struct i915_sched_attr sched,
+			  int err, bool last_parallel)
+{
+	struct intel_timeline * const tl = i915_request_timeline(rq);
+	struct i915_sched_attr attr = {};
+	struct i915_request *prev;
+
+	lockdep_assert_held(&tl->mutex);
+	lockdep_unpin_lock(&tl->mutex, rq->cookie);
+
+	trace_i915_request_add(rq);
+
+	prev = __i915_request_commit(rq);
+
+	/* Check that the context wasn't destroyed before submission */
+	if (likely(!intel_context_is_closed(context))) {
+		attr = sched;
+	} else {
+		/* Serialise with context_close via the add_to_timeline */
+		i915_request_set_error_once(rq, -ENOENT);
+		__i915_request_skip(rq);
+		err = -ENOENT; /* override any transient errors */
+	}
+
+	if (intel_context_is_parallel(context)) {
+		if (err) {
+			__i915_request_skip(rq);
+			set_bit(I915_FENCE_FLAG_SKIP_PARALLEL,
+				&rq->fence.flags);
+		}
+		if (last_parallel)
+			set_bit(I915_FENCE_FLAG_SUBMIT_PARALLEL,
+				&rq->fence.flags);
+	}
+
+	__i915_request_queue(rq, &attr);
+
+	/* Try to clean up the client's timeline after submitting the request */
+	if (prev)
+		retire_requests(tl, prev);
+
+	mutex_unlock(&tl->mutex);
+
+	return err;
+}
+
+/**
+ * i915_eb_requests_add() - Handle request queuing
+ * @requests: pointer to an array of request pointers
+ * @num_requests: size of @requests array
+ * @context: the context
+ * @sched: schedule attribute
+ * @err: error code accumulated so far by the caller (0 if none)
+ *
+ * Add requests to the timeline queue.
+ *
+ * Returns 0 upon success, error code upon failure.
+ */
+int i915_eb_requests_add(struct i915_request **requests,
+			 unsigned int num_requests,
+			 struct intel_context *context,
+			 struct i915_sched_attr sched,
+			 int err)
+{
+	/*
+	 * We iterate in reverse order of creation to release timeline mutexes
+	 * in same order.
+	 */
+	for_each_batch_add_order(num_requests) {
+		struct i915_request *rq = requests[i];
+
+		if (!rq)
+			continue;
+
+		err |= eb_request_add(context, rq, sched, err, i == 0);
+	}
+
+	return err;
+}
+
+/**
+ * i915_eb_requests_get() - Get reference of requests
+ * @requests: pointer to an array of request pointers
+ * @num_requests: size of @requests array
+ *
+ * Get a reference on each request in the @requests array.
+ */
+void i915_eb_requests_get(struct i915_request **requests,
+			  unsigned int num_requests)
+{
+	for_each_batch_create_order(num_requests) {
+		if (!requests[i])
+			break;
+
+		i915_request_get(requests[i]);
+	}
+}
+
+/**
+ * i915_eb_requests_put() - Release reference of requests
+ * @requests: pointer to an array of request pointers
+ * @num_requests: size of @requests array
+ *
+ * Release the reference on each request in the @requests array.
+ */
+void i915_eb_requests_put(struct i915_request **requests,
+			  unsigned int num_requests)
+{
+	for_each_batch_create_order(num_requests) {
+		if (!requests[i])
+			break;
+
+		i915_request_put(requests[i]);
+	}
+}
+
+/**
+ * i915_eb_composite_fence_create() - Create a composite fence for an array of
+ * requests on a specified context.
+ * @requests: pointer to an array of request pointers
+ * @num_requests: size of @requests array
+ * @context: the context
+ *
+ * Create and return the base of a dma_fence_array containing the fences of
+ * all requests in the @requests array, using the fence context specified by
+ * @context.
+ *
+ * Returns fence array base upon success, an error pointer upon failure.
+ */
+struct dma_fence *i915_eb_composite_fence_create(struct i915_request **requests,
+						 unsigned int num_requests,
+						 struct intel_context *context)
+{
+	struct dma_fence_array *fence_array;
+	struct dma_fence **fences;
+
+	GEM_BUG_ON(!intel_context_is_parent(context));
+
+	fences = kmalloc_array(num_requests, sizeof(*fences), GFP_KERNEL);
+	if (!fences)
+		return ERR_PTR(-ENOMEM);
+
+	for_each_batch_create_order(num_requests) {
+		fences[i] = &requests[i]->fence;
+		__set_bit(I915_FENCE_FLAG_COMPOSITE,
+			  &requests[i]->fence.flags);
+	}
+
+	fence_array = dma_fence_array_create(num_requests,
+					     fences,
+					     context->parallel.fence_context,
+					     context->parallel.seqno++,
+					     false);
+	if (!fence_array) {
+		kfree(fences);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	/* Move ownership to the dma_fence_array created above */
+	for_each_batch_create_order(num_requests)
+		dma_fence_get(fences[i]);
+
+	return &fence_array->base;
+}
+
+/**
+ * i915_eb_select_engine() - Get engine references
+ * @ce: the context
+ *
+ * Get a reference on context @ce and its children, a reference on the
+ * associated VM and a wakeref on the associated tile. Also allocate @ce
+ * resources.
+ *
+ * Returns 0 upon success, -ve error upon failure.
+ * Returns -EIO if the associated tile is wedged.
+ */
+int i915_eb_select_engine(struct intel_context *ce)
+{
+	struct intel_context *child;
+	int err;
+
+	for_each_child(ce, child)
+		intel_context_get(child);
+	intel_gt_pm_get(ce->engine->gt);
+
+	if (!test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
+		err = intel_context_alloc_state(ce);
+		if (err)
+			goto err;
+	}
+	for_each_child(ce, child) {
+		if (!test_bit(CONTEXT_ALLOC_BIT, &child->flags)) {
+			err = intel_context_alloc_state(child);
+			if (err)
+				goto err;
+		}
+	}
+
+	/*
+	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
+	 * EIO if the GPU is already wedged.
+	 */
+	err = intel_gt_terminally_wedged(ce->engine->gt);
+	if (err)
+		goto err;
+
+	if (!i915_vm_tryget(ce->vm)) {
+		err = -ENOENT;
+		goto err;
+	}
+
+	return 0;
+err:
+	intel_gt_pm_put(ce->engine->gt);
+	for_each_child(ce, child)
+		intel_context_put(child);
+	return err;
+}
+
+/**
+ * i915_eb_put_engine() - Release engine references
+ * @ce: the context
+ *
+ * Release the references on context @ce and its children, the reference on
+ * the associated VM and the wakeref on the associated tile.
+ */
+void i915_eb_put_engine(struct intel_context *ce)
+{
+	struct intel_context *child;
+
+	i915_vm_put(ce->vm);
+	intel_gt_pm_put(ce->engine->gt);
+	for_each_child(ce, child)
+		intel_context_put(child);
+	intel_context_put(ce);
+}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h
new file mode 100644
index 000000000000..55b25e0357a5
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h
@@ -0,0 +1,74 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#ifndef __I915_GEM_EXECBUFFER_COMMON_H
+#define __I915_GEM_EXECBUFFER_COMMON_H
+
+#include <linux/types.h>
+
+struct dma_fence;
+struct dma_fence_chain;
+struct drm_file;
+struct drm_syncobj;
+
+struct intel_context;
+struct intel_gt;
+struct i915_gem_ww_ctx;
+struct i915_request;
+struct i915_sched_attr;
+
+/**
+ * struct eb_fence - Execbuffer fence
+ *
+ * Data structure for execbuffer timeline fence handling.
+ */
+struct eb_fence {
+	/** @syncobj: Pointer to user specified syncobj */
+	struct drm_syncobj *syncobj;
+
+	/** @dma_fence: Fence associated with @syncobj */
+	struct dma_fence *dma_fence;
+
+	/** @value: User specified point in the timeline */
+	u64 value;
+
+	/** @chain_fence: Fence chain to add the timeline point */
+	struct dma_fence_chain *chain_fence;
+};
+
+int i915_eb_pin_engine(struct intel_context *ce, struct i915_gem_ww_ctx *ww,
+		       bool throttle, bool nonblock);
+void i915_eb_unpin_engine(struct intel_context *ce);
+int i915_eb_select_engine(struct intel_context *ce);
+void i915_eb_put_engine(struct intel_context *ce);
+
+struct intel_context *
+i915_eb_find_context(struct intel_context *context,
+		     unsigned int context_number);
+
+int i915_eb_add_timeline_fence(struct drm_file *file, u32 handle, u64 point,
+			       struct eb_fence *f, bool wait, bool signal);
+void i915_eb_put_fence_array(struct eb_fence *fences, u64 num_fences);
+int i915_eb_await_fence_array(struct eb_fence *fences, u64 num_fences,
+			      struct i915_request *rq);
+void i915_eb_signal_fence_array(struct eb_fence *fences, u64 num_fences,
+				struct dma_fence * const fence);
+
+int i915_eb_requests_add(struct i915_request **requests,
+			 unsigned int num_requests,
+			 struct intel_context *context,
+			 struct i915_sched_attr sched,
+			 int err);
+void i915_eb_requests_get(struct i915_request **requests,
+			  unsigned int num_requests);
+void i915_eb_requests_put(struct i915_request **requests,
+			  unsigned int num_requests);
+
+struct dma_fence *
+i915_eb_composite_fence_create(struct i915_request **requests,
+			       unsigned int num_requests,
+			       struct intel_context *context);
+
+#endif /* __I915_GEM_EXECBUFFER_COMMON_H */
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread
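
For illustration, a minimal sketch of how a submission path is expected to
drive the common helpers above. The function name eb_submit_sketch, its
parameters and its error handling are assumptions for the example only (the
real callers appear in the execbuf and execbuf3 patches), and the
i915_gem_ww_ctx retry loop, object locking and batch emission are elided.

/*
 * Illustrative only: rough call order for the common execbuf helpers.
 * Assumes the caller already looked up @ce and holds a reference on it,
 * and that @ww is an initialized i915_gem_ww_ctx.
 */
static int eb_submit_sketch(struct intel_context *ce,
			    struct i915_gem_ww_ctx *ww, bool nonblock)
{
	struct i915_sched_attr sched = {};
	struct i915_request *rq;
	int err;

	err = i915_eb_select_engine(ce);	/* child refs, VM ref, GT wakeref */
	if (err)
		return err;

	err = i915_eb_pin_engine(ce, ww, true /* throttle */, nonblock);
	if (err)
		goto out_put;

	rq = i915_request_create(i915_eb_find_context(ce, 0));
	if (IS_ERR(rq)) {
		err = PTR_ERR(rq);
		goto out_unpin;
	}

	/* ... emit the batch, await/signal eb_fence arrays ... */

	i915_eb_requests_get(&rq, 1);
	err = i915_eb_requests_add(&rq, 1, ce, sched, err);
	i915_eb_requests_put(&rq, 1);

out_unpin:
	i915_eb_unpin_engine(ce);
out_put:
	i915_eb_put_engine(ce);	/* also drops the caller's reference on @ce */
	return err;
}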

* [PATCH v6 11/20] drm/i915/vm_bind: Use common execbuf functions in execbuf path
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

Update the execbuf path to use common execbuf functions to
reduce code duplication with the newer execbuf3 path.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 507 ++----------------
 1 file changed, 38 insertions(+), 469 deletions(-)
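
Condensed from the hunks below, the converted legacy wrappers end up as thin
shims around the common helpers, keeping only the execbuf-specific
bookkeeping. This is purely a summary of the diff that follows, not an
additional change:

static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle)
{
	int err;

	GEM_BUG_ON(eb->args->flags & __EXEC_ENGINE_PINNED);

	/* Delegate context pinning and ring-space throttling to the helper. */
	err = i915_eb_pin_engine(eb->context, &eb->ww, throttle,
				 eb->file->filp->f_flags & O_NONBLOCK);
	if (err)
		return err;

	eb->args->flags |= __EXEC_ENGINE_PINNED;
	return 0;
}

static void eb_unpin_engine(struct i915_execbuffer *eb)
{
	if (!(eb->args->flags & __EXEC_ENGINE_PINNED))
		return;

	eb->args->flags &= ~__EXEC_ENGINE_PINNED;
	/* The helper exits and unpins the context and all of its children. */
	i915_eb_unpin_engine(eb->context);
}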

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 43f29acfbec9..749c3c80e02d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -28,6 +28,7 @@
 #include "i915_file_private.h"
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
 #include "i915_gem_evict.h"
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
@@ -235,13 +236,6 @@ enum {
  * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
  */
 
-struct eb_fence {
-	struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
-	struct dma_fence *dma_fence;
-	u64 value;
-	struct dma_fence_chain *chain_fence;
-};
-
 struct i915_execbuffer {
 	struct drm_i915_private *i915; /** i915 backpointer */
 	struct drm_file *file; /** per-file lookup tables and limits */
@@ -2446,164 +2440,29 @@ static const enum intel_engine_id user_ring_map[] = {
 	[I915_EXEC_VEBOX]	= VECS0
 };
 
-static struct i915_request *eb_throttle(struct i915_execbuffer *eb, struct intel_context *ce)
-{
-	struct intel_ring *ring = ce->ring;
-	struct intel_timeline *tl = ce->timeline;
-	struct i915_request *rq;
-
-	/*
-	 * Completely unscientific finger-in-the-air estimates for suitable
-	 * maximum user request size (to avoid blocking) and then backoff.
-	 */
-	if (intel_ring_update_space(ring) >= PAGE_SIZE)
-		return NULL;
-
-	/*
-	 * Find a request that after waiting upon, there will be at least half
-	 * the ring available. The hysteresis allows us to compete for the
-	 * shared ring and should mean that we sleep less often prior to
-	 * claiming our resources, but not so long that the ring completely
-	 * drains before we can submit our next request.
-	 */
-	list_for_each_entry(rq, &tl->requests, link) {
-		if (rq->ring != ring)
-			continue;
-
-		if (__intel_ring_space(rq->postfix,
-				       ring->emit, ring->size) > ring->size / 2)
-			break;
-	}
-	if (&rq->link == &tl->requests)
-		return NULL; /* weird, we will check again later for real */
-
-	return i915_request_get(rq);
-}
-
-static int eb_pin_timeline(struct i915_execbuffer *eb, struct intel_context *ce,
-			   bool throttle)
-{
-	struct intel_timeline *tl;
-	struct i915_request *rq = NULL;
-
-	/*
-	 * Take a local wakeref for preparing to dispatch the execbuf as
-	 * we expect to access the hardware fairly frequently in the
-	 * process, and require the engine to be kept awake between accesses.
-	 * Upon dispatch, we acquire another prolonged wakeref that we hold
-	 * until the timeline is idle, which in turn releases the wakeref
-	 * taken on the engine, and the parent device.
-	 */
-	tl = intel_context_timeline_lock(ce);
-	if (IS_ERR(tl))
-		return PTR_ERR(tl);
-
-	intel_context_enter(ce);
-	if (throttle)
-		rq = eb_throttle(eb, ce);
-	intel_context_timeline_unlock(tl);
-
-	if (rq) {
-		bool nonblock = eb->file->filp->f_flags & O_NONBLOCK;
-		long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT;
-
-		if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE,
-				      timeout) < 0) {
-			i915_request_put(rq);
-
-			/*
-			 * Error path, cannot use intel_context_timeline_lock as
-			 * that is user interruptable and this clean up step
-			 * must be done.
-			 */
-			mutex_lock(&ce->timeline->mutex);
-			intel_context_exit(ce);
-			mutex_unlock(&ce->timeline->mutex);
-
-			if (nonblock)
-				return -EWOULDBLOCK;
-			else
-				return -EINTR;
-		}
-		i915_request_put(rq);
-	}
-
-	return 0;
-}
-
 static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle)
 {
-	struct intel_context *ce = eb->context, *child;
 	int err;
-	int i = 0, j = 0;
 
 	GEM_BUG_ON(eb->args->flags & __EXEC_ENGINE_PINNED);
 
-	if (unlikely(intel_context_is_banned(ce)))
-		return -EIO;
-
-	/*
-	 * Pinning the contexts may generate requests in order to acquire
-	 * GGTT space, so do this first before we reserve a seqno for
-	 * ourselves.
-	 */
-	err = intel_context_pin_ww(ce, &eb->ww);
+	err = i915_eb_pin_engine(eb->context, &eb->ww, throttle,
+				 eb->file->filp->f_flags & O_NONBLOCK);
 	if (err)
 		return err;
-	for_each_child(ce, child) {
-		err = intel_context_pin_ww(child, &eb->ww);
-		GEM_BUG_ON(err);	/* perma-pinned should incr a counter */
-	}
-
-	for_each_child(ce, child) {
-		err = eb_pin_timeline(eb, child, throttle);
-		if (err)
-			goto unwind;
-		++i;
-	}
-	err = eb_pin_timeline(eb, ce, throttle);
-	if (err)
-		goto unwind;
 
 	eb->args->flags |= __EXEC_ENGINE_PINNED;
 	return 0;
-
-unwind:
-	for_each_child(ce, child) {
-		if (j++ < i) {
-			mutex_lock(&child->timeline->mutex);
-			intel_context_exit(child);
-			mutex_unlock(&child->timeline->mutex);
-		}
-	}
-	for_each_child(ce, child)
-		intel_context_unpin(child);
-	intel_context_unpin(ce);
-	return err;
 }
 
 static void eb_unpin_engine(struct i915_execbuffer *eb)
 {
-	struct intel_context *ce = eb->context, *child;
-
 	if (!(eb->args->flags & __EXEC_ENGINE_PINNED))
 		return;
 
 	eb->args->flags &= ~__EXEC_ENGINE_PINNED;
 
-	for_each_child(ce, child) {
-		mutex_lock(&child->timeline->mutex);
-		intel_context_exit(child);
-		mutex_unlock(&child->timeline->mutex);
-
-		intel_context_unpin(child);
-	}
-
-	mutex_lock(&ce->timeline->mutex);
-	intel_context_exit(ce);
-	mutex_unlock(&ce->timeline->mutex);
-
-	intel_context_unpin(ce);
+	i915_eb_unpin_engine(eb->context);
 }
 
 static unsigned int
@@ -2652,7 +2511,7 @@ eb_select_legacy_ring(struct i915_execbuffer *eb)
 static int
 eb_select_engine(struct i915_execbuffer *eb)
 {
-	struct intel_context *ce, *child;
+	struct intel_context *ce;
 	unsigned int idx;
 	int err;
 
@@ -2677,36 +2536,10 @@ eb_select_engine(struct i915_execbuffer *eb)
 	}
 	eb->num_batches = ce->parallel.number_children + 1;
 
-	for_each_child(ce, child)
-		intel_context_get(child);
-	intel_gt_pm_get(ce->engine->gt);
-
-	if (!test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
-		err = intel_context_alloc_state(ce);
-		if (err)
-			goto err;
-	}
-	for_each_child(ce, child) {
-		if (!test_bit(CONTEXT_ALLOC_BIT, &child->flags)) {
-			err = intel_context_alloc_state(child);
-			if (err)
-				goto err;
-		}
-	}
-
-	/*
-	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
-	 * EIO if the GPU is already wedged.
-	 */
-	err = intel_gt_terminally_wedged(ce->engine->gt);
+	err = i915_eb_select_engine(ce);
 	if (err)
 		goto err;
 
-	if (!i915_vm_tryget(ce->vm)) {
-		err = -ENOENT;
-		goto err;
-	}
-
 	eb->context = ce;
 	eb->gt = ce->engine->gt;
 
@@ -2715,12 +2548,9 @@ eb_select_engine(struct i915_execbuffer *eb)
 	 * during ww handling. The pool is destroyed when last pm reference
 	 * is dropped, which breaks our -EDEADLK handling.
 	 */
-	return err;
+	return 0;
 
 err:
-	intel_gt_pm_put(ce->engine->gt);
-	for_each_child(ce, child)
-		intel_context_put(child);
 	intel_context_put(ce);
 	return err;
 }
@@ -2728,24 +2558,7 @@ eb_select_engine(struct i915_execbuffer *eb)
 static void
 eb_put_engine(struct i915_execbuffer *eb)
 {
-	struct intel_context *child;
-
-	i915_vm_put(eb->context->vm);
-	intel_gt_pm_put(eb->gt);
-	for_each_child(eb->context, child)
-		intel_context_put(child);
-	intel_context_put(eb->context);
-}
-
-static void
-__free_fence_array(struct eb_fence *fences, unsigned int n)
-{
-	while (n--) {
-		drm_syncobj_put(ptr_mask_bits(fences[n].syncobj, 2));
-		dma_fence_put(fences[n].dma_fence);
-		dma_fence_chain_free(fences[n].chain_fence);
-	}
-	kvfree(fences);
+	i915_eb_put_engine(eb->context);
 }
 
 static int
@@ -2756,7 +2569,6 @@ add_timeline_fence_array(struct i915_execbuffer *eb,
 	u64 __user *user_values;
 	struct eb_fence *f;
 	u64 nfences;
-	int err = 0;
 
 	nfences = timeline_fences->fence_count;
 	if (!nfences)
@@ -2791,9 +2603,9 @@ add_timeline_fence_array(struct i915_execbuffer *eb,
 
 	while (nfences--) {
 		struct drm_i915_gem_exec_fence user_fence;
-		struct drm_syncobj *syncobj;
-		struct dma_fence *fence = NULL;
+		bool wait, signal;
 		u64 point;
+		int ret;
 
 		if (__copy_from_user(&user_fence,
 				     user_fences++,
@@ -2806,70 +2618,15 @@ add_timeline_fence_array(struct i915_execbuffer *eb,
 		if (__get_user(point, user_values++))
 			return -EFAULT;
 
-		syncobj = drm_syncobj_find(eb->file, user_fence.handle);
-		if (!syncobj) {
-			DRM_DEBUG("Invalid syncobj handle provided\n");
-			return -ENOENT;
-		}
-
-		fence = drm_syncobj_fence_get(syncobj);
-
-		if (!fence && user_fence.flags &&
-		    !(user_fence.flags & I915_EXEC_FENCE_SIGNAL)) {
-			DRM_DEBUG("Syncobj handle has no fence\n");
-			drm_syncobj_put(syncobj);
-			return -EINVAL;
-		}
-
-		if (fence)
-			err = dma_fence_chain_find_seqno(&fence, point);
-
-		if (err && !(user_fence.flags & I915_EXEC_FENCE_SIGNAL)) {
-			DRM_DEBUG("Syncobj handle missing requested point %llu\n", point);
-			dma_fence_put(fence);
-			drm_syncobj_put(syncobj);
-			return err;
-		}
-
-		/*
-		 * A point might have been signaled already and
-		 * garbage collected from the timeline. In this case
-		 * just ignore the point and carry on.
-		 */
-		if (!fence && !(user_fence.flags & I915_EXEC_FENCE_SIGNAL)) {
-			drm_syncobj_put(syncobj);
+		wait = user_fence.flags & I915_EXEC_FENCE_WAIT;
+		signal = user_fence.flags & I915_EXEC_FENCE_SIGNAL;
+		ret = i915_eb_add_timeline_fence(eb->file, user_fence.handle,
+						 point, f, wait, signal);
+		if (ret < 0)
+			return ret;
+		else if (!ret)
 			continue;
-		}
 
-		/*
-		 * For timeline syncobjs we need to preallocate chains for
-		 * later signaling.
-		 */
-		if (point != 0 && user_fence.flags & I915_EXEC_FENCE_SIGNAL) {
-			/*
-			 * Waiting and signaling the same point (when point !=
-			 * 0) would break the timeline.
-			 */
-			if (user_fence.flags & I915_EXEC_FENCE_WAIT) {
-				DRM_DEBUG("Trying to wait & signal the same timeline point.\n");
-				dma_fence_put(fence);
-				drm_syncobj_put(syncobj);
-				return -EINVAL;
-			}
-
-			f->chain_fence = dma_fence_chain_alloc();
-			if (!f->chain_fence) {
-				drm_syncobj_put(syncobj);
-				dma_fence_put(fence);
-				return -ENOMEM;
-			}
-		} else {
-			f->chain_fence = NULL;
-		}
-
-		f->syncobj = ptr_pack_bits(syncobj, user_fence.flags, 2);
-		f->dma_fence = fence;
-		f->value = point;
 		f++;
 		eb->num_fences++;
 	}
@@ -2949,60 +2706,6 @@ static int add_fence_array(struct i915_execbuffer *eb)
 	return 0;
 }
 
-static void put_fence_array(struct eb_fence *fences, int num_fences)
-{
-	if (fences)
-		__free_fence_array(fences, num_fences);
-}
-
-static int
-await_fence_array(struct i915_execbuffer *eb,
-		  struct i915_request *rq)
-{
-	unsigned int n;
-	int err;
-
-	for (n = 0; n < eb->num_fences; n++) {
-		if (!eb->fences[n].dma_fence)
-			continue;
-
-		err = i915_request_await_dma_fence(rq, eb->fences[n].dma_fence);
-		if (err < 0)
-			return err;
-	}
-
-	return 0;
-}
-
-static void signal_fence_array(const struct i915_execbuffer *eb,
-			       struct dma_fence * const fence)
-{
-	unsigned int n;
-
-	for (n = 0; n < eb->num_fences; n++) {
-		struct drm_syncobj *syncobj;
-		unsigned int flags;
-
-		syncobj = ptr_unpack_bits(eb->fences[n].syncobj, &flags, 2);
-		if (!(flags & I915_EXEC_FENCE_SIGNAL))
-			continue;
-
-		if (eb->fences[n].chain_fence) {
-			drm_syncobj_add_point(syncobj,
-					      eb->fences[n].chain_fence,
-					      fence,
-					      eb->fences[n].value);
-			/*
-			 * The chain's ownership is transferred to the
-			 * timeline.
-			 */
-			eb->fences[n].chain_fence = NULL;
-		} else {
-			drm_syncobj_replace_fence(syncobj, fence);
-		}
-	}
-}
-
 static int
 parse_timeline_fences(struct i915_user_extension __user *ext, void *data)
 {
@@ -3015,80 +2718,6 @@ parse_timeline_fences(struct i915_user_extension __user *ext, void *data)
 	return add_timeline_fence_array(eb, &timeline_fences);
 }
 
-static void retire_requests(struct intel_timeline *tl, struct i915_request *end)
-{
-	struct i915_request *rq, *rn;
-
-	list_for_each_entry_safe(rq, rn, &tl->requests, link)
-		if (rq == end || !i915_request_retire(rq))
-			break;
-}
-
-static int eb_request_add(struct i915_execbuffer *eb, struct i915_request *rq,
-			  int err, bool last_parallel)
-{
-	struct intel_timeline * const tl = i915_request_timeline(rq);
-	struct i915_sched_attr attr = {};
-	struct i915_request *prev;
-
-	lockdep_assert_held(&tl->mutex);
-	lockdep_unpin_lock(&tl->mutex, rq->cookie);
-
-	trace_i915_request_add(rq);
-
-	prev = __i915_request_commit(rq);
-
-	/* Check that the context wasn't destroyed before submission */
-	if (likely(!intel_context_is_closed(eb->context))) {
-		attr = eb->gem_context->sched;
-	} else {
-		/* Serialise with context_close via the add_to_timeline */
-		i915_request_set_error_once(rq, -ENOENT);
-		__i915_request_skip(rq);
-		err = -ENOENT; /* override any transient errors */
-	}
-
-	if (intel_context_is_parallel(eb->context)) {
-		if (err) {
-			__i915_request_skip(rq);
-			set_bit(I915_FENCE_FLAG_SKIP_PARALLEL,
-				&rq->fence.flags);
-		}
-		if (last_parallel)
-			set_bit(I915_FENCE_FLAG_SUBMIT_PARALLEL,
-				&rq->fence.flags);
-	}
-
-	__i915_request_queue(rq, &attr);
-
-	/* Try to clean up the client's timeline after submitting the request */
-	if (prev)
-		retire_requests(tl, prev);
-
-	mutex_unlock(&tl->mutex);
-
-	return err;
-}
-
-static int eb_requests_add(struct i915_execbuffer *eb, int err)
-{
-	int i;
-
-	/*
-	 * We iterate in reverse order of creation to release timeline mutexes in
-	 * same order.
-	 */
-	for_each_batch_add_order(eb, i) {
-		struct i915_request *rq = eb->requests[i];
-
-		if (!rq)
-			continue;
-		err |= eb_request_add(eb, rq, err, i == 0);
-	}
-
-	return err;
-}
-
 static const i915_user_extension_fn execbuf_extensions[] = {
 	[DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES] = parse_timeline_fences,
 };
@@ -3115,73 +2744,26 @@ parse_execbuf2_extensions(struct drm_i915_gem_execbuffer2 *args,
 				    eb);
 }
 
-static void eb_requests_get(struct i915_execbuffer *eb)
-{
-	unsigned int i;
-
-	for_each_batch_create_order(eb, i) {
-		if (!eb->requests[i])
-			break;
-
-		i915_request_get(eb->requests[i]);
-	}
-}
-
-static void eb_requests_put(struct i915_execbuffer *eb)
-{
-	unsigned int i;
-
-	for_each_batch_create_order(eb, i) {
-		if (!eb->requests[i])
-			break;
-
-		i915_request_put(eb->requests[i]);
-	}
-}
-
 static struct sync_file *
 eb_composite_fence_create(struct i915_execbuffer *eb, int out_fence_fd)
 {
 	struct sync_file *out_fence = NULL;
-	struct dma_fence_array *fence_array;
-	struct dma_fence **fences;
-	unsigned int i;
-
-	GEM_BUG_ON(!intel_context_is_parent(eb->context));
+	struct dma_fence *fence;
 
-	fences = kmalloc_array(eb->num_batches, sizeof(*fences), GFP_KERNEL);
-	if (!fences)
-		return ERR_PTR(-ENOMEM);
-
-	for_each_batch_create_order(eb, i) {
-		fences[i] = &eb->requests[i]->fence;
-		__set_bit(I915_FENCE_FLAG_COMPOSITE,
-			  &eb->requests[i]->fence.flags);
-	}
-
-	fence_array = dma_fence_array_create(eb->num_batches,
-					     fences,
-					     eb->context->parallel.fence_context,
-					     eb->context->parallel.seqno++,
-					     false);
-	if (!fence_array) {
-		kfree(fences);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	/* Move ownership to the dma_fence_array created above */
-	for_each_batch_create_order(eb, i)
-		dma_fence_get(fences[i]);
+	fence = i915_eb_composite_fence_create(eb->requests, eb->num_batches,
+					       eb->context);
+	if (IS_ERR(fence))
+		return ERR_CAST(fence);
 
 	if (out_fence_fd != -1) {
-		out_fence = sync_file_create(&fence_array->base);
+		out_fence = sync_file_create(fence);
 		/* sync_file now owns fence_arry, drop creation ref */
-		dma_fence_put(&fence_array->base);
+		dma_fence_put(fence);
 		if (!out_fence)
 			return ERR_PTR(-ENOMEM);
 	}
 
-	eb->composite_fence = &fence_array->base;
+	eb->composite_fence = fence;
 
 	return out_fence;
 }
@@ -3213,7 +2795,7 @@ eb_fences_add(struct i915_execbuffer *eb, struct i915_request *rq,
 	}
 
 	if (eb->fences) {
-		err = await_fence_array(eb, rq);
+		err = i915_eb_await_fence_array(eb->fences, eb->num_fences, rq);
 		if (err)
 			return ERR_PTR(err);
 	}
@@ -3231,23 +2813,6 @@ eb_fences_add(struct i915_execbuffer *eb, struct i915_request *rq,
 	return out_fence;
 }
 
-static struct intel_context *
-eb_find_context(struct i915_execbuffer *eb, unsigned int context_number)
-{
-	struct intel_context *child;
-
-	if (likely(context_number == 0))
-		return eb->context;
-
-	for_each_child(eb->context, child)
-		if (!--context_number)
-			return child;
-
-	GEM_BUG_ON("Context not found");
-
-	return NULL;
-}
-
 static struct sync_file *
 eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence,
 		   int out_fence_fd)
@@ -3257,7 +2822,9 @@ eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence,
 
 	for_each_batch_create_order(eb, i) {
 		/* Allocate a request for this batch buffer nice and early. */
-		eb->requests[i] = i915_request_create(eb_find_context(eb, i));
+		eb->requests[i] =
+			i915_request_create(i915_eb_find_context(eb->context,
+								 i));
 		if (IS_ERR(eb->requests[i])) {
 			out_fence = ERR_CAST(eb->requests[i]);
 			eb->requests[i] = NULL;
@@ -3437,13 +3004,15 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	err = eb_submit(&eb);
 
 err_request:
-	eb_requests_get(&eb);
-	err = eb_requests_add(&eb, err);
+	i915_eb_requests_get(eb.requests, eb.num_batches);
+	err = i915_eb_requests_add(eb.requests, eb.num_batches, eb.context,
+				   eb.gem_context->sched, err);
 
 	if (eb.fences)
-		signal_fence_array(&eb, eb.composite_fence ?
-				   eb.composite_fence :
-				   &eb.requests[0]->fence);
+		i915_eb_signal_fence_array(eb.fences, eb.num_fences,
+					   eb.composite_fence ?
+					   eb.composite_fence :
+					   &eb.requests[0]->fence);
 
 	if (out_fence) {
 		if (err == 0) {
@@ -3466,7 +3035,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (!out_fence && eb.composite_fence)
 		dma_fence_put(eb.composite_fence);
 
-	eb_requests_put(&eb);
+	i915_eb_requests_put(eb.requests, eb.num_batches);
 
 err_vma:
 	eb_release_vmas(&eb, true);
@@ -3487,7 +3056,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_in_fence:
 	dma_fence_put(in_fence);
 err_ext:
-	put_fence_array(eb.fences, eb.num_fences);
+	i915_eb_put_fence_array(eb.fences, eb.num_fences);
 	return err;
 }
 
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread
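
As a usage note for the conversion above: i915_eb_add_timeline_fence() has a
three-way return (negative error, 0 when the requested point is already
signaled and can be skipped, 1 when an eb_fence slot was filled). A minimal
sketch of a caller loop follows; add_fences_sketch and its parameters are
placeholders for the pattern used by add_timeline_fence_array(), and copying
the fence handles and points from userspace is elided.

/*
 * Illustrative only: how a caller consumes the tri-state return of
 * i915_eb_add_timeline_fence(). @f must have room for @nfences entries.
 */
static int add_fences_sketch(struct drm_file *file,
			     const struct drm_i915_gem_exec_fence *user_fences,
			     const u64 *points, u64 nfences,
			     struct eb_fence *f)
{
	unsigned int num_added = 0;

	while (nfences--) {
		bool wait = user_fences->flags & I915_EXEC_FENCE_WAIT;
		bool signal = user_fences->flags & I915_EXEC_FENCE_SIGNAL;
		int ret;

		ret = i915_eb_add_timeline_fence(file, user_fences->handle,
						 *points, f, wait, signal);
		if (ret < 0)
			return ret;	/* lookup failure, -EINVAL or -ENOMEM */

		if (ret > 0) {
			f++;		/* slot consumed, fence recorded */
			num_added++;
		}
		/* ret == 0: point already signaled and retired, skip it */

		user_fences++;
		points++;
	}

	return num_added;
}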

* [Intel-gfx] [PATCH v6 11/20] drm/i915/vm_bind: Use common execbuf functions in execbuf path
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Update the execbuf path to use common execbuf functions to
reduce code duplication with the newer execbuf3 path.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 507 ++----------------
 1 file changed, 38 insertions(+), 469 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 43f29acfbec9..749c3c80e02d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -28,6 +28,7 @@
 #include "i915_file_private.h"
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
 #include "i915_gem_evict.h"
 #include "i915_gem_ioctls.h"
 #include "i915_trace.h"
@@ -235,13 +236,6 @@ enum {
  * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
  */
 
-struct eb_fence {
-	struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
-	struct dma_fence *dma_fence;
-	u64 value;
-	struct dma_fence_chain *chain_fence;
-};
-
 struct i915_execbuffer {
 	struct drm_i915_private *i915; /** i915 backpointer */
 	struct drm_file *file; /** per-file lookup tables and limits */
@@ -2446,164 +2440,29 @@ static const enum intel_engine_id user_ring_map[] = {
 	[I915_EXEC_VEBOX]	= VECS0
 };
 
-static struct i915_request *eb_throttle(struct i915_execbuffer *eb, struct intel_context *ce)
-{
-	struct intel_ring *ring = ce->ring;
-	struct intel_timeline *tl = ce->timeline;
-	struct i915_request *rq;
-
-	/*
-	 * Completely unscientific finger-in-the-air estimates for suitable
-	 * maximum user request size (to avoid blocking) and then backoff.
-	 */
-	if (intel_ring_update_space(ring) >= PAGE_SIZE)
-		return NULL;
-
-	/*
-	 * Find a request that after waiting upon, there will be at least half
-	 * the ring available. The hysteresis allows us to compete for the
-	 * shared ring and should mean that we sleep less often prior to
-	 * claiming our resources, but not so long that the ring completely
-	 * drains before we can submit our next request.
-	 */
-	list_for_each_entry(rq, &tl->requests, link) {
-		if (rq->ring != ring)
-			continue;
-
-		if (__intel_ring_space(rq->postfix,
-				       ring->emit, ring->size) > ring->size / 2)
-			break;
-	}
-	if (&rq->link == &tl->requests)
-		return NULL; /* weird, we will check again later for real */
-
-	return i915_request_get(rq);
-}
-
-static int eb_pin_timeline(struct i915_execbuffer *eb, struct intel_context *ce,
-			   bool throttle)
-{
-	struct intel_timeline *tl;
-	struct i915_request *rq = NULL;
-
-	/*
-	 * Take a local wakeref for preparing to dispatch the execbuf as
-	 * we expect to access the hardware fairly frequently in the
-	 * process, and require the engine to be kept awake between accesses.
-	 * Upon dispatch, we acquire another prolonged wakeref that we hold
-	 * until the timeline is idle, which in turn releases the wakeref
-	 * taken on the engine, and the parent device.
-	 */
-	tl = intel_context_timeline_lock(ce);
-	if (IS_ERR(tl))
-		return PTR_ERR(tl);
-
-	intel_context_enter(ce);
-	if (throttle)
-		rq = eb_throttle(eb, ce);
-	intel_context_timeline_unlock(tl);
-
-	if (rq) {
-		bool nonblock = eb->file->filp->f_flags & O_NONBLOCK;
-		long timeout = nonblock ? 0 : MAX_SCHEDULE_TIMEOUT;
-
-		if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE,
-				      timeout) < 0) {
-			i915_request_put(rq);
-
-			/*
-			 * Error path, cannot use intel_context_timeline_lock as
-			 * that is user interruptable and this clean up step
-			 * must be done.
-			 */
-			mutex_lock(&ce->timeline->mutex);
-			intel_context_exit(ce);
-			mutex_unlock(&ce->timeline->mutex);
-
-			if (nonblock)
-				return -EWOULDBLOCK;
-			else
-				return -EINTR;
-		}
-		i915_request_put(rq);
-	}
-
-	return 0;
-}
-
 static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle)
 {
-	struct intel_context *ce = eb->context, *child;
 	int err;
-	int i = 0, j = 0;
 
 	GEM_BUG_ON(eb->args->flags & __EXEC_ENGINE_PINNED);
 
-	if (unlikely(intel_context_is_banned(ce)))
-		return -EIO;
-
-	/*
-	 * Pinning the contexts may generate requests in order to acquire
-	 * GGTT space, so do this first before we reserve a seqno for
-	 * ourselves.
-	 */
-	err = intel_context_pin_ww(ce, &eb->ww);
+	err = i915_eb_pin_engine(eb->context, &eb->ww, throttle,
+				 eb->file->filp->f_flags & O_NONBLOCK);
 	if (err)
 		return err;
-	for_each_child(ce, child) {
-		err = intel_context_pin_ww(child, &eb->ww);
-		GEM_BUG_ON(err);	/* perma-pinned should incr a counter */
-	}
-
-	for_each_child(ce, child) {
-		err = eb_pin_timeline(eb, child, throttle);
-		if (err)
-			goto unwind;
-		++i;
-	}
-	err = eb_pin_timeline(eb, ce, throttle);
-	if (err)
-		goto unwind;
 
 	eb->args->flags |= __EXEC_ENGINE_PINNED;
 	return 0;
-
-unwind:
-	for_each_child(ce, child) {
-		if (j++ < i) {
-			mutex_lock(&child->timeline->mutex);
-			intel_context_exit(child);
-			mutex_unlock(&child->timeline->mutex);
-		}
-	}
-	for_each_child(ce, child)
-		intel_context_unpin(child);
-	intel_context_unpin(ce);
-	return err;
 }
 
 static void eb_unpin_engine(struct i915_execbuffer *eb)
 {
-	struct intel_context *ce = eb->context, *child;
-
 	if (!(eb->args->flags & __EXEC_ENGINE_PINNED))
 		return;
 
 	eb->args->flags &= ~__EXEC_ENGINE_PINNED;
 
-	for_each_child(ce, child) {
-		mutex_lock(&child->timeline->mutex);
-		intel_context_exit(child);
-		mutex_unlock(&child->timeline->mutex);
-
-		intel_context_unpin(child);
-	}
-
-	mutex_lock(&ce->timeline->mutex);
-	intel_context_exit(ce);
-	mutex_unlock(&ce->timeline->mutex);
-
-	intel_context_unpin(ce);
+	i915_eb_unpin_engine(eb->context);
 }
 
 static unsigned int
@@ -2652,7 +2511,7 @@ eb_select_legacy_ring(struct i915_execbuffer *eb)
 static int
 eb_select_engine(struct i915_execbuffer *eb)
 {
-	struct intel_context *ce, *child;
+	struct intel_context *ce;
 	unsigned int idx;
 	int err;
 
@@ -2677,36 +2536,10 @@ eb_select_engine(struct i915_execbuffer *eb)
 	}
 	eb->num_batches = ce->parallel.number_children + 1;
 
-	for_each_child(ce, child)
-		intel_context_get(child);
-	intel_gt_pm_get(ce->engine->gt);
-
-	if (!test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
-		err = intel_context_alloc_state(ce);
-		if (err)
-			goto err;
-	}
-	for_each_child(ce, child) {
-		if (!test_bit(CONTEXT_ALLOC_BIT, &child->flags)) {
-			err = intel_context_alloc_state(child);
-			if (err)
-				goto err;
-		}
-	}
-
-	/*
-	 * ABI: Before userspace accesses the GPU (e.g. execbuffer), report
-	 * EIO if the GPU is already wedged.
-	 */
-	err = intel_gt_terminally_wedged(ce->engine->gt);
+	err = i915_eb_select_engine(ce);
 	if (err)
 		goto err;
 
-	if (!i915_vm_tryget(ce->vm)) {
-		err = -ENOENT;
-		goto err;
-	}
-
 	eb->context = ce;
 	eb->gt = ce->engine->gt;
 
@@ -2715,12 +2548,9 @@ eb_select_engine(struct i915_execbuffer *eb)
 	 * during ww handling. The pool is destroyed when last pm reference
 	 * is dropped, which breaks our -EDEADLK handling.
 	 */
-	return err;
+	return 0;
 
 err:
-	intel_gt_pm_put(ce->engine->gt);
-	for_each_child(ce, child)
-		intel_context_put(child);
 	intel_context_put(ce);
 	return err;
 }
@@ -2728,24 +2558,7 @@ eb_select_engine(struct i915_execbuffer *eb)
 static void
 eb_put_engine(struct i915_execbuffer *eb)
 {
-	struct intel_context *child;
-
-	i915_vm_put(eb->context->vm);
-	intel_gt_pm_put(eb->gt);
-	for_each_child(eb->context, child)
-		intel_context_put(child);
-	intel_context_put(eb->context);
-}
-
-static void
-__free_fence_array(struct eb_fence *fences, unsigned int n)
-{
-	while (n--) {
-		drm_syncobj_put(ptr_mask_bits(fences[n].syncobj, 2));
-		dma_fence_put(fences[n].dma_fence);
-		dma_fence_chain_free(fences[n].chain_fence);
-	}
-	kvfree(fences);
+	i915_eb_put_engine(eb->context);
 }
 
 static int
@@ -2756,7 +2569,6 @@ add_timeline_fence_array(struct i915_execbuffer *eb,
 	u64 __user *user_values;
 	struct eb_fence *f;
 	u64 nfences;
-	int err = 0;
 
 	nfences = timeline_fences->fence_count;
 	if (!nfences)
@@ -2791,9 +2603,9 @@ add_timeline_fence_array(struct i915_execbuffer *eb,
 
 	while (nfences--) {
 		struct drm_i915_gem_exec_fence user_fence;
-		struct drm_syncobj *syncobj;
-		struct dma_fence *fence = NULL;
+		bool wait, signal;
 		u64 point;
+		int ret;
 
 		if (__copy_from_user(&user_fence,
 				     user_fences++,
@@ -2806,70 +2618,15 @@ add_timeline_fence_array(struct i915_execbuffer *eb,
 		if (__get_user(point, user_values++))
 			return -EFAULT;
 
-		syncobj = drm_syncobj_find(eb->file, user_fence.handle);
-		if (!syncobj) {
-			DRM_DEBUG("Invalid syncobj handle provided\n");
-			return -ENOENT;
-		}
-
-		fence = drm_syncobj_fence_get(syncobj);
-
-		if (!fence && user_fence.flags &&
-		    !(user_fence.flags & I915_EXEC_FENCE_SIGNAL)) {
-			DRM_DEBUG("Syncobj handle has no fence\n");
-			drm_syncobj_put(syncobj);
-			return -EINVAL;
-		}
-
-		if (fence)
-			err = dma_fence_chain_find_seqno(&fence, point);
-
-		if (err && !(user_fence.flags & I915_EXEC_FENCE_SIGNAL)) {
-			DRM_DEBUG("Syncobj handle missing requested point %llu\n", point);
-			dma_fence_put(fence);
-			drm_syncobj_put(syncobj);
-			return err;
-		}
-
-		/*
-		 * A point might have been signaled already and
-		 * garbage collected from the timeline. In this case
-		 * just ignore the point and carry on.
-		 */
-		if (!fence && !(user_fence.flags & I915_EXEC_FENCE_SIGNAL)) {
-			drm_syncobj_put(syncobj);
+		wait = user_fence.flags & I915_EXEC_FENCE_WAIT;
+		signal = user_fence.flags & I915_EXEC_FENCE_SIGNAL;
+		ret = i915_eb_add_timeline_fence(eb->file, user_fence.handle,
+						 point, f, wait, signal);
+		if (ret < 0)
+			return ret;
+		else if (!ret)
 			continue;
-		}
 
-		/*
-		 * For timeline syncobjs we need to preallocate chains for
-		 * later signaling.
-		 */
-		if (point != 0 && user_fence.flags & I915_EXEC_FENCE_SIGNAL) {
-			/*
-			 * Waiting and signaling the same point (when point !=
-			 * 0) would break the timeline.
-			 */
-			if (user_fence.flags & I915_EXEC_FENCE_WAIT) {
-				DRM_DEBUG("Trying to wait & signal the same timeline point.\n");
-				dma_fence_put(fence);
-				drm_syncobj_put(syncobj);
-				return -EINVAL;
-			}
-
-			f->chain_fence = dma_fence_chain_alloc();
-			if (!f->chain_fence) {
-				drm_syncobj_put(syncobj);
-				dma_fence_put(fence);
-				return -ENOMEM;
-			}
-		} else {
-			f->chain_fence = NULL;
-		}
-
-		f->syncobj = ptr_pack_bits(syncobj, user_fence.flags, 2);
-		f->dma_fence = fence;
-		f->value = point;
 		f++;
 		eb->num_fences++;
 	}
@@ -2949,60 +2706,6 @@ static int add_fence_array(struct i915_execbuffer *eb)
 	return 0;
 }
 
-static void put_fence_array(struct eb_fence *fences, int num_fences)
-{
-	if (fences)
-		__free_fence_array(fences, num_fences);
-}
-
-static int
-await_fence_array(struct i915_execbuffer *eb,
-		  struct i915_request *rq)
-{
-	unsigned int n;
-	int err;
-
-	for (n = 0; n < eb->num_fences; n++) {
-		if (!eb->fences[n].dma_fence)
-			continue;
-
-		err = i915_request_await_dma_fence(rq, eb->fences[n].dma_fence);
-		if (err < 0)
-			return err;
-	}
-
-	return 0;
-}
-
-static void signal_fence_array(const struct i915_execbuffer *eb,
-			       struct dma_fence * const fence)
-{
-	unsigned int n;
-
-	for (n = 0; n < eb->num_fences; n++) {
-		struct drm_syncobj *syncobj;
-		unsigned int flags;
-
-		syncobj = ptr_unpack_bits(eb->fences[n].syncobj, &flags, 2);
-		if (!(flags & I915_EXEC_FENCE_SIGNAL))
-			continue;
-
-		if (eb->fences[n].chain_fence) {
-			drm_syncobj_add_point(syncobj,
-					      eb->fences[n].chain_fence,
-					      fence,
-					      eb->fences[n].value);
-			/*
-			 * The chain's ownership is transferred to the
-			 * timeline.
-			 */
-			eb->fences[n].chain_fence = NULL;
-		} else {
-			drm_syncobj_replace_fence(syncobj, fence);
-		}
-	}
-}
-
 static int
 parse_timeline_fences(struct i915_user_extension __user *ext, void *data)
 {
@@ -3015,80 +2718,6 @@ parse_timeline_fences(struct i915_user_extension __user *ext, void *data)
 	return add_timeline_fence_array(eb, &timeline_fences);
 }
 
-static void retire_requests(struct intel_timeline *tl, struct i915_request *end)
-{
-	struct i915_request *rq, *rn;
-
-	list_for_each_entry_safe(rq, rn, &tl->requests, link)
-		if (rq == end || !i915_request_retire(rq))
-			break;
-}
-
-static int eb_request_add(struct i915_execbuffer *eb, struct i915_request *rq,
-			  int err, bool last_parallel)
-{
-	struct intel_timeline * const tl = i915_request_timeline(rq);
-	struct i915_sched_attr attr = {};
-	struct i915_request *prev;
-
-	lockdep_assert_held(&tl->mutex);
-	lockdep_unpin_lock(&tl->mutex, rq->cookie);
-
-	trace_i915_request_add(rq);
-
-	prev = __i915_request_commit(rq);
-
-	/* Check that the context wasn't destroyed before submission */
-	if (likely(!intel_context_is_closed(eb->context))) {
-		attr = eb->gem_context->sched;
-	} else {
-		/* Serialise with context_close via the add_to_timeline */
-		i915_request_set_error_once(rq, -ENOENT);
-		__i915_request_skip(rq);
-		err = -ENOENT; /* override any transient errors */
-	}
-
-	if (intel_context_is_parallel(eb->context)) {
-		if (err) {
-			__i915_request_skip(rq);
-			set_bit(I915_FENCE_FLAG_SKIP_PARALLEL,
-				&rq->fence.flags);
-		}
-		if (last_parallel)
-			set_bit(I915_FENCE_FLAG_SUBMIT_PARALLEL,
-				&rq->fence.flags);
-	}
-
-	__i915_request_queue(rq, &attr);
-
-	/* Try to clean up the client's timeline after submitting the request */
-	if (prev)
-		retire_requests(tl, prev);
-
-	mutex_unlock(&tl->mutex);
-
-	return err;
-}
-
-static int eb_requests_add(struct i915_execbuffer *eb, int err)
-{
-	int i;
-
-	/*
-	 * We iterate in reverse order of creation to release timeline mutexes in
-	 * same order.
-	 */
-	for_each_batch_add_order(eb, i) {
-		struct i915_request *rq = eb->requests[i];
-
-		if (!rq)
-			continue;
-		err |= eb_request_add(eb, rq, err, i == 0);
-	}
-
-	return err;
-}
-
 static const i915_user_extension_fn execbuf_extensions[] = {
 	[DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES] = parse_timeline_fences,
 };
@@ -3115,73 +2744,26 @@ parse_execbuf2_extensions(struct drm_i915_gem_execbuffer2 *args,
 				    eb);
 }
 
-static void eb_requests_get(struct i915_execbuffer *eb)
-{
-	unsigned int i;
-
-	for_each_batch_create_order(eb, i) {
-		if (!eb->requests[i])
-			break;
-
-		i915_request_get(eb->requests[i]);
-	}
-}
-
-static void eb_requests_put(struct i915_execbuffer *eb)
-{
-	unsigned int i;
-
-	for_each_batch_create_order(eb, i) {
-		if (!eb->requests[i])
-			break;
-
-		i915_request_put(eb->requests[i]);
-	}
-}
-
 static struct sync_file *
 eb_composite_fence_create(struct i915_execbuffer *eb, int out_fence_fd)
 {
 	struct sync_file *out_fence = NULL;
-	struct dma_fence_array *fence_array;
-	struct dma_fence **fences;
-	unsigned int i;
-
-	GEM_BUG_ON(!intel_context_is_parent(eb->context));
+	struct dma_fence *fence;
 
-	fences = kmalloc_array(eb->num_batches, sizeof(*fences), GFP_KERNEL);
-	if (!fences)
-		return ERR_PTR(-ENOMEM);
-
-	for_each_batch_create_order(eb, i) {
-		fences[i] = &eb->requests[i]->fence;
-		__set_bit(I915_FENCE_FLAG_COMPOSITE,
-			  &eb->requests[i]->fence.flags);
-	}
-
-	fence_array = dma_fence_array_create(eb->num_batches,
-					     fences,
-					     eb->context->parallel.fence_context,
-					     eb->context->parallel.seqno++,
-					     false);
-	if (!fence_array) {
-		kfree(fences);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	/* Move ownership to the dma_fence_array created above */
-	for_each_batch_create_order(eb, i)
-		dma_fence_get(fences[i]);
+	fence = i915_eb_composite_fence_create(eb->requests, eb->num_batches,
+					       eb->context);
+	if (IS_ERR(fence))
+		return ERR_CAST(fence);
 
 	if (out_fence_fd != -1) {
-		out_fence = sync_file_create(&fence_array->base);
+		out_fence = sync_file_create(fence);
 		/* sync_file now owns fence_arry, drop creation ref */
-		dma_fence_put(&fence_array->base);
+		dma_fence_put(fence);
 		if (!out_fence)
 			return ERR_PTR(-ENOMEM);
 	}
 
-	eb->composite_fence = &fence_array->base;
+	eb->composite_fence = fence;
 
 	return out_fence;
 }
@@ -3213,7 +2795,7 @@ eb_fences_add(struct i915_execbuffer *eb, struct i915_request *rq,
 	}
 
 	if (eb->fences) {
-		err = await_fence_array(eb, rq);
+		err = i915_eb_await_fence_array(eb->fences, eb->num_fences, rq);
 		if (err)
 			return ERR_PTR(err);
 	}
@@ -3231,23 +2813,6 @@ eb_fences_add(struct i915_execbuffer *eb, struct i915_request *rq,
 	return out_fence;
 }
 
-static struct intel_context *
-eb_find_context(struct i915_execbuffer *eb, unsigned int context_number)
-{
-	struct intel_context *child;
-
-	if (likely(context_number == 0))
-		return eb->context;
-
-	for_each_child(eb->context, child)
-		if (!--context_number)
-			return child;
-
-	GEM_BUG_ON("Context not found");
-
-	return NULL;
-}
-
 static struct sync_file *
 eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence,
 		   int out_fence_fd)
@@ -3257,7 +2822,9 @@ eb_requests_create(struct i915_execbuffer *eb, struct dma_fence *in_fence,
 
 	for_each_batch_create_order(eb, i) {
 		/* Allocate a request for this batch buffer nice and early. */
-		eb->requests[i] = i915_request_create(eb_find_context(eb, i));
+		eb->requests[i] =
+			i915_request_create(i915_eb_find_context(eb->context,
+								 i));
 		if (IS_ERR(eb->requests[i])) {
 			out_fence = ERR_CAST(eb->requests[i]);
 			eb->requests[i] = NULL;
@@ -3437,13 +3004,15 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	err = eb_submit(&eb);
 
 err_request:
-	eb_requests_get(&eb);
-	err = eb_requests_add(&eb, err);
+	i915_eb_requests_get(eb.requests, eb.num_batches);
+	err = i915_eb_requests_add(eb.requests, eb.num_batches, eb.context,
+				   eb.gem_context->sched, err);
 
 	if (eb.fences)
-		signal_fence_array(&eb, eb.composite_fence ?
-				   eb.composite_fence :
-				   &eb.requests[0]->fence);
+		i915_eb_signal_fence_array(eb.fences, eb.num_fences,
+					   eb.composite_fence ?
+					   eb.composite_fence :
+					   &eb.requests[0]->fence);
 
 	if (out_fence) {
 		if (err == 0) {
@@ -3466,7 +3035,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (!out_fence && eb.composite_fence)
 		dma_fence_put(eb.composite_fence);
 
-	eb_requests_put(&eb);
+	i915_eb_requests_put(eb.requests, eb.num_batches);
 
 err_vma:
 	eb_release_vmas(&eb, true);
@@ -3487,7 +3056,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_in_fence:
 	dma_fence_put(in_fence);
 err_ext:
-	put_fence_array(eb.fences, eb.num_fences);
+	i915_eb_put_fence_array(eb.fences, eb.num_fences);
 	return err;
 }
 
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 12/20] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate or
bind, as all required object bindings would have been requested by
userspace before submitting the execbuf3.

Legacy features like relocations, etc., are not supported by execbuf3.
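
A minimal sketch of the intended userspace flow (illustrative only: it
assumes fd is an open render node, ctx_id names a context whose VM was
created with I915_VM_CREATE_FLAGS_USE_VM_BIND and which has a user engine
map, and batch_va was bound beforehand with DRM_IOCTL_I915_GEM_VM_BIND):

    #include <errno.h>
    #include <xf86drm.h>
    #include <drm/i915_drm.h>  /* must carry the execbuffer3 uapi from this series */

    static int submit_one_batch(int fd, __u32 ctx_id, __u64 batch_va)
    {
            struct drm_i915_gem_execbuffer3 eb3 = {
                    .ctx_id = ctx_id,
                    .engine_idx = 0,           /* index into the context's engine map */
                    .batch_address = batch_va, /* GPU VA, not an object handle */
                    /* flags, fence_count, timeline_fences, rsvd, extensions stay 0 */
            };

            if (drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER3, &eb3))
                    return -errno; /* e.g. EOPNOTSUPP if the VM is not in vm_bind mode */

            return 0;
    }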

v2: Add more input validity checks.
v3: batch_address is a VA (not an array) if num_batches=1,
    minor cleanup
v4: replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()
v5: Remove unwanted krealloc() and address other review comments.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 578 ++++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h    |   2 +
 drivers/gpu/drm/i915/i915_driver.c            |   1 +
 include/uapi/drm/i915_drm.h                   |  61 ++
 5 files changed, 643 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 35636c6bf856..0fbdbb571709 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -150,6 +150,7 @@ gem-y += \
 	gem/i915_gem_domain.o \
 	gem/i915_gem_execbuffer_common.o \
 	gem/i915_gem_execbuffer.o \
+	gem/i915_gem_execbuffer3.o \
 	gem/i915_gem_internal.o \
 	gem/i915_gem_object.o \
 	gem/i915_gem_lmem.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
new file mode 100644
index 000000000000..64251dc4cf91
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -0,0 +1,578 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include <linux/dma-resv.h>
+#include <linux/uaccess.h>
+
+#include <drm/drm_syncobj.h>
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+
+#include "i915_drv.h"
+#include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
+#include "i915_gem_ioctls.h"
+#include "i915_gem_vm_bind.h"
+#include "i915_trace.h"
+
+#define __EXEC3_ENGINE_PINNED		BIT_ULL(32)
+#define __EXEC3_INTERNAL_FLAGS		(~0ull << 32)
+
+/* Catch emission of unexpected errors for CI! */
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+#undef EINVAL
+#define EINVAL ({ \
+	DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
+	22; \
+})
+#endif
+
+/**
+ * DOC: User command execution in vm_bind mode
+ *
+ * A VM in VM_BIND mode will not support the older execbuf mode of binding.
+ * The execbuf ioctl handling in VM_BIND mode differs significantly from the
+ * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+ * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+ * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+ * execlist. Hence, no support for implicit sync.
+ *
+ * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+ * works with execbuf3 ioctl for submission.
+ *
+ * The execbuf3 ioctl directly specifies the batch addresses instead of
+ * object handles as in the execbuf2 ioctl. The execbuf3 ioctl will also not
+ * support many of the older features like in/out/submit fences, fence array,
+ * default gem context etc. (See struct drm_i915_gem_execbuffer3).
+ *
+ * In VM_BIND mode, VA allocation is completely managed by the user instead of
+ * the i915 driver. Hence, VA assignment and eviction are not applicable in
+ * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+ * be using the i915_vma active reference tracking. It will instead check the
+ * dma-resv object's fence list for that.
+ *
+ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
+ * vma lookup table, implicit sync, vma active reference tracking etc., are not
+ * applicable for execbuf3 ioctl.
+ */
+
+/**
+ * struct i915_execbuffer - execbuf struct for execbuf3
+ * @i915: reference to the i915 instance we run on
+ * @file: drm file reference
+ * @args: execbuf3 ioctl structure
+ * @gt: reference to the gt instance the ioctl is submitted for
+ * @context: logical state for the request
+ * @gem_context: caller's context
+ * @requests: requests to be built
+ * @composite_fence: used for excl fence in dma_resv objects when > 1 BB submitted
+ * @ww: i915_gem_ww_ctx instance
+ * @num_batches: number of batches submitted
+ * @batch_addresses: addresses corresponding to the submitted batches
+ * @batches: references to the i915_vmas corresponding to the batches
+ * @fences: array of execbuf fences (See struct eb_fence)
+ * @num_fences: number of fences in @fences array
+ */
+struct i915_execbuffer {
+	struct drm_i915_private *i915;
+	struct drm_file *file;
+	struct drm_i915_gem_execbuffer3 *args;
+
+	struct intel_gt *gt;
+	struct intel_context *context;
+	struct i915_gem_context *gem_context;
+
+	struct i915_request *requests[MAX_ENGINE_INSTANCE + 1];
+	struct dma_fence *composite_fence;
+
+	struct i915_gem_ww_ctx ww;
+
+	unsigned int num_batches;
+	u64 batch_addresses[MAX_ENGINE_INSTANCE + 1];
+	struct i915_vma *batches[MAX_ENGINE_INSTANCE + 1];
+
+	struct eb_fence *fences;
+	u64 num_fences;
+};
+
+static void eb_unpin_engine(struct i915_execbuffer *eb);
+
+static int eb_select_context(struct i915_execbuffer *eb)
+{
+	struct i915_gem_context *ctx;
+
+	ctx = i915_gem_context_lookup(eb->file->driver_priv, eb->args->ctx_id);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	if (!i915_gem_vm_is_vm_bind_mode(ctx->vm)) {
+		i915_gem_context_put(ctx);
+		return -EOPNOTSUPP;
+	}
+
+	eb->gem_context = ctx;
+	return 0;
+}
+
+static struct i915_vma *
+eb_find_vma(struct i915_address_space *vm, u64 addr)
+{
+	u64 va;
+
+	lockdep_assert_held(&vm->vm_bind_lock);
+
+	va = gen8_noncanonical_addr(addr & PIN_OFFSET_MASK);
+	return i915_gem_vm_bind_lookup_vma(vm, va);
+}
+
+static int eb_lookup_vma_all(struct i915_execbuffer *eb)
+{
+	struct i915_vma *vma;
+	unsigned int i;
+
+	for (i = 0; i < eb->num_batches; i++) {
+		vma = eb_find_vma(eb->context->vm, eb->batch_addresses[i]);
+		if (!vma)
+			return -EINVAL;
+
+		eb->batches[i] = vma;
+	}
+
+	return 0;
+}
+
+static void eb_release_vma_all(struct i915_execbuffer *eb)
+{
+	eb_unpin_engine(eb);
+}
+
+/*
+ * Using two helper loops for the order of which requests / batches are created
+ * and added to the backend. Requests are created in order from the parent to
+ * the last child. Requests are added in the reverse order, from the last child
+ * to parent. This is done for locking reasons as the timeline lock is acquired
+ * during request creation and released when the request is added to the
+ * backend. To make lockdep happy (see intel_context_timeline_lock) this must be
+ * the ordering.
+ */
+#define for_each_batch_create_order(_eb) \
+	for (unsigned int i = 0; i < (_eb)->num_batches; ++i)
+
+static int eb_move_to_gpu(struct i915_execbuffer *eb)
+{
+	/* Unconditionally flush any chipset caches (for streaming writes). */
+	intel_gt_chipset_flush(eb->gt);
+
+	return 0;
+}
+
+static int eb_request_submit(struct i915_execbuffer *eb,
+			     struct i915_request *rq,
+			     struct i915_vma *batch,
+			     u64 batch_len)
+{
+	struct intel_engine_cs *engine = rq->context->engine;
+	int err;
+
+	if (intel_context_nopreempt(rq->context))
+		__set_bit(I915_FENCE_FLAG_NOPREEMPT, &rq->fence.flags);
+
+	/*
+	 * After we completed waiting for other engines (using HW semaphores)
+	 * then we can signal that this request/batch is ready to run. This
+	 * allows us to determine if the batch is still waiting on the GPU
+	 * or actually running by checking the breadcrumb.
+	 */
+	if (engine->emit_init_breadcrumb) {
+		err = engine->emit_init_breadcrumb(rq);
+		if (err)
+			return err;
+	}
+
+	return engine->emit_bb_start(rq, batch->node.start, batch_len, 0);
+}
+
+static int eb_submit(struct i915_execbuffer *eb)
+{
+	int err;
+
+	err = eb_move_to_gpu(eb);
+
+	for_each_batch_create_order(eb) {
+		if (!eb->requests[i])
+			break;
+
+		trace_i915_request_queue(eb->requests[i], 0);
+		if (!err)
+			err = eb_request_submit(eb, eb->requests[i],
+						eb->batches[i],
+						eb->batches[i]->size);
+	}
+
+	return err;
+}
+
+static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle)
+{
+	int err;
+
+	GEM_BUG_ON(eb->args->flags & __EXEC3_ENGINE_PINNED);
+
+	err = i915_eb_pin_engine(eb->context, &eb->ww, throttle,
+				 eb->file->filp->f_flags & O_NONBLOCK);
+	if (err)
+		return err;
+
+	eb->args->flags |= __EXEC3_ENGINE_PINNED;
+	return 0;
+}
+
+static void eb_unpin_engine(struct i915_execbuffer *eb)
+{
+	if (!(eb->args->flags & __EXEC3_ENGINE_PINNED))
+		return;
+
+	eb->args->flags &= ~__EXEC3_ENGINE_PINNED;
+
+	i915_eb_unpin_engine(eb->context);
+}
+
+static int eb_select_engine(struct i915_execbuffer *eb)
+{
+	struct intel_context *ce;
+	unsigned int idx;
+	int err;
+
+	if (!i915_gem_context_user_engines(eb->gem_context))
+		return -EINVAL;
+
+	idx = eb->args->engine_idx;
+	ce = i915_gem_context_get_engine(eb->gem_context, idx);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
+	eb->num_batches = ce->parallel.number_children + 1;
+
+	err = i915_eb_select_engine(ce);
+	if (err)
+		goto err;
+
+	eb->context = ce;
+	eb->gt = ce->engine->gt;
+
+	/*
+	 * Make sure engine pool stays alive even if we call intel_context_put
+	 * during ww handling. The pool is destroyed when last pm reference
+	 * is dropped, which breaks our -EDEADLK handling.
+	 */
+	return 0;
+
+err:
+	intel_context_put(ce);
+	return err;
+}
+
+static void eb_put_engine(struct i915_execbuffer *eb)
+{
+	i915_eb_put_engine(eb->context);
+}
+
+static int add_timeline_fence_array(struct i915_execbuffer *eb)
+{
+	struct drm_i915_gem_timeline_fence __user *user_fences;
+	struct eb_fence *f;
+	u64 nfences;
+
+	nfences = eb->args->fence_count;
+	if (!nfences)
+		return 0;
+
+	/* Check multiplication overflow for access_ok() and kvmalloc_array() */
+	BUILD_BUG_ON(sizeof(size_t) > sizeof(unsigned long));
+	if (nfences > min_t(unsigned long,
+			    ULONG_MAX / sizeof(*user_fences),
+			    SIZE_MAX / sizeof(*f)))
+		return -EINVAL;
+
+	user_fences = u64_to_user_ptr(eb->args->timeline_fences);
+	if (!access_ok(user_fences, nfences * sizeof(*user_fences)))
+		return -EFAULT;
+
+	eb->fences = kcalloc(nfences, sizeof(*f), __GFP_NOWARN | GFP_KERNEL);
+	if (!eb->fences)
+		return -ENOMEM;
+
+	f = eb->fences;
+
+	BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) &
+		     ~__I915_TIMELINE_FENCE_UNKNOWN_FLAGS);
+
+	while (nfences--) {
+		struct drm_i915_gem_timeline_fence user_fence;
+		bool wait, signal;
+		int ret;
+
+		if (__copy_from_user(&user_fence,
+				     user_fences++,
+				     sizeof(user_fence)))
+			return -EFAULT;
+
+		if (user_fence.flags & __I915_TIMELINE_FENCE_UNKNOWN_FLAGS)
+			return -EINVAL;
+
+		wait = user_fence.flags & I915_TIMELINE_FENCE_WAIT;
+		signal = user_fence.flags & I915_TIMELINE_FENCE_SIGNAL;
+		ret = i915_eb_add_timeline_fence(eb->file, user_fence.handle,
+						 user_fence.value, f, wait,
+						 signal);
+		if (ret < 0)
+			return ret;
+		else if (!ret)
+			continue;
+
+		f++;
+		eb->num_fences++;
+	}
+
+	return 0;
+}
+
+static int parse_timeline_fences(struct i915_execbuffer *eb)
+{
+	return add_timeline_fence_array(eb);
+}
+
+static int parse_batch_addresses(struct i915_execbuffer *eb)
+{
+	struct drm_i915_gem_execbuffer3 *args = eb->args;
+
+	if (eb->num_batches == 1) {
+		eb->batch_addresses[0] = args->batch_address;
+	} else {
+		u64 __user *batch_addr = u64_to_user_ptr(args->batch_address);
+
+		if (copy_from_user(eb->batch_addresses, batch_addr,
+				   sizeof(batch_addr[0]) * eb->num_batches))
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int eb_composite_fence_create(struct i915_execbuffer *eb)
+{
+	struct dma_fence *fence;
+
+	fence = i915_eb_composite_fence_create(eb->requests, eb->num_batches,
+					       eb->context);
+	if (IS_ERR(fence))
+		return PTR_ERR(fence);
+
+	eb->composite_fence = fence;
+
+	return 0;
+}
+
+static int eb_fences_add(struct i915_execbuffer *eb, struct i915_request *rq)
+{
+	int err;
+
+	if (unlikely(eb->gem_context->syncobj)) {
+		struct dma_fence *fence;
+
+		fence = drm_syncobj_fence_get(eb->gem_context->syncobj);
+		err = i915_request_await_dma_fence(rq, fence);
+		dma_fence_put(fence);
+		if (err)
+			return err;
+	}
+
+	if (eb->fences) {
+		err = i915_eb_await_fence_array(eb->fences, eb->num_fences, rq);
+		if (err)
+			return err;
+	}
+
+	if (intel_context_is_parallel(eb->context)) {
+		err = eb_composite_fence_create(eb);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int eb_requests_create(struct i915_execbuffer *eb)
+{
+	int err;
+
+	for_each_batch_create_order(eb) {
+		/* Allocate a request for this batch buffer nice and early. */
+		eb->requests[i] =
+			i915_request_create(i915_eb_find_context(eb->context,
+								 i));
+		if (IS_ERR(eb->requests[i])) {
+			err = PTR_ERR(eb->requests[i]);
+			eb->requests[i] = NULL;
+			return err;
+		}
+
+		/*
+		 * Only the first request added (committed to backend) has to
+		 * take the in fences into account as all subsequent requests
+		 * will have fences inserted inbetween them.
+		 */
+		if (i + 1 == eb->num_batches) {
+			err = eb_fences_add(eb, eb->requests[i]);
+			if (err)
+				return err;
+		}
+
+		if (eb->batches[i])
+			eb->requests[i]->batch_res =
+				i915_vma_resource_get(eb->batches[i]->resource);
+	}
+
+	return 0;
+}
+
+static int
+i915_gem_do_execbuffer(struct drm_device *dev,
+		       struct drm_file *file,
+		       struct drm_i915_gem_execbuffer3 *args)
+{
+	struct drm_i915_private *i915 = to_i915(dev);
+	struct i915_execbuffer eb;
+	bool throttle = true;
+	int err;
+
+	BUILD_BUG_ON(__EXEC3_INTERNAL_FLAGS & ~__I915_EXEC3_UNKNOWN_FLAGS);
+
+	eb.i915 = i915;
+	eb.file = file;
+	eb.args = args;
+
+	eb.fences = NULL;
+	eb.num_fences = 0;
+
+	memset(eb.requests, 0, sizeof(struct i915_request *) *
+	       ARRAY_SIZE(eb.requests));
+	eb.composite_fence = NULL;
+
+	err = parse_timeline_fences(&eb);
+	if (err)
+		return err;
+
+	err = eb_select_context(&eb);
+	if (unlikely(err))
+		goto err_fences;
+
+	err = eb_select_engine(&eb);
+	if (unlikely(err))
+		goto err_context;
+
+	err = parse_batch_addresses(&eb);
+	if (unlikely(err))
+		goto err_engine;
+
+	mutex_lock(&eb.context->vm->vm_bind_lock);
+
+	err = eb_lookup_vma_all(&eb);
+	if (err) {
+		eb_release_vma_all(&eb);
+		goto err_vm_bind_lock;
+	}
+
+	i915_gem_ww_ctx_init(&eb.ww, true);
+
+retry_validate:
+	err = eb_pin_engine(&eb, throttle);
+	if (err)
+		goto err_validate;
+
+	/* only throttle once, even if we didn't need to throttle */
+	throttle = false;
+
+err_validate:
+	if (err == -EDEADLK) {
+		eb_release_vma_all(&eb);
+		err = i915_gem_ww_ctx_backoff(&eb.ww);
+		if (!err)
+			goto retry_validate;
+	}
+	if (err)
+		goto err_vma;
+
+	ww_acquire_done(&eb.ww.ctx);
+
+	err = eb_requests_create(&eb);
+	if (err) {
+		if (eb.requests[0])
+			goto err_request;
+		else
+			goto err_vma;
+	}
+
+	err = eb_submit(&eb);
+
+err_request:
+	i915_eb_requests_get(eb.requests, eb.num_batches);
+	err = i915_eb_requests_add(eb.requests, eb.num_batches, eb.context,
+				   eb.gem_context->sched, err);
+
+	if (eb.fences)
+		i915_eb_signal_fence_array(eb.fences, eb.num_fences,
+					   eb.composite_fence ?
+					   eb.composite_fence :
+					   &eb.requests[0]->fence);
+
+	if (unlikely(eb.gem_context->syncobj)) {
+		drm_syncobj_replace_fence(eb.gem_context->syncobj,
+					  eb.composite_fence ?
+					  eb.composite_fence :
+					  &eb.requests[0]->fence);
+	}
+
+	if (eb.composite_fence)
+		dma_fence_put(eb.composite_fence);
+
+	i915_eb_requests_put(eb.requests, eb.num_batches);
+
+err_vma:
+	eb_release_vma_all(&eb);
+	WARN_ON(err == -EDEADLK);
+	i915_gem_ww_ctx_fini(&eb.ww);
+err_vm_bind_lock:
+	mutex_unlock(&eb.context->vm->vm_bind_lock);
+err_engine:
+	eb_put_engine(&eb);
+err_context:
+	i915_gem_context_put(eb.gem_context);
+err_fences:
+	i915_eb_put_fence_array(eb.fences, eb.num_fences);
+	return err;
+}
+
+int
+i915_gem_execbuffer3_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file)
+{
+	struct drm_i915_gem_execbuffer3 *args = data;
+	int err;
+
+	/* Reserved fields must be 0 */
+	if (args->rsvd || args->extensions)
+		return -EINVAL;
+
+	if (args->flags & __I915_EXEC3_UNKNOWN_FLAGS)
+		return -EINVAL;
+
+	err = i915_gem_do_execbuffer(dev, file, args);
+
+	args->flags &= ~__I915_EXEC3_UNKNOWN_FLAGS;
+	return err;
+}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
index 28d6526e32ab..b7a1e9725a84 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
@@ -18,6 +18,8 @@ int i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
 			      struct drm_file *file);
 int i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
+int i915_gem_execbuffer3_ioctl(struct drm_device *dev, void *data,
+			       struct drm_file *file);
 int i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 				struct drm_file *file);
 int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
index cf41b96ac485..a8b69ee39cee 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -1854,6 +1854,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_GEM_INIT, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF_DRV(I915_GEM_EXECBUFFER, drm_invalid_op, DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(I915_GEM_EXECBUFFER2_WR, i915_gem_execbuffer2_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_EXECBUFFER3, i915_gem_execbuffer3_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_PIN, i915_gem_reject_pin_ioctl, DRM_AUTH|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF_DRV(I915_GEM_UNPIN, i915_gem_reject_pin_ioctl, DRM_AUTH|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF_DRV(I915_GEM_BUSY, i915_gem_busy_ioctl, DRM_RENDER_ALLOW),
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 9ac913606d40..59a94f515064 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -472,6 +472,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_GEM_CREATE_EXT		0x3c
 #define DRM_I915_GEM_VM_BIND		0x3d
 #define DRM_I915_GEM_VM_UNBIND		0x3e
+#define DRM_I915_GEM_EXECBUFFER3	0x3f
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -538,6 +539,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 #define DRM_IOCTL_I915_GEM_VM_BIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
 #define DRM_IOCTL_I915_GEM_VM_UNBIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
+#define DRM_IOCTL_I915_GEM_EXECBUFFER3	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -1568,6 +1570,65 @@ struct drm_i915_gem_timeline_fence {
 	__u64 value;
 };
 
+/**
+ * struct drm_i915_gem_execbuffer3 - Structure for DRM_I915_GEM_EXECBUFFER3
+ * ioctl.
+ *
+ * DRM_I915_GEM_EXECBUFFER3 ioctl only works in VM_BIND mode and VM_BIND mode
+ * only works with this ioctl for submission.
+ * See I915_VM_CREATE_FLAGS_USE_VM_BIND.
+ */
+struct drm_i915_gem_execbuffer3 {
+	/**
+	 * @ctx_id: Context id
+	 *
+	 * Only contexts with user engine map are allowed.
+	 */
+	__u32 ctx_id;
+
+	/**
+	 * @engine_idx: Engine index
+	 *
+	 * An index in the user engine map of the context specified by @ctx_id.
+	 */
+	__u32 engine_idx;
+
+	/**
+	 * @batch_address: Batch gpu virtual address/es.
+	 *
+	 * For normal submission, it is the gpu virtual address of the batch
+	 * buffer. For parallel submission, it is a pointer to an array of
+	 * batch buffer gpu virtual addresses with array size equal to the
+	 * number of (parallel) engines involved in that submission (See
+	 * struct i915_context_engines_parallel_submit).
+	 */
+	__u64 batch_address;
+
+	/** @flags: Currently reserved, MBZ */
+	__u64 flags;
+#define __I915_EXEC3_UNKNOWN_FLAGS (~0ull)
+
+	/** @fence_count: Number of fences in @timeline_fences array. */
+	__u64 fence_count;
+
+	/**
+	 * @timeline_fences: Pointer to an array of timeline fences.
+	 *
+	 * Timeline fences are of format struct drm_i915_gem_timeline_fence.
+	 */
+	__u64 timeline_fences;
+
+	/** @rsvd: Reserved, MBZ */
+	__u64 rsvd;
+
+	/**
+	 * @extensions: Zero-terminated chain of extensions.
+	 *
+	 * For future extensions. See struct i915_user_extension.
+	 */
+	__u64 extensions;
+};
+
 struct drm_i915_gem_pin {
 	/** Handle of the buffer to be pinned. */
 	__u32 handle;
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [Intel-gfx] [PATCH v6 12/20] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Implement new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only
works in vm_bind mode. The vm_bind mode only works with
this new execbuf3 ioctl.

The new execbuf3 ioctl will not have any list of objects to validate or
bind, as all required object bindings would have been requested by
userspace before submitting the execbuf3.

Legacy features like relocations, etc., are not supported by execbuf3.
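
A rough sketch of how the timeline fences are intended to be used with this
ioctl (illustrative only: fd, ctx_id and batch_va are set up as for any
other vm_bind submission, and the syncobj handles are assumed to come from
DRM_IOCTL_SYNCOBJ_CREATE):

    #include <errno.h>
    #include <stdint.h>
    #include <xf86drm.h>
    #include <drm/i915_drm.h>  /* must carry the execbuffer3 uapi from this series */

    static int submit_with_fences(int fd, __u32 ctx_id, __u64 batch_va,
                                  __u32 wait_handle, __u64 wait_point,
                                  __u32 signal_handle, __u64 signal_point)
    {
            struct drm_i915_gem_timeline_fence fences[] = {
                    { .handle = wait_handle,   .flags = I915_TIMELINE_FENCE_WAIT,
                      .value = wait_point },
                    { .handle = signal_handle, .flags = I915_TIMELINE_FENCE_SIGNAL,
                      .value = signal_point },
            };
            struct drm_i915_gem_execbuffer3 eb3 = {
                    .ctx_id = ctx_id,
                    .engine_idx = 0,
                    .batch_address = batch_va,
                    .fence_count = sizeof(fences) / sizeof(fences[0]),
                    .timeline_fences = (uintptr_t)fences,
            };

            if (drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER3, &eb3))
                    return -errno;

            return 0;
    }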

v2: Add more input validity checks.
v3: batch_address is a VA (not an array) if num_batches=1,
    minor cleanup
v4: replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()
v5: Remove unwanted krealloc() and address other review comments.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 578 ++++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_ioctls.h    |   2 +
 drivers/gpu/drm/i915/i915_driver.c            |   1 +
 include/uapi/drm/i915_drm.h                   |  61 ++
 5 files changed, 643 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 35636c6bf856..0fbdbb571709 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -150,6 +150,7 @@ gem-y += \
 	gem/i915_gem_domain.o \
 	gem/i915_gem_execbuffer_common.o \
 	gem/i915_gem_execbuffer.o \
+	gem/i915_gem_execbuffer3.o \
 	gem/i915_gem_internal.o \
 	gem/i915_gem_object.o \
 	gem/i915_gem_lmem.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
new file mode 100644
index 000000000000..64251dc4cf91
--- /dev/null
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -0,0 +1,578 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include <linux/dma-resv.h>
+#include <linux/uaccess.h>
+
+#include <drm/drm_syncobj.h>
+
+#include "gt/intel_context.h"
+#include "gt/intel_gpu_commands.h"
+#include "gt/intel_gt.h"
+
+#include "i915_drv.h"
+#include "i915_gem_context.h"
+#include "i915_gem_execbuffer_common.h"
+#include "i915_gem_ioctls.h"
+#include "i915_gem_vm_bind.h"
+#include "i915_trace.h"
+
+#define __EXEC3_ENGINE_PINNED		BIT_ULL(32)
+#define __EXEC3_INTERNAL_FLAGS		(~0ull << 32)
+
+/* Catch emission of unexpected errors for CI! */
+#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
+#undef EINVAL
+#define EINVAL ({ \
+	DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
+	22; \
+})
+#endif
+
+/**
+ * DOC: User command execution in vm_bind mode
+ *
+ * A VM in VM_BIND mode will not support the older execbuf mode of binding.
+ * The execbuf ioctl handling in VM_BIND mode differs significantly from the
+ * older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+ * Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+ * struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+ * execlist. Hence, no support for implicit sync.
+ *
+ * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+ * works with execbuf3 ioctl for submission.
+ *
+ * The execbuf3 ioctl directly specifies the batch addresses instead of
+ * object handles as in the execbuf2 ioctl. The execbuf3 ioctl will also not
+ * support many of the older features like in/out/submit fences, fence array,
+ * default gem context etc. (See struct drm_i915_gem_execbuffer3).
+ *
+ * In VM_BIND mode, VA allocation is completely managed by the user instead of
+ * the i915 driver. Hence, VA assignment and eviction are not applicable in
+ * VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+ * be using the i915_vma active reference tracking. It will instead check the
+ * dma-resv object's fence list for that.
+ *
+ * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
+ * vma lookup table, implicit sync, vma active reference tracking etc., are not
+ * applicable for execbuf3 ioctl.
+ */
+
+/**
+ * struct i915_execbuffer - execbuf struct for execbuf3
+ * @i915: reference to the i915 instance we run on
+ * @file: drm file reference
+ * @args: execbuf3 ioctl structure
+ * @gt: reference to the gt instance the ioctl is submitted for
+ * @context: logical state for the request
+ * @gem_context: caller's context
+ * @requests: requests to be built
+ * @composite_fence: used for excl fence in dma_resv objects when > 1 BB submitted
+ * @ww: i915_gem_ww_ctx instance
+ * @num_batches: number of batches submitted
+ * @batch_addresses: addresses corresponding to the submitted batches
+ * @batches: references to the i915_vmas corresponding to the batches
+ * @fences: array of execbuf fences (See struct eb_fence)
+ * @num_fences: number of fences in @fences array
+ */
+struct i915_execbuffer {
+	struct drm_i915_private *i915;
+	struct drm_file *file;
+	struct drm_i915_gem_execbuffer3 *args;
+
+	struct intel_gt *gt;
+	struct intel_context *context;
+	struct i915_gem_context *gem_context;
+
+	struct i915_request *requests[MAX_ENGINE_INSTANCE + 1];
+	struct dma_fence *composite_fence;
+
+	struct i915_gem_ww_ctx ww;
+
+	unsigned int num_batches;
+	u64 batch_addresses[MAX_ENGINE_INSTANCE + 1];
+	struct i915_vma *batches[MAX_ENGINE_INSTANCE + 1];
+
+	struct eb_fence *fences;
+	u64 num_fences;
+};
+
+static void eb_unpin_engine(struct i915_execbuffer *eb);
+
+static int eb_select_context(struct i915_execbuffer *eb)
+{
+	struct i915_gem_context *ctx;
+
+	ctx = i915_gem_context_lookup(eb->file->driver_priv, eb->args->ctx_id);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	if (!i915_gem_vm_is_vm_bind_mode(ctx->vm)) {
+		i915_gem_context_put(ctx);
+		return -EOPNOTSUPP;
+	}
+
+	eb->gem_context = ctx;
+	return 0;
+}
+
+static struct i915_vma *
+eb_find_vma(struct i915_address_space *vm, u64 addr)
+{
+	u64 va;
+
+	lockdep_assert_held(&vm->vm_bind_lock);
+
+	va = gen8_noncanonical_addr(addr & PIN_OFFSET_MASK);
+	return i915_gem_vm_bind_lookup_vma(vm, va);
+}
+
+static int eb_lookup_vma_all(struct i915_execbuffer *eb)
+{
+	struct i915_vma *vma;
+	unsigned int i;
+
+	for (i = 0; i < eb->num_batches; i++) {
+		vma = eb_find_vma(eb->context->vm, eb->batch_addresses[i]);
+		if (!vma)
+			return -EINVAL;
+
+		eb->batches[i] = vma;
+	}
+
+	return 0;
+}
+
+static void eb_release_vma_all(struct i915_execbuffer *eb)
+{
+	eb_unpin_engine(eb);
+}
+
+/*
+ * Using two helper loops for the order of which requests / batches are created
+ * and added to the backend. Requests are created in order from the parent to
+ * the last child. Requests are added in the reverse order, from the last child
+ * to parent. This is done for locking reasons as the timeline lock is acquired
+ * during request creation and released when the request is added to the
+ * backend. To make lockdep happy (see intel_context_timeline_lock) this must be
+ * the ordering.
+ */
+#define for_each_batch_create_order(_eb) \
+	for (unsigned int i = 0; i < (_eb)->num_batches; ++i)
+
+static int eb_move_to_gpu(struct i915_execbuffer *eb)
+{
+	/* Unconditionally flush any chipset caches (for streaming writes). */
+	intel_gt_chipset_flush(eb->gt);
+
+	return 0;
+}
+
+static int eb_request_submit(struct i915_execbuffer *eb,
+			     struct i915_request *rq,
+			     struct i915_vma *batch,
+			     u64 batch_len)
+{
+	struct intel_engine_cs *engine = rq->context->engine;
+	int err;
+
+	if (intel_context_nopreempt(rq->context))
+		__set_bit(I915_FENCE_FLAG_NOPREEMPT, &rq->fence.flags);
+
+	/*
+	 * After we completed waiting for other engines (using HW semaphores)
+	 * then we can signal that this request/batch is ready to run. This
+	 * allows us to determine if the batch is still waiting on the GPU
+	 * or actually running by checking the breadcrumb.
+	 */
+	if (engine->emit_init_breadcrumb) {
+		err = engine->emit_init_breadcrumb(rq);
+		if (err)
+			return err;
+	}
+
+	return engine->emit_bb_start(rq, batch->node.start, batch_len, 0);
+}
+
+static int eb_submit(struct i915_execbuffer *eb)
+{
+	int err;
+
+	err = eb_move_to_gpu(eb);
+
+	for_each_batch_create_order(eb) {
+		if (!eb->requests[i])
+			break;
+
+		trace_i915_request_queue(eb->requests[i], 0);
+		if (!err)
+			err = eb_request_submit(eb, eb->requests[i],
+						eb->batches[i],
+						eb->batches[i]->size);
+	}
+
+	return err;
+}
+
+static int eb_pin_engine(struct i915_execbuffer *eb, bool throttle)
+{
+	int err;
+
+	GEM_BUG_ON(eb->args->flags & __EXEC3_ENGINE_PINNED);
+
+	err = i915_eb_pin_engine(eb->context, &eb->ww, throttle,
+				 eb->file->filp->f_flags & O_NONBLOCK);
+	if (err)
+		return err;
+
+	eb->args->flags |= __EXEC3_ENGINE_PINNED;
+	return 0;
+}
+
+static void eb_unpin_engine(struct i915_execbuffer *eb)
+{
+	if (!(eb->args->flags & __EXEC3_ENGINE_PINNED))
+		return;
+
+	eb->args->flags &= ~__EXEC3_ENGINE_PINNED;
+
+	i915_eb_unpin_engine(eb->context);
+}
+
+static int eb_select_engine(struct i915_execbuffer *eb)
+{
+	struct intel_context *ce;
+	unsigned int idx;
+	int err;
+
+	if (!i915_gem_context_user_engines(eb->gem_context))
+		return -EINVAL;
+
+	idx = eb->args->engine_idx;
+	ce = i915_gem_context_get_engine(eb->gem_context, idx);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
+	eb->num_batches = ce->parallel.number_children + 1;
+
+	err = i915_eb_select_engine(ce);
+	if (err)
+		goto err;
+
+	eb->context = ce;
+	eb->gt = ce->engine->gt;
+
+	/*
+	 * Make sure engine pool stays alive even if we call intel_context_put
+	 * during ww handling. The pool is destroyed when last pm reference
+	 * is dropped, which breaks our -EDEADLK handling.
+	 */
+	return 0;
+
+err:
+	intel_context_put(ce);
+	return err;
+}
+
+static void eb_put_engine(struct i915_execbuffer *eb)
+{
+	i915_eb_put_engine(eb->context);
+}
+
+static int add_timeline_fence_array(struct i915_execbuffer *eb)
+{
+	struct drm_i915_gem_timeline_fence __user *user_fences;
+	struct eb_fence *f;
+	u64 nfences;
+
+	nfences = eb->args->fence_count;
+	if (!nfences)
+		return 0;
+
+	/* Check multiplication overflow for access_ok() and kvmalloc_array() */
+	BUILD_BUG_ON(sizeof(size_t) > sizeof(unsigned long));
+	if (nfences > min_t(unsigned long,
+			    ULONG_MAX / sizeof(*user_fences),
+			    SIZE_MAX / sizeof(*f)))
+		return -EINVAL;
+
+	user_fences = u64_to_user_ptr(eb->args->timeline_fences);
+	if (!access_ok(user_fences, nfences * sizeof(*user_fences)))
+		return -EFAULT;
+
+	eb->fences = kcalloc(nfences, sizeof(*f), __GFP_NOWARN | GFP_KERNEL);
+	if (!eb->fences)
+		return -ENOMEM;
+
+	f = eb->fences;
+
+	BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) &
+		     ~__I915_TIMELINE_FENCE_UNKNOWN_FLAGS);
+
+	while (nfences--) {
+		struct drm_i915_gem_timeline_fence user_fence;
+		bool wait, signal;
+		int ret;
+
+		if (__copy_from_user(&user_fence,
+				     user_fences++,
+				     sizeof(user_fence)))
+			return -EFAULT;
+
+		if (user_fence.flags & __I915_TIMELINE_FENCE_UNKNOWN_FLAGS)
+			return -EINVAL;
+
+		wait = user_fence.flags & I915_TIMELINE_FENCE_WAIT;
+		signal = user_fence.flags & I915_TIMELINE_FENCE_SIGNAL;
+		ret = i915_eb_add_timeline_fence(eb->file, user_fence.handle,
+						 user_fence.value, f, wait,
+						 signal);
+		if (ret < 0)
+			return ret;
+		else if (!ret)
+			continue;
+
+		f++;
+		eb->num_fences++;
+	}
+
+	return 0;
+}
+
+static int parse_timeline_fences(struct i915_execbuffer *eb)
+{
+	return add_timeline_fence_array(eb);
+}
+
+static int parse_batch_addresses(struct i915_execbuffer *eb)
+{
+	struct drm_i915_gem_execbuffer3 *args = eb->args;
+
+	if (eb->num_batches == 1) {
+		eb->batch_addresses[0] = args->batch_address;
+	} else {
+		u64 __user *batch_addr = u64_to_user_ptr(args->batch_address);
+
+		if (copy_from_user(eb->batch_addresses, batch_addr,
+				   sizeof(batch_addr[0]) * eb->num_batches))
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int eb_composite_fence_create(struct i915_execbuffer *eb)
+{
+	struct dma_fence *fence;
+
+	fence = i915_eb_composite_fence_create(eb->requests, eb->num_batches,
+					       eb->context);
+	if (IS_ERR(fence))
+		return PTR_ERR(fence);
+
+	eb->composite_fence = fence;
+
+	return 0;
+}
+
+static int eb_fences_add(struct i915_execbuffer *eb, struct i915_request *rq)
+{
+	int err;
+
+	if (unlikely(eb->gem_context->syncobj)) {
+		struct dma_fence *fence;
+
+		fence = drm_syncobj_fence_get(eb->gem_context->syncobj);
+		err = i915_request_await_dma_fence(rq, fence);
+		dma_fence_put(fence);
+		if (err)
+			return err;
+	}
+
+	if (eb->fences) {
+		err = i915_eb_await_fence_array(eb->fences, eb->num_fences, rq);
+		if (err)
+			return err;
+	}
+
+	if (intel_context_is_parallel(eb->context)) {
+		err = eb_composite_fence_create(eb);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int eb_requests_create(struct i915_execbuffer *eb)
+{
+	int err;
+
+	for_each_batch_create_order(eb) {
+		/* Allocate a request for this batch buffer nice and early. */
+		eb->requests[i] =
+			i915_request_create(i915_eb_find_context(eb->context,
+								 i));
+		if (IS_ERR(eb->requests[i])) {
+			err = PTR_ERR(eb->requests[i]);
+			eb->requests[i] = NULL;
+			return err;
+		}
+
+		/*
+		 * Only the first request added (committed to backend) has to
+		 * take the in fences into account as all subsequent requests
+		 * will have fences inserted inbetween them.
+		 */
+		if (i + 1 == eb->num_batches) {
+			err = eb_fences_add(eb, eb->requests[i]);
+			if (err)
+				return err;
+		}
+
+		if (eb->batches[i])
+			eb->requests[i]->batch_res =
+				i915_vma_resource_get(eb->batches[i]->resource);
+	}
+
+	return 0;
+}
+
+static int
+i915_gem_do_execbuffer(struct drm_device *dev,
+		       struct drm_file *file,
+		       struct drm_i915_gem_execbuffer3 *args)
+{
+	struct drm_i915_private *i915 = to_i915(dev);
+	struct i915_execbuffer eb;
+	bool throttle = true;
+	int err;
+
+	BUILD_BUG_ON(__EXEC3_INTERNAL_FLAGS & ~__I915_EXEC3_UNKNOWN_FLAGS);
+
+	eb.i915 = i915;
+	eb.file = file;
+	eb.args = args;
+
+	eb.fences = NULL;
+	eb.num_fences = 0;
+
+	memset(eb.requests, 0, sizeof(struct i915_request *) *
+	       ARRAY_SIZE(eb.requests));
+	eb.composite_fence = NULL;
+
+	err = parse_timeline_fences(&eb);
+	if (err)
+		return err;
+
+	err = eb_select_context(&eb);
+	if (unlikely(err))
+		goto err_fences;
+
+	err = eb_select_engine(&eb);
+	if (unlikely(err))
+		goto err_context;
+
+	err = parse_batch_addresses(&eb);
+	if (unlikely(err))
+		goto err_engine;
+
+	mutex_lock(&eb.context->vm->vm_bind_lock);
+
+	err = eb_lookup_vma_all(&eb);
+	if (err) {
+		eb_release_vma_all(&eb);
+		goto err_vm_bind_lock;
+	}
+
+	i915_gem_ww_ctx_init(&eb.ww, true);
+
+retry_validate:
+	err = eb_pin_engine(&eb, throttle);
+	if (err)
+		goto err_validate;
+
+	/* only throttle once, even if we didn't need to throttle */
+	throttle = false;
+
+err_validate:
+	if (err == -EDEADLK) {
+		eb_release_vma_all(&eb);
+		err = i915_gem_ww_ctx_backoff(&eb.ww);
+		if (!err)
+			goto retry_validate;
+	}
+	if (err)
+		goto err_vma;
+
+	ww_acquire_done(&eb.ww.ctx);
+
+	err = eb_requests_create(&eb);
+	if (err) {
+		if (eb.requests[0])
+			goto err_request;
+		else
+			goto err_vma;
+	}
+
+	err = eb_submit(&eb);
+
+err_request:
+	i915_eb_requests_get(eb.requests, eb.num_batches);
+	err = i915_eb_requests_add(eb.requests, eb.num_batches, eb.context,
+				   eb.gem_context->sched, err);
+
+	if (eb.fences)
+		i915_eb_signal_fence_array(eb.fences, eb.num_fences,
+					   eb.composite_fence ?
+					   eb.composite_fence :
+					   &eb.requests[0]->fence);
+
+	if (unlikely(eb.gem_context->syncobj)) {
+		drm_syncobj_replace_fence(eb.gem_context->syncobj,
+					  eb.composite_fence ?
+					  eb.composite_fence :
+					  &eb.requests[0]->fence);
+	}
+
+	if (eb.composite_fence)
+		dma_fence_put(eb.composite_fence);
+
+	i915_eb_requests_put(eb.requests, eb.num_batches);
+
+err_vma:
+	eb_release_vma_all(&eb);
+	WARN_ON(err == -EDEADLK);
+	i915_gem_ww_ctx_fini(&eb.ww);
+err_vm_bind_lock:
+	mutex_unlock(&eb.context->vm->vm_bind_lock);
+err_engine:
+	eb_put_engine(&eb);
+err_context:
+	i915_gem_context_put(eb.gem_context);
+err_fences:
+	i915_eb_put_fence_array(eb.fences, eb.num_fences);
+	return err;
+}
+
+int
+i915_gem_execbuffer3_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file)
+{
+	struct drm_i915_gem_execbuffer3 *args = data;
+	int err;
+
+	/* Reserved fields must be 0 */
+	if (args->rsvd || args->extensions)
+		return -EINVAL;
+
+	if (args->flags & __I915_EXEC3_UNKNOWN_FLAGS)
+		return -EINVAL;
+
+	err = i915_gem_do_execbuffer(dev, file, args);
+
+	args->flags &= ~__I915_EXEC3_UNKNOWN_FLAGS;
+	return err;
+}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
index 28d6526e32ab..b7a1e9725a84 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ioctls.h
@@ -18,6 +18,8 @@ int i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
 			      struct drm_file *file);
 int i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
+int i915_gem_execbuffer3_ioctl(struct drm_device *dev, void *data,
+			       struct drm_file *file);
 int i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 				struct drm_file *file);
 int i915_gem_get_caching_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
index cf41b96ac485..a8b69ee39cee 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -1854,6 +1854,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_GEM_INIT, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF_DRV(I915_GEM_EXECBUFFER, drm_invalid_op, DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(I915_GEM_EXECBUFFER2_WR, i915_gem_execbuffer2_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_EXECBUFFER3, i915_gem_execbuffer3_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_PIN, i915_gem_reject_pin_ioctl, DRM_AUTH|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF_DRV(I915_GEM_UNPIN, i915_gem_reject_pin_ioctl, DRM_AUTH|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF_DRV(I915_GEM_BUSY, i915_gem_busy_ioctl, DRM_RENDER_ALLOW),
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 9ac913606d40..59a94f515064 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -472,6 +472,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_GEM_CREATE_EXT		0x3c
 #define DRM_I915_GEM_VM_BIND		0x3d
 #define DRM_I915_GEM_VM_UNBIND		0x3e
+#define DRM_I915_GEM_EXECBUFFER3	0x3f
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -538,6 +539,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 #define DRM_IOCTL_I915_GEM_VM_BIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
 #define DRM_IOCTL_I915_GEM_VM_UNBIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
+#define DRM_IOCTL_I915_GEM_EXECBUFFER3	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -1568,6 +1570,65 @@ struct drm_i915_gem_timeline_fence {
 	__u64 value;
 };
 
+/**
+ * struct drm_i915_gem_execbuffer3 - Structure for DRM_I915_GEM_EXECBUFFER3
+ * ioctl.
+ *
+ * DRM_I915_GEM_EXECBUFFER3 ioctl only works in VM_BIND mode and VM_BIND mode
+ * only works with this ioctl for submission.
+ * See I915_VM_CREATE_FLAGS_USE_VM_BIND.
+ */
+struct drm_i915_gem_execbuffer3 {
+	/**
+	 * @ctx_id: Context id
+	 *
+	 * Only contexts with a user engine map are allowed.
+	 */
+	__u32 ctx_id;
+
+	/**
+	 * @engine_idx: Engine index
+	 *
+	 * An index in the user engine map of the context specified by @ctx_id.
+	 */
+	__u32 engine_idx;
+
+	/**
+	 * @batch_address: Batch GPU virtual address(es).
+	 *
+	 * For a normal submission, it is the GPU virtual address of the batch
+	 * buffer. For parallel submission, it is a pointer to an array of
+	 * batch buffer GPU virtual addresses, with the array size equal to
+	 * the number of (parallel) engines involved in that submission (see
+	 * struct i915_context_engines_parallel_submit).
+	 */
+	__u64 batch_address;
+
+	/** @flags: Currently reserved, MBZ */
+	__u64 flags;
+#define __I915_EXEC3_UNKNOWN_FLAGS (~0ull)
+
+	/** @fence_count: Number of fences in @timeline_fences array. */
+	__u64 fence_count;
+
+	/**
+	 * @timeline_fences: Pointer to an array of timeline fences.
+	 *
+	 * Timeline fences are of format struct drm_i915_gem_timeline_fence.
+	 */
+	__u64 timeline_fences;
+
+	/** @rsvd: Reserved, MBZ */
+	__u64 rsvd;
+
+	/**
+	 * @extensions: Zero-terminated chain of extensions.
+	 *
+	 * For future extensions. See struct i915_user_extension.
+	 */
+	__u64 extensions;
+};
+
 struct drm_i915_gem_pin {
 	/** Handle of the buffer to be pinned. */
 	__u32 handle;
-- 
2.21.0.rc0.32.g243a4c7e27
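
As a concrete illustration of the uapi added above, the following is a
minimal userspace sketch that submits a single batch through the new ioctl.
It is an assumption-laden example, not part of the patch: fd, ctx_id and
batch_va are presumed to exist (an open DRM fd, a context with a user engine
map, and a batch buffer already mapped via VM_BIND), the helper name is made
up, and error handling is elided.

/* Hypothetical helper; only uses fields defined by this patch. */
#include <string.h>
#include <sys/ioctl.h>
#include "i915_drm.h"	/* uapi header carrying these definitions */

static int submit_batch(int fd, __u32 ctx_id, __u64 batch_va)
{
	struct drm_i915_gem_execbuffer3 execbuf;

	memset(&execbuf, 0, sizeof(execbuf));
	execbuf.ctx_id = ctx_id;	  /* context with a user engine map */
	execbuf.engine_idx = 0;		  /* index into that engine map */
	execbuf.batch_address = batch_va; /* GPU VA bound via VM_BIND */
	/* flags, fence_count, timeline_fences, rsvd, extensions stay 0 */

	return ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER3, &execbuf);
}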


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 13/20] drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

Ensure i915_vma_verify_bind_complete() handles the case where the bind
is not yet initiated. Also make it non-static, add documentation
and move it out of CONFIG_DRM_I915_DEBUG_GEM.

v2: Fix fence leak

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_vma.c | 22 ++++++++++++++++------
 drivers/gpu/drm/i915/i915_vma.h |  1 +
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index eaa13e9ba966..aa4705246993 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -439,12 +439,25 @@ int i915_vma_sync(struct i915_vma *vma)
 	return i915_vm_sync(vma->vm);
 }
 
-#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
-static int i915_vma_verify_bind_complete(struct i915_vma *vma)
+/**
+ * i915_vma_verify_bind_complete() - Check for the bind completion of the vma
+ * @vma: vma to check for bind completion
+ *
+ * As the fence reference is obtained under RCU, no locking is required by
+ * the caller.
+ *
+ * Returns: 0 if the vma bind is completed. Error code otherwise.
+ */
+int i915_vma_verify_bind_complete(struct i915_vma *vma)
 {
-	struct dma_fence *fence = i915_active_fence_get(&vma->active.excl);
+	struct dma_fence *fence;
 	int err;
 
+	/* Ensure vma bind is initiated */
+	if (!i915_vma_is_bound(vma, I915_VMA_BIND_MASK))
+		return -EINVAL;
+
+	fence = i915_active_fence_get(&vma->active.excl);
 	if (!fence)
 		return 0;
 
@@ -457,9 +470,6 @@ static int i915_vma_verify_bind_complete(struct i915_vma *vma)
 
 	return err;
 }
-#else
-#define i915_vma_verify_bind_complete(_vma) 0
-#endif
 
 I915_SELFTEST_EXPORT void
 i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 1cadbf8fdedf..04770f8ba815 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -440,6 +440,7 @@ void i915_vma_make_purgeable(struct i915_vma *vma);
 
 int i915_vma_wait_for_bind(struct i915_vma *vma);
 int i915_vma_sync(struct i915_vma *vma);
+int i915_vma_verify_bind_complete(struct i915_vma *vma);
 
 /**
  * i915_vma_get_current_resource - Get the current resource of the vma
-- 
2.21.0.rc0.32.g243a4c7e27
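
To make the return convention above concrete, a caller treats a zero return
as "bind completed" and anything else as "not yet initiated, still pending,
or failed". A minimal kernel-side sketch of the intended use (the list
handling mirrors the execbuf3 patches later in this series; the helper name
is illustrative only):

/* Sketch: retire a vma to the bound list once its binding has completed. */
static void example_track_bound(struct i915_address_space *vm,
				struct i915_vma *vma)
{
	/* 0 means the bind fence has signalled without error */
	if (!i915_vma_verify_bind_complete(vma))
		list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list);
}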


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [Intel-gfx] [PATCH v6 14/20] drm/i915/vm_bind: Expose i915_request_await_bind()
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Rename __i915_request_await_bind() to i915_request_await_bind()
and expose it in i915_vma.h, as it will be used in the execbuf3 ioctl path.

v2: add documentation

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/i915_vma.c |  8 +-------
 drivers/gpu/drm/i915/i915_vma.h | 16 ++++++++++++++++
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index aa4705246993..f73955aef16a 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1888,18 +1888,12 @@ void i915_vma_revoke_mmap(struct i915_vma *vma)
 		list_del(&vma->obj->userfault_link);
 }
 
-static int
-__i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma)
-{
-	return __i915_request_await_exclusive(rq, &vma->active);
-}
-
 static int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request *rq)
 {
 	int err;
 
 	/* Wait for the vma to be bound before we start! */
-	err = __i915_request_await_bind(rq, vma);
+	err = i915_request_await_bind(rq, vma);
 	if (err)
 		return err;
 
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 04770f8ba815..737ef310d046 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -54,6 +54,22 @@ void i915_vma_unpin_and_release(struct i915_vma **p_vma, unsigned int flags);
 /* do not reserve memory to prevent deadlocks */
 #define __EXEC_OBJECT_NO_RESERVE BIT(31)
 
+/**
+ * i915_request_await_bind() - Set up a request to wait for a vma bind completion
+ * @rq: the request which should wait
+ * @vma: vma whose binding @rq should wait to complete
+ *
+ * Set up the request @rq to asynchronously wait for the @vma bind to complete
+ * before starting execution.
+ *
+ * Returns 0 on success, error code on failure.
+ */
+static inline int
+i915_request_await_bind(struct i915_request *rq, struct i915_vma *vma)
+{
+	return __i915_request_await_exclusive(rq, &vma->active);
+}
+
 int __must_check _i915_vma_move_to_active(struct i915_vma *vma,
 					  struct i915_request *rq,
 					  struct dma_fence *fence,
-- 
2.21.0.rc0.32.g243a4c7e27
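
A hedged sketch of how a submission path is expected to use the newly
exposed helper: before a request that samples the mappings is allowed to
execute, it awaits the bind fence of every vma it depends on. The list and
member names below follow the execbuf3 patches later in this series; the
wrapper itself is illustrative only.

/* Sketch: make request @rq wait for the bindings of a list of vmas. */
static int example_await_bindings(struct i915_request *rq,
				  struct list_head *bind_list)
{
	struct i915_vma *vma;
	int err;

	list_for_each_entry(vma, bind_list, vm_bind_link) {
		err = i915_request_await_bind(rq, vma);
		if (err)
			return err;
	}

	return 0;
}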


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 15/20] drm/i915/vm_bind: Handle persistent vmas in execbuf3
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

Handle persistent (VM_BIND) mappings during the request submission
in the execbuf3 path.

v2: Ensure requests wait for bindings to complete.
v3: Remove short term pinning with PIN_VALIDATE flag.
    Individualize fences before adding to dma_resv obj.
v4: Fix bind completion check, use PIN_NOEVICT,
    use proper lock while checking if vm_rebind_list is empty.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 215 +++++++++++++++++-
 1 file changed, 214 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index 64251dc4cf91..d91c2e96cd0f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -3,6 +3,7 @@
  * Copyright © 2022 Intel Corporation
  */
 
+#include <linux/dma-fence-array.h>
 #include <linux/dma-resv.h>
 #include <linux/uaccess.h>
 
@@ -19,6 +20,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
 
+#define __EXEC3_HAS_PIN			BIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED		BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS		(~0ull << 32)
 
@@ -42,7 +44,9 @@
  * execlist. Hence, no support for implicit sync.
  *
  * The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
- * works with execbuf3 ioctl for submission.
+ * works with execbuf3 ioctl for submission. All BOs mapped on that VM (through
+ * VM_BIND call) at the time of execbuf3 call are deemed required for that
+ * submission.
  *
  * The execbuf3 ioctl directly specifies the batch addresses instead of as
  * object handles as in execbuf2 ioctl. The execbuf3 ioctl will also not
@@ -58,6 +62,13 @@
  * So, a lot of code supporting execbuf2 ioctl, like relocations, VA evictions,
  * vma lookup table, implicit sync, vma active reference tracking etc., are not
  * applicable for execbuf3 ioctl.
+ *
+ * During each execbuf submission, the request fence is added to all VM_BIND
+ * mapped objects with DMA_RESV_USAGE_BOOKKEEP. The DMA_RESV_USAGE_BOOKKEEP
+ * usage prevents over-synchronization (see enum dma_resv_usage). Note that the
+ * DRM_I915_GEM_WAIT and DRM_I915_GEM_BUSY ioctls do not check for the
+ * DMA_RESV_USAGE_BOOKKEEP usage and hence should not be used for an end of
+ * batch check. Instead, the execbuf3 timeline out fence should be used.
  */
 
 /**
@@ -129,6 +140,23 @@ eb_find_vma(struct i915_address_space *vm, u64 addr)
 	return i915_gem_vm_bind_lookup_vma(vm, va);
 }
 
+static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
+{
+	struct i915_vma *vma, *vn;
+
+	/**
+	 * Move all unbound vmas back into vm_bind_list so that they are
+	 * revalidated.
+	 */
+	spin_lock(&vm->vm_rebind_lock);
+	list_for_each_entry_safe(vma, vn, &vm->vm_rebind_list, vm_rebind_link) {
+		list_del_init(&vma->vm_rebind_link);
+		if (!list_empty(&vma->vm_bind_link))
+			list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+	}
+	spin_unlock(&vm->vm_rebind_lock);
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
 	struct i915_vma *vma;
@@ -142,14 +170,108 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 		eb->batches[i] = vma;
 	}
 
+	eb_scoop_unbound_vma_all(eb->context->vm);
+
+	return 0;
+}
+
+static int eb_lock_vma_all(struct i915_execbuffer *eb)
+{
+	struct i915_address_space *vm = eb->context->vm;
+	struct i915_vma *vma;
+	int err;
+
+	err = i915_gem_object_lock(eb->context->vm->root_obj, &eb->ww);
+	if (err)
+		return err;
+
+	list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+			    non_priv_vm_bind_link) {
+		err = i915_gem_object_lock(vma->obj, &eb->ww);
+		if (err)
+			return err;
+	}
+
 	return 0;
 }
 
+static void eb_release_persistent_vma_all(struct i915_execbuffer *eb)
+{
+	struct i915_address_space *vm = eb->context->vm;
+	struct i915_vma *vma, *vn;
+
+	lockdep_assert_held(&vm->vm_bind_lock);
+
+	if (!(eb->args->flags & __EXEC3_HAS_PIN))
+		return;
+
+	assert_object_held(vm->root_obj);
+
+	list_for_each_entry_safe(vma, vn, &vm->vm_bind_list, vm_bind_link)
+		if (!i915_vma_verify_bind_complete(vma))
+			list_move_tail(&vma->vm_bind_link, &vm->vm_bound_list);
+
+	eb->args->flags &= ~__EXEC3_HAS_PIN;
+}
+
 static void eb_release_vma_all(struct i915_execbuffer *eb)
 {
+	eb_release_persistent_vma_all(eb);
 	eb_unpin_engine(eb);
 }
 
+static int eb_reserve_fence_for_persistent_vma_all(struct i915_execbuffer *eb)
+{
+	struct i915_address_space *vm = eb->context->vm;
+	u64 num_fences = 1;
+	struct i915_vma *vma;
+	int ret;
+
+	/* Reserve enough slots to accommodate composite fences */
+	if (intel_context_is_parallel(eb->context))
+		num_fences = eb->num_batches;
+
+	ret = dma_resv_reserve_fences(vm->root_obj->base.resv, num_fences);
+	if (ret)
+		return ret;
+
+	list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+			    non_priv_vm_bind_link) {
+		ret = dma_resv_reserve_fences(vma->obj->base.resv, num_fences);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int eb_validate_persistent_vma_all(struct i915_execbuffer *eb)
+{
+	struct i915_address_space *vm = eb->context->vm;
+	struct i915_vma *vma;
+	int ret = 0;
+
+	lockdep_assert_held(&vm->vm_bind_lock);
+	assert_object_held(vm->root_obj);
+
+	ret = eb_reserve_fence_for_persistent_vma_all(eb);
+	if (ret)
+		return ret;
+
+	list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) {
+		u64 pin_flags = vma->start | PIN_OFFSET_FIXED | PIN_USER |
+				PIN_VALIDATE | PIN_NOEVICT;
+
+		ret = i915_vma_pin_ww(vma, &eb->ww, 0, 0, pin_flags);
+		if (ret)
+			break;
+
+		eb->args->flags |= __EXEC3_HAS_PIN;
+	}
+
+	return ret;
+}
+
 /*
  * Using two helper loops for the order of which requests / batches are created
  * and added the to backend. Requests are created in order from the parent to
@@ -161,13 +283,80 @@ static void eb_release_vma_all(struct i915_execbuffer *eb)
  */
 #define for_each_batch_create_order(_eb) \
 	for (unsigned int i = 0; i < (_eb)->num_batches; ++i)
+#define for_each_batch_add_order(_eb) \
+	for (int i = (_eb)->num_batches - 1; i >= 0; --i)
+
+static void __eb_persistent_add_shared_fence(struct drm_i915_gem_object *obj,
+					     struct dma_fence *fence)
+{
+	struct dma_fence *curr;
+	int idx;
+
+	dma_fence_array_for_each(curr, idx, fence)
+		dma_resv_add_fence(obj->base.resv, curr,
+				   DMA_RESV_USAGE_BOOKKEEP);
+
+	obj->write_domain = 0;
+	obj->read_domains |= I915_GEM_GPU_DOMAINS;
+	obj->mm.dirty = true;
+}
+
+static void eb_persistent_add_shared_fence(struct i915_execbuffer *eb)
+{
+	struct i915_address_space *vm = eb->context->vm;
+	struct dma_fence *fence;
+	struct i915_vma *vma;
+
+	fence = eb->composite_fence ? eb->composite_fence :
+		&eb->requests[0]->fence;
+
+	__eb_persistent_add_shared_fence(vm->root_obj, fence);
+	list_for_each_entry(vma, &vm->non_priv_vm_bind_list,
+			    non_priv_vm_bind_link)
+		__eb_persistent_add_shared_fence(vma->obj, fence);
+}
+
+static void eb_move_all_persistent_vma_to_active(struct i915_execbuffer *eb)
+{
+	/* Add fence to BOs dma-resv fence list */
+	eb_persistent_add_shared_fence(eb);
+}
 
 static int eb_move_to_gpu(struct i915_execbuffer *eb)
 {
+	struct i915_address_space *vm = eb->context->vm;
+	struct i915_vma *vma;
+	int err = 0;
+
+	lockdep_assert_held(&vm->vm_bind_lock);
+	assert_object_held(vm->root_obj);
+
+	eb_move_all_persistent_vma_to_active(eb);
+
+	list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) {
+		for_each_batch_add_order(eb) {
+			if (!eb->requests[i])
+				continue;
+
+			err = i915_request_await_bind(eb->requests[i], vma);
+			if (err)
+				goto err_skip;
+		}
+	}
+
 	/* Unconditionally flush any chipset caches (for streaming writes). */
 	intel_gt_chipset_flush(eb->gt);
 
 	return 0;
+
+err_skip:
+	for_each_batch_create_order(eb) {
+		if (!eb->requests[i])
+			break;
+
+		i915_request_set_error_once(eb->requests[i], err);
+	}
+	return err;
 }
 
 static int eb_request_submit(struct i915_execbuffer *eb,
@@ -481,6 +670,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 
 	mutex_lock(&eb.context->vm->vm_bind_lock);
 
+lookup_vmas:
 	err = eb_lookup_vma_all(&eb);
 	if (err) {
 		eb_release_vma_all(&eb);
@@ -497,6 +687,29 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	/* only throttle once, even if we didn't need to throttle */
 	throttle = false;
 
+	err = eb_lock_vma_all(&eb);
+	if (err)
+		goto err_validate;
+
+	/**
+	 * No object unbinds possible once the objects are locked. So,
+	 * check for any unbinds here, which needs to be scooped up.
+	 *
+	 * XXX: Probably vm_rebind_list can be scooped in the validation
+	 * phase instead of lookup phase, after holding object locks.
+	 * Then this check won't be needed.
+	 */
+	spin_lock(&eb.context->vm->vm_rebind_lock);
+	if (!list_empty(&eb.context->vm->vm_rebind_list)) {
+		spin_unlock(&eb.context->vm->vm_rebind_lock);
+		eb_release_vma_all(&eb);
+		i915_gem_ww_ctx_fini(&eb.ww);
+		goto lookup_vmas;
+	}
+	spin_unlock(&eb.context->vm->vm_rebind_lock);
+
+	err = eb_validate_persistent_vma_all(&eb);
+
 err_validate:
 	if (err == -EDEADLK) {
 		eb_release_vma_all(&eb);
-- 
2.21.0.rc0.32.g243a4c7e27
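
Because the DOC text above rules out DRM_I915_GEM_WAIT/DRM_I915_GEM_BUSY for
completion checks in vm_bind mode, the expected pattern is to hand execbuf3 a
timeline syncobj as a signal fence and wait on that. A rough userspace sketch
follows; it assumes fd, ctx_id and batch_va already exist, that struct
drm_i915_gem_timeline_fence carries the handle/flags/value layout used
elsewhere in this series, and it uses the stock libdrm syncobj wrappers
(drmSyncobjCreate()/drmSyncobjTimelineWait()), which are not part of these
patches.

/* Sketch: submit a batch and wait for it via the execbuf3 timeline out fence. */
#include <stdint.h>
#include <string.h>
#include <xf86drm.h>
#include "i915_drm.h"

static int submit_and_wait(int fd, uint32_t ctx_id, uint64_t batch_va)
{
	struct drm_i915_gem_timeline_fence out_fence;
	struct drm_i915_gem_execbuffer3 execbuf;
	uint64_t point = 1;
	uint32_t syncobj;
	int err;

	err = drmSyncobjCreate(fd, 0, &syncobj);
	if (err)
		return err;

	memset(&out_fence, 0, sizeof(out_fence));
	out_fence.handle = syncobj;
	out_fence.flags = I915_TIMELINE_FENCE_SIGNAL;	/* out fence */
	out_fence.value = point;

	memset(&execbuf, 0, sizeof(execbuf));
	execbuf.ctx_id = ctx_id;
	execbuf.batch_address = batch_va;
	execbuf.fence_count = 1;
	execbuf.timeline_fences = (uintptr_t)&out_fence;

	err = drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER3, &execbuf);
	if (err)
		return err;

	/* End-of-batch check: wait on the out fence, not GEM_WAIT/GEM_BUSY. */
	return drmSyncobjTimelineWait(fd, &syncobj, &point, 1,
				      INT64_MAX, 0, NULL);
}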


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 16/20] drm/i915/vm_bind: userptr dma-resv changes
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

For persistent (vm_bind) vmas of userptr BOs, handle the user page
pinning by using the i915_gem_object_userptr_submit_init() and
i915_gem_object_userptr_submit_done() functions.

v2: Do not double add vma to vm->userptr_invalidated_list
v3: Initialize vma->userptr_invalidated_link

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 84 ++++++++++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   | 19 +++++
 .../drm/i915/gem/i915_gem_vm_bind_object.c    | 15 ++++
 drivers/gpu/drm/i915/gt/intel_gtt.c           |  2 +
 drivers/gpu/drm/i915/gt/intel_gtt.h           |  4 +
 drivers/gpu/drm/i915/i915_vma.c               |  1 +
 drivers/gpu/drm/i915/i915_vma_types.h         |  2 +
 7 files changed, 125 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
index d91c2e96cd0f..895d4f0a2647 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
@@ -20,6 +20,7 @@
 #include "i915_gem_vm_bind.h"
 #include "i915_trace.h"
 
+#define __EXEC3_USERPTR_USED		BIT_ULL(34)
 #define __EXEC3_HAS_PIN			BIT_ULL(33)
 #define __EXEC3_ENGINE_PINNED		BIT_ULL(32)
 #define __EXEC3_INTERNAL_FLAGS		(~0ull << 32)
@@ -144,7 +145,22 @@ static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
 {
 	struct i915_vma *vma, *vn;
 
-	/**
+#ifdef CONFIG_MMU_NOTIFIER
+	/*
+	 * Move all invalidated userptr vmas back into vm_bind_list so that
+	 * they are looked up and revalidated.
+	 */
+	spin_lock(&vm->userptr_invalidated_lock);
+	list_for_each_entry_safe(vma, vn, &vm->userptr_invalidated_list,
+				 userptr_invalidated_link) {
+		list_del_init(&vma->userptr_invalidated_link);
+		if (!list_empty(&vma->vm_bind_link))
+			list_move_tail(&vma->vm_bind_link, &vm->vm_bind_list);
+	}
+	spin_unlock(&vm->userptr_invalidated_lock);
+#endif
+
+	/*
 	 * Move all unbound vmas back into vm_bind_list so that they are
 	 * revalidated.
 	 */
@@ -157,10 +173,47 @@ static void eb_scoop_unbound_vma_all(struct i915_address_space *vm)
 	spin_unlock(&vm->vm_rebind_lock);
 }
 
+static int eb_lookup_persistent_userptr_vmas(struct i915_execbuffer *eb)
+{
+	struct i915_address_space *vm = eb->context->vm;
+	struct i915_vma *last_vma = NULL;
+	struct i915_vma *vma;
+	int err;
+
+	lockdep_assert_held(&vm->vm_bind_lock);
+
+	list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) {
+		if (!i915_gem_object_is_userptr(vma->obj))
+			continue;
+
+		err = i915_gem_object_userptr_submit_init(vma->obj);
+		if (err)
+			return err;
+
+		/*
+		 * The above submit_init() call does the object unbind and
+		 * hence adds vma into vm_rebind_list. Remove it from that
+		 * list as it is already scooped for revalidation.
+		 */
+		spin_lock(&vm->vm_rebind_lock);
+		if (!list_empty(&vma->vm_rebind_link))
+			list_del_init(&vma->vm_rebind_link);
+		spin_unlock(&vm->vm_rebind_lock);
+
+		last_vma = vma;
+	}
+
+	if (last_vma)
+		eb->args->flags |= __EXEC3_USERPTR_USED;
+
+	return 0;
+}
+
 static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 {
 	struct i915_vma *vma;
 	unsigned int i;
+	int err = 0;
 
 	for (i = 0; i < eb->num_batches; i++) {
 		vma = eb_find_vma(eb->context->vm, eb->batch_addresses[i]);
@@ -172,6 +225,10 @@ static int eb_lookup_vma_all(struct i915_execbuffer *eb)
 
 	eb_scoop_unbound_vma_all(eb->context->vm);
 
+	err = eb_lookup_persistent_userptr_vmas(eb);
+	if (err)
+		return err;
+
 	return 0;
 }
 
@@ -344,6 +401,29 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 		}
 	}
 
+#ifdef CONFIG_MMU_NOTIFIER
+	/* Check for further userptr invalidations */
+	spin_lock(&vm->userptr_invalidated_lock);
+	if (!list_empty(&vm->userptr_invalidated_list))
+		err = -EAGAIN;
+	spin_unlock(&vm->userptr_invalidated_lock);
+
+	if (!err && (eb->args->flags & __EXEC3_USERPTR_USED)) {
+		read_lock(&eb->i915->mm.notifier_lock);
+		list_for_each_entry(vma, &vm->vm_bind_list, vm_bind_link) {
+			if (!i915_gem_object_is_userptr(vma->obj))
+				continue;
+
+			err = i915_gem_object_userptr_submit_done(vma->obj);
+			if (err)
+				break;
+		}
+		read_unlock(&eb->i915->mm.notifier_lock);
+	}
+#endif
+	if (unlikely(err))
+		goto err_skip;
+
 	/* Unconditionally flush any chipset caches (for streaming writes). */
 	intel_gt_chipset_flush(eb->gt);
 
@@ -691,7 +771,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (err)
 		goto err_validate;
 
-	/**
+	/*
 	 * No object unbinds possible once the objects are locked. So,
 	 * check for any unbinds here, which needs to be scooped up.
 	 *
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index ca7a388ba2bf..38be6ce82600 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -63,6 +63,7 @@ static bool i915_gem_userptr_invalidate(struct mmu_interval_notifier *mni,
 {
 	struct drm_i915_gem_object *obj = container_of(mni, struct drm_i915_gem_object, userptr.notifier);
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct i915_vma *vma;
 	long r;
 
 	if (!mmu_notifier_range_blockable(range))
@@ -85,6 +86,24 @@ static bool i915_gem_userptr_invalidate(struct mmu_interval_notifier *mni,
 	if (current->flags & PF_EXITING)
 		return true;
 
+	/**
+	 * Add persistent vmas into userptr_invalidated list for relookup
+	 * and revalidation.
+	 */
+	spin_lock(&obj->vma.lock);
+	list_for_each_entry(vma, &obj->vma.list, obj_link) {
+		if (!i915_vma_is_persistent(vma))
+			continue;
+
+		spin_lock(&vma->vm->userptr_invalidated_lock);
+		if (list_empty(&vma->userptr_invalidated_link) &&
+		    !i915_vma_is_purged(vma))
+			list_add_tail(&vma->userptr_invalidated_link,
+				      &vma->vm->userptr_invalidated_list);
+		spin_unlock(&vma->vm->userptr_invalidated_lock);
+	}
+	spin_unlock(&obj->vma.lock);
+
 	/* we will unbind on next submission, still have userptr pins */
 	r = dma_resv_wait_timeout(obj->base.resv, DMA_RESV_USAGE_BOOKKEEP, false,
 				  MAX_SCHEDULE_TIMEOUT);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
index 6396fd1dc520..8532e87399ba 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
@@ -306,6 +306,12 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
 		goto put_obj;
 	}
 
+	if (i915_gem_object_is_userptr(obj)) {
+		ret = i915_gem_object_userptr_submit_init(obj);
+		if (ret)
+			goto put_obj;
+	}
+
 	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
 	if (ret)
 		goto put_obj;
@@ -335,6 +341,15 @@ static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
 		if (ret)
 			continue;
 
+#ifdef CONFIG_MMU_NOTIFIER
+		if (i915_gem_object_is_userptr(obj)) {
+			read_lock(&vm->i915->mm.notifier_lock);
+			ret = i915_gem_object_userptr_submit_done(obj);
+			read_unlock(&vm->i915->mm.notifier_lock);
+			if (ret)
+				continue;
+		}
+#endif
 		/* If out fence is not requested, wait for bind to complete */
 		if (!(va->fence.flags & I915_TIMELINE_FENCE_SIGNAL)) {
 			ret = i915_vma_wait_for_bind(vma);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index ebf8fc3a4603..50648ab9214a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -292,6 +292,8 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	INIT_LIST_HEAD(&vm->non_priv_vm_bind_list);
 	INIT_LIST_HEAD(&vm->vm_rebind_list);
 	spin_lock_init(&vm->vm_rebind_lock);
+	spin_lock_init(&vm->userptr_invalidated_lock);
+	INIT_LIST_HEAD(&vm->userptr_invalidated_list);
 }
 
 void *__px_vaddr(struct drm_i915_gem_object *p)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index b5a5b68adb32..08a18603b93a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -270,6 +270,10 @@ struct i915_address_space {
 	struct list_head vm_rebind_list;
 	/** @vm_rebind_lock: protects vm_rebound_list */
 	spinlock_t vm_rebind_lock;
+	/** @userptr_invalidated_list: list of invalidated userptr vmas */
+	struct list_head userptr_invalidated_list;
+	/** @userptr_invalidated_lock: protects userptr_invalidated_list */
+	spinlock_t userptr_invalidated_lock;
 	/** @va: tree of persistent vmas */
 	struct rb_root_cached va;
 	/** @non_priv_vm_bind_list: list of non-private object mappings */
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index f73955aef16a..08218e3a2f12 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -243,6 +243,7 @@ vma_create(struct drm_i915_gem_object *obj,
 	INIT_LIST_HEAD(&vma->vm_bind_link);
 	INIT_LIST_HEAD(&vma->non_priv_vm_bind_link);
 	INIT_LIST_HEAD(&vma->vm_rebind_link);
+	INIT_LIST_HEAD(&vma->userptr_invalidated_link);
 	return vma;
 
 err_unlock:
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
index 7c8c293ddfcb..90471dc0b235 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -307,6 +307,8 @@ struct i915_vma {
 	struct list_head non_priv_vm_bind_link;
 	/** @vm_rebind_link: link to vm_rebind_list and protected by vm_rebind_lock */
 	struct list_head vm_rebind_link; /* Link in vm_rebind_list */
+	/** @userptr_invalidated_link: link to the vm->userptr_invalidated_list */
+	struct list_head userptr_invalidated_link;
 
 	/** Timeline fence for vm_bind completion notification */
 	struct {
-- 
2.21.0.rc0.32.g243a4c7e27
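
To summarize the ordering this patch relies on, here is a condensed sketch of
the userptr handling around an execbuf3 submission. It is an abridged
illustration, not literal patch code: the helper name is made up and the
lookup/locking/validation steps in between are elided.

/* Abridged flow of userptr handling around an execbuf3 submission. */
static int example_userptr_flow(struct i915_execbuffer *eb,
				struct drm_i915_gem_object *obj)
{
	int err;

	/* Before object locks: pin the user pages; this may unbind the vma
	 * and queue it for revalidation via the vm_bind/rebind lists.
	 */
	err = i915_gem_object_userptr_submit_init(obj);
	if (err)
		return err;

	/* ... lock and revalidate the persistent vmas ... */

	/* After validation, under mm.notifier_lock: commit the pinned pages. */
	read_lock(&eb->i915->mm.notifier_lock);
	err = i915_gem_object_userptr_submit_done(obj);
	read_unlock(&eb->i915->mm.notifier_lock);

	/* A concurrent invalidation surfaces as an error (-EAGAIN in the
	 * submission path) and the lookup/validate sequence is retried.
	 */
	return err;
}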


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [Intel-gfx] [PATCH v6 17/20] drm/i915/vm_bind: Limit vm_bind mode to non-recoverable contexts
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Only support vm_bind mode with non-recoverable contexts.
With the new vm_bind mode and the execbuf3 submission path, there is
no need to support the older recoverable contexts.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 1630a52f387d..899079d602bc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1617,6 +1617,12 @@ i915_gem_create_context(struct drm_i915_private *i915,
 	INIT_LIST_HEAD(&ctx->stale.engines);
 
 	if (pc->vm) {
+		/* Only non-recoverable contexts are allowed in vm_bind mode */
+		if (i915_gem_vm_is_vm_bind_mode(pc->vm) &&
+		    (pc->user_flags & BIT(UCONTEXT_RECOVERABLE))) {
+			err = -EINVAL;
+			goto err_ctx;
+		}
 		vm = i915_vm_get(pc->vm);
 	} else if (HAS_FULL_PPGTT(i915)) {
 		struct i915_ppgtt *ppgtt;
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 17/20] drm/i915/vm_bind: Limit vm_bind mode to non-recoverable contexts
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

Only support vm_bind mode with non-recoverable contexts.
With the new vm_bind mode and the eb3 submission path, there is
no need to support the older recoverable contexts.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 1630a52f387d..899079d602bc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1617,6 +1617,12 @@ i915_gem_create_context(struct drm_i915_private *i915,
 	INIT_LIST_HEAD(&ctx->stale.engines);
 
 	if (pc->vm) {
+		/* Only non-recoverable contexts are allowed in vm_bind mode */
+		if (i915_gem_vm_is_vm_bind_mode(pc->vm) &&
+		    (pc->user_flags & BIT(UCONTEXT_RECOVERABLE))) {
+			err = -EINVAL;
+			goto err_ctx;
+		}
 		vm = i915_vm_get(pc->vm);
 	} else if (HAS_FULL_PPGTT(i915)) {
 		struct i915_ppgtt *ppgtt;
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 18/20] drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

Add getparam support for querying the VM_BIND capability version.
Add a VM creation time flag to enable vm_bind_mode for the VM.

v2: update kernel-doc
v3: create vm->root_obj only upon I915_VM_CREATE_FLAGS_USE_VM_BIND
v4: replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()
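
For illustration (not part of this patch), a userspace sketch of consuming
the new uapi: query I915_PARAM_VM_BIND_VERSION and, if supported, create a
VM in vm_bind mode (error handling trimmed):

#include <errno.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Returns 0 and fills *vm_id with a vm_bind mode VM, or a negative errno. */
static int create_vm_bind_vm(int fd, __u32 *vm_id)
{
	int version = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_VM_BIND_VERSION,
		.value = &version,
	};
	struct drm_i915_gem_vm_control ctl = {
		.flags = I915_VM_CREATE_FLAGS_USE_VM_BIND,
	};

	/* Older kernels reject the param; version 0 means no VM_BIND support. */
	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp) || version < 1)
		return -ENODEV;

	if (ioctl(fd, DRM_IOCTL_I915_GEM_VM_CREATE, &ctl))
		return -errno;

	*vm_id = ctl.vm_id;
	return 0;
}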

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 25 ++++++++++++++++++--
 drivers/gpu/drm/i915/gem/i915_gem_context.h |  3 +--
 drivers/gpu/drm/i915/gt/intel_gtt.c         |  2 ++
 drivers/gpu/drm/i915/i915_drv.h             |  2 ++
 drivers/gpu/drm/i915/i915_getparam.c        |  3 +++
 include/uapi/drm/i915_drm.h                 | 26 ++++++++++++++++++++-
 6 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 899079d602bc..56b60413bef9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1809,9 +1809,13 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
 	if (!HAS_FULL_PPGTT(i915))
 		return -ENODEV;
 
-	if (args->flags)
+	if (args->flags & I915_VM_CREATE_FLAGS_UNKNOWN)
 		return -EINVAL;
 
+	if ((args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) &&
+	    !HAS_VM_BIND(i915))
+		return -EOPNOTSUPP;
+
 	ppgtt = i915_ppgtt_create(to_gt(i915), 0);
 	if (IS_ERR(ppgtt))
 		return PTR_ERR(ppgtt);
@@ -1824,15 +1828,32 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
 			goto err_put;
 	}
 
+	if (args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) {
+		struct drm_i915_gem_object *obj;
+
+		obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
+		if (IS_ERR(obj)) {
+			err = PTR_ERR(obj);
+			goto err_put;
+		}
+
+		ppgtt->vm.root_obj = obj;
+	}
+
 	err = xa_alloc(&file_priv->vm_xa, &id, &ppgtt->vm,
 		       xa_limit_32b, GFP_KERNEL);
 	if (err)
-		goto err_put;
+		goto err_root_obj_put;
 
 	GEM_BUG_ON(id == 0); /* reserved for invalid/unassigned ppgtt */
 	args->vm_id = id;
 	return 0;
 
+err_root_obj_put:
+	if (ppgtt->vm.root_obj) {
+		i915_gem_object_put(ppgtt->vm.root_obj);
+		ppgtt->vm.root_obj = NULL;
+	}
 err_put:
 	i915_vm_put(&ppgtt->vm);
 	return err;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index e8b41aa8f8c4..b53aef2853cb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -150,8 +150,7 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
  */
 static inline bool i915_gem_vm_is_vm_bind_mode(struct i915_address_space *vm)
 {
-	/* No support to enable vm_bind mode yet */
-	return false;
+	return !!vm->root_obj;
 }
 
 struct i915_address_space *
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 50648ab9214a..ae66fdd4bce9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -178,6 +178,8 @@ int i915_vm_lock_objects(struct i915_address_space *vm,
 void i915_address_space_fini(struct i915_address_space *vm)
 {
 	drm_mm_takedown(&vm->mm);
+	if (vm->root_obj)
+		i915_gem_object_put(vm->root_obj);
 	GEM_BUG_ON(!RB_EMPTY_ROOT(&vm->va.rb_root));
 	mutex_destroy(&vm->vm_bind_lock);
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 05b3300cc4ed..a34d9a7dcd1c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -978,6 +978,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define HAS_LMEMBAR_SMEM_STOLEN(i915) (!HAS_LMEM(i915) && \
 				       GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
 
+#define HAS_VM_BIND(i915) (GRAPHICS_VER(i915) >= 12)
+
 /* intel_device_info.c */
 static inline struct intel_device_info *
 mkwrite_device_info(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
index 3047e80e1163..9f700c0c0fb0 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -178,6 +178,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_OA_TIMESTAMP_FREQUENCY:
 		value = i915_perf_oa_timestamp_frequency(i915);
 		break;
+	case I915_PARAM_VM_BIND_VERSION:
+		value = HAS_VM_BIND(i915);
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 59a94f515064..62ceb064e11d 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -777,6 +777,22 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_OA_TIMESTAMP_FREQUENCY 57
 
+/*
+ * VM_BIND feature version supported.
+ *
+ * The following versions of VM_BIND have been defined:
+ *
+ * 0: No VM_BIND support.
+ *
+ * 1: In VM_UNBIND calls, the UMD must specify the exact mappings created
+ *    previously with VM_BIND, the ioctl will not support unbinding multiple
+ *    mappings or splitting them. Similarly, VM_BIND calls will not replace
+ *    any existing mappings.
+ *
+ * See struct drm_i915_gem_vm_bind and struct drm_i915_gem_vm_unbind.
+ */
+#define I915_PARAM_VM_BIND_VERSION	58
+
 /* Must be kept compact -- no holes and well documented */
 
 /**
@@ -2644,7 +2660,15 @@ struct drm_i915_gem_vm_control {
 	/** @extensions: Zero-terminated chain of extensions. */
 	__u64 extensions;
 
-	/** @flags: reserved for future usage, currently MBZ */
+	/**
+	 * @flags: Supported flags are,
+	 *
+	 * I915_VM_CREATE_FLAGS_USE_VM_BIND:
+	 *
+	 * VM created will work in VM_BIND mode.
+	 */
+#define I915_VM_CREATE_FLAGS_USE_VM_BIND	(1u << 0)
+#define I915_VM_CREATE_FLAGS_UNKNOWN	(-(I915_VM_CREATE_FLAGS_USE_VM_BIND << 1))
 	__u32 flags;
 
 	/** @vm_id: Id of the VM created or to be destroyed */
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [Intel-gfx] [PATCH v6 18/20] drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Add getparam support for querying the VM_BIND capability version.
Add a VM creation time flag to enable vm_bind_mode for the VM.

v2: update kernel-doc
v3: create vm->root_obj only upon I915_VM_CREATE_FLAGS_USE_VM_BIND
v4: replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 25 ++++++++++++++++++--
 drivers/gpu/drm/i915/gem/i915_gem_context.h |  3 +--
 drivers/gpu/drm/i915/gt/intel_gtt.c         |  2 ++
 drivers/gpu/drm/i915/i915_drv.h             |  2 ++
 drivers/gpu/drm/i915/i915_getparam.c        |  3 +++
 include/uapi/drm/i915_drm.h                 | 26 ++++++++++++++++++++-
 6 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 899079d602bc..56b60413bef9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1809,9 +1809,13 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
 	if (!HAS_FULL_PPGTT(i915))
 		return -ENODEV;
 
-	if (args->flags)
+	if (args->flags & I915_VM_CREATE_FLAGS_UNKNOWN)
 		return -EINVAL;
 
+	if ((args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) &&
+	    !HAS_VM_BIND(i915))
+		return -EOPNOTSUPP;
+
 	ppgtt = i915_ppgtt_create(to_gt(i915), 0);
 	if (IS_ERR(ppgtt))
 		return PTR_ERR(ppgtt);
@@ -1824,15 +1828,32 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
 			goto err_put;
 	}
 
+	if (args->flags & I915_VM_CREATE_FLAGS_USE_VM_BIND) {
+		struct drm_i915_gem_object *obj;
+
+		obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
+		if (IS_ERR(obj)) {
+			err = PTR_ERR(obj);
+			goto err_put;
+		}
+
+		ppgtt->vm.root_obj = obj;
+	}
+
 	err = xa_alloc(&file_priv->vm_xa, &id, &ppgtt->vm,
 		       xa_limit_32b, GFP_KERNEL);
 	if (err)
-		goto err_put;
+		goto err_root_obj_put;
 
 	GEM_BUG_ON(id == 0); /* reserved for invalid/unassigned ppgtt */
 	args->vm_id = id;
 	return 0;
 
+err_root_obj_put:
+	if (ppgtt->vm.root_obj) {
+		i915_gem_object_put(ppgtt->vm.root_obj);
+		ppgtt->vm.root_obj = NULL;
+	}
 err_put:
 	i915_vm_put(&ppgtt->vm);
 	return err;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index e8b41aa8f8c4..b53aef2853cb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -150,8 +150,7 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
  */
 static inline bool i915_gem_vm_is_vm_bind_mode(struct i915_address_space *vm)
 {
-	/* No support to enable vm_bind mode yet */
-	return false;
+	return !!vm->root_obj;
 }
 
 struct i915_address_space *
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 50648ab9214a..ae66fdd4bce9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -178,6 +178,8 @@ int i915_vm_lock_objects(struct i915_address_space *vm,
 void i915_address_space_fini(struct i915_address_space *vm)
 {
 	drm_mm_takedown(&vm->mm);
+	if (vm->root_obj)
+		i915_gem_object_put(vm->root_obj);
 	GEM_BUG_ON(!RB_EMPTY_ROOT(&vm->va.rb_root));
 	mutex_destroy(&vm->vm_bind_lock);
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 05b3300cc4ed..a34d9a7dcd1c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -978,6 +978,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define HAS_LMEMBAR_SMEM_STOLEN(i915) (!HAS_LMEM(i915) && \
 				       GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
 
+#define HAS_VM_BIND(i915) (GRAPHICS_VER(i915) >= 12)
+
 /* intel_device_info.c */
 static inline struct intel_device_info *
 mkwrite_device_info(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
index 3047e80e1163..9f700c0c0fb0 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -178,6 +178,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_OA_TIMESTAMP_FREQUENCY:
 		value = i915_perf_oa_timestamp_frequency(i915);
 		break;
+	case I915_PARAM_VM_BIND_VERSION:
+		value = HAS_VM_BIND(i915);
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 59a94f515064..62ceb064e11d 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -777,6 +777,22 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_OA_TIMESTAMP_FREQUENCY 57
 
+/*
+ * VM_BIND feature version supported.
+ *
+ * The following versions of VM_BIND have been defined:
+ *
+ * 0: No VM_BIND support.
+ *
+ * 1: In VM_UNBIND calls, the UMD must specify the exact mappings created
+ *    previously with VM_BIND, the ioctl will not support unbinding multiple
+ *    mappings or splitting them. Similarly, VM_BIND calls will not replace
+ *    any existing mappings.
+ *
+ * See struct drm_i915_gem_vm_bind and struct drm_i915_gem_vm_unbind.
+ */
+#define I915_PARAM_VM_BIND_VERSION	58
+
 /* Must be kept compact -- no holes and well documented */
 
 /**
@@ -2644,7 +2660,15 @@ struct drm_i915_gem_vm_control {
 	/** @extensions: Zero-terminated chain of extensions. */
 	__u64 extensions;
 
-	/** @flags: reserved for future usage, currently MBZ */
+	/**
+	 * @flags: Supported flags are,
+	 *
+	 * I915_VM_CREATE_FLAGS_USE_VM_BIND:
+	 *
+	 * VM created will work in VM_BIND mode.
+	 */
+#define I915_VM_CREATE_FLAGS_USE_VM_BIND	(1u << 0)
+#define I915_VM_CREATE_FLAGS_UNKNOWN	(-(I915_VM_CREATE_FLAGS_USE_VM_BIND << 1))
 	__u32 flags;
 
 	/** @vm_id: Id of the VM created or to be destroyed */
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 19/20] drm/i915/vm_bind: Render VM_BIND documentation
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

Update i915 documentation to include VM_BIND changes
and render all VM_BIND related documentation.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 Documentation/gpu/i915.rst | 78 ++++++++++++++++++++++++++++----------
 1 file changed, 59 insertions(+), 19 deletions(-)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 4e59db1cfb00..5c55cbc980b1 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -283,15 +283,18 @@ An Intel GPU has multiple engines. There are several engine types.
 
 The Intel GPU family is a family of integrated GPU's using Unified
 Memory Access. For having the GPU "do work", user space will feed the
-GPU batch buffers via one of the ioctls `DRM_IOCTL_I915_GEM_EXECBUFFER2`
-or `DRM_IOCTL_I915_GEM_EXECBUFFER2_WR`. Most such batchbuffers will
-instruct the GPU to perform work (for example rendering) and that work
-needs memory from which to read and memory to which to write. All memory
-is encapsulated within GEM buffer objects (usually created with the ioctl
-`DRM_IOCTL_I915_GEM_CREATE`). An ioctl providing a batchbuffer for the GPU
-to create will also list all GEM buffer objects that the batchbuffer reads
-and/or writes. For implementation details of memory management see
-`GEM BO Management Implementation Details`_.
+GPU batch buffers via one of the ioctls `DRM_IOCTL_I915_GEM_EXECBUFFER2`,
+`DRM_IOCTL_I915_GEM_EXECBUFFER2_WR` or `DRM_IOCTL_I915_GEM_EXECBUFFER3`.
+Most such batchbuffers will instruct the GPU to perform work (for example
+rendering) and that work needs memory from which to read and memory to
+which to write. All memory is encapsulated within GEM buffer objects
+(usually created with the ioctl `DRM_IOCTL_I915_GEM_CREATE`). In vm_bind mode
+(see `VM_BIND mode`_), the batch buffer and all the GEM buffer objects that
+it reads and/or writes should be bound with vm_bind ioctl before submitting
+the batch buffer to GPU. In legacy (non-VM_BIND) mode, an ioctl providing a
+batchbuffer for the GPU to create will also list all GEM buffer objects that
+the batchbuffer reads and/or writes. For implementation details of memory
+management see `GEM BO Management Implementation Details`_.
 
 The i915 driver allows user space to create a context via the ioctl
 `DRM_IOCTL_I915_GEM_CONTEXT_CREATE` which is identified by a 32-bit
@@ -309,8 +312,9 @@ In addition to the ordering guarantees, the kernel will restore GPU
 state via HW context when commands are issued to a context, this saves
 user space the need to restore (most of atleast) the GPU state at the
 start of each batchbuffer. The non-deprecated ioctls to submit batchbuffer
-work can pass that ID (in the lower bits of drm_i915_gem_execbuffer2::rsvd1)
-to identify what context to use with the command.
+work can pass that ID (drm_i915_gem_execbuffer3::ctx_id, or in the lower
+bits of drm_i915_gem_execbuffer2::rsvd1) to identify what context to use
+with the command.
 
 The GPU has its own memory management and address space. The kernel
 driver maintains the memory translation table for the GPU. For older
@@ -318,14 +322,14 @@ GPUs (i.e. those before Gen8), there is a single global such translation
 table, a global Graphics Translation Table (GTT). For newer generation
 GPUs each context has its own translation table, called Per-Process
 Graphics Translation Table (PPGTT). Of important note, is that although
-PPGTT is named per-process it is actually per context. When user space
-submits a batchbuffer, the kernel walks the list of GEM buffer objects
-used by the batchbuffer and guarantees that not only is the memory of
-each such GEM buffer object resident but it is also present in the
-(PP)GTT. If the GEM buffer object is not yet placed in the (PP)GTT,
-then it is given an address. Two consequences of this are: the kernel
-needs to edit the batchbuffer submitted to write the correct value of
-the GPU address when a GEM BO is assigned a GPU address and the kernel
+PPGTT is named per-process it is actually per context. In legacy
+(non-vm_bind) mode, when user space submits a batchbuffer, the kernel walks
+the list of GEM buffer objects used by the batchbuffer and guarantees that
+not only is the memory of each such GEM buffer object resident but it is
+also present in the (PP)GTT. If the GEM buffer object is not yet placed in
+the (PP)GTT, then it is given an address. Two consequences of this are: the
+kernel needs to edit the batchbuffer submitted to write the correct value
+of the GPU address when a GEM BO is assigned a GPU address and the kernel
 might evict a different GEM BO from the (PP)GTT to make address room
 for another GEM BO. Consequently, the ioctls submitting a batchbuffer
 for execution also include a list of all locations within buffers that
@@ -407,6 +411,15 @@ objects, which has the goal to make space in gpu virtual address spaces.
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
    :internal:
 
+VM_BIND mode
+------------
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+   :doc: VM_BIND/UNBIND ioctls
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+   :internal:
+
 Batchbuffer Parsing
 -------------------
 
@@ -419,11 +432,38 @@ Batchbuffer Parsing
 User Batchbuffer Execution
 --------------------------
 
+Client state
+~~~~~~~~~~~~
+
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_context_types.h
 
+User command execution
+~~~~~~~~~~~~~~~~~~~~~~
+
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
    :doc: User command execution
 
+User command execution in vm_bind mode
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+   :doc: User command execution in vm_bind mode
+
+Common execbuff utilities
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
+   :internal:
+
+Execbuf3 ioctl path
+~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+   :internal:
+
 Scheduling
 ----------
 .. kernel-doc:: drivers/gpu/drm/i915/i915_scheduler_types.h
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [Intel-gfx] [PATCH v6 19/20] drm/i915/vm_bind: Render VM_BIND documentation
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Update i915 documentation to include VM_BIND changes
and render all VM_BIND related documentation.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 Documentation/gpu/i915.rst | 78 ++++++++++++++++++++++++++++----------
 1 file changed, 59 insertions(+), 19 deletions(-)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 4e59db1cfb00..5c55cbc980b1 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -283,15 +283,18 @@ An Intel GPU has multiple engines. There are several engine types.
 
 The Intel GPU family is a family of integrated GPU's using Unified
 Memory Access. For having the GPU "do work", user space will feed the
-GPU batch buffers via one of the ioctls `DRM_IOCTL_I915_GEM_EXECBUFFER2`
-or `DRM_IOCTL_I915_GEM_EXECBUFFER2_WR`. Most such batchbuffers will
-instruct the GPU to perform work (for example rendering) and that work
-needs memory from which to read and memory to which to write. All memory
-is encapsulated within GEM buffer objects (usually created with the ioctl
-`DRM_IOCTL_I915_GEM_CREATE`). An ioctl providing a batchbuffer for the GPU
-to create will also list all GEM buffer objects that the batchbuffer reads
-and/or writes. For implementation details of memory management see
-`GEM BO Management Implementation Details`_.
+GPU batch buffers via one of the ioctls `DRM_IOCTL_I915_GEM_EXECBUFFER2`,
+`DRM_IOCTL_I915_GEM_EXECBUFFER2_WR` or `DRM_IOCTL_I915_GEM_EXECBUFFER3`.
+Most such batchbuffers will instruct the GPU to perform work (for example
+rendering) and that work needs memory from which to read and memory to
+which to write. All memory is encapsulated within GEM buffer objects
+(usually created with the ioctl `DRM_IOCTL_I915_GEM_CREATE`). In vm_bind mode
+(see `VM_BIND mode`_), the batch buffer and all the GEM buffer objects that
+it reads and/or writes should be bound with vm_bind ioctl before submitting
+the batch buffer to GPU. In legacy (non-VM_BIND) mode, an ioctl providing a
+batchbuffer for the GPU to create will also list all GEM buffer objects that
+the batchbuffer reads and/or writes. For implementation details of memory
+management see `GEM BO Management Implementation Details`_.
 
 The i915 driver allows user space to create a context via the ioctl
 `DRM_IOCTL_I915_GEM_CONTEXT_CREATE` which is identified by a 32-bit
@@ -309,8 +312,9 @@ In addition to the ordering guarantees, the kernel will restore GPU
 state via HW context when commands are issued to a context, this saves
 user space the need to restore (most of atleast) the GPU state at the
 start of each batchbuffer. The non-deprecated ioctls to submit batchbuffer
-work can pass that ID (in the lower bits of drm_i915_gem_execbuffer2::rsvd1)
-to identify what context to use with the command.
+work can pass that ID (drm_i915_gem_execbuffer3::ctx_id, or in the lower
+bits of drm_i915_gem_execbuffer2::rsvd1) to identify what context to use
+with the command.
 
 The GPU has its own memory management and address space. The kernel
 driver maintains the memory translation table for the GPU. For older
@@ -318,14 +322,14 @@ GPUs (i.e. those before Gen8), there is a single global such translation
 table, a global Graphics Translation Table (GTT). For newer generation
 GPUs each context has its own translation table, called Per-Process
 Graphics Translation Table (PPGTT). Of important note, is that although
-PPGTT is named per-process it is actually per context. When user space
-submits a batchbuffer, the kernel walks the list of GEM buffer objects
-used by the batchbuffer and guarantees that not only is the memory of
-each such GEM buffer object resident but it is also present in the
-(PP)GTT. If the GEM buffer object is not yet placed in the (PP)GTT,
-then it is given an address. Two consequences of this are: the kernel
-needs to edit the batchbuffer submitted to write the correct value of
-the GPU address when a GEM BO is assigned a GPU address and the kernel
+PPGTT is named per-process it is actually per context. In legacy
+(non-vm_bind) mode, when user space submits a batchbuffer, the kernel walks
+the list of GEM buffer objects used by the batchbuffer and guarantees that
+not only is the memory of each such GEM buffer object resident but it is
+also present in the (PP)GTT. If the GEM buffer object is not yet placed in
+the (PP)GTT, then it is given an address. Two consequences of this are: the
+kernel needs to edit the batchbuffer submitted to write the correct value
+of the GPU address when a GEM BO is assigned a GPU address and the kernel
 might evict a different GEM BO from the (PP)GTT to make address room
 for another GEM BO. Consequently, the ioctls submitting a batchbuffer
 for execution also include a list of all locations within buffers that
@@ -407,6 +411,15 @@ objects, which has the goal to make space in gpu virtual address spaces.
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
    :internal:
 
+VM_BIND mode
+------------
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+   :doc: VM_BIND/UNBIND ioctls
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
+   :internal:
+
 Batchbuffer Parsing
 -------------------
 
@@ -419,11 +432,38 @@ Batchbuffer Parsing
 User Batchbuffer Execution
 --------------------------
 
+Client state
+~~~~~~~~~~~~
+
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_context_types.h
 
+User command execution
+~~~~~~~~~~~~~~~~~~~~~~
+
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
    :doc: User command execution
 
+User command execution in vm_bind mode
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+   :doc: User command execution in vm_bind mode
+
+Common execbuff utilities
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h
+   :internal:
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
+   :internal:
+
+Execbuf3 ioctl path
+~~~~~~~~~~~~~~~~~~~
+
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
+   :internal:
+
 Scheduling
 ----------
 .. kernel-doc:: drivers/gpu/drm/i915/i915_scheduler_types.h
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 20/20] drm/i915/vm_bind: Async vm_unbind support
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	lionel.g.landwerlin, thomas.hellstrom, matthew.auld, jason,
	andi.shyti, daniel.vetter, christian.koenig

Asynchronously unbind the vma upon a vm_unbind call.
Fall back to synchronous unbind if the backend doesn't support
async unbind or if the async unbind fails.

There is no need for vm_unbind out fence support, as i915 internally
handles all sequencing and the user does not need to sequence any
operation with the unbind completion.
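
For illustration (not part of this patch), a sketch of the resulting
userspace flow: unbind a mapping and move on, without requesting any out
fence. The field names used here (vm_id, start, length) are assumed from
this series' VM_UNBIND uapi and must match the original VM_BIND exactly:

#include <errno.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static int unbind_mapping(int fd, __u32 vm_id, __u64 gpu_va, __u64 size)
{
	struct drm_i915_gem_vm_unbind unbind = {
		.vm_id = vm_id,
		.start = gpu_va,	/* must match the original VM_BIND */
		.length = size,
	};

	/* No out fence: the kernel sequences the actual (async) unbind. */
	if (ioctl(fd, DRM_IOCTL_I915_GEM_VM_UNBIND, &unbind))
		return -errno;

	return 0;
}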

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/i915_vma.c | 51 ++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_vma.h |  1 +
 2 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 08218e3a2f12..03c966fad87b 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -42,6 +42,8 @@
 #include "i915_vma.h"
 #include "i915_vma_resource.h"
 
+static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma);
+
 static inline void assert_vma_held_evict(const struct i915_vma *vma)
 {
 	/*
@@ -1711,7 +1713,7 @@ void i915_vma_reopen(struct i915_vma *vma)
 	spin_unlock_irq(&gt->closed_lock);
 }
 
-static void force_unbind(struct i915_vma *vma)
+static void force_unbind(struct i915_vma *vma, bool async)
 {
 	if (!drm_mm_node_allocated(&vma->node))
 		return;
@@ -1725,7 +1727,21 @@ static void force_unbind(struct i915_vma *vma)
 		i915_vma_set_purged(vma);
 
 	atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
-	WARN_ON(__i915_vma_unbind(vma));
+	if (async) {
+		struct dma_fence *fence;
+
+		fence = __i915_vma_unbind_async(vma);
+		if (IS_ERR_OR_NULL(fence)) {
+			async = false;
+		} else {
+			dma_resv_add_fence(vma->obj->base.resv, fence,
+					   DMA_RESV_USAGE_READ);
+			dma_fence_put(fence);
+		}
+	}
+
+	if (!async)
+		WARN_ON(__i915_vma_unbind(vma));
 	GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
 }
 
@@ -1785,7 +1801,7 @@ void i915_vma_destroy_locked(struct i915_vma *vma)
 {
 	lockdep_assert_held(&vma->vm->mutex);
 
-	force_unbind(vma);
+	force_unbind(vma, false);
 	list_del_init(&vma->vm_link);
 	release_references(vma, vma->vm->gt, false);
 }
@@ -1796,7 +1812,34 @@ void i915_vma_destroy(struct i915_vma *vma)
 	bool vm_ddestroy;
 
 	mutex_lock(&vma->vm->mutex);
-	force_unbind(vma);
+	force_unbind(vma, false);
+	list_del_init(&vma->vm_link);
+	vm_ddestroy = vma->vm_ddestroy;
+	vma->vm_ddestroy = false;
+
+	/* vma->vm may be freed when releasing vma->vm->mutex. */
+	gt = vma->vm->gt;
+	mutex_unlock(&vma->vm->mutex);
+	release_references(vma, gt, vm_ddestroy);
+}
+
+void i915_vma_destroy_async(struct i915_vma *vma)
+{
+	bool vm_ddestroy, async = vma->obj->mm.rsgt;
+	struct intel_gt *gt;
+
+	if (dma_resv_reserve_fences(vma->obj->base.resv, 1))
+		async = false;
+
+	mutex_lock(&vma->vm->mutex);
+	/*
+	 * Ensure any asynchronous binding is complete while using
+	 * async unbind as we will be releasing the vma here.
+	 */
+	if (async && i915_active_wait(&vma->active))
+		async = false;
+
+	force_unbind(vma, async);
 	list_del_init(&vma->vm_link);
 	vm_ddestroy = vma->vm_ddestroy;
 	vma->vm_ddestroy = false;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 737ef310d046..25f15965dab8 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -272,6 +272,7 @@ void i915_vma_reopen(struct i915_vma *vma);
 
 void i915_vma_destroy_locked(struct i915_vma *vma);
 void i915_vma_destroy(struct i915_vma *vma);
+void i915_vma_destroy_async(struct i915_vma *vma);
 
 #define assert_vma_held(vma) dma_resv_assert_held((vma)->obj->base.resv)
 
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [Intel-gfx] [PATCH v6 20/20] drm/i915/vm_bind: Async vm_unbind support
@ 2022-11-07  8:52   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-07  8:52 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, jani.nikula, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

Asynchronously unbind the vma upon a vm_unbind call.
Fall back to synchronous unbind if the backend doesn't support
async unbind or if the async unbind fails.

There is no need for vm_unbind out fence support, as i915 internally
handles all sequencing and the user does not need to sequence any
operation with the unbind completion.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/i915_vma.c | 51 ++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_vma.h |  1 +
 2 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 08218e3a2f12..03c966fad87b 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -42,6 +42,8 @@
 #include "i915_vma.h"
 #include "i915_vma_resource.h"
 
+static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma);
+
 static inline void assert_vma_held_evict(const struct i915_vma *vma)
 {
 	/*
@@ -1711,7 +1713,7 @@ void i915_vma_reopen(struct i915_vma *vma)
 	spin_unlock_irq(&gt->closed_lock);
 }
 
-static void force_unbind(struct i915_vma *vma)
+static void force_unbind(struct i915_vma *vma, bool async)
 {
 	if (!drm_mm_node_allocated(&vma->node))
 		return;
@@ -1725,7 +1727,21 @@ static void force_unbind(struct i915_vma *vma)
 		i915_vma_set_purged(vma);
 
 	atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
-	WARN_ON(__i915_vma_unbind(vma));
+	if (async) {
+		struct dma_fence *fence;
+
+		fence = __i915_vma_unbind_async(vma);
+		if (IS_ERR_OR_NULL(fence)) {
+			async = false;
+		} else {
+			dma_resv_add_fence(vma->obj->base.resv, fence,
+					   DMA_RESV_USAGE_READ);
+			dma_fence_put(fence);
+		}
+	}
+
+	if (!async)
+		WARN_ON(__i915_vma_unbind(vma));
 	GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
 }
 
@@ -1785,7 +1801,7 @@ void i915_vma_destroy_locked(struct i915_vma *vma)
 {
 	lockdep_assert_held(&vma->vm->mutex);
 
-	force_unbind(vma);
+	force_unbind(vma, false);
 	list_del_init(&vma->vm_link);
 	release_references(vma, vma->vm->gt, false);
 }
@@ -1796,7 +1812,34 @@ void i915_vma_destroy(struct i915_vma *vma)
 	bool vm_ddestroy;
 
 	mutex_lock(&vma->vm->mutex);
-	force_unbind(vma);
+	force_unbind(vma, false);
+	list_del_init(&vma->vm_link);
+	vm_ddestroy = vma->vm_ddestroy;
+	vma->vm_ddestroy = false;
+
+	/* vma->vm may be freed when releasing vma->vm->mutex. */
+	gt = vma->vm->gt;
+	mutex_unlock(&vma->vm->mutex);
+	release_references(vma, gt, vm_ddestroy);
+}
+
+void i915_vma_destroy_async(struct i915_vma *vma)
+{
+	bool vm_ddestroy, async = vma->obj->mm.rsgt;
+	struct intel_gt *gt;
+
+	if (dma_resv_reserve_fences(vma->obj->base.resv, 1))
+		async = false;
+
+	mutex_lock(&vma->vm->mutex);
+	/*
+	 * Ensure any asynchronous binding is complete while using
+	 * async unbind as we will be releasing the vma here.
+	 */
+	if (async && i915_active_wait(&vma->active))
+		async = false;
+
+	force_unbind(vma, async);
 	list_del_init(&vma->vm_link);
 	vm_ddestroy = vma->vm_ddestroy;
 	vma->vm_ddestroy = false;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 737ef310d046..25f15965dab8 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -272,6 +272,7 @@ void i915_vma_reopen(struct i915_vma *vma);
 
 void i915_vma_destroy_locked(struct i915_vma *vma);
 void i915_vma_destroy(struct i915_vma *vma);
+void i915_vma_destroy_async(struct i915_vma *vma);
 
 #define assert_vma_held(vma) dma_resv_assert_held((vma)->obj->base.resv)
 
-- 
2.21.0.rc0.32.g243a4c7e27


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/vm_bind: Add VM_BIND functionality (rev9)
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
                   ` (20 preceding siblings ...)
  (?)
@ 2022-11-07 11:21 ` Patchwork
  -1 siblings, 0 replies; 71+ messages in thread
From: Patchwork @ 2022-11-07 11:21 UTC (permalink / raw)
  To: Niranjana Vishwanathapura; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/vm_bind: Add VM_BIND functionality (rev9)
URL   : https://patchwork.freedesktop.org/series/105879/
State : warning

== Summary ==

Error: dim checkpatch failed
64e79a6800a5 drm/i915/vm_bind: Expose vm lookup function
ed295e82a10c drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
4781445a3f99 drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
d1ff6b98969b drm/i915/vm_bind: Add support to create persistent vma
-:61: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#61: FILE: drivers/gpu/drm/i915/i915_vma.c:309:
+	GEM_BUG_ON(!IS_ERR(vma) && i915_vma_compare(vma, vm, view));

-:82: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#82: FILE: drivers/gpu/drm/i915/i915_vma.c:330:
+	GEM_BUG_ON(!kref_read(&vm->ref));

-:127: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#127: FILE: drivers/gpu/drm/i915/i915_vma.h:181:
+	GEM_BUG_ON(view && !(i915_is_ggtt_or_dpt(vm) ||

total: 0 errors, 3 warnings, 0 checks, 107 lines checked
d320691d4cdd drm/i915/vm_bind: Implement bind and unbind of object
Traceback (most recent call last):
  File "scripts/spdxcheck.py", line 11, in <module>
    import git
ModuleNotFoundError: No module named 'git'
Traceback (most recent call last):
  File "scripts/spdxcheck.py", line 11, in <module>
    import git
ModuleNotFoundError: No module named 'git'
-:83: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#83: 
new file mode 100644

-:460: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#460: FILE: drivers/gpu/drm/i915/gt/intel_gtt.c:181:
+	GEM_BUG_ON(!RB_EMPTY_ROOT(&vm->va.rb_root));

-:581: WARNING:LONG_LINE: line length of 118 exceeds 100 columns
#581: FILE: include/uapi/drm/i915_drm.h:539:
+#define DRM_IOCTL_I915_GEM_VM_BIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)

-:582: WARNING:LONG_LINE: line length of 122 exceeds 100 columns
#582: FILE: include/uapi/drm/i915_drm.h:540:
+#define DRM_IOCTL_I915_GEM_VM_UNBIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)

total: 0 errors, 4 warnings, 0 checks, 597 lines checked
9940046e662d drm/i915/vm_bind: Support for VM private BOs
4567f2f4ac4f drm/i915/vm_bind: Add support to handle object evictions
6d94e97499d1 drm/i915/vm_bind: Support persistent vma activeness tracking
6f44bfd47141 drm/i915/vm_bind: Add out fence support
6c57ad92061f drm/i915/vm_bind: Abstract out common execbuf functions
Traceback (most recent call last):
  File "scripts/spdxcheck.py", line 11, in <module>
    import git
ModuleNotFoundError: No module named 'git'
Traceback (most recent call last):
  File "scripts/spdxcheck.py", line 11, in <module>
    import git
ModuleNotFoundError: No module named 'git'
-:28: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#28: 
new file mode 100644

-:171: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#171: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c:139:
+		GEM_BUG_ON(err);	/* perma-pinned should incr a counter */

-:246: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#246: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c:214:
+	GEM_BUG_ON("Context not found");

-:600: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#600: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c:568:
+	GEM_BUG_ON(!intel_context_is_parent(context));

total: 0 errors, 4 warnings, 0 checks, 747 lines checked
5c4a0ace1143 drm/i915/vm_bind: Use common execbuf functions in execbuf path
e1f83eaca535 drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
Traceback (most recent call last):
  File "scripts/spdxcheck.py", line 11, in <module>
    import git
ModuleNotFoundError: No module named 'git'
-:39: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#39: 
new file mode 100644

-:266: WARNING:AVOID_BUG: Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants
#266: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c:223:
+	GEM_BUG_ON(eb->args->flags & __EXEC3_ENGINE_PINNED);

-:663: WARNING:LONG_LINE: line length of 126 exceeds 100 columns
#663: FILE: include/uapi/drm/i915_drm.h:542:
+#define DRM_IOCTL_I915_GEM_EXECBUFFER3	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)

total: 0 errors, 3 warnings, 0 checks, 679 lines checked
2b6331eda3b9 drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
d2df073faca6 drm/i915/vm_bind: Expose i915_request_await_bind()
bce4821297a2 drm/i915/vm_bind: Handle persistent vmas in execbuf3
e539dd8af93b drm/i915/vm_bind: userptr dma-resv changes
d8739a254b81 drm/i915/vm_bind: Limit vm_bind mode to non-recoverable contexts
f78c70b4458e drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode
3e5cfe3941dd drm/i915/vm_bind: Render VM_BIND documentation
244544720245 drm/i915/vm_bind: Async vm_unbind support



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/vm_bind: Add VM_BIND functionality (rev9)
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
                   ` (21 preceding siblings ...)
  (?)
@ 2022-11-07 11:21 ` Patchwork
  -1 siblings, 0 replies; 71+ messages in thread
From: Patchwork @ 2022-11-07 11:21 UTC (permalink / raw)
  To: Niranjana Vishwanathapura; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/vm_bind: Add VM_BIND functionality (rev9)
URL   : https://patchwork.freedesktop.org/series/105879/
State : warning

== Summary ==

Error: dim sparse failed
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.



^ permalink raw reply	[flat|nested] 71+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/vm_bind: Add VM_BIND functionality (rev9)
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
                   ` (22 preceding siblings ...)
  (?)
@ 2022-11-07 11:40 ` Patchwork
  -1 siblings, 0 replies; 71+ messages in thread
From: Patchwork @ 2022-11-07 11:40 UTC (permalink / raw)
  To: Niranjana Vishwanathapura; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 4495 bytes --]

== Series Details ==

Series: drm/i915/vm_bind: Add VM_BIND functionality (rev9)
URL   : https://patchwork.freedesktop.org/series/105879/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_12348 -> Patchwork_105879v9
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/index.html

Participating hosts (39 -> 27)
------------------------------

  Missing    (12): fi-rkl-11600 fi-bdw-samus bat-dg2-8 bat-dg2-9 bat-adlp-6 fi-ctg-p8600 bat-adln-1 bat-rplp-1 bat-rpls-1 bat-rpls-2 bat-dg2-11 bat-jsl-1 

Known issues
------------

  Here are the changes found in Patchwork_105879v9 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_tiled_blits@basic:
    - fi-pnv-d510:        [PASS][1] -> [SKIP][2] ([fdo#109271]) +2 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/fi-pnv-d510/igt@gem_tiled_blits@basic.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/fi-pnv-d510/igt@gem_tiled_blits@basic.html

  * igt@i915_selftest@live@execlists:
    - fi-bdw-gvtdvm:      [PASS][3] -> [INCOMPLETE][4] ([i915#2940])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/fi-bdw-gvtdvm/igt@i915_selftest@live@execlists.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/fi-bdw-gvtdvm/igt@i915_selftest@live@execlists.html

  * igt@kms_chamelium@common-hpd-after-suspend:
    - fi-hsw-4770:        NOTRUN -> [SKIP][5] ([fdo#109271] / [fdo#111827])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/fi-hsw-4770/igt@kms_chamelium@common-hpd-after-suspend.html

  * igt@runner@aborted:
    - fi-bdw-gvtdvm:      NOTRUN -> [FAIL][6] ([i915#4312])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/fi-bdw-gvtdvm/igt@runner@aborted.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@hangcheck:
    - fi-hsw-4770:        [INCOMPLETE][7] ([i915#4785]) -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#4785]: https://gitlab.freedesktop.org/drm/intel/issues/4785


Build changes
-------------

  * Linux: CI_DRM_12348 -> Patchwork_105879v9

  CI-20190529: 20190529
  CI_DRM_12348: 274249f2d91b2b43ee26d9363b0f7426c6445ba2 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7044: dbeb6f92720292f8303182a0e649284cea5b11a6 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_105879v9: 274249f2d91b2b43ee26d9363b0f7426c6445ba2 @ git://anongit.freedesktop.org/gfx-ci/linux


### Linux commits

3605637f03fd drm/i915/vm_bind: Async vm_unbind support
a4b64d07b0e3 drm/i915/vm_bind: Render VM_BIND documentation
0d9260049f18 drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode
e9aaac78570a drm/i915/vm_bind: Limit vm_bind mode to non-recoverable contexts
924c73f931e8 drm/i915/vm_bind: userptr dma-resv changes
3317504ec1d7 drm/i915/vm_bind: Handle persistent vmas in execbuf3
534980c7c27f drm/i915/vm_bind: Expose i915_request_await_bind()
bb666016bc49 drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
2eabb3dd9832 drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
ff9a6e7dd60b drm/i915/vm_bind: Use common execbuf functions in execbuf path
e348075657f5 drm/i915/vm_bind: Abstract out common execbuf functions
9a9fa330ab38 drm/i915/vm_bind: Add out fence support
14c5398b589f drm/i915/vm_bind: Support persistent vma activeness tracking
4b59949c1f82 drm/i915/vm_bind: Add support to handle object evictions
f3c79a27595c drm/i915/vm_bind: Support for VM private BOs
c0e9b4a1c59c drm/i915/vm_bind: Implement bind and unbind of object
4bf53f2ce1a2 drm/i915/vm_bind: Add support to create persistent vma
9e6f21ed8715 drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
788e89f09d86 drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
7276f31bfd1d drm/i915/vm_bind: Expose vm lookup function

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/index.html

[-- Attachment #2: Type: text/html, Size: 5386 bytes --]

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/vm_bind: Add VM_BIND functionality (rev9)
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
                   ` (23 preceding siblings ...)
  (?)
@ 2022-11-07 14:13 ` Patchwork
  -1 siblings, 0 replies; 71+ messages in thread
From: Patchwork @ 2022-11-07 14:13 UTC (permalink / raw)
  To: Niranjana Vishwanathapura; +Cc: intel-gfx

[-- Attachment #1: Type: text/plain, Size: 29077 bytes --]

== Series Details ==

Series: drm/i915/vm_bind: Add VM_BIND functionality (rev9)
URL   : https://patchwork.freedesktop.org/series/105879/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_12348_full -> Patchwork_105879v9_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Participating hosts (9 -> 9)
------------------------------

  No changes in participating hosts

Known issues
------------

  Here are the changes found in Patchwork_105879v9_full that come from known issues:

### CI changes ###

#### Possible fixes ####

  * boot:
    - shard-glk:          ([PASS][1], [PASS][2], [PASS][3], [FAIL][4], [PASS][5], [PASS][6], [PASS][7], [PASS][8], [PASS][9], [PASS][10], [PASS][11], [PASS][12], [PASS][13], [PASS][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], [PASS][19], [PASS][20], [PASS][21], [PASS][22], [PASS][23], [PASS][24], [PASS][25]) ([i915#4392]) -> ([PASS][26], [PASS][27], [PASS][28], [PASS][29], [PASS][30], [PASS][31], [PASS][32], [PASS][33], [PASS][34], [PASS][35], [PASS][36], [PASS][37], [PASS][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42], [PASS][43], [PASS][44], [PASS][45], [PASS][46], [PASS][47], [PASS][48], [PASS][49], [PASS][50])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk1/boot.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk1/boot.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk1/boot.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk1/boot.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk2/boot.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk2/boot.html
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk2/boot.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk3/boot.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk3/boot.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk3/boot.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk5/boot.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk5/boot.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk5/boot.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk6/boot.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk6/boot.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk6/boot.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk7/boot.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk7/boot.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk8/boot.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk8/boot.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk8/boot.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk8/boot.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk9/boot.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk9/boot.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk9/boot.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk9/boot.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk9/boot.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk9/boot.html
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk8/boot.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk8/boot.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk8/boot.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk7/boot.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk7/boot.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk7/boot.html
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk6/boot.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk6/boot.html
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk6/boot.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk5/boot.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk5/boot.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk5/boot.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk5/boot.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk3/boot.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk3/boot.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk3/boot.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk2/boot.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk2/boot.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk2/boot.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk1/boot.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk1/boot.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk1/boot.html
    - shard-skl:          ([PASS][51], [PASS][52], [PASS][53], [PASS][54], [PASS][55], [PASS][56], [PASS][57], [FAIL][58], [PASS][59], [PASS][60], [PASS][61], [PASS][62], [PASS][63], [PASS][64], [PASS][65], [PASS][66]) ([i915#5032]) -> ([PASS][67], [PASS][68], [PASS][69], [PASS][70], [PASS][71], [PASS][72], [PASS][73], [PASS][74], [PASS][75], [PASS][76], [PASS][77], [PASS][78], [PASS][79], [PASS][80], [PASS][81], [PASS][82], [PASS][83])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl10/boot.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl10/boot.html
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl10/boot.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl3/boot.html
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl3/boot.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl4/boot.html
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl4/boot.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl5/boot.html
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl5/boot.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl5/boot.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl6/boot.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl6/boot.html
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl7/boot.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl7/boot.html
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl9/boot.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl9/boot.html
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl6/boot.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl9/boot.html
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl9/boot.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl7/boot.html
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl7/boot.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl6/boot.html
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl5/boot.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl5/boot.html
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl4/boot.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl4/boot.html
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl4/boot.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl3/boot.html
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl3/boot.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl2/boot.html
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl2/boot.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl10/boot.html
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl10/boot.html

  

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_balancer@parallel-keep-in-fence:
    - shard-iclb:         [PASS][84] -> [SKIP][85] ([i915#4525]) +3 similar issues
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-iclb2/igt@gem_exec_balancer@parallel-keep-in-fence.html
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb7/igt@gem_exec_balancer@parallel-keep-in-fence.html

  * igt@gem_exec_fair@basic-flow@rcs0:
    - shard-tglb:         [PASS][86] -> [FAIL][87] ([i915#2842]) +1 similar issue
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-tglb3/igt@gem_exec_fair@basic-flow@rcs0.html
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-tglb5/igt@gem_exec_fair@basic-flow@rcs0.html

  * igt@gem_lmem_swapping@basic:
    - shard-skl:          NOTRUN -> [SKIP][88] ([fdo#109271] / [i915#4613]) +6 similar issues
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl4/igt@gem_lmem_swapping@basic.html

  * igt@gem_lmem_swapping@parallel-random-verify:
    - shard-glk:          NOTRUN -> [SKIP][89] ([fdo#109271] / [i915#4613])
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk7/igt@gem_lmem_swapping@parallel-random-verify.html

  * igt@gem_render_copy@yf-tiled-mc-ccs-to-vebox-yf-tiled:
    - shard-skl:          NOTRUN -> [SKIP][90] ([fdo#109271]) +317 similar issues
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl4/igt@gem_render_copy@yf-tiled-mc-ccs-to-vebox-yf-tiled.html

  * igt@gem_userptr_blits@input-checking:
    - shard-glk:          NOTRUN -> [DMESG-WARN][91] ([i915#4991])
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk7/igt@gem_userptr_blits@input-checking.html
    - shard-skl:          NOTRUN -> [DMESG-WARN][92] ([i915#4991])
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl5/igt@gem_userptr_blits@input-checking.html

  * igt@i915_module_load@load:
    - shard-skl:          NOTRUN -> [SKIP][93] ([fdo#109271] / [i915#6227])
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl2/igt@i915_module_load@load.html

  * igt@i915_suspend@fence-restore-tiled2untiled:
    - shard-apl:          [PASS][94] -> [DMESG-WARN][95] ([i915#180])
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-apl3/igt@i915_suspend@fence-restore-tiled2untiled.html
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-apl8/igt@i915_suspend@fence-restore-tiled2untiled.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-0-async-flip:
    - shard-skl:          NOTRUN -> [FAIL][96] ([i915#3763]) +1 similar issue
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl4/igt@kms_big_fb@y-tiled-max-hw-stride-64bpp-rotate-0-async-flip.html

  * igt@kms_ccs@pipe-a-crc-sprite-planes-basic-y_tiled_gen12_rc_ccs_cc:
    - shard-skl:          NOTRUN -> [SKIP][97] ([fdo#109271] / [i915#3886]) +12 similar issues
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl7/igt@kms_ccs@pipe-a-crc-sprite-planes-basic-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-b-random-ccs-data-y_tiled_gen12_rc_ccs_cc:
    - shard-glk:          NOTRUN -> [SKIP][98] ([fdo#109271] / [i915#3886])
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk7/igt@kms_ccs@pipe-b-random-ccs-data-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_chamelium@hdmi-hpd-with-enabled-mode:
    - shard-skl:          NOTRUN -> [SKIP][99] ([fdo#109271] / [fdo#111827]) +13 similar issues
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl7/igt@kms_chamelium@hdmi-hpd-with-enabled-mode.html

  * igt@kms_chamelium@hdmi-mode-timings:
    - shard-glk:          NOTRUN -> [SKIP][100] ([fdo#109271] / [fdo#111827])
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk7/igt@kms_chamelium@hdmi-mode-timings.html

  * igt@kms_cursor_legacy@2x-long-flip-vs-cursor-legacy:
    - shard-glk:          [PASS][101] -> [FAIL][102] ([i915#72])
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk9/igt@kms_cursor_legacy@2x-long-flip-vs-cursor-legacy.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk2/igt@kms_cursor_legacy@2x-long-flip-vs-cursor-legacy.html

  * igt@kms_flip@flip-vs-expired-vblank@a-edp1:
    - shard-skl:          [PASS][103] -> [FAIL][104] ([i915#79])
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl6/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl6/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html

  * igt@kms_flip_scaled_crc@flip-32bpp-4tile-to-32bpp-4tiledg2rcccs-downscaling@pipe-a-valid-mode:
    - shard-iclb:         NOTRUN -> [SKIP][105] ([i915#2587] / [i915#2672]) +1 similar issue
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb7/igt@kms_flip_scaled_crc@flip-32bpp-4tile-to-32bpp-4tiledg2rcccs-downscaling@pipe-a-valid-mode.html

  * igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytileccs-downscaling@pipe-a-default-mode:
    - shard-iclb:         NOTRUN -> [SKIP][106] ([i915#3555]) +1 similar issue
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb2/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytileccs-downscaling@pipe-a-default-mode.html

  * igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs-downscaling@pipe-a-default-mode:
    - shard-iclb:         NOTRUN -> [SKIP][107] ([i915#2672] / [i915#3555])
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb3/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs-downscaling@pipe-a-default-mode.html

  * igt@kms_flip_scaled_crc@flip-64bpp-4tile-to-16bpp-4tile-downscaling@pipe-a-default-mode:
    - shard-iclb:         NOTRUN -> [SKIP][108] ([i915#2672]) +1 similar issue
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb2/igt@kms_flip_scaled_crc@flip-64bpp-4tile-to-16bpp-4tile-downscaling@pipe-a-default-mode.html

  * igt@kms_frontbuffer_tracking@psr-2p-primscrn-spr-indfb-draw-mmap-gtt:
    - shard-glk:          NOTRUN -> [SKIP][109] ([fdo#109271]) +11 similar issues
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk7/igt@kms_frontbuffer_tracking@psr-2p-primscrn-spr-indfb-draw-mmap-gtt.html

  * igt@kms_plane_alpha_blend@constant-alpha-min@pipe-c-edp-1:
    - shard-skl:          NOTRUN -> [FAIL][110] ([i915#4573]) +8 similar issues
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl4/igt@kms_plane_alpha_blend@constant-alpha-min@pipe-c-edp-1.html

  * igt@kms_psr2_su@page_flip-p010@pipe-b-edp-1:
    - shard-iclb:         NOTRUN -> [FAIL][111] ([i915#5939]) +2 similar issues
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb2/igt@kms_psr2_su@page_flip-p010@pipe-b-edp-1.html

  * igt@kms_psr2_su@page_flip-xrgb8888:
    - shard-skl:          NOTRUN -> [SKIP][112] ([fdo#109271] / [i915#658]) +6 similar issues
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl7/igt@kms_psr2_su@page_flip-xrgb8888.html

  * igt@kms_psr@psr2_sprite_mmap_cpu:
    - shard-iclb:         [PASS][113] -> [SKIP][114] ([fdo#109441]) +3 similar issues
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-iclb2/igt@kms_psr@psr2_sprite_mmap_cpu.html
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb7/igt@kms_psr@psr2_sprite_mmap_cpu.html

  * igt@kms_vblank@pipe-c-accuracy-idle:
    - shard-skl:          [PASS][115] -> [FAIL][116] ([i915#43])
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl10/igt@kms_vblank@pipe-c-accuracy-idle.html
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl3/igt@kms_vblank@pipe-c-accuracy-idle.html

  * igt@kms_vblank@pipe-d-wait-idle:
    - shard-skl:          NOTRUN -> [SKIP][117] ([fdo#109271] / [i915#533])
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl4/igt@kms_vblank@pipe-d-wait-idle.html

  * igt@kms_writeback@writeback-invalid-parameters:
    - shard-skl:          NOTRUN -> [SKIP][118] ([fdo#109271] / [i915#2437])
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl7/igt@kms_writeback@writeback-invalid-parameters.html

  * igt@perf@polling-small-buf:
    - shard-skl:          NOTRUN -> [FAIL][119] ([i915#1722])
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl9/igt@perf@polling-small-buf.html

  * igt@sysfs_clients@fair-3:
    - shard-skl:          NOTRUN -> [SKIP][120] ([fdo#109271] / [i915#2994]) +3 similar issues
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl4/igt@sysfs_clients@fair-3.html

  
#### Possible fixes ####

  * igt@gem_ctx_exec@basic-nohangcheck:
    - shard-tglb:         [FAIL][121] ([i915#6268]) -> [PASS][122]
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-tglb2/igt@gem_ctx_exec@basic-nohangcheck.html
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-tglb7/igt@gem_ctx_exec@basic-nohangcheck.html

  * igt@gem_exec_balancer@parallel-out-fence:
    - shard-iclb:         [SKIP][123] ([i915#4525]) -> [PASS][124] +1 similar issue
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-iclb7/igt@gem_exec_balancer@parallel-out-fence.html
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb2/igt@gem_exec_balancer@parallel-out-fence.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
    - shard-apl:          [FAIL][125] ([i915#2842]) -> [PASS][126] +1 similar issue
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-apl2/igt@gem_exec_fair@basic-none-solo@rcs0.html
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-apl3/igt@gem_exec_fair@basic-none-solo@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs0:
    - shard-glk:          [FAIL][127] ([i915#2842]) -> [PASS][128]
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk6/igt@gem_exec_fair@basic-none@vcs0.html
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk7/igt@gem_exec_fair@basic-none@vcs0.html

  * igt@i915_pm_dc@dc6-dpms:
    - shard-iclb:         [FAIL][129] ([i915#3989] / [i915#454]) -> [PASS][130]
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-iclb3/igt@i915_pm_dc@dc6-dpms.html
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb6/igt@i915_pm_dc@dc6-dpms.html

  * igt@i915_selftest@live@gt_heartbeat:
    - shard-skl:          [DMESG-FAIL][131] ([i915#5334]) -> [PASS][132]
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl4/igt@i915_selftest@live@gt_heartbeat.html
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl5/igt@i915_selftest@live@gt_heartbeat.html

  * igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions-varying-size:
    - shard-glk:          [FAIL][133] ([i915#2346]) -> [PASS][134]
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk5/igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions-varying-size.html
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk1/igt@kms_cursor_legacy@flip-vs-cursor@atomic-transitions-varying-size.html

  * igt@kms_flip@flip-vs-expired-vblank@c-edp1:
    - shard-skl:          [FAIL][135] ([i915#79]) -> [PASS][136] +1 similar issue
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl6/igt@kms_flip@flip-vs-expired-vblank@c-edp1.html
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl6/igt@kms_flip@flip-vs-expired-vblank@c-edp1.html

  * igt@kms_plane_scaling@plane-scaler-with-clipping-clamping-pixel-formats@pipe-b-edp-1:
    - shard-iclb:         [SKIP][137] ([i915#5176]) -> [PASS][138] +1 similar issue
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-iclb3/igt@kms_plane_scaling@plane-scaler-with-clipping-clamping-pixel-formats@pipe-b-edp-1.html
   [138]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb6/igt@kms_plane_scaling@plane-scaler-with-clipping-clamping-pixel-formats@pipe-b-edp-1.html

  * igt@kms_psr@psr2_sprite_blt:
    - shard-iclb:         [SKIP][139] ([fdo#109441]) -> [PASS][140]
   [139]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-iclb5/igt@kms_psr@psr2_sprite_blt.html
   [140]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb2/igt@kms_psr@psr2_sprite_blt.html

  * igt@kms_psr_stress_test@flip-primary-invalidate-overlay:
    - shard-tglb:         [SKIP][141] ([i915#5519]) -> [PASS][142]
   [141]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-tglb6/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html
   [142]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-tglb2/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html
    - shard-iclb:         [SKIP][143] ([i915#5519]) -> [PASS][144]
   [143]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-iclb1/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html
   [144]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb8/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html

  * igt@kms_sysfs_edid_timing:
    - shard-skl:          [FAIL][145] ([i915#6493]) -> [PASS][146]
   [145]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-skl3/igt@kms_sysfs_edid_timing.html
   [146]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-skl7/igt@kms_sysfs_edid_timing.html

  * igt@sysfs_heartbeat_interval@nopreempt@vcs0:
    - shard-tglb:         [FAIL][147] ([i915#6015]) -> [PASS][148]
   [147]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-tglb1/igt@sysfs_heartbeat_interval@nopreempt@vcs0.html
   [148]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-tglb7/igt@sysfs_heartbeat_interval@nopreempt@vcs0.html

  
#### Warnings ####

  * igt@gem_pread@exhaustion:
    - shard-apl:          [INCOMPLETE][149] ([i915#7248]) -> [WARN][150] ([i915#2658]) +1 similar issue
   [149]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-apl8/igt@gem_pread@exhaustion.html
   [150]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-apl8/igt@gem_pread@exhaustion.html
    - shard-glk:          [INCOMPLETE][151] ([i915#7248]) -> [WARN][152] ([i915#2658])
   [151]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-glk1/igt@gem_pread@exhaustion.html
   [152]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-glk5/igt@gem_pread@exhaustion.html

  * igt@gem_pwrite@basic-exhaustion:
    - shard-tglb:         [INCOMPLETE][153] ([i915#7248]) -> [WARN][154] ([i915#2658])
   [153]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-tglb2/igt@gem_pwrite@basic-exhaustion.html
   [154]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-tglb1/igt@gem_pwrite@basic-exhaustion.html

  * igt@kms_psr2_sf@cursor-plane-update-sf:
    - shard-iclb:         [SKIP][155] ([fdo#111068] / [i915#658]) -> [SKIP][156] ([i915#2920])
   [155]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-iclb7/igt@kms_psr2_sf@cursor-plane-update-sf.html
   [156]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-iclb2/igt@kms_psr2_sf@cursor-plane-update-sf.html

  * igt@runner@aborted:
    - shard-apl:          ([FAIL][157], [FAIL][158], [FAIL][159]) ([fdo#109271] / [i915#3002] / [i915#4312]) -> ([FAIL][160], [FAIL][161], [FAIL][162], [FAIL][163]) ([fdo#109271] / [i915#180] / [i915#3002] / [i915#4312])
   [157]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-apl8/igt@runner@aborted.html
   [158]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-apl6/igt@runner@aborted.html
   [159]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12348/shard-apl3/igt@runner@aborted.html
   [160]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-apl3/igt@runner@aborted.html
   [161]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-apl8/igt@runner@aborted.html
   [162]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-apl7/igt@runner@aborted.html
   [163]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/shard-apl7/igt@runner@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1722]: https://gitlab.freedesktop.org/drm/intel/issues/1722
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2437]: https://gitlab.freedesktop.org/drm/intel/issues/2437
  [i915#2587]: https://gitlab.freedesktop.org/drm/intel/issues/2587
  [i915#2658]: https://gitlab.freedesktop.org/drm/intel/issues/2658
  [i915#2672]: https://gitlab.freedesktop.org/drm/intel/issues/2672
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2920]: https://gitlab.freedesktop.org/drm/intel/issues/2920
  [i915#2994]: https://gitlab.freedesktop.org/drm/intel/issues/2994
  [i915#3002]: https://gitlab.freedesktop.org/drm/intel/issues/3002
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3763]: https://gitlab.freedesktop.org/drm/intel/issues/3763
  [i915#3886]: https://gitlab.freedesktop.org/drm/intel/issues/3886
  [i915#3989]: https://gitlab.freedesktop.org/drm/intel/issues/3989
  [i915#43]: https://gitlab.freedesktop.org/drm/intel/issues/43
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#4392]: https://gitlab.freedesktop.org/drm/intel/issues/4392
  [i915#4525]: https://gitlab.freedesktop.org/drm/intel/issues/4525
  [i915#454]: https://gitlab.freedesktop.org/drm/intel/issues/454
  [i915#4573]: https://gitlab.freedesktop.org/drm/intel/issues/4573
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4991]: https://gitlab.freedesktop.org/drm/intel/issues/4991
  [i915#5032]: https://gitlab.freedesktop.org/drm/intel/issues/5032
  [i915#5176]: https://gitlab.freedesktop.org/drm/intel/issues/5176
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#5334]: https://gitlab.freedesktop.org/drm/intel/issues/5334
  [i915#5519]: https://gitlab.freedesktop.org/drm/intel/issues/5519
  [i915#5939]: https://gitlab.freedesktop.org/drm/intel/issues/5939
  [i915#6015]: https://gitlab.freedesktop.org/drm/intel/issues/6015
  [i915#6227]: https://gitlab.freedesktop.org/drm/intel/issues/6227
  [i915#6268]: https://gitlab.freedesktop.org/drm/intel/issues/6268
  [i915#6493]: https://gitlab.freedesktop.org/drm/intel/issues/6493
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#72]: https://gitlab.freedesktop.org/drm/intel/issues/72
  [i915#7248]: https://gitlab.freedesktop.org/drm/intel/issues/7248
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79


Build changes
-------------

  * Linux: CI_DRM_12348 -> Patchwork_105879v9

  CI-20190529: 20190529
  CI_DRM_12348: 274249f2d91b2b43ee26d9363b0f7426c6445ba2 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_7044: dbeb6f92720292f8303182a0e649284cea5b11a6 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_105879v9: 274249f2d91b2b43ee26d9363b0f7426c6445ba2 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_105879v9/index.html


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 20/20] drm/i915/vm_bind: Async vm_unbind support
  2022-11-07  8:52   ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-08  1:39     ` Zanoni, Paulo R
  -1 siblings, 0 replies; 71+ messages in thread
From: Zanoni, Paulo R @ 2022-11-08  1:39 UTC (permalink / raw)
  To: dri-devel, Vishwanathapura, Niranjana, intel-gfx
  Cc: Brost, Matthew, andi.shyti, Ursulin,  Tvrtko, Nikula, Jani,
	Landwerlin, Lionel G, Hellstrom, Thomas, Auld, Matthew, jason,
	Vetter, Daniel, christian.koenig

On Mon, 2022-11-07 at 00:52 -0800, Niranjana Vishwanathapura wrote:
> Asynchronously unbind the vma upon vm_unbind call.
> Fall back to synchronous unbind if backend doesn't support
> async unbind or if async unbind fails.
> 
> No need for vm_unbind out fence support as i915 will internally
> handle all sequencing and user need not try to sequence any
> operation with the unbind completion.

Can you please provide some more details on how this works from the
user space point of view? I want to be able to know with 100% certainty
if an unbind has already happened, so I can reuse that vma or whatever
else I may decide to do. I see the interface does not provide any sort
of drm_syncobjs for me to wait on the async unbind. So, when does the
unbind really happen? When can I be sure it's past so I can do stuff
with it? Why would you provide an async ioctl and provide no means for
user space to wait on it?

Thanks,
Paulo

> 
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_vma.c | 51 ++++++++++++++++++++++++++++++---
>  drivers/gpu/drm/i915/i915_vma.h |  1 +
>  2 files changed, 48 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 08218e3a2f12..03c966fad87b 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -42,6 +42,8 @@
>  #include "i915_vma.h"
>  #include "i915_vma_resource.h"
>  
> 
> 
> 
> +static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma);
> +
>  static inline void assert_vma_held_evict(const struct i915_vma *vma)
>  {
>  	/*
> @@ -1711,7 +1713,7 @@ void i915_vma_reopen(struct i915_vma *vma)
>  	spin_unlock_irq(&gt->closed_lock);
>  }
>  
> 
> 
> 
> -static void force_unbind(struct i915_vma *vma)
> +static void force_unbind(struct i915_vma *vma, bool async)
>  {
>  	if (!drm_mm_node_allocated(&vma->node))
>  		return;
> @@ -1725,7 +1727,21 @@ static void force_unbind(struct i915_vma *vma)
>  		i915_vma_set_purged(vma);
>  
> 
> 
> 
>  	atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
> -	WARN_ON(__i915_vma_unbind(vma));
> +	if (async) {
> +		struct dma_fence *fence;
> +
> +		fence = __i915_vma_unbind_async(vma);
> +		if (IS_ERR_OR_NULL(fence)) {
> +			async = false;
> +		} else {
> +			dma_resv_add_fence(vma->obj->base.resv, fence,
> +					   DMA_RESV_USAGE_READ);
> +			dma_fence_put(fence);
> +		}
> +	}
> +
> +	if (!async)
> +		WARN_ON(__i915_vma_unbind(vma));
>  	GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
>  }
>  
> 
> 
> 
> @@ -1785,7 +1801,7 @@ void i915_vma_destroy_locked(struct i915_vma *vma)
>  {
>  	lockdep_assert_held(&vma->vm->mutex);
>  
> 
> 
> 
> -	force_unbind(vma);
> +	force_unbind(vma, false);
>  	list_del_init(&vma->vm_link);
>  	release_references(vma, vma->vm->gt, false);
>  }
> @@ -1796,7 +1812,34 @@ void i915_vma_destroy(struct i915_vma *vma)
>  	bool vm_ddestroy;
>  
> 
> 
> 
>  	mutex_lock(&vma->vm->mutex);
> -	force_unbind(vma);
> +	force_unbind(vma, false);
> +	list_del_init(&vma->vm_link);
> +	vm_ddestroy = vma->vm_ddestroy;
> +	vma->vm_ddestroy = false;
> +
> +	/* vma->vm may be freed when releasing vma->vm->mutex. */
> +	gt = vma->vm->gt;
> +	mutex_unlock(&vma->vm->mutex);
> +	release_references(vma, gt, vm_ddestroy);
> +}
> +
> +void i915_vma_destroy_async(struct i915_vma *vma)
> +{
> +	bool vm_ddestroy, async = vma->obj->mm.rsgt;
> +	struct intel_gt *gt;
> +
> +	if (dma_resv_reserve_fences(vma->obj->base.resv, 1))
> +		async = false;
> +
> +	mutex_lock(&vma->vm->mutex);
> +	/*
> +	 * Ensure any asynchronous binding is complete while using
> +	 * async unbind as we will be releasing the vma here.
> +	 */
> +	if (async && i915_active_wait(&vma->active))
> +		async = false;
> +
> +	force_unbind(vma, async);
>  	list_del_init(&vma->vm_link);
>  	vm_ddestroy = vma->vm_ddestroy;
>  	vma->vm_ddestroy = false;
> diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
> index 737ef310d046..25f15965dab8 100644
> --- a/drivers/gpu/drm/i915/i915_vma.h
> +++ b/drivers/gpu/drm/i915/i915_vma.h
> @@ -272,6 +272,7 @@ void i915_vma_reopen(struct i915_vma *vma);
>  
> 
> 
> 
>  void i915_vma_destroy_locked(struct i915_vma *vma);
>  void i915_vma_destroy(struct i915_vma *vma);
> +void i915_vma_destroy_async(struct i915_vma *vma);
>  
> 
> 
> 
>  #define assert_vma_held(vma) dma_resv_assert_held((vma)->obj->base.resv)
>  
> 
> 
> 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 20/20] drm/i915/vm_bind: Async vm_unbind support
  2022-11-08  1:39     ` [Intel-gfx] " Zanoni, Paulo R
@ 2022-11-08 15:46       ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-08 15:46 UTC (permalink / raw)
  To: Zanoni, Paulo R
  Cc: Brost, Matthew, andi.shyti, Landwerlin, Lionel G, Ursulin,
	 Tvrtko, Nikula, Jani, intel-gfx, dri-devel, Hellstrom, Thomas,
	Auld, Matthew, jason, Vetter, Daniel, christian.koenig

On Mon, Nov 07, 2022 at 05:39:34PM -0800, Zanoni, Paulo R wrote:
>On Mon, 2022-11-07 at 00:52 -0800, Niranjana Vishwanathapura wrote:
>> Asynchronously unbind the vma upon vm_unbind call.
>> Fall back to synchronous unbind if backend doesn't support
>> async unbind or if async unbind fails.
>>
>> No need for vm_unbind out fence support as i915 will internally
>> handle all sequencing and user need not try to sequence any
>> operation with the unbind completion.
>
>Can you please provide some more details on how this works from the
>user space point of view? I want to be able to know with 100% certainty
>if an unbind has already happened, so I can reuse that vma or whatever
>else I may decide to do. I see the interface does not provide any sort
>of drm_syncobjs for me to wait on the async unbind. So, when does the
>unbind really happen? When can I be sure it's past so I can do stuff
>with it? Why would you provide an async ioctl and provide no means for
>user space to wait on it?
>

Paulo,
The asynchronous nature of vm_unbind is not visible to user space. From the
user space point of view it behaves synchronously: the assigned virtual
address can be reused immediately after the vm_unbind ioctl returns. The i915
driver ensures that the unbind completes before any rebind at that virtual
address. So, unless there is a user programming error where the GPU still
accesses the buffer after the vm_unbind, everything works as expected.
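
As a rough sketch of the expected usage pattern (the DRM_IOCTL_I915_GEM_VM_BIND/
VM_UNBIND names and the struct fields below follow the uapi proposed in this
series and are only illustrative, not a final interface):

#include <errno.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static int rebind_range(int fd, __u32 vm_id, __u64 va, __u64 size, __u32 new_bo)
{
	struct drm_i915_gem_vm_unbind unbind = {
		.vm_id = vm_id, .start = va, .length = size,
	};
	struct drm_i915_gem_vm_bind bind = {
		.vm_id = vm_id, .handle = new_bo,
		.start = va, .offset = 0, .length = size,
	};

	/* The unbind may be carried out asynchronously inside i915 ... */
	if (ioctl(fd, DRM_IOCTL_I915_GEM_VM_UNBIND, &unbind))
		return -errno;

	/* ... but the VA range can be reused immediately; i915 orders the
	 * new bind after the pending unbind at this address.
	 */
	if (ioctl(fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind))
		return -errno;

	return 0;
}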

Regards,
Niranjana

>Thanks,
>Paulo
>
>>
>> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_vma.c | 51 ++++++++++++++++++++++++++++++---
>>  drivers/gpu/drm/i915/i915_vma.h |  1 +
>>  2 files changed, 48 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
>> index 08218e3a2f12..03c966fad87b 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.c
>> +++ b/drivers/gpu/drm/i915/i915_vma.c
>> @@ -42,6 +42,8 @@
>>  #include "i915_vma.h"
>>  #include "i915_vma_resource.h"
>>  
>>
>>
>>
>> +static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma);
>> +
>>  static inline void assert_vma_held_evict(const struct i915_vma *vma)
>>  {
>>  	/*
>> @@ -1711,7 +1713,7 @@ void i915_vma_reopen(struct i915_vma *vma)
>>  	spin_unlock_irq(&gt->closed_lock);
>>  }
>>  
>>
>>
>>
>> -static void force_unbind(struct i915_vma *vma)
>> +static void force_unbind(struct i915_vma *vma, bool async)
>>  {
>>  	if (!drm_mm_node_allocated(&vma->node))
>>  		return;
>> @@ -1725,7 +1727,21 @@ static void force_unbind(struct i915_vma *vma)
>>  		i915_vma_set_purged(vma);
>>  
>>
>>
>>
>>  	atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
>> -	WARN_ON(__i915_vma_unbind(vma));
>> +	if (async) {
>> +		struct dma_fence *fence;
>> +
>> +		fence = __i915_vma_unbind_async(vma);
>> +		if (IS_ERR_OR_NULL(fence)) {
>> +			async = false;
>> +		} else {
>> +			dma_resv_add_fence(vma->obj->base.resv, fence,
>> +					   DMA_RESV_USAGE_READ);
>> +			dma_fence_put(fence);
>> +		}
>> +	}
>> +
>> +	if (!async)
>> +		WARN_ON(__i915_vma_unbind(vma));
>>  	GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
>>  }
>>  
>>
>>
>>
>> @@ -1785,7 +1801,7 @@ void i915_vma_destroy_locked(struct i915_vma *vma)
>>  {
>>  	lockdep_assert_held(&vma->vm->mutex);
>>  
>>
>>
>>
>> -	force_unbind(vma);
>> +	force_unbind(vma, false);
>>  	list_del_init(&vma->vm_link);
>>  	release_references(vma, vma->vm->gt, false);
>>  }
>> @@ -1796,7 +1812,34 @@ void i915_vma_destroy(struct i915_vma *vma)
>>  	bool vm_ddestroy;
>>  
>>
>>
>>
>>  	mutex_lock(&vma->vm->mutex);
>> -	force_unbind(vma);
>> +	force_unbind(vma, false);
>> +	list_del_init(&vma->vm_link);
>> +	vm_ddestroy = vma->vm_ddestroy;
>> +	vma->vm_ddestroy = false;
>> +
>> +	/* vma->vm may be freed when releasing vma->vm->mutex. */
>> +	gt = vma->vm->gt;
>> +	mutex_unlock(&vma->vm->mutex);
>> +	release_references(vma, gt, vm_ddestroy);
>> +}
>> +
>> +void i915_vma_destroy_async(struct i915_vma *vma)
>> +{
>> +	bool vm_ddestroy, async = vma->obj->mm.rsgt;
>> +	struct intel_gt *gt;
>> +
>> +	if (dma_resv_reserve_fences(vma->obj->base.resv, 1))
>> +		async = false;
>> +
>> +	mutex_lock(&vma->vm->mutex);
>> +	/*
>> +	 * Ensure any asynchronous binding is complete while using
>> +	 * async unbind as we will be releasing the vma here.
>> +	 */
>> +	if (async && i915_active_wait(&vma->active))
>> +		async = false;
>> +
>> +	force_unbind(vma, async);
>>  	list_del_init(&vma->vm_link);
>>  	vm_ddestroy = vma->vm_ddestroy;
>>  	vma->vm_ddestroy = false;
>> diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
>> index 737ef310d046..25f15965dab8 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.h
>> +++ b/drivers/gpu/drm/i915/i915_vma.h
>> @@ -272,6 +272,7 @@ void i915_vma_reopen(struct i915_vma *vma);
>>  
>>
>>
>>
>>  void i915_vma_destroy_locked(struct i915_vma *vma);
>>  void i915_vma_destroy(struct i915_vma *vma);
>> +void i915_vma_destroy_async(struct i915_vma *vma);
>>  
>>
>>
>>
>>  #define assert_vma_held(vma) dma_resv_assert_held((vma)->obj->base.resv)
>>  
>>
>>
>>
>

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 20/20] drm/i915/vm_bind: Async vm_unbind support
  2022-11-07  8:52   ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-09 17:52     ` Matthew Auld
  -1 siblings, 0 replies; 71+ messages in thread
From: Matthew Auld @ 2022-11-09 17:52 UTC (permalink / raw)
  To: Niranjana Vishwanathapura
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	intel-gfx, dri-devel, thomas.hellstrom, lionel.g.landwerlin,
	jason, andi.shyti, daniel.vetter, christian.koenig, matthew.auld

On Mon, 7 Nov 2022 at 08:53, Niranjana Vishwanathapura
<niranjana.vishwanathapura@intel.com> wrote:
>
> Asynchronously unbind the vma upon vm_unbind call.
> Fall back to synchronous unbind if backend doesn't support
> async unbind or if async unbind fails.
>
> No need for vm_unbind out fence support as i915 will internally
> handle all sequencing and user need not try to sequence any
> operation with the unbind completion.
>
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_vma.c | 51 ++++++++++++++++++++++++++++++---
>  drivers/gpu/drm/i915/i915_vma.h |  1 +
>  2 files changed, 48 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 08218e3a2f12..03c966fad87b 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -42,6 +42,8 @@
>  #include "i915_vma.h"
>  #include "i915_vma_resource.h"
>
> +static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma);
> +
>  static inline void assert_vma_held_evict(const struct i915_vma *vma)
>  {
>         /*
> @@ -1711,7 +1713,7 @@ void i915_vma_reopen(struct i915_vma *vma)
>         spin_unlock_irq(&gt->closed_lock);
>  }
>
> -static void force_unbind(struct i915_vma *vma)
> +static void force_unbind(struct i915_vma *vma, bool async)
>  {
>         if (!drm_mm_node_allocated(&vma->node))
>                 return;
> @@ -1725,7 +1727,21 @@ static void force_unbind(struct i915_vma *vma)
>                 i915_vma_set_purged(vma);
>
>         atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
> -       WARN_ON(__i915_vma_unbind(vma));
> +       if (async) {
> +               struct dma_fence *fence;
> +
> +               fence = __i915_vma_unbind_async(vma);
> +               if (IS_ERR_OR_NULL(fence)) {
> +                       async = false;
> +               } else {
> +                       dma_resv_add_fence(vma->obj->base.resv, fence,
> +                                          DMA_RESV_USAGE_READ);
> +                       dma_fence_put(fence);
> +               }
> +       }
> +
> +       if (!async)
> +               WARN_ON(__i915_vma_unbind(vma));
>         GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
>  }
>
> @@ -1785,7 +1801,7 @@ void i915_vma_destroy_locked(struct i915_vma *vma)
>  {
>         lockdep_assert_held(&vma->vm->mutex);
>
> -       force_unbind(vma);
> +       force_unbind(vma, false);
>         list_del_init(&vma->vm_link);
>         release_references(vma, vma->vm->gt, false);
>  }
> @@ -1796,7 +1812,34 @@ void i915_vma_destroy(struct i915_vma *vma)
>         bool vm_ddestroy;
>
>         mutex_lock(&vma->vm->mutex);
> -       force_unbind(vma);
> +       force_unbind(vma, false);
> +       list_del_init(&vma->vm_link);
> +       vm_ddestroy = vma->vm_ddestroy;
> +       vma->vm_ddestroy = false;
> +
> +       /* vma->vm may be freed when releasing vma->vm->mutex. */
> +       gt = vma->vm->gt;
> +       mutex_unlock(&vma->vm->mutex);
> +       release_references(vma, gt, vm_ddestroy);
> +}
> +
> +void i915_vma_destroy_async(struct i915_vma *vma)

Where are we calling this? I can't find it.

> +{
> +       bool vm_ddestroy, async = vma->obj->mm.rsgt;
> +       struct intel_gt *gt;
> +
> +       if (dma_resv_reserve_fences(vma->obj->base.resv, 1))
> +               async = false;
> +
> +       mutex_lock(&vma->vm->mutex);
> +       /*
> +        * Ensure any asynchronous binding is complete while using
> +        * async unbind as we will be releasing the vma here.
> +        */
> +       if (async && i915_active_wait(&vma->active))
> +               async = false;
> +
> +       force_unbind(vma, async);
>         list_del_init(&vma->vm_link);
>         vm_ddestroy = vma->vm_ddestroy;
>         vma->vm_ddestroy = false;
> diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
> index 737ef310d046..25f15965dab8 100644
> --- a/drivers/gpu/drm/i915/i915_vma.h
> +++ b/drivers/gpu/drm/i915/i915_vma.h
> @@ -272,6 +272,7 @@ void i915_vma_reopen(struct i915_vma *vma);
>
>  void i915_vma_destroy_locked(struct i915_vma *vma);
>  void i915_vma_destroy(struct i915_vma *vma);
> +void i915_vma_destroy_async(struct i915_vma *vma);
>
>  #define assert_vma_held(vma) dma_resv_assert_held((vma)->obj->base.resv)
>
> --
> 2.21.0.rc0.32.g243a4c7e27
>

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 20/20] drm/i915/vm_bind: Async vm_unbind support
  2022-11-09 17:52     ` [Intel-gfx] " Matthew Auld
@ 2022-11-09 20:11       ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-09 20:11 UTC (permalink / raw)
  To: Matthew Auld
  Cc: matthew.brost, paulo.r.zanoni, tvrtko.ursulin, jani.nikula,
	intel-gfx, dri-devel, thomas.hellstrom, lionel.g.landwerlin,
	jason, andi.shyti, daniel.vetter, christian.koenig, matthew.auld

On Wed, Nov 09, 2022 at 05:52:54PM +0000, Matthew Auld wrote:
>On Mon, 7 Nov 2022 at 08:53, Niranjana Vishwanathapura
><niranjana.vishwanathapura@intel.com> wrote:
>>
>> Asynchronously unbind the vma upon vm_unbind call.
>> Fall back to synchronous unbind if backend doesn't support
>> async unbind or if async unbind fails.
>>
>> No need for vm_unbind out fence support as i915 will internally
>> handle all sequencing and user need not try to sequence any
>> operation with the unbind completion.
>>
>> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_vma.c | 51 ++++++++++++++++++++++++++++++---
>>  drivers/gpu/drm/i915/i915_vma.h |  1 +
>>  2 files changed, 48 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
>> index 08218e3a2f12..03c966fad87b 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.c
>> +++ b/drivers/gpu/drm/i915/i915_vma.c
>> @@ -42,6 +42,8 @@
>>  #include "i915_vma.h"
>>  #include "i915_vma_resource.h"
>>
>> +static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma);
>> +
>>  static inline void assert_vma_held_evict(const struct i915_vma *vma)
>>  {
>>         /*
>> @@ -1711,7 +1713,7 @@ void i915_vma_reopen(struct i915_vma *vma)
>>         spin_unlock_irq(&gt->closed_lock);
>>  }
>>
>> -static void force_unbind(struct i915_vma *vma)
>> +static void force_unbind(struct i915_vma *vma, bool async)
>>  {
>>         if (!drm_mm_node_allocated(&vma->node))
>>                 return;
>> @@ -1725,7 +1727,21 @@ static void force_unbind(struct i915_vma *vma)
>>                 i915_vma_set_purged(vma);
>>
>>         atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
>> -       WARN_ON(__i915_vma_unbind(vma));
>> +       if (async) {
>> +               struct dma_fence *fence;
>> +
>> +               fence = __i915_vma_unbind_async(vma);
>> +               if (IS_ERR_OR_NULL(fence)) {
>> +                       async = false;
>> +               } else {
>> +                       dma_resv_add_fence(vma->obj->base.resv, fence,
>> +                                          DMA_RESV_USAGE_READ);
>> +                       dma_fence_put(fence);
>> +               }
>> +       }
>> +
>> +       if (!async)
>> +               WARN_ON(__i915_vma_unbind(vma));
>>         GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
>>  }
>>
>> @@ -1785,7 +1801,7 @@ void i915_vma_destroy_locked(struct i915_vma *vma)
>>  {
>>         lockdep_assert_held(&vma->vm->mutex);
>>
>> -       force_unbind(vma);
>> +       force_unbind(vma, false);
>>         list_del_init(&vma->vm_link);
>>         release_references(vma, vma->vm->gt, false);
>>  }
>> @@ -1796,7 +1812,34 @@ void i915_vma_destroy(struct i915_vma *vma)
>>         bool vm_ddestroy;
>>
>>         mutex_lock(&vma->vm->mutex);
>> -       force_unbind(vma);
>> +       force_unbind(vma, false);
>> +       list_del_init(&vma->vm_link);
>> +       vm_ddestroy = vma->vm_ddestroy;
>> +       vma->vm_ddestroy = false;
>> +
>> +       /* vma->vm may be freed when releasing vma->vm->mutex. */
>> +       gt = vma->vm->gt;
>> +       mutex_unlock(&vma->vm->mutex);
>> +       release_references(vma, gt, vm_ddestroy);
>> +}
>> +
>> +void i915_vma_destroy_async(struct i915_vma *vma)
>
>Where are we calling this? I can't find it.

Ah, that call got missed in this patch. It should be called from the
vm_unbind path. Will fix it.
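
For reference, the missing piece is roughly the following; i915_gem_vm_unbind_vma()
is an assumed name for the vm_unbind ioctl helper from earlier in the series,
and the real change is only the switch to the async destroy:

/* Sketch only: the helper name and surrounding context are assumptions. */
static void i915_gem_vm_unbind_vma(struct i915_vma *vma)
{
	/*
	 * Queue the unbind as a fence on the object's reservation instead
	 * of waiting for it in the ioctl; i915 falls back to a synchronous
	 * unbind if the backend cannot unbind asynchronously or if the
	 * async unbind fails.
	 */
	i915_vma_destroy_async(vma);	/* instead of i915_vma_destroy(vma) */
}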

Thanks,
Niranjana

>
>> +{
>> +       bool vm_ddestroy, async = vma->obj->mm.rsgt;
>> +       struct intel_gt *gt;
>> +
>> +       if (dma_resv_reserve_fences(vma->obj->base.resv, 1))
>> +               async = false;
>> +
>> +       mutex_lock(&vma->vm->mutex);
>> +       /*
>> +        * Ensure any asynchronous binding is complete while using
>> +        * async unbind as we will be releasing the vma here.
>> +        */
>> +       if (async && i915_active_wait(&vma->active))
>> +               async = false;
>> +
>> +       force_unbind(vma, async);
>>         list_del_init(&vma->vm_link);
>>         vm_ddestroy = vma->vm_ddestroy;
>>         vma->vm_ddestroy = false;
>> diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
>> index 737ef310d046..25f15965dab8 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.h
>> +++ b/drivers/gpu/drm/i915/i915_vma.h
>> @@ -272,6 +272,7 @@ void i915_vma_reopen(struct i915_vma *vma);
>>
>>  void i915_vma_destroy_locked(struct i915_vma *vma);
>>  void i915_vma_destroy(struct i915_vma *vma);
>> +void i915_vma_destroy_async(struct i915_vma *vma);
>>
>>  #define assert_vma_held(vma) dma_resv_assert_held((vma)->obj->base.resv)
>>
>> --
>> 2.21.0.rc0.32.g243a4c7e27
>>

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Intel-gfx] [PATCH v6 20/20] drm/i915/vm_bind: Async vm_unbind support
@ 2022-11-09 20:11       ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-09 20:11 UTC (permalink / raw)
  To: Matthew Auld
  Cc: paulo.r.zanoni, jani.nikula, intel-gfx, dri-devel,
	thomas.hellstrom, daniel.vetter, christian.koenig, matthew.auld

On Wed, Nov 09, 2022 at 05:52:54PM +0000, Matthew Auld wrote:
>On Mon, 7 Nov 2022 at 08:53, Niranjana Vishwanathapura
><niranjana.vishwanathapura@intel.com> wrote:
>>
>> Asynchronously unbind the vma upon vm_unbind call.
>> Fall back to synchronous unbind if backend doesn't support
>> async unbind or if async unbind fails.
>>
>> No need for vm_unbind out fence support as i915 will internally
>> handle all sequencing and user need not try to sequence any
>> operation with the unbind completion.
>>
>> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_vma.c | 51 ++++++++++++++++++++++++++++++---
>>  drivers/gpu/drm/i915/i915_vma.h |  1 +
>>  2 files changed, 48 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
>> index 08218e3a2f12..03c966fad87b 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.c
>> +++ b/drivers/gpu/drm/i915/i915_vma.c
>> @@ -42,6 +42,8 @@
>>  #include "i915_vma.h"
>>  #include "i915_vma_resource.h"
>>
>> +static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma);
>> +
>>  static inline void assert_vma_held_evict(const struct i915_vma *vma)
>>  {
>>         /*
>> @@ -1711,7 +1713,7 @@ void i915_vma_reopen(struct i915_vma *vma)
>>         spin_unlock_irq(&gt->closed_lock);
>>  }
>>
>> -static void force_unbind(struct i915_vma *vma)
>> +static void force_unbind(struct i915_vma *vma, bool async)
>>  {
>>         if (!drm_mm_node_allocated(&vma->node))
>>                 return;
>> @@ -1725,7 +1727,21 @@ static void force_unbind(struct i915_vma *vma)
>>                 i915_vma_set_purged(vma);
>>
>>         atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
>> -       WARN_ON(__i915_vma_unbind(vma));
>> +       if (async) {
>> +               struct dma_fence *fence;
>> +
>> +               fence = __i915_vma_unbind_async(vma);
>> +               if (IS_ERR_OR_NULL(fence)) {
>> +                       async = false;
>> +               } else {
>> +                       dma_resv_add_fence(vma->obj->base.resv, fence,
>> +                                          DMA_RESV_USAGE_READ);
>> +                       dma_fence_put(fence);
>> +               }
>> +       }
>> +
>> +       if (!async)
>> +               WARN_ON(__i915_vma_unbind(vma));
>>         GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
>>  }
>>
>> @@ -1785,7 +1801,7 @@ void i915_vma_destroy_locked(struct i915_vma *vma)
>>  {
>>         lockdep_assert_held(&vma->vm->mutex);
>>
>> -       force_unbind(vma);
>> +       force_unbind(vma, false);
>>         list_del_init(&vma->vm_link);
>>         release_references(vma, vma->vm->gt, false);
>>  }
>> @@ -1796,7 +1812,34 @@ void i915_vma_destroy(struct i915_vma *vma)
>>         bool vm_ddestroy;
>>
>>         mutex_lock(&vma->vm->mutex);
>> -       force_unbind(vma);
>> +       force_unbind(vma, false);
>> +       list_del_init(&vma->vm_link);
>> +       vm_ddestroy = vma->vm_ddestroy;
>> +       vma->vm_ddestroy = false;
>> +
>> +       /* vma->vm may be freed when releasing vma->vm->mutex. */
>> +       gt = vma->vm->gt;
>> +       mutex_unlock(&vma->vm->mutex);
>> +       release_references(vma, gt, vm_ddestroy);
>> +}
>> +
>> +void i915_vma_destroy_async(struct i915_vma *vma)
>
>Where are we calling this? I can't find it.

Ah, it got missed out in this patch. It should be called from the
vm_unbind path. Will fix it.

Thanks,
Niranjana

>
>> +{
>> +       bool vm_ddestroy, async = vma->obj->mm.rsgt;
>> +       struct intel_gt *gt;
>> +
>> +       if (dma_resv_reserve_fences(vma->obj->base.resv, 1))
>> +               async = false;
>> +
>> +       mutex_lock(&vma->vm->mutex);
>> +       /*
>> +        * Ensure any asynchronous binding is complete while using
>> +        * async unbind as we will be releasing the vma here.
>> +        */
>> +       if (async && i915_active_wait(&vma->active))
>> +               async = false;
>> +
>> +       force_unbind(vma, async);
>>         list_del_init(&vma->vm_link);
>>         vm_ddestroy = vma->vm_ddestroy;
>>         vma->vm_ddestroy = false;
>> diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
>> index 737ef310d046..25f15965dab8 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.h
>> +++ b/drivers/gpu/drm/i915/i915_vma.h
>> @@ -272,6 +272,7 @@ void i915_vma_reopen(struct i915_vma *vma);
>>
>>  void i915_vma_destroy_locked(struct i915_vma *vma);
>>  void i915_vma_destroy(struct i915_vma *vma);
>> +void i915_vma_destroy_async(struct i915_vma *vma);
>>
>>  #define assert_vma_held(vma) dma_resv_assert_held((vma)->obj->base.resv)
>>
>> --
>> 2.21.0.rc0.32.g243a4c7e27
>>

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 20/20] drm/i915/vm_bind: Async vm_unbind support
  2022-11-07  8:52   ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-09 21:13     ` Andi Shyti
  -1 siblings, 0 replies; 71+ messages in thread
From: Andi Shyti @ 2022-11-09 21:13 UTC (permalink / raw)
  To: Niranjana Vishwanathapura
  Cc: matthew.brost, paulo.r.zanoni, lionel.g.landwerlin,
	tvrtko.ursulin, jani.nikula, intel-gfx, dri-devel,
	thomas.hellstrom, matthew.auld, jason, andi.shyti, daniel.vetter,
	christian.koenig

Hi Niranjana,

...

> -static void force_unbind(struct i915_vma *vma)
> +static void force_unbind(struct i915_vma *vma, bool async)
>  {
>  	if (!drm_mm_node_allocated(&vma->node))
>  		return;
> @@ -1725,7 +1727,21 @@ static void force_unbind(struct i915_vma *vma)
>  		i915_vma_set_purged(vma);
>  
>  	atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
> -	WARN_ON(__i915_vma_unbind(vma));
> +	if (async) {
> +		struct dma_fence *fence;
> +
> +		fence = __i915_vma_unbind_async(vma);
> +		if (IS_ERR_OR_NULL(fence)) {
> +			async = false;
> +		} else {
> +			dma_resv_add_fence(vma->obj->base.resv, fence,
> +					   DMA_RESV_USAGE_READ);
> +			dma_fence_put(fence);
> +		}
> +	}
> +
> +	if (!async)
> +		WARN_ON(__i915_vma_unbind(vma));
>  	GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
>  }
>  
> @@ -1785,7 +1801,7 @@ void i915_vma_destroy_locked(struct i915_vma *vma)
>  {
>  	lockdep_assert_held(&vma->vm->mutex);
>  
> -	force_unbind(vma);
> +	force_unbind(vma, false);

How about:

#define force_unbind(v)		__force_unbind(v, false)
#define force_unbind_async(v)	__force_unbind(v, true)

True/false parameters in a function call are not immediately
understandable.

or

#define force_unbind_sync(v)	force_unbind(v, false)
#define force_unbind_async(v)	force_unbind(v, true)

but I prefer the first version.
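
For illustration, the call sites would then read (just a sketch, assuming
the first variant with force_unbind() renamed to __force_unbind()):

	force_unbind(vma);		/* __force_unbind(vma, false) */
	force_unbind_async(vma);	/* __force_unbind(vma, true) */

rather than passing a bare true/false at every caller.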

Andi

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Intel-gfx] [PATCH v6 20/20] drm/i915/vm_bind: Async vm_unbind support
@ 2022-11-09 21:13     ` Andi Shyti
  0 siblings, 0 replies; 71+ messages in thread
From: Andi Shyti @ 2022-11-09 21:13 UTC (permalink / raw)
  To: Niranjana Vishwanathapura
  Cc: paulo.r.zanoni, jani.nikula, intel-gfx, dri-devel,
	thomas.hellstrom, matthew.auld, daniel.vetter, christian.koenig

Hi Niranjana,

...

> -static void force_unbind(struct i915_vma *vma)
> +static void force_unbind(struct i915_vma *vma, bool async)
>  {
>  	if (!drm_mm_node_allocated(&vma->node))
>  		return;
> @@ -1725,7 +1727,21 @@ static void force_unbind(struct i915_vma *vma)
>  		i915_vma_set_purged(vma);
>  
>  	atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
> -	WARN_ON(__i915_vma_unbind(vma));
> +	if (async) {
> +		struct dma_fence *fence;
> +
> +		fence = __i915_vma_unbind_async(vma);
> +		if (IS_ERR_OR_NULL(fence)) {
> +			async = false;
> +		} else {
> +			dma_resv_add_fence(vma->obj->base.resv, fence,
> +					   DMA_RESV_USAGE_READ);
> +			dma_fence_put(fence);
> +		}
> +	}
> +
> +	if (!async)
> +		WARN_ON(__i915_vma_unbind(vma));
>  	GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
>  }
>  
> @@ -1785,7 +1801,7 @@ void i915_vma_destroy_locked(struct i915_vma *vma)
>  {
>  	lockdep_assert_held(&vma->vm->mutex);
>  
> -	force_unbind(vma);
> +	force_unbind(vma, false);

How about:

#define force_unbind(v)		__force_unbind(v, false)
#define force_unbind_async(v)	__force_unbind(v, true)

True/false parameters in a function call are not immediately
understandable.

or

#define force_unbind_sync(v)	force_unbind(v, false)
#define force_unbind_async(v)	force_unbind(v, true)

but I prefer the first version.

Andi

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 00/20] drm/i915/vm_bind: Add VM_BIND functionality
  2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-10  0:16   ` Zanoni, Paulo R
  -1 siblings, 0 replies; 71+ messages in thread
From: Zanoni, Paulo R @ 2022-11-10  0:16 UTC (permalink / raw)
  To: dri-devel, Vishwanathapura, Niranjana, intel-gfx
  Cc: Brost, Matthew, andi.shyti, Ursulin,  Tvrtko, Nikula, Jani,
	Landwerlin, Lionel G, Hellstrom, Thomas, Auld, Matthew, jason,
	Vetter, Daniel, christian.koenig

On Mon, 2022-11-07 at 00:51 -0800, Niranjana Vishwanathapura wrote:
> DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
> buffer objects (BOs) or sections of a BOs at specified GPU virtual
> addresses on a specified address space (VM). Multiple mappings can map
> to the same physical pages of an object (aliasing). These mappings (also
> referred to as persistent mappings) will be persistent across multiple
> GPU submissions (execbuf calls) issued by the UMD, without user having
> to provide a list of all required mappings during each submission (as
> required by older execbuf mode).
> 
> This patch series support VM_BIND version 1, as described by the param
> I915_PARAM_VM_BIND_VERSION.
> 
> Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
> vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
> The new execbuf3 ioctl will not have any execlist support and all the
> legacy support like relocations etc., are removed.
> 
> NOTEs:
> * It is based on below VM_BIND design+uapi rfc.
>   Documentation/gpu/rfc/i915_vm_bind.rst

Hi

One difference for execbuf3 that I noticed, and that is not mentioned in
the RFC document, is that we now don't have a way to signal
EXEC_OBJECT_WRITE. When looking at the kernel code, there are some
pieces that check for this flag:

- there's code that deals with frontbuffer rendering 
- there's code that deals with fences
- there's code that prevents self-modifying batches
- another that seems related to waiting for objects

Are there any new rules regarding frontbuffer rendering when we use
execbuf3? Any other behavior changes related to the other places that
we should expect when using execbuf3?
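
(For reference, a rough sketch of what I mean by signalling a write in the
legacy path -- in execbuf2 the flag is set per object, with the rest of the
setup elided here:

	struct drm_i915_gem_exec_object2 obj = {
		.handle = bo_handle,		/* some BO the batch writes */
		.flags  = EXEC_OBJECT_WRITE,
	};

execbuf3 has no equivalent per-object flag, hence the questions above.)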

Thanks,
Paulo

> 
> * The IGT RFC series is posted as,
>   [PATCH i-g-t v5 0/12] vm_bind: Add VM_BIND validation support
> 
> v2: Address various review comments
> v3: Address review comments and other fixes
> v4: Remove vm_unbind out fence uapi which is not supported yet,
>     replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()
> v5: Render kernel-doc, use PIN_NOEVICT, limit vm_bind support to
>     non-recoverable faults
> v6: Rebased, minor fixes, add reserved fields to drm_i915_gem_vm_bind,
>     add new patch for async vm_unbind support
> 
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> 
> Niranjana Vishwanathapura (20):
>   drm/i915/vm_bind: Expose vm lookup function
>   drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
>   drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
>   drm/i915/vm_bind: Add support to create persistent vma
>   drm/i915/vm_bind: Implement bind and unbind of object
>   drm/i915/vm_bind: Support for VM private BOs
>   drm/i915/vm_bind: Add support to handle object evictions
>   drm/i915/vm_bind: Support persistent vma activeness tracking
>   drm/i915/vm_bind: Add out fence support
>   drm/i915/vm_bind: Abstract out common execbuf functions
>   drm/i915/vm_bind: Use common execbuf functions in execbuf path
>   drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
>   drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
>   drm/i915/vm_bind: Expose i915_request_await_bind()
>   drm/i915/vm_bind: Handle persistent vmas in execbuf3
>   drm/i915/vm_bind: userptr dma-resv changes
>   drm/i915/vm_bind: Limit vm_bind mode to non-recoverable contexts
>   drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode
>   drm/i915/vm_bind: Render VM_BIND documentation
>   drm/i915/vm_bind: Async vm_unbind support
> 
>  Documentation/gpu/i915.rst                    |  78 +-
>  drivers/gpu/drm/i915/Makefile                 |   3 +
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  43 +-
>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  17 +
>  drivers/gpu/drm/i915/gem/i915_gem_create.c    |  72 +-
>  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |   6 +
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 516 +----------
>  .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 871 ++++++++++++++++++
>  .../drm/i915/gem/i915_gem_execbuffer_common.c | 666 +++++++++++++
>  .../drm/i915/gem/i915_gem_execbuffer_common.h |  74 ++
>  drivers/gpu/drm/i915/gem/i915_gem_ioctls.h    |   2 +
>  drivers/gpu/drm/i915/gem/i915_gem_object.c    |   3 +
>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |   2 +
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 +
>  drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  19 +
>  drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  30 +
>  .../drm/i915/gem/i915_gem_vm_bind_object.c    | 449 +++++++++
>  drivers/gpu/drm/i915/gt/intel_gtt.c           |  17 +
>  drivers/gpu/drm/i915/gt/intel_gtt.h           |  21 +
>  drivers/gpu/drm/i915/i915_driver.c            |   4 +
>  drivers/gpu/drm/i915/i915_drv.h               |   2 +
>  drivers/gpu/drm/i915/i915_gem_gtt.c           |  39 +
>  drivers/gpu/drm/i915/i915_gem_gtt.h           |   3 +
>  drivers/gpu/drm/i915/i915_getparam.c          |   3 +
>  drivers/gpu/drm/i915/i915_sw_fence.c          |  28 +-
>  drivers/gpu/drm/i915/i915_sw_fence.h          |  23 +-
>  drivers/gpu/drm/i915/i915_vma.c               | 186 +++-
>  drivers/gpu/drm/i915/i915_vma.h               |  68 +-
>  drivers/gpu/drm/i915/i915_vma_types.h         |  39 +
>  include/uapi/drm/i915_drm.h                   | 264 +++++-
>  30 files changed, 3008 insertions(+), 546 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Intel-gfx] [PATCH v6 00/20] drm/i915/vm_bind: Add VM_BIND functionality
@ 2022-11-10  0:16   ` Zanoni, Paulo R
  0 siblings, 0 replies; 71+ messages in thread
From: Zanoni, Paulo R @ 2022-11-10  0:16 UTC (permalink / raw)
  To: dri-devel, Vishwanathapura, Niranjana, intel-gfx
  Cc: Nikula, Jani, Hellstrom, Thomas, Auld, Matthew, Vetter, Daniel,
	christian.koenig

On Mon, 2022-11-07 at 00:51 -0800, Niranjana Vishwanathapura wrote:
> DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
> buffer objects (BOs) or sections of a BOs at specified GPU virtual
> addresses on a specified address space (VM). Multiple mappings can map
> to the same physical pages of an object (aliasing). These mappings (also
> referred to as persistent mappings) will be persistent across multiple
> GPU submissions (execbuf calls) issued by the UMD, without user having
> to provide a list of all required mappings during each submission (as
> required by older execbuf mode).
> 
> This patch series support VM_BIND version 1, as described by the param
> I915_PARAM_VM_BIND_VERSION.
> 
> Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
> vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
> The new execbuf3 ioctl will not have any execlist support and all the
> legacy support like relocations etc., are removed.
> 
> NOTEs:
> * It is based on below VM_BIND design+uapi rfc.
>   Documentation/gpu/rfc/i915_vm_bind.rst

Hi

One difference for execbuf3 that I noticed, and that is not mentioned in
the RFC document, is that we now don't have a way to signal
EXEC_OBJECT_WRITE. When looking at the kernel code, there are some
pieces that check for this flag:

- there's code that deals with frontbuffer rendering 
- there's code that deals with fences
- there's code that prevents self-modifying batches
- another that seems related to waiting for objects

Are there any new rules regarding frontbuffer rendering when we use
execbuf3? Any other behavior changes related to the other places that
we should expect when using execbuf3?

Thanks,
Paulo

> 
> * The IGT RFC series is posted as,
>   [PATCH i-g-t v5 0/12] vm_bind: Add VM_BIND validation support
> 
> v2: Address various review comments
> v3: Address review comments and other fixes
> v4: Remove vm_unbind out fence uapi which is not supported yet,
>     replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()
> v5: Render kernel-doc, use PIN_NOEVICT, limit vm_bind support to
>     non-recoverable faults
> v6: Rebased, minor fixes, add reserved fields to drm_i915_gem_vm_bind,
>     add new patch for async vm_unbind support
> 
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> 
> Niranjana Vishwanathapura (20):
>   drm/i915/vm_bind: Expose vm lookup function
>   drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
>   drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
>   drm/i915/vm_bind: Add support to create persistent vma
>   drm/i915/vm_bind: Implement bind and unbind of object
>   drm/i915/vm_bind: Support for VM private BOs
>   drm/i915/vm_bind: Add support to handle object evictions
>   drm/i915/vm_bind: Support persistent vma activeness tracking
>   drm/i915/vm_bind: Add out fence support
>   drm/i915/vm_bind: Abstract out common execbuf functions
>   drm/i915/vm_bind: Use common execbuf functions in execbuf path
>   drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
>   drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
>   drm/i915/vm_bind: Expose i915_request_await_bind()
>   drm/i915/vm_bind: Handle persistent vmas in execbuf3
>   drm/i915/vm_bind: userptr dma-resv changes
>   drm/i915/vm_bind: Limit vm_bind mode to non-recoverable contexts
>   drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode
>   drm/i915/vm_bind: Render VM_BIND documentation
>   drm/i915/vm_bind: Async vm_unbind support
> 
>  Documentation/gpu/i915.rst                    |  78 +-
>  drivers/gpu/drm/i915/Makefile                 |   3 +
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  43 +-
>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  17 +
>  drivers/gpu/drm/i915/gem/i915_gem_create.c    |  72 +-
>  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |   6 +
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 516 +----------
>  .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 871 ++++++++++++++++++
>  .../drm/i915/gem/i915_gem_execbuffer_common.c | 666 +++++++++++++
>  .../drm/i915/gem/i915_gem_execbuffer_common.h |  74 ++
>  drivers/gpu/drm/i915/gem/i915_gem_ioctls.h    |   2 +
>  drivers/gpu/drm/i915/gem/i915_gem_object.c    |   3 +
>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |   2 +
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 +
>  drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  19 +
>  drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  30 +
>  .../drm/i915/gem/i915_gem_vm_bind_object.c    | 449 +++++++++
>  drivers/gpu/drm/i915/gt/intel_gtt.c           |  17 +
>  drivers/gpu/drm/i915/gt/intel_gtt.h           |  21 +
>  drivers/gpu/drm/i915/i915_driver.c            |   4 +
>  drivers/gpu/drm/i915/i915_drv.h               |   2 +
>  drivers/gpu/drm/i915/i915_gem_gtt.c           |  39 +
>  drivers/gpu/drm/i915/i915_gem_gtt.h           |   3 +
>  drivers/gpu/drm/i915/i915_getparam.c          |   3 +
>  drivers/gpu/drm/i915/i915_sw_fence.c          |  28 +-
>  drivers/gpu/drm/i915/i915_sw_fence.h          |  23 +-
>  drivers/gpu/drm/i915/i915_vma.c               | 186 +++-
>  drivers/gpu/drm/i915/i915_vma.h               |  68 +-
>  drivers/gpu/drm/i915/i915_vma_types.h         |  39 +
>  include/uapi/drm/i915_drm.h                   | 264 +++++-
>  30 files changed, 3008 insertions(+), 546 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 20/20] drm/i915/vm_bind: Async vm_unbind support
  2022-11-09 21:13     ` [Intel-gfx] " Andi Shyti
@ 2022-11-10  0:28       ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-10  0:28 UTC (permalink / raw)
  To: Andi Shyti
  Cc: matthew.brost, paulo.r.zanoni, lionel.g.landwerlin,
	tvrtko.ursulin, jani.nikula, intel-gfx, dri-devel,
	thomas.hellstrom, matthew.auld, jason, daniel.vetter,
	christian.koenig

On Wed, Nov 09, 2022 at 10:13:36PM +0100, Andi Shyti wrote:
>Hi Niranjana,
>
>...
>
>> -static void force_unbind(struct i915_vma *vma)
>> +static void force_unbind(struct i915_vma *vma, bool async)
>>  {
>>  	if (!drm_mm_node_allocated(&vma->node))
>>  		return;
>> @@ -1725,7 +1727,21 @@ static void force_unbind(struct i915_vma *vma)
>>  		i915_vma_set_purged(vma);
>>
>>  	atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
>> -	WARN_ON(__i915_vma_unbind(vma));
>> +	if (async) {
>> +		struct dma_fence *fence;
>> +
>> +		fence = __i915_vma_unbind_async(vma);
>> +		if (IS_ERR_OR_NULL(fence)) {
>> +			async = false;
>> +		} else {
>> +			dma_resv_add_fence(vma->obj->base.resv, fence,
>> +					   DMA_RESV_USAGE_READ);
>> +			dma_fence_put(fence);
>> +		}
>> +	}
>> +
>> +	if (!async)
>> +		WARN_ON(__i915_vma_unbind(vma));
>>  	GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
>>  }
>>
>> @@ -1785,7 +1801,7 @@ void i915_vma_destroy_locked(struct i915_vma *vma)
>>  {
>>  	lockdep_assert_held(&vma->vm->mutex);
>>
>> -	force_unbind(vma);
>> +	force_unbind(vma, false);
>
>How about:
>
>#define force_unbind(v)		__force_unbind(v, false)
>#define force_unbind_async(v)	__force_unbind(v, true)
>
>The true/false parameters in a function is not immediately
>understandable.
>
>or
>
>#define force_unbind_sync(v)	force_unbind(v, false)
>#define force_unbind_async(v)	force_unbind(v, true)
>
>but I prefer the first version.

Andi, I get the point. But currently, force_unbind() is a static
function with only a couple of invocations. These defines seem like
overkill (they would be worth defining in a header file if the
function were not static). Hope we can keep it as is for now.

Niranjana

>
>Andi

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Intel-gfx] [PATCH v6 20/20] drm/i915/vm_bind: Async vm_unbind support
@ 2022-11-10  0:28       ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-10  0:28 UTC (permalink / raw)
  To: Andi Shyti
  Cc: paulo.r.zanoni, jani.nikula, intel-gfx, dri-devel,
	thomas.hellstrom, matthew.auld, daniel.vetter, christian.koenig

On Wed, Nov 09, 2022 at 10:13:36PM +0100, Andi Shyti wrote:
>Hi Niranjana,
>
>...
>
>> -static void force_unbind(struct i915_vma *vma)
>> +static void force_unbind(struct i915_vma *vma, bool async)
>>  {
>>  	if (!drm_mm_node_allocated(&vma->node))
>>  		return;
>> @@ -1725,7 +1727,21 @@ static void force_unbind(struct i915_vma *vma)
>>  		i915_vma_set_purged(vma);
>>
>>  	atomic_and(~I915_VMA_PIN_MASK, &vma->flags);
>> -	WARN_ON(__i915_vma_unbind(vma));
>> +	if (async) {
>> +		struct dma_fence *fence;
>> +
>> +		fence = __i915_vma_unbind_async(vma);
>> +		if (IS_ERR_OR_NULL(fence)) {
>> +			async = false;
>> +		} else {
>> +			dma_resv_add_fence(vma->obj->base.resv, fence,
>> +					   DMA_RESV_USAGE_READ);
>> +			dma_fence_put(fence);
>> +		}
>> +	}
>> +
>> +	if (!async)
>> +		WARN_ON(__i915_vma_unbind(vma));
>>  	GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
>>  }
>>
>> @@ -1785,7 +1801,7 @@ void i915_vma_destroy_locked(struct i915_vma *vma)
>>  {
>>  	lockdep_assert_held(&vma->vm->mutex);
>>
>> -	force_unbind(vma);
>> +	force_unbind(vma, false);
>
>How about:
>
>#define force_unbind(v)		__force_unbind(v, false)
>#define force_unbind_async(v)	__force_unbind(v, true)
>
>The true/false parameters in a function is not immediately
>understandable.
>
>or
>
>#define force_unbind_sync(v)	force_unbind(v, false)
>#define force_unbind_async(v)	force_unbind(v, true)
>
>but I prefer the first version.

Andi, I get the point. But currently, force_unbind() is a static
function with only a couple of invocations. These defines seem like
overkill (they would be worth defining in a header file if the
function were not static). Hope we can keep it as is for now.

Niranjana

>
>Andi

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 05/20] drm/i915/vm_bind: Implement bind and unbind of object
  2022-11-07  8:51   ` Niranjana Vishwanathapura
@ 2022-11-10  1:28     ` Zanoni, Paulo R
  -1 siblings, 0 replies; 71+ messages in thread
From: Zanoni, Paulo R @ 2022-11-10  1:28 UTC (permalink / raw)
  To: dri-devel, Vishwanathapura, Niranjana, intel-gfx
  Cc: Brost, Matthew, andi.shyti, Ursulin,  Tvrtko, Nikula, Jani,
	Landwerlin, Lionel G, Hellstrom, Thomas, Auld, Matthew, jason,
	Vetter, Daniel, christian.koenig

On Mon, 2022-11-07 at 00:51 -0800, Niranjana Vishwanathapura wrote:
> Add uapi and implement support for bind and unbind of an
> object at the specified GPU virtual addresses.
> 
> The vm_bind mode is not supported in legacy execbuf2 ioctl.
> It will be supported only in the newer execbuf3 ioctl.
> 
> v2: On older platforms ctx->vm is not set, check for it.
>     In vm_bind call, add vma to vm_bind_list.
>     Add more input validity checks.
>     Update some documentation.
> v3: In vm_bind call, add vma to vm_bound_list as user can
>     request a fence and pass to execbuf3 as input fence.
>     Remove short term pinning with PIN_VALIDATE flag.
> v4: Replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode().
> v5: Ensure all reserved fields are 0, use PIN_NOEVICT.
> v6: Add reserved fields to drm_i915_gem_vm_bind.
> 
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/Makefile                 |   1 +
>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  15 +
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   5 +
>  drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
>  .../drm/i915/gem/i915_gem_vm_bind_object.c    | 324 ++++++++++++++++++
>  drivers/gpu/drm/i915/gt/intel_gtt.c           |  10 +
>  drivers/gpu/drm/i915/gt/intel_gtt.h           |   9 +
>  drivers/gpu/drm/i915/i915_driver.c            |   3 +
>  drivers/gpu/drm/i915/i915_vma.c               |   1 +
>  drivers/gpu/drm/i915/i915_vma_types.h         |  14 +
>  include/uapi/drm/i915_drm.h                   |  99 ++++++
>  11 files changed, 507 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 51704b54317c..b731f3ac80da 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -166,6 +166,7 @@ gem-y += \
>  	gem/i915_gem_ttm_move.o \
>  	gem/i915_gem_ttm_pm.o \
>  	gem/i915_gem_userptr.o \
> +	gem/i915_gem_vm_bind_object.o \
>  	gem/i915_gem_wait.o \
>  	gem/i915_gemfs.o
>  i915-y += \
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> index 899fa8f1e0fe..e8b41aa8f8c4 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> @@ -139,6 +139,21 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
>  				       struct drm_file *file);
>  
> 
> 
> 
> +/**
> + * i915_gem_vm_is_vm_bind_mode() - Check if address space is in vm_bind mode
> + * @vm: the address space
> + *
> + * Returns:
> + * true: @vm is in vm_bind mode; allows only vm_bind method of binding.
> + * false: @vm is not in vm_bind mode; allows only legacy execbuff method
> + *        of binding.
> + */
> +static inline bool i915_gem_vm_is_vm_bind_mode(struct i915_address_space *vm)
> +{
> +	/* No support to enable vm_bind mode yet */
> +	return false;
> +}
> +
>  struct i915_address_space *
>  i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id);
>  
> 
> 
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 1160723c9d2d..c5bc9f6e887f 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
>  	if (unlikely(IS_ERR(ctx)))
>  		return PTR_ERR(ctx);
>  
> 
> 
> 
> +	if (ctx->vm && i915_gem_vm_is_vm_bind_mode(ctx->vm)) {
> +		i915_gem_context_put(ctx);
> +		return -EOPNOTSUPP;
> +	}
> +
>  	eb->gem_context = ctx;
>  	if (i915_gem_context_has_full_ppgtt(ctx))
>  		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
> new file mode 100644
> index 000000000000..36262a6357b5
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
> @@ -0,0 +1,26 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +#ifndef __I915_GEM_VM_BIND_H
> +#define __I915_GEM_VM_BIND_H
> +
> +#include <linux/types.h>
> +
> +struct drm_device;
> +struct drm_file;
> +struct i915_address_space;
> +struct i915_vma;
> +
> +struct i915_vma *
> +i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
> +
> +int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
> +			   struct drm_file *file);
> +int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
> +			     struct drm_file *file);
> +
> +void i915_gem_vm_unbind_all(struct i915_address_space *vm);
> +
> +#endif /* __I915_GEM_VM_BIND_H */
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> new file mode 100644
> index 000000000000..6f299806bee1
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> @@ -0,0 +1,324 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +#include <uapi/drm/i915_drm.h>
> +
> +#include <linux/interval_tree_generic.h>
> +
> +#include "gem/i915_gem_context.h"
> +#include "gem/i915_gem_vm_bind.h"
> +
> +#include "gt/intel_gpu_commands.h"
> +
> +#define START(node) ((node)->start)
> +#define LAST(node) ((node)->last)
> +
> +/* Not all defined functions are used, hence use __maybe_unused */
> +INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
> +		     START, LAST, __maybe_unused static inline, i915_vm_bind_it)
> +
> +#undef START
> +#undef LAST
> +
> +/**
> + * DOC: VM_BIND/UNBIND ioctls
> + *
> + * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
> + * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
> + * specified address space (VM). Multiple mappings can map to the same physical
> + * pages of an object (aliasing). These mappings (also referred to as persistent
> + * mappings) will be persistent across multiple GPU submissions (execbuf calls)
> + * issued by the UMD, without user having to provide a list of all required
> + * mappings during each submission (as required by older execbuf mode).
> + *
> + * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
> + * signaling the completion of bind/unbind operation.
> + *
> + * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
> + * User has to opt-in for VM_BIND mode of binding for an address space (VM)
> + * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
> + *
> + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
> + * are not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be
> + * done asynchronously, when valid out fence is specified.
> + *
> + * VM_BIND locking order is as below.
> + *
> + * 1) vm_bind_lock mutex will protect vm_bind lists. This lock is taken in
> + *    vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing the
> + *    mapping.
> + *
> + *    In future, when GPU page faults are supported, we can potentially use a
> + *    rwsem instead, so that multiple page fault handlers can take the read
> + *    side lock to lookup the mapping and hence can run in parallel.
> + *    The older execbuf mode of binding do not need this lock.
> + *
> + * 2) The object's dma-resv lock will protect i915_vma state and needs
> + *    to be held while binding/unbinding a vma in the async worker and while
> + *    updating dma-resv fence list of an object. Note that private BOs of a VM
> + *    will all share a dma-resv object.
> + *
> + * 3) Spinlock/s to protect some of the VM's lists like the list of
> + *    invalidated vmas (due to eviction and userptr invalidation) etc.
> + */
> +
> +/**
> + * i915_gem_vm_bind_lookup_vma() - lookup for persistent vma mapped at a
> + * specified address
> + * @vm: virtual address space to look for persistent vma
> + * @va: starting address where vma is mapped
> + *
> + * Retrieves the persistent vma mapped address @va from the @vm's vma tree.
> + *
> + * Returns vma pointer on success, NULL on failure.
> + */
> +struct i915_vma *
> +i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va)
> +{
> +	lockdep_assert_held(&vm->vm_bind_lock);
> +
> +	return i915_vm_bind_it_iter_first(&vm->va, va, va);
> +}
> +
> +static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj)
> +{
> +	lockdep_assert_held(&vma->vm->vm_bind_lock);
> +
> +	list_del_init(&vma->vm_bind_link);
> +	i915_vm_bind_it_remove(vma, &vma->vm->va);
> +
> +	/* Release object */
> +	if (release_obj)
> +		i915_gem_object_put(vma->obj);
> +}
> +
> +static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
> +				  struct drm_i915_gem_vm_unbind *va)
> +{
> +	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
> +	int ret;
> +
> +	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
> +	if (ret)
> +		return ret;
> +
> +	va->start = gen8_noncanonical_addr(va->start);
> +	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
> +
> +	if (!vma)
> +		ret = -ENOENT;
> +	else if (vma->size != va->length)
> +		ret = -EINVAL;
> +
> +	if (ret) {
> +		mutex_unlock(&vm->vm_bind_lock);
> +		return ret;
> +	}
> +
> +	i915_gem_vm_bind_remove(vma, false);
> +
> +	mutex_unlock(&vm->vm_bind_lock);
> +
> +	/*
> +	 * Destroy the vma and then release the object.
> +	 * As persistent vma holds object reference, it can only be destroyed
> +	 * either by vm_unbind ioctl or when VM is being released. As we are
> +	 * holding VM reference here, it is safe accessing the vma here.
> +	 */
> +	obj = vma->obj;
> +	i915_gem_object_lock(obj, NULL);
> +	i915_vma_destroy(vma);
> +	i915_gem_object_unlock(obj);
> +
> +	i915_gem_object_put(obj);
> +
> +	return 0;
> +}
> +
> +/**
> + * i915_gem_vm_unbind_all() - unbind all persistent mappings from an
> + * address space
> + * @vm: Address spece to remove persistent mappings from
> + *
> + * Unbind all userspace requested vm_bind mappings from @vm.
> + */
> +void i915_gem_vm_unbind_all(struct i915_address_space *vm)
> +{
> +	struct i915_vma *vma, *t;
> +
> +	mutex_lock(&vm->vm_bind_lock);
> +	list_for_each_entry_safe(vma, t, &vm->vm_bind_list, vm_bind_link)
> +		i915_gem_vm_bind_remove(vma, true);
> +	list_for_each_entry_safe(vma, t, &vm->vm_bound_list, vm_bind_link)
> +		i915_gem_vm_bind_remove(vma, true);
> +	mutex_unlock(&vm->vm_bind_lock);
> +}
> +
> +static struct i915_vma *vm_bind_get_vma(struct i915_address_space *vm,
> +					struct drm_i915_gem_object *obj,
> +					struct drm_i915_gem_vm_bind *va)
> +{
> +	struct i915_gtt_view view;
> +	struct i915_vma *vma;
> +
> +	va->start = gen8_noncanonical_addr(va->start);
> +	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
> +	if (vma)
> +		return ERR_PTR(-EEXIST);
> +
> +	view.type = I915_GTT_VIEW_PARTIAL;
> +	view.partial.offset = va->offset >> PAGE_SHIFT;
> +	view.partial.size = va->length >> PAGE_SHIFT;
> +	vma = i915_vma_create_persistent(obj, vm, &view);
> +	if (IS_ERR(vma))
> +		return vma;
> +
> +	vma->start = va->start;
> +	vma->last = va->start + va->length - 1;
> +
> +	return vma;
> +}
> +
> +static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
> +				struct drm_i915_gem_vm_bind *va,
> +				struct drm_file *file)
> +{
> +	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma = NULL;
> +	struct i915_gem_ww_ctx ww;
> +	u64 pin_flags;
> +	int ret = 0;
> +
> +	if (!i915_gem_vm_is_vm_bind_mode(vm))
> +		return -EOPNOTSUPP;
> +
> +	/* Ensure start and length fields are valid */
> +	if (!va->length || !IS_ALIGNED(va->start, I915_GTT_PAGE_SIZE))
> +		ret = -EINVAL;
> +
> +	obj = i915_gem_object_lookup(file, va->handle);
> +	if (!obj)
> +		return -ENOENT;
> +
> +	/* Ensure offset and length are aligned to object's max page size */
> +	if (!IS_ALIGNED(va->offset | va->length,
> +			i915_gem_object_max_page_size(obj->mm.placements,
> +						      obj->mm.n_placements)))
> +		ret = -EINVAL;
> +
> +	/* Check for mapping range overflow */
> +	if (range_overflows_t(u64, va->offset, va->length, obj->base.size))
> +		ret = -EINVAL;
> +
> +	if (ret)
> +		goto put_obj;
> +
> +	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
> +	if (ret)
> +		goto put_obj;
> +
> +	vma = vm_bind_get_vma(vm, obj, va);
> +	if (IS_ERR(vma)) {
> +		ret = PTR_ERR(vma);
> +		goto unlock_vm;
> +	}
> +
> +	pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER |
> +		    PIN_VALIDATE | PIN_NOEVICT;
> +
> +	for_i915_gem_ww(&ww, ret, true) {
> +		ret = i915_gem_object_lock(vma->obj, &ww);
> +		if (ret)
> +			continue;
> +
> +		ret = i915_vma_pin_ww(vma, &ww, 0, 0, pin_flags);
> +		if (ret)
> +			continue;
> +
> +		list_add_tail(&vma->vm_bind_link, &vm->vm_bound_list);
> +		i915_vm_bind_it_insert(vma, &vm->va);
> +
> +		/* Hold object reference until vm_unbind */
> +		i915_gem_object_get(vma->obj);
> +	}
> +
> +	if (ret)
> +		i915_vma_destroy(vma);
> +unlock_vm:
> +	mutex_unlock(&vm->vm_bind_lock);
> +put_obj:
> +	i915_gem_object_put(obj);
> +
> +	return ret;
> +}
> +
> +/**
> + * i915_gem_vm_bind_ioctl() - ioctl function for binding a section of object
> + * at a specified virtual address
> + * @dev: drm_device pointer
> + * @data: ioctl data structure
> + * @file: drm_file pointer
> + *
> + * Adds the specified persistent mapping (virtual address to a section of an
> + * object) and binds it in the device page table.
> + *
> + * Returns 0 on success, error code on failure.
> + */
> +int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
> +			   struct drm_file *file)
> +{
> +	struct drm_i915_gem_vm_bind *args = data;
> +	struct i915_address_space *vm;
> +	int ret;
> +
> +	/* Reserved fields must be 0 */
> +	if (args->rsvd[0] || args->rsvd[1] || args->rsvd[2] || args->extensions)
> +		return -EINVAL;
> +
> +	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
> +	if (unlikely(!vm))
> +		return -ENOENT;
> +
> +	ret = i915_gem_vm_bind_obj(vm, args, file);
> +
> +	i915_vm_put(vm);
> +	return ret;
> +}
> +
> +/**
> + * i915_gem_vm_unbind_ioctl() - ioctl function for unbinding a mapping at a
> + * specified virtual address
> + * @dev: drm_device pointer
> + * @data: ioctl data structure
> + * @file: drm_file pointer
> + *
> + * Removes the persistent mapping at the specified address and unbinds it
> + * from the device page table.
> + *
> + * Returns 0 on success, error code on failure. -ENOENT is returned if the
> + * specified mapping is not found.
> + */
> +int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
> +			     struct drm_file *file)
> +{
> +	struct drm_i915_gem_vm_unbind *args = data;
> +	struct i915_address_space *vm;
> +	int ret;
> +
> +	/* Reserved fields must be 0 */
> +	if (args->rsvd || args->rsvd2[0] || args->rsvd2[1] ||
> +	    args->rsvd2[2] || args->extensions)
> +		return -EINVAL;
> +
> +	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
> +	if (unlikely(!vm))
> +		return -ENOENT;
> +
> +	ret = i915_gem_vm_unbind_vma(vm, args);
> +
> +	i915_vm_put(vm);
> +	return ret;
> +}
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index e82a9d763e57..412368c67c46 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -12,6 +12,7 @@
>  
> 
> 
> 
>  #include "gem/i915_gem_internal.h"
>  #include "gem/i915_gem_lmem.h"
> +#include "gem/i915_gem_vm_bind.h"
>  #include "i915_trace.h"
>  #include "i915_utils.h"
>  #include "intel_gt.h"
> @@ -177,6 +178,8 @@ int i915_vm_lock_objects(struct i915_address_space *vm,
>  void i915_address_space_fini(struct i915_address_space *vm)
>  {
>  	drm_mm_takedown(&vm->mm);
> +	GEM_BUG_ON(!RB_EMPTY_ROOT(&vm->va.rb_root));
> +	mutex_destroy(&vm->vm_bind_lock);
>  }
>  
> 
> 
> 
>  /**
> @@ -203,6 +206,8 @@ static void __i915_vm_release(struct work_struct *work)
>  	struct i915_address_space *vm =
>  		container_of(work, struct i915_address_space, release_work);
>  
> 
> 
> 
> +	i915_gem_vm_unbind_all(vm);
> +
>  	__i915_vm_close(vm);
>  
> 
> 
> 
>  	/* Synchronize async unbinds. */
> @@ -279,6 +284,11 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
>  
> 
> 
> 
>  	INIT_LIST_HEAD(&vm->bound_list);
>  	INIT_LIST_HEAD(&vm->unbound_list);
> +
> +	vm->va = RB_ROOT_CACHED;
> +	INIT_LIST_HEAD(&vm->vm_bind_list);
> +	INIT_LIST_HEAD(&vm->vm_bound_list);
> +	mutex_init(&vm->vm_bind_lock);
>  }
>  
> 
> 
> 
>  void *__px_vaddr(struct drm_i915_gem_object *p)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index 4d75ba4bb41d..3a9bee1b9d03 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -260,6 +260,15 @@ struct i915_address_space {
>  	 */
>  	struct list_head unbound_list;
>  
> 
> 
> 
> +	/** @vm_bind_lock: Mutex to protect @vm_bind_list and @vm_bound_list */
> +	struct mutex vm_bind_lock;
> +	/** @vm_bind_list: List of vm_binding in process */
> +	struct list_head vm_bind_list;
> +	/** @vm_bound_list: List of vm_binding completed */
> +	struct list_head vm_bound_list;
> +	/** @va: tree of persistent vmas */
> +	struct rb_root_cached va;
> +
>  	/* Global GTT */
>  	bool is_ggtt:1;
>  
> 
> 
> 
> diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
> index c3d43f9b1e45..cf41b96ac485 100644
> --- a/drivers/gpu/drm/i915/i915_driver.c
> +++ b/drivers/gpu/drm/i915/i915_driver.c
> @@ -69,6 +69,7 @@
>  #include "gem/i915_gem_ioctls.h"
>  #include "gem/i915_gem_mman.h"
>  #include "gem/i915_gem_pm.h"
> +#include "gem/i915_gem_vm_bind.h"
>  #include "gt/intel_gt.h"
>  #include "gt/intel_gt_pm.h"
>  #include "gt/intel_rc6.h"
> @@ -1892,6 +1893,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
>  	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
> +	DRM_IOCTL_DEF_DRV(I915_GEM_VM_BIND, i915_gem_vm_bind_ioctl, DRM_RENDER_ALLOW),
> +	DRM_IOCTL_DEF_DRV(I915_GEM_VM_UNBIND, i915_gem_vm_unbind_ioctl, DRM_RENDER_ALLOW),
>  };
>  
> 
> 
> 
>  /*
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 529d97318f00..6a64a130dbcd 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -239,6 +239,7 @@ vma_create(struct drm_i915_gem_object *obj,
>  	spin_unlock(&obj->vma.lock);
>  	mutex_unlock(&vm->mutex);
>  
> 
> 
> 
> +	INIT_LIST_HEAD(&vma->vm_bind_link);
>  	return vma;
>  
> 
> 
> 
>  err_unlock:
> diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
> index 3144d71a0c3e..db786d2d1530 100644
> --- a/drivers/gpu/drm/i915/i915_vma_types.h
> +++ b/drivers/gpu/drm/i915/i915_vma_types.h
> @@ -295,6 +295,20 @@ struct i915_vma {
>  	/** This object's place on the active/inactive lists */
>  	struct list_head vm_link;
>  
> 
> 
> 
> +	/** @vm_bind_link: node for the vm_bind related lists of vm */
> +	struct list_head vm_bind_link;
> +
> +	/** Interval tree structures for persistent vma */
> +
> +	/** @rb: node for the interval tree of vm for persistent vmas */
> +	struct rb_node rb;
> +	/** @start: start endpoint of the rb node */
> +	u64 start;
> +	/** @last: Last endpoint of the rb node */
> +	u64 last;
> +	/** @__subtree_last: last in subtree */
> +	u64 __subtree_last;
> +
>  	struct list_head obj_link; /* Link in the object's VMA list */
>  	struct rb_node obj_node;
>  	struct hlist_node obj_hash;
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 8df261c5ab9b..f06a09f1db2d 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -470,6 +470,8 @@ typedef struct _drm_i915_sarea {
>  #define DRM_I915_GEM_VM_CREATE		0x3a
>  #define DRM_I915_GEM_VM_DESTROY		0x3b
>  #define DRM_I915_GEM_CREATE_EXT		0x3c
> +#define DRM_I915_GEM_VM_BIND		0x3d
> +#define DRM_I915_GEM_VM_UNBIND		0x3e
>  /* Must be kept compact -- no holes */
>  
> 
> 
> 
>  #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
> @@ -534,6 +536,8 @@ typedef struct _drm_i915_sarea {
>  #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
>  #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
>  #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
> +#define DRM_IOCTL_I915_GEM_VM_BIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
> +#define DRM_IOCTL_I915_GEM_VM_UNBIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
>  
> 
> 
> 
>  /* Allow drivers to submit batchbuffers directly to hardware, relying
>   * on the security mechanisms provided by hardware.
> @@ -3727,6 +3731,101 @@ struct drm_i915_gem_create_ext_protected_content {
>  /* ID of the protected content session managed by i915 when PXP is active */
>  #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
>  
> 
> 
> 
> +/**
> + * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
> + *
> + * This structure is passed to VM_BIND ioctl and specifies the mapping of GPU
> + * virtual address (VA) range to the section of an object that should be bound
> + * in the device page table of the specified address space (VM).
> + * The VA range specified must be unique (ie., not currently bound) and can
> + * be mapped to whole object or a section of the object (partial binding).
> + * Multiple VA mappings can be created to the same section of the object
> + * (aliasing).
> + *
> + * The @start, @offset and @length must be 4K page aligned. However the DG2
> + * and XEHPSDV has 64K page size for device local memory and has compact page
> + * table. On those platforms, for binding device local-memory objects, the
> + * @start, @offset and @length must be 64K aligned.
> + *
> + * Error code -EINVAL will be returned if @start, @offset and @length are not
> + * properly aligned. In version 1 (See I915_PARAM_VM_BIND_VERSION), error code
> + * -ENOSPC will be returned if the VA range specified can't be reserved.
> + *
> + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
> + * are not ordered. Furthermore, parts of the VM_BIND operation can be done
> + * asynchronously, if valid @fence is specified.
> + */
> +struct drm_i915_gem_vm_bind {
> +	/** @vm_id: VM (address space) id to bind */
> +	__u32 vm_id;
> +
> +	/** @handle: Object handle */
> +	__u32 handle;
> +
> +	/** @start: Virtual Address start to bind */
> +	__u64 start;
> +
> +	/** @offset: Offset in object to bind */
> +	__u64 offset;
> +
> +	/** @length: Length of mapping to bind */
> +	__u64 length;
> +
> +	/** @rsvd: Reserved, MBZ */
> +	__u64 rsvd[3];

In a brand new ioctl that even has extensions support, why do we need this?
If there is a plan to add something here in the future, can you please
tell us what it may be? Perhaps having that field already, but accepting
only a default value/flag, would be better.

I see in previous versions we had 'flags' here. Having 'flags', even if
MBZ for the initial version, seems like a nice thing to have for future
extensibility. Also, you're going to add back the flag to make the page
read-only at some point, right?
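
Just to sketch what I mean (field layout illustrative only): a __u64 flags
word that must be zero in version 1 could be validated the same way the
reserved fields already are, e.g.

	/* no flags defined yet (a read-only bit could come later) */
	if (args->flags || args->rsvd[0] || args->rsvd[1] || args->extensions)
		return -EINVAL;

and would leave room for simple behaviour bits without needing an extension.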

> +
> +	/** @rsvd2: Reserved for timeline fence */
> +	__u64 rsvd2[2];

I see this one gets changed in the middle of the series.

> +
> +	/**
> +	 * @extensions: Zero-terminated chain of extensions.
> +	 *
> +	 * For future extensions. See struct i915_user_extension.
> +	 */
> +	__u64 extensions;
> +};
> +
> +/**
> + * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
> + *
> + * This structure is passed to VM_UNBIND ioctl and specifies the GPU virtual
> + * address (VA) range that should be unbound from the device page table of the
> + * specified address space (VM). VM_UNBIND will force unbind the specified
> + * range from device page table without waiting for any GPU job to complete.
> + * It is UMDs responsibility to ensure the mapping is no longer in use before
> + * calling VM_UNBIND.
> + *
> + * If the specified mapping is not found, the ioctl will simply return without
> + * any error.
> + *
> + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
> + * are not ordered. Furthermore, parts of the VM_UNBIND operation can be done
> + * asynchronously, if valid @fence is specified.

What @fence? There's no way to specify one.

> + */
> +struct drm_i915_gem_vm_unbind {
> +	/** @vm_id: VM (address space) id to bind */
> +	__u32 vm_id;
> +
> +	/** @rsvd: Reserved, MBZ */
> +	__u32 rsvd;

Again here, same question. Perhaps we could name it 'pad' or 'pad0' if
that's the specific goal of having this?

> +
> +	/** @start: Virtual Address start to unbind */
> +	__u64 start;
> +
> +	/** @length: Length of mapping to unbind */
> +	__u64 length;
> +
> +	/** @rsvd2: Reserved, MBZ */
> +	__u64 rsvd2[3];

And here, but this is definitely not just padding.

> +
> +	/**
> +	 * @extensions: Zero-terminated chain of extensions.
> +	 *
> +	 * For future extensions. See struct i915_user_extension.
> +	 */
> +	__u64 extensions;
> +};
> +
>  #if defined(__cplusplus)
>  }
>  #endif


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Intel-gfx] [PATCH v6 05/20] drm/i915/vm_bind: Implement bind and unbind of object
@ 2022-11-10  1:28     ` Zanoni, Paulo R
  0 siblings, 0 replies; 71+ messages in thread
From: Zanoni, Paulo R @ 2022-11-10  1:28 UTC (permalink / raw)
  To: dri-devel, Vishwanathapura, Niranjana, intel-gfx
  Cc: Nikula, Jani, Hellstrom, Thomas, Auld, Matthew, Vetter, Daniel,
	christian.koenig

On Mon, 2022-11-07 at 00:51 -0800, Niranjana Vishwanathapura wrote:
> Add uapi and implement support for bind and unbind of an
> object at the specified GPU virtual addresses.
> 
> The vm_bind mode is not supported in legacy execbuf2 ioctl.
> It will be supported only in the newer execbuf3 ioctl.
> 
> v2: On older platforms ctx->vm is not set, check for it.
>     In vm_bind call, add vma to vm_bind_list.
>     Add more input validity checks.
>     Update some documentation.
> v3: In vm_bind call, add vma to vm_bound_list as user can
>     request a fence and pass to execbuf3 as input fence.
>     Remove short term pinning with PIN_VALIDATE flag.
> v4: Replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode().
> v5: Ensure all reserved fields are 0, use PIN_NOEVICT.
> v6: Add reserved fields to drm_i915_gem_vm_bind.
> 
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/Makefile                 |   1 +
>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  15 +
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   5 +
>  drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
>  .../drm/i915/gem/i915_gem_vm_bind_object.c    | 324 ++++++++++++++++++
>  drivers/gpu/drm/i915/gt/intel_gtt.c           |  10 +
>  drivers/gpu/drm/i915/gt/intel_gtt.h           |   9 +
>  drivers/gpu/drm/i915/i915_driver.c            |   3 +
>  drivers/gpu/drm/i915/i915_vma.c               |   1 +
>  drivers/gpu/drm/i915/i915_vma_types.h         |  14 +
>  include/uapi/drm/i915_drm.h                   |  99 ++++++
>  11 files changed, 507 insertions(+)
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 51704b54317c..b731f3ac80da 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -166,6 +166,7 @@ gem-y += \
>  	gem/i915_gem_ttm_move.o \
>  	gem/i915_gem_ttm_pm.o \
>  	gem/i915_gem_userptr.o \
> +	gem/i915_gem_vm_bind_object.o \
>  	gem/i915_gem_wait.o \
>  	gem/i915_gemfs.o
>  i915-y += \
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> index 899fa8f1e0fe..e8b41aa8f8c4 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> @@ -139,6 +139,21 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
>  				       struct drm_file *file);
>  
> 
> 
> 
> +/**
> + * i915_gem_vm_is_vm_bind_mode() - Check if address space is in vm_bind mode
> + * @vm: the address space
> + *
> + * Returns:
> + * true: @vm is in vm_bind mode; allows only vm_bind method of binding.
> + * false: @vm is not in vm_bind mode; allows only legacy execbuff method
> + *        of binding.
> + */
> +static inline bool i915_gem_vm_is_vm_bind_mode(struct i915_address_space *vm)
> +{
> +	/* No support to enable vm_bind mode yet */
> +	return false;
> +}
> +
>  struct i915_address_space *
>  i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id);
>  
> 
> 
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index 1160723c9d2d..c5bc9f6e887f 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
>  	if (unlikely(IS_ERR(ctx)))
>  		return PTR_ERR(ctx);
>  
> 
> 
> 
> +	if (ctx->vm && i915_gem_vm_is_vm_bind_mode(ctx->vm)) {
> +		i915_gem_context_put(ctx);
> +		return -EOPNOTSUPP;
> +	}
> +
>  	eb->gem_context = ctx;
>  	if (i915_gem_context_has_full_ppgtt(ctx))
>  		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
> new file mode 100644
> index 000000000000..36262a6357b5
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
> @@ -0,0 +1,26 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +#ifndef __I915_GEM_VM_BIND_H
> +#define __I915_GEM_VM_BIND_H
> +
> +#include <linux/types.h>
> +
> +struct drm_device;
> +struct drm_file;
> +struct i915_address_space;
> +struct i915_vma;
> +
> +struct i915_vma *
> +i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
> +
> +int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
> +			   struct drm_file *file);
> +int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
> +			     struct drm_file *file);
> +
> +void i915_gem_vm_unbind_all(struct i915_address_space *vm);
> +
> +#endif /* __I915_GEM_VM_BIND_H */
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> new file mode 100644
> index 000000000000..6f299806bee1
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> @@ -0,0 +1,324 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +#include <uapi/drm/i915_drm.h>
> +
> +#include <linux/interval_tree_generic.h>
> +
> +#include "gem/i915_gem_context.h"
> +#include "gem/i915_gem_vm_bind.h"
> +
> +#include "gt/intel_gpu_commands.h"
> +
> +#define START(node) ((node)->start)
> +#define LAST(node) ((node)->last)
> +
> +/* Not all defined functions are used, hence use __maybe_unused */
> +INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
> +		     START, LAST, __maybe_unused static inline, i915_vm_bind_it)
> +
> +#undef START
> +#undef LAST
> +
> +/**
> + * DOC: VM_BIND/UNBIND ioctls
> + *
> + * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
> + * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
> + * specified address space (VM). Multiple mappings can map to the same physical
> + * pages of an object (aliasing). These mappings (also referred to as persistent
> + * mappings) will be persistent across multiple GPU submissions (execbuf calls)
> + * issued by the UMD, without user having to provide a list of all required
> + * mappings during each submission (as required by older execbuf mode).
> + *
> + * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
> + * signaling the completion of bind/unbind operation.
> + *
> + * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
> + * User has to opt-in for VM_BIND mode of binding for an address space (VM)
> + * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
> + *
> + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
> + * are not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be
> + * done asynchronously, when valid out fence is specified.
> + *
> + * VM_BIND locking order is as below.
> + *
> + * 1) vm_bind_lock mutex will protect vm_bind lists. This lock is taken in
> + *    vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing the
> + *    mapping.
> + *
> + *    In future, when GPU page faults are supported, we can potentially use a
> + *    rwsem instead, so that multiple page fault handlers can take the read
> + *    side lock to lookup the mapping and hence can run in parallel.
> + *    The older execbuf mode of binding do not need this lock.
> + *
> + * 2) The object's dma-resv lock will protect i915_vma state and needs
> + *    to be held while binding/unbinding a vma in the async worker and while
> + *    updating dma-resv fence list of an object. Note that private BOs of a VM
> + *    will all share a dma-resv object.
> + *
> + * 3) Spinlock/s to protect some of the VM's lists like the list of
> + *    invalidated vmas (due to eviction and userptr invalidation) etc.
> + */
> +
> +/**
> + * i915_gem_vm_bind_lookup_vma() - lookup for persistent vma mapped at a
> + * specified address
> + * @vm: virtual address space to look for persistent vma
> + * @va: starting address where vma is mapped
> + *
> + * Retrieves the persistent vma mapped address @va from the @vm's vma tree.
> + *
> + * Returns vma pointer on success, NULL on failure.
> + */
> +struct i915_vma *
> +i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va)
> +{
> +	lockdep_assert_held(&vm->vm_bind_lock);
> +
> +	return i915_vm_bind_it_iter_first(&vm->va, va, va);
> +}
> +
> +static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj)
> +{
> +	lockdep_assert_held(&vma->vm->vm_bind_lock);
> +
> +	list_del_init(&vma->vm_bind_link);
> +	i915_vm_bind_it_remove(vma, &vma->vm->va);
> +
> +	/* Release object */
> +	if (release_obj)
> +		i915_gem_object_put(vma->obj);
> +}
> +
> +static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
> +				  struct drm_i915_gem_vm_unbind *va)
> +{
> +	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
> +	int ret;
> +
> +	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
> +	if (ret)
> +		return ret;
> +
> +	va->start = gen8_noncanonical_addr(va->start);
> +	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
> +
> +	if (!vma)
> +		ret = -ENOENT;
> +	else if (vma->size != va->length)
> +		ret = -EINVAL;
> +
> +	if (ret) {
> +		mutex_unlock(&vm->vm_bind_lock);
> +		return ret;
> +	}
> +
> +	i915_gem_vm_bind_remove(vma, false);
> +
> +	mutex_unlock(&vm->vm_bind_lock);
> +
> +	/*
> +	 * Destroy the vma and then release the object.
> +	 * As persistent vma holds object reference, it can only be destroyed
> +	 * either by vm_unbind ioctl or when VM is being released. As we are
> +	 * holding VM reference here, it is safe accessing the vma here.
> +	 */
> +	obj = vma->obj;
> +	i915_gem_object_lock(obj, NULL);
> +	i915_vma_destroy(vma);
> +	i915_gem_object_unlock(obj);
> +
> +	i915_gem_object_put(obj);
> +
> +	return 0;
> +}
> +
> +/**
> + * i915_gem_vm_unbind_all() - unbind all persistent mappings from an
> + * address space
> + * @vm: Address space to remove persistent mappings from
> + *
> + * Unbind all userspace requested vm_bind mappings from @vm.
> + */
> +void i915_gem_vm_unbind_all(struct i915_address_space *vm)
> +{
> +	struct i915_vma *vma, *t;
> +
> +	mutex_lock(&vm->vm_bind_lock);
> +	list_for_each_entry_safe(vma, t, &vm->vm_bind_list, vm_bind_link)
> +		i915_gem_vm_bind_remove(vma, true);
> +	list_for_each_entry_safe(vma, t, &vm->vm_bound_list, vm_bind_link)
> +		i915_gem_vm_bind_remove(vma, true);
> +	mutex_unlock(&vm->vm_bind_lock);
> +}
> +
> +static struct i915_vma *vm_bind_get_vma(struct i915_address_space *vm,
> +					struct drm_i915_gem_object *obj,
> +					struct drm_i915_gem_vm_bind *va)
> +{
> +	struct i915_gtt_view view;
> +	struct i915_vma *vma;
> +
> +	va->start = gen8_noncanonical_addr(va->start);
> +	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
> +	if (vma)
> +		return ERR_PTR(-EEXIST);
> +
> +	view.type = I915_GTT_VIEW_PARTIAL;
> +	view.partial.offset = va->offset >> PAGE_SHIFT;
> +	view.partial.size = va->length >> PAGE_SHIFT;
> +	vma = i915_vma_create_persistent(obj, vm, &view);
> +	if (IS_ERR(vma))
> +		return vma;
> +
> +	vma->start = va->start;
> +	vma->last = va->start + va->length - 1;
> +
> +	return vma;
> +}
> +
> +static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
> +				struct drm_i915_gem_vm_bind *va,
> +				struct drm_file *file)
> +{
> +	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma = NULL;
> +	struct i915_gem_ww_ctx ww;
> +	u64 pin_flags;
> +	int ret = 0;
> +
> +	if (!i915_gem_vm_is_vm_bind_mode(vm))
> +		return -EOPNOTSUPP;
> +
> +	/* Ensure start and length fields are valid */
> +	if (!va->length || !IS_ALIGNED(va->start, I915_GTT_PAGE_SIZE))
> +		ret = -EINVAL;
> +
> +	obj = i915_gem_object_lookup(file, va->handle);
> +	if (!obj)
> +		return -ENOENT;
> +
> +	/* Ensure offset and length are aligned to object's max page size */
> +	if (!IS_ALIGNED(va->offset | va->length,
> +			i915_gem_object_max_page_size(obj->mm.placements,
> +						      obj->mm.n_placements)))
> +		ret = -EINVAL;
> +
> +	/* Check for mapping range overflow */
> +	if (range_overflows_t(u64, va->offset, va->length, obj->base.size))
> +		ret = -EINVAL;
> +
> +	if (ret)
> +		goto put_obj;
> +
> +	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
> +	if (ret)
> +		goto put_obj;
> +
> +	vma = vm_bind_get_vma(vm, obj, va);
> +	if (IS_ERR(vma)) {
> +		ret = PTR_ERR(vma);
> +		goto unlock_vm;
> +	}
> +
> +	pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER |
> +		    PIN_VALIDATE | PIN_NOEVICT;
> +
> +	for_i915_gem_ww(&ww, ret, true) {
> +		ret = i915_gem_object_lock(vma->obj, &ww);
> +		if (ret)
> +			continue;
> +
> +		ret = i915_vma_pin_ww(vma, &ww, 0, 0, pin_flags);
> +		if (ret)
> +			continue;
> +
> +		list_add_tail(&vma->vm_bind_link, &vm->vm_bound_list);
> +		i915_vm_bind_it_insert(vma, &vm->va);
> +
> +		/* Hold object reference until vm_unbind */
> +		i915_gem_object_get(vma->obj);
> +	}
> +
> +	if (ret)
> +		i915_vma_destroy(vma);
> +unlock_vm:
> +	mutex_unlock(&vm->vm_bind_lock);
> +put_obj:
> +	i915_gem_object_put(obj);
> +
> +	return ret;
> +}
> +
> +/**
> + * i915_gem_vm_bind_ioctl() - ioctl function for binding a section of object
> + * at a specified virtual address
> + * @dev: drm_device pointer
> + * @data: ioctl data structure
> + * @file: drm_file pointer
> + *
> + * Adds the specified persistent mapping (virtual address to a section of an
> + * object) and binds it in the device page table.
> + *
> + * Returns 0 on success, error code on failure.
> + */
> +int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
> +			   struct drm_file *file)
> +{
> +	struct drm_i915_gem_vm_bind *args = data;
> +	struct i915_address_space *vm;
> +	int ret;
> +
> +	/* Reserved fields must be 0 */
> +	if (args->rsvd[0] || args->rsvd[1] || args->rsvd[2] || args->extensions)
> +		return -EINVAL;
> +
> +	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
> +	if (unlikely(!vm))
> +		return -ENOENT;
> +
> +	ret = i915_gem_vm_bind_obj(vm, args, file);
> +
> +	i915_vm_put(vm);
> +	return ret;
> +}
> +
> +/**
> + * i915_gem_vm_unbind_ioctl() - ioctl function for unbinding a mapping at a
> + * specified virtual address
> + * @dev: drm_device pointer
> + * @data: ioctl data structure
> + * @file: drm_file pointer
> + *
> + * Removes the persistent mapping at the specified address and unbinds it
> + * from the device page table.
> + *
> + * Returns 0 on success, error code on failure. -ENOENT is returned if the
> + * specified mapping is not found.
> + */
> +int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
> +			     struct drm_file *file)
> +{
> +	struct drm_i915_gem_vm_unbind *args = data;
> +	struct i915_address_space *vm;
> +	int ret;
> +
> +	/* Reserved fields must be 0 */
> +	if (args->rsvd || args->rsvd2[0] || args->rsvd2[1] ||
> +	    args->rsvd2[2] || args->extensions)
> +		return -EINVAL;
> +
> +	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
> +	if (unlikely(!vm))
> +		return -ENOENT;
> +
> +	ret = i915_gem_vm_unbind_vma(vm, args);
> +
> +	i915_vm_put(vm);
> +	return ret;
> +}
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> index e82a9d763e57..412368c67c46 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> @@ -12,6 +12,7 @@
>  
> 
> 
> 
>  #include "gem/i915_gem_internal.h"
>  #include "gem/i915_gem_lmem.h"
> +#include "gem/i915_gem_vm_bind.h"
>  #include "i915_trace.h"
>  #include "i915_utils.h"
>  #include "intel_gt.h"
> @@ -177,6 +178,8 @@ int i915_vm_lock_objects(struct i915_address_space *vm,
>  void i915_address_space_fini(struct i915_address_space *vm)
>  {
>  	drm_mm_takedown(&vm->mm);
> +	GEM_BUG_ON(!RB_EMPTY_ROOT(&vm->va.rb_root));
> +	mutex_destroy(&vm->vm_bind_lock);
>  }
>  
> 
> 
> 
>  /**
> @@ -203,6 +206,8 @@ static void __i915_vm_release(struct work_struct *work)
>  	struct i915_address_space *vm =
>  		container_of(work, struct i915_address_space, release_work);
>  
> 
> 
> 
> +	i915_gem_vm_unbind_all(vm);
> +
>  	__i915_vm_close(vm);
>  
> 
> 
> 
>  	/* Synchronize async unbinds. */
> @@ -279,6 +284,11 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
>  
> 
> 
> 
>  	INIT_LIST_HEAD(&vm->bound_list);
>  	INIT_LIST_HEAD(&vm->unbound_list);
> +
> +	vm->va = RB_ROOT_CACHED;
> +	INIT_LIST_HEAD(&vm->vm_bind_list);
> +	INIT_LIST_HEAD(&vm->vm_bound_list);
> +	mutex_init(&vm->vm_bind_lock);
>  }
>  
> 
> 
> 
>  void *__px_vaddr(struct drm_i915_gem_object *p)
> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> index 4d75ba4bb41d..3a9bee1b9d03 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> @@ -260,6 +260,15 @@ struct i915_address_space {
>  	 */
>  	struct list_head unbound_list;
>  
> 
> 
> 
> +	/** @vm_bind_lock: Mutex to protect @vm_bind_list and @vm_bound_list */
> +	struct mutex vm_bind_lock;
> +	/** @vm_bind_list: List of vm_binding in process */
> +	struct list_head vm_bind_list;
> +	/** @vm_bound_list: List of vm_binding completed */
> +	struct list_head vm_bound_list;
> +	/** @va: tree of persistent vmas */
> +	struct rb_root_cached va;
> +
>  	/* Global GTT */
>  	bool is_ggtt:1;
>  
> 
> 
> 
> diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
> index c3d43f9b1e45..cf41b96ac485 100644
> --- a/drivers/gpu/drm/i915/i915_driver.c
> +++ b/drivers/gpu/drm/i915/i915_driver.c
> @@ -69,6 +69,7 @@
>  #include "gem/i915_gem_ioctls.h"
>  #include "gem/i915_gem_mman.h"
>  #include "gem/i915_gem_pm.h"
> +#include "gem/i915_gem_vm_bind.h"
>  #include "gt/intel_gt.h"
>  #include "gt/intel_gt_pm.h"
>  #include "gt/intel_rc6.h"
> @@ -1892,6 +1893,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
>  	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
>  	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
> +	DRM_IOCTL_DEF_DRV(I915_GEM_VM_BIND, i915_gem_vm_bind_ioctl, DRM_RENDER_ALLOW),
> +	DRM_IOCTL_DEF_DRV(I915_GEM_VM_UNBIND, i915_gem_vm_unbind_ioctl, DRM_RENDER_ALLOW),
>  };
>  
> 
> 
> 
>  /*
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 529d97318f00..6a64a130dbcd 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -239,6 +239,7 @@ vma_create(struct drm_i915_gem_object *obj,
>  	spin_unlock(&obj->vma.lock);
>  	mutex_unlock(&vm->mutex);
>  
> 
> 
> 
> +	INIT_LIST_HEAD(&vma->vm_bind_link);
>  	return vma;
>  
> 
> 
> 
>  err_unlock:
> diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
> index 3144d71a0c3e..db786d2d1530 100644
> --- a/drivers/gpu/drm/i915/i915_vma_types.h
> +++ b/drivers/gpu/drm/i915/i915_vma_types.h
> @@ -295,6 +295,20 @@ struct i915_vma {
>  	/** This object's place on the active/inactive lists */
>  	struct list_head vm_link;
>  
> 
> 
> 
> +	/** @vm_bind_link: node for the vm_bind related lists of vm */
> +	struct list_head vm_bind_link;
> +
> +	/** Interval tree structures for persistent vma */
> +
> +	/** @rb: node for the interval tree of vm for persistent vmas */
> +	struct rb_node rb;
> +	/** @start: start endpoint of the rb node */
> +	u64 start;
> +	/** @last: Last endpoint of the rb node */
> +	u64 last;
> +	/** @__subtree_last: last in subtree */
> +	u64 __subtree_last;
> +
>  	struct list_head obj_link; /* Link in the object's VMA list */
>  	struct rb_node obj_node;
>  	struct hlist_node obj_hash;
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 8df261c5ab9b..f06a09f1db2d 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -470,6 +470,8 @@ typedef struct _drm_i915_sarea {
>  #define DRM_I915_GEM_VM_CREATE		0x3a
>  #define DRM_I915_GEM_VM_DESTROY		0x3b
>  #define DRM_I915_GEM_CREATE_EXT		0x3c
> +#define DRM_I915_GEM_VM_BIND		0x3d
> +#define DRM_I915_GEM_VM_UNBIND		0x3e
>  /* Must be kept compact -- no holes */
>  
> 
> 
> 
>  #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
> @@ -534,6 +536,8 @@ typedef struct _drm_i915_sarea {
>  #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
>  #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
>  #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
> +#define DRM_IOCTL_I915_GEM_VM_BIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
> +#define DRM_IOCTL_I915_GEM_VM_UNBIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
>  
> 
> 
> 
>  /* Allow drivers to submit batchbuffers directly to hardware, relying
>   * on the security mechanisms provided by hardware.
> @@ -3727,6 +3731,101 @@ struct drm_i915_gem_create_ext_protected_content {
>  /* ID of the protected content session managed by i915 when PXP is active */
>  #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
>  
> 
> 
> 
> +/**
> + * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
> + *
> + * This structure is passed to VM_BIND ioctl and specifies the mapping of GPU
> + * virtual address (VA) range to the section of an object that should be bound
> + * in the device page table of the specified address space (VM).
> + * The VA range specified must be unique (ie., not currently bound) and can
> + * be mapped to whole object or a section of the object (partial binding).
> + * Multiple VA mappings can be created to the same section of the object
> + * (aliasing).
> + *
> + * The @start, @offset and @length must be 4K page aligned. However the DG2
> + * and XEHPSDV has 64K page size for device local memory and has compact page
> + * table. On those platforms, for binding device local-memory objects, the
> + * @start, @offset and @length must be 64K aligned.
> + *
> + * Error code -EINVAL will be returned if @start, @offset and @length are not
> + * properly aligned. In version 1 (See I915_PARAM_VM_BIND_VERSION), error code
> + * -ENOSPC will be returned if the VA range specified can't be reserved.
> + *
> + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
> + * are not ordered. Furthermore, parts of the VM_BIND operation can be done
> + * asynchronously, if valid @fence is specified.
> + */
> +struct drm_i915_gem_vm_bind {
> +	/** @vm_id: VM (address space) id to bind */
> +	__u32 vm_id;
> +
> +	/** @handle: Object handle */
> +	__u32 handle;
> +
> +	/** @start: Virtual Address start to bind */
> +	__u64 start;
> +
> +	/** @offset: Offset in object to bind */
> +	__u64 offset;
> +
> +	/** @length: Length of mapping to bind */
> +	__u64 length;
> +
> +	/** @rsvd: Reserved, MBZ */
> +	__u64 rsvd[3];

In a brand new ioctl that even has extensions support, why do we need this?
If we have a plan to add something here in the future, can you please
tell us what it may be? Perhaps having that field already but accepting
only a default value/flag would be better.

I see in previous versions we had 'flags' here. Having 'flags', even if
MBZ for the initial version, seems like a nice thing to have for future
extensibility. Also, you're going to add back the flag to make the page
read-only at some point, right?
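
Purely to illustrate that suggestion (not the uapi in this patch; the
flag name below is made up), version 1 could already carry an explicit
MBZ flags field:

	struct drm_i915_gem_vm_bind {
		__u32 vm_id;
		__u32 handle;
		__u64 start;
		__u64 offset;
		__u64 length;
	/* hypothetical future flag, rejected (MBZ) in version 1 */
	#define I915_GEM_VM_BIND_READONLY	(1ull << 0)
		__u64 flags;
		__u64 rsvd[2];
		__u64 extensions;
	};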

> +
> +	/** @rsvd2: Reserved for timeline fence */
> +	__u64 rsvd2[2];

I see this one gets changed in the middle of the series.

> +
> +	/**
> +	 * @extensions: Zero-terminated chain of extensions.
> +	 *
> +	 * For future extensions. See struct i915_user_extension.
> +	 */
> +	__u64 extensions;
> +};
> +
> +/**
> + * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
> + *
> + * This structure is passed to VM_UNBIND ioctl and specifies the GPU virtual
> + * address (VA) range that should be unbound from the device page table of the
> + * specified address space (VM). VM_UNBIND will force unbind the specified
> + * range from device page table without waiting for any GPU job to complete.
> + * It is UMDs responsibility to ensure the mapping is no longer in use before
> + * calling VM_UNBIND.
> + *
> + * If the specified mapping is not found, the ioctl will simply return without
> + * any error.
> + *
> + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
> + * are not ordered. Furthermore, parts of the VM_UNBIND operation can be done
> + * asynchronously, if valid @fence is specified.

What @fence? There's no way to specify one.

> + */
> +struct drm_i915_gem_vm_unbind {
> +	/** @vm_id: VM (address space) id to bind */
> +	__u32 vm_id;
> +
> +	/** @rsvd: Reserved, MBZ */
> +	__u32 rsvd;

Again here, same question. Perhaps we could name it 'pad' or 'pad0' if
that's the specific goal of having this?

> +
> +	/** @start: Virtual Address start to unbind */
> +	__u64 start;
> +
> +	/** @length: Length of mapping to unbind */
> +	__u64 length;
> +
> +	/** @rsvd2: Reserved, MBZ */
> +	__u64 rsvd2[3];

And here, but this is definitely not just padding.

> +
> +	/**
> +	 * @extensions: Zero-terminated chain of extensions.
> +	 *
> +	 * For future extensions. See struct i915_user_extension.
> +	 */
> +	__u64 extensions;
> +};
> +
>  #if defined(__cplusplus)
>  }
>  #endif
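
As a quick illustration of how a UMD would drive this uapi (a sketch
only: it assumes a VM that was already created in vm_bind mode, which a
later patch in the series enables, and omits error handling):

	#include <string.h>
	#include <xf86drm.h>
	#include <drm/i915_drm.h>

	static int bind_bo(int fd, __u32 vm_id, __u32 bo_handle,
			   __u64 gpu_va, __u64 size)
	{
		struct drm_i915_gem_vm_bind bind;

		memset(&bind, 0, sizeof(bind));	/* rsvd[]/extensions must be zero */
		bind.vm_id = vm_id;
		bind.handle = bo_handle;
		bind.start = gpu_va;	/* 4K aligned (64K for lmem on DG2/XEHPSDV) */
		bind.offset = 0;	/* bind the whole object from offset 0 */
		bind.length = size;

		return drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind);
	}

	static int unbind_va(int fd, __u32 vm_id, __u64 gpu_va, __u64 size)
	{
		struct drm_i915_gem_vm_unbind unbind;

		memset(&unbind, 0, sizeof(unbind));	/* rsvd fields must be zero */
		unbind.vm_id = vm_id;
		unbind.start = gpu_va;
		unbind.length = size;	/* must match the bound mapping */

		return drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_UNBIND, &unbind);
	}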


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 00/20] drm/i915/vm_bind: Add VM_BIND functionality
  2022-11-10  0:16   ` [Intel-gfx] " Zanoni, Paulo R
@ 2022-11-10  5:49     ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-10  5:49 UTC (permalink / raw)
  To: Zanoni, Paulo R
  Cc: Brost, Matthew, andi.shyti, Landwerlin, Lionel G, Ursulin,
	 Tvrtko, Nikula, Jani, intel-gfx, dri-devel, Hellstrom, Thomas,
	Auld, Matthew, jason, Vetter, Daniel, christian.koenig

On Wed, Nov 09, 2022 at 04:16:25PM -0800, Zanoni, Paulo R wrote:
>On Mon, 2022-11-07 at 00:51 -0800, Niranjana Vishwanathapura wrote:
>> DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
>> buffer objects (BOs) or sections of a BOs at specified GPU virtual
>> addresses on a specified address space (VM). Multiple mappings can map
>> to the same physical pages of an object (aliasing). These mappings (also
>> referred to as persistent mappings) will be persistent across multiple
>> GPU submissions (execbuf calls) issued by the UMD, without user having
>> to provide a list of all required mappings during each submission (as
>> required by older execbuf mode).
>>
>> This patch series support VM_BIND version 1, as described by the param
>> I915_PARAM_VM_BIND_VERSION.
>>
>> Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
>> vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
>> The new execbuf3 ioctl will not have any execlist support and all the
>> legacy support like relocations etc., are removed.
>>
>> NOTEs:
>> * It is based on below VM_BIND design+uapi rfc.
>>   Documentation/gpu/rfc/i915_vm_bind.rst
>
>Hi
>
>One difference for execbuf3 that I noticed that is not mentioned in the
>RFC document is that we now don't have a way to signal
>EXEC_OBJECT_WRITE. When looking at the kernel code, there are some
>pieces that check for this flag:
>
>- there's code that deals with frontbuffer rendering
>- there's code that deals with fences
>- there's code that prevents self-modifying batches
>- another that seems related to waiting for objects
>
>Are there any new rules regarding frontbuffer rendering when we use
>execbuf3? Any other behavior changes related to the other places that
>we should expect when using execbuf3?
>

Paulo,
Most of the EXEC_OBJECT_WRITE checks in the execbuf path are related to
the implicit dependency tracker, which execbuf3 does not support. The
frontbuffer related update is the only exception and I don't
remember the rationale for not requiring this on execbuf3.
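
For context, the write hint that feeds the implicit tracking is a
per-object flag in the legacy execbuf2 uapi, something like:

	/* legacy execbuf2: declare that this submission writes the BO */
	struct drm_i915_gem_exec_object2 obj = {
		.handle = bo_handle,	/* hypothetical BO handle */
		.flags = EXEC_OBJECT_WRITE,
	};

execbuf3 has no per-object list to hang such a flag on, so any
dependencies have to be expressed explicitly through fences instead.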

Matt, Tvrtko, Daniel, can you please comment here?

Thanks,
Niranjana

>Thanks,
>Paulo
>
>>
>> * The IGT RFC series is posted as,
>>   [PATCH i-g-t v5 0/12] vm_bind: Add VM_BIND validation support
>>
>> v2: Address various review comments
>> v3: Address review comments and other fixes
>> v4: Remove vm_unbind out fence uapi which is not supported yet,
>>     replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()
>> v5: Render kernel-doc, use PIN_NOEVICT, limit vm_bind support to
>>     non-recoverable faults
>> v6: Rebased, minor fixes, add reserved fields to drm_i915_gem_vm_bind,
>>     add new patch for async vm_unbind support
>>
>> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>>
>> Niranjana Vishwanathapura (20):
>>   drm/i915/vm_bind: Expose vm lookup function
>>   drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
>>   drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
>>   drm/i915/vm_bind: Add support to create persistent vma
>>   drm/i915/vm_bind: Implement bind and unbind of object
>>   drm/i915/vm_bind: Support for VM private BOs
>>   drm/i915/vm_bind: Add support to handle object evictions
>>   drm/i915/vm_bind: Support persistent vma activeness tracking
>>   drm/i915/vm_bind: Add out fence support
>>   drm/i915/vm_bind: Abstract out common execbuf functions
>>   drm/i915/vm_bind: Use common execbuf functions in execbuf path
>>   drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
>>   drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
>>   drm/i915/vm_bind: Expose i915_request_await_bind()
>>   drm/i915/vm_bind: Handle persistent vmas in execbuf3
>>   drm/i915/vm_bind: userptr dma-resv changes
>>   drm/i915/vm_bind: Limit vm_bind mode to non-recoverable contexts
>>   drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode
>>   drm/i915/vm_bind: Render VM_BIND documentation
>>   drm/i915/vm_bind: Async vm_unbind support
>>
>>  Documentation/gpu/i915.rst                    |  78 +-
>>  drivers/gpu/drm/i915/Makefile                 |   3 +
>>  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  43 +-
>>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  17 +
>>  drivers/gpu/drm/i915/gem/i915_gem_create.c    |  72 +-
>>  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |   6 +
>>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 516 +----------
>>  .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 871 ++++++++++++++++++
>>  .../drm/i915/gem/i915_gem_execbuffer_common.c | 666 +++++++++++++
>>  .../drm/i915/gem/i915_gem_execbuffer_common.h |  74 ++
>>  drivers/gpu/drm/i915/gem/i915_gem_ioctls.h    |   2 +
>>  drivers/gpu/drm/i915/gem/i915_gem_object.c    |   3 +
>>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |   2 +
>>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 +
>>  drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  19 +
>>  drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  30 +
>>  .../drm/i915/gem/i915_gem_vm_bind_object.c    | 449 +++++++++
>>  drivers/gpu/drm/i915/gt/intel_gtt.c           |  17 +
>>  drivers/gpu/drm/i915/gt/intel_gtt.h           |  21 +
>>  drivers/gpu/drm/i915/i915_driver.c            |   4 +
>>  drivers/gpu/drm/i915/i915_drv.h               |   2 +
>>  drivers/gpu/drm/i915/i915_gem_gtt.c           |  39 +
>>  drivers/gpu/drm/i915/i915_gem_gtt.h           |   3 +
>>  drivers/gpu/drm/i915/i915_getparam.c          |   3 +
>>  drivers/gpu/drm/i915/i915_sw_fence.c          |  28 +-
>>  drivers/gpu/drm/i915/i915_sw_fence.h          |  23 +-
>>  drivers/gpu/drm/i915/i915_vma.c               | 186 +++-
>>  drivers/gpu/drm/i915/i915_vma.h               |  68 +-
>>  drivers/gpu/drm/i915/i915_vma_types.h         |  39 +
>>  include/uapi/drm/i915_drm.h                   | 264 +++++-
>>  30 files changed, 3008 insertions(+), 546 deletions(-)
>>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
>>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
>>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h
>>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
>>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
>>
>

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Intel-gfx] [PATCH v6 00/20] drm/i915/vm_bind: Add VM_BIND functionality
  2022-11-10  5:49     ` [Intel-gfx] " Niranjana Vishwanathapura
  (?)
@ 2022-11-10 14:47     ` Tvrtko Ursulin
  2022-11-10 15:05       ` Matthew Auld
  -1 siblings, 1 reply; 71+ messages in thread
From: Tvrtko Ursulin @ 2022-11-10 14:47 UTC (permalink / raw)
  To: Niranjana Vishwanathapura, Zanoni, Paulo R
  Cc: Nikula, Jani, intel-gfx, dri-devel, Hellstrom, Thomas, Auld,
	Matthew, Vetter, Daniel, christian.koenig


On 10/11/2022 05:49, Niranjana Vishwanathapura wrote:
> On Wed, Nov 09, 2022 at 04:16:25PM -0800, Zanoni, Paulo R wrote:
>> On Mon, 2022-11-07 at 00:51 -0800, Niranjana Vishwanathapura wrote:
>>> DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
>>> buffer objects (BOs) or sections of a BOs at specified GPU virtual
>>> addresses on a specified address space (VM). Multiple mappings can map
>>> to the same physical pages of an object (aliasing). These mappings (also
>>> referred to as persistent mappings) will be persistent across multiple
>>> GPU submissions (execbuf calls) issued by the UMD, without user having
>>> to provide a list of all required mappings during each submission (as
>>> required by older execbuf mode).
>>>
>>> This patch series support VM_BIND version 1, as described by the param
>>> I915_PARAM_VM_BIND_VERSION.
>>>
>>> Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
>>> vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
>>> The new execbuf3 ioctl will not have any execlist support and all the
>>> legacy support like relocations etc., are removed.
>>>
>>> NOTEs:
>>> * It is based on below VM_BIND design+uapi rfc.
>>>   Documentation/gpu/rfc/i915_vm_bind.rst
>>
>> Hi
>>
>> One difference for execbuf3 that I noticed that is not mentioned in the
>> RFC document is that we now don't have a way to signal
>> EXEC_OBJECT_WRITE. When looking at the kernel code, there are some
>> pieces that check for this flag:
>>
>> - there's code that deals with frontbuffer rendering
>> - there's code that deals with fences
>> - there's code that prevents self-modifying batches
>> - another that seems related to waiting for objects
>>
>> Are there any new rules regarding frontbuffer rendering when we use
>> execbuf3? Any other behavior changes related to the other places that
>> we should expect when using execbuf3?
>>
> 
> Paulo,
> Most of the EXEC_OBJECT_WRITE checks in the execbuf path are related to
> the implicit dependency tracker, which execbuf3 does not support. The
> frontbuffer related update is the only exception and I don't
> remember the rationale for not requiring this on execbuf3.
> 
> Matt, Tvrtko, Daniel, can you please comment here?

Does not ring a bell to me. Looking at the code it certainly looks like 
it would be silently failing to handle it properly.

I'll let people with more experience in this area answer, but from my 
point of view, if it is decided that it can be left unsupported, then we 
probably need a way of failing the ioctl if it is used against a frontbuffer, 
or something, instead of having display corruption.
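
Something along these lines in the execbuf3 path could do that (just a
sketch, assuming a helper like the existing
i915_gem_object_is_framebuffer()):

	/* hypothetical check while validating execbuf3 objects */
	if (i915_gem_object_is_framebuffer(obj))
		return -EOPNOTSUPP;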

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Intel-gfx] [PATCH v6 00/20] drm/i915/vm_bind: Add VM_BIND functionality
  2022-11-10 14:47     ` Tvrtko Ursulin
@ 2022-11-10 15:05       ` Matthew Auld
  2022-11-10 21:37         ` Zanoni, Paulo R
  0 siblings, 1 reply; 71+ messages in thread
From: Matthew Auld @ 2022-11-10 15:05 UTC (permalink / raw)
  To: Tvrtko Ursulin, Niranjana Vishwanathapura, Zanoni, Paulo R
  Cc: Nikula, Jani, intel-gfx, dri-devel, Hellstrom, Thomas, Vetter,
	Daniel, christian.koenig

On 10/11/2022 14:47, Tvrtko Ursulin wrote:
> 
> On 10/11/2022 05:49, Niranjana Vishwanathapura wrote:
>> On Wed, Nov 09, 2022 at 04:16:25PM -0800, Zanoni, Paulo R wrote:
>>> On Mon, 2022-11-07 at 00:51 -0800, Niranjana Vishwanathapura wrote:
>>>> DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
>>>> buffer objects (BOs) or sections of a BOs at specified GPU virtual
>>>> addresses on a specified address space (VM). Multiple mappings can map
>>>> to the same physical pages of an object (aliasing). These mappings 
>>>> (also
>>>> referred to as persistent mappings) will be persistent across multiple
>>>> GPU submissions (execbuf calls) issued by the UMD, without user having
>>>> to provide a list of all required mappings during each submission (as
>>>> required by older execbuf mode).
>>>>
>>>> This patch series support VM_BIND version 1, as described by the param
>>>> I915_PARAM_VM_BIND_VERSION.
>>>>
>>>> Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
>>>> vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
>>>> The new execbuf3 ioctl will not have any execlist support and all the
>>>> legacy support like relocations etc., are removed.
>>>>
>>>> NOTEs:
>>>> * It is based on below VM_BIND design+uapi rfc.
>>>>   Documentation/gpu/rfc/i915_vm_bind.rst
>>>
>>> Hi
>>>
>>> One difference for execbuf3 that I noticed that is not mentioned in the
>>> RFC document is that we now don't have a way to signal
>>> EXEC_OBJECT_WRITE. When looking at the kernel code, there are some
>>> pieces that check for this flag:
>>>
>>> - there's code that deals with frontbuffer rendering
>>> - there's code that deals with fences
>>> - there's code that prevents self-modifying batches
>>> - another that seems related to waiting for objects
>>>
>>> Are there any new rules regarding frontbuffer rendering when we use
>>> execbuf3? Any other behavior changes related to the other places that
>>> we should expect when using execbuf3?
>>>
>>
>> Paulo,
>> Most of the EXEC_OBJECT_WRITE checks in the execbuf path are related to
>> the implicit dependency tracker, which execbuf3 does not support. The
>> frontbuffer related update is the only exception and I don't
>> remember the rationale for not requiring this on execbuf3.
>>
>> Matt, Tvrtko, Daniel, can you please comment here?
> 
> Does not ring a bell to me. Looking at the code it certainly looks like 
> it would be silently failing to handle it properly.
> 
> I'll let people with more experience in this area answer, but from my 
> point of view, if it is decided that it can be left unsupported, then we 
> probably need a way of failing the ioctl is used against a frontbuffer, 
> or something, instead of having display corruption.

Maybe it's a coincidence but there is:
https://patchwork.freedesktop.org/series/110715/

Which looks relevant. Maarten, any hints here?

> 
> Regards,
> 
> Tvrtko

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 05/20] drm/i915/vm_bind: Implement bind and unbind of object
  2022-11-10  1:28     ` [Intel-gfx] " Zanoni, Paulo R
@ 2022-11-10 16:32       ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-10 16:32 UTC (permalink / raw)
  To: Zanoni, Paulo R
  Cc: Brost, Matthew, andi.shyti, Landwerlin, Lionel G, Ursulin,
	 Tvrtko, Nikula, Jani, intel-gfx, dri-devel, Hellstrom, Thomas,
	Auld, Matthew, jason, Vetter, Daniel, christian.koenig

On Wed, Nov 09, 2022 at 05:28:59PM -0800, Zanoni, Paulo R wrote:
>On Mon, 2022-11-07 at 00:51 -0800, Niranjana Vishwanathapura wrote:
>> Add uapi and implement support for bind and unbind of an
>> object at the specified GPU virtual addresses.
>>
>> The vm_bind mode is not supported in legacy execbuf2 ioctl.
>> It will be supported only in the newer execbuf3 ioctl.
>>
>> v2: On older platforms ctx->vm is not set, check for it.
>>     In vm_bind call, add vma to vm_bind_list.
>>     Add more input validity checks.
>>     Update some documentation.
>> v3: In vm_bind call, add vma to vm_bound_list as user can
>>     request a fence and pass to execbuf3 as input fence.
>>     Remove short term pinning with PIN_VALIDATE flag.
>> v4: Replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode().
>> v5: Ensure all reserved fields are 0, use PIN_NOEVICT.
>> v6: Add reserved fields to drm_i915_gem_vm_bind.
>>
>> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>> Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
>> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
>> ---
>>  drivers/gpu/drm/i915/Makefile                 |   1 +
>>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  15 +
>>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   5 +
>>  drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
>>  .../drm/i915/gem/i915_gem_vm_bind_object.c    | 324 ++++++++++++++++++
>>  drivers/gpu/drm/i915/gt/intel_gtt.c           |  10 +
>>  drivers/gpu/drm/i915/gt/intel_gtt.h           |   9 +
>>  drivers/gpu/drm/i915/i915_driver.c            |   3 +
>>  drivers/gpu/drm/i915/i915_vma.c               |   1 +
>>  drivers/gpu/drm/i915/i915_vma_types.h         |  14 +
>>  include/uapi/drm/i915_drm.h                   |  99 ++++++
>>  11 files changed, 507 insertions(+)
>>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
>>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
>>
>> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
>> index 51704b54317c..b731f3ac80da 100644
>> --- a/drivers/gpu/drm/i915/Makefile
>> +++ b/drivers/gpu/drm/i915/Makefile
>> @@ -166,6 +166,7 @@ gem-y += \
>>  	gem/i915_gem_ttm_move.o \
>>  	gem/i915_gem_ttm_pm.o \
>>  	gem/i915_gem_userptr.o \
>> +	gem/i915_gem_vm_bind_object.o \
>>  	gem/i915_gem_wait.o \
>>  	gem/i915_gemfs.o
>>  i915-y += \
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
>> index 899fa8f1e0fe..e8b41aa8f8c4 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
>> @@ -139,6 +139,21 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>>  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
>>  				       struct drm_file *file);
>>  
>>
>>
>>
>> +/**
>> + * i915_gem_vm_is_vm_bind_mode() - Check if address space is in vm_bind mode
>> + * @vm: the address space
>> + *
>> + * Returns:
>> + * true: @vm is in vm_bind mode; allows only vm_bind method of binding.
>> + * false: @vm is not in vm_bind mode; allows only legacy execbuff method
>> + *        of binding.
>> + */
>> +static inline bool i915_gem_vm_is_vm_bind_mode(struct i915_address_space *vm)
>> +{
>> +	/* No support to enable vm_bind mode yet */
>> +	return false;
>> +}
>> +
>>  struct i915_address_space *
>>  i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id);
>>  
>>
>>
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> index 1160723c9d2d..c5bc9f6e887f 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> @@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
>>  	if (unlikely(IS_ERR(ctx)))
>>  		return PTR_ERR(ctx);
>>  
>>
>>
>>
>> +	if (ctx->vm && i915_gem_vm_is_vm_bind_mode(ctx->vm)) {
>> +		i915_gem_context_put(ctx);
>> +		return -EOPNOTSUPP;
>> +	}
>> +
>>  	eb->gem_context = ctx;
>>  	if (i915_gem_context_has_full_ppgtt(ctx))
>>  		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
>> new file mode 100644
>> index 000000000000..36262a6357b5
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
>> @@ -0,0 +1,26 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright © 2022 Intel Corporation
>> + */
>> +
>> +#ifndef __I915_GEM_VM_BIND_H
>> +#define __I915_GEM_VM_BIND_H
>> +
>> +#include <linux/types.h>
>> +
>> +struct drm_device;
>> +struct drm_file;
>> +struct i915_address_space;
>> +struct i915_vma;
>> +
>> +struct i915_vma *
>> +i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
>> +
>> +int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
>> +			   struct drm_file *file);
>> +int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
>> +			     struct drm_file *file);
>> +
>> +void i915_gem_vm_unbind_all(struct i915_address_space *vm);
>> +
>> +#endif /* __I915_GEM_VM_BIND_H */
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
>> new file mode 100644
>> index 000000000000..6f299806bee1
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
>> @@ -0,0 +1,324 @@
>> +// SPDX-License-Identifier: MIT
>> +/*
>> + * Copyright © 2022 Intel Corporation
>> + */
>> +
>> +#include <uapi/drm/i915_drm.h>
>> +
>> +#include <linux/interval_tree_generic.h>
>> +
>> +#include "gem/i915_gem_context.h"
>> +#include "gem/i915_gem_vm_bind.h"
>> +
>> +#include "gt/intel_gpu_commands.h"
>> +
>> +#define START(node) ((node)->start)
>> +#define LAST(node) ((node)->last)
>> +
>> +/* Not all defined functions are used, hence use __maybe_unused */
>> +INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
>> +		     START, LAST, __maybe_unused static inline, i915_vm_bind_it)
>> +
>> +#undef START
>> +#undef LAST
>> +
>> +/**
>> + * DOC: VM_BIND/UNBIND ioctls
>> + *
>> + * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
>> + * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
>> + * specified address space (VM). Multiple mappings can map to the same physical
>> + * pages of an object (aliasing). These mappings (also referred to as persistent
>> + * mappings) will be persistent across multiple GPU submissions (execbuf calls)
>> + * issued by the UMD, without user having to provide a list of all required
>> + * mappings during each submission (as required by older execbuf mode).
>> + *
>> + * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
>> + * signaling the completion of bind/unbind operation.
>> + *
>> + * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
>> + * User has to opt-in for VM_BIND mode of binding for an address space (VM)
>> + * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
>> + *
>> + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
>> + * are not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be
>> + * done asynchronously, when valid out fence is specified.
>> + *
>> + * VM_BIND locking order is as below.
>> + *
>> + * 1) vm_bind_lock mutex will protect vm_bind lists. This lock is taken in
>> + *    vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing the
>> + *    mapping.
>> + *
>> + *    In future, when GPU page faults are supported, we can potentially use a
>> + *    rwsem instead, so that multiple page fault handlers can take the read
>> + *    side lock to lookup the mapping and hence can run in parallel.
>> + *    The older execbuf mode of binding do not need this lock.
>> + *
>> + * 2) The object's dma-resv lock will protect i915_vma state and needs
>> + *    to be held while binding/unbinding a vma in the async worker and while
>> + *    updating dma-resv fence list of an object. Note that private BOs of a VM
>> + *    will all share a dma-resv object.
>> + *
>> + * 3) Spinlock/s to protect some of the VM's lists like the list of
>> + *    invalidated vmas (due to eviction and userptr invalidation) etc.
>> + */
>> +
>> +/**
>> + * i915_gem_vm_bind_lookup_vma() - lookup for persistent vma mapped at a
>> + * specified address
>> + * @vm: virtual address space to look for persistent vma
>> + * @va: starting address where vma is mapped
>> + *
>> + * Retrieves the persistent vma mapped address @va from the @vm's vma tree.
>> + *
>> + * Returns vma pointer on success, NULL on failure.
>> + */
>> +struct i915_vma *
>> +i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va)
>> +{
>> +	lockdep_assert_held(&vm->vm_bind_lock);
>> +
>> +	return i915_vm_bind_it_iter_first(&vm->va, va, va);
>> +}
>> +
>> +static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj)
>> +{
>> +	lockdep_assert_held(&vma->vm->vm_bind_lock);
>> +
>> +	list_del_init(&vma->vm_bind_link);
>> +	i915_vm_bind_it_remove(vma, &vma->vm->va);
>> +
>> +	/* Release object */
>> +	if (release_obj)
>> +		i915_gem_object_put(vma->obj);
>> +}
>> +
>> +static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
>> +				  struct drm_i915_gem_vm_unbind *va)
>> +{
>> +	struct drm_i915_gem_object *obj;
>> +	struct i915_vma *vma;
>> +	int ret;
>> +
>> +	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
>> +	if (ret)
>> +		return ret;
>> +
>> +	va->start = gen8_noncanonical_addr(va->start);
>> +	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
>> +
>> +	if (!vma)
>> +		ret = -ENOENT;
>> +	else if (vma->size != va->length)
>> +		ret = -EINVAL;
>> +
>> +	if (ret) {
>> +		mutex_unlock(&vm->vm_bind_lock);
>> +		return ret;
>> +	}
>> +
>> +	i915_gem_vm_bind_remove(vma, false);
>> +
>> +	mutex_unlock(&vm->vm_bind_lock);
>> +
>> +	/*
>> +	 * Destroy the vma and then release the object.
>> +	 * As persistent vma holds object reference, it can only be destroyed
>> +	 * either by vm_unbind ioctl or when VM is being released. As we are
>> +	 * holding VM reference here, it is safe accessing the vma here.
>> +	 */
>> +	obj = vma->obj;
>> +	i915_gem_object_lock(obj, NULL);
>> +	i915_vma_destroy(vma);
>> +	i915_gem_object_unlock(obj);
>> +
>> +	i915_gem_object_put(obj);
>> +
>> +	return 0;
>> +}
>> +
>> +/**
>> + * i915_gem_vm_unbind_all() - unbind all persistent mappings from an
>> + * address space
>> + * @vm: Address space to remove persistent mappings from
>> + *
>> + * Unbind all userspace requested vm_bind mappings from @vm.
>> + */
>> +void i915_gem_vm_unbind_all(struct i915_address_space *vm)
>> +{
>> +	struct i915_vma *vma, *t;
>> +
>> +	mutex_lock(&vm->vm_bind_lock);
>> +	list_for_each_entry_safe(vma, t, &vm->vm_bind_list, vm_bind_link)
>> +		i915_gem_vm_bind_remove(vma, true);
>> +	list_for_each_entry_safe(vma, t, &vm->vm_bound_list, vm_bind_link)
>> +		i915_gem_vm_bind_remove(vma, true);
>> +	mutex_unlock(&vm->vm_bind_lock);
>> +}
>> +
>> +static struct i915_vma *vm_bind_get_vma(struct i915_address_space *vm,
>> +					struct drm_i915_gem_object *obj,
>> +					struct drm_i915_gem_vm_bind *va)
>> +{
>> +	struct i915_gtt_view view;
>> +	struct i915_vma *vma;
>> +
>> +	va->start = gen8_noncanonical_addr(va->start);
>> +	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
>> +	if (vma)
>> +		return ERR_PTR(-EEXIST);
>> +
>> +	view.type = I915_GTT_VIEW_PARTIAL;
>> +	view.partial.offset = va->offset >> PAGE_SHIFT;
>> +	view.partial.size = va->length >> PAGE_SHIFT;
>> +	vma = i915_vma_create_persistent(obj, vm, &view);
>> +	if (IS_ERR(vma))
>> +		return vma;
>> +
>> +	vma->start = va->start;
>> +	vma->last = va->start + va->length - 1;
>> +
>> +	return vma;
>> +}
>> +
>> +static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
>> +				struct drm_i915_gem_vm_bind *va,
>> +				struct drm_file *file)
>> +{
>> +	struct drm_i915_gem_object *obj;
>> +	struct i915_vma *vma = NULL;
>> +	struct i915_gem_ww_ctx ww;
>> +	u64 pin_flags;
>> +	int ret = 0;
>> +
>> +	if (!i915_gem_vm_is_vm_bind_mode(vm))
>> +		return -EOPNOTSUPP;
>> +
>> +	/* Ensure start and length fields are valid */
>> +	if (!va->length || !IS_ALIGNED(va->start, I915_GTT_PAGE_SIZE))
>> +		ret = -EINVAL;
>> +
>> +	obj = i915_gem_object_lookup(file, va->handle);
>> +	if (!obj)
>> +		return -ENOENT;
>> +
>> +	/* Ensure offset and length are aligned to object's max page size */
>> +	if (!IS_ALIGNED(va->offset | va->length,
>> +			i915_gem_object_max_page_size(obj->mm.placements,
>> +						      obj->mm.n_placements)))
>> +		ret = -EINVAL;
>> +
>> +	/* Check for mapping range overflow */
>> +	if (range_overflows_t(u64, va->offset, va->length, obj->base.size))
>> +		ret = -EINVAL;
>> +
>> +	if (ret)
>> +		goto put_obj;
>> +
>> +	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
>> +	if (ret)
>> +		goto put_obj;
>> +
>> +	vma = vm_bind_get_vma(vm, obj, va);
>> +	if (IS_ERR(vma)) {
>> +		ret = PTR_ERR(vma);
>> +		goto unlock_vm;
>> +	}
>> +
>> +	pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER |
>> +		    PIN_VALIDATE | PIN_NOEVICT;
>> +
>> +	for_i915_gem_ww(&ww, ret, true) {
>> +		ret = i915_gem_object_lock(vma->obj, &ww);
>> +		if (ret)
>> +			continue;
>> +
>> +		ret = i915_vma_pin_ww(vma, &ww, 0, 0, pin_flags);
>> +		if (ret)
>> +			continue;
>> +
>> +		list_add_tail(&vma->vm_bind_link, &vm->vm_bound_list);
>> +		i915_vm_bind_it_insert(vma, &vm->va);
>> +
>> +		/* Hold object reference until vm_unbind */
>> +		i915_gem_object_get(vma->obj);
>> +	}
>> +
>> +	if (ret)
>> +		i915_vma_destroy(vma);
>> +unlock_vm:
>> +	mutex_unlock(&vm->vm_bind_lock);
>> +put_obj:
>> +	i915_gem_object_put(obj);
>> +
>> +	return ret;
>> +}
>> +
>> +/**
>> + * i915_gem_vm_bind_ioctl() - ioctl function for binding a section of object
>> + * at a specified virtual address
>> + * @dev: drm_device pointer
>> + * @data: ioctl data structure
>> + * @file: drm_file pointer
>> + *
>> + * Adds the specified persistent mapping (virtual address to a section of an
>> + * object) and binds it in the device page table.
>> + *
>> + * Returns 0 on success, error code on failure.
>> + */
>> +int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
>> +			   struct drm_file *file)
>> +{
>> +	struct drm_i915_gem_vm_bind *args = data;
>> +	struct i915_address_space *vm;
>> +	int ret;
>> +
>> +	/* Reserved fields must be 0 */
>> +	if (args->rsvd[0] || args->rsvd[1] || args->rsvd[2] || args->extensions)
>> +		return -EINVAL;
>> +
>> +	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
>> +	if (unlikely(!vm))
>> +		return -ENOENT;
>> +
>> +	ret = i915_gem_vm_bind_obj(vm, args, file);
>> +
>> +	i915_vm_put(vm);
>> +	return ret;
>> +}
>> +
>> +/**
>> + * i915_gem_vm_unbind_ioctl() - ioctl function for unbinding a mapping at a
>> + * specified virtual address
>> + * @dev: drm_device pointer
>> + * @data: ioctl data structure
>> + * @file: drm_file pointer
>> + *
>> + * Removes the persistent mapping at the specified address and unbinds it
>> + * from the device page table.
>> + *
>> + * Returns 0 on success, error code on failure. -ENOENT is returned if the
>> + * specified mapping is not found.
>> + */
>> +int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
>> +			     struct drm_file *file)
>> +{
>> +	struct drm_i915_gem_vm_unbind *args = data;
>> +	struct i915_address_space *vm;
>> +	int ret;
>> +
>> +	/* Reserved fields must be 0 */
>> +	if (args->rsvd || args->rsvd2[0] || args->rsvd2[1] ||
>> +	    args->rsvd2[2] || args->extensions)
>> +		return -EINVAL;
>> +
>> +	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
>> +	if (unlikely(!vm))
>> +		return -ENOENT;
>> +
>> +	ret = i915_gem_vm_unbind_vma(vm, args);
>> +
>> +	i915_vm_put(vm);
>> +	return ret;
>> +}
>> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
>> index e82a9d763e57..412368c67c46 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
>> @@ -12,6 +12,7 @@
>>  
>>
>>
>>
>>  #include "gem/i915_gem_internal.h"
>>  #include "gem/i915_gem_lmem.h"
>> +#include "gem/i915_gem_vm_bind.h"
>>  #include "i915_trace.h"
>>  #include "i915_utils.h"
>>  #include "intel_gt.h"
>> @@ -177,6 +178,8 @@ int i915_vm_lock_objects(struct i915_address_space *vm,
>>  void i915_address_space_fini(struct i915_address_space *vm)
>>  {
>>  	drm_mm_takedown(&vm->mm);
>> +	GEM_BUG_ON(!RB_EMPTY_ROOT(&vm->va.rb_root));
>> +	mutex_destroy(&vm->vm_bind_lock);
>>  }
>>  
>>
>>
>>
>>  /**
>> @@ -203,6 +206,8 @@ static void __i915_vm_release(struct work_struct *work)
>>  	struct i915_address_space *vm =
>>  		container_of(work, struct i915_address_space, release_work);
>>  
>>
>>
>>
>> +	i915_gem_vm_unbind_all(vm);
>> +
>>  	__i915_vm_close(vm);
>>  
>>
>>
>>
>>  	/* Synchronize async unbinds. */
>> @@ -279,6 +284,11 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
>>  
>>
>>
>>
>>  	INIT_LIST_HEAD(&vm->bound_list);
>>  	INIT_LIST_HEAD(&vm->unbound_list);
>> +
>> +	vm->va = RB_ROOT_CACHED;
>> +	INIT_LIST_HEAD(&vm->vm_bind_list);
>> +	INIT_LIST_HEAD(&vm->vm_bound_list);
>> +	mutex_init(&vm->vm_bind_lock);
>>  }
>>  
>>
>>
>>
>>  void *__px_vaddr(struct drm_i915_gem_object *p)
>> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
>> index 4d75ba4bb41d..3a9bee1b9d03 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
>> @@ -260,6 +260,15 @@ struct i915_address_space {
>>  	 */
>>  	struct list_head unbound_list;
>>  
>>
>>
>>
>> +	/** @vm_bind_lock: Mutex to protect @vm_bind_list and @vm_bound_list */
>> +	struct mutex vm_bind_lock;
>> +	/** @vm_bind_list: List of vm_binding in process */
>> +	struct list_head vm_bind_list;
>> +	/** @vm_bound_list: List of vm_binding completed */
>> +	struct list_head vm_bound_list;
>> +	/** @va: tree of persistent vmas */
>> +	struct rb_root_cached va;
>> +
>>  	/* Global GTT */
>>  	bool is_ggtt:1;
>>  
>>
>>
>>
>> diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
>> index c3d43f9b1e45..cf41b96ac485 100644
>> --- a/drivers/gpu/drm/i915/i915_driver.c
>> +++ b/drivers/gpu/drm/i915/i915_driver.c
>> @@ -69,6 +69,7 @@
>>  #include "gem/i915_gem_ioctls.h"
>>  #include "gem/i915_gem_mman.h"
>>  #include "gem/i915_gem_pm.h"
>> +#include "gem/i915_gem_vm_bind.h"
>>  #include "gt/intel_gt.h"
>>  #include "gt/intel_gt_pm.h"
>>  #include "gt/intel_rc6.h"
>> @@ -1892,6 +1893,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
>>  	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_RENDER_ALLOW),
>>  	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
>>  	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
>> +	DRM_IOCTL_DEF_DRV(I915_GEM_VM_BIND, i915_gem_vm_bind_ioctl, DRM_RENDER_ALLOW),
>> +	DRM_IOCTL_DEF_DRV(I915_GEM_VM_UNBIND, i915_gem_vm_unbind_ioctl, DRM_RENDER_ALLOW),
>>  };
>>  
>>
>>
>>
>>  /*
>> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
>> index 529d97318f00..6a64a130dbcd 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.c
>> +++ b/drivers/gpu/drm/i915/i915_vma.c
>> @@ -239,6 +239,7 @@ vma_create(struct drm_i915_gem_object *obj,
>>  	spin_unlock(&obj->vma.lock);
>>  	mutex_unlock(&vm->mutex);
>>  
>>
>>
>>
>> +	INIT_LIST_HEAD(&vma->vm_bind_link);
>>  	return vma;
>>  
>>
>>
>>
>>  err_unlock:
>> diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
>> index 3144d71a0c3e..db786d2d1530 100644
>> --- a/drivers/gpu/drm/i915/i915_vma_types.h
>> +++ b/drivers/gpu/drm/i915/i915_vma_types.h
>> @@ -295,6 +295,20 @@ struct i915_vma {
>>  	/** This object's place on the active/inactive lists */
>>  	struct list_head vm_link;
>>  
>>
>>
>>
>> +	/** @vm_bind_link: node for the vm_bind related lists of vm */
>> +	struct list_head vm_bind_link;
>> +
>> +	/** Interval tree structures for persistent vma */
>> +
>> +	/** @rb: node for the interval tree of vm for persistent vmas */
>> +	struct rb_node rb;
>> +	/** @start: start endpoint of the rb node */
>> +	u64 start;
>> +	/** @last: Last endpoint of the rb node */
>> +	u64 last;
>> +	/** @__subtree_last: last in subtree */
>> +	u64 __subtree_last;
>> +
>>  	struct list_head obj_link; /* Link in the object's VMA list */
>>  	struct rb_node obj_node;
>>  	struct hlist_node obj_hash;
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index 8df261c5ab9b..f06a09f1db2d 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -470,6 +470,8 @@ typedef struct _drm_i915_sarea {
>>  #define DRM_I915_GEM_VM_CREATE		0x3a
>>  #define DRM_I915_GEM_VM_DESTROY		0x3b
>>  #define DRM_I915_GEM_CREATE_EXT		0x3c
>> +#define DRM_I915_GEM_VM_BIND		0x3d
>> +#define DRM_I915_GEM_VM_UNBIND		0x3e
>>  /* Must be kept compact -- no holes */
>>  
>>
>>
>>
>>  #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
>> @@ -534,6 +536,8 @@ typedef struct _drm_i915_sarea {
>>  #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
>>  #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
>>  #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
>> +#define DRM_IOCTL_I915_GEM_VM_BIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
>> +#define DRM_IOCTL_I915_GEM_VM_UNBIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
>>  
>>
>>
>>
>>  /* Allow drivers to submit batchbuffers directly to hardware, relying
>>   * on the security mechanisms provided by hardware.
>> @@ -3727,6 +3731,101 @@ struct drm_i915_gem_create_ext_protected_content {
>>  /* ID of the protected content session managed by i915 when PXP is active */
>>  #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
>>  
>>
>>
>>
>> +/**
>> + * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
>> + *
>> + * This structure is passed to VM_BIND ioctl and specifies the mapping of GPU
>> + * virtual address (VA) range to the section of an object that should be bound
>> + * in the device page table of the specified address space (VM).
>> + * The VA range specified must be unique (i.e., not currently bound) and can
>> + * be mapped to the whole object or to a section of the object (partial
>> + * binding). Multiple VA mappings can be created to the same section of the
>> + * object (aliasing).
>> + *
>> + * The @start, @offset and @length must be 4K page aligned. However, DG2 and
>> + * XEHPSDV have a 64K page size for device local memory and a compact page
>> + * table. On those platforms, for binding device local-memory objects, the
>> + * @start, @offset and @length must be 64K aligned.
>> + *
>> + * Error code -EINVAL will be returned if @start, @offset and @length are not
>> + * properly aligned. In version 1 (See I915_PARAM_VM_BIND_VERSION), error code
>> + * -ENOSPC will be returned if the VA range specified can't be reserved.
>> + *
>> + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
>> + * are not ordered. Furthermore, parts of the VM_BIND operation can be done
>> + * asynchronously, if valid @fence is specified.
>> + */
>> +struct drm_i915_gem_vm_bind {
>> +	/** @vm_id: VM (address space) id to bind */
>> +	__u32 vm_id;
>> +
>> +	/** @handle: Object handle */
>> +	__u32 handle;
>> +
>> +	/** @start: Virtual Address start to bind */
>> +	__u64 start;
>> +
>> +	/** @offset: Offset in object to bind */
>> +	__u64 offset;
>> +
>> +	/** @length: Length of mapping to bind */
>> +	__u64 length;
>> +
>> +	/** @rsvd: Reserved, MBZ */
>> +	__u64 rsvd[3];
>
>In a brand new ioctl with even extensions support, why do we need this?
>If we have a plan to add something here in the future, can you please
>tell us what it may be? Perhaps having that field already but accepting
>only a default value/flag would be better.
>

One quad word is for flags and the other two are reserved. I had the flag
defined previously, but changed it to reserved based on review comments.
Yes, we have extensions, but I think it is OK to have some reserved fields
(I see other examples here). There are some future expansion plans (like
PAT setting support etc.) anyhow. Is that fine?

>I see in previous versions we had 'flags' here. Having 'flags', even if
>MBZ for the initial version, seems like a nice thing to have for future
>extensibility. Also, you're going to add back the flag to make the page
>read-only at some point, right?

Yah, I can separate out the flags here and in vm_unbind. Matt, hope that is
fine.

>
>> +
>> +	/** @rsvd2: Reserved for timeline fence */
>> +	__u64 rsvd2[2];
>
>I see this one gets changed in the middle of the series.
>

Yah, we reserve it for the timeline fence here in this patch;
timeline fence support is added in a later patch in this series.

>> +
>> +	/**
>> +	 * @extensions: Zero-terminated chain of extensions.
>> +	 *
>> +	 * For future extensions. See struct i915_user_extension.
>> +	 */
>> +	__u64 extensions;
>> +};
>> +
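
For anyone prototyping against this from userspace, here is a rough,
illustrative sketch of a bind call using the struct exactly as defined
above. It is not from this series or any UMD: error handling is trimmed,
and fd, vm_id and handle are assumed to come from the usual render-node
open, VM_CREATE and GEM_CREATE steps. For DG2/XEHPSDV local-memory objects,
gpu_va, offset and length would all need to be 64K multiples per the
kernel-doc above.

  #include <errno.h>
  #include <xf86drm.h>       /* drmIoctl() */
  #include <drm/i915_drm.h>  /* uapi header with this series applied */

  /* Bind a section of a BO at a fixed GPU VA on a vm_bind mode VM. */
  static int bind_bo_section(int fd, __u32 vm_id, __u32 handle,
                             __u64 gpu_va, __u64 offset, __u64 length)
  {
          struct drm_i915_gem_vm_bind bind = {
                  .vm_id  = vm_id,
                  .handle = handle,
                  .start  = gpu_va,  /* must honour the alignment rules above */
                  .offset = offset,
                  .length = length,
                  /* rsvd[], rsvd2[] and extensions are left zero (MBZ) */
          };

          if (drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind))
                  return -errno;  /* e.g. -EINVAL, -ENOENT, -EEXIST, -ENOSPC */
          return 0;
  }
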
>> +/**
>> + * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
>> + *
>> + * This structure is passed to VM_UNBIND ioctl and specifies the GPU virtual
>> + * address (VA) range that should be unbound from the device page table of the
>> + * specified address space (VM). VM_UNBIND will force unbind the specified
>> + * range from device page table without waiting for any GPU job to complete.
>> + * It is the UMD's responsibility to ensure the mapping is no longer in use
>> + * before calling VM_UNBIND.
>> + *
>> + * If the specified mapping is not found, the ioctl will simply return without
>> + * any error.
>> + *
>> + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
>> + * are not ordered. Furthermore, parts of the VM_UNBIND operation can be done
>> + * asynchronously, if valid @fence is specified.
>
>What @fence? There's no way to specify one.
>

Yah, will remove "if valid @fence is specified".

>> + */
>> +struct drm_i915_gem_vm_unbind {
>> +	/** @vm_id: VM (address space) id to bind */
>> +	__u32 vm_id;
>> +
>> +	/** @rsvd: Reserved, MBZ */
>> +	__u32 rsvd;
>
>Again here, same question. Perhaps we could name it 'pad' or 'pad0' if
>that's the specific goal of having this?
>

Yah, pad is appropriate here.

>> +
>> +	/** @start: Virtual Address start to unbind */
>> +	__u64 start;
>> +
>> +	/** @length: Length of mapping to unbind */
>> +	__u64 length;
>> +
>> +	/** @rsvd2: Reserved, MBZ */
>> +	__u64 rsvd2[3];
>
>And here, but this is definitely not just padding.
>

It is for 'flags' and 'timeline fence' in case we need them later on.

Niranjana

>> +
>> +	/**
>> +	 * @extensions: Zero-terminated chain of extensions.
>> +	 *
>> +	 * For future extensions. See struct i915_user_extension.
>> +	 */
>> +	__u64 extensions;
>> +};
>> +
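
The matching unbind, with the same caveats and includes as the bind sketch
above. Note that, as implemented in i915_gem_vm_unbind_vma(), the length
must match the size of the original mapping:

  /* Unbind whatever persistent mapping was created at gpu_va. */
  static int unbind_va(int fd, __u32 vm_id, __u64 gpu_va, __u64 length)
  {
          struct drm_i915_gem_vm_unbind unbind = {
                  .vm_id  = vm_id,
                  .start  = gpu_va,
                  .length = length,  /* must equal the length used at bind time */
                  /* rsvd, rsvd2[] and extensions are left zero (MBZ) */
          };

          if (drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_UNBIND, &unbind))
                  return -errno;  /* -ENOENT if nothing is bound at start, -EINVAL on a length mismatch */
          return 0;
  }
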
>>  #if defined(__cplusplus)
>>  }
>>  #endif
>

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [Intel-gfx] [PATCH v6 05/20] drm/i915/vm_bind: Implement bind and unbind of object
@ 2022-11-10 16:32       ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 71+ messages in thread
From: Niranjana Vishwanathapura @ 2022-11-10 16:32 UTC (permalink / raw)
  To: Zanoni, Paulo R
  Cc: Nikula, Jani, intel-gfx, dri-devel, Hellstrom, Thomas, Auld,
	Matthew, Vetter, Daniel, christian.koenig

On Wed, Nov 09, 2022 at 05:28:59PM -0800, Zanoni, Paulo R wrote:
>On Mon, 2022-11-07 at 00:51 -0800, Niranjana Vishwanathapura wrote:
>> Add uapi and implement support for bind and unbind of an
>> object at the specified GPU virtual addresses.
>>
>> The vm_bind mode is not supported in legacy execbuf2 ioctl.
>> It will be supported only in the newer execbuf3 ioctl.
>>
>> v2: On older platforms ctx->vm is not set, check for it.
>>     In vm_bind call, add vma to vm_bind_list.
>>     Add more input validity checks.
>>     Update some documentation.
>> v3: In vm_bind call, add vma to vm_bound_list as user can
>>     request a fence and pass to execbuf3 as input fence.
>>     Remove short term pinning with PIN_VALIDATE flag.
>> v4: Replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode().
>> v5: Ensure all reserved fields are 0, use PIN_NOEVICT.
>> v6: Add reserved fields to drm_i915_gem_vm_bind.
>>
>> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>> Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
>> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
>> ---
>>  drivers/gpu/drm/i915/Makefile                 |   1 +
>>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  15 +
>>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   5 +
>>  drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
>>  .../drm/i915/gem/i915_gem_vm_bind_object.c    | 324 ++++++++++++++++++
>>  drivers/gpu/drm/i915/gt/intel_gtt.c           |  10 +
>>  drivers/gpu/drm/i915/gt/intel_gtt.h           |   9 +
>>  drivers/gpu/drm/i915/i915_driver.c            |   3 +
>>  drivers/gpu/drm/i915/i915_vma.c               |   1 +
>>  drivers/gpu/drm/i915/i915_vma_types.h         |  14 +
>>  include/uapi/drm/i915_drm.h                   |  99 ++++++
>>  11 files changed, 507 insertions(+)
>>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
>>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
>>
>> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
>> index 51704b54317c..b731f3ac80da 100644
>> --- a/drivers/gpu/drm/i915/Makefile
>> +++ b/drivers/gpu/drm/i915/Makefile
>> @@ -166,6 +166,7 @@ gem-y += \
>>  	gem/i915_gem_ttm_move.o \
>>  	gem/i915_gem_ttm_pm.o \
>>  	gem/i915_gem_userptr.o \
>> +	gem/i915_gem_vm_bind_object.o \
>>  	gem/i915_gem_wait.o \
>>  	gem/i915_gemfs.o
>>  i915-y += \
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
>> index 899fa8f1e0fe..e8b41aa8f8c4 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
>> @@ -139,6 +139,21 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>>  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
>>  				       struct drm_file *file);
>>  
>>
>>
>>
>> +/**
>> + * i915_gem_vm_is_vm_bind_mode() - Check if address space is in vm_bind mode
>> + * @vm: the address space
>> + *
>> + * Returns:
>> + * true: @vm is in vm_bind mode; allows only vm_bind method of binding.
>> + * false: @vm is not in vm_bind mode; allows only legacy execbuf method
>> + *        of binding.
>> + */
>> +static inline bool i915_gem_vm_is_vm_bind_mode(struct i915_address_space *vm)
>> +{
>> +	/* No support to enable vm_bind mode yet */
>> +	return false;
>> +}
>> +
>>  struct i915_address_space *
>>  i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id);
>>  
>>
>>
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> index 1160723c9d2d..c5bc9f6e887f 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
>> @@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
>>  	if (unlikely(IS_ERR(ctx)))
>>  		return PTR_ERR(ctx);
>>  
>>
>>
>>
>> +	if (ctx->vm && i915_gem_vm_is_vm_bind_mode(ctx->vm)) {
>> +		i915_gem_context_put(ctx);
>> +		return -EOPNOTSUPP;
>> +	}
>> +
>>  	eb->gem_context = ctx;
>>  	if (i915_gem_context_has_full_ppgtt(ctx))
>>  		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
>> new file mode 100644
>> index 000000000000..36262a6357b5
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
>> @@ -0,0 +1,26 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright © 2022 Intel Corporation
>> + */
>> +
>> +#ifndef __I915_GEM_VM_BIND_H
>> +#define __I915_GEM_VM_BIND_H
>> +
>> +#include <linux/types.h>
>> +
>> +struct drm_device;
>> +struct drm_file;
>> +struct i915_address_space;
>> +struct i915_vma;
>> +
>> +struct i915_vma *
>> +i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
>> +
>> +int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
>> +			   struct drm_file *file);
>> +int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
>> +			     struct drm_file *file);
>> +
>> +void i915_gem_vm_unbind_all(struct i915_address_space *vm);
>> +
>> +#endif /* __I915_GEM_VM_BIND_H */
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
>> new file mode 100644
>> index 000000000000..6f299806bee1
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
>> @@ -0,0 +1,324 @@
>> +// SPDX-License-Identifier: MIT
>> +/*
>> + * Copyright © 2022 Intel Corporation
>> + */
>> +
>> +#include <uapi/drm/i915_drm.h>
>> +
>> +#include <linux/interval_tree_generic.h>
>> +
>> +#include "gem/i915_gem_context.h"
>> +#include "gem/i915_gem_vm_bind.h"
>> +
>> +#include "gt/intel_gpu_commands.h"
>> +
>> +#define START(node) ((node)->start)
>> +#define LAST(node) ((node)->last)
>> +
>> +/* Not all defined functions are used, hence use __maybe_unused */
>> +INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
>> +		     START, LAST, __maybe_unused static inline, i915_vm_bind_it)
>> +
>> +#undef START
>> +#undef LAST
>> +
>> +/**
>> + * DOC: VM_BIND/UNBIND ioctls
>> + *
>> + * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
>> + * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
>> + * specified address space (VM). Multiple mappings can map to the same physical
>> + * pages of an object (aliasing). These mappings (also referred to as persistent
>> + * mappings) will be persistent across multiple GPU submissions (execbuf calls)
>> + * issued by the UMD, without user having to provide a list of all required
>> + * mappings during each submission (as required by older execbuf mode).
>> + *
>> + * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
>> + * signaling the completion of bind/unbind operation.
>> + *
>> + * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
>> + * User has to opt-in for VM_BIND mode of binding for an address space (VM)
>> + * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
>> + *
>> + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
>> + * are not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be
>> + * done asynchronously, when valid out fence is specified.
>> + *
>> + * VM_BIND locking order is as below.
>> + *
>> + * 1) vm_bind_lock mutex will protect vm_bind lists. This lock is taken in
>> + *    vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing the
>> + *    mapping.
>> + *
>> + *    In future, when GPU page faults are supported, we can potentially use a
>> + *    rwsem instead, so that multiple page fault handlers can take the read
>> + *    side lock to lookup the mapping and hence can run in parallel.
>> + *    The older execbuf mode of binding does not need this lock.
>> + *
>> + * 2) The object's dma-resv lock will protect i915_vma state and needs
>> + *    to be held while binding/unbinding a vma in the async worker and while
>> + *    updating dma-resv fence list of an object. Note that private BOs of a VM
>> + *    will all share a dma-resv object.
>> + *
>> + * 3) Spinlock/s to protect some of the VM's lists like the list of
>> + *    invalidated vmas (due to eviction and userptr invalidation) etc.
>> + */
>> +
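
To make the opt-in flow above concrete, a rough userspace sketch
(illustrative only). I915_PARAM_VM_BIND_VERSION and
I915_VM_CREATE_FLAGS_USE_VM_BIND are introduced by later patches in this
series; treating them as a getparam id and a VM_CREATE flag bit below is an
assumption made purely for the sake of the example:

  #include <errno.h>
  #include <xf86drm.h>       /* drmIoctl() */
  #include <drm/i915_drm.h>  /* uapi header with this series applied */

  /* Probe for VM_BIND support and create an address space in vm_bind mode. */
  static int create_vm_bind_vm(int fd, __u32 *vm_id)
  {
          int version = 0;
          struct drm_i915_getparam gp = {
                  .param = I915_PARAM_VM_BIND_VERSION,
                  .value = &version,
          };
          struct drm_i915_gem_vm_control ctl = {
                  .flags = I915_VM_CREATE_FLAGS_USE_VM_BIND,
          };

          if (drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp) || version < 1)
                  return -ENODEV;  /* kernel has no (usable) VM_BIND support */

          if (drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_CREATE, &ctl))
                  return -errno;

          *vm_id = ctl.vm_id;
          return 0;
  }
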
>> +/**
>> + * i915_gem_vm_bind_lookup_vma() - lookup for persistent vma mapped at a
>> + * specified address
>> + * @vm: virtual address space to look for persistent vma
>> + * @va: starting address where vma is mapped
>> + *
>> + * Retrieves the persistent vma mapped at address @va from the @vm's vma tree.
>> + *
>> + * Returns vma pointer on success, NULL on failure.
>> + */
>> +struct i915_vma *
>> +i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va)
>> +{
>> +	lockdep_assert_held(&vm->vm_bind_lock);
>> +
>> +	return i915_vm_bind_it_iter_first(&vm->va, va, va);
>> +}
>> +
>> +static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj)
>> +{
>> +	lockdep_assert_held(&vma->vm->vm_bind_lock);
>> +
>> +	list_del_init(&vma->vm_bind_link);
>> +	i915_vm_bind_it_remove(vma, &vma->vm->va);
>> +
>> +	/* Release object */
>> +	if (release_obj)
>> +		i915_gem_object_put(vma->obj);
>> +}
>> +
>> +static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
>> +				  struct drm_i915_gem_vm_unbind *va)
>> +{
>> +	struct drm_i915_gem_object *obj;
>> +	struct i915_vma *vma;
>> +	int ret;
>> +
>> +	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
>> +	if (ret)
>> +		return ret;
>> +
>> +	va->start = gen8_noncanonical_addr(va->start);
>> +	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
>> +
>> +	if (!vma)
>> +		ret = -ENOENT;
>> +	else if (vma->size != va->length)
>> +		ret = -EINVAL;
>> +
>> +	if (ret) {
>> +		mutex_unlock(&vm->vm_bind_lock);
>> +		return ret;
>> +	}
>> +
>> +	i915_gem_vm_bind_remove(vma, false);
>> +
>> +	mutex_unlock(&vm->vm_bind_lock);
>> +
>> +	/*
>> +	 * Destroy the vma and then release the object.
>> +	 * As persistent vma holds object reference, it can only be destroyed
>> +	 * either by vm_unbind ioctl or when VM is being released. As we are
>> +	 * holding VM reference here, it is safe accessing the vma here.
>> +	 */
>> +	obj = vma->obj;
>> +	i915_gem_object_lock(obj, NULL);
>> +	i915_vma_destroy(vma);
>> +	i915_gem_object_unlock(obj);
>> +
>> +	i915_gem_object_put(obj);
>> +
>> +	return 0;
>> +}
>> +
>> +/**
>> + * i915_gem_vm_unbind_all() - unbind all persistent mappings from an
>> + * address space
>> + * @vm: Address space to remove persistent mappings from
>> + *
>> + * Unbind all userspace requested vm_bind mappings from @vm.
>> + */
>> +void i915_gem_vm_unbind_all(struct i915_address_space *vm)
>> +{
>> +	struct i915_vma *vma, *t;
>> +
>> +	mutex_lock(&vm->vm_bind_lock);
>> +	list_for_each_entry_safe(vma, t, &vm->vm_bind_list, vm_bind_link)
>> +		i915_gem_vm_bind_remove(vma, true);
>> +	list_for_each_entry_safe(vma, t, &vm->vm_bound_list, vm_bind_link)
>> +		i915_gem_vm_bind_remove(vma, true);
>> +	mutex_unlock(&vm->vm_bind_lock);
>> +}
>> +
>> +static struct i915_vma *vm_bind_get_vma(struct i915_address_space *vm,
>> +					struct drm_i915_gem_object *obj,
>> +					struct drm_i915_gem_vm_bind *va)
>> +{
>> +	struct i915_gtt_view view;
>> +	struct i915_vma *vma;
>> +
>> +	va->start = gen8_noncanonical_addr(va->start);
>> +	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
>> +	if (vma)
>> +		return ERR_PTR(-EEXIST);
>> +
>> +	view.type = I915_GTT_VIEW_PARTIAL;
>> +	view.partial.offset = va->offset >> PAGE_SHIFT;
>> +	view.partial.size = va->length >> PAGE_SHIFT;
>> +	vma = i915_vma_create_persistent(obj, vm, &view);
>> +	if (IS_ERR(vma))
>> +		return vma;
>> +
>> +	vma->start = va->start;
>> +	vma->last = va->start + va->length - 1;
>> +
>> +	return vma;
>> +}
>> +
>> +static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
>> +				struct drm_i915_gem_vm_bind *va,
>> +				struct drm_file *file)
>> +{
>> +	struct drm_i915_gem_object *obj;
>> +	struct i915_vma *vma = NULL;
>> +	struct i915_gem_ww_ctx ww;
>> +	u64 pin_flags;
>> +	int ret = 0;
>> +
>> +	if (!i915_gem_vm_is_vm_bind_mode(vm))
>> +		return -EOPNOTSUPP;
>> +
>> +	/* Ensure start and length fields are valid */
>> +	if (!va->length || !IS_ALIGNED(va->start, I915_GTT_PAGE_SIZE))
>> +		ret = -EINVAL;
>> +
>> +	obj = i915_gem_object_lookup(file, va->handle);
>> +	if (!obj)
>> +		return -ENOENT;
>> +
>> +	/* Ensure offset and length are aligned to object's max page size */
>> +	if (!IS_ALIGNED(va->offset | va->length,
>> +			i915_gem_object_max_page_size(obj->mm.placements,
>> +						      obj->mm.n_placements)))
>> +		ret = -EINVAL;
>> +
>> +	/* Check for mapping range overflow */
>> +	if (range_overflows_t(u64, va->offset, va->length, obj->base.size))
>> +		ret = -EINVAL;
>> +
>> +	if (ret)
>> +		goto put_obj;
>> +
>> +	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
>> +	if (ret)
>> +		goto put_obj;
>> +
>> +	vma = vm_bind_get_vma(vm, obj, va);
>> +	if (IS_ERR(vma)) {
>> +		ret = PTR_ERR(vma);
>> +		goto unlock_vm;
>> +	}
>> +
>> +	pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER |
>> +		    PIN_VALIDATE | PIN_NOEVICT;
>> +
>> +	for_i915_gem_ww(&ww, ret, true) {
>> +		ret = i915_gem_object_lock(vma->obj, &ww);
>> +		if (ret)
>> +			continue;
>> +
>> +		ret = i915_vma_pin_ww(vma, &ww, 0, 0, pin_flags);
>> +		if (ret)
>> +			continue;
>> +
>> +		list_add_tail(&vma->vm_bind_link, &vm->vm_bound_list);
>> +		i915_vm_bind_it_insert(vma, &vm->va);
>> +
>> +		/* Hold object reference until vm_unbind */
>> +		i915_gem_object_get(vma->obj);
>> +	}
>> +
>> +	if (ret)
>> +		i915_vma_destroy(vma);
>> +unlock_vm:
>> +	mutex_unlock(&vm->vm_bind_lock);
>> +put_obj:
>> +	i915_gem_object_put(obj);
>> +
>> +	return ret;
>> +}
>> +
>> +/**
>> + * i915_gem_vm_bind_ioctl() - ioctl function for binding a section of object
>> + * at a specified virtual address
>> + * @dev: drm_device pointer
>> + * @data: ioctl data structure
>> + * @file: drm_file pointer
>> + *
>> + * Adds the specified persistent mapping (virtual address to a section of an
>> + * object) and binds it in the device page table.
>> + *
>> + * Returns 0 on success, error code on failure.
>> + */
>> +int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
>> +			   struct drm_file *file)
>> +{
>> +	struct drm_i915_gem_vm_bind *args = data;
>> +	struct i915_address_space *vm;
>> +	int ret;
>> +
>> +	/* Reserved fields must be 0 */
>> +	if (args->rsvd[0] || args->rsvd[1] || args->rsvd[2] || args->extensions)
>> +		return -EINVAL;
>> +
>> +	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
>> +	if (unlikely(!vm))
>> +		return -ENOENT;
>> +
>> +	ret = i915_gem_vm_bind_obj(vm, args, file);
>> +
>> +	i915_vm_put(vm);
>> +	return ret;
>> +}
>> +
>> +/**
>> + * i915_gem_vm_unbind_ioctl() - ioctl function for unbinding a mapping at a
>> + * specified virtual address
>> + * @dev: drm_device pointer
>> + * @data: ioctl data structure
>> + * @file: drm_file pointer
>> + *
>> + * Removes the persistent mapping at the specified address and unbinds it
>> + * from the device page table.
>> + *
>> + * Returns 0 on success, error code on failure. -ENOENT is returned if the
>> + * specified mapping is not found.
>> + */
>> +int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
>> +			     struct drm_file *file)
>> +{
>> +	struct drm_i915_gem_vm_unbind *args = data;
>> +	struct i915_address_space *vm;
>> +	int ret;
>> +
>> +	/* Reserved fields must be 0 */
>> +	if (args->rsvd || args->rsvd2[0] || args->rsvd2[1] ||
>> +	    args->rsvd2[2] || args->extensions)
>> +		return -EINVAL;
>> +
>> +	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
>> +	if (unlikely(!vm))
>> +		return -ENOENT;
>> +
>> +	ret = i915_gem_vm_unbind_vma(vm, args);
>> +
>> +	i915_vm_put(vm);
>> +	return ret;
>> +}
>> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
>> index e82a9d763e57..412368c67c46 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
>> @@ -12,6 +12,7 @@
>>  
>>
>>
>>
>>  #include "gem/i915_gem_internal.h"
>>  #include "gem/i915_gem_lmem.h"
>> +#include "gem/i915_gem_vm_bind.h"
>>  #include "i915_trace.h"
>>  #include "i915_utils.h"
>>  #include "intel_gt.h"
>> @@ -177,6 +178,8 @@ int i915_vm_lock_objects(struct i915_address_space *vm,
>>  void i915_address_space_fini(struct i915_address_space *vm)
>>  {
>>  	drm_mm_takedown(&vm->mm);
>> +	GEM_BUG_ON(!RB_EMPTY_ROOT(&vm->va.rb_root));
>> +	mutex_destroy(&vm->vm_bind_lock);
>>  }
>>  
>>
>>
>>
>>  /**
>> @@ -203,6 +206,8 @@ static void __i915_vm_release(struct work_struct *work)
>>  	struct i915_address_space *vm =
>>  		container_of(work, struct i915_address_space, release_work);
>>  
>>
>>
>>
>> +	i915_gem_vm_unbind_all(vm);
>> +
>>  	__i915_vm_close(vm);
>>  
>>
>>
>>
>>  	/* Synchronize async unbinds. */
>> @@ -279,6 +284,11 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
>>  
>>
>>
>>
>>  	INIT_LIST_HEAD(&vm->bound_list);
>>  	INIT_LIST_HEAD(&vm->unbound_list);
>> +
>> +	vm->va = RB_ROOT_CACHED;
>> +	INIT_LIST_HEAD(&vm->vm_bind_list);
>> +	INIT_LIST_HEAD(&vm->vm_bound_list);
>> +	mutex_init(&vm->vm_bind_lock);
>>  }
>>  
>>
>>
>>
>>  void *__px_vaddr(struct drm_i915_gem_object *p)
>> diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
>> index 4d75ba4bb41d..3a9bee1b9d03 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
>> @@ -260,6 +260,15 @@ struct i915_address_space {
>>  	 */
>>  	struct list_head unbound_list;
>>  
>>
>>
>>
>> +	/** @vm_bind_lock: Mutex to protect @vm_bind_list and @vm_bound_list */
>> +	struct mutex vm_bind_lock;
>> +	/** @vm_bind_list: List of vm_binding in process */
>> +	struct list_head vm_bind_list;
>> +	/** @vm_bound_list: List of vm_binding completed */
>> +	struct list_head vm_bound_list;
>> +	/** @va: tree of persistent vmas */
>> +	struct rb_root_cached va;
>> +
>>  	/* Global GTT */
>>  	bool is_ggtt:1;
>>  
>>
>>
>>
>> diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
>> index c3d43f9b1e45..cf41b96ac485 100644
>> --- a/drivers/gpu/drm/i915/i915_driver.c
>> +++ b/drivers/gpu/drm/i915/i915_driver.c
>> @@ -69,6 +69,7 @@
>>  #include "gem/i915_gem_ioctls.h"
>>  #include "gem/i915_gem_mman.h"
>>  #include "gem/i915_gem_pm.h"
>> +#include "gem/i915_gem_vm_bind.h"
>>  #include "gt/intel_gt.h"
>>  #include "gt/intel_gt_pm.h"
>>  #include "gt/intel_rc6.h"
>> @@ -1892,6 +1893,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
>>  	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_RENDER_ALLOW),
>>  	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
>>  	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
>> +	DRM_IOCTL_DEF_DRV(I915_GEM_VM_BIND, i915_gem_vm_bind_ioctl, DRM_RENDER_ALLOW),
>> +	DRM_IOCTL_DEF_DRV(I915_GEM_VM_UNBIND, i915_gem_vm_unbind_ioctl, DRM_RENDER_ALLOW),
>>  };
>>  
>>
>>
>>
>>  /*
>> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
>> index 529d97318f00..6a64a130dbcd 100644
>> --- a/drivers/gpu/drm/i915/i915_vma.c
>> +++ b/drivers/gpu/drm/i915/i915_vma.c
>> @@ -239,6 +239,7 @@ vma_create(struct drm_i915_gem_object *obj,
>>  	spin_unlock(&obj->vma.lock);
>>  	mutex_unlock(&vm->mutex);
>>  
>>
>>
>>
>> +	INIT_LIST_HEAD(&vma->vm_bind_link);
>>  	return vma;
>>  
>>
>>
>>
>>  err_unlock:
>> diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
>> index 3144d71a0c3e..db786d2d1530 100644
>> --- a/drivers/gpu/drm/i915/i915_vma_types.h
>> +++ b/drivers/gpu/drm/i915/i915_vma_types.h
>> @@ -295,6 +295,20 @@ struct i915_vma {
>>  	/** This object's place on the active/inactive lists */
>>  	struct list_head vm_link;
>>  
>>
>>
>>
>> +	/** @vm_bind_link: node for the vm_bind related lists of vm */
>> +	struct list_head vm_bind_link;
>> +
>> +	/** Interval tree structures for persistent vma */
>> +
>> +	/** @rb: node for the interval tree of vm for persistent vmas */
>> +	struct rb_node rb;
>> +	/** @start: start endpoint of the rb node */
>> +	u64 start;
>> +	/** @last: Last endpoint of the rb node */
>> +	u64 last;
>> +	/** @__subtree_last: last in subtree */
>> +	u64 __subtree_last;
>> +
>>  	struct list_head obj_link; /* Link in the object's VMA list */
>>  	struct rb_node obj_node;
>>  	struct hlist_node obj_hash;
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index 8df261c5ab9b..f06a09f1db2d 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -470,6 +470,8 @@ typedef struct _drm_i915_sarea {
>>  #define DRM_I915_GEM_VM_CREATE		0x3a
>>  #define DRM_I915_GEM_VM_DESTROY		0x3b
>>  #define DRM_I915_GEM_CREATE_EXT		0x3c
>> +#define DRM_I915_GEM_VM_BIND		0x3d
>> +#define DRM_I915_GEM_VM_UNBIND		0x3e
>>  /* Must be kept compact -- no holes */
>>  
>>
>>
>>
>>  #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
>> @@ -534,6 +536,8 @@ typedef struct _drm_i915_sarea {
>>  #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
>>  #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
>>  #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
>> +#define DRM_IOCTL_I915_GEM_VM_BIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
>> +#define DRM_IOCTL_I915_GEM_VM_UNBIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
>>  
>>
>>
>>
>>  /* Allow drivers to submit batchbuffers directly to hardware, relying
>>   * on the security mechanisms provided by hardware.
>> @@ -3727,6 +3731,101 @@ struct drm_i915_gem_create_ext_protected_content {
>>  /* ID of the protected content session managed by i915 when PXP is active */
>>  #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
>>  
>>
>>
>>
>> +/**
>> + * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
>> + *
>> + * This structure is passed to VM_BIND ioctl and specifies the mapping of GPU
>> + * virtual address (VA) range to the section of an object that should be bound
>> + * in the device page table of the specified address space (VM).
>> + * The VA range specified must be unique (i.e., not currently bound) and can
>> + * be mapped to the whole object or to a section of the object (partial
>> + * binding). Multiple VA mappings can be created to the same section of the
>> + * object (aliasing).
>> + *
>> + * The @start, @offset and @length must be 4K page aligned. However, DG2 and
>> + * XEHPSDV have a 64K page size for device local memory and a compact page
>> + * table. On those platforms, for binding device local-memory objects, the
>> + * @start, @offset and @length must be 64K aligned.
>> + *
>> + * Error code -EINVAL will be returned if @start, @offset and @length are not
>> + * properly aligned. In version 1 (See I915_PARAM_VM_BIND_VERSION), error code
>> + * -ENOSPC will be returned if the VA range specified can't be reserved.
>> + *
>> + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
>> + * are not ordered. Furthermore, parts of the VM_BIND operation can be done
>> + * asynchronously, if valid @fence is specified.
>> + */
>> +struct drm_i915_gem_vm_bind {
>> +	/** @vm_id: VM (address space) id to bind */
>> +	__u32 vm_id;
>> +
>> +	/** @handle: Object handle */
>> +	__u32 handle;
>> +
>> +	/** @start: Virtual Address start to bind */
>> +	__u64 start;
>> +
>> +	/** @offset: Offset in object to bind */
>> +	__u64 offset;
>> +
>> +	/** @length: Length of mapping to bind */
>> +	__u64 length;
>> +
>> +	/** @rsvd: Reserved, MBZ */
>> +	__u64 rsvd[3];
>
>In a brand new ioctl with even extensions support, why do we need this?
>If we have a plan to add something here in the future, can you please
>tell us what it may be? Perhaps having that field already but accepting
>only a default value/flag would be better.
>

One quad word is for flags and the other two are reserved. I had the flag
defined previously, but changed it to reserved based on review comments.
Yes, we have extensions, but I think it is OK to have some reserved fields
(I see other examples here). There are some future expansion plans (like
PAT setting support etc.) anyhow. Is that fine?

>I see in previous versions we had 'flags' here. Having 'flags', even if
>MBZ for the initial version, seems like a nice thing to have for future
>extensibility. Also, you're going to add back the flag to make the page
>read-only at some point, right?

Yah, I can separate out the flags here and in vm_unbind. Matt, hope that is
fine.

>
>> +
>> +	/** @rsvd2: Reserved for timeline fence */
>> +	__u64 rsvd2[2];
>
>I see this one gets changed in the middle of the series.
>

Yah, we reserve it for the timeline fence here in this patch;
timeline fence support is added in a later patch in this series.

>> +
>> +	/**
>> +	 * @extensions: Zero-terminated chain of extensions.
>> +	 *
>> +	 * For future extensions. See struct i915_user_extension.
>> +	 */
>> +	__u64 extensions;
>> +};
>> +
>> +/**
>> + * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
>> + *
>> + * This structure is passed to VM_UNBIND ioctl and specifies the GPU virtual
>> + * address (VA) range that should be unbound from the device page table of the
>> + * specified address space (VM). VM_UNBIND will force unbind the specified
>> + * range from device page table without waiting for any GPU job to complete.
>> + * It is the UMD's responsibility to ensure the mapping is no longer in use
>> + * before calling VM_UNBIND.
>> + *
>> + * If the specified mapping is not found, the ioctl will simply return without
>> + * any error.
>> + *
>> + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
>> + * are not ordered. Furthermore, parts of the VM_UNBIND operation can be done
>> + * asynchronously, if valid @fence is specified.
>
>What @fence? There's no way to specify one.
>

Yah, will remove "if valid @fence is specified".

>> + */
>> +struct drm_i915_gem_vm_unbind {
>> +	/** @vm_id: VM (address space) id to bind */
>> +	__u32 vm_id;
>> +
>> +	/** @rsvd: Reserved, MBZ */
>> +	__u32 rsvd;
>
>Again here, same question. Perhaps we could name it 'pad' or 'pad0' if
>that's the specific goal of having this?
>

Yah, pad is appropriate here.

>> +
>> +	/** @start: Virtual Address start to unbind */
>> +	__u64 start;
>> +
>> +	/** @length: Length of mapping to unbind */
>> +	__u64 length;
>> +
>> +	/** @rsvd2: Reserved, MBZ */
>> +	__u64 rsvd2[3];
>
>And here, but this is definitely not just padding.
>

It is for 'flags' and 'timeline fence' in case we need them later on.

Niranjana

>> +
>> +	/**
>> +	 * @extensions: Zero-terminated chain of extensions.
>> +	 *
>> +	 * For future extensions. See struct i915_user_extension.
>> +	 */
>> +	__u64 extensions;
>> +};
>> +
>>  #if defined(__cplusplus)
>>  }
>>  #endif
>

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 05/20] drm/i915/vm_bind: Implement bind and unbind of object
  2022-11-10 16:32       ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-11-10 21:21         ` Zanoni, Paulo R
  -1 siblings, 0 replies; 71+ messages in thread
From: Zanoni, Paulo R @ 2022-11-10 21:21 UTC (permalink / raw)
  To: Vishwanathapura, Niranjana
  Cc: Brost, Matthew, jason, Landwerlin, Lionel G, Ursulin, Tvrtko,
	Nikula, Jani, intel-gfx, dri-devel, Hellstrom, Thomas, Auld,
	Matthew, andi.shyti, Vetter,  Daniel, christian.koenig

On Thu, 2022-11-10 at 08:32 -0800, Niranjana Vishwanathapura wrote:
> On Wed, Nov 09, 2022 at 05:28:59PM -0800, Zanoni, Paulo R wrote:
> > On Mon, 2022-11-07 at 00:51 -0800, Niranjana Vishwanathapura wrote:
> > > Add uapi and implement support for bind and unbind of an
> > > object at the specified GPU virtual addresses.
> > > 
> > > The vm_bind mode is not supported in legacy execbuf2 ioctl.
> > > It will be supported only in the newer execbuf3 ioctl.
> > > 
> > > v2: On older platforms ctx->vm is not set, check for it.
> > >     In vm_bind call, add vma to vm_bind_list.
> > >     Add more input validity checks.
> > >     Update some documentation.
> > > v3: In vm_bind call, add vma to vm_bound_list as user can
> > >     request a fence and pass to execbuf3 as input fence.
> > >     Remove short term pinning with PIN_VALIDATE flag.
> > > v4: Replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode().
> > > v5: Ensure all reserved fields are 0, use PIN_NOEVICT.
> > > v6: Add reserved fields to drm_i915_gem_vm_bind.
> > > 
> > > Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> > > Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> > > Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
> > > Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/Makefile                 |   1 +
> > >  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  15 +
> > >  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |   5 +
> > >  drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  26 ++
> > >  .../drm/i915/gem/i915_gem_vm_bind_object.c    | 324 ++++++++++++++++++
> > >  drivers/gpu/drm/i915/gt/intel_gtt.c           |  10 +
> > >  drivers/gpu/drm/i915/gt/intel_gtt.h           |   9 +
> > >  drivers/gpu/drm/i915/i915_driver.c            |   3 +
> > >  drivers/gpu/drm/i915/i915_vma.c               |   1 +
> > >  drivers/gpu/drm/i915/i915_vma_types.h         |  14 +
> > >  include/uapi/drm/i915_drm.h                   |  99 ++++++
> > >  11 files changed, 507 insertions(+)
> > >  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
> > >  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> > > 
> > > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > > index 51704b54317c..b731f3ac80da 100644
> > > --- a/drivers/gpu/drm/i915/Makefile
> > > +++ b/drivers/gpu/drm/i915/Makefile
> > > @@ -166,6 +166,7 @@ gem-y += \
> > >  	gem/i915_gem_ttm_move.o \
> > >  	gem/i915_gem_ttm_pm.o \
> > >  	gem/i915_gem_userptr.o \
> > > +	gem/i915_gem_vm_bind_object.o \
> > >  	gem/i915_gem_wait.o \
> > >  	gem/i915_gemfs.o
> > >  i915-y += \
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > > index 899fa8f1e0fe..e8b41aa8f8c4 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
> > > @@ -139,6 +139,21 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> > >  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
> > >  				       struct drm_file *file);
> > >  
> > > 
> > > 
> > > 
> > > +/**
> > > + * i915_gem_vm_is_vm_bind_mode() - Check if address space is in vm_bind mode
> > > + * @vm: the address space
> > > + *
> > > + * Returns:
> > > + * true: @vm is in vm_bind mode; allows only vm_bind method of binding.
> > > + * false: @vm is not in vm_bind mode; allows only legacy execbuf method
> > > + *        of binding.
> > > + */
> > > +static inline bool i915_gem_vm_is_vm_bind_mode(struct i915_address_space *vm)
> > > +{
> > > +	/* No support to enable vm_bind mode yet */
> > > +	return false;
> > > +}
> > > +
> > >  struct i915_address_space *
> > >  i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id);
> > >  
> > > 
> > > 
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > index 1160723c9d2d..c5bc9f6e887f 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > > @@ -781,6 +781,11 @@ static int eb_select_context(struct i915_execbuffer *eb)
> > >  	if (unlikely(IS_ERR(ctx)))
> > >  		return PTR_ERR(ctx);
> > >  
> > > 
> > > 
> > > 
> > > +	if (ctx->vm && i915_gem_vm_is_vm_bind_mode(ctx->vm)) {
> > > +		i915_gem_context_put(ctx);
> > > +		return -EOPNOTSUPP;
> > > +	}
> > > +
> > >  	eb->gem_context = ctx;
> > >  	if (i915_gem_context_has_full_ppgtt(ctx))
> > >  		eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
> > > new file mode 100644
> > > index 000000000000..36262a6357b5
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
> > > @@ -0,0 +1,26 @@
> > > +/* SPDX-License-Identifier: MIT */
> > > +/*
> > > + * Copyright © 2022 Intel Corporation
> > > + */
> > > +
> > > +#ifndef __I915_GEM_VM_BIND_H
> > > +#define __I915_GEM_VM_BIND_H
> > > +
> > > +#include <linux/types.h>
> > > +
> > > +struct drm_device;
> > > +struct drm_file;
> > > +struct i915_address_space;
> > > +struct i915_vma;
> > > +
> > > +struct i915_vma *
> > > +i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va);
> > > +
> > > +int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
> > > +			   struct drm_file *file);
> > > +int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
> > > +			     struct drm_file *file);
> > > +
> > > +void i915_gem_vm_unbind_all(struct i915_address_space *vm);
> > > +
> > > +#endif /* __I915_GEM_VM_BIND_H */
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> > > new file mode 100644
> > > index 000000000000..6f299806bee1
> > > --- /dev/null
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> > > @@ -0,0 +1,324 @@
> > > +// SPDX-License-Identifier: MIT
> > > +/*
> > > + * Copyright © 2022 Intel Corporation
> > > + */
> > > +
> > > +#include <uapi/drm/i915_drm.h>
> > > +
> > > +#include <linux/interval_tree_generic.h>
> > > +
> > > +#include "gem/i915_gem_context.h"
> > > +#include "gem/i915_gem_vm_bind.h"
> > > +
> > > +#include "gt/intel_gpu_commands.h"
> > > +
> > > +#define START(node) ((node)->start)
> > > +#define LAST(node) ((node)->last)
> > > +
> > > +/* Not all defined functions are used, hence use __maybe_unused */
> > > +INTERVAL_TREE_DEFINE(struct i915_vma, rb, u64, __subtree_last,
> > > +		     START, LAST, __maybe_unused static inline, i915_vm_bind_it)
> > > +
> > > +#undef START
> > > +#undef LAST
> > > +
> > > +/**
> > > + * DOC: VM_BIND/UNBIND ioctls
> > > + *
> > > + * DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM buffer
> > > + * objects (BOs) or sections of a BOs at specified GPU virtual addresses on a
> > > + * specified address space (VM). Multiple mappings can map to the same physical
> > > + * pages of an object (aliasing). These mappings (also referred to as persistent
> > > + * mappings) will be persistent across multiple GPU submissions (execbuf calls)
> > > + * issued by the UMD, without user having to provide a list of all required
> > > + * mappings during each submission (as required by older execbuf mode).
> > > + *
> > > + * The VM_BIND/UNBIND calls allow UMDs to request a timeline out fence for
> > > + * signaling the completion of bind/unbind operation.
> > > + *
> > > + * VM_BIND feature is advertised to user via I915_PARAM_VM_BIND_VERSION.
> > > + * User has to opt-in for VM_BIND mode of binding for an address space (VM)
> > > + * during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
> > > + *
> > > + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
> > > + * are not ordered. Furthermore, parts of the VM_BIND/UNBIND operations can be
> > > + * done asynchronously, when valid out fence is specified.
> > > + *
> > > + * VM_BIND locking order is as below.
> > > + *
> > > + * 1) vm_bind_lock mutex will protect vm_bind lists. This lock is taken in
> > > + *    vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing the
> > > + *    mapping.
> > > + *
> > > + *    In future, when GPU page faults are supported, we can potentially use a
> > > + *    rwsem instead, so that multiple page fault handlers can take the read
> > > + *    side lock to lookup the mapping and hence can run in parallel.
> > > + *    The older execbuf mode of binding does not need this lock.
> > > + *
> > > + * 2) The object's dma-resv lock will protect i915_vma state and needs
> > > + *    to be held while binding/unbinding a vma in the async worker and while
> > > + *    updating dma-resv fence list of an object. Note that private BOs of a VM
> > > + *    will all share a dma-resv object.
> > > + *
> > > + * 3) Spinlock/s to protect some of the VM's lists like the list of
> > > + *    invalidated vmas (due to eviction and userptr invalidation) etc.
> > > + */
> > > +
> > > +/**
> > > + * i915_gem_vm_bind_lookup_vma() - lookup for persistent vma mapped at a
> > > + * specified address
> > > + * @vm: virtual address space to look for persistent vma
> > > + * @va: starting address where vma is mapped
> > > + *
> > > + * Retrieves the persistent vma mapped at address @va from the @vm's vma tree.
> > > + *
> > > + * Returns vma pointer on success, NULL on failure.
> > > + */
> > > +struct i915_vma *
> > > +i915_gem_vm_bind_lookup_vma(struct i915_address_space *vm, u64 va)
> > > +{
> > > +	lockdep_assert_held(&vm->vm_bind_lock);
> > > +
> > > +	return i915_vm_bind_it_iter_first(&vm->va, va, va);
> > > +}
> > > +
> > > +static void i915_gem_vm_bind_remove(struct i915_vma *vma, bool release_obj)
> > > +{
> > > +	lockdep_assert_held(&vma->vm->vm_bind_lock);
> > > +
> > > +	list_del_init(&vma->vm_bind_link);
> > > +	i915_vm_bind_it_remove(vma, &vma->vm->va);
> > > +
> > > +	/* Release object */
> > > +	if (release_obj)
> > > +		i915_gem_object_put(vma->obj);
> > > +}
> > > +
> > > +static int i915_gem_vm_unbind_vma(struct i915_address_space *vm,
> > > +				  struct drm_i915_gem_vm_unbind *va)
> > > +{
> > > +	struct drm_i915_gem_object *obj;
> > > +	struct i915_vma *vma;
> > > +	int ret;
> > > +
> > > +	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	va->start = gen8_noncanonical_addr(va->start);
> > > +	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
> > > +
> > > +	if (!vma)
> > > +		ret = -ENOENT;
> > > +	else if (vma->size != va->length)
> > > +		ret = -EINVAL;
> > > +
> > > +	if (ret) {
> > > +		mutex_unlock(&vm->vm_bind_lock);
> > > +		return ret;
> > > +	}
> > > +
> > > +	i915_gem_vm_bind_remove(vma, false);
> > > +
> > > +	mutex_unlock(&vm->vm_bind_lock);
> > > +
> > > +	/*
> > > +	 * Destroy the vma and then release the object.
> > > +	 * As persistent vma holds object reference, it can only be destroyed
> > > +	 * either by vm_unbind ioctl or when VM is being released. As we are
> > > +	 * holding VM reference here, it is safe accessing the vma here.
> > > +	 */
> > > +	obj = vma->obj;
> > > +	i915_gem_object_lock(obj, NULL);
> > > +	i915_vma_destroy(vma);
> > > +	i915_gem_object_unlock(obj);
> > > +
> > > +	i915_gem_object_put(obj);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +/**
> > > + * i915_gem_vm_unbind_all() - unbind all persistent mappings from an
> > > + * address space
> > > + * @vm: Address space to remove persistent mappings from
> > > + *
> > > + * Unbind all userspace requested vm_bind mappings from @vm.
> > > + */
> > > +void i915_gem_vm_unbind_all(struct i915_address_space *vm)
> > > +{
> > > +	struct i915_vma *vma, *t;
> > > +
> > > +	mutex_lock(&vm->vm_bind_lock);
> > > +	list_for_each_entry_safe(vma, t, &vm->vm_bind_list, vm_bind_link)
> > > +		i915_gem_vm_bind_remove(vma, true);
> > > +	list_for_each_entry_safe(vma, t, &vm->vm_bound_list, vm_bind_link)
> > > +		i915_gem_vm_bind_remove(vma, true);
> > > +	mutex_unlock(&vm->vm_bind_lock);
> > > +}
> > > +
> > > +static struct i915_vma *vm_bind_get_vma(struct i915_address_space *vm,
> > > +					struct drm_i915_gem_object *obj,
> > > +					struct drm_i915_gem_vm_bind *va)
> > > +{
> > > +	struct i915_gtt_view view;
> > > +	struct i915_vma *vma;
> > > +
> > > +	va->start = gen8_noncanonical_addr(va->start);
> > > +	vma = i915_gem_vm_bind_lookup_vma(vm, va->start);
> > > +	if (vma)
> > > +		return ERR_PTR(-EEXIST);
> > > +
> > > +	view.type = I915_GTT_VIEW_PARTIAL;
> > > +	view.partial.offset = va->offset >> PAGE_SHIFT;
> > > +	view.partial.size = va->length >> PAGE_SHIFT;
> > > +	vma = i915_vma_create_persistent(obj, vm, &view);
> > > +	if (IS_ERR(vma))
> > > +		return vma;
> > > +
> > > +	vma->start = va->start;
> > > +	vma->last = va->start + va->length - 1;
> > > +
> > > +	return vma;
> > > +}
> > > +
> > > +static int i915_gem_vm_bind_obj(struct i915_address_space *vm,
> > > +				struct drm_i915_gem_vm_bind *va,
> > > +				struct drm_file *file)
> > > +{
> > > +	struct drm_i915_gem_object *obj;
> > > +	struct i915_vma *vma = NULL;
> > > +	struct i915_gem_ww_ctx ww;
> > > +	u64 pin_flags;
> > > +	int ret = 0;
> > > +
> > > +	if (!i915_gem_vm_is_vm_bind_mode(vm))
> > > +		return -EOPNOTSUPP;
> > > +
> > > +	/* Ensure start and length fields are valid */
> > > +	if (!va->length || !IS_ALIGNED(va->start, I915_GTT_PAGE_SIZE))
> > > +		ret = -EINVAL;
> > > +
> > > +	obj = i915_gem_object_lookup(file, va->handle);
> > > +	if (!obj)
> > > +		return -ENOENT;
> > > +
> > > +	/* Ensure offset and length are aligned to object's max page size */
> > > +	if (!IS_ALIGNED(va->offset | va->length,
> > > +			i915_gem_object_max_page_size(obj->mm.placements,
> > > +						      obj->mm.n_placements)))
> > > +		ret = -EINVAL;
> > > +
> > > +	/* Check for mapping range overflow */
> > > +	if (range_overflows_t(u64, va->offset, va->length, obj->base.size))
> > > +		ret = -EINVAL;
> > > +
> > > +	if (ret)
> > > +		goto put_obj;
> > > +
> > > +	ret = mutex_lock_interruptible(&vm->vm_bind_lock);
> > > +	if (ret)
> > > +		goto put_obj;
> > > +
> > > +	vma = vm_bind_get_vma(vm, obj, va);
> > > +	if (IS_ERR(vma)) {
> > > +		ret = PTR_ERR(vma);
> > > +		goto unlock_vm;
> > > +	}
> > > +
> > > +	pin_flags = va->start | PIN_OFFSET_FIXED | PIN_USER |
> > > +		    PIN_VALIDATE | PIN_NOEVICT;
> > > +
> > > +	for_i915_gem_ww(&ww, ret, true) {
> > > +		ret = i915_gem_object_lock(vma->obj, &ww);
> > > +		if (ret)
> > > +			continue;
> > > +
> > > +		ret = i915_vma_pin_ww(vma, &ww, 0, 0, pin_flags);
> > > +		if (ret)
> > > +			continue;
> > > +
> > > +		list_add_tail(&vma->vm_bind_link, &vm->vm_bound_list);
> > > +		i915_vm_bind_it_insert(vma, &vm->va);
> > > +
> > > +		/* Hold object reference until vm_unbind */
> > > +		i915_gem_object_get(vma->obj);
> > > +	}
> > > +
> > > +	if (ret)
> > > +		i915_vma_destroy(vma);
> > > +unlock_vm:
> > > +	mutex_unlock(&vm->vm_bind_lock);
> > > +put_obj:
> > > +	i915_gem_object_put(obj);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +/**
> > > + * i915_gem_vm_bind_ioctl() - ioctl function for binding a section of object
> > > + * at a specified virtual address
> > > + * @dev: drm_device pointer
> > > + * @data: ioctl data structure
> > > + * @file: drm_file pointer
> > > + *
> > > + * Adds the specified persistent mapping (virtual address to a section of an
> > > + * object) and binds it in the device page table.
> > > + *
> > > + * Returns 0 on success, error code on failure.
> > > + */
> > > +int i915_gem_vm_bind_ioctl(struct drm_device *dev, void *data,
> > > +			   struct drm_file *file)
> > > +{
> > > +	struct drm_i915_gem_vm_bind *args = data;
> > > +	struct i915_address_space *vm;
> > > +	int ret;
> > > +
> > > +	/* Reserved fields must be 0 */
> > > +	if (args->rsvd[0] || args->rsvd[1] || args->rsvd[2] || args->extensions)
> > > +		return -EINVAL;
> > > +
> > > +	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
> > > +	if (unlikely(!vm))
> > > +		return -ENOENT;
> > > +
> > > +	ret = i915_gem_vm_bind_obj(vm, args, file);
> > > +
> > > +	i915_vm_put(vm);
> > > +	return ret;
> > > +}
> > > +
> > > +/**
> > > + * i915_gem_vm_unbind_ioctl() - ioctl function for unbinding a mapping at a
> > > + * specified virtual address
> > > + * @dev: drm_device pointer
> > > + * @data: ioctl data structure
> > > + * @file: drm_file pointer
> > > + *
> > > + * Removes the persistent mapping at the specified address and unbinds it
> > > + * from the device page table.
> > > + *
> > > + * Returns 0 on success, error code on failure. -ENOENT is returned if the
> > > + * specified mapping is not found.
> > > + */
> > > +int i915_gem_vm_unbind_ioctl(struct drm_device *dev, void *data,
> > > +			     struct drm_file *file)
> > > +{
> > > +	struct drm_i915_gem_vm_unbind *args = data;
> > > +	struct i915_address_space *vm;
> > > +	int ret;
> > > +
> > > +	/* Reserved fields must be 0 */
> > > +	if (args->rsvd || args->rsvd2[0] || args->rsvd2[1] ||
> > > +	    args->rsvd2[2] || args->extensions)
> > > +		return -EINVAL;
> > > +
> > > +	vm = i915_gem_vm_lookup(file->driver_priv, args->vm_id);
> > > +	if (unlikely(!vm))
> > > +		return -ENOENT;
> > > +
> > > +	ret = i915_gem_vm_unbind_vma(vm, args);
> > > +
> > > +	i915_vm_put(vm);
> > > +	return ret;
> > > +}
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > > index e82a9d763e57..412368c67c46 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > > @@ -12,6 +12,7 @@
> > >  
> > > 
> > > 
> > > 
> > >  #include "gem/i915_gem_internal.h"
> > >  #include "gem/i915_gem_lmem.h"
> > > +#include "gem/i915_gem_vm_bind.h"
> > >  #include "i915_trace.h"
> > >  #include "i915_utils.h"
> > >  #include "intel_gt.h"
> > > @@ -177,6 +178,8 @@ int i915_vm_lock_objects(struct i915_address_space *vm,
> > >  void i915_address_space_fini(struct i915_address_space *vm)
> > >  {
> > >  	drm_mm_takedown(&vm->mm);
> > > +	GEM_BUG_ON(!RB_EMPTY_ROOT(&vm->va.rb_root));
> > > +	mutex_destroy(&vm->vm_bind_lock);
> > >  }
> > >  
> > > 
> > > 
> > > 
> > >  /**
> > > @@ -203,6 +206,8 @@ static void __i915_vm_release(struct work_struct *work)
> > >  	struct i915_address_space *vm =
> > >  		container_of(work, struct i915_address_space, release_work);
> > >  
> > > 
> > > 
> > > 
> > > +	i915_gem_vm_unbind_all(vm);
> > > +
> > >  	__i915_vm_close(vm);
> > >  
> > > 
> > > 
> > > 
> > >  	/* Synchronize async unbinds. */
> > > @@ -279,6 +284,11 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
> > >  
> > > 
> > > 
> > > 
> > >  	INIT_LIST_HEAD(&vm->bound_list);
> > >  	INIT_LIST_HEAD(&vm->unbound_list);
> > > +
> > > +	vm->va = RB_ROOT_CACHED;
> > > +	INIT_LIST_HEAD(&vm->vm_bind_list);
> > > +	INIT_LIST_HEAD(&vm->vm_bound_list);
> > > +	mutex_init(&vm->vm_bind_lock);
> > >  }
> > >  
> > > 
> > > 
> > > 
> > >  void *__px_vaddr(struct drm_i915_gem_object *p)
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> > > index 4d75ba4bb41d..3a9bee1b9d03 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> > > @@ -260,6 +260,15 @@ struct i915_address_space {
> > >  	 */
> > >  	struct list_head unbound_list;
> > >  
> > > 
> > > 
> > > 
> > > +	/** @vm_bind_lock: Mutex to protect @vm_bind_list and @vm_bound_list */
> > > +	struct mutex vm_bind_lock;
> > > +	/** @vm_bind_list: List of vm_binding in process */
> > > +	struct list_head vm_bind_list;
> > > +	/** @vm_bound_list: List of vm_binding completed */
> > > +	struct list_head vm_bound_list;
> > > +	/** @va: tree of persistent vmas */
> > > +	struct rb_root_cached va;
> > > +
> > >  	/* Global GTT */
> > >  	bool is_ggtt:1;
> > >  
> > > 
> > > 
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
> > > index c3d43f9b1e45..cf41b96ac485 100644
> > > --- a/drivers/gpu/drm/i915/i915_driver.c
> > > +++ b/drivers/gpu/drm/i915/i915_driver.c
> > > @@ -69,6 +69,7 @@
> > >  #include "gem/i915_gem_ioctls.h"
> > >  #include "gem/i915_gem_mman.h"
> > >  #include "gem/i915_gem_pm.h"
> > > +#include "gem/i915_gem_vm_bind.h"
> > >  #include "gt/intel_gt.h"
> > >  #include "gt/intel_gt_pm.h"
> > >  #include "gt/intel_rc6.h"
> > > @@ -1892,6 +1893,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
> > >  	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_RENDER_ALLOW),
> > >  	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
> > >  	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
> > > +	DRM_IOCTL_DEF_DRV(I915_GEM_VM_BIND, i915_gem_vm_bind_ioctl, DRM_RENDER_ALLOW),
> > > +	DRM_IOCTL_DEF_DRV(I915_GEM_VM_UNBIND, i915_gem_vm_unbind_ioctl, DRM_RENDER_ALLOW),
> > >  };
> > >  
> > > 
> > > 
> > > 
> > >  /*
> > > diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> > > index 529d97318f00..6a64a130dbcd 100644
> > > --- a/drivers/gpu/drm/i915/i915_vma.c
> > > +++ b/drivers/gpu/drm/i915/i915_vma.c
> > > @@ -239,6 +239,7 @@ vma_create(struct drm_i915_gem_object *obj,
> > >  	spin_unlock(&obj->vma.lock);
> > >  	mutex_unlock(&vm->mutex);
> > >  
> > > 
> > > 
> > > 
> > > +	INIT_LIST_HEAD(&vma->vm_bind_link);
> > >  	return vma;
> > >  
> > > 
> > > 
> > > 
> > >  err_unlock:
> > > diff --git a/drivers/gpu/drm/i915/i915_vma_types.h b/drivers/gpu/drm/i915/i915_vma_types.h
> > > index 3144d71a0c3e..db786d2d1530 100644
> > > --- a/drivers/gpu/drm/i915/i915_vma_types.h
> > > +++ b/drivers/gpu/drm/i915/i915_vma_types.h
> > > @@ -295,6 +295,20 @@ struct i915_vma {
> > >  	/** This object's place on the active/inactive lists */
> > >  	struct list_head vm_link;
> > >  
> > > 
> > > 
> > > 
> > > +	/** @vm_bind_link: node for the vm_bind related lists of vm */
> > > +	struct list_head vm_bind_link;
> > > +
> > > +	/** Interval tree structures for persistent vma */
> > > +
> > > +	/** @rb: node for the interval tree of vm for persistent vmas */
> > > +	struct rb_node rb;
> > > +	/** @start: start endpoint of the rb node */
> > > +	u64 start;
> > > +	/** @last: Last endpoint of the rb node */
> > > +	u64 last;
> > > +	/** @__subtree_last: last in subtree */
> > > +	u64 __subtree_last;
> > > +
> > >  	struct list_head obj_link; /* Link in the object's VMA list */
> > >  	struct rb_node obj_node;
> > >  	struct hlist_node obj_hash;
> > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > > index 8df261c5ab9b..f06a09f1db2d 100644
> > > --- a/include/uapi/drm/i915_drm.h
> > > +++ b/include/uapi/drm/i915_drm.h
> > > @@ -470,6 +470,8 @@ typedef struct _drm_i915_sarea {
> > >  #define DRM_I915_GEM_VM_CREATE		0x3a
> > >  #define DRM_I915_GEM_VM_DESTROY		0x3b
> > >  #define DRM_I915_GEM_CREATE_EXT		0x3c
> > > +#define DRM_I915_GEM_VM_BIND		0x3d
> > > +#define DRM_I915_GEM_VM_UNBIND		0x3e
> > >  /* Must be kept compact -- no holes */
> > >  
> > > 
> > > 
> > > 
> > >  #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
> > > @@ -534,6 +536,8 @@ typedef struct _drm_i915_sarea {
> > >  #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
> > >  #define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
> > >  #define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
> > > +#define DRM_IOCTL_I915_GEM_VM_BIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
> > > +#define DRM_IOCTL_I915_GEM_VM_UNBIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
> > >  
> > > 
> > > 
> > > 
> > >  /* Allow drivers to submit batchbuffers directly to hardware, relying
> > >   * on the security mechanisms provided by hardware.
> > > @@ -3727,6 +3731,101 @@ struct drm_i915_gem_create_ext_protected_content {
> > >  /* ID of the protected content session managed by i915 when PXP is active */
> > >  #define I915_PROTECTED_CONTENT_DEFAULT_SESSION 0xf
> > >  
> > > 
> > > 
> > > 
> > > +/**
> > > + * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
> > > + *
> > > + * This structure is passed to VM_BIND ioctl and specifies the mapping of GPU
> > > + * virtual address (VA) range to the section of an object that should be bound
> > > + * in the device page table of the specified address space (VM).
> > > + * The VA range specified must be unique (i.e., not currently bound) and can
> > > + * be mapped to the whole object or to a section of the object (partial binding).
> > > + * Multiple VA mappings can be created to the same section of the object
> > > + * (aliasing).
> > > + *
> > > + * The @start, @offset and @length must be 4K page aligned. However, DG2
> > > + * and XEHPSDV have a 64K page size for device local memory and a compact page
> > > + * table. On those platforms, for binding device local-memory objects, the
> > > + * @start, @offset and @length must be 64K aligned.
> > > + *
> > > + * Error code -EINVAL will be returned if @start, @offset and @length are not
> > > + * properly aligned. In version 1 (See I915_PARAM_VM_BIND_VERSION), error code
> > > + * -ENOSPC will be returned if the VA range specified can't be reserved.
> > > + *
> > > + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
> > > + * are not ordered. Furthermore, parts of the VM_BIND operation can be done
> > > + * asynchronously, if valid @fence is specified.
> > > + */
> > > +struct drm_i915_gem_vm_bind {
> > > +	/** @vm_id: VM (address space) id to bind */
> > > +	__u32 vm_id;
> > > +
> > > +	/** @handle: Object handle */
> > > +	__u32 handle;
> > > +
> > > +	/** @start: Virtual Address start to bind */
> > > +	__u64 start;
> > > +
> > > +	/** @offset: Offset in object to bind */
> > > +	__u64 offset;
> > > +
> > > +	/** @length: Length of mapping to bind */
> > > +	__u64 length;
> > > +
> > > +	/** @rsvd: Reserved, MBZ */
> > > +	__u64 rsvd[3];
> > 
> > In a brand new ioctl with even extensions support, why do we need this?
> > If we have a plan to add something here in the future, can you please
> > tell us what it may be? Perhaps having that field already but accepting
> > only a default value/flag would be better.
> > 
> 
> One quad word is for flags and the other two are reserved. I had the flag defined
> previously, but changed it to reserved based on review comments.
> Yes, we have extensions, but I think it is OK to have some reserved fields
> (I see other examples here). There are some future expansion plans (like
> PAT setting support, etc.) anyhow. Is that fine?
> 
> > I see in previous versions we had 'flags' here. Having 'flags', even if
> > MBZ for the initial version, seems like a nice thing to have for future
> > extensibility. Also, you're going to add back the flag to make the page
> > read-only at some point, right?
> 
> Yah, I can separate out the flags here and in vm_unbind. Matt, hope that is
> fine.

I would prefer to already have the flags fields defined (both here and
in unbind), even if they are MBZ. That makes life slightly easier for
user space. When you add flags, just bump the vm_bind version that we
can already query, so we know the kernel supports those flags.
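
For reference, querying that version from user space is already just a
GETPARAM call. A minimal sketch, assuming only I915_PARAM_VM_BIND_VERSION
from this series plus the existing drm_i915_getparam uapi (error handling
omitted):

#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static int query_vm_bind_version(int fd)
{
	/* I915_PARAM_VM_BIND_VERSION is added by this series */
	drm_i915_getparam_t gp = { .param = I915_PARAM_VM_BIND_VERSION };
	int value = 0;

	gp.value = &value;

	/* 0 on success; 'value' then holds the supported vm_bind version */
	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
		return 0;	/* parameter unknown: no vm_bind support */

	return value;
}

User space would then gate any use of new flags on the version reported here.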

Thanks,
Paulo

> 
> > 
> > > +
> > > +	/** @rsvd2: Reserved for timeline fence */
> > > +	__u64 rsvd2[2];
> > 
> > I see this one gets changed in the middle of the series.
> > 
> 
> Yah, we reserve it for the timeline fence here in this patch;
> timeline fence support is added in a later patch in this series.
> 
> > > +
> > > +	/**
> > > +	 * @extensions: Zero-terminated chain of extensions.
> > > +	 *
> > > +	 * For future extensions. See struct i915_user_extension.
> > > +	 */
> > > +	__u64 extensions;
> > > +};
> > > +
> > > +/**
> > > + * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
> > > + *
> > > + * This structure is passed to VM_UNBIND ioctl and specifies the GPU virtual
> > > + * address (VA) range that should be unbound from the device page table of the
> > > + * specified address space (VM). VM_UNBIND will force unbind the specified
> > > + * range from the device page table without waiting for any GPU job to complete.
> > > + * It is the UMD's responsibility to ensure the mapping is no longer in use before
> > > + * calling VM_UNBIND.
> > > + *
> > > + * If the specified mapping is not found, the ioctl will simply return without
> > > + * any error.
> > > + *
> > > + * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
> > > + * are not ordered. Furthermore, parts of the VM_UNBIND operation can be done
> > > + * asynchronously, if valid @fence is specified.
> > 
> > What @fence? There's no way to specify one.
> > 
> 
> Yah, I will remove "if valid @fence is specified".
> 
> > > + */
> > > +struct drm_i915_gem_vm_unbind {
> > > +	/** @vm_id: VM (address space) id to bind */
> > > +	__u32 vm_id;
> > > +
> > > +	/** @rsvd: Reserved, MBZ */
> > > +	__u32 rsvd;
> > 
> > Again here, same question. Perhaps we could name it 'pad' or 'pad0' if
> > that's the specific goal of having this?
> > 
> 
> Yah, pad is appropriate here.
> 
> > > +
> > > +	/** @start: Virtual Address start to unbind */
> > > +	__u64 start;
> > > +
> > > +	/** @length: Length of mapping to unbind */
> > > +	__u64 length;
> > > +
> > > +	/** @rsvd2: Reserved, MBZ */
> > > +	__u64 rsvd2[3];
> > 
> > And here, but this is definitely not just padding.
> > 
> 
> It is for 'flags' and 'timeline fence' in case we need them later on.
> 
> Niranjana
> 
> > > +
> > > +	/**
> > > +	 * @extensions: Zero-terminated chain of extensions.
> > > +	 *
> > > +	 * For future extensions. See struct i915_user_extension.
> > > +	 */
> > > +	__u64 extensions;
> > > +};
> > > +
> > >  #if defined(__cplusplus)
> > >  }
> > >  #endif
> > 


^ permalink raw reply	[flat|nested] 71+ messages in thread
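
Putting the quoted uapi together, a minimal user-space bind/unbind sequence
could look like the sketch below. It assumes a VM created in vm_bind mode
(the opt-in is added later in the series) and a GEM handle obtained from
GEM_CREATE, and uses only the structures and ioctl numbers defined in the
patch above; error handling and the execbuf3 submission are omitted.

#include <string.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Map one GEM object at a fixed GPU VA, then unmap it again. */
static void bind_unbind_sketch(int fd, __u32 vm_id, __u32 handle,
			       __u64 gpu_va, __u64 size)
{
	struct drm_i915_gem_vm_bind bind;
	struct drm_i915_gem_vm_unbind unbind;

	memset(&bind, 0, sizeof(bind));
	bind.vm_id = vm_id;	/* from DRM_IOCTL_I915_GEM_VM_CREATE */
	bind.handle = handle;	/* GEM object backing the mapping */
	bind.start = gpu_va;	/* GPU VA, 4K (64K for DG2/XEHPSDV lmem) aligned */
	bind.offset = 0;	/* bind from the start of the object */
	bind.length = size;	/* length of the mapping */
	/* rsvd[], rsvd2[] and extensions stay zero, as required */
	ioctl(fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind);

	/* ... I915_GEM_EXECBUFFER3 submissions referencing gpu_va ... */

	memset(&unbind, 0, sizeof(unbind));
	unbind.vm_id = vm_id;
	unbind.start = gpu_va;	/* must match an existing mapping */
	unbind.length = size;	/* must match the bound length */
	ioctl(fd, DRM_IOCTL_I915_GEM_VM_UNBIND, &unbind);
}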

* Re: [Intel-gfx] [PATCH v6 00/20] drm/i915/vm_bind: Add VM_BIND functionality
  2022-11-10 15:05       ` Matthew Auld
@ 2022-11-10 21:37         ` Zanoni, Paulo R
  0 siblings, 0 replies; 71+ messages in thread
From: Zanoni, Paulo R @ 2022-11-10 21:37 UTC (permalink / raw)
  To: tvrtko.ursulin, Vishwanathapura, Niranjana, Auld, Matthew
  Cc: Nikula, Jani, intel-gfx, dri-devel, Hellstrom, Thomas, Vetter,
	 Daniel, christian.koenig

On Thu, 2022-11-10 at 15:05 +0000, Matthew Auld wrote:
> On 10/11/2022 14:47, Tvrtko Ursulin wrote:
> > 
> > On 10/11/2022 05:49, Niranjana Vishwanathapura wrote:
> > > On Wed, Nov 09, 2022 at 04:16:25PM -0800, Zanoni, Paulo R wrote:
> > > > On Mon, 2022-11-07 at 00:51 -0800, Niranjana Vishwanathapura wrote:
> > > > > DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
> > > > > buffer objects (BOs) or sections of a BOs at specified GPU virtual
> > > > > addresses on a specified address space (VM). Multiple mappings can map
> > > > > to the same physical pages of an object (aliasing). These mappings 
> > > > > (also
> > > > > referred to as persistent mappings) will be persistent across multiple
> > > > > GPU submissions (execbuf calls) issued by the UMD, without user having
> > > > > to provide a list of all required mappings during each submission (as
> > > > > required by older execbuf mode).
> > > > > 
> > > > > This patch series support VM_BIND version 1, as described by the param
> > > > > I915_PARAM_VM_BIND_VERSION.
> > > > > 
> > > > > Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
> > > > > vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
> > > > > The new execbuf3 ioctl will not have any execlist support and all the
> > > > > legacy support like relocations etc., are removed.
> > > > > 
> > > > > NOTEs:
> > > > > * It is based on below VM_BIND design+uapi rfc.
> > > > >   Documentation/gpu/rfc/i915_vm_bind.rst
> > > > 
> > > > Hi
> > > > 
> > > > One difference for execbuf3 that I noticed, and that is not mentioned in the
> > > > RFC document, is that we now don't have a way to signal
> > > > EXEC_OBJECT_WRITE. When looking at the kernel code, there are some
> > > > pieces that check for this flag:
> > > > 
> > > > - there's code that deals with frontbuffer rendering
> > > > - there's code that deals with fences
> > > > - there's code that prevents self-modifying batches
> > > > - another that seems related to waiting for objects
> > > > 
> > > > Are there any new rules regarding frontbuffer rendering when we use
> > > > execbuf3? Any other behavior changes related to the other places that
> > > > we should expect when using execbuf3?
> > > > 
> > > 
> > > Paulo,
> > > Most of the EXEC_OBJECT_WRITE checks in the execbuf path are related to
> > > the implicit dependency tracker, which execbuf3 does not support. The
> > > frontbuffer-related update is the only exception, and I don't
> > > remember the rationale for not requiring this on execbuf3.
> > > 
> > > Matt, Tvrtko, Daniel, can you please comment here?
> > 
> > Does not ring a bell to me. Looking at the code it certainly looks like 
> > it would be silently failing to handle it properly.
> > 
> > I'll let people with more experience in this area answer, but from my 
> > point of view, if it is decided that it can be left unsupported, then we 
> > probably need a way of failing the ioctl if it is used against a frontbuffer, 
> > or something, instead of having display corruption.

There's no way for the ioctl to even know we're writing to
frontbuffers, unless of course it decides to parse the whole
batchbuffer and understand everything that's going on there, which
sounds insane.
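
For reference, in legacy execbuf2 the write intent is declared per object
by user space rather than inferred from the batch. A rough sketch using the
existing execbuf2 uapi (nothing from this series; 'handle' is just a
placeholder GEM handle):

	struct drm_i915_gem_exec_object2 obj = {
		.handle = handle,
		/* Declare a GPU write: the kernel uses this for the
		 * exclusive (write) implicit fence and, as discussed
		 * above, for the frontbuffer write bookkeeping. */
		.flags = EXEC_OBJECT_WRITE,
	};

execbuf3 carries no object list at all, so there is no per-object flag it
could use to get the same hint.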


> 
> Maybe it's a coincidence but there is:
> https://patchwork.freedesktop.org/series/110715/
> 
> Which looks relevant. Maarten, any hints here?

Can we pretty please have the rules of frontbuffer tracking written
down anywhere? I had major trouble trying to understand this back when
I was working on FBC, and now I regret not having written them down
back then, because I have just forgotten how it's supposed to work.

My first guess when looking at that patch is that it would completely
break FBC, but hey so many years have passed since I worked on this
that maybe things changed completely. At least I wrote tests to cover
this.

> 
> > 
> > Regards,
> > 
> > Tvrtko


^ permalink raw reply	[flat|nested] 71+ messages in thread

end of thread, other threads:[~2022-11-10 21:37 UTC | newest]

Thread overview: 71+ messages
2022-11-07  8:51 [PATCH v6 00/20] drm/i915/vm_bind: Add VM_BIND functionality Niranjana Vishwanathapura
2022-11-07  8:51 ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-07  8:51 ` [Intel-gfx] [PATCH v6 01/20] drm/i915/vm_bind: Expose vm lookup function Niranjana Vishwanathapura
2022-11-07  8:51   ` Niranjana Vishwanathapura
2022-11-07  8:51 ` [Intel-gfx] [PATCH v6 02/20] drm/i915/vm_bind: Add __i915_sw_fence_await_reservation() Niranjana Vishwanathapura
2022-11-07  8:51   ` Niranjana Vishwanathapura
2022-11-07  8:51 ` [PATCH v6 03/20] drm/i915/vm_bind: Expose i915_gem_object_max_page_size() Niranjana Vishwanathapura
2022-11-07  8:51   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-07  8:51 ` [Intel-gfx] [PATCH v6 04/20] drm/i915/vm_bind: Add support to create persistent vma Niranjana Vishwanathapura
2022-11-07  8:51   ` Niranjana Vishwanathapura
2022-11-07  8:51 ` [Intel-gfx] [PATCH v6 05/20] drm/i915/vm_bind: Implement bind and unbind of object Niranjana Vishwanathapura
2022-11-07  8:51   ` Niranjana Vishwanathapura
2022-11-10  1:28   ` Zanoni, Paulo R
2022-11-10  1:28     ` [Intel-gfx] " Zanoni, Paulo R
2022-11-10 16:32     ` Niranjana Vishwanathapura
2022-11-10 16:32       ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-10 21:21       ` Zanoni, Paulo R
2022-11-10 21:21         ` [Intel-gfx] " Zanoni, Paulo R
2022-11-07  8:51 ` [Intel-gfx] [PATCH v6 06/20] drm/i915/vm_bind: Support for VM private BOs Niranjana Vishwanathapura
2022-11-07  8:51   ` Niranjana Vishwanathapura
2022-11-07  8:51 ` [Intel-gfx] [PATCH v6 07/20] drm/i915/vm_bind: Add support to handle object evictions Niranjana Vishwanathapura
2022-11-07  8:51   ` Niranjana Vishwanathapura
2022-11-07  8:51 ` [Intel-gfx] [PATCH v6 08/20] drm/i915/vm_bind: Support persistent vma activeness tracking Niranjana Vishwanathapura
2022-11-07  8:51   ` Niranjana Vishwanathapura
2022-11-07  8:51 ` [Intel-gfx] [PATCH v6 09/20] drm/i915/vm_bind: Add out fence support Niranjana Vishwanathapura
2022-11-07  8:51   ` Niranjana Vishwanathapura
2022-11-07  8:52 ` [PATCH v6 10/20] drm/i915/vm_bind: Abstract out common execbuf functions Niranjana Vishwanathapura
2022-11-07  8:52   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-07  8:52 ` [PATCH v6 11/20] drm/i915/vm_bind: Use common execbuf functions in execbuf path Niranjana Vishwanathapura
2022-11-07  8:52   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-07  8:52 ` [PATCH v6 12/20] drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl Niranjana Vishwanathapura
2022-11-07  8:52   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-07  8:52 ` [PATCH v6 13/20] drm/i915/vm_bind: Update i915_vma_verify_bind_complete() Niranjana Vishwanathapura
2022-11-07  8:52   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-07  8:52 ` [Intel-gfx] [PATCH v6 14/20] drm/i915/vm_bind: Expose i915_request_await_bind() Niranjana Vishwanathapura
2022-11-07  8:52   ` Niranjana Vishwanathapura
2022-11-07  8:52 ` [PATCH v6 15/20] drm/i915/vm_bind: Handle persistent vmas in execbuf3 Niranjana Vishwanathapura
2022-11-07  8:52   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-07  8:52 ` [PATCH v6 16/20] drm/i915/vm_bind: userptr dma-resv changes Niranjana Vishwanathapura
2022-11-07  8:52   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-07  8:52 ` [Intel-gfx] [PATCH v6 17/20] drm/i915/vm_bind: Limit vm_bind mode to non-recoverable contexts Niranjana Vishwanathapura
2022-11-07  8:52   ` Niranjana Vishwanathapura
2022-11-07  8:52 ` [PATCH v6 18/20] drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode Niranjana Vishwanathapura
2022-11-07  8:52   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-07  8:52 ` [PATCH v6 19/20] drm/i915/vm_bind: Render VM_BIND documentation Niranjana Vishwanathapura
2022-11-07  8:52   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-07  8:52 ` [PATCH v6 20/20] drm/i915/vm_bind: Async vm_unbind support Niranjana Vishwanathapura
2022-11-07  8:52   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-08  1:39   ` Zanoni, Paulo R
2022-11-08  1:39     ` [Intel-gfx] " Zanoni, Paulo R
2022-11-08 15:46     ` Niranjana Vishwanathapura
2022-11-08 15:46       ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-09 17:52   ` Matthew Auld
2022-11-09 17:52     ` [Intel-gfx] " Matthew Auld
2022-11-09 20:11     ` Niranjana Vishwanathapura
2022-11-09 20:11       ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-09 21:13   ` Andi Shyti
2022-11-09 21:13     ` [Intel-gfx] " Andi Shyti
2022-11-10  0:28     ` Niranjana Vishwanathapura
2022-11-10  0:28       ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-07 11:21 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/vm_bind: Add VM_BIND functionality (rev9) Patchwork
2022-11-07 11:21 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2022-11-07 11:40 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2022-11-07 14:13 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
2022-11-10  0:16 ` [PATCH v6 00/20] drm/i915/vm_bind: Add VM_BIND functionality Zanoni, Paulo R
2022-11-10  0:16   ` [Intel-gfx] " Zanoni, Paulo R
2022-11-10  5:49   ` Niranjana Vishwanathapura
2022-11-10  5:49     ` [Intel-gfx] " Niranjana Vishwanathapura
2022-11-10 14:47     ` Tvrtko Ursulin
2022-11-10 15:05       ` Matthew Auld
2022-11-10 21:37         ` Zanoni, Paulo R
