[PATCH v3 0/3] drm/doc/rfc: i915 VM_BIND feature design + uapi

* [PATCH v3 0/3] drm/doc/rfc: i915 VM_BIND feature design + uapi
@ 2022-06-22  3:56 ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 34+ messages in thread
From: Niranjana Vishwanathapura @ 2022-06-22  3:56 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, lionel.g.landwerlin,
	tvrtko.ursulin, chris.p.wilson, thomas.hellstrom, oak.zeng,
	matthew.auld, jason, daniel.vetter, christian.koenig

This is the i915 driver VM_BIND feature design RFC patch series along
with the required uapi definition and description of intended use cases.

v2: Reduce the scope to simple Mesa use case.
    Remove all compute related uapi, vm_bind/unbind queue support and
    only support a timeline out fence instead of an in/out timeline
    fence array.
v3: Expand documentation on dma-resv usage, TLB flushing, execbuf3 and
    VM_UNBIND. Add FENCE_VALID and TLB_FLUSH flags.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>

Niranjana Vishwanathapura (3):
  drm/doc/rfc: VM_BIND feature design document
  drm/i915: Update i915 uapi documentation
  drm/doc/rfc: VM_BIND uapi definition

 Documentation/gpu/rfc/i915_vm_bind.h   | 243 ++++++++++++++++++++++++
 Documentation/gpu/rfc/i915_vm_bind.rst | 247 +++++++++++++++++++++++++
 Documentation/gpu/rfc/index.rst        |   4 +
 include/uapi/drm/i915_drm.h            | 205 +++++++++++++++-----
 4 files changed, 654 insertions(+), 45 deletions(-)
 create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
 create mode 100644 Documentation/gpu/rfc/i915_vm_bind.rst

-- 
2.21.0.rc0.32.g243a4c7e27
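
The VM_PRIVATE objects section of the design document in patch 1/3 argues
that fast-path submission cost is O(1) with respect to the number of VM
private BOs, because they all share one dma-resv object. A minimal
user-space sketch of that accounting is given here. It is only a model under
stated assumptions: struct resv, struct bo, struct vm and vm_submit() are
hypothetical names invented for illustration and are not i915 or dma-resv
APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dma-resv object: it only counts fence adds. */
struct resv {
	unsigned int fences_added;
};

/* Illustrative BO: resv points at its own resv, or at the VM's if private. */
struct bo {
	struct resv own_resv;
	struct resv *resv;
};

/* Illustrative VM: one resv shared by all private BOs, plus shared BOs. */
struct vm {
	struct resv private_resv;
	struct bo **shared;	/* shared BOs currently mapped on this VM */
	size_t nr_shared;
};

static void bo_init_shared(struct bo *bo)
{
	bo->own_resv.fences_added = 0;
	bo->resv = &bo->own_resv;
}

static void bo_init_private(struct bo *bo, struct vm *vm)
{
	bo->own_resv.fences_added = 0;
	bo->resv = &vm->private_resv;
}

/* Model one submission; returns how many fence lists were updated. */
static size_t vm_submit(struct vm *vm)
{
	size_t updates = 1;

	/* One update covers every VM private BO, however many exist. */
	vm->private_resv.fences_added++;

	/* Each shared BO mapped on the VM needs its own update. */
	for (size_t i = 0; i < vm->nr_shared; i++) {
		vm->shared[i]->resv->fences_added++;
		updates++;
	}
	return updates;
}
```

In this model, a VM with 100 private BOs and no shared BOs performs a single
update per vm_submit() call, and every shared BO mapped on the VM adds one
more.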


* [PATCH v3 1/3] drm/doc/rfc: VM_BIND feature design document
  2022-06-22  3:56 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-06-22  3:56   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 34+ messages in thread
From: Niranjana Vishwanathapura @ 2022-06-22  3:56 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, lionel.g.landwerlin,
	tvrtko.ursulin, chris.p.wilson, thomas.hellstrom, oak.zeng,
	matthew.auld, jason, daniel.vetter, christian.koenig

VM_BIND design document with description of intended use cases.

v2: Reduce the scope to simple Mesa use case.
v3: Expand documentation on dma-resv usage, TLB flushing and
    execbuf3.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 Documentation/gpu/rfc/i915_vm_bind.rst | 247 +++++++++++++++++++++++++
 Documentation/gpu/rfc/index.rst        |   4 +
 2 files changed, 251 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_vm_bind.rst

diff --git a/Documentation/gpu/rfc/i915_vm_bind.rst b/Documentation/gpu/rfc/i915_vm_bind.rst
new file mode 100644
index 000000000000..bbe9f0a7c48f
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_vm_bind.rst
@@ -0,0 +1,247 @@
+==========================================
+I915 VM_BIND feature design and use cases
+==========================================
+
+VM_BIND feature
+================
+DRM_I915_GEM_VM_BIND/UNBIND ioctls allow the UMD to bind/unbind GEM buffer
+objects (BOs) or sections of a BO at specified GPU virtual addresses on a
+specified address space (VM). These mappings (also referred to as persistent
+mappings) will be persistent across multiple GPU submissions (execbuf calls)
+issued by the UMD, without the user having to provide a list of all required
+mappings during each submission (as required by the older execbuf mode).
+
+The VM_BIND/UNBIND calls allow UMDs to request a timeline fence for signaling
+the completion of bind/unbind operation.
+
+VM_BIND feature is advertised to user via I915_PARAM_HAS_VM_BIND.
+User has to opt-in for VM_BIND mode of binding for an address space (VM)
+during VM creation time via I915_VM_CREATE_FLAGS_USE_VM_BIND extension.
+
+The bind/unbind operation can get completed asynchronously and out of
+submission order. The out fence when specified will be signaled upon
+completion of bind/unbind operation.
+
+VM_BIND features include:
+
+* Multiple Virtual Address (VA) mappings can map to the same physical pages
+  of an object (aliasing).
+* VA mapping can map to a partial section of the BO (partial binding).
+* Support capture of persistent mappings in the dump upon GPU error.
+* TLB is flushed upon unbind completion.
+* Support for userptr gem objects (no special uapi is required for this).
+
+TLB flushing
+-------------
+The TLB is flushed upon unbind completion. If the platform supports selective
+TLB invalidation, only the required range is flushed. Otherwise, the whole TLB
+is flushed, and batching the flushes might be useful here. UMDs can also
+request a TLB flush after bind completion with the I915_GEM_VM_BIND_TLB_FLUSH
+flag in the VM_BIND call (See struct drm_i915_gem_vm_bind) if they need it.
+
+Execbuf ioctl in VM_BIND mode
+-------------------------------
+A VM in VM_BIND mode will not support older execbuf mode of binding.
+The execbuf ioctl handling in VM_BIND mode differs significantly from the
+older execbuf2 ioctl (See struct drm_i915_gem_execbuffer2).
+Hence, a new execbuf3 ioctl has been added to support VM_BIND mode. (See
+struct drm_i915_gem_execbuffer3). The execbuf3 ioctl will not accept any
+execlist, so implicit sync is not supported. It is expected that the below
+work will be able to support requirements of object dependency setting in all
+use cases:
+
+"dma-buf: Add an API for exporting sync files"
+(https://lwn.net/Articles/859290/)
+
+The new execbuf3 ioctl only works in VM_BIND mode and the VM_BIND mode only
+works with execbuf3 ioctl for submission. All BOs mapped on that VM (through
+VM_BIND call) at the time of execbuf3 call are deemed required for that
+submission.
+
+The execbuf3 ioctl directly specifies the batch addresses instead of
+object handles as in the execbuf2 ioctl. The execbuf3 ioctl will also not
+support many of the older features like in/out/submit fences, fence array,
+default gem context and many more (See struct drm_i915_gem_execbuffer3).
+
+In VM_BIND mode, VA allocation is completely managed by the user instead of
+the i915 driver. Hence VA assignment and eviction are not applicable in
+VM_BIND mode. Also, for determining object activeness, VM_BIND mode will not
+be using the i915_vma active reference tracking. It will instead use dma-resv
+object for that (See `VM_BIND dma_resv usage`_).
+
+So, a lot of existing code supporting execbuf2 ioctl, like relocations, VA
+evictions, vma lookup table, implicit sync, vma active reference tracking etc.,
+is not applicable to the execbuf3 ioctl. Hence, all execbuf3 specific handling
+should be in a separate file, and only functionality common to both ioctls
+should be shared code where possible.
+
+VM_PRIVATE objects
+-------------------
+By default, BOs can be mapped on multiple VMs and can also be dma-buf
+exported. Hence these BOs are referred to as Shared BOs.
+During each execbuf submission, the request fence must be added to the
+dma-resv fence list of all shared BOs mapped on the VM.
+
+The VM_BIND feature introduces an optimization where the user can create a BO
+that is private to a specified VM via the I915_GEM_CREATE_EXT_VM_PRIVATE flag
+during BO creation. Unlike Shared BOs, these VM private BOs can only be mapped
+on the VM they are private to and can't be dma-buf exported.
+All private BOs of a VM share the dma-resv object. Hence during each execbuf
+submission, they need only one dma-resv fence list updated. Thus, the fast
+path (where required mappings are already bound) submission latency is O(1)
+w.r.t the number of VM private BOs.
+
+VM_BIND locking hierarchy
+--------------------------
+The locking design here supports the older (execlist based) execbuf mode, the
+newer VM_BIND mode, the VM_BIND mode with GPU page faults and possible future
+system allocator support (See `Shared Virtual Memory (SVM) support`_).
+The older execbuf mode and the newer VM_BIND mode without page faults manage
+residency of backing storage using dma_fence. The VM_BIND mode with page faults
+and the system allocator support do not use any dma_fence at all.
+
+VM_BIND locking order is as below.
+
+1) Lock-A: A vm_bind mutex will protect vm_bind lists. This lock is taken in
+   vm_bind/vm_unbind ioctl calls, in the execbuf path and while releasing the
+   mapping.
+
+   In future, when GPU page faults are supported, we can potentially use a
+   rwsem instead, so that multiple page fault handlers can take the read side
+   lock to lookup the mapping and hence can run in parallel.
+   The older execbuf mode of binding does not need this lock.
+
+2) Lock-B: The object's dma-resv lock will protect i915_vma state and needs to
+   be held while binding/unbinding a vma in the async worker and while updating
+   dma-resv fence list of an object. Note that private BOs of a VM will all
+   share a dma-resv object.
+
+   The future system allocator support will use the HMM prescribed locking
+   instead.
+
+3) Lock-C: Spinlock/s to protect some of the VM's lists like the list of
+   invalidated vmas (due to eviction and userptr invalidation) etc.
+
+When GPU page faults are supported, the execbuf path does not take any of these
+locks. There we will simply smash the new batch buffer address into the ring and
+then tell the scheduler to run that. The lock taking only happens from the page
+fault handler, where we take lock-A in read mode, whichever lock-B we need to
+find the backing storage (dma_resv lock for gem objects, and hmm/core mm for
+system allocator) and some additional locks (lock-D) for taking care of page
+table races. Page fault mode should not need to ever manipulate the vm lists,
+so won't ever need lock-C.
+
+VM_BIND LRU handling
+---------------------
+We need to ensure VM_BIND mapped objects are properly LRU tagged to avoid
+performance degradation. We will also need support for bulk LRU movement of
+VM_BIND objects to avoid additional latencies in execbuf path.
+
+The page table pages are similar to VM_BIND mapped objects (See
+`Evictable page table allocations`_), are maintained per VM and need to
+be pinned in memory when VM is made active (i.e., upon an execbuf call with
+that VM). So, bulk LRU movement of page table pages is also needed.
+
+VM_BIND dma_resv usage
+-----------------------
+Fences need to be added to all VM_BIND mapped objects. During each execbuf
+submission, they are added with DMA_RESV_USAGE_BOOKKEEP usage to prevent
+over sync (See enum dma_resv_usage). One can override it with either
+DMA_RESV_USAGE_READ or DMA_RESV_USAGE_WRITE usage during explicit object
+dependency setting.
+
+Note that DRM_I915_GEM_WAIT and DRM_I915_GEM_BUSY ioctls do not check for
+DMA_RESV_USAGE_BOOKKEEP usage and hence should not be used for end of batch
+check. Instead, the execbuf3 out fence should be used for end of batch check
+(See struct drm_i915_gem_execbuffer3).
+
+Also, in VM_BIND mode, use dma-resv apis for determining object activeness
+(See dma_resv_test_signaled() and dma_resv_wait_timeout()) and do not use the
+older i915_vma active reference tracking which is deprecated. This should
+make it easier to get it working with the current TTM backend.
+
+Mesa use case
+--------------
+VM_BIND can potentially reduce the CPU overhead in Mesa (both Vulkan and Iris),
+hence improving performance of CPU-bound applications. It also allows us to
+implement Vulkan's Sparse Resources. With increasing GPU hardware performance,
+reducing CPU overhead becomes more impactful.
+
+
+Other VM_BIND use cases
+========================
+
+Long running Compute contexts
+------------------------------
+Usage of dma-fence expects that fences complete in a reasonable amount of time.
+Compute on the other hand can be long running. Hence it is appropriate for
+compute to use user/memory fence (See `User/Memory Fence`_) and dma-fence usage
+must be limited to in-kernel consumption only.
+
+Where GPU page faults are not available, kernel driver upon buffer invalidation
+will initiate a suspend (preemption) of long running context, finish the
+invalidation, revalidate the BO and then resume the compute context. This is
+done by having a per-context preempt fence which is enabled when someone tries
+to wait on it and triggers the context preemption.
+
+User/Memory Fence
+~~~~~~~~~~~~~~~~~~
+User/Memory fence is a <address, value> pair. To signal the user fence, the
+specified value is written at the specified virtual address and the waiting
+process is woken up. User fence can be signaled either by the GPU or kernel
+async worker (like upon bind completion). User can wait on a user fence with
+a new user fence wait ioctl.
+
+Here is some prior work on this:
+https://patchwork.freedesktop.org/patch/349417/
+
+Low Latency Submission
+~~~~~~~~~~~~~~~~~~~~~~~
+Allows compute UMD to directly submit GPU jobs instead of through execbuf
+ioctl. This is made possible by VM_BIND not being synchronized against
+execbuf. VM_BIND allows bind/unbind of mappings required for the directly
+submitted jobs.
+
+Debugger
+---------
+With the debug event interface, a user space process (debugger) is able to keep
+track of and act upon resources created by another process (debugged) and
+attached to the GPU via the vm_bind interface.
+
+GPU page faults
+----------------
+GPU page faults when supported (in future), will only be supported in the
+VM_BIND mode. While both the older execbuf mode and the newer VM_BIND mode of
+binding will require using dma-fence to ensure residency, the GPU page faults
+mode when supported, will not use any dma-fence as residency is purely managed
+by installing and removing/invalidating page table entries.
+
+Page level hints settings
+--------------------------
+VM_BIND allows any hints setting per mapping instead of per BO.
+Possible hints include read-only mapping, placement and atomicity.
+Sub-BO level placement hint will be even more relevant with
+upcoming GPU on-demand page fault support.
+
+Page level Cache/CLOS settings
+-------------------------------
+VM_BIND allows cache/CLOS settings per mapping instead of per BO.
+
+Evictable page table allocations
+---------------------------------
+Make pagetable allocations evictable and manage them similar to VM_BIND
+mapped objects. Page table pages are similar to persistent mappings of a
+VM (the differences are that the page table pages will not have an i915_vma
+structure and after swapping pages back in, parent page link needs to be
+updated).
+
+Shared Virtual Memory (SVM) support
+------------------------------------
+VM_BIND interface can be used to map system memory directly (without gem BO
+abstraction) using the HMM interface. SVM is only supported with GPU page
+faults enabled.
+
+VM_BIND UAPI
+=============
+
+.. kernel-doc:: Documentation/gpu/rfc/i915_vm_bind.h
diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
index 91e93a705230..7d10c36b268d 100644
--- a/Documentation/gpu/rfc/index.rst
+++ b/Documentation/gpu/rfc/index.rst
@@ -23,3 +23,7 @@ host such documentation:
 .. toctree::
 
     i915_scheduler.rst
+
+.. toctree::
+
+    i915_vm_bind.rst
-- 
2.21.0.rc0.32.g243a4c7e27
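
The design document above lists aliasing, partial binding and user-managed
VA allocation among the VM_BIND features. The following is a small
user-space model of a VM's persistent mapping list that shows how those
three ideas fit together. It is a sketch only: struct mapping, struct vm,
vm_bind() and vm_unbind() here are hypothetical names and do not reflect the
uapi in Documentation/gpu/rfc/i915_vm_bind.h. Rejecting overlapping VA
ranges and requiring an exact range match on unbind are assumptions of this
model, not statements about the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one persistent mapping; not the i915 uapi. */
struct mapping {
	uint64_t va;		/* GPU virtual address chosen by the UMD */
	uint64_t length;	/* size of the mapped range in bytes */
	uint32_t bo;		/* illustrative BO handle */
	uint64_t offset;	/* offset into the BO; non-zero is partial */
};

#define MAX_MAPPINGS 16

struct vm {
	struct mapping maps[MAX_MAPPINGS];
	int count;
};

static bool ranges_overlap(uint64_t a, uint64_t alen,
			   uint64_t b, uint64_t blen)
{
	return a < b + blen && b < a + alen;
}

/*
 * Record a mapping at a caller-chosen VA. Returns 0 on success, or -1 if
 * the table is full or the VA range overlaps an existing mapping (an
 * assumption of this model). The same BO may be bound at several VAs.
 */
static int vm_bind(struct vm *vm, uint64_t va, uint64_t length,
		   uint32_t bo, uint64_t offset)
{
	if (length == 0 || vm->count == MAX_MAPPINGS)
		return -1;

	for (int i = 0; i < vm->count; i++)
		if (ranges_overlap(va, length,
				   vm->maps[i].va, vm->maps[i].length))
			return -1;

	vm->maps[vm->count++] = (struct mapping){ va, length, bo, offset };
	return 0;
}

/* Drop the mapping that exactly matches [va, va + length). */
static int vm_unbind(struct vm *vm, uint64_t va, uint64_t length)
{
	for (int i = 0; i < vm->count; i++) {
		if (vm->maps[i].va == va && vm->maps[i].length == length) {
			vm->maps[i] = vm->maps[--vm->count];
			return 0;
		}
	}
	return -1;
}
```

Binding BO 1 at two different VAs models aliasing, and passing a non-zero
offset with a length shorter than the BO models a partial binding. Because
the mappings persist in the table, a submission in this model needs no
per-call list of objects.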


* [PATCH v3 2/3] drm/i915: Update i915 uapi documentation
  2022-06-22  3:56 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-06-22  3:56   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 34+ messages in thread
From: Niranjana Vishwanathapura @ 2022-06-22  3:56 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, lionel.g.landwerlin,
	tvrtko.ursulin, chris.p.wilson, thomas.hellstrom, oak.zeng,
	matthew.auld, jason, daniel.vetter, christian.koenig

Add some missing i915 uapi documentation which the new
i915 VM_BIND feature documentation will refer to.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 include/uapi/drm/i915_drm.h | 205 ++++++++++++++++++++++++++++--------
 1 file changed, 160 insertions(+), 45 deletions(-)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index de49b68b4fc8..9e3e8697b837 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -751,14 +751,27 @@ typedef struct drm_i915_irq_wait {
 
 /* Must be kept compact -- no holes and well documented */
 
-typedef struct drm_i915_getparam {
+/**
+ * struct drm_i915_getparam - Driver parameter query structure.
+ */
+struct drm_i915_getparam {
+	/** @param: Driver parameter to query. */
 	__s32 param;
-	/*
+
+	/**
+	 * @value: Address of memory where queried value should be put.
+	 *
 	 * WARNING: Using pointers instead of fixed-size u64 means we need to write
 	 * compat32 code. Don't repeat this mistake.
 	 */
 	int __user *value;
-} drm_i915_getparam_t;
+};
+
+/**
+ * typedef drm_i915_getparam_t - Driver parameter query structure.
+ * See struct drm_i915_getparam.
+ */
+typedef struct drm_i915_getparam drm_i915_getparam_t;
 
 /* Ioctl to set kernel params:
  */
@@ -1239,76 +1252,119 @@ struct drm_i915_gem_exec_object2 {
 	__u64 rsvd2;
 };
 
+/**
+ * struct drm_i915_gem_exec_fence - An input or output fence for the execbuf
+ * ioctl.
+ *
+ * The request will wait for input fence to signal before submission.
+ *
+ * The returned output fence will be signaled after the completion of the
+ * request.
+ */
 struct drm_i915_gem_exec_fence {
-	/**
-	 * User's handle for a drm_syncobj to wait on or signal.
-	 */
+	/** @handle: User's handle for a drm_syncobj to wait on or signal. */
 	__u32 handle;
 
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_EXEC_FENCE_WAIT:
+	 * Wait for the input fence before request submission.
+	 *
+	 * I915_EXEC_FENCE_SIGNAL:
+	 * Return request completion fence as output
+	 */
+	__u32 flags;
 #define I915_EXEC_FENCE_WAIT            (1<<0)
 #define I915_EXEC_FENCE_SIGNAL          (1<<1)
 #define __I915_EXEC_FENCE_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SIGNAL << 1))
-	__u32 flags;
 };
 
-/*
- * See drm_i915_gem_execbuffer_ext_timeline_fences.
- */
-#define DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES 0
-
-/*
+/**
+ * struct drm_i915_gem_execbuffer_ext_timeline_fences - Timeline fences
+ * for execbuf ioctl.
+ *
  * This structure describes an array of drm_syncobj and associated points for
  * timeline variants of drm_syncobj. It is invalid to append this structure to
  * the execbuf if I915_EXEC_FENCE_ARRAY is set.
  */
 struct drm_i915_gem_execbuffer_ext_timeline_fences {
+#define DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES 0
+	/** @base: Extension link. See struct i915_user_extension. */
 	struct i915_user_extension base;
 
 	/**
-	 * Number of element in the handles_ptr & value_ptr arrays.
+	 * @fence_count: Number of elements in the @handles_ptr & @values_ptr
+	 * arrays.
 	 */
 	__u64 fence_count;
 
 	/**
-	 * Pointer to an array of struct drm_i915_gem_exec_fence of length
-	 * fence_count.
+	 * @handles_ptr: Pointer to an array of struct drm_i915_gem_exec_fence
+	 * of length @fence_count.
 	 */
 	__u64 handles_ptr;
 
 	/**
-	 * Pointer to an array of u64 values of length fence_count. Values
-	 * must be 0 for a binary drm_syncobj. A Value of 0 for a timeline
-	 * drm_syncobj is invalid as it turns a drm_syncobj into a binary one.
+	 * @values_ptr: Pointer to an array of u64 values of length
+	 * @fence_count.
+	 * Values must be 0 for a binary drm_syncobj. A Value of 0 for a
+	 * timeline drm_syncobj is invalid as it turns a drm_syncobj into a
+	 * binary one.
 	 */
 	__u64 values_ptr;
 };
 
+/**
+ * struct drm_i915_gem_execbuffer2 - Structure for DRM_I915_GEM_EXECBUFFER2
+ * ioctl.
+ */
 struct drm_i915_gem_execbuffer2 {
-	/**
-	 * List of gem_exec_object2 structs
-	 */
+	/** @buffers_ptr: Pointer to a list of gem_exec_object2 structs */
 	__u64 buffers_ptr;
+
+	/** @buffer_count: Number of elements in @buffers_ptr array */
 	__u32 buffer_count;
 
-	/** Offset in the batchbuffer to start execution from. */
+	/**
+	 * @batch_start_offset: Offset in the batchbuffer to start execution
+	 * from.
+	 */
 	__u32 batch_start_offset;
-	/** Bytes used in batchbuffer from batch_start_offset */
+
+	/**
+	 * @batch_len: Length in bytes of the batch buffer, starting from the
+	 * @batch_start_offset. If 0, length is assumed to be the batch buffer
+	 * object size.
+	 */
 	__u32 batch_len;
+
+	/** @DR1: deprecated */
 	__u32 DR1;
+
+	/** @DR4: deprecated */
 	__u32 DR4;
+
+	/** @num_cliprects: See @cliprects_ptr */
 	__u32 num_cliprects;
+
 	/**
-	 * This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
-	 * & I915_EXEC_USE_EXTENSIONS are not set.
+	 * @cliprects_ptr: Kernel clipping was a DRI1 misfeature.
+	 *
+	 * It is invalid to use this field if I915_EXEC_FENCE_ARRAY or
+	 * I915_EXEC_USE_EXTENSIONS flags are not set.
 	 *
 	 * If I915_EXEC_FENCE_ARRAY is set, then this is a pointer to an array
-	 * of struct drm_i915_gem_exec_fence and num_cliprects is the length
-	 * of the array.
+	 * of &drm_i915_gem_exec_fence and @num_cliprects is the length of the
+	 * array.
 	 *
 	 * If I915_EXEC_USE_EXTENSIONS is set, then this is a pointer to a
-	 * single struct i915_user_extension and num_cliprects is 0.
+	 * single &i915_user_extension and num_cliprects is 0.
 	 */
 	__u64 cliprects_ptr;
+
+	/** @flags: Execbuf flags */
+	__u64 flags;
 #define I915_EXEC_RING_MASK              (0x3f)
 #define I915_EXEC_DEFAULT                (0<<0)
 #define I915_EXEC_RENDER                 (1<<0)
@@ -1326,10 +1382,6 @@ struct drm_i915_gem_execbuffer2 {
 #define I915_EXEC_CONSTANTS_REL_GENERAL (0<<6) /* default */
 #define I915_EXEC_CONSTANTS_ABSOLUTE 	(1<<6)
 #define I915_EXEC_CONSTANTS_REL_SURFACE (2<<6) /* gen4/5 only */
-	__u64 flags;
-	__u64 rsvd1; /* now used for context info */
-	__u64 rsvd2;
-};
 
 /** Resets the SO write offset registers for transform feedback on gen7. */
 #define I915_EXEC_GEN7_SOL_RESET	(1<<8)
@@ -1432,9 +1484,23 @@ struct drm_i915_gem_execbuffer2 {
  * drm_i915_gem_execbuffer_ext enum.
  */
 #define I915_EXEC_USE_EXTENSIONS	(1 << 21)
-
 #define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_USE_EXTENSIONS << 1))
 
+	/** @rsvd1: Context id */
+	__u64 rsvd1;
+
+	/**
+	 * @rsvd2: in and out sync_file file descriptors.
+	 *
+	 * When I915_EXEC_FENCE_IN or I915_EXEC_FENCE_SUBMIT flag is set, the
+	 * lower 32 bits of this field will have the in sync_file fd (input).
+	 *
+	 * When I915_EXEC_FENCE_OUT flag is set, the upper 32 bits of this
+	 * field will have the out sync_file fd (output).
+	 */
+	__u64 rsvd2;
+};
+
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
 	(eb2).rsvd1 = context & I915_EXEC_CONTEXT_ID_MASK
@@ -1814,19 +1880,58 @@ struct drm_i915_gem_context_create {
 	__u32 pad;
 };
 
+/**
+ * struct drm_i915_gem_context_create_ext - Structure for creating contexts.
+ */
 struct drm_i915_gem_context_create_ext {
-	__u32 ctx_id; /* output: id of new context*/
+	/** @ctx_id: Id of the created context (output) */
+	__u32 ctx_id;
+
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS:
+	 *
+	 * Extensions may be appended to this structure and driver must check
+	 * for those. See @extensions.
+	 *
+	 * I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE
+	 *
+	 * Created context will have single timeline.
+	 */
 	__u32 flags;
 #define I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS	(1u << 0)
 #define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE	(1u << 1)
 #define I915_CONTEXT_CREATE_FLAGS_UNKNOWN \
 	(-(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE << 1))
+
+	/**
+	 * @extensions: Zero-terminated chain of extensions.
+	 *
+	 * I915_CONTEXT_CREATE_EXT_SETPARAM:
+	 * Context parameter to set or query during context creation.
+	 * See struct drm_i915_gem_context_create_ext_setparam.
+	 *
+	 * I915_CONTEXT_CREATE_EXT_CLONE:
+	 * This extension has been removed. On the off chance someone somewhere
+	 * has attempted to use it, never re-use this extension number.
+	 */
 	__u64 extensions;
+#define I915_CONTEXT_CREATE_EXT_SETPARAM 0
+#define I915_CONTEXT_CREATE_EXT_CLONE 1
 };
 
+/**
+ * struct drm_i915_gem_context_param - Context parameter to set or query.
+ */
 struct drm_i915_gem_context_param {
+	/** @ctx_id: Context id */
 	__u32 ctx_id;
+
+	/** @size: Size of the parameter @value */
 	__u32 size;
+
+	/** @param: Parameter to set or query */
 	__u64 param;
 #define I915_CONTEXT_PARAM_BAN_PERIOD	0x1
 /* I915_CONTEXT_PARAM_NO_ZEROMAP has been removed.  On the off chance
@@ -1973,6 +2078,7 @@ struct drm_i915_gem_context_param {
 #define I915_CONTEXT_PARAM_PROTECTED_CONTENT    0xd
 /* Must be kept compact -- no holes and well documented */
 
+	/** @value: Context parameter value to be set or queried */
 	__u64 value;
 };
 
@@ -2371,23 +2477,29 @@ struct i915_context_param_engines {
 	struct i915_engine_class_instance engines[N__]; \
 } __attribute__((packed)) name__
 
+/**
+ * struct drm_i915_gem_context_create_ext_setparam - Context parameter
+ * to set or query during context creation.
+ */
 struct drm_i915_gem_context_create_ext_setparam {
-#define I915_CONTEXT_CREATE_EXT_SETPARAM 0
+	/** @base: Extension link. See struct i915_user_extension. */
 	struct i915_user_extension base;
+
+	/**
+	 * @param: Context parameter to set or query.
+	 * See struct drm_i915_gem_context_param.
+	 */
 	struct drm_i915_gem_context_param param;
 };
 
-/* This API has been removed.  On the off chance someone somewhere has
- * attempted to use it, never re-use this extension number.
- */
-#define I915_CONTEXT_CREATE_EXT_CLONE 1
-
 struct drm_i915_gem_context_destroy {
 	__u32 ctx_id;
 	__u32 pad;
 };
 
-/*
+/**
+ * struct drm_i915_gem_vm_control - Structure to create or destroy VM.
+ *
  * DRM_I915_GEM_VM_CREATE -
  *
  * Create a new virtual memory address space (ppGTT) for use within a context
@@ -2397,20 +2509,23 @@ struct drm_i915_gem_context_destroy {
  * The id of new VM (bound to the fd) for use with I915_CONTEXT_PARAM_VM is
  * returned in the outparam @id.
  *
- * No flags are defined, with all bits reserved and must be zero.
- *
  * An extension chain maybe provided, starting with @extensions, and terminated
  * by the @next_extension being 0. Currently, no extensions are defined.
  *
  * DRM_I915_GEM_VM_DESTROY -
  *
- * Destroys a previously created VM id, specified in @id.
+ * Destroys a previously created VM id, specified in @vm_id.
  *
  * No extensions or flags are allowed currently, and so must be zero.
  */
 struct drm_i915_gem_vm_control {
+	/** @extensions: Zero-terminated chain of extensions. */
 	__u64 extensions;
+
+	/** @flags: reserved for future usage, currently MBZ */
 	__u32 flags;
+
+	/** @vm_id: Id of the VM created or to be destroyed */
 	__u32 vm_id;
 };
 
-- 
2.21.0.rc0.32.g243a4c7e27
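
Illustration only, not part of the patch: several fields documented
above are a "zero-terminated chain of extensions". The sketch below
walks such a chain from userspace-visible memory. struct
ex_user_extension is a local mirror of struct i915_user_extension
written out for this example, and ex_chain_length() is a hypothetical
helper; the kernel's own parsing copies each link from userspace and
applies its own limits, which this sketch does not model.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct i915_user_extension, for illustration only. */
struct ex_user_extension {
	uint64_t next_extension;	/* pointer to the next link, 0 terminates */
	uint32_t name;			/* extension id, e.g. ..._EXT_SETPARAM */
	uint32_t flags;
	uint32_t rsvd[4];
};

/* Count the links in a zero-terminated chain starting at @extensions. */
static size_t ex_chain_length(uint64_t extensions)
{
	size_t n = 0;

	while (extensions) {
		const struct ex_user_extension *ext =
			(const struct ex_user_extension *)(uintptr_t)extensions;

		n++;
		extensions = ext->next_extension;
	}
	return n;
}
```

Each concrete extension struct embeds this link as its first member
(@base), which is why a chain can mix different extension types.
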


* [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-22  3:56 ` [Intel-gfx] " Niranjana Vishwanathapura
@ 2022-06-22  3:56   ` Niranjana Vishwanathapura
  -1 siblings, 0 replies; 34+ messages in thread
From: Niranjana Vishwanathapura @ 2022-06-22  3:56 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, lionel.g.landwerlin,
	tvrtko.ursulin, chris.p.wilson, thomas.hellstrom, oak.zeng,
	matthew.auld, jason, daniel.vetter, christian.koenig

VM_BIND and related uapi definitions

v2: Reduce the scope to simple Mesa use case.
v3: Expand VM_UNBIND documentation and add
    I915_GEM_VM_BIND/UNBIND_FENCE_VALID
    and I915_GEM_VM_BIND_TLB_FLUSH flags.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
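Note for readers, not part of the commit: the alignment rules in the
struct drm_i915_gem_vm_bind kernel-doc below can be sketched as a
userspace pre-check. Everything named ex_* or EX_* is hypothetical. The
lmem_64k parameter stands for "binding a device local-memory object on
a platform with 64K pages and a compact page table" (DG2 and XEHPSDV
per the text), and rejecting a zero length is an assumption of this
sketch, not something the RFC states.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_SZ_4K	(UINT64_C(1) << 12)
#define EX_SZ_64K	(UINT64_C(1) << 16)
#define EX_SZ_2M	(UINT64_C(1) << 21)

/* True if @v is a multiple of the power-of-two alignment @a. */
static bool ex_is_aligned(uint64_t v, uint64_t a)
{
	return (v & (a - 1)) == 0;
}

/*
 * Check @start, @offset and @length against the documented rules:
 * 4K for everything by default; 2M for @start and 64K for @offset and
 * @length when @lmem_64k is true.
 */
static bool ex_vm_bind_range_ok(uint64_t start, uint64_t offset,
				uint64_t length, bool lmem_64k)
{
	uint64_t start_align = lmem_64k ? EX_SZ_2M : EX_SZ_4K;
	uint64_t other_align = lmem_64k ? EX_SZ_64K : EX_SZ_4K;

	if (length == 0)	/* assumption: an empty mapping is rejected */
		return false;

	return ex_is_aligned(start, start_align) &&
	       ex_is_aligned(offset, other_align) &&
	       ex_is_aligned(length, other_align);
}
```

The rule that local-memory and system-memory objects may not share a
2M section of VA range needs knowledge of the existing mappings and is
not covered here.
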
 Documentation/gpu/rfc/i915_vm_bind.h | 243 +++++++++++++++++++++++++++
 1 file changed, 243 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h

diff --git a/Documentation/gpu/rfc/i915_vm_bind.h b/Documentation/gpu/rfc/i915_vm_bind.h
new file mode 100644
index 000000000000..fa23b2d7ec6f
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_vm_bind.h
@@ -0,0 +1,243 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+/**
+ * DOC: I915_PARAM_HAS_VM_BIND
+ *
+ * VM_BIND feature availability.
+ * See typedef drm_i915_getparam_t param.
+ */
+#define I915_PARAM_HAS_VM_BIND		57
+
+/**
+ * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
+ *
+ * Flag to opt-in for VM_BIND mode of binding during VM creation.
+ * See struct drm_i915_gem_vm_control flags.
+ *
+ * The older execbuf2 ioctl will not support VM_BIND mode of operation.
+ * For VM_BIND mode, we have new execbuf3 ioctl which will not accept any
+ * execlist (See struct drm_i915_gem_execbuffer3 for more details).
+ *
+ */
+#define I915_VM_CREATE_FLAGS_USE_VM_BIND	(1 << 0)
+
+/* VM_BIND related ioctls */
+#define DRM_I915_GEM_VM_BIND		0x3d
+#define DRM_I915_GEM_VM_UNBIND		0x3e
+#define DRM_I915_GEM_EXECBUFFER3	0x3f
+
+#define DRM_IOCTL_I915_GEM_VM_BIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
+#define DRM_IOCTL_I915_GEM_VM_UNBIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
+#define DRM_IOCTL_I915_GEM_EXECBUFFER3		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)
+
+/**
+ * struct drm_i915_gem_vm_bind_fence - Bind/unbind completion notification.
+ *
+ * A timeline out fence for vm_bind/unbind completion notification.
+ */
+struct drm_i915_gem_vm_bind_fence {
+	/** @handle: User's handle for a drm_syncobj to signal. */
+	__u32 handle;
+
+	/** @rsvd: Reserved, MBZ */
+	__u32 rsvd;
+
+	/**
+	 * @value: A point in the timeline.
+	 * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
+	 * timeline drm_syncobj is invalid as it turns a drm_syncobj into a
+	 * binary one.
+	 */
+	__u64 value;
+};
+
+/**
+ * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
+ *
+ * This structure is passed to VM_BIND ioctl and specifies the mapping of GPU
+ * virtual address (VA) range to the section of an object that should be bound
+ * in the device page table of the specified address space (VM).
+ * The VA range specified must be unique (i.e., not currently bound) and can
+ * be mapped to whole object or a section of the object (partial binding).
+ * Multiple VA mappings can be created to the same section of the object
+ * (aliasing).
+ *
+ * The @start, @offset and @length should be 4K page aligned. However, DG2
+ * and XEHPSDV have a 64K page size for device local-memory and a compact page
+ * table. On those platforms, for binding device local-memory objects, the
+ * @start should be 2M aligned, @offset and @length should be 64K aligned.
+ * Also, on those platforms, it is not allowed to bind a device local-memory
+ * object and a system memory object in a single 2M section of VA range.
+ */
+struct drm_i915_gem_vm_bind {
+	/** @vm_id: VM (address space) id to bind */
+	__u32 vm_id;
+
+	/** @handle: Object handle */
+	__u32 handle;
+
+	/** @start: Virtual Address start to bind */
+	__u64 start;
+
+	/** @offset: Offset in object to bind */
+	__u64 offset;
+
+	/** @length: Length of mapping to bind */
+	__u64 length;
+
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_GEM_VM_BIND_FENCE_VALID:
+	 * @fence is valid, needs bind completion notification.
+	 *
+	 * I915_GEM_VM_BIND_READONLY:
+	 * Mapping is read-only.
+	 *
+	 * I915_GEM_VM_BIND_CAPTURE:
+	 * Capture this mapping in the dump upon GPU error.
+	 *
+	 * I915_GEM_VM_BIND_TLB_FLUSH:
+	 * Flush the TLB for the specified range after bind completion.
+	 */
+	__u64 flags;
+#define I915_GEM_VM_BIND_FENCE_VALID	(1 << 0)
+#define I915_GEM_VM_BIND_READONLY	(1 << 1)
+#define I915_GEM_VM_BIND_CAPTURE	(1 << 2)
+#define I915_GEM_VM_BIND_TLB_FLUSH	(1 << 3)
+
+	/** @fence: Timeline fence for bind completion signaling */
+	struct drm_i915_gem_vm_bind_fence fence;
+
+	/** @extensions: 0-terminated chain of extensions */
+	__u64 extensions;
+};
+
+/**
+ * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
+ *
+ * This structure is passed to VM_UNBIND ioctl and specifies the GPU virtual
+ * address (VA) range that should be unbound from the device page table of the
+ * specified address space (VM). The specified VA range must match one of the
+ * mappings created with the VM_BIND ioctl. TLB is flushed upon unbind
+ * completion. The unbind operation will force unbind the specified range from
+ * device page table without waiting for any GPU job to complete. It is the
+ * UMD's responsibility to ensure the mapping is no longer in use before
+ * calling VM_UNBIND.
+ *
+ * The @start and @length must specify a unique mapping bound with VM_BIND
+ * ioctl.
+ */
+struct drm_i915_gem_vm_unbind {
+	/** @vm_id: VM (address space) id to unbind */
+	__u32 vm_id;
+
+	/** @rsvd: Reserved, MBZ */
+	__u32 rsvd;
+
+	/** @start: Virtual Address start to unbind */
+	__u64 start;
+
+	/** @length: Length of mapping to unbind */
+	__u64 length;
+
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_GEM_VM_UNBIND_FENCE_VALID:
+	 * @fence is valid, needs unbind completion notification.
+	 */
+	__u64 flags;
+#define I915_GEM_VM_UNBIND_FENCE_VALID	(1 << 0)
+
+	/** @fence: Timeline fence for unbind completion signaling */
+	struct drm_i915_gem_vm_bind_fence fence;
+
+	/** @extensions: 0-terminated chain of extensions */
+	__u64 extensions;
+};
+
+/**
+ * struct drm_i915_gem_execbuffer3 - Structure for DRM_I915_GEM_EXECBUFFER3
+ * ioctl.
+ *
+ * DRM_I915_GEM_EXECBUFFER3 ioctl only works in VM_BIND mode and VM_BIND mode
+ * only works with this ioctl for submission.
+ * See I915_VM_CREATE_FLAGS_USE_VM_BIND.
+ */
+struct drm_i915_gem_execbuffer3 {
+	/**
+	 * @ctx_id: Context id
+	 *
+	 * Only contexts with user engine map are allowed.
+	 */
+	__u32 ctx_id;
+
+	/**
+	 * @engine_idx: Engine index
+	 *
+	 * An index in the user engine map of the context specified by @ctx_id.
+	 */
+	__u32 engine_idx;
+
+	/** @rsvd1: Reserved, MBZ */
+	__u32 rsvd1;
+
+	/**
+	 * @batch_count: Number of batches in @batch_address array.
+	 *
+	 * 0 is invalid. For parallel submission, it should be equal to the
+	 * number of (parallel) engines involved in that submission.
+	 */
+	__u32 batch_count;
+
+	/**
+	 * @batch_address: Array of batch gpu virtual addresses.
+	 *
+	 * If @batch_count is 1, then it is the gpu virtual address of the
+	 * batch buffer. If @batch_count > 1, then it is a pointer to an array
+	 * of batch buffer gpu virtual addresses.
+	 */
+	__u64 batch_address;
+
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_EXEC3_SECURE:
+	 * Request a privileged ("secure") batch buffer/s.
+	 * It is only available for DRM_ROOT_ONLY | DRM_MASTER processes.
+	 */
+	__u64 flags;
+#define I915_EXEC3_SECURE	(1<<0)
+
+	/** @rsvd2: Reserved, MBZ */
+	__u64 rsvd2;
+
+	/**
+	 * @extensions: Zero-terminated chain of extensions.
+	 *
+	 * DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES:
+	 * It has same format as DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES.
+	 * See struct drm_i915_gem_execbuffer_ext_timeline_fences.
+	 */
+	__u64 extensions;
+#define DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES	0
+};
+
+/**
+ * struct drm_i915_gem_create_ext_vm_private - Extension to make the object
+ * private to the specified VM.
+ *
+ * See struct drm_i915_gem_create_ext.
+ */
+struct drm_i915_gem_create_ext_vm_private {
+#define I915_GEM_CREATE_EXT_VM_PRIVATE		2
+	/** @base: Extension link. See struct i915_user_extension. */
+	struct i915_user_extension base;
+
+	/** @vm_id: Id of the VM to which the object is private */
+	__u32 vm_id;
+};
-- 
2.21.0.rc0.32.g243a4c7e27
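
Illustration only, not part of the patch: the @batch_address field of
struct drm_i915_gem_execbuffer3 is documented above as either the GPU
virtual address itself (when @batch_count is 1) or a pointer to an
array of GPU virtual addresses (when @batch_count > 1). A minimal
sketch of how userspace might fill it follows. ex_batch_address() is a
hypothetical helper, and returning 0 for an empty array is a choice of
this sketch to flag the invalid case.

```c
#include <stdint.h>

/*
 * Compute the value for @batch_address from an array of batch GPU
 * virtual addresses. Returns 0 when @batch_count is 0, which the RFC
 * describes as invalid, so the caller should not submit in that case.
 */
static uint64_t ex_batch_address(const uint64_t *batch_vas,
				 uint32_t batch_count)
{
	if (batch_count == 0)
		return 0;

	/* Single batch: the field holds the GPU virtual address directly. */
	if (batch_count == 1)
		return batch_vas[0];

	/* Several batches: the field holds a user pointer to the array. */
	return (uint64_t)(uintptr_t)batch_vas;
}
```

In the multi-batch case the array must stay valid for the duration of
the ioctl call, since the kernel reads it through the pointer.
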


* [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
@ 2022-06-22  3:56   ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 34+ messages in thread
From: Niranjana Vishwanathapura @ 2022-06-22  3:56 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: paulo.r.zanoni, chris.p.wilson, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig

VM_BIND and related uapi definitions

v2: Reduce the scope to simple Mesa use case.
v3: Expand VM_UNBIND documentation and add
    I915_GEM_VM_BIND/UNBIND_FENCE_VALID
    and I915_GEM_VM_BIND_TLB_FLUSH flags.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 Documentation/gpu/rfc/i915_vm_bind.h | 243 +++++++++++++++++++++++++++
 1 file changed, 243 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h

diff --git a/Documentation/gpu/rfc/i915_vm_bind.h b/Documentation/gpu/rfc/i915_vm_bind.h
new file mode 100644
index 000000000000..fa23b2d7ec6f
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_vm_bind.h
@@ -0,0 +1,243 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+/**
+ * DOC: I915_PARAM_HAS_VM_BIND
+ *
+ * VM_BIND feature availability.
+ * See typedef drm_i915_getparam_t param.
+ */
+#define I915_PARAM_HAS_VM_BIND		57
+
+/**
+ * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
+ *
+ * Flag to opt-in for VM_BIND mode of binding during VM creation.
+ * See struct drm_i915_gem_vm_control flags.
+ *
+ * The older execbuf2 ioctl will not support VM_BIND mode of operation.
+ * For VM_BIND mode, we have new execbuf3 ioctl which will not accept any
+ * execlist (See struct drm_i915_gem_execbuffer3 for more details).
+ *
+ */
+#define I915_VM_CREATE_FLAGS_USE_VM_BIND	(1 << 0)
+
+/* VM_BIND related ioctls */
+#define DRM_I915_GEM_VM_BIND		0x3d
+#define DRM_I915_GEM_VM_UNBIND		0x3e
+#define DRM_I915_GEM_EXECBUFFER3	0x3f
+
+#define DRM_IOCTL_I915_GEM_VM_BIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
+#define DRM_IOCTL_I915_GEM_VM_UNBIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
+#define DRM_IOCTL_I915_GEM_EXECBUFFER3		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)
+
+/**
+ * struct drm_i915_gem_vm_bind_fence - Bind/unbind completion notification.
+ *
+ * A timeline out fence for vm_bind/unbind completion notification.
+ */
+struct drm_i915_gem_vm_bind_fence {
+	/** @handle: User's handle for a drm_syncobj to signal. */
+	__u32 handle;
+
+	/** @rsvd: Reserved, MBZ */
+	__u32 rsvd;
+
+	/**
+	 * @value: A point in the timeline.
+	 * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
+	 * timeline drm_syncobj is invalid as it turns a drm_syncobj into a
+	 * binary one.
+	 */
+	__u64 value;
+};
+
+/**
+ * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
+ *
+ * This structure is passed to VM_BIND ioctl and specifies the mapping of GPU
+ * virtual address (VA) range to the section of an object that should be bound
+ * in the device page table of the specified address space (VM).
+ * The VA range specified must be unique (i.e., not currently bound) and can
+ * be mapped to the whole object or a section of the object (partial binding).
+ * Multiple VA mappings can be created to the same section of the object
+ * (aliasing).
+ *
+ * The @start, @offset and @length should be 4K page aligned. However, DG2
+ * and XEHPSDV have a 64K page size for device local-memory and a compact page
+ * table. On those platforms, for binding device local-memory objects, the
+ * @start should be 2M aligned, @offset and @length should be 64K aligned.
+ * Also, on those platforms, it is not allowed to bind a device local-memory
+ * object and a system memory object in a single 2M section of VA range.
+ */
+struct drm_i915_gem_vm_bind {
+	/** @vm_id: VM (address space) id to bind */
+	__u32 vm_id;
+
+	/** @handle: Object handle */
+	__u32 handle;
+
+	/** @start: Virtual Address start to bind */
+	__u64 start;
+
+	/** @offset: Offset in object to bind */
+	__u64 offset;
+
+	/** @length: Length of mapping to bind */
+	__u64 length;
+
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_GEM_VM_BIND_FENCE_VALID:
+	 * @fence is valid, needs bind completion notification.
+	 *
+	 * I915_GEM_VM_BIND_READONLY:
+	 * Mapping is read-only.
+	 *
+	 * I915_GEM_VM_BIND_CAPTURE:
+	 * Capture this mapping in the dump upon GPU error.
+	 *
+	 * I915_GEM_VM_BIND_TLB_FLUSH:
+	 * Flush the TLB for the specified range after bind completion.
+	 */
+	__u64 flags;
+#define I915_GEM_VM_BIND_FENCE_VALID	(1 << 0)
+#define I915_GEM_VM_BIND_READONLY	(1 << 1)
+#define I915_GEM_VM_BIND_CAPTURE	(1 << 2)
+#define I915_GEM_VM_BIND_TLB_FLUSH	(1 << 3)
+
+	/** @fence: Timeline fence for bind completion signaling */
+	struct drm_i915_gem_vm_bind_fence fence;
+
+	/** @extensions: 0-terminated chain of extensions */
+	__u64 extensions;
+};
+
+/**
+ * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
+ *
+ * This structure is passed to VM_UNBIND ioctl and specifies the GPU virtual
+ * address (VA) range that should be unbound from the device page table of the
+ * specified address space (VM). The specified VA range must match one of the
+ * mappings created with the VM_BIND ioctl. The TLB is flushed upon unbind
+ * completion. The unbind operation will force unbind the specified range from
+ * the device page table without waiting for any GPU job to complete. It is
+ * the UMD's responsibility to ensure the mapping is no longer in use before
+ * calling VM_UNBIND.
+ *
+ * The @start and @length must specify a unique mapping bound with the VM_BIND
+ * ioctl.
+ */
+struct drm_i915_gem_vm_unbind {
+	/** @vm_id: VM (address space) id to unbind */
+	__u32 vm_id;
+
+	/** @rsvd: Reserved, MBZ */
+	__u32 rsvd;
+
+	/** @start: Virtual Address start to unbind */
+	__u64 start;
+
+	/** @length: Length of mapping to unbind */
+	__u64 length;
+
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_GEM_VM_UNBIND_FENCE_VALID:
+	 * @fence is valid, needs unbind completion notification.
+	 */
+	__u64 flags;
+#define I915_GEM_VM_UNBIND_FENCE_VALID	(1 << 0)
+
+	/** @fence: Timeline fence for unbind completion signaling */
+	struct drm_i915_gem_vm_bind_fence fence;
+
+	/** @extensions: 0-terminated chain of extensions */
+	__u64 extensions;
+};
+
+/**
+ * struct drm_i915_gem_execbuffer3 - Structure for DRM_I915_GEM_EXECBUFFER3
+ * ioctl.
+ *
+ * DRM_I915_GEM_EXECBUFFER3 ioctl only works in VM_BIND mode and VM_BIND mode
+ * only works with this ioctl for submission.
+ * See I915_VM_CREATE_FLAGS_USE_VM_BIND.
+ */
+struct drm_i915_gem_execbuffer3 {
+	/**
+	 * @ctx_id: Context id
+	 *
+	 * Only contexts with user engine map are allowed.
+	 */
+	__u32 ctx_id;
+
+	/**
+	 * @engine_idx: Engine index
+	 *
+	 * An index in the user engine map of the context specified by @ctx_id.
+	 */
+	__u32 engine_idx;
+
+	/** @rsvd1: Reserved, MBZ */
+	__u32 rsvd1;
+
+	/**
+	 * @batch_count: Number of batches in @batch_address array.
+	 *
+	 * 0 is invalid. For parallel submission, it should be equal to the
+	 * number of (parallel) engines involved in that submission.
+	 */
+	__u32 batch_count;
+
+	/**
+	 * @batch_address: Array of batch gpu virtual addresses.
+	 *
+	 * If @batch_count is 1, then it is the gpu virtual address of the
+	 * batch buffer. If @batch_count > 1, then it is a pointer to an array
+	 * of batch buffer gpu virtual addresses.
+	 */
+	__u64 batch_address;
+
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_EXEC3_SECURE:
+	 * Request a privileged ("secure") batch buffer/s.
+	 * It is only available for DRM_ROOT_ONLY | DRM_MASTER processes.
+	 */
+	__u64 flags;
+#define I915_EXEC3_SECURE	(1<<0)
+
+	/** @rsvd2: Reserved, MBZ */
+	__u64 rsvd2;
+
+	/**
+	 * @extensions: Zero-terminated chain of extensions.
+	 *
+	 * DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES:
+	 * It has same format as DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES.
+	 * See struct drm_i915_gem_execbuffer_ext_timeline_fences.
+	 */
+	__u64 extensions;
+#define DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES	0
+};
+
+/**
+ * struct drm_i915_gem_create_ext_vm_private - Extension to make the object
+ * private to the specified VM.
+ *
+ * See struct drm_i915_gem_create_ext.
+ */
+struct drm_i915_gem_create_ext_vm_private {
+#define I915_GEM_CREATE_EXT_VM_PRIVATE		2
+	/** @base: Extension link. See struct i915_user_extension. */
+	struct i915_user_extension base;
+
+	/** @vm_id: Id of the VM to which the object is private */
+	__u32 vm_id;
+};
-- 
2.21.0.rc0.32.g243a4c7e27



* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-22  3:56   ` [Intel-gfx] " Niranjana Vishwanathapura
  (?)
@ 2022-06-22  8:10   ` Tvrtko Ursulin
  2022-06-22 15:12     ` Niranjana Vishwanathapura
  -1 siblings, 1 reply; 34+ messages in thread
From: Tvrtko Ursulin @ 2022-06-22  8:10 UTC (permalink / raw)
  To: Niranjana Vishwanathapura, intel-gfx, dri-devel
  Cc: paulo.r.zanoni, chris.p.wilson, thomas.hellstrom, matthew.auld,
	daniel.vetter, christian.koenig


On 22/06/2022 04:56, Niranjana Vishwanathapura wrote:
> VM_BIND and related uapi definitions
> 
> v2: Reduce the scope to simple Mesa use case.
> v3: Expand VM_UNBIND documentation and add
>      I915_GEM_VM_BIND/UNBIND_FENCE_VALID
>      and I915_GEM_VM_BIND_TLB_FLUSH flags.
> 
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> ---
>   Documentation/gpu/rfc/i915_vm_bind.h | 243 +++++++++++++++++++++++++++
>   1 file changed, 243 insertions(+)
>   create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
> 
> diff --git a/Documentation/gpu/rfc/i915_vm_bind.h b/Documentation/gpu/rfc/i915_vm_bind.h
> new file mode 100644
> index 000000000000..fa23b2d7ec6f
> --- /dev/null
> +++ b/Documentation/gpu/rfc/i915_vm_bind.h
> @@ -0,0 +1,243 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +/**
> + * DOC: I915_PARAM_HAS_VM_BIND
> + *
> + * VM_BIND feature availability.
> + * See typedef drm_i915_getparam_t param.
> + */
> +#define I915_PARAM_HAS_VM_BIND		57
> +
> +/**
> + * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
> + *
> + * Flag to opt-in for VM_BIND mode of binding during VM creation.
> + * See struct drm_i915_gem_vm_control flags.
> + *
> + * The older execbuf2 ioctl will not support VM_BIND mode of operation.
> + * For VM_BIND mode, we have new execbuf3 ioctl which will not accept any
> + * execlist (See struct drm_i915_gem_execbuffer3 for more details).
> + *
> + */
> +#define I915_VM_CREATE_FLAGS_USE_VM_BIND	(1 << 0)
> +
> +/* VM_BIND related ioctls */
> +#define DRM_I915_GEM_VM_BIND		0x3d
> +#define DRM_I915_GEM_VM_UNBIND		0x3e
> +#define DRM_I915_GEM_EXECBUFFER3	0x3f
> +
> +#define DRM_IOCTL_I915_GEM_VM_BIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
> +#define DRM_IOCTL_I915_GEM_VM_UNBIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
> +#define DRM_IOCTL_I915_GEM_EXECBUFFER3		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)
> +
> +/**
> + * struct drm_i915_gem_vm_bind_fence - Bind/unbind completion notification.
> + *
> + * A timeline out fence for vm_bind/unbind completion notification.
> + */
> +struct drm_i915_gem_vm_bind_fence {
> +	/** @handle: User's handle for a drm_syncobj to signal. */
> +	__u32 handle;
> +
> +	/** @rsvd: Reserved, MBZ */
> +	__u32 rsvd;
> +
> +	/**
> +	 * @value: A point in the timeline.
> +	 * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
> +	 * timeline drm_syncobj is invalid as it turns a drm_syncobj into a
> +	 * binary one.
> +	 */
> +	__u64 value;
> +};
> +
> +/**
> + * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
> + *
> + * This structure is passed to VM_BIND ioctl and specifies the mapping of GPU
> + * virtual address (VA) range to the section of an object that should be bound
> + * in the device page table of the specified address space (VM).
> + * The VA range specified must be unique (i.e., not currently bound) and can
> + * be mapped to the whole object or a section of the object (partial binding).
> + * Multiple VA mappings can be created to the same section of the object
> + * (aliasing).
> + *
> + * The @start, @offset and @length should be 4K page aligned. However, DG2
> + * and XEHPSDV have a 64K page size for device local-memory and a compact page
> + * table. On those platforms, for binding device local-memory objects, the
> + * @start should be 2M aligned, @offset and @length should be 64K aligned.

Should some error codes be documented and has the ability to 
programmatically probe the alignment restrictions been considered?

> + * Also, on those platforms, it is not allowed to bind a device local-memory
> + * object and a system memory object in a single 2M section of VA range.

Text should be clear whether "not allowed" means there will be an error 
returned, or it will appear to work but bad things will happen.

> + */
> +struct drm_i915_gem_vm_bind {
> +	/** @vm_id: VM (address space) id to bind */
> +	__u32 vm_id;
> +
> +	/** @handle: Object handle */
> +	__u32 handle;
> +
> +	/** @start: Virtual Address start to bind */
> +	__u64 start;
> +
> +	/** @offset: Offset in object to bind */
> +	__u64 offset;
> +
> +	/** @length: Length of mapping to bind */
> +	__u64 length;
> +
> +	/**
> +	 * @flags: Supported flags are:
> +	 *
> +	 * I915_GEM_VM_BIND_FENCE_VALID:
> +	 * @fence is valid, needs bind completion notification.
> +	 *
> +	 * I915_GEM_VM_BIND_READONLY:
> +	 * Mapping is read-only.
> +	 *
> +	 * I915_GEM_VM_BIND_CAPTURE:
> +	 * Capture this mapping in the dump upon GPU error.
> +	 *
> +	 * I915_GEM_VM_BIND_TLB_FLUSH:
> +	 * Flush the TLB for the specified range after bind completion.
> +	 */
> +	__u64 flags;
> +#define I915_GEM_VM_BIND_FENCE_VALID	(1 << 0)
> +#define I915_GEM_VM_BIND_READONLY	(1 << 1)
> +#define I915_GEM_VM_BIND_CAPTURE	(1 << 2)
> +#define I915_GEM_VM_BIND_TLB_FLUSH	(1 << 3)

What is the use case for allowing any random user to play with (global) 
TLB flushing?

> +
> +	/** @fence: Timeline fence for bind completion signaling */
> +	struct drm_i915_gem_vm_bind_fence fence;

As agreed the other day - please document in the main kerneldoc section 
that all (un)binds are executed asynchronously and out of order.

> +
> +	/** @extensions: 0-terminated chain of extensions */
> +	__u64 extensions;
> +};
> +
> +/**
> + * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
> + *
> + * This structure is passed to VM_UNBIND ioctl and specifies the GPU virtual
> + * address (VA) range that should be unbound from the device page table of the
> + * specified address space (VM). The specified VA range must match one of the
> + * mappings created with the VM_BIND ioctl. TLB is flushed upon unbind
> + * completion. The unbind operation will force unbind the specified range from

Do we want to provide TLB flushing guarantees here and why? (As opposed 
to leaving them for implementation details.) If there is no implied 
order in either binds/unbinds, or between the two intermixed, then what 
is the point of guaranteeing a TLB flush on unbind completion?

> + * device page table without waiting for any GPU job to complete. It is the
> + * UMD's responsibility to ensure the mapping is no longer in use before
> + * calling VM_UNBIND.
> + *
> + * The @start and @length must specify a unique mapping bound with VM_BIND
> + * ioctl.
> + */
> +struct drm_i915_gem_vm_unbind {
> +	/** @vm_id: VM (address space) id to unbind */
> +	__u32 vm_id;
> +
> +	/** @rsvd: Reserved, MBZ */
> +	__u32 rsvd;
> +
> +	/** @start: Virtual Address start to unbind */
> +	__u64 start;
> +
> +	/** @length: Length of mapping to unbind */
> +	__u64 length;
> +
> +	/**
> +	 * @flags: Supported flags are:
> +	 *
> +	 * I915_GEM_VM_UNBIND_FENCE_VALID:
> +	 * @fence is valid, needs unbind completion notification.
> +	 */
> +	__u64 flags;
> +#define I915_GEM_VM_UNBIND_FENCE_VALID	(1 << 0)
> +
> +	/** @fence: Timeline fence for unbind completion signaling */
> +	struct drm_i915_gem_vm_bind_fence fence;

I am not sure the simplified ioctl story is super coherent. If 
everything is now fully async and out of order, but the input fence has 
been dropped, then how is userspace supposed to handle the address 
space? It will have to wait (in userspace) for unbinds to complete 
before submitting subsequent binds which use the same VA range.

Maybe that's passable, but then the fact execbuf3 has no input fence 
suggests a userspace wait between it and binds. And I am pretty sure 
historically those were always quite bad for performance.

Presumably userspace clients are happy with no input fences or it was 
considered too costly to implement it?

Regards,

Tvrtko

> +
> +	/** @extensions: 0-terminated chain of extensions */
> +	__u64 extensions;
> +};
> +
> +/**
> + * struct drm_i915_gem_execbuffer3 - Structure for DRM_I915_GEM_EXECBUFFER3
> + * ioctl.
> + *
> + * DRM_I915_GEM_EXECBUFFER3 ioctl only works in VM_BIND mode and VM_BIND mode
> + * only works with this ioctl for submission.
> + * See I915_VM_CREATE_FLAGS_USE_VM_BIND.
> + */
> +struct drm_i915_gem_execbuffer3 {
> +	/**
> +	 * @ctx_id: Context id
> +	 *
> +	 * Only contexts with user engine map are allowed.
> +	 */
> +	__u32 ctx_id;
> +
> +	/**
> +	 * @engine_idx: Engine index
> +	 *
> +	 * An index in the user engine map of the context specified by @ctx_id.
> +	 */
> +	__u32 engine_idx;
> +
> +	/** @rsvd1: Reserved, MBZ */
> +	__u32 rsvd1;
> +
> +	/**
> +	 * @batch_count: Number of batches in @batch_address array.
> +	 *
> +	 * 0 is invalid. For parallel submission, it should be equal to the
> +	 * number of (parallel) engines involved in that submission.
> +	 */
> +	__u32 batch_count;
> +
> +	/**
> +	 * @batch_address: Array of batch gpu virtual addresses.
> +	 *
> +	 * If @batch_count is 1, then it is the gpu virtual address of the
> +	 * batch buffer. If @batch_count > 1, then it is a pointer to an array
> +	 * of batch buffer gpu virtual addresses.
> +	 */
> +	__u64 batch_address;
> +
> +	/**
> +	 * @flags: Supported flags are:
> +	 *
> +	 * I915_EXEC3_SECURE:
> +	 * Request a privileged ("secure") batch buffer/s.
> +	 * It is only available for DRM_ROOT_ONLY | DRM_MASTER processes.
> +	 */
> +	__u64 flags;
> +#define I915_EXEC3_SECURE	(1<<0)
> +
> +	/** @rsvd2: Reserved, MBZ */
> +	__u64 rsvd2;
> +
> +	/**
> +	 * @extensions: Zero-terminated chain of extensions.
> +	 *
> +	 * DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES:
> +	 * It has same format as DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES.
> +	 * See struct drm_i915_gem_execbuffer_ext_timeline_fences.
> +	 */
> +	__u64 extensions;
> +#define DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES	0
> +};
> +
> +/**
> + * struct drm_i915_gem_create_ext_vm_private - Extension to make the object
> + * private to the specified VM.
> + *
> + * See struct drm_i915_gem_create_ext.
> + */
> +struct drm_i915_gem_create_ext_vm_private {
> +#define I915_GEM_CREATE_EXT_VM_PRIVATE		2
> +	/** @base: Extension link. See struct i915_user_extension. */
> +	struct i915_user_extension base;
> +
> +	/** @vm_id: Id of the VM to which the object is private */
> +	__u32 vm_id;
> +};


* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-22  8:10   ` Tvrtko Ursulin
@ 2022-06-22 15:12     ` Niranjana Vishwanathapura
  2022-06-22 15:57       ` Tvrtko Ursulin
  2022-06-23  9:28       ` Lionel Landwerlin
  0 siblings, 2 replies; 34+ messages in thread
From: Niranjana Vishwanathapura @ 2022-06-22 15:12 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: paulo.r.zanoni, intel-gfx, chris.p.wilson, thomas.hellstrom,
	dri-devel, daniel.vetter, christian.koenig, matthew.auld

On Wed, Jun 22, 2022 at 09:10:07AM +0100, Tvrtko Ursulin wrote:
>
>On 22/06/2022 04:56, Niranjana Vishwanathapura wrote:
>>VM_BIND and related uapi definitions
>>
>>v2: Reduce the scope to simple Mesa use case.
>>v3: Expand VM_UNBIND documentation and add
>>     I915_GEM_VM_BIND/UNBIND_FENCE_VALID
>>     and I915_GEM_VM_BIND_TLB_FLUSH flags.
>>
>>Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>>---
>>  Documentation/gpu/rfc/i915_vm_bind.h | 243 +++++++++++++++++++++++++++
>>  1 file changed, 243 insertions(+)
>>  create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
>>
>>diff --git a/Documentation/gpu/rfc/i915_vm_bind.h b/Documentation/gpu/rfc/i915_vm_bind.h
>>new file mode 100644
>>index 000000000000..fa23b2d7ec6f
>>--- /dev/null
>>+++ b/Documentation/gpu/rfc/i915_vm_bind.h
>>@@ -0,0 +1,243 @@
>>+/* SPDX-License-Identifier: MIT */
>>+/*
>>+ * Copyright © 2022 Intel Corporation
>>+ */
>>+
>>+/**
>>+ * DOC: I915_PARAM_HAS_VM_BIND
>>+ *
>>+ * VM_BIND feature availability.
>>+ * See typedef drm_i915_getparam_t param.
>>+ */
>>+#define I915_PARAM_HAS_VM_BIND		57
>>+
>>+/**
>>+ * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
>>+ *
>>+ * Flag to opt-in for VM_BIND mode of binding during VM creation.
>>+ * See struct drm_i915_gem_vm_control flags.
>>+ *
>>+ * The older execbuf2 ioctl will not support VM_BIND mode of operation.
>>+ * For VM_BIND mode, we have new execbuf3 ioctl which will not accept any
>>+ * execlist (See struct drm_i915_gem_execbuffer3 for more details).
>>+ *
>>+ */
>>+#define I915_VM_CREATE_FLAGS_USE_VM_BIND	(1 << 0)
>>+
>>+/* VM_BIND related ioctls */
>>+#define DRM_I915_GEM_VM_BIND		0x3d
>>+#define DRM_I915_GEM_VM_UNBIND		0x3e
>>+#define DRM_I915_GEM_EXECBUFFER3	0x3f
>>+
>>+#define DRM_IOCTL_I915_GEM_VM_BIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
>>+#define DRM_IOCTL_I915_GEM_VM_UNBIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
>>+#define DRM_IOCTL_I915_GEM_EXECBUFFER3		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)
>>+
>>+/**
>>+ * struct drm_i915_gem_vm_bind_fence - Bind/unbind completion notification.
>>+ *
>>+ * A timeline out fence for vm_bind/unbind completion notification.
>>+ */
>>+struct drm_i915_gem_vm_bind_fence {
>>+	/** @handle: User's handle for a drm_syncobj to signal. */
>>+	__u32 handle;
>>+
>>+	/** @rsvd: Reserved, MBZ */
>>+	__u32 rsvd;
>>+
>>+	/**
>>+	 * @value: A point in the timeline.
>>+	 * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
>>+	 * timeline drm_syncobj is invalid as it turns a drm_syncobj into a
>>+	 * binary one.
>>+	 */
>>+	__u64 value;
>>+};
>>+
>>+/**
>>+ * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
>>+ *
>>+ * This structure is passed to VM_BIND ioctl and specifies the mapping of GPU
>>+ * virtual address (VA) range to the section of an object that should be bound
>>+ * in the device page table of the specified address space (VM).
>>+ * The VA range specified must be unique (i.e., not currently bound) and can
>>+ * be mapped to the whole object or a section of the object (partial binding).
>>+ * Multiple VA mappings can be created to the same section of the object
>>+ * (aliasing).
>>+ *
>>+ * The @start, @offset and @length should be 4K page aligned. However, DG2
>>+ * and XEHPSDV have a 64K page size for device local-memory and a compact page
>>+ * table. On those platforms, for binding device local-memory objects, the
>>+ * @start should be 2M aligned, @offset and @length should be 64K aligned.
>
>Should some error codes be documented and has the ability to 
>programmatically probe the alignment restrictions been considered?
>

Currently what we have internally is that -EINVAL is returned if the start, offset
and length are not aligned. If the specified mapping already exists, we return
-EEXIST. If there are conflicts in the VA range and the VA range can't be reserved,
then -ENOSPC is returned. I can add this documentation here. But I am worried
that there will be more suggestions/feedback about error codes while reviewing
the code patch series, and we will have to revisit it again.
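
As a rough illustration of the alignment rules quoted above, a userspace
pre-check might look like the sketch below. This is only a sketch under
stated assumptions: the helper names (is_aligned, vm_bind_check_alignment)
and the lmem_64k parameter are hypothetical and not part of the proposed
uapi, and only the -EINVAL case for misaligned arguments is modelled. The
-EEXIST and -ENOSPC cases depend on kernel-side VA state and are not covered.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K	(UINT64_C(1) << 12)
#define SZ_64K	(UINT64_C(1) << 16)
#define SZ_2M	(UINT64_C(1) << 21)

/* Hypothetical helper: true if v is a multiple of the power-of-two align. */
static bool is_aligned(uint64_t v, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (v & (align - 1)) == 0;
}

/*
 * Hypothetical pre-check mirroring the documented alignment rules.
 *
 * lmem_64k is an assumption of this sketch: the caller sets it when binding
 * a device local-memory object on a platform with 64K local-memory pages
 * and a compact page table (DG2, XEHPSDV per the kerneldoc).
 *
 * Returns 0 if the arguments are aligned, -EINVAL otherwise.
 */
static int vm_bind_check_alignment(uint64_t start, uint64_t offset,
				   uint64_t length, bool lmem_64k)
{
	/* @start: 2M for 64K local memory, 4K otherwise. */
	uint64_t start_align = lmem_64k ? SZ_2M : SZ_4K;
	/* @offset and @length: 64K for 64K local memory, 4K otherwise. */
	uint64_t range_align = lmem_64k ? SZ_64K : SZ_4K;

	if (!is_aligned(start, start_align))
		return -EINVAL;
	if (!is_aligned(offset, range_align) ||
	    !is_aligned(length, range_align))
		return -EINVAL;

	return 0;
}
```

With such a check, a UMD could reject a misaligned request before issuing
the ioctl; the kernel would still have to validate the arguments itself.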

>>+ * Also, on those platforms, it is not allowed to bind a device local-memory
>>+ * object and a system memory object in a single 2M section of VA range.
>
>Text should be clear whether "not allowed" means there will be an 
>error returned, or it will appear to work but bad things will happen.
>

Yes, an error is returned. Will fix.

>>+ */
>>+struct drm_i915_gem_vm_bind {
>>+	/** @vm_id: VM (address space) id to bind */
>>+	__u32 vm_id;
>>+
>>+	/** @handle: Object handle */
>>+	__u32 handle;
>>+
>>+	/** @start: Virtual Address start to bind */
>>+	__u64 start;
>>+
>>+	/** @offset: Offset in object to bind */
>>+	__u64 offset;
>>+
>>+	/** @length: Length of mapping to bind */
>>+	__u64 length;
>>+
>>+	/**
>>+	 * @flags: Supported flags are:
>>+	 *
>>+	 * I915_GEM_VM_BIND_FENCE_VALID:
>>+	 * @fence is valid, needs bind completion notification.
>>+	 *
>>+	 * I915_GEM_VM_BIND_READONLY:
>>+	 * Mapping is read-only.
>>+	 *
>>+	 * I915_GEM_VM_BIND_CAPTURE:
>>+	 * Capture this mapping in the dump upon GPU error.
>>+	 *
>>+	 * I915_GEM_VM_BIND_TLB_FLUSH:
>>+	 * Flush the TLB for the specified range after bind completion.
>>+	 */
>>+	__u64 flags;
>>+#define I915_GEM_VM_BIND_FENCE_VALID	(1 << 0)
>>+#define I915_GEM_VM_BIND_READONLY	(1 << 1)
>>+#define I915_GEM_VM_BIND_CAPTURE	(1 << 2)
>>+#define I915_GEM_VM_BIND_TLB_FLUSH	(1 << 3)
>
>What is the use case for allowing any random user to play with 
>(global) TLB flushing?
>

I heard it from Daniel on intel-gfx; apparently it is a Mesa requirement.

>>+
>>+	/** @fence: Timeline fence for bind completion signaling */
>>+	struct drm_i915_gem_vm_bind_fence fence;
>
>As agreed the other day - please document in the main kerneldoc 
>section that all (un)binds are executed asynchronously and out of 
>order.
>

I have added it in the latest revision of the .rst file.

>>+
>>+	/** @extensions: 0-terminated chain of extensions */
>>+	__u64 extensions;
>>+};
>>+
>>+/**
>>+ * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
>>+ *
>>+ * This structure is passed to VM_UNBIND ioctl and specifies the GPU virtual
>>+ * address (VA) range that should be unbound from the device page table of the
>>+ * specified address space (VM). The specified VA range must match one of the
>>+ * mappings created with the VM_BIND ioctl. TLB is flushed upon unbind
>>+ * completion. The unbind operation will force unbind the specified
>
>Do we want to provide TLB flushing guarantees here and why? (As 
>opposed to leaving them for implementation details.) If there is no 
>implied order in either binds/unbinds, or between the two intermixed, 
>then what is the point of guaranteeing a TLB flush on unbind 
>completion?
>

I think we ensure that the TLB is flushed before signaling the out fence
of the vm_unbind call; the user then ensures correctness by staging submissions
or vm_bind calls after the vm_unbind out fence has signaled.
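
To make the staging idea concrete, here is a minimal sketch of how a UMD
might track VA ranges that are still waiting on an unbind out fence. Every
name in it (struct pending_unbind, va_overlaps, va_range_reusable) is
hypothetical, and it assumes all unbinds signal points on one timeline
syncobj whose points signal in increasing order. It is not the Mesa
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace record of a VM_UNBIND issued with an out fence. */
struct pending_unbind {
	uint64_t start;		/* VA start passed to VM_UNBIND */
	uint64_t length;	/* length passed to VM_UNBIND */
	uint64_t fence_point;	/* timeline point the unbind will signal */
};

/* True if [a, a + a_len) and [b, b + b_len) intersect (no wrap-around). */
static bool va_overlaps(uint64_t a, uint64_t a_len, uint64_t b, uint64_t b_len)
{
	return a < b + b_len && b < a + a_len;
}

/*
 * True if [start, start + length) may be handed to a new VM_BIND: every
 * overlapping unbind must already have signaled, i.e. its point must be
 * <= signaled_point, the last point known to have signaled on the
 * (assumed single) unbind timeline.
 */
static bool va_range_reusable(const struct pending_unbind *pending,
			      size_t count, uint64_t start, uint64_t length,
			      uint64_t signaled_point)
{
	assert(count == 0 || pending != NULL);

	for (size_t i = 0; i < count; i++) {
		const struct pending_unbind *p = &pending[i];

		if (va_overlaps(start, length, p->start, p->length) &&
		    p->fence_point > signaled_point)
			return false;
	}
	return true;
}
```

A range that is not yet reusable would have to wait in userspace on the
corresponding point, which is the userspace wait discussed in this thread.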

>range from
>>+ * device page table without waiting for any GPU job to complete. It is the
>>+ * UMD's responsibility to ensure the mapping is no longer in use before
>>+ * calling VM_UNBIND.
>>+ *
>>+ * The @start and @length must specify a unique mapping bound with VM_BIND
>>+ * ioctl.
>>+ */
>>+struct drm_i915_gem_vm_unbind {
>>+	/** @vm_id: VM (address space) id to unbind */
>>+	__u32 vm_id;
>>+
>>+	/** @rsvd: Reserved, MBZ */
>>+	__u32 rsvd;
>>+
>>+	/** @start: Virtual Address start to unbind */
>>+	__u64 start;
>>+
>>+	/** @length: Length of mapping to unbind */
>>+	__u64 length;
>>+
>>+	/**
>>+	 * @flags: Supported flags are:
>>+	 *
>>+	 * I915_GEM_VM_UNBIND_FENCE_VALID:
>>+	 * @fence is valid, needs unbind completion notification.
>>+	 */
>>+	__u64 flags;
>>+#define I915_GEM_VM_UNBIND_FENCE_VALID	(1 << 0)
>>+
>>+	/** @fence: Timeline fence for unbind completion signaling */
>>+	struct drm_i915_gem_vm_bind_fence fence;
>
>I am not sure the simplified ioctl story is super coherent. If 
>everything is now fully async and out of order, but the input fence 
>has been dropped, then how is userspace supposed to handle the address 
>space? It will have to wait (in userspace) for unbinds to complete 
>before submitting subsequent binds which use the same VA range.
>

Yes, and Mesa apparently will have the support to handle it.

>Maybe that's passable, but then the fact execbuf3 has no input fence 
>suggests a userspace wait between it and binds. And I am pretty sure 
>historically those were always quite bad for performance.
>

execbuf3 has the input fence through timeline fence array support.

>Presumably userspace clients are happy with no input fences or it was 
>considered too costly to implement it?
>

Yes, apparently Mesa can work with no input fence. This helps us
focus on the rest of the VM_BIND feature delivery.

Niranjana

>Regards,
>
>Tvrtko
>
>>+
>>+	/** @extensions: 0-terminated chain of extensions */
>>+	__u64 extensions;
>>+};
>>+
>>+/**
>>+ * struct drm_i915_gem_execbuffer3 - Structure for DRM_I915_GEM_EXECBUFFER3
>>+ * ioctl.
>>+ *
>>+ * DRM_I915_GEM_EXECBUFFER3 ioctl only works in VM_BIND mode and VM_BIND mode
>>+ * only works with this ioctl for submission.
>>+ * See I915_VM_CREATE_FLAGS_USE_VM_BIND.
>>+ */
>>+struct drm_i915_gem_execbuffer3 {
>>+	/**
>>+	 * @ctx_id: Context id
>>+	 *
>>+	 * Only contexts with user engine map are allowed.
>>+	 */
>>+	__u32 ctx_id;
>>+
>>+	/**
>>+	 * @engine_idx: Engine index
>>+	 *
>>+	 * An index in the user engine map of the context specified by @ctx_id.
>>+	 */
>>+	__u32 engine_idx;
>>+
>>+	/** @rsvd1: Reserved, MBZ */
>>+	__u32 rsvd1;
>>+
>>+	/**
>>+	 * @batch_count: Number of batches in @batch_address array.
>>+	 *
>>+	 * 0 is invalid. For parallel submission, it should be equal to the
>>+	 * number of (parallel) engines involved in that submission.
>>+	 */
>>+	__u32 batch_count;
>>+
>>+	/**
>>+	 * @batch_address: Array of batch gpu virtual addresses.
>>+	 *
>>+	 * If @batch_count is 1, then it is the gpu virtual address of the
>>+	 * batch buffer. If @batch_count > 1, then it is a pointer to an array
>>+	 * of batch buffer gpu virtual addresses.
>>+	 */
>>+	__u64 batch_address;
>>+
>>+	/**
>>+	 * @flags: Supported flags are:
>>+	 *
>>+	 * I915_EXEC3_SECURE:
>>+	 * Request a privileged ("secure") batch buffer/s.
>>+	 * It is only available for DRM_ROOT_ONLY | DRM_MASTER processes.
>>+	 */
>>+	__u64 flags;
>>+#define I915_EXEC3_SECURE	(1<<0)
>>+
>>+	/** @rsvd2: Reserved, MBZ */
>>+	__u64 rsvd2;
>>+
>>+	/**
>>+	 * @extensions: Zero-terminated chain of extensions.
>>+	 *
>>+	 * DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES:
>>+	 * It has same format as DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES.
>>+	 * See struct drm_i915_gem_execbuffer_ext_timeline_fences.
>>+	 */
>>+	__u64 extensions;
>>+#define DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES	0
>>+};
>>+
>>+/**
>>+ * struct drm_i915_gem_create_ext_vm_private - Extension to make the object
>>+ * private to the specified VM.
>>+ *
>>+ * See struct drm_i915_gem_create_ext.
>>+ */
>>+struct drm_i915_gem_create_ext_vm_private {
>>+#define I915_GEM_CREATE_EXT_VM_PRIVATE		2
>>+	/** @base: Extension link. See struct i915_user_extension. */
>>+	struct i915_user_extension base;
>>+
>>+	/** @vm_id: Id of the VM to which the object is private */
>>+	__u32 vm_id;
>>+};


* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-22 15:12     ` Niranjana Vishwanathapura
@ 2022-06-22 15:57       ` Tvrtko Ursulin
  2022-06-22 16:44         ` Niranjana Vishwanathapura
  2022-06-23  9:28       ` Lionel Landwerlin
  1 sibling, 1 reply; 34+ messages in thread
From: Tvrtko Ursulin @ 2022-06-22 15:57 UTC (permalink / raw)
  To: Niranjana Vishwanathapura
  Cc: paulo.r.zanoni, intel-gfx, chris.p.wilson, thomas.hellstrom,
	dri-devel, daniel.vetter, christian.koenig, matthew.auld


On 22/06/2022 16:12, Niranjana Vishwanathapura wrote:
> On Wed, Jun 22, 2022 at 09:10:07AM +0100, Tvrtko Ursulin wrote:
>>
>> On 22/06/2022 04:56, Niranjana Vishwanathapura wrote:
>>> VM_BIND and related uapi definitions
>>>
>>> v2: Reduce the scope to simple Mesa use case.
>>> v3: Expand VM_UNBIND documentation and add
>>>     I915_GEM_VM_BIND/UNBIND_FENCE_VALID
>>>     and I915_GEM_VM_BIND_TLB_FLUSH flags.
>>>
>>> Signed-off-by: Niranjana Vishwanathapura 
>>> <niranjana.vishwanathapura@intel.com>
>>> ---
>>>  Documentation/gpu/rfc/i915_vm_bind.h | 243 +++++++++++++++++++++++++++
>>>  1 file changed, 243 insertions(+)
>>>  create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
>>>
>>> diff --git a/Documentation/gpu/rfc/i915_vm_bind.h 
>>> b/Documentation/gpu/rfc/i915_vm_bind.h
>>> new file mode 100644
>>> index 000000000000..fa23b2d7ec6f
>>> --- /dev/null
>>> +++ b/Documentation/gpu/rfc/i915_vm_bind.h
>>> @@ -0,0 +1,243 @@
>>> +/* SPDX-License-Identifier: MIT */
>>> +/*
>>> + * Copyright © 2022 Intel Corporation
>>> + */
>>> +
>>> +/**
>>> + * DOC: I915_PARAM_HAS_VM_BIND
>>> + *
>>> + * VM_BIND feature availability.
>>> + * See typedef drm_i915_getparam_t param.
>>> + */
>>> +#define I915_PARAM_HAS_VM_BIND        57
>>> +
>>> +/**
>>> + * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
>>> + *
>>> + * Flag to opt-in for VM_BIND mode of binding during VM creation.
>>> + * See struct drm_i915_gem_vm_control flags.
>>> + *
>>> + * The older execbuf2 ioctl will not support VM_BIND mode of operation.
>>> + * For VM_BIND mode, we have new execbuf3 ioctl which will not 
>>> accept any
>>> + * execlist (See struct drm_i915_gem_execbuffer3 for more details).
>>> + *
>>> + */
>>> +#define I915_VM_CREATE_FLAGS_USE_VM_BIND    (1 << 0)
>>> +
>>> +/* VM_BIND related ioctls */
>>> +#define DRM_I915_GEM_VM_BIND        0x3d
>>> +#define DRM_I915_GEM_VM_UNBIND        0x3e
>>> +#define DRM_I915_GEM_EXECBUFFER3    0x3f
>>> +
>>> +#define DRM_IOCTL_I915_GEM_VM_BIND        DRM_IOWR(DRM_COMMAND_BASE 
>>> + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
>>> +#define DRM_IOCTL_I915_GEM_VM_UNBIND        
>>> DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct 
>>> drm_i915_gem_vm_unbind)
>>> +#define DRM_IOCTL_I915_GEM_EXECBUFFER3        
>>> DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct 
>>> drm_i915_gem_execbuffer3)
>>> +
>>> +/**
>>> + * struct drm_i915_gem_vm_bind_fence - Bind/unbind completion 
>>> notification.
>>> + *
>>> + * A timeline out fence for vm_bind/unbind completion notification.
>>> + */
>>> +struct drm_i915_gem_vm_bind_fence {
>>> +    /** @handle: User's handle for a drm_syncobj to signal. */
>>> +    __u32 handle;
>>> +
>>> +    /** @rsvd: Reserved, MBZ */
>>> +    __u32 rsvd;
>>> +
>>> +    /**
>>> +     * @value: A point in the timeline.
>>> +     * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
>>> +     * timeline drm_syncobj is invalid as it turns a drm_syncobj into a
>>> +     * binary one.
>>> +     */
>>> +    __u64 value;
>>> +};
>>> +
>>> +/**
>>> + * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
>>> + *
>>> + * This structure is passed to VM_BIND ioctl and specifies the 
>>> mapping of GPU
>>> + * virtual address (VA) range to the section of an object that 
>>> should be bound
>>> + * in the device page table of the specified address space (VM).
>>> + * The VA range specified must be unique (ie., not currently bound) 
>>> and can
>>> + * be mapped to whole object or a section of the object (partial 
>>> binding).
>>> + * Multiple VA mappings can be created to the same section of the 
>>> object
>>> + * (aliasing).
>>> + *
>>> + * The @start, @offset and @length should be 4K page aligned. 
>>> However the DG2
>>> + * and XEHPSDV has 64K page size for device local-memory and has 
>>> compact page
>>> + * table. On those platforms, for binding device local-memory 
>>> objects, the
>>> + * @start should be 2M aligned, @offset and @length should be 64K 
>>> aligned.
>>
>> Should some error codes be documented and has the ability to 
>> programmatically probe the alignment restrictions been considered?
>>
> 
> Currently what we have internally is that -EINVAL is returned if the 
> start, offset
> and length are not aligned. If the specified mapping already exists, we 
> return
> -EEXIST. If there are conflicts in the VA range and VA range can't be 
> reserved,
> then -ENOSPC is returned. I can add this documentation here. But I am 
> worried
> that there will be more suggestions/feedback about error codes while 
> reviewing
> the code patch series, and we have to revisit it again.

I'd still suggest documenting those three. It makes sense to explain to 
userspace what behaviour they will see if they get it wrong.
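
To make the -EINVAL case concrete, here is a minimal sketch of the kind of 
pre-check userspace could do before calling VM_BIND, based only on the 
alignment rules quoted above. The helper names and the lmem_64k parameter 
are made up for illustration and are not part of the proposed uapi; the 
real driver checks may well differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K	(UINT64_C(1) << 12)
#define SZ_64K	(UINT64_C(1) << 16)
#define SZ_2M	(UINT64_C(1) << 21)

/* True if x is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x & (align - 1)) == 0;
}

/*
 * Hypothetical userspace pre-check, not the driver implementation.
 * lmem_64k is an assumed flag meaning "device local-memory object on a
 * platform with 64K pages and compact page tables (DG2, XEHPSDV)".
 * Returns 0 if the request looks aligned, -EINVAL otherwise.
 */
static int check_bind_alignment(uint64_t start, uint64_t offset,
				uint64_t length, bool lmem_64k)
{
	uint64_t start_align = lmem_64k ? SZ_2M : SZ_4K;
	uint64_t range_align = lmem_64k ? SZ_64K : SZ_4K;

	if (!is_aligned(start, start_align))
		return -EINVAL;
	if (!is_aligned(offset, range_align) ||
	    !is_aligned(length, range_align))
		return -EINVAL;
	return 0;
}
```

With something like this a UMD can catch a misaligned request itself 
instead of finding out from the ioctl return value.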

>>> + * Also, on those platforms, it is not allowed to bind a device 
>>> local-memory
>>> + * object and a system memory object in a single 2M section of VA 
>>> range.
>>
>> Text should be clear whether "not allowed" means there will be an 
>> error returned, or it will appear to work but bad things will happen.
>>
> 
> Yah, error returned, will fix.
> 
>>> + */
>>> +struct drm_i915_gem_vm_bind {
>>> +    /** @vm_id: VM (address space) id to bind */
>>> +    __u32 vm_id;
>>> +
>>> +    /** @handle: Object handle */
>>> +    __u32 handle;
>>> +
>>> +    /** @start: Virtual Address start to bind */
>>> +    __u64 start;
>>> +
>>> +    /** @offset: Offset in object to bind */
>>> +    __u64 offset;
>>> +
>>> +    /** @length: Length of mapping to bind */
>>> +    __u64 length;
>>> +
>>> +    /**
>>> +     * @flags: Supported flags are:
>>> +     *
>>> +     * I915_GEM_VM_BIND_FENCE_VALID:
>>> +     * @fence is valid, needs bind completion notification.
>>> +     *
>>> +     * I915_GEM_VM_BIND_READONLY:
>>> +     * Mapping is read-only.
>>> +     *
>>> +     * I915_GEM_VM_BIND_CAPTURE:
>>> +     * Capture this mapping in the dump upon GPU error.
>>> +     *
>>> +     * I915_GEM_VM_BIND_TLB_FLUSH:
>>> +     * Flush the TLB for the specified range after bind completion.
>>> +     */
>>> +    __u64 flags;
>>> +#define I915_GEM_VM_BIND_FENCE_VALID    (1 << 0)
>>> +#define I915_GEM_VM_BIND_READONLY    (1 << 1)
>>> +#define I915_GEM_VM_BIND_CAPTURE    (1 << 2)
>>> +#define I915_GEM_VM_BIND_TLB_FLUSH    (1 << 3)
>>
>> What is the use case for allowing any random user to play with 
>> (global) TLB flushing?
>>
> 
> I heard it from Daniel on intel-gfx, apparently it is a Mesa requirement.

Okay I think that one needs clarifying.

>>> +
>>> +    /** @fence: Timeline fence for bind completion signaling */
>>> +    struct drm_i915_gem_vm_bind_fence fence;
>>
>> As agreed the other day - please document in the main kerneldoc 
>> section that all (un)binds are executed asynchronously and out of order.
>>
> 
> I have added it in the latest revision of .rst file.

Right, but I'd say to mention it in the uapi docs.

>>> +
>>> +    /** @extensions: 0-terminated chain of extensions */
>>> +    __u64 extensions;
>>> +};
>>> +
>>> +/**
>>> + * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
>>> + *
>>> + * This structure is passed to VM_UNBIND ioctl and specifies the GPU 
>>> virtual
>>> + * address (VA) range that should be unbound from the device page 
>>> table of the
>>> + * specified address space (VM). The specified VA range must match 
>>> one of the
>>> + * mappings created with the VM_BIND ioctl. TLB is flushed upon unbind
>>> + * completion. The unbind operation will force unbind the specified
>>
>> Do we want to provide TLB flushing guarantees here and why? (As 
>> opposed to leaving them for implementation details.) If there is no 
>> implied order in either binds/unbinds, or between the two intermixed, 
>> then what is the point of guaranteeing a TLB flush on unbind completion?
>>
> 
> I think we ensure that tlb is flushed before signaling the out fence
> of vm_unbind call, then the user ensures correctness by staging submissions
> or vm_bind calls after vm_unbind out fence signaling.

I don't see why this is required. The driver does not need to flush 
immediately on unbind, neither for correctness/security nor for the uapi 
contract. If there is no subsequent usage/bind then the flush is 
pointless. And if the user re-binds to the same VA range, against an 
active VM, then perhaps the expectations need to be defined. Is this 
supported, or a user error, or what?

>> range from
>>> + * device page table without waiting for any GPU job to complete. It 
>>> is UMDs
>>> + * responsibility to ensure the mapping is no longer in use before 
>>> calling
>>> + * VM_UNBIND.
>>> + *
>>> + * The @start and @length must specify a unique mapping bound with 
>>> VM_BIND
>>> + * ioctl.
>>> + */
>>> +struct drm_i915_gem_vm_unbind {
>>> +    /** @vm_id: VM (address space) id to bind */
>>> +    __u32 vm_id;
>>> +
>>> +    /** @rsvd: Reserved, MBZ */
>>> +    __u32 rsvd;
>>> +
>>> +    /** @start: Virtual Address start to unbind */
>>> +    __u64 start;
>>> +
>>> +    /** @length: Length of mapping to unbind */
>>> +    __u64 length;
>>> +
>>> +    /**
>>> +     * @flags: Supported flags are:
>>> +     *
>>> +     * I915_GEM_VM_UNBIND_FENCE_VALID:
>>> +     * @fence is valid, needs unbind completion notification.
>>> +     */
>>> +    __u64 flags;
>>> +#define I915_GEM_VM_UNBIND_FENCE_VALID    (1 << 0)
>>> +
>>> +    /** @fence: Timeline fence for unbind completion signaling */
>>> +    struct drm_i915_gem_vm_bind_fence fence;
>>
>> I am not sure the simplified ioctl story is super coherent. If 
>> everything is now fully async and out of order, but the input fence 
>> has been dropped, then how is userspace supposed to handle the address 
>> space? It will have to wait (in userspace) for unbinds to complete 
>> before submitting subsequent binds which use the same VA range.
>>
> 
> Yah, and Mesa apparently will have the support to handle it.
> 
>> Maybe that's passable, but then the fact execbuf3 has no input fence 
>> suggests a userspace wait between it and binds. And I am pretty sure 
>> historically those were always quite bad for performance.
>>
> 
> execbuf3 has the input fence through timeline fence array support.

I think I confused the field in execbuf3 for the output fence. So 
that part is fine: async binds chained with an input fence to execbuf3. 
Fire and forget for userspace.
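
As a rough sketch of that fire-and-forget flow, assuming one timeline 
drm_syncobj per VM: each bind takes the next point on the timeline as its 
out fence, and the same point would then be handed to execbuf3 as an input 
fence. The struct below only mirrors the layout of the proposed 
struct drm_i915_gem_vm_bind_fence; struct bind_timeline and 
next_bind_fence() are hypothetical userspace bookkeeping, not uapi.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the layout of the proposed drm_i915_gem_vm_bind_fence. */
struct bind_fence {
	uint32_t handle;	/* drm_syncobj handle to signal */
	uint32_t rsvd;		/* reserved, must be zero */
	uint64_t value;		/* timeline point; 0 only for a binary syncobj */
};
static_assert(sizeof(struct bind_fence) == 16, "expected two u32 plus one u64");

/* Hypothetical userspace bookkeeping: one timeline syncobj per VM. */
struct bind_timeline {
	uint32_t syncobj_handle;
	uint64_t last_point;	/* last point handed out, 0 if none yet */
};

/* Hand out the next timeline point to use as a bind out fence. */
static struct bind_fence next_bind_fence(struct bind_timeline *tl)
{
	struct bind_fence fence = {
		.handle = tl->syncobj_handle,
		.rsvd = 0,
		.value = ++tl->last_point,
	};

	/* 0 is not a valid point on a timeline syncobj. */
	assert(fence.value != 0);
	return fence;
}
```

Userspace would remember fence.value per bind and pass the highest point 
it depends on as the wait point of the following submission.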

Although I then don't understand why execbuf3 wouldn't support an output 
fence? What mechanism is userspace supposed to use for that? Export a 
fence from batch buffer BO? That would be an extra ioctl so if we can 
avoid it why not?

>> Presumably userspace clients are happy with no input fences or it was 
>> considered too costly to implement it?
>>
> 
> Yah, apparently Mesa can work with no input fence. This helps us in
> focusing on rest of the VM_BIND feature delivery.

Okay.

Regards,

Tvrtko


* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-22 15:57       ` Tvrtko Ursulin
@ 2022-06-22 16:44         ` Niranjana Vishwanathapura
  2022-06-22 18:53           ` Niranjana Vishwanathapura
  2022-06-23  8:27           ` Tvrtko Ursulin
  0 siblings, 2 replies; 34+ messages in thread
From: Niranjana Vishwanathapura @ 2022-06-22 16:44 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: paulo.r.zanoni, intel-gfx, chris.p.wilson, thomas.hellstrom,
	dri-devel, daniel.vetter, christian.koenig, matthew.auld

On Wed, Jun 22, 2022 at 04:57:17PM +0100, Tvrtko Ursulin wrote:
>
>On 22/06/2022 16:12, Niranjana Vishwanathapura wrote:
>>On Wed, Jun 22, 2022 at 09:10:07AM +0100, Tvrtko Ursulin wrote:
>>>
>>>On 22/06/2022 04:56, Niranjana Vishwanathapura wrote:
>>>>VM_BIND and related uapi definitions
>>>>
>>>>v2: Reduce the scope to simple Mesa use case.
>>>>v3: Expand VM_UNBIND documentation and add
>>>>    I915_GEM_VM_BIND/UNBIND_FENCE_VALID
>>>>    and I915_GEM_VM_BIND_TLB_FLUSH flags.
>>>>
>>>>Signed-off-by: Niranjana Vishwanathapura 
>>>><niranjana.vishwanathapura@intel.com>
>>>>---
>>>> Documentation/gpu/rfc/i915_vm_bind.h | 243 +++++++++++++++++++++++++++
>>>> 1 file changed, 243 insertions(+)
>>>> create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
>>>>
>>>>diff --git a/Documentation/gpu/rfc/i915_vm_bind.h 
>>>>b/Documentation/gpu/rfc/i915_vm_bind.h
>>>>new file mode 100644
>>>>index 000000000000..fa23b2d7ec6f
>>>>--- /dev/null
>>>>+++ b/Documentation/gpu/rfc/i915_vm_bind.h
>>>>@@ -0,0 +1,243 @@
>>>>+/* SPDX-License-Identifier: MIT */
>>>>+/*
>>>>+ * Copyright © 2022 Intel Corporation
>>>>+ */
>>>>+
>>>>+/**
>>>>+ * DOC: I915_PARAM_HAS_VM_BIND
>>>>+ *
>>>>+ * VM_BIND feature availability.
>>>>+ * See typedef drm_i915_getparam_t param.
>>>>+ */
>>>>+#define I915_PARAM_HAS_VM_BIND        57
>>>>+
>>>>+/**
>>>>+ * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
>>>>+ *
>>>>+ * Flag to opt-in for VM_BIND mode of binding during VM creation.
>>>>+ * See struct drm_i915_gem_vm_control flags.
>>>>+ *
>>>>+ * The older execbuf2 ioctl will not support VM_BIND mode of operation.
>>>>+ * For VM_BIND mode, we have new execbuf3 ioctl which will not 
>>>>accept any
>>>>+ * execlist (See struct drm_i915_gem_execbuffer3 for more details).
>>>>+ *
>>>>+ */
>>>>+#define I915_VM_CREATE_FLAGS_USE_VM_BIND    (1 << 0)
>>>>+
>>>>+/* VM_BIND related ioctls */
>>>>+#define DRM_I915_GEM_VM_BIND        0x3d
>>>>+#define DRM_I915_GEM_VM_UNBIND        0x3e
>>>>+#define DRM_I915_GEM_EXECBUFFER3    0x3f
>>>>+
>>>>+#define DRM_IOCTL_I915_GEM_VM_BIND        
>>>>DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct 
>>>>drm_i915_gem_vm_bind)
>>>>+#define DRM_IOCTL_I915_GEM_VM_UNBIND        
>>>>DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct 
>>>>drm_i915_gem_vm_unbind)
>>>>+#define DRM_IOCTL_I915_GEM_EXECBUFFER3        
>>>>DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct 
>>>>drm_i915_gem_execbuffer3)
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_vm_bind_fence - Bind/unbind completion 
>>>>notification.
>>>>+ *
>>>>+ * A timeline out fence for vm_bind/unbind completion notification.
>>>>+ */
>>>>+struct drm_i915_gem_vm_bind_fence {
>>>>+    /** @handle: User's handle for a drm_syncobj to signal. */
>>>>+    __u32 handle;
>>>>+
>>>>+    /** @rsvd: Reserved, MBZ */
>>>>+    __u32 rsvd;
>>>>+
>>>>+    /**
>>>>+     * @value: A point in the timeline.
>>>>+     * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
>>>>+     * timeline drm_syncobj is invalid as it turns a drm_syncobj into a
>>>>+     * binary one.
>>>>+     */
>>>>+    __u64 value;
>>>>+};
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
>>>>+ *
>>>>+ * This structure is passed to VM_BIND ioctl and specifies the 
>>>>mapping of GPU
>>>>+ * virtual address (VA) range to the section of an object that 
>>>>should be bound
>>>>+ * in the device page table of the specified address space (VM).
>>>>+ * The VA range specified must be unique (ie., not currently 
>>>>bound) and can
>>>>+ * be mapped to whole object or a section of the object 
>>>>(partial binding).
>>>>+ * Multiple VA mappings can be created to the same section of 
>>>>the object
>>>>+ * (aliasing).
>>>>+ *
>>>>+ * The @start, @offset and @length should be 4K page aligned. 
>>>>However the DG2
>>>>+ * and XEHPSDV has 64K page size for device local-memory and 
>>>>has compact page
>>>>+ * table. On those platforms, for binding device local-memory 
>>>>objects, the
>>>>+ * @start should be 2M aligned, @offset and @length should be 
>>>>64K aligned.
>>>
>>>Should some error codes be documented and has the ability to 
>>>programmatically probe the alignment restrictions been considered?
>>>
>>
>>Currently what we have internally is that -EINVAL is returned if the 
>>start, offset
>>and length are not aligned. If the specified mapping already exists, 
>>we return
>>-EEXIST. If there are conflicts in the VA range and VA range can't 
>>be reserved,
>>then -ENOSPC is returned. I can add this documentation here. But I 
>>am worried
>>that there will be more suggestions/feedback about error codes while 
>>reviewing
>>the code patch series, and we have to revisit it again.
>
>I'd still suggest documenting those three. It makes sense to explain 
>to userspace what behaviour they will see if they get it wrong.
>

Ok.

>>>>+ * Also, on those platforms, it is not allowed to bind a 
>>>>device local-memory
>>>>+ * object and a system memory object in a single 2M section of 
>>>>VA range.
>>>
>>>Text should be clear whether "not allowed" means there will be an 
>>>error returned, or it will appear to work but bad things will 
>>>happen.
>>>
>>
>>Yah, error returned, will fix.
>>
>>>>+ */
>>>>+struct drm_i915_gem_vm_bind {
>>>>+    /** @vm_id: VM (address space) id to bind */
>>>>+    __u32 vm_id;
>>>>+
>>>>+    /** @handle: Object handle */
>>>>+    __u32 handle;
>>>>+
>>>>+    /** @start: Virtual Address start to bind */
>>>>+    __u64 start;
>>>>+
>>>>+    /** @offset: Offset in object to bind */
>>>>+    __u64 offset;
>>>>+
>>>>+    /** @length: Length of mapping to bind */
>>>>+    __u64 length;
>>>>+
>>>>+    /**
>>>>+     * @flags: Supported flags are:
>>>>+     *
>>>>+     * I915_GEM_VM_BIND_FENCE_VALID:
>>>>+     * @fence is valid, needs bind completion notification.
>>>>+     *
>>>>+     * I915_GEM_VM_BIND_READONLY:
>>>>+     * Mapping is read-only.
>>>>+     *
>>>>+     * I915_GEM_VM_BIND_CAPTURE:
>>>>+     * Capture this mapping in the dump upon GPU error.
>>>>+     *
>>>>+     * I915_GEM_VM_BIND_TLB_FLUSH:
>>>>+     * Flush the TLB for the specified range after bind completion.
>>>>+     */
>>>>+    __u64 flags;
>>>>+#define I915_GEM_VM_BIND_FENCE_VALID    (1 << 0)
>>>>+#define I915_GEM_VM_BIND_READONLY    (1 << 1)
>>>>+#define I915_GEM_VM_BIND_CAPTURE    (1 << 2)
>>>>+#define I915_GEM_VM_BIND_TLB_FLUSH    (1 << 3)
>>>
>>>What is the use case for allowing any random user to play with 
>>>(global) TLB flushing?
>>>
>>
>>I heard it from Daniel on intel-gfx, apparently it is a Mesa requirement.
>
>Okay I think that one needs clarifying.
>

After chatting with Jason, I think we can remove it for now and
we can revisit it later if Mesa thinks it is required.

>>>>+
>>>>+    /** @fence: Timeline fence for bind completion signaling */
>>>>+    struct drm_i915_gem_vm_bind_fence fence;
>>>
>>>As agreed the other day - please document in the main kerneldoc 
>>>section that all (un)binds are executed asynchronously and out of 
>>>order.
>>>
>>
>>I have added it in the latest revision of .rst file.
>
>Right, but I'd say to mention it in the uapi docs.
>

Ok

>>>>+
>>>>+    /** @extensions: 0-terminated chain of extensions */
>>>>+    __u64 extensions;
>>>>+};
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
>>>>+ *
>>>>+ * This structure is passed to VM_UNBIND ioctl and specifies 
>>>>the GPU virtual
>>>>+ * address (VA) range that should be unbound from the device 
>>>>page table of the
>>>>+ * specified address space (VM). The specified VA range must 
>>>>match one of the
>>>>+ * mappings created with the VM_BIND ioctl. TLB is flushed upon unbind
>>>>+ * completion. The unbind operation will force unbind the specified
>>>
>>>Do we want to provide TLB flushing guarantees here and why? (As 
>>>opposed to leaving them for implementation details.) If there is 
>>>no implied order in either binds/unbinds, or between the two 
>>>intermixed, then what is the point of guaranteeing a TLB flush on 
>>>unbind completion?
>>>
>>
>>I think we ensure that tlb is flushed before signaling the out fence
>>of vm_unbind call, then the user ensures correctness by staging submissions
>>or vm_bind calls after vm_unbind out fence signaling.
>
>I don't see why this is required. The driver does not need to flush 
>immediately on unbind, neither for correctness/security nor for the 
>uapi contract. If there is no subsequent usage/bind then the flush is 
>pointless. And if the user re-binds to the same VA range, against an 
>active VM, then perhaps the expectations need to be defined. Is this 
>supported, or a user error, or what?
>

After a vm_unbind, the UMD can re-bind to the same VA range against an
active VM. Though I am not sure, for the Mesa use case, whether that new
mapping is required for the running GPU job or only for the next
submission. But by ensuring the TLB flush upon unbind, the KMD can ensure
correctness.

Note that on platforms with selective TLB invalidation, it is not
as expensive as flushing the whole TLB. On platforms without selective
TLB invalidation, we can add some optimization later as mentioned
in the .rst file.

Also note that UMDs can vm_unbind a mapping while the VM is active.
By flushing the TLB, we ensure there is no inadvertent access to a
mapping that no longer exists. I can add this to the documentation.

>>>range from
>>>>+ * device page table without waiting for any GPU job to 
>>>>complete. It is UMDs
>>>>+ * responsibility to ensure the mapping is no longer in use 
>>>>before calling
>>>>+ * VM_UNBIND.
>>>>+ *
>>>>+ * The @start and @length must specify a unique mapping bound 
>>>>with VM_BIND
>>>>+ * ioctl.
>>>>+ */
>>>>+struct drm_i915_gem_vm_unbind {
>>>>+    /** @vm_id: VM (address space) id to bind */
>>>>+    __u32 vm_id;
>>>>+
>>>>+    /** @rsvd: Reserved, MBZ */
>>>>+    __u32 rsvd;
>>>>+
>>>>+    /** @start: Virtual Address start to unbind */
>>>>+    __u64 start;
>>>>+
>>>>+    /** @length: Length of mapping to unbind */
>>>>+    __u64 length;
>>>>+
>>>>+    /**
>>>>+     * @flags: Supported flags are:
>>>>+     *
>>>>+     * I915_GEM_VM_UNBIND_FENCE_VALID:
>>>>+     * @fence is valid, needs unbind completion notification.
>>>>+     */
>>>>+    __u64 flags;
>>>>+#define I915_GEM_VM_UNBIND_FENCE_VALID    (1 << 0)
>>>>+
>>>>+    /** @fence: Timeline fence for unbind completion signaling */
>>>>+    struct drm_i915_gem_vm_bind_fence fence;
>>>
>>>I am not sure the simplified ioctl story is super coherent. If 
>>>everything is now fully async and out of order, but the input 
>>>fence has been dropped, then how is userspace supposed to handle 
>>>the address space? It will have to wait (in userspace) for unbinds 
>>>to complete before submitting subsequent binds which use the same 
>>>VA range.
>>>
>>
>>Yah, and Mesa apparently will have the support to handle it.
>>
>>>Maybe that's passable, but then the fact execbuf3 has no input 
>>>fence suggests a userspace wait between it and binds. And I am 
>>>pretty sure historically those were always quite bad for 
>>>performance.
>>>
>>
>>execbuf3 has the input fence through timeline fence array support.
>
>I think I confused the field in execbuf3 for the output fence. So 
>that part is fine: async binds chained with an input fence to execbuf3. 
>Fire and forget for userspace.
>
>Although I then don't understand why execbuf3 wouldn't support an 
>output fence? What mechanism is userspace supposed to use for that? 
>Export a fence from batch buffer BO? That would be an extra ioctl so 
>if we can avoid it why not?
>

execbuf3 supports out fence as well through timeline fence array.

Niranjana

>>>Presumably userspace clients are happy with no input fences or it 
>>>was considered too costly to implement it?
>>>
>>
>>Yah, apparently Mesa can work with no input fence. This helps us in
>>focusing on rest of the VM_BIND feature delivery.
>
>Okay.
>
>Regards,
>
>Tvrtko


* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-22 16:44         ` Niranjana Vishwanathapura
@ 2022-06-22 18:53           ` Niranjana Vishwanathapura
  2022-06-23  8:27           ` Tvrtko Ursulin
  1 sibling, 0 replies; 34+ messages in thread
From: Niranjana Vishwanathapura @ 2022-06-22 18:53 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: paulo.r.zanoni, intel-gfx, dri-devel, thomas.hellstrom,
	chris.p.wilson, daniel.vetter, christian.koenig, matthew.auld

On Wed, Jun 22, 2022 at 09:44:47AM -0700, Niranjana Vishwanathapura wrote:
>On Wed, Jun 22, 2022 at 04:57:17PM +0100, Tvrtko Ursulin wrote:
>>
>>On 22/06/2022 16:12, Niranjana Vishwanathapura wrote:
>>>On Wed, Jun 22, 2022 at 09:10:07AM +0100, Tvrtko Ursulin wrote:
>>>>
>>>>On 22/06/2022 04:56, Niranjana Vishwanathapura wrote:
>>>>>VM_BIND and related uapi definitions
>>>>>
>>>>>v2: Reduce the scope to simple Mesa use case.
>>>>>v3: Expand VM_UNBIND documentation and add
>>>>>    I915_GEM_VM_BIND/UNBIND_FENCE_VALID
>>>>>    and I915_GEM_VM_BIND_TLB_FLUSH flags.
>>>>>
>>>>>Signed-off-by: Niranjana Vishwanathapura 
>>>>><niranjana.vishwanathapura@intel.com>
>>>>>---
>>>>> Documentation/gpu/rfc/i915_vm_bind.h | 243 +++++++++++++++++++++++++++
>>>>> 1 file changed, 243 insertions(+)
>>>>> create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
>>>>>
>>>>>diff --git a/Documentation/gpu/rfc/i915_vm_bind.h 
>>>>>b/Documentation/gpu/rfc/i915_vm_bind.h
>>>>>new file mode 100644
>>>>>index 000000000000..fa23b2d7ec6f
>>>>>--- /dev/null
>>>>>+++ b/Documentation/gpu/rfc/i915_vm_bind.h
>>>>>@@ -0,0 +1,243 @@
>>>>>+/* SPDX-License-Identifier: MIT */
>>>>>+/*
>>>>>+ * Copyright © 2022 Intel Corporation
>>>>>+ */
>>>>>+
>>>>>+/**
>>>>>+ * DOC: I915_PARAM_HAS_VM_BIND
>>>>>+ *
>>>>>+ * VM_BIND feature availability.
>>>>>+ * See typedef drm_i915_getparam_t param.
>>>>>+ */
>>>>>+#define I915_PARAM_HAS_VM_BIND        57
>>>>>+
>>>>>+/**
>>>>>+ * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
>>>>>+ *
>>>>>+ * Flag to opt-in for VM_BIND mode of binding during VM creation.
>>>>>+ * See struct drm_i915_gem_vm_control flags.
>>>>>+ *
>>>>>+ * The older execbuf2 ioctl will not support VM_BIND mode of operation.
>>>>>+ * For VM_BIND mode, we have new execbuf3 ioctl which will 
>>>>>not accept any
>>>>>+ * execlist (See struct drm_i915_gem_execbuffer3 for more details).
>>>>>+ *
>>>>>+ */
>>>>>+#define I915_VM_CREATE_FLAGS_USE_VM_BIND    (1 << 0)
>>>>>+
>>>>>+/* VM_BIND related ioctls */
>>>>>+#define DRM_I915_GEM_VM_BIND        0x3d
>>>>>+#define DRM_I915_GEM_VM_UNBIND        0x3e
>>>>>+#define DRM_I915_GEM_EXECBUFFER3    0x3f
>>>>>+
>>>>>+#define DRM_IOCTL_I915_GEM_VM_BIND        
>>>>>DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct 
>>>>>drm_i915_gem_vm_bind)
>>>>>+#define DRM_IOCTL_I915_GEM_VM_UNBIND        
>>>>>DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct 
>>>>>drm_i915_gem_vm_unbind)
>>>>>+#define DRM_IOCTL_I915_GEM_EXECBUFFER3        
>>>>>DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct 
>>>>>drm_i915_gem_execbuffer3)
>>>>>+
>>>>>+/**
>>>>>+ * struct drm_i915_gem_vm_bind_fence - Bind/unbind completion 
>>>>>notification.
>>>>>+ *
>>>>>+ * A timeline out fence for vm_bind/unbind completion notification.
>>>>>+ */
>>>>>+struct drm_i915_gem_vm_bind_fence {
>>>>>+    /** @handle: User's handle for a drm_syncobj to signal. */
>>>>>+    __u32 handle;
>>>>>+
>>>>>+    /** @rsvd: Reserved, MBZ */
>>>>>+    __u32 rsvd;
>>>>>+
>>>>>+    /**
>>>>>+     * @value: A point in the timeline.
>>>>>+     * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
>>>>>+     * timeline drm_syncobj is invalid as it turns a drm_syncobj into a
>>>>>+     * binary one.
>>>>>+     */
>>>>>+    __u64 value;
>>>>>+};
>>>>>+
>>>>>+/**
>>>>>+ * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
>>>>>+ *
>>>>>+ * This structure is passed to VM_BIND ioctl and specifies 
>>>>>the mapping of GPU
>>>>>+ * virtual address (VA) range to the section of an object 
>>>>>that should be bound
>>>>>+ * in the device page table of the specified address space (VM).
>>>>>+ * The VA range specified must be unique (ie., not currently 
>>>>>bound) and can
>>>>>+ * be mapped to whole object or a section of the object 
>>>>>(partial binding).
>>>>>+ * Multiple VA mappings can be created to the same section of 
>>>>>the object
>>>>>+ * (aliasing).
>>>>>+ *
>>>>>+ * The @start, @offset and @length should be 4K page aligned. 
>>>>>However the DG2
>>>>>+ * and XEHPSDV has 64K page size for device local-memory and 
>>>>>has compact page
>>>>>+ * table. On those platforms, for binding device local-memory 
>>>>>objects, the
>>>>>+ * @start should be 2M aligned, @offset and @length should be 
>>>>>64K aligned.
>>>>
>>>>Should some error codes be documented and has the ability to 
>>>>programmatically probe the alignment restrictions been 
>>>>considered?
>>>>
>>>
>>>Currently what we have internally is that -EINVAL is returned if 
>>>the start, offset
>>>and length are not aligned. If the specified mapping already 
>>>exists, we return
>>>-EEXIST. If there are conflicts in the VA range and VA range can't 
>>>be reserved,
>>>then -ENOSPC is returned. I can add this documentation here. But I 
>>>am worried
>>>that there will be more suggestions/feedback about error codes 
>>>while reviewing
>>>the code patch series, and we have to revisit it again.
>>
>>I'd still suggest documenting those three. It makes sense to explain 
>>to userspace what behaviour they will see if they get it wrong.
>>
>
>Ok.

I have posted v4 with the fixes. I have simplified the error codes a
bit by removing EEXIST, which is just a special case of ENOSPC.
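
To illustrate why EEXIST can be folded into ENOSPC, here is a minimal
sketch that treats an exact duplicate as just another overlap with an
already bound range. struct va_range and check_va_free() are hypothetical
names for illustration, not the driver implementation, and the sketch
assumes start + length does not overflow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one bound VA range: [start, start + length). */
struct va_range {
	uint64_t start;
	uint64_t length;
};

/* Half-open interval overlap test; assumes start + length does not wrap. */
static int ranges_overlap(const struct va_range *a, const struct va_range *b)
{
	return a->start < b->start + b->length &&
	       b->start < a->start + a->length;
}

/*
 * Returns 0 if req does not touch any already bound range, -ENOSPC
 * otherwise. An identical mapping is simply a full overlap, so it needs
 * no separate error code.
 */
static int check_va_free(const struct va_range *bound, size_t count,
			 const struct va_range *req)
{
	size_t i;

	assert(bound != NULL || count == 0);
	for (i = 0; i < count; i++) {
		if (ranges_overlap(&bound[i], req))
			return -ENOSPC;
	}
	return 0;
}
```

A linear scan is only for illustration; a real implementation would
presumably use an interval tree or similar.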

Niranjana

>
>>>>>+ * Also, on those platforms, it is not allowed to bind a 
>>>>>device local-memory
>>>>>+ * object and a system memory object in a single 2M section 
>>>>>of VA range.
>>>>
>>>>Text should be clear whether "not allowed" means there will be 
>>>>an error returned, or it will appear to work but bad things will 
>>>>happen.
>>>>
>>>
>>>Yah, error returned, will fix.
>>>
>>>>>+ */
>>>>>+struct drm_i915_gem_vm_bind {
>>>>>+    /** @vm_id: VM (address space) id to bind */
>>>>>+    __u32 vm_id;
>>>>>+
>>>>>+    /** @handle: Object handle */
>>>>>+    __u32 handle;
>>>>>+
>>>>>+    /** @start: Virtual Address start to bind */
>>>>>+    __u64 start;
>>>>>+
>>>>>+    /** @offset: Offset in object to bind */
>>>>>+    __u64 offset;
>>>>>+
>>>>>+    /** @length: Length of mapping to bind */
>>>>>+    __u64 length;
>>>>>+
>>>>>+    /**
>>>>>+     * @flags: Supported flags are:
>>>>>+     *
>>>>>+     * I915_GEM_VM_BIND_FENCE_VALID:
>>>>>+     * @fence is valid, needs bind completion notification.
>>>>>+     *
>>>>>+     * I915_GEM_VM_BIND_READONLY:
>>>>>+     * Mapping is read-only.
>>>>>+     *
>>>>>+     * I915_GEM_VM_BIND_CAPTURE:
>>>>>+     * Capture this mapping in the dump upon GPU error.
>>>>>+     *
>>>>>+     * I915_GEM_VM_BIND_TLB_FLUSH:
>>>>>+     * Flush the TLB for the specified range after bind completion.
>>>>>+     */
>>>>>+    __u64 flags;
>>>>>+#define I915_GEM_VM_BIND_FENCE_VALID    (1 << 0)
>>>>>+#define I915_GEM_VM_BIND_READONLY    (1 << 1)
>>>>>+#define I915_GEM_VM_BIND_CAPTURE    (1 << 2)
>>>>>+#define I915_GEM_VM_BIND_TLB_FLUSH    (1 << 3)
>>>>
>>>>What is the use case for allowing any random user to play with 
>>>>(global) TLB flushing?
>>>>
>>>
>>>I heard it from Daniel on intel-gfx, apparently it is a Mesa requirement.
>>
>>Okay I think that one needs clarifying.
>>
>
>After chatting with Jason, I think we can remove it for now and
>we can revisit it later if Mesa thinks it is required.
>
>>>>>+
>>>>>+    /** @fence: Timeline fence for bind completion signaling */
>>>>>+    struct drm_i915_gem_vm_bind_fence fence;
>>>>
>>>>As agreed the other day - please document in the main kerneldoc 
>>>>section that all (un)binds are executed asynchronously and out 
>>>>of order.
>>>>
>>>
>>>I have added it in the latest revision of .rst file.
>>
>>Right, but I'd say to mention it in the uapi docs.
>>
>
>Ok
>
>>>>>+
>>>>>+    /** @extensions: 0-terminated chain of extensions */
>>>>>+    __u64 extensions;
>>>>>+};
>>>>>+
>>>>>+/**
>>>>>+ * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
>>>>>+ *
>>>>>+ * This structure is passed to VM_UNBIND ioctl and specifies 
>>>>>the GPU virtual
>>>>>+ * address (VA) range that should be unbound from the device 
>>>>>page table of the
>>>>>+ * specified address space (VM). The specified VA range must 
>>>>>match one of the
>>>>>+ * mappings created with the VM_BIND ioctl. TLB is flushed upon unbind
>>>>>+ * completion. The unbind operation will force unbind the specified
>>>>
>>>>Do we want to provide TLB flushing guarantees here and why? (As 
>>>>opposed to leaving them for implementation details.) If there is 
>>>>no implied order in either binds/unbinds, or between the two 
>>>>intermixed, then what is the point of guaranteeing a TLB flush 
>>>>on unbind completion?
>>>>
>>>
>>>I think we ensure that tlb is flushed before signaling the out fence
>>>of vm_unbind call, then the user ensures correctness by staging submissions
>>>or vm_bind calls after vm_unbind out fence signaling.
>>
>>I don't see why this is required. The driver does not need to flush 
>>immediately on unbind, neither for correctness/security nor for the 
>>uapi contract. If there is no subsequent usage/bind then the flush 
>>is pointless. And if the user re-binds to the same VA range, against 
>>an active VM, then perhaps the expectations need to be defined. Is 
>>this supported, or a user error, or what?
>>
>
>After a vm_unbind, the UMD can re-bind to the same VA range against an
>active VM. Though I am not sure, for the Mesa use case, whether that new
>mapping is required for the running GPU job or only for the next
>submission. But by ensuring the TLB flush upon unbind, the KMD can ensure
>correctness.
>
>Note that on platforms with selective TLB invalidation, it is not
>as expensive as flushing the whole TLB. On platforms without selective
>TLB invalidation, we can add some optimization later as mentioned
>in the .rst file.
>
>Also note that UMDs can vm_unbind a mapping while the VM is active.
>By flushing the TLB, we ensure there is no inadvertent access to a
>mapping that no longer exists. I can add this to the documentation.
>
>>>>range from
>>>>>+ * device page table without waiting for any GPU job to 
>>>>>complete. It is UMDs
>>>>>+ * responsibility to ensure the mapping is no longer in use 
>>>>>before calling
>>>>>+ * VM_UNBIND.
>>>>>+ *
>>>>>+ * The @start and @length must specify a unique mapping bound 
>>>>>with VM_BIND
>>>>>+ * ioctl.
>>>>>+ */
>>>>>+struct drm_i915_gem_vm_unbind {
>>>>>+    /** @vm_id: VM (address space) id to bind */
>>>>>+    __u32 vm_id;
>>>>>+
>>>>>+    /** @rsvd: Reserved, MBZ */
>>>>>+    __u32 rsvd;
>>>>>+
>>>>>+    /** @start: Virtual Address start to unbind */
>>>>>+    __u64 start;
>>>>>+
>>>>>+    /** @length: Length of mapping to unbind */
>>>>>+    __u64 length;
>>>>>+
>>>>>+    /**
>>>>>+     * @flags: Supported flags are:
>>>>>+     *
>>>>>+     * I915_GEM_VM_UNBIND_FENCE_VALID:
>>>>>+     * @fence is valid, needs unbind completion notification.
>>>>>+     */
>>>>>+    __u64 flags;
>>>>>+#define I915_GEM_VM_UNBIND_FENCE_VALID    (1 << 0)
>>>>>+
>>>>>+    /** @fence: Timeline fence for unbind completion signaling */
>>>>>+    struct drm_i915_gem_vm_bind_fence fence;
>>>>
>>>>I am not sure the simplified ioctl story is super coherent. If 
>>>>everything is now fully async and out of order, but the input 
>>>>fence has been dropped, then how is userspace supposed to handle 
>>>>the address space? It will have to wait (in userspace) for 
>>>>unbinds to complete before submitting subsequent binds which use 
>>>>the same VA range.
>>>>
>>>
>>>Yah and Mesa apparently will be having the support to handle it.
>>>
>>>>Maybe that's passable, but then the fact execbuf3 has no input 
>>>>fence suggests a userspace wait between it and binds. And I am 
>>>>pretty sure historically those were always quite bad for 
>>>>performance.
>>>>
>>>
>>>execbuf3 has the input fence through timeline fence array support.
>>
>>I think I confused the field in execbuf3 for the output fence.. 
>>So that part is fine, async binds chained with input fence to 
>>execbuf3. Fire and forget for userspace.
>>
>>Although I then don't understand why execbuf3 wouldn't support an 
>>output fence? What mechanism is userspace supposed to use for that? 
>>Export a fence from batch buffer BO? That would be an extra ioctl so 
>>if we can avoid it why not?
>>
>
>execbuf3 supports out fence as well through timeline fence array.
>
>Niranjana
>
>>>>Presumably userspace clients are happy with no input fences or 
>>>>it was considered too costly to implement it?
>>>>
>>>
>>>Yah, apparently Mesa can work with no input fence. This helps us in
>>>focusing on rest of the VM_BIND feature delivery.
>>
>>Okay.
>>
>>Regards,
>>
>>Tvrtko

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/doc/rfc: i915 VM_BIND feature design + uapi
  2022-06-22  3:56 ` [Intel-gfx] " Niranjana Vishwanathapura
                   ` (3 preceding siblings ...)
  (?)
@ 2022-06-22 19:49 ` Patchwork
  -1 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2022-06-22 19:49 UTC (permalink / raw)
  To: Niranjana Vishwanathapura; +Cc: intel-gfx

== Series Details ==

Series: drm/doc/rfc: i915 VM_BIND feature design + uapi
URL   : https://patchwork.freedesktop.org/series/105452/
State : failure

== Summary ==

Error: make failed
  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  DESCEND objtool
  CHK     include/generated/compile.h
  CC [M]  drivers/gpu/drm/i915/i915_driver.o
In file included from ./drivers/gpu/drm/i915/i915_pmu.h:13,
                 from ./drivers/gpu/drm/i915/gt/intel_engine_types.h:21,
                 from ./drivers/gpu/drm/i915/gt/intel_context_types.h:18,
                 from ./drivers/gpu/drm/i915/gem/i915_gem_context_types.h:20,
                 from ./drivers/gpu/drm/i915/i915_request.h:34,
                 from ./drivers/gpu/drm/i915/i915_active.h:13,
                 from ./drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h:12,
                 from ./drivers/gpu/drm/i915/i915_vma.h:33,
                 from drivers/gpu/drm/i915/display/intel_display_types.h:49,
                 from drivers/gpu/drm/i915/i915_driver.c:52:
./include/uapi/drm/i915_drm.h:1934:2: error: "/*" within comment [-Werror=comment]
  /** @param: Parameter to set or query */
   
cc1: all warnings being treated as errors
scripts/Makefile.build:249: recipe for target 'drivers/gpu/drm/i915/i915_driver.o' failed
make[4]: *** [drivers/gpu/drm/i915/i915_driver.o] Error 1
scripts/Makefile.build:466: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:466: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:466: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1843: recipe for target 'drivers' failed
make: *** [drivers] Error 2



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-22 16:44         ` Niranjana Vishwanathapura
  2022-06-22 18:53           ` Niranjana Vishwanathapura
@ 2022-06-23  8:27           ` Tvrtko Ursulin
  2022-06-23  8:57             ` Lionel Landwerlin
  2022-06-23 14:47             ` Niranjana Vishwanathapura
  1 sibling, 2 replies; 34+ messages in thread
From: Tvrtko Ursulin @ 2022-06-23  8:27 UTC (permalink / raw)
  To: Niranjana Vishwanathapura
  Cc: paulo.r.zanoni, intel-gfx, chris.p.wilson, thomas.hellstrom,
	dri-devel, daniel.vetter, christian.koenig, matthew.auld


On 22/06/2022 17:44, Niranjana Vishwanathapura wrote:
> On Wed, Jun 22, 2022 at 04:57:17PM +0100, Tvrtko Ursulin wrote:
>>
>> On 22/06/2022 16:12, Niranjana Vishwanathapura wrote:
>>> On Wed, Jun 22, 2022 at 09:10:07AM +0100, Tvrtko Ursulin wrote:
>>>>
>>>> On 22/06/2022 04:56, Niranjana Vishwanathapura wrote:
>>>>> VM_BIND and related uapi definitions
>>>>>
>>>>> v2: Reduce the scope to simple Mesa use case.
>>>>> v3: Expand VM_UNBIND documentation and add
>>>>>     I915_GEM_VM_BIND/UNBIND_FENCE_VALID
>>>>>     and I915_GEM_VM_BIND_TLB_FLUSH flags.
>>>>>
>>>>> Signed-off-by: Niranjana Vishwanathapura 
>>>>> <niranjana.vishwanathapura@intel.com>
>>>>> ---
>>>>>  Documentation/gpu/rfc/i915_vm_bind.h | 243 
>>>>> +++++++++++++++++++++++++++
>>>>>  1 file changed, 243 insertions(+)
>>>>>  create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
>>>>>
>>>>> diff --git a/Documentation/gpu/rfc/i915_vm_bind.h 
>>>>> b/Documentation/gpu/rfc/i915_vm_bind.h
>>>>> new file mode 100644
>>>>> index 000000000000..fa23b2d7ec6f
>>>>> --- /dev/null
>>>>> +++ b/Documentation/gpu/rfc/i915_vm_bind.h
>>>>> @@ -0,0 +1,243 @@
>>>>> +/* SPDX-License-Identifier: MIT */
>>>>> +/*
>>>>> + * Copyright © 2022 Intel Corporation
>>>>> + */
>>>>> +
>>>>> +/**
>>>>> + * DOC: I915_PARAM_HAS_VM_BIND
>>>>> + *
>>>>> + * VM_BIND feature availability.
>>>>> + * See typedef drm_i915_getparam_t param.
>>>>> + */
>>>>> +#define I915_PARAM_HAS_VM_BIND        57
>>>>> +
>>>>> +/**
>>>>> + * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
>>>>> + *
>>>>> + * Flag to opt-in for VM_BIND mode of binding during VM creation.
>>>>> + * See struct drm_i915_gem_vm_control flags.
>>>>> + *
>>>>> + * The older execbuf2 ioctl will not support VM_BIND mode of 
>>>>> operation.
>>>>> + * For VM_BIND mode, we have new execbuf3 ioctl which will not 
>>>>> accept any
>>>>> + * execlist (See struct drm_i915_gem_execbuffer3 for more details).
>>>>> + *
>>>>> + */
>>>>> +#define I915_VM_CREATE_FLAGS_USE_VM_BIND    (1 << 0)
>>>>> +
>>>>> +/* VM_BIND related ioctls */
>>>>> +#define DRM_I915_GEM_VM_BIND        0x3d
>>>>> +#define DRM_I915_GEM_VM_UNBIND        0x3e
>>>>> +#define DRM_I915_GEM_EXECBUFFER3    0x3f
>>>>> +
>>>>> +#define DRM_IOCTL_I915_GEM_VM_BIND DRM_IOWR(DRM_COMMAND_BASE + 
>>>>> DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
>>>>> +#define DRM_IOCTL_I915_GEM_VM_UNBIND DRM_IOWR(DRM_COMMAND_BASE + 
>>>>> DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
>>>>> +#define DRM_IOCTL_I915_GEM_EXECBUFFER3 DRM_IOWR(DRM_COMMAND_BASE + 
>>>>> DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)
>>>>> +
>>>>> +/**
>>>>> + * struct drm_i915_gem_vm_bind_fence - Bind/unbind completion 
>>>>> notification.
>>>>> + *
>>>>> + * A timeline out fence for vm_bind/unbind completion notification.
>>>>> + */
>>>>> +struct drm_i915_gem_vm_bind_fence {
>>>>> +    /** @handle: User's handle for a drm_syncobj to signal. */
>>>>> +    __u32 handle;
>>>>> +
>>>>> +    /** @rsvd: Reserved, MBZ */
>>>>> +    __u32 rsvd;
>>>>> +
>>>>> +    /**
>>>>> +     * @value: A point in the timeline.
>>>>> +     * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
>>>>> +     * timeline drm_syncobj is invalid as it turns a drm_syncobj 
>>>>> into a
>>>>> +     * binary one.
>>>>> +     */
>>>>> +    __u64 value;
>>>>> +};
>>>>> +
>>>>> +/**
>>>>> + * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
>>>>> + *
>>>>> + * This structure is passed to VM_BIND ioctl and specifies the 
>>>>> mapping of GPU
>>>>> + * virtual address (VA) range to the section of an object that 
>>>>> should be bound
>>>>> + * in the device page table of the specified address space (VM).
>>>>> + * The VA range specified must be unique (ie., not currently 
>>>>> bound) and can
>>>>> + * be mapped to whole object or a section of the object (partial 
>>>>> binding).
>>>>> + * Multiple VA mappings can be created to the same section of the 
>>>>> object
>>>>> + * (aliasing).
>>>>> + *
>>>>> + * The @start, @offset and @length should be 4K page aligned. 
>>>>> However the DG2
>>>>> + * and XEHPSDV has 64K page size for device local-memory and has 
>>>>> compact page
>>>>> + * table. On those platforms, for binding device local-memory 
>>>>> objects, the
>>>>> + * @start should be 2M aligned, @offset and @length should be 64K 
>>>>> aligned.
>>>>
>>>> Should some error codes be documented and has the ability to 
>>>> programmatically probe the alignment restrictions been considered?
>>>>
>>>
>>> Currently what we have internally is that -EINVAL is returned if the 
>>> start, offset
>>> and length are not aligned. If the specified mapping already exists, 
>>> we return
>>> -EEXIST. If there are conflicts in the VA range and VA range can't be 
>>> reserved,
>>> then -ENOSPC is returned. I can add this documentation here. But I am 
>>> worried
>>> that there will be more suggestions/feedback about error codes while 
>>> reviewing
>>> the code patch series, and we have to revisit it again.
>>
>> I'd still suggest documenting those three. It makes sense to explain 
>> to userspace what behaviour they will see if they get it wrong.
>>
> 
> Ok.
> 
>>>>> + * Also, on those platforms, it is not allowed to bind an device 
>>>>> local-memory
>>>>> + * object and a system memory object in a single 2M section of VA 
>>>>> range.
>>>>
>>>> Text should be clear whether "not allowed" means there will be an 
>>>> error returned, or it will appear to work but bad things will happen.
>>>>
>>>
>>> Yah, error returned, will fix.
>>>
>>>>> + */
>>>>> +struct drm_i915_gem_vm_bind {
>>>>> +    /** @vm_id: VM (address space) id to bind */
>>>>> +    __u32 vm_id;
>>>>> +
>>>>> +    /** @handle: Object handle */
>>>>> +    __u32 handle;
>>>>> +
>>>>> +    /** @start: Virtual Address start to bind */
>>>>> +    __u64 start;
>>>>> +
>>>>> +    /** @offset: Offset in object to bind */
>>>>> +    __u64 offset;
>>>>> +
>>>>> +    /** @length: Length of mapping to bind */
>>>>> +    __u64 length;
>>>>> +
>>>>> +    /**
>>>>> +     * @flags: Supported flags are:
>>>>> +     *
>>>>> +     * I915_GEM_VM_BIND_FENCE_VALID:
>>>>> +     * @fence is valid, needs bind completion notification.
>>>>> +     *
>>>>> +     * I915_GEM_VM_BIND_READONLY:
>>>>> +     * Mapping is read-only.
>>>>> +     *
>>>>> +     * I915_GEM_VM_BIND_CAPTURE:
>>>>> +     * Capture this mapping in the dump upon GPU error.
>>>>> +     *
>>>>> +     * I915_GEM_VM_BIND_TLB_FLUSH:
>>>>> +     * Flush the TLB for the specified range after bind completion.
>>>>> +     */
>>>>> +    __u64 flags;
>>>>> +#define I915_GEM_VM_BIND_FENCE_VALID    (1 << 0)
>>>>> +#define I915_GEM_VM_BIND_READONLY    (1 << 1)
>>>>> +#define I915_GEM_VM_BIND_CAPTURE    (1 << 2)
>>>>> +#define I915_GEM_VM_BIND_TLB_FLUSH    (1 << 3)
>>>>
>>>> What is the use case for allowing any random user to play with 
>>>> (global) TLB flushing?
>>>>
>>>
>>> I heard it from Daniel on intel-gfx, apparently it is a Mesa 
>>> requirement.
>>
>> Okay I think that one needs clarifying.
>>
> 
> After chatting with Jason, I think we can remove it for now and
> we can revisit it later if Mesa thinks it is required.

IRC or some other thread?

>>>>> +
>>>>> +    /** @fence: Timeline fence for bind completion signaling */
>>>>> +    struct drm_i915_gem_vm_bind_fence fence;
>>>>
>>>> As agreed the other day - please document in the main kerneldoc 
>>>> section that all (un)binds are executed asynchronously and out of 
>>>> order.
>>>>
>>>
>>> I have added it in the latest revision of .rst file.
>>
>> Right, but I'd say to mention it in the uapi docs.
>>
> 
> Ok
> 
>>>>> +
>>>>> +    /** @extensions: 0-terminated chain of extensions */
>>>>> +    __u64 extensions;
>>>>> +};
>>>>> +
>>>>> +/**
>>>>> + * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
>>>>> + *
>>>>> + * This structure is passed to VM_UNBIND ioctl and specifies the 
>>>>> GPU virtual
>>>>> + * address (VA) range that should be unbound from the device page 
>>>>> table of the
>>>>> + * specified address space (VM). The specified VA range must match 
>>>>> one of the
>>>>> + * mappings created with the VM_BIND ioctl. TLB is flushed upon 
>>>>> unbind
>>>>> + * completion. The unbind operation will force unbind the specified
>>>>
>>>> Do we want to provide TLB flushing guarantees here and why? (As 
>>>> opposed to leaving them for implementation details.) If there is no 
>>>> implied order in either binds/unbinds, or between the two 
>>>> intermixed, then what is the point of guaranteeing a TLB flush on 
>>>> unbind completion?
>>>>
>>>
>>> I think we ensure that tlb is flushed before signaling the out fence
>>> of vm_unbind call, then user ensure correctness by staging submissions
>>> or vm_bind calls after vm_unbind out fence signaling.
>>
>> I don't see why is this required. Driver does not need to flush 
>> immediately on unbind for correctness/security and neither for the 
>> uapi contract. If there is no subsequent usage/bind then the flush is 
>> pointless. And if the user re-binds to same VA range, against an 
>> active VM, then perhaps the expectations need to be defined. Is this 
>> supported or user error or what.
>>
> 
> After a vm_unbind, UMD can re-bind to same VA range against an active VM.
> Though I am not sure with Mesa usecase if that new mapping is required for
> running GPU job or it will be for the next submission. But ensuring the
> tlb flush upon unbind, KMD can ensure correctness.

Isn't that their problem? If they re-bind for submitting _new_ work then 
they get the flush as part of batch buffer pre-amble.

> Note that on platforms with selective TLB invalidation, it is not
> as expensive as flushing the whole TLB. On platforms without selective
> tlb invalidation, we can put some optimization later as mentioned
> in the .rst file.
> 
> Also note that UMDs can vm_unbind a mapping while VM is active.
> By flushing the tlb, we ensure there is no inadvertent access to
> mapping that no longer exists. I can add this to documentation.

This one would surely be their problem. Kernel only needs to flush when 
it decides to re-use the backing store.

To be clear, overall I have reservations about offering strong 
guarantees about the TLB flushing behaviour at the level of these two 
ioctls. If we don't need to offer them it would be good to not do it, 
otherwise we limit ourselves on the implementation side and more 
importantly add a global performance hit where majority of userspace do 
not need this guarantee to start with.

I just don't fully remember how that compute use case is supposed to 
work, where new work keeps getting submitted against a running batch. Am 
I missing something there?

>>>> range from
>>>>> + * device page table without waiting for any GPU job to complete. 
>>>>> It is UMDs
>>>>> + * responsibility to ensure the mapping is no longer in use before 
>>>>> calling
>>>>> + * VM_UNBIND.
>>>>> + *
>>>>> + * The @start and @length must specify a unique mapping bound with 
>>>>> VM_BIND
>>>>> + * ioctl.
>>>>> + */
>>>>> +struct drm_i915_gem_vm_unbind {
>>>>> +    /** @vm_id: VM (address space) id to bind */
>>>>> +    __u32 vm_id;
>>>>> +
>>>>> +    /** @rsvd: Reserved, MBZ */
>>>>> +    __u32 rsvd;
>>>>> +
>>>>> +    /** @start: Virtual Address start to unbind */
>>>>> +    __u64 start;
>>>>> +
>>>>> +    /** @length: Length of mapping to unbind */
>>>>> +    __u64 length;
>>>>> +
>>>>> +    /**
>>>>> +     * @flags: Supported flags are:
>>>>> +     *
>>>>> +     * I915_GEM_VM_UNBIND_FENCE_VALID:
>>>>> +     * @fence is valid, needs unbind completion notification.
>>>>> +     */
>>>>> +    __u64 flags;
>>>>> +#define I915_GEM_VM_UNBIND_FENCE_VALID    (1 << 0)
>>>>> +
>>>>> +    /** @fence: Timeline fence for unbind completion signaling */
>>>>> +    struct drm_i915_gem_vm_bind_fence fence;
>>>>
>>>> I am not sure the simplified ioctl story is super coherent. If 
>>>> everything is now fully async and out of order, but the input fence 
>>>> has been dropped, then how is userspace supposed to handle the 
>>>> address space? It will have to wait (in userspace) for unbinds to 
>>>> complete before submitting subsequent binds which use the same VA 
>>>> range.
>>>>
>>>
>>> Yah and Mesa apparently will be having the support to handle it.
>>>
>>>> Maybe that's passable, but then the fact execbuf3 has no input fence 
>>>> suggests a userspace wait between it and binds. And I am pretty sure 
>>>> historically those were always quite bad for performance.
>>>>
>>>
>>> execbuf3 has the input fence through timeline fence array support.
>>
>> I think I confused the field in execbuf3 for the output fence.. So 
>> that part is fine, async binds chained with input fence to execbuf3. 
>> Fire and forget for userspace.
>>
>> Although I then don't understand why execbuf3 wouldn't support an 
>> output fence? What mechanism is userspace supposed to use for that? 
>> Export a fence from batch buffer BO? That would be an extra ioctl so 
>> if we can avoid it why not?
>>
> 
> execbuf3 supports out fence as well through timeline fence array.

Ah okay, I am uninformed in this topic, sorry.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-23  8:27           ` Tvrtko Ursulin
@ 2022-06-23  8:57             ` Lionel Landwerlin
  2022-06-23 11:05               ` Tvrtko Ursulin
  2022-06-23 14:47             ` Niranjana Vishwanathapura
  1 sibling, 1 reply; 34+ messages in thread
From: Lionel Landwerlin @ 2022-06-23  8:57 UTC (permalink / raw)
  To: Tvrtko Ursulin, Niranjana Vishwanathapura
  Cc: paulo.r.zanoni, intel-gfx, dri-devel, thomas.hellstrom,
	chris.p.wilson, daniel.vetter, christian.koenig, matthew.auld

On 23/06/2022 11:27, Tvrtko Ursulin wrote:
>>
>> After a vm_unbind, UMD can re-bind to same VA range against an active 
>> VM.
>> Though I am not sure with Mesa usecase if that new mapping is required 
>> for
>> running GPU job or it will be for the next submission. But ensuring the
>> tlb flush upon unbind, KMD can ensure correctness.
>
> Isn't that their problem? If they re-bind for submitting _new_ work 
> then they get the flush as part of batch buffer pre-amble. 

In the non sparse case, if a VA range is unbound, it is invalid to use 
that range for anything until it has been rebound by something else.

We'll take the fence provided by vm_bind and put it as a wait fence on 
the next execbuffer.

It might be safer in case of memory over fetching?


TLB flush will have to happen at some point right?

What's the alternative to do it in unbind?


-Lionel


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-22 15:12     ` Niranjana Vishwanathapura
  2022-06-22 15:57       ` Tvrtko Ursulin
@ 2022-06-23  9:28       ` Lionel Landwerlin
  2022-06-23 14:43           ` Niranjana Vishwanathapura
  1 sibling, 1 reply; 34+ messages in thread
From: Lionel Landwerlin @ 2022-06-23  9:28 UTC (permalink / raw)
  To: Niranjana Vishwanathapura, Tvrtko Ursulin
  Cc: paulo.r.zanoni, intel-gfx, dri-devel, thomas.hellstrom,
	chris.p.wilson, daniel.vetter, christian.koenig, matthew.auld

On 22/06/2022 18:12, Niranjana Vishwanathapura wrote:
> On Wed, Jun 22, 2022 at 09:10:07AM +0100, Tvrtko Ursulin wrote:
>>
>> On 22/06/2022 04:56, Niranjana Vishwanathapura wrote:
>>> VM_BIND and related uapi definitions
>>>
>>> v2: Reduce the scope to simple Mesa use case.
>>> v3: Expand VM_UNBIND documentation and add
>>>     I915_GEM_VM_BIND/UNBIND_FENCE_VALID
>>>     and I915_GEM_VM_BIND_TLB_FLUSH flags.
>>>
>>> Signed-off-by: Niranjana Vishwanathapura 
>>> <niranjana.vishwanathapura@intel.com>
>>> ---
>>>  Documentation/gpu/rfc/i915_vm_bind.h | 243 +++++++++++++++++++++++++++
>>>  1 file changed, 243 insertions(+)
>>>  create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
>>>
>>> diff --git a/Documentation/gpu/rfc/i915_vm_bind.h 
>>> b/Documentation/gpu/rfc/i915_vm_bind.h
>>> new file mode 100644
>>> index 000000000000..fa23b2d7ec6f
>>> --- /dev/null
>>> +++ b/Documentation/gpu/rfc/i915_vm_bind.h
>>> @@ -0,0 +1,243 @@
>>> +/* SPDX-License-Identifier: MIT */
>>> +/*
>>> + * Copyright © 2022 Intel Corporation
>>> + */
>>> +
>>> +/**
>>> + * DOC: I915_PARAM_HAS_VM_BIND
>>> + *
>>> + * VM_BIND feature availability.
>>> + * See typedef drm_i915_getparam_t param.
>>> + */
>>> +#define I915_PARAM_HAS_VM_BIND        57
>>> +
>>> +/**
>>> + * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
>>> + *
>>> + * Flag to opt-in for VM_BIND mode of binding during VM creation.
>>> + * See struct drm_i915_gem_vm_control flags.
>>> + *
>>> + * The older execbuf2 ioctl will not support VM_BIND mode of 
>>> operation.
>>> + * For VM_BIND mode, we have new execbuf3 ioctl which will not 
>>> accept any
>>> + * execlist (See struct drm_i915_gem_execbuffer3 for more details).
>>> + *
>>> + */
>>> +#define I915_VM_CREATE_FLAGS_USE_VM_BIND    (1 << 0)
>>> +
>>> +/* VM_BIND related ioctls */
>>> +#define DRM_I915_GEM_VM_BIND        0x3d
>>> +#define DRM_I915_GEM_VM_UNBIND        0x3e
>>> +#define DRM_I915_GEM_EXECBUFFER3    0x3f
>>> +
>>> +#define DRM_IOCTL_I915_GEM_VM_BIND DRM_IOWR(DRM_COMMAND_BASE + 
>>> DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
>>> +#define DRM_IOCTL_I915_GEM_VM_UNBIND DRM_IOWR(DRM_COMMAND_BASE + 
>>> DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
>>> +#define DRM_IOCTL_I915_GEM_EXECBUFFER3 DRM_IOWR(DRM_COMMAND_BASE + 
>>> DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)
>>> +
>>> +/**
>>> + * struct drm_i915_gem_vm_bind_fence - Bind/unbind completion 
>>> notification.
>>> + *
>>> + * A timeline out fence for vm_bind/unbind completion notification.
>>> + */
>>> +struct drm_i915_gem_vm_bind_fence {
>>> +    /** @handle: User's handle for a drm_syncobj to signal. */
>>> +    __u32 handle;
>>> +
>>> +    /** @rsvd: Reserved, MBZ */
>>> +    __u32 rsvd;
>>> +
>>> +    /**
>>> +     * @value: A point in the timeline.
>>> +     * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
>>> +     * timeline drm_syncobj is invalid as it turns a drm_syncobj 
>>> into a
>>> +     * binary one.
>>> +     */
>>> +    __u64 value;
>>> +};
>>> +
>>> +/**
>>> + * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
>>> + *
>>> + * This structure is passed to VM_BIND ioctl and specifies the 
>>> mapping of GPU
>>> + * virtual address (VA) range to the section of an object that 
>>> should be bound
>>> + * in the device page table of the specified address space (VM).
>>> + * The VA range specified must be unique (ie., not currently bound) 
>>> and can
>>> + * be mapped to whole object or a section of the object (partial 
>>> binding).
>>> + * Multiple VA mappings can be created to the same section of the 
>>> object
>>> + * (aliasing).
>>> + *
>>> + * The @start, @offset and @length should be 4K page aligned. 
>>> However the DG2
>>> + * and XEHPSDV has 64K page size for device local-memory and has 
>>> compact page
>>> + * table. On those platforms, for binding device local-memory 
>>> objects, the
>>> + * @start should be 2M aligned, @offset and @length should be 64K 
>>> aligned.
>>
>> Should some error codes be documented and has the ability to 
>> programmatically probe the alignment restrictions been considered?
>>
>
> Currently what we have internally is that -EINVAL is returned if the 
> start, offset
> and length are not aligned. If the specified mapping already exists, we 
> return
> -EEXIST. If there are conflicts in the VA range and VA range can't be 
> reserved,
> then -ENOSPC is returned. I can add this documentation here. But I am 
> worried
> that there will be more suggestions/feedback about error codes while 
> reviewing
> the code patch series, and we have to revisit it again.


That's not really a good excuse to not document.
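
To make the contract concrete, here is a rough userspace-side sketch of the
alignment rules and the three error codes mentioned above. The helper names
and the lmem_64k parameter are made up for illustration; only the 4K/64K/2M
rules and the -EINVAL/-EEXIST/-ENOSPC meanings come from this thread, and
they may still change during code review.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K  (UINT64_C(1) << 12)
#define SZ_64K (UINT64_C(1) << 16)
#define SZ_2M  (UINT64_C(1) << 21)

/* Hypothetical helper: true if v is a multiple of the power-of-two a. */
static inline bool is_aligned(uint64_t v, uint64_t a)
{
	return (v & (a - 1)) == 0;
}

/*
 * Hypothetical pre-check mirroring the documented rules.
 * lmem_64k is an assumed flag meaning "device local-memory object on a
 * platform with 64K pages and compact page tables (DG2, XEHPSDV)".
 */
static inline int vm_bind_check_alignment(uint64_t start, uint64_t offset,
					  uint64_t length, bool lmem_64k)
{
	uint64_t start_align = lmem_64k ? SZ_2M : SZ_4K;
	uint64_t range_align = lmem_64k ? SZ_64K : SZ_4K;

	if (!is_aligned(start, start_align) ||
	    !is_aligned(offset, range_align) ||
	    !is_aligned(length, range_align))
		return -EINVAL;

	return 0;
}

/* Meaning of each error code as described earlier in this thread. */
static inline const char *vm_bind_error_hint(int err)
{
	switch (err) {
	case -EINVAL:
		return "start, offset or length is not aligned";
	case -EEXIST:
		return "the specified mapping already exists";
	case -ENOSPC:
		return "VA range conflicts and cannot be reserved";
	default:
		return "unknown error";
	}
}
```

Something like this in the uapi kerneldoc (as text, not code) would already
tell userspace what to expect when it gets the arguments wrong.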


>
>>> + * Also, on those platforms, it is not allowed to bind an device 
>>> local-memory
>>> + * object and a system memory object in a single 2M section of VA 
>>> range.
>>
>> Text should be clear whether "not allowed" means there will be an 
>> error returned, or it will appear to work but bad things will happen.
>>
>
> Yah, error returned, will fix.
>
>>> + */
>>> +struct drm_i915_gem_vm_bind {
>>> +    /** @vm_id: VM (address space) id to bind */
>>> +    __u32 vm_id;
>>> +
>>> +    /** @handle: Object handle */
>>> +    __u32 handle;
>>> +
>>> +    /** @start: Virtual Address start to bind */
>>> +    __u64 start;
>>> +
>>> +    /** @offset: Offset in object to bind */
>>> +    __u64 offset;
>>> +
>>> +    /** @length: Length of mapping to bind */
>>> +    __u64 length;
>>> +
>>> +    /**
>>> +     * @flags: Supported flags are:
>>> +     *
>>> +     * I915_GEM_VM_BIND_FENCE_VALID:
>>> +     * @fence is valid, needs bind completion notification.
>>> +     *
>>> +     * I915_GEM_VM_BIND_READONLY:
>>> +     * Mapping is read-only.
>>> +     *
>>> +     * I915_GEM_VM_BIND_CAPTURE:
>>> +     * Capture this mapping in the dump upon GPU error.
>>> +     *
>>> +     * I915_GEM_VM_BIND_TLB_FLUSH:
>>> +     * Flush the TLB for the specified range after bind completion.
>>> +     */
>>> +    __u64 flags;
>>> +#define I915_GEM_VM_BIND_FENCE_VALID    (1 << 0)
>>> +#define I915_GEM_VM_BIND_READONLY    (1 << 1)
>>> +#define I915_GEM_VM_BIND_CAPTURE    (1 << 2)
>>> +#define I915_GEM_VM_BIND_TLB_FLUSH    (1 << 3)
>>
>> What is the use case for allowing any random user to play with 
>> (global) TLB flushing?
>>
>
> I heard it from Daniel on intel-gfx, apparently it is a Mesa requirement.
>
>>> +
>>> +    /** @fence: Timeline fence for bind completion signaling */
>>> +    struct drm_i915_gem_vm_bind_fence fence;
>>
>> As agreed the other day - please document in the main kerneldoc 
>> section that all (un)binds are executed asynchronously and out of order.
>>
>
> I have added it in the latest revision of .rst file.
>
>>> +
>>> +    /** @extensions: 0-terminated chain of extensions */
>>> +    __u64 extensions;
>>> +};
>>> +
>>> +/**
>>> + * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
>>> + *
>>> + * This structure is passed to VM_UNBIND ioctl and specifies the 
>>> GPU virtual
>>> + * address (VA) range that should be unbound from the device page 
>>> table of the
>>> + * specified address space (VM). The specified VA range must match 
>>> one of the
>>> + * mappings created with the VM_BIND ioctl. 


This will not work for sparse bindings.

We need to make this a feature and have i915 say that non-matching 
bind/unbind are not currently supported.

So that when support is added for non matching bind/unbind, we can 
detect the support and enable sparse in UMD.
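
As a sketch of what the UMD side would have to do under the current rule,
assuming it keeps its own list of bound ranges: an unbind is only legal if
it exactly matches one recorded bind. The struct and function names here
are hypothetical, and partial_unbind_supported stands in for a capability
query that does not exist in this series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical UMD-side record of one VM_BIND mapping. */
struct va_mapping {
	uint64_t start;
	uint64_t length;
};

/* True only if (start, length) is exactly one of the recorded mappings. */
static inline bool unbind_matches_bind(const struct va_mapping *maps,
				       size_t count,
				       uint64_t start, uint64_t length)
{
	for (size_t i = 0; i < count; i++) {
		if (maps[i].start == start && maps[i].length == length)
			return true;
	}
	return false;
}

/*
 * Sparse needs non-matching bind/unbind, so gate it on the (assumed)
 * capability reported by the kernel.
 */
static inline bool umd_can_enable_sparse(bool partial_unbind_supported)
{
	return partial_unbind_supported;
}
```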


>>> TLB is flushed upon unbind
>>> + * completion. The unbind operation will force unbind the specified
>>
>> Do we want to provide TLB flushing guarantees here and why? (As 
>> opposed to leaving them for implementation details.) If there is no 
>> implied order in either binds/unbinds, or between the two intermixed, 
>> then what is the point of guaranteeing a TLB flush on unbind completion?
>>
>
> I think we ensure that tlb is flushed before signaling the out fence
> of vm_unbind call, then user ensure correctness by staging submissions
> or vm_bind calls after vm_unbind out fence signaling.
>
>> range from
>>> + * device page table without waiting for any GPU job to complete. 
>>> It is UMDs
>>> + * responsibility to ensure the mapping is no longer in use before 
>>> calling
>>> + * VM_UNBIND.
>>> + *
>>> + * The @start and @length must specify a unique mapping bound with 
>>> VM_BIND
>>> + * ioctl.
>>> + */
>>> +struct drm_i915_gem_vm_unbind {
>>> +    /** @vm_id: VM (address space) id to bind */
>>> +    __u32 vm_id;
>>> +
>>> +    /** @rsvd: Reserved, MBZ */
>>> +    __u32 rsvd;
>>> +
>>> +    /** @start: Virtual Address start to unbind */
>>> +    __u64 start;
>>> +
>>> +    /** @length: Length of mapping to unbind */
>>> +    __u64 length;
>>> +
>>> +    /**
>>> +     * @flags: Supported flags are:
>>> +     *
>>> +     * I915_GEM_VM_UNBIND_FENCE_VALID:
>>> +     * @fence is valid, needs unbind completion notification.
>>> +     */
>>> +    __u64 flags;
>>> +#define I915_GEM_VM_UNBIND_FENCE_VALID    (1 << 0)
>>> +
>>> +    /** @fence: Timeline fence for unbind completion signaling */
>>> +    struct drm_i915_gem_vm_bind_fence fence;
>>
>> I am not sure the simplified ioctl story is super coherent. If 
>> everything is now fully async and out of order, but the input fence 
>> has been dropped, then how is userspace supposed to handle the 
>> address space? It will have to wait (in userspace) for unbinds to 
>> complete before submitting subsequent binds which use the same VA range.
>>
>
> Yah and Mesa apparently will be having the support to handle it.


Maybe there was miscommunication, but I thought things would be in order 
with an out fence only.

I didn't see out-of-order mentioned in our last internal discussion.

I think we can deal with it anyway using a timeline semaphore.
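
Roughly, assuming one timeline syncobj per VM on the UMD side, each unbind
would get the next timeline point as its out fence. The structs below are
local mirrors of the RFC definitions quoted above (with stdint types in
place of __u32/__u64), and struct bind_timeline and prepare_unbind() are
hypothetical names for illustration only.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of struct drm_i915_gem_vm_bind_fence from the RFC. */
struct vm_bind_fence {
	uint32_t handle;
	uint32_t rsvd;
	uint64_t value;
};

/* Local mirror of struct drm_i915_gem_vm_unbind from the RFC. */
struct vm_unbind {
	uint32_t vm_id;
	uint32_t rsvd;
	uint64_t start;
	uint64_t length;
	uint64_t flags;
	struct vm_bind_fence fence;
	uint64_t extensions;
};

#define VM_UNBIND_FENCE_VALID (UINT64_C(1) << 0)

/* Hypothetical UMD state: one timeline syncobj shared by all (un)binds. */
struct bind_timeline {
	uint32_t syncobj;
	uint64_t last_point;
};

/*
 * Fill an unbind request that signals the next point on the timeline.
 * Point 0 is never used, since the RFC says 0 is invalid for a timeline
 * drm_syncobj. Returns the point that later work has to wait on.
 */
static inline uint64_t prepare_unbind(struct bind_timeline *tl,
				      struct vm_unbind *arg, uint32_t vm_id,
				      uint64_t start, uint64_t length)
{
	memset(arg, 0, sizeof(*arg));
	arg->vm_id = vm_id;
	arg->start = start;
	arg->length = length;
	arg->flags = VM_UNBIND_FENCE_VALID;
	arg->fence.handle = tl->syncobj;
	arg->fence.value = ++tl->last_point;

	return arg->fence.value;
}
```

Since VM_BIND takes no input fence, a re-bind of the same VA range would
need a userspace wait on the returned point first (for example with
libdrm's drmSyncobjTimelineWait), while an execbuf3 could take the point as
a wait fence through the timeline fences extension.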


>
>> Maybe that's passable, but then the fact execbuf3 has no input fence 
>> suggests a userspace wait between it and binds. And I am pretty sure 
>> historically those were always quite bad for performance.
>>
>
> execbuf3 has the input fence through timeline fence array support.
>
>> Presumably userspace clients are happy with no input fences or it was 
>> considered too costly to implement it?
>>
>
> Yah, apparently Mesa can work with no input fence. This helps us in
> focusing on rest of the VM_BIND feature delivery.
>
> Niranjana
>
>> Regards,
>>
>> Tvrtko
>>
>>> +
>>> +    /** @extensions: 0-terminated chain of extensions */
>>> +    __u64 extensions;
>>> +};
>>> +
>>> +/**
>>> + * struct drm_i915_gem_execbuffer3 - Structure for 
>>> DRM_I915_GEM_EXECBUFFER3
>>> + * ioctl.
>>> + *
>>> + * DRM_I915_GEM_EXECBUFFER3 ioctl only works in VM_BIND mode and 
>>> VM_BIND mode
>>> + * only works with this ioctl for submission.
>>> + * See I915_VM_CREATE_FLAGS_USE_VM_BIND.
>>> + */
>>> +struct drm_i915_gem_execbuffer3 {
>>> +    /**
>>> +     * @ctx_id: Context id
>>> +     *
>>> +     * Only contexts with user engine map are allowed.
>>> +     */
>>> +    __u32 ctx_id;
>>> +
>>> +    /**
>>> +     * @engine_idx: Engine index
>>> +     *
>>> +     * An index in the user engine map of the context specified by 
>>> @ctx_id.
>>> +     */
>>> +    __u32 engine_idx;
>>> +
>>> +    /** @rsvd1: Reserved, MBZ */
>>> +    __u32 rsvd1;
>>> +
>>> +    /**
>>> +     * @batch_count: Number of batches in @batch_address array.
>>> +     *
>>> +     * 0 is invalid. For parallel submission, it should be equal to 
>>> the
>>> +     * number of (parallel) engines involved in that submission.
>>> +     */
>>> +    __u32 batch_count;
>>> +
>>> +    /**
>>> +     * @batch_address: Array of batch gpu virtual addresses.
>>> +     *
>>> +     * If @batch_count is 1, then it is the gpu virtual address of the
>>> +     * batch buffer. If @batch_count > 1, then it is a pointer to 
>>> an array
>>> +     * of batch buffer gpu virtual addresses.
>>> +     */
>>> +    __u64 batch_address;
>>> +
>>> +    /**
>>> +     * @flags: Supported flags are:
>>> +     *
>>> +     * I915_EXEC3_SECURE:
>>> +     * Request a privileged ("secure") batch buffer/s.
>>> +     * It is only available for DRM_ROOT_ONLY | DRM_MASTER processes.
>>> +     */
>>> +    __u64 flags;
>>> +#define I915_EXEC3_SECURE    (1<<0)
>>> +
>>> +    /** @rsvd2: Reserved, MBZ */
>>> +    __u64 rsvd2;
>>> +
>>> +    /**
>>> +     * @extensions: Zero-terminated chain of extensions.
>>> +     *
>>> +     * DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES:
>>> +     * It has same format as 
>>> DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES.
>>> +     * See struct drm_i915_gem_execbuffer_ext_timeline_fences.
>>> +     */
>>> +    __u64 extensions;
>>> +#define DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES    0
>>> +};
>>> +
>>> +/**
>>> + * struct drm_i915_gem_create_ext_vm_private - Extension to make 
>>> the object
>>> + * private to the specified VM.
>>> + *
>>> + * See struct drm_i915_gem_create_ext.
>>> + */
>>> +struct drm_i915_gem_create_ext_vm_private {
>>> +#define I915_GEM_CREATE_EXT_VM_PRIVATE        2
>>> +    /** @base: Extension link. See struct i915_user_extension. */
>>> +    struct i915_user_extension base;
>>> +
>>> +    /** @vm_id: Id of the VM to which the object is private */
>>> +    __u32 vm_id;
>>> +};



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-23  8:57             ` Lionel Landwerlin
@ 2022-06-23 11:05               ` Tvrtko Ursulin
  2022-06-23 12:41                 ` Lionel Landwerlin
  2022-06-23 21:05                   ` Zeng, Oak
  0 siblings, 2 replies; 34+ messages in thread
From: Tvrtko Ursulin @ 2022-06-23 11:05 UTC (permalink / raw)
  To: Lionel Landwerlin, Niranjana Vishwanathapura
  Cc: paulo.r.zanoni, intel-gfx, dri-devel, thomas.hellstrom,
	chris.p.wilson, daniel.vetter, christian.koenig, matthew.auld

On 23/06/2022 09:57, Lionel Landwerlin wrote:
> On 23/06/2022 11:27, Tvrtko Ursulin wrote:
>>>
>>> After a vm_unbind, UMD can re-bind to same VA range against an active 
>>> VM.
>>> Though I am not sure with Mesa usecase if that new mapping is required 
>>> for
>>> running GPU job or it will be for the next submission. But ensuring the
>>> tlb flush upon unbind, KMD can ensure correctness.
>>
>> Isn't that their problem? If they re-bind for submitting _new_ work 
>> then they get the flush as part of batch buffer pre-amble. 
> 
> In the non sparse case, if a VA range is unbound, it is invalid to use 
> that range for anything until it has been rebound by something else.
> 
> We'll take the fence provided by vm_bind and put it as a wait fence on 
> the next execbuffer.
> 
> It might be safer in case of memory over fetching?
> 
> 
> TLB flush will have to happen at some point right?
> 
> What's the alternative to do it in unbind?

Currently TLB flush happens from the ring before every BB_START and also 
when i915 returns the backing store pages to the system.

For the former, I haven't seen any mention of plans to stop doing it for 
execbuf3. Anyway, as long as this is kept, a sequence of 
bind[1..N]+execbuf is safe and correctly sees all the preceding binds.
Hence, about the alternative to doing it in unbind: first, I think, let's 
state the problem it is trying to solve.

For instance, is it just for the compute "append work to the running 
batch" use case? I honestly don't remember how that was supposed to work, 
so maybe the TLB flush on bind was supposed to deal with that scenario?

Or do you see a problem even for Mesa with the current model?

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-23 11:05               ` Tvrtko Ursulin
@ 2022-06-23 12:41                 ` Lionel Landwerlin
  2022-06-23 21:05                   ` Zeng, Oak
  1 sibling, 0 replies; 34+ messages in thread
From: Lionel Landwerlin @ 2022-06-23 12:41 UTC (permalink / raw)
  To: Tvrtko Ursulin, Niranjana Vishwanathapura
  Cc: paulo.r.zanoni, intel-gfx, dri-devel, thomas.hellstrom,
	chris.p.wilson, daniel.vetter, christian.koenig, matthew.auld

On 23/06/2022 14:05, Tvrtko Ursulin wrote:
>
> On 23/06/2022 09:57, Lionel Landwerlin wrote:
>> On 23/06/2022 11:27, Tvrtko Ursulin wrote:
>>>>
>>>> After a vm_unbind, UMD can re-bind to same VA range against an 
>>>> active VM.
>>>> Though I am not sure with Mesa usecase if that new mapping is 
>>>> required for
>>>> running GPU job or it will be for the next submission. But ensuring 
>>>> the
>>>> tlb flush upon unbind, KMD can ensure correctness.
>>>
>>> Isn't that their problem? If they re-bind for submitting _new_ work 
>>> then they get the flush as part of batch buffer pre-amble. 
>>
>> In the non sparse case, if a VA range is unbound, it is invalid to 
>> use that range for anything until it has been rebound by something else.
>>
>> We'll take the fence provided by vm_bind and put it as a wait fence 
>> on the next execbuffer.
>>
>> It might be safer in case of memory over fetching?
>>
>>
>> TLB flush will have to happen at some point right?
>>
>> What's the alternative to do it in unbind?
>
> Currently TLB flush happens from the ring before every BB_START and 
> also when i915 returns the backing store pages to the system.
>
> For the former, I haven't seen any mention of plans to stop doing it for 
> execbuf3. Anyway, as long as this is kept, a sequence 
> of bind[1..N]+execbuf is safe and correctly sees all the preceding binds.
> Hence, about the alternative to doing it in unbind: first, I think, let's 
> state the problem it is trying to solve.
>
> For instance, is it just for the compute "append work to the running 
> batch" use case? I honestly don't remember how that was supposed to 
> work, so maybe the TLB flush on bind was supposed to deal with that 
> scenario?
>
> Or do you see a problem even for Mesa with the current model?
>
> Regards,
>
> Tvrtko


As far as I can tell, all the binds should have completed before execbuf 
starts if you follow the Vulkan sparse binding rules.

For non-sparse, the UMD will take care of it.

I think we're fine.


-Lionel



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-23  9:28       ` Lionel Landwerlin
@ 2022-06-23 14:43           ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 34+ messages in thread
From: Niranjana Vishwanathapura @ 2022-06-23 14:43 UTC (permalink / raw)
  To: Lionel Landwerlin
  Cc: Tvrtko Ursulin, paulo.r.zanoni, intel-gfx, dri-devel,
	thomas.hellstrom, chris.p.wilson, daniel.vetter,
	christian.koenig, matthew.auld

On Thu, Jun 23, 2022 at 12:28:32PM +0300, Lionel Landwerlin wrote:
>On 22/06/2022 18:12, Niranjana Vishwanathapura wrote:
>>On Wed, Jun 22, 2022 at 09:10:07AM +0100, Tvrtko Ursulin wrote:
>>>
>>>On 22/06/2022 04:56, Niranjana Vishwanathapura wrote:
>>>>VM_BIND and related uapi definitions
>>>>
>>>>v2: Reduce the scope to simple Mesa use case.
>>>>v3: Expand VM_UNBIND documentation and add
>>>>    I915_GEM_VM_BIND/UNBIND_FENCE_VALID
>>>>    and I915_GEM_VM_BIND_TLB_FLUSH flags.
>>>>
>>>>Signed-off-by: Niranjana Vishwanathapura 
>>>><niranjana.vishwanathapura@intel.com>
>>>>---
>>>> Documentation/gpu/rfc/i915_vm_bind.h | 243 +++++++++++++++++++++++++++
>>>> 1 file changed, 243 insertions(+)
>>>> create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
>>>>
>>>>diff --git a/Documentation/gpu/rfc/i915_vm_bind.h 
>>>>b/Documentation/gpu/rfc/i915_vm_bind.h
>>>>new file mode 100644
>>>>index 000000000000..fa23b2d7ec6f
>>>>--- /dev/null
>>>>+++ b/Documentation/gpu/rfc/i915_vm_bind.h
>>>>@@ -0,0 +1,243 @@
>>>>+/* SPDX-License-Identifier: MIT */
>>>>+/*
>>>>+ * Copyright © 2022 Intel Corporation
>>>>+ */
>>>>+
>>>>+/**
>>>>+ * DOC: I915_PARAM_HAS_VM_BIND
>>>>+ *
>>>>+ * VM_BIND feature availability.
>>>>+ * See typedef drm_i915_getparam_t param.
>>>>+ */
>>>>+#define I915_PARAM_HAS_VM_BIND        57
>>>>+
>>>>+/**
>>>>+ * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
>>>>+ *
>>>>+ * Flag to opt-in for VM_BIND mode of binding during VM creation.
>>>>+ * See struct drm_i915_gem_vm_control flags.
>>>>+ *
>>>>+ * The older execbuf2 ioctl will not support VM_BIND mode of 
>>>>operation.
>>>>+ * For VM_BIND mode, we have new execbuf3 ioctl which will not 
>>>>accept any
>>>>+ * execlist (See struct drm_i915_gem_execbuffer3 for more details).
>>>>+ *
>>>>+ */
>>>>+#define I915_VM_CREATE_FLAGS_USE_VM_BIND    (1 << 0)
>>>>+
>>>>+/* VM_BIND related ioctls */
>>>>+#define DRM_I915_GEM_VM_BIND        0x3d
>>>>+#define DRM_I915_GEM_VM_UNBIND        0x3e
>>>>+#define DRM_I915_GEM_EXECBUFFER3    0x3f
>>>>+
>>>>+#define DRM_IOCTL_I915_GEM_VM_BIND DRM_IOWR(DRM_COMMAND_BASE + 
>>>>DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
>>>>+#define DRM_IOCTL_I915_GEM_VM_UNBIND DRM_IOWR(DRM_COMMAND_BASE 
>>>>+ DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
>>>>+#define DRM_IOCTL_I915_GEM_EXECBUFFER3 
>>>>DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct 
>>>>drm_i915_gem_execbuffer3)
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_vm_bind_fence - Bind/unbind completion 
>>>>notification.
>>>>+ *
>>>>+ * A timeline out fence for vm_bind/unbind completion notification.
>>>>+ */
>>>>+struct drm_i915_gem_vm_bind_fence {
>>>>+    /** @handle: User's handle for a drm_syncobj to signal. */
>>>>+    __u32 handle;
>>>>+
>>>>+    /** @rsvd: Reserved, MBZ */
>>>>+    __u32 rsvd;
>>>>+
>>>>+    /**
>>>>+     * @value: A point in the timeline.
>>>>+     * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
>>>>+     * timeline drm_syncobj is invalid as it turns a 
>>>>drm_syncobj into a
>>>>+     * binary one.
>>>>+     */
>>>>+    __u64 value;
>>>>+};
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
>>>>+ *
>>>>+ * This structure is passed to VM_BIND ioctl and specifies the 
>>>>mapping of GPU
>>>>+ * virtual address (VA) range to the section of an object that 
>>>>should be bound
>>>>+ * in the device page table of the specified address space (VM).
>>>>+ * The VA range specified must be unique (ie., not currently 
>>>>bound) and can
>>>>+ * be mapped to whole object or a section of the object 
>>>>(partial binding).
>>>>+ * Multiple VA mappings can be created to the same section of 
>>>>the object
>>>>+ * (aliasing).
>>>>+ *
>>>>+ * The @start, @offset and @length should be 4K page aligned. 
>>>>However the DG2
>>>>+ * and XEHPSDV have 64K page size for device local-memory and 
>>>>have compact page
>>>>+ * table. On those platforms, for binding device local-memory 
>>>>objects, the
>>>>+ * @start should be 2M aligned, @offset and @length should be 
>>>>64K aligned.
>>>
>>>Should some error codes be documented and has the ability to 
>>>programmatically probe the alignment restrictions been considered?
>>>
>>
>>Currently what we have internally is that -EINVAL is returned if the 
>>start, offset
>>and length are not aligned. If the specified mapping already exists, 
>>we return
>>-EEXIST. If there are conflicts in the VA range and VA range can't 
>>be reserved,
>>then -ENOSPC is returned. I can add this documentation here. But I 
>>am worried
>>that there will be more suggestions/feedback about error codes while 
>>reviewing
>>the code patch series, and we have to revisit it again.
>
>
>That's not really a good excuse to not document.
>

Yah, I have documented it in the v4 series I sent out.
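
As an illustration of the alignment rules quoted above, a UMD could mirror
the check before calling VM_BIND. This is a sketch under assumptions: the
helper name and the lmem_64k parameter are hypothetical, and the only error
modelled is the -EINVAL case described in this thread (the -EEXIST and
-ENOSPC cases need kernel-side state and are not modelled).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K  (UINT64_C(1) << 12)
#define SZ_64K (UINT64_C(1) << 16)
#define SZ_2M  (UINT64_C(1) << 21)

/* True if v is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t v, uint64_t align)
{
	return (v & (align - 1)) == 0;
}

/*
 * Hypothetical pre-check of a VM_BIND request.
 *
 * lmem_64k: the object lives in device local-memory on a platform with
 * 64K pages and compact page tables (DG2, XEHPSDV per the patch text).
 * There, @start must be 2M aligned and @offset/@length 64K aligned.
 * Everywhere else, all three must be 4K aligned.
 */
static int vm_bind_check_alignment(uint64_t start, uint64_t offset,
				   uint64_t length, bool lmem_64k)
{
	uint64_t start_align = lmem_64k ? SZ_2M : SZ_4K;
	uint64_t range_align = lmem_64k ? SZ_64K : SZ_4K;

	if (!is_aligned(start, start_align) ||
	    !is_aligned(offset, range_align) ||
	    !is_aligned(length, range_align))
		return -EINVAL;

	return 0;
}
```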

>
>>
>>>>+ * Also, on those platforms, it is not allowed to bind a 
>>>>device local-memory
>>>>+ * object and a system memory object in a single 2M section of 
>>>>VA range.
>>>
>>>Text should be clear whether "not allowed" means there will be an 
>>>error returned, or it will appear to work but bad things will 
>>>happen.
>>>
>>
>>Yah, error returned, will fix.
>>
>>>>+ */
>>>>+struct drm_i915_gem_vm_bind {
>>>>+    /** @vm_id: VM (address space) id to bind */
>>>>+    __u32 vm_id;
>>>>+
>>>>+    /** @handle: Object handle */
>>>>+    __u32 handle;
>>>>+
>>>>+    /** @start: Virtual Address start to bind */
>>>>+    __u64 start;
>>>>+
>>>>+    /** @offset: Offset in object to bind */
>>>>+    __u64 offset;
>>>>+
>>>>+    /** @length: Length of mapping to bind */
>>>>+    __u64 length;
>>>>+
>>>>+    /**
>>>>+     * @flags: Supported flags are:
>>>>+     *
>>>>+     * I915_GEM_VM_BIND_FENCE_VALID:
>>>>+     * @fence is valid, needs bind completion notification.
>>>>+     *
>>>>+     * I915_GEM_VM_BIND_READONLY:
>>>>+     * Mapping is read-only.
>>>>+     *
>>>>+     * I915_GEM_VM_BIND_CAPTURE:
>>>>+     * Capture this mapping in the dump upon GPU error.
>>>>+     *
>>>>+     * I915_GEM_VM_BIND_TLB_FLUSH:
>>>>+     * Flush the TLB for the specified range after bind completion.
>>>>+     */
>>>>+    __u64 flags;
>>>>+#define I915_GEM_VM_BIND_FENCE_VALID    (1 << 0)
>>>>+#define I915_GEM_VM_BIND_READONLY    (1 << 1)
>>>>+#define I915_GEM_VM_BIND_CAPTURE    (1 << 2)
>>>>+#define I915_GEM_VM_BIND_TLB_FLUSH    (1 << 3)
>>>
>>>What is the use case for allowing any random user to play with 
>>>(global) TLB flushing?
>>>
>>
>>I heard it from Daniel on intel-gfx, apparently it is a Mesa requirement.
>>
>>>>+
>>>>+    /** @fence: Timeline fence for bind completion signaling */
>>>>+    struct drm_i915_gem_vm_bind_fence fence;
>>>
>>>As agreed the other day - please document in the main kerneldoc 
>>>section that all (un)binds are executed asynchronously and out of 
>>>order.
>>>
>>
>>I have added it in the latest revision of .rst file.
>>
>>>>+
>>>>+    /** @extensions: 0-terminated chain of extensions */
>>>>+    __u64 extensions;
>>>>+};
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
>>>>+ *
>>>>+ * This structure is passed to VM_UNBIND ioctl and specifies 
>>>>the GPU virtual
>>>>+ * address (VA) range that should be unbound from the device 
>>>>page table of the
>>>>+ * specified address space (VM). The specified VA range must 
>>>>match one of the
>>>>+ * mappings created with the VM_BIND ioctl.
>
>
>This will not work for sparse bindings.
>
>We need to make this a feature and have i915 say that non-matching 
>bind/unbind are not currently supported.
>
>So that when support is added for non matching bind/unbind, we can 
>detect the support and enable sparse in UMD.
>

Ok, will add a 'version' tag to HAS_VM_BIND query and add documentation.
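
A sketch of how a UMD might consume such a version value. The enum names
and the meaning of each value are assumptions for illustration; this
revision of the patch does not define them.

```c
#include <stdbool.h>

/* Assumed meanings of the proposed version value (hypothetical). */
enum vm_bind_version {
	VM_BIND_UNSUPPORTED   = 0, /* VM_BIND not available */
	VM_BIND_MATCHING_ONLY = 1, /* VM_UNBIND must match a prior VM_BIND */
};

/* VM_BIND mode is usable at all. */
static bool umd_can_use_vm_bind(int version)
{
	return version >= VM_BIND_MATCHING_ONLY;
}

/*
 * Sparse binding needs non-matching bind/unbind, which this sketch
 * assumes would be advertised by a later version value.
 */
static bool umd_can_enable_sparse(int version)
{
	return version > VM_BIND_MATCHING_ONLY;
}
```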

>
>>>>TLB is flushed upon unbind
>>>>+ * completion. The unbind operation will force unbind the specified
>>>
>>>Do we want to provide TLB flushing guarantees here and why? (As 
>>>opposed to leaving them for implementation details.) If there is 
>>>no implied order in either binds/unbinds, or between the two 
>>>intermixed, then what is the point of guaranteeing a TLB flush on 
>>>unbind completion?
>>>
>>
>>I think we ensure that the TLB is flushed before signaling the out fence
>>of the vm_unbind call; the user then ensures correctness by staging submissions
>>or vm_bind calls after the vm_unbind out fence signals.
>>
>>>range from
>>>>+ * device page table without waiting for any GPU job to 
>>>>complete. It is UMD's
>>>>+ * responsibility to ensure the mapping is no longer in use 
>>>>before calling
>>>>+ * VM_UNBIND.
>>>>+ *
>>>>+ * The @start and @length must specify a unique mapping bound 
>>>>with VM_BIND
>>>>+ * ioctl.
>>>>+ */
>>>>+struct drm_i915_gem_vm_unbind {
>>>>+    /** @vm_id: VM (address space) id to bind */
>>>>+    __u32 vm_id;
>>>>+
>>>>+    /** @rsvd: Reserved, MBZ */
>>>>+    __u32 rsvd;
>>>>+
>>>>+    /** @start: Virtual Address start to unbind */
>>>>+    __u64 start;
>>>>+
>>>>+    /** @length: Length of mapping to unbind */
>>>>+    __u64 length;
>>>>+
>>>>+    /**
>>>>+     * @flags: Supported flags are:
>>>>+     *
>>>>+     * I915_GEM_VM_UNBIND_FENCE_VALID:
>>>>+     * @fence is valid, needs unbind completion notification.
>>>>+     */
>>>>+    __u64 flags;
>>>>+#define I915_GEM_VM_UNBIND_FENCE_VALID    (1 << 0)
>>>>+
>>>>+    /** @fence: Timeline fence for unbind completion signaling */
>>>>+    struct drm_i915_gem_vm_bind_fence fence;
>>>
>>>I am not sure the simplified ioctl story is super coherent. If 
>>>everything is now fully async and out of order, but the input 
>>>fence has been dropped, then how is userspace supposed to handle 
>>>the address space? It will have to wait (in userspace) for unbinds 
>>>to complete before submitting subsequent binds which use the same 
>>>VA range.
>>>
>>
>>Yah, and Mesa apparently will have the support to handle it.
>
>
>Maybe there was miscommunication, but I thought things would be in 
>order with an out fence only.
>
>I didn't see out-of-order mentioned in our last internal discussion.
>

It was part of internal discussion with Mesa where we dropped multiple
queue support etc.

>I think we can deal with it anyway using a timeline semaphore.

:)

Niranjana

>
>
>>
>>>Maybe that's passable, but then the fact execbuf3 has no input 
>>>fence suggests a userspace wait between it and binds. And I am 
>>>pretty sure historically those were always quite bad for 
>>>performance.
>>>
>>
>>execbuf3 has the input fence through timeline fence array support.
>>
>>>Presumably userspace clients are happy with no input fences or it 
>>>was considered too costly to implement it?
>>>
>>
>>Yah, apparently Mesa can work with no input fence. This helps us in
>>focusing on rest of the VM_BIND feature delivery.
>>
>>Niranjana
>>
>>>Regards,
>>>
>>>Tvrtko
>>>
>>>>+
>>>>+    /** @extensions: 0-terminated chain of extensions */
>>>>+    __u64 extensions;
>>>>+};
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_execbuffer3 - Structure for 
>>>>DRM_I915_GEM_EXECBUFFER3
>>>>+ * ioctl.
>>>>+ *
>>>>+ * DRM_I915_GEM_EXECBUFFER3 ioctl only works in VM_BIND mode 
>>>>and VM_BIND mode
>>>>+ * only works with this ioctl for submission.
>>>>+ * See I915_VM_CREATE_FLAGS_USE_VM_BIND.
>>>>+ */
>>>>+struct drm_i915_gem_execbuffer3 {
>>>>+    /**
>>>>+     * @ctx_id: Context id
>>>>+     *
>>>>+     * Only contexts with user engine map are allowed.
>>>>+     */
>>>>+    __u32 ctx_id;
>>>>+
>>>>+    /**
>>>>+     * @engine_idx: Engine index
>>>>+     *
>>>>+     * An index in the user engine map of the context specified 
>>>>by @ctx_id.
>>>>+     */
>>>>+    __u32 engine_idx;
>>>>+
>>>>+    /** @rsvd1: Reserved, MBZ */
>>>>+    __u32 rsvd1;
>>>>+
>>>>+    /**
>>>>+     * @batch_count: Number of batches in @batch_address array.
>>>>+     *
>>>>+     * 0 is invalid. For parallel submission, it should be 
>>>>equal to the
>>>>+     * number of (parallel) engines involved in that submission.
>>>>+     */
>>>>+    __u32 batch_count;
>>>>+
>>>>+    /**
>>>>+     * @batch_address: Array of batch gpu virtual addresses.
>>>>+     *
>>>>+     * If @batch_count is 1, then it is the gpu virtual address of the
>>>>+     * batch buffer. If @batch_count > 1, then it is a pointer 
>>>>to an array
>>>>+     * of batch buffer gpu virtual addresses.
>>>>+     */
>>>>+    __u64 batch_address;
>>>>+
>>>>+    /**
>>>>+     * @flags: Supported flags are:
>>>>+     *
>>>>+     * I915_EXEC3_SECURE:
>>>>+     * Request a privileged ("secure") batch buffer/s.
>>>>+     * It is only available for DRM_ROOT_ONLY | DRM_MASTER processes.
>>>>+     */
>>>>+    __u64 flags;
>>>>+#define I915_EXEC3_SECURE    (1<<0)
>>>>+
>>>>+    /** @rsvd2: Reserved, MBZ */
>>>>+    __u64 rsvd2;
>>>>+
>>>>+    /**
>>>>+     * @extensions: Zero-terminated chain of extensions.
>>>>+     *
>>>>+     * DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES:
>>>>+     * It has same format as 
>>>>DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES.
>>>>+     * See struct drm_i915_gem_execbuffer_ext_timeline_fences.
>>>>+     */
>>>>+    __u64 extensions;
>>>>+#define DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES    0
>>>>+};
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_create_ext_vm_private - Extension to 
>>>>make the object
>>>>+ * private to the specified VM.
>>>>+ *
>>>>+ * See struct drm_i915_gem_create_ext.
>>>>+ */
>>>>+struct drm_i915_gem_create_ext_vm_private {
>>>>+#define I915_GEM_CREATE_EXT_VM_PRIVATE        2
>>>>+    /** @base: Extension link. See struct i915_user_extension. */
>>>>+    struct i915_user_extension base;
>>>>+
>>>>+    /** @vm_id: Id of the VM to which the object is private */
>>>>+    __u32 vm_id;
>>>>+};
>
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
@ 2022-06-23 14:43           ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 34+ messages in thread
From: Niranjana Vishwanathapura @ 2022-06-23 14:43 UTC (permalink / raw)
  To: Lionel Landwerlin
  Cc: paulo.r.zanoni, intel-gfx, dri-devel, thomas.hellstrom,
	chris.p.wilson, daniel.vetter, christian.koenig, matthew.auld

On Thu, Jun 23, 2022 at 12:28:32PM +0300, Lionel Landwerlin wrote:
>On 22/06/2022 18:12, Niranjana Vishwanathapura wrote:
>>On Wed, Jun 22, 2022 at 09:10:07AM +0100, Tvrtko Ursulin wrote:
>>>
>>>On 22/06/2022 04:56, Niranjana Vishwanathapura wrote:
>>>>VM_BIND and related uapi definitions
>>>>
>>>>v2: Reduce the scope to simple Mesa use case.
>>>>v3: Expand VM_UNBIND documentation and add
>>>>    I915_GEM_VM_BIND/UNBIND_FENCE_VALID
>>>>    and I915_GEM_VM_BIND_TLB_FLUSH flags.
>>>>
>>>>Signed-off-by: Niranjana Vishwanathapura 
>>>><niranjana.vishwanathapura@intel.com>
>>>>---
>>>> Documentation/gpu/rfc/i915_vm_bind.h | 243 +++++++++++++++++++++++++++
>>>> 1 file changed, 243 insertions(+)
>>>> create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
>>>>
>>>>diff --git a/Documentation/gpu/rfc/i915_vm_bind.h 
>>>>b/Documentation/gpu/rfc/i915_vm_bind.h
>>>>new file mode 100644
>>>>index 000000000000..fa23b2d7ec6f
>>>>--- /dev/null
>>>>+++ b/Documentation/gpu/rfc/i915_vm_bind.h
>>>>@@ -0,0 +1,243 @@
>>>>+/* SPDX-License-Identifier: MIT */
>>>>+/*
>>>>+ * Copyright © 2022 Intel Corporation
>>>>+ */
>>>>+
>>>>+/**
>>>>+ * DOC: I915_PARAM_HAS_VM_BIND
>>>>+ *
>>>>+ * VM_BIND feature availability.
>>>>+ * See typedef drm_i915_getparam_t param.
>>>>+ */
>>>>+#define I915_PARAM_HAS_VM_BIND        57
>>>>+
>>>>+/**
>>>>+ * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
>>>>+ *
>>>>+ * Flag to opt-in for VM_BIND mode of binding during VM creation.
>>>>+ * See struct drm_i915_gem_vm_control flags.
>>>>+ *
>>>>+ * The older execbuf2 ioctl will not support VM_BIND mode of 
>>>>operation.
>>>>+ * For VM_BIND mode, we have new execbuf3 ioctl which will not 
>>>>accept any
>>>>+ * execlist (See struct drm_i915_gem_execbuffer3 for more details).
>>>>+ *
>>>>+ */
>>>>+#define I915_VM_CREATE_FLAGS_USE_VM_BIND    (1 << 0)
>>>>+
>>>>+/* VM_BIND related ioctls */
>>>>+#define DRM_I915_GEM_VM_BIND        0x3d
>>>>+#define DRM_I915_GEM_VM_UNBIND        0x3e
>>>>+#define DRM_I915_GEM_EXECBUFFER3    0x3f
>>>>+
>>>>+#define DRM_IOCTL_I915_GEM_VM_BIND DRM_IOWR(DRM_COMMAND_BASE + 
>>>>DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
>>>>+#define DRM_IOCTL_I915_GEM_VM_UNBIND DRM_IOWR(DRM_COMMAND_BASE 
>>>>+ DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
>>>>+#define DRM_IOCTL_I915_GEM_EXECBUFFER3 
>>>>DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct 
>>>>drm_i915_gem_execbuffer3)
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_vm_bind_fence - Bind/unbind completion 
>>>>notification.
>>>>+ *
>>>>+ * A timeline out fence for vm_bind/unbind completion notification.
>>>>+ */
>>>>+struct drm_i915_gem_vm_bind_fence {
>>>>+    /** @handle: User's handle for a drm_syncobj to signal. */
>>>>+    __u32 handle;
>>>>+
>>>>+    /** @rsvd: Reserved, MBZ */
>>>>+    __u32 rsvd;
>>>>+
>>>>+    /**
>>>>+     * @value: A point in the timeline.
>>>>+     * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
>>>>+     * timeline drm_syncobj is invalid as it turns a 
>>>>drm_syncobj into a
>>>>+     * binary one.
>>>>+     */
>>>>+    __u64 value;
>>>>+};
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
>>>>+ *
>>>>+ * This structure is passed to VM_BIND ioctl and specifies the 
>>>>mapping of GPU
>>>>+ * virtual address (VA) range to the section of an object that 
>>>>should be bound
>>>>+ * in the device page table of the specified address space (VM).
>>>>+ * The VA range specified must be unique (ie., not currently 
>>>>bound) and can
>>>>+ * be mapped to whole object or a section of the object 
>>>>(partial binding).
>>>>+ * Multiple VA mappings can be created to the same section of 
>>>>the object
>>>>+ * (aliasing).
>>>>+ *
>>>>+ * The @start, @offset and @length should be 4K page aligned. 
>>>>However the DG2
>>>>+ * and XEHPSDV have 64K page size for device local-memory and 
>>>>have compact page
>>>>+ * table. On those platforms, for binding device local-memory 
>>>>objects, the
>>>>+ * @start should be 2M aligned, @offset and @length should be 
>>>>64K aligned.
>>>
>>>Should some error codes be documented and has the ability to 
>>>programmatically probe the alignment restrictions been considered?
>>>
>>
>>Currently what we have internally is that -EINVAL is returned if the 
>>start, offset
>>and length are not aligned. If the specified mapping already exists, 
>>we return
>>-EEXIST. If there are conflicts in the VA range and VA range can't 
>>be reserved,
>>then -ENOSPC is returned. I can add this documentation here. But I 
>>am worried
>>that there will be more suggestions/feedback about error codes while 
>>reviewing
>>the code patch series, and we have to revisit it again.
>
>
>That's not really a good excuse to not document.
>

Yah, I have documented it in the v4 series I sent out.

>
>>
>>>>+ * Also, on those platforms, it is not allowed to bind a 
>>>>device local-memory
>>>>+ * object and a system memory object in a single 2M section of 
>>>>VA range.
>>>
>>>Text should be clear whether "not allowed" means there will be an 
>>>error returned, or it will appear to work but bad things will 
>>>happen.
>>>
>>
>>Yah, error returned, will fix.
>>
>>>>+ */
>>>>+struct drm_i915_gem_vm_bind {
>>>>+    /** @vm_id: VM (address space) id to bind */
>>>>+    __u32 vm_id;
>>>>+
>>>>+    /** @handle: Object handle */
>>>>+    __u32 handle;
>>>>+
>>>>+    /** @start: Virtual Address start to bind */
>>>>+    __u64 start;
>>>>+
>>>>+    /** @offset: Offset in object to bind */
>>>>+    __u64 offset;
>>>>+
>>>>+    /** @length: Length of mapping to bind */
>>>>+    __u64 length;
>>>>+
>>>>+    /**
>>>>+     * @flags: Supported flags are:
>>>>+     *
>>>>+     * I915_GEM_VM_BIND_FENCE_VALID:
>>>>+     * @fence is valid, needs bind completion notification.
>>>>+     *
>>>>+     * I915_GEM_VM_BIND_READONLY:
>>>>+     * Mapping is read-only.
>>>>+     *
>>>>+     * I915_GEM_VM_BIND_CAPTURE:
>>>>+     * Capture this mapping in the dump upon GPU error.
>>>>+     *
>>>>+     * I915_GEM_VM_BIND_TLB_FLUSH:
>>>>+     * Flush the TLB for the specified range after bind completion.
>>>>+     */
>>>>+    __u64 flags;
>>>>+#define I915_GEM_VM_BIND_FENCE_VALID    (1 << 0)
>>>>+#define I915_GEM_VM_BIND_READONLY    (1 << 1)
>>>>+#define I915_GEM_VM_BIND_CAPTURE    (1 << 2)
>>>>+#define I915_GEM_VM_BIND_TLB_FLUSH    (1 << 3)
>>>
>>>What is the use case for allowing any random user to play with 
>>>(global) TLB flushing?
>>>
>>
>>I heard it from Daniel on intel-gfx, apparently it is a Mesa requirement.
>>
>>>>+
>>>>+    /** @fence: Timeline fence for bind completion signaling */
>>>>+    struct drm_i915_gem_vm_bind_fence fence;
>>>
>>>As agreed the other day - please document in the main kerneldoc 
>>>section that all (un)binds are executed asynchronously and out of 
>>>order.
>>>
>>
>>I have added it in the latest revision of .rst file.
>>
>>>>+
>>>>+    /** @extensions: 0-terminated chain of extensions */
>>>>+    __u64 extensions;
>>>>+};
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
>>>>+ *
>>>>+ * This structure is passed to VM_UNBIND ioctl and specifies 
>>>>the GPU virtual
>>>>+ * address (VA) range that should be unbound from the device 
>>>>page table of the
>>>>+ * specified address space (VM). The specified VA range must 
>>>>match one of the
>>>>+ * mappings created with the VM_BIND ioctl.
>
>
>This will not work for sparse bindings.
>
>We need to make this a feature and have i915 say that non-matching 
>bind/unbind are not currently supported.
>
>So that when support is added for non matching bind/unbind, we can 
>detect the support and enable sparse in UMD.
>

Ok, will add a 'version' tag to HAS_VM_BIND query and add documentation.

>
>>>>TLB is flushed upon unbind
>>>>+ * completion. The unbind operation will force unbind the specified
>>>
>>>Do we want to provide TLB flushing guarantees here and why? (As 
>>>opposed to leaving them for implementation details.) If there is 
>>>no implied order in either binds/unbinds, or between the two 
>>>intermixed, then what is the point of guaranteeing a TLB flush on 
>>>unbind completion?
>>>
>>
>>I think we ensure that the TLB is flushed before signaling the out fence
>>of the vm_unbind call; the user then ensures correctness by staging submissions
>>or vm_bind calls after the vm_unbind out fence signals.
>>
>>>range from
>>>>+ * device page table without waiting for any GPU job to 
>>>>complete. It is UMD's
>>>>+ * responsibility to ensure the mapping is no longer in use 
>>>>before calling
>>>>+ * VM_UNBIND.
>>>>+ *
>>>>+ * The @start and @length must specify a unique mapping bound 
>>>>with VM_BIND
>>>>+ * ioctl.
>>>>+ */
>>>>+struct drm_i915_gem_vm_unbind {
>>>>+    /** @vm_id: VM (address space) id to bind */
>>>>+    __u32 vm_id;
>>>>+
>>>>+    /** @rsvd: Reserved, MBZ */
>>>>+    __u32 rsvd;
>>>>+
>>>>+    /** @start: Virtual Address start to unbind */
>>>>+    __u64 start;
>>>>+
>>>>+    /** @length: Length of mapping to unbind */
>>>>+    __u64 length;
>>>>+
>>>>+    /**
>>>>+     * @flags: Supported flags are:
>>>>+     *
>>>>+     * I915_GEM_VM_UNBIND_FENCE_VALID:
>>>>+     * @fence is valid, needs unbind completion notification.
>>>>+     */
>>>>+    __u64 flags;
>>>>+#define I915_GEM_VM_UNBIND_FENCE_VALID    (1 << 0)
>>>>+
>>>>+    /** @fence: Timeline fence for unbind completion signaling */
>>>>+    struct drm_i915_gem_vm_bind_fence fence;
>>>
>>>I am not sure the simplified ioctl story is super coherent. If 
>>>everything is now fully async and out of order, but the input 
>>>fence has been dropped, then how is userspace supposed to handle 
>>>the address space? It will have to wait (in userspace) for unbinds 
>>>to complete before submitting subsequent binds which use the same 
>>>VA range.
>>>
>>
>>Yah, and Mesa apparently will have the support to handle it.
>
>
>Maybe there was miscommunication, but I thought things would be in 
>order with an out fence only.
>
>I didn't see out-of-order mentioned in our last internal discussion.
>

It was part of internal discussion with Mesa where we dropped multiple
queue support etc.

>I think we can deal with it anyway using a timeline semaphore.

:)

Niranjana

>
>
>>
>>>Maybe that's passable, but then the fact execbuf3 has no input 
>>>fence suggests a userspace wait between it and binds. And I am 
>>>pretty sure historically those were always quite bad for 
>>>performance.
>>>
>>
>>execbuf3 has the input fence through timeline fence array support.
>>
>>>Presumably userspace clients are happy with no input fences or it 
>>>was considered too costly to implement it?
>>>
>>
>>Yah, apparently Mesa can work with no input fence. This helps us in
>>focusing on rest of the VM_BIND feature delivery.
>>
>>Niranjana
>>
>>>Regards,
>>>
>>>Tvrtko
>>>
>>>>+
>>>>+    /** @extensions: 0-terminated chain of extensions */
>>>>+    __u64 extensions;
>>>>+};
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_execbuffer3 - Structure for 
>>>>DRM_I915_GEM_EXECBUFFER3
>>>>+ * ioctl.
>>>>+ *
>>>>+ * DRM_I915_GEM_EXECBUFFER3 ioctl only works in VM_BIND mode 
>>>>and VM_BIND mode
>>>>+ * only works with this ioctl for submission.
>>>>+ * See I915_VM_CREATE_FLAGS_USE_VM_BIND.
>>>>+ */
>>>>+struct drm_i915_gem_execbuffer3 {
>>>>+    /**
>>>>+     * @ctx_id: Context id
>>>>+     *
>>>>+     * Only contexts with user engine map are allowed.
>>>>+     */
>>>>+    __u32 ctx_id;
>>>>+
>>>>+    /**
>>>>+     * @engine_idx: Engine index
>>>>+     *
>>>>+     * An index in the user engine map of the context specified 
>>>>by @ctx_id.
>>>>+     */
>>>>+    __u32 engine_idx;
>>>>+
>>>>+    /** @rsvd1: Reserved, MBZ */
>>>>+    __u32 rsvd1;
>>>>+
>>>>+    /**
>>>>+     * @batch_count: Number of batches in @batch_address array.
>>>>+     *
>>>>+     * 0 is invalid. For parallel submission, it should be 
>>>>equal to the
>>>>+     * number of (parallel) engines involved in that submission.
>>>>+     */
>>>>+    __u32 batch_count;
>>>>+
>>>>+    /**
>>>>+     * @batch_address: Array of batch gpu virtual addresses.
>>>>+     *
>>>>+     * If @batch_count is 1, then it is the gpu virtual address of the
>>>>+     * batch buffer. If @batch_count > 1, then it is a pointer 
>>>>to an array
>>>>+     * of batch buffer gpu virtual addresses.
>>>>+     */
>>>>+    __u64 batch_address;
>>>>+
>>>>+    /**
>>>>+     * @flags: Supported flags are:
>>>>+     *
>>>>+     * I915_EXEC3_SECURE:
>>>>+     * Request a privileged ("secure") batch buffer/s.
>>>>+     * It is only available for DRM_ROOT_ONLY | DRM_MASTER processes.
>>>>+     */
>>>>+    __u64 flags;
>>>>+#define I915_EXEC3_SECURE    (1<<0)
>>>>+
>>>>+    /** @rsvd2: Reserved, MBZ */
>>>>+    __u64 rsvd2;
>>>>+
>>>>+    /**
>>>>+     * @extensions: Zero-terminated chain of extensions.
>>>>+     *
>>>>+     * DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES:
>>>>+     * It has same format as 
>>>>DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES.
>>>>+     * See struct drm_i915_gem_execbuffer_ext_timeline_fences.
>>>>+     */
>>>>+    __u64 extensions;
>>>>+#define DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES    0
>>>>+};
>>>>+
>>>>+/**
>>>>+ * struct drm_i915_gem_create_ext_vm_private - Extension to 
>>>>make the object
>>>>+ * private to the specified VM.
>>>>+ *
>>>>+ * See struct drm_i915_gem_create_ext.
>>>>+ */
>>>>+struct drm_i915_gem_create_ext_vm_private {
>>>>+#define I915_GEM_CREATE_EXT_VM_PRIVATE        2
>>>>+    /** @base: Extension link. See struct i915_user_extension. */
>>>>+    struct i915_user_extension base;
>>>>+
>>>>+    /** @vm_id: Id of the VM to which the object is private */
>>>>+    __u32 vm_id;
>>>>+};
>
>
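The @batch_count / @batch_address rule quoted above (a single GPU virtual address when @batch_count is 1, otherwise a pointer to an array of addresses) can be sketched as a small userspace helper. This is only an illustration: `struct execbuf3_sketch` is a local copy of the RFC fields, `execbuf3_set_batches()` is a hypothetical name, and returning -EINVAL for a zero count is an assumption of this sketch (the RFC only says that 0 is invalid).

```c
#include <errno.h>
#include <stdint.h>

/* Local copy of the RFC struct layout, for illustration only. */
struct execbuf3_sketch {
	uint32_t ctx_id;
	uint32_t engine_idx;
	uint32_t rsvd1;         /* reserved, must be zero */
	uint32_t batch_count;
	uint64_t batch_address;
	uint64_t flags;
	uint64_t rsvd2;         /* reserved, must be zero */
	uint64_t extensions;
};

/*
 * Hypothetical helper: fill @batch_count and @batch_address.
 * With one batch the field carries the GPU VA itself; with more than
 * one it carries a user pointer to the array of GPU VAs.
 */
static int execbuf3_set_batches(struct execbuf3_sketch *eb,
				const uint64_t *batch_vas, uint32_t count)
{
	if (count == 0)
		return -EINVAL; /* assumption: the RFC only says 0 is invalid */

	eb->batch_count = count;
	eb->batch_address = (count == 1) ?
		batch_vas[0] : (uint64_t)(uintptr_t)batch_vas;
	return 0;
}
```

The array passed for the multi-batch case has to stay alive until the ioctl returns, since only its address is stored in the struct.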

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-23  8:27           ` Tvrtko Ursulin
  2022-06-23  8:57             ` Lionel Landwerlin
@ 2022-06-23 14:47             ` Niranjana Vishwanathapura
  1 sibling, 0 replies; 34+ messages in thread
From: Niranjana Vishwanathapura @ 2022-06-23 14:47 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: paulo.r.zanoni, intel-gfx, chris.p.wilson, thomas.hellstrom,
	dri-devel, daniel.vetter, christian.koenig, matthew.auld

On Thu, Jun 23, 2022 at 09:27:22AM +0100, Tvrtko Ursulin wrote:
>
>On 22/06/2022 17:44, Niranjana Vishwanathapura wrote:
>>On Wed, Jun 22, 2022 at 04:57:17PM +0100, Tvrtko Ursulin wrote:
>>>
>>>On 22/06/2022 16:12, Niranjana Vishwanathapura wrote:
>>>>On Wed, Jun 22, 2022 at 09:10:07AM +0100, Tvrtko Ursulin wrote:
>>>>>
>>>>>On 22/06/2022 04:56, Niranjana Vishwanathapura wrote:
>>>>>>VM_BIND and related uapi definitions
>>>>>>
>>>>>>v2: Reduce the scope to simple Mesa use case.
>>>>>>v3: Expand VM_UNBIND documentation and add
>>>>>>    I915_GEM_VM_BIND/UNBIND_FENCE_VALID
>>>>>>    and I915_GEM_VM_BIND_TLB_FLUSH flags.
>>>>>>
>>>>>>Signed-off-by: Niranjana Vishwanathapura 
>>>>>><niranjana.vishwanathapura@intel.com>
>>>>>>---
>>>>>> Documentation/gpu/rfc/i915_vm_bind.h | 243 
>>>>>>+++++++++++++++++++++++++++
>>>>>> 1 file changed, 243 insertions(+)
>>>>>> create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
>>>>>>
>>>>>>diff --git a/Documentation/gpu/rfc/i915_vm_bind.h 
>>>>>>b/Documentation/gpu/rfc/i915_vm_bind.h
>>>>>>new file mode 100644
>>>>>>index 000000000000..fa23b2d7ec6f
>>>>>>--- /dev/null
>>>>>>+++ b/Documentation/gpu/rfc/i915_vm_bind.h
>>>>>>@@ -0,0 +1,243 @@
>>>>>>+/* SPDX-License-Identifier: MIT */
>>>>>>+/*
>>>>>>+ * Copyright © 2022 Intel Corporation
>>>>>>+ */
>>>>>>+
>>>>>>+/**
>>>>>>+ * DOC: I915_PARAM_HAS_VM_BIND
>>>>>>+ *
>>>>>>+ * VM_BIND feature availability.
>>>>>>+ * See typedef drm_i915_getparam_t param.
>>>>>>+ */
>>>>>>+#define I915_PARAM_HAS_VM_BIND        57
>>>>>>+
>>>>>>+/**
>>>>>>+ * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
>>>>>>+ *
>>>>>>+ * Flag to opt-in for VM_BIND mode of binding during VM creation.
>>>>>>+ * See struct drm_i915_gem_vm_control flags.
>>>>>>+ *
>>>>>>+ * The older execbuf2 ioctl will not support VM_BIND mode 
>>>>>>of operation.
>>>>>>+ * For VM_BIND mode, we have new execbuf3 ioctl which will 
>>>>>>not accept any
>>>>>>+ * execlist (See struct drm_i915_gem_execbuffer3 for more details).
>>>>>>+ *
>>>>>>+ */
>>>>>>+#define I915_VM_CREATE_FLAGS_USE_VM_BIND    (1 << 0)
>>>>>>+
>>>>>>+/* VM_BIND related ioctls */
>>>>>>+#define DRM_I915_GEM_VM_BIND        0x3d
>>>>>>+#define DRM_I915_GEM_VM_UNBIND        0x3e
>>>>>>+#define DRM_I915_GEM_EXECBUFFER3    0x3f
>>>>>>+
>>>>>>+#define DRM_IOCTL_I915_GEM_VM_BIND 
>>>>>>DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct 
>>>>>>drm_i915_gem_vm_bind)
>>>>>>+#define DRM_IOCTL_I915_GEM_VM_UNBIND 
>>>>>>DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct 
>>>>>>drm_i915_gem_vm_unbind)
>>>>>>+#define DRM_IOCTL_I915_GEM_EXECBUFFER3 
>>>>>>DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct 
>>>>>>drm_i915_gem_execbuffer3)
>>>>>>+
>>>>>>+/**
>>>>>>+ * struct drm_i915_gem_vm_bind_fence - Bind/unbind 
>>>>>>completion notification.
>>>>>>+ *
>>>>>>+ * A timeline out fence for vm_bind/unbind completion notification.
>>>>>>+ */
>>>>>>+struct drm_i915_gem_vm_bind_fence {
>>>>>>+    /** @handle: User's handle for a drm_syncobj to signal. */
>>>>>>+    __u32 handle;
>>>>>>+
>>>>>>+    /** @rsvd: Reserved, MBZ */
>>>>>>+    __u32 rsvd;
>>>>>>+
>>>>>>+    /**
>>>>>>+     * @value: A point in the timeline.
>>>>>>+     * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
>>>>>>+     * timeline drm_syncobj is invalid as it turns a 
>>>>>>drm_syncobj into a
>>>>>>+     * binary one.
>>>>>>+     */
>>>>>>+    __u64 value;
>>>>>>+};
>>>>>>+
>>>>>>+/**
>>>>>>+ * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
>>>>>>+ *
>>>>>>+ * This structure is passed to VM_BIND ioctl and specifies 
>>>>>>the mapping of GPU
>>>>>>+ * virtual address (VA) range to the section of an object 
>>>>>>that should be bound
>>>>>>+ * in the device page table of the specified address space (VM).
>>>>>>+ * The VA range specified must be unique (ie., not 
>>>>>>currently bound) and can
>>>>>>+ * be mapped to whole object or a section of the object 
>>>>>>(partial binding).
>>>>>>+ * Multiple VA mappings can be created to the same section 
>>>>>>of the object
>>>>>>+ * (aliasing).
>>>>>>+ *
>>>>>>+ * The @start, @offset and @length should be 4K page 
>>>>>>aligned. However the DG2
>>>>>>+ * and XEHPSDV has 64K page size for device local-memory 
>>>>>>and has compact page
>>>>>>+ * table. On those platforms, for binding device 
>>>>>>local-memory objects, the
>>>>>>+ * @start should be 2M aligned, @offset and @length should 
>>>>>>be 64K aligned.
>>>>>
>>>>>Should some error codes be documented and has the ability to 
>>>>>programmatically probe the alignment restrictions been 
>>>>>considered?
>>>>>
>>>>
>>>>Currently what we have internally is that -EINVAL is returned if 
>>>>the start, offset
>>>>and length are not aligned. If the specified mapping already 
>>>>exists, we return
>>>>-EEXIST. If there are conflicts in the VA range and VA range 
>>>>can't be reserved,
>>>>then -ENOSPC is returned. I can add this documentation here. But 
>>>>I am worried
>>>>that there will be more suggestions/feedback about error codes 
>>>>while reviewing
>>>>the code patch series, and we have to revisit it again.
>>>
>>>I'd still suggest documenting those three. It makes sense to 
>>>explain to userspace what behaviour they will see if they get it 
>>>wrong.
>>>
>>
>>Ok.
>>
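The alignment rules and the -EINVAL behaviour discussed above can be sketched as a userspace pre-check. This is only an illustration under assumptions: `struct bind_range`, `vm_bind_check_alignment()` and the `lmem_64k` parameter are hypothetical names, not part of the proposed uapi, and the kernel-side check may differ.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K  (UINT64_C(1) << 12)
#define SZ_64K (UINT64_C(1) << 16)
#define SZ_2M  (UINT64_C(1) << 21)

/* Hypothetical mirror of the @start, @offset and @length fields. */
struct bind_range {
	uint64_t start;
	uint64_t offset;
	uint64_t length;
};

static bool is_aligned(uint64_t v, uint64_t align)
{
	return (v & (align - 1)) == 0; /* align must be a power of two */
}

/*
 * lmem_64k: assumed true when binding a device local-memory object on a
 * platform with 64K local-memory pages and compact page tables (DG2,
 * XEHPSDV per the RFC text). Then @start must be 2M aligned and @offset
 * and @length 64K aligned; otherwise all three must be 4K aligned.
 */
static int vm_bind_check_alignment(const struct bind_range *r, bool lmem_64k)
{
	uint64_t start_align = lmem_64k ? SZ_2M : SZ_4K;
	uint64_t chunk_align = lmem_64k ? SZ_64K : SZ_4K;

	if (!is_aligned(r->start, start_align) ||
	    !is_aligned(r->offset, chunk_align) ||
	    !is_aligned(r->length, chunk_align))
		return -EINVAL;

	return 0;
}
```

The sketch covers only the alignment case; the -EEXIST and -ENOSPC cases depend on the state of the VM and cannot be checked from the request alone.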
>>>>>>+ * Also, on those platforms, it is not allowed to bind a 
>>>>>>device local-memory
>>>>>>+ * object and a system memory object in a single 2M section 
>>>>>>of VA range.
>>>>>
>>>>>Text should be clear whether "not allowed" means there will be 
>>>>>an error returned, or it will appear to work but bad things 
>>>>>will happen.
>>>>>
>>>>
>>>>Yah, error returned, will fix.
>>>>
>>>>>>+ */
>>>>>>+struct drm_i915_gem_vm_bind {
>>>>>>+    /** @vm_id: VM (address space) id to bind */
>>>>>>+    __u32 vm_id;
>>>>>>+
>>>>>>+    /** @handle: Object handle */
>>>>>>+    __u32 handle;
>>>>>>+
>>>>>>+    /** @start: Virtual Address start to bind */
>>>>>>+    __u64 start;
>>>>>>+
>>>>>>+    /** @offset: Offset in object to bind */
>>>>>>+    __u64 offset;
>>>>>>+
>>>>>>+    /** @length: Length of mapping to bind */
>>>>>>+    __u64 length;
>>>>>>+
>>>>>>+    /**
>>>>>>+     * @flags: Supported flags are:
>>>>>>+     *
>>>>>>+     * I915_GEM_VM_BIND_FENCE_VALID:
>>>>>>+     * @fence is valid, needs bind completion notification.
>>>>>>+     *
>>>>>>+     * I915_GEM_VM_BIND_READONLY:
>>>>>>+     * Mapping is read-only.
>>>>>>+     *
>>>>>>+     * I915_GEM_VM_BIND_CAPTURE:
>>>>>>+     * Capture this mapping in the dump upon GPU error.
>>>>>>+     *
>>>>>>+     * I915_GEM_VM_BIND_TLB_FLUSH:
>>>>>>+     * Flush the TLB for the specified range after bind completion.
>>>>>>+     */
>>>>>>+    __u64 flags;
>>>>>>+#define I915_GEM_VM_BIND_FENCE_VALID    (1 << 0)
>>>>>>+#define I915_GEM_VM_BIND_READONLY    (1 << 1)
>>>>>>+#define I915_GEM_VM_BIND_CAPTURE    (1 << 2)
>>>>>>+#define I915_GEM_VM_BIND_TLB_FLUSH    (1 << 3)
>>>>>
>>>>>What is the use case for allowing any random user to play with 
>>>>>(global) TLB flushing?
>>>>>
>>>>
>>>>I heard it from Daniel on intel-gfx, apparently it is a Mesa 
>>>>requirement.
>>>
>>>Okay I think that one needs clarifying.
>>>
>>
>>After chatting with Jason, I think we can remove it for now and
>>we can revisit it later if Mesa thinks it is required.
>
>IRC or some other thread?

#intel-gfx

Niranjana

>
>>>>>>+
>>>>>>+    /** @fence: Timeline fence for bind completion signaling */
>>>>>>+    struct drm_i915_gem_vm_bind_fence fence;
>>>>>
>>>>>As agreed the other day - please document in the main 
>>>>>kerneldoc section that all (un)binds are executed 
>>>>>asynchronously and out of order.
>>>>>
>>>>
>>>>I have added it in the latest revision of .rst file.
>>>
>>>Right, but I'd say to mention it in the uapi docs.
>>>
>>
>>Ok
>>
>>>>>>+
>>>>>>+    /** @extensions: 0-terminated chain of extensions */
>>>>>>+    __u64 extensions;
>>>>>>+};
>>>>>>+
>>>>>>+/**
>>>>>>+ * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
>>>>>>+ *
>>>>>>+ * This structure is passed to VM_UNBIND ioctl and 
>>>>>>specifies the GPU virtual
>>>>>>+ * address (VA) range that should be unbound from the 
>>>>>>device page table of the
>>>>>>+ * specified address space (VM). The specified VA range 
>>>>>>must match one of the
>>>>>>+ * mappings created with the VM_BIND ioctl. TLB is flushed 
>>>>>>upon unbind
>>>>>>+ * completion. The unbind operation will force unbind the specified
>>>>>
>>>>>Do we want to provide TLB flushing guarantees here and why? 
>>>>>(As opposed to leaving them for implementation details.) If 
>>>>>there is no implied order in either binds/unbinds, or between 
>>>>>the two intermixed, then what is the point of guaranteeing a 
>>>>>TLB flush on unbind completion?
>>>>>
>>>>
>>>>I think we ensure that the TLB is flushed before signaling the out fence
>>>>of the vm_unbind call; the user then ensures correctness by staging
>>>>submissions or vm_bind calls after the vm_unbind out fence signals.
>>>
>>>I don't see why this is required. Driver does not need to flush 
>>>immediately on unbind for correctness/security and neither for the 
>>>uapi contract. If there is no subsequent usage/bind then the flush 
>>>is pointless. And if the user re-binds to same VA range, against 
>>>an active VM, then perhaps the expectations need to be defined. Is 
>>>this supported or user error or what.
>>>
>>
>>After a vm_unbind, UMD can re-bind to same VA range against an active VM.
>>Though I am not sure with Mesa usecase if that new mapping is required for
>>running GPU job or it will be for the next submission. But ensuring the
>>tlb flush upon unbind, KMD can ensure correctness.
>
>Isn't that their problem? If they re-bind for submitting _new_ work 
>then they get the flush as part of batch buffer pre-amble.
>
>>Note that on platforms with selective TLB invalidation, it is not
>>as expensive as flushing the whole TLB. On platforms without selective
>>tlb invalidation, we can put some optimization later as mentioned
>>in the .rst file.
>>
>>Also note that UMDs can vm_unbind a mapping while VM is active.
>>By flushing the tlb, we ensure there is no inadvertent access to
>>mapping that no longer exists. I can add this to documentation.
>
>This one would surely be their problem. Kernel only needs to flush 
>when it decides to re-use the backing store.
>
>To be clear, overall I have reservations about offering strong 
>guarantees about the TLB flushing behaviour at the level of these two 
>ioctls. If we don't need to offer them it would be good to not do it, 
>otherwise we limit ourselves on the implementation side and more 
>importantly add a global performance hit where majority of userspace 
>do not need this guarantee to start with.
>
>The only thing I don't fully remember is how that compute use case is 
>supposed to work, where new work keeps getting submitted against a 
>running batch. Am I missing something there?
>
>>>>>range from
>>>>>>+ * device page table without waiting for any GPU job to 
>>>>>>complete. It is UMDs
>>>>>>+ * responsibility to ensure the mapping is no longer in use 
>>>>>>before calling
>>>>>>+ * VM_UNBIND.
>>>>>>+ *
>>>>>>+ * The @start and @length must specify a unique mapping 
>>>>>>bound with VM_BIND
>>>>>>+ * ioctl.
>>>>>>+ */
>>>>>>+struct drm_i915_gem_vm_unbind {
>>>>>>+    /** @vm_id: VM (address space) id to unbind */
>>>>>>+    __u32 vm_id;
>>>>>>+
>>>>>>+    /** @rsvd: Reserved, MBZ */
>>>>>>+    __u32 rsvd;
>>>>>>+
>>>>>>+    /** @start: Virtual Address start to unbind */
>>>>>>+    __u64 start;
>>>>>>+
>>>>>>+    /** @length: Length of mapping to unbind */
>>>>>>+    __u64 length;
>>>>>>+
>>>>>>+    /**
>>>>>>+     * @flags: Supported flags are:
>>>>>>+     *
>>>>>>+     * I915_GEM_VM_UNBIND_FENCE_VALID:
>>>>>>+     * @fence is valid, needs unbind completion notification.
>>>>>>+     */
>>>>>>+    __u64 flags;
>>>>>>+#define I915_GEM_VM_UNBIND_FENCE_VALID    (1 << 0)
>>>>>>+
>>>>>>+    /** @fence: Timeline fence for unbind completion signaling */
>>>>>>+    struct drm_i915_gem_vm_bind_fence fence;
>>>>>
>>>>>I am not sure the simplified ioctl story is super coherent. If 
>>>>>everything is now fully async and out of order, but the input 
>>>>>fence has been dropped, then how is userspace supposed to 
>>>>>handle the address space? It will have to wait (in userspace) 
>>>>>for unbinds to complete before submitting subsequent binds 
>>>>>which use the same VA range.
>>>>>
>>>>
>>>>Yah, and Mesa apparently will have the support to handle it.
>>>>
>>>>>Maybe that's passable, but then the fact execbuf3 has no input 
>>>>>fence suggests a userspace wait between it and binds. And I am 
>>>>>pretty sure historically those were always quite bad for 
>>>>>performance.
>>>>>
>>>>
>>>>execbuf3 has the input fence through timeline fence array support.
>>>
>>>I think I confused the field in execbuf3 for the output 
>>>fence.. So that part is fine, async binds chained with input fence 
>>>to execbuf3. Fire and forget for userspace.
>>>
>>>Although I then don't understand why execbuf3 wouldn't support an 
>>>output fence? What mechanism is userspace supposed to use for 
>>>that? Export a fence from batch buffer BO? That would be an extra 
>>>ioctl so if we can avoid it why not?
>>>
>>
>>execbuf3 supports out fence as well through timeline fence array.
>
>Ah okay, I am uninformed in this topic, sorry.
>
>Regards,
>
>Tvrtko

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-23 11:05               ` Tvrtko Ursulin
@ 2022-06-23 21:05                   ` Zeng, Oak
  2022-06-23 21:05                   ` Zeng, Oak
  1 sibling, 0 replies; 34+ messages in thread
From: Zeng, Oak @ 2022-06-23 21:05 UTC (permalink / raw)
  To: Tvrtko Ursulin, Landwerlin, Lionel G, Vishwanathapura, Niranjana
  Cc: Zanoni, Paulo R, intel-gfx, dri-devel, Hellstrom, Thomas, Wilson,
	Chris P, Vetter, Daniel, christian.koenig, Auld, Matthew



Regards,
Oak

> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Tvrtko
> Ursulin
> Sent: June 23, 2022 7:06 AM
> To: Landwerlin, Lionel G <lionel.g.landwerlin@intel.com>; Vishwanathapura,
> Niranjana <niranjana.vishwanathapura@intel.com>
> Cc: Zanoni, Paulo R <paulo.r.zanoni@intel.com>; intel-gfx@lists.freedesktop.org;
> dri-devel@lists.freedesktop.org; Hellstrom, Thomas <thomas.hellstrom@intel.com>;
> Wilson, Chris P <chris.p.wilson@intel.com>; Vetter, Daniel
> <daniel.vetter@intel.com>; christian.koenig@amd.com; Auld, Matthew
> <matthew.auld@intel.com>
> Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
> 
> 
> On 23/06/2022 09:57, Lionel Landwerlin wrote:
> > On 23/06/2022 11:27, Tvrtko Ursulin wrote:
> >>>
> >>> After a vm_unbind, UMD can re-bind to same VA range against an active
> >>> VM.
> >>> Though I am not sure with Mesa usecase if that new mapping is required
> >>> for
> >>> running GPU job or it will be for the next submission. But ensuring the
> >>> tlb flush upon unbind, KMD can ensure correctness.
> >>
> >> Isn't that their problem? If they re-bind for submitting _new_ work
> >> then they get the flush as part of batch buffer pre-amble.
> >
> > In the non sparse case, if a VA range is unbound, it is invalid to use
> > that range for anything until it has been rebound by something else.
> >
> > We'll take the fence provided by vm_bind and put it as a wait fence on
> > the next execbuffer.
> >
> > It might be safer in case of memory over fetching?
> >
> >
> > TLB flush will have to happen at some point right?
> >
> > What's the alternative to do it in unbind?
> 
> Currently TLB flush happens from the ring before every BB_START and also
> when i915 returns the backing store pages to the system.


Can you explain more about why the TLB is flushed when i915 retires the backing storage? I never figured that out when I looked at the code. As I understand it, the TLB caches the GPU page tables, which map a VA to a PA. So it is straightforward to me that we perform a TLB flush when we change the page table (either at vm_bind time or unbind time; better at unbind time for performance reasons).

But flushing the TLB when we retire the backing storage seems rather tricky to me. I don't see how the backing storage is connected to the page table. Let's say the user unbinds va1 from pa1, then binds va1 to pa2, then retires pa1, then submits shader code using va1. If we don't flush the TLB after unbinding va1, the new shader code, which is supposed to use pa2, will still use pa1 due to the stale entries in the TLB, right? The point is that TLB entries are tagged with the virtual address, not the physical address. So after we unbind va1 from pa1, regardless of whether we retire pa1 or not, va1 can be bound to another pa2.

Thanks,
Oak 


> 
> For the former, I haven't seen any mention that for execbuf3 there are
> plans to stop doing it? Anyway, as long as this is kept, a sequence of
> bind[1..N]+execbuf is safe and correctly sees all the preceding binds.
> Hence about the alternative to doing it in unbind - first I think let's
> state the problem that it is trying to solve.
> 
> For instance is it just for the compute "append work to the running
> batch" use case? I honestly don't remember how was that supposed to work
> so maybe the tlb flush on bind was supposed to deal with that scenario?
> 
> Or you see a problem even for Mesa with the current model?
> 
> Regards,
> 
> Tvrtko

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-23 21:05                   ` Zeng, Oak
  (?)
@ 2022-06-24  8:32                   ` Tvrtko Ursulin
  2022-06-24 20:23                       ` Zeng, Oak
  -1 siblings, 1 reply; 34+ messages in thread
From: Tvrtko Ursulin @ 2022-06-24  8:32 UTC (permalink / raw)
  To: Zeng, Oak, Landwerlin, Lionel G, Vishwanathapura, Niranjana
  Cc: Zanoni, Paulo R, intel-gfx, dri-devel, Hellstrom, Thomas, Wilson,
	Chris P, Vetter, Daniel, christian.koenig, Auld, Matthew

On 23/06/2022 22:05, Zeng, Oak wrote:
>> -----Original Message-----
>> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Tvrtko
>> Ursulin
>> Sent: June 23, 2022 7:06 AM
>> To: Landwerlin, Lionel G <lionel.g.landwerlin@intel.com>; Vishwanathapura,
>> Niranjana <niranjana.vishwanathapura@intel.com>
>> Cc: Zanoni, Paulo R <paulo.r.zanoni@intel.com>; intel-gfx@lists.freedesktop.org;
>> dri-devel@lists.freedesktop.org; Hellstrom, Thomas <thomas.hellstrom@intel.com>;
>> Wilson, Chris P <chris.p.wilson@intel.com>; Vetter, Daniel
>> <daniel.vetter@intel.com>; christian.koenig@amd.com; Auld, Matthew
>> <matthew.auld@intel.com>
>> Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
>>
>>
>> On 23/06/2022 09:57, Lionel Landwerlin wrote:
>>> On 23/06/2022 11:27, Tvrtko Ursulin wrote:
>>>>>
>>>>> After a vm_unbind, UMD can re-bind to same VA range against an active
>>>>> VM.
>>>>> Though I am not sure with Mesa usecase if that new mapping is required
>>>>> for
>>>>> running GPU job or it will be for the next submission. But ensuring the
>>>>> tlb flush upon unbind, KMD can ensure correctness.
>>>>
>>>> Isn't that their problem? If they re-bind for submitting _new_ work
>>>> then they get the flush as part of batch buffer pre-amble.
>>>
>>> In the non sparse case, if a VA range is unbound, it is invalid to use
>>> that range for anything until it has been rebound by something else.
>>>
>>> We'll take the fence provided by vm_bind and put it as a wait fence on
>>> the next execbuffer.
>>>
>>> It might be safer in case of memory over fetching?
>>>
>>>
>>> TLB flush will have to happen at some point right?
>>>
>>> What's the alternative to do it in unbind?
>>
>> Currently TLB flush happens from the ring before every BB_START and also
>> when i915 returns the backing store pages to the system.
> 
> 
> Can you explain more about why the TLB is flushed when i915 retires the backing storage? I never figured that out when I looked at the code. As I understand it, the TLB caches the GPU page tables, which map a VA to a PA. So it is straightforward to me that we perform a TLB flush when we change the page table (either at vm_bind time or unbind time; better at unbind time for performance reasons).

I don't know what performs better - someone can measure the two 
approaches? Certainly on platforms where we only have global TLB 
flushing the cost is quite high so my thinking was to allow i915 to 
control when it will be done and not guarantee it in the uapi if it 
isn't needed for security reasons.

> But flushing the TLB when we retire the backing storage seems rather tricky to me. I don't see how the backing storage is connected to the page table. Let's say the user unbinds va1 from pa1, then binds va1 to pa2, then retires pa1, then submits shader code using va1. If we don't flush the TLB after unbinding va1, the new shader code, which is supposed to use pa2, will still use pa1 due to the stale entries in the TLB, right? The point is that TLB entries are tagged with the virtual address, not the physical address. So after we unbind va1 from pa1, regardless of whether we retire pa1 or not, va1 can be bound to another pa2.

When you say "retire pa1" I will assume you meant release backing 
storage for pa1. At this point i915 currently does do the TLB flush and 
that ensures no PTE can point to pa1.

This approach deals with security of the system as a whole. Client may 
still cause rendering corruption or a GPU hang for itself but that 
should be completely isolated. (This is the part where you say 
"regardless if we retire pa1 or not" I think.)

But I think those are advanced use cases where userspace wants to 
manipulate PTEs while something is running on the GPU in parallel. AFAIK 
limited to compute "infinite batch" so my thinking is to avoid adding a 
performance penalty to the common case. Especially on platforms which 
only have global flush.

But.. to circle back on the measuring angle. Until someone invests time 
and effort to benchmark the two approaches (flush on unbind vs flush on 
backing store release) we don't really know. All I know is the perf hit 
with the current solution was significant, AFAIR up to teen digits on 
some games. And considering the flushes were driven only by the shrinker 
activity, my thinking was they would be less frequent than the unbinds, 
therefore have the potential for a smaller perf hit.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-24  8:32                   ` Tvrtko Ursulin
@ 2022-06-24 20:23                       ` Zeng, Oak
  0 siblings, 0 replies; 34+ messages in thread
From: Zeng, Oak @ 2022-06-24 20:23 UTC (permalink / raw)
  To: Tvrtko Ursulin, Landwerlin, Lionel G, Vishwanathapura, Niranjana
  Cc: Zanoni, Paulo R, intel-gfx, dri-devel, Hellstrom, Thomas, Wilson,
	Chris P, Vetter, Daniel, christian.koenig, Auld, Matthew

Let's compare "TLB invalidate at vm_unbind" vs "TLB invalidate at backing storage release":

Correctness:
Consider this sequence:
1. Unbind va1 from pa1.
2. Bind va1 to pa2. // user space has the freedom to do this, as it manages the virtual address space
3. Submit shader code using va1.
4. Retire pa1.

If you don't invalidate the TLB at step #1, then in step #3 the shader will hit stale TLB entries and access pa1, while the user wants it to use pa2. So I don't think invalidating the TLB at step #4 ensures correctness.
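The sequence above can be modelled with a toy VA-tagged TLB. This is a sketch only: `struct toy_vm`, `toy_init()`, `toy_bind()`, `toy_unbind()` and `toy_access()` are hypothetical names, and the model has one page-table entry and one TLB entry per VA slot. It deliberately ignores the flush that i915 emits from the ring before every BB_START, mentioned earlier in the thread, so it only describes the case where no such flush intervenes between steps 1 and 3.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_SLOTS 4
#define TOY_NO_PA UINT64_MAX

/* Toy model: one page-table entry and one TLB entry per VA slot. */
struct toy_vm {
	uint64_t pte[TOY_SLOTS]; /* page table: VA slot -> PA, or TOY_NO_PA */
	uint64_t tlb[TOY_SLOTS]; /* cached translation, or TOY_NO_PA */
};

static void toy_init(struct toy_vm *vm)
{
	for (unsigned int i = 0; i < TOY_SLOTS; i++) {
		vm->pte[i] = TOY_NO_PA;
		vm->tlb[i] = TOY_NO_PA;
	}
}

/* Callers must pass va < TOY_SLOTS. */
static void toy_bind(struct toy_vm *vm, unsigned int va, uint64_t pa)
{
	vm->pte[va] = pa;
}

static void toy_unbind(struct toy_vm *vm, unsigned int va, bool invalidate)
{
	vm->pte[va] = TOY_NO_PA;
	if (invalidate)
		vm->tlb[va] = TOY_NO_PA; /* range-based invalidate of this VA */
}

/* GPU access: use the cached translation if any, else walk and fill. */
static uint64_t toy_access(struct toy_vm *vm, unsigned int va)
{
	if (vm->tlb[va] == TOY_NO_PA)
		vm->tlb[va] = vm->pte[va];
	return vm->tlb[va];
}
```

In this model, an access after unbind-without-invalidate followed by a rebind still returns the old PA, because the lookup is keyed by the VA and the cached entry was never dropped.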


Performance:
It is straightforward to invalidate the TLB at step 1. If the platform supports range-based TLB invalidation, we can easily perform a range-based invalidation at step 1.
If you do it at step 4, you either need to perform a whole-GT TLB invalidation (worse performance), or you need to record all the VAs that this pa has been bound to and invalidate all those VA ranges, which makes for ugly code.


Thanks,
Oak

> -----Original Message-----
> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Sent: June 24, 2022 4:32 AM
> To: Zeng, Oak <oak.zeng@intel.com>; Landwerlin, Lionel G
> <lionel.g.landwerlin@intel.com>; Vishwanathapura, Niranjana
> <niranjana.vishwanathapura@intel.com>
> Cc: Zanoni, Paulo R <paulo.r.zanoni@intel.com>; intel-
> gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Hellstrom,
> Thomas <thomas.hellstrom@intel.com>; Wilson, Chris P
> <chris.p.wilson@intel.com>; Vetter, Daniel <daniel.vetter@intel.com>;
> christian.koenig@amd.com; Auld, Matthew <matthew.auld@intel.com>
> Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
> 
> 
> On 23/06/2022 22:05, Zeng, Oak wrote:
> >> -----Original Message-----
> >> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf
> >> Of Tvrtko Ursulin
> >> Sent: June 23, 2022 7:06 AM
> >> To: Landwerlin, Lionel G <lionel.g.landwerlin@intel.com>;
> >> Vishwanathapura, Niranjana <niranjana.vishwanathapura@intel.com>
> >> Cc: Zanoni, Paulo R <paulo.r.zanoni@intel.com>;
> >> intel-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org;
> >> Hellstrom, Thomas <thomas.hellstrom@intel.com>; Wilson, Chris P
> >> <chris.p.wilson@intel.com>; Vetter, Daniel <daniel.vetter@intel.com>;
> >> christian.koenig@amd.com; Auld, Matthew <matthew.auld@intel.com>
> >> Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi
> >> definition
> >>
> >>
> >> On 23/06/2022 09:57, Lionel Landwerlin wrote:
> >>> On 23/06/2022 11:27, Tvrtko Ursulin wrote:
> >>>>>
> >>>>> After a vm_unbind, UMD can re-bind to same VA range against an
> >>>>> active VM.
> >>>>> Though I am not sue with Mesa usecase if that new mapping is
> >>>>> required for running GPU job or it will be for the next
> >>>>> submission. But ensuring the tlb flush upon unbind, KMD can ensure
> >>>>> correctness.
> >>>>
> >>>> Isn't that their problem? If they re-bind for submitting _new_ work
> >>>> then they get the flush as part of batch buffer pre-amble.
> >>>
> >>> In the non sparse case, if a VA range is unbound, it is invalid to
> >>> use that range for anything until it has been rebound by something else.
> >>>
> >>> We'll take the fence provided by vm_bind and put it as a wait fence
> >>> on the next execbuffer.
> >>>
> >>> It might be safer in case of memory over fetching?
> >>>
> >>>
> >>> TLB flush will have to happen at some point right?
> >>>
> >>> What's the alternative to do it in unbind?
> >>
> >> Currently TLB flush happens from the ring before every BB_START and
> >> also when i915 returns the backing store pages to the system.
> >
> >
> > Can you explain more why tlb flush when i915 retire the backing storage? I
> > never figured that out when I looked at the codes. As I understand it, tlb
> > caches the gpu page tables which map a va to a pa. So it is straight forward to
> > me that we perform a tlb flush when we change the page table (either at vm
> > bind time or unbind time. Better at unbind time for performance reason).
> 
> I don't know what performs better - someone can measure the two
> approaches? Certainly on platforms where we only have global TLB flushing
> the cost is quite high so my thinking was to allow i915 to control when it will
> be done and not guarantee it in the uapi if it isn't needed for security reasons.
> 
> > But it is rather tricky to me to flush tlb when we retire a backing storage. I
> > don't see how backing storage can be connected to page table. Let's say user
> > unbind va1 from pa1, then bind va1 to pa2. Then retire pa1. Submit shader
> > code using va1. If we don't tlb flush after unbind va1, the new shader code
> > which is supposed to use pa2 will still use pa1 due to the stale entries in tlb,
> > right? The point is, tlb cached is tagged with virtual address, not physical
> > address. so after we unbind va1 from pa1, regardless we retire pa1 or not,
> > va1 can be bound to another pa2.
> 
> When you say "retire pa1" I will assume you meant release backing storage
> for pa1. At this point i915 currently does do the TLB flush and that ensures no
> PTE can point to pa1.
> 
> This approach deals with security of the system as a whole. Client may still
> cause rendering corruption or a GPU hang for itself but that should be
> completely isolated. (This is the part where you say "regardless if we retire
> pa1 or not" I think.)
> 
> But I think those are advanced use cases where userspace wants to
> manipulate PTEs while something is running on the GPU in parallel. AFAIK
> limited to compute "infinite batch" so my thinking is to avoid adding a
> performance penalty to the common case. Especially on platforms which only
> have global flush.
> 
> But.. to circle back on the measuring angle. Until someone invests time and
> effort to benchmark the two approaches (flush on unbind vs flush on backing
> store release) we don't really know. All I know is the perf hit with the current
> solution was significant, AFAIR up to teen digits on some games. And
> considering the flushes were driven only by the shrinker activity, my thinking
> was they would be less frequent than the unbinds, therefore have the
> potential for a smaller perf hit.
> 
> Regards,
> 
> Tvrtko

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-24 20:23                       ` Zeng, Oak
  (?)
@ 2022-06-27  8:30                       ` Tvrtko Ursulin
  2022-06-27 18:58                           ` Zeng, Oak
  -1 siblings, 1 reply; 34+ messages in thread
From: Tvrtko Ursulin @ 2022-06-27  8:30 UTC (permalink / raw)
  To: Zeng, Oak, Landwerlin, Lionel G, Vishwanathapura, Niranjana
  Cc: Zanoni, Paulo R, intel-gfx, dri-devel, Hellstrom, Thomas, Wilson,
	Chris P, Vetter, Daniel, christian.koenig, Auld, Matthew


On 24/06/2022 21:23, Zeng, Oak wrote:
> Let's compare "tlb invalidate at vm unbind" vs "tlb invalidate at backing storage":
> 
> Correctness:
> consider this sequence of:
> 1. unbind va1 from pa1,
> 2. then bind va1 to pa2. //user space has the freedom to do this as it manages virtual address space
> 3. Submit shader code using va1,
> 4. Then retire pa1.
> 
> If you don't perform tlb invalidate at step #1, in step #3, shader will use stale entries in tlb and pa1 will be used for the shader. User want to use pa2. So I don't think invalidate tlb at step #4 make correctness.

Define step 3. Is it a new execbuf? If so then there will be a TLB flush 
there. Unless the plan is to stop doing that with eb3 but I haven't 
picked up on that anywhere so far.
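
A toy sketch of this point, under the assumption that every new submission is preceded by a full TLB flush (the "before every BB_START" behaviour quoted below). The class and function names are invented for illustration and none of this is i915 code.

```python
# Toy model (not i915 code): the TLB is flushed when a new batch is submitted.
class ToyVm:
    def __init__(self):
        self.page_table = {}
        self.tlb = {}

    def bind(self, va, pa):
        self.page_table[va] = pa

    def unbind(self, va):
        del self.page_table[va]  # note: no TLB invalidation here

    def access(self, va):
        if va not in self.tlb:
            self.tlb[va] = self.page_table[va]
        return self.tlb[va]

    def submit(self, va):
        self.tlb.clear()         # flush emitted ahead of the new batch
        return self.access(va)


def new_submission_after_rebind():
    vm = ToyVm()
    vm.bind("va1", "pa1")
    vm.access("va1")
    vm.unbind("va1")
    vm.bind("va1", "pa2")
    return vm.submit("va1")      # new batch: flush first, then translate


def already_running_batch_after_rebind():
    vm = ToyVm()
    vm.bind("va1", "pa1")
    vm.submit("va1")             # long-running batch starts and caches va1
    vm.unbind("va1")
    vm.bind("va1", "pa2")
    return vm.access("va1")      # same batch keeps running, no new flush


print(new_submission_after_rebind())
print(already_running_batch_after_rebind())
```

In this model a new submission sees pa2 even without a flush at unbind. Only work that was already running when the mapping changed keeps using the stale pa1 translation, which matches the "infinite batch" case mentioned in the quoted text below.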

> Performance:
> It is straight forward to invalidate tlb at step 1. If platform support range based tlb invalidation, we can perform range based invalidation easily at step1.

If the platform supports range-based invalidation, yes. If it doesn't _and_ the flush
at unbind is not needed for 99% of use cases, then it is simply a waste.

> If you do it at step 4, you either need to perform a whole gt tlb invalidation (worse performance), or you need to record all the VAs that this pa has been bound to and invalidate all the VA ranges - ugly program.

Can someone set up some benchmarking? :)

Regards,

Tvrtko

> 
> 
> Thanks,
> Oak
> 
>> -----Original Message-----
>> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> Sent: June 24, 2022 4:32 AM
>> To: Zeng, Oak <oak.zeng@intel.com>; Landwerlin, Lionel G
>> <lionel.g.landwerlin@intel.com>; Vishwanathapura, Niranjana
>> <niranjana.vishwanathapura@intel.com>
>> Cc: Zanoni, Paulo R <paulo.r.zanoni@intel.com>; intel-
>> gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Hellstrom,
>> Thomas <thomas.hellstrom@intel.com>; Wilson, Chris P
>> <chris.p.wilson@intel.com>; Vetter, Daniel <daniel.vetter@intel.com>;
>> christian.koenig@amd.com; Auld, Matthew <matthew.auld@intel.com>
>> Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
>>
>>
>> On 23/06/2022 22:05, Zeng, Oak wrote:
>>>> -----Original Message-----
>>>> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf
>>>> Of Tvrtko Ursulin
>>>> Sent: June 23, 2022 7:06 AM
>>>> To: Landwerlin, Lionel G <lionel.g.landwerlin@intel.com>;
>>>> Vishwanathapura, Niranjana <niranjana.vishwanathapura@intel.com>
>>>> Cc: Zanoni, Paulo R <paulo.r.zanoni@intel.com>;
>>>> intel-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org;
>>>> Hellstrom, Thomas <thomas.hellstrom@intel.com>; Wilson, Chris P
>>>> <chris.p.wilson@intel.com>; Vetter, Daniel <daniel.vetter@intel.com>;
>>>> christian.koenig@amd.com; Auld, Matthew <matthew.auld@intel.com>
>>>> Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi
>>>> definition
>>>>
>>>>
>>>> On 23/06/2022 09:57, Lionel Landwerlin wrote:
>>>>> On 23/06/2022 11:27, Tvrtko Ursulin wrote:
>>>>>>>
>>>>>>> After a vm_unbind, UMD can re-bind to same VA range against an
>>>>>>> active VM.
>>>>>>> Though I am not sue with Mesa usecase if that new mapping is
>>>>>>> required for running GPU job or it will be for the next
>>>>>>> submission. But ensuring the tlb flush upon unbind, KMD can ensure
>>>>>>> correctness.
>>>>>>
>>>>>> Isn't that their problem? If they re-bind for submitting _new_ work
>>>>>> then they get the flush as part of batch buffer pre-amble.
>>>>>
>>>>> In the non sparse case, if a VA range is unbound, it is invalid to
>>>>> use that range for anything until it has been rebound by something else.
>>>>>
>>>>> We'll take the fence provided by vm_bind and put it as a wait fence
>>>>> on the next execbuffer.
>>>>>
>>>>> It might be safer in case of memory over fetching?
>>>>>
>>>>>
>>>>> TLB flush will have to happen at some point right?
>>>>>
>>>>> What's the alternative to do it in unbind?
>>>>
>>>> Currently TLB flush happens from the ring before every BB_START and
>>>> also when i915 returns the backing store pages to the system.
>>>
>>>
>>> Can you explain more why tlb flush when i915 retire the backing storage? I
>>> never figured that out when I looked at the codes. As I understand it, tlb
>>> caches the gpu page tables which map a va to a pa. So it is straight forward to
>>> me that we perform a tlb flush when we change the page table (either at vm
>>> bind time or unbind time. Better at unbind time for performance reason).
>>
>> I don't know what performs better - someone can measure the two
>> approaches? Certainly on platforms where we only have global TLB flushing
>> the cost is quite high so my thinking was to allow i915 to control when it will
>> be done and not guarantee it in the uapi if it isn't needed for security reasons.
>>
>>> But it is rather tricky to me to flush tlb when we retire a backing storage. I
>>> don't see how backing storage can be connected to page table. Let's say user
>>> unbind va1 from pa1, then bind va1 to pa2. Then retire pa1. Submit shader
>>> code using va1. If we don't tlb flush after unbind va1, the new shader code
>>> which is supposed to use pa2 will still use pa1 due to the stale entries in tlb,
>>> right? The point is, tlb cached is tagged with virtual address, not physical
>>> address. so after we unbind va1 from pa1, regardless we retire pa1 or not,
>>> va1 can be bound to another pa2.
>>
>> When you say "retire pa1" I will assume you meant release backing storage
>> for pa1. At this point i915 currently does do the TLB flush and that ensures no
>> PTE can point to pa1.
>>
>> This approach deals with security of the system as a whole. Client may still
>> cause rendering corruption or a GPU hang for itself but that should be
>> completely isolated. (This is the part where you say "regardless if we retire
>> pa1 or not" I think.)
>>
>> But I think those are advanced use cases where userspace wants to
>> manipulate PTEs while something is running on the GPU in parallel. AFAIK
>> limited to compute "infinite batch" so my thinking is to avoid adding a
>> performance penalty to the common case. Especially on platforms which only
>> have global flush.
>>
>> But.. to circle back on the measuring angle. Until someone invests time and
>> effort to benchmark the two approaches (flush on unbind vs flush on backing
>> store release) we don't really know. All I know is the perf hit with the current
>> solution was significant, AFAIR up to teen digits on some games. And
>> considering the flushes were driven only by the shrinker activity, my thinking
>> was they would be less frequent than the unbinds, therefore have the
>> potential for a smaller perf hit.
>>
>> Regards,
>>
>> Tvrtko

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
@ 2022-06-27 18:58                           ` Zeng, Oak
  0 siblings, 0 replies; 34+ messages in thread
From: Zeng, Oak @ 2022-06-27 18:58 UTC (permalink / raw)
  To: Tvrtko Ursulin, Landwerlin, Lionel G, Vishwanathapura, Niranjana
  Cc: Zanoni, Paulo R, intel-gfx, dri-devel, Hellstrom, Thomas, Wilson,
	Chris P, Vetter, Daniel, christian.koenig, Auld, Matthew



Thanks,
Oak

> -----Original Message-----
> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Sent: June 27, 2022 4:30 AM
> To: Zeng, Oak <oak.zeng@intel.com>; Landwerlin, Lionel G
> <lionel.g.landwerlin@intel.com>; Vishwanathapura, Niranjana
> <niranjana.vishwanathapura@intel.com>
> Cc: Zanoni, Paulo R <paulo.r.zanoni@intel.com>; intel-
> gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Hellstrom,
> Thomas <thomas.hellstrom@intel.com>; Wilson, Chris P
> <chris.p.wilson@intel.com>; Vetter, Daniel <daniel.vetter@intel.com>;
> christian.koenig@amd.com; Auld, Matthew <matthew.auld@intel.com>
> Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
> 
> 
> On 24/06/2022 21:23, Zeng, Oak wrote:
> > Let's compare "tlb invalidate at vm unbind" vs "tlb invalidate at backing
> > storage":
> >
> > Correctness:
> > consider this sequence of:
> > 1. unbind va1 from pa1,
> > 2. then bind va1 to pa2. //user space has the freedom to do this as it
> > manages virtual address space
> > 3. Submit shader code using va1,
> > 4. Then retire pa1.
> >
> > If you don't perform tlb invalidate at step #1, in step #3, shader will use
> > stale entries in tlb and pa1 will be used for the shader. User want to use pa2.
> > So I don't think invalidate tlb at step #4 make correctness.
> 
> Define step 3. Is it a new execbuf? If so then there will be a TLB flush there.
> Unless the plan is to stop doing that with eb3 but I haven't picked up on that
> anywhere so far.

In Niranjana's latest patch series, he removed the TLB flush from vm_unbind. He also said explicitly that TLB invalidation will be performed at job submission and at backing storage release time, which is the existing behavior of the current i915 driver.

I think if we invalidate the TLB on each vm_unbind, then we don't need to invalidate at submission and at backing storage release. It doesn't make a lot of sense to me to perform a TLB invalidation at execbuf time. Maybe it is a behavior left over from the old implicit binding programming model. For vm_bind and eb3, we separate binding and job submission into two APIs. It is more natural for the TLB invalidation to be coupled with vm bind/unbind, not with job submission. So in my opinion we should remove TLB invalidation from submission and backing storage release and add it to vm unbind. This method is cleaner to me.

Regarding performance, we don't have data. In my opinion, we should make things work in the most straightforward way as the first step, then consider performance improvements if necessary. Choosing delayed TLB invalidation at submission and backing storage release time without performance data to support it wasn't a good decision.
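
As a rough illustration of why data is needed, the sketch below counts how many invalidations each policy would trigger for a given event trace. The policy sets, the `count_invalidations` helper and both traces are made up for this example. They are not measurements, and a count says nothing about the cost of each individual invalidation (range-based vs global).

```python
from collections import Counter

# Hypothetical policies: which event types trigger a TLB invalidation.
FLUSH_ON_UNBIND = {"unbind"}
FLUSH_ON_SUBMIT_AND_RELEASE = {"submit", "release"}


def count_invalidations(trace, policy):
    """Count events in the trace that would trigger an invalidation under the policy."""
    counts = Counter(trace)
    return sum(counts[event] for event in policy)


# Two made-up traces, not measurements.
bind_heavy = ["unbind", "bind"] * 50 + ["submit"] * 5 + ["release"] * 2
submit_heavy = ["unbind", "bind"] * 2 + ["submit"] * 200 + ["release"] * 1

for name, trace in (("bind_heavy", bind_heavy), ("submit_heavy", submit_heavy)):
    print(name,
          count_invalidations(trace, FLUSH_ON_UNBIND),
          count_invalidations(trace, FLUSH_ON_SUBMIT_AND_RELEASE))
```

Which policy invalidates less often flips between the two traces, so the answer depends on the real ratio of unbinds to submissions and releases in actual workloads.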

Regards,
Oak

> 
> > Performance:
> > It is straight forward to invalidate tlb at step 1. If platform support range
> > based tlb invalidation, we can perform range based invalidation easily at
> > step1.
> 
> If the platform supports range base yes. If it doesn't _and_ the flush at
> unbind is not needed for 99% of use cases then it is simply a waste.
> 
> > If you do it at step 4, you either need to perform a whole gt tlb invalidation
> > (worse performance), or you need to record all the VAs that this pa has been
> > bound to and invalidate all the VA ranges - ugly program.
> 
> Someone can setup some benchmarking? :)
> 
> Regards,
> 
> Tvrtko
> 
> >
> >
> > Thanks,
> > Oak
> >
> >> -----Original Message-----
> >> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> >> Sent: June 24, 2022 4:32 AM
> >> To: Zeng, Oak <oak.zeng@intel.com>; Landwerlin, Lionel G
> >> <lionel.g.landwerlin@intel.com>; Vishwanathapura, Niranjana
> >> <niranjana.vishwanathapura@intel.com>
> >> Cc: Zanoni, Paulo R <paulo.r.zanoni@intel.com>; intel-
> >> gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org;
> >> Hellstrom, Thomas <thomas.hellstrom@intel.com>; Wilson, Chris P
> >> <chris.p.wilson@intel.com>; Vetter, Daniel <daniel.vetter@intel.com>;
> >> christian.koenig@amd.com; Auld, Matthew <matthew.auld@intel.com>
> >> Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi
> >> definition
> >>
> >>
> >> On 23/06/2022 22:05, Zeng, Oak wrote:
> >>>> -----Original Message-----
> >>>> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf
> >>>> Of Tvrtko Ursulin
> >>>> Sent: June 23, 2022 7:06 AM
> >>>> To: Landwerlin, Lionel G <lionel.g.landwerlin@intel.com>;
> >>>> Vishwanathapura, Niranjana <niranjana.vishwanathapura@intel.com>
> >>>> Cc: Zanoni, Paulo R <paulo.r.zanoni@intel.com>;
> >>>> intel-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org;
> >>>> Hellstrom, Thomas <thomas.hellstrom@intel.com>; Wilson, Chris P
> >>>> <chris.p.wilson@intel.com>; Vetter, Daniel
> >>>> <daniel.vetter@intel.com>; christian.koenig@amd.com; Auld,
> Matthew
> >>>> <matthew.auld@intel.com>
> >>>> Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi
> >>>> definition
> >>>>
> >>>>
> >>>> On 23/06/2022 09:57, Lionel Landwerlin wrote:
> >>>>> On 23/06/2022 11:27, Tvrtko Ursulin wrote:
> >>>>>>>
> >>>>>>> After a vm_unbind, UMD can re-bind to same VA range against an
> >>>>>>> active VM.
> >>>>>>> Though I am not sure with Mesa usecase if that new mapping is
> >>>>>>> required for running GPU job or it will be for the next
> >>>>>>> submission. But ensuring the tlb flush upon unbind, KMD can
> >>>>>>> ensure correctness.
> >>>>>>
> >>>>>> Isn't that their problem? If they re-bind for submitting _new_
> >>>>>> work then they get the flush as part of batch buffer pre-amble.
> >>>>>
> >>>>> In the non sparse case, if a VA range is unbound, it is invalid to
> >>>>> use that range for anything until it has been rebound by something
> else.
> >>>>>
> >>>>> We'll take the fence provided by vm_bind and put it as a wait
> >>>>> fence on the next execbuffer.
> >>>>>
> >>>>> It might be safer in case of memory over fetching?
> >>>>>
> >>>>>
> >>>>> TLB flush will have to happen at some point right?
> >>>>>
> >>>>> What's the alternative to do it in unbind?
> >>>>
> >>>> Currently TLB flush happens from the ring before every BB_START and
> >>>> also when i915 returns the backing store pages to the system.
> >>>
> >>>
> >>> Can you explain more why tlb flush when i915 retire the backing
> >>> storage? I never figured that out when I looked at the codes. As I
> >>> understand it, tlb caches the gpu page tables which map a va to a pa.
> >>> So it is straight forward to me that we perform a tlb flush when we
> >>> change the page table (either at vm bind time or unbind time. Better
> >>> at unbind time for performance reason).
> >>
> >> I don't know what performs better - someone can measure the two
> >> approaches? Certainly on platforms where we only have global TLB
> >> flushing the cost is quite high so my thinking was to allow i915 to
> >> control when it will be done and not guarantee it in the uapi if it
> >> isn't needed for security reasons.
> >>
> >>> But it is rather tricky to me to flush tlb when we retire a backing
> >>> storage. I don't see how backing storage can be connected to page
> >>> table. Let's say user unbind va1 from pa1, then bind va1 to pa2. Then
> >>> retire pa1. Submit shader code using va1. If we don't tlb flush after
> >>> unbind va1, the new shader code which is supposed to use pa2 will
> >>> still use pa1 due to the stale entries in tlb, right? The point is,
> >>> tlb cached is tagged with virtual address, not physical address. so
> >>> after we unbind va1 from pa1, regardless we retire pa1 or not, va1 can
> >>> be bound to another pa2.
> >>
> >> When you say "retire pa1" I will assume you meant release backing
> >> storage for pa1. At this point i915 currently does do the TLB flush
> >> and that ensures no PTE can point to pa1.
> >>
> >> This approach deals with security of the system as a whole. Client
> >> may still cause rendering corruption or a GPU hang for itself but
> >> that should be completely isolated. (This is the part where you say
> >> "regardless if we retire
> >> pa1 or not" I think.)
> >>
> >> But I think those are advanced use cases where userspace wants to
> >> manipulate PTEs while something is running on the GPU in parallel.
> >> AFAIK limited to compute "infinite batch" so my thinking is to avoid
> >> adding a performance penalty to the common case. Especially on
> >> platforms which only have global flush.
> >>
> >> But.. to circle back on the measuring angle. Until someone invests
> >> time and effort to benchmark the two approaches (flush on unbind vs
> >> flush on backing store release) we don't really know. All I know is
> >> the perf hit with the current solution was significant, AFAIR up to
> >> teen digits on some games. And considering the flushes were driven
> >> only by the shrinker activity, my thinking was they would be less
> >> frequent than the unbinds, therefore have the potential for a smaller
> >> perf hit.
> >>
> >> Regards,
> >>
> >> Tvrtko

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-27 18:58                           ` Zeng, Oak
  (?)
@ 2022-06-28  8:58                           ` Tvrtko Ursulin
  2022-06-28 13:53                               ` Zeng, Oak
  -1 siblings, 1 reply; 34+ messages in thread
From: Tvrtko Ursulin @ 2022-06-28  8:58 UTC (permalink / raw)
  To: Zeng, Oak, Landwerlin, Lionel G, Vishwanathapura, Niranjana
  Cc: Zanoni, Paulo R, intel-gfx, dri-devel, Hellstrom, Thomas, Wilson,
	Chris P, Vetter, Daniel, christian.koenig, Auld, Matthew


On 27/06/2022 19:58, Zeng, Oak wrote:
> 
> 
> Thanks,
> Oak
> 
>> -----Original Message-----
>> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> Sent: June 27, 2022 4:30 AM
>> To: Zeng, Oak <oak.zeng@intel.com>; Landwerlin, Lionel G
>> <lionel.g.landwerlin@intel.com>; Vishwanathapura, Niranjana
>> <niranjana.vishwanathapura@intel.com>
>> Cc: Zanoni, Paulo R <paulo.r.zanoni@intel.com>; intel-
>> gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Hellstrom,
>> Thomas <thomas.hellstrom@intel.com>; Wilson, Chris P
>> <chris.p.wilson@intel.com>; Vetter, Daniel <daniel.vetter@intel.com>;
>> christian.koenig@amd.com; Auld, Matthew <matthew.auld@intel.com>
>> Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
>>
>>
>> On 24/06/2022 21:23, Zeng, Oak wrote:
>>> Let's compare "tlb invalidate at vm unbind" vs "tlb invalidate at backing
>>> storage":
>>>
>>> Correctness:
>>> consider this sequence of:
>>> 1. unbind va1 from pa1,
>>> 2. then bind va1 to pa2. //user space has the freedom to do this as it
>>>    manages virtual address space
>>> 3. Submit shader code using va1,
>>> 4. Then retire pa1.
>>>
>>> If you don't perform tlb invalidate at step #1, in step #3, shader will use
>>> stale entries in tlb and pa1 will be used for the shader. User want to use
>>> pa2. So I don't think invalidate tlb at step #4 make correctness.
>>
>> Define step 3. Is it a new execbuf? If so then there will be a TLB flush there.
>> Unless the plan is to stop doing that with eb3 but I haven't picked up on that
>> anywhere so far.
> 
> In Niranjana's latest patch series, he removed the TLB flushing from vm_unbind. He also said explicitly TLB invalidation will be performed at job submission and backing storage releasing time, which is the existing behavior of the current i915 driver.
> 
> I think if we invalidate the TLB on each vm_unbind, then we don't need to invalidate at submission and backing storage release. It doesn't make a lot of sense to me to perform a TLB invalidation at execbuf time. Maybe it is a behavior of the old implicit binding programming model. For vm_bind and eb3, we separate binding and job submission into two APIs. It is more natural for the TLB invalidation to be coupled with vm bind/unbind, not job submission. So in my opinion we should remove TLB invalidation from submission and backing storage release and add it to vm unbind. This method is cleaner to me.

You can propose this model (not flushing in eb3) but I have my doubts. 
Consider the pointlessness of flushing on N unbinds for 99% of clients 
which are not infinite compute batch. And consider how you make the 
behaviour consistent on all platforms (selective vs global tlb flush).

Also note that this discussion is orthogonal to unbind vs backing store 
release.

> Regarding performance, we don't have data. In my opinion, we should first make things work in the most straightforward way, then consider performance improvements if necessary. Choosing delayed TLB invalidation at submission and backing storage release time without performance data to support it wasn't a good decision.

It is quite straightforward though. ;) It aligns with the eb2 model and 
argument can be made backing store release is (much) less frequent than 
unbind (consider softpin where client could trigger a lot of pointless 
flushes).

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 34+ messages in thread

* RE: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-28  8:58                           ` Tvrtko Ursulin
@ 2022-06-28 13:53                               ` Zeng, Oak
  0 siblings, 0 replies; 34+ messages in thread
From: Zeng, Oak @ 2022-06-28 13:53 UTC (permalink / raw)
  To: Tvrtko Ursulin, Landwerlin, Lionel G, Vishwanathapura, Niranjana
  Cc: Zanoni, Paulo R, intel-gfx, dri-devel, Hellstrom, Thomas, Wilson,
	Chris P, Vetter, Daniel, christian.koenig, Auld, Matthew



Thanks,
Oak

> -----Original Message-----
> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Sent: June 28, 2022 4:58 AM
> To: Zeng, Oak <oak.zeng@intel.com>; Landwerlin, Lionel G
> <lionel.g.landwerlin@intel.com>; Vishwanathapura, Niranjana
> <niranjana.vishwanathapura@intel.com>
> Cc: Zanoni, Paulo R <paulo.r.zanoni@intel.com>; intel-
> gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Hellstrom,
> Thomas <thomas.hellstrom@intel.com>; Wilson, Chris P
> <chris.p.wilson@intel.com>; Vetter, Daniel <daniel.vetter@intel.com>;
> christian.koenig@amd.com; Auld, Matthew <matthew.auld@intel.com>
> Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
> 
> 
> On 27/06/2022 19:58, Zeng, Oak wrote:
> >
> >
> > Thanks,
> > Oak
> >
> >> -----Original Message-----
> >> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> >> Sent: June 27, 2022 4:30 AM
> >> To: Zeng, Oak <oak.zeng@intel.com>; Landwerlin, Lionel G
> >> <lionel.g.landwerlin@intel.com>; Vishwanathapura, Niranjana
> >> <niranjana.vishwanathapura@intel.com>
> >> Cc: Zanoni, Paulo R <paulo.r.zanoni@intel.com>; intel-
> >> gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org;
> >> Hellstrom, Thomas <thomas.hellstrom@intel.com>; Wilson, Chris P
> >> <chris.p.wilson@intel.com>; Vetter, Daniel <daniel.vetter@intel.com>;
> >> christian.koenig@amd.com; Auld, Matthew <matthew.auld@intel.com>
> >> Subject: Re: [Intel-gfx] [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi
> >> definition
> >>
> >>
> >> On 24/06/2022 21:23, Zeng, Oak wrote:
> >>> Let's compare "tlb invalidate at vm unbind" vs "tlb invalidate at
> >>> backing storage":
> >>>
> >>> Correctness:
> >>> consider this sequence of:
> >>> 1. unbind va1 from pa1,
> >>> 2. then bind va1 to pa2. //user space has the freedom to do this as
> >>>    it manages virtual address space
> >>> 3. Submit shader code using va1,
> >>> 4. Then retire pa1.
> >>>
> >>> If you don't perform tlb invalidate at step #1, in step #3, shader
> >>> will use stale entries in tlb and pa1 will be used for the shader.
> >>> User want to use pa2. So I don't think invalidate tlb at step #4 make
> >>> correctness.
> >>
> >> Define step 3. Is it a new execbuf? If so then there will be a TLB flush
> >> there. Unless the plan is to stop doing that with eb3 but I haven't
> >> picked up on that anywhere so far.
> >
> > In Niranjana's latest patch series, he removed the TLB flushing from
> > vm_unbind. He also said explicitly TLB invalidation will be performed at
> > job submission and backing storage releasing time, which is the existing
> > behavior of the current i915 driver.
> >
> > I think if we invalidate the TLB on each vm_unbind, then we don't need to
> > invalidate at submission and backing storage release. It doesn't make a
> > lot of sense to me to perform a TLB invalidation at execbuf time. Maybe it
> > is a behavior of the old implicit binding programming model. For vm_bind
> > and eb3, we separate binding and job submission into two APIs. It is more
> > natural for the TLB invalidation to be coupled with vm bind/unbind, not
> > job submission. So in my opinion we should remove TLB invalidation from
> > submission and backing storage release and add it to vm unbind. This
> > method is cleaner to me.
> 
> You can propose this model (not flushing in eb3) but I have my doubts.
> Consider the pointlessness of flushing on N unbinds for 99% of clients which
> are not infinite compute batch. And consider how you make the behaviour
> consistent on all platforms (selective vs global tlb flush).

When I thought about eb3, compute workloads and ULLS were also in the picture. Under ULLS, user mode keeps submitting jobs without calling execbuf (it uses a semaphore to notify the HW of the new batch). The execbuf + backing release flush has a correctness issue there, as I pointed out. Now that we have decided eb3 is only for Mesa, not for compute, we don't have this correctness problem for now. We can close this conversation and revive it when we move to Xe and vm bind for compute.
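
To make the sequence from earlier in the thread concrete, here is a toy model (a sketch only, not i915 code). It assumes a single-entry TLB tagged by va1, and every toy_* name and TOY_* constant is hypothetical. It returns the physical address the shader ends up using at step 3, depending on whether a flush happens at unbind, at submission, or at neither point (the ULLS-like case where no execbuf is called).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_UNMAPPED 0x0ULL
#define TOY_PA1      0x1000ULL
#define TOY_PA2      0x2000ULL

/* Toy address space with exactly one virtual address (va1). */
struct toy_vm {
	uint64_t pte_pa; /* PA the page table maps va1 to, TOY_UNMAPPED if unbound */
	uint64_t tlb_pa; /* PA cached in the TLB for va1, TOY_UNMAPPED if no entry */
};

static void toy_tlb_flush(struct toy_vm *vm)
{
	vm->tlb_pa = TOY_UNMAPPED;
}

static void toy_unbind(struct toy_vm *vm, bool flush)
{
	vm->pte_pa = TOY_UNMAPPED;
	if (flush)
		toy_tlb_flush(vm);
}

static void toy_bind(struct toy_vm *vm, uint64_t pa)
{
	assert(pa != TOY_UNMAPPED);
	vm->pte_pa = pa;
}

/* GPU access to va1: a TLB hit wins, a miss walks the page table and fills. */
static uint64_t toy_access(struct toy_vm *vm)
{
	if (vm->tlb_pa == TOY_UNMAPPED)
		vm->tlb_pa = vm->pte_pa;
	return vm->tlb_pa;
}

/* Run steps 1-3 and return the PA the shader uses in step 3. */
static uint64_t toy_step3_pa(bool flush_on_unbind, bool flush_on_submit)
{
	struct toy_vm vm = { .pte_pa = TOY_PA1, .tlb_pa = TOY_UNMAPPED };

	toy_access(&vm);                  /* earlier work caches va1 -> pa1 */
	toy_unbind(&vm, flush_on_unbind); /* step 1: unbind va1 from pa1 */
	toy_bind(&vm, TOY_PA2);           /* step 2: bind va1 to pa2 */
	if (flush_on_submit)
		toy_tlb_flush(&vm);       /* flush ahead of the batch start */
	return toy_access(&vm);           /* step 3: shader touches va1 */
}
```

In this model toy_step3_pa(true, false) and toy_step3_pa(false, true) both resolve to TOY_PA2, while toy_step3_pa(false, false) still resolves to TOY_PA1. A flush at step 4 (retiring pa1) would come after the stale access has already happened, which is the concern for submission paths that bypass execbuf.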

Regards,
Oak


> 
> Also note that this discussion is orthogonal to unbind vs backing store release.
> 
> > Regarding performance, we don't have data. In my opinion, we should
> > first make things work in the most straightforward way, then consider
> > performance improvements if necessary. Choosing delayed TLB invalidation
> > at submission and backing storage release time without performance data
> > to support it wasn't a good decision.
> 
> It is quite straightforward though. ;) It aligns with the eb2 model and
> argument can be made backing store release is (much) less frequent than
> unbind (consider softpin where client could trigger a lot of pointless flushes).
> 
> Regards,
> 
> Tvrtko

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition
  2022-06-22 18:50 [PATCH v4 0/3] " Niranjana Vishwanathapura
@ 2022-06-22 18:50 ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 34+ messages in thread
From: Niranjana Vishwanathapura @ 2022-06-22 18:50 UTC (permalink / raw)
  To: intel-gfx, dri-devel
  Cc: matthew.brost, paulo.r.zanoni, lionel.g.landwerlin,
	tvrtko.ursulin, chris.p.wilson, thomas.hellstrom, oak.zeng,
	matthew.auld, jason, daniel.vetter, christian.koenig

VM_BIND and related uapi definitions

v2: Reduce the scope to simple Mesa use case.
v3: Expand VM_UNBIND documentation and add
    I915_GEM_VM_BIND/UNBIND_FENCE_VALID
    and I915_GEM_VM_BIND_TLB_FLUSH flags.
v4: Remove I915_GEM_VM_BIND_TLB_FLUSH flag and add additional
    documentation for vm_bind/unbind.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 Documentation/gpu/rfc/i915_vm_bind.h | 252 +++++++++++++++++++++++++++
 1 file changed, 252 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h

diff --git a/Documentation/gpu/rfc/i915_vm_bind.h b/Documentation/gpu/rfc/i915_vm_bind.h
new file mode 100644
index 000000000000..7248791a4513
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_vm_bind.h
@@ -0,0 +1,252 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+/**
+ * DOC: I915_PARAM_HAS_VM_BIND
+ *
+ * VM_BIND feature availability.
+ * See typedef drm_i915_getparam_t param.
+ */
+#define I915_PARAM_HAS_VM_BIND		57
+
+/**
+ * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
+ *
+ * Flag to opt-in for VM_BIND mode of binding during VM creation.
+ * See struct drm_i915_gem_vm_control flags.
+ *
+ * The older execbuf2 ioctl will not support VM_BIND mode of operation.
+ * For VM_BIND mode, we have new execbuf3 ioctl which will not accept any
+ * execlist (See struct drm_i915_gem_execbuffer3 for more details).
+ *
+ */
+#define I915_VM_CREATE_FLAGS_USE_VM_BIND	(1 << 0)
+
+/* VM_BIND related ioctls */
+#define DRM_I915_GEM_VM_BIND		0x3d
+#define DRM_I915_GEM_VM_UNBIND		0x3e
+#define DRM_I915_GEM_EXECBUFFER3	0x3f
+
+#define DRM_IOCTL_I915_GEM_VM_BIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
+#define DRM_IOCTL_I915_GEM_VM_UNBIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
+#define DRM_IOCTL_I915_GEM_EXECBUFFER3		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)
+
+/**
+ * struct drm_i915_gem_vm_bind_fence - Bind/unbind completion notification.
+ *
+ * A timeline out fence for vm_bind/unbind completion notification.
+ */
+struct drm_i915_gem_vm_bind_fence {
+	/** @handle: User's handle for a drm_syncobj to signal. */
+	__u32 handle;
+
+	/** @rsvd: Reserved, MBZ */
+	__u32 rsvd;
+
+	/**
+	 * @value: A point in the timeline.
+	 * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
+	 * timeline drm_syncobj is invalid as it turns a drm_syncobj into a
+	 * binary one.
+	 */
+	__u64 value;
+};
+
+/**
+ * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
+ *
+ * This structure is passed to VM_BIND ioctl and specifies the mapping of GPU
+ * virtual address (VA) range to the section of an object that should be bound
+ * in the device page table of the specified address space (VM).
+ * The VA range specified must be unique (ie., not currently bound) and can
+ * be mapped to whole object or a section of the object (partial binding).
+ * Multiple VA mappings can be created to the same section of the object
+ * (aliasing).
+ *
+ * The @start, @offset and @length should be 4K page aligned. However, DG2
+ * and XEHPSDV have a 64K page size for device local-memory and a compact page
+ * table. On those platforms, for binding device local-memory objects, the
+ * @start should be 2M aligned, @offset and @length should be 64K aligned.
+ * Also, on those platforms, error -ENOSPC will be returned if user tries to
+ * bind a device local-memory object and a system memory object in a single 2M
+ * section of VA range.
+ *
+ * Error code -EINVAL will be returned if @start, @offset and @length are not
+ * properly aligned. Error code of -ENOSPC will be returned if the VA range
+ * specified can't be reserved.
+ *
+ * The bind operation can get completed asynchronously and out of submission
+ * order. When I915_GEM_VM_BIND_FENCE_VALID flag is set, the @fence will be
+ * signaled upon completion of bind operation.
+ */
+struct drm_i915_gem_vm_bind {
+	/** @vm_id: VM (address space) id to bind */
+	__u32 vm_id;
+
+	/** @handle: Object handle */
+	__u32 handle;
+
+	/** @start: Virtual Address start to bind */
+	__u64 start;
+
+	/** @offset: Offset in object to bind */
+	__u64 offset;
+
+	/** @length: Length of mapping to bind */
+	__u64 length;
+
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_GEM_VM_BIND_FENCE_VALID:
+	 * @fence is valid, needs bind completion notification.
+	 *
+	 * I915_GEM_VM_BIND_READONLY:
+	 * Mapping is read-only.
+	 *
+	 * I915_GEM_VM_BIND_CAPTURE:
+	 * Capture this mapping in the dump upon GPU error.
+	 */
+	__u64 flags;
+#define I915_GEM_VM_BIND_FENCE_VALID	(1 << 0)
+#define I915_GEM_VM_BIND_READONLY	(1 << 1)
+#define I915_GEM_VM_BIND_CAPTURE	(1 << 2)
+
+	/** @fence: Timeline fence for bind completion signaling */
+	struct drm_i915_gem_vm_bind_fence fence;
+
+	/** @extensions: 0-terminated chain of extensions */
+	__u64 extensions;
+};
+
+/**
+ * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
+ *
+ * This structure is passed to VM_UNBIND ioctl and specifies the GPU virtual
+ * address (VA) range that should be unbound from the device page table of the
+ * specified address space (VM). The specified VA range must match one of the
+ * mappings created with the VM_BIND ioctl. TLB is flushed upon unbind
+ * completion. The unbind operation will force unbind the specified range from
+ * device page table without waiting for any GPU job to complete. It is UMD's
+ * responsibility to ensure the mapping is no longer in use before calling
+ * VM_UNBIND.
+ *
+ * If the specified mapping is not found, the ioctl will simply return without
+ * any error.
+ *
+ * The unbind operation can get completed asynchronously and out of submission
+ * order. When I915_GEM_VM_UNBIND_FENCE_VALID flag is set, the @fence will be
+ * signaled upon completion of unbind operation.
+ */
+struct drm_i915_gem_vm_unbind {
+	/** @vm_id: VM (address space) id to bind */
+	__u32 vm_id;
+
+	/** @rsvd: Reserved, MBZ */
+	__u32 rsvd;
+
+	/** @start: Virtual Address start to unbind */
+	__u64 start;
+
+	/** @length: Length of mapping to unbind */
+	__u64 length;
+
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_GEM_VM_UNBIND_FENCE_VALID:
+	 * @fence is valid, needs unbind completion notification.
+	 */
+	__u64 flags;
+#define I915_GEM_VM_UNBIND_FENCE_VALID	(1 << 0)
+
+	/** @fence: Timeline fence for unbind completion signaling */
+	struct drm_i915_gem_vm_bind_fence fence;
+
+	/** @extensions: 0-terminated chain of extensions */
+	__u64 extensions;
+};
+
+/**
+ * struct drm_i915_gem_execbuffer3 - Structure for DRM_I915_GEM_EXECBUFFER3
+ * ioctl.
+ *
+ * DRM_I915_GEM_EXECBUFFER3 ioctl only works in VM_BIND mode and VM_BIND mode
+ * only works with this ioctl for submission.
+ * See I915_VM_CREATE_FLAGS_USE_VM_BIND.
+ */
+struct drm_i915_gem_execbuffer3 {
+	/**
+	 * @ctx_id: Context id
+	 *
+	 * Only contexts with user engine map are allowed.
+	 */
+	__u32 ctx_id;
+
+	/**
+	 * @engine_idx: Engine index
+	 *
+	 * An index in the user engine map of the context specified by @ctx_id.
+	 */
+	__u32 engine_idx;
+
+	/** @rsvd1: Reserved, MBZ */
+	__u32 rsvd1;
+
+	/**
+	 * @batch_count: Number of batches in @batch_address array.
+	 *
+	 * 0 is invalid. For parallel submission, it should be equal to the
+	 * number of (parallel) engines involved in that submission.
+	 */
+	__u32 batch_count;
+
+	/**
+	 * @batch_address: Array of batch gpu virtual addresses.
+	 *
+	 * If @batch_count is 1, then it is the gpu virtual address of the
+	 * batch buffer. If @batch_count > 1, then it is a pointer to an array
+	 * of batch buffer gpu virtual addresses.
+	 */
+	__u64 batch_address;
+
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_EXEC3_SECURE:
+	 * Request a privileged ("secure") batch buffer/s.
+	 * It is only available for DRM_ROOT_ONLY | DRM_MASTER processes.
+	 */
+	__u64 flags;
+#define I915_EXEC3_SECURE	(1<<0)
+
+	/** @rsvd2: Reserved, MBZ */
+	__u64 rsvd2;
+
+	/**
+	 * @extensions: Zero-terminated chain of extensions.
+	 *
+	 * DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES:
+	 * It has same format as DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES.
+	 * See struct drm_i915_gem_execbuffer_ext_timeline_fences.
+	 */
+	__u64 extensions;
+#define DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES	0
+};
+
+/**
+ * struct drm_i915_gem_create_ext_vm_private - Extension to make the object
+ * private to the specified VM.
+ *
+ * See struct drm_i915_gem_create_ext.
+ */
+struct drm_i915_gem_create_ext_vm_private {
+#define I915_GEM_CREATE_EXT_VM_PRIVATE		2
+	/** @base: Extension link. See struct i915_user_extension. */
+	struct i915_user_extension base;
+
+	/** @vm_id: Id of the VM to which the object is private */
+	__u32 vm_id;
+};
-- 
2.21.0.rc0.32.g243a4c7e27
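
As an illustration of the alignment rules documented for struct drm_i915_gem_vm_bind in the patch above, a userspace-side check could look roughly like the sketch below. The two structs are local copies of the RFC definitions (with uint32_t/uint64_t standing in for __u32/__u64), not an installed uapi header. The sketch_* helpers, the SKETCH_SZ_* constants and the lmem_64k parameter are hypothetical names introduced only for this example. No ioctl is issued, since the interface is still an RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the RFC structs from the patch; layout only. */
struct drm_i915_gem_vm_bind_fence {
	uint32_t handle;
	uint32_t rsvd;
	uint64_t value;
};

struct drm_i915_gem_vm_bind {
	uint32_t vm_id;
	uint32_t handle;
	uint64_t start;
	uint64_t offset;
	uint64_t length;
	uint64_t flags;
	struct drm_i915_gem_vm_bind_fence fence;
	uint64_t extensions;
};

/* Hypothetical size constants for this sketch. */
#define SKETCH_SZ_4K  (4ULL << 10)
#define SKETCH_SZ_64K (64ULL << 10)
#define SKETCH_SZ_2M  (2ULL << 20)

static bool sketch_aligned(uint64_t value, uint64_t align)
{
	return (value & (align - 1)) == 0;
}

/*
 * Fill *bind if the request satisfies the documented alignment, else
 * return -EINVAL and leave *bind untouched. lmem_64k selects the rule
 * documented for device local-memory objects on DG2 and XEHPSDV:
 * @start 2M aligned, @offset and @length 64K aligned. Otherwise all
 * three must be 4K aligned.
 */
static int sketch_fill_vm_bind(struct drm_i915_gem_vm_bind *bind,
			       uint32_t vm_id, uint32_t handle,
			       uint64_t start, uint64_t offset,
			       uint64_t length, bool lmem_64k)
{
	const uint64_t start_align = lmem_64k ? SKETCH_SZ_2M : SKETCH_SZ_4K;
	const uint64_t range_align = lmem_64k ? SKETCH_SZ_64K : SKETCH_SZ_4K;

	assert(bind != NULL);

	if (!sketch_aligned(start, start_align) ||
	    !sketch_aligned(offset, range_align) ||
	    !sketch_aligned(length, range_align))
		return -EINVAL;

	memset(bind, 0, sizeof(*bind)); /* reserved fields must be zero */
	bind->vm_id = vm_id;
	bind->handle = handle;
	bind->start = start;
	bind->offset = offset;
	bind->length = length;
	return 0;
}
```

The check mirrors the -EINVAL case described in the kernel-doc comment. The -ENOSPC cases (VA range cannot be reserved, or mixing local and system memory within one 2M section) depend on kernel-side state and are not modeled here.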


^ permalink raw reply related	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2022-06-28 13:54 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
2022-06-22  3:56 [PATCH v3 0/3] drm/doc/rfc: i915 VM_BIND feature design + uapi Niranjana Vishwanathapura
2022-06-22  3:56 ` [Intel-gfx] " Niranjana Vishwanathapura
2022-06-22  3:56 ` [PATCH v3 1/3] drm/doc/rfc: VM_BIND feature design document Niranjana Vishwanathapura
2022-06-22  3:56   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-06-22  3:56 ` [PATCH v3 2/3] drm/i915: Update i915 uapi documentation Niranjana Vishwanathapura
2022-06-22  3:56   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-06-22  3:56 ` [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition Niranjana Vishwanathapura
2022-06-22  3:56   ` [Intel-gfx] " Niranjana Vishwanathapura
2022-06-22  8:10   ` Tvrtko Ursulin
2022-06-22 15:12     ` Niranjana Vishwanathapura
2022-06-22 15:57       ` Tvrtko Ursulin
2022-06-22 16:44         ` Niranjana Vishwanathapura
2022-06-22 18:53           ` Niranjana Vishwanathapura
2022-06-23  8:27           ` Tvrtko Ursulin
2022-06-23  8:57             ` Lionel Landwerlin
2022-06-23 11:05               ` Tvrtko Ursulin
2022-06-23 12:41                 ` Lionel Landwerlin
2022-06-23 21:05                 ` Zeng, Oak
2022-06-23 21:05                   ` Zeng, Oak
2022-06-24  8:32                   ` Tvrtko Ursulin
2022-06-24 20:23                     ` Zeng, Oak
2022-06-24 20:23                       ` Zeng, Oak
2022-06-27  8:30                       ` Tvrtko Ursulin
2022-06-27 18:58                         ` Zeng, Oak
2022-06-27 18:58                           ` Zeng, Oak
2022-06-28  8:58                           ` Tvrtko Ursulin
2022-06-28 13:53                             ` Zeng, Oak
2022-06-28 13:53                               ` Zeng, Oak
2022-06-23 14:47             ` Niranjana Vishwanathapura
2022-06-23  9:28       ` Lionel Landwerlin
2022-06-23 14:43         ` Niranjana Vishwanathapura
2022-06-23 14:43           ` Niranjana Vishwanathapura
2022-06-22 19:49 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/doc/rfc: i915 VM_BIND feature design + uapi Patchwork
2022-06-22 18:50 [PATCH v4 0/3] " Niranjana Vishwanathapura
2022-06-22 18:50 ` [PATCH v3 3/3] drm/doc/rfc: VM_BIND uapi definition Niranjana Vishwanathapura
