All of lore.kernel.org
* [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support
@ 2022-10-18  7:16 Niranjana Vishwanathapura
  2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 01/11] lib/vm_bind: import uapi definitions Niranjana Vishwanathapura
                   ` (12 more replies)
  0 siblings, 13 replies; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18  7:16 UTC (permalink / raw)
  To: igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, matthew.auld, daniel.vetter,
	petri.latvala

The DRM_I915_GEM_VM_BIND/UNBIND ioctls allow the UMD to bind/unbind GEM
buffer objects (BOs), or sections of a BO, at specified GPU virtual
addresses in a specified address space (VM). Multiple mappings can map
to the same physical pages of an object (aliasing). These mappings (also
referred to as persistent mappings) persist across multiple GPU
submissions (execbuf calls) issued by the UMD, without the user having
to provide a list of all required mappings during each submission (as
required by the older execbuf mode).

The new execbuf3 ioctl (I915_GEM_EXECBUFFER3) will only work in vm_bind
mode. The vm_bind mode only works with this new execbuf3 ioctl.

Add sanity tests to validate the VM_BIND, VM_UNBIND and execbuf3 ioctls.

Add basic test to create and VM_BIND the objects and issue execbuf3 for
GPU to copy the data from a source to destination buffer.

TODOs:
* More validation support.
* Port some relevant gem_exec_* tests for execbuf3.

NOTEs:
* It is based on the VM_BIND design + uapi RFC below:
  Documentation/gpu/rfc/i915_vm_bind.rst

* The i915 VM_BIND support is posted as,
  [PATCH v4 00/17] drm/i915/vm_bind: Add VM_BIND functionality

v2: Address various review comments
v3: Add gem_exec3_basic and gem_exec3_balancer tests,
    add gem_lmem_swapping@vm_bind subtest and some fixes.
v4: Add and use i915_vm_bind library functions,
    add lmem test if lmem region is present instead of skipping,
    remove helpers to create vm_private objects, instead use
    gem_create_ext() function.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>

Niranjana Vishwanathapura (11):
  lib/vm_bind: import uapi definitions
  lib/vm_bind: Add vm_bind/unbind and execbuf3 ioctls
  lib/vm_bind: Add vm_bind mode support for VM
  lib/vm_bind: Add vm_bind specific library functions
  lib/vm_bind: Add __prime_handle_to_fd()
  tests/i915/vm_bind: Add vm_bind sanity test
  tests/i915/vm_bind: Add basic VM_BIND test support
  tests/i915/vm_bind: Add userptr subtest
  tests/i915/vm_bind: Add gem_exec3_basic test
  tests/i915/vm_bind: Add gem_exec3_balancer test
  tests/i915/vm_bind: Add gem_lmem_swapping@vm_bind sub test

 lib/i915/gem_context.c                |  24 ++
 lib/i915/gem_context.h                |   3 +
 lib/i915/gem_vm.c                     |  31 +-
 lib/i915/gem_vm.h                     |   3 +-
 lib/i915/i915_drm_local.h             | 308 ++++++++++++++
 lib/i915/i915_vm_bind.c               |  61 +++
 lib/i915/i915_vm_bind.h               |  16 +
 lib/intel_chipset.h                   |   2 +
 lib/ioctl_wrappers.c                  | 117 +++++-
 lib/ioctl_wrappers.h                  |   7 +
 lib/meson.build                       |   1 +
 tests/i915/gem_exec3_balancer.c       | 473 +++++++++++++++++++++
 tests/i915/gem_exec3_basic.c          | 133 ++++++
 tests/i915/gem_lmem_swapping.c        | 128 +++++-
 tests/i915/gem_pwrite.c               |  16 +-
 tests/i915/i915_vm_bind_basic.c       | 573 ++++++++++++++++++++++++++
 tests/i915/i915_vm_bind_sanity.c      | 275 ++++++++++++
 tests/intel-ci/fast-feedback.testlist |   3 +
 tests/meson.build                     |  10 +
 tests/prime_mmap.c                    |  26 +-
 20 files changed, 2160 insertions(+), 50 deletions(-)
 create mode 100644 lib/i915/i915_vm_bind.c
 create mode 100644 lib/i915/i915_vm_bind.h
 create mode 100644 tests/i915/gem_exec3_balancer.c
 create mode 100644 tests/i915/gem_exec3_basic.c
 create mode 100644 tests/i915/i915_vm_bind_basic.c
 create mode 100644 tests/i915/i915_vm_bind_sanity.c

-- 
2.21.0.rc0.32.g243a4c7e27

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [igt-dev] [PATCH i-g-t v4 01/11] lib/vm_bind: import uapi definitions
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
@ 2022-10-18  7:16 ` Niranjana Vishwanathapura
  2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 02/11] lib/vm_bind: Add vm_bind/unbind and execbuf3 ioctls Niranjana Vishwanathapura
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18  7:16 UTC (permalink / raw)
  To: igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, matthew.auld, daniel.vetter,
	petri.latvala

Import required VM_BIND kernel uapi definitions.

The DRM_I915_GEM_VM_BIND/UNBIND ioctls allow the UMD to bind/unbind GEM
buffer objects (BOs), or sections of a BO, at specified GPU virtual
addresses in a specified address space (VM). Multiple mappings can map
to the same physical pages of an object (aliasing). These mappings (also
referred to as persistent mappings) persist across multiple GPU
submissions (execbuf calls) issued by the UMD, without the user having
to provide a list of all required mappings during each submission (as
required by the older execbuf mode).

The new execbuf3 ioctl (I915_GEM_EXECBUFFER3) will only work in vm_bind
mode. The vm_bind mode only works with this new execbuf3 ioctl.

v2: Move vm_bind uapi definitions to i915_drm-local.h
    Define HAS_64K_PAGES()
    Pull in uapi updates

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 lib/i915/i915_drm_local.h | 308 ++++++++++++++++++++++++++++++++++++++
 lib/intel_chipset.h       |   2 +
 2 files changed, 310 insertions(+)

diff --git a/lib/i915/i915_drm_local.h b/lib/i915/i915_drm_local.h
index 9a2273c4e4..128d5f416f 100644
--- a/lib/i915/i915_drm_local.h
+++ b/lib/i915/i915_drm_local.h
@@ -23,6 +23,314 @@ extern "C" {
 
 #define DRM_I915_QUERY_GEOMETRY_SUBSLICES      6
 
+/*
+ * Signal to the kernel that the object will need to be accessed via
+ * the CPU.
+ *
+ * Only valid when placing objects in I915_MEMORY_CLASS_DEVICE, and only
+ * strictly required on platforms where only some of the device memory
+ * is directly visible or mappable through the CPU, like on DG2+.
+ *
+ * One of the placements MUST also be I915_MEMORY_CLASS_SYSTEM, to
+ * ensure we can always spill the allocation to system memory, if we
+ * can't place the object in the mappable part of
+ * I915_MEMORY_CLASS_DEVICE.
+ *
+ * Without this hint, the kernel will assume that non-mappable
+ * I915_MEMORY_CLASS_DEVICE is preferred for this object. Note that the
+ * kernel can still migrate the object to the mappable part, as a last
+ * resort, if userspace ever CPU faults this object, but this might be
+ * expensive, and so ideally should be avoided.
+ */
+#define I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS (1 << 0)
+
+#define DRM_I915_GEM_VM_BIND		0x3d
+#define DRM_I915_GEM_VM_UNBIND		0x3e
+#define DRM_I915_GEM_EXECBUFFER3	0x3f
+
+#define DRM_IOCTL_I915_GEM_VM_BIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
+#define DRM_IOCTL_I915_GEM_VM_UNBIND	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_unbind)
+#define DRM_IOCTL_I915_GEM_EXECBUFFER3  DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)
+
+/*
+ * VM_BIND feature version supported.
+ *
+ * The following versions of VM_BIND have been defined:
+ *
+ * 0: No VM_BIND support.
+ *
+ * 1: In VM_UNBIND calls, the UMD must specify the exact mappings created
+ *    previously with VM_BIND; the ioctl will not support unbinding multiple
+ *    mappings or splitting them. Similarly, VM_BIND calls will not replace
+ *    any existing mappings.
+ *
+ * See struct drm_i915_gem_vm_bind and struct drm_i915_gem_vm_unbind.
+ */
+#define I915_PARAM_VM_BIND_VERSION	57
+
+#define DRM_I915_GEM_EXECBUFFER3_EXT_TIMELINE_FENCES 0
+
+/**
+ * struct drm_i915_gem_timeline_fence - An input or output timeline fence.
+ *
+ * The operation will wait for input fence to signal.
+ *
+ * The returned output fence will be signaled after the completion of the
+ * operation.
+ */
+struct drm_i915_gem_timeline_fence {
+	/** @handle: User's handle for a drm_syncobj to wait on or signal. */
+	__u32 handle;
+
+	/**
+	 * @flags: Supported flags are:
+	 *
+	 * I915_TIMELINE_FENCE_WAIT:
+	 * Wait for the input fence before the operation.
+	 *
+	 * I915_TIMELINE_FENCE_SIGNAL:
+	 * Return operation completion fence as output.
+	 */
+	__u32 flags;
+#define I915_TIMELINE_FENCE_WAIT            (1 << 0)
+#define I915_TIMELINE_FENCE_SIGNAL          (1 << 1)
+#define __I915_TIMELINE_FENCE_UNKNOWN_FLAGS (-(I915_TIMELINE_FENCE_SIGNAL << 1))
+
+	/**
+	 * @value: A point in the timeline.
+	 * Value must be 0 for a binary drm_syncobj. A Value of 0 for a
+	 * timeline drm_syncobj is invalid as it turns a drm_syncobj into a
+	 * binary one.
+	 */
+	__u64 value;
+};
+
+/**
+ * struct drm_i915_gem_execbuffer3 - Structure for DRM_I915_GEM_EXECBUFFER3
+ * ioctl.
+ *
+ * DRM_I915_GEM_EXECBUFFER3 ioctl only works in VM_BIND mode and VM_BIND mode
+ * only works with this ioctl for submission.
+ * See I915_VM_CREATE_FLAGS_USE_VM_BIND.
+ */
+struct drm_i915_gem_execbuffer3 {
+	/**
+	 * @ctx_id: Context id
+	 *
+	 * Only contexts with user engine map are allowed.
+	 */
+	__u32 ctx_id;
+
+	/**
+	 * @engine_idx: Engine index
+	 *
+	 * An index in the user engine map of the context specified by @ctx_id.
+	 */
+	__u32 engine_idx;
+
+	/**
+	 * @batch_address: Batch gpu virtual address/es.
+	 *
+	 * For normal submission, it is the gpu virtual address of the batch
+	 * buffer. For parallel submission, it is a pointer to an array of
+	 * batch buffer gpu virtual addresses with array size equal to the
+	 * number of (parallel) engines involved in that submission (See
+	 * struct i915_context_engines_parallel_submit).
+	 */
+	__u64 batch_address;
+
+	/** @flags: Currently reserved, MBZ */
+	__u64 flags;
+#define __I915_EXEC3_UNKNOWN_FLAGS (~0)
+
+	/** @fence_count: Number of fences in @timeline_fences array. */
+	__u64 fence_count;
+
+	/**
+	 * @timeline_fences: Pointer to an array of timeline fences.
+	 *
+	 * Timeline fences are of format struct drm_i915_gem_timeline_fence.
+	 */
+	__u64 timeline_fences;
+
+	/** @rsvd: Reserved, MBZ */
+	__u64 rsvd;
+
+	/**
+	 * @extensions: Zero-terminated chain of extensions.
+	 *
+	 * For future extensions. See struct i915_user_extension.
+	 */
+	__u64 extensions;
+};
+
+/*
+ * If I915_VM_CREATE_FLAGS_USE_VM_BIND flag is set, VM created will work in
+ * VM_BIND mode
+ */
+#define I915_VM_CREATE_FLAGS_USE_VM_BIND	(1u << 0)
+
+/*
+ * For I915_GEM_CREATE_EXT_VM_PRIVATE usage see
+ * struct drm_i915_gem_create_ext_vm_private.
+ */
+#define I915_GEM_CREATE_EXT_VM_PRIVATE 2
+
+/**
+ * struct drm_i915_gem_create_ext_vm_private - Extension to make the object
+ * private to the specified VM.
+ *
+ * See struct drm_i915_gem_create_ext.
+ *
+ * By default, BOs can be mapped on multiple VMs and can also be dma-buf
+ * exported. Hence these BOs are referred to as Shared BOs.
+ * During each execbuf3 submission, the request fence must be added to the
+ * dma-resv fence list of all shared BOs mapped on the VM.
+ *
+ * Unlike Shared BOs, these VM private BOs can only be mapped on the VM they
+ * are private to and can't be dma-buf exported. All private BOs of a VM share
+ * the dma-resv object. Hence during each execbuf3 submission, they need only
+ * one dma-resv fence list updated. Thus, the fast path (where required
+ * mappings are already bound) submission latency is O(1) w.r.t. the number of
+ * VM private BOs.
+ */
+struct drm_i915_gem_create_ext_vm_private {
+	/** @base: Extension link. See struct i915_user_extension. */
+	struct i915_user_extension base;
+
+	/** @vm_id: Id of the VM to which Object is private */
+	__u32 vm_id;
+
+	/** @rsvd: Reserved, MBZ */
+	__u32 rsvd;
+};
+
+/**
+ * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
+ *
+ * This structure is passed to VM_BIND ioctl and specifies the mapping of GPU
+ * virtual address (VA) range to the section of an object that should be bound
+ * in the device page table of the specified address space (VM).
+ * The VA range specified must be unique (i.e., not currently bound) and can
+ * be mapped to the whole object or to a section of it (partial binding).
+ * Multiple VA mappings can be created to the same section of the object
+ * (aliasing).
+ *
+ * The @start, @offset and @length must be 4K page aligned. However, DG2
+ * and XEHPSDV have a 64K page size for device local memory and use a
+ * compact page table. On those platforms, for binding device local-memory
+ * objects, @start, @offset and @length must be 64K aligned.
+ *
+ * Error code -EINVAL will be returned if @start, @offset and @length are not
+ * properly aligned. In version 1 (See I915_PARAM_VM_BIND_VERSION), error code
+ * -ENOSPC will be returned if the VA range specified can't be reserved.
+ *
+ * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
+ * are not ordered. Furthermore, parts of the VM_BIND operation can be done
+ * asynchronously, if valid @fence is specified.
+ */
+struct drm_i915_gem_vm_bind {
+	/** @vm_id: VM (address space) id to bind */
+	__u32 vm_id;
+
+	/** @handle: Object handle */
+	__u32 handle;
+
+	/** @start: Virtual Address start to bind */
+	__u64 start;
+
+	/** @offset: Offset in object to bind */
+	__u64 offset;
+
+	/** @length: Length of mapping to bind */
+	__u64 length;
+
+	/**
+	 * @flags: Currently reserved, MBZ.
+	 *
+	 * Note that @fence carries its own flags.
+	 */
+	__u64 flags;
+
+	/**
+	 * @fence: Timeline fence for bind completion signaling.
+	 *
+	 * Timeline fence is of format struct drm_i915_gem_timeline_fence.
+	 *
+	 * It is an out fence, hence using I915_TIMELINE_FENCE_WAIT flag
+	 * is invalid, and an error will be returned.
+	 *
+	 * If I915_TIMELINE_FENCE_SIGNAL flag is not set, then out fence
+	 * is not requested and binding is completed synchronously.
+	 */
+	struct drm_i915_gem_timeline_fence fence;
+
+	/**
+	 * @extensions: Zero-terminated chain of extensions.
+	 *
+	 * For future extensions. See struct i915_user_extension.
+	 */
+	__u64 extensions;
+};
+
+/**
+ * struct drm_i915_gem_vm_unbind - VA to object mapping to unbind.
+ *
+ * This structure is passed to VM_UNBIND ioctl and specifies the GPU virtual
+ * address (VA) range that should be unbound from the device page table of the
+ * specified address space (VM). VM_UNBIND will force unbind the specified
+ * range from device page table without waiting for any GPU job to complete.
+ * It is the UMD's responsibility to ensure the mapping is no longer in use
+ * before calling VM_UNBIND.
+ *
+ * If the specified mapping is not found, the ioctl will simply return without
+ * any error.
+ *
+ * VM_BIND/UNBIND ioctl calls executed on different CPU threads concurrently
+ * are not ordered. Furthermore, parts of the VM_UNBIND operation can be done
+ * asynchronously, if valid @fence is specified.
+ */
+struct drm_i915_gem_vm_unbind {
+	/** @vm_id: VM (address space) id to unbind from */
+	__u32 vm_id;
+
+	/** @rsvd: Reserved, MBZ */
+	__u32 rsvd;
+
+	/** @start: Virtual Address start to unbind */
+	__u64 start;
+
+	/** @length: Length of mapping to unbind */
+	__u64 length;
+
+	/**
+	 * @flags: Currently reserved, MBZ.
+	 *
+	 * Note that @fence carries its own flags.
+	 */
+	__u64 flags;
+
+	/**
+	 * @fence: Timeline fence for unbind completion signaling.
+	 *
+	 * Timeline fence is of format struct drm_i915_gem_timeline_fence.
+	 *
+	 * It is an out fence, hence using I915_TIMELINE_FENCE_WAIT flag
+	 * is invalid, and an error will be returned.
+	 *
+	 * If I915_TIMELINE_FENCE_SIGNAL flag is not set, then out fence
+	 * is not requested and unbinding is completed synchronously.
+	 */
+	struct drm_i915_gem_timeline_fence fence;
+
+	/**
+	 * @extensions: Zero-terminated chain of extensions.
+	 *
+	 * For future extensions. See struct i915_user_extension.
+	 */
+	__u64 extensions;
+};
+
 #if defined(__cplusplus)
 }
 #endif
diff --git a/lib/intel_chipset.h b/lib/intel_chipset.h
index d7a6ff190f..7cf8259157 100644
--- a/lib/intel_chipset.h
+++ b/lib/intel_chipset.h
@@ -225,4 +225,6 @@ void intel_check_pch(void);
 
 #define HAS_FLATCCS(devid)	(intel_get_device_info(devid)->has_flatccs)
 
+#define HAS_64K_PAGES(devid)	(IS_DG2(devid))
+
 #endif /* _INTEL_CHIPSET_H */
-- 
2.21.0.rc0.32.g243a4c7e27


* [igt-dev] [PATCH i-g-t v4 02/11] lib/vm_bind: Add vm_bind/unbind and execbuf3 ioctls
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
  2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 01/11] lib/vm_bind: import uapi definitions Niranjana Vishwanathapura
@ 2022-10-18  7:16 ` Niranjana Vishwanathapura
  2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 03/11] lib/vm_bind: Add vm_bind mode support for VM Niranjana Vishwanathapura
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18  7:16 UTC (permalink / raw)
  To: igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, matthew.auld, daniel.vetter,
	petri.latvala

Add the required library interfaces for the new vm_bind/unbind
and execbuf3 ioctls.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 lib/ioctl_wrappers.c | 78 ++++++++++++++++++++++++++++++++++++++++++++
 lib/ioctl_wrappers.h |  6 ++++
 2 files changed, 84 insertions(+)

diff --git a/lib/ioctl_wrappers.c b/lib/ioctl_wrappers.c
index 09eb3ce7b5..ac37b6bb43 100644
--- a/lib/ioctl_wrappers.c
+++ b/lib/ioctl_wrappers.c
@@ -706,6 +706,38 @@ void gem_execbuf_wr(int fd, struct drm_i915_gem_execbuffer2 *execbuf)
 	igt_assert_eq(__gem_execbuf_wr(fd, execbuf), 0);
 }
 
+/**
+ * __gem_execbuf3:
+ * @fd: open i915 drm file descriptor
+ * @execbuf: execbuffer data structure
+ *
+ * This wraps the EXECBUFFER3 ioctl, which submits a batchbuffer for the gpu to
+ * run. This is allowed to fail, with -errno returned.
+ */
+int __gem_execbuf3(int fd, struct drm_i915_gem_execbuffer3 *execbuf)
+{
+	int err = 0;
+	if (igt_ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER3, execbuf)) {
+		err = -errno;
+		igt_assume(err != 0);
+	}
+	errno = 0;
+	return err;
+}
+
+/**
+ * gem_execbuf3:
+ * @fd: open i915 drm file descriptor
+ * @execbuf: execbuffer data structure
+ *
+ * This wraps the EXECBUFFER3 ioctl, which submits a batchbuffer for the gpu to
+ * run.
+ */
+void gem_execbuf3(int fd, struct drm_i915_gem_execbuffer3 *execbuf)
+{
+	igt_assert_eq(__gem_execbuf3(fd, execbuf), 0);
+}
+
 /**
  * gem_madvise:
  * @fd: open i915 drm file descriptor
@@ -1328,3 +1360,49 @@ bool igt_has_drm_cap(int fd, uint64_t capability)
 	igt_assert(drmIoctl(fd, DRM_IOCTL_GET_CAP, &cap) == 0);
 	return cap.value;
 }
+
+/* VM_BIND */
+
+int __gem_vm_bind(int fd, struct drm_i915_gem_vm_bind *bind)
+{
+	int err = 0;
+
+	if (drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_BIND, bind))
+		err = -errno;
+	return err;
+}
+
+/**
+ * gem_vm_bind:
+ * @fd: open i915 drm file descriptor
+ * @bind: vm_bind data structure
+ *
+ * This wraps the VM_BIND ioctl to bind an address range to
+ * the specified address space.
+ */
+void gem_vm_bind(int fd, struct drm_i915_gem_vm_bind *bind)
+{
+	igt_assert_eq(__gem_vm_bind(fd, bind), 0);
+}
+
+int __gem_vm_unbind(int fd, struct drm_i915_gem_vm_unbind *unbind)
+{
+	int err = 0;
+
+	if (drmIoctl(fd, DRM_IOCTL_I915_GEM_VM_UNBIND, unbind))
+		err = -errno;
+	return err;
+}
+
+/**
+ * gem_vm_unbind:
+ * @fd: open i915 drm file descriptor
+ * @unbind: vm_unbind data structure
+ *
+ * This wraps the VM_UNBIND ioctl to unbind an address range from
+ * the specified address space.
+ */
+void gem_vm_unbind(int fd, struct drm_i915_gem_vm_unbind *unbind)
+{
+	igt_assert_eq(__gem_vm_unbind(fd, unbind), 0);
+}
diff --git a/lib/ioctl_wrappers.h b/lib/ioctl_wrappers.h
index 9a897fec23..223cd9160c 100644
--- a/lib/ioctl_wrappers.h
+++ b/lib/ioctl_wrappers.h
@@ -84,6 +84,12 @@ void gem_execbuf_wr(int fd, struct drm_i915_gem_execbuffer2 *execbuf);
 int __gem_execbuf_wr(int fd, struct drm_i915_gem_execbuffer2 *execbuf);
 void gem_execbuf(int fd, struct drm_i915_gem_execbuffer2 *execbuf);
 int __gem_execbuf(int fd, struct drm_i915_gem_execbuffer2 *execbuf);
+void gem_execbuf3(int fd, struct drm_i915_gem_execbuffer3 *execbuf);
+int __gem_execbuf3(int fd, struct drm_i915_gem_execbuffer3 *execbuf);
+int __gem_vm_bind(int fd, struct drm_i915_gem_vm_bind *bind);
+void gem_vm_bind(int fd, struct drm_i915_gem_vm_bind *bind);
+int __gem_vm_unbind(int fd, struct drm_i915_gem_vm_unbind *unbind);
+void gem_vm_unbind(int fd, struct drm_i915_gem_vm_unbind *unbind);
 
 #ifndef I915_GEM_DOMAIN_WC
 #define I915_GEM_DOMAIN_WC 0x80
-- 
2.21.0.rc0.32.g243a4c7e27


* [igt-dev] [PATCH i-g-t v4 03/11] lib/vm_bind: Add vm_bind mode support for VM
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
  2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 01/11] lib/vm_bind: import uapi definitions Niranjana Vishwanathapura
  2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 02/11] lib/vm_bind: Add vm_bind/unbind and execbuf3 ioctls Niranjana Vishwanathapura
@ 2022-10-18  7:16 ` Niranjana Vishwanathapura
  2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 04/11] lib/vm_bind: Add vm_bind specific library functions Niranjana Vishwanathapura
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18  7:16 UTC (permalink / raw)
  To: igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, matthew.auld, daniel.vetter,
	petri.latvala

Add the required library interfaces to create a VM in
vm_bind mode and to assign the VM to a context.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 lib/i915/gem_context.c | 24 ++++++++++++++++++++++++
 lib/i915/gem_context.h |  3 +++
 lib/i915/gem_vm.c      | 31 ++++++++++++++++++++++++++-----
 lib/i915/gem_vm.h      |  3 ++-
 4 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/lib/i915/gem_context.c b/lib/i915/gem_context.c
index fe989a8d1c..2d06b41980 100644
--- a/lib/i915/gem_context.c
+++ b/lib/i915/gem_context.c
@@ -517,3 +517,27 @@ uint32_t gem_context_create_for_class(int i915,
 	*count = i;
 	return p.ctx_id;
 }
+
+uint32_t gem_context_get_vm(int fd, uint32_t ctx_id)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_VM,
+		.ctx_id = ctx_id,
+	};
+
+	gem_context_get_param(fd, &p);
+	igt_assert(p.value);
+
+	return p.value;
+}
+
+void gem_context_set_vm(int fd, uint32_t ctx_id, uint32_t vm_id)
+{
+	struct drm_i915_gem_context_param p = {
+		.param = I915_CONTEXT_PARAM_VM,
+		.ctx_id = ctx_id,
+		.value = vm_id,
+	};
+
+	gem_context_set_param(fd, &p);
+}
diff --git a/lib/i915/gem_context.h b/lib/i915/gem_context.h
index 505d55724e..2a2247fe10 100644
--- a/lib/i915/gem_context.h
+++ b/lib/i915/gem_context.h
@@ -63,4 +63,7 @@ void gem_context_set_persistence(int i915, uint32_t ctx, bool state);
 
 bool gem_context_has_engine(int fd, uint32_t ctx, uint64_t engine);
 
+uint32_t gem_context_get_vm(int fd, uint32_t ctx_id);
+void gem_context_set_vm(int fd, uint32_t ctx_id, uint32_t vm_id);
+
 #endif /* GEM_CONTEXT_H */
diff --git a/lib/i915/gem_vm.c b/lib/i915/gem_vm.c
index 9a022a56c7..ee3c65d06e 100644
--- a/lib/i915/gem_vm.c
+++ b/lib/i915/gem_vm.c
@@ -48,7 +48,7 @@ bool gem_has_vm(int i915)
 {
 	uint32_t vm_id = 0;
 
-	__gem_vm_create(i915, &vm_id);
+	__gem_vm_create(i915, 0, &vm_id);
 	if (vm_id)
 		gem_vm_destroy(i915, vm_id);
 
@@ -67,9 +67,9 @@ void gem_require_vm(int i915)
 	igt_require(gem_has_vm(i915));
 }
 
-int __gem_vm_create(int i915, uint32_t *vm_id)
+int __gem_vm_create(int i915, uint32_t flags, uint32_t *vm_id)
 {
-       struct drm_i915_gem_vm_control ctl = {};
+       struct drm_i915_gem_vm_control ctl = { .flags = flags };
        int err = 0;
 
        if (igt_ioctl(i915, DRM_IOCTL_I915_GEM_VM_CREATE, &ctl) == 0) {
@@ -88,7 +88,8 @@ int __gem_vm_create(int i915, uint32_t *vm_id)
  * @i915: open i915 drm file descriptor
  *
  * This wraps the VM_CREATE ioctl, which is used to allocate a new
- * address space for use with GEM contexts.
+ * address space for use with GEM contexts, using the legacy execbuf
+ * method of binding.
  *
  * Returns: The id of the allocated address space.
  */
@@ -96,7 +97,27 @@ uint32_t gem_vm_create(int i915)
 {
 	uint32_t vm_id;
 
-	igt_assert_eq(__gem_vm_create(i915, &vm_id), 0);
+	igt_assert_eq(__gem_vm_create(i915, 0, &vm_id), 0);
+	igt_assert(vm_id != 0);
+
+	return vm_id;
+}
+
+/**
+ * gem_vm_create_in_vm_bind_mode:
+ * @i915: open i915 drm file descriptor
+ *
+ * This wraps the VM_CREATE ioctl with the I915_VM_CREATE_FLAGS_USE_VM_BIND
+ * flag, which is used to allocate a new address space for use with GEM
+ * contexts with vm_bind mode of binding.
+ *
+ * Returns: The id of the allocated address space.
+ */
+uint32_t gem_vm_create_in_vm_bind_mode(int i915)
+{
+	uint32_t vm_id;
+
+	igt_assert_eq(__gem_vm_create(i915, I915_VM_CREATE_FLAGS_USE_VM_BIND, &vm_id), 0);
 	igt_assert(vm_id != 0);
 
 	return vm_id;
diff --git a/lib/i915/gem_vm.h b/lib/i915/gem_vm.h
index acbb663e65..6cf46d8876 100644
--- a/lib/i915/gem_vm.h
+++ b/lib/i915/gem_vm.h
@@ -31,7 +31,8 @@ bool gem_has_vm(int i915);
 void gem_require_vm(int i915);
 
 uint32_t gem_vm_create(int i915);
-int __gem_vm_create(int i915, uint32_t *vm_id);
+uint32_t gem_vm_create_in_vm_bind_mode(int i915);
+int __gem_vm_create(int i915, uint32_t flags, uint32_t *vm_id);
 
 void gem_vm_destroy(int i915, uint32_t vm_id);
 int __gem_vm_destroy(int i915, uint32_t vm_id);
-- 
2.21.0.rc0.32.g243a4c7e27


* [igt-dev] [PATCH i-g-t v4 04/11] lib/vm_bind: Add vm_bind specific library functions
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
                   ` (2 preceding siblings ...)
  2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 03/11] lib/vm_bind: Add vm_bind mode support for VM Niranjana Vishwanathapura
@ 2022-10-18  7:16 ` Niranjana Vishwanathapura
  2022-10-18  9:26   ` Matthew Auld
  2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 05/11] lib/vm_bind: Add __prime_handle_to_fd() Niranjana Vishwanathapura
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18  7:16 UTC (permalink / raw)
  To: igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, matthew.auld, daniel.vetter,
	petri.latvala

Add vm_bind specific library interfaces.

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 lib/i915/i915_vm_bind.c | 61 +++++++++++++++++++++++++++++++++++++++++
 lib/i915/i915_vm_bind.h | 16 +++++++++++
 lib/meson.build         |  1 +
 3 files changed, 78 insertions(+)
 create mode 100644 lib/i915/i915_vm_bind.c
 create mode 100644 lib/i915/i915_vm_bind.h

diff --git a/lib/i915/i915_vm_bind.c b/lib/i915/i915_vm_bind.c
new file mode 100644
index 0000000000..5624f589b5
--- /dev/null
+++ b/lib/i915/i915_vm_bind.c
@@ -0,0 +1,61 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include <errno.h>
+#include <string.h>
+#include <sys/ioctl.h>
+
+#include "ioctl_wrappers.h"
+#include "i915_drm.h"
+#include "i915_drm_local.h"
+#include "i915_vm_bind.h"
+
+void i915_vm_bind(int i915, uint32_t vm_id, uint64_t va, uint32_t handle,
+		  uint64_t offset, uint64_t length, uint32_t syncobj,
+		  uint64_t fence_value)
+{
+	struct drm_i915_gem_vm_bind bind;
+
+	memset(&bind, 0, sizeof(bind));
+	bind.vm_id = vm_id;
+	bind.handle = handle;
+	bind.start = va;
+	bind.offset = offset;
+	bind.length = length;
+	if (syncobj) {
+		bind.fence.handle = syncobj;
+		bind.fence.value = fence_value;
+		bind.fence.flags = I915_TIMELINE_FENCE_SIGNAL;
+	}
+
+	gem_vm_bind(i915, &bind);
+}
+
+void i915_vm_unbind(int i915, uint32_t vm_id, uint64_t va, uint64_t length)
+{
+	struct drm_i915_gem_vm_unbind unbind;
+
+	memset(&unbind, 0, sizeof(unbind));
+	unbind.vm_id = vm_id;
+	unbind.start = va;
+	unbind.length = length;
+
+	gem_vm_unbind(i915, &unbind);
+}
+
+int i915_vm_bind_version(int i915)
+{
+	struct drm_i915_getparam gp;
+	int value = 0;
+
+	memset(&gp, 0, sizeof(gp));
+	gp.param = I915_PARAM_VM_BIND_VERSION;
+	gp.value = &value;
+
+	ioctl(i915, DRM_IOCTL_I915_GETPARAM, &gp);
+	errno = 0;
+
+	return value;
+}
diff --git a/lib/i915/i915_vm_bind.h b/lib/i915/i915_vm_bind.h
new file mode 100644
index 0000000000..46a382c032
--- /dev/null
+++ b/lib/i915/i915_vm_bind.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+#ifndef _I915_VM_BIND_H_
+#define _I915_VM_BIND_H_
+
+#include <stdint.h>
+
+void i915_vm_bind(int i915, uint32_t vm_id, uint64_t va, uint32_t handle,
+		  uint64_t offset, uint64_t length, uint32_t syncobj,
+		  uint64_t fence_value);
+void i915_vm_unbind(int i915, uint32_t vm_id, uint64_t va, uint64_t length);
+int i915_vm_bind_version(int i915);
+
+#endif /* _I915_VM_BIND_H_ */
diff --git a/lib/meson.build b/lib/meson.build
index 8d6c8a244a..e50a2b0e79 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -14,6 +14,7 @@ lib_sources = [
 	'i915/intel_mocs.c',
 	'i915/i915_blt.c',
 	'i915/i915_crc.c',
+	'i915/i915_vm_bind.c',
 	'igt_collection.c',
 	'igt_color_encoding.c',
 	'igt_crc.c',
-- 
2.21.0.rc0.32.g243a4c7e27


* [igt-dev] [PATCH i-g-t v4 05/11] lib/vm_bind: Add __prime_handle_to_fd()
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
                   ` (3 preceding siblings ...)
  2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 04/11] lib/vm_bind: Add vm_bind specific library functions Niranjana Vishwanathapura
@ 2022-10-18  7:16 ` Niranjana Vishwanathapura
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 06/11] tests/i915/vm_bind: Add vm_bind sanity test Niranjana Vishwanathapura
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18  7:16 UTC (permalink / raw)
  To: igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, matthew.auld, daniel.vetter,
	petri.latvala

Make __prime_handle_to_fd() a library interface, as the VM_BIND
functionality will also use it.

v2: Rename prime_handle_to_fd_no_assert() to __prime_handle_to_fd()

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 lib/ioctl_wrappers.c    | 39 +++++++++++++++++++++++++++++++++------
 lib/ioctl_wrappers.h    |  1 +
 tests/i915/gem_pwrite.c | 16 +---------------
 tests/prime_mmap.c      | 26 ++++----------------------
 4 files changed, 39 insertions(+), 43 deletions(-)

diff --git a/lib/ioctl_wrappers.c b/lib/ioctl_wrappers.c
index ac37b6bb43..db9cbb724c 100644
--- a/lib/ioctl_wrappers.c
+++ b/lib/ioctl_wrappers.c
@@ -1153,28 +1153,55 @@ void gem_require_mocs_registers(int fd)
 /* prime */
 
 /**
- * prime_handle_to_fd:
+ * __prime_handle_to_fd:
  * @fd: open i915 drm file descriptor
  * @handle: file-private gem buffer object handle
+ * @flags: DRM_IOCTL_PRIME_HANDLE_TO_FD ioctl flags
+ * @fd_out: placeholder for the output dma-buf fd
  *
  * This wraps the PRIME_HANDLE_TO_FD ioctl, which is used to export a gem buffer
  * object into a global (i.e. potentially cross-device) dma-buf file-descriptor
  * handle.
  *
- * Returns: The created dma-buf fd handle.
+ * Returns: 0 on success, negative errno otherwise. On success, the created
+ *          dma-buf fd handle is returned in @fd_out.
  */
-int prime_handle_to_fd(int fd, uint32_t handle)
+int __prime_handle_to_fd(int fd, uint32_t handle, int flags, int *fd_out)
 {
 	struct drm_prime_handle args;
+	int ret;
 
 	memset(&args, 0, sizeof(args));
 	args.handle = handle;
-	args.flags = DRM_CLOEXEC;
+	args.flags = flags;
 	args.fd = -1;
 
-	do_ioctl(fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
+	ret = drmIoctl(fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
+	if (ret)
+		ret = -errno;
+	*fd_out = args.fd;
 
-	return args.fd;
+	return ret;
+}
+
+/**
+ * prime_handle_to_fd:
+ * @fd: open i915 drm file descriptor
+ * @handle: file-private gem buffer object handle
+ *
+ * This wraps the PRIME_HANDLE_TO_FD ioctl, which is used to export a gem buffer
+ * object into a global (i.e. potentially cross-device) dma-buf file-descriptor
+ * handle. It asserts that the ioctl succeeds.
+ *
+ * Returns: The created dma-buf fd handle.
+ */
+int prime_handle_to_fd(int fd, uint32_t handle)
+{
+	int dmabuf;
+
+	igt_assert_eq(__prime_handle_to_fd(fd, handle, DRM_CLOEXEC, &dmabuf), 0);
+
+	return dmabuf;
 }
 
 /**
diff --git a/lib/ioctl_wrappers.h b/lib/ioctl_wrappers.h
index 223cd9160c..66faa27551 100644
--- a/lib/ioctl_wrappers.h
+++ b/lib/ioctl_wrappers.h
@@ -143,6 +143,7 @@ struct local_dma_buf_sync {
 #define LOCAL_DMA_BUF_BASE 'b'
 #define LOCAL_DMA_BUF_IOCTL_SYNC _IOW(LOCAL_DMA_BUF_BASE, 0, struct local_dma_buf_sync)
 
+int __prime_handle_to_fd(int fd, uint32_t handle, int flags, int *fd_out);
 int prime_handle_to_fd(int fd, uint32_t handle);
 #ifndef DRM_RDWR
 #define DRM_RDWR O_RDWR
diff --git a/tests/i915/gem_pwrite.c b/tests/i915/gem_pwrite.c
index 6e3f833cd8..f4fe069caf 100644
--- a/tests/i915/gem_pwrite.c
+++ b/tests/i915/gem_pwrite.c
@@ -293,19 +293,6 @@ struct ufd_thread {
 	int err;
 };
 
-static int __prime_handle_to_fd(int fd, uint32_t handle)
-{
-	struct drm_prime_handle args;
-
-	memset(&args, 0, sizeof(args));
-	args.handle = handle;
-	args.flags = DRM_CLOEXEC;
-	args.fd = -1;
-
-	ioctl(fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
-	return args.fd;
-}
-
 static uint32_t dmabuf_create_handle(int i915, int vgem)
 {
 	struct vgem_bo scratch;
@@ -317,8 +304,7 @@ static uint32_t dmabuf_create_handle(int i915, int vgem)
 	scratch.bpp = 32;
 	vgem_create(vgem, &scratch);
 
-	dmabuf = __prime_handle_to_fd(vgem, scratch.handle);
-	if (dmabuf < 0)
+	if (__prime_handle_to_fd(vgem, scratch.handle, DRM_CLOEXEC, &dmabuf))
 		return 0;
 
 	handle = prime_fd_to_handle(i915, dmabuf);
diff --git a/tests/prime_mmap.c b/tests/prime_mmap.c
index bc19f68c98..432e857631 100644
--- a/tests/prime_mmap.c
+++ b/tests/prime_mmap.c
@@ -298,24 +298,6 @@ test_dup(uint32_t region, uint64_t size)
 	close (dma_buf_fd);
 }
 
-/* Used for error case testing to avoid wrapper */
-static int prime_handle_to_fd_no_assert(uint32_t handle, int flags, int *fd_out)
-{
-	struct drm_prime_handle args;
-	int ret;
-
-	args.handle = handle;
-	args.flags = flags;
-	args.fd = -1;
-
-	ret = drmIoctl(fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
-	if (ret)
-		ret = errno;
-	*fd_out = args.fd;
-
-	return ret;
-}
-
 static bool has_userptr(void)
 {
 	uint32_t handle = 0;
@@ -346,9 +328,9 @@ test_userptr(uint32_t region, uint64_t size)
 	gem_userptr(fd, (uint32_t *)ptr, size, 0, 0, &handle);
 
 	/* export userptr */
-	ret = prime_handle_to_fd_no_assert(handle, DRM_CLOEXEC, &dma_buf_fd);
+	ret = __prime_handle_to_fd(fd, handle, DRM_CLOEXEC, &dma_buf_fd);
 	if (ret) {
-		igt_assert(ret == EINVAL || ret == ENODEV);
+		igt_assert(ret == -EINVAL || ret == -ENODEV);
 		goto free_userptr;
 	} else {
 		igt_assert_eq(ret, 0);
@@ -376,7 +358,7 @@ test_errors(uint32_t region, uint64_t size)
 	/* Test for invalid flags */
 	igt_assert(__gem_create_in_memory_regions(fd, &handle, &size, region) == 0);
 	for (i = 0; i < ARRAY_SIZE(invalid_flags); i++) {
-		prime_handle_to_fd_no_assert(handle, invalid_flags[i], &dma_buf_fd);
+		__prime_handle_to_fd(fd, handle, invalid_flags[i], &dma_buf_fd);
 		igt_assert_eq(errno, EINVAL);
 		errno = 0;
 	}
@@ -386,7 +368,7 @@ test_errors(uint32_t region, uint64_t size)
 	igt_assert(__gem_create_in_memory_regions(fd, &handle, &size, region) == 0);
 	fill_bo(handle, size);
 	gem_close(fd, handle);
-	prime_handle_to_fd_no_assert(handle, DRM_CLOEXEC, &dma_buf_fd);
+	__prime_handle_to_fd(fd, handle, DRM_CLOEXEC, &dma_buf_fd);
 	igt_assert(dma_buf_fd == -1 && errno == ENOENT);
 	errno = 0;
 
-- 
2.21.0.rc0.32.g243a4c7e27
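The 0-or-negative-errno, out-parameter convention that __prime_handle_to_fd()
adopts can be sketched outside IGT with a stubbed ioctl. Everything here is
illustrative (`fake_ioctl`, `export_handle` are invented names, not IGT or DRM
API); it only demonstrates the error-reporting shape of the new helper:

```c
#include <assert.h>
#include <errno.h>

static int fake_ioctl_errno;	/* 0 = success, else errno to report */

/* Stand-in for drmIoctl(): returns -1 and sets errno on failure. */
static int fake_ioctl(int *fd_slot)
{
	if (fake_ioctl_errno) {
		errno = fake_ioctl_errno;
		return -1;
	}
	*fd_slot = 42;		/* pretend dma-buf fd */
	return 0;
}

/* Same shape as __prime_handle_to_fd(): 0 or -errno, fd via out-param. */
static int export_handle(int *fd_out)
{
	int fd = -1;
	int ret = fake_ioctl(&fd);

	if (ret)
		ret = -errno;
	*fd_out = fd;
	return ret;
}
```

Callers can then assert on exact -errno values (as test_userptr() does) instead
of inspecting the global errno after an asserting wrapper.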

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [igt-dev] [PATCH i-g-t v4 06/11] tests/i915/vm_bind: Add vm_bind sanity test
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
                   ` (4 preceding siblings ...)
  2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 05/11] lib/vm_bind: Add __prime_handle_to_fd() Niranjana Vishwanathapura
@ 2022-10-18  7:17 ` Niranjana Vishwanathapura
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 07/11] tests/i915/vm_bind: Add basic VM_BIND test support Niranjana Vishwanathapura
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18  7:17 UTC (permalink / raw)
  To: igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, matthew.auld, daniel.vetter,
	petri.latvala

Add a sanity test to exercise the vm_bind uapi, covering various
cases of the vm_bind and vm_unbind ioctls.

v2: Add more input validity tests
    Add sanity test to fast-feedback.testlist
v3: Add only basic-smem subtest to fast-feedback.testlist
v4: Use gem_create_ext to create vm private objects,
    Ensure vm private objects are only allowed in vm_bind mode,
    Use library routine i915_vm_bind_version()
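The CANONICAL() macro the test below applies to allocator offsets produces a
canonical GPU address. Assuming a 48-bit VA space (the common case on current
i915 platforms), the sign extension it performs can be sketched as follows;
`canonical48` is an illustrative name, the real macro lives in the IGT
allocator headers and may handle other VA widths:

```c
#include <stdint.h>

/* Sign-extend bit 47 into bits 63:48, yielding a canonical 48-bit
 * address as required by the hardware for GPU virtual addresses. */
static uint64_t canonical48(uint64_t addr)
{
	return (uint64_t)(((int64_t)addr << 16) >> 16);
}
```

Addresses with bit 47 clear pass through unchanged; addresses with bit 47 set
get their upper 16 bits filled with ones.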

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 tests/i915/i915_vm_bind_sanity.c      | 275 ++++++++++++++++++++++++++
 tests/intel-ci/fast-feedback.testlist |   1 +
 tests/meson.build                     |   1 +
 3 files changed, 277 insertions(+)
 create mode 100644 tests/i915/i915_vm_bind_sanity.c

diff --git a/tests/i915/i915_vm_bind_sanity.c b/tests/i915/i915_vm_bind_sanity.c
new file mode 100644
index 0000000000..322b6190e9
--- /dev/null
+++ b/tests/i915/i915_vm_bind_sanity.c
@@ -0,0 +1,275 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+/** @file i915_vm_bind_sanity.c
+ *
+ * This is the sanity test for the VM_BIND UAPI.
+ *
+ * The goal is to test the UAPI interface.
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/poll.h>
+
+#include "i915/gem.h"
+#include "i915/gem_create.h"
+#include "i915/gem_vm.h"
+#include "i915/i915_vm_bind.h"
+#include "intel_allocator.h"
+#include "igt.h"
+#include "igt_syncobj.h"
+
+#define PAGE_SIZE   4096
+#define SZ_64K      (16 * PAGE_SIZE)
+#define SZ_2M       (512 * PAGE_SIZE)
+
+IGT_TEST_DESCRIPTION("Sanity test vm_bind related interfaces");
+
+static uint64_t
+gettime_ns(void)
+{
+	struct timespec current;
+	clock_gettime(CLOCK_MONOTONIC, &current);
+	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
+}
+
+static bool syncobj_busy(int fd, uint32_t handle)
+{
+	bool result;
+	int sf;
+
+	sf = syncobj_handle_to_fd(fd, handle,
+				  DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE);
+	result = poll(&(struct pollfd){sf, POLLIN}, 1, 0) == 0;
+	close(sf);
+
+	return result;
+}
+
+static inline int
+__vm_bind(int fd, uint32_t vm_id, uint32_t handle, uint64_t start,
+	  uint64_t offset, uint64_t length, uint64_t flags,
+	  struct drm_i915_gem_timeline_fence *fence, uint64_t extensions)
+{
+	struct drm_i915_gem_vm_bind bind;
+
+	memset(&bind, 0, sizeof(bind));
+	bind.vm_id = vm_id;
+	bind.handle = handle;
+	bind.start = start;
+	bind.offset = offset;
+	bind.length = length;
+	bind.flags = flags;
+	bind.extensions = extensions;
+	if (fence)
+		bind.fence = *fence;
+
+	return __gem_vm_bind(fd, &bind);
+}
+
+static inline void
+vm_bind(int fd, uint32_t vm_id, uint32_t handle, uint64_t start,
+	uint64_t offset, uint64_t length, uint64_t flags,
+	struct drm_i915_gem_timeline_fence *fence)
+{
+	igt_assert_eq(__vm_bind(fd, vm_id, handle, start, offset,
+				length, flags, fence, 0), 0);
+	if (fence) {
+		igt_assert(syncobj_timeline_wait(fd, &fence->handle, (uint64_t *)&fence->value,
+						 1, gettime_ns() + (2 * NSEC_PER_SEC),
+						 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
+		igt_assert(!syncobj_busy(fd, fence->handle));
+	}
+}
+
+static inline int
+__vm_unbind(int fd, uint32_t vm_id, uint64_t start, uint64_t length, uint64_t flags,
+	    uint32_t rsvd, uint64_t extensions)
+{
+	struct drm_i915_gem_vm_unbind unbind;
+
+	memset(&unbind, 0, sizeof(unbind));
+	unbind.vm_id = vm_id;
+	unbind.rsvd = rsvd;
+	unbind.start = start;
+	unbind.length = length;
+	unbind.flags = flags;
+	unbind.extensions = extensions;
+
+	return __gem_vm_unbind(fd, &unbind);
+}
+
+static inline void
+vm_unbind(int fd, uint32_t vm_id, uint64_t start, uint64_t length, uint64_t flags)
+{
+	igt_assert_eq(__vm_unbind(fd, vm_id, start, length, flags, 0, 0), 0);
+}
+
+static void basic(int fd, const struct gem_memory_region *mr)
+{
+	uint32_t vm_id, vm_id2, vm_id_exec_mode, handle;
+	struct drm_i915_gem_create_ext_memory_regions setparam_region = {
+		.base = { .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS },
+		.regions = to_user_pointer(&mr->ci),
+		.num_regions = 1,
+	};
+	struct drm_i915_gem_timeline_fence fence = {
+		.handle = syncobj_create(fd, 0),
+		.flags = I915_TIMELINE_FENCE_SIGNAL,
+		.value = 0,
+	};
+	struct drm_i915_gem_create_ext_vm_private vm_priv = {
+		.base = { .name = I915_GEM_CREATE_EXT_VM_PRIVATE },
+	};
+	struct drm_i915_gem_execbuffer2 execbuf;
+	struct drm_i915_gem_exec_object2 obj;
+	uint64_t ahnd, va, pg_size, size;
+	const intel_ctx_t *ctx;
+	int dmabuf;
+
+	ahnd = intel_allocator_open_full(fd, 1, 0, 0,
+					 INTEL_ALLOCATOR_RANDOM,
+					 ALLOC_STRATEGY_HIGH_TO_LOW,
+					 SZ_2M);
+
+	pg_size = ((mr->ci.memory_class == I915_MEMORY_CLASS_DEVICE) &&
+		   HAS_64K_PAGES(intel_get_drm_devid(fd))) ? SZ_64K : PAGE_SIZE;
+	size = pg_size * 4;
+
+	vm_id = gem_vm_create_in_vm_bind_mode(fd);
+	handle = gem_create_ext(fd, size, 0, &setparam_region.base);
+	va = CANONICAL(get_offset(ahnd, handle, size, 0));
+
+	/* Bind and unbind */
+	vm_bind(fd, vm_id, handle, va, 0, size, 0, NULL);
+	vm_unbind(fd, vm_id, va, size, 0);
+
+	/* Bind with an out-fence */
+	vm_bind(fd, vm_id, handle, va, 0, size, 0, &fence);
+	vm_unbind(fd, vm_id, va, size, 0);
+
+	/* Aliasing bind and unbind */
+	vm_bind(fd, vm_id, handle, va, 0, size, 0, NULL);
+	vm_bind(fd, vm_id, handle, va + SZ_2M, 0, size, 0, NULL);
+	vm_unbind(fd, vm_id, va, size, 0);
+	vm_unbind(fd, vm_id, va + SZ_2M, size, 0);
+
+	/* MBZ fields are not 0 */
+	igt_assert_eq(__vm_bind(fd, vm_id, handle, va, 0, size, 0x10, NULL, 0), -EINVAL);
+	igt_assert_eq(__vm_bind(fd, vm_id, handle, va, 0, size, 0, NULL, to_user_pointer(&obj)), -EINVAL);
+	vm_bind(fd, vm_id, handle, va, 0, size, 0, NULL);
+	igt_assert_eq(__vm_unbind(fd, vm_id, va, size, 0x10, 0, 0), -EINVAL);
+	igt_assert_eq(__vm_unbind(fd, vm_id, va, size, 0, 0x10, 0), -EINVAL);
+	igt_assert_eq(__vm_unbind(fd, vm_id, va, size, 0, 0, to_user_pointer(&obj)), -EINVAL);
+	vm_unbind(fd, vm_id, va, size, 0);
+
+	/* Invalid handle */
+	igt_assert_eq(__vm_bind(fd, vm_id, handle + 10, va, 0, size, 0, NULL, 0), -ENOENT);
+
+	/* Invalid mapping range */
+	igt_assert_eq(__vm_bind(fd, vm_id, handle, va, 0, 0, 0, NULL, 0), -EINVAL);
+	igt_assert_eq(__vm_bind(fd, vm_id, handle, va, pg_size, size, 0, NULL, 0), -EINVAL);
+
+	/* Unaligned binds */
+	igt_assert_eq(__vm_bind(fd, vm_id, handle, va + 0x10, 0, size, 0, NULL, 0), -EINVAL);
+	igt_assert_eq(__vm_bind(fd, vm_id, handle, va, pg_size / 2, pg_size, 0, NULL, 0), -EINVAL);
+	igt_assert_eq(__vm_bind(fd, vm_id, handle, va, 0, pg_size / 2, 0, NULL, 0), -EINVAL);
+
+	/* range overflow binds */
+	igt_assert_eq(__vm_bind(fd, vm_id, handle, va, pg_size, -pg_size, 0, NULL, 0), -EINVAL);
+	igt_assert_eq(__vm_bind(fd, vm_id, handle, va, pg_size * 2, -pg_size, 0, NULL, 0), -EINVAL);
+
+	/* re-bind VA range without unbinding */
+	vm_bind(fd, vm_id, handle, va, 0, size, 0, NULL);
+	igt_assert_eq(__vm_bind(fd, vm_id, handle, va, 0, size, 0, NULL, 0), -EEXIST);
+	vm_unbind(fd, vm_id, va, size, 0);
+
+	/* unbind a non-existing mapping */
+	igt_assert_eq(__vm_unbind(fd, vm_id, va + SZ_2M, size, 0, 0, 0), -ENOENT);
+
+	/* unbind with length mismatch */
+	vm_bind(fd, vm_id, handle, va, 0, size, 0, NULL);
+	igt_assert_eq(__vm_unbind(fd, vm_id, va, size * 2, 0, 0, 0), -EINVAL);
+	vm_unbind(fd, vm_id, va, size, 0);
+
+	/* validate exclusivity of vm_bind & exec modes of binding */
+	vm_id_exec_mode = gem_vm_create(fd);
+	igt_assert_eq(__vm_bind(fd, vm_id_exec_mode, handle, va, 0, size, 0, NULL, 0), -EOPNOTSUPP);
+
+	ctx = intel_ctx_create_all_physical(fd);
+	gem_context_set_vm(fd, ctx->id, vm_id);
+	(void)gem_context_get_vm(fd, ctx->id);
+
+	memset(&obj, 0, sizeof(obj));
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.buffers_ptr = to_user_pointer(&obj);
+	execbuf.buffer_count = 1;
+	obj.handle = handle;
+	i915_execbuffer2_set_context_id(execbuf, ctx->id);
+	igt_assert_eq(__gem_execbuf(fd, &execbuf), -EOPNOTSUPP);
+
+	intel_ctx_destroy(fd, ctx);
+	gem_vm_destroy(fd, vm_id_exec_mode);
+	gem_close(fd, handle);
+
+	/* validate VM private objects */
+	setparam_region.base.next_extension = to_user_pointer(&vm_priv);
+	igt_assert_eq(__gem_create_ext(fd, &size, 0, &handle,
+				       &setparam_region.base), -ENOENT);
+	vm_priv.rsvd = 0x10;
+	igt_assert_eq(__gem_create_ext(fd, &size, 0, &handle,
+				       &setparam_region.base), -EINVAL);
+	vm_id2 = gem_vm_create(fd);
+	vm_priv.rsvd = 0;
+	vm_priv.vm_id = vm_id2;
+	igt_assert_eq(__gem_create_ext(fd, &size, 0, &handle,
+				       &setparam_region.base), -EINVAL);
+	gem_vm_destroy(fd, vm_id2);
+
+	vm_id2 = gem_vm_create_in_vm_bind_mode(fd);
+	vm_priv.vm_id = vm_id2;
+	handle = gem_create_ext(fd, size, 0, &setparam_region.base);
+
+	igt_assert_eq(__prime_handle_to_fd(fd, handle, DRM_CLOEXEC, &dmabuf), -EINVAL);
+	igt_assert_eq(__vm_bind(fd, vm_id, handle, va, 0, size, 0, NULL, 0), -EINVAL);
+	vm_bind(fd, vm_id2, handle, va, 0, size, 0, NULL);
+	vm_unbind(fd, vm_id2, va, size, 0);
+
+	gem_close(fd, handle);
+	gem_vm_destroy(fd, vm_id2);
+	gem_vm_destroy(fd, vm_id);
+	syncobj_destroy(fd, fence.handle);
+	intel_allocator_close(ahnd);
+}
+
+igt_main
+{
+	int fd;
+
+	igt_fixture {
+		fd = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(fd);
+		igt_require(i915_vm_bind_version(fd) == 1);
+	}
+
+	igt_describe("Basic vm_bind sanity test");
+	igt_subtest_with_dynamic("basic") {
+		for_each_memory_region(r, fd) {
+			if (r->ci.memory_instance)
+				continue;
+
+			igt_dynamic_f("%s", r->name)
+				basic(fd, r);
+		}
+	}
+
+	igt_fixture {
+		close(fd);
+	}
+
+	igt_exit();
+}
diff --git a/tests/intel-ci/fast-feedback.testlist b/tests/intel-ci/fast-feedback.testlist
index bd5538a035..185c2fef54 100644
--- a/tests/intel-ci/fast-feedback.testlist
+++ b/tests/intel-ci/fast-feedback.testlist
@@ -53,6 +53,7 @@ igt@i915_getparams_basic@basic-eu-total
 igt@i915_getparams_basic@basic-subslice-total
 igt@i915_hangman@error-state-basic
 igt@i915_pciid
+igt@i915_vm_bind_sanity@basic
 igt@kms_addfb_basic@addfb25-bad-modifier
 igt@kms_addfb_basic@addfb25-framebuffer-vs-set-tiling
 igt@kms_addfb_basic@addfb25-modifier-no-flag
diff --git a/tests/meson.build b/tests/meson.build
index 12e53e0bd2..d81c6c28c2 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -250,6 +250,7 @@ i915_progs = [
 	'sysfs_heartbeat_interval',
 	'sysfs_preempt_timeout',
 	'sysfs_timeslice_duration',
+	'i915_vm_bind_sanity',
 ]
 
 msm_progs = [
-- 
2.21.0.rc0.32.g243a4c7e27
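The -EINVAL cases the sanity test exercises (zero length, unaligned start,
offset or length, and ranges that overflow or run past the object) all follow
from the uapi's alignment and range rules. A minimal sketch of those checks —
illustrative only, not the actual i915 kernel code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Validate a vm_bind request against an object of obj_size bytes,
 * where pg_size is the required page alignment. Illustrative sketch
 * of the uapi rules, not the driver's implementation. */
static bool bind_args_valid(uint64_t start, uint64_t offset,
			    uint64_t length, uint64_t obj_size,
			    uint64_t pg_size)
{
	if (!length)
		return false;				/* empty range */
	if ((start | offset | length) & (pg_size - 1))
		return false;				/* unaligned */
	if (offset + length < offset)
		return false;				/* range overflow */
	if (offset + length > obj_size)
		return false;				/* past end of object */
	return true;
}
```

Each negative assert in basic() maps onto one of these clauses, e.g. the
`va + 0x10` bind fails the alignment check and the `-pg_size` length binds
fail the overflow check.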

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [igt-dev] [PATCH i-g-t v4 07/11] tests/i915/vm_bind: Add basic VM_BIND test support
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
                   ` (5 preceding siblings ...)
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 06/11] tests/i915/vm_bind: Add vm_bind sanity test Niranjana Vishwanathapura
@ 2022-10-18  7:17 ` Niranjana Vishwanathapura
  2022-10-21 16:27   ` Matthew Auld
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 08/11] tests/i915/vm_bind: Add userptr subtest Niranjana Vishwanathapura
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18  7:17 UTC (permalink / raw)
  To: igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, matthew.auld, daniel.vetter,
	petri.latvala

Add basic tests for VM_BIND functionality. Bind the buffer objects in
the device page table with VM_BIND calls and have the GPU copy the data
from a source buffer object to a destination buffer object.
Test different buffer sizes, buffer object placements and multiple
contexts.

v2: Add basic test to fast-feedback.testlist
v3: Run all tests for different memory types,
    pass VA instead of an array for batch_address
v4: Iterate for memory region types instead of hardcoding,
    use i915_vm_bind library functions
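The data-integrity check this test performs — fill each source page with a
distinct pattern, let the GPU blit it, then memcmp source against destination —
can be sketched in plain memory, with memcpy standing in for the blitter. Page
size and page counts here are illustrative:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PG 4096

/* Give each page a distinct byte pattern, as pattern_fill_buf()
 * does for the source objects. */
static void fill_pages(uint8_t *buf, unsigned int npages)
{
	for (unsigned int j = 0; j < npages; j++)
		memset(buf + (size_t)j * PG, j + 1, PG);
}

/* Simulate the copy-and-verify flow; returns 1 on a matching copy. */
static int copy_and_verify(unsigned int npages)
{
	size_t size = (size_t)npages * PG;
	uint8_t *src = malloc(size), *dst = malloc(size);
	int ok = 0;

	if (src && dst) {
		fill_pages(src, npages);
		memcpy(dst, src, size);		/* stand-in for the GPU blit */
		ok = memcmp(src, dst, size) == 0;
	}
	free(src);
	free(dst);
	return ok;
}
```

In the real test the memcpy is replaced by an XY_FAST_COPY_BLT batch executed
through execbuf3 on VM_BIND-mapped addresses.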

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>

---
 tests/i915/i915_vm_bind_basic.c       | 522 ++++++++++++++++++++++++++
 tests/intel-ci/fast-feedback.testlist |   1 +
 tests/meson.build                     |   1 +
 3 files changed, 524 insertions(+)
 create mode 100644 tests/i915/i915_vm_bind_basic.c

diff --git a/tests/i915/i915_vm_bind_basic.c b/tests/i915/i915_vm_bind_basic.c
new file mode 100644
index 0000000000..ede76ba095
--- /dev/null
+++ b/tests/i915/i915_vm_bind_basic.c
@@ -0,0 +1,522 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+/** @file i915_vm_bind_basic.c
+ *
+ * This is the basic test for VM_BIND functionality.
+ *
+ * The goal is to ensure that basics work.
+ */
+
+#include <sys/poll.h>
+
+#include "i915/gem.h"
+#include "i915/i915_vm_bind.h"
+#include "igt.h"
+#include "igt_syncobj.h"
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <errno.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include "drm.h"
+#include "i915/gem_vm.h"
+
+IGT_TEST_DESCRIPTION("Basic test for vm_bind functionality");
+
+#define PAGE_SIZE   4096
+#define PAGE_SHIFT  12
+
+#define GEN9_XY_FAST_COPY_BLT_CMD       (2 << 29 | 0x42 << 22)
+#define BLT_DEPTH_32                    (3 << 24)
+
+#define DEFAULT_BUFF_SIZE  (4 * PAGE_SIZE)
+#define SZ_64K             (16 * PAGE_SIZE)
+#define SZ_2M              (512 * PAGE_SIZE)
+
+#define MAX_CTXTS   2
+#define MAX_CMDS    4
+
+#define BATCH_FENCE  0
+#define SRC_FENCE    1
+#define DST_FENCE    2
+#define EXEC_FENCE   3
+#define NUM_FENCES   4
+
+enum {
+	BATCH_MAP,
+	SRC_MAP,
+	DST_MAP = SRC_MAP + MAX_CMDS,
+	MAX_MAP
+};
+
+struct mapping {
+	uint32_t  obj;
+	uint64_t  va;
+	uint64_t  offset;
+	uint64_t  length;
+	uint64_t  flags;
+};
+
+#define SET_MAP(map, _obj, _va, _offset, _length, _flags)   \
+{                                  \
+	(map).obj = _obj;          \
+	(map).va = _va;            \
+	(map).offset = _offset;	   \
+	(map).length = _length;	   \
+	(map).flags = _flags;	   \
+}
+
+#define MAX_BATCH_DWORD    64
+
+#define abs(x) ((x) >= 0 ? (x) : -(x))
+
+#define TEST_LMEM             BIT(0)
+#define TEST_SKIP_UNBIND      BIT(1)
+#define TEST_SHARE_VM         BIT(2)
+
+#define is_lmem(cfg)        ((cfg)->flags & TEST_LMEM)
+#define do_unbind(cfg)      (!((cfg)->flags & TEST_SKIP_UNBIND))
+#define do_share_vm(cfg)    ((cfg)->flags & TEST_SHARE_VM)
+
+struct test_cfg {
+	const char *name;
+	uint32_t size;
+	uint8_t num_cmds;
+	uint32_t num_ctxts;
+	uint32_t flags;
+};
+
+static uint64_t
+gettime_ns(void)
+{
+	struct timespec current;
+	clock_gettime(CLOCK_MONOTONIC, &current);
+	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
+}
+
+static bool syncobj_busy(int fd, uint32_t handle)
+{
+	bool result;
+	int sf;
+
+	sf = syncobj_handle_to_fd(fd, handle,
+				  DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE);
+	result = poll(&(struct pollfd){sf, POLLIN}, 1, 0) == 0;
+	close(sf);
+
+	return result;
+}
+
+static inline void vm_bind(int fd, uint32_t vm_id, struct mapping *m,
+			   struct drm_i915_gem_timeline_fence *fence)
+{
+	uint32_t syncobj = 0;
+
+	if (fence) {
+		syncobj =  syncobj_create(fd, 0);
+
+		fence->handle = syncobj;
+		fence->flags = I915_TIMELINE_FENCE_WAIT;
+		fence->value = 0;
+	}
+
+	igt_info("VM_BIND vm:0x%x h:0x%x v:0x%lx o:0x%lx l:0x%lx\n",
+		 vm_id, m->obj, m->va, m->offset, m->length);
+	i915_vm_bind(fd, vm_id, m->va, m->obj, m->offset, m->length, syncobj, 0);
+}
+
+static inline void vm_unbind(int fd, uint32_t vm_id, struct mapping *m)
+{
+	/* Object handle is not required during unbind */
+	igt_info("VM_UNBIND vm:0x%x v:0x%lx l:0x%lx\n", vm_id, m->va, m->length);
+	i915_vm_unbind(fd, vm_id, m->va, m->length);
+}
+
+static void print_buffer(void *buf, uint32_t size,
+			 const char *str, bool full)
+{
+	uint32_t i = 0;
+
+	igt_debug("Printing %s size 0x%x\n", str, size);
+	while (i < size) {
+		uint32_t *b = buf + i;
+
+		igt_debug("\t%s[0x%04x]: 0x%08x 0x%08x 0x%08x 0x%08x %s\n",
+			  str, i, b[0], b[1], b[2], b[3], full ? "" : "...");
+		i += full ? 16 : PAGE_SIZE;
+	}
+}
+
+static int gem_linear_fast_blt(uint32_t *batch, uint64_t src,
+			       uint64_t dst, uint32_t size)
+{
+	uint32_t *cmd = batch;
+
+	*cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
+	*cmd++ = BLT_DEPTH_32 | PAGE_SIZE;
+	*cmd++ = 0;
+	*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+	*cmd++ = lower_32_bits(dst);
+	*cmd++ = upper_32_bits(dst);
+	*cmd++ = 0;
+	*cmd++ = PAGE_SIZE;
+	*cmd++ = lower_32_bits(src);
+	*cmd++ = upper_32_bits(src);
+
+	*cmd++ = MI_BATCH_BUFFER_END;
+	*cmd++ = 0;
+
+	return ALIGN((cmd - batch + 1) * sizeof(uint32_t), 8);
+}
+
+static void __gem_copy(int fd, uint64_t src, uint64_t dst, uint32_t size,
+		       uint32_t ctx_id, void *batch_addr, unsigned int eb_flags,
+		       struct drm_i915_gem_timeline_fence *fence)
+{
+	uint32_t len, buf[MAX_BATCH_DWORD] = { 0 };
+	struct drm_i915_gem_execbuffer3 execbuf;
+
+	len = gem_linear_fast_blt(buf, src, dst, size);
+
+	memcpy(batch_addr, (void *)buf, len);
+	print_buffer(buf, len, "batch", true);
+
+	memset(&execbuf, 0, sizeof(execbuf));
+	execbuf.ctx_id = ctx_id;
+	execbuf.batch_address = to_user_pointer(batch_addr);
+	execbuf.engine_idx = eb_flags;
+	execbuf.fence_count = NUM_FENCES;
+	execbuf.timeline_fences = to_user_pointer(fence);
+	gem_execbuf3(fd, &execbuf);
+}
+
+static void i915_gem_copy(int fd, uint64_t src, uint64_t dst, uint32_t va_delta,
+			  uint32_t delta, uint32_t size, const intel_ctx_t **ctx,
+			  uint32_t num_ctxts, void **batch_addr, unsigned int eb_flags,
+			  struct drm_i915_gem_timeline_fence (*fence)[NUM_FENCES])
+{
+	uint32_t i;
+
+	for (i = 0; i < num_ctxts; i++) {
+		igt_info("Issuing gem copy on ctx 0x%x\n", ctx[i]->id);
+		__gem_copy(fd, src + (i * va_delta), dst + (i * va_delta), delta,
+			   ctx[i]->id, batch_addr[i], eb_flags, fence[i]);
+	}
+}
+
+static void i915_gem_sync(int fd, const intel_ctx_t **ctx, uint32_t num_ctxts,
+			  struct drm_i915_gem_timeline_fence (*fence)[NUM_FENCES])
+{
+	uint32_t i;
+
+	for (i = 0; i < num_ctxts; i++) {
+		uint64_t fence_value = 0;
+
+		igt_assert(syncobj_timeline_wait(fd, &fence[i][EXEC_FENCE].handle,
+						 (uint64_t *)&fence_value, 1,
+						 gettime_ns() + (2 * NSEC_PER_SEC),
+						 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
+		igt_assert(!syncobj_busy(fd, fence[i][EXEC_FENCE].handle));
+		igt_info("gem copy completed on ctx 0x%x\n", ctx[i]->id);
+	}
+}
+
+static struct igt_collection *get_region_set(int fd, struct test_cfg *cfg, bool lmem_only)
+{
+	uint32_t mem_type[] = { I915_SYSTEM_MEMORY, I915_DEVICE_MEMORY };
+	uint32_t lmem_type[] = { I915_DEVICE_MEMORY };
+	struct drm_i915_query_memory_regions *query_info;
+
+	query_info = gem_get_query_memory_regions(fd);
+	igt_assert(query_info);
+
+	if (lmem_only)
+		return __get_memory_region_set(query_info, lmem_type, 1);
+	else
+		return __get_memory_region_set(query_info, mem_type, 2);
+}
+
+static void create_src_objs(int fd, struct test_cfg *cfg, uint32_t src[], uint32_t size,
+			    uint32_t num_cmds, void *src_addr[])
+{
+	int i;
+	struct igt_collection *set = get_region_set(fd, cfg, is_lmem(cfg) && (num_cmds == 1));
+	uint32_t region;
+
+	for (i = 0; i < num_cmds; i++) {
+		region = igt_collection_get_value(set, i % set->size);
+		src[i] = gem_create_in_memory_regions(fd, size, region);
+		igt_info("Src obj 0x%x created in region 0x%x:0x%x\n", src[i],
+			 MEMORY_TYPE_FROM_REGION(region), MEMORY_INSTANCE_FROM_REGION(region));
+		src_addr[i] = gem_mmap__cpu(fd, src[i], 0, size, PROT_WRITE);
+	}
+}
+
+static void destroy_src_objs(int fd, struct test_cfg *cfg, uint32_t src[], uint32_t size,
+			     uint32_t num_cmds, void *src_addr[])
+{
+	int i;
+
+	for (i = 0; i < num_cmds; i++) {
+		igt_assert(gem_munmap(src_addr[i], size) == 0);
+		igt_debug("Closing object 0x%x\n", src[i]);
+		gem_close(fd, src[i]);
+	}
+}
+
+static uint32_t create_dst_obj(int fd, struct test_cfg *cfg, uint32_t size, void **dst_addr)
+{
+	uint32_t dst;
+	struct igt_collection *set = get_region_set(fd, cfg, is_lmem(cfg));
+	uint32_t region = igt_collection_get_value(set, 0);
+
+	dst = gem_create_in_memory_regions(fd, size, region);
+	igt_info("Dst obj 0x%x created in region 0x%x:0x%x\n", dst,
+		 MEMORY_TYPE_FROM_REGION(region), MEMORY_INSTANCE_FROM_REGION(region));
+	*dst_addr = gem_mmap__cpu(fd, dst, 0, size, PROT_WRITE);
+
+	return dst;
+}
+
+static void destroy_dst_obj(int fd, struct test_cfg *cfg, uint32_t dst, uint32_t size, void *dst_addr)
+{
+	igt_assert(gem_munmap(dst_addr, size) == 0);
+	igt_debug("Closing object 0x%x\n", dst);
+	gem_close(fd, dst);
+}
+
+static void pattern_fill_buf(void *src_addr[], uint32_t size, uint32_t num_cmds, uint32_t npages)
+{
+	uint32_t i, j;
+	void *buf;
+
+	/* Allocate buffer and fill pattern */
+	buf = malloc(size);
+	igt_require(buf);
+
+	for (i = 0; i < num_cmds; i++) {
+		for (j = 0; j < npages; j++)
+			memset(buf + j * PAGE_SIZE, i * npages + j + 1, PAGE_SIZE);
+
+		memcpy(src_addr[i], buf, size);
+	}
+
+	free(buf);
+}
+
+static void run_test(int fd, const intel_ctx_t *base_ctx, struct test_cfg *cfg,
+		     const struct intel_execution_engine2 *e)
+{
+	void *src_addr[MAX_CMDS] = { 0 }, *dst_addr = NULL;
+	uint32_t src[MAX_CMDS], dst, i, size = cfg->size;
+	struct drm_i915_gem_timeline_fence exec_fence[MAX_CTXTS][NUM_FENCES];
+	uint32_t shared_vm_id, vm_id[MAX_CTXTS];
+	struct mapping map[MAX_CTXTS][MAX_MAP];
+	uint32_t num_ctxts = cfg->num_ctxts;
+	uint32_t num_cmds = cfg->num_cmds;
+	uint32_t npages = size / PAGE_SIZE;
+	const intel_ctx_t *ctx[MAX_CTXTS];
+	bool share_vm = do_share_vm(cfg);
+	uint32_t delta, va_delta = SZ_2M;
+	void *batch_addr[MAX_CTXTS];
+	uint32_t batch[MAX_CTXTS];
+	uint64_t src_va, dst_va;
+
+	delta = size / num_ctxts;
+	if (share_vm)
+		shared_vm_id = gem_vm_create_in_vm_bind_mode(fd);
+
+	/* Create contexts */
+	num_ctxts = min_t(uint32_t, num_ctxts, MAX_CTXTS);
+	for (i = 0; i < num_ctxts; i++) {
+		uint32_t vmid;
+
+		if (share_vm)
+			vmid = shared_vm_id;
+		else
+			vmid = gem_vm_create_in_vm_bind_mode(fd);
+
+		ctx[i] = intel_ctx_create(fd, &base_ctx->cfg);
+		gem_context_set_vm(fd, ctx[i]->id, vmid);
+		vm_id[i] = gem_context_get_vm(fd, ctx[i]->id);
+
+		exec_fence[i][EXEC_FENCE].handle = syncobj_create(fd, 0);
+		exec_fence[i][EXEC_FENCE].flags = I915_TIMELINE_FENCE_SIGNAL;
+		exec_fence[i][EXEC_FENCE].value = 0;
+	}
+
+	/* Create objects */
+	num_cmds = min_t(uint32_t, num_cmds, MAX_CMDS);
+	create_src_objs(fd, cfg, src, size, num_cmds, src_addr);
+	dst = create_dst_obj(fd, cfg, size, &dst_addr);
+
+	/*
+	 * mmap'ed addresses are not 64K aligned. On platforms requiring
+	 * 64K alignment, use static addresses.
+	 */
+	if (size < SZ_2M && num_cmds && !HAS_64K_PAGES(intel_get_drm_devid(fd))) {
+		src_va = to_user_pointer(src_addr[0]);
+		dst_va = to_user_pointer(dst_addr);
+	} else {
+		src_va = 0xa000000;
+		dst_va = 0xb000000;
+	}
+
+	pattern_fill_buf(src_addr, size, num_cmds, npages);
+
+	if (num_cmds)
+		print_buffer(src_addr[num_cmds - 1], size, "src_obj", false);
+
+	for (i = 0; i < num_ctxts; i++) {
+		batch[i] = gem_create_in_memory_regions(fd, PAGE_SIZE, REGION_SMEM);
+		igt_info("Batch obj 0x%x created in region 0x%x:0x%x\n", batch[i],
+			 MEMORY_TYPE_FROM_REGION(REGION_SMEM), MEMORY_INSTANCE_FROM_REGION(REGION_SMEM));
+		batch_addr[i] = gem_mmap__cpu(fd, batch[i], 0, PAGE_SIZE, PROT_WRITE);
+	}
+
+	/* Create mappings */
+	for (i = 0; i < num_ctxts; i++) {
+		uint64_t va_offset = i * va_delta;
+		uint64_t offset = i * delta;
+		uint32_t j;
+
+		for (j = 0; j < num_cmds; j++)
+			SET_MAP(map[i][SRC_MAP + j], src[j], src_va + va_offset, offset, delta, 0);
+		SET_MAP(map[i][DST_MAP], dst, dst_va + va_offset, offset, delta, 0);
+		SET_MAP(map[i][BATCH_MAP], batch[i], to_user_pointer(batch_addr[i]), 0, PAGE_SIZE, 0);
+	}
+
+	/* Bind the buffers to device page table */
+	for (i = 0; i < num_ctxts; i++) {
+		vm_bind(fd, vm_id[i], &map[i][BATCH_MAP], &exec_fence[i][BATCH_FENCE]);
+		vm_bind(fd, vm_id[i], &map[i][DST_MAP], &exec_fence[i][DST_FENCE]);
+	}
+
+	/* Have GPU do the copy */
+	for (i = 0; i < cfg->num_cmds; i++) {
+		uint32_t j;
+
+		for (j = 0; j < num_ctxts; j++)
+			vm_bind(fd, vm_id[j], &map[j][SRC_MAP + i], &exec_fence[j][SRC_FENCE]);
+
+		i915_gem_copy(fd, src_va, dst_va, va_delta, delta, size, ctx,
+			      num_ctxts, batch_addr, e->flags, exec_fence);
+
+		i915_gem_sync(fd, ctx, num_ctxts, exec_fence);
+
+		for (j = 0; j < num_ctxts; j++) {
+			syncobj_destroy(fd, exec_fence[j][SRC_FENCE].handle);
+			if (do_unbind(cfg))
+				vm_unbind(fd, vm_id[j], &map[j][SRC_MAP + i]);
+		}
+	}
+
+	/*
+	 * Unbind the buffers from the device page table.
+	 * If skipped, they get unbound when the buffers are freed.
+	 */
+	for (i = 0; i < num_ctxts; i++) {
+		syncobj_destroy(fd, exec_fence[i][BATCH_FENCE].handle);
+		syncobj_destroy(fd, exec_fence[i][DST_FENCE].handle);
+		if (do_unbind(cfg)) {
+			vm_unbind(fd, vm_id[i], &map[i][BATCH_MAP]);
+			vm_unbind(fd, vm_id[i], &map[i][DST_MAP]);
+		}
+	}
+
+	/* Close batch buffers */
+	for (i = 0; i < num_ctxts; i++) {
+		syncobj_destroy(fd, exec_fence[i][EXEC_FENCE].handle);
+		gem_close(fd, batch[i]);
+	}
+
+	/* Accessing the buffer will migrate the pages from device to host */
+	print_buffer(dst_addr, size, "dst_obj", false);
+
+	/* Validate by comparing the last SRC with DST */
+	if (num_cmds)
+		igt_assert(memcmp(src_addr[num_cmds - 1], dst_addr, size) == 0);
+
+	/* Free the objects */
+	destroy_src_objs(fd, cfg, src, size, num_cmds, src_addr);
+	destroy_dst_obj(fd, cfg, dst, size, dst_addr);
+
+	/* Done with the contexts */
+	for (i = 0; i < num_ctxts; i++) {
+		igt_debug("Destroying context 0x%x\n", ctx[i]->id);
+		gem_vm_destroy(fd, vm_id[i]);
+		intel_ctx_destroy(fd, ctx[i]);
+	}
+
+	if (share_vm)
+		gem_vm_destroy(fd, shared_vm_id);
+}
+
+igt_main
+{
+	struct test_cfg *t, tests[] = {
+		{"basic", 0, 1, 1, 0},
+		{"multi_cmds", 0, MAX_CMDS, 1, 0},
+		{"skip_copy", 0, 0, 1, 0},
+		{"skip_unbind",  0, 1, 1, TEST_SKIP_UNBIND},
+		{"multi_ctxts", 0, 1, MAX_CTXTS, 0},
+		{"share_vm", 0, 1, MAX_CTXTS, TEST_SHARE_VM},
+		{"64K", (16 * PAGE_SIZE), 1, 1, 0},
+		{"2M", SZ_2M, 1, 1, 0},
+		{ }
+	};
+	int fd;
+	uint32_t def_size;
+	struct intel_execution_engine2 *e;
+	const intel_ctx_t *ctx;
+
+	igt_fixture {
+		fd = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(fd);
+		igt_require(i915_vm_bind_version(fd) == 1);
+		def_size = HAS_64K_PAGES(intel_get_drm_devid(fd)) ?
+			   SZ_64K : DEFAULT_BUFF_SIZE;
+		ctx = intel_ctx_create_all_physical(fd);
+		/* Get copy engine */
+		for_each_ctx_engine(fd, ctx, e)
+			if (e->class == I915_ENGINE_CLASS_COPY)
+				break;
+	}
+
+	/* Adjust test variables */
+	for (t = tests; t->name; t++)
+		t->size = t->size ? : (def_size * abs(t->num_ctxts));
+
+	for (t = tests; t->name; t++) {
+		igt_describe_f("VM_BIND %s", t->name);
+		igt_subtest_with_dynamic(t->name) {
+			for_each_memory_region(r, fd) {
+				struct test_cfg cfg = *t;
+
+				if (r->ci.memory_instance)
+					continue;
+
+				if (r->ci.memory_class == I915_MEMORY_CLASS_DEVICE)
+					cfg.flags |= TEST_LMEM;
+
+				igt_dynamic_f("%s", r->name)
+					run_test(fd, ctx, &cfg, e);
+			}
+		}
+	}
+
+	igt_fixture {
+		intel_ctx_destroy(fd, ctx);
+		close(fd);
+	}
+
+	igt_exit();
+}
diff --git a/tests/intel-ci/fast-feedback.testlist b/tests/intel-ci/fast-feedback.testlist
index 185c2fef54..8a47eaddda 100644
--- a/tests/intel-ci/fast-feedback.testlist
+++ b/tests/intel-ci/fast-feedback.testlist
@@ -53,6 +53,7 @@ igt@i915_getparams_basic@basic-eu-total
 igt@i915_getparams_basic@basic-subslice-total
 igt@i915_hangman@error-state-basic
 igt@i915_pciid
+igt@i915_vm_bind_basic@basic-smem
 igt@i915_vm_bind_sanity@basic
 igt@kms_addfb_basic@addfb25-bad-modifier
 igt@kms_addfb_basic@addfb25-framebuffer-vs-set-tiling
diff --git a/tests/meson.build b/tests/meson.build
index d81c6c28c2..1b9529a7e2 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -251,6 +251,7 @@ i915_progs = [
 	'sysfs_preempt_timeout',
 	'sysfs_timeslice_duration',
 	'i915_vm_bind_sanity',
+	'i915_vm_bind_basic',
 ]
 
 msm_progs = [
-- 
2.21.0.rc0.32.g243a4c7e27

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [igt-dev] [PATCH i-g-t v4 08/11] tests/i915/vm_bind: Add userptr subtest
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
                   ` (6 preceding siblings ...)
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 07/11] tests/i915/vm_bind: Add basic VM_BIND test support Niranjana Vishwanathapura
@ 2022-10-18  7:17 ` Niranjana Vishwanathapura
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 09/11] tests/i915/vm_bind: Add gem_exec3_basic test Niranjana Vishwanathapura
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18  7:17 UTC (permalink / raw)
  To: igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, matthew.auld, daniel.vetter,
	petri.latvala

Add userptr object type to vm_bind_basic test.

v2: Run all tests for userptr type

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 tests/i915/i915_vm_bind_basic.c | 79 +++++++++++++++++++++++++++------
 1 file changed, 65 insertions(+), 14 deletions(-)

diff --git a/tests/i915/i915_vm_bind_basic.c b/tests/i915/i915_vm_bind_basic.c
index ede76ba095..5316d1b9e8 100644
--- a/tests/i915/i915_vm_bind_basic.c
+++ b/tests/i915/i915_vm_bind_basic.c
@@ -81,10 +81,12 @@ struct mapping {
 #define TEST_LMEM             BIT(0)
 #define TEST_SKIP_UNBIND      BIT(1)
 #define TEST_SHARE_VM         BIT(2)
+#define TEST_USERPTR          BIT(3)
 
 #define is_lmem(cfg)        ((cfg)->flags & TEST_LMEM)
 #define do_unbind(cfg)      (!((cfg)->flags & TEST_SKIP_UNBIND))
 #define do_share_vm(cfg)    ((cfg)->flags & TEST_SHARE_VM)
+#define is_userptr(cfg)     ((cfg)->flags & TEST_USERPTR)
 
 struct test_cfg {
 	const char *name;
@@ -248,15 +250,24 @@ static void create_src_objs(int fd, struct test_cfg *cfg, uint32_t src[], uint32
 			    uint32_t num_cmds, void *src_addr[])
 {
 	int i;
-	struct igt_collection *set = get_region_set(fd, cfg, is_lmem(cfg) && (num_cmds == 1));
-	uint32_t region;
 
-	for (i = 0; i < num_cmds; i++) {
-		region = igt_collection_get_value(set, i % set->size);
-		src[i] = gem_create_in_memory_regions(fd, size, region);
-		igt_info("Src obj 0x%x created in region 0x%x:0x%x\n", src[i],
-			 MEMORY_TYPE_FROM_REGION(region), MEMORY_INSTANCE_FROM_REGION(region));
-		src_addr[i] = gem_mmap__cpu(fd, src[i], 0, size, PROT_WRITE);
+	if (is_userptr(cfg)) {
+		for (i = 0; i < num_cmds; i++) {
+			igt_assert(posix_memalign(&src_addr[i], SZ_2M * MAX_CTXTS, size) == 0);
+			gem_userptr(fd, src_addr[i], size, 0, 0, &src[i]);
+			igt_info("Src userptr obj 0x%x created\n", src[i]);
+		}
+	} else {
+		struct igt_collection *set = get_region_set(fd, cfg, is_lmem(cfg) && (num_cmds == 1));
+		uint32_t region;
+
+		for (i = 0; i < num_cmds; i++) {
+			region = igt_collection_get_value(set, i % set->size);
+			src[i] = gem_create_in_memory_regions(fd, size, region);
+			igt_info("Src obj 0x%x created in region 0x%x:0x%x\n", src[i],
+				 MEMORY_TYPE_FROM_REGION(region), MEMORY_INSTANCE_FROM_REGION(region));
+			src_addr[i] = gem_mmap__cpu(fd, src[i], 0, size, PROT_WRITE);
+		}
 	}
 }
 
@@ -269,19 +280,29 @@ static void destroy_src_objs(int fd, struct test_cfg *cfg, uint32_t src[], uint3
 		igt_assert(gem_munmap(src_addr[i], size) == 0);
 		igt_debug("Closing object 0x%x\n", src[i]);
 		gem_close(fd, src[i]);
+		if (is_userptr(cfg))
+			free(src_addr[i]);
 	}
 }
 
 static uint32_t create_dst_obj(int fd, struct test_cfg *cfg, uint32_t size, void **dst_addr)
 {
 	uint32_t dst;
-	struct igt_collection *set = get_region_set(fd, cfg, is_lmem(cfg));
-	uint32_t region = igt_collection_get_value(set, 0);
 
-	dst = gem_create_in_memory_regions(fd, size, region);
-	igt_info("Dst obj 0x%x created in region 0x%x:0x%x\n", dst,
-		 MEMORY_TYPE_FROM_REGION(region), MEMORY_INSTANCE_FROM_REGION(region));
-	*dst_addr = gem_mmap__cpu(fd, dst, 0, size, PROT_WRITE);
+	if (is_userptr(cfg)) {
+		igt_assert(posix_memalign(dst_addr, SZ_2M * MAX_CTXTS, size) == 0);
+
+		gem_userptr(fd, *dst_addr, size, 0, 0, &dst);
+		igt_info("Dst userptr obj 0x%x created\n", dst);
+	} else {
+		struct igt_collection *set = get_region_set(fd, cfg, is_lmem(cfg));
+		uint32_t region = igt_collection_get_value(set, 0);
+
+		dst = gem_create_in_memory_regions(fd, size, region);
+		igt_info("Dst obj 0x%x created in region 0x%x:0x%x\n", dst,
+			 MEMORY_TYPE_FROM_REGION(region), MEMORY_INSTANCE_FROM_REGION(region));
+		*dst_addr = gem_mmap__cpu(fd, dst, 0, size, PROT_WRITE);
+	}
 
 	return dst;
 }
@@ -291,6 +312,9 @@ static void destroy_dst_obj(int fd, struct test_cfg *cfg, uint32_t dst, uint32_t
 	igt_assert(gem_munmap(dst_addr, size) == 0);
 	igt_debug("Closing object 0x%x\n", dst);
 	gem_close(fd, dst);
+
+	if (is_userptr(cfg))
+		free(dst_addr);
 }
 
 static void pattern_fill_buf(void *src_addr[], uint32_t size, uint32_t num_cmds, uint32_t npages)
@@ -460,6 +484,25 @@ static void run_test(int fd, const intel_ctx_t *base_ctx, struct test_cfg *cfg,
 		gem_vm_destroy(fd, shared_vm_id);
 }
 
+static int has_userptr(int fd)
+{
+	uint32_t handle = 0;
+	void *ptr;
+	int ret;
+
+	igt_assert(posix_memalign(&ptr, PAGE_SIZE, PAGE_SIZE) == 0);
+	ret = __gem_userptr(fd, ptr, PAGE_SIZE, 0, 0, &handle);
+	if (ret != 0) {
+		free(ptr);
+		return 0;
+	}
+
+	gem_close(fd, handle);
+	free(ptr);
+
+	return handle != 0;
+}
+
 igt_main
 {
 	struct test_cfg *t, tests[] = {
@@ -510,6 +553,14 @@ igt_main
 				igt_dynamic_f("%s", r->name)
 					run_test(fd, ctx, &cfg, e);
 			}
+
+			if (has_userptr(fd)) {
+				struct test_cfg cfg = *t;
+
+				cfg.flags |= TEST_USERPTR;
+				igt_dynamic("userptr")
+					run_test(fd, ctx, &cfg, e);
+			}
 		}
 	}
 
-- 
2.21.0.rc0.32.g243a4c7e27


* [igt-dev] [PATCH i-g-t v4 09/11] tests/i915/vm_bind: Add gem_exec3_basic test
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
                   ` (7 preceding siblings ...)
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 08/11] tests/i915/vm_bind: Add userptr subtest Niranjana Vishwanathapura
@ 2022-10-18  7:17 ` Niranjana Vishwanathapura
  2022-10-18  9:34   ` Matthew Auld
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 10/11] tests/i915/vm_bind: Add gem_exec3_balancer test Niranjana Vishwanathapura
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18  7:17 UTC (permalink / raw)
  To: igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, matthew.auld, daniel.vetter,
	petri.latvala

Port gem_exec_basic to a new gem_exec3_basic test which
uses the newer execbuf3 ioctl.

v2: use i915_vm_bind library functions

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 tests/i915/gem_exec3_basic.c          | 133 ++++++++++++++++++++++++++
 tests/intel-ci/fast-feedback.testlist |   1 +
 tests/meson.build                     |   1 +
 3 files changed, 135 insertions(+)
 create mode 100644 tests/i915/gem_exec3_basic.c

diff --git a/tests/i915/gem_exec3_basic.c b/tests/i915/gem_exec3_basic.c
new file mode 100644
index 0000000000..3994381748
--- /dev/null
+++ b/tests/i915/gem_exec3_basic.c
@@ -0,0 +1,133 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+/** @file gem_exec3_basic.c
+ *
+ * Basic execbuf3 test.
+ * Ported from gem_exec_basic and made to work with
+ * vm_bind and execbuf3.
+ *
+ */
+
+#include "i915/gem.h"
+#include "i915/i915_vm_bind.h"
+#include "igt.h"
+#include "igt_collection.h"
+#include "igt_syncobj.h"
+
+#include "i915/gem_create.h"
+#include "i915/gem_vm.h"
+
+IGT_TEST_DESCRIPTION("Basic sanity check of execbuf3-ioctl rings.");
+
+#define BATCH_VA	0xa0000000
+#define PAGE_SIZE	4096
+
+static uint64_t gettime_ns(void)
+{
+	struct timespec current;
+	clock_gettime(CLOCK_MONOTONIC, &current);
+	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
+}
+
+static uint32_t batch_create(int fd, uint64_t *batch_size, uint32_t region)
+{
+	const uint32_t bbe = MI_BATCH_BUFFER_END;
+	uint32_t handle;
+
+	igt_assert(__gem_create_in_memory_regions(fd, &handle, batch_size, region) == 0);
+	gem_write(fd, handle, 0, &bbe, sizeof(bbe));
+
+	return handle;
+}
+
+igt_main
+{
+	const struct intel_execution_engine2 *e;
+	struct drm_i915_query_memory_regions *query_info;
+	struct igt_collection *regions, *set;
+	const intel_ctx_t *base_ctx, *ctx;
+	uint32_t vm_id;
+	int fd = -1;
+
+	igt_fixture {
+		fd = drm_open_driver(DRIVER_INTEL);
+		base_ctx = intel_ctx_create_all_physical(fd);
+
+		igt_require_gem(fd);
+		igt_require(i915_vm_bind_version(fd) == 1);
+		igt_fork_hang_detector(fd);
+
+		vm_id = gem_vm_create_in_vm_bind_mode(fd);
+		ctx  = intel_ctx_create(fd, &base_ctx->cfg);
+		gem_context_set_vm(fd, ctx->id, vm_id);
+
+		query_info = gem_get_query_memory_regions(fd);
+		igt_assert(query_info);
+
+		set = get_memory_region_set(query_info,
+					    I915_SYSTEM_MEMORY,
+					    I915_DEVICE_MEMORY);
+	}
+
+	igt_describe("Check basic functionality of GEM_EXECBUFFER3 ioctl on every"
+		     " ring and iterating over memory regions.");
+	igt_subtest_with_dynamic("basic") {
+		for_each_combination(regions, 1, set) {
+			struct drm_i915_gem_timeline_fence exec_fence[GEM_MAX_ENGINES][2] = { };
+			char *sub_name = memregion_dynamic_subtest_name(regions);
+			uint32_t region = igt_collection_get_value(regions, 0);
+			uint32_t bind_syncobj, exec_syncobj[GEM_MAX_ENGINES];
+			uint64_t fence_value[GEM_MAX_ENGINES] = { };
+			uint64_t batch_size = PAGE_SIZE;
+			uint32_t handle, idx = 0;
+
+			handle = batch_create(fd, &batch_size, region);
+			bind_syncobj = syncobj_create(fd, 0);
+			i915_vm_bind(fd, vm_id, BATCH_VA, handle, 0, batch_size, bind_syncobj, 0);
+
+			for_each_ctx_engine(fd, ctx, e) {
+				igt_dynamic_f("%s-%s", e->name, sub_name) {
+					struct drm_i915_gem_execbuffer3 execbuf = {
+						.ctx_id = ctx->id,
+						.batch_address = BATCH_VA,
+						.engine_idx = e->flags,
+						.fence_count = 2,
+						.timeline_fences = to_user_pointer(&exec_fence[idx]),
+					};
+
+					exec_syncobj[idx] = syncobj_create(fd, 0);
+					exec_fence[idx][0].handle = bind_syncobj;
+					exec_fence[idx][0].flags = I915_TIMELINE_FENCE_WAIT;
+					exec_fence[idx][1].handle = exec_syncobj[idx];
+					exec_fence[idx][1].flags = I915_TIMELINE_FENCE_SIGNAL;
+					idx++;
+
+					gem_execbuf3(fd, &execbuf);
+				}
+			}
+
+			igt_assert(syncobj_timeline_wait(fd, exec_syncobj, fence_value, idx,
+							 gettime_ns() + (2 * NSEC_PER_SEC),
+							 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
+			while (idx)
+				syncobj_destroy(fd, exec_syncobj[--idx]);
+			syncobj_destroy(fd, bind_syncobj);
+			i915_vm_unbind(fd, vm_id, BATCH_VA, batch_size);
+			gem_close(fd, handle);
+			free(sub_name);
+		}
+	}
+
+	igt_fixture {
+		intel_ctx_destroy(fd, ctx);
+		gem_vm_destroy(fd, vm_id);
+		free(query_info);
+		igt_collection_destroy(set);
+		igt_stop_hang_detector();
+		intel_ctx_destroy(fd, base_ctx);
+		close(fd);
+	}
+}
diff --git a/tests/intel-ci/fast-feedback.testlist b/tests/intel-ci/fast-feedback.testlist
index 8a47eaddda..c5be7e300b 100644
--- a/tests/intel-ci/fast-feedback.testlist
+++ b/tests/intel-ci/fast-feedback.testlist
@@ -27,6 +27,7 @@ igt@gem_exec_fence@nb-await
 igt@gem_exec_gttfill@basic
 igt@gem_exec_parallel@engines
 igt@gem_exec_store@basic
+igt@gem_exec3_basic@basic
 igt@gem_flink_basic@bad-flink
 igt@gem_flink_basic@bad-open
 igt@gem_flink_basic@basic
diff --git a/tests/meson.build b/tests/meson.build
index 1b9529a7e2..50de8b7e89 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -252,6 +252,7 @@ i915_progs = [
 	'sysfs_timeslice_duration',
 	'i915_vm_bind_sanity',
 	'i915_vm_bind_basic',
+	'gem_exec3_basic',
 ]
 
 msm_progs = [
-- 
2.21.0.rc0.32.g243a4c7e27


* [igt-dev] [PATCH i-g-t v4 10/11] tests/i915/vm_bind: Add gem_exec3_balancer test
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
                   ` (8 preceding siblings ...)
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 09/11] tests/i915/vm_bind: Add gem_exec3_basic test Niranjana Vishwanathapura
@ 2022-10-18  7:17 ` Niranjana Vishwanathapura
  2022-10-21 16:42   ` Matthew Auld
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 11/11] tests/i915/vm_bind: Add gem_lmem_swapping@vm_bind sub test Niranjana Vishwanathapura
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18  7:17 UTC (permalink / raw)
  To: igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, matthew.auld, daniel.vetter,
	petri.latvala

To test parallel submissions support in execbuf3, port the subtest
gem_exec_balancer@parallel-ordering to a new gem_exec3_balancer test
and switch to execbuf3 ioctl.

v2: use i915_vm_bind library functions

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 tests/i915/gem_exec3_balancer.c | 473 ++++++++++++++++++++++++++++++++
 tests/meson.build               |   7 +
 2 files changed, 480 insertions(+)
 create mode 100644 tests/i915/gem_exec3_balancer.c

diff --git a/tests/i915/gem_exec3_balancer.c b/tests/i915/gem_exec3_balancer.c
new file mode 100644
index 0000000000..34b2a6a0b3
--- /dev/null
+++ b/tests/i915/gem_exec3_balancer.c
@@ -0,0 +1,473 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+/** @file gem_exec3_balancer.c
+ *
+ * Load balancer tests with execbuf3.
+ * Ported from gem_exec_balancer and made to work with
+ * vm_bind and execbuf3.
+ *
+ */
+
+#include <poll.h>
+
+#include "i915/gem.h"
+#include "i915/i915_vm_bind.h"
+#include "i915/gem_engine_topology.h"
+#include "i915/gem_create.h"
+#include "i915/gem_vm.h"
+#include "igt.h"
+#include "igt_gt.h"
+#include "igt_perf.h"
+#include "igt_syncobj.h"
+
+IGT_TEST_DESCRIPTION("Exercise in-kernel load-balancing with execbuf3");
+
+#define INSTANCE_COUNT (1 << I915_PMU_SAMPLE_INSTANCE_BITS)
+
+static bool has_class_instance(int i915, uint16_t class, uint16_t instance)
+{
+	int fd;
+
+	fd = perf_i915_open(i915, I915_PMU_ENGINE_BUSY(class, instance));
+	if (fd >= 0) {
+		close(fd);
+		return true;
+	}
+
+	return false;
+}
+
+static struct i915_engine_class_instance *
+list_engines(int i915, uint32_t class_mask, unsigned int *out)
+{
+	unsigned int count = 0, size = 64;
+	struct i915_engine_class_instance *engines;
+
+	engines = malloc(size * sizeof(*engines));
+	igt_assert(engines);
+
+	for (enum drm_i915_gem_engine_class class = I915_ENGINE_CLASS_RENDER;
+	     class_mask;
+	     class++, class_mask >>= 1) {
+		if (!(class_mask & 1))
+			continue;
+
+		for (unsigned int instance = 0;
+		     instance < INSTANCE_COUNT;
+		     instance++) {
+			if (!has_class_instance(i915, class, instance))
+				continue;
+
+			if (count == size) {
+				size *= 2;
+				engines = realloc(engines,
+						  size * sizeof(*engines));
+				igt_assert(engines);
+			}
+
+			engines[count++] = (struct i915_engine_class_instance){
+				.engine_class = class,
+				.engine_instance = instance,
+			};
+		}
+	}
+
+	if (!count) {
+		free(engines);
+		engines = NULL;
+	}
+
+	*out = count;
+	return engines;
+}
+
+static bool has_perf_engines(int i915)
+{
+	return i915_perf_type_id(i915);
+}
+
+static intel_ctx_cfg_t
+ctx_cfg_for_engines(const struct i915_engine_class_instance *ci,
+		    unsigned int count)
+{
+	intel_ctx_cfg_t cfg = { };
+	unsigned int i;
+
+	for (i = 0; i < count; i++)
+		cfg.engines[i] = ci[i];
+	cfg.num_engines = count;
+
+	return cfg;
+}
+
+static const intel_ctx_t *
+ctx_create_engines(int i915, const struct i915_engine_class_instance *ci,
+		   unsigned int count)
+{
+	intel_ctx_cfg_t cfg = ctx_cfg_for_engines(ci, count);
+	return intel_ctx_create(i915, &cfg);
+}
+
+static void check_bo(int i915, uint32_t handle, unsigned int expected,
+		     bool wait)
+{
+	uint32_t *map;
+
+	map = gem_mmap__cpu(i915, handle, 0, 4096, PROT_READ);
+	if (wait)
+		gem_set_domain(i915, handle, I915_GEM_DOMAIN_CPU,
+			       I915_GEM_DOMAIN_CPU);
+	igt_assert_eq(map[0], expected);
+	munmap(map, 4096);
+}
+
+static struct drm_i915_query_engine_info *query_engine_info(int i915)
+{
+	struct drm_i915_query_engine_info *engines;
+
+#define QUERY_SIZE	0x4000
+	engines = malloc(QUERY_SIZE);
+	igt_assert(engines);
+	memset(engines, 0, QUERY_SIZE);
+	igt_assert(!__gem_query_engines(i915, engines, QUERY_SIZE));
+#undef QUERY_SIZE
+
+	return engines;
+}
+
+/* This function only works if siblings contains all instances of a class */
+static void logical_sort_siblings(int i915,
+				  struct i915_engine_class_instance *siblings,
+				  unsigned int count)
+{
+	struct i915_engine_class_instance *sorted;
+	struct drm_i915_query_engine_info *engines;
+	unsigned int i, j;
+
+	sorted = calloc(count, sizeof(*sorted));
+	igt_assert(sorted);
+
+	engines = query_engine_info(i915);
+
+	for (j = 0; j < count; ++j) {
+		for (i = 0; i < engines->num_engines; ++i) {
+			if (siblings[j].engine_class ==
+			    engines->engines[i].engine.engine_class &&
+			    siblings[j].engine_instance ==
+			    engines->engines[i].engine.engine_instance) {
+				uint16_t logical_instance =
+					engines->engines[i].logical_instance;
+
+				igt_assert(logical_instance < count);
+				igt_assert(!sorted[logical_instance].engine_class);
+				igt_assert(!sorted[logical_instance].engine_instance);
+
+				sorted[logical_instance] = siblings[j];
+				break;
+			}
+		}
+		igt_assert(i != engines->num_engines);
+	}
+
+	memcpy(siblings, sorted, sizeof(*sorted) * count);
+	free(sorted);
+	free(engines);
+}
+
+static bool fence_busy(int fence)
+{
+	return poll(&(struct pollfd){fence, POLLIN}, 1, 0) == 0;
+}
+
+/*
+ * Always read from engine instance 0. With GuC submission the values are
+ * the same across all instances. With execlists they may differ, but that
+ * is unlikely, and if they do we can live with it.
+ */
+static unsigned int get_timeslice(int i915,
+				  struct i915_engine_class_instance engine)
+{
+	unsigned int val;
+
+	switch (engine.engine_class) {
+	case I915_ENGINE_CLASS_RENDER:
+		gem_engine_property_scanf(i915, "rcs0", "timeslice_duration_ms",
+					  "%d", &val);
+		break;
+	case I915_ENGINE_CLASS_COPY:
+		gem_engine_property_scanf(i915, "bcs0", "timeslice_duration_ms",
+					  "%d", &val);
+		break;
+	case I915_ENGINE_CLASS_VIDEO:
+		gem_engine_property_scanf(i915, "vcs0", "timeslice_duration_ms",
+					  "%d", &val);
+		break;
+	case I915_ENGINE_CLASS_VIDEO_ENHANCE:
+		gem_engine_property_scanf(i915, "vecs0", "timeslice_duration_ms",
+					  "%d", &val);
+		break;
+	}
+
+	return val;
+}
+
+static uint64_t gettime_ns(void)
+{
+	struct timespec current;
+	clock_gettime(CLOCK_MONOTONIC, &current);
+	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
+}
+
+static bool syncobj_busy(int i915, uint32_t handle)
+{
+	bool result;
+	int sf;
+
+	sf = syncobj_handle_to_fd(i915, handle,
+				  DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE);
+	result = poll(&(struct pollfd){sf, POLLIN}, 1, 0) == 0;
+	close(sf);
+
+	return result;
+}
+
+/*
+ * Ensure a parallel submit actually runs on HW in parallel by putting a
+ * spinner on one engine, doing a parallel submit, and checking that the
+ * parallel submit is blocked behind the spinner.
+ */
+static void parallel_ordering(int i915, unsigned int flags)
+{
+	uint32_t vm_id;
+	int class;
+
+	vm_id = gem_vm_create_in_vm_bind_mode(i915);
+
+	for (class = 0; class < 32; class++) {
+		struct drm_i915_gem_timeline_fence exec_fence[32] = { };
+		const intel_ctx_t *ctx = NULL, *spin_ctx = NULL;
+		uint64_t fence_value = 0, batch_addr[32] = { };
+		struct i915_engine_class_instance *siblings;
+		uint32_t exec_syncobj, bind_syncobj[32];
+		struct drm_i915_gem_execbuffer3 execbuf;
+		uint32_t batch[16], obj[32];
+		intel_ctx_cfg_t cfg;
+		unsigned int count;
+		igt_spin_t *spin;
+		uint64_t ahnd;
+		int i = 0;
+
+		siblings = list_engines(i915, 1u << class, &count);
+		if (!siblings)
+			continue;
+
+		if (count < 2) {
+			free(siblings);
+			continue;
+		}
+
+		logical_sort_siblings(i915, siblings, count);
+
+		memset(&cfg, 0, sizeof(cfg));
+		cfg.parallel = true;
+		cfg.num_engines = 1;
+		cfg.width = count;
+		memcpy(cfg.engines, siblings, sizeof(*siblings) * count);
+
+		if (__intel_ctx_create(i915, &cfg, &ctx)) {
+			free(siblings);
+			continue;
+		}
+		gem_context_set_vm(i915, ctx->id, vm_id);
+
+		batch[i] = MI_ATOMIC | MI_ATOMIC_INC;
+#define TARGET_BO_OFFSET	(0x1 << 16)
+		batch[++i] = TARGET_BO_OFFSET;
+		batch[++i] = 0;
+		batch[++i] = MI_BATCH_BUFFER_END;
+
+		obj[0] = gem_create(i915, 4096);
+		bind_syncobj[0] = syncobj_create(i915, 0);
+		exec_fence[0].handle = bind_syncobj[0];
+		exec_fence[0].flags = I915_TIMELINE_FENCE_WAIT;
+		i915_vm_bind(i915, vm_id, TARGET_BO_OFFSET, obj[0], 0, 4096, bind_syncobj[0], 0);
+
+		for (i = 1; i < count + 1; ++i) {
+			obj[i] = gem_create(i915, 4096);
+			gem_write(i915, obj[i], 0, batch, sizeof(batch));
+
+			batch_addr[i - 1] = TARGET_BO_OFFSET * (i + 1);
+			bind_syncobj[i] = syncobj_create(i915, 0);
+			exec_fence[i].handle = bind_syncobj[i];
+			exec_fence[i].flags = I915_TIMELINE_FENCE_WAIT;
+			i915_vm_bind(i915, vm_id, batch_addr[i - 1], obj[i], 0, 4096, bind_syncobj[i], 0);
+		}
+
+		exec_syncobj = syncobj_create(i915, 0);
+		exec_fence[i].handle = exec_syncobj;
+		exec_fence[i].flags = I915_TIMELINE_FENCE_SIGNAL;
+
+		memset(&execbuf, 0, sizeof(execbuf));
+		execbuf.ctx_id = ctx->id;
+		execbuf.batch_address = to_user_pointer(batch_addr);
+		execbuf.fence_count = count + 2;
+		execbuf.timeline_fences = to_user_pointer(exec_fence);
+
+		/* Block parallel submission */
+		spin_ctx = ctx_create_engines(i915, siblings, count);
+		ahnd = get_simple_ahnd(i915, spin_ctx->id);
+		spin = __igt_spin_new(i915,
+				      .ahnd = ahnd,
+				      .ctx = spin_ctx,
+				      .engine = 0,
+				      .flags = IGT_SPIN_FENCE_OUT |
+				      IGT_SPIN_NO_PREEMPTION);
+
+		/* Wait for spinners to start */
+		usleep(5 * 10000);
+		igt_assert(fence_busy(spin->out_fence));
+
+		/* Submit parallel execbuf */
+		gem_execbuf3(i915, &execbuf);
+
+		/*
+		 * Wait long enough for timeslicing to kick in but not
+		 * preemption. Spinner + parallel execbuf should both be
+		 * active. This assumes default timeslice / preemption
+		 * values; if those are changed the test may fail.
+		 */
+		usleep(get_timeslice(i915, siblings[0]) * 2);
+		igt_assert(fence_busy(spin->out_fence));
+		igt_assert(syncobj_busy(i915, exec_syncobj));
+		check_bo(i915, obj[0], 0, false);
+
+		/*
+		 * End the spinner and wait for spinner + parallel execbuf
+		 * to complete.
+		 */
+		igt_spin_end(spin);
+		igt_assert(syncobj_timeline_wait(i915, &exec_syncobj, &fence_value, 1,
+						 gettime_ns() + (2 * NSEC_PER_SEC),
+						 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
+		igt_assert(!syncobj_busy(i915, exec_syncobj));
+		syncobj_destroy(i915, exec_syncobj);
+		for (i = 0; i < count + 1; ++i)
+			syncobj_destroy(i915, bind_syncobj[i]);
+		check_bo(i915, obj[0], count, true);
+
+		/* Clean up */
+		intel_ctx_destroy(i915, ctx);
+		intel_ctx_destroy(i915, spin_ctx);
+		i915_vm_unbind(i915, vm_id, TARGET_BO_OFFSET, 4096);
+		for (i = 1; i < count + 1; ++i)
+			i915_vm_unbind(i915, vm_id, batch_addr[i - 1], 4096);
+
+		for (i = 0; i < count + 1; ++i)
+			gem_close(i915, obj[i]);
+		free(siblings);
+		igt_spin_free(i915, spin);
+		put_ahnd(ahnd);
+	}
+
+	gem_vm_destroy(i915, vm_id);
+}
+
+static bool has_load_balancer(int i915)
+{
+	const intel_ctx_cfg_t cfg = {
+		.load_balance = true,
+		.num_engines = 1,
+	};
+	const intel_ctx_t *ctx = NULL;
+	int err;
+
+	err = __intel_ctx_create(i915, &cfg, &ctx);
+	intel_ctx_destroy(i915, ctx);
+
+	return err == 0;
+}
+
+static bool has_logical_mapping(int i915)
+{
+	struct drm_i915_query_engine_info *engines;
+	unsigned int i;
+
+	engines = query_engine_info(i915);
+
+	for (i = 0; i < engines->num_engines; ++i)
+		if (!(engines->engines[i].flags &
+		     I915_ENGINE_INFO_HAS_LOGICAL_INSTANCE)) {
+			free(engines);
+			return false;
+		}
+
+	free(engines);
+	return true;
+}
+
+static bool has_parallel_execbuf(int i915)
+{
+	intel_ctx_cfg_t cfg = {
+		.parallel = true,
+		.num_engines = 1,
+	};
+	const intel_ctx_t *ctx = NULL;
+	int err;
+
+	for (int class = 0; class < 32; class++) {
+		struct i915_engine_class_instance *siblings;
+		unsigned int count;
+
+		siblings = list_engines(i915, 1u << class, &count);
+		if (!siblings)
+			continue;
+
+		if (count < 2) {
+			free(siblings);
+			continue;
+		}
+
+		logical_sort_siblings(i915, siblings, count);
+
+		cfg.width = count;
+		memcpy(cfg.engines, siblings, sizeof(*siblings) * count);
+		free(siblings);
+
+		err = __intel_ctx_create(i915, &cfg, &ctx);
+		intel_ctx_destroy(i915, ctx);
+
+		return err == 0;
+	}
+
+	return false;
+}
+
+igt_main
+{
+	int i915 = -1;
+
+	igt_fixture {
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+
+		gem_require_contexts(i915);
+		igt_require(i915_vm_bind_version(i915) == 1);
+		igt_require(gem_has_engine_topology(i915));
+		igt_require(has_load_balancer(i915));
+		igt_require(has_perf_engines(i915));
+	}
+
+	igt_subtest_group {
+		igt_fixture {
+			igt_require(has_logical_mapping(i915));
+			igt_require(has_parallel_execbuf(i915));
+		}
+
+		igt_describe("Ensure a parallel submit actually runs in parallel");
+		igt_subtest("parallel-ordering")
+			parallel_ordering(i915, 0);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index 50de8b7e89..6ca5a94282 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -375,6 +375,13 @@ test_executables += executable('gem_exec_balancer', 'i915/gem_exec_balancer.c',
 	   install : true)
 test_list += 'gem_exec_balancer'
 
+test_executables += executable('gem_exec3_balancer', 'i915/gem_exec3_balancer.c',
+	   dependencies : test_deps + [ lib_igt_perf ],
+	   install_dir : libexecdir,
+	   install_rpath : libexecdir_rpathdir,
+	   install : true)
+test_list += 'gem_exec3_balancer'
+
 test_executables += executable('gem_mmap_offset',
 	   join_paths('i915', 'gem_mmap_offset.c'),
 	   dependencies : test_deps + [ libatomic ],
-- 
2.21.0.rc0.32.g243a4c7e27


* [igt-dev] [PATCH i-g-t v4 11/11] tests/i915/vm_bind: Add gem_lmem_swapping@vm_bind sub test
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
                   ` (9 preceding siblings ...)
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 10/11] tests/i915/vm_bind: Add gem_exec3_balancer test Niranjana Vishwanathapura
@ 2022-10-18  7:17 ` Niranjana Vishwanathapura
  2022-10-21 17:20   ` Matthew Auld
  2022-10-18  7:45 ` [igt-dev] ✓ Fi.CI.BAT: success for vm_bind: Add VM_BIND validation support (rev8) Patchwork
  2022-10-18 22:13 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
  12 siblings, 1 reply; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18  7:17 UTC (permalink / raw)
  To: igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, matthew.auld, daniel.vetter,
	petri.latvala

Validate eviction of objects with persistent mappings and check
that the persistent mappings are properly rebound on a subsequent
execbuf3 call.

TODO: Add a new swapping test with just vm_bind mode
      (without legacy execbuf).

v2: use i915_vm_bind library functions

Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 tests/i915/gem_lmem_swapping.c | 128 ++++++++++++++++++++++++++++++++-
 1 file changed, 127 insertions(+), 1 deletion(-)

diff --git a/tests/i915/gem_lmem_swapping.c b/tests/i915/gem_lmem_swapping.c
index cccdb3195b..017833775c 100644
--- a/tests/i915/gem_lmem_swapping.c
+++ b/tests/i915/gem_lmem_swapping.c
@@ -3,12 +3,16 @@
  * Copyright © 2021 Intel Corporation
  */
 
+#include <poll.h>
+
 #include "i915/gem.h"
 #include "i915/gem_create.h"
 #include "i915/gem_vm.h"
 #include "i915/intel_memory_region.h"
+#include "i915/i915_vm_bind.h"
 #include "igt.h"
 #include "igt_kmod.h"
+#include "igt_syncobj.h"
 #include <unistd.h>
 #include <stdlib.h>
 #include <stdint.h>
@@ -54,6 +58,7 @@ struct params {
 		uint64_t max;
 	} size;
 	unsigned int count;
+	unsigned int vm_bind_count;
 	unsigned int loops;
 	unsigned int mem_limit;
 #define TEST_VERIFY	(1 << 0)
@@ -64,15 +69,18 @@ struct params {
 #define TEST_MULTI	(1 << 5)
 #define TEST_CCS	(1 << 6)
 #define TEST_MASSIVE	(1 << 7)
+#define TEST_VM_BIND	(1 << 8)
 	unsigned int flags;
 	unsigned int seed;
 	bool oom_test;
+	uint64_t va;
 };
 
 struct object {
 	uint64_t size;
 	uint32_t seed;
 	uint32_t handle;
+	uint64_t va;
 	struct blt_copy_object *blt_obj;
 };
 
@@ -278,6 +286,86 @@ verify_object_ccs(int i915, const struct object *obj,
 	free(cmd);
 }
 
+#define BATCH_VA       0xa00000
+
+static uint64_t gettime_ns(void)
+{
+	struct timespec current;
+	clock_gettime(CLOCK_MONOTONIC, &current);
+	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
+}
+
+static void vm_bind(int i915, struct object *list, unsigned int num, uint64_t *va,
+		    uint32_t batch, uint64_t batch_size, uint32_t vm_id)
+{
+	uint32_t *bind_syncobj;
+	uint64_t *fence_value;
+	unsigned int i;
+
+	bind_syncobj = calloc(num + 1, sizeof(*bind_syncobj));
+	igt_assert(bind_syncobj);
+
+	fence_value = calloc(num + 1, sizeof(*fence_value));
+	igt_assert(fence_value);
+
+	for (i = 0; i < num; i++) {
+		list[i].va = *va;
+		bind_syncobj[i] = syncobj_create(i915, 0);
+		i915_vm_bind(i915, vm_id, *va, list[i].handle, 0, list[i].size, bind_syncobj[i], 0);
+		*va += list[i].size;
+	}
+	bind_syncobj[i] = syncobj_create(i915, 0);
+	i915_vm_bind(i915, vm_id, BATCH_VA, batch, 0, batch_size, bind_syncobj[i], 0);
+
+	igt_assert(syncobj_timeline_wait(i915, bind_syncobj, fence_value, num + 1,
+					 gettime_ns() + (2 * NSEC_PER_SEC),
+					 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
+	for (i = 0; i <= num; i++)
+		syncobj_destroy(i915, bind_syncobj[i]);
+}
+
+static void vm_unbind(int i915, struct object *list, unsigned int num,
+		      uint64_t batch_size, uint32_t vm_id)
+{
+	unsigned int i;
+
+	i915_vm_unbind(i915, vm_id, BATCH_VA, batch_size);
+	for (i = 0; i < num; i++)
+		i915_vm_unbind(i915, vm_id, list[i].va, list[i].size);
+}
+
+static void move_to_lmem_execbuf3(int i915,
+				  const intel_ctx_t *ctx,
+				  unsigned int engine,
+				  bool do_oom_test)
+{
+	uint32_t exec_syncobj = syncobj_create(i915, 0);
+	struct drm_i915_gem_timeline_fence exec_fence = {
+		.handle = exec_syncobj,
+		.flags = I915_TIMELINE_FENCE_SIGNAL
+	};
+	struct drm_i915_gem_execbuffer3 eb = {
+		.ctx_id = ctx->id,
+		.batch_address = BATCH_VA,
+		.engine_idx = engine,
+		.fence_count = 1,
+		.timeline_fences = to_user_pointer(&exec_fence),
+	};
+	uint64_t fence_value = 0;
+	int ret;
+
+retry:
+	ret = __gem_execbuf3(i915, &eb);
+	if (do_oom_test && (ret == -ENOMEM || ret == -ENXIO))
+		goto retry;
+	igt_assert_eq(ret, 0);
+
+	igt_assert(syncobj_timeline_wait(i915, &exec_syncobj, &fence_value, 1,
+					 gettime_ns() + (2 * NSEC_PER_SEC),
+					 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
+	syncobj_destroy(i915, exec_syncobj);
+}
+
 static void move_to_lmem(int i915,
 			 const intel_ctx_t *ctx,
 			 struct object *list,
@@ -325,14 +413,16 @@ static void __do_evict(int i915,
 	uint32_t region_id = INTEL_MEMORY_REGION_ID(region->memory_class,
 						    region->memory_instance);
 	const unsigned int max_swap_in = params->count / 100 + 1;
+	uint64_t size, ahnd, batch_size = 4096;
 	struct object *objects, *obj, *list;
 	const uint32_t bpp = 32;
 	uint32_t width, height, stride;
+	const intel_ctx_t *vm_bind_ctx;
 	const intel_ctx_t *blt_ctx;
 	struct blt_copy_object *tmp;
 	unsigned int engine = 0;
+	uint32_t batch, vm_id;
 	unsigned int i, l;
-	uint64_t size, ahnd;
 	struct timespec t = {};
 	unsigned int num;
 
@@ -416,6 +506,20 @@ static void __do_evict(int i915,
 		  readable_size(params->size.max), readable_unit(params->size.max),
 		  params->count, seed);
 
+	/* VM_BIND the specified subset of objects (as persistent mappings) */
+	if (params->flags & TEST_VM_BIND) {
+		const uint32_t bbe = MI_BATCH_BUFFER_END;
+
+		batch = gem_create_from_pool(i915, &batch_size, region_id);
+		gem_write(i915, batch, 0, &bbe, sizeof(bbe));
+
+		vm_id = gem_vm_create_in_vm_bind_mode(i915);
+		vm_bind_ctx = intel_ctx_create(i915, &ctx->cfg);
+		gem_context_set_vm(i915, vm_bind_ctx->id, vm_id);
+		vm_bind(i915, objects, params->vm_bind_count, &params->va,
+			batch, batch_size, vm_id);
+	}
+
 	/*
 	 * Move random objects back into lmem.
 	 * For TEST_MULTI runs, make each object count as a loop to
@@ -454,6 +558,15 @@ static void __do_evict(int i915,
 		}
 	}
 
+	/* Rebind persistent mappings to ensure they are swapped back in */
+	if (params->flags & TEST_VM_BIND) {
+		move_to_lmem_execbuf3(i915, vm_bind_ctx, engine, params->oom_test);
+
+		vm_unbind(i915, objects, params->vm_bind_count, batch_size, vm_id);
+		intel_ctx_destroy(i915, vm_bind_ctx);
+		gem_vm_destroy(i915, vm_id);
+	}
+
 	for (i = 0; i < params->count; i++) {
 		gem_close(i915, objects[i].handle);
 		free(objects[i].blt_obj);
@@ -553,6 +666,15 @@ static void fill_params(int i915, struct params *params,
 	if (flags & TEST_HEAVY)
 		params->loops = params->loops / 2 + 1;
 
+	/*
+	 * Set vm_bind_count and ensure it doesn't over-subscribe LMEM.
+	 * Set va range to ensure it is big enough for all bindings in a VM.
+	 */
+	if (flags & TEST_VM_BIND) {
+		params->vm_bind_count = params->count * 50 / ((flags & TEST_HEAVY) ? 300 : 150);
+		params->va = (uint64_t)params->vm_bind_count * size * 4;
+	}
+
 	params->flags = flags;
 	params->oom_test = do_oom_test;
 
@@ -583,6 +705,9 @@ static void test_evict(int i915,
 	if (flags & TEST_CCS)
 		igt_require(IS_DG2(intel_get_drm_devid(i915)));
 
+	if (flags & TEST_VM_BIND)
+		igt_require(i915_vm_bind_version(i915) == 1);
+
 	fill_params(i915, &params, region, flags, nproc, false);
 
 	if (flags & TEST_PARALLEL) {
@@ -764,6 +889,7 @@ igt_main_args("", long_options, help_str, opt_handler, NULL)
 		{ "heavy-verify-random-ccs", TEST_CCS | TEST_RANDOM | TEST_HEAVY },
 		{ "heavy-verify-multi-ccs", TEST_CCS | TEST_RANDOM | TEST_HEAVY | TEST_ENGINES | TEST_MULTI },
 		{ "parallel-random-verify-ccs", TEST_PARALLEL | TEST_RANDOM | TEST_CCS },
+		{ "vm_bind", TEST_VM_BIND },
 		{ }
 	};
 	const intel_ctx_t *ctx;
-- 
2.21.0.rc0.32.g243a4c7e27

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [igt-dev] ✓ Fi.CI.BAT: success for vm_bind: Add VM_BIND validation support (rev8)
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
                   ` (10 preceding siblings ...)
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 11/11] tests/i915/vm_bind: Add gem_lmem_swapping@vm_bind sub test Niranjana Vishwanathapura
@ 2022-10-18  7:45 ` Patchwork
  2022-10-18 22:13 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
  12 siblings, 0 replies; 28+ messages in thread
From: Patchwork @ 2022-10-18  7:45 UTC (permalink / raw)
  To: Niranjana Vishwanathapura; +Cc: igt-dev

[-- Attachment #1: Type: text/plain, Size: 7212 bytes --]

== Series Details ==

Series: vm_bind: Add VM_BIND validation support (rev8)
URL   : https://patchwork.freedesktop.org/series/105881/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_12254 -> IGTPW_7975
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/index.html

Participating hosts (45 -> 40)
------------------------------

  Additional (1): fi-hsw-4770 
  Missing    (6): fi-bxt-dsi fi-icl-u2 fi-ilk-650 fi-kbl-guc fi-snb-2520m fi-bdw-samus 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_7975:

### IGT changes ###

#### Possible regressions ####

  * {igt@i915_vm_bind_basic@basic-smem} (NEW):
    - {fi-tgl-dsi}:       NOTRUN -> [SKIP][1] +2 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-tgl-dsi/igt@i915_vm_bind_basic@basic-smem.html

  
New tests
---------

  New tests have been introduced between CI_DRM_12254 and IGTPW_7975:

### New IGT tests (3) ###

  * igt@gem_exec3_basic@basic:
    - Statuses : 14 skip(s)
    - Exec time: [0.0] s

  * igt@i915_vm_bind_basic@basic-smem:
    - Statuses : 14 skip(s)
    - Exec time: [0.0] s

  * igt@i915_vm_bind_sanity@basic:
    - Statuses : 14 skip(s)
    - Exec time: [0.0] s

  

Known issues
------------

  Here are the changes found in IGTPW_7975 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * {igt@gem_exec3_basic@basic} (NEW):
    - fi-bdw-gvtdvm:      NOTRUN -> [SKIP][2] ([fdo#109271]) +2 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-bdw-gvtdvm/igt@gem_exec3_basic@basic.html
    - fi-pnv-d510:        NOTRUN -> [SKIP][3] ([fdo#109271]) +2 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-pnv-d510/igt@gem_exec3_basic@basic.html

  * igt@gem_tiled_blits@basic:
    - fi-pnv-d510:        [PASS][4] -> [SKIP][5] ([fdo#109271]) +2 similar issues
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12254/fi-pnv-d510/igt@gem_tiled_blits@basic.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-pnv-d510/igt@gem_tiled_blits@basic.html

  * igt@i915_pm_backlight@basic-brightness:
    - fi-hsw-4770:        NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#3012])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-hsw-4770/igt@i915_pm_backlight@basic-brightness.html

  * igt@i915_pm_rpm@module-reload:
    - fi-hsw-4770:        NOTRUN -> [INCOMPLETE][7] ([i915#7221])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-hsw-4770/igt@i915_pm_rpm@module-reload.html

  * {igt@i915_vm_bind_basic@basic-smem} (NEW):
    - fi-kbl-7567u:       NOTRUN -> [SKIP][8] ([fdo#109271]) +2 similar issues
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-kbl-7567u/igt@i915_vm_bind_basic@basic-smem.html
    - fi-cfl-8700k:       NOTRUN -> [SKIP][9] ([fdo#109271]) +2 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-cfl-8700k/igt@i915_vm_bind_basic@basic-smem.html
    - fi-elk-e7500:       NOTRUN -> [SKIP][10] ([fdo#109271]) +2 similar issues
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-elk-e7500/igt@i915_vm_bind_basic@basic-smem.html
    - fi-bsw-nick:        NOTRUN -> [SKIP][11] ([fdo#109271]) +2 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-bsw-nick/igt@i915_vm_bind_basic@basic-smem.html
    - fi-hsw-g3258:       NOTRUN -> [SKIP][12] ([fdo#109271]) +2 similar issues
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-hsw-g3258/igt@i915_vm_bind_basic@basic-smem.html
    - fi-bsw-kefka:       NOTRUN -> [SKIP][13] ([fdo#109271]) +2 similar issues
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-bsw-kefka/igt@i915_vm_bind_basic@basic-smem.html
    - fi-cfl-guc:         NOTRUN -> [SKIP][14] ([fdo#109271]) +2 similar issues
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-cfl-guc/igt@i915_vm_bind_basic@basic-smem.html
    - fi-skl-6600u:       NOTRUN -> [SKIP][15] ([fdo#109271]) +2 similar issues
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-skl-6600u/igt@i915_vm_bind_basic@basic-smem.html

  * {igt@i915_vm_bind_sanity@basic} (NEW):
    - fi-snb-2600:        NOTRUN -> [SKIP][16] ([fdo#109271]) +2 similar issues
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-snb-2600/igt@i915_vm_bind_sanity@basic.html
    - fi-blb-e6850:       NOTRUN -> [SKIP][17] ([fdo#109271]) +2 similar issues
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-blb-e6850/igt@i915_vm_bind_sanity@basic.html

  * igt@kms_addfb_basic@addfb25-y-tiled-small-legacy:
    - fi-hsw-4770:        NOTRUN -> [SKIP][18] ([fdo#109271]) +12 similar issues
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-hsw-4770/igt@kms_addfb_basic@addfb25-y-tiled-small-legacy.html

  * igt@kms_chamelium@dp-crc-fast:
    - fi-hsw-4770:        NOTRUN -> [SKIP][19] ([fdo#109271] / [fdo#111827]) +7 similar issues
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-hsw-4770/igt@kms_chamelium@dp-crc-fast.html

  * igt@kms_psr@sprite_plane_onoff:
    - fi-hsw-4770:        NOTRUN -> [SKIP][20] ([fdo#109271] / [i915#1072]) +3 similar issues
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-hsw-4770/igt@kms_psr@sprite_plane_onoff.html

  * igt@runner@aborted:
    - fi-hsw-4770:        NOTRUN -> [FAIL][21] ([fdo#109271] / [i915#4312] / [i915#5594])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/fi-hsw-4770/igt@runner@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#3012]: https://gitlab.freedesktop.org/drm/intel/issues/3012
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#5594]: https://gitlab.freedesktop.org/drm/intel/issues/5594
  [i915#7221]: https://gitlab.freedesktop.org/drm/intel/issues/7221


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_7018 -> IGTPW_7975

  CI-20190529: 20190529
  CI_DRM_12254: 2e6cdde56f896add665edb8d2f6d3dfce8b1b3b6 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_7975: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/index.html
  IGT_7018: 8312a2fe3f3287ba4ac4bc8d100de0734480f482 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git


Testlist changes
----------------

+igt@gem_exec3_balancer@parallel-ordering
+igt@gem_exec3_basic@basic
+igt@gem_lmem_swapping@vm_bind
+igt@i915_vm_bind_basic@2m
+igt@i915_vm_bind_basic@64k
+igt@i915_vm_bind_basic@basic
+igt@i915_vm_bind_basic@multi_cmds
+igt@i915_vm_bind_basic@multi_ctxts
+igt@i915_vm_bind_basic@share_vm
+igt@i915_vm_bind_basic@skip_copy
+igt@i915_vm_bind_basic@skip_unbind
+igt@i915_vm_bind_sanity@basic

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/index.html



* Re: [igt-dev] [PATCH i-g-t v4 04/11] lib/vm_bind: Add vm_bind specific library functions
  2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 04/11] lib/vm_bind: Add vm_bind specific library functions Niranjana Vishwanathapura
@ 2022-10-18  9:26   ` Matthew Auld
  2022-10-18 14:43     ` Niranjana Vishwanathapura
  0 siblings, 1 reply; 28+ messages in thread
From: Matthew Auld @ 2022-10-18  9:26 UTC (permalink / raw)
  To: Niranjana Vishwanathapura, igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, daniel.vetter, petri.latvala

On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:
> Add vm_bind specific library interfaces.
> 
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> ---
>   lib/i915/i915_vm_bind.c | 61 +++++++++++++++++++++++++++++++++++++++++
>   lib/i915/i915_vm_bind.h | 16 +++++++++++
>   lib/meson.build         |  1 +
>   3 files changed, 78 insertions(+)
>   create mode 100644 lib/i915/i915_vm_bind.c
>   create mode 100644 lib/i915/i915_vm_bind.h
> 
> diff --git a/lib/i915/i915_vm_bind.c b/lib/i915/i915_vm_bind.c
> new file mode 100644
> index 0000000000..5624f589b5
> --- /dev/null
> +++ b/lib/i915/i915_vm_bind.c
> @@ -0,0 +1,61 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +#include <errno.h>
> +#include <string.h>
> +#include <sys/ioctl.h>
> +
> +#include "ioctl_wrappers.h"
> +#include "i915_drm.h"
> +#include "i915_drm_local.h"
> +#include "i915_vm_bind.h"
> +
> +void i915_vm_bind(int i915, uint32_t vm_id, uint64_t va, uint32_t handle,
> +		  uint64_t offset, uint64_t length, uint32_t syncobj,

It's not possible to have a valid syncobj = 0 right?

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> +		  uint64_t fence_value)
> +{
> +	struct drm_i915_gem_vm_bind bind;
> +
> +	memset(&bind, 0, sizeof(bind));
> +	bind.vm_id = vm_id;
> +	bind.handle = handle;
> +	bind.start = va;
> +	bind.offset = offset;
> +	bind.length = length;
> +	if (syncobj) {
> +		bind.fence.handle = syncobj;
> +		bind.fence.value = fence_value;
> +		bind.fence.flags = I915_TIMELINE_FENCE_SIGNAL;
> +	}
> +
> +	gem_vm_bind(i915, &bind);
> +}
> +
> +void i915_vm_unbind(int i915, uint32_t vm_id, uint64_t va, uint64_t length)
> +{
> +	struct drm_i915_gem_vm_unbind unbind;
> +
> +	memset(&unbind, 0, sizeof(unbind));
> +	unbind.vm_id = vm_id;
> +	unbind.start = va;
> +	unbind.length = length;
> +
> +	gem_vm_unbind(i915, &unbind);
> +}
> +
> +int i915_vm_bind_version(int i915)
> +{
> +	struct drm_i915_getparam gp;
> +	int value = 0;
> +
> +	memset(&gp, 0, sizeof(gp));
> +	gp.param = I915_PARAM_VM_BIND_VERSION;
> +	gp.value = &value;
> +
> +	ioctl(i915, DRM_IOCTL_I915_GETPARAM, &gp, sizeof(gp));
> +	errno = 0;
> +
> +	return value;
> +}
> diff --git a/lib/i915/i915_vm_bind.h b/lib/i915/i915_vm_bind.h
> new file mode 100644
> index 0000000000..46a382c032
> --- /dev/null
> +++ b/lib/i915/i915_vm_bind.h
> @@ -0,0 +1,16 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +#ifndef _I915_VM_BIND_H_
> +#define _I915_VM_BIND_H_
> +
> +#include <stdint.h>
> +
> +void i915_vm_bind(int i915, uint32_t vm_id, uint64_t va, uint32_t handle,
> +		  uint64_t offset, uint64_t length, uint32_t syncobj,
> +		  uint64_t fence_value);
> +void i915_vm_unbind(int i915, uint32_t vm_id, uint64_t va, uint64_t length);
> +int i915_vm_bind_version(int i915);
> +
> +#endif /* _I915_VM_BIND_ */
> diff --git a/lib/meson.build b/lib/meson.build
> index 8d6c8a244a..e50a2b0e79 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -14,6 +14,7 @@ lib_sources = [
>   	'i915/intel_mocs.c',
>   	'i915/i915_blt.c',
>   	'i915/i915_crc.c',
> +	'i915/i915_vm_bind.c',
>   	'igt_collection.c',
>   	'igt_color_encoding.c',
>   	'igt_crc.c',


* Re: [igt-dev] [PATCH i-g-t v4 09/11] tests/i915/vm_bind: Add gem_exec3_basic test
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 09/11] tests/i915/vm_bind: Add gem_exec3_basic test Niranjana Vishwanathapura
@ 2022-10-18  9:34   ` Matthew Auld
  2022-10-18 14:46     ` Niranjana Vishwanathapura
  0 siblings, 1 reply; 28+ messages in thread
From: Matthew Auld @ 2022-10-18  9:34 UTC (permalink / raw)
  To: Niranjana Vishwanathapura, igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, daniel.vetter, petri.latvala

On 18/10/2022 08:17, Niranjana Vishwanathapura wrote:
> Port gem_exec_basic to a new gem_exec3_basic test which
> uses the newer execbuf3 ioctl.
> 
> v2: use i915_vm_bind library functions
> 
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> ---
>   tests/i915/gem_exec3_basic.c          | 133 ++++++++++++++++++++++++++
>   tests/intel-ci/fast-feedback.testlist |   1 +
>   tests/meson.build                     |   1 +
>   3 files changed, 135 insertions(+)
>   create mode 100644 tests/i915/gem_exec3_basic.c
> 
> diff --git a/tests/i915/gem_exec3_basic.c b/tests/i915/gem_exec3_basic.c
> new file mode 100644
> index 0000000000..3994381748
> --- /dev/null
> +++ b/tests/i915/gem_exec3_basic.c
> @@ -0,0 +1,133 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +/** @file gem_exec3_basic.c
> + *
> + * Basic execbuf3 test.
> + * Ported from gem_exec_basic and made to work with
> + * vm_bind and execbuf3.
> + *
> + */
> +
> +#include "i915/gem.h"
> +#include "i915/i915_vm_bind.h"
> +#include "igt.h"
> +#include "igt_collection.h"
> +#include "igt_syncobj.h"
> +
> +#include "i915/gem_create.h"
> +#include "i915/gem_vm.h"
> +
> +IGT_TEST_DESCRIPTION("Basic sanity check of execbuf3-ioctl rings.");
> +
> +#define BATCH_VA	0xa0000000
> +#define PAGE_SIZE	4096
> +
> +static uint64_t gettime_ns(void)
> +{
> +	struct timespec current;
> +	clock_gettime(CLOCK_MONOTONIC, &current);
> +	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
> +}
> +
> +static uint32_t batch_create(int fd, uint64_t *batch_size, uint32_t region)
> +{
> +	const uint32_t bbe = MI_BATCH_BUFFER_END;
> +	uint32_t handle;
> +
> +	igt_assert(__gem_create_in_memory_regions(fd, &handle, batch_size, region) == 0);
> +	gem_write(fd, handle, 0, &bbe, sizeof(bbe));
> +
> +	return handle;
> +}
> +
> +igt_main
> +{
> +	const struct intel_execution_engine2 *e;
> +	struct drm_i915_query_memory_regions *query_info;
> +	struct igt_collection *regions, *set;
> +	const intel_ctx_t *base_ctx, *ctx;
> +	uint32_t vm_id;
> +	int fd = -1;
> +
> +	igt_fixture {
> +		fd = drm_open_driver(DRIVER_INTEL);
> +		base_ctx = intel_ctx_create_all_physical(fd);
> +
> +		igt_require_gem(fd);
> +		igt_require(i915_vm_bind_version(fd) == 1);

Here and elsewhere, should that rather be >= 1?

Reviewed-by: Matthew Auld <matthew.auld@intel.com>


> +		igt_fork_hang_detector(fd);
> +
> +		vm_id = gem_vm_create_in_vm_bind_mode(fd);
> +		ctx  = intel_ctx_create(fd, &base_ctx->cfg);
> +		gem_context_set_vm(fd, ctx->id, vm_id);
> +
> +		query_info = gem_get_query_memory_regions(fd);
> +		igt_assert(query_info);
> +
> +		set = get_memory_region_set(query_info,
> +					    I915_SYSTEM_MEMORY,
> +					    I915_DEVICE_MEMORY);
> +	}
> +
> +	igt_describe("Check basic functionality of GEM_EXECBUFFER3 ioctl on every"
> +		     " ring and iterating over memory regions.");
> +	igt_subtest_with_dynamic("basic") {
> +		for_each_combination(regions, 1, set) {
> +			struct drm_i915_gem_timeline_fence exec_fence[GEM_MAX_ENGINES][2] = { };
> +			char *sub_name = memregion_dynamic_subtest_name(regions);
> +			uint32_t region = igt_collection_get_value(regions, 0);
> +			uint32_t bind_syncobj, exec_syncobj[GEM_MAX_ENGINES];
> +			uint64_t fence_value[GEM_MAX_ENGINES] = { };
> +			uint64_t batch_size = PAGE_SIZE;
> +			uint32_t handle, idx = 0;
> +
> +			handle = batch_create(fd, &batch_size, region);
> +			bind_syncobj = syncobj_create(fd, 0);
> +			i915_vm_bind(fd, vm_id, BATCH_VA, handle, 0, batch_size, bind_syncobj, 0);
> +
> +			for_each_ctx_engine(fd, ctx, e) {
> +				igt_dynamic_f("%s-%s", e->name, sub_name) {
> +					struct drm_i915_gem_execbuffer3 execbuf = {
> +						.ctx_id = ctx->id,
> +						.batch_address = BATCH_VA,
> +						.engine_idx = e->flags,
> +						.fence_count = 2,
> +						.timeline_fences = to_user_pointer(&exec_fence[idx]),
> +					};
> +
> +					exec_syncobj[idx] = syncobj_create(fd, 0);
> +					exec_fence[idx][0].handle = bind_syncobj;
> +					exec_fence[idx][0].flags = I915_TIMELINE_FENCE_WAIT;
> +					exec_fence[idx][1].handle = exec_syncobj[idx];
> +					exec_fence[idx][1].flags = I915_TIMELINE_FENCE_SIGNAL;
> +					idx++;
> +
> +					gem_execbuf3(fd, &execbuf);
> +				}
> +			}
> +
> +			igt_assert(syncobj_timeline_wait(fd, exec_syncobj, fence_value, idx,
> +							 gettime_ns() + (2 * NSEC_PER_SEC),
> +							 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
> +			while (idx)
> +				syncobj_destroy(fd, exec_syncobj[--idx]);
> +			syncobj_destroy(fd, bind_syncobj);
> +			i915_vm_unbind(fd, vm_id, BATCH_VA, batch_size);
> +			gem_close(fd, handle);
> +			free(sub_name);
> +		}
> +	}
> +
> +	igt_fixture {
> +		intel_ctx_destroy(fd, ctx);
> +		gem_vm_destroy(fd, vm_id);
> +		free(query_info);
> +		igt_collection_destroy(set);
> +		igt_stop_hang_detector();
> +		intel_ctx_destroy(fd, base_ctx);
> +		close(fd);
> +	}
> +}
> diff --git a/tests/intel-ci/fast-feedback.testlist b/tests/intel-ci/fast-feedback.testlist
> index 8a47eaddda..c5be7e300b 100644
> --- a/tests/intel-ci/fast-feedback.testlist
> +++ b/tests/intel-ci/fast-feedback.testlist
> @@ -27,6 +27,7 @@ igt@gem_exec_fence@nb-await
>   igt@gem_exec_gttfill@basic
>   igt@gem_exec_parallel@engines
>   igt@gem_exec_store@basic
> +igt@gem_exec3_basic@basic
>   igt@gem_flink_basic@bad-flink
>   igt@gem_flink_basic@bad-open
>   igt@gem_flink_basic@basic
> diff --git a/tests/meson.build b/tests/meson.build
> index 1b9529a7e2..50de8b7e89 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -252,6 +252,7 @@ i915_progs = [
>   	'sysfs_timeslice_duration',
>   	'i915_vm_bind_sanity',
>   	'i915_vm_bind_basic',
> +	'gem_exec3_basic',
>   ]
>   
>   msm_progs = [


* Re: [igt-dev] [PATCH i-g-t v4 04/11] lib/vm_bind: Add vm_bind specific library functions
  2022-10-18  9:26   ` Matthew Auld
@ 2022-10-18 14:43     ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18 14:43 UTC (permalink / raw)
  To: Matthew Auld
  Cc: tvrtko.ursulin, igt-dev, thomas.hellstrom, daniel.vetter, petri.latvala

On Tue, Oct 18, 2022 at 10:26:39AM +0100, Matthew Auld wrote:
>On 18/10/2022 08:16, Niranjana Vishwanathapura wrote:
>>Add vm_bind specific library interfaces.
>>
>>Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>>---
>>  lib/i915/i915_vm_bind.c | 61 +++++++++++++++++++++++++++++++++++++++++
>>  lib/i915/i915_vm_bind.h | 16 +++++++++++
>>  lib/meson.build         |  1 +
>>  3 files changed, 78 insertions(+)
>>  create mode 100644 lib/i915/i915_vm_bind.c
>>  create mode 100644 lib/i915/i915_vm_bind.h
>>
>>diff --git a/lib/i915/i915_vm_bind.c b/lib/i915/i915_vm_bind.c
>>new file mode 100644
>>index 0000000000..5624f589b5
>>--- /dev/null
>>+++ b/lib/i915/i915_vm_bind.c
>>@@ -0,0 +1,61 @@
>>+// SPDX-License-Identifier: MIT
>>+/*
>>+ * Copyright © 2022 Intel Corporation
>>+ */
>>+
>>+#include <errno.h>
>>+#include <string.h>
>>+#include <sys/ioctl.h>
>>+
>>+#include "ioctl_wrappers.h"
>>+#include "i915_drm.h"
>>+#include "i915_drm_local.h"
>>+#include "i915_vm_bind.h"
>>+
>>+void i915_vm_bind(int i915, uint32_t vm_id, uint64_t va, uint32_t handle,
>>+		  uint64_t offset, uint64_t length, uint32_t syncobj,
>
>It's not possible to have a valid syncobj = 0 right?
>

Yah, not possible. syncobj_create asserts if handle is 0.
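
Since the convention may not be obvious to readers skimming the thread:
the helper keys the optional signal fence off a zero syncobj handle,
which is safe exactly because syncobj handles are never 0. A minimal,
self-contained sketch of that convention (the struct below is an
illustrative stand-in, not the real drm_i915_gem_vm_bind uapi layout):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for the fence part of a vm_bind request;
 * NOT the real drm_i915_gem_vm_bind layout. */
struct fake_bind_fence {
	uint32_t handle;
	uint64_t value;
	uint32_t flags;
};

#define FAKE_FENCE_SIGNAL (1u << 0)

/* Mirrors the library convention discussed above: a syncobj handle of
 * 0 means "no fence", so the fence fields stay zeroed and nothing is
 * attached; a non-zero handle requests an out-fence signal. */
static struct fake_bind_fence setup_bind_fence(uint32_t syncobj,
					       uint64_t fence_value)
{
	struct fake_bind_fence fence;

	memset(&fence, 0, sizeof(fence));
	if (syncobj) {
		fence.handle = syncobj;
		fence.value = fence_value;
		fence.flags = FAKE_FENCE_SIGNAL;
	}
	return fence;
}
```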

Regards,
Niranjana

>Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>
>>+		  uint64_t fence_value)
>>+{
>>+	struct drm_i915_gem_vm_bind bind;
>>+
>>+	memset(&bind, 0, sizeof(bind));
>>+	bind.vm_id = vm_id;
>>+	bind.handle = handle;
>>+	bind.start = va;
>>+	bind.offset = offset;
>>+	bind.length = length;
>>+	if (syncobj) {
>>+		bind.fence.handle = syncobj;
>>+		bind.fence.value = fence_value;
>>+		bind.fence.flags = I915_TIMELINE_FENCE_SIGNAL;
>>+	}
>>+
>>+	gem_vm_bind(i915, &bind);
>>+}
>>+
>>+void i915_vm_unbind(int i915, uint32_t vm_id, uint64_t va, uint64_t length)
>>+{
>>+	struct drm_i915_gem_vm_unbind unbind;
>>+
>>+	memset(&unbind, 0, sizeof(unbind));
>>+	unbind.vm_id = vm_id;
>>+	unbind.start = va;
>>+	unbind.length = length;
>>+
>>+	gem_vm_unbind(i915, &unbind);
>>+}
>>+
>>+int i915_vm_bind_version(int i915)
>>+{
>>+	struct drm_i915_getparam gp;
>>+	int value = 0;
>>+
>>+	memset(&gp, 0, sizeof(gp));
>>+	gp.param = I915_PARAM_VM_BIND_VERSION;
>>+	gp.value = &value;
>>+
>>+	ioctl(i915, DRM_IOCTL_I915_GETPARAM, &gp, sizeof(gp));
>>+	errno = 0;
>>+
>>+	return value;
>>+}
>>diff --git a/lib/i915/i915_vm_bind.h b/lib/i915/i915_vm_bind.h
>>new file mode 100644
>>index 0000000000..46a382c032
>>--- /dev/null
>>+++ b/lib/i915/i915_vm_bind.h
>>@@ -0,0 +1,16 @@
>>+/* SPDX-License-Identifier: MIT */
>>+/*
>>+ * Copyright © 2022 Intel Corporation
>>+ */
>>+#ifndef _I915_VM_BIND_H_
>>+#define _I915_VM_BIND_H_
>>+
>>+#include <stdint.h>
>>+
>>+void i915_vm_bind(int i915, uint32_t vm_id, uint64_t va, uint32_t handle,
>>+		  uint64_t offset, uint64_t length, uint32_t syncobj,
>>+		  uint64_t fence_value);
>>+void i915_vm_unbind(int i915, uint32_t vm_id, uint64_t va, uint64_t length);
>>+int i915_vm_bind_version(int i915);
>>+
>>+#endif /* _I915_VM_BIND_ */
>>diff --git a/lib/meson.build b/lib/meson.build
>>index 8d6c8a244a..e50a2b0e79 100644
>>--- a/lib/meson.build
>>+++ b/lib/meson.build
>>@@ -14,6 +14,7 @@ lib_sources = [
>>  	'i915/intel_mocs.c',
>>  	'i915/i915_blt.c',
>>  	'i915/i915_crc.c',
>>+	'i915/i915_vm_bind.c',
>>  	'igt_collection.c',
>>  	'igt_color_encoding.c',
>>  	'igt_crc.c',


* Re: [igt-dev] [PATCH i-g-t v4 09/11] tests/i915/vm_bind: Add gem_exec3_basic test
  2022-10-18  9:34   ` Matthew Auld
@ 2022-10-18 14:46     ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-18 14:46 UTC (permalink / raw)
  To: Matthew Auld
  Cc: tvrtko.ursulin, igt-dev, thomas.hellstrom, daniel.vetter, petri.latvala

On Tue, Oct 18, 2022 at 10:34:06AM +0100, Matthew Auld wrote:
>On 18/10/2022 08:17, Niranjana Vishwanathapura wrote:
>>Port gem_exec_basic to a new gem_exec3_basic test which
>>uses the newer execbuf3 ioctl.
>>
>>v2: use i915_vm_bind library functions
>>
>>Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>>---
>>  tests/i915/gem_exec3_basic.c          | 133 ++++++++++++++++++++++++++
>>  tests/intel-ci/fast-feedback.testlist |   1 +
>>  tests/meson.build                     |   1 +
>>  3 files changed, 135 insertions(+)
>>  create mode 100644 tests/i915/gem_exec3_basic.c
>>
>>diff --git a/tests/i915/gem_exec3_basic.c b/tests/i915/gem_exec3_basic.c
>>new file mode 100644
>>index 0000000000..3994381748
>>--- /dev/null
>>+++ b/tests/i915/gem_exec3_basic.c
>>@@ -0,0 +1,133 @@
>>+// SPDX-License-Identifier: MIT
>>+/*
>>+ * Copyright © 2022 Intel Corporation
>>+ */
>>+
>>+/** @file gem_exec3_basic.c
>>+ *
>>+ * Basic execbuf3 test.
>>+ * Ported from gem_exec_basic and made to work with
>>+ * vm_bind and execbuf3.
>>+ *
>>+ */
>>+
>>+#include "i915/gem.h"
>>+#include "i915/i915_vm_bind.h"
>>+#include "igt.h"
>>+#include "igt_collection.h"
>>+#include "igt_syncobj.h"
>>+
>>+#include "i915/gem_create.h"
>>+#include "i915/gem_vm.h"
>>+
>>+IGT_TEST_DESCRIPTION("Basic sanity check of execbuf3-ioctl rings.");
>>+
>>+#define BATCH_VA	0xa0000000
>>+#define PAGE_SIZE	4096
>>+
>>+static uint64_t gettime_ns(void)
>>+{
>>+	struct timespec current;
>>+	clock_gettime(CLOCK_MONOTONIC, &current);
>>+	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
>>+}
>>+
>>+static uint32_t batch_create(int fd, uint64_t *batch_size, uint32_t region)
>>+{
>>+	const uint32_t bbe = MI_BATCH_BUFFER_END;
>>+	uint32_t handle;
>>+
>>+	igt_assert(__gem_create_in_memory_regions(fd, &handle, batch_size, region) == 0);
>>+	gem_write(fd, handle, 0, &bbe, sizeof(bbe));
>>+
>>+	return handle;
>>+}
>>+
>>+igt_main
>>+{
>>+	const struct intel_execution_engine2 *e;
>>+	struct drm_i915_query_memory_regions *query_info;
>>+	struct igt_collection *regions, *set;
>>+	const intel_ctx_t *base_ctx, *ctx;
>>+	uint32_t vm_id;
>>+	int fd = -1;
>>+
>>+	igt_fixture {
>>+		fd = drm_open_driver(DRIVER_INTEL);
>>+		base_ctx = intel_ctx_create_all_physical(fd);
>>+
>>+		igt_require_gem(fd);
>>+		igt_require(i915_vm_bind_version(fd) == 1);
>
>Here and elsewhere, should that rather be >= 1?
>

Currently we only support version 1. Version upgrades will be
backward compatible, so we can relax this check to >= once we
start working on version 2.
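
As an aside, switching to the forward-compatible gate later is a
one-line change; a hypothetical helper (not part of this series) would
just compare the reported version against a minimum:

```c
#include <stdbool.h>

/* Hypothetical helper: accept any reported VM_BIND version at or
 * above the minimum the test was written against, instead of
 * pinning an exact "== 1". */
static bool vm_bind_version_ok(int reported, int min_required)
{
	return reported >= min_required;
}
```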

Thanks,
Niranjana

>Reviewed-by: Matthew Auld <matthew.auld@intel.com>
>
>
>>+		igt_fork_hang_detector(fd);
>>+
>>+		vm_id = gem_vm_create_in_vm_bind_mode(fd);
>>+		ctx  = intel_ctx_create(fd, &base_ctx->cfg);
>>+		gem_context_set_vm(fd, ctx->id, vm_id);
>>+
>>+		query_info = gem_get_query_memory_regions(fd);
>>+		igt_assert(query_info);
>>+
>>+		set = get_memory_region_set(query_info,
>>+					    I915_SYSTEM_MEMORY,
>>+					    I915_DEVICE_MEMORY);
>>+	}
>>+
>>+	igt_describe("Check basic functionality of GEM_EXECBUFFER3 ioctl on every"
>>+		     " ring and iterating over memory regions.");
>>+	igt_subtest_with_dynamic("basic") {
>>+		for_each_combination(regions, 1, set) {
>>+			struct drm_i915_gem_timeline_fence exec_fence[GEM_MAX_ENGINES][2] = { };
>>+			char *sub_name = memregion_dynamic_subtest_name(regions);
>>+			uint32_t region = igt_collection_get_value(regions, 0);
>>+			uint32_t bind_syncobj, exec_syncobj[GEM_MAX_ENGINES];
>>+			uint64_t fence_value[GEM_MAX_ENGINES] = { };
>>+			uint64_t batch_size = PAGE_SIZE;
>>+			uint32_t handle, idx = 0;
>>+
>>+			handle = batch_create(fd, &batch_size, region);
>>+			bind_syncobj = syncobj_create(fd, 0);
>>+			i915_vm_bind(fd, vm_id, BATCH_VA, handle, 0, batch_size, bind_syncobj, 0);
>>+
>>+			for_each_ctx_engine(fd, ctx, e) {
>>+				igt_dynamic_f("%s-%s", e->name, sub_name) {
>>+					struct drm_i915_gem_execbuffer3 execbuf = {
>>+						.ctx_id = ctx->id,
>>+						.batch_address = BATCH_VA,
>>+						.engine_idx = e->flags,
>>+						.fence_count = 2,
>>+						.timeline_fences = to_user_pointer(&exec_fence[idx]),
>>+					};
>>+
>>+					exec_syncobj[idx] = syncobj_create(fd, 0);
>>+					exec_fence[idx][0].handle = bind_syncobj;
>>+					exec_fence[idx][0].flags = I915_TIMELINE_FENCE_WAIT;
>>+					exec_fence[idx][1].handle = exec_syncobj[idx];
>>+					exec_fence[idx][1].flags = I915_TIMELINE_FENCE_SIGNAL;
>>+					idx++;
>>+
>>+					gem_execbuf3(fd, &execbuf);
>>+				}
>>+			}
>>+
>>+			igt_assert(syncobj_timeline_wait(fd, exec_syncobj, fence_value, idx,
>>+							 gettime_ns() + (2 * NSEC_PER_SEC),
>>+							 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
>>+			while (idx)
>>+				syncobj_destroy(fd, exec_syncobj[--idx]);
>>+			syncobj_destroy(fd, bind_syncobj);
>>+			i915_vm_unbind(fd, vm_id, BATCH_VA, batch_size);
>>+			gem_close(fd, handle);
>>+			free(sub_name);
>>+		}
>>+	}
>>+
>>+	igt_fixture {
>>+		intel_ctx_destroy(fd, ctx);
>>+		gem_vm_destroy(fd, vm_id);
>>+		free(query_info);
>>+		igt_collection_destroy(set);
>>+		igt_stop_hang_detector();
>>+		intel_ctx_destroy(fd, base_ctx);
>>+		close(fd);
>>+	}
>>+}
>>diff --git a/tests/intel-ci/fast-feedback.testlist b/tests/intel-ci/fast-feedback.testlist
>>index 8a47eaddda..c5be7e300b 100644
>>--- a/tests/intel-ci/fast-feedback.testlist
>>+++ b/tests/intel-ci/fast-feedback.testlist
>>@@ -27,6 +27,7 @@ igt@gem_exec_fence@nb-await
>>  igt@gem_exec_gttfill@basic
>>  igt@gem_exec_parallel@engines
>>  igt@gem_exec_store@basic
>>+igt@gem_exec3_basic@basic
>>  igt@gem_flink_basic@bad-flink
>>  igt@gem_flink_basic@bad-open
>>  igt@gem_flink_basic@basic
>>diff --git a/tests/meson.build b/tests/meson.build
>>index 1b9529a7e2..50de8b7e89 100644
>>--- a/tests/meson.build
>>+++ b/tests/meson.build
>>@@ -252,6 +252,7 @@ i915_progs = [
>>  	'sysfs_timeslice_duration',
>>  	'i915_vm_bind_sanity',
>>  	'i915_vm_bind_basic',
>>+	'gem_exec3_basic',
>>  ]
>>  msm_progs = [


* [igt-dev] ✗ Fi.CI.IGT: failure for vm_bind: Add VM_BIND validation support (rev8)
  2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
                   ` (11 preceding siblings ...)
  2022-10-18  7:45 ` [igt-dev] ✓ Fi.CI.BAT: success for vm_bind: Add VM_BIND validation support (rev8) Patchwork
@ 2022-10-18 22:13 ` Patchwork
  12 siblings, 0 replies; 28+ messages in thread
From: Patchwork @ 2022-10-18 22:13 UTC (permalink / raw)
  To: Niranjana Vishwanathapura; +Cc: igt-dev


== Series Details ==

Series: vm_bind: Add VM_BIND validation support (rev8)
URL   : https://patchwork.freedesktop.org/series/105881/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_12254_full -> IGTPW_7975_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes introduced with IGTPW_7975_full must be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in IGTPW_7975_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/index.html

Participating hosts (11 -> 8)
------------------------------

  Missing    (3): pig-skl-6260u pig-kbl-iris pig-glk-j5005 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_7975_full:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_pm_rpm@system-suspend:
    - shard-apl:          NOTRUN -> [INCOMPLETE][1] +2 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl1/igt@i915_pm_rpm@system-suspend.html

  * {igt@i915_vm_bind_basic@skip_copy} (NEW):
    - shard-tglb:         NOTRUN -> [SKIP][2] +6 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@i915_vm_bind_basic@skip_copy.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-0-async-flip:
    - shard-tglb:         [PASS][3] -> [FAIL][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12254/shard-tglb7/igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-0-async-flip.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-0-async-flip.html

  * igt@kms_flip@flip-vs-suspend-interruptible@a-edp1:
    - shard-tglb:         [PASS][5] -> [INCOMPLETE][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12254/shard-tglb1/igt@kms_flip@flip-vs-suspend-interruptible@a-edp1.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_flip@flip-vs-suspend-interruptible@a-edp1.html

  
New tests
---------

  New tests have been introduced between CI_DRM_12254_full and IGTPW_7975_full:

### New IGT tests (12) ###

  * igt@gem_exec3_balancer@parallel-ordering:
    - Statuses : 2 skip(s)
    - Exec time: [0.0] s

  * igt@gem_exec3_basic@basic:
    - Statuses : 3 skip(s)
    - Exec time: [0.0] s

  * igt@gem_lmem_swapping@vm_bind:
    - Statuses : 1 skip(s)
    - Exec time: [0.0] s

  * igt@i915_vm_bind_basic@2m:
    - Statuses : 3 skip(s)
    - Exec time: [0.0] s

  * igt@i915_vm_bind_basic@64k:
    - Statuses :
    - Exec time: [None] s

  * igt@i915_vm_bind_basic@basic:
    - Statuses : 1 skip(s)
    - Exec time: [0.0] s

  * igt@i915_vm_bind_basic@multi_cmds:
    - Statuses : 1 skip(s)
    - Exec time: [0.0] s

  * igt@i915_vm_bind_basic@multi_ctxts:
    - Statuses : 3 skip(s)
    - Exec time: [0.0] s

  * igt@i915_vm_bind_basic@share_vm:
    - Statuses : 2 skip(s)
    - Exec time: [0.0] s

  * igt@i915_vm_bind_basic@skip_copy:
    - Statuses : 2 skip(s)
    - Exec time: [0.0] s

  * igt@i915_vm_bind_basic@skip_unbind:
    - Statuses : 1 skip(s)
    - Exec time: [0.0] s

  * igt@i915_vm_bind_sanity@basic:
    - Statuses : 3 skip(s)
    - Exec time: [0.0] s

  

Known issues
------------

  Here are the changes found in IGTPW_7975_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@feature_discovery@display-4x:
    - shard-tglb:         NOTRUN -> [SKIP][7] ([i915#1839])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@feature_discovery@display-4x.html

  * igt@gem_ccs@block-copy-compressed:
    - shard-tglb:         NOTRUN -> [SKIP][8] ([i915#3555] / [i915#5325])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@gem_ccs@block-copy-compressed.html

  * igt@gem_create@create-ext-cpu-access-big:
    - shard-tglb:         NOTRUN -> [SKIP][9] ([i915#6335])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@gem_create@create-ext-cpu-access-big.html

  * igt@gem_create@create-massive:
    - shard-apl:          NOTRUN -> [DMESG-WARN][10] ([i915#4991])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl1/igt@gem_create@create-massive.html

  * igt@gem_ctx_persistence@engines-cleanup:
    - shard-snb:          NOTRUN -> [SKIP][11] ([fdo#109271] / [i915#1099]) +2 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-snb4/igt@gem_ctx_persistence@engines-cleanup.html

  * igt@gem_exec_balancer@parallel-ordering:
    - shard-tglb:         NOTRUN -> [FAIL][12] ([i915#6117])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@gem_exec_balancer@parallel-ordering.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-tglb:         NOTRUN -> [FAIL][13] ([i915#2842])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_lmem_swapping@parallel-random-verify:
    - shard-tglb:         NOTRUN -> [SKIP][14] ([i915#4613]) +5 similar issues
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@gem_lmem_swapping@parallel-random-verify.html

  * igt@gem_lmem_swapping@verify-random-ccs:
    - shard-apl:          NOTRUN -> [SKIP][15] ([fdo#109271] / [i915#4613]) +7 similar issues
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl1/igt@gem_lmem_swapping@verify-random-ccs.html

  * igt@gem_pxp@protected-encrypted-src-copy-not-readible:
    - shard-tglb:         NOTRUN -> [SKIP][16] ([i915#4270]) +3 similar issues
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@gem_pxp@protected-encrypted-src-copy-not-readible.html

  * igt@gem_softpin@noreloc-s3:
    - shard-apl:          NOTRUN -> [INCOMPLETE][17] ([i915#7236])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl2/igt@gem_softpin@noreloc-s3.html

  * igt@gem_userptr_blits@readonly-pwrite-unsync:
    - shard-tglb:         NOTRUN -> [SKIP][18] ([i915#3297]) +1 similar issue
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@gem_userptr_blits@readonly-pwrite-unsync.html

  * igt@gen7_exec_parse@oacontrol-tracking:
    - shard-tglb:         NOTRUN -> [SKIP][19] ([fdo#109289]) +3 similar issues
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@gen7_exec_parse@oacontrol-tracking.html

  * igt@gen9_exec_parse@allowed-single:
    - shard-tglb:         NOTRUN -> [SKIP][20] ([i915#2527] / [i915#2856])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@gen9_exec_parse@allowed-single.html

  * igt@i915_pm_freq_mult@media-freq@gt0:
    - shard-tglb:         NOTRUN -> [SKIP][21] ([i915#6590])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@i915_pm_freq_mult@media-freq@gt0.html

  * igt@i915_pm_lpsp@screens-disabled:
    - shard-tglb:         NOTRUN -> [SKIP][22] ([i915#1902])
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@i915_pm_lpsp@screens-disabled.html

  * igt@i915_pm_rpm@dpms-mode-unset-non-lpsp:
    - shard-tglb:         NOTRUN -> [SKIP][23] ([fdo#111644] / [i915#1397])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@i915_pm_rpm@dpms-mode-unset-non-lpsp.html

  * igt@i915_pm_rpm@system-suspend-execbuf:
    - shard-apl:          NOTRUN -> [INCOMPLETE][24] ([i915#7241])
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl6/igt@i915_pm_rpm@system-suspend-execbuf.html

  * igt@i915_query@hwconfig_table:
    - shard-tglb:         NOTRUN -> [SKIP][25] ([i915#6245])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@i915_query@hwconfig_table.html

  * igt@i915_query@query-topology-unsupported:
    - shard-tglb:         NOTRUN -> [SKIP][26] ([fdo#109302])
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@i915_query@query-topology-unsupported.html

  * igt@i915_selftest@live@workarounds:
    - shard-tglb:         NOTRUN -> [INCOMPLETE][27] ([i915#7222])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@i915_selftest@live@workarounds.html

  * igt@i915_selftest@perf@migrate:
    - shard-apl:          NOTRUN -> [INCOMPLETE][28] ([i915#7223] / [i915#7249])
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl8/igt@i915_selftest@perf@migrate.html

  * igt@i915_suspend@sysfs-reader:
    - shard-apl:          NOTRUN -> [INCOMPLETE][29] ([i915#4817] / [i915#7233])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl3/igt@i915_suspend@sysfs-reader.html

  * igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0:
    - shard-tglb:         NOTRUN -> [SKIP][30] ([i915#5286]) +5 similar issues
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0.html

  * igt@kms_big_fb@linear-64bpp-rotate-90:
    - shard-tglb:         NOTRUN -> [SKIP][31] ([fdo#111614]) +5 similar issues
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_big_fb@linear-64bpp-rotate-90.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-0-hflip-async-flip:
    - shard-tglb:         [PASS][32] -> [FAIL][33] ([i915#3743]) +2 similar issues
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12254/shard-tglb1/igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-0-hflip-async-flip.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-0-hflip-async-flip.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-180-hflip:
    - shard-apl:          NOTRUN -> [SKIP][34] ([fdo#109271]) +234 similar issues
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl2/igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-180-hflip.html

  * igt@kms_big_fb@yf-tiled-64bpp-rotate-90:
    - shard-tglb:         NOTRUN -> [SKIP][35] ([fdo#111615]) +3 similar issues
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_big_fb@yf-tiled-64bpp-rotate-90.html

  * igt@kms_big_joiner@2x-modeset:
    - shard-tglb:         NOTRUN -> [SKIP][36] ([i915#2705])
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_big_joiner@2x-modeset.html

  * igt@kms_ccs@pipe-a-bad-pixel-format-4_tiled_dg2_mc_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][37] ([i915#6095]) +2 similar issues
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_ccs@pipe-a-bad-pixel-format-4_tiled_dg2_mc_ccs.html

  * igt@kms_ccs@pipe-a-bad-rotation-90-yf_tiled_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][38] ([fdo#111615] / [i915#3689]) +3 similar issues
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_ccs@pipe-a-bad-rotation-90-yf_tiled_ccs.html

  * igt@kms_ccs@pipe-a-missing-ccs-buffer-y_tiled_gen12_rc_ccs_cc:
    - shard-apl:          NOTRUN -> [SKIP][39] ([fdo#109271] / [i915#3886]) +12 similar issues
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl7/igt@kms_ccs@pipe-a-missing-ccs-buffer-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-b-random-ccs-data-4_tiled_dg2_rc_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][40] ([i915#3689] / [i915#6095]) +3 similar issues
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_ccs@pipe-b-random-ccs-data-4_tiled_dg2_rc_ccs.html

  * igt@kms_ccs@pipe-c-bad-rotation-90-y_tiled_gen12_mc_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][41] ([i915#3689] / [i915#3886]) +5 similar issues
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_ccs@pipe-c-bad-rotation-90-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-c-random-ccs-data-y_tiled_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][42] ([i915#3689]) +9 similar issues
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_ccs@pipe-c-random-ccs-data-y_tiled_ccs.html

  * igt@kms_cdclk@mode-transition:
    - shard-tglb:         NOTRUN -> [SKIP][43] ([i915#3742])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_cdclk@mode-transition.html

  * igt@kms_chamelium@dp-mode-timings:
    - shard-snb:          NOTRUN -> [SKIP][44] ([fdo#109271] / [fdo#111827]) +5 similar issues
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-snb2/igt@kms_chamelium@dp-mode-timings.html

  * igt@kms_chamelium@hdmi-hpd-after-suspend:
    - shard-tglb:         NOTRUN -> [SKIP][45] ([fdo#109284] / [fdo#111827]) +6 similar issues
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_chamelium@hdmi-hpd-after-suspend.html

  * igt@kms_color@ctm-0-25@pipe-c-edp-1:
    - shard-tglb:         NOTRUN -> [FAIL][46] ([i915#315] / [i915#6946]) +3 similar issues
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_color@ctm-0-25@pipe-c-edp-1.html

  * igt@kms_color_chamelium@ctm-0-50:
    - shard-apl:          NOTRUN -> [SKIP][47] ([fdo#109271] / [fdo#111827]) +10 similar issues
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl3/igt@kms_color_chamelium@ctm-0-50.html

  * igt@kms_content_protection@atomic-dpms:
    - shard-tglb:         NOTRUN -> [SKIP][48] ([i915#7118]) +1 similar issue
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_content_protection@atomic-dpms.html

  * igt@kms_content_protection@dp-mst-lic-type-0:
    - shard-tglb:         NOTRUN -> [SKIP][49] ([i915#3116] / [i915#3299]) +2 similar issues
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_content_protection@dp-mst-lic-type-0.html

  * igt@kms_cursor_crc@cursor-random-32x32:
    - shard-tglb:         NOTRUN -> [SKIP][50] ([i915#3555]) +6 similar issues
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_cursor_crc@cursor-random-32x32.html

  * igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic:
    - shard-tglb:         NOTRUN -> [SKIP][51] ([fdo#109274] / [fdo#111825]) +10 similar issues
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic.html

  * igt@kms_cursor_legacy@short-busy-flip-before-cursor:
    - shard-tglb:         NOTRUN -> [SKIP][52] ([i915#4103])
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_cursor_legacy@short-busy-flip-before-cursor.html

  * igt@kms_fbcon_fbt@fbc-suspend:
    - shard-snb:          NOTRUN -> [FAIL][53] ([fdo#103375]) +2 similar issues
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-snb4/igt@kms_fbcon_fbt@fbc-suspend.html

  * igt@kms_flip@2x-absolute-wf_vblank:
    - shard-tglb:         NOTRUN -> [SKIP][54] ([fdo#109274] / [fdo#111825] / [i915#3637] / [i915#3966])
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_flip@2x-absolute-wf_vblank.html

  * igt@kms_flip@2x-flip-vs-modeset:
    - shard-tglb:         NOTRUN -> [SKIP][55] ([fdo#109274] / [fdo#111825] / [i915#3637]) +8 similar issues
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_flip@2x-flip-vs-modeset.html

  * igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-32bpp-yftile-upscaling@pipe-a-valid-mode:
    - shard-tglb:         NOTRUN -> [SKIP][56] ([i915#2587] / [i915#2672]) +4 similar issues
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_flip_scaled_crc@flip-64bpp-yftile-to-32bpp-yftile-upscaling@pipe-a-valid-mode.html

  * igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-cur-indfb-onoff:
    - shard-tglb:         NOTRUN -> [SKIP][57] ([fdo#109280] / [fdo#111825]) +31 similar issues
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-cur-indfb-onoff.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-shrfb-pgflip-blt:
    - shard-tglb:         NOTRUN -> [SKIP][58] ([i915#6497]) +11 similar issues
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-shrfb-pgflip-blt.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-pri-indfb-draw-mmap-cpu:
    - shard-snb:          NOTRUN -> [SKIP][59] ([fdo#109271]) +218 similar issues
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-snb6/igt@kms_frontbuffer_tracking@fbcpsr-2p-primscrn-pri-indfb-draw-mmap-cpu.html

  * igt@kms_plane_lowres@tiling-y@pipe-c-edp-1:
    - shard-tglb:         NOTRUN -> [SKIP][60] ([i915#3536]) +11 similar issues
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_plane_lowres@tiling-y@pipe-c-edp-1.html

  * igt@kms_plane_scaling@plane-downscale-with-rotation-factor-0-75@pipe-d-edp-1:
    - shard-tglb:         NOTRUN -> [SKIP][61] ([i915#5176]) +19 similar issues
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@kms_plane_scaling@plane-downscale-with-rotation-factor-0-75@pipe-d-edp-1.html

  * igt@kms_prime@basic-modeset-hybrid:
    - shard-tglb:         NOTRUN -> [SKIP][62] ([i915#6524]) +1 similar issue
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_prime@basic-modeset-hybrid.html

  * igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-sf:
    - shard-tglb:         NOTRUN -> [SKIP][63] ([i915#2920]) +1 similar issue
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_psr2_sf@cursor-plane-move-continuous-exceed-sf.html

  * igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-big-fb:
    - shard-apl:          NOTRUN -> [SKIP][64] ([fdo#109271] / [i915#658]) +1 similar issue
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl8/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-big-fb.html

  * igt@kms_psr@psr2_sprite_mmap_cpu:
    - shard-tglb:         NOTRUN -> [FAIL][65] ([i915#132] / [i915#3467]) +1 similar issue
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_psr@psr2_sprite_mmap_cpu.html

  * igt@kms_rotation_crc@primary-yf-tiled-reflect-x-90:
    - shard-tglb:         NOTRUN -> [SKIP][66] ([fdo#111615] / [i915#5289])
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_rotation_crc@primary-yf-tiled-reflect-x-90.html

  * igt@kms_vblank@pipe-c-ts-continuation-dpms-suspend:
    - shard-apl:          NOTRUN -> [INCOMPLETE][67] ([i915#4939])
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl7/igt@kms_vblank@pipe-c-ts-continuation-dpms-suspend.html

  * igt@kms_writeback@writeback-fb-id:
    - shard-apl:          NOTRUN -> [SKIP][68] ([fdo#109271] / [i915#2437])
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl1/igt@kms_writeback@writeback-fb-id.html

  * igt@perf_pmu@event-wait@rcs0:
    - shard-tglb:         NOTRUN -> [SKIP][69] ([fdo#112283])
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@perf_pmu@event-wait@rcs0.html

  * igt@prime_vgem@coherency-gtt:
    - shard-tglb:         NOTRUN -> [SKIP][70] ([fdo#109295] / [fdo#111656])
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb7/igt@prime_vgem@coherency-gtt.html

  * igt@prime_vgem@fence-flip-hang:
    - shard-tglb:         NOTRUN -> [SKIP][71] ([fdo#109295])
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@prime_vgem@fence-flip-hang.html

  * igt@sysfs_clients@pidname:
    - shard-apl:          NOTRUN -> [SKIP][72] ([fdo#109271] / [i915#2994]) +2 similar issues
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-apl7/igt@sysfs_clients@pidname.html

  * igt@sysfs_clients@sema-50:
    - shard-tglb:         NOTRUN -> [SKIP][73] ([i915#2994]) +1 similar issue
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@sysfs_clients@sema-50.html

  
#### Possible fixes ####

  * igt@gem_huc_copy@huc-copy:
    - shard-tglb:         [SKIP][74] ([i915#2190]) -> [PASS][75]
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12254/shard-tglb7/igt@gem_huc_copy@huc-copy.html
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@gem_huc_copy@huc-copy.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180-hflip-async-flip:
    - shard-tglb:         [FAIL][76] ([i915#3743]) -> [PASS][77]
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12254/shard-tglb1/igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180-hflip-async-flip.html
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/shard-tglb1/igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180-hflip-async-flip.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103375]: https://bugs.freedesktop.org/show_bug.cgi?id=103375
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109274]: https://bugs.freedesktop.org/show_bug.cgi?id=109274
  [fdo#109280]: https://bugs.freedesktop.org/show_bug.cgi?id=109280
  [fdo#109284]: https://bugs.freedesktop.org/show_bug.cgi?id=109284
  [fdo#109289]: https://bugs.freedesktop.org/show_bug.cgi?id=109289
  [fdo#109295]: https://bugs.freedesktop.org/show_bug.cgi?id=109295
  [fdo#109302]: https://bugs.freedesktop.org/show_bug.cgi?id=109302
  [fdo#111614]: https://bugs.freedesktop.org/show_bug.cgi?id=111614
  [fdo#111615]: https://bugs.freedesktop.org/show_bug.cgi?id=111615
  [fdo#111644]: https://bugs.freedesktop.org/show_bug.cgi?id=111644
  [fdo#111656]: https://bugs.freedesktop.org/show_bug.cgi?id=111656
  [fdo#111825]: https://bugs.freedesktop.org/show_bug.cgi?id=111825
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [fdo#112283]: https://bugs.freedesktop.org/show_bug.cgi?id=112283
  [i915#1099]: https://gitlab.freedesktop.org/drm/intel/issues/1099
  [i915#132]: https://gitlab.freedesktop.org/drm/intel/issues/132
  [i915#1397]: https://gitlab.freedesktop.org/drm/intel/issues/1397
  [i915#1839]: https://gitlab.freedesktop.org/drm/intel/issues/1839
  [i915#1902]: https://gitlab.freedesktop.org/drm/intel/issues/1902
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2437]: https://gitlab.freedesktop.org/drm/intel/issues/2437
  [i915#2527]: https://gitlab.freedesktop.org/drm/intel/issues/2527
  [i915#2587]: https://gitlab.freedesktop.org/drm/intel/issues/2587
  [i915#2672]: https://gitlab.freedesktop.org/drm/intel/issues/2672
  [i915#2705]: https://gitlab.freedesktop.org/drm/intel/issues/2705
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2856]: https://gitlab.freedesktop.org/drm/intel/issues/2856
  [i915#2920]: https://gitlab.freedesktop.org/drm/intel/issues/2920
  [i915#2994]: https://gitlab.freedesktop.org/drm/intel/issues/2994
  [i915#3116]: https://gitlab.freedesktop.org/drm/intel/issues/3116
  [i915#315]: https://gitlab.freedesktop.org/drm/intel/issues/315
  [i915#3297]: https://gitlab.freedesktop.org/drm/intel/issues/3297
  [i915#3299]: https://gitlab.freedesktop.org/drm/intel/issues/3299
  [i915#3467]: https://gitlab.freedesktop.org/drm/intel/issues/3467
  [i915#3536]: https://gitlab.freedesktop.org/drm/intel/issues/3536
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3637]: https://gitlab.freedesktop.org/drm/intel/issues/3637
  [i915#3689]: https://gitlab.freedesktop.org/drm/intel/issues/3689
  [i915#3742]: https://gitlab.freedesktop.org/drm/intel/issues/3742
  [i915#3743]: https://gitlab.freedesktop.org/drm/intel/issues/3743
  [i915#3886]: https://gitlab.freedesktop.org/drm/intel/issues/3886
  [i915#3966]: https://gitlab.freedesktop.org/drm/intel/issues/3966
  [i915#4103]: https://gitlab.freedesktop.org/drm/intel/issues/4103
  [i915#4270]: https://gitlab.freedesktop.org/drm/intel/issues/4270
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4817]: https://gitlab.freedesktop.org/drm/intel/issues/4817
  [i915#4939]: https://gitlab.freedesktop.org/drm/intel/issues/4939
  [i915#4991]: https://gitlab.freedesktop.org/drm/intel/issues/4991
  [i915#5176]: https://gitlab.freedesktop.org/drm/intel/issues/5176
  [i915#5286]: https://gitlab.freedesktop.org/drm/intel/issues/5286
  [i915#5289]: https://gitlab.freedesktop.org/drm/intel/issues/5289
  [i915#5325]: https://gitlab.freedesktop.org/drm/intel/issues/5325
  [i915#6095]: https://gitlab.freedesktop.org/drm/intel/issues/6095
  [i915#6117]: https://gitlab.freedesktop.org/drm/intel/issues/6117
  [i915#6245]: https://gitlab.freedesktop.org/drm/intel/issues/6245
  [i915#6335]: https://gitlab.freedesktop.org/drm/intel/issues/6335
  [i915#6497]: https://gitlab.freedesktop.org/drm/intel/issues/6497
  [i915#6524]: https://gitlab.freedesktop.org/drm/intel/issues/6524
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#6590]: https://gitlab.freedesktop.org/drm/intel/issues/6590
  [i915#6946]: https://gitlab.freedesktop.org/drm/intel/issues/6946
  [i915#7118]: https://gitlab.freedesktop.org/drm/intel/issues/7118
  [i915#7222]: https://gitlab.freedesktop.org/drm/intel/issues/7222
  [i915#7223]: https://gitlab.freedesktop.org/drm/intel/issues/7223
  [i915#7233]: https://gitlab.freedesktop.org/drm/intel/issues/7233
  [i915#7236]: https://gitlab.freedesktop.org/drm/intel/issues/7236
  [i915#7241]: https://gitlab.freedesktop.org/drm/intel/issues/7241
  [i915#7249]: https://gitlab.freedesktop.org/drm/intel/issues/7249


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_7018 -> IGTPW_7975
  * Piglit: piglit_4509 -> None

  CI-20190529: 20190529
  CI_DRM_12254: 2e6cdde56f896add665edb8d2f6d3dfce8b1b3b6 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_7975: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/index.html
  IGT_7018: 8312a2fe3f3287ba4ac4bc8d100de0734480f482 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_7975/index.html



* Re: [igt-dev] [PATCH i-g-t v4 07/11] tests/i915/vm_bind: Add basic VM_BIND test support
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 07/11] tests/i915/vm_bind: Add basic VM_BIND test support Niranjana Vishwanathapura
@ 2022-10-21 16:27   ` Matthew Auld
  2022-10-25  7:06     ` Niranjana Vishwanathapura
  0 siblings, 1 reply; 28+ messages in thread
From: Matthew Auld @ 2022-10-21 16:27 UTC (permalink / raw)
  To: Niranjana Vishwanathapura, igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, daniel.vetter, petri.latvala

On 18/10/2022 08:17, Niranjana Vishwanathapura wrote:
> Add basic tests for VM_BIND functionality. Bind the buffer objects in
> device page table with VM_BIND calls and have GPU copy the data from a
> source buffer object to destination buffer object.
> Test for different buffer sizes, buffer object placement and with
> multiple contexts.
> 
> v2: Add basic test to fast-feedback.testlist
> v3: Run all tests for different memory types,
>      pass VA instead of an array for batch_address
> v4: Iterate for memory region types instead of hardcoding,
>      use i915_vm_bind library functions
> 
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> 
> basic

Unwanted change?

> ---
>   tests/i915/i915_vm_bind_basic.c       | 522 ++++++++++++++++++++++++++
>   tests/intel-ci/fast-feedback.testlist |   1 +
>   tests/meson.build                     |   1 +
>   3 files changed, 524 insertions(+)
>   create mode 100644 tests/i915/i915_vm_bind_basic.c
> 
> diff --git a/tests/i915/i915_vm_bind_basic.c b/tests/i915/i915_vm_bind_basic.c
> new file mode 100644
> index 0000000000..ede76ba095
> --- /dev/null
> +++ b/tests/i915/i915_vm_bind_basic.c
> @@ -0,0 +1,522 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +/** @file i915_vm_bind_basic.c
> + *
> + * This is the basic test for VM_BIND functionality.
> + *
> + * The goal is to ensure that basics work.
> + */
> +
> +#include <sys/poll.h>
> +
> +#include "i915/gem.h"
> +#include "i915/i915_vm_bind.h"
> +#include "igt.h"
> +#include "igt_syncobj.h"
> +#include <unistd.h>
> +#include <stdlib.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include <fcntl.h>
> +#include <inttypes.h>
> +#include <errno.h>
> +#include <sys/stat.h>
> +#include <sys/ioctl.h>
> +#include "drm.h"
> +#include "i915/gem_vm.h"
> +
> +IGT_TEST_DESCRIPTION("Basic test for vm_bind functionality");
> +
> +#define PAGE_SIZE   4096
> +#define PAGE_SHIFT  12
> +
> +#define GEN9_XY_FAST_COPY_BLT_CMD       (2 << 29 | 0x42 << 22)
> +#define BLT_DEPTH_32                    (3 << 24)
> +
> +#define DEFAULT_BUFF_SIZE  (4 * PAGE_SIZE)
> +#define SZ_64K             (16 * PAGE_SIZE)
> +#define SZ_2M              (512 * PAGE_SIZE)
> +
> +#define MAX_CTXTS   2
> +#define MAX_CMDS    4
> +
> +#define BATCH_FENCE  0
> +#define SRC_FENCE    1
> +#define DST_FENCE    2
> +#define EXEC_FENCE   3
> +#define NUM_FENCES   4
> +
> +enum {
> +	BATCH_MAP,
> +	SRC_MAP,
> +	DST_MAP = SRC_MAP + MAX_CMDS,
> +	MAX_MAP
> +};
> +
> +struct mapping {
> +	uint32_t  obj;
> +	uint64_t  va;
> +	uint64_t  offset;
> +	uint64_t  length;
> +	uint64_t  flags;
> +};
> +
> +#define SET_MAP(map, _obj, _va, _offset, _length, _flags)   \
> +{                                  \
> +	(map).obj = _obj;          \
> +	(map).va = _va;            \
> +	(map).offset = _offset;	   \
> +	(map).length = _length;	   \
> +	(map).flags = _flags;	   \
> +}
> +
> +#define MAX_BATCH_DWORD    64
> +
> +#define abs(x) ((x) >= 0 ? (x) : -(x))
> +
> +#define TEST_LMEM             BIT(0)
> +#define TEST_SKIP_UNBIND      BIT(1)
> +#define TEST_SHARE_VM         BIT(2)
> +
> +#define is_lmem(cfg)        ((cfg)->flags & TEST_LMEM)
> +#define do_unbind(cfg)      (!((cfg)->flags & TEST_SKIP_UNBIND))
> +#define do_share_vm(cfg)    ((cfg)->flags & TEST_SHARE_VM)
> +
> +struct test_cfg {
> +	const char *name;
> +	uint32_t size;
> +	uint8_t num_cmds;
> +	uint32_t num_ctxts;
> +	uint32_t flags;
> +};
> +
> +static uint64_t
> +gettime_ns(void)
> +{
> +	struct timespec current;
> +	clock_gettime(CLOCK_MONOTONIC, &current);
> +	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
> +}
> +
> +static bool syncobj_busy(int fd, uint32_t handle)
> +{
> +	bool result;
> +	int sf;
> +
> +	sf = syncobj_handle_to_fd(fd, handle,
> +				  DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE);
> +	result = poll(&(struct pollfd){sf, POLLIN}, 1, 0) == 0;
> +	close(sf);
> +
> +	return result;
> +}
> +
> +static inline void vm_bind(int fd, uint32_t vm_id, struct mapping *m,
> +			   struct drm_i915_gem_timeline_fence *fence)
> +{
> +	uint32_t syncobj = 0;
> +
> +	if (fence) {
> +		syncobj = syncobj_create(fd, 0);
> +
> +		fence->handle = syncobj;
> +		fence->flags = I915_TIMELINE_FENCE_WAIT;
> +		fence->value = 0;
> +	}
> +
> +	igt_info("VM_BIND vm:0x%x h:0x%x v:0x%lx o:0x%lx l:0x%lx\n",
> +		 vm_id, m->obj, m->va, m->offset, m->length);

Aren't these igt_info() calls here and elsewhere a bit too loud? Maybe
just igt_debug() for some of them? It looks quite noisy when running
this test locally.

> +	i915_vm_bind(fd, vm_id, m->va, m->obj, m->offset, m->length, syncobj, 0);
> +}
> +
> +static inline void vm_unbind(int fd, uint32_t vm_id, struct mapping *m)
> +{
> +	/* Object handle is not required during unbind */
> +	igt_info("VM_UNBIND vm:0x%x v:0x%lx l:0x%lx\n", vm_id, m->va, m->length);
> +	i915_vm_unbind(fd, vm_id, m->va, m->length);
> +}
> +
> +static void print_buffer(void *buf, uint32_t size,
> +			 const char *str, bool full)
> +{
> +	uint32_t i = 0;
> +
> +	igt_debug("Printing %s size 0x%x\n", str, size);
> +	while (i < size) {
> +		uint32_t *b = buf + i;
> +
> +		igt_debug("\t%s[0x%04x]: 0x%08x 0x%08x 0x%08x 0x%08x %s\n",
> +			  str, i, b[0], b[1], b[2], b[3], full ? "" : "...");
> +		i += full ? 16 : PAGE_SIZE;
> +	}
> +}
> +
> +static int gem_linear_fast_blt(uint32_t *batch, uint64_t src,
> +			       uint64_t dst, uint32_t size)
> +{
> +	uint32_t *cmd = batch;
> +
> +	*cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
> +	*cmd++ = BLT_DEPTH_32 | PAGE_SIZE;
> +	*cmd++ = 0;
> +	*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> +	*cmd++ = lower_32_bits(dst);
> +	*cmd++ = upper_32_bits(dst);
> +	*cmd++ = 0;
> +	*cmd++ = PAGE_SIZE;
> +	*cmd++ = lower_32_bits(src);
> +	*cmd++ = upper_32_bits(src);
> +
> +	*cmd++ = MI_BATCH_BUFFER_END;
> +	*cmd++ = 0;
> +
> +	return ALIGN((cmd - batch + 1) * sizeof(uint32_t), 8);
> +}
> +
> +static void __gem_copy(int fd, uint64_t src, uint64_t dst, uint32_t size,
> +		       uint32_t ctx_id, void *batch_addr, unsigned int eb_flags,
> +		       struct drm_i915_gem_timeline_fence *fence)
> +{
> +	uint32_t len, buf[MAX_BATCH_DWORD] = { 0 };
> +	struct drm_i915_gem_execbuffer3 execbuf;
> +
> +	len = gem_linear_fast_blt(buf, src, dst, size);
> +
> +	memcpy(batch_addr, (void *)buf, len);
> +	print_buffer(buf, len, "batch", true);
> +
> +	memset(&execbuf, 0, sizeof(execbuf));
> +	execbuf.ctx_id = ctx_id;
> +	execbuf.batch_address = to_user_pointer(batch_addr);
> +	execbuf.engine_idx = eb_flags;
> +	execbuf.fence_count = NUM_FENCES;
> +	execbuf.timeline_fences = to_user_pointer(fence);
> +	gem_execbuf3(fd, &execbuf);
> +}
> +
> +static void i915_gem_copy(int fd, uint64_t src, uint64_t dst, uint32_t va_delta,
> +			  uint32_t delta, uint32_t size, const intel_ctx_t **ctx,
> +			  uint32_t num_ctxts, void **batch_addr, unsigned int eb_flags,
> +			  struct drm_i915_gem_timeline_fence (*fence)[NUM_FENCES])
> +{
> +	uint32_t i;
> +
> +	for (i = 0; i < num_ctxts; i++) {
> +		igt_info("Issuing gem copy on ctx 0x%x\n", ctx[i]->id);
> +		__gem_copy(fd, src + (i * va_delta), dst + (i * va_delta), delta,
> +			   ctx[i]->id, batch_addr[i], eb_flags, fence[i]);
> +	}
> +}
> +
> +static void i915_gem_sync(int fd, const intel_ctx_t **ctx, uint32_t num_ctxts,
> +			  struct drm_i915_gem_timeline_fence (*fence)[NUM_FENCES])
> +{
> +	uint32_t i;
> +
> +	for (i = 0; i < num_ctxts; i++) {
> +		uint64_t fence_value = 0;
> +
> +		igt_assert(syncobj_timeline_wait(fd, &fence[i][EXEC_FENCE].handle,
> +						 (uint64_t *)&fence_value, 1,
> +						 gettime_ns() + (2 * NSEC_PER_SEC),
> +						 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
> +		igt_assert(!syncobj_busy(fd, fence[i][EXEC_FENCE].handle));
> +		igt_info("gem copy completed on ctx 0x%x\n", ctx[i]->id);
> +	}
> +}
> +
> +static struct igt_collection *get_region_set(int fd, struct test_cfg *cfg, bool lmem_only)
> +{
> +	uint32_t mem_type[] = { I915_SYSTEM_MEMORY, I915_DEVICE_MEMORY };
> +	uint32_t lmem_type[] = { I915_DEVICE_MEMORY };
> +	struct drm_i915_query_memory_regions *query_info;
> +
> +	query_info = gem_get_query_memory_regions(fd);
> +	igt_assert(query_info);
> +
> +	if (lmem_only)
> +		return __get_memory_region_set(query_info, lmem_type, 1);
> +	else
> +		return __get_memory_region_set(query_info, mem_type, 2);
> +}
> +
> +static void create_src_objs(int fd, struct test_cfg *cfg, uint32_t src[], uint32_t size,
> +			    uint32_t num_cmds, void *src_addr[])
> +{
> +	int i;
> +	struct igt_collection *set = get_region_set(fd, cfg, is_lmem(cfg) && (num_cmds == 1));
> +	uint32_t region;
> +
> +	for (i = 0; i < num_cmds; i++) {
> +		region = igt_collection_get_value(set, i % set->size);
> +		src[i] = gem_create_in_memory_regions(fd, size, region);
> +		igt_info("Src obj 0x%x created in region 0x%x:0x%x\n", src[i],
> +			 MEMORY_TYPE_FROM_REGION(region), MEMORY_INSTANCE_FROM_REGION(region));
> +		src_addr[i] = gem_mmap__cpu(fd, src[i], 0, size, PROT_WRITE);
> +	}
> +}
> +
> +static void destroy_src_objs(int fd, struct test_cfg *cfg, uint32_t src[], uint32_t size,
> +			     uint32_t num_cmds, void *src_addr[])
> +{
> +	int i;
> +
> +	for (i = 0; i < num_cmds; i++) {
> +		igt_assert(gem_munmap(src_addr[i], size) == 0);
> +		igt_debug("Closing object 0x%x\n", src[i]);
> +		gem_close(fd, src[i]);
> +	}
> +}
> +
> +static uint32_t create_dst_obj(int fd, struct test_cfg *cfg, uint32_t size, void **dst_addr)
> +{
> +	uint32_t dst;
> +	struct igt_collection *set = get_region_set(fd, cfg, is_lmem(cfg));
> +	uint32_t region = igt_collection_get_value(set, 0);

Maybe just use the gem_memory_region and then we don't need this 
igt_collection stuff here and elsewhere? IMO it's a lot nicer.
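
Rough sketch of what I mean, assuming the gem_memory_region fields and
gem_create_in_memory_region_list() from lib/i915/intel_memory_region.h
(untested):

```c
/*
 * Hypothetical reshuffle: pass the gem_memory_region from the
 * for_each_memory_region() loop in igt_main down to the object
 * constructors, instead of rebuilding an igt_collection here.
 */
static uint32_t create_dst_obj(int fd, struct gem_memory_region *r,
			       uint32_t size, void **dst_addr)
{
	uint32_t dst = gem_create_in_memory_region_list(fd, size, 0,
							&r->ci, 1);

	/* r->name is the human-readable region name, e.g. "lmem0" */
	igt_info("Dst obj 0x%x created in region %s\n", dst, r->name);
	*dst_addr = gem_mmap__cpu(fd, dst, 0, size, PROT_WRITE);

	return dst;
}
```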

> +
> +	dst = gem_create_in_memory_regions(fd, size, region);
> +	igt_info("Dst obj 0x%x created in region 0x%x:0x%x\n", dst,
> +		 MEMORY_TYPE_FROM_REGION(region), MEMORY_INSTANCE_FROM_REGION(region));

And then here and elsewhere we can just use gem_memory_region.name, 
which is a lot more readable for humans :)

> +	*dst_addr = gem_mmap__cpu(fd, dst, 0, size, PROT_WRITE);
> +
> +	return dst;
> +}
> +
> +static void destroy_dst_obj(int fd, struct test_cfg *cfg, uint32_t dst, uint32_t size, void *dst_addr)
> +{
> +	igt_assert(gem_munmap(dst_addr, size) == 0);
> +	igt_debug("Closing object 0x%x\n", dst);
> +	gem_close(fd, dst);
> +}
> +
> +static void pattern_fill_buf(void *src_addr[], uint32_t size, uint32_t num_cmds, uint32_t npages)
> +{
> +	uint32_t i, j;
> +	void *buf;
> +
> +	/* Allocate buffer and fill pattern */
> +	buf = malloc(size);
> +	igt_assert(buf);
> +
> +	for (i = 0; i < num_cmds; i++) {
> +		for (j = 0; j < npages; j++)
> +			memset(buf + j * PAGE_SIZE, i * npages + j + 1, PAGE_SIZE);
> +
> +		memcpy(src_addr[i], buf, size);
> +	}
> +
> +	free(buf);
> +}
> +
> +static void run_test(int fd, const intel_ctx_t *base_ctx, struct test_cfg *cfg,
> +		     const struct intel_execution_engine2 *e)
> +{
> +	void *src_addr[MAX_CMDS] = { 0 }, *dst_addr = NULL;
> +	uint32_t src[MAX_CMDS], dst, i, size = cfg->size;
> +	struct drm_i915_gem_timeline_fence exec_fence[MAX_CTXTS][NUM_FENCES];
> +	uint32_t shared_vm_id, vm_id[MAX_CTXTS];
> +	struct mapping map[MAX_CTXTS][MAX_MAP];
> +	uint32_t num_ctxts = cfg->num_ctxts;
> +	uint32_t num_cmds = cfg->num_cmds;
> +	uint32_t npages = size / PAGE_SIZE;
> +	const intel_ctx_t *ctx[MAX_CTXTS];
> +	bool share_vm = do_share_vm(cfg);
> +	uint32_t delta, va_delta = SZ_2M;
> +	void *batch_addr[MAX_CTXTS];
> +	uint32_t batch[MAX_CTXTS];
> +	uint64_t src_va, dst_va;
> +
> +	delta = size / num_ctxts;
> +	if (share_vm)
> +		shared_vm_id = gem_vm_create_in_vm_bind_mode(fd);
> +
> +	/* Create contexts */
> +	num_ctxts = min_t(uint32_t, num_ctxts, MAX_CTXTS);
> +	for (i = 0; i < num_ctxts; i++) {
> +		uint32_t vmid;
> +
> +		if (share_vm)
> +			vmid = shared_vm_id;
> +		else
> +			vmid = gem_vm_create_in_vm_bind_mode(fd);
> +
> +		ctx[i] = intel_ctx_create(fd, &base_ctx->cfg);
> +		gem_context_set_vm(fd, ctx[i]->id, vmid);
> +		vm_id[i] = gem_context_get_vm(fd, ctx[i]->id);
> +
> +		exec_fence[i][EXEC_FENCE].handle = syncobj_create(fd, 0);
> +		exec_fence[i][EXEC_FENCE].flags = I915_TIMELINE_FENCE_SIGNAL;
> +		exec_fence[i][EXEC_FENCE].value = 0;
> +	}
> +
> +	/* Create objects */
> +	num_cmds = min_t(uint32_t, num_cmds, MAX_CMDS);
> +	create_src_objs(fd, cfg, src, size, num_cmds, src_addr);
> +	dst = create_dst_obj(fd, cfg, size, &dst_addr);
> +
> +	/*
> +	 * mmap'ed addresses are not 64K aligned. On platforms requiring
> +	 * 64K alignment, use static addresses.
> +	 */
> +	if (size < SZ_2M && num_cmds && !HAS_64K_PAGES(intel_get_drm_devid(fd))) {
> +		src_va = to_user_pointer(src_addr[0]);
> +		dst_va = to_user_pointer(dst_addr);
> +	} else {
> +		src_va = 0xa000000;
> +		dst_va = 0xb000000;
> +	}
> +
> +	pattern_fill_buf(src_addr, size, num_cmds, npages);
> +
> +	if (num_cmds)
> +		print_buffer(src_addr[num_cmds - 1], size, "src_obj", false);
> +
> +	for (i = 0; i < num_ctxts; i++) {
> +		batch[i] = gem_create_in_memory_regions(fd, PAGE_SIZE, REGION_SMEM);
> +		igt_info("Batch obj 0x%x created in region 0x%x:0x%x\n", batch[i],
> +			 MEMORY_TYPE_FROM_REGION(REGION_SMEM), MEMORY_INSTANCE_FROM_REGION(REGION_SMEM));
> +		batch_addr[i] = gem_mmap__cpu(fd, batch[i], 0, PAGE_SIZE, PROT_WRITE);

Hmm, are there any incoherent platforms (non-llc) that will be entering 
vm_bind world? I guess nothing currently that is graphics version >= 12? 
If something does come along we might see some surprises here and elsewhere.

IIRC the old model would be something like mmap__cpu(), set_domain(CPU), 
do some CPU writes, and then later execbuf would see that the object is 
dirty and would do any clflushes if required before submit. Maybe the 
test needs to manually do all clflushing now with new eb3 model? We can 
also sidestep for now by using __device() or so.
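
The sidestep could be as simple as mapping the batch through
gem_mmap__device_coherent() (which should pick a WC mapping on non-llc),
i.e. something like (untested):

```c
/*
 * Sketch: use a device-coherent mapping for the batch so CPU writes
 * are visible to the GPU without relying on execbuf-time clflushing.
 */
batch_addr[i] = gem_mmap__device_coherent(fd, batch[i], 0, PAGE_SIZE,
					  PROT_WRITE);
```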

> +	}
> +
> +	/* Create mappings */
> +	for (i = 0; i < num_ctxts; i++) {
> +		uint64_t va_offset = i * va_delta;
> +		uint64_t offset = i * delta;
> +		uint32_t j;
> +
> +		for (j = 0; j < num_cmds; j++)
> +			SET_MAP(map[i][SRC_MAP + j], src[j], src_va + va_offset, offset, delta, 0);
> +		SET_MAP(map[i][DST_MAP], dst, dst_va + va_offset, offset, delta, 0);
> +		SET_MAP(map[i][BATCH_MAP], batch[i], to_user_pointer(batch_addr[i]), 0, PAGE_SIZE, 0);
> +	}
> +
> +	/* Bind the buffers to device page table */
> +	for (i = 0; i < num_ctxts; i++) {
> +		vm_bind(fd, vm_id[i], &map[i][BATCH_MAP], &exec_fence[i][BATCH_FENCE]);
> +		vm_bind(fd, vm_id[i], &map[i][DST_MAP], &exec_fence[i][DST_FENCE]);
> +	}
> +
> +	/* Have GPU do the copy */
> +	for (i = 0; i < cfg->num_cmds; i++) {
> +		uint32_t j;
> +
> +		for (j = 0; j < num_ctxts; j++)
> +			vm_bind(fd, vm_id[j], &map[j][SRC_MAP + i], &exec_fence[j][SRC_FENCE]);
> +
> +		i915_gem_copy(fd, src_va, dst_va, va_delta, delta, size, ctx,
> +			      num_ctxts, batch_addr, e->flags, exec_fence);
> +
> +		i915_gem_sync(fd, ctx, num_ctxts, exec_fence);
> +
> +		for (j = 0; j < num_ctxts; j++) {
> +			syncobj_destroy(fd, exec_fence[j][SRC_FENCE].handle);
> +			if (do_unbind(cfg))
> +				vm_unbind(fd, vm_id[j], &map[j][SRC_MAP + i]);
> +		}
> +	}
> +
> +	/*
> +	 * Unbind buffers from device page table.
> +	 * If we skip this, they should get unbound when the buffers are freed.
> +	 */
> +	for (i = 0; i < num_ctxts; i++) {
> +		syncobj_destroy(fd, exec_fence[i][BATCH_FENCE].handle);
> +		syncobj_destroy(fd, exec_fence[i][DST_FENCE].handle);
> +		if (do_unbind(cfg)) {
> +			vm_unbind(fd, vm_id[i], &map[i][BATCH_MAP]);
> +			vm_unbind(fd, vm_id[i], &map[i][DST_MAP]);
> +		}
> +	}
> +
> +	/* Close batch buffers */
> +	for (i = 0; i < num_ctxts; i++) {
> +		syncobj_destroy(fd, exec_fence[i][EXEC_FENCE].handle);
> +		gem_close(fd, batch[i]);
> +	}
> +
> +	/* Accessing the buffer will migrate the pages from device to host */

It shouldn't do that. Do you have some more info?

> +	print_buffer(dst_addr, size, "dst_obj", false);

Normally we only want to dump the buffer if the test fails because there 
is some mismatch when comparing the expected value(s)?
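
i.e. something like (sketch):

```c
/* Only dump the buffers when the comparison actually fails. */
if (num_cmds && memcmp(src_addr[num_cmds - 1], dst_addr, size)) {
	print_buffer(src_addr[num_cmds - 1], size, "src_obj", false);
	print_buffer(dst_addr, size, "dst_obj", false);
	igt_assert_f(0, "dst doesn't match the last src\n");
}
```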

> +
> +	/* Validate by comparing the last SRC with DST */
> +	if (num_cmds)
> +		igt_assert(memcmp(src_addr[num_cmds - 1], dst_addr, size) == 0);
> +
> +	/* Free the objects */
> +	destroy_src_objs(fd, cfg, src, size, num_cmds, src_addr);
> +	destroy_dst_obj(fd, cfg, dst, size, dst_addr);
> +
> +	/* Done with the contexts */
> +	for (i = 0; i < num_ctxts; i++) {
> +		igt_debug("Destroying context 0x%x\n", ctx[i]->id);
> +		gem_vm_destroy(fd, vm_id[i]);
> +		intel_ctx_destroy(fd, ctx[i]);
> +	}
> +
> +	if (share_vm)
> +		gem_vm_destroy(fd, shared_vm_id);
> +}
> +
> +igt_main
> +{
> +	struct test_cfg *t, tests[] = {
> +		{"basic", 0, 1, 1, 0},
> +		{"multi_cmds", 0, MAX_CMDS, 1, 0},
> +		{"skip_copy", 0, 0, 1, 0},
> +		{"skip_unbind",  0, 1, 1, TEST_SKIP_UNBIND},
> +		{"multi_ctxts", 0, 1, MAX_CTXTS, 0},
> +		{"share_vm", 0, 1, MAX_CTXTS, TEST_SHARE_VM},
> +		{"64K", SZ_64K, 1, 1, 0},
> +		{"2M", SZ_2M, 1, 1, 0},
> +		{ }
> +	};
> +	int fd;
> +	uint32_t def_size;
> +	struct intel_execution_engine2 *e;
> +	const intel_ctx_t *ctx;
> +
> +	igt_fixture {
> +		fd = drm_open_driver(DRIVER_INTEL);
> +		igt_require_gem(fd);
> +		igt_require(i915_vm_bind_version(fd) == 1);
> +		def_size = HAS_64K_PAGES(intel_get_drm_devid(fd)) ?
> +			   SZ_64K : DEFAULT_BUFF_SIZE;

Should be possible to cut over to gtt_alignment, if you feel like it. 
For example just stick it in gem_memory_region.

> +		ctx = intel_ctx_create_all_physical(fd);
> +		/* Get copy engine */
> +		for_each_ctx_engine(fd, ctx, e)
> +			if (e->class == I915_ENGINE_CLASS_COPY)
> +				break;

Maybe igt_require/assert() at least one copy engine? Dunno.
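
If I read for_each_ctx_engine() right, e ends up NULL when the loop runs
to completion without finding a copy engine, so it could just be (sketch):

```c
/* Skip the whole test if the device exposes no copy engine. */
igt_require_f(e, "no copy engine found\n");
```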

> +	}
> +
> +	/* Adjust test variables */
> +	for (t = tests; t->name; t++)
> +		t->size = t->size ? : (def_size * abs(t->num_ctxts));

num_ctx can be negative?

> +
> +	for (t = tests; t->name; t++) {
> +		igt_describe_f("VM_BIND %s", t->name);
> +		igt_subtest_with_dynamic(t->name) {
> +			for_each_memory_region(r, fd) {
> +				struct test_cfg cfg = *t;
> +
> +				if (r->ci.memory_instance)
> +					continue;
> +
> +				if (r->ci.memory_class == I915_MEMORY_CLASS_DEVICE)
> +					cfg.flags |= TEST_LMEM;
> +
> +				igt_dynamic_f("%s", r->name)
> +					run_test(fd, ctx, &cfg, e);

Normally we just pass along the gem_memory_region. Can then fish out 
useful stuff like gtt_alignment, name, size etc.

> +			}
> +		}
> +	}
> +
> +	igt_fixture {
> +		intel_ctx_destroy(fd, ctx);
> +		close(fd);
> +	}
> +
> +	igt_exit();
> +}
> diff --git a/tests/intel-ci/fast-feedback.testlist b/tests/intel-ci/fast-feedback.testlist
> index 185c2fef54..8a47eaddda 100644
> --- a/tests/intel-ci/fast-feedback.testlist
> +++ b/tests/intel-ci/fast-feedback.testlist
> @@ -53,6 +53,7 @@ igt@i915_getparams_basic@basic-eu-total
>   igt@i915_getparams_basic@basic-subslice-total
>   igt@i915_hangman@error-state-basic
>   igt@i915_pciid
> +igt@i915_vm_bind_basic@basic-smem
>   igt@i915_vm_bind_sanity@basic
>   igt@kms_addfb_basic@addfb25-bad-modifier
>   igt@kms_addfb_basic@addfb25-framebuffer-vs-set-tiling
> diff --git a/tests/meson.build b/tests/meson.build
> index d81c6c28c2..1b9529a7e2 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -251,6 +251,7 @@ i915_progs = [
>   	'sysfs_preempt_timeout',
>   	'sysfs_timeslice_duration',
>   	'i915_vm_bind_sanity',
> +	'i915_vm_bind_basic',

Keep it sorted?

>   ]
>   
>   msm_progs = [


* Re: [igt-dev] [PATCH i-g-t v4 10/11] tests/i915/vm_bind: Add gem_exec3_balancer test
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 10/11] tests/i915/vm_bind: Add gem_exec3_balancer test Niranjana Vishwanathapura
@ 2022-10-21 16:42   ` Matthew Auld
  2022-10-21 20:11     ` Niranjana Vishwanathapura
  0 siblings, 1 reply; 28+ messages in thread
From: Matthew Auld @ 2022-10-21 16:42 UTC (permalink / raw)
  To: Niranjana Vishwanathapura, igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, daniel.vetter, petri.latvala

On 18/10/2022 08:17, Niranjana Vishwanathapura wrote:
> To test parallel submissions support in execbuf3, port the subtest
> gem_exec_balancer@parallel-ordering to a new gem_exec3_balancer test
> and switch to execbuf3 ioctl.
> 
> v2: use i915_vm_bind library functions
> 
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
> ---
>   tests/i915/gem_exec3_balancer.c | 473 ++++++++++++++++++++++++++++++++
>   tests/meson.build               |   7 +
>   2 files changed, 480 insertions(+)
>   create mode 100644 tests/i915/gem_exec3_balancer.c
> 
> diff --git a/tests/i915/gem_exec3_balancer.c b/tests/i915/gem_exec3_balancer.c
> new file mode 100644
> index 0000000000..34b2a6a0b3
> --- /dev/null
> +++ b/tests/i915/gem_exec3_balancer.c
> @@ -0,0 +1,473 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +/** @file gem_exec3_balancer.c
> + *
> + * Load balancer tests with execbuf3.
> + * Ported from gem_exec_balancer and made to work with
> + * vm_bind and execbuf3.
> + *
> + */
> +
> +#include <poll.h>
> +
> +#include "i915/gem.h"
> +#include "i915/i915_vm_bind.h"
> +#include "i915/gem_engine_topology.h"
> +#include "i915/gem_create.h"
> +#include "i915/gem_vm.h"
> +#include "igt.h"
> +#include "igt_gt.h"
> +#include "igt_perf.h"
> +#include "igt_syncobj.h"
> +
> +IGT_TEST_DESCRIPTION("Exercise in-kernel load-balancing with execbuf3");
> +
> +#define INSTANCE_COUNT (1 << I915_PMU_SAMPLE_INSTANCE_BITS)
> +
> +static bool has_class_instance(int i915, uint16_t class, uint16_t instance)
> +{
> +	int fd;
> +
> +	fd = perf_i915_open(i915, I915_PMU_ENGINE_BUSY(class, instance));
> +	if (fd >= 0) {
> +		close(fd);
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +static struct i915_engine_class_instance *
> +list_engines(int i915, uint32_t class_mask, unsigned int *out)
> +{
> +	unsigned int count = 0, size = 64;
> +	struct i915_engine_class_instance *engines;
> +
> +	engines = malloc(size * sizeof(*engines));
> +	igt_assert(engines);
> +
> +	for (enum drm_i915_gem_engine_class class = I915_ENGINE_CLASS_RENDER;
> +	     class_mask;
> +	     class++, class_mask >>= 1) {
> +		if (!(class_mask & 1))
> +			continue;
> +
> +		for (unsigned int instance = 0;
> +		     instance < INSTANCE_COUNT;
> +		     instance++) {
> +			if (!has_class_instance(i915, class, instance))
> +				continue;
> +
> +			if (count == size) {
> +				size *= 2;
> +				engines = realloc(engines,
> +						  size * sizeof(*engines));
> +				igt_assert(engines);
> +			}
> +
> +			engines[count++] = (struct i915_engine_class_instance){
> +				.engine_class = class,
> +				.engine_instance = instance,
> +			};
> +		}
> +	}
> +
> +	if (!count) {
> +		free(engines);
> +		engines = NULL;
> +	}
> +
> +	*out = count;
> +	return engines;
> +}
> +
> +static bool has_perf_engines(int i915)
> +{
> +	return i915_perf_type_id(i915);
> +}
> +
> +static intel_ctx_cfg_t
> +ctx_cfg_for_engines(const struct i915_engine_class_instance *ci,
> +		    unsigned int count)
> +{
> +	intel_ctx_cfg_t cfg = { };
> +	unsigned int i;
> +
> +	for (i = 0; i < count; i++)
> +		cfg.engines[i] = ci[i];
> +	cfg.num_engines = count;
> +
> +	return cfg;
> +}
> +
> +static const intel_ctx_t *
> +ctx_create_engines(int i915, const struct i915_engine_class_instance *ci,
> +		   unsigned int count)
> +{
> +	intel_ctx_cfg_t cfg = ctx_cfg_for_engines(ci, count);
> +	return intel_ctx_create(i915, &cfg);
> +}
> +
> +static void check_bo(int i915, uint32_t handle, unsigned int expected,
> +		     bool wait)
> +{
> +	uint32_t *map;
> +
> +	map = gem_mmap__cpu(i915, handle, 0, 4096, PROT_READ);
> +	if (wait)
> +		gem_set_domain(i915, handle, I915_GEM_DOMAIN_CPU,
> +			       I915_GEM_DOMAIN_CPU);

Hmm, does the wait still work as expected in set_domain(), or does it
ignore BOOKKEEP fences, in which case this doesn't do anything with eb3?
Perhaps it can be deleted? Also slightly worried about the dirty
tracking, and whether this test cares...

> +	igt_assert_eq(map[0], expected);
> +	munmap(map, 4096);
> +}
> +
> +static struct drm_i915_query_engine_info *query_engine_info(int i915)
> +{
> +	struct drm_i915_query_engine_info *engines;
> +
> +#define QUERY_SIZE	0x4000
> +	engines = malloc(QUERY_SIZE);
> +	igt_assert(engines);
> +	memset(engines, 0, QUERY_SIZE);
> +	igt_assert(!__gem_query_engines(i915, engines, QUERY_SIZE));
> +#undef QUERY_SIZE
> +
> +	return engines;
> +}
> +
> +/* This function only works if siblings contains all instances of a class */
> +static void logical_sort_siblings(int i915,
> +				  struct i915_engine_class_instance *siblings,
> +				  unsigned int count)
> +{
> +	struct i915_engine_class_instance *sorted;
> +	struct drm_i915_query_engine_info *engines;
> +	unsigned int i, j;
> +
> +	sorted = calloc(count, sizeof(*sorted));
> +	igt_assert(sorted);
> +
> +	engines = query_engine_info(i915);
> +
> +	for (j = 0; j < count; ++j) {
> +		for (i = 0; i < engines->num_engines; ++i) {
> +			if (siblings[j].engine_class ==
> +			    engines->engines[i].engine.engine_class &&
> +			    siblings[j].engine_instance ==
> +			    engines->engines[i].engine.engine_instance) {
> +				uint16_t logical_instance =
> +					engines->engines[i].logical_instance;
> +
> +				igt_assert(logical_instance < count);
> +				igt_assert(!sorted[logical_instance].engine_class);
> +				igt_assert(!sorted[logical_instance].engine_instance);
> +
> +				sorted[logical_instance] = siblings[j];
> +				break;
> +			}
> +		}
> +		igt_assert(i != engines->num_engines);
> +	}
> +
> +	memcpy(siblings, sorted, sizeof(*sorted) * count);
> +	free(sorted);
> +	free(engines);
> +}
> +
> +static bool fence_busy(int fence)
> +{
> +	return poll(&(struct pollfd){fence, POLLIN}, 1, 0) == 0;
> +}
> +
> +/*
> + * Always read from engine instance 0; with GuC submission the values are the
> + * same across all instances. With execlists they may differ, but that is
> + * quite unlikely, and if they do we can live with it.
> + */
> +static unsigned int get_timeslice(int i915,
> +				  struct i915_engine_class_instance engine)
> +{
> +	unsigned int val;
> +
> +	switch (engine.engine_class) {
> +	case I915_ENGINE_CLASS_RENDER:
> +		gem_engine_property_scanf(i915, "rcs0", "timeslice_duration_ms",
> +					  "%d", &val);
> +		break;
> +	case I915_ENGINE_CLASS_COPY:
> +		gem_engine_property_scanf(i915, "bcs0", "timeslice_duration_ms",
> +					  "%d", &val);
> +		break;
> +	case I915_ENGINE_CLASS_VIDEO:
> +		gem_engine_property_scanf(i915, "vcs0", "timeslice_duration_ms",
> +					  "%d", &val);
> +		break;
> +	case I915_ENGINE_CLASS_VIDEO_ENHANCE:
> +		gem_engine_property_scanf(i915, "vecs0", "timeslice_duration_ms",
> +					  "%d", &val);
> +		break;
> +	}
> +
> +	return val;
> +}
> +
> +static uint64_t gettime_ns(void)
> +{
> +	struct timespec current;
> +	clock_gettime(CLOCK_MONOTONIC, &current);
> +	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
> +}
> +
> +static bool syncobj_busy(int i915, uint32_t handle)
> +{
> +	bool result;
> +	int sf;
> +
> +	sf = syncobj_handle_to_fd(i915, handle,
> +				  DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE);
> +	result = poll(&(struct pollfd){sf, POLLIN}, 1, 0) == 0;
> +	close(sf);
> +
> +	return result;
> +}
> +
> +/*
> + * Ensure a parallel submit actually runs on HW in parallel by putting a
> + * spinner on 1 engine, doing a parallel submit, and checking that the
> + * parallel submit is blocked behind the spinner.
> + */
> +static void parallel_ordering(int i915, unsigned int flags)
> +{
> +	uint32_t vm_id;
> +	int class;
> +
> +	vm_id = gem_vm_create_in_vm_bind_mode(i915);
> +
> +	for (class = 0; class < 32; class++) {
> +		struct drm_i915_gem_timeline_fence exec_fence[32] = { };
> +		const intel_ctx_t *ctx = NULL, *spin_ctx = NULL;
> +		uint64_t fence_value = 0, batch_addr[32] = { };
> +		struct i915_engine_class_instance *siblings;
> +		uint32_t exec_syncobj, bind_syncobj[32];
> +		struct drm_i915_gem_execbuffer3 execbuf;
> +		uint32_t batch[16], obj[32];
> +		intel_ctx_cfg_t cfg;
> +		unsigned int count;
> +		igt_spin_t *spin;
> +		uint64_t ahnd;
> +		int i = 0;
> +
> +		siblings = list_engines(i915, 1u << class, &count);
> +		if (!siblings)
> +			continue;
> +
> +		if (count < 2) {
> +			free(siblings);
> +			continue;
> +		}
> +
> +		logical_sort_siblings(i915, siblings, count);
> +
> +		memset(&cfg, 0, sizeof(cfg));
> +		cfg.parallel = true;
> +		cfg.num_engines = 1;
> +		cfg.width = count;
> +		memcpy(cfg.engines, siblings, sizeof(*siblings) * count);
> +
> +		if (__intel_ctx_create(i915, &cfg, &ctx)) {
> +			free(siblings);
> +			continue;
> +		}
> +		gem_context_set_vm(i915, ctx->id, vm_id);
> +
> +		batch[i] = MI_ATOMIC | MI_ATOMIC_INC;
> +#define TARGET_BO_OFFSET	(0x1 << 16)
> +		batch[++i] = TARGET_BO_OFFSET;
> +		batch[++i] = 0;
> +		batch[++i] = MI_BATCH_BUFFER_END;
> +
> +		obj[0] = gem_create(i915, 4096);
> +		bind_syncobj[0] = syncobj_create(i915, 0);
> +		exec_fence[0].handle = bind_syncobj[0];
> +		exec_fence[0].flags = I915_TIMELINE_FENCE_WAIT;
> +		i915_vm_bind(i915, vm_id, TARGET_BO_OFFSET, obj[0], 0, 4096, bind_syncobj[0], 0);
> +
> +		for (i = 1; i < count + 1; ++i) {
> +			obj[i] = gem_create(i915, 4096);
> +			gem_write(i915, obj[i], 0, batch, sizeof(batch));
> +
> +			batch_addr[i - 1] = TARGET_BO_OFFSET * (i + 1);
> +			bind_syncobj[i] = syncobj_create(i915, 0);
> +			exec_fence[i].handle = bind_syncobj[i];
> +			exec_fence[i].flags = I915_TIMELINE_FENCE_WAIT;
> +			i915_vm_bind(i915, vm_id, batch_addr[i - 1], obj[i], 0, 4096, bind_syncobj[i], 0);
> +		}
> +
> +		exec_syncobj = syncobj_create(i915, 0);
> +		exec_fence[i].handle = exec_syncobj;
> +		exec_fence[i].flags = I915_TIMELINE_FENCE_SIGNAL;
> +
> +		memset(&execbuf, 0, sizeof(execbuf));
> +		execbuf.ctx_id = ctx->id;
> +		execbuf.batch_address = to_user_pointer(batch_addr);
> +		execbuf.fence_count = count + 2;
> +		execbuf.timeline_fences = to_user_pointer(exec_fence);
> +
> +		/* Block parallel submission */
> +		spin_ctx = ctx_create_engines(i915, siblings, count);
> +		ahnd = get_simple_ahnd(i915, spin_ctx->id);
> +		spin = __igt_spin_new(i915,
> +				      .ahnd = ahnd,
> +				      .ctx = spin_ctx,
> +				      .engine = 0,
> +				      .flags = IGT_SPIN_FENCE_OUT |
> +				      IGT_SPIN_NO_PREEMPTION);
> +
> +		/* Wait for spinners to start */
> +		usleep(5 * 10000);
> +		igt_assert(fence_busy(spin->out_fence));
> +
> +		/* Submit parallel execbuf */
> +		gem_execbuf3(i915, &execbuf);
> +
> +		/*
> +		 * Wait long enough for timeslicing to kick in but not
> +		 * preemption. Spinner + parallel execbuf should be
> +		 * active. Assuming default timeslice / preemption values, if
> +		 * these are changed it is possible for the test to fail.
> +		 */
> +		usleep(get_timeslice(i915, siblings[0]) * 2);
> +		igt_assert(fence_busy(spin->out_fence));
> +		igt_assert(syncobj_busy(i915, exec_syncobj));
> +		check_bo(i915, obj[0], 0, false);
> +
> +		/*
> +		 * End spinner and wait for spinner + parallel execbuf
> +		 * to complete.
> +		 */
> +		igt_spin_end(spin);
> +		igt_assert(syncobj_timeline_wait(i915, &exec_syncobj, &fence_value, 1,
> +						 gettime_ns() + (2 * NSEC_PER_SEC),
> +						 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
> +		igt_assert(!syncobj_busy(i915, exec_syncobj));
> +		syncobj_destroy(i915, exec_syncobj);
> +		for (i = 0; i < count + 1; ++i)
> +			syncobj_destroy(i915, bind_syncobj[i]);
> +		check_bo(i915, obj[0], count, true);
> +
> +		/* Clean up */
> +		intel_ctx_destroy(i915, ctx);
> +		intel_ctx_destroy(i915, spin_ctx);
> +		i915_vm_unbind(i915, vm_id, TARGET_BO_OFFSET, 4096);
> +		for (i = 1; i < count + 1; ++i)
> +			i915_vm_unbind(i915, vm_id, batch_addr[i - 1], 4096);
> +
> +		for (i = 0; i < count + 1; ++i)
> +			gem_close(i915, obj[i]);
> +		free(siblings);
> +		igt_spin_free(i915, spin);
> +		put_ahnd(ahnd);
> +	}
> +
> +	gem_vm_destroy(i915, vm_id);
> +}
> +
> +static bool has_load_balancer(int i915)
> +{
> +	const intel_ctx_cfg_t cfg = {
> +		.load_balance = true,
> +		.num_engines = 1,
> +	};
> +	const intel_ctx_t *ctx = NULL;
> +	int err;
> +
> +	err = __intel_ctx_create(i915, &cfg, &ctx);
> +	intel_ctx_destroy(i915, ctx);
> +
> +	return err == 0;
> +}
> +
> +static bool has_logical_mapping(int i915)
> +{
> +	struct drm_i915_query_engine_info *engines;
> +	unsigned int i;
> +
> +	engines = query_engine_info(i915);
> +
> +	for (i = 0; i < engines->num_engines; ++i)
> +		if (!(engines->engines[i].flags &
> +		     I915_ENGINE_INFO_HAS_LOGICAL_INSTANCE)) {
> +			free(engines);
> +			return false;
> +		}
> +
> +	free(engines);
> +	return true;
> +}
> +
> +static bool has_parallel_execbuf(int i915)
> +{
> +	intel_ctx_cfg_t cfg = {
> +		.parallel = true,
> +		.num_engines = 1,
> +	};
> +	const intel_ctx_t *ctx = NULL;
> +	int err;
> +
> +	for (int class = 0; class < 32; class++) {
> +		struct i915_engine_class_instance *siblings;
> +		unsigned int count;
> +
> +		siblings = list_engines(i915, 1u << class, &count);
> +		if (!siblings)
> +			continue;
> +
> +		if (count < 2) {
> +			free(siblings);
> +			continue;
> +		}
> +
> +		logical_sort_siblings(i915, siblings, count);
> +
> +		cfg.width = count;
> +		memcpy(cfg.engines, siblings, sizeof(*siblings) * count);
> +		free(siblings);
> +
> +		err = __intel_ctx_create(i915, &cfg, &ctx);
> +		intel_ctx_destroy(i915, ctx);
> +
> +		return err == 0;
> +	}
> +
> +	return false;
> +}
> +
> +igt_main
> +{
> +	int i915 = -1;
> +
> +	igt_fixture {
> +		i915 = drm_open_driver(DRIVER_INTEL);
> +		igt_require_gem(i915);
> +
> +		gem_require_contexts(i915);
> +		igt_require(i915_vm_bind_version(i915) == 1);
> +		igt_require(gem_has_engine_topology(i915));
> +		igt_require(has_load_balancer(i915));
> +		igt_require(has_perf_engines(i915));
> +	}
> +
> +	igt_subtest_group {
> +		igt_fixture {
> +			igt_require(has_logical_mapping(i915));
> +			igt_require(has_parallel_execbuf(i915));
> +		}
> +
> +		igt_describe("Ensure a parallel submit actually runs in parallel");
> +		igt_subtest("parallel-ordering")
> +			parallel_ordering(i915, 0);
> +	}
> +}
> diff --git a/tests/meson.build b/tests/meson.build
> index 50de8b7e89..6ca5a94282 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -375,6 +375,13 @@ test_executables += executable('gem_exec_balancer', 'i915/gem_exec_balancer.c',
>   	   install : true)
>   test_list += 'gem_exec_balancer'
>   
> +test_executables += executable('gem_exec3_balancer', 'i915/gem_exec3_balancer.c',
> +	   dependencies : test_deps + [ lib_igt_perf ],
> +	   install_dir : libexecdir,
> +	   install_rpath : libexecdir_rpathdir,
> +	   install : true)
> +test_list += 'gem_exec3_balancer'
> +
>   test_executables += executable('gem_mmap_offset',
>   	   join_paths('i915', 'gem_mmap_offset.c'),
>   	   dependencies : test_deps + [ libatomic ],


* Re: [igt-dev] [PATCH i-g-t v4 11/11] tests/i915/vm_bind: Add gem_lmem_swapping@vm_bind sub test
  2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 11/11] tests/i915/vm_bind: Add gem_lmem_swapping@vm_bind sub test Niranjana Vishwanathapura
@ 2022-10-21 17:20   ` Matthew Auld
  2022-10-25  7:10     ` Niranjana Vishwanathapura
  0 siblings, 1 reply; 28+ messages in thread
From: Matthew Auld @ 2022-10-21 17:20 UTC (permalink / raw)
  To: Niranjana Vishwanathapura, igt-dev
  Cc: tvrtko.ursulin, thomas.hellstrom, daniel.vetter, petri.latvala

On 18/10/2022 08:17, Niranjana Vishwanathapura wrote:
> Validate eviction of objects with persistent mappings and
> check persistent mappings are properly rebound upon subsequent
> execbuf3 call.
> 
> TODO: Add a new swapping test with just vm_bind mode
>        (without legacy execbuf).

So is this using the eb2 path to generate some memory pressure and
hopefully evict some of our vm_binds? I guess that's fine so long as we
trigger the rebinding.

> 
> v2: use i915_vm_bind library functions
> 
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>

This test is failing on DG2 btw:

Starting dynamic subtest: lmem0
Memory: system-total 19947MiB, lmem-region 4048MiB, usage-limit 21002MiB
Using 1 thread(s), 6072 loop(s), 6072 objects of 1 MiB - 1 MiB, seed: 
1666372356, oom: no
(gem_lmem_swapping:1666) ioctl_wrappers-CRITICAL: Test assertion failure 
function gem_vm_bind, file ../lib/ioctl_wrappers.c:1412:
(gem_lmem_swapping:1666) ioctl_wrappers-CRITICAL: Failed assertion: 
__gem_vm_bind(fd, bind) == 0
(gem_lmem_swapping:1666) ioctl_wrappers-CRITICAL: Last errno: 22, 
Invalid argument
(gem_lmem_swapping:1666) ioctl_wrappers-CRITICAL: error: -22 != 0

> ---
>   tests/i915/gem_lmem_swapping.c | 128 ++++++++++++++++++++++++++++++++-
>   1 file changed, 127 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/i915/gem_lmem_swapping.c b/tests/i915/gem_lmem_swapping.c
> index cccdb3195b..017833775c 100644
> --- a/tests/i915/gem_lmem_swapping.c
> +++ b/tests/i915/gem_lmem_swapping.c
> @@ -3,12 +3,16 @@
>    * Copyright © 2021 Intel Corporation
>    */
>   
> +#include <poll.h>
> +
>   #include "i915/gem.h"
>   #include "i915/gem_create.h"
>   #include "i915/gem_vm.h"
>   #include "i915/intel_memory_region.h"
> +#include "i915/i915_vm_bind.h"
>   #include "igt.h"
>   #include "igt_kmod.h"
> +#include "igt_syncobj.h"
>   #include <unistd.h>
>   #include <stdlib.h>
>   #include <stdint.h>
> @@ -54,6 +58,7 @@ struct params {
>   		uint64_t max;
>   	} size;
>   	unsigned int count;
> +	unsigned int vm_bind_count;
>   	unsigned int loops;
>   	unsigned int mem_limit;
>   #define TEST_VERIFY	(1 << 0)
> @@ -64,15 +69,18 @@ struct params {
>   #define TEST_MULTI	(1 << 5)
>   #define TEST_CCS	(1 << 6)
>   #define TEST_MASSIVE	(1 << 7)
> +#define TEST_VM_BIND	(1 << 8)
>   	unsigned int flags;
>   	unsigned int seed;
>   	bool oom_test;
> +	uint64_t va;
>   };
>   
>   struct object {
>   	uint64_t size;
>   	uint32_t seed;
>   	uint32_t handle;
> +	uint64_t va;
>   	struct blt_copy_object *blt_obj;
>   };
>   
> @@ -278,6 +286,86 @@ verify_object_ccs(int i915, const struct object *obj,
>   	free(cmd);
>   }
>   
> +#define BATCH_VA       0xa00000
> +
> +static uint64_t gettime_ns(void)
> +{
> +	struct timespec current;
> +	clock_gettime(CLOCK_MONOTONIC, &current);
> +	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
> +}
> +
> +static void vm_bind(int i915, struct object *list, unsigned int num, uint64_t *va,
> +		    uint32_t batch, uint64_t batch_size, uint32_t vm_id)
> +{
> +	uint32_t *bind_syncobj;
> +	uint64_t *fence_value;
> +	unsigned int i;
> +
> +	bind_syncobj = calloc(num + 1, sizeof(*bind_syncobj));
> +	igt_assert(bind_syncobj);
> +
> +	fence_value = calloc(num + 1, sizeof(*fence_value));
> +	igt_assert(fence_value);
> +
> +	for (i = 0; i < num; i++) {
> +		list[i].va = *va;
> +		bind_syncobj[i] = syncobj_create(i915, 0);
> +		i915_vm_bind(i915, vm_id, *va, list[i].handle, 0, list[i].size, bind_syncobj[i], 0);
> +		*va += list[i].size;
> +	}
> +	bind_syncobj[i] = syncobj_create(i915, 0);
> +	i915_vm_bind(i915, vm_id, BATCH_VA, batch, 0, batch_size, bind_syncobj[i], 0);
> +
> +	igt_assert(syncobj_timeline_wait(i915, bind_syncobj, fence_value, num + 1,
> +					 gettime_ns() + (2 * NSEC_PER_SEC),
> +					 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
> +	for (i = 0; i <= num; i++)
> +		syncobj_destroy(i915, bind_syncobj[i]);
> +}
> +
> +static void vm_unbind(int i915, struct object *list, unsigned int num,
> +		      uint64_t batch_size, uint32_t vm_id)
> +{
> +	unsigned int i;
> +
> +	i915_vm_unbind(i915, vm_id, BATCH_VA, batch_size);
> +	for (i = 0; i < num; i++)
> +		i915_vm_unbind(i915, vm_id, list[i].va, list[i].size);
> +}
> +
> +static void move_to_lmem_execbuf3(int i915,
> +				  const intel_ctx_t *ctx,
> +				  unsigned int engine,
> +				  bool do_oom_test)
> +{
> +	uint32_t exec_syncobj = syncobj_create(i915, 0);
> +	struct drm_i915_gem_timeline_fence exec_fence = {
> +		.handle = exec_syncobj,
> +		.flags = I915_TIMELINE_FENCE_SIGNAL
> +	};
> +	struct drm_i915_gem_execbuffer3 eb = {
> +		.ctx_id = ctx->id,
> +		.batch_address = BATCH_VA,
> +		.engine_idx = engine,
> +		.fence_count = 1,
> +		.timeline_fences = to_user_pointer(&exec_fence),
> +	};
> +	uint64_t fence_value = 0;
> +	int ret;
> +
> +retry:
> +	ret = __gem_execbuf3(i915, &eb);
> +	if (do_oom_test && (ret == -ENOMEM || ret == -ENXIO))
> +		goto retry;
> +	igt_assert_eq(ret, 0);
> +
> +	igt_assert(syncobj_timeline_wait(i915, &exec_syncobj, &fence_value, 1,
> +					 gettime_ns() + (2 * NSEC_PER_SEC),
> +					 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
> +	syncobj_destroy(i915, exec_syncobj);
> +}
> +
>   static void move_to_lmem(int i915,
>   			 const intel_ctx_t *ctx,
>   			 struct object *list,
> @@ -325,14 +413,16 @@ static void __do_evict(int i915,
>   	uint32_t region_id = INTEL_MEMORY_REGION_ID(region->memory_class,
>   						    region->memory_instance);
>   	const unsigned int max_swap_in = params->count / 100 + 1;
> +	uint64_t size, ahnd, batch_size = 4096;
>   	struct object *objects, *obj, *list;
>   	const uint32_t bpp = 32;
>   	uint32_t width, height, stride;
> +	const intel_ctx_t *vm_bind_ctx;
>   	const intel_ctx_t *blt_ctx;
>   	struct blt_copy_object *tmp;
>   	unsigned int engine = 0;
> +	uint32_t batch, vm_id;
>   	unsigned int i, l;
> -	uint64_t size, ahnd;
>   	struct timespec t = {};
>   	unsigned int num;
>   
> @@ -416,6 +506,20 @@ static void __do_evict(int i915,
>   		  readable_size(params->size.max), readable_unit(params->size.max),
>   		  params->count, seed);
>   
> +	/* VM_BIND the specified subset of objects (as persistent mappings) */
> +	if (params->flags & TEST_VM_BIND) {
> +		const uint32_t bbe = MI_BATCH_BUFFER_END;
> +
> +		batch = gem_create_from_pool(i915, &batch_size, region_id);
> +		gem_write(i915, batch, 0, &bbe, sizeof(bbe));
> +
> +		vm_id = gem_vm_create_in_vm_bind_mode(i915);
> +		vm_bind_ctx = intel_ctx_create(i915, &ctx->cfg);
> +		gem_context_set_vm(i915, vm_bind_ctx->id, vm_id);
> +		vm_bind(i915, objects, params->vm_bind_count, &params->va,
> +			batch, batch_size, vm_id);
> +	}
> +
>   	/*
>   	 * Move random objects back into lmem.
>   	 * For TEST_MULTI runs, make each object count as a loop to
> @@ -454,6 +558,15 @@ static void __do_evict(int i915,
>   		}
>   	}
>   
> +	/* Rebind persistent mappings to ensure they are swapped back in */
> +	if (params->flags & TEST_VM_BIND) {
> +		move_to_lmem_execbuf3(i915, vm_bind_ctx, engine, params->oom_test);
> +
> +		vm_unbind(i915, objects, params->vm_bind_count, batch_size, vm_id);
> +		intel_ctx_destroy(i915, vm_bind_ctx);
> +		gem_vm_destroy(i915, vm_id);
> +	}
> +
>   	for (i = 0; i < params->count; i++) {
>   		gem_close(i915, objects[i].handle);
>   		free(objects[i].blt_obj);
> @@ -553,6 +666,15 @@ static void fill_params(int i915, struct params *params,
>   	if (flags & TEST_HEAVY)
>   		params->loops = params->loops / 2 + 1;
>   
> +	/*
> +	 * Set vm_bind_count and ensure it doesn't over-subscribe LMEM.
> +	 * Set va range to ensure it is big enough for all bindings in a VM.
> +	 */
> +	if (flags & TEST_VM_BIND) {
> +		params->vm_bind_count = params->count * 50 / ((flags & TEST_HEAVY) ? 300 : 150);
> +		params->va = (uint64_t)params->vm_bind_count * size * 4;
> +	}
> +
>   	params->flags = flags;
>   	params->oom_test = do_oom_test;
>   
> @@ -583,6 +705,9 @@ static void test_evict(int i915,
>   	if (flags & TEST_CCS)
>   		igt_require(IS_DG2(intel_get_drm_devid(i915)));
>   
> +	if (flags & TEST_VM_BIND)
> +		igt_require(i915_vm_bind_version(i915) == 1);
> +
>   	fill_params(i915, &params, region, flags, nproc, false);
>   
>   	if (flags & TEST_PARALLEL) {
> @@ -764,6 +889,7 @@ igt_main_args("", long_options, help_str, opt_handler, NULL)
>   		{ "heavy-verify-random-ccs", TEST_CCS | TEST_RANDOM | TEST_HEAVY },
>   		{ "heavy-verify-multi-ccs", TEST_CCS | TEST_RANDOM | TEST_HEAVY | TEST_ENGINES | TEST_MULTI },
>   		{ "parallel-random-verify-ccs", TEST_PARALLEL | TEST_RANDOM | TEST_CCS },
> +		{ "vm_bind", TEST_VM_BIND },
>   		{ }
>   	};
>   	const intel_ctx_t *ctx;

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v4 10/11] tests/i915/vm_bind: Add gem_exec3_balancer test
  2022-10-21 16:42   ` Matthew Auld
@ 2022-10-21 20:11     ` Niranjana Vishwanathapura
  2022-10-21 20:47       ` Niranjana Vishwanathapura
  0 siblings, 1 reply; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-21 20:11 UTC (permalink / raw)
  To: Matthew Auld
  Cc: tvrtko.ursulin, igt-dev, thomas.hellstrom, daniel.vetter, petri.latvala

On Fri, Oct 21, 2022 at 05:42:55PM +0100, Matthew Auld wrote:
>On 18/10/2022 08:17, Niranjana Vishwanathapura wrote:
>>To test parallel submissions support in execbuf3, port the subtest
>>gem_exec_balancer@parallel-ordering to a new gem_exec3_balancer test
>>and switch to execbuf3 ioctl.
>>
>>v2: use i915_vm_bind library functions
>>
>>Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>>---
>>  tests/i915/gem_exec3_balancer.c | 473 ++++++++++++++++++++++++++++++++
>>  tests/meson.build               |   7 +
>>  2 files changed, 480 insertions(+)
>>  create mode 100644 tests/i915/gem_exec3_balancer.c
>>
>>diff --git a/tests/i915/gem_exec3_balancer.c b/tests/i915/gem_exec3_balancer.c
>>new file mode 100644
>>index 0000000000..34b2a6a0b3
>>--- /dev/null
>>+++ b/tests/i915/gem_exec3_balancer.c
>>@@ -0,0 +1,473 @@
>>+// SPDX-License-Identifier: MIT
>>+/*
>>+ * Copyright © 2022 Intel Corporation
>>+ */
>>+
>>+/** @file gem_exec3_balancer.c
>>+ *
>>+ * Load balancer tests with execbuf3.
>>+ * Ported from gem_exec_balancer and made to work with
>>+ * vm_bind and execbuf3.
>>+ *
>>+ */
>>+
>>+#include <poll.h>
>>+
>>+#include "i915/gem.h"
>>+#include "i915/i915_vm_bind.h"
>>+#include "i915/gem_engine_topology.h"
>>+#include "i915/gem_create.h"
>>+#include "i915/gem_vm.h"
>>+#include "igt.h"
>>+#include "igt_gt.h"
>>+#include "igt_perf.h"
>>+#include "igt_syncobj.h"
>>+
>>+IGT_TEST_DESCRIPTION("Exercise in-kernel load-balancing with execbuf3");
>>+
>>+#define INSTANCE_COUNT (1 << I915_PMU_SAMPLE_INSTANCE_BITS)
>>+
>>+static bool has_class_instance(int i915, uint16_t class, uint16_t instance)
>>+{
>>+	int fd;
>>+
>>+	fd = perf_i915_open(i915, I915_PMU_ENGINE_BUSY(class, instance));
>>+	if (fd >= 0) {
>>+		close(fd);
>>+		return true;
>>+	}
>>+
>>+	return false;
>>+}
>>+
>>+static struct i915_engine_class_instance *
>>+list_engines(int i915, uint32_t class_mask, unsigned int *out)
>>+{
>>+	unsigned int count = 0, size = 64;
>>+	struct i915_engine_class_instance *engines;
>>+
>>+	engines = malloc(size * sizeof(*engines));
>>+	igt_assert(engines);
>>+
>>+	for (enum drm_i915_gem_engine_class class = I915_ENGINE_CLASS_RENDER;
>>+	     class_mask;
>>+	     class++, class_mask >>= 1) {
>>+		if (!(class_mask & 1))
>>+			continue;
>>+
>>+		for (unsigned int instance = 0;
>>+		     instance < INSTANCE_COUNT;
>>+		     instance++) {
>>+			if (!has_class_instance(i915, class, instance))
>>+				continue;
>>+
>>+			if (count == size) {
>>+				size *= 2;
>>+				engines = realloc(engines,
>>+						  size * sizeof(*engines));
>>+				igt_assert(engines);
>>+			}
>>+
>>+			engines[count++] = (struct i915_engine_class_instance){
>>+				.engine_class = class,
>>+				.engine_instance = instance,
>>+			};
>>+		}
>>+	}
>>+
>>+	if (!count) {
>>+		free(engines);
>>+		engines = NULL;
>>+	}
>>+
>>+	*out = count;
>>+	return engines;
>>+}
>>+
>>+static bool has_perf_engines(int i915)
>>+{
>>+	return i915_perf_type_id(i915);
>>+}
>>+
>>+static intel_ctx_cfg_t
>>+ctx_cfg_for_engines(const struct i915_engine_class_instance *ci,
>>+		    unsigned int count)
>>+{
>>+	intel_ctx_cfg_t cfg = { };
>>+	unsigned int i;
>>+
>>+	for (i = 0; i < count; i++)
>>+		cfg.engines[i] = ci[i];
>>+	cfg.num_engines = count;
>>+
>>+	return cfg;
>>+}
>>+
>>+static const intel_ctx_t *
>>+ctx_create_engines(int i915, const struct i915_engine_class_instance *ci,
>>+		   unsigned int count)
>>+{
>>+	intel_ctx_cfg_t cfg = ctx_cfg_for_engines(ci, count);
>>+	return intel_ctx_create(i915, &cfg);
>>+}
>>+
>>+static void check_bo(int i915, uint32_t handle, unsigned int expected,
>>+		     bool wait)
>>+{
>>+	uint32_t *map;
>>+
>>+	map = gem_mmap__cpu(i915, handle, 0, 4096, PROT_READ);
>>+	if (wait)
>>+		gem_set_domain(i915, handle, I915_GEM_DOMAIN_CPU,
>>+			       I915_GEM_DOMAIN_CPU);
>
>Hmm, does the wait still work as expected in set_domain() or does that 
>ignore BOOKKEEP, in which case this doesn't do anything with eb3? 
>Perhaps can be deleted? Also slightly worried about the dirty 
>tracking, and whether this test cares...

I think we still need this. gem_set_domain() will correctly wait on
implicit dependency tracking only and hence will ignore BOOKKEEP. As
there is no implicit dependency tracking with VM_BIND, the user has to
do an explicit wait. This test properly waits for the execution fence
to signal before calling this function. Hence I think we should be good.

Regards,
Niranjana

>
>>+	igt_assert_eq(map[0], expected);
>>+	munmap(map, 4096);
>>+}
>>+
>>+static struct drm_i915_query_engine_info *query_engine_info(int i915)
>>+{
>>+	struct drm_i915_query_engine_info *engines;
>>+
>>+#define QUERY_SIZE	0x4000
>>+	engines = malloc(QUERY_SIZE);
>>+	igt_assert(engines);
>>+	memset(engines, 0, QUERY_SIZE);
>>+	igt_assert(!__gem_query_engines(i915, engines, QUERY_SIZE));
>>+#undef QUERY_SIZE
>>+
>>+	return engines;
>>+}
>>+
>>+/* This function only works if siblings contains all instances of a class */
>>+static void logical_sort_siblings(int i915,
>>+				  struct i915_engine_class_instance *siblings,
>>+				  unsigned int count)
>>+{
>>+	struct i915_engine_class_instance *sorted;
>>+	struct drm_i915_query_engine_info *engines;
>>+	unsigned int i, j;
>>+
>>+	sorted = calloc(count, sizeof(*sorted));
>>+	igt_assert(sorted);
>>+
>>+	engines = query_engine_info(i915);
>>+
>>+	for (j = 0; j < count; ++j) {
>>+		for (i = 0; i < engines->num_engines; ++i) {
>>+			if (siblings[j].engine_class ==
>>+			    engines->engines[i].engine.engine_class &&
>>+			    siblings[j].engine_instance ==
>>+			    engines->engines[i].engine.engine_instance) {
>>+				uint16_t logical_instance =
>>+					engines->engines[i].logical_instance;
>>+
>>+				igt_assert(logical_instance < count);
>>+				igt_assert(!sorted[logical_instance].engine_class);
>>+				igt_assert(!sorted[logical_instance].engine_instance);
>>+
>>+				sorted[logical_instance] = siblings[j];
>>+				break;
>>+			}
>>+		}
>>+		igt_assert(i != engines->num_engines);
>>+	}
>>+
>>+	memcpy(siblings, sorted, sizeof(*sorted) * count);
>>+	free(sorted);
>>+	free(engines);
>>+}
>>+
>>+static bool fence_busy(int fence)
>>+{
>>+	return poll(&(struct pollfd){fence, POLLIN}, 1, 0) == 0;
>>+}
>>+
>>+/*
>>+ * Always read from engine instance 0; with GuC submission the values are
>>+ * the same across all instances. With execlists they may differ, but that
>>+ * is quite unlikely, and if they do we can live with it.
>>+ */
>>+static unsigned int get_timeslice(int i915,
>>+				  struct i915_engine_class_instance engine)
>>+{
>>+	unsigned int val;
>>+
>>+	switch (engine.engine_class) {
>>+	case I915_ENGINE_CLASS_RENDER:
>>+		gem_engine_property_scanf(i915, "rcs0", "timeslice_duration_ms",
>>+					  "%d", &val);
>>+		break;
>>+	case I915_ENGINE_CLASS_COPY:
>>+		gem_engine_property_scanf(i915, "bcs0", "timeslice_duration_ms",
>>+					  "%d", &val);
>>+		break;
>>+	case I915_ENGINE_CLASS_VIDEO:
>>+		gem_engine_property_scanf(i915, "vcs0", "timeslice_duration_ms",
>>+					  "%d", &val);
>>+		break;
>>+	case I915_ENGINE_CLASS_VIDEO_ENHANCE:
>>+		gem_engine_property_scanf(i915, "vecs0", "timeslice_duration_ms",
>>+					  "%d", &val);
>>+		break;
>>+	}
>>+
>>+	return val;
>>+}
>>+
>>+static uint64_t gettime_ns(void)
>>+{
>>+	struct timespec current;
>>+	clock_gettime(CLOCK_MONOTONIC, &current);
>>+	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
>>+}
>>+
>>+static bool syncobj_busy(int i915, uint32_t handle)
>>+{
>>+	bool result;
>>+	int sf;
>>+
>>+	sf = syncobj_handle_to_fd(i915, handle,
>>+				  DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE);
>>+	result = poll(&(struct pollfd){sf, POLLIN}, 1, 0) == 0;
>>+	close(sf);
>>+
>>+	return result;
>>+}
>>+
>>+/*
>>+ * Ensure a parallel submit actually runs on HW in parallel by putting a
>>+ * spinner on one engine, doing a parallel submit, and checking that the
>>+ * parallel submit is blocked behind the spinner.
>>+ */
>>+static void parallel_ordering(int i915, unsigned int flags)
>>+{
>>+	uint32_t vm_id;
>>+	int class;
>>+
>>+	vm_id = gem_vm_create_in_vm_bind_mode(i915);
>>+
>>+	for (class = 0; class < 32; class++) {
>>+		struct drm_i915_gem_timeline_fence exec_fence[32] = { };
>>+		const intel_ctx_t *ctx = NULL, *spin_ctx = NULL;
>>+		uint64_t fence_value = 0, batch_addr[32] = { };
>>+		struct i915_engine_class_instance *siblings;
>>+		uint32_t exec_syncobj, bind_syncobj[32];
>>+		struct drm_i915_gem_execbuffer3 execbuf;
>>+		uint32_t batch[16], obj[32];
>>+		intel_ctx_cfg_t cfg;
>>+		unsigned int count;
>>+		igt_spin_t *spin;
>>+		uint64_t ahnd;
>>+		int i = 0;
>>+
>>+		siblings = list_engines(i915, 1u << class, &count);
>>+		if (!siblings)
>>+			continue;
>>+
>>+		if (count < 2) {
>>+			free(siblings);
>>+			continue;
>>+		}
>>+
>>+		logical_sort_siblings(i915, siblings, count);
>>+
>>+		memset(&cfg, 0, sizeof(cfg));
>>+		cfg.parallel = true;
>>+		cfg.num_engines = 1;
>>+		cfg.width = count;
>>+		memcpy(cfg.engines, siblings, sizeof(*siblings) * count);
>>+
>>+		if (__intel_ctx_create(i915, &cfg, &ctx)) {
>>+			free(siblings);
>>+			continue;
>>+		}
>>+		gem_context_set_vm(i915, ctx->id, vm_id);
>>+
>>+		batch[i] = MI_ATOMIC | MI_ATOMIC_INC;
>>+#define TARGET_BO_OFFSET	(0x1 << 16)
>>+		batch[++i] = TARGET_BO_OFFSET;
>>+		batch[++i] = 0;
>>+		batch[++i] = MI_BATCH_BUFFER_END;
>>+
>>+		obj[0] = gem_create(i915, 4096);
>>+		bind_syncobj[0] = syncobj_create(i915, 0);
>>+		exec_fence[0].handle = bind_syncobj[0];
>>+		exec_fence[0].flags = I915_TIMELINE_FENCE_WAIT;
>>+		i915_vm_bind(i915, vm_id, TARGET_BO_OFFSET, obj[0], 0, 4096, bind_syncobj[0], 0);
>>+
>>+		for (i = 1; i < count + 1; ++i) {
>>+			obj[i] = gem_create(i915, 4096);
>>+			gem_write(i915, obj[i], 0, batch, sizeof(batch));
>>+
>>+			batch_addr[i - 1] = TARGET_BO_OFFSET * (i + 1);
>>+			bind_syncobj[i] = syncobj_create(i915, 0);
>>+			exec_fence[i].handle = bind_syncobj[i];
>>+			exec_fence[i].flags = I915_TIMELINE_FENCE_WAIT;
>>+			i915_vm_bind(i915, vm_id, batch_addr[i - 1], obj[i], 0, 4096, bind_syncobj[i], 0);
>>+		}
>>+
>>+		exec_syncobj = syncobj_create(i915, 0);
>>+		exec_fence[i].handle = exec_syncobj;
>>+		exec_fence[i].flags = I915_TIMELINE_FENCE_SIGNAL;
>>+
>>+		memset(&execbuf, 0, sizeof(execbuf));
>>+		execbuf.ctx_id = ctx->id,
>>+		execbuf.batch_address = to_user_pointer(batch_addr),
>>+		execbuf.fence_count = count + 2,
>>+		execbuf.timeline_fences = to_user_pointer(exec_fence),
>>+
>>+		/* Block parallel submission */
>>+		spin_ctx = ctx_create_engines(i915, siblings, count);
>>+		ahnd = get_simple_ahnd(i915, spin_ctx->id);
>>+		spin = __igt_spin_new(i915,
>>+				      .ahnd = ahnd,
>>+				      .ctx = spin_ctx,
>>+				      .engine = 0,
>>+				      .flags = IGT_SPIN_FENCE_OUT |
>>+				      IGT_SPIN_NO_PREEMPTION);
>>+
>>+		/* Wait for spinners to start */
>>+		usleep(5 * 10000);
>>+		igt_assert(fence_busy(spin->out_fence));
>>+
>>+		/* Submit parallel execbuf */
>>+		gem_execbuf3(i915, &execbuf);
>>+
>>+		/*
>>+		 * Wait long enough for timeslicing to kick in but not
>>+		 * preemption. Spinner + parallel execbuf should be
>>+		 * active. This assumes default timeslice / preemption
>>+		 * values; if these are changed the test may fail.
>>+		 */
>>+		usleep(get_timeslice(i915, siblings[0]) * 2);
>>+		igt_assert(fence_busy(spin->out_fence));
>>+		igt_assert(syncobj_busy(i915, exec_syncobj));
>>+		check_bo(i915, obj[0], 0, false);
>>+
>>+		/*
>>+		 * End spinner and wait for spinner + parallel execbuf
>>+		 * to complete.
>>+		 */
>>+		igt_spin_end(spin);
>>+		igt_assert(syncobj_timeline_wait(i915, &exec_syncobj, &fence_value, 1,
>>+						 gettime_ns() + (2 * NSEC_PER_SEC),
>>+						 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
>>+		igt_assert(!syncobj_busy(i915, exec_syncobj));
>>+		syncobj_destroy(i915, exec_syncobj);
>>+		for (i = 0; i < count + 1; ++i)
>>+			syncobj_destroy(i915, bind_syncobj[i]);
>>+		check_bo(i915, obj[0], count, true);
>>+
>>+		/* Clean up */
>>+		intel_ctx_destroy(i915, ctx);
>>+		intel_ctx_destroy(i915, spin_ctx);
>>+		i915_vm_unbind(i915, vm_id, TARGET_BO_OFFSET, 4096);
>>+		for (i = 1; i < count + 1; ++i)
>>+			i915_vm_unbind(i915, vm_id, batch_addr[i - 1], 4096);
>>+
>>+		for (i = 0; i < count + 1; ++i)
>>+			gem_close(i915, obj[i]);
>>+		free(siblings);
>>+		igt_spin_free(i915, spin);
>>+		put_ahnd(ahnd);
>>+	}
>>+
>>+	gem_vm_destroy(i915, vm_id);
>>+}
>>+
>>+static bool has_load_balancer(int i915)
>>+{
>>+	const intel_ctx_cfg_t cfg = {
>>+		.load_balance = true,
>>+		.num_engines = 1,
>>+	};
>>+	const intel_ctx_t *ctx = NULL;
>>+	int err;
>>+
>>+	err = __intel_ctx_create(i915, &cfg, &ctx);
>>+	intel_ctx_destroy(i915, ctx);
>>+
>>+	return err == 0;
>>+}
>>+
>>+static bool has_logical_mapping(int i915)
>>+{
>>+	struct drm_i915_query_engine_info *engines;
>>+	unsigned int i;
>>+
>>+	engines = query_engine_info(i915);
>>+
>>+	for (i = 0; i < engines->num_engines; ++i)
>>+		if (!(engines->engines[i].flags &
>>+		     I915_ENGINE_INFO_HAS_LOGICAL_INSTANCE)) {
>>+			free(engines);
>>+			return false;
>>+		}
>>+
>>+	free(engines);
>>+	return true;
>>+}
>>+
>>+static bool has_parallel_execbuf(int i915)
>>+{
>>+	intel_ctx_cfg_t cfg = {
>>+		.parallel = true,
>>+		.num_engines = 1,
>>+	};
>>+	const intel_ctx_t *ctx = NULL;
>>+	int err;
>>+
>>+	for (int class = 0; class < 32; class++) {
>>+		struct i915_engine_class_instance *siblings;
>>+		unsigned int count;
>>+
>>+		siblings = list_engines(i915, 1u << class, &count);
>>+		if (!siblings)
>>+			continue;
>>+
>>+		if (count < 2) {
>>+			free(siblings);
>>+			continue;
>>+		}
>>+
>>+		logical_sort_siblings(i915, siblings, count);
>>+
>>+		cfg.width = count;
>>+		memcpy(cfg.engines, siblings, sizeof(*siblings) * count);
>>+		free(siblings);
>>+
>>+		err = __intel_ctx_create(i915, &cfg, &ctx);
>>+		intel_ctx_destroy(i915, ctx);
>>+
>>+		return err == 0;
>>+	}
>>+
>>+	return false;
>>+}
>>+
>>+igt_main
>>+{
>>+	int i915 = -1;
>>+
>>+	igt_fixture {
>>+		i915 = drm_open_driver(DRIVER_INTEL);
>>+		igt_require_gem(i915);
>>+
>>+		gem_require_contexts(i915);
>>+		igt_require(i915_vm_bind_version(i915) == 1);
>>+		igt_require(gem_has_engine_topology(i915));
>>+		igt_require(has_load_balancer(i915));
>>+		igt_require(has_perf_engines(i915));
>>+	}
>>+
>>+	igt_subtest_group {
>>+		igt_fixture {
>>+			igt_require(has_logical_mapping(i915));
>>+			igt_require(has_parallel_execbuf(i915));
>>+		}
>>+
>>+		igt_describe("Ensure a parallel submit actually runs in parallel");
>>+		igt_subtest("parallel-ordering")
>>+			parallel_ordering(i915, 0);
>>+	}
>>+}
>>diff --git a/tests/meson.build b/tests/meson.build
>>index 50de8b7e89..6ca5a94282 100644
>>--- a/tests/meson.build
>>+++ b/tests/meson.build
>>@@ -375,6 +375,13 @@ test_executables += executable('gem_exec_balancer', 'i915/gem_exec_balancer.c',
>>  	   install : true)
>>  test_list += 'gem_exec_balancer'
>>+test_executables += executable('gem_exec3_balancer', 'i915/gem_exec3_balancer.c',
>>+	   dependencies : test_deps + [ lib_igt_perf ],
>>+	   install_dir : libexecdir,
>>+	   install_rpath : libexecdir_rpathdir,
>>+	   install : true)
>>+test_list += 'gem_exec3_balancer'
>>+
>>  test_executables += executable('gem_mmap_offset',
>>  	   join_paths('i915', 'gem_mmap_offset.c'),
>>  	   dependencies : test_deps + [ libatomic ],

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v4 10/11] tests/i915/vm_bind: Add gem_exec3_balancer test
  2022-10-21 20:11     ` Niranjana Vishwanathapura
@ 2022-10-21 20:47       ` Niranjana Vishwanathapura
  2022-10-24 15:26         ` Matthew Auld
  0 siblings, 1 reply; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-21 20:47 UTC (permalink / raw)
  To: Matthew Auld
  Cc: igt-dev, daniel.vetter, thomas.hellstrom, petri.latvala, tvrtko.ursulin

On Fri, Oct 21, 2022 at 01:11:28PM -0700, Niranjana Vishwanathapura wrote:
>On Fri, Oct 21, 2022 at 05:42:55PM +0100, Matthew Auld wrote:
>>On 18/10/2022 08:17, Niranjana Vishwanathapura wrote:
>>>To test parallel submissions support in execbuf3, port the subtest
>>>gem_exec_balancer@parallel-ordering to a new gem_exec3_balancer test
>>>and switch to execbuf3 ioctl.
>>>
>>>v2: use i915_vm_bind library functions
>>>
>>>Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>>>---
>>> tests/i915/gem_exec3_balancer.c | 473 ++++++++++++++++++++++++++++++++
>>> tests/meson.build               |   7 +
>>> 2 files changed, 480 insertions(+)
>>> create mode 100644 tests/i915/gem_exec3_balancer.c
>>>
>>>[snip]
>>>
>>>+static void check_bo(int i915, uint32_t handle, unsigned int expected,
>>>+		     bool wait)
>>>+{
>>>+	uint32_t *map;
>>>+
>>>+	map = gem_mmap__cpu(i915, handle, 0, 4096, PROT_READ);
>>>+	if (wait)
>>>+		gem_set_domain(i915, handle, I915_GEM_DOMAIN_CPU,
>>>+			       I915_GEM_DOMAIN_CPU);
>>
>>Hmm, does the wait still work as expected in set_domain() or does 
>>that ignore BOOKKEEP, in which case this doesn't do anything with 
>>eb3? Perhaps can be deleted? Also slightly worried about the dirty 
>>tracking, and whether this test cares...
>
>I think we still need this. gem_set_domain() will correctly wait on
>implicit dependency tracking only and hence will ignore BOOKKEEP. As
>there is no implicit dependency tracking with VM_BIND, the user has to
>do an explicit wait. This test properly waits for the execution fence
>to signal before calling this function. Hence I think we should be good.
>

So instead of naming it 'wait', we should probably call it 'flush'
or something, since it is there to call gem_set_domain()?

Niranjana

>Regards,
>Niranjana
>
>>
>>>+	igt_assert_eq(map[0], expected);
>>>+	munmap(map, 4096);
>>>+}
>>>+
>>>+static struct drm_i915_query_engine_info *query_engine_info(int i915)
>>>+{
>>>+	struct drm_i915_query_engine_info *engines;
>>>+
>>>+#define QUERY_SIZE	0x4000
>>>+	engines = malloc(QUERY_SIZE);
>>>+	igt_assert(engines);
>>>+	memset(engines, 0, QUERY_SIZE);
>>>+	igt_assert(!__gem_query_engines(i915, engines, QUERY_SIZE));
>>>+#undef QUERY_SIZE
>>>+
>>>+	return engines;
>>>+}
>>>+
>>>+/* This function only works if siblings contains all instances of a class */
>>>+static void logical_sort_siblings(int i915,
>>>+				  struct i915_engine_class_instance *siblings,
>>>+				  unsigned int count)
>>>+{
>>>+	struct i915_engine_class_instance *sorted;
>>>+	struct drm_i915_query_engine_info *engines;
>>>+	unsigned int i, j;
>>>+
>>>+	sorted = calloc(count, sizeof(*sorted));
>>>+	igt_assert(sorted);
>>>+
>>>+	engines = query_engine_info(i915);
>>>+
>>>+	for (j = 0; j < count; ++j) {
>>>+		for (i = 0; i < engines->num_engines; ++i) {
>>>+			if (siblings[j].engine_class ==
>>>+			    engines->engines[i].engine.engine_class &&
>>>+			    siblings[j].engine_instance ==
>>>+			    engines->engines[i].engine.engine_instance) {
>>>+				uint16_t logical_instance =
>>>+					engines->engines[i].logical_instance;
>>>+
>>>+				igt_assert(logical_instance < count);
>>>+				igt_assert(!sorted[logical_instance].engine_class);
>>>+				igt_assert(!sorted[logical_instance].engine_instance);
>>>+
>>>+				sorted[logical_instance] = siblings[j];
>>>+				break;
>>>+			}
>>>+		}
>>>+		igt_assert(i != engines->num_engines);
>>>+	}
>>>+
>>>+	memcpy(siblings, sorted, sizeof(*sorted) * count);
>>>+	free(sorted);
>>>+	free(engines);
>>>+}
>>>+
>>>+static bool fence_busy(int fence)
>>>+{
>>>+	return poll(&(struct pollfd){fence, POLLIN}, 1, 0) == 0;
>>>+}
>>>+
>>>+/*
>>>+ * Always read from engine instance 0; with GuC submission the values are
>>>+ * the same across all instances. With execlists they may differ, but that
>>>+ * is quite unlikely, and if they do we can live with it.
>>>+ */
>>>+static unsigned int get_timeslice(int i915,
>>>+				  struct i915_engine_class_instance engine)
>>>+{
>>>+	unsigned int val;
>>>+
>>>+	switch (engine.engine_class) {
>>>+	case I915_ENGINE_CLASS_RENDER:
>>>+		gem_engine_property_scanf(i915, "rcs0", "timeslice_duration_ms",
>>>+					  "%d", &val);
>>>+		break;
>>>+	case I915_ENGINE_CLASS_COPY:
>>>+		gem_engine_property_scanf(i915, "bcs0", "timeslice_duration_ms",
>>>+					  "%d", &val);
>>>+		break;
>>>+	case I915_ENGINE_CLASS_VIDEO:
>>>+		gem_engine_property_scanf(i915, "vcs0", "timeslice_duration_ms",
>>>+					  "%d", &val);
>>>+		break;
>>>+	case I915_ENGINE_CLASS_VIDEO_ENHANCE:
>>>+		gem_engine_property_scanf(i915, "vecs0", "timeslice_duration_ms",
>>>+					  "%d", &val);
>>>+		break;
>>>+	}
>>>+
>>>+	return val;
>>>+}
>>>+
>>>+static uint64_t gettime_ns(void)
>>>+{
>>>+	struct timespec current;
>>>+	clock_gettime(CLOCK_MONOTONIC, &current);
>>>+	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
>>>+}
>>>+
>>>+static bool syncobj_busy(int i915, uint32_t handle)
>>>+{
>>>+	bool result;
>>>+	int sf;
>>>+
>>>+	sf = syncobj_handle_to_fd(i915, handle,
>>>+				  DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE);
>>>+	result = poll(&(struct pollfd){sf, POLLIN}, 1, 0) == 0;
>>>+	close(sf);
>>>+
>>>+	return result;
>>>+}
>>>+
>>>+/*
>>>+ * Ensure a parallel submit actually runs on HW in parallel by putting a
>>>+ * spinner on one engine, doing a parallel submit, and checking that the
>>>+ * parallel submit is blocked behind the spinner.
>>>+ */
>>>+static void parallel_ordering(int i915, unsigned int flags)
>>>+{
>>>+	uint32_t vm_id;
>>>+	int class;
>>>+
>>>+	vm_id = gem_vm_create_in_vm_bind_mode(i915);
>>>+
>>>+	for (class = 0; class < 32; class++) {
>>>+		struct drm_i915_gem_timeline_fence exec_fence[32] = { };
>>>+		const intel_ctx_t *ctx = NULL, *spin_ctx = NULL;
>>>+		uint64_t fence_value = 0, batch_addr[32] = { };
>>>+		struct i915_engine_class_instance *siblings;
>>>+		uint32_t exec_syncobj, bind_syncobj[32];
>>>+		struct drm_i915_gem_execbuffer3 execbuf;
>>>+		uint32_t batch[16], obj[32];
>>>+		intel_ctx_cfg_t cfg;
>>>+		unsigned int count;
>>>+		igt_spin_t *spin;
>>>+		uint64_t ahnd;
>>>+		int i = 0;
>>>+
>>>+		siblings = list_engines(i915, 1u << class, &count);
>>>+		if (!siblings)
>>>+			continue;
>>>+
>>>+		if (count < 2) {
>>>+			free(siblings);
>>>+			continue;
>>>+		}
>>>+
>>>+		logical_sort_siblings(i915, siblings, count);
>>>+
>>>+		memset(&cfg, 0, sizeof(cfg));
>>>+		cfg.parallel = true;
>>>+		cfg.num_engines = 1;
>>>+		cfg.width = count;
>>>+		memcpy(cfg.engines, siblings, sizeof(*siblings) * count);
>>>+
>>>+		if (__intel_ctx_create(i915, &cfg, &ctx)) {
>>>+			free(siblings);
>>>+			continue;
>>>+		}
>>>+		gem_context_set_vm(i915, ctx->id, vm_id);
>>>+
>>>+		batch[i] = MI_ATOMIC | MI_ATOMIC_INC;
>>>+#define TARGET_BO_OFFSET	(0x1 << 16)
>>>+		batch[++i] = TARGET_BO_OFFSET;
>>>+		batch[++i] = 0;
>>>+		batch[++i] = MI_BATCH_BUFFER_END;
>>>+
>>>+		obj[0] = gem_create(i915, 4096);
>>>+		bind_syncobj[0] = syncobj_create(i915, 0);
>>>+		exec_fence[0].handle = bind_syncobj[0];
>>>+		exec_fence[0].flags = I915_TIMELINE_FENCE_WAIT;
>>>+		i915_vm_bind(i915, vm_id, TARGET_BO_OFFSET, obj[0], 0, 4096, bind_syncobj[0], 0);
>>>+
>>>+		for (i = 1; i < count + 1; ++i) {
>>>+			obj[i] = gem_create(i915, 4096);
>>>+			gem_write(i915, obj[i], 0, batch, sizeof(batch));
>>>+
>>>+			batch_addr[i - 1] = TARGET_BO_OFFSET * (i + 1);
>>>+			bind_syncobj[i] = syncobj_create(i915, 0);
>>>+			exec_fence[i].handle = bind_syncobj[i];
>>>+			exec_fence[i].flags = I915_TIMELINE_FENCE_WAIT;
>>>+			i915_vm_bind(i915, vm_id, batch_addr[i - 1], obj[i], 0, 4096, bind_syncobj[i], 0);
>>>+		}
>>>+
>>>+		exec_syncobj = syncobj_create(i915, 0);
>>>+		exec_fence[i].handle = exec_syncobj;
>>>+		exec_fence[i].flags = I915_TIMELINE_FENCE_SIGNAL;
>>>+
>>>+		memset(&execbuf, 0, sizeof(execbuf));
>>>+	execbuf.ctx_id = ctx->id;
>>>+	execbuf.batch_address = to_user_pointer(batch_addr);
>>>+	execbuf.fence_count = count + 2;
>>>+	execbuf.timeline_fences = to_user_pointer(exec_fence);
>>>+
>>>+		/* Block parallel submission */
>>>+		spin_ctx = ctx_create_engines(i915, siblings, count);
>>>+		ahnd = get_simple_ahnd(i915, spin_ctx->id);
>>>+		spin = __igt_spin_new(i915,
>>>+				      .ahnd = ahnd,
>>>+				      .ctx = spin_ctx,
>>>+				      .engine = 0,
>>>+				      .flags = IGT_SPIN_FENCE_OUT |
>>>+				      IGT_SPIN_NO_PREEMPTION);
>>>+
>>>+		/* Wait for spinners to start */
>>>+		usleep(5 * 10000);
>>>+		igt_assert(fence_busy(spin->out_fence));
>>>+
>>>+		/* Submit parallel execbuf */
>>>+		gem_execbuf3(i915, &execbuf);
>>>+
>>>+		/*
>>>+		 * Wait long enough for timeslicing to kick in but not
>>>+		 * preemption. Spinner + parallel execbuf should be
>>>+		 * active. This assumes default timeslice / preemption
>>>+		 * values; if they are changed the test may fail.
>>>+		 */
>>>+		usleep(get_timeslice(i915, siblings[0]) * 2);
>>>+		igt_assert(fence_busy(spin->out_fence));
>>>+		igt_assert(syncobj_busy(i915, exec_syncobj));
>>>+		check_bo(i915, obj[0], 0, false);
>>>+
>>>+		/*
>>>+		 * End spinner and wait for spinner + parallel execbuf
>>>+		 * to complete.
>>>+		 */
>>>+		igt_spin_end(spin);
>>>+		igt_assert(syncobj_timeline_wait(i915, &exec_syncobj, &fence_value, 1,
>>>+						 gettime_ns() + (2 * NSEC_PER_SEC),
>>>+						 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
>>>+		igt_assert(!syncobj_busy(i915, exec_syncobj));
>>>+		syncobj_destroy(i915, exec_syncobj);
>>>+		for (i = 0; i < count + 1; ++i)
>>>+			syncobj_destroy(i915, bind_syncobj[i]);
>>>+		check_bo(i915, obj[0], count, true);
>>>+
>>>+		/* Clean up */
>>>+		intel_ctx_destroy(i915, ctx);
>>>+		intel_ctx_destroy(i915, spin_ctx);
>>>+		i915_vm_unbind(i915, vm_id, TARGET_BO_OFFSET, 4096);
>>>+		for (i = 1; i < count + 1; ++i)
>>>+			i915_vm_unbind(i915, vm_id, batch_addr[i - 1], 4096);
>>>+
>>>+		for (i = 0; i < count + 1; ++i)
>>>+			gem_close(i915, obj[i]);
>>>+		free(siblings);
>>>+		igt_spin_free(i915, spin);
>>>+		put_ahnd(ahnd);
>>>+	}
>>>+
>>>+	gem_vm_destroy(i915, vm_id);
>>>+}
>>>+
>>>+static bool has_load_balancer(int i915)
>>>+{
>>>+	const intel_ctx_cfg_t cfg = {
>>>+		.load_balance = true,
>>>+		.num_engines = 1,
>>>+	};
>>>+	const intel_ctx_t *ctx = NULL;
>>>+	int err;
>>>+
>>>+	err = __intel_ctx_create(i915, &cfg, &ctx);
>>>+	intel_ctx_destroy(i915, ctx);
>>>+
>>>+	return err == 0;
>>>+}
>>>+
>>>+static bool has_logical_mapping(int i915)
>>>+{
>>>+	struct drm_i915_query_engine_info *engines;
>>>+	unsigned int i;
>>>+
>>>+	engines = query_engine_info(i915);
>>>+
>>>+	for (i = 0; i < engines->num_engines; ++i)
>>>+		if (!(engines->engines[i].flags &
>>>+		     I915_ENGINE_INFO_HAS_LOGICAL_INSTANCE)) {
>>>+			free(engines);
>>>+			return false;
>>>+		}
>>>+
>>>+	free(engines);
>>>+	return true;
>>>+}
>>>+
>>>+static bool has_parallel_execbuf(int i915)
>>>+{
>>>+	intel_ctx_cfg_t cfg = {
>>>+		.parallel = true,
>>>+		.num_engines = 1,
>>>+	};
>>>+	const intel_ctx_t *ctx = NULL;
>>>+	int err;
>>>+
>>>+	for (int class = 0; class < 32; class++) {
>>>+		struct i915_engine_class_instance *siblings;
>>>+		unsigned int count;
>>>+
>>>+		siblings = list_engines(i915, 1u << class, &count);
>>>+		if (!siblings)
>>>+			continue;
>>>+
>>>+		if (count < 2) {
>>>+			free(siblings);
>>>+			continue;
>>>+		}
>>>+
>>>+		logical_sort_siblings(i915, siblings, count);
>>>+
>>>+		cfg.width = count;
>>>+		memcpy(cfg.engines, siblings, sizeof(*siblings) * count);
>>>+		free(siblings);
>>>+
>>>+		err = __intel_ctx_create(i915, &cfg, &ctx);
>>>+		intel_ctx_destroy(i915, ctx);
>>>+
>>>+		return err == 0;
>>>+	}
>>>+
>>>+	return false;
>>>+}
>>>+
>>>+igt_main
>>>+{
>>>+	int i915 = -1;
>>>+
>>>+	igt_fixture {
>>>+		i915 = drm_open_driver(DRIVER_INTEL);
>>>+		igt_require_gem(i915);
>>>+
>>>+		gem_require_contexts(i915);
>>>+		igt_require(i915_vm_bind_version(i915) == 1);
>>>+		igt_require(gem_has_engine_topology(i915));
>>>+		igt_require(has_load_balancer(i915));
>>>+		igt_require(has_perf_engines(i915));
>>>+	}
>>>+
>>>+	igt_subtest_group {
>>>+		igt_fixture {
>>>+			igt_require(has_logical_mapping(i915));
>>>+			igt_require(has_parallel_execbuf(i915));
>>>+		}
>>>+
>>>+		igt_describe("Ensure a parallel submit actually runs in parallel");
>>>+		igt_subtest("parallel-ordering")
>>>+			parallel_ordering(i915, 0);
>>>+	}
>>>+}
>>>diff --git a/tests/meson.build b/tests/meson.build
>>>index 50de8b7e89..6ca5a94282 100644
>>>--- a/tests/meson.build
>>>+++ b/tests/meson.build
>>>@@ -375,6 +375,13 @@ test_executables += executable('gem_exec_balancer', 'i915/gem_exec_balancer.c',
>>> 	   install : true)
>>> test_list += 'gem_exec_balancer'
>>>+test_executables += executable('gem_exec3_balancer', 'i915/gem_exec3_balancer.c',
>>>+	   dependencies : test_deps + [ lib_igt_perf ],
>>>+	   install_dir : libexecdir,
>>>+	   install_rpath : libexecdir_rpathdir,
>>>+	   install : true)
>>>+test_list += 'gem_exec3_balancer'
>>>+
>>> test_executables += executable('gem_mmap_offset',
>>> 	   join_paths('i915', 'gem_mmap_offset.c'),
>>> 	   dependencies : test_deps + [ libatomic ],

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v4 10/11] tests/i915/vm_bind: Add gem_exec3_balancer test
  2022-10-21 20:47       ` Niranjana Vishwanathapura
@ 2022-10-24 15:26         ` Matthew Auld
  2022-10-25  7:08           ` Niranjana Vishwanathapura
  0 siblings, 1 reply; 28+ messages in thread
From: Matthew Auld @ 2022-10-24 15:26 UTC (permalink / raw)
  To: Niranjana Vishwanathapura
  Cc: igt-dev, daniel.vetter, thomas.hellstrom, petri.latvala, tvrtko.ursulin

On 21/10/2022 21:47, Niranjana Vishwanathapura wrote:
> On Fri, Oct 21, 2022 at 01:11:28PM -0700, Niranjana Vishwanathapura wrote:
>> On Fri, Oct 21, 2022 at 05:42:55PM +0100, Matthew Auld wrote:
>>> On 18/10/2022 08:17, Niranjana Vishwanathapura wrote:
>>>> To test parallel submissions support in execbuf3, port the subtest
>>>> gem_exec_balancer@parallel-ordering to a new gem_exec3_balancer test
>>>> and switch to execbuf3 ioctl.
>>>>
>>>> v2: use i915_vm_bind library functions
>>>>
>>>> Signed-off-by: Niranjana Vishwanathapura 
>>>> <niranjana.vishwanathapura@intel.com>
>>>> ---
>>>> tests/i915/gem_exec3_balancer.c | 473 ++++++++++++++++++++++++++++++++
>>>> tests/meson.build               |   7 +
>>>> 2 files changed, 480 insertions(+)
>>>> create mode 100644 tests/i915/gem_exec3_balancer.c
>>>>
>>>> diff --git a/tests/i915/gem_exec3_balancer.c 
>>>> b/tests/i915/gem_exec3_balancer.c
>>>> new file mode 100644
>>>> index 0000000000..34b2a6a0b3
>>>> --- /dev/null
>>>> +++ b/tests/i915/gem_exec3_balancer.c
>>>> @@ -0,0 +1,473 @@
>>>> +// SPDX-License-Identifier: MIT
>>>> +/*
>>>> + * Copyright © 2022 Intel Corporation
>>>> + */
>>>> +
>>>> +/** @file gem_exec3_balancer.c
>>>> + *
>>>> + * Load balancer tests with execbuf3.
>>>> + * Ported from gem_exec_balancer and made to work with
>>>> + * vm_bind and execbuf3.
>>>> + *
>>>> + */
>>>> +
>>>> +#include <poll.h>
>>>> +
>>>> +#include "i915/gem.h"
>>>> +#include "i915/i915_vm_bind.h"
>>>> +#include "i915/gem_engine_topology.h"
>>>> +#include "i915/gem_create.h"
>>>> +#include "i915/gem_vm.h"
>>>> +#include "igt.h"
>>>> +#include "igt_gt.h"
>>>> +#include "igt_perf.h"
>>>> +#include "igt_syncobj.h"
>>>> +
>>>> +IGT_TEST_DESCRIPTION("Exercise in-kernel load-balancing with 
>>>> execbuf3");
>>>> +
>>>> +#define INSTANCE_COUNT (1 << I915_PMU_SAMPLE_INSTANCE_BITS)
>>>> +
>>>> +static bool has_class_instance(int i915, uint16_t class, uint16_t 
>>>> instance)
>>>> +{
>>>> +    int fd;
>>>> +
>>>> +    fd = perf_i915_open(i915, I915_PMU_ENGINE_BUSY(class, instance));
>>>> +    if (fd >= 0) {
>>>> +        close(fd);
>>>> +        return true;
>>>> +    }
>>>> +
>>>> +    return false;
>>>> +}
>>>> +
>>>> +static struct i915_engine_class_instance *
>>>> +list_engines(int i915, uint32_t class_mask, unsigned int *out)
>>>> +{
>>>> +    unsigned int count = 0, size = 64;
>>>> +    struct i915_engine_class_instance *engines;
>>>> +
>>>> +    engines = malloc(size * sizeof(*engines));
>>>> +    igt_assert(engines);
>>>> +
>>>> +    for (enum drm_i915_gem_engine_class class = 
>>>> I915_ENGINE_CLASS_RENDER;
>>>> +         class_mask;
>>>> +         class++, class_mask >>= 1) {
>>>> +        if (!(class_mask & 1))
>>>> +            continue;
>>>> +
>>>> +        for (unsigned int instance = 0;
>>>> +             instance < INSTANCE_COUNT;
>>>> +             instance++) {
>>>> +            if (!has_class_instance(i915, class, instance))
>>>> +                continue;
>>>> +
>>>> +            if (count == size) {
>>>> +                size *= 2;
>>>> +                engines = realloc(engines,
>>>> +                          size * sizeof(*engines));
>>>> +                igt_assert(engines);
>>>> +            }
>>>> +
>>>> +            engines[count++] = (struct i915_engine_class_instance){
>>>> +                .engine_class = class,
>>>> +                .engine_instance = instance,
>>>> +            };
>>>> +        }
>>>> +    }
>>>> +
>>>> +    if (!count) {
>>>> +        free(engines);
>>>> +        engines = NULL;
>>>> +    }
>>>> +
>>>> +    *out = count;
>>>> +    return engines;
>>>> +}
>>>> +
>>>> +static bool has_perf_engines(int i915)
>>>> +{
>>>> +    return i915_perf_type_id(i915);
>>>> +}
>>>> +
>>>> +static intel_ctx_cfg_t
>>>> +ctx_cfg_for_engines(const struct i915_engine_class_instance *ci,
>>>> +            unsigned int count)
>>>> +{
>>>> +    intel_ctx_cfg_t cfg = { };
>>>> +    unsigned int i;
>>>> +
>>>> +    for (i = 0; i < count; i++)
>>>> +        cfg.engines[i] = ci[i];
>>>> +    cfg.num_engines = count;
>>>> +
>>>> +    return cfg;
>>>> +}
>>>> +
>>>> +static const intel_ctx_t *
>>>> +ctx_create_engines(int i915, const struct 
>>>> i915_engine_class_instance *ci,
>>>> +           unsigned int count)
>>>> +{
>>>> +    intel_ctx_cfg_t cfg = ctx_cfg_for_engines(ci, count);
>>>> +    return intel_ctx_create(i915, &cfg);
>>>> +}
>>>> +
>>>> +static void check_bo(int i915, uint32_t handle, unsigned int expected,
>>>> +             bool wait)
>>>> +{
>>>> +    uint32_t *map;
>>>> +
>>>> +    map = gem_mmap__cpu(i915, handle, 0, 4096, PROT_READ);
>>>> +    if (wait)
>>>> +        gem_set_domain(i915, handle, I915_GEM_DOMAIN_CPU,
>>>> +                   I915_GEM_DOMAIN_CPU);
>>>
>>> Hmm, does the wait still work as expected in set_domain() or does 
>>> that ignore BOOKKEEP, in which case this doesn't do anything with 
>>> eb3? Perhaps can be deleted? Also slightly worried about the dirty 
>>> tracking, and whether this test cares...
>>
>> I think we still need this. gem_set_domain() will correctly wait for
>> implicit dependency tracking only and hence will ignore BOOKKEEP. As
>> there is no implicit dependency tracking with VM_BIND, the user has to
>> do an explicit wait. This test properly waits for the execution fence
>> status before calling this function. Hence I think we should be good.

And in this test there should be no implicit dependencies, right?

>>
> 
> So, probably instead of naming it 'wait', we should call it 'flush'
> or something for calling gem_set_domain()?

It doesn't look like it flushes anything either here, AFAICT. But 
looking again, it doesn't seem like the test cares about check_bo() 
performing a flush. So it should be fine to just drop the set_domain()?

> 
> Niranjana
> 
>> Regards,
>> Niranjana
>>
>>>
>>>> +    igt_assert_eq(map[0], expected);
>>>> +    munmap(map, 4096);
>>>> +}
>>>> +
>>>> +static struct drm_i915_query_engine_info *query_engine_info(int i915)
>>>> +{
>>>> +    struct drm_i915_query_engine_info *engines;
>>>> +
>>>> +#define QUERY_SIZE    0x4000
>>>> +    engines = malloc(QUERY_SIZE);
>>>> +    igt_assert(engines);
>>>> +    memset(engines, 0, QUERY_SIZE);
>>>> +    igt_assert(!__gem_query_engines(i915, engines, QUERY_SIZE));
>>>> +#undef QUERY_SIZE
>>>> +
>>>> +    return engines;
>>>> +}
>>>> +
^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v4 07/11] tests/i915/vm_bind: Add basic VM_BIND test support
  2022-10-21 16:27   ` Matthew Auld
@ 2022-10-25  7:06     ` Niranjana Vishwanathapura
  2022-10-28 16:38       ` Matthew Auld
  0 siblings, 1 reply; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-25  7:06 UTC (permalink / raw)
  To: Matthew Auld
  Cc: tvrtko.ursulin, igt-dev, thomas.hellstrom, daniel.vetter, petri.latvala

On Fri, Oct 21, 2022 at 05:27:56PM +0100, Matthew Auld wrote:
>On 18/10/2022 08:17, Niranjana Vishwanathapura wrote:
>>Add basic tests for VM_BIND functionality. Bind the buffer objects in
>>device page table with VM_BIND calls and have GPU copy the data from a
>>source buffer object to destination buffer object.
>>Test for different buffer sizes, buffer object placement and with
>>multiple contexts.
>>
>>v2: Add basic test to fast-feedback.testlist
>>v3: Run all tests for different memory types,
>>     pass VA instead of an array for batch_address
>>v4: Iterate for memory region types instead of hardcoding,
>>     use i915_vm_bind library functions
>>
>>Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>>
>>basic
>
>Unwanted change?
>

fixed

>>---
>>  tests/i915/i915_vm_bind_basic.c       | 522 ++++++++++++++++++++++++++
>>  tests/intel-ci/fast-feedback.testlist |   1 +
>>  tests/meson.build                     |   1 +
>>  3 files changed, 524 insertions(+)
>>  create mode 100644 tests/i915/i915_vm_bind_basic.c
>>
>>diff --git a/tests/i915/i915_vm_bind_basic.c b/tests/i915/i915_vm_bind_basic.c
>>new file mode 100644
>>index 0000000000..ede76ba095
>>--- /dev/null
>>+++ b/tests/i915/i915_vm_bind_basic.c
>>@@ -0,0 +1,522 @@
>>+// SPDX-License-Identifier: MIT
>>+/*
>>+ * Copyright © 2022 Intel Corporation
>>+ */
>>+
>>+/** @file i915_vm_bind_basic.c
>>+ *
>>+ * This is the basic test for VM_BIND functionality.
>>+ *
>>+ * The goal is to ensure that basics work.
>>+ */
>>+
>>+#include <sys/poll.h>
>>+
>>+#include "i915/gem.h"
>>+#include "i915/i915_vm_bind.h"
>>+#include "igt.h"
>>+#include "igt_syncobj.h"
>>+#include <unistd.h>
>>+#include <stdlib.h>
>>+#include <stdint.h>
>>+#include <stdio.h>
>>+#include <string.h>
>>+#include <fcntl.h>
>>+#include <inttypes.h>
>>+#include <errno.h>
>>+#include <sys/stat.h>
>>+#include <sys/ioctl.h>
>>+#include "drm.h"
>>+#include "i915/gem_vm.h"
>>+
>>+IGT_TEST_DESCRIPTION("Basic test for vm_bind functionality");
>>+
>>+#define PAGE_SIZE   4096
>>+#define PAGE_SHIFT  12
>>+
>>+#define GEN9_XY_FAST_COPY_BLT_CMD       (2 << 29 | 0x42 << 22)
>>+#define BLT_DEPTH_32                    (3 << 24)
>>+
>>+#define DEFAULT_BUFF_SIZE  (4 * PAGE_SIZE)
>>+#define SZ_64K             (16 * PAGE_SIZE)
>>+#define SZ_2M              (512 * PAGE_SIZE)
>>+
>>+#define MAX_CTXTS   2
>>+#define MAX_CMDS    4
>>+
>>+#define BATCH_FENCE  0
>>+#define SRC_FENCE    1
>>+#define DST_FENCE    2
>>+#define EXEC_FENCE   3
>>+#define NUM_FENCES   4
>>+
>>+enum {
>>+	BATCH_MAP,
>>+	SRC_MAP,
>>+	DST_MAP = SRC_MAP + MAX_CMDS,
>>+	MAX_MAP
>>+};
>>+
>>+struct mapping {
>>+	uint32_t  obj;
>>+	uint64_t  va;
>>+	uint64_t  offset;
>>+	uint64_t  length;
>>+	uint64_t  flags;
>>+};
>>+
>>+#define SET_MAP(map, _obj, _va, _offset, _length, _flags)   \
>>+{                                  \
>>+	(map).obj = _obj;          \
>>+	(map).va = _va;            \
>>+	(map).offset = _offset;	   \
>>+	(map).length = _length;	   \
>>+	(map).flags = _flags;	   \
>>+}
>>+
>>+#define MAX_BATCH_DWORD    64
>>+
>>+#define abs(x) ((x) >= 0 ? (x) : -(x))
>>+
>>+#define TEST_LMEM             BIT(0)
>>+#define TEST_SKIP_UNBIND      BIT(1)
>>+#define TEST_SHARE_VM         BIT(2)
>>+
>>+#define is_lmem(cfg)        ((cfg)->flags & TEST_LMEM)
>>+#define do_unbind(cfg)      (!((cfg)->flags & TEST_SKIP_UNBIND))
>>+#define do_share_vm(cfg)    ((cfg)->flags & TEST_SHARE_VM)
>>+
>>+struct test_cfg {
>>+	const char *name;
>>+	uint32_t size;
>>+	uint8_t num_cmds;
>>+	uint32_t num_ctxts;
>>+	uint32_t flags;
>>+};
>>+
>>+static uint64_t
>>+gettime_ns(void)
>>+{
>>+	struct timespec current;
>>+	clock_gettime(CLOCK_MONOTONIC, &current);
>>+	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
>>+}
>>+
>>+static bool syncobj_busy(int fd, uint32_t handle)
>>+{
>>+	bool result;
>>+	int sf;
>>+
>>+	sf = syncobj_handle_to_fd(fd, handle,
>>+				  DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE);
>>+	result = poll(&(struct pollfd){sf, POLLIN}, 1, 0) == 0;
>>+	close(sf);
>>+
>>+	return result;
>>+}
>>+
>>+static inline void vm_bind(int fd, uint32_t vm_id, struct mapping *m,
>>+			   struct drm_i915_gem_timeline_fence *fence)
>>+{
>>+	uint32_t syncobj = 0;
>>+
>>+	if (fence) {
>>+		syncobj =  syncobj_create(fd, 0);
>>+
>>+		fence->handle = syncobj;
>>+		fence->flags = I915_TIMELINE_FENCE_WAIT;
>>+		fence->value = 0;
>>+	}
>>+
>>+	igt_info("VM_BIND vm:0x%x h:0x%x v:0x%lx o:0x%lx l:0x%lx\n",
>>+		 vm_id, m->obj, m->va, m->offset, m->length);
>
>This igt_info() and elsewhere are not too loud? Maybe just igt_debug() 
>for some of these? Looks quite noisy when running this test locally.

ok, changed it to igt_debug()

>
>>+	i915_vm_bind(fd, vm_id, m->va, m->obj, m->offset, m->length, syncobj, 0);
>>+}
>>+
>>+static inline void vm_unbind(int fd, uint32_t vm_id, struct mapping *m)
>>+{
>>+	/* Object handle is not required during unbind */
>>+	igt_info("VM_UNBIND vm:0x%x v:0x%lx l:0x%lx\n", vm_id, m->va, m->length);
>>+	i915_vm_unbind(fd, vm_id, m->va, m->length);
>>+}
>>+
>>+static void print_buffer(void *buf, uint32_t size,
>>+			 const char *str, bool full)
>>+{
>>+	uint32_t i = 0;
>>+
>>+	igt_debug("Printing %s size 0x%x\n", str, size);
>>+	while (i < size) {
>>+		uint32_t *b = buf + i;
>>+
>>+		igt_debug("\t%s[0x%04x]: 0x%08x 0x%08x 0x%08x 0x%08x %s\n",
>>+			  str, i, b[0], b[1], b[2], b[3], full ? "" : "...");
>>+		i += full ? 16 : PAGE_SIZE;
>>+	}
>>+}
>>+
>>+static int gem_linear_fast_blt(uint32_t *batch, uint64_t src,
>>+			       uint64_t dst, uint32_t size)
>>+{
>>+	uint32_t *cmd = batch;
>>+
>>+	*cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
>>+	*cmd++ = BLT_DEPTH_32 | PAGE_SIZE;
>>+	*cmd++ = 0;
>>+	*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
>>+	*cmd++ = lower_32_bits(dst);
>>+	*cmd++ = upper_32_bits(dst);
>>+	*cmd++ = 0;
>>+	*cmd++ = PAGE_SIZE;
>>+	*cmd++ = lower_32_bits(src);
>>+	*cmd++ = upper_32_bits(src);
>>+
>>+	*cmd++ = MI_BATCH_BUFFER_END;
>>+	*cmd++ = 0;
>>+
>>+	return ALIGN((cmd - batch + 1) * sizeof(uint32_t), 8);
>>+}
>>+
>>+static void __gem_copy(int fd, uint64_t src, uint64_t dst, uint32_t size,
>>+		       uint32_t ctx_id, void *batch_addr, unsigned int eb_flags,
>>+		       struct drm_i915_gem_timeline_fence *fence)
>>+{
>>+	uint32_t len, buf[MAX_BATCH_DWORD] = { 0 };
>>+	struct drm_i915_gem_execbuffer3 execbuf;
>>+
>>+	len = gem_linear_fast_blt(buf, src, dst, size);
>>+
>>+	memcpy(batch_addr, (void *)buf, len);
>>+	print_buffer(buf, len, "batch", true);
>>+
>>+	memset(&execbuf, 0, sizeof(execbuf));
>>+	execbuf.ctx_id = ctx_id;
>>+	execbuf.batch_address = to_user_pointer(batch_addr);
>>+	execbuf.engine_idx = eb_flags;
>>+	execbuf.fence_count = NUM_FENCES;
>>+	execbuf.timeline_fences = to_user_pointer(fence);
>>+	gem_execbuf3(fd, &execbuf);
>>+}
>>+
>>+static void i915_gem_copy(int fd, uint64_t src, uint64_t dst, uint32_t va_delta,
>>+			  uint32_t delta, uint32_t size, const intel_ctx_t **ctx,
>>+			  uint32_t num_ctxts, void **batch_addr, unsigned int eb_flags,
>>+			  struct drm_i915_gem_timeline_fence (*fence)[NUM_FENCES])
>>+{
>>+	uint32_t i;
>>+
>>+	for (i = 0; i < num_ctxts; i++) {
>>+		igt_info("Issuing gem copy on ctx 0x%x\n", ctx[i]->id);
>>+		__gem_copy(fd, src + (i * va_delta), dst + (i * va_delta), delta,
>>+			   ctx[i]->id, batch_addr[i], eb_flags, fence[i]);
>>+	}
>>+}
>>+
>>+static void i915_gem_sync(int fd, const intel_ctx_t **ctx, uint32_t num_ctxts,
>>+			  struct drm_i915_gem_timeline_fence (*fence)[NUM_FENCES])
>>+{
>>+	uint32_t i;
>>+
>>+	for (i = 0; i < num_ctxts; i++) {
>>+		uint64_t fence_value = 0;
>>+
>>+		igt_assert(syncobj_timeline_wait(fd, &fence[i][EXEC_FENCE].handle,
>>+						 (uint64_t *)&fence_value, 1,
>>+						 gettime_ns() + (2 * NSEC_PER_SEC),
>>+						 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
>>+		igt_assert(!syncobj_busy(fd, fence[i][EXEC_FENCE].handle));
>>+		igt_info("gem copy completed on ctx 0x%x\n", ctx[i]->id);
>>+	}
>>+}
>>+
>>+static struct igt_collection *get_region_set(int fd, struct test_cfg *cfg, bool lmem_only)
>>+{
>>+	uint32_t mem_type[] = { I915_SYSTEM_MEMORY, I915_DEVICE_MEMORY };
>>+	uint32_t lmem_type[] = { I915_DEVICE_MEMORY };
>>+	struct drm_i915_query_memory_regions *query_info;
>>+
>>+	query_info = gem_get_query_memory_regions(fd);
>>+	igt_assert(query_info);
>>+
>>+	if (lmem_only)
>>+		return __get_memory_region_set(query_info, lmem_type, 1);
>>+	else
>>+		return __get_memory_region_set(query_info, mem_type, 2);
>>+}
>>+
>>+static void create_src_objs(int fd, struct test_cfg *cfg, uint32_t src[], uint32_t size,
>>+			    uint32_t num_cmds, void *src_addr[])
>>+{
>>+	int i;
>>+	struct igt_collection *set = get_region_set(fd, cfg, is_lmem(cfg) && (num_cmds == 1));
>>+	uint32_t region;
>>+
>>+	for (i = 0; i < num_cmds; i++) {
>>+		region = igt_collection_get_value(set, i % set->size);
>>+		src[i] = gem_create_in_memory_regions(fd, size, region);
>>+		igt_info("Src obj 0x%x created in region 0x%x:0x%x\n", src[i],
>>+			 MEMORY_TYPE_FROM_REGION(region), MEMORY_INSTANCE_FROM_REGION(region));
>>+		src_addr[i] = gem_mmap__cpu(fd, src[i], 0, size, PROT_WRITE);
>>+	}
>>+}
>>+
>>+static void destroy_src_objs(int fd, struct test_cfg *cfg, uint32_t src[], uint32_t size,
>>+			     uint32_t num_cmds, void *src_addr[])
>>+{
>>+	int i;
>>+
>>+	for (i = 0; i < num_cmds; i++) {
>>+		igt_assert(gem_munmap(src_addr[i], size) == 0);
>>+		igt_debug("Closing object 0x%x\n", src[i]);
>>+		gem_close(fd, src[i]);
>>+	}
>>+}
>>+
>>+static uint32_t create_dst_obj(int fd, struct test_cfg *cfg, uint32_t size, void **dst_addr)
>>+{
>>+	uint32_t dst;
>>+	struct igt_collection *set = get_region_set(fd, cfg, is_lmem(cfg));
>>+	uint32_t region = igt_collection_get_value(set, 0);
>
>Maybe just use the gem_memory_region and then we don't need this 
>igt_collection stuff here and elsewhere? IMO it's a lot nicer.

ok

>
>>+
>>+	dst = gem_create_in_memory_regions(fd, size, region);
>>+	igt_info("Dst obj 0x%x created in region 0x%x:0x%x\n", dst,
>>+		 MEMORY_TYPE_FROM_REGION(region), MEMORY_INSTANCE_FROM_REGION(region));
>
>And then here and elsewhere we can just use gem_memory_region.name, 
>which is a lot more readable for humans :)
>

ok

>>+	*dst_addr = gem_mmap__cpu(fd, dst, 0, size, PROT_WRITE);
>>+
>>+	return dst;
>>+}
>>+
>>+static void destroy_dst_obj(int fd, struct test_cfg *cfg, uint32_t dst, uint32_t size, void *dst_addr)
>>+{
>>+	igt_assert(gem_munmap(dst_addr, size) == 0);
>>+	igt_debug("Closing object 0x%x\n", dst);
>>+	gem_close(fd, dst);
>>+}
>>+
>>+static void pattern_fill_buf(void *src_addr[], uint32_t size, uint32_t num_cmds, uint32_t npages)
>>+{
>>+	uint32_t i, j;
>>+	void *buf;
>>+
>>+	/* Allocate buffer and fill pattern */
>>+	buf = malloc(size);
>>+	igt_require(buf);
>>+
>>+	for (i = 0; i < num_cmds; i++) {
>>+		for (j = 0; j < npages; j++)
>>+			memset(buf + j * PAGE_SIZE, i * npages + j + 1, PAGE_SIZE);
>>+
>>+		memcpy(src_addr[i], buf, size);
>>+	}
>>+
>>+	free(buf);
>>+}
>>+
>>+static void run_test(int fd, const intel_ctx_t *base_ctx, struct test_cfg *cfg,
>>+		     const struct intel_execution_engine2 *e)
>>+{
>>+	void *src_addr[MAX_CMDS] = { 0 }, *dst_addr = NULL;
>>+	uint32_t src[MAX_CMDS], dst, i, size = cfg->size;
>>+	struct drm_i915_gem_timeline_fence exec_fence[MAX_CTXTS][NUM_FENCES];
>>+	uint32_t shared_vm_id, vm_id[MAX_CTXTS];
>>+	struct mapping map[MAX_CTXTS][MAX_MAP];
>>+	uint32_t num_ctxts = cfg->num_ctxts;
>>+	uint32_t num_cmds = cfg->num_cmds;
>>+	uint32_t npages = size / PAGE_SIZE;
>>+	const intel_ctx_t *ctx[MAX_CTXTS];
>>+	bool share_vm = do_share_vm(cfg);
>>+	uint32_t delta, va_delta = SZ_2M;
>>+	void *batch_addr[MAX_CTXTS];
>>+	uint32_t batch[MAX_CTXTS];
>>+	uint64_t src_va, dst_va;
>>+
>>+	delta = size / num_ctxts;
>>+	if (share_vm)
>>+		shared_vm_id = gem_vm_create_in_vm_bind_mode(fd);
>>+
>>+	/* Create contexts */
>>+	num_ctxts = min_t(num_ctxts, MAX_CTXTS, num_ctxts);
>>+	for (i = 0; i < num_ctxts; i++) {
>>+		uint32_t vmid;
>>+
>>+		if (share_vm)
>>+			vmid = shared_vm_id;
>>+		else
>>+			vmid = gem_vm_create_in_vm_bind_mode(fd);
>>+
>>+		ctx[i] = intel_ctx_create(fd, &base_ctx->cfg);
>>+		gem_context_set_vm(fd, ctx[i]->id, vmid);
>>+		vm_id[i] = gem_context_get_vm(fd, ctx[i]->id);
>>+
>>+		exec_fence[i][EXEC_FENCE].handle = syncobj_create(fd, 0);
>>+		exec_fence[i][EXEC_FENCE].flags = I915_TIMELINE_FENCE_SIGNAL;
>>+		exec_fence[i][EXEC_FENCE].value = 0;
>>+	}
>>+
>>+	/* Create objects */
>>+	num_cmds = min_t(num_cmds, MAX_CMDS, num_cmds);
>>+	create_src_objs(fd, cfg, src, size, num_cmds, src_addr);
>>+	dst = create_dst_obj(fd, cfg, size, &dst_addr);
>>+
>>+	/*
>>+	 * mmap'ed addresses are not 64K aligned. On platforms requiring
>>+	 * 64K alignment, use static addresses.
>>+	 */
>>+	if (size < SZ_2M && num_cmds && !HAS_64K_PAGES(intel_get_drm_devid(fd))) {
>>+		src_va = to_user_pointer(src_addr[0]);
>>+		dst_va = to_user_pointer(dst_addr);
>>+	} else {
>>+		src_va = 0xa000000;
>>+		dst_va = 0xb000000;
>>+	}
>>+
>>+	pattern_fill_buf(src_addr, size, num_cmds, npages);
>>+
>>+	if (num_cmds)
>>+		print_buffer(src_addr[num_cmds - 1], size, "src_obj", false);
>>+
>>+	for (i = 0; i < num_ctxts; i++) {
>>+		batch[i] = gem_create_in_memory_regions(fd, PAGE_SIZE, REGION_SMEM);
>>+		igt_info("Batch obj 0x%x created in region 0x%x:0x%x\n", batch[i],
>>+			 MEMORY_TYPE_FROM_REGION(REGION_SMEM), MEMORY_INSTANCE_FROM_REGION(REGION_SMEM));
>>+		batch_addr[i] = gem_mmap__cpu(fd, batch[i], 0, PAGE_SIZE, PROT_WRITE);
>
>Hmm, are there any incoherent platforms (non-llc) that will be 
>entering vm_bind world? I guess nothing currently that is graphics 
>version >= 12? If something does come along we might see some surprises 
>here and elsewhere.
>

Yah, I too think it should be fine.

>IIRC the old model would be something like mmap__cpu(), 
>set_domain(CPU), do some CPU writes, and then later execbuf would see 
>that the object is dirty and would do any clflushes if required before 
>submit. Maybe the test needs to manually do all clflushing now with 
>new eb3 model? We can also sidestep for now by using __device() or so.
>

Yes, we need to take care of this if we are supporting vm_bind on older
platforms. Worst case, we need to loop over the vm_bind objects in eb3 and clflush.

>>+	}
>>+
>>+	/* Create mappings */
>>+	for (i = 0; i < num_ctxts; i++) {
>>+		uint64_t va_offset = i * va_delta;
>>+		uint64_t offset = i * delta;
>>+		uint32_t j;
>>+
>>+		for (j = 0; j < num_cmds; j++)
>>+			SET_MAP(map[i][SRC_MAP + j], src[j], src_va + va_offset, offset, delta, 0);
>>+		SET_MAP(map[i][DST_MAP], dst, dst_va + va_offset, offset, delta, 0);
>>+		SET_MAP(map[i][BATCH_MAP], batch[i], to_user_pointer(batch_addr[i]), 0, PAGE_SIZE, 0);
>>+	}
>>+
>>+	/* Bind the buffers to device page table */
>>+	for (i = 0; i < num_ctxts; i++) {
>>+		vm_bind(fd, vm_id[i], &map[i][BATCH_MAP], &exec_fence[i][BATCH_FENCE]);
>>+		vm_bind(fd, vm_id[i], &map[i][DST_MAP], &exec_fence[i][DST_FENCE]);
>>+	}
>>+
>>+	/* Have GPU do the copy */
>>+	for (i = 0; i < cfg->num_cmds; i++) {
>>+		uint32_t j;
>>+
>>+		for (j = 0; j < num_ctxts; j++)
>>+			vm_bind(fd, vm_id[j], &map[j][SRC_MAP + i], &exec_fence[j][SRC_FENCE]);
>>+
>>+		i915_gem_copy(fd, src_va, dst_va, va_delta, delta, size, ctx,
>>+			      num_ctxts, batch_addr, e->flags, exec_fence);
>>+
>>+		i915_gem_sync(fd, ctx, num_ctxts, exec_fence);
>>+
>>+		for (j = 0; j < num_ctxts; j++) {
>>+			syncobj_destroy(fd, exec_fence[j][SRC_FENCE].handle);
>>+			if (do_unbind(cfg))
>>+				vm_unbind(fd, vm_id[j], &map[j][SRC_MAP + i]);
>>+		}
>>+	}
>>+
>>+	/*
>>+	 * Unbind buffers from device page table.
>>+	 * If not, it should get unbound while freeing the buffer.
>>+	 */
>>+	for (i = 0; i < num_ctxts; i++) {
>>+		syncobj_destroy(fd, exec_fence[i][BATCH_FENCE].handle);
>>+		syncobj_destroy(fd, exec_fence[i][DST_FENCE].handle);
>>+		if (do_unbind(cfg)) {
>>+			vm_unbind(fd, vm_id[i], &map[i][BATCH_MAP]);
>>+			vm_unbind(fd, vm_id[i], &map[i][DST_MAP]);
>>+		}
>>+	}
>>+
>>+	/* Close batch buffers */
>>+	for (i = 0; i < num_ctxts; i++) {
>>+		syncobj_destroy(fd, exec_fence[i][EXEC_FENCE].handle);
>>+		gem_close(fd, batch[i]);
>>+	}
>>+
>>+	/* Accessing the buffer will migrate the pages from device to host */
>
>It shouldn't do that. Do you have some more info?
>

This comment is a relic from the days when we had system allocator support.
Will remove it.

>>+	print_buffer(dst_addr, size, "dst_obj", false);
>
>Normally we only want to dump the buffer if the test fails because 
>there is some mismatch when comparing the expected value(s)?
>

It only dumps at debug level; print_buffer() uses igt_debug().
I will rename this function to debug_dump_buffer().

>>+
>>+	/* Validate by comparing the last SRC with DST */
>>+	if (num_cmds)
>>+		igt_assert(memcmp(src_addr[num_cmds - 1], dst_addr, size) == 0);
>>+
>>+	/* Free the objects */
>>+	destroy_src_objs(fd, cfg, src, size, num_cmds, src_addr);
>>+	destroy_dst_obj(fd, cfg, dst, size, dst_addr);
>>+
>>+	/* Done with the contexts */
>>+	for (i = 0; i < num_ctxts; i++) {
>>+		igt_debug("Destroying context 0x%x\n", ctx[i]->id);
>>+		gem_vm_destroy(fd, vm_id[i]);
>>+		intel_ctx_destroy(fd, ctx[i]);
>>+	}
>>+
>>+	if (share_vm)
>>+		gem_vm_destroy(fd, shared_vm_id);
>>+}
>>+
>>+igt_main
>>+{
>>+	struct test_cfg *t, tests[] = {
>>+		{"basic", 0, 1, 1, 0},
>>+		{"multi_cmds", 0, MAX_CMDS, 1, 0},
>>+		{"skip_copy", 0, 0, 1, 0},
>>+		{"skip_unbind",  0, 1, 1, TEST_SKIP_UNBIND},
>>+		{"multi_ctxts", 0, 1, MAX_CTXTS, 0},
>>+		{"share_vm", 0, 1, MAX_CTXTS, TEST_SHARE_VM},
>>+		{"64K", (16 * PAGE_SIZE), 1, 1, 0},
>>+		{"2M", SZ_2M, 1, 1, 0},
>>+		{ }
>>+	};
>>+	int fd;
>>+	uint32_t def_size;
>>+	struct intel_execution_engine2 *e;
>>+	const intel_ctx_t *ctx;
>>+
>>+	igt_fixture {
>>+		fd = drm_open_driver(DRIVER_INTEL);
>>+		igt_require_gem(fd);
>>+		igt_require(i915_vm_bind_version(fd) == 1);
>>+		def_size = HAS_64K_PAGES(intel_get_drm_devid(fd)) ?
>>+			   SZ_64K : DEFAULT_BUFF_SIZE;
>
>Should be possible to cut over to gtt_alignment, if you feel like it. 
>For example just stick it in gem_memory_region.

ok

>
>>+		ctx = intel_ctx_create_all_physical(fd);
>>+		/* Get copy engine */
>>+		for_each_ctx_engine(fd, ctx, e)
>>+			if (e->class == I915_ENGINE_CLASS_COPY)
>>+				break;
>
>Maybe igt_require/assert() at least one copy engine? Dunno.
>

ok

>>+	}
>>+
>>+	/* Adjust test variables */
>>+	for (t = tests; t->name; t++)
>>+		t->size = t->size ? : (def_size * abs(t->num_ctxts));
>
>num_ctx can be negative?
>

No, it is a relic from an older design. I have restructured the code
here and we will no longer have this loop.

>>+
>>+	for (t = tests; t->name; t++) {
>>+		igt_describe_f("VM_BIND %s", t->name);
>>+		igt_subtest_with_dynamic(t->name) {
>>+			for_each_memory_region(r, fd) {
>>+				struct test_cfg cfg = *t;
>>+
>>+				if (r->ci.memory_instance)
>>+					continue;
>>+
>>+				if (r->ci.memory_class == I915_MEMORY_CLASS_DEVICE)
>>+					cfg.flags |= TEST_LMEM;
>>+
>>+				igt_dynamic_f("%s", r->name)
>>+					run_test(fd, ctx, &cfg, e);
>
>Normally we just pass along the gem_memory_region. Can then fish out 
>useful stuff like gtt_alignment, name, size etc.
>

ok

Thanks,
Niranjana

>>+			}
>>+		}
>>+	}
>>+
>>+	igt_fixture {
>>+		intel_ctx_destroy(fd, ctx);
>>+		close(fd);
>>+	}
>>+
>>+	igt_exit();
>>+}
>>diff --git a/tests/intel-ci/fast-feedback.testlist b/tests/intel-ci/fast-feedback.testlist
>>index 185c2fef54..8a47eaddda 100644
>>--- a/tests/intel-ci/fast-feedback.testlist
>>+++ b/tests/intel-ci/fast-feedback.testlist
>>@@ -53,6 +53,7 @@ igt@i915_getparams_basic@basic-eu-total
>>  igt@i915_getparams_basic@basic-subslice-total
>>  igt@i915_hangman@error-state-basic
>>  igt@i915_pciid
>>+igt@i915_vm_bind_basic@basic-smem
>>  igt@i915_vm_bind_sanity@basic
>>  igt@kms_addfb_basic@addfb25-bad-modifier
>>  igt@kms_addfb_basic@addfb25-framebuffer-vs-set-tiling
>>diff --git a/tests/meson.build b/tests/meson.build
>>index d81c6c28c2..1b9529a7e2 100644
>>--- a/tests/meson.build
>>+++ b/tests/meson.build
>>@@ -251,6 +251,7 @@ i915_progs = [
>>  	'sysfs_preempt_timeout',
>>  	'sysfs_timeslice_duration',
>>  	'i915_vm_bind_sanity',
>>+	'i915_vm_bind_basic',
>
>Keep it sorted?
>
>>  ]
>>  msm_progs = [

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v4 10/11] tests/i915/vm_bind: Add gem_exec3_balancer test
  2022-10-24 15:26         ` Matthew Auld
@ 2022-10-25  7:08           ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-25  7:08 UTC (permalink / raw)
  To: Matthew Auld
  Cc: igt-dev, daniel.vetter, thomas.hellstrom, petri.latvala, tvrtko.ursulin

On Mon, Oct 24, 2022 at 04:26:38PM +0100, Matthew Auld wrote:
>On 21/10/2022 21:47, Niranjana Vishwanathapura wrote:
>>On Fri, Oct 21, 2022 at 01:11:28PM -0700, Niranjana Vishwanathapura wrote:
>>>On Fri, Oct 21, 2022 at 05:42:55PM +0100, Matthew Auld wrote:
>>>>On 18/10/2022 08:17, Niranjana Vishwanathapura wrote:
>>>>>To test parallel submissions support in execbuf3, port the subtest
>>>>>gem_exec_balancer@parallel-ordering to a new gem_exec3_balancer test
>>>>>and switch to execbuf3 ioctl.
>>>>>
>>>>>v2: use i915_vm_bind library functions
>>>>>
>>>>>Signed-off-by: Niranjana Vishwanathapura 
>>>>><niranjana.vishwanathapura@intel.com>
>>>>>---
>>>>>tests/i915/gem_exec3_balancer.c | 473 ++++++++++++++++++++++++++++++++
>>>>>tests/meson.build               |   7 +
>>>>>2 files changed, 480 insertions(+)
>>>>>create mode 100644 tests/i915/gem_exec3_balancer.c
>>>>>
>>>>>diff --git a/tests/i915/gem_exec3_balancer.c b/tests/i915/gem_exec3_balancer.c
>>>>>new file mode 100644
>>>>>index 0000000000..34b2a6a0b3
>>>>>--- /dev/null
>>>>>+++ b/tests/i915/gem_exec3_balancer.c
>>>>>@@ -0,0 +1,473 @@
>>>>>+// SPDX-License-Identifier: MIT
>>>>>+/*
>>>>>+ * Copyright © 2022 Intel Corporation
>>>>>+ */
>>>>>+
>>>>>+/** @file gem_exec3_balancer.c
>>>>>+ *
>>>>>+ * Load balancer tests with execbuf3.
>>>>>+ * Ported from gem_exec_balancer and made to work with
>>>>>+ * vm_bind and execbuf3.
>>>>>+ *
>>>>>+ */
>>>>>+
>>>>>+#include <poll.h>
>>>>>+
>>>>>+#include "i915/gem.h"
>>>>>+#include "i915/i915_vm_bind.h"
>>>>>+#include "i915/gem_engine_topology.h"
>>>>>+#include "i915/gem_create.h"
>>>>>+#include "i915/gem_vm.h"
>>>>>+#include "igt.h"
>>>>>+#include "igt_gt.h"
>>>>>+#include "igt_perf.h"
>>>>>+#include "igt_syncobj.h"
>>>>>+
>>>>>+IGT_TEST_DESCRIPTION("Exercise in-kernel load-balancing with execbuf3");
>>>>>+
>>>>>+#define INSTANCE_COUNT (1 << I915_PMU_SAMPLE_INSTANCE_BITS)
>>>>>+
>>>>>+static bool has_class_instance(int i915, uint16_t class, uint16_t instance)
>>>>>+{
>>>>>+    int fd;
>>>>>+
>>>>>+    fd = perf_i915_open(i915, I915_PMU_ENGINE_BUSY(class, instance));
>>>>>+    if (fd >= 0) {
>>>>>+        close(fd);
>>>>>+        return true;
>>>>>+    }
>>>>>+
>>>>>+    return false;
>>>>>+}
>>>>>+
>>>>>+static struct i915_engine_class_instance *
>>>>>+list_engines(int i915, uint32_t class_mask, unsigned int *out)
>>>>>+{
>>>>>+    unsigned int count = 0, size = 64;
>>>>>+    struct i915_engine_class_instance *engines;
>>>>>+
>>>>>+    engines = malloc(size * sizeof(*engines));
>>>>>+    igt_assert(engines);
>>>>>+
>>>>>+    for (enum drm_i915_gem_engine_class class = I915_ENGINE_CLASS_RENDER;
>>>>>+         class_mask;
>>>>>+         class++, class_mask >>= 1) {
>>>>>+        if (!(class_mask & 1))
>>>>>+            continue;
>>>>>+
>>>>>+        for (unsigned int instance = 0;
>>>>>+             instance < INSTANCE_COUNT;
>>>>>+             instance++) {
>>>>>+            if (!has_class_instance(i915, class, instance))
>>>>>+                continue;
>>>>>+
>>>>>+            if (count == size) {
>>>>>+                size *= 2;
>>>>>+                engines = realloc(engines,
>>>>>+                          size * sizeof(*engines));
>>>>>+                igt_assert(engines);
>>>>>+            }
>>>>>+
>>>>>+            engines[count++] = (struct i915_engine_class_instance){
>>>>>+                .engine_class = class,
>>>>>+                .engine_instance = instance,
>>>>>+            };
>>>>>+        }
>>>>>+    }
>>>>>+
>>>>>+    if (!count) {
>>>>>+        free(engines);
>>>>>+        engines = NULL;
>>>>>+    }
>>>>>+
>>>>>+    *out = count;
>>>>>+    return engines;
>>>>>+}
>>>>>+
>>>>>+static bool has_perf_engines(int i915)
>>>>>+{
>>>>>+    return i915_perf_type_id(i915);
>>>>>+}
>>>>>+
>>>>>+static intel_ctx_cfg_t
>>>>>+ctx_cfg_for_engines(const struct i915_engine_class_instance *ci,
>>>>>+            unsigned int count)
>>>>>+{
>>>>>+    intel_ctx_cfg_t cfg = { };
>>>>>+    unsigned int i;
>>>>>+
>>>>>+    for (i = 0; i < count; i++)
>>>>>+        cfg.engines[i] = ci[i];
>>>>>+    cfg.num_engines = count;
>>>>>+
>>>>>+    return cfg;
>>>>>+}
>>>>>+
>>>>>+static const intel_ctx_t *
>>>>>+ctx_create_engines(int i915, const struct i915_engine_class_instance *ci,
>>>>>+           unsigned int count)
>>>>>+{
>>>>>+    intel_ctx_cfg_t cfg = ctx_cfg_for_engines(ci, count);
>>>>>+    return intel_ctx_create(i915, &cfg);
>>>>>+}
>>>>>+
>>>>>+static void check_bo(int i915, uint32_t handle, unsigned int expected,
>>>>>+             bool wait)
>>>>>+{
>>>>>+    uint32_t *map;
>>>>>+
>>>>>+    map = gem_mmap__cpu(i915, handle, 0, 4096, PROT_READ);
>>>>>+    if (wait)
>>>>>+        gem_set_domain(i915, handle, I915_GEM_DOMAIN_CPU,
>>>>>+                   I915_GEM_DOMAIN_CPU);
>>>>
>>>>Hmm, does the wait still work as expected in set_domain() or 
>>>>does that ignore BOOKKEEP, in which case this doesn't do 
>>>>anything with eb3? Perhaps can be deleted? Also slightly worried 
>>>>about the dirty tracking, and whether this test cares...
>>>
>>>I think we still need this. gem_set_domain() correctly waits only on
>>>implicit dependencies and hence will ignore BOOKKEEP fences. As there
>>>is no implicit dependency tracking with VM_BIND, the user has to do an
>>>explicit wait. This test properly waits on the execution fence before
>>>calling this function, so I think we should be good.
>
>And in this test there should be no implicit dependencies, right?

Yes, there are none.

>
>>>
>>
>>So, probably instead of naming it 'wait', we should call it 'flush'
>>or something, since it calls gem_set_domain()?
>
>It doesn't look like it flushes anything either here, AFAICT. But 
>looking again, it doesn't seem like the test cares about check_bo() 
>performing a flush. So it should be fine to just drop the 
>set_domain()?
>

ok, will remove set_domain() here.

Niranjana

>>
>>Niranjana
>>
>>>Regards,
>>>Niranjana
>>>
>>>>
>>>>>+    igt_assert_eq(map[0], expected);
>>>>>+    munmap(map, 4096);
>>>>>+}
>>>>>+
>>>>>+static struct drm_i915_query_engine_info *query_engine_info(int i915)
>>>>>+{
>>>>>+    struct drm_i915_query_engine_info *engines;
>>>>>+
>>>>>+#define QUERY_SIZE    0x4000
>>>>>+    engines = malloc(QUERY_SIZE);
>>>>>+    igt_assert(engines);
>>>>>+    memset(engines, 0, QUERY_SIZE);
>>>>>+    igt_assert(!__gem_query_engines(i915, engines, QUERY_SIZE));
>>>>>+#undef QUERY_SIZE
>>>>>+
>>>>>+    return engines;
>>>>>+}
>>>>>+
>>>>>+/* This function only works if siblings contains all instances of a class */
>>>>>+static void logical_sort_siblings(int i915,
>>>>>+                  struct i915_engine_class_instance *siblings,
>>>>>+                  unsigned int count)
>>>>>+{
>>>>>+    struct i915_engine_class_instance *sorted;
>>>>>+    struct drm_i915_query_engine_info *engines;
>>>>>+    unsigned int i, j;
>>>>>+
>>>>>+    sorted = calloc(count, sizeof(*sorted));
>>>>>+    igt_assert(sorted);
>>>>>+
>>>>>+    engines = query_engine_info(i915);
>>>>>+
>>>>>+    for (j = 0; j < count; ++j) {
>>>>>+        for (i = 0; i < engines->num_engines; ++i) {
>>>>>+            if (siblings[j].engine_class ==
>>>>>+                engines->engines[i].engine.engine_class &&
>>>>>+                siblings[j].engine_instance ==
>>>>>+                engines->engines[i].engine.engine_instance) {
>>>>>+                uint16_t logical_instance =
>>>>>+                    engines->engines[i].logical_instance;
>>>>>+
>>>>>+                igt_assert(logical_instance < count);
>>>>>+                igt_assert(!sorted[logical_instance].engine_class);
>>>>>+                igt_assert(!sorted[logical_instance].engine_instance);
>>>>>+
>>>>>+                sorted[logical_instance] = siblings[j];
>>>>>+                break;
>>>>>+            }
>>>>>+        }
>>>>>+        igt_assert(i != engines->num_engines);
>>>>>+    }
>>>>>+
>>>>>+    memcpy(siblings, sorted, sizeof(*sorted) * count);
>>>>>+    free(sorted);
>>>>>+    free(engines);
>>>>>+}
>>>>>+
>>>>>+static bool fence_busy(int fence)
>>>>>+{
>>>>>+    return poll(&(struct pollfd){fence, POLLIN}, 1, 0) == 0;
>>>>>+}
>>>>>+
>>>>>+/*
>>>>>+ * Always reading from engine instance 0; with GuC submission the values are
>>>>>+ * the same across all instances. With execlists they may differ, but that is
>>>>>+ * quite unlikely, and even if they do we can live with it.
>>>>>+ */
>>>>>+static unsigned int get_timeslice(int i915,
>>>>>+                  struct i915_engine_class_instance engine)
>>>>>+{
>>>>>+    unsigned int val;
>>>>>+
>>>>>+    switch (engine.engine_class) {
>>>>>+    case I915_ENGINE_CLASS_RENDER:
>>>>>+        gem_engine_property_scanf(i915, "rcs0", "timeslice_duration_ms",
>>>>>+                      "%d", &val);
>>>>>+        break;
>>>>>+    case I915_ENGINE_CLASS_COPY:
>>>>>+        gem_engine_property_scanf(i915, "bcs0", "timeslice_duration_ms",
>>>>>+                      "%d", &val);
>>>>>+        break;
>>>>>+    case I915_ENGINE_CLASS_VIDEO:
>>>>>+        gem_engine_property_scanf(i915, "vcs0", "timeslice_duration_ms",
>>>>>+                      "%d", &val);
>>>>>+        break;
>>>>>+    case I915_ENGINE_CLASS_VIDEO_ENHANCE:
>>>>>+        gem_engine_property_scanf(i915, "vecs0", "timeslice_duration_ms",
>>>>>+                      "%d", &val);
>>>>>+        break;
>>>>>+    }
>>>>>+
>>>>>+    return val;
>>>>>+}
>>>>>+
>>>>>+static uint64_t gettime_ns(void)
>>>>>+{
>>>>>+    struct timespec current;
>>>>>+    clock_gettime(CLOCK_MONOTONIC, &current);
>>>>>+    return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
>>>>>+}
>>>>>+
>>>>>+static bool syncobj_busy(int i915, uint32_t handle)
>>>>>+{
>>>>>+    bool result;
>>>>>+    int sf;
>>>>>+
>>>>>+    sf = syncobj_handle_to_fd(i915, handle,
>>>>>+                  DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE);
>>>>>+    result = poll(&(struct pollfd){sf, POLLIN}, 1, 0) == 0;
>>>>>+    close(sf);
>>>>>+
>>>>>+    return result;
>>>>>+}
>>>>>+
>>>>>+/*
>>>>>+ * Ensure a parallel submit actually runs on HW in parallel by putting a
>>>>>+ * spinner on one engine, doing a parallel submit, and checking that the
>>>>>+ * parallel submit is blocked behind the spinner.
>>>>>+ */
>>>>>+static void parallel_ordering(int i915, unsigned int flags)
>>>>>+{
>>>>>+    uint32_t vm_id;
>>>>>+    int class;
>>>>>+
>>>>>+    vm_id = gem_vm_create_in_vm_bind_mode(i915);
>>>>>+
>>>>>+    for (class = 0; class < 32; class++) {
>>>>>+        struct drm_i915_gem_timeline_fence exec_fence[32] = { };
>>>>>+        const intel_ctx_t *ctx = NULL, *spin_ctx = NULL;
>>>>>+        uint64_t fence_value = 0, batch_addr[32] = { };
>>>>>+        struct i915_engine_class_instance *siblings;
>>>>>+        uint32_t exec_syncobj, bind_syncobj[32];
>>>>>+        struct drm_i915_gem_execbuffer3 execbuf;
>>>>>+        uint32_t batch[16], obj[32];
>>>>>+        intel_ctx_cfg_t cfg;
>>>>>+        unsigned int count;
>>>>>+        igt_spin_t *spin;
>>>>>+        uint64_t ahnd;
>>>>>+        int i = 0;
>>>>>+
>>>>>+        siblings = list_engines(i915, 1u << class, &count);
>>>>>+        if (!siblings)
>>>>>+            continue;
>>>>>+
>>>>>+        if (count < 2) {
>>>>>+            free(siblings);
>>>>>+            continue;
>>>>>+        }
>>>>>+
>>>>>+        logical_sort_siblings(i915, siblings, count);
>>>>>+
>>>>>+        memset(&cfg, 0, sizeof(cfg));
>>>>>+        cfg.parallel = true;
>>>>>+        cfg.num_engines = 1;
>>>>>+        cfg.width = count;
>>>>>+        memcpy(cfg.engines, siblings, sizeof(*siblings) * count);
>>>>>+
>>>>>+        if (__intel_ctx_create(i915, &cfg, &ctx)) {
>>>>>+            free(siblings);
>>>>>+            continue;
>>>>>+        }
>>>>>+        gem_context_set_vm(i915, ctx->id, vm_id);
>>>>>+
>>>>>+        batch[i] = MI_ATOMIC | MI_ATOMIC_INC;
>>>>>+#define TARGET_BO_OFFSET    (0x1 << 16)
>>>>>+        batch[++i] = TARGET_BO_OFFSET;
>>>>>+        batch[++i] = 0;
>>>>>+        batch[++i] = MI_BATCH_BUFFER_END;
>>>>>+
>>>>>+        obj[0] = gem_create(i915, 4096);
>>>>>+        bind_syncobj[0] = syncobj_create(i915, 0);
>>>>>+        exec_fence[0].handle = bind_syncobj[0];
>>>>>+        exec_fence[0].flags = I915_TIMELINE_FENCE_WAIT;
>>>>>+        i915_vm_bind(i915, vm_id, TARGET_BO_OFFSET, obj[0], 0, 4096, bind_syncobj[0], 0);
>>>>>+
>>>>>+        for (i = 1; i < count + 1; ++i) {
>>>>>+            obj[i] = gem_create(i915, 4096);
>>>>>+            gem_write(i915, obj[i], 0, batch, sizeof(batch));
>>>>>+
>>>>>+            batch_addr[i - 1] = TARGET_BO_OFFSET * (i + 1);
>>>>>+            bind_syncobj[i] = syncobj_create(i915, 0);
>>>>>+            exec_fence[i].handle = bind_syncobj[i];
>>>>>+            exec_fence[i].flags = I915_TIMELINE_FENCE_WAIT;
>>>>>+            i915_vm_bind(i915, vm_id, batch_addr[i - 1], obj[i], 0, 4096, bind_syncobj[i], 0);
>>>>>+        }
>>>>>+
>>>>>+        exec_syncobj = syncobj_create(i915, 0);
>>>>>+        exec_fence[i].handle = exec_syncobj;
>>>>>+        exec_fence[i].flags = I915_TIMELINE_FENCE_SIGNAL;
>>>>>+
>>>>>+        memset(&execbuf, 0, sizeof(execbuf));
>>>>>+        execbuf.ctx_id = ctx->id,
>>>>>+        execbuf.batch_address = to_user_pointer(batch_addr),
>>>>>+        execbuf.fence_count = count + 2,
>>>>>+        execbuf.timeline_fences = to_user_pointer(exec_fence),
>>>>>+
>>>>>+        /* Block parallel submission */
>>>>>+        spin_ctx = ctx_create_engines(i915, siblings, count);
>>>>>+        ahnd = get_simple_ahnd(i915, spin_ctx->id);
>>>>>+        spin = __igt_spin_new(i915,
>>>>>+                      .ahnd = ahnd,
>>>>>+                      .ctx = spin_ctx,
>>>>>+                      .engine = 0,
>>>>>+                      .flags = IGT_SPIN_FENCE_OUT |
>>>>>+                      IGT_SPIN_NO_PREEMPTION);
>>>>>+
>>>>>+        /* Wait for spinners to start */
>>>>>+        usleep(5 * 10000);
>>>>>+        igt_assert(fence_busy(spin->out_fence));
>>>>>+
>>>>>+        /* Submit parallel execbuf */
>>>>>+        gem_execbuf3(i915, &execbuf);
>>>>>+
>>>>>+        /*
>>>>>+         * Wait long enough for timeslicing to kick in but not
>>>>>+         * preemption. Spinner + parallel execbuf should be
>>>>>+         * active. Assuming default timeslice / preemption values, if
>>>>>+         * these are changed it is possible for the test to fail.
>>>>>+         */
>>>>>+        usleep(get_timeslice(i915, siblings[0]) * 2);
>>>>>+        igt_assert(fence_busy(spin->out_fence));
>>>>>+        igt_assert(syncobj_busy(i915, exec_syncobj));
>>>>>+        check_bo(i915, obj[0], 0, false);
>>>>>+
>>>>>+        /*
>>>>>+         * End spinner and wait for spinner + parallel execbuf
>>>>>+         * to complete.
>>>>>+         */
>>>>>+        igt_spin_end(spin);
>>>>>+        igt_assert(syncobj_timeline_wait(i915, &exec_syncobj, &fence_value, 1,
>>>>>+                         gettime_ns() + (2 * NSEC_PER_SEC),
>>>>>+                         DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
>>>>>+        igt_assert(!syncobj_busy(i915, exec_syncobj));
>>>>>+        syncobj_destroy(i915, exec_syncobj);
>>>>>+        for (i = 0; i < count + 1; ++i)
>>>>>+            syncobj_destroy(i915, bind_syncobj[i]);
>>>>>+        check_bo(i915, obj[0], count, true);
>>>>>+
>>>>>+        /* Clean up */
>>>>>+        intel_ctx_destroy(i915, ctx);
>>>>>+        intel_ctx_destroy(i915, spin_ctx);
>>>>>+        i915_vm_unbind(i915, vm_id, TARGET_BO_OFFSET, 4096);
>>>>>+        for (i = 1; i < count + 1; ++i)
>>>>>+            i915_vm_unbind(i915, vm_id, batch_addr[i - 1], 4096);
>>>>>+
>>>>>+        for (i = 0; i < count + 1; ++i)
>>>>>+            gem_close(i915, obj[i]);
>>>>>+        free(siblings);
>>>>>+        igt_spin_free(i915, spin);
>>>>>+        put_ahnd(ahnd);
>>>>>+    }
>>>>>+
>>>>>+    gem_vm_destroy(i915, vm_id);
>>>>>+}
>>>>>+
>>>>>+static bool has_load_balancer(int i915)
>>>>>+{
>>>>>+    const intel_ctx_cfg_t cfg = {
>>>>>+        .load_balance = true,
>>>>>+        .num_engines = 1,
>>>>>+    };
>>>>>+    const intel_ctx_t *ctx = NULL;
>>>>>+    int err;
>>>>>+
>>>>>+    err = __intel_ctx_create(i915, &cfg, &ctx);
>>>>>+    intel_ctx_destroy(i915, ctx);
>>>>>+
>>>>>+    return err == 0;
>>>>>+}
>>>>>+
>>>>>+static bool has_logical_mapping(int i915)
>>>>>+{
>>>>>+    struct drm_i915_query_engine_info *engines;
>>>>>+    unsigned int i;
>>>>>+
>>>>>+    engines = query_engine_info(i915);
>>>>>+
>>>>>+    for (i = 0; i < engines->num_engines; ++i)
>>>>>+        if (!(engines->engines[i].flags &
>>>>>+             I915_ENGINE_INFO_HAS_LOGICAL_INSTANCE)) {
>>>>>+            free(engines);
>>>>>+            return false;
>>>>>+        }
>>>>>+
>>>>>+    free(engines);
>>>>>+    return true;
>>>>>+}
>>>>>+
>>>>>+static bool has_parallel_execbuf(int i915)
>>>>>+{
>>>>>+    intel_ctx_cfg_t cfg = {
>>>>>+        .parallel = true,
>>>>>+        .num_engines = 1,
>>>>>+    };
>>>>>+    const intel_ctx_t *ctx = NULL;
>>>>>+    int err;
>>>>>+
>>>>>+    for (int class = 0; class < 32; class++) {
>>>>>+        struct i915_engine_class_instance *siblings;
>>>>>+        unsigned int count;
>>>>>+
>>>>>+        siblings = list_engines(i915, 1u << class, &count);
>>>>>+        if (!siblings)
>>>>>+            continue;
>>>>>+
>>>>>+        if (count < 2) {
>>>>>+            free(siblings);
>>>>>+            continue;
>>>>>+        }
>>>>>+
>>>>>+        logical_sort_siblings(i915, siblings, count);
>>>>>+
>>>>>+        cfg.width = count;
>>>>>+        memcpy(cfg.engines, siblings, sizeof(*siblings) * count);
>>>>>+        free(siblings);
>>>>>+
>>>>>+        err = __intel_ctx_create(i915, &cfg, &ctx);
>>>>>+        intel_ctx_destroy(i915, ctx);
>>>>>+
>>>>>+        return err == 0;
>>>>>+    }
>>>>>+
>>>>>+    return false;
>>>>>+}
>>>>>+
>>>>>+igt_main
>>>>>+{
>>>>>+    int i915 = -1;
>>>>>+
>>>>>+    igt_fixture {
>>>>>+        i915 = drm_open_driver(DRIVER_INTEL);
>>>>>+        igt_require_gem(i915);
>>>>>+
>>>>>+        gem_require_contexts(i915);
>>>>>+        igt_require(i915_vm_bind_version(i915) == 1);
>>>>>+        igt_require(gem_has_engine_topology(i915));
>>>>>+        igt_require(has_load_balancer(i915));
>>>>>+        igt_require(has_perf_engines(i915));
>>>>>+    }
>>>>>+
>>>>>+    igt_subtest_group {
>>>>>+        igt_fixture {
>>>>>+            igt_require(has_logical_mapping(i915));
>>>>>+            igt_require(has_parallel_execbuf(i915));
>>>>>+        }
>>>>>+
>>>>>+        igt_describe("Ensure a parallel submit actually runs in parallel");
>>>>>+        igt_subtest("parallel-ordering")
>>>>>+            parallel_ordering(i915, 0);
>>>>>+    }
>>>>>+}
>>>>>diff --git a/tests/meson.build b/tests/meson.build
>>>>>index 50de8b7e89..6ca5a94282 100644
>>>>>--- a/tests/meson.build
>>>>>+++ b/tests/meson.build
>>>>>@@ -375,6 +375,13 @@ test_executables += executable('gem_exec_balancer', 'i915/gem_exec_balancer.c',
>>>>>       install : true)
>>>>>test_list += 'gem_exec_balancer'
>>>>>+test_executables += executable('gem_exec3_balancer', 'i915/gem_exec3_balancer.c',
>>>>>+       dependencies : test_deps + [ lib_igt_perf ],
>>>>>+       install_dir : libexecdir,
>>>>>+       install_rpath : libexecdir_rpathdir,
>>>>>+       install : true)
>>>>>+test_list += 'gem_exec3_balancer'
>>>>>+
>>>>>test_executables += executable('gem_mmap_offset',
>>>>>       join_paths('i915', 'gem_mmap_offset.c'),
>>>>>       dependencies : test_deps + [ libatomic ],

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v4 11/11] tests/i915/vm_bind: Add gem_lmem_swapping@vm_bind sub test
  2022-10-21 17:20   ` Matthew Auld
@ 2022-10-25  7:10     ` Niranjana Vishwanathapura
  0 siblings, 0 replies; 28+ messages in thread
From: Niranjana Vishwanathapura @ 2022-10-25  7:10 UTC (permalink / raw)
  To: Matthew Auld
  Cc: tvrtko.ursulin, igt-dev, thomas.hellstrom, daniel.vetter, petri.latvala

On Fri, Oct 21, 2022 at 06:20:46PM +0100, Matthew Auld wrote:
>On 18/10/2022 08:17, Niranjana Vishwanathapura wrote:
>>Validate eviction of objects with persistent mappings and
>>check persistent mappings are properly rebound upon subsequent
>>execbuf3 call.
>>
>>TODO: Add a new swapping test with just vm_bind mode
>>       (without legacy execbuf).
>
>So is this using the eb2 path to generate some memory pressure and 
>hopefully evict some of our vm_binds? I guess so long as we trigger 
>the rebinding.
>

Yes, I have verified that vm_bind objects are evicted and rebound
when needed.

>>
>>v2: use i915_vm_bind library functions
>>
>>Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>
>This test is failing on DG2 btw:
>
>Starting dynamic subtest: lmem0
>Memory: system-total 19947MiB, lmem-region 4048MiB, usage-limit 21002MiB
>Using 1 thread(s), 6072 loop(s), 6072 objects of 1 MiB - 1 MiB, seed: 1666372356, oom: no
>(gem_lmem_swapping:1666) ioctl_wrappers-CRITICAL: Test assertion failure function gem_vm_bind, file ../lib/ioctl_wrappers.c:1412:
>(gem_lmem_swapping:1666) ioctl_wrappers-CRITICAL: Failed assertion: __gem_vm_bind(fd, bind) == 0
>(gem_lmem_swapping:1666) ioctl_wrappers-CRITICAL: Last errno: 22, Invalid argument
>(gem_lmem_swapping:1666) ioctl_wrappers-CRITICAL: error: -22 != 0
>

Hmm...I thought with PS64 support, we wouldn't need 64K aligned sizes anymore.
But the i915 driver still needs them. Will fix this test.

Niranjana

>>---
>>  tests/i915/gem_lmem_swapping.c | 128 ++++++++++++++++++++++++++++++++-
>>  1 file changed, 127 insertions(+), 1 deletion(-)
>>
>>diff --git a/tests/i915/gem_lmem_swapping.c b/tests/i915/gem_lmem_swapping.c
>>index cccdb3195b..017833775c 100644
>>--- a/tests/i915/gem_lmem_swapping.c
>>+++ b/tests/i915/gem_lmem_swapping.c
>>@@ -3,12 +3,16 @@
>>   * Copyright © 2021 Intel Corporation
>>   */
>>+#include <poll.h>
>>+
>>  #include "i915/gem.h"
>>  #include "i915/gem_create.h"
>>  #include "i915/gem_vm.h"
>>  #include "i915/intel_memory_region.h"
>>+#include "i915/i915_vm_bind.h"
>>  #include "igt.h"
>>  #include "igt_kmod.h"
>>+#include "igt_syncobj.h"
>>  #include <unistd.h>
>>  #include <stdlib.h>
>>  #include <stdint.h>
>>@@ -54,6 +58,7 @@ struct params {
>>  		uint64_t max;
>>  	} size;
>>  	unsigned int count;
>>+	unsigned int vm_bind_count;
>>  	unsigned int loops;
>>  	unsigned int mem_limit;
>>  #define TEST_VERIFY	(1 << 0)
>>@@ -64,15 +69,18 @@ struct params {
>>  #define TEST_MULTI	(1 << 5)
>>  #define TEST_CCS	(1 << 6)
>>  #define TEST_MASSIVE	(1 << 7)
>>+#define TEST_VM_BIND	(1 << 8)
>>  	unsigned int flags;
>>  	unsigned int seed;
>>  	bool oom_test;
>>+	uint64_t va;
>>  };
>>  struct object {
>>  	uint64_t size;
>>  	uint32_t seed;
>>  	uint32_t handle;
>>+	uint64_t va;
>>  	struct blt_copy_object *blt_obj;
>>  };
>>@@ -278,6 +286,86 @@ verify_object_ccs(int i915, const struct object *obj,
>>  	free(cmd);
>>  }
>>+#define BATCH_VA       0xa00000
>>+
>>+static uint64_t gettime_ns(void)
>>+{
>>+	struct timespec current;
>>+	clock_gettime(CLOCK_MONOTONIC, &current);
>>+	return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
>>+}
>>+
>>+static void vm_bind(int i915, struct object *list, unsigned int num, uint64_t *va,
>>+		    uint32_t batch, uint64_t batch_size, uint32_t vm_id)
>>+{
>>+	uint32_t *bind_syncobj;
>>+	uint64_t *fence_value;
>>+	unsigned int i;
>>+
>>+	bind_syncobj = calloc(num + 1, sizeof(*bind_syncobj));
>>+	igt_assert(bind_syncobj);
>>+
>>+	fence_value = calloc(num + 1, sizeof(*fence_value));
>>+	igt_assert(fence_value);
>>+
>>+	for (i = 0; i < num; i++) {
>>+		list[i].va = *va;
>>+		bind_syncobj[i] = syncobj_create(i915, 0);
>>+		i915_vm_bind(i915, vm_id, *va, list[i].handle, 0, list[i].size, bind_syncobj[i], 0);
>>+		*va += list[i].size;
>>+	}
>>+	bind_syncobj[i] = syncobj_create(i915, 0);
>>+	i915_vm_bind(i915, vm_id, BATCH_VA, batch, 0, batch_size, bind_syncobj[i], 0);
>>+
>>+	igt_assert(syncobj_timeline_wait(i915, bind_syncobj, fence_value, num + 1,
>>+					 gettime_ns() + (2 * NSEC_PER_SEC),
>>+					 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
>>+	for (i = 0; i <= num; i++)
>>+		syncobj_destroy(i915, bind_syncobj[i]);
>>+}
>>+
>>+static void vm_unbind(int i915, struct object *list, unsigned int num,
>>+		      uint64_t batch_size, uint32_t vm_id)
>>+{
>>+	unsigned int i;
>>+
>>+	i915_vm_unbind(i915, vm_id, BATCH_VA, batch_size);
>>+	for (i = 0; i < num; i++)
>>+		i915_vm_unbind(i915, vm_id, list[i].va, list[i].size);
>>+}
>>+
>>+static void move_to_lmem_execbuf3(int i915,
>>+				  const intel_ctx_t *ctx,
>>+				  unsigned int engine,
>>+				  bool do_oom_test)
>>+{
>>+	uint32_t exec_syncobj = syncobj_create(i915, 0);
>>+	struct drm_i915_gem_timeline_fence exec_fence = {
>>+		.handle = exec_syncobj,
>>+		.flags = I915_TIMELINE_FENCE_SIGNAL
>>+	};
>>+	struct drm_i915_gem_execbuffer3 eb = {
>>+		.ctx_id = ctx->id,
>>+		.batch_address = BATCH_VA,
>>+		.engine_idx = engine,
>>+		.fence_count = 1,
>>+		.timeline_fences = to_user_pointer(&exec_fence),
>>+	};
>>+	uint64_t fence_value = 0;
>>+	int ret;
>>+
>>+retry:
>>+	ret = __gem_execbuf3(i915, &eb);
>>+	if (do_oom_test && (ret == -ENOMEM || ret == -ENXIO))
>>+		goto retry;
>>+	igt_assert_eq(ret, 0);
>>+
>>+	igt_assert(syncobj_timeline_wait(i915, &exec_syncobj, &fence_value, 1,
>>+					 gettime_ns() + (2 * NSEC_PER_SEC),
>>+					 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
>>+	syncobj_destroy(i915, exec_syncobj);
>>+}
>>+
>>  static void move_to_lmem(int i915,
>>  			 const intel_ctx_t *ctx,
>>  			 struct object *list,
>>@@ -325,14 +413,16 @@ static void __do_evict(int i915,
>>  	uint32_t region_id = INTEL_MEMORY_REGION_ID(region->memory_class,
>>  						    region->memory_instance);
>>  	const unsigned int max_swap_in = params->count / 100 + 1;
>>+	uint64_t size, ahnd, batch_size = 4096;
>>  	struct object *objects, *obj, *list;
>>  	const uint32_t bpp = 32;
>>  	uint32_t width, height, stride;
>>+	const intel_ctx_t *vm_bind_ctx;
>>  	const intel_ctx_t *blt_ctx;
>>  	struct blt_copy_object *tmp;
>>  	unsigned int engine = 0;
>>+	uint32_t batch, vm_id;
>>  	unsigned int i, l;
>>-	uint64_t size, ahnd;
>>  	struct timespec t = {};
>>  	unsigned int num;
>>@@ -416,6 +506,20 @@ static void __do_evict(int i915,
>>  		  readable_size(params->size.max), readable_unit(params->size.max),
>>  		  params->count, seed);
>>+	/* VM_BIND the specified subset of objects (as persistent mappings) */
>>+	if (params->flags & TEST_VM_BIND) {
>>+		const uint32_t bbe = MI_BATCH_BUFFER_END;
>>+
>>+		batch = gem_create_from_pool(i915, &batch_size, region_id);
>>+		gem_write(i915, batch, 0, &bbe, sizeof(bbe));
>>+
>>+		vm_id = gem_vm_create_in_vm_bind_mode(i915);
>>+		vm_bind_ctx = intel_ctx_create(i915, &ctx->cfg);
>>+		gem_context_set_vm(i915, vm_bind_ctx->id, vm_id);
>>+		vm_bind(i915, objects, params->vm_bind_count, &params->va,
>>+			batch, batch_size, vm_id);
>>+	}
>>+
>>  	/*
>>  	 * Move random objects back into lmem.
>>  	 * For TEST_MULTI runs, make each object counts a loop to
>>@@ -454,6 +558,15 @@ static void __do_evict(int i915,
>>  		}
>>  	}
>>+	/* Rebind persistent mappings to ensure they are swapped back in */
>>+	if (params->flags & TEST_VM_BIND) {
>>+		move_to_lmem_execbuf3(i915, vm_bind_ctx, engine, params->oom_test);
>>+
>>+		vm_unbind(i915, objects, params->vm_bind_count, batch_size, vm_id);
>>+		intel_ctx_destroy(i915, vm_bind_ctx);
>>+		gem_vm_destroy(i915, vm_id);
>>+	}
>>+
>>  	for (i = 0; i < params->count; i++) {
>>  		gem_close(i915, objects[i].handle);
>>  		free(objects[i].blt_obj);
>>@@ -553,6 +666,15 @@ static void fill_params(int i915, struct params *params,
>>  	if (flags & TEST_HEAVY)
>>  		params->loops = params->loops / 2 + 1;
>>+	/*
>>+	 * Set vm_bind_count and ensure it doesn't over-subscribe LMEM.
>>+	 * Set va range to ensure it is big enough for all bindings in a VM.
>>+	 */
>>+	if (flags & TEST_VM_BIND) {
>>+		params->vm_bind_count = params->count * 50 / ((flags & TEST_HEAVY) ? 300 : 150);
>>+		params->va = (uint64_t)params->vm_bind_count * size * 4;
>>+	}
>>+
>>  	params->flags = flags;
>>  	params->oom_test = do_oom_test;
>>@@ -583,6 +705,9 @@ static void test_evict(int i915,
>>  	if (flags & TEST_CCS)
>>  		igt_require(IS_DG2(intel_get_drm_devid(i915)));
>>+	if (flags & TEST_VM_BIND)
>>+		igt_require(i915_vm_bind_version(i915) == 1);
>>+
>>  	fill_params(i915, &params, region, flags, nproc, false);
>>  	if (flags & TEST_PARALLEL) {
>>@@ -764,6 +889,7 @@ igt_main_args("", long_options, help_str, opt_handler, NULL)
>>  		{ "heavy-verify-random-ccs", TEST_CCS | TEST_RANDOM | TEST_HEAVY },
>>  		{ "heavy-verify-multi-ccs", TEST_CCS | TEST_RANDOM | TEST_HEAVY | TEST_ENGINES | TEST_MULTI },
>>  		{ "parallel-random-verify-ccs", TEST_PARALLEL | TEST_RANDOM | TEST_CCS },
>>+		{ "vm_bind", TEST_VM_BIND },
>>  		{ }
>>  	};
>>  	const intel_ctx_t *ctx;

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v4 07/11] tests/i915/vm_bind: Add basic VM_BIND test support
  2022-10-25  7:06     ` Niranjana Vishwanathapura
@ 2022-10-28 16:38       ` Matthew Auld
  0 siblings, 0 replies; 28+ messages in thread
From: Matthew Auld @ 2022-10-28 16:38 UTC (permalink / raw)
  To: Niranjana Vishwanathapura
  Cc: tvrtko.ursulin, igt-dev, thomas.hellstrom, daniel.vetter, petri.latvala

On 25/10/2022 08:06, Niranjana Vishwanathapura wrote:
> On Fri, Oct 21, 2022 at 05:27:56PM +0100, Matthew Auld wrote:
>> On 18/10/2022 08:17, Niranjana Vishwanathapura wrote:
>>> Add basic tests for VM_BIND functionality. Bind the buffer objects in
>>> device page table with VM_BIND calls and have GPU copy the data from a
>>> source buffer object to destination buffer object.
>>> Test for different buffer sizes, buffer object placement and with
>>> multiple contexts.
>>>
>>> v2: Add basic test to fast-feedback.testlist
>>> v3: Run all tests for different memory types,
>>>     pass VA instead of an array for batch_address
>>> v4: Iterate for memory region types instead of hardcoding,
>>>     use i915_vm_bind library functions
>>>
>>> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>>>
>>> basic
>>
>> Unwanted change?
>>
> 
> fixed
> 
>>> ---
>>>  tests/i915/i915_vm_bind_basic.c       | 522 ++++++++++++++++++++++++++
>>>  tests/intel-ci/fast-feedback.testlist |   1 +
>>>  tests/meson.build                     |   1 +
>>>  3 files changed, 524 insertions(+)
>>>  create mode 100644 tests/i915/i915_vm_bind_basic.c
>>>
>>> diff --git a/tests/i915/i915_vm_bind_basic.c b/tests/i915/i915_vm_bind_basic.c
>>> new file mode 100644
>>> index 0000000000..ede76ba095
>>> --- /dev/null
>>> +++ b/tests/i915/i915_vm_bind_basic.c
>>> @@ -0,0 +1,522 @@
>>> +// SPDX-License-Identifier: MIT
>>> +/*
>>> + * Copyright © 2022 Intel Corporation
>>> + */
>>> +
>>> +/** @file i915_vm_bind_basic.c
>>> + *
>>> + * This is the basic test for VM_BIND functionality.
>>> + *
>>> + * The goal is to ensure that basics work.
>>> + */
>>> +
>>> +#include <sys/poll.h>
>>> +
>>> +#include "i915/gem.h"
>>> +#include "i915/i915_vm_bind.h"
>>> +#include "igt.h"
>>> +#include "igt_syncobj.h"
>>> +#include <unistd.h>
>>> +#include <stdlib.h>
>>> +#include <stdint.h>
>>> +#include <stdio.h>
>>> +#include <string.h>
>>> +#include <fcntl.h>
>>> +#include <inttypes.h>
>>> +#include <errno.h>
>>> +#include <sys/stat.h>
>>> +#include <sys/ioctl.h>
>>> +#include "drm.h"
>>> +#include "i915/gem_vm.h"
>>> +
>>> +IGT_TEST_DESCRIPTION("Basic test for vm_bind functionality");
>>> +
>>> +#define PAGE_SIZE   4096
>>> +#define PAGE_SHIFT  12
>>> +
>>> +#define GEN9_XY_FAST_COPY_BLT_CMD       (2 << 29 | 0x42 << 22)
>>> +#define BLT_DEPTH_32                    (3 << 24)
>>> +
>>> +#define DEFAULT_BUFF_SIZE  (4 * PAGE_SIZE)
>>> +#define SZ_64K             (16 * PAGE_SIZE)
>>> +#define SZ_2M              (512 * PAGE_SIZE)
>>> +
>>> +#define MAX_CTXTS   2
>>> +#define MAX_CMDS    4
>>> +
>>> +#define BATCH_FENCE  0
>>> +#define SRC_FENCE    1
>>> +#define DST_FENCE    2
>>> +#define EXEC_FENCE   3
>>> +#define NUM_FENCES   4
>>> +
>>> +enum {
>>> +    BATCH_MAP,
>>> +    SRC_MAP,
>>> +    DST_MAP = SRC_MAP + MAX_CMDS,
>>> +    MAX_MAP
>>> +};
>>> +
>>> +struct mapping {
>>> +    uint32_t  obj;
>>> +    uint64_t  va;
>>> +    uint64_t  offset;
>>> +    uint64_t  length;
>>> +    uint64_t  flags;
>>> +};
>>> +
>>> +#define SET_MAP(map, _obj, _va, _offset, _length, _flags)   \
>>> +{                                  \
>>> +    (map).obj = _obj;          \
>>> +    (map).va = _va;            \
>>> +    (map).offset = _offset;       \
>>> +    (map).length = _length;       \
>>> +    (map).flags = _flags;       \
>>> +}
>>> +
>>> +#define MAX_BATCH_DWORD    64
>>> +
>>> +#define abs(x) ((x) >= 0 ? (x) : -(x))
>>> +
>>> +#define TEST_LMEM             BIT(0)
>>> +#define TEST_SKIP_UNBIND      BIT(1)
>>> +#define TEST_SHARE_VM         BIT(2)
>>> +
>>> +#define is_lmem(cfg)        ((cfg)->flags & TEST_LMEM)
>>> +#define do_unbind(cfg)      (!((cfg)->flags & TEST_SKIP_UNBIND))
>>> +#define do_share_vm(cfg)    ((cfg)->flags & TEST_SHARE_VM)
>>> +
>>> +struct test_cfg {
>>> +    const char *name;
>>> +    uint32_t size;
>>> +    uint8_t num_cmds;
>>> +    uint32_t num_ctxts;
>>> +    uint32_t flags;
>>> +};
>>> +
>>> +static uint64_t
>>> +gettime_ns(void)
>>> +{
>>> +    struct timespec current;
>>> +    clock_gettime(CLOCK_MONOTONIC, &current);
>>> +    return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
>>> +}
>>> +
>>> +static bool syncobj_busy(int fd, uint32_t handle)
>>> +{
>>> +    bool result;
>>> +    int sf;
>>> +
>>> +    sf = syncobj_handle_to_fd(fd, handle,
>>> +                  DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE);
>>> +    result = poll(&(struct pollfd){sf, POLLIN}, 1, 0) == 0;
>>> +    close(sf);
>>> +
>>> +    return result;
>>> +}
>>> +
>>> +static inline void vm_bind(int fd, uint32_t vm_id, struct mapping *m,
>>> +               struct drm_i915_gem_timeline_fence *fence)
>>> +{
>>> +    uint32_t syncobj = 0;
>>> +
>>> +    if (fence) {
>>> +        syncobj =  syncobj_create(fd, 0);
>>> +
>>> +        fence->handle = syncobj;
>>> +        fence->flags = I915_TIMELINE_FENCE_WAIT;
>>> +        fence->value = 0;
>>> +    }
>>> +
>>> +    igt_info("VM_BIND vm:0x%x h:0x%x v:0x%lx o:0x%lx l:0x%lx\n",
>>> +         vm_id, m->obj, m->va, m->offset, m->length);
>>
>> Aren't this igt_info() and the ones elsewhere a bit too loud? Maybe just 
>> igt_debug() for some of these? It looks quite noisy when running this test locally.
> 
> ok, changed it to igt_debug()
> 
>>
>>> +    i915_vm_bind(fd, vm_id, m->va, m->obj, m->offset, m->length, syncobj, 0);
>>> +}
>>> +
>>> +static inline void vm_unbind(int fd, uint32_t vm_id, struct mapping *m)
>>> +{
>>> +    /* Object handle is not required during unbind */
>>> +    igt_info("VM_UNBIND vm:0x%x v:0x%lx l:0x%lx\n", vm_id, m->va, m->length);
>>> +    i915_vm_unbind(fd, vm_id, m->va, m->length);
>>> +}
>>> +
>>> +static void print_buffer(void *buf, uint32_t size,
>>> +             const char *str, bool full)
>>> +{
>>> +    uint32_t i = 0;
>>> +
>>> +    igt_debug("Printing %s size 0x%x\n", str, size);
>>> +    while (i < size) {
>>> +        uint32_t *b = buf + i;
>>> +
>>> +        igt_debug("\t%s[0x%04x]: 0x%08x 0x%08x 0x%08x 0x%08x %s\n",
>>> +              str, i, b[0], b[1], b[2], b[3], full ? "" : "...");
>>> +        i += full ? 16 : PAGE_SIZE;
>>> +    }
>>> +}
>>> +
>>> +static int gem_linear_fast_blt(uint32_t *batch, uint64_t src,
>>> +                   uint64_t dst, uint32_t size)
>>> +{
>>> +    uint32_t *cmd = batch;
>>> +
>>> +    *cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
>>> +    *cmd++ = BLT_DEPTH_32 | PAGE_SIZE;
>>> +    *cmd++ = 0;
>>> +    *cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
>>> +    *cmd++ = lower_32_bits(dst);
>>> +    *cmd++ = upper_32_bits(dst);
>>> +    *cmd++ = 0;
>>> +    *cmd++ = PAGE_SIZE;
>>> +    *cmd++ = lower_32_bits(src);
>>> +    *cmd++ = upper_32_bits(src);
>>> +
>>> +    *cmd++ = MI_BATCH_BUFFER_END;
>>> +    *cmd++ = 0;
>>> +
>>> +    return ALIGN((cmd - batch + 1) * sizeof(uint32_t), 8);
>>> +}
>>> +
>>> +static void __gem_copy(int fd, uint64_t src, uint64_t dst, uint32_t size,
>>> +               uint32_t ctx_id, void *batch_addr, unsigned int eb_flags,
>>> +               struct drm_i915_gem_timeline_fence *fence)
>>> +{
>>> +    uint32_t len, buf[MAX_BATCH_DWORD] = { 0 };
>>> +    struct drm_i915_gem_execbuffer3 execbuf;
>>> +
>>> +    len = gem_linear_fast_blt(buf, src, dst, size);
>>> +
>>> +    memcpy(batch_addr, (void *)buf, len);
>>> +    print_buffer(buf, len, "batch", true);
>>> +
>>> +    memset(&execbuf, 0, sizeof(execbuf));
>>> +    execbuf.ctx_id = ctx_id;
>>> +    execbuf.batch_address = to_user_pointer(batch_addr);
>>> +    execbuf.engine_idx = eb_flags;
>>> +    execbuf.fence_count = NUM_FENCES;
>>> +    execbuf.timeline_fences = to_user_pointer(fence);
>>> +    gem_execbuf3(fd, &execbuf);
>>> +}
>>> +
>>> +static void i915_gem_copy(int fd, uint64_t src, uint64_t dst, uint32_t va_delta,
>>> +              uint32_t delta, uint32_t size, const intel_ctx_t **ctx,
>>> +              uint32_t num_ctxts, void **batch_addr, unsigned int eb_flags,
>>> +              struct drm_i915_gem_timeline_fence (*fence)[NUM_FENCES])
>>> +{
>>> +    uint32_t i;
>>> +
>>> +    for (i = 0; i < num_ctxts; i++) {
>>> +        igt_info("Issuing gem copy on ctx 0x%x\n", ctx[i]->id);
>>> +        __gem_copy(fd, src + (i * va_delta), dst + (i * va_delta), delta,
>>> +               ctx[i]->id, batch_addr[i], eb_flags, fence[i]);
>>> +    }
>>> +}
>>> +
>>> +static void i915_gem_sync(int fd, const intel_ctx_t **ctx, uint32_t num_ctxts,
>>> +              struct drm_i915_gem_timeline_fence (*fence)[NUM_FENCES])
>>> +{
>>> +    uint32_t i;
>>> +
>>> +    for (i = 0; i < num_ctxts; i++) {
>>> +        uint64_t fence_value = 0;
>>> +
>>> +        igt_assert(syncobj_timeline_wait(fd, &fence[i][EXEC_FENCE].handle,
>>> +                         (uint64_t *)&fence_value, 1,
>>> +                         gettime_ns() + (2 * NSEC_PER_SEC),
>>> +                         DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT, NULL));
>>> +        igt_assert(!syncobj_busy(fd, fence[i][EXEC_FENCE].handle));
>>> +        igt_info("gem copy completed on ctx 0x%x\n", ctx[i]->id);
>>> +    }
>>> +}
>>> +
>>> +static struct igt_collection *get_region_set(int fd, struct test_cfg *cfg, bool lmem_only)
>>> +{
>>> +    uint32_t mem_type[] = { I915_SYSTEM_MEMORY, I915_DEVICE_MEMORY };
>>> +    uint32_t lmem_type[] = { I915_DEVICE_MEMORY };
>>> +    struct drm_i915_query_memory_regions *query_info;
>>> +
>>> +    query_info = gem_get_query_memory_regions(fd);
>>> +    igt_assert(query_info);
>>> +
>>> +    if (lmem_only)
>>> +        return __get_memory_region_set(query_info, lmem_type, 1);
>>> +    else
>>> +        return __get_memory_region_set(query_info, mem_type, 2);
>>> +}
>>> +
>>> +static void create_src_objs(int fd, struct test_cfg *cfg, uint32_t src[], uint32_t size,
>>> +                uint32_t num_cmds, void *src_addr[])
>>> +{
>>> +    int i;
>>> +    struct igt_collection *set = get_region_set(fd, cfg, is_lmem(cfg) && (num_cmds == 1));
>>> +    uint32_t region;
>>> +
>>> +    for (i = 0; i < num_cmds; i++) {
>>> +        region = igt_collection_get_value(set, i % set->size);
>>> +        src[i] = gem_create_in_memory_regions(fd, size, region);
>>> +        igt_info("Src obj 0x%x created in region 0x%x:0x%x\n", src[i],
>>> +             MEMORY_TYPE_FROM_REGION(region), MEMORY_INSTANCE_FROM_REGION(region));
>>> +        src_addr[i] = gem_mmap__cpu(fd, src[i], 0, size, PROT_WRITE);
>>> +    }
>>> +}
>>> +
>>> +static void destroy_src_objs(int fd, struct test_cfg *cfg, uint32_t src[], uint32_t size,
>>> +                 uint32_t num_cmds, void *src_addr[])
>>> +{
>>> +    int i;
>>> +
>>> +    for (i = 0; i < num_cmds; i++) {
>>> +        igt_assert(gem_munmap(src_addr[i], size) == 0);
>>> +        igt_debug("Closing object 0x%x\n", src[i]);
>>> +        gem_close(fd, src[i]);
>>> +    }
>>> +}
>>> +
>>> +static uint32_t create_dst_obj(int fd, struct test_cfg *cfg, uint32_t size, void **dst_addr)
>>> +{
>>> +    uint32_t dst;
>>> +    struct igt_collection *set = get_region_set(fd, cfg, is_lmem(cfg));
>>> +    uint32_t region = igt_collection_get_value(set, 0);
>>
>> Maybe just use the gem_memory_region and then we don't need this 
>> igt_collection stuff here and elsewhere? IMO it's a lot nicer.
> 
> ok
> 
>>
>>> +
>>> +    dst = gem_create_in_memory_regions(fd, size, region);
>>> +    igt_info("Dst obj 0x%x created in region 0x%x:0x%x\n", dst,
>>> +         MEMORY_TYPE_FROM_REGION(region), MEMORY_INSTANCE_FROM_REGION(region));
>>
>> And then here and elsewhere we can just use gem_memory_region.name, 
>> which is a lot more readable for humans :)
>>
> 
> ok
> 
>>> +    *dst_addr = gem_mmap__cpu(fd, dst, 0, size, PROT_WRITE);
>>> +
>>> +    return dst;
>>> +}
>>> +
>>> +static void destroy_dst_obj(int fd, struct test_cfg *cfg, uint32_t dst, uint32_t size, void *dst_addr)
>>> +{
>>> +    igt_assert(gem_munmap(dst_addr, size) == 0);
>>> +    igt_debug("Closing object 0x%x\n", dst);
>>> +    gem_close(fd, dst);
>>> +}
>>> +
>>> +static void pattern_fill_buf(void *src_addr[], uint32_t size, uint32_t num_cmds, uint32_t npages)
>>> +{
>>> +    uint32_t i, j;
>>> +    void *buf;
>>> +
>>> +    /* Allocate buffer and fill pattern */
>>> +    buf = malloc(size);
>>> +    igt_require(buf);
>>> +
>>> +    for (i = 0; i < num_cmds; i++) {
>>> +        for (j = 0; j < npages; j++)
>>> +            memset(buf + j * PAGE_SIZE, i * npages + j + 1, PAGE_SIZE);
>>> +
>>> +        memcpy(src_addr[i], buf, size);
>>> +    }
>>> +
>>> +    free(buf);
>>> +}
>>> +
>>> +static void run_test(int fd, const intel_ctx_t *base_ctx, struct test_cfg *cfg,
>>> +             const struct intel_execution_engine2 *e)
>>> +{
>>> +    void *src_addr[MAX_CMDS] = { 0 }, *dst_addr = NULL;
>>> +    uint32_t src[MAX_CMDS], dst, i, size = cfg->size;
>>> +    struct drm_i915_gem_timeline_fence exec_fence[MAX_CTXTS][NUM_FENCES];
>>> +    uint32_t shared_vm_id, vm_id[MAX_CTXTS];
>>> +    struct mapping map[MAX_CTXTS][MAX_MAP];
>>> +    uint32_t num_ctxts = cfg->num_ctxts;
>>> +    uint32_t num_cmds = cfg->num_cmds;
>>> +    uint32_t npages = size / PAGE_SIZE;
>>> +    const intel_ctx_t *ctx[MAX_CTXTS];
>>> +    bool share_vm = do_share_vm(cfg);
>>> +    uint32_t delta, va_delta = SZ_2M;
>>> +    void *batch_addr[MAX_CTXTS];
>>> +    uint32_t batch[MAX_CTXTS];
>>> +    uint64_t src_va, dst_va;
>>> +
>>> +    delta = size / num_ctxts;
>>> +    if (share_vm)
>>> +        shared_vm_id = gem_vm_create_in_vm_bind_mode(fd);
>>> +
>>> +    /* Create contexts */
>>> +    num_ctxts = min_t(num_ctxts, MAX_CTXTS, num_ctxts);
>>> +    for (i = 0; i < num_ctxts; i++) {
>>> +        uint32_t vmid;
>>> +
>>> +        if (share_vm)
>>> +            vmid = shared_vm_id;
>>> +        else
>>> +            vmid = gem_vm_create_in_vm_bind_mode(fd);
>>> +
>>> +        ctx[i] = intel_ctx_create(fd, &base_ctx->cfg);
>>> +        gem_context_set_vm(fd, ctx[i]->id, vmid);
>>> +        vm_id[i] = gem_context_get_vm(fd, ctx[i]->id);
>>> +
>>> +        exec_fence[i][EXEC_FENCE].handle = syncobj_create(fd, 0);
>>> +        exec_fence[i][EXEC_FENCE].flags = I915_TIMELINE_FENCE_SIGNAL;
>>> +        exec_fence[i][EXEC_FENCE].value = 0;
>>> +    }
>>> +
>>> +    /* Create objects */
>>> +    num_cmds = min_t(num_cmds, MAX_CMDS, num_cmds);
>>> +    create_src_objs(fd, cfg, src, size, num_cmds, src_addr);
>>> +    dst = create_dst_obj(fd, cfg, size, &dst_addr);
>>> +
>>> +    /*
>>> +     * mmap'ed addresses are not 64K aligned. On platforms requiring
>>> +     * 64K alignment, use static addresses.
>>> +     */
>>> +    if (size < SZ_2M && num_cmds && !HAS_64K_PAGES(intel_get_drm_devid(fd))) {
>>> +        src_va = to_user_pointer(src_addr[0]);
>>> +        dst_va = to_user_pointer(dst_addr);
>>> +    } else {
>>> +        src_va = 0xa000000;
>>> +        dst_va = 0xb000000;
>>> +    }
>>> +
>>> +    pattern_fill_buf(src_addr, size, num_cmds, npages);
>>> +
>>> +    if (num_cmds)
>>> +        print_buffer(src_addr[num_cmds - 1], size, "src_obj", false);
>>> +
>>> +    for (i = 0; i < num_ctxts; i++) {
>>> +        batch[i] = gem_create_in_memory_regions(fd, PAGE_SIZE, REGION_SMEM);
>>> +        igt_info("Batch obj 0x%x created in region 0x%x:0x%x\n", batch[i],
>>> +             MEMORY_TYPE_FROM_REGION(REGION_SMEM), MEMORY_INSTANCE_FROM_REGION(REGION_SMEM));
>>> +        batch_addr[i] = gem_mmap__cpu(fd, batch[i], 0, PAGE_SIZE, PROT_WRITE);
>>
>> Hmm, are there any incoherent platforms (non-llc) that will be 
>> entering vm_bind world? I guess nothing currently that is graphics 
>> version >= 12? If something does come along we might see some surprises 
>> here and elsewhere.
>>
> 
> Yah, I too think it should be fine.

It sounds like mtl is perhaps one such platform, but I guess we can 
re-visit this at some later point, if needed.

> 
>> IIRC the old model would be something like mmap__cpu(), 
>> set_domain(CPU), do some CPU writes, and then later execbuf would see 
>> that the object is dirty and would do any clflushes if required before 
>> submit. Maybe the test needs to manually do all clflushing now with 
>> new eb3 model? We can also sidestep for now by using __device() or so.
>>
> 
> Yeah, we need to take care of this if we are supporting vm_bind on older
> platforms. Worst case, we need to loop over the vm_bind objects in eb3 and clflush them.
> 
>>> +    }
>>> +
>>> +    /* Create mappings */
>>> +    for (i = 0; i < num_ctxts; i++) {
>>> +        uint64_t va_offset = i * va_delta;
>>> +        uint64_t offset = i * delta;
>>> +        uint32_t j;
>>> +
>>> +        for (j = 0; j < num_cmds; j++)
>>> +            SET_MAP(map[i][SRC_MAP + j], src[j], src_va + va_offset, offset, delta, 0);
>>> +        SET_MAP(map[i][DST_MAP], dst, dst_va + va_offset, offset, delta, 0);
>>> +        SET_MAP(map[i][BATCH_MAP], batch[i], to_user_pointer(batch_addr[i]), 0, PAGE_SIZE, 0);
>>> +    }
>>> +
>>> +    /* Bind the buffers to device page table */
>>> +    for (i = 0; i < num_ctxts; i++) {
>>> +        vm_bind(fd, vm_id[i], &map[i][BATCH_MAP], &exec_fence[i][BATCH_FENCE]);
>>> +        vm_bind(fd, vm_id[i], &map[i][DST_MAP], &exec_fence[i][DST_FENCE]);
>>> +    }
>>> +
>>> +    /* Have GPU do the copy */
>>> +    for (i = 0; i < cfg->num_cmds; i++) {
>>> +        uint32_t j;
>>> +
>>> +        for (j = 0; j < num_ctxts; j++)
>>> +            vm_bind(fd, vm_id[j], &map[j][SRC_MAP + i], &exec_fence[j][SRC_FENCE]);
>>> +
>>> +        i915_gem_copy(fd, src_va, dst_va, va_delta, delta, size, ctx,
>>> +                  num_ctxts, batch_addr, e->flags, exec_fence);
>>> +
>>> +        i915_gem_sync(fd, ctx, num_ctxts, exec_fence);
>>> +
>>> +        for (j = 0; j < num_ctxts; j++) {
>>> +            syncobj_destroy(fd, exec_fence[j][SRC_FENCE].handle);
>>> +            if (do_unbind(cfg))
>>> +                vm_unbind(fd, vm_id[j], &map[j][SRC_MAP + i]);
>>> +        }
>>> +    }
>>> +
>>> +    /*
>>> +     * Unbind buffers from the device page table.
>>> +     * If skipped here, they get unbound when the buffer is freed.
>>> +     */
>>> +    for (i = 0; i < num_ctxts; i++) {
>>> +        syncobj_destroy(fd, exec_fence[i][BATCH_FENCE].handle);
>>> +        syncobj_destroy(fd, exec_fence[i][DST_FENCE].handle);
>>> +        if (do_unbind(cfg)) {
>>> +            vm_unbind(fd, vm_id[i], &map[i][BATCH_MAP]);
>>> +            vm_unbind(fd, vm_id[i], &map[i][DST_MAP]);
>>> +        }
>>> +    }
>>> +
>>> +    /* Close batch buffers */
>>> +    for (i = 0; i < num_ctxts; i++) {
>>> +        syncobj_destroy(fd, exec_fence[i][EXEC_FENCE].handle);
>>> +        gem_close(fd, batch[i]);
>>> +    }
>>> +
>>> +    /* Accessing the buffer will migrate the pages from device to host */
>>
>> It shouldn't do that. Do you have some more info?
>>
> 
> This comment is a relic from the days when we had system allocator support.
> Will remove it.
> 
>>> +    print_buffer(dst_addr, size, "dst_obj", false);
>>
>> Normally we only want to dump the buffer if the test fails because 
>> there is some mismatch when comparing the expected value(s)?
>>
> 
> It only dumps in case of failure; print_buffer uses igt_debug().
> Will rename this function to debug_dump_buffer().
> 
>>> +
>>> +    /* Validate by comparing the last SRC with DST */
>>> +    if (num_cmds)
>>> +        igt_assert(memcmp(src_addr[num_cmds - 1], dst_addr, size) == 0);
>>> +
>>> +    /* Free the objects */
>>> +    destroy_src_objs(fd, cfg, src, size, num_cmds, src_addr);
>>> +    destroy_dst_obj(fd, cfg, dst, size, dst_addr);
>>> +
>>> +    /* Done with the contexts */
>>> +    for (i = 0; i < num_ctxts; i++) {
>>> +        igt_debug("Destroying context 0x%x\n", ctx[i]->id);
>>> +        gem_vm_destroy(fd, vm_id[i]);
>>> +        intel_ctx_destroy(fd, ctx[i]);
>>> +    }
>>> +
>>> +    if (share_vm)
>>> +        gem_vm_destroy(fd, shared_vm_id);
>>> +}
>>> +
>>> +igt_main
>>> +{
>>> +    struct test_cfg *t, tests[] = {
>>> +        {"basic", 0, 1, 1, 0},
>>> +        {"multi_cmds", 0, MAX_CMDS, 1, 0},
>>> +        {"skip_copy", 0, 0, 1, 0},
>>> +        {"skip_unbind",  0, 1, 1, TEST_SKIP_UNBIND},
>>> +        {"multi_ctxts", 0, 1, MAX_CTXTS, 0},
>>> +        {"share_vm", 0, 1, MAX_CTXTS, TEST_SHARE_VM},
>>> +        {"64K", (16 * PAGE_SIZE), 1, 1, 0},
>>> +        {"2M", SZ_2M, 1, 1, 0},
>>> +        { }
>>> +    };
>>> +    int fd;
>>> +    uint32_t def_size;
>>> +    struct intel_execution_engine2 *e;
>>> +    const intel_ctx_t *ctx;
>>> +
>>> +    igt_fixture {
>>> +        fd = drm_open_driver(DRIVER_INTEL);
>>> +        igt_require_gem(fd);
>>> +        igt_require(i915_vm_bind_version(fd) == 1);
>>> +        def_size = HAS_64K_PAGES(intel_get_drm_devid(fd)) ?
>>> +               SZ_64K : DEFAULT_BUFF_SIZE;
>>
>> Should be possible to cut over to gtt_alignment, if you feel like it. 
>> For example just stick it in gem_memory_region.
> 
> ok
> 
>>
>>> +        ctx = intel_ctx_create_all_physical(fd);
>>> +        /* Get copy engine */
>>> +        for_each_ctx_engine(fd, ctx, e)
>>> +            if (e->class == I915_ENGINE_CLASS_COPY)
>>> +                break;
>>
>> Maybe igt_require/assert() at least one copy engine? Dunno.
>>
> 
> ok
> 
>>> +    }
>>> +
>>> +    /* Adjust test variables */
>>> +    for (t = tests; t->name; t++)
>>> +        t->size = t->size ? : (def_size * abs(t->num_ctxts));
>>
>> num_ctx can be negative?
>>
> 
> No, it is a relic from the older design. I have restructured the code
> here and we will no longer have this loop.
> 
>>> +
>>> +    for (t = tests; t->name; t++) {
>>> +        igt_describe_f("VM_BIND %s", t->name);
>>> +        igt_subtest_with_dynamic(t->name) {
>>> +            for_each_memory_region(r, fd) {
>>> +                struct test_cfg cfg = *t;
>>> +
>>> +                if (r->ci.memory_instance)
>>> +                    continue;
>>> +
>>> +                if (r->ci.memory_class == I915_MEMORY_CLASS_DEVICE)
>>> +                    cfg.flags |= TEST_LMEM;
>>> +
>>> +                igt_dynamic_f("%s", r->name)
>>> +                    run_test(fd, ctx, &cfg, e);
>>
>> Normally we just pass along the gem_memory_region. Can then fish out 
>> useful stuff like gtt_alignment, name, size etc.
>>
> 
> ok
> 
> Thanks,
> Niranjana
> 
>>> +            }
>>> +        }
>>> +    }
>>> +
>>> +    igt_fixture {
>>> +        intel_ctx_destroy(fd, ctx);
>>> +        close(fd);
>>> +    }
>>> +
>>> +    igt_exit();
>>> +}
>>> diff --git a/tests/intel-ci/fast-feedback.testlist b/tests/intel-ci/fast-feedback.testlist
>>> index 185c2fef54..8a47eaddda 100644
>>> --- a/tests/intel-ci/fast-feedback.testlist
>>> +++ b/tests/intel-ci/fast-feedback.testlist
>>> @@ -53,6 +53,7 @@ igt@i915_getparams_basic@basic-eu-total
>>>  igt@i915_getparams_basic@basic-subslice-total
>>>  igt@i915_hangman@error-state-basic
>>>  igt@i915_pciid
>>> +igt@i915_vm_bind_basic@basic-smem
>>>  igt@i915_vm_bind_sanity@basic
>>>  igt@kms_addfb_basic@addfb25-bad-modifier
>>>  igt@kms_addfb_basic@addfb25-framebuffer-vs-set-tiling
>>> diff --git a/tests/meson.build b/tests/meson.build
>>> index d81c6c28c2..1b9529a7e2 100644
>>> --- a/tests/meson.build
>>> +++ b/tests/meson.build
>>> @@ -251,6 +251,7 @@ i915_progs = [
>>>      'sysfs_preempt_timeout',
>>>      'sysfs_timeslice_duration',
>>>      'i915_vm_bind_sanity',
>>> +    'i915_vm_bind_basic',
>>
>> Keep it sorted?
>>
>>>  ]
>>>  msm_progs = [


2022-10-18  7:16 [igt-dev] [PATCH i-g-t v4 00/11] vm_bind: Add VM_BIND validation support Niranjana Vishwanathapura
2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 01/11] lib/vm_bind: import uapi definitions Niranjana Vishwanathapura
2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 02/11] lib/vm_bind: Add vm_bind/unbind and execbuf3 ioctls Niranjana Vishwanathapura
2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 03/11] lib/vm_bind: Add vm_bind mode support for VM Niranjana Vishwanathapura
2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 04/11] lib/vm_bind: Add vm_bind specific library functions Niranjana Vishwanathapura
2022-10-18  9:26   ` Matthew Auld
2022-10-18 14:43     ` Niranjana Vishwanathapura
2022-10-18  7:16 ` [igt-dev] [PATCH i-g-t v4 05/11] lib/vm_bind: Add __prime_handle_to_fd() Niranjana Vishwanathapura
2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 06/11] tests/i915/vm_bind: Add vm_bind sanity test Niranjana Vishwanathapura
2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 07/11] tests/i915/vm_bind: Add basic VM_BIND test support Niranjana Vishwanathapura
2022-10-21 16:27   ` Matthew Auld
2022-10-25  7:06     ` Niranjana Vishwanathapura
2022-10-28 16:38       ` Matthew Auld
2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 08/11] tests/i915/vm_bind: Add userptr subtest Niranjana Vishwanathapura
2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 09/11] tests/i915/vm_bind: Add gem_exec3_basic test Niranjana Vishwanathapura
2022-10-18  9:34   ` Matthew Auld
2022-10-18 14:46     ` Niranjana Vishwanathapura
2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 10/11] tests/i915/vm_bind: Add gem_exec3_balancer test Niranjana Vishwanathapura
2022-10-21 16:42   ` Matthew Auld
2022-10-21 20:11     ` Niranjana Vishwanathapura
2022-10-21 20:47       ` Niranjana Vishwanathapura
2022-10-24 15:26         ` Matthew Auld
2022-10-25  7:08           ` Niranjana Vishwanathapura
2022-10-18  7:17 ` [igt-dev] [PATCH i-g-t v4 11/11] tests/i915/vm_bind: Add gem_lmem_swapping@vm_bind sub test Niranjana Vishwanathapura
2022-10-21 17:20   ` Matthew Auld
2022-10-25  7:10     ` Niranjana Vishwanathapura
2022-10-18  7:45 ` [igt-dev] ✓ Fi.CI.BAT: success for vm_bind: Add VM_BIND validation support (rev8) Patchwork
2022-10-18 22:13 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
