* [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe
@ 2023-07-06  6:05 Zbigniew Kempczyński
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 01/16] tests/api_intel_allocator: Don't use allocator ahnd aliasing api Zbigniew Kempczyński
                   ` (17 more replies)
  0 siblings, 18 replies; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

The blitter library currently supports block-copy, ctrl-surf-copy
and fast-copy on i915. Let's extend this to xe, as most of the
code is driver independent.

v2: Rewrite tracking of allocator alloc()/free() calls to handle
    multiprocess/multithreaded scenarios. api_intel_allocator
    now supports both drivers (i915 and xe).

Cc: Kamil Konieczny <kamil.konieczny@linux.intel.com>
Cc: Karolina Stolarek <karolina.stolarek@intel.com>
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Bhanuprakash Modem <bhanuprakash.modem@intel.com>

Zbigniew Kempczyński (16):
  tests/api_intel_allocator: Don't use allocator ahnd aliasing api
  lib/intel_allocator: Drop aliasing allocator handle api
  lib/intel_allocator: Remove extensive debugging
  lib/xe_query: Use vramN when returning string region name
  lib/xe_query: Add xe_region_class() helper
  lib/drmtest: Add get_intel_driver() helper
  lib/xe_util: Return dynamic subtest name for Xe
  lib/xe_util: Add vm bind/unbind helper for Xe
  lib/intel_allocator: Add field to distinguish underlying driver
  lib/intel_allocator: Add intel_allocator_bind()
  lib/intel_ctx: Add xe context information
  lib/intel_blt: Introduce blt_copy_init() helper to cache driver
  lib/intel_blt: Extend blitter library to support xe driver
  tests/xe_ccs: Check if flatccs is working with block-copy for Xe
  tests/xe_exercise_blt: Check blitter library fast-copy for Xe
  tests/api-intel-allocator: Adapt to exercise allocator on Xe

 lib/drmtest.c                    |  10 +
 lib/drmtest.h                    |   1 +
 lib/igt_core.c                   |   5 +
 lib/igt_fb.c                     |   2 +-
 lib/intel_allocator.c            | 285 ++++++++++--
 lib/intel_allocator.h            |   4 +-
 lib/intel_blt.c                  | 292 ++++++++----
 lib/intel_blt.h                  |  10 +-
 lib/intel_ctx.c                  | 110 ++++-
 lib/intel_ctx.h                  |  14 +
 lib/meson.build                  |   3 +-
 lib/xe/xe_query.c                |  20 +-
 lib/xe/xe_query.h                |   1 +
 lib/xe/xe_util.c                 | 236 ++++++++++
 lib/xe/xe_util.h                 |  48 ++
 tests/i915/api_intel_allocator.c |  46 +-
 tests/i915/gem_ccs.c             |  34 +-
 tests/i915/gem_exercise_blt.c    |  22 +-
 tests/i915/gem_lmem_swapping.c   |   4 +-
 tests/meson.build                |   2 +
 tests/xe/xe_ccs.c                | 763 +++++++++++++++++++++++++++++++
 tests/xe/xe_exercise_blt.c       | 372 +++++++++++++++
 22 files changed, 2112 insertions(+), 172 deletions(-)
 create mode 100644 lib/xe/xe_util.c
 create mode 100644 lib/xe/xe_util.h
 create mode 100644 tests/xe/xe_ccs.c
 create mode 100644 tests/xe/xe_exercise_blt.c

-- 
2.34.1

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [igt-dev] [PATCH i-g-t v2 01/16] tests/api_intel_allocator: Don't use allocator ahnd aliasing api
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-06  9:04   ` Karolina Stolarek
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 02/16] lib/intel_allocator: Drop aliasing allocator handle api Zbigniew Kempczyński
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

There are no tests (apart from this one) which use the aliasing ahnd
api - intel_allocator_open_vm_as(). Additionally it is problematic
when adapting the allocator to xe, where I need to track allocations
to support easy vm binding. Let's adapt "open-vm" to avoid using
this api.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 tests/i915/api_intel_allocator.c | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/tests/i915/api_intel_allocator.c b/tests/i915/api_intel_allocator.c
index b7e3efb87f..238e76c9fd 100644
--- a/tests/i915/api_intel_allocator.c
+++ b/tests/i915/api_intel_allocator.c
@@ -612,32 +612,27 @@ static void reopen_fork(int fd)
 
 static void open_vm(int fd)
 {
-	uint64_t ahnd[4], offset[4], size = 0x1000;
+	uint64_t ahnd[3], offset[3], size = 0x1000;
 	int i, n = ARRAY_SIZE(ahnd);
 
 	ahnd[0] = intel_allocator_open_vm(fd, 1, INTEL_ALLOCATOR_SIMPLE);
 	ahnd[1] = intel_allocator_open_vm(fd, 1, INTEL_ALLOCATOR_SIMPLE);
-	ahnd[2] = intel_allocator_open_vm_as(ahnd[1], 2);
-	ahnd[3] = intel_allocator_open(fd, 3, INTEL_ALLOCATOR_SIMPLE);
+	ahnd[2] = intel_allocator_open(fd, 2, INTEL_ALLOCATOR_SIMPLE);
 
 	offset[0] = intel_allocator_alloc(ahnd[0], 1, size, 0);
 	offset[1] = intel_allocator_alloc(ahnd[1], 2, size, 0);
 	igt_assert(offset[0] != offset[1]);
 
-	offset[2] = intel_allocator_alloc(ahnd[2], 3, size, 0);
-	igt_assert(offset[0] != offset[2] && offset[1] != offset[2]);
-
-	offset[3] = intel_allocator_alloc(ahnd[3], 1, size, 0);
-	igt_assert(offset[0] == offset[3]);
+	offset[2] = intel_allocator_alloc(ahnd[2], 1, size, 0);
+	igt_assert(offset[0] == offset[2]);
 
 	/*
-	 * As ahnd[0-2] lead to same allocator check can we free all handles
+	 * As ahnd[0-1] lead to same allocator check can we free all handles
 	 * using selected ahnd.
 	 */
-	intel_allocator_free(ahnd[0], 1);
-	intel_allocator_free(ahnd[0], 2);
-	intel_allocator_free(ahnd[0], 3);
-	intel_allocator_free(ahnd[3], 1);
+	igt_assert_eq(intel_allocator_free(ahnd[0], 1), true);
+	igt_assert_eq(intel_allocator_free(ahnd[1], 2), true);
+	igt_assert_eq(intel_allocator_free(ahnd[2], 1), true);
 
 	for (i = 0; i < n - 1; i++)
 		igt_assert_eq(intel_allocator_close(ahnd[i]), (i == n - 2));
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 02/16] lib/intel_allocator: Drop aliasing allocator handle api
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 01/16] tests/api_intel_allocator: Don't use allocator ahnd aliasing api Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-06  8:31   ` Karolina Stolarek
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 03/16] lib/intel_allocator: Remove extensive debugging Zbigniew Kempczyński
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

There's no real user of this api, let's drop it.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 lib/intel_allocator.c | 18 ------------------
 lib/intel_allocator.h |  1 -
 2 files changed, 19 deletions(-)

diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
index 8161221dbf..c31576ecef 100644
--- a/lib/intel_allocator.c
+++ b/lib/intel_allocator.c
@@ -1037,24 +1037,6 @@ uint64_t intel_allocator_open_vm(int fd, uint32_t vm, uint8_t allocator_type)
 					    ALLOC_STRATEGY_HIGH_TO_LOW, 0);
 }
 
-uint64_t intel_allocator_open_vm_as(uint64_t allocator_handle, uint32_t new_vm)
-{
-	struct alloc_req req = { .request_type = REQ_OPEN_AS,
-				 .allocator_handle = allocator_handle,
-				 .open_as.new_vm = new_vm };
-	struct alloc_resp resp;
-
-	/* Get child_tid only once at open() */
-	if (child_tid == -1)
-		child_tid = gettid();
-
-	igt_assert(handle_request(&req, &resp) == 0);
-	igt_assert(resp.open_as.allocator_handle);
-	igt_assert(resp.response_type == RESP_OPEN_AS);
-
-	return resp.open.allocator_handle;
-}
-
 /**
  * intel_allocator_close:
  * @allocator_handle: handle to the allocator that will be closed
diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
index a6bf573e9d..3ec74f6191 100644
--- a/lib/intel_allocator.h
+++ b/lib/intel_allocator.h
@@ -182,7 +182,6 @@ uint64_t intel_allocator_open_vm_full(int fd, uint32_t vm,
 				      enum allocator_strategy strategy,
 				      uint64_t default_alignment);
 
-uint64_t intel_allocator_open_vm_as(uint64_t allocator_handle, uint32_t new_vm);
 bool intel_allocator_close(uint64_t allocator_handle);
 void intel_allocator_get_address_range(uint64_t allocator_handle,
 				       uint64_t *startp, uint64_t *endp);
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 03/16] lib/intel_allocator: Remove extensive debugging
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 01/16] tests/api_intel_allocator: Don't use allocator ahnd aliasing api Zbigniew Kempczyński
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 02/16] lib/intel_allocator: Drop aliasing allocator handle api Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-06  9:30   ` Karolina Stolarek
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 04/16] lib/xe_query: Use vramN when returning string region name Zbigniew Kempczyński
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

Comparing map keys produces extensive debug logging, which in turn
obscures analysis. Remove it to get clearer logs.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 lib/intel_allocator.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
index c31576ecef..be24f8f2d0 100644
--- a/lib/intel_allocator.c
+++ b/lib/intel_allocator.c
@@ -1385,9 +1385,6 @@ static int equal_handles(const void *key1, const void *key2)
 {
 	const struct handle_entry *h1 = key1, *h2 = key2;
 
-	alloc_debug("h1: %llx, h2: %llx\n",
-		   (long long) h1->handle, (long long) h2->handle);
-
 	return h1->handle == h2->handle;
 }
 
@@ -1395,9 +1392,6 @@ static int equal_ctx(const void *key1, const void *key2)
 {
 	const struct allocator *a1 = key1, *a2 = key2;
 
-	alloc_debug("a1: <fd: %d, ctx: %u>, a2 <fd: %d, ctx: %u>\n",
-		   a1->fd, a1->ctx, a2->fd, a2->ctx);
-
 	return a1->fd == a2->fd && a1->ctx == a2->ctx;
 }
 
@@ -1405,9 +1399,6 @@ static int equal_vm(const void *key1, const void *key2)
 {
 	const struct allocator *a1 = key1, *a2 = key2;
 
-	alloc_debug("a1: <fd: %d, vm: %u>, a2 <fd: %d, vm: %u>\n",
-		   a1->fd, a1->vm, a2->fd, a2->vm);
-
 	return a1->fd == a2->fd && a1->vm == a2->vm;
 }
 
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 04/16] lib/xe_query: Use vramN when returning string region name
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (2 preceding siblings ...)
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 03/16] lib/intel_allocator: Remove extensive debugging Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 05/16] lib/xe_query: Add xe_region_class() helper Zbigniew Kempczyński
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

For tests which mix regions (like xe_ccs) the name is confusing.
An example is "subtest-name-vram-1-vram-1". It's more readable
when renamed to "subtest-name-vram1-vram1".

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Reviewed-by: Karolina Stolarek <karolina.stolarek@intel.com>
---
 lib/xe/xe_query.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
index 48fb5afba7..830b7e401d 100644
--- a/lib/xe/xe_query.c
+++ b/lib/xe/xe_query.c
@@ -445,7 +445,7 @@ struct drm_xe_query_mem_region *xe_mem_region(int fd, uint64_t region)
  * xe_region_name:
  * @region: region mask
  *
- * Returns region string like "system" or "vram-n" where n=0...62.
+ * Returns region string like "system" or "vramN" where N=0...62.
  */
 const char *xe_region_name(uint64_t region)
 {
@@ -457,7 +457,7 @@ const char *xe_region_name(uint64_t region)
 		vrams = calloc(64, sizeof(char *));
 		for (int i = 0; i < 64; i++) {
 			if (i != 0)
-				asprintf(&vrams[i], "vram-%d", i - 1);
+				asprintf(&vrams[i], "vram%d", i - 1);
 			else
 				asprintf(&vrams[i], "system");
 			igt_assert(vrams[i]);
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 05/16] lib/xe_query: Add xe_region_class() helper
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (3 preceding siblings ...)
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 04/16] lib/xe_query: Use vramN when returning string region name Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 06/16] lib/drmtest: Add get_intel_driver() helper Zbigniew Kempczyński
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

In common code we often need to know the region class.
Add a helper which returns it for a region id.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Reviewed-by: Karolina Stolarek <karolina.stolarek@intel.com>
---
 lib/xe/xe_query.c | 16 ++++++++++++++++
 lib/xe/xe_query.h |  1 +
 2 files changed, 17 insertions(+)

diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
index 830b7e401d..f535ad8534 100644
--- a/lib/xe/xe_query.c
+++ b/lib/xe/xe_query.c
@@ -467,6 +467,22 @@ const char *xe_region_name(uint64_t region)
 	return vrams[region_idx];
 }
 
+/**
+ * xe_region_class:
+ * @fd: xe device fd
+ * @region: region mask
+ *
+ * Returns class of memory region structure for @region mask.
+ */
+uint16_t xe_region_class(int fd, uint64_t region)
+{
+	struct drm_xe_query_mem_region *memreg;
+
+	memreg = xe_mem_region(fd, region);
+
+	return memreg->mem_class;
+}
+
 /**
  * xe_min_page_size:
  * @fd: xe device fd
diff --git a/lib/xe/xe_query.h b/lib/xe/xe_query.h
index ff328ab942..68ca5a680c 100644
--- a/lib/xe/xe_query.h
+++ b/lib/xe/xe_query.h
@@ -85,6 +85,7 @@ struct drm_xe_engine_class_instance *xe_hw_engines(int fd);
 struct drm_xe_engine_class_instance *xe_hw_engine(int fd, int idx);
 struct drm_xe_query_mem_region *xe_mem_region(int fd, uint64_t region);
 const char *xe_region_name(uint64_t region);
+uint16_t xe_region_class(int fd, uint64_t region);
 uint32_t xe_min_page_size(int fd, uint64_t region);
 struct drm_xe_query_config *xe_config(int fd);
 unsigned int xe_number_hw_engines(int fd);
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 06/16] lib/drmtest: Add get_intel_driver() helper
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (4 preceding siblings ...)
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 05/16] lib/xe_query: Add xe_region_class() helper Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 07/16] lib/xe_util: Return dynamic subtest name for Xe Zbigniew Kempczyński
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

In libraries where i915 and Xe code diverges we might use
is_xe_device() or is_i915_device() to distinguish code paths.
But to avoid an additional open and string compare we can cache
this information in data structures.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Reviewed-by: Karolina Stolarek <karolina.stolarek@intel.com>
---
 lib/drmtest.c | 10 ++++++++++
 lib/drmtest.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/lib/drmtest.c b/lib/drmtest.c
index 5cdb0196d3..e1da66c877 100644
--- a/lib/drmtest.c
+++ b/lib/drmtest.c
@@ -151,6 +151,16 @@ bool is_intel_device(int fd)
 	return is_i915_device(fd) || is_xe_device(fd);
 }
 
+enum intel_driver get_intel_driver(int fd)
+{
+	if (is_xe_device(fd))
+		return INTEL_DRIVER_XE;
+	else if (is_i915_device(fd))
+		return INTEL_DRIVER_I915;
+
+	igt_assert_f(0, "Device is not handled by Intel driver\n");
+}
+
 static char _forced_driver[16] = "";
 
 /**
diff --git a/lib/drmtest.h b/lib/drmtest.h
index 9c3ea5d14c..97ab6e759e 100644
--- a/lib/drmtest.h
+++ b/lib/drmtest.h
@@ -124,6 +124,7 @@ bool is_nouveau_device(int fd);
 bool is_vc4_device(int fd);
 bool is_xe_device(int fd);
 bool is_intel_device(int fd);
+enum intel_driver get_intel_driver(int fd);
 
 /**
  * do_or_die:
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 07/16] lib/xe_util: Return dynamic subtest name for Xe
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (5 preceding siblings ...)
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 06/16] lib/drmtest: Add get_intel_driver() helper Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-06  9:37   ` Karolina Stolarek
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 08/16] lib/xe_util: Add vm bind/unbind helper " Zbigniew Kempczyński
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

For tests which work on more than one region, a name suffix like
"vram01-system" etc. is a common thing. Instead of handcrafting this
naming, add an xe_memregion_dynamic_subtest_name() function which is
similar to memregion_dynamic_subtest_name() for i915.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 lib/meson.build  |   3 +-
 lib/xe/xe_util.c | 104 +++++++++++++++++++++++++++++++++++++++++++++++
 lib/xe/xe_util.h |  30 ++++++++++++++
 3 files changed, 136 insertions(+), 1 deletion(-)
 create mode 100644 lib/xe/xe_util.c
 create mode 100644 lib/xe/xe_util.h

diff --git a/lib/meson.build b/lib/meson.build
index 5523b4450e..ce11c0715f 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -105,7 +105,8 @@ lib_sources = [
 	'xe/xe_compute_square_kernels.c',
 	'xe/xe_ioctl.c',
 	'xe/xe_query.c',
-	'xe/xe_spin.c'
+	'xe/xe_spin.c',
+	'xe/xe_util.c',
 ]
 
 lib_deps = [
diff --git a/lib/xe/xe_util.c b/lib/xe/xe_util.c
new file mode 100644
index 0000000000..448b3a3d27
--- /dev/null
+++ b/lib/xe/xe_util.c
@@ -0,0 +1,104 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include "igt.h"
+#include "igt_syncobj.h"
+#include "xe/xe_ioctl.h"
+#include "xe/xe_query.h"
+#include "xe/xe_util.h"
+
+static bool __region_belongs_to_regions_type(struct drm_xe_query_mem_region *region,
+					     uint32_t *mem_regions_type,
+					     int num_regions)
+{
+	for (int i = 0; i < num_regions; i++)
+		if (mem_regions_type[i] == region->mem_class)
+			return true;
+	return false;
+}
+
+struct igt_collection *
+__xe_get_memory_region_set(int xe, uint32_t *mem_regions_type, int num_regions)
+{
+	struct drm_xe_query_mem_region *memregion;
+	struct igt_collection *set = NULL;
+	uint64_t memreg = all_memory_regions(xe), region;
+	int count = 0, pos = 0;
+
+	xe_for_each_mem_region(xe, memreg, region) {
+		memregion = xe_mem_region(xe, region);
+		if (__region_belongs_to_regions_type(memregion,
+						     mem_regions_type,
+						     num_regions))
+			count++;
+	}
+
+	set = igt_collection_create(count);
+
+	xe_for_each_mem_region(xe, memreg, region) {
+		memregion = xe_mem_region(xe, region);
+		igt_assert(region < (1ull << 31));
+		if (__region_belongs_to_regions_type(memregion,
+						     mem_regions_type,
+						     num_regions)) {
+			igt_collection_set_value(set, pos++, (int)region);
+		}
+	}
+
+	igt_assert(count == pos);
+
+	return set;
+}
+
+/**
+  * xe_memregion_dynamic_subtest_name:
+  * @xe: drm fd of Xe device
+  * @igt_collection: memory region collection
+  *
+  * Function iterates over all memory regions inside the collection (kept
+  * in the value field) and generates the name which can be used during dynamic
+  * subtest creation.
+  *
+  * Returns: newly allocated string, has to be freed by caller. Asserts if
+  * caller tries to create a name using empty collection.
+  */
+char *xe_memregion_dynamic_subtest_name(int xe, struct igt_collection *set)
+{
+	struct igt_collection_data *data;
+	char *name, *p;
+	uint32_t region, len;
+
+	igt_assert(set && set->size);
+	/* enough for "name%d-" * n */
+	len = set->size * 8;
+	p = name = malloc(len);
+	igt_assert(name);
+
+	for_each_collection_data(data, set) {
+		struct drm_xe_query_mem_region *memreg;
+		int r;
+
+		region = data->value;
+		memreg = xe_mem_region(xe, region);
+
+		if (XE_IS_CLASS_VRAM(memreg))
+			r = snprintf(p, len, "%s%d-",
+				     xe_region_name(region),
+				     memreg->instance);
+		else
+			r = snprintf(p, len, "%s-",
+				     xe_region_name(region));
+
+		igt_assert(r > 0);
+		p += r;
+		len -= r;
+	}
+
+	/* remove last '-' */
+	*(p - 1) = 0;
+
+	return name;
+}
+
diff --git a/lib/xe/xe_util.h b/lib/xe/xe_util.h
new file mode 100644
index 0000000000..9f56fa9898
--- /dev/null
+++ b/lib/xe/xe_util.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2023 Intel Corporation
+ *
+ */
+
+#ifndef XE_UTIL_H
+#define XE_UTIL_H
+
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <xe_drm.h>
+
+#define XE_IS_SYSMEM_MEMORY_REGION(fd, region) \
+	(xe_region_class(fd, region) == XE_MEM_REGION_CLASS_SYSMEM)
+#define XE_IS_VRAM_MEMORY_REGION(fd, region) \
+	(xe_region_class(fd, region) == XE_MEM_REGION_CLASS_VRAM)
+
+struct igt_collection *
+__xe_get_memory_region_set(int xe, uint32_t *mem_regions_type, int num_regions);
+
+#define xe_get_memory_region_set(regions, mem_region_types...) ({ \
+	unsigned int arr__[] = { mem_region_types }; \
+	__xe_get_memory_region_set(regions, arr__, ARRAY_SIZE(arr__)); \
+})
+
+char *xe_memregion_dynamic_subtest_name(int xe, struct igt_collection *set);
+
+#endif /* XE_UTIL_H */
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 08/16] lib/xe_util: Add vm bind/unbind helper for Xe
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (6 preceding siblings ...)
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 07/16] lib/xe_util: Return dynamic subtest name for Xe Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-06 10:27   ` Karolina Stolarek
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 09/16] lib/intel_allocator: Add field to distinguish underlying driver Zbigniew Kempczyński
                   ` (9 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

Before calling exec we need to prepare the vm to contain valid entries.
Bind/unbind in xe expects a single bind_op or a vector of bind_ops,
which makes preparing it a little inconvenient. Add a function
which iterates over a list of xe_object (an auxiliary structure which
describes bind information for an object) and performs the bind/unbind
in one step. It also supports passing syncobj in/out to work in
pipelined executions.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 lib/xe/xe_util.c | 132 +++++++++++++++++++++++++++++++++++++++++++++++
 lib/xe/xe_util.h |  18 +++++++
 2 files changed, 150 insertions(+)

diff --git a/lib/xe/xe_util.c b/lib/xe/xe_util.c
index 448b3a3d27..b998a52e73 100644
--- a/lib/xe/xe_util.c
+++ b/lib/xe/xe_util.c
@@ -102,3 +102,135 @@ char *xe_memregion_dynamic_subtest_name(int xe, struct igt_collection *set)
 	return name;
 }
 
+#ifdef XEBINDDBG
+#define bind_info igt_info
+#define bind_debug igt_debug
+#else
+#define bind_info(...) {}
+#define bind_debug(...) {}
+#endif
+
+static struct drm_xe_vm_bind_op *xe_alloc_bind_ops(struct igt_list_head *obj_list,
+						   uint32_t *num_ops)
+{
+	struct drm_xe_vm_bind_op *bind_ops, *ops;
+	struct xe_object *obj;
+	uint32_t num_objects = 0, i = 0, op;
+
+	igt_list_for_each_entry(obj, obj_list, link)
+		num_objects++;
+
+	*num_ops = num_objects;
+	if (!num_objects) {
+		bind_info(" [nothing to bind]\n");
+		return NULL;
+	}
+
+	bind_ops = calloc(num_objects, sizeof(*bind_ops));
+	igt_assert(bind_ops);
+
+	igt_list_for_each_entry(obj, obj_list, link) {
+		ops = &bind_ops[i];
+
+		if (obj->bind_op == XE_OBJECT_BIND) {
+			op = XE_VM_BIND_OP_MAP | XE_VM_BIND_FLAG_ASYNC;
+			ops->obj = obj->handle;
+		} else {
+			op = XE_VM_BIND_OP_UNMAP | XE_VM_BIND_FLAG_ASYNC;
+		}
+
+		ops->op = op;
+		ops->obj_offset = 0;
+		ops->addr = obj->offset;
+		ops->range = obj->size;
+		ops->region = 0;
+
+		bind_info("  [%d]: [%6s] handle: %u, offset: %llx, size: %llx\n",
+			  i, obj->bind_op == XE_OBJECT_BIND ? "BIND" : "UNBIND",
+			  ops->obj, (long long)ops->addr, (long long)ops->range);
+		i++;
+	}
+
+	return bind_ops;
+}
+
+static void __xe_op_bind_async(int xe, uint32_t vm, uint32_t bind_engine,
+			       struct igt_list_head *obj_list,
+			       uint32_t sync_in, uint32_t sync_out)
+{
+	struct drm_xe_vm_bind_op *bind_ops;
+	struct drm_xe_sync tabsyncs[2] = {
+		{ .flags = DRM_XE_SYNC_SYNCOBJ, .handle = sync_in },
+		{ .flags = DRM_XE_SYNC_SYNCOBJ | DRM_XE_SYNC_SIGNAL, .handle = sync_out },
+	};
+	struct drm_xe_sync *syncs;
+	uint32_t num_binds = 0;
+	int num_syncs;
+
+	bind_info("[Binding to vm: %u]\n", vm);
+	bind_ops = xe_alloc_bind_ops(obj_list, &num_binds);
+
+	if (!num_binds) {
+		if (sync_out)
+			syncobj_signal(xe, &sync_out, 1);
+		return;
+	}
+
+	if (sync_in) {
+		syncs = tabsyncs;
+		num_syncs = 2;
+	} else {
+		syncs = &tabsyncs[1];
+		num_syncs = 1;
+	}
+
+	/* User didn't pass sync out, create it and wait for completion */
+	if (!sync_out)
+		tabsyncs[1].handle = syncobj_create(xe, 0);
+
+	bind_info("[Binding syncobjs: (in: %u, out: %u)]\n",
+		  tabsyncs[0].handle, tabsyncs[1].handle);
+
+	if (num_binds == 1) {
+		if ((bind_ops[0].op & 0xffff) == XE_VM_BIND_OP_MAP)
+			xe_vm_bind_async(xe, vm, bind_engine, bind_ops[0].obj, 0,
+					bind_ops[0].addr, bind_ops[0].range,
+					syncs, num_syncs);
+		else
+			xe_vm_unbind_async(xe, vm, bind_engine, 0,
+					   bind_ops[0].addr, bind_ops[0].range,
+					   syncs, num_syncs);
+	} else {
+		xe_vm_bind_array(xe, vm, bind_engine, bind_ops,
+				 num_binds, syncs, num_syncs);
+	}
+
+	if (!sync_out) {
+		igt_assert_eq(syncobj_wait_err(xe, &tabsyncs[1].handle, 1, INT64_MAX, 0), 0);
+		syncobj_destroy(xe, tabsyncs[1].handle);
+	}
+
+	free(bind_ops);
+}
+
+/**
+  * xe_bind_unbind_async:
+  * @xe: drm fd of Xe device
+  * @vm: vm to bind/unbind objects to/from
+  * @bind_engine: bind engine, 0 if default
+  * @obj_list: list of xe_object
+  * @sync_in: sync object (fence-in), 0 if there's no input dependency
+  * @sync_out: sync object (fence-out) to signal on bind/unbind completion,
+  *            if 0 wait for bind/unbind completion.
+  *
+  * Function iterates over xe_object @obj_list, prepares binding operation
+  * and does bind/unbind in one step. Providing sync_in / sync_out allows
+  * working in pipelined mode. With sync_in and sync_out set to 0 function
+  * waits until binding operation is complete.
+  */
+void xe_bind_unbind_async(int fd, uint32_t vm, uint32_t bind_engine,
+			  struct igt_list_head *obj_list,
+			  uint32_t sync_in, uint32_t sync_out)
+{
+	return __xe_op_bind_async(fd, vm, bind_engine, obj_list, sync_in, sync_out);
+}
diff --git a/lib/xe/xe_util.h b/lib/xe/xe_util.h
index 9f56fa9898..32f309923e 100644
--- a/lib/xe/xe_util.h
+++ b/lib/xe/xe_util.h
@@ -27,4 +27,22 @@ __xe_get_memory_region_set(int xe, uint32_t *mem_regions_type, int num_regions);
 
 char *xe_memregion_dynamic_subtest_name(int xe, struct igt_collection *set);
 
+enum xe_bind_op {
+	XE_OBJECT_BIND,
+	XE_OBJECT_UNBIND,
+};
+
+struct xe_object {
+	uint32_t handle;
+	uint64_t offset;
+	uint64_t size;
+	enum xe_bind_op bind_op;
+	void *priv;
+	struct igt_list_head link;
+};
+
+void xe_bind_unbind_async(int fd, uint32_t vm, uint32_t bind_engine,
+			  struct igt_list_head *obj_list,
+			  uint32_t sync_in, uint32_t sync_out);
+
 #endif /* XE_UTIL_H */
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 09/16] lib/intel_allocator: Add field to distinguish underlying driver
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (7 preceding siblings ...)
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 08/16] lib/xe_util: Add vm bind/unbind helper " Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-06 10:34   ` Karolina Stolarek
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 10/16] lib/intel_allocator: Add intel_allocator_bind() Zbigniew Kempczyński
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

Cache which driver is in use on the drm fd to avoid calling the same
detection code in allocator functions.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 lib/intel_allocator.c | 1 +
 lib/intel_allocator.h | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
index be24f8f2d0..228b33b92f 100644
--- a/lib/intel_allocator.c
+++ b/lib/intel_allocator.c
@@ -318,6 +318,7 @@ static struct intel_allocator *intel_allocator_create(int fd,
 
 	igt_assert(ial);
 
+	ial->driver = get_intel_driver(fd);
 	ial->type = allocator_type;
 	ial->strategy = allocator_strategy;
 	ial->default_alignment = default_alignment;
diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
index 3ec74f6191..1001b21b98 100644
--- a/lib/intel_allocator.h
+++ b/lib/intel_allocator.h
@@ -141,6 +141,9 @@ struct intel_allocator {
 	/* allocator's private structure */
 	void *priv;
 
+	/* driver - i915 or Xe */
+	enum intel_driver driver;
+
 	void (*get_address_range)(struct intel_allocator *ial,
 				  uint64_t *startp, uint64_t *endp);
 	uint64_t (*alloc)(struct intel_allocator *ial, uint32_t handle,
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 10/16] lib/intel_allocator: Add intel_allocator_bind()
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (8 preceding siblings ...)
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 09/16] lib/intel_allocator: Add field to distinguish underlying driver Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-06 13:02   ` Karolina Stolarek
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 11/16] lib/intel_ctx: Add xe context information Zbigniew Kempczyński
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

Synchronize allocator state to the vm.

This change allows an xe user to execute vm-bind/unbind for the
allocator alloc()/free() operations which have occurred since the last
binding/unbinding. Before doing exec the user should call
intel_allocator_bind() to ensure all vmas are in place.
Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
v2: Rewrite tracking mechanism: the previous code used a bind map
    embedded in the allocator structure. Unfortunately this wasn't
    a good idea - for xe binding everything was fine, but it regressed
    multiprocess/multithreaded allocations. The main reason was that
    child processes couldn't get a reference, as this memory was
    allocated on the allocator thread (a separate process). Currently
    each child contains its own separate tracking maps for ahnd and,
    for each ahnd, a bind map.
---
 lib/igt_core.c        |   5 +
 lib/intel_allocator.c | 259 +++++++++++++++++++++++++++++++++++++++++-
 lib/intel_allocator.h |   6 +-
 3 files changed, 265 insertions(+), 5 deletions(-)

diff --git a/lib/igt_core.c b/lib/igt_core.c
index 3ee3a01c36..6286e97b1b 100644
--- a/lib/igt_core.c
+++ b/lib/igt_core.c
@@ -74,6 +74,7 @@
 #include "igt_sysrq.h"
 #include "igt_rc.h"
 #include "igt_list.h"
+#include "igt_map.h"
 #include "igt_device_scan.h"
 #include "igt_thread.h"
 #include "runnercomms.h"
@@ -319,6 +320,8 @@ bool test_multi_fork_child;
 /* For allocator purposes */
 pid_t child_pid  = -1;
 __thread pid_t child_tid  = -1;
+struct igt_map *ahnd_map;
+pthread_mutex_t ahnd_map_mutex;
 
 enum {
 	/*
@@ -2509,6 +2512,8 @@ bool __igt_fork(void)
 	case 0:
 		test_child = true;
 		pthread_mutex_init(&print_mutex, NULL);
+		pthread_mutex_init(&ahnd_map_mutex, NULL);
+		ahnd_map = igt_map_create(igt_map_hash_64, igt_map_equal_64);
 		child_pid = getpid();
 		child_tid = -1;
 		exit_handler_count = 0;
diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
index 228b33b92f..02d3404abc 100644
--- a/lib/intel_allocator.c
+++ b/lib/intel_allocator.c
@@ -17,6 +17,7 @@
 #include "intel_allocator.h"
 #include "intel_allocator_msgchannel.h"
 #include "xe/xe_query.h"
+#include "xe/xe_util.h"
 
 //#define ALLOCDBG
 #ifdef ALLOCDBG
@@ -46,6 +47,14 @@ static inline const char *reqstr(enum reqtype request_type)
 #define alloc_debug(...) {}
 #endif
 
+#ifdef ALLOCBINDDBG
+#define bind_info igt_info
+#define bind_debug igt_debug
+#else
+#define bind_info(...) {}
+#define bind_debug(...) {}
+#endif
+
 /*
  * We limit allocator space to avoid hang when batch would be
  * pinned in the last page.
@@ -65,6 +74,31 @@ struct handle_entry {
 	struct allocator *al;
 };
 
+/* For tracking alloc()/free() for Xe */
+struct ahnd_info {
+	int fd;
+	uint64_t ahnd;
+	uint32_t ctx;
+	uint32_t vm;
+	enum intel_driver driver;
+	struct igt_map *bind_map;
+	pthread_mutex_t bind_map_mutex;
+};
+
+enum allocator_bind_op {
+	BOUND,
+	TO_BIND,
+	TO_UNBIND,
+};
+
+struct allocator_object {
+	uint32_t handle;
+	uint64_t offset;
+	uint64_t size;
+
+	enum allocator_bind_op bind_op;
+};
+
 struct intel_allocator *
 intel_allocator_reloc_create(int fd, uint64_t start, uint64_t end);
 struct intel_allocator *
@@ -123,6 +157,13 @@ static pid_t allocator_pid = -1;
 extern pid_t child_pid;
 extern __thread pid_t child_tid;
 
+/*
+ * Tracking alloc()/free() requires storing the data in the local process,
+ * which has access to the real drm fd it can work on.
+ */
+extern struct igt_map *ahnd_map;
+extern pthread_mutex_t ahnd_map_mutex;
+
 /*
  * - for parent process we have child_pid == -1
  * - for child which calls intel_allocator_init() allocator_pid == child_pid
@@ -318,7 +359,6 @@ static struct intel_allocator *intel_allocator_create(int fd,
 
 	igt_assert(ial);
 
-	ial->driver = get_intel_driver(fd);
 	ial->type = allocator_type;
 	ial->strategy = allocator_strategy;
 	ial->default_alignment = default_alignment;
@@ -893,6 +933,46 @@ void intel_allocator_multiprocess_stop(void)
 	}
 }
 
+static void track_ahnd(int fd, uint64_t ahnd, uint32_t ctx, uint32_t vm)
+{
+	struct ahnd_info *ainfo;
+
+	pthread_mutex_lock(&ahnd_map_mutex);
+	ainfo = igt_map_search(ahnd_map, &ahnd);
+	if (!ainfo) {
+		ainfo = malloc(sizeof(*ainfo));
+		ainfo->fd = fd;
+		ainfo->ahnd = ahnd;
+		ainfo->ctx = ctx;
+		ainfo->vm = vm;
+		ainfo->driver = get_intel_driver(fd);
+		ainfo->bind_map = igt_map_create(igt_map_hash_32, igt_map_equal_32);
+		pthread_mutex_init(&ainfo->bind_map_mutex, NULL);
+		bind_debug("[TRACK AHND] pid: %d, tid: %d, create <fd: %d, "
+			   "ahnd: %llx, ctx: %u, vm: %u, driver: %d, ahnd_map: %p, bind_map: %p>\n",
+			   getpid(), gettid(), ainfo->fd,
+			   (long long)ainfo->ahnd, ainfo->ctx, ainfo->vm,
+			   ainfo->driver, ahnd_map, ainfo->bind_map);
+		igt_map_insert(ahnd_map, &ainfo->ahnd, ainfo);
+	}
+
+	pthread_mutex_unlock(&ahnd_map_mutex);
+}
+
+static void untrack_ahnd(uint64_t ahnd)
+{
+	struct ahnd_info *ainfo;
+
+	pthread_mutex_lock(&ahnd_map_mutex);
+	ainfo = igt_map_search(ahnd_map, &ahnd);
+	if (ainfo) {
+		bind_debug("[UNTRACK AHND]: pid: %d, tid: %d, removing ahnd: %llx\n",
+			   getpid(), gettid(), (long long)ahnd);
+		igt_map_remove(ahnd_map, &ahnd, map_entry_free_func);
+	}
+	pthread_mutex_unlock(&ahnd_map_mutex);
+}
+
 static uint64_t __intel_allocator_open_full(int fd, uint32_t ctx,
 					    uint32_t vm,
 					    uint64_t start, uint64_t end,
@@ -951,6 +1031,8 @@ static uint64_t __intel_allocator_open_full(int fd, uint32_t ctx,
 	igt_assert(resp.open.allocator_handle);
 	igt_assert(resp.response_type == RESP_OPEN);
 
+	track_ahnd(fd, resp.open.allocator_handle, ctx, vm);
+
 	return resp.open.allocator_handle;
 }
 
@@ -1057,6 +1139,8 @@ bool intel_allocator_close(uint64_t allocator_handle)
 	igt_assert(handle_request(&req, &resp) == 0);
 	igt_assert(resp.response_type == RESP_CLOSE);
 
+	untrack_ahnd(allocator_handle);
+
 	return resp.close.is_empty;
 }
 
@@ -1090,6 +1174,76 @@ void intel_allocator_get_address_range(uint64_t allocator_handle,
 		*endp = resp.address_range.end;
 }
 
+static bool is_same(struct allocator_object *obj,
+		    uint32_t handle, uint64_t offset, uint64_t size,
+		    enum allocator_bind_op bind_op)
+{
+	return obj->handle == handle &&	obj->offset == offset && obj->size == size &&
+	       (obj->bind_op == bind_op || obj->bind_op == BOUND);
+}
+
+static void track_object(uint64_t allocator_handle, uint32_t handle,
+			 uint64_t offset, uint64_t size,
+			 enum allocator_bind_op bind_op)
+{
+	struct ahnd_info *ainfo;
+	struct allocator_object *obj;
+
+	bind_debug("[TRACK OBJECT]: [%s] pid: %d, tid: %d, ahnd: %llx, handle: %u, offset: %llx, size: %llx\n",
+		   bind_op == TO_BIND ? "BIND" : "UNBIND",
+		   getpid(), gettid(),
+		   (long long)allocator_handle,
+		   handle, (long long)offset, (long long)size);
+
+	if (offset == ALLOC_INVALID_ADDRESS) {
+		bind_debug("[TRACK OBJECT] => invalid address %llx, skipping tracking\n",
+			   (long long)offset);
+		return;
+	}
+
+	pthread_mutex_lock(&ahnd_map_mutex);
+	ainfo = igt_map_search(ahnd_map, &allocator_handle);
+	pthread_mutex_unlock(&ahnd_map_mutex);
+	if (!ainfo) {
+		igt_warn("[TRACK OBJECT] => MISSING ahnd %llx <=\n", (long long)allocator_handle);
+		igt_assert(ainfo);
+	}
+
+	if (ainfo->driver == INTEL_DRIVER_I915)
+		return; /* no-op for i915, at least now */
+
+	pthread_mutex_lock(&ainfo->bind_map_mutex);
+	obj = igt_map_search(ainfo->bind_map, &handle);
+	if (obj) {
+		/*
+		 * The user may call alloc() a couple of times; check the
+		 * object is the same. For free() the case is simple: just
+		 * remove it from the bind_map.
+		 */
+		if (bind_op == TO_BIND)
+			igt_assert_eq(is_same(obj, handle, offset, size, bind_op), true);
+		else if (bind_op == TO_UNBIND) {
+			if (obj->bind_op == TO_BIND)
+				igt_map_remove(ainfo->bind_map, &obj->handle, map_entry_free_func);
+			else if (obj->bind_op == BOUND)
+				obj->bind_op = bind_op;
+		}
+	} else {
+		/* Ignore unbinding a bo which wasn't previously inserted */
+		if (bind_op == TO_UNBIND)
+			goto out;
+
+		obj = calloc(1, sizeof(*obj));
+		obj->handle = handle;
+		obj->offset = offset;
+		obj->size = size;
+		obj->bind_op = bind_op;
+		igt_map_insert(ainfo->bind_map, &obj->handle, obj);
+	}
+out:
+	pthread_mutex_unlock(&ainfo->bind_map_mutex);
+}
+
 /**
  * __intel_allocator_alloc:
  * @allocator_handle: handle to an allocator
@@ -1121,6 +1275,8 @@ uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
 	igt_assert(handle_request(&req, &resp) == 0);
 	igt_assert(resp.response_type == RESP_ALLOC);
 
+	track_object(allocator_handle, handle, resp.alloc.offset, size, TO_BIND);
+
 	return resp.alloc.offset;
 }
 
@@ -1198,6 +1354,8 @@ bool intel_allocator_free(uint64_t allocator_handle, uint32_t handle)
 	igt_assert(handle_request(&req, &resp) == 0);
 	igt_assert(resp.response_type == RESP_FREE);
 
+	track_object(allocator_handle, handle, 0, 0, TO_UNBIND);
+
 	return resp.free.freed;
 }
 
@@ -1382,6 +1540,83 @@ void intel_allocator_print(uint64_t allocator_handle)
 	}
 }
 
+static void __xe_op_bind(struct ahnd_info *ainfo, uint32_t sync_in, uint32_t sync_out)
+{
+	struct allocator_object *obj;
+	struct igt_map_entry *pos;
+	struct igt_list_head obj_list;
+	struct xe_object *entry, *tmp;
+
+	IGT_INIT_LIST_HEAD(&obj_list);
+
+	pthread_mutex_lock(&ainfo->bind_map_mutex);
+	igt_map_foreach(ainfo->bind_map, pos) {
+		obj = pos->data;
+
+		if (obj->bind_op == BOUND)
+			continue;
+
+		bind_info("= [vm: %u] %s => %u %lx %lx\n",
+			  ainfo->ctx,
+			  obj->bind_op == TO_BIND ? "TO BIND" : "TO UNBIND",
+			  obj->handle, obj->offset,
+			  obj->size);
+
+		entry = malloc(sizeof(*entry));
+		entry->handle = obj->handle;
+		entry->offset = obj->offset;
+		entry->size = obj->size;
+		entry->bind_op = obj->bind_op == TO_BIND ? XE_OBJECT_BIND :
+							   XE_OBJECT_UNBIND;
+		entry->priv = obj;
+		igt_list_add(&entry->link, &obj_list);
+	}
+	pthread_mutex_unlock(&ainfo->bind_map_mutex);
+
+	xe_bind_unbind_async(ainfo->fd, ainfo->ctx, 0, &obj_list, sync_in, sync_out);
+
+	pthread_mutex_lock(&ainfo->bind_map_mutex);
+	igt_list_for_each_entry_safe(entry, tmp, &obj_list, link) {
+		obj = entry->priv;
+		if (obj->bind_op == TO_BIND)
+			obj->bind_op = BOUND;
+		else
+			igt_map_remove(ainfo->bind_map, &obj->handle, map_entry_free_func);
+
+		igt_list_del(&entry->link);
+		free(entry);
+	}
+	pthread_mutex_unlock(&ainfo->bind_map_mutex);
+}
+
+/**
+ * intel_allocator_bind:
+ * @allocator_handle: handle to an allocator
+ * @sync_in: syncobj (fence-in)
+ * @sync_out: syncobj (fence-out)
+ *
+ * Function binds and unbinds all objects added to the allocator which weren't
+ * previously bound/unbound.
+ *
+ **/
+void intel_allocator_bind(uint64_t allocator_handle,
+			  uint32_t sync_in, uint32_t sync_out)
+{
+	struct ahnd_info *ainfo;
+
+	pthread_mutex_lock(&ahnd_map_mutex);
+	ainfo = igt_map_search(ahnd_map, &allocator_handle);
+	pthread_mutex_unlock(&ahnd_map_mutex);
+	igt_assert(ainfo);
+
+	/*
+	 * We collect bind/unbind operations on alloc()/free() and submit them
+	 * as a group operation, taking @sync_in as a syncobj handle (fence-in).
+	 * If the user passes 0 as @sync_out we bind/unbind synchronously.
+	 */
+	__xe_op_bind(ainfo, sync_in, sync_out);
+}
+
 static int equal_handles(const void *key1, const void *key2)
 {
 	const struct handle_entry *h1 = key1, *h2 = key2;
@@ -1439,6 +1674,23 @@ static void __free_maps(struct igt_map *map, bool close_allocators)
 	igt_map_destroy(map, map_entry_free_func);
 }
 
+static void __free_ahnd_map(void)
+{
+	struct igt_map_entry *pos;
+	struct ahnd_info *ainfo;
+
+	if (!ahnd_map)
+		return;
+
+	igt_map_foreach(ahnd_map, pos) {
+		ainfo = pos->data;
+		igt_map_destroy(ainfo->bind_map, map_entry_free_func);
+	}
+
+	igt_map_destroy(ahnd_map, map_entry_free_func);
+}
+
+
 /**
  * intel_allocator_init:
  *
@@ -1456,12 +1708,15 @@ void intel_allocator_init(void)
 	__free_maps(handles, true);
 	__free_maps(ctx_map, false);
 	__free_maps(vm_map, false);
+	__free_ahnd_map();
 
 	atomic_init(&next_handle, 1);
 	handles = igt_map_create(hash_handles, equal_handles);
 	ctx_map = igt_map_create(hash_instance, equal_ctx);
 	vm_map = igt_map_create(hash_instance, equal_vm);
-	igt_assert(handles && ctx_map && vm_map);
+	pthread_mutex_init(&ahnd_map_mutex, NULL);
+	ahnd_map = igt_map_create(igt_map_hash_64, igt_map_equal_64);
+	igt_assert(handles && ctx_map && vm_map && ahnd_map);
 
 	channel = intel_allocator_get_msgchannel(CHANNEL_SYSVIPC_MSGQUEUE);
 }
diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
index 1001b21b98..f9ff7f1cc9 100644
--- a/lib/intel_allocator.h
+++ b/lib/intel_allocator.h
@@ -141,9 +141,6 @@ struct intel_allocator {
 	/* allocator's private structure */
 	void *priv;
 
-	/* driver - i915 or Xe */
-	enum intel_driver driver;
-
 	void (*get_address_range)(struct intel_allocator *ial,
 				  uint64_t *startp, uint64_t *endp);
 	uint64_t (*alloc)(struct intel_allocator *ial, uint32_t handle,
@@ -213,6 +210,9 @@ bool intel_allocator_reserve_if_not_allocated(uint64_t allocator_handle,
 
 void intel_allocator_print(uint64_t allocator_handle);
 
+void intel_allocator_bind(uint64_t allocator_handle,
+			  uint32_t sync_in, uint32_t sync_out);
+
 #define ALLOC_INVALID_ADDRESS (-1ull)
 #define INTEL_ALLOCATOR_NONE   0
 #define INTEL_ALLOCATOR_RELOC  1
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [igt-dev] [PATCH i-g-t v2 11/16] lib/intel_ctx: Add xe context information
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (9 preceding siblings ...)
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 10/16] lib/intel_allocator: Add intel_allocator_bind() Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-07  8:31   ` Karolina Stolarek
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 12/16] lib/intel_blt: Introduce blt_copy_init() helper to cache driver Zbigniew Kempczyński
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

The most complicated part of adapting i915_blt to intel_blt - which
should handle both drivers - is how to achieve pipelined execution. By
pipelined execution I mean that all gpu workloads are executed without
stalls.

Compared to i915 relocations and softpinning, the xe architecture
migrates the binding (which also means the unbind operation)
responsibility from the kernel to the user via the vm_bind ioctl().
To avoid stalls the user has to provide in/out fences (syncobjs)
between consecutive binds/execs. Of course for many igt tests we don't
need pipelined execution, just a synchronous bind, then exec. But
exercising the driver should also cover pipelining to verify it is
possible to work without stalls.

I decided to extend intel_ctx_t with all the objects necessary for xe
(vm, engine, syncobjs) to get flexibility in deciding how to bind,
execute and wait for (synchronize) those operations. The context
object along with the i915 engine is already passed to the blitter
library, so adding the xe fields doesn't break i915 but allows the xe
path to get all the data necessary to execute.

Using intel_ctx with xe requires some code patterns caused by the use
of an allocator. For xe the allocator now tracks alloc()/free()
operations to do bind/unbind in one call just before execution. I've
added two helpers to intel_ctx - intel_ctx_xe_exec() and
intel_ctx_xe_sync(). Depending on how the intel_ctx was created (with
0 or real syncobj handles as in/bind/out fences), bind and exec in
intel_ctx_xe_exec() are pipelined, but the last operation (exec) is
synchronized. With real syncobjs they are used to join the bind + exec
calls, but there's no wait for exec (sync-out) completion. This allows
building more cascaded bind + exec operations without stalls.

To wait for a sync-out fence the caller may use intel_ctx_xe_sync(),
which does a synchronous wait on the syncobj. It also allows the user
to reset the fences to prepare for the next operation.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 lib/intel_ctx.c | 110 +++++++++++++++++++++++++++++++++++++++++++++++-
 lib/intel_ctx.h |  14 ++++++
 2 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/lib/intel_ctx.c b/lib/intel_ctx.c
index ded9c0f1e4..f210907fac 100644
--- a/lib/intel_ctx.c
+++ b/lib/intel_ctx.c
@@ -5,9 +5,12 @@
 
 #include <stddef.h>
 
+#include "i915/gem_engine_topology.h"
+#include "igt_syncobj.h"
+#include "intel_allocator.h"
 #include "intel_ctx.h"
 #include "ioctl_wrappers.h"
-#include "i915/gem_engine_topology.h"
+#include "xe/xe_ioctl.h"
 
 /**
  * SECTION:intel_ctx
@@ -390,3 +393,108 @@ unsigned int intel_ctx_engine_class(const intel_ctx_t *ctx, unsigned int engine)
 {
 	return intel_ctx_cfg_engine_class(&ctx->cfg, engine);
 }
+
+/**
+ * intel_ctx_xe:
+ * @fd: open xe drm file descriptor
+ * @vm: vm
+ * @engine: engine
+ *
+ * Returns an intel_ctx_t representing the xe context.
+ */
+intel_ctx_t *intel_ctx_xe(int fd, uint32_t vm, uint32_t engine,
+			  uint32_t sync_in, uint32_t sync_bind, uint32_t sync_out)
+{
+	intel_ctx_t *ctx;
+
+	ctx = calloc(1, sizeof(*ctx));
+	igt_assert(ctx);
+
+	ctx->fd = fd;
+	ctx->vm = vm;
+	ctx->engine = engine;
+	ctx->sync_in = sync_in;
+	ctx->sync_bind = sync_bind;
+	ctx->sync_out = sync_out;
+
+	return ctx;
+}
+
+static int __xe_exec(int fd, struct drm_xe_exec *exec)
+{
+	int err = 0;
+
+	if (igt_ioctl(fd, DRM_IOCTL_XE_EXEC, exec)) {
+		err = -errno;
+		igt_assume(err != 0);
+	}
+	errno = 0;
+	return err;
+}
+
+int __intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset)
+{
+	struct drm_xe_sync syncs[2] = {
+		{ .flags = DRM_XE_SYNC_SYNCOBJ, },
+		{ .flags = DRM_XE_SYNC_SYNCOBJ | DRM_XE_SYNC_SIGNAL, },
+	};
+	struct drm_xe_exec exec = {
+		.engine_id = ctx->engine,
+		.syncs = (uintptr_t)syncs,
+		.num_syncs = 2,
+		.address = bb_offset,
+		.num_batch_buffer = 1,
+	};
+	uint32_t sync_in = ctx->sync_in;
+	uint32_t sync_bind = ctx->sync_bind ?: syncobj_create(ctx->fd, 0);
+	uint32_t sync_out = ctx->sync_out ?: syncobj_create(ctx->fd, 0);
+	int ret;
+
+	/* Synchronize allocator state -> vm */
+	intel_allocator_bind(ahnd, sync_in, sync_bind);
+
+	/* Pipelined exec */
+	syncs[0].handle = sync_bind;
+	syncs[1].handle = sync_out;
+
+	ret = __xe_exec(ctx->fd, &exec);
+	if (ret)
+		goto err;
+
+	if (!ctx->sync_bind || !ctx->sync_out)
+		syncobj_wait_err(ctx->fd, &sync_out, 1, INT64_MAX, 0);
+
+err:
+	if (!ctx->sync_bind)
+		syncobj_destroy(ctx->fd, sync_bind);
+
+	if (!ctx->sync_out)
+		syncobj_destroy(ctx->fd, sync_out);
+
+	return ret;
+}
+
+void intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset)
+{
+	igt_assert_eq(__intel_ctx_xe_exec(ctx, ahnd, bb_offset), 0);
+}
+
+#define RESET_SYNCOBJ(__fd, __sync) do { \
+	if (__sync) \
+		syncobj_reset((__fd), &(__sync), 1); \
+} while (0)
+
+int intel_ctx_xe_sync(intel_ctx_t *ctx, bool reset_syncs)
+{
+	int ret;
+
+	ret = syncobj_wait_err(ctx->fd, &ctx->sync_out, 1, INT64_MAX, 0);
+
+	if (reset_syncs) {
+		RESET_SYNCOBJ(ctx->fd, ctx->sync_in);
+		RESET_SYNCOBJ(ctx->fd, ctx->sync_bind);
+		RESET_SYNCOBJ(ctx->fd, ctx->sync_out);
+	}
+
+	return ret;
+}
diff --git a/lib/intel_ctx.h b/lib/intel_ctx.h
index 3cfeaae81e..59d0360ada 100644
--- a/lib/intel_ctx.h
+++ b/lib/intel_ctx.h
@@ -67,6 +67,14 @@ int intel_ctx_cfg_engine_class(const intel_ctx_cfg_t *cfg, unsigned int engine);
 typedef struct intel_ctx {
 	uint32_t id;
 	intel_ctx_cfg_t cfg;
+
+	/* Xe */
+	int fd;
+	uint32_t vm;
+	uint32_t engine;
+	uint32_t sync_in;
+	uint32_t sync_bind;
+	uint32_t sync_out;
 } intel_ctx_t;
 
 int __intel_ctx_create(int fd, const intel_ctx_cfg_t *cfg,
@@ -81,4 +89,10 @@ void intel_ctx_destroy(int fd, const intel_ctx_t *ctx);
 
 unsigned int intel_ctx_engine_class(const intel_ctx_t *ctx, unsigned int engine);
 
+intel_ctx_t *intel_ctx_xe(int fd, uint32_t vm, uint32_t engine,
+			  uint32_t sync_in, uint32_t sync_bind, uint32_t sync_out);
+int __intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset);
+void intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset);
+int intel_ctx_xe_sync(intel_ctx_t *ctx, bool reset_syncs);
+
 #endif
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [igt-dev] [PATCH i-g-t v2 12/16] lib/intel_blt: Introduce blt_copy_init() helper to cache driver
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (10 preceding siblings ...)
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 11/16] lib/intel_ctx: Add xe context information Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-07  8:51   ` Karolina Stolarek
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 13/16] lib/intel_blt: Extend blitter library to support xe driver Zbigniew Kempczyński
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

Instead of calling is_xe_device() and is_i915_device() multiple times
in code which distinguishes between the xe and i915 paths, add a
driver field to the structures used in the blitter library.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 lib/igt_fb.c                   |  2 +-
 lib/intel_blt.c                | 40 +++++++++++++++++++++++++++++++---
 lib/intel_blt.h                |  8 ++++++-
 tests/i915/gem_ccs.c           | 34 ++++++++++++++++-------------
 tests/i915/gem_exercise_blt.c  | 22 ++++++++++---------
 tests/i915/gem_lmem_swapping.c |  4 ++--
 6 files changed, 78 insertions(+), 32 deletions(-)

diff --git a/lib/igt_fb.c b/lib/igt_fb.c
index a8988274f2..1814e8db11 100644
--- a/lib/igt_fb.c
+++ b/lib/igt_fb.c
@@ -2900,7 +2900,7 @@ static void blitcopy(const struct igt_fb *dst_fb,
 			src = blt_fb_init(src_fb, i, mem_region);
 			dst = blt_fb_init(dst_fb, i, mem_region);
 
-			memset(&blt, 0, sizeof(blt));
+			blt_copy_init(src_fb->fd, &blt);
 			blt.color_depth = blt_get_bpp(src_fb);
 			blt_set_copy_object(&blt.src, src);
 			blt_set_copy_object(&blt.dst, dst);
diff --git a/lib/intel_blt.c b/lib/intel_blt.c
index bc28f15e8d..f2f86e4947 100644
--- a/lib/intel_blt.c
+++ b/lib/intel_blt.c
@@ -692,6 +692,22 @@ static void dump_bb_ext(struct gen12_block_copy_data_ext *data)
 		 data->dw21.src_array_index);
 }
 
+/**
+ * blt_copy_init:
+ * @fd: drm fd
+ * @blt: structure for initialization
+ *
+ * Function zeroes @blt and sets the fd and driver fields (INTEL_DRIVER_I915
+ * or INTEL_DRIVER_XE).
+ */
+void blt_copy_init(int fd, struct blt_copy_data *blt)
+{
+	memset(blt, 0, sizeof(*blt));
+
+	blt->fd = fd;
+	blt->driver = get_intel_driver(fd);
+}
+
 /**
  * emit_blt_block_copy:
  * @fd: drm fd
@@ -889,6 +905,22 @@ static void dump_bb_surf_ctrl_cmd(const struct gen12_ctrl_surf_copy_data *data)
 		 cmd[4], data->dw04.dst_address_hi, data->dw04.dst_mocs);
 }
 
+/**
+ * blt_ctrl_surf_copy_init:
+ * @fd: drm fd
+ * @surf: structure for initialization
+ *
+ * Function zeroes @surf and sets the fd and driver fields (INTEL_DRIVER_I915
+ * or INTEL_DRIVER_XE).
+ */
+void blt_ctrl_surf_copy_init(int fd, struct blt_ctrl_surf_copy_data *surf)
+{
+	memset(surf, 0, sizeof(*surf));
+
+	surf->fd = fd;
+	surf->driver = get_intel_driver(fd);
+}
+
 /**
  * emit_blt_ctrl_surf_copy:
  * @fd: drm fd
@@ -1317,7 +1349,7 @@ void blt_set_batch(struct blt_copy_batch *batch,
 }
 
 struct blt_copy_object *
-blt_create_object(int fd, uint32_t region,
+blt_create_object(const struct blt_copy_data *blt, uint32_t region,
 		  uint32_t width, uint32_t height, uint32_t bpp, uint8_t mocs,
 		  enum blt_tiling_type tiling,
 		  enum blt_compression compression,
@@ -1329,10 +1361,12 @@ blt_create_object(int fd, uint32_t region,
 	uint32_t stride = tiling == T_LINEAR ? width * 4 : width;
 	uint32_t handle;
 
+	igt_assert_f(blt->driver, "Driver isn't set, have you called blt_copy_init()?\n");
+
 	obj = calloc(1, sizeof(*obj));
 
 	obj->size = size;
-	igt_assert(__gem_create_in_memory_regions(fd, &handle,
+	igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
 						  &size, region) == 0);
 
 	blt_set_object(obj, handle, size, region, mocs, tiling,
@@ -1340,7 +1374,7 @@ blt_create_object(int fd, uint32_t region,
 	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
 
 	if (create_mapping)
-		obj->ptr = gem_mmap__device_coherent(fd, handle, 0, size,
+		obj->ptr = gem_mmap__device_coherent(blt->fd, handle, 0, size,
 						     PROT_READ | PROT_WRITE);
 
 	return obj;
diff --git a/lib/intel_blt.h b/lib/intel_blt.h
index 9c4ddc7a89..7516ce8ac7 100644
--- a/lib/intel_blt.h
+++ b/lib/intel_blt.h
@@ -102,6 +102,7 @@ struct blt_copy_batch {
 /* Common for block-copy and fast-copy */
 struct blt_copy_data {
 	int fd;
+	enum intel_driver driver;
 	struct blt_copy_object src;
 	struct blt_copy_object dst;
 	struct blt_copy_batch bb;
@@ -155,6 +156,7 @@ struct blt_ctrl_surf_copy_object {
 
 struct blt_ctrl_surf_copy_data {
 	int fd;
+	enum intel_driver driver;
 	struct blt_ctrl_surf_copy_object src;
 	struct blt_ctrl_surf_copy_object dst;
 	struct blt_copy_batch bb;
@@ -185,6 +187,8 @@ bool blt_uses_extended_block_copy(int fd);
 
 const char *blt_tiling_name(enum blt_tiling_type tiling);
 
+void blt_copy_init(int fd, struct blt_copy_data *blt);
+
 uint64_t emit_blt_block_copy(int fd,
 			     uint64_t ahnd,
 			     const struct blt_copy_data *blt,
@@ -205,6 +209,8 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
 				 uint64_t bb_pos,
 				 bool emit_bbe);
 
+void blt_ctrl_surf_copy_init(int fd, struct blt_ctrl_surf_copy_data *surf);
+
 int blt_ctrl_surf_copy(int fd,
 		       const intel_ctx_t *ctx,
 		       const struct intel_execution_engine2 *e,
@@ -230,7 +236,7 @@ void blt_set_batch(struct blt_copy_batch *batch,
 		   uint32_t handle, uint64_t size, uint32_t region);
 
 struct blt_copy_object *
-blt_create_object(int fd, uint32_t region,
+blt_create_object(const struct blt_copy_data *blt, uint32_t region,
 		  uint32_t width, uint32_t height, uint32_t bpp, uint8_t mocs,
 		  enum blt_tiling_type tiling,
 		  enum blt_compression compression,
diff --git a/tests/i915/gem_ccs.c b/tests/i915/gem_ccs.c
index f9ad9267df..d9d785ed9b 100644
--- a/tests/i915/gem_ccs.c
+++ b/tests/i915/gem_ccs.c
@@ -167,7 +167,7 @@ static void surf_copy(int i915,
 	ccs = gem_create(i915, ccssize);
 	ccs2 = gem_create(i915, ccssize);
 
-	surf.fd = i915;
+	blt_ctrl_surf_copy_init(i915, &surf);
 	surf.print_bb = param.print_bb;
 	set_surf_object(&surf.src, mid->handle, mid->region, mid->size,
 			uc_mocs, BLT_INDIRECT_ACCESS);
@@ -219,7 +219,7 @@ static void surf_copy(int i915,
 			uc_mocs, INDIRECT_ACCESS);
 	blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
 
-	memset(&blt, 0, sizeof(blt));
+	blt_copy_init(i915, &blt);
 	blt.color_depth = CD_32bit;
 	blt.print_bb = param.print_bb;
 	blt_set_copy_object(&blt.src, mid);
@@ -310,7 +310,7 @@ static int blt_block_copy3(int i915,
 	bb_offset = get_offset(ahnd, blt3->bb.handle, blt3->bb.size, alignment);
 
 	/* First blit src -> mid */
-	memset(&blt0, 0, sizeof(blt0));
+	blt_copy_init(i915, &blt0);
 	blt0.src = blt3->src;
 	blt0.dst = blt3->mid;
 	blt0.bb = blt3->bb;
@@ -321,7 +321,7 @@ static int blt_block_copy3(int i915,
 	bb_pos = emit_blt_block_copy(i915, ahnd, &blt0, &ext0, bb_pos, false);
 
 	/* Second blit mid -> dst */
-	memset(&blt0, 0, sizeof(blt0));
+	blt_copy_init(i915, &blt0);
 	blt0.src = blt3->mid;
 	blt0.dst = blt3->dst;
 	blt0.bb = blt3->bb;
@@ -332,7 +332,7 @@ static int blt_block_copy3(int i915,
 	bb_pos = emit_blt_block_copy(i915, ahnd, &blt0, &ext0, bb_pos, false);
 
 	/* Third blit dst -> final */
-	memset(&blt0, 0, sizeof(blt0));
+	blt_copy_init(i915, &blt0);
 	blt0.src = blt3->dst;
 	blt0.dst = blt3->final;
 	blt0.bb = blt3->bb;
@@ -390,11 +390,13 @@ static void block_copy(int i915,
 	if (!blt_uses_extended_block_copy(i915))
 		pext = NULL;
 
-	src = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
+	blt_copy_init(i915, &blt);
+
+	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
 				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
-	mid = blt_create_object(i915, mid_region, width, height, bpp, uc_mocs,
+	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
 				mid_tiling, mid_compression, comp_type, true);
-	dst = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
+	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
 				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
 	igt_assert(src->size == dst->size);
 	PRINT_SURFACE_INFO("src", src);
@@ -404,7 +406,6 @@ static void block_copy(int i915,
 	blt_surface_fill_rect(i915, src, width, height);
 	WRITE_PNG(i915, run_id, "src", src, width, height);
 
-	memset(&blt, 0, sizeof(blt));
 	blt.color_depth = CD_32bit;
 	blt.print_bb = param.print_bb;
 	blt_set_copy_object(&blt.src, src);
@@ -449,7 +450,7 @@ static void block_copy(int i915,
 		}
 	}
 
-	memset(&blt, 0, sizeof(blt));
+	blt_copy_init(i915, &blt);
 	blt.color_depth = CD_32bit;
 	blt.print_bb = param.print_bb;
 	blt_set_copy_object(&blt.src, mid);
@@ -486,6 +487,7 @@ static void block_multicopy(int i915,
 			    const struct test_config *config)
 {
 	struct blt_copy3_data blt3 = {};
+	struct blt_copy_data blt = {};
 	struct blt_block_copy3_data_ext ext3 = {}, *pext3 = &ext3;
 	struct blt_copy_object *src, *mid, *dst, *final;
 	const uint32_t bpp = 32;
@@ -505,13 +507,16 @@ static void block_multicopy(int i915,
 	if (!blt_uses_extended_block_copy(i915))
 		pext3 = NULL;
 
-	src = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
+	/* For object creation */
+	blt_copy_init(i915, &blt);
+
+	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
 				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
-	mid = blt_create_object(i915, mid_region, width, height, bpp, uc_mocs,
+	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
 				mid_tiling, mid_compression, comp_type, true);
-	dst = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
+	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
 				mid_tiling, COMPRESSION_DISABLED, comp_type, true);
-	final = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
+	final = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
 				  T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
 	igt_assert(src->size == dst->size);
 	PRINT_SURFACE_INFO("src", src);
@@ -521,7 +526,6 @@ static void block_multicopy(int i915,
 
 	blt_surface_fill_rect(i915, src, width, height);
 
-	memset(&blt3, 0, sizeof(blt3));
 	blt3.color_depth = CD_32bit;
 	blt3.print_bb = param.print_bb;
 	blt_set_copy_object(&blt3.src, src);
diff --git a/tests/i915/gem_exercise_blt.c b/tests/i915/gem_exercise_blt.c
index 0cd1820430..7355eabbe9 100644
--- a/tests/i915/gem_exercise_blt.c
+++ b/tests/i915/gem_exercise_blt.c
@@ -89,7 +89,7 @@ static int fast_copy_one_bb(int i915,
 	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
 
 	/* First blit */
-	memset(&blt_tmp, 0, sizeof(blt_tmp));
+	blt_copy_init(i915, &blt_tmp);
 	blt_tmp.src = blt->src;
 	blt_tmp.dst = blt->mid;
 	blt_tmp.bb = blt->bb;
@@ -98,7 +98,7 @@ static int fast_copy_one_bb(int i915,
 	bb_pos = emit_blt_fast_copy(i915, ahnd, &blt_tmp, bb_pos, false);
 
 	/* Second blit */
-	memset(&blt_tmp, 0, sizeof(blt_tmp));
+	blt_copy_init(i915, &blt_tmp);
 	blt_tmp.src = blt->mid;
 	blt_tmp.dst = blt->dst;
 	blt_tmp.bb = blt->bb;
@@ -140,6 +140,7 @@ static void fast_copy_emit(int i915, const intel_ctx_t *ctx,
 			   uint32_t region1, uint32_t region2,
 			   enum blt_tiling_type mid_tiling)
 {
+	struct blt_copy_data bltinit = {};
 	struct blt_fast_copy_data blt = {};
 	struct blt_copy_object *src, *mid, *dst;
 	const uint32_t bpp = 32;
@@ -152,11 +153,12 @@ static void fast_copy_emit(int i915, const intel_ctx_t *ctx,
 
 	igt_assert(__gem_create_in_memory_regions(i915, &bb, &bb_size, region1) == 0);
 
-	src = blt_create_object(i915, region1, width, height, bpp, 0,
+	blt_copy_init(i915, &bltinit);
+	src = blt_create_object(&bltinit, region1, width, height, bpp, 0,
 				T_LINEAR, COMPRESSION_DISABLED, 0, true);
-	mid = blt_create_object(i915, region2, width, height, bpp, 0,
+	mid = blt_create_object(&bltinit, region2, width, height, bpp, 0,
 				mid_tiling, COMPRESSION_DISABLED, 0, true);
-	dst = blt_create_object(i915, region1, width, height, bpp, 0,
+	dst = blt_create_object(&bltinit, region1, width, height, bpp, 0,
 				T_LINEAR, COMPRESSION_DISABLED, 0, true);
 	igt_assert(src->size == dst->size);
 
@@ -212,17 +214,17 @@ static void fast_copy(int i915, const intel_ctx_t *ctx,
 
 	igt_assert(__gem_create_in_memory_regions(i915, &bb, &bb_size, region1) == 0);
 
-	src = blt_create_object(i915, region1, width, height, bpp, 0,
+	blt_copy_init(i915, &blt);
+	src = blt_create_object(&blt, region1, width, height, bpp, 0,
 				T_LINEAR, COMPRESSION_DISABLED, 0, true);
-	mid = blt_create_object(i915, region2, width, height, bpp, 0,
+	mid = blt_create_object(&blt, region2, width, height, bpp, 0,
 				mid_tiling, COMPRESSION_DISABLED, 0, true);
-	dst = blt_create_object(i915, region1, width, height, bpp, 0,
+	dst = blt_create_object(&blt, region1, width, height, bpp, 0,
 				T_LINEAR, COMPRESSION_DISABLED, 0, true);
 	igt_assert(src->size == dst->size);
 
 	blt_surface_fill_rect(i915, src, width, height);
 
-	memset(&blt, 0, sizeof(blt));
 	blt.color_depth = CD_32bit;
 	blt.print_bb = param.print_bb;
 	blt_set_copy_object(&blt.src, src);
@@ -235,7 +237,7 @@ static void fast_copy(int i915, const intel_ctx_t *ctx,
 	WRITE_PNG(i915, mid_tiling, "src", &blt.src, width, height);
 	WRITE_PNG(i915, mid_tiling, "mid", &blt.dst, width, height);
 
-	memset(&blt, 0, sizeof(blt));
+	blt_copy_init(i915, &blt);
 	blt.color_depth = CD_32bit;
 	blt.print_bb = param.print_bb;
 	blt_set_copy_object(&blt.src, mid);
diff --git a/tests/i915/gem_lmem_swapping.c b/tests/i915/gem_lmem_swapping.c
index 83dbebec83..2921de8f9f 100644
--- a/tests/i915/gem_lmem_swapping.c
+++ b/tests/i915/gem_lmem_swapping.c
@@ -308,7 +308,7 @@ init_object_ccs(int i915, struct object *obj, struct blt_copy_object *tmp,
 		buf[j] = seed++;
 	munmap(buf, obj->size);
 
-	memset(&blt, 0, sizeof(blt));
+	blt_copy_init(i915, &blt);
 	blt.color_depth = CD_32bit;
 
 	memcpy(&blt.src, tmp, sizeof(blt.src));
@@ -366,7 +366,7 @@ verify_object_ccs(int i915, const struct object *obj,
 	cmd->handle = gem_create_from_pool(i915, &size, region);
 	blt_set_batch(cmd, cmd->handle, size, region);
 
-	memset(&blt, 0, sizeof(blt));
+	blt_copy_init(i915, &blt);
 	blt.color_depth = CD_32bit;
 
 	memcpy(&blt.src, obj->blt_obj, sizeof(blt.src));
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 13/16] lib/intel_blt: Extend blitter library to support xe driver
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

Reuse the blitter library, so far written for and used on i915, in Xe
development. Add the code paths that are unique to each driver.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 lib/intel_blt.c | 256 ++++++++++++++++++++++++++++++++----------------
 lib/intel_blt.h |   2 +-
 2 files changed, 170 insertions(+), 88 deletions(-)

diff --git a/lib/intel_blt.c b/lib/intel_blt.c
index f2f86e4947..3eb5d45460 100644
--- a/lib/intel_blt.c
+++ b/lib/intel_blt.c
@@ -9,9 +9,13 @@
 #include <malloc.h>
 #include <cairo.h>
 #include "drm.h"
-#include "igt.h"
 #include "i915/gem_create.h"
+#include "igt.h"
+#include "igt_syncobj.h"
 #include "intel_blt.h"
+#include "xe/xe_ioctl.h"
+#include "xe/xe_query.h"
+#include "xe/xe_util.h"
 
 #define BITRANGE(start, end) (end - start + 1)
 #define GET_CMDS_INFO(__fd) intel_get_cmds_info(intel_get_drm_devid(__fd))
@@ -468,24 +472,40 @@ static int __special_mode(const struct blt_copy_data *blt)
 	return SM_NONE;
 }
 
-static int __memory_type(uint32_t region)
+static int __memory_type(int fd, enum intel_driver driver, uint32_t region)
 {
-	igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
-		     IS_SYSTEM_MEMORY_REGION(region),
-		     "Invalid region: %x\n", region);
+	if (driver == INTEL_DRIVER_I915) {
+		igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
+			     IS_SYSTEM_MEMORY_REGION(region),
+			     "Invalid region: %x\n", region);
+	} else {
+		igt_assert_f(XE_IS_VRAM_MEMORY_REGION(fd, region) ||
+			     XE_IS_SYSMEM_MEMORY_REGION(fd, region),
+			     "Invalid region: %x\n", region);
+	}
 
-	if (IS_DEVICE_MEMORY_REGION(region))
+	if (driver == INTEL_DRIVER_I915 && IS_DEVICE_MEMORY_REGION(region))
 		return TM_LOCAL_MEM;
+	else if (driver == INTEL_DRIVER_XE && XE_IS_VRAM_MEMORY_REGION(fd, region))
+		return TM_LOCAL_MEM;
+
 	return TM_SYSTEM_MEM;
 }
 
-static enum blt_aux_mode __aux_mode(const struct blt_copy_object *obj)
+static enum blt_aux_mode __aux_mode(int fd,
+				    enum intel_driver driver,
+				    const struct blt_copy_object *obj)
 {
-	if (obj->compression == COMPRESSION_ENABLED) {
+	if (driver == INTEL_DRIVER_I915 && obj->compression == COMPRESSION_ENABLED) {
 		igt_assert_f(IS_DEVICE_MEMORY_REGION(obj->region),
 			     "XY_BLOCK_COPY_BLT supports compression "
 			     "on device memory only\n");
 		return AM_AUX_CCS_E;
+	} else if (driver == INTEL_DRIVER_XE && obj->compression == COMPRESSION_ENABLED) {
+		igt_assert_f(XE_IS_VRAM_MEMORY_REGION(fd, obj->region),
+			     "XY_BLOCK_COPY_BLT supports compression "
+			     "on device memory only\n");
+		return AM_AUX_CCS_E;
 	}
 
 	return AM_AUX_NONE;
@@ -508,9 +528,9 @@ static void fill_data(struct gen12_block_copy_data *data,
 	data->dw00.length = extended_command ? 20 : 10;
 
 	if (__special_mode(blt) == SM_FULL_RESOLVE)
-		data->dw01.dst_aux_mode = __aux_mode(&blt->src);
+		data->dw01.dst_aux_mode = __aux_mode(blt->fd, blt->driver, &blt->src);
 	else
-		data->dw01.dst_aux_mode = __aux_mode(&blt->dst);
+		data->dw01.dst_aux_mode = __aux_mode(blt->fd, blt->driver, &blt->dst);
 	data->dw01.dst_pitch = blt->dst.pitch - 1;
 
 	data->dw01.dst_mocs = blt->dst.mocs;
@@ -531,13 +551,13 @@ static void fill_data(struct gen12_block_copy_data *data,
 
 	data->dw06.dst_x_offset = blt->dst.x_offset;
 	data->dw06.dst_y_offset = blt->dst.y_offset;
-	data->dw06.dst_target_memory = __memory_type(blt->dst.region);
+	data->dw06.dst_target_memory = __memory_type(blt->fd, blt->driver, blt->dst.region);
 
 	data->dw07.src_x1 = blt->src.x1;
 	data->dw07.src_y1 = blt->src.y1;
 
 	data->dw08.src_pitch = blt->src.pitch - 1;
-	data->dw08.src_aux_mode = __aux_mode(&blt->src);
+	data->dw08.src_aux_mode = __aux_mode(blt->fd, blt->driver, &blt->src);
 	data->dw08.src_mocs = blt->src.mocs;
 	data->dw08.src_compression = blt->src.compression;
 	data->dw08.src_tiling = __block_tiling(blt->src.tiling);
@@ -550,7 +570,7 @@ static void fill_data(struct gen12_block_copy_data *data,
 
 	data->dw11.src_x_offset = blt->src.x_offset;
 	data->dw11.src_y_offset = blt->src.y_offset;
-	data->dw11.src_target_memory = __memory_type(blt->src.region);
+	data->dw11.src_target_memory = __memory_type(blt->fd, blt->driver, blt->src.region);
 }
 
 static void fill_data_ext(struct gen12_block_copy_data_ext *dext,
@@ -739,7 +759,10 @@ uint64_t emit_blt_block_copy(int fd,
 	igt_assert_f(ahnd, "block-copy supports softpin only\n");
 	igt_assert_f(blt, "block-copy requires data to do blit\n");
 
-	alignment = gem_detect_safe_alignment(fd);
+	if (blt->driver == INTEL_DRIVER_XE)
+		alignment = xe_get_default_alignment(fd);
+	else
+		alignment = gem_detect_safe_alignment(fd);
 	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
 		     + blt->src.plane_offset;
 	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
@@ -748,8 +771,11 @@ uint64_t emit_blt_block_copy(int fd,
 
 	fill_data(&data, blt, src_offset, dst_offset, ext);
 
-	bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
-				       PROT_READ | PROT_WRITE);
+	if (blt->driver == INTEL_DRIVER_XE)
+		bb = xe_bo_map(fd, blt->bb.handle, blt->bb.size);
+	else
+		bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
+					       PROT_READ | PROT_WRITE);
 
 	igt_assert(bb_pos + sizeof(data) < blt->bb.size);
 	memcpy(bb + bb_pos, &data, sizeof(data));
@@ -812,29 +838,38 @@ int blt_block_copy(int fd,
 
 	igt_assert_f(ahnd, "block-copy supports softpin only\n");
 	igt_assert_f(blt, "block-copy requires data to do blit\n");
+	igt_assert_neq(blt->driver, 0);
 
-	alignment = gem_detect_safe_alignment(fd);
+	if (blt->driver == INTEL_DRIVER_XE)
+		alignment = xe_get_default_alignment(fd);
+	else
+		alignment = gem_detect_safe_alignment(fd);
 	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
 	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
 	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
 
 	emit_blt_block_copy(fd, ahnd, blt, ext, 0, true);
 
-	obj[0].offset = CANONICAL(dst_offset);
-	obj[1].offset = CANONICAL(src_offset);
-	obj[2].offset = CANONICAL(bb_offset);
-	obj[0].handle = blt->dst.handle;
-	obj[1].handle = blt->src.handle;
-	obj[2].handle = blt->bb.handle;
-	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
-		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
-	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
-	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
-	execbuf.buffer_count = 3;
-	execbuf.buffers_ptr = to_user_pointer(obj);
-	execbuf.rsvd1 = ctx ? ctx->id : 0;
-	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
-	ret = __gem_execbuf(fd, &execbuf);
+	if (blt->driver == INTEL_DRIVER_XE) {
+		intel_ctx_xe_exec(ctx, ahnd, CANONICAL(bb_offset));
+	} else {
+		obj[0].offset = CANONICAL(dst_offset);
+		obj[1].offset = CANONICAL(src_offset);
+		obj[2].offset = CANONICAL(bb_offset);
+		obj[0].handle = blt->dst.handle;
+		obj[1].handle = blt->src.handle;
+		obj[2].handle = blt->bb.handle;
+		obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+				EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+		obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+		obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+		execbuf.buffer_count = 3;
+		execbuf.buffers_ptr = to_user_pointer(obj);
+		execbuf.rsvd1 = ctx ? ctx->id : 0;
+		execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+
+		ret = __gem_execbuf(fd, &execbuf);
+	}
 
 	return ret;
 }
@@ -950,7 +985,10 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
 	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
 	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
 
-	alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
+	if (surf->driver == INTEL_DRIVER_XE)
+		alignment = max_t(uint64_t, xe_get_default_alignment(fd), 1ull << 16);
+	else
+		alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
 
 	data.dw00.client = 0x2;
 	data.dw00.opcode = 0x48;
@@ -973,8 +1011,11 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
 	data.dw04.dst_address_hi = dst_offset >> 32;
 	data.dw04.dst_mocs = surf->dst.mocs;
 
-	bb = gem_mmap__device_coherent(fd, surf->bb.handle, 0, surf->bb.size,
-				       PROT_READ | PROT_WRITE);
+	if (surf->driver == INTEL_DRIVER_XE)
+		bb = xe_bo_map(fd, surf->bb.handle, surf->bb.size);
+	else
+		bb = gem_mmap__device_coherent(fd, surf->bb.handle, 0, surf->bb.size,
+					       PROT_READ | PROT_WRITE);
 
 	igt_assert(bb_pos + sizeof(data) < surf->bb.size);
 	memcpy(bb + bb_pos, &data, sizeof(data));
@@ -1002,7 +1043,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
 
 /**
  * blt_ctrl_surf_copy:
  * @fd: drm fd
  * @ctx: intel_ctx_t context
  * @e: blitter engine for @ctx
  * @ahnd: allocator handle
@@ -1026,32 +1067,41 @@ int blt_ctrl_surf_copy(int fd,
 
 	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
 	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
+	igt_assert_neq(surf->driver, 0);
+
+	if (surf->driver == INTEL_DRIVER_XE)
+		alignment = max_t(uint64_t, xe_get_default_alignment(fd), 1ull << 16);
+	else
+		alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
 
-	alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
 	src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
 	dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
 	bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
 
 	emit_blt_ctrl_surf_copy(fd, ahnd, surf, 0, true);
 
-	obj[0].offset = CANONICAL(dst_offset);
-	obj[1].offset = CANONICAL(src_offset);
-	obj[2].offset = CANONICAL(bb_offset);
-	obj[0].handle = surf->dst.handle;
-	obj[1].handle = surf->src.handle;
-	obj[2].handle = surf->bb.handle;
-	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
-		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
-	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
-	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
-	execbuf.buffer_count = 3;
-	execbuf.buffers_ptr = to_user_pointer(obj);
-	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
-	execbuf.rsvd1 = ctx ? ctx->id : 0;
-	gem_execbuf(fd, &execbuf);
-	put_offset(ahnd, surf->dst.handle);
-	put_offset(ahnd, surf->src.handle);
-	put_offset(ahnd, surf->bb.handle);
+	if (surf->driver == INTEL_DRIVER_XE) {
+		intel_ctx_xe_exec(ctx, ahnd, CANONICAL(bb_offset));
+	} else {
+		obj[0].offset = CANONICAL(dst_offset);
+		obj[1].offset = CANONICAL(src_offset);
+		obj[2].offset = CANONICAL(bb_offset);
+		obj[0].handle = surf->dst.handle;
+		obj[1].handle = surf->src.handle;
+		obj[2].handle = surf->bb.handle;
+		obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+				EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+		obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+		obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+		execbuf.buffer_count = 3;
+		execbuf.buffers_ptr = to_user_pointer(obj);
+		execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+		execbuf.rsvd1 = ctx ? ctx->id : 0;
+		gem_execbuf(fd, &execbuf);
+		put_offset(ahnd, surf->dst.handle);
+		put_offset(ahnd, surf->src.handle);
+		put_offset(ahnd, surf->bb.handle);
+	}
 
 	return 0;
 }
@@ -1208,7 +1258,10 @@ uint64_t emit_blt_fast_copy(int fd,
 	uint32_t bbe = MI_BATCH_BUFFER_END;
 	uint32_t *bb;
 
-	alignment = gem_detect_safe_alignment(fd);
+	if (blt->driver == INTEL_DRIVER_XE)
+		alignment = xe_get_default_alignment(fd);
+	else
+		alignment = gem_detect_safe_alignment(fd);
 
 	data.dw00.client = 0x2;
 	data.dw00.opcode = 0x42;
@@ -1218,8 +1271,8 @@ uint64_t emit_blt_fast_copy(int fd,
 
 	data.dw01.dst_pitch = blt->dst.pitch;
 	data.dw01.color_depth = __fast_color_depth(blt->color_depth);
-	data.dw01.dst_memory = __memory_type(blt->dst.region);
-	data.dw01.src_memory = __memory_type(blt->src.region);
+	data.dw01.dst_memory = __memory_type(blt->fd, blt->driver, blt->dst.region);
+	data.dw01.src_memory = __memory_type(blt->fd, blt->driver, blt->src.region);
 	data.dw01.dst_type_y = __new_tile_y_type(blt->dst.tiling) ? 1 : 0;
 	data.dw01.src_type_y = __new_tile_y_type(blt->src.tiling) ? 1 : 0;
 
@@ -1246,8 +1299,11 @@ uint64_t emit_blt_fast_copy(int fd,
 	data.dw08.src_address_lo = src_offset;
 	data.dw09.src_address_hi = src_offset >> 32;
 
-	bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
-				       PROT_READ | PROT_WRITE);
+	if (blt->driver == INTEL_DRIVER_XE)
+		bb = xe_bo_map(fd, blt->bb.handle, blt->bb.size);
+	else
+		bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
+					       PROT_READ | PROT_WRITE);
 
 	igt_assert(bb_pos + sizeof(data) < blt->bb.size);
 	memcpy(bb + bb_pos, &data, sizeof(data));
@@ -1297,7 +1353,14 @@ int blt_fast_copy(int fd,
 	uint64_t dst_offset, src_offset, bb_offset, alignment;
 	int ret;
 
-	alignment = gem_detect_safe_alignment(fd);
+	igt_assert_f(ahnd, "fast-copy supports softpin only\n");
+	igt_assert_f(blt, "fast-copy requires data to do fast-copy blit\n");
+	igt_assert_neq(blt->driver, 0);
+
+	if (blt->driver == INTEL_DRIVER_XE)
+		alignment = xe_get_default_alignment(fd);
+	else
+		alignment = gem_detect_safe_alignment(fd);
 
 	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
 	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
@@ -1305,24 +1368,28 @@ int blt_fast_copy(int fd,
 
 	emit_blt_fast_copy(fd, ahnd, blt, 0, true);
 
-	obj[0].offset = CANONICAL(dst_offset);
-	obj[1].offset = CANONICAL(src_offset);
-	obj[2].offset = CANONICAL(bb_offset);
-	obj[0].handle = blt->dst.handle;
-	obj[1].handle = blt->src.handle;
-	obj[2].handle = blt->bb.handle;
-	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
-		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
-	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
-	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
-	execbuf.buffer_count = 3;
-	execbuf.buffers_ptr = to_user_pointer(obj);
-	execbuf.rsvd1 = ctx ? ctx->id : 0;
-	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
-	ret = __gem_execbuf(fd, &execbuf);
-	put_offset(ahnd, blt->dst.handle);
-	put_offset(ahnd, blt->src.handle);
-	put_offset(ahnd, blt->bb.handle);
+	if (blt->driver == INTEL_DRIVER_XE) {
+		intel_ctx_xe_exec(ctx, ahnd, CANONICAL(bb_offset));
+	} else {
+		obj[0].offset = CANONICAL(dst_offset);
+		obj[1].offset = CANONICAL(src_offset);
+		obj[2].offset = CANONICAL(bb_offset);
+		obj[0].handle = blt->dst.handle;
+		obj[1].handle = blt->src.handle;
+		obj[2].handle = blt->bb.handle;
+		obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+				EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+		obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+		obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+		execbuf.buffer_count = 3;
+		execbuf.buffers_ptr = to_user_pointer(obj);
+		execbuf.rsvd1 = ctx ? ctx->id : 0;
+		execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+		ret = __gem_execbuf(fd, &execbuf);
+		put_offset(ahnd, blt->dst.handle);
+		put_offset(ahnd, blt->src.handle);
+		put_offset(ahnd, blt->bb.handle);
+	}
 
 	return ret;
 }
@@ -1366,16 +1433,26 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
 	obj = calloc(1, sizeof(*obj));
 
 	obj->size = size;
-	igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
-						  &size, region) == 0);
+
+	if (blt->driver == INTEL_DRIVER_XE) {
+		size = ALIGN(size, xe_get_default_alignment(blt->fd));
+		handle = xe_bo_create_flags(blt->fd, 0, size, region);
+	} else {
+		igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
+							  &size, region) == 0);
+	}
 
 	blt_set_object(obj, handle, size, region, mocs, tiling,
 		       compression, compression_type);
 	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
 
-	if (create_mapping)
-		obj->ptr = gem_mmap__device_coherent(blt->fd, handle, 0, size,
-						     PROT_READ | PROT_WRITE);
+	if (create_mapping) {
+		if (blt->driver == INTEL_DRIVER_XE)
+			obj->ptr = xe_bo_map(blt->fd, handle, size);
+		else
+			obj->ptr = gem_mmap__device_coherent(blt->fd, handle, 0, size,
+							     PROT_READ | PROT_WRITE);
+	}
 
 	return obj;
 }
@@ -1518,14 +1595,19 @@ void blt_surface_to_png(int fd, uint32_t run_id, const char *fileid,
 	int format;
 	int stride = obj->tiling ? obj->pitch * 4 : obj->pitch;
 	char filename[FILENAME_MAX];
+	bool is_xe = is_xe_device(fd);
 
 	snprintf(filename, FILENAME_MAX-1, "%d-%s-%s-%ux%u-%s.png",
 		 run_id, fileid, blt_tiling_name(obj->tiling), width, height,
 		 obj->compression ? "compressed" : "uncompressed");
 
-	if (!map)
-		map = gem_mmap__device_coherent(fd, obj->handle, 0,
-						obj->size, PROT_READ);
+	if (!map) {
+		if (is_xe)
+			map = xe_bo_map(fd, obj->handle, obj->size);
+		else
+			map = gem_mmap__device_coherent(fd, obj->handle, 0,
+							obj->size, PROT_READ);
+	}
 	format = CAIRO_FORMAT_RGB24;
 	surface = cairo_image_surface_create_for_data(map,
 						      format, width, height,
diff --git a/lib/intel_blt.h b/lib/intel_blt.h
index 7516ce8ac7..944e2b4ae7 100644
--- a/lib/intel_blt.h
+++ b/lib/intel_blt.h
@@ -8,7 +8,7 @@
 
 /**
  * SECTION:intel_blt
- * @short_description: i915 blitter library
+ * @short_description: i915/xe blitter library
  * @title: Blitter library
  * @include: intel_blt.h
  *
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 14/16] tests/xe_ccs: Check if flatccs is working with block-copy for Xe
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

This is a copy of the i915 gem_ccs test ported to Xe. Ported means all
driver-dependent calls - like working on regions, binding and execution -
were replaced by their Xe counterparts. I considered adding Xe conditionals
to gem_ccs instead, but that would decrease test readability, so I dropped
the idea.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 tests/meson.build |   1 +
 tests/xe/xe_ccs.c | 763 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 764 insertions(+)
 create mode 100644 tests/xe/xe_ccs.c

diff --git a/tests/meson.build b/tests/meson.build
index ee066b8490..9bca57a5e8 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -244,6 +244,7 @@ i915_progs = [
 ]
 
 xe_progs = [
+	'xe_ccs',
 	'xe_create',
 	'xe_compute',
 	'xe_dma_buf_sync',
diff --git a/tests/xe/xe_ccs.c b/tests/xe/xe_ccs.c
new file mode 100644
index 0000000000..e6bb29a5ed
--- /dev/null
+++ b/tests/xe/xe_ccs.c
@@ -0,0 +1,763 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include <errno.h>
+#include <glib.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <malloc.h>
+#include "drm.h"
+#include "igt.h"
+#include "igt_syncobj.h"
+#include "intel_blt.h"
+#include "intel_mocs.h"
+#include "xe/xe_ioctl.h"
+#include "xe/xe_query.h"
+#include "xe/xe_util.h"
+/**
+ * TEST: xe ccs
+ * Description: Exercise gen12 blitter with and without flatccs compression on Xe
+ * Run type: FULL
+ *
+ * SUBTEST: block-copy-compressed
+ * Description: Check block-copy flatccs compressed blit
+ *
+ * SUBTEST: block-copy-uncompressed
+ * Description: Check block-copy uncompressed blit
+ *
+ * SUBTEST: block-multicopy-compressed
+ * Description: Check block-multicopy flatccs compressed blit
+ *
+ * SUBTEST: block-multicopy-inplace
+ * Description: Check block-multicopy flatccs inplace decompression blit
+ *
+ * SUBTEST: ctrl-surf-copy
+ * Description: Check flatccs data can be copied from/to surface
+ *
+ * SUBTEST: ctrl-surf-copy-new-ctx
+ * Description: Check flatccs data are physically tagged and visible in vm
+ *
+ * SUBTEST: suspend-resume
+ * Description: Check flatccs data persists after suspend / resume (S0)
+ */
+
+IGT_TEST_DESCRIPTION("Exercise gen12 blitter with and without flatccs compression on Xe");
+
+static struct param {
+	int compression_format;
+	int tiling;
+	bool write_png;
+	bool print_bb;
+	bool print_surface_info;
+	int width;
+	int height;
+} param = {
+	.compression_format = 0,
+	.tiling = -1,
+	.write_png = false,
+	.print_bb = false,
+	.print_surface_info = false,
+	.width = 512,
+	.height = 512,
+};
+
+struct test_config {
+	bool compression;
+	bool inplace;
+	bool surfcopy;
+	bool new_ctx;
+	bool suspend_resume;
+};
+
+static void set_surf_object(struct blt_ctrl_surf_copy_object *obj,
+			    uint32_t handle, uint32_t region, uint64_t size,
+			    uint8_t mocs, enum blt_access_type access_type)
+{
+	obj->handle = handle;
+	obj->region = region;
+	obj->size = size;
+	obj->mocs = mocs;
+	obj->access_type = access_type;
+}
+
+#define PRINT_SURFACE_INFO(name, obj) do { \
+	if (param.print_surface_info) \
+		blt_surface_info((name), (obj)); } while (0)
+
+#define WRITE_PNG(fd, id, name, obj, w, h) do { \
+	if (param.write_png) \
+		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)
+
+static int compare_nxn(const struct blt_copy_object *surf1,
+		       const struct blt_copy_object *surf2,
+		       int xsize, int ysize, int bx, int by)
+{
+	int x, y, corrupted;
+	uint32_t pos, px1, px2;
+
+	corrupted = 0;
+	for (y = 0; y < ysize; y++) {
+		for (x = 0; x < xsize; x++) {
+			pos = bx * xsize + by * ysize * surf1->pitch / 4;
+			pos += x + y * surf1->pitch / 4;
+			px1 = surf1->ptr[pos];
+			px2 = surf2->ptr[pos];
+			if (px1 != px2)
+				corrupted++;
+		}
+	}
+
+	return corrupted;
+}
+
+static void dump_corruption_info(const struct blt_copy_object *surf1,
+				 const struct blt_copy_object *surf2)
+{
+	const int xsize = 8, ysize = 8;
+	int w, h, bx, by, corrupted;
+
+	igt_assert(surf1->x1 == surf2->x1 && surf1->x2 == surf2->x2);
+	igt_assert(surf1->y1 == surf2->y1 && surf1->y2 == surf2->y2);
+	w = surf1->x2;
+	h = surf1->y2;
+
+	igt_info("dump corruption - width: %d, height: %d, sizex: %x, sizey: %x\n",
+		 surf1->x2, surf1->y2, xsize, ysize);
+
+	for (by = 0; by < h / ysize; by++) {
+		for (bx = 0; bx < w / xsize; bx++) {
+			corrupted = compare_nxn(surf1, surf2, xsize, ysize, bx, by);
+			if (corrupted == 0)
+				igt_info(".");
+			else
+				igt_info("%c", '0' + corrupted);
+		}
+		igt_info("\n");
+	}
+}
+
+static void surf_copy(int xe,
+		      intel_ctx_t *ctx,
+		      uint64_t ahnd,
+		      const struct blt_copy_object *src,
+		      const struct blt_copy_object *mid,
+		      const struct blt_copy_object *dst,
+		      int run_id, bool suspend_resume)
+{
+	struct blt_copy_data blt = {};
+	struct blt_block_copy_data_ext ext = {};
+	struct blt_ctrl_surf_copy_data surf = {};
+	uint32_t bb1, bb2, ccs, ccs2, *ccsmap, *ccsmap2;
+	uint64_t bb_size, ccssize = mid->size / CCS_RATIO;
+	uint32_t *ccscopy;
+	uint8_t uc_mocs = intel_get_uc_mocs(xe);
+	uint32_t sysmem = system_memory(xe);
+	int result;
+
+	igt_assert(mid->compression);
+	ccscopy = (uint32_t *) malloc(ccssize);
+	ccs = xe_bo_create_flags(xe, 0, ccssize, sysmem);
+	ccs2 = xe_bo_create_flags(xe, 0, ccssize, sysmem);
+
+	blt_ctrl_surf_copy_init(xe, &surf);
+	surf.print_bb = param.print_bb;
+	set_surf_object(&surf.src, mid->handle, mid->region, mid->size,
+			uc_mocs, BLT_INDIRECT_ACCESS);
+	set_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
+	bb_size = xe_get_default_alignment(xe);
+	bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
+	blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
+	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
+	intel_ctx_xe_sync(ctx, true);
+
+	ccsmap = xe_bo_map(xe, ccs, surf.dst.size);
+	memcpy(ccscopy, ccsmap, ccssize);
+
+	if (suspend_resume) {
+		char *orig, *orig2, *newsum, *newsum2;
+
+		orig = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
+						   (void *)ccsmap, surf.dst.size);
+		orig2 = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
+						    (void *)mid->ptr, mid->size);
+
+		igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
+
+		set_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,
+				0, DIRECT_ACCESS);
+		blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
+		intel_ctx_xe_sync(ctx, true);
+
+		ccsmap2 = xe_bo_map(xe, ccs2, surf.dst.size);
+		newsum = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
+						     (void *)ccsmap2, surf.dst.size);
+		newsum2 = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
+						      (void *)mid->ptr, mid->size);
+
+		munmap(ccsmap2, ccssize);
+		igt_assert(!strcmp(orig, newsum));
+		igt_assert(!strcmp(orig2, newsum2));
+		g_free(orig);
+		g_free(orig2);
+		g_free(newsum);
+		g_free(newsum2);
+	}
+
+	/* corrupt ccs */
+	for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
+		ccsmap[i] = i;
+	set_surf_object(&surf.src, ccs, sysmem, ccssize,
+			uc_mocs, DIRECT_ACCESS);
+	set_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
+			uc_mocs, INDIRECT_ACCESS);
+	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
+	intel_ctx_xe_sync(ctx, true);
+
+	blt_copy_init(xe, &blt);
+	blt.color_depth = CD_32bit;
+	blt.print_bb = param.print_bb;
+	blt_set_copy_object(&blt.src, mid);
+	blt_set_copy_object(&blt.dst, dst);
+	blt_set_object_ext(&ext.src, mid->compression_type, mid->x2, mid->y2, SURFACE_TYPE_2D);
+	blt_set_object_ext(&ext.dst, 0, dst->x2, dst->y2, SURFACE_TYPE_2D);
+	bb2 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
+	blt_set_batch(&blt.bb, bb2, bb_size, sysmem);
+	blt_block_copy(xe, ctx, NULL, ahnd, &blt, &ext);
+	intel_ctx_xe_sync(ctx, true);
+	WRITE_PNG(xe, run_id, "corrupted", &blt.dst, dst->x2, dst->y2);
+	result = memcmp(src->ptr, dst->ptr, src->size);
+	igt_assert(result != 0);
+
+	/* retrieve back ccs */
+	memcpy(ccsmap, ccscopy, ccssize);
+	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
+
+	blt_block_copy(xe, ctx, NULL, ahnd, &blt, &ext);
+	intel_ctx_xe_sync(ctx, true);
+	WRITE_PNG(xe, run_id, "corrected", &blt.dst, dst->x2, dst->y2);
+	result = memcmp(src->ptr, dst->ptr, src->size);
+	if (result)
+		dump_corruption_info(src, dst);
+
+	munmap(ccsmap, ccssize);
+	gem_close(xe, ccs);
+	gem_close(xe, ccs2);
+	gem_close(xe, bb1);
+	gem_close(xe, bb2);
+
+	igt_assert_f(result == 0,
+		     "Source and destination surfaces are different after "
+		     "restoring source ccs data\n");
+}
+
+struct blt_copy3_data {
+	int xe;
+	struct blt_copy_object src;
+	struct blt_copy_object mid;
+	struct blt_copy_object dst;
+	struct blt_copy_object final;
+	struct blt_copy_batch bb;
+	enum blt_color_depth color_depth;
+
+	/* debug stuff */
+	bool print_bb;
+};
+
+struct blt_block_copy3_data_ext {
+	struct blt_block_copy_object_ext src;
+	struct blt_block_copy_object_ext mid;
+	struct blt_block_copy_object_ext dst;
+	struct blt_block_copy_object_ext final;
+};
+
+#define FILL_OBJ(_idx, _handle, _offset) do { \
+	obj[(_idx)].handle = (_handle); \
+	obj[(_idx)].offset = (_offset); \
+} while (0)
+
+static int blt_block_copy3(int xe,
+			   const intel_ctx_t *ctx,
+			   uint64_t ahnd,
+			   const struct blt_copy3_data *blt3,
+			   const struct blt_block_copy3_data_ext *ext3)
+{
+	struct blt_copy_data blt0;
+	struct blt_block_copy_data_ext ext0;
+	uint64_t bb_offset, alignment;
+	uint64_t bb_pos = 0;
+	int ret;
+
+	igt_assert_f(ahnd, "block-copy3 supports softpin only\n");
+	igt_assert_f(blt3, "block-copy3 requires data to do blit\n");
+
+	alignment = xe_get_default_alignment(xe);
+	get_offset(ahnd, blt3->src.handle, blt3->src.size, alignment);
+	get_offset(ahnd, blt3->mid.handle, blt3->mid.size, alignment);
+	get_offset(ahnd, blt3->dst.handle, blt3->dst.size, alignment);
+	get_offset(ahnd, blt3->final.handle, blt3->final.size, alignment);
+	bb_offset = get_offset(ahnd, blt3->bb.handle, blt3->bb.size, alignment);
+
+	/* First blit src -> mid */
+	blt_copy_init(xe, &blt0);
+	blt0.src = blt3->src;
+	blt0.dst = blt3->mid;
+	blt0.bb = blt3->bb;
+	blt0.color_depth = blt3->color_depth;
+	blt0.print_bb = blt3->print_bb;
+	ext0.src = ext3->src;
+	ext0.dst = ext3->mid;
+	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, false);
+
+	/* Second blit mid -> dst */
+	blt_copy_init(xe, &blt0);
+	blt0.src = blt3->mid;
+	blt0.dst = blt3->dst;
+	blt0.bb = blt3->bb;
+	blt0.color_depth = blt3->color_depth;
+	blt0.print_bb = blt3->print_bb;
+	ext0.src = ext3->mid;
+	ext0.dst = ext3->dst;
+	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, false);
+
+	/* Third blit dst -> final */
+	blt_copy_init(xe, &blt0);
+	blt0.src = blt3->dst;
+	blt0.dst = blt3->final;
+	blt0.bb = blt3->bb;
+	blt0.color_depth = blt3->color_depth;
+	blt0.print_bb = blt3->print_bb;
+	ext0.src = ext3->dst;
+	ext0.dst = ext3->final;
+	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, true);
+
+	intel_ctx_xe_exec(ctx, ahnd, bb_offset);
+
+	return ret;
+}
+
+static void block_copy(int xe,
+		       intel_ctx_t *ctx,
+		       uint32_t region1, uint32_t region2,
+		       enum blt_tiling_type mid_tiling,
+		       const struct test_config *config)
+{
+	struct blt_copy_data blt = {};
+	struct blt_block_copy_data_ext ext = {}, *pext = &ext;
+	struct blt_copy_object *src, *mid, *dst;
+	const uint32_t bpp = 32;
+	uint64_t bb_size = xe_get_default_alignment(xe);
+	uint64_t ahnd = intel_allocator_open(xe, ctx->vm, INTEL_ALLOCATOR_RELOC);
+	uint32_t run_id = mid_tiling;
+	uint32_t mid_region = region2, bb;
+	uint32_t width = param.width, height = param.height;
+	enum blt_compression mid_compression = config->compression;
+	int mid_compression_format = param.compression_format;
+	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
+	uint8_t uc_mocs = intel_get_uc_mocs(xe);
+	int result;
+
+	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
+
+	if (!blt_uses_extended_block_copy(xe))
+		pext = NULL;
+
+	blt_copy_init(xe, &blt);
+
+	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
+				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
+	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
+				mid_tiling, mid_compression, comp_type, true);
+	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
+				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
+	igt_assert(src->size == dst->size);
+	PRINT_SURFACE_INFO("src", src);
+	PRINT_SURFACE_INFO("mid", mid);
+	PRINT_SURFACE_INFO("dst", dst);
+
+	blt_surface_fill_rect(xe, src, width, height);
+	WRITE_PNG(xe, run_id, "src", src, width, height);
+
+	blt.color_depth = CD_32bit;
+	blt.print_bb = param.print_bb;
+	blt_set_copy_object(&blt.src, src);
+	blt_set_copy_object(&blt.dst, mid);
+	blt_set_object_ext(&ext.src, 0, width, height, SURFACE_TYPE_2D);
+	blt_set_object_ext(&ext.dst, mid_compression_format, width, height, SURFACE_TYPE_2D);
+	blt_set_batch(&blt.bb, bb, bb_size, region1);
+	blt_block_copy(xe, ctx, NULL, ahnd, &blt, pext);
+	intel_ctx_xe_sync(ctx, true);
+
+	/* We expect mid != src if there's compression */
+	if (mid->compression)
+		igt_assert(memcmp(src->ptr, mid->ptr, src->size) != 0);
+
+	WRITE_PNG(xe, run_id, "mid", &blt.dst, width, height);
+
+	if (config->surfcopy && pext) {
+		struct drm_xe_engine_class_instance inst = {
+			.engine_class = DRM_XE_ENGINE_CLASS_COPY,
+		};
+		intel_ctx_t *surf_ctx = ctx;
+		uint64_t surf_ahnd = ahnd;
+		uint32_t vm, engine;
+
+		if (config->new_ctx) {
+			vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
+			engine = xe_engine_create(xe, vm, &inst, 0);
+			surf_ctx = intel_ctx_xe(xe, vm, engine, 0, 0, 0);
+			surf_ahnd = intel_allocator_open(xe, surf_ctx->vm,
+							 INTEL_ALLOCATOR_RELOC);
+		}
+		surf_copy(xe, surf_ctx, surf_ahnd, src, mid, dst, run_id,
+			  config->suspend_resume);
+
+		if (surf_ctx != ctx) {
+			xe_engine_destroy(xe, engine);
+			xe_vm_destroy(xe, vm);
+			free(surf_ctx);
+			put_ahnd(surf_ahnd);
+		}
+	}
+
+	blt_copy_init(xe, &blt);
+	blt.color_depth = CD_32bit;
+	blt.print_bb = param.print_bb;
+	blt_set_copy_object(&blt.src, mid);
+	blt_set_copy_object(&blt.dst, dst);
+	blt_set_object_ext(&ext.src, mid_compression_format, width, height, SURFACE_TYPE_2D);
+	blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
+	if (config->inplace) {
+		blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
+			       T_LINEAR, COMPRESSION_DISABLED, comp_type);
+		blt.dst.ptr = mid->ptr;
+	}
+
+	blt_set_batch(&blt.bb, bb, bb_size, region1);
+	blt_block_copy(xe, ctx, NULL, ahnd, &blt, pext);
+	intel_ctx_xe_sync(ctx, true);
+
+	WRITE_PNG(xe, run_id, "dst", &blt.dst, width, height);
+
+	result = memcmp(src->ptr, blt.dst.ptr, src->size);
+
+	/* Politely clean vm */
+	put_offset(ahnd, src->handle);
+	put_offset(ahnd, mid->handle);
+	put_offset(ahnd, dst->handle);
+	put_offset(ahnd, bb);
+	intel_allocator_bind(ahnd, 0, 0);
+	blt_destroy_object(xe, src);
+	blt_destroy_object(xe, mid);
+	blt_destroy_object(xe, dst);
+	gem_close(xe, bb);
+	put_ahnd(ahnd);
+
+	igt_assert_f(!result, "source and destination surfaces differ!\n");
+}
+
+static void block_multicopy(int xe,
+			    intel_ctx_t *ctx,
+			    uint32_t region1, uint32_t region2,
+			    enum blt_tiling_type mid_tiling,
+			    const struct test_config *config)
+{
+	struct blt_copy3_data blt3 = {};
+	struct blt_copy_data blt = {};
+	struct blt_block_copy3_data_ext ext3 = {}, *pext3 = &ext3;
+	struct blt_copy_object *src, *mid, *dst, *final;
+	const uint32_t bpp = 32;
+	uint64_t bb_size = xe_get_default_alignment(xe);
+	uint64_t ahnd = intel_allocator_open(xe, ctx->vm, INTEL_ALLOCATOR_RELOC);
+	uint32_t run_id = mid_tiling;
+	uint32_t mid_region = region2, bb;
+	uint32_t width = param.width, height = param.height;
+	enum blt_compression mid_compression = config->compression;
+	int mid_compression_format = param.compression_format;
+	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
+	uint8_t uc_mocs = intel_get_uc_mocs(xe);
+	int result;
+
+	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
+
+	if (!blt_uses_extended_block_copy(xe))
+		pext3 = NULL;
+
+	blt_copy_init(xe, &blt);
+
+	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
+				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
+	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
+				mid_tiling, mid_compression, comp_type, true);
+	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
+				mid_tiling, COMPRESSION_DISABLED, comp_type, true);
+	final = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
+				  T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
+	igt_assert(src->size == dst->size);
+	PRINT_SURFACE_INFO("src", src);
+	PRINT_SURFACE_INFO("mid", mid);
+	PRINT_SURFACE_INFO("dst", dst);
+	PRINT_SURFACE_INFO("final", final);
+
+	blt_surface_fill_rect(xe, src, width, height);
+
+	blt3.color_depth = CD_32bit;
+	blt3.print_bb = param.print_bb;
+	blt_set_copy_object(&blt3.src, src);
+	blt_set_copy_object(&blt3.mid, mid);
+	blt_set_copy_object(&blt3.dst, dst);
+	blt_set_copy_object(&blt3.final, final);
+
+	if (config->inplace) {
+		blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
+			       mid->mocs, mid_tiling, COMPRESSION_DISABLED,
+			       comp_type);
+		blt3.dst.ptr = mid->ptr;
+	}
+
+	blt_set_object_ext(&ext3.src, 0, width, height, SURFACE_TYPE_2D);
+	blt_set_object_ext(&ext3.mid, mid_compression_format, width, height, SURFACE_TYPE_2D);
+	blt_set_object_ext(&ext3.dst, 0, width, height, SURFACE_TYPE_2D);
+	blt_set_object_ext(&ext3.final, 0, width, height, SURFACE_TYPE_2D);
+	blt_set_batch(&blt3.bb, bb, bb_size, region1);
+
+	blt_block_copy3(xe, ctx, ahnd, &blt3, pext3);
+	intel_ctx_xe_sync(ctx, true);
+
+	WRITE_PNG(xe, run_id, "src", &blt3.src, width, height);
+	if (!config->inplace)
+		WRITE_PNG(xe, run_id, "mid", &blt3.mid, width, height);
+	WRITE_PNG(xe, run_id, "dst", &blt3.dst, width, height);
+	WRITE_PNG(xe, run_id, "final", &blt3.final, width, height);
+
+	result = memcmp(src->ptr, blt3.final.ptr, src->size);
+
+	put_offset(ahnd, src->handle);
+	put_offset(ahnd, mid->handle);
+	put_offset(ahnd, dst->handle);
+	put_offset(ahnd, final->handle);
+	put_offset(ahnd, bb);
+	intel_allocator_bind(ahnd, 0, 0);
+	blt_destroy_object(xe, src);
+	blt_destroy_object(xe, mid);
+	blt_destroy_object(xe, dst);
+	blt_destroy_object(xe, final);
+	gem_close(xe, bb);
+	put_ahnd(ahnd);
+
+	igt_assert_f(!result, "source and destination surfaces differ!\n");
+}
+
+enum copy_func {
+	BLOCK_COPY,
+	BLOCK_MULTICOPY,
+};
+
+static const struct {
+	const char *suffix;
+	void (*copyfn)(int fd,
+		       intel_ctx_t *ctx,
+		       uint32_t region1, uint32_t region2,
+		       enum blt_tiling_type btype,
+		       const struct test_config *config);
+} copyfns[] = {
+	[BLOCK_COPY] = { "", block_copy },
+	[BLOCK_MULTICOPY] = { "-multicopy", block_multicopy },
+};
+
+static void block_copy_test(int xe,
+			    const struct test_config *config,
+			    struct igt_collection *set,
+			    enum copy_func copy_function)
+{
+	struct drm_xe_engine_class_instance inst = {
+		.engine_class = DRM_XE_ENGINE_CLASS_COPY,
+	};
+	intel_ctx_t *ctx;
+	struct igt_collection *regions;
+	uint32_t vm, engine;
+	int tiling;
+
+	if (config->compression && !blt_block_copy_supports_compression(xe))
+		return;
+
+	if (config->inplace && !config->compression)
+		return;
+
+	for_each_tiling(tiling) {
+		if (!blt_block_copy_supports_tiling(xe, tiling) ||
+		    (param.tiling >= 0 && param.tiling != tiling))
+			continue;
+
+		for_each_variation_r(regions, 2, set) {
+			uint32_t region1, region2;
+			char *regtxt;
+
+			region1 = igt_collection_get_value(regions, 0);
+			region2 = igt_collection_get_value(regions, 1);
+
+			/* Compressed surface must be in device memory */
+			if (config->compression && !XE_IS_VRAM_MEMORY_REGION(xe, region2))
+				continue;
+
+			regtxt = xe_memregion_dynamic_subtest_name(xe, regions);
+
+			igt_dynamic_f("%s-%s-compfmt%d-%s%s",
+				      blt_tiling_name(tiling),
+				      config->compression ?
+					      "compressed" : "uncompressed",
+				      param.compression_format, regtxt,
+				      copyfns[copy_function].suffix) {
+				uint32_t sync_bind, sync_out;
+
+				vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
+				engine = xe_engine_create(xe, vm, &inst, 0);
+				sync_bind = syncobj_create(xe, 0);
+				sync_out = syncobj_create(xe, 0);
+				ctx = intel_ctx_xe(xe, vm, engine,
+						   0, sync_bind, sync_out);
+
+				copyfns[copy_function].copyfn(xe, ctx,
+							      region1, region2,
+							      tiling, config);
+
+				xe_engine_destroy(xe, engine);
+				xe_vm_destroy(xe, vm);
+				syncobj_destroy(xe, sync_bind);
+				syncobj_destroy(xe, sync_out);
+				free(ctx);
+			}
+
+			free(regtxt);
+		}
+	}
+}
+
+static int opt_handler(int opt, int opt_index, void *data)
+{
+	switch (opt) {
+	case 'b':
+		param.print_bb = true;
+		igt_debug("Print bb: %d\n", param.print_bb);
+		break;
+	case 'f':
+		param.compression_format = atoi(optarg);
+		igt_debug("Compression format: %d\n", param.compression_format);
+		igt_assert((param.compression_format & ~0x1f) == 0);
+		break;
+	case 'p':
+		param.write_png = true;
+		igt_debug("Write png: %d\n", param.write_png);
+		break;
+	case 's':
+		param.print_surface_info = true;
+		igt_debug("Print surface info: %d\n", param.print_surface_info);
+		break;
+	case 't':
+		param.tiling = atoi(optarg);
+		igt_debug("Tiling: %d\n", param.tiling);
+		break;
+	case 'W':
+		param.width = atoi(optarg);
+		igt_debug("Width: %d\n", param.width);
+		break;
+	case 'H':
+		param.height = atoi(optarg);
+		igt_debug("Height: %d\n", param.height);
+		break;
+	default:
+		return IGT_OPT_HANDLER_ERROR;
+	}
+
+	return IGT_OPT_HANDLER_SUCCESS;
+}
+
+const char *help_str =
+	"  -b\tPrint bb\n"
+	"  -f\tCompression format (0-31)\n"
+	"  -p\tWrite PNG\n"
+	"  -s\tPrint surface info\n"
+	"  -t\tTiling format (0 - linear, 1 - XMAJOR, 2 - YMAJOR, 3 - TILE4, 4 - TILE64)\n"
+	"  -W\tWidth (default 512)\n"
+	"  -H\tHeight (default 512)"
+	;
+
+igt_main_args("bf:pst:W:H:", NULL, help_str, opt_handler, NULL)
+{
+	struct igt_collection *set;
+	int xe;
+
+	igt_fixture {
+		xe = drm_open_driver(DRIVER_XE);
+		igt_require(blt_has_block_copy(xe));
+
+		xe_device_get(xe);
+
+		set = xe_get_memory_region_set(xe,
+					       XE_MEM_REGION_CLASS_SYSMEM,
+					       XE_MEM_REGION_CLASS_VRAM);
+	}
+
+	igt_describe("Check block-copy uncompressed blit");
+	igt_subtest_with_dynamic("block-copy-uncompressed") {
+		struct test_config config = {};
+
+		block_copy_test(xe, &config, set, BLOCK_COPY);
+	}
+
+	igt_describe("Check block-copy flatccs compressed blit");
+	igt_subtest_with_dynamic("block-copy-compressed") {
+		struct test_config config = { .compression = true };
+
+		block_copy_test(xe, &config, set, BLOCK_COPY);
+	}
+
+	igt_describe("Check block-multicopy flatccs compressed blit");
+	igt_subtest_with_dynamic("block-multicopy-compressed") {
+		struct test_config config = { .compression = true };
+
+		block_copy_test(xe, &config, set, BLOCK_MULTICOPY);
+	}
+
+	igt_describe("Check block-multicopy flatccs inplace decompression blit");
+	igt_subtest_with_dynamic("block-multicopy-inplace") {
+		struct test_config config = { .compression = true,
+					      .inplace = true };
+
+		block_copy_test(xe, &config, set, BLOCK_MULTICOPY);
+	}
+
+	igt_describe("Check flatccs data can be copied from/to surface");
+	igt_subtest_with_dynamic("ctrl-surf-copy") {
+		struct test_config config = { .compression = true,
+					      .surfcopy = true };
+
+		block_copy_test(xe, &config, set, BLOCK_COPY);
+	}
+
+	igt_describe("Check flatccs data are physically tagged and visible"
+		     " in different contexts");
+	igt_subtest_with_dynamic("ctrl-surf-copy-new-ctx") {
+		struct test_config config = { .compression = true,
+					      .surfcopy = true,
+					      .new_ctx = true };
+
+		block_copy_test(xe, &config, set, BLOCK_COPY);
+	}
+
+	igt_describe("Check flatccs data persists after suspend / resume (S0)");
+	igt_subtest_with_dynamic("suspend-resume") {
+		struct test_config config = { .compression = true,
+					      .surfcopy = true,
+					      .suspend_resume = true };
+
+		block_copy_test(xe, &config, set, BLOCK_COPY);
+	}
+
+	igt_fixture {
+		xe_device_put(xe);
+		close(xe);
+	}
+}
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [igt-dev] [PATCH i-g-t v2 15/16] tests/xe_exercise_blt: Check blitter library fast-copy for Xe
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (13 preceding siblings ...)
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 14/16] tests/xe_ccs: Check if flatccs is working with block-copy for Xe Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-07 11:10   ` Karolina Stolarek
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 16/16] tests/api-intel-allocator: Adopt to exercise allocator to Xe Zbigniew Kempczyński
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

Port this test to work on Xe. Instead of adding conditional code for Xe
to the existing i915 test, which would decrease readability, this is a
new test dedicated to Xe.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 tests/meson.build          |   1 +
 tests/xe/xe_exercise_blt.c | 372 +++++++++++++++++++++++++++++++++++++
 2 files changed, 373 insertions(+)
 create mode 100644 tests/xe/xe_exercise_blt.c

diff --git a/tests/meson.build b/tests/meson.build
index 9bca57a5e8..137a5cf01f 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -257,6 +257,7 @@ xe_progs = [
 	'xe_exec_reset',
 	'xe_exec_store',
 	'xe_exec_threads',
+	'xe_exercise_blt',
 	'xe_gpgpu_fill',
 	'xe_guc_pc',
 	'xe_huc_copy',
diff --git a/tests/xe/xe_exercise_blt.c b/tests/xe/xe_exercise_blt.c
new file mode 100644
index 0000000000..8340cf7148
--- /dev/null
+++ b/tests/xe/xe_exercise_blt.c
@@ -0,0 +1,372 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2023 Intel Corporation
+ */
+
+#include "igt.h"
+#include "drm.h"
+#include "lib/intel_chipset.h"
+#include "intel_blt.h"
+#include "intel_mocs.h"
+#include "xe/xe_ioctl.h"
+#include "xe/xe_query.h"
+#include "xe/xe_util.h"
+
+/**
+ * TEST: xe exercise blt
+ * Description: Exercise blitter commands on Xe
+ * Feature: blitter
+ * Run type: FULL
+ * Test category: GEM_Legacy
+ *
+ * SUBTEST: fast-copy
+ * Description: Check fast-copy blit
+ *
+ * SUBTEST: fast-copy-emit
+ * Description: Check multiple fast-copy in one batch
+ */
+
+IGT_TEST_DESCRIPTION("Exercise blitter commands on Xe");
+
+static struct param {
+	int tiling;
+	bool write_png;
+	bool print_bb;
+	bool print_surface_info;
+	int width;
+	int height;
+} param = {
+	.tiling = -1,
+	.write_png = false,
+	.print_bb = false,
+	.print_surface_info = false,
+	.width = 512,
+	.height = 512,
+};
+
+#define PRINT_SURFACE_INFO(name, obj) do { \
+	if (param.print_surface_info) \
+		blt_surface_info((name), (obj)); } while (0)
+
+#define WRITE_PNG(fd, id, name, obj, w, h) do { \
+	if (param.write_png) \
+		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)
+
+struct blt_fast_copy_data {
+	int xe;
+	struct blt_copy_object src;
+	struct blt_copy_object mid;
+	struct blt_copy_object dst;
+
+	struct blt_copy_batch bb;
+	enum blt_color_depth color_depth;
+
+	/* debug stuff */
+	bool print_bb;
+};
+
+static int fast_copy_one_bb(int xe,
+			    const intel_ctx_t *ctx,
+			    uint64_t ahnd,
+			    const struct blt_fast_copy_data *blt)
+{
+	struct blt_copy_data blt_tmp;
+	uint64_t bb_offset, alignment;
+	uint64_t bb_pos = 0;
+	int ret = 0;
+
+	alignment = xe_get_default_alignment(xe);
+
+	get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
+	get_offset(ahnd, blt->mid.handle, blt->mid.size, alignment);
+	get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
+
+	/* First blit */
+	blt_copy_init(xe, &blt_tmp);
+	blt_tmp.src = blt->src;
+	blt_tmp.dst = blt->mid;
+	blt_tmp.bb = blt->bb;
+	blt_tmp.color_depth = blt->color_depth;
+	blt_tmp.print_bb = blt->print_bb;
+	bb_pos = emit_blt_fast_copy(xe, ahnd, &blt_tmp, bb_pos, false);
+
+	/* Second blit */
+	blt_copy_init(xe, &blt_tmp);
+	blt_tmp.src = blt->mid;
+	blt_tmp.dst = blt->dst;
+	blt_tmp.bb = blt->bb;
+	blt_tmp.color_depth = blt->color_depth;
+	blt_tmp.print_bb = blt->print_bb;
+	bb_pos = emit_blt_fast_copy(xe, ahnd, &blt_tmp, bb_pos, true);
+
+	intel_ctx_xe_exec(ctx, ahnd, bb_offset);
+
+	return ret;
+}
+
+static void fast_copy_emit(int xe, const intel_ctx_t *ctx,
+			   uint32_t region1, uint32_t region2,
+			   enum blt_tiling_type mid_tiling)
+{
+	struct blt_copy_data bltinit = {};
+	struct blt_fast_copy_data blt = {};
+	struct blt_copy_object *src, *mid, *dst;
+	const uint32_t bpp = 32;
+	uint64_t bb_size = xe_get_default_alignment(xe);
+	uint64_t ahnd = intel_allocator_open_full(xe, ctx->vm, 0, 0,
+						  INTEL_ALLOCATOR_SIMPLE,
+						  ALLOC_STRATEGY_LOW_TO_HIGH, 0);
+	uint32_t bb, width = param.width, height = param.height;
+	int result;
+
+	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
+
+	blt_copy_init(xe, &bltinit);
+	src = blt_create_object(&bltinit, region1, width, height, bpp, 0,
+				T_LINEAR, COMPRESSION_DISABLED, 0, true);
+	mid = blt_create_object(&bltinit, region2, width, height, bpp, 0,
+				mid_tiling, COMPRESSION_DISABLED, 0, true);
+	dst = blt_create_object(&bltinit, region1, width, height, bpp, 0,
+				T_LINEAR, COMPRESSION_DISABLED, 0, true);
+	igt_assert(src->size == dst->size);
+
+	PRINT_SURFACE_INFO("src", src);
+	PRINT_SURFACE_INFO("mid", mid);
+	PRINT_SURFACE_INFO("dst", dst);
+
+	blt_surface_fill_rect(xe, src, width, height);
+	WRITE_PNG(xe, mid_tiling, "src", src, width, height);
+
+	memset(&blt, 0, sizeof(blt));
+	blt.color_depth = CD_32bit;
+	blt.print_bb = param.print_bb;
+	blt_set_copy_object(&blt.src, src);
+	blt_set_copy_object(&blt.mid, mid);
+	blt_set_copy_object(&blt.dst, dst);
+	blt_set_batch(&blt.bb, bb, bb_size, region1);
+
+	fast_copy_one_bb(xe, ctx, ahnd, &blt);
+
+	WRITE_PNG(xe, mid_tiling, "mid", &blt.mid, width, height);
+	WRITE_PNG(xe, mid_tiling, "dst", &blt.dst, width, height);
+
+	result = memcmp(src->ptr, blt.dst.ptr, src->size);
+
+	blt_destroy_object(xe, src);
+	blt_destroy_object(xe, mid);
+	blt_destroy_object(xe, dst);
+	gem_close(xe, bb);
+	put_ahnd(ahnd);
+
+	igt_assert_f(!result, "source and destination surfaces differ!\n");
+}
+
+static void fast_copy(int xe, const intel_ctx_t *ctx,
+		      uint32_t region1, uint32_t region2,
+		      enum blt_tiling_type mid_tiling)
+{
+	struct blt_copy_data blt = {};
+	struct blt_copy_object *src, *mid, *dst;
+	const uint32_t bpp = 32;
+	uint64_t bb_size = xe_get_default_alignment(xe);
+	uint64_t ahnd = intel_allocator_open_full(xe, ctx->vm, 0, 0,
+						  INTEL_ALLOCATOR_SIMPLE,
+						  ALLOC_STRATEGY_LOW_TO_HIGH, 0);
+	uint32_t bb;
+	uint32_t width = param.width, height = param.height;
+	int result;
+
+	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
+
+	blt_copy_init(xe, &blt);
+	src = blt_create_object(&blt, region1, width, height, bpp, 0,
+				T_LINEAR, COMPRESSION_DISABLED, 0, true);
+	mid = blt_create_object(&blt, region2, width, height, bpp, 0,
+				mid_tiling, COMPRESSION_DISABLED, 0, true);
+	dst = blt_create_object(&blt, region1, width, height, bpp, 0,
+				T_LINEAR, COMPRESSION_DISABLED, 0, true);
+	igt_assert(src->size == dst->size);
+
+	blt_surface_fill_rect(xe, src, width, height);
+
+	blt.color_depth = CD_32bit;
+	blt.print_bb = param.print_bb;
+	blt_set_copy_object(&blt.src, src);
+	blt_set_copy_object(&blt.dst, mid);
+	blt_set_batch(&blt.bb, bb, bb_size, region1);
+
+	blt_fast_copy(xe, ctx, NULL, ahnd, &blt);
+
+	WRITE_PNG(xe, mid_tiling, "src", &blt.src, width, height);
+	WRITE_PNG(xe, mid_tiling, "mid", &blt.dst, width, height);
+
+	blt_copy_init(xe, &blt);
+	blt.color_depth = CD_32bit;
+	blt.print_bb = param.print_bb;
+	blt_set_copy_object(&blt.src, mid);
+	blt_set_copy_object(&blt.dst, dst);
+	blt_set_batch(&blt.bb, bb, bb_size, region1);
+
+	blt_fast_copy(xe, ctx, NULL, ahnd, &blt);
+
+	WRITE_PNG(xe, mid_tiling, "dst", &blt.dst, width, height);
+
+	result = memcmp(src->ptr, blt.dst.ptr, src->size);
+
+	blt_destroy_object(xe, src);
+	blt_destroy_object(xe, mid);
+	blt_destroy_object(xe, dst);
+	gem_close(xe, bb);
+	put_ahnd(ahnd);
+
+	igt_assert_f(!result, "source and destination surfaces differ!\n");
+}
+
+enum fast_copy_func {
+	FAST_COPY,
+	FAST_COPY_EMIT
+};
+
+static char *full_subtest_str(char *regtxt, enum blt_tiling_type tiling,
+			      enum fast_copy_func func)
+{
+	char *name;
+	int len;
+
+	len = asprintf(&name, "%s-%s%s", blt_tiling_name(tiling), regtxt,
+		       func == FAST_COPY_EMIT ? "-emit" : "");
+
+	igt_assert_f(len >= 0, "asprintf failed!\n");
+
+	return name;
+}
+
+static void fast_copy_test(int xe,
+			   struct igt_collection *set,
+			   enum fast_copy_func func)
+{
+	struct drm_xe_engine_class_instance inst = {
+		.engine_class = DRM_XE_ENGINE_CLASS_COPY,
+	};
+	struct igt_collection *regions;
+	void (*copy_func)(int xe, const intel_ctx_t *ctx,
+			  uint32_t r1, uint32_t r2, enum blt_tiling_type tiling);
+	intel_ctx_t *ctx;
+	int tiling;
+
+	for_each_tiling(tiling) {
+		if (!blt_fast_copy_supports_tiling(xe, tiling))
+			continue;
+
+		for_each_variation_r(regions, 2, set) {
+			uint32_t region1, region2;
+			uint32_t vm, engine;
+			char *regtxt, *test_name;
+
+			region1 = igt_collection_get_value(regions, 0);
+			region2 = igt_collection_get_value(regions, 1);
+
+			vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
+			engine = xe_engine_create(xe, vm, &inst, 0);
+			ctx = intel_ctx_xe(xe, vm, engine, 0, 0, 0);
+
+			copy_func = (func == FAST_COPY) ? fast_copy : fast_copy_emit;
+			regtxt = xe_memregion_dynamic_subtest_name(xe, regions);
+			test_name = full_subtest_str(regtxt, tiling, func);
+
+			igt_dynamic_f("%s", test_name) {
+				copy_func(xe, ctx,
+					  region1, region2,
+					  tiling);
+			}
+
+			free(regtxt);
+			free(test_name);
+			xe_engine_destroy(xe, engine);
+			xe_vm_destroy(xe, vm);
+			free(ctx);
+		}
+	}
+}
+
+static int opt_handler(int opt, int opt_index, void *data)
+{
+	switch (opt) {
+	case 'b':
+		param.print_bb = true;
+		igt_debug("Print bb: %d\n", param.print_bb);
+		break;
+	case 'p':
+		param.write_png = true;
+		igt_debug("Write png: %d\n", param.write_png);
+		break;
+	case 's':
+		param.print_surface_info = true;
+		igt_debug("Print surface info: %d\n", param.print_surface_info);
+		break;
+	case 't':
+		param.tiling = atoi(optarg);
+		igt_debug("Tiling: %d\n", param.tiling);
+		break;
+	case 'W':
+		param.width = atoi(optarg);
+		igt_debug("Width: %d\n", param.width);
+		break;
+	case 'H':
+		param.height = atoi(optarg);
+		igt_debug("Height: %d\n", param.height);
+		break;
+	default:
+		return IGT_OPT_HANDLER_ERROR;
+	}
+
+	return IGT_OPT_HANDLER_SUCCESS;
+}
+
+const char *help_str =
+	"  -b\tPrint bb\n"
+	"  -p\tWrite PNG\n"
+	"  -s\tPrint surface info\n"
+	"  -t\tTiling format (0 - linear, 1 - XMAJOR, 2 - YMAJOR, 3 - TILE4, 4 - TILE64, 5 - YFMAJOR)\n"
+	"  -W\tWidth (default 512)\n"
+	"  -H\tHeight (default 512)"
+	;
+
+igt_main_args("bpst:W:H:", NULL, help_str, opt_handler, NULL)
+{
+	struct igt_collection *set;
+	int xe;
+
+	igt_fixture {
+		xe = drm_open_driver(DRIVER_XE);
+		igt_require(blt_has_fast_copy(xe));
+
+		xe_device_get(xe);
+
+		set = xe_get_memory_region_set(xe,
+					       XE_MEM_REGION_CLASS_SYSMEM,
+					       XE_MEM_REGION_CLASS_VRAM);
+	}
+
+	igt_describe("Check fast-copy blit");
+	igt_subtest_with_dynamic("fast-copy") {
+		fast_copy_test(xe, set, FAST_COPY);
+	}
+
+	igt_describe("Check multiple fast-copy in one batch");
+	igt_subtest_with_dynamic("fast-copy-emit") {
+		fast_copy_test(xe, set, FAST_COPY_EMIT);
+	}
+
+	igt_fixture {
+		drm_close_driver(xe);
+	}
+}
-- 
2.34.1


* [igt-dev] [PATCH i-g-t v2 16/16] tests/api-intel-allocator: Adopt to exercise allocator to Xe
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (14 preceding siblings ...)
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 15/16] tests/xe_exercise_blt: Check blitter library fast-copy " Zbigniew Kempczyński
@ 2023-07-06  6:05 ` Zbigniew Kempczyński
  2023-07-07 10:11   ` Karolina Stolarek
  2023-07-06  6:58 ` [igt-dev] ✓ Fi.CI.BAT: success for Extend intel_blt to work on Xe (rev2) Patchwork
  2023-07-06  9:26 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
  17 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06  6:05 UTC (permalink / raw)
  To: igt-dev

Xe vm binding requires some cooperation from the allocator side
(tracking alloc()/free() operations). This diverges the path internally
inside the allocator, so it is necessary to check that the allocator
properly supports both drivers.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 tests/i915/api_intel_allocator.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/tests/i915/api_intel_allocator.c b/tests/i915/api_intel_allocator.c
index 238e76c9fd..e70de0e1d0 100644
--- a/tests/i915/api_intel_allocator.c
+++ b/tests/i915/api_intel_allocator.c
@@ -9,6 +9,9 @@
 #include "igt.h"
 #include "igt_aux.h"
 #include "intel_allocator.h"
+#include "xe/xe_ioctl.h"
+#include "xe/xe_query.h"
+
 /**
  * TEST: api intel allocator
  * Category: Infrastructure
@@ -454,6 +457,7 @@ static void __simple_allocs(int fd)
 	uint32_t handles[SIMPLE_GROUP_ALLOCS];
 	uint64_t ahnd;
 	uint32_t ctx;
+	bool is_xe = is_xe_device(fd);
 	int i;
 
 	ctx = rand() % 2;
@@ -463,7 +467,12 @@ static void __simple_allocs(int fd)
 		uint32_t size;
 
 		size = (rand() % 4 + 1) * 0x1000;
-		handles[i] = gem_create(fd, size);
+		if (is_xe)
+			handles[i] = xe_bo_create_flags(fd, 0, size,
+							system_memory(fd));
+		else
+			handles[i] = gem_create(fd, size);
+
 		intel_allocator_alloc(ahnd, handles[i], size, 0x1000);
 	}
 
@@ -573,8 +582,6 @@ static void reopen(int fd)
 {
 	int fd2;
 
-	igt_require_gem(fd);
-
 	fd2 = drm_reopen_driver(fd);
 
 	__reopen_allocs(fd, fd2, true);
@@ -587,8 +594,6 @@ static void reopen_fork(int fd)
 {
 	int fd2;
 
-	igt_require_gem(fd);
-
 	intel_allocator_multiprocess_start();
 
 	fd2 = drm_reopen_driver(fd);
@@ -838,7 +843,7 @@ igt_main
 	struct allocators *a;
 
 	igt_fixture {
-		fd = drm_open_driver(DRIVER_INTEL);
+		fd = drm_open_driver(DRIVER_INTEL | DRIVER_XE);
 		atomic_init(&next_handle, 1);
 		srandom(0xdeadbeef);
 	}
@@ -911,12 +916,16 @@ igt_main
 	igt_subtest_f("open-vm")
 		open_vm(fd);
 
-	igt_subtest_f("execbuf-with-allocator")
+	igt_subtest_f("execbuf-with-allocator") {
+		igt_require(is_i915_device(fd));
 		execbuf_with_allocator(fd);
+	}
 
 	igt_describe("Verifies creating and executing bb from gem pool");
-	igt_subtest_f("gem-pool")
+	igt_subtest_f("gem-pool") {
+		igt_require(is_i915_device(fd));
 		gem_pool(fd);
+	}
 
 	igt_fixture
 		drm_close_driver(fd);
-- 
2.34.1


* [igt-dev] ✓ Fi.CI.BAT: success for Extend intel_blt to work on Xe (rev2)
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (15 preceding siblings ...)
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 16/16] tests/api-intel-allocator: Adopt to exercise allocator to Xe Zbigniew Kempczyński
@ 2023-07-06  6:58 ` Patchwork
  2023-07-06  9:26 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
  17 siblings, 0 replies; 46+ messages in thread
From: Patchwork @ 2023-07-06  6:58 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

[-- Attachment #1: Type: text/plain, Size: 10125 bytes --]

== Series Details ==

Series: Extend intel_blt to work on Xe (rev2)
URL   : https://patchwork.freedesktop.org/series/120162/
State : success

== Summary ==

CI Bug Log - changes from IGT_7373 -> IGTPW_9350
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/index.html

Participating hosts (41 -> 41)
------------------------------

  Additional (1): fi-kbl-soraka 
  Missing    (1): fi-snb-2520m 

Known issues
------------

  Here are the changes found in IGTPW_9350 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_huc_copy@huc-copy:
    - fi-kbl-soraka:      NOTRUN -> [SKIP][1] ([fdo#109271] / [i915#2190])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/fi-kbl-soraka/igt@gem_huc_copy@huc-copy.html

  * igt@gem_lmem_swapping@basic:
    - fi-kbl-soraka:      NOTRUN -> [SKIP][2] ([fdo#109271] / [i915#4613]) +3 similar issues
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/fi-kbl-soraka/igt@gem_lmem_swapping@basic.html

  * igt@i915_selftest@live@gt_mocs:
    - bat-mtlp-6:         [PASS][3] -> [DMESG-FAIL][4] ([i915#7059])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/bat-mtlp-6/igt@i915_selftest@live@gt_mocs.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-mtlp-6/igt@i915_selftest@live@gt_mocs.html

  * igt@i915_selftest@live@gt_pm:
    - fi-kbl-soraka:      NOTRUN -> [DMESG-FAIL][5] ([i915#1886] / [i915#7913])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html

  * igt@i915_selftest@live@guc:
    - bat-rpls-2:         NOTRUN -> [DMESG-WARN][6] ([i915#7852])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-rpls-2/igt@i915_selftest@live@guc.html

  * igt@i915_selftest@live@migrate:
    - bat-mtlp-6:         [PASS][7] -> [DMESG-FAIL][8] ([i915#7699])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/bat-mtlp-6/igt@i915_selftest@live@migrate.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-mtlp-6/igt@i915_selftest@live@migrate.html

  * igt@i915_selftest@live@slpc:
    - bat-rpls-1:         NOTRUN -> [DMESG-WARN][9] ([i915#6367])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-rpls-1/igt@i915_selftest@live@slpc.html

  * igt@i915_suspend@basic-s2idle-without-i915:
    - bat-rpls-2:         NOTRUN -> [ABORT][10] ([i915#6687] / [i915#8668])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-rpls-2/igt@i915_suspend@basic-s2idle-without-i915.html

  * igt@i915_suspend@basic-s3-without-i915:
    - bat-mtlp-6:         NOTRUN -> [SKIP][11] ([i915#6645])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-mtlp-6/igt@i915_suspend@basic-s3-without-i915.html
    - bat-rpls-1:         NOTRUN -> [ABORT][12] ([i915#6687] / [i915#7978] / [i915#8668])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-rpls-1/igt@i915_suspend@basic-s3-without-i915.html

  * igt@kms_chamelium_hpd@common-hpd-after-suspend:
    - bat-mtlp-6:         NOTRUN -> [SKIP][13] ([i915#7828])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-mtlp-6/igt@kms_chamelium_hpd@common-hpd-after-suspend.html
    - fi-blb-e6850:       NOTRUN -> [SKIP][14] ([fdo#109271])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/fi-blb-e6850/igt@kms_chamelium_hpd@common-hpd-after-suspend.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
    - fi-kbl-soraka:      NOTRUN -> [SKIP][15] ([fdo#109271]) +15 similar issues
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/fi-kbl-soraka/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_pipe_crc_basic@suspend-read-crc:
    - bat-mtlp-6:         NOTRUN -> [SKIP][16] ([i915#1845] / [i915#4078])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-mtlp-6/igt@kms_pipe_crc_basic@suspend-read-crc.html

  * igt@kms_psr@primary_mmap_gtt:
    - bat-rplp-1:         NOTRUN -> [SKIP][17] ([i915#1072]) +1 similar issue
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-rplp-1/igt@kms_psr@primary_mmap_gtt.html

  * igt@kms_setmode@basic-clone-single-crtc:
    - bat-rplp-1:         NOTRUN -> [ABORT][18] ([i915#8260] / [i915#8668])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-rplp-1/igt@kms_setmode@basic-clone-single-crtc.html

  
#### Possible fixes ####

  * igt@i915_selftest@live@gem_contexts:
    - bat-mtlp-6:         [ABORT][19] ([i915#8630]) -> [PASS][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/bat-mtlp-6/igt@i915_selftest@live@gem_contexts.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-mtlp-6/igt@i915_selftest@live@gem_contexts.html

  * igt@i915_selftest@live@requests:
    - bat-rpls-1:         [ABORT][21] ([i915#7920] / [i915#7982]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/bat-rpls-1/igt@i915_selftest@live@requests.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-rpls-1/igt@i915_selftest@live@requests.html
    - bat-mtlp-6:         [DMESG-FAIL][23] ([i915#8497]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/bat-mtlp-6/igt@i915_selftest@live@requests.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-mtlp-6/igt@i915_selftest@live@requests.html

  * igt@i915_selftest@live@reset:
    - bat-rpls-2:         [ABORT][25] ([i915#4983] / [i915#7461] / [i915#7913] / [i915#8347]) -> [PASS][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/bat-rpls-2/igt@i915_selftest@live@reset.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-rpls-2/igt@i915_selftest@live@reset.html

  * igt@i915_selftest@live@workarounds:
    - bat-mtlp-6:         [DMESG-FAIL][27] ([i915#6763]) -> [PASS][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/bat-mtlp-6/igt@i915_selftest@live@workarounds.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-mtlp-6/igt@i915_selftest@live@workarounds.html

  * igt@i915_suspend@basic-s2idle-without-i915:
    - fi-blb-e6850:       [ABORT][29] ([i915#7985]) -> [PASS][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/fi-blb-e6850/igt@i915_suspend@basic-s2idle-without-i915.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/fi-blb-e6850/igt@i915_suspend@basic-s2idle-without-i915.html

  
#### Warnings ####

  * igt@i915_module_load@load:
    - bat-adlp-11:        [DMESG-WARN][31] ([i915#4423]) -> [ABORT][32] ([i915#4423])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/bat-adlp-11/igt@i915_module_load@load.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-adlp-11/igt@i915_module_load@load.html

  * igt@kms_psr@cursor_plane_move:
    - bat-rplp-1:         [ABORT][33] ([i915#8434]) -> [SKIP][34] ([i915#1072])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/bat-rplp-1/igt@kms_psr@cursor_plane_move.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/bat-rplp-1/igt@kms_psr@cursor_plane_move.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1845]: https://gitlab.freedesktop.org/drm/intel/issues/1845
  [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#4078]: https://gitlab.freedesktop.org/drm/intel/issues/4078
  [i915#4423]: https://gitlab.freedesktop.org/drm/intel/issues/4423
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4983]: https://gitlab.freedesktop.org/drm/intel/issues/4983
  [i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367
  [i915#6645]: https://gitlab.freedesktop.org/drm/intel/issues/6645
  [i915#6687]: https://gitlab.freedesktop.org/drm/intel/issues/6687
  [i915#6763]: https://gitlab.freedesktop.org/drm/intel/issues/6763
  [i915#7059]: https://gitlab.freedesktop.org/drm/intel/issues/7059
  [i915#7461]: https://gitlab.freedesktop.org/drm/intel/issues/7461
  [i915#7699]: https://gitlab.freedesktop.org/drm/intel/issues/7699
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#7852]: https://gitlab.freedesktop.org/drm/intel/issues/7852
  [i915#7913]: https://gitlab.freedesktop.org/drm/intel/issues/7913
  [i915#7920]: https://gitlab.freedesktop.org/drm/intel/issues/7920
  [i915#7978]: https://gitlab.freedesktop.org/drm/intel/issues/7978
  [i915#7982]: https://gitlab.freedesktop.org/drm/intel/issues/7982
  [i915#7985]: https://gitlab.freedesktop.org/drm/intel/issues/7985
  [i915#8260]: https://gitlab.freedesktop.org/drm/intel/issues/8260
  [i915#8347]: https://gitlab.freedesktop.org/drm/intel/issues/8347
  [i915#8434]: https://gitlab.freedesktop.org/drm/intel/issues/8434
  [i915#8497]: https://gitlab.freedesktop.org/drm/intel/issues/8497
  [i915#8630]: https://gitlab.freedesktop.org/drm/intel/issues/8630
  [i915#8668]: https://gitlab.freedesktop.org/drm/intel/issues/8668


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_7373 -> IGTPW_9350

  CI-20190529: 20190529
  CI_DRM_13347: ee9b323b764f1f14ae4e6e8213164bd250160770 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_9350: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/index.html
  IGT_7373: 654347ccaa55cbfcd10b978cc6662ef6db25224d @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git


Testlist changes
----------------

+igt@xe_ccs@block-copy-compressed
+igt@xe_ccs@block-copy-uncompressed
+igt@xe_ccs@block-multicopy-compressed
+igt@xe_ccs@block-multicopy-inplace
+igt@xe_ccs@ctrl-surf-copy
+igt@xe_ccs@ctrl-surf-copy-new-ctx
+igt@xe_ccs@suspend-resume
+igt@xe_exercise_blt@fast-copy
+igt@xe_exercise_blt@fast-copy-emit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/index.html

[-- Attachment #2: Type: text/html, Size: 11943 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 02/16] lib/intel_allocator: Drop aliasing allocator handle api
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 02/16] lib/intel_allocator: Drop aliasing allocator handle api Zbigniew Kempczyński
@ 2023-07-06  8:31   ` Karolina Stolarek
  2023-07-06 11:20     ` Zbigniew Kempczyński
  0 siblings, 1 reply; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-06  8:31 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> There's no real user of this api; let's drop it.
> 

Should we drop REQ_OPEN_AS and/or RESP_OPEN_AS enum values as well?

All the best,
Karolina

> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>   lib/intel_allocator.c | 18 ------------------
>   lib/intel_allocator.h |  1 -
>   2 files changed, 19 deletions(-)
> 
> diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
> index 8161221dbf..c31576ecef 100644
> --- a/lib/intel_allocator.c
> +++ b/lib/intel_allocator.c
> @@ -1037,24 +1037,6 @@ uint64_t intel_allocator_open_vm(int fd, uint32_t vm, uint8_t allocator_type)
>   					    ALLOC_STRATEGY_HIGH_TO_LOW, 0);
>   }
>   
> -uint64_t intel_allocator_open_vm_as(uint64_t allocator_handle, uint32_t new_vm)
> -{
> -	struct alloc_req req = { .request_type = REQ_OPEN_AS,
> -				 .allocator_handle = allocator_handle,
> -				 .open_as.new_vm = new_vm };
> -	struct alloc_resp resp;
> -
> -	/* Get child_tid only once at open() */
> -	if (child_tid == -1)
> -		child_tid = gettid();
> -
> -	igt_assert(handle_request(&req, &resp) == 0);
> -	igt_assert(resp.open_as.allocator_handle);
> -	igt_assert(resp.response_type == RESP_OPEN_AS);
> -
> -	return resp.open.allocator_handle;
> -}
> -
>   /**
>    * intel_allocator_close:
>    * @allocator_handle: handle to the allocator that will be closed
> diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
> index a6bf573e9d..3ec74f6191 100644
> --- a/lib/intel_allocator.h
> +++ b/lib/intel_allocator.h
> @@ -182,7 +182,6 @@ uint64_t intel_allocator_open_vm_full(int fd, uint32_t vm,
>   				      enum allocator_strategy strategy,
>   				      uint64_t default_alignment);
>   
> -uint64_t intel_allocator_open_vm_as(uint64_t allocator_handle, uint32_t new_vm);
>   bool intel_allocator_close(uint64_t allocator_handle);
>   void intel_allocator_get_address_range(uint64_t allocator_handle,
>   				       uint64_t *startp, uint64_t *endp);

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 01/16] tests/api_intel_allocator: Don't use allocator ahnd aliasing api
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 01/16] tests/api_intel_allocator: Don't use allocator ahnd aliasing api Zbigniew Kempczyński
@ 2023-07-06  9:04   ` Karolina Stolarek
  0 siblings, 0 replies; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-06  9:04 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> There are no tests (apart from this one) that use the aliasing ahnd
> api - intel_allocator_open_vm_as(). Additionally it is problematic
> when adopting the allocator to xe, where I need to track allocations
> to support easy vm binding. Let's adapt "open-vm" not to use
> this api.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>   tests/i915/api_intel_allocator.c | 21 ++++++++-------------
>   1 file changed, 8 insertions(+), 13 deletions(-)
> 
> diff --git a/tests/i915/api_intel_allocator.c b/tests/i915/api_intel_allocator.c
> index b7e3efb87f..238e76c9fd 100644
> --- a/tests/i915/api_intel_allocator.c
> +++ b/tests/i915/api_intel_allocator.c
> @@ -612,32 +612,27 @@ static void reopen_fork(int fd)
>   
>   static void open_vm(int fd)
>   {
> -	uint64_t ahnd[4], offset[4], size = 0x1000;
> +	uint64_t ahnd[3], offset[3], size = 0x1000;
>   	int i, n = ARRAY_SIZE(ahnd);
>   
>   	ahnd[0] = intel_allocator_open_vm(fd, 1, INTEL_ALLOCATOR_SIMPLE);
>   	ahnd[1] = intel_allocator_open_vm(fd, 1, INTEL_ALLOCATOR_SIMPLE);
> -	ahnd[2] = intel_allocator_open_vm_as(ahnd[1], 2);
> -	ahnd[3] = intel_allocator_open(fd, 3, INTEL_ALLOCATOR_SIMPLE);
> +	ahnd[2] = intel_allocator_open(fd, 2, INTEL_ALLOCATOR_SIMPLE);
>   
>   	offset[0] = intel_allocator_alloc(ahnd[0], 1, size, 0);
>   	offset[1] = intel_allocator_alloc(ahnd[1], 2, size, 0);
>   	igt_assert(offset[0] != offset[1]);
>   
> -	offset[2] = intel_allocator_alloc(ahnd[2], 3, size, 0);
> -	igt_assert(offset[0] != offset[2] && offset[1] != offset[2]);
> -
> -	offset[3] = intel_allocator_alloc(ahnd[3], 1, size, 0);
> -	igt_assert(offset[0] == offset[3]);
> +	offset[2] = intel_allocator_alloc(ahnd[2], 1, size, 0);
> +	igt_assert(offset[0] == offset[2]);
>   
>   	/*
> -	 * As ahnd[0-2] lead to same allocator check can we free all handles
> +	 * As ahnd[0-1] lead to same allocator check can we free all handles

nit: Shouldn't this be "we can free" instead of "can we free"?

Apart from that, I think the test update looks correct:

Reviewed-by: Karolina Stolarek <karolina.stolarek@intel.com>

>   	 * using selected ahnd.
>   	 */
> -	intel_allocator_free(ahnd[0], 1);
> -	intel_allocator_free(ahnd[0], 2);
> -	intel_allocator_free(ahnd[0], 3);
> -	intel_allocator_free(ahnd[3], 1);
> +	igt_assert_eq(intel_allocator_free(ahnd[0], 1), true);
> +	igt_assert_eq(intel_allocator_free(ahnd[1], 2), true);
> +	igt_assert_eq(intel_allocator_free(ahnd[2], 1), true);
>   
>   	for (i = 0; i < n - 1; i++)
>   		igt_assert_eq(intel_allocator_close(ahnd[i]), (i == n - 2));

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [igt-dev] ✓ Fi.CI.IGT: success for Extend intel_blt to work on Xe (rev2)
  2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
                   ` (16 preceding siblings ...)
  2023-07-06  6:58 ` [igt-dev] ✓ Fi.CI.BAT: success for Extend intel_blt to work on Xe (rev2) Patchwork
@ 2023-07-06  9:26 ` Patchwork
  17 siblings, 0 replies; 46+ messages in thread
From: Patchwork @ 2023-07-06  9:26 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

[-- Attachment #1: Type: text/plain, Size: 70103 bytes --]

== Series Details ==

Series: Extend intel_blt to work on Xe (rev2)
URL   : https://patchwork.freedesktop.org/series/120162/
State : success

== Summary ==

CI Bug Log - changes from IGT_7373_full -> IGTPW_9350_full
==========================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/index.html

Participating hosts (9 -> 9)
------------------------------

  No changes in participating hosts

Known issues
------------

  Here are the changes found in IGTPW_9350_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@api_intel_bb@crc32:
    - shard-tglu:         NOTRUN -> [SKIP][1] ([i915#6230])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-9/igt@api_intel_bb@crc32.html

  * igt@api_intel_bb@object-reloc-purge-cache:
    - shard-mtlp:         NOTRUN -> [SKIP][2] ([i915#8411])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-1/igt@api_intel_bb@object-reloc-purge-cache.html

  * igt@api_intel_bb@render-ccs:
    - shard-dg2:          NOTRUN -> [FAIL][3] ([i915#6122])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@api_intel_bb@render-ccs.html

  * igt@drm_fdinfo@most-busy-idle-check-all@rcs0:
    - shard-rkl:          NOTRUN -> [FAIL][4] ([i915#7742])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@drm_fdinfo@most-busy-idle-check-all@rcs0.html

  * igt@drm_fdinfo@most-busy-idle-check-all@vecs1:
    - shard-dg2:          NOTRUN -> [SKIP][5] ([i915#8414]) +19 similar issues
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@drm_fdinfo@most-busy-idle-check-all@vecs1.html

  * igt@drm_fdinfo@virtual-busy-idle-all:
    - shard-mtlp:         NOTRUN -> [SKIP][6] ([i915#8414])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-6/igt@drm_fdinfo@virtual-busy-idle-all.html

  * igt@feature_discovery@display-3x:
    - shard-tglu:         NOTRUN -> [SKIP][7] ([i915#1839])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-3/igt@feature_discovery@display-3x.html

  * igt@gem_ctx_persistence@heartbeat-close:
    - shard-dg2:          NOTRUN -> [SKIP][8] ([i915#8555])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-6/igt@gem_ctx_persistence@heartbeat-close.html

  * igt@gem_ctx_persistence@legacy-engines-hang@bsd2:
    - shard-mtlp:         [PASS][9] -> [ABORT][10] ([i915#8217])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-2/igt@gem_ctx_persistence@legacy-engines-hang@bsd2.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-3/igt@gem_ctx_persistence@legacy-engines-hang@bsd2.html

  * igt@gem_ctx_persistence@saturated-hostile@vecs0:
    - shard-mtlp:         [PASS][11] -> [FAIL][12] ([i915#7816]) +2 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-7/igt@gem_ctx_persistence@saturated-hostile@vecs0.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-1/igt@gem_ctx_persistence@saturated-hostile@vecs0.html

  * igt@gem_ctx_sseu@engines:
    - shard-dg2:          NOTRUN -> [SKIP][13] ([i915#280])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-3/igt@gem_ctx_sseu@engines.html

  * igt@gem_exec_balancer@parallel-contexts:
    - shard-rkl:          NOTRUN -> [SKIP][14] ([i915#4525])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-7/igt@gem_exec_balancer@parallel-contexts.html

  * igt@gem_exec_capture@capture-recoverable:
    - shard-tglu:         NOTRUN -> [SKIP][15] ([i915#6344])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-6/igt@gem_exec_capture@capture-recoverable.html

  * igt@gem_exec_fair@basic-deadline:
    - shard-rkl:          [PASS][16] -> [FAIL][17] ([i915#2846])
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-rkl-4/igt@gem_exec_fair@basic-deadline.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-2/igt@gem_exec_fair@basic-deadline.html

  * igt@gem_exec_fair@basic-none-share:
    - shard-dg2:          NOTRUN -> [SKIP][18] ([i915#3539] / [i915#4852]) +1 similar issue
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-12/igt@gem_exec_fair@basic-none-share.html

  * igt@gem_exec_fair@basic-pace@rcs0:
    - shard-glk:          [PASS][19] -> [FAIL][20] ([i915#2842]) +1 similar issue
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-glk8/igt@gem_exec_fair@basic-pace@rcs0.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-glk6/igt@gem_exec_fair@basic-pace@rcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-rkl:          [PASS][21] -> [FAIL][22] ([i915#2842])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-rkl-4/igt@gem_exec_fair@basic-throttle@rcs0.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-4/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_exec_fence@submit67:
    - shard-mtlp:         NOTRUN -> [SKIP][23] ([i915#4812])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-8/igt@gem_exec_fence@submit67.html

  * igt@gem_exec_parallel@engines@basic:
    - shard-mtlp:         [PASS][24] -> [FAIL][25] ([i915#8386])
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-5/igt@gem_exec_parallel@engines@basic.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-4/igt@gem_exec_parallel@engines@basic.html

  * igt@gem_exec_reloc@basic-write-gtt:
    - shard-dg2:          NOTRUN -> [SKIP][26] ([i915#3281]) +2 similar issues
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@gem_exec_reloc@basic-write-gtt.html

  * igt@gem_exec_reloc@basic-write-read:
    - shard-rkl:          NOTRUN -> [SKIP][27] ([i915#3281])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-2/igt@gem_exec_reloc@basic-write-read.html

  * igt@gem_exec_reloc@basic-write-wc:
    - shard-mtlp:         NOTRUN -> [SKIP][28] ([i915#3281]) +3 similar issues
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-4/igt@gem_exec_reloc@basic-write-wc.html

  * igt@gem_exec_schedule@preempt-queue-chain:
    - shard-dg2:          NOTRUN -> [SKIP][29] ([i915#4537] / [i915#4812])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@gem_exec_schedule@preempt-queue-chain.html

  * igt@gem_exec_suspend@basic-s4-devices@smem:
    - shard-tglu:         [PASS][30] -> [ABORT][31] ([i915#7975] / [i915#8213])
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-tglu-7/igt@gem_exec_suspend@basic-s4-devices@smem.html
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-10/igt@gem_exec_suspend@basic-s4-devices@smem.html

  * igt@gem_exec_whisper@basic-contexts-forked-all:
    - shard-mtlp:         [PASS][32] -> [TIMEOUT][33] ([i915#8628])
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-1/igt@gem_exec_whisper@basic-contexts-forked-all.html
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-5/igt@gem_exec_whisper@basic-contexts-forked-all.html

  * igt@gem_exec_whisper@basic-fds-priority-all:
    - shard-mtlp:         NOTRUN -> [FAIL][34] ([i915#6363])
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-1/igt@gem_exec_whisper@basic-fds-priority-all.html

  * igt@gem_fenced_exec_thrash@no-spare-fences:
    - shard-dg2:          NOTRUN -> [SKIP][35] ([i915#4860])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-12/igt@gem_fenced_exec_thrash@no-spare-fences.html

  * igt@gem_lmem_swapping@parallel-multi:
    - shard-rkl:          NOTRUN -> [SKIP][36] ([i915#4613]) +1 similar issue
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@gem_lmem_swapping@parallel-multi.html

  * igt@gem_lmem_swapping@parallel-random:
    - shard-mtlp:         NOTRUN -> [SKIP][37] ([i915#4613])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-5/igt@gem_lmem_swapping@parallel-random.html

  * igt@gem_media_fill@media-fill:
    - shard-dg2:          NOTRUN -> [SKIP][38] ([i915#8289])
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@gem_media_fill@media-fill.html

  * igt@gem_mmap_gtt@cpuset-medium-copy-xy:
    - shard-mtlp:         NOTRUN -> [SKIP][39] ([i915#4077]) +1 similar issue
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-8/igt@gem_mmap_gtt@cpuset-medium-copy-xy.html

  * igt@gem_mmap_gtt@medium-copy-xy:
    - shard-dg2:          NOTRUN -> [SKIP][40] ([i915#4077]) +8 similar issues
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-8/igt@gem_mmap_gtt@medium-copy-xy.html

  * igt@gem_mmap_wc@coherency:
    - shard-mtlp:         NOTRUN -> [SKIP][41] ([i915#4083])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-8/igt@gem_mmap_wc@coherency.html

  * igt@gem_mmap_wc@write-gtt-read-wc:
    - shard-dg2:          NOTRUN -> [SKIP][42] ([i915#4083]) +3 similar issues
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-7/igt@gem_mmap_wc@write-gtt-read-wc.html

  * igt@gem_partial_pwrite_pread@write-snoop:
    - shard-dg2:          NOTRUN -> [SKIP][43] ([i915#3282]) +1 similar issue
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-8/igt@gem_partial_pwrite_pread@write-snoop.html

  * igt@gem_partial_pwrite_pread@write-uncached:
    - shard-rkl:          NOTRUN -> [SKIP][44] ([i915#3282])
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@gem_partial_pwrite_pread@write-uncached.html

  * igt@gem_pxp@reject-modify-context-protection-off-1:
    - shard-dg2:          NOTRUN -> [SKIP][45] ([i915#4270]) +1 similar issue
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@gem_pxp@reject-modify-context-protection-off-1.html

  * igt@gem_pxp@reject-modify-context-protection-off-3:
    - shard-rkl:          NOTRUN -> [SKIP][46] ([i915#4270])
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@gem_pxp@reject-modify-context-protection-off-3.html

  * igt@gem_render_copy@mixed-tiled-to-yf-tiled-ccs:
    - shard-mtlp:         NOTRUN -> [SKIP][47] ([i915#8428])
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-6/igt@gem_render_copy@mixed-tiled-to-yf-tiled-ccs.html

  * igt@gem_render_copy@yf-tiled-ccs-to-y-tiled:
    - shard-dg2:          NOTRUN -> [SKIP][48] ([i915#5190]) +7 similar issues
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@gem_render_copy@yf-tiled-ccs-to-y-tiled.html

  * igt@gem_set_tiling_vs_pwrite:
    - shard-dg2:          NOTRUN -> [SKIP][49] ([i915#4079])
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-7/igt@gem_set_tiling_vs_pwrite.html

  * igt@gem_softpin@evict-snoop:
    - shard-rkl:          NOTRUN -> [SKIP][50] ([fdo#109312])
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-4/igt@gem_softpin@evict-snoop.html
    - shard-dg2:          NOTRUN -> [SKIP][51] ([i915#4885])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@gem_softpin@evict-snoop.html

  * igt@gem_userptr_blits@unsync-overlap:
    - shard-tglu:         NOTRUN -> [SKIP][52] ([i915#3297])
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-3/igt@gem_userptr_blits@unsync-overlap.html

  * igt@gen3_render_tiledy_blits:
    - shard-dg2:          NOTRUN -> [SKIP][53] ([fdo#109289]) +2 similar issues
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@gen3_render_tiledy_blits.html

  * igt@gen7_exec_parse@basic-offset:
    - shard-mtlp:         NOTRUN -> [SKIP][54] ([fdo#109289]) +1 similar issue
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-5/igt@gen7_exec_parse@basic-offset.html

  * igt@gen9_exec_parse@allowed-single:
    - shard-apl:          [PASS][55] -> [ABORT][56] ([i915#5566])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-apl3/igt@gen9_exec_parse@allowed-single.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-apl7/igt@gen9_exec_parse@allowed-single.html

  * igt@gen9_exec_parse@bb-chained:
    - shard-mtlp:         NOTRUN -> [SKIP][57] ([i915#2856])
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-8/igt@gen9_exec_parse@bb-chained.html

  * igt@gen9_exec_parse@bb-start-out:
    - shard-dg2:          NOTRUN -> [SKIP][58] ([i915#2856]) +1 similar issue
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@gen9_exec_parse@bb-start-out.html
    - shard-rkl:          NOTRUN -> [SKIP][59] ([i915#2527])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@gen9_exec_parse@bb-start-out.html

  * igt@i915_hangman@engine-engine-error@vcs0:
    - shard-mtlp:         [PASS][60] -> [FAIL][61] ([i915#7069]) +2 similar issues
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-2/igt@i915_hangman@engine-engine-error@vcs0.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-5/igt@i915_hangman@engine-engine-error@vcs0.html

  * igt@i915_module_load@load:
    - shard-dg2:          NOTRUN -> [SKIP][62] ([i915#6227])
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-8/igt@i915_module_load@load.html
    - shard-rkl:          NOTRUN -> [SKIP][63] ([i915#6227])
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@i915_module_load@load.html

  * igt@i915_module_load@reload-with-fault-injection:
    - shard-dg2:          [PASS][64] -> [DMESG-WARN][65] ([i915#7061])
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-6/igt@i915_module_load@reload-with-fault-injection.html
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-3/igt@i915_module_load@reload-with-fault-injection.html

  * igt@i915_pm_dc@dc6-dpms:
    - shard-dg2:          NOTRUN -> [SKIP][66] ([i915#5978])
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-8/igt@i915_pm_dc@dc6-dpms.html
    - shard-rkl:          NOTRUN -> [SKIP][67] ([i915#3361])
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-4/igt@i915_pm_dc@dc6-dpms.html

  * igt@i915_pm_dc@dc9-dpms:
    - shard-apl:          [PASS][68] -> [FAIL][69] ([i915#4275])
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-apl1/igt@i915_pm_dc@dc9-dpms.html
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-apl6/igt@i915_pm_dc@dc9-dpms.html

  * igt@i915_pm_rpm@dpms-mode-unset-non-lpsp:
    - shard-rkl:          [PASS][70] -> [SKIP][71] ([i915#1397]) +1 similar issue
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-rkl-4/igt@i915_pm_rpm@dpms-mode-unset-non-lpsp.html
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-7/igt@i915_pm_rpm@dpms-mode-unset-non-lpsp.html

  * igt@i915_pm_rpm@i2c:
    - shard-glk:          [PASS][72] -> [FAIL][73] ([i915#5466])
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-glk6/igt@i915_pm_rpm@i2c.html
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-glk9/igt@i915_pm_rpm@i2c.html

  * igt@i915_pm_rpm@modeset-lpsp-stress-no-wait:
    - shard-dg2:          [PASS][74] -> [SKIP][75] ([i915#1397]) +2 similar issues
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-12/igt@i915_pm_rpm@modeset-lpsp-stress-no-wait.html
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-7/igt@i915_pm_rpm@modeset-lpsp-stress-no-wait.html

  * igt@i915_pm_rpm@modeset-non-lpsp-stress:
    - shard-mtlp:         NOTRUN -> [SKIP][76] ([i915#1397])
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-1/igt@i915_pm_rpm@modeset-non-lpsp-stress.html

  * igt@i915_pm_rpm@modeset-pc8-residency-stress:
    - shard-mtlp:         NOTRUN -> [SKIP][77] ([fdo#109293])
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-8/igt@i915_pm_rpm@modeset-pc8-residency-stress.html

  * igt@i915_query@query-topology-coherent-slice-mask:
    - shard-dg2:          NOTRUN -> [SKIP][78] ([i915#6188])
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-12/igt@i915_query@query-topology-coherent-slice-mask.html

  * igt@i915_selftest@live@gem_contexts:
    - shard-rkl:          [PASS][79] -> [ABORT][80] ([i915#7461])
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-rkl-6/igt@i915_selftest@live@gem_contexts.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-6/igt@i915_selftest@live@gem_contexts.html

  * igt@i915_selftest@live@slpc:
    - shard-mtlp:         NOTRUN -> [DMESG-WARN][81] ([i915#6367])
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-1/igt@i915_selftest@live@slpc.html

  * igt@kms_addfb_basic@framebuffer-vs-set-tiling:
    - shard-dg2:          NOTRUN -> [SKIP][82] ([i915#4212])
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-5/igt@kms_addfb_basic@framebuffer-vs-set-tiling.html

  * igt@kms_async_flips@alternate-sync-async-flip@pipe-b-edp-1:
    - shard-mtlp:         [PASS][83] -> [FAIL][84] ([i915#2521])
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-8/igt@kms_async_flips@alternate-sync-async-flip@pipe-b-edp-1.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-3/igt@kms_async_flips@alternate-sync-async-flip@pipe-b-edp-1.html

  * igt@kms_big_fb@4-tiled-32bpp-rotate-270:
    - shard-rkl:          NOTRUN -> [SKIP][85] ([i915#5286]) +3 similar issues
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@kms_big_fb@4-tiled-32bpp-rotate-270.html

  * igt@kms_big_fb@4-tiled-64bpp-rotate-90:
    - shard-tglu:         NOTRUN -> [SKIP][86] ([fdo#111615] / [i915#5286]) +1 similar issue
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-10/igt@kms_big_fb@4-tiled-64bpp-rotate-90.html

  * igt@kms_big_fb@4-tiled-max-hw-stride-32bpp-rotate-0-async-flip:
    - shard-mtlp:         [PASS][87] -> [FAIL][88] ([i915#3743]) +3 similar issues
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-1/igt@kms_big_fb@4-tiled-max-hw-stride-32bpp-rotate-0-async-flip.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-8/igt@kms_big_fb@4-tiled-max-hw-stride-32bpp-rotate-0-async-flip.html

  * igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0-hflip:
    - shard-mtlp:         [PASS][89] -> [FAIL][90] ([i915#5138])
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-7/igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0-hflip.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-2/igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0-hflip.html

  * igt@kms_big_fb@linear-16bpp-rotate-90:
    - shard-mtlp:         NOTRUN -> [SKIP][91] ([fdo#111614])
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-2/igt@kms_big_fb@linear-16bpp-rotate-90.html

  * igt@kms_big_fb@x-tiled-16bpp-rotate-0:
    - shard-apl:          [PASS][92] -> [INCOMPLETE][93] ([i915#2295])
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-apl1/igt@kms_big_fb@x-tiled-16bpp-rotate-0.html
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-apl3/igt@kms_big_fb@x-tiled-16bpp-rotate-0.html

  * igt@kms_big_fb@x-tiled-16bpp-rotate-90:
    - shard-dg2:          NOTRUN -> [SKIP][94] ([fdo#111614]) +3 similar issues
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-12/igt@kms_big_fb@x-tiled-16bpp-rotate-90.html

  * igt@kms_big_fb@y-tiled-8bpp-rotate-90:
    - shard-rkl:          NOTRUN -> [SKIP][95] ([fdo#111614] / [i915#3638]) +1 similar issue
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-4/igt@kms_big_fb@y-tiled-8bpp-rotate-90.html

  * igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-0-async-flip:
    - shard-mtlp:         NOTRUN -> [SKIP][96] ([fdo#111615])
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-7/igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-0-async-flip.html

  * igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-180:
    - shard-rkl:          NOTRUN -> [SKIP][97] ([fdo#110723]) +3 similar issues
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-2/igt@kms_big_fb@yf-tiled-max-hw-stride-32bpp-rotate-180.html

  * igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-180-hflip:
    - shard-dg2:          NOTRUN -> [SKIP][98] ([i915#4538] / [i915#5190]) +4 similar issues
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-180-hflip.html

  * igt@kms_ccs@pipe-a-bad-aux-stride-y_tiled_ccs:
    - shard-tglu:         NOTRUN -> [SKIP][99] ([i915#3689] / [i915#5354] / [i915#6095]) +4 similar issues
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-4/igt@kms_ccs@pipe-a-bad-aux-stride-y_tiled_ccs.html

  * igt@kms_ccs@pipe-b-bad-aux-stride-y_tiled_gen12_mc_ccs:
    - shard-tglu:         NOTRUN -> [SKIP][100] ([i915#3689] / [i915#3886] / [i915#5354] / [i915#6095]) +1 similar issue
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-10/igt@kms_ccs@pipe-b-bad-aux-stride-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-b-bad-pixel-format-4_tiled_mtl_rc_ccs:
    - shard-rkl:          NOTRUN -> [SKIP][101] ([i915#5354] / [i915#6095]) +3 similar issues
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-2/igt@kms_ccs@pipe-b-bad-pixel-format-4_tiled_mtl_rc_ccs.html

  * igt@kms_ccs@pipe-b-crc-primary-basic-yf_tiled_ccs:
    - shard-rkl:          NOTRUN -> [SKIP][102] ([i915#3734] / [i915#5354] / [i915#6095]) +2 similar issues
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-6/igt@kms_ccs@pipe-b-crc-primary-basic-yf_tiled_ccs.html

  * igt@kms_ccs@pipe-b-crc-sprite-planes-basic-y_tiled_gen12_mc_ccs:
    - shard-mtlp:         NOTRUN -> [SKIP][103] ([i915#3886] / [i915#6095]) +1 similar issue
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-1/igt@kms_ccs@pipe-b-crc-sprite-planes-basic-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-c-random-ccs-data-y_tiled_gen12_rc_ccs_cc:
    - shard-dg2:          NOTRUN -> [SKIP][104] ([i915#3689] / [i915#3886] / [i915#5354]) +2 similar issues
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-8/igt@kms_ccs@pipe-c-random-ccs-data-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-d-bad-aux-stride-y_tiled_gen12_rc_ccs_cc:
    - shard-rkl:          NOTRUN -> [SKIP][105] ([i915#5354]) +9 similar issues
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-2/igt@kms_ccs@pipe-d-bad-aux-stride-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-d-crc-sprite-planes-basic-4_tiled_mtl_mc_ccs:
    - shard-tglu:         NOTRUN -> [SKIP][106] ([i915#5354] / [i915#6095]) +1 similar issue
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-7/igt@kms_ccs@pipe-d-crc-sprite-planes-basic-4_tiled_mtl_mc_ccs.html

  * igt@kms_ccs@pipe-d-missing-ccs-buffer-y_tiled_gen12_mc_ccs:
    - shard-dg2:          NOTRUN -> [SKIP][107] ([i915#3689] / [i915#5354]) +10 similar issues
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-12/igt@kms_ccs@pipe-d-missing-ccs-buffer-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-d-random-ccs-data-y_tiled_gen12_mc_ccs:
    - shard-mtlp:         NOTRUN -> [SKIP][108] ([i915#6095]) +6 similar issues
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-2/igt@kms_ccs@pipe-d-random-ccs-data-y_tiled_gen12_mc_ccs.html

  * igt@kms_ccs@pipe-d-random-ccs-data-yf_tiled_ccs:
    - shard-tglu:         NOTRUN -> [SKIP][109] ([fdo#111615] / [i915#3689] / [i915#5354] / [i915#6095]) +2 similar issues
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-5/igt@kms_ccs@pipe-d-random-ccs-data-yf_tiled_ccs.html

  * igt@kms_cdclk@mode-transition-all-outputs:
    - shard-rkl:          NOTRUN -> [SKIP][110] ([i915#3742])
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-4/igt@kms_cdclk@mode-transition-all-outputs.html

  * igt@kms_cdclk@mode-transition@pipe-d-hdmi-a-3:
    - shard-dg2:          NOTRUN -> [SKIP][111] ([i915#4087]) +8 similar issues
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-6/igt@kms_cdclk@mode-transition@pipe-d-hdmi-a-3.html

  * igt@kms_chamelium_color@ctm-limited-range:
    - shard-dg2:          NOTRUN -> [SKIP][112] ([fdo#111827])
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-8/igt@kms_chamelium_color@ctm-limited-range.html
    - shard-rkl:          NOTRUN -> [SKIP][113] ([fdo#111827])
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@kms_chamelium_color@ctm-limited-range.html

  * igt@kms_chamelium_edid@dp-mode-timings:
    - shard-dg2:          NOTRUN -> [SKIP][114] ([i915#7828]) +2 similar issues
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-6/igt@kms_chamelium_edid@dp-mode-timings.html

  * igt@kms_chamelium_frames@hdmi-crc-nonplanar-formats:
    - shard-mtlp:         NOTRUN -> [SKIP][115] ([i915#7828])
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-5/igt@kms_chamelium_frames@hdmi-crc-nonplanar-formats.html

  * igt@kms_chamelium_hpd@hdmi-hpd-for-each-pipe:
    - shard-tglu:         NOTRUN -> [SKIP][116] ([i915#7828]) +1 similar issue
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-7/igt@kms_chamelium_hpd@hdmi-hpd-for-each-pipe.html

  * igt@kms_content_protection@lic:
    - shard-dg2:          NOTRUN -> [SKIP][117] ([i915#7118])
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@kms_content_protection@lic.html
    - shard-rkl:          NOTRUN -> [SKIP][118] ([i915#7118])
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@kms_content_protection@lic.html

  * igt@kms_content_protection@mei_interface:
    - shard-tglu:         NOTRUN -> [SKIP][119] ([i915#6944] / [i915#7116] / [i915#7118])
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-10/igt@kms_content_protection@mei_interface.html

  * igt@kms_content_protection@srm:
    - shard-mtlp:         NOTRUN -> [SKIP][120] ([i915#3555] / [i915#6944])
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-7/igt@kms_content_protection@srm.html

  * igt@kms_content_protection@uevent@pipe-a-dp-2:
    - shard-dg2:          NOTRUN -> [FAIL][121] ([i915#1339])
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-12/igt@kms_content_protection@uevent@pipe-a-dp-2.html

  * igt@kms_cursor_crc@cursor-rapid-movement-32x32:
    - shard-tglu:         NOTRUN -> [SKIP][122] ([i915#3555]) +1 similar issue
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-4/igt@kms_cursor_crc@cursor-rapid-movement-32x32.html

  * igt@kms_cursor_legacy@cursor-vs-flip-toggle:
    - shard-mtlp:         NOTRUN -> [FAIL][123] ([i915#8248])
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-1/igt@kms_cursor_legacy@cursor-vs-flip-toggle.html

  * igt@kms_cursor_legacy@cursorb-vs-flipa-atomic-transitions-varying-size:
    - shard-mtlp:         NOTRUN -> [SKIP][124] ([i915#3546])
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-7/igt@kms_cursor_legacy@cursorb-vs-flipa-atomic-transitions-varying-size.html

  * igt@kms_cursor_legacy@short-busy-flip-before-cursor-toggle:
    - shard-dg2:          NOTRUN -> [SKIP][125] ([i915#4103] / [i915#4213])
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-12/igt@kms_cursor_legacy@short-busy-flip-before-cursor-toggle.html
    - shard-rkl:          NOTRUN -> [SKIP][126] ([i915#4103])
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-2/igt@kms_cursor_legacy@short-busy-flip-before-cursor-toggle.html

  * igt@kms_dsc@dsc-basic:
    - shard-dg2:          NOTRUN -> [SKIP][127] ([i915#3555] / [i915#3840])
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-3/igt@kms_dsc@dsc-basic.html
    - shard-rkl:          NOTRUN -> [SKIP][128] ([i915#3555] / [i915#3840])
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-6/igt@kms_dsc@dsc-basic.html

  * igt@kms_dsc@dsc-with-formats:
    - shard-mtlp:         NOTRUN -> [SKIP][129] ([i915#3555])
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-7/igt@kms_dsc@dsc-with-formats.html

  * igt@kms_flip@2x-flip-vs-expired-vblank-interruptible:
    - shard-mtlp:         NOTRUN -> [SKIP][130] ([fdo#111767] / [i915#3637])
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-2/igt@kms_flip@2x-flip-vs-expired-vblank-interruptible.html

  * igt@kms_flip@2x-flip-vs-fences:
    - shard-mtlp:         NOTRUN -> [SKIP][131] ([i915#8381])
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-4/igt@kms_flip@2x-flip-vs-fences.html

  * igt@kms_flip@2x-flip-vs-modeset-vs-hang:
    - shard-tglu:         NOTRUN -> [SKIP][132] ([fdo#109274] / [i915#3637])
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-8/igt@kms_flip@2x-flip-vs-modeset-vs-hang.html

  * igt@kms_flip@2x-flip-vs-suspend:
    - shard-mtlp:         NOTRUN -> [SKIP][133] ([i915#3637]) +1 similar issue
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-6/igt@kms_flip@2x-flip-vs-suspend.html

  * igt@kms_flip@2x-modeset-vs-vblank-race:
    - shard-dg2:          NOTRUN -> [SKIP][134] ([fdo#109274]) +2 similar issues
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@kms_flip@2x-modeset-vs-vblank-race.html
    - shard-rkl:          NOTRUN -> [SKIP][135] ([fdo#111825]) +1 similar issue
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@kms_flip@2x-modeset-vs-vblank-race.html

  * igt@kms_flip@flip-vs-expired-vblank@c-hdmi-a1:
    - shard-glk:          [PASS][136] -> [FAIL][137] ([i915#79])
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-glk4/igt@kms_flip@flip-vs-expired-vblank@c-hdmi-a1.html
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-glk3/igt@kms_flip@flip-vs-expired-vblank@c-hdmi-a1.html

  * igt@kms_flip_scaled_crc@flip-32bpp-yftile-to-64bpp-yftile-upscaling@pipe-a-valid-mode:
    - shard-dg2:          NOTRUN -> [SKIP][138] ([i915#2672]) +4 similar issues
   [138]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-7/igt@kms_flip_scaled_crc@flip-32bpp-yftile-to-64bpp-yftile-upscaling@pipe-a-valid-mode.html
    - shard-rkl:          NOTRUN -> [SKIP][139] ([i915#2672]) +1 similar issue
   [139]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-6/igt@kms_flip_scaled_crc@flip-32bpp-yftile-to-64bpp-yftile-upscaling@pipe-a-valid-mode.html

  * igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytileccs-upscaling@pipe-a-default-mode:
    - shard-mtlp:         NOTRUN -> [SKIP][140] ([i915#2672])
   [140]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-4/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytileccs-upscaling@pipe-a-default-mode.html

  * igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs-downscaling@pipe-a-default-mode:
    - shard-mtlp:         NOTRUN -> [SKIP][141] ([i915#2672] / [i915#3555])
   [141]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-7/igt@kms_flip_scaled_crc@flip-32bpp-ytile-to-32bpp-ytilegen12rcccs-downscaling@pipe-a-default-mode.html

  * igt@kms_flip_scaled_crc@flip-64bpp-4tile-to-16bpp-4tile-upscaling@pipe-a-valid-mode:
    - shard-tglu:         NOTRUN -> [SKIP][142] ([i915#2587] / [i915#2672])
   [142]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-6/igt@kms_flip_scaled_crc@flip-64bpp-4tile-to-16bpp-4tile-upscaling@pipe-a-valid-mode.html

  * igt@kms_force_connector_basic@force-load-detect:
    - shard-mtlp:         NOTRUN -> [SKIP][143] ([fdo#109285])
   [143]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-3/igt@kms_force_connector_basic@force-load-detect.html

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-shrfb-draw-mmap-gtt:
    - shard-dg2:          NOTRUN -> [SKIP][144] ([i915#8708]) +6 similar issues
   [144]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-6/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-shrfb-draw-mmap-gtt.html
    - shard-rkl:          NOTRUN -> [SKIP][145] ([fdo#111825] / [i915#1825]) +12 similar issues
   [145]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-7/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-shrfb-draw-mmap-gtt.html

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-shrfb-draw-pwrite:
    - shard-mtlp:         NOTRUN -> [SKIP][146] ([i915#1825]) +8 similar issues
   [146]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-4/igt@kms_frontbuffer_tracking@fbc-2p-primscrn-pri-shrfb-draw-pwrite.html

  * igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite:
    - shard-dg2:          [PASS][147] -> [FAIL][148] ([i915#6880]) +1 similar issue
   [147]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-5/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite.html
   [148]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-6/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-pwrite.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-rte:
    - shard-rkl:          NOTRUN -> [SKIP][149] ([i915#3023]) +6 similar issues
   [149]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-7/igt@kms_frontbuffer_tracking@fbcpsr-1p-rte.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-cur-indfb-move:
    - shard-dg2:          NOTRUN -> [SKIP][150] ([i915#5354]) +25 similar issues
   [150]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-cur-indfb-move.html

  * igt@kms_frontbuffer_tracking@fbcpsr-rgb565-draw-blt:
    - shard-dg2:          NOTRUN -> [SKIP][151] ([i915#3458]) +11 similar issues
   [151]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@kms_frontbuffer_tracking@fbcpsr-rgb565-draw-blt.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-indfb-draw-mmap-cpu:
    - shard-tglu:         NOTRUN -> [SKIP][152] ([fdo#110189]) +4 similar issues
   [152]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-8/igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-indfb-draw-mmap-cpu.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-indfb-draw-mmap-gtt:
    - shard-mtlp:         NOTRUN -> [SKIP][153] ([i915#8708]) +4 similar issues
   [153]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-2/igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-indfb-draw-mmap-gtt.html

  * igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-shrfb-draw-render:
    - shard-tglu:         NOTRUN -> [SKIP][154] ([fdo#109280]) +7 similar issues
   [154]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-6/igt@kms_frontbuffer_tracking@psr-2p-scndscrn-pri-shrfb-draw-render.html

  * igt@kms_hdr@static-toggle-suspend:
    - shard-dg2:          NOTRUN -> [SKIP][155] ([i915#3555]) +6 similar issues
   [155]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-8/igt@kms_hdr@static-toggle-suspend.html

  * igt@kms_pipe_b_c_ivb@from-pipe-c-to-b-with-3-lanes:
    - shard-tglu:         NOTRUN -> [SKIP][156] ([fdo#109289]) +1 similar issue
   [156]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-5/igt@kms_pipe_b_c_ivb@from-pipe-c-to-b-with-3-lanes.html

  * igt@kms_plane_scaling@intel-max-src-size@pipe-a-dp-2:
    - shard-dg2:          NOTRUN -> [FAIL][157] ([i915#8292])
   [157]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-12/igt@kms_plane_scaling@intel-max-src-size@pipe-a-dp-2.html

  * igt@kms_plane_scaling@plane-downscale-with-modifiers-factor-0-25@pipe-d-dp-4:
    - shard-dg2:          NOTRUN -> [SKIP][158] ([i915#5176]) +7 similar issues
   [158]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@kms_plane_scaling@plane-downscale-with-modifiers-factor-0-25@pipe-d-dp-4.html

  * igt@kms_plane_scaling@plane-upscale-with-rotation-factor-0-25@pipe-a-hdmi-a-2:
    - shard-rkl:          NOTRUN -> [SKIP][159] ([i915#5176]) +5 similar issues
   [159]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-2/igt@kms_plane_scaling@plane-upscale-with-rotation-factor-0-25@pipe-a-hdmi-a-2.html

  * igt@kms_plane_scaling@planes-downscale-factor-0-5-unity-scaling@pipe-b-vga-1:
    - shard-snb:          NOTRUN -> [SKIP][160] ([fdo#109271]) +26 similar issues
   [160]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-snb2/igt@kms_plane_scaling@planes-downscale-factor-0-5-unity-scaling@pipe-b-vga-1.html

  * igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-25@pipe-b-dp-4:
    - shard-dg2:          NOTRUN -> [SKIP][161] ([i915#5235]) +11 similar issues
   [161]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-25@pipe-b-dp-4.html

  * igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-25@pipe-b-hdmi-a-2:
    - shard-rkl:          NOTRUN -> [SKIP][162] ([i915#5235]) +7 similar issues
   [162]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-2/igt@kms_plane_scaling@planes-unity-scaling-downscale-factor-0-25@pipe-b-hdmi-a-2.html

  * igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-25@pipe-b-edp-1:
    - shard-mtlp:         NOTRUN -> [SKIP][163] ([i915#5235]) +3 similar issues
   [163]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-4/igt@kms_plane_scaling@planes-upscale-factor-0-25-downscale-factor-0-25@pipe-b-edp-1.html

  * igt@kms_psr2_su@page_flip-nv12:
    - shard-dg2:          NOTRUN -> [SKIP][164] ([i915#658]) +1 similar issue
   [164]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-3/igt@kms_psr2_su@page_flip-nv12.html

  * igt@kms_psr2_su@page_flip-p010:
    - shard-rkl:          NOTRUN -> [SKIP][165] ([fdo#111068] / [i915#658])
   [165]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@kms_psr2_su@page_flip-p010.html

  * igt@kms_psr2_su@page_flip-xrgb8888:
    - shard-tglu:         NOTRUN -> [SKIP][166] ([fdo#109642] / [fdo#111068] / [i915#658])
   [166]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-10/igt@kms_psr2_su@page_flip-xrgb8888.html

  * igt@kms_psr@primary_render:
    - shard-rkl:          NOTRUN -> [SKIP][167] ([i915#1072]) +1 similar issue
   [167]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-2/igt@kms_psr@primary_render.html
    - shard-dg2:          NOTRUN -> [SKIP][168] ([i915#1072]) +1 similar issue
   [168]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-12/igt@kms_psr@primary_render.html

  * igt@kms_psr_stress_test@flip-primary-invalidate-overlay:
    - shard-dg2:          NOTRUN -> [SKIP][169] ([i915#5461] / [i915#658])
   [169]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-7/igt@kms_psr_stress_test@flip-primary-invalidate-overlay.html

  * igt@kms_rotation_crc@bad-tiling:
    - shard-dg2:          NOTRUN -> [SKIP][170] ([i915#4235])
   [170]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-3/igt@kms_rotation_crc@bad-tiling.html

  * igt@kms_rotation_crc@primary-y-tiled-reflect-x-270:
    - shard-dg2:          NOTRUN -> [SKIP][171] ([i915#4235] / [i915#5190])
   [171]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@kms_rotation_crc@primary-y-tiled-reflect-x-270.html

  * igt@kms_rotation_crc@primary-yf-tiled-reflect-x-0:
    - shard-tglu:         NOTRUN -> [SKIP][172] ([fdo#111615] / [i915#5289])
   [172]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-4/igt@kms_rotation_crc@primary-yf-tiled-reflect-x-0.html

  * igt@kms_rotation_crc@primary-yf-tiled-reflect-x-180:
    - shard-rkl:          NOTRUN -> [SKIP][173] ([fdo#111615] / [i915#5289])
   [173]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-2/igt@kms_rotation_crc@primary-yf-tiled-reflect-x-180.html

  * igt@kms_scaling_modes@scaling-mode-none:
    - shard-rkl:          NOTRUN -> [SKIP][174] ([i915#3555]) +3 similar issues
   [174]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@kms_scaling_modes@scaling-mode-none.html

  * igt@kms_setmode@invalid-clone-single-crtc:
    - shard-rkl:          NOTRUN -> [SKIP][175] ([i915#3555] / [i915#4098])
   [175]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-6/igt@kms_setmode@invalid-clone-single-crtc.html

  * igt@kms_sysfs_edid_timing:
    - shard-dg2:          [PASS][176] -> [FAIL][177] ([IGT#2])
   [176]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-12/igt@kms_sysfs_edid_timing.html
   [177]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-8/igt@kms_sysfs_edid_timing.html

  * igt@kms_vblank@pipe-c-query-busy:
    - shard-rkl:          NOTRUN -> [SKIP][178] ([i915#4070] / [i915#6768])
   [178]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-7/igt@kms_vblank@pipe-c-query-busy.html

  * igt@kms_vblank@pipe-d-ts-continuation-dpms-suspend:
    - shard-rkl:          NOTRUN -> [SKIP][179] ([i915#4070] / [i915#533] / [i915#6768]) +1 similar issue
   [179]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-6/igt@kms_vblank@pipe-d-ts-continuation-dpms-suspend.html

  * igt@kms_vblank@pipe-d-ts-continuation-suspend:
    - shard-dg2:          [PASS][180] -> [FAIL][181] ([fdo#103375] / [i915#6121])
   [180]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-12/igt@kms_vblank@pipe-d-ts-continuation-suspend.html
   [181]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-5/igt@kms_vblank@pipe-d-ts-continuation-suspend.html

  * igt@kms_vblank@pipe-d-wait-busy:
    - shard-glk:          NOTRUN -> [SKIP][182] ([fdo#109271]) +17 similar issues
   [182]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-glk9/igt@kms_vblank@pipe-d-wait-busy.html

  * igt@kms_writeback@writeback-invalid-parameters:
    - shard-rkl:          NOTRUN -> [SKIP][183] ([i915#2437])
   [183]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-2/igt@kms_writeback@writeback-invalid-parameters.html

  * igt@kms_writeback@writeback-pixel-formats:
    - shard-dg2:          NOTRUN -> [SKIP][184] ([i915#2437]) +1 similar issue
   [184]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-8/igt@kms_writeback@writeback-pixel-formats.html

  * igt@perf@gen8-unprivileged-single-ctx-counters:
    - shard-dg2:          NOTRUN -> [SKIP][185] ([i915#2436])
   [185]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@perf@gen8-unprivileged-single-ctx-counters.html
    - shard-rkl:          NOTRUN -> [SKIP][186] ([i915#2436])
   [186]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-7/igt@perf@gen8-unprivileged-single-ctx-counters.html

  * igt@perf_pmu@busy-double-start@rcs0:
    - shard-dg2:          [PASS][187] -> [FAIL][188] ([i915#4349])
   [187]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-8/igt@perf_pmu@busy-double-start@rcs0.html
   [188]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-5/igt@perf_pmu@busy-double-start@rcs0.html

  * igt@perf_pmu@faulting-read@gtt:
    - shard-mtlp:         NOTRUN -> [SKIP][189] ([i915#8440])
   [189]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-2/igt@perf_pmu@faulting-read@gtt.html

  * igt@perf_pmu@rc6@other-idle-gt0:
    - shard-dg2:          NOTRUN -> [SKIP][190] ([i915#8516])
   [190]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@perf_pmu@rc6@other-idle-gt0.html
    - shard-rkl:          NOTRUN -> [SKIP][191] ([i915#8516])
   [191]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@perf_pmu@rc6@other-idle-gt0.html

  * igt@prime_vgem@basic-write:
    - shard-dg2:          NOTRUN -> [SKIP][192] ([i915#3291] / [i915#3708])
   [192]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-12/igt@prime_vgem@basic-write.html

  * igt@sysfs_heartbeat_interval@mixed@ccs0:
    - shard-mtlp:         [PASS][193] -> [ABORT][194] ([i915#8552])
   [193]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-3/igt@sysfs_heartbeat_interval@mixed@ccs0.html
   [194]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-7/igt@sysfs_heartbeat_interval@mixed@ccs0.html

  * igt@sysfs_heartbeat_interval@mixed@vecs0:
    - shard-mtlp:         [PASS][195] -> [FAIL][196] ([i915#1731])
   [195]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-3/igt@sysfs_heartbeat_interval@mixed@vecs0.html
   [196]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-7/igt@sysfs_heartbeat_interval@mixed@vecs0.html

  * igt@v3d/v3d_get_param@base-params:
    - shard-dg2:          NOTRUN -> [SKIP][197] ([i915#2575]) +6 similar issues
   [197]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-6/igt@v3d/v3d_get_param@base-params.html

  * igt@v3d/v3d_perfmon@create-perfmon-invalid-counters:
    - shard-tglu:         NOTRUN -> [SKIP][198] ([fdo#109315] / [i915#2575]) +1 similar issue
   [198]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-5/igt@v3d/v3d_perfmon@create-perfmon-invalid-counters.html

  * igt@v3d/v3d_submit_csd@bad-multisync-pad:
    - shard-rkl:          NOTRUN -> [SKIP][199] ([fdo#109315]) +4 similar issues
   [199]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-4/igt@v3d/v3d_submit_csd@bad-multisync-pad.html

  * igt@v3d/v3d_submit_csd@single-out-sync:
    - shard-mtlp:         NOTRUN -> [SKIP][200] ([i915#2575]) +5 similar issues
   [200]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-1/igt@v3d/v3d_submit_csd@single-out-sync.html

  * igt@vc4/vc4_perfmon@create-two-perfmon:
    - shard-tglu:         NOTRUN -> [SKIP][201] ([i915#2575])
   [201]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-8/igt@vc4/vc4_perfmon@create-two-perfmon.html

  * igt@vc4/vc4_purgeable_bo@free-purged-bo:
    - shard-mtlp:         NOTRUN -> [SKIP][202] ([i915#7711]) +1 similar issue
   [202]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-3/igt@vc4/vc4_purgeable_bo@free-purged-bo.html

  * igt@vc4/vc4_tiling@get-bad-handle:
    - shard-dg2:          NOTRUN -> [SKIP][203] ([i915#7711]) +5 similar issues
   [203]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-8/igt@vc4/vc4_tiling@get-bad-handle.html
    - shard-rkl:          NOTRUN -> [SKIP][204] ([i915#7711]) +2 similar issues
   [204]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-4/igt@vc4/vc4_tiling@get-bad-handle.html

  
#### Possible fixes ####

  * igt@drm_fdinfo@most-busy-check-all@rcs0:
    - shard-rkl:          [FAIL][205] ([i915#7742]) -> [PASS][206]
   [205]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-rkl-6/igt@drm_fdinfo@most-busy-check-all@rcs0.html
   [206]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-4/igt@drm_fdinfo@most-busy-check-all@rcs0.html

  * igt@gem_eio@kms:
    - shard-glk:          [FAIL][207] ([i915#8764]) -> [PASS][208]
   [207]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-glk8/igt@gem_eio@kms.html
   [208]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-glk8/igt@gem_eio@kms.html

  * igt@gem_exec_schedule@deep@vcs0:
    - shard-mtlp:         [FAIL][209] ([i915#8545]) -> [PASS][210]
   [209]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-4/igt@gem_exec_schedule@deep@vcs0.html
   [210]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-2/igt@gem_exec_schedule@deep@vcs0.html

  * igt@gem_exec_whisper@basic-normal:
    - shard-mtlp:         [FAIL][211] ([i915#6363]) -> [PASS][212] +2 similar issues
   [211]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-6/igt@gem_exec_whisper@basic-normal.html
   [212]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-4/igt@gem_exec_whisper@basic-normal.html

  * igt@gem_ppgtt@blt-vs-render-ctx0:
    - shard-snb:          [DMESG-FAIL][213] ([i915#8295]) -> [PASS][214]
   [213]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-snb1/igt@gem_ppgtt@blt-vs-render-ctx0.html
   [214]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-snb7/igt@gem_ppgtt@blt-vs-render-ctx0.html

  * igt@i915_pm_rc6_residency@rc6-accuracy:
    - shard-mtlp:         [SKIP][215] ([fdo#109289] / [i915#8403]) -> [PASS][216]
   [215]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-2/igt@i915_pm_rc6_residency@rc6-accuracy.html
   [216]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-1/igt@i915_pm_rc6_residency@rc6-accuracy.html

  * igt@i915_pm_rpm@dpms-mode-unset-non-lpsp:
    - {shard-dg1}:        [SKIP][217] ([i915#1397]) -> [PASS][218] +1 similar issue
   [217]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg1-19/igt@i915_pm_rpm@dpms-mode-unset-non-lpsp.html
   [218]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg1-16/igt@i915_pm_rpm@dpms-mode-unset-non-lpsp.html

  * igt@i915_pm_rpm@drm-resources-equal:
    - shard-dg2:          [FAIL][219] -> [PASS][220]
   [219]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-11/igt@i915_pm_rpm@drm-resources-equal.html
   [220]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@i915_pm_rpm@drm-resources-equal.html

  * igt@i915_pm_rpm@modeset-non-lpsp-stress-no-wait:
    - shard-dg2:          [SKIP][221] ([i915#1397]) -> [PASS][222]
   [221]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-12/igt@i915_pm_rpm@modeset-non-lpsp-stress-no-wait.html
   [222]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-8/igt@i915_pm_rpm@modeset-non-lpsp-stress-no-wait.html

  * igt@i915_suspend@forcewake:
    - shard-dg2:          [FAIL][223] ([fdo#103375] / [i915#6121]) -> [PASS][224] +1 similar issue
   [223]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-5/igt@i915_suspend@forcewake.html
   [224]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@i915_suspend@forcewake.html

  * igt@kms_big_fb@4-tiled-64bpp-rotate-180:
    - shard-mtlp:         [FAIL][225] ([i915#5138]) -> [PASS][226]
   [225]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-7/igt@kms_big_fb@4-tiled-64bpp-rotate-180.html
   [226]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-8/igt@kms_big_fb@4-tiled-64bpp-rotate-180.html

  * igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0-async-flip:
    - shard-mtlp:         [FAIL][227] ([i915#3743]) -> [PASS][228]
   [227]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-7/igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0-async-flip.html
   [228]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-8/igt@kms_big_fb@4-tiled-max-hw-stride-64bpp-rotate-0-async-flip.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - shard-apl:          [FAIL][229] ([i915#2346]) -> [PASS][230]
   [229]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-apl3/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [230]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-apl6/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size:
    - shard-glk:          [FAIL][231] ([i915#2346]) -> [PASS][232]
   [231]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-glk2/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html
   [232]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-glk1/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html

  * igt@kms_flip@flip-vs-dpms-off-vs-modeset-interruptible@d-edp1:
    - shard-mtlp:         [INCOMPLETE][233] -> [PASS][234]
   [233]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-7/igt@kms_flip@flip-vs-dpms-off-vs-modeset-interruptible@d-edp1.html
   [234]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-5/igt@kms_flip@flip-vs-dpms-off-vs-modeset-interruptible@d-edp1.html

  * igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-render:
    - shard-dg2:          [FAIL][235] ([i915#6880]) -> [PASS][236]
   [235]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-1/igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-render.html
   [236]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-render.html

  * igt@perf@oa-exponents@0-rcs0:
    - shard-glk:          [ABORT][237] ([i915#5213] / [i915#7941]) -> [PASS][238]
   [237]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-glk9/igt@perf@oa-exponents@0-rcs0.html
   [238]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-glk2/igt@perf@oa-exponents@0-rcs0.html

  * igt@sysfs_heartbeat_interval@precise@vecs0:
    - shard-mtlp:         [FAIL][239] ([i915#8332]) -> [PASS][240]
   [239]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-1/igt@sysfs_heartbeat_interval@precise@vecs0.html
   [240]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-8/igt@sysfs_heartbeat_interval@precise@vecs0.html

  * igt@sysfs_preempt_timeout@timeout@vecs0:
    - shard-mtlp:         [ABORT][241] ([i915#8521]) -> [PASS][242]
   [241]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-6/igt@sysfs_preempt_timeout@timeout@vecs0.html
   [242]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-5/igt@sysfs_preempt_timeout@timeout@vecs0.html

  
#### Warnings ####

  * igt@gem_exec_whisper@basic-contexts-priority-all:
    - shard-mtlp:         [ABORT][243] ([i915#8131]) -> [TIMEOUT][244] ([i915#7392])
   [243]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-1/igt@gem_exec_whisper@basic-contexts-priority-all.html
   [244]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-6/igt@gem_exec_whisper@basic-contexts-priority-all.html

  * igt@gem_lmem_swapping@smem-oom@lmem0:
    - shard-dg2:          [TIMEOUT][245] ([i915#5493]) -> [DMESG-WARN][246] ([i915#4936] / [i915#5493])
   [245]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-3/igt@gem_lmem_swapping@smem-oom@lmem0.html
   [246]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-5/igt@gem_lmem_swapping@smem-oom@lmem0.html

  * igt@i915_pm_rc6_residency@media-rc6-accuracy:
    - shard-mtlp:         [SKIP][247] ([fdo#109289]) -> [SKIP][248] ([fdo#109289] / [i915#8403])
   [247]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-6/igt@i915_pm_rc6_residency@media-rc6-accuracy.html
   [248]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-4/igt@i915_pm_rc6_residency@media-rc6-accuracy.html

  * igt@i915_pm_rc6_residency@rc6-idle@bcs0:
    - shard-tglu:         [WARN][249] ([i915#2681]) -> [FAIL][250] ([i915#2681] / [i915#3591])
   [249]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-tglu-5/igt@i915_pm_rc6_residency@rc6-idle@bcs0.html
   [250]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-tglu-4/igt@i915_pm_rc6_residency@rc6-idle@bcs0.html

  * igt@kms_async_flips@crc@pipe-a-edp-1:
    - shard-mtlp:         [FAIL][251] ([i915#8247]) -> [DMESG-FAIL][252] ([i915#8561])
   [251]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-4/igt@kms_async_flips@crc@pipe-a-edp-1.html
   [252]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-6/igt@kms_async_flips@crc@pipe-a-edp-1.html

  * igt@kms_content_protection@content_type_change:
    - shard-dg2:          [SKIP][253] ([i915#3555] / [i915#7118]) -> [SKIP][254] ([i915#3555] / [i915#7118] / [i915#7162])
   [253]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-7/igt@kms_content_protection@content_type_change.html
   [254]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@kms_content_protection@content_type_change.html

  * igt@kms_content_protection@mei_interface:
    - shard-dg2:          [SKIP][255] ([i915#7118]) -> [SKIP][256] ([i915#7118] / [i915#7162])
   [255]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-1/igt@kms_content_protection@mei_interface.html
   [256]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-11/igt@kms_content_protection@mei_interface.html

  * igt@kms_content_protection@type1:
    - shard-dg2:          [SKIP][257] ([i915#3555] / [i915#7118] / [i915#7162]) -> [SKIP][258] ([i915#3555] / [i915#7118])
   [257]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-dg2-12/igt@kms_content_protection@type1.html
   [258]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-dg2-1/igt@kms_content_protection@type1.html

  * igt@kms_content_protection@uevent:
    - shard-mtlp:         [SKIP][259] ([i915#3555]) -> [SKIP][260] ([i915#3555] / [i915#6944]) +3 similar issues
   [259]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-3/igt@kms_content_protection@uevent.html
   [260]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-6/igt@kms_content_protection@uevent.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size:
    - shard-mtlp:         [FAIL][261] ([i915#2346]) -> [DMESG-FAIL][262] ([i915#2017] / [i915#5954])
   [261]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-mtlp-7/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html
   [262]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-mtlp-1/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html

  * igt@kms_fbcon_fbt@psr:
    - shard-rkl:          [SKIP][263] ([i915#3955]) -> [SKIP][264] ([fdo#110189] / [i915#3955])
   [263]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-rkl-4/igt@kms_fbcon_fbt@psr.html
   [264]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@kms_fbcon_fbt@psr.html

  * igt@kms_multipipe_modeset@basic-max-pipe-crc-check:
    - shard-rkl:          [SKIP][265] ([i915#4816]) -> [SKIP][266] ([i915#4070] / [i915#4816])
   [265]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7373/shard-rkl-7/igt@kms_multipipe_modeset@basic-max-pipe-crc-check.html
   [266]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/shard-rkl-1/igt@kms_multipipe_modeset@basic-max-pipe-crc-check.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [IGT#2]: https://gitlab.freedesktop.org/drm/igt-gpu-tools/issues/2
  [fdo#103375]: https://bugs.freedesktop.org/show_bug.cgi?id=103375
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109274]: https://bugs.freedesktop.org/show_bug.cgi?id=109274
  [fdo#109280]: https://bugs.freedesktop.org/show_bug.cgi?id=109280
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109289]: https://bugs.freedesktop.org/show_bug.cgi?id=109289
  [fdo#109293]: https://bugs.freedesktop.org/show_bug.cgi?id=109293
  [fdo#109303]: https://bugs.freedesktop.org/show_bug.cgi?id=109303
  [fdo#109312]: https://bugs.freedesktop.org/show_bug.cgi?id=109312
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#109642]: https://bugs.freedesktop.org/show_bug.cgi?id=109642
  [fdo#110189]: https://bugs.freedesktop.org/show_bug.cgi?id=110189
  [fdo#110723]: https://bugs.freedesktop.org/show_bug.cgi?id=110723
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [fdo#111614]: https://bugs.freedesktop.org/show_bug.cgi?id=111614
  [fdo#111615]: https://bugs.freedesktop.org/show_bug.cgi?id=111615
  [fdo#111767]: https://bugs.freedesktop.org/show_bug.cgi?id=111767
  [fdo#111825]: https://bugs.freedesktop.org/show_bug.cgi?id=111825
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1339]: https://gitlab.freedesktop.org/drm/intel/issues/1339
  [i915#1397]: https://gitlab.freedesktop.org/drm/intel/issues/1397
  [i915#1731]: https://gitlab.freedesktop.org/drm/intel/issues/1731
  [i915#1825]: https://gitlab.freedesktop.org/drm/intel/issues/1825
  [i915#1839]: https://gitlab.freedesktop.org/drm/intel/issues/1839
  [i915#2017]: https://gitlab.freedesktop.org/drm/intel/issues/2017
  [i915#2295]: https://gitlab.freedesktop.org/drm/intel/issues/2295
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2436]: https://gitlab.freedesktop.org/drm/intel/issues/2436
  [i915#2437]: https://gitlab.freedesktop.org/drm/intel/issues/2437
  [i915#2521]: https://gitlab.freedesktop.org/drm/intel/issues/2521
  [i915#2527]: https://gitlab.freedesktop.org/drm/intel/issues/2527
  [i915#2575]: https://gitlab.freedesktop.org/drm/intel/issues/2575
  [i915#2587]: https://gitlab.freedesktop.org/drm/intel/issues/2587
  [i915#2672]: https://gitlab.freedesktop.org/drm/intel/issues/2672
  [i915#2681]: https://gitlab.freedesktop.org/drm/intel/issues/2681
  [i915#280]: https://gitlab.freedesktop.org/drm/intel/issues/280
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2846]: https://gitlab.freedesktop.org/drm/intel/issues/2846
  [i915#2856]: https://gitlab.freedesktop.org/drm/intel/issues/2856
  [i915#3023]: https://gitlab.freedesktop.org/drm/intel/issues/3023
  [i915#3281]: https://gitlab.freedesktop.org/drm/intel/issues/3281
  [i915#3282]: https://gitlab.freedesktop.org/drm/intel/issues/3282
  [i915#3291]: https://gitlab.freedesktop.org/drm/intel/issues/3291
  [i915#3297]: https://gitlab.freedesktop.org/drm/intel/issues/3297
  [i915#3361]: https://gitlab.freedesktop.org/drm/intel/issues/3361
  [i915#3458]: https://gitlab.freedesktop.org/drm/intel/issues/3458
  [i915#3539]: https://gitlab.freedesktop.org/drm/intel/issues/3539
  [i915#3546]: https://gitlab.freedesktop.org/drm/intel/issues/3546
  [i915#3555]: https://gitlab.freedesktop.org/drm/intel/issues/3555
  [i915#3591]: https://gitlab.freedesktop.org/drm/intel/issues/3591
  [i915#3637]: https://gitlab.freedesktop.org/drm/intel/issues/3637
  [i915#3638]: https://gitlab.freedesktop.org/drm/intel/issues/3638
  [i915#3689]: https://gitlab.freedesktop.org/drm/intel/issues/3689
  [i915#3708]: https://gitlab.freedesktop.org/drm/intel/issues/3708
  [i915#3734]: https://gitlab.freedesktop.org/drm/intel/issues/3734
  [i915#3742]: https://gitlab.freedesktop.org/drm/intel/issues/3742
  [i915#3743]: https://gitlab.freedesktop.org/drm/intel/issues/3743
  [i915#3840]: https://gitlab.freedesktop.org/drm/intel/issues/3840
  [i915#3886]: https://gitlab.freedesktop.org/drm/intel/issues/3886
  [i915#3955]: https://gitlab.freedesktop.org/drm/intel/issues/3955
  [i915#4070]: https://gitlab.freedesktop.org/drm/intel/issues/4070
  [i915#4077]: https://gitlab.freedesktop.org/drm/intel/issues/4077
  [i915#4078]: https://gitlab.freedesktop.org/drm/intel/issues/4078
  [i915#4079]: https://gitlab.freedesktop.org/drm/intel/issues/4079
  [i915#4083]: https://gitlab.freedesktop.org/drm/intel/issues/4083
  [i915#4087]: https://gitlab.freedesktop.org/drm/intel/issues/4087
  [i915#4098]: https://gitlab.freedesktop.org/drm/intel/issues/4098
  [i915#4103]: https://gitlab.freedesktop.org/drm/intel/issues/4103
  [i915#4212]: https://gitlab.freedesktop.org/drm/intel/issues/4212
  [i915#4213]: https://gitlab.freedesktop.org/drm/intel/issues/4213
  [i915#4235]: https://gitlab.freedesktop.org/drm/intel/issues/4235
  [i915#4270]: https://gitlab.freedesktop.org/drm/intel/issues/4270
  [i915#4275]: https://gitlab.freedesktop.org/drm/intel/issues/4275
  [i915#4349]: https://gitlab.freedesktop.org/drm/intel/issues/4349
  [i915#4525]: https://gitlab.freedesktop.org/drm/intel/issues/4525
  [i915#4537]: https://gitlab.freedesktop.org/drm/intel/issues/4537
  [i915#4538]: https://gitlab.freedesktop.org/drm/intel/issues/4538
  [i915#4565]: https://gitlab.freedesktop.org/drm/intel/issues/4565
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4771]: https://gitlab.freedesktop.org/drm/intel/issues/4771
  [i915#4812]: https://gitlab.freedesktop.org/drm/intel/issues/4812
  [i915#4816]: https://gitlab.freedesktop.org/drm/intel/issues/4816
  [i915#4852]: https://gitlab.freedesktop.org/drm/intel/issues/4852
  [i915#4860]: https://gitlab.freedesktop.org/drm/intel/issues/4860
  [i915#4881]: https://gitlab.freedesktop.org/drm/intel/issues/4881
  [i915#4885]: https://gitlab.freedesktop.org/drm/intel/issues/4885
  [i915#4936]: https://gitlab.freedesktop.org/drm/intel/issues/4936
  [i915#5138]: https://gitlab.freedesktop.org/drm/intel/issues/5138
  [i915#5176]: https://gitlab.freedesktop.org/drm/intel/issues/5176
  [i915#5190]: https://gitlab.freedesktop.org/drm/intel/issues/5190
  [i915#5213]: https://gitlab.freedesktop.org/drm/intel/issues/5213
  [i915#5235]: https://gitlab.freedesktop.org/drm/intel/issues/5235
  [i915#5286]: https://gitlab.freedesktop.org/drm/intel/issues/5286
  [i915#5289]: https://gitlab.freedesktop.org/drm/intel/issues/5289
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#5354]: https://gitlab.freedesktop.org/drm/intel/issues/5354
  [i915#5461]: https://gitlab.freedesktop.org/drm/intel/issues/5461
  [i915#5466]: https://gitlab.freedesktop.org/drm/intel/issues/5466
  [i915#5493]: https://gitlab.freedesktop.org/drm/intel/issues/5493
  [i915#5566]: https://gitlab.freedesktop.org/drm/intel/issues/5566
  [i915#5954]: https://gitlab.freedesktop.org/drm/intel/issues/5954
  [i915#5978]: https://gitlab.freedesktop.org/drm/intel/issues/5978
  [i915#6095]: https://gitlab.freedesktop.org/drm/intel/issues/6095
  [i915#6121]: https://gitlab.freedesktop.org/drm/intel/issues/6121
  [i915#6122]: https://gitlab.freedesktop.org/drm/intel/issues/6122
  [i915#6188]: https://gitlab.freedesktop.org/drm/intel/issues/6188
  [i915#6227]: https://gitlab.freedesktop.org/drm/intel/issues/6227
  [i915#6230]: https://gitlab.freedesktop.org/drm/intel/issues/6230
  [i915#6344]: https://gitlab.freedesktop.org/drm/intel/issues/6344
  [i915#6363]: https://gitlab.freedesktop.org/drm/intel/issues/6363
  [i915#6367]: https://gitlab.freedesktop.org/drm/intel/issues/6367
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#6768]: https://gitlab.freedesktop.org/drm/intel/issues/6768
  [i915#6880]: https://gitlab.freedesktop.org/drm/intel/issues/6880
  [i915#6944]: https://gitlab.freedesktop.org/drm/intel/issues/6944
  [i915#7061]: https://gitlab.freedesktop.org/drm/intel/issues/7061
  [i915#7069]: https://gitlab.freedesktop.org/drm/intel/issues/7069
  [i915#7116]: https://gitlab.freedesktop.org/drm/intel/issues/7116
  [i915#7118]: https://gitlab.freedesktop.org/drm/intel/issues/7118
  [i915#7162]: https://gitlab.freedesktop.org/drm/intel/issues/7162
  [i915#7392]: https://gitlab.freedesktop.org/drm/intel/issues/7392
  [i915#7461]: https://gitlab.freedesktop.org/drm/intel/issues/7461
  [i915#7711]: https://gitlab.freedesktop.org/drm/intel/issues/7711
  [i915#7742]: https://gitlab.freedesktop.org/drm/intel/issues/7742
  [i915#7816]: https://gitlab.freedesktop.org/drm/intel/issues/7816
  [i915#7828]: https://gitlab.freedesktop.org/drm/intel/issues/7828
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
  [i915#7941]: https://gitlab.freedesktop.org/drm/intel/issues/7941
  [i915#7975]: https://gitlab.freedesktop.org/drm/intel/issues/7975
  [i915#8131]: https://gitlab.freedesktop.org/drm/intel/issues/8131
  [i915#8213]: https://gitlab.freedesktop.org/drm/intel/issues/8213
  [i915#8217]: https://gitlab.freedesktop.org/drm/intel/issues/8217
  [i915#8247]: https://gitlab.freedesktop.org/drm/intel/issues/8247
  [i915#8248]: https://gitlab.freedesktop.org/drm/intel/issues/8248
  [i915#8289]: https://gitlab.freedesktop.org/drm/intel/issues/8289
  [i915#8292]: https://gitlab.freedesktop.org/drm/intel/issues/8292
  [i915#8295]: https://gitlab.freedesktop.org/drm/intel/issues/8295
  [i915#8332]: https://gitlab.freedesktop.org/drm/intel/issues/8332
  [i915#8381]: https://gitlab.freedesktop.org/drm/intel/issues/8381
  [i915#8386]: https://gitlab.freedesktop.org/drm/intel/issues/8386
  [i915#8399]: https://gitlab.freedesktop.org/drm/intel/issues/8399
  [i915#8403]: https://gitlab.freedesktop.org/drm/intel/issues/8403
  [i915#8411]: https://gitlab.freedesktop.org/drm/intel/issues/8411
  [i915#8414]: https://gitlab.freedesktop.org/drm/intel/issues/8414
  [i915#8428]: https://gitlab.freedesktop.org/drm/intel/issues/8428
  [i915#8440]: https://gitlab.freedesktop.org/drm/intel/issues/8440
  [i915#8516]: https://gitlab.freedesktop.org/drm/intel/issues/8516
  [i915#8521]: https://gitlab.freedesktop.org/drm/intel/issues/8521
  [i915#8537]: https://gitlab.freedesktop.org/drm/intel/issues/8537
  [i915#8545]: https://gitlab.freedesktop.org/drm/intel/issues/8545
  [i915#8552]: https://gitlab.freedesktop.org/drm/intel/issues/8552
  [i915#8555]: https://gitlab.freedesktop.org/drm/intel/issues/8555
  [i915#8561]: https://gitlab.freedesktop.org/drm/intel/issues/8561
  [i915#8628]: https://gitlab.freedesktop.org/drm/intel/issues/8628
  [i915#8661]: https://gitlab.freedesktop.org/drm/intel/issues/8661
  [i915#8708]: https://gitlab.freedesktop.org/drm/intel/issues/8708
  [i915#8764]: https://gitlab.freedesktop.org/drm/intel/issues/8764


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_7373 -> IGTPW_9350

  CI-20190529: 20190529
  CI_DRM_13347: ee9b323b764f1f14ae4e6e8213164bd250160770 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_9350: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/index.html
  IGT_7373: 654347ccaa55cbfcd10b978cc6662ef6db25224d @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_9350/index.html

[-- Attachment #2: Type: text/html, Size: 83746 bytes --]

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 03/16] lib/intel_allocator: Remove extensive debugging
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 03/16] lib/intel_allocator: Remove extensive debugging Zbigniew Kempczyński
@ 2023-07-06  9:30   ` Karolina Stolarek
  0 siblings, 0 replies; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-06  9:30 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> Debugging the map key comparisons produces extensive logging, which in
> turn obscures analysis. Remove it to get clearer logs.

The logging you've introduced with this series is much clearer, thanks 
for doing this:

Reviewed-by: Karolina Stolarek <karolina.stolarek@intel.com>

> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>   lib/intel_allocator.c | 9 ---------
>   1 file changed, 9 deletions(-)
> 
> diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
> index c31576ecef..be24f8f2d0 100644
> --- a/lib/intel_allocator.c
> +++ b/lib/intel_allocator.c
> @@ -1385,9 +1385,6 @@ static int equal_handles(const void *key1, const void *key2)
>   {
>   	const struct handle_entry *h1 = key1, *h2 = key2;
>   
> -	alloc_debug("h1: %llx, h2: %llx\n",
> -		   (long long) h1->handle, (long long) h2->handle);
> -
>   	return h1->handle == h2->handle;
>   }
>   
> @@ -1395,9 +1392,6 @@ static int equal_ctx(const void *key1, const void *key2)
>   {
>   	const struct allocator *a1 = key1, *a2 = key2;
>   
> -	alloc_debug("a1: <fd: %d, ctx: %u>, a2 <fd: %d, ctx: %u>\n",
> -		   a1->fd, a1->ctx, a2->fd, a2->ctx);
> -
>   	return a1->fd == a2->fd && a1->ctx == a2->ctx;
>   }
>   
> @@ -1405,9 +1399,6 @@ static int equal_vm(const void *key1, const void *key2)
>   {
>   	const struct allocator *a1 = key1, *a2 = key2;
>   
> -	alloc_debug("a1: <fd: %d, vm: %u>, a2 <fd: %d, vm: %u>\n",
> -		   a1->fd, a1->vm, a2->fd, a2->vm);
> -
>   	return a1->fd == a2->fd && a1->vm == a2->vm;
>   }
>   

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 07/16] lib/xe_util: Return dynamic subtest name for Xe
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 07/16] lib/xe_util: Return dynamic subtest name for Xe Zbigniew Kempczyński
@ 2023-07-06  9:37   ` Karolina Stolarek
  0 siblings, 0 replies; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-06  9:37 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> For tests which work on more than one region, using a name suffix
> like "vram01-system" is a common thing. Instead of handcrafting this
> naming, add an xe_memregion_dynamic_subtest_name() function, which is
> similar to memregion_dynamic_subtest_name() for i915.
> 

Like I said in my previous review, I don't want to block the whole 
series because of this patch. Still, it would be good to refactor it 
into one memregion_dynamic_subtest_name().
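
To sketch what I mean: the driver-specific parts are only the region
queries, so the shared core is essentially the name-building loop below
(a stand-alone sketch; `struct region` and `memregion_subtest_name` are
hypothetical stand-ins for the real xe_mem_region()/xe_region_name()
lookups and the unified helper name):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the real region queries; the actual helper
 * would ask the driver via xe_mem_region()/xe_region_name() (or the
 * i915 equivalents). */
struct region {
	const char *name;
	int instance;
	int is_vram;
};

/* Build a "vram0-system"-style dynamic subtest name, mirroring the loop
 * in xe_memregion_dynamic_subtest_name(). */
static char *memregion_subtest_name(const struct region *regions, int count)
{
	/* enough for "name%d-" per region, as in the patch */
	size_t len = count * 8;
	char *name = malloc(len), *p = name;

	assert(name && count > 0);

	for (int i = 0; i < count; i++) {
		int r;

		if (regions[i].is_vram)
			r = snprintf(p, len, "%s%d-", regions[i].name,
				     regions[i].instance);
		else
			r = snprintf(p, len, "%s-", regions[i].name);

		assert(r > 0);
		p += r;
		len -= r;
	}

	*(p - 1) = '\0';	/* remove last '-' */

	return name;
}
```

A per-driver wrapper would then only have to translate its region
representation into that common form.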

checkpatch complains about the block comment of 
xe_memregion_dynamic_subtest_name() (i.e. each * should be aligned with 
the first one in /**).

Once you fix that:

Reviewed-by: Karolina Stolarek <karolina.stolarek@intel.com>

> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>   lib/meson.build  |   3 +-
>   lib/xe/xe_util.c | 104 +++++++++++++++++++++++++++++++++++++++++++++++
>   lib/xe/xe_util.h |  30 ++++++++++++++
>   3 files changed, 136 insertions(+), 1 deletion(-)
>   create mode 100644 lib/xe/xe_util.c
>   create mode 100644 lib/xe/xe_util.h
> 
> diff --git a/lib/meson.build b/lib/meson.build
> index 5523b4450e..ce11c0715f 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -105,7 +105,8 @@ lib_sources = [
>   	'xe/xe_compute_square_kernels.c',
>   	'xe/xe_ioctl.c',
>   	'xe/xe_query.c',
> -	'xe/xe_spin.c'
> +	'xe/xe_spin.c',
> +	'xe/xe_util.c',
>   ]
>   
>   lib_deps = [
> diff --git a/lib/xe/xe_util.c b/lib/xe/xe_util.c
> new file mode 100644
> index 0000000000..448b3a3d27
> --- /dev/null
> +++ b/lib/xe/xe_util.c
> @@ -0,0 +1,104 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#include "igt.h"
> +#include "igt_syncobj.h"
> +#include "xe/xe_ioctl.h"
> +#include "xe/xe_query.h"
> +#include "xe/xe_util.h"
> +
> +static bool __region_belongs_to_regions_type(struct drm_xe_query_mem_region *region,
> +					     uint32_t *mem_regions_type,
> +					     int num_regions)
> +{
> +	for (int i = 0; i < num_regions; i++)
> +		if (mem_regions_type[i] == region->mem_class)
> +			return true;
> +	return false;
> +}
> +
> +struct igt_collection *
> +__xe_get_memory_region_set(int xe, uint32_t *mem_regions_type, int num_regions)
> +{
> +	struct drm_xe_query_mem_region *memregion;
> +	struct igt_collection *set = NULL;
> +	uint64_t memreg = all_memory_regions(xe), region;
> +	int count = 0, pos = 0;
> +
> +	xe_for_each_mem_region(xe, memreg, region) {
> +		memregion = xe_mem_region(xe, region);
> +		if (__region_belongs_to_regions_type(memregion,
> +						     mem_regions_type,
> +						     num_regions))
> +			count++;
> +	}
> +
> +	set = igt_collection_create(count);
> +
> +	xe_for_each_mem_region(xe, memreg, region) {
> +		memregion = xe_mem_region(xe, region);
> +		igt_assert(region < (1ull << 31));
> +		if (__region_belongs_to_regions_type(memregion,
> +						     mem_regions_type,
> +						     num_regions)) {
> +			igt_collection_set_value(set, pos++, (int)region);
> +		}
> +	}
> +
> +	igt_assert(count == pos);
> +
> +	return set;
> +}
> +
> +/**
> +  * xe_memregion_dynamic_subtest_name:
> +  * @xe: drm fd of Xe device
> +  * @igt_collection: memory region collection
> +  *
> +  * The function iterates over all memory regions inside the collection (kept
> +  * in the value field) and generates the name which can be used during dynamic
> +  * subtest creation.
> +  *
> +  * Returns: newly allocated string, has to be freed by caller. Asserts if
> +  * caller tries to create a name using empty collection.
> +  */
> +char *xe_memregion_dynamic_subtest_name(int xe, struct igt_collection *set)
> +{
> +	struct igt_collection_data *data;
> +	char *name, *p;
> +	uint32_t region, len;
> +
> +	igt_assert(set && set->size);
> +	/* enough for "name%d-" * n */
> +	len = set->size * 8;
> +	p = name = malloc(len);
> +	igt_assert(name);
> +
> +	for_each_collection_data(data, set) {
> +		struct drm_xe_query_mem_region *memreg;
> +		int r;
> +
> +		region = data->value;
> +		memreg = xe_mem_region(xe, region);
> +
> +		if (XE_IS_CLASS_VRAM(memreg))
> +			r = snprintf(p, len, "%s%d-",
> +				     xe_region_name(region),
> +				     memreg->instance);
> +		else
> +			r = snprintf(p, len, "%s-",
> +				     xe_region_name(region));
> +
> +		igt_assert(r > 0);
> +		p += r;
> +		len -= r;
> +	}
> +
> +	/* remove last '-' */
> +	*(p - 1) = 0;
> +
> +	return name;
> +}
> +
> diff --git a/lib/xe/xe_util.h b/lib/xe/xe_util.h
> new file mode 100644
> index 0000000000..9f56fa9898
> --- /dev/null
> +++ b/lib/xe/xe_util.h
> @@ -0,0 +1,30 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2023 Intel Corporation
> + *
> + */
> +
> +#ifndef XE_UTIL_H
> +#define XE_UTIL_H
> +
> +#include <stdbool.h>
> +#include <stddef.h>
> +#include <stdint.h>
> +#include <xe_drm.h>
> +
> +#define XE_IS_SYSMEM_MEMORY_REGION(fd, region) \
> +	(xe_region_class(fd, region) == XE_MEM_REGION_CLASS_SYSMEM)
> +#define XE_IS_VRAM_MEMORY_REGION(fd, region) \
> +	(xe_region_class(fd, region) == XE_MEM_REGION_CLASS_VRAM)
> +
> +struct igt_collection *
> +__xe_get_memory_region_set(int xe, uint32_t *mem_regions_type, int num_regions);
> +
> +#define xe_get_memory_region_set(regions, mem_region_types...) ({ \
> +	unsigned int arr__[] = { mem_region_types }; \
> +	__xe_get_memory_region_set(regions, arr__, ARRAY_SIZE(arr__)); \
> +})
> +
> +char *xe_memregion_dynamic_subtest_name(int xe, struct igt_collection *set);
> +
> +#endif /* XE_UTIL_H */

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 08/16] lib/xe_util: Add vm bind/unbind helper for Xe
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 08/16] lib/xe_util: Add vm bind/unbind helper " Zbigniew Kempczyński
@ 2023-07-06 10:27   ` Karolina Stolarek
  0 siblings, 0 replies; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-06 10:27 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

Hi Zbigniew,

I know I replied in v1, but I still have one more question, sorry :o)

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> Before calling exec we need to prepare the vm to contain valid entries.
> Bind/unbind in xe expects a single bind_op or a vector of bind_ops, which
> makes preparing them a little inconvenient. Add a function
> which iterates over a list of xe_object (an auxiliary structure which
> describes bind information for an object) and performs the bind/unbind
> in one step. It also supports passing syncobj in/out to work in
> pipelined executions.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>   lib/xe/xe_util.c | 132 +++++++++++++++++++++++++++++++++++++++++++++++
>   lib/xe/xe_util.h |  18 +++++++
>   2 files changed, 150 insertions(+)
> 
> diff --git a/lib/xe/xe_util.c b/lib/xe/xe_util.c
> index 448b3a3d27..b998a52e73 100644
> --- a/lib/xe/xe_util.c
> +++ b/lib/xe/xe_util.c
> @@ -102,3 +102,135 @@ char *xe_memregion_dynamic_subtest_name(int xe, struct igt_collection *set)
>   	return name;
>   }
>   
> +#ifdef XEBINDDBG
> +#define bind_info igt_info
> +#define bind_debug igt_debug
> +#else
> +#define bind_info(...) {}
> +#define bind_debug(...) {}
> +#endif
> +
> +static struct drm_xe_vm_bind_op *xe_alloc_bind_ops(struct igt_list_head *obj_list,
> +						   uint32_t *num_ops)
> +{
> +	struct drm_xe_vm_bind_op *bind_ops, *ops;
> +	struct xe_object *obj;
> +	uint32_t num_objects = 0, i = 0, op;
> +
> +	igt_list_for_each_entry(obj, obj_list, link)
> +		num_objects++;
> +
> +	*num_ops = num_objects;
> +	if (!num_objects) {
> +		bind_info(" [nothing to bind]\n");
> +		return NULL;
> +	}
> +
> +	bind_ops = calloc(num_objects, sizeof(*bind_ops));
> +	igt_assert(bind_ops);
> +
> +	igt_list_for_each_entry(obj, obj_list, link) {
> +		ops = &bind_ops[i];
> +
> +		if (obj->bind_op == XE_OBJECT_BIND) {
> +			op = XE_VM_BIND_OP_MAP | XE_VM_BIND_FLAG_ASYNC;
> +			ops->obj = obj->handle;
> +		} else {
> +			op = XE_VM_BIND_OP_UNMAP | XE_VM_BIND_FLAG_ASYNC;
> +		}
> +
> +		ops->op = op;
> +		ops->obj_offset = 0;
> +		ops->addr = obj->offset;
> +		ops->range = obj->size;
> +		ops->region = 0;
> +
> +		bind_info("  [%d]: [%6s] handle: %u, offset: %llx, size: %llx\n",
> +			  i, obj->bind_op == XE_OBJECT_BIND ? "BIND" : "UNBIND",
> +			  ops->obj, (long long)ops->addr, (long long)ops->range);
> +		i++;
> +	}
> +
> +	return bind_ops;
> +}
> +
> +static void __xe_op_bind_async(int xe, uint32_t vm, uint32_t bind_engine,
> +			       struct igt_list_head *obj_list,
> +			       uint32_t sync_in, uint32_t sync_out)
> +{
> +	struct drm_xe_vm_bind_op *bind_ops;
> +	struct drm_xe_sync tabsyncs[2] = {
> +		{ .flags = DRM_XE_SYNC_SYNCOBJ, .handle = sync_in },
> +		{ .flags = DRM_XE_SYNC_SYNCOBJ | DRM_XE_SYNC_SIGNAL, .handle = sync_out },
> +	};
> +	struct drm_xe_sync *syncs;
> +	uint32_t num_binds = 0;
> +	int num_syncs;
> +
> +	bind_info("[Binding to vm: %u]\n", vm);
> +	bind_ops = xe_alloc_bind_ops(obj_list, &num_binds);
> +
> +	if (!num_binds) {
> +		if (sync_out)
> +			syncobj_signal(xe, &sync_out, 1);
> +		return;
> +	}
> +
> +	if (sync_in) {
> +		syncs = tabsyncs;
> +		num_syncs = 2;
> +	} else {
> +		syncs = &tabsyncs[1];
> +		num_syncs = 1;
> +	}
> +
> +	/* User didn't pass sync out, create it and wait for completion */
> +	if (!sync_out)
> +		tabsyncs[1].handle = syncobj_create(xe, 0);
> +
> +	bind_info("[Binding syncobjs: (in: %u, out: %u)]\n",
> +		  tabsyncs[0].handle, tabsyncs[1].handle);
> +
> +	if (num_binds == 1) {
> +		if ((bind_ops[0].op & 0xffff) == XE_VM_BIND_OP_MAP)
> +			xe_vm_bind_async(xe, vm, bind_engine, bind_ops[0].obj, 0,
> +					bind_ops[0].addr, bind_ops[0].range,
> +					syncs, num_syncs);
> +		else
> +			xe_vm_unbind_async(xe, vm, bind_engine, 0,
> +					   bind_ops[0].addr, bind_ops[0].range,
> +					   syncs, num_syncs);
> +	} else {
> +		xe_vm_bind_array(xe, vm, bind_engine, bind_ops,
> +				 num_binds, syncs, num_syncs);
> +	}
> +
> +	if (!sync_out) {
> +		igt_assert_eq(syncobj_wait_err(xe, &tabsyncs[1].handle, 1, INT64_MAX, 0), 0);
> +		syncobj_destroy(xe, tabsyncs[1].handle);
> +	}
> +
> +	free(bind_ops);
> +}
> +
> +/**
> +  * xe_bind_unbind_async:
> +  * @xe: drm fd of Xe device
> +  * @vm: vm to bind/unbind objects to/from
> +  * @bind_engine: bind engine, 0 if default
> +  * @obj_list: list of xe_object
> +  * @sync_in: sync object (fence-in), 0 if there's no input dependency
> +  * @sync_out: sync object (fence-out) to signal on bind/unbind completion,
> +  *            if 0 wait for bind/unbind completion.
> +  *
> +  * The function iterates over the xe_object @obj_list, prepares the binding
> +  * operations and does the bind/unbind in one step. Providing @sync_in /
> +  * @sync_out allows working in pipelined mode. With @sync_in and @sync_out
> +  * set to 0 the function waits until the binding operation is complete.
> +  */
> +void xe_bind_unbind_async(int fd, uint32_t vm, uint32_t bind_engine,
> +			  struct igt_list_head *obj_list,
> +			  uint32_t sync_in, uint32_t sync_out)
> +{
> +	return __xe_op_bind_async(fd, vm, bind_engine, obj_list, sync_in, sync_out);
> +}
> diff --git a/lib/xe/xe_util.h b/lib/xe/xe_util.h
> index 9f56fa9898..32f309923e 100644
> --- a/lib/xe/xe_util.h
> +++ b/lib/xe/xe_util.h
> @@ -27,4 +27,22 @@ __xe_get_memory_region_set(int xe, uint32_t *mem_regions_type, int num_regions);
>   
>   char *xe_memregion_dynamic_subtest_name(int xe, struct igt_collection *set);
>   
> +enum xe_bind_op {
> +	XE_OBJECT_BIND,
> +	XE_OBJECT_UNBIND,
> +};
> +
> +struct xe_object {
> +	uint32_t handle;
> +	uint64_t offset;
> +	uint64_t size;
> +	enum xe_bind_op bind_op;
> +	void *priv;
> +	struct igt_list_head link;
> +};

It might seem obvious what *priv is for, but could we add a comment 
about it? Or a couple of words about the purpose of this struct.
Apart from that, I'm happy with the patch.

Thanks,
Karolina

> +
> +void xe_bind_unbind_async(int fd, uint32_t vm, uint32_t bind_engine,
> +			  struct igt_list_head *obj_list,
> +			  uint32_t sync_in, uint32_t sync_out);
> +
>   #endif /* XE_UTIL_H */

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 09/16] lib/intel_allocator: Add field to distinquish underlying driver
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 09/16] lib/intel_allocator: Add field to distinquish underlying driver Zbigniew Kempczyński
@ 2023-07-06 10:34   ` Karolina Stolarek
  0 siblings, 0 replies; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-06 10:34 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> Cache which driver is in use on the drm fd to avoid calling the same code
> in allocator functions.

Good call!

Reviewed-by: Karolina Stolarek <karolina.stolarek@intel.com>

> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>   lib/intel_allocator.c | 1 +
>   lib/intel_allocator.h | 3 +++
>   2 files changed, 4 insertions(+)
> 
> diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
> index be24f8f2d0..228b33b92f 100644
> --- a/lib/intel_allocator.c
> +++ b/lib/intel_allocator.c
> @@ -318,6 +318,7 @@ static struct intel_allocator *intel_allocator_create(int fd,
>   
>   	igt_assert(ial);
>   
> +	ial->driver = get_intel_driver(fd);
>   	ial->type = allocator_type;
>   	ial->strategy = allocator_strategy;
>   	ial->default_alignment = default_alignment;
> diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
> index 3ec74f6191..1001b21b98 100644
> --- a/lib/intel_allocator.h
> +++ b/lib/intel_allocator.h
> @@ -141,6 +141,9 @@ struct intel_allocator {
>   	/* allocator's private structure */
>   	void *priv;
>   
> +	/* driver - i915 or Xe */
> +	enum intel_driver driver;
> +
>   	void (*get_address_range)(struct intel_allocator *ial,
>   				  uint64_t *startp, uint64_t *endp);
>   	uint64_t (*alloc)(struct intel_allocator *ial, uint32_t handle,

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 02/16] lib/intel_allocator: Drop aliasing allocator handle api
  2023-07-06  8:31   ` Karolina Stolarek
@ 2023-07-06 11:20     ` Zbigniew Kempczyński
  2023-07-06 13:28       ` Karolina Stolarek
  0 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06 11:20 UTC (permalink / raw)
  To: Karolina Stolarek; +Cc: igt-dev

On Thu, Jul 06, 2023 at 10:31:15AM +0200, Karolina Stolarek wrote:
> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> > There's no real user of this api, let's drop it.
> > 
> 
> Should we drop REQ_OPEN_AS and/or RESP_OPEN_AS enum values as well?
> 

Good catch, thanks. Will be in v3.

--
Zbigniew

> All the best,
> Karolina
> 
> > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > ---
> >   lib/intel_allocator.c | 18 ------------------
> >   lib/intel_allocator.h |  1 -
> >   2 files changed, 19 deletions(-)
> > 
> > diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
> > index 8161221dbf..c31576ecef 100644
> > --- a/lib/intel_allocator.c
> > +++ b/lib/intel_allocator.c
> > @@ -1037,24 +1037,6 @@ uint64_t intel_allocator_open_vm(int fd, uint32_t vm, uint8_t allocator_type)
> >   					    ALLOC_STRATEGY_HIGH_TO_LOW, 0);
> >   }
> > -uint64_t intel_allocator_open_vm_as(uint64_t allocator_handle, uint32_t new_vm)
> > -{
> > -	struct alloc_req req = { .request_type = REQ_OPEN_AS,
> > -				 .allocator_handle = allocator_handle,
> > -				 .open_as.new_vm = new_vm };
> > -	struct alloc_resp resp;
> > -
> > -	/* Get child_tid only once at open() */
> > -	if (child_tid == -1)
> > -		child_tid = gettid();
> > -
> > -	igt_assert(handle_request(&req, &resp) == 0);
> > -	igt_assert(resp.open_as.allocator_handle);
> > -	igt_assert(resp.response_type == RESP_OPEN_AS);
> > -
> > -	return resp.open.allocator_handle;
> > -}
> > -
> >   /**
> >    * intel_allocator_close:
> >    * @allocator_handle: handle to the allocator that will be closed
> > diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
> > index a6bf573e9d..3ec74f6191 100644
> > --- a/lib/intel_allocator.h
> > +++ b/lib/intel_allocator.h
> > @@ -182,7 +182,6 @@ uint64_t intel_allocator_open_vm_full(int fd, uint32_t vm,
> >   				      enum allocator_strategy strategy,
> >   				      uint64_t default_alignment);
> > -uint64_t intel_allocator_open_vm_as(uint64_t allocator_handle, uint32_t new_vm);
> >   bool intel_allocator_close(uint64_t allocator_handle);
> >   void intel_allocator_get_address_range(uint64_t allocator_handle,
> >   				       uint64_t *startp, uint64_t *endp);

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 10/16] lib/intel_allocator: Add intel_allocator_bind()
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 10/16] lib/intel_allocator: Add intel_allocator_bind() Zbigniew Kempczyński
@ 2023-07-06 13:02   ` Karolina Stolarek
  2023-07-06 16:09     ` Zbigniew Kempczyński
  0 siblings, 1 reply; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-06 13:02 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> Synchronize allocator state to vm.
> 
> This change allows xe user to execute vm-bind/unbind for allocator
> alloc()/free() operations which occurred since last binding/unbinding.
> Before doing exec user should call intel_allocator_bind() to ensure
> all vma's are in place.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
> v2: Rewrite tracking mechanism: the previous code used a bind map embedded
>      in the allocator structure. Unfortunately this wasn't a good idea
>      - binding worked fine for xe, but it regressed multiprocess/
>      multithreaded allocations. The main reason was that child
>      processes couldn't get a reference, as this memory was allocated
>      on the allocator thread (a separate process). Currently each child
>      contains its own separate tracking maps for ahnd and, for each
>      ahnd, a bind map.
> ---
>   lib/igt_core.c        |   5 +
>   lib/intel_allocator.c | 259 +++++++++++++++++++++++++++++++++++++++++-
>   lib/intel_allocator.h |   6 +-
>   3 files changed, 265 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/igt_core.c b/lib/igt_core.c
> index 3ee3a01c36..6286e97b1b 100644
> --- a/lib/igt_core.c
> +++ b/lib/igt_core.c
> @@ -74,6 +74,7 @@
>   #include "igt_sysrq.h"
>   #include "igt_rc.h"
>   #include "igt_list.h"
> +#include "igt_map.h"
>   #include "igt_device_scan.h"
>   #include "igt_thread.h"
>   #include "runnercomms.h"
> @@ -319,6 +320,8 @@ bool test_multi_fork_child;
>   /* For allocator purposes */
>   pid_t child_pid  = -1;
>   __thread pid_t child_tid  = -1;
> +struct igt_map *ahnd_map;
> +pthread_mutex_t ahnd_map_mutex;
>   
>   enum {
>   	/*
> @@ -2509,6 +2512,8 @@ bool __igt_fork(void)
>   	case 0:
>   		test_child = true;
>   		pthread_mutex_init(&print_mutex, NULL);
> +		pthread_mutex_init(&ahnd_map_mutex, NULL);
> +		ahnd_map = igt_map_create(igt_map_hash_64, igt_map_equal_64);
>   		child_pid = getpid();
>   		child_tid = -1;
>   		exit_handler_count = 0;
> diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
> index 228b33b92f..02d3404abc 100644
> --- a/lib/intel_allocator.c
> +++ b/lib/intel_allocator.c
> @@ -17,6 +17,7 @@
>   #include "intel_allocator.h"
>   #include "intel_allocator_msgchannel.h"
>   #include "xe/xe_query.h"
> +#include "xe/xe_util.h"
>   
>   //#define ALLOCDBG

I know it has been here before, but do we want to keep that symbol as a 
comment?

>   #ifdef ALLOCDBG
> @@ -46,6 +47,14 @@ static inline const char *reqstr(enum reqtype request_type)
>   #define alloc_debug(...) {}
>   #endif
>   
> +#ifdef ALLOCBINDDBG
> +#define bind_info igt_info
> +#define bind_debug igt_debug
> +#else
> +#define bind_info(...) {}
> +#define bind_debug(...) {}
> +#endif
> +
>   /*
>    * We limit allocator space to avoid hang when batch would be
>    * pinned in the last page.
> @@ -65,6 +74,31 @@ struct handle_entry {
>   	struct allocator *al;
>   };
>   
> +/* For tracking alloc()/free() for Xe */

Hmm, but it looks like we track it for both drivers, so that comment is 
slightly confusing. I understand that we build the struct, but don't 
actually use it with i915. The question is whether we want to have it 
around with i915.

> +struct ahnd_info {
> +	int fd;
> +	uint64_t ahnd;
> +	uint32_t ctx;

I'm just thinking, given your 11/16 patch where you update intel_ctx_t 
struct, could we store it here instead of just an id? Would it be useful 
to have access to intel_ctx_cfg_t, if one was defined?

> +	uint32_t vm;
> +	enum intel_driver driver;
> +	struct igt_map *bind_map;
> +	pthread_mutex_t bind_map_mutex;
> +};
> +
> +enum allocator_bind_op {
> +	BOUND,
> +	TO_BIND,
> +	TO_UNBIND,
> +};
> +
> +struct allocator_object {
> +	uint32_t handle;
> +	uint64_t offset;
> +	uint64_t size;
> +
> +	enum allocator_bind_op bind_op;
> +};
> +
>   struct intel_allocator *
>   intel_allocator_reloc_create(int fd, uint64_t start, uint64_t end);
>   struct intel_allocator *
> @@ -123,6 +157,13 @@ static pid_t allocator_pid = -1;
>   extern pid_t child_pid;
>   extern __thread pid_t child_tid;
>   
> +/*
> + * Tracking alloc()/free() requires storing it in the local process, which
> + * has access to the real drm fd it can work on.
> + */
> +extern struct igt_map *ahnd_map;
> +extern pthread_mutex_t ahnd_map_mutex;
> +
>   /*
>    * - for parent process we have child_pid == -1
>    * - for child which calls intel_allocator_init() allocator_pid == child_pid
> @@ -318,7 +359,6 @@ static struct intel_allocator *intel_allocator_create(int fd,
>   
>   	igt_assert(ial);
>   
> -	ial->driver = get_intel_driver(fd);

Please remember to drop that patch so we don't have such diff in v3.

>   	ial->type = allocator_type;
>   	ial->strategy = allocator_strategy;
>   	ial->default_alignment = default_alignment;
> @@ -893,6 +933,46 @@ void intel_allocator_multiprocess_stop(void)
>   	}
>   }
>   
> +static void track_ahnd(int fd, uint64_t ahnd, uint32_t ctx, uint32_t vm)
> +{
> +	struct ahnd_info *ainfo;
> +
> +	pthread_mutex_lock(&ahnd_map_mutex);
> +	ainfo = igt_map_search(ahnd_map, &ahnd);
> +	if (!ainfo) {
> +		ainfo = malloc(sizeof(*ainfo));
> +		ainfo->fd = fd;
> +		ainfo->ahnd = ahnd;
> +		ainfo->ctx = ctx;
> +		ainfo->vm = vm;
> +		ainfo->driver = get_intel_driver(fd);
> +		ainfo->bind_map = igt_map_create(igt_map_hash_32, igt_map_equal_32);
> +		pthread_mutex_init(&ainfo->bind_map_mutex, NULL);
> +		bind_debug("[TRACK AHND] pid: %d, tid: %d, create <fd: %d, "
> +			   "ahnd: %llx, ctx: %u, vm: %u, driver: %d, ahnd_map: %p, bind_map: %p>\n",
> +			   getpid(), gettid(), ainfo->fd,
> +			   (long long)ainfo->ahnd, ainfo->ctx, ainfo->vm,
> +			   ainfo->driver, ahnd_map, ainfo->bind_map);
> +		igt_map_insert(ahnd_map, &ainfo->ahnd, ainfo);
> +	}
> +
> +	pthread_mutex_unlock(&ahnd_map_mutex);
> +}
> +
> +static void untrack_ahnd(uint64_t ahnd)
> +{
> +	struct ahnd_info *ainfo;
> +
> +	pthread_mutex_lock(&ahnd_map_mutex);
> +	ainfo = igt_map_search(ahnd_map, &ahnd);
> +	if (ainfo) {
> +		bind_debug("[UNTRACK AHND]: pid: %d, tid: %d, removing ahnd: %llx\n",
> +			   getpid(), gettid(), (long long)ahnd);
> +		igt_map_remove(ahnd_map, &ahnd, map_entry_free_func);
> +	}

Suggestion: I'd warn on !ainfo, we tried to untrack/free something that 
wasn't tracked before.

> +	pthread_mutex_unlock(&ahnd_map_mutex);
> +}
> +
>   static uint64_t __intel_allocator_open_full(int fd, uint32_t ctx,
>   					    uint32_t vm,
>   					    uint64_t start, uint64_t end,
> @@ -951,6 +1031,8 @@ static uint64_t __intel_allocator_open_full(int fd, uint32_t ctx,
>   	igt_assert(resp.open.allocator_handle);
>   	igt_assert(resp.response_type == RESP_OPEN);
>   
> +	track_ahnd(fd, resp.open.allocator_handle, ctx, vm);
> +
>   	return resp.open.allocator_handle;
>   }
>   
> @@ -1057,6 +1139,8 @@ bool intel_allocator_close(uint64_t allocator_handle)
>   	igt_assert(handle_request(&req, &resp) == 0);
>   	igt_assert(resp.response_type == RESP_CLOSE);
>   
> +	untrack_ahnd(allocator_handle);
> +
>   	return resp.close.is_empty;
>   }
>   
> @@ -1090,6 +1174,76 @@ void intel_allocator_get_address_range(uint64_t allocator_handle,
>   		*endp = resp.address_range.end;
>   }
>   
> +static bool is_same(struct allocator_object *obj,
> +		    uint32_t handle, uint64_t offset, uint64_t size,
> +		    enum allocator_bind_op bind_op)
> +{
> +	return obj->handle == handle &&	obj->offset == offset && obj->size == size &&
> +	       (obj->bind_op == bind_op || obj->bind_op == BOUND);
> +}
> +
> +static void track_object(uint64_t allocator_handle, uint32_t handle,
> +			 uint64_t offset, uint64_t size,
> +			 enum allocator_bind_op bind_op)
> +{
> +	struct ahnd_info *ainfo;
> +	struct allocator_object *obj;
> +
> +	bind_debug("[TRACK OBJECT]: [%s] pid: %d, tid: %d, ahnd: %llx, handle: %u, offset: %llx, size: %llx\n",
> +		   bind_op == TO_BIND ? "BIND" : "UNBIND",
> +		   getpid(), gettid(),
> +		   (long long)allocator_handle,
> +		   handle, (long long)offset, (long long)size);
> +
> +	if (offset == ALLOC_INVALID_ADDRESS) {
> +		bind_debug("[TRACK OBJECT] => invalid address %llx, skipping tracking\n",
> +			   (long long)offset);

OK, we don't track ALLOC_INVALID_ADDRESS as it means that the allocation 
in simple_vma_heap_alloc() failed, correct?

> +		return;
> +	}
> +
> +	pthread_mutex_lock(&ahnd_map_mutex);
> +	ainfo = igt_map_search(ahnd_map, &allocator_handle);
> +	pthread_mutex_unlock(&ahnd_map_mutex);
> +	if (!ainfo) {
> +		igt_warn("[TRACK OBJECT] => MISSING ahnd %llx <=\n", (long long)allocator_handle);
> +		igt_assert(ainfo);
> +	}

Could we do igt_assert_f() instead?

> +
> +	if (ainfo->driver == INTEL_DRIVER_I915)
> +		return; /* no-op for i915, at least now */

I wonder if we could move that tracking to the xe path for now. I mean, 
maybe there will be some benefit of doing it for i915, but I can't see 
it, at least for now.

> +
> +	pthread_mutex_lock(&ainfo->bind_map_mutex);
> +	obj = igt_map_search(ainfo->bind_map, &handle);
> +	if (obj) {
> +		/*
> +		 * User may call alloc() couple of times, check object is the
> +		 * same. For free() there's simple case, just remove from
> +		 * bind_map.
> +		 */
> +		if (bind_op == TO_BIND)
> +			igt_assert_eq(is_same(obj, handle, offset, size, bind_op), true);

Checkpatch.pl doesn't like the fact that you're not using braces both 
for if and else if (I have no strong preference)

> +		else if (bind_op == TO_UNBIND) {
> +			if (obj->bind_op == TO_BIND)
> +				igt_map_remove(ainfo->bind_map, &obj->handle, map_entry_free_func);
> +			else if (obj->bind_op == BOUND)
> +				obj->bind_op = bind_op;
> +		}
> +	} else {
> +		/* Ignore to unbind bo which wasn't previously inserted */
> +		if (bind_op == TO_UNBIND)
> +			goto out;
> +
> +		obj = calloc(1, sizeof(*obj));
> +		obj->handle = handle;
> +		obj->offset = offset;
> +		obj->size = size;
> +		obj->bind_op = bind_op;

We don't have to check here for bind_op == BOUND, because the only way 
to get to this state is to call __xe_op_bind(), and not alloc/free, 
correct?

> +		igt_map_insert(ainfo->bind_map, &obj->handle, obj);
> +	}
> +out:
> +	pthread_mutex_unlock(&ainfo->bind_map_mutex);
> +}
> +
>   /**
>    * __intel_allocator_alloc:
>    * @allocator_handle: handle to an allocator
> @@ -1121,6 +1275,8 @@ uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
>   	igt_assert(handle_request(&req, &resp) == 0);
>   	igt_assert(resp.response_type == RESP_ALLOC);
>   
> +	track_object(allocator_handle, handle, resp.alloc.offset, size, TO_BIND);
> +
>   	return resp.alloc.offset;
>   }
>   
> @@ -1198,6 +1354,8 @@ bool intel_allocator_free(uint64_t allocator_handle, uint32_t handle)
>   	igt_assert(handle_request(&req, &resp) == 0);
>   	igt_assert(resp.response_type == RESP_FREE);
>   
> +	track_object(allocator_handle, handle, 0, 0, TO_UNBIND);
> +
>   	return resp.free.freed;
>   }
>   
> @@ -1382,6 +1540,83 @@ void intel_allocator_print(uint64_t allocator_handle)
>   	}
>   }
>   
> +static void __xe_op_bind(struct ahnd_info *ainfo, uint32_t sync_in, uint32_t sync_out)
> +{
> +	struct allocator_object *obj;
> +	struct igt_map_entry *pos;
> +	struct igt_list_head obj_list;
> +	struct xe_object *entry, *tmp;
> +
> +	IGT_INIT_LIST_HEAD(&obj_list);
> +
> +	pthread_mutex_lock(&ainfo->bind_map_mutex);
> +	igt_map_foreach(ainfo->bind_map, pos) {
> +		obj = pos->data;
> +
> +		if (obj->bind_op == BOUND)
> +			continue;
> +
> +		bind_info("= [vm: %u] %s => %u %lx %lx\n",
> +			  ainfo->ctx,
> +			  obj->bind_op == TO_BIND ? "TO BIND" : "TO UNBIND",
> +			  obj->handle, obj->offset,
> +			  obj->size);
> +
> +		entry = malloc(sizeof(*entry));
> +		entry->handle = obj->handle;
> +		entry->offset = obj->offset;
> +		entry->size = obj->size;
> +		entry->bind_op = obj->bind_op == TO_BIND ? XE_OBJECT_BIND :
> +							   XE_OBJECT_UNBIND;
> +		entry->priv = obj;
> +		igt_list_add(&entry->link, &obj_list);
> +	}
> +	pthread_mutex_unlock(&ainfo->bind_map_mutex);
> +
> +	xe_bind_unbind_async(ainfo->fd, ainfo->ctx, 0, &obj_list, sync_in, sync_out);

Shouldn't the second param be ainfo->vm, not ainfo->ctx?

> +
> +	pthread_mutex_lock(&ainfo->bind_map_mutex);
> +	igt_list_for_each_entry_safe(entry, tmp, &obj_list, link) {
> +		obj = entry->priv;
> +		if (obj->bind_op == TO_BIND)
> +			obj->bind_op = BOUND;
> +		else
> +			igt_map_remove(ainfo->bind_map, &obj->handle, map_entry_free_func);
> +
> +		igt_list_del(&entry->link);
> +		free(entry);
> +	}
> +	pthread_mutex_unlock(&ainfo->bind_map_mutex);
> +}
> +
> +/**
> + * intel_allocator_bind:
> + * @allocator_handle: handle to an allocator
> + * @sync_in: syncobj (fence-in)
> + * @sync_out: syncobj (fence-out)
> + *
> + * The function binds and unbinds all objects added to the allocator which
> + * weren't previously bound/unbound.
> + *
> + **/
> +void intel_allocator_bind(uint64_t allocator_handle,
> +			  uint32_t sync_in, uint32_t sync_out)
> +{
> +	struct ahnd_info *ainfo;
> +
> +	pthread_mutex_lock(&ahnd_map_mutex);
> +	ainfo = igt_map_search(ahnd_map, &allocator_handle);
> +	pthread_mutex_unlock(&ahnd_map_mutex);
> +	igt_assert(ainfo);
> +
> +	/*
> +	 * We collect bind/unbind operations on alloc()/free() to do a group
> +	 * operation, taking @sync_in as a syncobj handle (fence-in). If the
> +	 * user passes 0 as @sync_out we bind/unbind synchronously.
> +	 */
> +	__xe_op_bind(ainfo, sync_in, sync_out);
> +}
> +
>   static int equal_handles(const void *key1, const void *key2)
>   {
>   	const struct handle_entry *h1 = key1, *h2 = key2;
> @@ -1439,6 +1674,23 @@ static void __free_maps(struct igt_map *map, bool close_allocators)
>   	igt_map_destroy(map, map_entry_free_func);
>   }
>   
> +static void __free_ahnd_map(void)
> +{
> +	struct igt_map_entry *pos;
> +	struct ahnd_info *ainfo;
> +
> +	if (!ahnd_map)
> +		return;
> +
> +	igt_map_foreach(ahnd_map, pos) {
> +		ainfo = pos->data;
> +		igt_map_destroy(ainfo->bind_map, map_entry_free_func);
> +	}
> +
> +	igt_map_destroy(ahnd_map, map_entry_free_func);
> +}
> +
> +

^- Whoops, an extra blank line

>   /**
>    * intel_allocator_init:
>    *
> @@ -1456,12 +1708,15 @@ void intel_allocator_init(void)
>   	__free_maps(handles, true);
>   	__free_maps(ctx_map, false);
>   	__free_maps(vm_map, false);
> +	__free_ahnd_map();
>   
>   	atomic_init(&next_handle, 1);
>   	handles = igt_map_create(hash_handles, equal_handles);
>   	ctx_map = igt_map_create(hash_instance, equal_ctx);
>   	vm_map = igt_map_create(hash_instance, equal_vm);
> -	igt_assert(handles && ctx_map && vm_map);
> +	pthread_mutex_init(&ahnd_map_mutex, NULL);
> +	ahnd_map = igt_map_create(igt_map_hash_64, igt_map_equal_64);
> > +	igt_assert(handles && ctx_map && vm_map && ahnd_map);
> >
>   	channel = intel_allocator_get_msgchannel(CHANNEL_SYSVIPC_MSGQUEUE);
>   }
> diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
> index 1001b21b98..f9ff7f1cc9 100644
> --- a/lib/intel_allocator.h
> +++ b/lib/intel_allocator.h
> @@ -141,9 +141,6 @@ struct intel_allocator {
>   	/* allocator's private structure */
>   	void *priv;
>   
> -	/* driver - i915 or Xe */
> -	enum intel_driver driver;
> -

OK, I see it now. Please drop 9/16 and use per-thread driver info then.

Many thanks,
Karolina

>   	void (*get_address_range)(struct intel_allocator *ial,
>   				  uint64_t *startp, uint64_t *endp);
>   	uint64_t (*alloc)(struct intel_allocator *ial, uint32_t handle,
> @@ -213,6 +210,9 @@ bool intel_allocator_reserve_if_not_allocated(uint64_t allocator_handle,
>   
>   void intel_allocator_print(uint64_t allocator_handle);
>   
> +void intel_allocator_bind(uint64_t allocator_handle,
> +			  uint32_t sync_in, uint32_t sync_out);
> +
>   #define ALLOC_INVALID_ADDRESS (-1ull)
>   #define INTEL_ALLOCATOR_NONE   0
>   #define INTEL_ALLOCATOR_RELOC  1

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 02/16] lib/intel_allocator: Drop aliasing allocator handle api
  2023-07-06 11:20     ` Zbigniew Kempczyński
@ 2023-07-06 13:28       ` Karolina Stolarek
  0 siblings, 0 replies; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-06 13:28 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 13:20, Zbigniew Kempczyński wrote:
> On Thu, Jul 06, 2023 at 10:31:15AM +0200, Karolina Stolarek wrote:
>> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
>>> There's no real user of this api, lets drop it.
>>>
>>
>> Should we drop REQ_OPEN_AS and/or RESP_OPEN_AS enum values as well?
>>
> 
> Good catch, thanks. Will be in v3.

If you decide to delete them, also remember the switch statement in 
handle_request(); it will require a cleanup as well.

All the best,
Karolina

> 
> --
> Zbigniew
> 
>> All the best,
>> Karolina
>>
>>> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
>>> ---
>>>    lib/intel_allocator.c | 18 ------------------
>>>    lib/intel_allocator.h |  1 -
>>>    2 files changed, 19 deletions(-)
>>>
>>> diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
>>> index 8161221dbf..c31576ecef 100644
>>> --- a/lib/intel_allocator.c
>>> +++ b/lib/intel_allocator.c
>>> @@ -1037,24 +1037,6 @@ uint64_t intel_allocator_open_vm(int fd, uint32_t vm, uint8_t allocator_type)
>>>    					    ALLOC_STRATEGY_HIGH_TO_LOW, 0);
>>>    }
>>> -uint64_t intel_allocator_open_vm_as(uint64_t allocator_handle, uint32_t new_vm)
>>> -{
>>> -	struct alloc_req req = { .request_type = REQ_OPEN_AS,
>>> -				 .allocator_handle = allocator_handle,
>>> -				 .open_as.new_vm = new_vm };
>>> -	struct alloc_resp resp;
>>> -
>>> -	/* Get child_tid only once at open() */
>>> -	if (child_tid == -1)
>>> -		child_tid = gettid();
>>> -
>>> -	igt_assert(handle_request(&req, &resp) == 0);
>>> -	igt_assert(resp.open_as.allocator_handle);
>>> -	igt_assert(resp.response_type == RESP_OPEN_AS);
>>> -
>>> -	return resp.open.allocator_handle;
>>> -}
>>> -
>>>    /**
>>>     * intel_allocator_close:
>>>     * @allocator_handle: handle to the allocator that will be closed
>>> diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
>>> index a6bf573e9d..3ec74f6191 100644
>>> --- a/lib/intel_allocator.h
>>> +++ b/lib/intel_allocator.h
>>> @@ -182,7 +182,6 @@ uint64_t intel_allocator_open_vm_full(int fd, uint32_t vm,
>>>    				      enum allocator_strategy strategy,
>>>    				      uint64_t default_alignment);
>>> -uint64_t intel_allocator_open_vm_as(uint64_t allocator_handle, uint32_t new_vm);
>>>    bool intel_allocator_close(uint64_t allocator_handle);
>>>    void intel_allocator_get_address_range(uint64_t allocator_handle,
>>>    				       uint64_t *startp, uint64_t *endp);

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 10/16] lib/intel_allocator: Add intel_allocator_bind()
  2023-07-06 13:02   ` Karolina Stolarek
@ 2023-07-06 16:09     ` Zbigniew Kempczyński
  2023-07-07  8:01       ` Karolina Stolarek
  0 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-06 16:09 UTC (permalink / raw)
  To: Karolina Stolarek; +Cc: igt-dev

On Thu, Jul 06, 2023 at 03:02:29PM +0200, Karolina Stolarek wrote:
> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> > Synchronize allocator state to vm.
> > 
> > This change allows xe user to execute vm-bind/unbind for allocator
> > alloc()/free() operations which occurred since last binding/unbinding.
> > Before doing exec user should call intel_allocator_bind() to ensure
> > all vma's are in place.
> > 
> > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > ---
> > v2: Rewrite tracking mechanism: the previous code used a bind map embedded
> >      in the allocator structure. Unfortunately this wasn't a good idea
> >      - binding worked fine for xe, but it regressed multiprocess/
> >      multithreaded allocations. The main reason was that child
> >      processes couldn't get a reference, as this memory was allocated
> >      on the allocator thread (a separate process). Currently each child
> >      contains its own separate tracking maps for ahnd and, for each
> >      ahnd, a bind map.
> > ---
> >   lib/igt_core.c        |   5 +
> >   lib/intel_allocator.c | 259 +++++++++++++++++++++++++++++++++++++++++-
> >   lib/intel_allocator.h |   6 +-
> >   3 files changed, 265 insertions(+), 5 deletions(-)
> > 
> > diff --git a/lib/igt_core.c b/lib/igt_core.c
> > index 3ee3a01c36..6286e97b1b 100644
> > --- a/lib/igt_core.c
> > +++ b/lib/igt_core.c
> > @@ -74,6 +74,7 @@
> >   #include "igt_sysrq.h"
> >   #include "igt_rc.h"
> >   #include "igt_list.h"
> > +#include "igt_map.h"
> >   #include "igt_device_scan.h"
> >   #include "igt_thread.h"
> >   #include "runnercomms.h"
> > @@ -319,6 +320,8 @@ bool test_multi_fork_child;
> >   /* For allocator purposes */
> >   pid_t child_pid  = -1;
> >   __thread pid_t child_tid  = -1;
> > +struct igt_map *ahnd_map;
> > +pthread_mutex_t ahnd_map_mutex;
> >   enum {
> >   	/*
> > @@ -2509,6 +2512,8 @@ bool __igt_fork(void)
> >   	case 0:
> >   		test_child = true;
> >   		pthread_mutex_init(&print_mutex, NULL);
> > +		pthread_mutex_init(&ahnd_map_mutex, NULL);
> > +		ahnd_map = igt_map_create(igt_map_hash_64, igt_map_equal_64);
> >   		child_pid = getpid();
> >   		child_tid = -1;
> >   		exit_handler_count = 0;
> > diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
> > index 228b33b92f..02d3404abc 100644
> > --- a/lib/intel_allocator.c
> > +++ b/lib/intel_allocator.c
> > @@ -17,6 +17,7 @@
> >   #include "intel_allocator.h"
> >   #include "intel_allocator_msgchannel.h"
> >   #include "xe/xe_query.h"
> > +#include "xe/xe_util.h"
> >   //#define ALLOCDBG
> 
> I know it has been here before, but do we want to keep that symbol as a
> comment?
> 

Ok, I'm touching the line above so I'll get rid of this one.

> >   #ifdef ALLOCDBG
> > @@ -46,6 +47,14 @@ static inline const char *reqstr(enum reqtype request_type)
> >   #define alloc_debug(...) {}
> >   #endif
> > +#ifdef ALLOCBINDDBG
> > +#define bind_info igt_info
> > +#define bind_debug igt_debug
> > +#else
> > +#define bind_info(...) {}
> > +#define bind_debug(...) {}
> > +#endif
> > +
> >   /*
> >    * We limit allocator space to avoid hang when batch would be
> >    * pinned in the last page.
> > @@ -65,6 +74,31 @@ struct handle_entry {
> >   	struct allocator *al;
> >   };
> > +/* For tracking alloc()/free() for Xe */
> 
> Hmm, but it looks like we track it for both drivers, so that comment is
> slightly confusing. I understand that we build the struct, but don't
> actually use it with i915. The question is if we want to have it around with
> i915.
> 

At the moment we enter the track_object() function, but we immediately
look up the ahnd info in the ahnd_map and, if the driver is i915, we
return. So alloc()/free() are not tracked. I decided to track the ahnd
because accessing the map and checking its driver field is faster than
is_(xe|i915)_device(). So the comment above is correct imo.

> > +struct ahnd_info {
> > +	int fd;
> > +	uint64_t ahnd;
> > +	uint32_t ctx;
> 
> I'm just thinking, given your 11/16 patch where you update intel_ctx_t
> struct, could we store it here instead of just an id? Would it be useful to
> have access to intel_ctx_cfg_t, if one was defined?

I don't want to depend on intel_ctx in intel_allocator. But you're right,
ctx and vm here are confusing. I'm going to leave only the 'vm' field; this
will enforce the distinction between the two drivers. Expect the change in v3.

> 
> > +	uint32_t vm;
> > +	enum intel_driver driver;
> > +	struct igt_map *bind_map;
> > +	pthread_mutex_t bind_map_mutex;
> > +};
> > +
> > +enum allocator_bind_op {
> > +	BOUND,
> > +	TO_BIND,
> > +	TO_UNBIND,
> > +};
> > +
> > +struct allocator_object {
> > +	uint32_t handle;
> > +	uint64_t offset;
> > +	uint64_t size;
> > +
> > +	enum allocator_bind_op bind_op;
> > +};
> > +
> >   struct intel_allocator *
> >   intel_allocator_reloc_create(int fd, uint64_t start, uint64_t end);
> >   struct intel_allocator *
> > @@ -123,6 +157,13 @@ static pid_t allocator_pid = -1;
> >   extern pid_t child_pid;
> >   extern __thread pid_t child_tid;
> > +/*
> > + * Tracking alloc()/free() requires storing it in the local process, which
> > + * has access to the real drm fd it can work on.
> > + */
> > +extern struct igt_map *ahnd_map;
> > +extern pthread_mutex_t ahnd_map_mutex;
> > +
> >   /*
> >    * - for parent process we have child_pid == -1
> >    * - for child which calls intel_allocator_init() allocator_pid == child_pid
> > @@ -318,7 +359,6 @@ static struct intel_allocator *intel_allocator_create(int fd,
> >   	igt_assert(ial);
> > -	ial->driver = get_intel_driver(fd);
> 
> Please remember to drop that patch so we don't have such diff in v3.

Yes, I've dropped it already.

> 
> >   	ial->type = allocator_type;
> >   	ial->strategy = allocator_strategy;
> >   	ial->default_alignment = default_alignment;
> > @@ -893,6 +933,46 @@ void intel_allocator_multiprocess_stop(void)
> >   	}
> >   }
> > +static void track_ahnd(int fd, uint64_t ahnd, uint32_t ctx, uint32_t vm)
> > +{
> > +	struct ahnd_info *ainfo;
> > +
> > +	pthread_mutex_lock(&ahnd_map_mutex);
> > +	ainfo = igt_map_search(ahnd_map, &ahnd);
> > +	if (!ainfo) {
> > +		ainfo = malloc(sizeof(*ainfo));
> > +		ainfo->fd = fd;
> > +		ainfo->ahnd = ahnd;
> > +		ainfo->ctx = ctx;
> > +		ainfo->vm = vm;
> > +		ainfo->driver = get_intel_driver(fd);
> > +		ainfo->bind_map = igt_map_create(igt_map_hash_32, igt_map_equal_32);
> > +		pthread_mutex_init(&ainfo->bind_map_mutex, NULL);
> > +		bind_debug("[TRACK AHND] pid: %d, tid: %d, create <fd: %d, "
> > +			   "ahnd: %llx, ctx: %u, vm: %u, driver: %d, ahnd_map: %p, bind_map: %p>\n",
> > +			   getpid(), gettid(), ainfo->fd,
> > +			   (long long)ainfo->ahnd, ainfo->ctx, ainfo->vm,
> > +			   ainfo->driver, ahnd_map, ainfo->bind_map);
> > +		igt_map_insert(ahnd_map, &ainfo->ahnd, ainfo);
> > +	}
> > +
> > +	pthread_mutex_unlock(&ahnd_map_mutex);
> > +}
> > +
> > +static void untrack_ahnd(uint64_t ahnd)
> > +{
> > +	struct ahnd_info *ainfo;
> > +
> > +	pthread_mutex_lock(&ahnd_map_mutex);
> > +	ainfo = igt_map_search(ahnd_map, &ahnd);
> > +	if (ainfo) {
> > +		bind_debug("[UNTRACK AHND]: pid: %d, tid: %d, removing ahnd: %llx\n",
> > +			   getpid(), gettid(), (long long)ahnd);
> > +		igt_map_remove(ahnd_map, &ahnd, map_entry_free_func);
> > +	}
> 
> Suggestion: I'd warn on !ainfo, we tried to untrack/free something that
> wasn't tracked before.
> 

I don't want to warn. At the moment I don't treat closing an already-closed
ahnd as a reason to warn. See the REQ_CLOSE path, where it doesn't find the
allocator: it just skips and intel_allocator_close() returns false. I missed
adding a test for this, but for example in api_intel_allocator@simple-alloc
I could add a final line:

igt_assert_eq(intel_allocator_close(ahnd), true);
+igt_assert_eq(intel_allocator_close(ahnd), false);

This would check that the allocator was already closed, but it would
generate a warning, so cibuglog won't be happy. Maybe I should rethink this
and disallow closing invalid ahnds? That would require strict discipline
from the users, though.

> > +	pthread_mutex_unlock(&ahnd_map_mutex);
> > +}
> > +
> >   static uint64_t __intel_allocator_open_full(int fd, uint32_t ctx,
> >   					    uint32_t vm,
> >   					    uint64_t start, uint64_t end,
> > @@ -951,6 +1031,8 @@ static uint64_t __intel_allocator_open_full(int fd, uint32_t ctx,
> >   	igt_assert(resp.open.allocator_handle);
> >   	igt_assert(resp.response_type == RESP_OPEN);
> > +	track_ahnd(fd, resp.open.allocator_handle, ctx, vm);
> > +
> >   	return resp.open.allocator_handle;
> >   }
> > @@ -1057,6 +1139,8 @@ bool intel_allocator_close(uint64_t allocator_handle)
> >   	igt_assert(handle_request(&req, &resp) == 0);
> >   	igt_assert(resp.response_type == RESP_CLOSE);
> > +	untrack_ahnd(allocator_handle);
> > +
> >   	return resp.close.is_empty;
> >   }
> > @@ -1090,6 +1174,76 @@ void intel_allocator_get_address_range(uint64_t allocator_handle,
> >   		*endp = resp.address_range.end;
> >   }
> > +static bool is_same(struct allocator_object *obj,
> > +		    uint32_t handle, uint64_t offset, uint64_t size,
> > +		    enum allocator_bind_op bind_op)
> > +{
> > +	return obj->handle == handle &&	obj->offset == offset && obj->size == size &&
> > +	       (obj->bind_op == bind_op || obj->bind_op == BOUND);
> > +}
> > +
> > +static void track_object(uint64_t allocator_handle, uint32_t handle,
> > +			 uint64_t offset, uint64_t size,
> > +			 enum allocator_bind_op bind_op)
> > +{
> > +	struct ahnd_info *ainfo;
> > +	struct allocator_object *obj;
> > +
> > +	bind_debug("[TRACK OBJECT]: [%s] pid: %d, tid: %d, ahnd: %llx, handle: %u, offset: %llx, size: %llx\n",
> > +		   bind_op == TO_BIND ? "BIND" : "UNBIND",
> > +		   getpid(), gettid(),
> > +		   (long long)allocator_handle,
> > +		   handle, (long long)offset, (long long)size);
> > +
> > +	if (offset == ALLOC_INVALID_ADDRESS) {
> > +		bind_debug("[TRACK OBJECT] => invalid address %llx, skipping tracking\n",
> > +			   (long long)offset);
> 
> OK, we don't track ALLOC_INVALID_ADDRESS as it means that the allocation in
> simple_vma_heap_alloc() failed, correct?
> 

Yes, if we cannot fit: the first allocation took the whole space, so the
next one returns the invalid address.

> > +		return;
> > +	}
> > +
> > +	pthread_mutex_lock(&ahnd_map_mutex);
> > +	ainfo = igt_map_search(ahnd_map, &allocator_handle);
> > +	pthread_mutex_unlock(&ahnd_map_mutex);
> > +	if (!ainfo) {
> > +		igt_warn("[TRACK OBJECT] => MISSING ahnd %llx <=\n", (long long)allocator_handle);
> > +		igt_assert(ainfo);
> > +	}
> 
> Could we do igt_assert_f() instead?
> 

Yes, definitely. Left from previous debugging shape.

> > +
> > +	if (ainfo->driver == INTEL_DRIVER_I915)
> > +		return; /* no-op for i915, at least now */
> 
> I wonder if we could move that tracking to the xe path for now. I mean,
> maybe there will be some benefit of doing it for i915, but I can't see it,
> at least for now.
> 

If I don't track ahnd_info (with ahnd as the key) I cannot retrieve the
driver field and would need to call is_i915_device() or is_xe_device(),
which in my opinion costs more than a simple O(1) map lookup.

> > +
> > +	pthread_mutex_lock(&ainfo->bind_map_mutex);
> > +	obj = igt_map_search(ainfo->bind_map, &handle);
> > +	if (obj) {
> > +		/*
> > +		 * User may call alloc() couple of times, check object is the
> > +		 * same. For free() there's simple case, just remove from
> > +		 * bind_map.
> > +		 */
> > +		if (bind_op == TO_BIND)
> > +			igt_assert_eq(is_same(obj, handle, offset, size, bind_op), true);
> 
> Checkpatch.pl doesn't like the fact that you're not using braces both for if
> and else if (I have no strong pereference)
> 

Sure, I'll add missing ones to make it happy.

> > +		else if (bind_op == TO_UNBIND) {
> > +			if (obj->bind_op == TO_BIND)
> > +				igt_map_remove(ainfo->bind_map, &obj->handle, map_entry_free_func);
> > +			else if (obj->bind_op == BOUND)
> > +				obj->bind_op = bind_op;
> > +		}
> > +	} else {
> > +		/* Ignore to unbind bo which wasn't previously inserted */
> > +		if (bind_op == TO_UNBIND)
> > +			goto out;
> > +
> > +		obj = calloc(1, sizeof(*obj));
> > +		obj->handle = handle;
> > +		obj->offset = offset;
> > +		obj->size = size;
> > +		obj->bind_op = bind_op;
> 
> We don't have to check here for bind_op == BOUND, because the only way to
> get from to this state is to call __xe_op_bind(), and not alloc/free,
> correct?
> 

Yes. The bind_map, which collects active object allocations/frees, is keyed
on ahnd. I keep those allocations in the map to make this fast. alloc()
adds an object to the map in the TO_BIND state, free() in the TO_UNBIND
state, unless the user has already injected the offsets into the vm
(__xe_op_bind()). Collecting objects in xe_object form allows
xe_bind_unbind_async() to be allocator agnostic. The drawback of this code
shape is that I need to do cleanups in the map (assign BOUND to objects
which were bound and remove the unbound ones). So I keep an allocator_object
reference in the xe_object priv field; walking the list and looking up
entries in the map is much quicker. I may introduce a temporary map keeping
the xe_object -> allocator_object mapping and then get rid of the 'priv'
field.

> > +		igt_map_insert(ainfo->bind_map, &obj->handle, obj);
> > +	}
> > +out:
> > +	pthread_mutex_unlock(&ainfo->bind_map_mutex);
> > +}
> > +
> >   /**
> >    * __intel_allocator_alloc:
> >    * @allocator_handle: handle to an allocator
> > @@ -1121,6 +1275,8 @@ uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
> >   	igt_assert(handle_request(&req, &resp) == 0);
> >   	igt_assert(resp.response_type == RESP_ALLOC);
> > +	track_object(allocator_handle, handle, resp.alloc.offset, size, TO_BIND);
> > +
> >   	return resp.alloc.offset;
> >   }
> > @@ -1198,6 +1354,8 @@ bool intel_allocator_free(uint64_t allocator_handle, uint32_t handle)
> >   	igt_assert(handle_request(&req, &resp) == 0);
> >   	igt_assert(resp.response_type == RESP_FREE);
> > +	track_object(allocator_handle, handle, 0, 0, TO_UNBIND);
> > +
> >   	return resp.free.freed;
> >   }
> > @@ -1382,6 +1540,83 @@ void intel_allocator_print(uint64_t allocator_handle)
> >   	}
> >   }
> > +static void __xe_op_bind(struct ahnd_info *ainfo, uint32_t sync_in, uint32_t sync_out)
> > +{
> > +	struct allocator_object *obj;
> > +	struct igt_map_entry *pos;
> > +	struct igt_list_head obj_list;
> > +	struct xe_object *entry, *tmp;
> > +
> > +	IGT_INIT_LIST_HEAD(&obj_list);
> > +
> > +	pthread_mutex_lock(&ainfo->bind_map_mutex);
> > +	igt_map_foreach(ainfo->bind_map, pos) {
> > +		obj = pos->data;
> > +
> > +		if (obj->bind_op == BOUND)
> > +			continue;
> > +
> > +		bind_info("= [vm: %u] %s => %u %lx %lx\n",
> > +			  ainfo->ctx,
> > +			  obj->bind_op == TO_BIND ? "TO BIND" : "TO UNBIND",
> > +			  obj->handle, obj->offset,
> > +			  obj->size);
> > +
> > +		entry = malloc(sizeof(*entry));
> > +		entry->handle = obj->handle;
> > +		entry->offset = obj->offset;
> > +		entry->size = obj->size;
> > +		entry->bind_op = obj->bind_op == TO_BIND ? XE_OBJECT_BIND :
> > +							   XE_OBJECT_UNBIND;
> > +		entry->priv = obj;
> > +		igt_list_add(&entry->link, &obj_list);
> > +	}
> > +	pthread_mutex_unlock(&ainfo->bind_map_mutex);
> > +
> > +	xe_bind_unbind_async(ainfo->fd, ainfo->ctx, 0, &obj_list, sync_in, sync_out);
> 
> Shouldn't the second param be ainfo->vm, not ainfo->ctx?
> 

Yes, using the i915 ctx field as a vm reference is confusing. I'm going to
remove that unlucky ctx field.

> > +
> > +	pthread_mutex_lock(&ainfo->bind_map_mutex);
> > +	igt_list_for_each_entry_safe(entry, tmp, &obj_list, link) {
> > +		obj = entry->priv;
> > +		if (obj->bind_op == TO_BIND)
> > +			obj->bind_op = BOUND;
> > +		else
> > +			igt_map_remove(ainfo->bind_map, &obj->handle, map_entry_free_func);
> > +
> > +		igt_list_del(&entry->link);
> > +		free(entry);
> > +	}
> > +	pthread_mutex_unlock(&ainfo->bind_map_mutex);
> > +}
> > +
> > +/**
> > + * intel_allocator_bind:
> > + * @allocator_handle: handle to an allocator
> > + * @sync_in: syncobj (fence-in)
> > + * @sync_out: syncobj (fence-out)
> > + *
> > + * Function binds and unbinds all objects added to the allocator which weren't
> > + * previously binded/unbinded.
> > + *
> > + **/
> > +void intel_allocator_bind(uint64_t allocator_handle,
> > +			  uint32_t sync_in, uint32_t sync_out)
> > +{
> > +	struct ahnd_info *ainfo;
> > +
> > +	pthread_mutex_lock(&ahnd_map_mutex);
> > +	ainfo = igt_map_search(ahnd_map, &allocator_handle);
> > +	pthread_mutex_unlock(&ahnd_map_mutex);
> > +	igt_assert(ainfo);
> > +
> > +	/*
> > +	 * We collect bind/unbind operations on alloc()/free() to do group
> > +	 * operation getting @sync_in as syncobj handle (fence-in). If user
> > +	 * passes 0 as @sync_out we bind/unbind synchronously.
> > +	 */
> > +	__xe_op_bind(ainfo, sync_in, sync_out);
> > +}
> > +
> >   static int equal_handles(const void *key1, const void *key2)
> >   {
> >   	const struct handle_entry *h1 = key1, *h2 = key2;
> > @@ -1439,6 +1674,23 @@ static void __free_maps(struct igt_map *map, bool close_allocators)
> >   	igt_map_destroy(map, map_entry_free_func);
> >   }
> > +static void __free_ahnd_map(void)
> > +{
> > +	struct igt_map_entry *pos;
> > +	struct ahnd_info *ainfo;
> > +
> > +	if (!ahnd_map)
> > +		return;
> > +
> > +	igt_map_foreach(ahnd_map, pos) {
> > +		ainfo = pos->data;
> > +		igt_map_destroy(ainfo->bind_map, map_entry_free_func);
> > +	}
> > +
> > +	igt_map_destroy(ahnd_map, map_entry_free_func);
> > +}
> > +
> > +
> 
> ^- Whoops, an extra blank line
> 

Deleted in v3.

> >   /**
> >    * intel_allocator_init:
> >    *
> > @@ -1456,12 +1708,15 @@ void intel_allocator_init(void)
> >   	__free_maps(handles, true);
> >   	__free_maps(ctx_map, false);
> >   	__free_maps(vm_map, false);
> > +	__free_ahnd_map();
> >   	atomic_init(&next_handle, 1);
> >   	handles = igt_map_create(hash_handles, equal_handles);
> >   	ctx_map = igt_map_create(hash_instance, equal_ctx);
> >   	vm_map = igt_map_create(hash_instance, equal_vm);
> > -	igt_assert(handles && ctx_map && vm_map);
> > +	pthread_mutex_init(&ahnd_map_mutex, NULL);
> > +	ahnd_map = igt_map_create(igt_map_hash_64, igt_map_equal_64);
> > +	igt_assert(handles && ctx_map && vm_map && ahnd_map);
> >   	channel = intel_allocator_get_msgchannel(CHANNEL_SYSVIPC_MSGQUEUE);
> >   }
> > diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
> > index 1001b21b98..f9ff7f1cc9 100644
> > --- a/lib/intel_allocator.h
> > +++ b/lib/intel_allocator.h
> > @@ -141,9 +141,6 @@ struct intel_allocator {
> >   	/* allocator's private structure */
> >   	void *priv;
> > -	/* driver - i915 or Xe */
> > -	enum intel_driver driver;
> > -
> 
> OK, I see it now. Please drop 9/16 and use per-thread driver info then.
> 
> Many thanks,
> Karolina

Thank you for the time and review comments.

--
Zbigniew

> 
> >   	void (*get_address_range)(struct intel_allocator *ial,
> >   				  uint64_t *startp, uint64_t *endp);
> >   	uint64_t (*alloc)(struct intel_allocator *ial, uint32_t handle,
> > @@ -213,6 +210,9 @@ bool intel_allocator_reserve_if_not_allocated(uint64_t allocator_handle,
> >   void intel_allocator_print(uint64_t allocator_handle);
> > +void intel_allocator_bind(uint64_t allocator_handle,
> > +			  uint32_t sync_in, uint32_t sync_out);
> > +
> >   #define ALLOC_INVALID_ADDRESS (-1ull)
> >   #define INTEL_ALLOCATOR_NONE   0
> >   #define INTEL_ALLOCATOR_RELOC  1

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 10/16] lib/intel_allocator: Add intel_allocator_bind()
  2023-07-06 16:09     ` Zbigniew Kempczyński
@ 2023-07-07  8:01       ` Karolina Stolarek
  0 siblings, 0 replies; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-07  8:01 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 18:09, Zbigniew Kempczyński wrote:
> On Thu, Jul 06, 2023 at 03:02:29PM +0200, Karolina Stolarek wrote:
>> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
>>> Synchronize allocator state to vm.
>>>
>>> This change allows the xe user to execute vm-bind/unbind for allocator
>>> alloc()/free() operations which have occurred since the last
>>> binding/unbinding. Before doing an exec the user should call
>>> intel_allocator_bind() to ensure all vmas are in place.
>>>
>>> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
>>> ---
>>> v2: Rewrite the tracking mechanism: the previous code used a bind map
>>>       embedded in the allocator structure. Unfortunately this wasn't a
>>>       good idea: for xe binding everything was fine, but it regressed
>>>       multiprocess/multithreaded allocations. The main reason was that
>>>       child processes couldn't get a reference, as this memory was
>>>       allocated on the allocator thread (a separate process). Currently
>>>       each child holds its own tracking map for ahnds and, for each
>>>       ahnd, a bind map.
>>> ---
>>>    lib/igt_core.c        |   5 +
>>>    lib/intel_allocator.c | 259 +++++++++++++++++++++++++++++++++++++++++-
>>>    lib/intel_allocator.h |   6 +-
>>>    3 files changed, 265 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/lib/igt_core.c b/lib/igt_core.c
>>> index 3ee3a01c36..6286e97b1b 100644
>>> --- a/lib/igt_core.c
>>> +++ b/lib/igt_core.c
>>> @@ -74,6 +74,7 @@
>>>    #include "igt_sysrq.h"
>>>    #include "igt_rc.h"
>>>    #include "igt_list.h"
>>> +#include "igt_map.h"
>>>    #include "igt_device_scan.h"
>>>    #include "igt_thread.h"
>>>    #include "runnercomms.h"
>>> @@ -319,6 +320,8 @@ bool test_multi_fork_child;
>>>    /* For allocator purposes */
>>>    pid_t child_pid  = -1;
>>>    __thread pid_t child_tid  = -1;
>>> +struct igt_map *ahnd_map;
>>> +pthread_mutex_t ahnd_map_mutex;
>>>    enum {
>>>    	/*
>>> @@ -2509,6 +2512,8 @@ bool __igt_fork(void)
>>>    	case 0:
>>>    		test_child = true;
>>>    		pthread_mutex_init(&print_mutex, NULL);
>>> +		pthread_mutex_init(&ahnd_map_mutex, NULL);
>>> +		ahnd_map = igt_map_create(igt_map_hash_64, igt_map_equal_64);
>>>    		child_pid = getpid();
>>>    		child_tid = -1;
>>>    		exit_handler_count = 0;
>>> diff --git a/lib/intel_allocator.c b/lib/intel_allocator.c
>>> index 228b33b92f..02d3404abc 100644
>>> --- a/lib/intel_allocator.c
>>> +++ b/lib/intel_allocator.c
>>> @@ -17,6 +17,7 @@
>>>    #include "intel_allocator.h"
>>>    #include "intel_allocator_msgchannel.h"
>>>    #include "xe/xe_query.h"
>>> +#include "xe/xe_util.h"
>>>    //#define ALLOCDBG
>>
>> I know it has been here before, but do we want to keep that symbol as a
>> comment?
>>
> 
> Ok, I'm touching line above so I'll get rid of this one.

Thanks!

> 
>>>    #ifdef ALLOCDBG
>>> @@ -46,6 +47,14 @@ static inline const char *reqstr(enum reqtype request_type)
>>>    #define alloc_debug(...) {}
>>>    #endif
>>> +#ifdef ALLOCBINDDBG
>>> +#define bind_info igt_info
>>> +#define bind_debug igt_debug
>>> +#else
>>> +#define bind_info(...) {}
>>> +#define bind_debug(...) {}
>>> +#endif
>>> +
>>>    /*
>>>     * We limit allocator space to avoid hang when batch would be
>>>     * pinned in the last page.
>>> @@ -65,6 +74,31 @@ struct handle_entry {
>>>    	struct allocator *al;
>>>    };
>>> +/* For tracking alloc()/free() for Xe */
>>
>> Hmm, but it looks like we track it for both drivers, so that comment is
>> slightly confusing. I understand that we build the struct, but don't
>> actually use it with i915. The question is if we want to have it around with
>> i915.
>>
> 
> At the moment we enter the track_object() function, but we immediately find
> the ahnd info in the ahnd_map and, if the driver is i915, return. So
> alloc()/free() are not tracked. I decided to track the ahnd because a map
> lookup plus checking the driver field is faster than is_(xe|i915)_device().
> So the comment above is correct imo.

Right, by "tracking" I meant calling track_object(), even if it's a noop.
But I buy the speed-difference argument, so let's leave it as it is.

> 
>>> +struct ahnd_info {
>>> +	int fd;
>>> +	uint64_t ahnd;
>>> +	uint32_t ctx;
>>
>> I'm just thinking, given your 11/16 patch where you update intel_ctx_t
>> struct, could we store it here instead of just an id? Would it be useful to
>> have access to intel_ctx_cfg_t, if one was defined?
> 
> I don't want intel_allocator to depend on intel_ctx. But you're right, ctx
> and vm here are confusing. I'm going to keep only the 'vm' field, which
> will enforce the distinction between the two drivers. Expect this change in v3.

Sounds good, thanks

> 
>>
>>> +	uint32_t vm;
>>> +	enum intel_driver driver;
>>> +	struct igt_map *bind_map;
>>> +	pthread_mutex_t bind_map_mutex;
>>> +};
>>> +
>>> +enum allocator_bind_op {
>>> +	BOUND,
>>> +	TO_BIND,
>>> +	TO_UNBIND,
>>> +};
>>> +
>>> +struct allocator_object {
>>> +	uint32_t handle;
>>> +	uint64_t offset;
>>> +	uint64_t size;
>>> +
>>> +	enum allocator_bind_op bind_op;
>>> +};
>>> +
>>>    struct intel_allocator *
>>>    intel_allocator_reloc_create(int fd, uint64_t start, uint64_t end);
>>>    struct intel_allocator *
>>> @@ -123,6 +157,13 @@ static pid_t allocator_pid = -1;
>>>    extern pid_t child_pid;
>>>    extern __thread pid_t child_tid;
>>> +/*
>>> + * Track alloc()/free() requires storing in local process which has
>>> + * an access to real drm fd it can work on.
>>> + */
>>> +extern struct igt_map *ahnd_map;
>>> +extern pthread_mutex_t ahnd_map_mutex;
>>> +
>>>    /*
>>>     * - for parent process we have child_pid == -1
>>>     * - for child which calls intel_allocator_init() allocator_pid == child_pid
>>> @@ -318,7 +359,6 @@ static struct intel_allocator *intel_allocator_create(int fd,
>>>    	igt_assert(ial);
>>> -	ial->driver = get_intel_driver(fd);
>>
>> Please remember to drop that patch so we don't have such diff in v3.
> 
> Yes, I've dropped it already.
> 
>>
>>>    	ial->type = allocator_type;
>>>    	ial->strategy = allocator_strategy;
>>>    	ial->default_alignment = default_alignment;
>>> @@ -893,6 +933,46 @@ void intel_allocator_multiprocess_stop(void)
>>>    	}
>>>    }
>>> +static void track_ahnd(int fd, uint64_t ahnd, uint32_t ctx, uint32_t vm)
>>> +{
>>> +	struct ahnd_info *ainfo;
>>> +
>>> +	pthread_mutex_lock(&ahnd_map_mutex);
>>> +	ainfo = igt_map_search(ahnd_map, &ahnd);
>>> +	if (!ainfo) {
>>> +		ainfo = malloc(sizeof(*ainfo));
>>> +		ainfo->fd = fd;
>>> +		ainfo->ahnd = ahnd;
>>> +		ainfo->ctx = ctx;
>>> +		ainfo->vm = vm;
>>> +		ainfo->driver = get_intel_driver(fd);
>>> +		ainfo->bind_map = igt_map_create(igt_map_hash_32, igt_map_equal_32);
>>> +		pthread_mutex_init(&ainfo->bind_map_mutex, NULL);
>>> +		bind_debug("[TRACK AHND] pid: %d, tid: %d, create <fd: %d, "
>>> +			   "ahnd: %llx, ctx: %u, vm: %u, driver: %d, ahnd_map: %p, bind_map: %p>\n",
>>> +			   getpid(), gettid(), ainfo->fd,
>>> +			   (long long)ainfo->ahnd, ainfo->ctx, ainfo->vm,
>>> +			   ainfo->driver, ahnd_map, ainfo->bind_map);
>>> +		igt_map_insert(ahnd_map, &ainfo->ahnd, ainfo);
>>> +	}
>>> +
>>> +	pthread_mutex_unlock(&ahnd_map_mutex);
>>> +}
>>> +
>>> +static void untrack_ahnd(uint64_t ahnd)
>>> +{
>>> +	struct ahnd_info *ainfo;
>>> +
>>> +	pthread_mutex_lock(&ahnd_map_mutex);
>>> +	ainfo = igt_map_search(ahnd_map, &ahnd);
>>> +	if (ainfo) {
>>> +		bind_debug("[UNTRACK AHND]: pid: %d, tid: %d, removing ahnd: %llx\n",
>>> +			   getpid(), gettid(), (long long)ahnd);
>>> +		igt_map_remove(ahnd_map, &ahnd, map_entry_free_func);
>>> +	}
>>
>> Suggestion: I'd warn on !ainfo, we tried to untrack/free something that
>> wasn't tracked before.
>>
> 
> I don't want to warn. At the moment I don't treat closing an already-closed
> ahnd as a reason to warn. See the REQ_CLOSE path, where it doesn't find the
> allocator: it just skips and intel_allocator_close() returns false.

OK, I just checked that if we can't find an allocator, we'll warn about 
it. We _could_ warn once again in the untrack function, but that 
wouldn't bring too much value.

> I missed adding a test for this, but for example in
> api_intel_allocator@simple-alloc I could add a final line:
> 
> igt_assert_eq(intel_allocator_close(ahnd), true);
> +igt_assert_eq(intel_allocator_close(ahnd), false);
> 
> This would check that the allocator was already closed, but it would
> generate a warning, so cibuglog won't be happy. Maybe I should rethink this
> and disallow closing invalid ahnds? That would require strict discipline
> from the users, though.

I think we don't need to test for the warn; like you said, it would anger
the CI. As for closing an already-closed allocator, yeah, that sounds a bit
wrong, but there are no side effects coming from it, so I'd leave it as it
is. I wasn't aware that we already warn in allocator_close(), hence my
previous comment.

> 
>>> +	pthread_mutex_unlock(&ahnd_map_mutex);
>>> +}
>>> +
>>>    static uint64_t __intel_allocator_open_full(int fd, uint32_t ctx,
>>>    					    uint32_t vm,
>>>    					    uint64_t start, uint64_t end,
>>> @@ -951,6 +1031,8 @@ static uint64_t __intel_allocator_open_full(int fd, uint32_t ctx,
>>>    	igt_assert(resp.open.allocator_handle);
>>>    	igt_assert(resp.response_type == RESP_OPEN);
>>> +	track_ahnd(fd, resp.open.allocator_handle, ctx, vm);
>>> +
>>>    	return resp.open.allocator_handle;
>>>    }
>>> @@ -1057,6 +1139,8 @@ bool intel_allocator_close(uint64_t allocator_handle)
>>>    	igt_assert(handle_request(&req, &resp) == 0);
>>>    	igt_assert(resp.response_type == RESP_CLOSE);
>>> +	untrack_ahnd(allocator_handle);
>>> +
>>>    	return resp.close.is_empty;
>>>    }
>>> @@ -1090,6 +1174,76 @@ void intel_allocator_get_address_range(uint64_t allocator_handle,
>>>    		*endp = resp.address_range.end;
>>>    }
>>> +static bool is_same(struct allocator_object *obj,
>>> +		    uint32_t handle, uint64_t offset, uint64_t size,
>>> +		    enum allocator_bind_op bind_op)
>>> +{
>>> +	return obj->handle == handle &&	obj->offset == offset && obj->size == size &&
>>> +	       (obj->bind_op == bind_op || obj->bind_op == BOUND);
>>> +}
>>> +
>>> +static void track_object(uint64_t allocator_handle, uint32_t handle,
>>> +			 uint64_t offset, uint64_t size,
>>> +			 enum allocator_bind_op bind_op)
>>> +{
>>> +	struct ahnd_info *ainfo;
>>> +	struct allocator_object *obj;
>>> +
>>> +	bind_debug("[TRACK OBJECT]: [%s] pid: %d, tid: %d, ahnd: %llx, handle: %u, offset: %llx, size: %llx\n",
>>> +		   bind_op == TO_BIND ? "BIND" : "UNBIND",
>>> +		   getpid(), gettid(),
>>> +		   (long long)allocator_handle,
>>> +		   handle, (long long)offset, (long long)size);
>>> +
>>> +	if (offset == ALLOC_INVALID_ADDRESS) {
>>> +		bind_debug("[TRACK OBJECT] => invalid address %llx, skipping tracking\n",
>>> +			   (long long)offset);
>>
>> OK, we don't track ALLOC_INVALID_ADDRESS as it means that the allocation in
>> simple_vma_heap_alloc() failed, correct?
>>
> 
> Yes, if we cannot fit: the first allocation took the whole space, so the
> next one returns the invalid address.
> 
>>> +		return;
>>> +	}
>>> +
>>> +	pthread_mutex_lock(&ahnd_map_mutex);
>>> +	ainfo = igt_map_search(ahnd_map, &allocator_handle);
>>> +	pthread_mutex_unlock(&ahnd_map_mutex);
>>> +	if (!ainfo) {
>>> +		igt_warn("[TRACK OBJECT] => MISSING ahnd %llx <=\n", (long long)allocator_handle);
>>> +		igt_assert(ainfo);
>>> +	}
>>
>> Could we do igt_assert_f() instead?
>>
> 
> Yes, definitely. Left from previous debugging shape.
> 
>>> +
>>> +	if (ainfo->driver == INTEL_DRIVER_I915)
>>> +		return; /* no-op for i915, at least now */
>>
>> I wonder if we could move that tracking to the xe path for now. I mean,
>> maybe there will be some benefit of doing it for i915, but I can't see it,
>> at least for now.
>>
> 
> If I don't track ahnd_info (with ahnd as the key) I cannot retrieve the
> driver field and would need to call is_i915_device() or is_xe_device(),
> which in my opinion costs more than a simple O(1) map lookup.

Right, to reiterate, I'm fine with that approach, you've convinced me :)

> 
>>> +
>>> +	pthread_mutex_lock(&ainfo->bind_map_mutex);
>>> +	obj = igt_map_search(ainfo->bind_map, &handle);
>>> +	if (obj) {
>>> +		/*
>>> +		 * User may call alloc() couple of times, check object is the
>>> +		 * same. For free() there's simple case, just remove from
>>> +		 * bind_map.
>>> +		 */
>>> +		if (bind_op == TO_BIND)
>>> +			igt_assert_eq(is_same(obj, handle, offset, size, bind_op), true);
>>
>> Checkpatch.pl doesn't like the fact that you're not using braces both for if
>> and else if (I have no strong pereference)
>>
> 
> Sure, I'll add missing ones to make it happy.
> 
>>> +		else if (bind_op == TO_UNBIND) {
>>> +			if (obj->bind_op == TO_BIND)
>>> +				igt_map_remove(ainfo->bind_map, &obj->handle, map_entry_free_func);
>>> +			else if (obj->bind_op == BOUND)
>>> +				obj->bind_op = bind_op;
>>> +		}
>>> +	} else {
>>> +		/* Ignore to unbind bo which wasn't previously inserted */
>>> +		if (bind_op == TO_UNBIND)
>>> +			goto out;
>>> +
>>> +		obj = calloc(1, sizeof(*obj));
>>> +		obj->handle = handle;
>>> +		obj->offset = offset;
>>> +		obj->size = size;
>>> +		obj->bind_op = bind_op;
>>
>> We don't have to check here for bind_op == BOUND, because the only way to
>> get from to this state is to call __xe_op_bind(), and not alloc/free,
>> correct?
>>
> 
> Yes. The bind_map, which collects active object allocations/frees, is keyed
> on ahnd. I keep those allocations in the map to make this fast. alloc()
> adds an object to the map in the TO_BIND state, free() in the TO_UNBIND
> state, unless the user has already injected the offsets into the vm
> (__xe_op_bind()). Collecting objects in xe_object form allows
> xe_bind_unbind_async() to be allocator agnostic. The drawback of this code
> shape is that I need to do cleanups in the map (assign BOUND to objects
> which were bound and remove the unbound ones). So I keep an
> allocator_object reference in the xe_object priv field; walking the list
> and looking up entries in the map is much quicker. I may introduce a
> temporary map keeping the xe_object -> allocator_object mapping and then
> get rid of the 'priv' field.

Thank you for the exhaustive explanation. You can try the temp map, but
maybe as a later improvement/refactoring? The priv field itself isn't that
bad; I just asked about adding a comment to it so that users understand
what it is used for.

Many thanks,
Karolina

> 
>>> +		igt_map_insert(ainfo->bind_map, &obj->handle, obj);
>>> +	}
>>> +out:
>>> +	pthread_mutex_unlock(&ainfo->bind_map_mutex);
>>> +}
>>> +
>>>    /**
>>>     * __intel_allocator_alloc:
>>>     * @allocator_handle: handle to an allocator
>>> @@ -1121,6 +1275,8 @@ uint64_t __intel_allocator_alloc(uint64_t allocator_handle, uint32_t handle,
>>>    	igt_assert(handle_request(&req, &resp) == 0);
>>>    	igt_assert(resp.response_type == RESP_ALLOC);
>>> +	track_object(allocator_handle, handle, resp.alloc.offset, size, TO_BIND);
>>> +
>>>    	return resp.alloc.offset;
>>>    }
>>> @@ -1198,6 +1354,8 @@ bool intel_allocator_free(uint64_t allocator_handle, uint32_t handle)
>>>    	igt_assert(handle_request(&req, &resp) == 0);
>>>    	igt_assert(resp.response_type == RESP_FREE);
>>> +	track_object(allocator_handle, handle, 0, 0, TO_UNBIND);
>>> +
>>>    	return resp.free.freed;
>>>    }
>>> @@ -1382,6 +1540,83 @@ void intel_allocator_print(uint64_t allocator_handle)
>>>    	}
>>>    }
>>> +static void __xe_op_bind(struct ahnd_info *ainfo, uint32_t sync_in, uint32_t sync_out)
>>> +{
>>> +	struct allocator_object *obj;
>>> +	struct igt_map_entry *pos;
>>> +	struct igt_list_head obj_list;
>>> +	struct xe_object *entry, *tmp;
>>> +
>>> +	IGT_INIT_LIST_HEAD(&obj_list);
>>> +
>>> +	pthread_mutex_lock(&ainfo->bind_map_mutex);
>>> +	igt_map_foreach(ainfo->bind_map, pos) {
>>> +		obj = pos->data;
>>> +
>>> +		if (obj->bind_op == BOUND)
>>> +			continue;
>>> +
>>> +		bind_info("= [vm: %u] %s => %u %lx %lx\n",
>>> +			  ainfo->ctx,
>>> +			  obj->bind_op == TO_BIND ? "TO BIND" : "TO UNBIND",
>>> +			  obj->handle, obj->offset,
>>> +			  obj->size);
>>> +
>>> +		entry = malloc(sizeof(*entry));
>>> +		entry->handle = obj->handle;
>>> +		entry->offset = obj->offset;
>>> +		entry->size = obj->size;
>>> +		entry->bind_op = obj->bind_op == TO_BIND ? XE_OBJECT_BIND :
>>> +							   XE_OBJECT_UNBIND;
>>> +		entry->priv = obj;
>>> +		igt_list_add(&entry->link, &obj_list);
>>> +	}
>>> +	pthread_mutex_unlock(&ainfo->bind_map_mutex);
>>> +
>>> +	xe_bind_unbind_async(ainfo->fd, ainfo->ctx, 0, &obj_list, sync_in, sync_out);
>>
>> Shouldn't the second param be ainfo->vm, not ainfo->ctx?
>>
> 
> Yes, using the i915 ctx field as a vm reference is confusing. I'm going to
> remove that unlucky ctx field.
> 
>>> +
>>> +	pthread_mutex_lock(&ainfo->bind_map_mutex);
>>> +	igt_list_for_each_entry_safe(entry, tmp, &obj_list, link) {
>>> +		obj = entry->priv;
>>> +		if (obj->bind_op == TO_BIND)
>>> +			obj->bind_op = BOUND;
>>> +		else
>>> +			igt_map_remove(ainfo->bind_map, &obj->handle, map_entry_free_func);
>>> +
>>> +		igt_list_del(&entry->link);
>>> +		free(entry);
>>> +	}
>>> +	pthread_mutex_unlock(&ainfo->bind_map_mutex);
>>> +}
>>> +
>>> +/**
>>> + * intel_allocator_bind:
>>> + * @allocator_handle: handle to an allocator
>>> + * @sync_in: syncobj (fence-in)
>>> + * @sync_out: syncobj (fence-out)
>>> + *
>>> + * Function binds and unbinds all objects added to the allocator which weren't
>>> + * previously binded/unbinded.
>>> + *
>>> + **/
>>> +void intel_allocator_bind(uint64_t allocator_handle,
>>> +			  uint32_t sync_in, uint32_t sync_out)
>>> +{
>>> +	struct ahnd_info *ainfo;
>>> +
>>> +	pthread_mutex_lock(&ahnd_map_mutex);
>>> +	ainfo = igt_map_search(ahnd_map, &allocator_handle);
>>> +	pthread_mutex_unlock(&ahnd_map_mutex);
>>> +	igt_assert(ainfo);
>>> +
>>> +	/*
>>> +	 * We collect bind/unbind operations on alloc()/free() to do group
>>> +	 * operation getting @sync_in as syncobj handle (fence-in). If user
>>> +	 * passes 0 as @sync_out we bind/unbind synchronously.
>>> +	 */
>>> +	__xe_op_bind(ainfo, sync_in, sync_out);
>>> +}
>>> +
>>>    static int equal_handles(const void *key1, const void *key2)
>>>    {
>>>    	const struct handle_entry *h1 = key1, *h2 = key2;
>>> @@ -1439,6 +1674,23 @@ static void __free_maps(struct igt_map *map, bool close_allocators)
>>>    	igt_map_destroy(map, map_entry_free_func);
>>>    }
>>> +static void __free_ahnd_map(void)
>>> +{
>>> +	struct igt_map_entry *pos;
>>> +	struct ahnd_info *ainfo;
>>> +
>>> +	if (!ahnd_map)
>>> +		return;
>>> +
>>> +	igt_map_foreach(ahnd_map, pos) {
>>> +		ainfo = pos->data;
>>> +		igt_map_destroy(ainfo->bind_map, map_entry_free_func);
>>> +	}
>>> +
>>> +	igt_map_destroy(ahnd_map, map_entry_free_func);
>>> +}
>>> +
>>> +
>>
>> ^- Whoops, an extra blank line
>>
> 
> Deleted in v3.
> 
>>>    /**
>>>     * intel_allocator_init:
>>>     *
>>> @@ -1456,12 +1708,15 @@ void intel_allocator_init(void)
>>>    	__free_maps(handles, true);
>>>    	__free_maps(ctx_map, false);
>>>    	__free_maps(vm_map, false);
>>> +	__free_ahnd_map();
>>>    	atomic_init(&next_handle, 1);
>>>    	handles = igt_map_create(hash_handles, equal_handles);
>>>    	ctx_map = igt_map_create(hash_instance, equal_ctx);
>>>    	vm_map = igt_map_create(hash_instance, equal_vm);
>>> -	igt_assert(handles && ctx_map && vm_map);
>>> +	pthread_mutex_init(&ahnd_map_mutex, NULL);
>>> +	ahnd_map = igt_map_create(igt_map_hash_64, igt_map_equal_64);
>>> +	igt_assert(handles && ctx_map && vm_map && ahnd_map);
>>>    	channel = intel_allocator_get_msgchannel(CHANNEL_SYSVIPC_MSGQUEUE);
>>>    }
>>> diff --git a/lib/intel_allocator.h b/lib/intel_allocator.h
>>> index 1001b21b98..f9ff7f1cc9 100644
>>> --- a/lib/intel_allocator.h
>>> +++ b/lib/intel_allocator.h
>>> @@ -141,9 +141,6 @@ struct intel_allocator {
>>>    	/* allocator's private structure */
>>>    	void *priv;
>>> -	/* driver - i915 or Xe */
>>> -	enum intel_driver driver;
>>> -
>>
>> OK, I see it now. Please drop 9/16 and use per-thread driver info then.
>>
>> Many thanks,
>> Karolina
> 
> Thank you for the time and review comments.
> 
> --
> Zbigniew
> 
>>
>>>    	void (*get_address_range)(struct intel_allocator *ial,
>>>    				  uint64_t *startp, uint64_t *endp);
>>>    	uint64_t (*alloc)(struct intel_allocator *ial, uint32_t handle,
>>> @@ -213,6 +210,9 @@ bool intel_allocator_reserve_if_not_allocated(uint64_t allocator_handle,
>>>    void intel_allocator_print(uint64_t allocator_handle);
>>> +void intel_allocator_bind(uint64_t allocator_handle,
>>> +			  uint32_t sync_in, uint32_t sync_out);
>>> +
>>>    #define ALLOC_INVALID_ADDRESS (-1ull)
>>>    #define INTEL_ALLOCATOR_NONE   0
>>>    #define INTEL_ALLOCATOR_RELOC  1

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 11/16] lib/intel_ctx: Add xe context information
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 11/16] lib/intel_ctx: Add xe context information Zbigniew Kempczyński
@ 2023-07-07  8:31   ` Karolina Stolarek
  2023-07-11  9:06     ` Zbigniew Kempczyński
  0 siblings, 1 reply; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-07  8:31 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> The most complicated part of adapting i915_blt to intel_blt - which
> should handle both drivers - is how to achieve pipelined execution. By
> pipelined execution I mean that all GPU workloads are executed without
> stalls.
> 
> Compared to i915 relocations and softpinning, the Xe architecture
> migrates binding (and thus also unbinding) responsibility from the
> kernel to the user via the vm_bind ioctl(). To avoid stalls the user
> has to provide in/out fences (syncobjs) between consecutive
> bindings/execs. Of course, for many IGT tests we don't need pipelined
> execution, just a synchronous bind followed by an exec. But exercising
> the driver should also cover pipelining to verify it is possible to
> work without stalls.
> 
> I decided to extend intel_ctx_t with all objects necessary for Xe
> (vm, engine, syncobjs) to get flexibility in deciding how to bind,
> execute and wait for (synchronize) those operations. The context
> object, along with the i915 engine, is already passed to the blitter
> library, so adding the Xe-required fields doesn't break i915 but
> allows the Xe path to get all the data necessary to execute.
> 
> Using intel_ctx with Xe requires some code patterns caused by the use
> of an allocator. For Xe the allocator started tracking alloc()/free()
> operations in order to do bind/unbind in one call just before
> execution. I've added two helpers to intel_ctx - intel_ctx_xe_exec()
> and intel_ctx_xe_sync(). Depending on how the intel_ctx was created
> (with 0 or real syncobj handles as in/bind/out fences), bind and exec
> in intel_ctx_xe_exec() are either pipelined with a synchronizing wait
> on the last operation (exec), or, for real syncobjs, joined via the
> fences with no wait for exec (sync-out) completion. The latter allows
> building further cascaded bind + exec operations without stalls.
> 
> To wait for a sync-out fence the caller may use intel_ctx_xe_sync(),
> which is a synchronous wait on the syncobj. It also allows the user to
> reset the fences to prepare for the next operation.
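
The described flow can be sketched as a small stand-alone model. The
intel_ctx_t fields and the intel_ctx_xe() constructor mirror the patch,
but syncobj creation is a hypothetical stub (the real helper lives in
lib/igt_syncobj.h), so this compiles without IGT:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for syncobj_create() -- hypothetical stub, not the real API. */
static uint32_t syncobj_create_stub(int fd)
{
	static uint32_t next = 1;

	(void)fd;
	return next++;
}

typedef struct intel_ctx {
	int fd;
	uint32_t vm, engine;
	uint32_t sync_in, sync_bind, sync_out;
} intel_ctx_t;

/* Mirrors intel_ctx_xe() from the patch: caches vm/engine/fences only. */
static intel_ctx_t *intel_ctx_xe(int fd, uint32_t vm, uint32_t engine,
				 uint32_t sync_in, uint32_t sync_bind,
				 uint32_t sync_out)
{
	intel_ctx_t *ctx = calloc(1, sizeof(*ctx));

	assert(ctx);
	ctx->fd = fd;
	ctx->vm = vm;
	ctx->engine = engine;
	ctx->sync_in = sync_in;
	ctx->sync_bind = sync_bind;
	ctx->sync_out = sync_out;
	return ctx;
}

/*
 * With real (non-zero) bind/out fences the exec is pipelined: bind
 * signals sync_bind, exec waits on it and signals sync_out, and nothing
 * blocks until the caller explicitly syncs.
 */
static bool exec_is_pipelined(const intel_ctx_t *ctx)
{
	return ctx->sync_bind && ctx->sync_out;
}
```

With zeroed fences the patch's __intel_ctx_xe_exec() instead creates
throwaway syncobjs and waits synchronously on sync-out, matching its
`!ctx->sync_bind || !ctx->sync_out` branch.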
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>   lib/intel_ctx.c | 110 +++++++++++++++++++++++++++++++++++++++++++++++-
>   lib/intel_ctx.h |  14 ++++++
>   2 files changed, 123 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/intel_ctx.c b/lib/intel_ctx.c
> index ded9c0f1e4..f210907fac 100644
> --- a/lib/intel_ctx.c
> +++ b/lib/intel_ctx.c
> @@ -5,9 +5,12 @@
>   
>   #include <stddef.h>
>   
> +#include "i915/gem_engine_topology.h"
> +#include "igt_syncobj.h"
> +#include "intel_allocator.h"
>   #include "intel_ctx.h"
>   #include "ioctl_wrappers.h"
> -#include "i915/gem_engine_topology.h"
> +#include "xe/xe_ioctl.h"
>   
>   /**
>    * SECTION:intel_ctx
> @@ -390,3 +393,108 @@ unsigned int intel_ctx_engine_class(const intel_ctx_t *ctx, unsigned int engine)
>   {
>   	return intel_ctx_cfg_engine_class(&ctx->cfg, engine);
>   }
> +
> +/**
> + * intel_ctx_xe:
> + * @fd: open i915 drm file descriptor
> + * @vm: vm
> + * @engine: engine
> + *
> + * Returns an intel_ctx_t representing the xe context.
> + */
> +intel_ctx_t *intel_ctx_xe(int fd, uint32_t vm, uint32_t engine,
> +			  uint32_t sync_in, uint32_t sync_bind, uint32_t sync_out)
> +{
> +	intel_ctx_t *ctx;
> +
> +	ctx = calloc(1, sizeof(*ctx));
> +	igt_assert(ctx);
> +
> +	ctx->fd = fd;
> +	ctx->vm = vm;
> +	ctx->engine = engine;
> +	ctx->sync_in = sync_in;
> +	ctx->sync_bind = sync_bind;
> +	ctx->sync_out = sync_out;
> +
> +	return ctx;
> +}
> +
> +static int __xe_exec(int fd, struct drm_xe_exec *exec)
> +{
> +	int err = 0;
> +
> +	if (igt_ioctl(fd, DRM_IOCTL_XE_EXEC, exec)) {
> +		err = -errno;
> +		igt_assume(err != 0);

Wouldn't "igt_assume(err)" be enough?

> +	}
> +	errno = 0;
> +	return err;
> +}

I'm aware that it's a helper that you use in other execs, but it feels 
out of place here, as it doesn't deal with intel_ctx_t. Maybe xe_util 
could be its new home?

> +
> +int __intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset)
> +{
> +	struct drm_xe_sync syncs[2] = {
> +		{ .flags = DRM_XE_SYNC_SYNCOBJ, },
> +		{ .flags = DRM_XE_SYNC_SYNCOBJ | DRM_XE_SYNC_SIGNAL, },
> +	};
> +	struct drm_xe_exec exec = {
> +		.engine_id = ctx->engine,
> +		.syncs = (uintptr_t)syncs,
> +		.num_syncs = 2,
> +		.address = bb_offset,
> +		.num_batch_buffer = 1,
> +	};
> +	uint32_t sync_in = ctx->sync_in;
> +	uint32_t sync_bind = ctx->sync_bind ?: syncobj_create(ctx->fd, 0);
> +	uint32_t sync_out = ctx->sync_out ?: syncobj_create(ctx->fd, 0);
> +	int ret;
> +
> +	/* Synchronize allocator state -> vm */
> +	intel_allocator_bind(ahnd, sync_in, sync_bind);
> +
> +	/* Pipelined exec */
> +	syncs[0].handle = sync_bind;
> +	syncs[1].handle = sync_out;
> +
> +	ret = __xe_exec(ctx->fd, &exec);
> +	if (ret)
> +		goto err;
> +
> +	if (!ctx->sync_bind || !ctx->sync_out)
> +		syncobj_wait_err(ctx->fd, &sync_out, 1, INT64_MAX, 0);

This whole flow is so nice and tidy, I like it

> +
> +err:
> +	if (!ctx->sync_bind)
> +		syncobj_destroy(ctx->fd, sync_bind);
> +
> +	if (!ctx->sync_out)
> +		syncobj_destroy(ctx->fd, sync_out);
> +
> +	return ret;
> +}
> +
> +void intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset)
> +{
> +	igt_assert_eq(__intel_ctx_xe_exec(ctx, ahnd, bb_offset), 0);
> +}
> +
> +#define RESET_SYNCOBJ(__fd, __sync) do { \
> +	if (__sync) \
> +		syncobj_reset((__fd), &(__sync), 1); \
> +} while (0)
> +
> +int intel_ctx_xe_sync(intel_ctx_t *ctx, bool reset_syncs)
> +{
> +	int ret;
> +
> +	ret = syncobj_wait_err(ctx->fd, &ctx->sync_out, 1, INT64_MAX, 0);
> +
> +	if (reset_syncs) {
> +		RESET_SYNCOBJ(ctx->fd, ctx->sync_in);
> +		RESET_SYNCOBJ(ctx->fd, ctx->sync_bind);
> +		RESET_SYNCOBJ(ctx->fd, ctx->sync_out);
> +	}

Is there a use case where we want to do a synced execution without 
resetting the syncobjs?

> +
> +	return ret;
> +}
> diff --git a/lib/intel_ctx.h b/lib/intel_ctx.h
> index 3cfeaae81e..59d0360ada 100644
> --- a/lib/intel_ctx.h
> +++ b/lib/intel_ctx.h
> @@ -67,6 +67,14 @@ int intel_ctx_cfg_engine_class(const intel_ctx_cfg_t *cfg, unsigned int engine);
>   typedef struct intel_ctx {
>   	uint32_t id;
>   	intel_ctx_cfg_t cfg;
> +
> +	/* Xe */
> +	int fd;
> +	uint32_t vm;
> +	uint32_t engine;
> +	uint32_t sync_in;
> +	uint32_t sync_bind;
> +	uint32_t sync_out;

Hmm, I wonder if we could wrap it in a struct. Yes, it would be painful 
to unpack, but now it feels like we've just added a bunch of fields that 
are irrelevant 80% of the time. Instead, we could have one additional 
field that could be NULL, and use it if it's initialized.
But maybe I'm just being too nit-picky.
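
A quick sketch of that idea (all the names here are hypothetical, just
to illustrate the shape of the suggestion, with the i915 fields elided):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper for the Xe-only state; not part of the patch. */
struct intel_ctx_xe_cfg {
	int fd;
	uint32_t vm, engine;
	uint32_t sync_in, sync_bind, sync_out;
};

typedef struct intel_ctx {
	uint32_t id;
	/* cfg and the other i915 fields elided for brevity */

	/* NULL on i915; allocated and filled only by the Xe constructor. */
	struct intel_ctx_xe_cfg *xe;
} intel_ctx_t;

static intel_ctx_t *ctx_create_i915(uint32_t id)
{
	intel_ctx_t *ctx = calloc(1, sizeof(*ctx));

	assert(ctx);
	ctx->id = id;
	return ctx; /* ctx->xe stays NULL */
}

static intel_ctx_t *ctx_create_xe(int fd, uint32_t vm, uint32_t engine)
{
	intel_ctx_t *ctx = calloc(1, sizeof(*ctx));

	assert(ctx);
	ctx->xe = calloc(1, sizeof(*ctx->xe));
	assert(ctx->xe);
	ctx->xe->fd = fd;
	ctx->xe->vm = vm;
	ctx->xe->engine = engine;
	return ctx;
}
```

The Xe-only helpers could then assert `ctx->xe` up front instead of
every caller carrying six mostly-unused fields.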

All the best,
Karolina

>   } intel_ctx_t;
>   
>   int __intel_ctx_create(int fd, const intel_ctx_cfg_t *cfg,
> @@ -81,4 +89,10 @@ void intel_ctx_destroy(int fd, const intel_ctx_t *ctx);
>   
>   unsigned int intel_ctx_engine_class(const intel_ctx_t *ctx, unsigned int engine);
>   
> +intel_ctx_t *intel_ctx_xe(int fd, uint32_t vm, uint32_t engine,
> +			  uint32_t sync_in, uint32_t sync_bind, uint32_t sync_out);
> +int __intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset);
> +void intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset);
> +int intel_ctx_xe_sync(intel_ctx_t *ctx, bool reset_syncs);
> +
>   #endif


* Re: [igt-dev] [PATCH i-g-t v2 12/16] lib/intel_blt: Introduce blt_copy_init() helper to cache driver
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 12/16] lib/intel_blt: Introduce blt_copy_init() helper to cache driver Zbigniew Kempczyński
@ 2023-07-07  8:51   ` Karolina Stolarek
  2023-07-11  9:23     ` Zbigniew Kempczyński
  0 siblings, 1 reply; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-07  8:51 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> Instead of calling is_xe_device() and is_i915_device() multiple
> times in code which distinguishes between the Xe and i915 paths, add
> a driver field to the structures used in the blitter library.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>   lib/igt_fb.c                   |  2 +-
>   lib/intel_blt.c                | 40 +++++++++++++++++++++++++++++++---
>   lib/intel_blt.h                |  8 ++++++-
>   tests/i915/gem_ccs.c           | 34 ++++++++++++++++-------------
>   tests/i915/gem_exercise_blt.c  | 22 ++++++++++---------
>   tests/i915/gem_lmem_swapping.c |  4 ++--
>   6 files changed, 78 insertions(+), 32 deletions(-)
> 
> diff --git a/lib/igt_fb.c b/lib/igt_fb.c
> index a8988274f2..1814e8db11 100644
> --- a/lib/igt_fb.c
> +++ b/lib/igt_fb.c
> @@ -2900,7 +2900,7 @@ static void blitcopy(const struct igt_fb *dst_fb,
>   			src = blt_fb_init(src_fb, i, mem_region);
>   			dst = blt_fb_init(dst_fb, i, mem_region);
>   
> -			memset(&blt, 0, sizeof(blt));
> +			blt_copy_init(src_fb->fd, &blt);
>   			blt.color_depth = blt_get_bpp(src_fb);
>   			blt_set_copy_object(&blt.src, src);
>   			blt_set_copy_object(&blt.dst, dst);
> diff --git a/lib/intel_blt.c b/lib/intel_blt.c
> index bc28f15e8d..f2f86e4947 100644
> --- a/lib/intel_blt.c
> +++ b/lib/intel_blt.c
> @@ -692,6 +692,22 @@ static void dump_bb_ext(struct gen12_block_copy_data_ext *data)
>   		 data->dw21.src_array_index);
>   }
>   
> +/**
> + * blt_copy_init:
> + * @fd: drm fd
> + * @blt: structure for initialization
> + *
> + * Function is zeroing @blt and sets fd and driver fields (INTEL_DRIVER_I915 or
> + * INTEL_DRIVER_XE).
> + */
> +void blt_copy_init(int fd, struct blt_copy_data *blt)
> +{
> +	memset(blt, 0, sizeof(*blt));
> +
> +	blt->fd = fd;
> +	blt->driver = get_intel_driver(fd);
> +}
> +
>   /**
>    * emit_blt_block_copy:
>    * @fd: drm fd
> @@ -889,6 +905,22 @@ static void dump_bb_surf_ctrl_cmd(const struct gen12_ctrl_surf_copy_data *data)
>   		 cmd[4], data->dw04.dst_address_hi, data->dw04.dst_mocs);
>   }
>   
> +/**
> + * blt_ctrl_surf_copy_init:
> + * @fd: drm fd
> + * @surf: structure for initialization
> + *
> + * Function is zeroing @surf and sets fd and driver fields (INTEL_DRIVER_I915 or
> + * INTEL_DRIVER_XE).
> + */
> +void blt_ctrl_surf_copy_init(int fd, struct blt_ctrl_surf_copy_data *surf)
> +{
> +	memset(surf, 0, sizeof(*surf));
> +
> +	surf->fd = fd;
> +	surf->driver = get_intel_driver(fd);
> +}
> +
>   /**
>    * emit_blt_ctrl_surf_copy:
>    * @fd: drm fd
> @@ -1317,7 +1349,7 @@ void blt_set_batch(struct blt_copy_batch *batch,
>   }
>   
>   struct blt_copy_object *
> -blt_create_object(int fd, uint32_t region,
> +blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>   		  uint32_t width, uint32_t height, uint32_t bpp, uint8_t mocs,
>   		  enum blt_tiling_type tiling,
>   		  enum blt_compression compression,
> @@ -1329,10 +1361,12 @@ blt_create_object(int fd, uint32_t region,
>   	uint32_t stride = tiling == T_LINEAR ? width * 4 : width;
>   	uint32_t handle;
>   
> +	igt_assert_f(blt->driver, "Driver isn't set, have you called blt_copy_init()?\n");
> +
>   	obj = calloc(1, sizeof(*obj));
>   
>   	obj->size = size;
> -	igt_assert(__gem_create_in_memory_regions(fd, &handle,
> +	igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
>   						  &size, region) == 0);
>   
>   	blt_set_object(obj, handle, size, region, mocs, tiling,
> @@ -1340,7 +1374,7 @@ blt_create_object(int fd, uint32_t region,
>   	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
>   
>   	if (create_mapping)
> -		obj->ptr = gem_mmap__device_coherent(fd, handle, 0, size,
> +		obj->ptr = gem_mmap__device_coherent(blt->fd, handle, 0, size,
>   						     PROT_READ | PROT_WRITE);
>   
>   	return obj;
> diff --git a/lib/intel_blt.h b/lib/intel_blt.h
> index 9c4ddc7a89..7516ce8ac7 100644
> --- a/lib/intel_blt.h
> +++ b/lib/intel_blt.h
> @@ -102,6 +102,7 @@ struct blt_copy_batch {
>   /* Common for block-copy and fast-copy */
>   struct blt_copy_data {
>   	int fd;
> +	enum intel_driver driver;
>   	struct blt_copy_object src;
>   	struct blt_copy_object dst;
>   	struct blt_copy_batch bb;
> @@ -155,6 +156,7 @@ struct blt_ctrl_surf_copy_object {
>   
>   struct blt_ctrl_surf_copy_data {
>   	int fd;
> +	enum intel_driver driver;
>   	struct blt_ctrl_surf_copy_object src;
>   	struct blt_ctrl_surf_copy_object dst;
>   	struct blt_copy_batch bb;
> @@ -185,6 +187,8 @@ bool blt_uses_extended_block_copy(int fd);
>   
>   const char *blt_tiling_name(enum blt_tiling_type tiling);
>   
> +void blt_copy_init(int fd, struct blt_copy_data *blt);
> +
>   uint64_t emit_blt_block_copy(int fd,
>   			     uint64_t ahnd,
>   			     const struct blt_copy_data *blt,
> @@ -205,6 +209,8 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>   				 uint64_t bb_pos,
>   				 bool emit_bbe);
>   
> +void blt_ctrl_surf_copy_init(int fd, struct blt_ctrl_surf_copy_data *surf);
> +
>   int blt_ctrl_surf_copy(int fd,
>   		       const intel_ctx_t *ctx,
>   		       const struct intel_execution_engine2 *e,
> @@ -230,7 +236,7 @@ void blt_set_batch(struct blt_copy_batch *batch,
>   		   uint32_t handle, uint64_t size, uint32_t region);
>   
>   struct blt_copy_object *
> -blt_create_object(int fd, uint32_t region,
> +blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>   		  uint32_t width, uint32_t height, uint32_t bpp, uint8_t mocs,
>   		  enum blt_tiling_type tiling,
>   		  enum blt_compression compression,
> diff --git a/tests/i915/gem_ccs.c b/tests/i915/gem_ccs.c
> index f9ad9267df..d9d785ed9b 100644
> --- a/tests/i915/gem_ccs.c
> +++ b/tests/i915/gem_ccs.c
> @@ -167,7 +167,7 @@ static void surf_copy(int i915,
>   	ccs = gem_create(i915, ccssize);
>   	ccs2 = gem_create(i915, ccssize);
>   
> -	surf.fd = i915;
> +	blt_ctrl_surf_copy_init(i915, &surf);
>   	surf.print_bb = param.print_bb;
>   	set_surf_object(&surf.src, mid->handle, mid->region, mid->size,
>   			uc_mocs, BLT_INDIRECT_ACCESS);
> @@ -219,7 +219,7 @@ static void surf_copy(int i915,
>   			uc_mocs, INDIRECT_ACCESS);
>   	blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
>   
> -	memset(&blt, 0, sizeof(blt));
> +	blt_copy_init(i915, &blt);
>   	blt.color_depth = CD_32bit;
>   	blt.print_bb = param.print_bb;
>   	blt_set_copy_object(&blt.src, mid);
> @@ -310,7 +310,7 @@ static int blt_block_copy3(int i915,
>   	bb_offset = get_offset(ahnd, blt3->bb.handle, blt3->bb.size, alignment);
>   
>   	/* First blit src -> mid */
> -	memset(&blt0, 0, sizeof(blt0));
> +	blt_copy_init(i915, &blt0);
>   	blt0.src = blt3->src;
>   	blt0.dst = blt3->mid;
>   	blt0.bb = blt3->bb;
> @@ -321,7 +321,7 @@ static int blt_block_copy3(int i915,
>   	bb_pos = emit_blt_block_copy(i915, ahnd, &blt0, &ext0, bb_pos, false);
>   
>   	/* Second blit mid -> dst */
> -	memset(&blt0, 0, sizeof(blt0));
> +	blt_copy_init(i915, &blt0);
>   	blt0.src = blt3->mid;
>   	blt0.dst = blt3->dst;
>   	blt0.bb = blt3->bb;
> @@ -332,7 +332,7 @@ static int blt_block_copy3(int i915,
>   	bb_pos = emit_blt_block_copy(i915, ahnd, &blt0, &ext0, bb_pos, false);
>   
>   	/* Third blit dst -> final */
> -	memset(&blt0, 0, sizeof(blt0));
> +	blt_copy_init(i915, &blt0);
>   	blt0.src = blt3->dst;
>   	blt0.dst = blt3->final;
>   	blt0.bb = blt3->bb;
> @@ -390,11 +390,13 @@ static void block_copy(int i915,
>   	if (!blt_uses_extended_block_copy(i915))
>   		pext = NULL;
>   
> -	src = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
> +	blt_copy_init(i915, &blt);
> +
> +	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
>   				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> -	mid = blt_create_object(i915, mid_region, width, height, bpp, uc_mocs,
> +	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
>   				mid_tiling, mid_compression, comp_type, true);
> -	dst = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
> +	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
>   				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
>   	igt_assert(src->size == dst->size);
>   	PRINT_SURFACE_INFO("src", src);
> @@ -404,7 +406,6 @@ static void block_copy(int i915,
>   	blt_surface_fill_rect(i915, src, width, height);
>   	WRITE_PNG(i915, run_id, "src", src, width, height);
>   
> -	memset(&blt, 0, sizeof(blt));
>   	blt.color_depth = CD_32bit;
>   	blt.print_bb = param.print_bb;
>   	blt_set_copy_object(&blt.src, src);
> @@ -449,7 +450,7 @@ static void block_copy(int i915,
>   		}
>   	}
>   
> -	memset(&blt, 0, sizeof(blt));
> +	blt_copy_init(i915, &blt);
>   	blt.color_depth = CD_32bit;
>   	blt.print_bb = param.print_bb;
>   	blt_set_copy_object(&blt.src, mid);
> @@ -486,6 +487,7 @@ static void block_multicopy(int i915,
>   			    const struct test_config *config)
>   {
>   	struct blt_copy3_data blt3 = {};
> +	struct blt_copy_data blt = {};
>   	struct blt_block_copy3_data_ext ext3 = {}, *pext3 = &ext3;
>   	struct blt_copy_object *src, *mid, *dst, *final;
>   	const uint32_t bpp = 32;
> @@ -505,13 +507,16 @@ static void block_multicopy(int i915,
>   	if (!blt_uses_extended_block_copy(i915))
>   		pext3 = NULL;
>   
> -	src = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
> +	/* For object creation */
> +	blt_copy_init(i915, &blt);
> +
> +	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
>   				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> -	mid = blt_create_object(i915, mid_region, width, height, bpp, uc_mocs,
> +	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
>   				mid_tiling, mid_compression, comp_type, true);
> -	dst = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
> +	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
>   				mid_tiling, COMPRESSION_DISABLED, comp_type, true);
> -	final = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
> +	final = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
>   				  T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
>   	igt_assert(src->size == dst->size);
>   	PRINT_SURFACE_INFO("src", src);
> @@ -521,7 +526,6 @@ static void block_multicopy(int i915,
>   
>   	blt_surface_fill_rect(i915, src, width, height);
>   
> -	memset(&blt3, 0, sizeof(blt3));

We don't init blt3 because we don't need to init the blt copy objects, 
and we have fd/driver passed in already, correct?

Overall, the patch looks good to me:

Reviewed-by: Karolina Stolarek <karolina.stolarek@intel.com>

>   	blt3.color_depth = CD_32bit;
>   	blt3.print_bb = param.print_bb;
>   	blt_set_copy_object(&blt3.src, src);
> diff --git a/tests/i915/gem_exercise_blt.c b/tests/i915/gem_exercise_blt.c
> index 0cd1820430..7355eabbe9 100644
> --- a/tests/i915/gem_exercise_blt.c
> +++ b/tests/i915/gem_exercise_blt.c
> @@ -89,7 +89,7 @@ static int fast_copy_one_bb(int i915,
>   	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>   
>   	/* First blit */
> -	memset(&blt_tmp, 0, sizeof(blt_tmp));
> +	blt_copy_init(i915, &blt_tmp);
>   	blt_tmp.src = blt->src;
>   	blt_tmp.dst = blt->mid;
>   	blt_tmp.bb = blt->bb;
> @@ -98,7 +98,7 @@ static int fast_copy_one_bb(int i915,
>   	bb_pos = emit_blt_fast_copy(i915, ahnd, &blt_tmp, bb_pos, false);
>   
>   	/* Second blit */
> -	memset(&blt_tmp, 0, sizeof(blt_tmp));
> +	blt_copy_init(i915, &blt_tmp);
>   	blt_tmp.src = blt->mid;
>   	blt_tmp.dst = blt->dst;
>   	blt_tmp.bb = blt->bb;
> @@ -140,6 +140,7 @@ static void fast_copy_emit(int i915, const intel_ctx_t *ctx,
>   			   uint32_t region1, uint32_t region2,
>   			   enum blt_tiling_type mid_tiling)
>   {
> +	struct blt_copy_data bltinit = {};
>   	struct blt_fast_copy_data blt = {};
>   	struct blt_copy_object *src, *mid, *dst;
>   	const uint32_t bpp = 32;
> @@ -152,11 +153,12 @@ static void fast_copy_emit(int i915, const intel_ctx_t *ctx,
>   
>   	igt_assert(__gem_create_in_memory_regions(i915, &bb, &bb_size, region1) == 0);
>   
> -	src = blt_create_object(i915, region1, width, height, bpp, 0,
> +	blt_copy_init(i915, &bltinit);
> +	src = blt_create_object(&bltinit, region1, width, height, bpp, 0,
>   				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> -	mid = blt_create_object(i915, region2, width, height, bpp, 0,
> +	mid = blt_create_object(&bltinit, region2, width, height, bpp, 0,
>   				mid_tiling, COMPRESSION_DISABLED, 0, true);
> -	dst = blt_create_object(i915, region1, width, height, bpp, 0,
> +	dst = blt_create_object(&bltinit, region1, width, height, bpp, 0,
>   				T_LINEAR, COMPRESSION_DISABLED, 0, true);
>   	igt_assert(src->size == dst->size);
>   
> @@ -212,17 +214,17 @@ static void fast_copy(int i915, const intel_ctx_t *ctx,
>   
>   	igt_assert(__gem_create_in_memory_regions(i915, &bb, &bb_size, region1) == 0);
>   
> -	src = blt_create_object(i915, region1, width, height, bpp, 0,
> +	blt_copy_init(i915, &blt);
> +	src = blt_create_object(&blt, region1, width, height, bpp, 0,
>   				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> -	mid = blt_create_object(i915, region2, width, height, bpp, 0,
> +	mid = blt_create_object(&blt, region2, width, height, bpp, 0,
>   				mid_tiling, COMPRESSION_DISABLED, 0, true);
> -	dst = blt_create_object(i915, region1, width, height, bpp, 0,
> +	dst = blt_create_object(&blt, region1, width, height, bpp, 0,
>   				T_LINEAR, COMPRESSION_DISABLED, 0, true);
>   	igt_assert(src->size == dst->size);
>   
>   	blt_surface_fill_rect(i915, src, width, height);
>   
> -	memset(&blt, 0, sizeof(blt));
>   	blt.color_depth = CD_32bit;
>   	blt.print_bb = param.print_bb;
>   	blt_set_copy_object(&blt.src, src);
> @@ -235,7 +237,7 @@ static void fast_copy(int i915, const intel_ctx_t *ctx,
>   	WRITE_PNG(i915, mid_tiling, "src", &blt.src, width, height);
>   	WRITE_PNG(i915, mid_tiling, "mid", &blt.dst, width, height);
>   
> -	memset(&blt, 0, sizeof(blt));
> +	blt_copy_init(i915, &blt);
>   	blt.color_depth = CD_32bit;
>   	blt.print_bb = param.print_bb;
>   	blt_set_copy_object(&blt.src, mid);
> diff --git a/tests/i915/gem_lmem_swapping.c b/tests/i915/gem_lmem_swapping.c
> index 83dbebec83..2921de8f9f 100644
> --- a/tests/i915/gem_lmem_swapping.c
> +++ b/tests/i915/gem_lmem_swapping.c
> @@ -308,7 +308,7 @@ init_object_ccs(int i915, struct object *obj, struct blt_copy_object *tmp,
>   		buf[j] = seed++;
>   	munmap(buf, obj->size);
>   
> -	memset(&blt, 0, sizeof(blt));
> +	blt_copy_init(i915, &blt);
>   	blt.color_depth = CD_32bit;
>   
>   	memcpy(&blt.src, tmp, sizeof(blt.src));
> @@ -366,7 +366,7 @@ verify_object_ccs(int i915, const struct object *obj,
>   	cmd->handle = gem_create_from_pool(i915, &size, region);
>   	blt_set_batch(cmd, cmd->handle, size, region);
>   
> -	memset(&blt, 0, sizeof(blt));
> +	blt_copy_init(i915, &blt);
>   	blt.color_depth = CD_32bit;
>   
>   	memcpy(&blt.src, obj->blt_obj, sizeof(blt.src));


* Re: [igt-dev] [PATCH i-g-t v2 13/16] lib/intel_blt: Extend blitter library to support xe driver
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 13/16] lib/intel_blt: Extend blitter library to support xe driver Zbigniew Kempczyński
@ 2023-07-07  9:26   ` Karolina Stolarek
  2023-07-11 10:16     ` Zbigniew Kempczyński
  0 siblings, 1 reply; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-07  9:26 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> Use the blitter library, already written for i915, in Xe development.
> Add appropriate code paths for the parts which are unique to each
> driver.

I'm excited about this one :)

> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>   lib/intel_blt.c | 256 ++++++++++++++++++++++++++++++++----------------
>   lib/intel_blt.h |   2 +-
>   2 files changed, 170 insertions(+), 88 deletions(-)
> 
> diff --git a/lib/intel_blt.c b/lib/intel_blt.c
> index f2f86e4947..3eb5d45460 100644
> --- a/lib/intel_blt.c
> +++ b/lib/intel_blt.c
> @@ -9,9 +9,13 @@
>   #include <malloc.h>
>   #include <cairo.h>
>   #include "drm.h"
> -#include "igt.h"
>   #include "i915/gem_create.h"
> +#include "igt.h"
> +#include "igt_syncobj.h"
>   #include "intel_blt.h"
> +#include "xe/xe_ioctl.h"
> +#include "xe/xe_query.h"
> +#include "xe/xe_util.h"
>   
>   #define BITRANGE(start, end) (end - start + 1)
>   #define GET_CMDS_INFO(__fd) intel_get_cmds_info(intel_get_drm_devid(__fd))
> @@ -468,24 +472,40 @@ static int __special_mode(const struct blt_copy_data *blt)
>   	return SM_NONE;
>   }
>   
> -static int __memory_type(uint32_t region)
> +static int __memory_type(int fd, enum intel_driver driver, uint32_t region)

This comment applies both to __memory_type() and __aux_mode(). In 
fill_data(), we unpack whatever was passed in blt and pass it as three 
separate params. Could we let the functions unpack them on their own, 
just like in __special_mode()?

Also, the fill_data() changes make me wonder if my r-b for 12/16 is 
valid -- we don't set the fd and driver fields in blt3 in the i915 
multicopy test. I think we should do it.

>   {
> -	igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
> -		     IS_SYSTEM_MEMORY_REGION(region),
> -		     "Invalid region: %x\n", region);
> +	if (driver == INTEL_DRIVER_I915) {
> +		igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
> +			     IS_SYSTEM_MEMORY_REGION(region),
> +			     "Invalid region: %x\n", region);
> +	} else {
> +		igt_assert_f(XE_IS_VRAM_MEMORY_REGION(fd, region) ||
> +			     XE_IS_SYSMEM_MEMORY_REGION(fd, region),
> +			     "Invalid region: %x\n", region);
> +	}
>   
> -	if (IS_DEVICE_MEMORY_REGION(region))
> +	if (driver == INTEL_DRIVER_I915 && IS_DEVICE_MEMORY_REGION(region))
>   		return TM_LOCAL_MEM;
> +	else if (driver == INTEL_DRIVER_XE && XE_IS_VRAM_MEMORY_REGION(fd, region))
> +		return TM_LOCAL_MEM;
> +
>   	return TM_SYSTEM_MEM;
>   }
>   
> -static enum blt_aux_mode __aux_mode(const struct blt_copy_object *obj)
> +static enum blt_aux_mode __aux_mode(int fd,
> +				    enum intel_driver driver,
> +				    const struct blt_copy_object *obj)
>   {
> -	if (obj->compression == COMPRESSION_ENABLED) {
> +	if (driver == INTEL_DRIVER_I915 && obj->compression == COMPRESSION_ENABLED) {
>   		igt_assert_f(IS_DEVICE_MEMORY_REGION(obj->region),
>   			     "XY_BLOCK_COPY_BLT supports compression "
>   			     "on device memory only\n");
>   		return AM_AUX_CCS_E;
> +	} else if (driver == INTEL_DRIVER_XE && obj->compression == COMPRESSION_ENABLED) {
> +		igt_assert_f(XE_IS_VRAM_MEMORY_REGION(fd, obj->region),
> +			     "XY_BLOCK_COPY_BLT supports compression "
> +			     "on device memory only\n");
> +		return AM_AUX_CCS_E;
>   	}
>   
>   	return AM_AUX_NONE;
> @@ -508,9 +528,9 @@ static void fill_data(struct gen12_block_copy_data *data,
>   	data->dw00.length = extended_command ? 20 : 10;
>   
>   	if (__special_mode(blt) == SM_FULL_RESOLVE)
> -		data->dw01.dst_aux_mode = __aux_mode(&blt->src);
> +		data->dw01.dst_aux_mode = __aux_mode(blt->fd, blt->driver, &blt->src);
>   	else
> -		data->dw01.dst_aux_mode = __aux_mode(&blt->dst);
> +		data->dw01.dst_aux_mode = __aux_mode(blt->fd, blt->driver, &blt->dst);
>   	data->dw01.dst_pitch = blt->dst.pitch - 1;
>   
>   	data->dw01.dst_mocs = blt->dst.mocs;
> @@ -531,13 +551,13 @@ static void fill_data(struct gen12_block_copy_data *data,
>   
>   	data->dw06.dst_x_offset = blt->dst.x_offset;
>   	data->dw06.dst_y_offset = blt->dst.y_offset;
> -	data->dw06.dst_target_memory = __memory_type(blt->dst.region);
> +	data->dw06.dst_target_memory = __memory_type(blt->fd, blt->driver, blt->dst.region);
>   
>   	data->dw07.src_x1 = blt->src.x1;
>   	data->dw07.src_y1 = blt->src.y1;
>   
>   	data->dw08.src_pitch = blt->src.pitch - 1;
> -	data->dw08.src_aux_mode = __aux_mode(&blt->src);
> +	data->dw08.src_aux_mode = __aux_mode(blt->fd, blt->driver, &blt->src);
>   	data->dw08.src_mocs = blt->src.mocs;
>   	data->dw08.src_compression = blt->src.compression;
>   	data->dw08.src_tiling = __block_tiling(blt->src.tiling);
> @@ -550,7 +570,7 @@ static void fill_data(struct gen12_block_copy_data *data,
>   
>   	data->dw11.src_x_offset = blt->src.x_offset;
>   	data->dw11.src_y_offset = blt->src.y_offset;
> -	data->dw11.src_target_memory = __memory_type(blt->src.region);
> +	data->dw11.src_target_memory = __memory_type(blt->fd, blt->driver, blt->src.region);
>   }
>   
>   static void fill_data_ext(struct gen12_block_copy_data_ext *dext,
> @@ -739,7 +759,10 @@ uint64_t emit_blt_block_copy(int fd,
>   	igt_assert_f(ahnd, "block-copy supports softpin only\n");
>   	igt_assert_f(blt, "block-copy requires data to do blit\n");
>   
> -	alignment = gem_detect_safe_alignment(fd);
> +	if (blt->driver == INTEL_DRIVER_XE)
> +		alignment = xe_get_default_alignment(fd);
> +	else
> +		alignment = gem_detect_safe_alignment(fd);

I see this pattern of getting the alignment repeated a couple of times, 
so I wonder if we could wrap it in a macro that switches on blt->driver? 
The argument list is the same for both functions.
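Such a macro could look roughly like this. The two alignment helpers are
real IGT functions, but here they are hypothetical stubs with made-up
return values so the sketch compiles on its own:

```c
#include <assert.h>
#include <stdint.h>

enum intel_driver {
	INTEL_DRIVER_I915 = 1,
	INTEL_DRIVER_XE,
};

/* Stubs standing in for the real IGT helpers; the values are made up. */
static uint64_t gem_detect_safe_alignment(int fd)
{
	(void)fd;
	return 0x1000;
}

static uint64_t xe_get_default_alignment(int fd)
{
	(void)fd;
	return 0x10000;
}

/* One place to pick the right alignment query based on the cached driver. */
#define blt_safe_alignment(fd, driver) \
	((driver) == INTEL_DRIVER_XE ? xe_get_default_alignment(fd) : \
				       gem_detect_safe_alignment(fd))
```

The call sites would then collapse to a single line,
`alignment = blt_safe_alignment(fd, blt->driver);`.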

>   	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
>   		     + blt->src.plane_offset;
>   	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
> @@ -748,8 +771,11 @@ uint64_t emit_blt_block_copy(int fd,
>   
>   	fill_data(&data, blt, src_offset, dst_offset, ext);
>   
> -	bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
> -				       PROT_READ | PROT_WRITE);
> +	if (blt->driver == INTEL_DRIVER_XE)
> +		bb = xe_bo_map(fd, blt->bb.handle, blt->bb.size);
> +	else
> +		bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
> +					       PROT_READ | PROT_WRITE);
>   
>   	igt_assert(bb_pos + sizeof(data) < blt->bb.size);
>   	memcpy(bb + bb_pos, &data, sizeof(data));
> @@ -812,29 +838,38 @@ int blt_block_copy(int fd,
>   
>   	igt_assert_f(ahnd, "block-copy supports softpin only\n");
>   	igt_assert_f(blt, "block-copy requires data to do blit\n");
> +	igt_assert_neq(blt->driver, 0);
>   
> -	alignment = gem_detect_safe_alignment(fd);
> +	if (blt->driver == INTEL_DRIVER_XE)
> +		alignment = xe_get_default_alignment(fd);
> +	else
> +		alignment = gem_detect_safe_alignment(fd);
>   	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
>   	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
>   	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>   
>   	emit_blt_block_copy(fd, ahnd, blt, ext, 0, true);
>   
> -	obj[0].offset = CANONICAL(dst_offset);
> -	obj[1].offset = CANONICAL(src_offset);
> -	obj[2].offset = CANONICAL(bb_offset);
> -	obj[0].handle = blt->dst.handle;
> -	obj[1].handle = blt->src.handle;
> -	obj[2].handle = blt->bb.handle;
> -	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> -		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> -	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> -	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> -	execbuf.buffer_count = 3;
> -	execbuf.buffers_ptr = to_user_pointer(obj);
> -	execbuf.rsvd1 = ctx ? ctx->id : 0;
> -	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> -	ret = __gem_execbuf(fd, &execbuf);
> +	if (blt->driver == INTEL_DRIVER_XE) {
> +		intel_ctx_xe_exec(ctx, ahnd, CANONICAL(bb_offset));
> +	} else {
> +		obj[0].offset = CANONICAL(dst_offset);
> +		obj[1].offset = CANONICAL(src_offset);
> +		obj[2].offset = CANONICAL(bb_offset);
> +		obj[0].handle = blt->dst.handle;
> +		obj[1].handle = blt->src.handle;
> +		obj[2].handle = blt->bb.handle;
> +		obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> +				EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +		obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +		obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +		execbuf.buffer_count = 3;
> +		execbuf.buffers_ptr = to_user_pointer(obj);
> +		execbuf.rsvd1 = ctx ? ctx->id : 0;
> +		execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> +
> +		ret = __gem_execbuf(fd, &execbuf);
> +	}
>   
>   	return ret;
>   }
> @@ -950,7 +985,10 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>   	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
>   	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
>   
> -	alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
> +	if (surf->driver == INTEL_DRIVER_XE)
> +		alignment = max_t(uint64_t, xe_get_default_alignment(fd), 1ull << 16);
> +	else
> +		alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
>   
>   	data.dw00.client = 0x2;
>   	data.dw00.opcode = 0x48;
> @@ -973,8 +1011,11 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>   	data.dw04.dst_address_hi = dst_offset >> 32;
>   	data.dw04.dst_mocs = surf->dst.mocs;
>   
> -	bb = gem_mmap__device_coherent(fd, surf->bb.handle, 0, surf->bb.size,
> -				       PROT_READ | PROT_WRITE);
> +	if (surf->driver == INTEL_DRIVER_XE)
> +		bb = xe_bo_map(fd, surf->bb.handle, surf->bb.size);
> +	else
> +		bb = gem_mmap__device_coherent(fd, surf->bb.handle, 0, surf->bb.size,
> +					       PROT_READ | PROT_WRITE);
>   
>   	igt_assert(bb_pos + sizeof(data) < surf->bb.size);
>   	memcpy(bb + bb_pos, &data, sizeof(data));
> @@ -1002,7 +1043,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>   
>   /**
>    * blt_ctrl_surf_copy:
> - * @fd: drm fd
> + * @blt: blitter data for the ctrl-surf copy
>    * @ctx: intel_ctx_t context
>    * @e: blitter engine for @ctx
>    * @ahnd: allocator handle
> @@ -1026,32 +1067,41 @@ int blt_ctrl_surf_copy(int fd,
>   
>   	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
>   	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
> +	igt_assert_neq(surf->driver, 0);
> +
> +	if (surf->driver == INTEL_DRIVER_XE)
> +		alignment = max_t(uint64_t, xe_get_default_alignment(fd), 1ull << 16);
> +	else
> +		alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);

Here, if we were to implement my macro suggestion, this could be a 
one-line assignment, as the max_t() lower bound is the same in both 
branches.

>   
> -	alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
>   	src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
>   	dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
>   	bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
>   
>   	emit_blt_ctrl_surf_copy(fd, ahnd, surf, 0, true);
>   
> -	obj[0].offset = CANONICAL(dst_offset);
> -	obj[1].offset = CANONICAL(src_offset);
> -	obj[2].offset = CANONICAL(bb_offset);
> -	obj[0].handle = surf->dst.handle;
> -	obj[1].handle = surf->src.handle;
> -	obj[2].handle = surf->bb.handle;
> -	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> -		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> -	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> -	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> -	execbuf.buffer_count = 3;
> -	execbuf.buffers_ptr = to_user_pointer(obj);
> -	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> -	execbuf.rsvd1 = ctx ? ctx->id : 0;
> -	gem_execbuf(fd, &execbuf);
> -	put_offset(ahnd, surf->dst.handle);
> -	put_offset(ahnd, surf->src.handle);
> -	put_offset(ahnd, surf->bb.handle);
> +	if (surf->driver == INTEL_DRIVER_XE) {
> +		intel_ctx_xe_exec(ctx, ahnd, CANONICAL(bb_offset));
> +	} else {
> +		obj[0].offset = CANONICAL(dst_offset);
> +		obj[1].offset = CANONICAL(src_offset);
> +		obj[2].offset = CANONICAL(bb_offset);
> +		obj[0].handle = surf->dst.handle;
> +		obj[1].handle = surf->src.handle;
> +		obj[2].handle = surf->bb.handle;
> +		obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> +				EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +		obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +		obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +		execbuf.buffer_count = 3;
> +		execbuf.buffers_ptr = to_user_pointer(obj);
> +		execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> +		execbuf.rsvd1 = ctx ? ctx->id : 0;
> +		gem_execbuf(fd, &execbuf);
> +		put_offset(ahnd, surf->dst.handle);
> +		put_offset(ahnd, surf->src.handle);
> +		put_offset(ahnd, surf->bb.handle);
> +	}
>   
>   	return 0;
>   }
> @@ -1208,7 +1258,10 @@ uint64_t emit_blt_fast_copy(int fd,
>   	uint32_t bbe = MI_BATCH_BUFFER_END;
>   	uint32_t *bb;
>   
> -	alignment = gem_detect_safe_alignment(fd);
> +	if (blt->driver == INTEL_DRIVER_XE)
> +		alignment = xe_get_default_alignment(fd);
> +	else
> +		alignment = gem_detect_safe_alignment(fd);
>   
>   	data.dw00.client = 0x2;
>   	data.dw00.opcode = 0x42;
> @@ -1218,8 +1271,8 @@ uint64_t emit_blt_fast_copy(int fd,
>   
>   	data.dw01.dst_pitch = blt->dst.pitch;
>   	data.dw01.color_depth = __fast_color_depth(blt->color_depth);
> -	data.dw01.dst_memory = __memory_type(blt->dst.region);
> -	data.dw01.src_memory = __memory_type(blt->src.region);
> +	data.dw01.dst_memory = __memory_type(blt->fd, blt->driver, blt->dst.region);
> +	data.dw01.src_memory = __memory_type(blt->fd, blt->driver, blt->src.region);
>   	data.dw01.dst_type_y = __new_tile_y_type(blt->dst.tiling) ? 1 : 0;
>   	data.dw01.src_type_y = __new_tile_y_type(blt->src.tiling) ? 1 : 0;
>   
> @@ -1246,8 +1299,11 @@ uint64_t emit_blt_fast_copy(int fd,
>   	data.dw08.src_address_lo = src_offset;
>   	data.dw09.src_address_hi = src_offset >> 32;
>   
> -	bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
> -				       PROT_READ | PROT_WRITE);
> +	if (blt->driver == INTEL_DRIVER_XE)
> +		bb = xe_bo_map(fd, blt->bb.handle, blt->bb.size);
> +	else
> +		bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
> +					       PROT_READ | PROT_WRITE);
>   
>   	igt_assert(bb_pos + sizeof(data) < blt->bb.size);
>   	memcpy(bb + bb_pos, &data, sizeof(data));
> @@ -1297,7 +1353,14 @@ int blt_fast_copy(int fd,
>   	uint64_t dst_offset, src_offset, bb_offset, alignment;
>   	int ret;
>   
> -	alignment = gem_detect_safe_alignment(fd);
> +	igt_assert_f(ahnd, "fast-copy supports softpin only\n");
> +	igt_assert_f(blt, "fast-copy requires data to do fast-copy blit\n");
> +	igt_assert_neq(blt->driver, 0);
> +
> +	if (blt->driver == INTEL_DRIVER_XE)
> +		alignment = xe_get_default_alignment(fd);
> +	else
> +		alignment = gem_detect_safe_alignment(fd);
>   
>   	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
>   	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> @@ -1305,24 +1368,28 @@ int blt_fast_copy(int fd,
>   
>   	emit_blt_fast_copy(fd, ahnd, blt, 0, true);
>   
> -	obj[0].offset = CANONICAL(dst_offset);
> -	obj[1].offset = CANONICAL(src_offset);
> -	obj[2].offset = CANONICAL(bb_offset);
> -	obj[0].handle = blt->dst.handle;
> -	obj[1].handle = blt->src.handle;
> -	obj[2].handle = blt->bb.handle;
> -	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> -		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> -	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> -	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> -	execbuf.buffer_count = 3;
> -	execbuf.buffers_ptr = to_user_pointer(obj);
> -	execbuf.rsvd1 = ctx ? ctx->id : 0;
> -	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> -	ret = __gem_execbuf(fd, &execbuf);
> -	put_offset(ahnd, blt->dst.handle);
> -	put_offset(ahnd, blt->src.handle);
> -	put_offset(ahnd, blt->bb.handle);
> +	if (blt->driver == INTEL_DRIVER_XE) {
> +		intel_ctx_xe_exec(ctx, ahnd, CANONICAL(bb_offset));
> +	} else {
> +		obj[0].offset = CANONICAL(dst_offset);
> +		obj[1].offset = CANONICAL(src_offset);
> +		obj[2].offset = CANONICAL(bb_offset);
> +		obj[0].handle = blt->dst.handle;
> +		obj[1].handle = blt->src.handle;
> +		obj[2].handle = blt->bb.handle;
> +		obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> +				EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +		obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +		obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +		execbuf.buffer_count = 3;
> +		execbuf.buffers_ptr = to_user_pointer(obj);
> +		execbuf.rsvd1 = ctx ? ctx->id : 0;
> +		execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> +		ret = __gem_execbuf(fd, &execbuf);
> +		put_offset(ahnd, blt->dst.handle);
> +		put_offset(ahnd, blt->src.handle);
> +		put_offset(ahnd, blt->bb.handle);
> +	}
>   
>   	return ret;
>   }
> @@ -1366,16 +1433,26 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>   	obj = calloc(1, sizeof(*obj));
>   
>   	obj->size = size;
> -	igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
> -						  &size, region) == 0);
> +
> +	if (blt->driver == INTEL_DRIVER_XE) {
> +		size = ALIGN(size, xe_get_default_alignment(blt->fd));
> +		handle = xe_bo_create_flags(blt->fd, 0, size, region);
> +	} else {
> +		igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
> +							  &size, region) == 0);
> +	}
>   
>   	blt_set_object(obj, handle, size, region, mocs, tiling,
>   		       compression, compression_type);
>   	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
>   
> -	if (create_mapping)
> -		obj->ptr = gem_mmap__device_coherent(blt->fd, handle, 0, size,
> -						     PROT_READ | PROT_WRITE);
> +	if (create_mapping) {
> +		if (blt->driver == INTEL_DRIVER_XE)
> +			obj->ptr = xe_bo_map(blt->fd, handle, size);
> +		else
> +			obj->ptr = gem_mmap__device_coherent(blt->fd, handle, 0, size,
> +							     PROT_READ | PROT_WRITE);
> +	}
>   
>   	return obj;
>   }
> @@ -1518,14 +1595,19 @@ void blt_surface_to_png(int fd, uint32_t run_id, const char *fileid,
>   	int format;
>   	int stride = obj->tiling ? obj->pitch * 4 : obj->pitch;
>   	char filename[FILENAME_MAX];
> +	bool is_xe;
>   
>   	snprintf(filename, FILENAME_MAX-1, "%d-%s-%s-%ux%u-%s.png",
>   		 run_id, fileid, blt_tiling_name(obj->tiling), width, height,
>   		 obj->compression ? "compressed" : "uncompressed");
>   
> -	if (!map)
> -		map = gem_mmap__device_coherent(fd, obj->handle, 0,
> -						obj->size, PROT_READ);
> +	if (!map) {
> +		if (is_xe)

is_xe doesn't seem to be set anywhere, so whenever the copy object is 
not mapped we read an uninitialized variable here, and which branch we 
take is undefined.

All the best,
Karolina

> +			map = xe_bo_map(fd, obj->handle, obj->size);
> +		else
> +			map = gem_mmap__device_coherent(fd, obj->handle, 0,
> +							obj->size, PROT_READ);
> +	}
>   	format = CAIRO_FORMAT_RGB24;
>   	surface = cairo_image_surface_create_for_data(map,
>   						      format, width, height,
> diff --git a/lib/intel_blt.h b/lib/intel_blt.h
> index 7516ce8ac7..944e2b4ae7 100644
> --- a/lib/intel_blt.h
> +++ b/lib/intel_blt.h
> @@ -8,7 +8,7 @@
>   
>   /**
>    * SECTION:intel_blt
> - * @short_description: i915 blitter library
> + * @short_description: i915/xe blitter library
>    * @title: Blitter library
>    * @include: intel_blt.h
>    *


* Re: [igt-dev] [PATCH i-g-t v2 14/16] tests/xe_ccs: Check if flatccs is working with block-copy for Xe
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 14/16] tests/xe_ccs: Check if flatccs is working with block-copy for Xe Zbigniew Kempczyński
@ 2023-07-07 10:05   ` Karolina Stolarek
  2023-07-11 10:45     ` Zbigniew Kempczyński
  0 siblings, 1 reply; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-07 10:05 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> This is a copy of the i915 gem_ccs test ported to xe. Ported means all
> driver-dependent calls - like working on regions, binding and execution -
> were replaced by their xe counterparts. I considered adding conditionals
> for xe in gem_ccs instead, but that would decrease test readability, so
> I dropped the idea.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>   tests/meson.build |   1 +
>   tests/xe/xe_ccs.c | 763 ++++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 764 insertions(+)
>   create mode 100644 tests/xe/xe_ccs.c
> 
> diff --git a/tests/meson.build b/tests/meson.build
> index ee066b8490..9bca57a5e8 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -244,6 +244,7 @@ i915_progs = [
>   ]
>   
>   xe_progs = [
> +	'xe_ccs',
>   	'xe_create',
>   	'xe_compute',
>   	'xe_dma_buf_sync',
> diff --git a/tests/xe/xe_ccs.c b/tests/xe/xe_ccs.c
> new file mode 100644
> index 0000000000..e6bb29a5ed
> --- /dev/null
> +++ b/tests/xe/xe_ccs.c
> @@ -0,0 +1,763 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#include <errno.h>
> +#include <glib.h>
> +#include <sys/ioctl.h>
> +#include <sys/time.h>
> +#include <malloc.h>
> +#include "drm.h"
> +#include "igt.h"
> +#include "igt_syncobj.h"
> +#include "intel_blt.h"
> +#include "intel_mocs.h"
> +#include "xe/xe_ioctl.h"
> +#include "xe/xe_query.h"
> +#include "xe/xe_util.h"
> +/**
> + * TEST: xe ccs
> + * Description: Exercise gen12 blitter with and without flatccs compression on Xe
> + * Run type: FULL
> + *
> + * SUBTEST: block-copy-compressed
> + * Description: Check block-copy flatccs compressed blit
> + *
> + * SUBTEST: block-copy-uncompressed
> + * Description: Check block-copy uncompressed blit
> + *
> + * SUBTEST: block-multicopy-compressed
> + * Description: Check block-multicopy flatccs compressed blit
> + *
> + * SUBTEST: block-multicopy-inplace
> + * Description: Check block-multicopy flatccs inplace decompression blit
> + *
> + * SUBTEST: ctrl-surf-copy
> + * Description: Check flatccs data can be copied from/to surface
> + *
> + * SUBTEST: ctrl-surf-copy-new-ctx
> + * Description: Check flatccs data are physically tagged and visible in vm
> + *
> + * SUBTEST: suspend-resume
> + * Description: Check flatccs data persists after suspend / resume (S0)
> + */
> +
> +IGT_TEST_DESCRIPTION("Exercise gen12 blitter with and without flatccs compression on Xe");
> +
> +static struct param {
> +	int compression_format;
> +	int tiling;
> +	bool write_png;
> +	bool print_bb;
> +	bool print_surface_info;
> +	int width;
> +	int height;
> +} param = {
> +	.compression_format = 0,
> +	.tiling = -1,
> +	.write_png = false,
> +	.print_bb = false,
> +	.print_surface_info = false,
> +	.width = 512,
> +	.height = 512,
> +};
> +
> +struct test_config {
> +	bool compression;
> +	bool inplace;
> +	bool surfcopy;
> +	bool new_ctx;
> +	bool suspend_resume;
> +};
> +
> +static void set_surf_object(struct blt_ctrl_surf_copy_object *obj,
> +			    uint32_t handle, uint32_t region, uint64_t size,
> +			    uint8_t mocs, enum blt_access_type access_type)
> +{
> +	obj->handle = handle;
> +	obj->region = region;
> +	obj->size = size;
> +	obj->mocs = mocs;
> +	obj->access_type = access_type;
> +}
> +
> +#define PRINT_SURFACE_INFO(name, obj) do { \
> +	if (param.print_surface_info) \
> +		blt_surface_info((name), (obj)); } while (0)
> +
> +#define WRITE_PNG(fd, id, name, obj, w, h) do { \
> +	if (param.write_png) \
> +		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)
> +
> +static int compare_nxn(const struct blt_copy_object *surf1,
> +		       const struct blt_copy_object *surf2,
> +		       int xsize, int ysize, int bx, int by)

I think you could avoid some repetition by creating a small lib with blt 
copy test helpers. For example, we have four definitions of WRITE_PNG 
across the tests; it would be better to have a single shared definition 
that all four tests call.

> +{
> +	int x, y, corrupted;
> +	uint32_t pos, px1, px2;
> +
> +	corrupted = 0;
> +	for (y = 0; y < ysize; y++) {
> +		for (x = 0; x < xsize; x++) {
> +			pos = bx * xsize + by * ysize * surf1->pitch / 4;
> +			pos += x + y * surf1->pitch / 4;
> +			px1 = surf1->ptr[pos];
> +			px2 = surf2->ptr[pos];
> +			if (px1 != px2)
> +				corrupted++;
> +		}
> +	}
> +
> +	return corrupted;
> +}
> +
> +static void dump_corruption_info(const struct blt_copy_object *surf1,
> +				 const struct blt_copy_object *surf2)
> +{
> +	const int xsize = 8, ysize = 8;
> +	int w, h, bx, by, corrupted;
> +
> +	igt_assert(surf1->x1 == surf2->x1 && surf1->x2 == surf2->x2);
> +	igt_assert(surf1->y1 == surf2->y1 && surf1->y2 == surf2->y2);
> +	w = surf1->x2;
> +	h = surf1->y2;
> +
> +	igt_info("dump corruption - width: %d, height: %d, sizex: %x, sizey: %x\n",
> +		 surf1->x2, surf1->y2, xsize, ysize);
> +
> +	for (by = 0; by < h / ysize; by++) {
> +		for (bx = 0; bx < w / xsize; bx++) {
> +			corrupted = compare_nxn(surf1, surf2, xsize, ysize, bx, by);
> +			if (corrupted == 0)
> +				igt_info(".");
> +			else
> +				igt_info("%c", '0' + corrupted);
> +		}
> +		igt_info("\n");
> +	}
> +}
> +
> +static void surf_copy(int xe,
> +		      intel_ctx_t *ctx,
> +		      uint64_t ahnd,
> +		      const struct blt_copy_object *src,
> +		      const struct blt_copy_object *mid,
> +		      const struct blt_copy_object *dst,
> +		      int run_id, bool suspend_resume)
> +{
> +	struct blt_copy_data blt = {};
> +	struct blt_block_copy_data_ext ext = {};
> +	struct blt_ctrl_surf_copy_data surf = {};
> +	uint32_t bb1, bb2, ccs, ccs2, *ccsmap, *ccsmap2;
> +	uint64_t bb_size, ccssize = mid->size / CCS_RATIO;
> +	uint32_t *ccscopy;
> +	uint8_t uc_mocs = intel_get_uc_mocs(xe);
> +	uint32_t sysmem = system_memory(xe);
> +	int result;
> +
> +	igt_assert(mid->compression);
> +	ccscopy = (uint32_t *) malloc(ccssize);
> +	ccs = xe_bo_create_flags(xe, 0, ccssize, sysmem);
> +	ccs2 = xe_bo_create_flags(xe, 0, ccssize, sysmem);
> +
> +	blt_ctrl_surf_copy_init(xe, &surf);
> +	surf.print_bb = param.print_bb;
> +	set_surf_object(&surf.src, mid->handle, mid->region, mid->size,
> +			uc_mocs, BLT_INDIRECT_ACCESS);
> +	set_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
> +	bb_size = xe_get_default_alignment(xe);
> +	bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
> +	blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
> +	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> +	intel_ctx_xe_sync(ctx, true);
> +
> +	ccsmap = xe_bo_map(xe, ccs, surf.dst.size);
> +	memcpy(ccscopy, ccsmap, ccssize);
> +
> +	if (suspend_resume) {
> +		char *orig, *orig2, *newsum, *newsum2;
> +
> +		orig = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
> +						   (void *)ccsmap, surf.dst.size);
> +		orig2 = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
> +						    (void *)mid->ptr, mid->size);
> +
> +		igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
> +
> +		set_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,

Shouldn't this be sysmem instead of REGION_SMEM?

> +				0, DIRECT_ACCESS);
> +		blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> +		intel_ctx_xe_sync(ctx, true);
> +
> +		ccsmap2 = xe_bo_map(xe, ccs2, surf.dst.size);
> +		newsum = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
> +						     (void *)ccsmap2, surf.dst.size);
> +		newsum2 = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
> +						      (void *)mid->ptr, mid->size);
> +
> +		munmap(ccsmap2, ccssize);
> +		igt_assert(!strcmp(orig, newsum));
> +		igt_assert(!strcmp(orig2, newsum2));
> +		g_free(orig);
> +		g_free(orig2);
> +		g_free(newsum);
> +		g_free(newsum2);
> +	}
> +
> +	/* corrupt ccs */
> +	for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
> +		ccsmap[i] = i;
> +	set_surf_object(&surf.src, ccs, sysmem, ccssize,
> +			uc_mocs, DIRECT_ACCESS);
> +	set_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
> +			uc_mocs, INDIRECT_ACCESS);
> +	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> +	intel_ctx_xe_sync(ctx, true);
> +
> +	blt_copy_init(xe, &blt);
> +	blt.color_depth = CD_32bit;
> +	blt.print_bb = param.print_bb;
> +	blt_set_copy_object(&blt.src, mid);
> +	blt_set_copy_object(&blt.dst, dst);
> +	blt_set_object_ext(&ext.src, mid->compression_type, mid->x2, mid->y2, SURFACE_TYPE_2D);
> +	blt_set_object_ext(&ext.dst, 0, dst->x2, dst->y2, SURFACE_TYPE_2D);
> +	bb2 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
> +	blt_set_batch(&blt.bb, bb2, bb_size, sysmem);
> +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, &ext);
> +	intel_ctx_xe_sync(ctx, true);
> +	WRITE_PNG(xe, run_id, "corrupted", &blt.dst, dst->x2, dst->y2);
> +	result = memcmp(src->ptr, dst->ptr, src->size);
> +	igt_assert(result != 0);
> +
> +	/* retrieve back ccs */
> +	memcpy(ccsmap, ccscopy, ccssize);
> +	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> +
> +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, &ext);
> +	intel_ctx_xe_sync(ctx, true);
> +	WRITE_PNG(xe, run_id, "corrected", &blt.dst, dst->x2, dst->y2);
> +	result = memcmp(src->ptr, dst->ptr, src->size);
> +	if (result)
> +		dump_corruption_info(src, dst);
> +
> +	munmap(ccsmap, ccssize);
> +	gem_close(xe, ccs);
> +	gem_close(xe, ccs2);
> +	gem_close(xe, bb1);
> +	gem_close(xe, bb2);
> +
> +	igt_assert_f(result == 0,
> +		     "Source and destination surfaces are different after "
> +		     "restoring source ccs data\n");
> +}
> +
> +struct blt_copy3_data {
> +	int xe;
> +	struct blt_copy_object src;
> +	struct blt_copy_object mid;
> +	struct blt_copy_object dst;
> +	struct blt_copy_object final;
> +	struct blt_copy_batch bb;
> +	enum blt_color_depth color_depth;
> +
> +	/* debug stuff */
> +	bool print_bb;
> +};
> +
> +struct blt_block_copy3_data_ext {
> +	struct blt_block_copy_object_ext src;
> +	struct blt_block_copy_object_ext mid;
> +	struct blt_block_copy_object_ext dst;
> +	struct blt_block_copy_object_ext final;
> +};
> +

Hmm, we really could make use of shared definitions like these (yes, I 
sound like a broken record at this point, sorry!)

> +#define FILL_OBJ(_idx, _handle, _offset) do { \
> +	obj[(_idx)].handle = (_handle); \
> +	obj[(_idx)].offset = (_offset); \
> +} while (0)

This definition isn't used in the Xe tests, so it can be deleted.

> +
> +static int blt_block_copy3(int xe,
> +			   const intel_ctx_t *ctx,
> +			   uint64_t ahnd,
> +			   const struct blt_copy3_data *blt3,
> +			   const struct blt_block_copy3_data_ext *ext3)
> +{
> +	struct blt_copy_data blt0;
> +	struct blt_block_copy_data_ext ext0;
> +	uint64_t bb_offset, alignment;
> +	uint64_t bb_pos = 0;
> +	int ret;
> +
> +	igt_assert_f(ahnd, "block-copy3 supports softpin only\n");
> +	igt_assert_f(blt3, "block-copy3 requires data to do blit\n");
> +
> +	alignment = xe_get_default_alignment(xe);
> +	get_offset(ahnd, blt3->src.handle, blt3->src.size, alignment);
> +	get_offset(ahnd, blt3->mid.handle, blt3->mid.size, alignment);
> +	get_offset(ahnd, blt3->dst.handle, blt3->dst.size, alignment);
> +	get_offset(ahnd, blt3->final.handle, blt3->final.size, alignment);
> +	bb_offset = get_offset(ahnd, blt3->bb.handle, blt3->bb.size, alignment);
> +
> +	/* First blit src -> mid */
> +	blt_copy_init(xe, &blt0);
> +	blt0.src = blt3->src;
> +	blt0.dst = blt3->mid;
> +	blt0.bb = blt3->bb;
> +	blt0.color_depth = blt3->color_depth;
> +	blt0.print_bb = blt3->print_bb;
> +	ext0.src = ext3->src;
> +	ext0.dst = ext3->mid;
> +	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, false);
> +
> +	/* Second blit mid -> dst */
> +	blt_copy_init(xe, &blt0);
> +	blt0.src = blt3->mid;
> +	blt0.dst = blt3->dst;
> +	blt0.bb = blt3->bb;
> +	blt0.color_depth = blt3->color_depth;
> +	blt0.print_bb = blt3->print_bb;
> +	ext0.src = ext3->mid;
> +	ext0.dst = ext3->dst;
> +	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, false);
> +
> +	/* Third blit dst -> final */
> +	blt_copy_init(xe, &blt0);
> +	blt0.src = blt3->dst;
> +	blt0.dst = blt3->final;
> +	blt0.bb = blt3->bb;
> +	blt0.color_depth = blt3->color_depth;
> +	blt0.print_bb = blt3->print_bb;
> +	ext0.src = ext3->dst;
> +	ext0.dst = ext3->final;
> +	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, true);
> +
> +	intel_ctx_xe_exec(ctx, ahnd, bb_offset);
> +
> +	return ret;
> +}
> +
> +static void block_copy(int xe,
> +		       intel_ctx_t *ctx,
> +		       uint32_t region1, uint32_t region2,
> +		       enum blt_tiling_type mid_tiling,
> +		       const struct test_config *config)
> +{
> +	struct blt_copy_data blt = {};
> +	struct blt_block_copy_data_ext ext = {}, *pext = &ext;
> +	struct blt_copy_object *src, *mid, *dst;
> +	const uint32_t bpp = 32;
> +	uint64_t bb_size = xe_get_default_alignment(xe);
> +	uint64_t ahnd = intel_allocator_open(xe, ctx->vm, INTEL_ALLOCATOR_RELOC);
> +	uint32_t run_id = mid_tiling;
> +	uint32_t mid_region = region2, bb;
> +	uint32_t width = param.width, height = param.height;
> +	enum blt_compression mid_compression = config->compression;
> +	int mid_compression_format = param.compression_format;
> +	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
> +	uint8_t uc_mocs = intel_get_uc_mocs(xe);
> +	int result;
> +
> +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
> +
> +	if (!blt_uses_extended_block_copy(xe))
> +		pext = NULL;
> +
> +	blt_copy_init(xe, &blt);
> +
> +	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> +				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> +	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
> +				mid_tiling, mid_compression, comp_type, true);
> +	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> +				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> +	igt_assert(src->size == dst->size);
> +	PRINT_SURFACE_INFO("src", src);
> +	PRINT_SURFACE_INFO("mid", mid);
> +	PRINT_SURFACE_INFO("dst", dst);
> +
> +	blt_surface_fill_rect(xe, src, width, height);
> +	WRITE_PNG(xe, run_id, "src", src, width, height);
> +
> +	blt.color_depth = CD_32bit;
> +	blt.print_bb = param.print_bb;
> +	blt_set_copy_object(&blt.src, src);
> +	blt_set_copy_object(&blt.dst, mid);
> +	blt_set_object_ext(&ext.src, 0, width, height, SURFACE_TYPE_2D);
> +	blt_set_object_ext(&ext.dst, mid_compression_format, width, height, SURFACE_TYPE_2D);
> +	blt_set_batch(&blt.bb, bb, bb_size, region1);
> +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, pext);
> +	intel_ctx_xe_sync(ctx, true);
> +
> +	/* We expect mid != src if there's compression */
> +	if (mid->compression)
> +		igt_assert(memcmp(src->ptr, mid->ptr, src->size) != 0);
> +
> +	WRITE_PNG(xe, run_id, "src", &blt.src, width, height);

This is also present in gem_ccs: why are we saving the "src" surface twice?

Many thanks,
Karolina

> +	WRITE_PNG(xe, run_id, "mid", &blt.dst, width, height);
> +
> +	if (config->surfcopy && pext) {
> +		struct drm_xe_engine_class_instance inst = {
> +			.engine_class = DRM_XE_ENGINE_CLASS_COPY,
> +		};
> +		intel_ctx_t *surf_ctx = ctx;
> +		uint64_t surf_ahnd = ahnd;
> +		uint32_t vm, engine;
> +
> +		if (config->new_ctx) {
> +			vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
> +			engine = xe_engine_create(xe, vm, &inst, 0);
> +			surf_ctx = intel_ctx_xe(xe, vm, engine, 0, 0, 0);
> +			surf_ahnd = intel_allocator_open(xe, surf_ctx->vm,
> +							 INTEL_ALLOCATOR_RELOC);
> +		}
> +		surf_copy(xe, surf_ctx, surf_ahnd, src, mid, dst, run_id,
> +			  config->suspend_resume);
> +
> +		if (surf_ctx != ctx) {
> +			xe_engine_destroy(xe, engine);
> +			xe_vm_destroy(xe, vm);
> +			free(surf_ctx);
> +			put_ahnd(surf_ahnd);
> +		}
> +	}
> +
> +	blt_copy_init(xe, &blt);
> +	blt.color_depth = CD_32bit;
> +	blt.print_bb = param.print_bb;
> +	blt_set_copy_object(&blt.src, mid);
> +	blt_set_copy_object(&blt.dst, dst);
> +	blt_set_object_ext(&ext.src, mid_compression_format, width, height, SURFACE_TYPE_2D);
> +	blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
> +	if (config->inplace) {
> +		blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
> +			       T_LINEAR, COMPRESSION_DISABLED, comp_type);
> +		blt.dst.ptr = mid->ptr;
> +	}
> +
> +	blt_set_batch(&blt.bb, bb, bb_size, region1);
> +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, pext);
> +	intel_ctx_xe_sync(ctx, true);
> +
> +	WRITE_PNG(xe, run_id, "dst", &blt.dst, width, height);
> +
> +	result = memcmp(src->ptr, blt.dst.ptr, src->size);
> +
> +	/* Politely clean vm */
> +	put_offset(ahnd, src->handle);
> +	put_offset(ahnd, mid->handle);
> +	put_offset(ahnd, dst->handle);
> +	put_offset(ahnd, bb);
> +	intel_allocator_bind(ahnd, 0, 0);
> +	blt_destroy_object(xe, src);
> +	blt_destroy_object(xe, mid);
> +	blt_destroy_object(xe, dst);
> +	gem_close(xe, bb);
> +	put_ahnd(ahnd);
> +
> +	igt_assert_f(!result, "source and destination surfaces differs!\n");
> +}
> +
> +static void block_multicopy(int xe,
> +			    intel_ctx_t *ctx,
> +			    uint32_t region1, uint32_t region2,
> +			    enum blt_tiling_type mid_tiling,
> +			    const struct test_config *config)
> +{
> +	struct blt_copy3_data blt3 = {};
> +	struct blt_copy_data blt = {};
> +	struct blt_block_copy3_data_ext ext3 = {}, *pext3 = &ext3;
> +	struct blt_copy_object *src, *mid, *dst, *final;
> +	const uint32_t bpp = 32;
> +	uint64_t bb_size = xe_get_default_alignment(xe);
> +	uint64_t ahnd = intel_allocator_open(xe, ctx->vm, INTEL_ALLOCATOR_RELOC);
> +	uint32_t run_id = mid_tiling;
> +	uint32_t mid_region = region2, bb;
> +	uint32_t width = param.width, height = param.height;
> +	enum blt_compression mid_compression = config->compression;
> +	int mid_compression_format = param.compression_format;
> +	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
> +	uint8_t uc_mocs = intel_get_uc_mocs(xe);
> +	int result;
> +
> +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
> +
> +	if (!blt_uses_extended_block_copy(xe))
> +		pext3 = NULL;
> +
> +	blt_copy_init(xe, &blt);
> +
> +	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> +				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> +	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
> +				mid_tiling, mid_compression, comp_type, true);
> +	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> +				mid_tiling, COMPRESSION_DISABLED, comp_type, true);
> +	final = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> +				  T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> +	igt_assert(src->size == dst->size);
> +	PRINT_SURFACE_INFO("src", src);
> +	PRINT_SURFACE_INFO("mid", mid);
> +	PRINT_SURFACE_INFO("dst", dst);
> +	PRINT_SURFACE_INFO("final", final);
> +
> +	blt_surface_fill_rect(xe, src, width, height);
> +
> +	blt3.color_depth = CD_32bit;
> +	blt3.print_bb = param.print_bb;
> +	blt_set_copy_object(&blt3.src, src);
> +	blt_set_copy_object(&blt3.mid, mid);
> +	blt_set_copy_object(&blt3.dst, dst);
> +	blt_set_copy_object(&blt3.final, final);
> +
> +	if (config->inplace) {
> +		blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
> +			       mid->mocs, mid_tiling, COMPRESSION_DISABLED,
> +			       comp_type);
> +		blt3.dst.ptr = mid->ptr;
> +	}
> +
> +	blt_set_object_ext(&ext3.src, 0, width, height, SURFACE_TYPE_2D);
> +	blt_set_object_ext(&ext3.mid, mid_compression_format, width, height, SURFACE_TYPE_2D);
> +	blt_set_object_ext(&ext3.dst, 0, width, height, SURFACE_TYPE_2D);
> +	blt_set_object_ext(&ext3.final, 0, width, height, SURFACE_TYPE_2D);
> +	blt_set_batch(&blt3.bb, bb, bb_size, region1);
> +
> +	blt_block_copy3(xe, ctx, ahnd, &blt3, pext3);
> +	intel_ctx_xe_sync(ctx, true);
> +
> +	WRITE_PNG(xe, run_id, "src", &blt3.src, width, height);
> +	if (!config->inplace)
> +		WRITE_PNG(xe, run_id, "mid", &blt3.mid, width, height);
> +	WRITE_PNG(xe, run_id, "dst", &blt3.dst, width, height);
> +	WRITE_PNG(xe, run_id, "final", &blt3.final, width, height);
> +
> +	result = memcmp(src->ptr, blt3.final.ptr, src->size);
> +
> +	put_offset(ahnd, src->handle);
> +	put_offset(ahnd, mid->handle);
> +	put_offset(ahnd, dst->handle);
> +	put_offset(ahnd, final->handle);
> +	put_offset(ahnd, bb);
> +	intel_allocator_bind(ahnd, 0, 0);
> +	blt_destroy_object(xe, src);
> +	blt_destroy_object(xe, mid);
> +	blt_destroy_object(xe, dst);
> +	blt_destroy_object(xe, final);
> +	gem_close(xe, bb);
> +	put_ahnd(ahnd);
> +
> +	igt_assert_f(!result, "source and destination surfaces differ!\n");
> +}
> +
> +enum copy_func {
> +	BLOCK_COPY,
> +	BLOCK_MULTICOPY,
> +};
> +
> +static const struct {
> +	const char *suffix;
> +	void (*copyfn)(int fd,
> +		       intel_ctx_t *ctx,
> +		       uint32_t region1, uint32_t region2,
> +		       enum blt_tiling_type btype,
> +		       const struct test_config *config);
> +} copyfns[] = {
> +	[BLOCK_COPY] = { "", block_copy },
> +	[BLOCK_MULTICOPY] = { "-multicopy", block_multicopy },
> +};
> +
> +static void block_copy_test(int xe,
> +			    const struct test_config *config,
> +			    struct igt_collection *set,
> +			    enum copy_func copy_function)
> +{
> +	struct drm_xe_engine_class_instance inst = {
> +		.engine_class = DRM_XE_ENGINE_CLASS_COPY,
> +	};
> +	intel_ctx_t *ctx;
> +	struct igt_collection *regions;
> +	uint32_t vm, engine;
> +	int tiling;
> +
> +	if (config->compression && !blt_block_copy_supports_compression(xe))
> +		return;
> +
> +	if (config->inplace && !config->compression)
> +		return;
> +
> +	for_each_tiling(tiling) {
> +		if (!blt_block_copy_supports_tiling(xe, tiling) ||
> +		    (param.tiling >= 0 && param.tiling != tiling))
> +			continue;
> +
> +		for_each_variation_r(regions, 2, set) {
> +			uint32_t region1, region2;
> +			char *regtxt;
> +
> +			region1 = igt_collection_get_value(regions, 0);
> +			region2 = igt_collection_get_value(regions, 1);
> +
> +			/* Compressed surface must be in device memory */
> +			if (config->compression && !XE_IS_VRAM_MEMORY_REGION(xe, region2))
> +				continue;
> +
> +			regtxt = xe_memregion_dynamic_subtest_name(xe, regions);
> +
> +			igt_dynamic_f("%s-%s-compfmt%d-%s%s",
> +				      blt_tiling_name(tiling),
> +				      config->compression ?
> +					      "compressed" : "uncompressed",
> +				      param.compression_format, regtxt,
> +				      copyfns[copy_function].suffix) {
> +				uint32_t sync_bind, sync_out;
> +
> +				vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
> +				engine = xe_engine_create(xe, vm, &inst, 0);
> +				sync_bind = syncobj_create(xe, 0);
> +				sync_out = syncobj_create(xe, 0);
> +				ctx = intel_ctx_xe(xe, vm, engine,
> +						   0, sync_bind, sync_out);
> +
> +				copyfns[copy_function].copyfn(xe, ctx,
> +							      region1, region2,
> +							      tiling, config);
> +
> +				xe_engine_destroy(xe, engine);
> +				xe_vm_destroy(xe, vm);
> +				syncobj_destroy(xe, sync_bind);
> +				syncobj_destroy(xe, sync_out);
> +				free(ctx);
> +			}
> +
> +			free(regtxt);
> +		}
> +	}
> +}
> +
> +static int opt_handler(int opt, int opt_index, void *data)
> +{
> +	switch (opt) {
> +	case 'b':
> +		param.print_bb = true;
> +		igt_debug("Print bb: %d\n", param.print_bb);
> +		break;
> +	case 'f':
> +		param.compression_format = atoi(optarg);
> +		igt_debug("Compression format: %d\n", param.compression_format);
> +		igt_assert((param.compression_format & ~0x1f) == 0);
> +		break;
> +	case 'p':
> +		param.write_png = true;
> +		igt_debug("Write png: %d\n", param.write_png);
> +		break;
> +	case 's':
> +		param.print_surface_info = true;
> +		igt_debug("Print surface info: %d\n", param.print_surface_info);
> +		break;
> +	case 't':
> +		param.tiling = atoi(optarg);
> +		igt_debug("Tiling: %d\n", param.tiling);
> +		break;
> +	case 'W':
> +		param.width = atoi(optarg);
> +		igt_debug("Width: %d\n", param.width);
> +		break;
> +	case 'H':
> +		param.height = atoi(optarg);
> +		igt_debug("Height: %d\n", param.height);
> +		break;
> +	default:
> +		return IGT_OPT_HANDLER_ERROR;
> +	}
> +
> +	return IGT_OPT_HANDLER_SUCCESS;
> +}
> +
> +const char *help_str =
> +	"  -b\tPrint bb\n"
> +	"  -f\tCompression format (0-31)\n"
> +	"  -p\tWrite PNG\n"
> +	"  -s\tPrint surface info\n"
> +	"  -t\tTiling format (0 - linear, 1 - XMAJOR, 2 - YMAJOR, 3 - TILE4, 4 - TILE64)\n"
> +	"  -W\tWidth (default 512)\n"
> +	"  -H\tHeight (default 512)"
> +	;
> +
> +igt_main_args("bf:pst:W:H:", NULL, help_str, opt_handler, NULL)
> +{
> +	struct igt_collection *set;
> +	int xe;
> +
> +	igt_fixture {
> +		xe = drm_open_driver(DRIVER_XE);
> +		igt_require(blt_has_block_copy(xe));
> +
> +		xe_device_get(xe);
> +
> +		set = xe_get_memory_region_set(xe,
> +					       XE_MEM_REGION_CLASS_SYSMEM,
> +					       XE_MEM_REGION_CLASS_VRAM);
> +	}
> +
> +	igt_describe("Check block-copy uncompressed blit");
> +	igt_subtest_with_dynamic("block-copy-uncompressed") {
> +		struct test_config config = {};
> +
> +		block_copy_test(xe, &config, set, BLOCK_COPY);
> +	}
> +
> +	igt_describe("Check block-copy flatccs compressed blit");
> +	igt_subtest_with_dynamic("block-copy-compressed") {
> +		struct test_config config = { .compression = true };
> +
> +		block_copy_test(xe, &config, set, BLOCK_COPY);
> +	}
> +
> +	igt_describe("Check block-multicopy flatccs compressed blit");
> +	igt_subtest_with_dynamic("block-multicopy-compressed") {
> +		struct test_config config = { .compression = true };
> +
> +		block_copy_test(xe, &config, set, BLOCK_MULTICOPY);
> +	}
> +
> +	igt_describe("Check block-multicopy flatccs inplace decompression blit");
> +	igt_subtest_with_dynamic("block-multicopy-inplace") {
> +		struct test_config config = { .compression = true,
> +					      .inplace = true };
> +
> +		block_copy_test(xe, &config, set, BLOCK_MULTICOPY);
> +	}
> +
> +	igt_describe("Check flatccs data can be copied from/to surface");
> +	igt_subtest_with_dynamic("ctrl-surf-copy") {
> +		struct test_config config = { .compression = true,
> +					      .surfcopy = true };
> +
> +		block_copy_test(xe, &config, set, BLOCK_COPY);
> +	}
> +
> +	igt_describe("Check flatccs data are physically tagged and visible"
> +		     " in different contexts");
> +	igt_subtest_with_dynamic("ctrl-surf-copy-new-ctx") {
> +		struct test_config config = { .compression = true,
> +					      .surfcopy = true,
> +					      .new_ctx = true };
> +
> +		block_copy_test(xe, &config, set, BLOCK_COPY);
> +	}
> +
> +	igt_describe("Check flatccs data persists after suspend / resume (S0)");
> +	igt_subtest_with_dynamic("suspend-resume") {
> +		struct test_config config = { .compression = true,
> +					      .surfcopy = true,
> +					      .suspend_resume = true };
> +
> +		block_copy_test(xe, &config, set, BLOCK_COPY);
> +	}
> +
> +	igt_fixture {
> +		xe_device_put(xe);
> +		close(xe);
> +	}
> +}

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 16/16] tests/api-intel-allocator: Adopt to exercise allocator to Xe
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 16/16] tests/api-intel-allocator: Adopt to exercise allocator to Xe Zbigniew Kempczyński
@ 2023-07-07 10:11   ` Karolina Stolarek
  0 siblings, 0 replies; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-07 10:11 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> Xe vm binding requires some cooperation from the allocator side
> (tracking alloc()/free() operations). This diverges the path internally
> inside the allocator, so it is necessary to check that the allocator
> properly supports both drivers.

Reviewed-by: Karolina Stolarek <karolina.stolarek@intel.com>

> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>   tests/i915/api_intel_allocator.c | 25 +++++++++++++++++--------
>   1 file changed, 17 insertions(+), 8 deletions(-)
> 
> diff --git a/tests/i915/api_intel_allocator.c b/tests/i915/api_intel_allocator.c
> index 238e76c9fd..e70de0e1d0 100644
> --- a/tests/i915/api_intel_allocator.c
> +++ b/tests/i915/api_intel_allocator.c
> @@ -9,6 +9,9 @@
>   #include "igt.h"
>   #include "igt_aux.h"
>   #include "intel_allocator.h"
> +#include "xe/xe_ioctl.h"
> +#include "xe/xe_query.h"
> +
>   /**
>    * TEST: api intel allocator
>    * Category: Infrastructure
> @@ -454,6 +457,7 @@ static void __simple_allocs(int fd)
>   	uint32_t handles[SIMPLE_GROUP_ALLOCS];
>   	uint64_t ahnd;
>   	uint32_t ctx;
> +	bool is_xe = is_xe_device(fd);
>   	int i;
>   
>   	ctx = rand() % 2;
> @@ -463,7 +467,12 @@ static void __simple_allocs(int fd)
>   		uint32_t size;
>   
>   		size = (rand() % 4 + 1) * 0x1000;
> -		handles[i] = gem_create(fd, size);
> +		if (is_xe)
> +			handles[i] = xe_bo_create_flags(fd, 0, size,
> +							system_memory(fd));
> +		else
> +			handles[i] = gem_create(fd, size);
> +
>   		intel_allocator_alloc(ahnd, handles[i], size, 0x1000);
>   	}
>   
> @@ -573,8 +582,6 @@ static void reopen(int fd)
>   {
>   	int fd2;
>   
> -	igt_require_gem(fd);
> -
>   	fd2 = drm_reopen_driver(fd);
>   
>   	__reopen_allocs(fd, fd2, true);
> @@ -587,8 +594,6 @@ static void reopen_fork(int fd)
>   {
>   	int fd2;
>   
> -	igt_require_gem(fd);
> -
>   	intel_allocator_multiprocess_start();
>   
>   	fd2 = drm_reopen_driver(fd);
> @@ -838,7 +843,7 @@ igt_main
>   	struct allocators *a;
>   
>   	igt_fixture {
> -		fd = drm_open_driver(DRIVER_INTEL);
> +		fd = drm_open_driver(DRIVER_INTEL | DRIVER_XE);
>   		atomic_init(&next_handle, 1);
>   		srandom(0xdeadbeef);
>   	}
> @@ -911,12 +916,16 @@ igt_main
>   	igt_subtest_f("open-vm")
>   		open_vm(fd);
>   
> -	igt_subtest_f("execbuf-with-allocator")
> +	igt_subtest_f("execbuf-with-allocator") {
> +		igt_require(is_i915_device(fd));
>   		execbuf_with_allocator(fd);
> +	}
>   
>   	igt_describe("Verifies creating and executing bb from gem pool");
> -	igt_subtest_f("gem-pool")
> +	igt_subtest_f("gem-pool") {
> +		igt_require(is_i915_device(fd));
>   		gem_pool(fd);
> +	}
>   
>   	igt_fixture
>   		drm_close_driver(fd);

* Re: [igt-dev] [PATCH i-g-t v2 15/16] tests/xe_exercise_blt: Check blitter library fast-copy for Xe
  2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 15/16] tests/xe_exercise_blt: Check blitter library fast-copy " Zbigniew Kempczyński
@ 2023-07-07 11:10   ` Karolina Stolarek
  2023-07-11 11:07     ` Zbigniew Kempczyński
  0 siblings, 1 reply; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-07 11:10 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> Port this test to work on xe. Instead of adding conditional code for
> xe, which would decrease readability, this is a new test for xe.
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>   tests/meson.build          |   1 +
>   tests/xe/xe_exercise_blt.c | 372 +++++++++++++++++++++++++++++++++++++
>   2 files changed, 373 insertions(+)
>   create mode 100644 tests/xe/xe_exercise_blt.c
> 
> diff --git a/tests/meson.build b/tests/meson.build
> index 9bca57a5e8..137a5cf01f 100644
> --- a/tests/meson.build
> +++ b/tests/meson.build
> @@ -257,6 +257,7 @@ xe_progs = [
>   	'xe_exec_reset',
>   	'xe_exec_store',
>   	'xe_exec_threads',
> +	'xe_exercise_blt',
>   	'xe_gpgpu_fill',
>   	'xe_guc_pc',
>   	'xe_huc_copy',
> diff --git a/tests/xe/xe_exercise_blt.c b/tests/xe/xe_exercise_blt.c
> new file mode 100644
> index 0000000000..8340cf7148
> --- /dev/null
> +++ b/tests/xe/xe_exercise_blt.c
> @@ -0,0 +1,372 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2023 Intel Corporation
> + */
> +
> +#include "igt.h"
> +#include "drm.h"
> +#include "lib/intel_chipset.h"
> +#include "intel_blt.h"
> +#include "intel_mocs.h"
> +#include "xe/xe_ioctl.h"
> +#include "xe/xe_query.h"
> +#include "xe/xe_util.h"
> +
> +/**
> + * TEST: xe exercise blt
> + * Description: Exercise blitter commands on Xe
> + * Feature: blitter
> + * Run type: FULL
> + * Test category: GEM_Legacy
> + *
> + * SUBTEST: fast-copy
> + * Description:
> + *   Check fast-copy blit
> + *   blitter
> + *
> + * SUBTEST: fast-copy-emit
> + * Description:
> + *   Check multiple fast-copy in one batch
> + *   blitter
> + */
> +
> +IGT_TEST_DESCRIPTION("Exercise blitter commands on Xe");
> +
> +static struct param {
> +	int tiling;
> +	bool write_png;
> +	bool print_bb;
> +	bool print_surface_info;
> +	int width;
> +	int height;
> +} param = {
> +	.tiling = -1,
> +	.write_png = false,
> +	.print_bb = false,
> +	.print_surface_info = false,
> +	.width = 512,
> +	.height = 512,
> +};
> +
> +#define PRINT_SURFACE_INFO(name, obj) do { \
> +	if (param.print_surface_info) \
> +		blt_surface_info((name), (obj)); } while (0)
> +
> +#define WRITE_PNG(fd, id, name, obj, w, h) do { \
> +	if (param.write_png) \
> +		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)
> +

My suggestion with shared functions applies here as well

> +struct blt_fast_copy_data {
> +	int xe;
> +	struct blt_copy_object src;
> +	struct blt_copy_object mid;
> +	struct blt_copy_object dst;
> +
> +	struct blt_copy_batch bb;
> +	enum blt_color_depth color_depth;
> +
> +	/* debug stuff */
> +	bool print_bb;
> +};
> +
> +static int fast_copy_one_bb(int xe,
> +			    const intel_ctx_t *ctx,
> +			    uint64_t ahnd,
> +			    const struct blt_fast_copy_data *blt)
> +{
> +	struct blt_copy_data blt_tmp;
> +	uint64_t bb_offset, alignment;
> +	uint64_t bb_pos = 0;
> +	int ret = 0;
> +
> +	alignment = xe_get_default_alignment(xe);
> +
> +	get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> +	get_offset(ahnd, blt->mid.handle, blt->mid.size, alignment);
> +	get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> +	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> +
> +	/* First blit */
> +	blt_copy_init(xe, &blt_tmp);
> +	blt_tmp.src = blt->src;
> +	blt_tmp.dst = blt->mid;
> +	blt_tmp.bb = blt->bb;
> +	blt_tmp.color_depth = blt->color_depth;
> +	blt_tmp.print_bb = blt->print_bb;
> +	bb_pos = emit_blt_fast_copy(xe, ahnd, &blt_tmp, bb_pos, false);
> +
> +	/* Second blit */
> +	blt_copy_init(xe, &blt_tmp);
> +	blt_tmp.src = blt->mid;
> +	blt_tmp.dst = blt->dst;
> +	blt_tmp.bb = blt->bb;
> +	blt_tmp.color_depth = blt->color_depth;
> +	blt_tmp.print_bb = blt->print_bb;
> +	bb_pos = emit_blt_fast_copy(xe, ahnd, &blt_tmp, bb_pos, true);
> +
> +	intel_ctx_xe_exec(ctx, ahnd, bb_offset);
> +
> +	return ret;
> +}
> +
> +static void fast_copy_emit(int xe, const intel_ctx_t *ctx,
> +			   uint32_t region1, uint32_t region2,
> +			   enum blt_tiling_type mid_tiling)
> +{
> +	struct blt_copy_data bltinit = {};
> +	struct blt_fast_copy_data blt = {};
> +	struct blt_copy_object *src, *mid, *dst;
> +	const uint32_t bpp = 32;
> +	uint64_t bb_size = xe_get_default_alignment(xe);
> +	uint64_t ahnd = intel_allocator_open_full(xe, ctx->vm, 0, 0,
> +						  INTEL_ALLOCATOR_SIMPLE,
> +						  ALLOC_STRATEGY_LOW_TO_HIGH, 0);
> +	uint32_t bb, width = param.width, height = param.height;
> +	int result;
> +
> +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
> +
> +	blt_copy_init(xe, &bltinit);
> +	src = blt_create_object(&bltinit, region1, width, height, bpp, 0,
> +				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> +	mid = blt_create_object(&bltinit, region2, width, height, bpp, 0,
> +				mid_tiling, COMPRESSION_DISABLED, 0, true);
> +	dst = blt_create_object(&bltinit, region1, width, height, bpp, 0,
> +				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> +	igt_assert(src->size == dst->size);
> +
> +	PRINT_SURFACE_INFO("src", src);
> +	PRINT_SURFACE_INFO("mid", mid);
> +	PRINT_SURFACE_INFO("dst", dst);
> +
> +	blt_surface_fill_rect(xe, src, width, height);
> +	WRITE_PNG(xe, mid_tiling, "src", src, width, height);
> +
> +	memset(&blt, 0, sizeof(blt));
> +	blt.color_depth = CD_32bit;
> +	blt.print_bb = param.print_bb;
> +	blt_set_copy_object(&blt.src, src);
> +	blt_set_copy_object(&blt.mid, mid);
> +	blt_set_copy_object(&blt.dst, dst);
> +	blt_set_batch(&blt.bb, bb, bb_size, region1);
> +
> +	fast_copy_one_bb(xe, ctx, ahnd, &blt);
> +
> +	WRITE_PNG(xe, mid_tiling, "mid", &blt.mid, width, height);
> +	WRITE_PNG(xe, mid_tiling, "dst", &blt.dst, width, height);
> +
> +	result = memcmp(src->ptr, blt.dst.ptr, src->size);
> +
> +	blt_destroy_object(xe, src);
> +	blt_destroy_object(xe, mid);
> +	blt_destroy_object(xe, dst);

I see that in block_copy tests we also call put_offset() for all copy 
objects' handles. Although we have a different allocator, I think that 
we should free the ranges on object destruction.
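
Concretely, the cleanup here could mirror the block_copy() path quoted
earlier in this thread, i.e. (a sketch based on the surrounding code, not
a compiled fragment):

```c
/* Release the VMA ranges tracked by the allocator, then rebind so the
 * freed ranges get unbound, before destroying the objects - the same
 * order block_copy() uses. */
put_offset(ahnd, src->handle);
put_offset(ahnd, mid->handle);
put_offset(ahnd, dst->handle);
put_offset(ahnd, bb);
intel_allocator_bind(ahnd, 0, 0);

blt_destroy_object(xe, src);
blt_destroy_object(xe, mid);
blt_destroy_object(xe, dst);
gem_close(xe, bb);
put_ahnd(ahnd);
```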

> +	gem_close(xe, bb);
> +	put_ahnd(ahnd);
> +
> +	igt_assert_f(!result, "source and destination surfaces differ!\n");
> +}
> +
> +static void fast_copy(int xe, const intel_ctx_t *ctx,
> +		      uint32_t region1, uint32_t region2,
> +		      enum blt_tiling_type mid_tiling)
> +{
> +	struct blt_copy_data blt = {};
> +	struct blt_copy_object *src, *mid, *dst;
> +	const uint32_t bpp = 32;
> +	uint64_t bb_size = xe_get_default_alignment(xe);
> +	uint64_t ahnd = intel_allocator_open_full(xe, ctx->vm, 0, 0,
> +						  INTEL_ALLOCATOR_SIMPLE,
> +						  ALLOC_STRATEGY_LOW_TO_HIGH, 0);
> +	uint32_t bb;
> +	uint32_t width = param.width, height = param.height;
> +	int result;
> +
> +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
> +
> +	blt_copy_init(xe, &blt);
> +	src = blt_create_object(&blt, region1, width, height, bpp, 0,
> +				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> +	mid = blt_create_object(&blt, region2, width, height, bpp, 0,
> +				mid_tiling, COMPRESSION_DISABLED, 0, true);
> +	dst = blt_create_object(&blt, region1, width, height, bpp, 0,
> +				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> +	igt_assert(src->size == dst->size);
> +
> +	blt_surface_fill_rect(xe, src, width, height);
> +
> +	blt.color_depth = CD_32bit;
> +	blt.print_bb = param.print_bb;
> +	blt_set_copy_object(&blt.src, src);
> +	blt_set_copy_object(&blt.dst, mid);
> +	blt_set_batch(&blt.bb, bb, bb_size, region1);
> +
> +	blt_fast_copy(xe, ctx, NULL, ahnd, &blt);
> +
> +	WRITE_PNG(xe, mid_tiling, "src", &blt.src, width, height);
> +	WRITE_PNG(xe, mid_tiling, "mid", &blt.dst, width, height);
> +
> +	blt_copy_init(xe, &blt);
> +	blt.color_depth = CD_32bit;
> +	blt.print_bb = param.print_bb;
> +	blt_set_copy_object(&blt.src, mid);
> +	blt_set_copy_object(&blt.dst, dst);
> +	blt_set_batch(&blt.bb, bb, bb_size, region1);
> +
> +	blt_fast_copy(xe, ctx, NULL, ahnd, &blt);
> +
> +	WRITE_PNG(xe, mid_tiling, "dst", &blt.dst, width, height);
> +
> +	result = memcmp(src->ptr, blt.dst.ptr, src->size);
> +
> +	blt_destroy_object(xe, src);
> +	blt_destroy_object(xe, mid);
> +	blt_destroy_object(xe, dst);
> +	gem_close(xe, bb);
> +	put_ahnd(ahnd);
> +
> +	igt_assert_f(!result, "source and destination surfaces differ!\n");
> +}
> +
> +enum fast_copy_func {
> +	FAST_COPY,
> +	FAST_COPY_EMIT
> +};
> +
> +static char
> +	*full_subtest_str(char *regtxt, enum blt_tiling_type tiling,
> +			  enum fast_copy_func func)
> +{
> +	char *name;
> +	int len;
> +
> +	len = asprintf(&name, "%s-%s%s", blt_tiling_name(tiling), regtxt,
> +		       func == FAST_COPY_EMIT ? "-emit" : "");
> +
> +	igt_assert_f(len >= 0, "asprintf failed!\n");
> +
> +	return name;
> +}
> +
> +static void fast_copy_test(int xe,
> +			   struct igt_collection *set,
> +			   enum fast_copy_func func)
> +{
> +	struct drm_xe_engine_class_instance inst = {
> +		.engine_class = DRM_XE_ENGINE_CLASS_COPY,
> +	};
> +	struct igt_collection *regions;
> +	void (*copy_func)(int xe, const intel_ctx_t *ctx,
> +			  uint32_t r1, uint32_t r2, enum blt_tiling_type tiling);
> +	intel_ctx_t *ctx;
> +	int tiling;
> +
> +	for_each_tiling(tiling) {
> +		if (!blt_fast_copy_supports_tiling(xe, tiling))
> +			continue;
> +
> +		for_each_variation_r(regions, 2, set) {
> +			uint32_t region1, region2;
> +			uint32_t vm, engine;
> +			char *regtxt, *test_name;
> +
> +			region1 = igt_collection_get_value(regions, 0);
> +			region2 = igt_collection_get_value(regions, 1);
> +
> +			vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
> +			engine = xe_engine_create(xe, vm, &inst, 0);
> +			ctx = intel_ctx_xe(xe, vm, engine, 0, 0, 0);
> +
> +			copy_func = (func == FAST_COPY) ? fast_copy : fast_copy_emit;
> +			regtxt = xe_memregion_dynamic_subtest_name(xe, regions);
> +			test_name = full_subtest_str(regtxt, tiling, func);
> +
> +			igt_dynamic_f("%s", test_name) {
> +				copy_func(xe, ctx,
> +					  region1, region2,
> +					  tiling);
> +			}
> +
> +			free(regtxt);
> +			free(test_name);
> +			xe_engine_destroy(xe, engine);
> +			xe_vm_destroy(xe, vm);
> +			free(ctx);
> +		}
> +	}
> +}
> +
> +static int opt_handler(int opt, int opt_index, void *data)
> +{
> +	switch (opt) {
> +	case 'b':
> +		param.print_bb = true;
> +		igt_debug("Print bb: %d\n", param.print_bb);
> +		break;
> +	case 'p':
> +		param.write_png = true;
> +		igt_debug("Write png: %d\n", param.write_png);
> +		break;
> +	case 's':
> +		param.print_surface_info = true;
> +		igt_debug("Print surface info: %d\n", param.print_surface_info);
> +		break;
> +	case 't':
> +		param.tiling = atoi(optarg);
> +		igt_debug("Tiling: %d\n", param.tiling);
> +		break;
> +	case 'W':
> +		param.width = atoi(optarg);
> +		igt_debug("Width: %d\n", param.width);
> +		break;
> +	case 'H':
> +		param.height = atoi(optarg);
> +		igt_debug("Height: %d\n", param.height);
> +		break;
> +	default:
> +		return IGT_OPT_HANDLER_ERROR;
> +	}
> +
> +	return IGT_OPT_HANDLER_SUCCESS;
> +}
> +
> +const char *help_str =
> +	"  -b\tPrint bb\n"
> +	"  -p\tWrite PNG\n"
> +	"  -s\tPrint surface info\n"
> +	"  -t\tTiling format (0 - linear, 1 - XMAJOR, 2 - YMAJOR, 3 - TILE4, 4 - TILE64, 5 - YFMAJOR)\n"
> +	"  -W\tWidth (default 512)\n"
> +	"  -H\tHeight (default 512)"
> +	;
> +
> +igt_main_args("b:pst:W:H:", NULL, help_str, opt_handler, NULL)
> +{
> +	struct igt_collection *set;
> +	int xe;
> +
> +	igt_fixture {
> +		xe = drm_open_driver(DRIVER_XE);
> +		igt_require(blt_has_block_copy(xe));

Should be blt_has_fast_copy(xe)

All the best,
Karolina
> +
> +		xe_device_get(xe);
> +
> +		set = xe_get_memory_region_set(xe,
> +					       XE_MEM_REGION_CLASS_SYSMEM,
> +					       XE_MEM_REGION_CLASS_VRAM);
> +	}
> +
> +	igt_describe("Check fast-copy blit");
> +	igt_subtest_with_dynamic("fast-copy") {
> +		fast_copy_test(xe, set, FAST_COPY);
> +	}
> +
> +	igt_describe("Check multiple fast-copy in one batch");
> +	igt_subtest_with_dynamic("fast-copy-emit") {
> +		fast_copy_test(xe, set, FAST_COPY_EMIT);
> +	}
> +
> +	igt_fixture {
> +		drm_close_driver(xe);
> +	}
> +}

* Re: [igt-dev] [PATCH i-g-t v2 11/16] lib/intel_ctx: Add xe context information
  2023-07-07  8:31   ` Karolina Stolarek
@ 2023-07-11  9:06     ` Zbigniew Kempczyński
  2023-07-11 10:38       ` Karolina Stolarek
  0 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-11  9:06 UTC (permalink / raw)
  To: Karolina Stolarek; +Cc: igt-dev

On Fri, Jul 07, 2023 at 10:31:19AM +0200, Karolina Stolarek wrote:
> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> > The most complicated part of adapting i915_blt to intel_blt - which
> > should handle both drivers - is how to achieve pipelined execution.
> > By pipelined execution I mean that all gpu workloads are executed
> > without stalls.
> > 
> > Compared to i915 relocations and softpinning, the xe architecture
> > migrates binding (and also unbinding) responsibility from the kernel
> > to the user via the vm_bind ioctl(). To avoid stalls the user has to
> > provide in/out fences (syncobjs) between consecutive bindings/execs.
> > Of course, for many igt tests we don't need pipelined execution,
> > just a synchronous bind, then exec. But exercising the driver should
> > also cover pipelining to verify it is possible to work without stalls.
> > 
> > I decided to extend intel_ctx_t with all objects necessary for xe
> > (vm, engine, syncobjs) to get flexibility in deciding how to bind,
> > execute and wait for (synchronize) those operations. The context
> > object, along with the i915 engine, is already passed to the blitter
> > library, so adding the xe-required fields doesn't break i915 but
> > allows the xe path to get all necessary data to execute.
> > 
> > Using intel_ctx with xe requires some code patterns due to the use
> > of the allocator. For xe the allocator started tracking alloc()/free()
> > operations in order to do bind/unbind in one call just before
> > execution. I've added two helpers to intel_ctx - intel_ctx_xe_exec()
> > and intel_ctx_xe_sync(). Depending on how intel_ctx was created
> > (with 0 or real syncobj handles as in/bind/out fences), bind and exec
> > in intel_ctx_xe_exec() are pipelined but the last operation (exec)
> > is synchronized. For real syncobjs they are used to join bind + exec
> > calls, but there's no wait for exec (sync-out) completion. This allows
> > building more cascaded bind + exec operations without stalls.
> > 
> > To wait for a sync-out fence the caller may use intel_ctx_xe_sync(),
> > which is a synchronous wait on a syncobj. It allows the user to reset
> > fences to prepare for the next operation.
> > 
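
Putting the pieces together, the usage pattern is roughly the sketch
below, assembled from the helpers in this series as used by the
block_copy_test() hunk earlier in the thread (error handling omitted;
`ahnd` and `bb_offset` are assumed to come from the allocator):

```c
struct drm_xe_engine_class_instance inst = {
	.engine_class = DRM_XE_ENGINE_CLASS_COPY,
};
uint32_t vm, engine, sync_bind, sync_out;
intel_ctx_t *ctx;

vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
engine = xe_engine_create(xe, vm, &inst, 0);
sync_bind = syncobj_create(xe, 0);
sync_out = syncobj_create(xe, 0);

/* Real syncobjs for bind/out fences -> exec is pipelined, no
 * implicit wait inside intel_ctx_xe_exec(). */
ctx = intel_ctx_xe(xe, vm, engine, 0, sync_bind, sync_out);

/* Synchronizes allocator state into the vm (bind), then execs;
 * the two are joined through sync_bind. */
intel_ctx_xe_exec(ctx, ahnd, bb_offset);

/* Explicitly wait on sync_out and reset fences for the next round. */
intel_ctx_xe_sync(ctx, true);

syncobj_destroy(xe, sync_bind);
syncobj_destroy(xe, sync_out);
xe_engine_destroy(xe, engine);
xe_vm_destroy(xe, vm);
free(ctx);
```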
> > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > ---
> >   lib/intel_ctx.c | 110 +++++++++++++++++++++++++++++++++++++++++++++++-
> >   lib/intel_ctx.h |  14 ++++++
> >   2 files changed, 123 insertions(+), 1 deletion(-)
> > 
> > diff --git a/lib/intel_ctx.c b/lib/intel_ctx.c
> > index ded9c0f1e4..f210907fac 100644
> > --- a/lib/intel_ctx.c
> > +++ b/lib/intel_ctx.c
> > @@ -5,9 +5,12 @@
> >   #include <stddef.h>
> > +#include "i915/gem_engine_topology.h"
> > +#include "igt_syncobj.h"
> > +#include "intel_allocator.h"
> >   #include "intel_ctx.h"
> >   #include "ioctl_wrappers.h"
> > -#include "i915/gem_engine_topology.h"
> > +#include "xe/xe_ioctl.h"
> >   /**
> >    * SECTION:intel_ctx
> > @@ -390,3 +393,108 @@ unsigned int intel_ctx_engine_class(const intel_ctx_t *ctx, unsigned int engine)
> >   {
> >   	return intel_ctx_cfg_engine_class(&ctx->cfg, engine);
> >   }
> > +
> > +/**
> > + * intel_ctx_xe:
> > + * @fd: open xe drm file descriptor
> > + * @vm: vm
> > + * @engine: engine
> > + *
> > + * Returns an intel_ctx_t representing the xe context.
> > + */
> > +intel_ctx_t *intel_ctx_xe(int fd, uint32_t vm, uint32_t engine,
> > +			  uint32_t sync_in, uint32_t sync_bind, uint32_t sync_out)
> > +{
> > +	intel_ctx_t *ctx;
> > +
> > +	ctx = calloc(1, sizeof(*ctx));
> > +	igt_assert(ctx);
> > +
> > +	ctx->fd = fd;
> > +	ctx->vm = vm;
> > +	ctx->engine = engine;
> > +	ctx->sync_in = sync_in;
> > +	ctx->sync_bind = sync_bind;
> > +	ctx->sync_out = sync_out;
> > +
> > +	return ctx;
> > +}
> > +
> > +static int __xe_exec(int fd, struct drm_xe_exec *exec)
> > +{
> > +	int err = 0;
> > +
> > +	if (igt_ioctl(fd, DRM_IOCTL_XE_EXEC, exec)) {
> > +		err = -errno;
> > +		igt_assume(err != 0);
> 
> Wouldn't "igt_assume(err)" be enough?
> 
> > +	}
> > +	errno = 0;
> > +	return err;
> > +}
> 
> I'm aware that it's a helper that you use in other execs, but it feels out
> of place, it doesn't deal with intel_ctx_t. Maybe xe_util could be its new
> home?
> 

I'm going to just export __xe_exec() from xe_ioctl.c.
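
That would amount to something like the following (a sketch mirroring
the static helper quoted above, with the simplified igt_assume() form
suggested earlier):

```c
/* xe/xe_ioctl.h */
int __xe_exec(int fd, struct drm_xe_exec *exec);

/* xe/xe_ioctl.c - same body, no longer static */
int __xe_exec(int fd, struct drm_xe_exec *exec)
{
	int err = 0;

	if (igt_ioctl(fd, DRM_IOCTL_XE_EXEC, exec)) {
		err = -errno;
		igt_assume(err);
	}
	errno = 0;
	return err;
}
```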

> > +
> > +int __intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset)
> > +{
> > +	struct drm_xe_sync syncs[2] = {
> > +		{ .flags = DRM_XE_SYNC_SYNCOBJ, },
> > +		{ .flags = DRM_XE_SYNC_SYNCOBJ | DRM_XE_SYNC_SIGNAL, },
> > +	};
> > +	struct drm_xe_exec exec = {
> > +		.engine_id = ctx->engine,
> > +		.syncs = (uintptr_t)syncs,
> > +		.num_syncs = 2,
> > +		.address = bb_offset,
> > +		.num_batch_buffer = 1,
> > +	};
> > +	uint32_t sync_in = ctx->sync_in;
> > +	uint32_t sync_bind = ctx->sync_bind ?: syncobj_create(ctx->fd, 0);
> > +	uint32_t sync_out = ctx->sync_out ?: syncobj_create(ctx->fd, 0);
> > +	int ret;
> > +
> > +	/* Synchronize allocator state -> vm */
> > +	intel_allocator_bind(ahnd, sync_in, sync_bind);
> > +
> > +	/* Pipelined exec */
> > +	syncs[0].handle = sync_bind;
> > +	syncs[1].handle = sync_out;
> > +
> > +	ret = __xe_exec(ctx->fd, &exec);
> > +	if (ret)
> > +		goto err;
> > +
> > +	if (!ctx->sync_bind || !ctx->sync_out)
> > +		syncobj_wait_err(ctx->fd, &sync_out, 1, INT64_MAX, 0);
> 
> This whole flow is so nice and tidy, I like it
> 
> > +
> > +err:
> > +	if (!ctx->sync_bind)
> > +		syncobj_destroy(ctx->fd, sync_bind);
> > +
> > +	if (!ctx->sync_out)
> > +		syncobj_destroy(ctx->fd, sync_out);
> > +
> > +	return ret;
> > +}
> > +
> > +void intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset)
> > +{
> > +	igt_assert_eq(__intel_ctx_xe_exec(ctx, ahnd, bb_offset), 0);
> > +}
> > +
> > +#define RESET_SYNCOBJ(__fd, __sync) do { \
> > +	if (__sync) \
> > +		syncobj_reset((__fd), &(__sync), 1); \
> > +} while (0)
> > +
> > +int intel_ctx_xe_sync(intel_ctx_t *ctx, bool reset_syncs)
> > +{
> > +	int ret;
> > +
> > +	ret = syncobj_wait_err(ctx->fd, &ctx->sync_out, 1, INT64_MAX, 0);
> > +
> > +	if (reset_syncs) {
> > +		RESET_SYNCOBJ(ctx->fd, ctx->sync_in);
> > +		RESET_SYNCOBJ(ctx->fd, ctx->sync_bind);
> > +		RESET_SYNCOBJ(ctx->fd, ctx->sync_out);
> > +	}
> 
> Is there a use case where we want to do a synced execution without resetting
> syncobjs?
> 

I don't know - that's why I left the decision to the user.

> > +
> > +	return ret;
> > +}
> > diff --git a/lib/intel_ctx.h b/lib/intel_ctx.h
> > index 3cfeaae81e..59d0360ada 100644
> > --- a/lib/intel_ctx.h
> > +++ b/lib/intel_ctx.h
> > @@ -67,6 +67,14 @@ int intel_ctx_cfg_engine_class(const intel_ctx_cfg_t *cfg, unsigned int engine);
> >   typedef struct intel_ctx {
> >   	uint32_t id;
> >   	intel_ctx_cfg_t cfg;
> > +
> > +	/* Xe */
> > +	int fd;
> > +	uint32_t vm;
> > +	uint32_t engine;
> > +	uint32_t sync_in;
> > +	uint32_t sync_bind;
> > +	uint32_t sync_out;
> 
> Hmm, I wonder if we could wrap it in a struct. Yes, it would be painful to
> unpack, but now it feels like we've just added a bunch of fields that are
> irrelevant 80% of the time. Instead, we could have one additional field that
> could be NULL, and use it if it's initialized.
> But maybe I'm just being too nit-picky.

I considered introducing a union of i915 and xe structs, but I would
need to rewrite almost all IGTs which use this struct, so I dropped
the idea. At the moment I need to handle both drivers, so mixing
fields is not a big pain imo.

--
Zbigniew

> 
> All the best,
> Karolina
> 
> >   } intel_ctx_t;
> >   int __intel_ctx_create(int fd, const intel_ctx_cfg_t *cfg,
> > @@ -81,4 +89,10 @@ void intel_ctx_destroy(int fd, const intel_ctx_t *ctx);
> >   unsigned int intel_ctx_engine_class(const intel_ctx_t *ctx, unsigned int engine);
> > +intel_ctx_t *intel_ctx_xe(int fd, uint32_t vm, uint32_t engine,
> > +			  uint32_t sync_in, uint32_t sync_bind, uint32_t sync_out);
> > +int __intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset);
> > +void intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset);
> > +int intel_ctx_xe_sync(intel_ctx_t *ctx, bool reset_syncs);
> > +
> >   #endif


* Re: [igt-dev] [PATCH i-g-t v2 12/16] lib/intel_blt: Introduce blt_copy_init() helper to cache driver
  2023-07-07  8:51   ` Karolina Stolarek
@ 2023-07-11  9:23     ` Zbigniew Kempczyński
  0 siblings, 0 replies; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-11  9:23 UTC (permalink / raw)
  To: Karolina Stolarek; +Cc: igt-dev

On Fri, Jul 07, 2023 at 10:51:43AM +0200, Karolina Stolarek wrote:
> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> > Instead of calling is_xe_device() and is_i915_device() multiple
> > times in code which distinguishes xe and i915 paths, add a driver
> > field to the structures used in the blitter library.
> > 
> > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > ---
> >   lib/igt_fb.c                   |  2 +-
> >   lib/intel_blt.c                | 40 +++++++++++++++++++++++++++++++---
> >   lib/intel_blt.h                |  8 ++++++-
> >   tests/i915/gem_ccs.c           | 34 ++++++++++++++++-------------
> >   tests/i915/gem_exercise_blt.c  | 22 ++++++++++---------
> >   tests/i915/gem_lmem_swapping.c |  4 ++--
> >   6 files changed, 78 insertions(+), 32 deletions(-)
> > 
> > diff --git a/lib/igt_fb.c b/lib/igt_fb.c
> > index a8988274f2..1814e8db11 100644
> > --- a/lib/igt_fb.c
> > +++ b/lib/igt_fb.c
> > @@ -2900,7 +2900,7 @@ static void blitcopy(const struct igt_fb *dst_fb,
> >   			src = blt_fb_init(src_fb, i, mem_region);
> >   			dst = blt_fb_init(dst_fb, i, mem_region);
> > -			memset(&blt, 0, sizeof(blt));
> > +			blt_copy_init(src_fb->fd, &blt);
> >   			blt.color_depth = blt_get_bpp(src_fb);
> >   			blt_set_copy_object(&blt.src, src);
> >   			blt_set_copy_object(&blt.dst, dst);
> > diff --git a/lib/intel_blt.c b/lib/intel_blt.c
> > index bc28f15e8d..f2f86e4947 100644
> > --- a/lib/intel_blt.c
> > +++ b/lib/intel_blt.c
> > @@ -692,6 +692,22 @@ static void dump_bb_ext(struct gen12_block_copy_data_ext *data)
> >   		 data->dw21.src_array_index);
> >   }
> > +/**
> > + * blt_copy_init:
> > + * @fd: drm fd
> > + * @blt: structure for initialization
> > + *
> > + * Function zeroes @blt and sets the fd and driver fields (INTEL_DRIVER_I915 or
> > + * INTEL_DRIVER_XE).
> > + */
> > +void blt_copy_init(int fd, struct blt_copy_data *blt)
> > +{
> > +	memset(blt, 0, sizeof(*blt));
> > +
> > +	blt->fd = fd;
> > +	blt->driver = get_intel_driver(fd);
> > +}
> > +
> >   /**
> >    * emit_blt_block_copy:
> >    * @fd: drm fd
> > @@ -889,6 +905,22 @@ static void dump_bb_surf_ctrl_cmd(const struct gen12_ctrl_surf_copy_data *data)
> >   		 cmd[4], data->dw04.dst_address_hi, data->dw04.dst_mocs);
> >   }
> > +/**
> > + * blt_ctrl_surf_copy_init:
> > + * @fd: drm fd
> > + * @surf: structure for initialization
> > + *
> > + * Function zeroes @surf and sets the fd and driver fields (INTEL_DRIVER_I915 or
> > + * INTEL_DRIVER_XE).
> > + */
> > +void blt_ctrl_surf_copy_init(int fd, struct blt_ctrl_surf_copy_data *surf)
> > +{
> > +	memset(surf, 0, sizeof(*surf));
> > +
> > +	surf->fd = fd;
> > +	surf->driver = get_intel_driver(fd);
> > +}
> > +
> >   /**
> >    * emit_blt_ctrl_surf_copy:
> >    * @fd: drm fd
> > @@ -1317,7 +1349,7 @@ void blt_set_batch(struct blt_copy_batch *batch,
> >   }
> >   struct blt_copy_object *
> > -blt_create_object(int fd, uint32_t region,
> > +blt_create_object(const struct blt_copy_data *blt, uint32_t region,
> >   		  uint32_t width, uint32_t height, uint32_t bpp, uint8_t mocs,
> >   		  enum blt_tiling_type tiling,
> >   		  enum blt_compression compression,
> > @@ -1329,10 +1361,12 @@ blt_create_object(int fd, uint32_t region,
> >   	uint32_t stride = tiling == T_LINEAR ? width * 4 : width;
> >   	uint32_t handle;
> > +	igt_assert_f(blt->driver, "Driver isn't set, have you called blt_copy_init()?\n");
> > +
> >   	obj = calloc(1, sizeof(*obj));
> >   	obj->size = size;
> > -	igt_assert(__gem_create_in_memory_regions(fd, &handle,
> > +	igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
> >   						  &size, region) == 0);
> >   	blt_set_object(obj, handle, size, region, mocs, tiling,
> > @@ -1340,7 +1374,7 @@ blt_create_object(int fd, uint32_t region,
> >   	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
> >   	if (create_mapping)
> > -		obj->ptr = gem_mmap__device_coherent(fd, handle, 0, size,
> > +		obj->ptr = gem_mmap__device_coherent(blt->fd, handle, 0, size,
> >   						     PROT_READ | PROT_WRITE);
> >   	return obj;
> > diff --git a/lib/intel_blt.h b/lib/intel_blt.h
> > index 9c4ddc7a89..7516ce8ac7 100644
> > --- a/lib/intel_blt.h
> > +++ b/lib/intel_blt.h
> > @@ -102,6 +102,7 @@ struct blt_copy_batch {
> >   /* Common for block-copy and fast-copy */
> >   struct blt_copy_data {
> >   	int fd;
> > +	enum intel_driver driver;
> >   	struct blt_copy_object src;
> >   	struct blt_copy_object dst;
> >   	struct blt_copy_batch bb;
> > @@ -155,6 +156,7 @@ struct blt_ctrl_surf_copy_object {
> >   struct blt_ctrl_surf_copy_data {
> >   	int fd;
> > +	enum intel_driver driver;
> >   	struct blt_ctrl_surf_copy_object src;
> >   	struct blt_ctrl_surf_copy_object dst;
> >   	struct blt_copy_batch bb;
> > @@ -185,6 +187,8 @@ bool blt_uses_extended_block_copy(int fd);
> >   const char *blt_tiling_name(enum blt_tiling_type tiling);
> > +void blt_copy_init(int fd, struct blt_copy_data *blt);
> > +
> >   uint64_t emit_blt_block_copy(int fd,
> >   			     uint64_t ahnd,
> >   			     const struct blt_copy_data *blt,
> > @@ -205,6 +209,8 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
> >   				 uint64_t bb_pos,
> >   				 bool emit_bbe);
> > +void blt_ctrl_surf_copy_init(int fd, struct blt_ctrl_surf_copy_data *surf);
> > +
> >   int blt_ctrl_surf_copy(int fd,
> >   		       const intel_ctx_t *ctx,
> >   		       const struct intel_execution_engine2 *e,
> > @@ -230,7 +236,7 @@ void blt_set_batch(struct blt_copy_batch *batch,
> >   		   uint32_t handle, uint64_t size, uint32_t region);
> >   struct blt_copy_object *
> > -blt_create_object(int fd, uint32_t region,
> > +blt_create_object(const struct blt_copy_data *blt, uint32_t region,
> >   		  uint32_t width, uint32_t height, uint32_t bpp, uint8_t mocs,
> >   		  enum blt_tiling_type tiling,
> >   		  enum blt_compression compression,
> > diff --git a/tests/i915/gem_ccs.c b/tests/i915/gem_ccs.c
> > index f9ad9267df..d9d785ed9b 100644
> > --- a/tests/i915/gem_ccs.c
> > +++ b/tests/i915/gem_ccs.c
> > @@ -167,7 +167,7 @@ static void surf_copy(int i915,
> >   	ccs = gem_create(i915, ccssize);
> >   	ccs2 = gem_create(i915, ccssize);
> > -	surf.fd = i915;
> > +	blt_ctrl_surf_copy_init(i915, &surf);
> >   	surf.print_bb = param.print_bb;
> >   	set_surf_object(&surf.src, mid->handle, mid->region, mid->size,
> >   			uc_mocs, BLT_INDIRECT_ACCESS);
> > @@ -219,7 +219,7 @@ static void surf_copy(int i915,
> >   			uc_mocs, INDIRECT_ACCESS);
> >   	blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
> > -	memset(&blt, 0, sizeof(blt));
> > +	blt_copy_init(i915, &blt);
> >   	blt.color_depth = CD_32bit;
> >   	blt.print_bb = param.print_bb;
> >   	blt_set_copy_object(&blt.src, mid);
> > @@ -310,7 +310,7 @@ static int blt_block_copy3(int i915,
> >   	bb_offset = get_offset(ahnd, blt3->bb.handle, blt3->bb.size, alignment);
> >   	/* First blit src -> mid */
> > -	memset(&blt0, 0, sizeof(blt0));
> > +	blt_copy_init(i915, &blt0);
> >   	blt0.src = blt3->src;
> >   	blt0.dst = blt3->mid;
> >   	blt0.bb = blt3->bb;
> > @@ -321,7 +321,7 @@ static int blt_block_copy3(int i915,
> >   	bb_pos = emit_blt_block_copy(i915, ahnd, &blt0, &ext0, bb_pos, false);
> >   	/* Second blit mid -> dst */
> > -	memset(&blt0, 0, sizeof(blt0));
> > +	blt_copy_init(i915, &blt0);
> >   	blt0.src = blt3->mid;
> >   	blt0.dst = blt3->dst;
> >   	blt0.bb = blt3->bb;
> > @@ -332,7 +332,7 @@ static int blt_block_copy3(int i915,
> >   	bb_pos = emit_blt_block_copy(i915, ahnd, &blt0, &ext0, bb_pos, false);
> >   	/* Third blit dst -> final */
> > -	memset(&blt0, 0, sizeof(blt0));
> > +	blt_copy_init(i915, &blt0);
> >   	blt0.src = blt3->dst;
> >   	blt0.dst = blt3->final;
> >   	blt0.bb = blt3->bb;
> > @@ -390,11 +390,13 @@ static void block_copy(int i915,
> >   	if (!blt_uses_extended_block_copy(i915))
> >   		pext = NULL;
> > -	src = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
> > +	blt_copy_init(i915, &blt);
> > +
> > +	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> >   				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> > -	mid = blt_create_object(i915, mid_region, width, height, bpp, uc_mocs,
> > +	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
> >   				mid_tiling, mid_compression, comp_type, true);
> > -	dst = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
> > +	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> >   				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> >   	igt_assert(src->size == dst->size);
> >   	PRINT_SURFACE_INFO("src", src);
> > @@ -404,7 +406,6 @@ static void block_copy(int i915,
> >   	blt_surface_fill_rect(i915, src, width, height);
> >   	WRITE_PNG(i915, run_id, "src", src, width, height);
> > -	memset(&blt, 0, sizeof(blt));
> >   	blt.color_depth = CD_32bit;
> >   	blt.print_bb = param.print_bb;
> >   	blt_set_copy_object(&blt.src, src);
> > @@ -449,7 +450,7 @@ static void block_copy(int i915,
> >   		}
> >   	}
> > -	memset(&blt, 0, sizeof(blt));
> > +	blt_copy_init(i915, &blt);
> >   	blt.color_depth = CD_32bit;
> >   	blt.print_bb = param.print_bb;
> >   	blt_set_copy_object(&blt.src, mid);
> > @@ -486,6 +487,7 @@ static void block_multicopy(int i915,
> >   			    const struct test_config *config)
> >   {
> >   	struct blt_copy3_data blt3 = {};
> > +	struct blt_copy_data blt = {};
> >   	struct blt_block_copy3_data_ext ext3 = {}, *pext3 = &ext3;
> >   	struct blt_copy_object *src, *mid, *dst, *final;
> >   	const uint32_t bpp = 32;
> > @@ -505,13 +507,16 @@ static void block_multicopy(int i915,
> >   	if (!blt_uses_extended_block_copy(i915))
> >   		pext3 = NULL;
> > -	src = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
> > +	/* For object creation */
> > +	blt_copy_init(i915, &blt);
> > +
> > +	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> >   				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> > -	mid = blt_create_object(i915, mid_region, width, height, bpp, uc_mocs,
> > +	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
> >   				mid_tiling, mid_compression, comp_type, true);
> > -	dst = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
> > +	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> >   				mid_tiling, COMPRESSION_DISABLED, comp_type, true);
> > -	final = blt_create_object(i915, region1, width, height, bpp, uc_mocs,
> > +	final = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> >   				  T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> >   	igt_assert(src->size == dst->size);
> >   	PRINT_SURFACE_INFO("src", src);
> > @@ -521,7 +526,6 @@ static void block_multicopy(int i915,
> >   	blt_surface_fill_rect(i915, src, width, height);
> > -	memset(&blt3, 0, sizeof(blt3));
> 
> We don't init blt3 because we don't need to init blt copy objects and we have
> fd/driver passed in, correct?

No, it is local to the test, which collects the data to use in
blt_copy_data.

--
Zbigniew

> 
> Overall, the patch looks good to me:
> 
> Reviewed-by: Karolina Stolarek <karolina.stolarek@intel.com>
> 
> >   	blt3.color_depth = CD_32bit;
> >   	blt3.print_bb = param.print_bb;
> >   	blt_set_copy_object(&blt3.src, src);
> > diff --git a/tests/i915/gem_exercise_blt.c b/tests/i915/gem_exercise_blt.c
> > index 0cd1820430..7355eabbe9 100644
> > --- a/tests/i915/gem_exercise_blt.c
> > +++ b/tests/i915/gem_exercise_blt.c
> > @@ -89,7 +89,7 @@ static int fast_copy_one_bb(int i915,
> >   	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> >   	/* First blit */
> > -	memset(&blt_tmp, 0, sizeof(blt_tmp));
> > +	blt_copy_init(i915, &blt_tmp);
> >   	blt_tmp.src = blt->src;
> >   	blt_tmp.dst = blt->mid;
> >   	blt_tmp.bb = blt->bb;
> > @@ -98,7 +98,7 @@ static int fast_copy_one_bb(int i915,
> >   	bb_pos = emit_blt_fast_copy(i915, ahnd, &blt_tmp, bb_pos, false);
> >   	/* Second blit */
> > -	memset(&blt_tmp, 0, sizeof(blt_tmp));
> > +	blt_copy_init(i915, &blt_tmp);
> >   	blt_tmp.src = blt->mid;
> >   	blt_tmp.dst = blt->dst;
> >   	blt_tmp.bb = blt->bb;
> > @@ -140,6 +140,7 @@ static void fast_copy_emit(int i915, const intel_ctx_t *ctx,
> >   			   uint32_t region1, uint32_t region2,
> >   			   enum blt_tiling_type mid_tiling)
> >   {
> > +	struct blt_copy_data bltinit = {};
> >   	struct blt_fast_copy_data blt = {};
> >   	struct blt_copy_object *src, *mid, *dst;
> >   	const uint32_t bpp = 32;
> > @@ -152,11 +153,12 @@ static void fast_copy_emit(int i915, const intel_ctx_t *ctx,
> >   	igt_assert(__gem_create_in_memory_regions(i915, &bb, &bb_size, region1) == 0);
> > -	src = blt_create_object(i915, region1, width, height, bpp, 0,
> > +	blt_copy_init(i915, &bltinit);
> > +	src = blt_create_object(&bltinit, region1, width, height, bpp, 0,
> >   				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> > -	mid = blt_create_object(i915, region2, width, height, bpp, 0,
> > +	mid = blt_create_object(&bltinit, region2, width, height, bpp, 0,
> >   				mid_tiling, COMPRESSION_DISABLED, 0, true);
> > -	dst = blt_create_object(i915, region1, width, height, bpp, 0,
> > +	dst = blt_create_object(&bltinit, region1, width, height, bpp, 0,
> >   				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> >   	igt_assert(src->size == dst->size);
> > @@ -212,17 +214,17 @@ static void fast_copy(int i915, const intel_ctx_t *ctx,
> >   	igt_assert(__gem_create_in_memory_regions(i915, &bb, &bb_size, region1) == 0);
> > -	src = blt_create_object(i915, region1, width, height, bpp, 0,
> > +	blt_copy_init(i915, &blt);
> > +	src = blt_create_object(&blt, region1, width, height, bpp, 0,
> >   				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> > -	mid = blt_create_object(i915, region2, width, height, bpp, 0,
> > +	mid = blt_create_object(&blt, region2, width, height, bpp, 0,
> >   				mid_tiling, COMPRESSION_DISABLED, 0, true);
> > -	dst = blt_create_object(i915, region1, width, height, bpp, 0,
> > +	dst = blt_create_object(&blt, region1, width, height, bpp, 0,
> >   				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> >   	igt_assert(src->size == dst->size);
> >   	blt_surface_fill_rect(i915, src, width, height);
> > -	memset(&blt, 0, sizeof(blt));
> >   	blt.color_depth = CD_32bit;
> >   	blt.print_bb = param.print_bb;
> >   	blt_set_copy_object(&blt.src, src);
> > @@ -235,7 +237,7 @@ static void fast_copy(int i915, const intel_ctx_t *ctx,
> >   	WRITE_PNG(i915, mid_tiling, "src", &blt.src, width, height);
> >   	WRITE_PNG(i915, mid_tiling, "mid", &blt.dst, width, height);
> > -	memset(&blt, 0, sizeof(blt));
> > +	blt_copy_init(i915, &blt);
> >   	blt.color_depth = CD_32bit;
> >   	blt.print_bb = param.print_bb;
> >   	blt_set_copy_object(&blt.src, mid);
> > diff --git a/tests/i915/gem_lmem_swapping.c b/tests/i915/gem_lmem_swapping.c
> > index 83dbebec83..2921de8f9f 100644
> > --- a/tests/i915/gem_lmem_swapping.c
> > +++ b/tests/i915/gem_lmem_swapping.c
> > @@ -308,7 +308,7 @@ init_object_ccs(int i915, struct object *obj, struct blt_copy_object *tmp,
> >   		buf[j] = seed++;
> >   	munmap(buf, obj->size);
> > -	memset(&blt, 0, sizeof(blt));
> > +	blt_copy_init(i915, &blt);
> >   	blt.color_depth = CD_32bit;
> >   	memcpy(&blt.src, tmp, sizeof(blt.src));
> > @@ -366,7 +366,7 @@ verify_object_ccs(int i915, const struct object *obj,
> >   	cmd->handle = gem_create_from_pool(i915, &size, region);
> >   	blt_set_batch(cmd, cmd->handle, size, region);
> > -	memset(&blt, 0, sizeof(blt));
> > +	blt_copy_init(i915, &blt);
> >   	blt.color_depth = CD_32bit;
> >   	memcpy(&blt.src, obj->blt_obj, sizeof(blt.src));


* Re: [igt-dev] [PATCH i-g-t v2 13/16] lib/intel_blt: Extend blitter library to support xe driver
  2023-07-07  9:26   ` Karolina Stolarek
@ 2023-07-11 10:16     ` Zbigniew Kempczyński
  2023-07-11 10:41       ` Karolina Stolarek
  0 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-11 10:16 UTC (permalink / raw)
  To: Karolina Stolarek; +Cc: igt-dev

On Fri, Jul 07, 2023 at 11:26:18AM +0200, Karolina Stolarek wrote:
> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> > Use the blitter library, already written for i915, in xe development.
> > Add appropriate code paths which are unique to those drivers.
> 
> I'm excited about this one :)
> 
> > 
> > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > ---
> >   lib/intel_blt.c | 256 ++++++++++++++++++++++++++++++++----------------
> >   lib/intel_blt.h |   2 +-
> >   2 files changed, 170 insertions(+), 88 deletions(-)
> > 
> > diff --git a/lib/intel_blt.c b/lib/intel_blt.c
> > index f2f86e4947..3eb5d45460 100644
> > --- a/lib/intel_blt.c
> > +++ b/lib/intel_blt.c
> > @@ -9,9 +9,13 @@
> >   #include <malloc.h>
> >   #include <cairo.h>
> >   #include "drm.h"
> > -#include "igt.h"
> >   #include "i915/gem_create.h"
> > +#include "igt.h"
> > +#include "igt_syncobj.h"
> >   #include "intel_blt.h"
> > +#include "xe/xe_ioctl.h"
> > +#include "xe/xe_query.h"
> > +#include "xe/xe_util.h"
> >   #define BITRANGE(start, end) (end - start + 1)
> >   #define GET_CMDS_INFO(__fd) intel_get_cmds_info(intel_get_drm_devid(__fd))
> > @@ -468,24 +472,40 @@ static int __special_mode(const struct blt_copy_data *blt)
> >   	return SM_NONE;
> >   }
> > -static int __memory_type(uint32_t region)
> > +static int __memory_type(int fd, enum intel_driver driver, uint32_t region)
> 
> This comment applies both to __memory_type() and __aux_mode(). In
> fill_data(), we unpack whatever was passed in blt and pass as three separate
> params. Could we let the functions unpack them on their own, just like in
> __special_mode()?

In __special_mode() you operate on both objects, src and dst.
__memory_type() and __aux_mode() work on a single region, so if I pass
blt to them I still won't know which region is meant and I need to pass
it as a second argument. Simple unpacking of blt just won't work unless
I also pass which object I'm working on.

> 
> Also, fill_data() changes make me wonder if my r-b for 12/16 is valid -- we
> don't set fd and driver fields in blt3 in i915 multicopy test. I think we
> should do it.

Blt3 is local to xe_ccs and it is used to build the blt.

> 
> >   {
> > -	igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
> > -		     IS_SYSTEM_MEMORY_REGION(region),
> > -		     "Invalid region: %x\n", region);
> > +	if (driver == INTEL_DRIVER_I915) {
> > +		igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
> > +			     IS_SYSTEM_MEMORY_REGION(region),
> > +			     "Invalid region: %x\n", region);
> > +	} else {
> > +		igt_assert_f(XE_IS_VRAM_MEMORY_REGION(fd, region) ||
> > +			     XE_IS_SYSMEM_MEMORY_REGION(fd, region),
> > +			     "Invalid region: %x\n", region);
> > +	}
> > -	if (IS_DEVICE_MEMORY_REGION(region))
> > +	if (driver == INTEL_DRIVER_I915 && IS_DEVICE_MEMORY_REGION(region))
> >   		return TM_LOCAL_MEM;
> > +	else if (driver == INTEL_DRIVER_XE && XE_IS_VRAM_MEMORY_REGION(fd, region))
> > +		return TM_LOCAL_MEM;
> > +
> >   	return TM_SYSTEM_MEM;
> >   }
> > -static enum blt_aux_mode __aux_mode(const struct blt_copy_object *obj)
> > +static enum blt_aux_mode __aux_mode(int fd,
> > +				    enum intel_driver driver,
> > +				    const struct blt_copy_object *obj)
> >   {
> > -	if (obj->compression == COMPRESSION_ENABLED) {
> > +	if (driver == INTEL_DRIVER_I915 && obj->compression == COMPRESSION_ENABLED) {
> >   		igt_assert_f(IS_DEVICE_MEMORY_REGION(obj->region),
> >   			     "XY_BLOCK_COPY_BLT supports compression "
> >   			     "on device memory only\n");
> >   		return AM_AUX_CCS_E;
> > +	} else if (driver == INTEL_DRIVER_XE && obj->compression == COMPRESSION_ENABLED) {
> > +		igt_assert_f(XE_IS_VRAM_MEMORY_REGION(fd, obj->region),
> > +			     "XY_BLOCK_COPY_BLT supports compression "
> > +			     "on device memory only\n");
> > +		return AM_AUX_CCS_E;
> >   	}
> >   	return AM_AUX_NONE;
> > @@ -508,9 +528,9 @@ static void fill_data(struct gen12_block_copy_data *data,
> >   	data->dw00.length = extended_command ? 20 : 10;
> >   	if (__special_mode(blt) == SM_FULL_RESOLVE)
> > -		data->dw01.dst_aux_mode = __aux_mode(&blt->src);
> > +		data->dw01.dst_aux_mode = __aux_mode(blt->fd, blt->driver, &blt->src);
> >   	else
> > -		data->dw01.dst_aux_mode = __aux_mode(&blt->dst);
> > +		data->dw01.dst_aux_mode = __aux_mode(blt->fd, blt->driver, &blt->dst);
> >   	data->dw01.dst_pitch = blt->dst.pitch - 1;
> >   	data->dw01.dst_mocs = blt->dst.mocs;
> > @@ -531,13 +551,13 @@ static void fill_data(struct gen12_block_copy_data *data,
> >   	data->dw06.dst_x_offset = blt->dst.x_offset;
> >   	data->dw06.dst_y_offset = blt->dst.y_offset;
> > -	data->dw06.dst_target_memory = __memory_type(blt->dst.region);
> > +	data->dw06.dst_target_memory = __memory_type(blt->fd, blt->driver, blt->dst.region);
> >   	data->dw07.src_x1 = blt->src.x1;
> >   	data->dw07.src_y1 = blt->src.y1;
> >   	data->dw08.src_pitch = blt->src.pitch - 1;
> > -	data->dw08.src_aux_mode = __aux_mode(&blt->src);
> > +	data->dw08.src_aux_mode = __aux_mode(blt->fd, blt->driver, &blt->src);
> >   	data->dw08.src_mocs = blt->src.mocs;
> >   	data->dw08.src_compression = blt->src.compression;
> >   	data->dw08.src_tiling = __block_tiling(blt->src.tiling);
> > @@ -550,7 +570,7 @@ static void fill_data(struct gen12_block_copy_data *data,
> >   	data->dw11.src_x_offset = blt->src.x_offset;
> >   	data->dw11.src_y_offset = blt->src.y_offset;
> > -	data->dw11.src_target_memory = __memory_type(blt->src.region);
> > +	data->dw11.src_target_memory = __memory_type(blt->fd, blt->driver, blt->src.region);
> >   }
> >   static void fill_data_ext(struct gen12_block_copy_data_ext *dext,
> > @@ -739,7 +759,10 @@ uint64_t emit_blt_block_copy(int fd,
> >   	igt_assert_f(ahnd, "block-copy supports softpin only\n");
> >   	igt_assert_f(blt, "block-copy requires data to do blit\n");
> > -	alignment = gem_detect_safe_alignment(fd);
> > +	if (blt->driver == INTEL_DRIVER_XE)
> > +		alignment = xe_get_default_alignment(fd);
> > +	else
> > +		alignment = gem_detect_safe_alignment(fd);
> 
> I see this pattern of getting the alignment repeated a couple of times, so I
> wonder if we could wrap it in a macro that switches on blt->driver? The
> argument list is the same for both functions.
> 

Agreed, there are a couple of such conditionals, so it's worth creating
a helper for this.

> >   	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
> >   		     + blt->src.plane_offset;
> >   	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
> > @@ -748,8 +771,11 @@ uint64_t emit_blt_block_copy(int fd,
> >   	fill_data(&data, blt, src_offset, dst_offset, ext);
> > -	bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
> > -				       PROT_READ | PROT_WRITE);
> > +	if (blt->driver == INTEL_DRIVER_XE)
> > +		bb = xe_bo_map(fd, blt->bb.handle, blt->bb.size);
> > +	else
> > +		bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
> > +					       PROT_READ | PROT_WRITE);
> >   	igt_assert(bb_pos + sizeof(data) < blt->bb.size);
> >   	memcpy(bb + bb_pos, &data, sizeof(data));
> > @@ -812,29 +838,38 @@ int blt_block_copy(int fd,
> >   	igt_assert_f(ahnd, "block-copy supports softpin only\n");
> >   	igt_assert_f(blt, "block-copy requires data to do blit\n");
> > +	igt_assert_neq(blt->driver, 0);
> > -	alignment = gem_detect_safe_alignment(fd);
> > +	if (blt->driver == INTEL_DRIVER_XE)
> > +		alignment = xe_get_default_alignment(fd);
> > +	else
> > +		alignment = gem_detect_safe_alignment(fd);
> >   	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> >   	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> >   	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> >   	emit_blt_block_copy(fd, ahnd, blt, ext, 0, true);
> > -	obj[0].offset = CANONICAL(dst_offset);
> > -	obj[1].offset = CANONICAL(src_offset);
> > -	obj[2].offset = CANONICAL(bb_offset);
> > -	obj[0].handle = blt->dst.handle;
> > -	obj[1].handle = blt->src.handle;
> > -	obj[2].handle = blt->bb.handle;
> > -	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> > -		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > -	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > -	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > -	execbuf.buffer_count = 3;
> > -	execbuf.buffers_ptr = to_user_pointer(obj);
> > -	execbuf.rsvd1 = ctx ? ctx->id : 0;
> > -	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> > -	ret = __gem_execbuf(fd, &execbuf);
> > +	if (blt->driver == INTEL_DRIVER_XE) {
> > +		intel_ctx_xe_exec(ctx, ahnd, CANONICAL(bb_offset));
> > +	} else {
> > +		obj[0].offset = CANONICAL(dst_offset);
> > +		obj[1].offset = CANONICAL(src_offset);
> > +		obj[2].offset = CANONICAL(bb_offset);
> > +		obj[0].handle = blt->dst.handle;
> > +		obj[1].handle = blt->src.handle;
> > +		obj[2].handle = blt->bb.handle;
> > +		obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> > +				EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +		obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +		obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +		execbuf.buffer_count = 3;
> > +		execbuf.buffers_ptr = to_user_pointer(obj);
> > +		execbuf.rsvd1 = ctx ? ctx->id : 0;
> > +		execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> > +
> > +		ret = __gem_execbuf(fd, &execbuf);
> > +	}
> >   	return ret;
> >   }
> > @@ -950,7 +985,10 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
> >   	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
> >   	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
> > -	alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
> > +	if (surf->driver == INTEL_DRIVER_XE)
> > +		alignment = max_t(uint64_t, xe_get_default_alignment(fd), 1ull << 16);
> > +	else
> > +		alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
> >   	data.dw00.client = 0x2;
> >   	data.dw00.opcode = 0x48;
> > @@ -973,8 +1011,11 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
> >   	data.dw04.dst_address_hi = dst_offset >> 32;
> >   	data.dw04.dst_mocs = surf->dst.mocs;
> > -	bb = gem_mmap__device_coherent(fd, surf->bb.handle, 0, surf->bb.size,
> > -				       PROT_READ | PROT_WRITE);
> > +	if (surf->driver == INTEL_DRIVER_XE)
> > +		bb = xe_bo_map(fd, surf->bb.handle, surf->bb.size);
> > +	else
> > +		bb = gem_mmap__device_coherent(fd, surf->bb.handle, 0, surf->bb.size,
> > +					       PROT_READ | PROT_WRITE);
> >   	igt_assert(bb_pos + sizeof(data) < surf->bb.size);
> >   	memcpy(bb + bb_pos, &data, sizeof(data));
> > @@ -1002,7 +1043,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
> >   /**
> >    * blt_ctrl_surf_copy:
> > - * @fd: drm fd
> > + * @blt: bldrm fd
> >    * @ctx: intel_ctx_t context
> >    * @e: blitter engine for @ctx
> >    * @ahnd: allocator handle
> > @@ -1026,32 +1067,41 @@ int blt_ctrl_surf_copy(int fd,
> >   	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
> >   	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
> > +	igt_assert_neq(surf->driver, 0);
> > +
> > +	if (surf->driver == INTEL_DRIVER_XE)
> > +		alignment = max_t(uint64_t, xe_get_default_alignment(fd), 1ull << 16);
> > +	else
> > +		alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
> 
> Here, if we were to implement my macro suggestion, we could have one line
> assignment, as the maximum value is the same.
> 
> > -	alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
> >   	src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
> >   	dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
> >   	bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
> >   	emit_blt_ctrl_surf_copy(fd, ahnd, surf, 0, true);
> > -	obj[0].offset = CANONICAL(dst_offset);
> > -	obj[1].offset = CANONICAL(src_offset);
> > -	obj[2].offset = CANONICAL(bb_offset);
> > -	obj[0].handle = surf->dst.handle;
> > -	obj[1].handle = surf->src.handle;
> > -	obj[2].handle = surf->bb.handle;
> > -	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> > -		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > -	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > -	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > -	execbuf.buffer_count = 3;
> > -	execbuf.buffers_ptr = to_user_pointer(obj);
> > -	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> > -	execbuf.rsvd1 = ctx ? ctx->id : 0;
> > -	gem_execbuf(fd, &execbuf);
> > -	put_offset(ahnd, surf->dst.handle);
> > -	put_offset(ahnd, surf->src.handle);
> > -	put_offset(ahnd, surf->bb.handle);
> > +	if (surf->driver == INTEL_DRIVER_XE) {
> > +		intel_ctx_xe_exec(ctx, ahnd, CANONICAL(bb_offset));
> > +	} else {
> > +		obj[0].offset = CANONICAL(dst_offset);
> > +		obj[1].offset = CANONICAL(src_offset);
> > +		obj[2].offset = CANONICAL(bb_offset);
> > +		obj[0].handle = surf->dst.handle;
> > +		obj[1].handle = surf->src.handle;
> > +		obj[2].handle = surf->bb.handle;
> > +		obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> > +				EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +		obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +		obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +		execbuf.buffer_count = 3;
> > +		execbuf.buffers_ptr = to_user_pointer(obj);
> > +		execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> > +		execbuf.rsvd1 = ctx ? ctx->id : 0;
> > +		gem_execbuf(fd, &execbuf);
> > +		put_offset(ahnd, surf->dst.handle);
> > +		put_offset(ahnd, surf->src.handle);
> > +		put_offset(ahnd, surf->bb.handle);
> > +	}
> >   	return 0;
> >   }
> > @@ -1208,7 +1258,10 @@ uint64_t emit_blt_fast_copy(int fd,
> >   	uint32_t bbe = MI_BATCH_BUFFER_END;
> >   	uint32_t *bb;
> > -	alignment = gem_detect_safe_alignment(fd);
> > +	if (blt->driver == INTEL_DRIVER_XE)
> > +		alignment = xe_get_default_alignment(fd);
> > +	else
> > +		alignment = gem_detect_safe_alignment(fd);
> >   	data.dw00.client = 0x2;
> >   	data.dw00.opcode = 0x42;
> > @@ -1218,8 +1271,8 @@ uint64_t emit_blt_fast_copy(int fd,
> >   	data.dw01.dst_pitch = blt->dst.pitch;
> >   	data.dw01.color_depth = __fast_color_depth(blt->color_depth);
> > -	data.dw01.dst_memory = __memory_type(blt->dst.region);
> > -	data.dw01.src_memory = __memory_type(blt->src.region);
> > +	data.dw01.dst_memory = __memory_type(blt->fd, blt->driver, blt->dst.region);
> > +	data.dw01.src_memory = __memory_type(blt->fd, blt->driver, blt->src.region);
> >   	data.dw01.dst_type_y = __new_tile_y_type(blt->dst.tiling) ? 1 : 0;
> >   	data.dw01.src_type_y = __new_tile_y_type(blt->src.tiling) ? 1 : 0;
> > @@ -1246,8 +1299,11 @@ uint64_t emit_blt_fast_copy(int fd,
> >   	data.dw08.src_address_lo = src_offset;
> >   	data.dw09.src_address_hi = src_offset >> 32;
> > -	bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
> > -				       PROT_READ | PROT_WRITE);
> > +	if (blt->driver == INTEL_DRIVER_XE)
> > +		bb = xe_bo_map(fd, blt->bb.handle, blt->bb.size);
> > +	else
> > +		bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
> > +					       PROT_READ | PROT_WRITE);
> >   	igt_assert(bb_pos + sizeof(data) < blt->bb.size);
> >   	memcpy(bb + bb_pos, &data, sizeof(data));
> > @@ -1297,7 +1353,14 @@ int blt_fast_copy(int fd,
> >   	uint64_t dst_offset, src_offset, bb_offset, alignment;
> >   	int ret;
> > -	alignment = gem_detect_safe_alignment(fd);
> > +	igt_assert_f(ahnd, "fast-copy supports softpin only\n");
> > +	igt_assert_f(blt, "fast-copy requires data to do fast-copy blit\n");
> > +	igt_assert_neq(blt->driver, 0);
> > +
> > +	if (blt->driver == INTEL_DRIVER_XE)
> > +		alignment = xe_get_default_alignment(fd);
> > +	else
> > +		alignment = gem_detect_safe_alignment(fd);
> >   	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> >   	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> > @@ -1305,24 +1368,28 @@ int blt_fast_copy(int fd,
> >   	emit_blt_fast_copy(fd, ahnd, blt, 0, true);
> > -	obj[0].offset = CANONICAL(dst_offset);
> > -	obj[1].offset = CANONICAL(src_offset);
> > -	obj[2].offset = CANONICAL(bb_offset);
> > -	obj[0].handle = blt->dst.handle;
> > -	obj[1].handle = blt->src.handle;
> > -	obj[2].handle = blt->bb.handle;
> > -	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> > -		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > -	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > -	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > -	execbuf.buffer_count = 3;
> > -	execbuf.buffers_ptr = to_user_pointer(obj);
> > -	execbuf.rsvd1 = ctx ? ctx->id : 0;
> > -	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> > -	ret = __gem_execbuf(fd, &execbuf);
> > -	put_offset(ahnd, blt->dst.handle);
> > -	put_offset(ahnd, blt->src.handle);
> > -	put_offset(ahnd, blt->bb.handle);
> > +	if (blt->driver == INTEL_DRIVER_XE) {
> > +		intel_ctx_xe_exec(ctx, ahnd, CANONICAL(bb_offset));
> > +	} else {
> > +		obj[0].offset = CANONICAL(dst_offset);
> > +		obj[1].offset = CANONICAL(src_offset);
> > +		obj[2].offset = CANONICAL(bb_offset);
> > +		obj[0].handle = blt->dst.handle;
> > +		obj[1].handle = blt->src.handle;
> > +		obj[2].handle = blt->bb.handle;
> > +		obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> > +				EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +		obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +		obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +		execbuf.buffer_count = 3;
> > +		execbuf.buffers_ptr = to_user_pointer(obj);
> > +		execbuf.rsvd1 = ctx ? ctx->id : 0;
> > +		execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> > +		ret = __gem_execbuf(fd, &execbuf);
> > +		put_offset(ahnd, blt->dst.handle);
> > +		put_offset(ahnd, blt->src.handle);
> > +		put_offset(ahnd, blt->bb.handle);
> > +	}
> >   	return ret;
> >   }
> > @@ -1366,16 +1433,26 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
> >   	obj = calloc(1, sizeof(*obj));
> >   	obj->size = size;
> > -	igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
> > -						  &size, region) == 0);
> > +
> > +	if (blt->driver == INTEL_DRIVER_XE) {
> > +		size = ALIGN(size, xe_get_default_alignment(blt->fd));
> > +		handle = xe_bo_create_flags(blt->fd, 0, size, region);
> > +	} else {
> > +		igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
> > +							  &size, region) == 0);
> > +	}
> >   	blt_set_object(obj, handle, size, region, mocs, tiling,
> >   		       compression, compression_type);
> >   	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
> > -	if (create_mapping)
> > -		obj->ptr = gem_mmap__device_coherent(blt->fd, handle, 0, size,
> > -						     PROT_READ | PROT_WRITE);
> > +	if (create_mapping) {
> > +		if (blt->driver == INTEL_DRIVER_XE)
> > +			obj->ptr = xe_bo_map(blt->fd, handle, size);
> > +		else
> > +			obj->ptr = gem_mmap__device_coherent(blt->fd, handle, 0, size,
> > +							     PROT_READ | PROT_WRITE);
> > +	}
> >   	return obj;
> >   }
> > @@ -1518,14 +1595,19 @@ void blt_surface_to_png(int fd, uint32_t run_id, const char *fileid,
> >   	int format;
> >   	int stride = obj->tiling ? obj->pitch * 4 : obj->pitch;
> >   	char filename[FILENAME_MAX];
> > +	bool is_xe;
> >   	snprintf(filename, FILENAME_MAX-1, "%d-%s-%s-%ux%u-%s.png",
> >   		 run_id, fileid, blt_tiling_name(obj->tiling), width, height,
> >   		 obj->compression ? "compressed" : "uncompressed");
> > -	if (!map)
> > -		map = gem_mmap__device_coherent(fd, obj->handle, 0,
> > -						obj->size, PROT_READ);
> > +	if (!map) {
> > +		if (is_xe)
> 
> is_xe doesn't seem to be set; we'll always pick "else" if the copy object
> is not mapped.

Oops, right, is_xe is just garbage now. Will fix in v3.

Thanks for the review.
--
Zbigniew
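
For reference, the v3 fix boils down to initializing is_xe before use. A compilable sketch with the IGT mapping calls replaced by stubs (the stub names are illustrative, not IGT API):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Sketch of the fix discussed above: in v2, 'bool is_xe;' in
 * blt_surface_to_png() was read without ever being assigned.  The fix
 * is to derive it from the driver before use.  The stubs below stand
 * in for xe_bo_map() and gem_mmap__device_coherent(). */

enum intel_driver {
	INTEL_DRIVER_I915 = 1,
	INTEL_DRIVER_XE = 2,
};

static const char *stub_xe_bo_map(void)
{
	return "xe_bo_map";
}

static const char *stub_gem_mmap(void)
{
	return "gem_mmap__device_coherent";
}

static const char *map_for_png(enum intel_driver driver, const char *map)
{
	/* v3: initialize is_xe instead of leaving it uninitialized. */
	bool is_xe = (driver == INTEL_DRIVER_XE);

	if (!map)
		map = is_xe ? stub_xe_bo_map() : stub_gem_mmap();

	return map;
}
```

With an initialized is_xe, the xe path is actually taken when the object is not already mapped, instead of always falling through to the i915 branch.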

> 
> All the best,
> Karolina
> 
> > +			map = xe_bo_map(fd, obj->handle, obj->size);
> > +		else
> > +			map = gem_mmap__device_coherent(fd, obj->handle, 0,
> > +							obj->size, PROT_READ);
> > +	}
> >   	format = CAIRO_FORMAT_RGB24;
> >   	surface = cairo_image_surface_create_for_data(map,
> >   						      format, width, height,
> > diff --git a/lib/intel_blt.h b/lib/intel_blt.h
> > index 7516ce8ac7..944e2b4ae7 100644
> > --- a/lib/intel_blt.h
> > +++ b/lib/intel_blt.h
> > @@ -8,7 +8,7 @@
> >   /**
> >    * SECTION:intel_blt
> > - * @short_description: i915 blitter library
> > + * @short_description: i915/xe blitter library
> >    * @title: Blitter library
> >    * @include: intel_blt.h
> >    *


* Re: [igt-dev] [PATCH i-g-t v2 11/16] lib/intel_ctx: Add xe context information
  2023-07-11  9:06     ` Zbigniew Kempczyński
@ 2023-07-11 10:38       ` Karolina Stolarek
  0 siblings, 0 replies; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-11 10:38 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 11.07.2023 11:06, Zbigniew Kempczyński wrote:
> On Fri, Jul 07, 2023 at 10:31:19AM +0200, Karolina Stolarek wrote:
>> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
>>> The most complicated part of adapting i915_blt to intel_blt - which
>>> should handle both drivers - is how to achieve pipelined execution.
>>> By pipelined execution I mean that all gpu workloads are executed
>>> without stalls.
>>>
>>> In contrast to i915 relocations and softpinning, the xe architecture
>>> moves the binding (and unbinding) responsibility from the kernel to
>>> the user via the vm_bind ioctl(). To avoid stalls the user has to
>>> provide in/out fences (syncobjs) between consecutive bindings/execs.
>>> Of course for many igt tests we don't need pipelined execution,
>>> just a synchronous bind followed by an exec. But exercising the
>>> driver should also cover pipelining, to verify it is possible to
>>> work without stalls.
>>>
>>> I decided to extend intel_ctx_t with all the objects necessary for
>>> xe (vm, engine, syncobjs) to get flexibility in deciding how to bind,
>>> execute and wait for (synchronize) those operations. The context
>>> object along with the i915 engine is already passed to the blitter
>>> library, so adding the fields xe requires doesn't break i915 but
>>> allows the xe path to get all the data necessary to execute.
>>>
>>> Using intel_ctx with xe requires some code patterns dictated by the
>>> use of the allocator. For xe the allocator now tracks alloc()/free()
>>> operations to do bind/unbind in one call just before execution.
>>> I've added two helpers to intel_ctx: intel_ctx_xe_exec()
>>> and intel_ctx_xe_sync(). Depending on how the intel_ctx was created
>>> (with 0 or real syncobj handles as in/bind/out fences), bind and exec
>>> in intel_ctx_xe_exec() are pipelined, but the last operation (exec)
>>> is synchronized. When real syncobjs are passed they are used to join
>>> the bind + exec calls, but there is no wait for exec (sync-out)
>>> completion. This allows building further cascaded bind + exec
>>> operations without stalls.
>>>
>>> To wait for a sync-out fence the caller may use intel_ctx_xe_sync(),
>>> which is a synchronous wait on the syncobj. It also allows the user
>>> to reset the fences to prepare for the next operation.
>>>
>>> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
>>> ---
>>>    lib/intel_ctx.c | 110 +++++++++++++++++++++++++++++++++++++++++++++++-
>>>    lib/intel_ctx.h |  14 ++++++
>>>    2 files changed, 123 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/intel_ctx.c b/lib/intel_ctx.c
>>> index ded9c0f1e4..f210907fac 100644
>>> --- a/lib/intel_ctx.c
>>> +++ b/lib/intel_ctx.c
>>> @@ -5,9 +5,12 @@
>>>    #include <stddef.h>
>>> +#include "i915/gem_engine_topology.h"
>>> +#include "igt_syncobj.h"
>>> +#include "intel_allocator.h"
>>>    #include "intel_ctx.h"
>>>    #include "ioctl_wrappers.h"
>>> -#include "i915/gem_engine_topology.h"
>>> +#include "xe/xe_ioctl.h"
>>>    /**
>>>     * SECTION:intel_ctx
>>> @@ -390,3 +393,108 @@ unsigned int intel_ctx_engine_class(const intel_ctx_t *ctx, unsigned int engine)
>>>    {
>>>    	return intel_ctx_cfg_engine_class(&ctx->cfg, engine);
>>>    }
>>> +
>>> +/**
>>> + * intel_ctx_xe:
>>> + * @fd: open i915 drm file descriptor
>>> + * @vm: vm
>>> + * @engine: engine
>>> + *
>>> + * Returns an intel_ctx_t representing the xe context.
>>> + */
>>> +intel_ctx_t *intel_ctx_xe(int fd, uint32_t vm, uint32_t engine,
>>> +			  uint32_t sync_in, uint32_t sync_bind, uint32_t sync_out)
>>> +{
>>> +	intel_ctx_t *ctx;
>>> +
>>> +	ctx = calloc(1, sizeof(*ctx));
>>> +	igt_assert(ctx);
>>> +
>>> +	ctx->fd = fd;
>>> +	ctx->vm = vm;
>>> +	ctx->engine = engine;
>>> +	ctx->sync_in = sync_in;
>>> +	ctx->sync_bind = sync_bind;
>>> +	ctx->sync_out = sync_out;
>>> +
>>> +	return ctx;
>>> +}
>>> +
>>> +static int __xe_exec(int fd, struct drm_xe_exec *exec)
>>> +{
>>> +	int err = 0;
>>> +
>>> +	if (igt_ioctl(fd, DRM_IOCTL_XE_EXEC, exec)) {
>>> +		err = -errno;
>>> +		igt_assume(err != 0);
>>
>> Wouldn't "igt_assume(err)" be enough?
>>
>>> +	}
>>> +	errno = 0;
>>> +	return err;
>>> +}
>>
>> I'm aware that it's a helper you use in other execs, but it feels out of
>> place; it doesn't deal with intel_ctx_t. Maybe xe_util could be its new
>> home?
>>
> 
> I'm going to just export __xe_exec() from xe_ioctl.c.

That's an even better idea, thanks
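
For reference, the error-capture pattern that __xe_exec() uses (return -errno on failure, then clear errno) can be shown in a self-contained sketch, with the real DRM_IOCTL_XE_EXEC ioctl replaced by a stub:

```c
#include <errno.h>
#include <stddef.h>

/* Sketch of the __xe_exec()-style wrapper pattern.  The stub below
 * stands in for igt_ioctl(fd, DRM_IOCTL_XE_EXEC, exec); its extra
 * fail_errno parameter only exists so the failure path can be
 * exercised here. */
static int stub_ioctl(int fd, unsigned long req, void *arg, int fail_errno)
{
	(void)fd;
	(void)req;
	(void)arg;

	if (fail_errno) {
		errno = fail_errno;
		return -1;
	}
	return 0;
}

static int wrapped_exec(int fd, void *exec, int fail_errno)
{
	int err = 0;

	if (stub_ioctl(fd, 0, exec, fail_errno))
		err = -errno;	/* capture errno right after the ioctl */

	errno = 0;		/* don't leak errno to the caller */
	return err;
}
```

Exporting one such wrapper from xe_ioctl.c keeps the errno handling in a single place instead of duplicating it per caller.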

> 
>>> +
>>> +int __intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset)
>>> +{
>>> +	struct drm_xe_sync syncs[2] = {
>>> +		{ .flags = DRM_XE_SYNC_SYNCOBJ, },
>>> +		{ .flags = DRM_XE_SYNC_SYNCOBJ | DRM_XE_SYNC_SIGNAL, },
>>> +	};
>>> +	struct drm_xe_exec exec = {
>>> +		.engine_id = ctx->engine,
>>> +		.syncs = (uintptr_t)syncs,
>>> +		.num_syncs = 2,
>>> +		.address = bb_offset,
>>> +		.num_batch_buffer = 1,
>>> +	};
>>> +	uint32_t sync_in = ctx->sync_in;
>>> +	uint32_t sync_bind = ctx->sync_bind ?: syncobj_create(ctx->fd, 0);
>>> +	uint32_t sync_out = ctx->sync_out ?: syncobj_create(ctx->fd, 0);
>>> +	int ret;
>>> +
>>> +	/* Synchronize allocator state -> vm */
>>> +	intel_allocator_bind(ahnd, sync_in, sync_bind);
>>> +
>>> +	/* Pipelined exec */
>>> +	syncs[0].handle = sync_bind;
>>> +	syncs[1].handle = sync_out;
>>> +
>>> +	ret = __xe_exec(ctx->fd, &exec);
>>> +	if (ret)
>>> +		goto err;
>>> +
>>> +	if (!ctx->sync_bind || !ctx->sync_out)
>>> +		syncobj_wait_err(ctx->fd, &sync_out, 1, INT64_MAX, 0);
>>
>> This whole flow is so nice and tidy, I like it
>>
>>> +
>>> +err:
>>> +	if (!ctx->sync_bind)
>>> +		syncobj_destroy(ctx->fd, sync_bind);
>>> +
>>> +	if (!ctx->sync_out)
>>> +		syncobj_destroy(ctx->fd, sync_out);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +void intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset)
>>> +{
>>> +	igt_assert_eq(__intel_ctx_xe_exec(ctx, ahnd, bb_offset), 0);
>>> +}
>>> +
>>> +#define RESET_SYNCOBJ(__fd, __sync) do { \
>>> +	if (__sync) \
>>> +		syncobj_reset((__fd), &(__sync), 1); \
>>> +} while (0)
>>> +
>>> +int intel_ctx_xe_sync(intel_ctx_t *ctx, bool reset_syncs)
>>> +{
>>> +	int ret;
>>> +
>>> +	ret = syncobj_wait_err(ctx->fd, &ctx->sync_out, 1, INT64_MAX, 0);
>>> +
>>> +	if (reset_syncs) {
>>> +		RESET_SYNCOBJ(ctx->fd, ctx->sync_in);
>>> +		RESET_SYNCOBJ(ctx->fd, ctx->sync_bind);
>>> +		RESET_SYNCOBJ(ctx->fd, ctx->sync_out);
>>> +	}
>>
>> Is there a use case where we want to do a synced execution without
>> resetting the syncobjs?
>>
> 
> I don't know - that's why I left the decision to the user.

:) If no client for this functionality turns up, we could drop it 
in the future

> 
>>> +
>>> +	return ret;
>>> +}
>>> diff --git a/lib/intel_ctx.h b/lib/intel_ctx.h
>>> index 3cfeaae81e..59d0360ada 100644
>>> --- a/lib/intel_ctx.h
>>> +++ b/lib/intel_ctx.h
>>> @@ -67,6 +67,14 @@ int intel_ctx_cfg_engine_class(const intel_ctx_cfg_t *cfg, unsigned int engine);
>>>    typedef struct intel_ctx {
>>>    	uint32_t id;
>>>    	intel_ctx_cfg_t cfg;
>>> +
>>> +	/* Xe */
>>> +	int fd;
>>> +	uint32_t vm;
>>> +	uint32_t engine;
>>> +	uint32_t sync_in;
>>> +	uint32_t sync_bind;
>>> +	uint32_t sync_out;
>>
>> Hmm, I wonder if we could wrap it in a struct. Yes, it would be painful to
>> unpack, but now it feels like we've just added a bunch of fields that are
>> irrelevant 80% of the time. Instead, we could have one additional field that
>> could be NULL, and use it if it's initialized.
>> But maybe I'm just being too nit-picky.
> 
> I considered introducing a union of i915 and xe structs, but I would
> need to rewrite almost all igts which use this struct, so I dropped
> the idea. At the moment I need to handle both drivers, so mixing the
> fields is not a big pain imo.

Oh my, that would be painful indeed. I mean, I won't nack the mixed 
fields, but I just find it strange to leave them in the open when we're 
testing i915 stuff.

All the best,
Karolina

> 
> --
> Zbigniew
> 
>>
>> All the best,
>> Karolina
>>
>>>    } intel_ctx_t;
>>>    int __intel_ctx_create(int fd, const intel_ctx_cfg_t *cfg,
>>> @@ -81,4 +89,10 @@ void intel_ctx_destroy(int fd, const intel_ctx_t *ctx);
>>>    unsigned int intel_ctx_engine_class(const intel_ctx_t *ctx, unsigned int engine);
>>> +intel_ctx_t *intel_ctx_xe(int fd, uint32_t vm, uint32_t engine,
>>> +			  uint32_t sync_in, uint32_t sync_bind, uint32_t sync_out);
>>> +int __intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset);
>>> +void intel_ctx_xe_exec(const intel_ctx_t *ctx, uint64_t ahnd, uint64_t bb_offset);
>>> +int intel_ctx_xe_sync(intel_ctx_t *ctx, bool reset_syncs);
>>> +
>>>    #endif
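
The syncobj plumbing described in the commit message above can be modelled in a self-contained sketch (illustrative only - fake handles, no real driver calls): a handle of 0 means the caller did not supply that fence, in which case the helper creates its own and must synchronize the exec itself, mirroring the `!ctx->sync_bind || !ctx->sync_out` check in __intel_ctx_xe_exec().

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the fence plumbing in __intel_ctx_xe_exec(), not the
 * IGT implementation.  bind_out joins the vm_bind to the exec;
 * exec_out is the sync-out fence of the exec itself. */
struct xe_fence_plan {
	uint32_t bind_in;	/* waited on before the vm_bind */
	uint32_t bind_out;	/* signalled by bind, waited on by exec */
	uint32_t exec_out;	/* signalled by exec */
	bool wait_exec;		/* synchronous wait on exec_out needed */
	bool owns_bind;		/* helper created bind_out itself */
	bool owns_out;		/* helper created exec_out itself */
};

/* Stand-in for syncobj_create(): hands out arbitrary fake handles. */
static uint32_t fake_syncobj_create(void)
{
	static uint32_t next = 100;

	return next++;
}

static struct xe_fence_plan
plan_exec(uint32_t sync_in, uint32_t sync_bind, uint32_t sync_out)
{
	struct xe_fence_plan p;

	p.bind_in = sync_in;
	p.owns_bind = !sync_bind;
	p.owns_out = !sync_out;
	p.bind_out = sync_bind ? sync_bind : fake_syncobj_create();
	p.exec_out = sync_out ? sync_out : fake_syncobj_create();
	/* Without caller-provided fences there is nothing the next
	 * operation could chain on, so the exec must be waited on
	 * here; with real syncobjs the pipeline stays unstalled. */
	p.wait_exec = !sync_bind || !sync_out;

	return p;
}
```

Callers that pass real syncobj handles get `wait_exec == false` and can keep cascading bind + exec pairs, synchronizing later via intel_ctx_xe_sync().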


* Re: [igt-dev] [PATCH i-g-t v2 13/16] lib/intel_blt: Extend blitter library to support xe driver
  2023-07-11 10:16     ` Zbigniew Kempczyński
@ 2023-07-11 10:41       ` Karolina Stolarek
  0 siblings, 0 replies; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-11 10:41 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 11.07.2023 12:16, Zbigniew Kempczyński wrote:
> On Fri, Jul 07, 2023 at 11:26:18AM +0200, Karolina Stolarek wrote:
>> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
>>> Reuse the blitter library already written for i915 in xe development.
>>> Add appropriate code paths for the parts that are unique to each driver.
>>
>> I'm excited about this one :)
>>
>>>
>>> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
>>> ---
>>>    lib/intel_blt.c | 256 ++++++++++++++++++++++++++++++++----------------
>>>    lib/intel_blt.h |   2 +-
>>>    2 files changed, 170 insertions(+), 88 deletions(-)
>>>
>>> diff --git a/lib/intel_blt.c b/lib/intel_blt.c
>>> index f2f86e4947..3eb5d45460 100644
>>> --- a/lib/intel_blt.c
>>> +++ b/lib/intel_blt.c
>>> @@ -9,9 +9,13 @@
>>>    #include <malloc.h>
>>>    #include <cairo.h>
>>>    #include "drm.h"
>>> -#include "igt.h"
>>>    #include "i915/gem_create.h"
>>> +#include "igt.h"
>>> +#include "igt_syncobj.h"
>>>    #include "intel_blt.h"
>>> +#include "xe/xe_ioctl.h"
>>> +#include "xe/xe_query.h"
>>> +#include "xe/xe_util.h"
>>>    #define BITRANGE(start, end) (end - start + 1)
>>>    #define GET_CMDS_INFO(__fd) intel_get_cmds_info(intel_get_drm_devid(__fd))
>>> @@ -468,24 +472,40 @@ static int __special_mode(const struct blt_copy_data *blt)
>>>    	return SM_NONE;
>>>    }
>>> -static int __memory_type(uint32_t region)
>>> +static int __memory_type(int fd, enum intel_driver driver, uint32_t region)
>>
>> This comment applies both to __memory_type() and __aux_mode(). In
>> fill_data(), we unpack whatever was passed in blt and pass as three separate
>> params. Could we let the functions unpack them on their own, just like in
>> __special_mode()?
> 
> In __special_mode() you operate on both objects, src and dst.
> __memory_type() and __aux_mode() work on a single region, so if I pass
> blt to them I still won't know which region is meant and I need to
> pass it as a second argument. So simple unpacking of blt just won't
> work unless I also pass which object I'm working on.

Right, let's leave it as it is then.

> 
>>
>> Also, the fill_data() changes make me wonder if my r-b for 12/16 is still
>> valid -- we don't set the fd and driver fields in blt3 in the i915
>> multicopy test. I think we should do it.
> 
> Blt3 is local to xe_ccs and it is used to build the blt.

Yeah, now I'm seeing it, not sure why I didn't a couple days ago...

Many thanks,
Karolina

>  >>
>>>    {
>>> -	igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
>>> -		     IS_SYSTEM_MEMORY_REGION(region),
>>> -		     "Invalid region: %x\n", region);
>>> +	if (driver == INTEL_DRIVER_I915) {
>>> +		igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
>>> +			     IS_SYSTEM_MEMORY_REGION(region),
>>> +			     "Invalid region: %x\n", region);
>>> +	} else {
>>> +		igt_assert_f(XE_IS_VRAM_MEMORY_REGION(fd, region) ||
>>> +			     XE_IS_SYSMEM_MEMORY_REGION(fd, region),
>>> +			     "Invalid region: %x\n", region);
>>> +	}
>>> -	if (IS_DEVICE_MEMORY_REGION(region))
>>> +	if (driver == INTEL_DRIVER_I915 && IS_DEVICE_MEMORY_REGION(region))
>>>    		return TM_LOCAL_MEM;
>>> +	else if (driver == INTEL_DRIVER_XE && XE_IS_VRAM_MEMORY_REGION(fd, region))
>>> +		return TM_LOCAL_MEM;
>>> +
>>>    	return TM_SYSTEM_MEM;
>>>    }
>>> -static enum blt_aux_mode __aux_mode(const struct blt_copy_object *obj)
>>> +static enum blt_aux_mode __aux_mode(int fd,
>>> +				    enum intel_driver driver,
>>> +				    const struct blt_copy_object *obj)
>>>    {
>>> -	if (obj->compression == COMPRESSION_ENABLED) {
>>> +	if (driver == INTEL_DRIVER_I915 && obj->compression == COMPRESSION_ENABLED) {
>>>    		igt_assert_f(IS_DEVICE_MEMORY_REGION(obj->region),
>>>    			     "XY_BLOCK_COPY_BLT supports compression "
>>>    			     "on device memory only\n");
>>>    		return AM_AUX_CCS_E;
>>> +	} else if (driver == INTEL_DRIVER_XE && obj->compression == COMPRESSION_ENABLED) {
>>> +		igt_assert_f(XE_IS_VRAM_MEMORY_REGION(fd, obj->region),
>>> +			     "XY_BLOCK_COPY_BLT supports compression "
>>> +			     "on device memory only\n");
>>> +		return AM_AUX_CCS_E;
>>>    	}
>>>    	return AM_AUX_NONE;
>>> @@ -508,9 +528,9 @@ static void fill_data(struct gen12_block_copy_data *data,
>>>    	data->dw00.length = extended_command ? 20 : 10;
>>>    	if (__special_mode(blt) == SM_FULL_RESOLVE)
>>> -		data->dw01.dst_aux_mode = __aux_mode(&blt->src);
>>> +		data->dw01.dst_aux_mode = __aux_mode(blt->fd, blt->driver, &blt->src);
>>>    	else
>>> -		data->dw01.dst_aux_mode = __aux_mode(&blt->dst);
>>> +		data->dw01.dst_aux_mode = __aux_mode(blt->fd, blt->driver, &blt->dst);
>>>    	data->dw01.dst_pitch = blt->dst.pitch - 1;
>>>    	data->dw01.dst_mocs = blt->dst.mocs;
>>> @@ -531,13 +551,13 @@ static void fill_data(struct gen12_block_copy_data *data,
>>>    	data->dw06.dst_x_offset = blt->dst.x_offset;
>>>    	data->dw06.dst_y_offset = blt->dst.y_offset;
>>> -	data->dw06.dst_target_memory = __memory_type(blt->dst.region);
>>> +	data->dw06.dst_target_memory = __memory_type(blt->fd, blt->driver, blt->dst.region);
>>>    	data->dw07.src_x1 = blt->src.x1;
>>>    	data->dw07.src_y1 = blt->src.y1;
>>>    	data->dw08.src_pitch = blt->src.pitch - 1;
>>> -	data->dw08.src_aux_mode = __aux_mode(&blt->src);
>>> +	data->dw08.src_aux_mode = __aux_mode(blt->fd, blt->driver, &blt->src);
>>>    	data->dw08.src_mocs = blt->src.mocs;
>>>    	data->dw08.src_compression = blt->src.compression;
>>>    	data->dw08.src_tiling = __block_tiling(blt->src.tiling);
>>> @@ -550,7 +570,7 @@ static void fill_data(struct gen12_block_copy_data *data,
>>>    	data->dw11.src_x_offset = blt->src.x_offset;
>>>    	data->dw11.src_y_offset = blt->src.y_offset;
>>> -	data->dw11.src_target_memory = __memory_type(blt->src.region);
>>> +	data->dw11.src_target_memory = __memory_type(blt->fd, blt->driver, blt->src.region);
>>>    }
>>>    static void fill_data_ext(struct gen12_block_copy_data_ext *dext,
>>> @@ -739,7 +759,10 @@ uint64_t emit_blt_block_copy(int fd,
>>>    	igt_assert_f(ahnd, "block-copy supports softpin only\n");
>>>    	igt_assert_f(blt, "block-copy requires data to do blit\n");
>>> -	alignment = gem_detect_safe_alignment(fd);
>>> +	if (blt->driver == INTEL_DRIVER_XE)
>>> +		alignment = xe_get_default_alignment(fd);
>>> +	else
>>> +		alignment = gem_detect_safe_alignment(fd);
>>
>> I see this pattern of getting the alignment repeated a couple of times, so I
>> wonder if we could wrap it in a macro that switches on blt->driver? The
>> argument list is the same for both functions.
>>
> 
> Agreed, there are a couple of such conditionals, so it's worth creating
> a helper for this.
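
One possible shape for such a helper - a compilable sketch, not the actual IGT code; the stubbed alignment values below are assumptions standing in for what xe_get_default_alignment() and gem_detect_safe_alignment() would return:

```c
#include <stdint.h>

/* Sketch of the alignment helper suggested above: one place that
 * switches on the driver instead of repeating the conditional at
 * every call site. */

enum intel_driver {
	INTEL_DRIVER_I915 = 1,
	INTEL_DRIVER_XE = 2,
};

/* Stubs for the real IGT queries; the values are assumptions. */
static uint64_t stub_xe_get_default_alignment(int fd)
{
	(void)fd;
	return 0x10000;	/* assumed: 64K alignment on xe */
}

static uint64_t stub_gem_detect_safe_alignment(int fd)
{
	(void)fd;
	return 0x1000;	/* assumed: 4K safe alignment on i915 */
}

static uint64_t blt_get_alignment(int fd, enum intel_driver driver)
{
	return driver == INTEL_DRIVER_XE ?
	       stub_xe_get_default_alignment(fd) :
	       stub_gem_detect_safe_alignment(fd);
}
```

The ctrl-surf-copy sites that clamp to at least 64K could then do `max_t(uint64_t, blt_get_alignment(fd, driver), 1ull << 16)` in a single line, as noted later in this review.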
> 
>>>    	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment)
>>>    		     + blt->src.plane_offset;
>>>    	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment)
>>> @@ -748,8 +771,11 @@ uint64_t emit_blt_block_copy(int fd,
>>>    	fill_data(&data, blt, src_offset, dst_offset, ext);
>>> -	bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
>>> -				       PROT_READ | PROT_WRITE);
>>> +	if (blt->driver == INTEL_DRIVER_XE)
>>> +		bb = xe_bo_map(fd, blt->bb.handle, blt->bb.size);
>>> +	else
>>> +		bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
>>> +					       PROT_READ | PROT_WRITE);
>>>    	igt_assert(bb_pos + sizeof(data) < blt->bb.size);
>>>    	memcpy(bb + bb_pos, &data, sizeof(data));
>>> @@ -812,29 +838,38 @@ int blt_block_copy(int fd,
>>>    	igt_assert_f(ahnd, "block-copy supports softpin only\n");
>>>    	igt_assert_f(blt, "block-copy requires data to do blit\n");
>>> +	igt_assert_neq(blt->driver, 0);
>>> -	alignment = gem_detect_safe_alignment(fd);
>>> +	if (blt->driver == INTEL_DRIVER_XE)
>>> +		alignment = xe_get_default_alignment(fd);
>>> +	else
>>> +		alignment = gem_detect_safe_alignment(fd);
>>>    	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
>>>    	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
>>>    	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>>>    	emit_blt_block_copy(fd, ahnd, blt, ext, 0, true);
>>> -	obj[0].offset = CANONICAL(dst_offset);
>>> -	obj[1].offset = CANONICAL(src_offset);
>>> -	obj[2].offset = CANONICAL(bb_offset);
>>> -	obj[0].handle = blt->dst.handle;
>>> -	obj[1].handle = blt->src.handle;
>>> -	obj[2].handle = blt->bb.handle;
>>> -	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
>>> -		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> -	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> -	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> -	execbuf.buffer_count = 3;
>>> -	execbuf.buffers_ptr = to_user_pointer(obj);
>>> -	execbuf.rsvd1 = ctx ? ctx->id : 0;
>>> -	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
>>> -	ret = __gem_execbuf(fd, &execbuf);
>>> +	if (blt->driver == INTEL_DRIVER_XE) {
>>> +		intel_ctx_xe_exec(ctx, ahnd, CANONICAL(bb_offset));
>>> +	} else {
>>> +		obj[0].offset = CANONICAL(dst_offset);
>>> +		obj[1].offset = CANONICAL(src_offset);
>>> +		obj[2].offset = CANONICAL(bb_offset);
>>> +		obj[0].handle = blt->dst.handle;
>>> +		obj[1].handle = blt->src.handle;
>>> +		obj[2].handle = blt->bb.handle;
>>> +		obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
>>> +				EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> +		obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> +		obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> +		execbuf.buffer_count = 3;
>>> +		execbuf.buffers_ptr = to_user_pointer(obj);
>>> +		execbuf.rsvd1 = ctx ? ctx->id : 0;
>>> +		execbuf.flags = e ? e->flags : I915_EXEC_BLT;
>>> +
>>> +		ret = __gem_execbuf(fd, &execbuf);
>>> +	}
>>>    	return ret;
>>>    }
>>> @@ -950,7 +985,10 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>>>    	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
>>>    	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
>>> -	alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
>>> +	if (surf->driver == INTEL_DRIVER_XE)
>>> +		alignment = max_t(uint64_t, xe_get_default_alignment(fd), 1ull << 16);
>>> +	else
>>> +		alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
>>>    	data.dw00.client = 0x2;
>>>    	data.dw00.opcode = 0x48;
>>> @@ -973,8 +1011,11 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>>>    	data.dw04.dst_address_hi = dst_offset >> 32;
>>>    	data.dw04.dst_mocs = surf->dst.mocs;
>>> -	bb = gem_mmap__device_coherent(fd, surf->bb.handle, 0, surf->bb.size,
>>> -				       PROT_READ | PROT_WRITE);
>>> +	if (surf->driver == INTEL_DRIVER_XE)
>>> +		bb = xe_bo_map(fd, surf->bb.handle, surf->bb.size);
>>> +	else
>>> +		bb = gem_mmap__device_coherent(fd, surf->bb.handle, 0, surf->bb.size,
>>> +					       PROT_READ | PROT_WRITE);
>>>    	igt_assert(bb_pos + sizeof(data) < surf->bb.size);
>>>    	memcpy(bb + bb_pos, &data, sizeof(data));
>>> @@ -1002,7 +1043,7 @@ uint64_t emit_blt_ctrl_surf_copy(int fd,
>>>    /**
>>>     * blt_ctrl_surf_copy:
>>> - * @fd: drm fd
>>> + * @blt: bldrm fd
>>>     * @ctx: intel_ctx_t context
>>>     * @e: blitter engine for @ctx
>>>     * @ahnd: allocator handle
>>> @@ -1026,32 +1067,41 @@ int blt_ctrl_surf_copy(int fd,
>>>    	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
>>>    	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
>>> +	igt_assert_neq(surf->driver, 0);
>>> +
>>> +	if (surf->driver == INTEL_DRIVER_XE)
>>> +		alignment = max_t(uint64_t, xe_get_default_alignment(fd), 1ull << 16);
>>> +	else
>>> +		alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
>>
>> Here, if we were to implement my macro suggestion, we could have one line
>> assignment, as the maximum value is the same.
>>
>>> -	alignment = max_t(uint64_t, gem_detect_safe_alignment(fd), 1ull << 16);
>>>    	src_offset = get_offset(ahnd, surf->src.handle, surf->src.size, alignment);
>>>    	dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size, alignment);
>>>    	bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size, alignment);
>>>    	emit_blt_ctrl_surf_copy(fd, ahnd, surf, 0, true);
>>> -	obj[0].offset = CANONICAL(dst_offset);
>>> -	obj[1].offset = CANONICAL(src_offset);
>>> -	obj[2].offset = CANONICAL(bb_offset);
>>> -	obj[0].handle = surf->dst.handle;
>>> -	obj[1].handle = surf->src.handle;
>>> -	obj[2].handle = surf->bb.handle;
>>> -	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
>>> -		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> -	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> -	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> -	execbuf.buffer_count = 3;
>>> -	execbuf.buffers_ptr = to_user_pointer(obj);
>>> -	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
>>> -	execbuf.rsvd1 = ctx ? ctx->id : 0;
>>> -	gem_execbuf(fd, &execbuf);
>>> -	put_offset(ahnd, surf->dst.handle);
>>> -	put_offset(ahnd, surf->src.handle);
>>> -	put_offset(ahnd, surf->bb.handle);
>>> +	if (surf->driver == INTEL_DRIVER_XE) {
>>> +		intel_ctx_xe_exec(ctx, ahnd, CANONICAL(bb_offset));
>>> +	} else {
>>> +		obj[0].offset = CANONICAL(dst_offset);
>>> +		obj[1].offset = CANONICAL(src_offset);
>>> +		obj[2].offset = CANONICAL(bb_offset);
>>> +		obj[0].handle = surf->dst.handle;
>>> +		obj[1].handle = surf->src.handle;
>>> +		obj[2].handle = surf->bb.handle;
>>> +		obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
>>> +				EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> +		obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> +		obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> +		execbuf.buffer_count = 3;
>>> +		execbuf.buffers_ptr = to_user_pointer(obj);
>>> +		execbuf.flags = e ? e->flags : I915_EXEC_BLT;
>>> +		execbuf.rsvd1 = ctx ? ctx->id : 0;
>>> +		gem_execbuf(fd, &execbuf);
>>> +		put_offset(ahnd, surf->dst.handle);
>>> +		put_offset(ahnd, surf->src.handle);
>>> +		put_offset(ahnd, surf->bb.handle);
>>> +	}
>>>    	return 0;
>>>    }
>>> @@ -1208,7 +1258,10 @@ uint64_t emit_blt_fast_copy(int fd,
>>>    	uint32_t bbe = MI_BATCH_BUFFER_END;
>>>    	uint32_t *bb;
>>> -	alignment = gem_detect_safe_alignment(fd);
>>> +	if (blt->driver == INTEL_DRIVER_XE)
>>> +		alignment = xe_get_default_alignment(fd);
>>> +	else
>>> +		alignment = gem_detect_safe_alignment(fd);
>>>    	data.dw00.client = 0x2;
>>>    	data.dw00.opcode = 0x42;
>>> @@ -1218,8 +1271,8 @@ uint64_t emit_blt_fast_copy(int fd,
>>>    	data.dw01.dst_pitch = blt->dst.pitch;
>>>    	data.dw01.color_depth = __fast_color_depth(blt->color_depth);
>>> -	data.dw01.dst_memory = __memory_type(blt->dst.region);
>>> -	data.dw01.src_memory = __memory_type(blt->src.region);
>>> +	data.dw01.dst_memory = __memory_type(blt->fd, blt->driver, blt->dst.region);
>>> +	data.dw01.src_memory = __memory_type(blt->fd, blt->driver, blt->src.region);
>>>    	data.dw01.dst_type_y = __new_tile_y_type(blt->dst.tiling) ? 1 : 0;
>>>    	data.dw01.src_type_y = __new_tile_y_type(blt->src.tiling) ? 1 : 0;
>>> @@ -1246,8 +1299,11 @@ uint64_t emit_blt_fast_copy(int fd,
>>>    	data.dw08.src_address_lo = src_offset;
>>>    	data.dw09.src_address_hi = src_offset >> 32;
>>> -	bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
>>> -				       PROT_READ | PROT_WRITE);
>>> +	if (blt->driver == INTEL_DRIVER_XE)
>>> +		bb = xe_bo_map(fd, blt->bb.handle, blt->bb.size);
>>> +	else
>>> +		bb = gem_mmap__device_coherent(fd, blt->bb.handle, 0, blt->bb.size,
>>> +					       PROT_READ | PROT_WRITE);
>>>    	igt_assert(bb_pos + sizeof(data) < blt->bb.size);
>>>    	memcpy(bb + bb_pos, &data, sizeof(data));
>>> @@ -1297,7 +1353,14 @@ int blt_fast_copy(int fd,
>>>    	uint64_t dst_offset, src_offset, bb_offset, alignment;
>>>    	int ret;
>>> -	alignment = gem_detect_safe_alignment(fd);
>>> +	igt_assert_f(ahnd, "fast-copy supports softpin only\n");
>>> +	igt_assert_f(blt, "fast-copy requires data to do fast-copy blit\n");
>>> +	igt_assert_neq(blt->driver, 0);
>>> +
>>> +	if (blt->driver == INTEL_DRIVER_XE)
>>> +		alignment = xe_get_default_alignment(fd);
>>> +	else
>>> +		alignment = gem_detect_safe_alignment(fd);
>>>    	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
>>>    	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
>>> @@ -1305,24 +1368,28 @@ int blt_fast_copy(int fd,
>>>    	emit_blt_fast_copy(fd, ahnd, blt, 0, true);
>>> -	obj[0].offset = CANONICAL(dst_offset);
>>> -	obj[1].offset = CANONICAL(src_offset);
>>> -	obj[2].offset = CANONICAL(bb_offset);
>>> -	obj[0].handle = blt->dst.handle;
>>> -	obj[1].handle = blt->src.handle;
>>> -	obj[2].handle = blt->bb.handle;
>>> -	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
>>> -		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> -	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> -	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> -	execbuf.buffer_count = 3;
>>> -	execbuf.buffers_ptr = to_user_pointer(obj);
>>> -	execbuf.rsvd1 = ctx ? ctx->id : 0;
>>> -	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
>>> -	ret = __gem_execbuf(fd, &execbuf);
>>> -	put_offset(ahnd, blt->dst.handle);
>>> -	put_offset(ahnd, blt->src.handle);
>>> -	put_offset(ahnd, blt->bb.handle);
>>> +	if (blt->driver == INTEL_DRIVER_XE) {
>>> +		intel_ctx_xe_exec(ctx, ahnd, CANONICAL(bb_offset));
>>> +	} else {
>>> +		obj[0].offset = CANONICAL(dst_offset);
>>> +		obj[1].offset = CANONICAL(src_offset);
>>> +		obj[2].offset = CANONICAL(bb_offset);
>>> +		obj[0].handle = blt->dst.handle;
>>> +		obj[1].handle = blt->src.handle;
>>> +		obj[2].handle = blt->bb.handle;
>>> +		obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
>>> +				EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> +		obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> +		obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>>> +		execbuf.buffer_count = 3;
>>> +		execbuf.buffers_ptr = to_user_pointer(obj);
>>> +		execbuf.rsvd1 = ctx ? ctx->id : 0;
>>> +		execbuf.flags = e ? e->flags : I915_EXEC_BLT;
>>> +		ret = __gem_execbuf(fd, &execbuf);
>>> +		put_offset(ahnd, blt->dst.handle);
>>> +		put_offset(ahnd, blt->src.handle);
>>> +		put_offset(ahnd, blt->bb.handle);
>>> +	}
>>>    	return ret;
>>>    }
>>> @@ -1366,16 +1433,26 @@ blt_create_object(const struct blt_copy_data *blt, uint32_t region,
>>>    	obj = calloc(1, sizeof(*obj));
>>>    	obj->size = size;
>>> -	igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
>>> -						  &size, region) == 0);
>>> +
>>> +	if (blt->driver == INTEL_DRIVER_XE) {
>>> +		size = ALIGN(size, xe_get_default_alignment(blt->fd));
>>> +		handle = xe_bo_create_flags(blt->fd, 0, size, region);
>>> +	} else {
>>> +		igt_assert(__gem_create_in_memory_regions(blt->fd, &handle,
>>> +							  &size, region) == 0);
>>> +	}
>>>    	blt_set_object(obj, handle, size, region, mocs, tiling,
>>>    		       compression, compression_type);
>>>    	blt_set_geom(obj, stride, 0, 0, width, height, 0, 0);
>>> -	if (create_mapping)
>>> -		obj->ptr = gem_mmap__device_coherent(blt->fd, handle, 0, size,
>>> -						     PROT_READ | PROT_WRITE);
>>> +	if (create_mapping) {
>>> +		if (blt->driver == INTEL_DRIVER_XE)
>>> +			obj->ptr = xe_bo_map(blt->fd, handle, size);
>>> +		else
>>> +			obj->ptr = gem_mmap__device_coherent(blt->fd, handle, 0, size,
>>> +							     PROT_READ | PROT_WRITE);
>>> +	}
>>>    	return obj;
>>>    }
>>> @@ -1518,14 +1595,19 @@ void blt_surface_to_png(int fd, uint32_t run_id, const char *fileid,
>>>    	int format;
>>>    	int stride = obj->tiling ? obj->pitch * 4 : obj->pitch;
>>>    	char filename[FILENAME_MAX];
>>> +	bool is_xe;
>>>    	snprintf(filename, FILENAME_MAX-1, "%d-%s-%s-%ux%u-%s.png",
>>>    		 run_id, fileid, blt_tiling_name(obj->tiling), width, height,
>>>    		 obj->compression ? "compressed" : "uncompressed");
>>> -	if (!map)
>>> -		map = gem_mmap__device_coherent(fd, obj->handle, 0,
>>> -						obj->size, PROT_READ);
>>> +	if (!map) {
>>> +		if (is_xe)
>>
>> is_xe doesn't seem to be set, we'll always pick "else" if copy object is not
>> mapped.
> 
> Oops, right, is_xe is just garbage now. Will fix in v3.
> 
> Thanks for the review.
> --
> Zbigniew
> 
>>
>> All the best,
>> Karolina
>>
>>> +			map = xe_bo_map(fd, obj->handle, obj->size);
>>> +		else
>>> +			map = gem_mmap__device_coherent(fd, obj->handle, 0,
>>> +							obj->size, PROT_READ);
>>> +	}
>>>    	format = CAIRO_FORMAT_RGB24;
>>>    	surface = cairo_image_surface_create_for_data(map,
>>>    						      format, width, height,
>>> diff --git a/lib/intel_blt.h b/lib/intel_blt.h
>>> index 7516ce8ac7..944e2b4ae7 100644
>>> --- a/lib/intel_blt.h
>>> +++ b/lib/intel_blt.h
>>> @@ -8,7 +8,7 @@
>>>    /**
>>>     * SECTION:intel_blt
>>> - * @short_description: i915 blitter library
>>> + * @short_description: i915/xe blitter library
>>>     * @title: Blitter library
>>>     * @include: intel_blt.h
>>>     *

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v2 14/16] tests/xe_ccs: Check if flatccs is working with block-copy for Xe
  2023-07-07 10:05   ` Karolina Stolarek
@ 2023-07-11 10:45     ` Zbigniew Kempczyński
  2023-07-11 10:51       ` Karolina Stolarek
  0 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-11 10:45 UTC (permalink / raw)
  To: Karolina Stolarek; +Cc: igt-dev

On Fri, Jul 07, 2023 at 12:05:25PM +0200, Karolina Stolarek wrote:
> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> > This is a copy of the i915 gem_ccs test ported to xe. Ported means all
> > driver-dependent calls - like working on regions, binding and execution -
> > were replaced by xe counterparts. I considered adding conditionals for xe
> > in gem_ccs, but this would decrease test readability, so I dropped
> > the idea.
> > 
> > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > ---
> >   tests/meson.build |   1 +
> >   tests/xe/xe_ccs.c | 763 ++++++++++++++++++++++++++++++++++++++++++++++
> >   2 files changed, 764 insertions(+)
> >   create mode 100644 tests/xe/xe_ccs.c
> > 
> > diff --git a/tests/meson.build b/tests/meson.build
> > index ee066b8490..9bca57a5e8 100644
> > --- a/tests/meson.build
> > +++ b/tests/meson.build
> > @@ -244,6 +244,7 @@ i915_progs = [
> >   ]
> >   xe_progs = [
> > +	'xe_ccs',
> >   	'xe_create',
> >   	'xe_compute',
> >   	'xe_dma_buf_sync',
> > diff --git a/tests/xe/xe_ccs.c b/tests/xe/xe_ccs.c
> > new file mode 100644
> > index 0000000000..e6bb29a5ed
> > --- /dev/null
> > +++ b/tests/xe/xe_ccs.c
> > @@ -0,0 +1,763 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2023 Intel Corporation
> > + */
> > +
> > +#include <errno.h>
> > +#include <glib.h>
> > +#include <sys/ioctl.h>
> > +#include <sys/time.h>
> > +#include <malloc.h>
> > +#include "drm.h"
> > +#include "igt.h"
> > +#include "igt_syncobj.h"
> > +#include "intel_blt.h"
> > +#include "intel_mocs.h"
> > +#include "xe/xe_ioctl.h"
> > +#include "xe/xe_query.h"
> > +#include "xe/xe_util.h"
> > +/**
> > + * TEST: xe ccs
> > + * Description: Exercise gen12 blitter with and without flatccs compression on Xe
> > + * Run type: FULL
> > + *
> > + * SUBTEST: block-copy-compressed
> > + * Description: Check block-copy flatccs compressed blit
> > + *
> > + * SUBTEST: block-copy-uncompressed
> > + * Description: Check block-copy uncompressed blit
> > + *
> > + * SUBTEST: block-multicopy-compressed
> > + * Description: Check block-multicopy flatccs compressed blit
> > + *
> > + * SUBTEST: block-multicopy-inplace
> > + * Description: Check block-multicopy flatccs inplace decompression blit
> > + *
> > + * SUBTEST: ctrl-surf-copy
> > + * Description: Check flatccs data can be copied from/to surface
> > + *
> > + * SUBTEST: ctrl-surf-copy-new-ctx
> > + * Description: Check flatccs data are physically tagged and visible in vm
> > + *
> > + * SUBTEST: suspend-resume
> > + * Description: Check flatccs data persists after suspend / resume (S0)
> > + */
> > +
> > +IGT_TEST_DESCRIPTION("Exercise gen12 blitter with and without flatccs compression on Xe");
> > +
> > +static struct param {
> > +	int compression_format;
> > +	int tiling;
> > +	bool write_png;
> > +	bool print_bb;
> > +	bool print_surface_info;
> > +	int width;
> > +	int height;
> > +} param = {
> > +	.compression_format = 0,
> > +	.tiling = -1,
> > +	.write_png = false,
> > +	.print_bb = false,
> > +	.print_surface_info = false,
> > +	.width = 512,
> > +	.height = 512,
> > +};
> > +
> > +struct test_config {
> > +	bool compression;
> > +	bool inplace;
> > +	bool surfcopy;
> > +	bool new_ctx;
> > +	bool suspend_resume;
> > +};
> > +
> > +static void set_surf_object(struct blt_ctrl_surf_copy_object *obj,
> > +			    uint32_t handle, uint32_t region, uint64_t size,
> > +			    uint8_t mocs, enum blt_access_type access_type)
> > +{
> > +	obj->handle = handle;
> > +	obj->region = region;
> > +	obj->size = size;
> > +	obj->mocs = mocs;
> > +	obj->access_type = access_type;
> > +}
> > +
> > +#define PRINT_SURFACE_INFO(name, obj) do { \
> > +	if (param.print_surface_info) \
> > +		blt_surface_info((name), (obj)); } while (0)
> > +
> > +#define WRITE_PNG(fd, id, name, obj, w, h) do { \
> > +	if (param.write_png) \
> > +		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)
> > +
> > +static int compare_nxn(const struct blt_copy_object *surf1,
> > +		       const struct blt_copy_object *surf2,
> > +		       int xsize, int ysize, int bx, int by)
> 
> I think that you could avoid some repetition by creating a small lib with
> blt copy test helpers. For example, we have 4 definitions of WRITE_PNG, and
> it would be good to have just one and call it in four separate tests.
> 

Regarding the WRITE_PNG() macro: I'm not convinced we should export it
in this form. There's no additional logic in it beyond checking the
param.write_png field. I imagine something like this:

#define BLT_WRITE_PNG(cond, fd, id, name, obj, w, h) do { \
	if (cond) \
		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)

and calling it in the code:

BLT_WRITE_PNG(param.write_png, fd, ...);

Looks weird imo, but maybe that's just my subjective assessment. If the
above is fine with you, I'll add it in v4.

> > +{
> > +	int x, y, corrupted;
> > +	uint32_t pos, px1, px2;
> > +
> > +	corrupted = 0;
> > +	for (y = 0; y < ysize; y++) {
> > +		for (x = 0; x < xsize; x++) {
> > +			pos = bx * xsize + by * ysize * surf1->pitch / 4;
> > +			pos += x + y * surf1->pitch / 4;
> > +			px1 = surf1->ptr[pos];
> > +			px2 = surf2->ptr[pos];
> > +			if (px1 != px2)
> > +				corrupted++;
> > +		}
> > +	}
> > +
> > +	return corrupted;
> > +}
> > +
> > +static void dump_corruption_info(const struct blt_copy_object *surf1,
> > +				 const struct blt_copy_object *surf2)
> > +{
> > +	const int xsize = 8, ysize = 8;
> > +	int w, h, bx, by, corrupted;
> > +
> > +	igt_assert(surf1->x1 == surf2->x1 && surf1->x2 == surf2->x2);
> > +	igt_assert(surf1->y1 == surf2->y1 && surf1->y2 == surf2->y2);
> > +	w = surf1->x2;
> > +	h = surf1->y2;
> > +
> > +	igt_info("dump corruption - width: %d, height: %d, sizex: %x, sizey: %x\n",
> > +		 surf1->x2, surf1->y2, xsize, ysize);
> > +
> > +	for (by = 0; by < h / ysize; by++) {
> > +		for (bx = 0; bx < w / xsize; bx++) {
> > +			corrupted = compare_nxn(surf1, surf2, xsize, ysize, bx, by);
> > +			if (corrupted == 0)
> > +				igt_info(".");
> > +			else
> > +				igt_info("%c", '0' + corrupted);
> > +		}
> > +		igt_info("\n");
> > +	}
> > +}
> > +
> > +static void surf_copy(int xe,
> > +		      intel_ctx_t *ctx,
> > +		      uint64_t ahnd,
> > +		      const struct blt_copy_object *src,
> > +		      const struct blt_copy_object *mid,
> > +		      const struct blt_copy_object *dst,
> > +		      int run_id, bool suspend_resume)
> > +{
> > +	struct blt_copy_data blt = {};
> > +	struct blt_block_copy_data_ext ext = {};
> > +	struct blt_ctrl_surf_copy_data surf = {};
> > +	uint32_t bb1, bb2, ccs, ccs2, *ccsmap, *ccsmap2;
> > +	uint64_t bb_size, ccssize = mid->size / CCS_RATIO;
> > +	uint32_t *ccscopy;
> > +	uint8_t uc_mocs = intel_get_uc_mocs(xe);
> > +	uint32_t sysmem = system_memory(xe);
> > +	int result;
> > +
> > +	igt_assert(mid->compression);
> > +	ccscopy = (uint32_t *) malloc(ccssize);
> > +	ccs = xe_bo_create_flags(xe, 0, ccssize, sysmem);
> > +	ccs2 = xe_bo_create_flags(xe, 0, ccssize, sysmem);
> > +
> > +	blt_ctrl_surf_copy_init(xe, &surf);
> > +	surf.print_bb = param.print_bb;
> > +	set_surf_object(&surf.src, mid->handle, mid->region, mid->size,
> > +			uc_mocs, BLT_INDIRECT_ACCESS);
> > +	set_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
> > +	bb_size = xe_get_default_alignment(xe);
> > +	bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
> > +	blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
> > +	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> > +	intel_ctx_xe_sync(ctx, true);
> > +
> > +	ccsmap = xe_bo_map(xe, ccs, surf.dst.size);
> > +	memcpy(ccscopy, ccsmap, ccssize);
> > +
> > +	if (suspend_resume) {
> > +		char *orig, *orig2, *newsum, *newsum2;
> > +
> > +		orig = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
> > +						   (void *)ccsmap, surf.dst.size);
> > +		orig2 = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
> > +						    (void *)mid->ptr, mid->size);
> > +
> > +		igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
> > +
> > +		set_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,
> 
> Shouldn't this be sysmem instead of REGION_SMEM?
> 

Yes, missed this line.

> > +				0, DIRECT_ACCESS);
> > +		blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> > +		intel_ctx_xe_sync(ctx, true);
> > +
> > +		ccsmap2 = xe_bo_map(xe, ccs2, surf.dst.size);
> > +		newsum = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
> > +						     (void *)ccsmap2, surf.dst.size);
> > +		newsum2 = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
> > +						      (void *)mid->ptr, mid->size);
> > +
> > +		munmap(ccsmap2, ccssize);
> > +		igt_assert(!strcmp(orig, newsum));
> > +		igt_assert(!strcmp(orig2, newsum2));
> > +		g_free(orig);
> > +		g_free(orig2);
> > +		g_free(newsum);
> > +		g_free(newsum2);
> > +	}
> > +
> > +	/* corrupt ccs */
> > +	for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
> > +		ccsmap[i] = i;
> > +	set_surf_object(&surf.src, ccs, sysmem, ccssize,
> > +			uc_mocs, DIRECT_ACCESS);
> > +	set_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
> > +			uc_mocs, INDIRECT_ACCESS);
> > +	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> > +	intel_ctx_xe_sync(ctx, true);
> > +
> > +	blt_copy_init(xe, &blt);
> > +	blt.color_depth = CD_32bit;
> > +	blt.print_bb = param.print_bb;
> > +	blt_set_copy_object(&blt.src, mid);
> > +	blt_set_copy_object(&blt.dst, dst);
> > +	blt_set_object_ext(&ext.src, mid->compression_type, mid->x2, mid->y2, SURFACE_TYPE_2D);
> > +	blt_set_object_ext(&ext.dst, 0, dst->x2, dst->y2, SURFACE_TYPE_2D);
> > +	bb2 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
> > +	blt_set_batch(&blt.bb, bb2, bb_size, sysmem);
> > +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, &ext);
> > +	intel_ctx_xe_sync(ctx, true);
> > +	WRITE_PNG(xe, run_id, "corrupted", &blt.dst, dst->x2, dst->y2);
> > +	result = memcmp(src->ptr, dst->ptr, src->size);
> > +	igt_assert(result != 0);
> > +
> > +	/* retrieve back ccs */
> > +	memcpy(ccsmap, ccscopy, ccssize);
> > +	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> > +
> > +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, &ext);
> > +	intel_ctx_xe_sync(ctx, true);
> > +	WRITE_PNG(xe, run_id, "corrected", &blt.dst, dst->x2, dst->y2);
> > +	result = memcmp(src->ptr, dst->ptr, src->size);
> > +	if (result)
> > +		dump_corruption_info(src, dst);
> > +
> > +	munmap(ccsmap, ccssize);
> > +	gem_close(xe, ccs);
> > +	gem_close(xe, ccs2);
> > +	gem_close(xe, bb1);
> > +	gem_close(xe, bb2);
> > +
> > +	igt_assert_f(result == 0,
> > +		     "Source and destination surfaces are different after "
> > +		     "restoring source ccs data\n");
> > +}
> > +
> > +struct blt_copy3_data {
> > +	int xe;
> > +	struct blt_copy_object src;
> > +	struct blt_copy_object mid;
> > +	struct blt_copy_object dst;
> > +	struct blt_copy_object final;
> > +	struct blt_copy_batch bb;
> > +	enum blt_color_depth color_depth;
> > +
> > +	/* debug stuff */
> > +	bool print_bb;
> > +};
> > +
> > +struct blt_block_copy3_data_ext {
> > +	struct blt_block_copy_object_ext src;
> > +	struct blt_block_copy_object_ext mid;
> > +	struct blt_block_copy_object_ext dst;
> > +	struct blt_block_copy_object_ext final;
> > +};
> > +
> 
> Hmm, we really could make use of shared definitions like these (yes, I sound
> like a broken record at this point, sorry!)
>

No, there's no plan to have blt3 in the common library.

 
> > +#define FILL_OBJ(_idx, _handle, _offset) do { \
> > +	obj[(_idx)].handle = (_handle); \
> > +	obj[(_idx)].offset = (_offset); \
> > +} while (0)
> 
> We don't use this definition in Xe tests, you can delete it
> 

Good catch.

> > +
> > +static int blt_block_copy3(int xe,
> > +			   const intel_ctx_t *ctx,
> > +			   uint64_t ahnd,
> > +			   const struct blt_copy3_data *blt3,
> > +			   const struct blt_block_copy3_data_ext *ext3)
> > +{
> > +	struct blt_copy_data blt0;
> > +	struct blt_block_copy_data_ext ext0;
> > +	uint64_t bb_offset, alignment;
> > +	uint64_t bb_pos = 0;
> > +	int ret = 0;
> > +
> > +	igt_assert_f(ahnd, "block-copy3 supports softpin only\n");
> > +	igt_assert_f(blt3, "block-copy3 requires data to do blit\n");
> > +
> > +	alignment = xe_get_default_alignment(xe);
> > +	get_offset(ahnd, blt3->src.handle, blt3->src.size, alignment);
> > +	get_offset(ahnd, blt3->mid.handle, blt3->mid.size, alignment);
> > +	get_offset(ahnd, blt3->dst.handle, blt3->dst.size, alignment);
> > +	get_offset(ahnd, blt3->final.handle, blt3->final.size, alignment);
> > +	bb_offset = get_offset(ahnd, blt3->bb.handle, blt3->bb.size, alignment);
> > +
> > +	/* First blit src -> mid */
> > +	blt_copy_init(xe, &blt0);
> > +	blt0.src = blt3->src;
> > +	blt0.dst = blt3->mid;
> > +	blt0.bb = blt3->bb;
> > +	blt0.color_depth = blt3->color_depth;
> > +	blt0.print_bb = blt3->print_bb;
> > +	ext0.src = ext3->src;
> > +	ext0.dst = ext3->mid;
> > +	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, false);
> > +
> > +	/* Second blit mid -> dst */
> > +	blt_copy_init(xe, &blt0);
> > +	blt0.src = blt3->mid;
> > +	blt0.dst = blt3->dst;
> > +	blt0.bb = blt3->bb;
> > +	blt0.color_depth = blt3->color_depth;
> > +	blt0.print_bb = blt3->print_bb;
> > +	ext0.src = ext3->mid;
> > +	ext0.dst = ext3->dst;
> > +	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, false);
> > +
> > +	/* Third blit dst -> final */
> > +	blt_copy_init(xe, &blt0);
> > +	blt0.src = blt3->dst;
> > +	blt0.dst = blt3->final;
> > +	blt0.bb = blt3->bb;
> > +	blt0.color_depth = blt3->color_depth;
> > +	blt0.print_bb = blt3->print_bb;
> > +	ext0.src = ext3->dst;
> > +	ext0.dst = ext3->final;
> > +	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, true);
> > +
> > +	intel_ctx_xe_exec(ctx, ahnd, bb_offset);
> > +
> > +	return ret;
> > +}
> > +
> > +static void block_copy(int xe,
> > +		       intel_ctx_t *ctx,
> > +		       uint32_t region1, uint32_t region2,
> > +		       enum blt_tiling_type mid_tiling,
> > +		       const struct test_config *config)
> > +{
> > +	struct blt_copy_data blt = {};
> > +	struct blt_block_copy_data_ext ext = {}, *pext = &ext;
> > +	struct blt_copy_object *src, *mid, *dst;
> > +	const uint32_t bpp = 32;
> > +	uint64_t bb_size = xe_get_default_alignment(xe);
> > +	uint64_t ahnd = intel_allocator_open(xe, ctx->vm, INTEL_ALLOCATOR_RELOC);
> > +	uint32_t run_id = mid_tiling;
> > +	uint32_t mid_region = region2, bb;
> > +	uint32_t width = param.width, height = param.height;
> > +	enum blt_compression mid_compression = config->compression;
> > +	int mid_compression_format = param.compression_format;
> > +	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
> > +	uint8_t uc_mocs = intel_get_uc_mocs(xe);
> > +	int result;
> > +
> > +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
> > +
> > +	if (!blt_uses_extended_block_copy(xe))
> > +		pext = NULL;
> > +
> > +	blt_copy_init(xe, &blt);
> > +
> > +	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> > +				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> > +	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
> > +				mid_tiling, mid_compression, comp_type, true);
> > +	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> > +				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> > +	igt_assert(src->size == dst->size);
> > +	PRINT_SURFACE_INFO("src", src);
> > +	PRINT_SURFACE_INFO("mid", mid);
> > +	PRINT_SURFACE_INFO("dst", dst);
> > +
> > +	blt_surface_fill_rect(xe, src, width, height);
> > +	WRITE_PNG(xe, run_id, "src", src, width, height);
> > +
> > +	blt.color_depth = CD_32bit;
> > +	blt.print_bb = param.print_bb;
> > +	blt_set_copy_object(&blt.src, src);
> > +	blt_set_copy_object(&blt.dst, mid);
> > +	blt_set_object_ext(&ext.src, 0, width, height, SURFACE_TYPE_2D);
> > +	blt_set_object_ext(&ext.dst, mid_compression_format, width, height, SURFACE_TYPE_2D);
> > +	blt_set_batch(&blt.bb, bb, bb_size, region1);
> > +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, pext);
> > +	intel_ctx_xe_sync(ctx, true);
> > +
> > +	/* We expect mid != src if there's compression */
> > +	if (mid->compression)
> > +		igt_assert(memcmp(src->ptr, mid->ptr, src->size) != 0);
> > +
> > +	WRITE_PNG(xe, run_id, "src", &blt.src, width, height);
> 
> It's also in gem_ccs, why are we saving "src" surface twice?
> 

Agree, this is a mistake. I'll remove it in both tests.

Thank you for the review.
--
Zbigniew

> Many thanks,
> Karolina
> 
> > +	WRITE_PNG(xe, run_id, "mid", &blt.dst, width, height);
> > +
> > +	if (config->surfcopy && pext) {
> > +		struct drm_xe_engine_class_instance inst = {
> > +			.engine_class = DRM_XE_ENGINE_CLASS_COPY,
> > +		};
> > +		intel_ctx_t *surf_ctx = ctx;
> > +		uint64_t surf_ahnd = ahnd;
> > +		uint32_t vm, engine;
> > +
> > +		if (config->new_ctx) {
> > +			vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
> > +			engine = xe_engine_create(xe, vm, &inst, 0);
> > +			surf_ctx = intel_ctx_xe(xe, vm, engine, 0, 0, 0);
> > +			surf_ahnd = intel_allocator_open(xe, surf_ctx->vm,
> > +							 INTEL_ALLOCATOR_RELOC);
> > +		}
> > +		surf_copy(xe, surf_ctx, surf_ahnd, src, mid, dst, run_id,
> > +			  config->suspend_resume);
> > +
> > +		if (surf_ctx != ctx) {
> > +			xe_engine_destroy(xe, engine);
> > +			xe_vm_destroy(xe, vm);
> > +			free(surf_ctx);
> > +			put_ahnd(surf_ahnd);
> > +		}
> > +	}
> > +
> > +	blt_copy_init(xe, &blt);
> > +	blt.color_depth = CD_32bit;
> > +	blt.print_bb = param.print_bb;
> > +	blt_set_copy_object(&blt.src, mid);
> > +	blt_set_copy_object(&blt.dst, dst);
> > +	blt_set_object_ext(&ext.src, mid_compression_format, width, height, SURFACE_TYPE_2D);
> > +	blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
> > +	if (config->inplace) {
> > +		blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
> > +			       T_LINEAR, COMPRESSION_DISABLED, comp_type);
> > +		blt.dst.ptr = mid->ptr;
> > +	}
> > +
> > +	blt_set_batch(&blt.bb, bb, bb_size, region1);
> > +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, pext);
> > +	intel_ctx_xe_sync(ctx, true);
> > +
> > +	WRITE_PNG(xe, run_id, "dst", &blt.dst, width, height);
> > +
> > +	result = memcmp(src->ptr, blt.dst.ptr, src->size);
> > +
> > +	/* Politely clean vm */
> > +	put_offset(ahnd, src->handle);
> > +	put_offset(ahnd, mid->handle);
> > +	put_offset(ahnd, dst->handle);
> > +	put_offset(ahnd, bb);
> > +	intel_allocator_bind(ahnd, 0, 0);
> > +	blt_destroy_object(xe, src);
> > +	blt_destroy_object(xe, mid);
> > +	blt_destroy_object(xe, dst);
> > +	gem_close(xe, bb);
> > +	put_ahnd(ahnd);
> > +
> > +	igt_assert_f(!result, "source and destination surfaces differ!\n");
> > +}
> > +
> > +static void block_multicopy(int xe,
> > +			    intel_ctx_t *ctx,
> > +			    uint32_t region1, uint32_t region2,
> > +			    enum blt_tiling_type mid_tiling,
> > +			    const struct test_config *config)
> > +{
> > +	struct blt_copy3_data blt3 = {};
> > +	struct blt_copy_data blt = {};
> > +	struct blt_block_copy3_data_ext ext3 = {}, *pext3 = &ext3;
> > +	struct blt_copy_object *src, *mid, *dst, *final;
> > +	const uint32_t bpp = 32;
> > +	uint64_t bb_size = xe_get_default_alignment(xe);
> > +	uint64_t ahnd = intel_allocator_open(xe, ctx->vm, INTEL_ALLOCATOR_RELOC);
> > +	uint32_t run_id = mid_tiling;
> > +	uint32_t mid_region = region2, bb;
> > +	uint32_t width = param.width, height = param.height;
> > +	enum blt_compression mid_compression = config->compression;
> > +	int mid_compression_format = param.compression_format;
> > +	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
> > +	uint8_t uc_mocs = intel_get_uc_mocs(xe);
> > +	int result;
> > +
> > +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
> > +
> > +	if (!blt_uses_extended_block_copy(xe))
> > +		pext3 = NULL;
> > +
> > +	blt_copy_init(xe, &blt);
> > +
> > +	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> > +				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> > +	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
> > +				mid_tiling, mid_compression, comp_type, true);
> > +	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> > +				mid_tiling, COMPRESSION_DISABLED, comp_type, true);
> > +	final = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> > +				  T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> > +	igt_assert(src->size == dst->size);
> > +	PRINT_SURFACE_INFO("src", src);
> > +	PRINT_SURFACE_INFO("mid", mid);
> > +	PRINT_SURFACE_INFO("dst", dst);
> > +	PRINT_SURFACE_INFO("final", final);
> > +
> > +	blt_surface_fill_rect(xe, src, width, height);
> > +
> > +	blt3.color_depth = CD_32bit;
> > +	blt3.print_bb = param.print_bb;
> > +	blt_set_copy_object(&blt3.src, src);
> > +	blt_set_copy_object(&blt3.mid, mid);
> > +	blt_set_copy_object(&blt3.dst, dst);
> > +	blt_set_copy_object(&blt3.final, final);
> > +
> > +	if (config->inplace) {
> > +		blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
> > +			       mid->mocs, mid_tiling, COMPRESSION_DISABLED,
> > +			       comp_type);
> > +		blt3.dst.ptr = mid->ptr;
> > +	}
> > +
> > +	blt_set_object_ext(&ext3.src, 0, width, height, SURFACE_TYPE_2D);
> > +	blt_set_object_ext(&ext3.mid, mid_compression_format, width, height, SURFACE_TYPE_2D);
> > +	blt_set_object_ext(&ext3.dst, 0, width, height, SURFACE_TYPE_2D);
> > +	blt_set_object_ext(&ext3.final, 0, width, height, SURFACE_TYPE_2D);
> > +	blt_set_batch(&blt3.bb, bb, bb_size, region1);
> > +
> > +	blt_block_copy3(xe, ctx, ahnd, &blt3, pext3);
> > +	intel_ctx_xe_sync(ctx, true);
> > +
> > +	WRITE_PNG(xe, run_id, "src", &blt3.src, width, height);
> > +	if (!config->inplace)
> > +		WRITE_PNG(xe, run_id, "mid", &blt3.mid, width, height);
> > +	WRITE_PNG(xe, run_id, "dst", &blt3.dst, width, height);
> > +	WRITE_PNG(xe, run_id, "final", &blt3.final, width, height);
> > +
> > +	result = memcmp(src->ptr, blt3.final.ptr, src->size);
> > +
> > +	put_offset(ahnd, src->handle);
> > +	put_offset(ahnd, mid->handle);
> > +	put_offset(ahnd, dst->handle);
> > +	put_offset(ahnd, final->handle);
> > +	put_offset(ahnd, bb);
> > +	intel_allocator_bind(ahnd, 0, 0);
> > +	blt_destroy_object(xe, src);
> > +	blt_destroy_object(xe, mid);
> > +	blt_destroy_object(xe, dst);
> > +	blt_destroy_object(xe, final);
> > +	gem_close(xe, bb);
> > +	put_ahnd(ahnd);
> > +
> > +	igt_assert_f(!result, "source and destination surfaces differ!\n");
> > +}
> > +
> > +enum copy_func {
> > +	BLOCK_COPY,
> > +	BLOCK_MULTICOPY,
> > +};
> > +
> > +static const struct {
> > +	const char *suffix;
> > +	void (*copyfn)(int fd,
> > +		       intel_ctx_t *ctx,
> > +		       uint32_t region1, uint32_t region2,
> > +		       enum blt_tiling_type btype,
> > +		       const struct test_config *config);
> > +} copyfns[] = {
> > +	[BLOCK_COPY] = { "", block_copy },
> > +	[BLOCK_MULTICOPY] = { "-multicopy", block_multicopy },
> > +};
> > +
> > +static void block_copy_test(int xe,
> > +			    const struct test_config *config,
> > +			    struct igt_collection *set,
> > +			    enum copy_func copy_function)
> > +{
> > +	struct drm_xe_engine_class_instance inst = {
> > +		.engine_class = DRM_XE_ENGINE_CLASS_COPY,
> > +	};
> > +	intel_ctx_t *ctx;
> > +	struct igt_collection *regions;
> > +	uint32_t vm, engine;
> > +	int tiling;
> > +
> > +	if (config->compression && !blt_block_copy_supports_compression(xe))
> > +		return;
> > +
> > +	if (config->inplace && !config->compression)
> > +		return;
> > +
> > +	for_each_tiling(tiling) {
> > +		if (!blt_block_copy_supports_tiling(xe, tiling) ||
> > +		    (param.tiling >= 0 && param.tiling != tiling))
> > +			continue;
> > +
> > +		for_each_variation_r(regions, 2, set) {
> > +			uint32_t region1, region2;
> > +			char *regtxt;
> > +
> > +			region1 = igt_collection_get_value(regions, 0);
> > +			region2 = igt_collection_get_value(regions, 1);
> > +
> > +			/* Compressed surface must be in device memory */
> > +			if (config->compression && !XE_IS_VRAM_MEMORY_REGION(xe, region2))
> > +				continue;
> > +
> > +			regtxt = xe_memregion_dynamic_subtest_name(xe, regions);
> > +
> > +			igt_dynamic_f("%s-%s-compfmt%d-%s%s",
> > +				      blt_tiling_name(tiling),
> > +				      config->compression ?
> > +					      "compressed" : "uncompressed",
> > +				      param.compression_format, regtxt,
> > +				      copyfns[copy_function].suffix) {
> > +				uint32_t sync_bind, sync_out;
> > +
> > +				vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
> > +				engine = xe_engine_create(xe, vm, &inst, 0);
> > +				sync_bind = syncobj_create(xe, 0);
> > +				sync_out = syncobj_create(xe, 0);
> > +				ctx = intel_ctx_xe(xe, vm, engine,
> > +						   0, sync_bind, sync_out);
> > +
> > +				copyfns[copy_function].copyfn(xe, ctx,
> > +							      region1, region2,
> > +							      tiling, config);
> > +
> > +				xe_engine_destroy(xe, engine);
> > +				xe_vm_destroy(xe, vm);
> > +				syncobj_destroy(xe, sync_bind);
> > +				syncobj_destroy(xe, sync_out);
> > +				free(ctx);
> > +			}
> > +
> > +			free(regtxt);
> > +		}
> > +	}
> > +}
> > +
> > +static int opt_handler(int opt, int opt_index, void *data)
> > +{
> > +	switch (opt) {
> > +	case 'b':
> > +		param.print_bb = true;
> > +		igt_debug("Print bb: %d\n", param.print_bb);
> > +		break;
> > +	case 'f':
> > +		param.compression_format = atoi(optarg);
> > +		igt_debug("Compression format: %d\n", param.compression_format);
> > +		igt_assert((param.compression_format & ~0x1f) == 0);
> > +		break;
> > +	case 'p':
> > +		param.write_png = true;
> > +		igt_debug("Write png: %d\n", param.write_png);
> > +		break;
> > +	case 's':
> > +		param.print_surface_info = true;
> > +		igt_debug("Print surface info: %d\n", param.print_surface_info);
> > +		break;
> > +	case 't':
> > +		param.tiling = atoi(optarg);
> > +		igt_debug("Tiling: %d\n", param.tiling);
> > +		break;
> > +	case 'W':
> > +		param.width = atoi(optarg);
> > +		igt_debug("Width: %d\n", param.width);
> > +		break;
> > +	case 'H':
> > +		param.height = atoi(optarg);
> > +		igt_debug("Height: %d\n", param.height);
> > +		break;
> > +	default:
> > +		return IGT_OPT_HANDLER_ERROR;
> > +	}
> > +
> > +	return IGT_OPT_HANDLER_SUCCESS;
> > +}
> > +
> > +const char *help_str =
> > +	"  -b\tPrint bb\n"
> > +	"  -f\tCompression format (0-31)\n"
> > +	"  -p\tWrite PNG\n"
> > +	"  -s\tPrint surface info\n"
> > +	"  -t\tTiling format (0 - linear, 1 - XMAJOR, 2 - YMAJOR, 3 - TILE4, 4 - TILE64)\n"
> > +	"  -W\tWidth (default 512)\n"
> > +	"  -H\tHeight (default 512)"
> > +	;
> > +
> > +igt_main_args("bf:pst:W:H:", NULL, help_str, opt_handler, NULL)
> > +{
> > +	struct igt_collection *set;
> > +	int xe;
> > +
> > +	igt_fixture {
> > +		xe = drm_open_driver(DRIVER_XE);
> > +		igt_require(blt_has_block_copy(xe));
> > +
> > +		xe_device_get(xe);
> > +
> > +		set = xe_get_memory_region_set(xe,
> > +					       XE_MEM_REGION_CLASS_SYSMEM,
> > +					       XE_MEM_REGION_CLASS_VRAM);
> > +	}
> > +
> > +	igt_describe("Check block-copy uncompressed blit");
> > +	igt_subtest_with_dynamic("block-copy-uncompressed") {
> > +		struct test_config config = {};
> > +
> > +		block_copy_test(xe, &config, set, BLOCK_COPY);
> > +	}
> > +
> > +	igt_describe("Check block-copy flatccs compressed blit");
> > +	igt_subtest_with_dynamic("block-copy-compressed") {
> > +		struct test_config config = { .compression = true };
> > +
> > +		block_copy_test(xe, &config, set, BLOCK_COPY);
> > +	}
> > +
> > +	igt_describe("Check block-multicopy flatccs compressed blit");
> > +	igt_subtest_with_dynamic("block-multicopy-compressed") {
> > +		struct test_config config = { .compression = true };
> > +
> > +		block_copy_test(xe, &config, set, BLOCK_MULTICOPY);
> > +	}
> > +
> > +	igt_describe("Check block-multicopy flatccs inplace decompression blit");
> > +	igt_subtest_with_dynamic("block-multicopy-inplace") {
> > +		struct test_config config = { .compression = true,
> > +					      .inplace = true };
> > +
> > +		block_copy_test(xe, &config, set, BLOCK_MULTICOPY);
> > +	}
> > +
> > +	igt_describe("Check flatccs data can be copied from/to surface");
> > +	igt_subtest_with_dynamic("ctrl-surf-copy") {
> > +		struct test_config config = { .compression = true,
> > +					      .surfcopy = true };
> > +
> > +		block_copy_test(xe, &config, set, BLOCK_COPY);
> > +	}
> > +
> > +	igt_describe("Check flatccs data are physically tagged and visible"
> > +		     " in different contexts");
> > +	igt_subtest_with_dynamic("ctrl-surf-copy-new-ctx") {
> > +		struct test_config config = { .compression = true,
> > +					      .surfcopy = true,
> > +					      .new_ctx = true };
> > +
> > +		block_copy_test(xe, &config, set, BLOCK_COPY);
> > +	}
> > +
> > +	igt_describe("Check flatccs data persists after suspend / resume (S0)");
> > +	igt_subtest_with_dynamic("suspend-resume") {
> > +		struct test_config config = { .compression = true,
> > +					      .surfcopy = true,
> > +					      .suspend_resume = true };
> > +
> > +		block_copy_test(xe, &config, set, BLOCK_COPY);
> > +	}
> > +
> > +	igt_fixture {
> > +		xe_device_put(xe);
> > +		close(xe);
> > +	}
> > +}


* Re: [igt-dev] [PATCH i-g-t v2 14/16] tests/xe_ccs: Check if flatccs is working with block-copy for Xe
  2023-07-11 10:45     ` Zbigniew Kempczyński
@ 2023-07-11 10:51       ` Karolina Stolarek
  2023-07-12  7:00         ` Zbigniew Kempczyński
  0 siblings, 1 reply; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-11 10:51 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 11.07.2023 12:45, Zbigniew Kempczyński wrote:
> On Fri, Jul 07, 2023 at 12:05:25PM +0200, Karolina Stolarek wrote:
>> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
>>> This is a port of the i915 gem_ccs test to xe. Ported means all driver
>>> dependent calls - like working on regions, binding and execution - were
>>> replaced by their xe counterparts. I considered adding conditionals for xe
>>> in gem_ccs instead, but this would decrease test readability, so I dropped
>>> that idea.
>>>
>>> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
>>> ---
>>>    tests/meson.build |   1 +
>>>    tests/xe/xe_ccs.c | 763 ++++++++++++++++++++++++++++++++++++++++++++++
>>>    2 files changed, 764 insertions(+)
>>>    create mode 100644 tests/xe/xe_ccs.c
>>>
>>> diff --git a/tests/meson.build b/tests/meson.build
>>> index ee066b8490..9bca57a5e8 100644
>>> --- a/tests/meson.build
>>> +++ b/tests/meson.build
>>> @@ -244,6 +244,7 @@ i915_progs = [
>>>    ]
>>>    xe_progs = [
>>> +	'xe_ccs',
>>>    	'xe_create',
>>>    	'xe_compute',
>>>    	'xe_dma_buf_sync',
>>> diff --git a/tests/xe/xe_ccs.c b/tests/xe/xe_ccs.c
>>> new file mode 100644
>>> index 0000000000..e6bb29a5ed
>>> --- /dev/null
>>> +++ b/tests/xe/xe_ccs.c
>>> @@ -0,0 +1,763 @@
>>> +// SPDX-License-Identifier: MIT
>>> +/*
>>> + * Copyright © 2023 Intel Corporation
>>> + */
>>> +
>>> +#include <errno.h>
>>> +#include <glib.h>
>>> +#include <sys/ioctl.h>
>>> +#include <sys/time.h>
>>> +#include <malloc.h>
>>> +#include "drm.h"
>>> +#include "igt.h"
>>> +#include "igt_syncobj.h"
>>> +#include "intel_blt.h"
>>> +#include "intel_mocs.h"
>>> +#include "xe/xe_ioctl.h"
>>> +#include "xe/xe_query.h"
>>> +#include "xe/xe_util.h"
>>> +/**
>>> + * TEST: xe ccs
>>> + * Description: Exercise gen12 blitter with and without flatccs compression on Xe
>>> + * Run type: FULL
>>> + *
>>> + * SUBTEST: block-copy-compressed
>>> + * Description: Check block-copy flatccs compressed blit
>>> + *
>>> + * SUBTEST: block-copy-uncompressed
>>> + * Description: Check block-copy uncompressed blit
>>> + *
>>> + * SUBTEST: block-multicopy-compressed
>>> + * Description: Check block-multicopy flatccs compressed blit
>>> + *
>>> + * SUBTEST: block-multicopy-inplace
>>> + * Description: Check block-multicopy flatccs inplace decompression blit
>>> + *
>>> + * SUBTEST: ctrl-surf-copy
>>> + * Description: Check flatccs data can be copied from/to surface
>>> + *
>>> + * SUBTEST: ctrl-surf-copy-new-ctx
>>> + * Description: Check flatccs data are physically tagged and visible in vm
>>> + *
>>> + * SUBTEST: suspend-resume
>>> + * Description: Check flatccs data persists after suspend / resume (S0)
>>> + */
>>> +
>>> +IGT_TEST_DESCRIPTION("Exercise gen12 blitter with and without flatccs compression on Xe");
>>> +
>>> +static struct param {
>>> +	int compression_format;
>>> +	int tiling;
>>> +	bool write_png;
>>> +	bool print_bb;
>>> +	bool print_surface_info;
>>> +	int width;
>>> +	int height;
>>> +} param = {
>>> +	.compression_format = 0,
>>> +	.tiling = -1,
>>> +	.write_png = false,
>>> +	.print_bb = false,
>>> +	.print_surface_info = false,
>>> +	.width = 512,
>>> +	.height = 512,
>>> +};
>>> +
>>> +struct test_config {
>>> +	bool compression;
>>> +	bool inplace;
>>> +	bool surfcopy;
>>> +	bool new_ctx;
>>> +	bool suspend_resume;
>>> +};
>>> +
>>> +static void set_surf_object(struct blt_ctrl_surf_copy_object *obj,
>>> +			    uint32_t handle, uint32_t region, uint64_t size,
>>> +			    uint8_t mocs, enum blt_access_type access_type)
>>> +{
>>> +	obj->handle = handle;
>>> +	obj->region = region;
>>> +	obj->size = size;
>>> +	obj->mocs = mocs;
>>> +	obj->access_type = access_type;
>>> +}
>>> +
>>> +#define PRINT_SURFACE_INFO(name, obj) do { \
>>> +	if (param.print_surface_info) \
>>> +		blt_surface_info((name), (obj)); } while (0)
>>> +
>>> +#define WRITE_PNG(fd, id, name, obj, w, h) do { \
>>> +	if (param.write_png) \
>>> +		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)
>>> +
>>> +static int compare_nxn(const struct blt_copy_object *surf1,
>>> +		       const struct blt_copy_object *surf2,
>>> +		       int xsize, int ysize, int bx, int by)
>>
>> I think that you could avoid some repetition by creating a small lib with
>> blt copy test helpers. For example, we have 4 definitions of WRITE_PNG, and
>> it would be good to have just one and call it in four separate tests.
>>
> 
> Regarding the WRITE_PNG() macro - I'm not convinced we should export this
> macro in this form. I mean there's no logic in it beyond checking the
> param.write_png field. I imagine something like this:
> 
> #define BLT_WRITE_PNG(conditional, fd, id, name, obj, w, h) do { \
> 	if (conditional) \
> 		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)
> 
> and calling it in the code:
> 
> BLT_WRITE_PNG(param.write_png, fd, ...);
> 
> Looks weird imo, but maybe it's just my subjective assessment. If you think
> the above is fine for you, I'll add it in v4.

If you change "conditional" to something like "write_png", I'll be fine 
with it. I don't see much value in having 4 (?) definitions that are the 
same in the codebase.

> 
>>> +{
>>> +	int x, y, corrupted;
>>> +	uint32_t pos, px1, px2;
>>> +
>>> +	corrupted = 0;
>>> +	for (y = 0; y < ysize; y++) {
>>> +		for (x = 0; x < xsize; x++) {
>>> +			pos = bx * xsize + by * ysize * surf1->pitch / 4;
>>> +			pos += x + y * surf1->pitch / 4;
>>> +			px1 = surf1->ptr[pos];
>>> +			px2 = surf2->ptr[pos];
>>> +			if (px1 != px2)
>>> +				corrupted++;
>>> +		}
>>> +	}
>>> +
>>> +	return corrupted;
>>> +}
>>> +
>>> +static void dump_corruption_info(const struct blt_copy_object *surf1,
>>> +				 const struct blt_copy_object *surf2)
>>> +{
>>> +	const int xsize = 8, ysize = 8;
>>> +	int w, h, bx, by, corrupted;
>>> +
>>> +	igt_assert(surf1->x1 == surf2->x1 && surf1->x2 == surf2->x2);
>>> +	igt_assert(surf1->y1 == surf2->y1 && surf1->y2 == surf2->y2);
>>> +	w = surf1->x2;
>>> +	h = surf1->y2;
>>> +
>>> +	igt_info("dump corruption - width: %d, height: %d, sizex: %x, sizey: %x\n",
>>> +		 surf1->x2, surf1->y2, xsize, ysize);
>>> +
>>> +	for (by = 0; by < h / ysize; by++) {
>>> +		for (bx = 0; bx < w / xsize; bx++) {
>>> +			corrupted = compare_nxn(surf1, surf2, xsize, ysize, bx, by);
>>> +			if (corrupted == 0)
>>> +				igt_info(".");
>>> +			else
>>> +				igt_info("%c", '0' + corrupted);
>>> +		}
>>> +		igt_info("\n");
>>> +	}
>>> +}
>>> +
>>> +static void surf_copy(int xe,
>>> +		      intel_ctx_t *ctx,
>>> +		      uint64_t ahnd,
>>> +		      const struct blt_copy_object *src,
>>> +		      const struct blt_copy_object *mid,
>>> +		      const struct blt_copy_object *dst,
>>> +		      int run_id, bool suspend_resume)
>>> +{
>>> +	struct blt_copy_data blt = {};
>>> +	struct blt_block_copy_data_ext ext = {};
>>> +	struct blt_ctrl_surf_copy_data surf = {};
>>> +	uint32_t bb1, bb2, ccs, ccs2, *ccsmap, *ccsmap2;
>>> +	uint64_t bb_size, ccssize = mid->size / CCS_RATIO;
>>> +	uint32_t *ccscopy;
>>> +	uint8_t uc_mocs = intel_get_uc_mocs(xe);
>>> +	uint32_t sysmem = system_memory(xe);
>>> +	int result;
>>> +
>>> +	igt_assert(mid->compression);
>>> +	ccscopy = (uint32_t *) malloc(ccssize);
>>> +	ccs = xe_bo_create_flags(xe, 0, ccssize, sysmem);
>>> +	ccs2 = xe_bo_create_flags(xe, 0, ccssize, sysmem);
>>> +
>>> +	blt_ctrl_surf_copy_init(xe, &surf);
>>> +	surf.print_bb = param.print_bb;
>>> +	set_surf_object(&surf.src, mid->handle, mid->region, mid->size,
>>> +			uc_mocs, BLT_INDIRECT_ACCESS);
>>> +	set_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
>>> +	bb_size = xe_get_default_alignment(xe);
>>> +	bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
>>> +	blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
>>> +	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
>>> +	intel_ctx_xe_sync(ctx, true);
>>> +
>>> +	ccsmap = xe_bo_map(xe, ccs, surf.dst.size);
>>> +	memcpy(ccscopy, ccsmap, ccssize);
>>> +
>>> +	if (suspend_resume) {
>>> +		char *orig, *orig2, *newsum, *newsum2;
>>> +
>>> +		orig = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
>>> +						   (void *)ccsmap, surf.dst.size);
>>> +		orig2 = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
>>> +						    (void *)mid->ptr, mid->size);
>>> +
>>> +		igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
>>> +
>>> +		set_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,
>>
>> Shouldn't this be sysmem instead of REGION_SMEM?
>>
> 
> Yes, missed this line.
> 
>>> +				0, DIRECT_ACCESS);
>>> +		blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
>>> +		intel_ctx_xe_sync(ctx, true);
>>> +
>>> +		ccsmap2 = xe_bo_map(xe, ccs2, surf.dst.size);
>>> +		newsum = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
>>> +						     (void *)ccsmap2, surf.dst.size);
>>> +		newsum2 = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
>>> +						      (void *)mid->ptr, mid->size);
>>> +
>>> +		munmap(ccsmap2, ccssize);
>>> +		igt_assert(!strcmp(orig, newsum));
>>> +		igt_assert(!strcmp(orig2, newsum2));
>>> +		g_free(orig);
>>> +		g_free(orig2);
>>> +		g_free(newsum);
>>> +		g_free(newsum2);
>>> +	}
>>> +
>>> +	/* corrupt ccs */
>>> +	for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
>>> +		ccsmap[i] = i;
>>> +	set_surf_object(&surf.src, ccs, sysmem, ccssize,
>>> +			uc_mocs, DIRECT_ACCESS);
>>> +	set_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
>>> +			uc_mocs, INDIRECT_ACCESS);
>>> +	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
>>> +	intel_ctx_xe_sync(ctx, true);
>>> +
>>> +	blt_copy_init(xe, &blt);
>>> +	blt.color_depth = CD_32bit;
>>> +	blt.print_bb = param.print_bb;
>>> +	blt_set_copy_object(&blt.src, mid);
>>> +	blt_set_copy_object(&blt.dst, dst);
>>> +	blt_set_object_ext(&ext.src, mid->compression_type, mid->x2, mid->y2, SURFACE_TYPE_2D);
>>> +	blt_set_object_ext(&ext.dst, 0, dst->x2, dst->y2, SURFACE_TYPE_2D);
>>> +	bb2 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
>>> +	blt_set_batch(&blt.bb, bb2, bb_size, sysmem);
>>> +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, &ext);
>>> +	intel_ctx_xe_sync(ctx, true);
>>> +	WRITE_PNG(xe, run_id, "corrupted", &blt.dst, dst->x2, dst->y2);
>>> +	result = memcmp(src->ptr, dst->ptr, src->size);
>>> +	igt_assert(result != 0);
>>> +
>>> +	/* retrieve back ccs */
>>> +	memcpy(ccsmap, ccscopy, ccssize);
>>> +	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
>>> +
>>> +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, &ext);
>>> +	intel_ctx_xe_sync(ctx, true);
>>> +	WRITE_PNG(xe, run_id, "corrected", &blt.dst, dst->x2, dst->y2);
>>> +	result = memcmp(src->ptr, dst->ptr, src->size);
>>> +	if (result)
>>> +		dump_corruption_info(src, dst);
>>> +
>>> +	munmap(ccsmap, ccssize);
>>> +	gem_close(xe, ccs);
>>> +	gem_close(xe, ccs2);
>>> +	gem_close(xe, bb1);
>>> +	gem_close(xe, bb2);
>>> +
>>> +	igt_assert_f(result == 0,
>>> +		     "Source and destination surfaces are different after "
>>> +		     "restoring source ccs data\n");
>>> +}
>>> +
>>> +struct blt_copy3_data {
>>> +	int xe;
>>> +	struct blt_copy_object src;
>>> +	struct blt_copy_object mid;
>>> +	struct blt_copy_object dst;
>>> +	struct blt_copy_object final;
>>> +	struct blt_copy_batch bb;
>>> +	enum blt_color_depth color_depth;
>>> +
>>> +	/* debug stuff */
>>> +	bool print_bb;
>>> +};
>>> +
>>> +struct blt_block_copy3_data_ext {
>>> +	struct blt_block_copy_object_ext src;
>>> +	struct blt_block_copy_object_ext mid;
>>> +	struct blt_block_copy_object_ext dst;
>>> +	struct blt_block_copy_object_ext final;
>>> +};
>>> +
>>
>> Hmm, we really could make use of shared definitions like these (yes, I sound
>> like a broken record at this point, sorry!)
>>
> 
> No, there's no plan to have blt3 in the common library.

Sorry, not sure why I became so obsessed with blt3 back then, heh

> 
>   
>>> +#define FILL_OBJ(_idx, _handle, _offset) do { \
>>> +	obj[(_idx)].handle = (_handle); \
>>> +	obj[(_idx)].offset = (_offset); \
>>> +} while (0)
>>
>> We don't use this definition in Xe tests, you can delete it
>>
> 
> Good catch.
> 
>>> +
>>> +static int blt_block_copy3(int xe,
>>> +			   const intel_ctx_t *ctx,
>>> +			   uint64_t ahnd,
>>> +			   const struct blt_copy3_data *blt3,
>>> +			   const struct blt_block_copy3_data_ext *ext3)
>>> +{
>>> +	struct blt_copy_data blt0;
>>> +	struct blt_block_copy_data_ext ext0;
>>> +	uint64_t bb_offset, alignment;
>>> +	uint64_t bb_pos = 0;
>>> +	int ret = 0;
>>> +
>>> +	igt_assert_f(ahnd, "block-copy3 supports softpin only\n");
>>> +	igt_assert_f(blt3, "block-copy3 requires data to do blit\n");
>>> +
>>> +	alignment = xe_get_default_alignment(xe);
>>> +	get_offset(ahnd, blt3->src.handle, blt3->src.size, alignment);
>>> +	get_offset(ahnd, blt3->mid.handle, blt3->mid.size, alignment);
>>> +	get_offset(ahnd, blt3->dst.handle, blt3->dst.size, alignment);
>>> +	get_offset(ahnd, blt3->final.handle, blt3->final.size, alignment);
>>> +	bb_offset = get_offset(ahnd, blt3->bb.handle, blt3->bb.size, alignment);
>>> +
>>> +	/* First blit src -> mid */
>>> +	blt_copy_init(xe, &blt0);
>>> +	blt0.src = blt3->src;
>>> +	blt0.dst = blt3->mid;
>>> +	blt0.bb = blt3->bb;
>>> +	blt0.color_depth = blt3->color_depth;
>>> +	blt0.print_bb = blt3->print_bb;
>>> +	ext0.src = ext3->src;
>>> +	ext0.dst = ext3->mid;
>>> +	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, false);
>>> +
>>> +	/* Second blit mid -> dst */
>>> +	blt_copy_init(xe, &blt0);
>>> +	blt0.src = blt3->mid;
>>> +	blt0.dst = blt3->dst;
>>> +	blt0.bb = blt3->bb;
>>> +	blt0.color_depth = blt3->color_depth;
>>> +	blt0.print_bb = blt3->print_bb;
>>> +	ext0.src = ext3->mid;
>>> +	ext0.dst = ext3->dst;
>>> +	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, false);
>>> +
>>> +	/* Third blit dst -> final */
>>> +	blt_copy_init(xe, &blt0);
>>> +	blt0.src = blt3->dst;
>>> +	blt0.dst = blt3->final;
>>> +	blt0.bb = blt3->bb;
>>> +	blt0.color_depth = blt3->color_depth;
>>> +	blt0.print_bb = blt3->print_bb;
>>> +	ext0.src = ext3->dst;
>>> +	ext0.dst = ext3->final;
>>> +	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, true);
>>> +
>>> +	intel_ctx_xe_exec(ctx, ahnd, bb_offset);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static void block_copy(int xe,
>>> +		       intel_ctx_t *ctx,
>>> +		       uint32_t region1, uint32_t region2,
>>> +		       enum blt_tiling_type mid_tiling,
>>> +		       const struct test_config *config)
>>> +{
>>> +	struct blt_copy_data blt = {};
>>> +	struct blt_block_copy_data_ext ext = {}, *pext = &ext;
>>> +	struct blt_copy_object *src, *mid, *dst;
>>> +	const uint32_t bpp = 32;
>>> +	uint64_t bb_size = xe_get_default_alignment(xe);
>>> +	uint64_t ahnd = intel_allocator_open(xe, ctx->vm, INTEL_ALLOCATOR_RELOC);
>>> +	uint32_t run_id = mid_tiling;
>>> +	uint32_t mid_region = region2, bb;
>>> +	uint32_t width = param.width, height = param.height;
>>> +	enum blt_compression mid_compression = config->compression;
>>> +	int mid_compression_format = param.compression_format;
>>> +	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
>>> +	uint8_t uc_mocs = intel_get_uc_mocs(xe);
>>> +	int result;
>>> +
>>> +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
>>> +
>>> +	if (!blt_uses_extended_block_copy(xe))
>>> +		pext = NULL;
>>> +
>>> +	blt_copy_init(xe, &blt);
>>> +
>>> +	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
>>> +				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
>>> +	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
>>> +				mid_tiling, mid_compression, comp_type, true);
>>> +	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
>>> +				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
>>> +	igt_assert(src->size == dst->size);
>>> +	PRINT_SURFACE_INFO("src", src);
>>> +	PRINT_SURFACE_INFO("mid", mid);
>>> +	PRINT_SURFACE_INFO("dst", dst);
>>> +
>>> +	blt_surface_fill_rect(xe, src, width, height);
>>> +	WRITE_PNG(xe, run_id, "src", src, width, height);
>>> +
>>> +	blt.color_depth = CD_32bit;
>>> +	blt.print_bb = param.print_bb;
>>> +	blt_set_copy_object(&blt.src, src);
>>> +	blt_set_copy_object(&blt.dst, mid);
>>> +	blt_set_object_ext(&ext.src, 0, width, height, SURFACE_TYPE_2D);
>>> +	blt_set_object_ext(&ext.dst, mid_compression_format, width, height, SURFACE_TYPE_2D);
>>> +	blt_set_batch(&blt.bb, bb, bb_size, region1);
>>> +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, pext);
>>> +	intel_ctx_xe_sync(ctx, true);
>>> +
>>> +	/* We expect mid != src if there's compression */
>>> +	if (mid->compression)
>>> +		igt_assert(memcmp(src->ptr, mid->ptr, src->size) != 0);
>>> +
>>> +	WRITE_PNG(xe, run_id, "src", &blt.src, width, height);
>>
>> It's also in gem_ccs, why are we saving "src" surface twice?
>>
> 
> Agree, this is mistake. I'll remove this in both tests.

Awesome, thanks!

All the best,
Karolina

> 
> Thank you for the review.
> --
> Zbigniew
> 
>> Many thanks,
>> Karolina
>>
>>> +	WRITE_PNG(xe, run_id, "mid", &blt.dst, width, height);
>>> +
>>> +	if (config->surfcopy && pext) {
>>> +		struct drm_xe_engine_class_instance inst = {
>>> +			.engine_class = DRM_XE_ENGINE_CLASS_COPY,
>>> +		};
>>> +		intel_ctx_t *surf_ctx = ctx;
>>> +		uint64_t surf_ahnd = ahnd;
>>> +		uint32_t vm, engine;
>>> +
>>> +		if (config->new_ctx) {
>>> +			vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
>>> +			engine = xe_engine_create(xe, vm, &inst, 0);
>>> +			surf_ctx = intel_ctx_xe(xe, vm, engine, 0, 0, 0);
>>> +			surf_ahnd = intel_allocator_open(xe, surf_ctx->vm,
>>> +							 INTEL_ALLOCATOR_RELOC);
>>> +		}
>>> +		surf_copy(xe, surf_ctx, surf_ahnd, src, mid, dst, run_id,
>>> +			  config->suspend_resume);
>>> +
>>> +		if (surf_ctx != ctx) {
>>> +			xe_engine_destroy(xe, engine);
>>> +			xe_vm_destroy(xe, vm);
>>> +			free(surf_ctx);
>>> +			put_ahnd(surf_ahnd);
>>> +		}
>>> +	}
>>> +
>>> +	blt_copy_init(xe, &blt);
>>> +	blt.color_depth = CD_32bit;
>>> +	blt.print_bb = param.print_bb;
>>> +	blt_set_copy_object(&blt.src, mid);
>>> +	blt_set_copy_object(&blt.dst, dst);
>>> +	blt_set_object_ext(&ext.src, mid_compression_format, width, height, SURFACE_TYPE_2D);
>>> +	blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
>>> +	if (config->inplace) {
>>> +		blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
>>> +			       T_LINEAR, COMPRESSION_DISABLED, comp_type);
>>> +		blt.dst.ptr = mid->ptr;
>>> +	}
>>> +
>>> +	blt_set_batch(&blt.bb, bb, bb_size, region1);
>>> +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, pext);
>>> +	intel_ctx_xe_sync(ctx, true);
>>> +
>>> +	WRITE_PNG(xe, run_id, "dst", &blt.dst, width, height);
>>> +
>>> +	result = memcmp(src->ptr, blt.dst.ptr, src->size);
>>> +
>>> +	/* Politely clean vm */
>>> +	put_offset(ahnd, src->handle);
>>> +	put_offset(ahnd, mid->handle);
>>> +	put_offset(ahnd, dst->handle);
>>> +	put_offset(ahnd, bb);
>>> +	intel_allocator_bind(ahnd, 0, 0);
>>> +	blt_destroy_object(xe, src);
>>> +	blt_destroy_object(xe, mid);
>>> +	blt_destroy_object(xe, dst);
>>> +	gem_close(xe, bb);
>>> +	put_ahnd(ahnd);
>>> +
>>> +	igt_assert_f(!result, "source and destination surfaces differ!\n");
>>> +}
>>> +
>>> +static void block_multicopy(int xe,
>>> +			    intel_ctx_t *ctx,
>>> +			    uint32_t region1, uint32_t region2,
>>> +			    enum blt_tiling_type mid_tiling,
>>> +			    const struct test_config *config)
>>> +{
>>> +	struct blt_copy3_data blt3 = {};
>>> +	struct blt_copy_data blt = {};
>>> +	struct blt_block_copy3_data_ext ext3 = {}, *pext3 = &ext3;
>>> +	struct blt_copy_object *src, *mid, *dst, *final;
>>> +	const uint32_t bpp = 32;
>>> +	uint64_t bb_size = xe_get_default_alignment(xe);
>>> +	uint64_t ahnd = intel_allocator_open(xe, ctx->vm, INTEL_ALLOCATOR_RELOC);
>>> +	uint32_t run_id = mid_tiling;
>>> +	uint32_t mid_region = region2, bb;
>>> +	uint32_t width = param.width, height = param.height;
>>> +	enum blt_compression mid_compression = config->compression;
>>> +	int mid_compression_format = param.compression_format;
>>> +	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
>>> +	uint8_t uc_mocs = intel_get_uc_mocs(xe);
>>> +	int result;
>>> +
>>> +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
>>> +
>>> +	if (!blt_uses_extended_block_copy(xe))
>>> +		pext3 = NULL;
>>> +
>>> +	blt_copy_init(xe, &blt);
>>> +
>>> +	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
>>> +				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
>>> +	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
>>> +				mid_tiling, mid_compression, comp_type, true);
>>> +	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
>>> +				mid_tiling, COMPRESSION_DISABLED, comp_type, true);
>>> +	final = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
>>> +				  T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
>>> +	igt_assert(src->size == dst->size);
>>> +	PRINT_SURFACE_INFO("src", src);
>>> +	PRINT_SURFACE_INFO("mid", mid);
>>> +	PRINT_SURFACE_INFO("dst", dst);
>>> +	PRINT_SURFACE_INFO("final", final);
>>> +
>>> +	blt_surface_fill_rect(xe, src, width, height);
>>> +
>>> +	blt3.color_depth = CD_32bit;
>>> +	blt3.print_bb = param.print_bb;
>>> +	blt_set_copy_object(&blt3.src, src);
>>> +	blt_set_copy_object(&blt3.mid, mid);
>>> +	blt_set_copy_object(&blt3.dst, dst);
>>> +	blt_set_copy_object(&blt3.final, final);
>>> +
>>> +	if (config->inplace) {
>>> +		blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
>>> +			       mid->mocs, mid_tiling, COMPRESSION_DISABLED,
>>> +			       comp_type);
>>> +		blt3.dst.ptr = mid->ptr;
>>> +	}
>>> +
>>> +	blt_set_object_ext(&ext3.src, 0, width, height, SURFACE_TYPE_2D);
>>> +	blt_set_object_ext(&ext3.mid, mid_compression_format, width, height, SURFACE_TYPE_2D);
>>> +	blt_set_object_ext(&ext3.dst, 0, width, height, SURFACE_TYPE_2D);
>>> +	blt_set_object_ext(&ext3.final, 0, width, height, SURFACE_TYPE_2D);
>>> +	blt_set_batch(&blt3.bb, bb, bb_size, region1);
>>> +
>>> +	blt_block_copy3(xe, ctx, ahnd, &blt3, pext3);
>>> +	intel_ctx_xe_sync(ctx, true);
>>> +
>>> +	WRITE_PNG(xe, run_id, "src", &blt3.src, width, height);
>>> +	if (!config->inplace)
>>> +		WRITE_PNG(xe, run_id, "mid", &blt3.mid, width, height);
>>> +	WRITE_PNG(xe, run_id, "dst", &blt3.dst, width, height);
>>> +	WRITE_PNG(xe, run_id, "final", &blt3.final, width, height);
>>> +
>>> +	result = memcmp(src->ptr, blt3.final.ptr, src->size);
>>> +
>>> +	put_offset(ahnd, src->handle);
>>> +	put_offset(ahnd, mid->handle);
>>> +	put_offset(ahnd, dst->handle);
>>> +	put_offset(ahnd, final->handle);
>>> +	put_offset(ahnd, bb);
>>> +	intel_allocator_bind(ahnd, 0, 0);
>>> +	blt_destroy_object(xe, src);
>>> +	blt_destroy_object(xe, mid);
>>> +	blt_destroy_object(xe, dst);
>>> +	blt_destroy_object(xe, final);
>>> +	gem_close(xe, bb);
>>> +	put_ahnd(ahnd);
>>> +
>>> +	igt_assert_f(!result, "source and destination surfaces differ!\n");
>>> +}
>>> +
>>> +enum copy_func {
>>> +	BLOCK_COPY,
>>> +	BLOCK_MULTICOPY,
>>> +};
>>> +
>>> +static const struct {
>>> +	const char *suffix;
>>> +	void (*copyfn)(int fd,
>>> +		       intel_ctx_t *ctx,
>>> +		       uint32_t region1, uint32_t region2,
>>> +		       enum blt_tiling_type btype,
>>> +		       const struct test_config *config);
>>> +} copyfns[] = {
>>> +	[BLOCK_COPY] = { "", block_copy },
>>> +	[BLOCK_MULTICOPY] = { "-multicopy", block_multicopy },
>>> +};
>>> +
>>> +static void block_copy_test(int xe,
>>> +			    const struct test_config *config,
>>> +			    struct igt_collection *set,
>>> +			    enum copy_func copy_function)
>>> +{
>>> +	struct drm_xe_engine_class_instance inst = {
>>> +		.engine_class = DRM_XE_ENGINE_CLASS_COPY,
>>> +	};
>>> +	intel_ctx_t *ctx;
>>> +	struct igt_collection *regions;
>>> +	uint32_t vm, engine;
>>> +	int tiling;
>>> +
>>> +	if (config->compression && !blt_block_copy_supports_compression(xe))
>>> +		return;
>>> +
>>> +	if (config->inplace && !config->compression)
>>> +		return;
>>> +
>>> +	for_each_tiling(tiling) {
>>> +		if (!blt_block_copy_supports_tiling(xe, tiling) ||
>>> +		    (param.tiling >= 0 && param.tiling != tiling))
>>> +			continue;
>>> +
>>> +		for_each_variation_r(regions, 2, set) {
>>> +			uint32_t region1, region2;
>>> +			char *regtxt;
>>> +
>>> +			region1 = igt_collection_get_value(regions, 0);
>>> +			region2 = igt_collection_get_value(regions, 1);
>>> +
>>> +			/* Compressed surface must be in device memory */
>>> +			if (config->compression && !XE_IS_VRAM_MEMORY_REGION(xe, region2))
>>> +				continue;
>>> +
>>> +			regtxt = xe_memregion_dynamic_subtest_name(xe, regions);
>>> +
>>> +			igt_dynamic_f("%s-%s-compfmt%d-%s%s",
>>> +				      blt_tiling_name(tiling),
>>> +				      config->compression ?
>>> +					      "compressed" : "uncompressed",
>>> +				      param.compression_format, regtxt,
>>> +				      copyfns[copy_function].suffix) {
>>> +				uint32_t sync_bind, sync_out;
>>> +
>>> +				vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
>>> +				engine = xe_engine_create(xe, vm, &inst, 0);
>>> +				sync_bind = syncobj_create(xe, 0);
>>> +				sync_out = syncobj_create(xe, 0);
>>> +				ctx = intel_ctx_xe(xe, vm, engine,
>>> +						   0, sync_bind, sync_out);
>>> +
>>> +				copyfns[copy_function].copyfn(xe, ctx,
>>> +							      region1, region2,
>>> +							      tiling, config);
>>> +
>>> +				xe_engine_destroy(xe, engine);
>>> +				xe_vm_destroy(xe, vm);
>>> +				syncobj_destroy(xe, sync_bind);
>>> +				syncobj_destroy(xe, sync_out);
>>> +				free(ctx);
>>> +			}
>>> +
>>> +			free(regtxt);
>>> +		}
>>> +	}
>>> +}
>>> +
>>> +static int opt_handler(int opt, int opt_index, void *data)
>>> +{
>>> +	switch (opt) {
>>> +	case 'b':
>>> +		param.print_bb = true;
>>> +		igt_debug("Print bb: %d\n", param.print_bb);
>>> +		break;
>>> +	case 'f':
>>> +		param.compression_format = atoi(optarg);
>>> +		igt_debug("Compression format: %d\n", param.compression_format);
>>> +		igt_assert((param.compression_format & ~0x1f) == 0);
>>> +		break;
>>> +	case 'p':
>>> +		param.write_png = true;
>>> +		igt_debug("Write png: %d\n", param.write_png);
>>> +		break;
>>> +	case 's':
>>> +		param.print_surface_info = true;
>>> +		igt_debug("Print surface info: %d\n", param.print_surface_info);
>>> +		break;
>>> +	case 't':
>>> +		param.tiling = atoi(optarg);
>>> +		igt_debug("Tiling: %d\n", param.tiling);
>>> +		break;
>>> +	case 'W':
>>> +		param.width = atoi(optarg);
>>> +		igt_debug("Width: %d\n", param.width);
>>> +		break;
>>> +	case 'H':
>>> +		param.height = atoi(optarg);
>>> +		igt_debug("Height: %d\n", param.height);
>>> +		break;
>>> +	default:
>>> +		return IGT_OPT_HANDLER_ERROR;
>>> +	}
>>> +
>>> +	return IGT_OPT_HANDLER_SUCCESS;
>>> +}
>>> +
>>> +const char *help_str =
>>> +	"  -b\tPrint bb\n"
>>> +	"  -f\tCompression format (0-31)\n"
>>> +	"  -p\tWrite PNG\n"
>>> +	"  -s\tPrint surface info\n"
>>> +	"  -t\tTiling format (0 - linear, 1 - XMAJOR, 2 - YMAJOR, 3 - TILE4, 4 - TILE64)\n"
>>> +	"  -W\tWidth (default 512)\n"
>>> +	"  -H\tHeight (default 512)"
>>> +	;
>>> +
>>> +igt_main_args("bf:pst:W:H:", NULL, help_str, opt_handler, NULL)
>>> +{
>>> +	struct igt_collection *set;
>>> +	int xe;
>>> +
>>> +	igt_fixture {
>>> +		xe = drm_open_driver(DRIVER_XE);
>>> +		igt_require(blt_has_block_copy(xe));
>>> +
>>> +		xe_device_get(xe);
>>> +
>>> +		set = xe_get_memory_region_set(xe,
>>> +					       XE_MEM_REGION_CLASS_SYSMEM,
>>> +					       XE_MEM_REGION_CLASS_VRAM);
>>> +	}
>>> +
>>> +	igt_describe("Check block-copy uncompressed blit");
>>> +	igt_subtest_with_dynamic("block-copy-uncompressed") {
>>> +		struct test_config config = {};
>>> +
>>> +		block_copy_test(xe, &config, set, BLOCK_COPY);
>>> +	}
>>> +
>>> +	igt_describe("Check block-copy flatccs compressed blit");
>>> +	igt_subtest_with_dynamic("block-copy-compressed") {
>>> +		struct test_config config = { .compression = true };
>>> +
>>> +		block_copy_test(xe, &config, set, BLOCK_COPY);
>>> +	}
>>> +
>>> +	igt_describe("Check block-multicopy flatccs compressed blit");
>>> +	igt_subtest_with_dynamic("block-multicopy-compressed") {
>>> +		struct test_config config = { .compression = true };
>>> +
>>> +		block_copy_test(xe, &config, set, BLOCK_MULTICOPY);
>>> +	}
>>> +
>>> +	igt_describe("Check block-multicopy flatccs inplace decompression blit");
>>> +	igt_subtest_with_dynamic("block-multicopy-inplace") {
>>> +		struct test_config config = { .compression = true,
>>> +					      .inplace = true };
>>> +
>>> +		block_copy_test(xe, &config, set, BLOCK_MULTICOPY);
>>> +	}
>>> +
>>> +	igt_describe("Check flatccs data can be copied from/to surface");
>>> +	igt_subtest_with_dynamic("ctrl-surf-copy") {
>>> +		struct test_config config = { .compression = true,
>>> +					      .surfcopy = true };
>>> +
>>> +		block_copy_test(xe, &config, set, BLOCK_COPY);
>>> +	}
>>> +
>>> +	igt_describe("Check flatccs data are physically tagged and visible"
>>> +		     " in different contexts");
>>> +	igt_subtest_with_dynamic("ctrl-surf-copy-new-ctx") {
>>> +		struct test_config config = { .compression = true,
>>> +					      .surfcopy = true,
>>> +					      .new_ctx = true };
>>> +
>>> +		block_copy_test(xe, &config, set, BLOCK_COPY);
>>> +	}
>>> +
>>> +	igt_describe("Check flatccs data persists after suspend / resume (S0)");
>>> +	igt_subtest_with_dynamic("suspend-resume") {
>>> +		struct test_config config = { .compression = true,
>>> +					      .surfcopy = true,
>>> +					      .suspend_resume = true };
>>> +
>>> +		block_copy_test(xe, &config, set, BLOCK_COPY);
>>> +	}
>>> +
>>> +	igt_fixture {
>>> +		xe_device_put(xe);
>>> +		close(xe);
>>> +	}
>>> +}


* Re: [igt-dev] [PATCH i-g-t v2 15/16] tests/xe_exercise_blt: Check blitter library fast-copy for Xe
  2023-07-07 11:10   ` Karolina Stolarek
@ 2023-07-11 11:07     ` Zbigniew Kempczyński
  2023-07-11 11:15       ` Karolina Stolarek
  0 siblings, 1 reply; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-11 11:07 UTC (permalink / raw)
  To: Karolina Stolarek; +Cc: igt-dev

On Fri, Jul 07, 2023 at 01:10:52PM +0200, Karolina Stolarek wrote:
> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> > Port this test to work on xe. Instead of adding conditional xe code,
> > which would decrease readability, this is a new test for xe.
> > 
> > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > ---
> >   tests/meson.build          |   1 +
> >   tests/xe/xe_exercise_blt.c | 372 +++++++++++++++++++++++++++++++++++++
> >   2 files changed, 373 insertions(+)
> >   create mode 100644 tests/xe/xe_exercise_blt.c
> > 
> > diff --git a/tests/meson.build b/tests/meson.build
> > index 9bca57a5e8..137a5cf01f 100644
> > --- a/tests/meson.build
> > +++ b/tests/meson.build
> > @@ -257,6 +257,7 @@ xe_progs = [
> >   	'xe_exec_reset',
> >   	'xe_exec_store',
> >   	'xe_exec_threads',
> > +	'xe_exercise_blt',
> >   	'xe_gpgpu_fill',
> >   	'xe_guc_pc',
> >   	'xe_huc_copy',
> > diff --git a/tests/xe/xe_exercise_blt.c b/tests/xe/xe_exercise_blt.c
> > new file mode 100644
> > index 0000000000..8340cf7148
> > --- /dev/null
> > +++ b/tests/xe/xe_exercise_blt.c
> > @@ -0,0 +1,372 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2023 Intel Corporation
> > + */
> > +
> > +#include "igt.h"
> > +#include "drm.h"
> > +#include "lib/intel_chipset.h"
> > +#include "intel_blt.h"
> > +#include "intel_mocs.h"
> > +#include "xe/xe_ioctl.h"
> > +#include "xe/xe_query.h"
> > +#include "xe/xe_util.h"
> > +
> > +/**
> > + * TEST: xe exercise blt
> > + * Description: Exercise blitter commands on Xe
> > + * Feature: blitter
> > + * Run type: FULL
> > + * Test category: GEM_Legacy
> > + *
> > + * SUBTEST: fast-copy
> > + * Description:
> > + *   Check fast-copy blit
> > + *   blitter
> > + *
> > + * SUBTEST: fast-copy-emit
> > + * Description:
> > + *   Check multiple fast-copy in one batch
> > + *   blitter
> > + */
> > +
> > +IGT_TEST_DESCRIPTION("Exercise blitter commands on Xe");
> > +
> > +static struct param {
> > +	int tiling;
> > +	bool write_png;
> > +	bool print_bb;
> > +	bool print_surface_info;
> > +	int width;
> > +	int height;
> > +} param = {
> > +	.tiling = -1,
> > +	.write_png = false,
> > +	.print_bb = false,
> > +	.print_surface_info = false,
> > +	.width = 512,
> > +	.height = 512,
> > +};
> > +
> > +#define PRINT_SURFACE_INFO(name, obj) do { \
> > +	if (param.print_surface_info) \
> > +		blt_surface_info((name), (obj)); } while (0)
> > +
> > +#define WRITE_PNG(fd, id, name, obj, w, h) do { \
> > +	if (param.write_png) \
> > +		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)
> > +
> 
> My suggestion with shared functions applies here as well
> 
> > +struct blt_fast_copy_data {
> > +	int xe;
> > +	struct blt_copy_object src;
> > +	struct blt_copy_object mid;
> > +	struct blt_copy_object dst;
> > +
> > +	struct blt_copy_batch bb;
> > +	enum blt_color_depth color_depth;
> > +
> > +	/* debug stuff */
> > +	bool print_bb;
> > +};
> > +
> > +static int fast_copy_one_bb(int xe,
> > +			    const intel_ctx_t *ctx,
> > +			    uint64_t ahnd,
> > +			    const struct blt_fast_copy_data *blt)
> > +{
> > +	struct blt_copy_data blt_tmp;
> > +	uint64_t bb_offset, alignment;
> > +	uint64_t bb_pos = 0;
> > +	int ret = 0;
> > +
> > +	alignment = xe_get_default_alignment(xe);
> > +
> > +	get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> > +	get_offset(ahnd, blt->mid.handle, blt->mid.size, alignment);
> > +	get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> > +	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> > +
> > +	/* First blit */
> > +	blt_copy_init(xe, &blt_tmp);
> > +	blt_tmp.src = blt->src;
> > +	blt_tmp.dst = blt->mid;
> > +	blt_tmp.bb = blt->bb;
> > +	blt_tmp.color_depth = blt->color_depth;
> > +	blt_tmp.print_bb = blt->print_bb;
> > +	bb_pos = emit_blt_fast_copy(xe, ahnd, &blt_tmp, bb_pos, false);
> > +
> > +	/* Second blit */
> > +	blt_copy_init(xe, &blt_tmp);
> > +	blt_tmp.src = blt->mid;
> > +	blt_tmp.dst = blt->dst;
> > +	blt_tmp.bb = blt->bb;
> > +	blt_tmp.color_depth = blt->color_depth;
> > +	blt_tmp.print_bb = blt->print_bb;
> > +	bb_pos = emit_blt_fast_copy(xe, ahnd, &blt_tmp, bb_pos, true);
> > +
> > +	intel_ctx_xe_exec(ctx, ahnd, bb_offset);
> > +
> > +	return ret;
> > +}
> > +
> > +static void fast_copy_emit(int xe, const intel_ctx_t *ctx,
> > +			   uint32_t region1, uint32_t region2,
> > +			   enum blt_tiling_type mid_tiling)
> > +{
> > +	struct blt_copy_data bltinit = {};
> > +	struct blt_fast_copy_data blt = {};
> > +	struct blt_copy_object *src, *mid, *dst;
> > +	const uint32_t bpp = 32;
> > +	uint64_t bb_size = xe_get_default_alignment(xe);
> > +	uint64_t ahnd = intel_allocator_open_full(xe, ctx->vm, 0, 0,
> > +						  INTEL_ALLOCATOR_SIMPLE,
> > +						  ALLOC_STRATEGY_LOW_TO_HIGH, 0);
> > +	uint32_t bb, width = param.width, height = param.height;
> > +	int result;
> > +
> > +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
> > +
> > +	blt_copy_init(xe, &bltinit);
> > +	src = blt_create_object(&bltinit, region1, width, height, bpp, 0,
> > +				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> > +	mid = blt_create_object(&bltinit, region2, width, height, bpp, 0,
> > +				mid_tiling, COMPRESSION_DISABLED, 0, true);
> > +	dst = blt_create_object(&bltinit, region1, width, height, bpp, 0,
> > +				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> > +	igt_assert(src->size == dst->size);
> > +
> > +	PRINT_SURFACE_INFO("src", src);
> > +	PRINT_SURFACE_INFO("mid", mid);
> > +	PRINT_SURFACE_INFO("dst", dst);
> > +
> > +	blt_surface_fill_rect(xe, src, width, height);
> > +	WRITE_PNG(xe, mid_tiling, "src", src, width, height);
> > +
> > +	memset(&blt, 0, sizeof(blt));
> > +	blt.color_depth = CD_32bit;
> > +	blt.print_bb = param.print_bb;
> > +	blt_set_copy_object(&blt.src, src);
> > +	blt_set_copy_object(&blt.mid, mid);
> > +	blt_set_copy_object(&blt.dst, dst);
> > +	blt_set_batch(&blt.bb, bb, bb_size, region1);
> > +
> > +	fast_copy_one_bb(xe, ctx, ahnd, &blt);
> > +
> > +	WRITE_PNG(xe, mid_tiling, "mid", &blt.mid, width, height);
> > +	WRITE_PNG(xe, mid_tiling, "dst", &blt.dst, width, height);
> > +
> > +	result = memcmp(src->ptr, blt.dst.ptr, src->size);
> > +
> > +	blt_destroy_object(xe, src);
> > +	blt_destroy_object(xe, mid);
> > +	blt_destroy_object(xe, dst);
> 
> I see that in block_copy tests we also call put_offset() for all copy
> objects' handles. Although we have a different allocator, I think that we
> should free the ranges on object destruction.
> 

I was a little surprised that we call put_offset() in fast-copy and
surf-copy. I mean this will rebind offsets unnecessarily. Ultimately we
should put them before releasing the ahnd, but for vm-bind (xe)
destroying the vm just does the job (and intel_allocator_init() frees
the dangling memory which wasn't released explicitly).

> > +	gem_close(xe, bb);
> > +	put_ahnd(ahnd);
> > +
> > +	munmap(&bb, bb_size);
> > +
> > +	igt_assert_f(!result, "source and destination surfaces differ!\n");
> > +}
> > +
> > +static void fast_copy(int xe, const intel_ctx_t *ctx,
> > +		      uint32_t region1, uint32_t region2,
> > +		      enum blt_tiling_type mid_tiling)
> > +{
> > +	struct blt_copy_data blt = {};
> > +	struct blt_copy_object *src, *mid, *dst;
> > +	const uint32_t bpp = 32;
> > +	uint64_t bb_size = xe_get_default_alignment(xe);
> > +	uint64_t ahnd = intel_allocator_open_full(xe, ctx->vm, 0, 0,
> > +						  INTEL_ALLOCATOR_SIMPLE,
> > +						  ALLOC_STRATEGY_LOW_TO_HIGH, 0);
> > +	uint32_t bb;
> > +	uint32_t width = param.width, height = param.height;
> > +	int result;
> > +
> > +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
> > +
> > +	blt_copy_init(xe, &blt);
> > +	src = blt_create_object(&blt, region1, width, height, bpp, 0,
> > +				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> > +	mid = blt_create_object(&blt, region2, width, height, bpp, 0,
> > +				mid_tiling, COMPRESSION_DISABLED, 0, true);
> > +	dst = blt_create_object(&blt, region1, width, height, bpp, 0,
> > +				T_LINEAR, COMPRESSION_DISABLED, 0, true);
> > +	igt_assert(src->size == dst->size);
> > +
> > +	blt_surface_fill_rect(xe, src, width, height);
> > +
> > +	blt.color_depth = CD_32bit;
> > +	blt.print_bb = param.print_bb;
> > +	blt_set_copy_object(&blt.src, src);
> > +	blt_set_copy_object(&blt.dst, mid);
> > +	blt_set_batch(&blt.bb, bb, bb_size, region1);
> > +
> > +	blt_fast_copy(xe, ctx, NULL, ahnd, &blt);
> > +
> > +	WRITE_PNG(xe, mid_tiling, "src", &blt.src, width, height);
> > +	WRITE_PNG(xe, mid_tiling, "mid", &blt.dst, width, height);
> > +
> > +	blt_copy_init(xe, &blt);
> > +	blt.color_depth = CD_32bit;
> > +	blt.print_bb = param.print_bb;
> > +	blt_set_copy_object(&blt.src, mid);
> > +	blt_set_copy_object(&blt.dst, dst);
> > +	blt_set_batch(&blt.bb, bb, bb_size, region1);
> > +
> > +	blt_fast_copy(xe, ctx, NULL, ahnd, &blt);
> > +
> > +	WRITE_PNG(xe, mid_tiling, "dst", &blt.dst, width, height);
> > +
> > +	result = memcmp(src->ptr, blt.dst.ptr, src->size);
> > +
> > +	blt_destroy_object(xe, src);
> > +	blt_destroy_object(xe, mid);
> > +	blt_destroy_object(xe, dst);
> > +	gem_close(xe, bb);
> > +	put_ahnd(ahnd);
> > +
> > +	igt_assert_f(!result, "source and destination surfaces differ!\n");
> > +}
> > +
> > +enum fast_copy_func {
> > +	FAST_COPY,
> > +	FAST_COPY_EMIT
> > +};
> > +
> > +static char
> > +	*full_subtest_str(char *regtxt, enum blt_tiling_type tiling,
> > +			  enum fast_copy_func func)
> > +{
> > +	char *name;
> > +	int len;
> > +
> > +	len = asprintf(&name, "%s-%s%s", blt_tiling_name(tiling), regtxt,
> > +		       func == FAST_COPY_EMIT ? "-emit" : "");
> > +
> > +	igt_assert_f(len >= 0, "asprintf failed!\n");
> > +
> > +	return name;
> > +}
> > +
> > +static void fast_copy_test(int xe,
> > +			   struct igt_collection *set,
> > +			   enum fast_copy_func func)
> > +{
> > +	struct drm_xe_engine_class_instance inst = {
> > +		.engine_class = DRM_XE_ENGINE_CLASS_COPY,
> > +	};
> > +	struct igt_collection *regions;
> > +	void (*copy_func)(int xe, const intel_ctx_t *ctx,
> > +			  uint32_t r1, uint32_t r2, enum blt_tiling_type tiling);
> > +	intel_ctx_t *ctx;
> > +	int tiling;
> > +
> > +	for_each_tiling(tiling) {
> > +		if (!blt_fast_copy_supports_tiling(xe, tiling))
> > +			continue;
> > +
> > +		for_each_variation_r(regions, 2, set) {
> > +			uint32_t region1, region2;
> > +			uint32_t vm, engine;
> > +			char *regtxt, *test_name;
> > +
> > +			region1 = igt_collection_get_value(regions, 0);
> > +			region2 = igt_collection_get_value(regions, 1);
> > +
> > +			vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
> > +			engine = xe_engine_create(xe, vm, &inst, 0);
> > +			ctx = intel_ctx_xe(xe, vm, engine, 0, 0, 0);
> > +
> > +			copy_func = (func == FAST_COPY) ? fast_copy : fast_copy_emit;
> > +			regtxt = xe_memregion_dynamic_subtest_name(xe, regions);
> > +			test_name = full_subtest_str(regtxt, tiling, func);
> > +
> > +			igt_dynamic_f("%s", test_name) {
> > +				copy_func(xe, ctx,
> > +					  region1, region2,
> > +					  tiling);
> > +			}
> > +
> > +			free(regtxt);
> > +			free(test_name);
> > +			xe_engine_destroy(xe, engine);
> > +			xe_vm_destroy(xe, vm);
> > +			free(ctx);
> > +		}
> > +	}
> > +}
> > +
> > +static int opt_handler(int opt, int opt_index, void *data)
> > +{
> > +	switch (opt) {
> > +	case 'b':
> > +		param.print_bb = true;
> > +		igt_debug("Print bb: %d\n", param.print_bb);
> > +		break;
> > +	case 'p':
> > +		param.write_png = true;
> > +		igt_debug("Write png: %d\n", param.write_png);
> > +		break;
> > +	case 's':
> > +		param.print_surface_info = true;
> > +		igt_debug("Print surface info: %d\n", param.print_surface_info);
> > +		break;
> > +	case 't':
> > +		param.tiling = atoi(optarg);
> > +		igt_debug("Tiling: %d\n", param.tiling);
> > +		break;
> > +	case 'W':
> > +		param.width = atoi(optarg);
> > +		igt_debug("Width: %d\n", param.width);
> > +		break;
> > +	case 'H':
> > +		param.height = atoi(optarg);
> > +		igt_debug("Height: %d\n", param.height);
> > +		break;
> > +	default:
> > +		return IGT_OPT_HANDLER_ERROR;
> > +	}
> > +
> > +	return IGT_OPT_HANDLER_SUCCESS;
> > +}
> > +
> > +const char *help_str =
> > +	"  -b\tPrint bb\n"
> > +	"  -p\tWrite PNG\n"
> > +	"  -s\tPrint surface info\n"
> > +	"  -t\tTiling format (0 - linear, 1 - XMAJOR, 2 - YMAJOR, 3 - TILE4, 4 - TILE64, 5 - YFMAJOR)\n"
> > +	"  -W\tWidth (default 512)\n"
> > +	"  -H\tHeight (default 512)"
> > +	;
> > +
> > +igt_main_args("b:pst:W:H:", NULL, help_str, opt_handler, NULL)
> > +{
> > +	struct igt_collection *set;
> > +	int xe;
> > +
> > +	igt_fixture {
> > +		xe = drm_open_driver(DRIVER_XE);
> > +		igt_require(blt_has_block_copy(xe));
> 
> Should be blt_has_fast_copy(xe)

Yes, I missed that when copying the fixture over from xe_ccs. Thanks for
spotting it.

--
Zbigniew

> 
> All the best,
> Karolina
> > +
> > +		xe_device_get(xe);
> > +
> > +		set = xe_get_memory_region_set(xe,
> > +					       XE_MEM_REGION_CLASS_SYSMEM,
> > +					       XE_MEM_REGION_CLASS_VRAM);
> > +	}
> > +
> > +	igt_describe("Check fast-copy blit");
> > +	igt_subtest_with_dynamic("fast-copy") {
> > +		fast_copy_test(xe, set, FAST_COPY);
> > +	}
> > +
> > +	igt_describe("Check multiple fast-copy in one batch");
> > +	igt_subtest_with_dynamic("fast-copy-emit") {
> > +		fast_copy_test(xe, set, FAST_COPY_EMIT);
> > +	}
> > +
> > +	igt_fixture {
> > +		drm_close_driver(xe);
> > +	}
> > +}


* Re: [igt-dev] [PATCH i-g-t v2 15/16] tests/xe_exercise_blt: Check blitter library fast-copy for Xe
  2023-07-11 11:07     ` Zbigniew Kempczyński
@ 2023-07-11 11:15       ` Karolina Stolarek
  0 siblings, 0 replies; 46+ messages in thread
From: Karolina Stolarek @ 2023-07-11 11:15 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

On 11.07.2023 13:07, Zbigniew Kempczyński wrote:
> On Fri, Jul 07, 2023 at 01:10:52PM +0200, Karolina Stolarek wrote:
>> On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
>>> Port this test to work on xe. Instead of adding conditional code for
>>> xe, which would decrease readability, this is a new test for xe.
>>>
>>> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
>>> ---
>>>    tests/meson.build          |   1 +
>>>    tests/xe/xe_exercise_blt.c | 372 +++++++++++++++++++++++++++++++++++++
>>>    2 files changed, 373 insertions(+)
>>>    create mode 100644 tests/xe/xe_exercise_blt.c
>>>
>>> diff --git a/tests/meson.build b/tests/meson.build
>>> index 9bca57a5e8..137a5cf01f 100644
>>> --- a/tests/meson.build
>>> +++ b/tests/meson.build
>>> @@ -257,6 +257,7 @@ xe_progs = [
>>>    	'xe_exec_reset',
>>>    	'xe_exec_store',
>>>    	'xe_exec_threads',
>>> +	'xe_exercise_blt',
>>>    	'xe_gpgpu_fill',
>>>    	'xe_guc_pc',
>>>    	'xe_huc_copy',
>>> diff --git a/tests/xe/xe_exercise_blt.c b/tests/xe/xe_exercise_blt.c
>>> new file mode 100644
>>> index 0000000000..8340cf7148
>>> --- /dev/null
>>> +++ b/tests/xe/xe_exercise_blt.c
>>> @@ -0,0 +1,372 @@
>>> +// SPDX-License-Identifier: MIT
>>> +/*
>>> + * Copyright © 2023 Intel Corporation
>>> + */
>>> +
>>> +#include "igt.h"
>>> +#include "drm.h"
>>> +#include "lib/intel_chipset.h"
>>> +#include "intel_blt.h"
>>> +#include "intel_mocs.h"
>>> +#include "xe/xe_ioctl.h"
>>> +#include "xe/xe_query.h"
>>> +#include "xe/xe_util.h"
>>> +
>>> +/**
>>> + * TEST: xe exercise blt
>>> + * Description: Exercise blitter commands on Xe
>>> + * Feature: blitter
>>> + * Run type: FULL
>>> + * Test category: GEM_Legacy
>>> + *
>>> + * SUBTEST: fast-copy
>>> + * Description:
>>> + *   Check fast-copy blit
>>> + *   blitter
>>> + *
>>> + * SUBTEST: fast-copy-emit
>>> + * Description:
>>> + *   Check multiple fast-copy in one batch
>>> + *   blitter
>>> + */
>>> +
>>> +IGT_TEST_DESCRIPTION("Exercise blitter commands on Xe");
>>> +
>>> +static struct param {
>>> +	int tiling;
>>> +	bool write_png;
>>> +	bool print_bb;
>>> +	bool print_surface_info;
>>> +	int width;
>>> +	int height;
>>> +} param = {
>>> +	.tiling = -1,
>>> +	.write_png = false,
>>> +	.print_bb = false,
>>> +	.print_surface_info = false,
>>> +	.width = 512,
>>> +	.height = 512,
>>> +};
>>> +
>>> +#define PRINT_SURFACE_INFO(name, obj) do { \
>>> +	if (param.print_surface_info) \
>>> +		blt_surface_info((name), (obj)); } while (0)
>>> +
>>> +#define WRITE_PNG(fd, id, name, obj, w, h) do { \
>>> +	if (param.write_png) \
>>> +		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)
>>> +
>>
>> My suggestion with shared functions applies here as well
>>
>>> +struct blt_fast_copy_data {
>>> +	int xe;
>>> +	struct blt_copy_object src;
>>> +	struct blt_copy_object mid;
>>> +	struct blt_copy_object dst;
>>> +
>>> +	struct blt_copy_batch bb;
>>> +	enum blt_color_depth color_depth;
>>> +
>>> +	/* debug stuff */
>>> +	bool print_bb;
>>> +};
>>> +
>>> +static int fast_copy_one_bb(int xe,
>>> +			    const intel_ctx_t *ctx,
>>> +			    uint64_t ahnd,
>>> +			    const struct blt_fast_copy_data *blt)
>>> +{
>>> +	struct blt_copy_data blt_tmp;
>>> +	uint64_t bb_offset, alignment;
>>> +	uint64_t bb_pos = 0;
>>> +	int ret = 0;
>>> +
>>> +	alignment = xe_get_default_alignment(xe);
>>> +
>>> +	get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
>>> +	get_offset(ahnd, blt->mid.handle, blt->mid.size, alignment);
>>> +	get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
>>> +	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
>>> +
>>> +	/* First blit */
>>> +	blt_copy_init(xe, &blt_tmp);
>>> +	blt_tmp.src = blt->src;
>>> +	blt_tmp.dst = blt->mid;
>>> +	blt_tmp.bb = blt->bb;
>>> +	blt_tmp.color_depth = blt->color_depth;
>>> +	blt_tmp.print_bb = blt->print_bb;
>>> +	bb_pos = emit_blt_fast_copy(xe, ahnd, &blt_tmp, bb_pos, false);
>>> +
>>> +	/* Second blit */
>>> +	blt_copy_init(xe, &blt_tmp);
>>> +	blt_tmp.src = blt->mid;
>>> +	blt_tmp.dst = blt->dst;
>>> +	blt_tmp.bb = blt->bb;
>>> +	blt_tmp.color_depth = blt->color_depth;
>>> +	blt_tmp.print_bb = blt->print_bb;
>>> +	bb_pos = emit_blt_fast_copy(xe, ahnd, &blt_tmp, bb_pos, true);
>>> +
>>> +	intel_ctx_xe_exec(ctx, ahnd, bb_offset);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static void fast_copy_emit(int xe, const intel_ctx_t *ctx,
>>> +			   uint32_t region1, uint32_t region2,
>>> +			   enum blt_tiling_type mid_tiling)
>>> +{
>>> +	struct blt_copy_data bltinit = {};
>>> +	struct blt_fast_copy_data blt = {};
>>> +	struct blt_copy_object *src, *mid, *dst;
>>> +	const uint32_t bpp = 32;
>>> +	uint64_t bb_size = xe_get_default_alignment(xe);
>>> +	uint64_t ahnd = intel_allocator_open_full(xe, ctx->vm, 0, 0,
>>> +						  INTEL_ALLOCATOR_SIMPLE,
>>> +						  ALLOC_STRATEGY_LOW_TO_HIGH, 0);
>>> +	uint32_t bb, width = param.width, height = param.height;
>>> +	int result;
>>> +
>>> +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
>>> +
>>> +	blt_copy_init(xe, &bltinit);
>>> +	src = blt_create_object(&bltinit, region1, width, height, bpp, 0,
>>> +				T_LINEAR, COMPRESSION_DISABLED, 0, true);
>>> +	mid = blt_create_object(&bltinit, region2, width, height, bpp, 0,
>>> +				mid_tiling, COMPRESSION_DISABLED, 0, true);
>>> +	dst = blt_create_object(&bltinit, region1, width, height, bpp, 0,
>>> +				T_LINEAR, COMPRESSION_DISABLED, 0, true);
>>> +	igt_assert(src->size == dst->size);
>>> +
>>> +	PRINT_SURFACE_INFO("src", src);
>>> +	PRINT_SURFACE_INFO("mid", mid);
>>> +	PRINT_SURFACE_INFO("dst", dst);
>>> +
>>> +	blt_surface_fill_rect(xe, src, width, height);
>>> +	WRITE_PNG(xe, mid_tiling, "src", src, width, height);
>>> +
>>> +	memset(&blt, 0, sizeof(blt));
>>> +	blt.color_depth = CD_32bit;
>>> +	blt.print_bb = param.print_bb;
>>> +	blt_set_copy_object(&blt.src, src);
>>> +	blt_set_copy_object(&blt.mid, mid);
>>> +	blt_set_copy_object(&blt.dst, dst);
>>> +	blt_set_batch(&blt.bb, bb, bb_size, region1);
>>> +
>>> +	fast_copy_one_bb(xe, ctx, ahnd, &blt);
>>> +
>>> +	WRITE_PNG(xe, mid_tiling, "mid", &blt.mid, width, height);
>>> +	WRITE_PNG(xe, mid_tiling, "dst", &blt.dst, width, height);
>>> +
>>> +	result = memcmp(src->ptr, blt.dst.ptr, src->size);
>>> +
>>> +	blt_destroy_object(xe, src);
>>> +	blt_destroy_object(xe, mid);
>>> +	blt_destroy_object(xe, dst);
>>
>> I see that in block_copy tests we also call put_offset() for all copy
>> objects' handles. Although we have a different allocator, I think that we
>> should free the ranges on object destruction.
>>
> 
> I was a little surprised that we call put_offset() in fast-copy and
> surf-copy. I mean this will rebind offsets unnecessarily. Ultimately we
> should put them before releasing the ahnd, but for vm-bind (xe)
> destroying the vm just does the job (and intel_allocator_init() frees
> the dangling memory which wasn't released explicitly).

I see, good to know! Thanks for the explanation.

All the best,
Karolina

> 
>>> +	gem_close(xe, bb);
>>> +	put_ahnd(ahnd);
>>> +
>>> +	munmap(&bb, bb_size);
>>> +
>>> +	igt_assert_f(!result, "source and destination surfaces differ!\n");
>>> +}
>>> +
>>> +static void fast_copy(int xe, const intel_ctx_t *ctx,
>>> +		      uint32_t region1, uint32_t region2,
>>> +		      enum blt_tiling_type mid_tiling)
>>> +{
>>> +	struct blt_copy_data blt = {};
>>> +	struct blt_copy_object *src, *mid, *dst;
>>> +	const uint32_t bpp = 32;
>>> +	uint64_t bb_size = xe_get_default_alignment(xe);
>>> +	uint64_t ahnd = intel_allocator_open_full(xe, ctx->vm, 0, 0,
>>> +						  INTEL_ALLOCATOR_SIMPLE,
>>> +						  ALLOC_STRATEGY_LOW_TO_HIGH, 0);
>>> +	uint32_t bb;
>>> +	uint32_t width = param.width, height = param.height;
>>> +	int result;
>>> +
>>> +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
>>> +
>>> +	blt_copy_init(xe, &blt);
>>> +	src = blt_create_object(&blt, region1, width, height, bpp, 0,
>>> +				T_LINEAR, COMPRESSION_DISABLED, 0, true);
>>> +	mid = blt_create_object(&blt, region2, width, height, bpp, 0,
>>> +				mid_tiling, COMPRESSION_DISABLED, 0, true);
>>> +	dst = blt_create_object(&blt, region1, width, height, bpp, 0,
>>> +				T_LINEAR, COMPRESSION_DISABLED, 0, true);
>>> +	igt_assert(src->size == dst->size);
>>> +
>>> +	blt_surface_fill_rect(xe, src, width, height);
>>> +
>>> +	blt.color_depth = CD_32bit;
>>> +	blt.print_bb = param.print_bb;
>>> +	blt_set_copy_object(&blt.src, src);
>>> +	blt_set_copy_object(&blt.dst, mid);
>>> +	blt_set_batch(&blt.bb, bb, bb_size, region1);
>>> +
>>> +	blt_fast_copy(xe, ctx, NULL, ahnd, &blt);
>>> +
>>> +	WRITE_PNG(xe, mid_tiling, "src", &blt.src, width, height);
>>> +	WRITE_PNG(xe, mid_tiling, "mid", &blt.dst, width, height);
>>> +
>>> +	blt_copy_init(xe, &blt);
>>> +	blt.color_depth = CD_32bit;
>>> +	blt.print_bb = param.print_bb;
>>> +	blt_set_copy_object(&blt.src, mid);
>>> +	blt_set_copy_object(&blt.dst, dst);
>>> +	blt_set_batch(&blt.bb, bb, bb_size, region1);
>>> +
>>> +	blt_fast_copy(xe, ctx, NULL, ahnd, &blt);
>>> +
>>> +	WRITE_PNG(xe, mid_tiling, "dst", &blt.dst, width, height);
>>> +
>>> +	result = memcmp(src->ptr, blt.dst.ptr, src->size);
>>> +
>>> +	blt_destroy_object(xe, src);
>>> +	blt_destroy_object(xe, mid);
>>> +	blt_destroy_object(xe, dst);
>>> +	gem_close(xe, bb);
>>> +	put_ahnd(ahnd);
>>> +
>>> +	igt_assert_f(!result, "source and destination surfaces differ!\n");
>>> +}
>>> +
>>> +enum fast_copy_func {
>>> +	FAST_COPY,
>>> +	FAST_COPY_EMIT
>>> +};
>>> +
>>> +static char
>>> +	*full_subtest_str(char *regtxt, enum blt_tiling_type tiling,
>>> +			  enum fast_copy_func func)
>>> +{
>>> +	char *name;
>>> +	uint32_t len;
>>> +
>>> +	len = asprintf(&name, "%s-%s%s", blt_tiling_name(tiling), regtxt,
>>> +		       func == FAST_COPY_EMIT ? "-emit" : "");
>>> +
>>> +	igt_assert_f(len >= 0, "asprintf failed!\n");
>>> +
>>> +	return name;
>>> +}
>>> +
>>> +static void fast_copy_test(int xe,
>>> +			   struct igt_collection *set,
>>> +			   enum fast_copy_func func)
>>> +{
>>> +	struct drm_xe_engine_class_instance inst = {
>>> +		.engine_class = DRM_XE_ENGINE_CLASS_COPY,
>>> +	};
>>> +	struct igt_collection *regions;
>>> +	void (*copy_func)(int xe, const intel_ctx_t *ctx,
>>> +			  uint32_t r1, uint32_t r2, enum blt_tiling_type tiling);
>>> +	intel_ctx_t *ctx;
>>> +	int tiling;
>>> +
>>> +	for_each_tiling(tiling) {
>>> +		if (!blt_fast_copy_supports_tiling(xe, tiling))
>>> +			continue;
>>> +
>>> +		for_each_variation_r(regions, 2, set) {
>>> +			uint32_t region1, region2;
>>> +			uint32_t vm, engine;
>>> +			char *regtxt, *test_name;
>>> +
>>> +			region1 = igt_collection_get_value(regions, 0);
>>> +			region2 = igt_collection_get_value(regions, 1);
>>> +
>>> +			vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
>>> +			engine = xe_engine_create(xe, vm, &inst, 0);
>>> +			ctx = intel_ctx_xe(xe, vm, engine, 0, 0, 0);
>>> +
>>> +			copy_func = (func == FAST_COPY) ? fast_copy : fast_copy_emit;
>>> +			regtxt = xe_memregion_dynamic_subtest_name(xe, regions);
>>> +			test_name = full_subtest_str(regtxt, tiling, func);
>>> +
>>> +			igt_dynamic_f("%s", test_name) {
>>> +				copy_func(xe, ctx,
>>> +					  region1, region2,
>>> +					  tiling);
>>> +			}
>>> +
>>> +			free(regtxt);
>>> +			free(test_name);
>>> +			xe_engine_destroy(xe, engine);
>>> +			xe_vm_destroy(xe, vm);
>>> +			free(ctx);
>>> +		}
>>> +	}
>>> +}
>>> +
>>> +static int opt_handler(int opt, int opt_index, void *data)
>>> +{
>>> +	switch (opt) {
>>> +	case 'b':
>>> +		param.print_bb = true;
>>> +		igt_debug("Print bb: %d\n", param.print_bb);
>>> +		break;
>>> +	case 'p':
>>> +		param.write_png = true;
>>> +		igt_debug("Write png: %d\n", param.write_png);
>>> +		break;
>>> +	case 's':
>>> +		param.print_surface_info = true;
>>> +		igt_debug("Print surface info: %d\n", param.print_surface_info);
>>> +		break;
>>> +	case 't':
>>> +		param.tiling = atoi(optarg);
>>> +		igt_debug("Tiling: %d\n", param.tiling);
>>> +		break;
>>> +	case 'W':
>>> +		param.width = atoi(optarg);
>>> +		igt_debug("Width: %d\n", param.width);
>>> +		break;
>>> +	case 'H':
>>> +		param.height = atoi(optarg);
>>> +		igt_debug("Height: %d\n", param.height);
>>> +		break;
>>> +	default:
>>> +		return IGT_OPT_HANDLER_ERROR;
>>> +	}
>>> +
>>> +	return IGT_OPT_HANDLER_SUCCESS;
>>> +}
>>> +
>>> +const char *help_str =
>>> +	"  -b\tPrint bb\n"
>>> +	"  -p\tWrite PNG\n"
>>> +	"  -s\tPrint surface info\n"
>>> +	"  -t\tTiling format (0 - linear, 1 - XMAJOR, 2 - YMAJOR, 3 - TILE4, 4 - TILE64, 5 - YFMAJOR)\n"
>>> +	"  -W\tWidth (default 512)\n"
>>> +	"  -H\tHeight (default 512)"
>>> +	;
>>> +
>>> +igt_main_args("b:pst:W:H:", NULL, help_str, opt_handler, NULL)
>>> +{
>>> +	struct igt_collection *set;
>>> +	int xe;
>>> +
>>> +	igt_fixture {
>>> +		xe = drm_open_driver(DRIVER_XE);
>>> +		igt_require(blt_has_block_copy(xe));
>>
>> Should be blt_has_fast_copy(xe)
> 
> Yes, I missed that when copying the fixture over from xe_ccs. Thanks for
> spotting it.
> 
> --
> Zbigniew
> 
>>
>> All the best,
>> Karolina
>>> +
>>> +		xe_device_get(xe);
>>> +
>>> +		set = xe_get_memory_region_set(xe,
>>> +					       XE_MEM_REGION_CLASS_SYSMEM,
>>> +					       XE_MEM_REGION_CLASS_VRAM);
>>> +	}
>>> +
>>> +	igt_describe("Check fast-copy blit");
>>> +	igt_subtest_with_dynamic("fast-copy") {
>>> +		fast_copy_test(xe, set, FAST_COPY);
>>> +	}
>>> +
>>> +	igt_describe("Check multiple fast-copy in one batch");
>>> +	igt_subtest_with_dynamic("fast-copy-emit") {
>>> +		fast_copy_test(xe, set, FAST_COPY_EMIT);
>>> +	}
>>> +
>>> +	igt_fixture {
>>> +		drm_close_driver(xe);
>>> +	}
>>> +}


* Re: [igt-dev] [PATCH i-g-t v2 14/16] tests/xe_ccs: Check if flatccs is working with block-copy for Xe
  2023-07-11 10:51       ` Karolina Stolarek
@ 2023-07-12  7:00         ` Zbigniew Kempczyński
  0 siblings, 0 replies; 46+ messages in thread
From: Zbigniew Kempczyński @ 2023-07-12  7:00 UTC (permalink / raw)
  To: Karolina Stolarek; +Cc: igt-dev

On Tue, Jul 11, 2023 at 12:51:01PM +0200, Karolina Stolarek wrote:
> On 11.07.2023 12:45, Zbigniew Kempczyński wrote:
> > On Fri, Jul 07, 2023 at 12:05:25PM +0200, Karolina Stolarek wrote:
> > > On 6.07.2023 08:05, Zbigniew Kempczyński wrote:
> > > > This is a copy of the i915 gem_ccs test ported to xe. Ported means all
> > > > driver-dependent calls - like working on regions, binding and execution -
> > > > were replaced by xe counterparts. I considered adding conditionals for xe
> > > > in gem_ccs, but that would decrease test readability, so I dropped
> > > > the idea.
> > > > 
> > > > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> > > > ---
> > > >    tests/meson.build |   1 +
> > > >    tests/xe/xe_ccs.c | 763 ++++++++++++++++++++++++++++++++++++++++++++++
> > > >    2 files changed, 764 insertions(+)
> > > >    create mode 100644 tests/xe/xe_ccs.c
> > > > 
> > > > diff --git a/tests/meson.build b/tests/meson.build
> > > > index ee066b8490..9bca57a5e8 100644
> > > > --- a/tests/meson.build
> > > > +++ b/tests/meson.build
> > > > @@ -244,6 +244,7 @@ i915_progs = [
> > > >    ]
> > > >    xe_progs = [
> > > > +	'xe_ccs',
> > > >    	'xe_create',
> > > >    	'xe_compute',
> > > >    	'xe_dma_buf_sync',
> > > > diff --git a/tests/xe/xe_ccs.c b/tests/xe/xe_ccs.c
> > > > new file mode 100644
> > > > index 0000000000..e6bb29a5ed
> > > > --- /dev/null
> > > > +++ b/tests/xe/xe_ccs.c
> > > > @@ -0,0 +1,763 @@
> > > > +// SPDX-License-Identifier: MIT
> > > > +/*
> > > > + * Copyright © 2023 Intel Corporation
> > > > + */
> > > > +
> > > > +#include <errno.h>
> > > > +#include <glib.h>
> > > > +#include <sys/ioctl.h>
> > > > +#include <sys/time.h>
> > > > +#include <malloc.h>
> > > > +#include "drm.h"
> > > > +#include "igt.h"
> > > > +#include "igt_syncobj.h"
> > > > +#include "intel_blt.h"
> > > > +#include "intel_mocs.h"
> > > > +#include "xe/xe_ioctl.h"
> > > > +#include "xe/xe_query.h"
> > > > +#include "xe/xe_util.h"
> > > > +/**
> > > > + * TEST: xe ccs
> > > > + * Description: Exercise gen12 blitter with and without flatccs compression on Xe
> > > > + * Run type: FULL
> > > > + *
> > > > + * SUBTEST: block-copy-compressed
> > > > + * Description: Check block-copy flatccs compressed blit
> > > > + *
> > > > + * SUBTEST: block-copy-uncompressed
> > > > + * Description: Check block-copy uncompressed blit
> > > > + *
> > > > + * SUBTEST: block-multicopy-compressed
> > > > + * Description: Check block-multicopy flatccs compressed blit
> > > > + *
> > > > + * SUBTEST: block-multicopy-inplace
> > > > + * Description: Check block-multicopy flatccs inplace decompression blit
> > > > + *
> > > > + * SUBTEST: ctrl-surf-copy
> > > > + * Description: Check flatccs data can be copied from/to surface
> > > > + *
> > > > + * SUBTEST: ctrl-surf-copy-new-ctx
> > > > + * Description: Check flatccs data are physically tagged and visible in vm
> > > > + *
> > > > + * SUBTEST: suspend-resume
> > > > + * Description: Check flatccs data persists after suspend / resume (S0)
> > > > + */
> > > > +
> > > > +IGT_TEST_DESCRIPTION("Exercise gen12 blitter with and without flatccs compression on Xe");
> > > > +
> > > > +static struct param {
> > > > +	int compression_format;
> > > > +	int tiling;
> > > > +	bool write_png;
> > > > +	bool print_bb;
> > > > +	bool print_surface_info;
> > > > +	int width;
> > > > +	int height;
> > > > +} param = {
> > > > +	.compression_format = 0,
> > > > +	.tiling = -1,
> > > > +	.write_png = false,
> > > > +	.print_bb = false,
> > > > +	.print_surface_info = false,
> > > > +	.width = 512,
> > > > +	.height = 512,
> > > > +};
> > > > +
> > > > +struct test_config {
> > > > +	bool compression;
> > > > +	bool inplace;
> > > > +	bool surfcopy;
> > > > +	bool new_ctx;
> > > > +	bool suspend_resume;
> > > > +};
> > > > +
> > > > +static void set_surf_object(struct blt_ctrl_surf_copy_object *obj,
> > > > +			    uint32_t handle, uint32_t region, uint64_t size,
> > > > +			    uint8_t mocs, enum blt_access_type access_type)
> > > > +{
> > > > +	obj->handle = handle;
> > > > +	obj->region = region;
> > > > +	obj->size = size;
> > > > +	obj->mocs = mocs;
> > > > +	obj->access_type = access_type;
> > > > +}
> > > > +
> > > > +#define PRINT_SURFACE_INFO(name, obj) do { \
> > > > +	if (param.print_surface_info) \
> > > > +		blt_surface_info((name), (obj)); } while (0)
> > > > +
> > > > +#define WRITE_PNG(fd, id, name, obj, w, h) do { \
> > > > +	if (param.write_png) \
> > > > +		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)
> > > > +
> > > > +static int compare_nxn(const struct blt_copy_object *surf1,
> > > > +		       const struct blt_copy_object *surf2,
> > > > +		       int xsize, int ysize, int bx, int by)
> > > 
> > > I think that you could avoid some repetition by creating a small lib with
> > > blt copy test helpers. For example, we have 4 definitions of WRITE_PNG, and
> > > it would be good to have just one and call it in four separate tests.
> > > 
> > 
> > Regarding the WRITE_PNG() macro: I'm not convinced we should export it
> > in this form. I mean, there's no additional logic beyond checking the
> > param.write_png field. I imagine something like this:
> > 
> > #define BLT_WRITE_PNG(conditional, fd, id, name, obj, w, h) do { \
> > 	if (conditional) \
> > 		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)
> > 
> > and calling in the code:
> > 
> > BLT_WRITE_PNG(param.write_png, fd, ...);
> > 
> > It looks weird imo, but maybe that's just my subjective assessment. If the
> > above is fine with you, I'll add this in v4.
> 
> If you change "conditional" to something like "write_png", I'll be fine with
> it. I don't see much value in having 4 (?) definitions that are the same in
> the codebase.
> 

Ok, I'll provide a function instead of a macro. I'll send it in v4.

Thanks for the review.
--
Zbigniew

> > 
> > > > +{
> > > > +	int x, y, corrupted;
> > > > +	uint32_t pos, px1, px2;
> > > > +
> > > > +	corrupted = 0;
> > > > +	for (y = 0; y < ysize; y++) {
> > > > +		for (x = 0; x < xsize; x++) {
> > > > +			pos = bx * xsize + by * ysize * surf1->pitch / 4;
> > > > +			pos += x + y * surf1->pitch / 4;
> > > > +			px1 = surf1->ptr[pos];
> > > > +			px2 = surf2->ptr[pos];
> > > > +			if (px1 != px2)
> > > > +				corrupted++;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	return corrupted;
> > > > +}
> > > > +
> > > > +static void dump_corruption_info(const struct blt_copy_object *surf1,
> > > > +				 const struct blt_copy_object *surf2)
> > > > +{
> > > > +	const int xsize = 8, ysize = 8;
> > > > +	int w, h, bx, by, corrupted;
> > > > +
> > > > +	igt_assert(surf1->x1 == surf2->x1 && surf1->x2 == surf2->x2);
> > > > +	igt_assert(surf1->y1 == surf2->y1 && surf1->y2 == surf2->y2);
> > > > +	w = surf1->x2;
> > > > +	h = surf1->y2;
> > > > +
> > > > +	igt_info("dump corruption - width: %d, height: %d, sizex: %x, sizey: %x\n",
> > > > +		 surf1->x2, surf1->y2, xsize, ysize);
> > > > +
> > > > +	for (by = 0; by < h / ysize; by++) {
> > > > +		for (bx = 0; bx < w / xsize; bx++) {
> > > > +			corrupted = compare_nxn(surf1, surf2, xsize, ysize, bx, by);
> > > > +			if (corrupted == 0)
> > > > +				igt_info(".");
> > > > +			else
> > > > +				igt_info("%c", '0' + corrupted);
> > > > +		}
> > > > +		igt_info("\n");
> > > > +	}
> > > > +}
> > > > +
> > > > +static void surf_copy(int xe,
> > > > +		      intel_ctx_t *ctx,
> > > > +		      uint64_t ahnd,
> > > > +		      const struct blt_copy_object *src,
> > > > +		      const struct blt_copy_object *mid,
> > > > +		      const struct blt_copy_object *dst,
> > > > +		      int run_id, bool suspend_resume)
> > > > +{
> > > > +	struct blt_copy_data blt = {};
> > > > +	struct blt_block_copy_data_ext ext = {};
> > > > +	struct blt_ctrl_surf_copy_data surf = {};
> > > > +	uint32_t bb1, bb2, ccs, ccs2, *ccsmap, *ccsmap2;
> > > > +	uint64_t bb_size, ccssize = mid->size / CCS_RATIO;
> > > > +	uint32_t *ccscopy;
> > > > +	uint8_t uc_mocs = intel_get_uc_mocs(xe);
> > > > +	uint32_t sysmem = system_memory(xe);
> > > > +	int result;
> > > > +
> > > > +	igt_assert(mid->compression);
> > > > +	ccscopy = (uint32_t *) malloc(ccssize);
> > > > +	ccs = xe_bo_create_flags(xe, 0, ccssize, sysmem);
> > > > +	ccs2 = xe_bo_create_flags(xe, 0, ccssize, sysmem);
> > > > +
> > > > +	blt_ctrl_surf_copy_init(xe, &surf);
> > > > +	surf.print_bb = param.print_bb;
> > > > +	set_surf_object(&surf.src, mid->handle, mid->region, mid->size,
> > > > +			uc_mocs, BLT_INDIRECT_ACCESS);
> > > > +	set_surf_object(&surf.dst, ccs, sysmem, ccssize, uc_mocs, DIRECT_ACCESS);
> > > > +	bb_size = xe_get_default_alignment(xe);
> > > > +	bb1 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
> > > > +	blt_set_batch(&surf.bb, bb1, bb_size, sysmem);
> > > > +	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> > > > +	intel_ctx_xe_sync(ctx, true);
> > > > +
> > > > +	ccsmap = xe_bo_map(xe, ccs, surf.dst.size);
> > > > +	memcpy(ccscopy, ccsmap, ccssize);
> > > > +
> > > > +	if (suspend_resume) {
> > > > +		char *orig, *orig2, *newsum, *newsum2;
> > > > +
> > > > +		orig = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
> > > > +						   (void *)ccsmap, surf.dst.size);
> > > > +		orig2 = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
> > > > +						    (void *)mid->ptr, mid->size);
> > > > +
> > > > +		igt_system_suspend_autoresume(SUSPEND_STATE_FREEZE, SUSPEND_TEST_NONE);
> > > > +
> > > > +		set_surf_object(&surf.dst, ccs2, REGION_SMEM, ccssize,
> > > 
> > > Shouldn't this be sysmem instead of REGION_SMEM?
> > > 
> > 
> > Yes, missed this line.
> > 
> > > > +				0, DIRECT_ACCESS);
> > > > +		blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> > > > +		intel_ctx_xe_sync(ctx, true);
> > > > +
> > > > +		ccsmap2 = xe_bo_map(xe, ccs2, surf.dst.size);
> > > > +		newsum = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
> > > > +						     (void *)ccsmap2, surf.dst.size);
> > > > +		newsum2 = g_compute_checksum_for_data(G_CHECKSUM_SHA1,
> > > > +						      (void *)mid->ptr, mid->size);
> > > > +
> > > > +		munmap(ccsmap2, ccssize);
> > > > +		igt_assert(!strcmp(orig, newsum));
> > > > +		igt_assert(!strcmp(orig2, newsum2));
> > > > +		g_free(orig);
> > > > +		g_free(orig2);
> > > > +		g_free(newsum);
> > > > +		g_free(newsum2);
> > > > +	}
> > > > +
> > > > +	/* corrupt ccs */
> > > > +	for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
> > > > +		ccsmap[i] = i;
> > > > +	set_surf_object(&surf.src, ccs, sysmem, ccssize,
> > > > +			uc_mocs, DIRECT_ACCESS);
> > > > +	set_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
> > > > +			uc_mocs, INDIRECT_ACCESS);
> > > > +	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> > > > +	intel_ctx_xe_sync(ctx, true);
> > > > +
> > > > +	blt_copy_init(xe, &blt);
> > > > +	blt.color_depth = CD_32bit;
> > > > +	blt.print_bb = param.print_bb;
> > > > +	blt_set_copy_object(&blt.src, mid);
> > > > +	blt_set_copy_object(&blt.dst, dst);
> > > > +	blt_set_object_ext(&ext.src, mid->compression_type, mid->x2, mid->y2, SURFACE_TYPE_2D);
> > > > +	blt_set_object_ext(&ext.dst, 0, dst->x2, dst->y2, SURFACE_TYPE_2D);
> > > > +	bb2 = xe_bo_create_flags(xe, 0, bb_size, sysmem);
> > > > +	blt_set_batch(&blt.bb, bb2, bb_size, sysmem);
> > > > +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, &ext);
> > > > +	intel_ctx_xe_sync(ctx, true);
> > > > +	WRITE_PNG(xe, run_id, "corrupted", &blt.dst, dst->x2, dst->y2);
> > > > +	result = memcmp(src->ptr, dst->ptr, src->size);
> > > > +	igt_assert(result != 0);
> > > > +
> > > > +	/* retrieve back ccs */
> > > > +	memcpy(ccsmap, ccscopy, ccssize);
> > > > +	blt_ctrl_surf_copy(xe, ctx, NULL, ahnd, &surf);
> > > > +
> > > > +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, &ext);
> > > > +	intel_ctx_xe_sync(ctx, true);
> > > > +	WRITE_PNG(xe, run_id, "corrected", &blt.dst, dst->x2, dst->y2);
> > > > +	result = memcmp(src->ptr, dst->ptr, src->size);
> > > > +	if (result)
> > > > +		dump_corruption_info(src, dst);
> > > > +
> > > > +	munmap(ccsmap, ccssize);
> > > > +	gem_close(xe, ccs);
> > > > +	gem_close(xe, ccs2);
> > > > +	gem_close(xe, bb1);
> > > > +	gem_close(xe, bb2);
> > > > +
> > > > +	igt_assert_f(result == 0,
> > > > +		     "Source and destination surfaces are different after "
> > > > +		     "restoring source ccs data\n");
> > > > +}
> > > > +
> > > > +struct blt_copy3_data {
> > > > +	int xe;
> > > > +	struct blt_copy_object src;
> > > > +	struct blt_copy_object mid;
> > > > +	struct blt_copy_object dst;
> > > > +	struct blt_copy_object final;
> > > > +	struct blt_copy_batch bb;
> > > > +	enum blt_color_depth color_depth;
> > > > +
> > > > +	/* debug stuff */
> > > > +	bool print_bb;
> > > > +};
> > > > +
> > > > +struct blt_block_copy3_data_ext {
> > > > +	struct blt_block_copy_object_ext src;
> > > > +	struct blt_block_copy_object_ext mid;
> > > > +	struct blt_block_copy_object_ext dst;
> > > > +	struct blt_block_copy_object_ext final;
> > > > +};
> > > > +
> > > 
> > > Hmm, we really could make use of shared definitions like these (yes, I sound
> > > like a broken record at this point, sorry!)
> > > 
> > 
> > No, there's no plan to have blt3 in the common library.
> 
> Sorry, not sure why I became so obsessed with blt3 back then, heh
> 
> > 
> > > > +#define FILL_OBJ(_idx, _handle, _offset) do { \
> > > > +	obj[(_idx)].handle = (_handle); \
> > > > +	obj[(_idx)].offset = (_offset); \
> > > > +} while (0)
> > > 
> > > We don't use this definition in Xe tests, you can delete it
> > > 
> > 
> > Good catch.
> > 
> > > > +
> > > > +static int blt_block_copy3(int xe,
> > > > +			   const intel_ctx_t *ctx,
> > > > +			   uint64_t ahnd,
> > > > +			   const struct blt_copy3_data *blt3,
> > > > +			   const struct blt_block_copy3_data_ext *ext3)
> > > > +{
> > > > +	struct blt_copy_data blt0;
> > > > +	struct blt_block_copy_data_ext ext0;
> > > > +	uint64_t bb_offset, alignment;
> > > > +	uint64_t bb_pos = 0;
> > > > +	int ret = 0;
> > > > +
> > > > +	igt_assert_f(ahnd, "block-copy3 supports softpin only\n");
> > > > +	igt_assert_f(blt3, "block-copy3 requires data to do blit\n");
> > > > +
> > > > +	alignment = xe_get_default_alignment(xe);
> > > > +	get_offset(ahnd, blt3->src.handle, blt3->src.size, alignment);
> > > > +	get_offset(ahnd, blt3->mid.handle, blt3->mid.size, alignment);
> > > > +	get_offset(ahnd, blt3->dst.handle, blt3->dst.size, alignment);
> > > > +	get_offset(ahnd, blt3->final.handle, blt3->final.size, alignment);
> > > > +	bb_offset = get_offset(ahnd, blt3->bb.handle, blt3->bb.size, alignment);
> > > > +
> > > > +	/* First blit src -> mid */
> > > > +	blt_copy_init(xe, &blt0);
> > > > +	blt0.src = blt3->src;
> > > > +	blt0.dst = blt3->mid;
> > > > +	blt0.bb = blt3->bb;
> > > > +	blt0.color_depth = blt3->color_depth;
> > > > +	blt0.print_bb = blt3->print_bb;
> > > > +	ext0.src = ext3->src;
> > > > +	ext0.dst = ext3->mid;
> > > > +	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, false);
> > > > +
> > > > +	/* Second blit mid -> dst */
> > > > +	blt_copy_init(xe, &blt0);
> > > > +	blt0.src = blt3->mid;
> > > > +	blt0.dst = blt3->dst;
> > > > +	blt0.bb = blt3->bb;
> > > > +	blt0.color_depth = blt3->color_depth;
> > > > +	blt0.print_bb = blt3->print_bb;
> > > > +	ext0.src = ext3->mid;
> > > > +	ext0.dst = ext3->dst;
> > > > +	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, false);
> > > > +
> > > > +	/* Third blit dst -> final */
> > > > +	blt_copy_init(xe, &blt0);
> > > > +	blt0.src = blt3->dst;
> > > > +	blt0.dst = blt3->final;
> > > > +	blt0.bb = blt3->bb;
> > > > +	blt0.color_depth = blt3->color_depth;
> > > > +	blt0.print_bb = blt3->print_bb;
> > > > +	ext0.src = ext3->dst;
> > > > +	ext0.dst = ext3->final;
> > > > +	bb_pos = emit_blt_block_copy(xe, ahnd, &blt0, &ext0, bb_pos, true);
> > > > +
> > > > +	intel_ctx_xe_exec(ctx, ahnd, bb_offset);
> > > > +
> > > > +	return ret;
> > > > +}
> > > > +
> > > > +static void block_copy(int xe,
> > > > +		       intel_ctx_t *ctx,
> > > > +		       uint32_t region1, uint32_t region2,
> > > > +		       enum blt_tiling_type mid_tiling,
> > > > +		       const struct test_config *config)
> > > > +{
> > > > +	struct blt_copy_data blt = {};
> > > > +	struct blt_block_copy_data_ext ext = {}, *pext = &ext;
> > > > +	struct blt_copy_object *src, *mid, *dst;
> > > > +	const uint32_t bpp = 32;
> > > > +	uint64_t bb_size = xe_get_default_alignment(xe);
> > > > +	uint64_t ahnd = intel_allocator_open(xe, ctx->vm, INTEL_ALLOCATOR_RELOC);
> > > > +	uint32_t run_id = mid_tiling;
> > > > +	uint32_t mid_region = region2, bb;
> > > > +	uint32_t width = param.width, height = param.height;
> > > > +	enum blt_compression mid_compression = config->compression;
> > > > +	int mid_compression_format = param.compression_format;
> > > > +	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
> > > > +	uint8_t uc_mocs = intel_get_uc_mocs(xe);
> > > > +	int result;
> > > > +
> > > > +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
> > > > +
> > > > +	if (!blt_uses_extended_block_copy(xe))
> > > > +		pext = NULL;
> > > > +
> > > > +	blt_copy_init(xe, &blt);
> > > > +
> > > > +	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> > > > +				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> > > > +	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
> > > > +				mid_tiling, mid_compression, comp_type, true);
> > > > +	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> > > > +				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> > > > +	igt_assert(src->size == dst->size);
> > > > +	PRINT_SURFACE_INFO("src", src);
> > > > +	PRINT_SURFACE_INFO("mid", mid);
> > > > +	PRINT_SURFACE_INFO("dst", dst);
> > > > +
> > > > +	blt_surface_fill_rect(xe, src, width, height);
> > > > +	WRITE_PNG(xe, run_id, "src", src, width, height);
> > > > +
> > > > +	blt.color_depth = CD_32bit;
> > > > +	blt.print_bb = param.print_bb;
> > > > +	blt_set_copy_object(&blt.src, src);
> > > > +	blt_set_copy_object(&blt.dst, mid);
> > > > +	blt_set_object_ext(&ext.src, 0, width, height, SURFACE_TYPE_2D);
> > > > +	blt_set_object_ext(&ext.dst, mid_compression_format, width, height, SURFACE_TYPE_2D);
> > > > +	blt_set_batch(&blt.bb, bb, bb_size, region1);
> > > > +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, pext);
> > > > +	intel_ctx_xe_sync(ctx, true);
> > > > +
> > > > +	/* We expect mid != src if there's compression */
> > > > +	if (mid->compression)
> > > > +		igt_assert(memcmp(src->ptr, mid->ptr, src->size) != 0);
> > > > +
> > > > +	WRITE_PNG(xe, run_id, "src", &blt.src, width, height);
> > > 
> > > It's also in gem_ccs, why are we saving "src" surface twice?
> > > 
> > 
> > Agree, this is mistake. I'll remove this in both tests.
> 
> Awesome, thanks!
> 
> All the best,
> Karolina
> 
> > 
> > Thank you for the review.
> > --
> > Zbigniew
> > 
> > > Many thanks,
> > > Karolina
> > > 
> > > > +	WRITE_PNG(xe, run_id, "mid", &blt.dst, width, height);
> > > > +
> > > > +	if (config->surfcopy && pext) {
> > > > +		struct drm_xe_engine_class_instance inst = {
> > > > +			.engine_class = DRM_XE_ENGINE_CLASS_COPY,
> > > > +		};
> > > > +		intel_ctx_t *surf_ctx = ctx;
> > > > +		uint64_t surf_ahnd = ahnd;
> > > > +		uint32_t vm, engine;
> > > > +
> > > > +		if (config->new_ctx) {
> > > > +			vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
> > > > +			engine = xe_engine_create(xe, vm, &inst, 0);
> > > > +			surf_ctx = intel_ctx_xe(xe, vm, engine, 0, 0, 0);
> > > > +			surf_ahnd = intel_allocator_open(xe, surf_ctx->vm,
> > > > +							 INTEL_ALLOCATOR_RELOC);
> > > > +		}
> > > > +		surf_copy(xe, surf_ctx, surf_ahnd, src, mid, dst, run_id,
> > > > +			  config->suspend_resume);
> > > > +
> > > > +		if (surf_ctx != ctx) {
> > > > +			xe_engine_destroy(xe, engine);
> > > > +			xe_vm_destroy(xe, vm);
> > > > +			free(surf_ctx);
> > > > +			put_ahnd(surf_ahnd);
> > > > +		}
> > > > +	}
> > > > +
> > > > +	blt_copy_init(xe, &blt);
> > > > +	blt.color_depth = CD_32bit;
> > > > +	blt.print_bb = param.print_bb;
> > > > +	blt_set_copy_object(&blt.src, mid);
> > > > +	blt_set_copy_object(&blt.dst, dst);
> > > > +	blt_set_object_ext(&ext.src, mid_compression_format, width, height, SURFACE_TYPE_2D);
> > > > +	blt_set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
> > > > +	if (config->inplace) {
> > > > +		blt_set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
> > > > +			       T_LINEAR, COMPRESSION_DISABLED, comp_type);
> > > > +		blt.dst.ptr = mid->ptr;
> > > > +	}
> > > > +
> > > > +	blt_set_batch(&blt.bb, bb, bb_size, region1);
> > > > +	blt_block_copy(xe, ctx, NULL, ahnd, &blt, pext);
> > > > +	intel_ctx_xe_sync(ctx, true);
> > > > +
> > > > +	WRITE_PNG(xe, run_id, "dst", &blt.dst, width, height);
> > > > +
> > > > +	result = memcmp(src->ptr, blt.dst.ptr, src->size);
> > > > +
> > > > +	/* Politely clean vm */
> > > > +	put_offset(ahnd, src->handle);
> > > > +	put_offset(ahnd, mid->handle);
> > > > +	put_offset(ahnd, dst->handle);
> > > > +	put_offset(ahnd, bb);
> > > > +	intel_allocator_bind(ahnd, 0, 0);
> > > > +	blt_destroy_object(xe, src);
> > > > +	blt_destroy_object(xe, mid);
> > > > +	blt_destroy_object(xe, dst);
> > > > +	gem_close(xe, bb);
> > > > +	put_ahnd(ahnd);
> > > > +
> > > > +	igt_assert_f(!result, "source and destination surfaces differ!\n");
> > > > +}
> > > > +
> > > > +static void block_multicopy(int xe,
> > > > +			    intel_ctx_t *ctx,
> > > > +			    uint32_t region1, uint32_t region2,
> > > > +			    enum blt_tiling_type mid_tiling,
> > > > +			    const struct test_config *config)
> > > > +{
> > > > +	struct blt_copy3_data blt3 = {};
> > > > +	struct blt_copy_data blt = {};
> > > > +	struct blt_block_copy3_data_ext ext3 = {}, *pext3 = &ext3;
> > > > +	struct blt_copy_object *src, *mid, *dst, *final;
> > > > +	const uint32_t bpp = 32;
> > > > +	uint64_t bb_size = xe_get_default_alignment(xe);
> > > > +	uint64_t ahnd = intel_allocator_open(xe, ctx->vm, INTEL_ALLOCATOR_RELOC);
> > > > +	uint32_t run_id = mid_tiling;
> > > > +	uint32_t mid_region = region2, bb;
> > > > +	uint32_t width = param.width, height = param.height;
> > > > +	enum blt_compression mid_compression = config->compression;
> > > > +	int mid_compression_format = param.compression_format;
> > > > +	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
> > > > +	uint8_t uc_mocs = intel_get_uc_mocs(xe);
> > > > +	int result;
> > > > +
> > > > +	bb = xe_bo_create_flags(xe, 0, bb_size, region1);
> > > > +
> > > > +	if (!blt_uses_extended_block_copy(xe))
> > > > +		pext3 = NULL;
> > > > +
> > > > +	blt_copy_init(xe, &blt);
> > > > +
> > > > +	src = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> > > > +				T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> > > > +	mid = blt_create_object(&blt, mid_region, width, height, bpp, uc_mocs,
> > > > +				mid_tiling, mid_compression, comp_type, true);
> > > > +	dst = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> > > > +				mid_tiling, COMPRESSION_DISABLED, comp_type, true);
> > > > +	final = blt_create_object(&blt, region1, width, height, bpp, uc_mocs,
> > > > +				  T_LINEAR, COMPRESSION_DISABLED, comp_type, true);
> > > > +	igt_assert(src->size == dst->size);
> > > > +	PRINT_SURFACE_INFO("src", src);
> > > > +	PRINT_SURFACE_INFO("mid", mid);
> > > > +	PRINT_SURFACE_INFO("dst", dst);
> > > > +	PRINT_SURFACE_INFO("final", final);
> > > > +
> > > > +	blt_surface_fill_rect(xe, src, width, height);
> > > > +
> > > > +	blt3.color_depth = CD_32bit;
> > > > +	blt3.print_bb = param.print_bb;
> > > > +	blt_set_copy_object(&blt3.src, src);
> > > > +	blt_set_copy_object(&blt3.mid, mid);
> > > > +	blt_set_copy_object(&blt3.dst, dst);
> > > > +	blt_set_copy_object(&blt3.final, final);
> > > > +
> > > > +	if (config->inplace) {
> > > > +		blt_set_object(&blt3.dst, mid->handle, dst->size, mid->region,
> > > > +			       mid->mocs, mid_tiling, COMPRESSION_DISABLED,
> > > > +			       comp_type);
> > > > +		blt3.dst.ptr = mid->ptr;
> > > > +	}
> > > > +
> > > > +	blt_set_object_ext(&ext3.src, 0, width, height, SURFACE_TYPE_2D);
> > > > +	blt_set_object_ext(&ext3.mid, mid_compression_format, width, height, SURFACE_TYPE_2D);
> > > > +	blt_set_object_ext(&ext3.dst, 0, width, height, SURFACE_TYPE_2D);
> > > > +	blt_set_object_ext(&ext3.final, 0, width, height, SURFACE_TYPE_2D);
> > > > +	blt_set_batch(&blt3.bb, bb, bb_size, region1);
> > > > +
> > > > +	blt_block_copy3(xe, ctx, ahnd, &blt3, pext3);
> > > > +	intel_ctx_xe_sync(ctx, true);
> > > > +
> > > > +	WRITE_PNG(xe, run_id, "src", &blt3.src, width, height);
> > > > +	if (!config->inplace)
> > > > +		WRITE_PNG(xe, run_id, "mid", &blt3.mid, width, height);
> > > > +	WRITE_PNG(xe, run_id, "dst", &blt3.dst, width, height);
> > > > +	WRITE_PNG(xe, run_id, "final", &blt3.final, width, height);
> > > > +
> > > > +	result = memcmp(src->ptr, blt3.final.ptr, src->size);
> > > > +
> > > > +	put_offset(ahnd, src->handle);
> > > > +	put_offset(ahnd, mid->handle);
> > > > +	put_offset(ahnd, dst->handle);
> > > > +	put_offset(ahnd, final->handle);
> > > > +	put_offset(ahnd, bb);
> > > > +	intel_allocator_bind(ahnd, 0, 0);
> > > > +	blt_destroy_object(xe, src);
> > > > +	blt_destroy_object(xe, mid);
> > > > +	blt_destroy_object(xe, dst);
> > > > +	blt_destroy_object(xe, final);
> > > > +	gem_close(xe, bb);
> > > > +	put_ahnd(ahnd);
> > > > +
> > > > +	igt_assert_f(!result, "source and destination surfaces differ!\n");
> > > > +}
> > > > +
> > > > +enum copy_func {
> > > > +	BLOCK_COPY,
> > > > +	BLOCK_MULTICOPY,
> > > > +};
> > > > +
> > > > +static const struct {
> > > > +	const char *suffix;
> > > > +	void (*copyfn)(int fd,
> > > > +		       intel_ctx_t *ctx,
> > > > +		       uint32_t region1, uint32_t region2,
> > > > +		       enum blt_tiling_type btype,
> > > > +		       const struct test_config *config);
> > > > +} copyfns[] = {
> > > > +	[BLOCK_COPY] = { "", block_copy },
> > > > +	[BLOCK_MULTICOPY] = { "-multicopy", block_multicopy },
> > > > +};
> > > > +
> > > > +static void block_copy_test(int xe,
> > > > +			    const struct test_config *config,
> > > > +			    struct igt_collection *set,
> > > > +			    enum copy_func copy_function)
> > > > +{
> > > > +	struct drm_xe_engine_class_instance inst = {
> > > > +		.engine_class = DRM_XE_ENGINE_CLASS_COPY,
> > > > +	};
> > > > +	intel_ctx_t *ctx;
> > > > +	struct igt_collection *regions;
> > > > +	uint32_t vm, engine;
> > > > +	int tiling;
> > > > +
> > > > +	if (config->compression && !blt_block_copy_supports_compression(xe))
> > > > +		return;
> > > > +
> > > > +	if (config->inplace && !config->compression)
> > > > +		return;
> > > > +
> > > > +	for_each_tiling(tiling) {
> > > > +		if (!blt_block_copy_supports_tiling(xe, tiling) ||
> > > > +		    (param.tiling >= 0 && param.tiling != tiling))
> > > > +			continue;
> > > > +
> > > > +		for_each_variation_r(regions, 2, set) {
> > > > +			uint32_t region1, region2;
> > > > +			char *regtxt;
> > > > +
> > > > +			region1 = igt_collection_get_value(regions, 0);
> > > > +			region2 = igt_collection_get_value(regions, 1);
> > > > +
> > > > +			/* Compressed surface must be in device memory */
> > > > +			if (config->compression && !XE_IS_VRAM_MEMORY_REGION(xe, region2))
> > > > +				continue;
> > > > +
> > > > +			regtxt = xe_memregion_dynamic_subtest_name(xe, regions);
> > > > +
> > > > +			igt_dynamic_f("%s-%s-compfmt%d-%s%s",
> > > > +				      blt_tiling_name(tiling),
> > > > +				      config->compression ?
> > > > +					      "compressed" : "uncompressed",
> > > > +				      param.compression_format, regtxt,
> > > > +				      copyfns[copy_function].suffix) {
> > > > +				uint32_t sync_bind, sync_out;
> > > > +
> > > > +				vm = xe_vm_create(xe, DRM_XE_VM_CREATE_ASYNC_BIND_OPS, 0);
> > > > +				engine = xe_engine_create(xe, vm, &inst, 0);
> > > > +				sync_bind = syncobj_create(xe, 0);
> > > > +				sync_out = syncobj_create(xe, 0);
> > > > +				ctx = intel_ctx_xe(xe, vm, engine,
> > > > +						   0, sync_bind, sync_out);
> > > > +
> > > > +				copyfns[copy_function].copyfn(xe, ctx,
> > > > +							      region1, region2,
> > > > +							      tiling, config);
> > > > +
> > > > +				xe_engine_destroy(xe, engine);
> > > > +				xe_vm_destroy(xe, vm);
> > > > +				syncobj_destroy(xe, sync_bind);
> > > > +				syncobj_destroy(xe, sync_out);
> > > > +				free(ctx);
> > > > +			}
> > > > +
> > > > +			free(regtxt);
> > > > +		}
> > > > +	}
> > > > +}
> > > > +
> > > > +static int opt_handler(int opt, int opt_index, void *data)
> > > > +{
> > > > +	switch (opt) {
> > > > +	case 'b':
> > > > +		param.print_bb = true;
> > > > +		igt_debug("Print bb: %d\n", param.print_bb);
> > > > +		break;
> > > > +	case 'f':
> > > > +		param.compression_format = atoi(optarg);
> > > > +		igt_debug("Compression format: %d\n", param.compression_format);
> > > > +		igt_assert((param.compression_format & ~0x1f) == 0);
> > > > +		break;
> > > > +	case 'p':
> > > > +		param.write_png = true;
> > > > +		igt_debug("Write png: %d\n", param.write_png);
> > > > +		break;
> > > > +	case 's':
> > > > +		param.print_surface_info = true;
> > > > +		igt_debug("Print surface info: %d\n", param.print_surface_info);
> > > > +		break;
> > > > +	case 't':
> > > > +		param.tiling = atoi(optarg);
> > > > +		igt_debug("Tiling: %d\n", param.tiling);
> > > > +		break;
> > > > +	case 'W':
> > > > +		param.width = atoi(optarg);
> > > > +		igt_debug("Width: %d\n", param.width);
> > > > +		break;
> > > > +	case 'H':
> > > > +		param.height = atoi(optarg);
> > > > +		igt_debug("Height: %d\n", param.height);
> > > > +		break;
> > > > +	default:
> > > > +		return IGT_OPT_HANDLER_ERROR;
> > > > +	}
> > > > +
> > > > +	return IGT_OPT_HANDLER_SUCCESS;
> > > > +}
> > > > +
> > > > +const char *help_str =
> > > > +	"  -b\tPrint bb\n"
> > > > +	"  -f\tCompression format (0-31)\n"
> > > > +	"  -p\tWrite PNG\n"
> > > > +	"  -s\tPrint surface info\n"
> > > > +	"  -t\tTiling format (0 - linear, 1 - XMAJOR, 2 - YMAJOR, 3 - TILE4, 4 - TILE64)\n"
> > > > +	"  -W\tWidth (default 512)\n"
> > > > +	"  -H\tHeight (default 512)"
> > > > +	;
> > > > +
> > > > +igt_main_args("bf:pst:W:H:", NULL, help_str, opt_handler, NULL)
> > > > +{
> > > > +	struct igt_collection *set;
> > > > +	int xe;
> > > > +
> > > > +	igt_fixture {
> > > > +		xe = drm_open_driver(DRIVER_XE);
> > > > +		igt_require(blt_has_block_copy(xe));
> > > > +
> > > > +		xe_device_get(xe);
> > > > +
> > > > +		set = xe_get_memory_region_set(xe,
> > > > +					       XE_MEM_REGION_CLASS_SYSMEM,
> > > > +					       XE_MEM_REGION_CLASS_VRAM);
> > > > +	}
> > > > +
> > > > +	igt_describe("Check block-copy uncompressed blit");
> > > > +	igt_subtest_with_dynamic("block-copy-uncompressed") {
> > > > +		struct test_config config = {};
> > > > +
> > > > +		block_copy_test(xe, &config, set, BLOCK_COPY);
> > > > +	}
> > > > +
> > > > +	igt_describe("Check block-copy flatccs compressed blit");
> > > > +	igt_subtest_with_dynamic("block-copy-compressed") {
> > > > +		struct test_config config = { .compression = true };
> > > > +
> > > > +		block_copy_test(xe, &config, set, BLOCK_COPY);
> > > > +	}
> > > > +
> > > > +	igt_describe("Check block-multicopy flatccs compressed blit");
> > > > +	igt_subtest_with_dynamic("block-multicopy-compressed") {
> > > > +		struct test_config config = { .compression = true };
> > > > +
> > > > +		block_copy_test(xe, &config, set, BLOCK_MULTICOPY);
> > > > +	}
> > > > +
> > > > +	igt_describe("Check block-multicopy flatccs inplace decompression blit");
> > > > +	igt_subtest_with_dynamic("block-multicopy-inplace") {
> > > > +		struct test_config config = { .compression = true,
> > > > +					      .inplace = true };
> > > > +
> > > > +		block_copy_test(xe, &config, set, BLOCK_MULTICOPY);
> > > > +	}
> > > > +
> > > > +	igt_describe("Check flatccs data can be copied from/to surface");
> > > > +	igt_subtest_with_dynamic("ctrl-surf-copy") {
> > > > +		struct test_config config = { .compression = true,
> > > > +					      .surfcopy = true };
> > > > +
> > > > +		block_copy_test(xe, &config, set, BLOCK_COPY);
> > > > +	}
> > > > +
> > > > +	igt_describe("Check flatccs data are physically tagged and visible"
> > > > +		     " in different contexts");
> > > > +	igt_subtest_with_dynamic("ctrl-surf-copy-new-ctx") {
> > > > +		struct test_config config = { .compression = true,
> > > > +					      .surfcopy = true,
> > > > +					      .new_ctx = true };
> > > > +
> > > > +		block_copy_test(xe, &config, set, BLOCK_COPY);
> > > > +	}
> > > > +
> > > > +	igt_describe("Check flatccs data persists after suspend / resume (S0)");
> > > > +	igt_subtest_with_dynamic("suspend-resume") {
> > > > +		struct test_config config = { .compression = true,
> > > > +					      .surfcopy = true,
> > > > +					      .suspend_resume = true };
> > > > +
> > > > +		block_copy_test(xe, &config, set, BLOCK_COPY);
> > > > +	}
> > > > +
> > > > +	igt_fixture {
> > > > +		xe_device_put(xe);
> > > > +		close(xe);
> > > > +	}
> > > > +}

Thread overview: 46+ messages
2023-07-06  6:05 [igt-dev] [PATCH i-g-t v2 00/16] Extend intel_blt to work on Xe Zbigniew Kempczyński
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 01/16] tests/api_intel_allocator: Don't use allocator ahnd aliasing api Zbigniew Kempczyński
2023-07-06  9:04   ` Karolina Stolarek
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 02/16] lib/intel_allocator: Drop aliasing allocator handle api Zbigniew Kempczyński
2023-07-06  8:31   ` Karolina Stolarek
2023-07-06 11:20     ` Zbigniew Kempczyński
2023-07-06 13:28       ` Karolina Stolarek
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 03/16] lib/intel_allocator: Remove extensive debugging Zbigniew Kempczyński
2023-07-06  9:30   ` Karolina Stolarek
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 04/16] lib/xe_query: Use vramN when returning string region name Zbigniew Kempczyński
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 05/16] lib/xe_query: Add xe_region_class() helper Zbigniew Kempczyński
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 06/16] lib/drmtest: Add get_intel_driver() helper Zbigniew Kempczyński
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 07/16] lib/xe_util: Return dynamic subtest name for Xe Zbigniew Kempczyński
2023-07-06  9:37   ` Karolina Stolarek
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 08/16] lib/xe_util: Add vm bind/unbind helper " Zbigniew Kempczyński
2023-07-06 10:27   ` Karolina Stolarek
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 09/16] lib/intel_allocator: Add field to distinquish underlying driver Zbigniew Kempczyński
2023-07-06 10:34   ` Karolina Stolarek
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 10/16] lib/intel_allocator: Add intel_allocator_bind() Zbigniew Kempczyński
2023-07-06 13:02   ` Karolina Stolarek
2023-07-06 16:09     ` Zbigniew Kempczyński
2023-07-07  8:01       ` Karolina Stolarek
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 11/16] lib/intel_ctx: Add xe context information Zbigniew Kempczyński
2023-07-07  8:31   ` Karolina Stolarek
2023-07-11  9:06     ` Zbigniew Kempczyński
2023-07-11 10:38       ` Karolina Stolarek
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 12/16] lib/intel_blt: Introduce blt_copy_init() helper to cache driver Zbigniew Kempczyński
2023-07-07  8:51   ` Karolina Stolarek
2023-07-11  9:23     ` Zbigniew Kempczyński
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 13/16] lib/intel_blt: Extend blitter library to support xe driver Zbigniew Kempczyński
2023-07-07  9:26   ` Karolina Stolarek
2023-07-11 10:16     ` Zbigniew Kempczyński
2023-07-11 10:41       ` Karolina Stolarek
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 14/16] tests/xe_ccs: Check if flatccs is working with block-copy for Xe Zbigniew Kempczyński
2023-07-07 10:05   ` Karolina Stolarek
2023-07-11 10:45     ` Zbigniew Kempczyński
2023-07-11 10:51       ` Karolina Stolarek
2023-07-12  7:00         ` Zbigniew Kempczyński
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 15/16] tests/xe_exercise_blt: Check blitter library fast-copy " Zbigniew Kempczyński
2023-07-07 11:10   ` Karolina Stolarek
2023-07-11 11:07     ` Zbigniew Kempczyński
2023-07-11 11:15       ` Karolina Stolarek
2023-07-06  6:05 ` [igt-dev] [PATCH i-g-t v2 16/16] tests/api-intel-allocator: Adopt to exercise allocator to Xe Zbigniew Kempczyński
2023-07-07 10:11   ` Karolina Stolarek
2023-07-06  6:58 ` [igt-dev] ✓ Fi.CI.BAT: success for Extend intel_blt to work on Xe (rev2) Patchwork
2023-07-06  9:26 ` [igt-dev] ✓ Fi.CI.IGT: " Patchwork
