* [igt-dev] [PATCH i-g-t 0/3] Add i915 blt library + gem_ccs test
@ 2022-02-16 12:21 Zbigniew Kempczyński
  2022-02-16 12:21 ` [igt-dev] [PATCH i-g-t 1/3] i915/gem_engine_topology: Only use the main copy engines for XY_BLOCK_COPY Zbigniew Kempczyński
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Zbigniew Kempczyński @ 2022-02-16 12:21 UTC
  To: igt-dev

Enable DG2+ testing of flatccs.

The idea for the library comes from Ayaz; the implementation differs
because of issues discovered while testing different tilings.

For now the default MOCS (0) is used to avoid kernel differences.
This may change to UC/WB once the kernel interface is stabilized
and merged.

v3: rebased on top of local DG2 pciids

Cc: Apoorva Singh <apoorva1.singh@intel.com>
Cc: Ayaz A Siddiqui <ayaz.siddiqui@intel.com>

Chris Wilson (1):
  i915/gem_engine_topology: Only use the main copy engines for
    XY_BLOCK_COPY

Zbigniew Kempczyński (2):
  lib/i915_blt: Add library for blitter
  tests/gem_ccs: Verify uncompressed and compressed blits

 .../igt-gpu-tools/igt-gpu-tools-docs.xml      |    1 +
 lib/i915/gem_engine_topology.c                |   38 +
 lib/i915/gem_engine_topology.h                |    5 +
 lib/i915/i915_blt.c                           | 1087 +++++++++++++++++
 lib/i915/i915_blt.h                           |  196 +++
 lib/meson.build                               |    1 +
 tests/i915/gem_ccs.c                          |  485 ++++++++
 tests/meson.build                             |    1 +
 8 files changed, 1814 insertions(+)
 create mode 100644 lib/i915/i915_blt.c
 create mode 100644 lib/i915/i915_blt.h
 create mode 100644 tests/i915/gem_ccs.c

-- 
2.32.0


* [igt-dev] [PATCH i-g-t 1/3] i915/gem_engine_topology: Only use the main copy engines for XY_BLOCK_COPY
  2022-02-16 12:21 [igt-dev] [PATCH i-g-t 0/3] Add i915 blt library + gem_ccs test Zbigniew Kempczyński
@ 2022-02-16 12:21 ` Zbigniew Kempczyński
  2022-02-16 12:21 ` [igt-dev] [PATCH i-g-t 2/3] lib/i915_blt: Add library for blitter Zbigniew Kempczyński
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Zbigniew Kempczyński @ 2022-02-16 12:21 UTC
  To: igt-dev

From: Chris Wilson <chris.p.wilson@intel.com>

The XY_BLOCK_COPY blt command is used to transfer the CCS data,
so the tests can only run on engines that support the
"block_copy" capability.
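
The selection rule can be sketched as below. This is only an
illustration: the helper names and the idea of passing the attribute
contents as plain strings are assumptions, not the IGT API. An engine
qualifies if it is a copy engine and either the kernel advertises
"block_copy" for it, or the kernel does not know the capability at all
and the device is gen12 or newer.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if the attribute text mentions the capability. */
bool caps_mention(const char *attr_text, const char *cap)
{
	assert(attr_text && cap);
	return strstr(attr_text, cap) != NULL;
}

/*
 * Hypothetical stand-in for the selection logic. known_caps and caps model
 * the contents of the engine's "known_capabilities" and "capabilities"
 * sysfs attributes; gen models the device generation.
 */
bool can_block_copy(bool is_copy_engine, const char *known_caps,
		    const char *caps, int gen)
{
	if (!is_copy_engine)
		return false;

	/* Kernel does not report the capability: fall back to the generation. */
	if (!caps_mention(known_caps, "block_copy"))
		return gen >= 12;

	return caps_mention(caps, "block_copy");
}
```

With an older kernel (empty "known_capabilities") the result depends
only on the generation; with a newer kernel the per-engine
"capabilities" attribute decides.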

Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Apoorva Singh <apoorva1.singh@intel.com>
Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Melkaveri, Arjun <arjun.melkaveri@intel.com>
---
 lib/i915/gem_engine_topology.c | 38 ++++++++++++++++++++++++++++++++++
 lib/i915/gem_engine_topology.h |  5 +++++
 2 files changed, 43 insertions(+)

diff --git a/lib/i915/gem_engine_topology.c b/lib/i915/gem_engine_topology.c
index bd12d0bc9..18fc00ab0 100644
--- a/lib/i915/gem_engine_topology.c
+++ b/lib/i915/gem_engine_topology.c
@@ -534,6 +534,44 @@ void gem_engine_properties_restore(int fd, const struct gem_engine_properties *s
 	}
 }
 
+static bool
+__gem_engine_has_capability(int i915, const char *engine,
+			    const char *attr, const char *cap)
+{
+	char buf[4096] = {};
+	FILE *file;
+
+	file = __open_attr(igt_sysfs_open(i915), "r",
+			   "engine", engine, attr, NULL);
+	if (file) {
+		fread(buf, 1, sizeof(buf) - 1, file);
+		fclose(file);
+	}
+
+	return strstr(buf, cap);
+}
+
+bool gem_engine_has_capability(int i915, const char *engine, const char *cap)
+{
+	return __gem_engine_has_capability(i915, engine, "capabilities", cap);
+}
+
+bool gem_engine_has_known_capability(int i915, const char *engine, const char *cap)
+{
+	return __gem_engine_has_capability(i915, engine, "known_capabilities", cap);
+}
+
+bool gem_engine_can_block_copy(int i915, const struct intel_execution_engine2 *engine)
+{
+	if (engine->class != I915_ENGINE_CLASS_COPY)
+		return false;
+
+	if (!gem_engine_has_known_capability(i915, engine->name, "block_copy"))
+		return intel_gen(intel_get_drm_devid(i915)) >= 12;
+
+	return gem_engine_has_capability(i915, engine->name, "block_copy");
+}
+
 uint32_t gem_engine_mmio_base(int i915, const char *engine)
 {
 	unsigned int mmio = 0;
diff --git a/lib/i915/gem_engine_topology.h b/lib/i915/gem_engine_topology.h
index b413aa8ab..987f2bf94 100644
--- a/lib/i915/gem_engine_topology.h
+++ b/lib/i915/gem_engine_topology.h
@@ -133,6 +133,11 @@ int gem_engine_property_printf(int i915, const char *engine, const char *attr,
 
 uint32_t gem_engine_mmio_base(int i915, const char *engine);
 
+bool gem_engine_has_capability(int i915, const char *engine, const char *cap);
+bool gem_engine_has_known_capability(int i915, const char *engine, const char *cap);
+
+bool gem_engine_can_block_copy(int i915, const struct intel_execution_engine2 *engine);
+
 void dyn_sysfs_engines(int i915, int engines, const char *file,
 		       void (*test)(int i915, int engine));
 
-- 
2.32.0


* [igt-dev] [PATCH i-g-t 2/3] lib/i915_blt: Add library for blitter
  2022-02-16 12:21 [igt-dev] [PATCH i-g-t 0/3] Add i915 blt library + gem_ccs test Zbigniew Kempczyński
  2022-02-16 12:21 ` [igt-dev] [PATCH i-g-t 1/3] i915/gem_engine_topology: Only use the main copy engines for XY_BLOCK_COPY Zbigniew Kempczyński
@ 2022-02-16 12:21 ` Zbigniew Kempczyński
  2022-02-16 12:21 ` [igt-dev] [PATCH i-g-t 3/3] tests/gem_ccs: Verify uncompressed and compressed blits Zbigniew Kempczyński
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Zbigniew Kempczyński @ 2022-02-16 12:21 UTC
  To: igt-dev

Blitter commands have become complicated, so manual bit shifting is
error-prone and hard to debug. XY_BLOCK_COPY_BLT is the best example:
in its extended version (for DG2+) it takes 20 dwords of command data.
To avoid mistakes and functions with dozens of arguments, the library
takes its input data in a more structured form.
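
To illustrate the difference, here is a minimal sketch that builds the
first dword of XY_BLOCK_COPY_BLT twice: once with manual shifts and once
with a bitfield struct. The field positions, client (0x2) and opcode
(0x41) are taken from this patch; the function names are hypothetical.
Bitfield layout is implementation-defined in C, so the comparison
assumes the usual GCC/Clang little-endian layout where the first
declared field occupies the lowest bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BITRANGE(start, end) ((end) - (start) + 1)

/* First dword of XY_BLOCK_COPY_BLT, field positions as in this patch. */
struct dw00 {
	uint32_t length:	BITRANGE(0, 7);
	uint32_t rsvd1:		BITRANGE(8, 8);
	uint32_t multisamples:	BITRANGE(9, 11);
	uint32_t special_mode:	BITRANGE(12, 13);
	uint32_t rsvd0:		BITRANGE(14, 18);
	uint32_t color_depth:	BITRANGE(19, 21);
	uint32_t opcode:	BITRANGE(22, 28);
	uint32_t client:	BITRANGE(29, 31);
};

_Static_assert(sizeof(struct dw00) == sizeof(uint32_t),
	       "dw00 must occupy exactly one dword");

/* Manual encoding: every shift amount has to be kept right by hand. */
uint32_t dw00_by_shift(uint32_t color_depth, uint32_t special_mode,
		       uint32_t length)
{
	assert(length <= 0xff && color_depth <= 0x7 && special_mode <= 0x3);
	return 0x2u << 29 | 0x41u << 22 | color_depth << 19 |
	       special_mode << 12 | length;
}

/* Structured encoding: named fields, the compiler does the packing. */
uint32_t dw00_by_bitfield(uint32_t color_depth, uint32_t special_mode,
			  uint32_t length)
{
	struct dw00 d = {0};
	uint32_t raw;

	d.client = 0x2;
	d.opcode = 0x41;
	d.color_depth = color_depth;
	d.special_mode = special_mode;
	d.length = length;

	memcpy(&raw, &d, sizeof(raw));
	return raw;
}
```

Both functions are expected to return the same dword on such ABIs; the
bitfield version simply moves the bookkeeping from the caller to the
struct definition.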

Currently supported commands:
- XY_BLOCK_COPY_BLT:
  a)  TGL/DG1 uses shorter version of command which doesn't support
      compression
  b)  DG2+ command is extended and supports compression
- XY_CTRL_SURF_COPY_BLT
- XY_FAST_COPY_BLT

Source, destination and batchbuffer are passed to the blitter functions
as objects (structs). This improves readability and allows the same
object to be used in many functions. The only drawback of this approach
is that some fields used by one function may be ignored by another. An
example is blt_copy_object, which contains a lot of information about a
gem object: block-copy uses all of it, whereas fast-copy uses only part
of it (fast-copy doesn't support compression).
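
The trade-off can be sketched with a reduced, hypothetical object (this
is not the real blt_copy_object, and the function names are made up for
illustration): one description is handed to two consumers, and the
fast-copy style consumer silently ignores the compression field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced object description; not the real blt_copy_object. */
struct copy_object {
	uint32_t handle;
	uint32_t pitch;
	bool compression;
};

/* What a command actually took from the object. */
struct consumed {
	uint32_t pitch;
	bool compression;
};

/* Block-copy style consumer: uses every field, including compression. */
struct consumed block_copy_consume(const struct copy_object *obj)
{
	assert(obj);
	return (struct consumed){ .pitch = obj->pitch,
				  .compression = obj->compression };
}

/* Fast-copy style consumer: no compression support, so the field is ignored. */
struct consumed fast_copy_consume(const struct copy_object *obj)
{
	assert(obj);
	return (struct consumed){ .pitch = obj->pitch,
				  .compression = false };
}
```

A caller can therefore reuse one object for both commands, but has to
remember that setting compression has no effect on the fast-copy path.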

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 .../igt-gpu-tools/igt-gpu-tools-docs.xml      |    1 +
 lib/i915/i915_blt.c                           | 1087 +++++++++++++++++
 lib/i915/i915_blt.h                           |  196 +++
 lib/meson.build                               |    1 +
 4 files changed, 1285 insertions(+)
 create mode 100644 lib/i915/i915_blt.c
 create mode 100644 lib/i915/i915_blt.h

diff --git a/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml b/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
index 0dc5a0b7e..3a2edbae1 100644
--- a/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
+++ b/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
@@ -16,6 +16,7 @@
   <chapter>
     <title>API Reference</title>
     <xi:include href="xml/drmtest.xml"/>
+    <xi:include href="xml/i915_blt.xml"/>
     <xi:include href="xml/igt_alsa.xml"/>
     <xi:include href="xml/igt_audio.xml"/>
     <xi:include href="xml/igt_aux.xml"/>
diff --git a/lib/i915/i915_blt.c b/lib/i915/i915_blt.c
new file mode 100644
index 000000000..c6f115009
--- /dev/null
+++ b/lib/i915/i915_blt.c
@@ -0,0 +1,1087 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include <errno.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <malloc.h>
+#include <cairo.h>
+#include "drm.h"
+#include "igt.h"
+#include "gem_create.h"
+#include "i915_blt.h"
+
+#define BITRANGE(start, end) ((end) - (start) + 1)
+
+enum blt_special_mode {
+	SM_NONE,
+	SM_FULL_RESOLVE,
+	SM_PARTIAL_RESOLVE,
+	SM_RESERVED,
+};
+
+enum blt_aux_mode {
+	AM_AUX_NONE,
+	AM_AUX_CCS_E = 5,
+};
+
+enum blt_target_mem {
+	TM_LOCAL_MEM,
+	TM_SYSTEM_MEM,
+};
+
+struct gen12_block_copy_data {
+	struct {
+		uint32_t length:			BITRANGE(0, 7);
+		uint32_t rsvd1:				BITRANGE(8, 8);
+		uint32_t multisamples:			BITRANGE(9, 11);
+		uint32_t special_mode:			BITRANGE(12, 13);
+		uint32_t rsvd0:				BITRANGE(14, 18);
+		uint32_t color_depth:			BITRANGE(19, 21);
+		uint32_t opcode:			BITRANGE(22, 28);
+		uint32_t client:			BITRANGE(29, 31);
+	} dw00;
+
+	struct {
+		uint32_t dst_pitch:			BITRANGE(0, 17);
+		uint32_t dst_aux_mode:			BITRANGE(18, 20);
+		uint32_t dst_mocs:			BITRANGE(21, 27);
+		uint32_t dst_ctrl_surface_type:		BITRANGE(28, 28);
+		uint32_t dst_compression:		BITRANGE(29, 29);
+		uint32_t dst_tiling:			BITRANGE(30, 31);
+	} dw01;
+
+	struct {
+		int32_t dst_x1:				BITRANGE(0, 15);
+		int32_t dst_y1:				BITRANGE(16, 31);
+	} dw02;
+
+	struct {
+		int32_t dst_x2:				BITRANGE(0, 15);
+		int32_t dst_y2:				BITRANGE(16, 31);
+	} dw03;
+
+	struct {
+		uint32_t dst_address_lo;
+	} dw04;
+
+	struct {
+		uint32_t dst_address_hi;
+	} dw05;
+
+	struct {
+		uint32_t dst_x_offset:			BITRANGE(0, 13);
+		uint32_t rsvd1:				BITRANGE(14, 15);
+		uint32_t dst_y_offset:			BITRANGE(16, 29);
+		uint32_t rsvd0:				BITRANGE(30, 30);
+		uint32_t dst_target_memory:		BITRANGE(31, 31);
+	} dw06;
+
+	struct {
+		int32_t src_x1:				BITRANGE(0, 15);
+		int32_t src_y1:				BITRANGE(16, 31);
+	} dw07;
+
+	struct {
+		uint32_t src_pitch:			BITRANGE(0, 17);
+		uint32_t src_aux_mode:			BITRANGE(18, 20);
+		uint32_t src_mocs:			BITRANGE(21, 27);
+		uint32_t src_ctrl_surface_type:		BITRANGE(28, 28);
+		uint32_t src_compression:		BITRANGE(29, 29);
+		uint32_t src_tiling:			BITRANGE(30, 31);
+	} dw08;
+
+	struct {
+		uint32_t src_address_lo;
+	} dw09;
+
+	struct {
+		uint32_t src_address_hi;
+	} dw10;
+
+	struct {
+		uint32_t src_x_offset:			BITRANGE(0, 13);
+		uint32_t rsvd1:				BITRANGE(14, 15);
+		uint32_t src_y_offset:			BITRANGE(16, 29);
+		uint32_t rsvd0:				BITRANGE(30, 30);
+		uint32_t src_target_memory:		BITRANGE(31, 31);
+	} dw11;
+};
+
+struct gen12_block_copy_data_ext {
+	struct {
+		uint32_t src_compression_format:	BITRANGE(0, 4);
+		uint32_t src_clear_value_enable:	BITRANGE(5, 5);
+		uint32_t src_clear_address_low:		BITRANGE(6, 31);
+	} dw12;
+
+	union {
+		/* DG2, XEHP */
+		uint32_t src_clear_address_hi0;
+		/* Others */
+		uint32_t src_clear_address_hi1;
+	} dw13;
+
+	struct {
+		uint32_t dst_compression_format:	BITRANGE(0, 4);
+		uint32_t dst_clear_value_enable:	BITRANGE(5, 5);
+		uint32_t dst_clear_address_low:		BITRANGE(6, 31);
+	} dw14;
+
+	union {
+		/* DG2, XEHP */
+		uint32_t dst_clear_address_hi0;
+		/* Others */
+		uint32_t dst_clear_address_hi1;
+	} dw15;
+
+	struct {
+		uint32_t dst_surface_height:		BITRANGE(0, 13);
+		uint32_t dst_surface_width:		BITRANGE(14, 27);
+		uint32_t rsvd0:				BITRANGE(28, 28);
+		uint32_t dst_surface_type:		BITRANGE(29, 31);
+	} dw16;
+
+	struct {
+		uint32_t dst_lod:			BITRANGE(0, 3);
+		uint32_t dst_surface_qpitch:		BITRANGE(4, 18);
+		uint32_t rsvd0:				BITRANGE(19, 20);
+		uint32_t dst_surface_depth:		BITRANGE(21, 31);
+	} dw17;
+
+	struct {
+		uint32_t dst_horizontal_align:		BITRANGE(0, 1);
+		uint32_t rsvd0:				BITRANGE(2, 2);
+		uint32_t dst_vertical_align:		BITRANGE(3, 4);
+		uint32_t rsvd1:				BITRANGE(5, 7);
+		uint32_t dst_mip_tail_start_lod:	BITRANGE(8, 11);
+		uint32_t rsvd2:				BITRANGE(12, 17);
+		uint32_t dst_depth_stencil_resource:	BITRANGE(18, 18);
+		uint32_t rsvd3:				BITRANGE(19, 20);
+		uint32_t dst_array_index:		BITRANGE(21, 31);
+	} dw18;
+
+	struct {
+		uint32_t src_surface_height:		BITRANGE(0, 13);
+		uint32_t src_surface_width:		BITRANGE(14, 27);
+		uint32_t rsvd0:				BITRANGE(28, 28);
+		uint32_t src_surface_type:		BITRANGE(29, 31);
+	} dw19;
+
+	struct {
+		uint32_t src_lod:			BITRANGE(0, 3);
+		uint32_t src_surface_qpitch:		BITRANGE(4, 18);
+		uint32_t rsvd0:				BITRANGE(19, 20);
+		uint32_t src_surface_depth:		BITRANGE(21, 31);
+	} dw20;
+
+	struct {
+		uint32_t src_horizontal_align:		BITRANGE(0, 1);
+		uint32_t rsvd0:				BITRANGE(2, 2);
+		uint32_t src_vertical_align:		BITRANGE(3, 4);
+		uint32_t rsvd1:				BITRANGE(5, 7);
+		uint32_t src_mip_tail_start_lod:	BITRANGE(8, 11);
+		uint32_t rsvd2:				BITRANGE(12, 17);
+		uint32_t src_depth_stencil_resource:	BITRANGE(18, 18);
+		uint32_t rsvd3:				BITRANGE(19, 20);
+		uint32_t src_array_index:		BITRANGE(21, 31);
+	} dw21;
+};
+
+/**
+ * blt_supports_compression:
+ * @i915: drm fd
+ *
+ * Function returns whether the HW supports flatccs compression in blitter
+ * commands on @i915 device.
+ *
+ * Returns:
+ * true if it does, false otherwise.
+ */
+bool blt_supports_compression(int i915)
+{
+	uint32_t devid = intel_get_drm_devid(i915);
+
+	return HAS_FLATCCS(devid);
+}
+
+/**
+ * blt_supports_tiling:
+ * @i915: drm fd
+ * @tiling: tiling id
+ *
+ * Function returns whether the blitter supports @tiling on @i915 device.
+ *
+ * Returns:
+ * true if it does, false otherwise.
+ */
+bool blt_supports_tiling(int i915, enum blt_tiling tiling)
+{
+	uint32_t devid = intel_get_drm_devid(i915);
+
+	if (tiling == T_XMAJOR) {
+		if (IS_TIGERLAKE(devid) || IS_DG1(devid))
+			return false;
+		else
+			return true;
+	}
+
+	if (tiling == T_YMAJOR) {
+		if (IS_TIGERLAKE(devid) || IS_DG1(devid))
+			return true;
+		else
+			return false;
+	}
+
+	return true;
+}
+
+/**
+ * blt_tiling_name:
+ * @tiling: tiling id
+ *
+ * Returns:
+ * name of @tiling passed. Useful for building test names.
+ */
+const char *blt_tiling_name(enum blt_tiling tiling)
+{
+	switch (tiling) {
+	case T_LINEAR: return "linear";
+	case T_XMAJOR: return "xmajor";
+	case T_YMAJOR: return "ymajor";
+	case T_TILE4:  return "tile4";
+	case T_TILE64: return "tile64";
+	}
+
+	igt_warn("invalid tiling passed: %d\n", tiling);
+	return NULL;
+}
+
+static int __block_tiling(enum blt_tiling tiling)
+{
+	switch (tiling) {
+	case T_LINEAR: return 0;
+	case T_XMAJOR: return 1;
+	case T_YMAJOR: return 1;
+	case T_TILE4:  return 2;
+	case T_TILE64: return 3;
+	}
+
+	igt_warn("invalid tiling passed: %d\n", tiling);
+	return 0;
+}
+
+static int __special_mode(const struct blt_copy_data *blt)
+{
+	if (blt->src.handle == blt->dst.handle &&
+	    blt->src.compression && !blt->dst.compression)
+		return SM_FULL_RESOLVE;
+
+
+	return SM_NONE;
+}
+
+static int __memory_type(uint32_t region)
+{
+	igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
+		     IS_SYSTEM_MEMORY_REGION(region),
+		     "Invalid region: %x\n", region);
+
+	if (IS_DEVICE_MEMORY_REGION(region))
+		return TM_LOCAL_MEM;
+	return TM_SYSTEM_MEM;
+}
+
+static enum blt_aux_mode __aux_mode(const struct blt_copy_object *obj)
+{
+	if (obj->compression == COMPRESSION_ENABLED) {
+		igt_assert_f(IS_DEVICE_MEMORY_REGION(obj->region),
+			     "XY_BLOCK_COPY_BLT supports compression "
+			     "on device memory only\n");
+		return AM_AUX_CCS_E;
+	}
+
+	return AM_AUX_NONE;
+}
+
+static void fill_data(struct gen12_block_copy_data *data,
+		      const struct blt_copy_data *blt,
+		      uint64_t src_offset, uint64_t dst_offset,
+		      bool extended_command)
+{
+	data->dw00.client = 0x2;
+	data->dw00.opcode = 0x41;
+	data->dw00.color_depth = blt->color_depth;
+	data->dw00.special_mode = __special_mode(blt);
+	data->dw00.length = extended_command ? 20 : 10;
+
+	data->dw01.dst_pitch = blt->dst.pitch - 1;
+	data->dw01.dst_aux_mode = __aux_mode(&blt->dst);
+	data->dw01.dst_mocs = blt->dst.mocs;
+	data->dw01.dst_compression = blt->dst.compression;
+	data->dw01.dst_tiling = __block_tiling(blt->dst.tiling);
+
+	if (blt->dst.compression)
+		data->dw01.dst_ctrl_surface_type = blt->dst.compression_type;
+
+	data->dw02.dst_x1 = blt->dst.x1;
+	data->dw02.dst_y1 = blt->dst.y1;
+
+	data->dw03.dst_x2 = blt->dst.x2;
+	data->dw03.dst_y2 = blt->dst.y2;
+
+	data->dw04.dst_address_lo = dst_offset;
+	data->dw05.dst_address_hi = dst_offset >> 32;
+
+	data->dw06.dst_x_offset = blt->dst.x_offset;
+	data->dw06.dst_y_offset = blt->dst.y_offset;
+	data->dw06.dst_target_memory = __memory_type(blt->dst.region);
+
+	data->dw07.src_x1 = blt->src.x1;
+	data->dw07.src_y1 = blt->src.y1;
+
+	data->dw08.src_pitch = blt->src.pitch - 1;
+	data->dw08.src_aux_mode = __aux_mode(&blt->src);
+	data->dw08.src_mocs = blt->src.mocs;
+	data->dw08.src_compression = blt->src.compression;
+	data->dw08.src_tiling = __block_tiling(blt->src.tiling);
+
+	if (blt->src.compression)
+		data->dw08.src_ctrl_surface_type = blt->src.compression_type;
+
+	data->dw09.src_address_lo = src_offset;
+	data->dw10.src_address_hi = src_offset >> 32;
+
+	data->dw11.src_x_offset = blt->src.x_offset;
+	data->dw11.src_y_offset = blt->src.y_offset;
+	data->dw11.src_target_memory = __memory_type(blt->src.region);
+}
+
+static void fill_data_ext(int i915,
+			  struct gen12_block_copy_data_ext *dext,
+			  const struct blt_block_copy_data_ext *ext)
+{
+	dext->dw12.src_compression_format = ext->src.compression_format;
+	dext->dw12.src_clear_value_enable = ext->src.clear_value_enable;
+	dext->dw12.src_clear_address_low = ext->src.clear_address;
+
+	dext->dw13.src_clear_address_hi0 = ext->src.clear_address >> 32;
+
+	dext->dw14.dst_compression_format = ext->dst.compression_format;
+	dext->dw14.dst_clear_value_enable = ext->dst.clear_value_enable;
+	dext->dw14.dst_clear_address_low = ext->dst.clear_address;
+
+	dext->dw15.dst_clear_address_hi0 = ext->dst.clear_address >> 32;
+
+	dext->dw16.dst_surface_width = ext->dst.surface_width - 1;
+	dext->dw16.dst_surface_height = ext->dst.surface_height - 1;
+	dext->dw16.dst_surface_type = ext->dst.surface_type;
+
+	dext->dw17.dst_lod = ext->dst.lod;
+	dext->dw17.dst_surface_depth = ext->dst.surface_depth;
+	dext->dw17.dst_surface_qpitch = ext->dst.surface_qpitch;
+
+	dext->dw18.dst_horizontal_align = ext->dst.horizontal_align;
+	dext->dw18.dst_vertical_align = ext->dst.vertical_align;
+	dext->dw18.dst_mip_tail_start_lod = ext->dst.mip_tail_start_lod;
+	dext->dw18.dst_depth_stencil_resource = ext->dst.depth_stencil_resource;
+	dext->dw18.dst_array_index = ext->dst.array_index;
+
+	dext->dw19.src_surface_width = ext->src.surface_width - 1;
+	dext->dw19.src_surface_height = ext->src.surface_height - 1;
+
+	dext->dw19.src_surface_type = ext->src.surface_type;
+
+	dext->dw20.src_lod = ext->src.lod;
+	dext->dw20.src_surface_depth = ext->src.surface_depth;
+	dext->dw20.src_surface_qpitch = ext->src.surface_qpitch;
+
+	dext->dw21.src_horizontal_align = ext->src.horizontal_align;
+	dext->dw21.src_vertical_align = ext->src.vertical_align;
+	dext->dw21.src_mip_tail_start_lod = ext->src.mip_tail_start_lod;
+	dext->dw21.src_depth_stencil_resource = ext->src.depth_stencil_resource;
+	dext->dw21.src_array_index = ext->src.array_index;
+}
+
+static void dump_bb_cmd(struct gen12_block_copy_data *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("details:\n");
+	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, color depth: %d, "
+		 "special mode: %d, length: %d>\n",
+		 cmd[0],
+		 data->dw00.client, data->dw00.opcode, data->dw00.color_depth,
+		 data->dw00.special_mode, data->dw00.length);
+	igt_info(" dw01: [%08x] dst <pitch: %d, aux: %d, mocs: %d, compr: %d, "
+		 "tiling: %d, ctrl surf type: %d>\n",
+		 cmd[1], data->dw01.dst_pitch, data->dw01.dst_aux_mode,
+		 data->dw01.dst_mocs, data->dw01.dst_compression,
+		 data->dw01.dst_tiling, data->dw01.dst_ctrl_surface_type);
+	igt_info(" dw02: [%08x] dst geom <x1: %d, y1: %d>\n",
+		 cmd[2], data->dw02.dst_x1, data->dw02.dst_y1);
+	igt_info(" dw03: [%08x]          <x2: %d, y2: %d>\n",
+		 cmd[3], data->dw03.dst_x2, data->dw03.dst_y2);
+	igt_info(" dw04: [%08x] dst offset lo (0x%x)\n",
+		 cmd[4], data->dw04.dst_address_lo);
+	igt_info(" dw05: [%08x] dst offset hi (0x%x)\n",
+		 cmd[5], data->dw05.dst_address_hi);
+	igt_info(" dw06: [%08x] dst <x offset: 0x%x, y offset: 0x%0x, target mem: %d>\n",
+		 cmd[6], data->dw06.dst_x_offset, data->dw06.dst_y_offset,
+		 data->dw06.dst_target_memory);
+	igt_info(" dw07: [%08x] src geom <x1: %d, y1: %d>\n",
+		 cmd[7], data->dw07.src_x1, data->dw07.src_y1);
+	igt_info(" dw08: [%08x] src <pitch: %d, aux: %d, mocs: %d, compr: %d, "
+		 "tiling: %d, ctrl surf type: %d>\n",
+		 cmd[8], data->dw08.src_pitch, data->dw08.src_aux_mode,
+		 data->dw08.src_mocs, data->dw08.src_compression,
+		 data->dw08.src_tiling, data->dw08.src_ctrl_surface_type);
+	igt_info(" dw09: [%08x] src offset lo (0x%x)\n",
+		 cmd[9], data->dw09.src_address_lo);
+	igt_info(" dw10: [%08x] src offset hi (0x%x)\n",
+		 cmd[10], data->dw10.src_address_hi);
+	igt_info(" dw11: [%08x] src <x offset: 0x%x, y offset: 0x%0x, target mem: %d>\n",
+		 cmd[11], data->dw11.src_x_offset, data->dw11.src_y_offset,
+		 data->dw11.src_target_memory);
+}
+
+static void dump_bb_ext(struct gen12_block_copy_data_ext *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("ext details:\n");
+	igt_info(" dw12: [%08x] src <compression fmt: %d, clear value enable: %d, "
+		 "clear address low: 0x%x>\n",
+		 cmd[0],
+		 data->dw12.src_compression_format,
+		 data->dw12.src_clear_value_enable,
+		 data->dw12.src_clear_address_low);
+	igt_info(" dw13: [%08x] src clear address hi: 0x%x\n",
+		 cmd[1], data->dw13.src_clear_address_hi0);
+	igt_info(" dw14: [%08x] dst <compression fmt: %d, clear value enable: %d, "
+		 "clear address low: 0x%x>\n",
+		 cmd[2],
+		 data->dw14.dst_compression_format,
+		 data->dw14.dst_clear_value_enable,
+		 data->dw14.dst_clear_address_low);
+	igt_info(" dw15: [%08x] dst clear address hi: 0x%x\n",
+		 cmd[3], data->dw15.dst_clear_address_hi0);
+	igt_info(" dw16: [%08x] dst surface <width: %d, height: %d, type: %d>\n",
+		 cmd[4], data->dw16.dst_surface_width,
+		 data->dw16.dst_surface_height, data->dw16.dst_surface_type);
+	igt_info(" dw17: [%08x] dst surface <lod: %d, depth: %d, qpitch: %d>\n",
+		 cmd[5], data->dw17.dst_lod,
+		 data->dw17.dst_surface_depth, data->dw17.dst_surface_qpitch);
+	igt_info(" dw18: [%08x] dst <halign: %d, valign: %d, mip tail: %d, "
+		 "depth stencil: %d, array index: %d>\n",
+		 cmd[6],
+		 data->dw18.dst_horizontal_align,
+		 data->dw18.dst_vertical_align,
+		 data->dw18.dst_mip_tail_start_lod,
+		 data->dw18.dst_depth_stencil_resource,
+		 data->dw18.dst_array_index);
+
+	igt_info(" dw19: [%08x] src surface <width: %d, height: %d, type: %d>\n",
+		 cmd[7], data->dw19.src_surface_width,
+		 data->dw19.src_surface_height, data->dw19.src_surface_type);
+	igt_info(" dw20: [%08x] src surface <lod: %d, depth: %d, qpitch: %d>\n",
+		 cmd[8], data->dw20.src_lod,
+		 data->dw20.src_surface_depth, data->dw20.src_surface_qpitch);
+	igt_info(" dw21: [%08x] src <halign: %d, valign: %d, mip tail: %d, "
+		 "depth stencil: %d, array index: %d>\n",
+		 cmd[9],
+		 data->dw21.src_horizontal_align,
+		 data->dw21.src_vertical_align,
+		 data->dw21.src_mip_tail_start_lod,
+		 data->dw21.src_depth_stencil_resource,
+		 data->dw21.src_array_index);
+}
+
+/**
+ * blt_block_copy:
+ * @i915: drm fd
+ * @ctx: intel_ctx_t context
+ * @e: blitter engine for @ctx
+ * @ahnd: allocator handle
+ * @blt: basic blitter data (for TGL/DG1 which doesn't support ext version)
+ * @ext: extended blitter data (for DG2+, supports flatccs compression)
+ *
+ * Function does a blit between the src and dst objects described in @blt.
+ *
+ * Returns:
+ * execbuffer status.
+ */
+int blt_block_copy(int i915,
+		   const intel_ctx_t *ctx,
+		   const struct intel_execution_engine2 *e,
+		   uint64_t ahnd,
+		   const struct blt_copy_data *blt,
+		   const struct blt_block_copy_data_ext *ext)
+{
+	struct drm_i915_gem_execbuffer2 execbuf = {};
+	struct drm_i915_gem_exec_object2 obj[3] = {};
+	struct gen12_block_copy_data data = {};
+	struct gen12_block_copy_data_ext dext = {};
+	uint64_t dst_offset, src_offset, bb_offset, alignment;
+	uint32_t *bb;
+	int i, ret;
+
+	igt_assert_f(ahnd, "block-copy supports softpin only\n");
+	igt_assert_f(blt, "block-copy requires data to do blit\n");
+
+	alignment = gem_detect_safe_alignment(i915);
+	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
+	if (__special_mode(blt) == SM_FULL_RESOLVE)
+		dst_offset = src_offset;
+	else
+		dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
+
+	fill_data(&data, blt, src_offset, dst_offset, ext);
+
+	i = sizeof(data) / sizeof(uint32_t);
+	bb = gem_mmap__device_coherent(i915, blt->bb.handle, 0, blt->bb.size,
+				       PROT_READ | PROT_WRITE);
+	memcpy(bb, &data, sizeof(data));
+
+	if (ext) {
+		fill_data_ext(i915, &dext, ext);
+		memcpy(bb + i, &dext, sizeof(dext));
+		i += sizeof(dext) / sizeof(uint32_t);
+	}
+	bb[i++] = MI_BATCH_BUFFER_END;
+
+	if (blt->print_bb) {
+		igt_info("[BLOCK COPY]\n");
+		igt_info("src offset: %llx, dst offset: %llx, bb offset: %llx\n",
+			 (long long) src_offset, (long long) dst_offset,
+			 (long long) bb_offset);
+
+		dump_bb_cmd(&data);
+		if (ext)
+			dump_bb_ext(&dext);
+	}
+
+	munmap(bb, blt->bb.size);
+
+	obj[0].offset = CANONICAL(dst_offset);
+	obj[1].offset = CANONICAL(src_offset);
+	obj[2].offset = CANONICAL(bb_offset);
+	obj[0].handle = blt->dst.handle;
+	obj[1].handle = blt->src.handle;
+	obj[2].handle = blt->bb.handle;
+	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	execbuf.buffer_count = 3;
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.rsvd1 = ctx ? ctx->id : 0;
+	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+	ret = __gem_execbuf(i915, &execbuf);
+	if (data.dw00.special_mode != SM_FULL_RESOLVE)
+		put_offset(ahnd, blt->dst.handle);
+	put_offset(ahnd, blt->src.handle);
+	put_offset(ahnd, blt->bb.handle);
+
+	return ret;
+}
+
+static uint16_t __ccs_size(const struct blt_ctrl_surf_copy_data *surf)
+{
+	uint32_t src_size, dst_size;
+
+	src_size = surf->src.access_type == DIRECT_ACCESS ?
+				surf->src.size : surf->src.size / CCS_RATIO;
+
+	dst_size = surf->dst.access_type == DIRECT_ACCESS ?
+				surf->dst.size : surf->dst.size / CCS_RATIO;
+
+	igt_assert_f(src_size <= dst_size, "dst size must be >= src size for CCS copy\n");
+
+	return src_size;
+}
+
+struct gen12_ctrl_surf_copy_data {
+	struct {
+		uint32_t length:			BITRANGE(0, 7);
+		uint32_t size_of_ctrl_copy:		BITRANGE(8, 17);
+		uint32_t rsvd0:				BITRANGE(18, 19);
+		uint32_t dst_access_type:		BITRANGE(20, 20);
+		uint32_t src_access_type:		BITRANGE(21, 21);
+		uint32_t opcode:			BITRANGE(22, 28);
+		uint32_t client:			BITRANGE(29, 31);
+	} dw00;
+
+	struct {
+		uint32_t src_address_lo;
+	} dw01;
+
+	struct {
+		uint32_t src_address_hi:		BITRANGE(0, 24);
+		uint32_t src_mocs:			BITRANGE(25, 31);
+	} dw02;
+
+	struct {
+		uint32_t dst_address_lo;
+	} dw03;
+
+	struct {
+		uint32_t dst_address_hi:		BITRANGE(0, 24);
+		uint32_t dst_mocs:			BITRANGE(25, 31);
+	} dw04;
+};
+
+static void dump_bb_surf_ctrl_cmd(const struct gen12_ctrl_surf_copy_data *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("details:\n");
+	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, "
+		 "src/dst access type: <%d, %d>, size of ctrl copy: %u, length: %d>\n",
+		 cmd[0],
+		 data->dw00.client, data->dw00.opcode,
+		 data->dw00.src_access_type, data->dw00.dst_access_type,
+		 data->dw00.size_of_ctrl_copy, data->dw00.length);
+	igt_info(" dw01: [%08x] src offset lo (0x%x)\n",
+		 cmd[1], data->dw01.src_address_lo);
+	igt_info(" dw02: [%08x] src offset hi (0x%x), src mocs: %u\n",
+		 cmd[2], data->dw02.src_address_hi, data->dw02.src_mocs);
+	igt_info(" dw03: [%08x] dst offset lo (0x%x)\n",
+		 cmd[3], data->dw03.dst_address_lo);
+	igt_info(" dw04: [%08x] dst offset hi (0x%x), dst mocs: %u\n",
+		 cmd[4], data->dw04.dst_address_hi, data->dw04.dst_mocs);
+}
+
+/**
+ * blt_ctrl_surf_copy:
+ * @i915: drm fd
+ * @ctx: intel_ctx_t context
+ * @e: blitter engine for @ctx
+ * @ahnd: allocator handle
+ * @surf: blitter data for ctrl-surf-copy
+ *
+ * Function does a ctrl-surf-copy blit between the src and dst objects
+ * described in @surf.
+ *
+ * Returns:
+ * execbuffer status.
+ */
+int blt_ctrl_surf_copy(int i915,
+		       const intel_ctx_t *ctx,
+		       const struct intel_execution_engine2 *e,
+		       uint64_t ahnd,
+		       const struct blt_ctrl_surf_copy_data *surf)
+{
+	struct drm_i915_gem_execbuffer2 execbuf = {};
+	struct drm_i915_gem_exec_object2 obj[3] = {};
+	struct gen12_ctrl_surf_copy_data data = {};
+	uint64_t dst_offset, src_offset, bb_offset, alignment;
+	uint32_t *bb;
+	int i;
+
+	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
+	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
+
+	alignment = gem_detect_safe_alignment(i915);
+
+	data.dw00.client = 0x2;
+	data.dw00.opcode = 0x48;
+	data.dw00.src_access_type = surf->src.access_type;
+	data.dw00.dst_access_type = surf->dst.access_type;
+
+	/* Ensure dst has size capable to keep src ccs aux */
+	data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
+	data.dw00.length = 0x3;
+
+	src_offset = get_offset(ahnd, surf->src.handle, surf->src.size,
+				alignment);
+	dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size,
+				alignment);
+	bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size,
+			       alignment);
+
+	data.dw01.src_address_lo = src_offset;
+	data.dw02.src_address_hi = src_offset >> 32;
+	data.dw02.src_mocs = surf->src.mocs;
+
+	data.dw03.dst_address_lo = dst_offset;
+	data.dw04.dst_address_hi = dst_offset >> 32;
+	data.dw04.dst_mocs = surf->dst.mocs;
+
+	i = sizeof(data) / sizeof(uint32_t);
+	bb = gem_mmap__device_coherent(i915, surf->bb.handle, 0, surf->bb.size,
+				       PROT_READ | PROT_WRITE);
+	memcpy(bb, &data, sizeof(data));
+	bb[i++] = MI_BATCH_BUFFER_END;
+
+	if (surf->print_bb) {
+		igt_info("BB [CTRL SURF]:\n");
+		igt_info("src offset: %llx, dst offset: %llx, bb offset: %llx\n",
+			 (long long) src_offset, (long long) dst_offset,
+			 (long long) bb_offset);
+
+		dump_bb_surf_ctrl_cmd(&data);
+	}
+	munmap(bb, surf->bb.size);
+
+	obj[0].offset = CANONICAL(dst_offset);
+	obj[1].offset = CANONICAL(src_offset);
+	obj[2].offset = CANONICAL(bb_offset);
+	obj[0].handle = surf->dst.handle;
+	obj[1].handle = surf->src.handle;
+	obj[2].handle = surf->bb.handle;
+	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	execbuf.buffer_count = 3;
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+	execbuf.rsvd1 = ctx ? ctx->id : 0;
+	gem_execbuf(i915, &execbuf);
+	put_offset(ahnd, surf->dst.handle);
+	put_offset(ahnd, surf->src.handle);
+	put_offset(ahnd, surf->bb.handle);
+
+	return 0;
+}
+
+struct gen12_fast_copy_data {
+	struct {
+		uint32_t length:			BITRANGE(0, 7);
+		uint32_t rsvd1:				BITRANGE(8, 12);
+		uint32_t dst_tiling:			BITRANGE(13, 14);
+		uint32_t rsvd0:				BITRANGE(15, 19);
+		uint32_t src_tiling:			BITRANGE(20, 21);
+		uint32_t opcode:			BITRANGE(22, 28);
+		uint32_t client:			BITRANGE(29, 31);
+	} dw00;
+
+	struct {
+		uint32_t dst_pitch:			BITRANGE(0, 15);
+		uint32_t rsvd1:				BITRANGE(16, 23);
+		uint32_t color_depth:			BITRANGE(24, 26);
+		uint32_t rsvd0:				BITRANGE(27, 27);
+		uint32_t dst_memory:			BITRANGE(28, 28);
+		uint32_t src_memory:			BITRANGE(29, 29);
+		uint32_t dst_type_y:			BITRANGE(30, 30);
+		uint32_t src_type_y:			BITRANGE(31, 31);
+	} dw01;
+
+	struct {
+		int32_t dst_x1:				BITRANGE(0, 15);
+		int32_t dst_y1:				BITRANGE(16, 31);
+	} dw02;
+
+	struct {
+		int32_t dst_x2:				BITRANGE(0, 15);
+		int32_t dst_y2:				BITRANGE(16, 31);
+	} dw03;
+
+	struct {
+		uint32_t dst_address_lo;
+	} dw04;
+
+	struct {
+		uint32_t dst_address_hi;
+	} dw05;
+
+	struct {
+		int32_t src_x1:				BITRANGE(0, 15);
+		int32_t src_y1:				BITRANGE(16, 31);
+	} dw06;
+
+	struct {
+		uint32_t src_pitch:			BITRANGE(0, 15);
+		uint32_t rsvd0:				BITRANGE(16, 31);
+	} dw07;
+
+	struct {
+		uint32_t src_address_lo;
+	} dw08;
+
+	struct {
+		uint32_t src_address_hi;
+	} dw09;
+};
+
+static int __fast_tiling(enum blt_tiling tiling)
+{
+	switch (tiling) {
+	case T_LINEAR: return 0;
+	case T_XMAJOR: return 1;
+	case T_YMAJOR: return 2;
+	case T_TILE4:  return 2;
+	case T_TILE64: return 3;
+	}
+	return 0;
+}
+
+static int __fast_color_depth(enum blt_color_depth depth)
+{
+	switch (depth) {
+	case CD_8bit:   return 0;
+	case CD_16bit:  return 1;
+	case CD_32bit:  return 3;
+	case CD_64bit:  return 4;
+	case CD_96bit:
+		igt_assert_f(0, "Unsupported depth\n");
+		break;
+	case CD_128bit: return 5;
+	}
+	return 0;
+}
+
+static void dump_bb_fast_cmd(struct gen12_fast_copy_data *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("BB details:\n");
+	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, src tiling: %d, "
+		 "dst tiling: %d, length: %d>\n",
+		 cmd[0], data->dw00.client, data->dw00.opcode,
+		 data->dw00.src_tiling, data->dw00.dst_tiling, data->dw00.length);
+	igt_info(" dw01: [%08x] dst <pitch: %d, color depth: %d, dst memory: %d, "
+		 "src memory: %d, \n"
+		 "\t\t\tdst type tile: %d (0-legacy, 1-tile4),\n"
+		 "\t\t\tsrc type tile: %d (0-legacy, 1-tile4)>\n",
+		 cmd[1], data->dw01.dst_pitch, data->dw01.color_depth,
+		 data->dw01.dst_memory, data->dw01.src_memory,
+		 data->dw01.dst_type_y, data->dw01.src_type_y);
+	igt_info(" dw02: [%08x] dst geom <x1: %d, y1: %d>\n",
+		 cmd[2], data->dw02.dst_x1, data->dw02.dst_y1);
+	igt_info(" dw03: [%08x]          <x2: %d, y2: %d>\n",
+		 cmd[3], data->dw03.dst_x2, data->dw03.dst_y2);
+	igt_info(" dw04: [%08x] dst offset lo (0x%x)\n",
+		 cmd[4], data->dw04.dst_address_lo);
+	igt_info(" dw05: [%08x] dst offset hi (0x%x)\n",
+		 cmd[5], data->dw05.dst_address_hi);
+	igt_info(" dw06: [%08x] src geom <x1: %d, y1: %d>\n",
+		 cmd[6], data->dw06.src_x1, data->dw06.src_y1);
+	igt_info(" dw07: [%08x] src <pitch: %d>\n",
+		 cmd[7], data->dw07.src_pitch);
+	igt_info(" dw08: [%08x] src offset lo (0x%x)\n",
+		 cmd[8], data->dw08.src_address_lo);
+	igt_info(" dw09: [%08x] src offset hi (0x%x)\n",
+		 cmd[9], data->dw09.src_address_hi);
+}
+
+/**
+ * blt_fast_copy:
+ * @i915: drm fd
+ * @ctx: intel_ctx_t context
+ * @e: blitter engine for @ctx
+ * @ahnd: allocator handle
+ * @blt: blitter data for fast-copy (same as for block-copy but doesn't use
+ * compression fields).
+ *
+ * Function does fast blit between @src and @dst described in @blt object.
+ *
+ * Returns:
+ * execbuffer status.
+ */
+int blt_fast_copy(int i915,
+		  const intel_ctx_t *ctx,
+		  const struct intel_execution_engine2 *e,
+		  uint64_t ahnd,
+		  const struct blt_copy_data *blt)
+{
+	struct drm_i915_gem_execbuffer2 execbuf = {};
+	struct drm_i915_gem_exec_object2 obj[3] = {};
+	struct gen12_fast_copy_data data = {};
+	uint64_t dst_offset, src_offset, bb_offset, alignment;
+	uint32_t *bb;
+	int i, ret;
+
+	alignment = gem_detect_safe_alignment(i915);
+
+	data.dw00.client = 0x2;
+	data.dw00.opcode = 0x42;
+	data.dw00.dst_tiling = __fast_tiling(blt->dst.tiling);
+	data.dw00.src_tiling = __fast_tiling(blt->src.tiling);
+	data.dw00.length = 8;
+
+	data.dw01.dst_pitch = blt->dst.pitch;
+	data.dw01.color_depth = __fast_color_depth(blt->color_depth);
+	data.dw01.dst_memory = __memory_type(blt->dst.region);
+	data.dw01.src_memory = __memory_type(blt->src.region);
+	data.dw01.dst_type_y = blt->dst.tiling == T_TILE4 ? 1 : 0;
+	data.dw01.src_type_y = blt->src.tiling == T_TILE4 ? 1 : 0;
+
+	data.dw02.dst_x1 = blt->dst.x1;
+	data.dw02.dst_y1 = blt->dst.y1;
+
+	data.dw03.dst_x2 = blt->dst.x2;
+	data.dw03.dst_y2 = blt->dst.y2;
+
+	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
+	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
+
+	data.dw04.dst_address_lo = dst_offset;
+	data.dw05.dst_address_hi = dst_offset >> 32;
+
+	data.dw06.src_x1 = blt->src.x1;
+	data.dw06.src_y1 = blt->src.y1;
+
+	data.dw07.src_pitch = blt->src.pitch;
+
+	data.dw08.src_address_lo = src_offset;
+	data.dw09.src_address_hi = src_offset >> 32;
+
+	i = sizeof(data) / sizeof(uint32_t);
+	bb = gem_mmap__device_coherent(i915, blt->bb.handle, 0, blt->bb.size,
+				       PROT_READ | PROT_WRITE);
+
+	memcpy(bb, &data, sizeof(data));
+	bb[i++] = MI_BATCH_BUFFER_END;
+
+	if (blt->print_bb) {
+		igt_info("BB [FAST COPY]\n");
+		igt_info("blit [src offset: %llx, dst offset: %llx]\n",
+			 (long long) src_offset, (long long) dst_offset);
+		dump_bb_fast_cmd(&data);
+	}
+
+	munmap(bb, blt->bb.size);
+
+	obj[0].offset = CANONICAL(dst_offset);
+	obj[1].offset = CANONICAL(src_offset);
+	obj[2].offset = CANONICAL(bb_offset);
+	obj[0].handle = blt->dst.handle;
+	obj[1].handle = blt->src.handle;
+	obj[2].handle = blt->bb.handle;
+	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	execbuf.buffer_count = 3;
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.rsvd1 = ctx ? ctx->id : 0;
+	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+	ret = __gem_execbuf(i915, &execbuf);
+	put_offset(ahnd, blt->dst.handle);
+	put_offset(ahnd, blt->src.handle);
+	put_offset(ahnd, blt->bb.handle);
+
+	return ret;
+}
+
+/**
+ * blt_surface_fill_rect:
+ * @i915: drm fd
+ * @obj: blitter copy object (@blt_copy_object) to fill with gradient pattern
+ * @width: width
+ * @height: height
+ *
+ * Function fills surface @width x @height * 24bpp with color gradient
+ * (internally uses ARGB where A == 0xff, see Cairo docs).
+ */
+void blt_surface_fill_rect(int i915, const struct blt_copy_object *obj,
+			   uint32_t width, uint32_t height)
+{
+	cairo_surface_t *surface;
+	cairo_pattern_t *pat;
+	cairo_t *cr;
+	void *map = obj->ptr;
+
+	if (!map)
+		map = gem_mmap__device_coherent(i915, obj->handle, 0,
+						obj->size, PROT_READ | PROT_WRITE);
+
+	surface = cairo_image_surface_create_for_data(map,
+						      CAIRO_FORMAT_RGB24,
+						      width, height,
+						      obj->pitch);
+
+	cr = cairo_create(surface);
+
+	cairo_rectangle(cr, 0, 0, width, height);
+	cairo_clip(cr);
+
+	pat = cairo_pattern_create_mesh();
+	cairo_mesh_pattern_begin_patch(pat);
+	cairo_mesh_pattern_move_to(pat, 0, 0);
+	cairo_mesh_pattern_line_to(pat, width, 0);
+	cairo_mesh_pattern_line_to(pat, width, height);
+	cairo_mesh_pattern_line_to(pat, 0, height);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 0, 1.0, 0.0, 0.0);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 1, 0.0, 1.0, 0.0);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 2, 0.0, 0.0, 1.0);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 3, 1.0, 1.0, 1.0);
+	cairo_mesh_pattern_end_patch(pat);
+
+	cairo_rectangle(cr, 0, 0, width, height);
+	cairo_set_source(cr, pat);
+	cairo_fill(cr);
+	cairo_pattern_destroy(pat);
+
+	cairo_destroy(cr);
+
+	cairo_surface_destroy(surface);
+	if (!obj->ptr)
+		munmap(map, obj->size);
+}
+
+/**
+ * blt_surface_info:
+ * @info: information header
+ * @obj: blitter copy object (@blt_copy_object) to print surface info
+ */
+void blt_surface_info(const char *info, const struct blt_copy_object *obj)
+{
+	igt_info("[%s]\n", info);
+	igt_info("surface <handle: %u, size: %llx, region: %x, mocs: %x>\n",
+		 obj->handle, (long long) obj->size, obj->region, obj->mocs);
+	igt_info("        <tiling: %s, compression: %u, compression type: %d>\n",
+		 blt_tiling_name(obj->tiling), obj->compression, obj->compression_type);
+	igt_info("        <pitch: %u, offset [x: %u, y: %u] geom [<%d,%d> <%d,%d>]>\n",
+		 obj->pitch, obj->x_offset, obj->y_offset,
+		 obj->x1, obj->y1, obj->x2, obj->y2);
+}
+
+/**
+ * blt_surface_to_png:
+ * @i915: drm fd
+ * @run_id: prefix id to allow grouping files stored from single run
+ * @fileid: file identifier
+ * @obj: blitter copy object (@blt_copy_object) to save to png
+ * @width: width
+ * @height: height
+ *
+ * Function saves surface to a png file. Assumes ARGB format where A == 0xff.
+ */
+void blt_surface_to_png(int i915, uint32_t run_id, const char *fileid,
+			const struct blt_copy_object *obj,
+			uint32_t width, uint32_t height)
+{
+	cairo_surface_t *surface;
+	cairo_status_t ret;
+	uint8_t *map = (uint8_t *) obj->ptr;
+	int format;
+	int stride = obj->tiling ? obj->pitch * 4 : obj->pitch;
+	char filename[FILENAME_MAX];
+
+	snprintf(filename, FILENAME_MAX-1, "%u-%s-%s-%ux%u-%s.png",
+		 run_id, fileid, blt_tiling_name(obj->tiling), width, height,
+		 obj->compression ? "compressed" : "uncompressed");
+
+	if (!map)
+		map = gem_mmap__device_coherent(i915, obj->handle, 0,
+						obj->size, PROT_READ);
+	format = CAIRO_FORMAT_RGB24;
+	surface = cairo_image_surface_create_for_data(map,
+						      format, width, height,
+						      stride);
+	ret = cairo_surface_write_to_png(surface, filename);
+	if (ret)
+		igt_info("Cairo ret: %d (%s)\n", ret, cairo_status_to_string(ret));
+	igt_assert(ret == CAIRO_STATUS_SUCCESS);
+	cairo_surface_destroy(surface);
+
+	if (!obj->ptr)
+		munmap(map, obj->size);
+}
diff --git a/lib/i915/i915_blt.h b/lib/i915/i915_blt.h
new file mode 100644
index 000000000..e0e8b52bc
--- /dev/null
+++ b/lib/i915/i915_blt.h
@@ -0,0 +1,196 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+/**
+ * SECTION:i915_blt
+ * @short_description: i915 blitter library
+ * @title: Blitter library
+ * @include: i915_blt.h
+ *
+ * # Introduction
+ *
+ * Gen12+ blitter commands like XY_BLOCK_COPY_BLT are quite long. Passing
+ * every field as a separate function argument would make the argument list
+ * long, unreadable and prone to misplaced arguments.
+ * Passing objects (structs) is more reasonable and also makes it possible
+ * to share object data across different blitter commands.
+ *
+ * The blitter library supports no-reloc (softpin) mode only (apart from TGL,
+ * relocations are not enabled), thus ahnd is mandatory. Passing a NULL ctx
+ * selects the default context with I915_EXEC_BLT as the execution engine.
+ *
+ * The library introduces a tiling enum which distinguishes tiling formats
+ * independently of the legacy I915_TILING_... definitions. This gives full
+ * control over which tilings a command handles and which are skipped/asserted.
+ *
+ * # Supported commands
+ *
+ * - XY_BLOCK_COPY_BLT - (block-copy) TGL/DG1 + DG2+ (ext version)
+ * - XY_FAST_COPY_BLT - (fast-copy)
+ * - XY_CTRL_SURF_COPY_BLT - (ctrl-surf-copy) DG2+
+ *
+ * # Usage details
+ *
+ * For block-copy and fast-copy the @blt_copy_object struct collects data
+ * about the source and destination objects. It contains the handle, region,
+ * size, etc. which are used for blits. Some fields (like compression) are
+ * not used for fast-copy; fields used by a single command exclusively are
+ * annotated in a comment.
+ *
+ */
+
+#include <errno.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <malloc.h>
+#include "drm.h"
+#include "igt.h"
+
+#define CCS_RATIO 256
+
+enum blt_color_depth {
+	CD_8bit,
+	CD_16bit,
+	CD_32bit,
+	CD_64bit,
+	CD_96bit,
+	CD_128bit,
+};
+
+enum blt_tiling {
+	T_LINEAR,
+	T_XMAJOR,
+	T_YMAJOR,
+	T_TILE4,
+	T_TILE64,
+};
+
+enum blt_compression {
+	COMPRESSION_DISABLED,
+	COMPRESSION_ENABLED,
+};
+
+enum blt_compression_type {
+	COMPRESSION_TYPE_3D,
+	COMPRESSION_TYPE_MEDIA,
+};
+
+/* BC - block-copy */
+struct blt_copy_object {
+	uint32_t handle;
+	uint32_t region;
+	uint64_t size;
+	uint8_t mocs;
+	enum blt_tiling tiling;
+	enum blt_compression compression;  /* BC only */
+	enum blt_compression_type compression_type; /* BC only */
+	uint32_t pitch;
+	uint16_t x_offset, y_offset;
+	int16_t x1, y1, x2, y2;
+
+	/* mapping or null */
+	uint32_t *ptr;
+};
+
+struct blt_copy_batch {
+	uint32_t handle;
+	uint32_t region;
+	uint64_t size;
+};
+
+/* Common for block-copy and fast-copy */
+struct blt_copy_data {
+	int i915;
+	struct blt_copy_object src;
+	struct blt_copy_object dst;
+	struct blt_copy_batch bb;
+	enum blt_color_depth color_depth;
+
+	/* debug stuff */
+	bool print_bb;
+};
+
+enum blt_surface_type {
+	SURFACE_TYPE_1D,
+	SURFACE_TYPE_2D,
+	SURFACE_TYPE_3D,
+	SURFACE_TYPE_CUBE,
+};
+
+struct blt_block_copy_object_ext {
+	uint8_t compression_format;
+	bool clear_value_enable;
+	uint64_t clear_address;
+	uint16_t surface_width;
+	uint16_t surface_height;
+	enum blt_surface_type surface_type;
+	uint16_t surface_qpitch;
+	uint16_t surface_depth;
+	uint8_t lod;
+	uint8_t horizontal_align;
+	uint8_t vertical_align;
+	uint8_t mip_tail_start_lod;
+	bool depth_stencil_resource;
+	uint16_t array_index;
+};
+
+struct blt_block_copy_data_ext {
+	struct blt_block_copy_object_ext src;
+	struct blt_block_copy_object_ext dst;
+};
+
+enum blt_access_type {
+	INDIRECT_ACCESS,
+	DIRECT_ACCESS,
+};
+
+struct blt_ctrl_surf_copy_object {
+	uint32_t handle;
+	uint32_t region;
+	uint64_t size;
+	uint8_t mocs;
+	enum blt_access_type access_type;
+};
+
+struct blt_ctrl_surf_copy_data {
+	int i915;
+	struct blt_ctrl_surf_copy_object src;
+	struct blt_ctrl_surf_copy_object dst;
+	struct blt_copy_batch bb;
+
+	/* debug stuff */
+	bool print_bb;
+};
+
+bool blt_supports_compression(int i915);
+bool blt_supports_tiling(int i915, enum blt_tiling tiling);
+const char *blt_tiling_name(enum blt_tiling tiling);
+
+int blt_block_copy(int i915,
+		   const intel_ctx_t *ctx,
+		   const struct intel_execution_engine2 *e,
+		   uint64_t ahnd,
+		   const struct blt_copy_data *blt,
+		   const struct blt_block_copy_data_ext *ext);
+
+int blt_ctrl_surf_copy(int i915,
+		       const intel_ctx_t *ctx,
+		       const struct intel_execution_engine2 *e,
+		       uint64_t ahnd,
+		       const struct blt_ctrl_surf_copy_data *surf);
+
+int blt_fast_copy(int i915,
+		  const intel_ctx_t *ctx,
+		  const struct intel_execution_engine2 *e,
+		  uint64_t ahnd,
+		  const struct blt_copy_data *blt);
+
+void blt_surface_info(const char *info,
+		      const struct blt_copy_object *obj);
+void blt_surface_fill_rect(int i915, const struct blt_copy_object *obj,
+			   uint32_t width, uint32_t height);
+void blt_surface_to_png(int i915, uint32_t run_id, const char *fileid,
+		    const struct blt_copy_object *obj,
+		    uint32_t width, uint32_t height);
diff --git a/lib/meson.build b/lib/meson.build
index 3e43316d1..fe035672e 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -11,6 +11,7 @@ lib_sources = [
 	'i915/gem_mman.c',
 	'i915/gem_vm.c',
 	'i915/intel_memory_region.c',
+	'i915/i915_blt.c',
 	'igt_collection.c',
 	'igt_color_encoding.c',
 	'igt_debugfs.c',
-- 
2.32.0

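For reference, the command header that blt_fast_copy() fills in through
bitfields can also be written with explicit shifts. The sketch below is
only an illustration, not part of the patch: the helper names
(fast_copy_dw0, addr_lo, addr_hi) are hypothetical, and it assumes the
bitfields of struct gen12_fast_copy_data are allocated from the least
significant bit, as GCC does on little-endian x86. The field positions
and the client/opcode/length values are taken from the struct and from
blt_fast_copy() above.

```c
#include <stdint.h>

/* Values written by blt_fast_copy() into dw00. */
#define FAST_COPY_CLIENT 0x2u	/* bits 29..31 */
#define FAST_COPY_OPCODE 0x42u	/* bits 22..28 */
#define FAST_COPY_LENGTH 8u	/* bits 0..7 */

/* Hypothetical helper: build dw00 from the 2-bit hardware tiling values. */
static uint32_t fast_copy_dw0(uint32_t src_tiling, uint32_t dst_tiling)
{
	return (FAST_COPY_CLIENT << 29) |
	       (FAST_COPY_OPCODE << 22) |
	       ((src_tiling & 0x3u) << 20) |	/* bits 20..21 */
	       ((dst_tiling & 0x3u) << 13) |	/* bits 13..14 */
	       FAST_COPY_LENGTH;
}

/* Hypothetical helpers: split a 64-bit GPU offset into two dwords. */
static uint32_t addr_lo(uint64_t offset)
{
	return (uint32_t)offset;
}

static uint32_t addr_hi(uint64_t offset)
{
	return (uint32_t)(offset >> 32);
}
```

With both surfaces linear, fast_copy_dw0(0, 0) gives 0x50800008, which
is the value the dw00 line of the batch dump should show under the
layout assumption above.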

* [igt-dev] [PATCH i-g-t 3/3] tests/gem_ccs: Verify uncompressed and compressed blits
  2022-02-16 12:21 [igt-dev] [PATCH i-g-t 0/3] Add i915 blt library + gem_ccs test Zbigniew Kempczyński
  2022-02-16 12:21 ` [igt-dev] [PATCH i-g-t 1/3] i915/gem_engine_topology: Only use the main copy engines for XY_BLOCK_COPY Zbigniew Kempczyński
  2022-02-16 12:21 ` [igt-dev] [PATCH i-g-t 2/3] lib/i915_blt: Add library for blitter Zbigniew Kempczyński
@ 2022-02-16 12:21 ` Zbigniew Kempczyński
  2022-02-16 15:46 ` [igt-dev] ✓ Fi.CI.BAT: success for Add i915 blt library + gem_ccs test (rev5) Patchwork
  2022-02-16 22:30 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
  4 siblings, 0 replies; 12+ messages in thread
From: Zbigniew Kempczyński @ 2022-02-16 12:21 UTC (permalink / raw)
  To: igt-dev

DG2 and above support flat-ccs, which means object compression data lies in
a dedicated part of memory and is directly correlated with the object location
in physical memory. As this area cannot be mapped from the CPU side, a
dedicated blitter command (XY_CTRL_SURF_COPY_BLT) was created to copy
ccs data from/to it.

Test exercises scenarios:
1. block-copy without compression (TGL/DG1)
2. block-copy with flat-ccs compression (DG2+) + inplace decompression
3. ctrl-surf-copy which verifies copying ccs data to/from flat-ccs area

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 tests/i915/gem_ccs.c | 485 +++++++++++++++++++++++++++++++++++++++++++
 tests/meson.build    |   1 +
 2 files changed, 486 insertions(+)
 create mode 100644 tests/i915/gem_ccs.c
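
As a rough illustration of the sizes involved in the ctrl-surf-copy
scenario, the sketch below derives the ccs buffer size from the surface
size in the same way surf_copy() does (mid->size / CCS_RATIO). The
helper names surface_size and ccs_size are hypothetical; CCS_RATIO is
the constant from lib/i915/i915_blt.h. Unlike create_object(), the
multiplication is done in 64 bits here, which is a choice of this
sketch and not something the patch does.

```c
#include <stdint.h>

#define CCS_RATIO 256	/* same value as in lib/i915/i915_blt.h */

/* Hypothetical helper: main surface size in bytes for width x height at bpp. */
static uint64_t surface_size(uint32_t width, uint32_t height, uint32_t bpp)
{
	return (uint64_t)width * height * bpp / 8;
}

/* Hypothetical helper: bytes of ccs data backing a surface of surf_size bytes. */
static uint64_t ccs_size(uint64_t surf_size)
{
	return surf_size / CCS_RATIO;
}
```

For the default 512x512 surface at 32 bpp this gives a 1 MiB surface
and 4096 bytes of ccs data, assuming no extra size is requested with -I.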

diff --git a/tests/i915/gem_ccs.c b/tests/i915/gem_ccs.c
new file mode 100644
index 000000000..33b79d86d
--- /dev/null
+++ b/tests/i915/gem_ccs.c
@@ -0,0 +1,485 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include <errno.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <malloc.h>
+#include "drm.h"
+#include "igt.h"
+#include "i915/gem.h"
+#include "i915/gem_create.h"
+#include "lib/intel_chipset.h"
+#include "i915/i915_blt.h"
+
+IGT_TEST_DESCRIPTION("Exercise gen12 blitter with and without flat ccs");
+
+#define BPP 32
+
+static struct param {
+	int compression_format;
+	int tiling;
+	bool write_png;
+	bool print_bb;
+	bool print_surface_info;
+	int width;
+	int height;
+	uint32_t increase_mid_surf_size;
+} param = {
+	.compression_format = 0,
+	.tiling = -1,
+	.write_png = false,
+	.print_bb = false,
+	.print_surface_info = false,
+	.width = 512,
+	.height = 512,
+	.increase_mid_surf_size = 0,
+};
+
+struct test_config {
+	bool compression;
+	bool inplace;
+	bool surfcopy;
+};
+
+static void set_object(struct blt_copy_object *obj,
+		       uint32_t handle, uint64_t size, uint32_t region,
+		       uint8_t mocs, enum blt_tiling tiling,
+		       enum blt_compression compression,
+		       enum blt_compression_type compression_type)
+{
+	obj->handle = handle;
+	obj->size = size;
+	obj->region = region;
+	obj->mocs = mocs;
+	obj->tiling = tiling;
+	obj->compression = compression;
+	obj->compression_type = compression_type;
+}
+
+static void set_geom(struct blt_copy_object *obj, uint32_t pitch,
+		     int16_t x1, int16_t y1, int16_t x2, int16_t y2,
+		     uint16_t x_offset, uint16_t y_offset)
+{
+	obj->pitch = pitch;
+	obj->x1 = x1;
+	obj->y1 = y1;
+	obj->x2 = x2;
+	obj->y2 = y2;
+	obj->x_offset = x_offset;
+	obj->y_offset = y_offset;
+}
+
+static void set_batch(struct blt_copy_batch *batch,
+		      uint32_t handle, uint64_t size, uint32_t region)
+{
+	batch->handle = handle;
+	batch->size = size;
+	batch->region = region;
+}
+
+static void set_object_ext(struct blt_block_copy_object_ext *obj,
+			   uint8_t compression_format,
+			   uint16_t surface_width, uint16_t surface_height,
+			   enum blt_surface_type surface_type)
+{
+	obj->compression_format = compression_format;
+	obj->surface_width = surface_width;
+	obj->surface_height = surface_height;
+	obj->surface_type = surface_type;
+}
+
+static void set_surf_object(struct blt_ctrl_surf_copy_object *obj,
+			    uint32_t handle, uint32_t region, uint64_t size,
+			    uint8_t mocs, enum blt_access_type access_type)
+{
+	obj->handle = handle;
+	obj->region = region;
+	obj->size = size;
+	obj->mocs = mocs;
+	obj->access_type = access_type;
+}
+
+static struct blt_copy_object *
+create_object(int i915, uint32_t region,
+	      uint32_t width, uint32_t height, uint32_t bpp, uint8_t mocs,
+	      enum blt_tiling tiling,
+	      enum blt_compression compression,
+	      enum blt_compression_type compression_type,
+	      bool create_mapping,
+	      uint64_t increase_size)
+{
+	struct blt_copy_object *obj;
+	uint64_t size = width * height * bpp / 8 + increase_size;
+	uint32_t stride = tiling == T_LINEAR ? width * 4 : width;
+	uint32_t handle;
+
+	obj = calloc(1, sizeof(*obj));
+
+	obj->size = size;
+	igt_assert(__gem_create_in_memory_regions(i915, &handle,
+						  &size, region) == 0);
+
+	set_object(obj, handle, size, region, mocs, tiling,
+		   compression, compression_type);
+	set_geom(obj, stride, 0, 0, width, height, 0, 0);
+
+	if (create_mapping)
+		obj->ptr = gem_mmap__device_coherent(i915, handle, 0, size,
+						     PROT_READ | PROT_WRITE);
+
+	return obj;
+}
+
+static void destroy_object(int i915, struct blt_copy_object *obj)
+{
+	if (obj->ptr)
+		munmap(obj->ptr, obj->size);
+
+	gem_close(i915, obj->handle);
+}
+
+static void set_blt_object(struct blt_copy_object *obj,
+			   const struct blt_copy_object *orig)
+{
+	memcpy(obj, orig, sizeof(*obj));
+}
+
+#define PRINT_SURFACE_INFO(name, obj) do { \
+	if (param.print_surface_info) \
+		blt_surface_info((name), (obj)); } while (0)
+
+#define WRITE_PNG(fd, id, name, obj, w, h) do { \
+	if (param.write_png) \
+		blt_surface_to_png((fd), (id), (name), (obj), (w), (h)); } while (0)
+
+static void surf_copy(int i915,
+		      const intel_ctx_t *ctx,
+		      const struct intel_execution_engine2 *e,
+		      uint64_t ahnd,
+		      const struct blt_copy_object *src,
+		      const struct blt_copy_object *mid,
+		      const struct blt_copy_object *dst,
+		      int run_id)
+{
+	struct blt_copy_data blt = {};
+	struct blt_block_copy_data_ext ext = {};
+	struct blt_ctrl_surf_copy_data surf = {};
+	uint32_t bb, bb2, ccs, *ccsmap, bb_size = 4096;
+	uint64_t ccssize = mid->size / CCS_RATIO;
+	uint32_t *ccscopy;
+	int result;
+
+	igt_assert(mid->compression);
+	ccscopy = (uint32_t *) malloc(ccssize);
+	bb = gem_create(i915, bb_size);
+	bb2 = gem_create(i915, bb_size);
+	ccs = gem_create(i915, ccssize);
+
+	surf.i915 = i915;
+	surf.print_bb = param.print_bb;
+	set_surf_object(&surf.src, mid->handle, mid->region, mid->size,
+			0, INDIRECT_ACCESS);
+	set_surf_object(&surf.dst, ccs, REGION_SMEM, ccssize,
+			0, DIRECT_ACCESS);
+	set_batch(&surf.bb, bb, bb_size, REGION_SMEM);
+	blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
+	gem_sync(i915, surf.dst.handle);
+
+	ccsmap = gem_mmap__device_coherent(i915, ccs, 0, surf.dst.size,
+					   PROT_READ | PROT_WRITE);
+	memcpy(ccscopy, ccsmap, ccssize);
+
+	/* corrupt ccs */
+	for (int i = 0; i < surf.dst.size / sizeof(uint32_t); i++)
+		ccsmap[i] = i;
+	set_surf_object(&surf.src, ccs, REGION_SMEM, ccssize,
+			0, DIRECT_ACCESS);
+	set_surf_object(&surf.dst, mid->handle, mid->region, mid->size,
+			0, INDIRECT_ACCESS);
+	blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
+
+	memset(&blt, 0, sizeof(blt));
+	blt.color_depth = CD_32bit;
+	blt.print_bb = param.print_bb;
+	set_blt_object(&blt.src, mid);
+	set_blt_object(&blt.dst, dst);
+	set_object_ext(&ext.src, mid->compression_type, mid->x2, mid->y2, SURFACE_TYPE_2D);
+	set_object_ext(&ext.dst, 0, dst->x2, dst->y2, SURFACE_TYPE_2D);
+	set_batch(&blt.bb, bb2, bb_size, REGION_SMEM);
+	blt_block_copy(i915, ctx, e, ahnd, &blt, &ext);
+	gem_sync(i915, blt.dst.handle);
+	WRITE_PNG(i915, run_id, "corrupted", &blt.dst, dst->x2, dst->y2);
+	result = memcmp(src->ptr, dst->ptr, src->size);
+	igt_assert(result != 0);
+
+	/* retrieve back ccs */
+	memcpy(ccsmap, ccscopy, ccssize);
+	blt_ctrl_surf_copy(i915, ctx, e, ahnd, &surf);
+
+	blt_block_copy(i915, ctx, e, ahnd, &blt, &ext);
+	gem_sync(i915, blt.dst.handle);
+	WRITE_PNG(i915, run_id, "corrected", &blt.dst, dst->x2, dst->y2);
+	result = memcmp(src->ptr, dst->ptr, src->size);
+	igt_assert(result == 0);
+
+	munmap(ccsmap, ccssize);
+	gem_close(i915, bb);
+	gem_close(i915, bb2);
+	gem_close(i915, ccs);
+}
+
+static void block_copy(int i915,
+		       const intel_ctx_t *ctx,
+		       const struct intel_execution_engine2 *e,
+		       uint32_t region1, uint32_t region2,
+		       enum blt_tiling mid_tiling, bool compression,
+		       bool inplace,
+		       bool surfcopy)
+{
+	struct blt_copy_data blt = {};
+	struct blt_block_copy_data_ext ext = {}, *pext = &ext;
+	struct blt_copy_object *src, *mid, *dst;
+	const uint32_t bpp = BPP;
+	uint64_t add_size = param.increase_mid_surf_size;
+	uint64_t bb_size = 4096;
+	uint64_t ahnd = intel_allocator_open_full(i915, ctx->id, 0, 0,
+						  INTEL_ALLOCATOR_SIMPLE,
+						  ALLOC_STRATEGY_LOW_TO_HIGH);
+	uint32_t run_id = mid_tiling;
+	uint32_t mid_region = region2, bb;
+	uint32_t width = param.width, height = param.height;
+	enum blt_compression mid_compression = compression;
+	int mid_compression_format = param.compression_format;
+	enum blt_compression_type comp_type = COMPRESSION_TYPE_3D;
+	int result;
+
+	igt_assert(__gem_create_in_memory_regions(i915, &bb, &bb_size, region1) == 0);
+
+	if (!blt_supports_compression(i915))
+		pext = NULL;
+
+	src = create_object(i915, region1, width, height, bpp, 0,
+			    T_LINEAR, COMPRESSION_DISABLED, comp_type, true, 0);
+	mid = create_object(i915, mid_region, width, height, bpp, 0,
+			    mid_tiling, mid_compression, comp_type, true, add_size);
+	dst = create_object(i915, region1, width, height, bpp, 0,
+			    T_LINEAR, COMPRESSION_DISABLED, comp_type, true, 0);
+	igt_assert(src->size == dst->size);
+	PRINT_SURFACE_INFO("src", src);
+	PRINT_SURFACE_INFO("mid", mid);
+	PRINT_SURFACE_INFO("dst", dst);
+
+	blt_surface_fill_rect(i915, src, width, height);
+	WRITE_PNG(i915, run_id, "src", src, width, height);
+
+	memset(&blt, 0, sizeof(blt));
+	blt.color_depth = CD_32bit;
+	blt.print_bb = param.print_bb;
+	set_blt_object(&blt.src, src);
+	set_blt_object(&blt.dst, mid);
+	set_object_ext(&ext.src, 0, width, height, SURFACE_TYPE_2D);
+	set_object_ext(&ext.dst, mid_compression_format, width, height, SURFACE_TYPE_2D);
+	set_batch(&blt.bb, bb, bb_size, region1);
+
+	blt_block_copy(i915, ctx, e, ahnd, &blt, pext);
+	gem_sync(i915, mid->handle);
+
+	WRITE_PNG(i915, run_id, "src", &blt.src, width, height);
+	WRITE_PNG(i915, run_id, "mid", &blt.dst, width, height);
+
+	if (surfcopy && pext)
+		surf_copy(i915, ctx, e, ahnd, src, mid, dst, run_id);
+
+	memset(&blt, 0, sizeof(blt));
+	blt.color_depth = CD_32bit;
+	blt.print_bb = param.print_bb;
+	set_blt_object(&blt.src, mid);
+	set_blt_object(&blt.dst, dst);
+	set_object_ext(&ext.src, mid_compression_format, width, height, SURFACE_TYPE_2D);
+	set_object_ext(&ext.dst, 0, width, height, SURFACE_TYPE_2D);
+	if (inplace) {
+		set_object(&blt.dst, mid->handle, dst->size, mid->region, 0,
+			   T_LINEAR, COMPRESSION_DISABLED, comp_type);
+		blt.dst.ptr = mid->ptr;
+	}
+
+	set_batch(&blt.bb, bb, bb_size, region1);
+	blt_block_copy(i915, ctx, e, ahnd, &blt, pext);
+	gem_sync(i915, blt.dst.handle);
+	WRITE_PNG(i915, run_id, "dst", &blt.dst, width, height);
+
+	result = memcmp(src->ptr, blt.dst.ptr, src->size);
+
+	destroy_object(i915, src);
+	destroy_object(i915, mid);
+	destroy_object(i915, dst);
+	gem_close(i915, bb);
+	put_ahnd(ahnd);
+
+	igt_assert_f(!result, "source and destination surfaces differ!\n");
+}
+
+static void block_copy_test(int i915,
+			    const struct test_config *config,
+			    const intel_ctx_t *ctx,
+			    struct igt_collection *set)
+{
+	struct igt_collection *regions;
+	const struct intel_execution_engine2 *e;
+
+	if (config->compression && !blt_supports_compression(i915))
+		return;
+
+	if (config->inplace && !config->compression)
+		return;
+
+	for (int tiling = T_LINEAR; tiling <= T_TILE64; tiling++) {
+		if (!blt_supports_tiling(i915, tiling) ||
+		    (param.tiling >= 0 && param.tiling != tiling))
+			continue;
+
+		for_each_ctx_engine(i915, ctx, e) {
+			if (!gem_engine_can_block_copy(i915, e))
+				continue;
+
+			for_each_variation_r(regions, 2, set) {
+				uint32_t region1, region2;
+				char *regtxt;
+
+				region1 = igt_collection_get_value(regions, 0);
+				region2 = igt_collection_get_value(regions, 1);
+
+				/* Compressed surface must be in device memory */
+				if (config->compression && !IS_DEVICE_MEMORY_REGION(region2))
+					continue;
+
+				regtxt = memregion_dynamic_subtest_name(regions);
+				igt_dynamic_f("%s-%s-compfmt%d-%s",
+					      blt_tiling_name(tiling),
+					      config->compression ?
+						      "compressed" : "uncompressed",
+					      param.compression_format, regtxt) {
+					block_copy(i915, ctx, e,
+						   region1, region2,
+						   tiling,
+						   config->compression,
+						   config->inplace,
+						   config->surfcopy);
+				}
+				free(regtxt);
+			}
+		}
+	}
+}
+
+static int opt_handler(int opt, int opt_index, void *data)
+{
+	switch (opt) {
+	case 'b':
+		param.print_bb = true;
+		igt_debug("Print bb: %d\n", param.print_bb);
+		break;
+	case 'f':
+		param.compression_format = atoi(optarg);
+		igt_debug("Compression format: %d\n", param.compression_format);
+		igt_assert((param.compression_format & ~0x1f) == 0);
+		break;
+	case 'p':
+		param.write_png = true;
+		igt_debug("Write png: %d\n", param.write_png);
+		break;
+	case 's':
+		param.print_surface_info = true;
+		igt_debug("Print surface info: %d\n", param.print_surface_info);
+		break;
+	case 't':
+		param.tiling = atoi(optarg);
+		igt_debug("Tiling: %d\n", param.tiling);
+		break;
+	case 'W':
+		param.width = atoi(optarg);
+		igt_debug("Width: %d\n", param.width);
+		break;
+	case 'H':
+		param.height = atoi(optarg);
+		igt_debug("Height: %d\n", param.height);
+		break;
+	case 'I':
+		param.increase_mid_surf_size = atoi(optarg);
+		igt_debug("Additional mid size: %x\n", param.increase_mid_surf_size);
+		break;
+	default:
+		return IGT_OPT_HANDLER_ERROR;
+	}
+
+	return IGT_OPT_HANDLER_SUCCESS;
+}
+
+const char *help_str =
+	"  -b\tPrint bb\n"
+	"  -f\tCompression format (0-31)\n"
+	"  -p\tWrite PNG\n"
+	"  -s\tPrint surface info\n"
+	"  -t\tTiling format (0 - linear, 1 - XMAJOR, 2 - YMAJOR, 3 - TILE4, 4 - TILE64)\n"
+	"  -W\tWidth (default 512)\n"
+	"  -H\tHeight (default 512)\n"
+	"  -I\tIncrease Tile64 surface size (value)\n"
+	;
+
+igt_main_args("bf:pst:W:H:I:", NULL, help_str, opt_handler, NULL)
+{
+	struct drm_i915_query_memory_regions *query_info;
+	struct igt_collection *set;
+	const intel_ctx_t *ctx;
+	int i915;
+	igt_hang_t hang;
+
+	igt_fixture {
+		i915 = drm_open_driver(DRIVER_INTEL);
+		igt_require_gem(i915);
+		igt_require(AT_LEAST_GEN(intel_get_drm_devid(i915), 12) > 0);
+
+		query_info = gem_get_query_memory_regions(i915);
+		igt_require(query_info);
+
+		set = get_memory_region_set(query_info,
+					    I915_SYSTEM_MEMORY,
+					    I915_DEVICE_MEMORY);
+		ctx = intel_ctx_create_all_physical(i915);
+		hang = igt_allow_hang(i915, ctx->id, 0);
+	}
+
+	igt_subtest_with_dynamic("block-copy-uncompressed") {
+		struct test_config config = {};
+
+		block_copy_test(i915, &config, ctx, set);
+	}
+
+	igt_subtest_with_dynamic("block-copy-compressed") {
+		struct test_config config = { .compression = true };
+
+		block_copy_test(i915, &config, ctx, set);
+	}
+
+	igt_subtest_with_dynamic("block-copy-inplace") {
+		struct test_config config = { .compression = true,
+					      .inplace = true };
+
+		block_copy_test(i915, &config, ctx, set);
+	}
+
+	igt_subtest_with_dynamic("ctrl-surf-copy") {
+		struct test_config config = { .compression = true,
+					      .surfcopy = true };
+
+		block_copy_test(i915, &config, ctx, set);
+	}
+
+	igt_fixture {
+		igt_disallow_hang(i915, hang);
+		close(i915);
+	}
+}
diff --git a/tests/meson.build b/tests/meson.build
index 7003d0641..0736732b7 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -108,6 +108,7 @@ i915_progs = [
 	'gem_blits',
 	'gem_busy',
 	'gem_caching',
+	'gem_ccs',
 	'gem_close',
 	'gem_close_race',
 	'gem_concurrent_blit',
-- 
2.32.0


* [igt-dev] ✓ Fi.CI.BAT: success for Add i915 blt library + gem_ccs test (rev5)
  2022-02-16 12:21 [igt-dev] [PATCH i-g-t 0/3] Add i915 blt library + gem_ccs test Zbigniew Kempczyński
                   ` (2 preceding siblings ...)
  2022-02-16 12:21 ` [igt-dev] [PATCH i-g-t 3/3] tests/gem_ccs: Verify uncompressed and compressed blits Zbigniew Kempczyński
@ 2022-02-16 15:46 ` Patchwork
  2022-02-16 22:30 ` [igt-dev] ✗ Fi.CI.IGT: failure " Patchwork
  4 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2022-02-16 15:46 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

== Series Details ==

Series: Add i915 blt library + gem_ccs test (rev5)
URL   : https://patchwork.freedesktop.org/series/99925/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_11235 -> IGTPW_6637
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/index.html

Participating hosts (46 -> 46)
------------------------------

  Additional (2): bat-rpls-2 fi-pnv-d510 
  Missing    (2): fi-bsw-cyan shard-tglu 

Known issues
------------

  Here are the changes found in IGTPW_6637 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@semaphore:
    - fi-hsw-4770:        NOTRUN -> [SKIP][1] ([fdo#109271] / [fdo#109315]) +17 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-hsw-4770/igt@amdgpu/amd_basic@semaphore.html

  * igt@gem_huc_copy@huc-copy:
    - fi-skl-6600u:       NOTRUN -> [SKIP][2] ([fdo#109271] / [i915#2190])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-skl-6600u/igt@gem_huc_copy@huc-copy.html
    - fi-pnv-d510:        NOTRUN -> [SKIP][3] ([fdo#109271]) +57 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-pnv-d510/igt@gem_huc_copy@huc-copy.html

  * igt@gem_lmem_swapping@verify-random:
    - fi-skl-6600u:       NOTRUN -> [SKIP][4] ([fdo#109271] / [i915#4613]) +3 similar issues
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-skl-6600u/igt@gem_lmem_swapping@verify-random.html

  * igt@i915_selftest@live@hangcheck:
    - bat-dg1-6:          [PASS][5] -> [DMESG-FAIL][6] ([i915#4494] / [i915#4957])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/bat-dg1-6/igt@i915_selftest@live@hangcheck.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/bat-dg1-6/igt@i915_selftest@live@hangcheck.html

  * igt@kms_chamelium@vga-edid-read:
    - fi-skl-6600u:       NOTRUN -> [SKIP][7] ([fdo#109271] / [fdo#111827]) +8 similar issues
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-skl-6600u/igt@kms_chamelium@vga-edid-read.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
    - fi-skl-6600u:       NOTRUN -> [SKIP][8] ([fdo#109271]) +2 similar issues
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-skl-6600u/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
    - fi-skl-6600u:       NOTRUN -> [SKIP][9] ([fdo#109271] / [i915#533])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-skl-6600u/igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-b:
    - fi-cfl-8109u:       [PASS][10] -> [DMESG-WARN][11] ([i915#295]) +12 similar issues
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/fi-cfl-8109u/igt@kms_pipe_crc_basic@read-crc-pipe-b.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-cfl-8109u/igt@kms_pipe_crc_basic@read-crc-pipe-b.html

  * igt@kms_psr@primary_page_flip:
    - fi-skl-6600u:       NOTRUN -> [FAIL][12] ([i915#4547])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-skl-6600u/igt@kms_psr@primary_page_flip.html

  * igt@runner@aborted:
    - fi-skl-6600u:       NOTRUN -> [FAIL][13] ([i915#4312])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-skl-6600u/igt@runner@aborted.html

  
#### Possible fixes ####

  * igt@gem_flink_basic@bad-flink:
    - fi-skl-6600u:       [INCOMPLETE][14] ([i915#4547]) -> [PASS][15]
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/fi-skl-6600u/igt@gem_flink_basic@bad-flink.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-skl-6600u/igt@gem_flink_basic@bad-flink.html

  * igt@i915_selftest@live@hangcheck:
    - fi-hsw-4770:        [INCOMPLETE][16] ([i915#3303]) -> [PASS][17]
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-hsw-4770/igt@i915_selftest@live@hangcheck.html

  * igt@kms_frontbuffer_tracking@basic:
    - fi-cml-u2:          [DMESG-WARN][18] ([i915#4269]) -> [PASS][19]
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/fi-cml-u2/igt@kms_frontbuffer_tracking@basic.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/fi-cml-u2/igt@kms_frontbuffer_tracking@basic.html

  
#### Warnings ####

  * igt@i915_selftest@live@hangcheck:
    - bat-dg1-5:          [DMESG-FAIL][20] ([i915#4957]) -> [DMESG-FAIL][21] ([i915#4494] / [i915#4957])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/bat-dg1-5/igt@i915_selftest@live@hangcheck.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/bat-dg1-5/igt@i915_selftest@live@hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2582]: https://gitlab.freedesktop.org/drm/intel/issues/2582
  [i915#295]: https://gitlab.freedesktop.org/drm/intel/issues/295
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#4269]: https://gitlab.freedesktop.org/drm/intel/issues/4269
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#4494]: https://gitlab.freedesktop.org/drm/intel/issues/4494
  [i915#4547]: https://gitlab.freedesktop.org/drm/intel/issues/4547
  [i915#4613]: https://gitlab.freedesktop.org/drm/intel/issues/4613
  [i915#4897]: https://gitlab.freedesktop.org/drm/intel/issues/4897
  [i915#4898]: https://gitlab.freedesktop.org/drm/intel/issues/4898
  [i915#4957]: https://gitlab.freedesktop.org/drm/intel/issues/4957
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_6347 -> IGTPW_6637

  CI-20190529: 20190529
  CI_DRM_11235: 9aacb5966c7d4430c4c3d5eba12f8ef0225e95b6 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_6637: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/index.html
  IGT_6347: 37ea4c86f97c0e05fcb6b04cff72ec927930536e @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git



== Testlist changes ==

+igt@gem_ccs@block-copy-compressed
+igt@gem_ccs@block-copy-inplace
+igt@gem_ccs@block-copy-uncompressed
+igt@gem_ccs@ctrl-surf-copy

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/index.html

* [igt-dev] ✗ Fi.CI.IGT: failure for Add i915 blt library + gem_ccs test (rev5)
  2022-02-16 12:21 [igt-dev] [PATCH i-g-t 0/3] Add i915 blt library + gem_ccs test Zbigniew Kempczyński
                   ` (3 preceding siblings ...)
  2022-02-16 15:46 ` [igt-dev] ✓ Fi.CI.BAT: success for Add i915 blt library + gem_ccs test (rev5) Patchwork
@ 2022-02-16 22:30 ` Patchwork
  4 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2022-02-16 22:30 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

== Series Details ==

Series: Add i915 blt library + gem_ccs test (rev5)
URL   : https://patchwork.freedesktop.org/series/99925/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_11235_full -> IGTPW_6637_full
==============================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with IGTPW_6637_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in IGTPW_6637_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/index.html

Participating hosts (12 -> 8)
------------------------------

  Missing    (4): pig-skl-6260u pig-kbl-iris shard-dg1 pig-glk-j5005 

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_6637_full:

### IGT changes ###

#### Possible regressions ####

  * {igt@gem_ccs@block-copy-compressed} (NEW):
    - shard-tglb:         NOTRUN -> [SKIP][1] +2 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb1/igt@gem_ccs@block-copy-compressed.html
    - {shard-tglu}:       NOTRUN -> [SKIP][2]
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglu-2/igt@gem_ccs@block-copy-compressed.html

  * {igt@gem_ccs@block-copy-uncompressed} (NEW):
    - shard-iclb:         NOTRUN -> [SKIP][3] +3 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb1/igt@gem_ccs@block-copy-uncompressed.html

  * igt@gem_eio@reset-stress:
    - shard-snb:          [PASS][4] -> [CRASH][5]
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/shard-snb5/igt@gem_eio@reset-stress.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-snb6/igt@gem_eio@reset-stress.html

  
New tests
---------

  New tests have been introduced between CI_DRM_11235_full and IGTPW_6637_full:

### New IGT tests (8) ###

  * igt@gem_ccs@block-copy-compressed:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ccs@block-copy-inplace:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ccs@block-copy-uncompressed:
    - Statuses : 5 skip(s)
    - Exec time: [0.0] s

  * igt@gem_ccs@block-copy-uncompressed@linear-uncompressed-compfmt0-smem-smem:
    - Statuses : 2 pass(s)
    - Exec time: [0.02] s

  * igt@gem_ccs@block-copy-uncompressed@tile4-uncompressed-compfmt0-smem-smem:
    - Statuses : 2 pass(s)
    - Exec time: [0.02] s

  * igt@gem_ccs@block-copy-uncompressed@tile64-uncompressed-compfmt0-smem-smem:
    - Statuses : 2 pass(s)
    - Exec time: [0.02] s

  * igt@gem_ccs@block-copy-uncompressed@ymajor-uncompressed-compfmt0-smem-smem:
    - Statuses : 2 pass(s)
    - Exec time: [0.02] s

  * igt@gem_ccs@ctrl-surf-copy:
    - Statuses : 6 skip(s)
    - Exec time: [0.0] s
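
The dynamic subtest names listed above appear to follow a
`<tiling>-<compression>-compfmt<N>-<src region>-<dst region>` pattern. The
following is a minimal sketch, assuming that pattern, of how such a name could
be composed. The helper `compose_subtest_name()`, its parameters, and the
"compressed" spelling are assumptions inferred from the names in this report;
they are not taken from the posted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): build a dynamic subtest
 * name such as "linear-uncompressed-compfmt0-smem-smem".
 *
 * Returns 0 on success, -1 if a component is empty or the result does
 * not fit in the caller's buffer.
 */
int compose_subtest_name(char *buf, size_t len, const char *tiling,
			 bool compressed, int compfmt,
			 const char *src_region, const char *dst_region)
{
	int n;

	assert(buf && tiling && src_region && dst_region);

	/* Refuse empty components so the name never contains "--". */
	if (!strlen(tiling) || !strlen(src_region) || !strlen(dst_region))
		return -1;

	n = snprintf(buf, len, "%s-%s-compfmt%d-%s-%s", tiling,
		     compressed ? "compressed" : "uncompressed",
		     compfmt, src_region, dst_region);

	/* snprintf() returns the untruncated length; reject truncation. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Called with ("tile4", false, 0, "smem", "smem"), the helper would produce the
name reported above for the tile4 case. Rejecting truncated output keeps two
long names from collapsing into the same shortened string.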

  

Known issues
------------

  Here are the changes found in IGTPW_6637_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@feature_discovery@chamelium:
    - shard-tglb:         NOTRUN -> [SKIP][6] ([fdo#111827])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@feature_discovery@chamelium.html
    - shard-iclb:         NOTRUN -> [SKIP][7] ([fdo#111827])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb8/igt@feature_discovery@chamelium.html

  * igt@feature_discovery@display-3x:
    - shard-glk:          NOTRUN -> [SKIP][8] ([fdo#109271]) +99 similar issues
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk8/igt@feature_discovery@display-3x.html
    - shard-iclb:         NOTRUN -> [SKIP][9] ([i915#1839]) +1 similar issue
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb2/igt@feature_discovery@display-3x.html

  * igt@gem_ctx_isolation@preservation-s3@bcs0:
    - shard-kbl:          NOTRUN -> [DMESG-WARN][10] ([i915#180]) +7 similar issues
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl1/igt@gem_ctx_isolation@preservation-s3@bcs0.html

  * igt@gem_ctx_persistence@smoketest:
    - shard-snb:          NOTRUN -> [SKIP][11] ([fdo#109271] / [i915#1099]) +2 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-snb7/igt@gem_ctx_persistence@smoketest.html

  * igt@gem_eio@kms:
    - shard-tglb:         NOTRUN -> [FAIL][12] ([i915#232])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb1/igt@gem_eio@kms.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
    - shard-glk:          NOTRUN -> [FAIL][13] ([i915#2842]) +1 similar issue
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk2/igt@gem_exec_fair@basic-none-solo@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs0:
    - shard-tglb:         NOTRUN -> [FAIL][14] ([i915#2842]) +4 similar issues
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb1/igt@gem_exec_fair@basic-none@vcs0.html

  * igt@gem_exec_fair@basic-none@vcs1:
    - shard-kbl:          NOTRUN -> [FAIL][15] ([i915#2842])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl1/igt@gem_exec_fair@basic-none@vcs1.html

  * igt@gem_exec_fair@basic-none@vecs0:
    - shard-iclb:         NOTRUN -> [FAIL][16] ([i915#2842]) +3 similar issues
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb6/igt@gem_exec_fair@basic-none@vecs0.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - shard-glk:          [PASS][17] -> [FAIL][18] ([i915#2842])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/shard-glk2/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk9/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gem_exec_params@no-vebox:
    - shard-iclb:         NOTRUN -> [SKIP][19] ([fdo#109283])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb2/igt@gem_exec_params@no-vebox.html
    - shard-tglb:         NOTRUN -> [SKIP][20] ([fdo#109283] / [i915#4877])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@gem_exec_params@no-vebox.html

  * igt@gem_exec_params@secure-non-root:
    - shard-tglb:         NOTRUN -> [SKIP][21] ([fdo#112283])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb1/igt@gem_exec_params@secure-non-root.html

  * igt@gem_exec_whisper@basic-queues-priority-all:
    - shard-glk:          [PASS][22] -> [DMESG-WARN][23] ([i915#118])
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/shard-glk4/igt@gem_exec_whisper@basic-queues-priority-all.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk4/igt@gem_exec_whisper@basic-queues-priority-all.html

  * igt@gem_lmem_swapping@heavy-verify-random:
    - shard-tglb:         NOTRUN -> [SKIP][24] ([i915#4613]) +1 similar issue
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@gem_lmem_swapping@heavy-verify-random.html

  * igt@gem_lmem_swapping@parallel-multi:
    - shard-apl:          NOTRUN -> [SKIP][25] ([fdo#109271] / [i915#4613]) +2 similar issues
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-apl6/igt@gem_lmem_swapping@parallel-multi.html

  * igt@gem_lmem_swapping@parallel-random-verify:
    - shard-kbl:          NOTRUN -> [SKIP][26] ([fdo#109271] / [i915#4613]) +4 similar issues
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl7/igt@gem_lmem_swapping@parallel-random-verify.html

  * igt@gem_pxp@create-regular-context-1:
    - shard-iclb:         NOTRUN -> [SKIP][27] ([i915#4270]) +1 similar issue
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb4/igt@gem_pxp@create-regular-context-1.html
    - shard-tglb:         NOTRUN -> [SKIP][28] ([i915#4270]) +2 similar issues
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb3/igt@gem_pxp@create-regular-context-1.html

  * igt@gem_render_copy@linear-to-vebox-y-tiled:
    - shard-iclb:         NOTRUN -> [SKIP][29] ([i915#768]) +1 similar issue
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb4/igt@gem_render_copy@linear-to-vebox-y-tiled.html

  * igt@gem_softpin@evict-snoop-interruptible:
    - shard-tglb:         NOTRUN -> [SKIP][30] ([fdo#109312])
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb5/igt@gem_softpin@evict-snoop-interruptible.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-kbl:          NOTRUN -> [SKIP][31] ([fdo#109271] / [i915#3323])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl6/igt@gem_userptr_blits@dmabuf-sync.html
    - shard-tglb:         NOTRUN -> [SKIP][32] ([i915#3323])
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gem_userptr_blits@unsync-unmap-cycles:
    - shard-tglb:         NOTRUN -> [SKIP][33] ([i915#3297]) +2 similar issues
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb5/igt@gem_userptr_blits@unsync-unmap-cycles.html
    - shard-iclb:         NOTRUN -> [SKIP][34] ([i915#3297])
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb8/igt@gem_userptr_blits@unsync-unmap-cycles.html

  * igt@gen9_exec_parse@allowed-all:
    - shard-iclb:         NOTRUN -> [SKIP][35] ([i915#2856]) +2 similar issues
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb7/igt@gen9_exec_parse@allowed-all.html

  * igt@gen9_exec_parse@bb-start-param:
    - shard-tglb:         NOTRUN -> [SKIP][36] ([i915#2527] / [i915#2856]) +4 similar issues
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@gen9_exec_parse@bb-start-param.html

  * igt@i915_pm_dc@dc3co-vpb-simulation:
    - shard-tglb:         NOTRUN -> [SKIP][37] ([i915#1904])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb3/igt@i915_pm_dc@dc3co-vpb-simulation.html
    - shard-iclb:         NOTRUN -> [SKIP][38] ([i915#588])
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb2/igt@i915_pm_dc@dc3co-vpb-simulation.html

  * igt@i915_pm_dc@dc6-dpms:
    - shard-kbl:          NOTRUN -> [FAIL][39] ([i915#454])
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl7/igt@i915_pm_dc@dc6-dpms.html

  * igt@i915_pm_dc@dc6-psr:
    - shard-tglb:         NOTRUN -> [FAIL][40] ([i915#454]) +1 similar issue
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb3/igt@i915_pm_dc@dc6-psr.html

  * igt@i915_pm_rc6_residency@rc6-fence:
    - shard-tglb:         NOTRUN -> [WARN][41] ([i915#2681] / [i915#2684])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb7/igt@i915_pm_rc6_residency@rc6-fence.html
    - shard-iclb:         NOTRUN -> [WARN][42] ([i915#1804] / [i915#2684])
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb3/igt@i915_pm_rc6_residency@rc6-fence.html

  * igt@i915_pm_rpm@modeset-non-lpsp-stress:
    - shard-tglb:         NOTRUN -> [SKIP][43] ([fdo#111644] / [i915#1397] / [i915#2411])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@i915_pm_rpm@modeset-non-lpsp-stress.html
    - shard-iclb:         NOTRUN -> [SKIP][44] ([fdo#110892])
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb2/igt@i915_pm_rpm@modeset-non-lpsp-stress.html

  * igt@i915_query@query-topology-unsupported:
    - shard-tglb:         NOTRUN -> [SKIP][45] ([fdo#109302])
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@i915_query@query-topology-unsupported.html

  * igt@kms_big_fb@linear-32bpp-rotate-270:
    - shard-iclb:         NOTRUN -> [SKIP][46] ([fdo#110725] / [fdo#111614]) +3 similar issues
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb2/igt@kms_big_fb@linear-32bpp-rotate-270.html

  * igt@kms_big_fb@linear-8bpp-rotate-270:
    - shard-tglb:         NOTRUN -> [SKIP][47] ([fdo#111614]) +7 similar issues
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@kms_big_fb@linear-8bpp-rotate-270.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-180-hflip-async-flip:
    - shard-kbl:          NOTRUN -> [SKIP][48] ([fdo#109271] / [i915#3777]) +9 similar issues
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl6/igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-180-hflip-async-flip.html

  * igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-hflip:
    - shard-apl:          NOTRUN -> [SKIP][49] ([fdo#109271] / [i915#3777]) +6 similar issues
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-apl6/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-hflip.html
    - shard-glk:          NOTRUN -> [SKIP][50] ([fdo#109271] / [i915#3777]) +1 similar issue
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk9/igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-hflip.html

  * igt@kms_big_fb@yf-tiled-8bpp-rotate-90:
    - shard-tglb:         NOTRUN -> [SKIP][51] ([fdo#111615]) +8 similar issues
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb3/igt@kms_big_fb@yf-tiled-8bpp-rotate-90.html
    - shard-iclb:         NOTRUN -> [SKIP][52] ([fdo#110723])
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb2/igt@kms_big_fb@yf-tiled-8bpp-rotate-90.html

  * igt@kms_big_joiner@2x-modeset:
    - shard-tglb:         NOTRUN -> [SKIP][53] ([i915#2705])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb2/igt@kms_big_joiner@2x-modeset.html

  * igt@kms_ccs@pipe-a-ccs-on-another-bo-y_tiled_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][54] ([i915#3689]) +5 similar issues
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb5/igt@kms_ccs@pipe-a-ccs-on-another-bo-y_tiled_ccs.html

  * igt@kms_ccs@pipe-a-crc-primary-rotation-180-y_tiled_gen12_rc_ccs_cc:
    - shard-kbl:          NOTRUN -> [SKIP][55] ([fdo#109271] / [i915#3886]) +17 similar issues
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl4/igt@kms_ccs@pipe-a-crc-primary-rotation-180-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-a-random-ccs-data-yf_tiled_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][56] ([fdo#111615] / [i915#3689]) +7 similar issues
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@kms_ccs@pipe-a-random-ccs-data-yf_tiled_ccs.html

  * igt@kms_ccs@pipe-b-bad-pixel-format-y_tiled_ccs:
    - shard-snb:          NOTRUN -> [SKIP][57] ([fdo#109271]) +197 similar issues
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-snb7/igt@kms_ccs@pipe-b-bad-pixel-format-y_tiled_ccs.html

  * igt@kms_ccs@pipe-b-crc-sprite-planes-basic-y_tiled_gen12_rc_ccs_cc:
    - shard-glk:          NOTRUN -> [SKIP][58] ([fdo#109271] / [i915#3886]) +7 similar issues
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk5/igt@kms_ccs@pipe-b-crc-sprite-planes-basic-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-c-ccs-on-another-bo-y_tiled_gen12_rc_ccs_cc:
    - shard-iclb:         NOTRUN -> [SKIP][59] ([fdo#109278] / [i915#3886]) +10 similar issues
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb5/igt@kms_ccs@pipe-c-ccs-on-another-bo-y_tiled_gen12_rc_ccs_cc.html
    - shard-apl:          NOTRUN -> [SKIP][60] ([fdo#109271] / [i915#3886]) +9 similar issues
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-apl6/igt@kms_ccs@pipe-c-ccs-on-another-bo-y_tiled_gen12_rc_ccs_cc.html

  * igt@kms_ccs@pipe-c-random-ccs-data-y_tiled_gen12_mc_ccs:
    - shard-tglb:         NOTRUN -> [SKIP][61] ([i915#3689] / [i915#3886]) +6 similar issues
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@kms_ccs@pipe-c-random-ccs-data-y_tiled_gen12_mc_ccs.html

  * igt@kms_cdclk@mode-transition:
    - shard-tglb:         NOTRUN -> [SKIP][62] ([i915#3742])
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb1/igt@kms_cdclk@mode-transition.html

  * igt@kms_chamelium@dp-hpd-storm-disable:
    - shard-glk:          NOTRUN -> [SKIP][63] ([fdo#109271] / [fdo#111827]) +12 similar issues
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk2/igt@kms_chamelium@dp-hpd-storm-disable.html

  * igt@kms_chamelium@hdmi-audio-edid:
    - shard-kbl:          NOTRUN -> [SKIP][64] ([fdo#109271] / [fdo#111827]) +27 similar issues
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl7/igt@kms_chamelium@hdmi-audio-edid.html

  * igt@kms_chamelium@hdmi-hpd-enable-disable-mode:
    - shard-iclb:         NOTRUN -> [SKIP][65] ([fdo#109284] / [fdo#111827]) +9 similar issues
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb5/igt@kms_chamelium@hdmi-hpd-enable-disable-mode.html
    - shard-snb:          NOTRUN -> [SKIP][66] ([fdo#109271] / [fdo#111827]) +12 similar issues
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-snb7/igt@kms_chamelium@hdmi-hpd-enable-disable-mode.html

  * igt@kms_color_chamelium@pipe-a-ctm-limited-range:
    - shard-apl:          NOTRUN -> [SKIP][67] ([fdo#109271] / [fdo#111827]) +18 similar issues
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-apl2/igt@kms_color_chamelium@pipe-a-ctm-limited-range.html

  * igt@kms_color_chamelium@pipe-b-ctm-0-5:
    - shard-tglb:         NOTRUN -> [SKIP][68] ([fdo#109284] / [fdo#111827]) +21 similar issues
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@kms_color_chamelium@pipe-b-ctm-0-5.html

  * igt@kms_color_chamelium@pipe-d-gamma:
    - shard-iclb:         NOTRUN -> [SKIP][69] ([fdo#109278] / [fdo#109284] / [fdo#111827]) +1 similar issue
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb8/igt@kms_color_chamelium@pipe-d-gamma.html

  * igt@kms_content_protection@dp-mst-lic-type-0:
    - shard-iclb:         NOTRUN -> [SKIP][70] ([i915#3116])
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb2/igt@kms_content_protection@dp-mst-lic-type-0.html
    - shard-tglb:         NOTRUN -> [SKIP][71] ([i915#3116] / [i915#3299])
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb2/igt@kms_content_protection@dp-mst-lic-type-0.html

  * igt@kms_content_protection@legacy:
    - shard-tglb:         NOTRUN -> [SKIP][72] ([i915#1063])
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@kms_content_protection@legacy.html

  * igt@kms_content_protection@srm:
    - shard-apl:          NOTRUN -> [TIMEOUT][73] ([i915#1319])
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-apl3/igt@kms_content_protection@srm.html

  * igt@kms_cursor_crc@pipe-a-cursor-32x32-sliding:
    - shard-tglb:         NOTRUN -> [SKIP][74] ([i915#3319]) +5 similar issues
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb3/igt@kms_cursor_crc@pipe-a-cursor-32x32-sliding.html

  * igt@kms_cursor_crc@pipe-a-cursor-512x512-rapid-movement:
    - shard-iclb:         NOTRUN -> [SKIP][75] ([fdo#109278] / [fdo#109279]) +4 similar issues
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb8/igt@kms_cursor_crc@pipe-a-cursor-512x512-rapid-movement.html

  * igt@kms_cursor_crc@pipe-c-cursor-max-size-onscreen:
    - shard-tglb:         NOTRUN -> [SKIP][76] ([i915#3359]) +8 similar issues
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb3/igt@kms_cursor_crc@pipe-c-cursor-max-size-onscreen.html

  * igt@kms_cursor_crc@pipe-d-cursor-512x512-sliding:
    - shard-tglb:         NOTRUN -> [SKIP][77] ([fdo#109279] / [i915#3359]) +9 similar issues
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@kms_cursor_crc@pipe-d-cursor-512x512-sliding.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
    - shard-tglb:         NOTRUN -> [SKIP][78] ([i915#4103]) +1 similar issue
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb2/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_cursor_legacy@cursorb-vs-flipa-varying-size:
    - shard-iclb:         NOTRUN -> [SKIP][79] ([fdo#109274] / [fdo#109278])
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb6/igt@kms_cursor_legacy@cursorb-vs-flipa-varying-size.html

  * igt@kms_dp_tiled_display@basic-test-pattern-with-chamelium:
    - shard-tglb:         NOTRUN -> [SKIP][80] ([i915#3528])
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb3/igt@kms_dp_tiled_display@basic-test-pattern-with-chamelium.html

  * igt@kms_flip@2x-flip-vs-rmfb:
    - shard-tglb:         NOTRUN -> [SKIP][81] ([fdo#109274] / [fdo#111825]) +13 similar issues
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb2/igt@kms_flip@2x-flip-vs-rmfb.html

  * igt@kms_flip@2x-nonexisting-fb:
    - shard-iclb:         NOTRUN -> [SKIP][82] ([fdo#109274]) +5 similar issues
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb3/igt@kms_flip@2x-nonexisting-fb.html

  * igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytile-downscaling:
    - shard-iclb:         [PASS][83] -> [SKIP][84] ([i915#3701])
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/shard-iclb8/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytile-downscaling.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb2/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytile-downscaling.html

  * igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilercccs-downscaling:
    - shard-iclb:         NOTRUN -> [SKIP][85] ([i915#2587])
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb1/igt@kms_flip_scaled_crc@flip-64bpp-ytile-to-32bpp-ytilercccs-downscaling.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-pwrite:
    - shard-glk:          [PASS][86] -> [FAIL][87] ([i915#1888] / [i915#2546])
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/shard-glk6/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-pwrite.html
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk1/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-pwrite.html
    - shard-iclb:         [PASS][88] -> [FAIL][89] ([i915#1888] / [i915#2546])
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/shard-iclb5/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-pwrite.html
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb6/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-pwrite.html

  * igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-cur-indfb-draw-blt:
    - shard-kbl:          NOTRUN -> [SKIP][90] ([fdo#109271]) +324 similar issues
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl3/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-cur-indfb-draw-blt.html

  * igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-cur-indfb-onoff:
    - shard-tglb:         NOTRUN -> [SKIP][91] ([fdo#109280] / [fdo#111825]) +46 similar issues
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb3/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-cur-indfb-onoff.html

  * igt@kms_frontbuffer_tracking@psr-2p-primscrn-pri-indfb-draw-render:
    - shard-iclb:         NOTRUN -> [SKIP][92] ([fdo#109280]) +26 similar issues
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb7/igt@kms_frontbuffer_tracking@psr-2p-primscrn-pri-indfb-draw-render.html

  * igt@kms_hdr@static-toggle-dpms:
    - shard-tglb:         NOTRUN -> [SKIP][93] ([i915#1187]) +1 similar issue
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb1/igt@kms_hdr@static-toggle-dpms.html

  * igt@kms_multipipe_modeset@basic-max-pipe-crc-check:
    - shard-tglb:         NOTRUN -> [SKIP][94] ([i915#1839]) +2 similar issues
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb3/igt@kms_multipipe_modeset@basic-max-pipe-crc-check.html

  * igt@kms_pipe_b_c_ivb@from-pipe-c-to-b-with-3-lanes:
    - shard-iclb:         NOTRUN -> [SKIP][95] ([fdo#109289])
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb2/igt@kms_pipe_b_c_ivb@from-pipe-c-to-b-with-3-lanes.html
    - shard-tglb:         NOTRUN -> [SKIP][96] ([fdo#109289]) +3 similar issues
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb2/igt@kms_pipe_b_c_ivb@from-pipe-c-to-b-with-3-lanes.html

  * igt@kms_pipe_crc_basic@disable-crc-after-crtc-pipe-d:
    - shard-glk:          NOTRUN -> [SKIP][97] ([fdo#109271] / [i915#533])
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk6/igt@kms_pipe_crc_basic@disable-crc-after-crtc-pipe-d.html
    - shard-apl:          NOTRUN -> [SKIP][98] ([fdo#109271] / [i915#533])
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-apl6/igt@kms_pipe_crc_basic@disable-crc-after-crtc-pipe-d.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-d:
    - shard-kbl:          NOTRUN -> [SKIP][99] ([fdo#109271] / [i915#533]) +2 similar issues
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl3/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-d.html

  * igt@kms_plane@plane-panning-bottom-right-suspend@pipe-a-planes:
    - shard-apl:          [PASS][100] -> [DMESG-WARN][101] ([i915#180]) +3 similar issues
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11235/shard-apl8/igt@kms_plane@plane-panning-bottom-right-suspend@pipe-a-planes.html
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-apl4/igt@kms_plane@plane-panning-bottom-right-suspend@pipe-a-planes.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-opaque-fb:
    - shard-apl:          NOTRUN -> [FAIL][102] ([fdo#108145] / [i915#265]) +2 similar issues
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-apl4/igt@kms_plane_alpha_blend@pipe-a-alpha-opaque-fb.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb:
    - shard-apl:          NOTRUN -> [FAIL][103] ([i915#265])
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-apl6/igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb.html
    - shard-glk:          NOTRUN -> [FAIL][104] ([i915#265])
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk6/igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb.html
    - shard-kbl:          NOTRUN -> [FAIL][105] ([i915#265])
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl1/igt@kms_plane_alpha_blend@pipe-a-alpha-transparent-fb.html

  * igt@kms_plane_alpha_blend@pipe-b-alpha-basic:
    - shard-kbl:          NOTRUN -> [FAIL][106] ([fdo#108145] / [i915#265]) +1 similar issue
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl4/igt@kms_plane_alpha_blend@pipe-b-alpha-basic.html

  * igt@kms_plane_alpha_blend@pipe-b-constant-alpha-max:
    - shard-glk:          NOTRUN -> [FAIL][107] ([fdo#108145] / [i915#265])
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk3/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-max.html

  * igt@kms_plane_alpha_blend@pipe-d-constant-alpha-max:
    - shard-iclb:         NOTRUN -> [SKIP][108] ([fdo#109278]) +37 similar issues
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb5/igt@kms_plane_alpha_blend@pipe-d-constant-alpha-max.html

  * igt@kms_plane_lowres@pipe-a-tiling-yf:
    - shard-iclb:         NOTRUN -> [SKIP][109] ([i915#3536]) +2 similar issues
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb6/igt@kms_plane_lowres@pipe-a-tiling-yf.html
    - shard-glk:          NOTRUN -> [DMESG-FAIL][110] ([i915#118] / [i915#1888])
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk1/igt@kms_plane_lowres@pipe-a-tiling-yf.html

  * igt@kms_plane_lowres@pipe-b-tiling-y:
    - shard-tglb:         NOTRUN -> [SKIP][111] ([i915#3536]) +3 similar issues
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb8/igt@kms_plane_lowres@pipe-b-tiling-y.html

  * igt@kms_plane_lowres@pipe-b-tiling-yf:
    - shard-tglb:         NOTRUN -> [SKIP][112] ([fdo#111615] / [fdo#112054]) +1 similar issue
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@kms_plane_lowres@pipe-b-tiling-yf.html

  * igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping:
    - shard-apl:          NOTRUN -> [SKIP][113] ([fdo#109271] / [i915#2733])
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-apl1/igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping.html
    - shard-kbl:          NOTRUN -> [SKIP][114] ([fdo#109271] / [i915#2733])
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-kbl1/igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping.html
    - shard-glk:          NOTRUN -> [SKIP][115] ([fdo#109271] / [i915#2733])
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-glk4/igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping.html

  * igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area:
    - shard-iclb:         NOTRUN -> [SKIP][116] ([fdo#111068] / [i915#658])
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb3/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area.html

  * igt@kms_psr2_sf@plane-move-sf-dmg-area:
    - shard-tglb:         NOTRUN -> [SKIP][117] ([i915#2920]) +1 similar issue
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb6/igt@kms_psr2_sf@plane-move-sf-dmg-area.html

  * igt@kms_psr2_su@frontbuffer-xrgb8888:
    - shard-iclb:         NOTRUN -> [SKIP][118] ([fdo#109642] / [fdo#111068] / [i915#658])
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-iclb3/igt@kms_psr2_su@frontbuffer-xrgb8888.html

  * igt@kms_psr2_su@page_flip-xrgb8888:
    - shard-tglb:         NOTRUN -> [SKIP][119] ([i915#1911]) +1 similar issue
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/shard-tglb3/igt@kms_psr2_su@page_flip-xrgb8888.html
    - shard-kbl:          NOTRUN -> [SKIP][120] ([fdo#109271] /

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_6637/index.html


* Re: [igt-dev] [PATCH i-g-t 2/3] lib/i915_blt: Add library for blitter
  2022-03-01 22:57   ` Kamil Konieczny
@ 2022-03-04 13:37     ` Zbigniew Kempczyński
  0 siblings, 0 replies; 12+ messages in thread
From: Zbigniew Kempczyński @ 2022-03-04 13:37 UTC (permalink / raw)
  To: Kamil Konieczny, igt-dev, Mauro Chehab

On Tue, Mar 01, 2022 at 11:57:25PM +0100, Kamil Konieczny wrote:

<cut>

> > +
> > +	union {
> > +		/* DG2, XEHP */
> > +		uint32_t src_clear_address_hi0;
> > +		/* Others */
> > +		uint32_t src_clear_address_hi1;
> > +	} dw13;
> 
> Is a union really needed here? If you add a struct with bitfields
> inside it would be fine, but for a plain uint32_t this seems
> redundant.
>

At the moment both members are the same; in the future hi1 will be
redefined as a bitmask.
 
> > +
> > +	struct {
> > +		uint32_t dst_compression_format:	BITRANGE(0, 4);
> > +		uint32_t dst_clear_value_enable:	BITRANGE(5, 5);
> > +		uint32_t dst_clear_address_low:		BITRANGE(6, 31);
> > +	} dw14;
> > +
> > +	union {
> > +		/* DG2, XEHP */
> > +		uint32_t dst_clear_address_hi0;
> > +		/* Others */
> > +		uint32_t dst_clear_address_hi1;
> > +	} dw15;
> 
> Same here.
> 
> > +
> > +	struct {
> > +		uint32_t dst_surface_height:		BITRANGE(0, 13);
> > +		uint32_t dst_surface_width:		BITRANGE(14, 27);
> > +		uint32_t rsvd0:				BITRANGE(28, 28);
> > +		uint32_t dst_surface_type:		BITRANGE(29, 31);
> > +	} dw16;
> > +
> > +	struct {
> > +		uint32_t dst_lod:			BITRANGE(0, 3);
> > +		uint32_t dst_surface_qpitch:		BITRANGE(4, 18);
> > +		uint32_t rsvd0:				BITRANGE(19, 20);
> > +		uint32_t dst_surface_depth:		BITRANGE(21, 31);
> > +	} dw17;
> > +
> > +	struct {
> > +		uint32_t dst_horizontal_align:		BITRANGE(0, 1);
> > +		uint32_t rsvd0:				BITRANGE(2, 2);
> > +		uint32_t dst_vertical_align:		BITRANGE(3, 4);
> > +		uint32_t rsvd1:				BITRANGE(5, 7);
> > +		uint32_t dst_mip_tail_start_lod:	BITRANGE(8, 11);
> > +		uint32_t rsvd2:				BITRANGE(12, 17);
> > +		uint32_t dst_depth_stencil_resource:	BITRANGE(18, 18);
> > +		uint32_t rsvd3:				BITRANGE(19, 20);
> > +		uint32_t dst_array_index:		BITRANGE(21, 31);
> > +	} dw18;
> > +
> > +	struct {
> > +		uint32_t src_surface_height:		BITRANGE(0, 13);
> > +		uint32_t src_surface_width:		BITRANGE(14, 27);
> > +		uint32_t rsvd0:				BITRANGE(28, 28);
> > +		uint32_t src_surface_type:		BITRANGE(29, 31);
> > +	} dw19;
> > +
> > +	struct {
> > +		uint32_t src_lod:			BITRANGE(0, 3);
> > +		uint32_t src_surface_qpitch:		BITRANGE(4, 18);
> > +		uint32_t rsvd0:				BITRANGE(19, 20);
> > +		uint32_t src_surface_depth:		BITRANGE(21, 31);
> > +	} dw20;
> > +
> > +	struct {
> > +		uint32_t src_horizontal_align:		BITRANGE(0, 1);
> > +		uint32_t rsvd0:				BITRANGE(2, 2);
> > +		uint32_t src_vertical_align:		BITRANGE(3, 4);
> > +		uint32_t rsvd1:				BITRANGE(5, 7);
> > +		uint32_t src_mip_tail_start_lod:	BITRANGE(8, 11);
> > +		uint32_t rsvd2:				BITRANGE(12, 17);
> > +		uint32_t src_depth_stencil_resource:	BITRANGE(18, 18);
> > +		uint32_t rsvd3:				BITRANGE(19, 20);
> > +		uint32_t src_array_index:		BITRANGE(21, 31);
> > +	} dw21;
> > +};
> > +
> > +/**
> > + * blt_supports_compression:
> > + * @i915: drm fd
> > + *
> > + * Function returns does HW supports flatccs compression in blitter commands
> ---------------------- ^
> s/does/true if/

Ok.

> 
> > + * on @i915 device.
> > + *
> > + * Returns:
> > + * true if it does, false otherwise.
> > + */
> > +bool blt_supports_compression(int i915)
> > +{
> > +	uint32_t devid = intel_get_drm_devid(i915);
> > +
> > +	return HAS_FLATCCS(devid);
> > +}
> > +
> > +/**
> > + * blt_supports_tiling:
> > + * @i915: drm fd
> > + * @tiling: tiling id
> > + *
> > + * Function returns does blitter supports @tiling on @i915 device.
> ---------------------- ^
> same here, either "returns true if" or "checks if"

Ok.

> 
> > + *
> > + * Returns:
> > + * true if it does, false otherwise.
> > + */
> > +bool blt_supports_tiling(int i915, enum blt_tiling tiling)
> > +{
> > +	uint32_t devid = intel_get_drm_devid(i915);
> > +
> > +	if (tiling == T_XMAJOR) {
> > +		if (IS_TIGERLAKE(devid) || IS_DG1(devid))
> > +			return false;
> > +		else
> > +			return true;
> > +	}
> > +
> > +	if (tiling == T_YMAJOR) {
> > +		if (IS_TIGERLAKE(devid) || IS_DG1(devid))
> > +			return true;
> > +		else
> > +			return false;
> > +	}
> > +
> > +	return true;
> > +}
> > +
> > +/**
> > + * blt_tiling_name:
> > + * @tiling: tiling id
> > + *
> > + * Returns:
> > + * name of @tiling passed. Useful to build test names or sth.
> s/ or sth//
> 
Ok.

> > + */
> > +const char *blt_tiling_name(enum blt_tiling tiling)
> > +{
> > +	switch (tiling) {
> > +	case T_LINEAR: return "linear";
> > +	case T_XMAJOR: return "xmajor";
> > +	case T_YMAJOR: return "ymajor";
> > +	case T_TILE4:  return "tile4";
> > +	case T_TILE64: return "tile64";
> > +	}
> > +
> > +	igt_warn("invalid tiling passed: %d\n", tiling);
> > +	return NULL;
> > +}
> > +
> > +static int __block_tiling(enum blt_tiling tiling)
> > +{
> > +	switch (tiling) {
> > +	case T_LINEAR: return 0;
> > +	case T_XMAJOR: return 1;
> > +	case T_YMAJOR: return 1;
> > +	case T_TILE4:  return 2;
> > +	case T_TILE64: return 3;
> > +	}
> > +
> > +	igt_warn("invalid tiling passed: %d\n", tiling);
> > +	return 0;
> > +}
> > +
> > +static int __special_mode(const struct blt_copy_data *blt)
> > +{
> > +	if (blt->src.handle == blt->dst.handle &&
> > +	    blt->src.compression && !blt->dst.compression)
> > +		return SM_FULL_RESOLVE;
> > +
> > +
> > +	return SM_NONE;
> > +}
> > +
> > +static int __memory_type(uint32_t region)
> > +{
> > +	igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
> > +		     IS_SYSTEM_MEMORY_REGION(region),
> > +		     "Invalid region: %x\n", region);
> > +
> > +	if (IS_DEVICE_MEMORY_REGION(region))
> > +		return TM_LOCAL_MEM;
> > +	return TM_SYSTEM_MEM;
> > +}
> > +
> > +static enum blt_aux_mode __aux_mode(const struct blt_copy_object *obj)
> > +{
> > +	if (obj->compression == COMPRESSION_ENABLED) {
> > +		igt_assert_f(IS_DEVICE_MEMORY_REGION(obj->region),
> > +			     "XY_BLOCK_COPY_BLT supports compression "
> > +			     "on device memory only\n");
> > +		return AM_AUX_CCS_E;
> > +	}
> > +
> > +	return AM_AUX_NONE;
> > +}
> > +
> > +static void fill_data(struct gen12_block_copy_data *data,
> > +		      const struct blt_copy_data *blt,
> > +		      uint64_t src_offset, uint64_t dst_offset,
> > +		      bool extended_command)
> > +{
> > +	data->dw00.client = 0x2;
> > +	data->dw00.opcode = 0x41;
> > +	data->dw00.color_depth = blt->color_depth;
> > +	data->dw00.special_mode = __special_mode(blt);
> > +	data->dw00.length = extended_command ? 20 : 10;
> > +
> > +	data->dw01.dst_pitch = blt->dst.pitch - 1;
> > +	data->dw01.dst_aux_mode = __aux_mode(&blt->dst);
> > +	data->dw01.dst_mocs = blt->dst.mocs;
> > +	data->dw01.dst_compression = blt->dst.compression;
> > +	data->dw01.dst_tiling = __block_tiling(blt->dst.tiling);
> > +
> > +	if (blt->dst.compression)
> > +		data->dw01.dst_ctrl_surface_type = blt->dst.compression_type;
> > +
> > +	data->dw02.dst_x1 = blt->dst.x1;
> > +	data->dw02.dst_y1 = blt->dst.y1;
> > +
> > +	data->dw03.dst_x2 = blt->dst.x2;
> > +	data->dw03.dst_y2 = blt->dst.y2;
> > +
> > +	data->dw04.dst_address_lo = dst_offset;
> > +	data->dw05.dst_address_hi = dst_offset >> 32;
> > +
> > +	data->dw06.dst_x_offset = blt->dst.x_offset;
> > +	data->dw06.dst_y_offset = blt->dst.y_offset;
> > +	data->dw06.dst_target_memory = __memory_type(blt->dst.region);
> > +
> > +	data->dw07.src_x1 = blt->src.x1;
> > +	data->dw07.src_y1 = blt->src.y1;
> > +
> > +	data->dw08.src_pitch = blt->src.pitch - 1;
> > +	data->dw08.src_aux_mode = __aux_mode(&blt->src);
> > +	data->dw08.src_mocs = blt->src.mocs;
> > +	data->dw08.src_compression = blt->src.compression;
> > +	data->dw08.src_tiling = __block_tiling(blt->src.tiling);
> > +
> > +	if (blt->src.compression)
> > +		data->dw08.src_ctrl_surface_type = blt->src.compression_type;
> > +
> > +	data->dw09.src_address_lo = src_offset;
> > +	data->dw10.src_address_hi = src_offset >> 32;
> > +
> > +	data->dw11.src_x_offset = blt->src.x_offset;
> > +	data->dw11.src_y_offset = blt->src.y_offset;
> > +	data->dw11.src_target_memory = __memory_type(blt->src.region);
> > +}
> > +
> > +static void fill_data_ext(int i915,
> 
> Parameter i915 is not used here, remove it.

Ok. It was added for future gens, but for now it can be removed.

> 
> > +			  struct gen12_block_copy_data_ext *dext,
> > +			  const struct blt_block_copy_data_ext *ext)
> > +{
> > +	dext->dw12.src_compression_format = ext->src.compression_format;
> > +	dext->dw12.src_clear_value_enable = ext->src.clear_value_enable;
> > +	dext->dw12.src_clear_address_low = ext->src.clear_address;
> > +
> > +	dext->dw13.src_clear_address_hi0 = ext->src.clear_address >> 32;
> > +
> > +	dext->dw14.dst_compression_format = ext->dst.compression_format;
> > +	dext->dw14.dst_clear_value_enable = ext->dst.clear_value_enable;
> > +	dext->dw14.dst_clear_address_low = ext->dst.clear_address;
> > +
> > +	dext->dw15.dst_clear_address_hi0 = ext->dst.clear_address >> 32;
> > +
> > +	dext->dw16.dst_surface_width = ext->dst.surface_width - 1;
> > +	dext->dw16.dst_surface_height = ext->dst.surface_height - 1;
> > +	dext->dw16.dst_surface_type = ext->dst.surface_type;
> > +
> > +	dext->dw17.dst_lod = ext->dst.lod;
> > +	dext->dw17.dst_surface_depth = ext->dst.surface_depth;
> > +	dext->dw17.dst_surface_qpitch = ext->dst.surface_qpitch;
> > +
> > +	dext->dw18.dst_horizontal_align = ext->dst.horizontal_align;
> > +	dext->dw18.dst_vertical_align = ext->dst.vertical_align;
> > +	dext->dw18.dst_mip_tail_start_lod = ext->dst.mip_tail_start_lod;
> > +	dext->dw18.dst_depth_stencil_resource = ext->dst.depth_stencil_resource;
> > +	dext->dw18.dst_array_index = ext->dst.array_index;
> > +
> > +	dext->dw19.src_surface_width = ext->src.surface_width - 1;
> > +	dext->dw19.src_surface_height = ext->src.surface_height - 1;
> > +
> > +	dext->dw19.src_surface_type = ext->src.surface_type;
> > +
> > +	dext->dw20.src_lod = ext->src.lod;
> > +	dext->dw20.src_surface_depth = ext->src.surface_depth;
> > +	dext->dw20.src_surface_qpitch = ext->src.surface_qpitch;
> > +
> > +	dext->dw21.src_horizontal_align = ext->src.horizontal_align;
> > +	dext->dw21.src_vertical_align = ext->src.vertical_align;
> > +	dext->dw21.src_mip_tail_start_lod = ext->src.mip_tail_start_lod;
> > +	dext->dw21.src_depth_stencil_resource = ext->src.depth_stencil_resource;
> > +	dext->dw21.src_array_index = ext->src.array_index;
> > +}
> > +
> > +static void dump_bb_cmd(struct gen12_block_copy_data *data)
> > +{
> > +	uint32_t *cmd = (uint32_t *) data;
> > +
> > +	igt_info("details:\n");
> > +	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, color depth: %d, "
> > +		 "special mode: %d, length: %d>\n",
> > +		 cmd[0],
> > +		 data->dw00.client, data->dw00.opcode, data->dw00.color_depth,
> > +		 data->dw00.special_mode, data->dw00.length);
> > +	igt_info(" dw01: [%08x] dst <pitch: %d, aux: %d, mocs: %d, compr: %d, "
> > +		 "tiling: %d, ctrl surf type: %d>\n",
> > +		 cmd[1], data->dw01.dst_pitch, data->dw01.dst_aux_mode,
> > +		 data->dw01.dst_mocs, data->dw01.dst_compression,
> > +		 data->dw01.dst_tiling, data->dw01.dst_ctrl_surface_type);
> > +	igt_info(" dw02: [%08x] dst geom <x1: %d, y1: %d>\n",
> > +		 cmd[2], data->dw02.dst_x1, data->dw02.dst_y1);
> > +	igt_info(" dw03: [%08x]          <x2: %d, y2: %d>\n",
> > +		 cmd[3], data->dw03.dst_x2, data->dw03.dst_y2);
> > +	igt_info(" dw04: [%08x] dst offset lo (0x%x)\n",
> > +		 cmd[4], data->dw04.dst_address_lo);
> > +	igt_info(" dw05: [%08x] dst offset hi (0x%x)\n",
> > +		 cmd[5], data->dw05.dst_address_hi);
> > +	igt_info(" dw06: [%08x] dst <x offset: 0x%x, y offset: 0x%0x, target mem: %d>\n",
> > +		 cmd[6], data->dw06.dst_x_offset, data->dw06.dst_y_offset,
> > +		 data->dw06.dst_target_memory);
> > +	igt_info(" dw07: [%08x] src geom <x1: %d, y1: %d>\n",
> > +		 cmd[7], data->dw07.src_x1, data->dw07.src_y1);
> > +	igt_info(" dw08: [%08x] src <pitch: %d, aux: %d, mocs: %d, compr: %d, "
> > +		 "tiling: %d, ctrl surf type: %d>\n",
> > +		 cmd[8], data->dw08.src_pitch, data->dw08.src_aux_mode,
> > +		 data->dw08.src_mocs, data->dw08.src_compression,
> > +		 data->dw08.src_tiling, data->dw08.src_ctrl_surface_type);
> > +	igt_info(" dw09: [%08x] src offset lo (0x%x)\n",
> > +		 cmd[9], data->dw09.src_address_lo);
> > +	igt_info(" dw10: [%08x] src offset hi (0x%x)\n",
> > +		 cmd[10], data->dw10.src_address_hi);
> > +	igt_info(" dw11: [%08x] src <x offset: 0x%x, y offset: 0x%0x, target mem: %d>\n",
> > +		 cmd[11], data->dw11.src_x_offset, data->dw11.src_y_offset,
> > +		 data->dw11.src_target_memory);
> > +}
> > +
> > +static void dump_bb_ext(struct gen12_block_copy_data_ext *data)
> > +{
> > +	uint32_t *cmd = (uint32_t *) data;
> > +
> > +	igt_info("ext details:\n");
> > +	igt_info(" dw12: [%08x] src <compression fmt: %d, clear value enable: %d, "
> > +		 "clear address low: 0x%x>\n",
> > +		 cmd[0],
> > +		 data->dw12.src_compression_format,
> > +		 data->dw12.src_clear_value_enable,
> > +		 data->dw12.src_clear_address_low);
> > +	igt_info(" dw13: [%08x] src clear address hi: 0x%x\n",
> > +		 cmd[1], data->dw13.src_clear_address_hi0);
> > +	igt_info(" dw14: [%08x] dst <compression fmt: %d, clear value enable: %d, "
> > +		 "clear address low: 0x%x>\n",
> > +		 cmd[2],
> > +		 data->dw14.dst_compression_format,
> > +		 data->dw14.dst_clear_value_enable,
> > +		 data->dw14.dst_clear_address_low);
> > +	igt_info(" dw15: [%08x] dst clear address hi: 0x%x\n",
> > +		 cmd[3], data->dw15.dst_clear_address_hi0);
> > +	igt_info(" dw16: [%08x] dst surface <width: %d, height: %d, type: %d>\n",
> > +		 cmd[4], data->dw16.dst_surface_width,
> > +		 data->dw16.dst_surface_height, data->dw16.dst_surface_type);
> > +	igt_info(" dw17: [%08x] dst surface <lod: %d, depth: %d, qpitch: %d>\n",
> > +		 cmd[5], data->dw17.dst_lod,
> > +		 data->dw17.dst_surface_depth, data->dw17.dst_surface_qpitch);
> > +	igt_info(" dw18: [%08x] dst <halign: %d, valign: %d, mip tail: %d, "
> > +		 "depth stencil: %d, array index: %d>\n",
> > +		 cmd[6],
> > +		 data->dw18.dst_horizontal_align,
> > +		 data->dw18.dst_vertical_align,
> > +		 data->dw18.dst_mip_tail_start_lod,
> > +		 data->dw18.dst_depth_stencil_resource,
> > +		 data->dw18.dst_array_index);
> > +
> > +	igt_info(" dw19: [%08x] src surface <width: %d, height: %d, type: %d>\n",
> > +		 cmd[7], data->dw19.src_surface_width,
> > +		 data->dw19.src_surface_height, data->dw19.src_surface_type);
> > +	igt_info(" dw20: [%08x] src surface <lod: %d, depth: %d, qpitch: %d>\n",
> > +		 cmd[8], data->dw20.src_lod,
> > +		 data->dw20.src_surface_depth, data->dw20.src_surface_qpitch);
> > +	igt_info(" dw21: [%08x] src <halign: %d, valign: %d, mip tail: %d, "
> > +		 "depth stencil: %d, array index: %d>\n",
> > +		 cmd[9],
> > +		 data->dw21.src_horizontal_align,
> > +		 data->dw21.src_vertical_align,
> > +		 data->dw21.src_mip_tail_start_lod,
> > +		 data->dw21.src_depth_stencil_resource,
> > +		 data->dw21.src_array_index);
> > +}
> > +
> > +/**
> > + * blt_block_copy:
> > + * @i915: drm fd
> > + * @ctx: intel_ctx_t context
> > + * @e: blitter engine for @ctx
> > + * @ahnd: allocator handle
> > + * @blt: basic blitter data (for TGL/DG1 which doesn't support ext version)
> > + * @ext: extended blitter data (for DG2+, supports flatccs compression)
> > + *
> > + * Function does blit between @src and @dst described in @blt object.
> > + *
> > + * Returns:
> > + * execbuffer status.
> > + */
> > +int blt_block_copy(int i915,
> > +		   const intel_ctx_t *ctx,
> > +		   const struct intel_execution_engine2 *e,
> > +		   uint64_t ahnd,
> > +		   const struct blt_copy_data *blt,
> > +		   const struct blt_block_copy_data_ext *ext)
> > +{
> > +	struct drm_i915_gem_execbuffer2 execbuf = {};
> > +	struct drm_i915_gem_exec_object2 obj[3] = {};
> > +	struct gen12_block_copy_data data = {};
> > +	struct gen12_block_copy_data_ext dext = {};
> > +	uint64_t dst_offset, src_offset, bb_offset, alignment;
> > +	uint32_t *bb;
> > +	int i, ret;
> > +
> > +	igt_assert_f(ahnd, "block-copy supports softpin only\n");
> > +	igt_assert_f(blt, "block-copy requires data to do blit\n");
> > +
> > +	alignment = gem_detect_safe_alignment(i915);
> > +	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> > +	if (__special_mode(blt) == SM_FULL_RESOLVE)
> > +		dst_offset = src_offset;
> > +	else
> > +		dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> > +	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> > +
> > +	fill_data(&data, blt, src_offset, dst_offset, ext);
> > +
> > +	i = sizeof(data) / sizeof(uint32_t);
> > +	bb = gem_mmap__device_coherent(i915, blt->bb.handle, 0, blt->bb.size,
> > +				       PROT_READ | PROT_WRITE);
> > +	memcpy(bb, &data, sizeof(data));
> > +
> > +	if (ext) {
> > +		fill_data_ext(i915, &dext, ext);
> > +		memcpy(bb + i, &dext, sizeof(dext));
> > +		i += sizeof(dext) / sizeof(uint32_t);
> > +	}
> > +	bb[i++] = MI_BATCH_BUFFER_END;
> > +
> > +	if (blt->print_bb) {
> > +		igt_info("[BLOCK COPY]\n");
> > +		igt_info("src offset: %llx, dst offset: %llx, bb offset: %llx\n",
> > +			 (long long) src_offset, (long long) dst_offset,
> > +			 (long long) bb_offset);
> > +
> > +		dump_bb_cmd(&data);
> > +		if (ext)
> > +			dump_bb_ext(&dext);
> > +	}
> > +
> > +	munmap(bb, blt->bb.size);
> > +
> > +	obj[0].offset = CANONICAL(dst_offset);
> > +	obj[1].offset = CANONICAL(src_offset);
> > +	obj[2].offset = CANONICAL(bb_offset);
> > +	obj[0].handle = blt->dst.handle;
> > +	obj[1].handle = blt->src.handle;
> > +	obj[2].handle = blt->bb.handle;
> > +	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> > +		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +	execbuf.buffer_count = 3;
> > +	execbuf.buffers_ptr = to_user_pointer(obj);
> > +	execbuf.rsvd1 = ctx ? ctx->id : 0;
> > +	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> > +	ret = __gem_execbuf(i915, &execbuf);
> > +	if (data.dw00.special_mode != SM_FULL_RESOLVE)
> > +		put_offset(ahnd, blt->dst.handle);
> > +	put_offset(ahnd, blt->src.handle);
> > +	put_offset(ahnd, blt->bb.handle);
> > +
> > +	return ret;
> > +}
> > +
> > +static uint16_t __ccs_size(const struct blt_ctrl_surf_copy_data *surf)
> > +{
> > +	uint32_t src_size, dst_size;
> > +
> > +	src_size = surf->src.access_type == DIRECT_ACCESS ?
> > +				surf->src.size : surf->src.size / CCS_RATIO;
> > +
> > +	dst_size = surf->dst.access_type == DIRECT_ACCESS ?
> > +				surf->dst.size : surf->dst.size / CCS_RATIO;
> > +
> > +	igt_assert_f(src_size <= dst_size, "dst size must be >= src size for CCS copy\n");
> > +
> > +	return src_size;
> > +}
> > +
> > +struct gen12_ctrl_surf_copy_data {
> > +	struct {
> > +		uint32_t length:			BITRANGE(0, 7);
> > +		uint32_t size_of_ctrl_copy:		BITRANGE(8, 17);
> > +		uint32_t rsvd0:				BITRANGE(18, 19);
> > +		uint32_t dst_access_type:		BITRANGE(20, 20);
> > +		uint32_t src_access_type:		BITRANGE(21, 21);
> > +		uint32_t opcode:			BITRANGE(22, 28);
> > +		uint32_t client:			BITRANGE(29, 31);
> > +	} dw00;
> > +
> > +	struct {
> > +		uint32_t src_address_lo;
> > +	} dw01;
> > +
> > +	struct {
> > +		uint32_t src_address_hi:		BITRANGE(0, 24);
> > +		uint32_t src_mocs:			BITRANGE(25, 31);
> > +	} dw02;
> > +
> > +	struct {
> > +		uint32_t dst_address_lo;
> > +	} dw03;
> > +
> > +	struct {
> > +		uint32_t dst_address_hi:		BITRANGE(0, 24);
> > +		uint32_t dst_mocs:			BITRANGE(25, 31);
> > +	} dw04;
> > +};
> > +
> > +static void dump_bb_surf_ctrl_cmd(const struct gen12_ctrl_surf_copy_data *data)
> > +{
> > +	uint32_t *cmd = (uint32_t *) data;
> > +
> > +	igt_info("details:\n");
> > +	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, "
> > +		 "src/dst access type: <%d, %d>, size of ctrl copy: %u, length: %d>\n",
> > +		 cmd[0],
> > +		 data->dw00.client, data->dw00.opcode,
> > +		 data->dw00.src_access_type, data->dw00.dst_access_type,
> > +		 data->dw00.size_of_ctrl_copy, data->dw00.length);
> > +	igt_info(" dw01: [%08x] src offset lo (0x%x)\n",
> > +		 cmd[1], data->dw01.src_address_lo);
> > +	igt_info(" dw02: [%08x] src offset hi (0x%x), src mocs: %u\n",
> > +		 cmd[2], data->dw02.src_address_hi, data->dw02.src_mocs);
> > +	igt_info(" dw03: [%08x] dst offset lo (0x%x)\n",
> > +		 cmd[3], data->dw03.dst_address_lo);
> > +	igt_info(" dw04: [%08x] dst offset hi (0x%x), src mocs: %u\n",
> > +		 cmd[4], data->dw04.dst_address_hi, data->dw04.dst_mocs);
> > +}
> > +
> > +/**
> > + * blt_ctrl_surf_copy:
> > + * @i915: drm fd
> > + * @ctx: intel_ctx_t context
> > + * @e: blitter engine for @ctx
> > + * @ahnd: allocator handle
> > + * @surf: blitter data for ctrl-surf-copy
> > + *
> > + * Function does ctrl-surf-copy blit between @src and @dst described in
> > + * @blt object.
> > + *
> > + * Returns:
> > + * execbuffer status.
> > + */
> > +int blt_ctrl_surf_copy(int i915,
> > +		       const intel_ctx_t *ctx,
> > +		       const struct intel_execution_engine2 *e,
> > +		       uint64_t ahnd,
> > +		       const struct blt_ctrl_surf_copy_data *surf)
> > +{
> > +	struct drm_i915_gem_execbuffer2 execbuf = {};
> > +	struct drm_i915_gem_exec_object2 obj[3] = {};
> > +	struct gen12_ctrl_surf_copy_data data = {};
> > +	uint64_t dst_offset, src_offset, bb_offset, alignment;
> > +	uint32_t *bb;
> > +	int i;
> > +
> > +	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
> > +	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
> > +
> > +	alignment = gem_detect_safe_alignment(i915);
> > +
> > +	data.dw00.client = 0x2;
> > +	data.dw00.opcode = 0x48;
> > +	data.dw00.src_access_type = surf->src.access_type;
> > +	data.dw00.dst_access_type = surf->dst.access_type;
> > +
> > +	/* Ensure dst has size capable to keep src ccs aux */
> > +	data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
> 
> Shouldn't this be size_of_surface / 256 - 1 ?

No. A DIRECT surface holds only CCS data. size_of_ctrl_copy is the
number of CCS (256B) blocks to copy, minus 1.
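
To make the arithmetic concrete, here is a minimal standalone sketch of
the calculation. It assumes CCS_RATIO is 256 (one CCS byte per 256 bytes
of main surface), which is not shown in the quoted hunk. CCS_BLOCK_SIZE,
ccs_bytes() and size_of_ctrl_copy() are hypothetical names used only for
illustration; the patch itself divides by CCS_RATIO in both places.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one CCS byte describes 256 bytes of main surface. */
#define CCS_RATIO	256u
/* Hypothetical name: the command counts CCS data in 256-byte blocks. */
#define CCS_BLOCK_SIZE	256u

/*
 * Bytes of CCS data an object contributes to the copy. A direct-access
 * object already holds CCS data only; an indirect-access object is a
 * main surface whose CCS is obj_size / CCS_RATIO bytes.
 */
static uint32_t ccs_bytes(uint32_t obj_size, bool direct_access)
{
	return direct_access ? obj_size : obj_size / CCS_RATIO;
}

/* Value for the size_of_ctrl_copy field: number of 256B blocks, minus 1. */
static uint32_t size_of_ctrl_copy(uint32_t obj_size, bool direct_access)
{
	uint32_t bytes = ccs_bytes(obj_size, direct_access);

	/* At least one whole block must be copied. */
	assert(bytes >= CCS_BLOCK_SIZE && bytes % CCS_BLOCK_SIZE == 0);

	return bytes / CCS_BLOCK_SIZE - 1;
}
```

Under these assumptions a 1 MiB indirect source carries 4096 bytes of
CCS, i.e. 16 blocks, so the field is 15. A 4096-byte direct object gives
the same value. The field is 10 bits wide in the quoted struct, so a
single command can describe at most 1024 blocks.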
 
> 
> > +	data.dw00.length = 0x3;
> > +
> > +	src_offset = get_offset(ahnd, surf->src.handle, surf->src.size,
> > +				alignment);
> > +	dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size,
> > +				alignment);
> > +	bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size,
> > +			       alignment);
> > +
> > +	data.dw01.src_address_lo = src_offset;
> 
> This should be 4K aligned (or 64KB for indirect).

Ok, you're right. We can take the max of the safe alignment and 64KB.
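
A rough sketch of what that could look like, written standalone so it
compiles outside igt. In the library the safe alignment would come from
gem_detect_safe_alignment(i915) as in the hunk above; here it is passed
in as a plain parameter. SZ_64K, ctrl_surf_alignment() and is_aligned()
are hypothetical names for illustration, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the 64KB requirement raised in the review. */
#define SZ_64K	(64u * 1024u)

/* Pick the larger of the detected safe alignment and 64KB. */
static uint64_t ctrl_surf_alignment(uint64_t safe_alignment)
{
	return safe_alignment > SZ_64K ? safe_alignment : SZ_64K;
}

/* Check an offset against a power-of-two alignment. */
static bool is_aligned(uint64_t offset, uint64_t alignment)
{
	/* The mask trick below is valid only for powers of two. */
	assert(alignment && (alignment & (alignment - 1)) == 0);

	return (offset & (alignment - 1)) == 0;
}
```

The result would then be passed as the alignment argument to the
get_offset() calls for src and dst, so that both the 4K (direct) and
64KB (indirect) requirements are met.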

> 
> > +	data.dw02.src_address_hi = src_offset >> 32;
> > +	data.dw02.src_mocs = surf->src.mocs;
> > +
> > +	data.dw03.dst_address_lo = dst_offset;
> > +	data.dw04.dst_address_hi = dst_offset >> 32;
> > +	data.dw04.dst_mocs = surf->dst.mocs;
> > +
> > +	i = sizeof(data) / sizeof(uint32_t);
> > +	bb = gem_mmap__device_coherent(i915, surf->bb.handle, 0, surf->bb.size,
> > +				       PROT_READ | PROT_WRITE);
> > +	memcpy(bb, &data, sizeof(data));
> > +	bb[i++] = MI_BATCH_BUFFER_END;
> > +
> > +	if (surf->print_bb) {
> > +		igt_info("BB [CTRL SURF]:\n");
> > +		igt_info("src offset: %llx, dst offset: %llx, bb offset: %llx\n",
> > +			 (long long) src_offset, (long long) dst_offset,
> > +			 (long long) bb_offset);
> > +
> > +		dump_bb_surf_ctrl_cmd(&data);
> > +	}
> > +	munmap(bb, surf->bb.size);
> > +
> > +	obj[0].offset = CANONICAL(dst_offset);
> > +	obj[1].offset = CANONICAL(src_offset);
> > +	obj[2].offset = CANONICAL(bb_offset);
> > +	obj[0].handle = surf->dst.handle;
> > +	obj[1].handle = surf->src.handle;
> > +	obj[2].handle = surf->bb.handle;
> > +	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> > +		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +	execbuf.buffer_count = 3;
> > +	execbuf.buffers_ptr = to_user_pointer(obj);
> > +	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> > +	execbuf.rsvd1 = ctx ? ctx->id : 0;
> > +	gem_execbuf(i915, &execbuf);
> > +	put_offset(ahnd, surf->dst.handle);
> > +	put_offset(ahnd, surf->src.handle);
> > +	put_offset(ahnd, surf->bb.handle);
> > +
> > +	return 0;
> > +}
> > +
> > +struct gen12_fast_copy_data {
> > +	struct {
> > +		uint32_t length:			BITRANGE(0, 7);
> > +		uint32_t rsvd1:				BITRANGE(8, 12);
> > +		uint32_t dst_tiling:			BITRANGE(13, 14);
> > +		uint32_t rsvd0:				BITRANGE(15, 19);
> > +		uint32_t src_tiling:			BITRANGE(20, 21);
> > +		uint32_t opcode:			BITRANGE(22, 28);
> > +		uint32_t client:			BITRANGE(29, 31);
> > +	} dw00;
> > +
> > +	struct {
> > +		uint32_t dst_pitch:			BITRANGE(0, 15);
> > +		uint32_t rsvd1:				BITRANGE(16, 23);
> > +		uint32_t color_depth:			BITRANGE(24, 26);
> > +		uint32_t rsvd0:				BITRANGE(27, 27);
> > +		uint32_t dst_memory:			BITRANGE(28, 28);
> > +		uint32_t src_memory:			BITRANGE(29, 29);
> > +		uint32_t dst_type_y:			BITRANGE(30, 30);
> > +		uint32_t src_type_y:			BITRANGE(31, 31);
> > +	} dw01;
> > +
> > +	struct {
> > +		int32_t dst_x1:				BITRANGE(0, 15);
> > +		int32_t dst_y1:				BITRANGE(16, 31);
> > +	} dw02;
> > +
> > +	struct {
> > +		int32_t dst_x2:				BITRANGE(0, 15);
> > +		int32_t dst_y2:				BITRANGE(16, 31);
> > +	} dw03;
> > +
> > +	struct {
> > +		uint32_t dst_address_lo;
> > +	} dw04;
> > +
> > +	struct {
> > +		uint32_t dst_address_hi;
> > +	} dw05;
> > +
> > +	struct {
> > +		int32_t src_x1:				BITRANGE(0, 15);
> > +		int32_t src_y1:				BITRANGE(16, 31);
> > +	} dw06;
> > +
> > +	struct {
> > +		uint32_t src_pitch:			BITRANGE(0, 15);
> > +		uint32_t rsvd0:				BITRANGE(16, 31);
> > +	} dw07;
> > +
> > +	struct {
> > +		uint32_t src_address_lo;
> > +	} dw08;
> > +
> > +	struct {
> > +		uint32_t src_address_hi;
> > +	} dw09;
> > +};
> > +
> > +static int __fast_tiling(enum blt_tiling tiling)
> > +{
> > +	switch (tiling) {
> > +	case T_LINEAR: return 0;
> > +	case T_XMAJOR: return 1;
> > +	case T_YMAJOR: return 2;
> > +	case T_TILE4:  return 2;
> > +	case T_TILE64: return 3;
> > +	}
> > +	return 0;
> > +}
> > +
> > +static int __fast_color_depth(enum blt_color_depth depth)
> > +{
> > +	switch (depth) {
> > +	case CD_8bit:   return 0;
> > +	case CD_16bit:  return 1;
> > +	case CD_32bit:  return 3;
> > +	case CD_64bit:  return 4;
> > +	case CD_96bit:
> > +		igt_assert_f(0, "Unsupported depth\n");
> > +		break;
> > +	case CD_128bit: return 5;
> > +	};
> > +	return 0;
> > +}
> > +
> > +static void dump_bb_fast_cmd(struct gen12_fast_copy_data *data)
> > +{
> > +	uint32_t *cmd = (uint32_t *) data;
> > +
> > +	igt_info("BB details:\n");
> > +	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, src tiling: %d, "
> > +		 "dst tiling: %d, length: %d>\n",
> > +		 cmd[0], data->dw00.client, data->dw00.opcode,
> > +		 data->dw00.src_tiling, data->dw00.dst_tiling, data->dw00.length);
> > +	igt_info(" dw01: [%08x] dst <pitch: %d, color depth: %d, dst memory: %d, "
> > +		 "src memory: %d, \n"
> -------------------------------- ^
> Remove space before "\n".

Ok.

> 
> > +		 "\t\t\tdst type tile: %d (0-legacy, 1-tile4),\n"
> > +		 "\t\t\tsrc type tile: %d (0-legacy, 1-tile4)>\n",
> > +		 cmd[1], data->dw01.dst_pitch, data->dw01.color_depth,
> > +		 data->dw01.dst_memory, data->dw01.src_memory,
> > +		 data->dw01.dst_type_y, data->dw01.src_type_y);
> > +	igt_info(" dw02: [%08x] dst geom <x1: %d, y1: %d>\n",
> > +		 cmd[2], data->dw02.dst_x1, data->dw02.dst_y1);
> > +	igt_info(" dw03: [%08x]          <x2: %d, y2: %d>\n",
> > +		 cmd[3], data->dw03.dst_x2, data->dw03.dst_y2);
> > +	igt_info(" dw04: [%08x] dst offset lo (0x%x)\n",
> > +		 cmd[4], data->dw04.dst_address_lo);
> > +	igt_info(" dw05: [%08x] dst offset hi (0x%x)\n",
> > +		 cmd[5], data->dw05.dst_address_hi);
> > +	igt_info(" dw06: [%08x] src geom <x1: %d, y1: %d>\n",
> > +		 cmd[6], data->dw06.src_x1, data->dw06.src_y1);
> > +	igt_info(" dw07: [%08x] src <pitch: %d>\n",
> > +		 cmd[7], data->dw07.src_pitch);
> > +	igt_info(" dw08: [%08x] src offset lo (0x%x)\n",
> > +		 cmd[8], data->dw08.src_address_lo);
> > +	igt_info(" dw09: [%08x] src offset hi (0x%x)\n",
> > +		 cmd[9], data->dw09.src_address_hi);
> > +}
> > +
> > +/**
> > + * blt_fast_copy:
> > + * @i915: drm fd
> > + * @ctx: intel_ctx_t context
> > + * @e: blitter engine for @ctx
> > + * @ahnd: allocator handle
> > + * @blt: blitter data for fast-copy (same as for block-copy but doesn't use
> > + * compression fields).
> > + *
> > + * Function does fast blit between @src and @dst described in @blt object.
> > + *
> > + * Returns:
> > + * execbuffer status.
> > + */
> > +int blt_fast_copy(int i915,
> > +		  const intel_ctx_t *ctx,
> > +		  const struct intel_execution_engine2 *e,
> > +		  uint64_t ahnd,
> > +		  const struct blt_copy_data *blt)
> > +{
> > +	struct drm_i915_gem_execbuffer2 execbuf = {};
> > +	struct drm_i915_gem_exec_object2 obj[3] = {};
> > +	struct gen12_fast_copy_data data = {};
> > +	uint64_t dst_offset, src_offset, bb_offset, alignment;
> > +	uint32_t *bb;
> > +	int i, ret;
> > +
> > +	alignment = gem_detect_safe_alignment(i915);
> > +
> > +	data.dw00.client = 0x2;
> > +	data.dw00.opcode = 0x42;
> > +	data.dw00.dst_tiling = __fast_tiling(blt->dst.tiling);
> > +	data.dw00.src_tiling = __fast_tiling(blt->src.tiling);
> > +	data.dw00.length = 8;
> > +
> > +	data.dw01.dst_pitch = blt->dst.pitch;
> > +	data.dw01.color_depth = __fast_color_depth(blt->color_depth);
> > +	data.dw01.dst_memory = __memory_type(blt->dst.region);
> > +	data.dw01.src_memory = __memory_type(blt->src.region);
> > +	data.dw01.dst_type_y = blt->dst.tiling == T_TILE4 ? 1 : 0;
> > +	data.dw01.src_type_y = blt->src.tiling == T_TILE4 ? 1 : 0;
> > +
> > +	data.dw02.dst_x1 = blt->dst.x1;
> > +	data.dw02.dst_y1 = blt->dst.y1;
> > +
> > +	data.dw03.dst_x2 = blt->dst.x2;
> > +	data.dw03.dst_y2 = blt->dst.y2;
> > +
> > +	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> > +	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> 
> Both addresses should be aligned to 64 bytes; maybe add an assert here?

I don't think we need this. The safe alignment is a power of two and not less
than 4 KiB, so offsets aligned to it are already 64-byte aligned.
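
A minimal sketch of the reasoning, assuming the allocator returns offsets
that are multiples of the requested alignment (the helper names below are
hypothetical and not part of IGT):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if x is a non-zero power of two. */
static bool is_power_of_two(uint64_t x)
{
	return x && !(x & (x - 1));
}

/* Hypothetical helper: true if offset is a multiple of align (power of two). */
static bool is_aligned(uint64_t offset, uint64_t align)
{
	return (offset & (align - 1)) == 0;
}

/*
 * Assumption: the allocator hands out offsets of the form n * alignment.
 * If alignment is a power of two >= 64, then 64 divides it, so every
 * such offset is 64-byte aligned as well.
 */
static bool offset_is_64b_aligned(uint64_t alignment, uint64_t n)
{
	uint64_t offset = n * alignment;

	return is_power_of_two(alignment) && alignment >= 64 &&
	       is_aligned(offset, 64);
}
```

With a 4 KiB or 64 KiB alignment the check holds for any n, so an extra
assert in blt_fast_copy() would never fire under that assumption.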

--
Zbigniew

> 
> > +	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> > +
> > +	data.dw04.dst_address_lo = dst_offset;
> > +	data.dw05.dst_address_hi = dst_offset >> 32;
> > +
> > +	data.dw06.src_x1 = blt->src.x1;
> > +	data.dw06.src_y1 = blt->src.y1;
> > +
> > +	data.dw07.src_pitch = blt->src.pitch;
> > +
> > +	data.dw08.src_address_lo = src_offset;
> > +	data.dw09.src_address_hi = src_offset >> 32;
> > +
> > +	i = sizeof(data) / sizeof(uint32_t);
> > +	bb = gem_mmap__device_coherent(i915, blt->bb.handle, 0, blt->bb.size,
> > +				       PROT_READ | PROT_WRITE);
> > +
> > +	memcpy(bb, &data, sizeof(data));
> > +	bb[i++] = MI_BATCH_BUFFER_END;
> > +
> > +	if (blt->print_bb) {
> > +		igt_info("BB [FAST COPY]\n");
> > +		igt_info("blit [src offset: %llx, dst offset: %llx]\n",
> > +			 (long long) src_offset, (long long) dst_offset);
> > +		dump_bb_fast_cmd(&data);
> > +	}
> > +
> > +	munmap(bb, blt->bb.size);
> > +
> > +	obj[0].offset = CANONICAL(dst_offset);
> > +	obj[1].offset = CANONICAL(src_offset);
> > +	obj[2].offset = CANONICAL(bb_offset);
> > +	obj[0].handle = blt->dst.handle;
> > +	obj[1].handle = blt->src.handle;
> > +	obj[2].handle = blt->bb.handle;
> > +	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> > +		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> > +	execbuf.buffer_count = 3;
> > +	execbuf.buffers_ptr = to_user_pointer(obj);
> > +	execbuf.rsvd1 = ctx ? ctx->id : 0;
> > +	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> > +	ret = __gem_execbuf(i915, &execbuf);
> > +	put_offset(ahnd, blt->dst.handle);
> > +	put_offset(ahnd, blt->src.handle);
> > +	put_offset(ahnd, blt->bb.handle);
> > +
> > +	return ret;
> > +}
> > +
> > +/**
> > + * blt_surface_fill_rect:
> > + * @i915: drm fd
> > + * @obj: blitter copy object (@blt_copy_object) to fill with gradient pattern
> > + * @width: width
> > + * @height: height
> > + *
> > + * Function fills surface @width x @height * 24bpp with color gradient
> > + * (internally uses ARGB where A == 0xff, see Cairo docs).
> > + */
> > +void blt_surface_fill_rect(int i915, const struct blt_copy_object *obj,
> > +			   uint32_t width, uint32_t height)
> > +{
> > +	cairo_surface_t *surface;
> > +	cairo_pattern_t *pat;
> > +	cairo_t *cr;
> > +	void *map = obj->ptr;
> > +
> > +	if (!map)
> > +		map = gem_mmap__device_coherent(i915, obj->handle, 0,
> > +						obj->size, PROT_READ | PROT_WRITE);
> > +
> > +	surface = cairo_image_surface_create_for_data(map,
> > +						      CAIRO_FORMAT_RGB24,
> > +						      width, height,
> > +						      obj->pitch);
> > +
> > +	cr = cairo_create(surface);
> > +
> > +	cairo_rectangle(cr, 0, 0, width, height);
> > +	cairo_clip(cr);
> > +
> > +	pat = cairo_pattern_create_mesh();
> > +	cairo_mesh_pattern_begin_patch(pat);
> > +	cairo_mesh_pattern_move_to(pat, 0, 0);
> > +	cairo_mesh_pattern_line_to(pat, width, 0);
> > +	cairo_mesh_pattern_line_to(pat, width, height);
> > +	cairo_mesh_pattern_line_to(pat, 0, height);
> > +	cairo_mesh_pattern_set_corner_color_rgb(pat, 0, 1.0, 0.0, 0.0);
> > +	cairo_mesh_pattern_set_corner_color_rgb(pat, 1, 0.0, 1.0, 0.0);
> > +	cairo_mesh_pattern_set_corner_color_rgb(pat, 2, 0.0, 0.0, 1.0);
> > +	cairo_mesh_pattern_set_corner_color_rgb(pat, 3, 1.0, 1.0, 1.0);
> > +	cairo_mesh_pattern_end_patch(pat);
> > +
> > +	cairo_rectangle(cr, 0, 0, width, height);
> > +	cairo_set_source(cr, pat);
> > +	cairo_fill(cr);
> > +	cairo_pattern_destroy(pat);
> > +
> > +	cairo_destroy(cr);
> > +
> > +	cairo_surface_destroy(surface);
> > +	if (!obj->ptr)
> > +		munmap(map, obj->size);
> > +}
> > +
> > +/**
> > + * blt_surface_info:
> > + * @info: information header
> > + * @obj: blitter copy object (@blt_copy_object) to print surface info
> > + */
> > +void blt_surface_info(const char *info, const struct blt_copy_object *obj)
> > +{
> > +	igt_info("[%s]\n", info);
> > +	igt_info("surface <handle: %u, size: %llx, region: %x, mocs: %x>\n",
> > +		 obj->handle, (long long) obj->size, obj->region, obj->mocs);
> > +	igt_info("        <tiling: %s, compression: %u, compression type: %d>\n",
> > +		 blt_tiling_name(obj->tiling), obj->compression, obj->compression_type);
> > +	igt_info("        <pitch: %u, offset [x: %u, y: %u] geom [<%d,%d> <%d,%d>]>\n",
> > +		 obj->pitch, obj->x_offset, obj->y_offset,
> > +		 obj->x1, obj->y1, obj->x2, obj->y2);
> > +}
> > +
> > +/**
> > + * blt_surface_to_png:
> > + * @i915: drm fd
> > + * @run_id: prefix id to allow grouping files stored from single run
> > + * @fileid: file identifier
> > + * @obj: blitter copy object (@blt_copy_object) to save to png
> > + * @width: width
> > + * @height: height
> > + *
> > + * Function saves surface to a png file. Assumes ARGB format where A == 0xff.
> > + */
> > +void blt_surface_to_png(int i915, uint32_t run_id, const char *fileid,
> > +			const struct blt_copy_object *obj,
> > +			uint32_t width, uint32_t height)
> > +{
> > +	cairo_surface_t *surface;
> > +	cairo_status_t ret;
> > +	uint8_t *map = (uint8_t *) obj->ptr;
> > +	int format;
> > +	int stride = obj->tiling ? obj->pitch * 4 : obj->pitch;
> > +	char filename[FILENAME_MAX];
> > +
> > +	snprintf(filename, FILENAME_MAX-1, "%d-%s-%s-%ux%u-%s.png",
> > +		 run_id, fileid, blt_tiling_name(obj->tiling), width, height,
> > +		 obj->compression ? "compressed" : "uncompressed");
> > +
> > +	if (!map)
> > +		map = gem_mmap__device_coherent(i915, obj->handle, 0,
> > +						obj->size, PROT_READ);
> > +	format = CAIRO_FORMAT_RGB24;
> > +	surface = cairo_image_surface_create_for_data(map,
> > +						      format, width, height,
> > +						      stride);
> > +	ret = cairo_surface_write_to_png(surface, filename);
> > +	if (ret)
> > +		igt_info("Cairo ret: %d (%s)\n", ret, cairo_status_to_string(ret));
> > +	igt_assert(ret == CAIRO_STATUS_SUCCESS);
> > +	cairo_surface_destroy(surface);
> > +
> > +	if (!obj->ptr)
> > +		munmap(map, obj->size);
> > +}
> > diff --git a/lib/i915/i915_blt.h b/lib/i915/i915_blt.h
> > new file mode 100644
> > index 000000000..e0e8b52bc
> > --- /dev/null
> > +++ b/lib/i915/i915_blt.h
> > @@ -0,0 +1,196 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2022 Intel Corporation
> > + */
> > +
> > +/**
> > + * SECTION:i915_blt
> > + * @short_description: i915 blitter library
> > + * @title: Blitter library
> > + * @include: i915_blt.h
> > + *
> > + * # Introduction
> > + *
> > + * Gen12+ blitter commands like XY_BLOCK_COPY_BLT are quite long.
> > + * Passing every field as a separate function argument would give a long,
> > + * unreadable argument list that is prone to misplaced arguments.
> > + * Providing objects (structs) seems more reasonable and makes it possible
> > + * to share object data across different blitter commands.
> > + *
> > + * The blitter library supports no-reloc (softpin) mode only (apart from TGL
> > + * there are no relocations enabled), thus ahnd is mandatory. Providing a NULL
> > + * ctx means the default context is used with I915_EXEC_BLT as the engine.
> > + *
> > + * The library introduces a tiling enum which distinguishes tiling formats
> > + * independently of the legacy I915_TILING_... definitions. This gives full control
> > + * over which tilings a command handles and which are skipped or asserted on.
> > + *
> > + * # Supported commands
> > + *
> > + * - XY_BLOCK_COPY_BLT - (block-copy) TGL/DG1 + DG2+ (ext version)
> > + * - XY_FAST_COPY_BLT - (fast-copy)
> > + * - XY_CTRL_SURF_COPY_BLT - (ctrl-surf-copy) DG2+
> > + *
> > + * # Usage details
> > + *
> > + * For block-copy and fast-copy the @blt_copy_object struct is used to collect
> > + * data about the source and destination objects. It contains the handle, region,
> > + * size, etc. which are used for blits. Some fields (like compression) are not
> > + * used for fast-copy; a field used exclusively by one command is annotated
> > + * as such in the comment.
> > + *
> > + */
> > +
> > +#include <errno.h>
> > +#include <sys/ioctl.h>
> > +#include <sys/time.h>
> > +#include <malloc.h>
> > +#include "drm.h"
> > +#include "igt.h"
> > +
> > +#define CCS_RATIO 256
> > +
> > +enum blt_color_depth {
> > +	CD_8bit,
> > +	CD_16bit,
> > +	CD_32bit,
> > +	CD_64bit,
> > +	CD_96bit,
> > +	CD_128bit,
> > +};
> > +
> > +enum blt_tiling {
> > +	T_LINEAR,
> > +	T_XMAJOR,
> > +	T_YMAJOR,
> > +	T_TILE4,
> > +	T_TILE64,
> > +};
> > +
> > +enum blt_compression {
> > +	COMPRESSION_DISABLED,
> > +	COMPRESSION_ENABLED,
> > +};
> > +
> > +enum blt_compression_type {
> > +	COMPRESSION_TYPE_3D,
> > +	COMPRESSION_TYPE_MEDIA,
> > +};
> > +
> > +/* BC - block-copy */
> > +struct blt_copy_object {
> > +	uint32_t handle;
> > +	uint32_t region;
> > +	uint64_t size;
> > +	uint8_t mocs;
> > +	enum blt_tiling tiling;
> > +	enum blt_compression compression;  /* BC only */
> > +	enum blt_compression_type compression_type; /* BC only */
> > +	uint32_t pitch;
> > +	uint16_t x_offset, y_offset;
> > +	int16_t x1, y1, x2, y2;
> > +
> > +	/* mapping or null */
> > +	uint32_t *ptr;
> > +};
> > +
> > +struct blt_copy_batch {
> > +	uint32_t handle;
> > +	uint32_t region;
> > +	uint64_t size;
> > +};
> > +
> > +/* Common for block-copy and fast-copy */
> > +struct blt_copy_data {
> > +	int i915;
> > +	struct blt_copy_object src;
> > +	struct blt_copy_object dst;
> > +	struct blt_copy_batch bb;
> > +	enum blt_color_depth color_depth;
> > +
> > +	/* debug stuff */
> > +	bool print_bb;
> > +};
> > +
> > +enum blt_surface_type {
> > +	SURFACE_TYPE_1D,
> > +	SURFACE_TYPE_2D,
> > +	SURFACE_TYPE_3D,
> > +	SURFACE_TYPE_CUBE,
> > +};
> > +
> > +struct blt_block_copy_object_ext {
> > +	uint8_t compression_format;
> > +	bool clear_value_enable;
> > +	uint64_t clear_address;
> > +	uint16_t surface_width;
> > +	uint16_t surface_height;
> > +	enum blt_surface_type surface_type;
> > +	uint16_t surface_qpitch;
> > +	uint16_t surface_depth;
> > +	uint8_t lod;
> > +	uint8_t horizontal_align;
> > +	uint8_t vertical_align;
> > +	uint8_t mip_tail_start_lod;
> > +	bool depth_stencil_resource;
> > +	uint16_t array_index;
> > +};
> > +
> > +struct blt_block_copy_data_ext {
> > +	struct blt_block_copy_object_ext src;
> > +	struct blt_block_copy_object_ext dst;
> > +};
> > +
> > +enum blt_access_type {
> > +	INDIRECT_ACCESS,
> > +	DIRECT_ACCESS,
> > +};
> > +
> > +struct blt_ctrl_surf_copy_object {
> > +	uint32_t handle;
> > +	uint32_t region;
> > +	uint64_t size;
> > +	uint8_t mocs;
> > +	enum blt_access_type access_type;
> > +};
> > +
> > +struct blt_ctrl_surf_copy_data {
> > +	int i915;
> > +	struct blt_ctrl_surf_copy_object src;
> > +	struct blt_ctrl_surf_copy_object dst;
> > +	struct blt_copy_batch bb;
> > +
> > +	/* debug stuff */
> > +	bool print_bb;
> > +};
> > +
> > +bool blt_supports_compression(int i915);
> > +bool blt_supports_tiling(int i915, enum blt_tiling tiling);
> > +const char *blt_tiling_name(enum blt_tiling tiling);
> > +
> > +int blt_block_copy(int i915,
> > +		   const intel_ctx_t *ctx,
> > +		   const struct intel_execution_engine2 *e,
> > +		   uint64_t ahnd,
> > +		   const struct blt_copy_data *blt,
> > +		   const struct blt_block_copy_data_ext *ext);
> > +
> > +int blt_ctrl_surf_copy(int i915,
> > +		       const intel_ctx_t *ctx,
> > +		       const struct intel_execution_engine2 *e,
> > +		       uint64_t ahnd,
> > +		       const struct blt_ctrl_surf_copy_data *surf);
> > +
> > +int blt_fast_copy(int i915,
> > +		  const intel_ctx_t *ctx,
> > +		  const struct intel_execution_engine2 *e,
> > +		  uint64_t ahnd,
> > +		  const struct blt_copy_data *blt);
> > +
> > +void blt_surface_info(const char *info,
> > +		      const struct blt_copy_object *obj);
> > +void blt_surface_fill_rect(int i915, const struct blt_copy_object *obj,
> > +			   uint32_t width, uint32_t height);
> > +void blt_surface_to_png(int i915, uint32_t run_id, const char *fileid,
> > +		    const struct blt_copy_object *obj,
> > +		    uint32_t width, uint32_t height);
> > diff --git a/lib/meson.build b/lib/meson.build
> > index 3e43316d1..fe035672e 100644
> > --- a/lib/meson.build
> > +++ b/lib/meson.build
> > @@ -11,6 +11,7 @@ lib_sources = [
> >  	'i915/gem_mman.c',
> >  	'i915/gem_vm.c',
> >  	'i915/intel_memory_region.c',
> > +	'i915/i915_blt.c',
> >  	'igt_collection.c',
> >  	'igt_color_encoding.c',
> >  	'igt_debugfs.c',
> > -- 
> > 2.32.0
> > 
> Regards,
> Kamil Konieczny
> 

* Re: [igt-dev] [PATCH i-g-t 2/3] lib/i915_blt: Add library for blitter
  2022-02-25 11:06 ` [igt-dev] [PATCH i-g-t 2/3] lib/i915_blt: Add library for blitter Zbigniew Kempczyński
  2022-02-25 16:18   ` Kamil Konieczny
@ 2022-03-01 22:57   ` Kamil Konieczny
  2022-03-04 13:37     ` Zbigniew Kempczyński
  1 sibling, 1 reply; 12+ messages in thread
From: Kamil Konieczny @ 2022-03-01 22:57 UTC (permalink / raw)
  To: igt-dev; +Cc: Mauro Chehab

Hi Zbigniew,

On 2022-02-25 at 12:06:25 +0100, Zbigniew Kempczyński wrote:
> Blitter commands have become complicated, so manual bit shifting is error
> prone and hard to debug. XY_BLOCK_COPY_BLT is the best example:
> in its extended version (for DG2+) it takes 20 dwords of command data.
> To avoid mistakes and dozens of arguments per command, the library takes
> input data in a more structured form.
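
To illustrate the point about manual bit shifting, here is a minimal sketch
comparing both styles for dw00 of XY_FAST_COPY_BLT. The field layout is
copied from gen12_fast_copy_data in this patch; the function names are
hypothetical. The two encodings are only expected to match on a
little-endian target where GCC/Clang allocate bitfields starting from
bit 0, since bitfield layout is implementation-defined in C:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BITRANGE(start, end) ((end) - (start) + 1)

/* Same field layout as dw00 of gen12_fast_copy_data in the patch. */
struct fast_copy_dw00 {
	uint32_t length:	BITRANGE(0, 7);
	uint32_t rsvd1:		BITRANGE(8, 12);
	uint32_t dst_tiling:	BITRANGE(13, 14);
	uint32_t rsvd0:		BITRANGE(15, 19);
	uint32_t src_tiling:	BITRANGE(20, 21);
	uint32_t opcode:	BITRANGE(22, 28);
	uint32_t client:	BITRANGE(29, 31);
};

/* Manual variant: every shift and mask has to be right by hand. */
static uint32_t dw00_by_shifts(uint32_t client, uint32_t opcode,
			       uint32_t src_tiling, uint32_t dst_tiling,
			       uint32_t length)
{
	return ((client & 0x7) << 29) |
	       ((opcode & 0x7f) << 22) |
	       ((src_tiling & 0x3) << 20) |
	       ((dst_tiling & 0x3) << 13) |
	       (length & 0xff);
}

/* Structured variant: the compiler places the named fields. */
static uint32_t dw00_by_bitfield(uint32_t client, uint32_t opcode,
				 uint32_t src_tiling, uint32_t dst_tiling,
				 uint32_t length)
{
	struct fast_copy_dw00 dw = { 0 };
	uint32_t raw;

	dw.client = client;
	dw.opcode = opcode;
	dw.src_tiling = src_tiling;
	dw.dst_tiling = dst_tiling;
	dw.length = length;

	/* Read the struct back as a raw dword without aliasing issues. */
	memcpy(&raw, &dw, sizeof(raw));
	return raw;
}
```

With the values used by blt_fast_copy() (client 0x2, opcode 0x42, length 8)
and linear tiling on both sides, the shift variant yields 0x50800008.
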
> 
> Currently supported commands:
> - XY_BLOCK_COPY_BLT:
>   a)  TGL/DG1 uses shorter version of command which doesn't support
>       compression
>   b)  DG2+ command is extended and supports compression
> - XY_CTRL_SURF_COPY_BLT
> - XY_FAST_COPY_BLT
> 
> Source, destination and batchbuffer are provided to blitter functions
> as objects (structs). This increases readability and allows the same
> object to be used in many functions. The only drawback of this approach is
> that some fields used in one function may be ignored in another. An example is
> blt_copy_object, which contains a lot of information about a gem object.
> In block-copy all of the data is used, but in fast-copy only some of it
> (fast-copy doesn't support compression).
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>  .../igt-gpu-tools/igt-gpu-tools-docs.xml      |    1 +
>  lib/i915/i915_blt.c                           | 1087 +++++++++++++++++
>  lib/i915/i915_blt.h                           |  196 +++
>  lib/meson.build                               |    1 +
>  4 files changed, 1285 insertions(+)
>  create mode 100644 lib/i915/i915_blt.c
>  create mode 100644 lib/i915/i915_blt.h
> 
> diff --git a/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml b/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
> index 0dc5a0b7e..3a2edbae1 100644
> --- a/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
> +++ b/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
> @@ -16,6 +16,7 @@
>    <chapter>
>      <title>API Reference</title>
>      <xi:include href="xml/drmtest.xml"/>
> +    <xi:include href="xml/i915_blt.xml"/>
>      <xi:include href="xml/igt_alsa.xml"/>
>      <xi:include href="xml/igt_audio.xml"/>
>      <xi:include href="xml/igt_aux.xml"/>
> diff --git a/lib/i915/i915_blt.c b/lib/i915/i915_blt.c
> new file mode 100644
> index 000000000..c6f115009
> --- /dev/null
> +++ b/lib/i915/i915_blt.c
> @@ -0,0 +1,1087 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +#include <errno.h>
> +#include <sys/ioctl.h>
> +#include <sys/time.h>
> +#include <malloc.h>
> +#include <cairo.h>
> +#include "drm.h"
> +#include "igt.h"
> +#include "gem_create.h"
> +#include "i915_blt.h"
> +
> +#define BITRANGE(start, end) (end - start + 1)
> +
> +enum blt_special_mode {
> +	SM_NONE,
> +	SM_FULL_RESOLVE,
> +	SM_PARTIAL_RESOLVE,
> +	SM_RESERVED,
> +};
> +
> +enum blt_aux_mode {
> +	AM_AUX_NONE,
> +	AM_AUX_CCS_E = 5,
> +};
> +
> +enum blt_target_mem {
> +	TM_LOCAL_MEM,
> +	TM_SYSTEM_MEM,
> +};
> +
> +struct gen12_block_copy_data {
> +	struct {
> +		uint32_t length:			BITRANGE(0, 7);
> +		uint32_t rsvd1:				BITRANGE(8, 8);
> +		uint32_t multisamples:			BITRANGE(9, 11);
> +		uint32_t special_mode:			BITRANGE(12, 13);
> +		uint32_t rsvd0:				BITRANGE(14, 18);
> +		uint32_t color_depth:			BITRANGE(19, 21);
> +		uint32_t opcode:			BITRANGE(22, 28);
> +		uint32_t client:			BITRANGE(29, 31);
> +	} dw00;
> +
> +	struct {
> +		uint32_t dst_pitch:			BITRANGE(0, 17);
> +		uint32_t dst_aux_mode:			BITRANGE(18, 20);
> +		uint32_t dst_mocs:			BITRANGE(21, 27);
> +		uint32_t dst_ctrl_surface_type:		BITRANGE(28, 28);
> +		uint32_t dst_compression:		BITRANGE(29, 29);
> +		uint32_t dst_tiling:			BITRANGE(30, 31);
> +	} dw01;
> +
> +	struct {
> +		int32_t dst_x1:				BITRANGE(0, 15);
> +		int32_t dst_y1:				BITRANGE(16, 31);
> +	} dw02;
> +
> +	struct {
> +		int32_t dst_x2:				BITRANGE(0, 15);
> +		int32_t dst_y2:				BITRANGE(16, 31);
> +	} dw03;
> +
> +	struct {
> +		uint32_t dst_address_lo;
> +	} dw04;
> +
> +	struct {
> +		uint32_t dst_address_hi;
> +	} dw05;
> +
> +	struct {
> +		uint32_t dst_x_offset:			BITRANGE(0, 13);
> +		uint32_t rsvd1:				BITRANGE(14, 15);
> +		uint32_t dst_y_offset:			BITRANGE(16, 29);
> +		uint32_t rsvd0:				BITRANGE(30, 30);
> +		uint32_t dst_target_memory:		BITRANGE(31, 31);
> +	} dw06;
> +
> +	struct {
> +		int32_t src_x1:				BITRANGE(0, 15);
> +		int32_t src_y1:				BITRANGE(16, 31);
> +	} dw07;
> +
> +	struct {
> +		uint32_t src_pitch:			BITRANGE(0, 17);
> +		uint32_t src_aux_mode:			BITRANGE(18, 20);
> +		uint32_t src_mocs:			BITRANGE(21, 27);
> +		uint32_t src_ctrl_surface_type:		BITRANGE(28, 28);
> +		uint32_t src_compression:		BITRANGE(29, 29);
> +		uint32_t src_tiling:			BITRANGE(30, 31);
> +	} dw08;
> +
> +	struct {
> +		uint32_t src_address_lo;
> +	} dw09;
> +
> +	struct {
> +		uint32_t src_address_hi;
> +	} dw10;
> +
> +	struct {
> +		uint32_t src_x_offset:			BITRANGE(0, 13);
> +		uint32_t rsvd1:				BITRANGE(14, 15);
> +		uint32_t src_y_offset:			BITRANGE(16, 29);
> +		uint32_t rsvd0:				BITRANGE(30, 30);
> +		uint32_t src_target_memory:		BITRANGE(31, 31);
> +	} dw11;
> +};
> +
> +struct gen12_block_copy_data_ext {
> +	struct {
> +		uint32_t src_compression_format:	BITRANGE(0, 4);
> +		uint32_t src_clear_value_enable:	BITRANGE(5, 5);
> +		uint32_t src_clear_address_low:		BITRANGE(6, 31);
> +	} dw12;
> +
> +	union {
> +		/* DG2, XEHP */
> +		uint32_t src_clear_address_hi0;
> +		/* Others */
> +		uint32_t src_clear_address_hi1;
> +	} dw13;

Is the union really needed here? If you add a struct with bitfields
inside it later, it will be fine, but for two plain uint32_t members it
seems redundant.
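
A sketch of what I mean, assuming no per-platform bitfields are planned
for this dword. The type and helper names are hypothetical, only the
member names of the union come from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* dw13 as posted: two names for the same 32 bits. */
union dw13_as_posted {
	uint32_t src_clear_address_hi0;	/* DG2, XEHP */
	uint32_t src_clear_address_hi1;	/* Others */
};

/* Hypothetical alternative: a single plain member. */
struct dw13_plain {
	uint32_t src_clear_address_hi;
};

/* Both variants must occupy exactly one dword of the command. */
_Static_assert(sizeof(union dw13_as_posted) == sizeof(struct dw13_plain),
	       "dw13 must stay one dword");

/* Store the upper 32 bits via hi0 and read them back via hi1. */
static uint32_t hi_via_union(uint64_t clear_address)
{
	union dw13_as_posted dw;

	dw.src_clear_address_hi0 = clear_address >> 32;
	return dw.src_clear_address_hi1;
}

/* Store and read back the upper 32 bits via the single plain member. */
static uint32_t hi_via_plain(uint64_t clear_address)
{
	struct dw13_plain dw;

	dw.src_clear_address_hi = clear_address >> 32;
	return dw.src_clear_address_hi;
}
```

Both helpers return the same value for any address, so the union adds a
second name but no extra information.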

> +
> +	struct {
> +		uint32_t dst_compression_format:	BITRANGE(0, 4);
> +		uint32_t dst_clear_value_enable:	BITRANGE(5, 5);
> +		uint32_t dst_clear_address_low:		BITRANGE(6, 31);
> +	} dw14;
> +
> +	union {
> +		/* DG2, XEHP */
> +		uint32_t dst_clear_address_hi0;
> +		/* Others */
> +		uint32_t dst_clear_address_hi1;
> +	} dw15;

Same here.

> +
> +	struct {
> +		uint32_t dst_surface_height:		BITRANGE(0, 13);
> +		uint32_t dst_surface_width:		BITRANGE(14, 27);
> +		uint32_t rsvd0:				BITRANGE(28, 28);
> +		uint32_t dst_surface_type:		BITRANGE(29, 31);
> +	} dw16;
> +
> +	struct {
> +		uint32_t dst_lod:			BITRANGE(0, 3);
> +		uint32_t dst_surface_qpitch:		BITRANGE(4, 18);
> +		uint32_t rsvd0:				BITRANGE(19, 20);
> +		uint32_t dst_surface_depth:		BITRANGE(21, 31);
> +	} dw17;
> +
> +	struct {
> +		uint32_t dst_horizontal_align:		BITRANGE(0, 1);
> +		uint32_t rsvd0:				BITRANGE(2, 2);
> +		uint32_t dst_vertical_align:		BITRANGE(3, 4);
> +		uint32_t rsvd1:				BITRANGE(5, 7);
> +		uint32_t dst_mip_tail_start_lod:	BITRANGE(8, 11);
> +		uint32_t rsvd2:				BITRANGE(12, 17);
> +		uint32_t dst_depth_stencil_resource:	BITRANGE(18, 18);
> +		uint32_t rsvd3:				BITRANGE(19, 20);
> +		uint32_t dst_array_index:		BITRANGE(21, 31);
> +	} dw18;
> +
> +	struct {
> +		uint32_t src_surface_height:		BITRANGE(0, 13);
> +		uint32_t src_surface_width:		BITRANGE(14, 27);
> +		uint32_t rsvd0:				BITRANGE(28, 28);
> +		uint32_t src_surface_type:		BITRANGE(29, 31);
> +	} dw19;
> +
> +	struct {
> +		uint32_t src_lod:			BITRANGE(0, 3);
> +		uint32_t src_surface_qpitch:		BITRANGE(4, 18);
> +		uint32_t rsvd0:				BITRANGE(19, 20);
> +		uint32_t src_surface_depth:		BITRANGE(21, 31);
> +	} dw20;
> +
> +	struct {
> +		uint32_t src_horizontal_align:		BITRANGE(0, 1);
> +		uint32_t rsvd0:				BITRANGE(2, 2);
> +		uint32_t src_vertical_align:		BITRANGE(3, 4);
> +		uint32_t rsvd1:				BITRANGE(5, 7);
> +		uint32_t src_mip_tail_start_lod:	BITRANGE(8, 11);
> +		uint32_t rsvd2:				BITRANGE(12, 17);
> +		uint32_t src_depth_stencil_resource:	BITRANGE(18, 18);
> +		uint32_t rsvd3:				BITRANGE(19, 20);
> +		uint32_t src_array_index:		BITRANGE(21, 31);
> +	} dw21;
> +};
> +
> +/**
> + * blt_supports_compression:
> + * @i915: drm fd
> + *
> + * Function returns does HW supports flatccs compression in blitter commands
---------------------- ^
s/does/true if/

> + * on @i915 device.
> + *
> + * Returns:
> + * true if it does, false otherwise.
> + */
> +bool blt_supports_compression(int i915)
> +{
> +	uint32_t devid = intel_get_drm_devid(i915);
> +
> +	return HAS_FLATCCS(devid);
> +}
> +
> +/**
> + * blt_supports_tiling:
> + * @i915: drm fd
> + * @tiling: tiling id
> + *
> + * Function returns does blitter supports @tiling on @i915 device.
---------------------- ^
same here, either "returns true if" or "checks if"

> + *
> + * Returns:
> + * true if it does, false otherwise.
> + */
> +bool blt_supports_tiling(int i915, enum blt_tiling tiling)
> +{
> +	uint32_t devid = intel_get_drm_devid(i915);
> +
> +	if (tiling == T_XMAJOR) {
> +		if (IS_TIGERLAKE(devid) || IS_DG1(devid))
> +			return false;
> +		else
> +			return true;
> +	}
> +
> +	if (tiling == T_YMAJOR) {
> +		if (IS_TIGERLAKE(devid) || IS_DG1(devid))
> +			return true;
> +		else
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
> +/**
> + * blt_tiling_name:
> + * @tiling: tiling id
> + *
> + * Returns:
> + * name of @tiling passed. Useful to build test names or sth.
s/ or sth//

> + */
> +const char *blt_tiling_name(enum blt_tiling tiling)
> +{
> +	switch (tiling) {
> +	case T_LINEAR: return "linear";
> +	case T_XMAJOR: return "xmajor";
> +	case T_YMAJOR: return "ymajor";
> +	case T_TILE4:  return "tile4";
> +	case T_TILE64: return "tile64";
> +	}
> +
> +	igt_warn("invalid tiling passed: %d\n", tiling);
> +	return NULL;
> +}
> +
> +static int __block_tiling(enum blt_tiling tiling)
> +{
> +	switch (tiling) {
> +	case T_LINEAR: return 0;
> +	case T_XMAJOR: return 1;
> +	case T_YMAJOR: return 1;
> +	case T_TILE4:  return 2;
> +	case T_TILE64: return 3;
> +	}
> +
> +	igt_warn("invalid tiling passed: %d\n", tiling);
> +	return 0;
> +}
> +
> +static int __special_mode(const struct blt_copy_data *blt)
> +{
> +	if (blt->src.handle == blt->dst.handle &&
> +	    blt->src.compression && !blt->dst.compression)
> +		return SM_FULL_RESOLVE;
> +
> +
> +	return SM_NONE;
> +}
> +
> +static int __memory_type(uint32_t region)
> +{
> +	igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
> +		     IS_SYSTEM_MEMORY_REGION(region),
> +		     "Invalid region: %x\n", region);
> +
> +	if (IS_DEVICE_MEMORY_REGION(region))
> +		return TM_LOCAL_MEM;
> +	return TM_SYSTEM_MEM;
> +}
> +
> +static enum blt_aux_mode __aux_mode(const struct blt_copy_object *obj)
> +{
> +	if (obj->compression == COMPRESSION_ENABLED) {
> +		igt_assert_f(IS_DEVICE_MEMORY_REGION(obj->region),
> +			     "XY_BLOCK_COPY_BLT supports compression "
> +			     "on device memory only\n");
> +		return AM_AUX_CCS_E;
> +	}
> +
> +	return AM_AUX_NONE;
> +}
> +
> +static void fill_data(struct gen12_block_copy_data *data,
> +		      const struct blt_copy_data *blt,
> +		      uint64_t src_offset, uint64_t dst_offset,
> +		      bool extended_command)
> +{
> +	data->dw00.client = 0x2;
> +	data->dw00.opcode = 0x41;
> +	data->dw00.color_depth = blt->color_depth;
> +	data->dw00.special_mode = __special_mode(blt);
> +	data->dw00.length = extended_command ? 20 : 10;
> +
> +	data->dw01.dst_pitch = blt->dst.pitch - 1;
> +	data->dw01.dst_aux_mode = __aux_mode(&blt->dst);
> +	data->dw01.dst_mocs = blt->dst.mocs;
> +	data->dw01.dst_compression = blt->dst.compression;
> +	data->dw01.dst_tiling = __block_tiling(blt->dst.tiling);
> +
> +	if (blt->dst.compression)
> +		data->dw01.dst_ctrl_surface_type = blt->dst.compression_type;
> +
> +	data->dw02.dst_x1 = blt->dst.x1;
> +	data->dw02.dst_y1 = blt->dst.y1;
> +
> +	data->dw03.dst_x2 = blt->dst.x2;
> +	data->dw03.dst_y2 = blt->dst.y2;
> +
> +	data->dw04.dst_address_lo = dst_offset;
> +	data->dw05.dst_address_hi = dst_offset >> 32;
> +
> +	data->dw06.dst_x_offset = blt->dst.x_offset;
> +	data->dw06.dst_y_offset = blt->dst.y_offset;
> +	data->dw06.dst_target_memory = __memory_type(blt->dst.region);
> +
> +	data->dw07.src_x1 = blt->src.x1;
> +	data->dw07.src_y1 = blt->src.y1;
> +
> +	data->dw08.src_pitch = blt->src.pitch - 1;
> +	data->dw08.src_aux_mode = __aux_mode(&blt->src);
> +	data->dw08.src_mocs = blt->src.mocs;
> +	data->dw08.src_compression = blt->src.compression;
> +	data->dw08.src_tiling = __block_tiling(blt->src.tiling);
> +
> +	if (blt->src.compression)
> +		data->dw08.src_ctrl_surface_type = blt->src.compression_type;
> +
> +	data->dw09.src_address_lo = src_offset;
> +	data->dw10.src_address_hi = src_offset >> 32;
> +
> +	data->dw11.src_x_offset = blt->src.x_offset;
> +	data->dw11.src_y_offset = blt->src.y_offset;
> +	data->dw11.src_target_memory = __memory_type(blt->src.region);
> +}
> +
> +static void fill_data_ext(int i915,

Parameter i915 is not used here; please remove it.

> +			  struct gen12_block_copy_data_ext *dext,
> +			  const struct blt_block_copy_data_ext *ext)
> +{
> +	dext->dw12.src_compression_format = ext->src.compression_format;
> +	dext->dw12.src_clear_value_enable = ext->src.clear_value_enable;
> +	dext->dw12.src_clear_address_low = ext->src.clear_address;
> +
> +	dext->dw13.src_clear_address_hi0 = ext->src.clear_address >> 32;
> +
> +	dext->dw14.dst_compression_format = ext->dst.compression_format;
> +	dext->dw14.dst_clear_value_enable = ext->dst.clear_value_enable;
> +	dext->dw14.dst_clear_address_low = ext->dst.clear_address;
> +
> +	dext->dw15.dst_clear_address_hi0 = ext->dst.clear_address >> 32;
> +
> +	dext->dw16.dst_surface_width = ext->dst.surface_width - 1;
> +	dext->dw16.dst_surface_height = ext->dst.surface_height - 1;
> +	dext->dw16.dst_surface_type = ext->dst.surface_type;
> +
> +	dext->dw17.dst_lod = ext->dst.lod;
> +	dext->dw17.dst_surface_depth = ext->dst.surface_depth;
> +	dext->dw17.dst_surface_qpitch = ext->dst.surface_qpitch;
> +
> +	dext->dw18.dst_horizontal_align = ext->dst.horizontal_align;
> +	dext->dw18.dst_vertical_align = ext->dst.vertical_align;
> +	dext->dw18.dst_mip_tail_start_lod = ext->dst.mip_tail_start_lod;
> +	dext->dw18.dst_depth_stencil_resource = ext->dst.depth_stencil_resource;
> +	dext->dw18.dst_array_index = ext->dst.array_index;
> +
> +	dext->dw19.src_surface_width = ext->src.surface_width - 1;
> +	dext->dw19.src_surface_height = ext->src.surface_height - 1;
> +
> +	dext->dw19.src_surface_type = ext->src.surface_type;
> +
> +	dext->dw20.src_lod = ext->src.lod;
> +	dext->dw20.src_surface_depth = ext->src.surface_depth;
> +	dext->dw20.src_surface_qpitch = ext->src.surface_qpitch;
> +
> +	dext->dw21.src_horizontal_align = ext->src.horizontal_align;
> +	dext->dw21.src_vertical_align = ext->src.vertical_align;
> +	dext->dw21.src_mip_tail_start_lod = ext->src.mip_tail_start_lod;
> +	dext->dw21.src_depth_stencil_resource = ext->src.depth_stencil_resource;
> +	dext->dw21.src_array_index = ext->src.array_index;
> +}
> +
> +static void dump_bb_cmd(struct gen12_block_copy_data *data)
> +{
> +	uint32_t *cmd = (uint32_t *) data;
> +
> +	igt_info("details:\n");
> +	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, color depth: %d, "
> +		 "special mode: %d, length: %d>\n",
> +		 cmd[0],
> +		 data->dw00.client, data->dw00.opcode, data->dw00.color_depth,
> +		 data->dw00.special_mode, data->dw00.length);
> +	igt_info(" dw01: [%08x] dst <pitch: %d, aux: %d, mocs: %d, compr: %d, "
> +		 "tiling: %d, ctrl surf type: %d>\n",
> +		 cmd[1], data->dw01.dst_pitch, data->dw01.dst_aux_mode,
> +		 data->dw01.dst_mocs, data->dw01.dst_compression,
> +		 data->dw01.dst_tiling, data->dw01.dst_ctrl_surface_type);
> +	igt_info(" dw02: [%08x] dst geom <x1: %d, y1: %d>\n",
> +		 cmd[2], data->dw02.dst_x1, data->dw02.dst_y1);
> +	igt_info(" dw03: [%08x]          <x2: %d, y2: %d>\n",
> +		 cmd[3], data->dw03.dst_x2, data->dw03.dst_y2);
> +	igt_info(" dw04: [%08x] dst offset lo (0x%x)\n",
> +		 cmd[4], data->dw04.dst_address_lo);
> +	igt_info(" dw05: [%08x] dst offset hi (0x%x)\n",
> +		 cmd[5], data->dw05.dst_address_hi);
> +	igt_info(" dw06: [%08x] dst <x offset: 0x%x, y offset: 0x%0x, target mem: %d>\n",
> +		 cmd[6], data->dw06.dst_x_offset, data->dw06.dst_y_offset,
> +		 data->dw06.dst_target_memory);
> +	igt_info(" dw07: [%08x] src geom <x1: %d, y1: %d>\n",
> +		 cmd[7], data->dw07.src_x1, data->dw07.src_y1);
> +	igt_info(" dw08: [%08x] src <pitch: %d, aux: %d, mocs: %d, compr: %d, "
> +		 "tiling: %d, ctrl surf type: %d>\n",
> +		 cmd[8], data->dw08.src_pitch, data->dw08.src_aux_mode,
> +		 data->dw08.src_mocs, data->dw08.src_compression,
> +		 data->dw08.src_tiling, data->dw08.src_ctrl_surface_type);
> +	igt_info(" dw09: [%08x] src offset lo (0x%x)\n",
> +		 cmd[9], data->dw09.src_address_lo);
> +	igt_info(" dw10: [%08x] src offset hi (0x%x)\n",
> +		 cmd[10], data->dw10.src_address_hi);
> +	igt_info(" dw11: [%08x] src <x offset: 0x%x, y offset: 0x%0x, target mem: %d>\n",
> +		 cmd[11], data->dw11.src_x_offset, data->dw11.src_y_offset,
> +		 data->dw11.src_target_memory);
> +}
> +
> +static void dump_bb_ext(struct gen12_block_copy_data_ext *data)
> +{
> +	uint32_t *cmd = (uint32_t *) data;
> +
> +	igt_info("ext details:\n");
> +	igt_info(" dw12: [%08x] src <compression fmt: %d, clear value enable: %d, "
> +		 "clear address low: 0x%x>\n",
> +		 cmd[0],
> +		 data->dw12.src_compression_format,
> +		 data->dw12.src_clear_value_enable,
> +		 data->dw12.src_clear_address_low);
> +	igt_info(" dw13: [%08x] src clear address hi: 0x%x\n",
> +		 cmd[1], data->dw13.src_clear_address_hi0);
> +	igt_info(" dw14: [%08x] dst <compression fmt: %d, clear value enable: %d, "
> +		 "clear address low: 0x%x>\n",
> +		 cmd[2],
> +		 data->dw14.dst_compression_format,
> +		 data->dw14.dst_clear_value_enable,
> +		 data->dw14.dst_clear_address_low);
> +	igt_info(" dw15: [%08x] dst clear address hi: 0x%x\n",
> +		 cmd[3], data->dw15.dst_clear_address_hi0);
> +	igt_info(" dw16: [%08x] dst surface <width: %d, height: %d, type: %d>\n",
> +		 cmd[4], data->dw16.dst_surface_width,
> +		 data->dw16.dst_surface_height, data->dw16.dst_surface_type);
> +	igt_info(" dw17: [%08x] dst surface <lod: %d, depth: %d, qpitch: %d>\n",
> +		 cmd[5], data->dw17.dst_lod,
> +		 data->dw17.dst_surface_depth, data->dw17.dst_surface_qpitch);
> +	igt_info(" dw18: [%08x] dst <halign: %d, valign: %d, mip tail: %d, "
> +		 "depth stencil: %d, array index: %d>\n",
> +		 cmd[6],
> +		 data->dw18.dst_horizontal_align,
> +		 data->dw18.dst_vertical_align,
> +		 data->dw18.dst_mip_tail_start_lod,
> +		 data->dw18.dst_depth_stencil_resource,
> +		 data->dw18.dst_array_index);
> +
> +	igt_info(" dw19: [%08x] src surface <width: %d, height: %d, type: %d>\n",
> +		 cmd[7], data->dw19.src_surface_width,
> +		 data->dw19.src_surface_height, data->dw19.src_surface_type);
> +	igt_info(" dw20: [%08x] src surface <lod: %d, depth: %d, qpitch: %d>\n",
> +		 cmd[8], data->dw20.src_lod,
> +		 data->dw20.src_surface_depth, data->dw20.src_surface_qpitch);
> +	igt_info(" dw21: [%08x] src <halign: %d, valign: %d, mip tail: %d, "
> +		 "depth stencil: %d, array index: %d>\n",
> +		 cmd[9],
> +		 data->dw21.src_horizontal_align,
> +		 data->dw21.src_vertical_align,
> +		 data->dw21.src_mip_tail_start_lod,
> +		 data->dw21.src_depth_stencil_resource,
> +		 data->dw21.src_array_index);
> +}
> +
> +/**
> + * blt_block_copy:
> + * @i915: drm fd
> + * @ctx: intel_ctx_t context
> + * @e: blitter engine for @ctx
> + * @ahnd: allocator handle
> + * @blt: basic blitter data (for TGL/DG1 which doesn't support ext version)
> + * @ext: extended blitter data (for DG2+, supports flatccs compression)
> + *
> + * Function does blit between @src and @dst described in @blt object.
> + *
> + * Returns:
> + * execbuffer status.
> + */
> +int blt_block_copy(int i915,
> +		   const intel_ctx_t *ctx,
> +		   const struct intel_execution_engine2 *e,
> +		   uint64_t ahnd,
> +		   const struct blt_copy_data *blt,
> +		   const struct blt_block_copy_data_ext *ext)
> +{
> +	struct drm_i915_gem_execbuffer2 execbuf = {};
> +	struct drm_i915_gem_exec_object2 obj[3] = {};
> +	struct gen12_block_copy_data data = {};
> +	struct gen12_block_copy_data_ext dext = {};
> +	uint64_t dst_offset, src_offset, bb_offset, alignment;
> +	uint32_t *bb;
> +	int i, ret;
> +
> +	igt_assert_f(ahnd, "block-copy supports softpin only\n");
> +	igt_assert_f(blt, "block-copy requires data to do blit\n");
> +
> +	alignment = gem_detect_safe_alignment(i915);
> +	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> +	if (__special_mode(blt) == SM_FULL_RESOLVE)
> +		dst_offset = src_offset;
> +	else
> +		dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
> +	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> +
> +	fill_data(&data, blt, src_offset, dst_offset, ext);
> +
> +	i = sizeof(data) / sizeof(uint32_t);
> +	bb = gem_mmap__device_coherent(i915, blt->bb.handle, 0, blt->bb.size,
> +				       PROT_READ | PROT_WRITE);
> +	memcpy(bb, &data, sizeof(data));
> +
> +	if (ext) {
> +		fill_data_ext(i915, &dext, ext);
> +		memcpy(bb + i, &dext, sizeof(dext));
> +		i += sizeof(dext) / sizeof(uint32_t);
> +	}
> +	bb[i++] = MI_BATCH_BUFFER_END;
> +
> +	if (blt->print_bb) {
> +		igt_info("[BLOCK COPY]\n");
> +		igt_info("src offset: %llx, dst offset: %llx, bb offset: %llx\n",
> +			 (long long) src_offset, (long long) dst_offset,
> +			 (long long) bb_offset);
> +
> +		dump_bb_cmd(&data);
> +		if (ext)
> +			dump_bb_ext(&dext);
> +	}
> +
> +	munmap(bb, blt->bb.size);
> +
> +	obj[0].offset = CANONICAL(dst_offset);
> +	obj[1].offset = CANONICAL(src_offset);
> +	obj[2].offset = CANONICAL(bb_offset);
> +	obj[0].handle = blt->dst.handle;
> +	obj[1].handle = blt->src.handle;
> +	obj[2].handle = blt->bb.handle;
> +	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> +		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +	execbuf.buffer_count = 3;
> +	execbuf.buffers_ptr = to_user_pointer(obj);
> +	execbuf.rsvd1 = ctx ? ctx->id : 0;
> +	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> +	ret = __gem_execbuf(i915, &execbuf);
> +	if (data.dw00.special_mode != SM_FULL_RESOLVE)
> +		put_offset(ahnd, blt->dst.handle);
> +	put_offset(ahnd, blt->src.handle);
> +	put_offset(ahnd, blt->bb.handle);
> +
> +	return ret;
> +}
> +
> +static uint16_t __ccs_size(const struct blt_ctrl_surf_copy_data *surf)
> +{
> +	uint32_t src_size, dst_size;
> +
> +	src_size = surf->src.access_type == DIRECT_ACCESS ?
> +				surf->src.size : surf->src.size / CCS_RATIO;
> +
> +	dst_size = surf->dst.access_type == DIRECT_ACCESS ?
> +				surf->dst.size : surf->dst.size / CCS_RATIO;
> +
> +	igt_assert_f(src_size <= dst_size, "dst size must be >= src size for CCS copy\n");
> +
> +	return src_size;
> +}
> +
> +struct gen12_ctrl_surf_copy_data {
> +	struct {
> +		uint32_t length:			BITRANGE(0, 7);
> +		uint32_t size_of_ctrl_copy:		BITRANGE(8, 17);
> +		uint32_t rsvd0:				BITRANGE(18, 19);
> +		uint32_t dst_access_type:		BITRANGE(20, 20);
> +		uint32_t src_access_type:		BITRANGE(21, 21);
> +		uint32_t opcode:			BITRANGE(22, 28);
> +		uint32_t client:			BITRANGE(29, 31);
> +	} dw00;
> +
> +	struct {
> +		uint32_t src_address_lo;
> +	} dw01;
> +
> +	struct {
> +		uint32_t src_address_hi:		BITRANGE(0, 24);
> +		uint32_t src_mocs:			BITRANGE(25, 31);
> +	} dw02;
> +
> +	struct {
> +		uint32_t dst_address_lo;
> +	} dw03;
> +
> +	struct {
> +		uint32_t dst_address_hi:		BITRANGE(0, 24);
> +		uint32_t dst_mocs:			BITRANGE(25, 31);
> +	} dw04;
> +};
> +
> +static void dump_bb_surf_ctrl_cmd(const struct gen12_ctrl_surf_copy_data *data)
> +{
> +	uint32_t *cmd = (uint32_t *) data;
> +
> +	igt_info("details:\n");
> +	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, "
> +		 "src/dst access type: <%d, %d>, size of ctrl copy: %u, length: %d>\n",
> +		 cmd[0],
> +		 data->dw00.client, data->dw00.opcode,
> +		 data->dw00.src_access_type, data->dw00.dst_access_type,
> +		 data->dw00.size_of_ctrl_copy, data->dw00.length);
> +	igt_info(" dw01: [%08x] src offset lo (0x%x)\n",
> +		 cmd[1], data->dw01.src_address_lo);
> +	igt_info(" dw02: [%08x] src offset hi (0x%x), src mocs: %u\n",
> +		 cmd[2], data->dw02.src_address_hi, data->dw02.src_mocs);
> +	igt_info(" dw03: [%08x] dst offset lo (0x%x)\n",
> +		 cmd[3], data->dw03.dst_address_lo);
> +	igt_info(" dw04: [%08x] dst offset hi (0x%x), dst mocs: %u\n",
> +		 cmd[4], data->dw04.dst_address_hi, data->dw04.dst_mocs);
> +}
> +
> +/**
> + * blt_ctrl_surf_copy:
> + * @i915: drm fd
> + * @ctx: intel_ctx_t context
> + * @e: blitter engine for @ctx
> + * @ahnd: allocator handle
> + * @surf: blitter data for ctrl-surf-copy
> + *
> + * Function does ctrl-surf-copy blit between @src and @dst described in
> + * @surf object.
> + *
> + * Returns:
> + * execbuffer status.
> + */
> +int blt_ctrl_surf_copy(int i915,
> +		       const intel_ctx_t *ctx,
> +		       const struct intel_execution_engine2 *e,
> +		       uint64_t ahnd,
> +		       const struct blt_ctrl_surf_copy_data *surf)
> +{
> +	struct drm_i915_gem_execbuffer2 execbuf = {};
> +	struct drm_i915_gem_exec_object2 obj[3] = {};
> +	struct gen12_ctrl_surf_copy_data data = {};
> +	uint64_t dst_offset, src_offset, bb_offset, alignment;
> +	uint32_t *bb;
> +	int i;
> +
> +	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
> +	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
> +
> +	alignment = gem_detect_safe_alignment(i915);
> +
> +	data.dw00.client = 0x2;
> +	data.dw00.opcode = 0x48;
> +	data.dw00.src_access_type = surf->src.access_type;
> +	data.dw00.dst_access_type = surf->dst.access_type;
> +
> +	/* Ensure dst has size capable to keep src ccs aux */
> +	data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;

Shouldn't this be size_of_surface / 256 - 1?
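
A minimal sketch of the calculation suggested above, assuming the field is meant to hold the size in 256-byte units minus one. The helper name ctrl_copy_size_field and the CTRL_COPY_FIELD_BITS constant are hypothetical; the 10-bit width is taken from the dw00.size_of_ctrl_copy bit-field (bits 8..17) in this patch.

```c
#include <assert.h>
#include <stdint.h>

#define CCS_RATIO 256
/* dw00.size_of_ctrl_copy spans bits 8..17 in the patch, i.e. 10 bits. */
#define CTRL_COPY_FIELD_BITS 10

/*
 * Hypothetical helper: value for the size field under the formula
 * proposed in the review, size / 256 - 1.
 */
static uint32_t ctrl_copy_size_field(uint64_t size)
{
	uint64_t units = size / CCS_RATIO;

	/* At least one 256-byte unit is needed, otherwise "- 1" underflows. */
	assert(units >= 1);
	/* The encoded value must fit in the 10-bit field. */
	assert(units - 1 < (1u << CTRL_COPY_FIELD_BITS));

	return (uint32_t)(units - 1);
}
```

With this reading a 4096-byte input encodes as 15 and 262144 bytes encodes as 1023, the largest value the field can hold. Anything larger would trip the second assert instead of being silently truncated.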

> +	data.dw00.length = 0x3;
> +
> +	src_offset = get_offset(ahnd, surf->src.handle, surf->src.size,
> +				alignment);
> +	dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size,
> +				alignment);
> +	bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size,
> +			       alignment);
> +
> +	data.dw01.src_address_lo = src_offset;

This should be 4K aligned (or 64KB for indirect).
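
One way to express that constraint, as a rough sketch only: the enum mirrors blt_access_type from i915_blt.h in this patch, while ctrl_surf_required_alignment() and ctrl_surf_offset_ok() are hypothetical names that do not exist in the library. The 4K/64KB values are the ones named in the comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors enum blt_access_type from the patch's i915_blt.h. */
enum blt_access_type {
	INDIRECT_ACCESS,
	DIRECT_ACCESS,
};

/* Hypothetical helper: alignment required for the given access type. */
static uint64_t ctrl_surf_required_alignment(enum blt_access_type type)
{
	return type == DIRECT_ACCESS ? 4096 : 65536;
}

/* Hypothetical helper: true when offset satisfies that alignment. */
static bool ctrl_surf_offset_ok(uint64_t offset, enum blt_access_type type)
{
	uint64_t alignment = ctrl_surf_required_alignment(type);

	/* Both alignments are powers of two, so a mask test is sufficient. */
	return (offset & (alignment - 1)) == 0;
}
```

The result could feed an igt_assert_f() on src_offset and dst_offset, or the required alignment could be passed to get_offset() in place of the generic safe alignment.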

> +	data.dw02.src_address_hi = src_offset >> 32;
> +	data.dw02.src_mocs = surf->src.mocs;
> +
> +	data.dw03.dst_address_lo = dst_offset;
> +	data.dw04.dst_address_hi = dst_offset >> 32;
> +	data.dw04.dst_mocs = surf->dst.mocs;
> +
> +	i = sizeof(data) / sizeof(uint32_t);
> +	bb = gem_mmap__device_coherent(i915, surf->bb.handle, 0, surf->bb.size,
> +				       PROT_READ | PROT_WRITE);
> +	memcpy(bb, &data, sizeof(data));
> +	bb[i++] = MI_BATCH_BUFFER_END;
> +
> +	if (surf->print_bb) {
> +		igt_info("BB [CTRL SURF]:\n");
> +		igt_info("src offset: %llx, dst offset: %llx, bb offset: %llx\n",
> +			 (long long) src_offset, (long long) dst_offset,
> +			 (long long) bb_offset);
> +
> +		dump_bb_surf_ctrl_cmd(&data);
> +	}
> +	munmap(bb, surf->bb.size);
> +
> +	obj[0].offset = CANONICAL(dst_offset);
> +	obj[1].offset = CANONICAL(src_offset);
> +	obj[2].offset = CANONICAL(bb_offset);
> +	obj[0].handle = surf->dst.handle;
> +	obj[1].handle = surf->src.handle;
> +	obj[2].handle = surf->bb.handle;
> +	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> +		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +	execbuf.buffer_count = 3;
> +	execbuf.buffers_ptr = to_user_pointer(obj);
> +	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> +	execbuf.rsvd1 = ctx ? ctx->id : 0;
> +	gem_execbuf(i915, &execbuf);
> +	put_offset(ahnd, surf->dst.handle);
> +	put_offset(ahnd, surf->src.handle);
> +	put_offset(ahnd, surf->bb.handle);
> +
> +	return 0;
> +}
> +
> +struct gen12_fast_copy_data {
> +	struct {
> +		uint32_t length:			BITRANGE(0, 7);
> +		uint32_t rsvd1:				BITRANGE(8, 12);
> +		uint32_t dst_tiling:			BITRANGE(13, 14);
> +		uint32_t rsvd0:				BITRANGE(15, 19);
> +		uint32_t src_tiling:			BITRANGE(20, 21);
> +		uint32_t opcode:			BITRANGE(22, 28);
> +		uint32_t client:			BITRANGE(29, 31);
> +	} dw00;
> +
> +	struct {
> +		uint32_t dst_pitch:			BITRANGE(0, 15);
> +		uint32_t rsvd1:				BITRANGE(16, 23);
> +		uint32_t color_depth:			BITRANGE(24, 26);
> +		uint32_t rsvd0:				BITRANGE(27, 27);
> +		uint32_t dst_memory:			BITRANGE(28, 28);
> +		uint32_t src_memory:			BITRANGE(29, 29);
> +		uint32_t dst_type_y:			BITRANGE(30, 30);
> +		uint32_t src_type_y:			BITRANGE(31, 31);
> +	} dw01;
> +
> +	struct {
> +		int32_t dst_x1:				BITRANGE(0, 15);
> +		int32_t dst_y1:				BITRANGE(16, 31);
> +	} dw02;
> +
> +	struct {
> +		int32_t dst_x2:				BITRANGE(0, 15);
> +		int32_t dst_y2:				BITRANGE(16, 31);
> +	} dw03;
> +
> +	struct {
> +		uint32_t dst_address_lo;
> +	} dw04;
> +
> +	struct {
> +		uint32_t dst_address_hi;
> +	} dw05;
> +
> +	struct {
> +		int32_t src_x1:				BITRANGE(0, 15);
> +		int32_t src_y1:				BITRANGE(16, 31);
> +	} dw06;
> +
> +	struct {
> +		uint32_t src_pitch:			BITRANGE(0, 15);
> +		uint32_t rsvd0:				BITRANGE(16, 31);
> +	} dw07;
> +
> +	struct {
> +		uint32_t src_address_lo;
> +	} dw08;
> +
> +	struct {
> +		uint32_t src_address_hi;
> +	} dw09;
> +};
> +
> +static int __fast_tiling(enum blt_tiling tiling)
> +{
> +	switch (tiling) {
> +	case T_LINEAR: return 0;
> +	case T_XMAJOR: return 1;
> +	case T_YMAJOR: return 2;
> +	case T_TILE4:  return 2;
> +	case T_TILE64: return 3;
> +	}
> +	return 0;
> +}
> +
> +static int __fast_color_depth(enum blt_color_depth depth)
> +{
> +	switch (depth) {
> +	case CD_8bit:   return 0;
> +	case CD_16bit:  return 1;
> +	case CD_32bit:  return 3;
> +	case CD_64bit:  return 4;
> +	case CD_96bit:
> +		igt_assert_f(0, "Unsupported depth\n");
> +		break;
> +	case CD_128bit: return 5;
> +	};
> +	return 0;
> +}
> +
> +static void dump_bb_fast_cmd(struct gen12_fast_copy_data *data)
> +{
> +	uint32_t *cmd = (uint32_t *) data;
> +
> +	igt_info("BB details:\n");
> +	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, src tiling: %d, "
> +		 "dst tiling: %d, length: %d>\n",
> +		 cmd[0], data->dw00.client, data->dw00.opcode,
> +		 data->dw00.src_tiling, data->dw00.dst_tiling, data->dw00.length);
> +	igt_info(" dw01: [%08x] dst <pitch: %d, color depth: %d, dst memory: %d, "
> +		 "src memory: %d, \n"
-------------------------------- ^
Remove space before "\n".

> +		 "\t\t\tdst type tile: %d (0-legacy, 1-tile4),\n"
> +		 "\t\t\tsrc type tile: %d (0-legacy, 1-tile4)>\n",
> +		 cmd[1], data->dw01.dst_pitch, data->dw01.color_depth,
> +		 data->dw01.dst_memory, data->dw01.src_memory,
> +		 data->dw01.dst_type_y, data->dw01.src_type_y);
> +	igt_info(" dw02: [%08x] dst geom <x1: %d, y1: %d>\n",
> +		 cmd[2], data->dw02.dst_x1, data->dw02.dst_y1);
> +	igt_info(" dw03: [%08x]          <x2: %d, y2: %d>\n",
> +		 cmd[3], data->dw03.dst_x2, data->dw03.dst_y2);
> +	igt_info(" dw04: [%08x] dst offset lo (0x%x)\n",
> +		 cmd[4], data->dw04.dst_address_lo);
> +	igt_info(" dw05: [%08x] dst offset hi (0x%x)\n",
> +		 cmd[5], data->dw05.dst_address_hi);
> +	igt_info(" dw06: [%08x] src geom <x1: %d, y1: %d>\n",
> +		 cmd[6], data->dw06.src_x1, data->dw06.src_y1);
> +	igt_info(" dw07: [%08x] src <pitch: %d>\n",
> +		 cmd[7], data->dw07.src_pitch);
> +	igt_info(" dw08: [%08x] src offset lo (0x%x)\n",
> +		 cmd[8], data->dw08.src_address_lo);
> +	igt_info(" dw09: [%08x] src offset hi (0x%x)\n",
> +		 cmd[9], data->dw09.src_address_hi);
> +}
> +
> +/**
> + * blt_fast_copy:
> + * @i915: drm fd
> + * @ctx: intel_ctx_t context
> + * @e: blitter engine for @ctx
> + * @ahnd: allocator handle
> + * @blt: blitter data for fast-copy (same as for block-copy but doesn't use
> + * compression fields).
> + *
> + * Function does fast blit between @src and @dst described in @blt object.
> + *
> + * Returns:
> + * execbuffer status.
> + */
> +int blt_fast_copy(int i915,
> +		  const intel_ctx_t *ctx,
> +		  const struct intel_execution_engine2 *e,
> +		  uint64_t ahnd,
> +		  const struct blt_copy_data *blt)
> +{
> +	struct drm_i915_gem_execbuffer2 execbuf = {};
> +	struct drm_i915_gem_exec_object2 obj[3] = {};
> +	struct gen12_fast_copy_data data = {};
> +	uint64_t dst_offset, src_offset, bb_offset, alignment;
> +	uint32_t *bb;
> +	int i, ret;
> +
> +	alignment = gem_detect_safe_alignment(i915);
> +
> +	data.dw00.client = 0x2;
> +	data.dw00.opcode = 0x42;
> +	data.dw00.dst_tiling = __fast_tiling(blt->dst.tiling);
> +	data.dw00.src_tiling = __fast_tiling(blt->src.tiling);
> +	data.dw00.length = 8;
> +
> +	data.dw01.dst_pitch = blt->dst.pitch;
> +	data.dw01.color_depth = __fast_color_depth(blt->color_depth);
> +	data.dw01.dst_memory = __memory_type(blt->dst.region);
> +	data.dw01.src_memory = __memory_type(blt->src.region);
> +	data.dw01.dst_type_y = blt->dst.tiling == T_TILE4 ? 1 : 0;
> +	data.dw01.src_type_y = blt->src.tiling == T_TILE4 ? 1 : 0;
> +
> +	data.dw02.dst_x1 = blt->dst.x1;
> +	data.dw02.dst_y1 = blt->dst.y1;
> +
> +	data.dw03.dst_x2 = blt->dst.x2;
> +	data.dw03.dst_y2 = blt->dst.y2;
> +
> +	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
> +	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);

Both addresses should be aligned to 64 bytes; maybe add an assert here?
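
A possible shape for such a check, sketched under the assumption that a plain 64-byte mask test is all that is needed. FAST_COPY_ADDR_ALIGN and fast_copy_offsets_aligned() are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Alignment taken from the review comment; the name is hypothetical. */
#define FAST_COPY_ADDR_ALIGN 64ull

/* Hypothetical helper: true when both offsets are multiples of 64. */
static bool fast_copy_offsets_aligned(uint64_t src_offset, uint64_t dst_offset)
{
	/* OR-ing the offsets lets one mask test cover both of them. */
	return ((src_offset | dst_offset) & (FAST_COPY_ADDR_ALIGN - 1)) == 0;
}
```

In blt_fast_copy() the result could be wrapped in igt_assert_f() right after the two get_offset() calls.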

> +	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
> +
> +	data.dw04.dst_address_lo = dst_offset;
> +	data.dw05.dst_address_hi = dst_offset >> 32;
> +
> +	data.dw06.src_x1 = blt->src.x1;
> +	data.dw06.src_y1 = blt->src.y1;
> +
> +	data.dw07.src_pitch = blt->src.pitch;
> +
> +	data.dw08.src_address_lo = src_offset;
> +	data.dw09.src_address_hi = src_offset >> 32;
> +
> +	i = sizeof(data) / sizeof(uint32_t);
> +	bb = gem_mmap__device_coherent(i915, blt->bb.handle, 0, blt->bb.size,
> +				       PROT_READ | PROT_WRITE);
> +
> +	memcpy(bb, &data, sizeof(data));
> +	bb[i++] = MI_BATCH_BUFFER_END;
> +
> +	if (blt->print_bb) {
> +		igt_info("BB [FAST COPY]\n");
> +		igt_info("blit [src offset: %llx, dst offset: %llx]\n",
> +			 (long long) src_offset, (long long) dst_offset);
> +		dump_bb_fast_cmd(&data);
> +	}
> +
> +	munmap(bb, blt->bb.size);
> +
> +	obj[0].offset = CANONICAL(dst_offset);
> +	obj[1].offset = CANONICAL(src_offset);
> +	obj[2].offset = CANONICAL(bb_offset);
> +	obj[0].handle = blt->dst.handle;
> +	obj[1].handle = blt->src.handle;
> +	obj[2].handle = blt->bb.handle;
> +	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
> +		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
> +	execbuf.buffer_count = 3;
> +	execbuf.buffers_ptr = to_user_pointer(obj);
> +	execbuf.rsvd1 = ctx ? ctx->id : 0;
> +	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
> +	ret = __gem_execbuf(i915, &execbuf);
> +	put_offset(ahnd, blt->dst.handle);
> +	put_offset(ahnd, blt->src.handle);
> +	put_offset(ahnd, blt->bb.handle);
> +
> +	return ret;
> +}
> +
> +/**
> + * blt_surface_fill_rect:
> + * @i915: drm fd
> + * @obj: blitter copy object (@blt_copy_object) to fill with gradient pattern
> + * @width: width
> + * @height: height
> + *
> + * Function fills surface @width x @height * 24bpp with color gradient
> + * (internally uses ARGB where A == 0xff, see Cairo docs).
> + */
> +void blt_surface_fill_rect(int i915, const struct blt_copy_object *obj,
> +			   uint32_t width, uint32_t height)
> +{
> +	cairo_surface_t *surface;
> +	cairo_pattern_t *pat;
> +	cairo_t *cr;
> +	void *map = obj->ptr;
> +
> +	if (!map)
> +		map = gem_mmap__device_coherent(i915, obj->handle, 0,
> +						obj->size, PROT_READ | PROT_WRITE);
> +
> +	surface = cairo_image_surface_create_for_data(map,
> +						      CAIRO_FORMAT_RGB24,
> +						      width, height,
> +						      obj->pitch);
> +
> +	cr = cairo_create(surface);
> +
> +	cairo_rectangle(cr, 0, 0, width, height);
> +	cairo_clip(cr);
> +
> +	pat = cairo_pattern_create_mesh();
> +	cairo_mesh_pattern_begin_patch(pat);
> +	cairo_mesh_pattern_move_to(pat, 0, 0);
> +	cairo_mesh_pattern_line_to(pat, width, 0);
> +	cairo_mesh_pattern_line_to(pat, width, height);
> +	cairo_mesh_pattern_line_to(pat, 0, height);
> +	cairo_mesh_pattern_set_corner_color_rgb(pat, 0, 1.0, 0.0, 0.0);
> +	cairo_mesh_pattern_set_corner_color_rgb(pat, 1, 0.0, 1.0, 0.0);
> +	cairo_mesh_pattern_set_corner_color_rgb(pat, 2, 0.0, 0.0, 1.0);
> +	cairo_mesh_pattern_set_corner_color_rgb(pat, 3, 1.0, 1.0, 1.0);
> +	cairo_mesh_pattern_end_patch(pat);
> +
> +	cairo_rectangle(cr, 0, 0, width, height);
> +	cairo_set_source(cr, pat);
> +	cairo_fill(cr);
> +	cairo_pattern_destroy(pat);
> +
> +	cairo_destroy(cr);
> +
> +	cairo_surface_destroy(surface);
> +	if (!obj->ptr)
> +		munmap(map, obj->size);
> +}
> +
> +/**
> + * blt_surface_info:
> + * @info: information header
> + * @obj: blitter copy object (@blt_copy_object) to print surface info
> + */
> +void blt_surface_info(const char *info, const struct blt_copy_object *obj)
> +{
> +	igt_info("[%s]\n", info);
> +	igt_info("surface <handle: %u, size: %llx, region: %x, mocs: %x>\n",
> +		 obj->handle, (long long) obj->size, obj->region, obj->mocs);
> +	igt_info("        <tiling: %s, compression: %u, compression type: %d>\n",
> +		 blt_tiling_name(obj->tiling), obj->compression, obj->compression_type);
> +	igt_info("        <pitch: %u, offset [x: %u, y: %u] geom [<%d,%d> <%d,%d>]>\n",
> +		 obj->pitch, obj->x_offset, obj->y_offset,
> +		 obj->x1, obj->y1, obj->x2, obj->y2);
> +}
> +
> +/**
> + * blt_surface_to_png:
> + * @i915: drm fd
> + * @run_id: prefix id to allow grouping files stored from single run
> + * @fileid: file identifier
> + * @obj: blitter copy object (@blt_copy_object) to save to png
> + * @width: width
> + * @height: height
> + *
> + * Function saves surface to png file. Assumes ARGB format where A == 0xff.
> + */
> +void blt_surface_to_png(int i915, uint32_t run_id, const char *fileid,
> +			const struct blt_copy_object *obj,
> +			uint32_t width, uint32_t height)
> +{
> +	cairo_surface_t *surface;
> +	cairo_status_t ret;
> +	uint8_t *map = (uint8_t *) obj->ptr;
> +	int format;
> +	int stride = obj->tiling ? obj->pitch * 4 : obj->pitch;
> +	char filename[FILENAME_MAX];
> +
> +	snprintf(filename, FILENAME_MAX-1, "%d-%s-%s-%ux%u-%s.png",
> +		 run_id, fileid, blt_tiling_name(obj->tiling), width, height,
> +		 obj->compression ? "compressed" : "uncompressed");
> +
> +	if (!map)
> +		map = gem_mmap__device_coherent(i915, obj->handle, 0,
> +						obj->size, PROT_READ);
> +	format = CAIRO_FORMAT_RGB24;
> +	surface = cairo_image_surface_create_for_data(map,
> +						      format, width, height,
> +						      stride);
> +	ret = cairo_surface_write_to_png(surface, filename);
> +	if (ret)
> +		igt_info("Cairo ret: %d (%s)\n", ret, cairo_status_to_string(ret));
> +	igt_assert(ret == CAIRO_STATUS_SUCCESS);
> +	cairo_surface_destroy(surface);
> +
> +	if (!obj->ptr)
> +		munmap(map, obj->size);
> +}
> diff --git a/lib/i915/i915_blt.h b/lib/i915/i915_blt.h
> new file mode 100644
> index 000000000..e0e8b52bc
> --- /dev/null
> +++ b/lib/i915/i915_blt.h
> @@ -0,0 +1,196 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2022 Intel Corporation
> + */
> +
> +/**
> + * SECTION:i915_blt
> + * @short_description: i915 blitter library
> + * @title: Blitter library
> + * @include: i915_blt.h
> + *
> + * # Introduction
> + *
> + * Gen12+ blitter commands like XY_BLOCK_COPY_BLT are quite long. Passing
> + * every field as a separate function argument would make the argument list
> + * long, unreadable and prone to misplaced arguments.
> + * Providing objects (structs) is more reasonable and makes it possible to
> + * share object data across different blitter commands.
> + *
> + * The blitter library supports no-reloc (softpin) mode only (apart from TGL,
> + * relocations are not enabled), thus ahnd is mandatory. Providing a NULL ctx
> + * means the default context is used with I915_EXEC_BLT as the execution engine.
> + *
> + * The library introduces a tiling enum which distinguishes tiling formats
> + * independently of legacy I915_TILING_... definitions. This gives full control
> + * over which tilings a command handles; unsupported ones can be skipped/asserted.
> + *
> + * # Supported commands
> + *
> + * - XY_BLOCK_COPY_BLT - (block-copy) TGL/DG1 + DG2+ (ext version)
> + * - XY_FAST_COPY_BLT - (fast-copy)
> + * - XY_CTRL_SURF_COPY_BLT - (ctrl-surf-copy) DG2+
> + *
> + * # Usage details
> + *
> + * For block-copy and fast-copy the @blt_copy_object struct is used to collect
> + * data about source and destination objects. It contains the handle, region,
> + * size, etc. which are used for blits. Some fields (like compression) are not
> + * used by fast-copy; a field used exclusively by one command is annotated
> + * as such in its comment.
> + *
> + */
> +
> +#include <errno.h>
> +#include <sys/ioctl.h>
> +#include <sys/time.h>
> +#include <malloc.h>
> +#include "drm.h"
> +#include "igt.h"
> +
> +#define CCS_RATIO 256
> +
> +enum blt_color_depth {
> +	CD_8bit,
> +	CD_16bit,
> +	CD_32bit,
> +	CD_64bit,
> +	CD_96bit,
> +	CD_128bit,
> +};
> +
> +enum blt_tiling {
> +	T_LINEAR,
> +	T_XMAJOR,
> +	T_YMAJOR,
> +	T_TILE4,
> +	T_TILE64,
> +};
> +
> +enum blt_compression {
> +	COMPRESSION_DISABLED,
> +	COMPRESSION_ENABLED,
> +};
> +
> +enum blt_compression_type {
> +	COMPRESSION_TYPE_3D,
> +	COMPRESSION_TYPE_MEDIA,
> +};
> +
> +/* BC - block-copy */
> +struct blt_copy_object {
> +	uint32_t handle;
> +	uint32_t region;
> +	uint64_t size;
> +	uint8_t mocs;
> +	enum blt_tiling tiling;
> +	enum blt_compression compression;  /* BC only */
> +	enum blt_compression_type compression_type; /* BC only */
> +	uint32_t pitch;
> +	uint16_t x_offset, y_offset;
> +	int16_t x1, y1, x2, y2;
> +
> +	/* mapping or null */
> +	uint32_t *ptr;
> +};
> +
> +struct blt_copy_batch {
> +	uint32_t handle;
> +	uint32_t region;
> +	uint64_t size;
> +};
> +
> +/* Common for block-copy and fast-copy */
> +struct blt_copy_data {
> +	int i915;
> +	struct blt_copy_object src;
> +	struct blt_copy_object dst;
> +	struct blt_copy_batch bb;
> +	enum blt_color_depth color_depth;
> +
> +	/* debug stuff */
> +	bool print_bb;
> +};
> +
> +enum blt_surface_type {
> +	SURFACE_TYPE_1D,
> +	SURFACE_TYPE_2D,
> +	SURFACE_TYPE_3D,
> +	SURFACE_TYPE_CUBE,
> +};
> +
> +struct blt_block_copy_object_ext {
> +	uint8_t compression_format;
> +	bool clear_value_enable;
> +	uint64_t clear_address;
> +	uint16_t surface_width;
> +	uint16_t surface_height;
> +	enum blt_surface_type surface_type;
> +	uint16_t surface_qpitch;
> +	uint16_t surface_depth;
> +	uint8_t lod;
> +	uint8_t horizontal_align;
> +	uint8_t vertical_align;
> +	uint8_t mip_tail_start_lod;
> +	bool depth_stencil_resource;
> +	uint16_t array_index;
> +};
> +
> +struct blt_block_copy_data_ext {
> +	struct blt_block_copy_object_ext src;
> +	struct blt_block_copy_object_ext dst;
> +};
> +
> +enum blt_access_type {
> +	INDIRECT_ACCESS,
> +	DIRECT_ACCESS,
> +};
> +
> +struct blt_ctrl_surf_copy_object {
> +	uint32_t handle;
> +	uint32_t region;
> +	uint64_t size;
> +	uint8_t mocs;
> +	enum blt_access_type access_type;
> +};
> +
> +struct blt_ctrl_surf_copy_data {
> +	int i915;
> +	struct blt_ctrl_surf_copy_object src;
> +	struct blt_ctrl_surf_copy_object dst;
> +	struct blt_copy_batch bb;
> +
> +	/* debug stuff */
> +	bool print_bb;
> +};
> +
> +bool blt_supports_compression(int i915);
> +bool blt_supports_tiling(int i915, enum blt_tiling tiling);
> +const char *blt_tiling_name(enum blt_tiling tiling);
> +
> +int blt_block_copy(int i915,
> +		   const intel_ctx_t *ctx,
> +		   const struct intel_execution_engine2 *e,
> +		   uint64_t ahnd,
> +		   const struct blt_copy_data *blt,
> +		   const struct blt_block_copy_data_ext *ext);
> +
> +int blt_ctrl_surf_copy(int i915,
> +		       const intel_ctx_t *ctx,
> +		       const struct intel_execution_engine2 *e,
> +		       uint64_t ahnd,
> +		       const struct blt_ctrl_surf_copy_data *surf);
> +
> +int blt_fast_copy(int i915,
> +		  const intel_ctx_t *ctx,
> +		  const struct intel_execution_engine2 *e,
> +		  uint64_t ahnd,
> +		  const struct blt_copy_data *blt);
> +
> +void blt_surface_info(const char *info,
> +		      const struct blt_copy_object *obj);
> +void blt_surface_fill_rect(int i915, const struct blt_copy_object *obj,
> +			   uint32_t width, uint32_t height);
> +void blt_surface_to_png(int i915, uint32_t run_id, const char *fileid,
> +		    const struct blt_copy_object *obj,
> +		    uint32_t width, uint32_t height);
> diff --git a/lib/meson.build b/lib/meson.build
> index 3e43316d1..fe035672e 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -11,6 +11,7 @@ lib_sources = [
>  	'i915/gem_mman.c',
>  	'i915/gem_vm.c',
>  	'i915/intel_memory_region.c',
> +	'i915/i915_blt.c',
>  	'igt_collection.c',
>  	'igt_color_encoding.c',
>  	'igt_debugfs.c',
> -- 
> 2.32.0
> 
Regards,
Kamil Konieczny


* [igt-dev] [PATCH i-g-t 2/3] lib/i915_blt: Add library for blitter
  2022-02-28 19:51 [igt-dev] [PATCH i-g-t 0/3] tests/i915: ccs subtests for gem_lmem_swapping Ramalingam C
@ 2022-02-28 19:51 ` Ramalingam C
  0 siblings, 0 replies; 12+ messages in thread
From: Ramalingam C @ 2022-02-28 19:51 UTC (permalink / raw)
  To: igt-dev

From: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>

Blitter commands have become complicated, so manual bitshifting is error
prone and hard to debug. XY_BLOCK_COPY_BLT is the best example: in its
extended version (for DG2+) it takes 20 dwords of command data.
To avoid mistakes and dozens of arguments per command, the library takes
its input data in a more structured form.
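
To illustrate the difference, here is a small sketch that builds the first dword of the ctrl-surf-copy command both ways. The bit-field layout is copied from gen12_ctrl_surf_copy_data in this patch; the struct name ctrl_surf_dw00 and the helpers dw00_by_shifts() and dw00_by_bitfields() are hypothetical. Bit-field placement is implementation-defined in C, so the sketch assumes GCC or Clang on a little-endian target, where the first declared field occupies the lowest bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BITRANGE(start, end) ((end) - (start) + 1)

/* dw00 layout as declared in gen12_ctrl_surf_copy_data. */
struct ctrl_surf_dw00 {
	uint32_t length:		BITRANGE(0, 7);
	uint32_t size_of_ctrl_copy:	BITRANGE(8, 17);
	uint32_t rsvd0:			BITRANGE(18, 19);
	uint32_t dst_access_type:	BITRANGE(20, 20);
	uint32_t src_access_type:	BITRANGE(21, 21);
	uint32_t opcode:		BITRANGE(22, 28);
	uint32_t client:		BITRANGE(29, 31);
};

/* Manual variant: every shift has to match the layout by hand. */
static uint32_t dw00_by_shifts(uint32_t client, uint32_t opcode,
			       uint32_t src_access, uint32_t dst_access,
			       uint32_t size, uint32_t length)
{
	return (client << 29) | (opcode << 22) | (src_access << 21) |
	       (dst_access << 20) | (size << 8) | length;
}

/* Structured variant: fields are assigned by name. */
static uint32_t dw00_by_bitfields(uint32_t client, uint32_t opcode,
				  uint32_t src_access, uint32_t dst_access,
				  uint32_t size, uint32_t length)
{
	struct ctrl_surf_dw00 dw = { 0 };
	uint32_t raw;

	dw.client = client;
	dw.opcode = opcode;
	dw.src_access_type = src_access;
	dw.dst_access_type = dst_access;
	dw.size_of_ctrl_copy = size;
	dw.length = length;

	/* Copy out the raw dword without violating aliasing rules. */
	memcpy(&raw, &dw, sizeof(raw));
	return raw;
}
```

Under the stated assumption both helpers produce the same dword for the same inputs, but only the structured one lets a dump routine print each field by name.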

Currently supported commands:
- XY_BLOCK_COPY_BLT:
  a)  TGL/DG1 uses shorter version of command which doesn't support
      compression
  b)  DG2+ command is extended and supports compression
- XY_CTRL_SURF_COPY_BLT
- XY_FAST_COPY_BLT

Source, destination and batchbuffer are provided to blitter functions
as objects (structs). This increases readability and allows the same
object to be used in many functions. The only drawback of this approach
is that some fields used in one function may be ignored in another. An
example is blt_copy_object, which contains a lot of information about a
gem object. Block-copy uses all of the data, but fast-copy uses only
some of it (fast-copy doesn't support compression).

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 lib/i915/i915_blt.c | 992 ++++++++++++++++++++++++++++++++++++++++++++
 lib/i915/i915_blt.h | 157 +++++++
 lib/meson.build     |   1 +
 3 files changed, 1150 insertions(+)
 create mode 100644 lib/i915/i915_blt.c
 create mode 100644 lib/i915/i915_blt.h

diff --git a/lib/i915/i915_blt.c b/lib/i915/i915_blt.c
new file mode 100644
index 000000000000..7079802b68c4
--- /dev/null
+++ b/lib/i915/i915_blt.c
@@ -0,0 +1,992 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include <errno.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <malloc.h>
+#include <cairo.h>
+#include "drm.h"
+#include "igt.h"
+#include "gem_create.h"
+#include "i915_blt.h"
+
+#define BITRANGE(start, end) ((end) - (start) + 1)
+
+enum blt_special_mode {
+	SM_NONE,
+	SM_FULL_RESOLVE,
+	SM_PARTIAL_RESOLVE,
+	SM_RESERVED,
+};
+
+enum blt_aux_mode {
+	AM_AUX_NONE,
+	AM_AUX_CCS_E = 5,
+};
+
+enum blt_target_mem {
+	TM_LOCAL_MEM,
+	TM_SYSTEM_MEM,
+};
+
+struct gen12_block_copy_data {
+	struct {
+		uint32_t length:			BITRANGE(0, 7);
+		uint32_t rsvd1:				BITRANGE(8, 8);
+		uint32_t multisamples:			BITRANGE(9, 11);
+		uint32_t special_mode:			BITRANGE(12, 13);
+		uint32_t rsvd0:				BITRANGE(14, 18);
+		uint32_t color_depth:			BITRANGE(19, 21);
+		uint32_t opcode:			BITRANGE(22, 28);
+		uint32_t client:			BITRANGE(29, 31);
+	} dw00;
+
+	struct {
+		uint32_t dst_pitch:			BITRANGE(0, 17);
+		uint32_t dst_aux_mode:			BITRANGE(18, 20);
+		uint32_t dst_mocs:			BITRANGE(21, 27);
+		uint32_t dst_ctrl_surface_type:		BITRANGE(28, 28);
+		uint32_t dst_compression:		BITRANGE(29, 29);
+		uint32_t dst_tiling:			BITRANGE(30, 31);
+	} dw01;
+
+	struct {
+		int32_t dst_x1:				BITRANGE(0, 15);
+		int32_t dst_y1:				BITRANGE(16, 31);
+	} dw02;
+
+	struct {
+		int32_t dst_x2:				BITRANGE(0, 15);
+		int32_t dst_y2:				BITRANGE(16, 31);
+	} dw03;
+
+	struct {
+		uint32_t dst_address_lo;
+	} dw04;
+
+	struct {
+		uint32_t dst_address_hi;
+	} dw05;
+
+	struct {
+		uint32_t dst_x_offset:			BITRANGE(0, 13);
+		uint32_t rsvd1:				BITRANGE(14, 15);
+		uint32_t dst_y_offset:			BITRANGE(16, 29);
+		uint32_t rsvd0:				BITRANGE(30, 30);
+		uint32_t dst_target_memory:		BITRANGE(31, 31);
+	} dw06;
+
+	struct {
+		int32_t src_x1:				BITRANGE(0, 15);
+		int32_t src_y1:				BITRANGE(16, 31);
+	} dw07;
+
+	struct {
+		uint32_t src_pitch:			BITRANGE(0, 17);
+		uint32_t src_aux_mode:			BITRANGE(18, 20);
+		uint32_t src_mocs:			BITRANGE(21, 27);
+		uint32_t src_ctrl_surface_type:		BITRANGE(28, 28);
+		uint32_t src_compression:		BITRANGE(29, 29);
+		uint32_t src_tiling:			BITRANGE(30, 31);
+	} dw08;
+
+	struct {
+		uint32_t src_address_lo;
+	} dw09;
+
+	struct {
+		uint32_t src_address_hi;
+	} dw10;
+
+	struct {
+		uint32_t src_x_offset:			BITRANGE(0, 13);
+		uint32_t rsvd1:				BITRANGE(14, 15);
+		uint32_t src_y_offset:			BITRANGE(16, 29);
+		uint32_t rsvd0:				BITRANGE(30, 30);
+		uint32_t src_target_memory:		BITRANGE(31, 31);
+	} dw11;
+};
+
+struct gen12_block_copy_data_ext {
+	struct {
+		uint32_t src_compression_format:	BITRANGE(0, 4);
+		uint32_t src_clear_value_enable:	BITRANGE(5, 5);
+		uint32_t src_clear_address_low:		BITRANGE(6, 31);
+	} dw12;
+
+	union {
+		/* DG2, XEHP */
+		uint32_t src_clear_address_hi0;
+		/* Others */
+		uint32_t src_clear_address_hi1;
+	} dw13;
+
+	struct {
+		uint32_t dst_compression_format:	BITRANGE(0, 4);
+		uint32_t dst_clear_value_enable:	BITRANGE(5, 5);
+		uint32_t dst_clear_address_low:		BITRANGE(6, 31);
+	} dw14;
+
+	union {
+		/* DG2, XEHP */
+		uint32_t dst_clear_address_hi0;
+		/* Others */
+		uint32_t dst_clear_address_hi1;
+	} dw15;
+
+	struct {
+		uint32_t dst_surface_height:		BITRANGE(0, 13);
+		uint32_t dst_surface_width:		BITRANGE(14, 27);
+		uint32_t rsvd0:				BITRANGE(28, 28);
+		uint32_t dst_surface_type:		BITRANGE(29, 31);
+	} dw16;
+
+	struct {
+		uint32_t dst_lod:			BITRANGE(0, 3);
+		uint32_t dst_surface_qpitch:		BITRANGE(4, 18);
+		uint32_t rsvd0:				BITRANGE(19, 20);
+		uint32_t dst_surface_depth:		BITRANGE(21, 31);
+	} dw17;
+
+	struct {
+		uint32_t dst_horizontal_align:		BITRANGE(0, 1);
+		uint32_t rsvd0:				BITRANGE(2, 2);
+		uint32_t dst_vertical_align:		BITRANGE(3, 4);
+		uint32_t rsvd1:				BITRANGE(5, 7);
+		uint32_t dst_mip_tail_start_lod:	BITRANGE(8, 11);
+		uint32_t rsvd2:				BITRANGE(12, 17);
+		uint32_t dst_depth_stencil_resource:	BITRANGE(18, 18);
+		uint32_t rsvd3:				BITRANGE(19, 20);
+		uint32_t dst_array_index:		BITRANGE(21, 31);
+	} dw18;
+
+	struct {
+		uint32_t src_surface_height:		BITRANGE(0, 13);
+		uint32_t src_surface_width:		BITRANGE(14, 27);
+		uint32_t rsvd0:				BITRANGE(28, 28);
+		uint32_t src_surface_type:		BITRANGE(29, 31);
+	} dw19;
+
+	struct {
+		uint32_t src_lod:			BITRANGE(0, 3);
+		uint32_t src_surface_qpitch:		BITRANGE(4, 18);
+		uint32_t rsvd0:				BITRANGE(19, 20);
+		uint32_t src_surface_depth:		BITRANGE(21, 31);
+	} dw20;
+
+	struct {
+		uint32_t src_horizontal_align:		BITRANGE(0, 1);
+		uint32_t rsvd0:				BITRANGE(2, 2);
+		uint32_t src_vertical_align:		BITRANGE(3, 4);
+		uint32_t rsvd1:				BITRANGE(5, 7);
+		uint32_t src_mip_tail_start_lod:	BITRANGE(8, 11);
+		uint32_t rsvd2:				BITRANGE(12, 17);
+		uint32_t src_depth_stencil_resource:	BITRANGE(18, 18);
+		uint32_t rsvd3:				BITRANGE(19, 20);
+		uint32_t src_array_index:		BITRANGE(21, 31);
+	} dw21;
+};
+
+bool blt_supports_compression(int i915)
+{
+	uint32_t devid = intel_get_drm_devid(i915);
+
+	if (IS_TIGERLAKE(devid) || IS_DG1(devid))
+		return false;
+
+	return true;
+}
+
+bool blt_supports_tiling(int i915, enum blt_tiling tiling)
+{
+	uint32_t devid = intel_get_drm_devid(i915);
+
+	if (tiling == T_XMAJOR) {
+		if (IS_TIGERLAKE(devid) || IS_DG1(devid))
+			return false;
+		else
+			return true;
+	}
+
+	if (tiling == T_YMAJOR) {
+		if (IS_TIGERLAKE(devid) || IS_DG1(devid))
+			return true;
+		else
+			return false;
+	}
+
+	return true;
+}
+
+const char *blt_tiling_name(enum blt_tiling tiling)
+{
+	switch (tiling) {
+	case T_LINEAR: return "linear";
+	case T_XMAJOR: return "xmajor";
+	case T_YMAJOR: return "ymajor";
+	case T_TILE4:  return "tile4";
+	case T_TILE64: return "tile64";
+	}
+
+	igt_warn("invalid tiling passed: %d\n", tiling);
+	return NULL;
+}
+
+static int __block_tiling(enum blt_tiling tiling)
+{
+	switch (tiling) {
+	case T_LINEAR: return 0;
+	case T_XMAJOR: return 1;
+	case T_YMAJOR: return 1;
+	case T_TILE4:  return 2;
+	case T_TILE64: return 3;
+	}
+
+	igt_warn("invalid tiling passed: %d\n", tiling);
+	return 0;
+}
+
+static int __special_mode(const struct blt_copy_data *blt)
+{
+	if (blt->src.handle == blt->dst.handle &&
+	    blt->src.compression && !blt->dst.compression)
+		return SM_FULL_RESOLVE;
+
+
+	return SM_NONE;
+}
+
+static int __memory_type(uint32_t region)
+{
+	igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
+		     IS_SYSTEM_MEMORY_REGION(region),
+		     "Invalid region: %x\n", region);
+
+	if (IS_DEVICE_MEMORY_REGION(region))
+		return TM_LOCAL_MEM;
+	return TM_SYSTEM_MEM;
+}
+
+static enum blt_aux_mode __aux_mode(const struct blt_copy_object *obj)
+{
+	if (obj->compression == COMPRESSION_ENABLED) {
+		igt_assert_f(IS_DEVICE_MEMORY_REGION(obj->region),
+			     "XY_BLOCK_COPY_BLT supports compression "
+			     "on device memory only\n");
+		return AM_AUX_CCS_E;
+	}
+
+	return AM_AUX_NONE;
+}
+
+static void fill_data(struct gen12_block_copy_data *data,
+		      const struct blt_copy_data *blt,
+		      uint64_t src_offset, uint64_t dst_offset,
+		      bool extended_command)
+{
+	data->dw00.client = 0x2;
+	data->dw00.opcode = 0x41;
+	data->dw00.color_depth = blt->color_depth;
+	data->dw00.special_mode = __special_mode(blt);
+	data->dw00.length = extended_command ? 20 : 10;
+
+	data->dw01.dst_pitch = blt->dst.pitch - 1;
+	data->dw01.dst_aux_mode = __aux_mode(&blt->dst);
+	data->dw01.dst_mocs = blt->dst.mocs;
+	data->dw01.dst_compression = blt->dst.compression;
+	data->dw01.dst_tiling = __block_tiling(blt->dst.tiling);
+
+	if (blt->dst.compression)
+		data->dw01.dst_ctrl_surface_type = blt->dst.compression_type;
+
+	data->dw02.dst_x1 = blt->dst.x1;
+	data->dw02.dst_y1 = blt->dst.y1;
+
+	data->dw03.dst_x2 = blt->dst.x2;
+	data->dw03.dst_y2 = blt->dst.y2;
+
+	data->dw04.dst_address_lo = dst_offset;
+	data->dw05.dst_address_hi = dst_offset >> 32;
+
+	data->dw06.dst_x_offset = blt->dst.x_offset;
+	data->dw06.dst_y_offset = blt->dst.y_offset;
+	data->dw06.dst_target_memory = __memory_type(blt->dst.region);
+
+	data->dw07.src_x1 = blt->src.x1;
+	data->dw07.src_y1 = blt->src.y1;
+
+	data->dw08.src_pitch = blt->src.pitch - 1;
+	data->dw08.src_aux_mode = __aux_mode(&blt->src);
+	data->dw08.src_mocs = blt->src.mocs;
+	data->dw08.src_compression = blt->src.compression;
+	data->dw08.src_tiling = __block_tiling(blt->src.tiling);
+
+	if (blt->src.compression)
+		data->dw08.src_ctrl_surface_type = blt->src.compression_type;
+
+	data->dw09.src_address_lo = src_offset;
+	data->dw10.src_address_hi = src_offset >> 32;
+
+	data->dw11.src_x_offset = blt->src.x_offset;
+	data->dw11.src_y_offset = blt->src.y_offset;
+	data->dw11.src_target_memory = __memory_type(blt->src.region);
+}
+
+static void fill_data_ext(int i915,
+			  struct gen12_block_copy_data_ext *dext,
+			  const struct blt_block_copy_data_ext *ext)
+{
+	dext->dw12.src_compression_format = ext->src.compression_format;
+	dext->dw12.src_clear_value_enable = ext->src.clear_value_enable;
+	dext->dw12.src_clear_address_low = ext->src.clear_address;
+
+	dext->dw13.src_clear_address_hi0 = ext->src.clear_address >> 32;
+
+	dext->dw14.dst_compression_format = ext->dst.compression_format;
+	dext->dw14.dst_clear_value_enable = ext->dst.clear_value_enable;
+	dext->dw14.dst_clear_address_low = ext->dst.clear_address;
+
+	dext->dw15.dst_clear_address_hi0 = ext->dst.clear_address >> 32;
+
+	dext->dw16.dst_surface_width = ext->dst.surface_width - 1;
+	dext->dw16.dst_surface_height = ext->dst.surface_height - 1;
+	dext->dw16.dst_surface_type = ext->dst.surface_type;
+
+	dext->dw17.dst_lod = ext->dst.lod;
+	dext->dw17.dst_surface_depth = ext->dst.surface_depth;
+	dext->dw17.dst_surface_qpitch = ext->dst.surface_qpitch;
+
+	dext->dw18.dst_horizontal_align = ext->dst.horizontal_align;
+	dext->dw18.dst_vertical_align = ext->dst.vertical_align;
+	dext->dw18.dst_mip_tail_start_lod = ext->dst.mip_tail_start_lod;
+	dext->dw18.dst_depth_stencil_resource = ext->dst.depth_stencil_resource;
+	dext->dw18.dst_array_index = ext->dst.array_index;
+
+	dext->dw19.src_surface_width = ext->src.surface_width - 1;
+	dext->dw19.src_surface_height = ext->src.surface_height - 1;
+
+	dext->dw19.src_surface_type = ext->src.surface_type;
+
+	dext->dw20.src_lod = ext->src.lod;
+	dext->dw20.src_surface_depth = ext->src.surface_depth;
+	dext->dw20.src_surface_qpitch = ext->src.surface_qpitch;
+
+	dext->dw21.src_horizontal_align = ext->src.horizontal_align;
+	dext->dw21.src_vertical_align = ext->src.vertical_align;
+	dext->dw21.src_mip_tail_start_lod = ext->src.mip_tail_start_lod;
+	dext->dw21.src_depth_stencil_resource = ext->src.depth_stencil_resource;
+	dext->dw21.src_array_index = ext->src.array_index;
+}
+
+static void dump_bb_cmd(struct gen12_block_copy_data *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("details:\n");
+	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, color depth: %d, "
+		 "special mode: %d, length: %d>\n",
+		 cmd[0],
+		 data->dw00.client, data->dw00.opcode, data->dw00.color_depth,
+		 data->dw00.special_mode, data->dw00.length);
+	igt_info(" dw01: [%08x] dst <pitch: %d, aux: %d, mocs: %d, compr: %d, "
+		 "tiling: %d, ctrl surf type: %d>\n",
+		 cmd[1], data->dw01.dst_pitch, data->dw01.dst_aux_mode,
+		 data->dw01.dst_mocs, data->dw01.dst_compression,
+		 data->dw01.dst_tiling, data->dw01.dst_ctrl_surface_type);
+	igt_info(" dw02: [%08x] dst geom <x1: %d, y1: %d>\n",
+		 cmd[2], data->dw02.dst_x1, data->dw02.dst_y1);
+	igt_info(" dw03: [%08x]          <x2: %d, y2: %d>\n",
+		 cmd[3], data->dw03.dst_x2, data->dw03.dst_y2);
+	igt_info(" dw04: [%08x] dst offset lo (0x%x)\n",
+		 cmd[4], data->dw04.dst_address_lo);
+	igt_info(" dw05: [%08x] dst offset hi (0x%x)\n",
+		 cmd[5], data->dw05.dst_address_hi);
+	igt_info(" dw06: [%08x] dst <x offset: 0x%x, y offset: 0x%x, target mem: %d>\n",
+		 cmd[6], data->dw06.dst_x_offset, data->dw06.dst_y_offset,
+		 data->dw06.dst_target_memory);
+	igt_info(" dw07: [%08x] src geom <x1: %d, y1: %d>\n",
+		 cmd[7], data->dw07.src_x1, data->dw07.src_y1);
+	igt_info(" dw08: [%08x] src <pitch: %d, aux: %d, mocs: %d, compr: %d, "
+		 "tiling: %d, ctrl surf type: %d>\n",
+		 cmd[8], data->dw08.src_pitch, data->dw08.src_aux_mode,
+		 data->dw08.src_mocs, data->dw08.src_compression,
+		 data->dw08.src_tiling, data->dw08.src_ctrl_surface_type);
+	igt_info(" dw09: [%08x] src offset lo (0x%x)\n",
+		 cmd[9], data->dw09.src_address_lo);
+	igt_info(" dw10: [%08x] src offset hi (0x%x)\n",
+		 cmd[10], data->dw10.src_address_hi);
+	igt_info(" dw11: [%08x] src <x offset: 0x%x, y offset: 0x%x, target mem: %d>\n",
+		 cmd[11], data->dw11.src_x_offset, data->dw11.src_y_offset,
+		 data->dw11.src_target_memory);
+}
+
+static void dump_bb_ext(struct gen12_block_copy_data_ext *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("ext details:\n");
+	igt_info(" dw12: [%08x] src <compression fmt: %d, clear value enable: %d, "
+		 "clear address low: 0x%x>\n",
+		 cmd[0],
+		 data->dw12.src_compression_format,
+		 data->dw12.src_clear_value_enable,
+		 data->dw12.src_clear_address_low);
+	igt_info(" dw13: [%08x] src clear address hi: 0x%x\n",
+		 cmd[1], data->dw13.src_clear_address_hi0);
+	igt_info(" dw14: [%08x] dst <compression fmt: %d, clear value enable: %d, "
+		 "clear address low: 0x%x>\n",
+		 cmd[2],
+		 data->dw14.dst_compression_format,
+		 data->dw14.dst_clear_value_enable,
+		 data->dw14.dst_clear_address_low);
+	igt_info(" dw15: [%08x] dst clear address hi: 0x%x\n",
+		 cmd[3], data->dw15.dst_clear_address_hi0);
+	igt_info(" dw16: [%08x] dst surface <width: %d, height: %d, type: %d>\n",
+		 cmd[4], data->dw16.dst_surface_width,
+		 data->dw16.dst_surface_height, data->dw16.dst_surface_type);
+	igt_info(" dw17: [%08x] dst surface <lod: %d, depth: %d, qpitch: %d>\n",
+		 cmd[5], data->dw17.dst_lod,
+		 data->dw17.dst_surface_depth, data->dw17.dst_surface_qpitch);
+	igt_info(" dw18: [%08x] dst <halign: %d, valign: %d, mip tail: %d, "
+		 "depth stencil: %d, array index: %d>\n",
+		 cmd[6],
+		 data->dw18.dst_horizontal_align,
+		 data->dw18.dst_vertical_align,
+		 data->dw18.dst_mip_tail_start_lod,
+		 data->dw18.dst_depth_stencil_resource,
+		 data->dw18.dst_array_index);
+
+	igt_info(" dw19: [%08x] src surface <width: %d, height: %d, type: %d>\n",
+		 cmd[7], data->dw19.src_surface_width,
+		 data->dw19.src_surface_height, data->dw19.src_surface_type);
+	igt_info(" dw20: [%08x] src surface <lod: %d, depth: %d, qpitch: %d>\n",
+		 cmd[8], data->dw20.src_lod,
+		 data->dw20.src_surface_depth, data->dw20.src_surface_qpitch);
+	igt_info(" dw21: [%08x] src <halign: %d, valign: %d, mip tail: %d, "
+		 "depth stencil: %d, array index: %d>\n",
+		 cmd[9],
+		 data->dw21.src_horizontal_align,
+		 data->dw21.src_vertical_align,
+		 data->dw21.src_mip_tail_start_lod,
+		 data->dw21.src_depth_stencil_resource,
+		 data->dw21.src_array_index);
+}
+
+int blt_block_copy(int i915,
+		   const intel_ctx_t *ctx,
+		   const struct intel_execution_engine2 *e,
+		   uint64_t ahnd,
+		   const struct blt_copy_data *blt,
+		   const struct blt_block_copy_data_ext *ext)
+{
+	struct drm_i915_gem_execbuffer2 execbuf = {};
+	struct drm_i915_gem_exec_object2 obj[3] = {};
+	struct gen12_block_copy_data data = {};
+	struct gen12_block_copy_data_ext dext = {};
+	uint64_t dst_offset, src_offset, bb_offset, alignment;
+	uint32_t *bb;
+	int i, ret;
+
+	alignment = gem_detect_safe_alignment(i915);
+	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
+	if (__special_mode(blt) == SM_FULL_RESOLVE)
+		dst_offset = src_offset;
+	else
+		dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
+
+	fill_data(&data, blt, src_offset, dst_offset, ext != NULL);
+
+	i = sizeof(data) / sizeof(uint32_t);
+	bb = gem_mmap__device_coherent(i915, blt->bb.handle, 0, blt->bb.size,
+				       PROT_READ | PROT_WRITE);
+	memcpy(bb, &data, sizeof(data));
+
+	if (ext) {
+		fill_data_ext(i915, &dext, ext);
+		memcpy(bb + i, &dext, sizeof(dext));
+		i += sizeof(dext) / sizeof(uint32_t);
+	}
+	bb[i++] = MI_BATCH_BUFFER_END;
+
+	if (blt->print_bb) {
+		igt_info("[BLOCK COPY]\n");
+		igt_info("src offset: %llx, dst offset: %llx, bb offset: %llx\n",
+			 (long long) src_offset, (long long) dst_offset,
+			 (long long) bb_offset);
+
+		dump_bb_cmd(&data);
+		if (ext)
+			dump_bb_ext(&dext);
+	}
+
+	munmap(bb, blt->bb.size);
+
+	obj[0].offset = CANONICAL(dst_offset);
+	obj[1].offset = CANONICAL(src_offset);
+	obj[2].offset = CANONICAL(bb_offset);
+	obj[0].handle = blt->dst.handle;
+	obj[1].handle = blt->src.handle;
+	obj[2].handle = blt->bb.handle;
+	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	execbuf.buffer_count = 3;
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.rsvd1 = ctx ? ctx->id : 0;
+	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+	ret = __gem_execbuf(i915, &execbuf);
+	if (data.dw00.special_mode != SM_FULL_RESOLVE)
+		put_offset(ahnd, blt->dst.handle);
+	put_offset(ahnd, blt->src.handle);
+	put_offset(ahnd, blt->bb.handle);
+
+	return ret;
+}
+
+static uint16_t __ccs_size(const struct blt_ctrl_surf_copy_data *surf)
+{
+	uint32_t src_size, dst_size;
+
+	src_size = surf->src.access_type == DIRECT_ACCESS ?
+				surf->src.size : surf->src.size / CCS_RATIO;
+
+	dst_size = surf->dst.access_type == DIRECT_ACCESS ?
+				surf->dst.size : surf->dst.size / CCS_RATIO;
+
+	igt_assert_f(src_size <= dst_size, "dst size must be >= src size for CCS copy\n");
+
+	return src_size;
+}
+
+struct gen12_ctrl_surf_copy_data {
+	struct {
+		uint32_t length:			BITRANGE(0, 7);
+		uint32_t size_of_ctrl_copy:		BITRANGE(8, 17);
+		uint32_t rsvd0:				BITRANGE(18, 19);
+		uint32_t dst_access_type:		BITRANGE(20, 20);
+		uint32_t src_access_type:		BITRANGE(21, 21);
+		uint32_t opcode:			BITRANGE(22, 28);
+		uint32_t client:			BITRANGE(29, 31);
+	} dw00;
+
+	struct {
+		uint32_t src_address_lo;
+	} dw01;
+
+	struct {
+		uint32_t src_address_hi:		BITRANGE(0, 24);
+		uint32_t src_mocs:			BITRANGE(25, 31);
+	} dw02;
+
+	struct {
+		uint32_t dst_address_lo;
+	} dw03;
+
+	struct {
+		uint32_t dst_address_hi:		BITRANGE(0, 24);
+		uint32_t dst_mocs:			BITRANGE(25, 31);
+	} dw04;
+};
+
+static void dump_bb_surf_ctrl_cmd(const struct gen12_ctrl_surf_copy_data *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("details:\n");
+	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, "
+		 "src/dst access type: <%d, %d>, size of ctrl copy: %u, length: %d>\n",
+		 cmd[0],
+		 data->dw00.client, data->dw00.opcode,
+		 data->dw00.src_access_type, data->dw00.dst_access_type,
+		 data->dw00.size_of_ctrl_copy, data->dw00.length);
+	igt_info(" dw01: [%08x] src offset lo (0x%x)\n",
+		 cmd[1], data->dw01.src_address_lo);
+	igt_info(" dw02: [%08x] src offset hi (0x%x), src mocs: %u\n",
+		 cmd[2], data->dw02.src_address_hi, data->dw02.src_mocs);
+	igt_info(" dw03: [%08x] dst offset lo (0x%x)\n",
+		 cmd[3], data->dw03.dst_address_lo);
+	igt_info(" dw04: [%08x] dst offset hi (0x%x), dst mocs: %u\n",
+		 cmd[4], data->dw04.dst_address_hi, data->dw04.dst_mocs);
+}
+
+int blt_ctrl_surf_copy(int i915,
+		       const intel_ctx_t *ctx,
+		       const struct intel_execution_engine2 *e,
+		       uint64_t ahnd,
+		       const struct blt_ctrl_surf_copy_data *surf)
+{
+	struct drm_i915_gem_execbuffer2 execbuf = {};
+	struct drm_i915_gem_exec_object2 obj[3] = {};
+	struct gen12_ctrl_surf_copy_data data = {};
+	uint64_t dst_offset, src_offset, bb_offset, alignment;
+	uint32_t *bb;
+	int i;
+
+	alignment = gem_detect_safe_alignment(i915);
+
+	data.dw00.client = 0x2;
+	data.dw00.opcode = 0x48;
+	data.dw00.src_access_type = surf->src.access_type;
+	data.dw00.dst_access_type = surf->dst.access_type;
+
+	/* Ensure dst has size capable to keep src ccs aux */
+	data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
+	data.dw00.length = 0x3;
+
+	src_offset = get_offset(ahnd, surf->src.handle, surf->src.size,
+				alignment);
+	dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size,
+				alignment);
+	bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size,
+			       alignment);
+
+	data.dw01.src_address_lo = src_offset;
+	data.dw02.src_address_hi = src_offset >> 32;
+	data.dw02.src_mocs = surf->src.mocs;
+
+	data.dw03.dst_address_lo = dst_offset;
+	data.dw04.dst_address_hi = dst_offset >> 32;
+	data.dw04.dst_mocs = surf->dst.mocs;
+
+	i = sizeof(data) / sizeof(uint32_t);
+	bb = gem_mmap__device_coherent(i915, surf->bb.handle, 0, surf->bb.size,
+				       PROT_READ | PROT_WRITE);
+	memcpy(bb, &data, sizeof(data));
+	bb[i++] = MI_BATCH_BUFFER_END;
+
+	if (surf->print_bb) {
+		igt_info("BB [CTRL SURF]:\n");
+		igt_info("src offset: %llx, dst offset: %llx, bb offset: %llx\n",
+			 (long long) src_offset, (long long) dst_offset,
+			 (long long) bb_offset);
+
+		dump_bb_surf_ctrl_cmd(&data);
+	}
+	munmap(bb, surf->bb.size);
+
+	obj[0].offset = CANONICAL(dst_offset);
+	obj[1].offset = CANONICAL(src_offset);
+	obj[2].offset = CANONICAL(bb_offset);
+	obj[0].handle = surf->dst.handle;
+	obj[1].handle = surf->src.handle;
+	obj[2].handle = surf->bb.handle;
+	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	execbuf.buffer_count = 3;
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+	execbuf.rsvd1 = ctx ? ctx->id : 0;
+	gem_execbuf(i915, &execbuf);
+	put_offset(ahnd, surf->dst.handle);
+	put_offset(ahnd, surf->src.handle);
+	put_offset(ahnd, surf->bb.handle);
+
+	return 0;
+}
+
+struct gen12_fast_copy_data {
+	struct {
+		uint32_t length:			BITRANGE(0, 7);
+		uint32_t rsvd1:				BITRANGE(8, 12);
+		uint32_t dst_tiling:			BITRANGE(13, 14);
+		uint32_t rsvd0:				BITRANGE(15, 19);
+		uint32_t src_tiling:			BITRANGE(20, 21);
+		uint32_t opcode:			BITRANGE(22, 28);
+		uint32_t client:			BITRANGE(29, 31);
+	} dw00;
+
+	struct {
+		uint32_t dst_pitch:			BITRANGE(0, 15);
+		uint32_t rsvd1:				BITRANGE(16, 23);
+		uint32_t color_depth:			BITRANGE(24, 26);
+		uint32_t rsvd0:				BITRANGE(27, 27);
+		uint32_t dst_memory:			BITRANGE(28, 28);
+		uint32_t src_memory:			BITRANGE(29, 29);
+		uint32_t dst_type_y:			BITRANGE(30, 30);
+		uint32_t src_type_y:			BITRANGE(31, 31);
+	} dw01;
+
+	struct {
+		int32_t dst_x1:				BITRANGE(0, 15);
+		int32_t dst_y1:				BITRANGE(16, 31);
+	} dw02;
+
+	struct {
+		int32_t dst_x2:				BITRANGE(0, 15);
+		int32_t dst_y2:				BITRANGE(16, 31);
+	} dw03;
+
+	struct {
+		uint32_t dst_address_lo;
+	} dw04;
+
+	struct {
+		uint32_t dst_address_hi;
+	} dw05;
+
+	struct {
+		int32_t src_x1:				BITRANGE(0, 15);
+		int32_t src_y1:				BITRANGE(16, 31);
+	} dw06;
+
+	struct {
+		uint32_t src_pitch:			BITRANGE(0, 15);
+		uint32_t rsvd0:				BITRANGE(16, 31);
+	} dw07;
+
+	struct {
+		uint32_t src_address_lo;
+	} dw08;
+
+	struct {
+		uint32_t src_address_hi;
+	} dw09;
+};
+
+static int __fast_tiling(enum blt_tiling tiling)
+{
+	switch (tiling) {
+	case T_LINEAR: return 0;
+	case T_XMAJOR: return 1;
+	case T_YMAJOR: return 2;
+	case T_TILE4:  return 2;
+	case T_TILE64: return 3;
+	}
+	return 0;
+}
+
+static int __fast_color_depth(enum blt_color_depth depth)
+{
+	switch (depth) {
+	case CD_8bit:   return 0;
+	case CD_16bit:  return 1;
+	case CD_32bit:  return 3;
+	case CD_64bit:  return 4;
+	case CD_96bit:
+		igt_assert_f(0, "Unsupported depth\n");
+		break;
+	case CD_128bit: return 5;
+	}
+	return 0;
+}
+
+static void dump_bb_fast_cmd(struct gen12_fast_copy_data *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("BB details:\n");
+	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, src tiling: %d, "
+		 "dst tiling: %d, length: %d>\n",
+		 cmd[0], data->dw00.client, data->dw00.opcode,
+		 data->dw00.src_tiling, data->dw00.dst_tiling, data->dw00.length);
+	igt_info(" dw01: [%08x] dst <pitch: %d, color depth: %d, dst memory: %d, "
+		 "src memory: %d, \n"
+		 "\t\t\tdst type tile: %d (0-legacy, 1-tile4),\n"
+		 "\t\t\tsrc type tile: %d (0-legacy, 1-tile4)>\n",
+		 cmd[1], data->dw01.dst_pitch, data->dw01.color_depth,
+		 data->dw01.dst_memory, data->dw01.src_memory,
+		 data->dw01.dst_type_y, data->dw01.src_type_y);
+	igt_info(" dw02: [%08x] dst geom <x1: %d, y1: %d>\n",
+		 cmd[2], data->dw02.dst_x1, data->dw02.dst_y1);
+	igt_info(" dw03: [%08x]          <x2: %d, y2: %d>\n",
+		 cmd[3], data->dw03.dst_x2, data->dw03.dst_y2);
+	igt_info(" dw04: [%08x] dst offset lo (0x%x)\n",
+		 cmd[4], data->dw04.dst_address_lo);
+	igt_info(" dw05: [%08x] dst offset hi (0x%x)\n",
+		 cmd[5], data->dw05.dst_address_hi);
+	igt_info(" dw06: [%08x] src geom <x1: %d, y1: %d>\n",
+		 cmd[6], data->dw06.src_x1, data->dw06.src_y1);
+	igt_info(" dw07: [%08x] src <pitch: %d>\n",
+		 cmd[7], data->dw07.src_pitch);
+	igt_info(" dw08: [%08x] src offset lo (0x%x)\n",
+		 cmd[8], data->dw08.src_address_lo);
+	igt_info(" dw09: [%08x] src offset hi (0x%x)\n",
+		 cmd[9], data->dw09.src_address_hi);
+}
+
+#define BCS_SWCTRL 0x22200
+#define BCS_SRC_Y (1 << 0)
+#define BCS_DST_Y (1 << 1)
+int blt_fast_copy(int i915,
+		  const intel_ctx_t *ctx,
+		  const struct intel_execution_engine2 *e,
+		  uint64_t ahnd,
+		  const struct blt_copy_data *blt)
+{
+	struct drm_i915_gem_execbuffer2 execbuf = {};
+	struct drm_i915_gem_exec_object2 obj[3] = {};
+	struct gen12_fast_copy_data data = {};
+	uint64_t dst_offset, src_offset, bb_offset, alignment;
+	uint32_t *bb;
+	int i, ret;
+
+	alignment = gem_detect_safe_alignment(i915);
+
+	data.dw00.client = 0x2;
+	data.dw00.opcode = 0x42;
+	data.dw00.dst_tiling = __fast_tiling(blt->dst.tiling);
+	data.dw00.src_tiling = __fast_tiling(blt->src.tiling);
+	data.dw00.length = 8;
+
+	data.dw01.dst_pitch = blt->dst.pitch;
+	data.dw01.color_depth = __fast_color_depth(blt->color_depth);
+	data.dw01.dst_memory = __memory_type(blt->dst.region);
+	data.dw01.src_memory = __memory_type(blt->src.region);
+	data.dw01.dst_type_y = blt->dst.tiling == T_TILE4 ? 1 : 0;
+	data.dw01.src_type_y = blt->src.tiling == T_TILE4 ? 1 : 0;
+
+	data.dw02.dst_x1 = blt->dst.x1;
+	data.dw02.dst_y1 = blt->dst.y1;
+
+	data.dw03.dst_x2 = blt->dst.x2;
+	data.dw03.dst_y2 = blt->dst.y2;
+
+	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
+	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
+
+	data.dw04.dst_address_lo = dst_offset;
+	data.dw05.dst_address_hi = dst_offset >> 32;
+
+	data.dw06.src_x1 = blt->src.x1;
+	data.dw06.src_y1 = blt->src.y1;
+
+	data.dw07.src_pitch = blt->src.pitch;
+
+	data.dw08.src_address_lo = src_offset;
+	data.dw09.src_address_hi = src_offset >> 32;
+
+	i = sizeof(data) / sizeof(uint32_t);
+	bb = gem_mmap__device_coherent(i915, blt->bb.handle, 0, blt->bb.size,
+				       PROT_READ | PROT_WRITE);
+
+	memcpy(bb, &data, sizeof(data));
+	bb[i++] = MI_BATCH_BUFFER_END;
+
+	if (blt->print_bb) {
+		igt_info("BB [FAST COPY]\n");
+		igt_info("blit [src offset: %llx, dst offset: %llx]\n",
+			 (long long) src_offset, (long long) dst_offset);
+		dump_bb_fast_cmd(&data);
+	}
+
+	munmap(bb, blt->bb.size);
+
+	obj[0].offset = CANONICAL(dst_offset);
+	obj[1].offset = CANONICAL(src_offset);
+	obj[2].offset = CANONICAL(bb_offset);
+	obj[0].handle = blt->dst.handle;
+	obj[1].handle = blt->src.handle;
+	obj[2].handle = blt->bb.handle;
+	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	execbuf.buffer_count = 3;
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.rsvd1 = ctx ? ctx->id : 0;
+	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+	ret = __gem_execbuf(i915, &execbuf);
+	put_offset(ahnd, blt->dst.handle);
+	put_offset(ahnd, blt->src.handle);
+	put_offset(ahnd, blt->bb.handle);
+
+	return ret;
+}
+
+void blt_surface_fill_rect(int i915, const struct blt_copy_object *obj,
+			   uint32_t width, uint32_t height)
+{
+	cairo_surface_t *surface;
+	cairo_pattern_t *pat;
+	cairo_t *cr;
+	void *map = obj->ptr;
+
+	if (!map)
+		map = gem_mmap__device_coherent(i915, obj->handle, 0,
+						obj->size, PROT_READ | PROT_WRITE);
+
+	surface = cairo_image_surface_create_for_data(map,
+						      CAIRO_FORMAT_RGB24,
+						      width, height,
+						      obj->pitch);
+
+	cr = cairo_create(surface);
+
+	cairo_rectangle(cr, 0, 0, width, height);
+	cairo_clip(cr);
+
+	pat = cairo_pattern_create_mesh();
+	cairo_mesh_pattern_begin_patch(pat);
+	cairo_mesh_pattern_move_to(pat, 0, 0);
+	cairo_mesh_pattern_line_to(pat, width, 0);
+	cairo_mesh_pattern_line_to(pat, width, height);
+	cairo_mesh_pattern_line_to(pat, 0, height);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 0, 1.0, 0.0, 0.0);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 1, 0.0, 1.0, 0.0);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 2, 0.0, 0.0, 1.0);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 3, 1.0, 1.0, 1.0);
+	cairo_mesh_pattern_end_patch(pat);
+
+	cairo_rectangle(cr, 0, 0, width, height);
+	cairo_set_source(cr, pat);
+	cairo_fill(cr);
+	cairo_pattern_destroy(pat);
+
+	cairo_destroy(cr);
+
+	cairo_surface_destroy(surface);
+	if (!obj->ptr)
+		munmap(map, obj->size);
+}
+
+void blt_surface_info(const char *info, const struct blt_copy_object *obj)
+{
+	igt_info("[%s]\n", info);
+	igt_info("surface <handle: %u, size: %llx, region: %x, mocs: %x>\n",
+		 obj->handle, (long long) obj->size, obj->region, obj->mocs);
+	igt_info("        <tiling: %s, compression: %u, compression type: %d>\n",
+		 blt_tiling_name(obj->tiling), obj->compression, obj->compression_type);
+	igt_info("        <pitch: %u, offset [x: %u, y: %u] geom [<%d,%d> <%d,%d>]>\n",
+		 obj->pitch, obj->x_offset, obj->y_offset,
+		 obj->x1, obj->y1, obj->x2, obj->y2);
+}
+
+void blt_surface_to_png(int i915, uint32_t run_id, const char *fileid,
+			const struct blt_copy_object *obj,
+			uint32_t width, uint32_t height)
+{
+	cairo_surface_t *surface;
+	cairo_status_t ret;
+	uint8_t *map = (uint8_t *) obj->ptr;
+	int format;
+	int stride = obj->tiling ? obj->pitch * 4 : obj->pitch;
+	char filename[FILENAME_MAX];
+
+	snprintf(filename, FILENAME_MAX-1, "%d-%s-%s-%ux%u-%s.png",
+		 run_id, fileid, blt_tiling_name(obj->tiling), width, height,
+		 obj->compression ? "compressed" : "uncompressed");
+
+	if (!map)
+		map = gem_mmap__device_coherent(i915, obj->handle, 0,
+						obj->size, PROT_READ);
+	format = CAIRO_FORMAT_RGB24;
+	surface = cairo_image_surface_create_for_data(map,
+						      format, width, height,
+						      stride);
+	ret = cairo_surface_write_to_png(surface, filename);
+	if (ret)
+		igt_info("Cairo ret: %d (%s)\n", ret, cairo_status_to_string(ret));
+	igt_assert(ret == CAIRO_STATUS_SUCCESS);
+	cairo_surface_destroy(surface);
+
+	if (!obj->ptr)
+		munmap(map, obj->size);
+}
diff --git a/lib/i915/i915_blt.h b/lib/i915/i915_blt.h
new file mode 100644
index 000000000000..2f4198e8acde
--- /dev/null
+++ b/lib/i915/i915_blt.h
@@ -0,0 +1,157 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include <errno.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <malloc.h>
+#include "drm.h"
+#include "igt.h"
+
+#define CCS_RATIO 256
+
+enum blt_color_depth {
+	CD_8bit,
+	CD_16bit,
+	CD_32bit,
+	CD_64bit,
+	CD_96bit,
+	CD_128bit,
+};
+
+enum blt_tiling {
+	T_LINEAR,
+	T_XMAJOR,
+	T_YMAJOR,
+	T_TILE4,
+	T_TILE64,
+};
+
+enum blt_compression {
+	COMPRESSION_DISABLED,
+	COMPRESSION_ENABLED,
+};
+
+enum blt_compression_type {
+	COMPRESSION_TYPE_3D,
+	COMPRESSION_TYPE_MEDIA,
+};
+
+struct blt_copy_object {
+	uint32_t handle;
+	uint32_t region;
+	uint64_t size;
+	uint8_t mocs;
+	enum blt_tiling tiling;
+	enum blt_compression compression;
+	enum blt_compression_type compression_type;
+	uint32_t pitch;
+	uint16_t x_offset, y_offset;
+	int16_t x1, y1, x2, y2;
+
+	/* mapping or null */
+	uint32_t *ptr;
+};
+
+struct blt_copy_batch {
+	uint32_t handle;
+	uint32_t region;
+	uint64_t size;
+};
+
+/* Common for block-copy and fast-copy */
+struct blt_copy_data {
+	int i915;
+	struct blt_copy_object src;
+	struct blt_copy_object dst;
+	struct blt_copy_batch bb;
+	enum blt_color_depth color_depth;
+
+	/* debug stuff */
+	bool print_bb;
+};
+
+enum blt_surface_type {
+	SURFACE_TYPE_1D,
+	SURFACE_TYPE_2D,
+	SURFACE_TYPE_3D,
+	SURFACE_TYPE_CUBE,
+};
+
+struct blt_block_copy_object_ext {
+	uint8_t compression_format;
+	bool clear_value_enable;
+	uint64_t clear_address;
+	uint16_t surface_width;
+	uint16_t surface_height;
+	enum blt_surface_type surface_type;
+	uint16_t surface_qpitch;
+	uint16_t surface_depth;
+	uint8_t lod;
+	uint8_t horizontal_align;
+	uint8_t vertical_align;
+	uint8_t mip_tail_start_lod;
+	bool depth_stencil_resource;
+	uint16_t array_index;
+};
+
+struct blt_block_copy_data_ext {
+	struct blt_block_copy_object_ext src;
+	struct blt_block_copy_object_ext dst;
+};
+
+enum blt_access_type {
+	INDIRECT_ACCESS,
+	DIRECT_ACCESS,
+};
+
+struct blt_ctrl_surf_copy_object {
+	uint32_t handle;
+	uint32_t region;
+	uint64_t size;
+	uint8_t mocs;
+	enum blt_access_type access_type;
+};
+
+struct blt_ctrl_surf_copy_data {
+	int i915;
+	struct blt_ctrl_surf_copy_object src;
+	struct blt_ctrl_surf_copy_object dst;
+	struct blt_copy_batch bb;
+
+	/* debug stuff */
+	bool print_bb;
+};
+
+bool blt_supports_compression(int i915);
+bool blt_supports_tiling(int i915, enum blt_tiling tiling);
+const char *blt_tiling_name(enum blt_tiling tiling);
+
+int blt_block_copy(int i915,
+		   const intel_ctx_t *ctx,
+		   const struct intel_execution_engine2 *e,
+		   uint64_t ahnd,
+		   const struct blt_copy_data *blt,
+		   const struct blt_block_copy_data_ext *ext);
+
+int blt_ctrl_surf_copy(int i915,
+		       const intel_ctx_t *ctx,
+		       const struct intel_execution_engine2 *e,
+		       uint64_t ahnd,
+		       const struct blt_ctrl_surf_copy_data *surf);
+
+int blt_fast_copy(int i915,
+		  const intel_ctx_t *ctx,
+		  const struct intel_execution_engine2 *e,
+		  uint64_t ahnd,
+		  const struct blt_copy_data *blt);
+
+void blt_surface_info(const char *info,
+		      const struct blt_copy_object *obj);
+void blt_surface_fill_rect(int i915, const struct blt_copy_object *obj,
+			   uint32_t width, uint32_t height);
+void blt_surface_to_png(int i915, uint32_t run_id, const char *fileid,
+		    const struct blt_copy_object *obj,
+		    uint32_t width, uint32_t height);
diff --git a/lib/meson.build b/lib/meson.build
index 3e43316d1e36..fe035672e2c5 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -11,6 +11,7 @@ lib_sources = [
 	'i915/gem_mman.c',
 	'i915/gem_vm.c',
 	'i915/intel_memory_region.c',
+	'i915/i915_blt.c',
 	'igt_collection.c',
 	'igt_color_encoding.c',
 	'igt_debugfs.c',
-- 
2.20.1


* Re: [igt-dev] [PATCH i-g-t 2/3] lib/i915_blt: Add library for blitter
  2022-02-25 11:06 ` [igt-dev] [PATCH i-g-t 2/3] lib/i915_blt: Add library for blitter Zbigniew Kempczyński
@ 2022-02-25 16:18   ` Kamil Konieczny
  2022-03-01 22:57   ` Kamil Konieczny
  1 sibling, 0 replies; 12+ messages in thread
From: Kamil Konieczny @ 2022-02-25 16:18 UTC (permalink / raw)
  To: igt-dev; +Cc: Mauro Chehab

On 2022-02-25 at 12:06:25 +0100, Zbigniew Kempczyński wrote:
> Blitter commands have become complicated, so manual bit shifting is
> error-prone and hard to debug. XY_BLOCK_COPY_BLT is the best example:
> in its extended version (for DG2+) it takes 20 dwords of command data.
> To avoid mistakes and functions taking dozens of arguments, the library
> accepts input data in a more structured form.
> 
> Currently supported commands:
> - XY_BLOCK_COPY_BLT:
>   a)  TGL/DG1 use a shorter version of the command which doesn't
>       support compression
>   b)  on DG2+ the command is extended and supports compression
> - XY_CTRL_SURF_COPY_BLT
> - XY_FAST_COPY_BLT
> 
> Source, destination and batchbuffer are passed to the blitter functions
> as objects (structs). This improves readability and allows the same
> object to be used in many functions. The only drawback of this approach
> is that some fields used by one function may be ignored by another. An
> example is blt_copy_object, which contains a lot of information about a
> gem object. Block-copy uses all of the data, but fast-copy uses only
> some of it (fast-copy doesn't support compression).
> 
> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
> ---
>  .../igt-gpu-tools/igt-gpu-tools-docs.xml      |    1 +
>  lib/i915/i915_blt.c                           | 1087 +++++++++++++++++
>  lib/i915/i915_blt.h                           |  196 +++
>  lib/meson.build                               |    1 +
>  4 files changed, 1285 insertions(+)
>  create mode 100644 lib/i915/i915_blt.c
>  create mode 100644 lib/i915/i915_blt.h
> 
> diff --git a/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml b/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
> index 0dc5a0b7e..3a2edbae1 100644
> --- a/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
> +++ b/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
> @@ -16,6 +16,7 @@
>    <chapter>
>      <title>API Reference</title>
>      <xi:include href="xml/drmtest.xml"/>
> +    <xi:include href="xml/i915_blt.xml"/>

IMHO this should be placed under the i915 API.

>      <xi:include href="xml/igt_alsa.xml"/>
>      <xi:include href="xml/igt_audio.xml"/>
>      <xi:include href="xml/igt_aux.xml"/>
[...]
I will need more time to review this patch.

Regards,
Kamil Konieczny


* [igt-dev] [PATCH i-g-t 2/3] lib/i915_blt: Add library for blitter
  2022-02-25 11:06 [igt-dev] [PATCH i-g-t 0/3] Add i915 blt library + gem_ccs test Zbigniew Kempczyński
@ 2022-02-25 11:06 ` Zbigniew Kempczyński
  2022-02-25 16:18   ` Kamil Konieczny
  2022-03-01 22:57   ` Kamil Konieczny
  0 siblings, 2 replies; 12+ messages in thread
From: Zbigniew Kempczyński @ 2022-02-25 11:06 UTC (permalink / raw)
  To: igt-dev

Blitter commands have become complicated, so manual bit shifting is
error-prone and hard to debug. XY_BLOCK_COPY_BLT is the best example:
in its extended version (for DG2+) it takes 20 dwords of command data.
To avoid mistakes and functions taking dozens of arguments, the library
accepts input data in a more structured form.
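
To illustrate the point about manual bit shifting, here is a minimal
standalone sketch (not code from the library). The dw00 layout is copied
from gen12_block_copy_data in this patch; the helper names
pack_dw00_manual() and pack_dw00_bitfield() are hypothetical. Bit-field
ordering is implementation-defined in C, so the sketch assumes GCC or
Clang on a little-endian target, where the first declared field occupies
the lowest bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BITRANGE(start, end) ((end) - (start) + 1)

/* dw00 layout as declared in gen12_block_copy_data in this patch. */
struct dw00 {
	uint32_t length:	BITRANGE(0, 7);
	uint32_t rsvd1:		BITRANGE(8, 8);
	uint32_t multisamples:	BITRANGE(9, 11);
	uint32_t special_mode:	BITRANGE(12, 13);
	uint32_t rsvd0:		BITRANGE(14, 18);
	uint32_t color_depth:	BITRANGE(19, 21);
	uint32_t opcode:	BITRANGE(22, 28);
	uint32_t client:	BITRANGE(29, 31);
};

/* Hypothetical helper: manual shifts, every offset must be right. */
static uint32_t pack_dw00_manual(uint32_t client, uint32_t opcode,
				 uint32_t color_depth, uint32_t length)
{
	return (client << 29) | (opcode << 22) | (color_depth << 19) | length;
}

/* Hypothetical helper: the same dword built through named fields. */
static uint32_t pack_dw00_bitfield(uint32_t client, uint32_t opcode,
				   uint32_t color_depth, uint32_t length)
{
	struct dw00 dw;
	uint32_t raw;

	/* The struct has to overlay exactly one dword. */
	assert(sizeof(dw) == sizeof(raw));

	memset(&dw, 0, sizeof(dw));
	dw.client = client;
	dw.opcode = opcode;
	dw.color_depth = color_depth;
	dw.length = length;

	/* memcpy avoids aliasing issues when reading the raw dword. */
	memcpy(&raw, &dw, sizeof(raw));
	return raw;
}
```

With the values fill_data() uses for the extended command (client 0x2,
opcode 0x41, length 20) and CD_32bit (2), the manual helper yields
0x50500014, and the bit-field helper is expected to match it on the
targets assumed above.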

Currently supported commands:
- XY_BLOCK_COPY_BLT:
  a)  TGL/DG1 use a shorter version of the command which doesn't
      support compression
  b)  on DG2+ the command is extended and supports compression
- XY_CTRL_SURF_COPY_BLT
- XY_FAST_COPY_BLT

Source, destination and batchbuffer are passed to the blitter functions
as objects (structs). This improves readability and allows the same
object to be used in many functions. The only drawback of this approach
is that some fields used by one function may be ignored by another. An
example is blt_copy_object, which contains a lot of information about a
gem object. Block-copy uses all of the data, but fast-copy uses only
some of it (fast-copy doesn't support compression).
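
The trade-off described above can be sketched in isolation. The structs
and helpers below are hypothetical, trimmed-down stand-ins and not the
library's types. The block-copy style helper mirrors how dw01 in this
patch stores pitch - 1 in bits 0..17 and the compression flag in bit 29.
The fast-copy style encoding (pitch stored unmodified in bits 0..15) is
an assumption made only for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct blt_copy_object. */
struct copy_object {
	uint32_t handle;
	uint32_t pitch;
	bool compression;
};

/* Hypothetical stand-in for struct blt_copy_data. */
struct copy_data {
	struct copy_object src;
	struct copy_object dst;
};

/*
 * Block-copy style consumer: uses both pitch and compression,
 * following the dw01 layout of gen12_block_copy_data.
 */
static uint32_t block_copy_dst_dword(const struct copy_data *d)
{
	uint32_t dw = (d->dst.pitch - 1) & 0x3ffff;

	if (d->dst.compression)
		dw |= 1u << 29;

	return dw;
}

/*
 * Fast-copy style consumer: takes the very same object but never
 * reads the compression field, so its value is silently ignored.
 */
static uint32_t fast_copy_dst_dword(const struct copy_data *d)
{
	return d->dst.pitch & 0xffff;
}
```

Passing one copy_data to both helpers shows the behaviour: toggling
dst.compression changes the block-copy dword but leaves the fast-copy
dword untouched.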

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 .../igt-gpu-tools/igt-gpu-tools-docs.xml      |    1 +
 lib/i915/i915_blt.c                           | 1087 +++++++++++++++++
 lib/i915/i915_blt.h                           |  196 +++
 lib/meson.build                               |    1 +
 4 files changed, 1285 insertions(+)
 create mode 100644 lib/i915/i915_blt.c
 create mode 100644 lib/i915/i915_blt.h

diff --git a/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml b/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
index 0dc5a0b7e..3a2edbae1 100644
--- a/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
+++ b/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
@@ -16,6 +16,7 @@
   <chapter>
     <title>API Reference</title>
     <xi:include href="xml/drmtest.xml"/>
+    <xi:include href="xml/i915_blt.xml"/>
     <xi:include href="xml/igt_alsa.xml"/>
     <xi:include href="xml/igt_audio.xml"/>
     <xi:include href="xml/igt_aux.xml"/>
diff --git a/lib/i915/i915_blt.c b/lib/i915/i915_blt.c
new file mode 100644
index 000000000..c6f115009
--- /dev/null
+++ b/lib/i915/i915_blt.c
@@ -0,0 +1,1087 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include <errno.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <malloc.h>
+#include <cairo.h>
+#include "drm.h"
+#include "igt.h"
+#include "gem_create.h"
+#include "i915_blt.h"
+
+#define BITRANGE(start, end) (end - start + 1)
+
+enum blt_special_mode {
+	SM_NONE,
+	SM_FULL_RESOLVE,
+	SM_PARTIAL_RESOLVE,
+	SM_RESERVED,
+};
+
+enum blt_aux_mode {
+	AM_AUX_NONE,
+	AM_AUX_CCS_E = 5,
+};
+
+enum blt_target_mem {
+	TM_LOCAL_MEM,
+	TM_SYSTEM_MEM,
+};
+
+struct gen12_block_copy_data {
+	struct {
+		uint32_t length:			BITRANGE(0, 7);
+		uint32_t rsvd1:				BITRANGE(8, 8);
+		uint32_t multisamples:			BITRANGE(9, 11);
+		uint32_t special_mode:			BITRANGE(12, 13);
+		uint32_t rsvd0:				BITRANGE(14, 18);
+		uint32_t color_depth:			BITRANGE(19, 21);
+		uint32_t opcode:			BITRANGE(22, 28);
+		uint32_t client:			BITRANGE(29, 31);
+	} dw00;
+
+	struct {
+		uint32_t dst_pitch:			BITRANGE(0, 17);
+		uint32_t dst_aux_mode:			BITRANGE(18, 20);
+		uint32_t dst_mocs:			BITRANGE(21, 27);
+		uint32_t dst_ctrl_surface_type:		BITRANGE(28, 28);
+		uint32_t dst_compression:		BITRANGE(29, 29);
+		uint32_t dst_tiling:			BITRANGE(30, 31);
+	} dw01;
+
+	struct {
+		int32_t dst_x1:				BITRANGE(0, 15);
+		int32_t dst_y1:				BITRANGE(16, 31);
+	} dw02;
+
+	struct {
+		int32_t dst_x2:				BITRANGE(0, 15);
+		int32_t dst_y2:				BITRANGE(16, 31);
+	} dw03;
+
+	struct {
+		uint32_t dst_address_lo;
+	} dw04;
+
+	struct {
+		uint32_t dst_address_hi;
+	} dw05;
+
+	struct {
+		uint32_t dst_x_offset:			BITRANGE(0, 13);
+		uint32_t rsvd1:				BITRANGE(14, 15);
+		uint32_t dst_y_offset:			BITRANGE(16, 29);
+		uint32_t rsvd0:				BITRANGE(30, 30);
+		uint32_t dst_target_memory:		BITRANGE(31, 31);
+	} dw06;
+
+	struct {
+		int32_t src_x1:				BITRANGE(0, 15);
+		int32_t src_y1:				BITRANGE(16, 31);
+	} dw07;
+
+	struct {
+		uint32_t src_pitch:			BITRANGE(0, 17);
+		uint32_t src_aux_mode:			BITRANGE(18, 20);
+		uint32_t src_mocs:			BITRANGE(21, 27);
+		uint32_t src_ctrl_surface_type:		BITRANGE(28, 28);
+		uint32_t src_compression:		BITRANGE(29, 29);
+		uint32_t src_tiling:			BITRANGE(30, 31);
+	} dw08;
+
+	struct {
+		uint32_t src_address_lo;
+	} dw09;
+
+	struct {
+		uint32_t src_address_hi;
+	} dw10;
+
+	struct {
+		uint32_t src_x_offset:			BITRANGE(0, 13);
+		uint32_t rsvd1:				BITRANGE(14, 15);
+		uint32_t src_y_offset:			BITRANGE(16, 29);
+		uint32_t rsvd0:				BITRANGE(30, 30);
+		uint32_t src_target_memory:		BITRANGE(31, 31);
+	} dw11;
+};
+
+struct gen12_block_copy_data_ext {
+	struct {
+		uint32_t src_compression_format:	BITRANGE(0, 4);
+		uint32_t src_clear_value_enable:	BITRANGE(5, 5);
+		uint32_t src_clear_address_low:		BITRANGE(6, 31);
+	} dw12;
+
+	union {
+		/* DG2, XEHP */
+		uint32_t src_clear_address_hi0;
+		/* Others */
+		uint32_t src_clear_address_hi1;
+	} dw13;
+
+	struct {
+		uint32_t dst_compression_format:	BITRANGE(0, 4);
+		uint32_t dst_clear_value_enable:	BITRANGE(5, 5);
+		uint32_t dst_clear_address_low:		BITRANGE(6, 31);
+	} dw14;
+
+	union {
+		/* DG2, XEHP */
+		uint32_t dst_clear_address_hi0;
+		/* Others */
+		uint32_t dst_clear_address_hi1;
+	} dw15;
+
+	struct {
+		uint32_t dst_surface_height:		BITRANGE(0, 13);
+		uint32_t dst_surface_width:		BITRANGE(14, 27);
+		uint32_t rsvd0:				BITRANGE(28, 28);
+		uint32_t dst_surface_type:		BITRANGE(29, 31);
+	} dw16;
+
+	struct {
+		uint32_t dst_lod:			BITRANGE(0, 3);
+		uint32_t dst_surface_qpitch:		BITRANGE(4, 18);
+		uint32_t rsvd0:				BITRANGE(19, 20);
+		uint32_t dst_surface_depth:		BITRANGE(21, 31);
+	} dw17;
+
+	struct {
+		uint32_t dst_horizontal_align:		BITRANGE(0, 1);
+		uint32_t rsvd0:				BITRANGE(2, 2);
+		uint32_t dst_vertical_align:		BITRANGE(3, 4);
+		uint32_t rsvd1:				BITRANGE(5, 7);
+		uint32_t dst_mip_tail_start_lod:	BITRANGE(8, 11);
+		uint32_t rsvd2:				BITRANGE(12, 17);
+		uint32_t dst_depth_stencil_resource:	BITRANGE(18, 18);
+		uint32_t rsvd3:				BITRANGE(19, 20);
+		uint32_t dst_array_index:		BITRANGE(21, 31);
+	} dw18;
+
+	struct {
+		uint32_t src_surface_height:		BITRANGE(0, 13);
+		uint32_t src_surface_width:		BITRANGE(14, 27);
+		uint32_t rsvd0:				BITRANGE(28, 28);
+		uint32_t src_surface_type:		BITRANGE(29, 31);
+	} dw19;
+
+	struct {
+		uint32_t src_lod:			BITRANGE(0, 3);
+		uint32_t src_surface_qpitch:		BITRANGE(4, 18);
+		uint32_t rsvd0:				BITRANGE(19, 20);
+		uint32_t src_surface_depth:		BITRANGE(21, 31);
+	} dw20;
+
+	struct {
+		uint32_t src_horizontal_align:		BITRANGE(0, 1);
+		uint32_t rsvd0:				BITRANGE(2, 2);
+		uint32_t src_vertical_align:		BITRANGE(3, 4);
+		uint32_t rsvd1:				BITRANGE(5, 7);
+		uint32_t src_mip_tail_start_lod:	BITRANGE(8, 11);
+		uint32_t rsvd2:				BITRANGE(12, 17);
+		uint32_t src_depth_stencil_resource:	BITRANGE(18, 18);
+		uint32_t rsvd3:				BITRANGE(19, 20);
+		uint32_t src_array_index:		BITRANGE(21, 31);
+	} dw21;
+};
+
+/**
+ * blt_supports_compression:
+ * @i915: drm fd
+ *
+ * Function checks whether HW supports flatccs compression in blitter commands
+ * on the @i915 device.
+ *
+ * Returns:
+ * true if it does, false otherwise.
+ */
+bool blt_supports_compression(int i915)
+{
+	uint32_t devid = intel_get_drm_devid(i915);
+
+	return HAS_FLATCCS(devid);
+}
+
+/**
+ * blt_supports_tiling:
+ * @i915: drm fd
+ * @tiling: tiling id
+ *
+ * Function checks whether the blitter supports @tiling on the @i915 device.
+ *
+ * Returns:
+ * true if it does, false otherwise.
+ */
+bool blt_supports_tiling(int i915, enum blt_tiling tiling)
+{
+	uint32_t devid = intel_get_drm_devid(i915);
+
+	if (tiling == T_XMAJOR) {
+		if (IS_TIGERLAKE(devid) || IS_DG1(devid))
+			return false;
+		else
+			return true;
+	}
+
+	if (tiling == T_YMAJOR) {
+		if (IS_TIGERLAKE(devid) || IS_DG1(devid))
+			return true;
+		else
+			return false;
+	}
+
+	return true;
+}
+
+/**
+ * blt_tiling_name:
+ * @tiling: tiling id
+ *
+ * Returns:
+ * name of the @tiling passed. Useful for building test names and similar.
+ */
+const char *blt_tiling_name(enum blt_tiling tiling)
+{
+	switch (tiling) {
+	case T_LINEAR: return "linear";
+	case T_XMAJOR: return "xmajor";
+	case T_YMAJOR: return "ymajor";
+	case T_TILE4:  return "tile4";
+	case T_TILE64: return "tile64";
+	}
+
+	igt_warn("invalid tiling passed: %d\n", tiling);
+	return NULL;
+}
+
+static int __block_tiling(enum blt_tiling tiling)
+{
+	switch (tiling) {
+	case T_LINEAR: return 0;
+	case T_XMAJOR: return 1;
+	case T_YMAJOR: return 1;
+	case T_TILE4:  return 2;
+	case T_TILE64: return 3;
+	}
+
+	igt_warn("invalid tiling passed: %d\n", tiling);
+	return 0;
+}
+
+static int __special_mode(const struct blt_copy_data *blt)
+{
+	if (blt->src.handle == blt->dst.handle &&
+	    blt->src.compression && !blt->dst.compression)
+		return SM_FULL_RESOLVE;
+
+
+	return SM_NONE;
+}
+
+static int __memory_type(uint32_t region)
+{
+	igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
+		     IS_SYSTEM_MEMORY_REGION(region),
+		     "Invalid region: %x\n", region);
+
+	if (IS_DEVICE_MEMORY_REGION(region))
+		return TM_LOCAL_MEM;
+	return TM_SYSTEM_MEM;
+}
+
+static enum blt_aux_mode __aux_mode(const struct blt_copy_object *obj)
+{
+	if (obj->compression == COMPRESSION_ENABLED) {
+		igt_assert_f(IS_DEVICE_MEMORY_REGION(obj->region),
+			     "XY_BLOCK_COPY_BLT supports compression "
+			     "on device memory only\n");
+		return AM_AUX_CCS_E;
+	}
+
+	return AM_AUX_NONE;
+}
+
+static void fill_data(struct gen12_block_copy_data *data,
+		      const struct blt_copy_data *blt,
+		      uint64_t src_offset, uint64_t dst_offset,
+		      bool extended_command)
+{
+	data->dw00.client = 0x2;
+	data->dw00.opcode = 0x41;
+	data->dw00.color_depth = blt->color_depth;
+	data->dw00.special_mode = __special_mode(blt);
+	data->dw00.length = extended_command ? 20 : 10;
+
+	data->dw01.dst_pitch = blt->dst.pitch - 1;
+	data->dw01.dst_aux_mode = __aux_mode(&blt->dst);
+	data->dw01.dst_mocs = blt->dst.mocs;
+	data->dw01.dst_compression = blt->dst.compression;
+	data->dw01.dst_tiling = __block_tiling(blt->dst.tiling);
+
+	if (blt->dst.compression)
+		data->dw01.dst_ctrl_surface_type = blt->dst.compression_type;
+
+	data->dw02.dst_x1 = blt->dst.x1;
+	data->dw02.dst_y1 = blt->dst.y1;
+
+	data->dw03.dst_x2 = blt->dst.x2;
+	data->dw03.dst_y2 = blt->dst.y2;
+
+	data->dw04.dst_address_lo = dst_offset;
+	data->dw05.dst_address_hi = dst_offset >> 32;
+
+	data->dw06.dst_x_offset = blt->dst.x_offset;
+	data->dw06.dst_y_offset = blt->dst.y_offset;
+	data->dw06.dst_target_memory = __memory_type(blt->dst.region);
+
+	data->dw07.src_x1 = blt->src.x1;
+	data->dw07.src_y1 = blt->src.y1;
+
+	data->dw08.src_pitch = blt->src.pitch - 1;
+	data->dw08.src_aux_mode = __aux_mode(&blt->src);
+	data->dw08.src_mocs = blt->src.mocs;
+	data->dw08.src_compression = blt->src.compression;
+	data->dw08.src_tiling = __block_tiling(blt->src.tiling);
+
+	if (blt->src.compression)
+		data->dw08.src_ctrl_surface_type = blt->src.compression_type;
+
+	data->dw09.src_address_lo = src_offset;
+	data->dw10.src_address_hi = src_offset >> 32;
+
+	data->dw11.src_x_offset = blt->src.x_offset;
+	data->dw11.src_y_offset = blt->src.y_offset;
+	data->dw11.src_target_memory = __memory_type(blt->src.region);
+}
+
+static void fill_data_ext(int i915,
+			  struct gen12_block_copy_data_ext *dext,
+			  const struct blt_block_copy_data_ext *ext)
+{
+	dext->dw12.src_compression_format = ext->src.compression_format;
+	dext->dw12.src_clear_value_enable = ext->src.clear_value_enable;
+	dext->dw12.src_clear_address_low = ext->src.clear_address;
+
+	dext->dw13.src_clear_address_hi0 = ext->src.clear_address >> 32;
+
+	dext->dw14.dst_compression_format = ext->dst.compression_format;
+	dext->dw14.dst_clear_value_enable = ext->dst.clear_value_enable;
+	dext->dw14.dst_clear_address_low = ext->dst.clear_address;
+
+	dext->dw15.dst_clear_address_hi0 = ext->dst.clear_address >> 32;
+
+	dext->dw16.dst_surface_width = ext->dst.surface_width - 1;
+	dext->dw16.dst_surface_height = ext->dst.surface_height - 1;
+	dext->dw16.dst_surface_type = ext->dst.surface_type;
+
+	dext->dw17.dst_lod = ext->dst.lod;
+	dext->dw17.dst_surface_depth = ext->dst.surface_depth;
+	dext->dw17.dst_surface_qpitch = ext->dst.surface_qpitch;
+
+	dext->dw18.dst_horizontal_align = ext->dst.horizontal_align;
+	dext->dw18.dst_vertical_align = ext->dst.vertical_align;
+	dext->dw18.dst_mip_tail_start_lod = ext->dst.mip_tail_start_lod;
+	dext->dw18.dst_depth_stencil_resource = ext->dst.depth_stencil_resource;
+	dext->dw18.dst_array_index = ext->dst.array_index;
+
+	dext->dw19.src_surface_width = ext->src.surface_width - 1;
+	dext->dw19.src_surface_height = ext->src.surface_height - 1;
+
+	dext->dw19.src_surface_type = ext->src.surface_type;
+
+	dext->dw20.src_lod = ext->src.lod;
+	dext->dw20.src_surface_depth = ext->src.surface_depth;
+	dext->dw20.src_surface_qpitch = ext->src.surface_qpitch;
+
+	dext->dw21.src_horizontal_align = ext->src.horizontal_align;
+	dext->dw21.src_vertical_align = ext->src.vertical_align;
+	dext->dw21.src_mip_tail_start_lod = ext->src.mip_tail_start_lod;
+	dext->dw21.src_depth_stencil_resource = ext->src.depth_stencil_resource;
+	dext->dw21.src_array_index = ext->src.array_index;
+}
+
+static void dump_bb_cmd(struct gen12_block_copy_data *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("details:\n");
+	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, color depth: %d, "
+		 "special mode: %d, length: %d>\n",
+		 cmd[0],
+		 data->dw00.client, data->dw00.opcode, data->dw00.color_depth,
+		 data->dw00.special_mode, data->dw00.length);
+	igt_info(" dw01: [%08x] dst <pitch: %d, aux: %d, mocs: %d, compr: %d, "
+		 "tiling: %d, ctrl surf type: %d>\n",
+		 cmd[1], data->dw01.dst_pitch, data->dw01.dst_aux_mode,
+		 data->dw01.dst_mocs, data->dw01.dst_compression,
+		 data->dw01.dst_tiling, data->dw01.dst_ctrl_surface_type);
+	igt_info(" dw02: [%08x] dst geom <x1: %d, y1: %d>\n",
+		 cmd[2], data->dw02.dst_x1, data->dw02.dst_y1);
+	igt_info(" dw03: [%08x]          <x2: %d, y2: %d>\n",
+		 cmd[3], data->dw03.dst_x2, data->dw03.dst_y2);
+	igt_info(" dw04: [%08x] dst offset lo (0x%x)\n",
+		 cmd[4], data->dw04.dst_address_lo);
+	igt_info(" dw05: [%08x] dst offset hi (0x%x)\n",
+		 cmd[5], data->dw05.dst_address_hi);
+	igt_info(" dw06: [%08x] dst <x offset: 0x%x, y offset: 0x%0x, target mem: %d>\n",
+		 cmd[6], data->dw06.dst_x_offset, data->dw06.dst_y_offset,
+		 data->dw06.dst_target_memory);
+	igt_info(" dw07: [%08x] src geom <x1: %d, y1: %d>\n",
+		 cmd[7], data->dw07.src_x1, data->dw07.src_y1);
+	igt_info(" dw08: [%08x] src <pitch: %d, aux: %d, mocs: %d, compr: %d, "
+		 "tiling: %d, ctrl surf type: %d>\n",
+		 cmd[8], data->dw08.src_pitch, data->dw08.src_aux_mode,
+		 data->dw08.src_mocs, data->dw08.src_compression,
+		 data->dw08.src_tiling, data->dw08.src_ctrl_surface_type);
+	igt_info(" dw09: [%08x] src offset lo (0x%x)\n",
+		 cmd[9], data->dw09.src_address_lo);
+	igt_info(" dw10: [%08x] src offset hi (0x%x)\n",
+		 cmd[10], data->dw10.src_address_hi);
+	igt_info(" dw11: [%08x] src <x offset: 0x%x, y offset: 0x%0x, target mem: %d>\n",
+		 cmd[11], data->dw11.src_x_offset, data->dw11.src_y_offset,
+		 data->dw11.src_target_memory);
+}
+
+static void dump_bb_ext(struct gen12_block_copy_data_ext *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("ext details:\n");
+	igt_info(" dw12: [%08x] src <compression fmt: %d, clear value enable: %d, "
+		 "clear address low: 0x%x>\n",
+		 cmd[0],
+		 data->dw12.src_compression_format,
+		 data->dw12.src_clear_value_enable,
+		 data->dw12.src_clear_address_low);
+	igt_info(" dw13: [%08x] src clear address hi: 0x%x\n",
+		 cmd[1], data->dw13.src_clear_address_hi0);
+	igt_info(" dw14: [%08x] dst <compression fmt: %d, clear value enable: %d, "
+		 "clear address low: 0x%x>\n",
+		 cmd[2],
+		 data->dw14.dst_compression_format,
+		 data->dw14.dst_clear_value_enable,
+		 data->dw14.dst_clear_address_low);
+	igt_info(" dw15: [%08x] dst clear address hi: 0x%x\n",
+		 cmd[3], data->dw15.dst_clear_address_hi0);
+	igt_info(" dw16: [%08x] dst surface <width: %d, height: %d, type: %d>\n",
+		 cmd[4], data->dw16.dst_surface_width,
+		 data->dw16.dst_surface_height, data->dw16.dst_surface_type);
+	igt_info(" dw17: [%08x] dst surface <lod: %d, depth: %d, qpitch: %d>\n",
+		 cmd[5], data->dw17.dst_lod,
+		 data->dw17.dst_surface_depth, data->dw17.dst_surface_qpitch);
+	igt_info(" dw18: [%08x] dst <halign: %d, valign: %d, mip tail: %d, "
+		 "depth stencil: %d, array index: %d>\n",
+		 cmd[6],
+		 data->dw18.dst_horizontal_align,
+		 data->dw18.dst_vertical_align,
+		 data->dw18.dst_mip_tail_start_lod,
+		 data->dw18.dst_depth_stencil_resource,
+		 data->dw18.dst_array_index);
+
+	igt_info(" dw19: [%08x] src surface <width: %d, height: %d, type: %d>\n",
+		 cmd[7], data->dw19.src_surface_width,
+		 data->dw19.src_surface_height, data->dw19.src_surface_type);
+	igt_info(" dw20: [%08x] src surface <lod: %d, depth: %d, qpitch: %d>\n",
+		 cmd[8], data->dw20.src_lod,
+		 data->dw20.src_surface_depth, data->dw20.src_surface_qpitch);
+	igt_info(" dw21: [%08x] src <halign: %d, valign: %d, mip tail: %d, "
+		 "depth stencil: %d, array index: %d>\n",
+		 cmd[9],
+		 data->dw21.src_horizontal_align,
+		 data->dw21.src_vertical_align,
+		 data->dw21.src_mip_tail_start_lod,
+		 data->dw21.src_depth_stencil_resource,
+		 data->dw21.src_array_index);
+}
+
+/**
+ * blt_block_copy:
+ * @i915: drm fd
+ * @ctx: intel_ctx_t context
+ * @e: blitter engine for @ctx
+ * @ahnd: allocator handle
+ * @blt: basic blitter data (for TGL/DG1 which doesn't support ext version)
+ * @ext: extended blitter data (for DG2+, supports flatccs compression)
+ *
+ * Function performs a blit between the src and dst objects described in @blt.
+ *
+ * Returns:
+ * execbuffer status.
+ */
+int blt_block_copy(int i915,
+		   const intel_ctx_t *ctx,
+		   const struct intel_execution_engine2 *e,
+		   uint64_t ahnd,
+		   const struct blt_copy_data *blt,
+		   const struct blt_block_copy_data_ext *ext)
+{
+	struct drm_i915_gem_execbuffer2 execbuf = {};
+	struct drm_i915_gem_exec_object2 obj[3] = {};
+	struct gen12_block_copy_data data = {};
+	struct gen12_block_copy_data_ext dext = {};
+	uint64_t dst_offset, src_offset, bb_offset, alignment;
+	uint32_t *bb;
+	int i, ret;
+
+	igt_assert_f(ahnd, "block-copy supports softpin only\n");
+	igt_assert_f(blt, "block-copy requires data to do blit\n");
+
+	alignment = gem_detect_safe_alignment(i915);
+	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
+	if (__special_mode(blt) == SM_FULL_RESOLVE)
+		dst_offset = src_offset;
+	else
+		dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
+
+	fill_data(&data, blt, src_offset, dst_offset, ext);
+
+	i = sizeof(data) / sizeof(uint32_t);
+	bb = gem_mmap__device_coherent(i915, blt->bb.handle, 0, blt->bb.size,
+				       PROT_READ | PROT_WRITE);
+	memcpy(bb, &data, sizeof(data));
+
+	if (ext) {
+		fill_data_ext(i915, &dext, ext);
+		memcpy(bb + i, &dext, sizeof(dext));
+		i += sizeof(dext) / sizeof(uint32_t);
+	}
+	bb[i++] = MI_BATCH_BUFFER_END;
+
+	if (blt->print_bb) {
+		igt_info("[BLOCK COPY]\n");
+		igt_info("src offset: %llx, dst offset: %llx, bb offset: %llx\n",
+			 (long long) src_offset, (long long) dst_offset,
+			 (long long) bb_offset);
+
+		dump_bb_cmd(&data);
+		if (ext)
+			dump_bb_ext(&dext);
+	}
+
+	munmap(bb, blt->bb.size);
+
+	obj[0].offset = CANONICAL(dst_offset);
+	obj[1].offset = CANONICAL(src_offset);
+	obj[2].offset = CANONICAL(bb_offset);
+	obj[0].handle = blt->dst.handle;
+	obj[1].handle = blt->src.handle;
+	obj[2].handle = blt->bb.handle;
+	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	execbuf.buffer_count = 3;
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.rsvd1 = ctx ? ctx->id : 0;
+	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+	ret = __gem_execbuf(i915, &execbuf);
+	if (data.dw00.special_mode != SM_FULL_RESOLVE)
+		put_offset(ahnd, blt->dst.handle);
+	put_offset(ahnd, blt->src.handle);
+	put_offset(ahnd, blt->bb.handle);
+
+	return ret;
+}
+
+static uint16_t __ccs_size(const struct blt_ctrl_surf_copy_data *surf)
+{
+	uint32_t src_size, dst_size;
+
+	src_size = surf->src.access_type == DIRECT_ACCESS ?
+				surf->src.size : surf->src.size / CCS_RATIO;
+
+	dst_size = surf->dst.access_type == DIRECT_ACCESS ?
+				surf->dst.size : surf->dst.size / CCS_RATIO;
+
+	igt_assert_f(src_size <= dst_size, "dst size must be >= src size for CCS copy\n");
+
+	return src_size;
+}
+
+struct gen12_ctrl_surf_copy_data {
+	struct {
+		uint32_t length:			BITRANGE(0, 7);
+		uint32_t size_of_ctrl_copy:		BITRANGE(8, 17);
+		uint32_t rsvd0:				BITRANGE(18, 19);
+		uint32_t dst_access_type:		BITRANGE(20, 20);
+		uint32_t src_access_type:		BITRANGE(21, 21);
+		uint32_t opcode:			BITRANGE(22, 28);
+		uint32_t client:			BITRANGE(29, 31);
+	} dw00;
+
+	struct {
+		uint32_t src_address_lo;
+	} dw01;
+
+	struct {
+		uint32_t src_address_hi:		BITRANGE(0, 24);
+		uint32_t src_mocs:			BITRANGE(25, 31);
+	} dw02;
+
+	struct {
+		uint32_t dst_address_lo;
+	} dw03;
+
+	struct {
+		uint32_t dst_address_hi:		BITRANGE(0, 24);
+		uint32_t dst_mocs:			BITRANGE(25, 31);
+	} dw04;
+};
+
+static void dump_bb_surf_ctrl_cmd(const struct gen12_ctrl_surf_copy_data *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("details:\n");
+	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, "
+		 "src/dst access type: <%d, %d>, size of ctrl copy: %u, length: %d>\n",
+		 cmd[0],
+		 data->dw00.client, data->dw00.opcode,
+		 data->dw00.src_access_type, data->dw00.dst_access_type,
+		 data->dw00.size_of_ctrl_copy, data->dw00.length);
+	igt_info(" dw01: [%08x] src offset lo (0x%x)\n",
+		 cmd[1], data->dw01.src_address_lo);
+	igt_info(" dw02: [%08x] src offset hi (0x%x), src mocs: %u\n",
+		 cmd[2], data->dw02.src_address_hi, data->dw02.src_mocs);
+	igt_info(" dw03: [%08x] dst offset lo (0x%x)\n",
+		 cmd[3], data->dw03.dst_address_lo);
+	igt_info(" dw04: [%08x] dst offset hi (0x%x), dst mocs: %u\n",
+		 cmd[4], data->dw04.dst_address_hi, data->dw04.dst_mocs);
+}
+
+/**
+ * blt_ctrl_surf_copy:
+ * @i915: drm fd
+ * @ctx: intel_ctx_t context
+ * @e: blitter engine for @ctx
+ * @ahnd: allocator handle
+ * @surf: blitter data for ctrl-surf-copy
+ *
+ * Function performs a ctrl-surf-copy blit between the src and dst objects
+ * described in @surf.
+ *
+ * Returns:
+ * 0; gem_execbuf() asserts if the execbuffer submission fails.
+ */
+int blt_ctrl_surf_copy(int i915,
+		       const intel_ctx_t *ctx,
+		       const struct intel_execution_engine2 *e,
+		       uint64_t ahnd,
+		       const struct blt_ctrl_surf_copy_data *surf)
+{
+	struct drm_i915_gem_execbuffer2 execbuf = {};
+	struct drm_i915_gem_exec_object2 obj[3] = {};
+	struct gen12_ctrl_surf_copy_data data = {};
+	uint64_t dst_offset, src_offset, bb_offset, alignment;
+	uint32_t *bb;
+	int i;
+
+	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
+	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
+
+	alignment = gem_detect_safe_alignment(i915);
+
+	data.dw00.client = 0x2;
+	data.dw00.opcode = 0x48;
+	data.dw00.src_access_type = surf->src.access_type;
+	data.dw00.dst_access_type = surf->dst.access_type;
+
+	/* Ensure dst is large enough to hold the src ccs aux data */
+	data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
+	data.dw00.length = 0x3;
+
+	src_offset = get_offset(ahnd, surf->src.handle, surf->src.size,
+				alignment);
+	dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size,
+				alignment);
+	bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size,
+			       alignment);
+
+	data.dw01.src_address_lo = src_offset;
+	data.dw02.src_address_hi = src_offset >> 32;
+	data.dw02.src_mocs = surf->src.mocs;
+
+	data.dw03.dst_address_lo = dst_offset;
+	data.dw04.dst_address_hi = dst_offset >> 32;
+	data.dw04.dst_mocs = surf->dst.mocs;
+
+	i = sizeof(data) / sizeof(uint32_t);
+	bb = gem_mmap__device_coherent(i915, surf->bb.handle, 0, surf->bb.size,
+				       PROT_READ | PROT_WRITE);
+	memcpy(bb, &data, sizeof(data));
+	bb[i++] = MI_BATCH_BUFFER_END;
+
+	if (surf->print_bb) {
+		igt_info("BB [CTRL SURF]:\n");
+		igt_info("src offset: %llx, dst offset: %llx, bb offset: %llx\n",
+			 (long long) src_offset, (long long) dst_offset,
+			 (long long) bb_offset);
+
+		dump_bb_surf_ctrl_cmd(&data);
+	}
+	munmap(bb, surf->bb.size);
+
+	obj[0].offset = CANONICAL(dst_offset);
+	obj[1].offset = CANONICAL(src_offset);
+	obj[2].offset = CANONICAL(bb_offset);
+	obj[0].handle = surf->dst.handle;
+	obj[1].handle = surf->src.handle;
+	obj[2].handle = surf->bb.handle;
+	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	execbuf.buffer_count = 3;
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+	execbuf.rsvd1 = ctx ? ctx->id : 0;
+	gem_execbuf(i915, &execbuf);
+	put_offset(ahnd, surf->dst.handle);
+	put_offset(ahnd, surf->src.handle);
+	put_offset(ahnd, surf->bb.handle);
+
+	return 0;
+}
+
+struct gen12_fast_copy_data {
+	struct {
+		uint32_t length:			BITRANGE(0, 7);
+		uint32_t rsvd1:				BITRANGE(8, 12);
+		uint32_t dst_tiling:			BITRANGE(13, 14);
+		uint32_t rsvd0:				BITRANGE(15, 19);
+		uint32_t src_tiling:			BITRANGE(20, 21);
+		uint32_t opcode:			BITRANGE(22, 28);
+		uint32_t client:			BITRANGE(29, 31);
+	} dw00;
+
+	struct {
+		uint32_t dst_pitch:			BITRANGE(0, 15);
+		uint32_t rsvd1:				BITRANGE(16, 23);
+		uint32_t color_depth:			BITRANGE(24, 26);
+		uint32_t rsvd0:				BITRANGE(27, 27);
+		uint32_t dst_memory:			BITRANGE(28, 28);
+		uint32_t src_memory:			BITRANGE(29, 29);
+		uint32_t dst_type_y:			BITRANGE(30, 30);
+		uint32_t src_type_y:			BITRANGE(31, 31);
+	} dw01;
+
+	struct {
+		int32_t dst_x1:				BITRANGE(0, 15);
+		int32_t dst_y1:				BITRANGE(16, 31);
+	} dw02;
+
+	struct {
+		int32_t dst_x2:				BITRANGE(0, 15);
+		int32_t dst_y2:				BITRANGE(16, 31);
+	} dw03;
+
+	struct {
+		uint32_t dst_address_lo;
+	} dw04;
+
+	struct {
+		uint32_t dst_address_hi;
+	} dw05;
+
+	struct {
+		int32_t src_x1:				BITRANGE(0, 15);
+		int32_t src_y1:				BITRANGE(16, 31);
+	} dw06;
+
+	struct {
+		uint32_t src_pitch:			BITRANGE(0, 15);
+		uint32_t rsvd0:				BITRANGE(16, 31);
+	} dw07;
+
+	struct {
+		uint32_t src_address_lo;
+	} dw08;
+
+	struct {
+		uint32_t src_address_hi;
+	} dw09;
+};
+
+static int __fast_tiling(enum blt_tiling tiling)
+{
+	switch (tiling) {
+	case T_LINEAR: return 0;
+	case T_XMAJOR: return 1;
+	case T_YMAJOR: return 2;
+	case T_TILE4:  return 2;
+	case T_TILE64: return 3;
+	}
+	return 0;
+}
+
+static int __fast_color_depth(enum blt_color_depth depth)
+{
+	switch (depth) {
+	case CD_8bit:   return 0;
+	case CD_16bit:  return 1;
+	case CD_32bit:  return 3;
+	case CD_64bit:  return 4;
+	case CD_96bit:
+		igt_assert_f(0, "Unsupported depth\n");
+		break;
+	case CD_128bit: return 5;
+	}
+	return 0;
+}
+
+static void dump_bb_fast_cmd(struct gen12_fast_copy_data *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("BB details:\n");
+	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, src tiling: %d, "
+		 "dst tiling: %d, length: %d>\n",
+		 cmd[0], data->dw00.client, data->dw00.opcode,
+		 data->dw00.src_tiling, data->dw00.dst_tiling, data->dw00.length);
+	igt_info(" dw01: [%08x] dst <pitch: %d, color depth: %d, dst memory: %d, "
+		 "src memory: %d, \n"
+		 "\t\t\tdst type tile: %d (0-legacy, 1-tile4),\n"
+		 "\t\t\tsrc type tile: %d (0-legacy, 1-tile4)>\n",
+		 cmd[1], data->dw01.dst_pitch, data->dw01.color_depth,
+		 data->dw01.dst_memory, data->dw01.src_memory,
+		 data->dw01.dst_type_y, data->dw01.src_type_y);
+	igt_info(" dw02: [%08x] dst geom <x1: %d, y1: %d>\n",
+		 cmd[2], data->dw02.dst_x1, data->dw02.dst_y1);
+	igt_info(" dw03: [%08x]          <x2: %d, y2: %d>\n",
+		 cmd[3], data->dw03.dst_x2, data->dw03.dst_y2);
+	igt_info(" dw04: [%08x] dst offset lo (0x%x)\n",
+		 cmd[4], data->dw04.dst_address_lo);
+	igt_info(" dw05: [%08x] dst offset hi (0x%x)\n",
+		 cmd[5], data->dw05.dst_address_hi);
+	igt_info(" dw06: [%08x] src geom <x1: %d, y1: %d>\n",
+		 cmd[6], data->dw06.src_x1, data->dw06.src_y1);
+	igt_info(" dw07: [%08x] src <pitch: %d>\n",
+		 cmd[7], data->dw07.src_pitch);
+	igt_info(" dw08: [%08x] src offset lo (0x%x)\n",
+		 cmd[8], data->dw08.src_address_lo);
+	igt_info(" dw09: [%08x] src offset hi (0x%x)\n",
+		 cmd[9], data->dw09.src_address_hi);
+}
+
+/**
+ * blt_fast_copy:
+ * @i915: drm fd
+ * @ctx: intel_ctx_t context
+ * @e: blitter engine for @ctx
+ * @ahnd: allocator handle
+ * @blt: blitter data for fast-copy (same struct as for block-copy, but the
+ * compression fields are ignored).
+ *
+ * Performs a fast-copy blit between the src and dst objects described in @blt.
+ *
+ * Returns:
+ * execbuffer status.
+ */
+int blt_fast_copy(int i915,
+		  const intel_ctx_t *ctx,
+		  const struct intel_execution_engine2 *e,
+		  uint64_t ahnd,
+		  const struct blt_copy_data *blt)
+{
+	struct drm_i915_gem_execbuffer2 execbuf = {};
+	struct drm_i915_gem_exec_object2 obj[3] = {};
+	struct gen12_fast_copy_data data = {};
+	uint64_t dst_offset, src_offset, bb_offset, alignment;
+	uint32_t *bb;
+	int i, ret;
+
+	alignment = gem_detect_safe_alignment(i915);
+
+	data.dw00.client = 0x2;
+	data.dw00.opcode = 0x42;
+	data.dw00.dst_tiling = __fast_tiling(blt->dst.tiling);
+	data.dw00.src_tiling = __fast_tiling(blt->src.tiling);
+	data.dw00.length = 8;
+
+	data.dw01.dst_pitch = blt->dst.pitch;
+	data.dw01.color_depth = __fast_color_depth(blt->color_depth);
+	data.dw01.dst_memory = __memory_type(blt->dst.region);
+	data.dw01.src_memory = __memory_type(blt->src.region);
+	data.dw01.dst_type_y = blt->dst.tiling == T_TILE4 ? 1 : 0;
+	data.dw01.src_type_y = blt->src.tiling == T_TILE4 ? 1 : 0;
+
+	data.dw02.dst_x1 = blt->dst.x1;
+	data.dw02.dst_y1 = blt->dst.y1;
+
+	data.dw03.dst_x2 = blt->dst.x2;
+	data.dw03.dst_y2 = blt->dst.y2;
+
+	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
+	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
+
+	data.dw04.dst_address_lo = dst_offset;
+	data.dw05.dst_address_hi = dst_offset >> 32;
+
+	data.dw06.src_x1 = blt->src.x1;
+	data.dw06.src_y1 = blt->src.y1;
+
+	data.dw07.src_pitch = blt->src.pitch;
+
+	data.dw08.src_address_lo = src_offset;
+	data.dw09.src_address_hi = src_offset >> 32;
+
+	i = sizeof(data) / sizeof(uint32_t);
+	bb = gem_mmap__device_coherent(i915, blt->bb.handle, 0, blt->bb.size,
+				       PROT_READ | PROT_WRITE);
+
+	memcpy(bb, &data, sizeof(data));
+	bb[i++] = MI_BATCH_BUFFER_END;
+
+	if (blt->print_bb) {
+		igt_info("BB [FAST COPY]\n");
+		igt_info("blit [src offset: %llx, dst offset: %llx]\n",
+			 (long long) src_offset, (long long) dst_offset);
+		dump_bb_fast_cmd(&data);
+	}
+
+	munmap(bb, blt->bb.size);
+
+	obj[0].offset = CANONICAL(dst_offset);
+	obj[1].offset = CANONICAL(src_offset);
+	obj[2].offset = CANONICAL(bb_offset);
+	obj[0].handle = blt->dst.handle;
+	obj[1].handle = blt->src.handle;
+	obj[2].handle = blt->bb.handle;
+	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	execbuf.buffer_count = 3;
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.rsvd1 = ctx ? ctx->id : 0;
+	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+	ret = __gem_execbuf(i915, &execbuf);
+	put_offset(ahnd, blt->dst.handle);
+	put_offset(ahnd, blt->src.handle);
+	put_offset(ahnd, blt->bb.handle);
+
+	return ret;
+}
+
+/**
+ * blt_surface_fill_rect:
+ * @i915: drm fd
+ * @obj: blitter copy object (@blt_copy_object) to fill with gradient pattern
+ * @width: width
+ * @height: height
+ *
+ * Fills a @width x @height, 24bpp surface with a color gradient
+ * (internally uses ARGB where A == 0xff, see Cairo docs).
+ */
+void blt_surface_fill_rect(int i915, const struct blt_copy_object *obj,
+			   uint32_t width, uint32_t height)
+{
+	cairo_surface_t *surface;
+	cairo_pattern_t *pat;
+	cairo_t *cr;
+	void *map = obj->ptr;
+
+	if (!map)
+		map = gem_mmap__device_coherent(i915, obj->handle, 0,
+						obj->size, PROT_READ | PROT_WRITE);
+
+	surface = cairo_image_surface_create_for_data(map,
+						      CAIRO_FORMAT_RGB24,
+						      width, height,
+						      obj->pitch);
+
+	cr = cairo_create(surface);
+
+	cairo_rectangle(cr, 0, 0, width, height);
+	cairo_clip(cr);
+
+	pat = cairo_pattern_create_mesh();
+	cairo_mesh_pattern_begin_patch(pat);
+	cairo_mesh_pattern_move_to(pat, 0, 0);
+	cairo_mesh_pattern_line_to(pat, width, 0);
+	cairo_mesh_pattern_line_to(pat, width, height);
+	cairo_mesh_pattern_line_to(pat, 0, height);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 0, 1.0, 0.0, 0.0);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 1, 0.0, 1.0, 0.0);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 2, 0.0, 0.0, 1.0);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 3, 1.0, 1.0, 1.0);
+	cairo_mesh_pattern_end_patch(pat);
+
+	cairo_rectangle(cr, 0, 0, width, height);
+	cairo_set_source(cr, pat);
+	cairo_fill(cr);
+	cairo_pattern_destroy(pat);
+
+	cairo_destroy(cr);
+
+	cairo_surface_destroy(surface);
+	if (!obj->ptr)
+		munmap(map, obj->size);
+}
+
+/**
+ * blt_surface_info:
+ * @info: information header
+ * @obj: blitter copy object (@blt_copy_object) to print surface info
+ */
+void blt_surface_info(const char *info, const struct blt_copy_object *obj)
+{
+	igt_info("[%s]\n", info);
+	igt_info("surface <handle: %u, size: %llx, region: %x, mocs: %x>\n",
+		 obj->handle, (long long) obj->size, obj->region, obj->mocs);
+	igt_info("        <tiling: %s, compression: %u, compression type: %d>\n",
+		 blt_tiling_name(obj->tiling), obj->compression, obj->compression_type);
+	igt_info("        <pitch: %u, offset [x: %u, y: %u] geom [<%d,%d> <%d,%d>]>\n",
+		 obj->pitch, obj->x_offset, obj->y_offset,
+		 obj->x1, obj->y1, obj->x2, obj->y2);
+}
+
+/**
+ * blt_surface_to_png:
+ * @i915: drm fd
+ * @run_id: prefix id to allow grouping files stored from single run
+ * @fileid: file identifier
+ * @obj: blitter copy object (@blt_copy_object) to save to png
+ * @width: width
+ * @height: height
+ *
+ * Saves the surface to a PNG file. Assumes ARGB format where A == 0xff.
+ */
+void blt_surface_to_png(int i915, uint32_t run_id, const char *fileid,
+			const struct blt_copy_object *obj,
+			uint32_t width, uint32_t height)
+{
+	cairo_surface_t *surface;
+	cairo_status_t ret;
+	uint8_t *map = (uint8_t *) obj->ptr;
+	int format;
+	int stride = obj->tiling ? obj->pitch * 4 : obj->pitch;
+	char filename[FILENAME_MAX];
+
+	snprintf(filename, FILENAME_MAX-1, "%d-%s-%s-%ux%u-%s.png",
+		 run_id, fileid, blt_tiling_name(obj->tiling), width, height,
+		 obj->compression ? "compressed" : "uncompressed");
+
+	if (!map)
+		map = gem_mmap__device_coherent(i915, obj->handle, 0,
+						obj->size, PROT_READ);
+	format = CAIRO_FORMAT_RGB24;
+	surface = cairo_image_surface_create_for_data(map,
+						      format, width, height,
+						      stride);
+	ret = cairo_surface_write_to_png(surface, filename);
+	if (ret)
+		igt_info("Cairo ret: %d (%s)\n", ret, cairo_status_to_string(ret));
+	igt_assert(ret == CAIRO_STATUS_SUCCESS);
+	cairo_surface_destroy(surface);
+
+	if (!obj->ptr)
+		munmap(map, obj->size);
+}
diff --git a/lib/i915/i915_blt.h b/lib/i915/i915_blt.h
new file mode 100644
index 000000000..e0e8b52bc
--- /dev/null
+++ b/lib/i915/i915_blt.h
@@ -0,0 +1,196 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+/**
+ * SECTION:i915_blt
+ * @short_description: i915 blitter library
+ * @title: Blitter library
+ * @include: i915_blt.h
+ *
+ * # Introduction
+ *
+ * Gen12+ blitter commands like XY_BLOCK_COPY_BLT are quite long. Passing
+ * every field as a separate function argument would make the argument list
+ * long, unreadable and prone to misplaced arguments. Passing objects
+ * (structs) instead is more reasonable and makes it possible to share
+ * object data across different blitter commands.
+ *
+ * The blitter library supports no-reloc (softpin) mode only (relocations are
+ * not enabled anywhere apart from TGL), thus ahnd is mandatory. A NULL ctx
+ * selects the default context and a NULL e selects I915_EXEC_BLT.
+ *
+ * The library introduces a tiling enum which distinguishes tiling formats
+ * independently of legacy I915_TILING_... definitions. This gives full control
+ * over which tilings a command handles; unsupported ones can be skipped/asserted.
+ *
+ * # Supported commands
+ *
+ * - XY_BLOCK_COPY_BLT - (block-copy) TGL/DG1 + DG2+ (ext version)
+ * - XY_FAST_COPY_BLT - (fast-copy)
+ * - XY_CTRL_SURF_COPY_BLT - (ctrl-surf-copy) DG2+
+ *
+ * # Usage details
+ *
+ * For block-copy and fast-copy the @blt_copy_object struct collects data
+ * about the source and destination objects. It contains the handle, region,
+ * size, etc. which are used for blits. Some fields (like compression) are
+ * not used by fast-copy; a field used exclusively by one command is
+ * annotated as such in a comment.
+ *
+ */
+
+#include <errno.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <malloc.h>
+#include "drm.h"
+#include "igt.h"
+
+#define CCS_RATIO 256
+
+enum blt_color_depth {
+	CD_8bit,
+	CD_16bit,
+	CD_32bit,
+	CD_64bit,
+	CD_96bit,
+	CD_128bit,
+};
+
+enum blt_tiling {
+	T_LINEAR,
+	T_XMAJOR,
+	T_YMAJOR,
+	T_TILE4,
+	T_TILE64,
+};
+
+enum blt_compression {
+	COMPRESSION_DISABLED,
+	COMPRESSION_ENABLED,
+};
+
+enum blt_compression_type {
+	COMPRESSION_TYPE_3D,
+	COMPRESSION_TYPE_MEDIA,
+};
+
+/* BC - block-copy */
+struct blt_copy_object {
+	uint32_t handle;
+	uint32_t region;
+	uint64_t size;
+	uint8_t mocs;
+	enum blt_tiling tiling;
+	enum blt_compression compression;  /* BC only */
+	enum blt_compression_type compression_type; /* BC only */
+	uint32_t pitch;
+	uint16_t x_offset, y_offset;
+	int16_t x1, y1, x2, y2;
+
+	/* mapping or null */
+	uint32_t *ptr;
+};
+
+struct blt_copy_batch {
+	uint32_t handle;
+	uint32_t region;
+	uint64_t size;
+};
+
+/* Common for block-copy and fast-copy */
+struct blt_copy_data {
+	int i915;
+	struct blt_copy_object src;
+	struct blt_copy_object dst;
+	struct blt_copy_batch bb;
+	enum blt_color_depth color_depth;
+
+	/* debug stuff */
+	bool print_bb;
+};
+
+enum blt_surface_type {
+	SURFACE_TYPE_1D,
+	SURFACE_TYPE_2D,
+	SURFACE_TYPE_3D,
+	SURFACE_TYPE_CUBE,
+};
+
+struct blt_block_copy_object_ext {
+	uint8_t compression_format;
+	bool clear_value_enable;
+	uint64_t clear_address;
+	uint16_t surface_width;
+	uint16_t surface_height;
+	enum blt_surface_type surface_type;
+	uint16_t surface_qpitch;
+	uint16_t surface_depth;
+	uint8_t lod;
+	uint8_t horizontal_align;
+	uint8_t vertical_align;
+	uint8_t mip_tail_start_lod;
+	bool depth_stencil_resource;
+	uint16_t array_index;
+};
+
+struct blt_block_copy_data_ext {
+	struct blt_block_copy_object_ext src;
+	struct blt_block_copy_object_ext dst;
+};
+
+enum blt_access_type {
+	INDIRECT_ACCESS,
+	DIRECT_ACCESS,
+};
+
+struct blt_ctrl_surf_copy_object {
+	uint32_t handle;
+	uint32_t region;
+	uint64_t size;
+	uint8_t mocs;
+	enum blt_access_type access_type;
+};
+
+struct blt_ctrl_surf_copy_data {
+	int i915;
+	struct blt_ctrl_surf_copy_object src;
+	struct blt_ctrl_surf_copy_object dst;
+	struct blt_copy_batch bb;
+
+	/* debug stuff */
+	bool print_bb;
+};
+
+bool blt_supports_compression(int i915);
+bool blt_supports_tiling(int i915, enum blt_tiling tiling);
+const char *blt_tiling_name(enum blt_tiling tiling);
+
+int blt_block_copy(int i915,
+		   const intel_ctx_t *ctx,
+		   const struct intel_execution_engine2 *e,
+		   uint64_t ahnd,
+		   const struct blt_copy_data *blt,
+		   const struct blt_block_copy_data_ext *ext);
+
+int blt_ctrl_surf_copy(int i915,
+		       const intel_ctx_t *ctx,
+		       const struct intel_execution_engine2 *e,
+		       uint64_t ahnd,
+		       const struct blt_ctrl_surf_copy_data *surf);
+
+int blt_fast_copy(int i915,
+		  const intel_ctx_t *ctx,
+		  const struct intel_execution_engine2 *e,
+		  uint64_t ahnd,
+		  const struct blt_copy_data *blt);
+
+void blt_surface_info(const char *info,
+		      const struct blt_copy_object *obj);
+void blt_surface_fill_rect(int i915, const struct blt_copy_object *obj,
+			   uint32_t width, uint32_t height);
+void blt_surface_to_png(int i915, uint32_t run_id, const char *fileid,
+		    const struct blt_copy_object *obj,
+		    uint32_t width, uint32_t height);
diff --git a/lib/meson.build b/lib/meson.build
index 3e43316d1..fe035672e 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -11,6 +11,7 @@ lib_sources = [
 	'i915/gem_mman.c',
 	'i915/gem_vm.c',
 	'i915/intel_memory_region.c',
+	'i915/i915_blt.c',
 	'igt_collection.c',
 	'igt_color_encoding.c',
 	'igt_debugfs.c',
-- 
2.32.0

* [igt-dev] [PATCH i-g-t 2/3] lib/i915_blt: Add library for blitter
  2022-02-16  9:19 [igt-dev] [PATCH i-g-t 0/3] Add i915 blt library + gem_ccs test Zbigniew Kempczyński
@ 2022-02-16  9:19 ` Zbigniew Kempczyński
  0 siblings, 0 replies; 12+ messages in thread
From: Zbigniew Kempczyński @ 2022-02-16  9:19 UTC (permalink / raw)
  To: igt-dev

Blitter commands have become complicated, so manual bit-shifting is
error prone and hard to debug. XY_BLOCK_COPY_BLT is the best example:
in its extended version (for DG2+) it spans 22 dwords of command data
(dw00-dw21, length field 20). To avoid mistakes and functions taking
dozens of arguments, the library accepts input data in a more
structured form.
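
To illustrate the point about bit-shifting, here is a minimal standalone
sketch (not part of the patch) that builds dw00 of XY_FAST_COPY_BLT both
ways, using the field layout from gen12_fast_copy_data. The function names
are hypothetical. The comparison assumes a little-endian ABI that allocates
bitfields from the least significant bit (as GCC and Clang do on x86-64);
bitfield layout is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BITRANGE(start, end) ((end) - (start) + 1)

/* dw00 of XY_FAST_COPY_BLT, same field layout as gen12_fast_copy_data. */
struct fast_copy_dw00 {
	uint32_t length:	BITRANGE(0, 7);
	uint32_t rsvd1:		BITRANGE(8, 12);
	uint32_t dst_tiling:	BITRANGE(13, 14);
	uint32_t rsvd0:		BITRANGE(15, 19);
	uint32_t src_tiling:	BITRANGE(20, 21);
	uint32_t opcode:	BITRANGE(22, 28);
	uint32_t client:	BITRANGE(29, 31);
};

/* Hypothetical helper: fill dw00 through named bitfields. */
static uint32_t dw00_from_bitfields(uint32_t src_tiling, uint32_t dst_tiling)
{
	struct fast_copy_dw00 dw = {0};
	uint32_t raw;

	/* Both tiling fields are two bits wide. */
	assert(src_tiling < 4 && dst_tiling < 4);

	dw.client = 0x2;
	dw.opcode = 0x42;
	dw.src_tiling = src_tiling;
	dw.dst_tiling = dst_tiling;
	dw.length = 8;

	/* Copy out the raw dword without violating aliasing rules. */
	memcpy(&raw, &dw, sizeof(raw));
	return raw;
}

/* Hypothetical helper: the same dword assembled with manual shifts. */
static uint32_t dw00_from_shifts(uint32_t src_tiling, uint32_t dst_tiling)
{
	return (0x2u << 29) | (0x42u << 22) |
	       ((src_tiling & 0x3u) << 20) |
	       ((dst_tiling & 0x3u) << 13) | 8u;
}
```

On such an ABI both helpers should produce the same dword, but only the
shift version requires the reader to re-derive every bit position by hand.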

Currently supported commands:
- XY_BLOCK_COPY_BLT:
  a)  TGL/DG1 use a shorter version of the command, which doesn't support
      compression
  b)  On DG2+ the command is extended and supports compression
- XY_CTRL_SURF_COPY_BLT
- XY_FAST_COPY_BLT

Source, destination and batchbuffer are passed to the blitter functions
as objects (structs). This improves readability and allows the same
object to be used by many functions. The only drawback of this approach
is that some fields used by one function may be ignored by another. An
example is blt_copy_object, which contains a lot of information about a
gem object: block-copy uses all of it, but fast-copy uses only part
(fast-copy doesn't support compression).
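
As a rough illustration of the shared-object idea, the sketch below uses a
trimmed copy of blt_copy_object from i915_blt.h together with a hypothetical
helper, example_init_linear_object(), which is not part of this patch. It
assumes a linear 32bpp surface with the pitch given in bytes, assumes that
x2/y2 describe the whole surface, and ignores any size alignment a real
allocation would need.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Trimmed copies of the definitions added in lib/i915/i915_blt.h. */
enum blt_tiling { T_LINEAR, T_XMAJOR, T_YMAJOR, T_TILE4, T_TILE64 };
enum blt_compression { COMPRESSION_DISABLED, COMPRESSION_ENABLED };
enum blt_compression_type { COMPRESSION_TYPE_3D, COMPRESSION_TYPE_MEDIA };

struct blt_copy_object {
	uint32_t handle;
	uint32_t region;
	uint64_t size;
	uint8_t mocs;
	enum blt_tiling tiling;
	enum blt_compression compression;		/* BC only */
	enum blt_compression_type compression_type;	/* BC only */
	uint32_t pitch;
	uint16_t x_offset, y_offset;
	int16_t x1, y1, x2, y2;
	uint32_t *ptr;					/* mapping or null */
};

/* Hypothetical helper (not in the patch): describe a linear 32bpp surface. */
static void example_init_linear_object(struct blt_copy_object *obj,
				       uint32_t handle, uint32_t region,
				       uint32_t width, uint32_t height)
{
	/* x2/y2 are int16_t in blt_copy_object, so keep the geometry in range. */
	assert(width <= INT16_MAX && height <= INT16_MAX);

	/* Zeroing leaves compression at COMPRESSION_DISABLED and ptr at NULL. */
	memset(obj, 0, sizeof(*obj));
	obj->handle = handle;
	obj->region = region;
	obj->tiling = T_LINEAR;
	obj->pitch = width * 4;			/* bytes per row at 32bpp */
	obj->size = (uint64_t)obj->pitch * height;
	obj->x2 = (int16_t)width;		/* assumed: cover the whole surface */
	obj->y2 = (int16_t)height;
}
```

Because the BC-only fields stay zeroed, an object described this way could
be handed to either block-copy or fast-copy; fast-copy would simply not
look at the compression fields.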

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
---
 .../igt-gpu-tools/igt-gpu-tools-docs.xml      |    1 +
 lib/i915/i915_blt.c                           | 1087 +++++++++++++++++
 lib/i915/i915_blt.h                           |  196 +++
 lib/meson.build                               |    1 +
 4 files changed, 1285 insertions(+)
 create mode 100644 lib/i915/i915_blt.c
 create mode 100644 lib/i915/i915_blt.h

diff --git a/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml b/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
index 0dc5a0b7e..3a2edbae1 100644
--- a/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
+++ b/docs/reference/igt-gpu-tools/igt-gpu-tools-docs.xml
@@ -16,6 +16,7 @@
   <chapter>
     <title>API Reference</title>
     <xi:include href="xml/drmtest.xml"/>
+    <xi:include href="xml/i915_blt.xml"/>
     <xi:include href="xml/igt_alsa.xml"/>
     <xi:include href="xml/igt_audio.xml"/>
     <xi:include href="xml/igt_aux.xml"/>
diff --git a/lib/i915/i915_blt.c b/lib/i915/i915_blt.c
new file mode 100644
index 000000000..c6f115009
--- /dev/null
+++ b/lib/i915/i915_blt.c
@@ -0,0 +1,1087 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+#include <errno.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <malloc.h>
+#include <cairo.h>
+#include "drm.h"
+#include "igt.h"
+#include "gem_create.h"
+#include "i915_blt.h"
+
+#define BITRANGE(start, end) ((end) - (start) + 1)
+
+enum blt_special_mode {
+	SM_NONE,
+	SM_FULL_RESOLVE,
+	SM_PARTIAL_RESOLVE,
+	SM_RESERVED,
+};
+
+enum blt_aux_mode {
+	AM_AUX_NONE,
+	AM_AUX_CCS_E = 5,
+};
+
+enum blt_target_mem {
+	TM_LOCAL_MEM,
+	TM_SYSTEM_MEM,
+};
+
+struct gen12_block_copy_data {
+	struct {
+		uint32_t length:			BITRANGE(0, 7);
+		uint32_t rsvd1:				BITRANGE(8, 8);
+		uint32_t multisamples:			BITRANGE(9, 11);
+		uint32_t special_mode:			BITRANGE(12, 13);
+		uint32_t rsvd0:				BITRANGE(14, 18);
+		uint32_t color_depth:			BITRANGE(19, 21);
+		uint32_t opcode:			BITRANGE(22, 28);
+		uint32_t client:			BITRANGE(29, 31);
+	} dw00;
+
+	struct {
+		uint32_t dst_pitch:			BITRANGE(0, 17);
+		uint32_t dst_aux_mode:			BITRANGE(18, 20);
+		uint32_t dst_mocs:			BITRANGE(21, 27);
+		uint32_t dst_ctrl_surface_type:		BITRANGE(28, 28);
+		uint32_t dst_compression:		BITRANGE(29, 29);
+		uint32_t dst_tiling:			BITRANGE(30, 31);
+	} dw01;
+
+	struct {
+		int32_t dst_x1:				BITRANGE(0, 15);
+		int32_t dst_y1:				BITRANGE(16, 31);
+	} dw02;
+
+	struct {
+		int32_t dst_x2:				BITRANGE(0, 15);
+		int32_t dst_y2:				BITRANGE(16, 31);
+	} dw03;
+
+	struct {
+		uint32_t dst_address_lo;
+	} dw04;
+
+	struct {
+		uint32_t dst_address_hi;
+	} dw05;
+
+	struct {
+		uint32_t dst_x_offset:			BITRANGE(0, 13);
+		uint32_t rsvd1:				BITRANGE(14, 15);
+		uint32_t dst_y_offset:			BITRANGE(16, 29);
+		uint32_t rsvd0:				BITRANGE(30, 30);
+		uint32_t dst_target_memory:		BITRANGE(31, 31);
+	} dw06;
+
+	struct {
+		int32_t src_x1:				BITRANGE(0, 15);
+		int32_t src_y1:				BITRANGE(16, 31);
+	} dw07;
+
+	struct {
+		uint32_t src_pitch:			BITRANGE(0, 17);
+		uint32_t src_aux_mode:			BITRANGE(18, 20);
+		uint32_t src_mocs:			BITRANGE(21, 27);
+		uint32_t src_ctrl_surface_type:		BITRANGE(28, 28);
+		uint32_t src_compression:		BITRANGE(29, 29);
+		uint32_t src_tiling:			BITRANGE(30, 31);
+	} dw08;
+
+	struct {
+		uint32_t src_address_lo;
+	} dw09;
+
+	struct {
+		uint32_t src_address_hi;
+	} dw10;
+
+	struct {
+		uint32_t src_x_offset:			BITRANGE(0, 13);
+		uint32_t rsvd1:				BITRANGE(14, 15);
+		uint32_t src_y_offset:			BITRANGE(16, 29);
+		uint32_t rsvd0:				BITRANGE(30, 30);
+		uint32_t src_target_memory:		BITRANGE(31, 31);
+	} dw11;
+};
+
+struct gen12_block_copy_data_ext {
+	struct {
+		uint32_t src_compression_format:	BITRANGE(0, 4);
+		uint32_t src_clear_value_enable:	BITRANGE(5, 5);
+		uint32_t src_clear_address_low:		BITRANGE(6, 31);
+	} dw12;
+
+	union {
+		/* DG2, XEHP */
+		uint32_t src_clear_address_hi0;
+		/* Others */
+		uint32_t src_clear_address_hi1;
+	} dw13;
+
+	struct {
+		uint32_t dst_compression_format:	BITRANGE(0, 4);
+		uint32_t dst_clear_value_enable:	BITRANGE(5, 5);
+		uint32_t dst_clear_address_low:		BITRANGE(6, 31);
+	} dw14;
+
+	union {
+		/* DG2, XEHP */
+		uint32_t dst_clear_address_hi0;
+		/* Others */
+		uint32_t dst_clear_address_hi1;
+	} dw15;
+
+	struct {
+		uint32_t dst_surface_height:		BITRANGE(0, 13);
+		uint32_t dst_surface_width:		BITRANGE(14, 27);
+		uint32_t rsvd0:				BITRANGE(28, 28);
+		uint32_t dst_surface_type:		BITRANGE(29, 31);
+	} dw16;
+
+	struct {
+		uint32_t dst_lod:			BITRANGE(0, 3);
+		uint32_t dst_surface_qpitch:		BITRANGE(4, 18);
+		uint32_t rsvd0:				BITRANGE(19, 20);
+		uint32_t dst_surface_depth:		BITRANGE(21, 31);
+	} dw17;
+
+	struct {
+		uint32_t dst_horizontal_align:		BITRANGE(0, 1);
+		uint32_t rsvd0:				BITRANGE(2, 2);
+		uint32_t dst_vertical_align:		BITRANGE(3, 4);
+		uint32_t rsvd1:				BITRANGE(5, 7);
+		uint32_t dst_mip_tail_start_lod:	BITRANGE(8, 11);
+		uint32_t rsvd2:				BITRANGE(12, 17);
+		uint32_t dst_depth_stencil_resource:	BITRANGE(18, 18);
+		uint32_t rsvd3:				BITRANGE(19, 20);
+		uint32_t dst_array_index:		BITRANGE(21, 31);
+	} dw18;
+
+	struct {
+		uint32_t src_surface_height:		BITRANGE(0, 13);
+		uint32_t src_surface_width:		BITRANGE(14, 27);
+		uint32_t rsvd0:				BITRANGE(28, 28);
+		uint32_t src_surface_type:		BITRANGE(29, 31);
+	} dw19;
+
+	struct {
+		uint32_t src_lod:			BITRANGE(0, 3);
+		uint32_t src_surface_qpitch:		BITRANGE(4, 18);
+		uint32_t rsvd0:				BITRANGE(19, 20);
+		uint32_t src_surface_depth:		BITRANGE(21, 31);
+	} dw20;
+
+	struct {
+		uint32_t src_horizontal_align:		BITRANGE(0, 1);
+		uint32_t rsvd0:				BITRANGE(2, 2);
+		uint32_t src_vertical_align:		BITRANGE(3, 4);
+		uint32_t rsvd1:				BITRANGE(5, 7);
+		uint32_t src_mip_tail_start_lod:	BITRANGE(8, 11);
+		uint32_t rsvd2:				BITRANGE(12, 17);
+		uint32_t src_depth_stencil_resource:	BITRANGE(18, 18);
+		uint32_t rsvd3:				BITRANGE(19, 20);
+		uint32_t src_array_index:		BITRANGE(21, 31);
+	} dw21;
+};
+
+/**
+ * blt_supports_compression:
+ * @i915: drm fd
+ *
+ * Checks whether the HW supports flatccs compression in blitter commands
+ * on the @i915 device.
+ *
+ * Returns:
+ * true if it does, false otherwise.
+ */
+bool blt_supports_compression(int i915)
+{
+	uint32_t devid = intel_get_drm_devid(i915);
+
+	return HAS_FLATCCS(devid);
+}
+
+/**
+ * blt_supports_tiling:
+ * @i915: drm fd
+ * @tiling: tiling id
+ *
+ * Checks whether the blitter supports @tiling on the @i915 device.
+ *
+ * Returns:
+ * true if it does, false otherwise.
+ */
+bool blt_supports_tiling(int i915, enum blt_tiling tiling)
+{
+	uint32_t devid = intel_get_drm_devid(i915);
+
+	if (tiling == T_XMAJOR) {
+		if (IS_TIGERLAKE(devid) || IS_DG1(devid))
+			return false;
+		else
+			return true;
+	}
+
+	if (tiling == T_YMAJOR) {
+		if (IS_TIGERLAKE(devid) || IS_DG1(devid))
+			return true;
+		else
+			return false;
+	}
+
+	return true;
+}
+
+/**
+ * blt_tiling_name:
+ * @tiling: tiling id
+ *
+ * Returns:
+ * name of the @tiling passed. Useful for building test names and the like.
+ */
+const char *blt_tiling_name(enum blt_tiling tiling)
+{
+	switch (tiling) {
+	case T_LINEAR: return "linear";
+	case T_XMAJOR: return "xmajor";
+	case T_YMAJOR: return "ymajor";
+	case T_TILE4:  return "tile4";
+	case T_TILE64: return "tile64";
+	}
+
+	igt_warn("invalid tiling passed: %d\n", tiling);
+	return NULL;
+}
+
+static int __block_tiling(enum blt_tiling tiling)
+{
+	switch (tiling) {
+	case T_LINEAR: return 0;
+	case T_XMAJOR: return 1;
+	case T_YMAJOR: return 1;
+	case T_TILE4:  return 2;
+	case T_TILE64: return 3;
+	}
+
+	igt_warn("invalid tiling passed: %d\n", tiling);
+	return 0;
+}
+
+static int __special_mode(const struct blt_copy_data *blt)
+{
+	if (blt->src.handle == blt->dst.handle &&
+	    blt->src.compression && !blt->dst.compression)
+		return SM_FULL_RESOLVE;
+
+
+	return SM_NONE;
+}
+
+static int __memory_type(uint32_t region)
+{
+	igt_assert_f(IS_DEVICE_MEMORY_REGION(region) ||
+		     IS_SYSTEM_MEMORY_REGION(region),
+		     "Invalid region: %x\n", region);
+
+	if (IS_DEVICE_MEMORY_REGION(region))
+		return TM_LOCAL_MEM;
+	return TM_SYSTEM_MEM;
+}
+
+static enum blt_aux_mode __aux_mode(const struct blt_copy_object *obj)
+{
+	if (obj->compression == COMPRESSION_ENABLED) {
+		igt_assert_f(IS_DEVICE_MEMORY_REGION(obj->region),
+			     "XY_BLOCK_COPY_BLT supports compression "
+			     "on device memory only\n");
+		return AM_AUX_CCS_E;
+	}
+
+	return AM_AUX_NONE;
+}
+
+static void fill_data(struct gen12_block_copy_data *data,
+		      const struct blt_copy_data *blt,
+		      uint64_t src_offset, uint64_t dst_offset,
+		      bool extended_command)
+{
+	data->dw00.client = 0x2;
+	data->dw00.opcode = 0x41;
+	data->dw00.color_depth = blt->color_depth;
+	data->dw00.special_mode = __special_mode(blt);
+	data->dw00.length = extended_command ? 20 : 10;
+
+	data->dw01.dst_pitch = blt->dst.pitch - 1;
+	data->dw01.dst_aux_mode = __aux_mode(&blt->dst);
+	data->dw01.dst_mocs = blt->dst.mocs;
+	data->dw01.dst_compression = blt->dst.compression;
+	data->dw01.dst_tiling = __block_tiling(blt->dst.tiling);
+
+	if (blt->dst.compression)
+		data->dw01.dst_ctrl_surface_type = blt->dst.compression_type;
+
+	data->dw02.dst_x1 = blt->dst.x1;
+	data->dw02.dst_y1 = blt->dst.y1;
+
+	data->dw03.dst_x2 = blt->dst.x2;
+	data->dw03.dst_y2 = blt->dst.y2;
+
+	data->dw04.dst_address_lo = dst_offset;
+	data->dw05.dst_address_hi = dst_offset >> 32;
+
+	data->dw06.dst_x_offset = blt->dst.x_offset;
+	data->dw06.dst_y_offset = blt->dst.y_offset;
+	data->dw06.dst_target_memory = __memory_type(blt->dst.region);
+
+	data->dw07.src_x1 = blt->src.x1;
+	data->dw07.src_y1 = blt->src.y1;
+
+	data->dw08.src_pitch = blt->src.pitch - 1;
+	data->dw08.src_aux_mode = __aux_mode(&blt->src);
+	data->dw08.src_mocs = blt->src.mocs;
+	data->dw08.src_compression = blt->src.compression;
+	data->dw08.src_tiling = __block_tiling(blt->src.tiling);
+
+	if (blt->src.compression)
+		data->dw08.src_ctrl_surface_type = blt->src.compression_type;
+
+	data->dw09.src_address_lo = src_offset;
+	data->dw10.src_address_hi = src_offset >> 32;
+
+	data->dw11.src_x_offset = blt->src.x_offset;
+	data->dw11.src_y_offset = blt->src.y_offset;
+	data->dw11.src_target_memory = __memory_type(blt->src.region);
+}
+
+static void fill_data_ext(int i915,
+			  struct gen12_block_copy_data_ext *dext,
+			  const struct blt_block_copy_data_ext *ext)
+{
+	dext->dw12.src_compression_format = ext->src.compression_format;
+	dext->dw12.src_clear_value_enable = ext->src.clear_value_enable;
+	dext->dw12.src_clear_address_low = ext->src.clear_address;
+
+	dext->dw13.src_clear_address_hi0 = ext->src.clear_address >> 32;
+
+	dext->dw14.dst_compression_format = ext->dst.compression_format;
+	dext->dw14.dst_clear_value_enable = ext->dst.clear_value_enable;
+	dext->dw14.dst_clear_address_low = ext->dst.clear_address;
+
+	dext->dw15.dst_clear_address_hi0 = ext->dst.clear_address >> 32;
+
+	dext->dw16.dst_surface_width = ext->dst.surface_width - 1;
+	dext->dw16.dst_surface_height = ext->dst.surface_height - 1;
+	dext->dw16.dst_surface_type = ext->dst.surface_type;
+
+	dext->dw17.dst_lod = ext->dst.lod;
+	dext->dw17.dst_surface_depth = ext->dst.surface_depth;
+	dext->dw17.dst_surface_qpitch = ext->dst.surface_qpitch;
+
+	dext->dw18.dst_horizontal_align = ext->dst.horizontal_align;
+	dext->dw18.dst_vertical_align = ext->dst.vertical_align;
+	dext->dw18.dst_mip_tail_start_lod = ext->dst.mip_tail_start_lod;
+	dext->dw18.dst_depth_stencil_resource = ext->dst.depth_stencil_resource;
+	dext->dw18.dst_array_index = ext->dst.array_index;
+
+	dext->dw19.src_surface_width = ext->src.surface_width - 1;
+	dext->dw19.src_surface_height = ext->src.surface_height - 1;
+
+	dext->dw19.src_surface_type = ext->src.surface_type;
+
+	dext->dw20.src_lod = ext->src.lod;
+	dext->dw20.src_surface_depth = ext->src.surface_depth;
+	dext->dw20.src_surface_qpitch = ext->src.surface_qpitch;
+
+	dext->dw21.src_horizontal_align = ext->src.horizontal_align;
+	dext->dw21.src_vertical_align = ext->src.vertical_align;
+	dext->dw21.src_mip_tail_start_lod = ext->src.mip_tail_start_lod;
+	dext->dw21.src_depth_stencil_resource = ext->src.depth_stencil_resource;
+	dext->dw21.src_array_index = ext->src.array_index;
+}
+
+static void dump_bb_cmd(struct gen12_block_copy_data *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("details:\n");
+	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, color depth: %d, "
+		 "special mode: %d, length: %d>\n",
+		 cmd[0],
+		 data->dw00.client, data->dw00.opcode, data->dw00.color_depth,
+		 data->dw00.special_mode, data->dw00.length);
+	igt_info(" dw01: [%08x] dst <pitch: %d, aux: %d, mocs: %d, compr: %d, "
+		 "tiling: %d, ctrl surf type: %d>\n",
+		 cmd[1], data->dw01.dst_pitch, data->dw01.dst_aux_mode,
+		 data->dw01.dst_mocs, data->dw01.dst_compression,
+		 data->dw01.dst_tiling, data->dw01.dst_ctrl_surface_type);
+	igt_info(" dw02: [%08x] dst geom <x1: %d, y1: %d>\n",
+		 cmd[2], data->dw02.dst_x1, data->dw02.dst_y1);
+	igt_info(" dw03: [%08x]          <x2: %d, y2: %d>\n",
+		 cmd[3], data->dw03.dst_x2, data->dw03.dst_y2);
+	igt_info(" dw04: [%08x] dst offset lo (0x%x)\n",
+		 cmd[4], data->dw04.dst_address_lo);
+	igt_info(" dw05: [%08x] dst offset hi (0x%x)\n",
+		 cmd[5], data->dw05.dst_address_hi);
+	igt_info(" dw06: [%08x] dst <x offset: 0x%x, y offset: 0x%x, target mem: %d>\n",
+		 cmd[6], data->dw06.dst_x_offset, data->dw06.dst_y_offset,
+		 data->dw06.dst_target_memory);
+	igt_info(" dw07: [%08x] src geom <x1: %d, y1: %d>\n",
+		 cmd[7], data->dw07.src_x1, data->dw07.src_y1);
+	igt_info(" dw08: [%08x] src <pitch: %d, aux: %d, mocs: %d, compr: %d, "
+		 "tiling: %d, ctrl surf type: %d>\n",
+		 cmd[8], data->dw08.src_pitch, data->dw08.src_aux_mode,
+		 data->dw08.src_mocs, data->dw08.src_compression,
+		 data->dw08.src_tiling, data->dw08.src_ctrl_surface_type);
+	igt_info(" dw09: [%08x] src offset lo (0x%x)\n",
+		 cmd[9], data->dw09.src_address_lo);
+	igt_info(" dw10: [%08x] src offset hi (0x%x)\n",
+		 cmd[10], data->dw10.src_address_hi);
+	igt_info(" dw11: [%08x] src <x offset: 0x%x, y offset: 0x%x, target mem: %d>\n",
+		 cmd[11], data->dw11.src_x_offset, data->dw11.src_y_offset,
+		 data->dw11.src_target_memory);
+}
+
+static void dump_bb_ext(struct gen12_block_copy_data_ext *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("ext details:\n");
+	igt_info(" dw12: [%08x] src <compression fmt: %d, clear value enable: %d, "
+		 "clear address low: 0x%x>\n",
+		 cmd[0],
+		 data->dw12.src_compression_format,
+		 data->dw12.src_clear_value_enable,
+		 data->dw12.src_clear_address_low);
+	igt_info(" dw13: [%08x] src clear address hi: 0x%x\n",
+		 cmd[1], data->dw13.src_clear_address_hi0);
+	igt_info(" dw14: [%08x] dst <compression fmt: %d, clear value enable: %d, "
+		 "clear address low: 0x%x>\n",
+		 cmd[2],
+		 data->dw14.dst_compression_format,
+		 data->dw14.dst_clear_value_enable,
+		 data->dw14.dst_clear_address_low);
+	igt_info(" dw15: [%08x] dst clear address hi: 0x%x\n",
+		 cmd[3], data->dw15.dst_clear_address_hi0);
+	igt_info(" dw16: [%08x] dst surface <width: %d, height: %d, type: %d>\n",
+		 cmd[4], data->dw16.dst_surface_width,
+		 data->dw16.dst_surface_height, data->dw16.dst_surface_type);
+	igt_info(" dw17: [%08x] dst surface <lod: %d, depth: %d, qpitch: %d>\n",
+		 cmd[5], data->dw17.dst_lod,
+		 data->dw17.dst_surface_depth, data->dw17.dst_surface_qpitch);
+	igt_info(" dw18: [%08x] dst <halign: %d, valign: %d, mip tail: %d, "
+		 "depth stencil: %d, array index: %d>\n",
+		 cmd[6],
+		 data->dw18.dst_horizontal_align,
+		 data->dw18.dst_vertical_align,
+		 data->dw18.dst_mip_tail_start_lod,
+		 data->dw18.dst_depth_stencil_resource,
+		 data->dw18.dst_array_index);
+
+	igt_info(" dw19: [%08x] src surface <width: %d, height: %d, type: %d>\n",
+		 cmd[7], data->dw19.src_surface_width,
+		 data->dw19.src_surface_height, data->dw19.src_surface_type);
+	igt_info(" dw20: [%08x] src surface <lod: %d, depth: %d, qpitch: %d>\n",
+		 cmd[8], data->dw20.src_lod,
+		 data->dw20.src_surface_depth, data->dw20.src_surface_qpitch);
+	igt_info(" dw21: [%08x] src <halign: %d, valign: %d, mip tail: %d, "
+		 "depth stencil: %d, array index: %d>\n",
+		 cmd[9],
+		 data->dw21.src_horizontal_align,
+		 data->dw21.src_vertical_align,
+		 data->dw21.src_mip_tail_start_lod,
+		 data->dw21.src_depth_stencil_resource,
+		 data->dw21.src_array_index);
+}
+
+/**
+ * blt_block_copy:
+ * @i915: drm fd
+ * @ctx: intel_ctx_t context
+ * @e: blitter engine for @ctx
+ * @ahnd: allocator handle
+ * @blt: basic blitter data (for TGL/DG1, which don't support the ext version)
+ * @ext: extended blitter data (for DG2+, supports flatccs compression)
+ *
+ * Function does blit between @src and @dst described in @blt object.
+ *
+ * Returns:
+ * execbuffer status.
+ */
+int blt_block_copy(int i915,
+		   const intel_ctx_t *ctx,
+		   const struct intel_execution_engine2 *e,
+		   uint64_t ahnd,
+		   const struct blt_copy_data *blt,
+		   const struct blt_block_copy_data_ext *ext)
+{
+	struct drm_i915_gem_execbuffer2 execbuf = {};
+	struct drm_i915_gem_exec_object2 obj[3] = {};
+	struct gen12_block_copy_data data = {};
+	struct gen12_block_copy_data_ext dext = {};
+	uint64_t dst_offset, src_offset, bb_offset, alignment;
+	uint32_t *bb;
+	int i, ret;
+
+	igt_assert_f(ahnd, "block-copy supports softpin only\n");
+	igt_assert_f(blt, "block-copy requires data to do blit\n");
+
+	alignment = gem_detect_safe_alignment(i915);
+	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
+	if (__special_mode(blt) == SM_FULL_RESOLVE)
+		dst_offset = src_offset;
+	else
+		dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
+
+	fill_data(&data, blt, src_offset, dst_offset, ext);
+
+	i = sizeof(data) / sizeof(uint32_t);
+	bb = gem_mmap__device_coherent(i915, blt->bb.handle, 0, blt->bb.size,
+				       PROT_READ | PROT_WRITE);
+	memcpy(bb, &data, sizeof(data));
+
+	if (ext) {
+		fill_data_ext(i915, &dext, ext);
+		memcpy(bb + i, &dext, sizeof(dext));
+		i += sizeof(dext) / sizeof(uint32_t);
+	}
+	bb[i++] = MI_BATCH_BUFFER_END;
+
+	if (blt->print_bb) {
+		igt_info("[BLOCK COPY]\n");
+		igt_info("src offset: %llx, dst offset: %llx, bb offset: %llx\n",
+			 (long long) src_offset, (long long) dst_offset,
+			 (long long) bb_offset);
+
+		dump_bb_cmd(&data);
+		if (ext)
+			dump_bb_ext(&dext);
+	}
+
+	munmap(bb, blt->bb.size);
+
+	obj[0].offset = CANONICAL(dst_offset);
+	obj[1].offset = CANONICAL(src_offset);
+	obj[2].offset = CANONICAL(bb_offset);
+	obj[0].handle = blt->dst.handle;
+	obj[1].handle = blt->src.handle;
+	obj[2].handle = blt->bb.handle;
+	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	execbuf.buffer_count = 3;
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.rsvd1 = ctx ? ctx->id : 0;
+	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+	ret = __gem_execbuf(i915, &execbuf);
+	if (data.dw00.special_mode != SM_FULL_RESOLVE)
+		put_offset(ahnd, blt->dst.handle);
+	put_offset(ahnd, blt->src.handle);
+	put_offset(ahnd, blt->bb.handle);
+
+	return ret;
+}
+
+static uint32_t __ccs_size(const struct blt_ctrl_surf_copy_data *surf)
+{
+	uint32_t src_size, dst_size;
+
+	src_size = surf->src.access_type == DIRECT_ACCESS ?
+				surf->src.size : surf->src.size / CCS_RATIO;
+
+	dst_size = surf->dst.access_type == DIRECT_ACCESS ?
+				surf->dst.size : surf->dst.size / CCS_RATIO;
+
+	igt_assert_f(src_size <= dst_size, "dst size must be >= src size for CCS copy\n");
+
+	return src_size;
+}
+
+struct gen12_ctrl_surf_copy_data {
+	struct {
+		uint32_t length:			BITRANGE(0, 7);
+		uint32_t size_of_ctrl_copy:		BITRANGE(8, 17);
+		uint32_t rsvd0:				BITRANGE(18, 19);
+		uint32_t dst_access_type:		BITRANGE(20, 20);
+		uint32_t src_access_type:		BITRANGE(21, 21);
+		uint32_t opcode:			BITRANGE(22, 28);
+		uint32_t client:			BITRANGE(29, 31);
+	} dw00;
+
+	struct {
+		uint32_t src_address_lo;
+	} dw01;
+
+	struct {
+		uint32_t src_address_hi:		BITRANGE(0, 24);
+		uint32_t src_mocs:			BITRANGE(25, 31);
+	} dw02;
+
+	struct {
+		uint32_t dst_address_lo;
+	} dw03;
+
+	struct {
+		uint32_t dst_address_hi:		BITRANGE(0, 24);
+		uint32_t dst_mocs:			BITRANGE(25, 31);
+	} dw04;
+};
+
+static void dump_bb_surf_ctrl_cmd(const struct gen12_ctrl_surf_copy_data *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("details:\n");
+	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, "
+		 "src/dst access type: <%d, %d>, size of ctrl copy: %u, length: %d>\n",
+		 cmd[0],
+		 data->dw00.client, data->dw00.opcode,
+		 data->dw00.src_access_type, data->dw00.dst_access_type,
+		 data->dw00.size_of_ctrl_copy, data->dw00.length);
+	igt_info(" dw01: [%08x] src offset lo (0x%x)\n",
+		 cmd[1], data->dw01.src_address_lo);
+	igt_info(" dw02: [%08x] src offset hi (0x%x), src mocs: %u\n",
+		 cmd[2], data->dw02.src_address_hi, data->dw02.src_mocs);
+	igt_info(" dw03: [%08x] dst offset lo (0x%x)\n",
+		 cmd[3], data->dw03.dst_address_lo);
+	igt_info(" dw04: [%08x] dst offset hi (0x%x), dst mocs: %u\n",
+		 cmd[4], data->dw04.dst_address_hi, data->dw04.dst_mocs);
+}
+
+/**
+ * blt_ctrl_surf_copy:
+ * @i915: drm fd
+ * @ctx: intel_ctx_t context
+ * @e: blitter engine for @ctx
+ * @ahnd: allocator handle
+ * @surf: blitter data for ctrl-surf-copy
+ *
+ * Function does ctrl-surf-copy blit between @src and @dst described in
+ * @surf object.
+ *
+ * Returns:
+ * 0 on success; an execbuf failure asserts inside gem_execbuf().
+ */
+int blt_ctrl_surf_copy(int i915,
+		       const intel_ctx_t *ctx,
+		       const struct intel_execution_engine2 *e,
+		       uint64_t ahnd,
+		       const struct blt_ctrl_surf_copy_data *surf)
+{
+	struct drm_i915_gem_execbuffer2 execbuf = {};
+	struct drm_i915_gem_exec_object2 obj[3] = {};
+	struct gen12_ctrl_surf_copy_data data = {};
+	uint64_t dst_offset, src_offset, bb_offset, alignment;
+	uint32_t *bb;
+	int i;
+
+	igt_assert_f(ahnd, "ctrl-surf-copy supports softpin only\n");
+	igt_assert_f(surf, "ctrl-surf-copy requires data to do ctrl-surf-copy blit\n");
+
+	alignment = gem_detect_safe_alignment(i915);
+
+	data.dw00.client = 0x2;
+	data.dw00.opcode = 0x48;
+	data.dw00.src_access_type = surf->src.access_type;
+	data.dw00.dst_access_type = surf->dst.access_type;
+
+	/* Ensure dst has size capable to keep src ccs aux */
+	data.dw00.size_of_ctrl_copy = __ccs_size(surf) / CCS_RATIO - 1;
+	data.dw00.length = 0x3;
+
+	src_offset = get_offset(ahnd, surf->src.handle, surf->src.size,
+				alignment);
+	dst_offset = get_offset(ahnd, surf->dst.handle, surf->dst.size,
+				alignment);
+	bb_offset = get_offset(ahnd, surf->bb.handle, surf->bb.size,
+			       alignment);
+
+	data.dw01.src_address_lo = src_offset;
+	data.dw02.src_address_hi = src_offset >> 32;
+	data.dw02.src_mocs = surf->src.mocs;
+
+	data.dw03.dst_address_lo = dst_offset;
+	data.dw04.dst_address_hi = dst_offset >> 32;
+	data.dw04.dst_mocs = surf->dst.mocs;
+
+	i = sizeof(data) / sizeof(uint32_t);
+	bb = gem_mmap__device_coherent(i915, surf->bb.handle, 0, surf->bb.size,
+				       PROT_READ | PROT_WRITE);
+	memcpy(bb, &data, sizeof(data));
+	bb[i++] = MI_BATCH_BUFFER_END;
+
+	if (surf->print_bb) {
+		igt_info("BB [CTRL SURF]:\n");
+		igt_info("src offset: %llx, dst offset: %llx, bb offset: %llx\n",
+			 (long long) src_offset, (long long) dst_offset,
+			 (long long) bb_offset);
+
+		dump_bb_surf_ctrl_cmd(&data);
+	}
+	munmap(bb, surf->bb.size);
+
+	obj[0].offset = CANONICAL(dst_offset);
+	obj[1].offset = CANONICAL(src_offset);
+	obj[2].offset = CANONICAL(bb_offset);
+	obj[0].handle = surf->dst.handle;
+	obj[1].handle = surf->src.handle;
+	obj[2].handle = surf->bb.handle;
+	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	execbuf.buffer_count = 3;
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+	execbuf.rsvd1 = ctx ? ctx->id : 0;
+	gem_execbuf(i915, &execbuf);
+	put_offset(ahnd, surf->dst.handle);
+	put_offset(ahnd, surf->src.handle);
+	put_offset(ahnd, surf->bb.handle);
+
+	return 0;
+}
+
+struct gen12_fast_copy_data {
+	struct {
+		uint32_t length:			BITRANGE(0, 7);
+		uint32_t rsvd1:				BITRANGE(8, 12);
+		uint32_t dst_tiling:			BITRANGE(13, 14);
+		uint32_t rsvd0:				BITRANGE(15, 19);
+		uint32_t src_tiling:			BITRANGE(20, 21);
+		uint32_t opcode:			BITRANGE(22, 28);
+		uint32_t client:			BITRANGE(29, 31);
+	} dw00;
+
+	struct {
+		uint32_t dst_pitch:			BITRANGE(0, 15);
+		uint32_t rsvd1:				BITRANGE(16, 23);
+		uint32_t color_depth:			BITRANGE(24, 26);
+		uint32_t rsvd0:				BITRANGE(27, 27);
+		uint32_t dst_memory:			BITRANGE(28, 28);
+		uint32_t src_memory:			BITRANGE(29, 29);
+		uint32_t dst_type_y:			BITRANGE(30, 30);
+		uint32_t src_type_y:			BITRANGE(31, 31);
+	} dw01;
+
+	struct {
+		int32_t dst_x1:				BITRANGE(0, 15);
+		int32_t dst_y1:				BITRANGE(16, 31);
+	} dw02;
+
+	struct {
+		int32_t dst_x2:				BITRANGE(0, 15);
+		int32_t dst_y2:				BITRANGE(16, 31);
+	} dw03;
+
+	struct {
+		uint32_t dst_address_lo;
+	} dw04;
+
+	struct {
+		uint32_t dst_address_hi;
+	} dw05;
+
+	struct {
+		int32_t src_x1:				BITRANGE(0, 15);
+		int32_t src_y1:				BITRANGE(16, 31);
+	} dw06;
+
+	struct {
+		uint32_t src_pitch:			BITRANGE(0, 15);
+		uint32_t rsvd0:				BITRANGE(16, 31);
+	} dw07;
+
+	struct {
+		uint32_t src_address_lo;
+	} dw08;
+
+	struct {
+		uint32_t src_address_hi;
+	} dw09;
+};
+
+static int __fast_tiling(enum blt_tiling tiling)
+{
+	switch (tiling) {
+	case T_LINEAR: return 0;
+	case T_XMAJOR: return 1;
+	case T_YMAJOR: return 2;
+	case T_TILE4:  return 2;
+	case T_TILE64: return 3;
+	}
+	return 0;
+}
+
+static int __fast_color_depth(enum blt_color_depth depth)
+{
+	switch (depth) {
+	case CD_8bit:   return 0;
+	case CD_16bit:  return 1;
+	case CD_32bit:  return 3;
+	case CD_64bit:  return 4;
+	case CD_96bit:
+		igt_assert_f(0, "Unsupported depth\n");
+		break;
+	case CD_128bit: return 5;
+	}
+	return 0;
+}
+
+static void dump_bb_fast_cmd(struct gen12_fast_copy_data *data)
+{
+	uint32_t *cmd = (uint32_t *) data;
+
+	igt_info("BB details:\n");
+	igt_info(" dw00: [%08x] <client: 0x%x, opcode: 0x%x, src tiling: %d, "
+		 "dst tiling: %d, length: %d>\n",
+		 cmd[0], data->dw00.client, data->dw00.opcode,
+		 data->dw00.src_tiling, data->dw00.dst_tiling, data->dw00.length);
+	igt_info(" dw01: [%08x] dst <pitch: %d, color depth: %d, dst memory: %d, "
+		 "src memory: %d, \n"
+		 "\t\t\tdst type tile: %d (0-legacy, 1-tile4),\n"
+		 "\t\t\tsrc type tile: %d (0-legacy, 1-tile4)>\n",
+		 cmd[1], data->dw01.dst_pitch, data->dw01.color_depth,
+		 data->dw01.dst_memory, data->dw01.src_memory,
+		 data->dw01.dst_type_y, data->dw01.src_type_y);
+	igt_info(" dw02: [%08x] dst geom <x1: %d, y1: %d>\n",
+		 cmd[2], data->dw02.dst_x1, data->dw02.dst_y1);
+	igt_info(" dw03: [%08x]          <x2: %d, y2: %d>\n",
+		 cmd[3], data->dw03.dst_x2, data->dw03.dst_y2);
+	igt_info(" dw04: [%08x] dst offset lo (0x%x)\n",
+		 cmd[4], data->dw04.dst_address_lo);
+	igt_info(" dw05: [%08x] dst offset hi (0x%x)\n",
+		 cmd[5], data->dw05.dst_address_hi);
+	igt_info(" dw06: [%08x] src geom <x1: %d, y1: %d>\n",
+		 cmd[6], data->dw06.src_x1, data->dw06.src_y1);
+	igt_info(" dw07: [%08x] src <pitch: %d>\n",
+		 cmd[7], data->dw07.src_pitch);
+	igt_info(" dw08: [%08x] src offset lo (0x%x)\n",
+		 cmd[8], data->dw08.src_address_lo);
+	igt_info(" dw09: [%08x] src offset hi (0x%x)\n",
+		 cmd[9], data->dw09.src_address_hi);
+}
+
+/**
+ * blt_fast_copy:
+ * @i915: drm fd
+ * @ctx: intel_ctx_t context
+ * @e: blitter engine for @ctx
+ * @ahnd: allocator handle
+ * @blt: blitter data for fast-copy (same as for block-copy but doesn't use
+ * compression fields).
+ *
+ * Function does fast blit between @src and @dst described in @blt object.
+ *
+ * Returns:
+ * execbuffer status.
+ */
+int blt_fast_copy(int i915,
+		  const intel_ctx_t *ctx,
+		  const struct intel_execution_engine2 *e,
+		  uint64_t ahnd,
+		  const struct blt_copy_data *blt)
+{
+	struct drm_i915_gem_execbuffer2 execbuf = {};
+	struct drm_i915_gem_exec_object2 obj[3] = {};
+	struct gen12_fast_copy_data data = {};
+	uint64_t dst_offset, src_offset, bb_offset, alignment;
+	uint32_t *bb;
+	int i, ret;
+
+	alignment = gem_detect_safe_alignment(i915);
+
+	data.dw00.client = 0x2;
+	data.dw00.opcode = 0x42;
+	data.dw00.dst_tiling = __fast_tiling(blt->dst.tiling);
+	data.dw00.src_tiling = __fast_tiling(blt->src.tiling);
+	data.dw00.length = 8;
+
+	data.dw01.dst_pitch = blt->dst.pitch;
+	data.dw01.color_depth = __fast_color_depth(blt->color_depth);
+	data.dw01.dst_memory = __memory_type(blt->dst.region);
+	data.dw01.src_memory = __memory_type(blt->src.region);
+	data.dw01.dst_type_y = blt->dst.tiling == T_TILE4 ? 1 : 0;
+	data.dw01.src_type_y = blt->src.tiling == T_TILE4 ? 1 : 0;
+
+	data.dw02.dst_x1 = blt->dst.x1;
+	data.dw02.dst_y1 = blt->dst.y1;
+
+	data.dw03.dst_x2 = blt->dst.x2;
+	data.dw03.dst_y2 = blt->dst.y2;
+
+	src_offset = get_offset(ahnd, blt->src.handle, blt->src.size, alignment);
+	dst_offset = get_offset(ahnd, blt->dst.handle, blt->dst.size, alignment);
+	bb_offset = get_offset(ahnd, blt->bb.handle, blt->bb.size, alignment);
+
+	data.dw04.dst_address_lo = dst_offset;
+	data.dw05.dst_address_hi = dst_offset >> 32;
+
+	data.dw06.src_x1 = blt->src.x1;
+	data.dw06.src_y1 = blt->src.y1;
+
+	data.dw07.src_pitch = blt->src.pitch;
+
+	data.dw08.src_address_lo = src_offset;
+	data.dw09.src_address_hi = src_offset >> 32;
+
+	i = sizeof(data) / sizeof(uint32_t);
+	bb = gem_mmap__device_coherent(i915, blt->bb.handle, 0, blt->bb.size,
+				       PROT_READ | PROT_WRITE);
+
+	memcpy(bb, &data, sizeof(data));
+	bb[i++] = MI_BATCH_BUFFER_END;
+
+	if (blt->print_bb) {
+		igt_info("BB [FAST COPY]\n");
+		igt_info("blit [src offset: %llx, dst offset: %llx]\n",
+			 (long long) src_offset, (long long) dst_offset);
+		dump_bb_fast_cmd(&data);
+	}
+
+	munmap(bb, blt->bb.size);
+
+	obj[0].offset = CANONICAL(dst_offset);
+	obj[1].offset = CANONICAL(src_offset);
+	obj[2].offset = CANONICAL(bb_offset);
+	obj[0].handle = blt->dst.handle;
+	obj[1].handle = blt->src.handle;
+	obj[2].handle = blt->bb.handle;
+	obj[0].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_WRITE |
+		       EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[1].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	obj[2].flags = EXEC_OBJECT_PINNED | EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
+	execbuf.buffer_count = 3;
+	execbuf.buffers_ptr = to_user_pointer(obj);
+	execbuf.rsvd1 = ctx ? ctx->id : 0;
+	execbuf.flags = e ? e->flags : I915_EXEC_BLT;
+	ret = __gem_execbuf(i915, &execbuf);
+	put_offset(ahnd, blt->dst.handle);
+	put_offset(ahnd, blt->src.handle);
+	put_offset(ahnd, blt->bb.handle);
+
+	return ret;
+}
+
+/**
+ * blt_surface_fill_rect:
+ * @i915: drm fd
+ * @obj: blitter copy object (@blt_copy_object) to fill with gradient pattern
+ * @width: width
+ * @height: height
+ *
+ * Function fills surface @width x @height * 24bpp with color gradient
+ * (internally uses ARGB where A == 0xff, see Cairo docs).
+ */
+void blt_surface_fill_rect(int i915, const struct blt_copy_object *obj,
+			   uint32_t width, uint32_t height)
+{
+	cairo_surface_t *surface;
+	cairo_pattern_t *pat;
+	cairo_t *cr;
+	void *map = obj->ptr;
+
+	if (!map)
+		map = gem_mmap__device_coherent(i915, obj->handle, 0,
+						obj->size, PROT_READ | PROT_WRITE);
+
+	surface = cairo_image_surface_create_for_data(map,
+						      CAIRO_FORMAT_RGB24,
+						      width, height,
+						      obj->pitch);
+
+	cr = cairo_create(surface);
+
+	cairo_rectangle(cr, 0, 0, width, height);
+	cairo_clip(cr);
+
+	pat = cairo_pattern_create_mesh();
+	cairo_mesh_pattern_begin_patch(pat);
+	cairo_mesh_pattern_move_to(pat, 0, 0);
+	cairo_mesh_pattern_line_to(pat, width, 0);
+	cairo_mesh_pattern_line_to(pat, width, height);
+	cairo_mesh_pattern_line_to(pat, 0, height);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 0, 1.0, 0.0, 0.0);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 1, 0.0, 1.0, 0.0);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 2, 0.0, 0.0, 1.0);
+	cairo_mesh_pattern_set_corner_color_rgb(pat, 3, 1.0, 1.0, 1.0);
+	cairo_mesh_pattern_end_patch(pat);
+
+	cairo_rectangle(cr, 0, 0, width, height);
+	cairo_set_source(cr, pat);
+	cairo_fill(cr);
+	cairo_pattern_destroy(pat);
+
+	cairo_destroy(cr);
+
+	cairo_surface_destroy(surface);
+	if (!obj->ptr)
+		munmap(map, obj->size);
+}
+
+/**
+ * blt_surface_info:
+ * @info: information header
+ * @obj: blitter copy object (@blt_copy_object) to print surface info
+ */
+void blt_surface_info(const char *info, const struct blt_copy_object *obj)
+{
+	igt_info("[%s]\n", info);
+	igt_info("surface <handle: %u, size: %llx, region: %x, mocs: %x>\n",
+		 obj->handle, (long long) obj->size, obj->region, obj->mocs);
+	igt_info("        <tiling: %s, compression: %u, compression type: %d>\n",
+		 blt_tiling_name(obj->tiling), obj->compression, obj->compression_type);
+	igt_info("        <pitch: %u, offset [x: %u, y: %u] geom [<%d,%d> <%d,%d>]>\n",
+		 obj->pitch, obj->x_offset, obj->y_offset,
+		 obj->x1, obj->y1, obj->x2, obj->y2);
+}
+
+/**
+ * blt_surface_to_png:
+ * @i915: drm fd
+ * @run_id: prefix id to allow grouping files stored from single run
+ * @fileid: file identifier
+ * @obj: blitter copy object (@blt_copy_object) to save to png
+ * @width: width
+ * @height: height
+ *
+ * Function saves surface to a png file. Assumes ARGB format where A == 0xff.
+ */
+void blt_surface_to_png(int i915, uint32_t run_id, const char *fileid,
+			const struct blt_copy_object *obj,
+			uint32_t width, uint32_t height)
+{
+	cairo_surface_t *surface;
+	cairo_status_t ret;
+	uint8_t *map = (uint8_t *) obj->ptr;
+	int format;
+	int stride = obj->tiling ? obj->pitch * 4 : obj->pitch;
+	char filename[FILENAME_MAX];
+
+	snprintf(filename, FILENAME_MAX-1, "%u-%s-%s-%ux%u-%s.png",
+		 run_id, fileid, blt_tiling_name(obj->tiling), width, height,
+		 obj->compression ? "compressed" : "uncompressed");
+
+	if (!map)
+		map = gem_mmap__device_coherent(i915, obj->handle, 0,
+						obj->size, PROT_READ);
+	format = CAIRO_FORMAT_RGB24;
+	surface = cairo_image_surface_create_for_data(map,
+						      format, width, height,
+						      stride);
+	ret = cairo_surface_write_to_png(surface, filename);
+	if (ret)
+		igt_info("Cairo ret: %d (%s)\n", ret, cairo_status_to_string(ret));
+	igt_assert(ret == CAIRO_STATUS_SUCCESS);
+	cairo_surface_destroy(surface);
+
+	if (!obj->ptr)
+		munmap(map, obj->size);
+}
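
The dw00 header of XY_CTRL_SURF_COPY_BLT built by blt_ctrl_surf_copy() can be
cross-checked against the "[%08x]" value printed by dump_bb_surf_ctrl_cmd().
The sketch below packs the same fields with explicit shifts instead of C
bitfields. It assumes BITRANGE(a, b) denotes bits a..b inclusive, counted from
the least significant bit; pack_field() and ctrl_surf_copy_dw00() are
hypothetical helper names, not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: place @value into bits lo..hi (inclusive) of a dword. */
static uint32_t pack_field(uint32_t value, unsigned int lo, unsigned int hi)
{
	unsigned int width = hi - lo + 1;
	uint32_t mask = width == 32 ? 0xffffffffu : (1u << width) - 1;

	assert(value <= mask);	/* value must fit into the field */
	return value << lo;
}

/*
 * Hypothetical helper: dw00 of the ctrl-surf-copy command, using the
 * constants the patch assigns (client 0x2, opcode 0x48, length 0x3).
 */
static uint32_t ctrl_surf_copy_dw00(uint32_t src_access, uint32_t dst_access,
				    uint32_t size_of_ctrl_copy)
{
	return pack_field(0x2, 29, 31) |		/* client */
	       pack_field(0x48, 22, 28) |		/* opcode */
	       pack_field(src_access, 21, 21) |		/* src access type */
	       pack_field(dst_access, 20, 20) |		/* dst access type */
	       pack_field(size_of_ctrl_copy, 8, 17) |	/* size of ctrl copy */
	       pack_field(0x3, 0, 7);			/* length */
}
```

With an indirect source (0), a direct destination (1) and a size field of 15,
this yields 0x52100f03. The bitfield struct in the patch should produce the
same dword on the little-endian ABIs i915 runs on, although bitfield layout is
strictly implementation-defined.
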
diff --git a/lib/i915/i915_blt.h b/lib/i915/i915_blt.h
new file mode 100644
index 000000000..e0e8b52bc
--- /dev/null
+++ b/lib/i915/i915_blt.h
@@ -0,0 +1,196 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2022 Intel Corporation
+ */
+
+/**
+ * SECTION:i915_blt
+ * @short_description: i915 blitter library
+ * @title: Blitter library
+ * @include: i915_blt.h
+ *
+ * # Introduction
+ *
+ * Gen12+ blitter commands like XY_BLOCK_COPY_BLT take many arguments.
+ * Passing them all as function parameters would produce a long, unreadable
+ * list in which arguments are easily misplaced.
+ * Passing objects (structs) instead is more manageable and allows
+ * sharing object data across different blitter commands.
+ *
+ * The blitter library supports no-reloc (softpin) mode only (apart from TGL,
+ * relocations are not enabled), thus ahnd is mandatory. A NULL ctx selects
+ * the default context and a NULL engine selects I915_EXEC_BLT.
+ *
+ * The library introduces a tiling enum which distinguishes tiling formats
+ * independently of the legacy I915_TILING_... definitions. This gives full
+ * control over which tilings a command handles and which it skips/asserts on.
+ *
+ * # Supported commands
+ *
+ * - XY_BLOCK_COPY_BLT - (block-copy) TGL/DG1 + DG2+ (ext version)
+ * - XY_FAST_COPY_BLT - (fast-copy)
+ * - XY_CTRL_SURF_COPY_BLT - (ctrl-surf-copy) DG2+
+ *
+ * # Usage details
+ *
+ * For block-copy and fast-copy the @blt_copy_object struct is used to collect
+ * data about the source and destination objects. It contains handle, region,
+ * size, etc., which are used for blits. Some fields (like compression) are
+ * not used by fast-copy; fields used exclusively by one command are
+ * annotated in the comments.
+ *
+ */
+
+#include <errno.h>
+#include <sys/ioctl.h>
+#include <sys/time.h>
+#include <malloc.h>
+#include "drm.h"
+#include "igt.h"
+
+#define CCS_RATIO 256
+
+enum blt_color_depth {
+	CD_8bit,
+	CD_16bit,
+	CD_32bit,
+	CD_64bit,
+	CD_96bit,
+	CD_128bit,
+};
+
+enum blt_tiling {
+	T_LINEAR,
+	T_XMAJOR,
+	T_YMAJOR,
+	T_TILE4,
+	T_TILE64,
+};
+
+enum blt_compression {
+	COMPRESSION_DISABLED,
+	COMPRESSION_ENABLED,
+};
+
+enum blt_compression_type {
+	COMPRESSION_TYPE_3D,
+	COMPRESSION_TYPE_MEDIA,
+};
+
+/* BC - block-copy */
+struct blt_copy_object {
+	uint32_t handle;
+	uint32_t region;
+	uint64_t size;
+	uint8_t mocs;
+	enum blt_tiling tiling;
+	enum blt_compression compression;  /* BC only */
+	enum blt_compression_type compression_type; /* BC only */
+	uint32_t pitch;
+	uint16_t x_offset, y_offset;
+	int16_t x1, y1, x2, y2;
+
+	/* mapping or null */
+	uint32_t *ptr;
+};
+
+struct blt_copy_batch {
+	uint32_t handle;
+	uint32_t region;
+	uint64_t size;
+};
+
+/* Common for block-copy and fast-copy */
+struct blt_copy_data {
+	int i915;
+	struct blt_copy_object src;
+	struct blt_copy_object dst;
+	struct blt_copy_batch bb;
+	enum blt_color_depth color_depth;
+
+	/* debug stuff */
+	bool print_bb;
+};
+
+enum blt_surface_type {
+	SURFACE_TYPE_1D,
+	SURFACE_TYPE_2D,
+	SURFACE_TYPE_3D,
+	SURFACE_TYPE_CUBE,
+};
+
+struct blt_block_copy_object_ext {
+	uint8_t compression_format;
+	bool clear_value_enable;
+	uint64_t clear_address;
+	uint16_t surface_width;
+	uint16_t surface_height;
+	enum blt_surface_type surface_type;
+	uint16_t surface_qpitch;
+	uint16_t surface_depth;
+	uint8_t lod;
+	uint8_t horizontal_align;
+	uint8_t vertical_align;
+	uint8_t mip_tail_start_lod;
+	bool depth_stencil_resource;
+	uint16_t array_index;
+};
+
+struct blt_block_copy_data_ext {
+	struct blt_block_copy_object_ext src;
+	struct blt_block_copy_object_ext dst;
+};
+
+enum blt_access_type {
+	INDIRECT_ACCESS,
+	DIRECT_ACCESS,
+};
+
+struct blt_ctrl_surf_copy_object {
+	uint32_t handle;
+	uint32_t region;
+	uint64_t size;
+	uint8_t mocs;
+	enum blt_access_type access_type;
+};
+
+struct blt_ctrl_surf_copy_data {
+	int i915;
+	struct blt_ctrl_surf_copy_object src;
+	struct blt_ctrl_surf_copy_object dst;
+	struct blt_copy_batch bb;
+
+	/* debug stuff */
+	bool print_bb;
+};
+
+bool blt_supports_compression(int i915);
+bool blt_supports_tiling(int i915, enum blt_tiling tiling);
+const char *blt_tiling_name(enum blt_tiling tiling);
+
+int blt_block_copy(int i915,
+		   const intel_ctx_t *ctx,
+		   const struct intel_execution_engine2 *e,
+		   uint64_t ahnd,
+		   const struct blt_copy_data *blt,
+		   const struct blt_block_copy_data_ext *ext);
+
+int blt_ctrl_surf_copy(int i915,
+		       const intel_ctx_t *ctx,
+		       const struct intel_execution_engine2 *e,
+		       uint64_t ahnd,
+		       const struct blt_ctrl_surf_copy_data *surf);
+
+int blt_fast_copy(int i915,
+		  const intel_ctx_t *ctx,
+		  const struct intel_execution_engine2 *e,
+		  uint64_t ahnd,
+		  const struct blt_copy_data *blt);
+
+void blt_surface_info(const char *info,
+		      const struct blt_copy_object *obj);
+void blt_surface_fill_rect(int i915, const struct blt_copy_object *obj,
+			   uint32_t width, uint32_t height);
+void blt_surface_to_png(int i915, uint32_t run_id, const char *fileid,
+		    const struct blt_copy_object *obj,
+		    uint32_t width, uint32_t height);
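
How CCS_RATIO and the access types feed the size_of_ctrl_copy field can be
shown with a standalone calculation. The sketch follows the arithmetic of
__ccs_size() and blt_ctrl_surf_copy() in this patch: an indirectly accessed
object is described by its main-surface size, so its CCS data is size / 256,
while a directly accessed object is the CCS data itself. Going by that
arithmetic, the field appears to count 256-byte blocks of CCS data minus one;
this is an inference from the code, not a statement from hardware
documentation. ccs_bytes() and ctrl_copy_size_field() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define CCS_RATIO 256	/* same value as in i915_blt.h */

/* Mirrors enum blt_access_type from i915_blt.h. */
enum access_type { INDIRECT_ACCESS, DIRECT_ACCESS };

/* Hypothetical helper: bytes of CCS data described by an object. */
static uint64_t ccs_bytes(uint64_t size, enum access_type type)
{
	return type == DIRECT_ACCESS ? size : size / CCS_RATIO;
}

/* Hypothetical helper: value for the 10-bit size_of_ctrl_copy field. */
static uint32_t ctrl_copy_size_field(uint64_t src_size, enum access_type src_type,
				     uint64_t dst_size, enum access_type dst_type)
{
	uint64_t src_ccs = ccs_bytes(src_size, src_type);
	uint64_t dst_ccs = ccs_bytes(dst_size, dst_type);
	uint64_t field;

	assert(src_ccs >= CCS_RATIO);	/* at least one 256-byte block */
	assert(src_ccs <= dst_ccs);	/* dst must hold the src CCS data */

	field = src_ccs / CCS_RATIO - 1;
	assert(field <= 0x3ff);		/* must fit bits 8..17 of dw00 */

	return (uint32_t)field;
}
```

For a 1 MiB main surface read indirectly into a 4096-byte directly accessed
buffer, the CCS data is 4096 bytes and the field value is 15. Under the same
assumption, the largest value (0x3ff) corresponds to 256 KiB of CCS data,
i.e. a 64 MiB main surface per command.
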
diff --git a/lib/meson.build b/lib/meson.build
index 3e43316d1..fe035672e 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -11,6 +11,7 @@ lib_sources = [
 	'i915/gem_mman.c',
 	'i915/gem_vm.c',
 	'i915/intel_memory_region.c',
+	'i915/i915_blt.c',
 	'igt_collection.c',
 	'igt_color_encoding.c',
 	'igt_debugfs.c',
-- 
2.32.0
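
All three copy functions in this patch split a softpinned 64-bit offset into
lo/hi dwords for the command and pass CANONICAL(offset) to execbuf. The
definition of CANONICAL() is not part of this patch; the sketch below assumes
it sign-extends the offset from bit 47 (48-bit GPU virtual addresses), which
is the usual meaning of a canonical address. HIGH_ADDRESS_BIT and the helper
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 48-bit GPU virtual addresses, so bit 47 is the sign bit. */
#define HIGH_ADDRESS_BIT 47

/* Sign-extend a 48-bit offset into a 64-bit canonical address. */
static uint64_t canonical_address(uint64_t offset)
{
	uint64_t low_mask = (1ULL << (HIGH_ADDRESS_BIT + 1)) - 1;

	assert((offset & ~low_mask) == 0);	/* expects a non-canonical input */

	if (offset & (1ULL << HIGH_ADDRESS_BIT))
		return offset | ~low_mask;	/* replicate bit 47 into 48..63 */

	return offset;
}

/* Lower 32 bits, as stored in the *_address_lo dwords. */
static uint32_t address_lo(uint64_t offset)
{
	return (uint32_t)offset;
}

/* Upper 32 bits, as stored in the *_address_hi dwords. */
static uint32_t address_hi(uint64_t offset)
{
	return (uint32_t)(offset >> 32);
}
```

Note that the command dwords take the plain allocator offset while only the
execbuf objects use the canonical form. In the ctrl-surf-copy command the hi
field is 25 bits wide (bits 0..24), so the bitfield assignment keeps just the
low 25 bits of offset >> 32.
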
