* [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries
@ 2018-04-11  8:14 Katarzyna Dec
From: Katarzyna Dec @ 2018-04-11  8:14 UTC (permalink / raw)
  To: igt-dev

This series removes duplication in the gpgpu_fill and media_fill
libraries. The first patch moves the gpgpu and media helper functions
to a new gpu_fill library. The second patch removes the less obvious
duplications (for example, by adding conditions for future gens to
the gen7 functions). The third patch adds the missing configuration
parameters whose absence makes the GPU hang on gen9 and gen9+. The
last patch adjusts the code to our coding style.

The first version of this series received a comment about moving the
batch_alloc/copy etc. functions to the intel_batchbuffer library.
Because there is already a lot of code to review, this change will
be introduced in another series (it affects the rendercopy,
media_fill, gpgpu_fill and media_spin code).
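
For readers unfamiliar with the batch_alloc/batch_copy helpers mentioned
above, here is a minimal, self-contained sketch of how they sub-allocate
from a batch buffer, modelled on the code in patch 1. struct sketch_batch
and the sketch_ names are hypothetical stand-ins for struct
intel_batchbuffer and the real helpers, and the bounds assert is an
addition of this sketch, not something the library does:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct intel_batchbuffer: only the two
 * fields the helpers touch, with a fixed 4096-byte buffer. */
struct sketch_batch {
	uint8_t buffer[4096];
	uint8_t *ptr;	/* next free byte in buffer */
};

/* Round v up to a multiple of a; a must be a power of two. */
#define SKETCH_ALIGN(v, a) (((v) + (a) - 1) & ~((uint32_t)(a) - 1))

static void sketch_batch_reset(struct sketch_batch *batch)
{
	memset(batch->buffer, 0, sizeof(batch->buffer));
	batch->ptr = batch->buffer;
}

/* Number of bytes consumed so far. */
static uint32_t sketch_batch_used(struct sketch_batch *batch)
{
	return (uint32_t)(batch->ptr - batch->buffer);
}

/* Advance the write pointer to the next aligned offset. */
static uint32_t sketch_batch_align(struct sketch_batch *batch, uint32_t align)
{
	uint32_t offset = SKETCH_ALIGN(sketch_batch_used(batch), align);

	batch->ptr = batch->buffer + offset;
	return offset;
}

/* Reserve size zeroed bytes at an aligned offset. */
static void *sketch_batch_alloc(struct sketch_batch *batch, uint32_t size,
				uint32_t align)
{
	uint32_t offset = sketch_batch_align(batch, align);

	/* Bounds check added for this sketch only. */
	assert(offset + size <= sizeof(batch->buffer));
	batch->ptr += size;
	return memset(batch->buffer + offset, 0, size);
}

/* Offset of ptr from the start of the buffer. */
static uint32_t sketch_batch_offset(struct sketch_batch *batch, void *ptr)
{
	return (uint32_t)((uint8_t *)ptr - batch->buffer);
}

/* Copy size bytes into the buffer and return where they landed. */
static uint32_t sketch_batch_copy(struct sketch_batch *batch, const void *src,
				  uint32_t size, uint32_t align)
{
	void *dst = sketch_batch_alloc(batch, size, align);

	memcpy(dst, src, size);
	return sketch_batch_offset(batch, dst);
}
```

The returned value is an offset, not a pointer, because the state
structures built this way are referenced relative to a base address
rather than by CPU address.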

It is possible that more changes around gen*_media.h and media_spin
are needed, but this will be done as a next step.

v2: Removed non-obvious duplications. Adjusted the code according to
review comments.
v3: The series needed reorganization because it introduced a bug on ALP
that was hard to find. That is why patch 1 now almost only moves
functions to gpu_fill and removes duplications such as identical
functions. Also applied review comments.
v4: Added #defines and copyrights to the new gpu_fill library. Changed
the order of functions in the gpu_fill library.
v5: No changes compared to v4 - git send-email sent the
series to the wrong thread...
v6: Fixed an issue in gen8_emit_gpgpu_walker that was introduced during rebase
v7: Added copyrights to gpu_fill.c

Signed-off-by: Katarzyna Dec <katarzyna.dec@intel.com>
Cc: Lukasz Kalamarz <lukasz.kalamarz@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Katarzyna Dec (4):
  lib: Move common gpgpu/media fill functions to gpu_fill library
  lib: Remove duplications in gpu_fill library
  lib/gpgpu_fill: Add missing configuration parameters for gpgpu_fill
  lib: Adjust refactored gpu_fill library to our coding style

 lib/Makefile.sources    |   3 +-
 lib/gpgpu_fill.c        | 600 ++-----------------------------------------
 lib/gpgpu_fill.h        |  12 +-
 lib/gpu_fill.c          | 660 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/gpu_fill.h          | 135 ++++++++++
 lib/intel_batchbuffer.c |   4 +-
 lib/media_fill.h        |  23 +-
 lib/media_fill_gen7.c   | 278 +-------------------
 lib/media_fill_gen8.c   | 305 +---------------------
 lib/media_fill_gen8lp.c | 367 ---------------------------
 lib/media_fill_gen9.c   | 313 +----------------------
 lib/meson.build         |   2 +-
 12 files changed, 854 insertions(+), 1848 deletions(-)
 create mode 100644 lib/gpu_fill.c
 create mode 100644 lib/gpu_fill.h
 delete mode 100644 lib/media_fill_gen8lp.c

-- 
2.14.3

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* [igt-dev] [PATCH i-g-t v7 1/4] lib: Move common gpgpu/media fill functions to gpu_fill library
  2018-04-11  8:14 [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries Katarzyna Dec
@ 2018-04-11  8:14 ` Katarzyna Dec
From: Katarzyna Dec @ 2018-04-11  8:14 UTC (permalink / raw)
  To: igt-dev

The gpgpu_fill and media_fill libraries are very similar, and many
functions can be shared. I have created a gpu_fill library containing
all the functions needed to implement the gpgpu_fill and media_fill
tests for all gens. To ease review and debugging, this patch only
moves functions from several libraries into one and removes functions
that are identical for media and gpgpu.
Places in the code that required more changes:
  gen7_fill_gpgpu_kernel is removed because it is identical to
gen7_fill_media_kernel and would conflict with moving
genX_fill_interface_descriptor, which is the same for media and gpgpu.
  gen8_fill_media_kernel is not removed in this patch (although it is
identical to the gen7 version), because this patch should consist of
function movement as much as possible.
  gen8_fill_interface_descriptor was unified for media and gpgpu
by adding the kernel and its size as parameters (these parameters
were missing in the media gen8, gen8lp and gen9 functions).
  gen8_emit_state_base_address was unified. The gpgpu version was
configured as if it were using indirect state (while we are
using CURBE). I have checked that the media fill version
(OUT_BATCH(0 | BASE_ADDRESS_MODIFY)) works fine on gpgpu gen8 and newer.
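
To illustrate the interface descriptor unification described above, here
is a rough sketch of why passing the kernel and its size as parameters
lets one function serve both media and gpgpu. It is not the library code:
struct sketch_state, struct sketch_idd, sketch_place and
sketch_fill_interface_descriptor are hypothetical names, and the
descriptor is reduced to the two pointer fields. Only the 64-byte
alignment and the >> 6 and >> 5 shifts are taken from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified state buffer (not struct intel_batchbuffer). */
struct sketch_state {
	uint8_t buffer[4096];
	uint32_t used;
};

/* Hypothetical, reduced descriptor: only the two pointer fields. */
struct sketch_idd {
	uint32_t kernel_start_pointer;	/* kernel offset >> 6 */
	uint32_t binding_table_pointer;	/* binding table offset >> 5 */
};

/* Place size bytes at an aligned offset; zero-fill when data is NULL.
 * align must be a power of two. */
static uint32_t sketch_place(struct sketch_state *st, const void *data,
			     size_t size, uint32_t align)
{
	uint32_t offset = (st->used + align - 1) & ~(align - 1);

	assert(offset + size <= sizeof(st->buffer));
	if (data)
		memcpy(st->buffer + offset, data, size);
	else
		memset(st->buffer + offset, 0, size);
	st->used = offset + (uint32_t)size;
	return offset;
}

/* The caller supplies the kernel, so the same function works for any
 * media or gpgpu kernel instead of referring to one fixed array. */
static struct sketch_idd
sketch_fill_interface_descriptor(struct sketch_state *st,
				 const uint32_t kernel[][4], size_t size)
{
	struct sketch_idd idd;
	uint32_t binding_table_offset = sketch_place(st, NULL, 32, 64);
	uint32_t kernel_offset = sketch_place(st, kernel, size, 64);

	idd.kernel_start_pointer = kernel_offset >> 6;
	idd.binding_table_pointer = binding_table_offset >> 5;
	return idd;
}
```

A caller would pass, for example, its own kernel array together with
sizeof() of that array, which is the calling pattern the unified
gen8_fill_interface_descriptor signature in this patch allows.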

v2: Changed code layout. genX_fill_media_kernel was identical to
genX_fill_gpgpu_kernel, so these functions were unified into
gen7_fill_kernel. There were 2 very similar
gen8_emit_state_base_address functions for media and gpgpu, where the
one for gpgpu was configured as if it were using indirect state
(while we are using CURBE). I have checked that the media fill version
works fine in the gpgpu test on Gen8 and unified them.

v3: Made the patch easier to review by moving the changes that unify
code across gens (which were included in v1) to another patch, leaving
only the most critical code changes.

v4: Added copyrights and #ifndef to gpu_fill.h
v7: Added copyrights to gpu_fill.c

Signed-off-by: Katarzyna Dec <katarzyna.dec@intel.com>
Cc: Lukasz Kalamarz <lukasz.kalamarz@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 lib/Makefile.sources    |   2 +
 lib/gpgpu_fill.c        | 571 +----------------------------------
 lib/gpu_fill.c          | 782 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/gpu_fill.h          | 167 +++++++++++
 lib/media_fill_gen7.c   | 271 +----------------
 lib/media_fill_gen8.c   | 290 +-----------------
 lib/media_fill_gen8lp.c | 284 +-----------------
 lib/media_fill_gen9.c   | 298 +-----------------
 lib/meson.build         |   1 +
 9 files changed, 961 insertions(+), 1705 deletions(-)
 create mode 100644 lib/gpu_fill.c
 create mode 100644 lib/gpu_fill.h

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index 3d37ef1d..45e65dd7 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -64,6 +64,8 @@ lib_source_list =	 	\
 	media_spin.c		\
 	gpgpu_fill.h		\
 	gpgpu_fill.c		\
+	gpu_fill.h		\
+	gpu_fill.c		\
 	gen7_media.h            \
 	gen8_media.h            \
 	rendercopy_i915.c	\
diff --git a/lib/gpgpu_fill.c b/lib/gpgpu_fill.c
index 4d98643d..f2765fd6 100644
--- a/lib/gpgpu_fill.c
+++ b/lib/gpgpu_fill.c
@@ -30,10 +30,9 @@
 
 #include "intel_reg.h"
 #include "drmtest.h"
-#include "intel_batchbuffer.h"
-#include "gen7_media.h"
-#include "gen8_media.h"
+
 #include "gpgpu_fill.h"
+#include "gpu_fill.h"
 
 /* shaders/gpgpu/gpgpu_fill.gxa */
 static const uint32_t gen7_gpgpu_kernel[][4] = {
@@ -75,572 +74,6 @@ static const uint32_t gen9_gpgpu_kernel[][4] = {
 	{ 0x07800031, 0x20000a40, 0x06000e00, 0x82000010 },
 };
 
-static uint32_t
-batch_used(struct intel_batchbuffer *batch)
-{
-	return batch->ptr - batch->buffer;
-}
-
-static uint32_t
-batch_align(struct intel_batchbuffer *batch, uint32_t align)
-{
-	uint32_t offset = batch_used(batch);
-	offset = ALIGN(offset, align);
-	batch->ptr = batch->buffer + offset;
-	return offset;
-}
-
-static void *
-batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align)
-{
-	uint32_t offset = batch_align(batch, align);
-	batch->ptr += size;
-	return memset(batch->buffer + offset, 0, size);
-}
-
-static uint32_t
-batch_offset(struct intel_batchbuffer *batch, void *ptr)
-{
-	return (uint8_t *)ptr - batch->buffer;
-}
-
-static uint32_t
-batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size,
-	   uint32_t align)
-{
-	return batch_offset(batch, memcpy(batch_alloc(batch, size, align), ptr, size));
-}
-
-static void
-gen7_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end)
-{
-	int ret;
-
-	ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
-	if (ret == 0)
-		ret = drm_intel_bo_mrb_exec(batch->bo, batch_end,
-					NULL, 0, 0, 0);
-	igt_assert(ret == 0);
-}
-
-static uint32_t
-gen7_fill_curbe_buffer_data(struct intel_batchbuffer *batch, uint8_t color)
-{
-	uint8_t *curbe_buffer;
-	uint32_t offset;
-
-	curbe_buffer = batch_alloc(batch, sizeof(uint32_t) * 8, 64);
-	offset = batch_offset(batch, curbe_buffer);
-	*curbe_buffer = color;
-
-	return offset;
-}
-
-static uint32_t
-gen7_fill_surface_state(struct intel_batchbuffer *batch,
-			struct igt_buf *buf,
-			uint32_t format,
-			int is_dst)
-{
-	struct gen7_surface_state *ss;
-	uint32_t write_domain, read_domain, offset;
-	int ret;
-
-	if (is_dst) {
-		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
-	} else {
-		write_domain = 0;
-		read_domain = I915_GEM_DOMAIN_SAMPLER;
-	}
-
-	ss = batch_alloc(batch, sizeof(*ss), 64);
-	offset = batch_offset(batch, ss);
-
-	ss->ss0.surface_type = GEN7_SURFACE_2D;
-	ss->ss0.surface_format = format;
-	ss->ss0.render_cache_read_write = 1;
-
-	if (buf->tiling == I915_TILING_X)
-		ss->ss0.tiled_mode = 2;
-	else if (buf->tiling == I915_TILING_Y)
-		ss->ss0.tiled_mode = 3;
-
-	ss->ss1.base_addr = buf->bo->offset;
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				batch_offset(batch, ss) + 4,
-				buf->bo, 0,
-				read_domain, write_domain);
-	igt_assert(ret == 0);
-
-	ss->ss2.height = igt_buf_height(buf) - 1;
-	ss->ss2.width  = igt_buf_width(buf) - 1;
-
-	ss->ss3.pitch  = buf->stride - 1;
-
-	ss->ss7.shader_chanel_select_r = 4;
-	ss->ss7.shader_chanel_select_g = 5;
-	ss->ss7.shader_chanel_select_b = 6;
-	ss->ss7.shader_chanel_select_a = 7;
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_surface_state(struct intel_batchbuffer *batch,
-			struct igt_buf *buf,
-			uint32_t format,
-			int is_dst)
-{
-	struct gen8_surface_state *ss;
-	uint32_t write_domain, read_domain, offset;
-	int ret;
-
-	if (is_dst) {
-		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
-	} else {
-		write_domain = 0;
-		read_domain = I915_GEM_DOMAIN_SAMPLER;
-	}
-
-	ss = batch_alloc(batch, sizeof(*ss), 64);
-	offset = batch_offset(batch, ss);
-
-	ss->ss0.surface_type = GEN8_SURFACE_2D;
-	ss->ss0.surface_format = format;
-	ss->ss0.render_cache_read_write = 1;
-	ss->ss0.vertical_alignment = 1; /* align 4 */
-	ss->ss0.horizontal_alignment = 1; /* align 4 */
-
-	if (buf->tiling == I915_TILING_X)
-		ss->ss0.tiled_mode = 2;
-	else if (buf->tiling == I915_TILING_Y)
-		ss->ss0.tiled_mode = 3;
-
-	ss->ss8.base_addr = buf->bo->offset;
-
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				batch_offset(batch, ss) + 8 * 4,
-				buf->bo, 0,
-				read_domain, write_domain);
-	igt_assert_eq(ret, 0);
-
-	ss->ss2.height = igt_buf_height(buf) - 1;
-	ss->ss2.width  = igt_buf_width(buf) - 1;
-	ss->ss3.pitch  = buf->stride - 1;
-
-	ss->ss7.shader_chanel_select_r = 4;
-	ss->ss7.shader_chanel_select_g = 5;
-	ss->ss7.shader_chanel_select_b = 6;
-	ss->ss7.shader_chanel_select_a = 7;
-
-	return offset;
-
-}
-
-static uint32_t
-gen7_fill_binding_table(struct intel_batchbuffer *batch,
-			struct igt_buf *dst)
-{
-	uint32_t *binding_table, offset;
-
-	binding_table = batch_alloc(batch, 32, 64);
-	offset = batch_offset(batch, binding_table);
-
-	binding_table[0] = gen7_fill_surface_state(batch, dst, GEN7_SURFACEFORMAT_R8_UNORM, 1);
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_binding_table(struct intel_batchbuffer *batch,
-			struct igt_buf *dst)
-{
-	uint32_t *binding_table, offset;
-
-	binding_table = batch_alloc(batch, 32, 64);
-	offset = batch_offset(batch, binding_table);
-
-	binding_table[0] = gen8_fill_surface_state(batch, dst, GEN8_SURFACEFORMAT_R8_UNORM, 1);
-
-	return offset;
-}
-
-static uint32_t
-gen7_fill_gpgpu_kernel(struct intel_batchbuffer *batch,
-		const uint32_t kernel[][4],
-		size_t size)
-{
-	uint32_t offset;
-
-	offset = batch_copy(batch, kernel, size, 64);
-
-	return offset;
-}
-
-static uint32_t
-gen7_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *dst,
-			       const uint32_t kernel[][4], size_t size)
-{
-	struct gen7_interface_descriptor_data *idd;
-	uint32_t offset;
-	uint32_t binding_table_offset, kernel_offset;
-
-	binding_table_offset = gen7_fill_binding_table(batch, dst);
-	kernel_offset = gen7_fill_gpgpu_kernel(batch, kernel, size);
-
-	idd = batch_alloc(batch, sizeof(*idd), 64);
-	offset = batch_offset(batch, idd);
-
-	idd->desc0.kernel_start_pointer = (kernel_offset >> 6);
-
-	idd->desc1.single_program_flow = 1;
-	idd->desc1.floating_point_mode = GEN7_FLOATING_POINT_IEEE_754;
-
-	idd->desc2.sampler_count = 0;      /* 0 samplers used */
-	idd->desc2.sampler_state_pointer = 0;
-
-	idd->desc3.binding_table_entry_count = 0;
-	idd->desc3.binding_table_pointer = (binding_table_offset >> 5);
-
-	idd->desc4.constant_urb_entry_read_offset = 0;
-	idd->desc4.constant_urb_entry_read_length = 1; /* grf 1 */
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *dst,
-			       const uint32_t kernel[][4], size_t size)
-{
-	struct gen8_interface_descriptor_data *idd;
-	uint32_t offset;
-	uint32_t binding_table_offset, kernel_offset;
-
-	binding_table_offset = gen8_fill_binding_table(batch, dst);
-	kernel_offset = gen7_fill_gpgpu_kernel(batch, kernel, size);
-
-	idd = batch_alloc(batch, sizeof(*idd), 64);
-	offset = batch_offset(batch, idd);
-
-	idd->desc0.kernel_start_pointer = (kernel_offset >> 6);
-
-	idd->desc2.single_program_flow = 1;
-	idd->desc2.floating_point_mode = GEN8_FLOATING_POINT_IEEE_754;
-
-	idd->desc3.sampler_count = 0;      /* 0 samplers used */
-	idd->desc3.sampler_state_pointer = 0;
-
-	idd->desc4.binding_table_entry_count = 0;
-	idd->desc4.binding_table_pointer = (binding_table_offset >> 5);
-
-	idd->desc5.constant_urb_entry_read_offset = 0;
-	idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */
-
-	return offset;
-}
-
-static void
-gen7_emit_state_base_address(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN7_STATE_BASE_ADDRESS | (10 - 2));
-
-	/* general */
-	OUT_BATCH(0);
-
-	/* surface */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-
-	/* dynamic */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-
-	/* indirect */
-	OUT_BATCH(0);
-
-	/* instruction */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-
-	/* general/dynamic/indirect/instruction access Bound */
-	OUT_BATCH(0);
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-}
-
-static void
-gen8_emit_state_base_address(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (16 - 2));
-
-	/* general */
-	OUT_BATCH(0 | (0x78 << 4) | (0 << 1) |  BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
-
-	/* stateless data port */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-
-	/* surface */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
-
-	/* dynamic */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
-		  0, BASE_ADDRESS_MODIFY);
-
-	/* indirect */
-	OUT_BATCH(0);
-	OUT_BATCH(0 );
-
-	/* instruction */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-
-	/* general state buffer size */
-	OUT_BATCH(0xfffff000 | 1);
-	/* dynamic state buffer size */
-	OUT_BATCH(1 << 12 | 1);
-	/* indirect object buffer size */
-	OUT_BATCH(0xfffff000 | 1);
-	/* intruction buffer size, must set modify enable bit, otherwise it may result in GPU hang */
-	OUT_BATCH(1 << 12 | 1);
-}
-
-static void
-gen9_emit_state_base_address(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (19 - 2));
-
-	/* general */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
-
-	/* stateless data port */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-
-	/* surface */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
-
-	/* dynamic */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
-		0, BASE_ADDRESS_MODIFY);
-
-	/* indirect */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	/* instruction */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-
-	/* general state buffer size */
-	OUT_BATCH(0xfffff000 | 1);
-	/* dynamic state buffer size */
-	OUT_BATCH(1 << 12 | 1);
-	/* indirect object buffer size */
-	OUT_BATCH(0xfffff000 | 1);
-	/* intruction buffer size, must set modify enable bit, otherwise it may result in GPU hang */
-	OUT_BATCH(1 << 12 | 1);
-
-	/* Bindless surface state base address */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
-	OUT_BATCH(0xfffff000);
-}
-
-static void
-gen7_emit_vfe_state_gpgpu(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN7_MEDIA_VFE_STATE | (8 - 2));
-
-	/* scratch buffer */
-	OUT_BATCH(0);
-
-	/* number of threads & urb entries */
-	OUT_BATCH(1 << 16 | /* max num of threads */
-		  0 << 8 | /* num of URB entry */
-		  1 << 2); /* GPGPU mode */
-
-	OUT_BATCH(0);
-
-	/* urb entry size & curbe size */
-	OUT_BATCH(0 << 16 | 	/* URB entry size in 256 bits unit */
-		  1);		/* CURBE entry size in 256 bits unit */
-
-	/* scoreboard */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-}
-
-static void
-gen8_emit_vfe_state_gpgpu(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2));
-
-	/* scratch buffer */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	/* number of threads & urb entries */
-	OUT_BATCH(1 << 16 | 1 << 8);
-
-	OUT_BATCH(0);
-
-	/* urb entry size & curbe size */
-	OUT_BATCH(0 << 16 | 1);
-
-	/* scoreboard */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-}
-
-static void
-gen7_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
-{
-	OUT_BATCH(GEN7_MEDIA_CURBE_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* curbe total data length */
-	OUT_BATCH(64);
-	/* curbe data start address, is relative to the dynamics base address */
-	OUT_BATCH(curbe_buffer);
-}
-
-static void
-gen7_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor)
-{
-	OUT_BATCH(GEN7_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* interface descriptor data length */
-	OUT_BATCH(sizeof(struct gen7_interface_descriptor_data));
-	/* interface descriptor address, is relative to the dynamics base address */
-	OUT_BATCH(interface_descriptor);
-}
-
-static void
-gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor)
-{
-	OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* interface descriptor data length */
-	OUT_BATCH(sizeof(struct gen8_interface_descriptor_data));
-	/* interface descriptor address, is relative to the dynamics base address */
-	OUT_BATCH(interface_descriptor);
-}
-
-static void
-gen7_emit_gpgpu_walk(struct intel_batchbuffer *batch,
-		     unsigned x, unsigned y,
-		     unsigned width, unsigned height)
-{
-	uint32_t x_dim, y_dim, tmp, right_mask;
-
-	/*
-	 * Simply do SIMD16 based dispatch, so every thread uses
-	 * SIMD16 channels.
-	 *
-	 * Define our own thread group size, e.g 16x1 for every group, then
-	 * will have 1 thread each group in SIMD16 dispatch. So thread
-	 * width/height/depth are all 1.
-	 *
-	 * Then thread group X = width / 16 (aligned to 16)
-	 * thread group Y = height;
-	 */
-	x_dim = (width + 15) / 16;
-	y_dim = height;
-
-	tmp = width & 15;
-	if (tmp == 0)
-		right_mask = (1 << 16) - 1;
-	else
-		right_mask = (1 << tmp) - 1;
-
-	OUT_BATCH(GEN7_GPGPU_WALKER | 9);
-
-	/* interface descriptor offset */
-	OUT_BATCH(0);
-
-	/* SIMD size, thread w/h/d */
-	OUT_BATCH(1 << 30 | /* SIMD16 */
-		  0 << 16 | /* depth:1 */
-		  0 << 8 | /* height:1 */
-		  0); /* width:1 */
-
-	/* thread group X */
-	OUT_BATCH(0);
-	OUT_BATCH(x_dim);
-
-	/* thread group Y */
-	OUT_BATCH(0);
-	OUT_BATCH(y_dim);
-
-	/* thread group Z */
-	OUT_BATCH(0);
-	OUT_BATCH(1);
-
-	/* right mask */
-	OUT_BATCH(right_mask);
-
-	/* bottom mask, height 1, always 0xffffffff */
-	OUT_BATCH(0xffffffff);
-}
-
-static void
-gen8_emit_gpgpu_walk(struct intel_batchbuffer *batch,
-		     unsigned x, unsigned y,
-		     unsigned width, unsigned height)
-{
-	uint32_t x_dim, y_dim, tmp, right_mask;
-
-	/*
-	 * Simply do SIMD16 based dispatch, so every thread uses
-	 * SIMD16 channels.
-	 *
-	 * Define our own thread group size, e.g 16x1 for every group, then
-	 * will have 1 thread each group in SIMD16 dispatch. So thread
-	 * width/height/depth are all 1.
-	 *
-	 * Then thread group X = width / 16 (aligned to 16)
-	 * thread group Y = height;
-	 */
-	x_dim = (width + 15) / 16;
-	y_dim = height;
-
-	tmp = width & 15;
-	if (tmp == 0)
-		right_mask = (1 << 16) - 1;
-	else
-		right_mask = (1 << tmp) - 1;
-
-	OUT_BATCH(GEN7_GPGPU_WALKER | 13);
-
-	OUT_BATCH(0); /* kernel offset */
-	OUT_BATCH(0); /* indirect data length */
-	OUT_BATCH(0); /* indirect data offset */
-
-	/* SIMD size, thread w/h/d */
-	OUT_BATCH(1 << 30 | /* SIMD16 */
-		  0 << 16 | /* depth:1 */
-		  0 << 8 | /* height:1 */
-		  0); /* width:1 */
-
-	/* thread group X */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(x_dim);
-
-	/* thread group Y */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(y_dim);
-
-	/* thread group Z */
-	OUT_BATCH(0);
-	OUT_BATCH(1);
-
-	/* right mask */
-	OUT_BATCH(right_mask);
-
-	/* bottom mask, height 1, always 0xffffffff */
-	OUT_BATCH(0xffffffff);
-}
-
 /*
  * This sets up the gpgpu pipeline,
  *
diff --git a/lib/gpu_fill.c b/lib/gpu_fill.c
new file mode 100644
index 00000000..d7a2dd4c
--- /dev/null
+++ b/lib/gpu_fill.c
@@ -0,0 +1,782 @@
+/*
+ * Copyright © 2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "gpu_fill.h"
+
+uint32_t
+batch_used(struct intel_batchbuffer *batch)
+{
+	return batch->ptr - batch->buffer;
+}
+
+uint32_t
+batch_align(struct intel_batchbuffer *batch, uint32_t align)
+{
+	uint32_t offset = batch_used(batch);
+	offset = ALIGN(offset, align);
+	batch->ptr = batch->buffer + offset;
+	return offset;
+}
+
+void *
+batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align)
+{
+	uint32_t offset = batch_align(batch, align);
+	batch->ptr += size;
+	return memset(batch->buffer + offset, 0, size);
+}
+
+uint32_t
+batch_offset(struct intel_batchbuffer *batch, void *ptr)
+{
+	return (uint8_t *)ptr - batch->buffer;
+}
+
+uint32_t
+batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size, uint32_t align)
+{
+	return batch_offset(batch, memcpy(batch_alloc(batch, size, align), ptr, size));
+}
+
+void
+gen7_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end)
+{
+	int ret;
+
+	ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
+	if (ret == 0)
+		ret = drm_intel_bo_mrb_exec(batch->bo, batch_end,
+					NULL, 0, 0, 0);
+	igt_assert(ret == 0);
+}
+
+uint32_t
+gen7_fill_curbe_buffer_data(struct intel_batchbuffer *batch,
+			uint8_t color)
+{
+	uint8_t *curbe_buffer;
+	uint32_t offset;
+
+	curbe_buffer = batch_alloc(batch, sizeof(uint32_t) * 8, 64);
+	offset = batch_offset(batch, curbe_buffer);
+	*curbe_buffer = color;
+
+	return offset;
+}
+
+uint32_t
+gen7_fill_surface_state(struct intel_batchbuffer *batch,
+			struct igt_buf *buf,
+			uint32_t format,
+			int is_dst)
+{
+	struct gen7_surface_state *ss;
+	uint32_t write_domain, read_domain, offset;
+	int ret;
+
+	if (is_dst) {
+		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
+	} else {
+		write_domain = 0;
+		read_domain = I915_GEM_DOMAIN_SAMPLER;
+	}
+
+	ss = batch_alloc(batch, sizeof(*ss), 64);
+	offset = batch_offset(batch, ss);
+
+	ss->ss0.surface_type = GEN7_SURFACE_2D;
+	ss->ss0.surface_format = format;
+	ss->ss0.render_cache_read_write = 1;
+
+	if (buf->tiling == I915_TILING_X)
+		ss->ss0.tiled_mode = 2;
+	else if (buf->tiling == I915_TILING_Y)
+		ss->ss0.tiled_mode = 3;
+
+	ss->ss1.base_addr = buf->bo->offset;
+	ret = drm_intel_bo_emit_reloc(batch->bo,
+				batch_offset(batch, ss) + 4,
+				buf->bo, 0,
+				read_domain, write_domain);
+	igt_assert(ret == 0);
+
+	ss->ss2.height = igt_buf_height(buf) - 1;
+	ss->ss2.width  = igt_buf_width(buf) - 1;
+
+	ss->ss3.pitch  = buf->stride - 1;
+
+	ss->ss7.shader_chanel_select_r = 4;
+	ss->ss7.shader_chanel_select_g = 5;
+	ss->ss7.shader_chanel_select_b = 6;
+	ss->ss7.shader_chanel_select_a = 7;
+
+	return offset;
+}
+
+uint32_t
+gen7_fill_binding_table(struct intel_batchbuffer *batch,
+			struct igt_buf *dst)
+{
+	uint32_t *binding_table, offset;
+
+	binding_table = batch_alloc(batch, 32, 64);
+	offset = batch_offset(batch, binding_table);
+	binding_table[0] = gen7_fill_surface_state(batch, dst,
+						GEN7_SURFACEFORMAT_R8_UNORM, 1);
+
+	return offset;
+}
+
+uint32_t
+gen7_fill_media_kernel(struct intel_batchbuffer *batch,
+		const uint32_t kernel[][4],
+		size_t size)
+{
+	uint32_t offset;
+
+	offset = batch_copy(batch, kernel, size, 64);
+
+	return offset;
+}
+
+uint32_t
+gen8_fill_media_kernel(struct intel_batchbuffer *batch,
+		const uint32_t kernel[][4],
+		size_t size)
+{
+	uint32_t offset;
+
+	offset = batch_copy(batch, kernel, size, 64);
+
+	return offset;
+}
+
+uint32_t
+gen7_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *dst,
+			       const uint32_t kernel[][4], size_t size)
+{
+	struct gen7_interface_descriptor_data *idd;
+	uint32_t offset;
+	uint32_t binding_table_offset, kernel_offset;
+
+	binding_table_offset = gen7_fill_binding_table(batch, dst);
+	kernel_offset = gen7_fill_media_kernel(batch, kernel, size);
+
+	idd = batch_alloc(batch, sizeof(*idd), 64);
+	offset = batch_offset(batch, idd);
+
+	idd->desc0.kernel_start_pointer = (kernel_offset >> 6);
+
+	idd->desc1.single_program_flow = 1;
+	idd->desc1.floating_point_mode = GEN7_FLOATING_POINT_IEEE_754;
+
+	idd->desc2.sampler_count = 0;      /* 0 samplers used */
+	idd->desc2.sampler_state_pointer = 0;
+
+	idd->desc3.binding_table_entry_count = 0;
+	idd->desc3.binding_table_pointer = (binding_table_offset >> 5);
+
+	idd->desc4.constant_urb_entry_read_offset = 0;
+	idd->desc4.constant_urb_entry_read_length = 1; /* grf 1 */
+
+	return offset;
+}
+
+void
+gen7_emit_state_base_address(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN7_STATE_BASE_ADDRESS | (10 - 2));
+
+	/* general */
+	OUT_BATCH(0);
+
+	/* surface */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
+
+	/* dynamic */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
+
+	/* indirect */
+	OUT_BATCH(0);
+
+	/* instruction */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
+
+	/* general/dynamic/indirect/instruction access Bound */
+	OUT_BATCH(0);
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+}
+
+void
+gen7_emit_vfe_state(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN7_MEDIA_VFE_STATE | (8 - 2));
+
+	/* scratch buffer */
+	OUT_BATCH(0);
+
+	/* number of threads & urb entries */
+	OUT_BATCH(1 << 16 |
+		2 << 8);
+
+	OUT_BATCH(0);
+
+	/* urb entry size & curbe size */
+	OUT_BATCH(2 << 16 | 	/* in 256 bits unit */
+		2);		/* in 256 bits unit */
+
+	/* scoreboard */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+void
+gen7_emit_vfe_state_gpgpu(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN7_MEDIA_VFE_STATE | (8 - 2));
+
+	/* scratch buffer */
+	OUT_BATCH(0);
+
+	/* number of threads & urb entries */
+	OUT_BATCH(1 << 16 | /* max num of threads */
+		  0 << 8 | /* num of URB entry */
+		  1 << 2); /* GPGPU mode */
+
+	OUT_BATCH(0);
+
+	/* urb entry size & curbe size */
+	OUT_BATCH(0 << 16 | 	/* URB entry size in 256 bits unit */
+		  1);		/* CURBE entry size in 256 bits unit */
+
+	/* scoreboard */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+void
+gen7_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
+{
+	OUT_BATCH(GEN7_MEDIA_CURBE_LOAD | (4 - 2));
+	OUT_BATCH(0);
+	/* curbe total data length */
+	OUT_BATCH(64);
+	/* curbe data start address, is relative to the dynamics base address */
+	OUT_BATCH(curbe_buffer);
+}
+
+void
+gen7_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor)
+{
+	OUT_BATCH(GEN7_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
+	OUT_BATCH(0);
+	/* interface descriptor data length */
+	OUT_BATCH(sizeof(struct gen7_interface_descriptor_data));
+	/* interface descriptor address, is relative to the dynamics base address */
+	OUT_BATCH(interface_descriptor);
+}
+
+void
+gen7_emit_media_objects(struct intel_batchbuffer *batch,
+			unsigned x, unsigned y,
+			unsigned width, unsigned height)
+{
+	int i, j;
+
+	for (i = 0; i < width / 16; i++) {
+		for (j = 0; j < height / 16; j++) {
+			OUT_BATCH(GEN7_MEDIA_OBJECT | (8 - 2));
+
+			/* interface descriptor offset */
+			OUT_BATCH(0);
+
+			/* without indirect data */
+			OUT_BATCH(0);
+			OUT_BATCH(0);
+
+			/* scoreboard */
+			OUT_BATCH(0);
+			OUT_BATCH(0);
+
+			/* inline data (xoffset, yoffset) */
+			OUT_BATCH(x + i * 16);
+			OUT_BATCH(y + j * 16);
+		}
+	}
+}
+
+void
+gen7_emit_gpgpu_walk(struct intel_batchbuffer *batch,
+		     unsigned x, unsigned y,
+		     unsigned width, unsigned height)
+{
+	uint32_t x_dim, y_dim, tmp, right_mask;
+
+	/*
+	 * Simply do SIMD16 based dispatch, so every thread uses
+	 * SIMD16 channels.
+	 *
+	 * Define our own thread group size, e.g 16x1 for every group, then
+	 * will have 1 thread each group in SIMD16 dispatch. So thread
+	 * width/height/depth are all 1.
+	 *
+	 * Then thread group X = width / 16 (aligned to 16)
+	 * thread group Y = height;
+	 */
+	x_dim = (width + 15) / 16;
+	y_dim = height;
+
+	tmp = width & 15;
+	if (tmp == 0)
+		right_mask = (1 << 16) - 1;
+	else
+		right_mask = (1 << tmp) - 1;
+
+	OUT_BATCH(GEN7_GPGPU_WALKER | 9);
+
+	/* interface descriptor offset */
+	OUT_BATCH(0);
+
+	/* SIMD size, thread w/h/d */
+	OUT_BATCH(1 << 30 | /* SIMD16 */
+		  0 << 16 | /* depth:1 */
+		  0 << 8 | /* height:1 */
+		  0); /* width:1 */
+
+	/* thread group X */
+	OUT_BATCH(0);
+	OUT_BATCH(x_dim);
+
+	/* thread group Y */
+	OUT_BATCH(0);
+	OUT_BATCH(y_dim);
+
+	/* thread group Z */
+	OUT_BATCH(0);
+	OUT_BATCH(1);
+
+	/* right mask */
+	OUT_BATCH(right_mask);
+
+	/* bottom mask, height 1, always 0xffffffff */
+	OUT_BATCH(0xffffffff);
+}
+
+void
+gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end)
+{
+	int ret;
+
+	ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
+	if (ret == 0)
+		ret = drm_intel_bo_mrb_exec(batch->bo, batch_end,
+					NULL, 0, 0, 0);
+	igt_assert(ret == 0);
+}
+
+
+uint32_t
+gen8_fill_curbe_buffer_data(struct intel_batchbuffer *batch,
+			uint8_t color)
+{
+	uint8_t *curbe_buffer;
+	uint32_t offset;
+
+	curbe_buffer = batch_alloc(batch, sizeof(uint32_t) * 8, 64);
+	offset = batch_offset(batch, curbe_buffer);
+	*curbe_buffer = color;
+
+	return offset;
+}
+
+uint32_t
+gen8_fill_surface_state(struct intel_batchbuffer *batch,
+			struct igt_buf *buf,
+			uint32_t format,
+			int is_dst)
+{
+	struct gen8_surface_state *ss;
+	uint32_t write_domain, read_domain, offset;
+	int ret;
+
+	if (is_dst) {
+		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
+	} else {
+		write_domain = 0;
+		read_domain = I915_GEM_DOMAIN_SAMPLER;
+	}
+
+	ss = batch_alloc(batch, sizeof(*ss), 64);
+	offset = batch_offset(batch, ss);
+
+	ss->ss0.surface_type = GEN8_SURFACE_2D;
+	ss->ss0.surface_format = format;
+	ss->ss0.render_cache_read_write = 1;
+	ss->ss0.vertical_alignment = 1; /* align 4 */
+	ss->ss0.horizontal_alignment = 1; /* align 4 */
+
+	if (buf->tiling == I915_TILING_X)
+		ss->ss0.tiled_mode = 2;
+	else if (buf->tiling == I915_TILING_Y)
+		ss->ss0.tiled_mode = 3;
+
+	ss->ss8.base_addr = buf->bo->offset;
+
+	ret = drm_intel_bo_emit_reloc(batch->bo,
+				batch_offset(batch, ss) + 8 * 4,
+				buf->bo, 0,
+				read_domain, write_domain);
+	igt_assert(ret == 0);
+
+	ss->ss2.height = igt_buf_height(buf) - 1;
+	ss->ss2.width  = igt_buf_width(buf) - 1;
+	ss->ss3.pitch  = buf->stride - 1;
+
+	ss->ss7.shader_chanel_select_r = 4;
+	ss->ss7.shader_chanel_select_g = 5;
+	ss->ss7.shader_chanel_select_b = 6;
+	ss->ss7.shader_chanel_select_a = 7;
+
+	return offset;
+}
+
+uint32_t
+gen8_fill_binding_table(struct intel_batchbuffer *batch,
+			struct igt_buf *dst)
+{
+	uint32_t *binding_table, offset;
+
+	binding_table = batch_alloc(batch, 32, 64);
+	offset = batch_offset(batch, binding_table);
+
+	binding_table[0] = gen8_fill_surface_state(batch, dst,
+						GEN8_SURFACEFORMAT_R8_UNORM, 1);
+
+	return offset;
+}
+
+uint32_t
+gen8_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *dst, const uint32_t kernel[][4], size_t size)
+{
+	struct gen8_interface_descriptor_data *idd;
+	uint32_t offset;
+	uint32_t binding_table_offset, kernel_offset;
+
+	binding_table_offset = gen8_fill_binding_table(batch, dst);
+	kernel_offset = gen8_fill_media_kernel(batch, kernel, size);
+
+	idd = batch_alloc(batch, sizeof(*idd), 64);
+	offset = batch_offset(batch, idd);
+
+	idd->desc0.kernel_start_pointer = (kernel_offset >> 6);
+
+	idd->desc2.single_program_flow = 1;
+	idd->desc2.floating_point_mode = GEN8_FLOATING_POINT_IEEE_754;
+
+	idd->desc3.sampler_count = 0;      /* 0 samplers used */
+	idd->desc3.sampler_state_pointer = 0;
+
+	idd->desc4.binding_table_entry_count = 0;
+	idd->desc4.binding_table_pointer = (binding_table_offset >> 5);
+
+	idd->desc5.constant_urb_entry_read_offset = 0;
+	idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */
+
+	return offset;
+}
+
+void
+gen8_emit_state_base_address(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (16 - 2));
+
+	/* general */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* stateless data port */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+
+	/* surface */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
+
+	/* dynamic */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
+		0, BASE_ADDRESS_MODIFY);
+
+	/* indirect */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* instruction */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
+
+	/* general state buffer size */
+	OUT_BATCH(0xfffff000 | 1);
+	/* dynamic state buffer size */
+	OUT_BATCH(1 << 12 | 1);
+	/* indirect object buffer size */
+	OUT_BATCH(0xfffff000 | 1);
+	/* instruction buffer size; the modify enable bit must be set, otherwise the GPU may hang */
+	OUT_BATCH(1 << 12 | 1);
+}
+
+void
+gen8_emit_vfe_state(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2));
+
+	/* scratch buffer */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* number of threads & urb entries */
+	OUT_BATCH(1 << 16 |
+		2 << 8);
+
+	OUT_BATCH(0);
+
+	/* urb entry size & curbe size */
+	OUT_BATCH(2 << 16 |
+		2);
+
+	/* scoreboard */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+void
+gen8_emit_vfe_state_gpgpu(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2));
+
+	/* scratch buffer */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* number of threads & urb entries */
+	OUT_BATCH(1 << 16 | 1 << 8);
+
+	OUT_BATCH(0);
+
+	/* urb entry size & curbe size */
+	OUT_BATCH(0 << 16 | 1);
+
+	/* scoreboard */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+}
+
+void
+gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
+{
+	OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2));
+	OUT_BATCH(0);
+	/* curbe total data length */
+	OUT_BATCH(64);
+	/* curbe data start address, relative to the dynamic state base address */
+	OUT_BATCH(curbe_buffer);
+}
+
+void
+gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor)
+{
+	OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
+	OUT_BATCH(0);
+	/* interface descriptor data length */
+	OUT_BATCH(sizeof(struct gen8_interface_descriptor_data));
+	/* interface descriptor address, relative to the dynamic state base address */
+	OUT_BATCH(interface_descriptor);
+}
+
+void
+gen8_emit_media_state_flush(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2));
+	OUT_BATCH(0);
+}
+
+void
+gen8_emit_media_objects(struct intel_batchbuffer *batch,
+			unsigned x, unsigned y,
+			unsigned width, unsigned height)
+{
+	int i, j;
+
+	for (i = 0; i < width / 16; i++) {
+		for (j = 0; j < height / 16; j++) {
+			OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
+
+			/* interface descriptor offset */
+			OUT_BATCH(0);
+
+			/* without indirect data */
+			OUT_BATCH(0);
+			OUT_BATCH(0);
+
+			/* scoreboard */
+			OUT_BATCH(0);
+			OUT_BATCH(0);
+
+			/* inline data (xoffset, yoffset) */
+			OUT_BATCH(x + i * 16);
+			OUT_BATCH(y + j * 16);
+			gen8_emit_media_state_flush(batch);
+		}
+	}
+}
+void
+gen8lp_emit_media_objects(struct intel_batchbuffer *batch,
+			unsigned x, unsigned y,
+			unsigned width, unsigned height)
+{
+	int i, j;
+
+	for (i = 0; i < width / 16; i++) {
+		for (j = 0; j < height / 16; j++) {
+			OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
+
+			/* interface descriptor offset */
+			OUT_BATCH(0);
+
+			/* without indirect data */
+			OUT_BATCH(0);
+			OUT_BATCH(0);
+
+			/* scoreboard */
+			OUT_BATCH(0);
+			OUT_BATCH(0);
+
+			/* inline data (xoffset, yoffset) */
+			OUT_BATCH(x + i * 16);
+			OUT_BATCH(y + j * 16);
+		}
+	}
+}
+void
+gen8_emit_gpgpu_walk(struct intel_batchbuffer *batch,
+		     unsigned x, unsigned y,
+		     unsigned width, unsigned height)
+{
+	uint32_t x_dim, y_dim, tmp, right_mask;
+
+	/*
+	 * Simply do SIMD16 based dispatch, so every thread uses
+	 * SIMD16 channels.
+	 *
+	 * Define our own thread group size, e.g. 16x1 for every group, so
+	 * each group has 1 thread in SIMD16 dispatch. Thread
+	 * width/height/depth are therefore all 1.
+	 *
+	 * Then thread group X = width / 16 (width rounded up to 16),
+	 * thread group Y = height.
+	 */
+	x_dim = (width + 15) / 16;
+	y_dim = height;
+
+	tmp = width & 15;
+	if (tmp == 0)
+		right_mask = (1 << 16) - 1;
+	else
+		right_mask = (1 << tmp) - 1;
+
+	OUT_BATCH(GEN7_GPGPU_WALKER | 13);
+
+	OUT_BATCH(0); /* kernel offset */
+	OUT_BATCH(0); /* indirect data length */
+	OUT_BATCH(0); /* indirect data offset */
+
+	/* SIMD size, thread w/h/d */
+	OUT_BATCH(1 << 30 | /* SIMD16 */
+		  0 << 16 | /* depth:1 */
+		  0 << 8 | /* height:1 */
+		  0); /* width:1 */
+
+	/* thread group X */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(x_dim);
+
+	/* thread group Y */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(y_dim);
+
+	/* thread group Z */
+	OUT_BATCH(0);
+	OUT_BATCH(1);
+
+	/* right mask */
+	OUT_BATCH(right_mask);
+
+	/* bottom mask, height 1, always 0xffffffff */
+	OUT_BATCH(0xffffffff);
+}
+
+void
+gen9_emit_state_base_address(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (19 - 2));
+
+	/* general */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* stateless data port */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+
+	/* surface */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
+
+	/* dynamic */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
+		0, BASE_ADDRESS_MODIFY);
+
+	/* indirect */
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* instruction */
+	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
+
+	/* general state buffer size */
+	OUT_BATCH(0xfffff000 | 1);
+	/* dynamic state buffer size */
+	OUT_BATCH(1 << 12 | 1);
+	/* indirect object buffer size */
+	OUT_BATCH(0xfffff000 | 1);
+	/* instruction buffer size; the modify enable bit must be set, otherwise the GPU may hang */
+	OUT_BATCH(1 << 12 | 1);
+
+	/* Bindless surface state base address */
+	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+	OUT_BATCH(0xfffff000);
+}
diff --git a/lib/gpu_fill.h b/lib/gpu_fill.h
new file mode 100644
index 00000000..87e62c86
--- /dev/null
+++ b/lib/gpu_fill.h
@@ -0,0 +1,167 @@
+/*
+ * Copyright © 2018 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#ifndef GPU_FILL_H
+#define GPU_FILL_H
+
+#include <intel_bufmgr.h>
+#include <i915_drm.h>
+
+#include "media_fill.h"
+#include "gen7_media.h"
+#include "gen8_media.h"
+#include "intel_reg.h"
+#include "drmtest.h"
+#include "intel_batchbuffer.h"
+#include "intel_chipset.h"
+#include <assert.h>
+
+uint32_t
+batch_used(struct intel_batchbuffer *batch);
+
+uint32_t
+batch_align(struct intel_batchbuffer *batch, uint32_t align);
+
+void *
+batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align);
+
+uint32_t
+batch_offset(struct intel_batchbuffer *batch, void *ptr);
+
+uint32_t
+batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size, uint32_t align);
+
+void
+gen7_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end);
+
+uint32_t
+gen7_fill_curbe_buffer_data(struct intel_batchbuffer *batch,
+			uint8_t color);
+
+uint32_t
+gen7_fill_surface_state(struct intel_batchbuffer *batch,
+			struct igt_buf *buf,
+			uint32_t format,
+			int is_dst);
+
+uint32_t
+gen7_fill_binding_table(struct intel_batchbuffer *batch,
+			struct igt_buf *dst);
+
+uint32_t
+gen7_fill_media_kernel(struct intel_batchbuffer *batch,
+		const uint32_t kernel[][4],
+		size_t size);
+
+uint32_t
+gen8_fill_media_kernel(struct intel_batchbuffer *batch,
+		const uint32_t kernel[][4],
+		size_t size);
+
+uint32_t
+gen7_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *dst,
+			       const uint32_t kernel[][4], size_t size);
+
+void
+gen7_emit_state_base_address(struct intel_batchbuffer *batch);
+
+void
+gen7_emit_vfe_state(struct intel_batchbuffer *batch);
+
+void
+gen7_emit_vfe_state_gpgpu(struct intel_batchbuffer *batch);
+
+void
+gen7_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer);
+
+void
+gen7_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor);
+
+void
+gen7_emit_media_objects(struct intel_batchbuffer *batch,
+			unsigned x, unsigned y,
+			unsigned width, unsigned height);
+
+void
+gen7_emit_gpgpu_walk(struct intel_batchbuffer *batch,
+		     unsigned x, unsigned y,
+		     unsigned width, unsigned height);
+
+void
+gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end);
+
+uint32_t
+gen8_fill_curbe_buffer_data(struct intel_batchbuffer *batch,
+			uint8_t color);
+
+uint32_t
+gen8_fill_surface_state(struct intel_batchbuffer *batch,
+			struct igt_buf *buf,
+			uint32_t format,
+			int is_dst);
+
+uint32_t
+gen8_fill_binding_table(struct intel_batchbuffer *batch,
+			struct igt_buf *dst);
+
+uint32_t
+gen8_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *dst, const uint32_t kernel[][4], size_t size)
+
+void
+gen8_emit_state_base_address(struct intel_batchbuffer *batch);
+
+void
+gen8_emit_vfe_state(struct intel_batchbuffer *batch);
+
+void
+gen8_emit_vfe_state_gpgpu(struct intel_batchbuffer *batch);
+
+void
+gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer);
+
+void
+gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor);
+
+void
+gen8_emit_media_state_flush(struct intel_batchbuffer *batch);
+
+void
+gen8_emit_media_objects(struct intel_batchbuffer *batch,
+			unsigned x, unsigned y,
+			unsigned width, unsigned height);
+
+void
+gen8lp_emit_media_objects(struct intel_batchbuffer *batch,
+			unsigned x, unsigned y,
+			unsigned width, unsigned height);
+
+void
+gen8_emit_gpgpu_walk(struct intel_batchbuffer *batch,
+		     unsigned x, unsigned y,
+		     unsigned width, unsigned height);
+
+void
+gen9_emit_state_base_address(struct intel_batchbuffer *batch);
+
+#endif /* GPU_FILL_H */
diff --git a/lib/media_fill_gen7.c b/lib/media_fill_gen7.c
index 6fb44798..c97555a6 100644
--- a/lib/media_fill_gen7.c
+++ b/lib/media_fill_gen7.c
@@ -5,7 +5,7 @@
 #include "gen7_media.h"
 #include "intel_reg.h"
 #include "drmtest.h"
-
+#include "gpu_fill.h"
 #include <assert.h>
 
 static const uint32_t media_kernel[][4] = {
@@ -22,275 +22,6 @@ static const uint32_t media_kernel[][4] = {
 	{ 0x07800031, 0x20001ca8, 0x00000e00, 0x82000010 },
 };
 
-static uint32_t
-batch_used(struct intel_batchbuffer *batch)
-{
-	return batch->ptr - batch->buffer;
-}
-
-static uint32_t
-batch_align(struct intel_batchbuffer *batch, uint32_t align)
-{
-	uint32_t offset = batch_used(batch);
-	offset = ALIGN(offset, align);
-	batch->ptr = batch->buffer + offset;
-	return offset;
-}
-
-static void *
-batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align)
-{
-	uint32_t offset = batch_align(batch, align);
-	batch->ptr += size;
-	return memset(batch->buffer + offset, 0, size);
-}
-
-static uint32_t
-batch_offset(struct intel_batchbuffer *batch, void *ptr)
-{
-	return (uint8_t *)ptr - batch->buffer;
-}
-
-static uint32_t
-batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size, uint32_t align)
-{
-	return batch_offset(batch, memcpy(batch_alloc(batch, size, align), ptr, size));
-}
-
-static void
-gen7_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end)
-{
-	int ret;
-
-	ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
-	if (ret == 0)
-		ret = drm_intel_bo_mrb_exec(batch->bo, batch_end,
-					NULL, 0, 0, 0);
-	igt_assert(ret == 0);
-}
-
-static uint32_t
-gen7_fill_curbe_buffer_data(struct intel_batchbuffer *batch,
-			uint8_t color)
-{
-	uint8_t *curbe_buffer;
-	uint32_t offset;
-
-	curbe_buffer = batch_alloc(batch, sizeof(uint32_t) * 8, 64);
-	offset = batch_offset(batch, curbe_buffer);
-	*curbe_buffer = color;
-
-	return offset;
-}
-
-static uint32_t
-gen7_fill_surface_state(struct intel_batchbuffer *batch,
-			struct igt_buf *buf,
-			uint32_t format,
-			int is_dst)
-{
-	struct gen7_surface_state *ss;
-	uint32_t write_domain, read_domain, offset;
-	int ret;
-
-	if (is_dst) {
-		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
-	} else {
-		write_domain = 0;
-		read_domain = I915_GEM_DOMAIN_SAMPLER;
-	}
-
-	ss = batch_alloc(batch, sizeof(*ss), 64);
-	offset = batch_offset(batch, ss);
-
-	ss->ss0.surface_type = GEN7_SURFACE_2D;
-	ss->ss0.surface_format = format;
-	ss->ss0.render_cache_read_write = 1;
-
-	if (buf->tiling == I915_TILING_X)
-		ss->ss0.tiled_mode = 2;
-	else if (buf->tiling == I915_TILING_Y)
-		ss->ss0.tiled_mode = 3;
-
-	ss->ss1.base_addr = buf->bo->offset;
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				batch_offset(batch, ss) + 4,
-				buf->bo, 0,
-				read_domain, write_domain);
-	igt_assert(ret == 0);
-
-	ss->ss2.height = igt_buf_height(buf) - 1;
-	ss->ss2.width  = igt_buf_width(buf) - 1;
-
-	ss->ss3.pitch  = buf->stride - 1;
-
-	ss->ss7.shader_chanel_select_r = 4;
-	ss->ss7.shader_chanel_select_g = 5;
-	ss->ss7.shader_chanel_select_b = 6;
-	ss->ss7.shader_chanel_select_a = 7;
-
-	return offset;
-}
-
-static uint32_t
-gen7_fill_binding_table(struct intel_batchbuffer *batch,
-			struct igt_buf *dst)
-{
-	uint32_t *binding_table, offset;
-
-	binding_table = batch_alloc(batch, 32, 64);
-	offset = batch_offset(batch, binding_table);
-
-	binding_table[0] = gen7_fill_surface_state(batch, dst, GEN7_SURFACEFORMAT_R8_UNORM, 1);
-
-	return offset;
-}
-
-static uint32_t
-gen7_fill_media_kernel(struct intel_batchbuffer *batch,
-		const uint32_t kernel[][4],
-		size_t size)
-{
-	uint32_t offset;
-
-	offset = batch_copy(batch, kernel, size, 64);
-
-	return offset;
-}
-
-static uint32_t
-gen7_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *dst,
-			       const uint32_t kernel[][4], size_t size)
-{
-	struct gen7_interface_descriptor_data *idd;
-	uint32_t offset;
-	uint32_t binding_table_offset, kernel_offset;
-
-	binding_table_offset = gen7_fill_binding_table(batch, dst);
-	kernel_offset = gen7_fill_media_kernel(batch, kernel, size);
-
-	idd = batch_alloc(batch, sizeof(*idd), 64);
-	offset = batch_offset(batch, idd);
-
-	idd->desc0.kernel_start_pointer = (kernel_offset >> 6);
-
-	idd->desc1.single_program_flow = 1;
-	idd->desc1.floating_point_mode = GEN7_FLOATING_POINT_IEEE_754;
-
-	idd->desc2.sampler_count = 0;      /* 0 samplers used */
-	idd->desc2.sampler_state_pointer = 0;
-
-	idd->desc3.binding_table_entry_count = 0;
-	idd->desc3.binding_table_pointer = (binding_table_offset >> 5);
-
-	idd->desc4.constant_urb_entry_read_offset = 0;
-	idd->desc4.constant_urb_entry_read_length = 1; /* grf 1 */
-
-	return offset;
-}
-
-static void
-gen7_emit_state_base_address(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN7_STATE_BASE_ADDRESS | (10 - 2));
-
-	/* general */
-	OUT_BATCH(0);
-
-	/* surface */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-
-	/* dynamic */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-
-	/* indirect */
-	OUT_BATCH(0);
-
-	/* instruction */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-
-	/* general/dynamic/indirect/instruction access Bound */
-	OUT_BATCH(0);
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-}
-
-static void
-gen7_emit_vfe_state(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN7_MEDIA_VFE_STATE | (8 - 2));
-
-	/* scratch buffer */
-	OUT_BATCH(0);
-
-	/* number of threads & urb entries */
-	OUT_BATCH(1 << 16 |
-		2 << 8);
-
-	OUT_BATCH(0);
-
-	/* urb entry size & curbe size */
-	OUT_BATCH(2 << 16 | 	/* in 256 bits unit */
-		2);		/* in 256 bits unit */
-
-	/* scoreboard */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-}
-
-static void
-gen7_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
-{
-	OUT_BATCH(GEN7_MEDIA_CURBE_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* curbe total data length */
-	OUT_BATCH(64);
-	/* curbe data start address, is relative to the dynamics base address */
-	OUT_BATCH(curbe_buffer);
-}
-
-static void
-gen7_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor)
-{
-	OUT_BATCH(GEN7_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* interface descriptor data length */
-	OUT_BATCH(sizeof(struct gen7_interface_descriptor_data));
-	/* interface descriptor address, is relative to the dynamics base address */
-	OUT_BATCH(interface_descriptor);
-}
-
-static void
-gen7_emit_media_objects(struct intel_batchbuffer *batch,
-			unsigned x, unsigned y,
-			unsigned width, unsigned height)
-{
-	int i, j;
-
-	for (i = 0; i < width / 16; i++) {
-		for (j = 0; j < height / 16; j++) {
-			OUT_BATCH(GEN7_MEDIA_OBJECT | (8 - 2));
-
-			/* interface descriptor offset */
-			OUT_BATCH(0);
-
-			/* without indirect data */
-			OUT_BATCH(0);
-			OUT_BATCH(0);
-
-			/* scoreboard */
-			OUT_BATCH(0);
-			OUT_BATCH(0);
-
-			/* inline data (xoffset, yoffset) */
-			OUT_BATCH(x + i * 16);
-			OUT_BATCH(y + j * 16);
-		}
-	}
-}
-
 /*
  * This sets up the media pipeline,
  *
diff --git a/lib/media_fill_gen8.c b/lib/media_fill_gen8.c
index 4a8fe5a2..4270997e 100644
--- a/lib/media_fill_gen8.c
+++ b/lib/media_fill_gen8.c
@@ -5,7 +5,7 @@
 #include "gen8_media.h"
 #include "intel_reg.h"
 #include "drmtest.h"
-
+#include "gpu_fill.h"
 #include <assert.h>
 
 
@@ -23,293 +23,7 @@ static const uint32_t media_kernel[][4] = {
 	{ 0x07800031, 0x20000a40, 0x0e000e00, 0x82000010 },
 };
 
-static uint32_t
-batch_used(struct intel_batchbuffer *batch)
-{
-	return batch->ptr - batch->buffer;
-}
-
-static uint32_t
-batch_align(struct intel_batchbuffer *batch, uint32_t align)
-{
-	uint32_t offset = batch_used(batch);
-	offset = ALIGN(offset, align);
-	batch->ptr = batch->buffer + offset;
-	return offset;
-}
-
-static void *
-batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align)
-{
-	uint32_t offset = batch_align(batch, align);
-	batch->ptr += size;
-	return memset(batch->buffer + offset, 0, size);
-}
-
-static uint32_t
-batch_offset(struct intel_batchbuffer *batch, void *ptr)
-{
-	return (uint8_t *)ptr - batch->buffer;
-}
-
-static uint32_t
-batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size, uint32_t align)
-{
-	return batch_offset(batch, memcpy(batch_alloc(batch, size, align), ptr, size));
-}
-
-static void
-gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end)
-{
-	int ret;
-
-	ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
-	if (ret == 0)
-		ret = drm_intel_bo_mrb_exec(batch->bo, batch_end,
-					NULL, 0, 0, 0);
-	igt_assert(ret == 0);
-}
-
-static uint32_t
-gen8_fill_curbe_buffer_data(struct intel_batchbuffer *batch,
-			uint8_t color)
-{
-	uint8_t *curbe_buffer;
-	uint32_t offset;
-
-	curbe_buffer = batch_alloc(batch, sizeof(uint32_t) * 8, 64);
-	offset = batch_offset(batch, curbe_buffer);
-	*curbe_buffer = color;
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_surface_state(struct intel_batchbuffer *batch,
-			struct igt_buf *buf,
-			uint32_t format,
-			int is_dst)
-{
-	struct gen8_surface_state *ss;
-	uint32_t write_domain, read_domain, offset;
-	int ret;
-
-	if (is_dst) {
-		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
-	} else {
-		write_domain = 0;
-		read_domain = I915_GEM_DOMAIN_SAMPLER;
-	}
-
-	ss = batch_alloc(batch, sizeof(*ss), 64);
-	offset = batch_offset(batch, ss);
-
-	ss->ss0.surface_type = GEN8_SURFACE_2D;
-	ss->ss0.surface_format = format;
-	ss->ss0.render_cache_read_write = 1;
-	ss->ss0.vertical_alignment = 1; /* align 4 */
-	ss->ss0.horizontal_alignment = 1; /* align 4 */
-
-	if (buf->tiling == I915_TILING_X)
-		ss->ss0.tiled_mode = 2;
-	else if (buf->tiling == I915_TILING_Y)
-		ss->ss0.tiled_mode = 3;
-
-	ss->ss8.base_addr = buf->bo->offset;
-
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				batch_offset(batch, ss) + 8 * 4,
-				buf->bo, 0,
-				read_domain, write_domain);
-	igt_assert(ret == 0);
-
-	ss->ss2.height = igt_buf_height(buf) - 1;
-	ss->ss2.width  = igt_buf_width(buf) - 1;
-	ss->ss3.pitch  = buf->stride - 1;
-
-	ss->ss7.shader_chanel_select_r = 4;
-	ss->ss7.shader_chanel_select_g = 5;
-	ss->ss7.shader_chanel_select_b = 6;
-	ss->ss7.shader_chanel_select_a = 7;
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_binding_table(struct intel_batchbuffer *batch,
-			struct igt_buf *dst)
-{
-	uint32_t *binding_table, offset;
-
-	binding_table = batch_alloc(batch, 32, 64);
-	offset = batch_offset(batch, binding_table);
-
-	binding_table[0] = gen8_fill_surface_state(batch, dst, GEN8_SURFACEFORMAT_R8_UNORM, 1);
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_media_kernel(struct intel_batchbuffer *batch,
-		const uint32_t kernel[][4],
-		size_t size)
-{
-	uint32_t offset;
-
-	offset = batch_copy(batch, kernel, size, 64);
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *dst)
-{
-	struct gen8_interface_descriptor_data *idd;
-	uint32_t offset;
-	uint32_t binding_table_offset, kernel_offset;
-
-	binding_table_offset = gen8_fill_binding_table(batch, dst);
-	kernel_offset = gen8_fill_media_kernel(batch, media_kernel, sizeof(media_kernel));
-
-	idd = batch_alloc(batch, sizeof(*idd), 64);
-	offset = batch_offset(batch, idd);
-
-	idd->desc0.kernel_start_pointer = (kernel_offset >> 6);
-
-	idd->desc2.single_program_flow = 1;
-	idd->desc2.floating_point_mode = GEN8_FLOATING_POINT_IEEE_754;
-
-	idd->desc3.sampler_count = 0;      /* 0 samplers used */
-	idd->desc3.sampler_state_pointer = 0;
-
-	idd->desc4.binding_table_entry_count = 0;
-	idd->desc4.binding_table_pointer = (binding_table_offset >> 5);
-
-	idd->desc5.constant_urb_entry_read_offset = 0;
-	idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */
-
-	return offset;
-}
-
-static void
-gen8_emit_state_base_address(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (16 - 2));
-
-	/* general */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
-
-	/* stateless data port */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-
-	/* surface */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
-
-	/* dynamic */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
-		0, BASE_ADDRESS_MODIFY);
-
-	/* indirect */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	/* instruction */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-
-	/* general state buffer size */
-	OUT_BATCH(0xfffff000 | 1);
-	/* dynamic state buffer size */
-	OUT_BATCH(1 << 12 | 1);
-	/* indirect object buffer size */
-	OUT_BATCH(0xfffff000 | 1);
-	/* intruction buffer size, must set modify enable bit, otherwise it may result in GPU hang */
-	OUT_BATCH(1 << 12 | 1);
-}
-
-static void
-gen8_emit_vfe_state(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2));
-
-	/* scratch buffer */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	/* number of threads & urb entries */
-	OUT_BATCH(1 << 16 |
-		2 << 8);
-
-	OUT_BATCH(0);
-
-	/* urb entry size & curbe size */
-	OUT_BATCH(2 << 16 |
-		2);
-
-	/* scoreboard */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-}
-
-static void
-gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
-{
-	OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* curbe total data length */
-	OUT_BATCH(64);
-	/* curbe data start address, is relative to the dynamics base address */
-	OUT_BATCH(curbe_buffer);
-}
 
-static void
-gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor)
-{
-	OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* interface descriptor data length */
-	OUT_BATCH(sizeof(struct gen8_interface_descriptor_data));
-	/* interface descriptor address, is relative to the dynamics base address */
-	OUT_BATCH(interface_descriptor);
-}
-
-static void
-gen8_emit_media_state_flush(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2));
-	OUT_BATCH(0);
-}
-
-static void
-gen8_emit_media_objects(struct intel_batchbuffer *batch,
-			unsigned x, unsigned y,
-			unsigned width, unsigned height)
-{
-	int i, j;
-
-	for (i = 0; i < width / 16; i++) {
-		for (j = 0; j < height / 16; j++) {
-			OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
-
-			/* interface descriptor offset */
-			OUT_BATCH(0);
-
-			/* without indirect data */
-			OUT_BATCH(0);
-			OUT_BATCH(0);
-
-			/* scoreboard */
-			OUT_BATCH(0);
-			OUT_BATCH(0);
-
-			/* inline data (xoffset, yoffset) */
-			OUT_BATCH(x + i * 16);
-			OUT_BATCH(y + j * 16);
-			gen8_emit_media_state_flush(batch);
-		}
-	}
-}
 
 /*
  * This sets up the media pipeline,
@@ -349,7 +63,7 @@ gen8_media_fillfunc(struct intel_batchbuffer *batch,
 	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
 
 	curbe_buffer = gen8_fill_curbe_buffer_data(batch, color);
-	interface_descriptor = gen8_fill_interface_descriptor(batch, dst);
+	interface_descriptor = gen8_fill_interface_descriptor(batch, dst, media_kernel, sizeof(media_kernel));
 	igt_assert(batch->ptr < &batch->buffer[4095]);
 
 	/* media pipeline */
diff --git a/lib/media_fill_gen8lp.c b/lib/media_fill_gen8lp.c
index 1f8a4adc..dcc11982 100644
--- a/lib/media_fill_gen8lp.c
+++ b/lib/media_fill_gen8lp.c
@@ -5,7 +5,7 @@
 #include "gen8_media.h"
 #include "intel_reg.h"
 #include "drmtest.h"
-
+#include "gpu_fill.h"
 #include <assert.h>
 
 
@@ -23,286 +23,6 @@ static const uint32_t media_kernel[][4] = {
 	{ 0x07800031, 0x20000a40, 0x0e000e00, 0x82000010 },
 };
 
-static uint32_t
-batch_used(struct intel_batchbuffer *batch)
-{
-	return batch->ptr - batch->buffer;
-}
-
-static uint32_t
-batch_align(struct intel_batchbuffer *batch, uint32_t align)
-{
-	uint32_t offset = batch_used(batch);
-	offset = ALIGN(offset, align);
-	batch->ptr = batch->buffer + offset;
-	return offset;
-}
-
-static void *
-batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align)
-{
-	uint32_t offset = batch_align(batch, align);
-	batch->ptr += size;
-	return memset(batch->buffer + offset, 0, size);
-}
-
-static uint32_t
-batch_offset(struct intel_batchbuffer *batch, void *ptr)
-{
-	return (uint8_t *)ptr - batch->buffer;
-}
-
-static uint32_t
-batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size, uint32_t align)
-{
-	return batch_offset(batch, memcpy(batch_alloc(batch, size, align), ptr, size));
-}
-
-static void
-gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end)
-{
-	int ret;
-
-	ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
-	if (ret == 0)
-		ret = drm_intel_bo_mrb_exec(batch->bo, batch_end,
-					NULL, 0, 0, 0);
-	igt_assert(ret == 0);
-}
-
-static uint32_t
-gen8_fill_curbe_buffer_data(struct intel_batchbuffer *batch,
-			uint8_t color)
-{
-	uint8_t *curbe_buffer;
-	uint32_t offset;
-
-	curbe_buffer = batch_alloc(batch, sizeof(uint32_t) * 8, 64);
-	offset = batch_offset(batch, curbe_buffer);
-	*curbe_buffer = color;
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_surface_state(struct intel_batchbuffer *batch,
-			struct igt_buf *buf,
-			uint32_t format,
-			int is_dst)
-{
-	struct gen8_surface_state *ss;
-	uint32_t write_domain, read_domain, offset;
-	int ret;
-
-	if (is_dst) {
-		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
-	} else {
-		write_domain = 0;
-		read_domain = I915_GEM_DOMAIN_SAMPLER;
-	}
-
-	ss = batch_alloc(batch, sizeof(*ss), 64);
-	offset = batch_offset(batch, ss);
-
-	ss->ss0.surface_type = GEN8_SURFACE_2D;
-	ss->ss0.surface_format = format;
-	ss->ss0.render_cache_read_write = 1;
-	ss->ss0.vertical_alignment = 1; /* align 4 */
-	ss->ss0.horizontal_alignment = 1; /* align 4 */
-
-	if (buf->tiling == I915_TILING_X)
-		ss->ss0.tiled_mode = 2;
-	else if (buf->tiling == I915_TILING_Y)
-		ss->ss0.tiled_mode = 3;
-
-	ss->ss8.base_addr = buf->bo->offset;
-
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				batch_offset(batch, ss) + 8 * 4,
-				buf->bo, 0,
-				read_domain, write_domain);
-	igt_assert(ret == 0);
-
-	ss->ss2.height = igt_buf_height(buf) - 1;
-	ss->ss2.width  = igt_buf_width(buf) - 1;
-	ss->ss3.pitch  = buf->stride - 1;
-
-	ss->ss7.shader_chanel_select_r = 4;
-	ss->ss7.shader_chanel_select_g = 5;
-	ss->ss7.shader_chanel_select_b = 6;
-	ss->ss7.shader_chanel_select_a = 7;
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_binding_table(struct intel_batchbuffer *batch,
-			struct igt_buf *dst)
-{
-	uint32_t *binding_table, offset;
-
-	binding_table = batch_alloc(batch, 32, 64);
-	offset = batch_offset(batch, binding_table);
-
-	binding_table[0] = gen8_fill_surface_state(batch, dst, GEN8_SURFACEFORMAT_R8_UNORM, 1);
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_media_kernel(struct intel_batchbuffer *batch,
-		const uint32_t kernel[][4],
-		size_t size)
-{
-	uint32_t offset;
-
-	offset = batch_copy(batch, kernel, size, 64);
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *dst)
-{
-	struct gen8_interface_descriptor_data *idd;
-	uint32_t offset;
-	uint32_t binding_table_offset, kernel_offset;
-
-	binding_table_offset = gen8_fill_binding_table(batch, dst);
-	kernel_offset = gen8_fill_media_kernel(batch, media_kernel, sizeof(media_kernel));
-
-	idd = batch_alloc(batch, sizeof(*idd), 64);
-	offset = batch_offset(batch, idd);
-
-	idd->desc0.kernel_start_pointer = (kernel_offset >> 6);
-
-	idd->desc2.single_program_flow = 1;
-	idd->desc2.floating_point_mode = GEN8_FLOATING_POINT_IEEE_754;
-
-	idd->desc3.sampler_count = 0;      /* 0 samplers used */
-	idd->desc3.sampler_state_pointer = 0;
-
-	idd->desc4.binding_table_entry_count = 0;
-	idd->desc4.binding_table_pointer = (binding_table_offset >> 5);
-
-	idd->desc5.constant_urb_entry_read_offset = 0;
-	idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */
-
-	return offset;
-}
-
-static void
-gen8_emit_state_base_address(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (16 - 2));
-
-	/* general */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
-
-	/* stateless data port */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-
-	/* surface */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
-
-	/* dynamic */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
-		0, BASE_ADDRESS_MODIFY);
-
-	/* indirect */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	/* instruction */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-
-	/* general state buffer size */
-	OUT_BATCH(0xfffff000 | 1);
-	/* dynamic state buffer size */
-	OUT_BATCH(1 << 12 | 1);
-	/* indirect object buffer size */
-	OUT_BATCH(0xfffff000 | 1);
-	/* intruction buffer size, must set modify enable bit, otherwise it may result in GPU hang */
-	OUT_BATCH(1 << 12 | 1);
-}
-
-static void
-gen8_emit_vfe_state(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2));
-
-	/* scratch buffer */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	/* number of threads & urb entries */
-	OUT_BATCH(1 << 16 |
-		2 << 8);
-
-	OUT_BATCH(0);
-
-	/* urb entry size & curbe size */
-	OUT_BATCH(2 << 16 |
-		2);
-
-	/* scoreboard */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-}
-
-static void
-gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
-{
-	OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* curbe total data length */
-	OUT_BATCH(64);
-	/* curbe data start address, is relative to the dynamics base address */
-	OUT_BATCH(curbe_buffer);
-}
-
-static void
-gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor)
-{
-	OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* interface descriptor data length */
-	OUT_BATCH(sizeof(struct gen8_interface_descriptor_data));
-	/* interface descriptor address, is relative to the dynamics base address */
-	OUT_BATCH(interface_descriptor);
-}
-
-static void
-gen8lp_emit_media_objects(struct intel_batchbuffer *batch,
-			unsigned x, unsigned y,
-			unsigned width, unsigned height)
-{
-	int i, j;
-
-	for (i = 0; i < width / 16; i++) {
-		for (j = 0; j < height / 16; j++) {
-			OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
-
-			/* interface descriptor offset */
-			OUT_BATCH(0);
-
-			/* without indirect data */
-			OUT_BATCH(0);
-			OUT_BATCH(0);
-
-			/* scoreboard */
-			OUT_BATCH(0);
-			OUT_BATCH(0);
-
-			/* inline data (xoffset, yoffset) */
-			OUT_BATCH(x + i * 16);
-			OUT_BATCH(y + j * 16);
-		}
-	}
-}
-
 /*
  * This sets up the media pipeline,
  *
@@ -341,7 +61,7 @@ gen8lp_media_fillfunc(struct intel_batchbuffer *batch,
 	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
 
 	curbe_buffer = gen8_fill_curbe_buffer_data(batch, color);
-	interface_descriptor = gen8_fill_interface_descriptor(batch, dst);
+	interface_descriptor = gen8_fill_interface_descriptor(batch, dst, media_kernel, sizeof(media_kernel));
 	igt_assert(batch->ptr < &batch->buffer[4095]);
 
 	/* media pipeline */
diff --git a/lib/media_fill_gen9.c b/lib/media_fill_gen9.c
index 3fd21819..6accdbe4 100644
--- a/lib/media_fill_gen9.c
+++ b/lib/media_fill_gen9.c
@@ -4,11 +4,9 @@
 #include "media_fill.h"
 #include "gen8_media.h"
 #include "intel_reg.h"
-
+#include "gpu_fill.h"
 #include <assert.h>
 
-#define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1))
-
 static const uint32_t media_kernel[][4] = {
 	{ 0x00400001, 0x20202288, 0x00000020, 0x00000000 },
 	{ 0x00600001, 0x20800208, 0x008d0000, 0x00000000 },
@@ -23,298 +21,6 @@ static const uint32_t media_kernel[][4] = {
 	{ 0x07800031, 0x20000a40, 0x0e000e00, 0x82000010 },
 };
 
-static uint32_t
-batch_used(struct intel_batchbuffer *batch)
-{
-	return batch->ptr - batch->buffer;
-}
-
-static uint32_t
-batch_align(struct intel_batchbuffer *batch, uint32_t align)
-{
-	uint32_t offset = batch_used(batch);
-	offset = ALIGN(offset, align);
-	batch->ptr = batch->buffer + offset;
-	return offset;
-}
-
-static void *
-batch_alloc(struct intel_batchbuffer *batch, uint32_t size, uint32_t align)
-{
-	uint32_t offset = batch_align(batch, align);
-	batch->ptr += size;
-	return memset(batch->buffer + offset, 0, size);
-}
-
-static uint32_t
-batch_offset(struct intel_batchbuffer *batch, void *ptr)
-{
-	return (uint8_t *)ptr - batch->buffer;
-}
-
-static uint32_t
-batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size, uint32_t align)
-{
-	return batch_offset(batch, memcpy(batch_alloc(batch, size, align), ptr, size));
-}
-
-static void
-gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end)
-{
-	int ret;
-
-	ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
-	if (ret == 0)
-		ret = drm_intel_bo_mrb_exec(batch->bo, batch_end,
-					NULL, 0, 0, 0);
-	assert(ret == 0);
-}
-
-static uint32_t
-gen8_fill_curbe_buffer_data(struct intel_batchbuffer *batch,
-			uint8_t color)
-{
-	uint8_t *curbe_buffer;
-	uint32_t offset;
-
-	curbe_buffer = batch_alloc(batch, sizeof(uint32_t) * 8, 64);
-	offset = batch_offset(batch, curbe_buffer);
-	*curbe_buffer = color;
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_surface_state(struct intel_batchbuffer *batch,
-			struct igt_buf *buf,
-			uint32_t format,
-			int is_dst)
-{
-	struct gen8_surface_state *ss;
-	uint32_t write_domain, read_domain, offset;
-	int ret;
-
-	if (is_dst) {
-		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
-	} else {
-		write_domain = 0;
-		read_domain = I915_GEM_DOMAIN_SAMPLER;
-	}
-
-	ss = batch_alloc(batch, sizeof(*ss), 64);
-	offset = batch_offset(batch, ss);
-
-	ss->ss0.surface_type = GEN8_SURFACE_2D;
-	ss->ss0.surface_format = format;
-	ss->ss0.render_cache_read_write = 1;
-	ss->ss0.vertical_alignment = 1; /* align 4 */
-	ss->ss0.horizontal_alignment = 1; /* align 4 */
-
-	if (buf->tiling == I915_TILING_X)
-		ss->ss0.tiled_mode = 2;
-	else if (buf->tiling == I915_TILING_Y)
-		ss->ss0.tiled_mode = 3;
-
-	ss->ss8.base_addr = buf->bo->offset;
-
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				batch_offset(batch, ss) + 8 * 4,
-				buf->bo, 0,
-				read_domain, write_domain);
-	assert(ret == 0);
-
-	ss->ss2.height = igt_buf_height(buf) - 1;
-	ss->ss2.width  = igt_buf_width(buf) - 1;
-	ss->ss3.pitch  = buf->stride - 1;
-
-	ss->ss7.shader_chanel_select_r = 4;
-	ss->ss7.shader_chanel_select_g = 5;
-	ss->ss7.shader_chanel_select_b = 6;
-	ss->ss7.shader_chanel_select_a = 7;
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_binding_table(struct intel_batchbuffer *batch,
-			struct igt_buf *dst)
-{
-	uint32_t *binding_table, offset;
-
-	binding_table = batch_alloc(batch, 32, 64);
-	offset = batch_offset(batch, binding_table);
-
-	binding_table[0] = gen8_fill_surface_state(batch, dst, GEN8_SURFACEFORMAT_R8_UNORM, 1);
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_media_kernel(struct intel_batchbuffer *batch,
-		const uint32_t kernel[][4],
-		size_t size)
-{
-	uint32_t offset;
-
-	offset = batch_copy(batch, kernel, size, 64);
-
-	return offset;
-}
-
-static uint32_t
-gen8_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *dst)
-{
-	struct gen8_interface_descriptor_data *idd;
-	uint32_t offset;
-	uint32_t binding_table_offset, kernel_offset;
-
-	binding_table_offset = gen8_fill_binding_table(batch, dst);
-	kernel_offset = gen8_fill_media_kernel(batch, media_kernel, sizeof(media_kernel));
-
-	idd = batch_alloc(batch, sizeof(*idd), 64);
-	offset = batch_offset(batch, idd);
-
-	idd->desc0.kernel_start_pointer = (kernel_offset >> 6);
-
-	idd->desc2.single_program_flow = 1;
-	idd->desc2.floating_point_mode = GEN8_FLOATING_POINT_IEEE_754;
-
-	idd->desc3.sampler_count = 0;      /* 0 samplers used */
-	idd->desc3.sampler_state_pointer = 0;
-
-	idd->desc4.binding_table_entry_count = 0;
-	idd->desc4.binding_table_pointer = (binding_table_offset >> 5);
-
-	idd->desc5.constant_urb_entry_read_offset = 0;
-	idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */
-
-	return offset;
-}
-
-static void
-gen9_emit_state_base_address(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_STATE_BASE_ADDRESS | (19 - 2));
-
-	/* general */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
-
-	/* stateless data port */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-
-	/* surface */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
-
-	/* dynamic */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
-		0, BASE_ADDRESS_MODIFY);
-
-	/* indirect */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	/* instruction */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-
-	/* general state buffer size */
-	OUT_BATCH(0xfffff000 | 1);
-	/* dynamic state buffer size */
-	OUT_BATCH(1 << 12 | 1);
-	/* indirect object buffer size */
-	OUT_BATCH(0xfffff000 | 1);
-	/* intruction buffer size, must set modify enable bit, otherwise it may result in GPU hang */
-	OUT_BATCH(1 << 12 | 1);
-
-	/* Bindless surface state base address */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
-	OUT_BATCH(0xfffff000);
-}
-
-static void
-gen8_emit_vfe_state(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2));
-
-	/* scratch buffer */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	/* number of threads & urb entries */
-	OUT_BATCH(1 << 16 |
-		2 << 8);
-
-	OUT_BATCH(0);
-
-	/* urb entry size & curbe size */
-	OUT_BATCH(2 << 16 |
-		2);
-
-	/* scoreboard */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-}
-
-static void
-gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
-{
-	OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* curbe total data length */
-	OUT_BATCH(64);
-	/* curbe data start address, is relative to the dynamics base address */
-	OUT_BATCH(curbe_buffer);
-}
-
-static void
-gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor)
-{
-	OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* interface descriptor data length */
-	OUT_BATCH(sizeof(struct gen8_interface_descriptor_data));
-	/* interface descriptor address, is relative to the dynamics base address */
-	OUT_BATCH(interface_descriptor);
-}
-
-static void
-gen8_emit_media_state_flush(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2));
-	OUT_BATCH(0);
-}
-
-static void
-gen8_emit_media_objects(struct intel_batchbuffer *batch,
-			unsigned x, unsigned y,
-			unsigned width, unsigned height)
-{
-	int i, j;
-
-	for (i = 0; i < width / 16; i++) {
-		for (j = 0; j < height / 16; j++) {
-			OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
-
-			/* interface descriptor offset */
-			OUT_BATCH(0);
-
-			/* without indirect data */
-			OUT_BATCH(0);
-			OUT_BATCH(0);
-
-			/* scoreboard */
-			OUT_BATCH(0);
-			OUT_BATCH(0);
-
-			/* inline data (xoffset, yoffset) */
-			OUT_BATCH(x + i * 16);
-			OUT_BATCH(y + j * 16);
-			gen8_emit_media_state_flush(batch);
-		}
-	}
-}
 
 /*
  * This sets up the media pipeline,
@@ -354,7 +60,7 @@ gen9_media_fillfunc(struct intel_batchbuffer *batch,
 	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
 
 	curbe_buffer = gen8_fill_curbe_buffer_data(batch, color);
-	interface_descriptor = gen8_fill_interface_descriptor(batch, dst);
+	interface_descriptor = gen8_fill_interface_descriptor(batch, dst, media_kernel, sizeof(media_kernel));
 	assert(batch->ptr < &batch->buffer[4095]);
 
 	/* media pipeline */
diff --git a/lib/meson.build b/lib/meson.build
index b3b8b14a..385e08b9 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -30,6 +30,7 @@ lib_sources = [
 	'media_fill_gen9.c',
 	'media_spin.c',
 	'gpgpu_fill.c',
+	'gpu_fill.c',
 	'rendercopy_i915.c',
 	'rendercopy_i830.c',
 	'rendercopy_gen6.c',
-- 
2.14.3

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* [igt-dev] [PATCH i-g-t v7 2/4] lib: Remove duplications in gpu_fill library
  2018-04-11  8:14 [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries Katarzyna Dec
  2018-04-11  8:14 ` [igt-dev] [PATCH i-g-t v7 1/4] lib: Move common gpgpu/media fill functions to gpu_fill library Katarzyna Dec
@ 2018-04-11  8:14 ` Katarzyna Dec
  2018-04-11 11:53   ` Szwichtenberg, Radoslaw
  2018-04-11  8:15 ` [igt-dev] [PATCH i-g-t v7 3/4] lib/gpgpu_fill: Add missing configuration parameters for gpgpu_fill Katarzyna Dec
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Katarzyna Dec @ 2018-04-11  8:14 UTC (permalink / raw)
  To: igt-dev

After moving all functions needed for gpgpu and media fill testing
into gpu_fill, there is a lot of duplication that can be removed:
  The media_fill_gen8lp library for CHT was removed; CHT now uses
media_fill_gen8, and the media state flush for !CHT was added to
gen7_emit_media_objects.
  Many gen8 functions were replaced with their gen7 versions, which
check batch->devid where the gens differ (gen7_emit_curbe_load,
gen7_emit_interface_descriptor_load, gen7_fill_binding_table,
gen7_emit_media_objects). The fill kernel function was unified so
that it applies to all gens and to both media and gpgpu
(gen7_fill_media_kernel and gen8_fill_media_kernel were merged into
gen7_fill_kernel).
  Duplicated constants such as GEN8_MEDIA_VFE_STATE,
GEN8_MEDIA_CURBE_LOAD, GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD and
GEN8_MEDIA_OBJECT were replaced by their GEN7 versions. However,
these constants were not removed from gen8_media.h, because they are
used by other tests for Gen8+. More refactoring of the gen*_media.h
headers is needed.
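
As an illustration of how one emit function can serve several gens, here
is a minimal standalone sketch of the flush condition merged into
gen7_emit_media_objects. It does not use the IGT API: struct mock_batch,
mock_out_batch and the MOCK_* opcodes are hypothetical stand-ins, and the
gen / is_cherryview fields replace the real checks on batch->devid.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct intel_batchbuffer; not the IGT type. */
struct mock_batch {
	int gen;		/* hardware generation: 7, 8, 9, ... */
	int is_cherryview;	/* non-zero for CHT */
	uint32_t dwords[1024];
	unsigned int used;	/* number of dwords emitted so far */
};

/* Placeholder opcodes, chosen for illustration only. */
#define MOCK_MEDIA_OBJECT	0x10000000u
#define MOCK_MEDIA_STATE_FLUSH	0x20000000u

void mock_out_batch(struct mock_batch *b, uint32_t dw)
{
	assert(b->used < 1024);
	b->dwords[b->used++] = dw;
}

void mock_emit_media_state_flush(struct mock_batch *b)
{
	mock_out_batch(b, MOCK_MEDIA_STATE_FLUSH | (2 - 2));
	mock_out_batch(b, 0);
}

/* One function for all gens: the flush is the only per-gen difference. */
void mock_emit_media_objects(struct mock_batch *b,
			     unsigned int x, unsigned int y,
			     unsigned int width, unsigned int height)
{
	unsigned int i, j, k;

	for (i = 0; i < width / 16; i++) {
		for (j = 0; j < height / 16; j++) {
			mock_out_batch(b, MOCK_MEDIA_OBJECT | (8 - 2));

			/* descriptor offset, indirect data, scoreboard */
			for (k = 0; k < 5; k++)
				mock_out_batch(b, 0);

			/* inline data (xoffset, yoffset) */
			mock_out_batch(b, x + i * 16);
			mock_out_batch(b, y + j * 16);

			/* gen8+ except CHT needs a flush after each object */
			if (b->gen >= 8 && !b->is_cherryview)
				mock_emit_media_state_flush(b);
		}
	}
}

/* Helper: emit a 32x32 fill and report how many dwords were written. */
unsigned int mock_count_dwords(int gen, int is_cherryview)
{
	struct mock_batch b = { .gen = gen, .is_cherryview = is_cherryview };

	mock_emit_media_objects(&b, 0, 0, 32, 32);
	return b.used;
}
```

In this sketch a 32x32 area produces four media objects: 32 dwords on
gen7 and CHT, and 40 dwords on the other gen8+ parts, where each object
is followed by the two-dword flush.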

It seems that further unification of the *_fillfunc functions would
make it harder to understand what the tests are doing and what
changed between gens.
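
A rough sketch of the resulting structure, assuming simplified stand-ins
for the real code: each gen keeps its own fill function, and only the
selection is shared. The mock_* names and the platform enum are
hypothetical; the real igt_get_media_fillfunc tests a device ID instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical platform list; the real code works on PCI device IDs. */
enum mock_platform {
	MOCK_IVB,	/* gen7 */
	MOCK_BDW,	/* gen8 */
	MOCK_CHT,	/* gen8 low power */
	MOCK_SKL,	/* gen9 */
	MOCK_UNKNOWN,
};

typedef void (*mock_fillfunc_t)(void);

/* Per-gen fill functions stay separate so gen differences remain visible. */
void mock_gen7_media_fillfunc(void) { }
void mock_gen8_media_fillfunc(void) { }
void mock_gen9_media_fillfunc(void) { }

int mock_gen(enum mock_platform p)
{
	switch (p) {
	case MOCK_IVB:
		return 7;
	case MOCK_BDW:
	case MOCK_CHT:
		return 8;
	case MOCK_SKL:
		return 9;
	default:
		return 0;
	}
}

mock_fillfunc_t mock_get_media_fillfunc(enum mock_platform p)
{
	mock_fillfunc_t fill = NULL;
	int gen = mock_gen(p);

	if (gen == 9)
		fill = mock_gen9_media_fillfunc;
	else if (gen == 8)	/* covers both BDW and CHT */
		fill = mock_gen8_media_fillfunc;
	else if (gen == 7)
		fill = mock_gen7_media_fillfunc;

	return fill;
}

/* Usage check: CHT no longer needs a fill function of its own. */
void mock_check_dispatch(void)
{
	assert(mock_get_media_fillfunc(MOCK_CHT) == mock_gen8_media_fillfunc);
	assert(mock_get_media_fillfunc(MOCK_BDW) == mock_gen8_media_fillfunc);
	assert(mock_get_media_fillfunc(MOCK_UNKNOWN) == NULL);
}
```

The remaining per-gen differences live inside the shared helpers, so the
dispatcher only has to distinguish gens, not individual platforms.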

v2: Moved some redundant changes from "Move gpgpu/media fill to gpu_fill..."
to this patch. Applied comments from review.

v3: rebase

Signed-off-by: Katarzyna Dec <katarzyna.dec@intel.com>
Cc: Lukasz Kalamarz <lukasz.kalamarz@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 lib/Makefile.sources    |   1 -
 lib/gpgpu_fill.c        |   2 +-
 lib/gpu_fill.c          | 172 +++++++-----------------------------------------
 lib/gpu_fill.h          |  38 +----------
 lib/intel_batchbuffer.c |   4 +-
 lib/media_fill.h        |   7 --
 lib/media_fill_gen8.c   |  10 +--
 lib/media_fill_gen8lp.c |  87 ------------------------
 lib/media_fill_gen9.c   |  10 +--
 lib/meson.build         |   1 -
 10 files changed, 39 insertions(+), 293 deletions(-)
 delete mode 100644 lib/media_fill_gen8lp.c

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index 45e65dd7..9c0150c1 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -58,7 +58,6 @@ lib_source_list =	 	\
 	media_fill.h            \
 	media_fill_gen7.c       \
 	media_fill_gen8.c       \
-	media_fill_gen8lp.c     \
 	media_fill_gen9.c       \
 	media_spin.h		\
 	media_spin.c		\
diff --git a/lib/gpgpu_fill.c b/lib/gpgpu_fill.c
index f2765fd6..579ce78d 100644
--- a/lib/gpgpu_fill.c
+++ b/lib/gpgpu_fill.c
@@ -180,7 +180,7 @@ gen8_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 	gen8_emit_state_base_address(batch);
 	gen8_emit_vfe_state_gpgpu(batch);
 	gen7_emit_curbe_load(batch, curbe_buffer);
-	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
+	gen7_emit_interface_descriptor_load(batch, interface_descriptor);
 	gen8_emit_gpgpu_walk(batch, x, y, width, height);
 
 	OUT_BATCH(MI_BATCH_BUFFER_END);
diff --git a/lib/gpu_fill.c b/lib/gpu_fill.c
index d7a2dd4c..a53e5533 100644
--- a/lib/gpu_fill.c
+++ b/lib/gpu_fill.c
@@ -142,26 +142,18 @@ gen7_fill_binding_table(struct intel_batchbuffer *batch,
 
 	binding_table = batch_alloc(batch, 32, 64);
 	offset = batch_offset(batch, binding_table);
-	binding_table[0] = gen7_fill_surface_state(batch, dst,
+	if (IS_GEN7(batch->devid))
+		binding_table[0] = gen7_fill_surface_state(batch, dst,
 						GEN7_SURFACEFORMAT_R8_UNORM, 1);
+	else
+		binding_table[0] = gen8_fill_surface_state(batch, dst,
+						GEN8_SURFACEFORMAT_R8_UNORM, 1);
 
 	return offset;
 }
 
 uint32_t
-gen7_fill_media_kernel(struct intel_batchbuffer *batch,
-		const uint32_t kernel[][4],
-		size_t size)
-{
-	uint32_t offset;
-
-	offset = batch_copy(batch, kernel, size, 64);
-
-	return offset;
-}
-
-uint32_t
-gen8_fill_media_kernel(struct intel_batchbuffer *batch,
+gen7_fill_kernel(struct intel_batchbuffer *batch,
 		const uint32_t kernel[][4],
 		size_t size)
 {
@@ -181,7 +173,7 @@ gen7_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *
 	uint32_t binding_table_offset, kernel_offset;
 
 	binding_table_offset = gen7_fill_binding_table(batch, dst);
-	kernel_offset = gen7_fill_media_kernel(batch, kernel, size);
+	kernel_offset = gen7_fill_kernel(batch, kernel, size);
 
 	idd = batch_alloc(batch, sizeof(*idd), 64);
 	offset = batch_offset(batch, idd);
@@ -296,7 +288,10 @@ gen7_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t in
 	OUT_BATCH(GEN7_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
 	OUT_BATCH(0);
 	/* interface descriptor data length */
-	OUT_BATCH(sizeof(struct gen7_interface_descriptor_data));
+	if (IS_GEN7(batch->devid))
+		OUT_BATCH(sizeof(struct gen7_interface_descriptor_data));
+	else
+		OUT_BATCH(sizeof(struct gen8_interface_descriptor_data));
 	/* interface descriptor address, is relative to the dynamics base address */
 	OUT_BATCH(interface_descriptor);
 }
@@ -326,6 +321,8 @@ gen7_emit_media_objects(struct intel_batchbuffer *batch,
 			/* inline data (xoffset, yoffset) */
 			OUT_BATCH(x + i * 16);
 			OUT_BATCH(y + j * 16);
+			if (AT_LEAST_GEN(batch->devid, 8) && !IS_CHERRYVIEW(batch->devid))
+				gen8_emit_media_state_flush(batch);
 		}
 	}
 }
@@ -387,33 +384,6 @@ gen7_emit_gpgpu_walk(struct intel_batchbuffer *batch,
 	OUT_BATCH(0xffffffff);
 }
 
-void
-gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end)
-{
-	int ret;
-
-	ret = drm_intel_bo_subdata(batch->bo, 0, 4096, batch->buffer);
-	if (ret == 0)
-		ret = drm_intel_bo_mrb_exec(batch->bo, batch_end,
-					NULL, 0, 0, 0);
-	igt_assert(ret == 0);
-}
-
-
-uint32_t
-gen8_fill_curbe_buffer_data(struct intel_batchbuffer *batch,
-			uint8_t color)
-{
-	uint8_t *curbe_buffer;
-	uint32_t offset;
-
-	curbe_buffer = batch_alloc(batch, sizeof(uint32_t) * 8, 64);
-	offset = batch_offset(batch, curbe_buffer);
-	*curbe_buffer = color;
-
-	return offset;
-}
-
 uint32_t
 gen8_fill_surface_state(struct intel_batchbuffer *batch,
 			struct igt_buf *buf,
@@ -465,21 +435,6 @@ gen8_fill_surface_state(struct intel_batchbuffer *batch,
 	return offset;
 }
 
-uint32_t
-gen8_fill_binding_table(struct intel_batchbuffer *batch,
-			struct igt_buf *dst)
-{
-	uint32_t *binding_table, offset;
-
-	binding_table = batch_alloc(batch, 32, 64);
-	offset = batch_offset(batch, binding_table);
-
-	binding_table[0] = gen8_fill_surface_state(batch, dst,
-						GEN8_SURFACEFORMAT_R8_UNORM, 1);
-
-	return offset;
-}
-
 uint32_t
 gen8_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *dst,  const uint32_t kernel[][4], size_t size)
 {
@@ -487,8 +442,8 @@ gen8_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *
 	uint32_t offset;
 	uint32_t binding_table_offset, kernel_offset;
 
-	binding_table_offset = gen8_fill_binding_table(batch, dst);
-	kernel_offset = gen8_fill_media_kernel(batch, kernel, size);
+	binding_table_offset = gen7_fill_binding_table(batch, dst);
+	kernel_offset = gen7_fill_kernel(batch, kernel, size);
 
 	idd = batch_alloc(batch, sizeof(*idd), 64);
 	offset = batch_offset(batch, idd);
@@ -546,10 +501,17 @@ gen8_emit_state_base_address(struct intel_batchbuffer *batch)
 	OUT_BATCH(1 << 12 | 1);
 }
 
+void
+gen8_emit_media_state_flush(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2));
+	OUT_BATCH(0);
+}
+
 void
 gen8_emit_vfe_state(struct intel_batchbuffer *batch)
 {
-	OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2));
+	OUT_BATCH(GEN7_MEDIA_VFE_STATE | (9 - 2));
 
 	/* scratch buffer */
 	OUT_BATCH(0);
@@ -574,7 +536,7 @@ gen8_emit_vfe_state(struct intel_batchbuffer *batch)
 void
 gen8_emit_vfe_state_gpgpu(struct intel_batchbuffer *batch)
 {
-	OUT_BATCH(GEN8_MEDIA_VFE_STATE | (9 - 2));
+	OUT_BATCH(GEN7_MEDIA_VFE_STATE | (9 - 2));
 
 	/* scratch buffer */
 	OUT_BATCH(0);
@@ -594,92 +556,6 @@ gen8_emit_vfe_state_gpgpu(struct intel_batchbuffer *batch)
 	OUT_BATCH(0);
 }
 
-void
-gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
-{
-	OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* curbe total data length */
-	OUT_BATCH(64);
-	/* curbe data start address, is relative to the dynamics base address */
-	OUT_BATCH(curbe_buffer);
-}
-
-void
-gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor)
-{
-	OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
-	OUT_BATCH(0);
-	/* interface descriptor data length */
-	OUT_BATCH(sizeof(struct gen8_interface_descriptor_data));
-	/* interface descriptor address, is relative to the dynamics base address */
-	OUT_BATCH(interface_descriptor);
-}
-
-void
-gen8_emit_media_state_flush(struct intel_batchbuffer *batch)
-{
-	OUT_BATCH(GEN8_MEDIA_STATE_FLUSH | (2 - 2));
-	OUT_BATCH(0);
-}
-
-void
-gen8_emit_media_objects(struct intel_batchbuffer *batch,
-			unsigned x, unsigned y,
-			unsigned width, unsigned height)
-{
-	int i, j;
-
-	for (i = 0; i < width / 16; i++) {
-		for (j = 0; j < height / 16; j++) {
-			OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
-
-			/* interface descriptor offset */
-			OUT_BATCH(0);
-
-			/* without indirect data */
-			OUT_BATCH(0);
-			OUT_BATCH(0);
-
-			/* scoreboard */
-			OUT_BATCH(0);
-			OUT_BATCH(0);
-
-			/* inline data (xoffset, yoffset) */
-			OUT_BATCH(x + i * 16);
-			OUT_BATCH(y + j * 16);
-			gen8_emit_media_state_flush(batch);
-		}
-	}
-}
-void
-gen8lp_emit_media_objects(struct intel_batchbuffer *batch,
-			unsigned x, unsigned y,
-			unsigned width, unsigned height)
-{
-	int i, j;
-
-	for (i = 0; i < width / 16; i++) {
-		for (j = 0; j < height / 16; j++) {
-			OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
-
-			/* interface descriptor offset */
-			OUT_BATCH(0);
-
-			/* without indirect data */
-			OUT_BATCH(0);
-			OUT_BATCH(0);
-
-			/* scoreboard */
-			OUT_BATCH(0);
-			OUT_BATCH(0);
-
-			/* inline data (xoffset, yoffset) */
-			OUT_BATCH(x + i * 16);
-			OUT_BATCH(y + j * 16);
-		}
-	}
-}
 void
 gen8_emit_gpgpu_walk(struct intel_batchbuffer *batch,
 		     unsigned x, unsigned y,
diff --git a/lib/gpu_fill.h b/lib/gpu_fill.h
index 87e62c86..072e9f7c 100644
--- a/lib/gpu_fill.h
+++ b/lib/gpu_fill.h
@@ -70,12 +70,7 @@ gen7_fill_binding_table(struct intel_batchbuffer *batch,
 			struct igt_buf *dst);
 
 uint32_t
-gen7_fill_media_kernel(struct intel_batchbuffer *batch,
-		const uint32_t kernel[][4],
-		size_t size);
-
-uint32_t
-gen8_fill_media_kernel(struct intel_batchbuffer *batch,
+gen7_fill_kernel(struct intel_batchbuffer *batch,
 		const uint32_t kernel[][4],
 		size_t size);
 
@@ -108,53 +103,26 @@ gen7_emit_gpgpu_walk(struct intel_batchbuffer *batch,
 		     unsigned x, unsigned y,
 		     unsigned width, unsigned height);
 
-void
-gen8_render_flush(struct intel_batchbuffer *batch, uint32_t batch_end);
-
-uint32_t
-gen8_fill_curbe_buffer_data(struct intel_batchbuffer *batch,
-			uint8_t color);
-
 uint32_t
 gen8_fill_surface_state(struct intel_batchbuffer *batch,
 			struct igt_buf *buf,
 			uint32_t format,
 			int is_dst);
 
-uint32_t
-gen8_fill_binding_table(struct intel_batchbuffer *batch,
-			struct igt_buf *dst);
-
 uint32_t
 gen8_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *dst,  const uint32_t kernel[][4], size_t size);
 
 void
 gen8_emit_state_base_address(struct intel_batchbuffer *batch);
 
-void
-gen8_emit_vfe_state(struct intel_batchbuffer *batch);
-
-void
-gen8_emit_vfe_state_gpgpu(struct intel_batchbuffer *batch);
-
-void
-gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer);
-
-void
-gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor);
-
 void
 gen8_emit_media_state_flush(struct intel_batchbuffer *batch);
 
 void
-gen8_emit_media_objects(struct intel_batchbuffer *batch,
-			unsigned x, unsigned y,
-			unsigned width, unsigned height);
+gen8_emit_vfe_state(struct intel_batchbuffer *batch);
 
 void
-gen8lp_emit_media_objects(struct intel_batchbuffer *batch,
-			unsigned x, unsigned y,
-			unsigned width, unsigned height);
+gen8_emit_vfe_state_gpgpu(struct intel_batchbuffer *batch);
 
 void
 gen8_emit_gpgpu_walk(struct intel_batchbuffer *batch,
diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
index 7c04ccf3..10d4dce8 100644
--- a/lib/intel_batchbuffer.c
+++ b/lib/intel_batchbuffer.c
@@ -796,12 +796,10 @@ igt_fillfunc_t igt_get_media_fillfunc(int devid)
 
 	if (IS_GEN9(devid))
 		fill = gen9_media_fillfunc;
-	else if (IS_BROADWELL(devid))
+	else if (IS_GEN8(devid))
 		fill = gen8_media_fillfunc;
 	else if (IS_GEN7(devid))
 		fill = gen7_media_fillfunc;
-	else if (IS_CHERRYVIEW(devid))
-		fill = gen8lp_media_fillfunc;
 
 	return fill;
 }
diff --git a/lib/media_fill.h b/lib/media_fill.h
index 226489cb..161af8cf 100644
--- a/lib/media_fill.h
+++ b/lib/media_fill.h
@@ -18,13 +18,6 @@ gen7_media_fillfunc(struct intel_batchbuffer *batch,
                 unsigned width, unsigned height,
                 uint8_t color);
 
-void
-gen8lp_media_fillfunc(struct intel_batchbuffer *batch,
-		struct igt_buf *dst,
-		unsigned x, unsigned y,
-		unsigned width, unsigned height,
-		uint8_t color);
-
 void
 gen9_media_fillfunc(struct intel_batchbuffer *batch,
                 struct igt_buf *dst,
diff --git a/lib/media_fill_gen8.c b/lib/media_fill_gen8.c
index 4270997e..362abd61 100644
--- a/lib/media_fill_gen8.c
+++ b/lib/media_fill_gen8.c
@@ -62,7 +62,7 @@ gen8_media_fillfunc(struct intel_batchbuffer *batch,
 	/* setup states */
 	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
 
-	curbe_buffer = gen8_fill_curbe_buffer_data(batch, color);
+	curbe_buffer = gen7_fill_curbe_buffer_data(batch, color);
 	interface_descriptor = gen8_fill_interface_descriptor(batch, dst, media_kernel, sizeof(media_kernel));
 	igt_assert(batch->ptr < &batch->buffer[4095]);
 
@@ -73,17 +73,17 @@ gen8_media_fillfunc(struct intel_batchbuffer *batch,
 
 	gen8_emit_vfe_state(batch);
 
-	gen8_emit_curbe_load(batch, curbe_buffer);
+	gen7_emit_curbe_load(batch, curbe_buffer);
 
-	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
+	gen7_emit_interface_descriptor_load(batch, interface_descriptor);
 
-	gen8_emit_media_objects(batch, x, y, width, height);
+	gen7_emit_media_objects(batch, x, y, width, height);
 
 	OUT_BATCH(MI_BATCH_BUFFER_END);
 
 	batch_end = batch_align(batch, 8);
 	igt_assert(batch_end < BATCH_STATE_SPLIT);
 
-	gen8_render_flush(batch, batch_end);
+	gen7_render_flush(batch, batch_end);
 	intel_batchbuffer_reset(batch);
 }
diff --git a/lib/media_fill_gen8lp.c b/lib/media_fill_gen8lp.c
deleted file mode 100644
index dcc11982..00000000
--- a/lib/media_fill_gen8lp.c
+++ /dev/null
@@ -1,87 +0,0 @@
-#include <intel_bufmgr.h>
-#include <i915_drm.h>
-
-#include "media_fill.h"
-#include "gen8_media.h"
-#include "intel_reg.h"
-#include "drmtest.h"
-#include "gpu_fill.h"
-#include <assert.h>
-
-
-static const uint32_t media_kernel[][4] = {
-	{ 0x00400001, 0x20202288, 0x00000020, 0x00000000 },
-	{ 0x00600001, 0x20800208, 0x008d0000, 0x00000000 },
-	{ 0x00200001, 0x20800208, 0x00450040, 0x00000000 },
-	{ 0x00000001, 0x20880608, 0x00000000, 0x000f000f },
-	{ 0x00800001, 0x20a00208, 0x00000020, 0x00000000 },
-	{ 0x00800001, 0x20e00208, 0x00000020, 0x00000000 },
-	{ 0x00800001, 0x21200208, 0x00000020, 0x00000000 },
-	{ 0x00800001, 0x21600208, 0x00000020, 0x00000000 },
-	{ 0x0c800031, 0x24000a40, 0x0e000080, 0x120a8000 },
-	{ 0x00600001, 0x2e000208, 0x008d0000, 0x00000000 },
-	{ 0x07800031, 0x20000a40, 0x0e000e00, 0x82000010 },
-};
-
-/*
- * This sets up the media pipeline,
- *
- * +---------------+ <---- 4096
- * |       ^       |
- * |       |       |
- * |    various    |
- * |      state    |
- * |       |       |
- * |_______|_______| <---- 2048 + ?
- * |       ^       |
- * |       |       |
- * |   batch       |
- * |    commands   |
- * |       |       |
- * |       |       |
- * +---------------+ <---- 0 + ?
- *
- */
-
-#define BATCH_STATE_SPLIT 2048
-
-void
-gen8lp_media_fillfunc(struct intel_batchbuffer *batch,
-		struct igt_buf *dst,
-		unsigned x, unsigned y,
-		unsigned width, unsigned height,
-		uint8_t color)
-{
-	uint32_t curbe_buffer, interface_descriptor;
-	uint32_t batch_end;
-
-	intel_batchbuffer_flush(batch);
-
-	/* setup states */
-	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
-
-	curbe_buffer = gen8_fill_curbe_buffer_data(batch, color);
-	interface_descriptor = gen8_fill_interface_descriptor(batch, dst, media_kernel, sizeof(media_kernel));
-	igt_assert(batch->ptr < &batch->buffer[4095]);
-
-	/* media pipeline */
-	batch->ptr = batch->buffer;
-	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA);
-	gen8_emit_state_base_address(batch);
-
-	gen8_emit_vfe_state(batch);
-
-	gen8_emit_curbe_load(batch, curbe_buffer);
-
-	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
-
-	gen8lp_emit_media_objects(batch, x, y, width, height);
-
-	OUT_BATCH(MI_BATCH_BUFFER_END);
-
-	batch_end = batch_align(batch, 8);
-	igt_assert(batch_end < BATCH_STATE_SPLIT);
-
-	gen8_render_flush(batch, batch_end);
-	intel_batchbuffer_reset(batch);
-}
diff --git a/lib/media_fill_gen9.c b/lib/media_fill_gen9.c
index 6accdbe4..d1335fe6 100644
--- a/lib/media_fill_gen9.c
+++ b/lib/media_fill_gen9.c
@@ -59,7 +59,7 @@ gen9_media_fillfunc(struct intel_batchbuffer *batch,
 	/* setup states */
 	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
 
-	curbe_buffer = gen8_fill_curbe_buffer_data(batch, color);
+	curbe_buffer = gen7_fill_curbe_buffer_data(batch, color);
 	interface_descriptor = gen8_fill_interface_descriptor(batch, dst, media_kernel, sizeof(media_kernel));
 	assert(batch->ptr < &batch->buffer[4095]);
 
@@ -75,11 +75,11 @@ gen9_media_fillfunc(struct intel_batchbuffer *batch,
 
 	gen8_emit_vfe_state(batch);
 
-	gen8_emit_curbe_load(batch, curbe_buffer);
+	gen7_emit_curbe_load(batch, curbe_buffer);
 
-	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
+	gen7_emit_interface_descriptor_load(batch, interface_descriptor);
 
-	gen8_emit_media_objects(batch, x, y, width, height);
+	gen7_emit_media_objects(batch, x, y, width, height);
 
 	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA |
 			GEN9_FORCE_MEDIA_AWAKE_DISABLE |
@@ -93,6 +93,6 @@ gen9_media_fillfunc(struct intel_batchbuffer *batch,
 	batch_end = batch_align(batch, 8);
 	assert(batch_end < BATCH_STATE_SPLIT);
 
-	gen8_render_flush(batch, batch_end);
+	gen7_render_flush(batch, batch_end);
 	intel_batchbuffer_reset(batch);
 }
diff --git a/lib/meson.build b/lib/meson.build
index 385e08b9..5f2567fb 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -26,7 +26,6 @@ lib_sources = [
 	'ioctl_wrappers.c',
 	'media_fill_gen7.c',
 	'media_fill_gen8.c',
-	'media_fill_gen8lp.c',
 	'media_fill_gen9.c',
 	'media_spin.c',
 	'gpgpu_fill.c',
-- 
2.14.3

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [igt-dev] [PATCH i-g-t v7 3/4] lib/gpgpu_fill: Add missing configuration parameters for gpgpu_fill
  2018-04-11  8:14 [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries Katarzyna Dec
  2018-04-11  8:14 ` [igt-dev] [PATCH i-g-t v7 1/4] lib: Move common gpgpu/media fill functions to gpu_fill library Katarzyna Dec
  2018-04-11  8:14 ` [igt-dev] [PATCH i-g-t v7 2/4] lib: Remove duplications in " Katarzyna Dec
@ 2018-04-11  8:15 ` Katarzyna Dec
  2018-04-11  8:15 ` [igt-dev] [PATCH i-g-t v7 4/4] lib: Adjust refactored gpu_fill library to our coding style Katarzyna Dec
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Katarzyna Dec @ 2018-04-11  8:15 UTC (permalink / raw)
  To: igt-dev

The Gen8 configuration of gpgpu_fill is missing parameters, which
causes GPU hangs on newer hardware. We need to set the number of
threads in the thread group (TG) in gen8_fill_interface_descriptor.
This field was omitted (apparently without any side effects), but
according to bspec it cannot be 0 from BDW onwards. We also need to
add the pipeline selection mask in gen9_gpgpu_fillfunc, which is
required from SKL onwards.

v2: rebased on refactored library
v3: Dropped the replacement of gen7_emit_interface_descriptor_load with
the gen8 version in gen9_gpgpu_fillfunc, because the gen8 function was
removed during refactoring.
v4: rebased on the new version of the series

Signed-off-by: Katarzyna Dec <katarzyna.dec@intel.com>
Cc: Lukasz Kalamarz <lukasz.kalamarz@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
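A minimal sketch of why the selection mask matters, assuming the pipeline
selection field behaves as a masked write (the low bits are only latched
when the matching write-enable bits are set in the same dword). The
SKETCH_* names, the bit positions and sketch_apply_pipeline_select() are
hypothetical and exist only for this illustration; they are not taken
from the IGT headers or from bspec.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative values only: the bit positions below are assumptions made
 * for this sketch, not copies of the real hardware or IGT definitions.
 */
#define SKETCH_SELECT_BITS   0x3u         /* low bits: pipeline selection */
#define SKETCH_SELECT_MASK   (0x3u << 8)  /* write-enable bits for the selection */
#define SKETCH_SELECT_3D     0x0u
#define SKETCH_SELECT_GPGPU  0x2u

/*
 * Model of a masked write: each selection bit is updated only when its
 * write-enable bit is set in the same dword; otherwise the old value stays.
 */
static uint32_t sketch_apply_pipeline_select(uint32_t current, uint32_t dword)
{
	uint32_t enable = (dword & SKETCH_SELECT_MASK) >> 8;

	/* The modelled state only ever holds the selection bits. */
	assert((current & ~SKETCH_SELECT_BITS) == 0);

	return (current & ~enable) | (dword & enable);
}
```

Under this model, a dword that carries only SKETCH_SELECT_GPGPU leaves the
selection at its previous value, while SKETCH_SELECT_MASK |
SKETCH_SELECT_GPGPU switches it, which mirrors the one-line change to
gen9_gpgpu_fillfunc in this patch.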
 lib/gpgpu_fill.c | 3 ++-
 lib/gpu_fill.c   | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/gpgpu_fill.c b/lib/gpgpu_fill.c
index 579ce78d..5a77ebd4 100644
--- a/lib/gpgpu_fill.c
+++ b/lib/gpgpu_fill.c
@@ -223,7 +223,8 @@ gen9_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 	batch->ptr = batch->buffer;
 
 	/* GPGPU pipeline */
-	OUT_BATCH(GEN7_PIPELINE_SELECT | PIPELINE_SELECT_GPGPU);
+	OUT_BATCH(GEN7_PIPELINE_SELECT | GEN9_PIPELINE_SELECTION_MASK |
+		  PIPELINE_SELECT_GPGPU);
 
 	gen9_emit_state_base_address(batch);
 	gen8_emit_vfe_state_gpgpu(batch);
diff --git a/lib/gpu_fill.c b/lib/gpu_fill.c
index a53e5533..f1fe5b33 100644
--- a/lib/gpu_fill.c
+++ b/lib/gpu_fill.c
@@ -462,6 +462,8 @@ gen8_fill_interface_descriptor(struct intel_batchbuffer *batch, struct igt_buf *
 	idd->desc5.constant_urb_entry_read_offset = 0;
 	idd->desc5.constant_urb_entry_read_length = 1; /* grf 1 */
 
+	idd->desc6.num_threads_in_tg = 1;
+
 	return offset;
 }
 
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [igt-dev] [PATCH i-g-t v7 4/4] lib: Adjust refactored gpu_fill library to our coding style
  2018-04-11  8:14 [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries Katarzyna Dec
                   ` (2 preceding siblings ...)
  2018-04-11  8:15 ` [igt-dev] [PATCH i-g-t v7 3/4] lib/gpgpu_fill: Add missing configuration parameters for gpgpu_fill Katarzyna Dec
@ 2018-04-11  8:15 ` Katarzyna Dec
  2018-04-11  8:49 ` [igt-dev] ✓ Fi.CI.BAT: success for Refactoring of *_fill libraries (rev4) Patchwork
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Katarzyna Dec @ 2018-04-11  8:15 UTC (permalink / raw)
  To: igt-dev

While I am making changes in the gpgpu and media fill area, let's
adjust the code to our coding style.

v2: rebased on the new version of the series (this patch is now last
in the series, so the change seems larger)
v3: rebased

Signed-off-by: Katarzyna Dec <katarzyna.dec@intel.com>
Cc: Lukasz Kalamarz <lukasz.kalamarz@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 lib/gpgpu_fill.c      | 24 ++++++++++++------------
 lib/gpgpu_fill.h      | 12 ++++++------
 lib/media_fill.h      | 20 ++++++++++----------
 lib/media_fill_gen7.c |  7 +++----
 lib/media_fill_gen8.c |  7 ++++---
 lib/media_fill_gen9.c |  7 ++++---
 6 files changed, 39 insertions(+), 38 deletions(-)

diff --git a/lib/gpgpu_fill.c b/lib/gpgpu_fill.c
index 5a77ebd4..72a1445a 100644
--- a/lib/gpgpu_fill.c
+++ b/lib/gpgpu_fill.c
@@ -99,8 +99,8 @@ static const uint32_t gen9_gpgpu_kernel[][4] = {
 void
 gen7_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 		    struct igt_buf *dst,
-		    unsigned x, unsigned y,
-		    unsigned width, unsigned height,
+		    unsigned int x, unsigned int y,
+		    unsigned int width, unsigned int height,
 		    uint8_t color)
 {
 	uint32_t curbe_buffer, interface_descriptor;
@@ -120,8 +120,8 @@ gen7_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 	curbe_buffer = gen7_fill_curbe_buffer_data(batch, color);
 
 	interface_descriptor = gen7_fill_interface_descriptor(batch, dst,
-							      gen7_gpgpu_kernel,
-							      sizeof(gen7_gpgpu_kernel));
+				gen7_gpgpu_kernel, sizeof(gen7_gpgpu_kernel));
+
 	igt_assert(batch->ptr < &batch->buffer[4095]);
 
 	batch->ptr = batch->buffer;
@@ -147,8 +147,8 @@ gen7_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 void
 gen8_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 		    struct igt_buf *dst,
-		    unsigned x, unsigned y,
-		    unsigned width, unsigned height,
+		    unsigned int x, unsigned int y,
+		    unsigned int width, unsigned int height,
 		    uint8_t color)
 {
 	uint32_t curbe_buffer, interface_descriptor;
@@ -168,8 +168,8 @@ gen8_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 	curbe_buffer = gen7_fill_curbe_buffer_data(batch, color);
 
 	interface_descriptor = gen8_fill_interface_descriptor(batch, dst,
-							      gen8_gpgpu_kernel,
-							      sizeof(gen8_gpgpu_kernel));
+				gen8_gpgpu_kernel, sizeof(gen8_gpgpu_kernel));
+
 	igt_assert(batch->ptr < &batch->buffer[4095]);
 
 	batch->ptr = batch->buffer;
@@ -195,8 +195,8 @@ gen8_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 void
 gen9_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 		    struct igt_buf *dst,
-		    unsigned x, unsigned y,
-		    unsigned width, unsigned height,
+		    unsigned int x, unsigned int y,
+		    unsigned int width, unsigned int height,
 		    uint8_t color)
 {
 	uint32_t curbe_buffer, interface_descriptor;
@@ -216,8 +216,8 @@ gen9_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 	curbe_buffer = gen7_fill_curbe_buffer_data(batch, color);
 
 	interface_descriptor = gen8_fill_interface_descriptor(batch, dst,
-							      gen9_gpgpu_kernel,
-							      sizeof(gen9_gpgpu_kernel));
+				gen9_gpgpu_kernel, sizeof(gen9_gpgpu_kernel));
+
 	igt_assert(batch->ptr < &batch->buffer[4095]);
 
 	batch->ptr = batch->buffer;
diff --git a/lib/gpgpu_fill.h b/lib/gpgpu_fill.h
index 7b5c8322..f0d188ae 100644
--- a/lib/gpgpu_fill.h
+++ b/lib/gpgpu_fill.h
@@ -30,22 +30,22 @@
 void
 gen7_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 		    struct igt_buf *dst,
-		    unsigned x, unsigned y,
-		    unsigned width, unsigned height,
+		    unsigned int x, unsigned int y,
+		    unsigned int width, unsigned int height,
 		    uint8_t color);
 
 void
 gen8_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 		    struct igt_buf *dst,
-		    unsigned x, unsigned y,
-		    unsigned width, unsigned height,
+		    unsigned int x, unsigned int y,
+		    unsigned int width, unsigned int height,
 		    uint8_t color);
 
 void
 gen9_gpgpu_fillfunc(struct intel_batchbuffer *batch,
 		    struct igt_buf *dst,
-		    unsigned x, unsigned y,
-		    unsigned width, unsigned height,
+		    unsigned int x, unsigned int y,
+		    unsigned int width, unsigned int height,
 		    uint8_t color);
 
 #endif /* GPGPU_FILL_H */
diff --git a/lib/media_fill.h b/lib/media_fill.h
index 161af8cf..f6db734e 100644
--- a/lib/media_fill.h
+++ b/lib/media_fill.h
@@ -7,22 +7,22 @@
 void
 gen8_media_fillfunc(struct intel_batchbuffer *batch,
 		struct igt_buf *dst,
-		unsigned x, unsigned y,
-		unsigned width, unsigned height,
+		unsigned int x, unsigned int y,
+		unsigned int width, unsigned int height,
 		uint8_t color);
 
 void
 gen7_media_fillfunc(struct intel_batchbuffer *batch,
-                struct igt_buf *dst,
-                unsigned x, unsigned y,
-                unsigned width, unsigned height,
-                uint8_t color);
+		struct igt_buf *dst,
+		unsigned int x, unsigned int y,
+		unsigned int width, unsigned int height,
+		uint8_t color);
 
 void
 gen9_media_fillfunc(struct intel_batchbuffer *batch,
-                struct igt_buf *dst,
-                unsigned x, unsigned y,
-                unsigned width, unsigned height,
-                uint8_t color);
+		struct igt_buf *dst,
+		unsigned int x, unsigned int y,
+		unsigned int width, unsigned int height,
+		uint8_t color);
 
 #endif /* RENDE_MEDIA_FILL_H */
diff --git a/lib/media_fill_gen7.c b/lib/media_fill_gen7.c
index c97555a6..5a8c32fb 100644
--- a/lib/media_fill_gen7.c
+++ b/lib/media_fill_gen7.c
@@ -47,8 +47,8 @@ static const uint32_t media_kernel[][4] = {
 void
 gen7_media_fillfunc(struct intel_batchbuffer *batch,
 		struct igt_buf *dst,
-		unsigned x, unsigned y,
-		unsigned width, unsigned height,
+		unsigned int x, unsigned int y,
+		unsigned int width, unsigned int height,
 		uint8_t color)
 {
 	uint32_t curbe_buffer, interface_descriptor;
@@ -61,8 +61,7 @@ gen7_media_fillfunc(struct intel_batchbuffer *batch,
 
 	curbe_buffer = gen7_fill_curbe_buffer_data(batch, color);
 	interface_descriptor = gen7_fill_interface_descriptor(batch, dst,
-							      media_kernel,
-							      sizeof(media_kernel));
+					media_kernel, sizeof(media_kernel));
 	igt_assert(batch->ptr < &batch->buffer[4095]);
 
 	/* media pipeline */
diff --git a/lib/media_fill_gen8.c b/lib/media_fill_gen8.c
index 362abd61..d6dd7410 100644
--- a/lib/media_fill_gen8.c
+++ b/lib/media_fill_gen8.c
@@ -50,8 +50,8 @@ static const uint32_t media_kernel[][4] = {
 void
 gen8_media_fillfunc(struct intel_batchbuffer *batch,
 		struct igt_buf *dst,
-		unsigned x, unsigned y,
-		unsigned width, unsigned height,
+		unsigned int x, unsigned int y,
+		unsigned int width, unsigned int height,
 		uint8_t color)
 {
 	uint32_t curbe_buffer, interface_descriptor;
@@ -63,7 +63,8 @@ gen8_media_fillfunc(struct intel_batchbuffer *batch,
 	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
 
 	curbe_buffer = gen7_fill_curbe_buffer_data(batch, color);
-	interface_descriptor = gen8_fill_interface_descriptor(batch, dst, media_kernel, sizeof(media_kernel));
+	interface_descriptor = gen8_fill_interface_descriptor(batch, dst,
+					media_kernel, sizeof(media_kernel));
 	igt_assert(batch->ptr < &batch->buffer[4095]);
 
 	/* media pipeline */
diff --git a/lib/media_fill_gen9.c b/lib/media_fill_gen9.c
index d1335fe6..a9a829f2 100644
--- a/lib/media_fill_gen9.c
+++ b/lib/media_fill_gen9.c
@@ -47,8 +47,8 @@ static const uint32_t media_kernel[][4] = {
 void
 gen9_media_fillfunc(struct intel_batchbuffer *batch,
 		struct igt_buf *dst,
-		unsigned x, unsigned y,
-		unsigned width, unsigned height,
+		unsigned int x, unsigned int y,
+		unsigned int width, unsigned int height,
 		uint8_t color)
 {
 	uint32_t curbe_buffer, interface_descriptor;
@@ -60,7 +60,8 @@ gen9_media_fillfunc(struct intel_batchbuffer *batch,
 	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
 
 	curbe_buffer = gen7_fill_curbe_buffer_data(batch, color);
-	interface_descriptor = gen8_fill_interface_descriptor(batch, dst, media_kernel, sizeof(media_kernel));
+	interface_descriptor = gen8_fill_interface_descriptor(batch, dst,
+					media_kernel, sizeof(media_kernel));
 	assert(batch->ptr < &batch->buffer[4095]);
 
 	/* media pipeline */
-- 
2.14.3


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [igt-dev] ✓ Fi.CI.BAT: success for Refactoring of *_fill libraries (rev4)
  2018-04-11  8:14 [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries Katarzyna Dec
                   ` (3 preceding siblings ...)
  2018-04-11  8:15 ` [igt-dev] [PATCH i-g-t v7 4/4] lib: Adjust refactored gpu_fill library to our coding style Katarzyna Dec
@ 2018-04-11  8:49 ` Patchwork
  2018-04-11  9:32 ` [igt-dev] ✗ Fi.CI.IGT: warning " Patchwork
  2018-04-11 12:10 ` [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries Szwichtenberg, Radoslaw
  6 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-04-11  8:49 UTC (permalink / raw)
  To: Katarzyna Dec; +Cc: igt-dev

== Series Details ==

Series: Refactoring of *_fill libraries (rev4)
URL   : https://patchwork.freedesktop.org/series/41212/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4042 -> IGTPW_1244 =

== Summary - SUCCESS ==

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1244/

== Known issues ==

  Here are the changes found in IGTPW_1244 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@gem_exec_suspend@basic-s3:
      fi-glk-j4005:       PASS -> DMESG-WARN (fdo#105771)

    igt@kms_pipe_crc_basic@read-crc-pipe-a-frame-sequence:
      fi-elk-e7500:       PASS -> INCOMPLETE (fdo#103989)

    igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
      fi-cnl-y3:          NOTRUN -> INCOMPLETE (fdo#105058, fdo#105086)

    
    ==== Possible fixes ====

    igt@debugfs_test@read_all_entries:
      fi-snb-2520m:       INCOMPLETE (fdo#103713) -> PASS

    igt@gem_mmap_gtt@basic-small-bo-tiledx:
      fi-gdg-551:         FAIL (fdo#102575) -> PASS

    igt@kms_pipe_crc_basic@read-crc-pipe-c-frame-sequence:
      fi-cfl-s3:          FAIL (fdo#103481) -> PASS

    
  fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
  fdo#103481 https://bugs.freedesktop.org/show_bug.cgi?id=103481
  fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713
  fdo#103989 https://bugs.freedesktop.org/show_bug.cgi?id=103989
  fdo#105058 https://bugs.freedesktop.org/show_bug.cgi?id=105058
  fdo#105086 https://bugs.freedesktop.org/show_bug.cgi?id=105086
  fdo#105771 https://bugs.freedesktop.org/show_bug.cgi?id=105771


== Participating hosts (35 -> 33) ==

  Additional (1): fi-cnl-y3 
  Missing    (3): fi-ctg-p8600 fi-ilk-m540 fi-skl-6700hq 


== Build changes ==

    * IGT: IGT_4418 -> IGTPW_1244
    * Piglit: piglit_4418 -> piglit_4420

  CI_DRM_4042: 98590826e4a407967faad1e021e3b38647a9e77f @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_1244: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1244/
  IGT_4418: 7c474e011548d35df6b80ceed81d3e6ca560c71d @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  piglit_4418: 45e115f293fd6acc0c9647cf2d3b76be78819ba5 @ git://anongit.freedesktop.org/piglit
  piglit_4420: 45e115f293fd6acc0c9647cf2d3b76be78819ba5 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1244/issues.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [igt-dev] ✗ Fi.CI.IGT: warning for Refactoring of *_fill libraries (rev4)
  2018-04-11  8:14 [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries Katarzyna Dec
                   ` (4 preceding siblings ...)
  2018-04-11  8:49 ` [igt-dev] ✓ Fi.CI.BAT: success for Refactoring of *_fill libraries (rev4) Patchwork
@ 2018-04-11  9:32 ` Patchwork
  2018-04-11 12:10 ` [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries Szwichtenberg, Radoslaw
  6 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-04-11  9:32 UTC (permalink / raw)
  To: Katarzyna Dec; +Cc: igt-dev

== Series Details ==

Series: Refactoring of *_fill libraries (rev4)
URL   : https://patchwork.freedesktop.org/series/41212/
State : warning

== Summary ==

= CI Bug Log - changes from CI_DRM_4042_full -> IGTPW_1244_full =

== Summary - WARNING ==

  Minor unknown changes coming with IGTPW_1244_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in IGTPW_1244_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce the CI noise.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1244/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in IGTPW_1244_full:

  === IGT changes ===

    ==== Warnings ====

    igt@gem_mocs_settings@mocs-rc6-dirty-render:
      shard-kbl:          PASS -> SKIP +1

    igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-indfb-draw-pwrite:
      shard-snb:          PASS -> SKIP +2

    igt@kms_vblank@pipe-b-ts-continuation-idle-hang:
      shard-snb:          SKIP -> PASS +1

    igt@perf_pmu@rc6:
      shard-kbl:          SKIP -> PASS

    
== Known issues ==

  Here are the changes found in IGTPW_1244_full that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@kms_flip@dpms-vs-vblank-race:
      shard-hsw:          PASS -> FAIL (fdo#103060)

    igt@kms_flip@flip-vs-blocking-wf-vblank:
      shard-hsw:          PASS -> FAIL (fdo#100368)

    igt@kms_flip@flip-vs-expired-vblank-interruptible:
      shard-hsw:          PASS -> FAIL (fdo#102887)

    igt@kms_setmode@basic:
      shard-kbl:          PASS -> FAIL (fdo#99912)

    igt@pm_rpm@universal-planes:
      shard-kbl:          PASS -> DMESG-WARN (fdo#103558)

    
    ==== Possible fixes ====

    igt@kms_busy@extended-pageflip-hang-oldfb-render-b:
      shard-kbl:          DMESG-WARN (fdo#103313, fdo#103558) -> PASS +12

    igt@kms_flip@2x-flip-vs-expired-vblank:
      shard-hsw:          FAIL (fdo#105189) -> PASS

    igt@kms_flip@2x-plain-flip-fb-recreate:
      shard-hsw:          FAIL (fdo#100368) -> PASS

    igt@kms_flip@wf_vblank-ts-check:
      shard-hsw:          FAIL (fdo#103928) -> PASS

    igt@kms_rotation_crc@sprite-rotation-180:
      shard-snb:          FAIL (fdo#103925) -> PASS

    igt@kms_setmode@basic:
      shard-apl:          FAIL (fdo#99912) -> PASS

    igt@pm_rpm@legacy-planes-dpms:
      shard-kbl:          DMESG-WARN (fdo#103558) -> PASS +18

    
  fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
  fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
  fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
  fdo#103313 https://bugs.freedesktop.org/show_bug.cgi?id=103313
  fdo#103558 https://bugs.freedesktop.org/show_bug.cgi?id=103558
  fdo#103925 https://bugs.freedesktop.org/show_bug.cgi?id=103925
  fdo#103928 https://bugs.freedesktop.org/show_bug.cgi?id=103928
  fdo#105189 https://bugs.freedesktop.org/show_bug.cgi?id=105189
  fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912


== Participating hosts (6 -> 4) ==

  Missing    (2): shard-glk shard-glkb 


== Build changes ==

    * IGT: IGT_4418 -> IGTPW_1244
    * Piglit: piglit_4418 -> piglit_4420

  CI_DRM_4042: 98590826e4a407967faad1e021e3b38647a9e77f @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_1244: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1244/
  IGT_4418: 7c474e011548d35df6b80ceed81d3e6ca560c71d @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  piglit_4418: 45e115f293fd6acc0c9647cf2d3b76be78819ba5 @ git://anongit.freedesktop.org/piglit
  piglit_4420: 45e115f293fd6acc0c9647cf2d3b76be78819ba5 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_1244/shards.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v7 2/4] lib: Remove duplications in gpu_fill library
  2018-04-11  8:14 ` [igt-dev] [PATCH i-g-t v7 2/4] lib: Remove duplications in " Katarzyna Dec
@ 2018-04-11 11:53   ` Szwichtenberg, Radoslaw
  2018-04-11 13:42     ` Katarzyna Dec
  0 siblings, 1 reply; 11+ messages in thread
From: Szwichtenberg, Radoslaw @ 2018-04-11 11:53 UTC (permalink / raw)
  To: Dec, Katarzyna, igt-dev

On Wed, 2018-04-11 at 10:14 +0200, Katarzyna Dec wrote:
> After moving all functions needed for gpgpu and media fill testing
> there is a lot of duplications which can be removed:
>   Library media_fill_gen8 and media_fill_gen8lp for CHT was removed,
> media state flush for !CHT was added to gen7_emit_media_objects.
>   Many gen8 functions were replaced with gen7 version with devid
> parameter (gen7_fill_curbe_load, gen7_emit_interface_descriptor,
> gen7_fill_binding_table, gen7_emit_media_objects). Unified fill kernel
> function so it is applicable to all gens and both media and gpgpu
> (merged gen7_fill_media_kernel and gen8_fill_media_kernel).
>   Duplicated constants like GEN8_MEDIA_VFE_STATE, GEN8_MEDIA_CURBE_LOAD,
> GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD, GEN8_MEDIA_OBJECT were
> replaced by GEN7 version. However this constants were not removed
> from gen8_media.h library, because they are used by other tests
> for Gen8+. More refactoring in this gen*_media.h libraries is needed.
> 
> It seems that further unification of *_fillfunc functions will
> introduce more confusion in understanding what the tests are doing
> and what were changes between Gens.
> 
> v2: Moved some reduntant changes from Move gpgpu/media fill to gpu_fill...
> to this patch. Applied comments from review.
> 
> v3: rebase
> 
> Signed-off-by: Katarzyna Dec <katarzyna.dec@intel.com>
> Cc: Lukasz Kalamarz <lukasz.kalamarz@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> 
> diff --git a/lib/Makefile.sources b/lib/Makefile.sources
> index 45e65dd7..9c0150c1 100644
> --- a/lib/Makefile.sources
> +++ b/lib/Makefile.sources
> @@ -58,7 +58,6 @@ lib_source_list =	 	\
>  	media_fill.h            \
>  	media_fill_gen7.c       \
>  	media_fill_gen8.c       \
> -	media_fill_gen8lp.c     \
>  	media_fill_gen9.c       \
>  	media_spin.h		\
>  	media_spin.c		\
> diff --git a/lib/gpgpu_fill.c b/lib/gpgpu_fill.c
> index f2765fd6..579ce78d 100644
> --- a/lib/gpgpu_fill.c
> +++ b/lib/gpgpu_fill.c
> @@ -180,7 +180,7 @@ gen8_gpgpu_fillfunc(struct intel_batchbuffer *batch,
>  	gen8_emit_state_base_address(batch);
>  	gen8_emit_vfe_state_gpgpu(batch);
>  	gen7_emit_curbe_load(batch, curbe_buffer);
> -	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
> +	gen7_emit_interface_descriptor_load(batch, interface_descriptor);
>  	gen8_emit_gpgpu_walk(batch, x, y, width, height);
>  
>  	OUT_BATCH(MI_BATCH_BUFFER_END);
> diff --git a/lib/gpu_fill.c b/lib/gpu_fill.c
> index d7a2dd4c..a53e5533 100644
> --- a/lib/gpu_fill.c
> +++ b/lib/gpu_fill.c
> @@ -142,26 +142,18 @@ gen7_fill_binding_table(struct intel_batchbuffer *batch,
What do you think about renaming this from gen7_fill to gen_fill? The function
now behaves slightly differently depending on the gen.
>  
>  	binding_table = batch_alloc(batch, 32, 64);
>  	offset = batch_offset(batch, binding_table);
> -	binding_table[0] = gen7_fill_surface_state(batch, dst,
> +	if (IS_GEN7(batch->devid))
> +		binding_table[0] = gen7_fill_surface_state(batch, dst,
>  						GEN7_SURFACEFORMAT_R8_UNORM,
> 1);
> +	else
> +		binding_table[0] = gen8_fill_surface_state(batch, dst,
> +						GEN8_SURFACEFORMAT_R8_UNORM,
> 1);
>  
>  	return offset;
>  }
>  
>  uint32_t
> -gen7_fill_media_kernel(struct intel_batchbuffer *batch,
> -		const uint32_t kernel[][4],
> -		size_t size)
> -{
> -	uint32_t offset;
> -
> -	offset = batch_copy(batch, kernel, size, 64);
> -
> -	return offset;
> -}
> -
> -uint32_t
> -gen8_fill_media_kernel(struct intel_batchbuffer *batch,
> +gen7_fill_kernel(struct intel_batchbuffer *batch,
Maybe this could also be just gen_fill_kernel, if it is exactly the same for all
gens and doesn't contain anything introduced by genX?

Otherwise LGTM

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries
  2018-04-11  8:14 [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries Katarzyna Dec
                   ` (5 preceding siblings ...)
  2018-04-11  9:32 ` [igt-dev] ✗ Fi.CI.IGT: warning " Patchwork
@ 2018-04-11 12:10 ` Szwichtenberg, Radoslaw
  2018-04-12 21:26   ` Arkadiusz Hiler
  6 siblings, 1 reply; 11+ messages in thread
From: Szwichtenberg, Radoslaw @ 2018-04-11 12:10 UTC (permalink / raw)
  To: Dec, Katarzyna, igt-dev

On Wed, 2018-04-11 at 10:14 +0200, Katarzyna Dec wrote:
> This series is removing duplications in gpgpu_fill and media_fill
> libraries. As a first step I moved gpgpu and media helper functions
> to gpu_fill library. In second patch I adjusted code to our coding
> style. In the third not obvious duplications were removed (like
> adding in gen7 functions conditions for future gens). Last patch
> adds missing parameters that make GPU hang on gen9 and gen9+.
> 
> In first version of this series there was a comment about moving
> batch_alloc/copy etc. functions to intel_batchbuffer library.
> Because there is a lot of code to review already this change will
> be introduced in another series (rendercopy, media_fill, gpgpu_fill
> and media_spin code is affected by this).
> 
> It is possible that more changes around gen*_media.h and media_spin
> is needed, but this will be done as a next step.
> 
> v2: Removed not obvious duplications. Adjusted code to review comments.
> v3: Series needed reorganization because it introduced bug to ALP,
> which was hard to find. That is why patch 1 is now almost only moving
> functions to gpu_fill with removing duplications, such as the same
> functions. Also applied comments from review.
> v4: Added #defines and copyrights to new gpu_fill library. Changed functions
> order in gpu_fill library.
> v5: Version with no changes comparing to v4 - git sent-mail sent
> series to wrong thread...
> v6: Removed issue in gen8_emit_gpgpu_walker which was introduced during rebase
> v7: Added copyrights to gpu_fill.c
> 
> Signed-off-by: Katarzyna Dec <katarzyna.dec@intel.com>
> Cc: Lukasz Kalamarz <lukasz.kalamarz@intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

The whole series looks good to me as a starting point for the refactoring. The
effort should continue until all the libraries are cleaned up.

Reviewed-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [igt-dev] [PATCH i-g-t v7 2/4] lib: Remove duplications in gpu_fill library
  2018-04-11 11:53   ` Szwichtenberg, Radoslaw
@ 2018-04-11 13:42     ` Katarzyna Dec
  0 siblings, 0 replies; 11+ messages in thread
From: Katarzyna Dec @ 2018-04-11 13:42 UTC (permalink / raw)
  To: Szwichtenberg, Radoslaw; +Cc: igt-dev

On Wed, Apr 11, 2018 at 12:53:24PM +0100, Szwichtenberg, Radoslaw wrote:
> On Wed, 2018-04-11 at 10:14 +0200, Katarzyna Dec wrote:
> > After moving all functions needed for gpgpu and media fill testing
> > there is a lot of duplications which can be removed:
> >   Library media_fill_gen8 and media_fill_gen8lp for CHT was removed,
> > media state flush for !CHT was added to gen7_emit_media_objects.
> >   Many gen8 functions were replaced with gen7 version with devid
> > parameter (gen7_fill_curbe_load, gen7_emit_interface_descriptor,
> > gen7_fill_binding_table, gen7_emit_media_objects). Unified fill kernel
> > function so it is applicable to all gens and both media and gpgpu
> > (merged gen7_fill_media_kernel and gen8_fill_media_kernel).
> >   Duplicated constants like GEN8_MEDIA_VFE_STATE, GEN8_MEDIA_CURBE_LOAD,
> > GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD, GEN8_MEDIA_OBJECT were
> > replaced by GEN7 version. However this constants were not removed
> > from gen8_media.h library, because they are used by other tests
> > for Gen8+. More refactoring in this gen*_media.h libraries is needed.
> > 
> > It seems that further unification of *_fillfunc functions will
> > introduce more confusion in understanding what the tests are doing
> > and what were changes between Gens.
> > 
> > v2: Moved some reduntant changes from Move gpgpu/media fill to gpu_fill...
> > to this patch. Applied comments from review.
> > 
> > v3: rebase
> > 
> > Signed-off-by: Katarzyna Dec <katarzyna.dec@intel.com>
> > Cc: Lukasz Kalamarz <lukasz.kalamarz@intel.com>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > 
> > diff --git a/lib/Makefile.sources b/lib/Makefile.sources
> > index 45e65dd7..9c0150c1 100644
> > --- a/lib/Makefile.sources
> > +++ b/lib/Makefile.sources
> > @@ -58,7 +58,6 @@ lib_source_list =	 	\
> >  	media_fill.h            \
> >  	media_fill_gen7.c       \
> >  	media_fill_gen8.c       \
> > -	media_fill_gen8lp.c     \
> >  	media_fill_gen9.c       \
> >  	media_spin.h		\
> >  	media_spin.c		\
> > diff --git a/lib/gpgpu_fill.c b/lib/gpgpu_fill.c
> > index f2765fd6..579ce78d 100644
> > --- a/lib/gpgpu_fill.c
> > +++ b/lib/gpgpu_fill.c
> > @@ -180,7 +180,7 @@ gen8_gpgpu_fillfunc(struct intel_batchbuffer *batch,
> >  	gen8_emit_state_base_address(batch);
> >  	gen8_emit_vfe_state_gpgpu(batch);
> >  	gen7_emit_curbe_load(batch, curbe_buffer);
> > -	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
> > +	gen7_emit_interface_descriptor_load(batch, interface_descriptor);
> >  	gen8_emit_gpgpu_walk(batch, x, y, width, height);
> >  
> >  	OUT_BATCH(MI_BATCH_BUFFER_END);
> > diff --git a/lib/gpu_fill.c b/lib/gpu_fill.c
> > index d7a2dd4c..a53e5533 100644
> > --- a/lib/gpu_fill.c
> > +++ b/lib/gpu_fill.c
> > @@ -142,26 +142,18 @@ gen7_fill_binding_table(struct intel_batchbuffer *batch,
> What do you think about changing name from gen7_fill to gen_fill? This function
> now acts in slightly different way depending on gen.
> >  
> >  	binding_table = batch_alloc(batch, 32, 64);
> >  	offset = batch_offset(batch, binding_table);
> > -	binding_table[0] = gen7_fill_surface_state(batch, dst,
> > +	if (IS_GEN7(batch->devid))
> > +		binding_table[0] = gen7_fill_surface_state(batch, dst,
> >  						GEN7_SURFACEFORMAT_R8_UNORM,
> > 1);
> > +	else
> > +		binding_table[0] = gen8_fill_surface_state(batch, dst,
> > +						GEN8_SURFACEFORMAT_R8_UNORM,
> > 1);
> >  
> >  	return offset;
> >  }
> >  
> >  uint32_t
> > -gen7_fill_media_kernel(struct intel_batchbuffer *batch,
> > -		const uint32_t kernel[][4],
> > -		size_t size)
> > -{
> > -	uint32_t offset;
> > -
> > -	offset = batch_copy(batch, kernel, size, 64);
> > -
> > -	return offset;
> > -}
> > -
> > -uint32_t
> > -gen8_fill_media_kernel(struct intel_batchbuffer *batch,
> > +gen7_fill_kernel(struct intel_batchbuffer *batch,
> Maybe this could also be just gen_fill_kernel, if it is exactly the same for
> all gens and has nothing gen-specific in it?
> 
> Otherwise LGTM
It may be a bit confusing. It should not be hard to add one more patch that
unifies the names. The idea behind keeping the gen7_ naming was to show, in
media_fillfunc/gpgpu_fillfunc, what changed between gens, e.g. what changed
in gen8, gen9 etc.
If anyone else reviewing this code shares your concerns, Radek, it will not
be a problem to add another patch :)

Kasia
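
For illustration, the naming question above can be sketched in a few lines of
C. This is only a sketch under stated assumptions, not the IGT code: struct
mock_batch, mock_batch_alloc(), mock_batch_copy(), the two
mock_genN_surface_state() helpers and the sizes they reserve are hypothetical
stand-ins for struct intel_batchbuffer, batch_alloc(), batch_copy() and the
real surface-state helpers. It shows one kernel-copy helper with a gen-neutral
name (it has no gen-specific behaviour) and one binding-table helper that
branches on the gen, as the patch does with IS_GEN7().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct intel_batchbuffer (not the IGT type). */
struct mock_batch {
	int gen;		/* hardware generation, e.g. 7, 8, 9 */
	uint8_t buffer[4096];
	size_t used;		/* bytes consumed so far */
};

/* Reserve 'size' bytes at the next 'align'-aligned offset (power of two). */
static uint32_t mock_batch_alloc(struct mock_batch *batch, size_t size,
				 size_t align)
{
	size_t offset = (batch->used + align - 1) & ~(align - 1);

	assert(offset + size <= sizeof(batch->buffer));
	batch->used = offset + size;

	return (uint32_t)offset;
}

/* Copy 'size' bytes into the batch and return the offset of the copy. */
static uint32_t mock_batch_copy(struct mock_batch *batch, const void *src,
				size_t size, size_t align)
{
	uint32_t offset = mock_batch_alloc(batch, size, align);

	memcpy(batch->buffer + offset, src, size);

	return offset;
}

/* Identical for every gen, so the name carries no gen prefix. */
uint32_t fill_kernel(struct mock_batch *batch, const uint32_t kernel[][4],
		     size_t size)
{
	return mock_batch_copy(batch, kernel, size, 64);
}

/* Placeholder per-gen helpers; the reserved sizes are invented. */
static uint32_t mock_gen7_surface_state(struct mock_batch *batch)
{
	return mock_batch_alloc(batch, 32, 64);
}

static uint32_t mock_gen8_surface_state(struct mock_batch *batch)
{
	return mock_batch_alloc(batch, 64, 64);
}

/* Behaviour depends on the gen, so the branch lives inside one helper. */
uint32_t fill_binding_table(struct mock_batch *batch)
{
	uint32_t table = mock_batch_alloc(batch, 32, 64);
	uint32_t surface;

	if (batch->gen == 7)
		surface = mock_gen7_surface_state(batch);
	else
		surface = mock_gen8_surface_state(batch);

	/* Entry 0 of the binding table points at the surface state. */
	memcpy(batch->buffer + table, &surface, sizeof(surface));

	return table;
}
```

With this shape a caller sets batch->gen once and calls the same two helpers
on every platform. The cost is the one Kasia points out: the per-gen
differences are no longer visible from the names used in the fillfunc bodies.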
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* Re: [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries
  2018-04-11 12:10 ` [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries Szwichtenberg, Radoslaw
@ 2018-04-12 21:26   ` Arkadiusz Hiler
  0 siblings, 0 replies; 11+ messages in thread
From: Arkadiusz Hiler @ 2018-04-12 21:26 UTC (permalink / raw)
  To: Szwichtenberg, Radoslaw; +Cc: igt-dev

On Wed, Apr 11, 2018 at 12:10:46PM +0000, Szwichtenberg, Radoslaw wrote:
> On Wed, 2018-04-11 at 10:14 +0200, Katarzyna Dec wrote:
> > This series is removing duplications in gpgpu_fill and media_fill
> > libraries. As a first step I moved gpgpu and media helper functions
> > to gpu_fill library. In second patch I adjusted code to our coding
> > style. In the third not obvious duplications were removed (like
> > adding in gen7 functions conditions for future gens). Last patch
> > adds missing parameters that make GPU hang on gen9 and gen9+.
> > 
> > In first version of this series there was a comment about moving
> > batch_alloc/copy etc. functions to intel_batchbuffer library.
> > Because there is a lot of code to review already this change will
> > be introduced in another series (rendercopy, media_fill, gpgpu_fill
> > and media_spin code is affected by this).
> > 
> > It is possible that more changes around gen*_media.h and media_spin
> > is needed, but this will be done as a next step.
> > 
> > v2: Removed not obvious duplications. Adjusted code to review comments.
> > v3: Series needed reorganization because it introduced bug to ALP,
> > which was hard to find. That is why patch 1 is now almost only moving
> > functions to gpu_fill with removing duplications, such as the same
> > functions. Also applied comments from review.
> > v4: Added #defines and copyrights to new gpu_fill library. Changed functions
> > order in gpu_fill library.
> > v5: Version with no changes comparing to v4 - git sent-mail sent
> > series to wrong thread...
> > v6: Removed issue in gen8_emit_gpgpu_walker which was introduced during rebase
> > v7: Added copyrights to gpu_fill.c
> > 
> > Signed-off-by: Katarzyna Dec <katarzyna.dec@intel.com>
> > Cc: Lukasz Kalamarz <lukasz.kalamarz@intel.com>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> 
> Whole series looks good to me as refactoring starter. Effort should continue to
> clean all libraries up.
> 
> Reviewed-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>

I am really glad that someone made the effort to untangle this code.
I've just merged this. Thanks for the patches and the review!

-Arek




Thread overview: 11+ messages
2018-04-11  8:14 [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries Katarzyna Dec
2018-04-11  8:14 ` [igt-dev] [PATCH i-g-t v7 1/4] lib: Move common gpgpu/media fill functions to gpu_fill library Katarzyna Dec
2018-04-11  8:14 ` [igt-dev] [PATCH i-g-t v7 2/4] lib: Remove duplications in " Katarzyna Dec
2018-04-11 11:53   ` Szwichtenberg, Radoslaw
2018-04-11 13:42     ` Katarzyna Dec
2018-04-11  8:15 ` [igt-dev] [PATCH i-g-t v7 3/4] lib/gpgpu_fill: Add missing configuration parameters for gpgpu_fill Katarzyna Dec
2018-04-11  8:15 ` [igt-dev] [PATCH i-g-t v7 4/4] lib: Adjust refactored gpu_fill library to our coding style Katarzyna Dec
2018-04-11  8:49 ` [igt-dev] ✓ Fi.CI.BAT: success for Refactoring of *_fill libraries (rev4) Patchwork
2018-04-11  9:32 ` [igt-dev] ✗ Fi.CI.IGT: warning " Patchwork
2018-04-11 12:10 ` [igt-dev] [PATCH i-g-t v7 0/4] Refactoring of *_fill libraries Szwichtenberg, Radoslaw
2018-04-12 21:26   ` Arkadiusz Hiler
