* [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy
@ 2020-07-23 19:22 Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 01/11] lib/intel_bufops: add mapping on cpu / device Zbigniew Kempczyński
                   ` (12 more replies)
  0 siblings, 13 replies; 15+ messages in thread
From: Zbigniew Kempczyński @ 2020-07-23 19:22 UTC (permalink / raw)
  To: igt-dev

This is a first attempt to check how much will break when we migrate
to a no-libdrm rendercopy.

I still have some doubts about gem_stress and gem_concurrent_blit,
so depending on the results we'll decide whether to split the change
into smaller series (introducing a _v2 suffix) or whether we can do
it in a single step.

Dominik Grzegorzek (2):
  lib/igt_fb: Removal of libdrm dependency
  tests/gem|kms: remove libdrm dependency (batch 1)

Zbigniew Kempczyński (9):
  lib/intel_bufops: add mapping on cpu / device
  lib/intel_batchbuffer: add new functions to support rendercopy
  tests/gem_caching: adopt to batch flush function cleanup
  lib/rendercopy: remove libdrm dependency
  tests/api_intel_bb: add render tests
  lib/igt_draw: remove libdrm dependency
  tests/gem|kms: remove libdrm dependency (batch 2)
  tools/intel_residency: adopt intel_residency to use bufops
  tests/perf: remove libdrm dependency for rendercopy

 lib/Makefile.sources                 |    2 -
 lib/igt_draw.c                       |  158 ++--
 lib/igt_draw.h                       |    8 +-
 lib/igt_fb.c                         |   96 ++-
 lib/intel_aux_pgtable.c              |  325 ++++----
 lib/intel_aux_pgtable.h              |   29 +-
 lib/intel_batchbuffer.c              |  303 ++++++-
 lib/intel_batchbuffer.h              |   83 +-
 lib/intel_bufops.c                   |  100 ++-
 lib/intel_bufops.h                   |   15 +-
 lib/meson.build                      |    1 -
 lib/rendercopy.h                     |  102 +--
 lib/rendercopy_bufmgr.c              |  171 ----
 lib/rendercopy_bufmgr.h              |   28 -
 lib/rendercopy_gen4.c                |  568 +++++++------
 lib/rendercopy_gen6.c                |  590 +++++++------
 lib/rendercopy_gen7.c                |  612 +++++++-------
 lib/rendercopy_gen8.c                | 1026 ++++++++++-------------
 lib/rendercopy_gen9.c                | 1145 ++++++++++++--------------
 lib/rendercopy_i830.c                |  278 ++++---
 lib/rendercopy_i915.c                |  281 ++++---
 lib/veboxcopy.h                      |    8 +-
 lib/veboxcopy_gen12.c                |  117 ++-
 tests/i915/api_intel_bb.c            |  168 ++++
 tests/i915/gem_caching.c             |    5 -
 tests/i915/gem_concurrent_all.c      |  431 +++++-----
 tests/i915/gem_ppgtt.c               |  183 ++--
 tests/i915/gem_read_read_speed.c     |  160 ++--
 tests/i915/gem_render_copy.c         |  313 +++----
 tests/i915/gem_render_copy_redux.c   |   66 +-
 tests/i915/gem_render_linear_blits.c |   90 +-
 tests/i915/gem_render_tiled_blits.c  |   85 +-
 tests/i915/gem_stress.c              |  244 +++---
 tests/i915/perf.c                    |  663 +++++++--------
 tests/kms_big_fb.c                   |   54 +-
 tests/kms_cursor_crc.c               |   63 +-
 tests/kms_draw_crc.c                 |   20 +-
 tests/kms_frontbuffer_tracking.c     |   20 +-
 tests/kms_psr.c                      |  134 +--
 tools/intel_residency.c              |   10 +-
 40 files changed, 4297 insertions(+), 4458 deletions(-)
 delete mode 100644 lib/rendercopy_bufmgr.c
 delete mode 100644 lib/rendercopy_bufmgr.h

-- 
2.26.0

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

* [igt-dev] [PATCH i-g-t 01/11] lib/intel_bufops: add mapping on cpu / device
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
@ 2020-07-23 19:22 ` Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 02/11] lib/intel_batchbuffer: add new functions to support rendercopy Zbigniew Kempczyński
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Zbigniew Kempczyński @ 2020-07-23 19:22 UTC (permalink / raw)
  To: igt-dev; +Cc: Chris Wilson

Simplify mapping of intel_buf (to be extended with ref counting
later). Add an intel_buf dump function for easily dumping a buffer
to a file. Fix the return type of the bo size function.

The idempotency selftest is now skipped for default buf_ops creation
to avoid the time-consuming comparison of HW- and SW-tiled buffers
(this can be a problem for forked tests). An additional function,
buf_ops_create_with_selftest(), was added to allow performing the
verification step where it is required.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/intel_bufops.c | 100 ++++++++++++++++++++++++++++++++++++++++-----
 lib/intel_bufops.h |  15 ++++++-
 2 files changed, 103 insertions(+), 12 deletions(-)

diff --git a/lib/intel_bufops.c b/lib/intel_bufops.c
index 2a48fb0c..fc143f3f 100644
--- a/lib/intel_bufops.c
+++ b/lib/intel_bufops.c
@@ -870,6 +870,48 @@ void intel_buf_destroy(struct intel_buf *buf)
 	free(buf);
 }
 
+void *intel_buf_cpu_map(struct intel_buf *buf, bool write)
+{
+	int i915 = buf_ops_get_fd(buf->bops);
+
+	igt_assert(buf->ptr == NULL); /* already mapped */
+
+	buf->cpu_write = write;
+	buf->ptr = gem_mmap__cpu_coherent(i915, buf->handle, 0,
+					  buf->surface[0].size,
+					  write ? PROT_WRITE : PROT_READ);
+
+	gem_set_domain(i915, buf->handle,
+		       I915_GEM_DOMAIN_CPU,
+		       write ? I915_GEM_DOMAIN_CPU : 0);
+
+	return buf->ptr;
+}
+
+void *intel_buf_device_map(struct intel_buf *buf, bool write)
+{
+	int i915 = buf_ops_get_fd(buf->bops);
+
+	igt_assert(buf->ptr == NULL); /* already mapped */
+
+	buf->ptr = gem_mmap__device_coherent(i915, buf->handle, 0,
+					     buf->surface[0].size,
+					     write ? PROT_WRITE : PROT_READ);
+
+	return buf->ptr;
+}
+
+void intel_buf_unmap(struct intel_buf *buf)
+{
+	igt_assert(buf->ptr);
+
+	if (buf->cpu_write)
+		gem_sw_finish(buf_ops_get_fd(buf->bops), buf->handle);
+
+	munmap(buf->ptr, buf->surface[0].size);
+	buf->ptr = NULL;
+}
+
 void intel_buf_print(const struct intel_buf *buf)
 {
 	igt_info("[name: %s]\n", buf->name);
@@ -888,6 +930,21 @@ void intel_buf_print(const struct intel_buf *buf)
 		 from_user_pointer(buf->addr.offset), buf->addr.ctx);
 }
 
+void intel_buf_dump(const struct intel_buf *buf, const char *filename)
+{
+	int i915 = buf_ops_get_fd(buf->bops);
+	uint64_t size = intel_buf_bo_size(buf);
+	FILE *out;
+	void *ptr;
+
+	ptr = gem_mmap__device_coherent(i915, buf->handle, 0, size, PROT_READ);
+	out = fopen(filename, "wb");
+	igt_assert(out);
+	fwrite(ptr, size, 1, out);
+	fclose(out);
+	munmap(ptr, size);
+}
+
 const char *intel_buf_set_name(struct intel_buf *buf, const char *name)
 {
 	return strncpy(buf->name, name, INTEL_BUF_NAME_MAXSIZE);
@@ -1066,7 +1123,7 @@ static void idempotency_selftest(struct buf_ops *bops, uint32_t tiling)
 	buf_ops_set_software_tiling(bops, tiling, false);
 }
 
-int intel_buf_bo_size(const struct intel_buf *buf)
+uint32_t intel_buf_bo_size(const struct intel_buf *buf)
 {
 	int offset = CCS_OFFSET(buf) ?: buf->surface[0].size;
 	int ccs_size =
@@ -1075,16 +1132,7 @@ int intel_buf_bo_size(const struct intel_buf *buf)
 	return offset + ccs_size;
 }
 
-/**
- * buf_ops_create
- * @fd: device filedescriptor
- *
- * Create buf_ops structure depending on fd-device capabilities.
- *
- * Returns: opaque pointer to buf_ops.
- *
- */
-struct buf_ops *buf_ops_create(int fd)
+static struct buf_ops *__buf_ops_create(int fd, bool check_idempotency)
 {
 	struct buf_ops *bops = calloc(1, sizeof(*bops));
 	uint32_t devid;
@@ -1167,6 +1215,36 @@ struct buf_ops *buf_ops_create(int fd)
 	return bops;
 }
 
+/**
+ * buf_ops_create
+ * @fd: device filedescriptor
+ *
+ * Create buf_ops structure depending on fd-device capabilities.
+ *
+ * Returns: opaque pointer to buf_ops.
+ *
+ */
+struct buf_ops *buf_ops_create(int fd)
+{
+	return __buf_ops_create(fd, false);
+}
+
+/**
+ * buf_ops_create_with_selftest
+ * @fd: device filedescriptor
+ *
+ * Create buf_ops structure depending on fd-device capabilities.
+ * Runs with idempotency selftest to verify software tiling gives same
+ * result like hardware tiling (gens with mappable gtt).
+ *
+ * Returns: opaque pointer to buf_ops.
+ *
+ */
+struct buf_ops *buf_ops_create_with_selftest(int fd)
+{
+	return __buf_ops_create(fd, true);
+}
+
 /**
  * buf_ops_destroy
  * @bops: pointer to buf_ops
diff --git a/lib/intel_bufops.h b/lib/intel_bufops.h
index f4f3751e..75b5f5f6 100644
--- a/lib/intel_bufops.h
+++ b/lib/intel_bufops.h
@@ -16,6 +16,9 @@ struct intel_buf {
 	uint32_t bpp;
 	uint32_t compression;
 	uint32_t swizzle_mode;
+	uint32_t yuv_semiplanar_bpp;
+	bool format_is_yuv;
+	bool format_is_yuv_semiplanar;
 	struct {
 		uint32_t offset;
 		uint32_t stride;
@@ -33,6 +36,10 @@ struct intel_buf {
 		uint32_t ctx;
 	} addr;
 
+	/* CPU mapping */
+	uint32_t *ptr;
+	bool cpu_write;
+
 	/* For debugging purposes */
 	char name[INTEL_BUF_NAME_MAXSIZE + 1];
 };
@@ -80,9 +87,10 @@ intel_buf_ccs_height(int gen, const struct intel_buf *buf)
 	return DIV_ROUND_UP(intel_buf_height(buf), 512) * 32;
 }
 
-int intel_buf_bo_size(const struct intel_buf *buf);
+uint32_t intel_buf_bo_size(const struct intel_buf *buf);
 
 struct buf_ops *buf_ops_create(int fd);
+struct buf_ops *buf_ops_create_with_selftest(int fd);
 void buf_ops_destroy(struct buf_ops *bops);
 int buf_ops_get_fd(struct buf_ops *bops);
 
@@ -116,7 +124,12 @@ struct intel_buf *intel_buf_create(struct buf_ops *bops,
 				   uint32_t req_tiling, uint32_t compression);
 void intel_buf_destroy(struct intel_buf *buf);
 
+void *intel_buf_cpu_map(struct intel_buf *buf, bool write);
+void *intel_buf_device_map(struct intel_buf *buf, bool write);
+void intel_buf_unmap(struct intel_buf *buf);
+
 void intel_buf_print(const struct intel_buf *buf);
+void intel_buf_dump(const struct intel_buf *buf, const char *filename);
 const char *intel_buf_set_name(struct intel_buf *buf, const char *name);
 
 void intel_buf_write_to_png(struct intel_buf *buf, const char *filename);
-- 
2.26.0


* [igt-dev] [PATCH i-g-t 02/11] lib/intel_batchbuffer: add new functions to support rendercopy
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 01/11] lib/intel_bufops: add mapping on cpu / device Zbigniew Kempczyński
@ 2020-07-23 19:22 ` Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 03/11] tests/gem_caching: adopt to batch flush function cleanup Zbigniew Kempczyński
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Zbigniew Kempczyński @ 2020-07-23 19:22 UTC (permalink / raw)
  To: igt-dev; +Cc: Chris Wilson

To cover rendercopy in dependent tests we need to add the following:

1. Relocation within any handle
   Previously the batchbuffer was the only target of relocations. As AUX
   tables require relocations too, add support for them in intel_bb.

2. Set/get default alignment
   Add a default alignment for objects added to intel_bb (AUX tables use
   different alignments for different objects).

3. Add intel_buf to intel_bb
   Save the proposed address back into intel_buf.

4. Add a set-flag function.

5. Unify the intel_bb_flush.*() functions.

6. Fix indentation.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/intel_batchbuffer.c | 303 ++++++++++++++++++++++++++++++++++------
 lib/intel_batchbuffer.h |  41 +++++-
 2 files changed, 297 insertions(+), 47 deletions(-)

diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
index 99c2e148..bf84d5f0 100644
--- a/lib/intel_batchbuffer.c
+++ b/lib/intel_batchbuffer.c
@@ -37,6 +37,7 @@
 #include "drmtest.h"
 #include "intel_batchbuffer.h"
 #include "intel_bufmgr.h"
+#include "intel_bufops.h"
 #include "intel_chipset.h"
 #include "intel_reg.h"
 #include "veboxcopy.h"
@@ -1214,6 +1215,7 @@ static inline uint64_t __intel_bb_propose_offset(struct intel_bb *ibb)
 	offset = hars_petruska_f54_1_random64(&ibb->prng);
 	offset += 256 << 10; /* Keep the low 256k clear, for negative deltas */
 	offset &= ibb->gtt_size - 1;
+	offset &= ~(ibb->alignment - 1);
 
 	return offset;
 }
@@ -1232,7 +1234,6 @@ __intel_bb_create(int i915, uint32_t size, bool do_relocs)
 {
 	struct intel_bb *ibb = calloc(1, sizeof(*ibb));
 	uint64_t gtt_size;
-	uint64_t bb_address;
 
 	igt_assert(ibb);
 
@@ -1242,6 +1243,7 @@ __intel_bb_create(int i915, uint32_t size, bool do_relocs)
 	ibb->enforce_relocs = do_relocs;
 	ibb->handle = gem_create(i915, size);
 	ibb->size = size;
+	ibb->alignment = 4096;
 	ibb->batch = calloc(1, size);
 	igt_assert(ibb->batch);
 	ibb->ptr = ibb->batch;
@@ -1253,11 +1255,12 @@ __intel_bb_create(int i915, uint32_t size, bool do_relocs)
 		gtt_size /= 2;
 	if ((gtt_size - 1) >> 32)
 		ibb->supports_48b_address = true;
+
 	ibb->gtt_size = gtt_size;
 
 	__reallocate_objects(ibb);
-	bb_address = __intel_bb_propose_offset(ibb);
-	intel_bb_add_object(ibb, ibb->handle, bb_address, false);
+	ibb->batch_offset = __intel_bb_propose_offset(ibb);
+	intel_bb_add_object(ibb, ibb->handle, ibb->batch_offset, false);
 
 	ibb->refcount = 1;
 
@@ -1364,8 +1367,6 @@ void intel_bb_destroy(struct intel_bb *ibb)
 */
 void intel_bb_reset(struct intel_bb *ibb, bool purge_objects_cache)
 {
-	uint64_t bb_address;
-
 	if (purge_objects_cache && ibb->refcount > 1)
 		igt_warn("Cannot purge objects cache on bb, refcount > 1!");
 
@@ -1383,8 +1384,8 @@ void intel_bb_reset(struct intel_bb *ibb, bool purge_objects_cache)
 	gem_close(ibb->i915, ibb->handle);
 	ibb->handle = gem_create(ibb->i915, ibb->size);
 
-	bb_address = __intel_bb_propose_offset(ibb);
-	intel_bb_add_object(ibb, ibb->handle, bb_address, false);
+	ibb->batch_offset = __intel_bb_propose_offset(ibb);
+	intel_bb_add_object(ibb, ibb->handle, ibb->batch_offset, false);
 	ibb->ptr = ibb->batch;
 	memset(ibb->batch, 0, ibb->size);
 }
@@ -1499,6 +1500,7 @@ intel_bb_add_object(struct intel_bb *ibb, uint32_t handle,
 	object = &ibb->objects[i];
 	object->handle = handle;
 	object->offset = offset;
+	object->alignment = ibb->alignment;
 
 	found = tsearch((void *) object, &ibb->root, __compare_objects);
 
@@ -1519,7 +1521,38 @@ intel_bb_add_object(struct intel_bb *ibb, uint32_t handle,
 	return object;
 }
 
-static bool intel_bb_object_set_fence(struct intel_bb *ibb, uint32_t handle)
+struct drm_i915_gem_exec_object2 *
+intel_bb_add_intel_buf(struct intel_bb *ibb, struct intel_buf *buf, bool write)
+{
+	struct drm_i915_gem_exec_object2 *obj;
+
+	obj = intel_bb_add_object(ibb, buf->handle, buf->addr.offset, write);
+
+	/* For compressed surfaces ensure address is aligned to 64KB */
+	if (ibb->gen >= 12 && buf->compression) {
+		obj->offset &= ~(0x10000 - 1);
+		obj->alignment = 0x10000;
+	}
+
+	/* Update address in intel_buf buffer */
+	buf->addr.offset = obj->offset;
+
+	return obj;
+}
+
+struct drm_i915_gem_exec_object2 *
+intel_bb_find_object(struct intel_bb *ibb, uint32_t handle)
+{
+	struct drm_i915_gem_exec_object2 object = { .handle = handle };
+	struct drm_i915_gem_exec_object2 **found;
+
+	found = tfind((void *) &object, &ibb->root, __compare_objects);
+
+	return *found;
+}
+
+bool
+intel_bb_object_set_flag(struct intel_bb *ibb, uint32_t handle, uint64_t flag)
 {
 	struct drm_i915_gem_exec_object2 object = { .handle = handle };
 	struct drm_i915_gem_exec_object2 **found;
@@ -1531,7 +1564,7 @@ static bool intel_bb_object_set_fence(struct intel_bb *ibb, uint32_t handle)
 		return false;
 	}
 
-	(*found)->flags |= EXEC_OBJECT_NEEDS_FENCE;
+	(*found)->flags |= flag;
 
 	return true;
 }
@@ -1539,7 +1572,8 @@ static bool intel_bb_object_set_fence(struct intel_bb *ibb, uint32_t handle)
 /*
  * intel_bb_add_reloc:
  * @ibb: pointer to intel_bb
- * @handle: object handle which address will be taken to patch the bb
+ * @to_handle: object handle in which do relocation
+ * @handle: object handle which address will be taken to patch the @to_handle
  * @read_domains: gem domain bits for the relocation
  * @write_domain: gem domain bit for the relocation
  * @delta: delta value to add to @buffer's gpu address
@@ -1554,6 +1588,7 @@ static bool intel_bb_object_set_fence(struct intel_bb *ibb, uint32_t handle)
  * exists but doesn't mark it as a render target.
  */
 static uint64_t intel_bb_add_reloc(struct intel_bb *ibb,
+				   uint32_t to_handle,
 				   uint32_t handle,
 				   uint32_t read_domains,
 				   uint32_t write_domain,
@@ -1562,20 +1597,32 @@ static uint64_t intel_bb_add_reloc(struct intel_bb *ibb,
 				   uint64_t presumed_offset)
 {
 	struct drm_i915_gem_relocation_entry *relocs;
-	struct drm_i915_gem_exec_object2 *object;
+	struct drm_i915_gem_exec_object2 *object, *to_object;
 	uint32_t i;
 
 	object = intel_bb_add_object(ibb, handle, presumed_offset, false);
 
-	relocs = ibb->relocs;
-	if (ibb->num_relocs == ibb->allocated_relocs) {
-		ibb->allocated_relocs += 4096 / sizeof(*relocs);
-		relocs = realloc(relocs, sizeof(*relocs) * ibb->allocated_relocs);
+	/* For ibb we have relocs allocated in chunks */
+	if (to_handle == ibb->handle) {
+		relocs = ibb->relocs;
+		if (ibb->num_relocs == ibb->allocated_relocs) {
+			ibb->allocated_relocs += 4096 / sizeof(*relocs);
+			relocs = realloc(relocs, sizeof(*relocs) * ibb->allocated_relocs);
+			igt_assert(relocs);
+			ibb->relocs = relocs;
+		}
+		i = ibb->num_relocs++;
+	} else {
+		to_object = intel_bb_find_object(ibb, to_handle);
+		igt_assert_f(to_object, "object has to be added to ibb first!\n");
+
+		i = to_object->relocation_count++;
+		relocs = from_user_pointer(to_object->relocs_ptr);
+		relocs = realloc(relocs, sizeof(*relocs) * to_object->relocation_count);
+		to_object->relocs_ptr = to_user_pointer(relocs);
 		igt_assert(relocs);
-		ibb->relocs = relocs;
 	}
 
-	i = ibb->num_relocs++;
 	memset(&relocs[i], 0, sizeof(*relocs));
 	relocs[i].target_handle = handle;
 	relocs[i].read_domains = read_domains;
@@ -1587,17 +1634,42 @@ static uint64_t intel_bb_add_reloc(struct intel_bb *ibb,
 	else
 		relocs[i].presumed_offset = object->offset;
 
-	igt_debug("add reloc: handle: %u, r/w: 0x%x/0x%x, "
+	igt_debug("add reloc: to_handle: %u, handle: %u, r/w: 0x%x/0x%x, "
 		  "delta: 0x%" PRIx64 ", "
 		  "offset: 0x%" PRIx64 ", "
 		  "poffset: %p\n",
-		  handle, read_domains, write_domain,
+		  to_handle, handle, read_domains, write_domain,
 		  delta, offset,
 		  from_user_pointer(relocs[i].presumed_offset));
 
 	return object->offset;
 }
 
+static uint64_t __intel_bb_emit_reloc(struct intel_bb *ibb,
+				      uint32_t to_handle,
+				      uint32_t to_offset,
+				      uint32_t handle,
+				      uint32_t read_domains,
+				      uint32_t write_domain,
+				      uint64_t delta,
+				      uint64_t presumed_offset)
+{
+	uint64_t address;
+
+	igt_assert(ibb);
+
+	address = intel_bb_add_reloc(ibb, to_handle, handle,
+				     read_domains, write_domain,
+				     delta, to_offset,
+				     presumed_offset);
+
+	intel_bb_out(ibb, delta + address);
+	if (ibb->gen >= 8)
+		intel_bb_out(ibb, address >> 32);
+
+	return address;
+}
+
 /**
  * intel_bb_emit_reloc:
  * @ibb: pointer to intel_bb
@@ -1626,19 +1698,11 @@ uint64_t intel_bb_emit_reloc(struct intel_bb *ibb,
 			     uint64_t delta,
 			     uint64_t presumed_offset)
 {
-	uint64_t address;
-
 	igt_assert(ibb);
 
-	address = intel_bb_add_reloc(ibb, handle, read_domains, write_domain,
-				     delta, intel_bb_offset(ibb),
-				     presumed_offset);
-
-	intel_bb_out(ibb, delta + address);
-	if (ibb->gen >= 8)
-		intel_bb_out(ibb, address >> 32);
-
-	return address;
+	return __intel_bb_emit_reloc(ibb, ibb->handle, intel_bb_offset(ibb),
+				     handle, read_domains, write_domain,
+				     delta, presumed_offset);
 }
 
 uint64_t intel_bb_emit_reloc_fenced(struct intel_bb *ibb,
@@ -1653,7 +1717,7 @@ uint64_t intel_bb_emit_reloc_fenced(struct intel_bb *ibb,
 	address = intel_bb_emit_reloc(ibb, handle, read_domains, write_domain,
 				      delta, presumed_offset);
 
-	intel_bb_object_set_fence(ibb, handle);
+	intel_bb_object_set_flag(ibb, handle, EXEC_OBJECT_NEEDS_FENCE);
 
 	return address;
 }
@@ -1686,7 +1750,8 @@ uint64_t intel_bb_offset_reloc(struct intel_bb *ibb,
 {
 	igt_assert(ibb);
 
-	return intel_bb_add_reloc(ibb, handle, read_domains, write_domain,
+	return intel_bb_add_reloc(ibb, ibb->handle, handle,
+				  read_domains, write_domain,
 				  0, offset, presumed_offset);
 }
 
@@ -1700,7 +1765,24 @@ uint64_t intel_bb_offset_reloc_with_delta(struct intel_bb *ibb,
 {
 	igt_assert(ibb);
 
-	return intel_bb_add_reloc(ibb, handle, read_domains, write_domain,
+	return intel_bb_add_reloc(ibb, ibb->handle, handle,
+				  read_domains, write_domain,
+				  delta, offset, presumed_offset);
+}
+
+uint64_t intel_bb_offset_reloc_to_object(struct intel_bb *ibb,
+					 uint32_t to_handle,
+					 uint32_t handle,
+					 uint32_t read_domains,
+					 uint32_t write_domain,
+					 uint32_t delta,
+					 uint32_t offset,
+					 uint64_t presumed_offset)
+{
+	igt_assert(ibb);
+
+	return intel_bb_add_reloc(ibb, to_handle, handle,
+				  read_domains, write_domain,
 				  delta, offset, presumed_offset);
 }
 
@@ -1955,16 +2037,38 @@ uint32_t intel_bb_emit_bbe(struct intel_bb *ibb)
 }
 
 /*
- * intel_bb_flush_with_context_ring:
+ * intel_bb_emit_flush_common:
  * @ibb: batchbuffer
- * @ctx: context id
- * @ring: ring
  *
- * Submits the batch for execution on the @ring engine with the supplied
- * hardware context @ctx.
+ * Emits instructions which completes batch buffer.
+ *
+ * Returns: offset in batch buffer where there's end of instructions.
  */
-static void intel_bb_flush_with_context_ring(struct intel_bb *ibb,
-					     uint32_t ctx, uint32_t ring)
+uint32_t intel_bb_emit_flush_common(struct intel_bb *ibb)
+{
+	if (intel_bb_offset(ibb) == 0)
+		return 0;
+
+	if (ibb->gen == 5) {
+		/*
+		 * emit gen5 w/a without batch space checks - we reserve that
+		 * already.
+		 */
+		intel_bb_out(ibb, CMD_POLY_STIPPLE_OFFSET << 16);
+		intel_bb_out(ibb, 0);
+	}
+
+	/* Round batchbuffer usage to 2 DWORDs. */
+	if ((intel_bb_offset(ibb) & 4) == 0)
+		intel_bb_out(ibb, 0);
+
+	intel_bb_emit_bbe(ibb);
+
+	return intel_bb_offset(ibb);
+}
+
+static void intel_bb_exec_with_context_ring(struct intel_bb *ibb,
+					    uint32_t ctx, uint32_t ring)
 {
 	intel_bb_exec_with_context(ibb, intel_bb_offset(ibb), ctx,
 				   ring | I915_EXEC_NO_RELOC,
@@ -1972,23 +2076,109 @@ static void intel_bb_flush_with_context_ring(struct intel_bb *ibb,
 	intel_bb_reset(ibb, false);
 }
 
-void intel_bb_flush_render(struct intel_bb *ibb)
+/*
+ * intel_bb_flush:
+ * @ibb: batchbuffer
+ * @ctx: context
+ * @ring: ring
+ *
+ * If batch is not empty emit batch buffer end, execute on specified
+ * context, ring then reset the batch.
+ */
+void intel_bb_flush(struct intel_bb *ibb, uint32_t ctx, uint32_t ring)
+{
+	if (intel_bb_emit_flush_common(ibb) == 0)
+		return;
+
+	intel_bb_exec_with_context_ring(ibb, ctx, ring);
+}
+
+static void __intel_bb_flush_render_with_context(struct intel_bb *ibb,
+						 uint32_t ctx)
 {
 	uint32_t ring = I915_EXEC_RENDER;
 
-	intel_bb_flush_with_context_ring(ibb, ibb->ctx, ring);
+	if (intel_bb_emit_flush_common(ibb) == 0)
+		return;
+
+	intel_bb_exec_with_context_ring(ibb, ctx, ring);
 }
 
-void intel_bb_flush_blit(struct intel_bb *ibb)
+/*
+ * intel_bb_flush_render:
+ * @ibb: batchbuffer
+ *
+ * If batch is not empty emit batch buffer end, execute on render ring
+ * and reset the batch.
+ * Context used to execute is previous batch context.
+ */
+void intel_bb_flush_render(struct intel_bb *ibb)
+{
+	__intel_bb_flush_render_with_context(ibb, ibb->ctx);
+}
+
+/*
+ * intel_bb_flush_render_with_context:
+ * @ibb: batchbuffer
+ * @ctx: context
+ *
+ * If batch is not empty emit batch buffer end, execute on render ring with @ctx
+ * and reset the batch.
+ */
+void intel_bb_flush_render_with_context(struct intel_bb *ibb, uint32_t ctx)
+{
+	__intel_bb_flush_render_with_context(ibb, ctx);
+}
+
+static void __intel_bb_flush_blit_with_context(struct intel_bb *ibb,
+					       uint32_t ctx)
 {
 	uint32_t ring = I915_EXEC_DEFAULT;
 
+	if (intel_bb_emit_flush_common(ibb) == 0)
+		return;
+
 	if (HAS_BLT_RING(ibb->devid))
 		ring = I915_EXEC_BLT;
 
-	intel_bb_flush_with_context_ring(ibb, ibb->ctx, ring);
+	intel_bb_exec_with_context_ring(ibb, ctx, ring);
+}
+
+/*
+ * intel_bb_flush_blit:
+ * @ibb: batchbuffer
+ *
+ * If batch is not empty emit batch buffer end, execute on default/blit ring
+ * (depends on gen) and reset the batch.
+ * Context used to execute is previous batch context.
+ */
+void intel_bb_flush_blit(struct intel_bb *ibb)
+{
+	__intel_bb_flush_blit_with_context(ibb, ibb->ctx);
+}
+
+/*
+ * intel_bb_flush_blit_with_context:
+ * @ibb: batchbuffer
+ * @ctx: context
+ *
+ * If batch is not empty emit batch buffer end, execute on default/blit ring
+ * (depends on gen) with @ctx and reset the batch.
+ */
+void intel_bb_flush_blit_with_context(struct intel_bb *ibb, uint32_t ctx)
+{
+	__intel_bb_flush_blit_with_context(ibb, ctx);
 }
 
+/*
+ * intel_bb_copy_data:
+ * @ibb: batchbuffer
+ * @data: pointer of data which should be copied into batch
+ * @bytes: number of bytes to copy, must be dword multiplied
+ * @align: alignment in the batch
+ *
+ * Function copies @bytes of data pointed by @data into batch buffer.
+ */
 uint32_t intel_bb_copy_data(struct intel_bb *ibb,
 			    const void *data, unsigned int bytes,
 			    uint32_t align)
@@ -2008,6 +2198,14 @@ uint32_t intel_bb_copy_data(struct intel_bb *ibb,
 	return offset;
 }
 
+/*
+ * intel_bb_blit_start:
+ * @ibb: batchbuffer
+ * @flags: flags to blit command
+ *
+ * Function emits XY_SRC_COPY_BLT instruction with size appropriate size
+ * which depend on gen.
+ */
 void intel_bb_blit_start(struct intel_bb *ibb, uint32_t flags)
 {
 	intel_bb_out(ibb, XY_SRC_COPY_BLT_CMD |
@@ -2017,6 +2215,22 @@ void intel_bb_blit_start(struct intel_bb *ibb, uint32_t flags)
 		     (6 + 2 * (ibb->gen >= 8)));
 }
 
+/*
+ * intel_bb_emit_blt_copy:
+ * @ibb: batchbuffer
+ * @src: source buffer (intel_buf)
+ * @src_x1: source x1 position
+ * @src_y1: source y1 position
+ * @src_pitch: source pitch
+ * @dst: destination buffer (intel_buf)
+ * @dst_x1: destination x1 position
+ * @dst_y1: destination y1 position
+ * @dst_pitch: destination pitch
+ * @width: width of data to copy
+ * @height: height of data to copy
+ *
+ * Function emits complete blit command.
+ */
 void intel_bb_emit_blt_copy(struct intel_bb *ibb,
 			    struct intel_buf *src,
 			    int src_x1, int src_y1, int src_pitch,
@@ -2123,7 +2337,6 @@ void intel_bb_blt_copy(struct intel_bb *ibb,
 	intel_bb_emit_blt_copy(ibb, src, src_x1, src_y1, src_pitch,
 			       dst, dst_x1, dst_y1, dst_pitch,
 			       width, height, bpp);
-	intel_bb_emit_bbe(ibb);
 	intel_bb_flush_blit(ibb);
 }
 
diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h
index a69fffd8..cc8c84b6 100644
--- a/lib/intel_batchbuffer.h
+++ b/lib/intel_batchbuffer.h
@@ -439,6 +439,7 @@ struct intel_bb {
 	uint32_t size;
 	uint32_t *batch;
 	uint32_t *ptr;
+	uint64_t alignment;
 	int fence;
 
 	uint32_t prng;
@@ -486,6 +487,22 @@ void intel_bb_print(struct intel_bb *ibb);
 void intel_bb_dump(struct intel_bb *ibb, const char *filename);
 void intel_bb_set_debug(struct intel_bb *ibb, bool debug);
 
+static inline uint64_t
+intel_bb_set_default_object_alignment(struct intel_bb *ibb, uint64_t alignment)
+{
+	uint64_t old = ibb->alignment;
+
+	ibb->alignment = alignment;
+
+	return old;
+}
+
+static inline uint64_t
+intel_bb_get_default_object_alignment(struct intel_bb *ibb)
+{
+	return ibb->alignment;
+}
+
 static inline uint32_t intel_bb_offset(struct intel_bb *ibb)
 {
 	return (uint32_t) ((uint8_t *) ibb->ptr - (uint8_t *) ibb->batch);
@@ -513,8 +530,7 @@ static inline uint32_t intel_bb_ptr_add_return_prev_offset(struct intel_bb *ibb,
 	return previous_offset;
 }
 
-static inline void *intel_bb_ptr_align(struct intel_bb *ibb,
-				      uint32_t alignment)
+static inline void *intel_bb_ptr_align(struct intel_bb *ibb, uint32_t alignment)
 {
 	intel_bb_ptr_set(ibb, ALIGN(intel_bb_offset(ibb), alignment));
 
@@ -538,6 +554,14 @@ static inline void intel_bb_out(struct intel_bb *ibb, uint32_t dword)
 struct drm_i915_gem_exec_object2 *
 intel_bb_add_object(struct intel_bb *ibb, uint32_t handle,
 		    uint64_t offset, bool write);
+struct drm_i915_gem_exec_object2 *
+intel_bb_add_intel_buf(struct intel_bb *ibb, struct intel_buf *buf, bool write);
+
+struct drm_i915_gem_exec_object2 *
+intel_bb_find_object(struct intel_bb *ibb, uint32_t handle);
+
+bool
+intel_bb_object_set_flag(struct intel_bb *ibb, uint32_t handle, uint64_t flag);
 
 uint64_t intel_bb_emit_reloc(struct intel_bb *ibb,
 			 uint32_t handle,
@@ -568,6 +592,15 @@ uint64_t intel_bb_offset_reloc_with_delta(struct intel_bb *ibb,
 					  uint32_t offset,
 					  uint64_t presumed_offset);
 
+uint64_t intel_bb_offset_reloc_to_object(struct intel_bb *ibb,
+					 uint32_t handle,
+					 uint32_t to_handle,
+					 uint32_t read_domains,
+					 uint32_t write_domain,
+					 uint32_t delta,
+					 uint32_t offset,
+					 uint64_t presumed_offset);
+
 int __intel_bb_exec(struct intel_bb *ibb, uint32_t end_offset,
 		    uint32_t ctx, uint64_t flags, bool sync);
 
@@ -581,8 +614,12 @@ uint64_t intel_bb_get_object_offset(struct intel_bb *ibb, uint32_t handle);
 bool intel_bb_object_offset_to_buf(struct intel_bb *ibb, struct intel_buf *buf);
 
 uint32_t intel_bb_emit_bbe(struct intel_bb *ibb);
+uint32_t intel_bb_emit_flush_common(struct intel_bb *ibb);
+void intel_bb_flush(struct intel_bb *ibb, uint32_t ctx, uint32_t ring);
 void intel_bb_flush_render(struct intel_bb *ibb);
+void intel_bb_flush_render_with_context(struct intel_bb *ibb, uint32_t ctx);
 void intel_bb_flush_blit(struct intel_bb *ibb);
+void intel_bb_flush_blit_with_context(struct intel_bb *ibb, uint32_t ctx);
 
 uint32_t intel_bb_copy_data(struct intel_bb *ibb,
 			    const void *data, unsigned int bytes,
-- 
2.26.0


* [igt-dev] [PATCH i-g-t 03/11] tests/gem_caching: adopt to batch flush function cleanup
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 01/11] lib/intel_bufops: add mapping on cpu / device Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 02/11] lib/intel_batchbuffer: add new functions to support rendercopy Zbigniew Kempczyński
@ 2020-07-23 19:22 ` Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 04/11] lib/rendercopy: remove libdrm dependency Zbigniew Kempczyński
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Zbigniew Kempczyński @ 2020-07-23 19:22 UTC (permalink / raw)
  To: igt-dev; +Cc: Chris Wilson

The intel_bb API has been cleaned up so that flush also emits BBE and
executes the batch. Remove the now-redundant code and use flush properly.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/i915/gem_caching.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/tests/i915/gem_caching.c b/tests/i915/gem_caching.c
index 1d8989db..23e42132 100644
--- a/tests/i915/gem_caching.c
+++ b/tests/i915/gem_caching.c
@@ -99,11 +99,6 @@ copy_bo(struct intel_bb *ibb, struct intel_buf *src, struct intel_buf *dst)
 	intel_bb_emit_reloc_fenced(ibb, src->handle,
 				   I915_GEM_DOMAIN_RENDER,
 				   0, 0, 0x0);
-
-	 /* Mark the end of the buffer. */
-	intel_bb_out(ibb, MI_BATCH_BUFFER_END);
-	intel_bb_ptr_align(ibb, 8);
-
 	intel_bb_flush_blit(ibb);
 	intel_bb_sync(ibb);
 }
-- 
2.26.0


* [igt-dev] [PATCH i-g-t 04/11] lib/rendercopy: remove libdrm dependency
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
                   ` (2 preceding siblings ...)
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 03/11] tests/gem_caching: adopt to batch flush function cleanup Zbigniew Kempczyński
@ 2020-07-23 19:22 ` Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 05/11] tests/api_intel_bb: add render tests Zbigniew Kempczyński
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Zbigniew Kempczyński @ 2020-07-23 19:22 UTC (permalink / raw)
  To: igt-dev; +Cc: Chris Wilson

Use intel_bb as the main batch implementation to remove the libdrm
dependency. Rewrite all pipelines to use intel_bb and update the
render|vebox_copy function prototypes.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/intel_aux_pgtable.c |  325 +++++------
 lib/intel_aux_pgtable.h |   29 +-
 lib/intel_batchbuffer.h |   42 +-
 lib/rendercopy.h        |  102 ++--
 lib/rendercopy_gen4.c   |  568 ++++++++++---------
 lib/rendercopy_gen6.c   |  590 ++++++++++----------
 lib/rendercopy_gen7.c   |  612 ++++++++++-----------
 lib/rendercopy_gen8.c   | 1026 +++++++++++++++--------------------
 lib/rendercopy_gen9.c   | 1145 +++++++++++++++++----------------------
 lib/rendercopy_i830.c   |  278 +++++-----
 lib/rendercopy_i915.c   |  281 +++++-----
 lib/veboxcopy.h         |    8 +-
 lib/veboxcopy_gen12.c   |  117 ++--
 13 files changed, 2421 insertions(+), 2702 deletions(-)

diff --git a/lib/intel_aux_pgtable.c b/lib/intel_aux_pgtable.c
index db5055c8..b43a366b 100644
--- a/lib/intel_aux_pgtable.c
+++ b/lib/intel_aux_pgtable.c
@@ -4,7 +4,7 @@
 #include "drmtest.h"
 #include "intel_aux_pgtable.h"
 #include "intel_batchbuffer.h"
-#include "intel_bufmgr.h"
+#include "intel_bufops.h"
 #include "ioctl_wrappers.h"
 
 #include "i915/gem_mman.h"
@@ -31,8 +31,6 @@
 
 #define GFX_ADDRESS_BITS	48
 
-#define max(a, b)		((a) > (b) ? (a) : (b))
-
 #define AUX_FORMAT_YCRCB	0x03
 #define AUX_FORMAT_P010		0x07
 #define AUX_FORMAT_P016		0x08
@@ -58,10 +56,12 @@ struct pgtable {
 	struct pgtable_level_info *level_info;
 	int size;
 	int max_align;
-	drm_intel_bo *bo;
+	struct intel_bb *ibb;
+	struct intel_buf *buf;
+	void *ptr;
 };
 
-static uint64_t last_buf_surface_end(const struct igt_buf *buf)
+static uint64_t last_buf_surface_end(struct intel_buf *buf)
 {
 	uint64_t end_offset = 0;
 	int num_surfaces = buf->format_is_yuv_semiplanar ? 2 : 1;
@@ -79,7 +79,7 @@ static uint64_t last_buf_surface_end(const struct igt_buf *buf)
 }
 
 static int
-pgt_table_count(int address_bits, const struct igt_buf **bufs, int buf_count)
+pgt_table_count(int address_bits, struct intel_buf **bufs, int buf_count)
 {
 	uint64_t end;
 	int count;
@@ -88,19 +88,19 @@ pgt_table_count(int address_bits, const struct igt_buf **bufs, int buf_count)
 	count = 0;
 	end = 0;
 	for (i = 0; i < buf_count; i++) {
-		const struct igt_buf *buf = bufs[i];
+		struct intel_buf *buf = bufs[i];
 		uint64_t start;
 
 		/* We require bufs to be sorted. */
 		igt_assert(i == 0 ||
-			   buf->bo->offset64 >= bufs[i - 1]->bo->offset64 +
-						bufs[i - 1]->bo->size);
+			   buf->addr.offset >= bufs[i - 1]->addr.offset +
+					       intel_buf_bo_size(bufs[i - 1]));
 
-		start = ALIGN_DOWN(buf->bo->offset64, 1UL << address_bits);
+		start = ALIGN_DOWN(buf->addr.offset, 1UL << address_bits);
 		/* Avoid double counting for overlapping aligned bufs. */
 		start = max(start, end);
 
-		end = ALIGN(buf->bo->offset64 + last_buf_surface_end(buf),
+		end = ALIGN(buf->addr.offset + last_buf_surface_end(buf),
 			    1UL << address_bits);
 		igt_assert(end >= start);
 
@@ -111,7 +111,7 @@ pgt_table_count(int address_bits, const struct igt_buf **bufs, int buf_count)
 }
 
 static void
-pgt_calc_size(struct pgtable *pgt, const struct igt_buf **bufs, int buf_count)
+pgt_calc_size(struct pgtable *pgt, struct intel_buf **bufs, int buf_count)
 {
 	int level;
 
@@ -171,28 +171,33 @@ pgt_get_child_table(struct pgtable *pgt, uint64_t parent_table,
 	uint64_t *child_entry_ptr;
 	uint64_t child_table;
 
-	parent_table_ptr = pgt->bo->virtual + parent_table;
+	parent_table_ptr = pgt->ptr + parent_table;
 	child_entry_idx = pgt_entry_index(pgt, level, address);
 	child_entry_ptr = &parent_table_ptr[child_entry_idx];
 
 	if (!*child_entry_ptr) {
 		uint64_t pte;
+		uint32_t offset;
 
 		child_table = pgt_alloc_table(pgt, level - 1);
-		igt_assert(!((child_table + pgt->bo->offset64) &
+		igt_assert(!((child_table + pgt->buf->addr.offset) &
 			     ~ptr_mask(pgt, level)));
 
 		pte = child_table | flags;
-		*child_entry_ptr = pgt->bo->offset64 + pte;
+		*child_entry_ptr = pgt->buf->addr.offset + pte;
 
 		igt_assert(pte <= INT32_MAX);
-		drm_intel_bo_emit_reloc(pgt->bo,
-					parent_table +
-						child_entry_idx * sizeof(uint64_t),
-					pgt->bo, pte, 0, 0);
+
+		offset = parent_table + child_entry_idx * sizeof(uint64_t);
+		intel_bb_offset_reloc_to_object(pgt->ibb,
+						pgt->buf->handle,
+						pgt->buf->handle,
+						0, 0,
+						pte, offset,
+						pgt->buf->addr.offset);
 	} else {
 		child_table = (*child_entry_ptr & ptr_mask(pgt, level)) -
-			      pgt->bo->offset64;
+			      pgt->buf->addr.offset;
 	}
 
 	return child_table;
@@ -205,7 +210,7 @@ pgt_set_l1_entry(struct pgtable *pgt, uint64_t l1_table,
 	uint64_t *l1_table_ptr;
 	uint64_t *l1_entry_ptr;
 
-	l1_table_ptr = pgt->bo->virtual + l1_table;
+	l1_table_ptr = pgt->ptr + l1_table;
 	l1_entry_ptr = &l1_table_ptr[pgt_entry_index(pgt, 0, address)];
 
 	igt_assert(!(ptr & ~ptr_mask(pgt, 0)));
@@ -234,7 +239,7 @@ static int bpp_to_depth_val(int bpp)
 	}
 }
 
-static uint64_t pgt_get_l1_flags(const struct igt_buf *buf, int surface_idx)
+static uint64_t pgt_get_l1_flags(const struct intel_buf *buf, int surface_idx)
 {
 	/*
 	 * The offset of .tile_mode isn't specifed by bspec, it's what Mesa
@@ -337,15 +342,15 @@ static uint64_t pgt_get_lx_flags(void)
 
 static void
 pgt_populate_entries_for_buf(struct pgtable *pgt,
-			       const struct igt_buf *buf,
-			       uint64_t top_table,
-			       int surface_idx)
+			     struct intel_buf *buf,
+			     uint64_t top_table,
+			     int surface_idx)
 {
-	uint64_t surface_addr = buf->bo->offset64 +
+	uint64_t surface_addr = buf->addr.offset +
 				buf->surface[surface_idx].offset;
 	uint64_t surface_end = surface_addr +
 			       buf->surface[surface_idx].size;
-	uint64_t aux_addr = buf->bo->offset64 + buf->ccs[surface_idx].offset;
+	uint64_t aux_addr = buf->addr.offset + buf->ccs[surface_idx].offset;
 	uint64_t l1_flags = pgt_get_l1_flags(buf, surface_idx);
 	uint64_t lx_flags = pgt_get_lx_flags();
 
@@ -367,19 +372,24 @@ pgt_populate_entries_for_buf(struct pgtable *pgt,
 	}
 }
 
+static void pgt_map(int i915, struct pgtable *pgt)
+{
+	pgt->ptr = gem_mmap__device_coherent(i915, pgt->buf->handle, 0,
+					     pgt->size, PROT_READ | PROT_WRITE);
+}
+
+static void pgt_unmap(struct pgtable *pgt)
+{
+	munmap(pgt->ptr, pgt->size);
+}
+
 static void pgt_populate_entries(struct pgtable *pgt,
-				 const struct igt_buf **bufs,
-				 int buf_count,
-				 drm_intel_bo *pgt_bo)
+				 struct intel_buf **bufs,
+				 int buf_count)
 {
 	uint64_t top_table;
 	int i;
 
-	pgt->bo = pgt_bo;
-
-	igt_assert(pgt_bo->size >= pgt->size);
-	memset(pgt_bo->virtual, 0, pgt->size);
-
 	top_table = pgt_alloc_table(pgt, pgt->levels - 1);
 	/* Top level table must be at offset 0. */
 	igt_assert(top_table == 0);
@@ -395,7 +405,7 @@ static void pgt_populate_entries(struct pgtable *pgt,
 
 static struct pgtable *
 pgt_create(const struct pgtable_level_desc *level_descs, int levels,
-	   const struct igt_buf **bufs, int buf_count)
+	   struct intel_buf **bufs, int buf_count)
 {
 	struct pgtable *pgt;
 	int level;
@@ -427,10 +437,11 @@ static void pgt_destroy(struct pgtable *pgt)
 	free(pgt);
 }
 
-drm_intel_bo *
-intel_aux_pgtable_create(drm_intel_bufmgr *bufmgr,
-		       const struct igt_buf **bufs, int buf_count)
+struct intel_buf *
+intel_aux_pgtable_create(struct intel_bb *ibb,
+			 struct intel_buf **bufs, int buf_count)
 {
+	struct drm_i915_gem_exec_object2 *obj;
 	static const struct pgtable_level_desc level_desc[] = {
 		{
 			.idx_shift = 16,
@@ -452,99 +463,43 @@ intel_aux_pgtable_create(drm_intel_bufmgr *bufmgr,
 		},
 	};
 	struct pgtable *pgt;
-	drm_intel_bo *pgt_bo;
-
-	pgt = pgt_create(level_desc, ARRAY_SIZE(level_desc), bufs, buf_count);
+	struct buf_ops *bops;
+	struct intel_buf *buf;
+	uint64_t prev_alignment;
 
-	pgt_bo = drm_intel_bo_alloc_for_render(bufmgr, "aux pgt",
-					       pgt->size, pgt->max_align);
-	igt_assert(pgt_bo);
-
-	igt_assert(drm_intel_bo_map(pgt_bo, true) == 0);
-	pgt_populate_entries(pgt, bufs, buf_count, pgt_bo);
-	igt_assert(drm_intel_bo_unmap(pgt_bo) == 0);
+	igt_assert(buf_count);
+	bops = bufs[0]->bops;
 
+	pgt = pgt_create(level_desc, ARRAY_SIZE(level_desc), bufs, buf_count);
+	pgt->ibb = ibb;
+	pgt->buf = intel_buf_create(bops, pgt->size, 1, 8, 0, I915_TILING_NONE,
+				    I915_COMPRESSION_NONE);
+
+	/* We need to use pgt->max_align for aux table */
+	prev_alignment = intel_bb_set_default_object_alignment(ibb,
+							       pgt->max_align);
+	obj = intel_bb_add_intel_buf(ibb, pgt->buf, false);
+	intel_bb_set_default_object_alignment(ibb, prev_alignment);
+	obj->alignment = pgt->max_align;
+
+	pgt_map(ibb->i915, pgt);
+	pgt_populate_entries(pgt, bufs, buf_count);
+	pgt_unmap(pgt);
+
+	buf = pgt->buf;
 	pgt_destroy(pgt);
 
-	return pgt_bo;
-}
-
-static void
-aux_pgtable_find_max_free_range(const struct igt_buf **bufs, int buf_count,
-				uint64_t *range_start, uint64_t *range_size)
-{
-	/*
-	 * Keep the first page reserved, so we can differentiate pinned
-	 * objects based on a non-NULL offset.
-	 */
-	uint64_t start = 0x1000;
-	/* For now alloc only from the first 4GB address space. */
-	const uint64_t end = 1ULL << 32;
-	uint64_t max_range_start = 0;
-	uint64_t max_range_size = 0;
-	int i;
-
-	for (i = 0; i < buf_count; i++) {
-		if (bufs[i]->bo->offset64 >= end)
-			break;
-
-		if (bufs[i]->bo->offset64 - start > max_range_size) {
-			max_range_start = start;
-			max_range_size = bufs[i]->bo->offset64 - start;
-		}
-		start = bufs[i]->bo->offset64 + bufs[i]->bo->size;
-	}
-
-	if (start < end && end - start > max_range_size) {
-		max_range_start = start;
-		max_range_size = end - start;
-	}
-
-	*range_start = max_range_start;
-	*range_size = max_range_size;
-}
-
-static uint64_t
-aux_pgtable_find_free_range(const struct igt_buf **bufs, int buf_count,
-			    uint32_t size)
-{
-	uint64_t range_start;
-	uint64_t range_size;
-	/* A compressed surface must be 64kB aligned. */
-	const uint32_t align = 0x10000;
-	int pad;
-
-	aux_pgtable_find_max_free_range(bufs, buf_count,
-					&range_start, &range_size);
-
-	pad = ALIGN(range_start, align) - range_start;
-	range_start += pad;
-	range_size -= pad;
-	igt_assert(range_size >= size);
-
-	return range_start +
-	       ALIGN_DOWN(rand() % ((range_size - size) + 1), align);
+	return buf;
 }
 
 static void
-aux_pgtable_reserve_range(const struct igt_buf **bufs, int buf_count,
-			  const struct igt_buf *new_buf)
+aux_pgtable_reserve_buf_slot(struct intel_buf **bufs, int buf_count,
+			     struct intel_buf *new_buf)
 {
 	int i;
 
-	if (igt_buf_compressed(new_buf)) {
-		uint64_t pin_offset = new_buf->bo->offset64;
-
-		if (!pin_offset)
-			pin_offset = aux_pgtable_find_free_range(bufs,
-								 buf_count,
-								 new_buf->bo->size);
-		drm_intel_bo_set_softpin_offset(new_buf->bo, pin_offset);
-		igt_assert(new_buf->bo->offset64 == pin_offset);
-	}
-
 	for (i = 0; i < buf_count; i++)
-		if (bufs[i]->bo->offset64 > new_buf->bo->offset64)
+		if (bufs[i]->addr.offset > new_buf->addr.offset)
 			break;
 
 	memmove(&bufs[i + 1], &bufs[i], sizeof(bufs[0]) * (buf_count - i));
@@ -554,107 +509,115 @@ aux_pgtable_reserve_range(const struct igt_buf **bufs, int buf_count,
 
 void
 gen12_aux_pgtable_init(struct aux_pgtable_info *info,
-		       drm_intel_bufmgr *bufmgr,
-		       const struct igt_buf *src_buf,
-		       const struct igt_buf *dst_buf)
+		       struct intel_bb *ibb,
+		       struct intel_buf *src_buf,
+		       struct intel_buf *dst_buf)
 {
-	const struct igt_buf *bufs[2];
-	const struct igt_buf *reserved_bufs[2];
+	struct intel_buf *bufs[2];
+	struct intel_buf *reserved_bufs[2];
 	int reserved_buf_count;
 	int i;
 
-	if (!igt_buf_compressed(src_buf) && !igt_buf_compressed(dst_buf))
+	igt_assert_f(ibb->enforce_relocs == false,
+		     "We don't support aux pgtables with forced relocs yet!");
+
+	if (!intel_buf_compressed(src_buf) && !intel_buf_compressed(dst_buf))
 		return;
 
 	bufs[0] = src_buf;
 	bufs[1] = dst_buf;
 
 	/*
-	 * Ideally we'd need an IGT-wide GFX address space allocator, which
-	 * would consider all allocations and thus avoid evictions. For now use
-	 * a simpler scheme here, which only considers the buffers involved in
-	 * the blit, which should at least minimize the chance for evictions
-	 * in the case of subsequent blits:
-	 *   1. If they were already bound (bo->offset64 != 0), use this
-	 *      address.
-	 *   2. Pick a range randomly from the 4GB address space, that is not
-	 *      already occupied by a bound object, or an object we pinned.
+	 * The surface index in the pgt table depends on its address:
+	 *   1. if the handle was already executed in a batch, use that
+	 *      address
+	 *   2. otherwise add the object to the batch, generating a random
+	 *      address
+	 *
+	 * Random addresses can overlap, but IGT has no global address space
+	 * allocator. We assume random addresses are spread evenly over the
+	 * 48-bit address space, so the risk of overlap is minimal, though
+	 * it grows with the number (and sizes) of objects in the blit. To
+	 * avoid relocation, EXEC_OBJECT_PINNED is set for compressed surfaces.
 	 */
+
+	intel_bb_add_intel_buf(ibb, src_buf, false);
+	if (intel_buf_compressed(src_buf))
+		intel_bb_object_set_flag(ibb, src_buf->handle, EXEC_OBJECT_PINNED);
+
+	intel_bb_add_intel_buf(ibb, dst_buf, true);
+	if (intel_buf_compressed(dst_buf))
+		intel_bb_object_set_flag(ibb, dst_buf->handle, EXEC_OBJECT_PINNED);
+
 	reserved_buf_count = 0;
 	/* First reserve space for any bufs that are bound already. */
-	for (i = 0; i < ARRAY_SIZE(bufs); i++)
-		if (bufs[i]->bo->offset64)
-			aux_pgtable_reserve_range(reserved_bufs,
-						  reserved_buf_count++,
-						  bufs[i]);
-
-	/* Next, reserve space for unbound bufs with an AUX surface. */
-	for (i = 0; i < ARRAY_SIZE(bufs); i++)
-		if (!bufs[i]->bo->offset64 && igt_buf_compressed(bufs[i]))
-			aux_pgtable_reserve_range(reserved_bufs,
-						  reserved_buf_count++,
-						  bufs[i]);
+	for (i = 0; i < ARRAY_SIZE(bufs); i++) {
+		igt_assert(bufs[i]->addr.offset != INTEL_BUF_INVALID_ADDRESS);
+		aux_pgtable_reserve_buf_slot(reserved_bufs,
+					     reserved_buf_count++,
+					     bufs[i]);
+	}
 
 	/* Create AUX pgtable entries only for bufs with an AUX surface */
 	info->buf_count = 0;
 	for (i = 0; i < reserved_buf_count; i++) {
-		if (!igt_buf_compressed(reserved_bufs[i]))
+		if (!intel_buf_compressed(reserved_bufs[i]))
 			continue;
 
 		info->bufs[info->buf_count] = reserved_bufs[i];
 		info->buf_pin_offsets[info->buf_count] =
-			reserved_bufs[i]->bo->offset64;
+			reserved_bufs[i]->addr.offset;
+
 		info->buf_count++;
 	}
 
-	info->pgtable_bo = intel_aux_pgtable_create(bufmgr,
-						    info->bufs,
-						    info->buf_count);
-	igt_assert(info->pgtable_bo);
+	info->pgtable_buf = intel_aux_pgtable_create(ibb,
+						     info->bufs,
+						     info->buf_count);
+
+	igt_assert(info->pgtable_buf);
 }
 
 void
-gen12_aux_pgtable_cleanup(struct aux_pgtable_info *info)
+gen12_aux_pgtable_cleanup(struct intel_bb *ibb, struct aux_pgtable_info *info)
 {
 	int i;
 
 	/* Check that the pinned bufs kept their offset after the exec. */
-	for (i = 0; i < info->buf_count; i++)
-		igt_assert_eq_u64(info->bufs[i]->bo->offset64,
-				  info->buf_pin_offsets[i]);
+	for (i = 0; i < info->buf_count; i++) {
+		uint64_t addr;
+
+		addr = intel_bb_get_object_offset(ibb, info->bufs[i]->handle);
+		igt_assert_eq_u64(addr, info->buf_pin_offsets[i]);
+	}
 
-	drm_intel_bo_unreference(info->pgtable_bo);
+	if (info->pgtable_buf)
+		intel_buf_destroy(info->pgtable_buf);
 }
 
 uint32_t
-gen12_create_aux_pgtable_state(struct intel_batchbuffer *batch,
-			       drm_intel_bo *aux_pgtable_bo)
+gen12_create_aux_pgtable_state(struct intel_bb *ibb,
+			       struct intel_buf *aux_pgtable_buf)
 {
 	uint64_t *pgtable_ptr;
 	uint32_t pgtable_ptr_offset;
-	int ret;
 
-	if (!aux_pgtable_bo)
+	if (!aux_pgtable_buf)
 		return 0;
 
-	pgtable_ptr = intel_batchbuffer_subdata_alloc(batch,
-						      sizeof(*pgtable_ptr),
-						      sizeof(*pgtable_ptr));
-	pgtable_ptr_offset = intel_batchbuffer_subdata_offset(batch,
-							      pgtable_ptr);
+	pgtable_ptr = intel_bb_ptr(ibb);
+	pgtable_ptr_offset = intel_bb_offset(ibb);
 
-	*pgtable_ptr = aux_pgtable_bo->offset64;
-	ret = drm_intel_bo_emit_reloc(batch->bo, pgtable_ptr_offset,
-				      aux_pgtable_bo, 0,
-				      0, 0);
-	assert(ret == 0);
+	*pgtable_ptr = intel_bb_offset_reloc(ibb, aux_pgtable_buf->handle,
+					     0, 0,
+					     pgtable_ptr_offset, -1);
+	intel_bb_ptr_add(ibb, sizeof(*pgtable_ptr));
 
 	return pgtable_ptr_offset;
 }
 
 void
-gen12_emit_aux_pgtable_state(struct intel_batchbuffer *batch, uint32_t state,
-			     bool render)
+gen12_emit_aux_pgtable_state(struct intel_bb *ibb, uint32_t state, bool render)
 {
 	uint32_t table_base_reg = render ? GEN12_GFX_AUX_TABLE_BASE_ADDR :
 					   GEN12_VEBOX_AUX_TABLE_BASE_ADDR;
@@ -662,11 +625,11 @@ gen12_emit_aux_pgtable_state(struct intel_batchbuffer *batch, uint32_t state,
 	if (!state)
 		return;
 
-	OUT_BATCH(MI_LOAD_REGISTER_MEM_GEN8 | MI_MMIO_REMAP_ENABLE_GEN12);
-	OUT_BATCH(table_base_reg);
-	OUT_RELOC(batch->bo, 0, 0, state);
+	intel_bb_out(ibb, MI_LOAD_REGISTER_MEM_GEN8 | MI_MMIO_REMAP_ENABLE_GEN12);
+	intel_bb_out(ibb, table_base_reg);
+	intel_bb_emit_reloc(ibb, ibb->handle, 0, 0, state, ibb->batch_offset);
 
-	OUT_BATCH(MI_LOAD_REGISTER_MEM_GEN8 | MI_MMIO_REMAP_ENABLE_GEN12);
-	OUT_BATCH(table_base_reg + 4);
-	OUT_RELOC(batch->bo, 0, 0, state + 4);
+	intel_bb_out(ibb, MI_LOAD_REGISTER_MEM_GEN8 | MI_MMIO_REMAP_ENABLE_GEN12);
+	intel_bb_out(ibb, table_base_reg + 4);
+	intel_bb_emit_reloc(ibb, ibb->handle, 0, 0, state + 4, ibb->batch_offset);
 }
diff --git a/lib/intel_aux_pgtable.h b/lib/intel_aux_pgtable.h
index ac82b7d2..e9976e52 100644
--- a/lib/intel_aux_pgtable.h
+++ b/lib/intel_aux_pgtable.h
@@ -1,36 +1,33 @@
 #ifndef __INTEL_AUX_PGTABLE_H__
 #define __INTEL_AUX_PGTABLE_H__
 
-#include "intel_bufmgr.h"
-
-struct igt_buf;
-struct intel_batchbuffer;
+#include "intel_bufops.h"
 
 struct aux_pgtable_info {
 	int buf_count;
-	const struct igt_buf *bufs[2];
+	struct intel_buf *bufs[2];
 	uint64_t buf_pin_offsets[2];
-	drm_intel_bo *pgtable_bo;
+	struct intel_buf *pgtable_buf;
 };
 
-drm_intel_bo *
-intel_aux_pgtable_create(drm_intel_bufmgr *bufmgr,
-			 const struct igt_buf **bufs, int buf_count);
+struct intel_buf *
+intel_aux_pgtable_create(struct intel_bb *ibb,
+			 struct intel_buf **bufs, int buf_count);
 
 void
 gen12_aux_pgtable_init(struct aux_pgtable_info *info,
-		       drm_intel_bufmgr *bufmgr,
-		       const struct igt_buf *src_buf,
-		       const struct igt_buf *dst_buf);
+		       struct intel_bb *ibb,
+		       struct intel_buf *src_buf,
+		       struct intel_buf *dst_buf);
 
 void
-gen12_aux_pgtable_cleanup(struct aux_pgtable_info *info);
+gen12_aux_pgtable_cleanup(struct intel_bb *ibb, struct aux_pgtable_info *info);
 
 uint32_t
-gen12_create_aux_pgtable_state(struct intel_batchbuffer *batch,
-			       drm_intel_bo *aux_pgtable_bo);
+gen12_create_aux_pgtable_state(struct intel_bb *batch,
+			       struct intel_buf *aux_pgtable_buf);
 void
-gen12_emit_aux_pgtable_state(struct intel_batchbuffer *batch, uint32_t state,
+gen12_emit_aux_pgtable_state(struct intel_bb *batch, uint32_t state,
 			     bool render);
 
 #endif
diff --git a/lib/intel_batchbuffer.h b/lib/intel_batchbuffer.h
index cc8c84b6..477ea423 100644
--- a/lib/intel_batchbuffer.h
+++ b/lib/intel_batchbuffer.h
@@ -318,14 +318,14 @@ void igt_blitter_fast_copy__raw(int fd,
 
 /**
  * igt_render_copyfunc_t:
- * @batch: batchbuffer object
- * @context: libdrm hardware context to use
- * @src: source i-g-t buffer object
+ * @ibb: batchbuffer
+ * @ctx: context to use
+ * @src: intel_buf source object
  * @src_x: source pixel x-coordination
  * @src_y: source pixel y-coordination
  * @width: width of the copied rectangle
  * @height: height of the copied rectangle
- * @dst: destination i-g-t buffer object
+ * @dst: intel_buf destination object
  * @dst_x: destination pixel x-coordination
  * @dst_y: destination pixel y-coordination
  *
@@ -334,25 +334,30 @@ void igt_blitter_fast_copy__raw(int fd,
  * igt_get_render_copyfunc().
  *
  * A render copy function will emit a batchbuffer to the kernel which executes
- * the specified blit copy operation using the render engine. @context is
- * optional and can be NULL.
+ * the specified blit copy operation using the render engine. @ctx is
+ * optional and can be 0.
  */
-typedef void (*igt_render_copyfunc_t)(struct intel_batchbuffer *batch,
-				      drm_intel_context *context,
-				      const struct igt_buf *src, unsigned src_x, unsigned src_y,
-				      unsigned width, unsigned height,
-				      const struct igt_buf *dst, unsigned dst_x, unsigned dst_y);
+struct intel_bb;
+struct intel_buf;
+
+typedef void (*igt_render_copyfunc_t)(struct intel_bb *ibb,
+				      uint32_t ctx,
+				      struct intel_buf *src,
+				      uint32_t src_x, uint32_t src_y,
+				      uint32_t width, uint32_t height,
+				      struct intel_buf *dst,
+				      uint32_t dst_x, uint32_t dst_y);
 
 igt_render_copyfunc_t igt_get_render_copyfunc(int devid);
 
 
 /**
  * igt_vebox_copyfunc_t:
- * @batch: batchbuffer object
- * @src: source i-g-t buffer object
+ * @ibb: batchbuffer
+ * @src: intel_buf source object
  * @width: width of the copied rectangle
  * @height: height of the copied rectangle
- * @dst: destination i-g-t buffer object
+ * @dst: intel_buf destination object
  *
  * This is the type of the per-platform vebox copy functions. The
  * platform-specific implementation can be obtained by calling
@@ -361,10 +366,10 @@ igt_render_copyfunc_t igt_get_render_copyfunc(int devid);
  * A vebox copy function will emit a batchbuffer to the kernel which executes
  * the specified blit copy operation using the vebox engine.
  */
-typedef void (*igt_vebox_copyfunc_t)(struct intel_batchbuffer *batch,
-				     const struct igt_buf *src,
-				     unsigned width, unsigned height,
-				     const struct igt_buf *dst);
+typedef void (*igt_vebox_copyfunc_t)(struct intel_bb *ibb,
+				     struct intel_buf *src,
+				     unsigned int width, unsigned int height,
+				     struct intel_buf *dst);
 
 igt_vebox_copyfunc_t igt_get_vebox_copyfunc(int devid);
 
@@ -385,7 +390,6 @@ igt_vebox_copyfunc_t igt_get_vebox_copyfunc(int devid);
  * A fill function will emit a batchbuffer to the kernel which executes
  * the specified blit fill operation using the media/gpgpu engine.
  */
-struct intel_buf;
 typedef void (*igt_fillfunc_t)(int i915,
 			       struct intel_buf *buf,
 			       unsigned x, unsigned y,
diff --git a/lib/rendercopy.h b/lib/rendercopy.h
index e0577cac..8dd867bc 100644
--- a/lib/rendercopy.h
+++ b/lib/rendercopy.h
@@ -1,70 +1,70 @@
 #include "intel_batchbuffer.h"
 
 
-static inline void emit_vertex_2s(struct intel_batchbuffer *batch,
+static inline void emit_vertex_2s(struct intel_bb *ibb,
 				  int16_t x, int16_t y)
 {
-	OUT_BATCH((uint16_t)y << 16 | (uint16_t)x);
+	intel_bb_out(ibb, ((uint16_t)y << 16 | (uint16_t)x));
 }
 
-static inline void emit_vertex(struct intel_batchbuffer *batch,
+static inline void emit_vertex(struct intel_bb *ibb,
 			       float f)
 {
 	union { float f; uint32_t ui; } u;
 	u.f = f;
-	OUT_BATCH(u.ui);
+	intel_bb_out(ibb, u.ui);
 }
 
-static inline void emit_vertex_normalized(struct intel_batchbuffer *batch,
+static inline void emit_vertex_normalized(struct intel_bb *ibb,
 					  float f, float total)
 {
 	union { float f; uint32_t ui; } u;
 	u.f = f / total;
-	OUT_BATCH(u.ui);
+	intel_bb_out(ibb, u.ui);
 }
 
-void gen12_render_copyfunc(struct intel_batchbuffer *batch,
-			   drm_intel_context *context,
-			   const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			   unsigned width, unsigned height,
-			   const struct igt_buf *dst, unsigned dst_x, unsigned dst_y);
-void gen11_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y);
-void gen9_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y);
-void gen8_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y);
-void gen7_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y);
-void gen6_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y);
-void gen4_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y);
-void gen3_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y);
-void gen2_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y);
+void gen12_render_copyfunc(struct intel_bb *ibb,
+			   uint32_t ctx,
+			   struct intel_buf *src, uint32_t src_x, uint32_t src_y,
+			   uint32_t width, uint32_t height,
+			   struct intel_buf *dst, uint32_t dst_x, uint32_t dst_y);
+void gen11_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src, uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst, uint32_t dst_x, uint32_t dst_y);
+void gen9_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src, uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst, uint32_t dst_x, uint32_t dst_y);
+void gen8_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src, uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst, uint32_t dst_x, uint32_t dst_y);
+void gen7_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src, uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst, uint32_t dst_x, uint32_t dst_y);
+void gen6_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src, uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst, uint32_t dst_x, uint32_t dst_y);
+void gen4_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src, uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst, uint32_t dst_x, uint32_t dst_y);
+void gen3_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src, uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst, uint32_t dst_x, uint32_t dst_y);
+void gen2_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src, uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst, uint32_t dst_x, uint32_t dst_y);
diff --git a/lib/rendercopy_gen4.c b/lib/rendercopy_gen4.c
index 413e3357..027ab571 100644
--- a/lib/rendercopy_gen4.c
+++ b/lib/rendercopy_gen4.c
@@ -2,6 +2,7 @@
 #include "intel_chipset.h"
 #include "gen4_render.h"
 #include "surfaceformat.h"
+#include "intel_bufops.h"
 
 #include <assert.h>
 
@@ -80,18 +81,13 @@ static const uint32_t gen5_ps_kernel_nomask_affine[][4] = {
 };
 
 static uint32_t
-batch_used(struct intel_batchbuffer *batch)
+batch_round_upto(struct intel_bb *ibb, uint32_t divisor)
 {
-	return batch->ptr - batch->buffer;
-}
-
-static uint32_t
-batch_round_upto(struct intel_batchbuffer *batch, uint32_t divisor)
-{
-	uint32_t offset = batch_used(batch);
+	uint32_t offset = intel_bb_offset(ibb);
 
 	offset = (offset + divisor - 1) / divisor * divisor;
-	batch->ptr = batch->buffer + offset;
+	intel_bb_ptr_set(ibb, offset);
+
 	return offset;
 }
 
@@ -120,30 +116,16 @@ static int gen4_max_wm_threads(uint32_t devid)
 	return IS_GEN5(devid) ? 72 : IS_G4X(devid) ? 50 : 32;
 }
 
-static void
-gen4_render_flush(struct intel_batchbuffer *batch,
-		  drm_intel_context *context, uint32_t batch_end)
-{
-	igt_assert_eq(drm_intel_bo_subdata(batch->bo,
-					   0, 4096, batch->buffer),
-		      0);
-	igt_assert_eq(drm_intel_gem_bo_context_exec(batch->bo, context,
-						    batch_end, 0),
-		      0);
-}
-
 static uint32_t
-gen4_bind_buf(struct intel_batchbuffer *batch,
-	      const struct igt_buf *buf,
-	      int is_dst)
+gen4_bind_buf(struct intel_bb *ibb, const struct intel_buf *buf, int is_dst)
 {
 	struct gen4_surface_state *ss;
 	uint32_t write_domain, read_domain;
-	int ret;
+	uint64_t address;
 
 	igt_assert_lte(buf->surface[0].stride, 128*1024);
-	igt_assert_lte(igt_buf_width(buf), 8192);
-	igt_assert_lte(igt_buf_height(buf), 8192);
+	igt_assert_lte(intel_buf_width(buf), 8192);
+	igt_assert_lte(intel_buf_height(buf), 8192);
 
 	if (is_dst) {
 		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
@@ -152,7 +134,7 @@ gen4_bind_buf(struct intel_batchbuffer *batch,
 		read_domain = I915_GEM_DOMAIN_SAMPLER;
 	}
 
-	ss = intel_batchbuffer_subdata_alloc(batch, sizeof(*ss), 32);
+	ss = intel_bb_ptr_align(ibb, 32);
 
 	ss->ss0.surface_type = SURFACE_2D;
 	switch (buf->bpp) {
@@ -165,102 +147,102 @@ gen4_bind_buf(struct intel_batchbuffer *batch,
 
 	ss->ss0.data_return_format = SURFACERETURNFORMAT_FLOAT32;
 	ss->ss0.color_blend = 1;
-	ss->ss1.base_addr = buf->bo->offset;
 
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				      intel_batchbuffer_subdata_offset(batch, ss) + 4,
-				      buf->bo, 0,
-				      read_domain, write_domain);
-	assert(ret == 0);
+	address = intel_bb_offset_reloc(ibb, buf->handle,
+					read_domain, write_domain,
+					intel_bb_offset(ibb) + 4,
+					buf->addr.offset);
+	ss->ss1.base_addr = (uint32_t) address;
 
-	ss->ss2.height = igt_buf_height(buf) - 1;
-	ss->ss2.width  = igt_buf_width(buf) - 1;
+	ss->ss2.height = intel_buf_height(buf) - 1;
+	ss->ss2.width  = intel_buf_width(buf) - 1;
 	ss->ss3.pitch  = buf->surface[0].stride - 1;
 	ss->ss3.tiled_surface = buf->tiling != I915_TILING_NONE;
 	ss->ss3.tile_walk     = buf->tiling == I915_TILING_Y;
 
-	return intel_batchbuffer_subdata_offset(batch, ss);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*ss));
 }
 
 static uint32_t
-gen4_bind_surfaces(struct intel_batchbuffer *batch,
-		   const struct igt_buf *src,
-		   const struct igt_buf *dst)
+gen4_bind_surfaces(struct intel_bb *ibb,
+		   const struct intel_buf *src,
+		   const struct intel_buf *dst)
 {
-	uint32_t *binding_table;
+	uint32_t *binding_table, binding_table_offset;
 
-	binding_table = intel_batchbuffer_subdata_alloc(batch, 32, 32);
+	binding_table = intel_bb_ptr_align(ibb, 32);
+	binding_table_offset = intel_bb_ptr_add_return_prev_offset(ibb, 32);
 
-	binding_table[0] = gen4_bind_buf(batch, dst, 1);
-	binding_table[1] = gen4_bind_buf(batch, src, 0);
+	binding_table[0] = gen4_bind_buf(ibb, dst, 1);
+	binding_table[1] = gen4_bind_buf(ibb, src, 0);
 
-	return intel_batchbuffer_subdata_offset(batch, binding_table);
+	return binding_table_offset;
 }
 
 static void
-gen4_emit_sip(struct intel_batchbuffer *batch)
+gen4_emit_sip(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN4_STATE_SIP | (2 - 2));
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN4_STATE_SIP | (2 - 2));
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen4_emit_state_base_address(struct intel_batchbuffer *batch)
+gen4_emit_state_base_address(struct intel_bb *ibb)
 {
-	if (IS_GEN5(batch->devid)) {
-		OUT_BATCH(GEN4_STATE_BASE_ADDRESS | (8 - 2));
-		OUT_RELOC(batch->bo, /* general */
-			  I915_GEM_DOMAIN_INSTRUCTION, 0,
-			  BASE_ADDRESS_MODIFY);
-		OUT_RELOC(batch->bo, /* surface */
-			  I915_GEM_DOMAIN_INSTRUCTION, 0,
-			  BASE_ADDRESS_MODIFY);
-		OUT_BATCH(0); /* media */
-		OUT_RELOC(batch->bo, /* instruction */
-			  I915_GEM_DOMAIN_INSTRUCTION, 0,
-			  BASE_ADDRESS_MODIFY);
+	if (IS_GEN5(ibb->devid)) {
+		intel_bb_out(ibb, GEN4_STATE_BASE_ADDRESS | (8 - 2));
+		intel_bb_emit_reloc(ibb, ibb->handle, /* general */
+				    I915_GEM_DOMAIN_INSTRUCTION, 0,
+				    BASE_ADDRESS_MODIFY, ibb->batch_offset);
+		intel_bb_emit_reloc(ibb, ibb->handle, /* surface */
+				    I915_GEM_DOMAIN_INSTRUCTION, 0,
+				    BASE_ADDRESS_MODIFY, ibb->batch_offset);
+		intel_bb_out(ibb, 0); /* media */
+		intel_bb_emit_reloc(ibb, ibb->handle, /* instruction */
+				    I915_GEM_DOMAIN_INSTRUCTION, 0,
+				    BASE_ADDRESS_MODIFY, ibb->batch_offset);
 
 		/* upper bounds, disable */
-		OUT_BATCH(BASE_ADDRESS_MODIFY); /* general */
-		OUT_BATCH(0); /* media */
-		OUT_BATCH(BASE_ADDRESS_MODIFY); /* instruction */
+		intel_bb_out(ibb, BASE_ADDRESS_MODIFY); /* general */
+		intel_bb_out(ibb, 0); /* media */
+		intel_bb_out(ibb, BASE_ADDRESS_MODIFY); /* instruction */
 	} else {
-		OUT_BATCH(GEN4_STATE_BASE_ADDRESS | (6 - 2));
-		OUT_RELOC(batch->bo, /* general */
-			  I915_GEM_DOMAIN_INSTRUCTION, 0,
-			  BASE_ADDRESS_MODIFY);
-		OUT_RELOC(batch->bo, /* surface */
-			  I915_GEM_DOMAIN_INSTRUCTION, 0,
-			  BASE_ADDRESS_MODIFY);
-		OUT_BATCH(0); /* media */
+		intel_bb_out(ibb, GEN4_STATE_BASE_ADDRESS | (6 - 2));
+		intel_bb_emit_reloc(ibb, ibb->handle, /* general */
+				    I915_GEM_DOMAIN_INSTRUCTION, 0,
+				    BASE_ADDRESS_MODIFY, ibb->batch_offset);
+		intel_bb_emit_reloc(ibb, ibb->handle, /* surface */
+				    I915_GEM_DOMAIN_INSTRUCTION, 0,
+				    BASE_ADDRESS_MODIFY, ibb->batch_offset);
+		intel_bb_out(ibb, 0); /* media */
 
 		/* upper bounds, disable */
-		OUT_BATCH(BASE_ADDRESS_MODIFY); /* general */
-		OUT_BATCH(0); /* media */
+		intel_bb_out(ibb, BASE_ADDRESS_MODIFY); /* general */
+		intel_bb_out(ibb, 0); /* media */
 	}
 }
 
 static void
-gen4_emit_pipelined_pointers(struct intel_batchbuffer *batch,
+gen4_emit_pipelined_pointers(struct intel_bb *ibb,
 			     uint32_t vs, uint32_t sf,
 			     uint32_t wm, uint32_t cc)
 {
-	OUT_BATCH(GEN4_3DSTATE_PIPELINED_POINTERS | (7 - 2));
-	OUT_BATCH(vs);
-	OUT_BATCH(GEN4_GS_DISABLE);
-	OUT_BATCH(GEN4_CLIP_DISABLE);
-	OUT_BATCH(sf);
-	OUT_BATCH(wm);
-	OUT_BATCH(cc);
+	intel_bb_out(ibb, GEN4_3DSTATE_PIPELINED_POINTERS | (7 - 2));
+	intel_bb_out(ibb, vs);
+	intel_bb_out(ibb, GEN4_GS_DISABLE);
+	intel_bb_out(ibb, GEN4_CLIP_DISABLE);
+	intel_bb_out(ibb, sf);
+	intel_bb_out(ibb, wm);
+	intel_bb_out(ibb, cc);
 }
 
 static void
-gen4_emit_urb(struct intel_batchbuffer *batch)
+gen4_emit_urb(struct intel_bb *ibb)
 {
-	int vs_entries = gen4_max_vs_nr_urb_entries(batch->devid);
+	int vs_entries = gen4_max_vs_nr_urb_entries(ibb->devid);
 	int gs_entries = 0;
 	int cl_entries = 0;
-	int sf_entries = gen4_max_sf_nr_urb_entries(batch->devid);
+	int sf_entries = gen4_max_sf_nr_urb_entries(ibb->devid);
 	int cs_entries = 0;
 
 	int urb_vs_end =              vs_entries * URB_VS_ENTRY_SIZE;
@@ -269,91 +251,91 @@ gen4_emit_urb(struct intel_batchbuffer *batch)
 	int urb_sf_end = urb_cl_end + sf_entries * URB_SF_ENTRY_SIZE;
 	int urb_cs_end = urb_sf_end + cs_entries * URB_CS_ENTRY_SIZE;
 
-	assert(urb_cs_end <= gen4_urb_size(batch->devid));
-
-	intel_batchbuffer_align(batch, 16);
-
-	OUT_BATCH(GEN4_URB_FENCE |
-		  UF0_CS_REALLOC |
-		  UF0_SF_REALLOC |
-		  UF0_CLIP_REALLOC |
-		  UF0_GS_REALLOC |
-		  UF0_VS_REALLOC |
-		  (3 - 2));
-	OUT_BATCH(urb_cl_end << UF1_CLIP_FENCE_SHIFT |
-		  urb_gs_end << UF1_GS_FENCE_SHIFT |
-		  urb_vs_end << UF1_VS_FENCE_SHIFT);
-	OUT_BATCH(urb_cs_end << UF2_CS_FENCE_SHIFT |
-		  urb_sf_end << UF2_SF_FENCE_SHIFT);
-
-	OUT_BATCH(GEN4_CS_URB_STATE | (2 - 2));
-	OUT_BATCH((URB_CS_ENTRY_SIZE - 1) << 4 | cs_entries << 0);
+	assert(urb_cs_end <= gen4_urb_size(ibb->devid));
+
+	intel_bb_ptr_align(ibb, 16);
+
+	intel_bb_out(ibb, GEN4_URB_FENCE |
+		     UF0_CS_REALLOC |
+		     UF0_SF_REALLOC |
+		     UF0_CLIP_REALLOC |
+		     UF0_GS_REALLOC |
+		     UF0_VS_REALLOC |
+		     (3 - 2));
+	intel_bb_out(ibb, urb_cl_end << UF1_CLIP_FENCE_SHIFT |
+		     urb_gs_end << UF1_GS_FENCE_SHIFT |
+		     urb_vs_end << UF1_VS_FENCE_SHIFT);
+	intel_bb_out(ibb, urb_cs_end << UF2_CS_FENCE_SHIFT |
+		     urb_sf_end << UF2_SF_FENCE_SHIFT);
+
+	intel_bb_out(ibb, GEN4_CS_URB_STATE | (2 - 2));
+	intel_bb_out(ibb, (URB_CS_ENTRY_SIZE - 1) << 4 | cs_entries << 0);
 }
 
 static void
-gen4_emit_null_depth_buffer(struct intel_batchbuffer *batch)
+gen4_emit_null_depth_buffer(struct intel_bb *ibb)
 {
-	if (IS_G4X(batch->devid) || IS_GEN5(batch->devid)) {
-		OUT_BATCH(GEN4_3DSTATE_DEPTH_BUFFER | (6 - 2));
-		OUT_BATCH(SURFACE_NULL << GEN4_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT |
-			  GEN4_DEPTHFORMAT_D32_FLOAT << GEN4_3DSTATE_DEPTH_BUFFER_FORMAT_SHIFT);
-		OUT_BATCH(0);
-		OUT_BATCH(0);
-		OUT_BATCH(0);
-		OUT_BATCH(0);
+	if (IS_G4X(ibb->devid) || IS_GEN5(ibb->devid)) {
+		intel_bb_out(ibb, GEN4_3DSTATE_DEPTH_BUFFER | (6 - 2));
+		intel_bb_out(ibb, SURFACE_NULL << GEN4_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT |
+			     GEN4_DEPTHFORMAT_D32_FLOAT << GEN4_3DSTATE_DEPTH_BUFFER_FORMAT_SHIFT);
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, 0);
 	} else {
-		OUT_BATCH(GEN4_3DSTATE_DEPTH_BUFFER | (5 - 2));
-		OUT_BATCH(SURFACE_NULL << GEN4_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT |
-			  GEN4_DEPTHFORMAT_D32_FLOAT << GEN4_3DSTATE_DEPTH_BUFFER_FORMAT_SHIFT);
-		OUT_BATCH(0);
-		OUT_BATCH(0);
-		OUT_BATCH(0);
+		intel_bb_out(ibb, GEN4_3DSTATE_DEPTH_BUFFER | (5 - 2));
+		intel_bb_out(ibb, SURFACE_NULL << GEN4_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT |
+			     GEN4_DEPTHFORMAT_D32_FLOAT << GEN4_3DSTATE_DEPTH_BUFFER_FORMAT_SHIFT);
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, 0);
 	}
 
-	if (IS_GEN5(batch->devid)) {
-		OUT_BATCH(GEN4_3DSTATE_CLEAR_PARAMS | (2 - 2));
-		OUT_BATCH(0);
+	if (IS_GEN5(ibb->devid)) {
+		intel_bb_out(ibb, GEN4_3DSTATE_CLEAR_PARAMS | (2 - 2));
+		intel_bb_out(ibb, 0);
 	}
 }
 
 static void
-gen4_emit_invariant(struct intel_batchbuffer *batch)
+gen4_emit_invariant(struct intel_bb *ibb)
 {
-	OUT_BATCH(MI_FLUSH | MI_INHIBIT_RENDER_CACHE_FLUSH);
+	intel_bb_out(ibb, MI_FLUSH | MI_INHIBIT_RENDER_CACHE_FLUSH);
 
-	if (IS_GEN5(batch->devid) || IS_G4X(batch->devid))
-		OUT_BATCH(G4X_PIPELINE_SELECT | PIPELINE_SELECT_3D);
+	if (IS_GEN5(ibb->devid) || IS_G4X(ibb->devid))
+		intel_bb_out(ibb, G4X_PIPELINE_SELECT | PIPELINE_SELECT_3D);
 	else
-		OUT_BATCH(GEN4_PIPELINE_SELECT | PIPELINE_SELECT_3D);
+		intel_bb_out(ibb, GEN4_PIPELINE_SELECT | PIPELINE_SELECT_3D);
 }
 
 static uint32_t
-gen4_create_vs_state(struct intel_batchbuffer *batch)
+gen4_create_vs_state(struct intel_bb *ibb)
 {
 	struct gen4_vs_state *vs;
 	int nr_urb_entries;
 
-	vs = intel_batchbuffer_subdata_alloc(batch, sizeof(*vs), 32);
+	vs = intel_bb_ptr_align(ibb, 32);
 
 	/* Set up the vertex shader to be disabled (passthrough) */
-	nr_urb_entries = gen4_max_vs_nr_urb_entries(batch->devid);
-	if (IS_GEN5(batch->devid))
+	nr_urb_entries = gen4_max_vs_nr_urb_entries(ibb->devid);
+	if (IS_GEN5(ibb->devid))
 		nr_urb_entries >>= 2;
 	vs->vs4.nr_urb_entries = nr_urb_entries;
 	vs->vs4.urb_entry_allocation_size = URB_VS_ENTRY_SIZE - 1;
 	vs->vs6.vs_enable = 0;
 	vs->vs6.vert_cache_disable = 1;
 
-	return intel_batchbuffer_subdata_offset(batch, vs);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*vs));
 }
 
 static uint32_t
-gen4_create_sf_state(struct intel_batchbuffer *batch,
+gen4_create_sf_state(struct intel_bb *ibb,
 		     uint32_t kernel)
 {
 	struct gen4_sf_state *sf;
 
-	sf = intel_batchbuffer_subdata_alloc(batch, sizeof(*sf), 32);
+	sf = intel_bb_ptr_align(ibb, 32);
 
 	sf->sf0.grf_reg_count = GEN4_GRF_BLOCKS(SF_KERNEL_NUM_GRF);
 	sf->sf0.kernel_start_pointer = kernel >> 6;
@@ -363,25 +345,25 @@ gen4_create_sf_state(struct intel_batchbuffer *batch,
 	sf->sf3.urb_entry_read_offset = 1;
 	sf->sf3.dispatch_grf_start_reg = 3;
 
-	sf->sf4.max_threads = gen4_max_sf_threads(batch->devid) - 1;
+	sf->sf4.max_threads = gen4_max_sf_threads(ibb->devid) - 1;
 	sf->sf4.urb_entry_allocation_size = URB_SF_ENTRY_SIZE - 1;
-	sf->sf4.nr_urb_entries = gen4_max_sf_nr_urb_entries(batch->devid);
+	sf->sf4.nr_urb_entries = gen4_max_sf_nr_urb_entries(ibb->devid);
 
 	sf->sf6.cull_mode = GEN4_CULLMODE_NONE;
 	sf->sf6.dest_org_vbias = 0x8;
 	sf->sf6.dest_org_hbias = 0x8;
 
-	return intel_batchbuffer_subdata_offset(batch, sf);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*sf));
 }
 
 static uint32_t
-gen4_create_wm_state(struct intel_batchbuffer *batch,
+gen4_create_wm_state(struct intel_bb *ibb,
 		     uint32_t kernel,
 		     uint32_t sampler)
 {
 	struct gen4_wm_state *wm;
 
-	wm = intel_batchbuffer_subdata_alloc(batch, sizeof(*wm), 32);
+	wm = intel_bb_ptr_align(ibb, 32);
 
 	assert((kernel & 63) == 0);
 	wm->wm0.kernel_start_pointer = kernel >> 6;
@@ -394,48 +376,48 @@ gen4_create_wm_state(struct intel_batchbuffer *batch,
 	wm->wm4.sampler_state_pointer = sampler >> 5;
 	wm->wm4.sampler_count = 1;
 
-	wm->wm5.max_threads = gen4_max_wm_threads(batch->devid);
+	wm->wm5.max_threads = gen4_max_wm_threads(ibb->devid);
 	wm->wm5.thread_dispatch_enable = 1;
 	wm->wm5.enable_16_pix = 1;
 	wm->wm5.early_depth_test = 1;
 
-	if (IS_GEN5(batch->devid))
+	if (IS_GEN5(ibb->devid))
 		wm->wm1.binding_table_entry_count = 0;
 	else
 		wm->wm1.binding_table_entry_count = 2;
 	wm->wm3.urb_entry_read_length = 2;
 
-	return intel_batchbuffer_subdata_offset(batch, wm);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*wm));
 }
 
 static void
-gen4_emit_binding_table(struct intel_batchbuffer *batch,
+gen4_emit_binding_table(struct intel_bb *ibb,
 			uint32_t wm_table)
 {
-	OUT_BATCH(GEN4_3DSTATE_BINDING_TABLE_POINTERS | (6 - 2));
-	OUT_BATCH(0);		/* vs */
-	OUT_BATCH(0);		/* gs */
-	OUT_BATCH(0);		/* clip */
-	OUT_BATCH(0);		/* sf */
-	OUT_BATCH(wm_table);    /* ps */
+	intel_bb_out(ibb, GEN4_3DSTATE_BINDING_TABLE_POINTERS | (6 - 2));
+	intel_bb_out(ibb, 0);		/* vs */
+	intel_bb_out(ibb, 0);		/* gs */
+	intel_bb_out(ibb, 0);		/* clip */
+	intel_bb_out(ibb, 0);		/* sf */
+	intel_bb_out(ibb, wm_table);    /* ps */
 }
 
 static void
-gen4_emit_drawing_rectangle(struct intel_batchbuffer *batch,
-			    const struct igt_buf *dst)
+gen4_emit_drawing_rectangle(struct intel_bb *ibb,
+			    const struct intel_buf *dst)
 {
-	OUT_BATCH(GEN4_3DSTATE_DRAWING_RECTANGLE | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH((igt_buf_height(dst) - 1) << 16 |
-		  (igt_buf_width(dst) - 1));
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN4_3DSTATE_DRAWING_RECTANGLE | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, (intel_buf_height(dst) - 1) << 16 |
+		     (intel_buf_width(dst) - 1));
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen4_emit_vertex_elements(struct intel_batchbuffer *batch)
+gen4_emit_vertex_elements(struct intel_bb *ibb)
 {
 
-	if (IS_GEN5(batch->devid)) {
+	if (IS_GEN5(ibb->devid)) {
 		/* The VUE layout
 		 *    dword 0-3: pad (0.0, 0.0, 0.0, 0.0),
 		 *    dword 4-7: position (x, y, 1.0, 1.0),
@@ -443,34 +425,34 @@ gen4_emit_vertex_elements(struct intel_batchbuffer *batch)
 		 *
 		 * dword 4-11 are fetched from vertex buffer
 		 */
-		OUT_BATCH(GEN4_3DSTATE_VERTEX_ELEMENTS | (3 * 2 + 1 - 2));
+		intel_bb_out(ibb, GEN4_3DSTATE_VERTEX_ELEMENTS | (3 * 2 + 1 - 2));
 
 		/* pad */
-		OUT_BATCH(0 << GEN4_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN4_VE0_VALID |
-			  SURFACEFORMAT_R32G32B32A32_FLOAT << VE0_FORMAT_SHIFT |
-			  0 << VE0_OFFSET_SHIFT);
-		OUT_BATCH(GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
+		intel_bb_out(ibb, 0 << GEN4_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN4_VE0_VALID |
+			     SURFACEFORMAT_R32G32B32A32_FLOAT << VE0_FORMAT_SHIFT |
+			     0 << VE0_OFFSET_SHIFT);
+		intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
 
 		/* x,y */
-		OUT_BATCH(0 << GEN4_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN4_VE0_VALID |
-			  SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
-			  0 << VE0_OFFSET_SHIFT);
-		OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_2_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
+		intel_bb_out(ibb, 0 << GEN4_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN4_VE0_VALID |
+			     SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
+			     0 << VE0_OFFSET_SHIFT);
+		intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_2_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
 
 		/* u0, v0 */
-		OUT_BATCH(0 << GEN4_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN4_VE0_VALID |
-			  SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
-			  4 << VE0_OFFSET_SHIFT);
-		OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
+		intel_bb_out(ibb, 0 << GEN4_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN4_VE0_VALID |
+			     SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
+			     4 << VE0_OFFSET_SHIFT);
+		intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
 	} else {
 		/* The VUE layout
 		 *    dword 0-3: position (x, y, 1.0, 1.0),
@@ -478,90 +460,88 @@ gen4_emit_vertex_elements(struct intel_batchbuffer *batch)
 		 *
 		 * dword 0-7 are fetched from vertex buffer
 		 */
-		OUT_BATCH(GEN4_3DSTATE_VERTEX_ELEMENTS | (2 * 2 + 1 - 2));
+		intel_bb_out(ibb, GEN4_3DSTATE_VERTEX_ELEMENTS | (2 * 2 + 1 - 2));
 
 		/* x,y */
-		OUT_BATCH(0 << GEN4_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN4_VE0_VALID |
-			  SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
-			  0 << VE0_OFFSET_SHIFT);
-		OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_2_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT |
-			  4 << VE1_DESTINATION_ELEMENT_OFFSET_SHIFT);
+		intel_bb_out(ibb, 0 << GEN4_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN4_VE0_VALID |
+			     SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
+			     0 << VE0_OFFSET_SHIFT);
+		intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_2_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT |
+			     4 << VE1_DESTINATION_ELEMENT_OFFSET_SHIFT);
 
 		/* u0, v0 */
-		OUT_BATCH(0 << GEN4_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN4_VE0_VALID |
-			  SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
-			  4 << VE0_OFFSET_SHIFT);
-		OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-			  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT |
-			  8 << VE1_DESTINATION_ELEMENT_OFFSET_SHIFT);
+		intel_bb_out(ibb, 0 << GEN4_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN4_VE0_VALID |
+			     SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
+			     4 << VE0_OFFSET_SHIFT);
+		intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+			     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT |
+			     8 << VE1_DESTINATION_ELEMENT_OFFSET_SHIFT);
 	}
 }
 
 static uint32_t
-gen4_create_cc_viewport(struct intel_batchbuffer *batch)
+gen4_create_cc_viewport(struct intel_bb *ibb)
 {
 	struct gen4_cc_viewport *vp;
 
-	vp = intel_batchbuffer_subdata_alloc(batch, sizeof(*vp), 32);
+	vp = intel_bb_ptr_align(ibb, 32);
 
 	vp->min_depth = -1.e35;
 	vp->max_depth = 1.e35;
 
-	return intel_batchbuffer_subdata_offset(batch, vp);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*vp));
 }
 
 static uint32_t
-gen4_create_cc_state(struct intel_batchbuffer *batch,
+gen4_create_cc_state(struct intel_bb *ibb,
 		     uint32_t cc_vp)
 {
 	struct gen4_color_calc_state *cc;
 
-	cc = intel_batchbuffer_subdata_alloc(batch, sizeof(*cc), 64);
+	cc = intel_bb_ptr_align(ibb, 64);
 
 	cc->cc4.cc_viewport_state_offset = cc_vp;
 
-	return intel_batchbuffer_subdata_offset(batch, cc);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*cc));
 }
 
 static uint32_t
-gen4_create_sf_kernel(struct intel_batchbuffer *batch)
+gen4_create_sf_kernel(struct intel_bb *ibb)
 {
-	if (IS_GEN5(batch->devid))
-		return intel_batchbuffer_copy_data(batch, gen5_sf_kernel_nomask,
-						   sizeof(gen5_sf_kernel_nomask),
-						   64);
+	if (IS_GEN5(ibb->devid))
+		return intel_bb_copy_data(ibb, gen5_sf_kernel_nomask,
+					  sizeof(gen5_sf_kernel_nomask), 64);
 	else
-		return intel_batchbuffer_copy_data(batch, gen4_sf_kernel_nomask,
-						   sizeof(gen4_sf_kernel_nomask),
-						   64);
+		return intel_bb_copy_data(ibb, gen4_sf_kernel_nomask,
+					  sizeof(gen4_sf_kernel_nomask), 64);
 }
 
 static uint32_t
-gen4_create_ps_kernel(struct intel_batchbuffer *batch)
+gen4_create_ps_kernel(struct intel_bb *ibb)
 {
-	if (IS_GEN5(batch->devid))
-		return intel_batchbuffer_copy_data(batch, gen5_ps_kernel_nomask_affine,
-						   sizeof(gen5_ps_kernel_nomask_affine),
-						   64);
+	if (IS_GEN5(ibb->devid))
+		return intel_bb_copy_data(ibb, gen5_ps_kernel_nomask_affine,
+					  sizeof(gen5_ps_kernel_nomask_affine),
+					  64);
 	else
-		return intel_batchbuffer_copy_data(batch, gen4_ps_kernel_nomask_affine,
-						   sizeof(gen4_ps_kernel_nomask_affine),
-						   64);
+		return intel_bb_copy_data(ibb, gen4_ps_kernel_nomask_affine,
+					  sizeof(gen4_ps_kernel_nomask_affine),
+					  64);
 }
 
 static uint32_t
-gen4_create_sampler(struct intel_batchbuffer *batch,
+gen4_create_sampler(struct intel_bb *ibb,
 		    sampler_filter_t filter,
 		    sampler_extend_t extend)
 {
 	struct gen4_sampler_state *ss;
 
-	ss = intel_batchbuffer_subdata_alloc(batch, sizeof(*ss), 32);
+	ss = intel_bb_ptr_align(ibb, 32);
 
 	ss->ss0.lod_preclamp = GEN4_LOD_PRECLAMP_OGL;
 
@@ -606,50 +586,52 @@ gen4_create_sampler(struct intel_batchbuffer *batch,
 		break;
 	}
 
-	return intel_batchbuffer_subdata_offset(batch, ss);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*ss));
 }
 
-static void gen4_emit_vertex_buffer(struct intel_batchbuffer *batch)
+static void gen4_emit_vertex_buffer(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN4_3DSTATE_VERTEX_BUFFERS | (5 - 2));
-	OUT_BATCH(GEN4_VB0_VERTEXDATA |
-		  0 << GEN4_VB0_BUFFER_INDEX_SHIFT |
-		  VERTEX_SIZE << VB0_BUFFER_PITCH_SHIFT);
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_VERTEX, 0, 0);
-	if (IS_GEN5(batch->devid))
-		OUT_RELOC(batch->bo, I915_GEM_DOMAIN_VERTEX, 0,
-			  batch->bo->size - 1);
+	intel_bb_out(ibb, GEN4_3DSTATE_VERTEX_BUFFERS | (5 - 2));
+	intel_bb_out(ibb, GEN4_VB0_VERTEXDATA |
+		     0 << GEN4_VB0_BUFFER_INDEX_SHIFT |
+		     VERTEX_SIZE << VB0_BUFFER_PITCH_SHIFT);
+	intel_bb_emit_reloc(ibb, ibb->handle,
+			    I915_GEM_DOMAIN_VERTEX, 0, 0, ibb->batch_offset);
+	if (IS_GEN5(ibb->devid))
+		intel_bb_emit_reloc(ibb, ibb->handle,
+				    I915_GEM_DOMAIN_VERTEX, 0,
+				    ibb->size - 1, ibb->batch_offset);
 	else
-		OUT_BATCH(batch->bo->size / VERTEX_SIZE - 1);
-	OUT_BATCH(0);
+		intel_bb_out(ibb, ibb->size / VERTEX_SIZE - 1);
+	intel_bb_out(ibb, 0);
 }
 
-static uint32_t gen4_emit_primitive(struct intel_batchbuffer *batch)
+static uint32_t gen4_emit_primitive(struct intel_bb *ibb)
 {
 	uint32_t offset;
 
-	OUT_BATCH(GEN4_3DPRIMITIVE |
-		  GEN4_3DPRIMITIVE_VERTEX_SEQUENTIAL |
-		  _3DPRIM_RECTLIST << GEN4_3DPRIMITIVE_TOPOLOGY_SHIFT |
-		  0 << 9 |
-		  (6 - 2));
-	OUT_BATCH(3);	/* vertex count */
-	offset = batch_used(batch);
-	OUT_BATCH(0);	/* vertex_index */
-	OUT_BATCH(1);	/* single instance */
-	OUT_BATCH(0);	/* start instance location */
-	OUT_BATCH(0);	/* index buffer offset, ignored */
+	intel_bb_out(ibb, GEN4_3DPRIMITIVE |
+		     GEN4_3DPRIMITIVE_VERTEX_SEQUENTIAL |
+		     _3DPRIM_RECTLIST << GEN4_3DPRIMITIVE_TOPOLOGY_SHIFT |
+		     0 << 9 |
+		     (6 - 2));
+	intel_bb_out(ibb, 3);	/* vertex count */
+	offset = intel_bb_offset(ibb);
+	intel_bb_out(ibb, 0);	/* vertex_index */
+	intel_bb_out(ibb, 1);	/* single instance */
+	intel_bb_out(ibb, 0);	/* start instance location */
+	intel_bb_out(ibb, 0);	/* index buffer offset, ignored */
 
 	return offset;
 }
 
-void gen4_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src,
-			  unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst,
-			  unsigned dst_x, unsigned dst_y)
+void gen4_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src,
+			  uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst,
+			  uint32_t dst_x, uint32_t dst_y)
 {
 	uint32_t cc, cc_vp;
 	uint32_t wm, wm_sampler, wm_kernel, wm_table;
@@ -658,60 +640,64 @@ void gen4_render_copyfunc(struct intel_batchbuffer *batch,
 	uint32_t offset, batch_end;
 
 	igt_assert(src->bpp == dst->bpp);
-	intel_batchbuffer_flush_with_context(batch, context);
 
-	batch->ptr = batch->buffer + 1024;
-	intel_batchbuffer_subdata_alloc(batch, 64, 64);
+	intel_bb_flush_render_with_context(ibb, ctx);
+
+	intel_bb_add_intel_buf(ibb, dst, true);
+	intel_bb_add_intel_buf(ibb, src, false);
+
+	intel_bb_ptr_set(ibb, 1024 + 64);
 
-	vs = gen4_create_vs_state(batch);
+	vs = gen4_create_vs_state(ibb);
 
-	sf_kernel = gen4_create_sf_kernel(batch);
-	sf = gen4_create_sf_state(batch, sf_kernel);
+	sf_kernel = gen4_create_sf_kernel(ibb);
+	sf = gen4_create_sf_state(ibb, sf_kernel);
 
-	wm_table = gen4_bind_surfaces(batch, src, dst);
-	wm_kernel = gen4_create_ps_kernel(batch);
-	wm_sampler = gen4_create_sampler(batch,
+	wm_table = gen4_bind_surfaces(ibb, src, dst);
+	wm_kernel = gen4_create_ps_kernel(ibb);
+	wm_sampler = gen4_create_sampler(ibb,
 					 SAMPLER_FILTER_NEAREST,
 					 SAMPLER_EXTEND_NONE);
-	wm = gen4_create_wm_state(batch, wm_kernel, wm_sampler);
+	wm = gen4_create_wm_state(ibb, wm_kernel, wm_sampler);
 
-	cc_vp = gen4_create_cc_viewport(batch);
-	cc = gen4_create_cc_state(batch, cc_vp);
+	cc_vp = gen4_create_cc_viewport(ibb);
+	cc = gen4_create_cc_state(ibb, cc_vp);
 
-	batch->ptr = batch->buffer;
+	intel_bb_ptr_set(ibb, 0);
 
-	gen4_emit_invariant(batch);
-	gen4_emit_state_base_address(batch);
-	gen4_emit_sip(batch);
-	gen4_emit_null_depth_buffer(batch);
+	gen4_emit_invariant(ibb);
+	gen4_emit_state_base_address(ibb);
+	gen4_emit_sip(ibb);
+	gen4_emit_null_depth_buffer(ibb);
 
-	gen4_emit_drawing_rectangle(batch, dst);
-	gen4_emit_binding_table(batch, wm_table);
-	gen4_emit_vertex_elements(batch);
-	gen4_emit_pipelined_pointers(batch, vs, sf, wm, cc);
-	gen4_emit_urb(batch);
+	gen4_emit_drawing_rectangle(ibb, dst);
+	gen4_emit_binding_table(ibb, wm_table);
+	gen4_emit_vertex_elements(ibb);
+	gen4_emit_pipelined_pointers(ibb, vs, sf, wm, cc);
+	gen4_emit_urb(ibb);
 
-	gen4_emit_vertex_buffer(batch);
-	offset = gen4_emit_primitive(batch);
+	gen4_emit_vertex_buffer(ibb);
+	offset = gen4_emit_primitive(ibb);
 
-	OUT_BATCH(MI_BATCH_BUFFER_END);
-	batch_end = intel_batchbuffer_align(batch, 8);
+	batch_end = intel_bb_emit_bbe(ibb);
 
-	*(uint32_t *)(batch->buffer + offset) =
-		batch_round_upto(batch, VERTEX_SIZE)/VERTEX_SIZE;
+	ibb->batch[offset / sizeof(uint32_t)] =
+			batch_round_upto(ibb, VERTEX_SIZE)/VERTEX_SIZE;
 
-	emit_vertex_2s(batch, dst_x + width, dst_y + height);
-	emit_vertex_normalized(batch, src_x + width, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y + height, igt_buf_height(src));
+	emit_vertex_2s(ibb, dst_x + width, dst_y + height);
+	emit_vertex_normalized(ibb, src_x + width, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y + height, intel_buf_height(src));
 
-	emit_vertex_2s(batch, dst_x, dst_y + height);
-	emit_vertex_normalized(batch, src_x, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y + height, igt_buf_height(src));
+	emit_vertex_2s(ibb, dst_x, dst_y + height);
+	emit_vertex_normalized(ibb, src_x, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y + height, intel_buf_height(src));
 
-	emit_vertex_2s(batch, dst_x, dst_y);
-	emit_vertex_normalized(batch, src_x, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y, igt_buf_height(src));
+	emit_vertex_2s(ibb, dst_x, dst_y);
+	emit_vertex_normalized(ibb, src_x, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y, intel_buf_height(src));
 
-	gen4_render_flush(batch, context, batch_end);
-	intel_batchbuffer_reset(batch);
+	intel_bb_exec_with_context(ibb, batch_end, ctx,
+				   I915_EXEC_DEFAULT | I915_EXEC_NO_RELOC,
+				   false);
+	intel_bb_reset(ibb, false);
 }
diff --git a/lib/rendercopy_gen6.c b/lib/rendercopy_gen6.c
index 16cbb679..227bde05 100644
--- a/lib/rendercopy_gen6.c
+++ b/lib/rendercopy_gen6.c
@@ -12,7 +12,7 @@
 #include "drm.h"
 #include "i915_drm.h"
 #include "drmtest.h"
-#include "intel_bufmgr.h"
+#include "intel_bufops.h"
 #include "intel_batchbuffer.h"
 #include "intel_io.h"
 #include "rendercopy.h"
@@ -49,38 +49,26 @@ static const uint32_t ps_kernel_nomask_affine[][4] = {
 };
 
 static uint32_t
-batch_round_upto(struct intel_batchbuffer *batch, uint32_t divisor)
+batch_round_upto(struct intel_bb *ibb, uint32_t divisor)
 {
-	uint32_t offset = batch->ptr - batch->buffer;
+	uint32_t offset = intel_bb_offset(ibb);
 
-	offset = (offset + divisor-1) / divisor * divisor;
-	batch->ptr = batch->buffer + offset;
-	return offset;
-}
+	offset = (offset + divisor - 1) / divisor * divisor;
+	intel_bb_ptr_set(ibb, offset);
 
-static void
-gen6_render_flush(struct intel_batchbuffer *batch,
-		  drm_intel_context *context, uint32_t batch_end)
-{
-	igt_assert_eq(drm_intel_bo_subdata(batch->bo,
-					   0, 4096, batch->buffer),
-		      0);
-	igt_assert_eq(drm_intel_gem_bo_context_exec(batch->bo,
-						    context, batch_end, 0),
-		      0);
+	return offset;
 }
 
 static uint32_t
-gen6_bind_buf(struct intel_batchbuffer *batch, const struct igt_buf *buf,
-	      int is_dst)
+gen6_bind_buf(struct intel_bb *ibb, const struct intel_buf *buf, int is_dst)
 {
 	struct gen6_surface_state *ss;
 	uint32_t write_domain, read_domain;
-	int ret;
+	uint64_t address;
 
 	igt_assert_lte(buf->surface[0].stride, 128*1024);
-	igt_assert_lte(igt_buf_width(buf), 8192);
-	igt_assert_lte(igt_buf_height(buf), 8192);
+	igt_assert_lte(intel_buf_width(buf), 8192);
+	igt_assert_lte(intel_buf_height(buf), 8192);
 
 	if (is_dst) {
 		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
@@ -89,7 +77,7 @@ gen6_bind_buf(struct intel_batchbuffer *batch, const struct igt_buf *buf,
 		read_domain = I915_GEM_DOMAIN_SAMPLER;
 	}
 
-	ss = intel_batchbuffer_subdata_alloc(batch, sizeof(*ss), 32);
+	ss = intel_bb_ptr_align(ibb, 32);
 	ss->ss0.surface_type = SURFACE_2D;
 
 	switch (buf->bpp) {
@@ -102,265 +90,265 @@ gen6_bind_buf(struct intel_batchbuffer *batch, const struct igt_buf *buf,
 
 	ss->ss0.data_return_format = SURFACERETURNFORMAT_FLOAT32;
 	ss->ss0.color_blend = 1;
-	ss->ss1.base_addr = buf->bo->offset;
 
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				      intel_batchbuffer_subdata_offset(batch, &ss->ss1),
-				      buf->bo, 0,
-				      read_domain, write_domain);
-	igt_assert(ret == 0);
+	address = intel_bb_offset_reloc(ibb, buf->handle,
+					read_domain, write_domain,
+					intel_bb_offset(ibb) + 4,
+					buf->addr.offset);
+	ss->ss1.base_addr = (uint32_t) address;
 
-	ss->ss2.height = igt_buf_height(buf) - 1;
-	ss->ss2.width  = igt_buf_width(buf) - 1;
+	ss->ss2.height = intel_buf_height(buf) - 1;
+	ss->ss2.width  = intel_buf_width(buf) - 1;
 	ss->ss3.pitch  = buf->surface[0].stride - 1;
 	ss->ss3.tiled_surface = buf->tiling != I915_TILING_NONE;
 	ss->ss3.tile_walk     = buf->tiling == I915_TILING_Y;
 
 	ss->ss5.memory_object_control = GEN6_MOCS_PTE;
 
-	return intel_batchbuffer_subdata_offset(batch, ss);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*ss));
 }
 
 static uint32_t
-gen6_bind_surfaces(struct intel_batchbuffer *batch,
-		   const struct igt_buf *src,
-		   const struct igt_buf *dst)
+gen6_bind_surfaces(struct intel_bb *ibb,
+		   const struct intel_buf *src,
+		   const struct intel_buf *dst)
 {
-	uint32_t *binding_table;
+	uint32_t *binding_table, binding_table_offset;
 
-	binding_table = intel_batchbuffer_subdata_alloc(batch, 32, 32);
+	binding_table = intel_bb_ptr_align(ibb, 32);
+	binding_table_offset = intel_bb_ptr_add_return_prev_offset(ibb, 32);
 
-	binding_table[0] = gen6_bind_buf(batch, dst, 1);
-	binding_table[1] = gen6_bind_buf(batch, src, 0);
+	binding_table[0] = gen6_bind_buf(ibb, dst, 1);
+	binding_table[1] = gen6_bind_buf(ibb, src, 0);
 
-	return intel_batchbuffer_subdata_offset(batch, binding_table);
+	return binding_table_offset;
 }
 
 static void
-gen6_emit_sip(struct intel_batchbuffer *batch)
+gen6_emit_sip(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN4_STATE_SIP | 0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN4_STATE_SIP | 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen6_emit_urb(struct intel_batchbuffer *batch)
+gen6_emit_urb(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN6_3DSTATE_URB | (3 - 2));
-	OUT_BATCH((1 - 1) << GEN6_3DSTATE_URB_VS_SIZE_SHIFT |
-		  24 << GEN6_3DSTATE_URB_VS_ENTRIES_SHIFT); /* at least 24 on GEN6 */
-	OUT_BATCH(0 << GEN6_3DSTATE_URB_GS_SIZE_SHIFT |
-		  0 << GEN6_3DSTATE_URB_GS_ENTRIES_SHIFT); /* no GS thread */
+	intel_bb_out(ibb, GEN6_3DSTATE_URB | (3 - 2));
+	intel_bb_out(ibb, (1 - 1) << GEN6_3DSTATE_URB_VS_SIZE_SHIFT |
+		     24 << GEN6_3DSTATE_URB_VS_ENTRIES_SHIFT); /* at least 24 on GEN6 */
+	intel_bb_out(ibb, 0 << GEN6_3DSTATE_URB_GS_SIZE_SHIFT |
+		     0 << GEN6_3DSTATE_URB_GS_ENTRIES_SHIFT); /* no GS thread */
 }
 
 static void
-gen6_emit_state_base_address(struct intel_batchbuffer *batch)
+gen6_emit_state_base_address(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN4_STATE_BASE_ADDRESS | (10 - 2));
-	OUT_BATCH(0); /* general */
-	OUT_RELOC(batch->bo, /* surface */
-		  I915_GEM_DOMAIN_INSTRUCTION, 0,
-		  BASE_ADDRESS_MODIFY);
-	OUT_RELOC(batch->bo, /* instruction */
-		  I915_GEM_DOMAIN_INSTRUCTION, 0,
-		  BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0); /* indirect */
-	OUT_RELOC(batch->bo, /* dynamic */
-		  I915_GEM_DOMAIN_INSTRUCTION, 0,
-		  BASE_ADDRESS_MODIFY);
+	intel_bb_out(ibb, GEN4_STATE_BASE_ADDRESS | (10 - 2));
+	intel_bb_out(ibb, 0); /* general */
+	intel_bb_emit_reloc(ibb, ibb->handle, /* surface */
+			    I915_GEM_DOMAIN_INSTRUCTION, 0,
+			    BASE_ADDRESS_MODIFY, ibb->batch_offset);
+	intel_bb_emit_reloc(ibb, ibb->handle, /* instruction */
+			    I915_GEM_DOMAIN_INSTRUCTION, 0,
+			    BASE_ADDRESS_MODIFY, ibb->batch_offset);
+	intel_bb_out(ibb, 0); /* indirect */
+	intel_bb_emit_reloc(ibb, ibb->handle, /* dynamic */
+			    I915_GEM_DOMAIN_INSTRUCTION, 0,
+			    BASE_ADDRESS_MODIFY, ibb->batch_offset);
 
 	/* upper bounds, disable */
-	OUT_BATCH(0);
-	OUT_BATCH(BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
-	OUT_BATCH(BASE_ADDRESS_MODIFY);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, BASE_ADDRESS_MODIFY);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, BASE_ADDRESS_MODIFY);
 }
 
 static void
-gen6_emit_viewports(struct intel_batchbuffer *batch, uint32_t cc_vp)
+gen6_emit_viewports(struct intel_bb *ibb, uint32_t cc_vp)
 {
-	OUT_BATCH(GEN6_3DSTATE_VIEWPORT_STATE_POINTERS |
-		  GEN6_3DSTATE_VIEWPORT_STATE_MODIFY_CC |
-		  (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(cc_vp);
+	intel_bb_out(ibb, GEN6_3DSTATE_VIEWPORT_STATE_POINTERS |
+		     GEN6_3DSTATE_VIEWPORT_STATE_MODIFY_CC |
+		     (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, cc_vp);
 }
 
 static void
-gen6_emit_vs(struct intel_batchbuffer *batch)
+gen6_emit_vs(struct intel_bb *ibb)
 {
 	/* disable VS constant buffer */
-	OUT_BATCH(GEN6_3DSTATE_CONSTANT_VS | (5 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN6_3DSTATE_VS | (6 - 2));
-	OUT_BATCH(0); /* no VS kernel */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* pass-through */
+	intel_bb_out(ibb, GEN6_3DSTATE_CONSTANT_VS | (5 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN6_3DSTATE_VS | (6 - 2));
+	intel_bb_out(ibb, 0); /* no VS kernel */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* pass-through */
 }
 
 static void
-gen6_emit_gs(struct intel_batchbuffer *batch)
+gen6_emit_gs(struct intel_bb *ibb)
 {
 	/* disable GS constant buffer */
-	OUT_BATCH(GEN6_3DSTATE_CONSTANT_GS | (5 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN6_3DSTATE_GS | (7 - 2));
-	OUT_BATCH(0); /* no GS kernel */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* pass-through */
+	intel_bb_out(ibb, GEN6_3DSTATE_CONSTANT_GS | (5 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN6_3DSTATE_GS | (7 - 2));
+	intel_bb_out(ibb, 0); /* no GS kernel */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* pass-through */
 }
 
 static void
-gen6_emit_clip(struct intel_batchbuffer *batch)
+gen6_emit_clip(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN6_3DSTATE_CLIP | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* pass-through */
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN6_3DSTATE_CLIP | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* pass-through */
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen6_emit_wm_constants(struct intel_batchbuffer *batch)
+gen6_emit_wm_constants(struct intel_bb *ibb)
 {
 	/* disable WM constant buffer */
-	OUT_BATCH(GEN6_3DSTATE_CONSTANT_PS | (5 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN6_3DSTATE_CONSTANT_PS | (5 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen6_emit_null_depth_buffer(struct intel_batchbuffer *batch)
+gen6_emit_null_depth_buffer(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN4_3DSTATE_DEPTH_BUFFER | (7 - 2));
-	OUT_BATCH(SURFACE_NULL << GEN4_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT |
-		  GEN4_DEPTHFORMAT_D32_FLOAT << GEN4_3DSTATE_DEPTH_BUFFER_FORMAT_SHIFT);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN4_3DSTATE_CLEAR_PARAMS | (2 - 2));
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN4_3DSTATE_DEPTH_BUFFER | (7 - 2));
+	intel_bb_out(ibb, SURFACE_NULL << GEN4_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT |
+		     GEN4_DEPTHFORMAT_D32_FLOAT << GEN4_3DSTATE_DEPTH_BUFFER_FORMAT_SHIFT);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN4_3DSTATE_CLEAR_PARAMS | (2 - 2));
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen6_emit_invariant(struct intel_batchbuffer *batch)
+gen6_emit_invariant(struct intel_bb *ibb)
 {
-	OUT_BATCH(G4X_PIPELINE_SELECT | PIPELINE_SELECT_3D);
+	intel_bb_out(ibb, G4X_PIPELINE_SELECT | PIPELINE_SELECT_3D);
 
-	OUT_BATCH(GEN6_3DSTATE_MULTISAMPLE | (3 - 2));
-	OUT_BATCH(GEN6_3DSTATE_MULTISAMPLE_PIXEL_LOCATION_CENTER |
-		  GEN6_3DSTATE_MULTISAMPLE_NUMSAMPLES_1); /* 1 sample/pixel */
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN6_3DSTATE_MULTISAMPLE | (3 - 2));
+	intel_bb_out(ibb, GEN6_3DSTATE_MULTISAMPLE_PIXEL_LOCATION_CENTER |
+		     GEN6_3DSTATE_MULTISAMPLE_NUMSAMPLES_1); /* 1 sample/pixel */
+	intel_bb_out(ibb, 0);
 
-	OUT_BATCH(GEN6_3DSTATE_SAMPLE_MASK | (2 - 2));
-	OUT_BATCH(1);
+	intel_bb_out(ibb, GEN6_3DSTATE_SAMPLE_MASK | (2 - 2));
+	intel_bb_out(ibb, 1);
 }
 
 static void
-gen6_emit_cc(struct intel_batchbuffer *batch, uint32_t blend)
+gen6_emit_cc(struct intel_bb *ibb, uint32_t blend)
 {
-	OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS | (4 - 2));
-	OUT_BATCH(blend | 1);
-	OUT_BATCH(1024 | 1);
-	OUT_BATCH(1024 | 1);
+	intel_bb_out(ibb, GEN6_3DSTATE_CC_STATE_POINTERS | (4 - 2));
+	intel_bb_out(ibb, blend | 1);
+	intel_bb_out(ibb, 1024 | 1);
+	intel_bb_out(ibb, 1024 | 1);
 }
 
 static void
-gen6_emit_sampler(struct intel_batchbuffer *batch, uint32_t state)
+gen6_emit_sampler(struct intel_bb *ibb, uint32_t state)
 {
-	OUT_BATCH(GEN6_3DSTATE_SAMPLER_STATE_POINTERS |
-		  GEN6_3DSTATE_SAMPLER_STATE_MODIFY_PS |
-		  (4 - 2));
-	OUT_BATCH(0); /* VS */
-	OUT_BATCH(0); /* GS */
-	OUT_BATCH(state);
+	intel_bb_out(ibb, GEN6_3DSTATE_SAMPLER_STATE_POINTERS |
+		     GEN6_3DSTATE_SAMPLER_STATE_MODIFY_PS |
+		     (4 - 2));
+	intel_bb_out(ibb, 0); /* VS */
+	intel_bb_out(ibb, 0); /* GS */
+	intel_bb_out(ibb, state);
 }
 
 static void
-gen6_emit_sf(struct intel_batchbuffer *batch)
+gen6_emit_sf(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN6_3DSTATE_SF | (20 - 2));
-	OUT_BATCH(1 << GEN6_3DSTATE_SF_NUM_OUTPUTS_SHIFT |
-		  1 << GEN6_3DSTATE_SF_URB_ENTRY_READ_LENGTH_SHIFT |
-		  1 << GEN6_3DSTATE_SF_URB_ENTRY_READ_OFFSET_SHIFT);
-	OUT_BATCH(0);
-	OUT_BATCH(GEN6_3DSTATE_SF_CULL_NONE);
-	OUT_BATCH(2 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT); /* DW4 */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* DW9 */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* DW14 */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* DW19 */
+	intel_bb_out(ibb, GEN6_3DSTATE_SF | (20 - 2));
+	intel_bb_out(ibb, 1 << GEN6_3DSTATE_SF_NUM_OUTPUTS_SHIFT |
+		     1 << GEN6_3DSTATE_SF_URB_ENTRY_READ_LENGTH_SHIFT |
+		     1 << GEN6_3DSTATE_SF_URB_ENTRY_READ_OFFSET_SHIFT);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, GEN6_3DSTATE_SF_CULL_NONE);
+	intel_bb_out(ibb, 2 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT); /* DW4 */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* DW9 */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* DW14 */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* DW19 */
 }
 
 static void
-gen6_emit_wm(struct intel_batchbuffer *batch, int kernel)
+gen6_emit_wm(struct intel_bb *ibb, int kernel)
 {
-	OUT_BATCH(GEN6_3DSTATE_WM | (9 - 2));
-	OUT_BATCH(kernel);
-	OUT_BATCH(1 << GEN6_3DSTATE_WM_SAMPLER_COUNT_SHIFT |
-		  2 << GEN6_3DSTATE_WM_BINDING_TABLE_ENTRY_COUNT_SHIFT);
-	OUT_BATCH(0);
-	OUT_BATCH(6 << GEN6_3DSTATE_WM_DISPATCH_START_GRF_0_SHIFT); /* DW4 */
-	OUT_BATCH((40 - 1) << GEN6_3DSTATE_WM_MAX_THREADS_SHIFT |
-		  GEN6_3DSTATE_WM_DISPATCH_ENABLE |
-		  GEN6_3DSTATE_WM_16_DISPATCH_ENABLE);
-	OUT_BATCH(1 << GEN6_3DSTATE_WM_NUM_SF_OUTPUTS_SHIFT |
-		  GEN6_3DSTATE_WM_PERSPECTIVE_PIXEL_BARYCENTRIC);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN6_3DSTATE_WM | (9 - 2));
+	intel_bb_out(ibb, kernel);
+	intel_bb_out(ibb, 1 << GEN6_3DSTATE_WM_SAMPLER_COUNT_SHIFT |
+		     2 << GEN6_3DSTATE_WM_BINDING_TABLE_ENTRY_COUNT_SHIFT);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 6 << GEN6_3DSTATE_WM_DISPATCH_START_GRF_0_SHIFT); /* DW4 */
+	intel_bb_out(ibb, (40 - 1) << GEN6_3DSTATE_WM_MAX_THREADS_SHIFT |
+		     GEN6_3DSTATE_WM_DISPATCH_ENABLE |
+		     GEN6_3DSTATE_WM_16_DISPATCH_ENABLE);
+	intel_bb_out(ibb, 1 << GEN6_3DSTATE_WM_NUM_SF_OUTPUTS_SHIFT |
+		     GEN6_3DSTATE_WM_PERSPECTIVE_PIXEL_BARYCENTRIC);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen6_emit_binding_table(struct intel_batchbuffer *batch, uint32_t wm_table)
+gen6_emit_binding_table(struct intel_bb *ibb, uint32_t wm_table)
 {
-	OUT_BATCH(GEN4_3DSTATE_BINDING_TABLE_POINTERS |
-		  GEN6_3DSTATE_BINDING_TABLE_MODIFY_PS |
-		  (4 - 2));
-	OUT_BATCH(0);		/* vs */
-	OUT_BATCH(0);		/* gs */
-	OUT_BATCH(wm_table);
+	intel_bb_out(ibb, GEN4_3DSTATE_BINDING_TABLE_POINTERS |
+		     GEN6_3DSTATE_BINDING_TABLE_MODIFY_PS |
+		     (4 - 2));
+	intel_bb_out(ibb, 0);		/* vs */
+	intel_bb_out(ibb, 0);		/* gs */
+	intel_bb_out(ibb, wm_table);
 }
 
 static void
-gen6_emit_drawing_rectangle(struct intel_batchbuffer *batch, const struct igt_buf *dst)
+gen6_emit_drawing_rectangle(struct intel_bb *ibb, const struct intel_buf *dst)
 {
-	OUT_BATCH(GEN4_3DSTATE_DRAWING_RECTANGLE | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH((igt_buf_height(dst) - 1) << 16 | (igt_buf_width(dst) - 1));
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN4_3DSTATE_DRAWING_RECTANGLE | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, (intel_buf_height(dst) - 1) << 16 | (intel_buf_width(dst) - 1));
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen6_emit_vertex_elements(struct intel_batchbuffer *batch)
+gen6_emit_vertex_elements(struct intel_bb *ibb)
 {
 	/* The VUE layout
 	 *    dword 0-3: pad (0.0, 0.0, 0.0, 0.0)
@@ -369,54 +357,54 @@ gen6_emit_vertex_elements(struct intel_batchbuffer *batch)
 	 *
 	 * dword 4-11 are fetched from vertex buffer
 	 */
-	OUT_BATCH(GEN4_3DSTATE_VERTEX_ELEMENTS | (2 * 3 + 1 - 2));
+	intel_bb_out(ibb, GEN4_3DSTATE_VERTEX_ELEMENTS | (2 * 3 + 1 - 2));
 
-	OUT_BATCH(0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
-		  SURFACEFORMAT_R32G32B32A32_FLOAT << VE0_FORMAT_SHIFT |
-		  0 << VE0_OFFSET_SHIFT);
-	OUT_BATCH(GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
+	intel_bb_out(ibb, 0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
+		     SURFACEFORMAT_R32G32B32A32_FLOAT << VE0_FORMAT_SHIFT |
+		     0 << VE0_OFFSET_SHIFT);
+	intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
 
 	/* x,y */
-	OUT_BATCH(0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
-		  SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
-		  0 << VE0_OFFSET_SHIFT); /* offsets vb in bytes */
-	OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
+	intel_bb_out(ibb, 0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
+		     SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
+		     0 << VE0_OFFSET_SHIFT); /* offsets vb in bytes */
+	intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_2_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
 
 	/* u0, v0 */
-	OUT_BATCH(0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
-		  SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
-		  4 << VE0_OFFSET_SHIFT);	/* offset vb in bytes */
-	OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
+	intel_bb_out(ibb, 0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
+		     SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
+		     4 << VE0_OFFSET_SHIFT);	/* offset vb in bytes */
+	intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
 }
 
 static uint32_t
-gen6_create_cc_viewport(struct intel_batchbuffer *batch)
+gen6_create_cc_viewport(struct intel_bb *ibb)
 {
 	struct gen4_cc_viewport *vp;
 
-	vp = intel_batchbuffer_subdata_alloc(batch, sizeof(*vp), 32);
+	vp = intel_bb_ptr_align(ibb, 32);
 
 	vp->min_depth = -1.e35;
 	vp->max_depth = 1.e35;
 
-	return intel_batchbuffer_subdata_offset(batch, vp);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*vp));
 }
 
 static uint32_t
-gen6_create_cc_blend(struct intel_batchbuffer *batch)
+gen6_create_cc_blend(struct intel_bb *ibb)
 {
 	struct gen6_blend_state *blend;
 
-	blend = intel_batchbuffer_subdata_alloc(batch, sizeof(*blend), 64);
+	blend = intel_bb_ptr_align(ibb, 64);
 
 	blend->blend0.dest_blend_factor = GEN6_BLENDFACTOR_ZERO;
 	blend->blend0.source_blend_factor = GEN6_BLENDFACTOR_ONE;
@@ -426,25 +414,24 @@ gen6_create_cc_blend(struct intel_batchbuffer *batch)
 	blend->blend1.post_blend_clamp_enable = 1;
 	blend->blend1.pre_blend_clamp_enable = 1;
 
-	return intel_batchbuffer_subdata_offset(batch, blend);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*blend));
 }
 
 static uint32_t
-gen6_create_kernel(struct intel_batchbuffer *batch)
+gen6_create_kernel(struct intel_bb *ibb)
 {
-	return intel_batchbuffer_copy_data(batch, ps_kernel_nomask_affine,
-			  sizeof(ps_kernel_nomask_affine),
-			  64);
+	return intel_bb_copy_data(ibb, ps_kernel_nomask_affine,
+				  sizeof(ps_kernel_nomask_affine), 64);
 }
 
 static uint32_t
-gen6_create_sampler(struct intel_batchbuffer *batch,
+gen6_create_sampler(struct intel_bb *ibb,
 		    sampler_filter_t filter,
-		   sampler_extend_t extend)
+		    sampler_extend_t extend)
 {
 	struct gen6_sampler_state *ss;
 
-	ss = intel_batchbuffer_subdata_alloc(batch, sizeof(*ss), 32);
+	ss = intel_bb_ptr_align(ibb, 32);
 	ss->ss0.lod_preclamp = 1;	/* GL mode */
 
 	/* We use the legacy mode to get the semantics specified by
@@ -487,107 +474,116 @@ gen6_create_sampler(struct intel_batchbuffer *batch,
 		break;
 	}
 
-	return intel_batchbuffer_subdata_offset(batch, ss);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*ss));
 }
 
-static void gen6_emit_vertex_buffer(struct intel_batchbuffer *batch)
+static void gen6_emit_vertex_buffer(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN4_3DSTATE_VERTEX_BUFFERS | 3);
-	OUT_BATCH(GEN6_VB0_VERTEXDATA |
-		  0 << GEN6_VB0_BUFFER_INDEX_SHIFT |
-		  VERTEX_SIZE << VB0_BUFFER_PITCH_SHIFT);
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_VERTEX, 0, 0);
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_VERTEX, 0, batch->bo->size-1);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN4_3DSTATE_VERTEX_BUFFERS | 3);
+	intel_bb_out(ibb, GEN6_VB0_VERTEXDATA |
+		     0 << GEN6_VB0_BUFFER_INDEX_SHIFT |
+		     VERTEX_SIZE << VB0_BUFFER_PITCH_SHIFT);
+	intel_bb_emit_reloc(ibb, ibb->handle, I915_GEM_DOMAIN_VERTEX, 0,
+			    0, ibb->batch_offset);
+	intel_bb_emit_reloc(ibb, ibb->handle, I915_GEM_DOMAIN_VERTEX, 0,
+			    ibb->size - 1, ibb->batch_offset);
+	intel_bb_out(ibb, 0);
 }
 
-static uint32_t gen6_emit_primitive(struct intel_batchbuffer *batch)
+static uint32_t gen6_emit_primitive(struct intel_bb *ibb)
 {
 	uint32_t offset;
 
-	OUT_BATCH(GEN4_3DPRIMITIVE |
-		  GEN4_3DPRIMITIVE_VERTEX_SEQUENTIAL |
-		  _3DPRIM_RECTLIST << GEN4_3DPRIMITIVE_TOPOLOGY_SHIFT |
-		  0 << 9 |
-		  4);
-	OUT_BATCH(3);	/* vertex count */
-	offset = batch->ptr - batch->buffer;
-	OUT_BATCH(0);	/* vertex_index */
-	OUT_BATCH(1);	/* single instance */
-	OUT_BATCH(0);	/* start instance location */
-	OUT_BATCH(0);	/* index buffer offset, ignored */
+	intel_bb_out(ibb, GEN4_3DPRIMITIVE |
+		     GEN4_3DPRIMITIVE_VERTEX_SEQUENTIAL |
+		     _3DPRIM_RECTLIST << GEN4_3DPRIMITIVE_TOPOLOGY_SHIFT |
+		     0 << 9 |
+		     4);
+	intel_bb_out(ibb, 3);	/* vertex count */
+	offset = intel_bb_offset(ibb);
+	intel_bb_out(ibb, 0);	/* vertex_index */
+	intel_bb_out(ibb, 1);	/* single instance */
+	intel_bb_out(ibb, 0);	/* start instance location */
+	intel_bb_out(ibb, 0);	/* index buffer offset, ignored */
 
 	return offset;
 }
 
-void gen6_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y)
+void gen6_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src,
+			  uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst,
+			  uint32_t dst_x, uint32_t dst_y)
 {
 	uint32_t wm_state, wm_kernel, wm_table;
 	uint32_t cc_vp, cc_blend, offset;
 	uint32_t batch_end;
 
 	igt_assert(src->bpp == dst->bpp);
-	intel_batchbuffer_flush_with_context(batch, context);
 
-	batch->ptr = batch->buffer + 1024;
-	intel_batchbuffer_subdata_alloc(batch, 64, 64);
-	wm_table  = gen6_bind_surfaces(batch, src, dst);
-	wm_kernel = gen6_create_kernel(batch);
-	wm_state  = gen6_create_sampler(batch,
+	intel_bb_flush_render_with_context(ibb, ctx);
+
+	intel_bb_add_intel_buf(ibb, dst, true);
+	intel_bb_add_intel_buf(ibb, src, false);
+
+	intel_bb_ptr_set(ibb, 1024 + 64);
+
+	wm_table  = gen6_bind_surfaces(ibb, src, dst);
+	wm_kernel = gen6_create_kernel(ibb);
+	wm_state  = gen6_create_sampler(ibb,
 					SAMPLER_FILTER_NEAREST,
 					SAMPLER_EXTEND_NONE);
 
-	cc_vp = gen6_create_cc_viewport(batch);
-	cc_blend = gen6_create_cc_blend(batch);
+	cc_vp = gen6_create_cc_viewport(ibb);
+	cc_blend = gen6_create_cc_blend(ibb);
 
-	batch->ptr = batch->buffer;
+	intel_bb_ptr_set(ibb, 0);
 
-	gen6_emit_invariant(batch);
-	gen6_emit_state_base_address(batch);
+	gen6_emit_invariant(ibb);
+	gen6_emit_state_base_address(ibb);
 
-	gen6_emit_sip(batch);
-	gen6_emit_urb(batch);
+	gen6_emit_sip(ibb);
+	gen6_emit_urb(ibb);
 
-	gen6_emit_viewports(batch, cc_vp);
-	gen6_emit_vs(batch);
-	gen6_emit_gs(batch);
-	gen6_emit_clip(batch);
-	gen6_emit_wm_constants(batch);
-	gen6_emit_null_depth_buffer(batch);
+	gen6_emit_viewports(ibb, cc_vp);
+	gen6_emit_vs(ibb);
+	gen6_emit_gs(ibb);
+	gen6_emit_clip(ibb);
+	gen6_emit_wm_constants(ibb);
+	gen6_emit_null_depth_buffer(ibb);
 
-	gen6_emit_drawing_rectangle(batch, dst);
-	gen6_emit_cc(batch, cc_blend);
-	gen6_emit_sampler(batch, wm_state);
-	gen6_emit_sf(batch);
-	gen6_emit_wm(batch, wm_kernel);
-	gen6_emit_vertex_elements(batch);
-	gen6_emit_binding_table(batch, wm_table);
+	gen6_emit_drawing_rectangle(ibb, dst);
+	gen6_emit_cc(ibb, cc_blend);
+	gen6_emit_sampler(ibb, wm_state);
+	gen6_emit_sf(ibb);
+	gen6_emit_wm(ibb, wm_kernel);
+	gen6_emit_vertex_elements(ibb);
+	gen6_emit_binding_table(ibb, wm_table);
 
-	gen6_emit_vertex_buffer(batch);
-	offset = gen6_emit_primitive(batch);
+	gen6_emit_vertex_buffer(ibb);
+	offset = gen6_emit_primitive(ibb);
 
-	OUT_BATCH(MI_BATCH_BUFFER_END);
-	batch_end = intel_batchbuffer_align(batch, 8);
+	batch_end = intel_bb_emit_bbe(ibb);
 
-	*(uint32_t*)(batch->buffer + offset) =
-		batch_round_upto(batch, VERTEX_SIZE)/VERTEX_SIZE;
+	ibb->batch[offset / sizeof(uint32_t)] =
+			batch_round_upto(ibb, VERTEX_SIZE)/VERTEX_SIZE;
 
-	emit_vertex_2s(batch, dst_x + width, dst_y + height);
-	emit_vertex_normalized(batch, src_x + width, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y + height, igt_buf_height(src));
+	emit_vertex_2s(ibb, dst_x + width, dst_y + height);
+	emit_vertex_normalized(ibb, src_x + width, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y + height, intel_buf_height(src));
 
-	emit_vertex_2s(batch, dst_x, dst_y + height);
-	emit_vertex_normalized(batch, src_x, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y + height, igt_buf_height(src));
+	emit_vertex_2s(ibb, dst_x, dst_y + height);
+	emit_vertex_normalized(ibb, src_x, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y + height, intel_buf_height(src));
 
-	emit_vertex_2s(batch, dst_x, dst_y);
-	emit_vertex_normalized(batch, src_x, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y, igt_buf_height(src));
+	emit_vertex_2s(ibb, dst_x, dst_y);
+	emit_vertex_normalized(ibb, src_x, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y, intel_buf_height(src));
 
-	gen6_render_flush(batch, context, batch_end);
-	intel_batchbuffer_reset(batch);
+	intel_bb_exec_with_context(ibb, batch_end, ctx,
+				   I915_EXEC_DEFAULT | I915_EXEC_NO_RELOC,
+				   false);
+	intel_bb_reset(ibb, false);
 }
diff --git a/lib/rendercopy_gen7.c b/lib/rendercopy_gen7.c
index 93b4da72..62ef4325 100644
--- a/lib/rendercopy_gen7.c
+++ b/lib/rendercopy_gen7.c
@@ -12,7 +12,7 @@
 #include "drm.h"
 #include "i915_drm.h"
 #include "drmtest.h"
-#include "intel_bufmgr.h"
+#include "intel_bufops.h"
 #include "intel_batchbuffer.h"
 #include "intel_io.h"
 #include "intel_chipset.h"
@@ -20,6 +20,14 @@
 #include "gen7_render.h"
 #include "intel_reg.h"
 
+#if DEBUG_RENDERCPY
+static void dump_batch(struct intel_bb *ibb)
+{
+	intel_bb_dump(ibb, "/tmp/gen7-batchbuffers.dump");
+}
+#else
+#define dump_batch(x) do { } while (0)
+#endif
 
 static const uint32_t ps_kernel[][4] = {
 	{ 0x0080005a, 0x2e2077bd, 0x000000c0, 0x008d0040 },
@@ -32,17 +40,6 @@ static const uint32_t ps_kernel[][4] = {
 	{ 0x05800031, 0x20001fa8, 0x008d0e20, 0x90031000 },
 };
 
-static void
-gen7_render_flush(struct intel_batchbuffer *batch,
-		  drm_intel_context *context, uint32_t batch_end)
-{
-	igt_assert_eq(drm_intel_bo_subdata(batch->bo,
-					   0, 4096, batch->buffer),
-		      0);
-	igt_assert_eq(drm_intel_gem_bo_context_exec(batch->bo, context,
-						    batch_end, 0),
-		      0);
-}
 
 static uint32_t
 gen7_tiling_bits(uint32_t tiling)
@@ -56,17 +53,17 @@ gen7_tiling_bits(uint32_t tiling)
 }
 
 static uint32_t
-gen7_bind_buf(struct intel_batchbuffer *batch,
-	      const struct igt_buf *buf,
+gen7_bind_buf(struct intel_bb *ibb,
+	      const struct intel_buf *buf,
 	      int is_dst)
 {
 	uint32_t format, *ss;
 	uint32_t write_domain, read_domain;
-	int ret;
+	uint64_t address;
 
 	igt_assert_lte(buf->surface[0].stride, 256*1024);
-	igt_assert_lte(igt_buf_width(buf), 16384);
-	igt_assert_lte(igt_buf_height(buf), 16384);
+	igt_assert_lte(intel_buf_width(buf), 16384);
+	igt_assert_lte(intel_buf_height(buf), 16384);
 
 	switch (buf->bpp) {
 		case 8: format = SURFACEFORMAT_R8_UNORM; break;
@@ -83,77 +80,76 @@ gen7_bind_buf(struct intel_batchbuffer *batch,
 		read_domain = I915_GEM_DOMAIN_SAMPLER;
 	}
 
-	ss = intel_batchbuffer_subdata_alloc(batch, 8 * sizeof(*ss), 32);
+	ss = intel_bb_ptr_align(ibb, 32);
 
 	ss[0] = (SURFACE_2D << GEN7_SURFACE_TYPE_SHIFT |
 		 gen7_tiling_bits(buf->tiling) |
 		format << GEN7_SURFACE_FORMAT_SHIFT);
-	ss[1] = buf->bo->offset;
-	ss[2] = ((igt_buf_width(buf) - 1)  << GEN7_SURFACE_WIDTH_SHIFT |
-		 (igt_buf_height(buf) - 1) << GEN7_SURFACE_HEIGHT_SHIFT);
+
+	address = intel_bb_offset_reloc(ibb, buf->handle,
+					read_domain, write_domain,
+					intel_bb_offset(ibb) + 4,
+					buf->addr.offset);
+	ss[1] = address;
+	ss[2] = ((intel_buf_width(buf) - 1)  << GEN7_SURFACE_WIDTH_SHIFT |
+		 (intel_buf_height(buf) - 1) << GEN7_SURFACE_HEIGHT_SHIFT);
 	ss[3] = (buf->surface[0].stride - 1) << GEN7_SURFACE_PITCH_SHIFT;
 	ss[4] = 0;
-	if (IS_VALLEYVIEW(batch->devid))
+	if (IS_VALLEYVIEW(ibb->devid))
 		ss[5] = VLV_MOCS_L3 << 16;
 	else
 		ss[5] = (IVB_MOCS_L3 | IVB_MOCS_PTE) << 16;
 	ss[6] = 0;
 	ss[7] = 0;
-	if (IS_HASWELL(batch->devid))
+	if (IS_HASWELL(ibb->devid))
 		ss[7] |= HSW_SURFACE_SWIZZLE(RED, GREEN, BLUE, ALPHA);
 
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				      intel_batchbuffer_subdata_offset(batch, &ss[1]),
-				      buf->bo, 0,
-				      read_domain, write_domain);
-	igt_assert(ret == 0);
-
-	return intel_batchbuffer_subdata_offset(batch, ss);
+	return intel_bb_ptr_add_return_prev_offset(ibb, 8 * sizeof(*ss));
 }
 
 static void
-gen7_emit_vertex_elements(struct intel_batchbuffer *batch)
+gen7_emit_vertex_elements(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN4_3DSTATE_VERTEX_ELEMENTS |
-		  ((2 * (1 + 2)) + 1 - 2));
+	intel_bb_out(ibb, GEN4_3DSTATE_VERTEX_ELEMENTS |
+		     ((2 * (1 + 2)) + 1 - 2));
 
-	OUT_BATCH(0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
-		  SURFACEFORMAT_R32G32B32A32_FLOAT << VE0_FORMAT_SHIFT |
-		  0 << VE0_OFFSET_SHIFT);
+	intel_bb_out(ibb, 0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
+		     SURFACEFORMAT_R32G32B32A32_FLOAT << VE0_FORMAT_SHIFT |
+		     0 << VE0_OFFSET_SHIFT);
 
-	OUT_BATCH(GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
+	intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
 
 	/* x,y */
-	OUT_BATCH(0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
-		  SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
-		  0 << VE0_OFFSET_SHIFT); /* offsets vb in bytes */
-	OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
+	intel_bb_out(ibb, 0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
+		     SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
+		     0 << VE0_OFFSET_SHIFT); /* offsets vb in bytes */
+	intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
 
 	/* s,t */
-	OUT_BATCH(0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
-		  SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
-		  4 << VE0_OFFSET_SHIFT);  /* offset vb in bytes */
-	OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
+	intel_bb_out(ibb, 0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
+		     SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
+		     4 << VE0_OFFSET_SHIFT);  /* offset vb in bytes */
+	intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
 }
 
 static uint32_t
-gen7_create_vertex_buffer(struct intel_batchbuffer *batch,
+gen7_create_vertex_buffer(struct intel_bb *ibb,
 			  uint32_t src_x, uint32_t src_y,
 			  uint32_t dst_x, uint32_t dst_y,
 			  uint32_t width, uint32_t height)
 {
 	uint16_t *v;
 
-	v = intel_batchbuffer_subdata_alloc(batch, 12 * sizeof(*v), 8);
+	v = intel_bb_ptr_align(ibb, 8);
 
 	v[0] = dst_x + width;
 	v[1] = dst_y + height;
@@ -170,66 +166,68 @@ gen7_create_vertex_buffer(struct intel_batchbuffer *batch,
 	v[10] = src_x;
 	v[11] = src_y;
 
-	return intel_batchbuffer_subdata_offset(batch, v);
+	return intel_bb_ptr_add_return_prev_offset(ibb, 12 * sizeof(*v));
 }
 
-static void gen7_emit_vertex_buffer(struct intel_batchbuffer *batch,
+static void gen7_emit_vertex_buffer(struct intel_bb *ibb,
 				    int src_x, int src_y,
 				    int dst_x, int dst_y,
 				    int width, int height,
 				    uint32_t offset)
 {
-	OUT_BATCH(GEN4_3DSTATE_VERTEX_BUFFERS | (5 - 2));
-	OUT_BATCH(0 << GEN6_VB0_BUFFER_INDEX_SHIFT |
-		  GEN6_VB0_VERTEXDATA |
-		  GEN7_VB0_ADDRESS_MODIFY_ENABLE |
-		  4 * 2 << VB0_BUFFER_PITCH_SHIFT);
-
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_VERTEX, 0, offset);
-	OUT_BATCH(~0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN4_3DSTATE_VERTEX_BUFFERS | (5 - 2));
+	intel_bb_out(ibb, 0 << GEN6_VB0_BUFFER_INDEX_SHIFT |
+		     GEN6_VB0_VERTEXDATA |
+		     GEN7_VB0_ADDRESS_MODIFY_ENABLE |
+		     4 * 2 << VB0_BUFFER_PITCH_SHIFT);
+
+	intel_bb_emit_reloc(ibb, ibb->handle, I915_GEM_DOMAIN_VERTEX, 0,
+			    offset, ibb->batch_offset);
+	intel_bb_out(ibb, ~0);
+	intel_bb_out(ibb, 0);
 }
 
 static uint32_t
-gen7_bind_surfaces(struct intel_batchbuffer *batch,
-		   const struct igt_buf *src,
-		   const struct igt_buf *dst)
+gen7_bind_surfaces(struct intel_bb *ibb,
+		   const struct intel_buf *src,
+		   const struct intel_buf *dst)
 {
-	uint32_t *binding_table;
+	uint32_t *binding_table, binding_table_offset;
 
-	binding_table = intel_batchbuffer_subdata_alloc(batch, 8, 32);
+	binding_table = intel_bb_ptr_align(ibb, 32);
+	binding_table_offset = intel_bb_ptr_add_return_prev_offset(ibb, 8);
 
-	binding_table[0] = gen7_bind_buf(batch, dst, 1);
-	binding_table[1] = gen7_bind_buf(batch, src, 0);
+	binding_table[0] = gen7_bind_buf(ibb, dst, 1);
+	binding_table[1] = gen7_bind_buf(ibb, src, 0);
 
-	return intel_batchbuffer_subdata_offset(batch, binding_table);
+	return binding_table_offset;
 }
 
 static void
-gen7_emit_binding_table(struct intel_batchbuffer *batch,
-			const struct igt_buf *src,
-			const struct igt_buf *dst,
+gen7_emit_binding_table(struct intel_bb *ibb,
+			const struct intel_buf *src,
+			const struct intel_buf *dst,
 			uint32_t bind_surf_off)
 {
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS | (2 - 2));
-	OUT_BATCH(bind_surf_off);
+	intel_bb_out(ibb, GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS | (2 - 2));
+	intel_bb_out(ibb, bind_surf_off);
 }
 
 static void
-gen7_emit_drawing_rectangle(struct intel_batchbuffer *batch, const struct igt_buf *dst)
+gen7_emit_drawing_rectangle(struct intel_bb *ibb, const struct intel_buf *dst)
 {
-	OUT_BATCH(GEN4_3DSTATE_DRAWING_RECTANGLE | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH((igt_buf_height(dst) - 1) << 16 | (igt_buf_width(dst) - 1));
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN4_3DSTATE_DRAWING_RECTANGLE | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, (intel_buf_height(dst) - 1) << 16 | (intel_buf_width(dst) - 1));
+	intel_bb_out(ibb, 0);
 }
 
 static uint32_t
-gen7_create_blend_state(struct intel_batchbuffer *batch)
+gen7_create_blend_state(struct intel_bb *ibb)
 {
 	struct gen6_blend_state *blend;
 
-	blend = intel_batchbuffer_subdata_alloc(batch, sizeof(*blend), 64);
+	blend = intel_bb_ptr_align(ibb, 64);
 
 	blend->blend0.dest_blend_factor = GEN6_BLENDFACTOR_ZERO;
 	blend->blend0.source_blend_factor = GEN6_BLENDFACTOR_ONE;
@@ -237,54 +235,61 @@ gen7_create_blend_state(struct intel_batchbuffer *batch)
 	blend->blend1.post_blend_clamp_enable = 1;
 	blend->blend1.pre_blend_clamp_enable = 1;
 
-	return intel_batchbuffer_subdata_offset(batch, blend);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*blend));
 }
 
 static void
-gen7_emit_state_base_address(struct intel_batchbuffer *batch)
+gen7_emit_state_base_address(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN4_STATE_BASE_ADDRESS | (10 - 2));
-	OUT_BATCH(0);
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
-
-	OUT_BATCH(0);
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	intel_bb_out(ibb, GEN4_STATE_BASE_ADDRESS | (10 - 2));
+	intel_bb_out(ibb, 0);
+
+	intel_bb_emit_reloc(ibb, ibb->handle, I915_GEM_DOMAIN_INSTRUCTION, 0,
+			    BASE_ADDRESS_MODIFY,
+			    ibb->batch_offset);
+	intel_bb_emit_reloc(ibb, ibb->handle, I915_GEM_DOMAIN_INSTRUCTION, 0,
+			    BASE_ADDRESS_MODIFY,
+			    ibb->batch_offset);
+	intel_bb_out(ibb, 0);
+	intel_bb_emit_reloc(ibb, ibb->handle, I915_GEM_DOMAIN_INSTRUCTION, 0,
+			    BASE_ADDRESS_MODIFY,
+			    ibb->batch_offset);
+
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0 | BASE_ADDRESS_MODIFY);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0 | BASE_ADDRESS_MODIFY);
 }
 
 static uint32_t
-gen7_create_cc_viewport(struct intel_batchbuffer *batch)
+gen7_create_cc_viewport(struct intel_bb *ibb)
 {
 	struct gen4_cc_viewport *vp;
 
-	vp = intel_batchbuffer_subdata_alloc(batch, sizeof(*vp), 32);
+	vp = intel_bb_ptr_align(ibb, 32);
 	vp->min_depth = -1.e35;
 	vp->max_depth = 1.e35;
 
-	return intel_batchbuffer_subdata_offset(batch, vp);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*vp));
 }
 
 static void
-gen7_emit_cc(struct intel_batchbuffer *batch, uint32_t blend_state,
+gen7_emit_cc(struct intel_bb *ibb, uint32_t blend_state,
 	     uint32_t cc_viewport)
 {
-	OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2));
-	OUT_BATCH(blend_state);
+	intel_bb_out(ibb, GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2));
+	intel_bb_out(ibb, blend_state);
 
-	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2));
-	OUT_BATCH(cc_viewport);
+	intel_bb_out(ibb, GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2));
+	intel_bb_out(ibb, cc_viewport);
 }
 
 static uint32_t
-gen7_create_sampler(struct intel_batchbuffer *batch)
+gen7_create_sampler(struct intel_bb *ibb)
 {
 	struct gen7_sampler_state *ss;
 
-	ss = intel_batchbuffer_subdata_alloc(batch, sizeof(*ss), 32);
+	ss = intel_bb_ptr_align(ibb, 32);
 
 	ss->ss0.min_filter = GEN4_MAPFILTER_NEAREST;
 	ss->ss0.mag_filter = GEN4_MAPFILTER_NEAREST;
@@ -295,283 +300,284 @@ gen7_create_sampler(struct intel_batchbuffer *batch)
 
 	ss->ss3.non_normalized_coord = 1;
 
-	return intel_batchbuffer_subdata_offset(batch, ss);
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*ss));
 }
 
 static void
-gen7_emit_sampler(struct intel_batchbuffer *batch, uint32_t sampler_off)
+gen7_emit_sampler(struct intel_bb *ibb, uint32_t sampler_off)
 {
-	OUT_BATCH(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS | (2 - 2));
-	OUT_BATCH(sampler_off);
+	intel_bb_out(ibb, GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS | (2 - 2));
+	intel_bb_out(ibb, sampler_off);
 }
 
 static void
-gen7_emit_multisample(struct intel_batchbuffer *batch)
+gen7_emit_multisample(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN6_3DSTATE_MULTISAMPLE | (4 - 2));
-	OUT_BATCH(GEN6_3DSTATE_MULTISAMPLE_PIXEL_LOCATION_CENTER |
-		  GEN6_3DSTATE_MULTISAMPLE_NUMSAMPLES_1); /* 1 sample/pixel */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN6_3DSTATE_SAMPLE_MASK | (2 - 2));
-	OUT_BATCH(1);
+	intel_bb_out(ibb, GEN6_3DSTATE_MULTISAMPLE | (4 - 2));
+	intel_bb_out(ibb, GEN6_3DSTATE_MULTISAMPLE_PIXEL_LOCATION_CENTER |
+		     GEN6_3DSTATE_MULTISAMPLE_NUMSAMPLES_1); /* 1 sample/pixel */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN6_3DSTATE_SAMPLE_MASK | (2 - 2));
+	intel_bb_out(ibb, 1);
 }
 
 static void
-gen7_emit_urb(struct intel_batchbuffer *batch)
+gen7_emit_urb(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS | (2 - 2));
-	OUT_BATCH(8); /* in 1KBs */
+	intel_bb_out(ibb, GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS | (2 - 2));
+	intel_bb_out(ibb, 8); /* in 1KBs */
 
 	/* num of VS entries must be divisible by 8 if size < 9 */
-	OUT_BATCH(GEN7_3DSTATE_URB_VS | (2 - 2));
-	OUT_BATCH((64 << GEN7_URB_ENTRY_NUMBER_SHIFT) |
-		  (2 - 1) << GEN7_URB_ENTRY_SIZE_SHIFT |
-		  (1 << GEN7_URB_STARTING_ADDRESS_SHIFT));
-
-	OUT_BATCH(GEN7_3DSTATE_URB_HS | (2 - 2));
-	OUT_BATCH((0 << GEN7_URB_ENTRY_SIZE_SHIFT) |
-		  (2 << GEN7_URB_STARTING_ADDRESS_SHIFT));
-
-	OUT_BATCH(GEN7_3DSTATE_URB_DS | (2 - 2));
-	OUT_BATCH((0 << GEN7_URB_ENTRY_SIZE_SHIFT) |
-		  (2 << GEN7_URB_STARTING_ADDRESS_SHIFT));
-
-	OUT_BATCH(GEN7_3DSTATE_URB_GS | (2 - 2));
-	OUT_BATCH((0 << GEN7_URB_ENTRY_SIZE_SHIFT) |
-		  (1 << GEN7_URB_STARTING_ADDRESS_SHIFT));
+	intel_bb_out(ibb, GEN7_3DSTATE_URB_VS | (2 - 2));
+	intel_bb_out(ibb, (64 << GEN7_URB_ENTRY_NUMBER_SHIFT) |
+		     (2 - 1) << GEN7_URB_ENTRY_SIZE_SHIFT |
+		     (1 << GEN7_URB_STARTING_ADDRESS_SHIFT));
+
+	intel_bb_out(ibb, GEN7_3DSTATE_URB_HS | (2 - 2));
+	intel_bb_out(ibb, (0 << GEN7_URB_ENTRY_SIZE_SHIFT) |
+		     (2 << GEN7_URB_STARTING_ADDRESS_SHIFT));
+
+	intel_bb_out(ibb, GEN7_3DSTATE_URB_DS | (2 - 2));
+	intel_bb_out(ibb, (0 << GEN7_URB_ENTRY_SIZE_SHIFT) |
+		     (2 << GEN7_URB_STARTING_ADDRESS_SHIFT));
+
+	intel_bb_out(ibb, GEN7_3DSTATE_URB_GS | (2 - 2));
+	intel_bb_out(ibb, (0 << GEN7_URB_ENTRY_SIZE_SHIFT) |
+		     (1 << GEN7_URB_STARTING_ADDRESS_SHIFT));
 }
 
 static void
-gen7_emit_vs(struct intel_batchbuffer *batch)
+gen7_emit_vs(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN6_3DSTATE_VS | (6 - 2));
-	OUT_BATCH(0); /* no VS kernel */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* pass-through */
+	intel_bb_out(ibb, GEN6_3DSTATE_VS | (6 - 2));
+	intel_bb_out(ibb, 0); /* no VS kernel */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* pass-through */
 }
 
 static void
-gen7_emit_hs(struct intel_batchbuffer *batch)
+gen7_emit_hs(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN7_3DSTATE_HS | (7 - 2));
-	OUT_BATCH(0); /* no HS kernel */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* pass-through */
+	intel_bb_out(ibb, GEN7_3DSTATE_HS | (7 - 2));
+	intel_bb_out(ibb, 0); /* no HS kernel */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* pass-through */
 }
 
 static void
-gen7_emit_te(struct intel_batchbuffer *batch)
+gen7_emit_te(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN7_3DSTATE_TE | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN7_3DSTATE_TE | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_ds(struct intel_batchbuffer *batch)
+gen7_emit_ds(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN7_3DSTATE_DS | (6 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN7_3DSTATE_DS | (6 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_gs(struct intel_batchbuffer *batch)
+gen7_emit_gs(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN6_3DSTATE_GS | (7 - 2));
-	OUT_BATCH(0); /* no GS kernel */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* pass-through  */
+	intel_bb_out(ibb, GEN6_3DSTATE_GS | (7 - 2));
+	intel_bb_out(ibb, 0); /* no GS kernel */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* pass-through  */
 }
 
 static void
-gen7_emit_streamout(struct intel_batchbuffer *batch)
+gen7_emit_streamout(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN7_3DSTATE_STREAMOUT | (3 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN7_3DSTATE_STREAMOUT | (3 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_sf(struct intel_batchbuffer *batch)
+gen7_emit_sf(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN6_3DSTATE_SF | (7 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(GEN6_3DSTATE_SF_CULL_NONE);
-	OUT_BATCH(2 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN6_3DSTATE_SF | (7 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, GEN6_3DSTATE_SF_CULL_NONE);
+	intel_bb_out(ibb, 2 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_sbe(struct intel_batchbuffer *batch)
+gen7_emit_sbe(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN7_3DSTATE_SBE | (14 - 2));
-	OUT_BATCH(1 << GEN7_SBE_NUM_OUTPUTS_SHIFT |
-		  1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT |
-		  1 << GEN7_SBE_URB_ENTRY_READ_OFFSET_SHIFT);
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* dw4 */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* dw8 */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* dw12 */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN7_3DSTATE_SBE | (14 - 2));
+	intel_bb_out(ibb, 1 << GEN7_SBE_NUM_OUTPUTS_SHIFT |
+		     1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT |
+		     1 << GEN7_SBE_URB_ENTRY_READ_OFFSET_SHIFT);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* dw4 */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* dw8 */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* dw12 */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_ps(struct intel_batchbuffer *batch, uint32_t kernel_off)
+gen7_emit_ps(struct intel_bb *ibb, uint32_t kernel_off)
 {
 	int threads;
 
-	if (IS_HASWELL(batch->devid))
+	if (IS_HASWELL(ibb->devid))
 		threads = 40 << HSW_PS_MAX_THREADS_SHIFT | 1 << HSW_PS_SAMPLE_MASK_SHIFT;
 	else
 		threads = 40 << IVB_PS_MAX_THREADS_SHIFT;
 
-	OUT_BATCH(GEN7_3DSTATE_PS | (8 - 2));
-	OUT_BATCH(kernel_off);
-	OUT_BATCH(1 << GEN7_PS_SAMPLER_COUNT_SHIFT |
-		  2 << GEN7_PS_BINDING_TABLE_ENTRY_COUNT_SHIFT);
-	OUT_BATCH(0); /* scratch address */
-	OUT_BATCH(threads |
-		  GEN7_PS_16_DISPATCH_ENABLE |
-		  GEN7_PS_ATTRIBUTE_ENABLE);
-	OUT_BATCH(6 << GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN7_3DSTATE_PS | (8 - 2));
+	intel_bb_out(ibb, kernel_off);
+	intel_bb_out(ibb, 1 << GEN7_PS_SAMPLER_COUNT_SHIFT |
+		     2 << GEN7_PS_BINDING_TABLE_ENTRY_COUNT_SHIFT);
+	intel_bb_out(ibb, 0); /* scratch address */
+	intel_bb_out(ibb, threads |
+		     GEN7_PS_16_DISPATCH_ENABLE |
+		     GEN7_PS_ATTRIBUTE_ENABLE);
+	intel_bb_out(ibb, 6 << GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_clip(struct intel_batchbuffer *batch)
+gen7_emit_clip(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN6_3DSTATE_CLIP | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0); /* pass-through */
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN6_3DSTATE_CLIP | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /* pass-through */
+	intel_bb_out(ibb, 0);
 
-	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CL | (2 - 2));
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CL | (2 - 2));
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_wm(struct intel_batchbuffer *batch)
+gen7_emit_wm(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN6_3DSTATE_WM | (3 - 2));
-	OUT_BATCH(GEN7_WM_DISPATCH_ENABLE |
-		GEN7_WM_PERSPECTIVE_PIXEL_BARYCENTRIC);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN6_3DSTATE_WM | (3 - 2));
+	intel_bb_out(ibb, GEN7_WM_DISPATCH_ENABLE |
+		     GEN7_WM_PERSPECTIVE_PIXEL_BARYCENTRIC);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_null_depth_buffer(struct intel_batchbuffer *batch)
+gen7_emit_null_depth_buffer(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN7_3DSTATE_DEPTH_BUFFER | (7 - 2));
-	OUT_BATCH(SURFACE_NULL << GEN4_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT |
-		  GEN4_DEPTHFORMAT_D32_FLOAT << GEN4_3DSTATE_DEPTH_BUFFER_FORMAT_SHIFT);
-	OUT_BATCH(0); /* disable depth, stencil and hiz */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_CLEAR_PARAMS | (3 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN7_3DSTATE_DEPTH_BUFFER | (7 - 2));
+	intel_bb_out(ibb, SURFACE_NULL << GEN4_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT |
+		     GEN4_DEPTHFORMAT_D32_FLOAT << GEN4_3DSTATE_DEPTH_BUFFER_FORMAT_SHIFT);
+	intel_bb_out(ibb, 0); /* disable depth, stencil and hiz */
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_CLEAR_PARAMS | (3 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 #define BATCH_STATE_SPLIT 2048
-void gen7_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y)
+void gen7_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src,
+			  uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst,
+			  uint32_t dst_x, uint32_t dst_y)
 {
 	uint32_t ps_binding_table, ps_sampler_off, ps_kernel_off;
 	uint32_t blend_state, cc_viewport;
 	uint32_t vertex_buffer;
-	uint32_t batch_end;
 
 	igt_assert(src->bpp == dst->bpp);
-	intel_batchbuffer_flush_with_context(batch, context);
 
-	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
+	intel_bb_flush_render_with_context(ibb, ctx);
+
+	intel_bb_add_intel_buf(ibb, dst, true);
+	intel_bb_add_intel_buf(ibb, src, false);
 
+	intel_bb_ptr_set(ibb, BATCH_STATE_SPLIT);
 
-	blend_state = gen7_create_blend_state(batch);
-	cc_viewport = gen7_create_cc_viewport(batch);
-	ps_sampler_off = gen7_create_sampler(batch);
-	ps_kernel_off = intel_batchbuffer_copy_data(batch, ps_kernel,
-						    sizeof(ps_kernel), 64);
-	vertex_buffer = gen7_create_vertex_buffer(batch,
+	blend_state = gen7_create_blend_state(ibb);
+	cc_viewport = gen7_create_cc_viewport(ibb);
+	ps_sampler_off = gen7_create_sampler(ibb);
+	ps_kernel_off = intel_bb_copy_data(ibb, ps_kernel,
+					   sizeof(ps_kernel), 64);
+	vertex_buffer = gen7_create_vertex_buffer(ibb,
 						  src_x, src_y,
 						  dst_x, dst_y,
 						  width, height);
-	ps_binding_table = gen7_bind_surfaces(batch, src, dst);
-
-	igt_assert(batch->ptr < &batch->buffer[4095]);
-
-	batch->ptr = batch->buffer;
-	OUT_BATCH(G4X_PIPELINE_SELECT | PIPELINE_SELECT_3D);
-
-	gen7_emit_state_base_address(batch);
-	gen7_emit_multisample(batch);
-	gen7_emit_urb(batch);
-	gen7_emit_vs(batch);
-	gen7_emit_hs(batch);
-	gen7_emit_te(batch);
-	gen7_emit_ds(batch);
-	gen7_emit_gs(batch);
-	gen7_emit_clip(batch);
-	gen7_emit_sf(batch);
-	gen7_emit_wm(batch);
-	gen7_emit_streamout(batch);
-	gen7_emit_null_depth_buffer(batch);
-	gen7_emit_cc(batch, blend_state, cc_viewport);
-	gen7_emit_sampler(batch, ps_sampler_off);
-	gen7_emit_sbe(batch);
-	gen7_emit_ps(batch, ps_kernel_off);
-	gen7_emit_vertex_elements(batch);
-	gen7_emit_vertex_buffer(batch, src_x, src_y,
+	ps_binding_table = gen7_bind_surfaces(ibb, src, dst);
+
+	intel_bb_ptr_set(ibb, 0);
+
+	intel_bb_out(ibb, G4X_PIPELINE_SELECT | PIPELINE_SELECT_3D);
+
+	gen7_emit_state_base_address(ibb);
+	gen7_emit_multisample(ibb);
+	gen7_emit_urb(ibb);
+	gen7_emit_vs(ibb);
+	gen7_emit_hs(ibb);
+	gen7_emit_te(ibb);
+	gen7_emit_ds(ibb);
+	gen7_emit_gs(ibb);
+	gen7_emit_clip(ibb);
+	gen7_emit_sf(ibb);
+	gen7_emit_wm(ibb);
+	gen7_emit_streamout(ibb);
+	gen7_emit_null_depth_buffer(ibb);
+	gen7_emit_cc(ibb, blend_state, cc_viewport);
+	gen7_emit_sampler(ibb, ps_sampler_off);
+	gen7_emit_sbe(ibb);
+	gen7_emit_ps(ibb, ps_kernel_off);
+	gen7_emit_vertex_elements(ibb);
+	gen7_emit_vertex_buffer(ibb, src_x, src_y,
 				dst_x, dst_y, width,
 				height, vertex_buffer);
-	gen7_emit_binding_table(batch, src, dst, ps_binding_table);
-	gen7_emit_drawing_rectangle(batch, dst);
-
-	OUT_BATCH(GEN4_3DPRIMITIVE | (7 - 2));
-	OUT_BATCH(GEN4_3DPRIMITIVE_VERTEX_SEQUENTIAL | _3DPRIM_RECTLIST);
-	OUT_BATCH(3);
-	OUT_BATCH(0);
-	OUT_BATCH(1);   /* single instance */
-	OUT_BATCH(0);   /* start instance location */
-	OUT_BATCH(0);   /* index buffer offset, ignored */
-
-	OUT_BATCH(MI_BATCH_BUFFER_END);
-
-	batch_end = batch->ptr - batch->buffer;
-	batch_end = ALIGN(batch_end, 8);
-	igt_assert(batch_end < BATCH_STATE_SPLIT);
-
-	gen7_render_flush(batch, context, batch_end);
-	intel_batchbuffer_reset(batch);
+	gen7_emit_binding_table(ibb, src, dst, ps_binding_table);
+	gen7_emit_drawing_rectangle(ibb, dst);
+
+	intel_bb_out(ibb, GEN4_3DPRIMITIVE | (7 - 2));
+	intel_bb_out(ibb, GEN4_3DPRIMITIVE_VERTEX_SEQUENTIAL | _3DPRIM_RECTLIST);
+	intel_bb_out(ibb, 3);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 1);   /* single instance */
+	intel_bb_out(ibb, 0);   /* start instance location */
+	intel_bb_out(ibb, 0);   /* index buffer offset, ignored */
+
+	intel_bb_emit_bbe(ibb);
+	intel_bb_exec_with_context(ibb, intel_bb_offset(ibb), ctx,
+				   I915_EXEC_DEFAULT | I915_EXEC_NO_RELOC,
+				   false);
+	dump_batch(ibb);
+	intel_bb_reset(ibb, false);
 }
diff --git a/lib/rendercopy_gen8.c b/lib/rendercopy_gen8.c
index 75005d0b..053d5642 100644
--- a/lib/rendercopy_gen8.c
+++ b/lib/rendercopy_gen8.c
@@ -14,7 +14,7 @@
 #include <i915_drm.h>
 
 #include "drmtest.h"
-#include "intel_bufmgr.h"
+#include "intel_bufops.h"
 #include "intel_batchbuffer.h"
 #include "intel_chipset.h"
 #include "intel_io.h"
@@ -23,17 +23,12 @@
 #include "intel_reg.h"
 #include "igt_aux.h"
 
-#include "intel_aub.h"
-
 #define VERTEX_SIZE (3*4)
 
 #if DEBUG_RENDERCPY
-static void dump_batch(struct intel_batchbuffer *batch) {
-	int fd = open("/tmp/i965-batchbuffers.dump", O_WRONLY | O_CREAT,  0666);
-	if (fd != -1) {
-		igt_assert_eq(write(fd, batch->buffer, 4096), 4096);
-		fd = close(fd);
-	}
+static void dump_batch(struct intel_bb *ibb)
+{
+	intel_bb_dump(ibb, "/tmp/gen8-batchbuffers.dump");
 }
 #else
 #define dump_batch(x) do { } while(0)
@@ -70,89 +65,18 @@ static const uint32_t ps_kernel[][4] = {
 #endif
 };
 
-/* AUB annotation support */
-#define MAX_ANNOTATIONS	33
-struct annotations_context {
-	drm_intel_aub_annotation annotations[MAX_ANNOTATIONS];
-	int index;
-	uint32_t offset;
-};
-
-static void annotation_init(struct annotations_context *aub)
-{
-	/* aub->annotations is an array keeping a list of annotations of the
-	 * batch buffer ordered by offset. aub->annotations[0] is thus left
-	 * for the command stream and will be filled just before executing
-	 * the batch buffer with annotations_add_batch() */
-	aub->index = 1;
-}
-
-static void add_annotation(drm_intel_aub_annotation *a,
-			   uint32_t type, uint32_t subtype,
-			   uint32_t ending_offset)
-{
-	a->type = type;
-	a->subtype = subtype;
-	a->ending_offset = ending_offset;
-}
-
-static void annotation_add_batch(struct annotations_context *aub, size_t size)
-{
-	add_annotation(&aub->annotations[0], AUB_TRACE_TYPE_BATCH, 0, size);
-}
-
-static void annotation_add_state(struct annotations_context *aub,
-				 uint32_t state_type,
-				 uint32_t start_offset,
-				 size_t   size)
-{
-	igt_assert(aub->index < MAX_ANNOTATIONS);
-
-	add_annotation(&aub->annotations[aub->index++],
-		       AUB_TRACE_TYPE_NOTYPE, 0,
-		       start_offset);
-	add_annotation(&aub->annotations[aub->index++],
-		       AUB_TRACE_TYPE(state_type),
-		       AUB_TRACE_SUBTYPE(state_type),
-		       start_offset + size);
-}
-
-static void annotation_flush(struct annotations_context *aub,
-			     struct intel_batchbuffer *batch)
-{
-	if (!igt_aub_dump_enabled())
-		return;
-
-	drm_intel_bufmgr_gem_set_aub_annotations(batch->bo,
-						 aub->annotations,
-						 aub->index);
-}
-
-static void
-gen6_render_flush(struct intel_batchbuffer *batch,
-		  drm_intel_context *context, uint32_t batch_end)
-{
-	igt_assert_eq(drm_intel_bo_subdata(batch->bo,
-					   0, 4096, batch->buffer),
-		      0);
-	igt_assert_eq(drm_intel_gem_bo_context_exec(batch->bo, context,
-						    batch_end, 0),
-		      0);
-}
-
 /* Mostly copy+paste from gen6, except height, width, pitch moved */
 static uint32_t
-gen8_bind_buf(struct intel_batchbuffer *batch,
-	      struct annotations_context *aub,
-	      const struct igt_buf *buf, int is_dst)
+gen8_bind_buf(struct intel_bb *ibb,
+	      const struct intel_buf *buf, int is_dst)
 {
 	struct gen8_surface_state *ss;
 	uint32_t write_domain, read_domain, offset;
-	int ret;
+	uint64_t address;
 
 	igt_assert_lte(buf->surface[0].stride, 256*1024);
-	igt_assert_lte(igt_buf_width(buf), 16384);
-	igt_assert_lte(igt_buf_height(buf), 16384);
+	igt_assert_lte(intel_buf_width(buf), 16384);
+	igt_assert_lte(intel_buf_height(buf), 16384);
 
 	if (is_dst) {
 		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
@@ -161,9 +85,7 @@ gen8_bind_buf(struct intel_batchbuffer *batch,
 		read_domain = I915_GEM_DOMAIN_SAMPLER;
 	}
 
-	ss = intel_batchbuffer_subdata_alloc(batch, sizeof(*ss), 64);
-	offset = intel_batchbuffer_subdata_offset(batch, ss);
-	annotation_add_state(aub, AUB_TRACE_SURFACE_STATE, offset, sizeof(*ss));
+	ss = intel_bb_ptr_align(ibb, 64);
 
 	ss->ss0.surface_type = SURFACE_2D;
 	switch (buf->bpp) {
@@ -181,23 +103,21 @@ gen8_bind_buf(struct intel_batchbuffer *batch,
 	else if (buf->tiling == I915_TILING_Y)
 		ss->ss0.tiled_mode = 3;
 
-	if (IS_CHERRYVIEW(batch->devid))
+	if (IS_CHERRYVIEW(ibb->devid))
 		ss->ss1.memory_object_control = CHV_MOCS_WB | CHV_MOCS_L3;
 	else
 		ss->ss1.memory_object_control = BDW_MOCS_PTE |
 			BDW_MOCS_TC_L3_PTE | BDW_MOCS_AGE(0);
 
-	ss->ss8.base_addr = buf->bo->offset64;
-	ss->ss9.base_addr_hi = buf->bo->offset64 >> 32;
-
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				      intel_batchbuffer_subdata_offset(batch, &ss->ss8),
-				      buf->bo, 0,
-				      read_domain, write_domain);
-	igt_assert(ret == 0);
+	address = intel_bb_offset_reloc(ibb, buf->handle,
+					read_domain, write_domain,
+					intel_bb_offset(ibb) + 4 * 8,
+					buf->addr.offset);
+	ss->ss8.base_addr = address;
+	ss->ss9.base_addr_hi = address >> 32;
 
-	ss->ss2.height = igt_buf_height(buf) - 1;
-	ss->ss2.width  = igt_buf_width(buf) - 1;
+	ss->ss2.height = intel_buf_height(buf) - 1;
+	ss->ss2.width  = intel_buf_width(buf) - 1;
 	ss->ss3.pitch  = buf->surface[0].stride - 1;
 
 	ss->ss7.shader_chanel_select_r = 4;
@@ -209,35 +129,28 @@ gen8_bind_buf(struct intel_batchbuffer *batch,
 }
 
 static uint32_t
-gen8_bind_surfaces(struct intel_batchbuffer *batch,
-		   struct annotations_context *aub,
-		   const struct igt_buf *src,
-		   const struct igt_buf *dst)
+gen8_bind_surfaces(struct intel_bb *ibb,
+		   const struct intel_buf *src,
+		   const struct intel_buf *dst)
 {
-	uint32_t *binding_table, offset;
+	uint32_t *binding_table, binding_table_offset;
 
-	binding_table = intel_batchbuffer_subdata_alloc(batch, 8, 32);
-	offset = intel_batchbuffer_subdata_offset(batch, binding_table);
-	annotation_add_state(aub, AUB_TRACE_BINDING_TABLE, offset, 8);
+	binding_table = intel_bb_ptr_align(ibb, 32);
+	binding_table_offset = intel_bb_ptr_add_return_prev_offset(ibb, 8);
 
-	binding_table[0] = gen8_bind_buf(batch, aub, dst, 1);
-	binding_table[1] = gen8_bind_buf(batch, aub, src, 0);
+	binding_table[0] = gen8_bind_buf(ibb, dst, 1);
+	binding_table[1] = gen8_bind_buf(ibb, src, 0);
 
-	return offset;
+	return binding_table_offset;
 }
 
 /* Mostly copy+paste from gen6, except wrap modes moved */
 static uint32_t
-gen8_create_sampler(struct intel_batchbuffer *batch,
-		    struct annotations_context *aub)
+gen8_create_sampler(struct intel_bb *ibb)
 {
 	struct gen8_sampler_state *ss;
-	uint32_t offset;
 
-	ss = intel_batchbuffer_subdata_alloc(batch, sizeof(*ss), 64);
-	offset = intel_batchbuffer_subdata_offset(batch, ss);
-	annotation_add_state(aub, AUB_TRACE_SAMPLER_STATE,
-			     offset, sizeof(*ss));
+	ss = intel_bb_ptr_align(ibb, 64);
 
 	ss->ss0.min_filter = GEN4_MAPFILTER_NEAREST;
 	ss->ss0.mag_filter = GEN4_MAPFILTER_NEAREST;
@@ -249,21 +162,15 @@ gen8_create_sampler(struct intel_batchbuffer *batch,
 	 * sampler fetch, but couldn't make it work. */
 	ss->ss3.non_normalized_coord = 0;
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*ss));
 }
 
 static uint32_t
-gen8_fill_ps(struct intel_batchbuffer *batch,
-	     struct annotations_context *aub,
+gen8_fill_ps(struct intel_bb *ibb,
 	     const uint32_t kernel[][4],
 	     size_t size)
 {
-	uint32_t offset;
-
-	offset = intel_batchbuffer_copy_data(batch, kernel, size, 64);
-	annotation_add_state(aub, AUB_TRACE_KERNEL_INSTRUCTIONS, offset, size);
-
-	return offset;
+	return intel_bb_copy_data(ibb, kernel, size, 64);
 }
 
 /*
@@ -277,34 +184,29 @@ gen8_fill_ps(struct intel_batchbuffer *batch,
  * see gen6_emit_vertex_elements
  */
 static uint32_t
-gen7_fill_vertex_buffer_data(struct intel_batchbuffer *batch,
-			     struct annotations_context *aub,
-			     const struct igt_buf *src,
+gen7_fill_vertex_buffer_data(struct intel_bb *ibb,
+			     const struct intel_buf *src,
 			     uint32_t src_x, uint32_t src_y,
 			     uint32_t dst_x, uint32_t dst_y,
 			     uint32_t width, uint32_t height)
 {
-	void *start;
 	uint32_t offset;
 
-	intel_batchbuffer_align(batch, 8);
-	start = batch->ptr;
+	intel_bb_ptr_align(ibb, 8);
+	offset = intel_bb_offset(ibb);
 
-	emit_vertex_2s(batch, dst_x + width, dst_y + height);
-	emit_vertex_normalized(batch, src_x + width, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y + height, igt_buf_height(src));
+	emit_vertex_2s(ibb, dst_x + width, dst_y + height);
+	emit_vertex_normalized(ibb, src_x + width, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y + height, intel_buf_height(src));
 
-	emit_vertex_2s(batch, dst_x, dst_y + height);
-	emit_vertex_normalized(batch, src_x, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y + height, igt_buf_height(src));
+	emit_vertex_2s(ibb, dst_x, dst_y + height);
+	emit_vertex_normalized(ibb, src_x, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y + height, intel_buf_height(src));
 
-	emit_vertex_2s(batch, dst_x, dst_y);
-	emit_vertex_normalized(batch, src_x, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y, igt_buf_height(src));
+	emit_vertex_2s(ibb, dst_x, dst_y);
+	emit_vertex_normalized(ibb, src_x, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y, intel_buf_height(src));
 
-	offset = intel_batchbuffer_subdata_offset(batch, start);
-	annotation_add_state(aub, AUB_TRACE_VERTEX_BUFFER,
-			     offset, 3 * VERTEX_SIZE);
 	return offset;
 }
 
@@ -318,25 +220,25 @@ gen7_fill_vertex_buffer_data(struct intel_batchbuffer *batch,
  * packed.
  */
 static void
-gen6_emit_vertex_elements(struct intel_batchbuffer *batch) {
+gen6_emit_vertex_elements(struct intel_bb *ibb) {
 	/*
 	 * The VUE layout
 	 *    dword 0-3: pad (0, 0, 0, 0)
 	 *    dword 4-7: position (x, y, 0, 1.0),
 	 *    dword 8-11: texture coordinate 0 (u0, v0, 0, 1.0)
 	 */
-	OUT_BATCH(GEN4_3DSTATE_VERTEX_ELEMENTS | (3 * 2 + 1 - 2));
+	intel_bb_out(ibb, GEN4_3DSTATE_VERTEX_ELEMENTS | (3 * 2 + 1 - 2));
 
 	/* Element state 0. These are 4 dwords of 0 required for the VUE format.
 	 * We don't really know or care what they do.
 	 */
-	OUT_BATCH(0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
-		  SURFACEFORMAT_R32G32B32A32_FLOAT << VE0_FORMAT_SHIFT |
-		  0 << VE0_OFFSET_SHIFT); /* we specify 0, but it's really does not exist */
-	OUT_BATCH(GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
+	intel_bb_out(ibb, 0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
+		     SURFACEFORMAT_R32G32B32A32_FLOAT << VE0_FORMAT_SHIFT |
+		     0 << VE0_OFFSET_SHIFT); /* we specify 0, but it really does not exist */
+	intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
 
 	/* Element state 1 - Our "destination" vertices. These are passed down
 	 * through the pipeline, and eventually make it to the pixel shader as
@@ -344,25 +246,25 @@ gen6_emit_vertex_elements(struct intel_batchbuffer *batch) {
 	 * signed/scaled because of gen6 rendercopy. I see no particular reason
 	 * for doing this though.
 	 */
-	OUT_BATCH(0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
-		  SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
-		  0 << VE0_OFFSET_SHIFT); /* offsets vb in bytes */
-	OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
+	intel_bb_out(ibb, 0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
+		     SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
+		     0 << VE0_OFFSET_SHIFT); /* offsets vb in bytes */
+	intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
 
 	/* Element state 2. Last but not least we store the U,V components as
 	 * normalized floats. These will be used in the pixel shader to sample
 	 * from the source buffer.
 	 */
-	OUT_BATCH(0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
-		  SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
-		  4 << VE0_OFFSET_SHIFT);	/* offset vb in bytes */
-	OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
+	intel_bb_out(ibb, 0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
+		     SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
+		     4 << VE0_OFFSET_SHIFT);	/* offset vb in bytes */
+	intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
 }
 
 /*
@@ -371,44 +273,35 @@ gen6_emit_vertex_elements(struct intel_batchbuffer *batch) {
  * @batch
  * @offset - byte offset within the @batch where the vertex buffer starts.
  */
-static void gen8_emit_vertex_buffer(struct intel_batchbuffer *batch,
-				    uint32_t offset) {
-	OUT_BATCH(GEN4_3DSTATE_VERTEX_BUFFERS | (1 + (4 * 1) - 2));
-	OUT_BATCH(0 << GEN6_VB0_BUFFER_INDEX_SHIFT | /* VB 0th index */
-		  GEN8_VB0_BUFFER_ADDR_MOD_EN | /* Address Modify Enable */
-		  VERTEX_SIZE << VB0_BUFFER_PITCH_SHIFT);
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_VERTEX, 0, offset);
-	OUT_BATCH(3 * VERTEX_SIZE);
+static void gen8_emit_vertex_buffer(struct intel_bb *ibb, uint32_t offset)
+{
+	intel_bb_out(ibb, GEN4_3DSTATE_VERTEX_BUFFERS | (1 + (4 * 1) - 2));
+	intel_bb_out(ibb, 0 << GEN6_VB0_BUFFER_INDEX_SHIFT | /* VB 0th index */
+		     GEN8_VB0_BUFFER_ADDR_MOD_EN | /* Address Modify Enable */
+		     VERTEX_SIZE << VB0_BUFFER_PITCH_SHIFT);
+	intel_bb_emit_reloc(ibb, ibb->handle,
+			    I915_GEM_DOMAIN_VERTEX, 0,
+			    offset, ibb->batch_offset);
+	intel_bb_out(ibb, 3 * VERTEX_SIZE);
 }
 
 static uint32_t
-gen6_create_cc_state(struct intel_batchbuffer *batch,
-		     struct annotations_context *aub)
+gen6_create_cc_state(struct intel_bb *ibb)
 {
 	struct gen6_color_calc_state *cc_state;
-	uint32_t offset;
 
-	cc_state = intel_batchbuffer_subdata_alloc(batch,
-						   sizeof(*cc_state), 64);
-	offset = intel_batchbuffer_subdata_offset(batch, cc_state);
-	annotation_add_state(aub, AUB_TRACE_CC_STATE,
-			     offset, sizeof(*cc_state));
+	cc_state = intel_bb_ptr_align(ibb, 64);
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*cc_state));
 }
 
 static uint32_t
-gen8_create_blend_state(struct intel_batchbuffer *batch,
-			struct annotations_context *aub)
+gen8_create_blend_state(struct intel_bb *ibb)
 {
 	struct gen8_blend_state *blend;
 	int i;
-	uint32_t offset;
 
-	blend = intel_batchbuffer_subdata_alloc(batch, sizeof(*blend), 64);
-	offset = intel_batchbuffer_subdata_offset(batch, blend);
-	annotation_add_state(aub, AUB_TRACE_BLEND_STATE,
-			     offset, sizeof(*blend));
+	blend = intel_bb_ptr_align(ibb, 64);
 
 	for (i = 0; i < 16; i++) {
 		blend->bs[i].dest_blend_factor = GEN6_BLENDFACTOR_ZERO;
@@ -418,451 +311,440 @@ gen8_create_blend_state(struct intel_batchbuffer *batch,
 		blend->bs[i].color_buffer_blend = 0;
 	}
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*blend));
 }
 
 static uint32_t
-gen6_create_cc_viewport(struct intel_batchbuffer *batch,
-			struct annotations_context *aub)
+gen6_create_cc_viewport(struct intel_bb *ibb)
 {
 	struct gen4_cc_viewport *vp;
-	uint32_t offset;
 
-	vp = intel_batchbuffer_subdata_alloc(batch, sizeof(*vp), 32);
-	offset = intel_batchbuffer_subdata_offset(batch, vp);
-	annotation_add_state(aub, AUB_TRACE_CC_VP_STATE,
-			     offset, sizeof(*vp));
+	vp = intel_bb_ptr_align(ibb, 32);
 
 	/* XXX I don't understand this */
 	vp->min_depth = -1.e35;
 	vp->max_depth = 1.e35;
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*vp));
 }
 
 static uint32_t
-gen7_create_sf_clip_viewport(struct intel_batchbuffer *batch,
-			struct annotations_context *aub)
+gen7_create_sf_clip_viewport(struct intel_bb *ibb)
 {
 	/* XXX these are likely not needed */
 	struct gen7_sf_clip_viewport *scv_state;
-	uint32_t offset;
 
-	scv_state = intel_batchbuffer_subdata_alloc(batch,
-						    sizeof(*scv_state), 64);
-	offset = intel_batchbuffer_subdata_offset(batch, scv_state);
-	annotation_add_state(aub, AUB_TRACE_CLIP_VP_STATE,
-			     offset, sizeof(*scv_state));
+	scv_state = intel_bb_ptr_align(ibb, 64);
 
 	scv_state->guardband.xmin = 0;
 	scv_state->guardband.xmax = 1.0f;
 	scv_state->guardband.ymin = 0;
 	scv_state->guardband.ymax = 1.0f;
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*scv_state));
 }
 
 static uint32_t
-gen6_create_scissor_rect(struct intel_batchbuffer *batch,
-			struct annotations_context *aub)
+gen6_create_scissor_rect(struct intel_bb *ibb)
 {
 	struct gen6_scissor_rect *scissor;
-	uint32_t offset;
 
-	scissor = intel_batchbuffer_subdata_alloc(batch, sizeof(*scissor), 64);
-	offset = intel_batchbuffer_subdata_offset(batch, scissor);
-	annotation_add_state(aub, AUB_TRACE_SCISSOR_STATE,
-			     offset, sizeof(*scissor));
+	scissor = intel_bb_ptr_align(ibb, 64);
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*scissor));
 }
 
 static void
-gen8_emit_sip(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN4_STATE_SIP | (3 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+gen8_emit_sip(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN4_STATE_SIP | (3 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_push_constants(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS);
-	OUT_BATCH(0);
-	OUT_BATCH(GEN8_3DSTATE_PUSH_CONSTANT_ALLOC_HS);
-	OUT_BATCH(0);
-	OUT_BATCH(GEN8_3DSTATE_PUSH_CONSTANT_ALLOC_DS);
-	OUT_BATCH(0);
-	OUT_BATCH(GEN8_3DSTATE_PUSH_CONSTANT_ALLOC_GS);
-	OUT_BATCH(0);
-	OUT_BATCH(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS);
-	OUT_BATCH(0);
+gen7_emit_push_constants(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, GEN8_3DSTATE_PUSH_CONSTANT_ALLOC_HS);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, GEN8_3DSTATE_PUSH_CONSTANT_ALLOC_DS);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, GEN8_3DSTATE_PUSH_CONSTANT_ALLOC_GS);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_state_base_address(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN4_STATE_BASE_ADDRESS | (16 - 2));
+gen8_emit_state_base_address(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN4_STATE_BASE_ADDRESS | (16 - 2));
 
 	/* general */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, 0 | BASE_ADDRESS_MODIFY);
+	intel_bb_out(ibb, 0);
 
 	/* stateless data port */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	intel_bb_out(ibb, 0 | BASE_ADDRESS_MODIFY);
 
 	/* surface */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
+	intel_bb_emit_reloc(ibb, ibb->handle,
+			    I915_GEM_DOMAIN_SAMPLER, 0,
+			    BASE_ADDRESS_MODIFY, ibb->batch_offset);
 
 	/* dynamic */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
-		  0, BASE_ADDRESS_MODIFY);
+	intel_bb_emit_reloc(ibb, ibb->handle,
+			    I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, 0,
+			    BASE_ADDRESS_MODIFY, ibb->batch_offset);
 
 	/* indirect */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 
 	/* instruction */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
+	intel_bb_emit_reloc(ibb, ibb->handle,
+			    I915_GEM_DOMAIN_INSTRUCTION, 0,
+			    BASE_ADDRESS_MODIFY, ibb->batch_offset);
 
 	/* general state buffer size */
-	OUT_BATCH(0xfffff000 | 1);
+	intel_bb_out(ibb, 0xfffff000 | 1);
 	/* dynamic state buffer size */
-	OUT_BATCH(1 << 12 | 1);
+	intel_bb_out(ibb, 1 << 12 | 1);
 	/* indirect object buffer size */
-	OUT_BATCH(0xfffff000 | 1);
+	intel_bb_out(ibb, 0xfffff000 | 1);
 	/* instruction buffer size */
-	OUT_BATCH(1 << 12 | 1);
+	intel_bb_out(ibb, 1 << 12 | 1);
 }
 
 static void
-gen7_emit_urb(struct intel_batchbuffer *batch) {
+gen7_emit_urb(struct intel_bb *ibb) {
 	/* XXX: Min valid values from mesa */
 	const int vs_entries = 64;
 	const int vs_size = 2;
 	const int vs_start = 2;
 
-	OUT_BATCH(GEN7_3DSTATE_URB_VS);
-	OUT_BATCH(vs_entries | ((vs_size - 1) << 16) | (vs_start << 25));
-	OUT_BATCH(GEN7_3DSTATE_URB_GS);
-	OUT_BATCH(vs_start << 25);
-	OUT_BATCH(GEN7_3DSTATE_URB_HS);
-	OUT_BATCH(vs_start << 25);
-	OUT_BATCH(GEN7_3DSTATE_URB_DS);
-	OUT_BATCH(vs_start << 25);
+	intel_bb_out(ibb, GEN7_3DSTATE_URB_VS);
+	intel_bb_out(ibb, vs_entries | ((vs_size - 1) << 16) | (vs_start << 25));
+	intel_bb_out(ibb, GEN7_3DSTATE_URB_GS);
+	intel_bb_out(ibb, vs_start << 25);
+	intel_bb_out(ibb, GEN7_3DSTATE_URB_HS);
+	intel_bb_out(ibb, vs_start << 25);
+	intel_bb_out(ibb, GEN7_3DSTATE_URB_DS);
+	intel_bb_out(ibb, vs_start << 25);
 }
 
 static void
-gen8_emit_cc(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS);
-	OUT_BATCH(cc.blend_state | 1);
+gen8_emit_cc(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN7_3DSTATE_BLEND_STATE_POINTERS);
+	intel_bb_out(ibb, cc.blend_state | 1);
 
-	OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS);
-	OUT_BATCH(cc.cc_state | 1);
+	intel_bb_out(ibb, GEN6_3DSTATE_CC_STATE_POINTERS);
+	intel_bb_out(ibb, cc.cc_state | 1);
 }
 
 static void
-gen8_emit_multisample(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN8_3DSTATE_MULTISAMPLE);
-	OUT_BATCH(0);
+gen8_emit_multisample(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN8_3DSTATE_MULTISAMPLE);
+	intel_bb_out(ibb, 0);
 
-	OUT_BATCH(GEN6_3DSTATE_SAMPLE_MASK);
-	OUT_BATCH(1);
+	intel_bb_out(ibb, GEN6_3DSTATE_SAMPLE_MASK);
+	intel_bb_out(ibb, 1);
 }
 
 static void
-gen8_emit_vs(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN6_3DSTATE_CONSTANT_VS | (11 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN6_3DSTATE_VS | (9-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+gen8_emit_vs(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN6_3DSTATE_CONSTANT_VS | (11 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN6_3DSTATE_VS | (9-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_hs(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_CONSTANT_HS | (11 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_HS | (9-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN8_3DSTATE_SAMPLER_STATE_POINTERS_HS);
-	OUT_BATCH(0);
+gen8_emit_hs(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN7_3DSTATE_CONSTANT_HS | (11 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_HS | (9-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_SAMPLER_STATE_POINTERS_HS);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_gs(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN6_3DSTATE_CONSTANT_GS | (11 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN6_3DSTATE_GS | (10-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS);
-	OUT_BATCH(0);
+gen8_emit_gs(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN6_3DSTATE_CONSTANT_GS | (11 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN6_3DSTATE_GS | (10-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_ds(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_CONSTANT_DS | (11 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_DS | (9-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN8_3DSTATE_SAMPLER_STATE_POINTERS_DS);
-	OUT_BATCH(0);
+gen8_emit_ds(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN7_3DSTATE_CONSTANT_DS | (11 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_DS | (9-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_SAMPLER_STATE_POINTERS_DS);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_wm_hz_op(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN8_3DSTATE_WM_HZ_OP | (5-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+gen8_emit_wm_hz_op(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN8_3DSTATE_WM_HZ_OP | (5-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_null_state(struct intel_batchbuffer *batch) {
-	gen8_emit_wm_hz_op(batch);
-	gen8_emit_hs(batch);
-	OUT_BATCH(GEN7_3DSTATE_TE | (4-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	gen8_emit_gs(batch);
-	gen8_emit_ds(batch);
-	gen8_emit_vs(batch);
+gen8_emit_null_state(struct intel_bb *ibb) {
+	gen8_emit_wm_hz_op(ibb);
+	gen8_emit_hs(ibb);
+	intel_bb_out(ibb, GEN7_3DSTATE_TE | (4-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	gen8_emit_gs(ibb);
+	gen8_emit_ds(ibb);
+	gen8_emit_vs(ibb);
 }
 
 static void
-gen7_emit_clip(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN6_3DSTATE_CLIP | (4 - 2));
-	OUT_BATCH(0); 
-	OUT_BATCH(0); /*  pass-through */
-	OUT_BATCH(0);
+gen7_emit_clip(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN6_3DSTATE_CLIP | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /*  pass-through */
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_sf(struct intel_batchbuffer *batch)
+gen8_emit_sf(struct intel_bb *ibb)
 {
 	int i;
 
-	OUT_BATCH(GEN7_3DSTATE_SBE | (4 - 2));
-	OUT_BATCH(1 << GEN7_SBE_NUM_OUTPUTS_SHIFT |
-		  GEN8_SBE_FORCE_URB_ENTRY_READ_LENGTH |
-		  GEN8_SBE_FORCE_URB_ENTRY_READ_OFFSET |
-		  1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT |
-		  1 << GEN8_SBE_URB_ENTRY_READ_OFFSET_SHIFT);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN7_3DSTATE_SBE | (4 - 2));
+	intel_bb_out(ibb, 1 << GEN7_SBE_NUM_OUTPUTS_SHIFT |
+		     GEN8_SBE_FORCE_URB_ENTRY_READ_LENGTH |
+		     GEN8_SBE_FORCE_URB_ENTRY_READ_OFFSET |
+		     1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT |
+		     1 << GEN8_SBE_URB_ENTRY_READ_OFFSET_SHIFT);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 
-	OUT_BATCH(GEN8_3DSTATE_SBE_SWIZ | (11 - 2));
+	intel_bb_out(ibb, GEN8_3DSTATE_SBE_SWIZ | (11 - 2));
 	for (i = 0; i < 8; i++)
-		OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+		intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 
-	OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2));
-	OUT_BATCH(GEN8_RASTER_FRONT_WINDING_CCW | GEN8_RASTER_CULL_NONE);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN8_3DSTATE_RASTER | (5 - 2));
+	intel_bb_out(ibb, GEN8_RASTER_FRONT_WINDING_CCW | GEN8_RASTER_CULL_NONE);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 
-	OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN6_3DSTATE_SF | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_ps(struct intel_batchbuffer *batch, uint32_t kernel) {
+gen8_emit_ps(struct intel_bb *ibb, uint32_t kernel) {
 	const int max_threads = 63;
 
-	OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2));
-	OUT_BATCH(/* XXX: I don't understand the BARYCENTRIC stuff, but it
+	intel_bb_out(ibb, GEN6_3DSTATE_WM | (2 - 2));
+	intel_bb_out(ibb, /* XXX: I don't understand the BARYCENTRIC stuff, but it
 		   * appears we need it to put our setup data in the place we
 		   * expect (g6, see below) */
-		  GEN8_3DSTATE_PS_PERSPECTIVE_PIXEL_BARYCENTRIC);
-
-	OUT_BATCH(GEN6_3DSTATE_CONSTANT_PS | (11-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_PS | (12-2));
-	OUT_BATCH(kernel);
-	OUT_BATCH(0); /* kernel hi */
-	OUT_BATCH(1 << GEN6_3DSTATE_WM_SAMPLER_COUNT_SHIFT |
-		  2 << GEN6_3DSTATE_WM_BINDING_TABLE_ENTRY_COUNT_SHIFT);
-	OUT_BATCH(0); /* scratch space stuff */
-	OUT_BATCH(0); /* scratch hi */
-	OUT_BATCH((max_threads - 1) << GEN8_3DSTATE_PS_MAX_THREADS_SHIFT |
-		  GEN6_3DSTATE_WM_16_DISPATCH_ENABLE);
-	OUT_BATCH(6 << GEN6_3DSTATE_WM_DISPATCH_START_GRF_0_SHIFT);
-	OUT_BATCH(0); // kernel 1
-	OUT_BATCH(0); /* kernel 1 hi */
-	OUT_BATCH(0); // kernel 2
-	OUT_BATCH(0); /* kernel 2 hi */
-
-	OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2));
-	OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT);
-
-	OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2));
-	OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID | GEN8_PSX_ATTRIBUTE_ENABLE);
+		     GEN8_3DSTATE_PS_PERSPECTIVE_PIXEL_BARYCENTRIC);
+
+	intel_bb_out(ibb, GEN6_3DSTATE_CONSTANT_PS | (11-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_PS | (12-2));
+	intel_bb_out(ibb, kernel);
+	intel_bb_out(ibb, 0); /* kernel hi */
+	intel_bb_out(ibb, 1 << GEN6_3DSTATE_WM_SAMPLER_COUNT_SHIFT |
+		     2 << GEN6_3DSTATE_WM_BINDING_TABLE_ENTRY_COUNT_SHIFT);
+	intel_bb_out(ibb, 0); /* scratch space stuff */
+	intel_bb_out(ibb, 0); /* scratch hi */
+	intel_bb_out(ibb, (max_threads - 1) << GEN8_3DSTATE_PS_MAX_THREADS_SHIFT |
+		     GEN6_3DSTATE_WM_16_DISPATCH_ENABLE);
+	intel_bb_out(ibb, 6 << GEN6_3DSTATE_WM_DISPATCH_START_GRF_0_SHIFT);
+	intel_bb_out(ibb, 0); // kernel 1
+	intel_bb_out(ibb, 0); /* kernel 1 hi */
+	intel_bb_out(ibb, 0); // kernel 2
+	intel_bb_out(ibb, 0); /* kernel 2 hi */
+
+	intel_bb_out(ibb, GEN8_3DSTATE_PS_BLEND | (2 - 2));
+	intel_bb_out(ibb, GEN8_PS_BLEND_HAS_WRITEABLE_RT);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_PS_EXTRA | (2 - 2));
+	intel_bb_out(ibb, GEN8_PSX_PIXEL_SHADER_VALID | GEN8_PSX_ATTRIBUTE_ENABLE);
 }
 
 static void
-gen8_emit_depth(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN8_3DSTATE_WM_DEPTH_STENCIL | (3 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_DEPTH_BUFFER | (8-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN8_3DSTATE_HIER_DEPTH_BUFFER | (5 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN8_3DSTATE_STENCIL_BUFFER | (5 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+gen8_emit_depth(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN8_3DSTATE_WM_DEPTH_STENCIL | (3 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_DEPTH_BUFFER | (8-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_HIER_DEPTH_BUFFER | (5 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_STENCIL_BUFFER | (5 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_clear(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_CLEAR_PARAMS | (3-2));
-	OUT_BATCH(0);
-	OUT_BATCH(1); // clear valid
+gen7_emit_clear(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN7_3DSTATE_CLEAR_PARAMS | (3-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 1); // clear valid
 }
 
 static void
-gen6_emit_drawing_rectangle(struct intel_batchbuffer *batch, const struct igt_buf *dst)
+gen6_emit_drawing_rectangle(struct intel_bb *ibb, const struct intel_buf *dst)
 {
-	OUT_BATCH(GEN4_3DSTATE_DRAWING_RECTANGLE | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH((igt_buf_height(dst) - 1) << 16 | (igt_buf_width(dst) - 1));
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN4_3DSTATE_DRAWING_RECTANGLE | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, (intel_buf_height(dst) - 1) << 16 | (intel_buf_width(dst) - 1));
+	intel_bb_out(ibb, 0);
 }
 
-static void gen8_emit_vf_topology(struct intel_batchbuffer *batch)
+static void gen8_emit_vf_topology(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY);
-	OUT_BATCH(_3DPRIM_RECTLIST);
+	intel_bb_out(ibb, GEN8_3DSTATE_VF_TOPOLOGY);
+	intel_bb_out(ibb, _3DPRIM_RECTLIST);
 }
 
 /* Vertex elements MUST be defined before this according to spec */
-static void gen8_emit_primitive(struct intel_batchbuffer *batch, uint32_t offset)
+static void gen8_emit_primitive(struct intel_bb *ibb, uint32_t offset)
 {
-	OUT_BATCH(GEN8_3DSTATE_VF_INSTANCING | (3 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN8_3DSTATE_VF_INSTANCING | (3 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 
-	OUT_BATCH(GEN4_3DPRIMITIVE | (7-2));
-	OUT_BATCH(0);	/* gen8+ ignore the topology type field */
-	OUT_BATCH(3);	/* vertex count */
-	OUT_BATCH(0);	/*  We're specifying this instead with offset in GEN6_3DSTATE_VERTEX_BUFFERS */
-	OUT_BATCH(1);	/* single instance */
-	OUT_BATCH(0);	/* start instance location */
-	OUT_BATCH(0);	/* index buffer offset, ignored */
+	intel_bb_out(ibb, GEN4_3DPRIMITIVE | (7-2));
+	intel_bb_out(ibb, 0);	/* gen8+ ignore the topology type field */
+	intel_bb_out(ibb, 3);	/* vertex count */
+	intel_bb_out(ibb, 0);	/*  We're specifying this instead with offset in GEN6_3DSTATE_VERTEX_BUFFERS */
+	intel_bb_out(ibb, 1);	/* single instance */
+	intel_bb_out(ibb, 0);	/* start instance location */
+	intel_bb_out(ibb, 0);	/* index buffer offset, ignored */
 }
 
 /* The general rule is if it's named gen6 it is directly copied from
@@ -897,114 +779,104 @@ static void gen8_emit_primitive(struct intel_batchbuffer *batch, uint32_t offset
 
 #define BATCH_STATE_SPLIT 2048
 
-void gen8_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y)
+void gen8_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src,
+			  unsigned int src_x, unsigned int src_y,
+			  unsigned int width, unsigned int height,
+			  struct intel_buf *dst,
+			  unsigned int dst_x, unsigned int dst_y)
 {
-	struct annotations_context aub_annotations;
 	uint32_t ps_sampler_state, ps_kernel_off, ps_binding_table;
 	uint32_t scissor_state;
 	uint32_t vertex_buffer;
-	uint32_t batch_end;
 
 	igt_assert(src->bpp == dst->bpp);
-	intel_batchbuffer_flush_with_context(batch, context);
 
-	intel_batchbuffer_align(batch, 8);
+	intel_bb_flush_render_with_context(ibb, ctx);
 
-	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
+	intel_bb_add_intel_buf(ibb, dst, true);
+	intel_bb_add_intel_buf(ibb, src, false);
 
-	annotation_init(&aub_annotations);
+	intel_bb_ptr_set(ibb, BATCH_STATE_SPLIT);
 
-	ps_binding_table  = gen8_bind_surfaces(batch, &aub_annotations,
-					       src, dst);
-	ps_sampler_state  = gen8_create_sampler(batch, &aub_annotations);
-	ps_kernel_off = gen8_fill_ps(batch, &aub_annotations,
-				     ps_kernel, sizeof(ps_kernel));
-	vertex_buffer = gen7_fill_vertex_buffer_data(batch, &aub_annotations,
+	ps_binding_table  = gen8_bind_surfaces(ibb, src, dst);
+	ps_sampler_state  = gen8_create_sampler(ibb);
+	ps_kernel_off = gen8_fill_ps(ibb, ps_kernel, sizeof(ps_kernel));
+	vertex_buffer = gen7_fill_vertex_buffer_data(ibb,
 						     src,
 						     src_x, src_y,
 						     dst_x, dst_y,
 						     width, height);
-	cc.cc_state = gen6_create_cc_state(batch, &aub_annotations);
-	cc.blend_state = gen8_create_blend_state(batch, &aub_annotations);
-	viewport.cc_state = gen6_create_cc_viewport(batch, &aub_annotations);
-	viewport.sf_clip_state = gen7_create_sf_clip_viewport(batch, &aub_annotations);
-	scissor_state = gen6_create_scissor_rect(batch, &aub_annotations);
+	cc.cc_state = gen6_create_cc_state(ibb);
+	cc.blend_state = gen8_create_blend_state(ibb);
+	viewport.cc_state = gen6_create_cc_viewport(ibb);
+	viewport.sf_clip_state = gen7_create_sf_clip_viewport(ibb);
+	scissor_state = gen6_create_scissor_rect(ibb);
 	/* TODO: there is other state which isn't set up */
 
-	igt_assert(batch->ptr < &batch->buffer[4095]);
-
-	batch->ptr = batch->buffer;
+	intel_bb_ptr_set(ibb, 0);
 
 	/* Start emitting the commands. The order roughly follows the mesa blorp
 	 * order */
-	OUT_BATCH(G4X_PIPELINE_SELECT | PIPELINE_SELECT_3D);
-
-	gen8_emit_sip(batch);
-
-	gen7_emit_push_constants(batch);
-
-	gen8_emit_state_base_address(batch);
-
-	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC);
-	OUT_BATCH(viewport.cc_state);
-	OUT_BATCH(GEN8_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP);
-	OUT_BATCH(viewport.sf_clip_state);
+	intel_bb_out(ibb, G4X_PIPELINE_SELECT | PIPELINE_SELECT_3D);
 
-	gen7_emit_urb(batch);
+	gen8_emit_sip(ibb);
 
-	gen8_emit_cc(batch);
+	gen7_emit_push_constants(ibb);
 
-	gen8_emit_multisample(batch);
+	gen8_emit_state_base_address(ibb);
 
-	gen8_emit_null_state(batch);
+	intel_bb_out(ibb, GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC);
+	intel_bb_out(ibb, viewport.cc_state);
+	intel_bb_out(ibb, GEN8_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP);
+	intel_bb_out(ibb, viewport.sf_clip_state);
 
-	OUT_BATCH(GEN7_3DSTATE_STREAMOUT | (5-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	gen7_emit_urb(ibb);
 
-	gen7_emit_clip(batch);
+	gen8_emit_cc(ibb);
 
-	gen8_emit_sf(batch);
+	gen8_emit_multisample(ibb);
 
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS);
-	OUT_BATCH(ps_binding_table);
+	gen8_emit_null_state(ibb);
 
-	OUT_BATCH(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS);
-	OUT_BATCH(ps_sampler_state);
+	intel_bb_out(ibb, GEN7_3DSTATE_STREAMOUT | (5-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 
-	gen8_emit_ps(batch, ps_kernel_off);
+	gen7_emit_clip(ibb);
 
-	OUT_BATCH(GEN8_3DSTATE_SCISSOR_STATE_POINTERS);
-	OUT_BATCH(scissor_state);
+	gen8_emit_sf(ibb);
 
-	gen8_emit_depth(batch);
+	intel_bb_out(ibb, GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS);
+	intel_bb_out(ibb, ps_binding_table);
 
-	gen7_emit_clear(batch);
+	intel_bb_out(ibb, GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS);
+	intel_bb_out(ibb, ps_sampler_state);
 
-	gen6_emit_drawing_rectangle(batch, dst);
+	gen8_emit_ps(ibb, ps_kernel_off);
 
-	gen8_emit_vertex_buffer(batch, vertex_buffer);
-	gen6_emit_vertex_elements(batch);
+	intel_bb_out(ibb, GEN8_3DSTATE_SCISSOR_STATE_POINTERS);
+	intel_bb_out(ibb, scissor_state);
 
-	gen8_emit_vf_topology(batch);
-	gen8_emit_primitive(batch, vertex_buffer);
+	gen8_emit_depth(ibb);
 
-	OUT_BATCH(MI_BATCH_BUFFER_END);
+	gen7_emit_clear(ibb);
 
-	batch_end = intel_batchbuffer_align(batch, 8);
-	igt_assert(batch_end < BATCH_STATE_SPLIT);
-	annotation_add_batch(&aub_annotations, batch_end);
+	gen6_emit_drawing_rectangle(ibb, dst);
 
-	dump_batch(batch);
+	gen8_emit_vertex_buffer(ibb, vertex_buffer);
+	gen6_emit_vertex_elements(ibb);
 
-	annotation_flush(&aub_annotations, batch);
+	gen8_emit_vf_topology(ibb);
+	gen8_emit_primitive(ibb, vertex_buffer);
 
-	gen6_render_flush(batch, context, batch_end);
-	intel_batchbuffer_reset(batch);
+	intel_bb_emit_bbe(ibb);
+	intel_bb_exec_with_context(ibb, intel_bb_offset(ibb), ctx,
+				   I915_EXEC_DEFAULT | I915_EXEC_NO_RELOC,
+				   false);
+	dump_batch(ibb);
+	intel_bb_reset(ibb, false);
 }
diff --git a/lib/rendercopy_gen9.c b/lib/rendercopy_gen9.c
index 85ae4cab..6bad7bb6 100644
--- a/lib/rendercopy_gen9.c
+++ b/lib/rendercopy_gen9.c
@@ -16,7 +16,7 @@
 
 #include "drmtest.h"
 #include "intel_aux_pgtable.h"
-#include "intel_bufmgr.h"
+#include "intel_bufops.h"
 #include "intel_batchbuffer.h"
 #include "intel_io.h"
 #include "rendercopy.h"
@@ -24,17 +24,12 @@
 #include "intel_reg.h"
 #include "igt_aux.h"
 
-#include "intel_aub.h"
-
 #define VERTEX_SIZE (3*4)
 
 #if DEBUG_RENDERCPY
-static void dump_batch(struct intel_batchbuffer *batch) {
-	int fd = open("/tmp/i965-batchbuffers.dump", O_WRONLY | O_CREAT,  0666);
-	if (fd != -1) {
-		igt_assert_eq(write(fd, batch->buffer, 4096), 4096);
-		fd = close(fd);
-	}
+static void dump_batch(struct intel_bb *ibb)
+{
+	intel_bb_dump(ibb, "/tmp/gen9-batchbuffers.dump");
 }
 #else
 #define dump_batch(x) do { } while(0)
@@ -120,87 +115,16 @@ static const uint32_t gen12_render_copy[][4] = {
 	{ 0x80040131, 0x00000004, 0x50007144, 0x00c40000 },
 };
 
-/* AUB annotation support */
-#define MAX_ANNOTATIONS	33
-struct annotations_context {
-	drm_intel_aub_annotation annotations[MAX_ANNOTATIONS];
-	int index;
-	uint32_t offset;
-} aub_annotations;
-
-static void annotation_init(struct annotations_context *ctx)
-{
-	/* ctx->annotations is an array keeping a list of annotations of the
-	 * batch buffer ordered by offset. ctx->annotations[0] is thus left
-	 * for the command stream and will be filled just before executing
-	 * the batch buffer with annotations_add_batch() */
-	ctx->index = 1;
-}
-
-static void add_annotation(drm_intel_aub_annotation *a,
-			   uint32_t type, uint32_t subtype,
-			   uint32_t ending_offset)
-{
-	a->type = type;
-	a->subtype = subtype;
-	a->ending_offset = ending_offset;
-}
-
-static void annotation_add_batch(struct annotations_context *ctx, size_t size)
-{
-	add_annotation(&ctx->annotations[0], AUB_TRACE_TYPE_BATCH, 0, size);
-}
-
-static void annotation_add_state(struct annotations_context *ctx,
-				 uint32_t state_type,
-				 uint32_t start_offset,
-				 size_t   size)
-{
-	assert(ctx->index < MAX_ANNOTATIONS);
-
-	add_annotation(&ctx->annotations[ctx->index++],
-		       AUB_TRACE_TYPE_NOTYPE, 0,
-		       start_offset);
-	add_annotation(&ctx->annotations[ctx->index++],
-		       AUB_TRACE_TYPE(state_type),
-		       AUB_TRACE_SUBTYPE(state_type),
-		       start_offset + size);
-}
-
-static void annotation_flush(struct annotations_context *ctx,
-			     struct intel_batchbuffer *batch)
-{
-	if (!igt_aub_dump_enabled())
-		return;
-
-	drm_intel_bufmgr_gem_set_aub_annotations(batch->bo,
-						 ctx->annotations,
-						 ctx->index);
-}
-
-static void
-gen6_render_flush(struct intel_batchbuffer *batch,
-		  drm_intel_context *context, uint32_t batch_end)
-{
-	igt_assert_eq(drm_intel_bo_subdata(batch->bo,
-					   0, 4096, batch->buffer),
-		      0);
-	igt_assert_eq(drm_intel_gem_bo_context_exec(batch->bo,
-						    context, batch_end, 0),
-		      0);
-}
-
 /* Mostly copy+paste from gen6, except height, width, pitch moved */
 static uint32_t
-gen8_bind_buf(struct intel_batchbuffer *batch, const struct igt_buf *buf,
-	      int is_dst) {
+gen8_bind_buf(struct intel_bb *ibb, const struct intel_buf *buf, int is_dst) {
 	struct gen9_surface_state *ss;
-	uint32_t write_domain, read_domain, offset;
-	int ret;
+	uint32_t write_domain, read_domain;
+	uint64_t address;
 
 	igt_assert_lte(buf->surface[0].stride, 256*1024);
-	igt_assert_lte(igt_buf_width(buf), 16384);
-	igt_assert_lte(igt_buf_height(buf), 16384);
+	igt_assert_lte(intel_buf_width(buf), 16384);
+	igt_assert_lte(intel_buf_height(buf), 16384);
 
 	if (is_dst) {
 		write_domain = read_domain = I915_GEM_DOMAIN_RENDER;
@@ -209,10 +133,7 @@ gen8_bind_buf(struct intel_batchbuffer *batch, const struct igt_buf *buf,
 		read_domain = I915_GEM_DOMAIN_SAMPLER;
 	}
 
-	ss = intel_batchbuffer_subdata_alloc(batch, sizeof(*ss), 64);
-	offset = intel_batchbuffer_subdata_offset(batch, ss);
-	annotation_add_state(&aub_annotations, AUB_TRACE_SURFACE_STATE,
-			     offset, sizeof(*ss));
+	ss = intel_bb_ptr_align(ibb, 64);
 
 	ss->ss0.surface_type = SURFACE_2D;
 	switch (buf->bpp) {
@@ -238,17 +159,15 @@ gen8_bind_buf(struct intel_batchbuffer *batch, const struct igt_buf *buf,
 		ss->ss5.trmode = 2;
 	ss->ss5.mip_tail_start_lod = 1; /* needed with trmode */
 
-	ss->ss8.base_addr = buf->bo->offset64;
-	ss->ss9.base_addr_hi = buf->bo->offset64 >> 32;
+	address = intel_bb_offset_reloc(ibb, buf->handle,
+					read_domain, write_domain,
+					intel_bb_offset(ibb) + 4 * 8,
+					buf->addr.offset);
+	ss->ss8.base_addr = address;
+	ss->ss9.base_addr_hi = address >> 32;
 
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				      intel_batchbuffer_subdata_offset(batch, &ss->ss8),
-				      buf->bo, 0,
-				      read_domain, write_domain);
-	assert(ret == 0);
-
-	ss->ss2.height = igt_buf_height(buf) - 1;
-	ss->ss2.width  = igt_buf_width(buf) - 1;
+	ss->ss2.height = intel_buf_height(buf) - 1;
+	ss->ss2.width  = intel_buf_width(buf) - 1;
 	ss->ss3.pitch  = buf->surface[0].stride - 1;
 
 	ss->ss7.skl.shader_chanel_select_r = 4;
@@ -264,60 +183,52 @@ gen8_bind_buf(struct intel_batchbuffer *batch, const struct igt_buf *buf,
 		ss->ss6.aux_mode = 0x5; /* AUX_CCS_E */
 		ss->ss6.aux_pitch = (buf->ccs[0].stride / 128) - 1;
 
-		ss->ss10.aux_base_addr = buf->bo->offset64 + buf->ccs[0].offset;
-		ss->ss11.aux_base_addr_hi = (buf->bo->offset64 + buf->ccs[0].offset) >> 32;
-
-		ret = drm_intel_bo_emit_reloc(batch->bo,
-					      intel_batchbuffer_subdata_offset(batch, &ss->ss10),
-					      buf->bo, buf->ccs[0].offset,
-					      read_domain, write_domain);
-		assert(ret == 0);
+		address = intel_bb_offset_reloc_with_delta(ibb, buf->handle,
+							   read_domain, write_domain,
+							   buf->ccs[0].offset,
+							   intel_bb_offset(ibb) + 4 * 10,
+							   buf->addr.offset);
+		ss->ss10.aux_base_addr = (address + buf->ccs[0].offset);
+		ss->ss11.aux_base_addr_hi = (address + buf->ccs[0].offset) >> 32;
 	}
 
 	if (buf->cc.offset) {
 		igt_assert(buf->compression == I915_COMPRESSION_RENDER);
 
-		ss->ss12.clear_address = buf->bo->offset64 + buf->cc.offset;
-		ss->ss13.clear_address_hi = (buf->bo->offset64 + buf->cc.offset) >> 32;
-
-		ret = drm_intel_bo_emit_reloc(batch->bo,
-					      intel_batchbuffer_subdata_offset(batch, &ss->ss12),
-					      buf->bo, buf->cc.offset,
-					      read_domain, write_domain);
-		assert(ret == 0);
+		address = intel_bb_offset_reloc_with_delta(ibb, buf->handle,
+							   read_domain, write_domain,
+							   buf->cc.offset,
+							   intel_bb_offset(ibb) + 4 * 12,
+							   buf->addr.offset);
+		ss->ss12.clear_address = address + buf->cc.offset;
+		ss->ss13.clear_address_hi = (address + buf->cc.offset) >> 32;
 	}
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*ss));
 }
 
 static uint32_t
-gen8_bind_surfaces(struct intel_batchbuffer *batch,
-		   const struct igt_buf *src,
-		   const struct igt_buf *dst)
+gen8_bind_surfaces(struct intel_bb *ibb,
+		   const struct intel_buf *src,
+		   const struct intel_buf *dst)
 {
-	uint32_t *binding_table, offset;
+	uint32_t *binding_table, binding_table_offset;
 
-	binding_table = intel_batchbuffer_subdata_alloc(batch, 8, 32);
-	offset = intel_batchbuffer_subdata_offset(batch, binding_table);
-	annotation_add_state(&aub_annotations, AUB_TRACE_BINDING_TABLE,
-			     offset, 8);
+	binding_table = intel_bb_ptr_align(ibb, 32);
+	binding_table_offset = intel_bb_ptr_add_return_prev_offset(ibb, 32);
 
-	binding_table[0] = gen8_bind_buf(batch, dst, 1);
-	binding_table[1] = gen8_bind_buf(batch, src, 0);
+	binding_table[0] = gen8_bind_buf(ibb, dst, 1);
+	binding_table[1] = gen8_bind_buf(ibb, src, 0);
 
-	return offset;
+	return binding_table_offset;
 }
 
 /* Mostly copy+paste from gen6, except wrap modes moved */
 static uint32_t
-gen8_create_sampler(struct intel_batchbuffer *batch) {
+gen8_create_sampler(struct intel_bb *ibb) {
 	struct gen8_sampler_state *ss;
-	uint32_t offset;
 
-	ss = intel_batchbuffer_subdata_alloc(batch, sizeof(*ss), 64);
-	offset = intel_batchbuffer_subdata_offset(batch, ss);
-	annotation_add_state(&aub_annotations, AUB_TRACE_SAMPLER_STATE,
-			     offset, sizeof(*ss));
+	ss = intel_bb_ptr_align(ibb, 64);
 
 	ss->ss0.min_filter = GEN4_MAPFILTER_NEAREST;
 	ss->ss0.mag_filter = GEN4_MAPFILTER_NEAREST;
@@ -329,21 +240,15 @@ gen8_create_sampler(struct intel_batchbuffer *batch) {
 	 * sampler fetch, but couldn't make it work. */
 	ss->ss3.non_normalized_coord = 0;
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*ss));
 }
 
 static uint32_t
-gen8_fill_ps(struct intel_batchbuffer *batch,
+gen8_fill_ps(struct intel_bb *ibb,
 	     const uint32_t kernel[][4],
 	     size_t size)
 {
-	uint32_t offset;
-
-	offset = intel_batchbuffer_copy_data(batch, kernel, size, 64);
-	annotation_add_state(&aub_annotations, AUB_TRACE_KERNEL_INSTRUCTIONS,
-			     offset, size);
-
-	return offset;
+	return intel_bb_copy_data(ibb, kernel, size, 64);
 }
 
 /*
@@ -357,33 +262,29 @@ gen8_fill_ps(struct intel_batchbuffer *batch,
  * see gen6_emit_vertex_elements
  */
 static uint32_t
-gen7_fill_vertex_buffer_data(struct intel_batchbuffer *batch,
-			     const struct igt_buf *src,
+gen7_fill_vertex_buffer_data(struct intel_bb *ibb,
+			     const struct intel_buf *src,
 			     uint32_t src_x, uint32_t src_y,
 			     uint32_t dst_x, uint32_t dst_y,
 			     uint32_t width, uint32_t height)
 {
-	void *start;
 	uint32_t offset;
 
-	intel_batchbuffer_align(batch, 8);
-	start = batch->ptr;
+	intel_bb_ptr_align(ibb, 8);
+	offset = intel_bb_offset(ibb);
 
-	emit_vertex_2s(batch, dst_x + width, dst_y + height);
-	emit_vertex_normalized(batch, src_x + width, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y + height, igt_buf_height(src));
+	emit_vertex_2s(ibb, dst_x + width, dst_y + height);
+	emit_vertex_normalized(ibb, src_x + width, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y + height, intel_buf_height(src));
 
-	emit_vertex_2s(batch, dst_x, dst_y + height);
-	emit_vertex_normalized(batch, src_x, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y + height, igt_buf_height(src));
+	emit_vertex_2s(ibb, dst_x, dst_y + height);
+	emit_vertex_normalized(ibb, src_x, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y + height, intel_buf_height(src));
 
-	emit_vertex_2s(batch, dst_x, dst_y);
-	emit_vertex_normalized(batch, src_x, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y, igt_buf_height(src));
+	emit_vertex_2s(ibb, dst_x, dst_y);
+	emit_vertex_normalized(ibb, src_x, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y, intel_buf_height(src));
 
-	offset = intel_batchbuffer_subdata_offset(batch, start);
-	annotation_add_state(&aub_annotations, AUB_TRACE_VERTEX_BUFFER,
-			     offset, 3 * VERTEX_SIZE);
 	return offset;
 }
 
@@ -397,25 +298,25 @@ gen7_fill_vertex_buffer_data(struct intel_batchbuffer *batch,
  * packed.
  */
 static void
-gen6_emit_vertex_elements(struct intel_batchbuffer *batch) {
+gen6_emit_vertex_elements(struct intel_bb *ibb) {
 	/*
 	 * The VUE layout
 	 *    dword 0-3: pad (0, 0, 0. 0)
 	 *    dword 4-7: position (x, y, 0, 1.0),
 	 *    dword 8-11: texture coordinate 0 (u0, v0, 0, 1.0)
 	 */
-	OUT_BATCH(GEN4_3DSTATE_VERTEX_ELEMENTS | (3 * 2 + 1 - 2));
+	intel_bb_out(ibb, GEN4_3DSTATE_VERTEX_ELEMENTS | (3 * 2 + 1 - 2));
 
 	/* Element state 0. These are 4 dwords of 0 required for the VUE format.
 	 * We don't really know or care what they do.
 	 */
-	OUT_BATCH(0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
-		  SURFACEFORMAT_R32G32B32A32_FLOAT << VE0_FORMAT_SHIFT |
-		  0 << VE0_OFFSET_SHIFT); /* we specify 0, but it's really does not exist */
-	OUT_BATCH(GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
+	intel_bb_out(ibb, 0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
+		     SURFACEFORMAT_R32G32B32A32_FLOAT << VE0_FORMAT_SHIFT |
+		     0 << VE0_OFFSET_SHIFT); /* we specify 0, but it really does not exist */
+	intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT);
 
 	/* Element state 1 - Our "destination" vertices. These are passed down
 	 * through the pipeline, and eventually make it to the pixel shader as
@@ -423,25 +324,25 @@ gen6_emit_vertex_elements(struct intel_batchbuffer *batch) {
 	 * signed/scaled because of gen6 rendercopy. I see no particular reason
 	 * for doing this though.
 	 */
-	OUT_BATCH(0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
-		  SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
-		  0 << VE0_OFFSET_SHIFT); /* offsets vb in bytes */
-	OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
+	intel_bb_out(ibb, 0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
+		     SURFACEFORMAT_R16G16_SSCALED << VE0_FORMAT_SHIFT |
+		     0 << VE0_OFFSET_SHIFT); /* offset into the vb, in bytes */
+	intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
 
 	/* Element state 2. Last but not least we store the U,V components as
 	 * normalized floats. These will be used in the pixel shader to sample
 	 * from the source buffer.
 	 */
-	OUT_BATCH(0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
-		  SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
-		  4 << VE0_OFFSET_SHIFT);	/* offset vb in bytes */
-	OUT_BATCH(GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
-		  GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
+	intel_bb_out(ibb, 0 << GEN6_VE0_VERTEX_BUFFER_INDEX_SHIFT | GEN6_VE0_VALID |
+		     SURFACEFORMAT_R32G32_FLOAT << VE0_FORMAT_SHIFT |
+		     4 << VE0_OFFSET_SHIFT);	/* offset into the vb, in bytes */
+	intel_bb_out(ibb, GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_0_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_SRC << VE1_VFCOMPONENT_1_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT |
+		     GEN4_VFCOMPONENT_STORE_1_FLT << VE1_VFCOMPONENT_3_SHIFT);
 }
 
 /*
@@ -450,42 +351,35 @@ gen6_emit_vertex_elements(struct intel_batchbuffer *batch) {
  * @batch
  * @offset - bytw offset within the @batch where the vertex buffer starts.
  */
-static void gen7_emit_vertex_buffer(struct intel_batchbuffer *batch,
-				    uint32_t offset) {
-	OUT_BATCH(GEN4_3DSTATE_VERTEX_BUFFERS | (1 + (4 * 1) - 2));
-	OUT_BATCH(0 << GEN6_VB0_BUFFER_INDEX_SHIFT | /* VB 0th index */
-		  GEN8_VB0_BUFFER_ADDR_MOD_EN | /* Address Modify Enable */
-		  VERTEX_SIZE << VB0_BUFFER_PITCH_SHIFT);
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_VERTEX, 0, offset);
-	OUT_BATCH(3 * VERTEX_SIZE);
+static void gen7_emit_vertex_buffer(struct intel_bb *ibb, uint32_t offset)
+{
+	intel_bb_out(ibb, GEN4_3DSTATE_VERTEX_BUFFERS | (1 + (4 * 1) - 2));
+	intel_bb_out(ibb, 0 << GEN6_VB0_BUFFER_INDEX_SHIFT | /* VB 0th index */
+		     GEN8_VB0_BUFFER_ADDR_MOD_EN | /* Address Modify Enable */
+		     VERTEX_SIZE << VB0_BUFFER_PITCH_SHIFT);
+	intel_bb_emit_reloc(ibb, ibb->handle,
+			    I915_GEM_DOMAIN_VERTEX, 0,
+			    offset, ibb->batch_offset);
+	intel_bb_out(ibb, 3 * VERTEX_SIZE);
 }
 
 static uint32_t
-gen6_create_cc_state(struct intel_batchbuffer *batch)
+gen6_create_cc_state(struct intel_bb *ibb)
 {
 	struct gen6_color_calc_state *cc_state;
-	uint32_t offset;
 
-	cc_state = intel_batchbuffer_subdata_alloc(batch,
-						   sizeof(*cc_state), 64);
-	offset = intel_batchbuffer_subdata_offset(batch, cc_state);
-	annotation_add_state(&aub_annotations, AUB_TRACE_CC_STATE,
-			     offset, sizeof(*cc_state));
+	cc_state = intel_bb_ptr_align(ibb, 64);
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*cc_state));
 }
 
 static uint32_t
-gen8_create_blend_state(struct intel_batchbuffer *batch)
+gen8_create_blend_state(struct intel_bb *ibb)
 {
 	struct gen8_blend_state *blend;
 	int i;
-	uint32_t offset;
 
-	blend = intel_batchbuffer_subdata_alloc(batch, sizeof(*blend), 64);
-	offset = intel_batchbuffer_subdata_offset(batch, blend);
-	annotation_add_state(&aub_annotations, AUB_TRACE_BLEND_STATE,
-			     offset, sizeof(*blend));
+	blend = intel_bb_ptr_align(ibb, 64);
 
 	for (i = 0; i < 16; i++) {
 		blend->bs[i].dest_blend_factor = GEN6_BLENDFACTOR_ZERO;
@@ -495,466 +389,458 @@ gen8_create_blend_state(struct intel_batchbuffer *batch)
 		blend->bs[i].color_buffer_blend = 0;
 	}
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*blend));
 }
 
 static uint32_t
-gen6_create_cc_viewport(struct intel_batchbuffer *batch)
+gen6_create_cc_viewport(struct intel_bb *ibb)
 {
 	struct gen4_cc_viewport *vp;
-	uint32_t offset;
 
-	vp = intel_batchbuffer_subdata_alloc(batch, sizeof(*vp), 32);
-	offset = intel_batchbuffer_subdata_offset(batch, vp);
-	annotation_add_state(&aub_annotations, AUB_TRACE_CC_VP_STATE,
-			     offset, sizeof(*vp));
+	vp = intel_bb_ptr_align(ibb, 32);
 
 	/* XXX I don't understand this */
 	vp->min_depth = -1.e35;
 	vp->max_depth = 1.e35;
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*vp));
 }
 
 static uint32_t
-gen7_create_sf_clip_viewport(struct intel_batchbuffer *batch) {
+gen7_create_sf_clip_viewport(struct intel_bb *ibb) {
 	/* XXX these are likely not needed */
 	struct gen7_sf_clip_viewport *scv_state;
-	uint32_t offset;
 
-	scv_state = intel_batchbuffer_subdata_alloc(batch,
-						    sizeof(*scv_state), 64);
-	offset = intel_batchbuffer_subdata_offset(batch, scv_state);
-	annotation_add_state(&aub_annotations, AUB_TRACE_CLIP_VP_STATE,
-			     offset, sizeof(*scv_state));
+	scv_state = intel_bb_ptr_align(ibb, 64);
 
 	scv_state->guardband.xmin = 0;
 	scv_state->guardband.xmax = 1.0f;
 	scv_state->guardband.ymin = 0;
 	scv_state->guardband.ymax = 1.0f;
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*scv_state));
 }
 
 static uint32_t
-gen6_create_scissor_rect(struct intel_batchbuffer *batch)
+gen6_create_scissor_rect(struct intel_bb *ibb)
 {
 	struct gen6_scissor_rect *scissor;
-	uint32_t offset;
 
-	scissor = intel_batchbuffer_subdata_alloc(batch, sizeof(*scissor), 64);
-	offset = intel_batchbuffer_subdata_offset(batch, scissor);
-	annotation_add_state(&aub_annotations, AUB_TRACE_SCISSOR_STATE,
-			     offset, sizeof(*scissor));
+	scissor = intel_bb_ptr_align(ibb, 64);
 
-	return offset;
+	return intel_bb_ptr_add_return_prev_offset(ibb, sizeof(*scissor));
 }
 
 static void
-gen8_emit_sip(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN4_STATE_SIP | (3 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+gen8_emit_sip(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN4_STATE_SIP | (3 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_push_constants(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS);
-	OUT_BATCH(0);
-	OUT_BATCH(GEN8_3DSTATE_PUSH_CONSTANT_ALLOC_HS);
-	OUT_BATCH(0);
-	OUT_BATCH(GEN8_3DSTATE_PUSH_CONSTANT_ALLOC_DS);
-	OUT_BATCH(0);
-	OUT_BATCH(GEN8_3DSTATE_PUSH_CONSTANT_ALLOC_GS);
-	OUT_BATCH(0);
-	OUT_BATCH(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS);
-	OUT_BATCH(0);
+gen7_emit_push_constants(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, GEN8_3DSTATE_PUSH_CONSTANT_ALLOC_HS);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, GEN8_3DSTATE_PUSH_CONSTANT_ALLOC_DS);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, GEN8_3DSTATE_PUSH_CONSTANT_ALLOC_GS);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen9_emit_state_base_address(struct intel_batchbuffer *batch) {
+gen9_emit_state_base_address(struct intel_bb *ibb) {
 
 	/* WaBindlessSurfaceStateModifyEnable:skl,bxt */
 	/* The length has to be one less if we dont modify
 	   bindless state */
-	OUT_BATCH(GEN4_STATE_BASE_ADDRESS | (19 - 1 - 2));
+	intel_bb_out(ibb, GEN4_STATE_BASE_ADDRESS | (19 - 1 - 2));
 
 	/* general */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, 0 | BASE_ADDRESS_MODIFY);
+	intel_bb_out(ibb, 0);
 
 	/* stateless data port */
-	OUT_BATCH(0 | BASE_ADDRESS_MODIFY);
+	intel_bb_out(ibb, 0 | BASE_ADDRESS_MODIFY);
 
 	/* surface */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_SAMPLER, 0, BASE_ADDRESS_MODIFY);
+	intel_bb_emit_reloc(ibb, ibb->handle,
+			    I915_GEM_DOMAIN_SAMPLER, 0,
+			    BASE_ADDRESS_MODIFY, ibb->batch_offset);
 
 	/* dynamic */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION,
-		  0, BASE_ADDRESS_MODIFY);
+	intel_bb_emit_reloc(ibb, ibb->handle,
+			    I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, 0,
+			    BASE_ADDRESS_MODIFY, ibb->batch_offset);
 
 	/* indirect */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 
 	/* instruction */
-	OUT_RELOC(batch->bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY);
+	intel_bb_emit_reloc(ibb, ibb->handle,
+			    I915_GEM_DOMAIN_INSTRUCTION, 0,
+			    BASE_ADDRESS_MODIFY, ibb->batch_offset);
 
 	/* general state buffer size */
-	OUT_BATCH(0xfffff000 | 1);
+	intel_bb_out(ibb, 0xfffff000 | 1);
 	/* dynamic state buffer size */
-	OUT_BATCH(1 << 12 | 1);
+	intel_bb_out(ibb, 1 << 12 | 1);
 	/* indirect object buffer size */
-	OUT_BATCH(0xfffff000 | 1);
+	intel_bb_out(ibb, 0xfffff000 | 1);
 	/* intruction buffer size */
-	OUT_BATCH(1 << 12 | 1);
+	intel_bb_out(ibb, 1 << 12 | 1);
 
 	/* Bindless surface state base address */
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_urb(struct intel_batchbuffer *batch) {
+gen7_emit_urb(struct intel_bb *ibb) {
 	/* XXX: Min valid values from mesa */
 	const int vs_entries = 64;
 	const int vs_size = 2;
 	const int vs_start = 4;
 
-	OUT_BATCH(GEN7_3DSTATE_URB_VS);
-	OUT_BATCH(vs_entries | ((vs_size - 1) << 16) | (vs_start << 25));
-	OUT_BATCH(GEN7_3DSTATE_URB_GS);
-	OUT_BATCH(vs_start << 25);
-	OUT_BATCH(GEN7_3DSTATE_URB_HS);
-	OUT_BATCH(vs_start << 25);
-	OUT_BATCH(GEN7_3DSTATE_URB_DS);
-	OUT_BATCH(vs_start << 25);
+	intel_bb_out(ibb, GEN7_3DSTATE_URB_VS);
+	intel_bb_out(ibb, vs_entries | ((vs_size - 1) << 16) | (vs_start << 25));
+	intel_bb_out(ibb, GEN7_3DSTATE_URB_GS);
+	intel_bb_out(ibb, vs_start << 25);
+	intel_bb_out(ibb, GEN7_3DSTATE_URB_HS);
+	intel_bb_out(ibb, vs_start << 25);
+	intel_bb_out(ibb, GEN7_3DSTATE_URB_DS);
+	intel_bb_out(ibb, vs_start << 25);
 }
 
 static void
-gen8_emit_cc(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS);
-	OUT_BATCH(cc.blend_state | 1);
+gen8_emit_cc(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN7_3DSTATE_BLEND_STATE_POINTERS);
+	intel_bb_out(ibb, cc.blend_state | 1);
 
-	OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS);
-	OUT_BATCH(cc.cc_state | 1);
+	intel_bb_out(ibb, GEN6_3DSTATE_CC_STATE_POINTERS);
+	intel_bb_out(ibb, cc.cc_state | 1);
 }
 
 static void
-gen8_emit_multisample(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN8_3DSTATE_MULTISAMPLE | 0);
-	OUT_BATCH(0);
+gen8_emit_multisample(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN8_3DSTATE_MULTISAMPLE | 0);
+	intel_bb_out(ibb, 0);
 
-	OUT_BATCH(GEN6_3DSTATE_SAMPLE_MASK);
-	OUT_BATCH(1);
+	intel_bb_out(ibb, GEN6_3DSTATE_SAMPLE_MASK);
+	intel_bb_out(ibb, 1);
 }
 
 static void
-gen8_emit_vs(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN6_3DSTATE_CONSTANT_VS | (11-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN6_3DSTATE_VS | (9-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+gen8_emit_vs(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN6_3DSTATE_CONSTANT_VS | (11-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN6_3DSTATE_VS | (9-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_hs(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_CONSTANT_HS | (11-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_HS | (9-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN8_3DSTATE_SAMPLER_STATE_POINTERS_HS);
-	OUT_BATCH(0);
+gen8_emit_hs(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN7_3DSTATE_CONSTANT_HS | (11-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_HS | (9-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_SAMPLER_STATE_POINTERS_HS);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_gs(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN6_3DSTATE_CONSTANT_GS | (11-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN6_3DSTATE_GS | (10-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS);
-	OUT_BATCH(0);
+gen8_emit_gs(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN6_3DSTATE_CONSTANT_GS | (11-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN6_3DSTATE_GS | (10-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen9_emit_ds(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_CONSTANT_DS | (11-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_DS | (11-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN8_3DSTATE_SAMPLER_STATE_POINTERS_DS);
-	OUT_BATCH(0);
+gen9_emit_ds(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN7_3DSTATE_CONSTANT_DS | (11-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_DS | (11-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_SAMPLER_STATE_POINTERS_DS);
+	intel_bb_out(ibb, 0);
 }
 
 
 static void
-gen8_emit_wm_hz_op(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN8_3DSTATE_WM_HZ_OP | (5-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+gen8_emit_wm_hz_op(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN8_3DSTATE_WM_HZ_OP | (5-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_null_state(struct intel_batchbuffer *batch) {
-	gen8_emit_wm_hz_op(batch);
-	gen8_emit_hs(batch);
-	OUT_BATCH(GEN7_3DSTATE_TE | (4-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	gen8_emit_gs(batch);
-	gen9_emit_ds(batch);
-	gen8_emit_vs(batch);
+gen8_emit_null_state(struct intel_bb *ibb) {
+	gen8_emit_wm_hz_op(ibb);
+	gen8_emit_hs(ibb);
+	intel_bb_out(ibb, GEN7_3DSTATE_TE | (4-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	gen8_emit_gs(ibb);
+	gen9_emit_ds(ibb);
+	gen8_emit_vs(ibb);
 }
 
 static void
-gen7_emit_clip(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN6_3DSTATE_CLIP | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0); /*  pass-through */
-	OUT_BATCH(0);
+gen7_emit_clip(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN6_3DSTATE_CLIP | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0); /*  pass-through */
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_sf(struct intel_batchbuffer *batch)
+gen8_emit_sf(struct intel_bb *ibb)
 {
 	int i;
 
-	OUT_BATCH(GEN7_3DSTATE_SBE | (6 - 2));
-	OUT_BATCH(1 << GEN7_SBE_NUM_OUTPUTS_SHIFT |
-		  GEN8_SBE_FORCE_URB_ENTRY_READ_LENGTH |
-		  GEN8_SBE_FORCE_URB_ENTRY_READ_OFFSET |
-		  1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT |
-		  1 << GEN8_SBE_URB_ENTRY_READ_OFFSET_SHIFT);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(GEN9_SBE_ACTIVE_COMPONENT_XYZW << 0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN8_3DSTATE_SBE_SWIZ | (11 - 2));
+	intel_bb_out(ibb, GEN7_3DSTATE_SBE | (6 - 2));
+	intel_bb_out(ibb, 1 << GEN7_SBE_NUM_OUTPUTS_SHIFT |
+		     GEN8_SBE_FORCE_URB_ENTRY_READ_LENGTH |
+		     GEN8_SBE_FORCE_URB_ENTRY_READ_OFFSET |
+		     1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT |
+		     1 << GEN8_SBE_URB_ENTRY_READ_OFFSET_SHIFT);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, GEN9_SBE_ACTIVE_COMPONENT_XYZW << 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_SBE_SWIZ | (11 - 2));
 	for (i = 0; i < 8; i++)
-		OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2));
-	OUT_BATCH(GEN8_RASTER_FRONT_WINDING_CCW | GEN8_RASTER_CULL_NONE);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+		intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_RASTER | (5 - 2));
+	intel_bb_out(ibb, GEN8_RASTER_FRONT_WINDING_CCW | GEN8_RASTER_CULL_NONE);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN6_3DSTATE_SF | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen8_emit_ps(struct intel_batchbuffer *batch, uint32_t kernel) {
+gen8_emit_ps(struct intel_bb *ibb, uint32_t kernel) {
 	const int max_threads = 63;
 
-	OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2));
-	OUT_BATCH(/* XXX: I don't understand the BARYCENTRIC stuff, but it
+	intel_bb_out(ibb, GEN6_3DSTATE_WM | (2 - 2));
+	intel_bb_out(ibb, /* XXX: I don't understand the BARYCENTRIC stuff, but it
 		   * appears we need it to put our setup data in the place we
 		   * expect (g6, see below) */
-		  GEN8_3DSTATE_PS_PERSPECTIVE_PIXEL_BARYCENTRIC);
-
-	OUT_BATCH(GEN6_3DSTATE_CONSTANT_PS | (11-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_PS | (12-2));
-	OUT_BATCH(kernel);
-	OUT_BATCH(0); /* kernel hi */
-	OUT_BATCH(1 << GEN6_3DSTATE_WM_SAMPLER_COUNT_SHIFT |
-		  2 << GEN6_3DSTATE_WM_BINDING_TABLE_ENTRY_COUNT_SHIFT);
-	OUT_BATCH(0); /* scratch space stuff */
-	OUT_BATCH(0); /* scratch hi */
-	OUT_BATCH((max_threads - 1) << GEN8_3DSTATE_PS_MAX_THREADS_SHIFT |
-		  GEN6_3DSTATE_WM_16_DISPATCH_ENABLE);
-	OUT_BATCH(6 << GEN6_3DSTATE_WM_DISPATCH_START_GRF_0_SHIFT);
-	OUT_BATCH(0); // kernel 1
-	OUT_BATCH(0); /* kernel 1 hi */
-	OUT_BATCH(0); // kernel 2
-	OUT_BATCH(0); /* kernel 2 hi */
-
-	OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2));
-	OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT);
-
-	OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2));
-	OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID | GEN8_PSX_ATTRIBUTE_ENABLE);
+		     GEN8_3DSTATE_PS_PERSPECTIVE_PIXEL_BARYCENTRIC);
+
+	intel_bb_out(ibb, GEN6_3DSTATE_CONSTANT_PS | (11-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_PS | (12-2));
+	intel_bb_out(ibb, kernel);
+	intel_bb_out(ibb, 0); /* kernel hi */
+	intel_bb_out(ibb, 1 << GEN6_3DSTATE_WM_SAMPLER_COUNT_SHIFT |
+		     2 << GEN6_3DSTATE_WM_BINDING_TABLE_ENTRY_COUNT_SHIFT);
+	intel_bb_out(ibb, 0); /* scratch space stuff */
+	intel_bb_out(ibb, 0); /* scratch hi */
+	intel_bb_out(ibb, (max_threads - 1) << GEN8_3DSTATE_PS_MAX_THREADS_SHIFT |
+		     GEN6_3DSTATE_WM_16_DISPATCH_ENABLE);
+	intel_bb_out(ibb, 6 << GEN6_3DSTATE_WM_DISPATCH_START_GRF_0_SHIFT);
+	intel_bb_out(ibb, 0); /* kernel 1 */
+	intel_bb_out(ibb, 0); /* kernel 1 hi */
+	intel_bb_out(ibb, 0); /* kernel 2 */
+	intel_bb_out(ibb, 0); /* kernel 2 hi */
+
+	intel_bb_out(ibb, GEN8_3DSTATE_PS_BLEND | (2 - 2));
+	intel_bb_out(ibb, GEN8_PS_BLEND_HAS_WRITEABLE_RT);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_PS_EXTRA | (2 - 2));
+	intel_bb_out(ibb, GEN8_PSX_PIXEL_SHADER_VALID | GEN8_PSX_ATTRIBUTE_ENABLE);
 }
 
 static void
-gen9_emit_depth(struct intel_batchbuffer *batch)
+gen9_emit_depth(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN8_3DSTATE_WM_DEPTH_STENCIL | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN7_3DSTATE_DEPTH_BUFFER | (8-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN8_3DSTATE_HIER_DEPTH_BUFFER | (5-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN8_3DSTATE_STENCIL_BUFFER | (5-2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN8_3DSTATE_WM_DEPTH_STENCIL | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN7_3DSTATE_DEPTH_BUFFER | (8-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_HIER_DEPTH_BUFFER | (5-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_STENCIL_BUFFER | (5-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 }
 
 static void
-gen7_emit_clear(struct intel_batchbuffer *batch) {
-	OUT_BATCH(GEN7_3DSTATE_CLEAR_PARAMS | (3-2));
-	OUT_BATCH(0);
-	OUT_BATCH(1); // clear valid
+gen7_emit_clear(struct intel_bb *ibb) {
+	intel_bb_out(ibb, GEN7_3DSTATE_CLEAR_PARAMS | (3-2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 1); /* clear valid */
 }
 
 static void
-gen6_emit_drawing_rectangle(struct intel_batchbuffer *batch, const struct igt_buf *dst)
+gen6_emit_drawing_rectangle(struct intel_bb *ibb, const struct intel_buf *dst)
 {
-	OUT_BATCH(GEN4_3DSTATE_DRAWING_RECTANGLE | (4 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH((igt_buf_height(dst) - 1) << 16 | (igt_buf_width(dst) - 1));
-	OUT_BATCH(0);
+	intel_bb_out(ibb, GEN4_3DSTATE_DRAWING_RECTANGLE | (4 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, (intel_buf_height(dst) - 1) << 16 | (intel_buf_width(dst) - 1));
+	intel_bb_out(ibb, 0);
 }
 
-static void gen8_emit_vf_topology(struct intel_batchbuffer *batch)
+static void gen8_emit_vf_topology(struct intel_bb *ibb)
 {
-	OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY);
-	OUT_BATCH(_3DPRIM_RECTLIST);
+	intel_bb_out(ibb, GEN8_3DSTATE_VF_TOPOLOGY);
+	intel_bb_out(ibb, _3DPRIM_RECTLIST);
 }
 
 /* Vertex elements MUST be defined before this according to spec */
-static void gen8_emit_primitive(struct intel_batchbuffer *batch, uint32_t offset)
+static void gen8_emit_primitive(struct intel_bb *ibb, uint32_t offset)
 {
-	OUT_BATCH(GEN8_3DSTATE_VF | (2 - 2));
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN8_3DSTATE_VF_INSTANCING | (3 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-
-	OUT_BATCH(GEN4_3DPRIMITIVE | (7-2));
-	OUT_BATCH(0);	/* gen8+ ignore the topology type field */
-	OUT_BATCH(3);	/* vertex count */
-	OUT_BATCH(0);	/*  We're specifying this instead with offset in GEN6_3DSTATE_VERTEX_BUFFERS */
-	OUT_BATCH(1);	/* single instance */
-	OUT_BATCH(0);	/* start instance location */
-	OUT_BATCH(0);	/* index buffer offset, ignored */
+	intel_bb_out(ibb, GEN8_3DSTATE_VF | (2 - 2));
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN8_3DSTATE_VF_INSTANCING | (3 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, GEN4_3DPRIMITIVE | (7-2));
+	intel_bb_out(ibb, 0);	/* gen8+ ignore the topology type field */
+	intel_bb_out(ibb, 3);	/* vertex count */
+	intel_bb_out(ibb, 0);	/* We're specifying this instead with offset in GEN6_3DSTATE_VERTEX_BUFFERS */
+	intel_bb_out(ibb, 1);	/* single instance */
+	intel_bb_out(ibb, 0);	/* start instance location */
+	intel_bb_out(ibb, 0);	/* index buffer offset, ignored */
 }
 
 /* The general rule is if it's named gen6 it is directly copied from
@@ -990,166 +876,159 @@ static void gen8_emit_primitive(struct intel_batchbuffer *batch, uint32_t offset
 #define BATCH_STATE_SPLIT 2048
 
 static
-void _gen9_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x,
-			  unsigned src_y, unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x,
-			  unsigned dst_y,
-			  drm_intel_bo *aux_pgtable_bo,
-			  const uint32_t ps_kernel[][4],
-			  uint32_t ps_kernel_size)
+void _gen9_render_copyfunc(struct intel_bb *ibb,
+			   uint32_t ctx,
+			   struct intel_buf *src,
+			   unsigned int src_x, unsigned int src_y,
+			   unsigned int width, unsigned int height,
+			   struct intel_buf *dst,
+			   unsigned int dst_x, unsigned int dst_y,
+			   struct intel_buf *aux_pgtable_buf,
+			   const uint32_t ps_kernel[][4],
+			   uint32_t ps_kernel_size)
 {
 	uint32_t ps_sampler_state, ps_kernel_off, ps_binding_table;
 	uint32_t scissor_state;
 	uint32_t vertex_buffer;
-	uint32_t batch_end;
 	uint32_t aux_pgtable_state;
 
 	igt_assert(src->bpp == dst->bpp);
-	intel_batchbuffer_flush_with_context(batch, context);
 
-	intel_batchbuffer_align(batch, 8);
+	intel_bb_flush_render_with_context(ibb, ctx);
 
-	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
+	intel_bb_add_intel_buf(ibb, dst, true);
+	intel_bb_add_intel_buf(ibb, src, false);
 
-	annotation_init(&aub_annotations);
+	intel_bb_ptr_set(ibb, BATCH_STATE_SPLIT);
 
-	ps_binding_table  = gen8_bind_surfaces(batch, src, dst);
-	ps_sampler_state  = gen8_create_sampler(batch);
-	ps_kernel_off = gen8_fill_ps(batch, ps_kernel, ps_kernel_size);
-	vertex_buffer = gen7_fill_vertex_buffer_data(batch, src,
+	ps_binding_table  = gen8_bind_surfaces(ibb, src, dst);
+	ps_sampler_state  = gen8_create_sampler(ibb);
+	ps_kernel_off = gen8_fill_ps(ibb, ps_kernel, ps_kernel_size);
+	vertex_buffer = gen7_fill_vertex_buffer_data(ibb, src,
 						     src_x, src_y,
 						     dst_x, dst_y,
 						     width, height);
-	cc.cc_state = gen6_create_cc_state(batch);
-	cc.blend_state = gen8_create_blend_state(batch);
-	viewport.cc_state = gen6_create_cc_viewport(batch);
-	viewport.sf_clip_state = gen7_create_sf_clip_viewport(batch);
-	scissor_state = gen6_create_scissor_rect(batch);
-
-	aux_pgtable_state = gen12_create_aux_pgtable_state(batch,
-							   aux_pgtable_bo);
-
-	/* TODO: theree is other state which isn't setup */
+	cc.cc_state = gen6_create_cc_state(ibb);
+	cc.blend_state = gen8_create_blend_state(ibb);
+	viewport.cc_state = gen6_create_cc_viewport(ibb);
+	viewport.sf_clip_state = gen7_create_sf_clip_viewport(ibb);
+	scissor_state = gen6_create_scissor_rect(ibb);
+	aux_pgtable_state = gen12_create_aux_pgtable_state(ibb, aux_pgtable_buf);
 
-	assert(batch->ptr < &batch->buffer[4095]);
-
-	batch->ptr = batch->buffer;
+	/* TODO: there is other state which isn't set up */
+	intel_bb_ptr_set(ibb, 0);
 
 	/* Start emitting the commands. The order roughly follows the mesa blorp
 	 * order */
-	OUT_BATCH(G4X_PIPELINE_SELECT | PIPELINE_SELECT_3D |
-				GEN9_PIPELINE_SELECTION_MASK);
-
-	gen12_emit_aux_pgtable_state(batch, aux_pgtable_state, true);
-
-	gen8_emit_sip(batch);
-
-	gen7_emit_push_constants(batch);
+	intel_bb_out(ibb, G4X_PIPELINE_SELECT | PIPELINE_SELECT_3D |
+		     GEN9_PIPELINE_SELECTION_MASK);
 
-	gen9_emit_state_base_address(batch);
+	gen12_emit_aux_pgtable_state(ibb, aux_pgtable_state, true);
 
-	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC);
-	OUT_BATCH(viewport.cc_state);
-	OUT_BATCH(GEN8_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP);
-	OUT_BATCH(viewport.sf_clip_state);
+	gen8_emit_sip(ibb);
 
-	gen7_emit_urb(batch);
+	gen7_emit_push_constants(ibb);
 
-	gen8_emit_cc(batch);
+	gen9_emit_state_base_address(ibb);
 
-	gen8_emit_multisample(batch);
+	intel_bb_out(ibb, GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC);
+	intel_bb_out(ibb, viewport.cc_state);
+	intel_bb_out(ibb, GEN8_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP);
+	intel_bb_out(ibb, viewport.sf_clip_state);
 
-	gen8_emit_null_state(batch);
+	gen7_emit_urb(ibb);
 
-	OUT_BATCH(GEN7_3DSTATE_STREAMOUT | (5 - 2));
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
-	OUT_BATCH(0);
+	gen8_emit_cc(ibb);
 
-	gen7_emit_clip(batch);
+	gen8_emit_multisample(ibb);
 
-	gen8_emit_sf(batch);
+	gen8_emit_null_state(ibb);
 
-	gen8_emit_ps(batch, ps_kernel_off);
+	intel_bb_out(ibb, GEN7_3DSTATE_STREAMOUT | (5 - 2));
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);
 
-	OUT_BATCH(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS);
-	OUT_BATCH(ps_binding_table);
+	gen7_emit_clip(ibb);
 
-	OUT_BATCH(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS);
-	OUT_BATCH(ps_sampler_state);
+	gen8_emit_sf(ibb);
 
-	OUT_BATCH(GEN8_3DSTATE_SCISSOR_STATE_POINTERS);
-	OUT_BATCH(scissor_state);
+	gen8_emit_ps(ibb, ps_kernel_off);
 
-	gen9_emit_depth(batch);
+	intel_bb_out(ibb, GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS);
+	intel_bb_out(ibb, ps_binding_table);
 
-	gen7_emit_clear(batch);
+	intel_bb_out(ibb, GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS);
+	intel_bb_out(ibb, ps_sampler_state);
 
-	gen6_emit_drawing_rectangle(batch, dst);
+	intel_bb_out(ibb, GEN8_3DSTATE_SCISSOR_STATE_POINTERS);
+	intel_bb_out(ibb, scissor_state);
 
-	gen7_emit_vertex_buffer(batch, vertex_buffer);
-	gen6_emit_vertex_elements(batch);
+	gen9_emit_depth(ibb);
 
-	gen8_emit_vf_topology(batch);
-	gen8_emit_primitive(batch, vertex_buffer);
+	gen7_emit_clear(ibb);
 
-	OUT_BATCH(MI_BATCH_BUFFER_END);
+	gen6_emit_drawing_rectangle(ibb, dst);
 
-	batch_end = intel_batchbuffer_align(batch, 8);
-	assert(batch_end < BATCH_STATE_SPLIT);
-	annotation_add_batch(&aub_annotations, batch_end);
+	gen7_emit_vertex_buffer(ibb, vertex_buffer);
+	gen6_emit_vertex_elements(ibb);
 
-	dump_batch(batch);
+	gen8_emit_vf_topology(ibb);
+	gen8_emit_primitive(ibb, vertex_buffer);
 
-	annotation_flush(&aub_annotations, batch);
-
-	gen6_render_flush(batch, context, batch_end);
-	intel_batchbuffer_reset(batch);
+	intel_bb_emit_bbe(ibb);
+	intel_bb_exec_with_context(ibb, intel_bb_offset(ibb), ctx,
+				   I915_EXEC_RENDER | I915_EXEC_NO_RELOC,
+				   false);
+	dump_batch(ibb);
+	intel_bb_reset(ibb, false);
 }
 
-void gen9_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y)
+void gen9_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src,
+			  unsigned int src_x, unsigned int src_y,
+			  unsigned int width, unsigned int height,
+			  struct intel_buf *dst,
+			  unsigned int dst_x, unsigned int dst_y)
 
 {
-	_gen9_render_copyfunc(batch, context, src, src_x, src_y,
+	_gen9_render_copyfunc(ibb, ctx, src, src_x, src_y,
 			  width, height, dst, dst_x, dst_y, NULL,
 			  ps_kernel_gen9, sizeof(ps_kernel_gen9));
 }
 
-void gen11_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y)
-
+void gen11_render_copyfunc(struct intel_bb *ibb,
+			   uint32_t ctx,
+			   struct intel_buf *src,
+			   unsigned int src_x, unsigned int src_y,
+			   unsigned int width, unsigned int height,
+			   struct intel_buf *dst,
+			   unsigned int dst_x, unsigned int dst_y)
 {
-	_gen9_render_copyfunc(batch, context, src, src_x, src_y,
+	_gen9_render_copyfunc(ibb, ctx, src, src_x, src_y,
 			  width, height, dst, dst_x, dst_y, NULL,
 			  ps_kernel_gen11, sizeof(ps_kernel_gen11));
 }
 
-void gen12_render_copyfunc(struct intel_batchbuffer *batch,
-			   drm_intel_context *context,
-			   const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			   unsigned width, unsigned height,
-			   const struct igt_buf *dst, unsigned dst_x, unsigned dst_y)
-
+void gen12_render_copyfunc(struct intel_bb *ibb,
+			   uint32_t ctx,
+			   struct intel_buf *src,
+			   unsigned int src_x, unsigned int src_y,
+			   unsigned int width, unsigned int height,
+			   struct intel_buf *dst,
+			   unsigned int dst_x, unsigned int dst_y)
 {
 	struct aux_pgtable_info pgtable_info = { };
 
-	gen12_aux_pgtable_init(&pgtable_info, batch->bufmgr, src, dst);
+	gen12_aux_pgtable_init(&pgtable_info, ibb, src, dst);
 
-	_gen9_render_copyfunc(batch, context, src, src_x, src_y,
+	_gen9_render_copyfunc(ibb, ctx, src, src_x, src_y,
 			  width, height, dst, dst_x, dst_y,
-			  pgtable_info.pgtable_bo,
+			  pgtable_info.pgtable_buf,
 			  gen12_render_copy,
 			  sizeof(gen12_render_copy));
 
-	gen12_aux_pgtable_cleanup(&pgtable_info);
+	gen12_aux_pgtable_cleanup(ibb, &pgtable_info);
 }
diff --git a/lib/rendercopy_i830.c b/lib/rendercopy_i830.c
index ca815122..e755706e 100644
--- a/lib/rendercopy_i830.c
+++ b/lib/rendercopy_i830.c
@@ -11,7 +11,7 @@
 #include "drm.h"
 #include "i915_drm.h"
 #include "drmtest.h"
-#include "intel_bufmgr.h"
+#include "intel_bufops.h"
 #include "intel_batchbuffer.h"
 #include "intel_io.h"
 
@@ -72,75 +72,75 @@
 #define TB0A_ARG1_SEL_TEXEL3		(9 << 6)
 
 
-static void gen2_emit_invariant(struct intel_batchbuffer *batch)
+static void gen2_emit_invariant(struct intel_bb *ibb)
 {
 	int i;
 
 	for (i = 0; i < 4; i++) {
-		OUT_BATCH(_3DSTATE_MAP_CUBE | MAP_UNIT(i));
-		OUT_BATCH(_3DSTATE_MAP_TEX_STREAM_CMD | MAP_UNIT(i) |
-			  DISABLE_TEX_STREAM_BUMP |
-			  ENABLE_TEX_STREAM_COORD_SET | TEX_STREAM_COORD_SET(i) |
-			  ENABLE_TEX_STREAM_MAP_IDX | TEX_STREAM_MAP_IDX(i));
-		OUT_BATCH(_3DSTATE_MAP_COORD_TRANSFORM);
-		OUT_BATCH(DISABLE_TEX_TRANSFORM | TEXTURE_SET(i));
+		intel_bb_out(ibb, _3DSTATE_MAP_CUBE | MAP_UNIT(i));
+		intel_bb_out(ibb, _3DSTATE_MAP_TEX_STREAM_CMD | MAP_UNIT(i) |
+			     DISABLE_TEX_STREAM_BUMP |
+			     ENABLE_TEX_STREAM_COORD_SET | TEX_STREAM_COORD_SET(i) |
+			     ENABLE_TEX_STREAM_MAP_IDX | TEX_STREAM_MAP_IDX(i));
+		intel_bb_out(ibb, _3DSTATE_MAP_COORD_TRANSFORM);
+		intel_bb_out(ibb, DISABLE_TEX_TRANSFORM | TEXTURE_SET(i));
 	}
 
-	OUT_BATCH(_3DSTATE_MAP_COORD_SETBIND_CMD);
-	OUT_BATCH(TEXBIND_SET3(TEXCOORDSRC_VTXSET_3) |
-		  TEXBIND_SET2(TEXCOORDSRC_VTXSET_2) |
-		  TEXBIND_SET1(TEXCOORDSRC_VTXSET_1) |
-		  TEXBIND_SET0(TEXCOORDSRC_VTXSET_0));
-
-	OUT_BATCH(_3DSTATE_SCISSOR_ENABLE_CMD | DISABLE_SCISSOR_RECT);
-
-	OUT_BATCH(_3DSTATE_VERTEX_TRANSFORM);
-	OUT_BATCH(DISABLE_VIEWPORT_TRANSFORM | DISABLE_PERSPECTIVE_DIVIDE);
-
-	OUT_BATCH(_3DSTATE_W_STATE_CMD);
-	OUT_BATCH(MAGIC_W_STATE_DWORD1);
-	OUT_BATCH(0x3f800000 /* 1.0 in IEEE float */ );
-
-	OUT_BATCH(_3DSTATE_INDPT_ALPHA_BLEND_CMD |
-		  DISABLE_INDPT_ALPHA_BLEND |
-		  ENABLE_ALPHA_BLENDFUNC | ABLENDFUNC_ADD);
-
-	OUT_BATCH(_3DSTATE_CONST_BLEND_COLOR_CMD);
-	OUT_BATCH(0);
-
-	OUT_BATCH(_3DSTATE_MODES_1_CMD |
-		  ENABLE_COLR_BLND_FUNC | BLENDFUNC_ADD |
-		  ENABLE_SRC_BLND_FACTOR | SRC_BLND_FACT(BLENDFACTOR_ONE) |
-		  ENABLE_DST_BLND_FACTOR | DST_BLND_FACT(BLENDFACTOR_ZERO));
-
-	OUT_BATCH(_3DSTATE_ENABLES_1_CMD |
-		  DISABLE_LOGIC_OP |
-		  DISABLE_STENCIL_TEST |
-		  DISABLE_DEPTH_BIAS |
-		  DISABLE_SPEC_ADD |
-		  DISABLE_FOG |
-		  DISABLE_ALPHA_TEST |
-		  DISABLE_DEPTH_TEST |
-		  ENABLE_COLOR_BLEND);
-
-	OUT_BATCH(_3DSTATE_ENABLES_2_CMD |
-		  DISABLE_STENCIL_WRITE |
-		  DISABLE_DITHER |
-		  DISABLE_DEPTH_WRITE |
-		  ENABLE_COLOR_MASK |
-		  ENABLE_COLOR_WRITE |
-		  ENABLE_TEX_CACHE);
+	intel_bb_out(ibb, _3DSTATE_MAP_COORD_SETBIND_CMD);
+	intel_bb_out(ibb, TEXBIND_SET3(TEXCOORDSRC_VTXSET_3) |
+		     TEXBIND_SET2(TEXCOORDSRC_VTXSET_2) |
+		     TEXBIND_SET1(TEXCOORDSRC_VTXSET_1) |
+		     TEXBIND_SET0(TEXCOORDSRC_VTXSET_0));
+
+	intel_bb_out(ibb, _3DSTATE_SCISSOR_ENABLE_CMD | DISABLE_SCISSOR_RECT);
+
+	intel_bb_out(ibb, _3DSTATE_VERTEX_TRANSFORM);
+	intel_bb_out(ibb, DISABLE_VIEWPORT_TRANSFORM | DISABLE_PERSPECTIVE_DIVIDE);
+
+	intel_bb_out(ibb, _3DSTATE_W_STATE_CMD);
+	intel_bb_out(ibb, MAGIC_W_STATE_DWORD1);
+	intel_bb_out(ibb, 0x3f800000 /* 1.0 in IEEE float */);
+
+	intel_bb_out(ibb, _3DSTATE_INDPT_ALPHA_BLEND_CMD |
+		     DISABLE_INDPT_ALPHA_BLEND |
+		     ENABLE_ALPHA_BLENDFUNC | ABLENDFUNC_ADD);
+
+	intel_bb_out(ibb, _3DSTATE_CONST_BLEND_COLOR_CMD);
+	intel_bb_out(ibb, 0);
+
+	intel_bb_out(ibb, _3DSTATE_MODES_1_CMD |
+		     ENABLE_COLR_BLND_FUNC | BLENDFUNC_ADD |
+		     ENABLE_SRC_BLND_FACTOR | SRC_BLND_FACT(BLENDFACTOR_ONE) |
+		     ENABLE_DST_BLND_FACTOR | DST_BLND_FACT(BLENDFACTOR_ZERO));
+
+	intel_bb_out(ibb, _3DSTATE_ENABLES_1_CMD |
+		     DISABLE_LOGIC_OP |
+		     DISABLE_STENCIL_TEST |
+		     DISABLE_DEPTH_BIAS |
+		     DISABLE_SPEC_ADD |
+		     DISABLE_FOG |
+		     DISABLE_ALPHA_TEST |
+		     DISABLE_DEPTH_TEST |
+		     ENABLE_COLOR_BLEND);
+
+	intel_bb_out(ibb, _3DSTATE_ENABLES_2_CMD |
+		     DISABLE_STENCIL_WRITE |
+		     DISABLE_DITHER |
+		     DISABLE_DEPTH_WRITE |
+		     ENABLE_COLOR_MASK |
+		     ENABLE_COLOR_WRITE |
+		     ENABLE_TEX_CACHE);
 }
 
-static void gen2_emit_target(struct intel_batchbuffer *batch,
-			     const struct igt_buf *dst)
+static void gen2_emit_target(struct intel_bb *ibb,
+			     const struct intel_buf *dst)
 {
 	uint32_t tiling;
 	uint32_t format;
 
 	igt_assert_lte(dst->surface[0].stride, 8192);
-	igt_assert_lte(igt_buf_width(dst), 2048);
-	igt_assert_lte(igt_buf_height(dst), 2048);
+	igt_assert_lte(intel_buf_width(dst), 2048);
+	igt_assert_lte(intel_buf_height(dst), 2048);
 
 	switch (dst->bpp) {
 		case 8: format = COLR_BUF_8BIT; break;
@@ -155,33 +155,35 @@ static void gen2_emit_target(struct intel_batchbuffer *batch,
 	if (dst->tiling == I915_TILING_Y)
 		tiling |= BUF_3D_TILE_WALK_Y;
 
-	OUT_BATCH(_3DSTATE_BUF_INFO_CMD);
-	OUT_BATCH(BUF_3D_ID_COLOR_BACK | tiling | BUF_3D_PITCH(dst->surface[0].stride));
-	OUT_RELOC(dst->bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, 0);
-
-	OUT_BATCH(_3DSTATE_DST_BUF_VARS_CMD);
-	OUT_BATCH(format |
-		  DSTORG_HORT_BIAS(0x8) |
-		  DSTORG_VERT_BIAS(0x8));
-
-	OUT_BATCH(_3DSTATE_DRAW_RECT_CMD);
-	OUT_BATCH(0);
-	OUT_BATCH(0);		/* ymin, xmin */
-	OUT_BATCH(DRAW_YMAX(igt_buf_height(dst) - 1) |
-		  DRAW_XMAX(igt_buf_width(dst) - 1));
-	OUT_BATCH(0);		/* yorig, xorig */
+	intel_bb_out(ibb, _3DSTATE_BUF_INFO_CMD);
+	intel_bb_out(ibb, BUF_3D_ID_COLOR_BACK | tiling |
+		     BUF_3D_PITCH(dst->surface[0].stride));
+	intel_bb_emit_reloc(ibb, dst->handle, I915_GEM_DOMAIN_RENDER,
+			    I915_GEM_DOMAIN_RENDER, 0, dst->addr.offset);
+
+	intel_bb_out(ibb, _3DSTATE_DST_BUF_VARS_CMD);
+	intel_bb_out(ibb, format |
+		     DSTORG_HORT_BIAS(0x8) |
+		     DSTORG_VERT_BIAS(0x8));
+
+	intel_bb_out(ibb, _3DSTATE_DRAW_RECT_CMD);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0);		/* ymin, xmin */
+	intel_bb_out(ibb, DRAW_YMAX(intel_buf_height(dst) - 1) |
+		     DRAW_XMAX(intel_buf_width(dst) - 1));
+	intel_bb_out(ibb, 0);		/* yorig, xorig */
 }
 
-static void gen2_emit_texture(struct intel_batchbuffer *batch,
-			      const struct igt_buf *src,
+static void gen2_emit_texture(struct intel_bb *ibb,
+			      const struct intel_buf *src,
 			      int unit)
 {
 	uint32_t tiling;
 	uint32_t format;
 
 	igt_assert_lte(src->surface[0].stride, 8192);
-	igt_assert_lte(igt_buf_width(src), 2048);
-	igt_assert_lte(igt_buf_height(src), 2048);
+	igt_assert_lte(intel_buf_width(src), 2048);
+	igt_assert_lte(intel_buf_height(src), 2048);
 
 	switch (src->bpp) {
 		case 8: format = MAPSURF_8BIT | MT_8BIT_L8; break;
@@ -196,78 +198,84 @@ static void gen2_emit_texture(struct intel_batchbuffer *batch,
 	if (src->tiling == I915_TILING_Y)
 		tiling |= TM0S1_TILE_WALK;
 
-	OUT_BATCH(_3DSTATE_LOAD_STATE_IMMEDIATE_2 | LOAD_TEXTURE_MAP(unit) | 4);
-	OUT_RELOC(src->bo, I915_GEM_DOMAIN_SAMPLER, 0, 0);
-	OUT_BATCH((igt_buf_height(src) - 1) << TM0S1_HEIGHT_SHIFT |
-		  (igt_buf_width(src) - 1) << TM0S1_WIDTH_SHIFT |
-		  format | tiling);
-	OUT_BATCH((src->surface[0].stride / 4 - 1) << TM0S2_PITCH_SHIFT | TM0S2_MAP_2D);
-	OUT_BATCH(FILTER_NEAREST << TM0S3_MAG_FILTER_SHIFT |
-		  FILTER_NEAREST << TM0S3_MIN_FILTER_SHIFT |
-		  MIPFILTER_NONE << TM0S3_MIP_FILTER_SHIFT);
-	OUT_BATCH(0);	/* default color */
-
-	OUT_BATCH(_3DSTATE_MAP_COORD_SET_CMD | TEXCOORD_SET(unit) |
-		  ENABLE_TEXCOORD_PARAMS | TEXCOORDS_ARE_NORMAL |
-		  TEXCOORDTYPE_CARTESIAN |
-		  ENABLE_ADDR_V_CNTL | TEXCOORD_ADDR_V_MODE(TEXCOORDMODE_CLAMP_BORDER) |
-		  ENABLE_ADDR_U_CNTL | TEXCOORD_ADDR_U_MODE(TEXCOORDMODE_CLAMP_BORDER));
+	intel_bb_out(ibb, _3DSTATE_LOAD_STATE_IMMEDIATE_2 | LOAD_TEXTURE_MAP(unit) | 4);
+	intel_bb_emit_reloc(ibb, src->handle, I915_GEM_DOMAIN_SAMPLER, 0, 0,
+			    src->addr.offset);
+	intel_bb_out(ibb, (intel_buf_height(src) - 1) << TM0S1_HEIGHT_SHIFT |
+		     (intel_buf_width(src) - 1) << TM0S1_WIDTH_SHIFT |
+		     format | tiling);
+	intel_bb_out(ibb, (src->surface[0].stride / 4 - 1) << TM0S2_PITCH_SHIFT | TM0S2_MAP_2D);
+	intel_bb_out(ibb, FILTER_NEAREST << TM0S3_MAG_FILTER_SHIFT |
+		     FILTER_NEAREST << TM0S3_MIN_FILTER_SHIFT |
+		     MIPFILTER_NONE << TM0S3_MIP_FILTER_SHIFT);
+	intel_bb_out(ibb, 0);	/* default color */
+
+	intel_bb_out(ibb, _3DSTATE_MAP_COORD_SET_CMD | TEXCOORD_SET(unit) |
+		     ENABLE_TEXCOORD_PARAMS | TEXCOORDS_ARE_NORMAL |
+		     TEXCOORDTYPE_CARTESIAN |
+		     ENABLE_ADDR_V_CNTL | TEXCOORD_ADDR_V_MODE(TEXCOORDMODE_CLAMP_BORDER) |
+		     ENABLE_ADDR_U_CNTL | TEXCOORD_ADDR_U_MODE(TEXCOORDMODE_CLAMP_BORDER));
 }
 
-static void gen2_emit_copy_pipeline(struct intel_batchbuffer *batch)
+static void gen2_emit_copy_pipeline(struct intel_bb *ibb)
 {
-	OUT_BATCH(_3DSTATE_INDPT_ALPHA_BLEND_CMD | DISABLE_INDPT_ALPHA_BLEND);
-	OUT_BATCH(_3DSTATE_ENABLES_1_CMD | DISABLE_LOGIC_OP |
-		  DISABLE_STENCIL_TEST | DISABLE_DEPTH_BIAS |
-		  DISABLE_SPEC_ADD | DISABLE_FOG | DISABLE_ALPHA_TEST |
-		  DISABLE_COLOR_BLEND | DISABLE_DEPTH_TEST);
-
-	OUT_BATCH(_3DSTATE_LOAD_STATE_IMMEDIATE_2 |
-		  LOAD_TEXTURE_BLEND_STAGE(0) | 1);
-	OUT_BATCH(TB0C_LAST_STAGE | TB0C_RESULT_SCALE_1X |
-		  TB0C_OUTPUT_WRITE_CURRENT |
-		  TB0C_OP_ARG1 | TB0C_ARG1_SEL_TEXEL0);
-	OUT_BATCH(TB0A_RESULT_SCALE_1X | TB0A_OUTPUT_WRITE_CURRENT |
-		  TB0A_OP_ARG1 | TB0A_ARG1_SEL_TEXEL0);
+	intel_bb_out(ibb, _3DSTATE_INDPT_ALPHA_BLEND_CMD | DISABLE_INDPT_ALPHA_BLEND);
+	intel_bb_out(ibb, _3DSTATE_ENABLES_1_CMD | DISABLE_LOGIC_OP |
+		     DISABLE_STENCIL_TEST | DISABLE_DEPTH_BIAS |
+		     DISABLE_SPEC_ADD | DISABLE_FOG | DISABLE_ALPHA_TEST |
+		     DISABLE_COLOR_BLEND | DISABLE_DEPTH_TEST);
+
+	intel_bb_out(ibb, _3DSTATE_LOAD_STATE_IMMEDIATE_2 |
+		     LOAD_TEXTURE_BLEND_STAGE(0) | 1);
+	intel_bb_out(ibb, TB0C_LAST_STAGE | TB0C_RESULT_SCALE_1X |
+		     TB0C_OUTPUT_WRITE_CURRENT |
+		     TB0C_OP_ARG1 | TB0C_ARG1_SEL_TEXEL0);
+	intel_bb_out(ibb, TB0A_RESULT_SCALE_1X | TB0A_OUTPUT_WRITE_CURRENT |
+		     TB0A_OP_ARG1 | TB0A_ARG1_SEL_TEXEL0);
 }
 
-void gen2_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y)
+void gen2_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src,
+			  uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst,
+			  uint32_t dst_x, uint32_t dst_y)
 {
 	igt_assert(src->bpp == dst->bpp);
 
-	gen2_emit_invariant(batch);
-	gen2_emit_copy_pipeline(batch);
+	intel_bb_add_intel_buf(ibb, dst, true);
+	intel_bb_add_intel_buf(ibb, src, false);
 
-	gen2_emit_target(batch, dst);
-	gen2_emit_texture(batch, src, 0);
+	gen2_emit_invariant(ibb);
+	gen2_emit_copy_pipeline(ibb);
 
-	OUT_BATCH(_3DSTATE_LOAD_STATE_IMMEDIATE_1 |
-		  I1_LOAD_S(2) | I1_LOAD_S(3) | I1_LOAD_S(8) | 2);
-	OUT_BATCH(1<<12);
-	OUT_BATCH(S3_CULLMODE_NONE | S3_VERTEXHAS_XY);
-	OUT_BATCH(S8_ENABLE_COLOR_BUFFER_WRITE);
+	gen2_emit_target(ibb, dst);
+	gen2_emit_texture(ibb, src, 0);
 
-	OUT_BATCH(_3DSTATE_VERTEX_FORMAT_2_CMD | TEXCOORDFMT_2D << 0);
+	intel_bb_out(ibb, _3DSTATE_LOAD_STATE_IMMEDIATE_1 |
+		     I1_LOAD_S(2) | I1_LOAD_S(3) | I1_LOAD_S(8) | 2);
+	intel_bb_out(ibb, 1<<12);
+	intel_bb_out(ibb, S3_CULLMODE_NONE | S3_VERTEXHAS_XY);
+	intel_bb_out(ibb, S8_ENABLE_COLOR_BUFFER_WRITE);
 
-	OUT_BATCH(PRIM3D_INLINE | PRIM3D_RECTLIST | (3*4 -1));
-	emit_vertex(batch, dst_x + width);
-	emit_vertex(batch, dst_y + height);
-	emit_vertex_normalized(batch, src_x + width, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y + height, igt_buf_height(src));
+	intel_bb_out(ibb, _3DSTATE_VERTEX_FORMAT_2_CMD | TEXCOORDFMT_2D << 0);
 
-	emit_vertex(batch, dst_x);
-	emit_vertex(batch, dst_y + height);
-	emit_vertex_normalized(batch, src_x, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y + height, igt_buf_height(src));
+	intel_bb_out(ibb, PRIM3D_INLINE | PRIM3D_RECTLIST | (3*4 - 1));
+	emit_vertex(ibb, dst_x + width);
+	emit_vertex(ibb, dst_y + height);
+	emit_vertex_normalized(ibb, src_x + width, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y + height, intel_buf_height(src));
 
-	emit_vertex(batch, dst_x);
-	emit_vertex(batch, dst_y);
-	emit_vertex_normalized(batch, src_x, igt_buf_width(src));
-	emit_vertex_normalized(batch, src_y, igt_buf_height(src));
+	emit_vertex(ibb, dst_x);
+	emit_vertex(ibb, dst_y + height);
+	emit_vertex_normalized(ibb, src_x, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y + height, intel_buf_height(src));
 
-	intel_batchbuffer_flush(batch);
+	emit_vertex(ibb, dst_x);
+	emit_vertex(ibb, dst_y);
+	emit_vertex_normalized(ibb, src_x, intel_buf_width(src));
+	emit_vertex_normalized(ibb, src_y, intel_buf_height(src));
+
+	intel_bb_flush_blit_with_context(ibb, ctx);
 }
diff --git a/lib/rendercopy_i915.c b/lib/rendercopy_i915.c
index 56e1863e..b16d4f12 100644
--- a/lib/rendercopy_i915.c
+++ b/lib/rendercopy_i915.c
@@ -11,7 +11,7 @@
 #include "drm.h"
 #include "i915_drm.h"
 #include "drmtest.h"
-#include "intel_bufmgr.h"
+#include "intel_bufops.h"
 #include "intel_batchbuffer.h"
 #include "intel_io.h"
 
@@ -19,68 +19,73 @@
 #include "i915_3d.h"
 #include "rendercopy.h"
 
-void gen3_render_copyfunc(struct intel_batchbuffer *batch,
-			  drm_intel_context *context,
-			  const struct igt_buf *src, unsigned src_x, unsigned src_y,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst, unsigned dst_x, unsigned dst_y)
+void gen3_render_copyfunc(struct intel_bb *ibb,
+			  uint32_t ctx,
+			  struct intel_buf *src,
+			  uint32_t src_x, uint32_t src_y,
+			  uint32_t width, uint32_t height,
+			  struct intel_buf *dst,
+			  uint32_t dst_x, uint32_t dst_y)
 {
 	igt_assert(src->bpp == dst->bpp);
 
+	intel_bb_add_intel_buf(ibb, dst, true);
+	intel_bb_add_intel_buf(ibb, src, false);
+
 	/* invariant state */
 	{
-		OUT_BATCH(_3DSTATE_AA_CMD |
-			  AA_LINE_ECAAR_WIDTH_ENABLE |
-			  AA_LINE_ECAAR_WIDTH_1_0 |
-			  AA_LINE_REGION_WIDTH_ENABLE | AA_LINE_REGION_WIDTH_1_0);
-		OUT_BATCH(_3DSTATE_INDEPENDENT_ALPHA_BLEND_CMD |
-			  IAB_MODIFY_ENABLE |
-			  IAB_MODIFY_FUNC | (BLENDFUNC_ADD << IAB_FUNC_SHIFT) |
-			  IAB_MODIFY_SRC_FACTOR | (BLENDFACT_ONE <<
-						   IAB_SRC_FACTOR_SHIFT) |
-			  IAB_MODIFY_DST_FACTOR | (BLENDFACT_ZERO <<
-						   IAB_DST_FACTOR_SHIFT));
-		OUT_BATCH(_3DSTATE_DFLT_DIFFUSE_CMD);
-		OUT_BATCH(0);
-		OUT_BATCH(_3DSTATE_DFLT_SPEC_CMD);
-		OUT_BATCH(0);
-		OUT_BATCH(_3DSTATE_DFLT_Z_CMD);
-		OUT_BATCH(0);
-		OUT_BATCH(_3DSTATE_COORD_SET_BINDINGS |
-			  CSB_TCB(0, 0) |
-			  CSB_TCB(1, 1) |
-			  CSB_TCB(2, 2) |
-			  CSB_TCB(3, 3) |
-			  CSB_TCB(4, 4) |
-			  CSB_TCB(5, 5) | CSB_TCB(6, 6) | CSB_TCB(7, 7));
-		OUT_BATCH(_3DSTATE_RASTER_RULES_CMD |
-			  ENABLE_POINT_RASTER_RULE |
-			  OGL_POINT_RASTER_RULE |
-			  ENABLE_LINE_STRIP_PROVOKE_VRTX |
-			  ENABLE_TRI_FAN_PROVOKE_VRTX |
-			  LINE_STRIP_PROVOKE_VRTX(1) |
-			  TRI_FAN_PROVOKE_VRTX(2) | ENABLE_TEXKILL_3D_4D | TEXKILL_4D);
-		OUT_BATCH(_3DSTATE_MODES_4_CMD |
-			  ENABLE_LOGIC_OP_FUNC | LOGIC_OP_FUNC(LOGICOP_COPY) |
-			  ENABLE_STENCIL_WRITE_MASK | STENCIL_WRITE_MASK(0xff) |
-			  ENABLE_STENCIL_TEST_MASK | STENCIL_TEST_MASK(0xff));
-		OUT_BATCH(_3DSTATE_LOAD_STATE_IMMEDIATE_1 | I1_LOAD_S(3) | I1_LOAD_S(4) | I1_LOAD_S(5) | 2);
-		OUT_BATCH(0x00000000);	/* Disable texture coordinate wrap-shortest */
-		OUT_BATCH((1 << S4_POINT_WIDTH_SHIFT) |
-			  S4_LINE_WIDTH_ONE |
-			  S4_CULLMODE_NONE |
-			  S4_VFMT_XY);
-		OUT_BATCH(0x00000000);	/* Stencil. */
-		OUT_BATCH(_3DSTATE_SCISSOR_ENABLE_CMD | DISABLE_SCISSOR_RECT);
-		OUT_BATCH(_3DSTATE_SCISSOR_RECT_0_CMD);
-		OUT_BATCH(0);
-		OUT_BATCH(0);
-		OUT_BATCH(_3DSTATE_DEPTH_SUBRECT_DISABLE);
-		OUT_BATCH(_3DSTATE_LOAD_INDIRECT | 0);	/* disable indirect state */
-		OUT_BATCH(0);
-		OUT_BATCH(_3DSTATE_STIPPLE);
-		OUT_BATCH(0x00000000);
-		OUT_BATCH(_3DSTATE_BACKFACE_STENCIL_OPS | BFO_ENABLE_STENCIL_TWO_SIDE | 0);
+		intel_bb_out(ibb, _3DSTATE_AA_CMD |
+			     AA_LINE_ECAAR_WIDTH_ENABLE |
+			     AA_LINE_ECAAR_WIDTH_1_0 |
+			     AA_LINE_REGION_WIDTH_ENABLE | AA_LINE_REGION_WIDTH_1_0);
+		intel_bb_out(ibb, _3DSTATE_INDEPENDENT_ALPHA_BLEND_CMD |
+			     IAB_MODIFY_ENABLE |
+			     IAB_MODIFY_FUNC | (BLENDFUNC_ADD << IAB_FUNC_SHIFT) |
+			     IAB_MODIFY_SRC_FACTOR | (BLENDFACT_ONE <<
+						      IAB_SRC_FACTOR_SHIFT) |
+			     IAB_MODIFY_DST_FACTOR | (BLENDFACT_ZERO <<
+						      IAB_DST_FACTOR_SHIFT));
+		intel_bb_out(ibb, _3DSTATE_DFLT_DIFFUSE_CMD);
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, _3DSTATE_DFLT_SPEC_CMD);
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, _3DSTATE_DFLT_Z_CMD);
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, _3DSTATE_COORD_SET_BINDINGS |
+			     CSB_TCB(0, 0) |
+			     CSB_TCB(1, 1) |
+			     CSB_TCB(2, 2) |
+			     CSB_TCB(3, 3) |
+			     CSB_TCB(4, 4) |
+			     CSB_TCB(5, 5) | CSB_TCB(6, 6) | CSB_TCB(7, 7));
+		intel_bb_out(ibb, _3DSTATE_RASTER_RULES_CMD |
+			     ENABLE_POINT_RASTER_RULE |
+			     OGL_POINT_RASTER_RULE |
+			     ENABLE_LINE_STRIP_PROVOKE_VRTX |
+			     ENABLE_TRI_FAN_PROVOKE_VRTX |
+			     LINE_STRIP_PROVOKE_VRTX(1) |
+			     TRI_FAN_PROVOKE_VRTX(2) | ENABLE_TEXKILL_3D_4D | TEXKILL_4D);
+		intel_bb_out(ibb, _3DSTATE_MODES_4_CMD |
+			     ENABLE_LOGIC_OP_FUNC | LOGIC_OP_FUNC(LOGICOP_COPY) |
+			     ENABLE_STENCIL_WRITE_MASK | STENCIL_WRITE_MASK(0xff) |
+			     ENABLE_STENCIL_TEST_MASK | STENCIL_TEST_MASK(0xff));
+		intel_bb_out(ibb, _3DSTATE_LOAD_STATE_IMMEDIATE_1 | I1_LOAD_S(3) | I1_LOAD_S(4) | I1_LOAD_S(5) | 2);
+		intel_bb_out(ibb, 0x00000000);	/* Disable texture coordinate wrap-shortest */
+		intel_bb_out(ibb, (1 << S4_POINT_WIDTH_SHIFT) |
+			     S4_LINE_WIDTH_ONE |
+			     S4_CULLMODE_NONE |
+			     S4_VFMT_XY);
+		intel_bb_out(ibb, 0x00000000);	/* Stencil. */
+		intel_bb_out(ibb, _3DSTATE_SCISSOR_ENABLE_CMD | DISABLE_SCISSOR_RECT);
+		intel_bb_out(ibb, _3DSTATE_SCISSOR_RECT_0_CMD);
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, _3DSTATE_DEPTH_SUBRECT_DISABLE);
+		intel_bb_out(ibb, _3DSTATE_LOAD_INDIRECT | 0);	/* disable indirect state */
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, _3DSTATE_STIPPLE);
+		intel_bb_out(ibb, 0x00000000);
+		intel_bb_out(ibb, _3DSTATE_BACKFACE_STENCIL_OPS | BFO_ENABLE_STENCIL_TWO_SIDE | 0);
 	}
 
 	/* samler state */
@@ -89,8 +94,8 @@ void gen3_render_copyfunc(struct intel_batchbuffer *batch,
 		uint32_t format_bits, tiling_bits = 0;
 
 		igt_assert_lte(src->surface[0].stride, 8192);
-		igt_assert_lte(igt_buf_width(src), 2048);
-		igt_assert_lte(igt_buf_height(src), 2048);
+		igt_assert_lte(intel_buf_width(src), 2048);
+		igt_assert_lte(intel_buf_height(src), 2048);
 
 		if (src->tiling != I915_TILING_NONE)
 			tiling_bits = MS3_TILED_SURFACE;
@@ -104,23 +109,25 @@ void gen3_render_copyfunc(struct intel_batchbuffer *batch,
 			default: igt_assert(0);
 		}
 
-		OUT_BATCH(_3DSTATE_MAP_STATE | (3 * TEX_COUNT));
-		OUT_BATCH((1 << TEX_COUNT) - 1);
-		OUT_RELOC(src->bo, I915_GEM_DOMAIN_SAMPLER, 0, 0);
-		OUT_BATCH(format_bits | tiling_bits |
-			  (igt_buf_height(src) - 1) << MS3_HEIGHT_SHIFT |
-			  (igt_buf_width(src) - 1) << MS3_WIDTH_SHIFT);
-		OUT_BATCH((src->surface[0].stride/4-1) << MS4_PITCH_SHIFT);
-
-		OUT_BATCH(_3DSTATE_SAMPLER_STATE | (3 * TEX_COUNT));
-		OUT_BATCH((1 << TEX_COUNT) - 1);
-		OUT_BATCH(MIPFILTER_NONE << SS2_MIP_FILTER_SHIFT |
-			  FILTER_NEAREST << SS2_MAG_FILTER_SHIFT |
-			  FILTER_NEAREST << SS2_MIN_FILTER_SHIFT);
-		OUT_BATCH(TEXCOORDMODE_WRAP << SS3_TCX_ADDR_MODE_SHIFT |
-			  TEXCOORDMODE_WRAP << SS3_TCY_ADDR_MODE_SHIFT |
-			  0 << SS3_TEXTUREMAP_INDEX_SHIFT);
-		OUT_BATCH(0x00000000);
+		intel_bb_out(ibb, _3DSTATE_MAP_STATE | (3 * TEX_COUNT));
+		intel_bb_out(ibb, (1 << TEX_COUNT) - 1);
+		intel_bb_emit_reloc(ibb, src->handle,
+				    I915_GEM_DOMAIN_SAMPLER, 0,
+				    0, src->addr.offset);
+		intel_bb_out(ibb, format_bits | tiling_bits |
+			     (intel_buf_height(src) - 1) << MS3_HEIGHT_SHIFT |
+			     (intel_buf_width(src) - 1) << MS3_WIDTH_SHIFT);
+		intel_bb_out(ibb, (src->surface[0].stride/4-1) << MS4_PITCH_SHIFT);
+
+		intel_bb_out(ibb, _3DSTATE_SAMPLER_STATE | (3 * TEX_COUNT));
+		intel_bb_out(ibb, (1 << TEX_COUNT) - 1);
+		intel_bb_out(ibb, MIPFILTER_NONE << SS2_MIP_FILTER_SHIFT |
+			     FILTER_NEAREST << SS2_MAG_FILTER_SHIFT |
+			     FILTER_NEAREST << SS2_MIN_FILTER_SHIFT);
+		intel_bb_out(ibb, TEXCOORDMODE_WRAP << SS3_TCX_ADDR_MODE_SHIFT |
+			     TEXCOORDMODE_WRAP << SS3_TCY_ADDR_MODE_SHIFT |
+			     0 << SS3_TEXTUREMAP_INDEX_SHIFT);
+		intel_bb_out(ibb, 0x00000000);
 	}
 
 	/* render target state */
@@ -129,8 +136,8 @@ void gen3_render_copyfunc(struct intel_batchbuffer *batch,
 		uint32_t format_bits;
 
 		igt_assert_lte(dst->surface[0].stride, 8192);
-		igt_assert_lte(igt_buf_width(dst), 2048);
-		igt_assert_lte(igt_buf_height(dst), 2048);
+		igt_assert_lte(intel_buf_width(dst), 2048);
+		igt_assert_lte(intel_buf_height(dst), 2048);
 
 		switch (dst->bpp) {
 			case 8: format_bits = COLR_BUF_8BIT; break;
@@ -144,81 +151,83 @@ void gen3_render_copyfunc(struct intel_batchbuffer *batch,
 		if (dst->tiling == I915_TILING_Y)
 			tiling_bits |= BUF_3D_TILE_WALK_Y;
 
-		OUT_BATCH(_3DSTATE_BUF_INFO_CMD);
-		OUT_BATCH(BUF_3D_ID_COLOR_BACK | tiling_bits |
-			  BUF_3D_PITCH(dst->surface[0].stride));
-		OUT_RELOC(dst->bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, 0);
+		intel_bb_out(ibb, _3DSTATE_BUF_INFO_CMD);
+		intel_bb_out(ibb, BUF_3D_ID_COLOR_BACK | tiling_bits |
+			     BUF_3D_PITCH(dst->surface[0].stride));
+		intel_bb_emit_reloc(ibb, dst->handle,
+				    I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
+				    0, dst->addr.offset);
 
-		OUT_BATCH(_3DSTATE_DST_BUF_VARS_CMD);
-		OUT_BATCH(format_bits |
-			  DSTORG_HORT_BIAS(0x8) |
-			  DSTORG_VERT_BIAS(0x8));
+		intel_bb_out(ibb, _3DSTATE_DST_BUF_VARS_CMD);
+		intel_bb_out(ibb, format_bits |
+			     DSTORG_HORT_BIAS(0x8) |
+			     DSTORG_VERT_BIAS(0x8));
 
 		/* draw rect is unconditional */
-		OUT_BATCH(_3DSTATE_DRAW_RECT_CMD);
-		OUT_BATCH(0x00000000);
-		OUT_BATCH(0x00000000);	/* ymin, xmin */
-		OUT_BATCH(DRAW_YMAX(igt_buf_height(dst) - 1) |
-			  DRAW_XMAX(igt_buf_width(dst) - 1));
+		intel_bb_out(ibb, _3DSTATE_DRAW_RECT_CMD);
+		intel_bb_out(ibb, 0x00000000);
+		intel_bb_out(ibb, 0x00000000);	/* ymin, xmin */
+		intel_bb_out(ibb, DRAW_YMAX(intel_buf_height(dst) - 1) |
+			     DRAW_XMAX(intel_buf_width(dst) - 1));
 		/* yorig, xorig (relate to color buffer?) */
-		OUT_BATCH(0x00000000);
+		intel_bb_out(ibb, 0x00000000);
 	}
 
 	/* texfmt */
 	{
-		OUT_BATCH(_3DSTATE_LOAD_STATE_IMMEDIATE_1 |
-			  I1_LOAD_S(1) | I1_LOAD_S(2) | I1_LOAD_S(6) | 2);
-		OUT_BATCH((4 << S1_VERTEX_WIDTH_SHIFT) |
-			  (4 << S1_VERTEX_PITCH_SHIFT));
-		OUT_BATCH(~S2_TEXCOORD_FMT(0, TEXCOORDFMT_NOT_PRESENT) | S2_TEXCOORD_FMT(0, TEXCOORDFMT_2D));
-		OUT_BATCH(S6_CBUF_BLEND_ENABLE | S6_COLOR_WRITE_ENABLE |
-			  BLENDFUNC_ADD << S6_CBUF_BLEND_FUNC_SHIFT |
-			  BLENDFACT_ONE << S6_CBUF_SRC_BLEND_FACT_SHIFT |
-			  BLENDFACT_ZERO << S6_CBUF_DST_BLEND_FACT_SHIFT);
+		intel_bb_out(ibb, _3DSTATE_LOAD_STATE_IMMEDIATE_1 |
+			     I1_LOAD_S(1) | I1_LOAD_S(2) | I1_LOAD_S(6) | 2);
+		intel_bb_out(ibb, (4 << S1_VERTEX_WIDTH_SHIFT) |
+			     (4 << S1_VERTEX_PITCH_SHIFT));
+		intel_bb_out(ibb, ~S2_TEXCOORD_FMT(0, TEXCOORDFMT_NOT_PRESENT) | S2_TEXCOORD_FMT(0, TEXCOORDFMT_2D));
+		intel_bb_out(ibb, S6_CBUF_BLEND_ENABLE | S6_COLOR_WRITE_ENABLE |
+			     BLENDFUNC_ADD << S6_CBUF_BLEND_FUNC_SHIFT |
+			     BLENDFACT_ONE << S6_CBUF_SRC_BLEND_FACT_SHIFT |
+			     BLENDFACT_ZERO << S6_CBUF_DST_BLEND_FACT_SHIFT);
 	}
 
 	/* fragment shader */
 	{
-		OUT_BATCH(_3DSTATE_PIXEL_SHADER_PROGRAM | (1 + 3*3 - 2));
+		intel_bb_out(ibb, _3DSTATE_PIXEL_SHADER_PROGRAM | (1 + 3*3 - 2));
 		/* decl FS_T0 */
-		OUT_BATCH(D0_DCL |
-			  REG_TYPE(FS_T0) << D0_TYPE_SHIFT |
-			  REG_NR(FS_T0) << D0_NR_SHIFT |
-			  ((REG_TYPE(FS_T0) != REG_TYPE_S) ? D0_CHANNEL_ALL : 0));
-		OUT_BATCH(0);
-		OUT_BATCH(0);
+		intel_bb_out(ibb, D0_DCL |
+			     REG_TYPE(FS_T0) << D0_TYPE_SHIFT |
+			     REG_NR(FS_T0) << D0_NR_SHIFT |
+			     ((REG_TYPE(FS_T0) != REG_TYPE_S) ? D0_CHANNEL_ALL : 0));
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, 0);
 		/* decl FS_S0 */
-		OUT_BATCH(D0_DCL |
-			  (REG_TYPE(FS_S0) << D0_TYPE_SHIFT) |
-			  (REG_NR(FS_S0) << D0_NR_SHIFT) |
-			  ((REG_TYPE(FS_S0) != REG_TYPE_S) ? D0_CHANNEL_ALL : 0));
-		OUT_BATCH(0);
-		OUT_BATCH(0);
+		intel_bb_out(ibb, D0_DCL |
+			     (REG_TYPE(FS_S0) << D0_TYPE_SHIFT) |
+			     (REG_NR(FS_S0) << D0_NR_SHIFT) |
+			     ((REG_TYPE(FS_S0) != REG_TYPE_S) ? D0_CHANNEL_ALL : 0));
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, 0);
 		/* texld(FS_OC, FS_S0, FS_T0) */
-		OUT_BATCH(T0_TEXLD |
-			  (REG_TYPE(FS_OC) << T0_DEST_TYPE_SHIFT) |
-			  (REG_NR(FS_OC) << T0_DEST_NR_SHIFT) |
-			  (REG_NR(FS_S0) << T0_SAMPLER_NR_SHIFT));
-		OUT_BATCH((REG_TYPE(FS_T0) << T1_ADDRESS_REG_TYPE_SHIFT) |
-			  (REG_NR(FS_T0) << T1_ADDRESS_REG_NR_SHIFT));
-		OUT_BATCH(0);
+		intel_bb_out(ibb, T0_TEXLD |
+			     (REG_TYPE(FS_OC) << T0_DEST_TYPE_SHIFT) |
+			     (REG_NR(FS_OC) << T0_DEST_NR_SHIFT) |
+			     (REG_NR(FS_S0) << T0_SAMPLER_NR_SHIFT));
+		intel_bb_out(ibb, (REG_TYPE(FS_T0) << T1_ADDRESS_REG_TYPE_SHIFT) |
+			     (REG_NR(FS_T0) << T1_ADDRESS_REG_NR_SHIFT));
+		intel_bb_out(ibb, 0);
 	}
 
-	OUT_BATCH(PRIM3D_RECTLIST | (3*4 - 1));
-	emit_vertex(batch, dst_x + width);
-	emit_vertex(batch, dst_y + height);
-	emit_vertex(batch, src_x + width);
-	emit_vertex(batch, src_y + height);
+	intel_bb_out(ibb, PRIM3D_RECTLIST | (3*4 - 1));
+	emit_vertex(ibb, dst_x + width);
+	emit_vertex(ibb, dst_y + height);
+	emit_vertex(ibb, src_x + width);
+	emit_vertex(ibb, src_y + height);
 
-	emit_vertex(batch, dst_x);
-	emit_vertex(batch, dst_y + height);
-	emit_vertex(batch, src_x);
-	emit_vertex(batch, src_y + height);
+	emit_vertex(ibb, dst_x);
+	emit_vertex(ibb, dst_y + height);
+	emit_vertex(ibb, src_x);
+	emit_vertex(ibb, src_y + height);
 
-	emit_vertex(batch, dst_x);
-	emit_vertex(batch, dst_y);
-	emit_vertex(batch, src_x);
-	emit_vertex(batch, src_y);
+	emit_vertex(ibb, dst_x);
+	emit_vertex(ibb, dst_y);
+	emit_vertex(ibb, src_x);
+	emit_vertex(ibb, src_y);
 
-	intel_batchbuffer_flush(batch);
+	intel_bb_flush_blit_with_context(ibb, ctx);
 }
diff --git a/lib/veboxcopy.h b/lib/veboxcopy.h
index 949d83bf..925b8f52 100644
--- a/lib/veboxcopy.h
+++ b/lib/veboxcopy.h
@@ -1,9 +1,9 @@
 #ifndef __VEBOXCOPY_H__
 #define __VEBOXCOPY_H__
 
-void gen12_vebox_copyfunc(struct intel_batchbuffer *batch,
-			  const struct igt_buf *src,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst);
+void gen12_vebox_copyfunc(struct intel_bb *ibb,
+			  struct intel_buf *src,
+			  unsigned int width, unsigned int height,
+			  struct intel_buf *dst);
 
 #endif
diff --git a/lib/veboxcopy_gen12.c b/lib/veboxcopy_gen12.c
index 237c43f2..a44e2bff 100644
--- a/lib/veboxcopy_gen12.c
+++ b/lib/veboxcopy_gen12.c
@@ -144,7 +144,7 @@ static bool format_is_interleaved_yuv(int format)
 	return false;
 }
 
-static void emit_surface_state_cmd(struct intel_batchbuffer *batch,
+static void emit_surface_state_cmd(struct intel_bb *ibb,
 				   int surface_id,
 				   int width, int height, int bpp,
 				   int pitch, uint32_t tiling, int format,
@@ -152,7 +152,7 @@ static void emit_surface_state_cmd(struct intel_batchbuffer *batch,
 {
 	struct vebox_surface_state *ss;
 
-	ss = intel_batchbuffer_subdata_alloc(batch, sizeof(*ss), 4);
+	ss = intel_bb_ptr_align(ibb, 4);
 
 	ss->ss0.cmd_type = 3;
 	ss->ss0.media_cmd_pipeline = 2;
@@ -175,21 +175,19 @@ static void emit_surface_state_cmd(struct intel_batchbuffer *batch,
 	ss->ss4.u_y_offset = uv_offset / pitch;
 
 	ss->ss7.derived_surface_pitch = pitch - 1;
+
+	intel_bb_ptr_add(ibb, sizeof(*ss));
 }
 
-static void emit_tiling_convert_cmd(struct intel_batchbuffer *batch,
-				    drm_intel_bo *input_bo,
-				    uint32_t input_tiling,
-				    uint32_t input_compression,
-				    drm_intel_bo *output_bo,
-				    uint32_t output_tiling,
-				    uint32_t output_compression)
+static void emit_tiling_convert_cmd(struct intel_bb *ibb,
+				    struct intel_buf *src,
+				    struct intel_buf *dst)
 {
-	uint32_t reloc_delta;
+	uint32_t reloc_delta, tc_offset, offset;
 	struct vebox_tiling_convert *tc;
-	int ret;
 
-	tc = intel_batchbuffer_subdata_alloc(batch, sizeof(*tc), 8);
+	tc = intel_bb_ptr_align(ibb, 8);
+	tc_offset = intel_bb_offset(ibb);
 
 	tc->tc0.cmd_type = 3;
 	tc->tc0.pipeline = 2;
@@ -198,71 +196,70 @@ static void emit_tiling_convert_cmd(struct intel_batchbuffer *batch,
 
 	tc->tc0.dw_count = 3;
 
-	if (input_compression != I915_COMPRESSION_NONE) {
+	if (src->compression != I915_COMPRESSION_NONE) {
 		tc->tc1_2.input_memory_compression_enable = 1;
 		tc->tc1_2.input_compression_type =
-			input_compression == I915_COMPRESSION_RENDER;
+			src->compression == I915_COMPRESSION_RENDER;
 	}
-	tc->tc1_2.input_tiled_resource_mode = input_tiling == I915_TILING_Yf;
+	tc->tc1_2.input_tiled_resource_mode = src->tiling == I915_TILING_Yf;
 	reloc_delta = tc->tc1_2_l;
 
-	igt_assert(input_bo->offset64 == ALIGN(input_bo->offset64, 0x1000));
-	tc->tc1_2.input_address = input_bo->offset64 >> 12;
+	igt_assert(src->addr.offset == ALIGN(src->addr.offset, 0x1000));
+	tc->tc1_2.input_address = src->addr.offset >> 12;
 	igt_assert(reloc_delta <= INT32_MAX);
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				      intel_batchbuffer_subdata_offset(batch, tc) +
-					offsetof(typeof(*tc), tc1_2),
-				      input_bo, reloc_delta,
-				      0, 0);
-	igt_assert(ret == 0);
-
-	if (output_compression != I915_COMPRESSION_NONE) {
+
+	offset = tc_offset + offsetof(typeof(*tc), tc1_2);
+	intel_bb_offset_reloc_with_delta(ibb, src->handle, 0, 0,
+					 reloc_delta, offset,
+					 src->addr.offset);
+
+	if (dst->compression != I915_COMPRESSION_NONE) {
 		tc->tc3_4.output_memory_compression_enable = 1;
 		tc->tc3_4.output_compression_type =
-			output_compression == I915_COMPRESSION_RENDER;
+			dst->compression == I915_COMPRESSION_RENDER;
 	}
-	tc->tc3_4.output_tiled_resource_mode = output_tiling == I915_TILING_Yf;
+	tc->tc3_4.output_tiled_resource_mode = dst->tiling == I915_TILING_Yf;
 	reloc_delta = tc->tc3_4_l;
 
-	igt_assert(output_bo->offset64 == ALIGN(output_bo->offset64, 0x1000));
-	tc->tc3_4.output_address = output_bo->offset64 >> 12;
+	igt_assert(dst->addr.offset == ALIGN(dst->addr.offset, 0x1000));
+	tc->tc3_4.output_address = dst->addr.offset >> 12;
 	igt_assert(reloc_delta <= INT32_MAX);
-	ret = drm_intel_bo_emit_reloc(batch->bo,
-				      intel_batchbuffer_subdata_offset(batch, tc) +
-					offsetof(typeof(*tc), tc3_4),
-				      output_bo, reloc_delta,
-				      0, I915_GEM_DOMAIN_RENDER);
-	igt_assert(ret == 0);
 
+	offset = tc_offset + offsetof(typeof(*tc), tc3_4);
+	intel_bb_offset_reloc_with_delta(ibb, dst->handle,
+					 0, I915_GEM_DOMAIN_RENDER,
+					 reloc_delta, offset,
+					 dst->addr.offset);
+
+	intel_bb_ptr_add(ibb, sizeof(*tc));
 }
 
 /* Borrowing the idea from the rendercopy state setup. */
 #define BATCH_STATE_SPLIT 2048
 
-void gen12_vebox_copyfunc(struct intel_batchbuffer *batch,
-			  const struct igt_buf *src,
-			  unsigned width, unsigned height,
-			  const struct igt_buf *dst)
+void gen12_vebox_copyfunc(struct intel_bb *ibb,
+			  struct intel_buf *src,
+			  unsigned int width, unsigned int height,
+			  struct intel_buf *dst)
 {
 	struct aux_pgtable_info aux_pgtable_info = { };
 	uint32_t aux_pgtable_state;
 	int format;
 
-	intel_batchbuffer_flush_on_ring(batch, I915_EXEC_VEBOX);
+	igt_assert(src->bpp == dst->bpp);
 
-	intel_batchbuffer_align(batch, 8);
+	intel_bb_flush(ibb, ibb->ctx, I915_EXEC_VEBOX);
 
-	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
+	intel_bb_add_intel_buf(ibb, dst, true);
+	intel_bb_add_intel_buf(ibb, src, false);
 
-	gen12_aux_pgtable_init(&aux_pgtable_info, batch->bufmgr, src, dst);
+	intel_bb_ptr_set(ibb, BATCH_STATE_SPLIT);
+	gen12_aux_pgtable_init(&aux_pgtable_info, ibb, src, dst);
+	aux_pgtable_state = gen12_create_aux_pgtable_state(ibb,
+							   aux_pgtable_info.pgtable_buf);
 
-	aux_pgtable_state = gen12_create_aux_pgtable_state(batch,
-							   aux_pgtable_info.pgtable_bo);
-
-	assert(batch->ptr < &batch->buffer[4095]);
-	batch->ptr = batch->buffer;
-
-	gen12_emit_aux_pgtable_state(batch, aux_pgtable_state, false);
+	intel_bb_ptr_set(ibb, 0);
+	gen12_emit_aux_pgtable_state(ibb, aux_pgtable_state, false);
 
 	/* The tiling convert command can't convert formats. */
 	igt_assert_eq(src->format_is_yuv, dst->format_is_yuv);
@@ -292,24 +289,26 @@ void gen12_vebox_copyfunc(struct intel_batchbuffer *batch,
 
 	igt_assert(!src->format_is_yuv_semiplanar ||
 		   (src->surface[1].offset && dst->surface[1].offset));
-	emit_surface_state_cmd(batch, VEBOX_SURFACE_INPUT,
+	emit_surface_state_cmd(ibb, VEBOX_SURFACE_INPUT,
 			       width, height, src->bpp,
 			       src->surface[0].stride,
 			       src->tiling, format, src->surface[1].offset);
 
-	emit_surface_state_cmd(batch, VEBOX_SURFACE_OUTPUT,
+	emit_surface_state_cmd(ibb, VEBOX_SURFACE_OUTPUT,
 			       width, height, dst->bpp,
 			       dst->surface[0].stride,
 			       dst->tiling, format, dst->surface[1].offset);
 
-	emit_tiling_convert_cmd(batch,
-				src->bo, src->tiling, src->compression,
-				dst->bo, dst->tiling, dst->compression);
+	emit_tiling_convert_cmd(ibb, src, dst);
+
+	intel_bb_out(ibb, MI_BATCH_BUFFER_END);
+	intel_bb_ptr_align(ibb, 8);
 
-	OUT_BATCH(MI_BATCH_BUFFER_END);
+	intel_bb_exec_with_context(ibb, intel_bb_offset(ibb), 0,
+				   I915_EXEC_VEBOX | I915_EXEC_NO_RELOC,
+				   false);
 
-	intel_batchbuffer_flush_on_ring(batch, I915_EXEC_VEBOX);
+	intel_bb_reset(ibb, false);
 
-	gen12_aux_pgtable_cleanup(&aux_pgtable_info);
-	intel_batchbuffer_reset(batch);
+	gen12_aux_pgtable_cleanup(ibb, &aux_pgtable_info);
 }
-- 
2.26.0

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [igt-dev] [PATCH i-g-t 05/11] tests/api_intel_bb: add render tests
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
                   ` (3 preceding siblings ...)
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 04/11] lib/rendercopy: remove libdrm dependency Zbigniew Kempczyński
@ 2020-07-23 19:22 ` Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 06/11] lib/igt_draw: remove libdrm dependency Zbigniew Kempczyński
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Zbigniew Kempczyński @ 2020-07-23 19:22 UTC (permalink / raw)
  To: igt-dev; +Cc: Chris Wilson

Check that the render / render-ccs tests work fine on all supported gens.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/i915/api_intel_bb.c | 168 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 168 insertions(+)

diff --git a/tests/i915/api_intel_bb.c b/tests/i915/api_intel_bb.c
index 6967fc5d..df0eb933 100644
--- a/tests/i915/api_intel_bb.c
+++ b/tests/i915/api_intel_bb.c
@@ -632,6 +632,162 @@ static void full_batch(struct buf_ops *bops)
 	intel_bb_destroy(ibb);
 }
 
+static int render(struct buf_ops *bops, uint32_t tiling)
+{
+	struct intel_bb *ibb;
+	const int width = 1024;
+	const int height = 1024;
+	struct intel_buf src, dst, final;
+	int i915 = buf_ops_get_fd(bops);
+	uint32_t fails = 0;
+	char name[128];
+	uint32_t devid = intel_get_drm_devid(i915);
+	igt_render_copyfunc_t render_copy = NULL;
+
+	ibb = intel_bb_create(i915, PAGE_SIZE);
+	if (debug_bb)
+		intel_bb_set_debug(ibb, true);
+
+	scratch_buf_init(bops, &src, width, height, I915_TILING_NONE,
+			 I915_COMPRESSION_NONE);
+	scratch_buf_init(bops, &dst, width, height, tiling,
+			 I915_COMPRESSION_NONE);
+	scratch_buf_init(bops, &final, width, height, I915_TILING_NONE,
+			 I915_COMPRESSION_NONE);
+
+	scratch_buf_draw_pattern(bops, &src,
+				 0, 0, width, height,
+				 0, 0, width, height, 0);
+
+	render_copy = igt_get_render_copyfunc(devid);
+	igt_assert(render_copy);
+
+	render_copy(ibb, 0,
+		    &src,
+		    0, 0, width, height,
+		    &dst,
+		    0, 0);
+
+	render_copy(ibb, 0,
+		    &dst,
+		    0, 0, width, height,
+		    &final,
+		    0, 0);
+
+	intel_bb_sync(ibb);
+	intel_bb_destroy(ibb);
+
+	if (write_png) {
+		snprintf(name, sizeof(name) - 1,
+			 "render_dst_tiling_%d.png", tiling);
+		intel_buf_write_to_png(&src, "render_src_tiling_none.png");
+		intel_buf_write_to_png(&dst, name);
+		intel_buf_write_to_png(&final, "render_final_tiling_none.png");
+	}
+
+	/* We'll fail on src <-> final compare so just warn */
+	if (tiling == I915_TILING_NONE) {
+		if (compare_bufs(&src, &dst, false) > 0)
+			igt_warn("%s: none->none failed!\n", __func__);
+	} else {
+		if (compare_bufs(&src, &dst, false) == 0)
+			igt_warn("%s: none->tiled failed!\n", __func__);
+	}
+
+	fails = compare_bufs(&src, &final, true);
+
+	intel_buf_close(bops, &src);
+	intel_buf_close(bops, &dst);
+	intel_buf_close(bops, &final);
+
+	igt_assert_f(fails == 0, "%s: (tiling: %d) fails: %d\n",
+		     __func__, tiling, fails);
+
+	return fails;
+}
+
+static uint32_t count_compressed(int gen, struct intel_buf *buf)
+{
+	int ccs_size = intel_buf_ccs_width(gen, buf) * intel_buf_ccs_height(gen, buf);
+	uint8_t *ptr = intel_buf_device_map(buf, false);
+	uint32_t compressed = 0;
+	int i;
+
+	for (i = 0; i < ccs_size; i++)
+		if (ptr[buf->ccs[0].offset + i])
+			compressed++;
+
+	intel_buf_unmap(buf);
+
+	return compressed;
+}
+
+#define IMGSIZE (512 * 512 * 4)
+#define CCSIMGSIZE (IMGSIZE + 4096)
+static void render_ccs(struct buf_ops *bops)
+{
+	struct intel_bb *ibb;
+	const int width = 1024;
+	const int height = 1024;
+	struct intel_buf src, dst, final;
+	int i915 = buf_ops_get_fd(bops);
+	uint32_t fails = 0;
+	uint32_t compressed = 0;
+	uint32_t devid = intel_get_drm_devid(i915);
+	igt_render_copyfunc_t render_copy = NULL;
+
+	ibb = intel_bb_create(i915, PAGE_SIZE);
+	if (debug_bb)
+		intel_bb_set_debug(ibb, true);
+
+	scratch_buf_init(bops, &src, width, height, I915_TILING_NONE,
+			 I915_COMPRESSION_NONE);
+	scratch_buf_init(bops, &dst, width, height, I915_TILING_Y,
+			 I915_COMPRESSION_RENDER);
+	scratch_buf_init(bops, &final, width, height, I915_TILING_NONE,
+			 I915_COMPRESSION_NONE);
+
+	render_copy = igt_get_render_copyfunc(devid);
+	igt_assert(render_copy);
+
+	scratch_buf_draw_pattern(bops, &src,
+				 0, 0, width, height,
+				 0, 0, width, height, 0);
+
+	render_copy(ibb, 0,
+		    &src,
+		    0, 0, width, height,
+		    &dst,
+		    0, 0);
+
+	render_copy(ibb, 0,
+		    &dst,
+		    0, 0, width, height,
+		    &final,
+		    0, 0);
+
+	intel_bb_sync(ibb);
+
+	fails = compare_bufs(&src, &final, true);
+	compressed = count_compressed(ibb->gen, &dst);
+	intel_bb_destroy(ibb);
+
+	igt_debug("fails: %u, compressed: %u\n", fails, compressed);
+
+	if (write_png) {
+		intel_buf_write_to_png(&src, "render-ccs-src.png");
+		intel_buf_write_to_png(&dst, "render-ccs-dst.png");
+		intel_buf_write_aux_to_png(&dst, "render-ccs-dst-aux.png");
+		intel_buf_write_to_png(&final, "render-ccs-final.png");
+	}
+
+	intel_buf_close(bops, &src);
+	intel_buf_close(bops, &dst);
+	intel_buf_close(bops, &final);
+
+	igt_assert_f(fails == 0, "render-ccs fails: %d\n", fails);
+}
+
 static int opt_handler(int opt, int opt_index, void *data)
 {
 	switch (opt) {
@@ -702,6 +858,18 @@ igt_main_args("dpi", NULL, help_str, opt_handler, NULL)
 	igt_subtest("full-batch")
 		full_batch(bops);
 
+	igt_subtest("render-none")
+		render(bops, I915_TILING_NONE);
+
+	igt_subtest("render-x")
+		render(bops, I915_TILING_X);
+
+	igt_subtest("render-y")
+		render(bops, I915_TILING_Y);
+
+	igt_subtest("render-ccs")
+		render_ccs(bops);
+
 	igt_fixture {
 		buf_ops_destroy(bops);
 		close(i915);
-- 
2.26.0


* [igt-dev] [PATCH i-g-t 06/11] lib/igt_draw: remove libdrm dependency
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
                   ` (4 preceding siblings ...)
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 05/11] tests/api_intel_bb: add render tests Zbigniew Kempczyński
@ 2020-07-23 19:22 ` Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 07/11] lib/igt_fb: Removal of " Zbigniew Kempczyński
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Zbigniew Kempczyński @ 2020-07-23 19:22 UTC (permalink / raw)
  To: igt-dev; +Cc: Chris Wilson

Change igt_draw to use intel_bb / intel_buf to remove the libdrm dependency.

Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/igt_draw.c | 158 ++++++++++++++++++++++++-------------------------
 lib/igt_draw.h |   8 +--
 2 files changed, 80 insertions(+), 86 deletions(-)

diff --git a/lib/igt_draw.c b/lib/igt_draw.c
index f2340127..8de870b7 100644
--- a/lib/igt_draw.c
+++ b/lib/igt_draw.c
@@ -27,6 +27,7 @@
 #include "igt_draw.h"
 
 #include "drmtest.h"
+#include "intel_bufops.h"
 #include "intel_batchbuffer.h"
 #include "intel_chipset.h"
 #include "igt_core.h"
@@ -61,8 +62,8 @@
 /* Some internal data structures to avoid having to pass tons of parameters
  * around everything. */
 struct cmd_data {
-	drm_intel_bufmgr *bufmgr;
-	drm_intel_context *context;
+	struct buf_ops *bops;
+	uint32_t ctx;
 };
 
 struct buf_data {
@@ -264,8 +265,7 @@ static void set_pixel(void *_ptr, int index, uint32_t color, int bpp)
 	}
 }
 
-static void switch_blt_tiling(struct intel_batchbuffer *batch, uint32_t tiling,
-			      bool on)
+static void switch_blt_tiling(struct intel_bb *ibb, uint32_t tiling, bool on)
 {
 	uint32_t bcs_swctrl;
 
@@ -273,26 +273,22 @@ static void switch_blt_tiling(struct intel_batchbuffer *batch, uint32_t tiling,
 	if (tiling != I915_TILING_Y)
 		return;
 
-	igt_require(batch->gen >= 6);
+	igt_require(ibb->gen >= 6);
 
 	bcs_swctrl = (0x3 << 16) | (on ? 0x3 : 0x0);
 
 	/* To change the tile register, insert an MI_FLUSH_DW followed by an
 	 * MI_LOAD_REGISTER_IMM
 	 */
-	BEGIN_BATCH(4, 0);
-	OUT_BATCH(MI_FLUSH_DW | 2);
-	OUT_BATCH(0x0);
-	OUT_BATCH(0x0);
-	OUT_BATCH(0x0);
-	ADVANCE_BATCH();
-
-	BEGIN_BATCH(4, 0);
-	OUT_BATCH(MI_LOAD_REGISTER_IMM);
-	OUT_BATCH(0x22200); /* BCS_SWCTRL */
-	OUT_BATCH(bcs_swctrl);
-	OUT_BATCH(MI_NOOP);
-	ADVANCE_BATCH();
+	intel_bb_out(ibb, MI_FLUSH_DW | 2);
+	intel_bb_out(ibb, 0x0);
+	intel_bb_out(ibb, 0x0);
+	intel_bb_out(ibb, 0x0);
+
+	intel_bb_out(ibb, MI_LOAD_REGISTER_IMM);
+	intel_bb_out(ibb, 0x22200); /* BCS_SWCTRL */
+	intel_bb_out(ibb, bcs_swctrl);
+	intel_bb_out(ibb, MI_NOOP);
 }
 
 static void draw_rect_ptr_linear(void *ptr, uint32_t stride,
@@ -509,22 +505,39 @@ static void draw_rect_pwrite(int fd, struct buf_data *buf,
 	}
 }
 
+static struct intel_buf *create_buf(int fd, struct buf_ops *bops,
+				    struct buf_data *from, uint32_t tiling)
+{
+	struct intel_buf *buf = calloc(1, sizeof(*buf));
+	uint32_t handle, name, width, height;
+
+	igt_assert(buf);
+
+	width = from->stride / (from->bpp / 8);
+	height = from->size / from->stride;
+
+	name = gem_flink(fd, from->handle);
+	handle = gem_open(fd, name);
+	intel_buf_init_using_handle(bops, handle, buf,
+				    width, height, from->bpp, 0,
+				    tiling, 0);
+
+	return buf;
+}
+
 static void draw_rect_blt(int fd, struct cmd_data *cmd_data,
 			  struct buf_data *buf, struct rect *rect,
 			  uint32_t tiling, uint32_t color)
 {
-	drm_intel_bo *dst;
-	struct intel_batchbuffer *batch;
+	struct intel_bb *ibb;
+	struct intel_buf *dst;
 	int blt_cmd_len, blt_cmd_tiling, blt_cmd_depth;
 	uint32_t devid = intel_get_drm_devid(fd);
 	int gen = intel_gen(devid);
 	int pitch;
 
-	dst = gem_handle_to_libdrm_bo(cmd_data->bufmgr, fd, "", buf->handle);
-	igt_assert(dst);
-
-	batch = intel_batchbuffer_alloc(cmd_data->bufmgr, devid);
-	igt_assert(batch);
+	dst = create_buf(fd, cmd_data->bops, buf, tiling);
+	ibb = intel_bb_create(fd, PAGE_SIZE);
 
 	switch (buf->bpp) {
 	case 8:
@@ -544,81 +557,63 @@ static void draw_rect_blt(int fd, struct cmd_data *cmd_data,
 	blt_cmd_tiling = (tiling) ? XY_COLOR_BLT_TILED : 0;
 	pitch = (gen >= 4 && tiling) ? buf->stride / 4 : buf->stride;
 
-	switch_blt_tiling(batch, tiling, true);
+	switch_blt_tiling(ibb, tiling, true);
 
-	BEGIN_BATCH(6, 1);
-	OUT_BATCH(XY_COLOR_BLT_CMD_NOLEN | XY_COLOR_BLT_WRITE_ALPHA |
-		  XY_COLOR_BLT_WRITE_RGB | blt_cmd_tiling | blt_cmd_len);
-	OUT_BATCH(blt_cmd_depth | (0xF0 << 16) | pitch);
-	OUT_BATCH((rect->y << 16) | rect->x);
-	OUT_BATCH(((rect->y + rect->h) << 16) | (rect->x + rect->w));
-	OUT_RELOC_FENCED(dst, 0, I915_GEM_DOMAIN_RENDER, 0);
-	OUT_BATCH(color);
-	ADVANCE_BATCH();
+	intel_bb_out(ibb, XY_COLOR_BLT_CMD_NOLEN | XY_COLOR_BLT_WRITE_ALPHA |
+		     XY_COLOR_BLT_WRITE_RGB | blt_cmd_tiling | blt_cmd_len);
+	intel_bb_out(ibb, blt_cmd_depth | (0xF0 << 16) | pitch);
+	intel_bb_out(ibb, (rect->y << 16) | rect->x);
+	intel_bb_out(ibb, ((rect->y + rect->h) << 16) | (rect->x + rect->w));
+	intel_bb_emit_reloc_fenced(ibb, dst->handle, 0, I915_GEM_DOMAIN_RENDER,
+				   0, dst->addr.offset);
+	intel_bb_out(ibb, color);
 
-	switch_blt_tiling(batch, tiling, false);
+	switch_blt_tiling(ibb, tiling, false);
 
-	intel_batchbuffer_flush(batch);
-	intel_batchbuffer_free(batch);
-	drm_intel_bo_unreference(dst);
+	intel_bb_flush_blit(ibb);
+	intel_bb_destroy(ibb);
+	intel_buf_destroy(dst);
 }
 
 static void draw_rect_render(int fd, struct cmd_data *cmd_data,
 			     struct buf_data *buf, struct rect *rect,
 			     uint32_t tiling, uint32_t color)
 {
-	drm_intel_bo *src, *dst;
+	struct intel_buf *src, *dst;
 	uint32_t devid = intel_get_drm_devid(fd);
 	igt_render_copyfunc_t rendercopy = igt_get_render_copyfunc(devid);
-	struct igt_buf src_buf = {}, dst_buf = {};
-	struct intel_batchbuffer *batch;
+	struct intel_bb *ibb;
 	struct buf_data tmp;
 	int pixel_size = buf->bpp / 8;
 
 	igt_skip_on(!rendercopy);
 
 	/* We create a temporary buffer and copy from it using rendercopy. */
-	tmp.size = rect->w * rect->h * pixel_size;
+	tmp.size = ALIGN(rect->w * rect->h * pixel_size, 128);
 	tmp.handle = gem_create(fd, tmp.size);
-	tmp.stride = rect->w * pixel_size;
+	tmp.stride = ALIGN(rect->w * pixel_size, 32);
 	tmp.bpp = buf->bpp;
 	draw_rect_mmap_cpu(fd, &tmp, &(struct rect){0, 0, rect->w, rect->h},
 			   I915_TILING_NONE, I915_BIT_6_SWIZZLE_NONE, color);
 
-	src = gem_handle_to_libdrm_bo(cmd_data->bufmgr, fd, "", tmp.handle);
-	igt_assert(src);
-	dst = gem_handle_to_libdrm_bo(cmd_data->bufmgr, fd, "", buf->handle);
-	igt_assert(dst);
-
-	src_buf.bo = src;
-	src_buf.surface[0].stride = tmp.stride;
-	src_buf.tiling = I915_TILING_NONE;
-	src_buf.surface[0].size = tmp.size;
-	src_buf.bpp = tmp.bpp;
-	dst_buf.bo = dst;
-	dst_buf.surface[0].stride = buf->stride;
-	dst_buf.tiling = tiling;
-	dst_buf.surface[0].size = buf->size;
-	dst_buf.bpp = buf->bpp;
-
-	batch = intel_batchbuffer_alloc(cmd_data->bufmgr, devid);
-	igt_assert(batch);
-
-	rendercopy(batch, cmd_data->context, &src_buf, 0, 0, rect->w,
-		   rect->h, &dst_buf, rect->x, rect->y);
-
-	intel_batchbuffer_free(batch);
-	drm_intel_bo_unreference(src);
-	drm_intel_bo_unreference(dst);
+	src = create_buf(fd, cmd_data->bops, &tmp, I915_TILING_NONE);
+	dst = create_buf(fd, cmd_data->bops, buf, tiling);
+	ibb = intel_bb_create(fd, PAGE_SIZE);
+
+	rendercopy(ibb, cmd_data->ctx, src, 0, 0, rect->w,
+		   rect->h, dst, rect->x, rect->y);
+
+	intel_bb_destroy(ibb);
+	intel_buf_destroy(src);
+	intel_buf_destroy(dst);
 	gem_close(fd, tmp.handle);
 }
 
 /**
  * igt_draw_rect:
  * @fd: the DRM file descriptor
- * @bufmgr: the libdrm bufmgr, only required for IGT_DRAW_BLT and
- *          IGT_DRAW_RENDER
- * @context: the context, can be NULL if you don't want to think about it
+ * @bops: buf ops, only required for IGT_DRAW_BLT and IGT_DRAW_RENDER
+ * @ctx: the context, can be 0 if you don't want to think about it
  * @buf_handle: the handle of the buffer where you're going to draw to
  * @buf_size: the size of the buffer
  * @buf_stride: the stride of the buffer
@@ -634,7 +629,7 @@ static void draw_rect_render(int fd, struct cmd_data *cmd_data,
  * This function draws a colored rectangle on the destination buffer, allowing
  * you to specify the method used to draw the rectangle.
  */
-void igt_draw_rect(int fd, drm_intel_bufmgr *bufmgr, drm_intel_context *context,
+void igt_draw_rect(int fd, struct buf_ops *bops, uint32_t ctx,
 		   uint32_t buf_handle, uint32_t buf_size, uint32_t buf_stride,
 		   uint32_t tiling, enum igt_draw_method method,
 		   int rect_x, int rect_y, int rect_w, int rect_h,
@@ -643,8 +638,8 @@ void igt_draw_rect(int fd, drm_intel_bufmgr *bufmgr, drm_intel_context *context,
 	uint32_t buf_tiling, swizzle;
 
 	struct cmd_data cmd_data = {
-		.bufmgr = bufmgr,
-		.context = context,
+		.bops = bops,
+		.ctx = ctx,
 	};
 	struct buf_data buf = {
 		.handle = buf_handle,
@@ -693,9 +688,8 @@ void igt_draw_rect(int fd, drm_intel_bufmgr *bufmgr, drm_intel_context *context,
 /**
  * igt_draw_rect_fb:
  * @fd: the DRM file descriptor
- * @bufmgr: the libdrm bufmgr, only required for IGT_DRAW_BLT and
- *          IGT_DRAW_RENDER
- * @context: the context, can be NULL if you don't want to think about it
+ * @bops: buf ops, only required for IGT_DRAW_BLT and IGT_DRAW_RENDER
+ * @ctx: context, can be 0 if you don't want to think about it
  * @fb: framebuffer
  * @method: method you're going to use to write to the buffer
  * @rect_x: horizontal position on the buffer where your rectangle starts
@@ -707,12 +701,12 @@ void igt_draw_rect(int fd, drm_intel_bufmgr *bufmgr, drm_intel_context *context,
  * This is exactly the same as igt_draw_rect, but you can pass an igt_fb instead
  * of manually providing its details. See igt_draw_rect.
  */
-void igt_draw_rect_fb(int fd, drm_intel_bufmgr *bufmgr,
-		      drm_intel_context *context, struct igt_fb *fb,
+void igt_draw_rect_fb(int fd, struct buf_ops *bops,
+		      uint32_t ctx, struct igt_fb *fb,
 		      enum igt_draw_method method, int rect_x, int rect_y,
 		      int rect_w, int rect_h, uint32_t color)
 {
-	igt_draw_rect(fd, bufmgr, context, fb->gem_handle, fb->size, fb->strides[0],
+	igt_draw_rect(fd, bops, ctx, fb->gem_handle, fb->size, fb->strides[0],
 		      igt_fb_mod_to_tiling(fb->modifier), method,
 		      rect_x, rect_y, rect_w, rect_h, color,
 		      igt_drm_format_to_bpp(fb->drm_format));
@@ -728,7 +722,7 @@ void igt_draw_rect_fb(int fd, drm_intel_bufmgr *bufmgr,
  */
 void igt_draw_fill_fb(int fd, struct igt_fb *fb, uint32_t color)
 {
-	igt_draw_rect_fb(fd, NULL, NULL, fb,
+	igt_draw_rect_fb(fd, NULL, 0, fb,
 			 gem_has_mappable_ggtt(fd) ? IGT_DRAW_MMAP_GTT :
 						     IGT_DRAW_MMAP_WC,
 			 0, 0, fb->width, fb->height, color);
diff --git a/lib/igt_draw.h b/lib/igt_draw.h
index ec146754..2d18ef6c 100644
--- a/lib/igt_draw.h
+++ b/lib/igt_draw.h
@@ -25,7 +25,7 @@
 #ifndef __IGT_DRAW_H__
 #define __IGT_DRAW_H__
 
-#include <intel_bufmgr.h>
+#include <intel_bufops.h>
 #include "igt_fb.h"
 
 /**
@@ -50,14 +50,14 @@ enum igt_draw_method {
 
 const char *igt_draw_get_method_name(enum igt_draw_method method);
 
-void igt_draw_rect(int fd, drm_intel_bufmgr *bufmgr, drm_intel_context *context,
+void igt_draw_rect(int fd, struct buf_ops *bops, uint32_t ctx,
 		   uint32_t buf_handle, uint32_t buf_size, uint32_t buf_stride,
 		   uint32_t tiling, enum igt_draw_method method,
 		   int rect_x, int rect_y, int rect_w, int rect_h,
 		   uint32_t color, int bpp);
 
-void igt_draw_rect_fb(int fd, drm_intel_bufmgr *bufmgr,
-		      drm_intel_context *context, struct igt_fb *fb,
+void igt_draw_rect_fb(int fd, struct buf_ops *bops,
+		      uint32_t ctx, struct igt_fb *fb,
 		      enum igt_draw_method method, int rect_x, int rect_y,
 		      int rect_w, int rect_h, uint32_t color);
 
-- 
2.26.0

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [igt-dev] [PATCH i-g-t 07/11] lib/igt_fb: Removal of libdrm dependency
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
                   ` (5 preceding siblings ...)
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 06/11] lib/igt_draw: remove libdrm dependency Zbigniew Kempczyński
@ 2020-07-23 19:22 ` Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 08/11] tests/gem|kms: remove libdrm dependency (batch 1) Zbigniew Kempczyński
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Zbigniew Kempczyński @ 2020-07-23 19:22 UTC (permalink / raw)
  To: igt-dev; +Cc: Chris Wilson

From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>

Adapt to intel_bb/intel_buf for libdrm removal.
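For reference, the setup/teardown pattern this conversion applies throughout the hunks below (a sketch, not compilable on its own; `fd` is an open DRM file descriptor):

```
/* Before (libdrm):
 *   bufmgr = drm_intel_bufmgr_gem_init(fd, 4096);
 *   batch  = intel_batchbuffer_alloc(bufmgr, devid);
 *   ...
 *   intel_batchbuffer_free(batch);
 *   drm_intel_bufmgr_destroy(bufmgr);
 */

/* After (IGT bufops): */
struct buf_ops *bops = buf_ops_create(fd);        /* replaces drm_intel_bufmgr_gem_init() */
struct intel_bb *ibb = intel_bb_create(fd, 4096); /* replaces intel_batchbuffer_alloc() */

/* render/vebox copy functions now take struct intel_buf *
 * (from intel_buf_init_using_handle()) instead of struct igt_buf */

intel_bb_destroy(ibb);
buf_ops_destroy(bops);
```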

Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/igt_fb.c | 96 +++++++++++++++++++++++++++++-----------------------
 1 file changed, 54 insertions(+), 42 deletions(-)

diff --git a/lib/igt_fb.c b/lib/igt_fb.c
index 3864b7a1..748c9c5a 100644
--- a/lib/igt_fb.c
+++ b/lib/igt_fb.c
@@ -45,6 +45,7 @@
 #include "intel_batchbuffer.h"
 #include "intel_chipset.h"
 #include "i915/gem_mman.h"
+#include "intel_bufops.h"
 
 /**
  * SECTION:igt_fb
@@ -1973,8 +1974,8 @@ struct fb_blit_upload {
 	int fd;
 	struct igt_fb *fb;
 	struct fb_blit_linear linear;
-	drm_intel_bufmgr *bufmgr;
-	struct intel_batchbuffer *batch;
+	struct buf_ops *bops;
+	struct intel_bb *ibb;
 };
 
 static bool fast_blit_ok(const struct igt_fb *fb)
@@ -2044,14 +2045,14 @@ static bool use_blitter(const struct igt_fb *fb)
 	       !gem_has_mappable_ggtt(fb->fd);
 }
 
-static void init_buf_ccs(struct igt_buf *buf, int ccs_idx,
+static void init_buf_ccs(struct intel_buf *buf, int ccs_idx,
 			 uint32_t offset, uint32_t stride)
 {
 	buf->ccs[ccs_idx].offset = offset;
 	buf->ccs[ccs_idx].stride = stride;
 }
 
-static void init_buf_surface(struct igt_buf *buf, int surface_idx,
+static void init_buf_surface(struct intel_buf *buf, int surface_idx,
 			     uint32_t offset, uint32_t stride, uint32_t size)
 {
 	buf->surface[surface_idx].offset = offset;
@@ -2075,26 +2076,17 @@ static int yuv_semiplanar_bpp(uint32_t drm_format)
 	}
 }
 
-static void init_buf(struct fb_blit_upload *blit,
-		     struct igt_buf *buf,
-		     const struct igt_fb *fb,
-		     const char *name)
+static struct intel_buf *create_buf(struct fb_blit_upload *blit,
+				   const struct igt_fb *fb,
+				   const char *name)
 {
+	struct intel_buf *buf;
+	uint32_t bo_name, handle, compression;
 	int num_surfaces;
 	int i;
 
 	igt_assert_eq(fb->offsets[0], 0);
 
-	buf->bo = gem_handle_to_libdrm_bo(blit->bufmgr, blit->fd,
-					  name, fb->gem_handle);
-	buf->tiling = igt_fb_mod_to_tiling(fb->modifier);
-	buf->bpp = fb->plane_bpp[0];
-	buf->format_is_yuv = igt_format_is_yuv(fb->drm_format);
-	buf->format_is_yuv_semiplanar =
-		igt_format_is_yuv_semiplanar(fb->drm_format);
-	if (buf->format_is_yuv_semiplanar)
-		buf->yuv_semiplanar_bpp = yuv_semiplanar_bpp(fb->drm_format);
-
 	if (is_ccs_modifier(fb->modifier)) {
 		igt_assert_eq(fb->strides[0] & 127, 0);
 
@@ -2104,17 +2096,36 @@ static void init_buf(struct fb_blit_upload *blit,
 			igt_assert_eq(fb->strides[1] & 127, 0);
 
 		if (is_gen12_mc_ccs_modifier(fb->modifier))
-			buf->compression = I915_COMPRESSION_MEDIA;
+			compression = I915_COMPRESSION_MEDIA;
 		else
-			buf->compression = I915_COMPRESSION_RENDER;
+			compression = I915_COMPRESSION_RENDER;
+	} else {
+		num_surfaces = fb->num_planes;
+		compression = I915_COMPRESSION_NONE;
+	}
 
+	bo_name = gem_flink(blit->fd, fb->gem_handle);
+	handle = gem_open(blit->fd, bo_name);
+	buf = calloc(1, sizeof(*buf));
+	igt_assert(buf);
+	intel_buf_init_using_handle(blit->bops, handle, buf, fb->width,
+				    fb->height, fb->plane_bpp[0], 0,
+				    igt_fb_mod_to_tiling(fb->modifier),
+				    compression);
+	intel_buf_set_name(buf, name);
+
+	buf->format_is_yuv = igt_format_is_yuv(fb->drm_format);
+	buf->format_is_yuv_semiplanar =
+		igt_format_is_yuv_semiplanar(fb->drm_format);
+	if (buf->format_is_yuv_semiplanar)
+		buf->yuv_semiplanar_bpp = yuv_semiplanar_bpp(fb->drm_format);
+
+	if (is_ccs_modifier(fb->modifier)) {
 		num_surfaces = fb->num_planes / 2;
 		for (i = 0; i < num_surfaces; i++)
 			init_buf_ccs(buf, i,
 				     fb->offsets[num_surfaces + i],
 				     fb->strides[num_surfaces + i]);
-	} else {
-		num_surfaces = fb->num_planes;
 	}
 
 	igt_assert(fb->offsets[0] == 0);
@@ -2130,11 +2141,13 @@ static void init_buf(struct fb_blit_upload *blit,
 
 	if (fb->modifier == I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS_CC)
 		buf->cc.offset = fb->offsets[2];
+
+	return buf;
 }
 
-static void fini_buf(struct igt_buf *buf)
+static void fini_buf(struct intel_buf *buf)
 {
-	drm_intel_bo_unreference(buf->bo);
+	intel_buf_destroy(buf);
 }
 
 static bool use_vebox_copy(const struct igt_fb *src_fb,
@@ -2164,7 +2177,7 @@ static void copy_with_engine(struct fb_blit_upload *blit,
 			     const struct igt_fb *dst_fb,
 			     const struct igt_fb *src_fb)
 {
-	struct igt_buf src = {}, dst = {};
+	struct intel_buf *src, *dst;
 	igt_render_copyfunc_t render_copy = NULL;
 	igt_vebox_copyfunc_t vebox_copy = NULL;
 
@@ -2178,23 +2191,23 @@ static void copy_with_engine(struct fb_blit_upload *blit,
 	igt_assert_eq(dst_fb->offsets[0], 0);
 	igt_assert_eq(src_fb->offsets[0], 0);
 
-	init_buf(blit, &src, src_fb, "cairo enginecopy src");
-	init_buf(blit, &dst, dst_fb, "cairo enginecopy dst");
+	src = create_buf(blit, src_fb, "cairo enginecopy src");
+	dst = create_buf(blit, dst_fb, "cairo enginecopy dst");
 
 	if (vebox_copy)
-		vebox_copy(blit->batch, &src,
+		vebox_copy(blit->ibb, src,
 			   dst_fb->plane_width[0], dst_fb->plane_height[0],
-			   &dst);
+			   dst);
 	else
-		render_copy(blit->batch, NULL,
-			    &src,
+		render_copy(blit->ibb, 0,
+			    src,
 			    0, 0,
 			    dst_fb->plane_width[0], dst_fb->plane_height[0],
-			    &dst,
+			    dst,
 			    0, 0);
 
-	fini_buf(&dst);
-	fini_buf(&src);
+	fini_buf(dst);
+	fini_buf(src);
 }
 
 static void blitcopy(const struct igt_fb *dst_fb,
@@ -2268,7 +2281,7 @@ static void free_linear_mapping(struct fb_blit_upload *blit)
 		gem_set_domain(fd, linear->fb.gem_handle,
 			I915_GEM_DOMAIN_GTT, 0);
 
-		if (blit->batch)
+		if (blit->ibb)
 			copy_with_engine(blit, fb, &linear->fb);
 		else
 			blitcopy(fb, &linear->fb);
@@ -2277,9 +2290,9 @@ static void free_linear_mapping(struct fb_blit_upload *blit)
 		gem_close(fd, linear->fb.gem_handle);
 	}
 
-	if (blit->batch) {
-		intel_batchbuffer_free(blit->batch);
-		drm_intel_bufmgr_destroy(blit->bufmgr);
+	if (blit->ibb) {
+		intel_bb_destroy(blit->ibb);
+		buf_ops_destroy(blit->bops);
 	}
 }
 
@@ -2301,9 +2314,8 @@ static void setup_linear_mapping(struct fb_blit_upload *blit)
 	struct fb_blit_linear *linear = &blit->linear;
 
 	if (!igt_vc4_is_tiled(fb->modifier) && use_enginecopy(fb)) {
-		blit->bufmgr = drm_intel_bufmgr_gem_init(fd, 4096);
-		blit->batch = intel_batchbuffer_alloc(blit->bufmgr,
-						      intel_get_drm_devid(fd));
+		blit->bops = buf_ops_create(fd);
+		blit->ibb = intel_bb_create(fd, 4096);
 	}
 
 	/*
@@ -2335,7 +2347,7 @@ static void setup_linear_mapping(struct fb_blit_upload *blit)
 		gem_set_domain(fd, linear->fb.gem_handle,
 				I915_GEM_DOMAIN_GTT, 0);
 
-		if (blit->batch)
+		if (blit->ibb)
 			copy_with_engine(blit, &linear->fb, fb);
 		else
 			blitcopy(&linear->fb, fb);
-- 
2.26.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [igt-dev] [PATCH i-g-t 08/11] tests/gem|kms: remove libdrm dependency (batch 1)
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
                   ` (6 preceding siblings ...)
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 07/11] lib/igt_fb: Removal of " Zbigniew Kempczyński
@ 2020-07-23 19:22 ` Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 09/11] tests/gem|kms: remove libdrm dependency (batch 2) Zbigniew Kempczyński
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Zbigniew Kempczyński @ 2020-07-23 19:22 UTC (permalink / raw)
  To: igt-dev; +Cc: Chris Wilson

From: Dominik Grzegorzek <dominik.grzegorzek@intel.com>

Use intel_bb / intel_buf to remove libdrm dependency.

Tests changed:
- gem_concurrent_all
- gem_ppgtt
- kms_draw_crc
- kms_frontbuffer_tracking
- kms_psr

Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/i915/gem_concurrent_all.c  | 431 ++++++++++++++++---------------
 tests/i915/gem_ppgtt.c           | 183 ++++++-------
 tests/kms_draw_crc.c             |  20 +-
 tests/kms_frontbuffer_tracking.c |  20 +-
 tests/kms_psr.c                  | 134 +++++-----
 5 files changed, 391 insertions(+), 397 deletions(-)

diff --git a/tests/i915/gem_concurrent_all.c b/tests/i915/gem_concurrent_all.c
index 80278c75..6ba12853 100644
--- a/tests/i915/gem_concurrent_all.c
+++ b/tests/i915/gem_concurrent_all.c
@@ -64,7 +64,9 @@ int pass;
 struct create {
 	const char *name;
 	void (*require)(const struct create *, unsigned);
-	drm_intel_bo *(*create)(drm_intel_bufmgr *, uint64_t size);
+	struct intel_buf *(*create)(struct buf_ops *bops, uint32_t width,
+				    uint32_t height, uint32_t tiling,
+				    uint64_t size);
 };
 
 struct size {
@@ -77,10 +79,10 @@ struct buffers {
 	const struct create *create;
 	const struct access_mode *mode;
 	const struct size *size;
-	drm_intel_bufmgr *bufmgr;
-	struct intel_batchbuffer *batch;
-	drm_intel_bo **src, **dst;
-	drm_intel_bo *snoop, *spare;
+	struct buf_ops *bops;
+	struct intel_bb *ibb;
+	struct intel_buf **src, **dst;
+	struct intel_buf *snoop, *spare;
 	uint32_t *tmp;
 	int width, height, npixels, page_size;
 	int count, num_buffers;
@@ -88,29 +90,32 @@ struct buffers {
 
 #define MIN_BUFFERS 3
 
-static void blt_copy_bo(struct buffers *b, drm_intel_bo *dst, drm_intel_bo *src);
+static void blt_copy_bo(struct buffers *b, struct intel_buf *dst,
+			struct intel_buf *src);
 
 static void
-nop_release_bo(drm_intel_bo *bo)
+nop_release_bo(struct intel_buf *buf)
 {
-	drm_intel_bo_unreference(bo);
+	if (buf->ptr)
+		intel_buf_unmap(buf);
+	intel_buf_destroy(buf);
 }
 
 static void
-prw_set_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+prw_set_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
 	for (int i = 0; i < b->npixels; i++)
 		b->tmp[i] = val;
-	drm_intel_bo_subdata(bo, 0, 4*b->npixels, b->tmp);
+	gem_write(fd, buf->handle, 0, b->tmp, 4*b->npixels);
 }
 
 static void
-prw_cmp_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+prw_cmp_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
 	uint32_t *vaddr;
 
 	vaddr = b->tmp;
-	do_or_die(drm_intel_bo_get_subdata(bo, 0, 4*b->npixels, vaddr));
+	gem_read(fd, buf->handle, 0, vaddr, 4*b->npixels);
 	for (int i = 0; i < b->npixels; i++)
 		igt_assert_eq_u32(vaddr[i], val);
 }
@@ -118,31 +123,33 @@ prw_cmp_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
 #define pixel(y, width) ((y)*(width) + (((y) + pass)%(width)))
 
 static void
-partial_set_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+partial_set_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
 	for (int y = 0; y < b->height; y++)
-		do_or_die(drm_intel_bo_subdata(bo, 4*pixel(y, b->width), 4, &val));
+		gem_write(fd, buf->handle, 4*pixel(y, b->width), &val, 4);
 }
 
 static void
-partial_cmp_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+partial_cmp_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
 	for (int y = 0; y < b->height; y++) {
-		uint32_t buf;
-		do_or_die(drm_intel_bo_get_subdata(bo, 4*pixel(y, b->width), 4, &buf));
-		igt_assert_eq_u32(buf, val);
+		uint32_t tmp;
+
+		gem_read(fd, buf->handle, 4*pixel(y, b->width), &tmp, 4);
+		igt_assert_eq_u32(tmp, val);
 	}
 }
 
-static drm_intel_bo *
-create_normal_bo(drm_intel_bufmgr *bufmgr, uint64_t size)
+static struct intel_buf *
+create_normal_bo(struct buf_ops *bops, uint32_t width,
+		 uint32_t height, uint32_t tiling, uint64_t size)
 {
-	drm_intel_bo *bo;
+	struct intel_buf *buf;
+	int bpp = size/height/width * 8;
 
-	bo = drm_intel_bo_alloc(bufmgr, "bo", size, 0);
-	igt_assert(bo);
+	buf = intel_buf_create(bops, width, height, bpp, 0, tiling, 0);
 
-	return bo;
+	return buf;
 }
 
 static void can_create_normal(const struct create *create, unsigned count)
@@ -150,19 +157,26 @@ static void can_create_normal(const struct create *create, unsigned count)
 }
 
 #if HAVE_CREATE_PRIVATE
-static drm_intel_bo *
-create_private_bo(drm_intel_bufmgr *bufmgr, uint64_t size)
+static struct intel_buf *
+create_private_bo(struct buf_ops *bops, uint32_t width, uint32_t height,
+		  uint32_t tiling, uint64_t size)
 {
-	drm_intel_bo *bo;
-	uint32_t handle;
+	struct intel_buf *buf;
+	uint32_t handle, buf_handle, name;
+	int bpp = size/height/width * 8;
 
 	/* XXX gem_create_with_flags(fd, size, I915_CREATE_PRIVATE); */
 
 	handle = gem_create(fd, size);
-	bo = gem_handle_to_libdrm_bo(bufmgr, fd, "stolen", handle);
+	name = gem_flink(fd, handle);
+	buf_handle = gem_open(fd, name);
+
+	buf = calloc(1, sizeof(*buf));
+	intel_buf_init_using_handle(bops, buf_handle, buf, width, height,
+				    bpp, 0, tiling, 0);
 	gem_close(fd, handle);
 
-	return bo;
+	return buf;
 }
 
 static void can_create_private(const struct create *create, unsigned count)
@@ -172,19 +186,26 @@ static void can_create_private(const struct create *create, unsigned count)
 #endif
 
 #if HAVE_CREATE_STOLEN
-static drm_intel_bo *
-create_stolen_bo(drm_intel_bufmgr *bufmgr, uint64_t size)
+static struct intel_buf *
+create_stolen_bo(struct buf_ops *bops, uint32_t width, uint32_t height,
+		 uint32_t tiling, uint64_t size)
 {
-	drm_intel_bo *bo;
-	uint32_t handle;
+	struct intel_buf *buf;
+	uint32_t handle, buf_handle, name;
+	int bpp = size/height/width * 8;
 
-	/* XXX gem_create_with_flags(fd, size, I915_CREATE_STOLEN); */
+	/* XXX gem_create_with_flags(fd, size, I915_CREATE_STOLEN); */
 
 	handle = gem_create(fd, size);
-	bo = gem_handle_to_libdrm_bo(bufmgr, fd, "stolen", handle);
+	name = gem_flink(fd, handle);
+	buf_handle = gem_open(fd, name);
+
+	buf = calloc(1, sizeof(*buf));
+	intel_buf_init_using_handle(bops, buf_handle, buf, width, height,
+				    bpp, 0, tiling, 0);
 	gem_close(fd, handle);
 
-	return bo;
+	return buf;
 }
 
 static void can_create_stolen(const struct create *create, unsigned count)
@@ -201,10 +222,17 @@ static void create_cpu_require(const struct create *create, unsigned count)
 #endif
 }
 
-static drm_intel_bo *
+static struct intel_buf *
+create_bo(const struct buffers *b, uint32_t tiling)
+{
+	return b->create->create(b->bops, b->width, b->height,
+				 tiling, 4*b->npixels);
+}
+
+static struct intel_buf *
 unmapped_create_bo(const struct buffers *b)
 {
-	return b->create->create(b->bufmgr, 4*b->npixels);
+	return create_bo(b, I915_TILING_NONE);
 }
 
 static void create_snoop_require(const struct create *create, unsigned count)
@@ -213,16 +241,15 @@ static void create_snoop_require(const struct create *create, unsigned count)
 	igt_require(!gem_has_llc(fd));
 }
 
-static drm_intel_bo *
+static struct intel_buf *
 snoop_create_bo(const struct buffers *b)
 {
-	drm_intel_bo *bo;
+	struct intel_buf *buf;
 
-	bo = unmapped_create_bo(b);
-	gem_set_caching(fd, bo->handle, I915_CACHING_CACHED);
-	drm_intel_bo_disable_reuse(bo);
+	buf = unmapped_create_bo(b);
+	gem_set_caching(fd, buf->handle, I915_CACHING_CACHED);
 
-	return bo;
+	return buf;
 }
 
 static void create_userptr_require(const struct create *create, unsigned count)
@@ -251,11 +278,11 @@ static void create_userptr_require(const struct create *create, unsigned count)
 	igt_require(has_userptr);
 }
 
-static drm_intel_bo *
+static struct intel_buf *
 userptr_create_bo(const struct buffers *b)
 {
 	struct drm_i915_gem_userptr userptr;
-	drm_intel_bo *bo;
+	struct intel_buf *buf;
 	void *ptr;
 
 	memset(&userptr, 0, sizeof(userptr));
@@ -266,54 +293,48 @@ userptr_create_bo(const struct buffers *b)
 	igt_assert(ptr != (void *)-1);
 	userptr.user_ptr = to_user_pointer(ptr);
 
-#if 0
 	do_or_die(drmIoctl(fd, DRM_IOCTL_I915_GEM_USERPTR, &userptr));
-	bo = gem_handle_to_libdrm_bo(b->bufmgr, fd, "userptr", userptr.handle);
-	gem_close(fd, userptr.handle);
-#else
-	bo = drm_intel_bo_alloc_userptr(b->bufmgr, "name",
-					ptr, I915_TILING_NONE, 0,
-					userptr.user_size, 0);
-	igt_assert(bo);
-#endif
-	bo->virtual = from_user_pointer(userptr.user_ptr);
+	buf = calloc(1, sizeof(*buf));
+	intel_buf_init_using_handle(b->bops, userptr.handle, buf, b->width,
+				    b->height, 32, 0, I915_TILING_NONE, 0);
+
+	buf->ptr = (void *) from_user_pointer(userptr.user_ptr);
 
-	return bo;
+	return buf;
 }
 
 static void
-userptr_set_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+userptr_set_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
 	int size = b->npixels;
-	uint32_t *vaddr = bo->virtual;
+	uint32_t *vaddr = buf->ptr;
 
-	gem_set_domain(fd, bo->handle,
+	gem_set_domain(fd, buf->handle,
 		       I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
 	while (size--)
 		*vaddr++ = val;
 }
 
 static void
-userptr_cmp_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+userptr_cmp_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
 	int size =  b->npixels;
-	uint32_t *vaddr = bo->virtual;
+	uint32_t *vaddr = buf->ptr;
 
-	gem_set_domain(fd, bo->handle,
+	gem_set_domain(fd, buf->handle,
 		       I915_GEM_DOMAIN_CPU, 0);
 	while (size--)
 		igt_assert_eq_u32(*vaddr++, val);
 }
 
 static void
-userptr_release_bo(drm_intel_bo *bo)
+userptr_release_bo(struct intel_buf *buf)
 {
-	igt_assert(bo->virtual);
+	igt_assert(buf->ptr);
 
-	munmap(bo->virtual, bo->size);
-	bo->virtual = NULL;
+	munmap(buf->ptr, buf->surface[0].size);
 
-	drm_intel_bo_unreference(bo);
+	intel_buf_destroy(buf);
 }
 
 static void create_dmabuf_require(const struct create *create, unsigned count)
@@ -349,13 +370,14 @@ struct dmabuf {
 	void *map;
 };
 
-static drm_intel_bo *
+static struct intel_buf *
 dmabuf_create_bo(const struct buffers *b)
 {
 	struct drm_prime_handle args;
-	drm_intel_bo *bo;
+	struct intel_buf *buf;
 	struct dmabuf *dmabuf;
 	int size;
+	uint32_t handle;
 
 	size = b->page_size;
 
@@ -366,9 +388,12 @@ dmabuf_create_bo(const struct buffers *b)
 
 	do_ioctl(fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &args);
 	gem_close(fd, args.handle);
+	igt_assert(args.fd != -1);
 
-	bo = drm_intel_bo_gem_create_from_prime(b->bufmgr, args.fd, size);
-	igt_assert(bo);
+	handle = prime_fd_to_handle(buf_ops_get_fd(b->bops), args.fd);
+	buf = calloc(1, sizeof(*buf));
+	intel_buf_init_using_handle(b->bops, handle, buf, b->width,
+				    b->height, 32, 0, I915_TILING_NONE, 0);
 
 	dmabuf = malloc(sizeof(*dmabuf));
 	igt_assert(dmabuf);
@@ -379,15 +404,15 @@ dmabuf_create_bo(const struct buffers *b)
 			   dmabuf->fd, 0);
 	igt_assert(dmabuf->map != (void *)-1);
 
-	bo->virtual = dmabuf;
+	buf->ptr = (void *) dmabuf;
 
-	return bo;
+	return buf;
 }
 
 static void
-dmabuf_set_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+dmabuf_set_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
-	struct dmabuf *dmabuf = bo->virtual;
+	struct dmabuf *dmabuf = (void *) buf->ptr;
 	uint32_t *v = dmabuf->map;
 	int y;
 
@@ -398,9 +423,9 @@ dmabuf_set_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
 }
 
 static void
-dmabuf_cmp_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+dmabuf_cmp_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
-	struct dmabuf *dmabuf = bo->virtual;
+	struct dmabuf *dmabuf = (void *) buf->ptr;
 	uint32_t *v = dmabuf->map;
 	int y;
 
@@ -411,17 +436,16 @@ dmabuf_cmp_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
 }
 
 static void
-dmabuf_release_bo(drm_intel_bo *bo)
+dmabuf_release_bo(struct intel_buf *buf)
 {
-	struct dmabuf *dmabuf = bo->virtual;
+	struct dmabuf *dmabuf = (void *) buf->ptr;
 	igt_assert(dmabuf);
 
-	munmap(dmabuf->map, bo->size);
+	munmap(dmabuf->map, buf->surface[0].size);
 	close(dmabuf->fd);
 	free(dmabuf);
 
-	bo->virtual = NULL;
-	drm_intel_bo_unreference(bo);
+	intel_buf_destroy(buf);
 }
 
 static bool has_prime_export(int _fd)
@@ -446,13 +470,14 @@ static void create_gtt_require(const struct create *create, unsigned count)
 	gem_require_mappable_ggtt(fd);
 }
 
-static drm_intel_bo *
+static struct intel_buf *
 vgem_create_bo(const struct buffers *b)
 {
 	struct drm_prime_handle args;
-	drm_intel_bo *bo;
+	struct intel_buf *buf;
 	struct vgem_bo vgem;
 	struct dmabuf *dmabuf;
+	uint32_t handle;
 
 	igt_assert(vgem_drv != -1);
 
@@ -470,8 +495,11 @@ vgem_create_bo(const struct buffers *b)
 	gem_close(vgem_drv, args.handle);
 	igt_assert(args.fd != -1);
 
-	bo = drm_intel_bo_gem_create_from_prime(b->bufmgr, args.fd, vgem.size);
-	igt_assert(bo);
+	handle = prime_fd_to_handle(buf_ops_get_fd(b->bops), args.fd);
+	buf = calloc(1, sizeof(*buf));
+	intel_buf_init_using_handle(b->bops, handle, buf,
+				    vgem.width, vgem.height, vgem.bpp,
+				    0, I915_TILING_NONE, 0);
 
 	dmabuf = malloc(sizeof(*dmabuf));
 	igt_assert(dmabuf);
@@ -482,64 +510,58 @@ vgem_create_bo(const struct buffers *b)
 			   dmabuf->fd, 0);
 	igt_assert(dmabuf->map != (void *)-1);
 
-	bo->virtual = dmabuf;
+	buf->ptr = (void *) dmabuf;
 
-	return bo;
+	return buf;
 }
 
 static void
-gtt_set_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+gtt_set_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
-	uint32_t *vaddr = bo->virtual;
+	uint32_t *vaddr = buf->ptr;
+
+	gem_set_domain(fd, buf->handle,
+		       I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
 
-	drm_intel_gem_bo_start_gtt_access(bo, true);
 	for (int y = 0; y < b->height; y++)
 		vaddr[pixel(y, b->width)] = val;
 }
 
 static void
-gtt_cmp_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+gtt_cmp_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
-	uint32_t *vaddr = bo->virtual;
+	uint32_t *vaddr = buf->ptr;
 
 	/* GTT access is slow. So we just compare a few points */
-	drm_intel_gem_bo_start_gtt_access(bo, false);
+	gem_set_domain(fd, buf->handle,
+		       I915_GEM_DOMAIN_GTT, 0);
+
 	for (int y = 0; y < b->height; y++)
 		igt_assert_eq_u32(vaddr[pixel(y, b->width)], val);
 }
 
-static drm_intel_bo *
-map_bo(drm_intel_bo *bo)
+static struct intel_buf *
+map_bo(struct intel_buf *buf)
 {
 	/* gtt map doesn't have a write parameter, so just keep the mapping
 	 * around (to avoid the set_domain with the gtt write domain set) and
 	 * manually tell the kernel when we start access the gtt. */
-	do_or_die(drm_intel_gem_bo_map_gtt(bo));
-
-	return bo;
-}
-
-static drm_intel_bo *
-tile_bo(drm_intel_bo *bo, int width)
-{
-	uint32_t tiling = I915_TILING_X;
-	uint32_t stride = width * 4;
-
-	do_or_die(drm_intel_bo_set_tiling(bo, &tiling, stride));
+	buf->ptr = gem_mmap__gtt(buf_ops_get_fd(buf->bops), buf->handle,
+				 buf->surface[0].size, PROT_READ | PROT_WRITE);
 
-	return bo;
+	return buf;
 }
 
-static drm_intel_bo *
+static struct intel_buf *
 gtt_create_bo(const struct buffers *b)
 {
 	return map_bo(unmapped_create_bo(b));
 }
 
-static drm_intel_bo *
+static struct intel_buf *
 gttX_create_bo(const struct buffers *b)
 {
-	return tile_bo(gtt_create_bo(b), b->width);
+	return map_bo(create_bo(b, I915_TILING_X));
 }
 
 static void bit17_require(void)
@@ -574,85 +596,83 @@ wc_create_require(const struct create *create, unsigned count)
 	wc_require();
 }
 
-static drm_intel_bo *
+static struct intel_buf *
 wc_create_bo(const struct buffers *b)
 {
-	drm_intel_bo *bo;
+	struct intel_buf *buf;
 
-	bo = unmapped_create_bo(b);
-	bo->virtual = gem_mmap__wc(fd, bo->handle, 0, bo->size, PROT_READ | PROT_WRITE);
-	return bo;
+	buf = unmapped_create_bo(b);
+	buf->ptr = gem_mmap__wc(fd, buf->handle, 0, buf->surface[0].size,
+				PROT_READ | PROT_WRITE);
+	return buf;
 }
 
 static void
-wc_release_bo(drm_intel_bo *bo)
+wc_release_bo(struct intel_buf *buf)
 {
-	igt_assert(bo->virtual);
+	igt_assert(buf->ptr);
 
-	munmap(bo->virtual, bo->size);
-	bo->virtual = NULL;
+	munmap(buf->ptr, buf->surface[0].size);
+	buf->ptr = 0;
 
-	nop_release_bo(bo);
+	nop_release_bo(buf);
 }
 
-static drm_intel_bo *
+static struct intel_buf *
 gpu_create_bo(const struct buffers *b)
 {
 	return unmapped_create_bo(b);
 }
 
-static drm_intel_bo *
+static struct intel_buf *
 gpuX_create_bo(const struct buffers *b)
 {
-	return tile_bo(gpu_create_bo(b), b->width);
+	return create_bo(b, I915_TILING_X);
 }
 
 static void
-cpu_set_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+cpu_set_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
 	int size = b->npixels;
 	uint32_t *vaddr;
 
-	do_or_die(drm_intel_bo_map(bo, true));
-	vaddr = bo->virtual;
+	intel_buf_cpu_map(buf, true);
+	vaddr = buf->ptr;
 	while (size--)
 		*vaddr++ = val;
-	drm_intel_bo_unmap(bo);
+	intel_buf_unmap(buf);
 }
 
 static void
-cpu_cmp_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+cpu_cmp_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
 	int size = b->npixels;
 	uint32_t *vaddr;
 
-	do_or_die(drm_intel_bo_map(bo, false));
-	vaddr = bo->virtual;
+	intel_buf_cpu_map(buf, false);
+	vaddr = buf->ptr;
 	while (size--)
 		igt_assert_eq_u32(*vaddr++, val);
-	drm_intel_bo_unmap(bo);
+	intel_buf_unmap(buf);
 }
 
 static void
-gpu_set_bo(struct buffers *buffers, drm_intel_bo *bo, uint32_t val)
+gpu_set_bo(struct buffers *buffers, struct intel_buf *buf, uint32_t val)
 {
 	struct drm_i915_gem_relocation_entry reloc[1];
 	struct drm_i915_gem_exec_object2 gem_exec[2];
 	struct drm_i915_gem_execbuffer2 execbuf;
-	uint32_t buf[10], *b;
-	uint32_t tiling, swizzle;
-
-	drm_intel_bo_get_tiling(bo, &tiling, &swizzle);
+	uint32_t tmp[10], *b;
 
 	memset(reloc, 0, sizeof(reloc));
 	memset(gem_exec, 0, sizeof(gem_exec));
 	memset(&execbuf, 0, sizeof(execbuf));
 
-	b = buf;
+	b = tmp;
 	*b++ = XY_COLOR_BLT_CMD_NOLEN |
 		((gen >= 8) ? 5 : 4) |
 		COLOR_BLT_WRITE_ALPHA | XY_COLOR_BLT_WRITE_RGB;
-	if (gen >= 4 && tiling) {
+	if (gen >= 4 && buf->tiling) {
 		b[-1] |= XY_COLOR_BLT_TILED;
 		*b = buffers->width;
 	} else
@@ -660,8 +680,8 @@ gpu_set_bo(struct buffers *buffers, drm_intel_bo *bo, uint32_t val)
 	*b++ |= 0xf0 << 16 | 1 << 25 | 1 << 24;
 	*b++ = 0;
 	*b++ = buffers->height << 16 | buffers->width;
-	reloc[0].offset = (b - buf) * sizeof(uint32_t);
-	reloc[0].target_handle = bo->handle;
+	reloc[0].offset = (b - tmp) * sizeof(uint32_t);
+	reloc[0].target_handle = buf->handle;
 	reloc[0].read_domains = I915_GEM_DOMAIN_RENDER;
 	reloc[0].write_domain = I915_GEM_DOMAIN_RENDER;
 	*b++ = 0;
@@ -669,10 +689,10 @@ gpu_set_bo(struct buffers *buffers, drm_intel_bo *bo, uint32_t val)
 		*b++ = 0;
 	*b++ = val;
 	*b++ = MI_BATCH_BUFFER_END;
-	if ((b - buf) & 1)
+	if ((b - tmp) & 1)
 		*b++ = 0;
 
-	gem_exec[0].handle = bo->handle;
+	gem_exec[0].handle = buf->handle;
 	gem_exec[0].flags = EXEC_OBJECT_NEEDS_FENCE;
 
 	gem_exec[1].handle = gem_create(fd, 4096);
@@ -681,30 +701,30 @@ gpu_set_bo(struct buffers *buffers, drm_intel_bo *bo, uint32_t val)
 
 	execbuf.buffers_ptr = to_user_pointer(gem_exec);
 	execbuf.buffer_count = 2;
-	execbuf.batch_len = (b - buf) * sizeof(buf[0]);
+	execbuf.batch_len = (b - tmp) * sizeof(tmp[0]);
 	if (gen >= 6)
 		execbuf.flags = I915_EXEC_BLT;
 
-	gem_write(fd, gem_exec[1].handle, 0, buf, execbuf.batch_len);
+	gem_write(fd, gem_exec[1].handle, 0, tmp, execbuf.batch_len);
 	gem_execbuf(fd, &execbuf);
 
 	gem_close(fd, gem_exec[1].handle);
 }
 
 static void
-gpu_cmp_bo(struct buffers *b, drm_intel_bo *bo, uint32_t val)
+gpu_cmp_bo(struct buffers *b, struct intel_buf *buf, uint32_t val)
 {
-	blt_copy_bo(b, b->snoop, bo);
+	blt_copy_bo(b, b->snoop, buf);
 	cpu_cmp_bo(b, b->snoop, val);
 }
 
 struct access_mode {
 	const char *name;
 	void (*require)(const struct create *, unsigned);
-	drm_intel_bo *(*create_bo)(const struct buffers *b);
-	void (*set_bo)(struct buffers *b, drm_intel_bo *bo, uint32_t val);
-	void (*cmp_bo)(struct buffers *b, drm_intel_bo *bo, uint32_t val);
-	void (*release_bo)(drm_intel_bo *bo);
+	struct intel_buf *(*create_bo)(const struct buffers *b);
+	void (*set_bo)(struct buffers *b, struct intel_buf *buf, uint32_t val);
+	void (*cmp_bo)(struct buffers *b, struct intel_buf *buf, uint32_t val);
+	void (*release_bo)(struct intel_buf *buf);
 };
 igt_render_copyfunc_t rendercopy;
 
@@ -745,7 +765,7 @@ static void buffers_init(struct buffers *b,
 			 const struct access_mode *mode,
 			 const struct size *size,
 			 int num_buffers,
-			 int _fd, int enable_reuse)
+			 int _fd)
 {
 	memset(b, 0, sizeof(*b));
 	b->name = name;
@@ -763,17 +783,13 @@ static void buffers_init(struct buffers *b,
 	b->tmp = malloc(b->page_size);
 	igt_assert(b->tmp);
 
-	b->bufmgr = drm_intel_bufmgr_gem_init(_fd, 4096);
-	igt_assert(b->bufmgr);
+	b->bops = buf_ops_create(_fd);
 
-	b->src = malloc(2*sizeof(drm_intel_bo *)*num_buffers);
+	b->src = malloc(2*sizeof(struct intel_buf *)*num_buffers);
 	igt_assert(b->src);
 	b->dst = b->src + num_buffers;
 
-	if (enable_reuse)
-		drm_intel_bufmgr_gem_enable_reuse(b->bufmgr);
-	b->batch = intel_batchbuffer_alloc(b->bufmgr, devid);
-	igt_assert(b->batch);
+	b->ibb = intel_bb_create(_fd, 4096);
 }
 
 static void buffers_destroy(struct buffers *b)
@@ -809,7 +825,7 @@ static void buffers_destroy(struct buffers *b)
 static void buffers_create(struct buffers *b)
 {
 	int count = b->num_buffers;
-	igt_assert(b->bufmgr);
+	igt_assert(b->bops);
 
 	buffers_destroy(b);
 	igt_assert(b->count == 0);
@@ -823,20 +839,15 @@ static void buffers_create(struct buffers *b)
 	b->snoop = snoop_create_bo(b);
 }
 
-static void buffers_reset(struct buffers *b, bool enable_reuse)
+static void buffers_reset(struct buffers *b)
 {
-	b->bufmgr = drm_intel_bufmgr_gem_init(fd, 4096);
-	igt_assert(b->bufmgr);
-
-	if (enable_reuse)
-		drm_intel_bufmgr_gem_enable_reuse(b->bufmgr);
-	b->batch = intel_batchbuffer_alloc(b->bufmgr, devid);
-	igt_assert(b->batch);
+	b->bops = buf_ops_create(fd);
+	b->ibb = intel_bb_create(fd, 4096);
 }
 
 static void buffers_fini(struct buffers *b)
 {
-	if (b->bufmgr == NULL)
+	if (b->bops == NULL)
 		return;
 
 	buffers_destroy(b);
@@ -844,58 +855,47 @@ static void buffers_fini(struct buffers *b)
 	free(b->tmp);
 	free(b->src);
 
-	intel_batchbuffer_free(b->batch);
-	drm_intel_bufmgr_destroy(b->bufmgr);
+	intel_bb_destroy(b->ibb);
+	buf_ops_destroy(b->bops);
 
 	memset(b, 0, sizeof(*b));
 }
 
-typedef void (*do_copy)(struct buffers *b, drm_intel_bo *dst, drm_intel_bo *src);
+typedef void (*do_copy)(struct buffers *b, struct intel_buf *dst,
+			struct intel_buf *src);
 typedef igt_hang_t (*do_hang)(void);
 
-static void render_copy_bo(struct buffers *b, drm_intel_bo *dst, drm_intel_bo *src)
+static void render_copy_bo(struct buffers *b, struct intel_buf *dst,
+			   struct intel_buf *src)
 {
-	struct igt_buf d = {
-		.bo = dst,
-		.num_tiles = b->npixels * 4,
-		.surface[0] = {
-			.size = b->npixels * 4, .stride = b->width * 4,
-		},
-		.bpp = 32,
-	}, s = {
-		.bo = src,
-		.num_tiles = b->npixels * 4,
-		.surface[0] = {
-			.size = b->npixels * 4, .stride = b->width * 4,
-		},
-		.bpp = 32,
-	};
-	uint32_t swizzle;
-
-	drm_intel_bo_get_tiling(dst, &d.tiling, &swizzle);
-	drm_intel_bo_get_tiling(src, &s.tiling, &swizzle);
-
-	rendercopy(b->batch, NULL,
-		   &s, 0, 0,
+	rendercopy(b->ibb, 0,
+		   src, 0, 0,
 		   b->width, b->height,
-		   &d, 0, 0);
+		   dst, 0, 0);
+	intel_bb_sync(b->ibb);
+	intel_bb_reset(b->ibb, true);
 }
 
-static void blt_copy_bo(struct buffers *b, drm_intel_bo *dst, drm_intel_bo *src)
+static void blt_copy_bo(struct buffers *b, struct intel_buf *dst,
+			struct intel_buf *src)
 {
-	intel_blt_copy(b->batch,
+	intel_bb_blt_copy(b->ibb,
 		       src, 0, 0, 4*b->width,
 		       dst, 0, 0, 4*b->width,
 		       b->width, b->height, 32);
+	intel_bb_sync(b->ibb);
+	intel_bb_reset(b->ibb, true);
 }
 
-static void cpu_copy_bo(struct buffers *b, drm_intel_bo *dst, drm_intel_bo *src)
+static void cpu_copy_bo(struct buffers *b, struct intel_buf *dst,
+			struct intel_buf *src)
 {
 	const int size = b->page_size;
 	void *d, *s;
 
 	gem_set_domain(fd, src->handle, I915_GEM_DOMAIN_CPU, 0);
-	gem_set_domain(fd, dst->handle, I915_GEM_DOMAIN_CPU, I915_GEM_DOMAIN_CPU);
+	gem_set_domain(fd, dst->handle, I915_GEM_DOMAIN_CPU,
+		       I915_GEM_DOMAIN_CPU);
 	s = gem_mmap__cpu(fd, src->handle, 0, size, PROT_READ);
 	d = gem_mmap__cpu(fd, dst->handle, 0, size, PROT_WRITE);
 
@@ -905,13 +905,15 @@ static void cpu_copy_bo(struct buffers *b, drm_intel_bo *dst, drm_intel_bo *src)
 	munmap(s, size);
 }
 
-static void gtt_copy_bo(struct buffers *b, drm_intel_bo *dst, drm_intel_bo *src)
+static void gtt_copy_bo(struct buffers *b, struct intel_buf *dst,
+			struct intel_buf *src)
 {
 	const int size = b->page_size;
 	void *d, *s;
 
 	gem_set_domain(fd, src->handle, I915_GEM_DOMAIN_GTT, 0);
-	gem_set_domain(fd, dst->handle, I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
+	gem_set_domain(fd, dst->handle, I915_GEM_DOMAIN_GTT,
+		       I915_GEM_DOMAIN_GTT);
 
 	s = gem_mmap__gtt(fd, src->handle, size, PROT_READ);
 	d = gem_mmap__gtt(fd, dst->handle, size, PROT_WRITE);
@@ -922,13 +924,15 @@ static void gtt_copy_bo(struct buffers *b, drm_intel_bo *dst, drm_intel_bo *src)
 	munmap(s, size);
 }
 
-static void wc_copy_bo(struct buffers *b, drm_intel_bo *dst, drm_intel_bo *src)
+static void wc_copy_bo(struct buffers *b, struct intel_buf *dst,
+		       struct intel_buf *src)
 {
 	const int size = b->page_size;
 	void *d, *s;
 
 	gem_set_domain(fd, src->handle, I915_GEM_DOMAIN_WC, 0);
-	gem_set_domain(fd, dst->handle, I915_GEM_DOMAIN_WC, I915_GEM_DOMAIN_WC);
+	gem_set_domain(fd, dst->handle, I915_GEM_DOMAIN_WC,
+		       I915_GEM_DOMAIN_WC);
 
 	s = gem_mmap__wc(fd, src->handle, 0, size, PROT_READ);
 	d = gem_mmap__wc(fd, dst->handle, 0, size, PROT_WRITE);
@@ -1353,10 +1357,10 @@ static void __run_forked(struct buffers *buffers,
 			 do_hang do_hang_func)
 
 {
-	/* purge the libdrm caches before cloing the process */
+	/* purge the caches before cloning the process */
 	buffers_destroy(buffers);
-	intel_batchbuffer_free(buffers->batch);
-	drm_intel_bufmgr_destroy(buffers->bufmgr);
+	intel_bb_destroy(buffers->ibb);
+	buf_ops_destroy(buffers->bops);
 
 	igt_fork(child, num_children) {
 		int num_buffers;
@@ -1369,7 +1373,7 @@ static void __run_forked(struct buffers *buffers,
 		if (num_buffers < buffers->num_buffers)
 			buffers->num_buffers = num_buffers;
 
-		buffers_reset(buffers, true);
+		buffers_reset(buffers);
 		buffers_create(buffers);
 
 		igt_while_interruptible(interrupt) {
@@ -1382,7 +1386,7 @@ static void __run_forked(struct buffers *buffers,
 	igt_waitchildren();
 	igt_assert_eq(intel_detect_and_clear_missed_interrupts(fd), 0);
 
-	buffers_reset(buffers, true);
+	buffers_reset(buffers);
 }
 
 static void run_forked(struct buffers *buffers,
@@ -1459,8 +1463,7 @@ run_mode(const char *prefix,
 
 	igt_fixture
 		buffers_init(&buffers, prefix, create, mode,
-			     size, num_buffers,
-			     fd, run_wrap_func != run_child);
+			     size, num_buffers, fd);
 
 	for (h = hangs; h->suffix; h++) {
 		if (!all && *h->suffix)
diff --git a/tests/i915/gem_ppgtt.c b/tests/i915/gem_ppgtt.c
index 8c02e4af..d56e808e 100644
--- a/tests/i915/gem_ppgtt.c
+++ b/tests/i915/gem_ppgtt.c
@@ -38,56 +38,46 @@
 #include "i915/gem.h"
 #include "igt.h"
 #include "igt_debugfs.h"
-#include "intel_bufmgr.h"
 
 #define WIDTH 512
 #define STRIDE (WIDTH*4)
 #define HEIGHT 512
 #define SIZE (HEIGHT*STRIDE)
 
-static drm_intel_bo *create_bo(drm_intel_bufmgr *bufmgr,
-			       uint32_t pixel)
+static struct intel_buf *create_bo(struct buf_ops *bops, uint32_t pixel)
 {
 	uint64_t value = (uint64_t)pixel << 32 | pixel, *v;
-	drm_intel_bo *bo;
+	struct intel_buf *buf;
 
-	bo = drm_intel_bo_alloc(bufmgr, "surface", SIZE, 4096);
-	igt_assert(bo);
+	buf = intel_buf_create(bops, WIDTH, HEIGHT, 32, 0, I915_TILING_NONE, 0);
+	v = intel_buf_cpu_map(buf, 1);
 
-	do_or_die(drm_intel_bo_map(bo, 1));
-	v = bo->virtual;
 	for (int i = 0; i < SIZE / sizeof(value); i += 8) {
 		v[i + 0] = value; v[i + 1] = value;
 		v[i + 2] = value; v[i + 3] = value;
 		v[i + 4] = value; v[i + 5] = value;
 		v[i + 6] = value; v[i + 7] = value;
 	}
-	drm_intel_bo_unmap(bo);
+	intel_buf_unmap(buf);
 
-	return bo;
+	return buf;
 }
 
-static void scratch_buf_init(struct igt_buf *buf,
-			     drm_intel_bufmgr *bufmgr,
-			     uint32_t pixel)
+static void cleanup_bufs(struct intel_buf **buf, int count)
 {
-	memset(buf, 0, sizeof(*buf));
+	for (int child = 0; child < count; child++) {
+		struct buf_ops *bops = buf[child]->bops;
+		int fd = buf_ops_get_fd(buf[child]->bops);
 
-	buf->bo = create_bo(bufmgr, pixel);
-	buf->surface[0].stride = STRIDE;
-	buf->tiling = I915_TILING_NONE;
-	buf->surface[0].size = SIZE;
-	buf->bpp = 32;
-}
+		intel_buf_destroy(buf[child]);
+		buf_ops_destroy(bops);
+		close(fd);
+	}
 
-static void scratch_buf_fini(struct igt_buf *buf)
-{
-	drm_intel_bo_unreference(buf->bo);
-	memset(buf, 0, sizeof(*buf));
 }
 
 static void fork_rcs_copy(int timeout, uint32_t final,
-			  drm_intel_bo **dst, int count,
+			  struct intel_buf **dst, int count,
 			  unsigned flags)
 #define CREATE_CONTEXT 0x1
 {
@@ -102,21 +92,13 @@ static void fork_rcs_copy(int timeout, uint32_t final,
 
 	for (int child = 0; child < count; child++) {
 		int fd = drm_open_driver(DRIVER_INTEL);
-		drm_intel_bufmgr *bufmgr;
+		struct buf_ops *bops;
 
 		devid = intel_get_drm_devid(fd);
 
-		bufmgr = drm_intel_bufmgr_gem_init(fd, 4096);
-		igt_assert(bufmgr);
-
-		dst[child] = create_bo(bufmgr, ~0);
+		bops = buf_ops_create(fd);
 
-		if (flags & CREATE_CONTEXT) {
-			drm_intel_context *ctx;
-
-			ctx = drm_intel_gem_context_create(dst[child]->bufmgr);
-			igt_require(ctx);
-		}
+		dst[child] = create_bo(bops, ~0);
 
 		render_copy = igt_get_render_copyfunc(devid);
 		igt_require_f(render_copy,
@@ -124,113 +106,106 @@ static void fork_rcs_copy(int timeout, uint32_t final,
 	}
 
 	igt_fork(child, count) {
-		struct intel_batchbuffer *batch;
-		struct igt_buf buf = {};
-		struct igt_buf src;
+		struct intel_bb *ibb;
+		uint32_t ctx = 0;
+		struct intel_buf *src;
 		unsigned long i;
 
-		batch = intel_batchbuffer_alloc(dst[child]->bufmgr,
-						devid);
-		igt_assert(batch);
-
-		if (flags & CREATE_CONTEXT) {
-			drm_intel_context *ctx;
+		ibb = intel_bb_create(buf_ops_get_fd(dst[child]->bops), 4096);
 
-			ctx = drm_intel_gem_context_create(dst[child]->bufmgr);
-			intel_batchbuffer_set_context(batch, ctx);
-		}
-
-		buf.bo = dst[child];
-		buf.surface[0].stride = STRIDE;
-		buf.tiling = I915_TILING_NONE;
-		buf.surface[0].size = SIZE;
-		buf.bpp = 32;
+		if (flags & CREATE_CONTEXT)
+			ctx = gem_context_create(buf_ops_get_fd(dst[child]->bops));
 
 		i = 0;
 		igt_until_timeout(timeout) {
-			scratch_buf_init(&src, dst[child]->bufmgr,
-					 i++ | child << 16);
-			render_copy(batch, NULL,
-				    &src, 0, 0,
+			src = create_bo(dst[child]->bops,
+					i++ | child << 16);
+			render_copy(ibb, ctx,
+				    src, 0, 0,
 				    WIDTH, HEIGHT,
-				    &buf, 0, 0);
-			scratch_buf_fini(&src);
+				    dst[child], 0, 0);
+
+			intel_buf_destroy(src);
 		}
 
-		scratch_buf_init(&src, dst[child]->bufmgr,
-				 final | child << 16);
-		render_copy(batch, NULL,
-			    &src, 0, 0,
+		src = create_bo(dst[child]->bops,
+				final | child << 16);
+		render_copy(ibb, ctx,
+			    src, 0, 0,
 			    WIDTH, HEIGHT,
-			    &buf, 0, 0);
-		scratch_buf_fini(&src);
+			    dst[child], 0, 0);
+		intel_bb_sync(ibb);
+
+		intel_buf_destroy(src);
+
+		intel_bb_destroy(ibb);
 	}
 }
 
 static void fork_bcs_copy(int timeout, uint32_t final,
-			  drm_intel_bo **dst, int count)
+			  struct intel_buf **dst, int count)
 {
-	int devid;
-
 	for (int child = 0; child < count; child++) {
-		drm_intel_bufmgr *bufmgr;
+		struct buf_ops *bops;
 		int fd = drm_open_driver(DRIVER_INTEL);
 
-		devid = intel_get_drm_devid(fd);
-
-		bufmgr = drm_intel_bufmgr_gem_init(fd, 4096);
-		igt_assert(bufmgr);
-
-		dst[child] = create_bo(bufmgr, ~0);
+		bops = buf_ops_create(fd);
+		dst[child] = create_bo(bops, ~0);
 	}
 
 	igt_fork(child, count) {
-		struct intel_batchbuffer *batch;
-		drm_intel_bo *src[2];
+		struct intel_buf *src[2];
+		struct intel_bb *ibb;
 		unsigned long i;
 
-
-		batch = intel_batchbuffer_alloc(dst[child]->bufmgr,
-						devid);
-		igt_assert(batch);
+		ibb = intel_bb_create(buf_ops_get_fd(dst[child]->bops), 4096);
 
 		i = 0;
 		igt_until_timeout(timeout) {
-			src[0] = create_bo(dst[child]->bufmgr,
+			src[0] = create_bo(dst[child]->bops,
 					   ~0);
-			src[1] = create_bo(dst[child]->bufmgr,
+			src[1] = create_bo(dst[child]->bops,
 					   i++ | child << 16);
 
-			intel_copy_bo(batch, src[0], src[1], SIZE);
-			intel_copy_bo(batch, dst[child], src[0], SIZE);
+			intel_bb_blt_copy(ibb, src[1], 0, 0, 4096,
+					  src[0], 0, 0, 4096,
+					  4096/4, SIZE/4096, 32);
+			intel_bb_blt_copy(ibb, src[0], 0, 0, 4096,
+					  dst[child], 0, 0, 4096,
+					  4096/4, SIZE/4096, 32);
 
-			drm_intel_bo_unreference(src[1]);
-			drm_intel_bo_unreference(src[0]);
+			intel_buf_destroy(src[1]);
+			intel_buf_destroy(src[0]);
 		}
 
-		src[0] = create_bo(dst[child]->bufmgr,
-				   ~0);
-		src[1] = create_bo(dst[child]->bufmgr,
+		src[0] = create_bo(dst[child]->bops, ~0);
+		src[1] = create_bo(dst[child]->bops,
 				   final | child << 16);
 
-		intel_copy_bo(batch, src[0], src[1], SIZE);
-		intel_copy_bo(batch, dst[child], src[0], SIZE);
+		intel_bb_blt_copy(ibb, src[1], 0, 0, 4096,
+				  src[0], 0, 0, 4096,
+				  4096/4, SIZE/4096, 32);
+		intel_bb_blt_copy(ibb, src[0], 0, 0, 4096,
+				  dst[child], 0, 0, 4096,
+				  4096/4, SIZE/4096, 32);
+		intel_bb_sync(ibb);
+
+		intel_buf_destroy(src[1]);
+		intel_buf_destroy(src[0]);
 
-		drm_intel_bo_unreference(src[1]);
-		drm_intel_bo_unreference(src[0]);
+		intel_bb_destroy(ibb);
 	}
 }
 
-static void surfaces_check(drm_intel_bo **bo, int count, uint32_t expected)
+static void surfaces_check(struct intel_buf **buf, int count, uint32_t expected)
 {
 	for (int child = 0; child < count; child++) {
 		uint32_t *ptr;
 
-		do_or_die(drm_intel_bo_map(bo[child], 0));
-		ptr = bo[child]->virtual;
+		ptr = intel_buf_cpu_map(buf[child], 0);
 		for (int j = 0; j < SIZE/4; j++)
 			igt_assert_eq(ptr[j], expected | child << 16);
-		drm_intel_bo_unmap(bo[child]);
+		intel_buf_unmap(buf[child]);
 	}
 }
 
@@ -301,7 +276,7 @@ igt_main
 	}
 
 	igt_subtest("blt-vs-render-ctx0") {
-		drm_intel_bo *bcs[1], *rcs[N_CHILD];
+		struct intel_buf *bcs[1], *rcs[N_CHILD];
 
 		fork_bcs_copy(30, 0x4000, bcs, 1);
 		fork_rcs_copy(30, 0x8000 / N_CHILD, rcs, N_CHILD, 0);
@@ -310,10 +285,13 @@ igt_main
 
 		surfaces_check(bcs, 1, 0x4000);
 		surfaces_check(rcs, N_CHILD, 0x8000 / N_CHILD);
+
+		cleanup_bufs(bcs, 1);
+		cleanup_bufs(rcs, N_CHILD);
 	}
 
 	igt_subtest("blt-vs-render-ctxN") {
-		drm_intel_bo *bcs[1], *rcs[N_CHILD];
+		struct intel_buf *bcs[1], *rcs[N_CHILD];
 
 		fork_rcs_copy(30, 0x8000 / N_CHILD, rcs, N_CHILD, CREATE_CONTEXT);
 		fork_bcs_copy(30, 0x4000, bcs, 1);
@@ -322,6 +300,9 @@ igt_main
 
 		surfaces_check(bcs, 1, 0x4000);
 		surfaces_check(rcs, N_CHILD, 0x8000 / N_CHILD);
+
+		cleanup_bufs(bcs, 1);
+		cleanup_bufs(rcs, N_CHILD);
 	}
 
 	igt_subtest("flink-and-close-vma-leak")
diff --git a/tests/kms_draw_crc.c b/tests/kms_draw_crc.c
index 70b9b05f..ffd655b0 100644
--- a/tests/kms_draw_crc.c
+++ b/tests/kms_draw_crc.c
@@ -38,7 +38,7 @@ struct modeset_params {
 int drm_fd;
 drmModeResPtr drm_res;
 drmModeConnectorPtr drm_connectors[MAX_CONNECTORS];
-drm_intel_bufmgr *bufmgr;
+struct buf_ops *bops;
 igt_pipe_crc_t *pipe_crc;
 
 #define N_FORMATS 3
@@ -126,23 +126,23 @@ static void get_method_crc(enum igt_draw_method method, uint32_t drm_format,
 
 	igt_create_fb(drm_fd, ms.mode->hdisplay, ms.mode->vdisplay,
 		      drm_format, tiling, &fb);
-	igt_draw_rect_fb(drm_fd, bufmgr, NULL, &fb, method,
+	igt_draw_rect_fb(drm_fd, bops, 0, &fb, method,
 			 0, 0, fb.width, fb.height,
 			 get_color(drm_format, 0, 0, 1));
 
-	igt_draw_rect_fb(drm_fd, bufmgr, NULL, &fb, method,
+	igt_draw_rect_fb(drm_fd, bops, 0, &fb, method,
 			 fb.width / 4, fb.height / 4,
 			 fb.width / 2, fb.height / 2,
 			 get_color(drm_format, 0, 1, 0));
-	igt_draw_rect_fb(drm_fd, bufmgr, NULL, &fb, method,
+	igt_draw_rect_fb(drm_fd, bops, 0, &fb, method,
 			 fb.width / 8, fb.height / 8,
 			 fb.width / 4, fb.height / 4,
 			 get_color(drm_format, 1, 0, 0));
-	igt_draw_rect_fb(drm_fd, bufmgr, NULL, &fb, method,
+	igt_draw_rect_fb(drm_fd, bops, 0, &fb, method,
 			 fb.width / 2, fb.height / 2,
 			 fb.width / 3, fb.height / 3,
 			 get_color(drm_format, 1, 0, 1));
-	igt_draw_rect_fb(drm_fd, bufmgr, NULL, &fb, method, 1, 1, 15, 15,
+	igt_draw_rect_fb(drm_fd, bops, 0, &fb, method, 1, 1, 15, 15,
 			 get_color(drm_format, 0, 1, 1));
 
 	rc = drmModeSetCrtc(drm_fd, ms.crtc_id, fb.fb_id, 0, 0,
@@ -228,7 +228,7 @@ static void fill_fb_subtest(void)
 	igt_create_fb(drm_fd, ms.mode->hdisplay, ms.mode->vdisplay,
 		      DRM_FORMAT_XRGB8888, LOCAL_DRM_FORMAT_MOD_NONE, &fb);
 
-	igt_draw_rect_fb(drm_fd, bufmgr, NULL, &fb,
+	igt_draw_rect_fb(drm_fd, bops, 0, &fb,
 			 gem_has_mappable_ggtt(drm_fd) ? IGT_DRAW_MMAP_GTT :
 							 IGT_DRAW_MMAP_WC,
 			 0, 0, fb.width, fb.height, 0xFF);
@@ -270,9 +270,7 @@ static void setup_environment(void)
 
 	kmstest_set_vt_graphics_mode();
 
-	bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
-	igt_assert(bufmgr);
-	drm_intel_bufmgr_gem_enable_reuse(bufmgr);
+	bops = buf_ops_create(drm_fd);
 
 	find_modeset_params();
 	pipe_crc = igt_pipe_crc_new(drm_fd, kmstest_get_crtc_idx(drm_res, ms.crtc_id),
@@ -285,7 +283,7 @@ static void teardown_environment(void)
 
 	igt_pipe_crc_free(pipe_crc);
 
-	drm_intel_bufmgr_destroy(bufmgr);
+	buf_ops_destroy(bops);
 
 	for (i = 0; i < drm_res->count_connectors; i++)
 		drmModeFreeConnector(drm_connectors[i]);
diff --git a/tests/kms_frontbuffer_tracking.c b/tests/kms_frontbuffer_tracking.c
index 780fecfe..14f522d8 100644
--- a/tests/kms_frontbuffer_tracking.c
+++ b/tests/kms_frontbuffer_tracking.c
@@ -166,7 +166,7 @@ struct {
 	int debugfs;
 	igt_display_t display;
 
-	drm_intel_bufmgr *bufmgr;
+	struct buf_ops *bops;
 } drm;
 
 struct {
@@ -1091,7 +1091,7 @@ static void draw_rect(struct draw_pattern_info *pattern, struct fb_region *fb,
 {
 	struct rect rect = pattern->get_rect(fb, r);
 
-	igt_draw_rect_fb(drm.fd, drm.bufmgr, NULL, fb->fb, method,
+	igt_draw_rect_fb(drm.fd, drm.bops, 0, fb->fb, method,
 			 fb->x + rect.x, fb->y + rect.y,
 			 rect.w, rect.h, rect.color);
 
@@ -1117,7 +1117,7 @@ static void fill_fb_region(struct fb_region *region, enum color ecolor)
 {
 	uint32_t color = pick_color(region->fb, ecolor);
 
-	igt_draw_rect_fb(drm.fd, drm.bufmgr, NULL, region->fb, IGT_DRAW_BLT,
+	igt_draw_rect_fb(drm.fd, drm.bops, 0, region->fb, IGT_DRAW_BLT,
 			 region->x, region->y, region->w, region->h,
 			 color);
 }
@@ -1141,7 +1141,7 @@ static bool disable_features(const struct test_mode *t)
 static void *busy_thread_func(void *data)
 {
 	while (!busy_thread.stop)
-		igt_draw_rect(drm.fd, drm.bufmgr, NULL, busy_thread.handle,
+		igt_draw_rect(drm.fd, drm.bops, 0, busy_thread.handle,
 			      busy_thread.size, busy_thread.stride,
 			      busy_thread.tiling, IGT_DRAW_BLT, 0, 0,
 			      busy_thread.width, busy_thread.height,
@@ -1288,14 +1288,12 @@ static void setup_drm(void)
 	kmstest_set_vt_graphics_mode();
 	igt_display_require(&drm.display, drm.fd);
 
-	drm.bufmgr = drm_intel_bufmgr_gem_init(drm.fd, 4096);
-	igt_assert(drm.bufmgr);
-	drm_intel_bufmgr_gem_enable_reuse(drm.bufmgr);
+	drm.bops = buf_ops_create(drm.fd);
 }
 
 static void teardown_drm(void)
 {
-	drm_intel_bufmgr_destroy(drm.bufmgr);
+	buf_ops_destroy(drm.bops);
 	igt_display_fini(&drm.display);
 	close(drm.fd);
 }
@@ -2622,14 +2620,14 @@ static void scaledprimary_subtest(const struct test_mode *t)
 		  t->tiling, t->plane, &new_fb);
 	fill_fb(&new_fb, COLOR_BLUE);
 
-	igt_draw_rect_fb(drm.fd, drm.bufmgr, NULL, &new_fb, t->method,
+	igt_draw_rect_fb(drm.fd, drm.bops, 0, &new_fb, t->method,
 			 reg->x, reg->y, reg->w / 2, reg->h / 2,
 			 pick_color(&new_fb, COLOR_GREEN));
-	igt_draw_rect_fb(drm.fd, drm.bufmgr, NULL, &new_fb, t->method,
+	igt_draw_rect_fb(drm.fd, drm.bops, 0, &new_fb, t->method,
 			 reg->x + reg->w / 2, reg->y + reg->h / 2,
 			 reg->w / 2, reg->h / 2,
 			 pick_color(&new_fb, COLOR_RED));
-	igt_draw_rect_fb(drm.fd, drm.bufmgr, NULL, &new_fb, t->method,
+	igt_draw_rect_fb(drm.fd, drm.bops, 0, &new_fb, t->method,
 			 reg->x + reg->w / 2, reg->y + reg->h / 2,
 			 reg->w / 4, reg->h / 4,
 			 pick_color(&new_fb, COLOR_MAGENTA));
diff --git a/tests/kms_psr.c b/tests/kms_psr.c
index 49ea446a..d2c5c540 100644
--- a/tests/kms_psr.c
+++ b/tests/kms_psr.c
@@ -29,7 +29,6 @@
 #include <stdbool.h>
 #include <stdio.h>
 #include <string.h>
-#include "intel_bufmgr.h"
 
 enum operations {
 	PAGE_FLIP,
@@ -65,7 +64,7 @@ typedef struct {
 	uint32_t devid;
 	uint32_t crtc_id;
 	igt_display_t display;
-	drm_intel_bufmgr *bufmgr;
+	struct buf_ops *bops;
 	struct igt_fb fb_green, fb_white;
 	igt_plane_t *test_plane;
 	int mod_size;
@@ -123,73 +122,91 @@ static void display_fini(data_t *data)
 	igt_display_fini(&data->display);
 }
 
-static void fill_blt(data_t *data, uint32_t handle, unsigned char color)
+static void color_blit_start(struct intel_bb *ibb)
 {
-	drm_intel_bo *dst = gem_handle_to_libdrm_bo(data->bufmgr,
-						    data->drm_fd,
-						    "", handle);
-	struct intel_batchbuffer *batch;
-
-	batch = intel_batchbuffer_alloc(data->bufmgr, data->devid);
-	igt_assert(batch);
-
-	COLOR_BLIT_COPY_BATCH_START(0);
-	OUT_BATCH((1 << 24) | (0xf0 << 16) | 0);
-	OUT_BATCH(0);
-	OUT_BATCH(0xfff << 16 | 0xfff);
-	OUT_RELOC(dst, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, 0);
-	OUT_BATCH(color);
-	ADVANCE_BATCH();
-
-	intel_batchbuffer_flush(batch);
-	intel_batchbuffer_free(batch);
-
-	gem_bo_busy(data->drm_fd, handle);
+	intel_bb_out(ibb, XY_COLOR_BLT_CMD_NOLEN |
+		     COLOR_BLT_WRITE_ALPHA |
+		     XY_COLOR_BLT_WRITE_RGB |
+		     (4 + (ibb->gen >= 8)));
 }
 
-static void scratch_buf_init(struct igt_buf *buf, drm_intel_bo *bo,
-			     int size, int stride)
+static struct intel_buf *create_buf_from_fb(data_t *data,
+					    const struct igt_fb *fb)
 {
-	memset(buf, 0, sizeof(*buf));
+	uint32_t name, handle, tiling, stride, width, height, bpp, size;
+	struct intel_buf *buf;
+
+	igt_assert_eq(fb->offsets[0], 0);
+
+	tiling = igt_fb_mod_to_tiling(fb->modifier);
+	stride = fb->strides[0];
+	bpp = fb->plane_bpp[0];
+	size = fb->size;
+	width = stride / (bpp / 8);
+	height = size / stride;
+
+	buf = calloc(1, sizeof(*buf));
+	igt_assert(buf);
+
+	name = gem_flink(data->drm_fd, fb->gem_handle);
+	handle = gem_open(data->drm_fd, name);
+	intel_buf_init_using_handle(data->bops, handle, buf,
+				    width, height, bpp, 0, tiling, 0);
+	return buf;
+}
 
-	buf->bo = bo;
-	buf->surface[0].stride = stride;
-	buf->tiling = I915_TILING_X;
-	buf->surface[0].size = size;
-	buf->bpp = 32;
+static void fill_blt(data_t *data, const struct igt_fb *fb, unsigned char color)
+{
+	struct intel_bb *ibb;
+	struct intel_buf *dst;
+
+	ibb = intel_bb_create(data->drm_fd, 4096);
+	dst = create_buf_from_fb(data, fb);
+
+	color_blit_start(ibb);
+	intel_bb_out(ibb, (1 << 24) | (0xf0 << 16) | 0);
+	intel_bb_out(ibb, 0);
+	intel_bb_out(ibb, 0xfff << 16 | 0xfff);
+	intel_bb_emit_reloc(ibb, dst->handle, I915_GEM_DOMAIN_RENDER,
+			    I915_GEM_DOMAIN_RENDER, 0, 0x0);
+	intel_bb_out(ibb, color);
+
+	intel_bb_flush_blit(ibb);
+	intel_bb_sync(ibb);
+	intel_bb_destroy(ibb);
+	intel_buf_destroy(dst);
 }
 
-static void fill_render(data_t *data, uint32_t handle, unsigned char color)
+static void fill_render(data_t *data, const struct igt_fb *fb,
+			unsigned char color)
 {
-	drm_intel_bo *src, *dst;
-	struct intel_batchbuffer *batch;
-	struct igt_buf src_buf, dst_buf;
+	struct intel_buf *src, *dst;
+	struct intel_bb *ibb;
 	const uint8_t buf[4] = { color, color, color, color };
 	igt_render_copyfunc_t rendercopy = igt_get_render_copyfunc(data->devid);
+	int height, width, tiling;
 
 	igt_skip_on(!rendercopy);
 
-	dst = gem_handle_to_libdrm_bo(data->bufmgr, data->drm_fd, "", handle);
-	igt_assert(dst);
+	ibb = intel_bb_create(data->drm_fd, 4096);
+	dst = create_buf_from_fb(data, fb);
 
-	src = drm_intel_bo_alloc(data->bufmgr, "", data->mod_size, 4096);
-	igt_assert(src);
+	width = fb->strides[0] / (fb->plane_bpp[0] / 8);
+	height = fb->size / fb->strides[0];
+	tiling = igt_fb_mod_to_tiling(fb->modifier);
 
+	src = intel_buf_create(data->bops, width, height, fb->plane_bpp[0],
+			       0, tiling, 0);
 	gem_write(data->drm_fd, src->handle, 0, buf, 4);
 
-	scratch_buf_init(&src_buf, src, data->mod_size, data->mod_stride);
-	scratch_buf_init(&dst_buf, dst, data->mod_size, data->mod_stride);
-
-	batch = intel_batchbuffer_alloc(data->bufmgr, data->devid);
-	igt_assert(batch);
-
-	rendercopy(batch, NULL,
-		   &src_buf, 0, 0, 0xff, 0xff,
-		   &dst_buf, 0, 0);
+	rendercopy(ibb, 0,
+		   src, 0, 0, 0xff, 0xff,
+		   dst, 0, 0);
 
-	intel_batchbuffer_free(batch);
-
-	gem_bo_busy(data->drm_fd, handle);
+	intel_bb_sync(ibb);
+	intel_bb_destroy(ibb);
+	intel_buf_destroy(src);
+	intel_buf_destroy(dst);
 }
 
 static bool sink_support(data_t *data, enum psr_mode mode)
@@ -290,11 +307,11 @@ static void run_test(data_t *data)
 		expected = "BLACK or TRANSPARENT mark on top of plane in test";
 		break;
 	case BLT:
-		fill_blt(data, handle, 0);
+		fill_blt(data, &data->fb_white, 0);
 		expected = "BLACK or TRANSPARENT mark on top of plane in test";
 		break;
 	case RENDER:
-		fill_render(data, handle, 0);
+		fill_render(data, &data->fb_white, 0);
 		expected = "BLACK or TRANSPARENT mark on top of plane in test";
 		break;
 	case PLANE_MOVE:
@@ -314,7 +331,8 @@ static void run_test(data_t *data)
 	manual(expected);
 }
 
-static void test_cleanup(data_t *data) {
+static void test_cleanup(data_t *data)
+{
 	igt_plane_t *primary;
 
 	primary = igt_output_get_plane_type(data->output,
@@ -444,11 +462,7 @@ igt_main_args("", long_options, help_str, opt_handler, &data)
 			      "Sink does not support PSR\n");
 
 		data.supports_psr2 = sink_support(&data, PSR_MODE_2);
-
-		data.bufmgr = drm_intel_bufmgr_gem_init(data.drm_fd, 4096);
-		igt_assert(data.bufmgr);
-		drm_intel_bufmgr_gem_enable_reuse(data.bufmgr);
-
+		data.bops = buf_ops_create(data.drm_fd);
 		display_init(&data);
 	}
 
@@ -528,7 +542,7 @@ igt_main_args("", long_options, help_str, opt_handler, &data)
 			psr_disable(data.drm_fd, data.debugfs_fd);
 
 		close(data.debugfs_fd);
-		drm_intel_bufmgr_destroy(data.bufmgr);
+		buf_ops_destroy(data.bops);
 		display_fini(&data);
 	}
 }
-- 
2.26.0

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* [igt-dev] [PATCH i-g-t 09/11] tests/gem|kms: remove libdrm dependency (batch 2)
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
                   ` (7 preceding siblings ...)
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 08/11] tests/gem|kms: remove libdrm dependency (batch 1) Zbigniew Kempczyński
@ 2020-07-23 19:22 ` Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 10/11] tools/intel_residency: adopt intel_residency to use bufops Zbigniew Kempczyński
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Zbigniew Kempczyński @ 2020-07-23 19:22 UTC (permalink / raw)
  To: igt-dev; +Cc: Chris Wilson

Use intel_bb / intel_buf to remove libdrm dependency.

Tests changed:
- gem_read_read_speed
- gem_render_copy
- gem_render_copy_redux
- gem_render_linear_blits
- gem_render_tiled_blits
- gem_stress
- kms_big_fb
- kms_cursor_crc

Remove the transitional rendercopy_bufmgr; it is no longer needed.
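
For reviewers unfamiliar with the new interfaces, the libdrm-free pattern used
throughout this series boils down to the sketch below. It only uses calls that
appear in the diffs (buf_ops_create, intel_bb_create, intel_buf_create,
render_copy, intel_bb_sync, and the matching destroy functions); it needs the
IGT headers and an i915 device, so treat it as illustrative rather than a
standalone program:

```c
/* Sketch of the intel_bb / intel_buf rendercopy flow that replaces the
 * libdrm drm_intel_bufmgr + intel_batchbuffer pair in this series.
 */
static void copy_example(int i915, uint32_t devid)
{
	/* buf_ops_create() replaces drm_intel_bufmgr_gem_init() */
	struct buf_ops *bops = buf_ops_create(i915);
	/* intel_bb_create() replaces intel_batchbuffer_alloc() */
	struct intel_bb *ibb = intel_bb_create(i915, 4096);
	struct intel_buf *src, *dst;
	igt_render_copyfunc_t render_copy = igt_get_render_copyfunc(devid);

	igt_require(render_copy);

	/* intel_buf_create() replaces drm_intel_bo_alloc() plus the
	 * manual struct igt_buf stride/size/tiling setup. */
	src = intel_buf_create(bops, 512, 512, 32, 0, I915_TILING_NONE, 0);
	dst = intel_buf_create(bops, 512, 512, 32, 0, I915_TILING_NONE, 0);

	/* rendercopy now takes an intel_bb and a context id instead of
	 * a batchbuffer and a drm_intel_context pointer. */
	render_copy(ibb, 0, src, 0, 0, 512, 512, dst, 0, 0);
	/* Synchronization is explicit; there is no implicit libdrm flush. */
	intel_bb_sync(ibb);

	intel_buf_destroy(src);
	intel_buf_destroy(dst);
	intel_bb_destroy(ibb);
	/* buf_ops_destroy() replaces drm_intel_bufmgr_destroy() */
	buf_ops_destroy(bops);
}
```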

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/Makefile.sources                 |   2 -
 lib/meson.build                      |   1 -
 lib/rendercopy_bufmgr.c              | 171 ---------------
 lib/rendercopy_bufmgr.h              |  28 ---
 tests/i915/gem_read_read_speed.c     | 160 +++++++-------
 tests/i915/gem_render_copy.c         | 313 +++++++++++----------------
 tests/i915/gem_render_copy_redux.c   |  66 +++---
 tests/i915/gem_render_linear_blits.c |  90 +++-----
 tests/i915/gem_render_tiled_blits.c  |  85 ++++----
 tests/i915/gem_stress.c              | 244 +++++++++++----------
 tests/kms_big_fb.c                   |  54 +++--
 tests/kms_cursor_crc.c               |  63 +++---
 12 files changed, 498 insertions(+), 779 deletions(-)
 delete mode 100644 lib/rendercopy_bufmgr.c
 delete mode 100644 lib/rendercopy_bufmgr.h

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index 464f5be2..67b38645 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -107,8 +107,6 @@ lib_source_list =	 	\
 	gen7_render.h		\
 	gen8_render.h		\
 	gen9_render.h		\
-	rendercopy_bufmgr.c	\
-	rendercopy_bufmgr.h	\
 	rendercopy_gen4.c	\
 	rendercopy_gen6.c	\
 	rendercopy_gen7.c	\
diff --git a/lib/meson.build b/lib/meson.build
index 341696e5..f1f5f7ca 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -45,7 +45,6 @@ lib_sources = [
 	'gpu_cmds.c',
 	'rendercopy_i915.c',
 	'rendercopy_i830.c',
-	'rendercopy_bufmgr.c',
 	'rendercopy_gen4.c',
 	'rendercopy_gen6.c',
 	'rendercopy_gen7.c',
diff --git a/lib/rendercopy_bufmgr.c b/lib/rendercopy_bufmgr.c
deleted file mode 100644
index 5cb588fa..00000000
--- a/lib/rendercopy_bufmgr.c
+++ /dev/null
@@ -1,171 +0,0 @@
-/*
- * Copyright © 2020 Intel Corporation
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
- *
- */
-
-#include <sys/ioctl.h>
-#include "igt.h"
-#include "igt_x86.h"
-#include "rendercopy_bufmgr.h"
-
-/**
- * SECTION:rendercopy_bufmgr
- * @short_description: Render copy buffer manager
- * @title: Render copy bufmgr
- * @include: igt.h
- *
- * # Rendercopy buffer manager
- *
- * Rendercopy depends on libdrm and igt_buf, so some middle layer to intel_buf
- * and buf_ops is required.
- *
- * |[<!-- language="c" -->
- * struct rendercopy_bufmgr *bmgr;
- * ...
- * bmgr = rendercopy_bufmgr_create(fd, bufmgr);
- * ...
- * igt_buf_init(bmgr, &buf, 512, 512, 32, I915_TILING_X, false);
- * ...
- * linear_to_igt_buf(bmgr, &buf, linear);
- * ...
- * igt_buf_to_linear(bmgr, &buf, linear);
- * ...
- * rendercopy_bufmgr_destroy(bmgr);
- * ]|
- */
-
-struct rendercopy_bufmgr {
-	int fd;
-	drm_intel_bufmgr *bufmgr;
-	struct buf_ops *bops;
-};
-
-static void __igt_buf_to_intel_buf(struct igt_buf *buf, struct intel_buf *ibuf)
-{
-	ibuf->handle = buf->bo->handle;
-	ibuf->surface[0].stride = buf->surface[0].stride;
-	ibuf->tiling = buf->tiling;
-	ibuf->bpp = buf->bpp;
-	ibuf->surface[0].size = buf->surface[0].size;
-	ibuf->compression = buf->compression;
-	ibuf->ccs[0].offset = buf->ccs[0].offset;
-	ibuf->ccs[0].stride = buf->ccs[0].stride;
-}
-
-void igt_buf_to_linear(struct rendercopy_bufmgr *bmgr, struct igt_buf *buf,
-		       uint32_t *linear)
-{
-	struct intel_buf ibuf;
-
-	__igt_buf_to_intel_buf(buf, &ibuf);
-
-	intel_buf_to_linear(bmgr->bops, &ibuf, linear);
-}
-
-void linear_to_igt_buf(struct rendercopy_bufmgr *bmgr, struct igt_buf *buf,
-		       uint32_t *linear)
-{
-	struct intel_buf ibuf;
-
-	__igt_buf_to_intel_buf(buf, &ibuf);
-
-	linear_to_intel_buf(bmgr->bops, &ibuf, linear);
-}
-
-struct rendercopy_bufmgr *
-rendercopy_bufmgr_create(int fd, drm_intel_bufmgr *bufmgr)
-{
-	struct buf_ops *bops;
-	struct rendercopy_bufmgr *bmgr;
-
-	igt_assert(bufmgr);
-
-	bops = buf_ops_create(fd);
-	igt_assert(bops);
-
-	bmgr = calloc(1, sizeof(*bmgr));
-	igt_assert(bmgr);
-
-	bmgr->fd = fd;
-	bmgr->bufmgr = bufmgr;
-	bmgr->bops = bops;
-
-	return bmgr;
-}
-
-void rendercopy_bufmgr_destroy(struct rendercopy_bufmgr *bmgr)
-{
-	igt_assert(bmgr);
-	igt_assert(bmgr->bops);
-
-	buf_ops_destroy(bmgr->bops);
-	free(bmgr);
-}
-
-bool rendercopy_bufmgr_set_software_tiling(struct rendercopy_bufmgr *bmgr,
-					   uint32_t tiling,
-					   bool use_software_tiling)
-{
-	return buf_ops_set_software_tiling(bmgr->bops, tiling,
-					   use_software_tiling);
-}
-
-void igt_buf_init(struct rendercopy_bufmgr *bmgr, struct igt_buf *buf,
-		  int width, int height, int bpp,
-		  uint32_t tiling, uint32_t compression)
-{
-	uint32_t devid = intel_get_drm_devid(bmgr->fd);
-	int generation= intel_gen(devid);
-	struct intel_buf ibuf;
-	int size;
-
-	memset(buf, 0, sizeof(*buf));
-
-	buf->surface[0].stride = ALIGN(width * (bpp / 8), 128);
-	buf->surface[0].size = buf->surface[0].stride * height;
-	buf->tiling = tiling;
-	buf->bpp = bpp;
-	buf->compression = compression;
-
-	size = buf->surface[0].stride * ALIGN(height, 32);
-
-	if (compression) {
-		int ccs_width = igt_buf_intel_ccs_width(generation, buf);
-		int ccs_height = igt_buf_intel_ccs_height(generation, buf);
-
-		buf->ccs[0].offset = buf->surface[0].stride * ALIGN(height, 32);
-		buf->ccs[0].stride = ccs_width;
-
-		size = buf->ccs[0].offset + ccs_width * ccs_height;
-	}
-
-	buf->bo = drm_intel_bo_alloc(bmgr->bufmgr, "", size, 4096);
-
-	intel_buf_init_using_handle(bmgr->bops,
-				    buf->bo->handle,
-				    &ibuf,
-				    width, height, bpp, 0,
-				    tiling, compression);
-
-	buf->ccs[0].offset = ibuf.ccs[0].offset;
-	buf->ccs[0].stride = ibuf.ccs[0].stride;
-}
diff --git a/lib/rendercopy_bufmgr.h b/lib/rendercopy_bufmgr.h
deleted file mode 100644
index f4e5118c..00000000
--- a/lib/rendercopy_bufmgr.h
+++ /dev/null
@@ -1,28 +0,0 @@
-#ifndef __RENDERCOPY_BUFMGR_H__
-#define __RENDERCOPY_BUFMGR_H__
-
-#include <stdint.h>
-#include "intel_bufops.h"
-#include "intel_batchbuffer.h"
-
-struct rendercopy_bufmgr;
-
-struct rendercopy_bufmgr *rendercopy_bufmgr_create(int fd,
-						   drm_intel_bufmgr *bufmgr);
-void rendercopy_bufmgr_destroy(struct rendercopy_bufmgr *bmgr);
-
-bool rendercopy_bufmgr_set_software_tiling(struct rendercopy_bufmgr *bmgr,
-					   uint32_t tiling,
-					   bool use_software_tiling);
-
-void igt_buf_to_linear(struct rendercopy_bufmgr *bmgr, struct igt_buf *buf,
-		       uint32_t *linear);
-
-void linear_to_igt_buf(struct rendercopy_bufmgr *bmgr, struct igt_buf *buf,
-		       uint32_t *linear);
-
-void igt_buf_init(struct rendercopy_bufmgr *bmgr, struct igt_buf *buf,
-		  int width, int height, int bpp,
-		  uint32_t tiling, uint32_t compression);
-
-#endif
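For reference, the deleted igt_buf_init() derived the object layout from width/height/bpp: the stride padded to 128 bytes, the main surface rounded up to 32 rows, and the CCS surface appended after it. A standalone sketch of that arithmetic (helper names are illustrative, not IGT API):

```c
#include <assert.h>
#include <stdint.h>

#define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Row pitch in bytes, padded to 128 as the removed igt_buf_init() did. */
static uint32_t buf_stride(int width, int bpp)
{
	return ALIGN(width * (bpp / 8), 128);
}

/*
 * Total object size: the main surface rounded up to 32 rows; when a
 * CCS (compression) surface is present, it follows the main surface.
 */
static uint32_t buf_size(int width, int height, int bpp,
			 int ccs_width, int ccs_height)
{
	uint32_t size = buf_stride(width, bpp) * ALIGN(height, 32);

	if (ccs_width && ccs_height)
		size += ccs_width * ccs_height;

	return size;
}
```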
diff --git a/tests/i915/gem_read_read_speed.c b/tests/i915/gem_read_read_speed.c
index 06b66935..023f9990 100644
--- a/tests/i915/gem_read_read_speed.c
+++ b/tests/i915/gem_read_read_speed.c
@@ -46,66 +46,60 @@
 
 IGT_TEST_DESCRIPTION("Test speed of concurrent reads between engines.");
 
+#define BBSIZE 4096
 igt_render_copyfunc_t rendercopy;
-struct intel_batchbuffer *batch;
 int width, height;
 
-static drm_intel_bo *rcs_copy_bo(drm_intel_bo *dst, drm_intel_bo *src)
+static void set_to_gtt_domain(struct intel_buf *buf, int writing)
 {
-	struct igt_buf d = {
-		.bo = dst,
-		.num_tiles = width * height * 4,
-		.surface[0] = {
-			.size = width * height * 4, .stride = width * 4,
-		},
-		.bpp = 32,
-	}, s = {
-		.bo = src,
-		.num_tiles = width * height * 4,
-		.surface[0] = {
-			.size = width * height * 4, .stride = width * 4,
-		},
-		.bpp = 32,
-	};
-	uint32_t swizzle;
-	drm_intel_bo *bo = batch->bo;
-	drm_intel_bo_reference(bo);
-
-	drm_intel_bo_get_tiling(dst, &d.tiling, &swizzle);
-	drm_intel_bo_get_tiling(src, &s.tiling, &swizzle);
-
-	rendercopy(batch, NULL,
-		   &s, 0, 0,
+	int i915 = buf_ops_get_fd(buf->bops);
+
+	gem_set_domain(i915, buf->handle, I915_GEM_DOMAIN_GTT,
+		       writing ? I915_GEM_DOMAIN_GTT : 0);
+}
+
+static struct intel_bb *rcs_copy_bo(struct intel_buf *dst,
+				    struct intel_buf *src)
+{
+	int i915 = buf_ops_get_fd(dst->bops);
+	struct intel_bb *ibb = intel_bb_create(i915, BBSIZE);
+
+	/* Ensure the batch is not recreated after execution */
+	intel_bb_ref(ibb);
+
+	rendercopy(ibb, 0,
+		   src, 0, 0,
 		   width, height,
-		   &d, 0, 0);
+		   dst, 0, 0);
 
-	return bo;
+	return ibb;
 }
 
-static drm_intel_bo *bcs_copy_bo(drm_intel_bo *dst, drm_intel_bo *src)
+static struct intel_bb *bcs_copy_bo(struct intel_buf *dst,
+				    struct intel_buf *src)
 {
-	drm_intel_bo *bo = batch->bo;
-	drm_intel_bo_reference(bo);
+	int i915 = buf_ops_get_fd(dst->bops);
+	struct intel_bb *ibb = intel_bb_create(i915, BBSIZE);
 
-	intel_blt_copy(batch,
-		       src, 0, 0, 4*width,
-		       dst, 0, 0, 4*width,
-		       width, height, 32);
+	intel_bb_ref(ibb);
 
-	return bo;
+	intel_bb_blt_copy(ibb,
+			  src, 0, 0, 4*width,
+			  dst, 0, 0, 4*width,
+			  width, height, 32);
+
+	return ibb;
 }
 
-static void
-set_bo(drm_intel_bo *bo, uint32_t val)
+static void set_bo(struct intel_buf *buf, uint32_t val)
 {
 	int size = width * height;
 	uint32_t *vaddr;
 
-	do_or_die(drm_intel_bo_map(bo, 1));
-	vaddr = bo->virtual;
+	vaddr = intel_buf_device_map(buf, true);
 	while (size--)
 		*vaddr++ = val;
-	drm_intel_bo_unmap(bo);
+	intel_buf_unmap(buf);
 }
 
 static double elapsed(const struct timespec *start,
@@ -115,54 +109,59 @@ static double elapsed(const struct timespec *start,
 	return (1e6*(end->tv_sec - start->tv_sec) + (end->tv_nsec - start->tv_nsec)/1000)/loop;
 }
 
-static drm_intel_bo *create_bo(drm_intel_bufmgr *bufmgr,
-			       const char *name)
+static struct intel_buf *create_bo(struct buf_ops *bops, const char *name)
 {
 	uint32_t tiling_mode = I915_TILING_X;
-	unsigned long pitch;
-	return drm_intel_bo_alloc_tiled(bufmgr, name,
-					width, height, 4,
-					&tiling_mode, &pitch, 0);
+	struct intel_buf *buf;
+
+	buf = intel_buf_create(bops, width, height, 32, 0, tiling_mode,
+			       I915_COMPRESSION_NONE);
+	intel_buf_set_name(buf, name);
+
+	return buf;
 }
 
-static void run(drm_intel_bufmgr *bufmgr, int _width, int _height,
+static void run(struct buf_ops *bops, int _width, int _height,
 		bool write_bcs, bool write_rcs)
 {
-	drm_intel_bo *src = NULL, *bcs = NULL, *rcs = NULL;
-	drm_intel_bo *bcs_batch, *rcs_batch;
+	struct intel_buf *src = NULL, *bcs = NULL, *rcs = NULL;
+	struct intel_bb *bcs_ibb = NULL, *rcs_ibb = NULL;
 	struct timespec start, end;
-	int loops = 1000;
+	int loops = 1000;
 
 	width = _width;
 	height = _height;
 
-	src = create_bo(bufmgr, "src");
-	bcs = create_bo(bufmgr, "bcs");
-	rcs = create_bo(bufmgr, "rcs");
+	src = create_bo(bops, "src");
+	bcs = create_bo(bops, "bcs");
+	rcs = create_bo(bops, "rcs");
 
 	set_bo(src, 0xdeadbeef);
 
 	if (write_bcs) {
-		bcs_batch = bcs_copy_bo(src, bcs);
+		bcs_ibb = bcs_copy_bo(src, bcs);
 	} else {
-		bcs_batch = bcs_copy_bo(bcs, src);
+		bcs_ibb = bcs_copy_bo(bcs, src);
 	}
 	if (write_rcs) {
-		rcs_batch = rcs_copy_bo(src, rcs);
+		rcs_ibb = rcs_copy_bo(src, rcs);
 	} else {
-		rcs_batch = rcs_copy_bo(rcs, src);
+		rcs_ibb = rcs_copy_bo(rcs, src);
 	}
 
-	drm_intel_bo_unreference(rcs);
-	drm_intel_bo_unreference(bcs);
+	set_to_gtt_domain(src, true);
 
-	drm_intel_gem_bo_start_gtt_access(src, true);
 	clock_gettime(CLOCK_MONOTONIC, &start);
 	for (int i = 0; i < loops; i++) {
-		drm_intel_gem_bo_context_exec(rcs_batch, NULL, 4096, I915_EXEC_RENDER);
-		drm_intel_gem_bo_context_exec(bcs_batch, NULL, 4096, I915_EXEC_BLT);
+		intel_bb_exec(rcs_ibb, intel_bb_offset(rcs_ibb),
+			      I915_EXEC_RENDER, false);
+		intel_bb_exec(bcs_ibb, intel_bb_offset(bcs_ibb),
+			      I915_EXEC_BLT, false);
 	}
-	drm_intel_gem_bo_start_gtt_access(src, true);
+
+	set_to_gtt_domain(src, true);
 	clock_gettime(CLOCK_MONOTONIC, &end);
 
 	igt_info("Time to %s-%s %dx%d [%dk]:		%7.3fµs\n",
@@ -171,16 +170,19 @@ static void run(drm_intel_bufmgr *bufmgr, int _width, int _height,
 		 width, height, 4*width*height/1024,
 		 elapsed(&start, &end, loops));
 
-	drm_intel_bo_unreference(rcs_batch);
-	drm_intel_bo_unreference(bcs_batch);
-
-	drm_intel_bo_unreference(src);
+	intel_bb_unref(rcs_ibb);
+	intel_bb_destroy(rcs_ibb);
+	intel_bb_unref(bcs_ibb);
+	intel_bb_destroy(bcs_ibb);
+	intel_buf_destroy(src);
+	intel_buf_destroy(rcs);
+	intel_buf_destroy(bcs);
 }
 
 igt_main
 {
-	const int sizes[] = {1, 128, 256, 512, 1024, 2048, 4096, 8192, 0};
-	drm_intel_bufmgr *bufmgr = NULL;
+	const int sizes[] = {128, 256, 512, 1024, 2048, 4096, 8192, 0};
+	struct buf_ops *bops = NULL;
 	int fd, i;
 
 	igt_fixture {
@@ -195,22 +197,24 @@ igt_main
 		rendercopy = igt_get_render_copyfunc(devid);
 		igt_require(rendercopy);
 
-		bufmgr = drm_intel_bufmgr_gem_init(fd, 4096);
-		igt_assert(bufmgr);
-
-		batch =  intel_batchbuffer_alloc(bufmgr, devid);
+		bops = buf_ops_create(fd);
 
 		gem_submission_print_method(fd);
 	}
 
 	for (i = 0; sizes[i] != 0; i++) {
 		igt_subtest_f("read-read-%dx%d", sizes[i], sizes[i])
-			run(bufmgr, sizes[i], sizes[i], false, false);
+			run(bops, sizes[i], sizes[i], false, false);
 		igt_subtest_f("read-write-%dx%d", sizes[i], sizes[i])
-			run(bufmgr, sizes[i], sizes[i], false, true);
+			run(bops, sizes[i], sizes[i], false, true);
 		igt_subtest_f("write-read-%dx%d", sizes[i], sizes[i])
-			run(bufmgr, sizes[i], sizes[i], true, false);
+			run(bops, sizes[i], sizes[i], true, false);
 		igt_subtest_f("write-write-%dx%d", sizes[i], sizes[i])
-			run(bufmgr, sizes[i], sizes[i], true, true);
+			run(bops, sizes[i], sizes[i], true, true);
+	}
+
+	igt_fixture {
+		buf_ops_destroy(bops);
+		close(fd);
 	}
 }
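The conversion above relies on intel_bb_ref() to keep the batch alive across intel_bb_exec() calls; with only a single reference, the batch would be reset after each execution. A toy refcount model of that lifecycle (the mock_* names are illustrative, not the IGT API):

```c
#include <assert.h>
#include <stdlib.h>

/* Mock of the intel_bb lifecycle used above: exec() resets (recreates)
 * the batch afterwards unless an extra reference is held. */
struct mock_bb {
	int refcount;
	int resets;	/* how many times exec reset the batch */
};

static struct mock_bb *mock_bb_create(void)
{
	struct mock_bb *bb = calloc(1, sizeof(*bb));

	bb->refcount = 1;
	return bb;
}

static void mock_bb_ref(struct mock_bb *bb)
{
	bb->refcount++;
}

static void mock_bb_exec(struct mock_bb *bb)
{
	/* Only the last reference triggers a reset after execution. */
	if (bb->refcount == 1)
		bb->resets++;
}

static void mock_bb_unref(struct mock_bb *bb)
{
	bb->refcount--;
}

static void mock_bb_destroy(struct mock_bb *bb)
{
	assert(bb->refcount == 1);	/* destroy takes the last reference */
	free(bb);
}
```

This mirrors the test's pattern: create + ref before queuing copies, exec repeatedly in the timing loop, then unref + destroy during teardown.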
diff --git a/tests/i915/gem_render_copy.c b/tests/i915/gem_render_copy.c
index 1e1e79b9..7f425dcd 100644
--- a/tests/i915/gem_render_copy.c
+++ b/tests/i915/gem_render_copy.c
@@ -47,8 +47,7 @@
 #include "i915/gem.h"
 #include "igt.h"
 #include "igt_x86.h"
-#include "intel_bufmgr.h"
-#include "rendercopy_bufmgr.h"
+#include "intel_bufops.h"
 
 IGT_TEST_DESCRIPTION("Basic test for the render_copy() function.");
 
@@ -58,11 +57,10 @@ IGT_TEST_DESCRIPTION("Basic test for the render_copy() function.");
 typedef struct {
 	int drm_fd;
 	uint32_t devid;
-	drm_intel_bufmgr *bufmgr;
-	struct intel_batchbuffer *batch;
+	struct buf_ops *bops;
+	struct intel_bb *ibb;
 	igt_render_copyfunc_t render_copy;
 	igt_vebox_copyfunc_t vebox_copy;
-	struct rendercopy_bufmgr *bmgr;
 } data_t;
 static int opt_dump_png = false;
 static int check_all_pixels = false;
@@ -87,59 +85,32 @@ static void *alloc_aligned(uint64_t size)
 }
 
 static void
-copy_from_linear_buf(data_t *data, struct igt_buf *src, struct igt_buf *dst)
+copy_from_linear_buf(data_t *data, struct intel_buf *src, struct intel_buf *dst)
 {
-	void *linear;
-
 	igt_assert(src->tiling == I915_TILING_NONE);
 
-	gem_set_domain(data->drm_fd, src->bo->handle,
+	gem_set_domain(data->drm_fd, src->handle,
 		       I915_GEM_DOMAIN_CPU, 0);
-	linear = __gem_mmap_offset__cpu(data->drm_fd, src->bo->handle, 0,
-					src->bo->size, PROT_READ);
-	if (!linear)
-		linear = gem_mmap__cpu(data->drm_fd, src->bo->handle, 0,
-				       src->bo->size, PROT_READ);
-
-	linear_to_igt_buf(data->bmgr, dst, linear);
-
-	munmap(linear, src->bo->size);
-}
-
-static void scratch_buf_write_to_png(data_t *data, struct igt_buf *buf,
-				     const char *filename)
-{
-	cairo_surface_t *surface;
-	cairo_status_t ret;
-	void *linear;
-
-	linear = alloc_aligned(buf->bo->size);
-	igt_buf_to_linear(data->bmgr, buf, linear);
+	intel_buf_cpu_map(src, false);
 
-	surface = cairo_image_surface_create_for_data(linear,
-						      CAIRO_FORMAT_RGB24,
-						      igt_buf_width(buf),
-						      igt_buf_height(buf),
-						      buf->surface[0].stride);
-	ret = cairo_surface_write_to_png(surface, make_filename(filename));
-	igt_assert(ret == CAIRO_STATUS_SUCCESS);
-	cairo_surface_destroy(surface);
+	linear_to_intel_buf(data->bops, dst, src->ptr);
 
-	free(linear);
+	intel_buf_unmap(src);
 }
 
-static void *linear_copy_ccs(data_t *data, struct igt_buf *buf)
+static void *linear_copy_ccs(data_t *data, struct intel_buf *buf)
 {
 	void *ccs_data, *linear;
 	int gen = intel_gen(data->devid);
-	int ccs_size = igt_buf_intel_ccs_width(gen, buf) *
-		igt_buf_intel_ccs_height(gen, buf);
+	int ccs_size = intel_buf_ccs_width(gen, buf) *
+		intel_buf_ccs_height(gen, buf);
+	int bo_size = intel_buf_bo_size(buf);
 
 	ccs_data = alloc_aligned(ccs_size);
-	linear = alloc_aligned(buf->bo->size);
-	memset(linear, 0, buf->bo->size);
+	linear = alloc_aligned(bo_size);
+	memset(linear, 0, bo_size);
 
-	igt_buf_to_linear(data->bmgr, buf, linear);
+	intel_buf_to_linear(data->bops, buf, linear);
 	igt_memcpy_from_wc(ccs_data, linear + buf->ccs[0].offset, ccs_size);
 
 	free(linear);
@@ -147,31 +118,7 @@ static void *linear_copy_ccs(data_t *data, struct igt_buf *buf)
 	return ccs_data;
 }
 
-static void scratch_buf_ccs_write_to_png(data_t *data,
-					 struct igt_buf *buf,
-					 const char *filename)
-{
-	cairo_surface_t *surface;
-	cairo_status_t ret;
-	void *linear;
-	int gen = intel_gen(data->devid);
-	unsigned int ccs_width = igt_buf_intel_ccs_width(gen, buf);
-	unsigned int ccs_height = igt_buf_intel_ccs_height(gen, buf);
-
-	linear = linear_copy_ccs(data, buf);
-
-	surface = cairo_image_surface_create_for_data(linear,
-						      CAIRO_FORMAT_A8,
-						      ccs_width, ccs_height,
-						      buf->ccs[0].stride);
-	ret = cairo_surface_write_to_png(surface, make_filename(filename));
-	igt_assert(ret == CAIRO_STATUS_SUCCESS);
-	cairo_surface_destroy(surface);
-
-	free(linear);
-}
-
-static void scratch_buf_draw_pattern(data_t *data, struct igt_buf *buf,
+static void scratch_buf_draw_pattern(data_t *data, struct intel_buf *buf,
 				     int x, int y, int w, int h,
 				     int cx, int cy, int cw, int ch,
 				     bool use_alternate_colors)
@@ -181,12 +128,12 @@ static void scratch_buf_draw_pattern(data_t *data, struct igt_buf *buf,
 	cairo_t *cr;
 	void *linear;
 
-	linear = alloc_aligned(buf->bo->size);
+	linear = alloc_aligned(buf->surface[0].size);
 
 	surface = cairo_image_surface_create_for_data(linear,
 						      CAIRO_FORMAT_RGB24,
-						      igt_buf_width(buf),
-						      igt_buf_height(buf),
+						      intel_buf_width(buf),
+						      intel_buf_height(buf),
 						      buf->surface[0].stride);
 
 	cr = cairo_create(surface);
@@ -222,24 +169,24 @@ static void scratch_buf_draw_pattern(data_t *data, struct igt_buf *buf,
 
 	cairo_surface_destroy(surface);
 
-	linear_to_igt_buf(data->bmgr, buf, linear);
+	linear_to_intel_buf(data->bops, buf, linear);
 
 	free(linear);
 }
 
 static void
 scratch_buf_copy(data_t *data,
-		 struct igt_buf *src, int sx, int sy, int w, int h,
-		 struct igt_buf *dst, int dx, int dy)
+		 struct intel_buf *src, int sx, int sy, int w, int h,
+		 struct intel_buf *dst, int dx, int dy)
 {
-	int width = igt_buf_width(dst);
-	int height  = igt_buf_height(dst);
+	int width = intel_buf_width(dst);
+	int height = intel_buf_height(dst);
 	uint32_t *linear_dst;
 	uint32_t *linear_src;
 
-	igt_assert_eq(igt_buf_width(dst), igt_buf_width(src));
-	igt_assert_eq(igt_buf_height(dst), igt_buf_height(src));
-	igt_assert_eq(dst->bo->size, src->bo->size);
+	igt_assert_eq(intel_buf_width(dst), intel_buf_width(src));
+	igt_assert_eq(intel_buf_height(dst), intel_buf_height(src));
+	igt_assert_eq(intel_buf_bo_size(dst), intel_buf_bo_size(src));
 	igt_assert_eq(dst->bpp, src->bpp);
 
 	w = min(w, width - sx);
@@ -248,10 +195,10 @@ scratch_buf_copy(data_t *data,
 	h = min(h, height - sy);
 	h = min(h, height - dy);
 
-	linear_dst = alloc_aligned(dst->bo->size);
-	linear_src = alloc_aligned(src->bo->size);
-	igt_buf_to_linear(data->bmgr, src, linear_src);
-	igt_buf_to_linear(data->bmgr, dst, linear_dst);
+	linear_dst = alloc_aligned(intel_buf_bo_size(dst));
+	linear_src = alloc_aligned(intel_buf_bo_size(src));
+	intel_buf_to_linear(data->bops, src, linear_src);
+	intel_buf_to_linear(data->bops, dst, linear_dst);
 
 	for (int y = 0; y < h; y++) {
 		memcpy(&linear_dst[(dy+y) * width + dx],
@@ -260,50 +207,50 @@ scratch_buf_copy(data_t *data,
 	}
 	free(linear_src);
 
-	linear_to_igt_buf(data->bmgr, dst, linear_dst);
+	linear_to_intel_buf(data->bops, dst, linear_dst);
 	free(linear_dst);
 }
 
-static void scratch_buf_init(data_t *data, struct igt_buf *buf,
+static void scratch_buf_init(data_t *data, struct intel_buf *buf,
 			     int width, int height,
 			     uint32_t req_tiling,
 			     enum i915_compression compression)
 {
 	int bpp = 32;
 
-	igt_buf_init(data->bmgr, buf, width, height, bpp, req_tiling,
-		     compression);
+	intel_buf_init(data->bops, buf, width, height, bpp, 0,
+		       req_tiling, compression);
 
-	igt_assert(igt_buf_width(buf) == width);
-	igt_assert(igt_buf_height(buf) == height);
+	igt_assert(intel_buf_width(buf) == width);
+	igt_assert(intel_buf_height(buf) == height);
 }
 
-static void scratch_buf_fini(struct igt_buf *buf)
+static void scratch_buf_fini(struct intel_buf *buf)
 {
-	drm_intel_bo_unreference(buf->bo);
+	intel_buf_close(buf->bops, buf);
 }
 
 static void
 scratch_buf_check(data_t *data,
-		  struct igt_buf *buf,
-		  struct igt_buf *ref,
+		  struct intel_buf *buf,
+		  struct intel_buf *ref,
 		  int x, int y)
 {
-	int width = igt_buf_width(buf);
+	int width = intel_buf_width(buf);
 	uint32_t buf_val, ref_val;
 	uint32_t *linear;
 
-	igt_assert_eq(igt_buf_width(buf), igt_buf_width(ref));
-	igt_assert_eq(igt_buf_height(buf), igt_buf_height(ref));
-	igt_assert_eq(buf->bo->size, ref->bo->size);
+	igt_assert_eq(intel_buf_width(buf), intel_buf_width(ref));
+	igt_assert_eq(intel_buf_height(buf), intel_buf_height(ref));
+	igt_assert_eq(buf->surface[0].size, ref->surface[0].size);
 
-	linear = alloc_aligned(buf->bo->size);
-	igt_buf_to_linear(data->bmgr, buf, linear);
+	linear = alloc_aligned(buf->surface[0].size);
+	intel_buf_to_linear(data->bops, buf, linear);
 	buf_val = linear[y * width + x];
 	free(linear);
 
-	linear = alloc_aligned(ref->bo->size);
-	igt_buf_to_linear(data->bmgr, buf, linear);
+	linear = alloc_aligned(ref->surface[0].size);
+	intel_buf_to_linear(data->bops, ref, linear);
 	ref_val = linear[y * width + x];
 	free(linear);
 
@@ -314,21 +261,21 @@ scratch_buf_check(data_t *data,
 
 static void
 scratch_buf_check_all(data_t *data,
-		      struct igt_buf *buf,
-		      struct igt_buf *ref)
+		      struct intel_buf *buf,
+		      struct intel_buf *ref)
 {
-	int width = igt_buf_width(buf);
-	int height  = igt_buf_height(buf);
+	int width = intel_buf_width(buf);
+	int height = intel_buf_height(buf);
 	uint32_t *linear_buf, *linear_ref;
 
-	igt_assert_eq(igt_buf_width(buf), igt_buf_width(ref));
-	igt_assert_eq(igt_buf_height(buf), igt_buf_height(ref));
-	igt_assert_eq(buf->bo->size, ref->bo->size);
+	igt_assert_eq(intel_buf_width(buf), intel_buf_width(ref));
+	igt_assert_eq(intel_buf_height(buf), intel_buf_height(ref));
+	igt_assert_eq(buf->surface[0].size, ref->surface[0].size);
 
-	linear_buf = alloc_aligned(buf->bo->size);
-	linear_ref = alloc_aligned(ref->bo->size);
-	igt_buf_to_linear(data->bmgr, buf, linear_buf);
-	igt_buf_to_linear(data->bmgr, ref, linear_ref);
+	linear_buf = alloc_aligned(buf->surface[0].size);
+	linear_ref = alloc_aligned(ref->surface[0].size);
+	intel_buf_to_linear(data->bops, buf, linear_buf);
+	intel_buf_to_linear(data->bops, ref, linear_ref);
 
 	for (int y = 0; y < height; y++) {
 		for (int x = 0; x < width; x++) {
@@ -346,11 +293,11 @@ scratch_buf_check_all(data_t *data,
 }
 
 static void scratch_buf_ccs_check(data_t *data,
-				  struct igt_buf *buf)
+				  struct intel_buf *buf)
 {
 	int gen = intel_gen(data->devid);
-	int ccs_size = igt_buf_intel_ccs_width(gen, buf) *
-		igt_buf_intel_ccs_height(gen, buf);
+	int ccs_size = intel_buf_ccs_width(gen, buf) *
+		intel_buf_ccs_height(gen, buf);
 	uint8_t *linear;
 	int i;
 
@@ -368,26 +315,22 @@ static void scratch_buf_ccs_check(data_t *data,
 }
 
 static void
-dump_igt_buf_to_file(data_t *data, struct igt_buf *buf, const char *filename)
+dump_intel_buf_to_file(data_t *data, struct intel_buf *buf, const char *filename)
 {
 	FILE *out;
-	void *linear;
+	void *ptr;
+	uint32_t size = intel_buf_bo_size(buf);
 
-	gem_set_domain(data->drm_fd, buf->bo->handle,
+	gem_set_domain(data->drm_fd, buf->handle,
 		       I915_GEM_DOMAIN_CPU, 0);
+	ptr = gem_mmap__cpu_coherent(data->drm_fd, buf->handle, 0, size,
+				     PROT_READ);
 
-	linear = __gem_mmap_offset__cpu(data->drm_fd, buf->bo->handle, 0,
-					buf->bo->size, PROT_READ);
-	if (!linear)
-		linear = gem_mmap__cpu(data->drm_fd, buf->bo->handle, 0,
-				       buf->bo->size, PROT_READ);
 	out = fopen(filename, "wb");
 	igt_assert(out);
-	fwrite(linear, buf->bo->size, 1, out);
+	fwrite(ptr, size, 1, out);
 	fclose(out);
 
-	munmap(linear, buf->bo->size);
-
+	munmap(ptr, size);
 }
 
 #define SOURCE_MIXED_TILED	1
@@ -398,9 +341,9 @@ static void test(data_t *data, uint32_t src_tiling, uint32_t dst_tiling,
 		 enum i915_compression dst_compression,
 		 int flags)
 {
-	struct igt_buf ref, src_tiled, src_ccs, dst_ccs, dst;
+	struct intel_buf ref, src_tiled, src_ccs, dst_ccs, dst;
 	struct {
-		struct igt_buf buf;
+		struct intel_buf buf;
 		const char *filename;
 		uint32_t tiling;
 		int x, y;
@@ -426,7 +369,6 @@ static void test(data_t *data, uint32_t src_tiling, uint32_t dst_tiling,
 			.x = 1, .y = 1,
 		},
 	};
-	int opt_dump_aub = igt_aub_dump_enabled();
 	int num_src = ARRAY_SIZE(src);
 	const bool src_mixed_tiled = flags & SOURCE_MIXED_TILED;
 	const bool src_compressed = src_compression != I915_COMPRESSION_NONE;
@@ -495,18 +437,13 @@ static void test(data_t *data, uint32_t src_tiling, uint32_t dst_tiling,
 
 	if (opt_dump_png) {
 		for (int i = 0; i < num_src; i++)
-			scratch_buf_write_to_png(data, &src[i].buf, src[i].filename);
+			intel_buf_write_to_png(&src[i].buf,
+					       make_filename(src[i].filename));
 		if (!src_mixed_tiled)
-			scratch_buf_write_to_png(data, &src_tiled,
-						 "source-tiled.png");
-		scratch_buf_write_to_png(data, &dst, "destination.png");
-		scratch_buf_write_to_png(data, &ref, "reference.png");
-	}
-
-	if (opt_dump_aub) {
-		drm_intel_bufmgr_gem_set_aub_filename(data->bufmgr,
-						      "rendercopy.aub");
-		drm_intel_bufmgr_gem_set_aub_dump(data->bufmgr, true);
+			intel_buf_write_to_png(&src_tiled,
+					       make_filename("source-tiled.png"));
+		intel_buf_write_to_png(&dst, make_filename("destination.png"));
+		intel_buf_write_to_png(&ref, make_filename("reference.png"));
 	}
 
 	/* This will copy the src to the mid point of the dst buffer. Presumably
@@ -519,75 +456,76 @@ static void test(data_t *data, uint32_t src_tiling, uint32_t dst_tiling,
 	 */
 	if (src_mixed_tiled) {
 		if (dst_compressed)
-			data->render_copy(data->batch, NULL,
+			data->render_copy(data->ibb, 0,
 					  &dst, 0, 0, WIDTH, HEIGHT,
 					  &dst_ccs, 0, 0);
 
-		for (int i = 0; i < num_src; i++)
-			data->render_copy(data->batch, NULL,
+		for (int i = 0; i < num_src; i++) {
+			data->render_copy(data->ibb, 0,
 					  &src[i].buf,
 					  WIDTH/4, HEIGHT/4, WIDTH/2-2, HEIGHT/2-2,
 					  dst_compressed ? &dst_ccs : &dst,
 					  src[i].x, src[i].y);
+		}
 
 		if (dst_compressed)
-			data->render_copy(data->batch, NULL,
+			data->render_copy(data->ibb, 0,
 					  &dst_ccs, 0, 0, WIDTH, HEIGHT,
 					  &dst, 0, 0);
 
 	} else {
 		if (src_compression == I915_COMPRESSION_RENDER) {
-			data->render_copy(data->batch, NULL,
+			data->render_copy(data->ibb, 0,
 					  &src_tiled, 0, 0, WIDTH, HEIGHT,
 					  &src_ccs,
 					  0, 0);
 			if (dump_compressed_src_buf) {
-				dump_igt_buf_to_file(data, &src_tiled,
-						     "render-src_tiled.bin");
-				dump_igt_buf_to_file(data, &src_ccs,
-						     "render-src_ccs.bin");
+				dump_intel_buf_to_file(data, &src_tiled,
+						       "render-src_tiled.bin");
+				dump_intel_buf_to_file(data, &src_ccs,
+						       "render-src_ccs.bin");
 			}
 		} else if (src_compression == I915_COMPRESSION_MEDIA) {
-			data->vebox_copy(data->batch,
+			data->vebox_copy(data->ibb,
 					 &src_tiled, WIDTH, HEIGHT,
 					 &src_ccs);
 			if (dump_compressed_src_buf) {
-				dump_igt_buf_to_file(data, &src_tiled,
-						     "vebox-src_tiled.bin");
-				dump_igt_buf_to_file(data, &src_ccs,
-						     "vebox-src_ccs.bin");
+				dump_intel_buf_to_file(data, &src_tiled,
+						       "vebox-src_tiled.bin");
+				dump_intel_buf_to_file(data, &src_ccs,
+						       "vebox-src_ccs.bin");
 			}
 		}
 
 		if (dst_compression == I915_COMPRESSION_RENDER) {
-			data->render_copy(data->batch, NULL,
+			data->render_copy(data->ibb, 0,
 					  src_compressed ? &src_ccs : &src_tiled,
 					  0, 0, WIDTH, HEIGHT,
 					  &dst_ccs,
 					  0, 0);
 
-			data->render_copy(data->batch, NULL,
+			data->render_copy(data->ibb, 0,
 					  &dst_ccs,
 					  0, 0, WIDTH, HEIGHT,
 					  &dst,
 					  0, 0);
 		} else if (dst_compression == I915_COMPRESSION_MEDIA) {
-			data->vebox_copy(data->batch,
+			data->vebox_copy(data->ibb,
 					 src_compressed ? &src_ccs : &src_tiled,
 					 WIDTH, HEIGHT,
 					 &dst_ccs);
 
-			data->vebox_copy(data->batch,
+			data->vebox_copy(data->ibb,
 					 &dst_ccs,
 					 WIDTH, HEIGHT,
 					 &dst);
 		} else if (force_vebox_dst_copy) {
-			data->vebox_copy(data->batch,
+			data->vebox_copy(data->ibb,
 					 src_compressed ? &src_ccs : &src_tiled,
 					 WIDTH, HEIGHT,
 					 &dst);
 		} else {
-			data->render_copy(data->batch, NULL,
+			data->render_copy(data->ibb, 0,
 					  src_compressed ? &src_ccs : &src_tiled,
 					  0, 0, WIDTH, HEIGHT,
 					  &dst,
@@ -596,29 +534,22 @@ static void test(data_t *data, uint32_t src_tiling, uint32_t dst_tiling,
 	}
 
 	if (opt_dump_png){
-		scratch_buf_write_to_png(data, &dst, "result.png");
+		intel_buf_write_to_png(&dst, make_filename("result.png"));
 		if (src_compressed) {
-			scratch_buf_write_to_png(data, &src_ccs,
-						 "compressed-src.png");
-			scratch_buf_ccs_write_to_png(data, &src_ccs,
-						     "compressed-src-ccs.png");
+			intel_buf_write_to_png(&src_ccs,
+					       make_filename("compressed-src.png"));
+			intel_buf_write_aux_to_png(&src_ccs,
+						   "compressed-src-ccs.png");
 		}
 		if (dst_compressed) {
-			scratch_buf_write_to_png(data, &dst_ccs,
-						 "compressed-dst.png");
-			scratch_buf_ccs_write_to_png(data, &dst_ccs,
-						     "compressed-dst-ccs.png");
+			intel_buf_write_to_png(&dst_ccs,
+					       make_filename("compressed-dst.png"));
+			intel_buf_write_aux_to_png(&dst_ccs,
+						   "compressed-dst-ccs.png");
 		}
 	}
 
-	if (opt_dump_aub) {
-		drm_intel_gem_bo_aub_dump_bmp(dst.bo,
-					      0, 0, igt_buf_width(&dst),
-					      igt_buf_height(&dst),
-					      AUB_DUMP_BMP_FORMAT_ARGB_8888,
-					      dst.surface[0].stride, 0);
-		drm_intel_bufmgr_gem_set_aub_dump(data->bufmgr, false);
-	} else if (check_all_pixels) {
+	if (check_all_pixels) {
 		scratch_buf_check_all(data, &dst, &ref);
 	} else {
 		scratch_buf_check(data, &dst, &ref, 10, 10);
@@ -631,6 +562,8 @@ static void test(data_t *data, uint32_t src_tiling, uint32_t dst_tiling,
 		scratch_buf_ccs_check(data, &dst_ccs);
 
 	scratch_buf_fini(&ref);
+	if (!src_mixed_tiled)
+		scratch_buf_fini(&src_tiled);
 	if (dst_compressed)
 		scratch_buf_fini(&dst_ccs);
 	if (src_compressed)
@@ -638,6 +571,9 @@ static void test(data_t *data, uint32_t src_tiling, uint32_t dst_tiling,
 	scratch_buf_fini(&dst);
 	for (int i = 0; i < num_src; i++)
 		scratch_buf_fini(&src[i].buf);
+
+	/* Handles are gone, so clean the object cache within intel_bb */
+	intel_bb_reset(data->ibb, true);
 }
 
 static int opt_handler(int opt, int opt_index, void *data)
@@ -649,6 +585,9 @@ static int opt_handler(int opt, int opt_index, void *data)
 	case 'a':
 		check_all_pixels = true;
 		break;
+	case 'c':
+		dump_compressed_src_buf = true;
+		break;
 	default:
 		return IGT_OPT_HANDLER_ERROR;
 	}
@@ -659,6 +598,7 @@ static int opt_handler(int opt, int opt_index, void *data)
 const char *help_str =
 	"  -d\tDump PNG\n"
 	"  -a\tCheck all pixels\n"
+	"  -c\tDump compressed src surface\n"
 	;
 
 static void buf_mode_to_str(uint32_t tiling, bool mixed_tiled,
@@ -705,7 +645,7 @@ static void buf_mode_to_str(uint32_t tiling, bool mixed_tiled,
 		 tiling_str, compression_str[0] ? "-" : "", compression_str);
 }
 
-igt_main_args("da", NULL, help_str, opt_handler, NULL)
+igt_main_args("dac", NULL, help_str, opt_handler, NULL)
 {
 	static const struct test_desc {
 		int src_tiling;
@@ -849,20 +789,14 @@ igt_main_args("da", NULL, help_str, opt_handler, NULL)
 		data.devid = intel_get_drm_devid(data.drm_fd);
 		igt_require_gem(data.drm_fd);
 
-		data.bufmgr = drm_intel_bufmgr_gem_init(data.drm_fd, 4096);
-		igt_assert(data.bufmgr);
-
 		data.render_copy = igt_get_render_copyfunc(data.devid);
 		igt_require_f(data.render_copy,
 			      "no render-copy function\n");
 
 		data.vebox_copy = igt_get_vebox_copyfunc(data.devid);
 
-		data.batch = intel_batchbuffer_alloc(data.bufmgr, data.devid);
-		igt_assert(data.batch);
-
-		data.bmgr = rendercopy_bufmgr_create(data.drm_fd, data.bufmgr);
-		igt_assert(data.bmgr);
+		data.bops = buf_ops_create(data.drm_fd);
+		data.ibb = intel_bb_create(data.drm_fd, 4096);
 
 		igt_fork_hang_detector(data.drm_fd);
 	}
@@ -915,8 +849,7 @@ igt_main_args("da", NULL, help_str, opt_handler, NULL)
 
 	igt_fixture {
 		igt_stop_hang_detector();
-		intel_batchbuffer_free(data.batch);
-		drm_intel_bufmgr_destroy(data.bufmgr);
-		rendercopy_bufmgr_destroy(data.bmgr);
+		intel_bb_destroy(data.ibb);
+		buf_ops_destroy(data.bops);
 	}
 }
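scratch_buf_copy() above clamps the requested rectangle against both surfaces before doing a row-wise memcpy. A self-contained sketch of that clamping logic, assuming equally sized linear 32bpp surfaces (names are illustrative):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Row-wise rectangle copy between two linear surfaces of the same
 * width/height, clamping the rectangle so neither the source read nor
 * the destination write runs past the surface edges.
 */
static void copy_rect(const uint32_t *src, uint32_t *dst,
		      int width, int height,
		      int sx, int sy, int w, int h, int dx, int dy)
{
	w = MIN(w, width - sx);
	w = MIN(w, width - dx);
	h = MIN(h, height - sy);
	h = MIN(h, height - dy);

	for (int y = 0; y < h; y++)
		memcpy(&dst[(dy + y) * width + dx],
		       &src[(sy + y) * width + sx],
		       w * sizeof(uint32_t));
}
```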
diff --git a/tests/i915/gem_render_copy_redux.c b/tests/i915/gem_render_copy_redux.c
index a3f77e84..86f2dba2 100644
--- a/tests/i915/gem_render_copy_redux.c
+++ b/tests/i915/gem_render_copy_redux.c
@@ -63,8 +63,8 @@ IGT_TEST_DESCRIPTION("Advanced test for the render_copy() function.");
 typedef struct {
 	int fd;
 	uint32_t devid;
-	drm_intel_bufmgr *bufmgr;
-	struct intel_batchbuffer *batch;
+	struct buf_ops *bops;
+	struct intel_bb *ibb;
 	igt_render_copyfunc_t render_copy;
 	uint32_t linear[WIDTH * HEIGHT];
 } data_t;
@@ -74,58 +74,50 @@ static void data_init(data_t *data)
 	data->fd = drm_open_driver(DRIVER_INTEL);
 	data->devid = intel_get_drm_devid(data->fd);
 
-	data->bufmgr = drm_intel_bufmgr_gem_init(data->fd, 4096);
-	igt_assert(data->bufmgr);
-
+	data->bops = buf_ops_create(data->fd);
 	data->render_copy = igt_get_render_copyfunc(data->devid);
 	igt_require_f(data->render_copy,
 		      "no render-copy function\n");
 
-	data->batch = intel_batchbuffer_alloc(data->bufmgr, data->devid);
-	igt_assert(data->batch);
+	data->ibb = intel_bb_create(data->fd, 4096);
 }
 
 static void data_fini(data_t *data)
 {
-	 intel_batchbuffer_free(data->batch);
-	 drm_intel_bufmgr_destroy(data->bufmgr);
-	 close(data->fd);
+	intel_bb_destroy(data->ibb);
+	buf_ops_destroy(data->bops);
+	close(data->fd);
 }
 
-static void scratch_buf_init(data_t *data, struct igt_buf *buf,
+static void scratch_buf_init(data_t *data, struct intel_buf *buf,
 			     int width, int height, int stride, uint32_t color)
 {
-	drm_intel_bo *bo;
 	int i;
 
-	bo = drm_intel_bo_alloc(data->bufmgr, "", SIZE, 4096);
+	intel_buf_init(data->bops, buf, width, height, 32, 0,
+		       I915_TILING_NONE, I915_COMPRESSION_NONE);
+	igt_assert(buf->surface[0].size == SIZE);
+	igt_assert(buf->surface[0].stride == stride);
+
 	for (i = 0; i < width * height; i++)
 		data->linear[i] = color;
-	gem_write(data->fd, bo->handle, 0, data->linear,
+	gem_write(data->fd, buf->handle, 0, data->linear,
 		  sizeof(data->linear));
-
-	memset(buf, 0, sizeof(*buf));
-
-	buf->bo = bo;
-	buf->surface[0].stride = stride;
-	buf->tiling = I915_TILING_NONE;
-	buf->surface[0].size = SIZE;
-	buf->bpp = 32;
 }
 
-static void scratch_buf_fini(data_t *data, struct igt_buf *buf)
+static void scratch_buf_fini(data_t *data, struct intel_buf *buf)
 {
-	dri_bo_unreference(buf->bo);
+	intel_buf_close(data->bops, buf);
 	memset(buf, 0, sizeof(*buf));
 }
 
 static void
-scratch_buf_check(data_t *data, struct igt_buf *buf, int x, int y,
+scratch_buf_check(data_t *data, struct intel_buf *buf, int x, int y,
 		  uint32_t color)
 {
 	uint32_t val;
 
-	gem_read(data->fd, buf->bo->handle, 0,
+	gem_read(data->fd, buf->handle, 0,
 		 data->linear, sizeof(data->linear));
 	val = data->linear[y * WIDTH + x];
 	igt_assert_f(val == color,
@@ -135,7 +127,7 @@ scratch_buf_check(data_t *data, struct igt_buf *buf, int x, int y,
 
 static void copy(data_t *data)
 {
-	struct igt_buf src, dst;
+	struct intel_buf src, dst;
 
 	scratch_buf_init(data, &src, WIDTH, HEIGHT, STRIDE, SRC_COLOR);
 	scratch_buf_init(data, &dst, WIDTH, HEIGHT, STRIDE, DST_COLOR);
@@ -143,7 +135,7 @@ static void copy(data_t *data)
 	scratch_buf_check(data, &src, WIDTH / 2, HEIGHT / 2, SRC_COLOR);
 	scratch_buf_check(data, &dst, WIDTH / 2, HEIGHT / 2, DST_COLOR);
 
-	data->render_copy(data->batch, NULL,
+	data->render_copy(data->ibb, 0,
 			  &src, 0, 0, WIDTH, HEIGHT,
 			  &dst, WIDTH / 2, HEIGHT / 2);
 
@@ -157,9 +149,9 @@ static void copy(data_t *data)
 static void copy_flink(data_t *data)
 {
 	data_t local;
-	struct igt_buf src, dst;
-	struct igt_buf local_src, local_dst;
-	struct igt_buf flink;
+	struct intel_buf src, dst;
+	struct intel_buf local_src, local_dst;
+	struct intel_buf flink;
 	uint32_t name;
 
 	data_init(&local);
@@ -167,23 +159,22 @@ static void copy_flink(data_t *data)
 	scratch_buf_init(data, &src, WIDTH, HEIGHT, STRIDE, 0);
 	scratch_buf_init(data, &dst, WIDTH, HEIGHT, STRIDE, DST_COLOR);
 
-	data->render_copy(data->batch, NULL,
+	data->render_copy(data->ibb, 0,
 			  &src, 0, 0, WIDTH, HEIGHT,
 			  &dst, WIDTH, HEIGHT);
 
 	scratch_buf_init(&local, &local_src, WIDTH, HEIGHT, STRIDE, 0);
 	scratch_buf_init(&local, &local_dst, WIDTH, HEIGHT, STRIDE, SRC_COLOR);
 
-	local.render_copy(local.batch, NULL,
+	local.render_copy(local.ibb, 0,
 			  &local_src, 0, 0, WIDTH, HEIGHT,
 			  &local_dst, WIDTH, HEIGHT);
 
-
-	drm_intel_bo_flink(local_dst.bo, &name);
+	name = gem_flink(local.fd, local_dst.handle);
 	flink = local_dst;
-	flink.bo = drm_intel_bo_gem_create_from_name(data->bufmgr, "flink", name);
+	flink.handle = gem_open(data->fd, name);
 
-	data->render_copy(data->batch, NULL,
+	data->render_copy(data->ibb, 0,
 			  &flink, 0, 0, WIDTH, HEIGHT,
 			  &dst, WIDTH / 2, HEIGHT / 2);
 
@@ -193,6 +184,7 @@ static void copy_flink(data_t *data)
 	scratch_buf_check(data, &dst, 10, 10, DST_COLOR);
 	scratch_buf_check(data, &dst, WIDTH - 10, HEIGHT - 10, SRC_COLOR);
 
+	intel_bb_reset(data->ibb, true);
 	scratch_buf_fini(data, &src);
 	scratch_buf_fini(data, &flink);
 	scratch_buf_fini(data, &dst);
diff --git a/tests/i915/gem_render_linear_blits.c b/tests/i915/gem_render_linear_blits.c
index 55a9e6be..8d044e9e 100644
--- a/tests/i915/gem_render_linear_blits.c
+++ b/tests/i915/gem_render_linear_blits.c
@@ -49,7 +49,6 @@
 
 #include "i915/gem.h"
 #include "igt.h"
-#include "intel_bufmgr.h"
 
 #define WIDTH 512
 #define STRIDE (WIDTH*4)
@@ -60,7 +59,7 @@ static uint32_t linear[WIDTH*HEIGHT];
 static igt_render_copyfunc_t render_copy;
 
 static void
-check_bo(int fd, uint32_t handle, uint32_t val)
+check_buf(int fd, uint32_t handle, uint32_t val)
 {
 	int i;
 
@@ -76,114 +75,89 @@ check_bo(int fd, uint32_t handle, uint32_t val)
 
 static void run_test (int fd, int count)
 {
-	drm_intel_bufmgr *bufmgr;
-	struct intel_batchbuffer *batch;
+	struct buf_ops *bops;
+	struct intel_bb *ibb;
 	uint32_t *start_val;
-	drm_intel_bo **bo;
+	struct intel_buf *bufs;
 	uint32_t start = 0;
 	int i, j;
 
 	render_copy = igt_get_render_copyfunc(intel_get_drm_devid(fd));
 	igt_require(render_copy);
 
-	bufmgr = drm_intel_bufmgr_gem_init(fd, 4096);
-	batch = intel_batchbuffer_alloc(bufmgr, intel_get_drm_devid(fd));
+	bops = buf_ops_create(fd);
+	ibb = intel_bb_create(fd, 4096);
 
-	bo = malloc(sizeof(*bo)*count);
+	bufs = malloc(sizeof(*bufs)*count);
 	start_val = malloc(sizeof(*start_val)*count);
 
 	for (i = 0; i < count; i++) {
-		bo[i] = drm_intel_bo_alloc(bufmgr, "", SIZE, 4096);
+		intel_buf_init(bops, &bufs[i], WIDTH, HEIGHT, 32, 0,
+			       I915_TILING_NONE, I915_COMPRESSION_NONE);
 		start_val[i] = start;
 		for (j = 0; j < WIDTH*HEIGHT; j++)
 			linear[j] = start++;
-		gem_write(fd, bo[i]->handle, 0, linear, sizeof(linear));
+		gem_write(fd, bufs[i].handle, 0, linear, sizeof(linear));
 	}
 
 	igt_info("Verifying initialisation - %d buffers of %d bytes\n", count, SIZE);
 	for (i = 0; i < count; i++)
-		check_bo(fd, bo[i]->handle, start_val[i]);
+		check_buf(fd, bufs[i].handle, start_val[i]);
 
 	igt_info("Cyclic blits, forward...\n");
 	for (i = 0; i < count * 4; i++) {
-		struct igt_buf src = {}, dst = {};
+		struct intel_buf *src, *dst;
 
-		src.bo = bo[i % count];
-		src.surface[0].stride = STRIDE;
-		src.tiling = I915_TILING_NONE;
-		src.surface[0].size = SIZE;
-		src.bpp = 32;
+		src = &bufs[i % count];
+		dst = &bufs[(i + 1) % count];
 
-		dst.bo = bo[(i + 1) % count];
-		dst.surface[0].stride = STRIDE;
-		dst.tiling = I915_TILING_NONE;
-		dst.surface[0].size = SIZE;
-		dst.bpp = 32;
-
-		render_copy(batch, NULL, &src, 0, 0, WIDTH, HEIGHT, &dst, 0, 0);
+		render_copy(ibb, 0, src, 0, 0, WIDTH, HEIGHT, dst, 0, 0);
 		start_val[(i + 1) % count] = start_val[i % count];
 	}
+
 	for (i = 0; i < count; i++)
-		check_bo(fd, bo[i]->handle, start_val[i]);
+		check_buf(fd, bufs[i].handle, start_val[i]);
 
 	if (igt_run_in_simulation())
 		return;
 
 	igt_info("Cyclic blits, backward...\n");
 	for (i = 0; i < count * 4; i++) {
-		struct igt_buf src = {}, dst = {};
+		struct intel_buf *src, *dst;
 
-		src.bo = bo[(i + 1) % count];
-		src.surface[0].stride = STRIDE;
-		src.tiling = I915_TILING_NONE;
-		src.surface[0].size = SIZE;
-		src.bpp = 32;
+		src = &bufs[(i + 1) % count];
+		dst = &bufs[i % count];
 
-		dst.bo = bo[i % count];
-		dst.surface[0].stride = STRIDE;
-		dst.tiling = I915_TILING_NONE;
-		dst.surface[0].size = SIZE;
-		dst.bpp = 32;
-
-		render_copy(batch, NULL, &src, 0, 0, WIDTH, HEIGHT, &dst, 0, 0);
+		render_copy(ibb, 0, src, 0, 0, WIDTH, HEIGHT, dst, 0, 0);
 		start_val[i % count] = start_val[(i + 1) % count];
 	}
 	for (i = 0; i < count; i++)
-		check_bo(fd, bo[i]->handle, start_val[i]);
+		check_buf(fd, bufs[i].handle, start_val[i]);
 
 	igt_info("Random blits...\n");
 	for (i = 0; i < count * 4; i++) {
-		struct igt_buf src = {}, dst = {};
+		struct intel_buf *src, *dst;
 		int s = random() % count;
 		int d = random() % count;
 
 		if (s == d)
 			continue;
 
-		src.bo = bo[s];
-		src.surface[0].stride = STRIDE;
-		src.tiling = I915_TILING_NONE;
-		src.surface[0].size = SIZE;
-		src.bpp = 32;
-
-		dst.bo = bo[d];
-		dst.surface[0].stride = STRIDE;
-		dst.tiling = I915_TILING_NONE;
-		dst.surface[0].size = SIZE;
-		dst.bpp = 32;
+		src = &bufs[s];
+		dst = &bufs[d];
 
-		render_copy(batch, NULL, &src, 0, 0, WIDTH, HEIGHT, &dst, 0, 0);
+		render_copy(ibb, 0, src, 0, 0, WIDTH, HEIGHT, dst, 0, 0);
 		start_val[d] = start_val[s];
 	}
 	for (i = 0; i < count; i++)
-		check_bo(fd, bo[i]->handle, start_val[i]);
+		check_buf(fd, bufs[i].handle, start_val[i]);
 
 	/* release resources */
-	for (i = 0; i < count; i++) {
-		drm_intel_bo_unreference(bo[i]);
-	}
-	intel_batchbuffer_free(batch);
-	drm_intel_bufmgr_destroy(bufmgr);
+	for (i = 0; i < count; i++)
+		intel_buf_close(bops, &bufs[i]);
+	free(bufs);
+	intel_bb_destroy(ibb);
+	buf_ops_destroy(bops);
 }
 
 igt_main
diff --git a/tests/i915/gem_render_tiled_blits.c b/tests/i915/gem_render_tiled_blits.c
index bd76066a..f321f90d 100644
--- a/tests/i915/gem_render_tiled_blits.c
+++ b/tests/i915/gem_render_tiled_blits.c
@@ -42,12 +42,12 @@
 #include <errno.h>
 #include <sys/stat.h>
 #include <sys/time.h>
+#include <sys/mman.h>
 
 #include <drm.h>
 
 #include "i915/gem.h"
 #include "igt.h"
-#include "intel_bufmgr.h"
 
 #define WIDTH 512
 #define STRIDE (WIDTH*4)
@@ -55,29 +55,25 @@
 #define SIZE (HEIGHT*STRIDE)
 
 static igt_render_copyfunc_t render_copy;
-static drm_intel_bo *linear;
+static struct intel_buf linear;
 static uint32_t data[WIDTH*HEIGHT];
 static int snoop;
 
 static void
-check_bo(struct intel_batchbuffer *batch, struct igt_buf *buf, uint32_t val)
+check_buf(struct intel_bb *ibb, struct intel_buf *buf, uint32_t val)
 {
-	struct igt_buf tmp = {};
+	int i915 = buf_ops_get_fd(linear.bops);
 	uint32_t *ptr;
 	int i;
 
-	tmp.bo = linear;
-	tmp.surface[0].stride = STRIDE;
-	tmp.tiling = I915_TILING_NONE;
-	tmp.surface[0].size = SIZE;
-	tmp.bpp = 32;
+	render_copy(ibb, 0, buf, 0, 0, WIDTH, HEIGHT, &linear, 0, 0);
 
-	render_copy(batch, NULL, buf, 0, 0, WIDTH, HEIGHT, &tmp, 0, 0);
 	if (snoop) {
-		do_or_die(drm_intel_bo_map(linear, 0));
-		ptr = linear->virtual;
+		ptr = gem_mmap__cpu_coherent(i915, linear.handle, 0,
+					     linear.surface[0].size, PROT_READ);
+		gem_set_domain(i915, linear.handle, I915_GEM_DOMAIN_CPU, 0);
 	} else {
-		do_or_die(drm_intel_bo_get_subdata(linear, 0, sizeof(data), data));
+		gem_read(i915, linear.handle, 0, data, sizeof(data));
 		ptr = data;
 	}
 	for (i = 0; i < WIDTH*HEIGHT; i++) {
@@ -88,15 +84,15 @@ check_bo(struct intel_batchbuffer *batch, struct igt_buf *buf, uint32_t val)
 		val++;
 	}
 	if (ptr != data)
-		drm_intel_bo_unmap(linear);
+		munmap(ptr, linear.surface[0].size);
 }
 
 static void run_test (int fd, int count)
 {
-	drm_intel_bufmgr *bufmgr;
-	struct intel_batchbuffer *batch;
+	struct buf_ops *bops;
+	struct intel_bb *ibb;
 	uint32_t *start_val;
-	struct igt_buf *buf;
+	struct intel_buf *bufs;
 	uint32_t start = 0;
 	int i, j;
 	uint32_t devid;
@@ -112,66 +108,62 @@ static void run_test (int fd, int count)
 	if (IS_BROADWATER(devid) || IS_CRESTLINE(devid)) /* snafu */
 		snoop = 0;
 
-	bufmgr = drm_intel_bufmgr_gem_init(fd, 4096);
-	drm_intel_bufmgr_gem_set_vma_cache_size(bufmgr, 32);
-	batch = intel_batchbuffer_alloc(bufmgr, devid);
+	bops = buf_ops_create(fd);
+	ibb = intel_bb_create(fd, 4096);
 
-	linear = drm_intel_bo_alloc(bufmgr, "linear", WIDTH*HEIGHT*4, 0);
+	intel_buf_init(bops, &linear, WIDTH, HEIGHT, 32, 0,
+		       I915_TILING_NONE, I915_COMPRESSION_NONE);
 	if (snoop) {
-		gem_set_caching(fd, linear->handle, 1);
+		gem_set_caching(fd, linear.handle, 1);
 		igt_info("Using a snoop linear buffer for comparisons\n");
 	}
 
-	buf = calloc(sizeof(*buf), count);
+	bufs = calloc(sizeof(*bufs), count);
 	start_val = malloc(sizeof(*start_val)*count);
 
 	for (i = 0; i < count; i++) {
 		uint32_t tiling = I915_TILING_X + (random() & 1);
-		unsigned long pitch = STRIDE;
 		uint32_t *ptr;
 
-		buf[i].bo = drm_intel_bo_alloc_tiled(bufmgr, "",
-						     WIDTH, HEIGHT, 4,
-						     &tiling, &pitch, 0);
-		buf[i].surface[0].stride = pitch;
-		buf[i].tiling = tiling;
-		buf[i].surface[0].size = SIZE;
-		buf[i].bpp = 32;
-
+		intel_buf_init(bops, &bufs[i], WIDTH, HEIGHT, 32, 0,
+			       tiling, I915_COMPRESSION_NONE);
 		start_val[i] = start;
 
-		do_or_die(drm_intel_gem_bo_map_gtt(buf[i].bo));
-		ptr = buf[i].bo->virtual;
+		ptr = gem_mmap__gtt(fd, bufs[i].handle,
+				    bufs[i].surface[0].size, PROT_WRITE);
 		for (j = 0; j < WIDTH*HEIGHT; j++)
 			ptr[j] = start++;
-		drm_intel_gem_bo_unmap_gtt(buf[i].bo);
+
+		munmap(ptr, bufs[i].surface[0].size);
 	}
 
 	igt_info("Verifying initialisation...\n");
 	for (i = 0; i < count; i++)
-		check_bo(batch, &buf[i], start_val[i]);
+		check_buf(ibb, &bufs[i], start_val[i]);
 
 	igt_info("Cyclic blits, forward...\n");
 	for (i = 0; i < count * 4; i++) {
 		int src = i % count;
 		int dst = (i + 1) % count;
 
-		render_copy(batch, NULL, buf+src, 0, 0, WIDTH, HEIGHT, buf+dst, 0, 0);
+		render_copy(ibb, 0, &bufs[src], 0, 0, WIDTH, HEIGHT,
+			    &bufs[dst], 0, 0);
 		start_val[dst] = start_val[src];
 	}
 	for (i = 0; i < count; i++)
-		check_bo(batch, &buf[i], start_val[i]);
+		check_buf(ibb, &bufs[i], start_val[i]);
 
 	igt_info("Cyclic blits, backward...\n");
 	for (i = 0; i < count * 4; i++) {
 		int src = (i + 1) % count;
 		int dst = i % count;
 
-		render_copy(batch, NULL, buf+src, 0, 0, WIDTH, HEIGHT, buf+dst, 0, 0);
+		render_copy(ibb, 0, &bufs[src], 0, 0, WIDTH, HEIGHT,
+			    &bufs[dst], 0, 0);
 		start_val[dst] = start_val[src];
 	}
 	for (i = 0; i < count; i++)
-		check_bo(batch, &buf[i], start_val[i]);
+		check_buf(ibb, &bufs[i], start_val[i]);
 
 	igt_info("Random blits...\n");
 	for (i = 0; i < count * 4; i++) {
@@ -181,19 +173,20 @@ static void run_test (int fd, int count)
 		if (src == dst)
 			continue;
 
-		render_copy(batch, NULL, buf+src, 0, 0, WIDTH, HEIGHT, buf+dst, 0, 0);
+		render_copy(ibb, 0, &bufs[src], 0, 0, WIDTH, HEIGHT,
+			    &bufs[dst], 0, 0);
 		start_val[dst] = start_val[src];
 	}
 	for (i = 0; i < count; i++)
-		check_bo(batch, &buf[i], start_val[i]);
+		check_buf(ibb, &bufs[i], start_val[i]);
 
 	/* release resources */
-	drm_intel_bo_unreference(linear);
+	intel_buf_close(bops, &linear);
 	for (i = 0; i < count; i++) {
-		drm_intel_bo_unreference(buf[i].bo);
+		intel_buf_close(bops, &bufs[i]);
 	}
-	intel_batchbuffer_free(batch);
-	drm_intel_bufmgr_destroy(bufmgr);
+	intel_bb_destroy(ibb);
+	buf_ops_destroy(bops);
 }
 
 
diff --git a/tests/i915/gem_stress.c b/tests/i915/gem_stress.c
index 50245b93..22d64749 100644
--- a/tests/i915/gem_stress.c
+++ b/tests/i915/gem_stress.c
@@ -62,8 +62,6 @@
 
 #include <drm.h>
 
-#include "intel_bufmgr.h"
-
 IGT_TEST_DESCRIPTION("General gem coherency test.");
 
 #define CMD_POLY_STIPPLE_OFFSET       0x7906
@@ -84,13 +82,13 @@ IGT_TEST_DESCRIPTION("General gem coherency test.");
  *   first one (to check consistency of the kernel recovery paths)
  */
 
-drm_intel_bufmgr *bufmgr;
-struct intel_batchbuffer *batch;
+struct buf_ops *bops;
+struct intel_bb *ibb;
 int drm_fd;
 int devid;
 int num_fences;
 
-drm_intel_bo *busy_bo;
+struct intel_buf busy_bo;
 
 struct option_struct {
     unsigned scratch_buf_size;
@@ -136,7 +134,7 @@ struct option_struct options = {
 	.check_render_cpyfn = 0,
 };
 
-static struct igt_buf buffers[2][MAX_BUFS];
+static struct intel_buf buffers[2][MAX_BUFS];
 /* tile i is at logical position tile_permutation[i] */
 static unsigned *tile_permutation;
 static unsigned num_buffers = 0;
@@ -152,16 +150,16 @@ struct {
 	unsigned max_failed_reads;
 } stats;
 
-static void tile2xy(struct igt_buf *buf, unsigned tile, unsigned *x, unsigned *y)
+static void tile2xy(struct intel_buf *buf, unsigned tile, unsigned *x, unsigned *y)
 {
-	igt_assert(tile < buf->num_tiles);
+	igt_assert(tile < options.tiles_per_buf);
 	*x = (tile*options.tile_size) % (buf->surface[0].stride/sizeof(uint32_t));
 	*y = ((tile*options.tile_size) / (buf->surface[0].stride/sizeof(uint32_t))) * options.tile_size;
 }
 
-static void emit_blt(drm_intel_bo *src_bo, uint32_t src_tiling, unsigned src_pitch,
+static void emit_blt(struct intel_buf *src, uint32_t src_tiling, unsigned src_pitch,
 		     unsigned src_x, unsigned src_y, unsigned w, unsigned h,
-		     drm_intel_bo *dst_bo, uint32_t dst_tiling, unsigned dst_pitch,
+		     struct intel_buf *dst, uint32_t dst_tiling, unsigned dst_pitch,
 		     unsigned dst_x, unsigned dst_y)
 {
 	uint32_t cmd_bits = 0;
@@ -177,24 +175,26 @@ static void emit_blt(drm_intel_bo *src_bo, uint32_t src_tiling, unsigned src_pit
 	}
 
 	/* copy lower half to upper half */
-	BLIT_COPY_BATCH_START(cmd_bits);
-	OUT_BATCH((3 << 24) | /* 32 bits */
-		  (0xcc << 16) | /* copy ROP */
-		  dst_pitch);
-	OUT_BATCH(dst_y << 16 | dst_x);
-	OUT_BATCH((dst_y+h) << 16 | (dst_x+w));
-	OUT_RELOC_FENCED(dst_bo, I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, 0);
-	OUT_BATCH(src_y << 16 | src_x);
-	OUT_BATCH(src_pitch);
-	OUT_RELOC_FENCED(src_bo, I915_GEM_DOMAIN_RENDER, 0, 0);
-	ADVANCE_BATCH();
-
-	if (batch->gen >= 6) {
-		BEGIN_BATCH(3, 0);
-		OUT_BATCH(XY_SETUP_CLIP_BLT_CMD);
-		OUT_BATCH(0);
-		OUT_BATCH(0);
-		ADVANCE_BATCH();
+	intel_bb_blit_start(ibb, cmd_bits);
+	intel_bb_out(ibb, (3 << 24) | /* 32 bits */
+		     (0xcc << 16) | /* copy ROP */
+		     dst_pitch);
+	intel_bb_out(ibb, dst_y << 16 | dst_x);
+	intel_bb_out(ibb, (dst_y+h) << 16 | (dst_x+w));
+	intel_bb_emit_reloc_fenced(ibb, dst->handle,
+				   I915_GEM_DOMAIN_RENDER,
+				   I915_GEM_DOMAIN_RENDER,
+				   0, dst->addr.offset);
+	intel_bb_out(ibb, src_y << 16 | src_x);
+	intel_bb_out(ibb, src_pitch);
+	intel_bb_emit_reloc_fenced(ibb, src->handle,
+				   I915_GEM_DOMAIN_RENDER, 0,
+				   0, src->addr.offset);
+
+	if (ibb->gen >= 6) {
+		intel_bb_out(ibb, XY_SETUP_CLIP_BLT_CMD);
+		intel_bb_out(ibb, 0);
+		intel_bb_out(ibb, 0);
 	}
 }
 
@@ -207,19 +207,25 @@ static void keep_gpu_busy(void)
 	tmp = 1 << gpu_busy_load;
 	igt_assert_lte(tmp, 1024);
 
-	emit_blt(busy_bo, 0, 4096, 0, 0, tmp, 128,
-		 busy_bo, 0, 4096, 0, 128);
+	emit_blt(&busy_bo, 0, 4096, 0, 0, tmp, 128,
+		 &busy_bo, 0, 4096, 0, 128);
 }
 
-static void set_to_cpu_domain(struct igt_buf *buf, int writing)
+static void set_to_cpu_domain(struct intel_buf *buf, int writing)
 {
-	gem_set_domain(drm_fd, buf->bo->handle, I915_GEM_DOMAIN_CPU,
+	gem_set_domain(drm_fd, buf->handle, I915_GEM_DOMAIN_CPU,
 		       writing ? I915_GEM_DOMAIN_CPU : 0);
 }
 
+static void set_to_gtt_domain(struct intel_buf *buf, int writing)
+{
+	gem_set_domain(drm_fd, buf->handle, I915_GEM_DOMAIN_GTT,
+		       writing ? I915_GEM_DOMAIN_GTT : 0);
+}
+
 static unsigned int copyfunc_seq = 0;
-static void (*copyfunc)(struct igt_buf *src, unsigned src_x, unsigned src_y,
-			struct igt_buf *dst, unsigned dst_x, unsigned dst_y,
+static void (*copyfunc)(struct intel_buf *src, unsigned src_x, unsigned src_y,
+			struct intel_buf *dst, unsigned dst_x, unsigned dst_y,
 			unsigned logical_tile_no);
 
 /* stride, x, y in units of uint32_t! */
@@ -254,51 +260,53 @@ static void cpucpy2d(uint32_t *src, unsigned src_stride, unsigned src_x, unsigne
 		stats.num_failed++;
 }
 
-static void cpu_copyfunc(struct igt_buf *src, unsigned src_x, unsigned src_y,
-			 struct igt_buf *dst, unsigned dst_x, unsigned dst_y,
+static void cpu_copyfunc(struct intel_buf *src, unsigned src_x, unsigned src_y,
+			 struct intel_buf *dst, unsigned dst_x, unsigned dst_y,
 			 unsigned logical_tile_no)
 {
-	igt_assert(batch->ptr == batch->buffer);
+	igt_assert(src->ptr);
+	igt_assert(dst->ptr);
 
 	if (options.ducttape)
-		drm_intel_bo_wait_rendering(dst->bo);
+		set_to_gtt_domain(dst, 1);
 
 	if (options.use_cpu_maps) {
 		set_to_cpu_domain(src, 0);
 		set_to_cpu_domain(dst, 1);
 	}
 
-	cpucpy2d(src->data, src->surface[0].stride/sizeof(uint32_t), src_x,
-		 src_y,
-		 dst->data, dst->surface[0].stride/sizeof(uint32_t), dst_x,
-		 dst_y,
+	cpucpy2d(src->ptr, src->surface[0].stride/sizeof(uint32_t), src_x, src_y,
+		 dst->ptr, dst->surface[0].stride/sizeof(uint32_t), dst_x, dst_y,
 		 logical_tile_no);
 }
 
-static void prw_copyfunc(struct igt_buf *src, unsigned src_x, unsigned src_y,
-			 struct igt_buf *dst, unsigned dst_x, unsigned dst_y,
+static void prw_copyfunc(struct intel_buf *src, unsigned src_x, unsigned src_y,
+			 struct intel_buf *dst, unsigned dst_x, unsigned dst_y,
 			 unsigned logical_tile_no)
 {
 	uint32_t tmp_tile[options.tile_size*options.tile_size];
 	int i;
 
-	igt_assert(batch->ptr == batch->buffer);
+	igt_assert(src->ptr);
+	igt_assert(dst->ptr);
+
+	igt_info("prw\n");
 
 	if (options.ducttape)
-		drm_intel_bo_wait_rendering(dst->bo);
+		set_to_gtt_domain(dst, 1);
 
 	if (src->tiling == I915_TILING_NONE) {
 		for (i = 0; i < options.tile_size; i++) {
 			unsigned ofs = src_x*sizeof(uint32_t) + src->surface[0].stride*(src_y + i);
-			drm_intel_bo_get_subdata(src->bo, ofs,
-						 options.tile_size*sizeof(uint32_t),
-						 tmp_tile + options.tile_size*i);
+			gem_read(drm_fd, src->handle, ofs,
+				 tmp_tile + options.tile_size*i,
+				 options.tile_size*sizeof(uint32_t));
 		}
 	} else {
 		if (options.use_cpu_maps)
 			set_to_cpu_domain(src, 0);
 
-		cpucpy2d(src->data, src->surface[0].stride/sizeof(uint32_t),
+		cpucpy2d(src->ptr, src->surface[0].stride/sizeof(uint32_t),
 			 src_x, src_y,
 			 tmp_tile, options.tile_size, 0, 0, logical_tile_no);
 	}
@@ -306,23 +314,23 @@ static void prw_copyfunc(struct igt_buf *src, unsigned src_x, unsigned src_y,
 	if (dst->tiling == I915_TILING_NONE) {
 		for (i = 0; i < options.tile_size; i++) {
 			unsigned ofs = dst_x*sizeof(uint32_t) + dst->surface[0].stride*(dst_y + i);
-			drm_intel_bo_subdata(dst->bo, ofs,
-					     options.tile_size*sizeof(uint32_t),
-					     tmp_tile + options.tile_size*i);
+			gem_write(drm_fd, dst->handle, ofs,
+				  tmp_tile + options.tile_size*i,
+				  options.tile_size*sizeof(uint32_t));
 		}
 	} else {
 		if (options.use_cpu_maps)
 			set_to_cpu_domain(dst, 1);
 
 		cpucpy2d(tmp_tile, options.tile_size, 0, 0,
-			 dst->data, dst->surface[0].stride/sizeof(uint32_t),
+			 dst->ptr, dst->surface[0].stride/sizeof(uint32_t),
 			 dst_x, dst_y,
 			 logical_tile_no);
 	}
 }
 
-static void blitter_copyfunc(struct igt_buf *src, unsigned src_x, unsigned src_y,
-			     struct igt_buf *dst, unsigned dst_x, unsigned dst_y,
+static void blitter_copyfunc(struct intel_buf *src, unsigned src_x, unsigned src_y,
+			     struct intel_buf *dst, unsigned dst_x, unsigned dst_y,
 			     unsigned logical_tile_no)
 {
 	static unsigned keep_gpu_busy_counter = 0;
@@ -331,9 +339,9 @@ static void blitter_copyfunc(struct igt_buf *src, unsigned src_x, unsigned src_y
 	if (keep_gpu_busy_counter & 1 && !fence_storm)
 		keep_gpu_busy();
 
-	emit_blt(src->bo, src->tiling, src->surface[0].stride, src_x, src_y,
+	emit_blt(src, src->tiling, src->surface[0].stride, src_x, src_y,
 		 options.tile_size, options.tile_size,
-		 dst->bo, dst->tiling, dst->surface[0].stride, dst_x, dst_y);
+		 dst, dst->tiling, dst->surface[0].stride, dst_x, dst_y);
 
 	if (!(keep_gpu_busy_counter & 1) && !fence_storm)
 		keep_gpu_busy();
@@ -347,12 +355,12 @@ static void blitter_copyfunc(struct igt_buf *src, unsigned src_x, unsigned src_y
 
 	if (fence_storm <= 1) {
 		fence_storm = 0;
-		intel_batchbuffer_flush(batch);
+		intel_bb_flush_blit(ibb);
 	}
 }
 
-static void render_copyfunc(struct igt_buf *src, unsigned src_x, unsigned src_y,
-			    struct igt_buf *dst, unsigned dst_x, unsigned dst_y,
+static void render_copyfunc(struct intel_buf *src, unsigned src_x, unsigned src_y,
+			    struct intel_buf *dst, unsigned dst_x, unsigned dst_y,
 			    unsigned logical_tile_no)
 {
 	static unsigned keep_gpu_busy_counter = 0;
@@ -367,8 +375,9 @@ static void render_copyfunc(struct igt_buf *src, unsigned src_x, unsigned src_y,
 		 * Flush outstanding blts so that they don't end up on
 		 * the render ring when that's not allowed (gen6+).
 		 */
-		intel_batchbuffer_flush(batch);
-		rendercopy(batch, NULL, src, src_x, src_y,
+		intel_bb_flush_blit(ibb);
+
+		rendercopy(ibb, 0, src, src_x, src_y,
 		     options.tile_size, options.tile_size,
 		     dst, dst_x, dst_y);
 	} else
@@ -379,7 +388,7 @@ static void render_copyfunc(struct igt_buf *src, unsigned src_x, unsigned src_y,
 		keep_gpu_busy();
 
 	keep_gpu_busy_counter++;
-	intel_batchbuffer_flush(batch);
+	intel_bb_flush_blit(ibb);
 }
 
 static void next_copyfunc(int tile)
@@ -444,7 +453,7 @@ static void fan_out(void)
 			set_to_cpu_domain(&buffers[current_set][buf_idx], 1);
 
 		cpucpy2d(tmp_tile, options.tile_size, 0, 0,
-			 buffers[current_set][buf_idx].data,
+			 buffers[current_set][buf_idx].ptr,
 			 buffers[current_set][buf_idx].surface[0].stride / sizeof(uint32_t),
 			 x, y, i);
 	}
@@ -468,7 +477,7 @@ static void fan_in_and_check(void)
 		if (options.use_cpu_maps)
 			set_to_cpu_domain(&buffers[current_set][buf_idx], 0);
 
-		cpucpy2d(buffers[current_set][buf_idx].data,
+		cpucpy2d(buffers[current_set][buf_idx].ptr,
 			 buffers[current_set][buf_idx].surface[0].stride / sizeof(uint32_t),
 			 x, y,
 			 tmp_tile, options.tile_size, 0, 0,
@@ -476,61 +485,59 @@ static void fan_in_and_check(void)
 	}
 }
 
-static void sanitize_stride(struct igt_buf *buf)
+static void sanitize_stride(struct intel_buf *buf)
 {
 
-	if (igt_buf_height(buf) > options.max_dimension)
+	if (intel_buf_height(buf) > options.max_dimension)
 		buf->surface[0].stride = buf->surface[0].size / options.max_dimension;
 
-	if (igt_buf_height(buf) < options.tile_size)
+	if (intel_buf_height(buf) < options.tile_size)
 		buf->surface[0].stride = buf->surface[0].size / options.tile_size;
 
-	if (igt_buf_width(buf) < options.tile_size)
+	if (intel_buf_width(buf) < options.tile_size)
 		buf->surface[0].stride = options.tile_size * sizeof(uint32_t);
 
 	igt_assert(buf->surface[0].stride <= 8192);
-	igt_assert(igt_buf_width(buf) <= options.max_dimension);
-	igt_assert(igt_buf_height(buf) <= options.max_dimension);
+	igt_assert(intel_buf_width(buf) <= options.max_dimension);
+	igt_assert(intel_buf_height(buf) <= options.max_dimension);
 
-	igt_assert(igt_buf_width(buf) >= options.tile_size);
-	igt_assert(igt_buf_height(buf) >= options.tile_size);
+	igt_assert(intel_buf_width(buf) >= options.tile_size);
+	igt_assert(intel_buf_height(buf) >= options.tile_size);
 
 }
 
-static void init_buffer(struct igt_buf *buf, unsigned size)
+static void init_buffer(struct intel_buf *buf, unsigned size)
 {
-	memset(buf, 0, sizeof(*buf));
+	uint32_t stride, width, height, bpp;
+
+	stride = 4096;
+	bpp = 32;
+	width = stride / (bpp / 8);
+	height = size / stride;
 
-	buf->bo = drm_intel_bo_alloc(bufmgr, "tiled bo", size, 4096);
-	buf->surface[0].size = size;
-	igt_assert(buf->bo);
-	buf->tiling = I915_TILING_NONE;
-	buf->surface[0].stride = 4096;
-	buf->bpp = 32;
+	intel_buf_init(bops, buf, width, height, bpp, 0,
+		       I915_TILING_NONE, I915_COMPRESSION_NONE);
 
 	sanitize_stride(buf);
 
 	if (options.no_hw)
-		buf->data = malloc(size);
+		buf->ptr = malloc(size);
 	else {
 		if (options.use_cpu_maps)
-			drm_intel_bo_map(buf->bo, 1);
+			intel_buf_cpu_map(buf, 1);
 		else
-			drm_intel_gem_bo_map_gtt(buf->bo);
-		buf->data = buf->bo->virtual;
+			intel_buf_device_map(buf, 1);
 	}
-
-	buf->num_tiles = options.tiles_per_buf;
 }
 
 static void exchange_buf(void *array, unsigned i, unsigned j)
 {
-	struct igt_buf *buf_arr, tmp;
+	struct intel_buf *buf_arr, tmp;
 	buf_arr = array;
 
-	memcpy(&tmp, &buf_arr[i], sizeof(struct igt_buf));
-	memcpy(&buf_arr[i], &buf_arr[j], sizeof(struct igt_buf));
-	memcpy(&buf_arr[j], &tmp, sizeof(struct igt_buf));
+	memcpy(&tmp, &buf_arr[i], sizeof(struct intel_buf));
+	memcpy(&buf_arr[i], &buf_arr[j], sizeof(struct intel_buf));
+	memcpy(&buf_arr[j], &tmp, sizeof(struct intel_buf));
 }
 
 
@@ -577,7 +584,7 @@ static void init_set(unsigned set)
 
 		sanitize_stride(&buffers[set][i]);
 
-		gem_set_tiling(drm_fd, buffers[set][i].bo->handle,
+		gem_set_tiling(drm_fd, buffers[set][i].handle,
 			       buffers[set][i].tiling,
 			       buffers[set][i].surface[0].stride);
 
@@ -598,8 +605,9 @@ static void copy_tiles(unsigned *permutation)
 {
 	unsigned src_tile, src_buf_idx, src_x, src_y;
 	unsigned dst_tile, dst_buf_idx, dst_x, dst_y;
-	struct igt_buf *src_buf, *dst_buf;
+	struct intel_buf *src_buf, *dst_buf;
 	int i, idx;
+
 	for (i = 0; i < num_total_tiles; i++) {
 		/* tile_permutation is independent of current_permutation, so
 		 * abuse it to randomize the order of the src bos */
@@ -620,10 +628,10 @@ static void copy_tiles(unsigned *permutation)
 			igt_info("copying tile %i from %i (%i, %i) to %i (%i, %i)", i, tile_permutation[i], src_buf_idx, src_tile, permutation[idx], dst_buf_idx, dst_tile);
 
 		if (options.no_hw) {
-			cpucpy2d(src_buf->data,
+			cpucpy2d(src_buf->ptr,
 				 src_buf->surface[0].stride / sizeof(uint32_t),
 				 src_x, src_y,
-				 dst_buf->data,
+				 dst_buf->ptr,
 				 dst_buf->surface[0].stride / sizeof(uint32_t),
 				 dst_x, dst_y,
 				 i);
@@ -635,7 +643,7 @@ static void copy_tiles(unsigned *permutation)
 		}
 	}
 
-	intel_batchbuffer_flush(batch);
+	intel_bb_flush_blit(ibb);
 }
 
 static void sanitize_tiles_per_buf(void)
@@ -757,6 +765,7 @@ static void init(void)
 {
 	int i;
 	unsigned tmp;
+	uint32_t stride, width, height, bpp;
 
 	if (options.num_buffers == 0) {
 		tmp = gem_aperture_size(drm_fd);
@@ -767,22 +776,25 @@ static void init(void)
 	} else
 		num_buffers = options.num_buffers;
 
-	bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
-	drm_intel_bufmgr_gem_enable_reuse(bufmgr);
-	drm_intel_bufmgr_gem_enable_fenced_relocs(bufmgr);
 	num_fences = gem_available_fences(drm_fd);
 	igt_assert_lt(4, num_fences);
-	batch = intel_batchbuffer_alloc(bufmgr, devid);
 
-	busy_bo = drm_intel_bo_alloc(bufmgr, "tiled bo", BUSY_BUF_SIZE, 4096);
-	if (options.forced_tiling >= 0)
-		gem_set_tiling(drm_fd, busy_bo->handle, options.forced_tiling, 4096);
+	bops = buf_ops_create(drm_fd);
+	ibb = intel_bb_create(drm_fd, 4096);
+
+	stride = 4096;
+	bpp = 32;
+	width = stride / (bpp / 8);
+	height = BUSY_BUF_SIZE / stride;
+	intel_buf_init(bops, &busy_bo,
+		       width, height, bpp, 0, options.forced_tiling,
+		       I915_COMPRESSION_NONE);
 
 	for (i = 0; i < num_buffers; i++) {
 		init_buffer(&buffers[0][i], options.scratch_buf_size);
 		init_buffer(&buffers[1][i], options.scratch_buf_size);
 
-		num_total_tiles += buffers[0][i].num_tiles;
+		num_total_tiles += options.tiles_per_buf;
 	}
 	current_set = 0;
 
@@ -792,7 +804,7 @@ static void init(void)
 
 static void check_render_copyfunc(void)
 {
-	struct igt_buf src, dst;
+	struct intel_buf src, dst;
 	uint32_t *ptr;
 	int i, j, pass;
 
@@ -803,17 +815,18 @@ static void check_render_copyfunc(void)
 	init_buffer(&dst, options.scratch_buf_size);
 
 	for (pass = 0; pass < 16; pass++) {
-		int sx = random() % (igt_buf_width(&src)-options.tile_size);
-		int sy = random() % (igt_buf_height(&src)-options.tile_size);
-		int dx = random() % (igt_buf_width(&dst)-options.tile_size);
-		int dy = random() % (igt_buf_height(&dst)-options.tile_size);
+		int sx = random() % (intel_buf_width(&src)-options.tile_size);
+		int sy = random() % (intel_buf_height(&src)-options.tile_size);
+		int dx = random() % (intel_buf_width(&dst)-options.tile_size);
+		int dy = random() % (intel_buf_height(&dst)-options.tile_size);
 
 		if (options.use_cpu_maps)
 			set_to_cpu_domain(&src, 1);
 
-		memset(src.data, 0xff, options.scratch_buf_size);
+		memset(src.ptr, 0xff, options.scratch_buf_size);
 		for (j = 0; j < options.tile_size; j++) {
-			ptr = (uint32_t*)((char *)src.data + sx*4 + (sy+j) * src.surface[0].stride);
+			ptr = (uint32_t*)((char *)src.ptr + sx*4 +
+					  (sy+j) * src.surface[0].stride);
 			for (i = 0; i < options.tile_size; i++)
 				ptr[i] = j * options.tile_size + i;
 		}
@@ -824,7 +837,8 @@ static void check_render_copyfunc(void)
 			set_to_cpu_domain(&dst, 0);
 
 		for (j = 0; j < options.tile_size; j++) {
-			ptr = (uint32_t*)((char *)dst.data + dx*4 + (dy+j) * dst.surface[0].stride);
+			ptr = (uint32_t*)((char *)dst.ptr + dx*4 +
+					  (dy+j) * dst.surface[0].stride);
 			for (i = 0; i < options.tile_size; i++)
 				if (ptr[i] != j * options.tile_size + i) {
 					igt_info("render copyfunc mismatch at (%d, %d): found %d, expected %d\n", i, j, ptr[i], j * options.tile_size + i);
@@ -909,8 +923,8 @@ igt_simple_main_args("ds:g:c:t:rbuxmo:fp:",
 
 	igt_info("num failed tiles %u, max incoherent bytes %zd\n", stats.num_failed, stats.max_failed_reads * sizeof(uint32_t));
 
-	intel_batchbuffer_free(batch);
-	drm_intel_bufmgr_destroy(bufmgr);
+	intel_bb_destroy(ibb);
+	buf_ops_destroy(bops);
 
 	close(drm_fd);
 
diff --git a/tests/kms_big_fb.c b/tests/kms_big_fb.c
index a754b299..a9907ea3 100644
--- a/tests/kms_big_fb.c
+++ b/tests/kms_big_fb.c
@@ -46,28 +46,36 @@ typedef struct {
 	int big_fb_width, big_fb_height;
 	uint64_t ram_size, aper_size, mappable_size;
 	igt_render_copyfunc_t render_copy;
-	drm_intel_bufmgr *bufmgr;
-	struct intel_batchbuffer *batch;
+	struct buf_ops *bops;
+	struct intel_bb *ibb;
 } data_t;
 
 static void init_buf(data_t *data,
-		     struct igt_buf *buf,
+		     struct intel_buf *buf,
 		     const struct igt_fb *fb,
-		     const char *name)
+		     const char *buf_name)
 {
+	uint32_t name, handle, tiling, stride, width, height, bpp, size;
+
 	igt_assert_eq(fb->offsets[0], 0);
 
-	buf->bo = gem_handle_to_libdrm_bo(data->bufmgr, data->drm_fd,
-					  name, fb->gem_handle);
-	buf->tiling = igt_fb_mod_to_tiling(fb->modifier);
-	buf->surface[0].stride = fb->strides[0];
-	buf->bpp = fb->plane_bpp[0];
-	buf->surface[0].size = fb->size;
+	tiling = igt_fb_mod_to_tiling(fb->modifier);
+	stride = fb->strides[0];
+	bpp = fb->plane_bpp[0];
+	size = fb->size;
+	width = stride / (bpp / 8);
+	height = size / stride;
+
+	name = gem_flink(data->drm_fd, fb->gem_handle);
+	handle = gem_open(data->drm_fd, name);
+	intel_buf_init_using_handle(data->bops, handle, buf, width, height, bpp,
+				    0, tiling, 0);
+	intel_buf_set_name(buf, buf_name);
 }
 
-static void fini_buf(struct igt_buf *buf)
+static void fini_buf(struct intel_buf *buf)
 {
-	drm_intel_bo_unreference(buf->bo);
+	intel_buf_close(buf->bops, buf);
 }
 
 static void copy_pattern(data_t *data,
@@ -75,7 +83,7 @@ static void copy_pattern(data_t *data,
 			 struct igt_fb *src_fb, int sx, int sy,
 			 int w, int h)
 {
-	struct igt_buf src = {}, dst = {};
+	struct intel_buf src = {}, dst = {};
 
 	init_buf(data, &src, src_fb, "big fb src");
 	init_buf(data, &dst, dst_fb, "big fb dst");
@@ -91,7 +99,7 @@ static void copy_pattern(data_t *data,
 	 * rendered with the blitter/render engine.
 	 */
 	if (data->render_copy) {
-		data->render_copy(data->batch, NULL, &src, sx, sy, w, h, &dst, dx, dy);
+		data->render_copy(data->ibb, 0, &src, sx, sy, w, h, &dst, dx, dy);
 	} else {
 		w = min(w, src_fb->width - sx);
 		w = min(w, dst_fb->width - dx);
@@ -99,14 +107,16 @@ static void copy_pattern(data_t *data,
 		h = min(h, src_fb->height - sy);
 		h = min(h, dst_fb->height - dy);
 
-		intel_blt_copy(data->batch, src.bo, sx, sy,
-			       src.surface[0].stride,
-			       dst.bo, dx, dy, dst.surface[0].stride, w, h,
-			       dst.bpp);
+		intel_bb_blt_copy(data->ibb, &src, sx, sy, src.surface[0].stride,
+				  &dst, dx, dy, dst.surface[0].stride, w, h, dst.bpp);
 	}
 
 	fini_buf(&dst);
 	fini_buf(&src);
+
+	/* intel_bb cache doesn't know when objects disappear, so
+	 * let's purge the cache */
+	intel_bb_reset(data->ibb, true);
 }
 
 static void generate_pattern(data_t *data,
@@ -648,8 +658,8 @@ igt_main
 		if (intel_gen(data.devid) >= 4)
 			data.render_copy = igt_get_render_copyfunc(data.devid);
 
-		data.bufmgr = drm_intel_bufmgr_gem_init(data.drm_fd, 4096);
-		data.batch = intel_batchbuffer_alloc(data.bufmgr, data.devid);
+		data.bops = buf_ops_create(data.drm_fd);
+		data.ibb = intel_bb_create(data.drm_fd, 4096);
 	}
 
 	/*
@@ -708,7 +718,7 @@ igt_main
 	igt_fixture {
 		igt_display_fini(&data.display);
 
-		intel_batchbuffer_free(data.batch);
-		drm_intel_bufmgr_destroy(data.bufmgr);
+		intel_bb_destroy(data.ibb);
+		buf_ops_destroy(data.bops);
 	}
 }
diff --git a/tests/kms_cursor_crc.c b/tests/kms_cursor_crc.c
index e9491847..be37a75d 100644
--- a/tests/kms_cursor_crc.c
+++ b/tests/kms_cursor_crc.c
@@ -64,11 +64,11 @@ typedef struct {
 	igt_plane_t *cursor;
 	cairo_surface_t *surface;
 	uint32_t devid;
-	drm_intel_bufmgr *bufmgr;
+	struct buf_ops *bops;
 	igt_render_copyfunc_t rendercopy;
-	drm_intel_bo * drmibo[2];
-	struct intel_batchbuffer *batch;
-	struct igt_buf igtbo[2];
+	uint32_t drm_handle[2];
+	struct intel_bb *ibb;
+	struct intel_buf igtbo[2];
 
 } data_t;
 
@@ -155,7 +155,7 @@ static void restore_image(data_t *data)
 
 	if (data->rendercopy != NULL) {
 		/* use rendercopy if available */
-		data->rendercopy(data->batch, NULL,
+		data->rendercopy(data->ibb, 0,
 				 &data->igtbo[RESTOREBUFFER], 0, 0,
 				 data->primary_fb[RESTOREBUFFER].width,
 				 data->primary_fb[RESTOREBUFFER].height,
@@ -381,16 +381,25 @@ static void cleanup_crtc(data_t *data)
 	igt_remove_fb(data->drm_fd, &data->primary_fb[FRONTBUFFER]);
 	igt_remove_fb(data->drm_fd, &data->primary_fb[RESTOREBUFFER]);
 
+	intel_bb_destroy(data->ibb);
+
 	igt_display_reset(display);
 }
 
 static void scratch_buf_init(data_t *data, int buffer)
 {
-	data->igtbo[buffer].bo = data->drmibo[buffer];
-	data->igtbo[buffer].surface[0].stride = data->primary_fb[buffer].strides[0];
-	data->igtbo[buffer].tiling = data->primary_fb[buffer].modifier;
-	data->igtbo[buffer].surface[0].size = data->primary_fb[buffer].size;
-	data->igtbo[buffer].bpp = data->primary_fb[buffer].plane_bpp[0];
+	uint32_t handle, tiling, stride, width, height, bpp, size;
+
+	handle = data->drm_handle[buffer];
+	size = data->primary_fb[buffer].size;
+	tiling = data->primary_fb[buffer].modifier;
+	stride = data->primary_fb[buffer].strides[0];
+	bpp = data->primary_fb[buffer].plane_bpp[0];
+	width = stride / (bpp / 8);
+	height = size / stride;
+
+	intel_buf_init_using_handle(data->bops, handle, &data->igtbo[buffer],
+				    width, height, bpp, 0, tiling, 0);
 }
 
 static void prepare_crtc(data_t *data, igt_output_t *output,
@@ -450,6 +459,8 @@ static void prepare_crtc(data_t *data, igt_output_t *output,
 		igt_paint_test_pattern(cr, data->screenw, data->screenh);
 		cairo_destroy(cr);
 	} else {
+		uint32_t name;
+
 		/* store test image as fb if rendercopy is available */
 		cr = igt_get_cairo_ctx(data->drm_fd,
 		                       &data->primary_fb[RESTOREBUFFER]);
@@ -457,22 +468,20 @@ static void prepare_crtc(data_t *data, igt_output_t *output,
 		igt_paint_test_pattern(cr, data->screenw, data->screenh);
 		igt_put_cairo_ctx(cr);
 
-		data->drmibo[FRONTBUFFER] = gem_handle_to_libdrm_bo(data->bufmgr,
-								    data->drm_fd,
-								    "", data->primary_fb[FRONTBUFFER].gem_handle);
-		igt_assert(data->drmibo[FRONTBUFFER]);
+		name = gem_flink(data->drm_fd,
+				 data->primary_fb[FRONTBUFFER].gem_handle);
+		data->drm_handle[FRONTBUFFER] = gem_open(data->drm_fd, name);
+		igt_assert(data->drm_handle[FRONTBUFFER]);
 
-		data->drmibo[RESTOREBUFFER] = gem_handle_to_libdrm_bo(data->bufmgr,
-								      data->drm_fd,
-								      "", data->primary_fb[RESTOREBUFFER].gem_handle);
-		igt_assert(data->drmibo[RESTOREBUFFER]);
+		name = gem_flink(data->drm_fd,
+				 data->primary_fb[RESTOREBUFFER].gem_handle);
+		data->drm_handle[RESTOREBUFFER] = gem_open(data->drm_fd, name);
+		igt_assert(data->drm_handle[RESTOREBUFFER]);
 
 		scratch_buf_init(data, RESTOREBUFFER);
 		scratch_buf_init(data, FRONTBUFFER);
 
-		data->batch = intel_batchbuffer_alloc(data->bufmgr,
-						      data->devid);
-		igt_assert(data->batch);
+		data->ibb = intel_bb_create(data->drm_fd, 4096);
 	}
 
 	igt_pipe_crc_start(data->pipe_crc);
@@ -832,10 +841,7 @@ igt_main
 		igt_display_require(&data.display, data.drm_fd);
 
 		if (is_i915_device(data.drm_fd)) {
-			data.bufmgr = drm_intel_bufmgr_gem_init(data.drm_fd, 4096);
-			igt_assert(data.bufmgr);
-			drm_intel_bufmgr_gem_enable_reuse(data.bufmgr);
-
+			data.bops = buf_ops_create(data.drm_fd);
 			data.devid = intel_get_drm_devid(data.drm_fd);
 			data.rendercopy = igt_get_render_copyfunc(data.devid);
 		}
@@ -854,12 +860,7 @@ igt_main
 			igt_pipe_crc_stop(data.pipe_crc);
 			igt_pipe_crc_free(data.pipe_crc);
 		}
-
-		if (data.bufmgr != NULL) {
-			intel_batchbuffer_free(data.batch);
-			drm_intel_bufmgr_destroy(data.bufmgr);
-		}
-
+		buf_ops_destroy(data.bops);
 		igt_display_fini(&data.display);
 	}
 }
-- 
2.26.0
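The width/height recovery that the converted init_buf performs from fb byte metadata (the same computation appears in kms_big_fb.c and kms_cursor_crc.c above) can be sketched as a minimal standalone helper; the function name is illustrative, not part of the IGT API:

```c
#include <assert.h>
#include <stdint.h>

/* Recover pixel geometry from fb byte metadata, mirroring what the
 * converted init_buf does: width in pixels from the byte stride,
 * height in rows from the total size.  Assumes bpp is a multiple of 8
 * and size is a multiple of stride, as holds for the fbs these tests
 * allocate. */
static void fb_geometry(uint32_t stride, uint32_t size, uint32_t bpp,
			uint32_t *width, uint32_t *height)
{
	*width = stride / (bpp / 8);
	*height = size / stride;
}
```

This is needed because intel_buf_init_using_handle() takes pixel dimensions, while struct igt_fb only carries byte strides and a total size.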

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [igt-dev] [PATCH i-g-t 10/11] tools/intel_residency: adopt intel_residency to use bufops
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
                   ` (8 preceding siblings ...)
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 09/11] tests/gem|kms: remove libdrm dependency (batch 2) Zbigniew Kempczyński
@ 2020-07-23 19:22 ` Zbigniew Kempczyński
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 11/11] tests/perf: remove libdrm dependency for rendercopy Zbigniew Kempczyński
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 15+ messages in thread
From: Zbigniew Kempczyński @ 2020-07-23 19:22 UTC (permalink / raw)
  To: igt-dev; +Cc: Chris Wilson

IGT draw functions no longer depend on libdrm, so migrate this tool
to the new API.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 tools/intel_residency.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/tools/intel_residency.c b/tools/intel_residency.c
index bfab40da..736fca0d 100644
--- a/tools/intel_residency.c
+++ b/tools/intel_residency.c
@@ -82,7 +82,7 @@ struct {
 	int fd;
 	drmModeResPtr res;
 	drmModeConnectorPtr connectors[MAX_CONNECTORS];
-	drm_intel_bufmgr *bufmgr;
+	struct buf_ops *bops;
 } drm;
 
 struct {
@@ -191,16 +191,14 @@ static void setup_drm(void)
 		drm.connectors[i] = drmModeGetConnector(drm.fd,
 						drm.res->connectors[i]);
 
-	drm.bufmgr = drm_intel_bufmgr_gem_init(drm.fd, 4096);
-	igt_assert(drm.bufmgr);
-	drm_intel_bufmgr_gem_enable_reuse(drm.bufmgr);
+	drm.bops = buf_ops_create(drm.fd);
 }
 
 static void teardown_drm(void)
 {
 	int i;
 
-	drm_intel_bufmgr_destroy(drm.bufmgr);
+	buf_ops_destroy(drm.bops);
 
 	for (i = 0; i < drm.res->count_connectors; i++)
 		drmModeFreeConnector(drm.connectors[i]);
@@ -238,7 +236,7 @@ static void draw_rect(struct igt_fb *fb, enum igt_draw_method method,
 		igt_assert(false);
 	}
 
-	igt_draw_rect_fb(drm.fd, drm.bufmgr, NULL, fb, method, clip.x1, clip.y1,
+	igt_draw_rect_fb(drm.fd, drm.bops, 0, fb, method, clip.x1, clip.y1,
 			 clip.x2 - clip.x1, clip.y2 - clip.y1, color);
 
 	if (method == IGT_DRAW_MMAP_WC) {
-- 
2.26.0



* [igt-dev] [PATCH i-g-t 11/11] tests/perf: remove libdrm dependency for rendercopy
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
                   ` (9 preceding siblings ...)
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 10/11] tools/intel_residency: adopt intel_residency to use bufops Zbigniew Kempczyński
@ 2020-07-23 19:22 ` Zbigniew Kempczyński
  2020-07-23 19:30 ` [igt-dev] ✗ GitLab.Pipeline: warning for Remove libdrm in rendercopy Patchwork
  2020-07-23 19:50 ` [igt-dev] ✗ Fi.CI.BAT: failure " Patchwork
  12 siblings, 0 replies; 15+ messages in thread
From: Zbigniew Kempczyński @ 2020-07-23 19:22 UTC (permalink / raw)
  To: igt-dev; +Cc: Chris Wilson

Rendercopy now uses the no-libdrm version, so all users have to
migrate to the new interface.

Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/i915/perf.c | 663 ++++++++++++++++++++--------------------------
 1 file changed, 281 insertions(+), 382 deletions(-)
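The 0x80 canary pattern this patch keeps around MI_REPORT_PERF_COUNT can be sketched in isolation (the helper name is illustrative, not IGT API): the destination buffer is pre-filled with 0x80 bytes, so any dword still reading 0x80808080 after submission was never written by the GPU, which bounds the report size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A dword that still holds the pre-fill pattern was not written by the
 * GPU; the tests use this to check where a perf report ends. */
static int dword_untouched(const uint32_t *report, size_t idx)
{
	return report[idx] == 0x80808080u;
}
```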

diff --git a/tests/i915/perf.c b/tests/i915/perf.c
index 92edc9f1..fd2b4073 100644
--- a/tests/i915/perf.c
+++ b/tests/i915/perf.c
@@ -497,64 +497,42 @@ oa_report_get_ctx_id(uint32_t *report)
 }
 
 static void
-scratch_buf_memset(drm_intel_bo *bo, int width, int height, uint32_t color)
+scratch_buf_memset(struct intel_buf *buf, int width, int height, uint32_t color)
 {
-	int ret;
-
-	ret = drm_intel_bo_map(bo, true /* writable */);
-	igt_assert_eq(ret, 0);
+	intel_buf_cpu_map(buf, true);
 
 	for (int i = 0; i < width * height; i++)
-		((uint32_t *)bo->virtual)[i] = color;
+		buf->ptr[i] = color;
 
-	drm_intel_bo_unmap(bo);
+	intel_buf_unmap(buf);
 }
 
 static void
-scratch_buf_init(drm_intel_bufmgr *bufmgr,
-		 struct igt_buf *buf,
+scratch_buf_init(struct buf_ops *bops,
+		 struct intel_buf *buf,
 		 int width, int height,
 		 uint32_t color)
 {
-	size_t stride = width * 4;
-	size_t size = stride * height;
-	drm_intel_bo *bo = drm_intel_bo_alloc(bufmgr, "", size, 4096);
-
-	scratch_buf_memset(bo, width, height, color);
-
-	memset(buf, 0, sizeof(*buf));
-
-	buf->bo = bo;
-	buf->surface[0].stride = stride;
-	buf->tiling = I915_TILING_NONE;
-	buf->surface[0].size = size;
-	buf->bpp = 32;
+	intel_buf_init(bops, buf, width, height, 32, 0,
+		       I915_TILING_NONE, I915_COMPRESSION_NONE);
+	scratch_buf_memset(buf, width, height, color);
 }
 
 static void
-emit_report_perf_count(struct intel_batchbuffer *batch,
-		       drm_intel_bo *dst_bo,
+emit_report_perf_count(struct intel_bb *ibb,
+		       struct intel_buf *dst,
 		       int dst_offset,
 		       uint32_t report_id)
 {
-	if (IS_HASWELL(devid)) {
-		BEGIN_BATCH(3, 1);
-		OUT_BATCH(GEN6_MI_REPORT_PERF_COUNT);
-		OUT_RELOC(dst_bo, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
-			  dst_offset);
-		OUT_BATCH(report_id);
-		ADVANCE_BATCH();
-	} else {
-		/* XXX: NB: n dwords arg is actually magic since it internally
-		 * automatically accounts for larger addresses on gen >= 8...
-		 */
-		BEGIN_BATCH(3, 1);
-		OUT_BATCH(GEN8_MI_REPORT_PERF_COUNT);
-		OUT_RELOC(dst_bo, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
-			  dst_offset);
-		OUT_BATCH(report_id);
-		ADVANCE_BATCH();
-	}
+	if (IS_HASWELL(devid))
+		intel_bb_out(ibb, GEN6_MI_REPORT_PERF_COUNT);
+	else
+		intel_bb_out(ibb, GEN8_MI_REPORT_PERF_COUNT);
+
+	intel_bb_emit_reloc(ibb, dst->handle,
+			    I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+			    dst_offset, dst->addr.offset);
+	intel_bb_out(ibb, report_id);
 }
 
 static void
@@ -1495,14 +1473,13 @@ enum load {
 
 static struct load_helper {
 	int devid;
-	drm_intel_bufmgr *bufmgr;
-	drm_intel_context *context;
+	struct buf_ops *bops;
 	uint32_t context_id;
-	struct intel_batchbuffer *batch;
+	struct intel_bb *ibb;
 	enum load load;
 	bool exit;
 	struct igt_helper_process igt_proc;
-	struct igt_buf src, dst;
+	struct intel_buf src, dst;
 } lh = { 0, };
 
 static void load_helper_signal_handler(int sig)
@@ -1524,6 +1501,14 @@ static void load_helper_set_load(enum load load)
 	kill(lh.igt_proc.pid, SIGUSR2);
 }
 
+static void set_to_gtt_domain(struct intel_buf *buf, int writing)
+{
+	int i915 = buf_ops_get_fd(buf->bops);
+
+	gem_set_domain(i915, buf->handle, I915_GEM_DOMAIN_GTT,
+		       writing ? I915_GEM_DOMAIN_GTT : 0);
+}
+
 static void load_helper_run(enum load load)
 {
 	/*
@@ -1542,21 +1527,12 @@ static void load_helper_run(enum load load)
 		signal(SIGUSR2, load_helper_signal_handler);
 
 		while (!lh.exit) {
-			int ret;
-
-			render_copy(lh.batch,
-				    lh.context,
+			render_copy(lh.ibb,
+				    lh.context_id,
 				    &lh.src, 0, 0, 1920, 1080,
 				    &lh.dst, 0, 0);
 
-			intel_batchbuffer_flush_with_context(lh.batch,
-							     lh.context);
-
-			ret = drm_intel_gem_context_get_id(lh.context,
-							   &lh.context_id);
-			igt_assert_eq(ret, 0);
-
-			drm_intel_bo_wait_rendering(lh.dst.bo);
+			set_to_gtt_domain(&lh.dst, true);
 
 			/* Lower the load by pausing after every submitted
 			 * write. */
@@ -1574,52 +1550,36 @@ static void load_helper_stop(void)
 
 static void load_helper_init(void)
 {
-	int ret;
-
 	lh.devid = intel_get_drm_devid(drm_fd);
 
 	/* MI_STORE_DATA can only use GTT address on gen4+/g33 and needs
 	 * snoopable mem on pre-gen6. Hence load-helper only works on gen6+, but
 	 * that's also all we care about for the rps testcase*/
 	igt_assert(intel_gen(lh.devid) >= 6);
-	lh.bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
-	igt_assert(lh.bufmgr);
-
-	drm_intel_bufmgr_gem_enable_reuse(lh.bufmgr);
 
-	lh.context = drm_intel_gem_context_create(lh.bufmgr);
-	igt_assert(lh.context);
+	lh.bops = buf_ops_create(drm_fd);
 
-	lh.context_id = 0xffffffff;
-	ret = drm_intel_gem_context_get_id(lh.context, &lh.context_id);
-	igt_assert_eq(ret, 0);
+	lh.context_id = gem_context_create(drm_fd);
 	igt_assert_neq(lh.context_id, 0xffffffff);
 
-	lh.batch = intel_batchbuffer_alloc(lh.bufmgr, lh.devid);
-	igt_assert(lh.batch);
+	lh.ibb = intel_bb_create(drm_fd, BATCH_SZ);
 
-	scratch_buf_init(lh.bufmgr, &lh.dst, 1920, 1080, 0);
-	scratch_buf_init(lh.bufmgr, &lh.src, 1920, 1080, 0);
+	scratch_buf_init(lh.bops, &lh.dst, 1920, 1080, 0);
+	scratch_buf_init(lh.bops, &lh.src, 1920, 1080, 0);
 }
 
 static void load_helper_fini(void)
 {
+	int i915 = buf_ops_get_fd(lh.bops);
+
 	if (lh.igt_proc.running)
 		load_helper_stop();
 
-	if (lh.src.bo)
-		drm_intel_bo_unreference(lh.src.bo);
-	if (lh.dst.bo)
-		drm_intel_bo_unreference(lh.dst.bo);
-
-	if (lh.batch)
-		intel_batchbuffer_free(lh.batch);
-
-	if (lh.context)
-		drm_intel_gem_context_destroy(lh.context);
-
-	if (lh.bufmgr)
-		drm_intel_bufmgr_destroy(lh.bufmgr);
+	intel_buf_close(lh.bops, &lh.src);
+	intel_buf_close(lh.bops, &lh.dst);
+	intel_bb_destroy(lh.ibb);
+	gem_context_destroy(i915, lh.context_id);
+	buf_ops_destroy(lh.bops);
 }
 
 static bool expected_report_timing_delta(uint32_t delta, uint32_t expected_delta)
@@ -1888,20 +1848,11 @@ test_per_context_mode_unprivileged(void)
 	write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
 
 	igt_fork(child, 1) {
-		drm_intel_context *context;
-		drm_intel_bufmgr *bufmgr;
 		uint32_t ctx_id = 0xffffffff; /* invalid id */
-		int ret;
 
 		igt_drop_root();
 
-		bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
-		context = drm_intel_gem_context_create(bufmgr);
-
-		igt_assert(context);
-
-		ret = drm_intel_gem_context_get_id(context, &ctx_id);
-		igt_assert_eq(ret, 0);
+		ctx_id = gem_context_create(drm_fd);
 		igt_assert_neq(ctx_id, 0xffffffff);
 
 		properties[1] = ctx_id;
@@ -1909,8 +1860,7 @@ test_per_context_mode_unprivileged(void)
 		stream_fd = __perf_open(drm_fd, &param, false);
 		__perf_close(stream_fd);
 
-		drm_intel_gem_context_destroy(context);
-		drm_intel_bufmgr_destroy(bufmgr);
+		gem_context_destroy(drm_fd, ctx_id);
 	}
 
 	igt_waitchildren();
@@ -2936,55 +2886,44 @@ gen12_test_mi_rpc(void)
 		.num_properties = ARRAY_SIZE(properties) / 2,
 		.properties_ptr = to_user_pointer(properties),
 	};
-	drm_intel_bo *bo;
-	drm_intel_bufmgr *bufmgr;
-	drm_intel_context *context;
-	struct intel_batchbuffer *batch;
+	struct buf_ops *bops;
+	struct intel_bb *ibb;
+	struct intel_buf *buf;
 #define INVALID_CTX_ID 0xffffffff
 	uint32_t ctx_id = INVALID_CTX_ID;
 	uint32_t *report32;
-	int ret;
 	size_t format_size_32;
 	struct oa_format format = get_oa_format(test_set->perf_oa_format);
 
 	/* Ensure perf_stream_paranoid is set to 1 by default */
 	write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
 
-	bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
-	igt_assert(bufmgr);
-
-	drm_intel_bufmgr_gem_enable_reuse(bufmgr);
-
-	context = drm_intel_gem_context_create(bufmgr);
-	igt_assert(context);
-
-	ret = drm_intel_gem_context_get_id(context, &ctx_id);
-	igt_assert_eq(ret, 0);
+	bops = buf_ops_create(drm_fd);
+	ctx_id = gem_context_create(drm_fd);
 	igt_assert_neq(ctx_id, INVALID_CTX_ID);
 	properties[1] = ctx_id;
 
-	batch = intel_batchbuffer_alloc(bufmgr, devid);
-	bo = drm_intel_bo_alloc(bufmgr, "mi_rpc dest bo", 4096, 64);
+	ibb = intel_bb_create(drm_fd, BATCH_SZ);
+	buf = intel_buf_create(bops, 4096, 1, 8, 64,
+			       I915_TILING_NONE, I915_COMPRESSION_NONE);
 
-	ret = drm_intel_bo_map(bo, true);
-	igt_assert_eq(ret, 0);
-	memset(bo->virtual, 0x80, 4096);
-	drm_intel_bo_unmap(bo);
+	intel_buf_cpu_map(buf, true);
+	memset(buf->ptr, 0x80, 4096);
+	intel_buf_unmap(buf);
 
 	stream_fd = __perf_open(drm_fd, &param, false);
 
 #define REPORT_ID 0xdeadbeef
 #define REPORT_OFFSET 0
-	emit_report_perf_count(batch,
-			       bo,
+	emit_report_perf_count(ibb,
+			       buf,
 			       REPORT_OFFSET,
 			       REPORT_ID);
-	intel_batchbuffer_flush_with_context(batch, context);
+	intel_bb_flush_render_with_context(ibb, ctx_id);
+	intel_bb_sync(ibb);
 
-	ret = drm_intel_bo_map(bo, false);
-	igt_assert_eq(ret, 0);
-
-	report32 = bo->virtual;
+	intel_buf_cpu_map(buf, false);
+	report32 = buf->ptr;
 	format_size_32 = format.size >> 2;
 	dump_report(report32, format_size_32, "mi-rpc");
 
@@ -3006,11 +2945,11 @@ gen12_test_mi_rpc(void)
 	igt_assert_neq(report32[format.b_off >> 2], 0x80808080);
 	igt_assert_eq(report32[format_size_32], 0x80808080);
 
-	drm_intel_bo_unmap(bo);
-	drm_intel_bo_unreference(bo);
-	intel_batchbuffer_free(batch);
-	drm_intel_gem_context_destroy(context);
-	drm_intel_bufmgr_destroy(bufmgr);
+	intel_buf_unmap(buf);
+	intel_buf_destroy(buf);
+	intel_bb_destroy(ibb);
+	gem_context_destroy(drm_fd, ctx_id);
+	buf_ops_destroy(bops);
 	__perf_close(stream_fd);
 }
 
@@ -3034,41 +2973,33 @@ test_mi_rpc(void)
 		.num_properties = sizeof(properties) / 16,
 		.properties_ptr = to_user_pointer(properties),
 	};
-	drm_intel_bufmgr *bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
-	drm_intel_context *context;
-	struct intel_batchbuffer *batch;
-	drm_intel_bo *bo;
-	uint32_t *report32;
-	int ret;
+	struct buf_ops *bops = buf_ops_create(drm_fd);
+	struct intel_bb *ibb;
+	struct intel_buf *buf;
+	uint32_t *report32, ctx_id;
 
 	stream_fd = __perf_open(drm_fd, &param, false);
 
-	drm_intel_bufmgr_gem_enable_reuse(bufmgr);
-
-	context = drm_intel_gem_context_create(bufmgr);
-	igt_assert(context);
+	ctx_id = gem_context_create(drm_fd);
 
-	batch = intel_batchbuffer_alloc(bufmgr, devid);
+	ibb = intel_bb_create(drm_fd, BATCH_SZ);
+	buf = intel_buf_create(bops, 4096, 1, 8, 64,
+			       I915_TILING_NONE, I915_COMPRESSION_NONE);
 
-	bo = drm_intel_bo_alloc(bufmgr, "mi_rpc dest bo", 4096, 64);
+	intel_buf_cpu_map(buf, true);
+	memset(buf->ptr, 0x80, 4096);
+	intel_buf_unmap(buf);
 
-	ret = drm_intel_bo_map(bo, true);
-	igt_assert_eq(ret, 0);
-
-	memset(bo->virtual, 0x80, 4096);
-	drm_intel_bo_unmap(bo);
-
-	emit_report_perf_count(batch,
-			       bo, /* dst */
+	emit_report_perf_count(ibb,
+			       buf, /* dst */
 			       0, /* dst offset in bytes */
 			       0xdeadbeef); /* report ID */
 
-	intel_batchbuffer_flush_with_context(batch, context);
-
-	ret = drm_intel_bo_map(bo, false /* write enable */);
-	igt_assert_eq(ret, 0);
+	intel_bb_flush_render_with_context(ibb, ctx_id);
+	intel_bb_sync(ibb);
 
-	report32 = bo->virtual;
+	intel_buf_cpu_map(buf, false);
+	report32 = buf->ptr;
 	dump_report(report32, 64, "mi-rpc");
 	igt_assert_eq(report32[0], 0xdeadbeef); /* report ID */
 	igt_assert_neq(report32[1], 0); /* timestamp */
@@ -3076,17 +3007,17 @@ test_mi_rpc(void)
 	igt_assert_neq(report32[63], 0x80808080); /* end of report */
 	igt_assert_eq(report32[64], 0x80808080); /* after 256 byte report */
 
-	drm_intel_bo_unmap(bo);
-	drm_intel_bo_unreference(bo);
-	intel_batchbuffer_free(batch);
-	drm_intel_gem_context_destroy(context);
-	drm_intel_bufmgr_destroy(bufmgr);
+	intel_buf_unmap(buf);
+	intel_buf_destroy(buf);
+	intel_bb_destroy(ibb);
+	gem_context_destroy(drm_fd, ctx_id);
+	buf_ops_destroy(bops);
 	__perf_close(stream_fd);
 }
 
 static void
-emit_stall_timestamp_and_rpc(struct intel_batchbuffer *batch,
-			     drm_intel_bo *dst,
+emit_stall_timestamp_and_rpc(struct intel_bb *ibb,
+			     struct intel_buf *dst,
 			     int timestamp_offset,
 			     int report_dst_offset,
 			     uint32_t report_id)
@@ -3095,27 +3026,19 @@ emit_stall_timestamp_and_rpc(struct intel_batchbuffer *batch,
 				   PIPE_CONTROL_RENDER_TARGET_FLUSH |
 				   PIPE_CONTROL_WRITE_TIMESTAMP);
 
-	if (intel_gen(devid) >= 8) {
-		BEGIN_BATCH(5, 1);
-		OUT_BATCH(GFX_OP_PIPE_CONTROL | (6 - 2));
-		OUT_BATCH(pipe_ctl_flags);
-		OUT_RELOC(dst, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
-			  timestamp_offset);
-		OUT_BATCH(0); /* imm lower */
-		OUT_BATCH(0); /* imm upper */
-		ADVANCE_BATCH();
-	} else {
-		BEGIN_BATCH(5, 1);
-		OUT_BATCH(GFX_OP_PIPE_CONTROL | (5 - 2));
-		OUT_BATCH(pipe_ctl_flags);
-		OUT_RELOC(dst, I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
-			  timestamp_offset);
-		OUT_BATCH(0); /* imm lower */
-		OUT_BATCH(0); /* imm upper */
-		ADVANCE_BATCH();
-	}
+	if (intel_gen(devid) >= 8)
+		intel_bb_out(ibb, GFX_OP_PIPE_CONTROL | (6 - 2));
+	else
+		intel_bb_out(ibb, GFX_OP_PIPE_CONTROL | (5 - 2));
+
+	intel_bb_out(ibb, pipe_ctl_flags);
+	intel_bb_emit_reloc(ibb, dst->handle,
+			    I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+			    timestamp_offset, dst->addr.offset);
+	intel_bb_out(ibb, 0); /* imm lower */
+	intel_bb_out(ibb, 0); /* imm upper */
 
-	emit_report_perf_count(batch, dst, report_dst_offset, report_id);
+	emit_report_perf_count(ibb, dst, report_dst_offset, report_id);
 }
 
 /* Tests the INTEL_performance_query use case where an unprivileged process
@@ -3156,11 +3079,10 @@ hsw_test_single_ctx_counters(void)
 	write_u64_file("/proc/sys/dev/i915/perf_stream_paranoid", 1);
 
 	igt_fork(child, 1) {
-		drm_intel_bufmgr *bufmgr;
-		drm_intel_context *context0, *context1;
-		struct intel_batchbuffer *batch;
-		struct igt_buf src[3], dst[3];
-		drm_intel_bo *bo;
+		struct buf_ops *bops;
+		struct intel_buf src[3], dst[3], *dst_buf;
+		struct intel_bb *ibb0, *ibb1;
+		uint32_t context0_id, context1_id;
 		uint32_t *report0_32, *report1_32;
 		uint64_t timestamp0_64, timestamp1_64;
 		uint32_t delta_ts64, delta_oa32;
@@ -3169,26 +3091,24 @@ hsw_test_single_ctx_counters(void)
 		int n_samples_written;
 		int width = 800;
 		int height = 600;
-		uint32_t ctx_id = 0xffffffff; /* invalid id */
-		int ret;
 
 		igt_drop_root();
 
-		bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
-		drm_intel_bufmgr_gem_enable_reuse(bufmgr);
+		bops = buf_ops_create(drm_fd);
 
 		for (int i = 0; i < ARRAY_SIZE(src); i++) {
-			scratch_buf_init(bufmgr, &src[i], width, height, 0xff0000ff);
-			scratch_buf_init(bufmgr, &dst[i], width, height, 0x00ff00ff);
+			scratch_buf_init(bops, &src[i], width, height, 0xff0000ff);
+			scratch_buf_init(bops, &dst[i], width, height, 0x00ff00ff);
 		}
 
-		batch = intel_batchbuffer_alloc(bufmgr, devid);
-
-		context0 = drm_intel_gem_context_create(bufmgr);
-		igt_assert(context0);
-
-		context1 = drm_intel_gem_context_create(bufmgr);
-		igt_assert(context1);
+		/*
+		 * We currently cache addresses for buffers within
+		 * intel_bb, so use separate batches for different contexts
+		 */
+		ibb0 = intel_bb_create(drm_fd, BATCH_SZ);
+		ibb1 = intel_bb_create(drm_fd, BATCH_SZ);
+		context0_id = gem_context_create(drm_fd);
+		context1_id = gem_context_create(drm_fd);
 
 		igt_debug("submitting warm up render_copy\n");
 
@@ -3212,34 +3132,32 @@ hsw_test_single_ctx_counters(void)
 		 * up pinning the context since there won't ever be a pinning
 		 * hook callback.
 		 */
-		render_copy(batch,
-			    context0,
+		render_copy(ibb0,
+			    context0_id,
 			    &src[0], 0, 0, width, height,
 			    &dst[0], 0, 0);
 
-		ret = drm_intel_gem_context_get_id(context0, &ctx_id);
-		igt_assert_eq(ret, 0);
-		igt_assert_neq(ctx_id, 0xffffffff);
-		properties[1] = ctx_id;
+		properties[1] = context0_id;
 
-		intel_batchbuffer_flush_with_context(batch, context0);
+		intel_bb_flush_render_with_context(ibb0, context0_id);
+		intel_bb_sync(ibb0);
 
-		scratch_buf_memset(src[0].bo, width, height, 0xff0000ff);
-		scratch_buf_memset(dst[0].bo, width, height, 0x00ff00ff);
+		scratch_buf_memset(&src[0], width, height, 0xff0000ff);
+		scratch_buf_memset(&dst[0], width, height, 0x00ff00ff);
 
 		igt_debug("opening i915-perf stream\n");
 		stream_fd = __perf_open(drm_fd, &param, false);
 
-		bo = drm_intel_bo_alloc(bufmgr, "mi_rpc dest bo", 4096, 64);
-
-		ret = drm_intel_bo_map(bo, true /* write enable */);
-		igt_assert_eq(ret, 0);
+		dst_buf = intel_buf_create(bops, 4096, 1, 8, 64,
+					   I915_TILING_NONE,
+					   I915_COMPRESSION_NONE);
 
-		memset(bo->virtual, 0x80, 4096);
-		drm_intel_bo_unmap(bo);
+		intel_buf_cpu_map(dst_buf, true /* write enable */);
+		memset(dst_buf->ptr, 0x80, 4096);
+		intel_buf_unmap(dst_buf);
 
-		emit_stall_timestamp_and_rpc(batch,
-					     bo,
+		emit_stall_timestamp_and_rpc(ibb0,
+					     dst_buf,
 					     512 /* timestamp offset */,
 					     0, /* report dst offset */
 					     0xdeadbeef); /* report id */
@@ -3249,45 +3167,45 @@ hsw_test_single_ctx_counters(void)
 		 * that the PIPE_CONTROL + MI_RPC commands will be in a
 		 * separate batch from the copy.
 		 */
-		intel_batchbuffer_flush_with_context(batch, context0);
+		intel_bb_flush_render_with_context(ibb0, context0_id);
 
-		render_copy(batch,
-			    context0,
+		render_copy(ibb0,
+			    context0_id,
 			    &src[0], 0, 0, width, height,
 			    &dst[0], 0, 0);
 
 		/* Another redundant flush to clarify batch bo is free to reuse */
-		intel_batchbuffer_flush_with_context(batch, context0);
+		intel_bb_flush_render_with_context(ibb0, context0_id);
 
 		/* submit two copies on the other context to avoid a false
 		 * positive in case the driver somehow ended up filtering for
 		 * context1
 		 */
-		render_copy(batch,
-			    context1,
+		render_copy(ibb1,
+			    context1_id,
 			    &src[1], 0, 0, width, height,
 			    &dst[1], 0, 0);
 
-		render_copy(batch,
-			    context1,
+		render_copy(ibb1,
+			    context1_id,
 			    &src[2], 0, 0, width, height,
 			    &dst[2], 0, 0);
 
 		/* And another */
-		intel_batchbuffer_flush_with_context(batch, context1);
+		intel_bb_flush_render_with_context(ibb1, context1_id);
 
-		emit_stall_timestamp_and_rpc(batch,
-					     bo,
+		emit_stall_timestamp_and_rpc(ibb0,
+					     dst_buf,
 					     520 /* timestamp offset */,
 					     256, /* report dst offset */
 					     0xbeefbeef); /* report id */
 
-		intel_batchbuffer_flush_with_context(batch, context0);
+		intel_bb_flush_render_with_context(ibb0, context0_id);
+		intel_bb_sync(ibb0);
 
-		ret = drm_intel_bo_map(bo, false /* write enable */);
-		igt_assert_eq(ret, 0);
+		intel_buf_cpu_map(dst_buf, false /* write enable */);
 
-		report0_32 = bo->virtual;
+		report0_32 = dst_buf->ptr;
 		igt_assert_eq(report0_32[0], 0xdeadbeef); /* report ID */
 		igt_assert_neq(report0_32[1], 0); /* timestamp */
 
@@ -3307,8 +3225,8 @@ hsw_test_single_ctx_counters(void)
 		igt_debug("timestamp32 0 = %u\n", report0_32[1]);
 		igt_debug("timestamp32 1 = %u\n", report1_32[1]);
 
-		timestamp0_64 = *(uint64_t *)(((uint8_t *)bo->virtual) + 512);
-		timestamp1_64 = *(uint64_t *)(((uint8_t *)bo->virtual) + 520);
+		timestamp0_64 = *(uint64_t *)(((uint8_t *)dst_buf->ptr) + 512);
+		timestamp1_64 = *(uint64_t *)(((uint8_t *)dst_buf->ptr) + 520);
 
 		igt_debug("timestamp64 0 = %"PRIu64"\n", timestamp0_64);
 		igt_debug("timestamp64 1 = %"PRIu64"\n", timestamp1_64);
@@ -3336,16 +3254,17 @@ hsw_test_single_ctx_counters(void)
 		igt_assert(delta_delta <= 320);
 
 		for (int i = 0; i < ARRAY_SIZE(src); i++) {
-			drm_intel_bo_unreference(src[i].bo);
-			drm_intel_bo_unreference(dst[i].bo);
+			intel_buf_close(bops, &src[i]);
+			intel_buf_close(bops, &dst[i]);
 		}
 
-		drm_intel_bo_unmap(bo);
-		drm_intel_bo_unreference(bo);
-		intel_batchbuffer_free(batch);
-		drm_intel_gem_context_destroy(context0);
-		drm_intel_gem_context_destroy(context1);
-		drm_intel_bufmgr_destroy(bufmgr);
+		intel_buf_unmap(dst_buf);
+		intel_buf_destroy(dst_buf);
+		intel_bb_destroy(ibb0);
+		intel_bb_destroy(ibb1);
+		gem_context_destroy(drm_fd, context0_id);
+		gem_context_destroy(drm_fd, context1_id);
+		buf_ops_destroy(bops);
 		__perf_close(stream_fd);
 	}
 
@@ -3406,11 +3325,10 @@ gen8_test_single_ctx_render_target_writes_a_counter(void)
 
 		igt_fork_helper(&child) {
 			struct drm_i915_perf_record_header *header;
-			drm_intel_bufmgr *bufmgr;
-			drm_intel_context *context0, *context1;
-			struct intel_batchbuffer *batch;
-			struct igt_buf src[3], dst[3];
-			drm_intel_bo *bo;
+			struct buf_ops *bops;
+			struct intel_bb *ibb0, *ibb1;
+			struct intel_buf src[3], dst[3], *dst_buf;
+			uint32_t context0_id, context1_id;
 			uint32_t *report0_32, *report1_32;
 			uint32_t *prev, *lprev = NULL;
 			uint64_t timestamp0_64, timestamp1_64;
@@ -3428,21 +3346,17 @@ gen8_test_single_ctx_render_target_writes_a_counter(void)
 				.format = test_set->perf_oa_format
 			};
 
-			bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
-			drm_intel_bufmgr_gem_enable_reuse(bufmgr);
+			bops = buf_ops_create(drm_fd);
 
 			for (int i = 0; i < ARRAY_SIZE(src); i++) {
-				scratch_buf_init(bufmgr, &src[i], width, height, 0xff0000ff);
-				scratch_buf_init(bufmgr, &dst[i], width, height, 0x00ff00ff);
+				scratch_buf_init(bops, &src[i], width, height, 0xff0000ff);
+				scratch_buf_init(bops, &dst[i], width, height, 0x00ff00ff);
 			}
 
-			batch = intel_batchbuffer_alloc(bufmgr, devid);
-
-			context0 = drm_intel_gem_context_create(bufmgr);
-			igt_assert(context0);
-
-			context1 = drm_intel_gem_context_create(bufmgr);
-			igt_assert(context1);
+			ibb0 = intel_bb_create(drm_fd, BATCH_SZ);
+			ibb1 = intel_bb_create(drm_fd, BATCH_SZ);
+			context0_id = gem_context_create(drm_fd);
+			context1_id = gem_context_create(drm_fd);
 
 			igt_debug("submitting warm up render_copy\n");
 
@@ -3466,32 +3380,30 @@ gen8_test_single_ctx_render_target_writes_a_counter(void)
 			 * up pinning the context since there won't ever be a pinning
 			 * hook callback.
 			 */
-			render_copy(batch,
-				    context0,
+			render_copy(ibb0,
+				    context0_id,
 				    &src[0], 0, 0, width, height,
 				    &dst[0], 0, 0);
+			intel_bb_sync(ibb0);
 
-			ret = drm_intel_gem_context_get_id(context0, &ctx_id);
-			igt_assert_eq(ret, 0);
-			igt_assert_neq(ctx_id, 0xffffffff);
-			properties[1] = ctx_id;
+			properties[1] = context0_id;
 
-			scratch_buf_memset(src[0].bo, width, height, 0xff0000ff);
-			scratch_buf_memset(dst[0].bo, width, height, 0x00ff00ff);
+			scratch_buf_memset(&src[0], width, height, 0xff0000ff);
+			scratch_buf_memset(&dst[0], width, height, 0x00ff00ff);
 
 			igt_debug("opening i915-perf stream\n");
 			stream_fd = __perf_open(drm_fd, &param, false);
 
-			bo = drm_intel_bo_alloc(bufmgr, "mi_rpc dest bo", 4096, 64);
-
-			ret = drm_intel_bo_map(bo, true /* write enable */);
-			igt_assert_eq(ret, 0);
+			dst_buf = intel_buf_create(bops, 4096, 1, 8, 64,
+						   I915_TILING_NONE,
+						   I915_COMPRESSION_NONE);
 
-			memset(bo->virtual, 0x80, 4096);
-			drm_intel_bo_unmap(bo);
+			intel_buf_cpu_map(dst_buf, true /* write enable */);
+			memset(dst_buf->ptr, 0x80, 4096);
+			intel_buf_unmap(dst_buf);
 
-			emit_stall_timestamp_and_rpc(batch,
-						     bo,
+			emit_stall_timestamp_and_rpc(ibb0,
+						     dst_buf,
 						     512 /* timestamp offset */,
 						     0, /* report dst offset */
 						     0xdeadbeef); /* report id */
@@ -3501,49 +3413,46 @@ gen8_test_single_ctx_render_target_writes_a_counter(void)
 			 * that the PIPE_CONTROL + MI_RPC commands will be in a
 			 * separate batch from the copy.
 			 */
-			intel_batchbuffer_flush_with_context(batch, context0);
+			intel_bb_flush_render_with_context(ibb0, context0_id);
 
-			render_copy(batch,
-				    context0,
+			render_copy(ibb0,
+				    context0_id,
 				    &src[0], 0, 0, width, height,
 				    &dst[0], 0, 0);
 
 			/* Another redundant flush to clarify batch bo is free to reuse */
-			intel_batchbuffer_flush_with_context(batch, context0);
+			intel_bb_flush_render_with_context(ibb0, context0_id);
 
 			/* submit two copies on the other context to avoid a false
 			 * positive in case the driver somehow ended up filtering for
 			 * context1
 			 */
-			render_copy(batch,
-				    context1,
+			render_copy(ibb1,
+				    context1_id,
 				    &src[1], 0, 0, width, height,
 				    &dst[1], 0, 0);
 
-			ret = drm_intel_gem_context_get_id(context1, &ctx1_id);
-			igt_assert_eq(ret, 0);
-			igt_assert_neq(ctx1_id, 0xffffffff);
-
-			render_copy(batch,
-				    context1,
+			render_copy(ibb1,
+				    context1_id,
 				    &src[2], 0, 0, width, height,
 				    &dst[2], 0, 0);
 
 			/* And another */
-			intel_batchbuffer_flush_with_context(batch, context1);
+			intel_bb_flush_render_with_context(ibb1, context1_id);
 
-			emit_stall_timestamp_and_rpc(batch,
-						     bo,
+			emit_stall_timestamp_and_rpc(ibb1,
+						     dst_buf,
 						     520 /* timestamp offset */,
 						     256, /* report dst offset */
 						     0xbeefbeef); /* report id */
 
-			intel_batchbuffer_flush_with_context(batch, context1);
+			intel_bb_flush_render_with_context(ibb1, context1_id);
+			intel_bb_sync(ibb1);
+			intel_bb_sync(ibb0);
 
-			ret = drm_intel_bo_map(bo, false /* write enable */);
-			igt_assert_eq(ret, 0);
+			intel_buf_cpu_map(dst_buf, false /* write enable */);
 
-			report0_32 = bo->virtual;
+			report0_32 = dst_buf->ptr;
 			igt_assert_eq(report0_32[0], 0xdeadbeef); /* report ID */
 			igt_assert_neq(report0_32[1], 0); /* timestamp */
 			prev = report0_32;
@@ -3555,6 +3464,7 @@ gen8_test_single_ctx_render_target_writes_a_counter(void)
 			igt_assert_eq(report1_32[0], 0xbeefbeef); /* report ID */
 			igt_assert_neq(report1_32[1], 0); /* timestamp */
 			ctx1_id = report1_32[2];
+			igt_debug("CTX1 ID: %u\n", ctx1_id);
 			dump_report(report1_32, 64, "report1_32");
 
 			memset(accumulator.deltas, 0, sizeof(accumulator.deltas));
@@ -3569,8 +3479,8 @@ gen8_test_single_ctx_render_target_writes_a_counter(void)
 			igt_debug("ctx_id 0 = %u\n", report0_32[2]);
 			igt_debug("ctx_id 1 = %u\n", report1_32[2]);
 
-			timestamp0_64 = *(uint64_t *)(((uint8_t *)bo->virtual) + 512);
-			timestamp1_64 = *(uint64_t *)(((uint8_t *)bo->virtual) + 520);
+			timestamp0_64 = *(uint64_t *)(((uint8_t *)dst_buf->ptr) + 512);
+			timestamp1_64 = *(uint64_t *)(((uint8_t *)dst_buf->ptr) + 520);
 
 			igt_debug("ts_timestamp64 0 = %"PRIu64"\n", timestamp0_64);
 			igt_debug("ts_timestamp64 1 = %"PRIu64"\n", timestamp1_64);
@@ -3758,27 +3668,26 @@ gen8_test_single_ctx_render_target_writes_a_counter(void)
 				  width, height);
 			accumulator_print(&accumulator, "filtered");
 
-			ret = drm_intel_bo_map(src[0].bo, false /* write enable */);
-			igt_assert_eq(ret, 0);
-			ret = drm_intel_bo_map(dst[0].bo, false /* write enable */);
-			igt_assert_eq(ret, 0);
+			intel_buf_cpu_map(&src[0], false /* write enable */);
+			intel_buf_cpu_map(&dst[0], false /* write enable */);
 
-			ret = memcmp(src[0].bo->virtual, dst[0].bo->virtual, 4 * width * height);
-			drm_intel_bo_unmap(src[0].bo);
-			drm_intel_bo_unmap(dst[0].bo);
+			ret = memcmp(src[0].ptr, dst[0].ptr, 4 * width * height);
+			intel_buf_unmap(&src[0]);
+			intel_buf_unmap(&dst[0]);
 
 again:
 			for (int i = 0; i < ARRAY_SIZE(src); i++) {
-				drm_intel_bo_unreference(src[i].bo);
-				drm_intel_bo_unreference(dst[i].bo);
+				intel_buf_close(bops, &src[i]);
+				intel_buf_close(bops, &dst[i]);
 			}
 
-			drm_intel_bo_unmap(bo);
-			drm_intel_bo_unreference(bo);
-			intel_batchbuffer_free(batch);
-			drm_intel_gem_context_destroy(context0);
-			drm_intel_gem_context_destroy(context1);
-			drm_intel_bufmgr_destroy(bufmgr);
+			intel_buf_unmap(dst_buf);
+			intel_buf_destroy(dst_buf);
+			intel_bb_destroy(ibb0);
+			intel_bb_destroy(ibb1);
+			gem_context_destroy(drm_fd, context0_id);
+			gem_context_destroy(drm_fd, context1_id);
+			buf_ops_destroy(bops);
 			__perf_close(stream_fd);
 			gem_quiescent_gpu(drm_fd);
 
@@ -3825,11 +3734,10 @@ static void gen12_single_ctx_helper(void)
 		.num_properties = ARRAY_SIZE(properties) / 2,
 		.properties_ptr = to_user_pointer(properties),
 	};
-	drm_intel_bufmgr *bufmgr;
-	drm_intel_context *context0, *context1;
-	struct intel_batchbuffer *batch;
-	struct igt_buf src[3], dst[3];
-	drm_intel_bo *bo;
+	struct buf_ops *bops;
+	struct intel_bb *ibb0, *ibb1;
+	struct intel_buf src[3], dst[3], *dst_buf;
+	uint32_t context0_id, context1_id;
 	uint32_t *report0_32, *report1_32, *report2_32, *report3_32;
 	uint64_t timestamp0_64, timestamp1_64;
 	uint32_t delta_ts64, delta_oa32;
@@ -3845,21 +3753,17 @@ static void gen12_single_ctx_helper(void)
 		.format = test_set->perf_oa_format
 	};
 
-	bufmgr = drm_intel_bufmgr_gem_init(drm_fd, 4096);
-	drm_intel_bufmgr_gem_enable_reuse(bufmgr);
+	bops = buf_ops_create(drm_fd);
 
 	for (int i = 0; i < ARRAY_SIZE(src); i++) {
-		scratch_buf_init(bufmgr, &src[i], width, height, 0xff0000ff);
-		scratch_buf_init(bufmgr, &dst[i], width, height, 0x00ff00ff);
+		scratch_buf_init(bops, &src[i], width, height, 0xff0000ff);
+		scratch_buf_init(bops, &dst[i], width, height, 0x00ff00ff);
 	}
 
-	batch = intel_batchbuffer_alloc(bufmgr, devid);
-
-	context0 = drm_intel_gem_context_create(bufmgr);
-	igt_assert(context0);
-
-	context1 = drm_intel_gem_context_create(bufmgr);
-	igt_assert(context1);
+	ibb0 = intel_bb_create(drm_fd, BATCH_SZ);
+	ibb1 = intel_bb_create(drm_fd, BATCH_SZ);
+	context0_id = gem_context_create(drm_fd);
+	context1_id = gem_context_create(drm_fd);
 
 	igt_debug("submitting warm up render_copy\n");
 
@@ -3883,44 +3787,42 @@ static void gen12_single_ctx_helper(void)
 	 * up pinning the context since there won't ever be a pinning
 	 * hook callback.
 	 */
-	render_copy(batch, context0,
+	render_copy(ibb0, context0_id,
 		    &src[0], 0, 0, width, height,
 		    &dst[0], 0, 0);
 
 	/* Initialize the context parameter to the perf open ioctl here */
-	ret = drm_intel_gem_context_get_id(context0, &ctx0_id);
-	igt_assert_eq(ret, 0);
-	igt_assert_neq(ctx0_id, 0xffffffff);
-	properties[1] = ctx0_id;
+	properties[1] = context0_id;
 
 	igt_debug("opening i915-perf stream\n");
 	stream_fd = __perf_open(drm_fd, &param, false);
 
-	bo = drm_intel_bo_alloc(bufmgr, "mi_rpc dest bo", 4096, 64);
+	dst_buf = intel_buf_create(bops, 4096, 1, 8, 64,
+				   I915_TILING_NONE,
+				   I915_COMPRESSION_NONE);
 
 	/* Set write domain to cpu briefly to fill the buffer with 80s */
-	ret = drm_intel_bo_map(bo, true);
-	igt_assert_eq(ret, 0);
-	memset(bo->virtual, 0x80, 2048);
-	memset(bo->virtual + 2048, 0, 2048);
-	drm_intel_bo_unmap(bo);
+	intel_buf_cpu_map(dst_buf, true /* write enable */);
+	memset(dst_buf->ptr, 0x80, 2048);
+	memset((uint8_t *) dst_buf->ptr + 2048, 0, 2048);
+	intel_buf_unmap(dst_buf);
 
 	/* Submit an mi-rpc to context0 before measurable work */
 #define BO_TIMESTAMP_OFFSET0 1024
 #define BO_REPORT_OFFSET0 0
 #define BO_REPORT_ID0 0xdeadbeef
-	emit_stall_timestamp_and_rpc(batch,
-				     bo,
+	emit_stall_timestamp_and_rpc(ibb0,
+				     dst_buf,
 				     BO_TIMESTAMP_OFFSET0,
 				     BO_REPORT_OFFSET0,
 				     BO_REPORT_ID0);
-	intel_batchbuffer_flush_with_context(batch, context0);
+	intel_bb_flush_render_with_context(ibb0, context0_id);
 
 	/* This is the work/context that is measured for counter increments */
-	render_copy(batch, context0,
+	render_copy(ibb0, context0_id,
 		    &src[0], 0, 0, width, height,
 		    &dst[0], 0, 0);
-	intel_batchbuffer_flush_with_context(batch, context0);
+	intel_bb_flush_render_with_context(ibb0, context0_id);
 
 	/* Submit an mi-rpc to context1 before work
 	 *
@@ -3931,54 +3833,51 @@ static void gen12_single_ctx_helper(void)
 #define BO_TIMESTAMP_OFFSET2 1040
 #define BO_REPORT_OFFSET2 512
 #define BO_REPORT_ID2 0x00c0ffee
-	emit_stall_timestamp_and_rpc(batch,
-				     bo,
+	emit_stall_timestamp_and_rpc(ibb1,
+				     dst_buf,
 				     BO_TIMESTAMP_OFFSET2,
 				     BO_REPORT_OFFSET2,
 				     BO_REPORT_ID2);
-	intel_batchbuffer_flush_with_context(batch, context1);
+	intel_bb_flush_render_with_context(ibb1, context1_id);
 
 	/* Submit two copies on the other context to avoid a false
 	 * positive in case the driver somehow ended up filtering for
 	 * context1
 	 */
-	render_copy(batch, context1,
+	render_copy(ibb1, context1_id,
 		    &src[1], 0, 0, width, height,
 		    &dst[1], 0, 0);
-	ret = drm_intel_gem_context_get_id(context1, &ctx1_id);
-	igt_assert_eq(ret, 0);
-	igt_assert_neq(ctx1_id, 0xffffffff);
 
-	render_copy(batch, context1,
+	render_copy(ibb1, context1_id,
 		    &src[2], 0, 0, width, height,
 		    &dst[2], 0, 0);
-	intel_batchbuffer_flush_with_context(batch, context1);
+	intel_bb_flush_render_with_context(ibb1, context1_id);
 
 	/* Submit an mi-rpc to context1 after all work */
 #define BO_TIMESTAMP_OFFSET3 1048
 #define BO_REPORT_OFFSET3 768
 #define BO_REPORT_ID3 0x01c0ffee
-	emit_stall_timestamp_and_rpc(batch,
-				     bo,
+	emit_stall_timestamp_and_rpc(ibb1,
+				     dst_buf,
 				     BO_TIMESTAMP_OFFSET3,
 				     BO_REPORT_OFFSET3,
 				     BO_REPORT_ID3);
-	intel_batchbuffer_flush_with_context(batch, context1);
+	intel_bb_flush_render_with_context(ibb1, context1_id);
 
 	/* Submit an mi-rpc to context0 after all measurable work */
 #define BO_TIMESTAMP_OFFSET1 1032
 #define BO_REPORT_OFFSET1 256
 #define BO_REPORT_ID1 0xbeefbeef
-	emit_stall_timestamp_and_rpc(batch,
-				     bo,
+	emit_stall_timestamp_and_rpc(ibb0,
+				     dst_buf,
 				     BO_TIMESTAMP_OFFSET1,
 				     BO_REPORT_OFFSET1,
 				     BO_REPORT_ID1);
-	intel_batchbuffer_flush_with_context(batch, context0);
+	intel_bb_flush_render_with_context(ibb0, context0_id);
+	intel_bb_sync(ibb0);
+	intel_bb_sync(ibb1);
 
-	/* Set write domain to none */
-	ret = drm_intel_bo_map(bo, false);
-	igt_assert_eq(ret, 0);
+	intel_buf_cpu_map(dst_buf, false);
 
 	/* Sanity check reports
 	 * reportX_32[0]: report id passed with mi-rpc
@@ -3990,7 +3889,7 @@ static void gen12_single_ctx_helper(void)
 	 * report2_32: start of other work
 	 * report3_32: end of other work
 	 */
-	report0_32 = bo->virtual;
+	report0_32 = dst_buf->ptr;
 	igt_assert_eq(report0_32[0], 0xdeadbeef);
 	igt_assert_neq(report0_32[1], 0);
 	ctx0_id = report0_32[2];
@@ -4001,6 +3900,7 @@ static void gen12_single_ctx_helper(void)
 	igt_assert_eq(report1_32[0], 0xbeefbeef);
 	igt_assert_neq(report1_32[1], 0);
 	ctx1_id = report1_32[2];
+	igt_debug("CTX ID1: %u\n", ctx1_id);
 	dump_report(report1_32, 64, "report1_32");
 
 	/* Verify that counters in context1 are all zeroes */
@@ -4009,7 +3909,7 @@ static void gen12_single_ctx_helper(void)
 	igt_assert_neq(report2_32[1], 0);
 	dump_report(report2_32, 64, "report2_32");
 	igt_assert_eq(0, memcmp(&report2_32[4],
-				bo->virtual + 2048,
+				(uint8_t *) dst_buf->ptr + 2048,
 				240));
 
 	report3_32 = report0_32 + 192;
@@ -4017,7 +3917,7 @@ static void gen12_single_ctx_helper(void)
 	igt_assert_neq(report3_32[1], 0);
 	dump_report(report3_32, 64, "report3_32");
 	igt_assert_eq(0, memcmp(&report3_32[4],
-				bo->virtual + 2048,
+				(uint8_t *) dst_buf->ptr + 2048,
 				240));
 
 	/* Accumulate deltas for counters - A0, A21 and A26 */
@@ -4037,8 +3937,8 @@ static void gen12_single_ctx_helper(void)
 	 * the OA report timestamps should be almost identical but
 	 * allow a 500 nanoseconds margin.
 	 */
-	timestamp0_64 = *(uint64_t *)(((uint8_t *)bo->virtual) + BO_TIMESTAMP_OFFSET0);
-	timestamp1_64 = *(uint64_t *)(((uint8_t *)bo->virtual) + BO_TIMESTAMP_OFFSET1);
+	timestamp0_64 = *(uint64_t *)(((uint8_t *)dst_buf->ptr) + BO_TIMESTAMP_OFFSET0);
+	timestamp1_64 = *(uint64_t *)(((uint8_t *)dst_buf->ptr) + BO_TIMESTAMP_OFFSET1);
 
 	igt_debug("ts_timestamp64 0 = %"PRIu64"\n", timestamp0_64);
 	igt_debug("ts_timestamp64 1 = %"PRIu64"\n", timestamp1_64);
@@ -4073,20 +3973,18 @@ static void gen12_single_ctx_helper(void)
 	/* Verify that the work actually happened by comparing the src
 	 * and dst buffers
 	 */
-	ret = drm_intel_bo_map(src[0].bo, false);
-	igt_assert_eq(ret, 0);
-	ret = drm_intel_bo_map(dst[0].bo, false);
-	igt_assert_eq(ret, 0);
+	intel_buf_cpu_map(&src[0], false);
+	intel_buf_cpu_map(&dst[0], false);
+
+	ret = memcmp(src[0].ptr, dst[0].ptr, 4 * width * height);
+	intel_buf_unmap(&src[0]);
+	intel_buf_unmap(&dst[0]);
 
-	ret = memcmp(src[0].bo->virtual, dst[0].bo->virtual, 4 * width * height);
 	if (ret != 0) {
 		accumulator_print(&accumulator, "total");
 		exit(EAGAIN);
 	}
 
-	drm_intel_bo_unmap(src[0].bo);
-	drm_intel_bo_unmap(dst[0].bo);
-
 	/* Check that this test passed. The test measures the number of 2x2
 	 * samples written to the render target using the counter A26. For
 	 * OAR, this counter will only have increments relevant to this specific
@@ -4096,16 +3994,17 @@ static void gen12_single_ctx_helper(void)
 
 	/* Clean up */
 	for (int i = 0; i < ARRAY_SIZE(src); i++) {
-		drm_intel_bo_unreference(src[i].bo);
-		drm_intel_bo_unreference(dst[i].bo);
+		intel_buf_close(bops, &src[i]);
+		intel_buf_close(bops, &dst[i]);
 	}
 
-	drm_intel_bo_unmap(bo);
-	drm_intel_bo_unreference(bo);
-	intel_batchbuffer_free(batch);
-	drm_intel_gem_context_destroy(context0);
-	drm_intel_gem_context_destroy(context1);
-	drm_intel_bufmgr_destroy(bufmgr);
+	intel_buf_unmap(dst_buf);
+	intel_buf_destroy(dst_buf);
+	intel_bb_destroy(ibb0);
+	intel_bb_destroy(ibb1);
+	gem_context_destroy(drm_fd, context0_id);
+	gem_context_destroy(drm_fd, context1_id);
+	buf_ops_destroy(bops);
 	__perf_close(stream_fd);
 }
 
-- 
2.26.0

_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev


* [igt-dev] ✗ GitLab.Pipeline: warning for Remove libdrm in rendercopy
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
                   ` (10 preceding siblings ...)
  2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 11/11] tests/perf: remove libdrm dependency for rendercopy Zbigniew Kempczyński
@ 2020-07-23 19:30 ` Patchwork
  2020-07-23 19:50 ` [igt-dev] ✗ Fi.CI.BAT: failure " Patchwork
  12 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2020-07-23 19:30 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev

== Series Details ==

Series: Remove libdrm in rendercopy
URL   : https://patchwork.freedesktop.org/series/79823/
State : warning

== Summary ==

Did not get list of undocumented tests for this run, something is wrong!

Other than that, pipeline status: FAILED.

see https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/pipelines/181379 for the overview.

build-containers:build-debian-mips has failed (https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/jobs/3775209):
  Using Docker executor with image registry.freedesktop.org/wayland/ci-templates/buildah:2019-08-13.0 ...
  Authenticating with credentials from job payload (GitLab Registry)
  Pulling docker image registry.freedesktop.org/wayland/ci-templates/buildah:2019-08-13.0 ...
  Using docker image sha256:594aa868d31ee3304dee8cae8a3433c89a6fcfcf6c7d420c04cce22f60147176 for registry.freedesktop.org/wayland/ci-templates/buildah:2019-08-13.0 ...
  section_end:1595532388:prepare_executor
  section_start:1595532388:prepare_script
  Preparing environment
  Running on runner-thys1ka2-project-3185-concurrent-1 via gst-htz-2...
  section_end:1595532389:prepare_script
  section_start:1595532389:get_sources
  Getting source from Git repository
  $ eval "$CI_PRE_CLONE_SCRIPT"
  Fetching changes...
  Initialized empty Git repository in /builds/gfx-ci/igt-ci-tags/.git/
  Created fresh repository.
  error: RPC failed; curl 92 HTTP/2 stream 0 was not closed cleanly: Unknown error code (err 1)
  fatal: the remote end hung up unexpectedly
  section_end:1595532451:get_sources
  ERROR: Job failed: exit code 1

== Logs ==

For more details see: https://gitlab.freedesktop.org/gfx-ci/igt-ci-tags/-/pipelines/181379

* [igt-dev] ✗ Fi.CI.BAT: failure for Remove libdrm in rendercopy
  2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
                   ` (11 preceding siblings ...)
  2020-07-23 19:30 ` [igt-dev] ✗ GitLab.Pipeline: warning for Remove libdrm in rendercopy Patchwork
@ 2020-07-23 19:50 ` Patchwork
  12 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2020-07-23 19:50 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev



== Series Details ==

Series: Remove libdrm in rendercopy
URL   : https://patchwork.freedesktop.org/series/79823/
State : failure

== Summary ==

CI Bug Log - changes from IGT_5746 -> IGTPW_4794
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with IGTPW_4794 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in IGTPW_4794, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_4794:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_render_linear_blits@basic:
    - fi-bsw-nick:        [PASS][1] -> [FAIL][2] +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-bsw-nick/igt@gem_render_linear_blits@basic.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-bsw-nick/igt@gem_render_linear_blits@basic.html
    - fi-bdw-5557u:       [PASS][3] -> [FAIL][4] +2 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-bdw-5557u/igt@gem_render_linear_blits@basic.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-bdw-5557u/igt@gem_render_linear_blits@basic.html

  * igt@gem_render_tiled_blits@basic:
    - fi-bsw-n3050:       [PASS][5] -> [FAIL][6] +1 similar issue
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-bsw-n3050/igt@gem_render_tiled_blits@basic.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-bsw-n3050/igt@gem_render_tiled_blits@basic.html
    - fi-gdg-551:         [PASS][7] -> [FAIL][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-gdg-551/igt@gem_render_tiled_blits@basic.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-gdg-551/igt@gem_render_tiled_blits@basic.html
    - fi-bsw-kefka:       [PASS][9] -> [FAIL][10] +1 similar issue
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-bsw-kefka/igt@gem_render_tiled_blits@basic.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-bsw-kefka/igt@gem_render_tiled_blits@basic.html

  * igt@kms_frontbuffer_tracking@basic:
    - fi-snb-2520m:       [PASS][11] -> [FAIL][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-snb-2520m/igt@kms_frontbuffer_tracking@basic.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-snb-2520m/igt@kms_frontbuffer_tracking@basic.html

  
Known issues
------------

  Here are the changes found in IGTPW_4794 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@debugfs_test@read_all_entries:
    - fi-kbl-soraka:      [PASS][13] -> [DMESG-WARN][14] ([i915#1982])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-kbl-soraka/igt@debugfs_test@read_all_entries.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-kbl-soraka/igt@debugfs_test@read_all_entries.html

  * igt@gem_exec_suspend@basic-s3:
    - fi-tgl-u2:          [PASS][15] -> [FAIL][16] ([i915#1888]) +1 similar issue
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-tgl-u2/igt@gem_exec_suspend@basic-s3.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-tgl-u2/igt@gem_exec_suspend@basic-s3.html

  * igt@gem_render_linear_blits@basic:
    - fi-tgl-y:           [PASS][17] -> [DMESG-WARN][18] ([i915#402]) +1 similar issue
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-tgl-y/igt@gem_render_linear_blits@basic.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-tgl-y/igt@gem_render_linear_blits@basic.html

  * igt@i915_module_load@reload:
    - fi-tgl-u2:          [PASS][19] -> [DMESG-WARN][20] ([i915#402])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-tgl-u2/igt@i915_module_load@reload.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-tgl-u2/igt@i915_module_load@reload.html
    - fi-tgl-y:           [PASS][21] -> [DMESG-WARN][22] ([i915#1982])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-tgl-y/igt@i915_module_load@reload.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-tgl-y/igt@i915_module_load@reload.html

  * igt@kms_busy@basic@flip:
    - fi-kbl-x1275:       [PASS][23] -> [DMESG-WARN][24] ([i915#62] / [i915#92] / [i915#95])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-kbl-x1275/igt@kms_busy@basic@flip.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-kbl-x1275/igt@kms_busy@basic@flip.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
    - fi-bsw-kefka:       [PASS][25] -> [DMESG-WARN][26] ([i915#1982])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-bsw-kefka/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-bsw-kefka/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@b-edp1:
    - fi-icl-u2:          [PASS][27] -> [DMESG-WARN][28] ([i915#1982]) +1 similar issue
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vblank@b-edp1.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vblank@b-edp1.html

  * igt@kms_frontbuffer_tracking@basic:
    - fi-bsw-n3050:       [PASS][29] -> [FAIL][30] ([i915#49])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-bsw-n3050/igt@kms_frontbuffer_tracking@basic.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-bsw-n3050/igt@kms_frontbuffer_tracking@basic.html
    - fi-bsw-kefka:       [PASS][31] -> [DMESG-FAIL][32] ([i915#1982] / [i915#49])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-bsw-kefka/igt@kms_frontbuffer_tracking@basic.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-bsw-kefka/igt@kms_frontbuffer_tracking@basic.html

  
#### Possible fixes ####

  * igt@gem_mmap@basic:
    - fi-tgl-y:           [DMESG-WARN][33] ([i915#402]) -> [PASS][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-tgl-y/igt@gem_mmap@basic.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-tgl-y/igt@gem_mmap@basic.html

  * igt@i915_pm_rpm@module-reload:
    - fi-bsw-kefka:       [DMESG-WARN][35] ([i915#1982]) -> [PASS][36] +1 similar issue
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-bsw-kefka/igt@i915_pm_rpm@module-reload.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-bsw-kefka/igt@i915_pm_rpm@module-reload.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
    - fi-bsw-n3050:       [DMESG-WARN][37] ([i915#1982]) -> [PASS][38] +1 similar issue
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-bsw-n3050/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-bsw-n3050/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_flip@basic-flip-vs-dpms@b-hdmi-a1:
    - fi-cml-s:           [DMESG-WARN][39] ([i915#1982]) -> [PASS][40]
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-cml-s/igt@kms_flip@basic-flip-vs-dpms@b-hdmi-a1.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-cml-s/igt@kms_flip@basic-flip-vs-dpms@b-hdmi-a1.html

  
#### Warnings ####

  * igt@debugfs_test@read_all_entries:
    - fi-kbl-x1275:       [DMESG-WARN][41] ([i915#62] / [i915#92]) -> [DMESG-WARN][42] ([i915#62] / [i915#92] / [i915#95]) +2 similar issues
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-kbl-x1275/igt@debugfs_test@read_all_entries.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-kbl-x1275/igt@debugfs_test@read_all_entries.html

  * igt@kms_cursor_legacy@basic-flip-after-cursor-varying-size:
    - fi-kbl-x1275:       [DMESG-WARN][43] ([i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][44] ([i915#62] / [i915#92]) +1 similar issue
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5746/fi-kbl-x1275/igt@kms_cursor_legacy@basic-flip-after-cursor-varying-size.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/fi-kbl-x1275/igt@kms_cursor_legacy@basic-flip-after-cursor-varying-size.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402
  [i915#49]: https://gitlab.freedesktop.org/drm/intel/issues/49
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (47 -> 40)
------------------------------

  Missing    (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_5746 -> IGTPW_4794

  CI-20190529: 20190529
  CI_DRM_8778: 5ead5989a42079951e6f0b7b6a072a690df0b985 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_4794: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/index.html
  IGT_5746: d818f0c54e5e781ba3fb372aab8f270cf153776c @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools



== Testlist changes ==

+igt@api_intel_bb@render-ccs
+igt@api_intel_bb@render-none
+igt@api_intel_bb@render-x
+igt@api_intel_bb@render-y
-igt@gem_read_read_speed@read-read-1x1
-igt@gem_read_read_speed@read-write-1x1
-igt@gem_read_read_speed@write-read-1x1
-igt@gem_read_read_speed@write-write-1x1

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4794/index.html


* [igt-dev] ✗ Fi.CI.BAT: failure for *Remove libdrm in rendercopy
  2020-07-29 12:29 [igt-dev] [PATCH i-g-t v13 00/13] *Remove " Zbigniew Kempczyński
@ 2020-07-29 13:05 ` Patchwork
  0 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2020-07-29 13:05 UTC (permalink / raw)
  To: Zbigniew Kempczyński; +Cc: igt-dev



== Series Details ==

Series: *Remove libdrm in rendercopy
URL   : https://patchwork.freedesktop.org/series/80035/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8814 -> IGTPW_4817
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with IGTPW_4817 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in IGTPW_4817, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in IGTPW_4817:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_render_tiled_blits@basic:
    - fi-gdg-551:         [PASS][1] -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8814/fi-gdg-551/igt@gem_render_tiled_blits@basic.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/fi-gdg-551/igt@gem_render_tiled_blits@basic.html

  
Known issues
------------

  Here are the changes found in IGTPW_4817 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_pm_rpm@module-reload:
    - fi-byt-j1900:       [PASS][3] -> [DMESG-WARN][4] ([i915#1982])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8814/fi-byt-j1900/igt@i915_pm_rpm@module-reload.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/fi-byt-j1900/igt@i915_pm_rpm@module-reload.html

  * igt@kms_busy@basic@flip:
    - fi-kbl-x1275:       [PASS][5] -> [DMESG-WARN][6] ([i915#62] / [i915#92] / [i915#95])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8814/fi-kbl-x1275/igt@kms_busy@basic@flip.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/fi-kbl-x1275/igt@kms_busy@basic@flip.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
    - fi-icl-u2:          [PASS][7] -> [DMESG-WARN][8] ([i915#1982])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8814/fi-icl-u2/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/fi-icl-u2/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@c-hdmi-a2:
    - fi-skl-guc:         [PASS][9] -> [DMESG-WARN][10] ([i915#2203])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8814/fi-skl-guc/igt@kms_flip@basic-flip-vs-wf_vblank@c-hdmi-a2.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/fi-skl-guc/igt@kms_flip@basic-flip-vs-wf_vblank@c-hdmi-a2.html

  
#### Possible fixes ####

  * igt@debugfs_test@read_all_entries:
    - {fi-kbl-7560u}:     [INCOMPLETE][11] ([i915#2175]) -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8814/fi-kbl-7560u/igt@debugfs_test@read_all_entries.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/fi-kbl-7560u/igt@debugfs_test@read_all_entries.html

  * igt@i915_pm_rpm@module-reload:
    - fi-bsw-kefka:       [DMESG-WARN][13] ([i915#1982]) -> [PASS][14] +1 similar issue
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8814/fi-bsw-kefka/igt@i915_pm_rpm@module-reload.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/fi-bsw-kefka/igt@i915_pm_rpm@module-reload.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
    - fi-byt-j1900:       [DMESG-WARN][15] ([i915#1982]) -> [PASS][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8814/fi-byt-j1900/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/fi-byt-j1900/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_cursor_legacy@basic-flip-after-cursor-atomic:
    - fi-icl-u2:          [DMESG-WARN][17] ([i915#1982]) -> [PASS][18] +1 similar issue
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8814/fi-icl-u2/igt@kms_cursor_legacy@basic-flip-after-cursor-atomic.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/fi-icl-u2/igt@kms_cursor_legacy@basic-flip-after-cursor-atomic.html

  
#### Warnings ####

  * igt@kms_cursor_legacy@basic-flip-after-cursor-legacy:
    - fi-kbl-x1275:       [DMESG-WARN][19] ([i915#62] / [i915#92]) -> [DMESG-WARN][20] ([i915#62] / [i915#92] / [i915#95]) +3 similar issues
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8814/fi-kbl-x1275/igt@kms_cursor_legacy@basic-flip-after-cursor-legacy.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/fi-kbl-x1275/igt@kms_cursor_legacy@basic-flip-after-cursor-legacy.html

  * igt@kms_force_connector_basic@force-edid:
    - fi-kbl-x1275:       [DMESG-WARN][21] ([i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][22] ([i915#62] / [i915#92]) +2 similar issues
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8814/fi-kbl-x1275/igt@kms_force_connector_basic@force-edid.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/fi-kbl-x1275/igt@kms_force_connector_basic@force-edid.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2175]: https://gitlab.freedesktop.org/drm/intel/issues/2175
  [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (39 -> 36)
------------------------------

  Additional (1): fi-tgl-u2 
  Missing    (4): fi-byt-clapper fi-byt-squawks fi-bsw-cyan fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * IGT: IGT_5750 -> IGTPW_4817

  CI-20190529: 20190529
  CI_DRM_8814: c3d46022808ab0325db29918a829ac7fa02a4314 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGTPW_4817: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/index.html
  IGT_5750: 95d906bf458634850626f7e5d6a707191022279f @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools



== Testlist changes ==

+igt@api_intel_bb@check-canonical
+igt@api_intel_bb@render-ccs
+igt@api_intel_bb@render-none-512
+igt@api_intel_bb@render-none-1024
+igt@api_intel_bb@render-none-reloc-512
+igt@api_intel_bb@render-none-reloc-1024
+igt@api_intel_bb@render-x-512
+igt@api_intel_bb@render-x-1024
+igt@api_intel_bb@render-x-reloc-512
+igt@api_intel_bb@render-x-reloc-1024
+igt@api_intel_bb@render-y-512
+igt@api_intel_bb@render-y-1024
+igt@api_intel_bb@render-y-reloc-512
+igt@api_intel_bb@render-y-reloc-1024
-igt@gem_read_read_speed@read-read-1x1
-igt@gem_read_read_speed@read-write-1x1
-igt@gem_read_read_speed@write-read-1x1
-igt@gem_read_read_speed@write-write-1x1

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/IGTPW_4817/index.html


end of thread, other threads:[~2020-07-29 13:05 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-23 19:22 [igt-dev] [PATCH i-g-t 00/11] Remove libdrm in rendercopy Zbigniew Kempczyński
2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 01/11] lib/intel_bufops: add mapping on cpu / device Zbigniew Kempczyński
2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 02/11] lib/intel_batchbuffer: add new functions to support rendercopy Zbigniew Kempczyński
2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 03/11] tests/gem_caching: adopt to batch flush function cleanup Zbigniew Kempczyński
2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 04/11] lib/rendercopy: remove libdrm dependency Zbigniew Kempczyński
2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 05/11] tests/api_intel_bb: add render tests Zbigniew Kempczyński
2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 06/11] lib/igt_draw: remove libdrm dependency Zbigniew Kempczyński
2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 07/11] lib/igt_fb: Removal of " Zbigniew Kempczyński
2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 08/11] tests/gem|kms: remove libdrm dependency (batch 1) Zbigniew Kempczyński
2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 09/11] tests/gem|kms: remove libdrm dependency (batch 2) Zbigniew Kempczyński
2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 10/11] tools/intel_residency: adopt intel_residency to use bufops Zbigniew Kempczyński
2020-07-23 19:22 ` [igt-dev] [PATCH i-g-t 11/11] tests/perf: remove libdrm dependency for rendercopy Zbigniew Kempczyński
2020-07-23 19:30 ` [igt-dev] ✗ GitLab.Pipeline: warning for Remove libdrm in rendercopy Patchwork
2020-07-23 19:50 ` [igt-dev] ✗ Fi.CI.BAT: failure " Patchwork
2020-07-29 12:29 [igt-dev] [PATCH i-g-t v13 00/13] *Remove " Zbigniew Kempczyński
2020-07-29 13:05 ` [igt-dev] ✗ Fi.CI.BAT: failure for " Patchwork
