* [PATCH v3 00/12] i915 TTM sync accelerated migration and clear
@ 2021-06-14 16:26 ` Thomas Hellström
  0 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

This patchset implements synchronous accelerated migration and clearing
for i915 on TTM. We plan to follow up by making these operations
asynchronous to the extent that TTM supports it. The series consists of:

A couple of patches from Chris which implement pipelined migration and
clears by atomically writing the PTEs in place before performing the
actual blit.

Some ww utilities, mainly for the accompanying selftests, added by Thomas,
who also modified the above patches for ww locking and LMEM support.

The acceleration is hooked up to our TTM backend by Ramalingam.

Finally, on request from Daniel, we ditch the old blit code, which is now
obsolete.

v2:
- A couple of minor style fixes pointed out by Matthew Auld
- Export and use intel_engine_destroy_pinned_context() to address a
  CI warning / failure.
v3:
- Acceleration hooked up to TTM
- Minor fixes addressing review comments (Pointed out by Matthew Auld)
- Fix pipelined blit handling of engine instances (Pointed out by Matthew Auld)
- Ditch the old blit code (Pointed out by Daniel)

Chris Wilson (6):
  drm/i915/gt: Add an insert_entry for gen8_ppgtt
  drm/i915/gt: Add a routine to iterate over the pagetables of a GTT
  drm/i915/gt: Export the pinned context constructor and destructor
  drm/i915/gt: Pipelined page migration
  drm/i915/gt: Pipelined clear
  drm/i915/gt: Setup a default migration context on the GT

Ramalingam C (1):
  drm/i915/ttm: accelerated move implementation

Thomas Hellström (5):
  drm/i915: Reference objects on the ww object list
  drm/i915: Break out dma_resv ww locking utilities to separate files
  drm/i915: Introduce a ww transaction helper
  drm/i915/gem: Zap the client blt code
  drm/i915/gem: Zap the i915_gem_object_blt code

 drivers/gpu/drm/i915/Makefile                 |   4 +-
 .../gpu/drm/i915/gem/i915_gem_client_blt.c    | 355 ---------
 .../gpu/drm/i915/gem/i915_gem_client_blt.h    |  21 -
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   9 +-
 .../gpu/drm/i915/gem/i915_gem_object_blt.c    | 461 ------------
 .../gpu/drm/i915/gem/i915_gem_object_blt.h    |  39 -
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       |  87 ++-
 .../i915/gem/selftests/i915_gem_client_blt.c  | 704 ------------------
 .../i915/gem/selftests/i915_gem_object_blt.c  | 597 ---------------
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          |  68 ++
 drivers/gpu/drm/i915/gt/intel_engine.h        |  12 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  27 +-
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |   2 +
 drivers/gpu/drm/i915/gt/intel_gt.c            |   4 +
 drivers/gpu/drm/i915/gt/intel_gt_types.h      |   3 +
 drivers/gpu/drm/i915/gt/intel_gtt.h           |   7 +
 drivers/gpu/drm/i915/gt/intel_migrate.c       | 687 +++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_migrate.h       |  65 ++
 drivers/gpu/drm/i915/gt/intel_migrate_types.h |  15 +
 drivers/gpu/drm/i915/gt/intel_renderstate.h   |   1 +
 drivers/gpu/drm/i915/gt/intel_ring.h          |   1 +
 drivers/gpu/drm/i915/gt/selftest_migrate.c    | 669 +++++++++++++++++
 drivers/gpu/drm/i915/i915_gem.c               |  52 --
 drivers/gpu/drm/i915/i915_gem.h               |  12 -
 drivers/gpu/drm/i915/i915_gem_ww.c            |  63 ++
 drivers/gpu/drm/i915/i915_gem_ww.h            |  50 ++
 .../drm/i915/selftests/i915_live_selftests.h  |   3 +-
 .../drm/i915/selftests/i915_perf_selftests.h  |   2 +-
 .../drm/i915/selftests/intel_memory_region.c  |  21 +-
 29 files changed, 1763 insertions(+), 2278 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
 delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_client_blt.h
 delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
 delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
 delete mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
 delete mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate.h
 create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate_types.h
 create mode 100644 drivers/gpu/drm/i915/gt/selftest_migrate.c
 create mode 100644 drivers/gpu/drm/i915/i915_gem_ww.c
 create mode 100644 drivers/gpu/drm/i915/i915_gem_ww.h

-- 
2.31.1



* [PATCH v3 01/12] drm/i915: Reference objects on the ww object list
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:26   ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

Since the ww transaction endpoint easily ends up far out of scope of
the objects on the ww object list, particularly for contending lock
objects, make sure we reference objects on the list so they don't
disappear from under us.

This comes with a performance penalty, so it's been debated whether this
is really needed. But I think it's motivated by the fact that locking
is typically difficult to get right, and whatever we can do to make it
simpler for developers moving forward should be done, unless the
performance impact is far too high.
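
Below is a minimal sketch of the problem (lookup_object() is a
hypothetical helper returning a referenced object), showing how the last
local reference can be dropped while the ww context still points at the
object:

	static int lookup_and_lock(struct i915_gem_ww_ctx *ww)
	{
		struct drm_i915_gem_object *obj = lookup_object(); /* hypothetical */
		int err;

		err = i915_gem_object_lock(obj, ww);
		i915_gem_object_put(obj); /* last local reference dropped */

		/*
		 * On -EDEADLK, ww->contended still points at obj after this
		 * returns; the reference taken in __i915_gem_object_lock()
		 * is what keeps it alive until i915_gem_ww_ctx_backoff().
		 */
		return err;
	}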

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h | 8 ++++++--
 drivers/gpu/drm/i915/i915_gem.c            | 4 ++++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index d66aa00d023a..241666931945 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -169,13 +169,17 @@ static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj,
 	else
 		ret = dma_resv_lock(obj->base.resv, ww ? &ww->ctx : NULL);
 
-	if (!ret && ww)
+	if (!ret && ww) {
+		i915_gem_object_get(obj);
 		list_add_tail(&obj->obj_link, &ww->obj_list);
+	}
 	if (ret == -EALREADY)
 		ret = 0;
 
-	if (ret == -EDEADLK)
+	if (ret == -EDEADLK) {
+		i915_gem_object_get(obj);
 		ww->contended = obj;
+	}
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6a0a3f0e36e1..c62dcd0e341a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1222,6 +1222,7 @@ static void i915_gem_ww_ctx_unlock_all(struct i915_gem_ww_ctx *ww)
 	while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) {
 		list_del(&obj->obj_link);
 		i915_gem_object_unlock(obj);
+		i915_gem_object_put(obj);
 	}
 }
 
@@ -1229,6 +1230,7 @@ void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj)
 {
 	list_del(&obj->obj_link);
 	i915_gem_object_unlock(obj);
+	i915_gem_object_put(obj);
 }
 
 void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ww)
@@ -1253,6 +1255,8 @@ int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ww)
 
 	if (!ret)
 		list_add_tail(&ww->contended->obj_link, &ww->obj_list);
+	else
+		i915_gem_object_put(ww->contended);
 
 	ww->contended = NULL;
 
-- 
2.31.1



* [PATCH v3 02/12] drm/i915: Break out dma_resv ww locking utilities to separate files
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:26   ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

As we're about to add more ww-related functionality,
break out the dma_resv ww locking utilities to their own files.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
v2:
- Make sure filenames are sorted in include file lists and Makefile
  (Reported by Matthew Auld)
---
 drivers/gpu/drm/i915/Makefile               |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h  |  1 +
 drivers/gpu/drm/i915/gt/intel_renderstate.h |  1 +
 drivers/gpu/drm/i915/i915_gem.c             | 56 ------------------
 drivers/gpu/drm/i915/i915_gem.h             | 12 ----
 drivers/gpu/drm/i915/i915_gem_ww.c          | 63 +++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_ww.h          | 21 +++++++
 7 files changed, 87 insertions(+), 68 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_ww.c
 create mode 100644 drivers/gpu/drm/i915/i915_gem_ww.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index f57dfc74d6ce..7e01ea2c0f00 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -165,6 +165,7 @@ i915-y += \
 	  i915_cmd_parser.o \
 	  i915_gem_evict.o \
 	  i915_gem_gtt.o \
+	  i915_gem_ww.o \
 	  i915_gem.o \
 	  i915_globals.o \
 	  i915_query.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 241666931945..7bf4dd46d8d2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -14,6 +14,7 @@
 #include "display/intel_frontbuffer.h"
 #include "i915_gem_object_types.h"
 #include "i915_gem_gtt.h"
+#include "i915_gem_ww.h"
 #include "i915_vma_types.h"
 
 /*
diff --git a/drivers/gpu/drm/i915/gt/intel_renderstate.h b/drivers/gpu/drm/i915/gt/intel_renderstate.h
index 48f009203917..4da4c5234ef0 100644
--- a/drivers/gpu/drm/i915/gt/intel_renderstate.h
+++ b/drivers/gpu/drm/i915/gt/intel_renderstate.h
@@ -8,6 +8,7 @@
 
 #include <linux/types.h>
 #include "i915_gem.h"
+#include "i915_gem_ww.h"
 
 struct i915_request;
 struct intel_context;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c62dcd0e341a..b7ba3c951a58 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1207,62 +1207,6 @@ int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
 	return ret;
 }
 
-void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ww, bool intr)
-{
-	ww_acquire_init(&ww->ctx, &reservation_ww_class);
-	INIT_LIST_HEAD(&ww->obj_list);
-	ww->intr = intr;
-	ww->contended = NULL;
-}
-
-static void i915_gem_ww_ctx_unlock_all(struct i915_gem_ww_ctx *ww)
-{
-	struct drm_i915_gem_object *obj;
-
-	while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) {
-		list_del(&obj->obj_link);
-		i915_gem_object_unlock(obj);
-		i915_gem_object_put(obj);
-	}
-}
-
-void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj)
-{
-	list_del(&obj->obj_link);
-	i915_gem_object_unlock(obj);
-	i915_gem_object_put(obj);
-}
-
-void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ww)
-{
-	i915_gem_ww_ctx_unlock_all(ww);
-	WARN_ON(ww->contended);
-	ww_acquire_fini(&ww->ctx);
-}
-
-int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ww)
-{
-	int ret = 0;
-
-	if (WARN_ON(!ww->contended))
-		return -EINVAL;
-
-	i915_gem_ww_ctx_unlock_all(ww);
-	if (ww->intr)
-		ret = dma_resv_lock_slow_interruptible(ww->contended->base.resv, &ww->ctx);
-	else
-		dma_resv_lock_slow(ww->contended->base.resv, &ww->ctx);
-
-	if (!ret)
-		list_add_tail(&ww->contended->obj_link, &ww->obj_list);
-	else
-		i915_gem_object_put(ww->contended);
-
-	ww->contended = NULL;
-
-	return ret;
-}
-
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_gem_device.c"
 #include "selftests/i915_gem.c"
diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index 440c35f1abc9..d0752e5553db 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -123,16 +123,4 @@ static inline bool __tasklet_is_scheduled(struct tasklet_struct *t)
 	return test_bit(TASKLET_STATE_SCHED, &t->state);
 }
 
-struct i915_gem_ww_ctx {
-	struct ww_acquire_ctx ctx;
-	struct list_head obj_list;
-	bool intr;
-	struct drm_i915_gem_object *contended;
-};
-
-void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
-void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ctx);
-int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ctx);
-void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj);
-
 #endif /* __I915_GEM_H__ */
diff --git a/drivers/gpu/drm/i915/i915_gem_ww.c b/drivers/gpu/drm/i915/i915_gem_ww.c
new file mode 100644
index 000000000000..3f6ff139478e
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_ww.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+#include <linux/dma-resv.h>
+#include "i915_gem_ww.h"
+#include "gem/i915_gem_object.h"
+
+void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ww, bool intr)
+{
+	ww_acquire_init(&ww->ctx, &reservation_ww_class);
+	INIT_LIST_HEAD(&ww->obj_list);
+	ww->intr = intr;
+	ww->contended = NULL;
+}
+
+static void i915_gem_ww_ctx_unlock_all(struct i915_gem_ww_ctx *ww)
+{
+	struct drm_i915_gem_object *obj;
+
+	while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) {
+		list_del(&obj->obj_link);
+		i915_gem_object_unlock(obj);
+		i915_gem_object_put(obj);
+	}
+}
+
+void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj)
+{
+	list_del(&obj->obj_link);
+	i915_gem_object_unlock(obj);
+	i915_gem_object_put(obj);
+}
+
+void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ww)
+{
+	i915_gem_ww_ctx_unlock_all(ww);
+	WARN_ON(ww->contended);
+	ww_acquire_fini(&ww->ctx);
+}
+
+int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ww)
+{
+	int ret = 0;
+
+	if (WARN_ON(!ww->contended))
+		return -EINVAL;
+
+	i915_gem_ww_ctx_unlock_all(ww);
+	if (ww->intr)
+		ret = dma_resv_lock_slow_interruptible(ww->contended->base.resv, &ww->ctx);
+	else
+		dma_resv_lock_slow(ww->contended->base.resv, &ww->ctx);
+
+	if (!ret)
+		list_add_tail(&ww->contended->obj_link, &ww->obj_list);
+	else
+		i915_gem_object_put(ww->contended);
+
+	ww->contended = NULL;
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h b/drivers/gpu/drm/i915/i915_gem_ww.h
new file mode 100644
index 000000000000..f2d8769e4118
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_ww.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+#ifndef __I915_GEM_WW_H__
+#define __I915_GEM_WW_H__
+
+#include <drm/drm_drv.h>
+
+struct i915_gem_ww_ctx {
+	struct ww_acquire_ctx ctx;
+	struct list_head obj_list;
+	struct drm_i915_gem_object *contended;
+	bool intr;
+};
+
+void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
+void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ctx);
+int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ctx);
+void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj);
+#endif
-- 
2.31.1



* [PATCH v3 03/12] drm/i915: Introduce a ww transaction helper
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:26   ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

Introduce a for_i915_gem_ww(){} utility to help make the code
around a ww transaction more readable.
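
A minimal usage sketch (obj is assumed to be an existing GEM object and
do_work_locked() a hypothetical helper doing the actual work under the
object lock):

	struct i915_gem_ww_ctx ww;
	int err;

	for_i915_gem_ww(&ww, err, true) {
		err = i915_gem_object_lock(obj, &ww);
		if (err)
			continue; /* -EDEADLK backs off and retries the loop */

		err = do_work_locked(obj);
	}
	/* All objects are unlocked here; err holds the final status. */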

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_ww.h | 31 +++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h b/drivers/gpu/drm/i915/i915_gem_ww.h
index f2d8769e4118..f6b1a796667b 100644
--- a/drivers/gpu/drm/i915/i915_gem_ww.h
+++ b/drivers/gpu/drm/i915/i915_gem_ww.h
@@ -11,11 +11,40 @@ struct i915_gem_ww_ctx {
 	struct ww_acquire_ctx ctx;
 	struct list_head obj_list;
 	struct drm_i915_gem_object *contended;
-	bool intr;
+	unsigned short intr;
+	unsigned short loop;
 };
 
 void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr);
 void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ctx);
 int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ctx);
 void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj);
+
+/* Internal functions used by the inlines! Don't use. */
+static inline int __i915_gem_ww_fini(struct i915_gem_ww_ctx *ww, int err)
+{
+	ww->loop = 0;
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(ww);
+		if (!err)
+			ww->loop = 1;
+	}
+
+	if (!ww->loop)
+		i915_gem_ww_ctx_fini(ww);
+
+	return err;
+}
+
+static inline void
+__i915_gem_ww_init(struct i915_gem_ww_ctx *ww, bool intr)
+{
+	i915_gem_ww_ctx_init(ww, intr);
+	ww->loop = 1;
+}
+
+#define for_i915_gem_ww(_ww, _err, _intr)			\
+	for (__i915_gem_ww_init(_ww, _intr); (_ww)->loop;	\
+	     _err = __i915_gem_ww_fini(_ww, _err))
+
 #endif
-- 
2.31.1



* [PATCH v3 04/12] drm/i915/gt: Add an insert_entry for gen8_ppgtt
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:26   ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: matthew.auld, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

In the next patch, we will want to write a PTE for an explicit
dma address, outside of the usual vma.
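
For instance (a sketch only; dma_addr and offset are assumed to be a
valid device address and GTT offset), the new hook makes the following
possible:

	/* Write a single PTE at a fixed offset, with no vma involved. */
	vm->insert_page(vm, dma_addr, offset, I915_CACHE_NONE, 0);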

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 21c8b7350b7a..1b676d7700bf 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -555,6 +555,24 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm,
 	}
 }
 
+static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
+				    dma_addr_t addr,
+				    u64 offset,
+				    enum i915_cache_level level,
+				    u32 flags)
+{
+	u64 idx = offset >> GEN8_PTE_SHIFT;
+	struct i915_page_directory * const pdp =
+		gen8_pdp_for_page_index(vm, idx);
+	struct i915_page_directory *pd =
+		i915_pd_entry(pdp, gen8_pd_index(idx, 2));
+	gen8_pte_t *vaddr;
+
+	vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+	vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
+	clflush_cache_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
+}
+
 static int gen8_init_scratch(struct i915_address_space *vm)
 {
 	u32 pte_flags;
@@ -734,6 +752,7 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 
 	ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
 	ppgtt->vm.insert_entries = gen8_ppgtt_insert;
+	ppgtt->vm.insert_page = gen8_ppgtt_insert_entry;
 	ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
 	ppgtt->vm.clear_range = gen8_ppgtt_clear;
 
-- 
2.31.1



* [PATCH v3 05/12] drm/i915/gt: Add a routine to iterate over the pagetables of a GTT
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:26   ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: matthew.auld, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

In the next patch, we will want to look at the dma addresses of
individual page tables, so add a routine to iterate over them.
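
A minimal usage sketch of the new hook (count_pt() and the surrounding
caller code are hypothetical), assuming [start, start + length) has
already been allocated in the vm:

	static void count_pt(struct i915_address_space *vm,
			     struct i915_page_table *pt, void *data)
	{
		unsigned int *count = data;

		(*count)++; /* just count the page tables backing the range */
	}

	/* ... in the caller: */
	unsigned int count = 0;

	if (vm->foreach)
		vm->foreach(vm, start, length, count_pt, &count);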

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 49 ++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gtt.h  |  7 ++++
 2 files changed, 56 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 1b676d7700bf..3d02c726c746 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -361,6 +361,54 @@ static void gen8_ppgtt_alloc(struct i915_address_space *vm,
 			   &start, start + length, vm->top);
 }
 
+static void __gen8_ppgtt_foreach(struct i915_address_space *vm,
+				 struct i915_page_directory *pd,
+				 u64 *start, u64 end, int lvl,
+				 void (*fn)(struct i915_address_space *vm,
+					    struct i915_page_table *pt,
+					    void *data),
+				 void *data)
+{
+	unsigned int idx, len;
+
+	len = gen8_pd_range(*start, end, lvl--, &idx);
+
+	spin_lock(&pd->lock);
+	do {
+		struct i915_page_table *pt = pd->entry[idx];
+
+		atomic_inc(&pt->used);
+		spin_unlock(&pd->lock);
+
+		if (lvl) {
+			__gen8_ppgtt_foreach(vm, as_pd(pt), start, end, lvl,
+					     fn, data);
+		} else {
+			fn(vm, pt, data);
+			*start += gen8_pt_count(*start, end);
+		}
+
+		spin_lock(&pd->lock);
+		atomic_dec(&pt->used);
+	} while (idx++, --len);
+	spin_unlock(&pd->lock);
+}
+
+static void gen8_ppgtt_foreach(struct i915_address_space *vm,
+			       u64 start, u64 length,
+			       void (*fn)(struct i915_address_space *vm,
+					  struct i915_page_table *pt,
+					  void *data),
+			       void *data)
+{
+	start >>= GEN8_PTE_SHIFT;
+	length >>= GEN8_PTE_SHIFT;
+
+	__gen8_ppgtt_foreach(vm, i915_vm_to_ppgtt(vm)->pd,
+			     &start, start + length, vm->top,
+			     fn, data);
+}
+
 static __always_inline u64
 gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 		      struct i915_page_directory *pdp,
@@ -755,6 +803,7 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 	ppgtt->vm.insert_page = gen8_ppgtt_insert_entry;
 	ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc;
 	ppgtt->vm.clear_range = gen8_ppgtt_clear;
+	ppgtt->vm.foreach = gen8_ppgtt_foreach;
 
 	ppgtt->vm.pte_encode = gen8_pte_encode;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index edea95b97c36..9bd89f2a01ff 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -296,6 +296,13 @@ struct i915_address_space {
 			       u32 flags);
 	void (*cleanup)(struct i915_address_space *vm);
 
+	void (*foreach)(struct i915_address_space *vm,
+			u64 start, u64 length,
+			void (*fn)(struct i915_address_space *vm,
+				   struct i915_page_table *pt,
+				   void *data),
+			void *data);
+
 	struct i915_vma_ops vma_ops;
 
 	I915_SELFTEST_DECLARE(struct fault_attr fault_attr);
-- 
2.31.1



* [PATCH v3 06/12] drm/i915/gt: Export the pinned context constructor and destructor
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:26   ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: matthew.auld, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

Allow internal clients to create and destroy a pinned context.
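
A minimal usage sketch (hwsp_offset and "my_context" are placeholders;
error handling trimmed), assuming the caller already holds references to
the engine and the vm:

	static struct lock_class_key key; /* lockdep class for this user */
	struct intel_context *ce;

	ce = intel_engine_create_pinned_context(engine, vm, SZ_4K,
						hwsp_offset, &key,
						"my_context");
	if (IS_ERR(ce))
		return PTR_ERR(ce);

	/* ... submit requests on ce ... */

	intel_engine_destroy_pinned_context(ce);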

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
v2:
- (Thomas) Export also the pinned context destructor
---
 drivers/gpu/drm/i915/gt/intel_engine.h    | 11 +++++++++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 27 ++++++++++++++---------
 2 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 8d9184920c51..36ea9eb52bb5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -19,7 +19,9 @@
 #include "intel_workarounds.h"
 
 struct drm_printer;
+struct intel_context;
 struct intel_gt;
+struct lock_class_key;
 
 /* Early gen2 devices have a cacheline of just 32 bytes, using 64 is overkill,
  * but keeps the logic simple. Indeed, the whole purpose of this macro is just
@@ -256,6 +258,15 @@ struct i915_request *
 intel_engine_find_active_request(struct intel_engine_cs *engine);
 
 u32 intel_engine_context_size(struct intel_gt *gt, u8 class);
+struct intel_context *
+intel_engine_create_pinned_context(struct intel_engine_cs *engine,
+				   struct i915_address_space *vm,
+				   unsigned int ring_size,
+				   unsigned int hwsp,
+				   struct lock_class_key *key,
+				   const char *name);
+
+void intel_engine_destroy_pinned_context(struct intel_context *ce);
 
 void intel_engine_init_active(struct intel_engine_cs *engine,
 			      unsigned int subclass);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 9ceddfbb1687..fcbaad18ac91 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -810,11 +810,13 @@ intel_engine_init_active(struct intel_engine_cs *engine, unsigned int subclass)
 #endif
 }
 
-static struct intel_context *
-create_pinned_context(struct intel_engine_cs *engine,
-		      unsigned int hwsp,
-		      struct lock_class_key *key,
-		      const char *name)
+struct intel_context *
+intel_engine_create_pinned_context(struct intel_engine_cs *engine,
+				   struct i915_address_space *vm,
+				   unsigned int ring_size,
+				   unsigned int hwsp,
+				   struct lock_class_key *key,
+				   const char *name)
 {
 	struct intel_context *ce;
 	int err;
@@ -825,6 +827,10 @@ create_pinned_context(struct intel_engine_cs *engine,
 
 	__set_bit(CONTEXT_BARRIER_BIT, &ce->flags);
 	ce->timeline = page_pack_bits(NULL, hwsp);
+	ce->ring = __intel_context_ring_size(ring_size);
+
+	i915_vm_put(ce->vm);
+	ce->vm = i915_vm_get(vm);
 
 	err = intel_context_pin(ce); /* perma-pin so it is always available */
 	if (err) {
@@ -843,7 +849,7 @@ create_pinned_context(struct intel_engine_cs *engine,
 	return ce;
 }
 
-static void destroy_pinned_context(struct intel_context *ce)
+void intel_engine_destroy_pinned_context(struct intel_context *ce)
 {
 	struct intel_engine_cs *engine = ce->engine;
 	struct i915_vma *hwsp = engine->status_page.vma;
@@ -863,8 +869,9 @@ create_kernel_context(struct intel_engine_cs *engine)
 {
 	static struct lock_class_key kernel;
 
-	return create_pinned_context(engine, I915_GEM_HWS_SEQNO_ADDR,
-				     &kernel, "kernel_context");
+	return intel_engine_create_pinned_context(engine, engine->gt->vm, SZ_4K,
+						  I915_GEM_HWS_SEQNO_ADDR,
+						  &kernel, "kernel_context");
 }
 
 /**
@@ -907,7 +914,7 @@ static int engine_init_common(struct intel_engine_cs *engine)
 	return 0;
 
 err_context:
-	destroy_pinned_context(ce);
+	intel_engine_destroy_pinned_context(ce);
 	return ret;
 }
 
@@ -969,7 +976,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 		fput(engine->default_state);
 
 	if (engine->kernel_context)
-		destroy_pinned_context(engine->kernel_context);
+		intel_engine_destroy_pinned_context(engine->kernel_context);
 
 	GEM_BUG_ON(!llist_empty(&engine->barrier_tasks));
 	cleanup_status_page(engine);
-- 
2.31.1



* [PATCH v3 07/12] drm/i915/gt: Pipelined page migration
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:26   ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

If we pipeline the PTE updates and then do the copy of those pages
within a single unpreemptible command packet, we can submit the copies
and leave them to be scheduled without having to synchronously wait
under a global lock. In order to manage migration, we need to
preallocate the page tables (and keep them pinned and available for use
at any time), causing a bottleneck for migrations as all clients must
contend on the limited resources. By inlining the ppGTT updates and
performing the blit atomically, each client only owns the PTE while in
use, and so we can reschedule individual operations however we see fit.
And most importantly, we do not need to take a global lock on the shared
vm, and wait until the operation is complete before releasing the lock
for others to claim the PTE for themselves.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
v2:
- Add a TODO for huge LMEM ptes (Pointed out by Matthew Auld)
- Use intel_engine_destroy_pinned_context() to properly take the pinned
  context timeline off the engine list. (CI warning).
v3:
- Remove an obsolete GEM_BUG_ON (Pointed out by Matthew Auld)
- Fix the size argument in allocate_va_range() to not include the base
  (Pointed out by Matthew Auld)
---
 drivers/gpu/drm/i915/Makefile                 |   1 +
 drivers/gpu/drm/i915/gt/intel_engine.h        |   1 +
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |   2 +
 drivers/gpu/drm/i915/gt/intel_migrate.c       | 542 ++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_migrate.h       |  45 ++
 drivers/gpu/drm/i915/gt/intel_migrate_types.h |  15 +
 drivers/gpu/drm/i915/gt/intel_ring.h          |   1 +
 drivers/gpu/drm/i915/gt/selftest_migrate.c    | 291 ++++++++++
 .../drm/i915/selftests/i915_live_selftests.h  |   1 +
 9 files changed, 899 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate.h
 create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate_types.h
 create mode 100644 drivers/gpu/drm/i915/gt/selftest_migrate.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 7e01ea2c0f00..de4cb9c52585 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -108,6 +108,7 @@ gt-y += \
 	gt/intel_gtt.o \
 	gt/intel_llc.o \
 	gt/intel_lrc.o \
+	gt/intel_migrate.o \
 	gt/intel_mocs.o \
 	gt/intel_ppgtt.o \
 	gt/intel_rc6.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index 36ea9eb52bb5..62f7440bc111 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -188,6 +188,7 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
 #define I915_GEM_HWS_PREEMPT_ADDR	(I915_GEM_HWS_PREEMPT * sizeof(u32))
 #define I915_GEM_HWS_SEQNO		0x40
 #define I915_GEM_HWS_SEQNO_ADDR		(I915_GEM_HWS_SEQNO * sizeof(u32))
+#define I915_GEM_HWS_MIGRATE		(0x42 * sizeof(u32))
 #define I915_GEM_HWS_SCRATCH		0x80
 
 #define I915_HWS_CSB_BUF0_INDEX		0x10
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 2694dbb9967e..1c3af0fc0456 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -123,8 +123,10 @@
 #define   MI_SEMAPHORE_SAD_NEQ_SDD	(5 << 12)
 #define   MI_SEMAPHORE_TOKEN_MASK	REG_GENMASK(9, 5)
 #define   MI_SEMAPHORE_TOKEN_SHIFT	5
+#define MI_STORE_DATA_IMM	MI_INSTR(0x20, 0)
 #define MI_STORE_DWORD_IMM	MI_INSTR(0x20, 1)
 #define MI_STORE_DWORD_IMM_GEN4	MI_INSTR(0x20, 2)
+#define MI_STORE_QWORD_IMM_GEN8 (MI_INSTR(0x20, 3) | REG_BIT(21))
 #define   MI_MEM_VIRTUAL	(1 << 22) /* 945,g33,965 */
 #define   MI_USE_GGTT		(1 << 22) /* g4x+ */
 #define MI_STORE_DWORD_INDEX	MI_INSTR(0x21, 1)
diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
new file mode 100644
index 000000000000..e2e860063e7b
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -0,0 +1,542 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include "i915_drv.h"
+#include "intel_context.h"
+#include "intel_gpu_commands.h"
+#include "intel_gt.h"
+#include "intel_gtt.h"
+#include "intel_migrate.h"
+#include "intel_ring.h"
+
+struct insert_pte_data {
+	u64 offset;
+	bool is_lmem;
+};
+
+#define CHUNK_SZ SZ_8M /* ~1ms at 8GiB/s preemption delay */
+
+static bool engine_supports_migration(struct intel_engine_cs *engine)
+{
+	if (!engine)
+		return false;
+
+	/*
+	 * We need the ability to prevent arbitration (MI_ARB_ON_OFF),
+	 * the ability to write PTE using inline data (MI_STORE_DATA)
+	 * and of course the ability to do the block transfer (blits).
+	 */
+	GEM_BUG_ON(engine->class != COPY_ENGINE_CLASS);
+
+	return true;
+}
+
+static void insert_pte(struct i915_address_space *vm,
+		       struct i915_page_table *pt,
+		       void *data)
+{
+	struct insert_pte_data *d = data;
+
+	vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE,
+			d->is_lmem ? PTE_LM : 0);
+	d->offset += PAGE_SIZE;
+}
+
+static struct i915_address_space *migrate_vm(struct intel_gt *gt)
+{
+	struct i915_vm_pt_stash stash = {};
+	struct i915_ppgtt *vm;
+	int err;
+	int i;
+
+	/*
+	 * We construct a very special VM for use by all migration contexts,
+	 * it is kept pinned so that it can be used at any time. As we need
+	 * to pre-allocate the page directories for the migration VM, this
+	 * limits us to only using a small number of prepared vma.
+	 *
+	 * To be able to pipeline and reschedule migration operations while
+	 * avoiding unnecessary contention on the vm itself, the PTE updates
+	 * are inline with the blits. All the blits use the same fixed
+	 * addresses, with the backing store redirection being updated on the
+	 * fly. Only 2 implicit vma are used for all migration operations.
+	 *
+	 * We lay the ppGTT out as:
+	 *
+	 *	[0, CHUNK_SZ) -> first object
+	 *	[CHUNK_SZ, 2 * CHUNK_SZ) -> second object
+	 *	[2 * CHUNK_SZ, 2 * CHUNK_SZ + 2 * CHUNK_SZ >> 9] -> PTE
+	 *
+	 * By exposing the dma addresses of the page directories themselves
+	 * within the ppGTT, we are then able to rewrite the PTE prior to use.
+	 * But the PTE update and subsequent migration operation must be atomic,
+	 * i.e. within the same non-preemptible window so that we do not switch
+	 * to another migration context that overwrites the PTE.
+	 *
+	 * TODO: Add support for huge LMEM PTEs
+	 */
+
+	vm = i915_ppgtt_create(gt);
+	if (IS_ERR(vm))
+		return ERR_CAST(vm);
+
+	if (!vm->vm.allocate_va_range || !vm->vm.foreach) {
+		err = -ENODEV;
+		goto err_vm;
+	}
+
+	/*
+	 * Each engine instance is assigned its own chunk in the VM, so
+	 * that we can run multiple instances concurrently
+	 */
+	for (i = 0; i < ARRAY_SIZE(gt->engine_class[COPY_ENGINE_CLASS]); i++) {
+		struct intel_engine_cs *engine;
+		u64 base = (u64)i << 32;
+		struct insert_pte_data d = {};
+		struct i915_gem_ww_ctx ww;
+		u64 sz;
+
+		engine = gt->engine_class[COPY_ENGINE_CLASS][i];
+		if (!engine_supports_migration(engine))
+			continue;
+
+		/*
+		 * We copy in 8MiB chunks. Each PDE covers 2MiB, so we need
+		 * 4x2 page directories for source/destination.
+		 */
+		sz = 2 * CHUNK_SZ;
+		d.offset = base + sz;
+
+		/*
+		 * We need another page directory setup so that we can write
+		 * the 8x512 PTE in each chunk.
+		 */
+		sz += (sz >> 12) * sizeof(u64);
+
+		err = i915_vm_alloc_pt_stash(&vm->vm, &stash, sz);
+		if (err)
+			goto err_vm;
+
+		for_i915_gem_ww(&ww, err, true) {
+			err = i915_vm_lock_objects(&vm->vm, &ww);
+			if (err)
+				continue;
+			err = i915_vm_map_pt_stash(&vm->vm, &stash);
+			if (err)
+				continue;
+
+			vm->vm.allocate_va_range(&vm->vm, &stash, base, sz);
+		}
+		i915_vm_free_pt_stash(&vm->vm, &stash);
+		if (err)
+			goto err_vm;
+
+		/* Now allow the GPU to rewrite the PTE via its own ppGTT */
+		d.is_lmem = i915_gem_object_is_lmem(vm->vm.scratch[0]);
+		vm->vm.foreach(&vm->vm, base, base + sz, insert_pte, &d);
+	}
+
+	return &vm->vm;
+
+err_vm:
+	i915_vm_put(&vm->vm);
+	return ERR_PTR(err);
+}
+
+static struct intel_engine_cs *first_copy_engine(struct intel_gt *gt)
+{
+	struct intel_engine_cs *engine;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(gt->engine_class[COPY_ENGINE_CLASS]); i++) {
+		engine = gt->engine_class[COPY_ENGINE_CLASS][i];
+		if (engine_supports_migration(engine))
+			return engine;
+	}
+
+	return NULL;
+}
+
+static struct intel_context *pinned_context(struct intel_gt *gt)
+{
+	static struct lock_class_key key;
+	struct intel_engine_cs *engine;
+	struct i915_address_space *vm;
+	struct intel_context *ce;
+
+	engine = first_copy_engine(gt);
+	if (!engine)
+		return ERR_PTR(-ENODEV);
+
+	vm = migrate_vm(gt);
+	if (IS_ERR(vm))
+		return ERR_CAST(vm);
+
+	ce = intel_engine_create_pinned_context(engine, vm, SZ_512K,
+						I915_GEM_HWS_MIGRATE,
+						&key, "migrate");
+	i915_vm_put(ce->vm);
+	return ce;
+}
+
+int intel_migrate_init(struct intel_migrate *m, struct intel_gt *gt)
+{
+	struct intel_context *ce;
+
+	memset(m, 0, sizeof(*m));
+
+	ce = pinned_context(gt);
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
+	m->context = ce;
+	return 0;
+}
+
+static int random_index(unsigned int max)
+{
+	return upper_32_bits(mul_u32_u32(get_random_u32(), max));
+}
+
+static struct intel_context *__migrate_engines(struct intel_gt *gt)
+{
+	struct intel_engine_cs *engines[MAX_ENGINE_INSTANCE];
+	struct intel_engine_cs *engine;
+	unsigned int count, i;
+
+	count = 0;
+	for (i = 0; i < ARRAY_SIZE(gt->engine_class[COPY_ENGINE_CLASS]); i++) {
+		engine = gt->engine_class[COPY_ENGINE_CLASS][i];
+		if (engine_supports_migration(engine))
+			engines[count++] = engine;
+	}
+
+	return intel_context_create(engines[random_index(count)]);
+}
+
+struct intel_context *intel_migrate_create_context(struct intel_migrate *m)
+{
+	struct intel_context *ce;
+
+	/*
+	 * We randomly distribute contexts across the engines upon construction,
+	 * as they all share the same pinned vm, and so in order to allow
+	 * multiple blits to run in parallel, we must construct each blit
+	 * to use a different range of the vm for its GTT. This has to be
+	 * known at construction, so we can not use the late greedy load
+	 * balancing of the virtual-engine.
+	 */
+	ce = __migrate_engines(m->context->engine->gt);
+	if (IS_ERR(ce))
+		return ce;
+
+	ce->ring = __intel_context_ring_size(SZ_256K);
+
+	i915_vm_put(ce->vm);
+	ce->vm = i915_vm_get(m->context->vm);
+
+	return ce;
+}
+
+static inline struct sgt_dma sg_sgt(struct scatterlist *sg)
+{
+	dma_addr_t addr = sg_dma_address(sg);
+
+	return (struct sgt_dma){ sg, addr, addr + sg_dma_len(sg) };
+}
+
+static int emit_no_arbitration(struct i915_request *rq)
+{
+	u32 *cs;
+
+	cs = intel_ring_begin(rq, 2);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	/* Explicitly disable preemption for this request. */
+	*cs++ = MI_ARB_ON_OFF;
+	*cs++ = MI_NOOP;
+	intel_ring_advance(rq, cs);
+
+	return 0;
+}
+
+static int emit_pte(struct i915_request *rq,
+		    struct sgt_dma *it,
+		    enum i915_cache_level cache_level,
+		    bool is_lmem,
+		    u64 offset,
+		    int length)
+{
+	const u64 encode = rq->context->vm->pte_encode(0, cache_level,
+						       is_lmem ? PTE_LM : 0);
+	struct intel_ring *ring = rq->ring;
+	int total = 0;
+	u32 *hdr, *cs;
+	int pkt;
+
+	GEM_BUG_ON(INTEL_GEN(rq->engine->i915) < 8);
+
+	/* Compute the page directory offset for the target address range */
+	offset += (u64)rq->engine->instance << 32;
+	offset >>= 12;
+	offset *= sizeof(u64);
+	offset += 2 * CHUNK_SZ;
+
+	cs = intel_ring_begin(rq, 6);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	/* Pack as many PTE updates as possible into a single MI command */
+	pkt = min_t(int, 0x400, ring->space / sizeof(u32) + 5);
+	pkt = min_t(int, pkt, (ring->size - ring->emit) / sizeof(u32) + 5);
+
+	hdr = cs;
+	*cs++ = MI_STORE_DATA_IMM | REG_BIT(21); /* as qword elements */
+	*cs++ = lower_32_bits(offset);
+	*cs++ = upper_32_bits(offset);
+
+	do {
+		if (cs - hdr >= pkt) {
+			*hdr += cs - hdr - 2;
+			*cs++ = MI_NOOP;
+
+			ring->emit = (void *)cs - ring->vaddr;
+			intel_ring_advance(rq, cs);
+			intel_ring_update_space(ring);
+
+			cs = intel_ring_begin(rq, 6);
+			if (IS_ERR(cs))
+				return PTR_ERR(cs);
+
+			pkt = min_t(int, 0x400, ring->space / sizeof(u32) + 5);
+			pkt = min_t(int, pkt, (ring->size - ring->emit) / sizeof(u32) + 5);
+
+			hdr = cs;
+			*cs++ = MI_STORE_DATA_IMM | REG_BIT(21);
+			*cs++ = lower_32_bits(offset);
+			*cs++ = upper_32_bits(offset);
+		}
+
+		*cs++ = lower_32_bits(encode | it->dma);
+		*cs++ = upper_32_bits(encode | it->dma);
+
+		offset += 8;
+		total += I915_GTT_PAGE_SIZE;
+
+		it->dma += I915_GTT_PAGE_SIZE;
+		if (it->dma >= it->max) {
+			it->sg = __sg_next(it->sg);
+			if (!it->sg || sg_dma_len(it->sg) == 0)
+				break;
+
+			it->dma = sg_dma_address(it->sg);
+			it->max = it->dma + sg_dma_len(it->sg);
+		}
+	} while (total < length);
+
+	*hdr += cs - hdr - 2;
+	*cs++ = MI_NOOP;
+
+	ring->emit = (void *)cs - ring->vaddr;
+	intel_ring_advance(rq, cs);
+	intel_ring_update_space(ring);
+
+	return total;
+}
+
+static bool wa_1209644611_applies(int gen, u32 size)
+{
+	u32 height = size >> PAGE_SHIFT;
+
+	if (gen != 11)
+		return false;
+
+	return height % 4 == 3 && height <= 8;
+}
+
+static int emit_copy(struct i915_request *rq, int size)
+{
+	const int gen = INTEL_GEN(rq->engine->i915);
+	u32 instance = rq->engine->instance;
+	u32 *cs;
+
+	cs = intel_ring_begin(rq, gen >= 8 ? 10 : 6);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	if (gen >= 9 && !wa_1209644611_applies(gen, size)) {
+		*cs++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
+		*cs++ = BLT_DEPTH_32 | PAGE_SIZE;
+		*cs++ = 0;
+		*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+		*cs++ = CHUNK_SZ; /* dst offset */
+		*cs++ = instance;
+		*cs++ = 0;
+		*cs++ = PAGE_SIZE;
+		*cs++ = 0; /* src offset */
+		*cs++ = instance;
+	} else if (gen >= 8) {
+		*cs++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (10 - 2);
+		*cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
+		*cs++ = 0;
+		*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+		*cs++ = CHUNK_SZ; /* dst offset */
+		*cs++ = instance;
+		*cs++ = 0;
+		*cs++ = PAGE_SIZE;
+		*cs++ = 0; /* src offset */
+		*cs++ = instance;
+	} else {
+		GEM_BUG_ON(instance);
+		*cs++ = SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
+		*cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
+		*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE;
+		*cs++ = CHUNK_SZ; /* dst offset */
+		*cs++ = PAGE_SIZE;
+		*cs++ = 0; /* src offset */
+	}
+
+	intel_ring_advance(rq, cs);
+	return 0;
+}
+
+int
+intel_context_migrate_copy(struct intel_context *ce,
+			   struct dma_fence *await,
+			   struct scatterlist *src,
+			   enum i915_cache_level src_cache_level,
+			   bool src_is_lmem,
+			   struct scatterlist *dst,
+			   enum i915_cache_level dst_cache_level,
+			   bool dst_is_lmem,
+			   struct i915_request **out)
+{
+	struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst);
+	struct i915_request *rq;
+	int err;
+
+	*out = NULL;
+
+	GEM_BUG_ON(ce->ring->size < SZ_64K);
+
+	do {
+		int len;
+
+		rq = i915_request_create(ce);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto out_ce;
+		}
+
+		if (await) {
+			err = i915_request_await_dma_fence(rq, await);
+			if (err)
+				goto out_rq;
+
+			if (rq->engine->emit_init_breadcrumb) {
+				err = rq->engine->emit_init_breadcrumb(rq);
+				if (err)
+					goto out_rq;
+			}
+
+			await = NULL;
+		}
+
+		/* The PTE updates + copy must not be interrupted. */
+		err = emit_no_arbitration(rq);
+		if (err)
+			goto out_rq;
+
+		len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem, 0,
+			       CHUNK_SZ);
+		if (len <= 0) {
+			err = len;
+			goto out_rq;
+		}
+
+		err = emit_pte(rq, &it_dst, dst_cache_level, dst_is_lmem,
+			       CHUNK_SZ, len);
+		if (err < 0)
+			goto out_rq;
+		if (err < len) {
+			err = -EINVAL;
+			goto out_rq;
+		}
+
+		err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
+		if (err)
+			goto out_rq;
+
+		err = emit_copy(rq, len);
+
+		/* Arbitration is re-enabled between requests. */
+out_rq:
+		if (*out)
+			i915_request_put(*out);
+		*out = i915_request_get(rq);
+		i915_request_add(rq);
+		if (err || !it_src.sg || !sg_dma_len(it_src.sg))
+			break;
+
+		cond_resched();
+	} while (1);
+
+out_ce:
+	return err;
+}
+
+int intel_migrate_copy(struct intel_migrate *m,
+		       struct i915_gem_ww_ctx *ww,
+		       struct dma_fence *await,
+		       struct scatterlist *src,
+		       enum i915_cache_level src_cache_level,
+		       bool src_is_lmem,
+		       struct scatterlist *dst,
+		       enum i915_cache_level dst_cache_level,
+		       bool dst_is_lmem,
+		       struct i915_request **out)
+{
+	struct intel_context *ce;
+	int err;
+
+	*out = NULL;
+	if (!m->context)
+		return -ENODEV;
+
+	ce = intel_migrate_create_context(m);
+	if (IS_ERR(ce))
+		ce = intel_context_get(m->context);
+	GEM_BUG_ON(IS_ERR(ce));
+
+	err = intel_context_pin_ww(ce, ww);
+	if (err)
+		goto out;
+
+	err = intel_context_migrate_copy(ce, await,
+					 src, src_cache_level, src_is_lmem,
+					 dst, dst_cache_level, dst_is_lmem,
+					 out);
+
+	intel_context_unpin(ce);
+out:
+	intel_context_put(ce);
+	return err;
+}
+
+void intel_migrate_fini(struct intel_migrate *m)
+{
+	struct intel_context *ce;
+
+	ce = fetch_and_zero(&m->context);
+	if (!ce)
+		return;
+
+	intel_engine_destroy_pinned_context(ce);
+}
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftest_migrate.c"
+#endif
diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.h b/drivers/gpu/drm/i915/gt/intel_migrate.h
new file mode 100644
index 000000000000..32c61190ed73
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef __INTEL_MIGRATE__
+#define __INTEL_MIGRATE__
+
+#include "intel_migrate_types.h"
+
+struct dma_fence;
+struct i915_request;
+struct i915_gem_ww_ctx;
+struct intel_gt;
+struct scatterlist;
+enum i915_cache_level;
+
+int intel_migrate_init(struct intel_migrate *m, struct intel_gt *gt);
+
+struct intel_context *intel_migrate_create_context(struct intel_migrate *m);
+
+int intel_migrate_copy(struct intel_migrate *m,
+		       struct i915_gem_ww_ctx *ww,
+		       struct dma_fence *await,
+		       struct scatterlist *src,
+		       enum i915_cache_level src_cache_level,
+		       bool src_is_lmem,
+		       struct scatterlist *dst,
+		       enum i915_cache_level dst_cache_level,
+		       bool dst_is_lmem,
+		       struct i915_request **out);
+
+int intel_context_migrate_copy(struct intel_context *ce,
+			       struct dma_fence *await,
+			       struct scatterlist *src,
+			       enum i915_cache_level src_cache_level,
+			       bool src_is_lmem,
+			       struct scatterlist *dst,
+			       enum i915_cache_level dst_cache_level,
+			       bool dst_is_lmem,
+			       struct i915_request **out);
+
+void intel_migrate_fini(struct intel_migrate *m);
+
+#endif /* __INTEL_MIGRATE__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_migrate_types.h b/drivers/gpu/drm/i915/gt/intel_migrate_types.h
new file mode 100644
index 000000000000..d98230597f42
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/intel_migrate_types.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#ifndef __INTEL_MIGRATE_TYPES__
+#define __INTEL_MIGRATE_TYPES__
+
+struct intel_context;
+
+struct intel_migrate {
+	struct intel_context *context;
+};
+
+#endif /* __INTEL_MIGRATE_TYPES__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.h b/drivers/gpu/drm/i915/gt/intel_ring.h
index dbf5f14a136f..1b32dadfb8c3 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.h
+++ b/drivers/gpu/drm/i915/gt/intel_ring.h
@@ -49,6 +49,7 @@ static inline void intel_ring_advance(struct i915_request *rq, u32 *cs)
 	 * intel_ring_begin()).
 	 */
 	GEM_BUG_ON((rq->ring->vaddr + rq->ring->emit) != cs);
+	GEM_BUG_ON(!IS_ALIGNED(rq->ring->emit, 8)); /* RING_TAIL qword align */
 }
 
 static inline u32 intel_ring_wrap(const struct intel_ring *ring, u32 pos)
diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c b/drivers/gpu/drm/i915/gt/selftest_migrate.c
new file mode 100644
index 000000000000..9784d149ebf1
--- /dev/null
+++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c
@@ -0,0 +1,291 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include "selftests/i915_random.h"
+
+static const unsigned int sizes[] = {
+	SZ_4K,
+	SZ_64K,
+	SZ_2M,
+	CHUNK_SZ - SZ_4K,
+	CHUNK_SZ,
+	CHUNK_SZ + SZ_4K,
+	SZ_64M,
+};
+
+static struct drm_i915_gem_object *
+create_lmem_or_internal(struct drm_i915_private *i915, size_t size)
+{
+	if (HAS_LMEM(i915)) {
+		struct drm_i915_gem_object *obj;
+
+		obj = i915_gem_object_create_lmem(i915, size, 0);
+		if (!IS_ERR(obj))
+			return obj;
+	}
+
+	return i915_gem_object_create_internal(i915, size);
+}
+
+static int copy(struct intel_migrate *migrate,
+		int (*fn)(struct intel_migrate *migrate,
+			  struct i915_gem_ww_ctx *ww,
+			  struct drm_i915_gem_object *src,
+			  struct drm_i915_gem_object *dst,
+			  struct i915_request **out),
+		u32 sz, struct rnd_state *prng)
+{
+	struct drm_i915_private *i915 = migrate->context->engine->i915;
+	struct drm_i915_gem_object *src, *dst;
+	struct i915_request *rq;
+	struct i915_gem_ww_ctx ww;
+	u32 *vaddr;
+	int err = 0;
+	int i;
+
+	src = create_lmem_or_internal(i915, sz);
+	if (IS_ERR(src))
+		return 0;
+
+	dst = i915_gem_object_create_internal(i915, sz);
+	if (IS_ERR(dst))
+		goto err_free_src;
+
+	for_i915_gem_ww(&ww, err, true) {
+		err = i915_gem_object_lock(src, &ww);
+		if (err)
+			continue;
+
+		err = i915_gem_object_lock(dst, &ww);
+		if (err)
+			continue;
+
+		vaddr = i915_gem_object_pin_map(src, I915_MAP_WC);
+		if (IS_ERR(vaddr)) {
+			err = PTR_ERR(vaddr);
+			continue;
+		}
+
+		for (i = 0; i < sz / sizeof(u32); i++)
+			vaddr[i] = i;
+		i915_gem_object_flush_map(src);
+
+		vaddr = i915_gem_object_pin_map(dst, I915_MAP_WC);
+		if (IS_ERR(vaddr)) {
+			err = PTR_ERR(vaddr);
+			goto unpin_src;
+		}
+
+		for (i = 0; i < sz / sizeof(u32); i++)
+			vaddr[i] = ~i;
+		i915_gem_object_flush_map(dst);
+
+		err = fn(migrate, &ww, src, dst, &rq);
+		if (!err)
+			continue;
+
+		if (err != -EDEADLK && err != -EINTR && err != -ERESTARTSYS)
+			pr_err("%ps failed, size: %u\n", fn, sz);
+		if (rq) {
+			i915_request_wait(rq, 0, HZ);
+			i915_request_put(rq);
+		}
+		i915_gem_object_unpin_map(dst);
+unpin_src:
+		i915_gem_object_unpin_map(src);
+	}
+	if (err)
+		goto err_out;
+
+	if (rq) {
+		if (i915_request_wait(rq, 0, HZ) < 0) {
+			pr_err("%ps timed out, size: %u\n", fn, sz);
+			err = -ETIME;
+		}
+		i915_request_put(rq);
+	}
+
+	for (i = 0; !err && i < sz / PAGE_SIZE; i++) {
+		int x = i * 1024 + i915_prandom_u32_max_state(1024, prng);
+
+		if (vaddr[x] != x) {
+			pr_err("%ps failed, size: %u, offset: %zu\n",
+			       fn, sz, x * sizeof(u32));
+			igt_hexdump(vaddr + i * 1024, 4096);
+			err = -EINVAL;
+		}
+	}
+
+	i915_gem_object_unpin_map(dst);
+	i915_gem_object_unpin_map(src);
+
+err_out:
+	i915_gem_object_put(dst);
+err_free_src:
+	i915_gem_object_put(src);
+
+	return err;
+}
+
+static int __migrate_copy(struct intel_migrate *migrate,
+			  struct i915_gem_ww_ctx *ww,
+			  struct drm_i915_gem_object *src,
+			  struct drm_i915_gem_object *dst,
+			  struct i915_request **out)
+{
+	return intel_migrate_copy(migrate, ww, NULL,
+				  src->mm.pages->sgl, src->cache_level,
+				  i915_gem_object_is_lmem(src),
+				  dst->mm.pages->sgl, dst->cache_level,
+				  i915_gem_object_is_lmem(dst),
+				  out);
+}
+
+static int __global_copy(struct intel_migrate *migrate,
+			 struct i915_gem_ww_ctx *ww,
+			 struct drm_i915_gem_object *src,
+			 struct drm_i915_gem_object *dst,
+			 struct i915_request **out)
+{
+	return intel_context_migrate_copy(migrate->context, NULL,
+					  src->mm.pages->sgl, src->cache_level,
+					  i915_gem_object_is_lmem(src),
+					  dst->mm.pages->sgl, dst->cache_level,
+					  i915_gem_object_is_lmem(dst),
+					  out);
+}
+
+static int
+migrate_copy(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng)
+{
+	return copy(migrate, __migrate_copy, sz, prng);
+}
+
+static int
+global_copy(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng)
+{
+	return copy(migrate, __global_copy, sz, prng);
+}
+
+static int live_migrate_copy(void *arg)
+{
+	struct intel_migrate *migrate = arg;
+	struct drm_i915_private *i915 = migrate->context->engine->i915;
+	I915_RND_STATE(prng);
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(sizes); i++) {
+		int err;
+
+		err = migrate_copy(migrate, sizes[i], &prng);
+		if (err == 0)
+			err = global_copy(migrate, sizes[i], &prng);
+		i915_gem_drain_freed_objects(i915);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+struct threaded_migrate {
+	struct intel_migrate *migrate;
+	struct task_struct *tsk;
+	struct rnd_state prng;
+};
+
+static int threaded_migrate(struct intel_migrate *migrate,
+			    int (*fn)(void *arg),
+			    unsigned int flags)
+{
+	const unsigned int n_cpus = num_online_cpus() + 1;
+	struct threaded_migrate *thread;
+	I915_RND_STATE(prng);
+	unsigned int i;
+	int err = 0;
+
+	thread = kcalloc(n_cpus, sizeof(*thread), GFP_KERNEL);
+	if (!thread)
+		return 0;
+
+	for (i = 0; i < n_cpus; ++i) {
+		struct task_struct *tsk;
+
+		thread[i].migrate = migrate;
+		thread[i].prng =
+			I915_RND_STATE_INITIALIZER(prandom_u32_state(&prng));
+
+		tsk = kthread_run(fn, &thread[i], "igt-%d", i);
+		if (IS_ERR(tsk)) {
+			err = PTR_ERR(tsk);
+			break;
+		}
+
+		get_task_struct(tsk);
+		thread[i].tsk = tsk;
+	}
+
+	msleep(10); /* start all threads before we kthread_stop() */
+
+	for (i = 0; i < n_cpus; ++i) {
+		struct task_struct *tsk = thread[i].tsk;
+		int status;
+
+		if (IS_ERR_OR_NULL(tsk))
+			continue;
+
+		status = kthread_stop(tsk);
+		if (status && !err)
+			err = status;
+
+		put_task_struct(tsk);
+	}
+
+	kfree(thread);
+	return err;
+}
+
+static int __thread_migrate_copy(void *arg)
+{
+	struct threaded_migrate *tm = arg;
+
+	return migrate_copy(tm->migrate, 2 * CHUNK_SZ, &tm->prng);
+}
+
+static int thread_migrate_copy(void *arg)
+{
+	return threaded_migrate(arg, __thread_migrate_copy, 0);
+}
+
+static int __thread_global_copy(void *arg)
+{
+	struct threaded_migrate *tm = arg;
+
+	return global_copy(tm->migrate, 2 * CHUNK_SZ, &tm->prng);
+}
+
+static int thread_global_copy(void *arg)
+{
+	return threaded_migrate(arg, __thread_global_copy, 0);
+}
+
+int intel_migrate_live_selftests(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(live_migrate_copy),
+		SUBTEST(thread_migrate_copy),
+		SUBTEST(thread_global_copy),
+	};
+	struct intel_migrate m;
+	int err;
+
+	if (intel_migrate_init(&m, &i915->gt))
+		return 0;
+
+	err = i915_subtests(tests, &m);
+	intel_migrate_fini(&m);
+
+	return err;
+}
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index a92c0e9b7e6b..be5e0191eaea 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -26,6 +26,7 @@ selftest(gt_mocs, intel_mocs_live_selftests)
 selftest(gt_pm, intel_gt_pm_live_selftests)
 selftest(gt_heartbeat, intel_heartbeat_live_selftests)
 selftest(requests, i915_request_live_selftests)
+selftest(migrate, intel_migrate_live_selftests)
 selftest(active, i915_active_live_selftests)
 selftest(objects, i915_gem_object_live_selftests)
 selftest(mman, i915_gem_mman_live_selftests)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 08/12] drm/i915/gt: Pipelined clear
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:26   ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

Update the PTE and emit a clear within a single unpreemptible packet
such that we can schedule and pipeline clears.
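
A minimal usage sketch, mirroring the copy path from the previous
patch (editorial illustration only, not part of the patch: the helper
name is hypothetical and the object is assumed to already have its
backing pages populated):

static int example_clear_blocking(struct intel_migrate *m,
				  struct drm_i915_gem_object *obj,
				  u32 value)
{
	struct i915_gem_ww_ctx ww;
	struct i915_request *rq = NULL;
	int err;

	for_i915_gem_ww(&ww, err, true) {
		err = i915_gem_object_lock(obj, &ww);
		if (err)
			continue;

		/* Queue the chunked PTE writes + fill blits; does not wait. */
		err = intel_migrate_clear(m, &ww, NULL,
					  obj->mm.pages->sgl, obj->cache_level,
					  i915_gem_object_is_lmem(obj),
					  value, &rq);
	}

	if (rq) {
		if (i915_request_wait(rq, 0, HZ) < 0)
			err = -ETIME;
		i915_request_put(rq);
	}

	return err;
}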

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
v3:
- Handle engine instances correctly (Reported by Matthew Auld)
---
 drivers/gpu/drm/i915/gt/intel_migrate.c    | 143 ++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_migrate.h    |  20 +++
 drivers/gpu/drm/i915/gt/selftest_migrate.c | 163 +++++++++++++++++++++
 3 files changed, 326 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
index e2e860063e7b..ba4009120b33 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -488,6 +488,114 @@ intel_context_migrate_copy(struct intel_context *ce,
 	return err;
 }
 
+static int emit_clear(struct i915_request *rq, int size, u32 value)
+{
+	const int gen = INTEL_GEN(rq->engine->i915);
+	u32 instance = rq->engine->instance;
+	u32 *cs;
+
+	GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
+
+	cs = intel_ring_begin(rq, gen >= 8 ? 8 : 6);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	if (gen >= 8) {
+		*cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7 - 2);
+		*cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
+		*cs++ = 0;
+		*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+		*cs++ = 0; /* offset */
+		*cs++ = instance;
+		*cs++ = value;
+		*cs++ = MI_NOOP;
+	} else {
+		GEM_BUG_ON(instance);
+		*cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
+		*cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
+		*cs++ = 0;
+		*cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+		*cs++ = 0;
+		*cs++ = value;
+	}
+
+	intel_ring_advance(rq, cs);
+	return 0;
+}
+
+int
+intel_context_migrate_clear(struct intel_context *ce,
+			    struct dma_fence *await,
+			    struct scatterlist *sg,
+			    enum i915_cache_level cache_level,
+			    bool is_lmem,
+			    u32 value,
+			    struct i915_request **out)
+{
+	struct sgt_dma it = sg_sgt(sg);
+	struct i915_request *rq;
+	int err;
+
+	*out = NULL;
+
+	GEM_BUG_ON(ce->ring->size < SZ_64K);
+
+	do {
+		int len;
+
+		rq = i915_request_create(ce);
+		if (IS_ERR(rq)) {
+			err = PTR_ERR(rq);
+			goto out_ce;
+		}
+
+		if (await) {
+			err = i915_request_await_dma_fence(rq, await);
+			if (err)
+				goto out_rq;
+
+			if (rq->engine->emit_init_breadcrumb) {
+				err = rq->engine->emit_init_breadcrumb(rq);
+				if (err)
+					goto out_rq;
+			}
+
+			await = NULL;
+		}
+
+		/* The PTE updates + clear must not be interrupted. */
+		err = emit_no_arbitration(rq);
+		if (err)
+			goto out_rq;
+
+		len = emit_pte(rq, &it, cache_level, is_lmem, 0, CHUNK_SZ);
+		if (len <= 0) {
+			err = len;
+			goto out_rq;
+		}
+
+		err = rq->engine->emit_flush(rq, EMIT_INVALIDATE);
+		if (err)
+			goto out_rq;
+
+		err = emit_clear(rq, len, value);
+
+		/* Arbitration is re-enabled between requests. */
+out_rq:
+		if (*out)
+			i915_request_put(*out);
+		*out = i915_request_get(rq);
+		i915_request_add(rq);
+		if (err || !it.sg || !sg_dma_len(it.sg))
+			break;
+
+		cond_resched();
+	} while (1);
+
+out_ce:
+	return err;
+}
+
 int intel_migrate_copy(struct intel_migrate *m,
 		       struct i915_gem_ww_ctx *ww,
 		       struct dma_fence *await,
@@ -526,6 +634,41 @@ int intel_migrate_copy(struct intel_migrate *m,
 	return err;
 }
 
+int
+intel_migrate_clear(struct intel_migrate *m,
+		    struct i915_gem_ww_ctx *ww,
+		    struct dma_fence *await,
+		    struct scatterlist *sg,
+		    enum i915_cache_level cache_level,
+		    bool is_lmem,
+		    u32 value,
+		    struct i915_request **out)
+{
+	struct intel_context *ce;
+	int err;
+
+	*out = NULL;
+	if (!m->context)
+		return -ENODEV;
+
+	ce = intel_migrate_create_context(m);
+	if (IS_ERR(ce))
+		ce = intel_context_get(m->context);
+	GEM_BUG_ON(IS_ERR(ce));
+
+	err = intel_context_pin_ww(ce, ww);
+	if (err)
+		goto out;
+
+	err = intel_context_migrate_clear(ce, await, sg, cache_level,
+					  is_lmem, value, out);
+
+	intel_context_unpin(ce);
+out:
+	intel_context_put(ce);
+	return err;
+}
+
 void intel_migrate_fini(struct intel_migrate *m)
 {
 	struct intel_context *ce;
diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.h b/drivers/gpu/drm/i915/gt/intel_migrate.h
index 32c61190ed73..4e18e755a00b 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.h
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.h
@@ -6,6 +6,8 @@
 #ifndef __INTEL_MIGRATE__
 #define __INTEL_MIGRATE__
 
+#include <linux/types.h>
+
 #include "intel_migrate_types.h"
 
 struct dma_fence;
@@ -40,6 +42,24 @@ int intel_context_migrate_copy(struct intel_context *ce,
 			       bool dst_is_lmem,
 			       struct i915_request **out);
 
+int
+intel_migrate_clear(struct intel_migrate *m,
+		    struct i915_gem_ww_ctx *ww,
+		    struct dma_fence *await,
+		    struct scatterlist *sg,
+		    enum i915_cache_level cache_level,
+		    bool is_lmem,
+		    u32 value,
+		    struct i915_request **out);
+int
+intel_context_migrate_clear(struct intel_context *ce,
+			    struct dma_fence *await,
+			    struct scatterlist *sg,
+			    enum i915_cache_level cache_level,
+			    bool is_lmem,
+			    u32 value,
+			    struct i915_request **out);
+
 void intel_migrate_fini(struct intel_migrate *m);
 
 #endif /* __INTEL_MIGRATE__ */
diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c b/drivers/gpu/drm/i915/gt/selftest_migrate.c
index 9784d149ebf1..159c8656e1b0 100644
--- a/drivers/gpu/drm/i915/gt/selftest_migrate.c
+++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c
@@ -129,6 +129,82 @@ static int copy(struct intel_migrate *migrate,
 	return err;
 }
 
+static int clear(struct intel_migrate *migrate,
+		 int (*fn)(struct intel_migrate *migrate,
+			   struct i915_gem_ww_ctx *ww,
+			   struct drm_i915_gem_object *obj,
+			   u32 value,
+			   struct i915_request **out),
+		 u32 sz, struct rnd_state *prng)
+{
+	struct drm_i915_private *i915 = migrate->context->engine->i915;
+	struct drm_i915_gem_object *obj;
+	struct i915_request *rq;
+	struct i915_gem_ww_ctx ww;
+	u32 *vaddr;
+	int err = 0;
+	int i;
+
+	obj = create_lmem_or_internal(i915, sz);
+	if (IS_ERR(obj))
+		return 0;
+
+	for_i915_gem_ww(&ww, err, true) {
+		err = i915_gem_object_lock(obj, &ww);
+		if (err)
+			continue;
+
+		vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC);
+		if (IS_ERR(vaddr)) {
+			err = PTR_ERR(vaddr);
+			continue;
+		}
+
+		for (i = 0; i < sz / sizeof(u32); i++)
+			vaddr[i] = ~i;
+		i915_gem_object_flush_map(obj);
+
+		err = fn(migrate, &ww, obj, sz, &rq);
+		if (!err)
+			continue;
+
+		if (err != -EDEADLK && err != -EINTR && err != -ERESTARTSYS)
+			pr_err("%ps failed, size: %u\n", fn, sz);
+		if (rq) {
+			i915_request_wait(rq, 0, HZ);
+			i915_request_put(rq);
+		}
+		i915_gem_object_unpin_map(obj);
+	}
+	if (err)
+		goto err_out;
+
+	if (rq) {
+		if (i915_request_wait(rq, 0, HZ) < 0) {
+			pr_err("%ps timed out, size: %u\n", fn, sz);
+			err = -ETIME;
+		}
+		i915_request_put(rq);
+	}
+
+	for (i = 0; !err && i < sz / PAGE_SIZE; i++) {
+		int x = i * 1024 + i915_prandom_u32_max_state(1024, prng);
+
+		if (vaddr[x] != sz) {
+			pr_err("%ps failed, size: %u, offset: %zu\n",
+			       fn, sz, x * sizeof(u32));
+			igt_hexdump(vaddr + i * 1024, 4096);
+			err = -EINVAL;
+		}
+	}
+
+	i915_gem_object_unpin_map(obj);
+err_out:
+	i915_gem_object_put(obj);
+
+	return err;
+}
+
 static int __migrate_copy(struct intel_migrate *migrate,
 			  struct i915_gem_ww_ctx *ww,
 			  struct drm_i915_gem_object *src,
@@ -169,6 +245,44 @@ global_copy(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng)
 	return copy(migrate, __global_copy, sz, prng);
 }
 
+static int __migrate_clear(struct intel_migrate *migrate,
+			   struct i915_gem_ww_ctx *ww,
+			   struct drm_i915_gem_object *obj,
+			   u32 value,
+			   struct i915_request **out)
+{
+	return intel_migrate_clear(migrate, ww, NULL,
+				   obj->mm.pages->sgl,
+				   obj->cache_level,
+				   i915_gem_object_is_lmem(obj),
+				   value, out);
+}
+
+static int __global_clear(struct intel_migrate *migrate,
+			  struct i915_gem_ww_ctx *ww,
+			  struct drm_i915_gem_object *obj,
+			  u32 value,
+			  struct i915_request **out)
+{
+	return intel_context_migrate_clear(migrate->context, NULL,
+					   obj->mm.pages->sgl,
+					   obj->cache_level,
+					   i915_gem_object_is_lmem(obj),
+					   value, out);
+}
+
+static int
+migrate_clear(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng)
+{
+	return clear(migrate, __migrate_clear, sz, prng);
+}
+
+static int
+global_clear(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng)
+{
+	return clear(migrate, __global_clear, sz, prng);
+}
+
 static int live_migrate_copy(void *arg)
 {
 	struct intel_migrate *migrate = arg;
@@ -190,6 +304,28 @@ static int live_migrate_copy(void *arg)
 	return 0;
 }
 
+static int live_migrate_clear(void *arg)
+{
+	struct intel_migrate *migrate = arg;
+	struct drm_i915_private *i915 = migrate->context->engine->i915;
+	I915_RND_STATE(prng);
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(sizes); i++) {
+		int err;
+
+		err = migrate_clear(migrate, sizes[i], &prng);
+		if (err == 0)
+			err = global_clear(migrate, sizes[i], &prng);
+
+		i915_gem_drain_freed_objects(i915);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 struct threaded_migrate {
 	struct intel_migrate *migrate;
 	struct task_struct *tsk;
@@ -271,12 +407,39 @@ static int thread_global_copy(void *arg)
 	return threaded_migrate(arg, __thread_global_copy, 0);
 }
 
+static int __thread_migrate_clear(void *arg)
+{
+	struct threaded_migrate *tm = arg;
+
+	return migrate_clear(tm->migrate, 2 * CHUNK_SZ, &tm->prng);
+}
+
+static int __thread_global_clear(void *arg)
+{
+	struct threaded_migrate *tm = arg;
+
+	return global_clear(tm->migrate, 2 * CHUNK_SZ, &tm->prng);
+}
+
+static int thread_migrate_clear(void *arg)
+{
+	return threaded_migrate(arg, __thread_migrate_clear, 0);
+}
+
+static int thread_global_clear(void *arg)
+{
+	return threaded_migrate(arg, __thread_global_clear, 0);
+}
+
 int intel_migrate_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(live_migrate_copy),
+		SUBTEST(live_migrate_clear),
 		SUBTEST(thread_migrate_copy),
+		SUBTEST(thread_migrate_clear),
 		SUBTEST(thread_global_copy),
+		SUBTEST(thread_global_clear),
 	};
 	struct intel_migrate m;
 	int err;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 09/12] drm/i915/gt: Setup a default migration context on the GT
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:26   ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld, Chris Wilson

From: Chris Wilson <chris@chris-wilson.co.uk>

Set up a default migration context on the GT and use it from the
selftests.
Add a perf selftest and make sure we exercise LMEM if available.
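
As a usage sketch (not part of the patch), once the default context hangs off
the GT a caller can reach it directly, mirroring __perf_clear_blt() below. The
only extra check needed is that gt->migrate.context exists, since
intel_migrate_init() can fail (e.g. when no suitable engine is present); the
function name here is hypothetical.

static int default_ctx_clear_sketch(struct intel_gt *gt,
                                    struct scatterlist *sg, bool is_lmem)
{
        struct i915_request *rq;
        int err;

        if (!gt->migrate.context)       /* migrate init failed or unsupported */
                return -ENODEV;

        err = intel_context_migrate_clear(gt->migrate.context, NULL, sg,
                                          I915_CACHE_NONE, is_lmem,
                                          0 /* clear value */, &rq);
        if (rq) {
                if (i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT) < 0)
                        err = -EIO;
                i915_request_put(rq);
        }

        return err;
}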

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
---
v3:
- Skip checks for lmem presence before creating objects
  (Reported by Matthew Auld)
---
 drivers/gpu/drm/i915/gt/intel_gt.c            |   4 +
 drivers/gpu/drm/i915/gt/intel_gt_types.h      |   3 +
 drivers/gpu/drm/i915/gt/intel_migrate.c       |   2 +
 drivers/gpu/drm/i915/gt/selftest_migrate.c    | 237 +++++++++++++++++-
 .../drm/i915/selftests/i915_perf_selftests.h  |   1 +
 5 files changed, 236 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 2161bf01ef8b..67ef057ae918 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -13,6 +13,7 @@
 #include "intel_gt_clock_utils.h"
 #include "intel_gt_pm.h"
 #include "intel_gt_requests.h"
+#include "intel_migrate.h"
 #include "intel_mocs.h"
 #include "intel_rc6.h"
 #include "intel_renderstate.h"
@@ -626,6 +627,8 @@ int intel_gt_init(struct intel_gt *gt)
 	if (err)
 		goto err_gt;
 
+	intel_migrate_init(&gt->migrate, gt);
+
 	goto out_fw;
 err_gt:
 	__intel_gt_disable(gt);
@@ -649,6 +652,7 @@ void intel_gt_driver_remove(struct intel_gt *gt)
 {
 	__intel_gt_disable(gt);
 
+	intel_migrate_fini(&gt->migrate);
 	intel_uc_driver_remove(&gt->uc);
 
 	intel_engines_release(gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index fecfacf551d5..7450935f2ca8 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -24,6 +24,7 @@
 #include "intel_reset_types.h"
 #include "intel_rc6_types.h"
 #include "intel_rps_types.h"
+#include "intel_migrate_types.h"
 #include "intel_wakeref.h"
 
 struct drm_i915_private;
@@ -145,6 +146,8 @@ struct intel_gt {
 
 	struct i915_vma *scratch;
 
+	struct intel_migrate migrate;
+
 	struct intel_gt_info {
 		intel_engine_mask_t engine_mask;
 		u8 num_engines;
diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
index ba4009120b33..23c59ce66cee 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -418,6 +418,7 @@ intel_context_migrate_copy(struct intel_context *ce,
 	struct i915_request *rq;
 	int err;
 
+	GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm);
 	*out = NULL;
 
 	GEM_BUG_ON(ce->ring->size < SZ_64K);
@@ -536,6 +537,7 @@ intel_context_migrate_clear(struct intel_context *ce,
 	struct i915_request *rq;
 	int err;
 
+	GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm);
 	*out = NULL;
 
 	GEM_BUG_ON(ce->ring->size < SZ_64K);
diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c b/drivers/gpu/drm/i915/gt/selftest_migrate.c
index 159c8656e1b0..12ef2837c89b 100644
--- a/drivers/gpu/drm/i915/gt/selftest_migrate.c
+++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c
@@ -3,6 +3,8 @@
  * Copyright © 2020 Intel Corporation
  */
 
+#include <linux/sort.h>
+
 #include "selftests/i915_random.h"
 
 static const unsigned int sizes[] = {
@@ -18,13 +20,11 @@ static const unsigned int sizes[] = {
 static struct drm_i915_gem_object *
 create_lmem_or_internal(struct drm_i915_private *i915, size_t size)
 {
-	if (HAS_LMEM(i915)) {
-		struct drm_i915_gem_object *obj;
+	struct drm_i915_gem_object *obj;
 
-		obj = i915_gem_object_create_lmem(i915, size, 0);
-		if (!IS_ERR(obj))
-			return obj;
-	}
+	obj = i915_gem_object_create_lmem(i915, size, 0);
+	if (!IS_ERR(obj))
+		return obj;
 
 	return i915_gem_object_create_internal(i915, size);
 }
@@ -441,14 +441,229 @@ int intel_migrate_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(thread_global_copy),
 		SUBTEST(thread_global_clear),
 	};
-	struct intel_migrate m;
+	struct intel_gt *gt = &i915->gt;
+
+	if (!gt->migrate.context)
+		return 0;
+
+	return i915_subtests(tests, &gt->migrate);
+}
+
+static struct drm_i915_gem_object *
+create_init_lmem_internal(struct intel_gt *gt, size_t sz, bool try_lmem)
+{
+	struct drm_i915_gem_object *obj = NULL;
 	int err;
 
-	if (intel_migrate_init(&m, &i915->gt))
+	if (try_lmem)
+		obj = i915_gem_object_create_lmem(gt->i915, sz, 0);
+
+	if (IS_ERR_OR_NULL(obj)) {
+		obj = i915_gem_object_create_internal(gt->i915, sz);
+		if (IS_ERR(obj))
+			return obj;
+	}
+
+	i915_gem_object_trylock(obj);
+	err = i915_gem_object_pin_pages(obj);
+	if (err) {
+		i915_gem_object_unlock(obj);
+		i915_gem_object_put(obj);
+		return ERR_PTR(err);
+	}
+
+	return obj;
+}
+
+static int wrap_ktime_compare(const void *A, const void *B)
+{
+	const ktime_t *a = A, *b = B;
+
+	return ktime_compare(*a, *b);
+}
+
+static int __perf_clear_blt(struct intel_context *ce,
+			    struct scatterlist *sg,
+			    enum i915_cache_level cache_level,
+			    bool is_lmem,
+			    size_t sz)
+{
+	ktime_t t[5];
+	int pass;
+	int err = 0;
+
+	for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
+		struct i915_request *rq;
+		ktime_t t0, t1;
+
+		t0 = ktime_get();
+
+		err = intel_context_migrate_clear(ce, NULL, sg, cache_level,
+						  is_lmem, 0, &rq);
+		if (rq) {
+			if (i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT) < 0)
+				err = -EIO;
+			i915_request_put(rq);
+		}
+		if (err)
+			break;
+
+		t1 = ktime_get();
+		t[pass] = ktime_sub(t1, t0);
+	}
+	if (err)
+		return err;
+
+	sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
+	pr_info("%s: %zd KiB fill: %lld MiB/s\n",
+		ce->engine->name, sz >> 10,
+		div64_u64(mul_u32_u32(4 * sz,
+				      1000 * 1000 * 1000),
+			  t[1] + 2 * t[2] + t[3]) >> 20);
+	return 0;
+}
+
+static int perf_clear_blt(void *arg)
+{
+	struct intel_gt *gt = arg;
+	static const unsigned long sizes[] = {
+		SZ_4K,
+		SZ_64K,
+		SZ_2M,
+		SZ_64M
+	};
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(sizes); i++) {
+		struct drm_i915_gem_object *dst;
+		int err;
+
+		dst = create_init_lmem_internal(gt, sizes[i], true);
+		if (IS_ERR(dst))
+			return PTR_ERR(dst);
+
+		err = __perf_clear_blt(gt->migrate.context,
+				       dst->mm.pages->sgl,
+				       I915_CACHE_NONE,
+				       i915_gem_object_is_lmem(dst),
+				       sizes[i]);
+
+		i915_gem_object_unlock(dst);
+		i915_gem_object_put(dst);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int __perf_copy_blt(struct intel_context *ce,
+			   struct scatterlist *src,
+			   enum i915_cache_level src_cache_level,
+			   bool src_is_lmem,
+			   struct scatterlist *dst,
+			   enum i915_cache_level dst_cache_level,
+			   bool dst_is_lmem,
+			   size_t sz)
+{
+	ktime_t t[5];
+	int pass;
+	int err = 0;
+
+	for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
+		struct i915_request *rq;
+		ktime_t t0, t1;
+
+		t0 = ktime_get();
+
+		err = intel_context_migrate_copy(ce, NULL,
+						 src, src_cache_level,
+						 src_is_lmem,
+						 dst, dst_cache_level,
+						 dst_is_lmem,
+						 &rq);
+		if (rq) {
+			if (i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT) < 0)
+				err = -EIO;
+			i915_request_put(rq);
+		}
+		if (err)
+			break;
+
+		t1 = ktime_get();
+		t[pass] = ktime_sub(t1, t0);
+	}
+	if (err)
+		return err;
+
+	sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
+	pr_info("%s: %zd KiB copy: %lld MiB/s\n",
+		ce->engine->name, sz >> 10,
+		div64_u64(mul_u32_u32(4 * sz,
+				      1000 * 1000 * 1000),
+			  t[1] + 2 * t[2] + t[3]) >> 20);
+	return 0;
+}
+
+static int perf_copy_blt(void *arg)
+{
+	struct intel_gt *gt = arg;
+	static const unsigned long sizes[] = {
+		SZ_4K,
+		SZ_64K,
+		SZ_2M,
+		SZ_64M
+	};
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(sizes); i++) {
+		struct drm_i915_gem_object *src, *dst;
+		int err;
+
+		src = create_init_lmem_internal(gt, sizes[i], true);
+		if (IS_ERR(src))
+			return PTR_ERR(src);
+
+		dst = create_init_lmem_internal(gt, sizes[i], false);
+		if (IS_ERR(dst)) {
+			err = PTR_ERR(dst);
+			goto err_src;
+		}
+
+		err = __perf_copy_blt(gt->migrate.context,
+				      src->mm.pages->sgl,
+				      I915_CACHE_NONE,
+				      i915_gem_object_is_lmem(src),
+				      dst->mm.pages->sgl,
+				      I915_CACHE_NONE,
+				      i915_gem_object_is_lmem(dst),
+				      sizes[i]);
+
+		i915_gem_object_unlock(dst);
+		i915_gem_object_put(dst);
+err_src:
+		i915_gem_object_unlock(src);
+		i915_gem_object_put(src);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+int intel_migrate_perf_selftests(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(perf_clear_blt),
+		SUBTEST(perf_copy_blt),
+	};
+	struct intel_gt *gt = &i915->gt;
+
+	if (intel_gt_is_wedged(gt))
 		return 0;
 
-	err = i915_subtests(tests, &m);
-	intel_migrate_fini(&m);
+	if (!gt->migrate.context)
+		return 0;
 
-	return err;
+	return intel_gt_live_subtests(tests, gt);
 }
diff --git a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
index c2389f8a257d..5077dc3c3b8c 100644
--- a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
@@ -17,5 +17,6 @@
  */
 selftest(engine_cs, intel_engine_cs_perf_selftests)
 selftest(request, i915_request_perf_selftests)
+selftest(migrate, intel_migrate_perf_selftests)
 selftest(blt, i915_gem_object_blt_perf_selftests)
 selftest(region, intel_memory_region_perf_selftests)
-- 
2.31.1


* [PATCH v3 10/12] drm/i915/ttm: accelerated move implementation
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:26   ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: matthew.auld

From: Ramalingam C <ramalingam.c@intel.com>

Invoke the pipelined page migration and clear through the blitter for
i915_ttm_move requests, covering both eviction and object clearing.
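
A condensed sketch of the copy branch added below (the clear branch is
analogous): placements at or above TTM_PL_PRIV are treated as lmem, and the
request is waited on synchronously for now, matching the rest of the series.
The helper name is only for illustration and error handling is elided.

static int ttm_accel_copy_sketch(struct drm_i915_private *i915,
                                 struct sg_table *src_st, bool src_is_lmem,
                                 struct sg_table *dst_st, bool dst_is_lmem)
{
        struct i915_request *rq;
        int ret;

        intel_engine_pm_get(i915->gt.migrate.context->engine);
        ret = intel_context_migrate_copy(i915->gt.migrate.context, NULL,
                                         src_st->sgl, I915_CACHE_NONE,
                                         src_is_lmem,
                                         dst_st->sgl, I915_CACHE_NONE,
                                         dst_is_lmem, &rq);
        if (!ret && rq) {
                i915_request_wait(rq, 0, HZ);   /* synchronous for now */
                i915_request_put(rq);
        }
        intel_engine_pm_put(i915->gt.migrate.context->engine);

        return ret;
}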

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
---
v2:
 - subfunction for accel_move (Thomas)
 - engine_pm_get/put around context_move/clear (Thomas)
 - Invalidation at accel_clear (Thomas)
v3:
 - conflict resolution s/&bo->mem/bo->resource/g
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 87 +++++++++++++++++++++----
 1 file changed, 74 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index bf33724bed5c..08b72c280cb5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -15,6 +15,9 @@
 #include "gem/i915_gem_ttm.h"
 #include "gem/i915_gem_mman.h"
 
+#include "gt/intel_migrate.h"
+#include "gt/intel_engine_pm.h"
+
 #define I915_PL_LMEM0 TTM_PL_PRIV
 #define I915_PL_SYSTEM TTM_PL_SYSTEM
 #define I915_PL_STOLEN TTM_PL_VRAM
@@ -282,6 +285,61 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
 	return intel_region_ttm_node_to_st(obj->mm.region, res);
 }
 
+static int i915_ttm_accel_move(struct ttm_buffer_object *bo,
+			       struct ttm_resource *dst_mem,
+			       struct sg_table *dst_st)
+{
+	struct drm_i915_private *i915 = container_of(bo->bdev, typeof(*i915),
+						     bdev);
+	struct ttm_resource_manager *src_man =
+		ttm_manager_type(bo->bdev, bo->resource->mem_type);
+	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
+	struct sg_table *src_st;
+	struct i915_request *rq;
+	int ret;
+
+	if (!i915->gt.migrate.context)
+		return -EINVAL;
+
+	if (!bo->ttm || !ttm_tt_is_populated(bo->ttm)) {
+		if (bo->type == ttm_bo_type_kernel)
+			return -EINVAL;
+
+		if (bo->ttm &&
+		    !(bo->ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC))
+			return 0;
+
+		intel_engine_pm_get(i915->gt.migrate.context->engine);
+		ret = intel_context_migrate_clear(i915->gt.migrate.context, NULL,
+						  dst_st->sgl, I915_CACHE_NONE,
+						  dst_mem->mem_type >= TTM_PL_PRIV,
+						  0, &rq);
+
+		if (!ret && rq) {
+			i915_request_wait(rq, 0, HZ);
+			i915_request_put(rq);
+		}
+		intel_engine_pm_put(i915->gt.migrate.context->engine);
+	} else {
+		src_st = src_man->use_tt ? i915_ttm_tt_get_st(bo->ttm) :
+						obj->ttm.cached_io_st;
+
+		intel_engine_pm_get(i915->gt.migrate.context->engine);
+		ret = intel_context_migrate_copy(i915->gt.migrate.context,
+						 NULL, src_st->sgl, I915_CACHE_NONE,
+						 bo->resource->mem_type >= TTM_PL_PRIV,
+						 dst_st->sgl, I915_CACHE_NONE,
+						 dst_mem->mem_type >= TTM_PL_PRIV, &rq);
+		if (!ret && rq) {
+			i915_request_wait(rq, 0, HZ);
+			i915_request_put(rq);
+		}
+		intel_engine_pm_put(i915->gt.migrate.context->engine);
+	}
+
+	return ret;
+}
+
 static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
 			 struct ttm_operation_ctx *ctx,
 			 struct ttm_resource *dst_mem,
@@ -332,19 +390,22 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
 	if (IS_ERR(dst_st))
 		return PTR_ERR(dst_st);
 
-	/* If we start mapping GGTT, we can no longer use man::use_tt here. */
-	dst_iter = dst_man->use_tt ?
-		ttm_kmap_iter_tt_init(&_dst_iter.tt, bo->ttm) :
-		ttm_kmap_iter_iomap_init(&_dst_iter.io, &dst_reg->iomap,
-					 dst_st, dst_reg->region.start);
-
-	src_iter = src_man->use_tt ?
-		ttm_kmap_iter_tt_init(&_src_iter.tt, bo->ttm) :
-		ttm_kmap_iter_iomap_init(&_src_iter.io, &src_reg->iomap,
-					 obj->ttm.cached_io_st,
-					 src_reg->region.start);
-
-	ttm_move_memcpy(bo, dst_mem->num_pages, dst_iter, src_iter);
+	ret = i915_ttm_accel_move(bo, dst_mem, dst_st);
+	if (ret) {
+		/* If we start mapping GGTT, we can no longer use man::use_tt here. */
+		dst_iter = dst_man->use_tt ?
+			ttm_kmap_iter_tt_init(&_dst_iter.tt, bo->ttm) :
+			ttm_kmap_iter_iomap_init(&_dst_iter.io, &dst_reg->iomap,
+						 dst_st, dst_reg->region.start);
+
+		src_iter = src_man->use_tt ?
+			ttm_kmap_iter_tt_init(&_src_iter.tt, bo->ttm) :
+			ttm_kmap_iter_iomap_init(&_src_iter.io, &src_reg->iomap,
+						 obj->ttm.cached_io_st,
+						 src_reg->region.start);
+
+		ttm_move_memcpy(bo, dst_mem->num_pages, dst_iter, src_iter);
+	}
 	ttm_bo_move_sync_cleanup(bo, dst_mem);
 	i915_ttm_free_cached_io_st(obj);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 11/12] drm/i915/gem: Zap the client blt code
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:26   ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

It's not used anywhere.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 -
 .../gpu/drm/i915/gem/i915_gem_client_blt.c    | 355 ---------
 .../gpu/drm/i915/gem/i915_gem_client_blt.h    |  21 -
 .../i915/gem/selftests/i915_gem_client_blt.c  | 704 ------------------
 .../drm/i915/selftests/i915_live_selftests.h  |   1 -
 5 files changed, 1082 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
 delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_client_blt.h
 delete mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index de4cb9c52585..ca07474ec2df 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -136,7 +136,6 @@ i915-y += $(gt-y)
 gem-y += \
 	gem/i915_gem_busy.o \
 	gem/i915_gem_clflush.o \
-	gem/i915_gem_client_blt.o \
 	gem/i915_gem_context.o \
 	gem/i915_gem_create.o \
 	gem/i915_gem_dmabuf.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
deleted file mode 100644
index 44821d94544f..000000000000
--- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
+++ /dev/null
@@ -1,355 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2019 Intel Corporation
- */
-
-#include "i915_drv.h"
-#include "gt/intel_context.h"
-#include "gt/intel_engine_pm.h"
-#include "i915_gem_client_blt.h"
-#include "i915_gem_object_blt.h"
-
-struct i915_sleeve {
-	struct i915_vma *vma;
-	struct drm_i915_gem_object *obj;
-	struct sg_table *pages;
-	struct i915_page_sizes page_sizes;
-};
-
-static int vma_set_pages(struct i915_vma *vma)
-{
-	struct i915_sleeve *sleeve = vma->private;
-
-	vma->pages = sleeve->pages;
-	vma->page_sizes = sleeve->page_sizes;
-
-	return 0;
-}
-
-static void vma_clear_pages(struct i915_vma *vma)
-{
-	GEM_BUG_ON(!vma->pages);
-	vma->pages = NULL;
-}
-
-static void vma_bind(struct i915_address_space *vm,
-		     struct i915_vm_pt_stash *stash,
-		     struct i915_vma *vma,
-		     enum i915_cache_level cache_level,
-		     u32 flags)
-{
-	vm->vma_ops.bind_vma(vm, stash, vma, cache_level, flags);
-}
-
-static void vma_unbind(struct i915_address_space *vm, struct i915_vma *vma)
-{
-	vm->vma_ops.unbind_vma(vm, vma);
-}
-
-static const struct i915_vma_ops proxy_vma_ops = {
-	.set_pages = vma_set_pages,
-	.clear_pages = vma_clear_pages,
-	.bind_vma = vma_bind,
-	.unbind_vma = vma_unbind,
-};
-
-static struct i915_sleeve *create_sleeve(struct i915_address_space *vm,
-					 struct drm_i915_gem_object *obj,
-					 struct sg_table *pages,
-					 struct i915_page_sizes *page_sizes)
-{
-	struct i915_sleeve *sleeve;
-	struct i915_vma *vma;
-	int err;
-
-	sleeve = kzalloc(sizeof(*sleeve), GFP_KERNEL);
-	if (!sleeve)
-		return ERR_PTR(-ENOMEM);
-
-	vma = i915_vma_instance(obj, vm, NULL);
-	if (IS_ERR(vma)) {
-		err = PTR_ERR(vma);
-		goto err_free;
-	}
-
-	vma->private = sleeve;
-	vma->ops = &proxy_vma_ops;
-
-	sleeve->vma = vma;
-	sleeve->pages = pages;
-	sleeve->page_sizes = *page_sizes;
-
-	return sleeve;
-
-err_free:
-	kfree(sleeve);
-	return ERR_PTR(err);
-}
-
-static void destroy_sleeve(struct i915_sleeve *sleeve)
-{
-	kfree(sleeve);
-}
-
-struct clear_pages_work {
-	struct dma_fence dma;
-	struct dma_fence_cb cb;
-	struct i915_sw_fence wait;
-	struct work_struct work;
-	struct irq_work irq_work;
-	struct i915_sleeve *sleeve;
-	struct intel_context *ce;
-	u32 value;
-};
-
-static const char *clear_pages_work_driver_name(struct dma_fence *fence)
-{
-	return DRIVER_NAME;
-}
-
-static const char *clear_pages_work_timeline_name(struct dma_fence *fence)
-{
-	return "clear";
-}
-
-static void clear_pages_work_release(struct dma_fence *fence)
-{
-	struct clear_pages_work *w = container_of(fence, typeof(*w), dma);
-
-	destroy_sleeve(w->sleeve);
-
-	i915_sw_fence_fini(&w->wait);
-
-	BUILD_BUG_ON(offsetof(typeof(*w), dma));
-	dma_fence_free(&w->dma);
-}
-
-static const struct dma_fence_ops clear_pages_work_ops = {
-	.get_driver_name = clear_pages_work_driver_name,
-	.get_timeline_name = clear_pages_work_timeline_name,
-	.release = clear_pages_work_release,
-};
-
-static void clear_pages_signal_irq_worker(struct irq_work *work)
-{
-	struct clear_pages_work *w = container_of(work, typeof(*w), irq_work);
-
-	dma_fence_signal(&w->dma);
-	dma_fence_put(&w->dma);
-}
-
-static void clear_pages_dma_fence_cb(struct dma_fence *fence,
-				     struct dma_fence_cb *cb)
-{
-	struct clear_pages_work *w = container_of(cb, typeof(*w), cb);
-
-	if (fence->error)
-		dma_fence_set_error(&w->dma, fence->error);
-
-	/*
-	 * Push the signalling of the fence into yet another worker to avoid
-	 * the nightmare locking around the fence spinlock.
-	 */
-	irq_work_queue(&w->irq_work);
-}
-
-static void clear_pages_worker(struct work_struct *work)
-{
-	struct clear_pages_work *w = container_of(work, typeof(*w), work);
-	struct drm_i915_gem_object *obj = w->sleeve->vma->obj;
-	struct i915_vma *vma = w->sleeve->vma;
-	struct i915_gem_ww_ctx ww;
-	struct i915_request *rq;
-	struct i915_vma *batch;
-	int err = w->dma.error;
-
-	if (unlikely(err))
-		goto out_signal;
-
-	if (obj->cache_dirty) {
-		if (i915_gem_object_has_struct_page(obj))
-			drm_clflush_sg(w->sleeve->pages);
-		obj->cache_dirty = false;
-	}
-	obj->read_domains = I915_GEM_GPU_DOMAINS;
-	obj->write_domain = 0;
-
-	i915_gem_ww_ctx_init(&ww, false);
-	intel_engine_pm_get(w->ce->engine);
-retry:
-	err = intel_context_pin_ww(w->ce, &ww);
-	if (err)
-		goto out_signal;
-
-	batch = intel_emit_vma_fill_blt(w->ce, vma, &ww, w->value);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto out_ctx;
-	}
-
-	rq = i915_request_create(w->ce);
-	if (IS_ERR(rq)) {
-		err = PTR_ERR(rq);
-		goto out_batch;
-	}
-
-	/* There's no way the fence has signalled */
-	if (dma_fence_add_callback(&rq->fence, &w->cb,
-				   clear_pages_dma_fence_cb))
-		GEM_BUG_ON(1);
-
-	err = intel_emit_vma_mark_active(batch, rq);
-	if (unlikely(err))
-		goto out_request;
-
-	/*
-	 * w->dma is already exported via (vma|obj)->resv we need only
-	 * keep track of the GPU activity within this vma/request, and
-	 * propagate the signal from the request to w->dma.
-	 */
-	err = __i915_vma_move_to_active(vma, rq);
-	if (err)
-		goto out_request;
-
-	if (rq->engine->emit_init_breadcrumb) {
-		err = rq->engine->emit_init_breadcrumb(rq);
-		if (unlikely(err))
-			goto out_request;
-	}
-
-	err = rq->engine->emit_bb_start(rq,
-					batch->node.start, batch->node.size,
-					0);
-out_request:
-	if (unlikely(err)) {
-		i915_request_set_error_once(rq, err);
-		err = 0;
-	}
-
-	i915_request_add(rq);
-out_batch:
-	intel_emit_vma_release(w->ce, batch);
-out_ctx:
-	intel_context_unpin(w->ce);
-out_signal:
-	if (err == -EDEADLK) {
-		err = i915_gem_ww_ctx_backoff(&ww);
-		if (!err)
-			goto retry;
-	}
-	i915_gem_ww_ctx_fini(&ww);
-
-	i915_vma_unpin(w->sleeve->vma);
-	intel_engine_pm_put(w->ce->engine);
-
-	if (unlikely(err)) {
-		dma_fence_set_error(&w->dma, err);
-		dma_fence_signal(&w->dma);
-		dma_fence_put(&w->dma);
-	}
-}
-
-static int pin_wait_clear_pages_work(struct clear_pages_work *w,
-				     struct intel_context *ce)
-{
-	struct i915_vma *vma = w->sleeve->vma;
-	struct i915_gem_ww_ctx ww;
-	int err;
-
-	i915_gem_ww_ctx_init(&ww, false);
-retry:
-	err = i915_gem_object_lock(vma->obj, &ww);
-	if (err)
-		goto out;
-
-	err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_USER);
-	if (unlikely(err))
-		goto out;
-
-	err = i915_sw_fence_await_reservation(&w->wait,
-					      vma->obj->base.resv, NULL,
-					      true, 0, I915_FENCE_GFP);
-	if (err)
-		goto err_unpin_vma;
-
-	dma_resv_add_excl_fence(vma->obj->base.resv, &w->dma);
-
-err_unpin_vma:
-	if (err)
-		i915_vma_unpin(vma);
-out:
-	if (err == -EDEADLK) {
-		err = i915_gem_ww_ctx_backoff(&ww);
-		if (!err)
-			goto retry;
-	}
-	i915_gem_ww_ctx_fini(&ww);
-	return err;
-}
-
-static int __i915_sw_fence_call
-clear_pages_work_notify(struct i915_sw_fence *fence,
-			enum i915_sw_fence_notify state)
-{
-	struct clear_pages_work *w = container_of(fence, typeof(*w), wait);
-
-	switch (state) {
-	case FENCE_COMPLETE:
-		schedule_work(&w->work);
-		break;
-
-	case FENCE_FREE:
-		dma_fence_put(&w->dma);
-		break;
-	}
-
-	return NOTIFY_DONE;
-}
-
-static DEFINE_SPINLOCK(fence_lock);
-
-/* XXX: better name please */
-int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj,
-				     struct intel_context *ce,
-				     struct sg_table *pages,
-				     struct i915_page_sizes *page_sizes,
-				     u32 value)
-{
-	struct clear_pages_work *work;
-	struct i915_sleeve *sleeve;
-	int err;
-
-	sleeve = create_sleeve(ce->vm, obj, pages, page_sizes);
-	if (IS_ERR(sleeve))
-		return PTR_ERR(sleeve);
-
-	work = kmalloc(sizeof(*work), GFP_KERNEL);
-	if (!work) {
-		destroy_sleeve(sleeve);
-		return -ENOMEM;
-	}
-
-	work->value = value;
-	work->sleeve = sleeve;
-	work->ce = ce;
-
-	INIT_WORK(&work->work, clear_pages_worker);
-
-	init_irq_work(&work->irq_work, clear_pages_signal_irq_worker);
-
-	dma_fence_init(&work->dma, &clear_pages_work_ops, &fence_lock, 0, 0);
-	i915_sw_fence_init(&work->wait, clear_pages_work_notify);
-
-	err = pin_wait_clear_pages_work(work, ce);
-	if (err < 0)
-		dma_fence_set_error(&work->dma, err);
-
-	dma_fence_get(&work->dma);
-	i915_sw_fence_commit(&work->wait);
-
-	return err;
-}
-
-#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-#include "selftests/i915_gem_client_blt.c"
-#endif
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.h b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.h
deleted file mode 100644
index 3dbd28c22ff5..000000000000
--- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.h
+++ /dev/null
@@ -1,21 +0,0 @@
-/* SPDX-License-Identifier: MIT */
-/*
- * Copyright © 2019 Intel Corporation
- */
-#ifndef __I915_GEM_CLIENT_BLT_H__
-#define __I915_GEM_CLIENT_BLT_H__
-
-#include <linux/types.h>
-
-struct drm_i915_gem_object;
-struct i915_page_sizes;
-struct intel_context;
-struct sg_table;
-
-int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj,
-				     struct intel_context *ce,
-				     struct sg_table *pages,
-				     struct i915_page_sizes *page_sizes,
-				     u32 value);
-
-#endif
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
deleted file mode 100644
index 176e6b22f87f..000000000000
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
+++ /dev/null
@@ -1,704 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2019 Intel Corporation
- */
-
-#include "i915_selftest.h"
-
-#include "gt/intel_engine_user.h"
-#include "gt/intel_gt.h"
-#include "gt/intel_gpu_commands.h"
-#include "gem/i915_gem_lmem.h"
-
-#include "selftests/igt_flush_test.h"
-#include "selftests/mock_drm.h"
-#include "selftests/i915_random.h"
-#include "huge_gem_object.h"
-#include "mock_context.h"
-
-static int __igt_client_fill(struct intel_engine_cs *engine)
-{
-	struct intel_context *ce = engine->kernel_context;
-	struct drm_i915_gem_object *obj;
-	I915_RND_STATE(prng);
-	IGT_TIMEOUT(end);
-	u32 *vaddr;
-	int err = 0;
-
-	intel_engine_pm_get(engine);
-	do {
-		const u32 max_block_size = S16_MAX * PAGE_SIZE;
-		u32 sz = min_t(u64, ce->vm->total >> 4, prandom_u32_state(&prng));
-		u32 phys_sz = sz % (max_block_size + 1);
-		u32 val = prandom_u32_state(&prng);
-		u32 i;
-
-		sz = round_up(sz, PAGE_SIZE);
-		phys_sz = round_up(phys_sz, PAGE_SIZE);
-
-		pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
-			 phys_sz, sz, val);
-
-		obj = huge_gem_object(engine->i915, phys_sz, sz);
-		if (IS_ERR(obj)) {
-			err = PTR_ERR(obj);
-			goto err_flush;
-		}
-
-		vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
-		if (IS_ERR(vaddr)) {
-			err = PTR_ERR(vaddr);
-			goto err_put;
-		}
-
-		/*
-		 * XXX: The goal is move this to get_pages, so try to dirty the
-		 * CPU cache first to check that we do the required clflush
-		 * before scheduling the blt for !llc platforms. This matches
-		 * some version of reality where at get_pages the pages
-		 * themselves may not yet be coherent with the GPU(swap-in). If
-		 * we are missing the flush then we should see the stale cache
-		 * values after we do the set_to_cpu_domain and pick it up as a
-		 * test failure.
-		 */
-		memset32(vaddr, val ^ 0xdeadbeaf,
-			 huge_gem_object_phys_size(obj) / sizeof(u32));
-
-		if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
-			obj->cache_dirty = true;
-
-		err = i915_gem_schedule_fill_pages_blt(obj, ce, obj->mm.pages,
-						       &obj->mm.page_sizes,
-						       val);
-		if (err)
-			goto err_unpin;
-
-		i915_gem_object_lock(obj, NULL);
-		err = i915_gem_object_set_to_cpu_domain(obj, false);
-		i915_gem_object_unlock(obj);
-		if (err)
-			goto err_unpin;
-
-		for (i = 0; i < huge_gem_object_phys_size(obj) / sizeof(u32); ++i) {
-			if (vaddr[i] != val) {
-				pr_err("vaddr[%u]=%x, expected=%x\n", i,
-				       vaddr[i], val);
-				err = -EINVAL;
-				goto err_unpin;
-			}
-		}
-
-		i915_gem_object_unpin_map(obj);
-		i915_gem_object_put(obj);
-	} while (!time_after(jiffies, end));
-
-	goto err_flush;
-
-err_unpin:
-	i915_gem_object_unpin_map(obj);
-err_put:
-	i915_gem_object_put(obj);
-err_flush:
-	if (err == -ENOMEM)
-		err = 0;
-	intel_engine_pm_put(engine);
-
-	return err;
-}
-
-static int igt_client_fill(void *arg)
-{
-	int inst = 0;
-
-	do {
-		struct intel_engine_cs *engine;
-		int err;
-
-		engine = intel_engine_lookup_user(arg,
-						  I915_ENGINE_CLASS_COPY,
-						  inst++);
-		if (!engine)
-			return 0;
-
-		err = __igt_client_fill(engine);
-		if (err == -ENOMEM)
-			err = 0;
-		if (err)
-			return err;
-	} while (1);
-}
-
-#define WIDTH 512
-#define HEIGHT 32
-
-struct blit_buffer {
-	struct i915_vma *vma;
-	u32 start_val;
-	u32 tiling;
-};
-
-struct tiled_blits {
-	struct intel_context *ce;
-	struct blit_buffer buffers[3];
-	struct blit_buffer scratch;
-	struct i915_vma *batch;
-	u64 hole;
-	u32 width;
-	u32 height;
-};
-
-static int prepare_blit(const struct tiled_blits *t,
-			struct blit_buffer *dst,
-			struct blit_buffer *src,
-			struct drm_i915_gem_object *batch)
-{
-	const int ver = GRAPHICS_VER(to_i915(batch->base.dev));
-	bool use_64b_reloc = ver >= 8;
-	u32 src_pitch, dst_pitch;
-	u32 cmd, *cs;
-
-	cs = i915_gem_object_pin_map_unlocked(batch, I915_MAP_WC);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	*cs++ = MI_LOAD_REGISTER_IMM(1);
-	*cs++ = i915_mmio_reg_offset(BCS_SWCTRL);
-	cmd = (BCS_SRC_Y | BCS_DST_Y) << 16;
-	if (src->tiling == I915_TILING_Y)
-		cmd |= BCS_SRC_Y;
-	if (dst->tiling == I915_TILING_Y)
-		cmd |= BCS_DST_Y;
-	*cs++ = cmd;
-
-	cmd = MI_FLUSH_DW;
-	if (ver >= 8)
-		cmd++;
-	*cs++ = cmd;
-	*cs++ = 0;
-	*cs++ = 0;
-	*cs++ = 0;
-
-	cmd = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (8 - 2);
-	if (ver >= 8)
-		cmd += 2;
-
-	src_pitch = t->width * 4;
-	if (src->tiling) {
-		cmd |= XY_SRC_COPY_BLT_SRC_TILED;
-		src_pitch /= 4;
-	}
-
-	dst_pitch = t->width * 4;
-	if (dst->tiling) {
-		cmd |= XY_SRC_COPY_BLT_DST_TILED;
-		dst_pitch /= 4;
-	}
-
-	*cs++ = cmd;
-	*cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | dst_pitch;
-	*cs++ = 0;
-	*cs++ = t->height << 16 | t->width;
-	*cs++ = lower_32_bits(dst->vma->node.start);
-	if (use_64b_reloc)
-		*cs++ = upper_32_bits(dst->vma->node.start);
-	*cs++ = 0;
-	*cs++ = src_pitch;
-	*cs++ = lower_32_bits(src->vma->node.start);
-	if (use_64b_reloc)
-		*cs++ = upper_32_bits(src->vma->node.start);
-
-	*cs++ = MI_BATCH_BUFFER_END;
-
-	i915_gem_object_flush_map(batch);
-	i915_gem_object_unpin_map(batch);
-
-	return 0;
-}
-
-static void tiled_blits_destroy_buffers(struct tiled_blits *t)
-{
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(t->buffers); i++)
-		i915_vma_put(t->buffers[i].vma);
-
-	i915_vma_put(t->scratch.vma);
-	i915_vma_put(t->batch);
-}
-
-static struct i915_vma *
-__create_vma(struct tiled_blits *t, size_t size, bool lmem)
-{
-	struct drm_i915_private *i915 = t->ce->vm->i915;
-	struct drm_i915_gem_object *obj;
-	struct i915_vma *vma;
-
-	if (lmem)
-		obj = i915_gem_object_create_lmem(i915, size, 0);
-	else
-		obj = i915_gem_object_create_shmem(i915, size);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	vma = i915_vma_instance(obj, t->ce->vm, NULL);
-	if (IS_ERR(vma))
-		i915_gem_object_put(obj);
-
-	return vma;
-}
-
-static struct i915_vma *create_vma(struct tiled_blits *t, bool lmem)
-{
-	return __create_vma(t, PAGE_ALIGN(t->width * t->height * 4), lmem);
-}
-
-static int tiled_blits_create_buffers(struct tiled_blits *t,
-				      int width, int height,
-				      struct rnd_state *prng)
-{
-	struct drm_i915_private *i915 = t->ce->engine->i915;
-	int i;
-
-	t->width = width;
-	t->height = height;
-
-	t->batch = __create_vma(t, PAGE_SIZE, false);
-	if (IS_ERR(t->batch))
-		return PTR_ERR(t->batch);
-
-	t->scratch.vma = create_vma(t, false);
-	if (IS_ERR(t->scratch.vma)) {
-		i915_vma_put(t->batch);
-		return PTR_ERR(t->scratch.vma);
-	}
-
-	for (i = 0; i < ARRAY_SIZE(t->buffers); i++) {
-		struct i915_vma *vma;
-
-		vma = create_vma(t, HAS_LMEM(i915) && i % 2);
-		if (IS_ERR(vma)) {
-			tiled_blits_destroy_buffers(t);
-			return PTR_ERR(vma);
-		}
-
-		t->buffers[i].vma = vma;
-		t->buffers[i].tiling =
-			i915_prandom_u32_max_state(I915_TILING_Y + 1, prng);
-	}
-
-	return 0;
-}
-
-static void fill_scratch(struct tiled_blits *t, u32 *vaddr, u32 val)
-{
-	int i;
-
-	t->scratch.start_val = val;
-	for (i = 0; i < t->width * t->height; i++)
-		vaddr[i] = val++;
-
-	i915_gem_object_flush_map(t->scratch.vma->obj);
-}
-
-static u64 swizzle_bit(unsigned int bit, u64 offset)
-{
-	return (offset & BIT_ULL(bit)) >> (bit - 6);
-}
-
-static u64 tiled_offset(const struct intel_gt *gt,
-			u64 v,
-			unsigned int stride,
-			unsigned int tiling)
-{
-	unsigned int swizzle;
-	u64 x, y;
-
-	if (tiling == I915_TILING_NONE)
-		return v;
-
-	y = div64_u64_rem(v, stride, &x);
-
-	if (tiling == I915_TILING_X) {
-		v = div64_u64_rem(y, 8, &y) * stride * 8;
-		v += y * 512;
-		v += div64_u64_rem(x, 512, &x) << 12;
-		v += x;
-
-		swizzle = gt->ggtt->bit_6_swizzle_x;
-	} else {
-		const unsigned int ytile_span = 16;
-		const unsigned int ytile_height = 512;
-
-		v = div64_u64_rem(y, 32, &y) * stride * 32;
-		v += y * ytile_span;
-		v += div64_u64_rem(x, ytile_span, &x) * ytile_height;
-		v += x;
-
-		swizzle = gt->ggtt->bit_6_swizzle_y;
-	}
-
-	switch (swizzle) {
-	case I915_BIT_6_SWIZZLE_9:
-		v ^= swizzle_bit(9, v);
-		break;
-	case I915_BIT_6_SWIZZLE_9_10:
-		v ^= swizzle_bit(9, v) ^ swizzle_bit(10, v);
-		break;
-	case I915_BIT_6_SWIZZLE_9_11:
-		v ^= swizzle_bit(9, v) ^ swizzle_bit(11, v);
-		break;
-	case I915_BIT_6_SWIZZLE_9_10_11:
-		v ^= swizzle_bit(9, v) ^ swizzle_bit(10, v) ^ swizzle_bit(11, v);
-		break;
-	}
-
-	return v;
-}
-
-static const char *repr_tiling(int tiling)
-{
-	switch (tiling) {
-	case I915_TILING_NONE: return "linear";
-	case I915_TILING_X: return "X";
-	case I915_TILING_Y: return "Y";
-	default: return "unknown";
-	}
-}
-
-static int verify_buffer(const struct tiled_blits *t,
-			 struct blit_buffer *buf,
-			 struct rnd_state *prng)
-{
-	const u32 *vaddr;
-	int ret = 0;
-	int x, y, p;
-
-	x = i915_prandom_u32_max_state(t->width, prng);
-	y = i915_prandom_u32_max_state(t->height, prng);
-	p = y * t->width + x;
-
-	vaddr = i915_gem_object_pin_map_unlocked(buf->vma->obj, I915_MAP_WC);
-	if (IS_ERR(vaddr))
-		return PTR_ERR(vaddr);
-
-	if (vaddr[0] != buf->start_val) {
-		ret = -EINVAL;
-	} else {
-		u64 v = tiled_offset(buf->vma->vm->gt,
-				     p * 4, t->width * 4,
-				     buf->tiling);
-
-		if (vaddr[v / sizeof(*vaddr)] != buf->start_val + p)
-			ret = -EINVAL;
-	}
-	if (ret) {
-		pr_err("Invalid %s tiling detected at (%d, %d), start_val %x\n",
-		       repr_tiling(buf->tiling),
-		       x, y, buf->start_val);
-		igt_hexdump(vaddr, 4096);
-	}
-
-	i915_gem_object_unpin_map(buf->vma->obj);
-	return ret;
-}
-
-static int move_to_active(struct i915_vma *vma,
-			  struct i915_request *rq,
-			  unsigned int flags)
-{
-	int err;
-
-	i915_vma_lock(vma);
-	err = i915_request_await_object(rq, vma->obj, false);
-	if (err == 0)
-		err = i915_vma_move_to_active(vma, rq, flags);
-	i915_vma_unlock(vma);
-
-	return err;
-}
-
-static int pin_buffer(struct i915_vma *vma, u64 addr)
-{
-	int err;
-
-	if (drm_mm_node_allocated(&vma->node) && vma->node.start != addr) {
-		err = i915_vma_unbind(vma);
-		if (err)
-			return err;
-	}
-
-	err = i915_vma_pin(vma, 0, 0, PIN_USER | PIN_OFFSET_FIXED | addr);
-	if (err)
-		return err;
-
-	return 0;
-}
-
-static int
-tiled_blit(struct tiled_blits *t,
-	   struct blit_buffer *dst, u64 dst_addr,
-	   struct blit_buffer *src, u64 src_addr)
-{
-	struct i915_request *rq;
-	int err;
-
-	err = pin_buffer(src->vma, src_addr);
-	if (err) {
-		pr_err("Cannot pin src @ %llx\n", src_addr);
-		return err;
-	}
-
-	err = pin_buffer(dst->vma, dst_addr);
-	if (err) {
-		pr_err("Cannot pin dst @ %llx\n", dst_addr);
-		goto err_src;
-	}
-
-	err = i915_vma_pin(t->batch, 0, 0, PIN_USER | PIN_HIGH);
-	if (err) {
-		pr_err("cannot pin batch\n");
-		goto err_dst;
-	}
-
-	err = prepare_blit(t, dst, src, t->batch->obj);
-	if (err)
-		goto err_bb;
-
-	rq = intel_context_create_request(t->ce);
-	if (IS_ERR(rq)) {
-		err = PTR_ERR(rq);
-		goto err_bb;
-	}
-
-	err = move_to_active(t->batch, rq, 0);
-	if (!err)
-		err = move_to_active(src->vma, rq, 0);
-	if (!err)
-		err = move_to_active(dst->vma, rq, 0);
-	if (!err)
-		err = rq->engine->emit_bb_start(rq,
-						t->batch->node.start,
-						t->batch->node.size,
-						0);
-	i915_request_get(rq);
-	i915_request_add(rq);
-	if (i915_request_wait(rq, 0, HZ / 2) < 0)
-		err = -ETIME;
-	i915_request_put(rq);
-
-	dst->start_val = src->start_val;
-err_bb:
-	i915_vma_unpin(t->batch);
-err_dst:
-	i915_vma_unpin(dst->vma);
-err_src:
-	i915_vma_unpin(src->vma);
-	return err;
-}
-
-static struct tiled_blits *
-tiled_blits_create(struct intel_engine_cs *engine, struct rnd_state *prng)
-{
-	struct drm_mm_node hole;
-	struct tiled_blits *t;
-	u64 hole_size;
-	int err;
-
-	t = kzalloc(sizeof(*t), GFP_KERNEL);
-	if (!t)
-		return ERR_PTR(-ENOMEM);
-
-	t->ce = intel_context_create(engine);
-	if (IS_ERR(t->ce)) {
-		err = PTR_ERR(t->ce);
-		goto err_free;
-	}
-
-	hole_size = 2 * PAGE_ALIGN(WIDTH * HEIGHT * 4);
-	hole_size *= 2; /* room to maneuver */
-	hole_size += 2 * I915_GTT_MIN_ALIGNMENT;
-
-	mutex_lock(&t->ce->vm->mutex);
-	memset(&hole, 0, sizeof(hole));
-	err = drm_mm_insert_node_in_range(&t->ce->vm->mm, &hole,
-					  hole_size, 0, I915_COLOR_UNEVICTABLE,
-					  0, U64_MAX,
-					  DRM_MM_INSERT_BEST);
-	if (!err)
-		drm_mm_remove_node(&hole);
-	mutex_unlock(&t->ce->vm->mutex);
-	if (err) {
-		err = -ENODEV;
-		goto err_put;
-	}
-
-	t->hole = hole.start + I915_GTT_MIN_ALIGNMENT;
-	pr_info("Using hole at %llx\n", t->hole);
-
-	err = tiled_blits_create_buffers(t, WIDTH, HEIGHT, prng);
-	if (err)
-		goto err_put;
-
-	return t;
-
-err_put:
-	intel_context_put(t->ce);
-err_free:
-	kfree(t);
-	return ERR_PTR(err);
-}
-
-static void tiled_blits_destroy(struct tiled_blits *t)
-{
-	tiled_blits_destroy_buffers(t);
-
-	intel_context_put(t->ce);
-	kfree(t);
-}
-
-static int tiled_blits_prepare(struct tiled_blits *t,
-			       struct rnd_state *prng)
-{
-	u64 offset = PAGE_ALIGN(t->width * t->height * 4);
-	u32 *map;
-	int err;
-	int i;
-
-	map = i915_gem_object_pin_map_unlocked(t->scratch.vma->obj, I915_MAP_WC);
-	if (IS_ERR(map))
-		return PTR_ERR(map);
-
-	/* Use scratch to fill objects */
-	for (i = 0; i < ARRAY_SIZE(t->buffers); i++) {
-		fill_scratch(t, map, prandom_u32_state(prng));
-		GEM_BUG_ON(verify_buffer(t, &t->scratch, prng));
-
-		err = tiled_blit(t,
-				 &t->buffers[i], t->hole + offset,
-				 &t->scratch, t->hole);
-		if (err == 0)
-			err = verify_buffer(t, &t->buffers[i], prng);
-		if (err) {
-			pr_err("Failed to create buffer %d\n", i);
-			break;
-		}
-	}
-
-	i915_gem_object_unpin_map(t->scratch.vma->obj);
-	return err;
-}
-
-static int tiled_blits_bounce(struct tiled_blits *t, struct rnd_state *prng)
-{
-	u64 offset =
-		round_up(t->width * t->height * 4, 2 * I915_GTT_MIN_ALIGNMENT);
-	int err;
-
-	/* We want to check position invariant tiling across GTT eviction */
-
-	err = tiled_blit(t,
-			 &t->buffers[1], t->hole + offset / 2,
-			 &t->buffers[0], t->hole + 2 * offset);
-	if (err)
-		return err;
-
-	/* Reposition so that we overlap the old addresses, and slightly off */
-	err = tiled_blit(t,
-			 &t->buffers[2], t->hole + I915_GTT_MIN_ALIGNMENT,
-			 &t->buffers[1], t->hole + 3 * offset / 2);
-	if (err)
-		return err;
-
-	err = verify_buffer(t, &t->buffers[2], prng);
-	if (err)
-		return err;
-
-	return 0;
-}
-
-static int __igt_client_tiled_blits(struct intel_engine_cs *engine,
-				    struct rnd_state *prng)
-{
-	struct tiled_blits *t;
-	int err;
-
-	t = tiled_blits_create(engine, prng);
-	if (IS_ERR(t))
-		return PTR_ERR(t);
-
-	err = tiled_blits_prepare(t, prng);
-	if (err)
-		goto out;
-
-	err = tiled_blits_bounce(t, prng);
-	if (err)
-		goto out;
-
-out:
-	tiled_blits_destroy(t);
-	return err;
-}
-
-static bool has_bit17_swizzle(int sw)
-{
-	return (sw == I915_BIT_6_SWIZZLE_9_10_17 ||
-		sw == I915_BIT_6_SWIZZLE_9_17);
-}
-
-static bool bad_swizzling(struct drm_i915_private *i915)
-{
-	struct i915_ggtt *ggtt = &i915->ggtt;
-
-	if (i915->quirks & QUIRK_PIN_SWIZZLED_PAGES)
-		return true;
-
-	if (has_bit17_swizzle(ggtt->bit_6_swizzle_x) ||
-	    has_bit17_swizzle(ggtt->bit_6_swizzle_y))
-		return true;
-
-	return false;
-}
-
-static int igt_client_tiled_blits(void *arg)
-{
-	struct drm_i915_private *i915 = arg;
-	I915_RND_STATE(prng);
-	int inst = 0;
-
-	/* Test requires explicit BLT tiling controls */
-	if (GRAPHICS_VER(i915) < 4)
-		return 0;
-
-	if (bad_swizzling(i915)) /* Requires sane (sub-page) swizzling */
-		return 0;
-
-	do {
-		struct intel_engine_cs *engine;
-		int err;
-
-		engine = intel_engine_lookup_user(i915,
-						  I915_ENGINE_CLASS_COPY,
-						  inst++);
-		if (!engine)
-			return 0;
-
-		err = __igt_client_tiled_blits(engine, &prng);
-		if (err == -ENODEV)
-			err = 0;
-		if (err)
-			return err;
-	} while (1);
-}
-
-int i915_gem_client_blt_live_selftests(struct drm_i915_private *i915)
-{
-	static const struct i915_subtest tests[] = {
-		SUBTEST(igt_client_fill),
-		SUBTEST(igt_client_tiled_blits),
-	};
-
-	if (intel_gt_is_wedged(&i915->gt))
-		return 0;
-
-	return i915_live_subtests(tests, i915);
-}
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index be5e0191eaea..6f5893ecd549 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -40,7 +40,6 @@ selftest(hugepages, i915_gem_huge_page_live_selftests)
 selftest(gem_contexts, i915_gem_context_live_selftests)
 selftest(gem_execbuf, i915_gem_execbuffer_live_selftests)
 selftest(blt, i915_gem_object_blt_live_selftests)
-selftest(client, i915_gem_client_blt_live_selftests)
 selftest(reset, intel_reset_live_selftests)
 selftest(memory_region, intel_memory_region_live_selftests)
 selftest(hangcheck, intel_hangcheck_live_selftests)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 12/12] drm/i915/gem: Zap the i915_gem_object_blt code
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:26   ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

It's unused except for selftests. Replace the call in the memory_region
live selftest with a call into the corresponding function in the new
migrate code.
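
For reference, the replacement in the memory_region live selftest is
expected to look roughly like the sketch below, built on the clear helper
introduced earlier in this series. The helper name, the gt->migrate.context
field and the exact argument list are assumptions for illustration here,
not a quote of the patch:

	/*
	 * Illustrative sketch only: fill the object's backing store with a
	 * test value through the new migrate context instead of
	 * i915_gem_object_fill_blt(), then wait for the blit to complete.
	 */
	struct i915_request *rq = NULL;
	int err;

	err = intel_context_migrate_clear(gt->migrate.context, NULL,
					  obj->mm.pages->sgl,
					  obj->cache_level,
					  i915_gem_object_is_lmem(obj),
					  value, &rq);
	if (rq) {
		if (i915_request_wait(rq, 0, HZ) < 0)
			err = -ETIME;
		i915_request_put(rq);
	}

The point is that the blit-specific batch construction and buffer-pool
handling drop out of the selftest; it only needs to wait for the request
returned by the migrate code.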

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   1 -
 .../gpu/drm/i915/gem/i915_gem_object_blt.c    | 461 --------------
 .../gpu/drm/i915/gem/i915_gem_object_blt.h    |  39 --
 .../i915/gem/selftests/i915_gem_object_blt.c  | 597 ------------------
 .../drm/i915/selftests/i915_live_selftests.h  |   1 -
 .../drm/i915/selftests/i915_perf_selftests.h  |   1 -
 .../drm/i915/selftests/intel_memory_region.c  |  21 +-
 7 files changed, 14 insertions(+), 1107 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
 delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
 delete mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index ca07474ec2df..13085ac78c63 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -143,7 +143,6 @@ gem-y += \
 	gem/i915_gem_execbuffer.o \
 	gem/i915_gem_internal.o \
 	gem/i915_gem_object.o \
-	gem/i915_gem_object_blt.o \
 	gem/i915_gem_lmem.o \
 	gem/i915_gem_mman.o \
 	gem/i915_gem_pages.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
deleted file mode 100644
index 3e28c68fda3e..000000000000
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
+++ /dev/null
@@ -1,461 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2019 Intel Corporation
- */
-
-#include "i915_drv.h"
-#include "gt/intel_context.h"
-#include "gt/intel_engine_pm.h"
-#include "gt/intel_gpu_commands.h"
-#include "gt/intel_gt.h"
-#include "gt/intel_gt_buffer_pool.h"
-#include "gt/intel_ring.h"
-#include "i915_gem_clflush.h"
-#include "i915_gem_object_blt.h"
-
-struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
-					 struct i915_vma *vma,
-					 struct i915_gem_ww_ctx *ww,
-					 u32 value)
-{
-	struct drm_i915_private *i915 = ce->vm->i915;
-	const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
-	struct intel_gt_buffer_pool_node *pool;
-	struct i915_vma *batch;
-	u64 offset;
-	u64 count;
-	u64 rem;
-	u32 size;
-	u32 *cmd;
-	int err;
-
-	GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
-	intel_engine_pm_get(ce->engine);
-
-	count = div_u64(round_up(vma->size, block_size), block_size);
-	size = (1 + 8 * count) * sizeof(u32);
-	size = round_up(size, PAGE_SIZE);
-	pool = intel_gt_get_buffer_pool(ce->engine->gt, size, I915_MAP_WC);
-	if (IS_ERR(pool)) {
-		err = PTR_ERR(pool);
-		goto out_pm;
-	}
-
-	err = i915_gem_object_lock(pool->obj, ww);
-	if (err)
-		goto out_put;
-
-	batch = i915_vma_instance(pool->obj, ce->vm, NULL);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto out_put;
-	}
-
-	err = i915_vma_pin_ww(batch, ww, 0, 0, PIN_USER);
-	if (unlikely(err))
-		goto out_put;
-
-	/* we pinned the pool, mark it as such */
-	intel_gt_buffer_pool_mark_used(pool);
-
-	cmd = i915_gem_object_pin_map(pool->obj, pool->type);
-	if (IS_ERR(cmd)) {
-		err = PTR_ERR(cmd);
-		goto out_unpin;
-	}
-
-	rem = vma->size;
-	offset = vma->node.start;
-
-	do {
-		u32 size = min_t(u64, rem, block_size);
-
-		GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
-
-		if (GRAPHICS_VER(i915) >= 8) {
-			*cmd++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7 - 2);
-			*cmd++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
-			*cmd++ = 0;
-			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
-			*cmd++ = lower_32_bits(offset);
-			*cmd++ = upper_32_bits(offset);
-			*cmd++ = value;
-		} else {
-			*cmd++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
-			*cmd++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
-			*cmd++ = 0;
-			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
-			*cmd++ = offset;
-			*cmd++ = value;
-		}
-
-		/* Allow ourselves to be preempted in between blocks. */
-		*cmd++ = MI_ARB_CHECK;
-
-		offset += size;
-		rem -= size;
-	} while (rem);
-
-	*cmd = MI_BATCH_BUFFER_END;
-
-	i915_gem_object_flush_map(pool->obj);
-	i915_gem_object_unpin_map(pool->obj);
-
-	intel_gt_chipset_flush(ce->vm->gt);
-
-	batch->private = pool;
-	return batch;
-
-out_unpin:
-	i915_vma_unpin(batch);
-out_put:
-	intel_gt_buffer_pool_put(pool);
-out_pm:
-	intel_engine_pm_put(ce->engine);
-	return ERR_PTR(err);
-}
-
-int intel_emit_vma_mark_active(struct i915_vma *vma, struct i915_request *rq)
-{
-	int err;
-
-	err = i915_request_await_object(rq, vma->obj, false);
-	if (err == 0)
-		err = i915_vma_move_to_active(vma, rq, 0);
-	if (unlikely(err))
-		return err;
-
-	return intel_gt_buffer_pool_mark_active(vma->private, rq);
-}
-
-void intel_emit_vma_release(struct intel_context *ce, struct i915_vma *vma)
-{
-	i915_vma_unpin(vma);
-	intel_gt_buffer_pool_put(vma->private);
-	intel_engine_pm_put(ce->engine);
-}
-
-static int
-move_obj_to_gpu(struct drm_i915_gem_object *obj,
-		struct i915_request *rq,
-		bool write)
-{
-	if (obj->cache_dirty & ~obj->cache_coherent)
-		i915_gem_clflush_object(obj, 0);
-
-	return i915_request_await_object(rq, obj, write);
-}
-
-int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
-			     struct intel_context *ce,
-			     u32 value)
-{
-	struct i915_gem_ww_ctx ww;
-	struct i915_request *rq;
-	struct i915_vma *batch;
-	struct i915_vma *vma;
-	int err;
-
-	vma = i915_vma_instance(obj, ce->vm, NULL);
-	if (IS_ERR(vma))
-		return PTR_ERR(vma);
-
-	i915_gem_ww_ctx_init(&ww, true);
-	intel_engine_pm_get(ce->engine);
-retry:
-	err = i915_gem_object_lock(obj, &ww);
-	if (err)
-		goto out;
-
-	err = intel_context_pin_ww(ce, &ww);
-	if (err)
-		goto out;
-
-	err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_USER);
-	if (err)
-		goto out_ctx;
-
-	batch = intel_emit_vma_fill_blt(ce, vma, &ww, value);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto out_vma;
-	}
-
-	rq = i915_request_create(ce);
-	if (IS_ERR(rq)) {
-		err = PTR_ERR(rq);
-		goto out_batch;
-	}
-
-	err = intel_emit_vma_mark_active(batch, rq);
-	if (unlikely(err))
-		goto out_request;
-
-	err = move_obj_to_gpu(vma->obj, rq, true);
-	if (err == 0)
-		err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
-	if (unlikely(err))
-		goto out_request;
-
-	if (ce->engine->emit_init_breadcrumb)
-		err = ce->engine->emit_init_breadcrumb(rq);
-
-	if (likely(!err))
-		err = ce->engine->emit_bb_start(rq,
-						batch->node.start,
-						batch->node.size,
-						0);
-out_request:
-	if (unlikely(err))
-		i915_request_set_error_once(rq, err);
-
-	i915_request_add(rq);
-out_batch:
-	intel_emit_vma_release(ce, batch);
-out_vma:
-	i915_vma_unpin(vma);
-out_ctx:
-	intel_context_unpin(ce);
-out:
-	if (err == -EDEADLK) {
-		err = i915_gem_ww_ctx_backoff(&ww);
-		if (!err)
-			goto retry;
-	}
-	i915_gem_ww_ctx_fini(&ww);
-	intel_engine_pm_put(ce->engine);
-	return err;
-}
-
-/* Wa_1209644611:icl,ehl */
-static bool wa_1209644611_applies(struct drm_i915_private *i915, u32 size)
-{
-	u32 height = size >> PAGE_SHIFT;
-
-	if (GRAPHICS_VER(i915) != 11)
-		return false;
-
-	return height % 4 == 3 && height <= 8;
-}
-
-struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
-					 struct i915_gem_ww_ctx *ww,
-					 struct i915_vma *src,
-					 struct i915_vma *dst)
-{
-	struct drm_i915_private *i915 = ce->vm->i915;
-	const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
-	struct intel_gt_buffer_pool_node *pool;
-	struct i915_vma *batch;
-	u64 src_offset, dst_offset;
-	u64 count, rem;
-	u32 size, *cmd;
-	int err;
-
-	GEM_BUG_ON(src->size != dst->size);
-
-	GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
-	intel_engine_pm_get(ce->engine);
-
-	count = div_u64(round_up(dst->size, block_size), block_size);
-	size = (1 + 11 * count) * sizeof(u32);
-	size = round_up(size, PAGE_SIZE);
-	pool = intel_gt_get_buffer_pool(ce->engine->gt, size, I915_MAP_WC);
-	if (IS_ERR(pool)) {
-		err = PTR_ERR(pool);
-		goto out_pm;
-	}
-
-	err = i915_gem_object_lock(pool->obj, ww);
-	if (err)
-		goto out_put;
-
-	batch = i915_vma_instance(pool->obj, ce->vm, NULL);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto out_put;
-	}
-
-	err = i915_vma_pin_ww(batch, ww, 0, 0, PIN_USER);
-	if (unlikely(err))
-		goto out_put;
-
-	/* we pinned the pool, mark it as such */
-	intel_gt_buffer_pool_mark_used(pool);
-
-	cmd = i915_gem_object_pin_map(pool->obj, pool->type);
-	if (IS_ERR(cmd)) {
-		err = PTR_ERR(cmd);
-		goto out_unpin;
-	}
-
-	rem = src->size;
-	src_offset = src->node.start;
-	dst_offset = dst->node.start;
-
-	do {
-		size = min_t(u64, rem, block_size);
-		GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
-
-		if (GRAPHICS_VER(i915) >= 9 &&
-		    !wa_1209644611_applies(i915, size)) {
-			*cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
-			*cmd++ = BLT_DEPTH_32 | PAGE_SIZE;
-			*cmd++ = 0;
-			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
-			*cmd++ = lower_32_bits(dst_offset);
-			*cmd++ = upper_32_bits(dst_offset);
-			*cmd++ = 0;
-			*cmd++ = PAGE_SIZE;
-			*cmd++ = lower_32_bits(src_offset);
-			*cmd++ = upper_32_bits(src_offset);
-		} else if (GRAPHICS_VER(i915) >= 8) {
-			*cmd++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (10 - 2);
-			*cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
-			*cmd++ = 0;
-			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
-			*cmd++ = lower_32_bits(dst_offset);
-			*cmd++ = upper_32_bits(dst_offset);
-			*cmd++ = 0;
-			*cmd++ = PAGE_SIZE;
-			*cmd++ = lower_32_bits(src_offset);
-			*cmd++ = upper_32_bits(src_offset);
-		} else {
-			*cmd++ = SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
-			*cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
-			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE;
-			*cmd++ = dst_offset;
-			*cmd++ = PAGE_SIZE;
-			*cmd++ = src_offset;
-		}
-
-		/* Allow ourselves to be preempted in between blocks. */
-		*cmd++ = MI_ARB_CHECK;
-
-		src_offset += size;
-		dst_offset += size;
-		rem -= size;
-	} while (rem);
-
-	*cmd = MI_BATCH_BUFFER_END;
-
-	i915_gem_object_flush_map(pool->obj);
-	i915_gem_object_unpin_map(pool->obj);
-
-	intel_gt_chipset_flush(ce->vm->gt);
-	batch->private = pool;
-	return batch;
-
-out_unpin:
-	i915_vma_unpin(batch);
-out_put:
-	intel_gt_buffer_pool_put(pool);
-out_pm:
-	intel_engine_pm_put(ce->engine);
-	return ERR_PTR(err);
-}
-
-int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
-			     struct drm_i915_gem_object *dst,
-			     struct intel_context *ce)
-{
-	struct i915_address_space *vm = ce->vm;
-	struct i915_vma *vma[2], *batch;
-	struct i915_gem_ww_ctx ww;
-	struct i915_request *rq;
-	int err, i;
-
-	vma[0] = i915_vma_instance(src, vm, NULL);
-	if (IS_ERR(vma[0]))
-		return PTR_ERR(vma[0]);
-
-	vma[1] = i915_vma_instance(dst, vm, NULL);
-	if (IS_ERR(vma[1]))
-		return PTR_ERR(vma[1]);
-
-	i915_gem_ww_ctx_init(&ww, true);
-	intel_engine_pm_get(ce->engine);
-retry:
-	err = i915_gem_object_lock(src, &ww);
-	if (!err)
-		err = i915_gem_object_lock(dst, &ww);
-	if (!err)
-		err = intel_context_pin_ww(ce, &ww);
-	if (err)
-		goto out;
-
-	err = i915_vma_pin_ww(vma[0], &ww, 0, 0, PIN_USER);
-	if (err)
-		goto out_ctx;
-
-	err = i915_vma_pin_ww(vma[1], &ww, 0, 0, PIN_USER);
-	if (unlikely(err))
-		goto out_unpin_src;
-
-	batch = intel_emit_vma_copy_blt(ce, &ww, vma[0], vma[1]);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto out_unpin_dst;
-	}
-
-	rq = i915_request_create(ce);
-	if (IS_ERR(rq)) {
-		err = PTR_ERR(rq);
-		goto out_batch;
-	}
-
-	err = intel_emit_vma_mark_active(batch, rq);
-	if (unlikely(err))
-		goto out_request;
-
-	for (i = 0; i < ARRAY_SIZE(vma); i++) {
-		err = move_obj_to_gpu(vma[i]->obj, rq, i);
-		if (unlikely(err))
-			goto out_request;
-	}
-
-	for (i = 0; i < ARRAY_SIZE(vma); i++) {
-		unsigned int flags = i ? EXEC_OBJECT_WRITE : 0;
-
-		err = i915_vma_move_to_active(vma[i], rq, flags);
-		if (unlikely(err))
-			goto out_request;
-	}
-
-	if (rq->engine->emit_init_breadcrumb) {
-		err = rq->engine->emit_init_breadcrumb(rq);
-		if (unlikely(err))
-			goto out_request;
-	}
-
-	err = rq->engine->emit_bb_start(rq,
-					batch->node.start, batch->node.size,
-					0);
-
-out_request:
-	if (unlikely(err))
-		i915_request_set_error_once(rq, err);
-
-	i915_request_add(rq);
-out_batch:
-	intel_emit_vma_release(ce, batch);
-out_unpin_dst:
-	i915_vma_unpin(vma[1]);
-out_unpin_src:
-	i915_vma_unpin(vma[0]);
-out_ctx:
-	intel_context_unpin(ce);
-out:
-	if (err == -EDEADLK) {
-		err = i915_gem_ww_ctx_backoff(&ww);
-		if (!err)
-			goto retry;
-	}
-	i915_gem_ww_ctx_fini(&ww);
-	intel_engine_pm_put(ce->engine);
-	return err;
-}
-
-#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-#include "selftests/i915_gem_object_blt.c"
-#endif
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
deleted file mode 100644
index 2409fdcccf0e..000000000000
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
+++ /dev/null
@@ -1,39 +0,0 @@
-/* SPDX-License-Identifier: MIT */
-/*
- * Copyright © 2019 Intel Corporation
- */
-
-#ifndef __I915_GEM_OBJECT_BLT_H__
-#define __I915_GEM_OBJECT_BLT_H__
-
-#include <linux/types.h>
-
-#include "gt/intel_context.h"
-#include "gt/intel_engine_pm.h"
-#include "i915_vma.h"
-
-struct drm_i915_gem_object;
-struct i915_gem_ww_ctx;
-
-struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
-					 struct i915_vma *vma,
-					 struct i915_gem_ww_ctx *ww,
-					 u32 value);
-
-struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
-					 struct i915_gem_ww_ctx *ww,
-					 struct i915_vma *src,
-					 struct i915_vma *dst);
-
-int intel_emit_vma_mark_active(struct i915_vma *vma, struct i915_request *rq);
-void intel_emit_vma_release(struct intel_context *ce, struct i915_vma *vma);
-
-int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
-			     struct intel_context *ce,
-			     u32 value);
-
-int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
-			     struct drm_i915_gem_object *dst,
-			     struct intel_context *ce);
-
-#endif
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
deleted file mode 100644
index 8c335d1a8406..000000000000
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
+++ /dev/null
@@ -1,597 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2019 Intel Corporation
- */
-
-#include <linux/sort.h>
-
-#include "gt/intel_gt.h"
-#include "gt/intel_engine_user.h"
-
-#include "i915_selftest.h"
-
-#include "gem/i915_gem_context.h"
-#include "selftests/igt_flush_test.h"
-#include "selftests/i915_random.h"
-#include "selftests/mock_drm.h"
-#include "huge_gem_object.h"
-#include "mock_context.h"
-
-static int wrap_ktime_compare(const void *A, const void *B)
-{
-	const ktime_t *a = A, *b = B;
-
-	return ktime_compare(*a, *b);
-}
-
-static int __perf_fill_blt(struct drm_i915_gem_object *obj)
-{
-	struct drm_i915_private *i915 = to_i915(obj->base.dev);
-	int inst = 0;
-
-	do {
-		struct intel_engine_cs *engine;
-		ktime_t t[5];
-		int pass;
-		int err;
-
-		engine = intel_engine_lookup_user(i915,
-						  I915_ENGINE_CLASS_COPY,
-						  inst++);
-		if (!engine)
-			return 0;
-
-		intel_engine_pm_get(engine);
-		for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
-			struct intel_context *ce = engine->kernel_context;
-			ktime_t t0, t1;
-
-			t0 = ktime_get();
-
-			err = i915_gem_object_fill_blt(obj, ce, 0);
-			if (err)
-				break;
-
-			err = i915_gem_object_wait(obj,
-						   I915_WAIT_ALL,
-						   MAX_SCHEDULE_TIMEOUT);
-			if (err)
-				break;
-
-			t1 = ktime_get();
-			t[pass] = ktime_sub(t1, t0);
-		}
-		intel_engine_pm_put(engine);
-		if (err)
-			return err;
-
-		sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
-		pr_info("%s: blt %zd KiB fill: %lld MiB/s\n",
-			engine->name,
-			obj->base.size >> 10,
-			div64_u64(mul_u32_u32(4 * obj->base.size,
-					      1000 * 1000 * 1000),
-				  t[1] + 2 * t[2] + t[3]) >> 20);
-	} while (1);
-}
-
-static int perf_fill_blt(void *arg)
-{
-	struct drm_i915_private *i915 = arg;
-	static const unsigned long sizes[] = {
-		SZ_4K,
-		SZ_64K,
-		SZ_2M,
-		SZ_64M
-	};
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(sizes); i++) {
-		struct drm_i915_gem_object *obj;
-		int err;
-
-		obj = i915_gem_object_create_internal(i915, sizes[i]);
-		if (IS_ERR(obj))
-			return PTR_ERR(obj);
-
-		err = __perf_fill_blt(obj);
-		i915_gem_object_put(obj);
-		if (err)
-			return err;
-	}
-
-	return 0;
-}
-
-static int __perf_copy_blt(struct drm_i915_gem_object *src,
-			   struct drm_i915_gem_object *dst)
-{
-	struct drm_i915_private *i915 = to_i915(src->base.dev);
-	int inst = 0;
-
-	do {
-		struct intel_engine_cs *engine;
-		ktime_t t[5];
-		int pass;
-		int err = 0;
-
-		engine = intel_engine_lookup_user(i915,
-						  I915_ENGINE_CLASS_COPY,
-						  inst++);
-		if (!engine)
-			return 0;
-
-		intel_engine_pm_get(engine);
-		for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
-			struct intel_context *ce = engine->kernel_context;
-			ktime_t t0, t1;
-
-			t0 = ktime_get();
-
-			err = i915_gem_object_copy_blt(src, dst, ce);
-			if (err)
-				break;
-
-			err = i915_gem_object_wait(dst,
-						   I915_WAIT_ALL,
-						   MAX_SCHEDULE_TIMEOUT);
-			if (err)
-				break;
-
-			t1 = ktime_get();
-			t[pass] = ktime_sub(t1, t0);
-		}
-		intel_engine_pm_put(engine);
-		if (err)
-			return err;
-
-		sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
-		pr_info("%s: blt %zd KiB copy: %lld MiB/s\n",
-			engine->name,
-			src->base.size >> 10,
-			div64_u64(mul_u32_u32(4 * src->base.size,
-					      1000 * 1000 * 1000),
-				  t[1] + 2 * t[2] + t[3]) >> 20);
-	} while (1);
-}
-
-static int perf_copy_blt(void *arg)
-{
-	struct drm_i915_private *i915 = arg;
-	static const unsigned long sizes[] = {
-		SZ_4K,
-		SZ_64K,
-		SZ_2M,
-		SZ_64M
-	};
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(sizes); i++) {
-		struct drm_i915_gem_object *src, *dst;
-		int err;
-
-		src = i915_gem_object_create_internal(i915, sizes[i]);
-		if (IS_ERR(src))
-			return PTR_ERR(src);
-
-		dst = i915_gem_object_create_internal(i915, sizes[i]);
-		if (IS_ERR(dst)) {
-			err = PTR_ERR(dst);
-			goto err_src;
-		}
-
-		err = __perf_copy_blt(src, dst);
-
-		i915_gem_object_put(dst);
-err_src:
-		i915_gem_object_put(src);
-		if (err)
-			return err;
-	}
-
-	return 0;
-}
-
-struct igt_thread_arg {
-	struct intel_engine_cs *engine;
-	struct i915_gem_context *ctx;
-	struct file *file;
-	struct rnd_state prng;
-	unsigned int n_cpus;
-};
-
-static int igt_fill_blt_thread(void *arg)
-{
-	struct igt_thread_arg *thread = arg;
-	struct intel_engine_cs *engine = thread->engine;
-	struct rnd_state *prng = &thread->prng;
-	struct drm_i915_gem_object *obj;
-	struct i915_gem_context *ctx;
-	struct intel_context *ce;
-	unsigned int prio;
-	IGT_TIMEOUT(end);
-	u64 total, max;
-	int err;
-
-	ctx = thread->ctx;
-	if (!ctx) {
-		ctx = live_context_for_engine(engine, thread->file);
-		if (IS_ERR(ctx))
-			return PTR_ERR(ctx);
-
-		prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
-		ctx->sched.priority = prio;
-	}
-
-	ce = i915_gem_context_get_engine(ctx, 0);
-	GEM_BUG_ON(IS_ERR(ce));
-
-	/*
-	 * If we have a tiny shared address space, like for the GGTT
-	 * then we can't be too greedy.
-	 */
-	max = ce->vm->total;
-	if (i915_is_ggtt(ce->vm) || thread->ctx)
-		max = div_u64(max, thread->n_cpus);
-	max >>= 4;
-
-	total = PAGE_SIZE;
-	do {
-		/* Aim to keep the runtime under reasonable bounds! */
-		const u32 max_phys_size = SZ_64K;
-		u32 val = prandom_u32_state(prng);
-		u32 phys_sz;
-		u32 sz;
-		u32 *vaddr;
-		u32 i;
-
-		total = min(total, max);
-		sz = i915_prandom_u32_max_state(total, prng) + 1;
-		phys_sz = sz % max_phys_size + 1;
-
-		sz = round_up(sz, PAGE_SIZE);
-		phys_sz = round_up(phys_sz, PAGE_SIZE);
-		phys_sz = min(phys_sz, sz);
-
-		pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
-			 phys_sz, sz, val);
-
-		obj = huge_gem_object(engine->i915, phys_sz, sz);
-		if (IS_ERR(obj)) {
-			err = PTR_ERR(obj);
-			goto err_flush;
-		}
-
-		vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
-		if (IS_ERR(vaddr)) {
-			err = PTR_ERR(vaddr);
-			goto err_put;
-		}
-
-		/*
-		 * Make sure the potentially async clflush does its job, if
-		 * required.
-		 */
-		memset32(vaddr, val ^ 0xdeadbeaf,
-			 huge_gem_object_phys_size(obj) / sizeof(u32));
-
-		if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
-			obj->cache_dirty = true;
-
-		err = i915_gem_object_fill_blt(obj, ce, val);
-		if (err)
-			goto err_unpin;
-
-		err = i915_gem_object_wait(obj, 0, MAX_SCHEDULE_TIMEOUT);
-		if (err)
-			goto err_unpin;
-
-		for (i = 0; i < huge_gem_object_phys_size(obj) / sizeof(u32); i += 17) {
-			if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
-				drm_clflush_virt_range(&vaddr[i], sizeof(vaddr[i]));
-
-			if (vaddr[i] != val) {
-				pr_err("vaddr[%u]=%x, expected=%x\n", i,
-				       vaddr[i], val);
-				err = -EINVAL;
-				goto err_unpin;
-			}
-		}
-
-		i915_gem_object_unpin_map(obj);
-		i915_gem_object_put(obj);
-
-		total <<= 1;
-	} while (!time_after(jiffies, end));
-
-	goto err_flush;
-
-err_unpin:
-	i915_gem_object_unpin_map(obj);
-err_put:
-	i915_gem_object_put(obj);
-err_flush:
-	if (err == -ENOMEM)
-		err = 0;
-
-	intel_context_put(ce);
-	return err;
-}
-
-static int igt_copy_blt_thread(void *arg)
-{
-	struct igt_thread_arg *thread = arg;
-	struct intel_engine_cs *engine = thread->engine;
-	struct rnd_state *prng = &thread->prng;
-	struct drm_i915_gem_object *src, *dst;
-	struct i915_gem_context *ctx;
-	struct intel_context *ce;
-	unsigned int prio;
-	IGT_TIMEOUT(end);
-	u64 total, max;
-	int err;
-
-	ctx = thread->ctx;
-	if (!ctx) {
-		ctx = live_context_for_engine(engine, thread->file);
-		if (IS_ERR(ctx))
-			return PTR_ERR(ctx);
-
-		prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
-		ctx->sched.priority = prio;
-	}
-
-	ce = i915_gem_context_get_engine(ctx, 0);
-	GEM_BUG_ON(IS_ERR(ce));
-
-	/*
-	 * If we have a tiny shared address space, like for the GGTT
-	 * then we can't be too greedy.
-	 */
-	max = ce->vm->total;
-	if (i915_is_ggtt(ce->vm) || thread->ctx)
-		max = div_u64(max, thread->n_cpus);
-	max >>= 4;
-
-	total = PAGE_SIZE;
-	do {
-		/* Aim to keep the runtime under reasonable bounds! */
-		const u32 max_phys_size = SZ_64K;
-		u32 val = prandom_u32_state(prng);
-		u32 phys_sz;
-		u32 sz;
-		u32 *vaddr;
-		u32 i;
-
-		total = min(total, max);
-		sz = i915_prandom_u32_max_state(total, prng) + 1;
-		phys_sz = sz % max_phys_size + 1;
-
-		sz = round_up(sz, PAGE_SIZE);
-		phys_sz = round_up(phys_sz, PAGE_SIZE);
-		phys_sz = min(phys_sz, sz);
-
-		pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
-			 phys_sz, sz, val);
-
-		src = huge_gem_object(engine->i915, phys_sz, sz);
-		if (IS_ERR(src)) {
-			err = PTR_ERR(src);
-			goto err_flush;
-		}
-
-		vaddr = i915_gem_object_pin_map_unlocked(src, I915_MAP_WB);
-		if (IS_ERR(vaddr)) {
-			err = PTR_ERR(vaddr);
-			goto err_put_src;
-		}
-
-		memset32(vaddr, val,
-			 huge_gem_object_phys_size(src) / sizeof(u32));
-
-		i915_gem_object_unpin_map(src);
-
-		if (!(src->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
-			src->cache_dirty = true;
-
-		dst = huge_gem_object(engine->i915, phys_sz, sz);
-		if (IS_ERR(dst)) {
-			err = PTR_ERR(dst);
-			goto err_put_src;
-		}
-
-		vaddr = i915_gem_object_pin_map_unlocked(dst, I915_MAP_WB);
-		if (IS_ERR(vaddr)) {
-			err = PTR_ERR(vaddr);
-			goto err_put_dst;
-		}
-
-		memset32(vaddr, val ^ 0xdeadbeaf,
-			 huge_gem_object_phys_size(dst) / sizeof(u32));
-
-		if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
-			dst->cache_dirty = true;
-
-		err = i915_gem_object_copy_blt(src, dst, ce);
-		if (err)
-			goto err_unpin;
-
-		err = i915_gem_object_wait(dst, 0, MAX_SCHEDULE_TIMEOUT);
-		if (err)
-			goto err_unpin;
-
-		for (i = 0; i < huge_gem_object_phys_size(dst) / sizeof(u32); i += 17) {
-			if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
-				drm_clflush_virt_range(&vaddr[i], sizeof(vaddr[i]));
-
-			if (vaddr[i] != val) {
-				pr_err("vaddr[%u]=%x, expected=%x\n", i,
-				       vaddr[i], val);
-				err = -EINVAL;
-				goto err_unpin;
-			}
-		}
-
-		i915_gem_object_unpin_map(dst);
-
-		i915_gem_object_put(src);
-		i915_gem_object_put(dst);
-
-		total <<= 1;
-	} while (!time_after(jiffies, end));
-
-	goto err_flush;
-
-err_unpin:
-	i915_gem_object_unpin_map(dst);
-err_put_dst:
-	i915_gem_object_put(dst);
-err_put_src:
-	i915_gem_object_put(src);
-err_flush:
-	if (err == -ENOMEM)
-		err = 0;
-
-	intel_context_put(ce);
-	return err;
-}
-
-static int igt_threaded_blt(struct intel_engine_cs *engine,
-			    int (*blt_fn)(void *arg),
-			    unsigned int flags)
-#define SINGLE_CTX BIT(0)
-{
-	struct igt_thread_arg *thread;
-	struct task_struct **tsk;
-	unsigned int n_cpus, i;
-	I915_RND_STATE(prng);
-	int err = 0;
-
-	n_cpus = num_online_cpus() + 1;
-
-	tsk = kcalloc(n_cpus, sizeof(struct task_struct *), GFP_KERNEL);
-	if (!tsk)
-		return 0;
-
-	thread = kcalloc(n_cpus, sizeof(struct igt_thread_arg), GFP_KERNEL);
-	if (!thread)
-		goto out_tsk;
-
-	thread[0].file = mock_file(engine->i915);
-	if (IS_ERR(thread[0].file)) {
-		err = PTR_ERR(thread[0].file);
-		goto out_thread;
-	}
-
-	if (flags & SINGLE_CTX) {
-		thread[0].ctx = live_context_for_engine(engine, thread[0].file);
-		if (IS_ERR(thread[0].ctx)) {
-			err = PTR_ERR(thread[0].ctx);
-			goto out_file;
-		}
-	}
-
-	for (i = 0; i < n_cpus; ++i) {
-		thread[i].engine = engine;
-		thread[i].file = thread[0].file;
-		thread[i].ctx = thread[0].ctx;
-		thread[i].n_cpus = n_cpus;
-		thread[i].prng =
-			I915_RND_STATE_INITIALIZER(prandom_u32_state(&prng));
-
-		tsk[i] = kthread_run(blt_fn, &thread[i], "igt/blt-%d", i);
-		if (IS_ERR(tsk[i])) {
-			err = PTR_ERR(tsk[i]);
-			break;
-		}
-
-		get_task_struct(tsk[i]);
-	}
-
-	yield(); /* start all threads before we kthread_stop() */
-
-	for (i = 0; i < n_cpus; ++i) {
-		int status;
-
-		if (IS_ERR_OR_NULL(tsk[i]))
-			continue;
-
-		status = kthread_stop(tsk[i]);
-		if (status && !err)
-			err = status;
-
-		put_task_struct(tsk[i]);
-	}
-
-out_file:
-	fput(thread[0].file);
-out_thread:
-	kfree(thread);
-out_tsk:
-	kfree(tsk);
-	return err;
-}
-
-static int test_copy_engines(struct drm_i915_private *i915,
-			     int (*fn)(void *arg),
-			     unsigned int flags)
-{
-	struct intel_engine_cs *engine;
-	int ret;
-
-	for_each_uabi_class_engine(engine, I915_ENGINE_CLASS_COPY, i915) {
-		ret = igt_threaded_blt(engine, fn, flags);
-		if (ret)
-			return ret;
-	}
-
-	return 0;
-}
-
-static int igt_fill_blt(void *arg)
-{
-	return test_copy_engines(arg, igt_fill_blt_thread, 0);
-}
-
-static int igt_fill_blt_ctx0(void *arg)
-{
-	return test_copy_engines(arg, igt_fill_blt_thread, SINGLE_CTX);
-}
-
-static int igt_copy_blt(void *arg)
-{
-	return test_copy_engines(arg, igt_copy_blt_thread, 0);
-}
-
-static int igt_copy_blt_ctx0(void *arg)
-{
-	return test_copy_engines(arg, igt_copy_blt_thread, SINGLE_CTX);
-}
-
-int i915_gem_object_blt_live_selftests(struct drm_i915_private *i915)
-{
-	static const struct i915_subtest tests[] = {
-		SUBTEST(igt_fill_blt),
-		SUBTEST(igt_fill_blt_ctx0),
-		SUBTEST(igt_copy_blt),
-		SUBTEST(igt_copy_blt_ctx0),
-	};
-
-	if (intel_gt_is_wedged(&i915->gt))
-		return 0;
-
-	return i915_live_subtests(tests, i915);
-}
-
-int i915_gem_object_blt_perf_selftests(struct drm_i915_private *i915)
-{
-	static const struct i915_subtest tests[] = {
-		SUBTEST(perf_fill_blt),
-		SUBTEST(perf_copy_blt),
-	};
-
-	if (intel_gt_is_wedged(&i915->gt))
-		return 0;
-
-	return i915_live_subtests(tests, i915);
-}
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index 6f5893ecd549..1ae3f8039d68 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -39,7 +39,6 @@ selftest(evict, i915_gem_evict_live_selftests)
 selftest(hugepages, i915_gem_huge_page_live_selftests)
 selftest(gem_contexts, i915_gem_context_live_selftests)
 selftest(gem_execbuf, i915_gem_execbuffer_live_selftests)
-selftest(blt, i915_gem_object_blt_live_selftests)
 selftest(reset, intel_reset_live_selftests)
 selftest(memory_region, intel_memory_region_live_selftests)
 selftest(hangcheck, intel_hangcheck_live_selftests)
diff --git a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
index 5077dc3c3b8c..058450d351f7 100644
--- a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
@@ -18,5 +18,4 @@
 selftest(engine_cs, intel_engine_cs_perf_selftests)
 selftest(request, i915_request_perf_selftests)
 selftest(migrate, intel_migrate_perf_selftests)
-selftest(blt, i915_gem_object_blt_perf_selftests)
 selftest(region, intel_memory_region_perf_selftests)
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index c85d516b85cd..2e18f3a3d538 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -15,11 +15,12 @@
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_region.h"
-#include "gem/i915_gem_object_blt.h"
 #include "gem/selftests/igt_gem_utils.h"
 #include "gem/selftests/mock_context.h"
+#include "gt/intel_engine_pm.h"
 #include "gt/intel_engine_user.h"
 #include "gt/intel_gt.h"
+#include "gt/intel_migrate.h"
 #include "i915_memcpy.h"
 #include "selftests/igt_flush_test.h"
 #include "selftests/i915_random.h"
@@ -741,6 +742,7 @@ static int igt_lmem_write_cpu(void *arg)
 		PAGE_SIZE - 64,
 	};
 	struct intel_engine_cs *engine;
+	struct i915_request *rq;
 	u32 *vaddr;
 	u32 sz;
 	u32 i;
@@ -767,15 +769,20 @@ static int igt_lmem_write_cpu(void *arg)
 		goto out_put;
 	}
 
+	i915_gem_object_lock(obj, NULL);
 	/* Put the pages into a known state -- from the gpu for added fun */
 	intel_engine_pm_get(engine);
-	err = i915_gem_object_fill_blt(obj, engine->kernel_context, 0xdeadbeaf);
-	intel_engine_pm_put(engine);
-	if (err)
-		goto out_unpin;
+	err = intel_context_migrate_clear(engine->gt->migrate.context, NULL,
+					  obj->mm.pages->sgl, I915_CACHE_NONE,
+					  true, 0xdeadbeaf, &rq);
+	if (rq) {
+		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
+		i915_request_put(rq);
+	}
 
-	i915_gem_object_lock(obj, NULL);
-	err = i915_gem_object_set_to_wc_domain(obj, true);
+	intel_engine_pm_put(engine);
+	if (!err)
+		err = i915_gem_object_set_to_wc_domain(obj, true);
 	i915_gem_object_unlock(obj);
 	if (err)
 		goto out_unpin;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [Intel-gfx] [PATCH v3 12/12] drm/i915/gem: Zap the i915_gem_object_blt code
@ 2021-06-14 16:26   ` Thomas Hellström
  0 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:26 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Thomas Hellström, matthew.auld

It's unused with the exception of the selftests. Replace the call in the
memory_region live selftest with a call to the corresponding function in
the new migrate code.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
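The conversion in the memory_region selftest at the end of this patch boils
down to the pattern below. This is a minimal sketch only: the helper name is
made up, and the error handling and intel_engine_pm_get()/put() bracketing of
the real hunk are left out.

/*
 * Minimal sketch of the fill_blt -> migrate_clear conversion used in
 * igt_lmem_write_cpu below. The helper name is hypothetical; the calls
 * mirror the intel_memory_region.c hunk (which also pulls in
 * "gt/intel_migrate.h"), with error handling and the
 * intel_engine_pm_get()/put() bracketing trimmed for brevity.
 */
static int clear_pages_with_migrate(struct intel_gt *gt,
				    struct drm_i915_gem_object *obj,
				    u32 value)
{
	struct i915_request *rq = NULL;
	int err;

	i915_gem_object_lock(obj, NULL);

	/*
	 * Queue the clear of the object's backing store on the GT's
	 * default migration context.
	 */
	err = intel_context_migrate_clear(gt->migrate.context, NULL,
					  obj->mm.pages->sgl, I915_CACHE_NONE,
					  true /* is_lmem, as in the hunk */,
					  value, &rq);
	if (rq) {
		/* Expose the clear to implicit sync, then drop our reference. */
		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
		i915_request_put(rq);
	}

	i915_gem_object_unlock(obj);
	return err;
}

Installing the request fence as the exclusive fence on the object is what lets
the later i915_gem_object_set_to_wc_domain() call in the selftest wait for the
clear to finish.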
 drivers/gpu/drm/i915/Makefile                 |   1 -
 .../gpu/drm/i915/gem/i915_gem_object_blt.c    | 461 --------------
 .../gpu/drm/i915/gem/i915_gem_object_blt.h    |  39 --
 .../i915/gem/selftests/i915_gem_object_blt.c  | 597 ------------------
 .../drm/i915/selftests/i915_live_selftests.h  |   1 -
 .../drm/i915/selftests/i915_perf_selftests.h  |   1 -
 .../drm/i915/selftests/intel_memory_region.c  |  21 +-
 7 files changed, 14 insertions(+), 1107 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
 delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
 delete mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index ca07474ec2df..13085ac78c63 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -143,7 +143,6 @@ gem-y += \
 	gem/i915_gem_execbuffer.o \
 	gem/i915_gem_internal.o \
 	gem/i915_gem_object.o \
-	gem/i915_gem_object_blt.o \
 	gem/i915_gem_lmem.o \
 	gem/i915_gem_mman.o \
 	gem/i915_gem_pages.o \
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
deleted file mode 100644
index 3e28c68fda3e..000000000000
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
+++ /dev/null
@@ -1,461 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2019 Intel Corporation
- */
-
-#include "i915_drv.h"
-#include "gt/intel_context.h"
-#include "gt/intel_engine_pm.h"
-#include "gt/intel_gpu_commands.h"
-#include "gt/intel_gt.h"
-#include "gt/intel_gt_buffer_pool.h"
-#include "gt/intel_ring.h"
-#include "i915_gem_clflush.h"
-#include "i915_gem_object_blt.h"
-
-struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
-					 struct i915_vma *vma,
-					 struct i915_gem_ww_ctx *ww,
-					 u32 value)
-{
-	struct drm_i915_private *i915 = ce->vm->i915;
-	const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
-	struct intel_gt_buffer_pool_node *pool;
-	struct i915_vma *batch;
-	u64 offset;
-	u64 count;
-	u64 rem;
-	u32 size;
-	u32 *cmd;
-	int err;
-
-	GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
-	intel_engine_pm_get(ce->engine);
-
-	count = div_u64(round_up(vma->size, block_size), block_size);
-	size = (1 + 8 * count) * sizeof(u32);
-	size = round_up(size, PAGE_SIZE);
-	pool = intel_gt_get_buffer_pool(ce->engine->gt, size, I915_MAP_WC);
-	if (IS_ERR(pool)) {
-		err = PTR_ERR(pool);
-		goto out_pm;
-	}
-
-	err = i915_gem_object_lock(pool->obj, ww);
-	if (err)
-		goto out_put;
-
-	batch = i915_vma_instance(pool->obj, ce->vm, NULL);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto out_put;
-	}
-
-	err = i915_vma_pin_ww(batch, ww, 0, 0, PIN_USER);
-	if (unlikely(err))
-		goto out_put;
-
-	/* we pinned the pool, mark it as such */
-	intel_gt_buffer_pool_mark_used(pool);
-
-	cmd = i915_gem_object_pin_map(pool->obj, pool->type);
-	if (IS_ERR(cmd)) {
-		err = PTR_ERR(cmd);
-		goto out_unpin;
-	}
-
-	rem = vma->size;
-	offset = vma->node.start;
-
-	do {
-		u32 size = min_t(u64, rem, block_size);
-
-		GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
-
-		if (GRAPHICS_VER(i915) >= 8) {
-			*cmd++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7 - 2);
-			*cmd++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
-			*cmd++ = 0;
-			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
-			*cmd++ = lower_32_bits(offset);
-			*cmd++ = upper_32_bits(offset);
-			*cmd++ = value;
-		} else {
-			*cmd++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
-			*cmd++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
-			*cmd++ = 0;
-			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
-			*cmd++ = offset;
-			*cmd++ = value;
-		}
-
-		/* Allow ourselves to be preempted in between blocks. */
-		*cmd++ = MI_ARB_CHECK;
-
-		offset += size;
-		rem -= size;
-	} while (rem);
-
-	*cmd = MI_BATCH_BUFFER_END;
-
-	i915_gem_object_flush_map(pool->obj);
-	i915_gem_object_unpin_map(pool->obj);
-
-	intel_gt_chipset_flush(ce->vm->gt);
-
-	batch->private = pool;
-	return batch;
-
-out_unpin:
-	i915_vma_unpin(batch);
-out_put:
-	intel_gt_buffer_pool_put(pool);
-out_pm:
-	intel_engine_pm_put(ce->engine);
-	return ERR_PTR(err);
-}
-
-int intel_emit_vma_mark_active(struct i915_vma *vma, struct i915_request *rq)
-{
-	int err;
-
-	err = i915_request_await_object(rq, vma->obj, false);
-	if (err == 0)
-		err = i915_vma_move_to_active(vma, rq, 0);
-	if (unlikely(err))
-		return err;
-
-	return intel_gt_buffer_pool_mark_active(vma->private, rq);
-}
-
-void intel_emit_vma_release(struct intel_context *ce, struct i915_vma *vma)
-{
-	i915_vma_unpin(vma);
-	intel_gt_buffer_pool_put(vma->private);
-	intel_engine_pm_put(ce->engine);
-}
-
-static int
-move_obj_to_gpu(struct drm_i915_gem_object *obj,
-		struct i915_request *rq,
-		bool write)
-{
-	if (obj->cache_dirty & ~obj->cache_coherent)
-		i915_gem_clflush_object(obj, 0);
-
-	return i915_request_await_object(rq, obj, write);
-}
-
-int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
-			     struct intel_context *ce,
-			     u32 value)
-{
-	struct i915_gem_ww_ctx ww;
-	struct i915_request *rq;
-	struct i915_vma *batch;
-	struct i915_vma *vma;
-	int err;
-
-	vma = i915_vma_instance(obj, ce->vm, NULL);
-	if (IS_ERR(vma))
-		return PTR_ERR(vma);
-
-	i915_gem_ww_ctx_init(&ww, true);
-	intel_engine_pm_get(ce->engine);
-retry:
-	err = i915_gem_object_lock(obj, &ww);
-	if (err)
-		goto out;
-
-	err = intel_context_pin_ww(ce, &ww);
-	if (err)
-		goto out;
-
-	err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_USER);
-	if (err)
-		goto out_ctx;
-
-	batch = intel_emit_vma_fill_blt(ce, vma, &ww, value);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto out_vma;
-	}
-
-	rq = i915_request_create(ce);
-	if (IS_ERR(rq)) {
-		err = PTR_ERR(rq);
-		goto out_batch;
-	}
-
-	err = intel_emit_vma_mark_active(batch, rq);
-	if (unlikely(err))
-		goto out_request;
-
-	err = move_obj_to_gpu(vma->obj, rq, true);
-	if (err == 0)
-		err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
-	if (unlikely(err))
-		goto out_request;
-
-	if (ce->engine->emit_init_breadcrumb)
-		err = ce->engine->emit_init_breadcrumb(rq);
-
-	if (likely(!err))
-		err = ce->engine->emit_bb_start(rq,
-						batch->node.start,
-						batch->node.size,
-						0);
-out_request:
-	if (unlikely(err))
-		i915_request_set_error_once(rq, err);
-
-	i915_request_add(rq);
-out_batch:
-	intel_emit_vma_release(ce, batch);
-out_vma:
-	i915_vma_unpin(vma);
-out_ctx:
-	intel_context_unpin(ce);
-out:
-	if (err == -EDEADLK) {
-		err = i915_gem_ww_ctx_backoff(&ww);
-		if (!err)
-			goto retry;
-	}
-	i915_gem_ww_ctx_fini(&ww);
-	intel_engine_pm_put(ce->engine);
-	return err;
-}
-
-/* Wa_1209644611:icl,ehl */
-static bool wa_1209644611_applies(struct drm_i915_private *i915, u32 size)
-{
-	u32 height = size >> PAGE_SHIFT;
-
-	if (GRAPHICS_VER(i915) != 11)
-		return false;
-
-	return height % 4 == 3 && height <= 8;
-}
-
-struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
-					 struct i915_gem_ww_ctx *ww,
-					 struct i915_vma *src,
-					 struct i915_vma *dst)
-{
-	struct drm_i915_private *i915 = ce->vm->i915;
-	const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
-	struct intel_gt_buffer_pool_node *pool;
-	struct i915_vma *batch;
-	u64 src_offset, dst_offset;
-	u64 count, rem;
-	u32 size, *cmd;
-	int err;
-
-	GEM_BUG_ON(src->size != dst->size);
-
-	GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
-	intel_engine_pm_get(ce->engine);
-
-	count = div_u64(round_up(dst->size, block_size), block_size);
-	size = (1 + 11 * count) * sizeof(u32);
-	size = round_up(size, PAGE_SIZE);
-	pool = intel_gt_get_buffer_pool(ce->engine->gt, size, I915_MAP_WC);
-	if (IS_ERR(pool)) {
-		err = PTR_ERR(pool);
-		goto out_pm;
-	}
-
-	err = i915_gem_object_lock(pool->obj, ww);
-	if (err)
-		goto out_put;
-
-	batch = i915_vma_instance(pool->obj, ce->vm, NULL);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto out_put;
-	}
-
-	err = i915_vma_pin_ww(batch, ww, 0, 0, PIN_USER);
-	if (unlikely(err))
-		goto out_put;
-
-	/* we pinned the pool, mark it as such */
-	intel_gt_buffer_pool_mark_used(pool);
-
-	cmd = i915_gem_object_pin_map(pool->obj, pool->type);
-	if (IS_ERR(cmd)) {
-		err = PTR_ERR(cmd);
-		goto out_unpin;
-	}
-
-	rem = src->size;
-	src_offset = src->node.start;
-	dst_offset = dst->node.start;
-
-	do {
-		size = min_t(u64, rem, block_size);
-		GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
-
-		if (GRAPHICS_VER(i915) >= 9 &&
-		    !wa_1209644611_applies(i915, size)) {
-			*cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
-			*cmd++ = BLT_DEPTH_32 | PAGE_SIZE;
-			*cmd++ = 0;
-			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
-			*cmd++ = lower_32_bits(dst_offset);
-			*cmd++ = upper_32_bits(dst_offset);
-			*cmd++ = 0;
-			*cmd++ = PAGE_SIZE;
-			*cmd++ = lower_32_bits(src_offset);
-			*cmd++ = upper_32_bits(src_offset);
-		} else if (GRAPHICS_VER(i915) >= 8) {
-			*cmd++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (10 - 2);
-			*cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
-			*cmd++ = 0;
-			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
-			*cmd++ = lower_32_bits(dst_offset);
-			*cmd++ = upper_32_bits(dst_offset);
-			*cmd++ = 0;
-			*cmd++ = PAGE_SIZE;
-			*cmd++ = lower_32_bits(src_offset);
-			*cmd++ = upper_32_bits(src_offset);
-		} else {
-			*cmd++ = SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
-			*cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
-			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE;
-			*cmd++ = dst_offset;
-			*cmd++ = PAGE_SIZE;
-			*cmd++ = src_offset;
-		}
-
-		/* Allow ourselves to be preempted in between blocks. */
-		*cmd++ = MI_ARB_CHECK;
-
-		src_offset += size;
-		dst_offset += size;
-		rem -= size;
-	} while (rem);
-
-	*cmd = MI_BATCH_BUFFER_END;
-
-	i915_gem_object_flush_map(pool->obj);
-	i915_gem_object_unpin_map(pool->obj);
-
-	intel_gt_chipset_flush(ce->vm->gt);
-	batch->private = pool;
-	return batch;
-
-out_unpin:
-	i915_vma_unpin(batch);
-out_put:
-	intel_gt_buffer_pool_put(pool);
-out_pm:
-	intel_engine_pm_put(ce->engine);
-	return ERR_PTR(err);
-}
-
-int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
-			     struct drm_i915_gem_object *dst,
-			     struct intel_context *ce)
-{
-	struct i915_address_space *vm = ce->vm;
-	struct i915_vma *vma[2], *batch;
-	struct i915_gem_ww_ctx ww;
-	struct i915_request *rq;
-	int err, i;
-
-	vma[0] = i915_vma_instance(src, vm, NULL);
-	if (IS_ERR(vma[0]))
-		return PTR_ERR(vma[0]);
-
-	vma[1] = i915_vma_instance(dst, vm, NULL);
-	if (IS_ERR(vma[1]))
-		return PTR_ERR(vma[1]);
-
-	i915_gem_ww_ctx_init(&ww, true);
-	intel_engine_pm_get(ce->engine);
-retry:
-	err = i915_gem_object_lock(src, &ww);
-	if (!err)
-		err = i915_gem_object_lock(dst, &ww);
-	if (!err)
-		err = intel_context_pin_ww(ce, &ww);
-	if (err)
-		goto out;
-
-	err = i915_vma_pin_ww(vma[0], &ww, 0, 0, PIN_USER);
-	if (err)
-		goto out_ctx;
-
-	err = i915_vma_pin_ww(vma[1], &ww, 0, 0, PIN_USER);
-	if (unlikely(err))
-		goto out_unpin_src;
-
-	batch = intel_emit_vma_copy_blt(ce, &ww, vma[0], vma[1]);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto out_unpin_dst;
-	}
-
-	rq = i915_request_create(ce);
-	if (IS_ERR(rq)) {
-		err = PTR_ERR(rq);
-		goto out_batch;
-	}
-
-	err = intel_emit_vma_mark_active(batch, rq);
-	if (unlikely(err))
-		goto out_request;
-
-	for (i = 0; i < ARRAY_SIZE(vma); i++) {
-		err = move_obj_to_gpu(vma[i]->obj, rq, i);
-		if (unlikely(err))
-			goto out_request;
-	}
-
-	for (i = 0; i < ARRAY_SIZE(vma); i++) {
-		unsigned int flags = i ? EXEC_OBJECT_WRITE : 0;
-
-		err = i915_vma_move_to_active(vma[i], rq, flags);
-		if (unlikely(err))
-			goto out_request;
-	}
-
-	if (rq->engine->emit_init_breadcrumb) {
-		err = rq->engine->emit_init_breadcrumb(rq);
-		if (unlikely(err))
-			goto out_request;
-	}
-
-	err = rq->engine->emit_bb_start(rq,
-					batch->node.start, batch->node.size,
-					0);
-
-out_request:
-	if (unlikely(err))
-		i915_request_set_error_once(rq, err);
-
-	i915_request_add(rq);
-out_batch:
-	intel_emit_vma_release(ce, batch);
-out_unpin_dst:
-	i915_vma_unpin(vma[1]);
-out_unpin_src:
-	i915_vma_unpin(vma[0]);
-out_ctx:
-	intel_context_unpin(ce);
-out:
-	if (err == -EDEADLK) {
-		err = i915_gem_ww_ctx_backoff(&ww);
-		if (!err)
-			goto retry;
-	}
-	i915_gem_ww_ctx_fini(&ww);
-	intel_engine_pm_put(ce->engine);
-	return err;
-}
-
-#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
-#include "selftests/i915_gem_object_blt.c"
-#endif
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
deleted file mode 100644
index 2409fdcccf0e..000000000000
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
+++ /dev/null
@@ -1,39 +0,0 @@
-/* SPDX-License-Identifier: MIT */
-/*
- * Copyright © 2019 Intel Corporation
- */
-
-#ifndef __I915_GEM_OBJECT_BLT_H__
-#define __I915_GEM_OBJECT_BLT_H__
-
-#include <linux/types.h>
-
-#include "gt/intel_context.h"
-#include "gt/intel_engine_pm.h"
-#include "i915_vma.h"
-
-struct drm_i915_gem_object;
-struct i915_gem_ww_ctx;
-
-struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
-					 struct i915_vma *vma,
-					 struct i915_gem_ww_ctx *ww,
-					 u32 value);
-
-struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
-					 struct i915_gem_ww_ctx *ww,
-					 struct i915_vma *src,
-					 struct i915_vma *dst);
-
-int intel_emit_vma_mark_active(struct i915_vma *vma, struct i915_request *rq);
-void intel_emit_vma_release(struct intel_context *ce, struct i915_vma *vma);
-
-int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
-			     struct intel_context *ce,
-			     u32 value);
-
-int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
-			     struct drm_i915_gem_object *dst,
-			     struct intel_context *ce);
-
-#endif
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
deleted file mode 100644
index 8c335d1a8406..000000000000
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
+++ /dev/null
@@ -1,597 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2019 Intel Corporation
- */
-
-#include <linux/sort.h>
-
-#include "gt/intel_gt.h"
-#include "gt/intel_engine_user.h"
-
-#include "i915_selftest.h"
-
-#include "gem/i915_gem_context.h"
-#include "selftests/igt_flush_test.h"
-#include "selftests/i915_random.h"
-#include "selftests/mock_drm.h"
-#include "huge_gem_object.h"
-#include "mock_context.h"
-
-static int wrap_ktime_compare(const void *A, const void *B)
-{
-	const ktime_t *a = A, *b = B;
-
-	return ktime_compare(*a, *b);
-}
-
-static int __perf_fill_blt(struct drm_i915_gem_object *obj)
-{
-	struct drm_i915_private *i915 = to_i915(obj->base.dev);
-	int inst = 0;
-
-	do {
-		struct intel_engine_cs *engine;
-		ktime_t t[5];
-		int pass;
-		int err;
-
-		engine = intel_engine_lookup_user(i915,
-						  I915_ENGINE_CLASS_COPY,
-						  inst++);
-		if (!engine)
-			return 0;
-
-		intel_engine_pm_get(engine);
-		for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
-			struct intel_context *ce = engine->kernel_context;
-			ktime_t t0, t1;
-
-			t0 = ktime_get();
-
-			err = i915_gem_object_fill_blt(obj, ce, 0);
-			if (err)
-				break;
-
-			err = i915_gem_object_wait(obj,
-						   I915_WAIT_ALL,
-						   MAX_SCHEDULE_TIMEOUT);
-			if (err)
-				break;
-
-			t1 = ktime_get();
-			t[pass] = ktime_sub(t1, t0);
-		}
-		intel_engine_pm_put(engine);
-		if (err)
-			return err;
-
-		sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
-		pr_info("%s: blt %zd KiB fill: %lld MiB/s\n",
-			engine->name,
-			obj->base.size >> 10,
-			div64_u64(mul_u32_u32(4 * obj->base.size,
-					      1000 * 1000 * 1000),
-				  t[1] + 2 * t[2] + t[3]) >> 20);
-	} while (1);
-}
-
-static int perf_fill_blt(void *arg)
-{
-	struct drm_i915_private *i915 = arg;
-	static const unsigned long sizes[] = {
-		SZ_4K,
-		SZ_64K,
-		SZ_2M,
-		SZ_64M
-	};
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(sizes); i++) {
-		struct drm_i915_gem_object *obj;
-		int err;
-
-		obj = i915_gem_object_create_internal(i915, sizes[i]);
-		if (IS_ERR(obj))
-			return PTR_ERR(obj);
-
-		err = __perf_fill_blt(obj);
-		i915_gem_object_put(obj);
-		if (err)
-			return err;
-	}
-
-	return 0;
-}
-
-static int __perf_copy_blt(struct drm_i915_gem_object *src,
-			   struct drm_i915_gem_object *dst)
-{
-	struct drm_i915_private *i915 = to_i915(src->base.dev);
-	int inst = 0;
-
-	do {
-		struct intel_engine_cs *engine;
-		ktime_t t[5];
-		int pass;
-		int err = 0;
-
-		engine = intel_engine_lookup_user(i915,
-						  I915_ENGINE_CLASS_COPY,
-						  inst++);
-		if (!engine)
-			return 0;
-
-		intel_engine_pm_get(engine);
-		for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
-			struct intel_context *ce = engine->kernel_context;
-			ktime_t t0, t1;
-
-			t0 = ktime_get();
-
-			err = i915_gem_object_copy_blt(src, dst, ce);
-			if (err)
-				break;
-
-			err = i915_gem_object_wait(dst,
-						   I915_WAIT_ALL,
-						   MAX_SCHEDULE_TIMEOUT);
-			if (err)
-				break;
-
-			t1 = ktime_get();
-			t[pass] = ktime_sub(t1, t0);
-		}
-		intel_engine_pm_put(engine);
-		if (err)
-			return err;
-
-		sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
-		pr_info("%s: blt %zd KiB copy: %lld MiB/s\n",
-			engine->name,
-			src->base.size >> 10,
-			div64_u64(mul_u32_u32(4 * src->base.size,
-					      1000 * 1000 * 1000),
-				  t[1] + 2 * t[2] + t[3]) >> 20);
-	} while (1);
-}
-
-static int perf_copy_blt(void *arg)
-{
-	struct drm_i915_private *i915 = arg;
-	static const unsigned long sizes[] = {
-		SZ_4K,
-		SZ_64K,
-		SZ_2M,
-		SZ_64M
-	};
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(sizes); i++) {
-		struct drm_i915_gem_object *src, *dst;
-		int err;
-
-		src = i915_gem_object_create_internal(i915, sizes[i]);
-		if (IS_ERR(src))
-			return PTR_ERR(src);
-
-		dst = i915_gem_object_create_internal(i915, sizes[i]);
-		if (IS_ERR(dst)) {
-			err = PTR_ERR(dst);
-			goto err_src;
-		}
-
-		err = __perf_copy_blt(src, dst);
-
-		i915_gem_object_put(dst);
-err_src:
-		i915_gem_object_put(src);
-		if (err)
-			return err;
-	}
-
-	return 0;
-}
-
-struct igt_thread_arg {
-	struct intel_engine_cs *engine;
-	struct i915_gem_context *ctx;
-	struct file *file;
-	struct rnd_state prng;
-	unsigned int n_cpus;
-};
-
-static int igt_fill_blt_thread(void *arg)
-{
-	struct igt_thread_arg *thread = arg;
-	struct intel_engine_cs *engine = thread->engine;
-	struct rnd_state *prng = &thread->prng;
-	struct drm_i915_gem_object *obj;
-	struct i915_gem_context *ctx;
-	struct intel_context *ce;
-	unsigned int prio;
-	IGT_TIMEOUT(end);
-	u64 total, max;
-	int err;
-
-	ctx = thread->ctx;
-	if (!ctx) {
-		ctx = live_context_for_engine(engine, thread->file);
-		if (IS_ERR(ctx))
-			return PTR_ERR(ctx);
-
-		prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
-		ctx->sched.priority = prio;
-	}
-
-	ce = i915_gem_context_get_engine(ctx, 0);
-	GEM_BUG_ON(IS_ERR(ce));
-
-	/*
-	 * If we have a tiny shared address space, like for the GGTT
-	 * then we can't be too greedy.
-	 */
-	max = ce->vm->total;
-	if (i915_is_ggtt(ce->vm) || thread->ctx)
-		max = div_u64(max, thread->n_cpus);
-	max >>= 4;
-
-	total = PAGE_SIZE;
-	do {
-		/* Aim to keep the runtime under reasonable bounds! */
-		const u32 max_phys_size = SZ_64K;
-		u32 val = prandom_u32_state(prng);
-		u32 phys_sz;
-		u32 sz;
-		u32 *vaddr;
-		u32 i;
-
-		total = min(total, max);
-		sz = i915_prandom_u32_max_state(total, prng) + 1;
-		phys_sz = sz % max_phys_size + 1;
-
-		sz = round_up(sz, PAGE_SIZE);
-		phys_sz = round_up(phys_sz, PAGE_SIZE);
-		phys_sz = min(phys_sz, sz);
-
-		pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
-			 phys_sz, sz, val);
-
-		obj = huge_gem_object(engine->i915, phys_sz, sz);
-		if (IS_ERR(obj)) {
-			err = PTR_ERR(obj);
-			goto err_flush;
-		}
-
-		vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
-		if (IS_ERR(vaddr)) {
-			err = PTR_ERR(vaddr);
-			goto err_put;
-		}
-
-		/*
-		 * Make sure the potentially async clflush does its job, if
-		 * required.
-		 */
-		memset32(vaddr, val ^ 0xdeadbeaf,
-			 huge_gem_object_phys_size(obj) / sizeof(u32));
-
-		if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
-			obj->cache_dirty = true;
-
-		err = i915_gem_object_fill_blt(obj, ce, val);
-		if (err)
-			goto err_unpin;
-
-		err = i915_gem_object_wait(obj, 0, MAX_SCHEDULE_TIMEOUT);
-		if (err)
-			goto err_unpin;
-
-		for (i = 0; i < huge_gem_object_phys_size(obj) / sizeof(u32); i += 17) {
-			if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
-				drm_clflush_virt_range(&vaddr[i], sizeof(vaddr[i]));
-
-			if (vaddr[i] != val) {
-				pr_err("vaddr[%u]=%x, expected=%x\n", i,
-				       vaddr[i], val);
-				err = -EINVAL;
-				goto err_unpin;
-			}
-		}
-
-		i915_gem_object_unpin_map(obj);
-		i915_gem_object_put(obj);
-
-		total <<= 1;
-	} while (!time_after(jiffies, end));
-
-	goto err_flush;
-
-err_unpin:
-	i915_gem_object_unpin_map(obj);
-err_put:
-	i915_gem_object_put(obj);
-err_flush:
-	if (err == -ENOMEM)
-		err = 0;
-
-	intel_context_put(ce);
-	return err;
-}
-
-static int igt_copy_blt_thread(void *arg)
-{
-	struct igt_thread_arg *thread = arg;
-	struct intel_engine_cs *engine = thread->engine;
-	struct rnd_state *prng = &thread->prng;
-	struct drm_i915_gem_object *src, *dst;
-	struct i915_gem_context *ctx;
-	struct intel_context *ce;
-	unsigned int prio;
-	IGT_TIMEOUT(end);
-	u64 total, max;
-	int err;
-
-	ctx = thread->ctx;
-	if (!ctx) {
-		ctx = live_context_for_engine(engine, thread->file);
-		if (IS_ERR(ctx))
-			return PTR_ERR(ctx);
-
-		prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
-		ctx->sched.priority = prio;
-	}
-
-	ce = i915_gem_context_get_engine(ctx, 0);
-	GEM_BUG_ON(IS_ERR(ce));
-
-	/*
-	 * If we have a tiny shared address space, like for the GGTT
-	 * then we can't be too greedy.
-	 */
-	max = ce->vm->total;
-	if (i915_is_ggtt(ce->vm) || thread->ctx)
-		max = div_u64(max, thread->n_cpus);
-	max >>= 4;
-
-	total = PAGE_SIZE;
-	do {
-		/* Aim to keep the runtime under reasonable bounds! */
-		const u32 max_phys_size = SZ_64K;
-		u32 val = prandom_u32_state(prng);
-		u32 phys_sz;
-		u32 sz;
-		u32 *vaddr;
-		u32 i;
-
-		total = min(total, max);
-		sz = i915_prandom_u32_max_state(total, prng) + 1;
-		phys_sz = sz % max_phys_size + 1;
-
-		sz = round_up(sz, PAGE_SIZE);
-		phys_sz = round_up(phys_sz, PAGE_SIZE);
-		phys_sz = min(phys_sz, sz);
-
-		pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
-			 phys_sz, sz, val);
-
-		src = huge_gem_object(engine->i915, phys_sz, sz);
-		if (IS_ERR(src)) {
-			err = PTR_ERR(src);
-			goto err_flush;
-		}
-
-		vaddr = i915_gem_object_pin_map_unlocked(src, I915_MAP_WB);
-		if (IS_ERR(vaddr)) {
-			err = PTR_ERR(vaddr);
-			goto err_put_src;
-		}
-
-		memset32(vaddr, val,
-			 huge_gem_object_phys_size(src) / sizeof(u32));
-
-		i915_gem_object_unpin_map(src);
-
-		if (!(src->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
-			src->cache_dirty = true;
-
-		dst = huge_gem_object(engine->i915, phys_sz, sz);
-		if (IS_ERR(dst)) {
-			err = PTR_ERR(dst);
-			goto err_put_src;
-		}
-
-		vaddr = i915_gem_object_pin_map_unlocked(dst, I915_MAP_WB);
-		if (IS_ERR(vaddr)) {
-			err = PTR_ERR(vaddr);
-			goto err_put_dst;
-		}
-
-		memset32(vaddr, val ^ 0xdeadbeaf,
-			 huge_gem_object_phys_size(dst) / sizeof(u32));
-
-		if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
-			dst->cache_dirty = true;
-
-		err = i915_gem_object_copy_blt(src, dst, ce);
-		if (err)
-			goto err_unpin;
-
-		err = i915_gem_object_wait(dst, 0, MAX_SCHEDULE_TIMEOUT);
-		if (err)
-			goto err_unpin;
-
-		for (i = 0; i < huge_gem_object_phys_size(dst) / sizeof(u32); i += 17) {
-			if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
-				drm_clflush_virt_range(&vaddr[i], sizeof(vaddr[i]));
-
-			if (vaddr[i] != val) {
-				pr_err("vaddr[%u]=%x, expected=%x\n", i,
-				       vaddr[i], val);
-				err = -EINVAL;
-				goto err_unpin;
-			}
-		}
-
-		i915_gem_object_unpin_map(dst);
-
-		i915_gem_object_put(src);
-		i915_gem_object_put(dst);
-
-		total <<= 1;
-	} while (!time_after(jiffies, end));
-
-	goto err_flush;
-
-err_unpin:
-	i915_gem_object_unpin_map(dst);
-err_put_dst:
-	i915_gem_object_put(dst);
-err_put_src:
-	i915_gem_object_put(src);
-err_flush:
-	if (err == -ENOMEM)
-		err = 0;
-
-	intel_context_put(ce);
-	return err;
-}
-
-static int igt_threaded_blt(struct intel_engine_cs *engine,
-			    int (*blt_fn)(void *arg),
-			    unsigned int flags)
-#define SINGLE_CTX BIT(0)
-{
-	struct igt_thread_arg *thread;
-	struct task_struct **tsk;
-	unsigned int n_cpus, i;
-	I915_RND_STATE(prng);
-	int err = 0;
-
-	n_cpus = num_online_cpus() + 1;
-
-	tsk = kcalloc(n_cpus, sizeof(struct task_struct *), GFP_KERNEL);
-	if (!tsk)
-		return 0;
-
-	thread = kcalloc(n_cpus, sizeof(struct igt_thread_arg), GFP_KERNEL);
-	if (!thread)
-		goto out_tsk;
-
-	thread[0].file = mock_file(engine->i915);
-	if (IS_ERR(thread[0].file)) {
-		err = PTR_ERR(thread[0].file);
-		goto out_thread;
-	}
-
-	if (flags & SINGLE_CTX) {
-		thread[0].ctx = live_context_for_engine(engine, thread[0].file);
-		if (IS_ERR(thread[0].ctx)) {
-			err = PTR_ERR(thread[0].ctx);
-			goto out_file;
-		}
-	}
-
-	for (i = 0; i < n_cpus; ++i) {
-		thread[i].engine = engine;
-		thread[i].file = thread[0].file;
-		thread[i].ctx = thread[0].ctx;
-		thread[i].n_cpus = n_cpus;
-		thread[i].prng =
-			I915_RND_STATE_INITIALIZER(prandom_u32_state(&prng));
-
-		tsk[i] = kthread_run(blt_fn, &thread[i], "igt/blt-%d", i);
-		if (IS_ERR(tsk[i])) {
-			err = PTR_ERR(tsk[i]);
-			break;
-		}
-
-		get_task_struct(tsk[i]);
-	}
-
-	yield(); /* start all threads before we kthread_stop() */
-
-	for (i = 0; i < n_cpus; ++i) {
-		int status;
-
-		if (IS_ERR_OR_NULL(tsk[i]))
-			continue;
-
-		status = kthread_stop(tsk[i]);
-		if (status && !err)
-			err = status;
-
-		put_task_struct(tsk[i]);
-	}
-
-out_file:
-	fput(thread[0].file);
-out_thread:
-	kfree(thread);
-out_tsk:
-	kfree(tsk);
-	return err;
-}
-
-static int test_copy_engines(struct drm_i915_private *i915,
-			     int (*fn)(void *arg),
-			     unsigned int flags)
-{
-	struct intel_engine_cs *engine;
-	int ret;
-
-	for_each_uabi_class_engine(engine, I915_ENGINE_CLASS_COPY, i915) {
-		ret = igt_threaded_blt(engine, fn, flags);
-		if (ret)
-			return ret;
-	}
-
-	return 0;
-}
-
-static int igt_fill_blt(void *arg)
-{
-	return test_copy_engines(arg, igt_fill_blt_thread, 0);
-}
-
-static int igt_fill_blt_ctx0(void *arg)
-{
-	return test_copy_engines(arg, igt_fill_blt_thread, SINGLE_CTX);
-}
-
-static int igt_copy_blt(void *arg)
-{
-	return test_copy_engines(arg, igt_copy_blt_thread, 0);
-}
-
-static int igt_copy_blt_ctx0(void *arg)
-{
-	return test_copy_engines(arg, igt_copy_blt_thread, SINGLE_CTX);
-}
-
-int i915_gem_object_blt_live_selftests(struct drm_i915_private *i915)
-{
-	static const struct i915_subtest tests[] = {
-		SUBTEST(igt_fill_blt),
-		SUBTEST(igt_fill_blt_ctx0),
-		SUBTEST(igt_copy_blt),
-		SUBTEST(igt_copy_blt_ctx0),
-	};
-
-	if (intel_gt_is_wedged(&i915->gt))
-		return 0;
-
-	return i915_live_subtests(tests, i915);
-}
-
-int i915_gem_object_blt_perf_selftests(struct drm_i915_private *i915)
-{
-	static const struct i915_subtest tests[] = {
-		SUBTEST(perf_fill_blt),
-		SUBTEST(perf_copy_blt),
-	};
-
-	if (intel_gt_is_wedged(&i915->gt))
-		return 0;
-
-	return i915_live_subtests(tests, i915);
-}
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index 6f5893ecd549..1ae3f8039d68 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -39,7 +39,6 @@ selftest(evict, i915_gem_evict_live_selftests)
 selftest(hugepages, i915_gem_huge_page_live_selftests)
 selftest(gem_contexts, i915_gem_context_live_selftests)
 selftest(gem_execbuf, i915_gem_execbuffer_live_selftests)
-selftest(blt, i915_gem_object_blt_live_selftests)
 selftest(reset, intel_reset_live_selftests)
 selftest(memory_region, intel_memory_region_live_selftests)
 selftest(hangcheck, intel_hangcheck_live_selftests)
diff --git a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
index 5077dc3c3b8c..058450d351f7 100644
--- a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
@@ -18,5 +18,4 @@
 selftest(engine_cs, intel_engine_cs_perf_selftests)
 selftest(request, i915_request_perf_selftests)
 selftest(migrate, intel_migrate_perf_selftests)
-selftest(blt, i915_gem_object_blt_perf_selftests)
 selftest(region, intel_memory_region_perf_selftests)
diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
index c85d516b85cd..2e18f3a3d538 100644
--- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
+++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
@@ -15,11 +15,12 @@
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_lmem.h"
 #include "gem/i915_gem_region.h"
-#include "gem/i915_gem_object_blt.h"
 #include "gem/selftests/igt_gem_utils.h"
 #include "gem/selftests/mock_context.h"
+#include "gt/intel_engine_pm.h"
 #include "gt/intel_engine_user.h"
 #include "gt/intel_gt.h"
+#include "gt/intel_migrate.h"
 #include "i915_memcpy.h"
 #include "selftests/igt_flush_test.h"
 #include "selftests/i915_random.h"
@@ -741,6 +742,7 @@ static int igt_lmem_write_cpu(void *arg)
 		PAGE_SIZE - 64,
 	};
 	struct intel_engine_cs *engine;
+	struct i915_request *rq;
 	u32 *vaddr;
 	u32 sz;
 	u32 i;
@@ -767,15 +769,20 @@ static int igt_lmem_write_cpu(void *arg)
 		goto out_put;
 	}
 
+	i915_gem_object_lock(obj, NULL);
 	/* Put the pages into a known state -- from the gpu for added fun */
 	intel_engine_pm_get(engine);
-	err = i915_gem_object_fill_blt(obj, engine->kernel_context, 0xdeadbeaf);
-	intel_engine_pm_put(engine);
-	if (err)
-		goto out_unpin;
+	err = intel_context_migrate_clear(engine->gt->migrate.context, NULL,
+					  obj->mm.pages->sgl, I915_CACHE_NONE,
+					  true, 0xdeadbeaf, &rq);
+	if (rq) {
+		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
+		i915_request_put(rq);
+	}
 
-	i915_gem_object_lock(obj, NULL);
-	err = i915_gem_object_set_to_wc_domain(obj, true);
+	intel_engine_pm_put(engine);
+	if (!err)
+		err = i915_gem_object_set_to_wc_domain(obj, true);
 	i915_gem_object_unlock(obj);
 	if (err)
 		goto out_unpin;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 11/12] drm/i915/gem: Zap the client blt code
  2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:33     ` Matthew Auld
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Auld @ 2021-06-14 16:33 UTC (permalink / raw)
  To: Thomas Hellström, intel-gfx, dri-devel

On 14/06/2021 17:26, Thomas Hellström wrote:
> It's not used anywhere.
> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>

We do have to keep the igt_client_tiled_blits subtest; it's not related to 
the client blitting code and was added afterwards. Not completely sure 
why it's in this file.

With that added back,
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> ---
>   drivers/gpu/drm/i915/Makefile                 |   1 -
>   .../gpu/drm/i915/gem/i915_gem_client_blt.c    | 355 ---------
>   .../gpu/drm/i915/gem/i915_gem_client_blt.h    |  21 -
>   .../i915/gem/selftests/i915_gem_client_blt.c  | 704 ------------------
>   .../drm/i915/selftests/i915_live_selftests.h  |   1 -
>   5 files changed, 1082 deletions(-)
>   delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
>   delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_client_blt.h
>   delete mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index de4cb9c52585..ca07474ec2df 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -136,7 +136,6 @@ i915-y += $(gt-y)
>   gem-y += \
>   	gem/i915_gem_busy.o \
>   	gem/i915_gem_clflush.o \
> -	gem/i915_gem_client_blt.o \
>   	gem/i915_gem_context.o \
>   	gem/i915_gem_create.o \
>   	gem/i915_gem_dmabuf.o \
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
> deleted file mode 100644
> index 44821d94544f..000000000000
> --- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
> +++ /dev/null
> @@ -1,355 +0,0 @@
> -// SPDX-License-Identifier: MIT
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -
> -#include "i915_drv.h"
> -#include "gt/intel_context.h"
> -#include "gt/intel_engine_pm.h"
> -#include "i915_gem_client_blt.h"
> -#include "i915_gem_object_blt.h"
> -
> -struct i915_sleeve {
> -	struct i915_vma *vma;
> -	struct drm_i915_gem_object *obj;
> -	struct sg_table *pages;
> -	struct i915_page_sizes page_sizes;
> -};
> -
> -static int vma_set_pages(struct i915_vma *vma)
> -{
> -	struct i915_sleeve *sleeve = vma->private;
> -
> -	vma->pages = sleeve->pages;
> -	vma->page_sizes = sleeve->page_sizes;
> -
> -	return 0;
> -}
> -
> -static void vma_clear_pages(struct i915_vma *vma)
> -{
> -	GEM_BUG_ON(!vma->pages);
> -	vma->pages = NULL;
> -}
> -
> -static void vma_bind(struct i915_address_space *vm,
> -		     struct i915_vm_pt_stash *stash,
> -		     struct i915_vma *vma,
> -		     enum i915_cache_level cache_level,
> -		     u32 flags)
> -{
> -	vm->vma_ops.bind_vma(vm, stash, vma, cache_level, flags);
> -}
> -
> -static void vma_unbind(struct i915_address_space *vm, struct i915_vma *vma)
> -{
> -	vm->vma_ops.unbind_vma(vm, vma);
> -}
> -
> -static const struct i915_vma_ops proxy_vma_ops = {
> -	.set_pages = vma_set_pages,
> -	.clear_pages = vma_clear_pages,
> -	.bind_vma = vma_bind,
> -	.unbind_vma = vma_unbind,
> -};
> -
> -static struct i915_sleeve *create_sleeve(struct i915_address_space *vm,
> -					 struct drm_i915_gem_object *obj,
> -					 struct sg_table *pages,
> -					 struct i915_page_sizes *page_sizes)
> -{
> -	struct i915_sleeve *sleeve;
> -	struct i915_vma *vma;
> -	int err;
> -
> -	sleeve = kzalloc(sizeof(*sleeve), GFP_KERNEL);
> -	if (!sleeve)
> -		return ERR_PTR(-ENOMEM);
> -
> -	vma = i915_vma_instance(obj, vm, NULL);
> -	if (IS_ERR(vma)) {
> -		err = PTR_ERR(vma);
> -		goto err_free;
> -	}
> -
> -	vma->private = sleeve;
> -	vma->ops = &proxy_vma_ops;
> -
> -	sleeve->vma = vma;
> -	sleeve->pages = pages;
> -	sleeve->page_sizes = *page_sizes;
> -
> -	return sleeve;
> -
> -err_free:
> -	kfree(sleeve);
> -	return ERR_PTR(err);
> -}
> -
> -static void destroy_sleeve(struct i915_sleeve *sleeve)
> -{
> -	kfree(sleeve);
> -}
> -
> -struct clear_pages_work {
> -	struct dma_fence dma;
> -	struct dma_fence_cb cb;
> -	struct i915_sw_fence wait;
> -	struct work_struct work;
> -	struct irq_work irq_work;
> -	struct i915_sleeve *sleeve;
> -	struct intel_context *ce;
> -	u32 value;
> -};
> -
> -static const char *clear_pages_work_driver_name(struct dma_fence *fence)
> -{
> -	return DRIVER_NAME;
> -}
> -
> -static const char *clear_pages_work_timeline_name(struct dma_fence *fence)
> -{
> -	return "clear";
> -}
> -
> -static void clear_pages_work_release(struct dma_fence *fence)
> -{
> -	struct clear_pages_work *w = container_of(fence, typeof(*w), dma);
> -
> -	destroy_sleeve(w->sleeve);
> -
> -	i915_sw_fence_fini(&w->wait);
> -
> -	BUILD_BUG_ON(offsetof(typeof(*w), dma));
> -	dma_fence_free(&w->dma);
> -}
> -
> -static const struct dma_fence_ops clear_pages_work_ops = {
> -	.get_driver_name = clear_pages_work_driver_name,
> -	.get_timeline_name = clear_pages_work_timeline_name,
> -	.release = clear_pages_work_release,
> -};
> -
> -static void clear_pages_signal_irq_worker(struct irq_work *work)
> -{
> -	struct clear_pages_work *w = container_of(work, typeof(*w), irq_work);
> -
> -	dma_fence_signal(&w->dma);
> -	dma_fence_put(&w->dma);
> -}
> -
> -static void clear_pages_dma_fence_cb(struct dma_fence *fence,
> -				     struct dma_fence_cb *cb)
> -{
> -	struct clear_pages_work *w = container_of(cb, typeof(*w), cb);
> -
> -	if (fence->error)
> -		dma_fence_set_error(&w->dma, fence->error);
> -
> -	/*
> -	 * Push the signalling of the fence into yet another worker to avoid
> -	 * the nightmare locking around the fence spinlock.
> -	 */
> -	irq_work_queue(&w->irq_work);
> -}
> -
> -static void clear_pages_worker(struct work_struct *work)
> -{
> -	struct clear_pages_work *w = container_of(work, typeof(*w), work);
> -	struct drm_i915_gem_object *obj = w->sleeve->vma->obj;
> -	struct i915_vma *vma = w->sleeve->vma;
> -	struct i915_gem_ww_ctx ww;
> -	struct i915_request *rq;
> -	struct i915_vma *batch;
> -	int err = w->dma.error;
> -
> -	if (unlikely(err))
> -		goto out_signal;
> -
> -	if (obj->cache_dirty) {
> -		if (i915_gem_object_has_struct_page(obj))
> -			drm_clflush_sg(w->sleeve->pages);
> -		obj->cache_dirty = false;
> -	}
> -	obj->read_domains = I915_GEM_GPU_DOMAINS;
> -	obj->write_domain = 0;
> -
> -	i915_gem_ww_ctx_init(&ww, false);
> -	intel_engine_pm_get(w->ce->engine);
> -retry:
> -	err = intel_context_pin_ww(w->ce, &ww);
> -	if (err)
> -		goto out_signal;
> -
> -	batch = intel_emit_vma_fill_blt(w->ce, vma, &ww, w->value);
> -	if (IS_ERR(batch)) {
> -		err = PTR_ERR(batch);
> -		goto out_ctx;
> -	}
> -
> -	rq = i915_request_create(w->ce);
> -	if (IS_ERR(rq)) {
> -		err = PTR_ERR(rq);
> -		goto out_batch;
> -	}
> -
> -	/* There's no way the fence has signalled */
> -	if (dma_fence_add_callback(&rq->fence, &w->cb,
> -				   clear_pages_dma_fence_cb))
> -		GEM_BUG_ON(1);
> -
> -	err = intel_emit_vma_mark_active(batch, rq);
> -	if (unlikely(err))
> -		goto out_request;
> -
> -	/*
> -	 * w->dma is already exported via (vma|obj)->resv we need only
> -	 * keep track of the GPU activity within this vma/request, and
> -	 * propagate the signal from the request to w->dma.
> -	 */
> -	err = __i915_vma_move_to_active(vma, rq);
> -	if (err)
> -		goto out_request;
> -
> -	if (rq->engine->emit_init_breadcrumb) {
> -		err = rq->engine->emit_init_breadcrumb(rq);
> -		if (unlikely(err))
> -			goto out_request;
> -	}
> -
> -	err = rq->engine->emit_bb_start(rq,
> -					batch->node.start, batch->node.size,
> -					0);
> -out_request:
> -	if (unlikely(err)) {
> -		i915_request_set_error_once(rq, err);
> -		err = 0;
> -	}
> -
> -	i915_request_add(rq);
> -out_batch:
> -	intel_emit_vma_release(w->ce, batch);
> -out_ctx:
> -	intel_context_unpin(w->ce);
> -out_signal:
> -	if (err == -EDEADLK) {
> -		err = i915_gem_ww_ctx_backoff(&ww);
> -		if (!err)
> -			goto retry;
> -	}
> -	i915_gem_ww_ctx_fini(&ww);
> -
> -	i915_vma_unpin(w->sleeve->vma);
> -	intel_engine_pm_put(w->ce->engine);
> -
> -	if (unlikely(err)) {
> -		dma_fence_set_error(&w->dma, err);
> -		dma_fence_signal(&w->dma);
> -		dma_fence_put(&w->dma);
> -	}
> -}
> -
> -static int pin_wait_clear_pages_work(struct clear_pages_work *w,
> -				     struct intel_context *ce)
> -{
> -	struct i915_vma *vma = w->sleeve->vma;
> -	struct i915_gem_ww_ctx ww;
> -	int err;
> -
> -	i915_gem_ww_ctx_init(&ww, false);
> -retry:
> -	err = i915_gem_object_lock(vma->obj, &ww);
> -	if (err)
> -		goto out;
> -
> -	err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_USER);
> -	if (unlikely(err))
> -		goto out;
> -
> -	err = i915_sw_fence_await_reservation(&w->wait,
> -					      vma->obj->base.resv, NULL,
> -					      true, 0, I915_FENCE_GFP);
> -	if (err)
> -		goto err_unpin_vma;
> -
> -	dma_resv_add_excl_fence(vma->obj->base.resv, &w->dma);
> -
> -err_unpin_vma:
> -	if (err)
> -		i915_vma_unpin(vma);
> -out:
> -	if (err == -EDEADLK) {
> -		err = i915_gem_ww_ctx_backoff(&ww);
> -		if (!err)
> -			goto retry;
> -	}
> -	i915_gem_ww_ctx_fini(&ww);
> -	return err;
> -}
> -
> -static int __i915_sw_fence_call
> -clear_pages_work_notify(struct i915_sw_fence *fence,
> -			enum i915_sw_fence_notify state)
> -{
> -	struct clear_pages_work *w = container_of(fence, typeof(*w), wait);
> -
> -	switch (state) {
> -	case FENCE_COMPLETE:
> -		schedule_work(&w->work);
> -		break;
> -
> -	case FENCE_FREE:
> -		dma_fence_put(&w->dma);
> -		break;
> -	}
> -
> -	return NOTIFY_DONE;
> -}
> -
> -static DEFINE_SPINLOCK(fence_lock);
> -
> -/* XXX: better name please */
> -int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj,
> -				     struct intel_context *ce,
> -				     struct sg_table *pages,
> -				     struct i915_page_sizes *page_sizes,
> -				     u32 value)
> -{
> -	struct clear_pages_work *work;
> -	struct i915_sleeve *sleeve;
> -	int err;
> -
> -	sleeve = create_sleeve(ce->vm, obj, pages, page_sizes);
> -	if (IS_ERR(sleeve))
> -		return PTR_ERR(sleeve);
> -
> -	work = kmalloc(sizeof(*work), GFP_KERNEL);
> -	if (!work) {
> -		destroy_sleeve(sleeve);
> -		return -ENOMEM;
> -	}
> -
> -	work->value = value;
> -	work->sleeve = sleeve;
> -	work->ce = ce;
> -
> -	INIT_WORK(&work->work, clear_pages_worker);
> -
> -	init_irq_work(&work->irq_work, clear_pages_signal_irq_worker);
> -
> -	dma_fence_init(&work->dma, &clear_pages_work_ops, &fence_lock, 0, 0);
> -	i915_sw_fence_init(&work->wait, clear_pages_work_notify);
> -
> -	err = pin_wait_clear_pages_work(work, ce);
> -	if (err < 0)
> -		dma_fence_set_error(&work->dma, err);
> -
> -	dma_fence_get(&work->dma);
> -	i915_sw_fence_commit(&work->wait);
> -
> -	return err;
> -}
> -
> -#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> -#include "selftests/i915_gem_client_blt.c"
> -#endif
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.h b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.h
> deleted file mode 100644
> index 3dbd28c22ff5..000000000000
> --- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.h
> +++ /dev/null
> @@ -1,21 +0,0 @@
> -/* SPDX-License-Identifier: MIT */
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -#ifndef __I915_GEM_CLIENT_BLT_H__
> -#define __I915_GEM_CLIENT_BLT_H__
> -
> -#include <linux/types.h>
> -
> -struct drm_i915_gem_object;
> -struct i915_page_sizes;
> -struct intel_context;
> -struct sg_table;
> -
> -int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj,
> -				     struct intel_context *ce,
> -				     struct sg_table *pages,
> -				     struct i915_page_sizes *page_sizes,
> -				     u32 value);
> -
> -#endif
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
> deleted file mode 100644
> index 176e6b22f87f..000000000000
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
> +++ /dev/null
> @@ -1,704 +0,0 @@
> -// SPDX-License-Identifier: MIT
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -
> -#include "i915_selftest.h"
> -
> -#include "gt/intel_engine_user.h"
> -#include "gt/intel_gt.h"
> -#include "gt/intel_gpu_commands.h"
> -#include "gem/i915_gem_lmem.h"
> -
> -#include "selftests/igt_flush_test.h"
> -#include "selftests/mock_drm.h"
> -#include "selftests/i915_random.h"
> -#include "huge_gem_object.h"
> -#include "mock_context.h"
> -
> -static int __igt_client_fill(struct intel_engine_cs *engine)
> -{
> -	struct intel_context *ce = engine->kernel_context;
> -	struct drm_i915_gem_object *obj;
> -	I915_RND_STATE(prng);
> -	IGT_TIMEOUT(end);
> -	u32 *vaddr;
> -	int err = 0;
> -
> -	intel_engine_pm_get(engine);
> -	do {
> -		const u32 max_block_size = S16_MAX * PAGE_SIZE;
> -		u32 sz = min_t(u64, ce->vm->total >> 4, prandom_u32_state(&prng));
> -		u32 phys_sz = sz % (max_block_size + 1);
> -		u32 val = prandom_u32_state(&prng);
> -		u32 i;
> -
> -		sz = round_up(sz, PAGE_SIZE);
> -		phys_sz = round_up(phys_sz, PAGE_SIZE);
> -
> -		pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
> -			 phys_sz, sz, val);
> -
> -		obj = huge_gem_object(engine->i915, phys_sz, sz);
> -		if (IS_ERR(obj)) {
> -			err = PTR_ERR(obj);
> -			goto err_flush;
> -		}
> -
> -		vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
> -		if (IS_ERR(vaddr)) {
> -			err = PTR_ERR(vaddr);
> -			goto err_put;
> -		}
> -
> -		/*
> -		 * XXX: The goal is move this to get_pages, so try to dirty the
> -		 * CPU cache first to check that we do the required clflush
> -		 * before scheduling the blt for !llc platforms. This matches
> -		 * some version of reality where at get_pages the pages
> -		 * themselves may not yet be coherent with the GPU(swap-in). If
> -		 * we are missing the flush then we should see the stale cache
> -		 * values after we do the set_to_cpu_domain and pick it up as a
> -		 * test failure.
> -		 */
> -		memset32(vaddr, val ^ 0xdeadbeaf,
> -			 huge_gem_object_phys_size(obj) / sizeof(u32));
> -
> -		if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
> -			obj->cache_dirty = true;
> -
> -		err = i915_gem_schedule_fill_pages_blt(obj, ce, obj->mm.pages,
> -						       &obj->mm.page_sizes,
> -						       val);
> -		if (err)
> -			goto err_unpin;
> -
> -		i915_gem_object_lock(obj, NULL);
> -		err = i915_gem_object_set_to_cpu_domain(obj, false);
> -		i915_gem_object_unlock(obj);
> -		if (err)
> -			goto err_unpin;
> -
> -		for (i = 0; i < huge_gem_object_phys_size(obj) / sizeof(u32); ++i) {
> -			if (vaddr[i] != val) {
> -				pr_err("vaddr[%u]=%x, expected=%x\n", i,
> -				       vaddr[i], val);
> -				err = -EINVAL;
> -				goto err_unpin;
> -			}
> -		}
> -
> -		i915_gem_object_unpin_map(obj);
> -		i915_gem_object_put(obj);
> -	} while (!time_after(jiffies, end));
> -
> -	goto err_flush;
> -
> -err_unpin:
> -	i915_gem_object_unpin_map(obj);
> -err_put:
> -	i915_gem_object_put(obj);
> -err_flush:
> -	if (err == -ENOMEM)
> -		err = 0;
> -	intel_engine_pm_put(engine);
> -
> -	return err;
> -}
> -
> -static int igt_client_fill(void *arg)
> -{
> -	int inst = 0;
> -
> -	do {
> -		struct intel_engine_cs *engine;
> -		int err;
> -
> -		engine = intel_engine_lookup_user(arg,
> -						  I915_ENGINE_CLASS_COPY,
> -						  inst++);
> -		if (!engine)
> -			return 0;
> -
> -		err = __igt_client_fill(engine);
> -		if (err == -ENOMEM)
> -			err = 0;
> -		if (err)
> -			return err;
> -	} while (1);
> -}
> -
> -#define WIDTH 512
> -#define HEIGHT 32
> -
> -struct blit_buffer {
> -	struct i915_vma *vma;
> -	u32 start_val;
> -	u32 tiling;
> -};
> -
> -struct tiled_blits {
> -	struct intel_context *ce;
> -	struct blit_buffer buffers[3];
> -	struct blit_buffer scratch;
> -	struct i915_vma *batch;
> -	u64 hole;
> -	u32 width;
> -	u32 height;
> -};
> -
> -static int prepare_blit(const struct tiled_blits *t,
> -			struct blit_buffer *dst,
> -			struct blit_buffer *src,
> -			struct drm_i915_gem_object *batch)
> -{
> -	const int ver = GRAPHICS_VER(to_i915(batch->base.dev));
> -	bool use_64b_reloc = ver >= 8;
> -	u32 src_pitch, dst_pitch;
> -	u32 cmd, *cs;
> -
> -	cs = i915_gem_object_pin_map_unlocked(batch, I915_MAP_WC);
> -	if (IS_ERR(cs))
> -		return PTR_ERR(cs);
> -
> -	*cs++ = MI_LOAD_REGISTER_IMM(1);
> -	*cs++ = i915_mmio_reg_offset(BCS_SWCTRL);
> -	cmd = (BCS_SRC_Y | BCS_DST_Y) << 16;
> -	if (src->tiling == I915_TILING_Y)
> -		cmd |= BCS_SRC_Y;
> -	if (dst->tiling == I915_TILING_Y)
> -		cmd |= BCS_DST_Y;
> -	*cs++ = cmd;
> -
> -	cmd = MI_FLUSH_DW;
> -	if (ver >= 8)
> -		cmd++;
> -	*cs++ = cmd;
> -	*cs++ = 0;
> -	*cs++ = 0;
> -	*cs++ = 0;
> -
> -	cmd = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (8 - 2);
> -	if (ver >= 8)
> -		cmd += 2;
> -
> -	src_pitch = t->width * 4;
> -	if (src->tiling) {
> -		cmd |= XY_SRC_COPY_BLT_SRC_TILED;
> -		src_pitch /= 4;
> -	}
> -
> -	dst_pitch = t->width * 4;
> -	if (dst->tiling) {
> -		cmd |= XY_SRC_COPY_BLT_DST_TILED;
> -		dst_pitch /= 4;
> -	}
> -
> -	*cs++ = cmd;
> -	*cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | dst_pitch;
> -	*cs++ = 0;
> -	*cs++ = t->height << 16 | t->width;
> -	*cs++ = lower_32_bits(dst->vma->node.start);
> -	if (use_64b_reloc)
> -		*cs++ = upper_32_bits(dst->vma->node.start);
> -	*cs++ = 0;
> -	*cs++ = src_pitch;
> -	*cs++ = lower_32_bits(src->vma->node.start);
> -	if (use_64b_reloc)
> -		*cs++ = upper_32_bits(src->vma->node.start);
> -
> -	*cs++ = MI_BATCH_BUFFER_END;
> -
> -	i915_gem_object_flush_map(batch);
> -	i915_gem_object_unpin_map(batch);
> -
> -	return 0;
> -}
> -
> -static void tiled_blits_destroy_buffers(struct tiled_blits *t)
> -{
> -	int i;
> -
> -	for (i = 0; i < ARRAY_SIZE(t->buffers); i++)
> -		i915_vma_put(t->buffers[i].vma);
> -
> -	i915_vma_put(t->scratch.vma);
> -	i915_vma_put(t->batch);
> -}
> -
> -static struct i915_vma *
> -__create_vma(struct tiled_blits *t, size_t size, bool lmem)
> -{
> -	struct drm_i915_private *i915 = t->ce->vm->i915;
> -	struct drm_i915_gem_object *obj;
> -	struct i915_vma *vma;
> -
> -	if (lmem)
> -		obj = i915_gem_object_create_lmem(i915, size, 0);
> -	else
> -		obj = i915_gem_object_create_shmem(i915, size);
> -	if (IS_ERR(obj))
> -		return ERR_CAST(obj);
> -
> -	vma = i915_vma_instance(obj, t->ce->vm, NULL);
> -	if (IS_ERR(vma))
> -		i915_gem_object_put(obj);
> -
> -	return vma;
> -}
> -
> -static struct i915_vma *create_vma(struct tiled_blits *t, bool lmem)
> -{
> -	return __create_vma(t, PAGE_ALIGN(t->width * t->height * 4), lmem);
> -}
> -
> -static int tiled_blits_create_buffers(struct tiled_blits *t,
> -				      int width, int height,
> -				      struct rnd_state *prng)
> -{
> -	struct drm_i915_private *i915 = t->ce->engine->i915;
> -	int i;
> -
> -	t->width = width;
> -	t->height = height;
> -
> -	t->batch = __create_vma(t, PAGE_SIZE, false);
> -	if (IS_ERR(t->batch))
> -		return PTR_ERR(t->batch);
> -
> -	t->scratch.vma = create_vma(t, false);
> -	if (IS_ERR(t->scratch.vma)) {
> -		i915_vma_put(t->batch);
> -		return PTR_ERR(t->scratch.vma);
> -	}
> -
> -	for (i = 0; i < ARRAY_SIZE(t->buffers); i++) {
> -		struct i915_vma *vma;
> -
> -		vma = create_vma(t, HAS_LMEM(i915) && i % 2);
> -		if (IS_ERR(vma)) {
> -			tiled_blits_destroy_buffers(t);
> -			return PTR_ERR(vma);
> -		}
> -
> -		t->buffers[i].vma = vma;
> -		t->buffers[i].tiling =
> -			i915_prandom_u32_max_state(I915_TILING_Y + 1, prng);
> -	}
> -
> -	return 0;
> -}
> -
> -static void fill_scratch(struct tiled_blits *t, u32 *vaddr, u32 val)
> -{
> -	int i;
> -
> -	t->scratch.start_val = val;
> -	for (i = 0; i < t->width * t->height; i++)
> -		vaddr[i] = val++;
> -
> -	i915_gem_object_flush_map(t->scratch.vma->obj);
> -}
> -
> -static u64 swizzle_bit(unsigned int bit, u64 offset)
> -{
> -	return (offset & BIT_ULL(bit)) >> (bit - 6);
> -}
> -
> -static u64 tiled_offset(const struct intel_gt *gt,
> -			u64 v,
> -			unsigned int stride,
> -			unsigned int tiling)
> -{
> -	unsigned int swizzle;
> -	u64 x, y;
> -
> -	if (tiling == I915_TILING_NONE)
> -		return v;
> -
> -	y = div64_u64_rem(v, stride, &x);
> -
> -	if (tiling == I915_TILING_X) {
> -		v = div64_u64_rem(y, 8, &y) * stride * 8;
> -		v += y * 512;
> -		v += div64_u64_rem(x, 512, &x) << 12;
> -		v += x;
> -
> -		swizzle = gt->ggtt->bit_6_swizzle_x;
> -	} else {
> -		const unsigned int ytile_span = 16;
> -		const unsigned int ytile_height = 512;
> -
> -		v = div64_u64_rem(y, 32, &y) * stride * 32;
> -		v += y * ytile_span;
> -		v += div64_u64_rem(x, ytile_span, &x) * ytile_height;
> -		v += x;
> -
> -		swizzle = gt->ggtt->bit_6_swizzle_y;
> -	}
> -
> -	switch (swizzle) {
> -	case I915_BIT_6_SWIZZLE_9:
> -		v ^= swizzle_bit(9, v);
> -		break;
> -	case I915_BIT_6_SWIZZLE_9_10:
> -		v ^= swizzle_bit(9, v) ^ swizzle_bit(10, v);
> -		break;
> -	case I915_BIT_6_SWIZZLE_9_11:
> -		v ^= swizzle_bit(9, v) ^ swizzle_bit(11, v);
> -		break;
> -	case I915_BIT_6_SWIZZLE_9_10_11:
> -		v ^= swizzle_bit(9, v) ^ swizzle_bit(10, v) ^ swizzle_bit(11, v);
> -		break;
> -	}
> -
> -	return v;
> -}
> -
> -static const char *repr_tiling(int tiling)
> -{
> -	switch (tiling) {
> -	case I915_TILING_NONE: return "linear";
> -	case I915_TILING_X: return "X";
> -	case I915_TILING_Y: return "Y";
> -	default: return "unknown";
> -	}
> -}
> -
> -static int verify_buffer(const struct tiled_blits *t,
> -			 struct blit_buffer *buf,
> -			 struct rnd_state *prng)
> -{
> -	const u32 *vaddr;
> -	int ret = 0;
> -	int x, y, p;
> -
> -	x = i915_prandom_u32_max_state(t->width, prng);
> -	y = i915_prandom_u32_max_state(t->height, prng);
> -	p = y * t->width + x;
> -
> -	vaddr = i915_gem_object_pin_map_unlocked(buf->vma->obj, I915_MAP_WC);
> -	if (IS_ERR(vaddr))
> -		return PTR_ERR(vaddr);
> -
> -	if (vaddr[0] != buf->start_val) {
> -		ret = -EINVAL;
> -	} else {
> -		u64 v = tiled_offset(buf->vma->vm->gt,
> -				     p * 4, t->width * 4,
> -				     buf->tiling);
> -
> -		if (vaddr[v / sizeof(*vaddr)] != buf->start_val + p)
> -			ret = -EINVAL;
> -	}
> -	if (ret) {
> -		pr_err("Invalid %s tiling detected at (%d, %d), start_val %x\n",
> -		       repr_tiling(buf->tiling),
> -		       x, y, buf->start_val);
> -		igt_hexdump(vaddr, 4096);
> -	}
> -
> -	i915_gem_object_unpin_map(buf->vma->obj);
> -	return ret;
> -}
> -
> -static int move_to_active(struct i915_vma *vma,
> -			  struct i915_request *rq,
> -			  unsigned int flags)
> -{
> -	int err;
> -
> -	i915_vma_lock(vma);
> -	err = i915_request_await_object(rq, vma->obj, false);
> -	if (err == 0)
> -		err = i915_vma_move_to_active(vma, rq, flags);
> -	i915_vma_unlock(vma);
> -
> -	return err;
> -}
> -
> -static int pin_buffer(struct i915_vma *vma, u64 addr)
> -{
> -	int err;
> -
> -	if (drm_mm_node_allocated(&vma->node) && vma->node.start != addr) {
> -		err = i915_vma_unbind(vma);
> -		if (err)
> -			return err;
> -	}
> -
> -	err = i915_vma_pin(vma, 0, 0, PIN_USER | PIN_OFFSET_FIXED | addr);
> -	if (err)
> -		return err;
> -
> -	return 0;
> -}
> -
> -static int
> -tiled_blit(struct tiled_blits *t,
> -	   struct blit_buffer *dst, u64 dst_addr,
> -	   struct blit_buffer *src, u64 src_addr)
> -{
> -	struct i915_request *rq;
> -	int err;
> -
> -	err = pin_buffer(src->vma, src_addr);
> -	if (err) {
> -		pr_err("Cannot pin src @ %llx\n", src_addr);
> -		return err;
> -	}
> -
> -	err = pin_buffer(dst->vma, dst_addr);
> -	if (err) {
> -		pr_err("Cannot pin dst @ %llx\n", dst_addr);
> -		goto err_src;
> -	}
> -
> -	err = i915_vma_pin(t->batch, 0, 0, PIN_USER | PIN_HIGH);
> -	if (err) {
> -		pr_err("cannot pin batch\n");
> -		goto err_dst;
> -	}
> -
> -	err = prepare_blit(t, dst, src, t->batch->obj);
> -	if (err)
> -		goto err_bb;
> -
> -	rq = intel_context_create_request(t->ce);
> -	if (IS_ERR(rq)) {
> -		err = PTR_ERR(rq);
> -		goto err_bb;
> -	}
> -
> -	err = move_to_active(t->batch, rq, 0);
> -	if (!err)
> -		err = move_to_active(src->vma, rq, 0);
> -	if (!err)
> -		err = move_to_active(dst->vma, rq, 0);
> -	if (!err)
> -		err = rq->engine->emit_bb_start(rq,
> -						t->batch->node.start,
> -						t->batch->node.size,
> -						0);
> -	i915_request_get(rq);
> -	i915_request_add(rq);
> -	if (i915_request_wait(rq, 0, HZ / 2) < 0)
> -		err = -ETIME;
> -	i915_request_put(rq);
> -
> -	dst->start_val = src->start_val;
> -err_bb:
> -	i915_vma_unpin(t->batch);
> -err_dst:
> -	i915_vma_unpin(dst->vma);
> -err_src:
> -	i915_vma_unpin(src->vma);
> -	return err;
> -}
> -
> -static struct tiled_blits *
> -tiled_blits_create(struct intel_engine_cs *engine, struct rnd_state *prng)
> -{
> -	struct drm_mm_node hole;
> -	struct tiled_blits *t;
> -	u64 hole_size;
> -	int err;
> -
> -	t = kzalloc(sizeof(*t), GFP_KERNEL);
> -	if (!t)
> -		return ERR_PTR(-ENOMEM);
> -
> -	t->ce = intel_context_create(engine);
> -	if (IS_ERR(t->ce)) {
> -		err = PTR_ERR(t->ce);
> -		goto err_free;
> -	}
> -
> -	hole_size = 2 * PAGE_ALIGN(WIDTH * HEIGHT * 4);
> -	hole_size *= 2; /* room to maneuver */
> -	hole_size += 2 * I915_GTT_MIN_ALIGNMENT;
> -
> -	mutex_lock(&t->ce->vm->mutex);
> -	memset(&hole, 0, sizeof(hole));
> -	err = drm_mm_insert_node_in_range(&t->ce->vm->mm, &hole,
> -					  hole_size, 0, I915_COLOR_UNEVICTABLE,
> -					  0, U64_MAX,
> -					  DRM_MM_INSERT_BEST);
> -	if (!err)
> -		drm_mm_remove_node(&hole);
> -	mutex_unlock(&t->ce->vm->mutex);
> -	if (err) {
> -		err = -ENODEV;
> -		goto err_put;
> -	}
> -
> -	t->hole = hole.start + I915_GTT_MIN_ALIGNMENT;
> -	pr_info("Using hole at %llx\n", t->hole);
> -
> -	err = tiled_blits_create_buffers(t, WIDTH, HEIGHT, prng);
> -	if (err)
> -		goto err_put;
> -
> -	return t;
> -
> -err_put:
> -	intel_context_put(t->ce);
> -err_free:
> -	kfree(t);
> -	return ERR_PTR(err);
> -}
> -
> -static void tiled_blits_destroy(struct tiled_blits *t)
> -{
> -	tiled_blits_destroy_buffers(t);
> -
> -	intel_context_put(t->ce);
> -	kfree(t);
> -}
> -
> -static int tiled_blits_prepare(struct tiled_blits *t,
> -			       struct rnd_state *prng)
> -{
> -	u64 offset = PAGE_ALIGN(t->width * t->height * 4);
> -	u32 *map;
> -	int err;
> -	int i;
> -
> -	map = i915_gem_object_pin_map_unlocked(t->scratch.vma->obj, I915_MAP_WC);
> -	if (IS_ERR(map))
> -		return PTR_ERR(map);
> -
> -	/* Use scratch to fill objects */
> -	for (i = 0; i < ARRAY_SIZE(t->buffers); i++) {
> -		fill_scratch(t, map, prandom_u32_state(prng));
> -		GEM_BUG_ON(verify_buffer(t, &t->scratch, prng));
> -
> -		err = tiled_blit(t,
> -				 &t->buffers[i], t->hole + offset,
> -				 &t->scratch, t->hole);
> -		if (err == 0)
> -			err = verify_buffer(t, &t->buffers[i], prng);
> -		if (err) {
> -			pr_err("Failed to create buffer %d\n", i);
> -			break;
> -		}
> -	}
> -
> -	i915_gem_object_unpin_map(t->scratch.vma->obj);
> -	return err;
> -}
> -
> -static int tiled_blits_bounce(struct tiled_blits *t, struct rnd_state *prng)
> -{
> -	u64 offset =
> -		round_up(t->width * t->height * 4, 2 * I915_GTT_MIN_ALIGNMENT);
> -	int err;
> -
> -	/* We want to check position invariant tiling across GTT eviction */
> -
> -	err = tiled_blit(t,
> -			 &t->buffers[1], t->hole + offset / 2,
> -			 &t->buffers[0], t->hole + 2 * offset);
> -	if (err)
> -		return err;
> -
> -	/* Reposition so that we overlap the old addresses, and slightly off */
> -	err = tiled_blit(t,
> -			 &t->buffers[2], t->hole + I915_GTT_MIN_ALIGNMENT,
> -			 &t->buffers[1], t->hole + 3 * offset / 2);
> -	if (err)
> -		return err;
> -
> -	err = verify_buffer(t, &t->buffers[2], prng);
> -	if (err)
> -		return err;
> -
> -	return 0;
> -}
> -
> -static int __igt_client_tiled_blits(struct intel_engine_cs *engine,
> -				    struct rnd_state *prng)
> -{
> -	struct tiled_blits *t;
> -	int err;
> -
> -	t = tiled_blits_create(engine, prng);
> -	if (IS_ERR(t))
> -		return PTR_ERR(t);
> -
> -	err = tiled_blits_prepare(t, prng);
> -	if (err)
> -		goto out;
> -
> -	err = tiled_blits_bounce(t, prng);
> -	if (err)
> -		goto out;
> -
> -out:
> -	tiled_blits_destroy(t);
> -	return err;
> -}
> -
> -static bool has_bit17_swizzle(int sw)
> -{
> -	return (sw == I915_BIT_6_SWIZZLE_9_10_17 ||
> -		sw == I915_BIT_6_SWIZZLE_9_17);
> -}
> -
> -static bool bad_swizzling(struct drm_i915_private *i915)
> -{
> -	struct i915_ggtt *ggtt = &i915->ggtt;
> -
> -	if (i915->quirks & QUIRK_PIN_SWIZZLED_PAGES)
> -		return true;
> -
> -	if (has_bit17_swizzle(ggtt->bit_6_swizzle_x) ||
> -	    has_bit17_swizzle(ggtt->bit_6_swizzle_y))
> -		return true;
> -
> -	return false;
> -}
> -
> -static int igt_client_tiled_blits(void *arg)
> -{
> -	struct drm_i915_private *i915 = arg;
> -	I915_RND_STATE(prng);
> -	int inst = 0;
> -
> -	/* Test requires explicit BLT tiling controls */
> -	if (GRAPHICS_VER(i915) < 4)
> -		return 0;
> -
> -	if (bad_swizzling(i915)) /* Requires sane (sub-page) swizzling */
> -		return 0;
> -
> -	do {
> -		struct intel_engine_cs *engine;
> -		int err;
> -
> -		engine = intel_engine_lookup_user(i915,
> -						  I915_ENGINE_CLASS_COPY,
> -						  inst++);
> -		if (!engine)
> -			return 0;
> -
> -		err = __igt_client_tiled_blits(engine, &prng);
> -		if (err == -ENODEV)
> -			err = 0;
> -		if (err)
> -			return err;
> -	} while (1);
> -}
> -
> -int i915_gem_client_blt_live_selftests(struct drm_i915_private *i915)
> -{
> -	static const struct i915_subtest tests[] = {
> -		SUBTEST(igt_client_fill),
> -		SUBTEST(igt_client_tiled_blits),
> -	};
> -
> -	if (intel_gt_is_wedged(&i915->gt))
> -		return 0;
> -
> -	return i915_live_subtests(tests, i915);
> -}
> diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> index be5e0191eaea..6f5893ecd549 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> +++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> @@ -40,7 +40,6 @@ selftest(hugepages, i915_gem_huge_page_live_selftests)
>   selftest(gem_contexts, i915_gem_context_live_selftests)
>   selftest(gem_execbuf, i915_gem_execbuffer_live_selftests)
>   selftest(blt, i915_gem_object_blt_live_selftests)
> -selftest(client, i915_gem_client_blt_live_selftests)
>   selftest(reset, intel_reset_live_selftests)
>   selftest(memory_region, intel_memory_region_live_selftests)
>   selftest(hangcheck, intel_hangcheck_live_selftests)
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 11/12] drm/i915/gem: Zap the client blt code
  2021-06-14 16:33     ` [Intel-gfx] " Matthew Auld
@ 2021-06-14 16:40       ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 16:40 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx, dri-devel


On 6/14/21 6:33 PM, Matthew Auld wrote:
> On 14/06/2021 17:26, Thomas Hellström wrote:
>> It's not used anywhere.
>>
>> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>
> We do have to keep igt_client_tiled_blits subtest, it's not related to 
> the client blitting code and was added afterwards. Not completely sure 
> why it's in this file.
>
> With that added back,
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>

OK, I'll add it back.

/Thomas
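
A minimal sketch of what adding it back could look like, assuming the
selftest file is simply retained with only the tiled-blits subtest and the
"client" entry in i915_live_selftests.h is restored; the exact layout is of
course up to the next revision. Everything below comes from the code quoted
above, minus the igt_client_fill subtest:

int i915_gem_client_blt_live_selftests(struct drm_i915_private *i915)
{
	/* Keep only the subtest that does not depend on the client blt code. */
	static const struct i915_subtest tests[] = {
		SUBTEST(igt_client_tiled_blits),
	};

	if (intel_gt_is_wedged(&i915->gt))
		return 0;

	return i915_live_subtests(tests, i915);
}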


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 12/12] drm/i915/gem: Zap the i915_gem_object_blt code
  2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 16:43     ` Matthew Auld
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Auld @ 2021-06-14 16:43 UTC (permalink / raw)
  To: Thomas Hellström, intel-gfx, dri-devel

On 14/06/2021 17:26, Thomas Hellström wrote:
> It's unused with the exception of selftest. Replace a call in the
> memory_region live selftest with a call into a corresponding
> function in the new migrate code.

I guess we do lose some coverage around blitting massively sized GEM 
objects using the huge_gem_object tricks.
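
For context, the "huge_gem_object tricks" refer to the pattern in the removed
selftest threads where only phys_sz bytes get real backing pages while the
object reports a much larger GEM size sz, so a blit can sweep a huge virtual
range cheaply; roughly:

	sz = round_up(sz, PAGE_SIZE);
	phys_sz = min(round_up(phys_sz, PAGE_SIZE), sz);

	/* GEM size sz, but only phys_sz bytes of physical backing */
	obj = huge_gem_object(engine->i915, phys_sz, sz);
	if (IS_ERR(obj))
		return PTR_ERR(obj);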

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/Makefile                 |   1 -
>   .../gpu/drm/i915/gem/i915_gem_object_blt.c    | 461 --------------
>   .../gpu/drm/i915/gem/i915_gem_object_blt.h    |  39 --
>   .../i915/gem/selftests/i915_gem_object_blt.c  | 597 ------------------
>   .../drm/i915/selftests/i915_live_selftests.h  |   1 -
>   .../drm/i915/selftests/i915_perf_selftests.h  |   1 -
>   .../drm/i915/selftests/intel_memory_region.c  |  21 +-
>   7 files changed, 14 insertions(+), 1107 deletions(-)
>   delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
>   delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
>   delete mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index ca07474ec2df..13085ac78c63 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -143,7 +143,6 @@ gem-y += \
>   	gem/i915_gem_execbuffer.o \
>   	gem/i915_gem_internal.o \
>   	gem/i915_gem_object.o \
> -	gem/i915_gem_object_blt.o \
>   	gem/i915_gem_lmem.o \
>   	gem/i915_gem_mman.o \
>   	gem/i915_gem_pages.o \
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
> deleted file mode 100644
> index 3e28c68fda3e..000000000000
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
> +++ /dev/null
> @@ -1,461 +0,0 @@
> -// SPDX-License-Identifier: MIT
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -
> -#include "i915_drv.h"
> -#include "gt/intel_context.h"
> -#include "gt/intel_engine_pm.h"
> -#include "gt/intel_gpu_commands.h"
> -#include "gt/intel_gt.h"
> -#include "gt/intel_gt_buffer_pool.h"
> -#include "gt/intel_ring.h"
> -#include "i915_gem_clflush.h"
> -#include "i915_gem_object_blt.h"
> -
> -struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
> -					 struct i915_vma *vma,
> -					 struct i915_gem_ww_ctx *ww,
> -					 u32 value)
> -{
> -	struct drm_i915_private *i915 = ce->vm->i915;
> -	const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
> -	struct intel_gt_buffer_pool_node *pool;
> -	struct i915_vma *batch;
> -	u64 offset;
> -	u64 count;
> -	u64 rem;
> -	u32 size;
> -	u32 *cmd;
> -	int err;
> -
> -	GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
> -	intel_engine_pm_get(ce->engine);
> -
> -	count = div_u64(round_up(vma->size, block_size), block_size);
> -	size = (1 + 8 * count) * sizeof(u32);
> -	size = round_up(size, PAGE_SIZE);
> -	pool = intel_gt_get_buffer_pool(ce->engine->gt, size, I915_MAP_WC);
> -	if (IS_ERR(pool)) {
> -		err = PTR_ERR(pool);
> -		goto out_pm;
> -	}
> -
> -	err = i915_gem_object_lock(pool->obj, ww);
> -	if (err)
> -		goto out_put;
> -
> -	batch = i915_vma_instance(pool->obj, ce->vm, NULL);
> -	if (IS_ERR(batch)) {
> -		err = PTR_ERR(batch);
> -		goto out_put;
> -	}
> -
> -	err = i915_vma_pin_ww(batch, ww, 0, 0, PIN_USER);
> -	if (unlikely(err))
> -		goto out_put;
> -
> -	/* we pinned the pool, mark it as such */
> -	intel_gt_buffer_pool_mark_used(pool);
> -
> -	cmd = i915_gem_object_pin_map(pool->obj, pool->type);
> -	if (IS_ERR(cmd)) {
> -		err = PTR_ERR(cmd);
> -		goto out_unpin;
> -	}
> -
> -	rem = vma->size;
> -	offset = vma->node.start;
> -
> -	do {
> -		u32 size = min_t(u64, rem, block_size);
> -
> -		GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
> -
> -		if (GRAPHICS_VER(i915) >= 8) {
> -			*cmd++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7 - 2);
> -			*cmd++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
> -			*cmd++ = 0;
> -			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> -			*cmd++ = lower_32_bits(offset);
> -			*cmd++ = upper_32_bits(offset);
> -			*cmd++ = value;
> -		} else {
> -			*cmd++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
> -			*cmd++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
> -			*cmd++ = 0;
> -			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> -			*cmd++ = offset;
> -			*cmd++ = value;
> -		}
> -
> -		/* Allow ourselves to be preempted in between blocks. */
> -		*cmd++ = MI_ARB_CHECK;
> -
> -		offset += size;
> -		rem -= size;
> -	} while (rem);
> -
> -	*cmd = MI_BATCH_BUFFER_END;
> -
> -	i915_gem_object_flush_map(pool->obj);
> -	i915_gem_object_unpin_map(pool->obj);
> -
> -	intel_gt_chipset_flush(ce->vm->gt);
> -
> -	batch->private = pool;
> -	return batch;
> -
> -out_unpin:
> -	i915_vma_unpin(batch);
> -out_put:
> -	intel_gt_buffer_pool_put(pool);
> -out_pm:
> -	intel_engine_pm_put(ce->engine);
> -	return ERR_PTR(err);
> -}
> -
> -int intel_emit_vma_mark_active(struct i915_vma *vma, struct i915_request *rq)
> -{
> -	int err;
> -
> -	err = i915_request_await_object(rq, vma->obj, false);
> -	if (err == 0)
> -		err = i915_vma_move_to_active(vma, rq, 0);
> -	if (unlikely(err))
> -		return err;
> -
> -	return intel_gt_buffer_pool_mark_active(vma->private, rq);
> -}
> -
> -void intel_emit_vma_release(struct intel_context *ce, struct i915_vma *vma)
> -{
> -	i915_vma_unpin(vma);
> -	intel_gt_buffer_pool_put(vma->private);
> -	intel_engine_pm_put(ce->engine);
> -}
> -
> -static int
> -move_obj_to_gpu(struct drm_i915_gem_object *obj,
> -		struct i915_request *rq,
> -		bool write)
> -{
> -	if (obj->cache_dirty & ~obj->cache_coherent)
> -		i915_gem_clflush_object(obj, 0);
> -
> -	return i915_request_await_object(rq, obj, write);
> -}
> -
> -int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
> -			     struct intel_context *ce,
> -			     u32 value)
> -{
> -	struct i915_gem_ww_ctx ww;
> -	struct i915_request *rq;
> -	struct i915_vma *batch;
> -	struct i915_vma *vma;
> -	int err;
> -
> -	vma = i915_vma_instance(obj, ce->vm, NULL);
> -	if (IS_ERR(vma))
> -		return PTR_ERR(vma);
> -
> -	i915_gem_ww_ctx_init(&ww, true);
> -	intel_engine_pm_get(ce->engine);
> -retry:
> -	err = i915_gem_object_lock(obj, &ww);
> -	if (err)
> -		goto out;
> -
> -	err = intel_context_pin_ww(ce, &ww);
> -	if (err)
> -		goto out;
> -
> -	err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_USER);
> -	if (err)
> -		goto out_ctx;
> -
> -	batch = intel_emit_vma_fill_blt(ce, vma, &ww, value);
> -	if (IS_ERR(batch)) {
> -		err = PTR_ERR(batch);
> -		goto out_vma;
> -	}
> -
> -	rq = i915_request_create(ce);
> -	if (IS_ERR(rq)) {
> -		err = PTR_ERR(rq);
> -		goto out_batch;
> -	}
> -
> -	err = intel_emit_vma_mark_active(batch, rq);
> -	if (unlikely(err))
> -		goto out_request;
> -
> -	err = move_obj_to_gpu(vma->obj, rq, true);
> -	if (err == 0)
> -		err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
> -	if (unlikely(err))
> -		goto out_request;
> -
> -	if (ce->engine->emit_init_breadcrumb)
> -		err = ce->engine->emit_init_breadcrumb(rq);
> -
> -	if (likely(!err))
> -		err = ce->engine->emit_bb_start(rq,
> -						batch->node.start,
> -						batch->node.size,
> -						0);
> -out_request:
> -	if (unlikely(err))
> -		i915_request_set_error_once(rq, err);
> -
> -	i915_request_add(rq);
> -out_batch:
> -	intel_emit_vma_release(ce, batch);
> -out_vma:
> -	i915_vma_unpin(vma);
> -out_ctx:
> -	intel_context_unpin(ce);
> -out:
> -	if (err == -EDEADLK) {
> -		err = i915_gem_ww_ctx_backoff(&ww);
> -		if (!err)
> -			goto retry;
> -	}
> -	i915_gem_ww_ctx_fini(&ww);
> -	intel_engine_pm_put(ce->engine);
> -	return err;
> -}
> -
> -/* Wa_1209644611:icl,ehl */
> -static bool wa_1209644611_applies(struct drm_i915_private *i915, u32 size)
> -{
> -	u32 height = size >> PAGE_SHIFT;
> -
> -	if (GRAPHICS_VER(i915) != 11)
> -		return false;
> -
> -	return height % 4 == 3 && height <= 8;
> -}
> -
> -struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
> -					 struct i915_gem_ww_ctx *ww,
> -					 struct i915_vma *src,
> -					 struct i915_vma *dst)
> -{
> -	struct drm_i915_private *i915 = ce->vm->i915;
> -	const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
> -	struct intel_gt_buffer_pool_node *pool;
> -	struct i915_vma *batch;
> -	u64 src_offset, dst_offset;
> -	u64 count, rem;
> -	u32 size, *cmd;
> -	int err;
> -
> -	GEM_BUG_ON(src->size != dst->size);
> -
> -	GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
> -	intel_engine_pm_get(ce->engine);
> -
> -	count = div_u64(round_up(dst->size, block_size), block_size);
> -	size = (1 + 11 * count) * sizeof(u32);
> -	size = round_up(size, PAGE_SIZE);
> -	pool = intel_gt_get_buffer_pool(ce->engine->gt, size, I915_MAP_WC);
> -	if (IS_ERR(pool)) {
> -		err = PTR_ERR(pool);
> -		goto out_pm;
> -	}
> -
> -	err = i915_gem_object_lock(pool->obj, ww);
> -	if (err)
> -		goto out_put;
> -
> -	batch = i915_vma_instance(pool->obj, ce->vm, NULL);
> -	if (IS_ERR(batch)) {
> -		err = PTR_ERR(batch);
> -		goto out_put;
> -	}
> -
> -	err = i915_vma_pin_ww(batch, ww, 0, 0, PIN_USER);
> -	if (unlikely(err))
> -		goto out_put;
> -
> -	/* we pinned the pool, mark it as such */
> -	intel_gt_buffer_pool_mark_used(pool);
> -
> -	cmd = i915_gem_object_pin_map(pool->obj, pool->type);
> -	if (IS_ERR(cmd)) {
> -		err = PTR_ERR(cmd);
> -		goto out_unpin;
> -	}
> -
> -	rem = src->size;
> -	src_offset = src->node.start;
> -	dst_offset = dst->node.start;
> -
> -	do {
> -		size = min_t(u64, rem, block_size);
> -		GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
> -
> -		if (GRAPHICS_VER(i915) >= 9 &&
> -		    !wa_1209644611_applies(i915, size)) {
> -			*cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
> -			*cmd++ = BLT_DEPTH_32 | PAGE_SIZE;
> -			*cmd++ = 0;
> -			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> -			*cmd++ = lower_32_bits(dst_offset);
> -			*cmd++ = upper_32_bits(dst_offset);
> -			*cmd++ = 0;
> -			*cmd++ = PAGE_SIZE;
> -			*cmd++ = lower_32_bits(src_offset);
> -			*cmd++ = upper_32_bits(src_offset);
> -		} else if (GRAPHICS_VER(i915) >= 8) {
> -			*cmd++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (10 - 2);
> -			*cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
> -			*cmd++ = 0;
> -			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> -			*cmd++ = lower_32_bits(dst_offset);
> -			*cmd++ = upper_32_bits(dst_offset);
> -			*cmd++ = 0;
> -			*cmd++ = PAGE_SIZE;
> -			*cmd++ = lower_32_bits(src_offset);
> -			*cmd++ = upper_32_bits(src_offset);
> -		} else {
> -			*cmd++ = SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
> -			*cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
> -			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE;
> -			*cmd++ = dst_offset;
> -			*cmd++ = PAGE_SIZE;
> -			*cmd++ = src_offset;
> -		}
> -
> -		/* Allow ourselves to be preempted in between blocks. */
> -		*cmd++ = MI_ARB_CHECK;
> -
> -		src_offset += size;
> -		dst_offset += size;
> -		rem -= size;
> -	} while (rem);
> -
> -	*cmd = MI_BATCH_BUFFER_END;
> -
> -	i915_gem_object_flush_map(pool->obj);
> -	i915_gem_object_unpin_map(pool->obj);
> -
> -	intel_gt_chipset_flush(ce->vm->gt);
> -	batch->private = pool;
> -	return batch;
> -
> -out_unpin:
> -	i915_vma_unpin(batch);
> -out_put:
> -	intel_gt_buffer_pool_put(pool);
> -out_pm:
> -	intel_engine_pm_put(ce->engine);
> -	return ERR_PTR(err);
> -}
> -
> -int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
> -			     struct drm_i915_gem_object *dst,
> -			     struct intel_context *ce)
> -{
> -	struct i915_address_space *vm = ce->vm;
> -	struct i915_vma *vma[2], *batch;
> -	struct i915_gem_ww_ctx ww;
> -	struct i915_request *rq;
> -	int err, i;
> -
> -	vma[0] = i915_vma_instance(src, vm, NULL);
> -	if (IS_ERR(vma[0]))
> -		return PTR_ERR(vma[0]);
> -
> -	vma[1] = i915_vma_instance(dst, vm, NULL);
> -	if (IS_ERR(vma[1]))
> -		return PTR_ERR(vma[1]);
> -
> -	i915_gem_ww_ctx_init(&ww, true);
> -	intel_engine_pm_get(ce->engine);
> -retry:
> -	err = i915_gem_object_lock(src, &ww);
> -	if (!err)
> -		err = i915_gem_object_lock(dst, &ww);
> -	if (!err)
> -		err = intel_context_pin_ww(ce, &ww);
> -	if (err)
> -		goto out;
> -
> -	err = i915_vma_pin_ww(vma[0], &ww, 0, 0, PIN_USER);
> -	if (err)
> -		goto out_ctx;
> -
> -	err = i915_vma_pin_ww(vma[1], &ww, 0, 0, PIN_USER);
> -	if (unlikely(err))
> -		goto out_unpin_src;
> -
> -	batch = intel_emit_vma_copy_blt(ce, &ww, vma[0], vma[1]);
> -	if (IS_ERR(batch)) {
> -		err = PTR_ERR(batch);
> -		goto out_unpin_dst;
> -	}
> -
> -	rq = i915_request_create(ce);
> -	if (IS_ERR(rq)) {
> -		err = PTR_ERR(rq);
> -		goto out_batch;
> -	}
> -
> -	err = intel_emit_vma_mark_active(batch, rq);
> -	if (unlikely(err))
> -		goto out_request;
> -
> -	for (i = 0; i < ARRAY_SIZE(vma); i++) {
> -		err = move_obj_to_gpu(vma[i]->obj, rq, i);
> -		if (unlikely(err))
> -			goto out_request;
> -	}
> -
> -	for (i = 0; i < ARRAY_SIZE(vma); i++) {
> -		unsigned int flags = i ? EXEC_OBJECT_WRITE : 0;
> -
> -		err = i915_vma_move_to_active(vma[i], rq, flags);
> -		if (unlikely(err))
> -			goto out_request;
> -	}
> -
> -	if (rq->engine->emit_init_breadcrumb) {
> -		err = rq->engine->emit_init_breadcrumb(rq);
> -		if (unlikely(err))
> -			goto out_request;
> -	}
> -
> -	err = rq->engine->emit_bb_start(rq,
> -					batch->node.start, batch->node.size,
> -					0);
> -
> -out_request:
> -	if (unlikely(err))
> -		i915_request_set_error_once(rq, err);
> -
> -	i915_request_add(rq);
> -out_batch:
> -	intel_emit_vma_release(ce, batch);
> -out_unpin_dst:
> -	i915_vma_unpin(vma[1]);
> -out_unpin_src:
> -	i915_vma_unpin(vma[0]);
> -out_ctx:
> -	intel_context_unpin(ce);
> -out:
> -	if (err == -EDEADLK) {
> -		err = i915_gem_ww_ctx_backoff(&ww);
> -		if (!err)
> -			goto retry;
> -	}
> -	i915_gem_ww_ctx_fini(&ww);
> -	intel_engine_pm_put(ce->engine);
> -	return err;
> -}
> -
> -#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> -#include "selftests/i915_gem_object_blt.c"
> -#endif
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
> deleted file mode 100644
> index 2409fdcccf0e..000000000000
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
> +++ /dev/null
> @@ -1,39 +0,0 @@
> -/* SPDX-License-Identifier: MIT */
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -
> -#ifndef __I915_GEM_OBJECT_BLT_H__
> -#define __I915_GEM_OBJECT_BLT_H__
> -
> -#include <linux/types.h>
> -
> -#include "gt/intel_context.h"
> -#include "gt/intel_engine_pm.h"
> -#include "i915_vma.h"
> -
> -struct drm_i915_gem_object;
> -struct i915_gem_ww_ctx;
> -
> -struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
> -					 struct i915_vma *vma,
> -					 struct i915_gem_ww_ctx *ww,
> -					 u32 value);
> -
> -struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
> -					 struct i915_gem_ww_ctx *ww,
> -					 struct i915_vma *src,
> -					 struct i915_vma *dst);
> -
> -int intel_emit_vma_mark_active(struct i915_vma *vma, struct i915_request *rq);
> -void intel_emit_vma_release(struct intel_context *ce, struct i915_vma *vma);
> -
> -int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
> -			     struct intel_context *ce,
> -			     u32 value);
> -
> -int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
> -			     struct drm_i915_gem_object *dst,
> -			     struct intel_context *ce);
> -
> -#endif
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
> deleted file mode 100644
> index 8c335d1a8406..000000000000
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
> +++ /dev/null
> @@ -1,597 +0,0 @@
> -// SPDX-License-Identifier: MIT
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -
> -#include <linux/sort.h>
> -
> -#include "gt/intel_gt.h"
> -#include "gt/intel_engine_user.h"
> -
> -#include "i915_selftest.h"
> -
> -#include "gem/i915_gem_context.h"
> -#include "selftests/igt_flush_test.h"
> -#include "selftests/i915_random.h"
> -#include "selftests/mock_drm.h"
> -#include "huge_gem_object.h"
> -#include "mock_context.h"
> -
> -static int wrap_ktime_compare(const void *A, const void *B)
> -{
> -	const ktime_t *a = A, *b = B;
> -
> -	return ktime_compare(*a, *b);
> -}
> -
> -static int __perf_fill_blt(struct drm_i915_gem_object *obj)
> -{
> -	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> -	int inst = 0;
> -
> -	do {
> -		struct intel_engine_cs *engine;
> -		ktime_t t[5];
> -		int pass;
> -		int err;
> -
> -		engine = intel_engine_lookup_user(i915,
> -						  I915_ENGINE_CLASS_COPY,
> -						  inst++);
> -		if (!engine)
> -			return 0;
> -
> -		intel_engine_pm_get(engine);
> -		for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
> -			struct intel_context *ce = engine->kernel_context;
> -			ktime_t t0, t1;
> -
> -			t0 = ktime_get();
> -
> -			err = i915_gem_object_fill_blt(obj, ce, 0);
> -			if (err)
> -				break;
> -
> -			err = i915_gem_object_wait(obj,
> -						   I915_WAIT_ALL,
> -						   MAX_SCHEDULE_TIMEOUT);
> -			if (err)
> -				break;
> -
> -			t1 = ktime_get();
> -			t[pass] = ktime_sub(t1, t0);
> -		}
> -		intel_engine_pm_put(engine);
> -		if (err)
> -			return err;
> -
> -		sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
> -		pr_info("%s: blt %zd KiB fill: %lld MiB/s\n",
> -			engine->name,
> -			obj->base.size >> 10,
> -			div64_u64(mul_u32_u32(4 * obj->base.size,
> -					      1000 * 1000 * 1000),
> -				  t[1] + 2 * t[2] + t[3]) >> 20);
> -	} while (1);
> -}
> -
> -static int perf_fill_blt(void *arg)
> -{
> -	struct drm_i915_private *i915 = arg;
> -	static const unsigned long sizes[] = {
> -		SZ_4K,
> -		SZ_64K,
> -		SZ_2M,
> -		SZ_64M
> -	};
> -	int i;
> -
> -	for (i = 0; i < ARRAY_SIZE(sizes); i++) {
> -		struct drm_i915_gem_object *obj;
> -		int err;
> -
> -		obj = i915_gem_object_create_internal(i915, sizes[i]);
> -		if (IS_ERR(obj))
> -			return PTR_ERR(obj);
> -
> -		err = __perf_fill_blt(obj);
> -		i915_gem_object_put(obj);
> -		if (err)
> -			return err;
> -	}
> -
> -	return 0;
> -}
> -
> -static int __perf_copy_blt(struct drm_i915_gem_object *src,
> -			   struct drm_i915_gem_object *dst)
> -{
> -	struct drm_i915_private *i915 = to_i915(src->base.dev);
> -	int inst = 0;
> -
> -	do {
> -		struct intel_engine_cs *engine;
> -		ktime_t t[5];
> -		int pass;
> -		int err = 0;
> -
> -		engine = intel_engine_lookup_user(i915,
> -						  I915_ENGINE_CLASS_COPY,
> -						  inst++);
> -		if (!engine)
> -			return 0;
> -
> -		intel_engine_pm_get(engine);
> -		for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
> -			struct intel_context *ce = engine->kernel_context;
> -			ktime_t t0, t1;
> -
> -			t0 = ktime_get();
> -
> -			err = i915_gem_object_copy_blt(src, dst, ce);
> -			if (err)
> -				break;
> -
> -			err = i915_gem_object_wait(dst,
> -						   I915_WAIT_ALL,
> -						   MAX_SCHEDULE_TIMEOUT);
> -			if (err)
> -				break;
> -
> -			t1 = ktime_get();
> -			t[pass] = ktime_sub(t1, t0);
> -		}
> -		intel_engine_pm_put(engine);
> -		if (err)
> -			return err;
> -
> -		sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
> -		pr_info("%s: blt %zd KiB copy: %lld MiB/s\n",
> -			engine->name,
> -			src->base.size >> 10,
> -			div64_u64(mul_u32_u32(4 * src->base.size,
> -					      1000 * 1000 * 1000),
> -				  t[1] + 2 * t[2] + t[3]) >> 20);
> -	} while (1);
> -}
> -
> -static int perf_copy_blt(void *arg)
> -{
> -	struct drm_i915_private *i915 = arg;
> -	static const unsigned long sizes[] = {
> -		SZ_4K,
> -		SZ_64K,
> -		SZ_2M,
> -		SZ_64M
> -	};
> -	int i;
> -
> -	for (i = 0; i < ARRAY_SIZE(sizes); i++) {
> -		struct drm_i915_gem_object *src, *dst;
> -		int err;
> -
> -		src = i915_gem_object_create_internal(i915, sizes[i]);
> -		if (IS_ERR(src))
> -			return PTR_ERR(src);
> -
> -		dst = i915_gem_object_create_internal(i915, sizes[i]);
> -		if (IS_ERR(dst)) {
> -			err = PTR_ERR(dst);
> -			goto err_src;
> -		}
> -
> -		err = __perf_copy_blt(src, dst);
> -
> -		i915_gem_object_put(dst);
> -err_src:
> -		i915_gem_object_put(src);
> -		if (err)
> -			return err;
> -	}
> -
> -	return 0;
> -}
> -
> -struct igt_thread_arg {
> -	struct intel_engine_cs *engine;
> -	struct i915_gem_context *ctx;
> -	struct file *file;
> -	struct rnd_state prng;
> -	unsigned int n_cpus;
> -};
> -
> -static int igt_fill_blt_thread(void *arg)
> -{
> -	struct igt_thread_arg *thread = arg;
> -	struct intel_engine_cs *engine = thread->engine;
> -	struct rnd_state *prng = &thread->prng;
> -	struct drm_i915_gem_object *obj;
> -	struct i915_gem_context *ctx;
> -	struct intel_context *ce;
> -	unsigned int prio;
> -	IGT_TIMEOUT(end);
> -	u64 total, max;
> -	int err;
> -
> -	ctx = thread->ctx;
> -	if (!ctx) {
> -		ctx = live_context_for_engine(engine, thread->file);
> -		if (IS_ERR(ctx))
> -			return PTR_ERR(ctx);
> -
> -		prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
> -		ctx->sched.priority = prio;
> -	}
> -
> -	ce = i915_gem_context_get_engine(ctx, 0);
> -	GEM_BUG_ON(IS_ERR(ce));
> -
> -	/*
> -	 * If we have a tiny shared address space, like for the GGTT
> -	 * then we can't be too greedy.
> -	 */
> -	max = ce->vm->total;
> -	if (i915_is_ggtt(ce->vm) || thread->ctx)
> -		max = div_u64(max, thread->n_cpus);
> -	max >>= 4;
> -
> -	total = PAGE_SIZE;
> -	do {
> -		/* Aim to keep the runtime under reasonable bounds! */
> -		const u32 max_phys_size = SZ_64K;
> -		u32 val = prandom_u32_state(prng);
> -		u32 phys_sz;
> -		u32 sz;
> -		u32 *vaddr;
> -		u32 i;
> -
> -		total = min(total, max);
> -		sz = i915_prandom_u32_max_state(total, prng) + 1;
> -		phys_sz = sz % max_phys_size + 1;
> -
> -		sz = round_up(sz, PAGE_SIZE);
> -		phys_sz = round_up(phys_sz, PAGE_SIZE);
> -		phys_sz = min(phys_sz, sz);
> -
> -		pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
> -			 phys_sz, sz, val);
> -
> -		obj = huge_gem_object(engine->i915, phys_sz, sz);
> -		if (IS_ERR(obj)) {
> -			err = PTR_ERR(obj);
> -			goto err_flush;
> -		}
> -
> -		vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
> -		if (IS_ERR(vaddr)) {
> -			err = PTR_ERR(vaddr);
> -			goto err_put;
> -		}
> -
> -		/*
> -		 * Make sure the potentially async clflush does its job, if
> -		 * required.
> -		 */
> -		memset32(vaddr, val ^ 0xdeadbeaf,
> -			 huge_gem_object_phys_size(obj) / sizeof(u32));
> -
> -		if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
> -			obj->cache_dirty = true;
> -
> -		err = i915_gem_object_fill_blt(obj, ce, val);
> -		if (err)
> -			goto err_unpin;
> -
> -		err = i915_gem_object_wait(obj, 0, MAX_SCHEDULE_TIMEOUT);
> -		if (err)
> -			goto err_unpin;
> -
> -		for (i = 0; i < huge_gem_object_phys_size(obj) / sizeof(u32); i += 17) {
> -			if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
> -				drm_clflush_virt_range(&vaddr[i], sizeof(vaddr[i]));
> -
> -			if (vaddr[i] != val) {
> -				pr_err("vaddr[%u]=%x, expected=%x\n", i,
> -				       vaddr[i], val);
> -				err = -EINVAL;
> -				goto err_unpin;
> -			}
> -		}
> -
> -		i915_gem_object_unpin_map(obj);
> -		i915_gem_object_put(obj);
> -
> -		total <<= 1;
> -	} while (!time_after(jiffies, end));
> -
> -	goto err_flush;
> -
> -err_unpin:
> -	i915_gem_object_unpin_map(obj);
> -err_put:
> -	i915_gem_object_put(obj);
> -err_flush:
> -	if (err == -ENOMEM)
> -		err = 0;
> -
> -	intel_context_put(ce);
> -	return err;
> -}
> -
> -static int igt_copy_blt_thread(void *arg)
> -{
> -	struct igt_thread_arg *thread = arg;
> -	struct intel_engine_cs *engine = thread->engine;
> -	struct rnd_state *prng = &thread->prng;
> -	struct drm_i915_gem_object *src, *dst;
> -	struct i915_gem_context *ctx;
> -	struct intel_context *ce;
> -	unsigned int prio;
> -	IGT_TIMEOUT(end);
> -	u64 total, max;
> -	int err;
> -
> -	ctx = thread->ctx;
> -	if (!ctx) {
> -		ctx = live_context_for_engine(engine, thread->file);
> -		if (IS_ERR(ctx))
> -			return PTR_ERR(ctx);
> -
> -		prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
> -		ctx->sched.priority = prio;
> -	}
> -
> -	ce = i915_gem_context_get_engine(ctx, 0);
> -	GEM_BUG_ON(IS_ERR(ce));
> -
> -	/*
> -	 * If we have a tiny shared address space, like for the GGTT
> -	 * then we can't be too greedy.
> -	 */
> -	max = ce->vm->total;
> -	if (i915_is_ggtt(ce->vm) || thread->ctx)
> -		max = div_u64(max, thread->n_cpus);
> -	max >>= 4;
> -
> -	total = PAGE_SIZE;
> -	do {
> -		/* Aim to keep the runtime under reasonable bounds! */
> -		const u32 max_phys_size = SZ_64K;
> -		u32 val = prandom_u32_state(prng);
> -		u32 phys_sz;
> -		u32 sz;
> -		u32 *vaddr;
> -		u32 i;
> -
> -		total = min(total, max);
> -		sz = i915_prandom_u32_max_state(total, prng) + 1;
> -		phys_sz = sz % max_phys_size + 1;
> -
> -		sz = round_up(sz, PAGE_SIZE);
> -		phys_sz = round_up(phys_sz, PAGE_SIZE);
> -		phys_sz = min(phys_sz, sz);
> -
> -		pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
> -			 phys_sz, sz, val);
> -
> -		src = huge_gem_object(engine->i915, phys_sz, sz);
> -		if (IS_ERR(src)) {
> -			err = PTR_ERR(src);
> -			goto err_flush;
> -		}
> -
> -		vaddr = i915_gem_object_pin_map_unlocked(src, I915_MAP_WB);
> -		if (IS_ERR(vaddr)) {
> -			err = PTR_ERR(vaddr);
> -			goto err_put_src;
> -		}
> -
> -		memset32(vaddr, val,
> -			 huge_gem_object_phys_size(src) / sizeof(u32));
> -
> -		i915_gem_object_unpin_map(src);
> -
> -		if (!(src->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
> -			src->cache_dirty = true;
> -
> -		dst = huge_gem_object(engine->i915, phys_sz, sz);
> -		if (IS_ERR(dst)) {
> -			err = PTR_ERR(dst);
> -			goto err_put_src;
> -		}
> -
> -		vaddr = i915_gem_object_pin_map_unlocked(dst, I915_MAP_WB);
> -		if (IS_ERR(vaddr)) {
> -			err = PTR_ERR(vaddr);
> -			goto err_put_dst;
> -		}
> -
> -		memset32(vaddr, val ^ 0xdeadbeaf,
> -			 huge_gem_object_phys_size(dst) / sizeof(u32));
> -
> -		if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
> -			dst->cache_dirty = true;
> -
> -		err = i915_gem_object_copy_blt(src, dst, ce);
> -		if (err)
> -			goto err_unpin;
> -
> -		err = i915_gem_object_wait(dst, 0, MAX_SCHEDULE_TIMEOUT);
> -		if (err)
> -			goto err_unpin;
> -
> -		for (i = 0; i < huge_gem_object_phys_size(dst) / sizeof(u32); i += 17) {
> -			if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
> -				drm_clflush_virt_range(&vaddr[i], sizeof(vaddr[i]));
> -
> -			if (vaddr[i] != val) {
> -				pr_err("vaddr[%u]=%x, expected=%x\n", i,
> -				       vaddr[i], val);
> -				err = -EINVAL;
> -				goto err_unpin;
> -			}
> -		}
> -
> -		i915_gem_object_unpin_map(dst);
> -
> -		i915_gem_object_put(src);
> -		i915_gem_object_put(dst);
> -
> -		total <<= 1;
> -	} while (!time_after(jiffies, end));
> -
> -	goto err_flush;
> -
> -err_unpin:
> -	i915_gem_object_unpin_map(dst);
> -err_put_dst:
> -	i915_gem_object_put(dst);
> -err_put_src:
> -	i915_gem_object_put(src);
> -err_flush:
> -	if (err == -ENOMEM)
> -		err = 0;
> -
> -	intel_context_put(ce);
> -	return err;
> -}
> -
> -static int igt_threaded_blt(struct intel_engine_cs *engine,
> -			    int (*blt_fn)(void *arg),
> -			    unsigned int flags)
> -#define SINGLE_CTX BIT(0)
> -{
> -	struct igt_thread_arg *thread;
> -	struct task_struct **tsk;
> -	unsigned int n_cpus, i;
> -	I915_RND_STATE(prng);
> -	int err = 0;
> -
> -	n_cpus = num_online_cpus() + 1;
> -
> -	tsk = kcalloc(n_cpus, sizeof(struct task_struct *), GFP_KERNEL);
> -	if (!tsk)
> -		return 0;
> -
> -	thread = kcalloc(n_cpus, sizeof(struct igt_thread_arg), GFP_KERNEL);
> -	if (!thread)
> -		goto out_tsk;
> -
> -	thread[0].file = mock_file(engine->i915);
> -	if (IS_ERR(thread[0].file)) {
> -		err = PTR_ERR(thread[0].file);
> -		goto out_thread;
> -	}
> -
> -	if (flags & SINGLE_CTX) {
> -		thread[0].ctx = live_context_for_engine(engine, thread[0].file);
> -		if (IS_ERR(thread[0].ctx)) {
> -			err = PTR_ERR(thread[0].ctx);
> -			goto out_file;
> -		}
> -	}
> -
> -	for (i = 0; i < n_cpus; ++i) {
> -		thread[i].engine = engine;
> -		thread[i].file = thread[0].file;
> -		thread[i].ctx = thread[0].ctx;
> -		thread[i].n_cpus = n_cpus;
> -		thread[i].prng =
> -			I915_RND_STATE_INITIALIZER(prandom_u32_state(&prng));
> -
> -		tsk[i] = kthread_run(blt_fn, &thread[i], "igt/blt-%d", i);
> -		if (IS_ERR(tsk[i])) {
> -			err = PTR_ERR(tsk[i]);
> -			break;
> -		}
> -
> -		get_task_struct(tsk[i]);
> -	}
> -
> -	yield(); /* start all threads before we kthread_stop() */
> -
> -	for (i = 0; i < n_cpus; ++i) {
> -		int status;
> -
> -		if (IS_ERR_OR_NULL(tsk[i]))
> -			continue;
> -
> -		status = kthread_stop(tsk[i]);
> -		if (status && !err)
> -			err = status;
> -
> -		put_task_struct(tsk[i]);
> -	}
> -
> -out_file:
> -	fput(thread[0].file);
> -out_thread:
> -	kfree(thread);
> -out_tsk:
> -	kfree(tsk);
> -	return err;
> -}
> -
> -static int test_copy_engines(struct drm_i915_private *i915,
> -			     int (*fn)(void *arg),
> -			     unsigned int flags)
> -{
> -	struct intel_engine_cs *engine;
> -	int ret;
> -
> -	for_each_uabi_class_engine(engine, I915_ENGINE_CLASS_COPY, i915) {
> -		ret = igt_threaded_blt(engine, fn, flags);
> -		if (ret)
> -			return ret;
> -	}
> -
> -	return 0;
> -}
> -
> -static int igt_fill_blt(void *arg)
> -{
> -	return test_copy_engines(arg, igt_fill_blt_thread, 0);
> -}
> -
> -static int igt_fill_blt_ctx0(void *arg)
> -{
> -	return test_copy_engines(arg, igt_fill_blt_thread, SINGLE_CTX);
> -}
> -
> -static int igt_copy_blt(void *arg)
> -{
> -	return test_copy_engines(arg, igt_copy_blt_thread, 0);
> -}
> -
> -static int igt_copy_blt_ctx0(void *arg)
> -{
> -	return test_copy_engines(arg, igt_copy_blt_thread, SINGLE_CTX);
> -}
> -
> -int i915_gem_object_blt_live_selftests(struct drm_i915_private *i915)
> -{
> -	static const struct i915_subtest tests[] = {
> -		SUBTEST(igt_fill_blt),
> -		SUBTEST(igt_fill_blt_ctx0),
> -		SUBTEST(igt_copy_blt),
> -		SUBTEST(igt_copy_blt_ctx0),
> -	};
> -
> -	if (intel_gt_is_wedged(&i915->gt))
> -		return 0;
> -
> -	return i915_live_subtests(tests, i915);
> -}
> -
> -int i915_gem_object_blt_perf_selftests(struct drm_i915_private *i915)
> -{
> -	static const struct i915_subtest tests[] = {
> -		SUBTEST(perf_fill_blt),
> -		SUBTEST(perf_copy_blt),
> -	};
> -
> -	if (intel_gt_is_wedged(&i915->gt))
> -		return 0;
> -
> -	return i915_live_subtests(tests, i915);
> -}
> diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> index 6f5893ecd549..1ae3f8039d68 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> +++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> @@ -39,7 +39,6 @@ selftest(evict, i915_gem_evict_live_selftests)
>   selftest(hugepages, i915_gem_huge_page_live_selftests)
>   selftest(gem_contexts, i915_gem_context_live_selftests)
>   selftest(gem_execbuf, i915_gem_execbuffer_live_selftests)
> -selftest(blt, i915_gem_object_blt_live_selftests)
>   selftest(reset, intel_reset_live_selftests)
>   selftest(memory_region, intel_memory_region_live_selftests)
>   selftest(hangcheck, intel_hangcheck_live_selftests)
> diff --git a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
> index 5077dc3c3b8c..058450d351f7 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
> +++ b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
> @@ -18,5 +18,4 @@
>   selftest(engine_cs, intel_engine_cs_perf_selftests)
>   selftest(request, i915_request_perf_selftests)
>   selftest(migrate, intel_migrate_perf_selftests)
> -selftest(blt, i915_gem_object_blt_perf_selftests)
>   selftest(region, intel_memory_region_perf_selftests)
> diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> index c85d516b85cd..2e18f3a3d538 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> @@ -15,11 +15,12 @@
>   #include "gem/i915_gem_context.h"
>   #include "gem/i915_gem_lmem.h"
>   #include "gem/i915_gem_region.h"
> -#include "gem/i915_gem_object_blt.h"
>   #include "gem/selftests/igt_gem_utils.h"
>   #include "gem/selftests/mock_context.h"
> +#include "gt/intel_engine_pm.h"
>   #include "gt/intel_engine_user.h"
>   #include "gt/intel_gt.h"
> +#include "gt/intel_migrate.h"
>   #include "i915_memcpy.h"
>   #include "selftests/igt_flush_test.h"
>   #include "selftests/i915_random.h"
> @@ -741,6 +742,7 @@ static int igt_lmem_write_cpu(void *arg)
>   		PAGE_SIZE - 64,
>   	};
>   	struct intel_engine_cs *engine;
> +	struct i915_request *rq;
>   	u32 *vaddr;
>   	u32 sz;
>   	u32 i;
> @@ -767,15 +769,20 @@ static int igt_lmem_write_cpu(void *arg)
>   		goto out_put;
>   	}
>   
> +	i915_gem_object_lock(obj, NULL);
>   	/* Put the pages into a known state -- from the gpu for added fun */
>   	intel_engine_pm_get(engine);
> -	err = i915_gem_object_fill_blt(obj, engine->kernel_context, 0xdeadbeaf);
> -	intel_engine_pm_put(engine);
> -	if (err)
> -		goto out_unpin;
> +	err = intel_context_migrate_clear(engine->gt->migrate.context, NULL,
> +					  obj->mm.pages->sgl, I915_CACHE_NONE,
> +					  true, 0xdeadbeaf, &rq);
> +	if (rq) {
> +		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
> +		i915_request_put(rq);
> +	}
>   
> -	i915_gem_object_lock(obj, NULL);
> -	err = i915_gem_object_set_to_wc_domain(obj, true);
> +	intel_engine_pm_put(engine);
> +	if (!err)
> +		err = i915_gem_object_set_to_wc_domain(obj, true);
>   	i915_gem_object_unlock(obj);
>   	if (err)
>   		goto out_unpin;
> 
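
The hunk above boils down to clearing the object's backing store through the
GT's migrate context and then attaching the returned fence to the object's
reservation, so later CPU access waits for the blitter. A condensed sketch of
that call sequence (API names taken from the hunk, helper name made up for
illustration, error handling trimmed):

static int clear_with_migrate(struct intel_engine_cs *engine,
			      struct drm_i915_gem_object *obj, u32 value)
{
	struct i915_request *rq = NULL;
	int err;

	i915_gem_object_lock(obj, NULL);
	intel_engine_pm_get(engine);

	/* Clear the object's pages using the GT's default migrate context. */
	err = intel_context_migrate_clear(engine->gt->migrate.context, NULL,
					  obj->mm.pages->sgl, I915_CACHE_NONE,
					  true /* is_lmem */, value, &rq);
	if (rq) {
		/* Order later readers/writers after the blitter clear. */
		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
		i915_request_put(rq);
	}

	intel_engine_pm_put(engine);
	i915_gem_object_unlock(obj);

	return err;
}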

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Intel-gfx] [PATCH v3 12/12] drm/i915/gem: Zap the i915_gem_object_blt code
@ 2021-06-14 16:43     ` Matthew Auld
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Auld @ 2021-06-14 16:43 UTC (permalink / raw)
  To: Thomas Hellström, intel-gfx, dri-devel

On 14/06/2021 17:26, Thomas Hellström wrote:
> It's unused with the exception of selftest. Replace a call in the
> memory_region live selftest with a call into a corresponding
> function in the new migrate code.

I guess we do lose some coverage around blitting massively sized GEM 
objects using the huge_gem_object tricks.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> 
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/Makefile                 |   1 -
>   .../gpu/drm/i915/gem/i915_gem_object_blt.c    | 461 --------------
>   .../gpu/drm/i915/gem/i915_gem_object_blt.h    |  39 --
>   .../i915/gem/selftests/i915_gem_object_blt.c  | 597 ------------------
>   .../drm/i915/selftests/i915_live_selftests.h  |   1 -
>   .../drm/i915/selftests/i915_perf_selftests.h  |   1 -
>   .../drm/i915/selftests/intel_memory_region.c  |  21 +-
>   7 files changed, 14 insertions(+), 1107 deletions(-)
>   delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
>   delete mode 100644 drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
>   delete mode 100644 drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index ca07474ec2df..13085ac78c63 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -143,7 +143,6 @@ gem-y += \
>   	gem/i915_gem_execbuffer.o \
>   	gem/i915_gem_internal.o \
>   	gem/i915_gem_object.o \
> -	gem/i915_gem_object_blt.o \
>   	gem/i915_gem_lmem.o \
>   	gem/i915_gem_mman.o \
>   	gem/i915_gem_pages.o \
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
> deleted file mode 100644
> index 3e28c68fda3e..000000000000
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.c
> +++ /dev/null
> @@ -1,461 +0,0 @@
> -// SPDX-License-Identifier: MIT
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -
> -#include "i915_drv.h"
> -#include "gt/intel_context.h"
> -#include "gt/intel_engine_pm.h"
> -#include "gt/intel_gpu_commands.h"
> -#include "gt/intel_gt.h"
> -#include "gt/intel_gt_buffer_pool.h"
> -#include "gt/intel_ring.h"
> -#include "i915_gem_clflush.h"
> -#include "i915_gem_object_blt.h"
> -
> -struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
> -					 struct i915_vma *vma,
> -					 struct i915_gem_ww_ctx *ww,
> -					 u32 value)
> -{
> -	struct drm_i915_private *i915 = ce->vm->i915;
> -	const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
> -	struct intel_gt_buffer_pool_node *pool;
> -	struct i915_vma *batch;
> -	u64 offset;
> -	u64 count;
> -	u64 rem;
> -	u32 size;
> -	u32 *cmd;
> -	int err;
> -
> -	GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
> -	intel_engine_pm_get(ce->engine);
> -
> -	count = div_u64(round_up(vma->size, block_size), block_size);
> -	size = (1 + 8 * count) * sizeof(u32);
> -	size = round_up(size, PAGE_SIZE);
> -	pool = intel_gt_get_buffer_pool(ce->engine->gt, size, I915_MAP_WC);
> -	if (IS_ERR(pool)) {
> -		err = PTR_ERR(pool);
> -		goto out_pm;
> -	}
> -
> -	err = i915_gem_object_lock(pool->obj, ww);
> -	if (err)
> -		goto out_put;
> -
> -	batch = i915_vma_instance(pool->obj, ce->vm, NULL);
> -	if (IS_ERR(batch)) {
> -		err = PTR_ERR(batch);
> -		goto out_put;
> -	}
> -
> -	err = i915_vma_pin_ww(batch, ww, 0, 0, PIN_USER);
> -	if (unlikely(err))
> -		goto out_put;
> -
> -	/* we pinned the pool, mark it as such */
> -	intel_gt_buffer_pool_mark_used(pool);
> -
> -	cmd = i915_gem_object_pin_map(pool->obj, pool->type);
> -	if (IS_ERR(cmd)) {
> -		err = PTR_ERR(cmd);
> -		goto out_unpin;
> -	}
> -
> -	rem = vma->size;
> -	offset = vma->node.start;
> -
> -	do {
> -		u32 size = min_t(u64, rem, block_size);
> -
> -		GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
> -
> -		if (GRAPHICS_VER(i915) >= 8) {
> -			*cmd++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7 - 2);
> -			*cmd++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
> -			*cmd++ = 0;
> -			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> -			*cmd++ = lower_32_bits(offset);
> -			*cmd++ = upper_32_bits(offset);
> -			*cmd++ = value;
> -		} else {
> -			*cmd++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
> -			*cmd++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
> -			*cmd++ = 0;
> -			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> -			*cmd++ = offset;
> -			*cmd++ = value;
> -		}
> -
> -		/* Allow ourselves to be preempted in between blocks. */
> -		*cmd++ = MI_ARB_CHECK;
> -
> -		offset += size;
> -		rem -= size;
> -	} while (rem);
> -
> -	*cmd = MI_BATCH_BUFFER_END;
> -
> -	i915_gem_object_flush_map(pool->obj);
> -	i915_gem_object_unpin_map(pool->obj);
> -
> -	intel_gt_chipset_flush(ce->vm->gt);
> -
> -	batch->private = pool;
> -	return batch;
> -
> -out_unpin:
> -	i915_vma_unpin(batch);
> -out_put:
> -	intel_gt_buffer_pool_put(pool);
> -out_pm:
> -	intel_engine_pm_put(ce->engine);
> -	return ERR_PTR(err);
> -}
> -
> -int intel_emit_vma_mark_active(struct i915_vma *vma, struct i915_request *rq)
> -{
> -	int err;
> -
> -	err = i915_request_await_object(rq, vma->obj, false);
> -	if (err == 0)
> -		err = i915_vma_move_to_active(vma, rq, 0);
> -	if (unlikely(err))
> -		return err;
> -
> -	return intel_gt_buffer_pool_mark_active(vma->private, rq);
> -}
> -
> -void intel_emit_vma_release(struct intel_context *ce, struct i915_vma *vma)
> -{
> -	i915_vma_unpin(vma);
> -	intel_gt_buffer_pool_put(vma->private);
> -	intel_engine_pm_put(ce->engine);
> -}
> -
> -static int
> -move_obj_to_gpu(struct drm_i915_gem_object *obj,
> -		struct i915_request *rq,
> -		bool write)
> -{
> -	if (obj->cache_dirty & ~obj->cache_coherent)
> -		i915_gem_clflush_object(obj, 0);
> -
> -	return i915_request_await_object(rq, obj, write);
> -}
> -
> -int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
> -			     struct intel_context *ce,
> -			     u32 value)
> -{
> -	struct i915_gem_ww_ctx ww;
> -	struct i915_request *rq;
> -	struct i915_vma *batch;
> -	struct i915_vma *vma;
> -	int err;
> -
> -	vma = i915_vma_instance(obj, ce->vm, NULL);
> -	if (IS_ERR(vma))
> -		return PTR_ERR(vma);
> -
> -	i915_gem_ww_ctx_init(&ww, true);
> -	intel_engine_pm_get(ce->engine);
> -retry:
> -	err = i915_gem_object_lock(obj, &ww);
> -	if (err)
> -		goto out;
> -
> -	err = intel_context_pin_ww(ce, &ww);
> -	if (err)
> -		goto out;
> -
> -	err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_USER);
> -	if (err)
> -		goto out_ctx;
> -
> -	batch = intel_emit_vma_fill_blt(ce, vma, &ww, value);
> -	if (IS_ERR(batch)) {
> -		err = PTR_ERR(batch);
> -		goto out_vma;
> -	}
> -
> -	rq = i915_request_create(ce);
> -	if (IS_ERR(rq)) {
> -		err = PTR_ERR(rq);
> -		goto out_batch;
> -	}
> -
> -	err = intel_emit_vma_mark_active(batch, rq);
> -	if (unlikely(err))
> -		goto out_request;
> -
> -	err = move_obj_to_gpu(vma->obj, rq, true);
> -	if (err == 0)
> -		err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
> -	if (unlikely(err))
> -		goto out_request;
> -
> -	if (ce->engine->emit_init_breadcrumb)
> -		err = ce->engine->emit_init_breadcrumb(rq);
> -
> -	if (likely(!err))
> -		err = ce->engine->emit_bb_start(rq,
> -						batch->node.start,
> -						batch->node.size,
> -						0);
> -out_request:
> -	if (unlikely(err))
> -		i915_request_set_error_once(rq, err);
> -
> -	i915_request_add(rq);
> -out_batch:
> -	intel_emit_vma_release(ce, batch);
> -out_vma:
> -	i915_vma_unpin(vma);
> -out_ctx:
> -	intel_context_unpin(ce);
> -out:
> -	if (err == -EDEADLK) {
> -		err = i915_gem_ww_ctx_backoff(&ww);
> -		if (!err)
> -			goto retry;
> -	}
> -	i915_gem_ww_ctx_fini(&ww);
> -	intel_engine_pm_put(ce->engine);
> -	return err;
> -}
> -
> -/* Wa_1209644611:icl,ehl */
> -static bool wa_1209644611_applies(struct drm_i915_private *i915, u32 size)
> -{
> -	u32 height = size >> PAGE_SHIFT;
> -
> -	if (GRAPHICS_VER(i915) != 11)
> -		return false;
> -
> -	return height % 4 == 3 && height <= 8;
> -}
> -
> -struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
> -					 struct i915_gem_ww_ctx *ww,
> -					 struct i915_vma *src,
> -					 struct i915_vma *dst)
> -{
> -	struct drm_i915_private *i915 = ce->vm->i915;
> -	const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
> -	struct intel_gt_buffer_pool_node *pool;
> -	struct i915_vma *batch;
> -	u64 src_offset, dst_offset;
> -	u64 count, rem;
> -	u32 size, *cmd;
> -	int err;
> -
> -	GEM_BUG_ON(src->size != dst->size);
> -
> -	GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
> -	intel_engine_pm_get(ce->engine);
> -
> -	count = div_u64(round_up(dst->size, block_size), block_size);
> -	size = (1 + 11 * count) * sizeof(u32);
> -	size = round_up(size, PAGE_SIZE);
> -	pool = intel_gt_get_buffer_pool(ce->engine->gt, size, I915_MAP_WC);
> -	if (IS_ERR(pool)) {
> -		err = PTR_ERR(pool);
> -		goto out_pm;
> -	}
> -
> -	err = i915_gem_object_lock(pool->obj, ww);
> -	if (err)
> -		goto out_put;
> -
> -	batch = i915_vma_instance(pool->obj, ce->vm, NULL);
> -	if (IS_ERR(batch)) {
> -		err = PTR_ERR(batch);
> -		goto out_put;
> -	}
> -
> -	err = i915_vma_pin_ww(batch, ww, 0, 0, PIN_USER);
> -	if (unlikely(err))
> -		goto out_put;
> -
> -	/* we pinned the pool, mark it as such */
> -	intel_gt_buffer_pool_mark_used(pool);
> -
> -	cmd = i915_gem_object_pin_map(pool->obj, pool->type);
> -	if (IS_ERR(cmd)) {
> -		err = PTR_ERR(cmd);
> -		goto out_unpin;
> -	}
> -
> -	rem = src->size;
> -	src_offset = src->node.start;
> -	dst_offset = dst->node.start;
> -
> -	do {
> -		size = min_t(u64, rem, block_size);
> -		GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX);
> -
> -		if (GRAPHICS_VER(i915) >= 9 &&
> -		    !wa_1209644611_applies(i915, size)) {
> -			*cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2);
> -			*cmd++ = BLT_DEPTH_32 | PAGE_SIZE;
> -			*cmd++ = 0;
> -			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> -			*cmd++ = lower_32_bits(dst_offset);
> -			*cmd++ = upper_32_bits(dst_offset);
> -			*cmd++ = 0;
> -			*cmd++ = PAGE_SIZE;
> -			*cmd++ = lower_32_bits(src_offset);
> -			*cmd++ = upper_32_bits(src_offset);
> -		} else if (GRAPHICS_VER(i915) >= 8) {
> -			*cmd++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (10 - 2);
> -			*cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
> -			*cmd++ = 0;
> -			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> -			*cmd++ = lower_32_bits(dst_offset);
> -			*cmd++ = upper_32_bits(dst_offset);
> -			*cmd++ = 0;
> -			*cmd++ = PAGE_SIZE;
> -			*cmd++ = lower_32_bits(src_offset);
> -			*cmd++ = upper_32_bits(src_offset);
> -		} else {
> -			*cmd++ = SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (6 - 2);
> -			*cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE;
> -			*cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE;
> -			*cmd++ = dst_offset;
> -			*cmd++ = PAGE_SIZE;
> -			*cmd++ = src_offset;
> -		}
> -
> -		/* Allow ourselves to be preempted in between blocks. */
> -		*cmd++ = MI_ARB_CHECK;
> -
> -		src_offset += size;
> -		dst_offset += size;
> -		rem -= size;
> -	} while (rem);
> -
> -	*cmd = MI_BATCH_BUFFER_END;
> -
> -	i915_gem_object_flush_map(pool->obj);
> -	i915_gem_object_unpin_map(pool->obj);
> -
> -	intel_gt_chipset_flush(ce->vm->gt);
> -	batch->private = pool;
> -	return batch;
> -
> -out_unpin:
> -	i915_vma_unpin(batch);
> -out_put:
> -	intel_gt_buffer_pool_put(pool);
> -out_pm:
> -	intel_engine_pm_put(ce->engine);
> -	return ERR_PTR(err);
> -}
> -
> -int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
> -			     struct drm_i915_gem_object *dst,
> -			     struct intel_context *ce)
> -{
> -	struct i915_address_space *vm = ce->vm;
> -	struct i915_vma *vma[2], *batch;
> -	struct i915_gem_ww_ctx ww;
> -	struct i915_request *rq;
> -	int err, i;
> -
> -	vma[0] = i915_vma_instance(src, vm, NULL);
> -	if (IS_ERR(vma[0]))
> -		return PTR_ERR(vma[0]);
> -
> -	vma[1] = i915_vma_instance(dst, vm, NULL);
> -	if (IS_ERR(vma[1]))
> -		return PTR_ERR(vma[1]);
> -
> -	i915_gem_ww_ctx_init(&ww, true);
> -	intel_engine_pm_get(ce->engine);
> -retry:
> -	err = i915_gem_object_lock(src, &ww);
> -	if (!err)
> -		err = i915_gem_object_lock(dst, &ww);
> -	if (!err)
> -		err = intel_context_pin_ww(ce, &ww);
> -	if (err)
> -		goto out;
> -
> -	err = i915_vma_pin_ww(vma[0], &ww, 0, 0, PIN_USER);
> -	if (err)
> -		goto out_ctx;
> -
> -	err = i915_vma_pin_ww(vma[1], &ww, 0, 0, PIN_USER);
> -	if (unlikely(err))
> -		goto out_unpin_src;
> -
> -	batch = intel_emit_vma_copy_blt(ce, &ww, vma[0], vma[1]);
> -	if (IS_ERR(batch)) {
> -		err = PTR_ERR(batch);
> -		goto out_unpin_dst;
> -	}
> -
> -	rq = i915_request_create(ce);
> -	if (IS_ERR(rq)) {
> -		err = PTR_ERR(rq);
> -		goto out_batch;
> -	}
> -
> -	err = intel_emit_vma_mark_active(batch, rq);
> -	if (unlikely(err))
> -		goto out_request;
> -
> -	for (i = 0; i < ARRAY_SIZE(vma); i++) {
> -		err = move_obj_to_gpu(vma[i]->obj, rq, i);
> -		if (unlikely(err))
> -			goto out_request;
> -	}
> -
> -	for (i = 0; i < ARRAY_SIZE(vma); i++) {
> -		unsigned int flags = i ? EXEC_OBJECT_WRITE : 0;
> -
> -		err = i915_vma_move_to_active(vma[i], rq, flags);
> -		if (unlikely(err))
> -			goto out_request;
> -	}
> -
> -	if (rq->engine->emit_init_breadcrumb) {
> -		err = rq->engine->emit_init_breadcrumb(rq);
> -		if (unlikely(err))
> -			goto out_request;
> -	}
> -
> -	err = rq->engine->emit_bb_start(rq,
> -					batch->node.start, batch->node.size,
> -					0);
> -
> -out_request:
> -	if (unlikely(err))
> -		i915_request_set_error_once(rq, err);
> -
> -	i915_request_add(rq);
> -out_batch:
> -	intel_emit_vma_release(ce, batch);
> -out_unpin_dst:
> -	i915_vma_unpin(vma[1]);
> -out_unpin_src:
> -	i915_vma_unpin(vma[0]);
> -out_ctx:
> -	intel_context_unpin(ce);
> -out:
> -	if (err == -EDEADLK) {
> -		err = i915_gem_ww_ctx_backoff(&ww);
> -		if (!err)
> -			goto retry;
> -	}
> -	i915_gem_ww_ctx_fini(&ww);
> -	intel_engine_pm_put(ce->engine);
> -	return err;
> -}
> -
> -#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> -#include "selftests/i915_gem_object_blt.c"
> -#endif
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h b/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
> deleted file mode 100644
> index 2409fdcccf0e..000000000000
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_blt.h
> +++ /dev/null
> @@ -1,39 +0,0 @@
> -/* SPDX-License-Identifier: MIT */
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -
> -#ifndef __I915_GEM_OBJECT_BLT_H__
> -#define __I915_GEM_OBJECT_BLT_H__
> -
> -#include <linux/types.h>
> -
> -#include "gt/intel_context.h"
> -#include "gt/intel_engine_pm.h"
> -#include "i915_vma.h"
> -
> -struct drm_i915_gem_object;
> -struct i915_gem_ww_ctx;
> -
> -struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
> -					 struct i915_vma *vma,
> -					 struct i915_gem_ww_ctx *ww,
> -					 u32 value);
> -
> -struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
> -					 struct i915_gem_ww_ctx *ww,
> -					 struct i915_vma *src,
> -					 struct i915_vma *dst);
> -
> -int intel_emit_vma_mark_active(struct i915_vma *vma, struct i915_request *rq);
> -void intel_emit_vma_release(struct intel_context *ce, struct i915_vma *vma);
> -
> -int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
> -			     struct intel_context *ce,
> -			     u32 value);
> -
> -int i915_gem_object_copy_blt(struct drm_i915_gem_object *src,
> -			     struct drm_i915_gem_object *dst,
> -			     struct intel_context *ce);
> -
> -#endif
> diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
> deleted file mode 100644
> index 8c335d1a8406..000000000000
> --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_object_blt.c
> +++ /dev/null
> @@ -1,597 +0,0 @@
> -// SPDX-License-Identifier: MIT
> -/*
> - * Copyright © 2019 Intel Corporation
> - */
> -
> -#include <linux/sort.h>
> -
> -#include "gt/intel_gt.h"
> -#include "gt/intel_engine_user.h"
> -
> -#include "i915_selftest.h"
> -
> -#include "gem/i915_gem_context.h"
> -#include "selftests/igt_flush_test.h"
> -#include "selftests/i915_random.h"
> -#include "selftests/mock_drm.h"
> -#include "huge_gem_object.h"
> -#include "mock_context.h"
> -
> -static int wrap_ktime_compare(const void *A, const void *B)
> -{
> -	const ktime_t *a = A, *b = B;
> -
> -	return ktime_compare(*a, *b);
> -}
> -
> -static int __perf_fill_blt(struct drm_i915_gem_object *obj)
> -{
> -	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> -	int inst = 0;
> -
> -	do {
> -		struct intel_engine_cs *engine;
> -		ktime_t t[5];
> -		int pass;
> -		int err;
> -
> -		engine = intel_engine_lookup_user(i915,
> -						  I915_ENGINE_CLASS_COPY,
> -						  inst++);
> -		if (!engine)
> -			return 0;
> -
> -		intel_engine_pm_get(engine);
> -		for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
> -			struct intel_context *ce = engine->kernel_context;
> -			ktime_t t0, t1;
> -
> -			t0 = ktime_get();
> -
> -			err = i915_gem_object_fill_blt(obj, ce, 0);
> -			if (err)
> -				break;
> -
> -			err = i915_gem_object_wait(obj,
> -						   I915_WAIT_ALL,
> -						   MAX_SCHEDULE_TIMEOUT);
> -			if (err)
> -				break;
> -
> -			t1 = ktime_get();
> -			t[pass] = ktime_sub(t1, t0);
> -		}
> -		intel_engine_pm_put(engine);
> -		if (err)
> -			return err;
> -
> -		sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
> -		pr_info("%s: blt %zd KiB fill: %lld MiB/s\n",
> -			engine->name,
> -			obj->base.size >> 10,
> -			div64_u64(mul_u32_u32(4 * obj->base.size,
> -					      1000 * 1000 * 1000),
> -				  t[1] + 2 * t[2] + t[3]) >> 20);
> -	} while (1);
> -}
> -
> -static int perf_fill_blt(void *arg)
> -{
> -	struct drm_i915_private *i915 = arg;
> -	static const unsigned long sizes[] = {
> -		SZ_4K,
> -		SZ_64K,
> -		SZ_2M,
> -		SZ_64M
> -	};
> -	int i;
> -
> -	for (i = 0; i < ARRAY_SIZE(sizes); i++) {
> -		struct drm_i915_gem_object *obj;
> -		int err;
> -
> -		obj = i915_gem_object_create_internal(i915, sizes[i]);
> -		if (IS_ERR(obj))
> -			return PTR_ERR(obj);
> -
> -		err = __perf_fill_blt(obj);
> -		i915_gem_object_put(obj);
> -		if (err)
> -			return err;
> -	}
> -
> -	return 0;
> -}
> -
> -static int __perf_copy_blt(struct drm_i915_gem_object *src,
> -			   struct drm_i915_gem_object *dst)
> -{
> -	struct drm_i915_private *i915 = to_i915(src->base.dev);
> -	int inst = 0;
> -
> -	do {
> -		struct intel_engine_cs *engine;
> -		ktime_t t[5];
> -		int pass;
> -		int err = 0;
> -
> -		engine = intel_engine_lookup_user(i915,
> -						  I915_ENGINE_CLASS_COPY,
> -						  inst++);
> -		if (!engine)
> -			return 0;
> -
> -		intel_engine_pm_get(engine);
> -		for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
> -			struct intel_context *ce = engine->kernel_context;
> -			ktime_t t0, t1;
> -
> -			t0 = ktime_get();
> -
> -			err = i915_gem_object_copy_blt(src, dst, ce);
> -			if (err)
> -				break;
> -
> -			err = i915_gem_object_wait(dst,
> -						   I915_WAIT_ALL,
> -						   MAX_SCHEDULE_TIMEOUT);
> -			if (err)
> -				break;
> -
> -			t1 = ktime_get();
> -			t[pass] = ktime_sub(t1, t0);
> -		}
> -		intel_engine_pm_put(engine);
> -		if (err)
> -			return err;
> -
> -		sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
> -		pr_info("%s: blt %zd KiB copy: %lld MiB/s\n",
> -			engine->name,
> -			src->base.size >> 10,
> -			div64_u64(mul_u32_u32(4 * src->base.size,
> -					      1000 * 1000 * 1000),
> -				  t[1] + 2 * t[2] + t[3]) >> 20);
> -	} while (1);
> -}
> -
> -static int perf_copy_blt(void *arg)
> -{
> -	struct drm_i915_private *i915 = arg;
> -	static const unsigned long sizes[] = {
> -		SZ_4K,
> -		SZ_64K,
> -		SZ_2M,
> -		SZ_64M
> -	};
> -	int i;
> -
> -	for (i = 0; i < ARRAY_SIZE(sizes); i++) {
> -		struct drm_i915_gem_object *src, *dst;
> -		int err;
> -
> -		src = i915_gem_object_create_internal(i915, sizes[i]);
> -		if (IS_ERR(src))
> -			return PTR_ERR(src);
> -
> -		dst = i915_gem_object_create_internal(i915, sizes[i]);
> -		if (IS_ERR(dst)) {
> -			err = PTR_ERR(dst);
> -			goto err_src;
> -		}
> -
> -		err = __perf_copy_blt(src, dst);
> -
> -		i915_gem_object_put(dst);
> -err_src:
> -		i915_gem_object_put(src);
> -		if (err)
> -			return err;
> -	}
> -
> -	return 0;
> -}
> -
> -struct igt_thread_arg {
> -	struct intel_engine_cs *engine;
> -	struct i915_gem_context *ctx;
> -	struct file *file;
> -	struct rnd_state prng;
> -	unsigned int n_cpus;
> -};
> -
> -static int igt_fill_blt_thread(void *arg)
> -{
> -	struct igt_thread_arg *thread = arg;
> -	struct intel_engine_cs *engine = thread->engine;
> -	struct rnd_state *prng = &thread->prng;
> -	struct drm_i915_gem_object *obj;
> -	struct i915_gem_context *ctx;
> -	struct intel_context *ce;
> -	unsigned int prio;
> -	IGT_TIMEOUT(end);
> -	u64 total, max;
> -	int err;
> -
> -	ctx = thread->ctx;
> -	if (!ctx) {
> -		ctx = live_context_for_engine(engine, thread->file);
> -		if (IS_ERR(ctx))
> -			return PTR_ERR(ctx);
> -
> -		prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
> -		ctx->sched.priority = prio;
> -	}
> -
> -	ce = i915_gem_context_get_engine(ctx, 0);
> -	GEM_BUG_ON(IS_ERR(ce));
> -
> -	/*
> -	 * If we have a tiny shared address space, like for the GGTT
> -	 * then we can't be too greedy.
> -	 */
> -	max = ce->vm->total;
> -	if (i915_is_ggtt(ce->vm) || thread->ctx)
> -		max = div_u64(max, thread->n_cpus);
> -	max >>= 4;
> -
> -	total = PAGE_SIZE;
> -	do {
> -		/* Aim to keep the runtime under reasonable bounds! */
> -		const u32 max_phys_size = SZ_64K;
> -		u32 val = prandom_u32_state(prng);
> -		u32 phys_sz;
> -		u32 sz;
> -		u32 *vaddr;
> -		u32 i;
> -
> -		total = min(total, max);
> -		sz = i915_prandom_u32_max_state(total, prng) + 1;
> -		phys_sz = sz % max_phys_size + 1;
> -
> -		sz = round_up(sz, PAGE_SIZE);
> -		phys_sz = round_up(phys_sz, PAGE_SIZE);
> -		phys_sz = min(phys_sz, sz);
> -
> -		pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
> -			 phys_sz, sz, val);
> -
> -		obj = huge_gem_object(engine->i915, phys_sz, sz);
> -		if (IS_ERR(obj)) {
> -			err = PTR_ERR(obj);
> -			goto err_flush;
> -		}
> -
> -		vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
> -		if (IS_ERR(vaddr)) {
> -			err = PTR_ERR(vaddr);
> -			goto err_put;
> -		}
> -
> -		/*
> -		 * Make sure the potentially async clflush does its job, if
> -		 * required.
> -		 */
> -		memset32(vaddr, val ^ 0xdeadbeaf,
> -			 huge_gem_object_phys_size(obj) / sizeof(u32));
> -
> -		if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
> -			obj->cache_dirty = true;
> -
> -		err = i915_gem_object_fill_blt(obj, ce, val);
> -		if (err)
> -			goto err_unpin;
> -
> -		err = i915_gem_object_wait(obj, 0, MAX_SCHEDULE_TIMEOUT);
> -		if (err)
> -			goto err_unpin;
> -
> -		for (i = 0; i < huge_gem_object_phys_size(obj) / sizeof(u32); i += 17) {
> -			if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
> -				drm_clflush_virt_range(&vaddr[i], sizeof(vaddr[i]));
> -
> -			if (vaddr[i] != val) {
> -				pr_err("vaddr[%u]=%x, expected=%x\n", i,
> -				       vaddr[i], val);
> -				err = -EINVAL;
> -				goto err_unpin;
> -			}
> -		}
> -
> -		i915_gem_object_unpin_map(obj);
> -		i915_gem_object_put(obj);
> -
> -		total <<= 1;
> -	} while (!time_after(jiffies, end));
> -
> -	goto err_flush;
> -
> -err_unpin:
> -	i915_gem_object_unpin_map(obj);
> -err_put:
> -	i915_gem_object_put(obj);
> -err_flush:
> -	if (err == -ENOMEM)
> -		err = 0;
> -
> -	intel_context_put(ce);
> -	return err;
> -}
> -
> -static int igt_copy_blt_thread(void *arg)
> -{
> -	struct igt_thread_arg *thread = arg;
> -	struct intel_engine_cs *engine = thread->engine;
> -	struct rnd_state *prng = &thread->prng;
> -	struct drm_i915_gem_object *src, *dst;
> -	struct i915_gem_context *ctx;
> -	struct intel_context *ce;
> -	unsigned int prio;
> -	IGT_TIMEOUT(end);
> -	u64 total, max;
> -	int err;
> -
> -	ctx = thread->ctx;
> -	if (!ctx) {
> -		ctx = live_context_for_engine(engine, thread->file);
> -		if (IS_ERR(ctx))
> -			return PTR_ERR(ctx);
> -
> -		prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
> -		ctx->sched.priority = prio;
> -	}
> -
> -	ce = i915_gem_context_get_engine(ctx, 0);
> -	GEM_BUG_ON(IS_ERR(ce));
> -
> -	/*
> -	 * If we have a tiny shared address space, like for the GGTT
> -	 * then we can't be too greedy.
> -	 */
> -	max = ce->vm->total;
> -	if (i915_is_ggtt(ce->vm) || thread->ctx)
> -		max = div_u64(max, thread->n_cpus);
> -	max >>= 4;
> -
> -	total = PAGE_SIZE;
> -	do {
> -		/* Aim to keep the runtime under reasonable bounds! */
> -		const u32 max_phys_size = SZ_64K;
> -		u32 val = prandom_u32_state(prng);
> -		u32 phys_sz;
> -		u32 sz;
> -		u32 *vaddr;
> -		u32 i;
> -
> -		total = min(total, max);
> -		sz = i915_prandom_u32_max_state(total, prng) + 1;
> -		phys_sz = sz % max_phys_size + 1;
> -
> -		sz = round_up(sz, PAGE_SIZE);
> -		phys_sz = round_up(phys_sz, PAGE_SIZE);
> -		phys_sz = min(phys_sz, sz);
> -
> -		pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
> -			 phys_sz, sz, val);
> -
> -		src = huge_gem_object(engine->i915, phys_sz, sz);
> -		if (IS_ERR(src)) {
> -			err = PTR_ERR(src);
> -			goto err_flush;
> -		}
> -
> -		vaddr = i915_gem_object_pin_map_unlocked(src, I915_MAP_WB);
> -		if (IS_ERR(vaddr)) {
> -			err = PTR_ERR(vaddr);
> -			goto err_put_src;
> -		}
> -
> -		memset32(vaddr, val,
> -			 huge_gem_object_phys_size(src) / sizeof(u32));
> -
> -		i915_gem_object_unpin_map(src);
> -
> -		if (!(src->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
> -			src->cache_dirty = true;
> -
> -		dst = huge_gem_object(engine->i915, phys_sz, sz);
> -		if (IS_ERR(dst)) {
> -			err = PTR_ERR(dst);
> -			goto err_put_src;
> -		}
> -
> -		vaddr = i915_gem_object_pin_map_unlocked(dst, I915_MAP_WB);
> -		if (IS_ERR(vaddr)) {
> -			err = PTR_ERR(vaddr);
> -			goto err_put_dst;
> -		}
> -
> -		memset32(vaddr, val ^ 0xdeadbeaf,
> -			 huge_gem_object_phys_size(dst) / sizeof(u32));
> -
> -		if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
> -			dst->cache_dirty = true;
> -
> -		err = i915_gem_object_copy_blt(src, dst, ce);
> -		if (err)
> -			goto err_unpin;
> -
> -		err = i915_gem_object_wait(dst, 0, MAX_SCHEDULE_TIMEOUT);
> -		if (err)
> -			goto err_unpin;
> -
> -		for (i = 0; i < huge_gem_object_phys_size(dst) / sizeof(u32); i += 17) {
> -			if (!(dst->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
> -				drm_clflush_virt_range(&vaddr[i], sizeof(vaddr[i]));
> -
> -			if (vaddr[i] != val) {
> -				pr_err("vaddr[%u]=%x, expected=%x\n", i,
> -				       vaddr[i], val);
> -				err = -EINVAL;
> -				goto err_unpin;
> -			}
> -		}
> -
> -		i915_gem_object_unpin_map(dst);
> -
> -		i915_gem_object_put(src);
> -		i915_gem_object_put(dst);
> -
> -		total <<= 1;
> -	} while (!time_after(jiffies, end));
> -
> -	goto err_flush;
> -
> -err_unpin:
> -	i915_gem_object_unpin_map(dst);
> -err_put_dst:
> -	i915_gem_object_put(dst);
> -err_put_src:
> -	i915_gem_object_put(src);
> -err_flush:
> -	if (err == -ENOMEM)
> -		err = 0;
> -
> -	intel_context_put(ce);
> -	return err;
> -}
> -
> -static int igt_threaded_blt(struct intel_engine_cs *engine,
> -			    int (*blt_fn)(void *arg),
> -			    unsigned int flags)
> -#define SINGLE_CTX BIT(0)
> -{
> -	struct igt_thread_arg *thread;
> -	struct task_struct **tsk;
> -	unsigned int n_cpus, i;
> -	I915_RND_STATE(prng);
> -	int err = 0;
> -
> -	n_cpus = num_online_cpus() + 1;
> -
> -	tsk = kcalloc(n_cpus, sizeof(struct task_struct *), GFP_KERNEL);
> -	if (!tsk)
> -		return 0;
> -
> -	thread = kcalloc(n_cpus, sizeof(struct igt_thread_arg), GFP_KERNEL);
> -	if (!thread)
> -		goto out_tsk;
> -
> -	thread[0].file = mock_file(engine->i915);
> -	if (IS_ERR(thread[0].file)) {
> -		err = PTR_ERR(thread[0].file);
> -		goto out_thread;
> -	}
> -
> -	if (flags & SINGLE_CTX) {
> -		thread[0].ctx = live_context_for_engine(engine, thread[0].file);
> -		if (IS_ERR(thread[0].ctx)) {
> -			err = PTR_ERR(thread[0].ctx);
> -			goto out_file;
> -		}
> -	}
> -
> -	for (i = 0; i < n_cpus; ++i) {
> -		thread[i].engine = engine;
> -		thread[i].file = thread[0].file;
> -		thread[i].ctx = thread[0].ctx;
> -		thread[i].n_cpus = n_cpus;
> -		thread[i].prng =
> -			I915_RND_STATE_INITIALIZER(prandom_u32_state(&prng));
> -
> -		tsk[i] = kthread_run(blt_fn, &thread[i], "igt/blt-%d", i);
> -		if (IS_ERR(tsk[i])) {
> -			err = PTR_ERR(tsk[i]);
> -			break;
> -		}
> -
> -		get_task_struct(tsk[i]);
> -	}
> -
> -	yield(); /* start all threads before we kthread_stop() */
> -
> -	for (i = 0; i < n_cpus; ++i) {
> -		int status;
> -
> -		if (IS_ERR_OR_NULL(tsk[i]))
> -			continue;
> -
> -		status = kthread_stop(tsk[i]);
> -		if (status && !err)
> -			err = status;
> -
> -		put_task_struct(tsk[i]);
> -	}
> -
> -out_file:
> -	fput(thread[0].file);
> -out_thread:
> -	kfree(thread);
> -out_tsk:
> -	kfree(tsk);
> -	return err;
> -}
> -
> -static int test_copy_engines(struct drm_i915_private *i915,
> -			     int (*fn)(void *arg),
> -			     unsigned int flags)
> -{
> -	struct intel_engine_cs *engine;
> -	int ret;
> -
> -	for_each_uabi_class_engine(engine, I915_ENGINE_CLASS_COPY, i915) {
> -		ret = igt_threaded_blt(engine, fn, flags);
> -		if (ret)
> -			return ret;
> -	}
> -
> -	return 0;
> -}
> -
> -static int igt_fill_blt(void *arg)
> -{
> -	return test_copy_engines(arg, igt_fill_blt_thread, 0);
> -}
> -
> -static int igt_fill_blt_ctx0(void *arg)
> -{
> -	return test_copy_engines(arg, igt_fill_blt_thread, SINGLE_CTX);
> -}
> -
> -static int igt_copy_blt(void *arg)
> -{
> -	return test_copy_engines(arg, igt_copy_blt_thread, 0);
> -}
> -
> -static int igt_copy_blt_ctx0(void *arg)
> -{
> -	return test_copy_engines(arg, igt_copy_blt_thread, SINGLE_CTX);
> -}
> -
> -int i915_gem_object_blt_live_selftests(struct drm_i915_private *i915)
> -{
> -	static const struct i915_subtest tests[] = {
> -		SUBTEST(igt_fill_blt),
> -		SUBTEST(igt_fill_blt_ctx0),
> -		SUBTEST(igt_copy_blt),
> -		SUBTEST(igt_copy_blt_ctx0),
> -	};
> -
> -	if (intel_gt_is_wedged(&i915->gt))
> -		return 0;
> -
> -	return i915_live_subtests(tests, i915);
> -}
> -
> -int i915_gem_object_blt_perf_selftests(struct drm_i915_private *i915)
> -{
> -	static const struct i915_subtest tests[] = {
> -		SUBTEST(perf_fill_blt),
> -		SUBTEST(perf_copy_blt),
> -	};
> -
> -	if (intel_gt_is_wedged(&i915->gt))
> -		return 0;
> -
> -	return i915_live_subtests(tests, i915);
> -}
> diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> index 6f5893ecd549..1ae3f8039d68 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> +++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
> @@ -39,7 +39,6 @@ selftest(evict, i915_gem_evict_live_selftests)
>   selftest(hugepages, i915_gem_huge_page_live_selftests)
>   selftest(gem_contexts, i915_gem_context_live_selftests)
>   selftest(gem_execbuf, i915_gem_execbuffer_live_selftests)
> -selftest(blt, i915_gem_object_blt_live_selftests)
>   selftest(reset, intel_reset_live_selftests)
>   selftest(memory_region, intel_memory_region_live_selftests)
>   selftest(hangcheck, intel_hangcheck_live_selftests)
> diff --git a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
> index 5077dc3c3b8c..058450d351f7 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
> +++ b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h
> @@ -18,5 +18,4 @@
>   selftest(engine_cs, intel_engine_cs_perf_selftests)
>   selftest(request, i915_request_perf_selftests)
>   selftest(migrate, intel_migrate_perf_selftests)
> -selftest(blt, i915_gem_object_blt_perf_selftests)
>   selftest(region, intel_memory_region_perf_selftests)
> diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> index c85d516b85cd..2e18f3a3d538 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c
> @@ -15,11 +15,12 @@
>   #include "gem/i915_gem_context.h"
>   #include "gem/i915_gem_lmem.h"
>   #include "gem/i915_gem_region.h"
> -#include "gem/i915_gem_object_blt.h"
>   #include "gem/selftests/igt_gem_utils.h"
>   #include "gem/selftests/mock_context.h"
> +#include "gt/intel_engine_pm.h"
>   #include "gt/intel_engine_user.h"
>   #include "gt/intel_gt.h"
> +#include "gt/intel_migrate.h"
>   #include "i915_memcpy.h"
>   #include "selftests/igt_flush_test.h"
>   #include "selftests/i915_random.h"
> @@ -741,6 +742,7 @@ static int igt_lmem_write_cpu(void *arg)
>   		PAGE_SIZE - 64,
>   	};
>   	struct intel_engine_cs *engine;
> +	struct i915_request *rq;
>   	u32 *vaddr;
>   	u32 sz;
>   	u32 i;
> @@ -767,15 +769,20 @@ static int igt_lmem_write_cpu(void *arg)
>   		goto out_put;
>   	}
>   
> +	i915_gem_object_lock(obj, NULL);
>   	/* Put the pages into a known state -- from the gpu for added fun */
>   	intel_engine_pm_get(engine);
> -	err = i915_gem_object_fill_blt(obj, engine->kernel_context, 0xdeadbeaf);
> -	intel_engine_pm_put(engine);
> -	if (err)
> -		goto out_unpin;
> +	err = intel_context_migrate_clear(engine->gt->migrate.context, NULL,
> +					  obj->mm.pages->sgl, I915_CACHE_NONE,
> +					  true, 0xdeadbeaf, &rq);
> +	if (rq) {
> +		dma_resv_add_excl_fence(obj->base.resv, &rq->fence);
> +		i915_request_put(rq);
> +	}
>   
> -	i915_gem_object_lock(obj, NULL);
> -	err = i915_gem_object_set_to_wc_domain(obj, true);
> +	intel_engine_pm_put(engine);
> +	if (!err)
> +		err = i915_gem_object_set_to_wc_domain(obj, true);
>   	i915_gem_object_unlock(obj);
>   	if (err)
>   		goto out_unpin;
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 10/12] drm/i915/ttm: accelerated move implementation
  2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
@ 2021-06-14 17:55     ` Thomas Hellström
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Hellström @ 2021-06-14 17:55 UTC (permalink / raw)
  To: intel-gfx, dri-devel, ramalingam.c; +Cc: matthew.auld


On 6/14/21 6:26 PM, Thomas Hellström wrote:
> From: Ramalingam C <ramalingam.c@intel.com>
>
> Invokes the pipelined page migration through blt for
> i915_ttm_move requests, covering both eviction and object clear.
>
> Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
> ---
> v2:
>   - subfunction for accel_move (Thomas)
>   - engine_pm_get/put around context_move/clear (Thomas)
>   - Invalidation at accel_clear (Thomas)
> v3:
>   - conflict resolution s/&bo->mem/bo->resource/g
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 87 +++++++++++++++++++++----
>   1 file changed, 74 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index bf33724bed5c..08b72c280cb5 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -15,6 +15,9 @@
>   #include "gem/i915_gem_ttm.h"
>   #include "gem/i915_gem_mman.h"
>   
> +#include "gt/intel_migrate.h"
> +#include "gt/intel_engine_pm.h"
> +
>   #define I915_PL_LMEM0 TTM_PL_PRIV
>   #define I915_PL_SYSTEM TTM_PL_SYSTEM
>   #define I915_PL_STOLEN TTM_PL_VRAM
> @@ -282,6 +285,61 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
>   	return intel_region_ttm_node_to_st(obj->mm.region, res);
>   }
>   
> +static int i915_ttm_accel_move(struct ttm_buffer_object *bo,
> +			       struct ttm_resource *dst_mem,
> +			       struct sg_table *dst_st)
> +{
> +	struct drm_i915_private *i915 = container_of(bo->bdev, typeof(*i915),
> +						     bdev);
> +	struct ttm_resource_manager *src_man =
> +		ttm_manager_type(bo->bdev, bo->resource->mem_type);
> +	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> +	struct sg_table *src_st;
> +	struct i915_request *rq;
> +	int ret;
> +
> +	if (!i915->gt.migrate.context)
> +		return -EINVAL;
> +
> +	if (!bo->ttm || !ttm_tt_is_populated(bo->ttm)) {
> +		if (bo->type == ttm_bo_type_kernel)
> +			return -EINVAL;
> +
> +		if (bo->ttm &&
> +		    !(bo->ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC))
> +			return 0;
> +
> +		intel_engine_pm_get(i915->gt.migrate.context->engine);
> +		ret = intel_context_migrate_clear(i915->gt.migrate.context, NULL,
> +						  dst_st->sgl, I915_CACHE_NONE,
> +						  dst_mem->mem_type >= TTM_PL_PRIV,
Here we should probably use I915_PL_LMEM0 instead of TTM_PL_PRIV, but
since this test will be replaced by gpu_binds_iomem() in an upcoming
patch, it doesn't matter much in practice.
> +						  0, &rq);
> +
> +		if (!ret && rq) {
> +			i915_request_wait(rq, 0, HZ);
Could be a MAX_SCHEDULE_TIMEOUT here to avoid surprises in case the 
queue to the blitter is getting long?
> +			i915_request_put(rq);
> +		}
> +		intel_engine_pm_put(i915->gt.migrate.context->engine);
> +	} else {
> +		src_st = src_man->use_tt ? i915_ttm_tt_get_st(bo->ttm) :
> +						obj->ttm.cached_io_st;
> +
> +		intel_engine_pm_get(i915->gt.migrate.context->engine);
> +		ret = intel_context_migrate_copy(i915->gt.migrate.context,
> +						 NULL, src_st->sgl, I915_CACHE_NONE,
> +						 bo->resource->mem_type >= TTM_PL_PRIV,
> +						 dst_st->sgl, I915_CACHE_NONE,
> +						 dst_mem->mem_type >= TTM_PL_PRIV, &rq);
> +		if (!ret && rq) {
> +			i915_request_wait(rq, 0, HZ);
Same thing here.


With that fixed,

Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
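
For reference, a minimal sketch of the clear path with the two changes suggested
above applied, assuming a fully synchronous move is wanted; both the
I915_PL_LMEM0 comparison and the MAX_SCHEDULE_TIMEOUT wait match the v3 respin
posted later in this thread:

	ret = intel_context_migrate_clear(i915->gt.migrate.context, NULL,
					  dst_st->sgl, I915_CACHE_NONE,
					  dst_mem->mem_type >= I915_PL_LMEM0,
					  0, &rq);
	if (!ret && rq) {
		/* v2 used i915_request_wait(rq, 0, HZ), i.e. a ~1 s bound. */
		i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT);
		i915_request_put(rq);
	}
	intel_engine_pm_put(i915->gt.migrate.context->engine);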



^ permalink raw reply	[flat|nested] 44+ messages in thread


* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for i915 TTM sync accelerated migration and clear
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
                   ` (12 preceding siblings ...)
  (?)
@ 2021-06-15  0:55 ` Patchwork
  -1 siblings, 0 replies; 44+ messages in thread
From: Patchwork @ 2021-06-15  0:55 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx

== Series Details ==

Series: i915 TTM sync accelerated migration and clear
URL   : https://patchwork.freedesktop.org/series/91463/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
a9a395f920ef drm/i915: Reference objects on the ww object list
161d138ac072 drm/i915: Break out dma_resv ww locking utilities to separate files
-:141: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#141: 
new file mode 100644

-:166: WARNING:LONG_LINE: line length of 103 exceeds 100 columns
#166: FILE: drivers/gpu/drm/i915/i915_gem_ww.c:21:
+	while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) {

total: 0 errors, 2 warnings, 0 checks, 183 lines checked
52210b9e019a drm/i915: Introduce a ww transaction helper
-:56: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_ww' - possible side-effects?
#56: FILE: drivers/gpu/drm/i915/i915_gem_ww.h:46:
+#define for_i915_gem_ww(_ww, _err, _intr)			\
+	for (__i915_gem_ww_init(_ww, _intr); (_ww)->loop;	\
+	     _err = __i915_gem_ww_fini(_ww, _err))

-:56: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_err' - possible side-effects?
#56: FILE: drivers/gpu/drm/i915/i915_gem_ww.h:46:
+#define for_i915_gem_ww(_ww, _err, _intr)			\
+	for (__i915_gem_ww_init(_ww, _intr); (_ww)->loop;	\
+	     _err = __i915_gem_ww_fini(_ww, _err))

total: 0 errors, 0 warnings, 2 checks, 41 lines checked
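
For context on the two MACRO_ARG_REUSE checks above: checkpatch flags any macro
that expands an argument more than once, since an argument with side effects
would then be evaluated once per expansion. A hypothetical illustration,
unrelated to the series (square() and example() are made up for this note):

	#define square(x) ((x) * (x))	/* expands its argument twice */

	static int example(void)
	{
		int i = 3;

		return square(i++);	/* i++ evaluated twice: undefined behaviour */
	}
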
f748e9a88a62 drm/i915/gt: Add an insert_entry for gen8_ppgtt
03305cf7c83e drm/i915/gt: Add a routine to iterate over the pagetables of a GTT
eec7f1143f99 drm/i915/gt: Export the pinned context constructor and destructor
a498403a00c0 drm/i915/gt: Pipelined page migration
-:66: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#66: 
new file mode 100644

-:878: WARNING:LINE_SPACING: Missing a blank line after declarations
#878: FILE: drivers/gpu/drm/i915/gt/selftest_migrate.c:176:
+	struct drm_i915_private *i915 = migrate->context->engine->i915;
+	I915_RND_STATE(prng);

-:907: WARNING:LINE_SPACING: Missing a blank line after declarations
#907: FILE: drivers/gpu/drm/i915/gt/selftest_migrate.c:205:
+	struct threaded_migrate *thread;
+	I915_RND_STATE(prng);

-:932: WARNING:MSLEEP: msleep < 20ms can sleep for up to 20ms; see Documentation/timers/timers-howto.rst
#932: FILE: drivers/gpu/drm/i915/gt/selftest_migrate.c:230:
+	msleep(10); /* start all threads before we kthread_stop() */

total: 0 errors, 4 warnings, 0 checks, 931 lines checked
40f8090431c6 drm/i915/gt: Pipelined clear
-:355: WARNING:LINE_SPACING: Missing a blank line after declarations
#355: FILE: drivers/gpu/drm/i915/gt/selftest_migrate.c:311:
+	struct drm_i915_private *i915 = migrate->context->engine->i915;
+	I915_RND_STATE(prng);

total: 0 errors, 1 warnings, 0 checks, 380 lines checked
1c74336e9b0c drm/i915/gt: Setup a default migration context on the GT
7e0037bd1a17 drm/i915/ttm: accelerated move implementation
eda8112d6927 drm/i915/gem: Zap the client blt code
-:27: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#27: 
deleted file mode 100644

total: 0 errors, 1 warnings, 0 checks, 14 lines checked
d95873d0888a drm/i915/gem: Zap the i915_gem_object_blt code
-:29: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#29: 
deleted file mode 100644

total: 0 errors, 1 warnings, 0 checks, 65 lines checked



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for i915 TTM sync accelerated migration and clear
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
                   ` (13 preceding siblings ...)
  (?)
@ 2021-06-15  1:24 ` Patchwork
  -1 siblings, 0 replies; 44+ messages in thread
From: Patchwork @ 2021-06-15  1:24 UTC (permalink / raw)
  To: Thomas Hellström; +Cc: intel-gfx



== Series Details ==

Series: i915 TTM sync accelerated migration and clear
URL   : https://patchwork.freedesktop.org/series/91463/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10222 -> Patchwork_20362
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/index.html

New tests
---------

  New tests have been introduced between CI_DRM_10222 and Patchwork_20362:

### New IGT tests (1) ###

  * igt@i915_selftest@live@migrate:
    - Statuses : 2 dmesg-warn(s) 34 pass(s)
    - Exec time: [0.51, 8.68] s

  

Known issues
------------

  Here are the changes found in Patchwork_20362 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_cs_nop@sync-fork-compute0:
    - fi-snb-2600:        NOTRUN -> [SKIP][1] ([fdo#109271]) +17 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/fi-snb-2600/igt@amdgpu/amd_cs_nop@sync-fork-compute0.html

  * igt@amdgpu/amd_prime@i915-to-amd:
    - fi-kbl-soraka:      NOTRUN -> [SKIP][2] ([fdo#109271]) +1 similar issue
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/fi-kbl-soraka/igt@amdgpu/amd_prime@i915-to-amd.html

  * igt@i915_selftest@live@execlists:
    - fi-bsw-nick:        [PASS][3] -> [INCOMPLETE][4] ([i915#2782] / [i915#2940])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/fi-bsw-nick/igt@i915_selftest@live@execlists.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/fi-bsw-nick/igt@i915_selftest@live@execlists.html

  * {igt@i915_selftest@live@migrate} (NEW):
    - {fi-ehl-2}:         NOTRUN -> [DMESG-WARN][5] ([i915#1222])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/fi-ehl-2/igt@i915_selftest@live@migrate.html
    - {fi-jsl-1}:         NOTRUN -> [DMESG-WARN][6] ([i915#1222])
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/fi-jsl-1/igt@i915_selftest@live@migrate.html

  * igt@runner@aborted:
    - fi-bsw-nick:        NOTRUN -> [FAIL][7] ([fdo#109271] / [i915#1436])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/fi-bsw-nick/igt@runner@aborted.html

  
#### Possible fixes ####

  * igt@i915_pm_rpm@module-reload:
    - fi-kbl-guc:         [SKIP][8] ([fdo#109271]) -> [PASS][9]
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/fi-kbl-guc/igt@i915_pm_rpm@module-reload.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/fi-kbl-guc/igt@i915_pm_rpm@module-reload.html

  * igt@i915_selftest@live@hangcheck:
    - {fi-hsw-gt1}:       [DMESG-WARN][10] ([i915#3303]) -> [PASS][11]
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/fi-hsw-gt1/igt@i915_selftest@live@hangcheck.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/fi-hsw-gt1/igt@i915_selftest@live@hangcheck.html
    - fi-snb-2600:        [INCOMPLETE][12] ([i915#2782]) -> [PASS][13]
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/fi-snb-2600/igt@i915_selftest@live@hangcheck.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/fi-snb-2600/igt@i915_selftest@live@hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1222]: https://gitlab.freedesktop.org/drm/intel/issues/1222
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2782]: https://gitlab.freedesktop.org/drm/intel/issues/2782
  [i915#2940]: https://gitlab.freedesktop.org/drm/intel/issues/2940
  [i915#3012]: https://gitlab.freedesktop.org/drm/intel/issues/3012
  [i915#3276]: https://gitlab.freedesktop.org/drm/intel/issues/3276
  [i915#3277]: https://gitlab.freedesktop.org/drm/intel/issues/3277
  [i915#3282]: https://gitlab.freedesktop.org/drm/intel/issues/3282
  [i915#3283]: https://gitlab.freedesktop.org/drm/intel/issues/3283
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#3539]: https://gitlab.freedesktop.org/drm/intel/issues/3539
  [i915#3542]: https://gitlab.freedesktop.org/drm/intel/issues/3542
  [i915#3544]: https://gitlab.freedesktop.org/drm/intel/issues/3544
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533


Participating hosts (42 -> 39)
------------------------------

  Additional (1): fi-rkl-11500t 
  Missing    (4): fi-ilk-m540 fi-bsw-cyan fi-bdw-samus fi-hsw-4200u 


Build changes
-------------

  * Linux: CI_DRM_10222 -> Patchwork_20362

  CI-20190529: 20190529
  CI_DRM_10222: 9b5675dc51137543709a5ec444b0d7076e43198e @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6105: 598a154680374e7875ae9ffc98425abc57398b2f @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20362: d95873d0888ac8c680a5527c57d526378ec6ccf4 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

d95873d0888a drm/i915/gem: Zap the i915_gem_object_blt code
eda8112d6927 drm/i915/gem: Zap the client blt code
7e0037bd1a17 drm/i915/ttm: accelerated move implementation
1c74336e9b0c drm/i915/gt: Setup a default migration context on the GT
40f8090431c6 drm/i915/gt: Pipelined clear
a498403a00c0 drm/i915/gt: Pipelined page migration
eec7f1143f99 drm/i915/gt: Export the pinned context constructor and destructor
03305cf7c83e drm/i915/gt: Add a routine to iterate over the pagetables of a GTT
f748e9a88a62 drm/i915/gt: Add an insert_entry for gen8_ppgtt
52210b9e019a drm/i915: Introduce a ww transaction helper
161d138ac072 drm/i915: Break out dma_resv ww locking utilities to separate files
a9a395f920ef drm/i915: Reference objects on the ww object list

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/index.html


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v3] drm/i915/ttm: accelerated move implementation
  2021-06-14 17:55     ` [Intel-gfx] " Thomas Hellström
@ 2021-06-15 10:06       ` Ramalingam C
  -1 siblings, 0 replies; 44+ messages in thread
From: Ramalingam C @ 2021-06-15 10:06 UTC (permalink / raw)
  To: intel-gfx, dri-devel; +Cc: Matthew Auld, Thomas Hellström

Invokes the pipelined page migration through blt for
i915_ttm_move requests, covering both eviction and object clear.

v2:
 - subfunction for accel_move (Thomas)
 - engine_pm_get/put around context_move/clear (Thomas)
 - Invalidation at accel_clear (Thomas)

v3:
 - Timeout is set for MAX_SCHEDULE_TIMEOUT (Thomas)
 - s/TTM_PL_PRIV/I915_PL_LMEM0 (Thomas)

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 88 +++++++++++++++++++++----
 1 file changed, 75 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index 3748098b42d5..94571757fb42 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -15,6 +15,9 @@
 #include "gem/i915_gem_ttm.h"
 #include "gem/i915_gem_mman.h"
 
+#include "gt/intel_migrate.h"
+#include "gt/intel_engine_pm.h"
+
 #define I915_PL_LMEM0 TTM_PL_PRIV
 #define I915_PL_SYSTEM TTM_PL_SYSTEM
 #define I915_PL_STOLEN TTM_PL_VRAM
@@ -282,6 +285,62 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
 	return intel_region_ttm_node_to_st(obj->mm.region, res->mm_node);
 }
 
+static int i915_ttm_accel_move(struct ttm_buffer_object *bo,
+			       struct ttm_resource *dst_mem,
+			       struct sg_table *dst_st)
+{
+	struct drm_i915_private *i915 = container_of(bo->bdev, typeof(*i915),
+						     bdev);
+	struct ttm_resource_manager *src_man =
+		ttm_manager_type(bo->bdev, bo->mem.mem_type);
+	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
+	struct sg_table *src_st;
+	struct i915_request *rq;
+	int ret;
+
+	if (!i915->gt.migrate.context)
+		return -EINVAL;
+
+	if (!bo->ttm || !ttm_tt_is_populated(bo->ttm)) {
+		if (bo->type == ttm_bo_type_kernel)
+			return -EINVAL;
+
+		if (bo->ttm &&
+		    !(bo->ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC))
+			return 0;
+
+		intel_engine_pm_get(i915->gt.migrate.context->engine);
+		ret = intel_context_migrate_clear(i915->gt.migrate.context, NULL,
+						  dst_st->sgl, I915_CACHE_NONE,
+						  dst_mem->mem_type >= I915_PL_LMEM0,
+						  0, &rq);
+
+		if (!ret && rq) {
+			i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT);
+			i915_request_put(rq);
+		}
+		intel_engine_pm_put(i915->gt.migrate.context->engine);
+	} else {
+		src_st = src_man->use_tt ? i915_ttm_tt_get_st(bo->ttm) :
+						obj->ttm.cached_io_st;
+
+		intel_engine_pm_get(i915->gt.migrate.context->engine);
+		ret = intel_context_migrate_copy(i915->gt.migrate.context,
+						 NULL, src_st->sgl, I915_CACHE_NONE,
+						 bo->mem.mem_type >= I915_PL_LMEM0,
+						 dst_st->sgl, I915_CACHE_NONE,
+						 dst_mem->mem_type >= I915_PL_LMEM0,
+						 &rq);
+		if (!ret && rq) {
+			i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT);
+			i915_request_put(rq);
+		}
+		intel_engine_pm_put(i915->gt.migrate.context->engine);
+	}
+
+	return ret;
+}
+
 static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
 			 struct ttm_operation_ctx *ctx,
 			 struct ttm_resource *dst_mem,
@@ -332,19 +391,22 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
 	if (IS_ERR(dst_st))
 		return PTR_ERR(dst_st);
 
-	/* If we start mapping GGTT, we can no longer use man::use_tt here. */
-	dst_iter = dst_man->use_tt ?
-		ttm_kmap_iter_tt_init(&_dst_iter.tt, bo->ttm) :
-		ttm_kmap_iter_iomap_init(&_dst_iter.io, &dst_reg->iomap,
-					 dst_st, dst_reg->region.start);
-
-	src_iter = src_man->use_tt ?
-		ttm_kmap_iter_tt_init(&_src_iter.tt, bo->ttm) :
-		ttm_kmap_iter_iomap_init(&_src_iter.io, &src_reg->iomap,
-					 obj->ttm.cached_io_st,
-					 src_reg->region.start);
-
-	ttm_move_memcpy(bo, dst_mem->num_pages, dst_iter, src_iter);
+	ret = i915_ttm_accel_move(bo, dst_mem, dst_st);
+	if (ret) {
+		/* If we start mapping GGTT, we can no longer use man::use_tt here. */
+		dst_iter = dst_man->use_tt ?
+			ttm_kmap_iter_tt_init(&_dst_iter.tt, bo->ttm) :
+			ttm_kmap_iter_iomap_init(&_dst_iter.io, &dst_reg->iomap,
+						 dst_st, dst_reg->region.start);
+
+		src_iter = src_man->use_tt ?
+			ttm_kmap_iter_tt_init(&_src_iter.tt, bo->ttm) :
+			ttm_kmap_iter_iomap_init(&_src_iter.io, &src_reg->iomap,
+						 obj->ttm.cached_io_st,
+						 src_reg->region.start);
+
+		ttm_move_memcpy(bo, dst_mem->num_pages, dst_iter, src_iter);
+	}
 	ttm_bo_move_sync_cleanup(bo, dst_mem);
 	i915_ttm_free_cached_io_st(obj);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 44+ messages in thread


* [Intel-gfx] ✓ Fi.CI.IGT: success for i915 TTM sync accelerated migration and clear
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
                   ` (14 preceding siblings ...)
  (?)
@ 2021-06-15 10:19 ` Patchwork
  -1 siblings, 0 replies; 44+ messages in thread
From: Patchwork @ 2021-06-15 10:19 UTC (permalink / raw)
  To: Ramalingam C; +Cc: intel-gfx



== Series Details ==

Series: i915 TTM sync accelerated migration and clear
URL   : https://patchwork.freedesktop.org/series/91463/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10222_full -> Patchwork_20362_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

New tests
---------

  New tests have been introduced between CI_DRM_10222_full and Patchwork_20362_full:

### New IGT tests (2) ###

  * igt@i915_selftest@live@migrate:
    - Statuses : 6 pass(s)
    - Exec time: [0.50, 8.43] s

  * igt@i915_selftest@perf@migrate:
    - Statuses : 5 pass(s)
    - Exec time: [1.27, 5.56] s

  

Known issues
------------

  Here are the changes found in Patchwork_20362_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_create@create-clear:
    - shard-skl:          [PASS][1] -> [FAIL][2] ([i915#3160])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl10/igt@gem_create@create-clear.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl9/igt@gem_create@create-clear.html

  * igt@gem_ctx_isolation@preservation-s3@vcs0:
    - shard-apl:          NOTRUN -> [DMESG-WARN][3] ([i915#180])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl6/igt@gem_ctx_isolation@preservation-s3@vcs0.html

  * igt@gem_ctx_isolation@preservation-s3@vecs0:
    - shard-kbl:          [PASS][4] -> [DMESG-WARN][5] ([i915#180]) +4 similar issues
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl2/igt@gem_ctx_isolation@preservation-s3@vecs0.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl2/igt@gem_ctx_isolation@preservation-s3@vecs0.html

  * igt@gem_ctx_persistence@legacy-engines-queued:
    - shard-snb:          NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#1099]) +5 similar issues
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-snb6/igt@gem_ctx_persistence@legacy-engines-queued.html

  * igt@gem_exec_fair@basic-deadline:
    - shard-apl:          NOTRUN -> [FAIL][7] ([i915#2846])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl3/igt@gem_exec_fair@basic-deadline.html

  * igt@gem_exec_fair@basic-none@vcs0:
    - shard-kbl:          NOTRUN -> [FAIL][8] ([i915#2842]) +2 similar issues
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl3/igt@gem_exec_fair@basic-none@vcs0.html
    - shard-apl:          NOTRUN -> [FAIL][9] ([i915#2842])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl6/igt@gem_exec_fair@basic-none@vcs0.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
    - shard-tglb:         [PASS][10] -> [FAIL][11] ([i915#2842]) +2 similar issues
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-tglb2/igt@gem_exec_fair@basic-pace-share@rcs0.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-tglb5/igt@gem_exec_fair@basic-pace-share@rcs0.html

  * igt@gem_exec_fair@basic-pace@rcs0:
    - shard-glk:          [PASS][12] -> [FAIL][13] ([i915#2842])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-glk5/igt@gem_exec_fair@basic-pace@rcs0.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-glk4/igt@gem_exec_fair@basic-pace@rcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-iclb:         [PASS][14] -> [FAIL][15] ([i915#2849])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-iclb2/igt@gem_exec_fair@basic-throttle@rcs0.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-iclb8/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_exec_reloc@basic-wide-active@rcs0:
    - shard-snb:          NOTRUN -> [FAIL][16] ([i915#2389]) +2 similar issues
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-snb6/igt@gem_exec_reloc@basic-wide-active@rcs0.html

  * igt@gem_pread@exhaustion:
    - shard-apl:          NOTRUN -> [WARN][17] ([i915#2658])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl6/igt@gem_pread@exhaustion.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-apl:          NOTRUN -> [SKIP][18] ([fdo#109271] / [i915#3323])
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl8/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gen9_exec_parse@allowed-single:
    - shard-skl:          [PASS][19] -> [DMESG-WARN][20] ([i915#1436] / [i915#716])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl5/igt@gen9_exec_parse@allowed-single.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl10/igt@gen9_exec_parse@allowed-single.html

  * igt@gen9_exec_parse@batch-invalid-length:
    - shard-snb:          NOTRUN -> [SKIP][21] ([fdo#109271]) +531 similar issues
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-snb6/igt@gen9_exec_parse@batch-invalid-length.html

  * igt@gen9_exec_parse@bb-large:
    - shard-apl:          NOTRUN -> [FAIL][22] ([i915#3296])
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl3/igt@gen9_exec_parse@bb-large.html

  * igt@i915_pm_dc@dc9-dpms:
    - shard-apl:          NOTRUN -> [FAIL][23] ([i915#3343])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl8/igt@i915_pm_dc@dc9-dpms.html

  * igt@i915_suspend@debugfs-reader:
    - shard-apl:          [PASS][24] -> [DMESG-WARN][25] ([i915#180])
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-apl7/igt@i915_suspend@debugfs-reader.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl1/igt@i915_suspend@debugfs-reader.html

  * igt@kms_chamelium@hdmi-audio-edid:
    - shard-kbl:          NOTRUN -> [SKIP][26] ([fdo#109271] / [fdo#111827]) +5 similar issues
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl1/igt@kms_chamelium@hdmi-audio-edid.html

  * igt@kms_chamelium@vga-hpd-without-ddc:
    - shard-snb:          NOTRUN -> [SKIP][27] ([fdo#109271] / [fdo#111827]) +26 similar issues
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-snb5/igt@kms_chamelium@vga-hpd-without-ddc.html

  * igt@kms_color_chamelium@pipe-a-ctm-limited-range:
    - shard-apl:          NOTRUN -> [SKIP][28] ([fdo#109271] / [fdo#111827]) +29 similar issues
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl7/igt@kms_color_chamelium@pipe-a-ctm-limited-range.html

  * igt@kms_content_protection@atomic:
    - shard-apl:          NOTRUN -> [TIMEOUT][29] ([i915#1319])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl3/igt@kms_content_protection@atomic.html

  * igt@kms_content_protection@uevent:
    - shard-apl:          NOTRUN -> [FAIL][30] ([i915#2105])
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl3/igt@kms_content_protection@uevent.html

  * igt@kms_cursor_crc@pipe-a-cursor-suspend:
    - shard-skl:          [PASS][31] -> [INCOMPLETE][32] ([i915#2828] / [i915#300])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl3/igt@kms_cursor_crc@pipe-a-cursor-suspend.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl5/igt@kms_cursor_crc@pipe-a-cursor-suspend.html

  * igt@kms_cursor_crc@pipe-b-cursor-size-change:
    - shard-skl:          [PASS][33] -> [FAIL][34] ([i915#3444]) +1 similar issue
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl9/igt@kms_cursor_crc@pipe-b-cursor-size-change.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl9/igt@kms_cursor_crc@pipe-b-cursor-size-change.html

  * igt@kms_cursor_edge_walk@pipe-a-64x64-bottom-edge:
    - shard-skl:          [PASS][35] -> [DMESG-WARN][36] ([i915#1982]) +3 similar issues
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl3/igt@kms_cursor_edge_walk@pipe-a-64x64-bottom-edge.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl1/igt@kms_cursor_edge_walk@pipe-a-64x64-bottom-edge.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - shard-skl:          [PASS][37] -> [FAIL][38] ([i915#2346])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl10/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl6/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_cursor_legacy@pipe-d-torture-bo:
    - shard-apl:          NOTRUN -> [SKIP][39] ([fdo#109271] / [i915#533]) +2 similar issues
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl2/igt@kms_cursor_legacy@pipe-d-torture-bo.html

  * igt@kms_frontbuffer_tracking@psr-2p-primscrn-pri-indfb-draw-mmap-wc:
    - shard-skl:          NOTRUN -> [SKIP][40] ([fdo#109271]) +2 similar issues
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl8/igt@kms_frontbuffer_tracking@psr-2p-primscrn-pri-indfb-draw-mmap-wc.html

  * igt@kms_hdr@bpc-switch:
    - shard-skl:          [PASS][41] -> [FAIL][42] ([i915#1188])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl5/igt@kms_hdr@bpc-switch.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl10/igt@kms_hdr@bpc-switch.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-basic:
    - shard-apl:          NOTRUN -> [FAIL][43] ([fdo#108145] / [i915#265]) +2 similar issues
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl8/igt@kms_plane_alpha_blend@pipe-a-alpha-basic.html

  * igt@kms_plane_alpha_blend@pipe-c-alpha-basic:
    - shard-kbl:          NOTRUN -> [FAIL][44] ([fdo#108145] / [i915#265])
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl3/igt@kms_plane_alpha_blend@pipe-c-alpha-basic.html

  * igt@kms_plane_multiple@atomic-pipe-d-tiling-x:
    - shard-kbl:          NOTRUN -> [SKIP][45] ([fdo#109271]) +122 similar issues
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl3/igt@kms_plane_multiple@atomic-pipe-d-tiling-x.html

  * igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping:
    - shard-apl:          NOTRUN -> [SKIP][46] ([fdo#109271] / [i915#2733])
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl2/igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping.html
    - shard-kbl:          NOTRUN -> [SKIP][47] ([fdo#109271] / [i915#2733])
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl1/igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping.html

  * igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-4:
    - shard-apl:          NOTRUN -> [SKIP][48] ([fdo#109271] / [i915#658]) +3 similar issues
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl6/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-4.html
    - shard-kbl:          NOTRUN -> [SKIP][49] ([fdo#109271] / [i915#658]) +3 similar issues
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl3/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-4.html

  * igt@kms_psr@psr2_sprite_mmap_cpu:
    - shard-iclb:         [PASS][50] -> [SKIP][51] ([fdo#109441]) +1 similar issue
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-iclb2/igt@kms_psr@psr2_sprite_mmap_cpu.html
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-iclb5/igt@kms_psr@psr2_sprite_mmap_cpu.html

  * igt@kms_vblank@pipe-d-ts-continuation-idle:
    - shard-apl:          NOTRUN -> [SKIP][52] ([fdo#109271]) +319 similar issues
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl8/igt@kms_vblank@pipe-d-ts-continuation-idle.html

  * igt@kms_writeback@writeback-fb-id:
    - shard-apl:          NOTRUN -> [SKIP][53] ([fdo#109271] / [i915#2437])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl6/igt@kms_writeback@writeback-fb-id.html
    - shard-kbl:          NOTRUN -> [SKIP][54] ([fdo#109271] / [i915#2437])
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl3/igt@kms_writeback@writeback-fb-id.html

  * igt@sysfs_clients@fair-7:
    - shard-apl:          NOTRUN -> [SKIP][55] ([fdo#109271] / [i915#2994]) +5 similar issues
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl2/igt@sysfs_clients@fair-7.html

  * igt@sysfs_clients@recycle-many:
    - shard-kbl:          NOTRUN -> [SKIP][56] ([fdo#109271] / [i915#2994])
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl3/igt@sysfs_clients@recycle-many.html

  
#### Possible fixes ####

  * igt@gem_create@create-clear:
    - shard-glk:          [FAIL][57] ([i915#1888] / [i915#3160]) -> [PASS][58]
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-glk5/igt@gem_create@create-clear.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-glk4/igt@gem_create@create-clear.html

  * igt@gem_eio@reset-stress:
    - shard-snb:          [TIMEOUT][59] ([i915#3427]) -> [PASS][60]
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-snb6/igt@gem_eio@reset-stress.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-snb5/igt@gem_eio@reset-stress.html

  * igt@gem_exec_fair@basic-deadline:
    - shard-kbl:          [FAIL][61] ([i915#2846]) -> [PASS][62]
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl7/igt@gem_exec_fair@basic-deadline.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl1/igt@gem_exec_fair@basic-deadline.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
    - shard-iclb:         [FAIL][63] ([i915#2842]) -> [PASS][64]
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-iclb8/igt@gem_exec_fair@basic-none-share@rcs0.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-iclb7/igt@gem_exec_fair@basic-none-share@rcs0.html

  * igt@gem_exec_fair@basic-pace-solo@rcs0:
    - shard-kbl:          [FAIL][65] ([i915#2842]) -> [PASS][66]
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl2/igt@gem_exec_fair@basic-pace-solo@rcs0.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl1/igt@gem_exec_fair@basic-pace-solo@rcs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
    - shard-kbl:          [SKIP][67] ([fdo#109271]) -> [PASS][68]
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl6/igt@gem_exec_fair@basic-pace@vecs0.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl6/igt@gem_exec_fair@basic-pace@vecs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-glk:          [FAIL][69] ([i915#2842]) -> [PASS][70] +2 similar issues
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-glk4/igt@gem_exec_fair@basic-throttle@rcs0.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-glk3/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_exec_whisper@basic-normal-all:
    - shard-glk:          [DMESG-WARN][71] ([i915#118] / [i915#95]) -> [PASS][72]
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-glk6/igt@gem_exec_whisper@basic-normal-all.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-glk3/igt@gem_exec_whisper@basic-normal-all.html

  * igt@gem_mmap_gtt@cpuset-big-copy-odd:
    - shard-iclb:         [FAIL][73] ([i915#307]) -> [PASS][74]
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-iclb8/igt@gem_mmap_gtt@cpuset-big-copy-odd.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-iclb6/igt@gem_mmap_gtt@cpuset-big-copy-odd.html

  * igt@gem_mmap_gtt@cpuset-medium-copy-xy:
    - shard-iclb:         [FAIL][75] ([i915#2428]) -> [PASS][76]
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-iclb3/igt@gem_mmap_gtt@cpuset-medium-copy-xy.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-iclb1/igt@gem_mmap_gtt@cpuset-medium-copy-xy.html

  * igt@i915_pm_sseu@full-enable:
    - shard-skl:          [FAIL][77] -> [PASS][78]
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl2/igt@i915_pm_sseu@full-enable.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl8/igt@i915_pm_sseu@full-enable.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size:
    - shard-skl:          [FAIL][79] ([i915#2346] / [i915#533]) -> [PASS][80]
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl3/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl1/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html

  * igt@kms_draw_crc@draw-method-rgb565-mmap-cpu-untiled:
    - shard-glk:          [FAIL][81] ([i915#3451]) -> [PASS][82]
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-glk4/igt@kms_draw_crc@draw-method-rgb565-mmap-cpu-untiled.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-glk1/igt@kms_draw_crc@draw-method-rgb565-mmap-cpu-untiled.html

  * igt@kms_flip@flip-vs-expired-vblank@c-edp1:
    - shard-skl:          [FAIL][83] ([i915#79]) -> [PASS][84]
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl7/igt@kms_flip@flip-vs-expired-vblank@c-edp1.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl7/igt@kms_flip@flip-vs-expired-vblank@c-edp1.html

  * igt@kms_flip@flip-vs-suspend-interruptible@a-dp1:
    - shard-kbl:          [DMESG-WARN][85] ([i915#180]) -> [PASS][86] +8 similar issues
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl2/igt@kms_flip@flip-vs-suspend-interruptible@a-dp1.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl1/igt@kms_flip@flip-vs-suspend-interruptible@a-dp1.html

  * igt@kms_flip@flip-vs-suspend@c-dp1:
    - shard-apl:          [DMESG-WARN][87] ([i915#180]) -> [PASS][88] +1 similar issue
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-apl2/igt@kms_flip@flip-vs-suspend@c-dp1.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl6/igt@kms_flip@flip-vs-suspend@c-dp1.html

  * igt@kms_flip@plain-flip-fb-recreate@b-edp1:
    - shard-skl:          [FAIL][89] ([i915#2122]) -> [PASS][90] +2 similar issues
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl9/igt@kms_flip@plain-flip-fb-recreate@b-edp1.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl2/igt@kms_flip@plain-flip-fb-recreate@b-edp1.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-indfb-msflip-blt:
    - shard-skl:          [DMESG-WARN][91] ([i915#1982]) -> [PASS][92]
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl2/igt@kms_frontbuffer_tracking@psr-1p-primscrn-indfb-msflip-blt.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl8/igt@kms_frontbuffer_tracking@psr-1p-primscrn-indfb-msflip-blt.html

  * igt@kms_psr@psr2_sprite_blt:
    - shard-iclb:         [SKIP][93] ([fdo#109441]) -> [PASS][94]
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-iclb5/igt@kms_psr@psr2_sprite_blt.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-iclb2/igt@kms_psr@psr2_sprite_blt.html

  * igt@kms_sequence@get-idle:
    - shard-snb:          [SKIP][95] ([fdo#109271]) -> [PASS][96]
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-snb7/igt@kms_sequence@get-idle.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-snb2/igt@kms_sequence@get-idle.html

  * igt@kms_vblank@pipe-a-ts-continuation-suspend:
    - shard-kbl:          [DMESG-WARN][97] ([i915#180] / [i915#295]) -> [PASS][98]
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl2/igt@kms_vblank@pipe-a-ts-continuation-suspend.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl3/igt@kms_vblank@pipe-a-ts-continuation-suspend.html
    - shard-apl:          [DMESG-WARN][99] ([i915#180] / [i915#295]) -> [PASS][100]
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-apl2/igt@kms_vblank@pipe-a-ts-continuation-suspend.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl8/igt@kms_vblank@pipe-a-ts-continuation-suspend.html

  * igt@kms_vblank@pipe-b-ts-continuation-dpms-suspend:
    - shard-kbl:          [INCOMPLETE][101] ([i915#155] / [i915#2828]) -> [PASS][102]
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl3/igt@kms_vblank@pipe-b-ts-continuation-dpms-suspend.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl1/igt@kms_vblank@pipe-b-ts-continuation-dpms-suspend.html

  
#### Warnings ####

  * igt@kms_psr2_sf@plane-move-sf-dmg-area-0:
    - shard-iclb:         [SKIP][103] ([i915#658]) -> [SKIP][104] ([i915#2920]) +3 similar issues
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-iclb5/igt@kms_psr2_sf@plane-move-sf-dmg-area-0.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-iclb2/igt@kms_psr2_sf@plane-move-sf-dmg-area-0.html

  * igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-5:
    - shard-iclb:         [SKIP][105] ([i915#2920]) -> [SKIP][106] ([i915#658]) +1 similar issue
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-iclb2/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-5.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-iclb5/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-5.html

  * igt@runner@aborted:
    - shard-kbl:          ([FAIL][107], [FAIL][108], [FAIL][109], [FAIL][110], [FAIL][111], [FAIL][112], [FAIL][113], [FAIL][114]) ([i915#1436] / [i915#180] / [i915#1814] / [i915#2505] / [i915#3002] / [i915#3363] / [i915#602]) -> ([FAIL][115], [FAIL][116], [FAIL][117], [FAIL][118], [FAIL][119], [FAIL][120], [FAIL][121]) ([i915#1436] / [i915#180] / [i915#1814] / [i915#2505] / [i915#3002] / [i915#3363])
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl2/igt@runner@aborted.html
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl7/igt@runner@aborted.html
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl6/igt@runner@aborted.html
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl2/igt@runner@aborted.html
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl2/igt@runner@aborted.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl6/igt@runner@aborted.html
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl2/igt@runner@aborted.html
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-kbl2/igt@runner@aborted.html
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl2/igt@runner@aborted.html
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl2/igt@runner@aborted.html
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl6/igt@runner@aborted.html
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl6/igt@runner@aborted.html
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl6/igt@runner@aborted.html
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl7/igt@runner@aborted.html
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-kbl6/igt@runner@aborted.html
    - shard-apl:          ([FAIL][122], [FAIL][123], [FAIL][124], [FAIL][125]) ([i915#180] / [i915#1814] / [i915#3002] / [i915#3363]) -> ([FAIL][126], [FAIL][127], [FAIL][128]) ([i915#180] / [i915#3002] / [i915#3363])
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-apl2/igt@runner@aborted.html
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-apl2/igt@runner@aborted.html
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-apl3/igt@runner@aborted.html
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-apl2/igt@runner@aborted.html
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl6/igt@runner@aborted.html
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl6/igt@runner@aborted.html
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-apl1/igt@runner@aborted.html
    - shard-skl:          ([FAIL][129], [FAIL][130]) ([i915#1814] / [i915#2029] / [i915#3002] / [i915#3363]) -> ([FAIL][131], [FAIL][132]) ([i915#1436] / [i915#1814] / [i915#2029] / [i915#3363])
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl2/igt@runner@aborted.html
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10222/shard-skl3/igt@runner@aborted.html
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl10/igt@runner@aborted.html
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/shard-skl3/igt@runner@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1099]: https://gitlab.freedesktop.org/drm/intel/issues/1099
  [i915#118]: https://gitlab.freedesktop.org/drm/intel/issues/118
  [i915#1188]: https://gitlab.freedesktop.org/drm/intel/issues/1188
  [i915#1319]: https://gitlab.freedesktop.org/drm/intel/issues/1319
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#1814]: https://gitlab.freedesktop.org/drm/intel/issues/1814
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2029]: https://gitlab.freedesktop.org/drm/intel/issues/2029
  [i915#2105]: https://gitlab.freedesktop.org/drm/intel/issues/2105
  [i915#2122]: https://gitlab.freedesktop.org/drm/intel/issues/2122
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2389]: https://gitlab.freedesktop.org/drm/intel/issues/2389
  [i915#2428]: https://gitlab.freedesktop.org/drm/intel/issues/2428
  [i915#2437]: https://gitlab.freedesktop.org/drm/intel/issues/2437
  [i915#2505]: https://gitlab.freedesktop.org/drm/intel/issues/2505
  [i915#265]: https://gitlab.freedesktop.org/drm/intel/issues/265
  [i915#2658]: https://gitlab.freedesktop.org/drm/intel/issues/2658
  [i915#2733]: https://gitlab.freedesktop.org/drm/intel/issues/2733
  [i915#2828]: https://gitlab.freedesktop.org/drm/intel/issues/2828
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2846]: https://gitlab.freedesktop.org/drm/intel/issues/2846
  [i915#2849]: https://gitlab.freedesktop.org/drm/intel/issues/2849
  [i915#2920]: https://gitlab.freedesktop.org/drm/intel/issues/2920
  [i915#295]: https://gitlab.freedesktop.org/drm/intel/issues/295
  [i915#2994]: https://gitlab.freedesktop.org/drm/intel/issues/2994
  [i915#300]: https://gitlab.freedesktop.org/drm/intel/issues/300
  [i915#3002]: https://gitlab.freedesktop.org/drm/intel/issues/3002
  [i915#307]: https://gitlab.freedesktop.org/drm/intel/issues/307
  [i915#3160]: https://gitlab.freedesktop.org/drm/intel/issues/3160
  [i915#3296]: https://gitlab.freedesktop.org/drm/intel/issues/3296
  [i915#3323]: https://gitlab.freedesktop.org/drm/intel/issues/3323
  [i915#3343]: https://gitlab.freedesktop.org/drm/intel/issues/3343
  [i915#3363]: https://gitlab.freedesktop.org/drm/intel/issues/3363
  [i915#3427]: https://gitlab.freedesktop.org/drm/intel/issues/3427
  [i915#3444]: https://gitlab.freedesktop.org/drm/intel/issues/3444
  [i915#3451]: https://gitlab.freedesktop.org/drm/intel/issues/3451
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#602]: https://gitlab.freedesktop.org/drm/intel/issues/602
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#716]: https://gitlab.freedesktop.org/drm/intel/issues/716
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (10 -> 10)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * Linux: CI_DRM_10222 -> Patchwork_20362

  CI-20190529: 20190529
  CI_DRM_10222: 9b5675dc51137543709a5ec444b0d7076e43198e @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6105: 598a154680374e7875ae9ffc98425abc57398b2f @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20362: d95873d0888ac8c680a5527c57d526378ec6ccf4 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20362/index.html


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for i915 TTM sync accelerated migration and clear (rev2)
  2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
                   ` (15 preceding siblings ...)
  (?)
@ 2021-06-15 11:45 ` Patchwork
  -1 siblings, 0 replies; 44+ messages in thread
From: Patchwork @ 2021-06-15 11:45 UTC (permalink / raw)
  To: Ramalingam C; +Cc: intel-gfx

== Series Details ==

Series: i915 TTM sync accelerated migration and clear (rev2)
URL   : https://patchwork.freedesktop.org/series/91463/
State : failure

== Summary ==

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  DESCEND objtool
  CHK     include/generated/compile.h
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_ttm.o
drivers/gpu/drm/i915/gem/i915_gem_ttm.c: In function ‘i915_ttm_accel_move’:
drivers/gpu/drm/i915/gem/i915_gem_ttm.c:295:32: error: ‘struct ttm_buffer_object’ has no member named ‘mem’
   ttm_manager_type(bo->bdev, bo->mem.mem_type);
                                ^~
drivers/gpu/drm/i915/gem/i915_gem_ttm.c:330:10: error: ‘struct ttm_buffer_object’ has no member named ‘mem’
        bo->mem.mem_type >= I915_PL_LMEM0,
          ^~
scripts/Makefile.build:272: recipe for target 'drivers/gpu/drm/i915/gem/i915_gem_ttm.o' failed
make[4]: *** [drivers/gpu/drm/i915/gem/i915_gem_ttm.o] Error 1
scripts/Makefile.build:515: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:515: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:515: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1844: recipe for target 'drivers' failed
make: *** [drivers] Error 2
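
Note: both errors above come from accesses to bo->mem in
i915_ttm_accel_move(). A plausible cause is a rebase mismatch with the TTM
core change that replaced the embedded ttm_buffer_object.mem with a
ttm_resource pointer. Below is a minimal illustrative sketch of the kind of
adjustment that change implies, assuming the target tree exposes
bo->resource; the helper name and header choices are hypothetical and this
is not the series' actual fix.

  /*
   * Illustrative only: not code from this series. The helper and header
   * names are hypothetical; the point is the bo->mem -> bo->resource
   * access pattern, assuming the tree carries the TTM resource-pointer
   * conversion.
   */
  #include <drm/ttm/ttm_bo_api.h>   /* struct ttm_buffer_object (name varies by tree) */
  #include <drm/ttm/ttm_device.h>   /* ttm_manager_type() */

  static struct ttm_resource_manager *
  i915_ttm_bo_manager(struct ttm_buffer_object *bo)
  {
          /* Old access that no longer builds on this tree: */
          /* return ttm_manager_type(bo->bdev, bo->mem.mem_type); */

          /* Likely replacement if the tree uses a resource pointer: */
          return ttm_manager_type(bo->bdev, bo->resource->mem_type);
  }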


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 07/12] drm/i915/gt: Pipelined page migration
  2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
@ 2021-06-15 14:45     ` Matthew Auld
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Auld @ 2021-06-15 14:45 UTC (permalink / raw)
  To: Thomas Hellström, intel-gfx, dri-devel; +Cc: Chris Wilson

On 14/06/2021 17:26, Thomas Hellström wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> If we pipeline the PTE updates and then do the copy of those pages
> within a single unpreemptible command packet, we can submit the copies
> and leave them to be scheduled without having to synchronously wait
> under a global lock. In order to manage migration, we need to
> preallocate the page tables (and keep them pinned and available for use
> at any time), causing a bottleneck for migrations as all clients must
> contend for the limited resources. By inlining the ppGTT updates and
> performing the blit atomically, each client only owns the PTE while in
> use, and so we can reschedule individual operations however we see fit.
> And most importantly, we do not need to take a global lock on the shared
> vm, and wait until the operation is complete before releasing the lock
> for others to claim the PTE for themselves.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
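
To make the batch layout described above concrete, here is a minimal
illustrative sketch, not the code added by this patch: emit_pte() and
emit_copy() are hypothetical stand-ins and the window handling is
simplified to a single chunk. The point is that the PTE writes and the blit
share one unpreemptible request, so no global vm lock is held while the
copy runs.

  /*
   * Illustrative sketch only (hypothetical helpers, simplified to one
   * window). Bind the source and destination windows, then copy through
   * them, all within the same request.
   */
  static int migrate_copy_one_window(struct i915_request *rq,
                                     struct scatterlist *src,
                                     struct scatterlist *dst,
                                     u64 src_offset, u64 dst_offset,
                                     u32 window_len)
  {
          int err;

          /* 1) Inline the ppGTT update: point the source window at @src. */
          err = emit_pte(rq, src, src_offset, window_len);
          if (err)
                  return err;

          /* 2) ...and the destination window at @dst. */
          err = emit_pte(rq, dst, dst_offset, window_len);
          if (err)
                  return err;

          /*
           * 3) Copy through the windows. Because the PTE writes and the
           * blit live in the same unpreemptible packet, nothing can reuse
           * the window PTEs mid-operation, so no global lock is needed.
           */
          return emit_copy(rq, dst_offset, src_offset, window_len);
  }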


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Intel-gfx] [PATCH v3 07/12] drm/i915/gt: Pipelined page migration
@ 2021-06-15 14:45     ` Matthew Auld
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Auld @ 2021-06-15 14:45 UTC (permalink / raw)
  To: Thomas Hellström, intel-gfx, dri-devel; +Cc: Chris Wilson

On 14/06/2021 17:26, Thomas Hellström wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> If we pipeline the PTE updates and then do the copy of those pages
> within a single unpreemptible command packet, we can submit the copies
> and leave them to be scheduled without having to synchronously wait
> under a global lock. In order to manage migration, we need to
> preallocate the page tables (and keep them pinned and available for use
> at any time), causing a bottleneck for migrations as all clients must
> contend for the limited resources. By inlining the ppGTT updates and
> performing the blit atomically, each client only owns the PTE while in
> use, and so we can reschedule individual operations however we see fit.
> And most importantly, we do not need to take a global lock on the shared
> vm, and wait until the operation is complete before releasing the lock
> for others to claim the PTE for themselves.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 08/12] drm/i915/gt: Pipelined clear
  2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
@ 2021-06-15 14:47     ` Matthew Auld
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Auld @ 2021-06-15 14:47 UTC (permalink / raw)
  To: Thomas Hellström, intel-gfx, dri-devel; +Cc: Chris Wilson

On 14/06/2021 17:26, Thomas Hellström wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Update the PTE and emit a clear within a single unpreemptible packet
> such that we can schedule and pipeline clears.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
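
The clear follows the same pattern as the migration copy sketched earlier
in the thread, with a fill emitted instead of a blit. Again a purely
illustrative sketch with hypothetical helpers, not the patch's code:

  /* Illustrative only: bind the window PTEs and clear through them in one packet. */
  static int migrate_clear_one_window(struct i915_request *rq,
                                      struct scatterlist *sg, u64 offset,
                                      u32 window_len, u32 value)
  {
          int err;

          err = emit_pte(rq, sg, offset, window_len);   /* hypothetical helper */
          if (err)
                  return err;

          return emit_clear(rq, offset, window_len, value);   /* hypothetical helper */
  }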


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [Intel-gfx] [PATCH v3 08/12] drm/i915/gt: Pipelined clear
@ 2021-06-15 14:47     ` Matthew Auld
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Auld @ 2021-06-15 14:47 UTC (permalink / raw)
  To: Thomas Hellström, intel-gfx, dri-devel; +Cc: Chris Wilson

On 14/06/2021 17:26, Thomas Hellström wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Update the PTE and emit a clear within a single unpreemptible packet
> such that we can schedule and pipeline clears.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2021-06-15 14:48 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-14 16:26 [PATCH v3 00/12] i915 TTM sync accelerated migration and clear Thomas Hellström
2021-06-14 16:26 ` [Intel-gfx] " Thomas Hellström
2021-06-14 16:26 ` [PATCH v3 01/12] drm/i915: Reference objects on the ww object list Thomas Hellström
2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
2021-06-14 16:26 ` [PATCH v3 02/12] drm/i915: Break out dma_resv ww locking utilities to separate files Thomas Hellström
2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
2021-06-14 16:26 ` [PATCH v3 03/12] drm/i915: Introduce a ww transaction helper Thomas Hellström
2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
2021-06-14 16:26 ` [PATCH v3 04/12] drm/i915/gt: Add an insert_entry for gen8_ppgtt Thomas Hellström
2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
2021-06-14 16:26 ` [PATCH v3 05/12] drm/i915/gt: Add a routine to iterate over the pagetables of a GTT Thomas Hellström
2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
2021-06-14 16:26 ` [PATCH v3 06/12] drm/i915/gt: Export the pinned context constructor and destructor Thomas Hellström
2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
2021-06-14 16:26 ` [PATCH v3 07/12] drm/i915/gt: Pipelined page migration Thomas Hellström
2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
2021-06-15 14:45   ` Matthew Auld
2021-06-15 14:45     ` [Intel-gfx] " Matthew Auld
2021-06-14 16:26 ` [PATCH v3 08/12] drm/i915/gt: Pipelined clear Thomas Hellström
2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
2021-06-15 14:47   ` Matthew Auld
2021-06-15 14:47     ` [Intel-gfx] " Matthew Auld
2021-06-14 16:26 ` [PATCH v3 09/12] drm/i915/gt: Setup a default migration context on the GT Thomas Hellström
2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
2021-06-14 16:26 ` [PATCH v3 10/12] drm/i915/ttm: accelerated move implementation Thomas Hellström
2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
2021-06-14 17:55   ` Thomas Hellström
2021-06-14 17:55     ` [Intel-gfx] " Thomas Hellström
2021-06-15 10:06     ` [PATCH v3] " Ramalingam C
2021-06-15 10:06       ` [Intel-gfx] " Ramalingam C
2021-06-14 16:26 ` [PATCH v3 11/12] drm/i915/gem: Zap the client blt code Thomas Hellström
2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
2021-06-14 16:33   ` Matthew Auld
2021-06-14 16:33     ` [Intel-gfx] " Matthew Auld
2021-06-14 16:40     ` Thomas Hellström
2021-06-14 16:40       ` [Intel-gfx] " Thomas Hellström
2021-06-14 16:26 ` [PATCH v3 12/12] drm/i915/gem: Zap the i915_gem_object_blt code Thomas Hellström
2021-06-14 16:26   ` [Intel-gfx] " Thomas Hellström
2021-06-14 16:43   ` Matthew Auld
2021-06-14 16:43     ` [Intel-gfx] " Matthew Auld
2021-06-15  0:55 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for i915 TTM sync accelerated migration and clear Patchwork
2021-06-15  1:24 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-06-15 10:19 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
2021-06-15 11:45 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for i915 TTM sync accelerated migration and clear (rev2) Patchwork
