All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map()
@ 2020-07-08 13:47 Chris Wilson
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 2/7] drm/i915: Release shortlived maps of longlived objects Chris Wilson
                   ` (10 more replies)
  0 siblings, 11 replies; 22+ messages in thread
From: Chris Wilson @ 2020-07-08 13:47 UTC (permalink / raw)
  To: intel-gfx; +Cc: matthew.auld, Chris Wilson

As we have a pin_map interface, that knows how to flush the data to the
device, use it. The only downside is that we keep the kmap around, as
once acquired we keep the mapping cached until the object's backing
store is released.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index e866b8d721ed..02a38810bcd3 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -3880,7 +3880,6 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 	struct i915_wa_ctx_bb *wa_bb[2] = { &wa_ctx->indirect_ctx,
 					    &wa_ctx->per_ctx };
 	wa_bb_func_t wa_bb_fn[2];
-	struct page *page;
 	void *batch, *batch_ptr;
 	unsigned int i;
 	int ret;
@@ -3916,14 +3915,14 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 		return ret;
 	}
 
-	page = i915_gem_object_get_dirty_page(wa_ctx->vma->obj, 0);
-	batch = batch_ptr = kmap_atomic(page);
+	batch = i915_gem_object_pin_map(wa_ctx->vma->obj, I915_MAP_WB);
 
 	/*
 	 * Emit the two workaround batch buffers, recording the offset from the
 	 * start of the workaround batch buffer object for each and their
 	 * respective sizes.
 	 */
+	batch_ptr = batch;
 	for (i = 0; i < ARRAY_SIZE(wa_bb_fn); i++) {
 		wa_bb[i]->offset = batch_ptr - batch;
 		if (GEM_DEBUG_WARN_ON(!IS_ALIGNED(wa_bb[i]->offset,
@@ -3935,10 +3934,10 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 			batch_ptr = wa_bb_fn[i](engine, batch_ptr);
 		wa_bb[i]->size = batch_ptr - (batch + wa_bb[i]->offset);
 	}
+	GEM_BUG_ON(batch_ptr - batch > CTX_WA_BB_OBJ_SIZE);
 
-	BUG_ON(batch_ptr - batch > CTX_WA_BB_OBJ_SIZE);
-
-	kunmap_atomic(batch);
+	__i915_gem_object_flush_map(wa_ctx->vma->obj, 0, batch_ptr - batch);
+	i915_gem_object_unpin_map(wa_ctx->vma->obj);
 	if (ret)
 		lrc_destroy_wa_ctx(engine);
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Intel-gfx] [PATCH 2/7] drm/i915: Release shortlived maps of longlived objects
  2020-07-08 13:47 [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Chris Wilson
@ 2020-07-08 13:47 ` Chris Wilson
  2020-07-08 13:50   ` Chris Wilson
  2020-07-08 14:35   ` [Intel-gfx] [PATCH v2] " Chris Wilson
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 3/7] drm/i915: Remove i915_gem_object_get_dirty_page() Chris Wilson
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 22+ messages in thread
From: Chris Wilson @ 2020-07-08 13:47 UTC (permalink / raw)
  To: intel-gfx; +Cc: matthew.auld, Chris Wilson

Some objects we map once during their construction, and then never
access their mappings again, even if they are kept around for the
duration of the driver. Keeping those pages mapped, often vmapped, is
therefore wasteful and we should release the maps as soon as we no
longer need them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  2 ++
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     | 20 +++++++++++++++++++
 drivers/gpu/drm/i915/gt/gen7_renderclear.c    |  1 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  1 +
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  1 +
 drivers/gpu/drm/i915/i915_perf.c              |  2 ++
 6 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 2faa481cc18f..a26a4faf014f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -394,6 +394,8 @@ static inline void i915_gem_object_unpin_map(struct drm_i915_gem_object *obj)
 	i915_gem_object_unpin_pages(obj);
 }
 
+int i915_gem_object_release_map(struct drm_i915_gem_object *obj);
+
 void
 i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
 				   unsigned int flush_domains);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index af9e48ee4a33..114256014a97 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -408,6 +408,26 @@ void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj,
 	}
 }
 
+int i915_gem_object_release_map(struct drm_i915_gem_object *obj)
+{
+	int err;
+
+	err = mutex_lock_interruptible(&obj->mm.lock);
+	if (err)
+		return err;
+
+	if (atomic_read(&obj->mm.pages_pin_count)) {
+		err = -EBUSY;
+	} else if (obj->mm.mapping) {
+		unmap_object(obj, page_mask_bits(obj->mm.mapping));
+		obj->mm.mapping = NULL;
+	}
+
+	mutex_unlock(&obj->mm.lock);
+
+	return err;
+}
+
 struct scatterlist *
 i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
 		       unsigned int n,
diff --git a/drivers/gpu/drm/i915/gt/gen7_renderclear.c b/drivers/gpu/drm/i915/gt/gen7_renderclear.c
index de595b66a746..d2a94002ae6e 100644
--- a/drivers/gpu/drm/i915/gt/gen7_renderclear.c
+++ b/drivers/gpu/drm/i915/gt/gen7_renderclear.c
@@ -397,6 +397,7 @@ int gen7_setup_clear_gpr_bb(struct intel_engine_cs * const engine,
 
 	i915_gem_object_flush_map(vma->obj);
 	i915_gem_object_unpin_map(vma->obj);
+	i915_gem_object_release_map(vma->obj);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 02a38810bcd3..476dfc6be15e 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -3938,6 +3938,7 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 
 	__i915_gem_object_flush_map(wa_ctx->vma->obj, 0, batch_ptr - batch);
 	i915_gem_object_unpin_map(wa_ctx->vma->obj);
+	i915_gem_object_release_map(wa_ctx->vma->obj);
 	if (ret)
 		lrc_destroy_wa_ctx(engine);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 68a08486fc87..a1e1d24da2fb 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -544,6 +544,7 @@ alloc_context_vma(struct intel_engine_cs *engine)
 
 		i915_gem_object_flush_map(obj);
 		i915_gem_object_unpin_map(obj);
+		i915_gem_object_release_map(obj);
 	}
 
 	vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL);
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 25329b7600c9..34f82e0e912a 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1773,6 +1773,7 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
 
 	i915_gem_object_flush_map(bo);
 	i915_gem_object_unpin_map(bo);
+	i915_gem_object_release_map(bo);
 
 	stream->noa_wait = vma;
 	return 0;
@@ -1868,6 +1869,7 @@ alloc_oa_config_buffer(struct i915_perf_stream *stream,
 
 	i915_gem_object_flush_map(obj);
 	i915_gem_object_unpin_map(obj);
+	i915_gem_object_release_map(obj);
 
 	oa_bo->vma = i915_vma_instance(obj,
 				       &stream->engine->gt->ggtt->vm,
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Intel-gfx] [PATCH 3/7] drm/i915: Remove i915_gem_object_get_dirty_page()
  2020-07-08 13:47 [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Chris Wilson
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 2/7] drm/i915: Release shortlived maps of longlived objects Chris Wilson
@ 2020-07-08 13:47 ` Chris Wilson
  2020-07-08 16:40   ` Matthew Auld
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 4/7] drm/i915/gem: Unpin idle contexts from kswapd reclaim Chris Wilson
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2020-07-08 13:47 UTC (permalink / raw)
  To: intel-gfx; +Cc: matthew.auld, Chris Wilson

Last user removed, remove the convenience function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  4 ----
 drivers/gpu/drm/i915/gem/i915_gem_pages.c  | 14 --------------
 2 files changed, 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index a26a4faf014f..7234c257eb2a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -258,10 +258,6 @@ struct page *
 i915_gem_object_get_page(struct drm_i915_gem_object *obj,
 			 unsigned int n);
 
-struct page *
-i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj,
-			       unsigned int n);
-
 dma_addr_t
 i915_gem_object_get_dma_address_len(struct drm_i915_gem_object *obj,
 				    unsigned long n,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 114256014a97..5aa9f584b154 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -553,20 +553,6 @@ i915_gem_object_get_page(struct drm_i915_gem_object *obj, unsigned int n)
 	return nth_page(sg_page(sg), offset);
 }
 
-/* Like i915_gem_object_get_page(), but mark the returned page dirty */
-struct page *
-i915_gem_object_get_dirty_page(struct drm_i915_gem_object *obj,
-			       unsigned int n)
-{
-	struct page *page;
-
-	page = i915_gem_object_get_page(obj, n);
-	if (!obj->mm.dirty)
-		set_page_dirty(page);
-
-	return page;
-}
-
 dma_addr_t
 i915_gem_object_get_dma_address_len(struct drm_i915_gem_object *obj,
 				    unsigned long n,
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Intel-gfx] [PATCH 4/7] drm/i915/gem: Unpin idle contexts from kswapd reclaim
  2020-07-08 13:47 [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Chris Wilson
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 2/7] drm/i915: Release shortlived maps of longlived objects Chris Wilson
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 3/7] drm/i915: Remove i915_gem_object_get_dirty_page() Chris Wilson
@ 2020-07-08 13:47 ` Chris Wilson
  2020-07-08 17:13   ` Matthew Auld
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 5/7] drm/i915: Preallocate stashes for vma page-directories Chris Wilson
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2020-07-08 13:47 UTC (permalink / raw)
  To: intel-gfx; +Cc: matthew.auld, Chris Wilson

We removed retiring requests from the shrinker in order to decouple the
mutexes from reclaim in preparation for unravelling the struct_mutex.
The impact of not retiring is that we are much less agressive in making
global objects available for shrinking, as such objects remain pinned
until they are flushed by a heartbeat pulse following the last retired
request along their timeline. In order to ensure that pulse occurs in
time for memory reclamation, we should kick it from kswapd.

The catch is that we have added some flush_work() into the retirement
phase (to ensure that we reach a global idle in a timely manner), but
these flush_work() are not eligible (i.e do not belong to WQ_MEM_RELCAIM)
for use from inside kswapd. To avoid flushing those workqueues, we teach
the retirer not to do so unless we are actually waiting, and only do the
plain retire from inside the shrinker.

Note that for execlists, we already retire completed contexts as they
are scheduled out, so it should not be keeping global state
unnecessarily pinned. The legacy ringbuffer however...

References: 9e9539800dd4 ("drm/i915: Remove waiting & retiring from shrinker paths")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 25 +++++++++++++-------
 drivers/gpu/drm/i915/gt/intel_gt_requests.c  |  9 ++++---
 2 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index 1ced1e5d2ec0..dc8f052a0ffe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -13,6 +13,8 @@
 #include <linux/dma-buf.h>
 #include <linux/vmalloc.h>
 
+#include "gt/intel_gt_requests.h"
+
 #include "i915_trace.h"
 
 static bool swap_available(void)
@@ -111,15 +113,6 @@ i915_gem_shrink(struct drm_i915_private *i915,
 	unsigned long count = 0;
 	unsigned long scanned = 0;
 
-	/*
-	 * When shrinking the active list, we should also consider active
-	 * contexts. Active contexts are pinned until they are retired, and
-	 * so can not be simply unbound to retire and unpin their pages. To
-	 * shrink the contexts, we must wait until the gpu is idle and
-	 * completed its switch to the kernel context. In short, we do
-	 * not have a good mechanism for idling a specific context.
-	 */
-
 	trace_i915_gem_shrink(i915, target, shrink);
 
 	/*
@@ -133,6 +126,20 @@ i915_gem_shrink(struct drm_i915_private *i915,
 			shrink &= ~I915_SHRINK_BOUND;
 	}
 
+	/*
+	 * When shrinking the active list, we should also consider active
+	 * contexts. Active contexts are pinned until they are retired, and
+	 * so can not be simply unbound to retire and unpin their pages. To
+	 * shrink the contexts, we must wait until the gpu is idle and
+	 * completed its switch to the kernel context. In short, we do
+	 * not have a good mechanism for idling a specific context, but
+	 * what we can do is give them a kick so that we do not keep idle
+	 * contexts around longer than is necessary.
+	 */
+	if (shrink & I915_SHRINK_ACTIVE)
+		/* Retire requests to unpin all idle contexts */
+		intel_gt_retire_requests(&i915->gt);
+
 	/*
 	 * As we may completely rewrite the (un)bound list whilst unbinding
 	 * (due to retiring requests) we have to strictly process only
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index 16ff47c83bd5..66fcbf9d0fdd 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -31,12 +31,15 @@ static bool engine_active(const struct intel_engine_cs *engine)
 	return !list_empty(&engine->kernel_context->timeline->requests);
 }
 
-static bool flush_submission(struct intel_gt *gt)
+static bool flush_submission(struct intel_gt *gt, long timeout)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 	bool active = false;
 
+	if (!timeout)
+		return false;
+
 	if (!intel_gt_pm_is_awake(gt))
 		return false;
 
@@ -139,7 +142,7 @@ long intel_gt_retire_requests_timeout(struct intel_gt *gt, long timeout)
 	if (unlikely(timeout < 0))
 		timeout = -timeout, interruptible = false;
 
-	flush_submission(gt); /* kick the ksoftirqd tasklets */
+	flush_submission(gt, timeout); /* kick the ksoftirqd tasklets */
 	spin_lock(&timelines->lock);
 	list_for_each_entry_safe(tl, tn, &timelines->active_list, link) {
 		if (!mutex_trylock(&tl->mutex)) {
@@ -194,7 +197,7 @@ out_active:	spin_lock(&timelines->lock);
 	list_for_each_entry_safe(tl, tn, &free, link)
 		__intel_timeline_free(&tl->kref);
 
-	if (flush_submission(gt)) /* Wait, there's more! */
+	if (flush_submission(gt, timeout)) /* Wait, there's more! */
 		active_count++;
 
 	return active_count ? timeout : 0;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Intel-gfx] [PATCH 5/7] drm/i915: Preallocate stashes for vma page-directories
  2020-07-08 13:47 [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Chris Wilson
                   ` (2 preceding siblings ...)
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 4/7] drm/i915/gem: Unpin idle contexts from kswapd reclaim Chris Wilson
@ 2020-07-08 13:47 ` Chris Wilson
  2020-07-10 17:48   ` Matthew Auld
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 6/7] drm/i915: Switch to object allocations for page directories Chris Wilson
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2020-07-08 13:47 UTC (permalink / raw)
  To: intel-gfx; +Cc: matthew.auld, Chris Wilson

We need to make the DMA allocations used for page directories to be
performed up front so that we can include those allocations in our
memory reservation pass. The downside is that we have to assume the
worst case, even before we know the final layout, and always allocate
enough page directories for this object, even when there will be overlap.
This unfortunately can be quite expensive, especially as we have to
clear/reset the page directories and DMA pages, but it should only be
required during early phases of a workload when new objects are being
discovered, or after memory/eviction pressure when we need to rebind.
Once we reach steady state, the objects should not be moved and we no
longer need to preallocating the pages tables.

It should be noted that the lifetime for the page directories DMA is
more or less decoupled from individual fences as they will be shared
across objects across timelines.

v2: Only allocate enough PD space for the PTE we may use, we do not need
to allocate PD that will be left as scratch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_client_blt.c    | 11 +--
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 38 +++------
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 77 +++++-------------
 drivers/gpu/drm/i915/gt/intel_ggtt.c          | 60 +++++++-------
 drivers/gpu/drm/i915/gt/intel_gtt.h           | 39 ++++++---
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         | 80 ++++++++++++++++---
 drivers/gpu/drm/i915/i915_vma.c               | 27 ++++---
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 60 ++++++++------
 drivers/gpu/drm/i915/selftests/mock_gtt.c     | 22 ++---
 9 files changed, 228 insertions(+), 186 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
index 278664f831e7..947c8aa8e13e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
@@ -32,12 +32,13 @@ static void vma_clear_pages(struct i915_vma *vma)
 	vma->pages = NULL;
 }
 
-static int vma_bind(struct i915_address_space *vm,
-		    struct i915_vma *vma,
-		    enum i915_cache_level cache_level,
-		    u32 flags)
+static void vma_bind(struct i915_address_space *vm,
+		     struct i915_vm_pt_stash *stash,
+		     struct i915_vma *vma,
+		     enum i915_cache_level cache_level,
+		     u32 flags)
 {
-	return vm->vma_ops.bind_vma(vm, vma, cache_level, flags);
+	vm->vma_ops.bind_vma(vm, stash, vma, cache_level, flags);
 }
 
 static void vma_unbind(struct i915_address_space *vm, struct i915_vma *vma)
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 05497b50103f..35e2b698f9ed 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -177,16 +177,16 @@ static void gen6_flush_pd(struct gen6_ppgtt *ppgtt, u64 start, u64 end)
 	mutex_unlock(&ppgtt->flush);
 }
 
-static int gen6_alloc_va_range(struct i915_address_space *vm,
-			       u64 start, u64 length)
+static void gen6_alloc_va_range(struct i915_address_space *vm,
+				struct i915_vm_pt_stash *stash,
+				u64 start, u64 length)
 {
 	struct gen6_ppgtt *ppgtt = to_gen6_ppgtt(i915_vm_to_ppgtt(vm));
 	struct i915_page_directory * const pd = ppgtt->base.pd;
-	struct i915_page_table *pt, *alloc = NULL;
+	struct i915_page_table *pt;
 	intel_wakeref_t wakeref;
 	u64 from = start;
 	unsigned int pde;
-	int ret = 0;
 
 	wakeref = intel_runtime_pm_get(&vm->i915->runtime_pm);
 
@@ -197,21 +197,17 @@ static int gen6_alloc_va_range(struct i915_address_space *vm,
 		if (px_base(pt) == px_base(&vm->scratch[1])) {
 			spin_unlock(&pd->lock);
 
-			pt = fetch_and_zero(&alloc);
-			if (!pt)
-				pt = alloc_pt(vm);
-			if (IS_ERR(pt)) {
-				ret = PTR_ERR(pt);
-				goto unwind_out;
-			}
+			pt = stash->pt[0];
+			GEM_BUG_ON(!pt);
 
 			fill32_px(pt, vm->scratch[0].encode);
 
 			spin_lock(&pd->lock);
 			if (pd->entry[pde] == &vm->scratch[1]) {
+				stash->pt[0] = pt->stash;
+				atomic_set(&pt->used, 0);
 				pd->entry[pde] = pt;
 			} else {
-				alloc = pt;
 				pt = pd->entry[pde];
 			}
 		}
@@ -223,15 +219,7 @@ static int gen6_alloc_va_range(struct i915_address_space *vm,
 	if (i915_vma_is_bound(ppgtt->vma, I915_VMA_GLOBAL_BIND))
 		gen6_flush_pd(ppgtt, from, start);
 
-	goto out;
-
-unwind_out:
-	gen6_ppgtt_clear_range(vm, from, start - from);
-out:
-	if (alloc)
-		free_px(vm, alloc);
 	intel_runtime_pm_put(&vm->i915->runtime_pm, wakeref);
-	return ret;
 }
 
 static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
@@ -299,10 +287,11 @@ static void pd_vma_clear_pages(struct i915_vma *vma)
 	vma->pages = NULL;
 }
 
-static int pd_vma_bind(struct i915_address_space *vm,
-		       struct i915_vma *vma,
-		       enum i915_cache_level cache_level,
-		       u32 unused)
+static void pd_vma_bind(struct i915_address_space *vm,
+			struct i915_vm_pt_stash *stash,
+			struct i915_vma *vma,
+			enum i915_cache_level cache_level,
+			u32 unused)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	struct gen6_ppgtt *ppgtt = vma->private;
@@ -312,7 +301,6 @@ static int pd_vma_bind(struct i915_address_space *vm,
 	ppgtt->pd_addr = (gen6_pte_t __iomem *)ggtt->gsm + ggtt_offset;
 
 	gen6_flush_pd(ppgtt, 0, ppgtt->base.vm.total);
-	return 0;
 }
 
 static void pd_vma_unbind(struct i915_address_space *vm, struct i915_vma *vma)
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 699125928272..e6f2acd445dd 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -269,14 +269,12 @@ static void gen8_ppgtt_clear(struct i915_address_space *vm,
 			   start, start + length, vm->top);
 }
 
-static int __gen8_ppgtt_alloc(struct i915_address_space * const vm,
-			      struct i915_page_directory * const pd,
-			      u64 * const start, const u64 end, int lvl)
+static void __gen8_ppgtt_alloc(struct i915_address_space * const vm,
+			       struct i915_vm_pt_stash *stash,
+			       struct i915_page_directory * const pd,
+			       u64 * const start, const u64 end, int lvl)
 {
-	const struct i915_page_scratch * const scratch = &vm->scratch[lvl];
-	struct i915_page_table *alloc = NULL;
 	unsigned int idx, len;
-	int ret = 0;
 
 	GEM_BUG_ON(end > vm->total >> GEN8_PTE_SHIFT);
 
@@ -297,49 +295,30 @@ static int __gen8_ppgtt_alloc(struct i915_address_space * const vm,
 			DBG("%s(%p):{ lvl:%d, idx:%d } allocating new tree\n",
 			    __func__, vm, lvl + 1, idx);
 
-			pt = fetch_and_zero(&alloc);
-			if (lvl) {
-				if (!pt) {
-					pt = &alloc_pd(vm)->pt;
-					if (IS_ERR(pt)) {
-						ret = PTR_ERR(pt);
-						goto out;
-					}
-				}
+			pt = stash->pt[!!lvl];
+			GEM_BUG_ON(!pt);
 
+			if (lvl ||
+			    gen8_pt_count(*start, end) < I915_PDES ||
+			    intel_vgpu_active(vm->i915))
 				fill_px(pt, vm->scratch[lvl].encode);
-			} else {
-				if (!pt) {
-					pt = alloc_pt(vm);
-					if (IS_ERR(pt)) {
-						ret = PTR_ERR(pt);
-						goto out;
-					}
-				}
-
-				if (intel_vgpu_active(vm->i915) ||
-				    gen8_pt_count(*start, end) < I915_PDES)
-					fill_px(pt, vm->scratch[lvl].encode);
-			}
 
 			spin_lock(&pd->lock);
-			if (likely(!pd->entry[idx]))
+			if (likely(!pd->entry[idx])) {
+				stash->pt[!!lvl] = pt->stash;
+				atomic_set(&pt->used, 0);
 				set_pd_entry(pd, idx, pt);
-			else
-				alloc = pt, pt = pd->entry[idx];
+			} else {
+				pt = pd->entry[idx];
+			}
 		}
 
 		if (lvl) {
 			atomic_inc(&pt->used);
 			spin_unlock(&pd->lock);
 
-			ret = __gen8_ppgtt_alloc(vm, as_pd(pt),
-						 start, end, lvl);
-			if (unlikely(ret)) {
-				if (release_pd_entry(pd, idx, pt, scratch))
-					free_px(vm, pt);
-				goto out;
-			}
+			__gen8_ppgtt_alloc(vm, stash,
+					   as_pd(pt), start, end, lvl);
 
 			spin_lock(&pd->lock);
 			atomic_dec(&pt->used);
@@ -359,18 +338,12 @@ static int __gen8_ppgtt_alloc(struct i915_address_space * const vm,
 		}
 	} while (idx++, --len);
 	spin_unlock(&pd->lock);
-out:
-	if (alloc)
-		free_px(vm, alloc);
-	return ret;
 }
 
-static int gen8_ppgtt_alloc(struct i915_address_space *vm,
-			    u64 start, u64 length)
+static void gen8_ppgtt_alloc(struct i915_address_space *vm,
+			     struct i915_vm_pt_stash *stash,
+			     u64 start, u64 length)
 {
-	u64 from;
-	int err;
-
 	GEM_BUG_ON(!IS_ALIGNED(start, BIT_ULL(GEN8_PTE_SHIFT)));
 	GEM_BUG_ON(!IS_ALIGNED(length, BIT_ULL(GEN8_PTE_SHIFT)));
 	GEM_BUG_ON(range_overflows(start, length, vm->total));
@@ -378,15 +351,9 @@ static int gen8_ppgtt_alloc(struct i915_address_space *vm,
 	start >>= GEN8_PTE_SHIFT;
 	length >>= GEN8_PTE_SHIFT;
 	GEM_BUG_ON(length == 0);
-	from = start;
-
-	err = __gen8_ppgtt_alloc(vm, i915_vm_to_ppgtt(vm)->pd,
-				 &start, start + length, vm->top);
-	if (unlikely(err && from != start))
-		__gen8_ppgtt_clear(vm, i915_vm_to_ppgtt(vm)->pd,
-				   from, start, vm->top);
 
-	return err;
+	__gen8_ppgtt_alloc(vm, stash, i915_vm_to_ppgtt(vm)->pd,
+			   &start, start + length, vm->top);
 }
 
 static __always_inline void
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 62979ea591f0..5a33056ab976 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -436,16 +436,17 @@ static void i915_ggtt_clear_range(struct i915_address_space *vm,
 	intel_gtt_clear_range(start >> PAGE_SHIFT, length >> PAGE_SHIFT);
 }
 
-static int ggtt_bind_vma(struct i915_address_space *vm,
-			 struct i915_vma *vma,
-			 enum i915_cache_level cache_level,
-			 u32 flags)
+static void ggtt_bind_vma(struct i915_address_space *vm,
+			  struct i915_vm_pt_stash *stash,
+			  struct i915_vma *vma,
+			  enum i915_cache_level cache_level,
+			  u32 flags)
 {
 	struct drm_i915_gem_object *obj = vma->obj;
 	u32 pte_flags;
 
 	if (i915_vma_is_bound(vma, ~flags & I915_VMA_BIND_MASK))
-		return 0;
+		return;
 
 	/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
 	pte_flags = 0;
@@ -454,8 +455,6 @@ static int ggtt_bind_vma(struct i915_address_space *vm,
 
 	vm->insert_entries(vm, vma, cache_level, pte_flags);
 	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
-
-	return 0;
 }
 
 static void ggtt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
@@ -568,31 +567,25 @@ static int init_ggtt(struct i915_ggtt *ggtt)
 	return ret;
 }
 
-static int aliasing_gtt_bind_vma(struct i915_address_space *vm,
-				 struct i915_vma *vma,
-				 enum i915_cache_level cache_level,
-				 u32 flags)
+static void aliasing_gtt_bind_vma(struct i915_address_space *vm,
+				  struct i915_vm_pt_stash *stash,
+				  struct i915_vma *vma,
+				  enum i915_cache_level cache_level,
+				  u32 flags)
 {
 	u32 pte_flags;
-	int ret;
 
 	/* Currently applicable only to VLV */
 	pte_flags = 0;
 	if (i915_gem_object_is_readonly(vma->obj))
 		pte_flags |= PTE_READ_ONLY;
 
-	if (flags & I915_VMA_LOCAL_BIND) {
-		struct i915_ppgtt *alias = i915_vm_to_ggtt(vm)->alias;
-
-		ret = ppgtt_bind_vma(&alias->vm, vma, cache_level, flags);
-		if (ret)
-			return ret;
-	}
+	if (flags & I915_VMA_LOCAL_BIND)
+		ppgtt_bind_vma(&i915_vm_to_ggtt(vm)->alias->vm,
+			       stash, vma, cache_level, flags);
 
 	if (flags & I915_VMA_GLOBAL_BIND)
 		vm->insert_entries(vm, vma, cache_level, pte_flags);
-
-	return 0;
 }
 
 static void aliasing_gtt_unbind_vma(struct i915_address_space *vm,
@@ -607,6 +600,7 @@ static void aliasing_gtt_unbind_vma(struct i915_address_space *vm,
 
 static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
 {
+	struct i915_vm_pt_stash stash = {};
 	struct i915_ppgtt *ppgtt;
 	int err;
 
@@ -619,15 +613,17 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
 		goto err_ppgtt;
 	}
 
+	err = i915_vm_alloc_pt_stash(&ppgtt->vm, &stash, ggtt->vm.total);
+	if (err)
+		goto err_ppgtt;
+
 	/*
 	 * Note we only pre-allocate as far as the end of the global
 	 * GTT. On 48b / 4-level page-tables, the difference is very,
 	 * very significant! We have to preallocate as GVT/vgpu does
 	 * not like the page directory disappearing.
 	 */
-	err = ppgtt->vm.allocate_va_range(&ppgtt->vm, 0, ggtt->vm.total);
-	if (err)
-		goto err_ppgtt;
+	ppgtt->vm.allocate_va_range(&ppgtt->vm, &stash, 0, ggtt->vm.total);
 
 	ggtt->alias = ppgtt;
 	ggtt->vm.bind_async_flags |= ppgtt->vm.bind_async_flags;
@@ -638,6 +634,7 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
 	GEM_BUG_ON(ggtt->vm.vma_ops.unbind_vma != ggtt_unbind_vma);
 	ggtt->vm.vma_ops.unbind_vma = aliasing_gtt_unbind_vma;
 
+	i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 	return 0;
 
 err_ppgtt:
@@ -1165,11 +1162,6 @@ void i915_ggtt_disable_guc(struct i915_ggtt *ggtt)
 	ggtt->invalidate(ggtt);
 }
 
-static unsigned int clear_bind(struct i915_vma *vma)
-{
-	return atomic_fetch_and(~I915_VMA_BIND_MASK, &vma->flags);
-}
-
 void i915_ggtt_resume(struct i915_ggtt *ggtt)
 {
 	struct i915_vma *vma;
@@ -1187,11 +1179,13 @@ void i915_ggtt_resume(struct i915_ggtt *ggtt)
 	/* clflush objects bound into the GGTT and rebind them. */
 	list_for_each_entry(vma, &ggtt->vm.bound_list, vm_link) {
 		struct drm_i915_gem_object *obj = vma->obj;
-		unsigned int was_bound = clear_bind(vma);
+		unsigned int was_bound =
+			atomic_read(&vma->flags) & I915_VMA_BIND_MASK;
 
-		WARN_ON(i915_vma_bind(vma,
-				      obj ? obj->cache_level : 0,
-				      was_bound, NULL));
+		GEM_BUG_ON(!was_bound);
+		vma->ops->bind_vma(&ggtt->vm, NULL, vma,
+				   obj ? obj->cache_level : 0,
+				   was_bound);
 		if (obj) { /* only used during resume => exclusive access */
 			flush |= fetch_and_zero(&obj->write_domain);
 			obj->read_domains |= I915_GEM_DOMAIN_GTT;
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index f2b75078e05f..8bd462d2fcd9 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -159,7 +159,10 @@ struct i915_page_scratch {
 
 struct i915_page_table {
 	struct i915_page_dma base;
-	atomic_t used;
+	union {
+		atomic_t used;
+		struct i915_page_table *stash;
+	};
 };
 
 struct i915_page_directory {
@@ -196,12 +199,18 @@ struct drm_i915_gem_object;
 struct i915_vma;
 struct intel_gt;
 
+struct i915_vm_pt_stash {
+	/* preallocated chains of page tables/directories */
+	struct i915_page_table *pt[2];
+};
+
 struct i915_vma_ops {
 	/* Map an object into an address space with the given cache flags. */
-	int (*bind_vma)(struct i915_address_space *vm,
-			struct i915_vma *vma,
-			enum i915_cache_level cache_level,
-			u32 flags);
+	void (*bind_vma)(struct i915_address_space *vm,
+			 struct i915_vm_pt_stash *stash,
+			 struct i915_vma *vma,
+			 enum i915_cache_level cache_level,
+			 u32 flags);
 	/*
 	 * Unmap an object from an address space. This usually consists of
 	 * setting the valid PTE entries to a reserved scratch page.
@@ -281,8 +290,9 @@ struct i915_address_space {
 			  u32 flags); /* Create a valid PTE */
 #define PTE_READ_ONLY	BIT(0)
 
-	int (*allocate_va_range)(struct i915_address_space *vm,
-				 u64 start, u64 length);
+	void (*allocate_va_range)(struct i915_address_space *vm,
+				  struct i915_vm_pt_stash *stash,
+				  u64 start, u64 length);
 	void (*clear_range)(struct i915_address_space *vm,
 			    u64 start, u64 length);
 	void (*insert_page)(struct i915_address_space *vm,
@@ -568,10 +578,11 @@ int ggtt_set_pages(struct i915_vma *vma);
 int ppgtt_set_pages(struct i915_vma *vma);
 void clear_pages(struct i915_vma *vma);
 
-int ppgtt_bind_vma(struct i915_address_space *vm,
-		   struct i915_vma *vma,
-		   enum i915_cache_level cache_level,
-		   u32 flags);
+void ppgtt_bind_vma(struct i915_address_space *vm,
+		    struct i915_vm_pt_stash *stash,
+		    struct i915_vma *vma,
+		    enum i915_cache_level cache_level,
+		    u32 flags);
 void ppgtt_unbind_vma(struct i915_address_space *vm,
 		      struct i915_vma *vma);
 
@@ -579,6 +590,12 @@ void gtt_write_workarounds(struct intel_gt *gt);
 
 void setup_private_pat(struct intel_uncore *uncore);
 
+int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
+			   struct i915_vm_pt_stash *stash,
+			   u64 size);
+void i915_vm_free_pt_stash(struct i915_address_space *vm,
+			   struct i915_vm_pt_stash *stash);
+
 static inline struct sgt_dma {
 	struct scatterlist *sg;
 	dma_addr_t dma, max;
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index f0862e924d11..9633fd2d294d 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -155,19 +155,16 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt)
 	return ppgtt;
 }
 
-int ppgtt_bind_vma(struct i915_address_space *vm,
-		   struct i915_vma *vma,
-		   enum i915_cache_level cache_level,
-		   u32 flags)
+void ppgtt_bind_vma(struct i915_address_space *vm,
+		    struct i915_vm_pt_stash *stash,
+		    struct i915_vma *vma,
+		    enum i915_cache_level cache_level,
+		    u32 flags)
 {
 	u32 pte_flags;
-	int err;
 
 	if (!test_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma))) {
-		err = vm->allocate_va_range(vm, vma->node.start, vma->size);
-		if (err)
-			return err;
-
+		vm->allocate_va_range(vm, stash, vma->node.start, vma->size);
 		set_bit(I915_VMA_ALLOC_BIT, __i915_vma_flags(vma));
 	}
 
@@ -178,8 +175,6 @@ int ppgtt_bind_vma(struct i915_address_space *vm,
 
 	vm->insert_entries(vm, vma, cache_level, pte_flags);
 	wmb();
-
-	return 0;
 }
 
 void ppgtt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
@@ -188,12 +183,73 @@ void ppgtt_unbind_vma(struct i915_address_space *vm, struct i915_vma *vma)
 		vm->clear_range(vm, vma->node.start, vma->size);
 }
 
+static unsigned long pd_count(u64 size, int shift)
+{
+	/* Beware later misalignment */
+	return (size + 2 * (BIT_ULL(shift) - 1)) >> shift;
+}
+
+int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
+			   struct i915_vm_pt_stash *stash,
+			   u64 size)
+{
+	unsigned long count;
+	int shift = 21;
+	int n;
+
+	count = pd_count(size, shift);
+	while (count--) {
+		struct i915_page_table *pt;
+
+		pt = alloc_pt(vm);
+		if (IS_ERR(pt)) {
+			i915_vm_free_pt_stash(vm, stash);
+			return PTR_ERR(pt);
+		}
+
+		pt->stash = stash->pt[0];
+		stash->pt[0] = pt;
+	}
+
+	for (n = 1; n < vm->top; n++) {
+		shift += 9;
+		count = pd_count(size, shift);
+		while (count--) {
+			struct i915_page_directory *pd;
+
+			pd = alloc_pd(vm);
+			if (IS_ERR(pd)) {
+				i915_vm_free_pt_stash(vm, stash);
+				return PTR_ERR(pd);
+			}
+
+			pd->pt.stash = stash->pt[1];
+			stash->pt[1] = &pd->pt;
+		}
+	}
+
+	return 0;
+}
+
+void i915_vm_free_pt_stash(struct i915_address_space *vm,
+			   struct i915_vm_pt_stash *stash)
+{
+	struct i915_page_table *pt;
+	int n;
+
+	for (n = 0; n < ARRAY_SIZE(stash->pt); n++) {
+		while ((pt = stash->pt[n])) {
+			stash->pt[n] = pt->stash;
+			free_px(vm, pt);
+		}
+	}
+}
+
 int ppgtt_set_pages(struct i915_vma *vma)
 {
 	GEM_BUG_ON(vma->pages);
 
 	vma->pages = vma->obj->mm.pages;
-
 	vma->page_sizes = vma->obj->mm.page_sizes;
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index bc64f773dcdb..70a6f55730f3 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -291,6 +291,8 @@ i915_vma_instance(struct drm_i915_gem_object *obj,
 
 struct i915_vma_work {
 	struct dma_fence_work base;
+	struct i915_address_space *vm;
+	struct i915_vm_pt_stash stash;
 	struct i915_vma *vma;
 	struct drm_i915_gem_object *pinned;
 	struct i915_sw_dma_fence_cb cb;
@@ -302,13 +304,10 @@ static int __vma_bind(struct dma_fence_work *work)
 {
 	struct i915_vma_work *vw = container_of(work, typeof(*vw), base);
 	struct i915_vma *vma = vw->vma;
-	int err;
-
-	err = vma->ops->bind_vma(vma->vm, vma, vw->cache_level, vw->flags);
-	if (err)
-		atomic_or(I915_VMA_ERROR, &vma->flags);
 
-	return err;
+	vma->ops->bind_vma(vw->vm, &vw->stash,
+			   vma, vw->cache_level, vw->flags);
+	return 0;
 }
 
 static void __vma_release(struct dma_fence_work *work)
@@ -317,6 +316,9 @@ static void __vma_release(struct dma_fence_work *work)
 
 	if (vw->pinned)
 		__i915_gem_object_unpin_pages(vw->pinned);
+
+	i915_vm_free_pt_stash(vw->vm, &vw->stash);
+	i915_vm_put(vw->vm);
 }
 
 static const struct dma_fence_work_ops bind_ops = {
@@ -376,7 +378,6 @@ int i915_vma_bind(struct i915_vma *vma,
 {
 	u32 bind_flags;
 	u32 vma_flags;
-	int ret;
 
 	GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
 	GEM_BUG_ON(vma->size > vma->node.size);
@@ -433,9 +434,7 @@ int i915_vma_bind(struct i915_vma *vma,
 			work->pinned = vma->obj;
 		}
 	} else {
-		ret = vma->ops->bind_vma(vma->vm, vma, cache_level, bind_flags);
-		if (ret)
-			return ret;
+		vma->ops->bind_vma(vma->vm, NULL, vma, cache_level, bind_flags);
 	}
 
 	atomic_or(bind_flags, &vma->flags);
@@ -879,6 +878,14 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 			err = -ENOMEM;
 			goto err_pages;
 		}
+
+		work->vm = i915_vm_get(vma->vm);
+
+		/* Allocate enough page directories to used PTE */
+		if (vma->vm->allocate_va_range)
+			i915_vm_alloc_pt_stash(vma->vm,
+					       &work->stash,
+					       vma->size);
 	}
 
 	if (flags & PIN_GLOBAL)
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 0016ffc7d914..9b8fc990e9ef 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -172,35 +172,33 @@ static int igt_ppgtt_alloc(void *arg)
 
 	/* Check we can allocate the entire range */
 	for (size = 4096; size <= limit; size <<= 2) {
-		err = ppgtt->vm.allocate_va_range(&ppgtt->vm, 0, size);
-		if (err) {
-			if (err == -ENOMEM) {
-				pr_info("[1] Ran out of memory for va_range [0 + %llx] [bit %d]\n",
-					size, ilog2(size));
-				err = 0; /* virtual space too large! */
-			}
+		struct i915_vm_pt_stash stash = {};
+
+		err = i915_vm_alloc_pt_stash(&ppgtt->vm, &stash, size);
+		if (err)
 			goto err_ppgtt_cleanup;
-		}
 
+		ppgtt->vm.allocate_va_range(&ppgtt->vm, &stash, 0, size);
 		cond_resched();
 
 		ppgtt->vm.clear_range(&ppgtt->vm, 0, size);
+
+		i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 	}
 
 	/* Check we can incrementally allocate the entire range */
 	for (last = 0, size = 4096; size <= limit; last = size, size <<= 2) {
-		err = ppgtt->vm.allocate_va_range(&ppgtt->vm,
-						  last, size - last);
-		if (err) {
-			if (err == -ENOMEM) {
-				pr_info("[2] Ran out of memory for va_range [%llx + %llx] [bit %d]\n",
-					last, size - last, ilog2(size));
-				err = 0; /* virtual space too large! */
-			}
+		struct i915_vm_pt_stash stash = {};
+
+		err = i915_vm_alloc_pt_stash(&ppgtt->vm, &stash, size - last);
+		if (err)
 			goto err_ppgtt_cleanup;
-		}
 
+		ppgtt->vm.allocate_va_range(&ppgtt->vm, &stash,
+					    last, size - last);
 		cond_resched();
+
+		i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 	}
 
 err_ppgtt_cleanup:
@@ -284,9 +282,18 @@ static int lowlevel_hole(struct i915_address_space *vm,
 				break;
 			}
 
-			if (vm->allocate_va_range &&
-			    vm->allocate_va_range(vm, addr, BIT_ULL(size)))
-				break;
+			if (vm->allocate_va_range) {
+				struct i915_vm_pt_stash stash = {};
+
+				if (i915_vm_alloc_pt_stash(vm, &stash,
+							   BIT_ULL(size)))
+					break;
+
+				vm->allocate_va_range(vm, &stash,
+						      addr, BIT_ULL(size));
+
+				i915_vm_free_pt_stash(vm, &stash);
+			}
 
 			mock_vma->pages = obj->mm.pages;
 			mock_vma->node.size = BIT_ULL(size);
@@ -1881,6 +1888,7 @@ static int igt_cs_tlb(void *arg)
 			continue;
 
 		while (!__igt_timeout(end_time, NULL)) {
+			struct i915_vm_pt_stash stash = {};
 			struct i915_request *rq;
 			u64 offset;
 
@@ -1888,10 +1896,6 @@ static int igt_cs_tlb(void *arg)
 						   0, vm->total - PAGE_SIZE,
 						   chunk_size, PAGE_SIZE);
 
-			err = vm->allocate_va_range(vm, offset, chunk_size);
-			if (err)
-				goto end;
-
 			memset32(result, STACK_MAGIC, PAGE_SIZE / sizeof(u32));
 
 			vma = i915_vma_instance(bbe, vm, NULL);
@@ -1904,6 +1908,14 @@ static int igt_cs_tlb(void *arg)
 			if (err)
 				goto end;
 
+			err = i915_vm_alloc_pt_stash(vm, &stash, chunk_size);
+			if (err)
+				goto end;
+
+			vm->allocate_va_range(vm, &stash, offset, chunk_size);
+
+			i915_vm_free_pt_stash(vm, &stash);
+
 			/* Prime the TLB with the dummy pages */
 			for (i = 0; i < count; i++) {
 				vma->node.start = offset + i * PAGE_SIZE;
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index b173086411ef..5e4fb0fba34b 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -38,14 +38,14 @@ static void mock_insert_entries(struct i915_address_space *vm,
 {
 }
 
-static int mock_bind_ppgtt(struct i915_address_space *vm,
-			   struct i915_vma *vma,
-			   enum i915_cache_level cache_level,
-			   u32 flags)
+static void mock_bind_ppgtt(struct i915_address_space *vm,
+			    struct i915_vm_pt_stash *stash,
+			    struct i915_vma *vma,
+			    enum i915_cache_level cache_level,
+			    u32 flags)
 {
 	GEM_BUG_ON(flags & I915_VMA_GLOBAL_BIND);
 	set_bit(I915_VMA_LOCAL_BIND_BIT, __i915_vma_flags(vma));
-	return 0;
 }
 
 static void mock_unbind_ppgtt(struct i915_address_space *vm,
@@ -74,6 +74,7 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
 	ppgtt->vm.i915 = i915;
 	ppgtt->vm.total = round_down(U64_MAX, PAGE_SIZE);
 	ppgtt->vm.file = ERR_PTR(-ENODEV);
+	ppgtt->vm.dma = &i915->drm.pdev->dev;
 
 	i915_address_space_init(&ppgtt->vm, VM_CLASS_PPGTT);
 
@@ -90,13 +91,12 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
 	return ppgtt;
 }
 
-static int mock_bind_ggtt(struct i915_address_space *vm,
-			  struct i915_vma *vma,
-			  enum i915_cache_level cache_level,
-			  u32 flags)
+static void mock_bind_ggtt(struct i915_address_space *vm,
+			   struct i915_vm_pt_stash *stash,
+			   struct i915_vma *vma,
+			   enum i915_cache_level cache_level,
+			   u32 flags)
 {
-	atomic_or(I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND, &vma->flags);
-	return 0;
 }
 
 static void mock_unbind_ggtt(struct i915_address_space *vm,
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Intel-gfx] [PATCH 6/7] drm/i915: Switch to object allocations for page directories
  2020-07-08 13:47 [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Chris Wilson
                   ` (3 preceding siblings ...)
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 5/7] drm/i915: Preallocate stashes for vma page-directories Chris Wilson
@ 2020-07-08 13:47 ` Chris Wilson
  2020-07-08 18:32   ` Matthew Auld
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 7/7] drm/i915/gt: Optimise aliasing-ppgtt allocations Chris Wilson
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2020-07-08 13:47 UTC (permalink / raw)
  To: intel-gfx; +Cc: matthew.auld, Chris Wilson

The GEM object is grossly overweight for the practicality of tracking
large numbers of individual pages, yet it is currently our only
abstraction for tracking DMA allocations. Since those allocations need
to be reserved upfront before an operation, and that we need to break
away from simple system memory, we need to ditch using plain struct page
wrappers.

In the process, we drop the WC mapping as we ended up clflushing
everything anyway due to various issues across a wider range of
platforms. Though in a future step, we need to drop the kmap_atomic
approach which suggests we need to pre-map all the pages and keep them
mapped.

v2: Verify our large scratch page is suitably DMA aligned; and manually
clear the scratch since we are allocating random struct pages.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   1 +
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |   2 +-
 .../drm/i915/gem/selftests/i915_gem_context.c |   2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          |  53 +--
 drivers/gpu/drm/i915/gt/gen6_ppgtt.h          |   1 +
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          |  89 ++---
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |  37 ++-
 drivers/gpu/drm/i915/gt/intel_gtt.c           | 303 ++++--------------
 drivers/gpu/drm/i915/gt/intel_gtt.h           |  94 ++----
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  42 ++-
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  16 +-
 drivers/gpu/drm/i915/gvt/scheduler.c          |  17 +-
 drivers/gpu/drm/i915/i915_drv.c               |   1 +
 drivers/gpu/drm/i915/i915_drv.h               |   5 -
 drivers/gpu/drm/i915/i915_vma.c               |  18 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  23 ++
 drivers/gpu/drm/i915/selftests/mock_gtt.c     |   4 +
 17 files changed, 287 insertions(+), 421 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 5335f799b548..d0847d7896f9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -282,6 +282,7 @@ struct drm_i915_gem_object {
 		} userptr;
 
 		unsigned long scratch;
+		u64 encode;
 
 		void *gvt_info;
 	};
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 8291ede6902c..9fb06fcc8f8f 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -393,7 +393,7 @@ static int igt_mock_exhaust_device_supported_pages(void *arg)
 	 */
 
 	for (i = 1; i < BIT(ARRAY_SIZE(page_sizes)); i++) {
-		unsigned int combination = 0;
+		unsigned int combination = SZ_4K;
 
 		for (j = 0; j < ARRAY_SIZE(page_sizes); j++) {
 			if (i & BIT(j))
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index b81978890641..1308198543d8 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1745,7 +1745,7 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
 	if (!vm)
 		return -ENODEV;
 
-	page = vm->scratch[0].base.page;
+	page = __px_page(vm->scratch[0]);
 	if (!page) {
 		pr_err("No scratch page!\n");
 		return -EINVAL;
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 35e2b698f9ed..1c1e807f674d 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -16,8 +16,10 @@ static inline void gen6_write_pde(const struct gen6_ppgtt *ppgtt,
 				  const unsigned int pde,
 				  const struct i915_page_table *pt)
 {
+	dma_addr_t addr = pt ? px_dma(pt) : px_dma(ppgtt->base.vm.scratch[1]);
+
 	/* Caller needs to make sure the write completes if necessary */
-	iowrite32(GEN6_PDE_ADDR_ENCODE(px_dma(pt)) | GEN6_PDE_VALID,
+	iowrite32(GEN6_PDE_ADDR_ENCODE(addr) | GEN6_PDE_VALID,
 		  ppgtt->pd_addr + pde);
 }
 
@@ -79,7 +81,7 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 {
 	struct gen6_ppgtt * const ppgtt = to_gen6_ppgtt(i915_vm_to_ppgtt(vm));
 	const unsigned int first_entry = start / I915_GTT_PAGE_SIZE;
-	const gen6_pte_t scratch_pte = vm->scratch[0].encode;
+	const gen6_pte_t scratch_pte = vm->scratch[0]->encode;
 	unsigned int pde = first_entry / GEN6_PTES;
 	unsigned int pte = first_entry % GEN6_PTES;
 	unsigned int num_entries = length / I915_GTT_PAGE_SIZE;
@@ -90,8 +92,6 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 		const unsigned int count = min(num_entries, GEN6_PTES - pte);
 		gen6_pte_t *vaddr;
 
-		GEM_BUG_ON(px_base(pt) == px_base(&vm->scratch[1]));
-
 		num_entries -= count;
 
 		GEM_BUG_ON(count > atomic_read(&pt->used));
@@ -127,7 +127,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 	struct sgt_dma iter = sgt_dma(vma);
 	gen6_pte_t *vaddr;
 
-	GEM_BUG_ON(pd->entry[act_pt] == &vm->scratch[1]);
+	GEM_BUG_ON(!pd->entry[act_pt]);
 
 	vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));
 	do {
@@ -194,16 +194,17 @@ static void gen6_alloc_va_range(struct i915_address_space *vm,
 	gen6_for_each_pde(pt, pd, start, length, pde) {
 		const unsigned int count = gen6_pte_count(start, length);
 
-		if (px_base(pt) == px_base(&vm->scratch[1])) {
+		if (!pt) {
 			spin_unlock(&pd->lock);
 
 			pt = stash->pt[0];
-			GEM_BUG_ON(!pt);
+			__i915_gem_object_pin_pages(pt->base);
+			i915_gem_object_make_unshrinkable(pt->base);
 
-			fill32_px(pt, vm->scratch[0].encode);
+			fill32_px(pt, vm->scratch[0]->encode);
 
 			spin_lock(&pd->lock);
-			if (pd->entry[pde] == &vm->scratch[1]) {
+			if (!pd->entry[pde]) {
 				stash->pt[0] = pt->stash;
 				atomic_set(&pt->used, 0);
 				pd->entry[pde] = pt;
@@ -225,24 +226,27 @@ static void gen6_alloc_va_range(struct i915_address_space *vm,
 static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
 {
 	struct i915_address_space * const vm = &ppgtt->base.vm;
-	struct i915_page_directory * const pd = ppgtt->base.pd;
 	int ret;
 
-	ret = setup_scratch_page(vm, __GFP_HIGHMEM);
+	ret = setup_scratch_page(vm);
 	if (ret)
 		return ret;
 
-	vm->scratch[0].encode =
-		vm->pte_encode(px_dma(&vm->scratch[0]),
+	vm->scratch[0]->encode =
+		vm->pte_encode(px_dma(vm->scratch[0]),
 			       I915_CACHE_NONE, PTE_READ_ONLY);
 
-	if (unlikely(setup_page_dma(vm, px_base(&vm->scratch[1])))) {
-		cleanup_scratch_page(vm);
-		return -ENOMEM;
+	vm->scratch[1] = vm->alloc_pt_dma(vm, I915_GTT_PAGE_SIZE_4K);
+	if (IS_ERR(vm->scratch[1]))
+		return PTR_ERR(vm->scratch[1]);
+
+	ret = pin_pt_dma(vm, vm->scratch[1]);
+	if (ret) {
+		i915_gem_object_put(vm->scratch[1]);
+		return ret;
 	}
 
-	fill32_px(&vm->scratch[1], vm->scratch[0].encode);
-	memset_p(pd->entry, &vm->scratch[1], I915_PDES);
+	fill32_px(vm->scratch[1], vm->scratch[0]->encode);
 
 	return 0;
 }
@@ -250,13 +254,11 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
 static void gen6_ppgtt_free_pd(struct gen6_ppgtt *ppgtt)
 {
 	struct i915_page_directory * const pd = ppgtt->base.pd;
-	struct i915_page_dma * const scratch =
-		px_base(&ppgtt->base.vm.scratch[1]);
 	struct i915_page_table *pt;
 	u32 pde;
 
 	gen6_for_all_pdes(pt, pd, pde)
-		if (px_base(pt) != scratch)
+		if (pt)
 			free_px(&ppgtt->base.vm, pt);
 }
 
@@ -297,7 +299,7 @@ static void pd_vma_bind(struct i915_address_space *vm,
 	struct gen6_ppgtt *ppgtt = vma->private;
 	u32 ggtt_offset = i915_ggtt_offset(vma) / I915_GTT_PAGE_SIZE;
 
-	px_base(ppgtt->base.pd)->ggtt_offset = ggtt_offset * sizeof(gen6_pte_t);
+	ppgtt->pp_dir = ggtt_offset * sizeof(gen6_pte_t) << 10;
 	ppgtt->pd_addr = (gen6_pte_t __iomem *)ggtt->gsm + ggtt_offset;
 
 	gen6_flush_pd(ppgtt, 0, ppgtt->base.vm.total);
@@ -307,8 +309,6 @@ static void pd_vma_unbind(struct i915_address_space *vm, struct i915_vma *vma)
 {
 	struct gen6_ppgtt *ppgtt = vma->private;
 	struct i915_page_directory * const pd = ppgtt->base.pd;
-	struct i915_page_dma * const scratch =
-		px_base(&ppgtt->base.vm.scratch[1]);
 	struct i915_page_table *pt;
 	unsigned int pde;
 
@@ -317,11 +317,11 @@ static void pd_vma_unbind(struct i915_address_space *vm, struct i915_vma *vma)
 
 	/* Free all no longer used page tables */
 	gen6_for_all_pdes(pt, ppgtt->base.pd, pde) {
-		if (px_base(pt) == scratch || atomic_read(&pt->used))
+		if (!pt || atomic_read(&pt->used))
 			continue;
 
 		free_px(&ppgtt->base.vm, pt);
-		pd->entry[pde] = scratch;
+		pd->entry[pde] = NULL;
 	}
 
 	ppgtt->scan_for_unused_pt = false;
@@ -441,6 +441,7 @@ struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt)
 	ppgtt->base.vm.insert_entries = gen6_ppgtt_insert_entries;
 	ppgtt->base.vm.cleanup = gen6_ppgtt_cleanup;
 
+	ppgtt->base.vm.alloc_pt_dma = alloc_pt_dma;
 	ppgtt->base.vm.pte_encode = ggtt->vm.pte_encode;
 
 	ppgtt->base.pd = __alloc_pd(sizeof(*ppgtt->base.pd));
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.h b/drivers/gpu/drm/i915/gt/gen6_ppgtt.h
index 72e481806c96..7249672e5802 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.h
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.h
@@ -14,6 +14,7 @@ struct gen6_ppgtt {
 	struct mutex flush;
 	struct i915_vma *vma;
 	gen6_pte_t __iomem *pd_addr;
+	u32 pp_dir;
 
 	atomic_t pin_count;
 	struct mutex pin_mutex;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index e6f2acd445dd..f987781382d5 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -199,7 +199,7 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
 			      struct i915_page_directory * const pd,
 			      u64 start, const u64 end, int lvl)
 {
-	const struct i915_page_scratch * const scratch = &vm->scratch[lvl];
+	const struct drm_i915_gem_object * const scratch = vm->scratch[lvl];
 	unsigned int idx, len;
 
 	GEM_BUG_ON(end > vm->total >> GEN8_PTE_SHIFT);
@@ -239,7 +239,7 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm,
 
 			vaddr = kmap_atomic_px(pt);
 			memset64(vaddr + gen8_pd_index(start, 0),
-				 vm->scratch[0].encode,
+				 vm->scratch[0]->encode,
 				 count);
 			kunmap_atomic(vaddr);
 
@@ -296,12 +296,13 @@ static void __gen8_ppgtt_alloc(struct i915_address_space * const vm,
 			    __func__, vm, lvl + 1, idx);
 
 			pt = stash->pt[!!lvl];
-			GEM_BUG_ON(!pt);
+			__i915_gem_object_pin_pages(pt->base);
+			i915_gem_object_make_unshrinkable(pt->base);
 
 			if (lvl ||
 			    gen8_pt_count(*start, end) < I915_PDES ||
 			    intel_vgpu_active(vm->i915))
-				fill_px(pt, vm->scratch[lvl].encode);
+				fill_px(pt, vm->scratch[lvl]->encode);
 
 			spin_lock(&pd->lock);
 			if (likely(!pd->entry[idx])) {
@@ -356,16 +357,6 @@ static void gen8_ppgtt_alloc(struct i915_address_space *vm,
 			   &start, start + length, vm->top);
 }
 
-static __always_inline void
-write_pte(gen8_pte_t *pte, const gen8_pte_t val)
-{
-	/* Magic delays? Or can we refine these to flush all in one pass? */
-	*pte = val;
-	wmb(); /* cpu to cache */
-	clflush(pte); /* cache to memory */
-	wmb(); /* visible to all */
-}
-
 static __always_inline u64
 gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 		      struct i915_page_directory *pdp,
@@ -382,8 +373,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 	vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
 	do {
 		GEM_BUG_ON(iter->sg->length < I915_GTT_PAGE_SIZE);
-		write_pte(&vaddr[gen8_pd_index(idx, 0)],
-			  pte_encode | iter->dma);
+		vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma;
 
 		iter->dma += I915_GTT_PAGE_SIZE;
 		if (iter->dma >= iter->max) {
@@ -406,10 +396,12 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 				pd = pdp->entry[gen8_pd_index(idx, 2)];
 			}
 
+			clflush_cache_range(vaddr, PAGE_SIZE);
 			kunmap_atomic(vaddr);
 			vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
 		}
 	} while (1);
+	clflush_cache_range(vaddr, PAGE_SIZE);
 	kunmap_atomic(vaddr);
 
 	return idx;
@@ -465,7 +457,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 
 		do {
 			GEM_BUG_ON(iter->sg->length < page_size);
-			write_pte(&vaddr[index++], encode | iter->dma);
+			vaddr[index++] = encode | iter->dma;
 
 			start += page_size;
 			iter->dma += page_size;
@@ -490,6 +482,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			}
 		} while (rem >= page_size && index < I915_PDES);
 
+		clflush_cache_range(vaddr, PAGE_SIZE);
 		kunmap_atomic(vaddr);
 
 		/*
@@ -521,7 +514,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 			if (I915_SELFTEST_ONLY(vma->vm->scrub_64K)) {
 				u16 i;
 
-				encode = vma->vm->scratch[0].encode;
+				encode = vma->vm->scratch[0]->encode;
 				vaddr = kmap_atomic_px(i915_pt_entry(pd, maybe_64K));
 
 				for (i = 1; i < index; i += 16)
@@ -575,27 +568,37 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 		GEM_BUG_ON(!clone->has_read_only);
 
 		vm->scratch_order = clone->scratch_order;
-		memcpy(vm->scratch, clone->scratch, sizeof(vm->scratch));
-		px_dma(&vm->scratch[0]) = 0; /* no xfer of ownership */
+		for (i = 0; i <= vm->top; i++)
+			vm->scratch[i] = i915_gem_object_get(clone->scratch[i]);
+
 		return 0;
 	}
 
-	ret = setup_scratch_page(vm, __GFP_HIGHMEM);
+	ret = setup_scratch_page(vm);
 	if (ret)
 		return ret;
 
-	vm->scratch[0].encode =
-		gen8_pte_encode(px_dma(&vm->scratch[0]),
+	vm->scratch[0]->encode =
+		gen8_pte_encode(px_dma(vm->scratch[0]),
 				I915_CACHE_LLC, vm->has_read_only);
 
 	for (i = 1; i <= vm->top; i++) {
-		if (unlikely(setup_page_dma(vm, px_base(&vm->scratch[i]))))
+		struct drm_i915_gem_object *obj;
+
+		obj = vm->alloc_pt_dma(vm, I915_GTT_PAGE_SIZE_4K);
+		if (IS_ERR(obj))
 			goto free_scratch;
 
-		fill_px(&vm->scratch[i], vm->scratch[i - 1].encode);
-		vm->scratch[i].encode =
-			gen8_pde_encode(px_dma(&vm->scratch[i]),
-					I915_CACHE_LLC);
+		ret = pin_pt_dma(vm, obj);
+		if (ret) {
+			i915_gem_object_put(obj);
+			goto free_scratch;
+		}
+
+		fill_px(obj, vm->scratch[i - 1]->encode);
+		obj->encode = gen8_pde_encode(px_dma(obj), I915_CACHE_LLC);
+
+		vm->scratch[i] = obj;
 	}
 
 	return 0;
@@ -616,12 +619,20 @@ static int gen8_preallocate_top_level_pdp(struct i915_ppgtt *ppgtt)
 
 	for (idx = 0; idx < GEN8_3LVL_PDPES; idx++) {
 		struct i915_page_directory *pde;
+		int err;
 
 		pde = alloc_pd(vm);
 		if (IS_ERR(pde))
 			return PTR_ERR(pde);
 
-		fill_px(pde, vm->scratch[1].encode);
+		err = pin_pt_dma(vm, pde->pt.base);
+		if (err) {
+			i915_gem_object_put(pde->pt.base);
+			kfree(pde);
+			return err;
+		}
+
+		fill_px(pde, vm->scratch[1]->encode);
 		set_pd_entry(pd, idx, pde);
 		atomic_inc(px_used(pde)); /* keep pinned */
 	}
@@ -635,6 +646,7 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
 {
 	const unsigned int count = gen8_pd_top_count(vm);
 	struct i915_page_directory *pd;
+	int err;
 
 	GEM_BUG_ON(count > ARRAY_SIZE(pd->entry));
 
@@ -642,12 +654,20 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
 	if (unlikely(!pd))
 		return ERR_PTR(-ENOMEM);
 
-	if (unlikely(setup_page_dma(vm, px_base(pd)))) {
+	pd->pt.base = vm->alloc_pt_dma(vm, I915_GTT_PAGE_SIZE_4K);
+	if (IS_ERR(pd->pt.base)) {
 		kfree(pd);
 		return ERR_PTR(-ENOMEM);
 	}
 
-	fill_page_dma(px_base(pd), vm->scratch[vm->top].encode, count);
+	err = pin_pt_dma(vm, pd->pt.base);
+	if (err) {
+		i915_gem_object_put(pd->pt.base);
+		kfree(pd);
+		return ERR_PTR(err);
+	}
+
+	fill_page_dma(px_base(pd), vm->scratch[vm->top]->encode, count);
 	atomic_inc(px_used(pd)); /* mark as pinned */
 	return pd;
 }
@@ -681,12 +701,7 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
 	 */
 	ppgtt->vm.has_read_only = !IS_GEN_RANGE(gt->i915, 11, 12);
 
-	/*
-	 * There are only few exceptions for gen >=6. chv and bxt.
-	 * And we are not sure about the latter so play safe for now.
-	 */
-	if (IS_CHERRYVIEW(gt->i915) || IS_BROXTON(gt->i915))
-		ppgtt->vm.pt_kmap_wc = true;
+	ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
 
 	err = gen8_init_scratch(&ppgtt->vm);
 	if (err)
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 5a33056ab976..33a3f627ddb1 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -78,8 +78,6 @@ int i915_ggtt_init_hw(struct drm_i915_private *i915)
 {
 	int ret;
 
-	stash_init(&i915->mm.wc_stash);
-
 	/*
 	 * Note that we use page colouring to enforce a guard page at the
 	 * end of the address space. This is required as the CS may prefetch
@@ -232,7 +230,7 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 
 	/* Fill the allocated but "unused" space beyond the end of the buffer */
 	while (gte < end)
-		gen8_set_pte(gte++, vm->scratch[0].encode);
+		gen8_set_pte(gte++, vm->scratch[0]->encode);
 
 	/*
 	 * We want to flush the TLBs only after we're certain all the PTE
@@ -283,7 +281,7 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 
 	/* Fill the allocated but "unused" space beyond the end of the buffer */
 	while (gte < end)
-		iowrite32(vm->scratch[0].encode, gte++);
+		iowrite32(vm->scratch[0]->encode, gte++);
 
 	/*
 	 * We want to flush the TLBs only after we're certain all the PTE
@@ -303,7 +301,7 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	unsigned int first_entry = start / I915_GTT_PAGE_SIZE;
 	unsigned int num_entries = length / I915_GTT_PAGE_SIZE;
-	const gen8_pte_t scratch_pte = vm->scratch[0].encode;
+	const gen8_pte_t scratch_pte = vm->scratch[0]->encode;
 	gen8_pte_t __iomem *gtt_base =
 		(gen8_pte_t __iomem *)ggtt->gsm + first_entry;
 	const int max_entries = ggtt_total_entries(ggtt) - first_entry;
@@ -401,7 +399,7 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 		 first_entry, num_entries, max_entries))
 		num_entries = max_entries;
 
-	scratch_pte = vm->scratch[0].encode;
+	scratch_pte = vm->scratch[0]->encode;
 	for (i = 0; i < num_entries; i++)
 		iowrite32(scratch_pte, &gtt_base[i]);
 }
@@ -617,6 +615,10 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
 	if (err)
 		goto err_ppgtt;
 
+	err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+	if (err)
+		goto err_stash;
+
 	/*
 	 * Note we only pre-allocate as far as the end of the global
 	 * GTT. On 48b / 4-level page-tables, the difference is very,
@@ -637,6 +639,8 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
 	i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 	return 0;
 
+err_stash:
+	i915_vm_free_pt_stash(&ppgtt->vm, &stash);
 err_ppgtt:
 	i915_vm_put(&ppgtt->vm);
 	return err;
@@ -712,18 +716,11 @@ static void ggtt_cleanup_hw(struct i915_ggtt *ggtt)
 void i915_ggtt_driver_release(struct drm_i915_private *i915)
 {
 	struct i915_ggtt *ggtt = &i915->ggtt;
-	struct pagevec *pvec;
 
 	fini_aliasing_ppgtt(ggtt);
 
 	intel_ggtt_fini_fences(ggtt);
 	ggtt_cleanup_hw(ggtt);
-
-	pvec = &i915->mm.wc_stash.pvec;
-	if (pvec->nr) {
-		set_pages_array_wb(pvec->pages, pvec->nr);
-		__pagevec_release(pvec);
-	}
 }
 
 static unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
@@ -786,7 +783,7 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
 		return -ENOMEM;
 	}
 
-	ret = setup_scratch_page(&ggtt->vm, GFP_DMA32);
+	ret = setup_scratch_page(&ggtt->vm);
 	if (ret) {
 		drm_err(&i915->drm, "Scratch setup failed\n");
 		/* iounmap will also get called at remove, but meh */
@@ -794,8 +791,8 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
 		return ret;
 	}
 
-	ggtt->vm.scratch[0].encode =
-		ggtt->vm.pte_encode(px_dma(&ggtt->vm.scratch[0]),
+	ggtt->vm.scratch[0]->encode =
+		ggtt->vm.pte_encode(px_dma(ggtt->vm.scratch[0]),
 				    I915_CACHE_NONE, 0);
 
 	return 0;
@@ -821,7 +818,7 @@ static void gen6_gmch_remove(struct i915_address_space *vm)
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 
 	iounmap(ggtt->gsm);
-	cleanup_scratch_page(vm);
+	free_scratch(vm);
 }
 
 static struct resource pci_resource(struct pci_dev *pdev, int bar)
@@ -849,6 +846,8 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 	else
 		size = gen8_get_total_gtt_size(snb_gmch_ctl);
 
+	ggtt->vm.alloc_pt_dma = alloc_pt_dma;
+
 	ggtt->vm.total = (size / sizeof(gen8_pte_t)) * I915_GTT_PAGE_SIZE;
 	ggtt->vm.cleanup = gen6_gmch_remove;
 	ggtt->vm.insert_page = gen8_ggtt_insert_page;
@@ -997,6 +996,8 @@ static int gen6_gmch_probe(struct i915_ggtt *ggtt)
 	size = gen6_get_total_gtt_size(snb_gmch_ctl);
 	ggtt->vm.total = (size / sizeof(gen6_pte_t)) * I915_GTT_PAGE_SIZE;
 
+	ggtt->vm.alloc_pt_dma = alloc_pt_dma;
+
 	ggtt->vm.clear_range = nop_clear_range;
 	if (!HAS_FULL_PPGTT(i915) || intel_scanout_needs_vtd_wa(i915))
 		ggtt->vm.clear_range = gen6_ggtt_clear_range;
@@ -1047,6 +1048,8 @@ static int i915_gmch_probe(struct i915_ggtt *ggtt)
 	ggtt->gmadr =
 		(struct resource)DEFINE_RES_MEM(gmadr_base, ggtt->mappable_end);
 
+	ggtt->vm.alloc_pt_dma = alloc_pt_dma;
+
 	ggtt->do_idle_maps = needs_idle_maps(i915);
 	ggtt->vm.insert_page = i915_ggtt_insert_page;
 	ggtt->vm.insert_entries = i915_ggtt_insert_entries;
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 2a72cce63fd9..795ed81ba358 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -11,160 +11,21 @@
 #include "intel_gt.h"
 #include "intel_gtt.h"
 
-void stash_init(struct pagestash *stash)
+struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz)
 {
-	pagevec_init(&stash->pvec);
-	spin_lock_init(&stash->lock);
+	return i915_gem_object_create_internal(vm->i915, sz);
 }
 
-static struct page *stash_pop_page(struct pagestash *stash)
+int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
 {
-	struct page *page = NULL;
+	int err;
 
-	spin_lock(&stash->lock);
-	if (likely(stash->pvec.nr))
-		page = stash->pvec.pages[--stash->pvec.nr];
-	spin_unlock(&stash->lock);
+	err = i915_gem_object_pin_pages(obj);
+	if (err)
+		return err;
 
-	return page;
-}
-
-static void stash_push_pagevec(struct pagestash *stash, struct pagevec *pvec)
-{
-	unsigned int nr;
-
-	spin_lock_nested(&stash->lock, SINGLE_DEPTH_NESTING);
-
-	nr = min_t(typeof(nr), pvec->nr, pagevec_space(&stash->pvec));
-	memcpy(stash->pvec.pages + stash->pvec.nr,
-	       pvec->pages + pvec->nr - nr,
-	       sizeof(pvec->pages[0]) * nr);
-	stash->pvec.nr += nr;
-
-	spin_unlock(&stash->lock);
-
-	pvec->nr -= nr;
-}
-
-static struct page *vm_alloc_page(struct i915_address_space *vm, gfp_t gfp)
-{
-	struct pagevec stack;
-	struct page *page;
-
-	if (I915_SELFTEST_ONLY(should_fail(&vm->fault_attr, 1)))
-		i915_gem_shrink_all(vm->i915);
-
-	page = stash_pop_page(&vm->free_pages);
-	if (page)
-		return page;
-
-	if (!vm->pt_kmap_wc)
-		return alloc_page(gfp);
-
-	/* Look in our global stash of WC pages... */
-	page = stash_pop_page(&vm->i915->mm.wc_stash);
-	if (page)
-		return page;
-
-	/*
-	 * Otherwise batch allocate pages to amortize cost of set_pages_wc.
-	 *
-	 * We have to be careful as page allocation may trigger the shrinker
-	 * (via direct reclaim) which will fill up the WC stash underneath us.
-	 * So we add our WB pages into a temporary pvec on the stack and merge
-	 * them into the WC stash after all the allocations are complete.
-	 */
-	pagevec_init(&stack);
-	do {
-		struct page *page;
-
-		page = alloc_page(gfp);
-		if (unlikely(!page))
-			break;
-
-		stack.pages[stack.nr++] = page;
-	} while (pagevec_space(&stack));
-
-	if (stack.nr && !set_pages_array_wc(stack.pages, stack.nr)) {
-		page = stack.pages[--stack.nr];
-
-		/* Merge spare WC pages to the global stash */
-		if (stack.nr)
-			stash_push_pagevec(&vm->i915->mm.wc_stash, &stack);
-
-		/* Push any surplus WC pages onto the local VM stash */
-		if (stack.nr)
-			stash_push_pagevec(&vm->free_pages, &stack);
-	}
-
-	/* Return unwanted leftovers */
-	if (unlikely(stack.nr)) {
-		WARN_ON_ONCE(set_pages_array_wb(stack.pages, stack.nr));
-		__pagevec_release(&stack);
-	}
-
-	return page;
-}
-
-static void vm_free_pages_release(struct i915_address_space *vm,
-				  bool immediate)
-{
-	struct pagevec *pvec = &vm->free_pages.pvec;
-	struct pagevec stack;
-
-	lockdep_assert_held(&vm->free_pages.lock);
-	GEM_BUG_ON(!pagevec_count(pvec));
-
-	if (vm->pt_kmap_wc) {
-		/*
-		 * When we use WC, first fill up the global stash and then
-		 * only if full immediately free the overflow.
-		 */
-		stash_push_pagevec(&vm->i915->mm.wc_stash, pvec);
-
-		/*
-		 * As we have made some room in the VM's free_pages,
-		 * we can wait for it to fill again. Unless we are
-		 * inside i915_address_space_fini() and must
-		 * immediately release the pages!
-		 */
-		if (pvec->nr <= (immediate ? 0 : PAGEVEC_SIZE - 1))
-			return;
-
-		/*
-		 * We have to drop the lock to allow ourselves to sleep,
-		 * so take a copy of the pvec and clear the stash for
-		 * others to use it as we sleep.
-		 */
-		stack = *pvec;
-		pagevec_reinit(pvec);
-		spin_unlock(&vm->free_pages.lock);
-
-		pvec = &stack;
-		set_pages_array_wb(pvec->pages, pvec->nr);
-
-		spin_lock(&vm->free_pages.lock);
-	}
-
-	__pagevec_release(pvec);
-}
-
-static void vm_free_page(struct i915_address_space *vm, struct page *page)
-{
-	/*
-	 * On !llc, we need to change the pages back to WB. We only do so
-	 * in bulk, so we rarely need to change the page attributes here,
-	 * but doing so requires a stop_machine() from deep inside arch/x86/mm.
-	 * To make detection of the possible sleep more likely, use an
-	 * unconditional might_sleep() for everybody.
-	 */
-	might_sleep();
-	spin_lock(&vm->free_pages.lock);
-	while (!pagevec_space(&vm->free_pages.pvec))
-		vm_free_pages_release(vm, false);
-	GEM_BUG_ON(pagevec_count(&vm->free_pages.pvec) >= PAGEVEC_SIZE);
-	pagevec_add(&vm->free_pages.pvec, page);
-	spin_unlock(&vm->free_pages.lock);
+	i915_gem_object_make_unshrinkable(obj);
+	return 0;
 }
 
 void __i915_vm_close(struct i915_address_space *vm)
@@ -194,14 +55,7 @@ void __i915_vm_close(struct i915_address_space *vm)
 
 void i915_address_space_fini(struct i915_address_space *vm)
 {
-	spin_lock(&vm->free_pages.lock);
-	if (pagevec_count(&vm->free_pages.pvec))
-		vm_free_pages_release(vm, true);
-	GEM_BUG_ON(pagevec_count(&vm->free_pages.pvec));
-	spin_unlock(&vm->free_pages.lock);
-
 	drm_mm_takedown(&vm->mm);
-
 	mutex_destroy(&vm->mutex);
 }
 
@@ -246,8 +100,6 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	drm_mm_init(&vm->mm, 0, vm->total);
 	vm->mm.head_node.color = I915_COLOR_UNEVICTABLE;
 
-	stash_init(&vm->free_pages);
-
 	INIT_LIST_HEAD(&vm->bound_list);
 }
 
@@ -264,64 +116,50 @@ void clear_pages(struct i915_vma *vma)
 	memset(&vma->page_sizes, 0, sizeof(vma->page_sizes));
 }
 
-static int __setup_page_dma(struct i915_address_space *vm,
-			    struct i915_page_dma *p,
-			    gfp_t gfp)
-{
-	p->page = vm_alloc_page(vm, gfp | I915_GFP_ALLOW_FAIL);
-	if (unlikely(!p->page))
-		return -ENOMEM;
-
-	p->daddr = dma_map_page_attrs(vm->dma,
-				      p->page, 0, PAGE_SIZE,
-				      PCI_DMA_BIDIRECTIONAL,
-				      DMA_ATTR_SKIP_CPU_SYNC |
-				      DMA_ATTR_NO_WARN);
-	if (unlikely(dma_mapping_error(vm->dma, p->daddr))) {
-		vm_free_page(vm, p->page);
-		return -ENOMEM;
-	}
-
-	return 0;
-}
-
-int setup_page_dma(struct i915_address_space *vm, struct i915_page_dma *p)
+dma_addr_t __px_dma(struct drm_i915_gem_object *p)
 {
-	return __setup_page_dma(vm, p, __GFP_HIGHMEM);
+	GEM_BUG_ON(!i915_gem_object_has_pages(p));
+	return sg_dma_address(p->mm.pages->sgl);
 }
 
-void cleanup_page_dma(struct i915_address_space *vm, struct i915_page_dma *p)
+struct page *__px_page(struct drm_i915_gem_object *p)
 {
-	dma_unmap_page(vm->dma, p->daddr, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-	vm_free_page(vm, p->page);
+	GEM_BUG_ON(!i915_gem_object_has_pages(p));
+	return sg_page(p->mm.pages->sgl);
 }
 
 void
-fill_page_dma(const struct i915_page_dma *p, const u64 val, unsigned int count)
+fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count)
 {
-	kunmap_atomic(memset64(kmap_atomic(p->page), val, count));
+	struct page *page = __px_page(p);
+	void *vaddr;
+
+	vaddr = kmap(page);
+	memset64(vaddr, val, count);
+	clflush_cache_range(vaddr, PAGE_SIZE);
+	kunmap(page);
 }
 
-static void poison_scratch_page(struct page *page, unsigned long size)
+static void poison_scratch_page(struct drm_i915_gem_object *scratch)
 {
-	if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
-		return;
+	struct sgt_iter sgt;
+	struct page *page;
+	u8 val;
 
-	GEM_BUG_ON(!IS_ALIGNED(size, PAGE_SIZE));
+	val = 0;
+	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
+		val = POISON_FREE;
 
-	do {
+	for_each_sgt_page(page, sgt, scratch->mm.pages) {
 		void *vaddr;
 
 		vaddr = kmap(page);
-		memset(vaddr, POISON_FREE, PAGE_SIZE);
+		memset(vaddr, val, PAGE_SIZE);
 		kunmap(page);
-
-		page = pfn_to_page(page_to_pfn(page) + 1);
-		size -= PAGE_SIZE;
-	} while (size);
+	}
 }
 
-int setup_scratch_page(struct i915_address_space *vm, gfp_t gfp)
+int setup_scratch_page(struct i915_address_space *vm)
 {
 	unsigned long size;
 
@@ -338,21 +176,27 @@ int setup_scratch_page(struct i915_address_space *vm, gfp_t gfp)
 	 */
 	size = I915_GTT_PAGE_SIZE_4K;
 	if (i915_vm_is_4lvl(vm) &&
-	    HAS_PAGE_SIZES(vm->i915, I915_GTT_PAGE_SIZE_64K)) {
+	    HAS_PAGE_SIZES(vm->i915, I915_GTT_PAGE_SIZE_64K))
 		size = I915_GTT_PAGE_SIZE_64K;
-		gfp |= __GFP_NOWARN;
-	}
-	gfp |= __GFP_ZERO | __GFP_RETRY_MAYFAIL;
 
 	do {
-		unsigned int order = get_order(size);
-		struct page *page;
-		dma_addr_t addr;
+		struct drm_i915_gem_object *obj;
 
-		page = alloc_pages(gfp, order);
-		if (unlikely(!page))
+		obj = vm->alloc_pt_dma(vm, size);
+		if (IS_ERR(obj))
 			goto skip;
 
+		if (pin_pt_dma(vm, obj))
+			goto skip_obj;
+
+		/* We need a single contiguous page for our scratch */
+		if (obj->mm.page_sizes.sg < size)
+			goto skip_obj;
+
+		/* And it needs to be correspondingly aligned */
+		if (__px_dma(obj) & (size - 1))
+			goto skip_obj;
+
 		/*
 		 * Use a non-zero scratch page for debugging.
 		 *
@@ -362,61 +206,28 @@ int setup_scratch_page(struct i915_address_space *vm, gfp_t gfp)
 		 * should it ever be accidentally used, the effect should be
 		 * fairly benign.
 		 */
-		poison_scratch_page(page, size);
-
-		addr = dma_map_page_attrs(vm->dma,
-					  page, 0, size,
-					  PCI_DMA_BIDIRECTIONAL,
-					  DMA_ATTR_SKIP_CPU_SYNC |
-					  DMA_ATTR_NO_WARN);
-		if (unlikely(dma_mapping_error(vm->dma, addr)))
-			goto free_page;
-
-		if (unlikely(!IS_ALIGNED(addr, size)))
-			goto unmap_page;
-
-		vm->scratch[0].base.page = page;
-		vm->scratch[0].base.daddr = addr;
-		vm->scratch_order = order;
+		poison_scratch_page(obj);
+
+		vm->scratch[0] = obj;
+		vm->scratch_order = get_order(size);
 		return 0;
 
-unmap_page:
-		dma_unmap_page(vm->dma, addr, size, PCI_DMA_BIDIRECTIONAL);
-free_page:
-		__free_pages(page, order);
+skip_obj:
+		i915_gem_object_put(obj);
 skip:
 		if (size == I915_GTT_PAGE_SIZE_4K)
 			return -ENOMEM;
 
 		size = I915_GTT_PAGE_SIZE_4K;
-		gfp &= ~__GFP_NOWARN;
 	} while (1);
 }
 
-void cleanup_scratch_page(struct i915_address_space *vm)
-{
-	struct i915_page_dma *p = px_base(&vm->scratch[0]);
-	unsigned int order = vm->scratch_order;
-
-	dma_unmap_page(vm->dma, p->daddr, BIT(order) << PAGE_SHIFT,
-		       PCI_DMA_BIDIRECTIONAL);
-	__free_pages(p->page, order);
-}
-
 void free_scratch(struct i915_address_space *vm)
 {
 	int i;
 
-	if (!px_dma(&vm->scratch[0])) /* set to 0 on clones */
-		return;
-
-	for (i = 1; i <= vm->top; i++) {
-		if (!px_dma(&vm->scratch[i]))
-			break;
-		cleanup_page_dma(vm, px_base(&vm->scratch[i]));
-	}
-
-	cleanup_scratch_page(vm);
+	for (i = 0; i <= vm->top; i++)
+		i915_gem_object_put(vm->scratch[i]);
 }
 
 void gtt_write_workarounds(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 8bd462d2fcd9..6e1bedf5448b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -134,31 +134,19 @@ typedef u64 gen8_pte_t;
 #define GEN8_PDE_IPS_64K BIT(11)
 #define GEN8_PDE_PS_2M   BIT(7)
 
+enum i915_cache_level;
+
+struct drm_i915_file_private;
+struct drm_i915_gem_object;
 struct i915_fence_reg;
+struct i915_vma;
+struct intel_gt;
 
 #define for_each_sgt_daddr(__dp, __iter, __sgt) \
 	__for_each_sgt_daddr(__dp, __iter, __sgt, I915_GTT_PAGE_SIZE)
 
-struct i915_page_dma {
-	struct page *page;
-	union {
-		dma_addr_t daddr;
-
-		/*
-		 * For gen6/gen7 only. This is the offset in the GGTT
-		 * where the page directory entries for PPGTT begin
-		 */
-		u32 ggtt_offset;
-	};
-};
-
-struct i915_page_scratch {
-	struct i915_page_dma base;
-	u64 encode;
-};
-
 struct i915_page_table {
-	struct i915_page_dma base;
+	struct drm_i915_gem_object *base;
 	union {
 		atomic_t used;
 		struct i915_page_table *stash;
@@ -179,12 +167,14 @@ struct i915_page_directory {
 	other)
 
 #define px_base(px) \
-	__px_choose_expr(px, struct i915_page_dma *, __x, \
-	__px_choose_expr(px, struct i915_page_scratch *, &__x->base, \
-	__px_choose_expr(px, struct i915_page_table *, &__x->base, \
-	__px_choose_expr(px, struct i915_page_directory *, &__x->pt.base, \
-	(void)0))))
-#define px_dma(px) (px_base(px)->daddr)
+	__px_choose_expr(px, struct drm_i915_gem_object *, __x, \
+	__px_choose_expr(px, struct i915_page_table *, __x->base, \
+	__px_choose_expr(px, struct i915_page_directory *, __x->pt.base, \
+	(void)0)))
+
+struct page *__px_page(struct drm_i915_gem_object *p);
+dma_addr_t __px_dma(struct drm_i915_gem_object *p);
+#define px_dma(px) (__px_dma(px_base(px)))
 
 #define px_pt(px) \
 	__px_choose_expr(px, struct i915_page_table *, __x, \
@@ -192,13 +182,6 @@ struct i915_page_directory {
 	(void)0))
 #define px_used(px) (&px_pt(px)->used)
 
-enum i915_cache_level;
-
-struct drm_i915_file_private;
-struct drm_i915_gem_object;
-struct i915_vma;
-struct intel_gt;
-
 struct i915_vm_pt_stash {
 	/* preallocated chains of page tables/directories */
 	struct i915_page_table *pt[2];
@@ -222,13 +205,6 @@ struct i915_vma_ops {
 	void (*clear_pages)(struct i915_vma *vma);
 };
 
-struct pagestash {
-	spinlock_t lock;
-	struct pagevec pvec;
-};
-
-void stash_init(struct pagestash *stash);
-
 struct i915_address_space {
 	struct kref ref;
 	struct rcu_work rcu;
@@ -265,7 +241,7 @@ struct i915_address_space {
 #define VM_CLASS_GGTT 0
 #define VM_CLASS_PPGTT 1
 
-	struct i915_page_scratch scratch[4];
+	struct drm_i915_gem_object *scratch[4];
 	unsigned int scratch_order;
 	unsigned int top;
 
@@ -274,17 +250,15 @@ struct i915_address_space {
 	 */
 	struct list_head bound_list;
 
-	struct pagestash free_pages;
-
 	/* Global GTT */
 	bool is_ggtt:1;
 
-	/* Some systems require uncached updates of the page directories */
-	bool pt_kmap_wc:1;
-
 	/* Some systems support read-only mappings for GGTT and/or PPGTT */
 	bool has_read_only:1;
 
+	struct drm_i915_gem_object *
+		(*alloc_pt_dma)(struct i915_address_space *vm, int sz);
+
 	u64 (*pte_encode)(dma_addr_t addr,
 			  enum i915_cache_level level,
 			  u32 flags); /* Create a valid PTE */
@@ -500,9 +474,9 @@ i915_pd_entry(const struct i915_page_directory * const pdp,
 static inline dma_addr_t
 i915_page_dir_dma_addr(const struct i915_ppgtt *ppgtt, const unsigned int n)
 {
-	struct i915_page_dma *pt = ppgtt->pd->entry[n];
+	struct i915_page_table *pt = ppgtt->pd->entry[n];
 
-	return px_dma(pt ?: px_base(&ppgtt->vm.scratch[ppgtt->vm.top]));
+	return __px_dma(pt ? px_base(pt) : ppgtt->vm.scratch[ppgtt->vm.top]);
 }
 
 void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt);
@@ -527,13 +501,10 @@ struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
 void i915_ggtt_suspend(struct i915_ggtt *gtt);
 void i915_ggtt_resume(struct i915_ggtt *ggtt);
 
-int setup_page_dma(struct i915_address_space *vm, struct i915_page_dma *p);
-void cleanup_page_dma(struct i915_address_space *vm, struct i915_page_dma *p);
-
-#define kmap_atomic_px(px) kmap_atomic(px_base(px)->page)
+#define kmap_atomic_px(px) kmap_atomic(__px_page(px_base(px)))
 
 void
-fill_page_dma(const struct i915_page_dma *p, const u64 val, unsigned int count);
+fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count);
 
 #define fill_px(px, v) fill_page_dma(px_base(px), (v), PAGE_SIZE / sizeof(u64))
 #define fill32_px(px, v) do {						\
@@ -541,37 +512,38 @@ fill_page_dma(const struct i915_page_dma *p, const u64 val, unsigned int count);
 	fill_px((px), v__ << 32 | v__);					\
 } while (0)
 
-int setup_scratch_page(struct i915_address_space *vm, gfp_t gfp);
-void cleanup_scratch_page(struct i915_address_space *vm);
+int setup_scratch_page(struct i915_address_space *vm);
 void free_scratch(struct i915_address_space *vm);
 
+struct drm_i915_gem_object *alloc_pt_dma(struct i915_address_space *vm, int sz);
 struct i915_page_table *alloc_pt(struct i915_address_space *vm);
 struct i915_page_directory *alloc_pd(struct i915_address_space *vm);
 struct i915_page_directory *__alloc_pd(size_t sz);
 
-void free_pd(struct i915_address_space *vm, struct i915_page_dma *pd);
+int pin_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
 
-#define free_px(vm, px) free_pd(vm, px_base(px))
+void free_pt(struct i915_address_space *vm, struct i915_page_table *pt);
+#define free_px(vm, px) free_pt(vm, px_pt(px))
 
 void
 __set_pd_entry(struct i915_page_directory * const pd,
 	       const unsigned short idx,
-	       struct i915_page_dma * const to,
+	       struct i915_page_table *pt,
 	       u64 (*encode)(const dma_addr_t, const enum i915_cache_level));
 
 #define set_pd_entry(pd, idx, to) \
-	__set_pd_entry((pd), (idx), px_base(to), gen8_pde_encode)
+	__set_pd_entry((pd), (idx), px_pt(to), gen8_pde_encode)
 
 void
 clear_pd_entry(struct i915_page_directory * const pd,
 	       const unsigned short idx,
-	       const struct i915_page_scratch * const scratch);
+	       const struct drm_i915_gem_object * const scratch);
 
 bool
 release_pd_entry(struct i915_page_directory * const pd,
 		 const unsigned short idx,
 		 struct i915_page_table * const pt,
-		 const struct i915_page_scratch * const scratch);
+		 const struct drm_i915_gem_object * const scratch);
 void gen6_ggtt_invalidate(struct i915_ggtt *ggtt);
 
 int ggtt_set_pages(struct i915_vma *vma);
@@ -593,6 +565,8 @@ void setup_private_pat(struct intel_uncore *uncore);
 int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash,
 			   u64 size);
+int i915_vm_pin_pt_stash(struct i915_address_space *vm,
+			 struct i915_vm_pt_stash *stash);
 void i915_vm_free_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 9633fd2d294d..1215c1d6d9e3 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -18,7 +18,8 @@ struct i915_page_table *alloc_pt(struct i915_address_space *vm)
 	if (unlikely(!pt))
 		return ERR_PTR(-ENOMEM);
 
-	if (unlikely(setup_page_dma(vm, &pt->base))) {
+	pt->base = vm->alloc_pt_dma(vm, I915_GTT_PAGE_SIZE_4K);
+	if (IS_ERR(pt->base)) {
 		kfree(pt);
 		return ERR_PTR(-ENOMEM);
 	}
@@ -47,7 +48,8 @@ struct i915_page_directory *alloc_pd(struct i915_address_space *vm)
 	if (unlikely(!pd))
 		return ERR_PTR(-ENOMEM);
 
-	if (unlikely(setup_page_dma(vm, px_base(pd)))) {
+	pd->pt.base = vm->alloc_pt_dma(vm, I915_GTT_PAGE_SIZE_4K);
+	if (IS_ERR(pd->pt.base)) {
 		kfree(pd);
 		return ERR_PTR(-ENOMEM);
 	}
@@ -55,27 +57,28 @@ struct i915_page_directory *alloc_pd(struct i915_address_space *vm)
 	return pd;
 }
 
-void free_pd(struct i915_address_space *vm, struct i915_page_dma *pd)
+void free_pt(struct i915_address_space *vm, struct i915_page_table *pt)
 {
-	cleanup_page_dma(vm, pd);
-	kfree(pd);
+	i915_gem_object_put(pt->base);
+	kfree(pt);
 }
 
 static inline void
-write_dma_entry(struct i915_page_dma * const pdma,
+write_dma_entry(struct drm_i915_gem_object * const pdma,
 		const unsigned short idx,
 		const u64 encoded_entry)
 {
-	u64 * const vaddr = kmap_atomic(pdma->page);
+	u64 * const vaddr = kmap_atomic(__px_page(pdma));
 
 	vaddr[idx] = encoded_entry;
+	clflush_cache_range(&vaddr[idx], sizeof(u64));
 	kunmap_atomic(vaddr);
 }
 
 void
 __set_pd_entry(struct i915_page_directory * const pd,
 	       const unsigned short idx,
-	       struct i915_page_dma * const to,
+	       struct i915_page_table * const to,
 	       u64 (*encode)(const dma_addr_t, const enum i915_cache_level))
 {
 	/* Each thread pre-pins the pd, and we may have a thread per pde. */
@@ -83,13 +86,13 @@ __set_pd_entry(struct i915_page_directory * const pd,
 
 	atomic_inc(px_used(pd));
 	pd->entry[idx] = to;
-	write_dma_entry(px_base(pd), idx, encode(to->daddr, I915_CACHE_LLC));
+	write_dma_entry(px_base(pd), idx, encode(px_dma(to), I915_CACHE_LLC));
 }
 
 void
 clear_pd_entry(struct i915_page_directory * const pd,
 	       const unsigned short idx,
-	       const struct i915_page_scratch * const scratch)
+	       const struct drm_i915_gem_object * const scratch)
 {
 	GEM_BUG_ON(atomic_read(px_used(pd)) == 0);
 
@@ -102,7 +105,7 @@ bool
 release_pd_entry(struct i915_page_directory * const pd,
 		 const unsigned short idx,
 		 struct i915_page_table * const pt,
-		 const struct i915_page_scratch * const scratch)
+		 const struct drm_i915_gem_object * const scratch)
 {
 	bool free = false;
 
@@ -231,6 +234,23 @@ int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
 	return 0;
 }
 
+int i915_vm_pin_pt_stash(struct i915_address_space *vm,
+			 struct i915_vm_pt_stash *stash)
+{
+	struct i915_page_table *pt;
+	int n, err;
+
+	for (n = 0; n < ARRAY_SIZE(stash->pt); n++) {
+		for (pt = stash->pt[n]; pt; pt = pt->stash) {
+			err = pin_pt_dma(vm, pt->base);
+			if (err)
+				return err;
+		}
+	}
+
+	return 0;
+}
+
 void i915_vm_free_pt_stash(struct i915_address_space *vm,
 			   struct i915_vm_pt_stash *stash)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index a1e1d24da2fb..db07dc2450e4 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -201,16 +201,18 @@ static struct i915_address_space *vm_alias(struct i915_address_space *vm)
 	return vm;
 }
 
+static u32 pp_dir(struct i915_address_space *vm)
+{
+	return to_gen6_ppgtt(i915_vm_to_ppgtt(vm))->pp_dir;
+}
+
 static void set_pp_dir(struct intel_engine_cs *engine)
 {
 	struct i915_address_space *vm = vm_alias(engine->gt->vm);
 
 	if (vm) {
-		struct i915_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
-
 		ENGINE_WRITE(engine, RING_PP_DIR_DCLV, PP_DIR_DCLV_2G);
-		ENGINE_WRITE(engine, RING_PP_DIR_BASE,
-			     px_base(ppgtt->pd)->ggtt_offset << 10);
+		ENGINE_WRITE(engine, RING_PP_DIR_BASE, pp_dir(vm));
 	}
 }
 
@@ -609,7 +611,7 @@ static const struct intel_context_ops ring_context_ops = {
 };
 
 static int load_pd_dir(struct i915_request *rq,
-		       const struct i915_ppgtt *ppgtt,
+		       struct i915_address_space *vm,
 		       u32 valid)
 {
 	const struct intel_engine_cs * const engine = rq->engine;
@@ -625,7 +627,7 @@ static int load_pd_dir(struct i915_request *rq,
 
 	*cs++ = MI_LOAD_REGISTER_IMM(1);
 	*cs++ = i915_mmio_reg_offset(RING_PP_DIR_BASE(engine->mmio_base));
-	*cs++ = px_base(ppgtt->pd)->ggtt_offset << 10;
+	*cs++ = pp_dir(vm);
 
 	/* Stall until the page table load is complete? */
 	*cs++ = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT;
@@ -827,7 +829,7 @@ static int switch_mm(struct i915_request *rq, struct i915_address_space *vm)
 	 * post-sync op, this extra pass appears vital before a
 	 * mm switch!
 	 */
-	ret = load_pd_dir(rq, i915_vm_to_ppgtt(vm), PP_DIR_DCLV_2G);
+	ret = load_pd_dir(rq, vm, PP_DIR_DCLV_2G);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 3c3b9842bbbd..1570eb8aa978 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -403,6 +403,14 @@ static void release_shadow_wa_ctx(struct intel_shadow_wa_ctx *wa_ctx)
 	wa_ctx->indirect_ctx.shadow_va = NULL;
 }
 
+static void set_dma_address(struct i915_page_directory *pd, dma_addr_t addr)
+{
+	struct scatterlist *sg = pd->pt.base->mm.pages->sgl;
+
+	/* This is not a good idea */
+	sg->dma_address = addr;
+}
+
 static void set_context_ppgtt_from_shadow(struct intel_vgpu_workload *workload,
 					  struct intel_context *ce)
 {
@@ -411,7 +419,7 @@ static void set_context_ppgtt_from_shadow(struct intel_vgpu_workload *workload,
 	int i = 0;
 
 	if (mm->ppgtt_mm.root_entry_type == GTT_TYPE_PPGTT_ROOT_L4_ENTRY) {
-		px_dma(ppgtt->pd) = mm->ppgtt_mm.shadow_pdps[0];
+		set_dma_address(ppgtt->pd, mm->ppgtt_mm.shadow_pdps[0]);
 	} else {
 		for (i = 0; i < GVT_RING_CTX_NR_PDPS; i++) {
 			struct i915_page_directory * const pd =
@@ -421,7 +429,8 @@ static void set_context_ppgtt_from_shadow(struct intel_vgpu_workload *workload,
 			   shadow ppgtt. */
 			if (!pd)
 				break;
-			px_dma(pd) = mm->ppgtt_mm.shadow_pdps[i];
+
+			set_dma_address(pd, mm->ppgtt_mm.shadow_pdps[i]);
 		}
 	}
 }
@@ -1240,13 +1249,13 @@ i915_context_ppgtt_root_restore(struct intel_vgpu_submission *s,
 	int i;
 
 	if (i915_vm_is_4lvl(&ppgtt->vm)) {
-		px_dma(ppgtt->pd) = s->i915_context_pml4;
+		set_dma_address(ppgtt->pd, s->i915_context_pml4);
 	} else {
 		for (i = 0; i < GEN8_3LVL_PDPES; i++) {
 			struct i915_page_directory * const pd =
 				i915_pd_entry(ppgtt->pd, i);
 
-			px_dma(pd) = s->i915_context_pdps[i];
+			set_dma_address(pd, s->i915_context_pdps[i]);
 		}
 	}
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 67102dc26fce..ea281d7b0630 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1080,6 +1080,7 @@ static void i915_driver_release(struct drm_device *dev)
 
 	intel_memory_regions_driver_release(dev_priv);
 	i915_ggtt_driver_release(dev_priv);
+	i915_gem_drain_freed_objects(dev_priv);
 
 	i915_driver_mmio_release(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2c2e88d49f3e..0df73a14af1c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -590,11 +590,6 @@ struct i915_gem_mm {
 	 */
 	atomic_t free_count;
 
-	/**
-	 * Small stash of WC pages
-	 */
-	struct pagestash wc_stash;
-
 	/**
 	 * tmpfs instance used for shmem backed objects
 	 */
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 70a6f55730f3..1fc657581c97 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -872,24 +872,30 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 	if (err)
 		return err;
 
+	if (flags & PIN_GLOBAL)
+		wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
+
 	if (flags & vma->vm->bind_async_flags) {
 		work = i915_vma_work();
 		if (!work) {
 			err = -ENOMEM;
-			goto err_pages;
+			goto err_rpm;
 		}
 
 		work->vm = i915_vm_get(vma->vm);
 
 		/* Allocate enough page directories to used PTE */
-		if (vma->vm->allocate_va_range)
+		if (vma->vm->allocate_va_range) {
 			i915_vm_alloc_pt_stash(vma->vm,
 					       &work->stash,
 					       vma->size);
-	}
 
-	if (flags & PIN_GLOBAL)
-		wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
+			err = i915_vm_pin_pt_stash(vma->vm,
+						   &work->stash);
+			if (err)
+				goto err_fence;
+		}
+	}
 
 	/*
 	 * Differentiate between user/kernel vma inside the aliasing-ppgtt.
@@ -978,9 +984,9 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 err_fence:
 	if (work)
 		dma_fence_work_commit_imm(&work->base);
+err_rpm:
 	if (wakeref)
 		intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref);
-err_pages:
 	vma_put_pages(vma);
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 9b8fc990e9ef..af8205a2bd8f 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -178,6 +178,12 @@ static int igt_ppgtt_alloc(void *arg)
 		if (err)
 			goto err_ppgtt_cleanup;
 
+		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+		if (err) {
+			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
+			goto err_ppgtt_cleanup;
+		}
+
 		ppgtt->vm.allocate_va_range(&ppgtt->vm, &stash, 0, size);
 		cond_resched();
 
@@ -194,6 +200,12 @@ static int igt_ppgtt_alloc(void *arg)
 		if (err)
 			goto err_ppgtt_cleanup;
 
+		err = i915_vm_pin_pt_stash(&ppgtt->vm, &stash);
+		if (err) {
+			i915_vm_free_pt_stash(&ppgtt->vm, &stash);
+			goto err_ppgtt_cleanup;
+		}
+
 		ppgtt->vm.allocate_va_range(&ppgtt->vm, &stash,
 					    last, size - last);
 		cond_resched();
@@ -289,6 +301,11 @@ static int lowlevel_hole(struct i915_address_space *vm,
 							   BIT_ULL(size)))
 					break;
 
+				if (i915_vm_pin_pt_stash(vm, &stash)) {
+					i915_vm_free_pt_stash(vm, &stash);
+					break;
+				}
+
 				vm->allocate_va_range(vm, &stash,
 						      addr, BIT_ULL(size));
 
@@ -1912,6 +1929,12 @@ static int igt_cs_tlb(void *arg)
 			if (err)
 				goto end;
 
+			err = i915_vm_pin_pt_stash(vm, &stash);
+			if (err) {
+				i915_vm_free_pt_stash(vm, &stash);
+				goto end;
+			}
+
 			vm->allocate_va_range(vm, &stash, offset, chunk_size);
 
 			i915_vm_free_pt_stash(vm, &stash);
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index 5e4fb0fba34b..7270fc8ca801 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -78,6 +78,8 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name)
 
 	i915_address_space_init(&ppgtt->vm, VM_CLASS_PPGTT);
 
+	ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
+
 	ppgtt->vm.clear_range = mock_clear_range;
 	ppgtt->vm.insert_page = mock_insert_page;
 	ppgtt->vm.insert_entries = mock_insert_entries;
@@ -116,6 +118,8 @@ void mock_init_ggtt(struct drm_i915_private *i915, struct i915_ggtt *ggtt)
 	ggtt->mappable_end = resource_size(&ggtt->gmadr);
 	ggtt->vm.total = 4096 * PAGE_SIZE;
 
+	ggtt->vm.alloc_pt_dma = alloc_pt_dma;
+
 	ggtt->vm.clear_range = mock_clear_range;
 	ggtt->vm.insert_page = mock_insert_page;
 	ggtt->vm.insert_entries = mock_insert_entries;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Intel-gfx] [PATCH 7/7] drm/i915/gt: Optimise aliasing-ppgtt allocations
  2020-07-08 13:47 [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Chris Wilson
                   ` (4 preceding siblings ...)
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 6/7] drm/i915: Switch to object allocations for page directories Chris Wilson
@ 2020-07-08 13:47 ` Chris Wilson
  2020-07-08 16:55   ` Matthew Auld
  2020-07-08 15:19 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2) Patchwork
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2020-07-08 13:47 UTC (permalink / raw)
  To: intel-gfx; +Cc: matthew.auld, Chris Wilson

Since the aliasing-ppgtt remains the default for gen6/gen7, it is worth
optimising the ppgtt allocation for it. In this case, we do not need to
flush the GGTT page directories entries as they are fixed during setup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 1c1e807f674d..277737e2d41a 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -184,12 +184,10 @@ static void gen6_alloc_va_range(struct i915_address_space *vm,
 	struct gen6_ppgtt *ppgtt = to_gen6_ppgtt(i915_vm_to_ppgtt(vm));
 	struct i915_page_directory * const pd = ppgtt->base.pd;
 	struct i915_page_table *pt;
-	intel_wakeref_t wakeref;
+	bool flush = false;
 	u64 from = start;
 	unsigned int pde;
 
-	wakeref = intel_runtime_pm_get(&vm->i915->runtime_pm);
-
 	spin_lock(&pd->lock);
 	gen6_for_each_pde(pt, pd, start, length, pde) {
 		const unsigned int count = gen6_pte_count(start, length);
@@ -211,16 +209,20 @@ static void gen6_alloc_va_range(struct i915_address_space *vm,
 			} else {
 				pt = pd->entry[pde];
 			}
+
+			flush = true;
 		}
 
 		atomic_add(count, &pt->used);
 	}
 	spin_unlock(&pd->lock);
 
-	if (i915_vma_is_bound(ppgtt->vma, I915_VMA_GLOBAL_BIND))
-		gen6_flush_pd(ppgtt, from, start);
+	if (flush && i915_vma_is_bound(ppgtt->vma, I915_VMA_GLOBAL_BIND)) {
+		intel_wakeref_t wakeref;
 
-	intel_runtime_pm_put(&vm->i915->runtime_pm, wakeref);
+		with_intel_runtime_pm(&vm->i915->runtime_pm, wakeref)
+			gen6_flush_pd(ppgtt, from, start);
+	}
 }
 
 static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 2/7] drm/i915: Release shortlived maps of longlived objects
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 2/7] drm/i915: Release shortlived maps of longlived objects Chris Wilson
@ 2020-07-08 13:50   ` Chris Wilson
  2020-07-08 14:35   ` [Intel-gfx] [PATCH v2] " Chris Wilson
  1 sibling, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2020-07-08 13:50 UTC (permalink / raw)
  To: intel-gfx; +Cc: matthew.auld

Quoting Chris Wilson (2020-07-08 14:47:37)
> Some objects we map once during their construction, and then never
> access their mappings again, even if they are kept around for the
> duration of the driver. Keeping those pages mapped, often vmapped, is
> therefore wasteful and we should release the maps as soon as we no
> longer need them.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |  2 ++
>  drivers/gpu/drm/i915/gem/i915_gem_pages.c     | 20 +++++++++++++++++++
>  drivers/gpu/drm/i915/gt/gen7_renderclear.c    |  1 +
>  drivers/gpu/drm/i915/gt/intel_lrc.c           |  1 +
>  .../gpu/drm/i915/gt/intel_ring_submission.c   |  1 +
>  drivers/gpu/drm/i915/i915_perf.c              |  2 ++
>  6 files changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> index 2faa481cc18f..a26a4faf014f 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> @@ -394,6 +394,8 @@ static inline void i915_gem_object_unpin_map(struct drm_i915_gem_object *obj)
>         i915_gem_object_unpin_pages(obj);
>  }
>  
> +int i915_gem_object_release_map(struct drm_i915_gem_object *obj);
> +
>  void
>  i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
>                                    unsigned int flush_domains);
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> index af9e48ee4a33..114256014a97 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> @@ -408,6 +408,26 @@ void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj,
>         }
>  }
>  
> +int i915_gem_object_release_map(struct drm_i915_gem_object *obj)
> +{
> +       int err;
> +
> +       err = mutex_lock_interruptible(&obj->mm.lock);
> +       if (err)
> +               return err;
> +
> +       if (atomic_read(&obj->mm.pages_pin_count)) {
> +               err = -EBUSY;

Hmm. I realise this slightly defeats the efficacy of the patch, since we
often have a pinned vma at the time.

Since we only call this at known points, we can just say the caller
knows best and force the unmap. If we get it wrong, we'll have a very
weird GPF.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Intel-gfx] [PATCH v2] drm/i915: Release shortlived maps of longlived objects
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 2/7] drm/i915: Release shortlived maps of longlived objects Chris Wilson
  2020-07-08 13:50   ` Chris Wilson
@ 2020-07-08 14:35   ` Chris Wilson
  2020-07-08 16:39     ` Matthew Auld
  1 sibling, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2020-07-08 14:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Some objects we map once during their construction, and then never
access their mappings again, even if they are kept around for the
duration of the driver. Keeping those pages mapped, often vmapped, is
therefore wasteful and we should release the maps as soon as we no
longer need them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h      |  2 ++
 drivers/gpu/drm/i915/gem/i915_gem_pages.c       | 16 ++++++++++++++++
 drivers/gpu/drm/i915/gt/gen7_renderclear.c      |  2 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c             |  2 +-
 drivers/gpu/drm/i915/gt/intel_ring_submission.c |  2 +-
 drivers/gpu/drm/i915/i915_perf.c                |  4 ++--
 6 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 2faa481cc18f..9cf4ad78ece6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -394,6 +394,8 @@ static inline void i915_gem_object_unpin_map(struct drm_i915_gem_object *obj)
 	i915_gem_object_unpin_pages(obj);
 }
 
+void __i915_gem_object_release_map(struct drm_i915_gem_object *obj);
+
 void
 i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
 				   unsigned int flush_domains);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index af9e48ee4a33..b2e22f4a4a78 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -408,6 +408,22 @@ void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj,
 	}
 }
 
+void __i915_gem_object_release_map(struct drm_i915_gem_object *obj)
+{
+	/*
+	 * We allow removing the mapping from underneath pinned pages!
+	 *
+	 * Furthermore, since this is an unsafe operation reserved only
+	 * for construction time manipulation, we ignore locking prudence.
+	 */
+	if (obj->mm.mapping) {
+		unmap_object(obj, page_mask_bits(obj->mm.mapping));
+		obj->mm.mapping = NULL;
+	}
+
+	i915_gem_object_unpin_map(obj);
+}
+
 struct scatterlist *
 i915_gem_object_get_sg(struct drm_i915_gem_object *obj,
 		       unsigned int n,
diff --git a/drivers/gpu/drm/i915/gt/gen7_renderclear.c b/drivers/gpu/drm/i915/gt/gen7_renderclear.c
index de595b66a746..d93d85cd3027 100644
--- a/drivers/gpu/drm/i915/gt/gen7_renderclear.c
+++ b/drivers/gpu/drm/i915/gt/gen7_renderclear.c
@@ -396,7 +396,7 @@ int gen7_setup_clear_gpr_bb(struct intel_engine_cs * const engine,
 	emit_batch(vma, memset(batch, 0, bv.max_size), &bv);
 
 	i915_gem_object_flush_map(vma->obj);
-	i915_gem_object_unpin_map(vma->obj);
+	__i915_gem_object_release_map(vma->obj);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 02a38810bcd3..0a19d551e02c 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -3937,7 +3937,7 @@ static int intel_init_workaround_bb(struct intel_engine_cs *engine)
 	GEM_BUG_ON(batch_ptr - batch > CTX_WA_BB_OBJ_SIZE);
 
 	__i915_gem_object_flush_map(wa_ctx->vma->obj, 0, batch_ptr - batch);
-	i915_gem_object_unpin_map(wa_ctx->vma->obj);
+	__i915_gem_object_release_map(wa_ctx->vma->obj);
 	if (ret)
 		lrc_destroy_wa_ctx(engine);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 68a08486fc87..a6c9ab852d17 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -543,7 +543,7 @@ alloc_context_vma(struct intel_engine_cs *engine)
 			   vaddr, engine->context_size);
 
 		i915_gem_object_flush_map(obj);
-		i915_gem_object_unpin_map(obj);
+		__i915_gem_object_release_map(obj);
 	}
 
 	vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL);
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 25329b7600c9..05dc36d3f81c 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1772,7 +1772,7 @@ static int alloc_noa_wait(struct i915_perf_stream *stream)
 	GEM_BUG_ON(cs - batch > PAGE_SIZE / sizeof(*batch));
 
 	i915_gem_object_flush_map(bo);
-	i915_gem_object_unpin_map(bo);
+	__i915_gem_object_release_map(bo);
 
 	stream->noa_wait = vma;
 	return 0;
@@ -1867,7 +1867,7 @@ alloc_oa_config_buffer(struct i915_perf_stream *stream,
 	*cs++ = 0;
 
 	i915_gem_object_flush_map(obj);
-	i915_gem_object_unpin_map(obj);
+	__i915_gem_object_release_map(obj);
 
 	oa_bo->vma = i915_vma_instance(obj,
 				       &stream->engine->gt->ggtt->vm,
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2)
  2020-07-08 13:47 [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Chris Wilson
                   ` (5 preceding siblings ...)
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 7/7] drm/i915/gt: Optimise aliasing-ppgtt allocations Chris Wilson
@ 2020-07-08 15:19 ` Patchwork
  2020-07-08 15:20 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2020-07-08 15:19 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2)
URL   : https://patchwork.freedesktop.org/series/79250/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
5da8719dfd9d drm/i915/gt: Replace opencoded i915_gem_object_pin_map()
89aee552401f drm/i915: Release shortlived maps of longlived objects
4e132b12fa6a drm/i915: Remove i915_gem_object_get_dirty_page()
df4c348231d6 drm/i915/gem: Unpin idle contexts from kswapd reclaim
-:25: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#25: 
References: 9e9539800dd4 ("drm/i915: Remove waiting & retiring from shrinker paths")

-:25: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 9e9539800dd4 ("drm/i915: Remove waiting & retiring from shrinker paths")'
#25: 
References: 9e9539800dd4 ("drm/i915: Remove waiting & retiring from shrinker paths")

total: 1 errors, 1 warnings, 0 checks, 75 lines checked
b3d09276dbba drm/i915: Preallocate stashes for vma page-directories
8279ae824a1d drm/i915: Switch to object allocations for page directories
4440561be40d drm/i915/gt: Optimise aliasing-ppgtt allocations

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2)
  2020-07-08 13:47 [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Chris Wilson
                   ` (6 preceding siblings ...)
  2020-07-08 15:19 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2) Patchwork
@ 2020-07-08 15:20 ` Patchwork
  2020-07-08 15:47 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2020-07-08 15:20 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2)
URL   : https://patchwork.freedesktop.org/series/79250/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/display/intel_display.c:1223:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/display/intel_display.c:1226:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/display/intel_display.c:1229:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/display/intel_display.c:1232:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2270:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2271:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2272:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2273:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2274:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2275:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gt/intel_lrc.c:2785:17: error: too long token expansion
+drivers/gpu/drm/i915/gt/intel_lrc.c:2785:17: error: too long token expansion
+drivers/gpu/drm/i915/gt/intel_reset.c:1310:5: warning: context imbalance in 'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/sysfs_engines.c:61:10: error: bad integer constant expression
+drivers/gpu/drm/i915/gt/sysfs_engines.c:62:10: error: bad integer constant expression
+drivers/gpu/drm/i915/gt/sysfs_engines.c:66:10: error: bad integer constant expression
+drivers/gpu/drm/i915/gvt/mmio.c:287:23: warning: memcpy with byte count of 279040
+drivers/gpu/drm/i915/i915_perf.c:1425:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/i915_perf.c:1479:15: warning: memset with byte count of 16777216
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen8_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen8_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen8_write8' - different lock contexts for basic block

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2)
  2020-07-08 13:47 [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Chris Wilson
                   ` (7 preceding siblings ...)
  2020-07-08 15:20 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2020-07-08 15:47 ` Patchwork
  2020-07-08 16:19 ` [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Matthew Auld
  2020-07-08 19:40 ` [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2) Patchwork
  10 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2020-07-08 15:47 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2)
URL   : https://patchwork.freedesktop.org/series/79250/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8715 -> Patchwork_18109
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/index.html

Known issues
------------

  Here are the changes found in Patchwork_18109 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_pm_rpm@basic-pci-d3-state:
    - fi-byt-j1900:       [PASS][1] -> [DMESG-WARN][2] ([i915#1982])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/fi-byt-j1900/igt@i915_pm_rpm@basic-pci-d3-state.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/fi-byt-j1900/igt@i915_pm_rpm@basic-pci-d3-state.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
    - fi-icl-u2:          [PASS][3] -> [DMESG-WARN][4] ([i915#1982]) +1 similar issue
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/fi-icl-u2/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/fi-icl-u2/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-c:
    - fi-tgl-y:           [PASS][5] -> [DMESG-WARN][6] ([i915#1982])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/fi-tgl-y/igt@kms_pipe_crc_basic@read-crc-pipe-c.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/fi-tgl-y/igt@kms_pipe_crc_basic@read-crc-pipe-c.html

  * igt@prime_vgem@basic-write:
    - fi-tgl-y:           [PASS][7] -> [DMESG-WARN][8] ([i915#402]) +1 similar issue
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/fi-tgl-y/igt@prime_vgem@basic-write.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/fi-tgl-y/igt@prime_vgem@basic-write.html

  
#### Possible fixes ####

  * igt@gem_exec_store@basic:
    - fi-tgl-y:           [DMESG-WARN][9] ([i915#402]) -> [PASS][10] +2 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/fi-tgl-y/igt@gem_exec_store@basic.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/fi-tgl-y/igt@gem_exec_store@basic.html

  * igt@i915_module_load@reload:
    - fi-tgl-u2:          [DMESG-WARN][11] ([i915#1982]) -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/fi-tgl-u2/igt@i915_module_load@reload.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/fi-tgl-u2/igt@i915_module_load@reload.html

  * igt@i915_pm_backlight@basic-brightness:
    - fi-whl-u:           [DMESG-WARN][13] ([i915#95]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/fi-whl-u/igt@i915_pm_backlight@basic-brightness.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/fi-whl-u/igt@i915_pm_backlight@basic-brightness.html

  * igt@kms_cursor_legacy@basic-flip-after-cursor-legacy:
    - fi-icl-u2:          [DMESG-WARN][15] ([i915#1982]) -> [PASS][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/fi-icl-u2/igt@kms_cursor_legacy@basic-flip-after-cursor-legacy.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/fi-icl-u2/igt@kms_cursor_legacy@basic-flip-after-cursor-legacy.html

  * igt@kms_flip@basic-flip-vs-modeset@b-dsi1:
    - {fi-tgl-dsi}:       [DMESG-WARN][17] ([i915#1982]) -> [PASS][18]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/fi-tgl-dsi/igt@kms_flip@basic-flip-vs-modeset@b-dsi1.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/fi-tgl-dsi/igt@kms_flip@basic-flip-vs-modeset@b-dsi1.html

  
#### Warnings ####

  * igt@i915_pm_rpm@module-reload:
    - fi-kbl-x1275:       [DMESG-FAIL][19] ([i915#62] / [i915#95]) -> [SKIP][20] ([fdo#109271])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/fi-kbl-x1275/igt@i915_pm_rpm@module-reload.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/fi-kbl-x1275/igt@i915_pm_rpm@module-reload.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
    - fi-kbl-x1275:       [DMESG-WARN][21] ([i915#62] / [i915#92]) -> [DMESG-WARN][22] ([i915#62] / [i915#92] / [i915#95]) +4 similar issues
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/fi-kbl-x1275/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/fi-kbl-x1275/igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy.html

  * igt@kms_force_connector_basic@force-edid:
    - fi-kbl-x1275:       [DMESG-WARN][23] ([i915#62] / [i915#92] / [i915#95]) -> [DMESG-WARN][24] ([i915#62] / [i915#92]) +2 similar issues
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/fi-kbl-x1275/igt@kms_force_connector_basic@force-edid.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/fi-kbl-x1275/igt@kms_force_connector_basic@force-edid.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2089]: https://gitlab.freedesktop.org/drm/intel/issues/2089
  [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#750]: https://gitlab.freedesktop.org/drm/intel/issues/750
  [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (44 -> 37)
------------------------------

  Missing    (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_8715 -> Patchwork_18109

  CI-20190529: 20190529
  CI_DRM_8715: 76c2d43437601608d75f61a9eb6cf3a7aae5e02b @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5728: 6988eebf78e9ce9746b8c2b7d21cb4174d6623a9 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_18109: 4440561be40d3f2adc6856204ce4377e4527bac3 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

4440561be40d drm/i915/gt: Optimise aliasing-ppgtt allocations
8279ae824a1d drm/i915: Switch to object allocations for page directories
b3d09276dbba drm/i915: Preallocate stashes for vma page-directories
df4c348231d6 drm/i915/gem: Unpin idle contexts from kswapd reclaim
4e132b12fa6a drm/i915: Remove i915_gem_object_get_dirty_page()
89aee552401f drm/i915: Release shortlived maps of longlived objects
5da8719dfd9d drm/i915/gt: Replace opencoded i915_gem_object_pin_map()

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map()
  2020-07-08 13:47 [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Chris Wilson
                   ` (8 preceding siblings ...)
  2020-07-08 15:47 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
@ 2020-07-08 16:19 ` Matthew Auld
  2020-07-08 19:40 ` [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2) Patchwork
  10 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2020-07-08 16:19 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development, Matthew Auld

On Wed, 8 Jul 2020 at 14:48, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> As we have a pin_map interface, that knows how to flush the data to the
> device, use it. The only downside is that we keep the kmap around, as
> once acquired we keep the mapping cached until the object's backing
> store is released.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH v2] drm/i915: Release shortlived maps of longlived objects
  2020-07-08 14:35   ` [Intel-gfx] [PATCH v2] " Chris Wilson
@ 2020-07-08 16:39     ` Matthew Auld
  0 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2020-07-08 16:39 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development

On Wed, 8 Jul 2020 at 15:35, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Some objects we map once during their construction, and then never
> access their mappings again, even if they are kept around for the
> duration of the driver. Keeping those pages mapped, often vmapped, is
> therefore wasteful and we should release the maps as soon as we no
> longer need them.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 3/7] drm/i915: Remove i915_gem_object_get_dirty_page()
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 3/7] drm/i915: Remove i915_gem_object_get_dirty_page() Chris Wilson
@ 2020-07-08 16:40   ` Matthew Auld
  0 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2020-07-08 16:40 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development, Matthew Auld

On Wed, 8 Jul 2020 at 14:48, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Last user removed, remove the convenience function.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 7/7] drm/i915/gt: Optimise aliasing-ppgtt allocations
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 7/7] drm/i915/gt: Optimise aliasing-ppgtt allocations Chris Wilson
@ 2020-07-08 16:55   ` Matthew Auld
  0 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2020-07-08 16:55 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development, Matthew Auld

On Wed, 8 Jul 2020 at 14:47, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Since the aliasing-ppgtt remains the default for gen6/gen7, it is worth
> optimising the ppgtt allocation for it. In this case, we do not need to
> flush the GGTT page directories entries as they are fixed during setup.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 4/7] drm/i915/gem: Unpin idle contexts from kswapd reclaim
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 4/7] drm/i915/gem: Unpin idle contexts from kswapd reclaim Chris Wilson
@ 2020-07-08 17:13   ` Matthew Auld
  0 siblings, 0 replies; 22+ messages in thread
From: Matthew Auld @ 2020-07-08 17:13 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development, Matthew Auld

On Wed, 8 Jul 2020 at 14:48, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> We removed retiring requests from the shrinker in order to decouple the
> mutexes from reclaim in preparation for unravelling the struct_mutex.
> The impact of not retiring is that we are much less agressive in making
> global objects available for shrinking, as such objects remain pinned
> until they are flushed by a heartbeat pulse following the last retired
> request along their timeline. In order to ensure that pulse occurs in
> time for memory reclamation, we should kick it from kswapd.
>
> The catch is that we have added some flush_work() into the retirement
> phase (to ensure that we reach a global idle in a timely manner), but
> these flush_work() are not eligible (i.e do not belong to WQ_MEM_RELCAIM)
> for use from inside kswapd. To avoid flushing those workqueues, we teach
> the retirer not to do so unless we are actually waiting, and only do the
> plain retire from inside the shrinker.
>
> Note that for execlists, we already retire completed contexts as they
> are scheduled out, so it should not be keeping global state
> unnecessarily pinned. The legacy ringbuffer however...
>
> References: 9e9539800dd4 ("drm/i915: Remove waiting & retiring from shrinker paths")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 6/7] drm/i915: Switch to object allocations for page directories
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 6/7] drm/i915: Switch to object allocations for page directories Chris Wilson
@ 2020-07-08 18:32   ` Matthew Auld
  2020-07-08 18:47     ` Chris Wilson
  0 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2020-07-08 18:32 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development, Matthew Auld

On Wed, 8 Jul 2020 at 14:48, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> The GEM object is grossly overweight for the practicality of tracking
> large numbers of individual pages, yet it is currently our only
> abstraction for tracking DMA allocations. Since those allocations need
> to be reserved upfront before an operation, and that we need to break
> away from simple system memory, we need to ditch using plain struct page
> wrappers.
>
> In the process, we drop the WC mapping as we ended up clflushing
> everything anyway due to various issues across a wider range of
> platforms. Though in a future step, we need to drop the kmap_atomic
> approach which suggests we need to pre-map all the pages and keep them
> mapped.
>
> v2: Verify our large scratch page is suitably DMA aligned; and manually
> clear the scratch since we are allocating random struct pages.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Matthew Auld <matthew.auld@intel.com>
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   1 +
>  .../gpu/drm/i915/gem/selftests/huge_pages.c   |   2 +-
>  .../drm/i915/gem/selftests/i915_gem_context.c |   2 +-
>  drivers/gpu/drm/i915/gt/gen6_ppgtt.c          |  53 +--
>  drivers/gpu/drm/i915/gt/gen6_ppgtt.h          |   1 +
>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c          |  89 ++---
>  drivers/gpu/drm/i915/gt/intel_ggtt.c          |  37 ++-
>  drivers/gpu/drm/i915/gt/intel_gtt.c           | 303 ++++--------------
>  drivers/gpu/drm/i915/gt/intel_gtt.h           |  94 ++----
>  drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  42 ++-
>  .../gpu/drm/i915/gt/intel_ring_submission.c   |  16 +-
>  drivers/gpu/drm/i915/gvt/scheduler.c          |  17 +-
>  drivers/gpu/drm/i915/i915_drv.c               |   1 +
>  drivers/gpu/drm/i915/i915_drv.h               |   5 -
>  drivers/gpu/drm/i915/i915_vma.c               |  18 +-
>  drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  23 ++
>  drivers/gpu/drm/i915/selftests/mock_gtt.c     |   4 +
>  17 files changed, 287 insertions(+), 421 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index 5335f799b548..d0847d7896f9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -282,6 +282,7 @@ struct drm_i915_gem_object {
>                 } userptr;
>
>                 unsigned long scratch;
> +               u64 encode;
>
>                 void *gvt_info;
>         };
> diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> index 8291ede6902c..9fb06fcc8f8f 100644
> --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> @@ -393,7 +393,7 @@ static int igt_mock_exhaust_device_supported_pages(void *arg)
>          */
>
>         for (i = 1; i < BIT(ARRAY_SIZE(page_sizes)); i++) {
> -               unsigned int combination = 0;
> +               unsigned int combination = SZ_4K;

Just an unrelated change?
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 6/7] drm/i915: Switch to object allocations for page directories
  2020-07-08 18:32   ` Matthew Auld
@ 2020-07-08 18:47     ` Chris Wilson
  0 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2020-07-08 18:47 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, Matthew Auld

Quoting Matthew Auld (2020-07-08 19:32:26)
> On Wed, 8 Jul 2020 at 14:48, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> > index 8291ede6902c..9fb06fcc8f8f 100644
> > --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> > +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
> > @@ -393,7 +393,7 @@ static int igt_mock_exhaust_device_supported_pages(void *arg)
> >          */
> >
> >         for (i = 1; i < BIT(ARRAY_SIZE(page_sizes)); i++) {
> > -               unsigned int combination = 0;
> > +               unsigned int combination = SZ_4K;
> 
> Just an unrelated change?

As we try to allocate 4K PD objects to bind the huge page, it fails an
assert if we exclude SZ_4K from the list of supported page sizes from
the device.

I leave it to the reader to work out how we should tie everything
together for for variable pd sizes.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2)
  2020-07-08 13:47 [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Chris Wilson
                   ` (9 preceding siblings ...)
  2020-07-08 16:19 ` [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Matthew Auld
@ 2020-07-08 19:40 ` Patchwork
  10 siblings, 0 replies; 22+ messages in thread
From: Patchwork @ 2020-07-08 19:40 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2)
URL   : https://patchwork.freedesktop.org/series/79250/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8715_full -> Patchwork_18109_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_18109_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_18109_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_18109_full:

### IGT changes ###

#### Possible regressions ####

  * igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-shrfb-draw-mmap-gtt:
    - shard-tglb:         [PASS][1] -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-tglb5/igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-shrfb-draw-mmap-gtt.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-tglb5/igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-shrfb-draw-mmap-gtt.html

  
Known issues
------------

  Here are the changes found in Patchwork_18109_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_persistence@engines-mixed-process@vecs0:
    - shard-skl:          [PASS][3] -> [FAIL][4] ([i915#1528])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-skl10/igt@gem_ctx_persistence@engines-mixed-process@vecs0.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-skl1/igt@gem_ctx_persistence@engines-mixed-process@vecs0.html

  * igt@gem_exec_endless@dispatch@rcs0:
    - shard-tglb:         [PASS][5] -> [INCOMPLETE][6] ([i915#1958] / [i915#2119])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-tglb5/igt@gem_exec_endless@dispatch@rcs0.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-tglb5/igt@gem_exec_endless@dispatch@rcs0.html

  * igt@i915_selftest@mock@requests:
    - shard-skl:          [PASS][7] -> [INCOMPLETE][8] ([i915#198] / [i915#2110])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-skl9/igt@i915_selftest@mock@requests.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-skl8/igt@i915_selftest@mock@requests.html

  * igt@kms_addfb_basic@size-max:
    - shard-skl:          [PASS][9] -> [DMESG-WARN][10] ([i915#1982]) +6 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-skl6/igt@kms_addfb_basic@size-max.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-skl3/igt@kms_addfb_basic@size-max.html

  * igt@kms_big_fb@linear-16bpp-rotate-0:
    - shard-kbl:          [PASS][11] -> [DMESG-FAIL][12] ([i915#95])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-kbl3/igt@kms_big_fb@linear-16bpp-rotate-0.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-kbl1/igt@kms_big_fb@linear-16bpp-rotate-0.html
    - shard-apl:          [PASS][13] -> [DMESG-FAIL][14] ([i915#1635] / [i915#95])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-apl7/igt@kms_big_fb@linear-16bpp-rotate-0.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-apl3/igt@kms_big_fb@linear-16bpp-rotate-0.html

  * igt@kms_big_fb@linear-64bpp-rotate-0:
    - shard-glk:          [PASS][15] -> [DMESG-FAIL][16] ([i915#118] / [i915#95])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-glk1/igt@kms_big_fb@linear-64bpp-rotate-0.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-glk8/igt@kms_big_fb@linear-64bpp-rotate-0.html

  * igt@kms_cursor_crc@pipe-a-cursor-256x256-sliding:
    - shard-snb:          [PASS][17] -> [TIMEOUT][18] ([i915#1958] / [i915#2119])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-snb5/igt@kms_cursor_crc@pipe-a-cursor-256x256-sliding.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-snb5/igt@kms_cursor_crc@pipe-a-cursor-256x256-sliding.html

  * igt@kms_cursor_crc@pipe-a-cursor-suspend:
    - shard-kbl:          [PASS][19] -> [DMESG-WARN][20] ([i915#180]) +7 similar issues
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-kbl1/igt@kms_cursor_crc@pipe-a-cursor-suspend.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-kbl7/igt@kms_cursor_crc@pipe-a-cursor-suspend.html

  * igt@kms_cursor_crc@pipe-b-cursor-128x128-rapid-movement:
    - shard-kbl:          [PASS][21] -> [DMESG-WARN][22] ([i915#93] / [i915#95]) +1 similar issue
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-kbl1/igt@kms_cursor_crc@pipe-b-cursor-128x128-rapid-movement.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-kbl7/igt@kms_cursor_crc@pipe-b-cursor-128x128-rapid-movement.html

  * igt@kms_cursor_crc@pipe-c-cursor-256x85-onscreen:
    - shard-skl:          [PASS][23] -> [FAIL][24] ([i915#54])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-skl2/igt@kms_cursor_crc@pipe-c-cursor-256x85-onscreen.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-skl10/igt@kms_cursor_crc@pipe-c-cursor-256x85-onscreen.html

  * igt@kms_cursor_legacy@long-nonblocking-modeset-vs-cursor-atomic:
    - shard-apl:          [PASS][25] -> [DMESG-WARN][26] ([i915#1635] / [i915#95]) +14 similar issues
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-apl3/igt@kms_cursor_legacy@long-nonblocking-modeset-vs-cursor-atomic.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-apl4/igt@kms_cursor_legacy@long-nonblocking-modeset-vs-cursor-atomic.html

  * igt@kms_cursor_legacy@pipe-a-forked-move:
    - shard-glk:          [PASS][27] -> [DMESG-WARN][28] ([i915#118] / [i915#95]) +1 similar issue
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-glk1/igt@kms_cursor_legacy@pipe-a-forked-move.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-glk8/igt@kms_cursor_legacy@pipe-a-forked-move.html

  * igt@kms_flip@modeset-vs-vblank-race@b-hdmi-a1:
    - shard-glk:          [PASS][29] -> [FAIL][30] ([i915#407])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-glk2/igt@kms_flip@modeset-vs-vblank-race@b-hdmi-a1.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-glk3/igt@kms_flip@modeset-vs-vblank-race@b-hdmi-a1.html

  * igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-pwrite:
    - shard-tglb:         [PASS][31] -> [DMESG-WARN][32] ([i915#1982]) +1 similar issue
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-tglb1/igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-pwrite.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-tglb2/igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-shrfb-draw-pwrite.html

  * igt@kms_frontbuffer_tracking@fbc-shrfb-scaledprimary:
    - shard-iclb:         [PASS][33] -> [DMESG-WARN][34] ([i915#1982]) +1 similar issue
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-iclb1/igt@kms_frontbuffer_tracking@fbc-shrfb-scaledprimary.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-iclb3/igt@kms_frontbuffer_tracking@fbc-shrfb-scaledprimary.html

  * igt@kms_hdr@bpc-switch-suspend:
    - shard-skl:          [PASS][35] -> [FAIL][36] ([i915#1188])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-skl9/igt@kms_hdr@bpc-switch-suspend.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-skl9/igt@kms_hdr@bpc-switch-suspend.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes:
    - shard-iclb:         [PASS][37] -> [INCOMPLETE][38] ([i915#1185] / [i915#250])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-iclb5/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-iclb3/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html

  * igt@kms_psr@psr2_sprite_plane_move:
    - shard-iclb:         [PASS][39] -> [SKIP][40] ([fdo#109441]) +3 similar issues
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-iclb2/igt@kms_psr@psr2_sprite_plane_move.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-iclb5/igt@kms_psr@psr2_sprite_plane_move.html

  
#### Possible fixes ####

  * igt@drm_read@short-buffer-wakeup:
    - shard-skl:          [DMESG-WARN][41] ([i915#1982]) -> [PASS][42] +5 similar issues
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-skl5/igt@drm_read@short-buffer-wakeup.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-skl4/igt@drm_read@short-buffer-wakeup.html

  * igt@gem_exec_balancer@bonded-early:
    - shard-iclb:         [FAIL][43] ([i915#926]) -> [PASS][44]
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-iclb1/igt@gem_exec_balancer@bonded-early.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-iclb5/igt@gem_exec_balancer@bonded-early.html

  * igt@gem_exec_flush@basic-wb-pro-default:
    - shard-apl:          [DMESG-WARN][45] ([i915#1635] / [i915#95]) -> [PASS][46] +15 similar issues
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-apl1/igt@gem_exec_flush@basic-wb-pro-default.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-apl2/igt@gem_exec_flush@basic-wb-pro-default.html

  * igt@gem_exec_whisper@basic-normal:
    - shard-glk:          [DMESG-WARN][47] ([i915#118] / [i915#95]) -> [PASS][48] +1 similar issue
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-glk9/igt@gem_exec_whisper@basic-normal.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-glk9/igt@gem_exec_whisper@basic-normal.html

  * igt@i915_module_load@reload-with-fault-injection:
    - shard-tglb:         [DMESG-WARN][49] ([i915#402]) -> [PASS][50] +1 similar issue
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-tglb5/igt@i915_module_load@reload-with-fault-injection.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-tglb5/igt@i915_module_load@reload-with-fault-injection.html

  * igt@i915_suspend@forcewake:
    - shard-skl:          [INCOMPLETE][51] ([i915#636] / [i915#69]) -> [PASS][52]
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-skl3/igt@i915_suspend@forcewake.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-skl1/igt@i915_suspend@forcewake.html

  * igt@kms_flip@flip-vs-suspend-interruptible@b-edp1:
    - shard-skl:          [INCOMPLETE][53] ([i915#198]) -> [PASS][54]
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-skl1/igt@kms_flip@flip-vs-suspend-interruptible@b-edp1.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-skl2/igt@kms_flip@flip-vs-suspend-interruptible@b-edp1.html

  * igt@kms_flip@flip-vs-suspend-interruptible@c-dp1:
    - shard-kbl:          [DMESG-WARN][55] ([i915#180]) -> [PASS][56] +5 similar issues
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-kbl1/igt@kms_flip@flip-vs-suspend-interruptible@c-dp1.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-kbl3/igt@kms_flip@flip-vs-suspend-interruptible@c-dp1.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-blt:
    - shard-kbl:          [DMESG-WARN][57] ([i915#93] / [i915#95]) -> [PASS][58]
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-kbl4/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-blt.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-kbl3/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-blt.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-cur-indfb-draw-mmap-cpu:
    - shard-skl:          [FAIL][59] ([i915#49]) -> [PASS][60]
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-skl4/igt@kms_frontbuffer_tracking@psr-1p-primscrn-cur-indfb-draw-mmap-cpu.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-skl8/igt@kms_frontbuffer_tracking@psr-1p-primscrn-cur-indfb-draw-mmap-cpu.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-shrfb-draw-pwrite:
    - shard-tglb:         [DMESG-WARN][61] ([i915#1982]) -> [PASS][62] +2 similar issues
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-tglb2/igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-shrfb-draw-pwrite.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-tglb8/igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-shrfb-draw-pwrite.html

  * igt@kms_plane_alpha_blend@pipe-a-coverage-7efc:
    - shard-skl:          [FAIL][63] ([fdo#108145] / [i915#265]) -> [PASS][64]
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-skl4/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-skl8/igt@kms_plane_alpha_blend@pipe-a-coverage-7efc.html

  * igt@kms_plane_cursor@pipe-b-primary-size-128:
    - shard-hsw:          [INCOMPLETE][65] -> [PASS][66]
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-hsw7/igt@kms_plane_cursor@pipe-b-primary-size-128.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-hsw7/igt@kms_plane_cursor@pipe-b-primary-size-128.html

  * igt@kms_psr2_su@page_flip:
    - shard-iclb:         [SKIP][67] ([fdo#109642] / [fdo#111068]) -> [PASS][68]
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-iclb6/igt@kms_psr2_su@page_flip.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-iclb2/igt@kms_psr2_su@page_flip.html

  * igt@kms_psr@psr2_suspend:
    - shard-iclb:         [SKIP][69] ([fdo#109441]) -> [PASS][70] +1 similar issue
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-iclb6/igt@kms_psr@psr2_suspend.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-iclb2/igt@kms_psr@psr2_suspend.html

  * igt@kms_setmode@basic:
    - shard-kbl:          [FAIL][71] ([i915#31]) -> [PASS][72]
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-kbl2/igt@kms_setmode@basic.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-kbl2/igt@kms_setmode@basic.html

  * igt@perf_pmu@semaphore-busy@rcs0:
    - shard-kbl:          [FAIL][73] ([i915#1820]) -> [PASS][74]
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-kbl3/igt@perf_pmu@semaphore-busy@rcs0.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-kbl6/igt@perf_pmu@semaphore-busy@rcs0.html

  
#### Warnings ####

  * igt@gem_exec_reloc@basic-concurrent16:
    - shard-snb:          [FAIL][75] ([i915#1930]) -> [TIMEOUT][76] ([i915#1958] / [i915#2119])
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-snb5/igt@gem_exec_reloc@basic-concurrent16.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-snb5/igt@gem_exec_reloc@basic-concurrent16.html
    - shard-glk:          [INCOMPLETE][77] ([i915#1958] / [i915#58] / [k.org#198133]) -> [TIMEOUT][78] ([i915#1958] / [i915#2119])
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-glk4/igt@gem_exec_reloc@basic-concurrent16.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-glk4/igt@gem_exec_reloc@basic-concurrent16.html

  * igt@gem_vm_create@execbuf:
    - shard-snb:          [SKIP][79] ([fdo#109271]) -> [TIMEOUT][80] ([i915#1958] / [i915#2119]) +1 similar issue
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-snb5/igt@gem_vm_create@execbuf.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-snb5/igt@gem_vm_create@execbuf.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-cur-indfb-draw-blt:
    - shard-apl:          [SKIP][81] ([fdo#109271] / [i915#1635]) -> [SKIP][82] ([fdo#109271]) +1 similar issue
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-apl4/igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-cur-indfb-draw-blt.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-apl2/igt@kms_frontbuffer_tracking@fbcpsr-2p-scndscrn-cur-indfb-draw-blt.html

  * igt@kms_plane_cursor@pipe-d-viewport-size-128:
    - shard-apl:          [SKIP][83] ([fdo#109271]) -> [SKIP][84] ([fdo#109271] / [i915#1635]) +3 similar issues
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-apl3/igt@kms_plane_cursor@pipe-d-viewport-size-128.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-apl4/igt@kms_plane_cursor@pipe-d-viewport-size-128.html

  * igt@runner@aborted:
    - shard-apl:          ([FAIL][85], [FAIL][86], [FAIL][87]) ([i915#1610] / [i915#1635] / [i915#2110]) -> [FAIL][88] ([i915#1635] / [i915#2110])
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-apl1/igt@runner@aborted.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-apl3/igt@runner@aborted.html
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-apl2/igt@runner@aborted.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-apl1/igt@runner@aborted.html
    - shard-tglb:         [FAIL][89] ([i915#2110]) -> [FAIL][90] ([i915#1764] / [i915#2110])
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8715/shard-tglb7/igt@runner@aborted.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/shard-tglb3/igt@runner@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109642]: https://bugs.freedesktop.org/show_bug.cgi?id=109642
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [i915#118]: https://gitlab.freedesktop.org/drm/intel/issues/118
  [i915#1185]: https://gitlab.freedesktop.org/drm/intel/issues/1185
  [i915#1188]: https://gitlab.freedesktop.org/drm/intel/issues/1188
  [i915#1528]: https://gitlab.freedesktop.org/drm/intel/issues/1528
  [i915#1610]: https://gitlab.freedesktop.org/drm/intel/issues/1610
  [i915#1635]: https://gitlab.freedesktop.org/drm/intel/issues/1635
  [i915#1764]: https://gitlab.freedesktop.org/drm/intel/issues/1764
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#1820]: https://gitlab.freedesktop.org/drm/intel/issues/1820
  [i915#1930]: https://gitlab.freedesktop.org/drm/intel/issues/1930
  [i915#1958]: https://gitlab.freedesktop.org/drm/intel/issues/1958
  [i915#198]: https://gitlab.freedesktop.org/drm/intel/issues/198
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2110]: https://gitlab.freedesktop.org/drm/intel/issues/2110
  [i915#2119]: https://gitlab.freedesktop.org/drm/intel/issues/2119
  [i915#250]: https://gitlab.freedesktop.org/drm/intel/issues/250
  [i915#265]: https://gitlab.freedesktop.org/drm/intel/issues/265
  [i915#31]: https://gitlab.freedesktop.org/drm/intel/issues/31
  [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402
  [i915#407]: https://gitlab.freedesktop.org/drm/intel/issues/407
  [i915#49]: https://gitlab.freedesktop.org/drm/intel/issues/49
  [i915#54]: https://gitlab.freedesktop.org/drm/intel/issues/54
  [i915#58]: https://gitlab.freedesktop.org/drm/intel/issues/58
  [i915#636]: https://gitlab.freedesktop.org/drm/intel/issues/636
  [i915#69]: https://gitlab.freedesktop.org/drm/intel/issues/69
  [i915#926]: https://gitlab.freedesktop.org/drm/intel/issues/926
  [i915#93]: https://gitlab.freedesktop.org/drm/intel/issues/93
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95
  [k.org#198133]: https://bugzilla.kernel.org/show_bug.cgi?id=198133


Participating hosts (10 -> 10)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * Linux: CI_DRM_8715 -> Patchwork_18109

  CI-20190529: 20190529
  CI_DRM_8715: 76c2d43437601608d75f61a9eb6cf3a7aae5e02b @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5728: 6988eebf78e9ce9746b8c2b7d21cb4174d6623a9 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_18109: 4440561be40d3f2adc6856204ce4377e4527bac3 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18109/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 5/7] drm/i915: Preallocate stashes for vma page-directories
  2020-07-08 13:47 ` [Intel-gfx] [PATCH 5/7] drm/i915: Preallocate stashes for vma page-directories Chris Wilson
@ 2020-07-10 17:48   ` Matthew Auld
  2020-07-10 17:56     ` Chris Wilson
  0 siblings, 1 reply; 22+ messages in thread
From: Matthew Auld @ 2020-07-10 17:48 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development, Matthew Auld

On Wed, 8 Jul 2020 at 14:48, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> We need to make the DMA allocations used for page directories to be
> performed up front so that we can include those allocations in our
> memory reservation pass. The downside is that we have to assume the
> worst case, even before we know the final layout, and always allocate
> enough page directories for this object, even when there will be overlap.
> This unfortunately can be quite expensive, especially as we have to
> clear/reset the page directories and DMA pages, but it should only be
> required during early phases of a workload when new objects are being
> discovered, or after memory/eviction pressure when we need to rebind.
> Once we reach steady state, the objects should not be moved and we no
> longer need to preallocating the pages tables.
>
> It should be noted that the lifetime for the page directories DMA is
> more or less decoupled from individual fences as they will be shared
> across objects across timelines.
>
> v2: Only allocate enough PD space for the PTE we may use, we do not need
> to allocate PD that will be left as scratch.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Matthew Auld <matthew.auld@intel.com>

<snip>

>
> +static unsigned long pd_count(u64 size, int shift)
> +{
> +       /* Beware later misalignment */
> +       return (size + 2 * (BIT_ULL(shift) - 1)) >> shift;
> +}
> +
> +int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
> +                          struct i915_vm_pt_stash *stash,
> +                          u64 size)
> +{
> +       unsigned long count;
> +       int shift = 21;
> +       int n;

if (gen >= 8)
    shift = 21;
else
    shift = 22;

?

Since pt=4M, pd=2G with the weird legacy ppgtt stuff?

> +
> +       count = pd_count(size, shift);
> +       while (count--) {
> +               struct i915_page_table *pt;
> +
> +               pt = alloc_pt(vm);
> +               if (IS_ERR(pt)) {
> +                       i915_vm_free_pt_stash(vm, stash);
> +                       return PTR_ERR(pt);
> +               }
> +
> +               pt->stash = stash->pt[0];
> +               stash->pt[0] = pt;
> +       }
> +
> +       for (n = 1; n < vm->top; n++) {
> +               shift += 9;
> +               count = pd_count(size, shift);
> +               while (count--) {
> +                       struct i915_page_directory *pd;
> +
> +                       pd = alloc_pd(vm);
> +                       if (IS_ERR(pd)) {
> +                               i915_vm_free_pt_stash(vm, stash);
> +                               return PTR_ERR(pd);
> +                       }
> +
> +                       pd->pt.stash = stash->pt[1];
> +                       stash->pt[1] = &pd->pt;
> +               }
> +       }
> +
> +       return 0;
> +}
> +
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Intel-gfx] [PATCH 5/7] drm/i915: Preallocate stashes for vma page-directories
  2020-07-10 17:48   ` Matthew Auld
@ 2020-07-10 17:56     ` Chris Wilson
  0 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2020-07-10 17:56 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development, Matthew Auld

Quoting Matthew Auld (2020-07-10 18:48:31)
> On Wed, 8 Jul 2020 at 14:48, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > We need to make the DMA allocations used for page directories to be
> > performed up front so that we can include those allocations in our
> > memory reservation pass. The downside is that we have to assume the
> > worst case, even before we know the final layout, and always allocate
> > enough page directories for this object, even when there will be overlap.
> > This unfortunately can be quite expensive, especially as we have to
> > clear/reset the page directories and DMA pages, but it should only be
> > required during early phases of a workload when new objects are being
> > discovered, or after memory/eviction pressure when we need to rebind.
> > Once we reach steady state, the objects should not be moved and we no
> > longer need to preallocating the pages tables.
> >
> > It should be noted that the lifetime for the page directories DMA is
> > more or less decoupled from individual fences as they will be shared
> > across objects across timelines.
> >
> > v2: Only allocate enough PD space for the PTE we may use, we do not need
> > to allocate PD that will be left as scratch.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Matthew Auld <matthew.auld@intel.com>
> 
> <snip>
> 
> >
> > +static unsigned long pd_count(u64 size, int shift)
> > +{
> > +       /* Beware later misalignment */
> > +       return (size + 2 * (BIT_ULL(shift) - 1)) >> shift;
> > +}
> > +
> > +int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
> > +                          struct i915_vm_pt_stash *stash,
> > +                          u64 size)
> > +{
> > +       unsigned long count;
> > +       int shift = 21;
> > +       int n;
> 
> if (gen >= 8)
>     shift = 21;
> else
>     shift = 22;
> 
> ?
> 
> Since pt=4M, pd=2G with the weird legacy ppgtt stuff?

I thought about it, but it just means we overallocate, and promptly free
again. That's not a huge issue for the single setup we do with the
aliasing-ppgtt, so I was waiting to see if there would be anything else to
spoil the simplicity of the levels. 128b address spaces anytime soon?

When the dust settled I was thinking more along the lines of
vm->pte_count? pte_width, maxptes, pte_shift?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2020-07-10 17:56 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-08 13:47 [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Chris Wilson
2020-07-08 13:47 ` [Intel-gfx] [PATCH 2/7] drm/i915: Release shortlived maps of longlived objects Chris Wilson
2020-07-08 13:50   ` Chris Wilson
2020-07-08 14:35   ` [Intel-gfx] [PATCH v2] " Chris Wilson
2020-07-08 16:39     ` Matthew Auld
2020-07-08 13:47 ` [Intel-gfx] [PATCH 3/7] drm/i915: Remove i915_gem_object_get_dirty_page() Chris Wilson
2020-07-08 16:40   ` Matthew Auld
2020-07-08 13:47 ` [Intel-gfx] [PATCH 4/7] drm/i915/gem: Unpin idle contexts from kswapd reclaim Chris Wilson
2020-07-08 17:13   ` Matthew Auld
2020-07-08 13:47 ` [Intel-gfx] [PATCH 5/7] drm/i915: Preallocate stashes for vma page-directories Chris Wilson
2020-07-10 17:48   ` Matthew Auld
2020-07-10 17:56     ` Chris Wilson
2020-07-08 13:47 ` [Intel-gfx] [PATCH 6/7] drm/i915: Switch to object allocations for page directories Chris Wilson
2020-07-08 18:32   ` Matthew Auld
2020-07-08 18:47     ` Chris Wilson
2020-07-08 13:47 ` [Intel-gfx] [PATCH 7/7] drm/i915/gt: Optimise aliasing-ppgtt allocations Chris Wilson
2020-07-08 16:55   ` Matthew Auld
2020-07-08 15:19 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2) Patchwork
2020-07-08 15:20 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-07-08 15:47 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2020-07-08 16:19 ` [Intel-gfx] [PATCH 1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() Matthew Auld
2020-07-08 19:40 ` [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [1/7] drm/i915/gt: Replace opencoded i915_gem_object_pin_map() (rev2) Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.