* [PATCH 00/20] ppgtt cleanups / scratch merge
From: Mika Kuoppala @ 2015-05-21 14:37 UTC
  To: intel-gfx; +Cc: miku

Hi,

My take on cleaning up more of i915_gem_gtt.c.
The goal is to have generic tools to allocate and map
any type of paging structure.

At the end of the series we have one instance of each level
of the scratch chain (scratch pd -> pt -> page) shared across
all ppgtts, saving two pages of memory per ppgtt.
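
A rough sketch of the end state (illustrative only; the field
placement and names are assumptions based on the patch titles
below):

  /* One scratch chain per address space instead of one per
   * ppgtt: every unallocated PDE points at scratch_pt, and
   * every PTE of scratch_pt points at the scratch page. */
  struct i915_address_space {
          /* ... */
          struct i915_page_dma scratch_page;
          struct i915_page_table *scratch_pt;
          struct i915_page_directory *scratch_pd;
  };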

Mika Kuoppala (20):
  drm/i915/gtt: Mark TLBS dirty for gen8+
  drm/i915: Force PD restore on dirty ppGTTs
  drm/i915/gtt: Check va range against vm size
  drm/i915/gtt: Allow >= 4GB sizes for vm.
  drm/i915/gtt: Don't leak scratch page on mapping error
  drm/i915/gtt: Remove _single from page table allocator
  drm/i915/gtt: Introduce i915_page_dir_dma_addr
  drm/i915/gtt: Introduce struct i915_page_dma
  drm/i915/gtt: Rename unmap_and_free_px to free_px
  drm/i915/gtt: Remove superfluous free_pd with gen6/7
  drm/i915/gtt: Introduce fill_page_dma()
  drm/i915/gtt: Introduce kmap|kunmap for dma page
  drm/i915/gtt: Introduce copy_page_dma and copy_px
  drm/i915/gtt: Use macros to access dma mapped pages
  drm/i915/gtt: Make scratch page i915_page_dma compatible
  drm/i915/gtt: Fill scratch page
  drm/i915/gtt: Pin vma during virtual address allocation
  drm/i915/gtt: Cleanup page directory encoding
  drm/i915/gtt: Move scratch_pd and scratch_pt into vm area
  drm/i915/gtt: One instance of scratch page table/directory

 drivers/char/agp/intel-gtt.c        |   4 +-
 drivers/gpu/drm/i915/i915_debugfs.c |  44 +--
 drivers/gpu/drm/i915/i915_gem.c     |   6 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 662 ++++++++++++++++++++----------------
 drivers/gpu/drm/i915/i915_gem_gtt.h |  55 +--
 drivers/gpu/drm/i915/intel_lrc.c    |  69 ++--
 include/drm/intel-gtt.h             |   4 +-
 7 files changed, 470 insertions(+), 374 deletions(-)

-- 
1.9.1


* [PATCH 01/20] drm/i915/gtt: Mark TLBS dirty for gen8+
From: Mika Kuoppala @ 2015-05-21 14:37 UTC
  To: intel-gfx; +Cc: miku

When we touch the gen8+ page maps, mark them dirty as we
do on previous gens.
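
A simplified sketch of how the dirty mask gets consumed (this
mirrors the execlist code added in the next patch of this
series):

  /* Before submitting on a ring: if this ppgtt's page
   * directories changed since the last submission on that
   * ring, ask the GPU to reload them and clear the bit. */
  if (intel_ring_flag(ring) & ppgtt->pd_dirty_rings) {
          desc |= GEN8_CTX_FORCE_PD_RESTORE;
          ppgtt->pd_dirty_rings &= ~intel_ring_flag(ring);
  }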

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 17b7df0..0ffd459 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -830,6 +830,15 @@ err_out:
 	return -ENOMEM;
 }
 
+/* PDE TLBs are a pain to invalidate pre GEN8. It requires a context reload. If we
+ * are switching between contexts with the same LRCA, we also must do a force
+ * restore.
+ */
+static void mark_tlbs_dirty(struct i915_hw_ppgtt *ppgtt)
+{
+	ppgtt->pd_dirty_rings = INTEL_INFO(ppgtt->base.dev)->ring_mask;
+}
+
 static int gen8_alloc_va_range(struct i915_address_space *vm,
 			       uint64_t start,
 			       uint64_t length)
@@ -915,6 +924,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	}
 
 	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
+	mark_tlbs_dirty(ppgtt);
 	return 0;
 
 err_out:
@@ -927,6 +937,7 @@ err_out:
 		unmap_and_free_pd(ppgtt->pdp.page_directory[pdpe], vm->dev);
 
 	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
+	mark_tlbs_dirty(ppgtt);
 	return ret;
 }
 
@@ -1260,16 +1271,6 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 		kunmap_atomic(pt_vaddr);
 }
 
-/* PDE TLBs are a pain invalidate pre GEN8. It requires a context reload. If we
- * are switching between contexts with the same LRCA, we also must do a force
- * restore.
- */
-static void mark_tlbs_dirty(struct i915_hw_ppgtt *ppgtt)
-{
-	/* If current vm != vm, */
-	ppgtt->pd_dirty_rings = INTEL_INFO(ppgtt->base.dev)->ring_mask;
-}
-
 static void gen6_initialize_pt(struct i915_address_space *vm,
 		struct i915_page_table *pt)
 {
-- 
1.9.1


* [PATCH 02/20] drm/i915: Force PD restore on dirty ppGTTs
From: Mika Kuoppala @ 2015-05-21 14:37 UTC
  To: intel-gfx; +Cc: miku

Force a page directory reload when the ppgtt va->pa
mapping has changed. Extend the dirty rings mechanism
to gen > 7 and use it to force a pd restore in execlist
mode when the vm has changed. This matters for lite
restores: if the submitted context is already active on
the ring, the GPU will not reload the page directory
registers unless we explicitly ask for it.

Parts of the execlist context update cleanup are based on
work by Chris Wilson.

v2: Add comment about lite restore (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 65 ++++++++++++++++++++--------------------
 1 file changed, 33 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0413b8f..5ee2a8c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -264,9 +264,10 @@ u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
 }
 
 static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring,
-					 struct drm_i915_gem_object *ctx_obj)
+					 struct intel_context *ctx)
 {
 	struct drm_device *dev = ring->dev;
+	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
 	uint64_t desc;
 	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
 
@@ -284,6 +285,14 @@ static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring,
 	 * signalling between Command Streamers */
 	/* desc |= GEN8_CTX_FORCE_RESTORE; */
 
+	/* When performing a LiteRestore but with updated PD we need
+	 * to force the GPU to reload the PD
+	 */
+	if (intel_ring_flag(ring) & ctx->ppgtt->pd_dirty_rings) {
+		desc |= GEN8_CTX_FORCE_PD_RESTORE;
+		ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(ring);
+	}
+
 	/* WaEnableForceRestoreInCtxtDescForVCS:skl */
 	if (IS_GEN9(dev) &&
 	    INTEL_REVID(dev) <= SKL_REVID_B0 &&
@@ -295,8 +304,8 @@ static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring,
 }
 
 static void execlists_elsp_write(struct intel_engine_cs *ring,
-				 struct drm_i915_gem_object *ctx_obj0,
-				 struct drm_i915_gem_object *ctx_obj1)
+				 struct intel_context *to0,
+				 struct intel_context *to1)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -304,14 +313,15 @@ static void execlists_elsp_write(struct intel_engine_cs *ring,
 	uint32_t desc[4];
 
 	/* XXX: You must always write both descriptors in the order below. */
-	if (ctx_obj1)
-		temp = execlists_ctx_descriptor(ring, ctx_obj1);
+	if (to1)
+		temp = execlists_ctx_descriptor(ring, to1);
 	else
 		temp = 0;
+
 	desc[1] = (u32)(temp >> 32);
 	desc[0] = (u32)temp;
 
-	temp = execlists_ctx_descriptor(ring, ctx_obj0);
+	temp = execlists_ctx_descriptor(ring, to0);
 	desc[3] = (u32)(temp >> 32);
 	desc[2] = (u32)temp;
 
@@ -330,14 +340,20 @@ static void execlists_elsp_write(struct intel_engine_cs *ring,
 	spin_unlock(&dev_priv->uncore.lock);
 }
 
-static int execlists_update_context(struct drm_i915_gem_object *ctx_obj,
-				    struct drm_i915_gem_object *ring_obj,
-				    struct i915_hw_ppgtt *ppgtt,
-				    u32 tail)
+static void execlists_update_context(struct intel_engine_cs *ring,
+				     struct intel_context *ctx,
+				     u32 tail)
 {
+	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
+	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
+	struct drm_i915_gem_object *ring_obj = ringbuf->obj;
+	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
 	struct page *page;
 	uint32_t *reg_state;
 
+	WARN_ON(!i915_gem_obj_is_pinned(ctx_obj));
+	WARN_ON(!i915_gem_obj_is_pinned(ring_obj));
+
 	page = i915_gem_object_get_page(ctx_obj, 1);
 	reg_state = kmap_atomic(page);
 
@@ -347,7 +363,7 @@ static int execlists_update_context(struct drm_i915_gem_object *ctx_obj,
 	/* True PPGTT with dynamic page allocation: update PDP registers and
 	 * point the unallocated PDPs to the scratch page
 	 */
-	if (ppgtt) {
+	if (ppgtt && intel_ring_flag(ring) & ctx->ppgtt->pd_dirty_rings) {
 		ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
 		ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
 		ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
@@ -355,36 +371,21 @@ static int execlists_update_context(struct drm_i915_gem_object *ctx_obj,
 	}
 
 	kunmap_atomic(reg_state);
-
-	return 0;
 }
 
 static void execlists_submit_contexts(struct intel_engine_cs *ring,
 				      struct intel_context *to0, u32 tail0,
 				      struct intel_context *to1, u32 tail1)
 {
-	struct drm_i915_gem_object *ctx_obj0 = to0->engine[ring->id].state;
-	struct intel_ringbuffer *ringbuf0 = to0->engine[ring->id].ringbuf;
-	struct drm_i915_gem_object *ctx_obj1 = NULL;
-	struct intel_ringbuffer *ringbuf1 = NULL;
-
-	BUG_ON(!ctx_obj0);
-	WARN_ON(!i915_gem_obj_is_pinned(ctx_obj0));
-	WARN_ON(!i915_gem_obj_is_pinned(ringbuf0->obj));
-
-	execlists_update_context(ctx_obj0, ringbuf0->obj, to0->ppgtt, tail0);
+	if (WARN_ON(to0 == NULL))
+		return;
 
-	if (to1) {
-		ringbuf1 = to1->engine[ring->id].ringbuf;
-		ctx_obj1 = to1->engine[ring->id].state;
-		BUG_ON(!ctx_obj1);
-		WARN_ON(!i915_gem_obj_is_pinned(ctx_obj1));
-		WARN_ON(!i915_gem_obj_is_pinned(ringbuf1->obj));
+	execlists_update_context(ring, to0, tail0);
 
-		execlists_update_context(ctx_obj1, ringbuf1->obj, to1->ppgtt, tail1);
-	}
+	if (to1)
+		execlists_update_context(ring, to1, tail1);
 
-	execlists_elsp_write(ring, ctx_obj0, ctx_obj1);
+	execlists_elsp_write(ring, to0, to1);
 }
 
 static void execlists_context_unqueue(struct intel_engine_cs *ring)
-- 
1.9.1


* [PATCH 03/20] drm/i915/gtt: Check va range against vm size
From: Mika Kuoppala @ 2015-05-21 14:37 UTC
  To: intel-gfx; +Cc: miku

Check the allocation area against the known end of the
address space instead of against a fixed value.

v2: Return ENODEV on internal bugs (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0ffd459..6f79680 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -756,9 +756,6 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
 
 	WARN_ON(!bitmap_empty(new_pds, GEN8_LEGACY_PDPES));
 
-	/* FIXME: upper bound must not overflow 32 bits  */
-	WARN_ON((start + length) > (1ULL << 32));
-
 	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
 		if (pd)
 			continue;
@@ -857,7 +854,10 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	 * actually use the other side of the canonical address space.
 	 */
 	if (WARN_ON(start + length < start))
-		return -ERANGE;
+		return -ENODEV;
+
+	if (WARN_ON(start + length > ppgtt->base.total))
+		return -ENODEV;
 
 	ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables);
 	if (ret)
@@ -1291,7 +1291,7 @@ static void gen6_initialize_pt(struct i915_address_space *vm,
 }
 
 static int gen6_alloc_va_range(struct i915_address_space *vm,
-			       uint64_t start, uint64_t length)
+			       uint64_t start_in, uint64_t length_in)
 {
 	DECLARE_BITMAP(new_page_tables, I915_PDES);
 	struct drm_device *dev = vm->dev;
@@ -1299,11 +1299,15 @@ static int gen6_alloc_va_range(struct i915_address_space *vm,
 	struct i915_hw_ppgtt *ppgtt =
 				container_of(vm, struct i915_hw_ppgtt, base);
 	struct i915_page_table *pt;
-	const uint32_t start_save = start, length_save = length;
+	uint32_t start, length, start_save, length_save;
 	uint32_t pde, temp;
 	int ret;
 
-	WARN_ON(upper_32_bits(start));
+	if (WARN_ON(start_in + length_in > ppgtt->base.total))
+		return -ENODEV;
+
+	start = start_save = start_in;
+	length = length_save = length_in;
 
 	bitmap_zero(new_page_tables, I915_PDES);
 
-- 
1.9.1


* [PATCH 04/20] drm/i915/gtt: Allow >= 4GB sizes for vm.
From: Mika Kuoppala @ 2015-05-21 14:37 UTC
  To: intel-gfx; +Cc: miku

With a 32-bit system we can have a ppgtt of exactly 4GB;
size_t is inadequate to represent this.
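
To illustrate the problem (a standalone example, not driver
code): size_t is 32 bits wide on a 32-bit build, so a 4GB total
silently wraps to zero:

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
          uint64_t total = 1ULL << 32;    /* exactly 4GB */
          size_t truncated = (size_t)total;

          /* On a 32-bit build this prints total=4294967296
           * truncated=0; a u64 keeps the full value. */
          printf("total=%llu truncated=%zu\n",
                 (unsigned long long)total, truncated);
          return 0;
  }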

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/char/agp/intel-gtt.c        |  4 ++--
 drivers/gpu/drm/i915/i915_debugfs.c | 42 ++++++++++++++++++-------------------
 drivers/gpu/drm/i915/i915_gem.c     |  6 +++---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 22 +++++++++----------
 drivers/gpu/drm/i915/i915_gem_gtt.h | 12 +++++------
 include/drm/intel-gtt.h             |  4 ++--
 6 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index 0b4188b..4734d02 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -1408,8 +1408,8 @@ int intel_gmch_probe(struct pci_dev *bridge_pdev, struct pci_dev *gpu_pdev,
 }
 EXPORT_SYMBOL(intel_gmch_probe);
 
-void intel_gtt_get(size_t *gtt_total, size_t *stolen_size,
-		   phys_addr_t *mappable_base, unsigned long *mappable_end)
+void intel_gtt_get(u64 *gtt_total, size_t *stolen_size,
+		   phys_addr_t *mappable_base, u64 *mappable_end)
 {
 	*gtt_total = intel_private.gtt_total_entries << PAGE_SHIFT;
 	*stolen_size = intel_private.stolen_size;
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a32b669..abc9fc5 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -192,7 +192,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct i915_vma *vma;
-	size_t total_obj_size, total_gtt_size;
+	u64 total_obj_size, total_gtt_size;
 	int count, ret;
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -225,7 +225,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	}
 	mutex_unlock(&dev->struct_mutex);
 
-	seq_printf(m, "Total %d objects, %zu bytes, %zu GTT size\n",
+	seq_printf(m, "Total %d objects, %llu bytes, %llu GTT size\n",
 		   count, total_obj_size, total_gtt_size);
 	return 0;
 }
@@ -247,7 +247,7 @@ static int i915_gem_stolen_list_info(struct seq_file *m, void *data)
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
-	size_t total_obj_size, total_gtt_size;
+	u64 total_obj_size, total_gtt_size;
 	LIST_HEAD(stolen);
 	int count, ret;
 
@@ -286,7 +286,7 @@ static int i915_gem_stolen_list_info(struct seq_file *m, void *data)
 	}
 	mutex_unlock(&dev->struct_mutex);
 
-	seq_printf(m, "Total %d objects, %zu bytes, %zu GTT size\n",
+	seq_printf(m, "Total %d objects, %llu bytes, %llu GTT size\n",
 		   count, total_obj_size, total_gtt_size);
 	return 0;
 }
@@ -304,10 +304,10 @@ static int i915_gem_stolen_list_info(struct seq_file *m, void *data)
 
 struct file_stats {
 	struct drm_i915_file_private *file_priv;
-	int count;
-	size_t total, unbound;
-	size_t global, shared;
-	size_t active, inactive;
+	unsigned long count;
+	u64 total, unbound;
+	u64 global, shared;
+	u64 active, inactive;
 };
 
 static int per_file_stats(int id, void *ptr, void *data)
@@ -364,7 +364,7 @@ static int per_file_stats(int id, void *ptr, void *data)
 
 #define print_file_stats(m, name, stats) do { \
 	if (stats.count) \
-		seq_printf(m, "%s: %u objects, %zu bytes (%zu active, %zu inactive, %zu global, %zu shared, %zu unbound)\n", \
+		seq_printf(m, "%s: %lu objects, %llu bytes (%llu active, %llu inactive, %llu global, %llu shared, %llu unbound)\n", \
 			   name, \
 			   stats.count, \
 			   stats.total, \
@@ -414,7 +414,7 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 count, mappable_count, purgeable_count;
-	size_t size, mappable_size, purgeable_size;
+	u64 size, mappable_size, purgeable_size;
 	struct drm_i915_gem_object *obj;
 	struct i915_address_space *vm = &dev_priv->gtt.base;
 	struct drm_file *file;
@@ -431,17 +431,17 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 
 	size = count = mappable_size = mappable_count = 0;
 	count_objects(&dev_priv->mm.bound_list, global_list);
-	seq_printf(m, "%u [%u] objects, %zu [%zu] bytes in gtt\n",
+	seq_printf(m, "%u [%u] objects, %llu [%llu] bytes in gtt\n",
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
 	count_vmas(&vm->active_list, mm_list);
-	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
+	seq_printf(m, "  %u [%u] active objects, %llu [%llu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
 	count_vmas(&vm->inactive_list, mm_list);
-	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
+	seq_printf(m, "  %u [%u] inactive objects, %llu [%llu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
 	size = count = purgeable_size = purgeable_count = 0;
@@ -450,7 +450,7 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 		if (obj->madv == I915_MADV_DONTNEED)
 			purgeable_size += obj->base.size, ++purgeable_count;
 	}
-	seq_printf(m, "%u unbound objects, %zu bytes\n", count, size);
+	seq_printf(m, "%u unbound objects, %llu bytes\n", count, size);
 
 	size = count = mappable_size = mappable_count = 0;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
@@ -467,16 +467,16 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 			++purgeable_count;
 		}
 	}
-	seq_printf(m, "%u purgeable objects, %zu bytes\n",
+	seq_printf(m, "%u purgeable objects, %llu bytes\n",
 		   purgeable_count, purgeable_size);
-	seq_printf(m, "%u pinned mappable objects, %zu bytes\n",
+	seq_printf(m, "%u pinned mappable objects, %llu bytes\n",
 		   mappable_count, mappable_size);
-	seq_printf(m, "%u fault mappable objects, %zu bytes\n",
+	seq_printf(m, "%u fault mappable objects, %llu bytes\n",
 		   count, size);
 
-	seq_printf(m, "%zu [%lu] gtt total\n",
+	seq_printf(m, "%llu [%llu] gtt total\n",
 		   dev_priv->gtt.base.total,
-		   dev_priv->gtt.mappable_end - dev_priv->gtt.base.start);
+		   (u64)dev_priv->gtt.mappable_end - dev_priv->gtt.base.start);
 
 	seq_putc(m, '\n');
 	print_batch_pool_stats(m, dev_priv);
@@ -513,7 +513,7 @@ static int i915_gem_gtt_info(struct seq_file *m, void *data)
 	uintptr_t list = (uintptr_t) node->info_ent->data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
-	size_t total_obj_size, total_gtt_size;
+	u64 total_obj_size, total_gtt_size;
 	int count, ret;
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -535,7 +535,7 @@ static int i915_gem_gtt_info(struct seq_file *m, void *data)
 
 	mutex_unlock(&dev->struct_mutex);
 
-	seq_printf(m, "Total %d objects, %zu bytes, %zu GTT size\n",
+	seq_printf(m, "Total %d objects, %llu bytes, %llu GTT size\n",
 		   count, total_obj_size, total_gtt_size);
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5ff96f9..67a0e80 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3538,9 +3538,9 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 size, fence_size, fence_alignment, unfenced_alignment;
-	unsigned long start =
+	u64 start =
 		flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
-	unsigned long end =
+	u64 end =
 		flags & PIN_MAPPABLE ? dev_priv->gtt.mappable_end : vm->total;
 	struct i915_vma *vma;
 	int ret;
@@ -3596,7 +3596,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 	 * attempt to find space.
 	 */
 	if (size > end) {
-		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%u > %s aperture=%lu\n",
+		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%u > %s aperture=%llu\n",
 			  ggtt_view ? ggtt_view->type : 0,
 			  size,
 			  flags & PIN_MAPPABLE ? "mappable" : "total",
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 6f79680..0ff381e 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2097,7 +2097,7 @@ static int i915_gem_setup_global_gtt(struct drm_device *dev,
 void i915_gem_init_global_gtt(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long gtt_size, mappable_size;
+	u64 gtt_size, mappable_size;
 
 	gtt_size = dev_priv->gtt.base.total;
 	mappable_size = dev_priv->gtt.mappable_end;
@@ -2352,13 +2352,13 @@ static void chv_setup_private_ppat(struct drm_i915_private *dev_priv)
 }
 
 static int gen8_gmch_probe(struct drm_device *dev,
-			   size_t *gtt_total,
+			   u64 *gtt_total,
 			   size_t *stolen,
 			   phys_addr_t *mappable_base,
-			   unsigned long *mappable_end)
+			   u64 *mappable_end)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned int gtt_size;
+	u64 gtt_size;
 	u16 snb_gmch_ctl;
 	int ret;
 
@@ -2400,10 +2400,10 @@ static int gen8_gmch_probe(struct drm_device *dev,
 }
 
 static int gen6_gmch_probe(struct drm_device *dev,
-			   size_t *gtt_total,
+			   u64 *gtt_total,
 			   size_t *stolen,
 			   phys_addr_t *mappable_base,
-			   unsigned long *mappable_end)
+			   u64 *mappable_end)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	unsigned int gtt_size;
@@ -2417,7 +2417,7 @@ static int gen6_gmch_probe(struct drm_device *dev,
 	 * a coarse sanity check.
 	 */
 	if ((*mappable_end < (64<<20) || (*mappable_end > (512<<20)))) {
-		DRM_ERROR("Unknown GMADR size (%lx)\n",
+		DRM_ERROR("Unknown GMADR size (%llx)\n",
 			  dev_priv->gtt.mappable_end);
 		return -ENXIO;
 	}
@@ -2451,10 +2451,10 @@ static void gen6_gmch_remove(struct i915_address_space *vm)
 }
 
 static int i915_gmch_probe(struct drm_device *dev,
-			   size_t *gtt_total,
+			   u64 *gtt_total,
 			   size_t *stolen,
 			   phys_addr_t *mappable_base,
-			   unsigned long *mappable_end)
+			   u64 *mappable_end)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret;
@@ -2519,9 +2519,9 @@ int i915_gem_gtt_init(struct drm_device *dev)
 	gtt->base.dev = dev;
 
 	/* GMADR is the PCI mmio aperture into the global GTT. */
-	DRM_INFO("Memory usable by graphics device = %zdM\n",
+	DRM_INFO("Memory usable by graphics device = %lluM\n",
 		 gtt->base.total >> 20);
-	DRM_DEBUG_DRIVER("GMADR size = %ldM\n", gtt->mappable_end >> 20);
+	DRM_DEBUG_DRIVER("GMADR size = %lldM\n", gtt->mappable_end >> 20);
 	DRM_DEBUG_DRIVER("GTT stolen size = %zdM\n", gtt->stolen_size >> 20);
 #ifdef CONFIG_INTEL_IOMMU
 	if (intel_iommu_gfx_mapped)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 0d46dd2..c343161 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -233,8 +233,8 @@ struct i915_address_space {
 	struct drm_mm mm;
 	struct drm_device *dev;
 	struct list_head global_link;
-	unsigned long start;		/* Start offset always 0 for dri2 */
-	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
+	u64 start;		/* Start offset always 0 for dri2 */
+	u64 total;		/* size addr space maps (ex. 2GB for ggtt) */
 
 	struct {
 		dma_addr_t addr;
@@ -300,9 +300,9 @@ struct i915_address_space {
  */
 struct i915_gtt {
 	struct i915_address_space base;
-	size_t stolen_size;		/* Total size of stolen memory */
 
-	unsigned long mappable_end;	/* End offset that we can CPU map */
+	size_t stolen_size;		/* Total size of stolen memory */
+	u64 mappable_end;		/* End offset that we can CPU map */
 	struct io_mapping *mappable;	/* Mapping to our CPU mappable region */
 	phys_addr_t mappable_base;	/* PA of our GMADR */
 
@@ -314,9 +314,9 @@ struct i915_gtt {
 	int mtrr;
 
 	/* global gtt ops */
-	int (*gtt_probe)(struct drm_device *dev, size_t *gtt_total,
+	int (*gtt_probe)(struct drm_device *dev, u64 *gtt_total,
 			  size_t *stolen, phys_addr_t *mappable_base,
-			  unsigned long *mappable_end);
+			  u64 *mappable_end);
 };
 
 struct i915_hw_ppgtt {
diff --git a/include/drm/intel-gtt.h b/include/drm/intel-gtt.h
index b08bdad..9e9bddaa5 100644
--- a/include/drm/intel-gtt.h
+++ b/include/drm/intel-gtt.h
@@ -3,8 +3,8 @@
 #ifndef _DRM_INTEL_GTT_H
 #define	_DRM_INTEL_GTT_H
 
-void intel_gtt_get(size_t *gtt_total, size_t *stolen_size,
-		   phys_addr_t *mappable_base, unsigned long *mappable_end);
+void intel_gtt_get(u64 *gtt_total, size_t *stolen_size,
+		   phys_addr_t *mappable_base, u64 *mappable_end);
 
 int intel_gmch_probe(struct pci_dev *bridge_pdev, struct pci_dev *gpu_pdev,
 		     struct agp_bridge_data *bridge);
-- 
1.9.1


* [PATCH 05/20] drm/i915/gtt: Don't leak scratch page on mapping error
From: Mika Kuoppala @ 2015-05-21 14:37 UTC
  To: intel-gfx; +Cc: miku

Free the scratch page if the dma mapping fails.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0ff381e..e4775d8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2141,8 +2141,10 @@ static int setup_scratch_page(struct drm_device *dev)
 #ifdef CONFIG_INTEL_IOMMU
 	dma_addr = pci_map_page(dev->pdev, page, 0, PAGE_SIZE,
 				PCI_DMA_BIDIRECTIONAL);
-	if (pci_dma_mapping_error(dev->pdev, dma_addr))
+	if (pci_dma_mapping_error(dev->pdev, dma_addr)) {
+		__free_page(page);
 		return -EINVAL;
+	}
 #else
 	dma_addr = page_to_phys(page);
 #endif
-- 
1.9.1


* [PATCH 06/20] drm/i915/gtt: Remove _single from page table allocator
From: Mika Kuoppala @ 2015-05-21 14:37 UTC
  To: intel-gfx; +Cc: miku

We are always allocating a single page, so there is no need
to be verbose; remove the suffix.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index e4775d8..3270744 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -369,7 +369,7 @@ static void gen8_initialize_pt(struct i915_address_space *vm,
 	kunmap_atomic(pt_vaddr);
 }
 
-static struct i915_page_table *alloc_pt_single(struct drm_device *dev)
+static struct i915_page_table *alloc_pt(struct drm_device *dev)
 {
 	struct i915_page_table *pt;
 	const size_t count = INTEL_INFO(dev)->gen >= 8 ?
@@ -417,7 +417,7 @@ static void unmap_and_free_pd(struct i915_page_directory *pd,
 	}
 }
 
-static struct i915_page_directory *alloc_pd_single(struct drm_device *dev)
+static struct i915_page_directory *alloc_pd(struct drm_device *dev)
 {
 	struct i915_page_directory *pd;
 	int ret = -ENOMEM;
@@ -702,7 +702,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
 			continue;
 		}
 
-		pt = alloc_pt_single(dev);
+		pt = alloc_pt(dev);
 		if (IS_ERR(pt))
 			goto unwind_out;
 
@@ -760,7 +760,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
 		if (pd)
 			continue;
 
-		pd = alloc_pd_single(dev);
+		pd = alloc_pd(dev);
 		if (IS_ERR(pd))
 			goto unwind_out;
 
@@ -950,11 +950,11 @@ err_out:
  */
 static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 {
-	ppgtt->scratch_pt = alloc_pt_single(ppgtt->base.dev);
+	ppgtt->scratch_pt = alloc_pt(ppgtt->base.dev);
 	if (IS_ERR(ppgtt->scratch_pt))
 		return PTR_ERR(ppgtt->scratch_pt);
 
-	ppgtt->scratch_pd = alloc_pd_single(ppgtt->base.dev);
+	ppgtt->scratch_pd = alloc_pd(ppgtt->base.dev);
 	if (IS_ERR(ppgtt->scratch_pd))
 		return PTR_ERR(ppgtt->scratch_pd);
 
@@ -1325,7 +1325,7 @@ static int gen6_alloc_va_range(struct i915_address_space *vm,
 		/* We've already allocated a page table */
 		WARN_ON(!bitmap_empty(pt->used_ptes, GEN6_PTES));
 
-		pt = alloc_pt_single(dev);
+		pt = alloc_pt(dev);
 		if (IS_ERR(pt)) {
 			ret = PTR_ERR(pt);
 			goto unwind_out;
@@ -1411,7 +1411,7 @@ static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
 	 * size. We allocate at the top of the GTT to avoid fragmentation.
 	 */
 	BUG_ON(!drm_mm_initialized(&dev_priv->gtt.base.mm));
-	ppgtt->scratch_pt = alloc_pt_single(ppgtt->base.dev);
+	ppgtt->scratch_pt = alloc_pt(ppgtt->base.dev);
 	if (IS_ERR(ppgtt->scratch_pt))
 		return PTR_ERR(ppgtt->scratch_pt);
 
-- 
1.9.1


* [PATCH 07/20] drm/i915/gtt: Introduce i915_page_dir_dma_addr
From: Mika Kuoppala @ 2015-05-21 14:37 UTC
  To: intel-gfx; +Cc: miku

The legacy mode mm switch and the execlist context assignment
need the dma addresses of the page directories.

Introduce a function that encapsulates the fallback to the
scratch_pd dma address when no pd is found.
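
Usage then looks like this (taken from the diff below); callers
no longer open-code the scratch fallback:

  /* Returns the dma address of page directory i, or the
   * scratch_pd address if that pdp entry is unused. */
  const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);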

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 6 ++----
 drivers/gpu/drm/i915/i915_gem_gtt.h | 8 ++++++++
 drivers/gpu/drm/i915/intel_lrc.c    | 4 +---
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3270744..3642476 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -481,10 +481,8 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	int i, ret;
 
 	for (i = GEN8_LEGACY_PDPES - 1; i >= 0; i--) {
-		struct i915_page_directory *pd = ppgtt->pdp.page_directory[i];
-		dma_addr_t pd_daddr = pd ? pd->daddr : ppgtt->scratch_pd->daddr;
-		/* The page directory might be NULL, but we need to clear out
-		 * whatever the previous context might have used. */
+		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
+
 		ret = gen8_write_pdp(ring, i, pd_daddr);
 		if (ret)
 			return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index c343161..da67542 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -468,6 +468,14 @@ static inline size_t gen8_pte_count(uint64_t address, uint64_t length)
 	return i915_pte_count(address, length, GEN8_PDE_SHIFT);
 }
 
+static inline dma_addr_t
+i915_page_dir_dma_addr(const struct i915_hw_ppgtt *ppgtt, const unsigned n)
+{
+	return test_bit(n, ppgtt->pdp.used_pdpes) ?
+		ppgtt->pdp.page_directory[n]->daddr :
+		ppgtt->scratch_pd->daddr;
+}
+
 int i915_gem_gtt_init(struct drm_device *dev);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_global_gtt_cleanup(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 5ee2a8c..ebcac80 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -190,9 +190,7 @@
 #define GEN8_CTX_PRIVILEGE (1<<8)
 
 #define ASSIGN_CTX_PDP(ppgtt, reg_state, n) { \
-	const u64 _addr = test_bit(n, ppgtt->pdp.used_pdpes) ? \
-		ppgtt->pdp.page_directory[n]->daddr : \
-		ppgtt->scratch_pd->daddr; \
+	const u64 _addr = i915_page_dir_dma_addr((ppgtt), (n));	\
 	reg_state[CTX_PDP ## n ## _UDW+1] = upper_32_bits(_addr); \
 	reg_state[CTX_PDP ## n ## _LDW+1] = lower_32_bits(_addr); \
 }
-- 
1.9.1


* [PATCH 08/20] drm/i915/gtt: Introduce struct i915_page_dma
From: Mika Kuoppala @ 2015-05-21 14:37 UTC
  To: intel-gfx; +Cc: miku

All our paging structures have a struct page and a dma address
for that page.

Add a struct for the page/dma address pair and use it to make
the setup and teardown of the different paging structures
identical.

Also include the page directory offset in the struct for legacy
gens, and rename it to make clear that it is an offset into the
ggtt.
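
A minimal sketch of the intended symmetry (i915_page_foo and
its helpers are hypothetical, not part of the patch): anything
that embeds i915_page_dma gets the same setup and teardown path:

  /* Hypothetical paging structure built on i915_page_dma. */
  struct i915_page_foo {
          struct i915_page_dma base;
  };

  static int foo_init(struct drm_device *dev,
                      struct i915_page_foo *foo)
  {
          /* alloc_page() + dma_map_page(), see the diff below */
          return setup_page_dma(dev, &foo->base);
  }

  static void foo_fini(struct drm_device *dev,
                       struct i915_page_foo *foo)
  {
          /* dma_unmap_page() + __free_page() */
          cleanup_page_dma(dev, &foo->base);
  }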

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 120 ++++++++++++++----------------------
 drivers/gpu/drm/i915/i915_gem_gtt.h |  21 ++++---
 3 files changed, 60 insertions(+), 83 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index abc9fc5..2ae5bd1 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2239,7 +2239,7 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 		struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
 
 		seq_puts(m, "aliasing PPGTT:\n");
-		seq_printf(m, "pd gtt offset: 0x%08x\n", ppgtt->pd.pd_offset);
+		seq_printf(m, "pd gtt offset: 0x%08x\n", ppgtt->pd.base.ggtt_offset);
 
 		ppgtt->debug_dump(ppgtt, m);
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3642476..a3b3188 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -301,52 +301,39 @@ static gen6_pte_t iris_pte_encode(dma_addr_t addr,
 	return pte;
 }
 
-#define i915_dma_unmap_single(px, dev) \
-	__i915_dma_unmap_single((px)->daddr, dev)
-
-static void __i915_dma_unmap_single(dma_addr_t daddr,
-				    struct drm_device *dev)
+static int setup_page_dma(struct drm_device *dev, struct i915_page_dma *p)
 {
 	struct device *device = &dev->pdev->dev;
 
-	dma_unmap_page(device, daddr, 4096, PCI_DMA_BIDIRECTIONAL);
-}
-
-/**
- * i915_dma_map_single() - Create a dma mapping for a page table/dir/etc.
- * @px:	Page table/dir/etc to get a DMA map for
- * @dev:	drm device
- *
- * Page table allocations are unified across all gens. They always require a
- * single 4k allocation, as well as a DMA mapping. If we keep the structs
- * symmetric here, the simple macro covers us for every page table type.
- *
- * Return: 0 if success.
- */
-#define i915_dma_map_single(px, dev) \
-	i915_dma_map_page_single((px)->page, (dev), &(px)->daddr)
+	p->page = alloc_page(GFP_KERNEL);
+	if (!p->page)
+		return -ENOMEM;
 
-static int i915_dma_map_page_single(struct page *page,
-				    struct drm_device *dev,
-				    dma_addr_t *daddr)
-{
-	struct device *device = &dev->pdev->dev;
+	p->daddr = dma_map_page(device,
+				p->page, 0, 4096, PCI_DMA_BIDIRECTIONAL);
 
-	*daddr = dma_map_page(device, page, 0, 4096, PCI_DMA_BIDIRECTIONAL);
-	if (dma_mapping_error(device, *daddr))
-		return -ENOMEM;
+	if (dma_mapping_error(device, p->daddr)) {
+		__free_page(p->page);
+		return -EINVAL;
+	}
 
 	return 0;
 }
 
-static void unmap_and_free_pt(struct i915_page_table *pt,
-			       struct drm_device *dev)
+static void cleanup_page_dma(struct drm_device *dev, struct i915_page_dma *p)
 {
-	if (WARN_ON(!pt->page))
+	if (WARN_ON(!p->page))
 		return;
 
-	i915_dma_unmap_single(pt, dev);
-	__free_page(pt->page);
+	dma_unmap_page(&dev->pdev->dev, p->daddr, 4096, PCI_DMA_BIDIRECTIONAL);
+	__free_page(p->page);
+	memset(p, 0, sizeof(*p));
+}
+
+static void unmap_and_free_pt(struct i915_page_table *pt,
+			       struct drm_device *dev)
+{
+	cleanup_page_dma(dev, &pt->base);
 	kfree(pt->used_ptes);
 	kfree(pt);
 }
@@ -357,7 +344,7 @@ static void gen8_initialize_pt(struct i915_address_space *vm,
 	gen8_pte_t *pt_vaddr, scratch_pte;
 	int i;
 
-	pt_vaddr = kmap_atomic(pt->page);
+	pt_vaddr = kmap_atomic(pt->base.page);
 	scratch_pte = gen8_pte_encode(vm->scratch.addr,
 				      I915_CACHE_LLC, true);
 
@@ -386,19 +373,13 @@ static struct i915_page_table *alloc_pt(struct drm_device *dev)
 	if (!pt->used_ptes)
 		goto fail_bitmap;
 
-	pt->page = alloc_page(GFP_KERNEL);
-	if (!pt->page)
-		goto fail_page;
-
-	ret = i915_dma_map_single(pt, dev);
+	ret = setup_page_dma(dev, &pt->base);
 	if (ret)
-		goto fail_dma;
+		goto fail_page_m;
 
 	return pt;
 
-fail_dma:
-	__free_page(pt->page);
-fail_page:
+fail_page_m:
 	kfree(pt->used_ptes);
 fail_bitmap:
 	kfree(pt);
@@ -409,9 +390,8 @@ fail_bitmap:
 static void unmap_and_free_pd(struct i915_page_directory *pd,
 			      struct drm_device *dev)
 {
-	if (pd->page) {
-		i915_dma_unmap_single(pd, dev);
-		__free_page(pd->page);
+	if (pd->base.page) {
+		cleanup_page_dma(dev, &pd->base);
 		kfree(pd->used_pdes);
 		kfree(pd);
 	}
@@ -431,18 +411,12 @@ static struct i915_page_directory *alloc_pd(struct drm_device *dev)
 	if (!pd->used_pdes)
 		goto free_pd;
 
-	pd->page = alloc_page(GFP_KERNEL);
-	if (!pd->page)
-		goto free_bitmap;
-
-	ret = i915_dma_map_single(pd, dev);
+	ret = setup_page_dma(dev, &pd->base);
 	if (ret)
-		goto free_page;
+		goto free_bitmap;
 
 	return pd;
 
-free_page:
-	__free_page(pd->page);
 free_bitmap:
 	kfree(pd->used_pdes);
 free_pd:
@@ -523,10 +497,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 
 		pt = pd->page_table[pde];
 
-		if (WARN_ON(!pt->page))
+		if (WARN_ON(!pt->base.page))
 			continue;
 
-		page_table = pt->page;
+		page_table = pt->base.page;
 
 		last_pte = pte + num_entries;
 		if (last_pte > GEN8_PTES)
@@ -573,7 +547,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 		if (pt_vaddr == NULL) {
 			struct i915_page_directory *pd = ppgtt->pdp.page_directory[pdpe];
 			struct i915_page_table *pt = pd->page_table[pde];
-			struct page *page_table = pt->page;
+			struct page *page_table = pt->base.page;
 
 			pt_vaddr = kmap_atomic(page_table);
 		}
@@ -605,7 +579,7 @@ static void __gen8_do_map_pt(gen8_pde_t * const pde,
 			     struct drm_device *dev)
 {
 	gen8_pde_t entry =
-		gen8_pde_encode(dev, pt->daddr, I915_CACHE_LLC);
+		gen8_pde_encode(dev, pt->base.daddr, I915_CACHE_LLC);
 	*pde = entry;
 }
 
@@ -618,7 +592,7 @@ static void gen8_initialize_pd(struct i915_address_space *vm,
 	struct i915_page_table *pt;
 	int i;
 
-	page_directory = kmap_atomic(pd->page);
+	page_directory = kmap_atomic(pd->base.page);
 	pt = ppgtt->scratch_pt;
 	for (i = 0; i < I915_PDES; i++)
 		/* Map the PDE to the page table */
@@ -633,7 +607,7 @@ static void gen8_free_page_tables(struct i915_page_directory *pd, struct drm_dev
 {
 	int i;
 
-	if (!pd->page)
+	if (!pd->base.page)
 		return;
 
 	for_each_set_bit(i, pd->used_pdes, I915_PDES) {
@@ -883,7 +857,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	/* Allocations have completed successfully, so set the bitmaps, and do
 	 * the mappings. */
 	gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
-		gen8_pde_t *const page_directory = kmap_atomic(pd->page);
+		gen8_pde_t *const page_directory = kmap_atomic(pd->base.page);
 		struct i915_page_table *pt;
 		uint64_t pd_len = gen8_clamp_pd(start, length);
 		uint64_t pd_start = start;
@@ -987,7 +961,7 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
 	gen6_for_each_pde(unused, &ppgtt->pd, start, length, temp, pde) {
 		u32 expected;
 		gen6_pte_t *pt_vaddr;
-		dma_addr_t pt_addr = ppgtt->pd.page_table[pde]->daddr;
+		dma_addr_t pt_addr = ppgtt->pd.page_table[pde]->base.daddr;
 		pd_entry = readl(ppgtt->pd_addr + pde);
 		expected = (GEN6_PDE_ADDR_ENCODE(pt_addr) | GEN6_PDE_VALID);
 
@@ -998,7 +972,7 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
 				   expected);
 		seq_printf(m, "\tPDE: %x\n", pd_entry);
 
-		pt_vaddr = kmap_atomic(ppgtt->pd.page_table[pde]->page);
+		pt_vaddr = kmap_atomic(ppgtt->pd.page_table[pde]->base.page);
 		for (pte = 0; pte < GEN6_PTES; pte+=4) {
 			unsigned long va =
 				(pde * PAGE_SIZE * GEN6_PTES) +
@@ -1033,7 +1007,7 @@ static void gen6_write_pde(struct i915_page_directory *pd,
 		container_of(pd, struct i915_hw_ppgtt, pd);
 	u32 pd_entry;
 
-	pd_entry = GEN6_PDE_ADDR_ENCODE(pt->daddr);
+	pd_entry = GEN6_PDE_ADDR_ENCODE(pt->base.daddr);
 	pd_entry |= GEN6_PDE_VALID;
 
 	writel(pd_entry, ppgtt->pd_addr + pde);
@@ -1058,9 +1032,9 @@ static void gen6_write_page_range(struct drm_i915_private *dev_priv,
 
 static uint32_t get_pd_offset(struct i915_hw_ppgtt *ppgtt)
 {
-	BUG_ON(ppgtt->pd.pd_offset & 0x3f);
+	BUG_ON(ppgtt->pd.base.ggtt_offset & 0x3f);
 
-	return (ppgtt->pd.pd_offset / 64) << 16;
+	return (ppgtt->pd.base.ggtt_offset / 64) << 16;
 }
 
 static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
@@ -1223,7 +1197,7 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 		if (last_pte > GEN6_PTES)
 			last_pte = GEN6_PTES;
 
-		pt_vaddr = kmap_atomic(ppgtt->pd.page_table[act_pt]->page);
+		pt_vaddr = kmap_atomic(ppgtt->pd.page_table[act_pt]->base.page);
 
 		for (i = first_pte; i < last_pte; i++)
 			pt_vaddr[i] = scratch_pte;
@@ -1252,7 +1226,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 	pt_vaddr = NULL;
 	for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
 		if (pt_vaddr == NULL)
-			pt_vaddr = kmap_atomic(ppgtt->pd.page_table[act_pt]->page);
+			pt_vaddr = kmap_atomic(ppgtt->pd.page_table[act_pt]->base.page);
 
 		pt_vaddr[act_pte] =
 			vm->pte_encode(sg_page_iter_dma_address(&sg_iter),
@@ -1280,7 +1254,7 @@ static void gen6_initialize_pt(struct i915_address_space *vm,
 	scratch_pte = vm->pte_encode(vm->scratch.addr,
 			I915_CACHE_LLC, true, 0);
 
-	pt_vaddr = kmap_atomic(pt->page);
+	pt_vaddr = kmap_atomic(pt->base.page);
 
 	for (i = 0; i < GEN6_PTES; i++)
 		pt_vaddr[i] = scratch_pte;
@@ -1496,11 +1470,11 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->base.total = I915_PDES * GEN6_PTES * PAGE_SIZE;
 	ppgtt->debug_dump = gen6_dump_ppgtt;
 
-	ppgtt->pd.pd_offset =
+	ppgtt->pd.base.ggtt_offset =
 		ppgtt->node.start / PAGE_SIZE * sizeof(gen6_pte_t);
 
 	ppgtt->pd_addr = (gen6_pte_t __iomem *)dev_priv->gtt.gsm +
-		ppgtt->pd.pd_offset / sizeof(gen6_pte_t);
+		ppgtt->pd.base.ggtt_offset / sizeof(gen6_pte_t);
 
 	gen6_scratch_va_range(ppgtt, 0, ppgtt->base.total);
 
@@ -1511,7 +1485,7 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 			 ppgtt->node.start / PAGE_SIZE);
 
 	DRM_DEBUG("Adding PPGTT at offset %x\n",
-		  ppgtt->pd.pd_offset << 10);
+		  ppgtt->pd.base.ggtt_offset << 10);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index da67542..666decc 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -205,19 +205,22 @@ struct i915_vma {
 #define DRM_I915_GEM_OBJECT_MAX_PIN_COUNT 0xf
 };
 
-struct i915_page_table {
+struct i915_page_dma {
 	struct page *page;
-	dma_addr_t daddr;
+	union {
+		dma_addr_t daddr;
+		uint32_t ggtt_offset;
+	};
+};
+
+struct i915_page_table {
+	struct i915_page_dma base;
 
 	unsigned long *used_ptes;
 };
 
 struct i915_page_directory {
-	struct page *page; /* NULL for GEN6-GEN7 */
-	union {
-		uint32_t pd_offset;
-		dma_addr_t daddr;
-	};
+	struct i915_page_dma base;
 
 	unsigned long *used_pdes;
 	struct i915_page_table *page_table[I915_PDES]; /* PDEs */
@@ -472,8 +475,8 @@ static inline dma_addr_t
 i915_page_dir_dma_addr(const struct i915_hw_ppgtt *ppgtt, const unsigned n)
 {
 	return test_bit(n, ppgtt->pdp.used_pdpes) ?
-		ppgtt->pdp.page_directory[n]->daddr :
-		ppgtt->scratch_pd->daddr;
+		ppgtt->pdp.page_directory[n]->base.daddr :
+		ppgtt->scratch_pd->base.daddr;
 }
 
 int i915_gem_gtt_init(struct drm_device *dev);
-- 
1.9.1


* [PATCH 09/20] drm/i915/gtt: Rename unmap_and_free_px to free_px
From: Mika Kuoppala @ 2015-05-21 14:37 UTC
  To: intel-gfx; +Cc: miku

All the paging structures are now similar and mapped for
dma. The unmapping is taken care of by common accessors, so
don't overload the reader with such details.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 32 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a3b3188..053058d 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -330,8 +330,7 @@ static void cleanup_page_dma(struct drm_device *dev, struct i915_page_dma *p)
 	memset(p, 0, sizeof(*p));
 }
 
-static void unmap_and_free_pt(struct i915_page_table *pt,
-			       struct drm_device *dev)
+static void free_pt(struct drm_device *dev, struct i915_page_table *pt)
 {
 	cleanup_page_dma(dev, &pt->base);
 	kfree(pt->used_ptes);
@@ -387,8 +386,7 @@ fail_bitmap:
 	return ERR_PTR(ret);
 }
 
-static void unmap_and_free_pd(struct i915_page_directory *pd,
-			      struct drm_device *dev)
+static void free_pd(struct drm_device *dev, struct i915_page_directory *pd)
 {
 	if (pd->base.page) {
 		cleanup_page_dma(dev, &pd->base);
@@ -614,7 +612,7 @@ static void gen8_free_page_tables(struct i915_page_directory *pd, struct drm_dev
 		if (WARN_ON(!pd->page_table[i]))
 			continue;
 
-		unmap_and_free_pt(pd->page_table[i], dev);
+		free_pt(dev, pd->page_table[i]);
 		pd->page_table[i] = NULL;
 	}
 }
@@ -630,11 +628,11 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 			continue;
 
 		gen8_free_page_tables(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
-		unmap_and_free_pd(ppgtt->pdp.page_directory[i], ppgtt->base.dev);
+		free_pd(ppgtt->base.dev, ppgtt->pdp.page_directory[i]);
 	}
 
-	unmap_and_free_pd(ppgtt->scratch_pd, ppgtt->base.dev);
-	unmap_and_free_pt(ppgtt->scratch_pt, ppgtt->base.dev);
+	free_pd(ppgtt->base.dev, ppgtt->scratch_pd);
+	free_pt(ppgtt->base.dev, ppgtt->scratch_pt);
 }
 
 /**
@@ -687,7 +685,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
 
 unwind_out:
 	for_each_set_bit(pde, new_pts, I915_PDES)
-		unmap_and_free_pt(pd->page_table[pde], dev);
+		free_pt(dev, pd->page_table[pde]);
 
 	return -ENOMEM;
 }
@@ -745,7 +743,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
 
 unwind_out:
 	for_each_set_bit(pdpe, new_pds, GEN8_LEGACY_PDPES)
-		unmap_and_free_pd(pdp->page_directory[pdpe], dev);
+		free_pd(dev, pdp->page_directory[pdpe]);
 
 	return -ENOMEM;
 }
@@ -902,11 +900,11 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 err_out:
 	while (pdpe--) {
 		for_each_set_bit(temp, new_page_tables[pdpe], I915_PDES)
-			unmap_and_free_pt(ppgtt->pdp.page_directory[pdpe]->page_table[temp], vm->dev);
+			free_pt(vm->dev, ppgtt->pdp.page_directory[pdpe]->page_table[temp]);
 	}
 
 	for_each_set_bit(pdpe, new_page_dirs, GEN8_LEGACY_PDPES)
-		unmap_and_free_pd(ppgtt->pdp.page_directory[pdpe], vm->dev);
+		free_pd(vm->dev, ppgtt->pdp.page_directory[pdpe]);
 
 	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
 	mark_tlbs_dirty(ppgtt);
@@ -1345,7 +1343,7 @@ unwind_out:
 		struct i915_page_table *pt = ppgtt->pd.page_table[pde];
 
 		ppgtt->pd.page_table[pde] = ppgtt->scratch_pt;
-		unmap_and_free_pt(pt, vm->dev);
+		free_pt(vm->dev, pt);
 	}
 
 	mark_tlbs_dirty(ppgtt);
@@ -1364,11 +1362,11 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 
 	gen6_for_all_pdes(pt, ppgtt, pde) {
 		if (pt != ppgtt->scratch_pt)
-			unmap_and_free_pt(pt, ppgtt->base.dev);
+			free_pt(ppgtt->base.dev, pt);
 	}
 
-	unmap_and_free_pt(ppgtt->scratch_pt, ppgtt->base.dev);
-	unmap_and_free_pd(&ppgtt->pd, ppgtt->base.dev);
+	free_pt(ppgtt->base.dev, ppgtt->scratch_pt);
+	free_pd(ppgtt->base.dev, &ppgtt->pd);
 }
 
 static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
@@ -1418,7 +1416,7 @@ alloc:
 	return 0;
 
 err_out:
-	unmap_and_free_pt(ppgtt->scratch_pt, ppgtt->base.dev);
+	free_pt(ppgtt->base.dev, ppgtt->scratch_pt);
 	return ret;
 }
 
-- 
1.9.1


* [PATCH 10/20] drm/i915/gtt: Remove superfluous free_pd with gen6/7
From: Mika Kuoppala @ 2015-05-21 14:37 UTC
  To: intel-gfx; +Cc: miku

This slipped in somewhere, but it was harmless: the gen6/7
page directory is embedded in the ppgtt and never has a page
of its own, and we check the page pointer before teardown.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 053058d..5175eb8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1366,7 +1366,6 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 	}
 
 	free_pt(ppgtt->base.dev, ppgtt->scratch_pt);
-	free_pd(ppgtt->base.dev, &ppgtt->pd);
 }
 
 static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
-- 
1.9.1


* [PATCH 11/20] drm/i915/gtt: Introduce fill_page_dma()
From: Mika Kuoppala @ 2015-05-21 14:37 UTC
  To: intel-gfx; +Cc: miku

When we set up page directories and tables, we point their
entries at the next level scratch structure. Make this generic
by introducing fill_page_dma(), which maps the page and fills
it with a value. We also need a 32-bit variant for legacy gens.
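
To illustrate the 32-bit variant (a standalone example): a 4KiB
page holds 1024 32-bit entries, so duplicating the value into a
64-bit word lets fill_page_dma() cover the whole page with 512
stores:

  #include <stdint.h>
  #include <assert.h>

  int main(void)
  {
          uint32_t val32 = 0xdeadbeef;    /* example 32-bit PTE */
          uint64_t v = val32;

          v = v << 32 | val32;    /* as in fill_page_dma_32() */

          /* Both halves hold the PTE, so 512 64-bit writes
           * initialize all 1024 32-bit slots of the page. */
          assert((uint32_t)v == val32);
          assert((uint32_t)(v >> 32) == val32);
          return 0;
  }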

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 61 +++++++++++++++++++------------------
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 5175eb8..a3ee710 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -330,6 +330,27 @@ static void cleanup_page_dma(struct drm_device *dev, struct i915_page_dma *p)
 	memset(p, 0, sizeof(*p));
 }
 
+static void fill_page_dma(struct drm_device *dev, struct i915_page_dma *p,
+			  const uint64_t val)
+{
+	int i;
+	uint64_t * const vaddr = kmap_atomic(p->page);
+
+	for (i = 0; i < 512; i++)
+		vaddr[i] = val;
+
+	kunmap_atomic(vaddr);
+}
+
+static void fill_page_dma_32(struct drm_device *dev, struct i915_page_dma *p,
+			     const uint32_t val32)
+{
+	uint64_t v = val32;
+	v = v << 32 | val32;
+
+	fill_page_dma(dev, p, v);
+}
+
 static void free_pt(struct drm_device *dev, struct i915_page_table *pt)
 {
 	cleanup_page_dma(dev, &pt->base);
@@ -340,19 +361,12 @@ static void free_pt(struct drm_device *dev, struct i915_page_table *pt)
 static void gen8_initialize_pt(struct i915_address_space *vm,
 			       struct i915_page_table *pt)
 {
-	gen8_pte_t *pt_vaddr, scratch_pte;
-	int i;
+	gen8_pte_t scratch_pte;
 
-	pt_vaddr = kmap_atomic(pt->base.page);
 	scratch_pte = gen8_pte_encode(vm->scratch.addr,
 				      I915_CACHE_LLC, true);
 
-	for (i = 0; i < GEN8_PTES; i++)
-		pt_vaddr[i] = scratch_pte;
-
-	if (!HAS_LLC(vm->dev))
-		drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
-	kunmap_atomic(pt_vaddr);
+	fill_page_dma(vm->dev, &pt->base, scratch_pte);
 }
 
 static struct i915_page_table *alloc_pt(struct drm_device *dev)
@@ -585,20 +599,13 @@ static void gen8_initialize_pd(struct i915_address_space *vm,
 			       struct i915_page_directory *pd)
 {
 	struct i915_hw_ppgtt *ppgtt =
-			container_of(vm, struct i915_hw_ppgtt, base);
-	gen8_pde_t *page_directory;
-	struct i915_page_table *pt;
-	int i;
+		container_of(vm, struct i915_hw_ppgtt, base);
+	gen8_pde_t scratch_pde;
 
-	page_directory = kmap_atomic(pd->base.page);
-	pt = ppgtt->scratch_pt;
-	for (i = 0; i < I915_PDES; i++)
-		/* Map the PDE to the page table */
-		__gen8_do_map_pt(page_directory + i, pt, vm->dev);
+	scratch_pde = gen8_pde_encode(vm->dev, ppgtt->scratch_pt->base.daddr,
+				      I915_CACHE_LLC);
 
-	if (!HAS_LLC(vm->dev))
-		drm_clflush_virt_range(page_directory, PAGE_SIZE);
-	kunmap_atomic(page_directory);
+	fill_page_dma(vm->dev, &pd->base, scratch_pde);
 }
 
 static void gen8_free_page_tables(struct i915_page_directory *pd, struct drm_device *dev)
@@ -1242,22 +1249,16 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 }
 
 static void gen6_initialize_pt(struct i915_address_space *vm,
-		struct i915_page_table *pt)
+			       struct i915_page_table *pt)
 {
-	gen6_pte_t *pt_vaddr, scratch_pte;
-	int i;
+	gen6_pte_t scratch_pte;
 
 	WARN_ON(vm->scratch.addr == 0);
 
 	scratch_pte = vm->pte_encode(vm->scratch.addr,
 			I915_CACHE_LLC, true, 0);
 
-	pt_vaddr = kmap_atomic(pt->base.page);
-
-	for (i = 0; i < GEN6_PTES; i++)
-		pt_vaddr[i] = scratch_pte;
-
-	kunmap_atomic(pt_vaddr);
+	fill_page_dma_32(vm->dev, &pt->base, scratch_pte);
 }
 
 static int gen6_alloc_va_range(struct i915_address_space *vm,
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 12/20] drm/i915/gtt: Introduce kmap|kunmap for dma page
  2015-05-21 14:37 [PATCH 00/20] ppgtt cleanups / scratch merge Mika Kuoppala
                   ` (10 preceding siblings ...)
  2015-05-21 14:37 ` [PATCH 11/20] drm/i915/gtt: Introduce fill_page_dma() Mika Kuoppala
@ 2015-05-21 14:37 ` Mika Kuoppala
  2015-05-21 15:19   ` Ville Syrjälä
  2015-05-21 14:37 ` [PATCH 13/20] drm/i915/gtt: Introduce copy_page_dma and copy_px Mika Kuoppala
                   ` (7 subsequent siblings)
  19 siblings, 1 reply; 28+ messages in thread
From: Mika Kuoppala @ 2015-05-21 14:37 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

As there is flushing involved after we have done the cpu
write, add functions for mapping a dma page into cpu space.
Also add macros to map any type of paging structure.
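
A minimal sketch of the intended calling pattern, using the
helper names introduced in the diff below (fragment only, not a
complete function):

	gen8_pte_t *pt_vaddr;

	pt_vaddr = kmap_px(pt);		/* map the paging structure */

	pt_vaddr[pte] = scratch_pte;	/* plain CPU write */

	kunmap_px(ppgtt, pt_vaddr);	/* clflush on !HAS_LLC, then unmap */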

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 67 +++++++++++++++++++------------------
 1 file changed, 35 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a3ee710..3d94ad8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -330,16 +330,32 @@ static void cleanup_page_dma(struct drm_device *dev, struct i915_page_dma *p)
 	memset(p, 0, sizeof(*p));
 }
 
+static void *kmap_page_dma(struct i915_page_dma *p)
+{
+	return kmap_atomic(p->page);
+}
+
+static void kunmap_page_dma(struct drm_device *dev, void *vaddr)
+{
+	if (!HAS_LLC(dev))
+		drm_clflush_virt_range(vaddr, PAGE_SIZE);
+
+	kunmap_atomic(vaddr);
+}
+
+#define kmap_px(px) kmap_page_dma(&(px)->base)
+#define kunmap_px(ppgtt, vaddr) kunmap_page_dma((ppgtt)->base.dev, (vaddr));
+
 static void fill_page_dma(struct drm_device *dev, struct i915_page_dma *p,
 			  const uint64_t val)
 {
 	int i;
-	uint64_t * const vaddr = kmap_atomic(p->page);
+	uint64_t * const vaddr = kmap_page_dma(p);
 
 	for (i = 0; i < 512; i++)
 		vaddr[i] = val;
 
-	kunmap_atomic(vaddr);
+	kunmap_page_dma(dev, vaddr);
 }
 
 static void fill_page_dma_32(struct drm_device *dev, struct i915_page_dma *p,
@@ -497,7 +513,6 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 	while (num_entries) {
 		struct i915_page_directory *pd;
 		struct i915_page_table *pt;
-		struct page *page_table;
 
 		if (WARN_ON(!ppgtt->pdp.page_directory[pdpe]))
 			continue;
@@ -512,22 +527,18 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 		if (WARN_ON(!pt->base.page))
 			continue;
 
-		page_table = pt->base.page;
-
 		last_pte = pte + num_entries;
 		if (last_pte > GEN8_PTES)
 			last_pte = GEN8_PTES;
 
-		pt_vaddr = kmap_atomic(page_table);
+		pt_vaddr = kmap_px(pt);
 
 		for (i = pte; i < last_pte; i++) {
 			pt_vaddr[i] = scratch_pte;
 			num_entries--;
 		}
 
-		if (!HAS_LLC(ppgtt->base.dev))
-			drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
-		kunmap_atomic(pt_vaddr);
+		kunmap_px(ppgtt, pt_vaddr);
 
 		pte = 0;
 		if (++pde == I915_PDES) {
@@ -559,18 +570,14 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 		if (pt_vaddr == NULL) {
 			struct i915_page_directory *pd = ppgtt->pdp.page_directory[pdpe];
 			struct i915_page_table *pt = pd->page_table[pde];
-			struct page *page_table = pt->base.page;
-
-			pt_vaddr = kmap_atomic(page_table);
+			pt_vaddr = kmap_px(pt);
 		}
 
 		pt_vaddr[pte] =
 			gen8_pte_encode(sg_page_iter_dma_address(&sg_iter),
 					cache_level, true);
 		if (++pte == GEN8_PTES) {
-			if (!HAS_LLC(ppgtt->base.dev))
-				drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
-			kunmap_atomic(pt_vaddr);
+			kunmap_px(ppgtt, pt_vaddr);
 			pt_vaddr = NULL;
 			if (++pde == I915_PDES) {
 				pdpe++;
@@ -579,11 +586,9 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 			pte = 0;
 		}
 	}
-	if (pt_vaddr) {
-		if (!HAS_LLC(ppgtt->base.dev))
-			drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
-		kunmap_atomic(pt_vaddr);
-	}
+
+	if (pt_vaddr)
+		kunmap_px(ppgtt, pt_vaddr);
 }
 
 static void __gen8_do_map_pt(gen8_pde_t * const pde,
@@ -862,7 +867,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	/* Allocations have completed successfully, so set the bitmaps, and do
 	 * the mappings. */
 	gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
-		gen8_pde_t *const page_directory = kmap_atomic(pd->base.page);
+		gen8_pde_t *const page_directory = kmap_px(pd);
 		struct i915_page_table *pt;
 		uint64_t pd_len = gen8_clamp_pd(start, length);
 		uint64_t pd_start = start;
@@ -892,10 +897,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 			 * point we're still relying on insert_entries() */
 		}
 
-		if (!HAS_LLC(vm->dev))
-			drm_clflush_virt_range(page_directory, PAGE_SIZE);
-
-		kunmap_atomic(page_directory);
+		kunmap_px(ppgtt, page_directory);
 
 		set_bit(pdpe, ppgtt->pdp.used_pdpes);
 	}
@@ -977,7 +979,8 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
 				   expected);
 		seq_printf(m, "\tPDE: %x\n", pd_entry);
 
-		pt_vaddr = kmap_atomic(ppgtt->pd.page_table[pde]->base.page);
+		pt_vaddr = kmap_px(ppgtt->pd.page_table[pde]);
+
 		for (pte = 0; pte < GEN6_PTES; pte+=4) {
 			unsigned long va =
 				(pde * PAGE_SIZE * GEN6_PTES) +
@@ -999,7 +1002,7 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
 			}
 			seq_puts(m, "\n");
 		}
-		kunmap_atomic(pt_vaddr);
+		kunmap_px(ppgtt, pt_vaddr);
 	}
 }
 
@@ -1202,12 +1205,12 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 		if (last_pte > GEN6_PTES)
 			last_pte = GEN6_PTES;
 
-		pt_vaddr = kmap_atomic(ppgtt->pd.page_table[act_pt]->base.page);
+		pt_vaddr = kmap_px(ppgtt->pd.page_table[act_pt]);
 
 		for (i = first_pte; i < last_pte; i++)
 			pt_vaddr[i] = scratch_pte;
 
-		kunmap_atomic(pt_vaddr);
+		kunmap_px(ppgtt, pt_vaddr);
 
 		num_entries -= last_pte - first_pte;
 		first_pte = 0;
@@ -1231,21 +1234,21 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 	pt_vaddr = NULL;
 	for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
 		if (pt_vaddr == NULL)
-			pt_vaddr = kmap_atomic(ppgtt->pd.page_table[act_pt]->base.page);
+			pt_vaddr = kmap_px(ppgtt->pd.page_table[act_pt]);
 
 		pt_vaddr[act_pte] =
 			vm->pte_encode(sg_page_iter_dma_address(&sg_iter),
 				       cache_level, true, flags);
 
 		if (++act_pte == GEN6_PTES) {
-			kunmap_atomic(pt_vaddr);
+			kunmap_px(ppgtt, pt_vaddr);
 			pt_vaddr = NULL;
 			act_pt++;
 			act_pte = 0;
 		}
 	}
 	if (pt_vaddr)
-		kunmap_atomic(pt_vaddr);
+		kunmap_px(ppgtt, pt_vaddr);
 }
 
 static void gen6_initialize_pt(struct i915_address_space *vm,
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 13/20] drm/i915/gtt: Introduce copy_page_dma and copy_px
  2015-05-21 14:37 [PATCH 00/20] ppgtt cleanups / scratch merge Mika Kuoppala
                   ` (11 preceding siblings ...)
  2015-05-21 14:37 ` [PATCH 12/20] drm/i915/gtt: Introduce kmap|kunmap for dma page Mika Kuoppala
@ 2015-05-21 14:37 ` Mika Kuoppala
  2015-05-21 14:37 ` [PATCH 14/20] drm/i915/gtt: Use macros to access dma mapped pages Mika Kuoppala
                   ` (6 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Mika Kuoppala @ 2015-05-21 14:37 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

Every time we allocate a page structure, we fill it out
to point into the lower level scratch structure. But as
we have already set up the scratch page directory and
scratch page table when the ppgtt was initialized, take
advantage of that and do a page copy from those.
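
As a standalone analog (plain C; names like scratch_template and
init_new_table are invented for this sketch), the idea is to pay
the encode-and-fill cost once and amortize it with page copies:

#include <stdint.h>
#include <string.h>

#define ENTRIES 512	/* one 4K page of 64 bit entries */

static uint64_t scratch_template[ENTRIES];

/* Done once, when the "ppgtt" is initialized. */
static void setup_scratch_template(uint64_t scratch_pte)
{
	for (int i = 0; i < ENTRIES; i++)
		scratch_template[i] = scratch_pte;
}

/* Done for every newly allocated table: one copy, no encoding. */
static void init_new_table(uint64_t *table)
{
	memcpy(table, scratch_template, sizeof(scratch_template));
}

int main(void)
{
	uint64_t table[ENTRIES];

	setup_scratch_template(0xdeadbeef);
	init_new_table(table);
	return table[0] == 0xdeadbeef ? 0 : 1;
}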

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 61 +++++++++++++++++++++++++------------
 1 file changed, 41 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3d94ad8..fa253cd 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -335,16 +335,34 @@ static void *kmap_page_dma(struct i915_page_dma *p)
 	return kmap_atomic(p->page);
 }
 
-static void kunmap_page_dma(struct drm_device *dev, void *vaddr)
+static void kunmap_page_dma(struct drm_device *dev, void *vaddr, bool dirty)
 {
-	if (!HAS_LLC(dev))
+	if (dirty && !HAS_LLC(dev))
 		drm_clflush_virt_range(vaddr, PAGE_SIZE);
 
 	kunmap_atomic(vaddr);
 }
 
 #define kmap_px(px) kmap_page_dma(&(px)->base)
-#define kunmap_px(ppgtt, vaddr) kunmap_page_dma((ppgtt)->base.dev, (vaddr));
+#define kunmap_px(ppgtt, vaddr) kunmap_page_dma((ppgtt)->base.dev, (vaddr), true);
+
+#define copy_px(ppgtt, to, from) \
+	copy_page_dma((ppgtt)->base.dev, &(to)->base, &(from)->base)
+
+static void copy_page_dma(struct drm_device *dev,
+			  struct i915_page_dma *to,
+			  struct i915_page_dma *from)
+{
+	void *dst, *src;
+
+	src = kmap_page_dma(from);
+	dst = kmap_page_dma(to);
+
+	copy_page(dst, src);
+
+	kunmap_page_dma(dev, dst, true);
+	kunmap_page_dma(dev, src, false);
+}
 
 static void fill_page_dma(struct drm_device *dev, struct i915_page_dma *p,
 			  const uint64_t val)
@@ -355,7 +373,7 @@ static void fill_page_dma(struct drm_device *dev, struct i915_page_dma *p,
 	for (i = 0; i < 512; i++)
 		vaddr[i] = val;
 
-	kunmap_page_dma(dev, vaddr);
+	kunmap_page_dma(dev, vaddr, true);
 }
 
 static void fill_page_dma_32(struct drm_device *dev, struct i915_page_dma *p,
@@ -374,15 +392,16 @@ static void free_pt(struct drm_device *dev, struct i915_page_table *pt)
 	kfree(pt);
 }
 
-static void gen8_initialize_pt(struct i915_address_space *vm,
-			       struct i915_page_table *pt)
+static void gen8_setup_scratch_pt(struct i915_address_space *vm)
 {
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_pte_t scratch_pte;
 
 	scratch_pte = gen8_pte_encode(vm->scratch.addr,
 				      I915_CACHE_LLC, true);
 
-	fill_page_dma(vm->dev, &pt->base, scratch_pte);
+	fill_page_dma(vm->dev, &ppgtt->scratch_pt->base, scratch_pte);
 }
 
 static struct i915_page_table *alloc_pt(struct drm_device *dev)
@@ -600,8 +619,7 @@ static void __gen8_do_map_pt(gen8_pde_t * const pde,
 	*pde = entry;
 }
 
-static void gen8_initialize_pd(struct i915_address_space *vm,
-			       struct i915_page_directory *pd)
+static void gen8_setup_scratch_pd(struct i915_address_space *vm)
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
@@ -610,7 +628,7 @@ static void gen8_initialize_pd(struct i915_address_space *vm,
 	scratch_pde = gen8_pde_encode(vm->dev, ppgtt->scratch_pt->base.daddr,
 				      I915_CACHE_LLC);
 
-	fill_page_dma(vm->dev, &pd->base, scratch_pde);
+	fill_page_dma(vm->dev, &ppgtt->scratch_pd->base, scratch_pde);
 }
 
 static void gen8_free_page_tables(struct i915_page_directory *pd, struct drm_device *dev)
@@ -688,7 +706,8 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
 		if (IS_ERR(pt))
 			goto unwind_out;
 
-		gen8_initialize_pt(&ppgtt->base, pt);
+		copy_px(ppgtt, pt, ppgtt->scratch_pt);
+
 		pd->page_table[pde] = pt;
 		set_bit(pde, new_pts);
 	}
@@ -746,7 +765,8 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
 		if (IS_ERR(pd))
 			goto unwind_out;
 
-		gen8_initialize_pd(&ppgtt->base, pd);
+		copy_px(ppgtt, pd, ppgtt->scratch_pd);
+
 		pdp->page_directory[pdpe] = pd;
 		set_bit(pdpe, new_pds);
 	}
@@ -937,8 +957,8 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	if (IS_ERR(ppgtt->scratch_pd))
 		return PTR_ERR(ppgtt->scratch_pd);
 
-	gen8_initialize_pt(&ppgtt->base, ppgtt->scratch_pt);
-	gen8_initialize_pd(&ppgtt->base, ppgtt->scratch_pd);
+	gen8_setup_scratch_pt(&ppgtt->base);
+	gen8_setup_scratch_pd(&ppgtt->base);
 
 	ppgtt->base.start = 0;
 	ppgtt->base.total = 1ULL << 32;
@@ -1251,17 +1271,18 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 		kunmap_px(ppgtt, pt_vaddr);
 }
 
-static void gen6_initialize_pt(struct i915_address_space *vm,
-			       struct i915_page_table *pt)
+static void gen6_setup_scratch_pt(struct i915_address_space *vm)
 {
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
 	gen6_pte_t scratch_pte;
 
 	WARN_ON(vm->scratch.addr == 0);
 
 	scratch_pte = vm->pte_encode(vm->scratch.addr,
-			I915_CACHE_LLC, true, 0);
+				     I915_CACHE_LLC, true, 0);
 
-	fill_page_dma_32(vm->dev, &pt->base, scratch_pte);
+	fill_page_dma_32(vm->dev, &ppgtt->scratch_pt->base, scratch_pte);
 }
 
 static int gen6_alloc_va_range(struct i915_address_space *vm,
@@ -1305,7 +1326,7 @@ static int gen6_alloc_va_range(struct i915_address_space *vm,
 			goto unwind_out;
 		}
 
-		gen6_initialize_pt(vm, pt);
+		copy_px(ppgtt, pt, ppgtt->scratch_pt);
 
 		ppgtt->pd.page_table[pde] = pt;
 		set_bit(pde, new_page_tables);
@@ -1388,7 +1409,7 @@ static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
 	if (IS_ERR(ppgtt->scratch_pt))
 		return PTR_ERR(ppgtt->scratch_pt);
 
-	gen6_initialize_pt(&ppgtt->base, ppgtt->scratch_pt);
+	gen6_setup_scratch_pt(&ppgtt->base);
 
 alloc:
 	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 14/20] drm/i915/gtt: Use macros to access dma mapped pages
  2015-05-21 14:37 [PATCH 00/20] ppgtt cleanups / scratch merge Mika Kuoppala
                   ` (12 preceding siblings ...)
  2015-05-21 14:37 ` [PATCH 13/20] drm/i915/gtt: Introduce copy_page_dma and copy_px Mika Kuoppala
@ 2015-05-21 14:37 ` Mika Kuoppala
  2015-05-21 14:37 ` [PATCH 15/20] drm/i915/gtt: Make scratch page i915_page_dma compatible Mika Kuoppala
                   ` (5 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Mika Kuoppala @ 2015-05-21 14:37 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

Add paging structure type agnostic *_px macros to access the
page dma struct, the backing page and the dma address.

This makes the code less cluttered with the internals of
i915_page_dma.
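
The accessors work for any structure that embeds the dma page as
a member named base. A standalone sketch of the pattern (the
struct names here are simplified stand-ins, not the kernel ones):

#include <stdio.h>

struct page_dma { unsigned long daddr; };

struct page_table     { struct page_dma base; /* ... */ };
struct page_directory { struct page_dma base; /* ... */ };

#define px_base(px) (&(px)->base)
#define px_dma(px) (px_base(px)->daddr)

int main(void)
{
	struct page_table pt = { .base = { .daddr = 0x1000 } };
	struct page_directory pd = { .base = { .daddr = 0x2000 } };

	/* Same accessor expression, different outer types. */
	printf("%lx %lx\n", px_dma(&pt), px_dma(&pd));
	return 0;
}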

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 36 +++++++++++++++++++++---------------
 drivers/gpu/drm/i915/i915_gem_gtt.h |  8 ++++++--
 2 files changed, 27 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index fa253cd..c013a4c 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -343,11 +343,17 @@ static void kunmap_page_dma(struct drm_device *dev, void *vaddr, bool dirty)
 	kunmap_atomic(vaddr);
 }
 
-#define kmap_px(px) kmap_page_dma(&(px)->base)
+#define kmap_px(px) kmap_page_dma(px_base(px))
 #define kunmap_px(ppgtt, vaddr) kunmap_page_dma((ppgtt)->base.dev, (vaddr), true);
 
+#define setup_px(dev, px) setup_page_dma((dev), px_base(px))
+#define cleanup_px(dev, px) cleanup_page_dma((dev), px_base(px))
+#define fill_px(dev, px, v) fill_page_dma((dev), px_base(px), (v))
+#define fill32_px(dev, px, v) fill_page_dma_32((dev), px_base(px), (v))
+
 #define copy_px(ppgtt, to, from) \
-	copy_page_dma((ppgtt)->base.dev, &(to)->base, &(from)->base)
+	copy_page_dma((ppgtt)->base.dev, \
+			      px_base(to), px_base(from))
 
 static void copy_page_dma(struct drm_device *dev,
 			  struct i915_page_dma *to,
@@ -387,7 +393,7 @@ static void fill_page_dma_32(struct drm_device *dev, struct i915_page_dma *p,
 
 static void free_pt(struct drm_device *dev, struct i915_page_table *pt)
 {
-	cleanup_page_dma(dev, &pt->base);
+	cleanup_px(dev, pt);
 	kfree(pt->used_ptes);
 	kfree(pt);
 }
@@ -401,7 +407,7 @@ static void gen8_setup_scratch_pt(struct i915_address_space *vm)
 	scratch_pte = gen8_pte_encode(vm->scratch.addr,
 				      I915_CACHE_LLC, true);
 
-	fill_page_dma(vm->dev, &ppgtt->scratch_pt->base, scratch_pte);
+	fill_px(vm->dev, ppgtt->scratch_pt, scratch_pte);
 }
 
 static struct i915_page_table *alloc_pt(struct drm_device *dev)
@@ -421,7 +427,7 @@ static struct i915_page_table *alloc_pt(struct drm_device *dev)
 	if (!pt->used_ptes)
 		goto fail_bitmap;
 
-	ret = setup_page_dma(dev, &pt->base);
+	ret = setup_px(dev, pt);
 	if (ret)
 		goto fail_page_m;
 
@@ -437,8 +443,8 @@ fail_bitmap:
 
 static void free_pd(struct drm_device *dev, struct i915_page_directory *pd)
 {
-	if (pd->base.page) {
-		cleanup_page_dma(dev, &pd->base);
+	if (px_page(pd)) {
+		cleanup_px(dev, pd);
 		kfree(pd->used_pdes);
 		kfree(pd);
 	}
@@ -458,7 +464,7 @@ static struct i915_page_directory *alloc_pd(struct drm_device *dev)
 	if (!pd->used_pdes)
 		goto free_pd;
 
-	ret = setup_page_dma(dev, &pd->base);
+	ret = setup_px(dev, pd);
 	if (ret)
 		goto free_bitmap;
 
@@ -502,7 +508,7 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	int i, ret;
 
 	for (i = GEN8_LEGACY_PDPES - 1; i >= 0; i--) {
-		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
+		dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
 
 		ret = gen8_write_pdp(ring, i, pd_daddr);
 		if (ret)
@@ -543,7 +549,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 
 		pt = pd->page_table[pde];
 
-		if (WARN_ON(!pt->base.page))
+		if (WARN_ON(!px_page(pt)))
 			continue;
 
 		last_pte = pte + num_entries;
@@ -615,7 +621,7 @@ static void __gen8_do_map_pt(gen8_pde_t * const pde,
 			     struct drm_device *dev)
 {
 	gen8_pde_t entry =
-		gen8_pde_encode(dev, pt->base.daddr, I915_CACHE_LLC);
+		gen8_pde_encode(dev, px_dma(pt), I915_CACHE_LLC);
 	*pde = entry;
 }
 
@@ -625,10 +631,10 @@ static void gen8_setup_scratch_pd(struct i915_address_space *vm)
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_pde_t scratch_pde;
 
-	scratch_pde = gen8_pde_encode(vm->dev, ppgtt->scratch_pt->base.daddr,
+	scratch_pde = gen8_pde_encode(vm->dev, px_dma(ppgtt->scratch_pt),
 				      I915_CACHE_LLC);
 
-	fill_page_dma(vm->dev, &ppgtt->scratch_pd->base, scratch_pde);
+	fill_px(vm->dev, ppgtt->scratch_pd, scratch_pde);
 }
 
 static void gen8_free_page_tables(struct i915_page_directory *pd, struct drm_device *dev)
@@ -988,7 +994,7 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
 	gen6_for_each_pde(unused, &ppgtt->pd, start, length, temp, pde) {
 		u32 expected;
 		gen6_pte_t *pt_vaddr;
-		dma_addr_t pt_addr = ppgtt->pd.page_table[pde]->base.daddr;
+		const dma_addr_t pt_addr = px_dma(ppgtt->pd.page_table[pde]);
 		pd_entry = readl(ppgtt->pd_addr + pde);
 		expected = (GEN6_PDE_ADDR_ENCODE(pt_addr) | GEN6_PDE_VALID);
 
@@ -1035,7 +1041,7 @@ static void gen6_write_pde(struct i915_page_directory *pd,
 		container_of(pd, struct i915_hw_ppgtt, pd);
 	u32 pd_entry;
 
-	pd_entry = GEN6_PDE_ADDR_ENCODE(pt->base.daddr);
+	pd_entry = GEN6_PDE_ADDR_ENCODE(px_dma(pt));
 	pd_entry |= GEN6_PDE_VALID;
 
 	writel(pd_entry, ppgtt->pd_addr + pde);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 666decc..006b839 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -213,6 +213,10 @@ struct i915_page_dma {
 	};
 };
 
+#define px_base(px) (&(px)->base)
+#define px_page(px) (px_base(px)->page)
+#define px_dma(px) (px_base(px)->daddr)
+
 struct i915_page_table {
 	struct i915_page_dma base;
 
@@ -475,8 +479,8 @@ static inline dma_addr_t
 i915_page_dir_dma_addr(const struct i915_hw_ppgtt *ppgtt, const unsigned n)
 {
 	return test_bit(n, ppgtt->pdp.used_pdpes) ?
-		ppgtt->pdp.page_directory[n]->base.daddr :
-		ppgtt->scratch_pd->base.daddr;
+		px_dma(ppgtt->pdp.page_directory[n]) :
+		px_dma(ppgtt->scratch_pd);
 }
 
 int i915_gem_gtt_init(struct drm_device *dev);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 15/20] drm/i915/gtt: Make scratch page i915_page_dma compatible
  2015-05-21 14:37 [PATCH 00/20] ppgtt cleanups / scratch merge Mika Kuoppala
                   ` (13 preceding siblings ...)
  2015-05-21 14:37 ` [PATCH 14/20] drm/i915/gtt: Use macros to access dma mapped pages Mika Kuoppala
@ 2015-05-21 14:37 ` Mika Kuoppala
  2015-05-21 14:37 ` [PATCH 16/20] drm/i915/gtt: Fill scratch page Mika Kuoppala
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Mika Kuoppala @ 2015-05-21 14:37 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

Lay out the scratch page structure in a similar manner to the
other paging structures. This allows us to use the same tools
for setup and teardown.
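
A quick sketch of the payoff (helper names from the diff below,
error handling elided): the scratch page now goes through the
very same dma page tools as the other paging structures.

	/* setup: the generic dma page helper, stricter gfp flags */
	ret = __setup_page_dma(vm->dev, px_base(sp), GFP_DMA32 | __GFP_ZERO);

	/* teardown: identical to any other paging structure */
	cleanup_px(vm->dev, sp);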

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 91 ++++++++++++++++++++-----------------
 drivers/gpu/drm/i915/i915_gem_gtt.h |  9 ++--
 2 files changed, 54 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index c013a4c..ccdb35f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -301,11 +301,12 @@ static gen6_pte_t iris_pte_encode(dma_addr_t addr,
 	return pte;
 }
 
-static int setup_page_dma(struct drm_device *dev, struct i915_page_dma *p)
+static int __setup_page_dma(struct drm_device *dev,
+			    struct i915_page_dma *p, gfp_t flags)
 {
 	struct device *device = &dev->pdev->dev;
 
-	p->page = alloc_page(GFP_KERNEL);
+	p->page = alloc_page(flags);
 	if (!p->page)
 		return -ENOMEM;
 
@@ -320,6 +321,11 @@ static int setup_page_dma(struct drm_device *dev, struct i915_page_dma *p)
 	return 0;
 }
 
+static int setup_page_dma(struct drm_device *dev, struct i915_page_dma *p)
+{
+	return __setup_page_dma(dev, p, GFP_KERNEL);
+}
+
 static void cleanup_page_dma(struct drm_device *dev, struct i915_page_dma *p)
 {
 	if (WARN_ON(!p->page))
@@ -404,7 +410,7 @@ static void gen8_setup_scratch_pt(struct i915_address_space *vm)
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_pte_t scratch_pte;
 
-	scratch_pte = gen8_pte_encode(vm->scratch.addr,
+	scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page),
 				      I915_CACHE_LLC, true);
 
 	fill_px(vm->dev, ppgtt->scratch_pt, scratch_pte);
@@ -532,7 +538,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 	unsigned num_entries = length >> PAGE_SHIFT;
 	unsigned last_pte, i;
 
-	scratch_pte = gen8_pte_encode(ppgtt->base.scratch.addr,
+	scratch_pte = gen8_pte_encode(px_dma(ppgtt->base.scratch_page),
 				      I915_CACHE_LLC, use_scratch);
 
 	while (num_entries) {
@@ -641,7 +647,7 @@ static void gen8_free_page_tables(struct i915_page_directory *pd, struct drm_dev
 {
 	int i;
 
-	if (!pd->base.page)
+	if (!px_page(pd))
 		return;
 
 	for_each_set_bit(i, pd->used_pdes, I915_PDES) {
@@ -989,7 +995,7 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
 	uint32_t  pte, pde, temp;
 	uint32_t start = ppgtt->base.start, length = ppgtt->base.total;
 
-	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC, true, 0);
+	scratch_pte = vm->pte_encode(px_dma(vm->scratch_page), I915_CACHE_LLC, true, 0);
 
 	gen6_for_each_pde(unused, &ppgtt->pd, start, length, temp, pde) {
 		u32 expected;
@@ -1224,7 +1230,8 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 	unsigned first_pte = first_entry % GEN6_PTES;
 	unsigned last_pte, i;
 
-	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC, true, 0);
+	scratch_pte = vm->pte_encode(px_dma(vm->scratch_page),
+				     I915_CACHE_LLC, true, 0);
 
 	while (num_entries) {
 		last_pte = first_pte + num_entries;
@@ -1283,12 +1290,12 @@ static void gen6_setup_scratch_pt(struct i915_address_space *vm)
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen6_pte_t scratch_pte;
 
-	WARN_ON(vm->scratch.addr == 0);
+	WARN_ON(px_dma(vm->scratch_page) == 0);
 
-	scratch_pte = vm->pte_encode(vm->scratch.addr,
+	scratch_pte = vm->pte_encode(px_dma(vm->scratch_page),
 				     I915_CACHE_LLC, true, 0);
 
-	fill_page_dma_32(vm->dev, &ppgtt->scratch_pt->base, scratch_pte);
+	fill32_px(vm->dev, ppgtt->scratch_pt, scratch_pte);
 }
 
 static int gen6_alloc_va_range(struct i915_address_space *vm,
@@ -1523,13 +1530,14 @@ static int __hw_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
 	ppgtt->base.dev = dev;
-	ppgtt->base.scratch = dev_priv->gtt.base.scratch;
+	ppgtt->base.scratch_page = dev_priv->gtt.base.scratch_page;
 
 	if (INTEL_INFO(dev)->gen < 8)
 		return gen6_ppgtt_init(ppgtt);
 	else
 		return gen8_ppgtt_init(ppgtt);
 }
+
 int i915_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -1844,7 +1852,7 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
 		 first_entry, num_entries, max_entries))
 		num_entries = max_entries;
 
-	scratch_pte = gen8_pte_encode(vm->scratch.addr,
+	scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page),
 				      I915_CACHE_LLC,
 				      use_scratch);
 	for (i = 0; i < num_entries; i++)
@@ -1870,7 +1878,8 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 		 first_entry, num_entries, max_entries))
 		num_entries = max_entries;
 
-	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC, use_scratch, 0);
+	scratch_pte = vm->pte_encode(px_dma(vm->scratch_page),
+				     I915_CACHE_LLC, use_scratch, 0);
 
 	for (i = 0; i < num_entries; i++)
 		iowrite32(scratch_pte, &gtt_base[i]);
@@ -2127,42 +2136,40 @@ void i915_global_gtt_cleanup(struct drm_device *dev)
 	vm->cleanup(vm);
 }
 
-static int setup_scratch_page(struct drm_device *dev)
+static int alloc_scratch_page(struct i915_address_space *vm)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct page *page;
-	dma_addr_t dma_addr;
+	struct i915_page_scratch *sp;
+	int ret;
+
+	WARN_ON(vm->scratch_page);
 
-	page = alloc_page(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO);
-	if (page == NULL)
+	sp = kzalloc(sizeof(*sp), GFP_KERNEL);
+	if (sp == NULL)
 		return -ENOMEM;
-	set_pages_uc(page, 1);
 
-#ifdef CONFIG_INTEL_IOMMU
-	dma_addr = pci_map_page(dev->pdev, page, 0, PAGE_SIZE,
-				PCI_DMA_BIDIRECTIONAL);
-	if (pci_dma_mapping_error(dev->pdev, dma_addr)) {
-		__free_page(page);
-		return -EINVAL;
+	ret = __setup_page_dma(vm->dev, px_base(sp), GFP_DMA32 | __GFP_ZERO);
+	if (ret) {
+		kfree(sp);
+		return ret;
 	}
-#else
-	dma_addr = page_to_phys(page);
-#endif
-	dev_priv->gtt.base.scratch.page = page;
-	dev_priv->gtt.base.scratch.addr = dma_addr;
+
+	set_pages_uc(px_page(sp), 1);
+
+	vm->scratch_page = sp;
 
 	return 0;
 }
 
-static void teardown_scratch_page(struct drm_device *dev)
+static void free_scratch_page(struct i915_address_space *vm)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct page *page = dev_priv->gtt.base.scratch.page;
+	struct i915_page_scratch *sp = vm->scratch_page;
 
-	set_pages_wb(page, 1);
-	pci_unmap_page(dev->pdev, dev_priv->gtt.base.scratch.addr,
-		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-	__free_page(page);
+	set_pages_wb(px_page(sp), 1);
+
+	cleanup_px(vm->dev, sp);
+	kfree(sp);
+
+	vm->scratch_page = NULL;
 }
 
 static unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
@@ -2270,7 +2277,7 @@ static int ggtt_probe_common(struct drm_device *dev,
 		return -ENOMEM;
 	}
 
-	ret = setup_scratch_page(dev);
+	ret = alloc_scratch_page(&dev_priv->gtt.base);
 	if (ret) {
 		DRM_ERROR("Scratch setup failed\n");
 		/* iounmap will also get called at remove, but meh */
@@ -2449,7 +2456,7 @@ static void gen6_gmch_remove(struct i915_address_space *vm)
 	struct i915_gtt *gtt = container_of(vm, struct i915_gtt, base);
 
 	iounmap(gtt->gsm);
-	teardown_scratch_page(vm->dev);
+	free_scratch_page(vm);
 }
 
 static int i915_gmch_probe(struct drm_device *dev,
@@ -2513,13 +2520,13 @@ int i915_gem_gtt_init(struct drm_device *dev)
 		dev_priv->gtt.base.cleanup = gen6_gmch_remove;
 	}
 
+	gtt->base.dev = dev;
+
 	ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
 			     &gtt->mappable_base, &gtt->mappable_end);
 	if (ret)
 		return ret;
 
-	gtt->base.dev = dev;
-
 	/* GMADR is the PCI mmio aperture into the global GTT. */
 	DRM_INFO("Memory usable by graphics device = %lluM\n",
 		 gtt->base.total >> 20);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 006b839..1fd4041 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -217,6 +217,10 @@ struct i915_page_dma {
 #define px_page(px) (px_base(px)->page)
 #define px_dma(px) (px_base(px)->daddr)
 
+struct i915_page_scratch {
+	struct i915_page_dma base;
+};
+
 struct i915_page_table {
 	struct i915_page_dma base;
 
@@ -243,10 +247,7 @@ struct i915_address_space {
 	u64 start;		/* Start offset always 0 for dri2 */
 	u64 total;		/* size addr space maps (ex. 2GB for ggtt) */
 
-	struct {
-		dma_addr_t addr;
-		struct page *page;
-	} scratch;
+	struct i915_page_scratch *scratch_page;
 
 	/**
 	 * List of objects currently involved in rendering.
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 16/20] drm/i915/gtt: Fill scratch page
  2015-05-21 14:37 [PATCH 00/20] ppgtt cleanups / scratch merge Mika Kuoppala
                   ` (14 preceding siblings ...)
  2015-05-21 14:37 ` [PATCH 15/20] drm/i915/gtt: Make scratch page i915_page_dma compatible Mika Kuoppala
@ 2015-05-21 14:37 ` Mika Kuoppala
  2015-05-21 14:56   ` Chris Wilson
  2015-05-21 14:37 ` [PATCH 17/20] drm/i915/gtt: Pin vma during virtual address allocation Mika Kuoppala
                   ` (3 subsequent siblings)
  19 siblings, 1 reply; 28+ messages in thread
From: Mika Kuoppala @ 2015-05-21 14:37 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

During review of the dynamic page tables series, I was able
to hit a lite restore bug with execlists. I assume that,
due to an incorrect pd, the batch ran out of its legitimate
address space and into the scratch page area. The ACTHD kept
increasing because the scratch page was all zeroes (MI_NOOPs).
And as the gen8 address space is quite large, hangcheck happily
waited for a long long time, keeping the process effectively
stuck.

According to Chris Wilson, any modern gpu will grind to a halt
if it encounters commands of all ones. This seemed to do the
trick: a hang was declared promptly when the gpu wandered into
scratch land.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index ccdb35f..26d1d45 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2153,6 +2153,7 @@ static int alloc_scratch_page(struct i915_address_space *vm)
 		return ret;
 	}
 
+	fill_px(vm->dev, sp, ~0ULL);
 	set_pages_uc(px_page(sp), 1);
 
 	vm->scratch_page = sp;
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 17/20] drm/i915/gtt: Pin vma during virtual address allocation
  2015-05-21 14:37 [PATCH 00/20] ppgtt cleanups / scratch merge Mika Kuoppala
                   ` (15 preceding siblings ...)
  2015-05-21 14:37 ` [PATCH 16/20] drm/i915/gtt: Fill scratch page Mika Kuoppala
@ 2015-05-21 14:37 ` Mika Kuoppala
  2015-05-21 14:37 ` [PATCH 18/20] drm/i915/gtt: Cleanup page directory encoding Mika Kuoppala
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 28+ messages in thread
From: Mika Kuoppala @ 2015-05-21 14:37 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

Dynamic page table allocation might wake the shrinker
when memory is requested for page table structures.
As this happens when we try to allocate the virtual address
during binding, our vma might be among the targets for eviction.

Shield our vma from the shrinker by incrementing its pin count
before the virtual address is allocated.

The proper place to fix this would be in gem, inside of
i915_vma_pin(), as Chris suggests. But we don't have that yet,
so take the short cut as an intermediate solution.

Testcase: igt/gem_ctx_thrash
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 26d1d45..37dee49 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2882,9 +2882,12 @@ int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
 				    vma->node.size,
 				    VM_TO_TRACE_NAME(vma->vm));
 
+		/* XXX: i915_vma_pin() will fix this +- hack */
+		vma->pin_count++;
 		ret = vma->vm->allocate_va_range(vma->vm,
 						 vma->node.start,
 						 vma->node.size);
+		vma->pin_count--;
 		if (ret)
 			return ret;
 	}
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 18/20] drm/i915/gtt: Cleanup page directory encoding
  2015-05-21 14:37 [PATCH 00/20] ppgtt cleanups / scratch merge Mika Kuoppala
                   ` (16 preceding siblings ...)
  2015-05-21 14:37 ` [PATCH 17/20] drm/i915/gtt: Pin vma during virtual address allocation Mika Kuoppala
@ 2015-05-21 14:37 ` Mika Kuoppala
  2015-05-21 14:37 ` [PATCH 19/20] drm/i915/gtt: Move scratch_pd and scratch_pt into vm area Mika Kuoppala
  2015-05-21 14:37 ` [PATCH 20/20] drm/i915/gtt: One instance of scratch page table/directory Mika Kuoppala
  19 siblings, 0 replies; 28+ messages in thread
From: Mika Kuoppala @ 2015-05-21 14:37 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

Write the page directory entry without going through a
superfluous indirect function. Also remove the unused device
parameter from the encode function.

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 21 ++++++---------------
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 37dee49..badbf13 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -192,9 +192,8 @@ static gen8_pte_t gen8_pte_encode(dma_addr_t addr,
 	return pte;
 }
 
-static gen8_pde_t gen8_pde_encode(struct drm_device *dev,
-				  dma_addr_t addr,
-				  enum i915_cache_level level)
+static gen8_pde_t gen8_pde_encode(const dma_addr_t addr,
+				  const enum i915_cache_level level)
 {
 	gen8_pde_t pde = _PAGE_PRESENT | _PAGE_RW;
 	pde |= addr;
@@ -622,22 +621,13 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 		kunmap_px(ppgtt, pt_vaddr);
 }
 
-static void __gen8_do_map_pt(gen8_pde_t * const pde,
-			     struct i915_page_table *pt,
-			     struct drm_device *dev)
-{
-	gen8_pde_t entry =
-		gen8_pde_encode(dev, px_dma(pt), I915_CACHE_LLC);
-	*pde = entry;
-}
-
 static void gen8_setup_scratch_pd(struct i915_address_space *vm)
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_pde_t scratch_pde;
 
-	scratch_pde = gen8_pde_encode(vm->dev, px_dma(ppgtt->scratch_pt),
+	scratch_pde = gen8_pde_encode(px_dma(ppgtt->scratch_pt),
 				      I915_CACHE_LLC);
 
 	fill_px(vm->dev, ppgtt->scratch_pd, scratch_pde);
@@ -899,7 +889,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	/* Allocations have completed successfully, so set the bitmaps, and do
 	 * the mappings. */
 	gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
-		gen8_pde_t *const page_directory = kmap_px(pd);
+		gen8_pde_t * const page_directory = kmap_px(pd);
 		struct i915_page_table *pt;
 		uint64_t pd_len = gen8_clamp_pd(start, length);
 		uint64_t pd_start = start;
@@ -923,7 +913,8 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 			set_bit(pde, pd->used_pdes);
 
 			/* Map the PDE to the page table */
-			__gen8_do_map_pt(page_directory + pde, pt, vm->dev);
+			page_directory[pde] = gen8_pde_encode(px_dma(pt),
+							      I915_CACHE_LLC);
 
 			/* NB: We haven't yet mapped ptes to pages. At this
 			 * point we're still relying on insert_entries() */
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 19/20] drm/i915/gtt: Move scratch_pd and scratch_pt into vm area
  2015-05-21 14:37 [PATCH 00/20] ppgtt cleanups / scratch merge Mika Kuoppala
                   ` (17 preceding siblings ...)
  2015-05-21 14:37 ` [PATCH 18/20] drm/i915/gtt: Cleanup page directory encoding Mika Kuoppala
@ 2015-05-21 14:37 ` Mika Kuoppala
  2015-05-21 14:37 ` [PATCH 20/20] drm/i915/gtt: One instance of scratch page table/directory Mika Kuoppala
  19 siblings, 0 replies; 28+ messages in thread
From: Mika Kuoppala @ 2015-05-21 14:37 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

The scratch page is part of i915_address_space because we
have only one of it. Move the other scratch entities into
the same struct. This is a preparatory patch for having
only one instance of each scratch_pt/pd.
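
Reduced to the fields involved, the layout after this patch is
(sketch; all other members elided):

struct i915_address_space {
	/* ... */
	struct i915_page_scratch *scratch_page;
	struct i915_page_table *scratch_pt;
	struct i915_page_directory *scratch_pd;
	/* ... */
};

struct i915_hw_ppgtt {
	struct i915_address_space base;	/* scratch now lives in here */
	/* no private scratch_pt/scratch_pd anymore */
	/* ... */
};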

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 65 +++++++++++++++++--------------------
 drivers/gpu/drm/i915/i915_gem_gtt.h |  7 ++--
 2 files changed, 32 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index badbf13..6910996 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -405,14 +405,12 @@ static void free_pt(struct drm_device *dev, struct i915_page_table *pt)
 
 static void gen8_setup_scratch_pt(struct i915_address_space *vm)
 {
-	struct i915_hw_ppgtt *ppgtt =
-		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_pte_t scratch_pte;
 
 	scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page),
 				      I915_CACHE_LLC, true);
 
-	fill_px(vm->dev, ppgtt->scratch_pt, scratch_pte);
+	fill_px(vm->dev, vm->scratch_pt, scratch_pte);
 }
 
 static struct i915_page_table *alloc_pt(struct drm_device *dev)
@@ -513,7 +511,7 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	int i, ret;
 
 	for (i = GEN8_LEGACY_PDPES - 1; i >= 0; i--) {
-		dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
+		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
 
 		ret = gen8_write_pdp(ring, i, pd_daddr);
 		if (ret)
@@ -623,14 +621,11 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 
 static void gen8_setup_scratch_pd(struct i915_address_space *vm)
 {
-	struct i915_hw_ppgtt *ppgtt =
-		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_pde_t scratch_pde;
 
-	scratch_pde = gen8_pde_encode(px_dma(ppgtt->scratch_pt),
-				      I915_CACHE_LLC);
+	scratch_pde = gen8_pde_encode(px_dma(vm->scratch_pt), I915_CACHE_LLC);
 
-	fill_px(vm->dev, ppgtt->scratch_pd, scratch_pde);
+	fill_px(vm->dev, vm->scratch_pd, scratch_pde);
 }
 
 static void gen8_free_page_tables(struct i915_page_directory *pd, struct drm_device *dev)
@@ -663,8 +658,8 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 		free_pd(ppgtt->base.dev, ppgtt->pdp.page_directory[i]);
 	}
 
-	free_pd(ppgtt->base.dev, ppgtt->scratch_pd);
-	free_pt(ppgtt->base.dev, ppgtt->scratch_pt);
+	free_pd(vm->dev, vm->scratch_pd);
+	free_pt(vm->dev, vm->scratch_pt);
 }
 
 /**
@@ -700,7 +695,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
 		/* Don't reallocate page tables */
 		if (pt) {
 			/* Scratch is never allocated this way */
-			WARN_ON(pt == ppgtt->scratch_pt);
+			WARN_ON(pt == ppgtt->base.scratch_pt);
 			continue;
 		}
 
@@ -708,7 +703,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
 		if (IS_ERR(pt))
 			goto unwind_out;
 
-		copy_px(ppgtt, pt, ppgtt->scratch_pt);
+		copy_px(ppgtt, pt, ppgtt->base.scratch_pt);
 
 		pd->page_table[pde] = pt;
 		set_bit(pde, new_pts);
@@ -767,7 +762,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
 		if (IS_ERR(pd))
 			goto unwind_out;
 
-		copy_px(ppgtt, pd, ppgtt->scratch_pd);
+		copy_px(ppgtt, pd, ppgtt->base.scratch_pd);
 
 		pdp->page_directory[pdpe] = pd;
 		set_bit(pdpe, new_pds);
@@ -952,13 +947,13 @@ err_out:
  */
 static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 {
-	ppgtt->scratch_pt = alloc_pt(ppgtt->base.dev);
-	if (IS_ERR(ppgtt->scratch_pt))
-		return PTR_ERR(ppgtt->scratch_pt);
+	ppgtt->base.scratch_pt = alloc_pt(ppgtt->base.dev);
+	if (IS_ERR(ppgtt->base.scratch_pt))
+		return PTR_ERR(ppgtt->base.scratch_pt);
 
-	ppgtt->scratch_pd = alloc_pd(ppgtt->base.dev);
-	if (IS_ERR(ppgtt->scratch_pd))
-		return PTR_ERR(ppgtt->scratch_pd);
+	ppgtt->base.scratch_pd = alloc_pd(ppgtt->base.dev);
+	if (IS_ERR(ppgtt->base.scratch_pd))
+		return PTR_ERR(ppgtt->base.scratch_pd);
 
 	gen8_setup_scratch_pt(&ppgtt->base);
 	gen8_setup_scratch_pd(&ppgtt->base);
@@ -986,7 +981,8 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
 	uint32_t  pte, pde, temp;
 	uint32_t start = ppgtt->base.start, length = ppgtt->base.total;
 
-	scratch_pte = vm->pte_encode(px_dma(vm->scratch_page), I915_CACHE_LLC, true, 0);
+	scratch_pte = vm->pte_encode(px_dma(vm->scratch_page),
+				     I915_CACHE_LLC, true, 0);
 
 	gen6_for_each_pde(unused, &ppgtt->pd, start, length, temp, pde) {
 		u32 expected;
@@ -1277,8 +1273,6 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 
 static void gen6_setup_scratch_pt(struct i915_address_space *vm)
 {
-	struct i915_hw_ppgtt *ppgtt =
-		container_of(vm, struct i915_hw_ppgtt, base);
 	gen6_pte_t scratch_pte;
 
 	WARN_ON(px_dma(vm->scratch_page) == 0);
@@ -1286,7 +1280,7 @@ static void gen6_setup_scratch_pt(struct i915_address_space *vm)
 	scratch_pte = vm->pte_encode(px_dma(vm->scratch_page),
 				     I915_CACHE_LLC, true, 0);
 
-	fill32_px(vm->dev, ppgtt->scratch_pt, scratch_pte);
+	fill32_px(vm->dev, vm->scratch_pt, scratch_pte);
 }
 
 static int gen6_alloc_va_range(struct i915_address_space *vm,
@@ -1316,7 +1310,7 @@ static int gen6_alloc_va_range(struct i915_address_space *vm,
 	 * tables.
 	 */
 	gen6_for_each_pde(pt, &ppgtt->pd, start, length, temp, pde) {
-		if (pt != ppgtt->scratch_pt) {
+		if (pt != vm->scratch_pt) {
 			WARN_ON(bitmap_empty(pt->used_ptes, GEN6_PTES));
 			continue;
 		}
@@ -1330,7 +1324,7 @@ static int gen6_alloc_va_range(struct i915_address_space *vm,
 			goto unwind_out;
 		}
 
-		copy_px(ppgtt, pt, ppgtt->scratch_pt);
+		copy_px(ppgtt, pt, vm->scratch_pt);
 
 		ppgtt->pd.page_table[pde] = pt;
 		set_bit(pde, new_page_tables);
@@ -1371,7 +1365,7 @@ unwind_out:
 	for_each_set_bit(pde, new_page_tables, I915_PDES) {
 		struct i915_page_table *pt = ppgtt->pd.page_table[pde];
 
-		ppgtt->pd.page_table[pde] = ppgtt->scratch_pt;
+		ppgtt->pd.page_table[pde] = vm->scratch_pt;
 		free_pt(vm->dev, pt);
 	}
 
@@ -1382,19 +1376,18 @@ unwind_out:
 static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 {
 	struct i915_hw_ppgtt *ppgtt =
-		container_of(vm, struct i915_hw_ppgtt, base);
+				container_of(vm, struct i915_hw_ppgtt, base);
 	struct i915_page_table *pt;
 	uint32_t pde;
 
-
 	drm_mm_remove_node(&ppgtt->node);
 
 	gen6_for_all_pdes(pt, ppgtt, pde) {
-		if (pt != ppgtt->scratch_pt)
+		if (pt != vm->scratch_pt)
 			free_pt(ppgtt->base.dev, pt);
 	}
 
-	free_pt(ppgtt->base.dev, ppgtt->scratch_pt);
+	free_pt(vm->dev, vm->scratch_pt);
 }
 
 static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
@@ -1409,9 +1402,9 @@ static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
 	 * size. We allocate at the top of the GTT to avoid fragmentation.
 	 */
 	BUG_ON(!drm_mm_initialized(&dev_priv->gtt.base.mm));
-	ppgtt->scratch_pt = alloc_pt(ppgtt->base.dev);
-	if (IS_ERR(ppgtt->scratch_pt))
-		return PTR_ERR(ppgtt->scratch_pt);
+	ppgtt->base.scratch_pt = alloc_pt(ppgtt->base.dev);
+	if (IS_ERR(ppgtt->base.scratch_pt))
+		return PTR_ERR(ppgtt->base.scratch_pt);
 
 	gen6_setup_scratch_pt(&ppgtt->base);
 
@@ -1444,7 +1437,7 @@ alloc:
 	return 0;
 
 err_out:
-	free_pt(ppgtt->base.dev, ppgtt->scratch_pt);
+	free_pt(ppgtt->base.dev, ppgtt->base.scratch_pt);
 	return ret;
 }
 
@@ -1460,7 +1453,7 @@ static void gen6_scratch_va_range(struct i915_hw_ppgtt *ppgtt,
 	uint32_t pde, temp;
 
 	gen6_for_each_pde(unused, &ppgtt->pd, start, length, temp, pde)
-		ppgtt->pd.page_table[pde] = ppgtt->scratch_pt;
+		ppgtt->pd.page_table[pde] = ppgtt->base.scratch_pt;
 }
 
 static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 1fd4041..ba46374 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -248,6 +248,8 @@ struct i915_address_space {
 	u64 total;		/* size addr space maps (ex. 2GB for ggtt) */
 
 	struct i915_page_scratch *scratch_page;
+	struct i915_page_table *scratch_pt;
+	struct i915_page_directory *scratch_pd;
 
 	/**
 	 * List of objects currently involved in rendering.
@@ -337,9 +339,6 @@ struct i915_hw_ppgtt {
 		struct i915_page_directory pd;
 	};
 
-	struct i915_page_table *scratch_pt;
-	struct i915_page_directory *scratch_pd;
-
 	struct drm_i915_file_private *file_priv;
 
 	gen6_pte_t __iomem *pd_addr;
@@ -481,7 +480,7 @@ i915_page_dir_dma_addr(const struct i915_hw_ppgtt *ppgtt, const unsigned n)
 {
 	return test_bit(n, ppgtt->pdp.used_pdpes) ?
 		px_dma(ppgtt->pdp.page_directory[n]) :
-		px_dma(ppgtt->scratch_pd);
+		px_dma(ppgtt->base.scratch_pd);
 }
 
 int i915_gem_gtt_init(struct drm_device *dev);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 20/20] drm/i915/gtt: One instance of scratch page table/directory
  2015-05-21 14:37 [PATCH 00/20] ppgtt cleanups / scratch merge Mika Kuoppala
                   ` (18 preceding siblings ...)
  2015-05-21 14:37 ` [PATCH 19/20] drm/i915/gtt: Move scratch_pd and scratch_pt into vm area Mika Kuoppala
@ 2015-05-21 14:37 ` Mika Kuoppala
  2015-05-21 18:27   ` shuang.he
  19 siblings, 1 reply; 28+ messages in thread
From: Mika Kuoppala @ 2015-05-21 14:37 UTC (permalink / raw)
  To: intel-gfx; +Cc: miku

As we use one scratch page for all ppgtt instances, we can
use one scratch page table and one scratch page directory
across all ppgtt instances, saving 2 pages + structs per ppgtt.
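
Condensed from the diff below, the ownership model is: the
global GTT allocates the scratch chain once at probe time, every
ppgtt merely aliases those pointers, and only the ggtt path frees
them on cleanup (sketch, error handling elided):

	if (i915_is_ggtt(vm))
		return setup_scratch_ggtt(vm);	/* real allocation */

	/* ppgtt: borrow the single scratch chain from the ggtt */
	vm->scratch_page = ggtt_vm->scratch_page;
	vm->scratch_pt = ggtt_vm->scratch_pt;
	vm->scratch_pd = ggtt_vm->scratch_pd;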

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 288 +++++++++++++++++++++++-------------
 1 file changed, 184 insertions(+), 104 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 6910996..6706081 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -349,7 +349,10 @@ static void kunmap_page_dma(struct drm_device *dev, void *vaddr, bool dirty)
 }
 
 #define kmap_px(px) kmap_page_dma(px_base(px))
-#define kunmap_px(ppgtt, vaddr) kunmap_page_dma((ppgtt)->base.dev, (vaddr), true);
+#define kunmap_px(ppgtt, vaddr) \
+	kunmap_page_dma((ppgtt)->base.dev, (vaddr), true)
+#define kunmap_readonly_px(ppgtt, vaddr) \
+	kunmap_page_dma((ppgtt)->base.dev, (vaddr), false)
 
 #define setup_px(dev, px) setup_page_dma((dev), px_base(px))
 #define cleanup_px(dev, px) cleanup_page_dma((dev), px_base(px))
@@ -403,16 +406,6 @@ static void free_pt(struct drm_device *dev, struct i915_page_table *pt)
 	kfree(pt);
 }
 
-static void gen8_setup_scratch_pt(struct i915_address_space *vm)
-{
-	gen8_pte_t scratch_pte;
-
-	scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page),
-				      I915_CACHE_LLC, true);
-
-	fill_px(vm->dev, vm->scratch_pt, scratch_pte);
-}
-
 static struct i915_page_table *alloc_pt(struct drm_device *dev)
 {
 	struct i915_page_table *pt;
@@ -481,6 +474,175 @@ free_pd:
 	return ERR_PTR(ret);
 }
 
+static int alloc_scratch_page(struct i915_address_space *vm)
+{
+	struct i915_page_scratch *sp;
+	int ret;
+
+	WARN_ON(vm->scratch_page);
+
+	sp = kzalloc(sizeof(*sp), GFP_KERNEL);
+	if (sp == NULL)
+		return -ENOMEM;
+
+	ret = __setup_page_dma(vm->dev, px_base(sp), GFP_DMA32);
+	if (ret) {
+		kfree(sp);
+		return ret;
+	}
+
+	fill_px(vm->dev, sp, ~0ULL);
+	set_pages_uc(px_page(sp), 1);
+
+	vm->scratch_page = sp;
+
+	return 0;
+}
+
+static void free_scratch_page(struct i915_address_space *vm)
+{
+	struct i915_page_scratch *sp = vm->scratch_page;
+
+	set_pages_wb(px_page(sp), 1);
+
+	cleanup_px(vm->dev, sp);
+	kfree(sp);
+
+	vm->scratch_page = NULL;
+}
+
+static void gen6_setup_scratch_pt(struct i915_address_space *vm)
+{
+	gen6_pte_t scratch_pte;
+
+	WARN_ON(px_dma(vm->scratch_page) == 0);
+
+	scratch_pte = vm->pte_encode(px_dma(vm->scratch_page),
+				     I915_CACHE_LLC, true, 0);
+
+	fill32_px(vm->dev, vm->scratch_pt, scratch_pte);
+}
+
+static void gen8_setup_scratch_pt(struct i915_address_space *vm)
+{
+	gen8_pte_t scratch_pte;
+
+	WARN_ON(px_dma(vm->scratch_page) == 0);
+
+	scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page),
+				      I915_CACHE_LLC, true);
+
+	fill_px(vm->dev, vm->scratch_pt, scratch_pte);
+}
+
+static void gen8_setup_scratch_pd(struct i915_address_space *vm)
+{
+	gen8_pde_t scratch_pde;
+
+	WARN_ON(px_dma(vm->scratch_pt) == 0);
+
+	scratch_pde = gen8_pde_encode(px_dma(vm->scratch_pt), I915_CACHE_LLC);
+
+	fill_px(vm->dev, vm->scratch_pd, scratch_pde);
+}
+
+static int setup_scratch_ggtt(struct i915_address_space *vm)
+{
+	int ret;
+
+	ret = alloc_scratch_page(vm);
+	if (ret)
+		return ret;
+
+	WARN_ON(vm->scratch_pt);
+
+	if (INTEL_INFO(vm->dev)->gen < 6)
+		return 0;
+
+	vm->scratch_pt = alloc_pt(vm->dev);
+	if (IS_ERR(vm->scratch_pt))
+		return PTR_ERR(vm->scratch_pt);
+
+	if (INTEL_INFO(vm->dev)->gen >= 8) {
+		gen8_setup_scratch_pt(vm);
+
+		WARN_ON(vm->scratch_pd);
+
+		vm->scratch_pd = alloc_pd(vm->dev);
+		if (IS_ERR(vm->scratch_pd)) {
+			ret = PTR_ERR(vm->scratch_pd);
+			goto err_pd;
+		}
+
+		gen8_setup_scratch_pd(vm);
+	} else {
+		gen6_setup_scratch_pt(vm);
+	}
+
+	return 0;
+
+err_pd:
+	free_pt(vm->dev, vm->scratch_pt);
+	return ret;
+}
+
+static int setup_scratch(struct i915_address_space *vm)
+{
+	struct i915_address_space *ggtt_vm = &to_i915(vm->dev)->gtt.base;
+
+	if (i915_is_ggtt(vm))
+		return setup_scratch_ggtt(vm);
+
+	vm->scratch_page = ggtt_vm->scratch_page;
+	vm->scratch_pt = ggtt_vm->scratch_pt;
+	vm->scratch_pd = ggtt_vm->scratch_pd;
+
+	return 0;
+}
+
+static void check_scratch_page(struct i915_address_space *vm)
+{
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
+	u32 i, *vaddr;
+
+	vaddr = kmap_px(vm->scratch_page);
+
+	for (i = 0; i < PAGE_SIZE / sizeof(u32); i++) {
+		if (vaddr[i] == 0xffffffff)
+			continue;
+
+		DRM_ERROR("%p scratch[%u] = 0x%08x\n", vm, i, vaddr[i]);
+		break;
+	}
+
+	kunmap_readonly_px(ppgtt, vaddr);
+}
+
+static void cleanup_scratch_ggtt(struct i915_address_space *vm)
+{
+	check_scratch_page(vm);
+	free_scratch_page(vm);
+
+	if (INTEL_INFO(vm->dev)->gen < 6)
+		return;
+
+	free_pt(vm->dev, vm->scratch_pt);
+
+	if (INTEL_INFO(vm->dev)->gen >= 8)
+		free_pd(vm->dev, vm->scratch_pd);
+}
+
+static void cleanup_scratch(struct i915_address_space *vm)
+{
+	if (i915_is_ggtt(vm))
+		cleanup_scratch_ggtt(vm);
+
+	vm->scratch_page = NULL;
+	vm->scratch_pt = NULL;
+	vm->scratch_pd = NULL;
+}
+
 /* Broadwell Page Directory Pointer Descriptors */
 static int gen8_write_pdp(struct intel_engine_cs *ring,
 			  unsigned entry,
@@ -535,7 +697,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 	unsigned num_entries = length >> PAGE_SHIFT;
 	unsigned last_pte, i;
 
-	scratch_pte = gen8_pte_encode(px_dma(ppgtt->base.scratch_page),
+	scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page),
 				      I915_CACHE_LLC, use_scratch);
 
 	while (num_entries) {
@@ -619,15 +781,6 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 		kunmap_px(ppgtt, pt_vaddr);
 }
 
-static void gen8_setup_scratch_pd(struct i915_address_space *vm)
-{
-	gen8_pde_t scratch_pde;
-
-	scratch_pde = gen8_pde_encode(px_dma(vm->scratch_pt), I915_CACHE_LLC);
-
-	fill_px(vm->dev, vm->scratch_pd, scratch_pde);
-}
-
 static void gen8_free_page_tables(struct i915_page_directory *pd, struct drm_device *dev)
 {
 	int i;
@@ -658,8 +811,7 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 		free_pd(ppgtt->base.dev, ppgtt->pdp.page_directory[i]);
 	}
 
-	free_pd(vm->dev, vm->scratch_pd);
-	free_pt(vm->dev, vm->scratch_pt);
+	cleanup_scratch(vm);
 }
 
 /**
@@ -947,17 +1099,6 @@ err_out:
  */
 static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 {
-	ppgtt->base.scratch_pt = alloc_pt(ppgtt->base.dev);
-	if (IS_ERR(ppgtt->base.scratch_pt))
-		return PTR_ERR(ppgtt->base.scratch_pt);
-
-	ppgtt->base.scratch_pd = alloc_pd(ppgtt->base.dev);
-	if (IS_ERR(ppgtt->base.scratch_pd))
-		return PTR_ERR(ppgtt->base.scratch_pd);
-
-	gen8_setup_scratch_pt(&ppgtt->base);
-	gen8_setup_scratch_pd(&ppgtt->base);
-
 	ppgtt->base.start = 0;
 	ppgtt->base.total = 1ULL << 32;
 	ppgtt->base.cleanup = gen8_ppgtt_cleanup;
@@ -969,7 +1110,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 
 	ppgtt->switch_mm = gen8_mm_switch;
 
-	return 0;
+	return setup_scratch(&ppgtt->base);
 }
 
 static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
@@ -1021,7 +1162,7 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
 			}
 			seq_puts(m, "\n");
 		}
-		kunmap_px(ppgtt, pt_vaddr);
+		kunmap_readonly_px(ppgtt, pt_vaddr);
 	}
 }
 
@@ -1271,18 +1412,6 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 		kunmap_px(ppgtt, pt_vaddr);
 }
 
-static void gen6_setup_scratch_pt(struct i915_address_space *vm)
-{
-	gen6_pte_t scratch_pte;
-
-	WARN_ON(px_dma(vm->scratch_page) == 0);
-
-	scratch_pte = vm->pte_encode(px_dma(vm->scratch_page),
-				     I915_CACHE_LLC, true, 0);
-
-	fill32_px(vm->dev, vm->scratch_pt, scratch_pte);
-}
-
 static int gen6_alloc_va_range(struct i915_address_space *vm,
 			       uint64_t start_in, uint64_t length_in)
 {
@@ -1387,7 +1516,7 @@ static void gen6_ppgtt_cleanup(struct i915_address_space *vm)
 			free_pt(ppgtt->base.dev, pt);
 	}
 
-	free_pt(vm->dev, vm->scratch_pt);
+	cleanup_scratch(vm);
 }
 
 static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
@@ -1402,11 +1531,10 @@ static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
 	 * size. We allocate at the top of the GTT to avoid fragmentation.
 	 */
 	BUG_ON(!drm_mm_initialized(&dev_priv->gtt.base.mm));
-	ppgtt->base.scratch_pt = alloc_pt(ppgtt->base.dev);
-	if (IS_ERR(ppgtt->base.scratch_pt))
-		return PTR_ERR(ppgtt->base.scratch_pt);
 
-	gen6_setup_scratch_pt(&ppgtt->base);
+	ret = setup_scratch(&ppgtt->base);
+	if (ret)
+		return ret;
 
 alloc:
 	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
@@ -1437,7 +1565,7 @@ alloc:
 	return 0;
 
 err_out:
-	free_pt(ppgtt->base.dev, ppgtt->base.scratch_pt);
+	cleanup_scratch(&ppgtt->base);
 	return ret;
 }
 
@@ -1511,10 +1639,7 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 
 static int __hw_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
-
 	ppgtt->base.dev = dev;
-	ppgtt->base.scratch_page = dev_priv->gtt.base.scratch_page;
 
 	if (INTEL_INFO(dev)->gen < 8)
 		return gen6_ppgtt_init(ppgtt);
@@ -2120,43 +2245,6 @@ void i915_global_gtt_cleanup(struct drm_device *dev)
 	vm->cleanup(vm);
 }
 
-static int alloc_scratch_page(struct i915_address_space *vm)
-{
-	struct i915_page_scratch *sp;
-	int ret;
-
-	WARN_ON(vm->scratch_page);
-
-	sp = kzalloc(sizeof(*sp), GFP_KERNEL);
-	if (sp == NULL)
-		return -ENOMEM;
-
-	ret = __setup_page_dma(vm->dev, px_base(sp), GFP_DMA32 | __GFP_ZERO);
-	if (ret) {
-		kfree(sp);
-		return ret;
-	}
-
-	fill_px(vm->dev, sp, ~0ULL);
-	set_pages_uc(px_page(sp), 1);
-
-	vm->scratch_page = sp;
-
-	return 0;
-}
-
-static void free_scratch_page(struct i915_address_space *vm)
-{
-	struct i915_page_scratch *sp = vm->scratch_page;
-
-	set_pages_wb(px_page(sp), 1);
-
-	cleanup_px(vm->dev, sp);
-	kfree(sp);
-
-	vm->scratch_page = NULL;
-}
-
 static unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
 {
 	snb_gmch_ctl >>= SNB_GMCH_GGMS_SHIFT;
@@ -2240,7 +2328,6 @@ static int ggtt_probe_common(struct drm_device *dev,
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	phys_addr_t gtt_phys_addr;
-	int ret;
 
 	/* For Modern GENs the PTEs and register space are split in the BAR */
 	gtt_phys_addr = pci_resource_start(dev->pdev, 0) +
@@ -2262,14 +2349,7 @@ static int ggtt_probe_common(struct drm_device *dev,
 		return -ENOMEM;
 	}
 
-	ret = alloc_scratch_page(&dev_priv->gtt.base);
-	if (ret) {
-		DRM_ERROR("Scratch setup failed\n");
-		/* iounmap will also get called at remove, but meh */
-		iounmap(dev_priv->gtt.gsm);
-	}
-
-	return ret;
+	return setup_scratch(&dev_priv->gtt.base);
 }
 
 /* The GGTT and PPGTT need a private PPAT setup in order to handle cacheability
@@ -2441,7 +2521,7 @@ static void gen6_gmch_remove(struct i915_address_space *vm)
 	struct i915_gtt *gtt = container_of(vm, struct i915_gtt, base);
 
 	iounmap(gtt->gsm);
-	free_scratch_page(vm);
+	cleanup_scratch(vm);
 }
 
 static int i915_gmch_probe(struct drm_device *dev,
-- 
1.9.1


* Re: [PATCH 16/20] drm/i915/gtt: Fill scratch page
  2015-05-21 14:37 ` [PATCH 16/20] drm/i915/gtt: Fill scratch page Mika Kuoppala
@ 2015-05-21 14:56   ` Chris Wilson
  0 siblings, 0 replies; 28+ messages in thread
From: Chris Wilson @ 2015-05-21 14:56 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx, miku

On Thu, May 21, 2015 at 05:37:44PM +0300, Mika Kuoppala wrote:
> During review of the dynamic page tables series, I was able
> to hit a lite restore bug with execlists. I assume that
> due to incorrect pd, the batch run out of legit address space
> and into the scratch page area. The ACTHD was increasing
> due to scratch being all zeroes (MI_NOOPs). And as gen8
> address space is quite large, the hangcheck happily waited
> for a long long time, keeping the process effectively stuck.
> 
> According to Chris Wilson, any modern gpu will grind to a halt
> if it encounters commands of all ones. This seemed to do the
> trick and a hang was declared promptly when the gpu wandered into
> the scratch land.
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index ccdb35f..26d1d45 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2153,6 +2153,7 @@ static int alloc_scratch_page(struct i915_address_space *vm)
>  		return ret;
>  	}
>  
> +	fill_px(vm->dev, sp, ~0ULL);

I'd be tempted to actually use 0xffff00ff. The advantage of 0 is that it
is unlikely to be noticeable. The advantage of 0xffff00ff is that it is very
noticeable.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
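
Concretely, the suggestion amounts to changing the fill value in the
quoted hunk. A minimal sketch, assuming fill_px() keeps the 64-bit
signature implied by the ~0ULL call (the series' fill32_px() helper,
which replicates a 32-bit value across the page, would express the
same thing more directly):

	/* Poison the scratch page with a pattern that is obviously
	 * not a valid command stream, so a GPU that wanders into
	 * scratch hangs promptly and visibly instead of executing
	 * MI_NOOPs forever. */
	fill_px(vm->dev, sp, 0xffff00ffffff00ffULL);

The replicated 0xffff00ff pattern is also easy to recognize in an
error-state dump, which is the "very noticeable" property Chris is
after.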

* Re: [PATCH 02/20] drm/i915: Force PD restore on dirty ppGTTs
  2015-05-21 14:37 ` [PATCH 02/20] drm/i915: Force PD restore on dirty ppGTTs Mika Kuoppala
@ 2015-05-21 15:07   ` Ville Syrjälä
  2015-05-21 16:28     ` Barbalho, Rafael
  0 siblings, 1 reply; 28+ messages in thread
From: Ville Syrjälä @ 2015-05-21 15:07 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx, miku

On Thu, May 21, 2015 at 05:37:30PM +0300, Mika Kuoppala wrote:
> Force page directory reload when ppgtt va->pa
> mapping has changed. Extend dirty rings mechanism
> for gen > 7 and use it to force pd restore in execlist
> mode when vm has been changed.
> 
> Some parts of execlist context update cleanup based on
> work by Chris Wilson.
> 
> v2: Add comment about lite restore (Chris)
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c | 65 ++++++++++++++++++++--------------------
>  1 file changed, 33 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 0413b8f..5ee2a8c 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -264,9 +264,10 @@ u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
>  }
>  
>  static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring,
> -					 struct drm_i915_gem_object *ctx_obj)
> +					 struct intel_context *ctx)
>  {
>  	struct drm_device *dev = ring->dev;
> +	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
>  	uint64_t desc;
>  	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
>  
> @@ -284,6 +285,14 @@ static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring,
>  	 * signalling between Command Streamers */
>  	/* desc |= GEN8_CTX_FORCE_RESTORE; */
>  
> +	/* When performing a LiteRestore but with updated PD we need
> +	 * to force the GPU to reload the PD
> +	 */
> +	if (intel_ring_flag(ring) & ctx->ppgtt->pd_dirty_rings) {
> +		desc |= GEN8_CTX_FORCE_PD_RESTORE;

Wasn't there a hardware issue which basically meant you are not
allowed to actually set this bit?

Rafael had some details on that as far as I recall so adding cc...

> +		ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(ring);
> +	}
> +
>  	/* WaEnableForceRestoreInCtxtDescForVCS:skl */
>  	if (IS_GEN9(dev) &&
>  	    INTEL_REVID(dev) <= SKL_REVID_B0 &&
> @@ -295,8 +304,8 @@ static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring,
>  }
>  
>  static void execlists_elsp_write(struct intel_engine_cs *ring,
> -				 struct drm_i915_gem_object *ctx_obj0,
> -				 struct drm_i915_gem_object *ctx_obj1)
> +				 struct intel_context *to0,
> +				 struct intel_context *to1)
>  {
>  	struct drm_device *dev = ring->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -304,14 +313,15 @@ static void execlists_elsp_write(struct intel_engine_cs *ring,
>  	uint32_t desc[4];
>  
>  	/* XXX: You must always write both descriptors in the order below. */
> -	if (ctx_obj1)
> -		temp = execlists_ctx_descriptor(ring, ctx_obj1);
> +	if (to1)
> +		temp = execlists_ctx_descriptor(ring, to1);
>  	else
>  		temp = 0;
> +
>  	desc[1] = (u32)(temp >> 32);
>  	desc[0] = (u32)temp;
>  
> -	temp = execlists_ctx_descriptor(ring, ctx_obj0);
> +	temp = execlists_ctx_descriptor(ring, to0);
>  	desc[3] = (u32)(temp >> 32);
>  	desc[2] = (u32)temp;
>  
> @@ -330,14 +340,20 @@ static void execlists_elsp_write(struct intel_engine_cs *ring,
>  	spin_unlock(&dev_priv->uncore.lock);
>  }
>  
> -static int execlists_update_context(struct drm_i915_gem_object *ctx_obj,
> -				    struct drm_i915_gem_object *ring_obj,
> -				    struct i915_hw_ppgtt *ppgtt,
> -				    u32 tail)
> +static void execlists_update_context(struct intel_engine_cs *ring,
> +				     struct intel_context *ctx,
> +				     u32 tail)
>  {
> +	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
> +	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
> +	struct drm_i915_gem_object *ring_obj = ringbuf->obj;
> +	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
>  	struct page *page;
>  	uint32_t *reg_state;
>  
> +	WARN_ON(!i915_gem_obj_is_pinned(ctx_obj));
> +	WARN_ON(!i915_gem_obj_is_pinned(ring_obj));
> +
>  	page = i915_gem_object_get_page(ctx_obj, 1);
>  	reg_state = kmap_atomic(page);
>  
> @@ -347,7 +363,7 @@ static int execlists_update_context(struct drm_i915_gem_object *ctx_obj,
>  	/* True PPGTT with dynamic page allocation: update PDP registers and
>  	 * point the unallocated PDPs to the scratch page
>  	 */
> -	if (ppgtt) {
> +	if (ppgtt && intel_ring_flag(ring) & ctx->ppgtt->pd_dirty_rings) {
>  		ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
>  		ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
>  		ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
> @@ -355,36 +371,21 @@ static int execlists_update_context(struct drm_i915_gem_object *ctx_obj,
>  	}
>  
>  	kunmap_atomic(reg_state);
> -
> -	return 0;
>  }
>  
>  static void execlists_submit_contexts(struct intel_engine_cs *ring,
>  				      struct intel_context *to0, u32 tail0,
>  				      struct intel_context *to1, u32 tail1)
>  {
> -	struct drm_i915_gem_object *ctx_obj0 = to0->engine[ring->id].state;
> -	struct intel_ringbuffer *ringbuf0 = to0->engine[ring->id].ringbuf;
> -	struct drm_i915_gem_object *ctx_obj1 = NULL;
> -	struct intel_ringbuffer *ringbuf1 = NULL;
> -
> -	BUG_ON(!ctx_obj0);
> -	WARN_ON(!i915_gem_obj_is_pinned(ctx_obj0));
> -	WARN_ON(!i915_gem_obj_is_pinned(ringbuf0->obj));
> -
> -	execlists_update_context(ctx_obj0, ringbuf0->obj, to0->ppgtt, tail0);
> +	if (WARN_ON(to0 == NULL))
> +		return;
>  
> -	if (to1) {
> -		ringbuf1 = to1->engine[ring->id].ringbuf;
> -		ctx_obj1 = to1->engine[ring->id].state;
> -		BUG_ON(!ctx_obj1);
> -		WARN_ON(!i915_gem_obj_is_pinned(ctx_obj1));
> -		WARN_ON(!i915_gem_obj_is_pinned(ringbuf1->obj));
> +	execlists_update_context(ring, to0, tail0);
>  
> -		execlists_update_context(ctx_obj1, ringbuf1->obj, to1->ppgtt, tail1);
> -	}
> +	if (to1)
> +		execlists_update_context(ring, to1, tail1);
>  
> -	execlists_elsp_write(ring, ctx_obj0, ctx_obj1);
> +	execlists_elsp_write(ring, to0, to1);
>  }
>  
>  static void execlists_context_unqueue(struct intel_engine_cs *ring)
> -- 
> 1.9.1
> 

-- 
Ville Syrjälä
Intel OTC

* Re: [PATCH 11/20] drm/i915/gtt: Introduce fill_page_dma()
  2015-05-21 14:37 ` [PATCH 11/20] drm/i915/gtt: Introduce fill_page_dma() Mika Kuoppala
@ 2015-05-21 15:16   ` Ville Syrjälä
  0 siblings, 0 replies; 28+ messages in thread
From: Ville Syrjälä @ 2015-05-21 15:16 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx, miku

On Thu, May 21, 2015 at 05:37:39PM +0300, Mika Kuoppala wrote:
> When we set up page directories and tables, we point the entries
> to the next level scratch structure. Make this generic
> by introducing a fill_page_dma which maps and flushes. We also
> need a 32 bit variant for legacy gens.
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 61 +++++++++++++++++++------------------
>  1 file changed, 31 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 5175eb8..a3ee710 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -330,6 +330,27 @@ static void cleanup_page_dma(struct drm_device *dev, struct i915_page_dma *p)
>  	memset(p, 0, sizeof(*p));
>  }
>  
> +static void fill_page_dma(struct drm_device *dev, struct i915_page_dma *p,
> +			  const uint64_t val)
> +{
> +	int i;
> +	uint64_t * const vaddr = kmap_atomic(p->page);
> +
> +	for (i = 0; i < 512; i++)
> +		vaddr[i] = val;
> +
> +	kunmap_atomic(vaddr);
> +}

Where did the clflushes go? Also please keep in mind only CHV needs the
clflush and VLV doesn't.

> +
> +static void fill_page_dma_32(struct drm_device *dev, struct i915_page_dma *p,
> +			     const uint32_t val32)
> +{
> +	uint64_t v = val32;
> +	v = v << 32 | val32;
> +
> +	fill_page_dma(dev, p, v);
> +}
> +
>  static void free_pt(struct drm_device *dev, struct i915_page_table *pt)
>  {
>  	cleanup_page_dma(dev, &pt->base);
> @@ -340,19 +361,12 @@ static void free_pt(struct drm_device *dev, struct i915_page_table *pt)
>  static void gen8_initialize_pt(struct i915_address_space *vm,
>  			       struct i915_page_table *pt)
>  {
> -	gen8_pte_t *pt_vaddr, scratch_pte;
> -	int i;
> +	gen8_pte_t scratch_pte;
>  
> -	pt_vaddr = kmap_atomic(pt->base.page);
>  	scratch_pte = gen8_pte_encode(vm->scratch.addr,
>  				      I915_CACHE_LLC, true);
>  
> -	for (i = 0; i < GEN8_PTES; i++)
> -		pt_vaddr[i] = scratch_pte;
> -
> -	if (!HAS_LLC(vm->dev))
> -		drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
> -	kunmap_atomic(pt_vaddr);
> +	fill_page_dma(vm->dev, &pt->base, scratch_pte);
>  }
>  
>  static struct i915_page_table *alloc_pt(struct drm_device *dev)
> @@ -585,20 +599,13 @@ static void gen8_initialize_pd(struct i915_address_space *vm,
>  			       struct i915_page_directory *pd)
>  {
>  	struct i915_hw_ppgtt *ppgtt =
> -			container_of(vm, struct i915_hw_ppgtt, base);
> -	gen8_pde_t *page_directory;
> -	struct i915_page_table *pt;
> -	int i;
> +		container_of(vm, struct i915_hw_ppgtt, base);
> +	gen8_pde_t scratch_pde;
>  
> -	page_directory = kmap_atomic(pd->base.page);
> -	pt = ppgtt->scratch_pt;
> -	for (i = 0; i < I915_PDES; i++)
> -		/* Map the PDE to the page table */
> -		__gen8_do_map_pt(page_directory + i, pt, vm->dev);
> +	scratch_pde = gen8_pde_encode(vm->dev, ppgtt->scratch_pt->base.daddr,
> +				      I915_CACHE_LLC);
>  
> -	if (!HAS_LLC(vm->dev))
> -		drm_clflush_virt_range(page_directory, PAGE_SIZE);
> -	kunmap_atomic(page_directory);
> +	fill_page_dma(vm->dev, &pd->base, scratch_pde);
>  }
>  
>  static void gen8_free_page_tables(struct i915_page_directory *pd, struct drm_device *dev)
> @@ -1242,22 +1249,16 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>  }
>  
>  static void gen6_initialize_pt(struct i915_address_space *vm,
> -		struct i915_page_table *pt)
> +			       struct i915_page_table *pt)
>  {
> -	gen6_pte_t *pt_vaddr, scratch_pte;
> -	int i;
> +	gen6_pte_t scratch_pte;
>  
>  	WARN_ON(vm->scratch.addr == 0);
>  
>  	scratch_pte = vm->pte_encode(vm->scratch.addr,
>  			I915_CACHE_LLC, true, 0);
>  
> -	pt_vaddr = kmap_atomic(pt->base.page);
> -
> -	for (i = 0; i < GEN6_PTES; i++)
> -		pt_vaddr[i] = scratch_pte;
> -
> -	kunmap_atomic(pt_vaddr);
> +	fill_page_dma_32(vm->dev, &pt->base, scratch_pte);
>  }
>  
>  static int gen6_alloc_va_range(struct i915_address_space *vm,
> -- 
> 1.9.1
> 

-- 
Ville Syrjälä
Intel OTC
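
For reference, a sketch of the helper with the flush reinstated, gated
on !HAS_LLC() exactly as in the code the hunk removes; narrowing it to
CHV only, as Ville notes, would need a per-platform predicate that
this series does not add:

static void fill_page_dma(struct drm_device *dev, struct i915_page_dma *p,
			  const uint64_t val)
{
	int i;
	uint64_t * const vaddr = kmap_atomic(p->page);

	for (i = 0; i < 512; i++)
		vaddr[i] = val;

	/* Paging structures are read by the GPU without snooping, so
	 * on !LLC platforms the CPU writes must be flushed before the
	 * page is unmapped. */
	if (!HAS_LLC(dev))
		drm_clflush_virt_range(vaddr, PAGE_SIZE);

	kunmap_atomic(vaddr);
}

Patch 12 later in the thread resolves this by moving the flush into
kunmap_page_dma(), which the fill path then uses.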

* Re: [PATCH 12/20] drm/i915/gtt: Introduce kmap|kunmap for dma page
  2015-05-21 14:37 ` [PATCH 12/20] drm/i915/gtt: Introduce kmap|kunmap for dma page Mika Kuoppala
@ 2015-05-21 15:19   ` Ville Syrjälä
  0 siblings, 0 replies; 28+ messages in thread
From: Ville Syrjälä @ 2015-05-21 15:19 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx, miku

On Thu, May 21, 2015 at 05:37:40PM +0300, Mika Kuoppala wrote:
> As there is flushing involved when we have done a cpu
> write, add functions for mapping into cpu space. Add macros
> to map any type of paging structure.
> 
> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 67 +++++++++++++++++++------------------
>  1 file changed, 35 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index a3ee710..3d94ad8 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -330,16 +330,32 @@ static void cleanup_page_dma(struct drm_device *dev, struct i915_page_dma *p)
>  	memset(p, 0, sizeof(*p));
>  }
>  
> +static void *kmap_page_dma(struct i915_page_dma *p)
> +{
> +	return kmap_atomic(p->page);
> +}
> +
> +static void kunmap_page_dma(struct drm_device *dev, void *vaddr)
> +{
> +	if (!HAS_LLC(dev))
> +		drm_clflush_virt_range(vaddr, PAGE_SIZE);
> +
> +	kunmap_atomic(vaddr);
> +}

Ah, there it is. But now it's being performed on VLV as well, which
doesn't need it.

Also having something called kunmap_page_dma() which does an explicit
clflush for apparently no reason is rather confusing. So I think the
name should have _ppgtt_ in it to make it clear this only applies to
ppgtt, otherwise someone is likely to use it for something else which
doesn't need the clflush. Or at the very least it needs a comment to
explain this stuff.

> +
> +#define kmap_px(px) kmap_page_dma(&(px)->base)
> +#define kunmap_px(ppgtt, vaddr) kunmap_page_dma((ppgtt)->base.dev, (vaddr));
> +
>  static void fill_page_dma(struct drm_device *dev, struct i915_page_dma *p,
>  			  const uint64_t val)
>  {
>  	int i;
> -	uint64_t * const vaddr = kmap_atomic(p->page);
> +	uint64_t * const vaddr = kmap_page_dma(p);
>  
>  	for (i = 0; i < 512; i++)
>  		vaddr[i] = val;
>  
> -	kunmap_atomic(vaddr);
> +	kunmap_page_dma(dev, vaddr);
>  }
>  
>  static void fill_page_dma_32(struct drm_device *dev, struct i915_page_dma *p,
> @@ -497,7 +513,6 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
>  	while (num_entries) {
>  		struct i915_page_directory *pd;
>  		struct i915_page_table *pt;
> -		struct page *page_table;
>  
>  		if (WARN_ON(!ppgtt->pdp.page_directory[pdpe]))
>  			continue;
> @@ -512,22 +527,18 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
>  		if (WARN_ON(!pt->base.page))
>  			continue;
>  
> -		page_table = pt->base.page;
> -
>  		last_pte = pte + num_entries;
>  		if (last_pte > GEN8_PTES)
>  			last_pte = GEN8_PTES;
>  
> -		pt_vaddr = kmap_atomic(page_table);
> +		pt_vaddr = kmap_px(pt);
>  
>  		for (i = pte; i < last_pte; i++) {
>  			pt_vaddr[i] = scratch_pte;
>  			num_entries--;
>  		}
>  
> -		if (!HAS_LLC(ppgtt->base.dev))
> -			drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
> -		kunmap_atomic(pt_vaddr);
> +		kunmap_px(ppgtt, pt);
>  
>  		pte = 0;
>  		if (++pde == I915_PDES) {
> @@ -559,18 +570,14 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
>  		if (pt_vaddr == NULL) {
>  			struct i915_page_directory *pd = ppgtt->pdp.page_directory[pdpe];
>  			struct i915_page_table *pt = pd->page_table[pde];
> -			struct page *page_table = pt->base.page;
> -
> -			pt_vaddr = kmap_atomic(page_table);
> +			pt_vaddr = kmap_px(pt);
>  		}
>  
>  		pt_vaddr[pte] =
>  			gen8_pte_encode(sg_page_iter_dma_address(&sg_iter),
>  					cache_level, true);
>  		if (++pte == GEN8_PTES) {
> -			if (!HAS_LLC(ppgtt->base.dev))
> -				drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
> -			kunmap_atomic(pt_vaddr);
> +			kunmap_px(ppgtt, pt_vaddr);
>  			pt_vaddr = NULL;
>  			if (++pde == I915_PDES) {
>  				pdpe++;
> @@ -579,11 +586,9 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
>  			pte = 0;
>  		}
>  	}
> -	if (pt_vaddr) {
> -		if (!HAS_LLC(ppgtt->base.dev))
> -			drm_clflush_virt_range(pt_vaddr, PAGE_SIZE);
> -		kunmap_atomic(pt_vaddr);
> -	}
> +
> +	if (pt_vaddr)
> +		kunmap_px(ppgtt, pt_vaddr);
>  }
>  
>  static void __gen8_do_map_pt(gen8_pde_t * const pde,
> @@ -862,7 +867,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>  	/* Allocations have completed successfully, so set the bitmaps, and do
>  	 * the mappings. */
>  	gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
> -		gen8_pde_t *const page_directory = kmap_atomic(pd->base.page);
> +		gen8_pde_t *const page_directory = kmap_px(pd);
>  		struct i915_page_table *pt;
>  		uint64_t pd_len = gen8_clamp_pd(start, length);
>  		uint64_t pd_start = start;
> @@ -892,10 +897,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
>  			 * point we're still relying on insert_entries() */
>  		}
>  
> -		if (!HAS_LLC(vm->dev))
> -			drm_clflush_virt_range(page_directory, PAGE_SIZE);
> -
> -		kunmap_atomic(page_directory);
> +		kunmap_px(ppgtt, page_directory);
>  
>  		set_bit(pdpe, ppgtt->pdp.used_pdpes);
>  	}
> @@ -977,7 +979,8 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
>  				   expected);
>  		seq_printf(m, "\tPDE: %x\n", pd_entry);
>  
> -		pt_vaddr = kmap_atomic(ppgtt->pd.page_table[pde]->base.page);
> +		pt_vaddr = kmap_px(ppgtt->pd.page_table[pde]);
> +
>  		for (pte = 0; pte < GEN6_PTES; pte+=4) {
>  			unsigned long va =
>  				(pde * PAGE_SIZE * GEN6_PTES) +
> @@ -999,7 +1002,7 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
>  			}
>  			seq_puts(m, "\n");
>  		}
> -		kunmap_atomic(pt_vaddr);
> +		kunmap_px(ppgtt, pt_vaddr);
>  	}
>  }
>  
> @@ -1202,12 +1205,12 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
>  		if (last_pte > GEN6_PTES)
>  			last_pte = GEN6_PTES;
>  
> -		pt_vaddr = kmap_atomic(ppgtt->pd.page_table[act_pt]->base.page);
> +		pt_vaddr = kmap_px(ppgtt->pd.page_table[act_pt]);
>  
>  		for (i = first_pte; i < last_pte; i++)
>  			pt_vaddr[i] = scratch_pte;
>  
> -		kunmap_atomic(pt_vaddr);
> +		kunmap_px(ppgtt, pt_vaddr);
>  
>  		num_entries -= last_pte - first_pte;
>  		first_pte = 0;
> @@ -1231,21 +1234,21 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>  	pt_vaddr = NULL;
>  	for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
>  		if (pt_vaddr == NULL)
> -			pt_vaddr = kmap_atomic(ppgtt->pd.page_table[act_pt]->base.page);
> +			pt_vaddr = kmap_px(ppgtt->pd.page_table[act_pt]);
>  
>  		pt_vaddr[act_pte] =
>  			vm->pte_encode(sg_page_iter_dma_address(&sg_iter),
>  				       cache_level, true, flags);
>  
>  		if (++act_pte == GEN6_PTES) {
> -			kunmap_atomic(pt_vaddr);
> +			kunmap_px(ppgtt, pt_vaddr);
>  			pt_vaddr = NULL;
>  			act_pt++;
>  			act_pte = 0;
>  		}
>  	}
>  	if (pt_vaddr)
> -		kunmap_atomic(pt_vaddr);
> +		kunmap_px(ppgtt, pt_vaddr);
>  }
>  
>  static void gen6_initialize_pt(struct i915_address_space *vm,
> -- 
> 1.9.1
> 

-- 
Ville Syrjälä
Intel OTC
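
For reference, the idiom the new macros enforce, as used throughout
the quoted hunks (a fragment, with the flush-on-unmap behaviour from
the patch assumed):

	gen8_pte_t *pt_vaddr = kmap_px(pt);

	pt_vaddr[pte] = scratch_pte;	/* CPU write to a paging structure */

	/* kunmap_px() clflushes the page on !LLC platforms before
	 * unmapping, so the write above is what the GPU will read. */
	kunmap_px(ppgtt, pt_vaddr);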

* Re: [PATCH 02/20] drm/i915: Force PD restore on dirty ppGTTs
  2015-05-21 15:07   ` Ville Syrjälä
@ 2015-05-21 16:28     ` Barbalho, Rafael
  2015-05-22 16:15       ` Mika Kuoppala
  0 siblings, 1 reply; 28+ messages in thread
From: Barbalho, Rafael @ 2015-05-21 16:28 UTC (permalink / raw)
  To: Ville Syrjälä, Mika Kuoppala; +Cc: intel-gfx, miku

> -----Original Message-----
> From: Ville Syrjälä [mailto:ville.syrjala@linux.intel.com]
> Sent: Thursday, May 21, 2015 4:08 PM
> To: Mika Kuoppala
> Cc: intel-gfx@lists.freedesktop.org; miku@iki.fi; Barbalho, Rafael
> Subject: Re: [Intel-gfx] [PATCH 02/20] drm/i915: Force PD restore on dirty
> ppGTTs
> 
> On Thu, May 21, 2015 at 05:37:30PM +0300, Mika Kuoppala wrote:
> > Force page directory reload when ppgtt va->pa
> > mapping has changed. Extend dirty rings mechanism
> > for gen > 7 and use it to force pd restore in execlist
> > mode when vm has been changed.
> >
> > Some parts of execlist context update cleanup based on
> > work by Chris Wilson.
> >
> > v2: Add comment about lite restore (Chris)
> >
> > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_lrc.c | 65 ++++++++++++++++++++-------------
> -------
> >  1 file changed, 33 insertions(+), 32 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c
> b/drivers/gpu/drm/i915/intel_lrc.c
> > index 0413b8f..5ee2a8c 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -264,9 +264,10 @@ u32 intel_execlists_ctx_id(struct
> drm_i915_gem_object *ctx_obj)
> >  }
> >
> >  static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring,
> > -					 struct drm_i915_gem_object
> *ctx_obj)
> > +					 struct intel_context *ctx)
> >  {
> >  	struct drm_device *dev = ring->dev;
> > +	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
> >  	uint64_t desc;
> >  	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
> >
> > @@ -284,6 +285,14 @@ static uint64_t execlists_ctx_descriptor(struct
> intel_engine_cs *ring,
> >  	 * signalling between Command Streamers */
> >  	/* desc |= GEN8_CTX_FORCE_RESTORE; */
> >
> > +	/* When performing a LiteRestore but with updated PD we need
> > +	 * to force the GPU to reload the PD
> > +	 */
> > +	if (intel_ring_flag(ring) & ctx->ppgtt->pd_dirty_rings) {
> > +		desc |= GEN8_CTX_FORCE_PD_RESTORE;
> 
> Wasn't there a hardware issue which basically meant you are not
> allowed to actually set this bit?
> 
> Rafael had some details on that as far as I recall so adding cc...

Ville is correct: there is a hardware issue in CHV with this bit, and it should
not be set. On BDW I am not sure, although you can stop the pre-fetching
& caching in BDW by using 64-bit PPGTT addressing.

So it's no from me I'm afraid.


> 
> > +		ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(ring);
> > +	}
> > +

<Snip>

> > --
> > 1.9.1
> >
> 
> --
> Ville Syrjälä
> Intel OTC

* Re: [PATCH 20/20] drm/i915/gtt: One instance of scratch page table/directory
  2015-05-21 14:37 ` [PATCH 20/20] drm/i915/gtt: One instance of scratch page table/directory Mika Kuoppala
@ 2015-05-21 18:27   ` shuang.he
  0 siblings, 0 replies; 28+ messages in thread
From: shuang.he @ 2015-05-21 18:27 UTC (permalink / raw)
  To: shuang.he, lei.a.liu, intel-gfx, mika.kuoppala

Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6451
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                 -4              272/272              268/272
ILK                                  302/302              302/302
SNB                 -1              315/315              314/315
IVB                                  343/343              343/343
BYT                                  287/287              287/287
BDW                                  317/317              317/317
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
*PNV  igt@gem_fence_thrash@bo-write-verify-threaded-none      PASS(4)      FAIL(1)
*PNV  igt@gem_tiled_pread_pwrite      PASS(5)      FAIL(1)
 PNV  igt@gem_userptr_blits@coherency-sync      CRASH(1)PASS(5)      CRASH(1)
 PNV  igt@gem_userptr_blits@coherency-unsync      CRASH(1)PASS(4)      CRASH(1)
 SNB  igt@pm_rpm@dpms-mode-unset-non-lpsp      DMESG_WARN(10)PASS(1)      DMESG_WARN(1)
(dmesg patch applied)WARNING:at_drivers/gpu/drm/i915/intel_uncore.c:#assert_device_not_suspended[i915]()@WARNING:.* at .* assert_device_not_suspended+0x
Note: You need to pay more attention to line start with '*'

* Re: [PATCH 02/20] drm/i915: Force PD restore on dirty ppGTTs
  2015-05-21 16:28     ` Barbalho, Rafael
@ 2015-05-22 16:15       ` Mika Kuoppala
  0 siblings, 0 replies; 28+ messages in thread
From: Mika Kuoppala @ 2015-05-22 16:15 UTC (permalink / raw)
  To: Barbalho, Rafael, Ville Syrjälä; +Cc: intel-gfx, miku

"Barbalho, Rafael" <rafael.barbalho@intel.com> writes:

>> -----Original Message-----
>> From: Ville Syrjälä [mailto:ville.syrjala@linux.intel.com]
>> Sent: Thursday, May 21, 2015 4:08 PM
>> To: Mika Kuoppala
>> Cc: intel-gfx@lists.freedesktop.org; miku@iki.fi; Barbalho, Rafael
>> Subject: Re: [Intel-gfx] [PATCH 02/20] drm/i915: Force PD restore on dirty
>> ppGTTs
>> 
>> On Thu, May 21, 2015 at 05:37:30PM +0300, Mika Kuoppala wrote:
>> > Force page directory reload when ppgtt va->pa
>> > mapping has changed. Extend dirty rings mechanism
>> > for gen > 7 and use it to force pd restore in execlist
>> > mode when vm has been changed.
>> >
>> > Some parts of execlist context update cleanup based on
>> > work by Chris Wilson.
>> >
>> > v2: Add comment about lite restore (Chris)
>> >
>> > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/intel_lrc.c | 65 ++++++++++++++++++++-------------
>> -------
>> >  1 file changed, 33 insertions(+), 32 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> > index 0413b8f..5ee2a8c 100644
>> > --- a/drivers/gpu/drm/i915/intel_lrc.c
>> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> > @@ -264,9 +264,10 @@ u32 intel_execlists_ctx_id(struct
>> drm_i915_gem_object *ctx_obj)
>> >  }
>> >
>> >  static uint64_t execlists_ctx_descriptor(struct intel_engine_cs *ring,
>> > -					 struct drm_i915_gem_object
>> *ctx_obj)
>> > +					 struct intel_context *ctx)
>> >  {
>> >  	struct drm_device *dev = ring->dev;
>> > +	struct drm_i915_gem_object *ctx_obj = ctx->engine[ring->id].state;
>> >  	uint64_t desc;
>> >  	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);
>> >
>> > @@ -284,6 +285,14 @@ static uint64_t execlists_ctx_descriptor(struct
>> intel_engine_cs *ring,
>> >  	 * signalling between Command Streamers */
>> >  	/* desc |= GEN8_CTX_FORCE_RESTORE; */
>> >
>> > +	/* When performing a LiteRestore but with updated PD we need
>> > +	 * to force the GPU to reload the PD
>> > +	 */
>> > +	if (intel_ring_flag(ring) & ctx->ppgtt->pd_dirty_rings) {
>> > +		desc |= GEN8_CTX_FORCE_PD_RESTORE;
>> 
>> Wasn't there a hardware issue which basically meant you are not
>> allowed to actually set this bit?
>> 
>> Rafael had some details on that as far as I recall so adding cc...
>
> Ville is correct, there is a hardware issue in CHV with this bit and it should
> not be set. On BDW I am not sure, although you can stop the pre-fetching
> & caching in BDW by using 64-bit PPGTT addressing.
>
> So it's no from me I'm afraid.

Thanks for pointing out the details on this.

I worked around this by preallocating our top level PDP structure (4 pages
in 32bit mode) so that the hardware sees an immutable top level structure
per context.

Ville also noticed that initializing scratch structures by copying
is not good, as it introduces an unnecessary read. Better to just fill
with pte/pde entries always. This triggered a rebase of the rest of the
series, so I will resend the whole series.

-Mika
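
For reference, the context-image side of the dirty-rings mechanism,
from the quoted execlists_update_context() hunk (a fragment; whether
this alone suffices on a lite restore is exactly what the preallocated,
immutable top level described above is meant to guarantee, without
touching GEN8_CTX_FORCE_PD_RESTORE):

	/* Rewrite the page directory pointers in the saved context
	 * image when this ring's vm has changed. */
	if (ppgtt && intel_ring_flag(ring) & ctx->ppgtt->pd_dirty_rings) {
		ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
		ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
		ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
		ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
	}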

>
>> 
>> > +		ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(ring);
>> > +	}
>> > +
>
> <Snip>
>
>> > --
>> > 1.9.1
>> >
>> 
>> --
>> Ville Syrjälä
>> Intel OTC
