All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 00/18] 48-bit PPGTT
@ 2015-07-07 15:14 Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 01/18] drm/i915: Remove unnecessary gen8_clamp_pd Michel Thierry
                   ` (20 more replies)
  0 siblings, 21 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

These are the rebased patches, after Mika's final ppgtt clean-up series landed
(it relies in the macros added) and Akash review comments.

In order expand the GPU address space, a 4th level translation is added, the
Page Map Level 4 (PML4). This PML4 has 256 PML4 Entries (PML4E), PML4[0-255],
each pointing to a PDP. All the existing "dynamic alloc ppgtt" functions are
used, only adding the 4th level changes. I also updated some remaining
variables that were 32b only.

There are 2 hardware workarounds needed to allow correct operation with 48b
addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset). This
new patchset version includes the comments and suggestions from Chris Wilson.
A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given object can be
allocated outside the first 4 PDPs; if not, the end range is forced to 4GB. Also,
more objects now use the DRM_MM_CREATE_TOP flag. To maintain compatibility, in
libdrm I added a new drm_intel_bo_emit_reloc_48bit function that will flag
these objects, while the existing drm_intel_bo_emit_reloc clears it.

Finally, this feature is only available in BDW and Gen9, requires LRC submission
mode (execlists) and it can be detected by i915.enable_ppgtt=3.

Also note that this expanded address space is only available for full PPGTT,
aliasing PPGTT and Global GTT remain 32-bit.
Michel Thierry (18):
  drm/i915: Remove unnecessary gen8_clamp_pd
  drm/i915/gen8: Make pdp allocation more dynamic
  drm/i915/gen8: Add PML4 structure
  drm/i915/gen8: Abstract PDP usage
  drm/i915/gen8: Add dynamic page trace events
  drm/i915/gen8: implement alloc/free for 4lvl
  drm/i915/gen8: Add 4 level switching infrastructure and lrc support
  drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
  drm/i915/gen8: Pass sg_iter through pte inserts
  drm/i915/gen8: Add 4 level support in insert_entries and clear_range
  drm/i915/gen8: Initialize PDPs
  drm/i915: Expand error state's address width to 64b
  drm/i915/gen8: Add ppgtt info and debug_dump
  drm/i915: object size needs to be u64
  drm/i915: batch_obj vm offset must be u64
  drm/i915/userptr: Kill user_size limit check
  drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  drm/i915/gen8: Flip the 48b switch

 drivers/gpu/drm/i915/i915_debugfs.c        |  18 +-
 drivers/gpu/drm/i915/i915_drv.h            |  15 +-
 drivers/gpu/drm/i915/i915_gem.c            |  19 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  13 +
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 609 ++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem_gtt.h        |  63 ++-
 drivers/gpu/drm/i915/i915_gem_userptr.c    |   4 -
 drivers/gpu/drm/i915/i915_gpu_error.c      |  18 +-
 drivers/gpu/drm/i915/i915_params.c         |   2 +-
 drivers/gpu/drm/i915/i915_reg.h            |   1 +
 drivers/gpu/drm/i915/i915_trace.h          |  32 +-
 drivers/gpu/drm/i915/intel_lrc.c           |  65 ++-
 include/uapi/drm/i915_drm.h                |   3 +-
 13 files changed, 694 insertions(+), 168 deletions(-)

-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH v4 01/18] drm/i915: Remove unnecessary gen8_clamp_pd
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 02/18] drm/i915/gen8: Make pdp allocation more dynamic Michel Thierry
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

gen8_clamp_pd clamps to the next page directory boundary, but the macro
gen8_for_each_pde already has a check to stop at the page directory boundary.

Furthermore, i915_pte_count also restricts to the next page table
boundary.

v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series.

Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h | 11 -----------
 2 files changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index b29b73f..712ca34 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -955,7 +955,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
 		gen8_pde_t *const page_directory = kmap_px(pd);
 		struct i915_page_table *pt;
-		uint64_t pd_len = gen8_clamp_pd(start, length);
+		uint64_t pd_len = length;
 		uint64_t pd_start = start;
 		uint32_t pde;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index e1cfa29..d5bf953 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -444,17 +444,6 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
 	     temp = min(temp, length),					\
 	     start += temp, length -= temp)
 
-/* Clamp length to the next page_directory boundary */
-static inline uint64_t gen8_clamp_pd(uint64_t start, uint64_t length)
-{
-	uint64_t next_pd = ALIGN(start + 1, 1 << GEN8_PDPE_SHIFT);
-
-	if (next_pd > (start + length))
-		return length;
-
-	return next_pd - start;
-}
-
 static inline uint32_t gen8_pte_index(uint64_t address)
 {
 	return i915_pte_index(address, GEN8_PDE_SHIFT);
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 02/18] drm/i915/gen8: Make pdp allocation more dynamic
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 01/18] drm/i915: Remove unnecessary gen8_clamp_pd Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 03/18] drm/i915/gen8: Add PML4 structure Michel Thierry
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

This transitional patch doesn't do much for the existing code. However,
it should make upcoming patches to use the full 48b address space a bit
easier.

v2: Renamed  pdp_free to be similar to  pd/pt (unmap_and_free_pdp).
v3: To facilitate testing, 48b mode will be available on Broadwell and
GEN9+, when i915.enable_ppgtt = 3.
v4: Rebase after s/page_tables/page_table/, added extra information
about 4-level page table formats and use IS_ENABLED macro.
v5: Check CONFIG_X86_64 instead of CONFIG_64BIT.
v6: Rebase after Mika's ppgtt cleanup / scratch merge patch series, and
follow
his nomenclature in pdp functions (there is no alloc_pdp yet).
v7: Rebase after merged version of Mika's ppgtt cleanup patch series.
v8: Rebase after final merged version of Mika's ppgtt/scratch patches.
v9: Introduce PML4 (and 48-bit checks) until next patch (Akash).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 82 ++++++++++++++++++++++++++++++-------
 drivers/gpu/drm/i915/i915_gem_gtt.h | 10 +++--
 2 files changed, 74 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 712ca34..4e1501a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -522,6 +522,43 @@ static void gen8_initialize_pd(struct i915_address_space *vm,
 	fill_px(vm->dev, pd, scratch_pde);
 }
 
+static int __pdp_init(struct drm_device *dev,
+		      struct i915_page_directory_pointer *pdp)
+{
+	size_t pdpes = I915_PDPES_PER_PDP(dev);
+
+	pdp->used_pdpes = kcalloc(BITS_TO_LONGS(pdpes),
+				  sizeof(unsigned long),
+				  GFP_KERNEL);
+	if (!pdp->used_pdpes)
+		return -ENOMEM;
+
+	pdp->page_directory = kcalloc(pdpes, sizeof(*pdp->page_directory),
+				      GFP_KERNEL);
+	if (!pdp->page_directory) {
+		kfree(pdp->used_pdpes);
+		/* the PDP might be the statically allocated top level. Keep it
+		 * as clean as possible */
+		pdp->used_pdpes = NULL;
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static void __pdp_fini(struct i915_page_directory_pointer *pdp)
+{
+	kfree(pdp->used_pdpes);
+	kfree(pdp->page_directory);
+	pdp->page_directory = NULL;
+}
+
+static void free_pdp(struct drm_device *dev,
+		     struct i915_page_directory_pointer *pdp)
+{
+	__pdp_fini(pdp);
+}
+
 /* Broadwell Page Directory Pointer Descriptors */
 static int gen8_write_pdp(struct drm_i915_gem_request *req,
 			  unsigned entry,
@@ -720,7 +757,8 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 		container_of(vm, struct i915_hw_ppgtt, base);
 	int i;
 
-	for_each_set_bit(i, ppgtt->pdp.used_pdpes, GEN8_LEGACY_PDPES) {
+	for_each_set_bit(i, ppgtt->pdp.used_pdpes,
+			 I915_PDPES_PER_PDP(ppgtt->base.dev)) {
 		if (WARN_ON(!ppgtt->pdp.page_directory[i]))
 			continue;
 
@@ -729,6 +767,7 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 		free_pd(ppgtt->base.dev, ppgtt->pdp.page_directory[i]);
 	}
 
+	free_pdp(ppgtt->base.dev, &ppgtt->pdp);
 	gen8_free_scratch(vm);
 }
 
@@ -820,8 +859,9 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
 	struct i915_page_directory *pd;
 	uint64_t temp;
 	uint32_t pdpe;
+	uint32_t pdpes = I915_PDPES_PER_PDP(dev);
 
-	WARN_ON(!bitmap_empty(new_pds, GEN8_LEGACY_PDPES));
+	WARN_ON(!bitmap_empty(new_pds, pdpes));
 
 	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
 		if (pd)
@@ -839,18 +879,19 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
 	return 0;
 
 unwind_out:
-	for_each_set_bit(pdpe, new_pds, GEN8_LEGACY_PDPES)
+	for_each_set_bit(pdpe, new_pds, pdpes)
 		free_pd(dev, pdp->page_directory[pdpe]);
 
 	return -ENOMEM;
 }
 
 static void
-free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts)
+free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts,
+		       uint32_t pdpes)
 {
 	int i;
 
-	for (i = 0; i < GEN8_LEGACY_PDPES; i++)
+	for (i = 0; i < pdpes; i++)
 		kfree(new_pts[i]);
 	kfree(new_pts);
 	kfree(new_pds);
@@ -861,23 +902,24 @@ free_gen8_temp_bitmaps(unsigned long *new_pds, unsigned long **new_pts)
  */
 static
 int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
-					 unsigned long ***new_pts)
+					 unsigned long ***new_pts,
+					 uint32_t pdpes)
 {
 	int i;
 	unsigned long *pds;
 	unsigned long **pts;
 
-	pds = kcalloc(BITS_TO_LONGS(GEN8_LEGACY_PDPES), sizeof(unsigned long), GFP_KERNEL);
+	pds = kcalloc(BITS_TO_LONGS(pdpes), sizeof(unsigned long), GFP_KERNEL);
 	if (!pds)
 		return -ENOMEM;
 
-	pts = kcalloc(GEN8_LEGACY_PDPES, sizeof(unsigned long *), GFP_KERNEL);
+	pts = kcalloc(pdpes, sizeof(unsigned long *), GFP_KERNEL);
 	if (!pts) {
 		kfree(pds);
 		return -ENOMEM;
 	}
 
-	for (i = 0; i < GEN8_LEGACY_PDPES; i++) {
+	for (i = 0; i < pdpes; i++) {
 		pts[i] = kcalloc(BITS_TO_LONGS(I915_PDES),
 				 sizeof(unsigned long), GFP_KERNEL);
 		if (!pts[i])
@@ -890,7 +932,7 @@ int __must_check alloc_gen8_temp_bitmaps(unsigned long **new_pds,
 	return 0;
 
 err_out:
-	free_gen8_temp_bitmaps(pds, pts);
+	free_gen8_temp_bitmaps(pds, pts, pdpes);
 	return -ENOMEM;
 }
 
@@ -916,6 +958,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	const uint64_t orig_length = length;
 	uint64_t temp;
 	uint32_t pdpe;
+	uint32_t pdpes = I915_PDPES_PER_PDP(ppgtt->base.dev);
 	int ret;
 
 	/* Wrap is never okay since we can only represent 48b, and we don't
@@ -927,7 +970,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	if (WARN_ON(start + length > ppgtt->base.total))
 		return -ENODEV;
 
-	ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables);
+	ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables, pdpes);
 	if (ret)
 		return ret;
 
@@ -935,7 +978,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	ret = gen8_ppgtt_alloc_page_directories(ppgtt, &ppgtt->pdp, start, length,
 					new_page_dirs);
 	if (ret) {
-		free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
+		free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
 		return ret;
 	}
 
@@ -989,7 +1032,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 		__set_bit(pdpe, ppgtt->pdp.used_pdpes);
 	}
 
-	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
+	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
 	mark_tlbs_dirty(ppgtt);
 	return 0;
 
@@ -999,10 +1042,10 @@ err_out:
 			free_pt(vm->dev, ppgtt->pdp.page_directory[pdpe]->page_table[temp]);
 	}
 
-	for_each_set_bit(pdpe, new_page_dirs, GEN8_LEGACY_PDPES)
+	for_each_set_bit(pdpe, new_page_dirs, pdpes)
 		free_pd(vm->dev, ppgtt->pdp.page_directory[pdpe]);
 
-	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables);
+	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
 	mark_tlbs_dirty(ppgtt);
 	return ret;
 }
@@ -1040,7 +1083,16 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 
 	ppgtt->switch_mm = gen8_mm_switch;
 
+	ret = __pdp_init(false, &ppgtt->pdp);
+
+	if (ret)
+		goto free_scratch;
+
 	return 0;
+
+free_scratch:
+	gen8_free_scratch(&ppgtt->base);
+	return ret;
 }
 
 static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index d5bf953..f5928db 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -98,6 +98,9 @@ typedef uint64_t gen8_pde_t;
 #define GEN8_LEGACY_PDPES		4
 #define GEN8_PTES			I915_PTES(sizeof(gen8_pte_t))
 
+/* FIXME: Next patch will use dev */
+#define I915_PDPES_PER_PDP(dev)		GEN8_LEGACY_PDPES
+
 #define PPAT_UNCACHED_INDEX		(_PAGE_PWT | _PAGE_PCD)
 #define PPAT_CACHED_PDE_INDEX		0 /* WB LLC */
 #define PPAT_CACHED_INDEX		_PAGE_PAT /* WB LLCeLLC */
@@ -241,9 +244,10 @@ struct i915_page_directory {
 };
 
 struct i915_page_directory_pointer {
-	/* struct page *page; */
-	DECLARE_BITMAP(used_pdpes, GEN8_LEGACY_PDPES);
-	struct i915_page_directory *page_directory[GEN8_LEGACY_PDPES];
+	struct i915_page_dma base;
+
+	unsigned long *used_pdpes;
+	struct i915_page_directory **page_directory;
 };
 
 struct i915_address_space {
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 03/18] drm/i915/gen8: Add PML4 structure
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 01/18] drm/i915: Remove unnecessary gen8_clamp_pd Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 02/18] drm/i915/gen8: Make pdp allocation more dynamic Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-11 20:02   ` Chris Wilson
  2015-07-07 15:14 ` [PATCH v4 04/18] drm/i915/gen8: Abstract PDP usage Michel Thierry
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

Introduces the Page Map Level 4 (PML4), ie. the new top level structure
of the page tables.

To facilitate testing, 48b mode will be available on Broadwell and
GEN9+, when i915.enable_ppgtt = 3.

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h     |  7 ++++++-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 40 ++++++++++++++++++++++++-------------
 drivers/gpu/drm/i915/i915_gem_gtt.h | 35 ++++++++++++++++++++++++--------
 3 files changed, 59 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 464b28d..de3a5d1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2497,7 +2497,12 @@ struct drm_i915_cmd_table {
 #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
 #define HAS_LOGICAL_RING_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 8)
 #define USES_PPGTT(dev)		(i915.enable_ppgtt)
-#define USES_FULL_PPGTT(dev)	(i915.enable_ppgtt == 2)
+#define USES_FULL_PPGTT(dev)	(i915.enable_ppgtt >= 2)
+#ifdef CONFIG_X86_64
+# define USES_FULL_48BIT_PPGTT(dev)	(i915.enable_ppgtt == 3)
+#else
+# define USES_FULL_48BIT_PPGTT(dev)	false
+#endif
 
 #define HAS_OVERLAY(dev)		(INTEL_INFO(dev)->has_overlay)
 #define OVERLAY_NEEDS_PHYSICAL(dev)	(INTEL_INFO(dev)->overlay_needs_physical)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4e1501a..fa66dfa 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -104,9 +104,13 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 {
 	bool has_aliasing_ppgtt;
 	bool has_full_ppgtt;
+	bool has_full_64bit_ppgtt;
 
 	has_aliasing_ppgtt = INTEL_INFO(dev)->gen >= 6;
 	has_full_ppgtt = INTEL_INFO(dev)->gen >= 7;
+	has_full_64bit_ppgtt = IS_ENABLED(CONFIG_X86_64) &&
+			       (IS_BROADWELL(dev) ||
+				INTEL_INFO(dev)->gen >= 9) && false; /* FIXME: 64b */
 
 	if (intel_vgpu_active(dev))
 		has_full_ppgtt = false; /* emulation is too hard */
@@ -125,6 +129,9 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 	if (enable_ppgtt == 2 && has_full_ppgtt)
 		return 2;
 
+	if (enable_ppgtt == 3 && has_full_64bit_ppgtt)
+		return 3;
+
 #ifdef CONFIG_INTEL_IOMMU
 	/* Disable ppgtt on SNB if VT-d is on. */
 	if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped) {
@@ -557,6 +564,8 @@ static void free_pdp(struct drm_device *dev,
 		     struct i915_page_directory_pointer *pdp)
 {
 	__pdp_fini(pdp);
+	if (USES_FULL_48BIT_PPGTT(dev))
+		kfree(pdp);
 }
 
 /* Broadwell Page Directory Pointer Descriptors */
@@ -671,9 +680,6 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 	pt_vaddr = NULL;
 
 	for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
-		if (WARN_ON(pdpe >= GEN8_LEGACY_PDPES))
-			break;
-
 		if (pt_vaddr == NULL) {
 			struct i915_page_directory *pd = ppgtt->pdp.page_directory[pdpe];
 			struct i915_page_table *pt = pd->page_table[pde];
@@ -1066,14 +1072,6 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 		return ret;
 
 	ppgtt->base.start = 0;
-	ppgtt->base.total = 1ULL << 32;
-	if (IS_ENABLED(CONFIG_X86_32))
-		/* While we have a proliferation of size_t variables
-		 * we cannot represent the full ppgtt size on 32bit,
-		 * so limit it to the same size as the GGTT (currently
-		 * 2GiB).
-		 */
-		ppgtt->base.total = to_i915(ppgtt->base.dev)->gtt.base.total;
 	ppgtt->base.cleanup = gen8_ppgtt_cleanup;
 	ppgtt->base.allocate_va_range = gen8_alloc_va_range;
 	ppgtt->base.insert_entries = gen8_ppgtt_insert_entries;
@@ -1083,10 +1081,24 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 
 	ppgtt->switch_mm = gen8_mm_switch;
 
-	ret = __pdp_init(false, &ppgtt->pdp);
+	if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+		ret = __pdp_init(false, &ppgtt->pdp);
 
-	if (ret)
-		goto free_scratch;
+		if (ret)
+			goto free_scratch;
+
+		ppgtt->base.total = 1ULL << 32;
+		if (IS_ENABLED(CONFIG_X86_32))
+			/* While we have a proliferation of size_t variables
+			 * we cannot represent the full ppgtt size on 32bit,
+			 * so limit it to the same size as the GGTT (currently
+			 * 2GiB).
+			 */
+			ppgtt->base.total = to_i915(ppgtt->base.dev)->gtt.base.total;
+	} else {
+		ppgtt->base.total = 1ULL << 48;
+		return -EPERM; /* Not yet implemented */
+	}
 
 	return 0;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index f5928db..6fa824e 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -88,9 +88,17 @@ typedef uint64_t gen8_pde_t;
  * PDPE  |  PDE  |  PTE  | offset
  * The difference as compared to normal x86 3 level page table is the PDPEs are
  * programmed via register.
+ *
+ * GEN8 48b legacy style address is defined as a 4 level page table:
+ * 47:39 | 38:30 | 29:21 | 20:12 |  11:0
+ * PML4E | PDPE  |  PDE  |  PTE  | offset
  */
+#define GEN8_PML4ES_PER_PML4		512
+#define GEN8_PML4E_SHIFT		39
 #define GEN8_PDPE_SHIFT			30
-#define GEN8_PDPE_MASK			0x3
+/* NB: GEN8_PDPE_MASK is untrue for 32b platforms, but it has no impact on 32b page
+ * tables */
+#define GEN8_PDPE_MASK			0x1ff
 #define GEN8_PDE_SHIFT			21
 #define GEN8_PDE_MASK			0x1ff
 #define GEN8_PTE_SHIFT			12
@@ -98,8 +106,8 @@ typedef uint64_t gen8_pde_t;
 #define GEN8_LEGACY_PDPES		4
 #define GEN8_PTES			I915_PTES(sizeof(gen8_pte_t))
 
-/* FIXME: Next patch will use dev */
-#define I915_PDPES_PER_PDP(dev)		GEN8_LEGACY_PDPES
+#define I915_PDPES_PER_PDP(dev) (USES_FULL_48BIT_PPGTT(dev) ?\
+				 GEN8_PML4ES_PER_PML4 : GEN8_LEGACY_PDPES)
 
 #define PPAT_UNCACHED_INDEX		(_PAGE_PWT | _PAGE_PCD)
 #define PPAT_CACHED_PDE_INDEX		0 /* WB LLC */
@@ -250,6 +258,13 @@ struct i915_page_directory_pointer {
 	struct i915_page_directory **page_directory;
 };
 
+struct i915_pml4 {
+	struct i915_page_dma base;
+
+	DECLARE_BITMAP(used_pml4es, GEN8_PML4ES_PER_PML4);
+	struct i915_page_directory_pointer *pdps[GEN8_PML4ES_PER_PML4];
+};
+
 struct i915_address_space {
 	struct drm_mm mm;
 	struct drm_device *dev;
@@ -345,8 +360,9 @@ struct i915_hw_ppgtt {
 	struct drm_mm_node node;
 	unsigned long pd_dirty_rings;
 	union {
-		struct i915_page_directory_pointer pdp;
-		struct i915_page_directory pd;
+		struct i915_pml4 pml4;		/* GEN8+ & 48b PPGTT */
+		struct i915_page_directory_pointer pdp;	/* GEN8+ */
+		struct i915_page_directory pd;		/* GEN6-7 */
 	};
 
 	struct drm_i915_file_private *file_priv;
@@ -440,14 +456,17 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
 	     temp = min(temp, length),					\
 	     start += temp, length -= temp)
 
-#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)		\
-	for (iter = gen8_pdpe_index(start);	\
-	     pd = (pdp)->page_directory[iter], length > 0 && iter < GEN8_LEGACY_PDPES;	\
+#define gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, b)	\
+	for (iter = gen8_pdpe_index(start); \
+	     pd = (pdp)->page_directory[iter], length > 0 && (iter < b);	\
 	     iter++,				\
 	     temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,	\
 	     temp = min(temp, length),					\
 	     start += temp, length -= temp)
 
+#define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)		\
+	gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, I915_PDPES_PER_PDP(dev))
+
 static inline uint32_t gen8_pte_index(uint64_t address)
 {
 	return i915_pte_index(address, GEN8_PDE_SHIFT);
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 04/18] drm/i915/gen8: Abstract PDP usage
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (2 preceding siblings ...)
  2015-07-07 15:14 ` [PATCH v4 03/18] drm/i915/gen8: Add PML4 structure Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 05/18] drm/i915/gen8: Add dynamic page trace events Michel Thierry
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

Up until now, ppgtt->pdp has always been the root of our page tables.
Legacy 32b addresses acted like it had 1 PDP with 4 PDPEs.

In preparation for 4 level page tables, we need to stop use ppgtt->pdp
directly unless we know it's what we want. The future structure will use
ppgtt->pml4 for the top level, and the pdp is just one of the entries
being pointed to by a pml4e.

v2: Updated after dynamic page allocation changes.
v3: Rebase after s/page_tables/page_table/.
v4: Rebase after changes in "Dynamic page table allocations" patch.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rebase after final merged version of Mika's ppgtt/scratch patches.
v7: Keep pagetable map in-line (and avoid unnecessary for_each_pde
loops), remove redundant ppgtt pointer in _alloc_pagetabs (Akash)

Cc: Akash Goel  <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 107 +++++++++++++++++++++++-------------
 1 file changed, 68 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index fa66dfa..c3c2703 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -616,6 +616,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
+	struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
 	gen8_pte_t *pt_vaddr, scratch_pte;
 	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
 	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
@@ -630,10 +631,10 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 		struct i915_page_directory *pd;
 		struct i915_page_table *pt;
 
-		if (WARN_ON(!ppgtt->pdp.page_directory[pdpe]))
+		if (WARN_ON(!pdp->page_directory[pdpe]))
 			break;
 
-		pd = ppgtt->pdp.page_directory[pdpe];
+		pd = pdp->page_directory[pdpe];
 
 		if (WARN_ON(!pd->page_table[pde]))
 			break;
@@ -671,6 +672,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
+	struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
 	gen8_pte_t *pt_vaddr;
 	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
 	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
@@ -681,7 +683,7 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 
 	for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
 		if (pt_vaddr == NULL) {
-			struct i915_page_directory *pd = ppgtt->pdp.page_directory[pdpe];
+			struct i915_page_directory *pd = pdp->page_directory[pdpe];
 			struct i915_page_table *pt = pd->page_table[pde];
 			pt_vaddr = kmap_px(pt);
 		}
@@ -763,23 +765,28 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
 		container_of(vm, struct i915_hw_ppgtt, base);
 	int i;
 
-	for_each_set_bit(i, ppgtt->pdp.used_pdpes,
-			 I915_PDPES_PER_PDP(ppgtt->base.dev)) {
-		if (WARN_ON(!ppgtt->pdp.page_directory[i]))
-			continue;
+	if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+		for_each_set_bit(i, ppgtt->pdp.used_pdpes,
+				 I915_PDPES_PER_PDP(ppgtt->base.dev)) {
+			if (WARN_ON(!ppgtt->pdp.page_directory[i]))
+				continue;
 
-		gen8_free_page_tables(ppgtt->base.dev,
-				      ppgtt->pdp.page_directory[i]);
-		free_pd(ppgtt->base.dev, ppgtt->pdp.page_directory[i]);
+			gen8_free_page_tables(ppgtt->base.dev,
+					      ppgtt->pdp.page_directory[i]);
+			free_pd(ppgtt->base.dev,
+				ppgtt->pdp.page_directory[i]);
+		}
+		free_pdp(ppgtt->base.dev, &ppgtt->pdp);
+	} else {
+		WARN_ON(1); /* to be implemented later */
 	}
 
-	free_pdp(ppgtt->base.dev, &ppgtt->pdp);
 	gen8_free_scratch(vm);
 }
 
 /**
  * gen8_ppgtt_alloc_pagetabs() - Allocate page tables for VA range.
- * @ppgtt:	Master ppgtt structure.
+ * @vm:		Master vm structure.
  * @pd:		Page directory for this address range.
  * @start:	Starting virtual address to begin allocations.
  * @length	Size of the allocations.
@@ -795,13 +802,13 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
  *
  * Return: 0 if success; negative error code otherwise.
  */
-static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
+static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
 				     struct i915_page_directory *pd,
 				     uint64_t start,
 				     uint64_t length,
 				     unsigned long *new_pts)
 {
-	struct drm_device *dev = ppgtt->base.dev;
+	struct drm_device *dev = vm->dev;
 	struct i915_page_table *pt;
 	uint64_t temp;
 	uint32_t pde;
@@ -810,7 +817,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
 		/* Don't reallocate page tables */
 		if (pt) {
 			/* Scratch is never allocated this way */
-			WARN_ON(pt == ppgtt->base.scratch_pt);
+			WARN_ON(pt == vm->scratch_pt);
 			continue;
 		}
 
@@ -818,7 +825,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_hw_ppgtt *ppgtt,
 		if (IS_ERR(pt))
 			goto unwind_out;
 
-		gen8_initialize_pt(&ppgtt->base, pt);
+		gen8_initialize_pt(vm, pt);
 		pd->page_table[pde] = pt;
 		__set_bit(pde, new_pts);
 	}
@@ -834,7 +841,7 @@ unwind_out:
 
 /**
  * gen8_ppgtt_alloc_page_directories() - Allocate page directories for VA range.
- * @ppgtt:	Master ppgtt structure.
+ * @vm:		Master vm structure.
  * @pdp:	Page directory pointer for this address range.
  * @start:	Starting virtual address to begin allocations.
  * @length	Size of the allocations.
@@ -855,13 +862,14 @@ unwind_out:
  *
  * Return: 0 if success; negative error code otherwise.
  */
-static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
-				     struct i915_page_directory_pointer *pdp,
-				     uint64_t start,
-				     uint64_t length,
-				     unsigned long *new_pds)
+static int
+gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
+				  struct i915_page_directory_pointer *pdp,
+				  uint64_t start,
+				  uint64_t length,
+				  unsigned long *new_pds)
 {
-	struct drm_device *dev = ppgtt->base.dev;
+	struct drm_device *dev = vm->dev;
 	struct i915_page_directory *pd;
 	uint64_t temp;
 	uint32_t pdpe;
@@ -877,7 +885,7 @@ static int gen8_ppgtt_alloc_page_directories(struct i915_hw_ppgtt *ppgtt,
 		if (IS_ERR(pd))
 			goto unwind_out;
 
-		gen8_initialize_pd(&ppgtt->base, pd);
+		gen8_initialize_pd(vm, pd);
 		pdp->page_directory[pdpe] = pd;
 		__set_bit(pdpe, new_pds);
 	}
@@ -952,19 +960,21 @@ static void mark_tlbs_dirty(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->pd_dirty_rings = INTEL_INFO(ppgtt->base.dev)->ring_mask;
 }
 
-static int gen8_alloc_va_range(struct i915_address_space *vm,
-			       uint64_t start,
-			       uint64_t length)
+static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
+				    struct i915_page_directory_pointer *pdp,
+				    uint64_t start,
+				    uint64_t length)
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	unsigned long *new_page_dirs, **new_page_tables;
+	struct drm_device *dev = vm->dev;
 	struct i915_page_directory *pd;
 	const uint64_t orig_start = start;
 	const uint64_t orig_length = length;
 	uint64_t temp;
 	uint32_t pdpe;
-	uint32_t pdpes = I915_PDPES_PER_PDP(ppgtt->base.dev);
+	uint32_t pdpes = I915_PDPES_PER_PDP(dev);
 	int ret;
 
 	/* Wrap is never okay since we can only represent 48b, and we don't
@@ -973,7 +983,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	if (WARN_ON(start + length < start))
 		return -ENODEV;
 
-	if (WARN_ON(start + length > ppgtt->base.total))
+	if (WARN_ON(start + length > vm->total))
 		return -ENODEV;
 
 	ret = alloc_gen8_temp_bitmaps(&new_page_dirs, &new_page_tables, pdpes);
@@ -981,16 +991,15 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 		return ret;
 
 	/* Do the allocations first so we can easily bail out */
-	ret = gen8_ppgtt_alloc_page_directories(ppgtt, &ppgtt->pdp, start, length,
-					new_page_dirs);
+	ret = gen8_ppgtt_alloc_page_directories(vm, pdp, start, length,
+						new_page_dirs);
 	if (ret) {
 		free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
 		return ret;
 	}
 
-	/* For every page directory referenced, allocate page tables */
-	gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
-		ret = gen8_ppgtt_alloc_pagetabs(ppgtt, pd, start, length,
+	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
+		ret = gen8_ppgtt_alloc_pagetabs(vm, pd, start, length,
 						new_page_tables[pdpe]);
 		if (ret)
 			goto err_out;
@@ -1001,7 +1010,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 
 	/* Allocations have completed successfully, so set the bitmaps, and do
 	 * the mappings. */
-	gen8_for_each_pdpe(pd, &ppgtt->pdp, start, length, temp, pdpe) {
+	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
 		gen8_pde_t *const page_directory = kmap_px(pd);
 		struct i915_page_table *pt;
 		uint64_t pd_len = length;
@@ -1034,8 +1043,7 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 		}
 
 		kunmap_px(ppgtt, page_directory);
-
-		__set_bit(pdpe, ppgtt->pdp.used_pdpes);
+		__set_bit(pdpe, pdp->used_pdpes);
 	}
 
 	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
@@ -1045,17 +1053,38 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 err_out:
 	while (pdpe--) {
 		for_each_set_bit(temp, new_page_tables[pdpe], I915_PDES)
-			free_pt(vm->dev, ppgtt->pdp.page_directory[pdpe]->page_table[temp]);
+			free_pt(dev, pdp->page_directory[pdpe]->page_table[temp]);
 	}
 
 	for_each_set_bit(pdpe, new_page_dirs, pdpes)
-		free_pd(vm->dev, ppgtt->pdp.page_directory[pdpe]);
+		free_pd(dev, pdp->page_directory[pdpe]);
 
 	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
 	mark_tlbs_dirty(ppgtt);
 	return ret;
 }
 
+static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
+				    struct i915_pml4 *pml4,
+				    uint64_t start,
+				    uint64_t length)
+{
+	WARN_ON(1); /* to be implemented later */
+	return 0;
+}
+
+static int gen8_alloc_va_range(struct i915_address_space *vm,
+			       uint64_t start, uint64_t length)
+{
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
+
+	if (!USES_FULL_48BIT_PPGTT(vm->dev))
+		return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
+	else
+		return gen8_alloc_va_range_4lvl(vm, &ppgtt->pml4, start, length);
+}
+
 /*
  * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
  * with a net effect resembling a 2-level page table in normal x86 terms. Each
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 05/18] drm/i915/gen8: Add dynamic page trace events
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (3 preceding siblings ...)
  2015-07-07 15:14 ` [PATCH v4 04/18] drm/i915/gen8: Abstract PDP usage Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 06/18] drm/i915/gen8: implement alloc/free for 4lvl Michel Thierry
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

The dynamic page allocation patch series added it for GEN6, this patch
adds them for GEN8.

v2: Consolidate pagetable/page_directory events
v3: Multiple rebases.
v4: Rebase after s/page_tables/page_table/.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rebase after gen8_map_pagetable_range removal.
v7: Use generic page name (px) in DECLARE_EVENT_CLASS (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c |  6 ++++++
 drivers/gpu/drm/i915/i915_trace.h   | 32 ++++++++++++++++++++++++--------
 2 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index c3c2703..425ce65 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -828,6 +828,7 @@ static int gen8_ppgtt_alloc_pagetabs(struct i915_address_space *vm,
 		gen8_initialize_pt(vm, pt);
 		pd->page_table[pde] = pt;
 		__set_bit(pde, new_pts);
+		trace_i915_page_table_entry_alloc(vm, pde, start, GEN8_PDE_SHIFT);
 	}
 
 	return 0;
@@ -888,6 +889,7 @@ gen8_ppgtt_alloc_page_directories(struct i915_address_space *vm,
 		gen8_initialize_pd(vm, pd);
 		pdp->page_directory[pdpe] = pd;
 		__set_bit(pdpe, new_pds);
+		trace_i915_page_directory_entry_alloc(vm, pdpe, start, GEN8_PDPE_SHIFT);
 	}
 
 	return 0;
@@ -1037,6 +1039,10 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 			/* Map the PDE to the page table */
 			page_directory[pde] = gen8_pde_encode(px_dma(pt),
 							      I915_CACHE_LLC);
+			trace_i915_page_table_entry_map(&ppgtt->base, pde, pt,
+							gen8_pte_index(start),
+							gen8_pte_count(start, length),
+							GEN8_PTES);
 
 			/* NB: We haven't yet mapped ptes to pages. At this
 			 * point we're still relying on insert_entries() */
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 63328b6..e16ce43 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -186,33 +186,49 @@ DEFINE_EVENT(i915_va, i915_va_alloc,
 	     TP_ARGS(vm, start, length, name)
 );
 
-DECLARE_EVENT_CLASS(i915_page_table_entry,
-	TP_PROTO(struct i915_address_space *vm, u32 pde, u64 start, u64 pde_shift),
-	TP_ARGS(vm, pde, start, pde_shift),
+DECLARE_EVENT_CLASS(i915_px_entry,
+	TP_PROTO(struct i915_address_space *vm, u32 px, u64 start, u64 px_shift),
+	TP_ARGS(vm, px, start, px_shift),
 
 	TP_STRUCT__entry(
 		__field(struct i915_address_space *, vm)
-		__field(u32, pde)
+		__field(u32, px)
 		__field(u64, start)
 		__field(u64, end)
 	),
 
 	TP_fast_assign(
 		__entry->vm = vm;
-		__entry->pde = pde;
+		__entry->px = px;
 		__entry->start = start;
-		__entry->end = ((start + (1ULL << pde_shift)) & ~((1ULL << pde_shift)-1)) - 1;
+		__entry->end = ((start + (1ULL << px_shift)) & ~((1ULL << px_shift)-1)) - 1;
 	),
 
 	TP_printk("vm=%p, pde=%d (0x%llx-0x%llx)",
-		  __entry->vm, __entry->pde, __entry->start, __entry->end)
+		  __entry->vm, __entry->px, __entry->start, __entry->end)
 );
 
-DEFINE_EVENT(i915_page_table_entry, i915_page_table_entry_alloc,
+DEFINE_EVENT(i915_px_entry, i915_page_table_entry_alloc,
 	     TP_PROTO(struct i915_address_space *vm, u32 pde, u64 start, u64 pde_shift),
 	     TP_ARGS(vm, pde, start, pde_shift)
 );
 
+DEFINE_EVENT_PRINT(i915_px_entry, i915_page_directory_entry_alloc,
+		   TP_PROTO(struct i915_address_space *vm, u32 pdpe, u64 start, u64 pdpe_shift),
+		   TP_ARGS(vm, pdpe, start, pdpe_shift),
+
+		   TP_printk("vm=%p, pdpe=%d (0x%llx-0x%llx)",
+			     __entry->vm, __entry->px, __entry->start, __entry->end)
+);
+
+DEFINE_EVENT_PRINT(i915_px_entry, i915_page_directory_pointer_entry_alloc,
+		   TP_PROTO(struct i915_address_space *vm, u32 pml4e, u64 start, u64 pml4e_shift),
+		   TP_ARGS(vm, pml4e, start, pml4e_shift),
+
+		   TP_printk("vm=%p, pml4e=%d (0x%llx-0x%llx)",
+			     __entry->vm, __entry->px, __entry->start, __entry->end)
+);
+
 /* Avoid extra math because we only support two sizes. The format is defined by
  * bitmap_scnprintf. Each 32 bits is 8 HEX digits followed by comma */
 #define TRACE_PT_SIZE(bits) \
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 06/18] drm/i915/gen8: implement alloc/free for 4lvl
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (4 preceding siblings ...)
  2015-07-07 15:14 ` [PATCH v4 05/18] drm/i915/gen8: Add dynamic page trace events Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 07/18] drm/i915/gen8: Add 4 level switching infrastructure and lrc support Michel Thierry
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

PML4 has no special attributes, and there will always be a PML4.
So simply initialize it at creation, and destroy it at the end.

The code for 4lvl is able to call into the existing 3lvl page table code
to handle all of the lower levels.

v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the
compiler happy. And define ret only in one place.
Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl.
v3: Use i915_dma_unmap_single instead of pci API. Fix a
couple of incorrect checks when unmapping pdp and pd pages (Akash).
v4: Call __pdp_fini also for 32b PPGTT. Clean up alloc_pdp param list.
v5: Prevent (harmless) out of range access in gen8_for_each_pml4e.
v6: Simplify alloc_vma_range_4lvl and gen8_ppgtt_init_common error
paths. (Akash)
v7: Rebase, s/gen8_ppgtt_free_*/gen8_ppgtt_cleanup_*/.
v8: Change location of pml4_init/fini. It will make next patches
cleaner.
v9: Rebase after Mika's ppgtt cleanup / scratch merge patch series, while
trying to reuse as much as possible for pdp alloc. pml4_init/fini
replaced by setup/cleanup_px macros.
v10: Rebase after Mika's merged ppgtt cleanup patch series.
v11: Rebase after final merged version of Mika's ppgtt/scratch
patches.
v12: Fix pdpe start value in trace (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 161 ++++++++++++++++++++++++++++++------
 drivers/gpu/drm/i915/i915_gem_gtt.h |  12 ++-
 2 files changed, 145 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 425ce65..33df9b7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -560,12 +560,44 @@ static void __pdp_fini(struct i915_page_directory_pointer *pdp)
 	pdp->page_directory = NULL;
 }
 
+static struct
+i915_page_directory_pointer *alloc_pdp(struct drm_device *dev)
+{
+	struct i915_page_directory_pointer *pdp;
+	int ret = -ENOMEM;
+
+	WARN_ON(!USES_FULL_48BIT_PPGTT(dev));
+
+	pdp = kzalloc(sizeof(*pdp), GFP_KERNEL);
+	if (!pdp)
+		return ERR_PTR(-ENOMEM);
+
+	ret = __pdp_init(dev, pdp);
+	if (ret)
+		goto fail_bitmap;
+
+	ret = setup_px(dev, pdp);
+	if (ret)
+		goto fail_page_m;
+
+	return pdp;
+
+fail_page_m:
+	__pdp_fini(pdp);
+fail_bitmap:
+	kfree(pdp);
+
+	return ERR_PTR(ret);
+}
+
 static void free_pdp(struct drm_device *dev,
 		     struct i915_page_directory_pointer *pdp)
 {
 	__pdp_fini(pdp);
-	if (USES_FULL_48BIT_PPGTT(dev))
+	if (USES_FULL_48BIT_PPGTT(dev)) {
+		cleanup_px(dev, pdp);
 		kfree(pdp);
+	}
 }
 
 /* Broadwell Page Directory Pointer Descriptors */
@@ -759,28 +791,46 @@ static void gen8_free_scratch(struct i915_address_space *vm)
 	free_scratch_page(dev, vm->scratch_page);
 }
 
-static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
+static void gen8_ppgtt_cleanup_3lvl(struct drm_device *dev,
+				    struct i915_page_directory_pointer *pdp)
 {
-	struct i915_hw_ppgtt *ppgtt =
-		container_of(vm, struct i915_hw_ppgtt, base);
 	int i;
 
-	if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
-		for_each_set_bit(i, ppgtt->pdp.used_pdpes,
-				 I915_PDPES_PER_PDP(ppgtt->base.dev)) {
-			if (WARN_ON(!ppgtt->pdp.page_directory[i]))
-				continue;
+	for_each_set_bit(i, pdp->used_pdpes, I915_PDPES_PER_PDP(dev)) {
+		if (WARN_ON(!pdp->page_directory[i]))
+			continue;
 
-			gen8_free_page_tables(ppgtt->base.dev,
-					      ppgtt->pdp.page_directory[i]);
-			free_pd(ppgtt->base.dev,
-				ppgtt->pdp.page_directory[i]);
-		}
-		free_pdp(ppgtt->base.dev, &ppgtt->pdp);
-	} else {
-		WARN_ON(1); /* to be implemented later */
+		gen8_free_page_tables(dev, pdp->page_directory[i]);
+		free_pd(dev, pdp->page_directory[i]);
+	}
+
+	free_pdp(dev, pdp);
+}
+
+static void gen8_ppgtt_cleanup_4lvl(struct i915_hw_ppgtt *ppgtt)
+{
+	int i;
+
+	for_each_set_bit(i, ppgtt->pml4.used_pml4es, GEN8_PML4ES_PER_PML4) {
+		if (WARN_ON(!ppgtt->pml4.pdps[i]))
+			continue;
+
+		gen8_ppgtt_cleanup_3lvl(ppgtt->base.dev, ppgtt->pml4.pdps[i]);
 	}
 
+	cleanup_px(ppgtt->base.dev, &ppgtt->pml4);
+}
+
+static void gen8_ppgtt_cleanup(struct i915_address_space *vm)
+{
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
+
+	if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
+		gen8_ppgtt_cleanup_3lvl(ppgtt->base.dev, &ppgtt->pdp);
+	else
+		gen8_ppgtt_cleanup_4lvl(ppgtt);
+
 	gen8_free_scratch(vm);
 }
 
@@ -1075,8 +1125,61 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
 				    uint64_t start,
 				    uint64_t length)
 {
-	WARN_ON(1); /* to be implemented later */
+	DECLARE_BITMAP(new_pdps, GEN8_PML4ES_PER_PML4);
+	struct i915_page_directory_pointer *pdp;
+	const uint64_t orig_start = start;
+	const uint64_t orig_length = length;
+	uint64_t temp, pml4e;
+	int ret = 0;
+
+	/* Do the pml4 allocations first, so we don't need to track the newly
+	 * allocated tables below the pdp */
+	bitmap_zero(new_pdps, GEN8_PML4ES_PER_PML4);
+
+	/* The pagedirectory and pagetable allocations are done in the shared 3
+	 * and 4 level code. Just allocate the pdps.
+	 */
+	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
+		if (!pdp) {
+			WARN_ON(test_bit(pml4e, pml4->used_pml4es));
+			pdp = alloc_pdp(vm->dev);
+			if (IS_ERR(pdp))
+				goto err_out;
+
+			pml4->pdps[pml4e] = pdp;
+			__set_bit(pml4e, new_pdps);
+			trace_i915_page_directory_pointer_entry_alloc(vm,
+								      pml4e,
+								      start,
+								      GEN8_PML4E_SHIFT);
+		}
+	}
+
+	WARN(bitmap_weight(new_pdps, GEN8_PML4ES_PER_PML4) > 2,
+	     "The allocation has spanned more than 512GB. "
+	     "It is highly likely this is incorrect.");
+
+	start = orig_start;
+	length = orig_length;
+
+	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
+		WARN_ON(!pdp);
+
+		ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
+		if (ret)
+			goto err_out;
+	}
+
+	bitmap_or(pml4->used_pml4es, new_pdps, pml4->used_pml4es,
+		  GEN8_PML4ES_PER_PML4);
+
 	return 0;
+
+err_out:
+	for_each_set_bit(pml4e, new_pdps, GEN8_PML4ES_PER_PML4)
+		gen8_ppgtt_cleanup_3lvl(vm->dev, pml4->pdps[pml4e]);
+
+	return ret;
 }
 
 static int gen8_alloc_va_range(struct i915_address_space *vm,
@@ -1085,10 +1188,10 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 
-	if (!USES_FULL_48BIT_PPGTT(vm->dev))
-		return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
-	else
+	if (USES_FULL_48BIT_PPGTT(vm->dev))
 		return gen8_alloc_va_range_4lvl(vm, &ppgtt->pml4, start, length);
+	else
+		return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
 }
 
 /*
@@ -1116,9 +1219,14 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 
 	ppgtt->switch_mm = gen8_mm_switch;
 
-	if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
-		ret = __pdp_init(false, &ppgtt->pdp);
+	if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+		ret = setup_px(ppgtt->base.dev, &ppgtt->pml4);
+		if (ret)
+			goto free_scratch;
 
+		ppgtt->base.total = 1ULL << 48;
+	} else {
+		ret = __pdp_init(false, &ppgtt->pdp);
 		if (ret)
 			goto free_scratch;
 
@@ -1130,9 +1238,10 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 			 * 2GiB).
 			 */
 			ppgtt->base.total = to_i915(ppgtt->base.dev)->gtt.base.total;
-	} else {
-		ppgtt->base.total = 1ULL << 48;
-		return -EPERM; /* Not yet implemented */
+
+		trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base,
+							      0, 0,
+							      GEN8_PML4E_SHIFT);
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 6fa824e..660c9a5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -95,6 +95,7 @@ typedef uint64_t gen8_pde_t;
  */
 #define GEN8_PML4ES_PER_PML4		512
 #define GEN8_PML4E_SHIFT		39
+#define GEN8_PML4E_MASK			(GEN8_PML4ES_PER_PML4 - 1)
 #define GEN8_PDPE_SHIFT			30
 /* NB: GEN8_PDPE_MASK is untrue for 32b platforms, but it has no impact on 32b page
  * tables */
@@ -464,6 +465,14 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
 	     temp = min(temp, length),					\
 	     start += temp, length -= temp)
 
+#define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)	\
+	for (iter = gen8_pml4e_index(start);	\
+	     pdp = (pml4)->pdps[iter], length > 0 && iter < GEN8_PML4ES_PER_PML4;	\
+	     iter++,				\
+	     temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,	\
+	     temp = min(temp, length),					\
+	     start += temp, length -= temp)
+
 #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)		\
 	gen8_for_each_pdpe_e(pd, pdp, start, length, temp, iter, I915_PDPES_PER_PDP(dev))
 
@@ -484,8 +493,7 @@ static inline uint32_t gen8_pdpe_index(uint64_t address)
 
 static inline uint32_t gen8_pml4e_index(uint64_t address)
 {
-	WARN_ON(1); /* For 64B */
-	return 0;
+	return (address >> GEN8_PML4E_SHIFT) & GEN8_PML4E_MASK;
 }
 
 static inline size_t gen8_pte_count(uint64_t address, uint64_t length)
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 07/18] drm/i915/gen8: Add 4 level switching infrastructure and lrc support
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (5 preceding siblings ...)
  2015-07-07 15:14 ` [PATCH v4 06/18] drm/i915/gen8: implement alloc/free for 4lvl Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 08/18] drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT Michel Thierry
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains
the base address to PML4, while the other PDP registers are ignored.

In LRC, the addressing mode must be specified in every context descriptor.

v2: PML4 update in legacy context switch is left for historic reasons,
the preferred mode of operation is with lrc context based submission.
v3: s/gen8_map_page_directory/gen8_setup_page_directory and
s/gen8_map_page_directory_pointer/gen8_setup_page_directory_pointer.
Also, clflush will be needed for bxt. (Akash)
v4: Squashed lrc-specific code and use a macro to set PML4 register.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
PDP update in bb_start is only for legacy 32b mode.
v6: Rebase after final merged version of Mika's ppgtt/scratch
patches.

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 54 +++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_gem_gtt.h |  2 ++
 drivers/gpu/drm/i915/i915_reg.h     |  1 +
 drivers/gpu/drm/i915/intel_lrc.c    | 65 +++++++++++++++++++++++++++----------
 4 files changed, 99 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 33df9b7..844ca6e 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -211,6 +211,9 @@ static gen8_pde_t gen8_pde_encode(const dma_addr_t addr,
 	return pde;
 }
 
+#define gen8_pdpe_encode gen8_pde_encode
+#define gen8_pml4e_encode gen8_pde_encode
+
 static gen6_pte_t snb_pte_encode(dma_addr_t addr,
 				 enum i915_cache_level level,
 				 bool valid, u32 unused)
@@ -600,6 +603,35 @@ static void free_pdp(struct drm_device *dev,
 	}
 }
 
+static void
+gen8_setup_page_directory(struct i915_hw_ppgtt *ppgtt,
+			  struct i915_page_directory_pointer *pdp,
+			  struct i915_page_directory *pd,
+			  int index)
+{
+	gen8_ppgtt_pdpe_t *page_directorypo;
+
+	if (!USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
+		return;
+
+	page_directorypo = kmap_px(pdp);
+	page_directorypo[index] = gen8_pdpe_encode(px_dma(pd), I915_CACHE_LLC);
+	kunmap_px(ppgtt, page_directorypo);
+}
+
+static void
+gen8_setup_page_directory_pointer(struct i915_hw_ppgtt *ppgtt,
+				  struct i915_pml4 *pml4,
+				  struct i915_page_directory_pointer *pdp,
+				  int index)
+{
+	gen8_ppgtt_pml4e_t *pagemap = kmap_px(pml4);
+
+	WARN_ON(!USES_FULL_48BIT_PPGTT(ppgtt->base.dev));
+	pagemap[index] = gen8_pml4e_encode(px_dma(pdp), I915_CACHE_LLC);
+	kunmap_px(ppgtt, pagemap);
+}
+
 /* Broadwell Page Directory Pointer Descriptors */
 static int gen8_write_pdp(struct drm_i915_gem_request *req,
 			  unsigned entry,
@@ -625,8 +657,8 @@ static int gen8_write_pdp(struct drm_i915_gem_request *req,
 	return 0;
 }
 
-static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
-			  struct drm_i915_gem_request *req)
+static int gen8_legacy_mm_switch(struct i915_hw_ppgtt *ppgtt,
+				 struct drm_i915_gem_request *req)
 {
 	int i, ret;
 
@@ -641,6 +673,12 @@ static int gen8_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	return 0;
 }
 
+static int gen8_48b_mm_switch(struct i915_hw_ppgtt *ppgtt,
+			      struct drm_i915_gem_request *req)
+{
+	return gen8_write_pdp(req, 0, px_dma(&ppgtt->pml4));
+}
+
 static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 				   uint64_t start,
 				   uint64_t length,
@@ -1100,6 +1138,7 @@ static int gen8_alloc_va_range_3lvl(struct i915_address_space *vm,
 
 		kunmap_px(ppgtt, page_directory);
 		__set_bit(pdpe, pdp->used_pdpes);
+		gen8_setup_page_directory(ppgtt, pdp, pd, pdpe);
 	}
 
 	free_gen8_temp_bitmaps(new_page_dirs, new_page_tables, pdpes);
@@ -1126,6 +1165,8 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
 				    uint64_t length)
 {
 	DECLARE_BITMAP(new_pdps, GEN8_PML4ES_PER_PML4);
+	struct i915_hw_ppgtt *ppgtt =
+			container_of(vm, struct i915_hw_ppgtt, base);
 	struct i915_page_directory_pointer *pdp;
 	const uint64_t orig_start = start;
 	const uint64_t orig_length = length;
@@ -1168,6 +1209,8 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
 		ret = gen8_alloc_va_range_3lvl(vm, pdp, start, length);
 		if (ret)
 			goto err_out;
+
+		gen8_setup_page_directory_pointer(ppgtt, pml4, pdp, pml4e);
 	}
 
 	bitmap_or(pml4->used_pml4es, new_pdps, pml4->used_pml4es,
@@ -1217,14 +1260,13 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->base.unbind_vma = ppgtt_unbind_vma;
 	ppgtt->base.bind_vma = ppgtt_bind_vma;
 
-	ppgtt->switch_mm = gen8_mm_switch;
-
 	if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
 		ret = setup_px(ppgtt->base.dev, &ppgtt->pml4);
 		if (ret)
 			goto free_scratch;
 
 		ppgtt->base.total = 1ULL << 48;
+		ppgtt->switch_mm = gen8_48b_mm_switch;
 	} else {
 		ret = __pdp_init(false, &ppgtt->pdp);
 		if (ret)
@@ -1239,6 +1281,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 			 */
 			ppgtt->base.total = to_i915(ppgtt->base.dev)->gtt.base.total;
 
+		ppgtt->switch_mm = gen8_legacy_mm_switch;
 		trace_i915_page_directory_pointer_entry_alloc(&ppgtt->base,
 							      0, 0,
 							      GEN8_PML4E_SHIFT);
@@ -1436,8 +1479,9 @@ static void gen8_ppgtt_enable(struct drm_device *dev)
 	int j;
 
 	for_each_ring(ring, dev_priv, j) {
+		u32 four_level = USES_FULL_48BIT_PPGTT(dev) ? GEN8_GFX_PPGTT_48B : 0;
 		I915_WRITE(RING_MODE_GEN7(ring),
-			   _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
+			   _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE | four_level));
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 660c9a5..ae9463e 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -39,6 +39,8 @@ struct drm_i915_file_private;
 typedef uint32_t gen6_pte_t;
 typedef uint64_t gen8_pte_t;
 typedef uint64_t gen8_pde_t;
+typedef uint64_t gen8_ppgtt_pdpe_t;
+typedef uint64_t gen8_ppgtt_pml4e_t;
 
 #define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT)
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 2a29bcc..d7301fc 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1669,6 +1669,7 @@ enum skl_disp_power_wells {
 #define   GFX_REPLAY_MODE		(1<<11)
 #define   GFX_PSMI_GRANULARITY		(1<<10)
 #define   GFX_PPGTT_ENABLE		(1<<9)
+#define   GEN8_GFX_PPGTT_48B		(1<<7)
 
 #define VLV_DISPLAY_BASE 0x180000
 #define VLV_MIPI_BASE VLV_DISPLAY_BASE
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d4f8b43..d2a95af 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -195,13 +195,21 @@
 	reg_state[CTX_PDP ## n ## _LDW+1] = lower_32_bits(_addr); \
 }
 
+#define ASSIGN_CTX_PML4(ppgtt, reg_state) { \
+	reg_state[CTX_PDP0_UDW + 1] = upper_32_bits(px_dma(&ppgtt->pml4)); \
+	reg_state[CTX_PDP0_LDW + 1] = lower_32_bits(px_dma(&ppgtt->pml4)); \
+}
+
 enum {
 	ADVANCED_CONTEXT = 0,
-	LEGACY_CONTEXT,
+	LEGACY_32B_CONTEXT,
 	ADVANCED_AD_CONTEXT,
 	LEGACY_64B_CONTEXT
 };
-#define GEN8_CTX_MODE_SHIFT 3
+#define GEN8_CTX_ADDRESSING_MODE_SHIFT 3
+#define GEN8_CTX_ADDRESSING_MODE(dev)  (USES_FULL_48BIT_PPGTT(dev) ?\
+		LEGACY_64B_CONTEXT :\
+		LEGACY_32B_CONTEXT)
 enum {
 	FAULT_AND_HANG = 0,
 	FAULT_AND_HALT, /* Debug only */
@@ -272,7 +280,7 @@ static uint64_t execlists_ctx_descriptor(struct drm_i915_gem_request *rq)
 	WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);
 
 	desc = GEN8_CTX_VALID;
-	desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT;
+	desc |= GEN8_CTX_ADDRESSING_MODE(dev) << GEN8_CTX_ADDRESSING_MODE_SHIFT;
 	if (IS_GEN8(ctx_obj->base.dev))
 		desc |= GEN8_CTX_L3LLC_COHERENT;
 	desc |= GEN8_CTX_PRIVILEGE;
@@ -347,10 +355,16 @@ static int execlists_update_context(struct drm_i915_gem_request *rq)
 	reg_state[CTX_RING_TAIL+1] = rq->tail;
 	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);
 
-	/* True PPGTT with dynamic page allocation: update PDP registers and
-	 * point the unallocated PDPs to the scratch page
-	 */
-	if (ppgtt) {
+	if (ppgtt && USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+		/* True 64b PPGTT (48bit canonical)
+		 * PDP0_DESCRIPTOR contains the base address to PML4 and
+		 * other PDP Descriptors are ignored
+		 */
+		ASSIGN_CTX_PML4(ppgtt, reg_state);
+	} else if (ppgtt) {
+		/* True 32b PPGTT with dynamic page allocation: update PDP
+		 * registers and point the unallocated PDPs to the scratch page
+		 */
 		ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
 		ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
 		ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
@@ -1432,12 +1446,16 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
 	 * Ideally, we should set Force PD Restore in ctx descriptor,
 	 * but we can't. Force Restore would be a second option, but
 	 * it is unsafe in case of lite-restore (because the ctx is
-	 * not idle). */
+	 * not idle). PML4 is allocated during ppgtt init so this is
+	 * not needed in 48-bit.*/
 	if (req->ctx->ppgtt &&
 	    (intel_ring_flag(req->ring) & req->ctx->ppgtt->pd_dirty_rings)) {
-		ret = intel_logical_ring_emit_pdps(req);
-		if (ret)
-			return ret;
+		if (GEN8_CTX_ADDRESSING_MODE(req->i915) == LEGACY_32B_CONTEXT){
+			ret = intel_logical_ring_emit_pdps(req);
+
+			if (ret)
+				return ret;
+		}
 
 		req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring);
 	}
@@ -2104,13 +2122,24 @@ populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_o
 	reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
 	reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);
 
-	/* With dynamic page allocation, PDPs may not be allocated at this point,
-	 * Point the unallocated PDPs to the scratch page
-	 */
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
+	if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
+		/* 64b PPGTT (48bit canonical)
+		 * PDP0_DESCRIPTOR contains the base address to PML4 and
+		 * other PDP Descriptors are ignored.
+		 */
+		ASSIGN_CTX_PML4(ppgtt, reg_state);
+	} else {
+		/* 32b PPGTT
+		 * PDP*_DESCRIPTOR contains the base address of space supported.
+		 * With dynamic page allocation, PDPs may not be allocated at
+		 * this point. Point the unallocated PDPs to the scratch page
+		 */
+		ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
+		ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
+		ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
+		ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
+	}
+
 	if (ring->id == RCS) {
 		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
 		reg_state[CTX_R_PWR_CLK_STATE] = GEN8_R_PWR_CLK_STATE;
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 08/18] drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (6 preceding siblings ...)
  2015-07-07 15:14 ` [PATCH v4 07/18] drm/i915/gen8: Add 4 level switching infrastructure and lrc support Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 09/18] drm/i915/gen8: Pass sg_iter through pte inserts Michel Thierry
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

The insert_entries function was the function used to write PTEs. For the
PPGTT it was "hardcoded" to only understand two level page tables, which
was the case for GEN7. We can reuse this for 4 level page tables, and
remove the concept of insert_entries, which was never viable past 2
level page tables anyway, but it requires a bit of rework to make the
function a bit more generic.

This patch begins the generalization work, and it will be heavily used
upon when the 48b code is complete. The patch series attempts to make
each function which touches a part of code specific to the page table
level and here is no exception.

v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v3: Rebase after final merged version of Mika's ppgtt/scratch patches.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 52 +++++++++++++++++++++++++++----------
 1 file changed, 39 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 844ca6e..7a1d6f8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -679,24 +679,21 @@ static int gen8_48b_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	return gen8_write_pdp(req, 0, px_dma(&ppgtt->pml4));
 }
 
-static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
-				   uint64_t start,
-				   uint64_t length,
-				   bool use_scratch)
+static void gen8_ppgtt_clear_pte_range(struct i915_address_space *vm,
+				       struct i915_page_directory_pointer *pdp,
+				       uint64_t start,
+				       uint64_t length,
+				       gen8_pte_t scratch_pte)
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
-	struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
-	gen8_pte_t *pt_vaddr, scratch_pte;
+	gen8_pte_t *pt_vaddr;
 	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
 	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
 	unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
 	unsigned num_entries = length >> PAGE_SHIFT;
 	unsigned last_pte, i;
 
-	scratch_pte = gen8_pte_encode(px_dma(ppgtt->base.scratch_page),
-				      I915_CACHE_LLC, use_scratch);
-
 	while (num_entries) {
 		struct i915_page_directory *pd;
 		struct i915_page_table *pt;
@@ -735,14 +732,30 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 	}
 }
 
-static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
-				      struct sg_table *pages,
-				      uint64_t start,
-				      enum i915_cache_level cache_level, u32 unused)
+static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
+				   uint64_t start,
+				   uint64_t length,
+				   bool use_scratch)
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
+
+	gen8_pte_t scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page),
+						 I915_CACHE_LLC, use_scratch);
+
+	gen8_ppgtt_clear_pte_range(vm, pdp, start, length, scratch_pte);
+}
+
+static void
+gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm,
+			      struct i915_page_directory_pointer *pdp,
+			      struct sg_table *pages,
+			      uint64_t start,
+			      enum i915_cache_level cache_level)
+{
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_pte_t *pt_vaddr;
 	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
 	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
@@ -776,6 +789,19 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 		kunmap_px(ppgtt, pt_vaddr);
 }
 
+static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
+				      struct sg_table *pages,
+				      uint64_t start,
+				      enum i915_cache_level cache_level,
+				      u32 unused)
+{
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
+	struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
+
+	gen8_ppgtt_insert_pte_entries(vm, pdp, pages, start, cache_level);
+}
+
 static void gen8_free_page_tables(struct drm_device *dev,
 				  struct i915_page_directory *pd)
 {
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 09/18] drm/i915/gen8: Pass sg_iter through pte inserts
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (7 preceding siblings ...)
  2015-07-07 15:14 ` [PATCH v4 08/18] drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 10/18] drm/i915/gen8: Add 4 level support in insert_entries and clear_range Michel Thierry
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

As a step towards implementing 4 levels, while not discarding the
existing pte insert functions, we need to pass the sg_iter through.
The current function understands to the page directory granularity.
An object's pages may span the page directory, and so using the iter
directly as we write the PTEs allows the iterator to stay coherent
through a VMA insert operation spanning multiple page table levels.

v2: Rebase after s/page_tables/page_table/.
v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series;
updated commit message (s/map/insert).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7a1d6f8..030f688 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -750,7 +750,7 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 static void
 gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm,
 			      struct i915_page_directory_pointer *pdp,
-			      struct sg_table *pages,
+			      struct sg_page_iter *sg_iter,
 			      uint64_t start,
 			      enum i915_cache_level cache_level)
 {
@@ -760,11 +760,10 @@ gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm,
 	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
 	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
 	unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
-	struct sg_page_iter sg_iter;
 
 	pt_vaddr = NULL;
 
-	for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) {
+	while (__sg_page_iter_next(sg_iter)) {
 		if (pt_vaddr == NULL) {
 			struct i915_page_directory *pd = pdp->page_directory[pdpe];
 			struct i915_page_table *pt = pd->page_table[pde];
@@ -772,7 +771,7 @@ gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm,
 		}
 
 		pt_vaddr[pte] =
-			gen8_pte_encode(sg_page_iter_dma_address(&sg_iter),
+			gen8_pte_encode(sg_page_iter_dma_address(sg_iter),
 					cache_level, true);
 		if (++pte == GEN8_PTES) {
 			kunmap_px(ppgtt, pt_vaddr);
@@ -798,8 +797,10 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
+	struct sg_page_iter sg_iter;
 
-	gen8_ppgtt_insert_pte_entries(vm, pdp, pages, start, cache_level);
+	__sg_page_iter_start(&sg_iter, pages->sgl, sg_nents(pages->sgl), 0);
+	gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter, start, cache_level);
 }
 
 static void gen8_free_page_tables(struct drm_device *dev,
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 10/18] drm/i915/gen8: Add 4 level support in insert_entries and clear_range
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (8 preceding siblings ...)
  2015-07-07 15:14 ` [PATCH v4 09/18] drm/i915/gen8: Pass sg_iter through pte inserts Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 11/18] drm/i915/gen8: Initialize PDPs Michel Thierry
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

When 48b is enabled, gen8_ppgtt_insert_entries needs to read the Page Map
Level 4 (PML4), before it selects which Page Directory Pointer (PDP)
it will write to.

Similarly, gen8_ppgtt_clear_range needs to get the correct PDP/PD range.

This patch was inspired by Ben's "Depend exclusively on map and
unmap_vma".

v2: Rebase after s/page_tables/page_table/.
v3: Remove unnecessary pdpe loop in gen8_ppgtt_clear_range_4lvl and use
clamp_pdp in gen8_ppgtt_insert_entries (Akash).
v4: Merge gen8_ppgtt_clear_range_4lvl into gen8_ppgtt_clear_range to
maintain symmetry with gen8_ppgtt_insert_entries (Akash).
v5: Do not mix pages and bytes in insert_entries (Akash).
v6: Prevent overflow in sg_nents << PAGE_SHIFT, when inserting 4GB at
once.
v7: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
Use gen8_px_index functions, and remove unnecessary number of pages
parameter in insert_pte_entries.
v8: Change gen8_ppgtt_clear_pte_range to stop at PDP boundary, instead of
adding and extra clamp function; remove unnecessary pdp_start/pdp_len
variables (Akash).

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 49 +++++++++++++++++++++++++++----------
 1 file changed, 36 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 030f688..f12e2b6 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -688,9 +688,9 @@ static void gen8_ppgtt_clear_pte_range(struct i915_address_space *vm,
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_pte_t *pt_vaddr;
-	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
-	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
-	unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
+	unsigned pdpe = gen8_pdpe_index(start);
+	unsigned pde = gen8_pde_index(start);
+	unsigned pte = gen8_pte_index(start);
 	unsigned num_entries = length >> PAGE_SHIFT;
 	unsigned last_pte, i;
 
@@ -726,7 +726,8 @@ static void gen8_ppgtt_clear_pte_range(struct i915_address_space *vm,
 
 		pte = 0;
 		if (++pde == I915_PDES) {
-			pdpe++;
+			if (++pdpe == I915_PDPES_PER_PDP(vm->dev))
+				break;
 			pde = 0;
 		}
 	}
@@ -739,12 +740,21 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm,
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
-	struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
-
 	gen8_pte_t scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page),
 						 I915_CACHE_LLC, use_scratch);
 
-	gen8_ppgtt_clear_pte_range(vm, pdp, start, length, scratch_pte);
+	if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
+		gen8_ppgtt_clear_pte_range(vm, &ppgtt->pdp, start, length,
+					   scratch_pte);
+	} else {
+		uint64_t templ4, pml4e;
+		struct i915_page_directory_pointer *pdp;
+
+		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) {
+			gen8_ppgtt_clear_pte_range(vm, pdp, start, length,
+						   scratch_pte);
+		}
+	}
 }
 
 static void
@@ -757,9 +767,9 @@ gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm,
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
 	gen8_pte_t *pt_vaddr;
-	unsigned pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK;
-	unsigned pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK;
-	unsigned pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK;
+	unsigned pdpe = gen8_pdpe_index(start);
+	unsigned pde = gen8_pde_index(start);
+	unsigned pte = gen8_pte_index(start);
 
 	pt_vaddr = NULL;
 
@@ -777,7 +787,8 @@ gen8_ppgtt_insert_pte_entries(struct i915_address_space *vm,
 			kunmap_px(ppgtt, pt_vaddr);
 			pt_vaddr = NULL;
 			if (++pde == I915_PDES) {
-				pdpe++;
+				if (++pdpe == I915_PDPES_PER_PDP(vm->dev))
+					break;
 				pde = 0;
 			}
 			pte = 0;
@@ -796,11 +807,23 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm,
 {
 	struct i915_hw_ppgtt *ppgtt =
 		container_of(vm, struct i915_hw_ppgtt, base);
-	struct i915_page_directory_pointer *pdp = &ppgtt->pdp; /* FIXME: 48b */
 	struct sg_page_iter sg_iter;
 
 	__sg_page_iter_start(&sg_iter, pages->sgl, sg_nents(pages->sgl), 0);
-	gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter, start, cache_level);
+
+	if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
+		gen8_ppgtt_insert_pte_entries(vm, &ppgtt->pdp, &sg_iter, start,
+					      cache_level);
+	} else {
+		struct i915_page_directory_pointer *pdp;
+		uint64_t templ4, pml4e;
+		uint64_t length = (uint64_t)sg_nents(pages->sgl) << PAGE_SHIFT;
+
+		gen8_for_each_pml4e(pdp, &ppgtt->pml4, start, length, templ4, pml4e) {
+			gen8_ppgtt_insert_pte_entries(vm, pdp, &sg_iter,
+						      start, cache_level);
+		}
+	}
 }
 
 static void gen8_free_page_tables(struct drm_device *dev,
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 11/18] drm/i915/gen8: Initialize PDPs
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (9 preceding siblings ...)
  2015-07-07 15:14 ` [PATCH v4 10/18] drm/i915/gen8: Add 4 level support in insert_entries and clear_range Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 12/18] drm/i915: Expand error state's address width to 64b Michel Thierry
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

Similar to PDs, while setting up a page directory pointer, make all entries
of the pdp point to the scratch pdp before mapping (and make all its entries
point to the scratch page); this is to be safe in case of out of bound
access or  proactive prefetch.

v2: Handle scratch_pdp allocation failure correctly, and keep
initialize_px functions together (Akash)
v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Rely on
the added macros to initialize the pdps.
v4: Rebase after final merged version of Mika's ppgtt/scratch patches
(and removed commit message part related to v3).

Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 41 +++++++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_gem_gtt.h |  1 +
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index f12e2b6..eb724b7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -603,6 +603,27 @@ static void free_pdp(struct drm_device *dev,
 	}
 }
 
+static void gen8_initialize_pdp(struct i915_address_space *vm,
+				struct i915_page_directory_pointer *pdp)
+{
+	gen8_ppgtt_pdpe_t scratch_pdpe;
+
+	scratch_pdpe = gen8_pdpe_encode(px_dma(vm->scratch_pd), I915_CACHE_LLC);
+
+	fill_px(vm->dev, pdp, scratch_pdpe);
+}
+
+static void gen8_initialize_pml4(struct i915_address_space *vm,
+				 struct i915_pml4 *pml4)
+{
+	gen8_ppgtt_pml4e_t scratch_pml4e;
+
+	scratch_pml4e = gen8_pml4e_encode(px_dma(vm->scratch_pdp),
+					  I915_CACHE_LLC);
+
+	fill_px(vm->dev, pml4, scratch_pml4e);
+}
+
 static void
 gen8_setup_page_directory(struct i915_hw_ppgtt *ppgtt,
 			  struct i915_page_directory_pointer *pdp,
@@ -864,8 +885,20 @@ static int gen8_init_scratch(struct i915_address_space *vm)
 		return PTR_ERR(vm->scratch_pd);
 	}
 
+	if (USES_FULL_48BIT_PPGTT(dev)) {
+		vm->scratch_pdp = alloc_pdp(dev);
+		if (IS_ERR(vm->scratch_pdp)) {
+			free_pd(dev, vm->scratch_pd);
+			free_pt(dev, vm->scratch_pt);
+			free_scratch_page(dev, vm->scratch_page);
+			return PTR_ERR(vm->scratch_pdp);
+		}
+	}
+
 	gen8_initialize_pt(vm, vm->scratch_pt);
 	gen8_initialize_pd(vm, vm->scratch_pd);
+	if (USES_FULL_48BIT_PPGTT(dev))
+		gen8_initialize_pdp(vm, vm->scratch_pdp);
 
 	return 0;
 }
@@ -874,6 +907,8 @@ static void gen8_free_scratch(struct i915_address_space *vm)
 {
 	struct drm_device *dev = vm->dev;
 
+	if (USES_FULL_48BIT_PPGTT(dev))
+		free_pdp(dev, vm->scratch_pdp);
 	free_pd(dev, vm->scratch_pd);
 	free_pt(dev, vm->scratch_pt);
 	free_scratch_page(dev, vm->scratch_page);
@@ -1231,12 +1266,12 @@ static int gen8_alloc_va_range_4lvl(struct i915_address_space *vm,
 	 * and 4 level code. Just allocate the pdps.
 	 */
 	gen8_for_each_pml4e(pdp, pml4, start, length, temp, pml4e) {
-		if (!pdp) {
-			WARN_ON(test_bit(pml4e, pml4->used_pml4es));
+		if (!test_bit(pml4e, pml4->used_pml4es)) {
 			pdp = alloc_pdp(vm->dev);
 			if (IS_ERR(pdp))
 				goto err_out;
 
+			gen8_initialize_pdp(vm, pdp);
 			pml4->pdps[pml4e] = pdp;
 			__set_bit(pml4e, new_pdps);
 			trace_i915_page_directory_pointer_entry_alloc(vm,
@@ -1315,6 +1350,8 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 		if (ret)
 			goto free_scratch;
 
+		gen8_initialize_pml4(&ppgtt->base, &ppgtt->pml4);
+
 		ppgtt->base.total = 1ULL << 48;
 		ppgtt->switch_mm = gen8_48b_mm_switch;
 	} else {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index ae9463e..99815c9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -278,6 +278,7 @@ struct i915_address_space {
 	struct i915_page_scratch *scratch_page;
 	struct i915_page_table *scratch_pt;
 	struct i915_page_directory *scratch_pd;
+	struct i915_page_directory_pointer *scratch_pdp; /* GEN8+ & 48b PPGTT */
 
 	/**
 	 * List of objects currently involved in rendering.
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 12/18] drm/i915: Expand error state's address width to 64b
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (10 preceding siblings ...)
  2015-07-07 15:14 ` [PATCH v4 11/18] drm/i915/gen8: Initialize PDPs Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-11 20:10   ` Chris Wilson
  2015-07-07 15:14 ` [PATCH v4 13/18] drm/i915/gen8: Add ppgtt info and debug_dump Michel Thierry
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

v2: For semaphore errors, object is mapped to GGTT and offset will not
be > 4GB, print only lower 32-bits (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h       |  4 ++--
 drivers/gpu/drm/i915/i915_gpu_error.c | 18 ++++++++++--------
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index de3a5d1..91e195a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -546,7 +546,7 @@ struct drm_i915_error_state {
 
 		struct drm_i915_error_object {
 			int page_count;
-			u32 gtt_offset;
+			u64 gtt_offset;
 			u32 *pages[0];
 		} *ringbuffer, *batchbuffer, *wa_batchbuffer, *ctx, *hws_page;
 
@@ -572,7 +572,7 @@ struct drm_i915_error_state {
 		u32 size;
 		u32 name;
 		u32 rseqno[I915_NUM_RINGS], wseqno;
-		u32 gtt_offset;
+		u64 gtt_offset;
 		u32 read_domains;
 		u32 write_domain;
 		s32 fence_reg:I915_MAX_NUM_FENCE_BITS;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 6f42569..2c47ca7 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -197,7 +197,7 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
 	err_printf(m, "  %s [%d]:\n", name, count);
 
 	while (count--) {
-		err_printf(m, "    %08x %8u %02x %02x [ ",
+		err_printf(m, "    %016llx %8u %02x %02x [ ",
 			   err->gtt_offset,
 			   err->size,
 			   err->read_domains,
@@ -426,7 +426,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 				err_printf(m, " (submitted by %s [%d])",
 					   error->ring[i].comm,
 					   error->ring[i].pid);
-			err_printf(m, " --- gtt_offset = 0x%08x\n",
+			err_printf(m, " --- gtt_offset = 0x%016llx\n",
 				   obj->gtt_offset);
 			print_error_obj(m, obj);
 		}
@@ -434,7 +434,8 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		obj = error->ring[i].wa_batchbuffer;
 		if (obj) {
 			err_printf(m, "%s (w/a) --- gtt_offset = 0x%08x\n",
-				   dev_priv->ring[i].name, obj->gtt_offset);
+				   dev_priv->ring[i].name,
+				   lower_32_bits(obj->gtt_offset));
 			print_error_obj(m, obj);
 		}
 
@@ -453,14 +454,14 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		if ((obj = error->ring[i].ringbuffer)) {
 			err_printf(m, "%s --- ringbuffer = 0x%08x\n",
 				   dev_priv->ring[i].name,
-				   obj->gtt_offset);
+				   lower_32_bits(obj->gtt_offset));
 			print_error_obj(m, obj);
 		}
 
 		if ((obj = error->ring[i].hws_page)) {
 			err_printf(m, "%s --- HW Status = 0x%08x\n",
 				   dev_priv->ring[i].name,
-				   obj->gtt_offset);
+				   lower_32_bits(obj->gtt_offset));
 			offset = 0;
 			for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
 				err_printf(m, "[%04x] %08x %08x %08x %08x\n",
@@ -476,13 +477,14 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		if ((obj = error->ring[i].ctx)) {
 			err_printf(m, "%s --- HW Context = 0x%08x\n",
 				   dev_priv->ring[i].name,
-				   obj->gtt_offset);
+				   lower_32_bits(obj->gtt_offset));
 			print_error_obj(m, obj);
 		}
 	}
 
 	if ((obj = error->semaphore_obj)) {
-		err_printf(m, "Semaphore page = 0x%08x\n", obj->gtt_offset);
+		err_printf(m, "Semaphore page = 0x%08x\n",
+			   lower_32_bits(obj->gtt_offset));
 		for (elt = 0; elt < PAGE_SIZE/16; elt += 4) {
 			err_printf(m, "[%04x] %08x %08x %08x %08x\n",
 				   elt * 4,
@@ -590,7 +592,7 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
 	int num_pages;
 	bool use_ggtt;
 	int i = 0;
-	u32 reloc_offset;
+	u64 reloc_offset;
 
 	if (src == NULL || src->pages == NULL)
 		return NULL;
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 13/18] drm/i915/gen8: Add ppgtt info and debug_dump
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (11 preceding siblings ...)
  2015-07-07 15:14 ` [PATCH v4 12/18] drm/i915: Expand error state's address width to 64b Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-07 15:14 ` [PATCH v4 14/18] drm/i915: object size needs to be u64 Michel Thierry
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

v2: Clean up patch after rebases.
v3: gen8_dump_ppgtt for 32b and 48b PPGTT.
v4: Use used_pml4es/pdpes (Akash).
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rely on used_px bits instead of null checking (Akash)

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
---
 drivers/gpu/drm/i915/i915_debugfs.c | 18 ++++----
 drivers/gpu/drm/i915/i915_gem_gtt.c | 83 +++++++++++++++++++++++++++++++++++++
 2 files changed, 93 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 98fd3c9..a7aff26 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2237,7 +2237,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *ring;
-	struct drm_file *file;
 	int i;
 
 	if (INTEL_INFO(dev)->gen == 6)
@@ -2260,13 +2259,6 @@ static void gen6_ppgtt_info(struct seq_file *m, struct drm_device *dev)
 		ppgtt->debug_dump(ppgtt, m);
 	}
 
-	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
-		struct drm_i915_file_private *file_priv = file->driver_priv;
-
-		seq_printf(m, "proc: %s\n",
-			   get_pid_task(file->pid, PIDTYPE_PID)->comm);
-		idr_for_each(&file_priv->context_idr, per_file_ctx, m);
-	}
 	seq_printf(m, "ECOCHK: 0x%08x\n", I915_READ(GAM_ECOCHK));
 }
 
@@ -2275,6 +2267,7 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
 	struct drm_info_node *node = m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_file *file;
 
 	int ret = mutex_lock_interruptible(&dev->struct_mutex);
 	if (ret)
@@ -2286,6 +2279,15 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
 	else if (INTEL_INFO(dev)->gen >= 6)
 		gen6_ppgtt_info(m, dev);
 
+	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
+		struct drm_i915_file_private *file_priv = file->driver_priv;
+
+		seq_printf(m, "\nproc: %s\n",
+			   get_pid_task(file->pid, PIDTYPE_PID)->comm);
+		idr_for_each(&file_priv->context_idr, per_file_ctx,
+			     (void *)(unsigned long)m);
+	}
+
 	intel_runtime_pm_put(dev_priv);
 	mutex_unlock(&dev->struct_mutex);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index eb724b7..449a245 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1322,6 +1322,88 @@ static int gen8_alloc_va_range(struct i915_address_space *vm,
 		return gen8_alloc_va_range_3lvl(vm, &ppgtt->pdp, start, length);
 }
 
+static void gen8_dump_pdp(struct i915_page_directory_pointer *pdp,
+			  uint64_t start, uint64_t length,
+			  gen8_pte_t scratch_pte,
+			  struct seq_file *m)
+{
+	struct i915_page_directory *pd;
+	uint64_t temp;
+	uint32_t pdpe;
+
+	gen8_for_each_pdpe(pd, pdp, start, length, temp, pdpe) {
+		struct i915_page_table *pt;
+		uint64_t pd_len = length;
+		uint64_t pd_start = start;
+		uint32_t pde;
+
+		if(!test_bit(pdpe, pdp->used_pdpes))
+			continue;
+
+		seq_printf(m, "\tPDPE #%d\n", pdpe);
+		gen8_for_each_pde(pt, pd, pd_start, pd_len, temp, pde) {
+			uint32_t  pte;
+			gen8_pte_t *pt_vaddr;
+
+			if(!test_bit(pde, pd->used_pdes))
+				continue;
+
+			pt_vaddr = kmap_px(pt);
+			for (pte = 0; pte < GEN8_PTES; pte+=4) {
+				uint64_t va =
+					(pdpe << GEN8_PDPE_SHIFT) |
+					(pde << GEN8_PDE_SHIFT) |
+					(pte << GEN8_PTE_SHIFT);
+				int i;
+				bool found = false;
+				for (i = 0; i < 4; i++)
+					if (pt_vaddr[pte + i] != scratch_pte)
+						found = true;
+				if (!found)
+					continue;
+
+				seq_printf(m, "\t\t0x%llx [%03d,%03d,%04d]: =", va, pdpe, pde, pte);
+				for (i = 0; i < 4; i++) {
+					if (pt_vaddr[pte + i] != scratch_pte)
+						seq_printf(m, " %llx", pt_vaddr[pte + i]);
+					else
+						seq_puts(m, "  SCRATCH ");
+				}
+				seq_puts(m, "\n");
+			}
+			/* don't use kunmap_px, it could trigger
+			 * an unnecessary flush.
+			 */
+			kunmap_atomic(pt_vaddr);
+		}
+	}
+}
+
+static void gen8_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
+{
+	struct i915_address_space *vm = &ppgtt->base;
+	uint64_t start = ppgtt->base.start;
+	uint64_t length = ppgtt->base.total;
+	gen8_pte_t scratch_pte = gen8_pte_encode(px_dma(vm->scratch_page),
+						 I915_CACHE_LLC, true);
+
+	if (!USES_FULL_48BIT_PPGTT(vm->dev)) {
+		gen8_dump_pdp(&ppgtt->pdp, start, length, scratch_pte, m);
+	} else {
+		uint64_t templ4, pml4e;
+		struct i915_pml4 *pml4 = &ppgtt->pml4;
+		struct i915_page_directory_pointer *pdp;
+
+		gen8_for_each_pml4e(pdp, pml4, start, length, templ4, pml4e) {
+			if (!test_bit(pml4e, pml4->used_pml4es))
+				continue;
+
+			seq_printf(m, "    PML4E #%llu\n", pml4e);
+			gen8_dump_pdp(pdp, start, length, scratch_pte, m);
+		}
+	}
+}
+
 /*
  * GEN8 legacy ppgtt programming is accomplished through a max 4 PDP registers
  * with a net effect resembling a 2-level page table in normal x86 terms. Each
@@ -1344,6 +1426,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->base.clear_range = gen8_ppgtt_clear_range;
 	ppgtt->base.unbind_vma = ppgtt_unbind_vma;
 	ppgtt->base.bind_vma = ppgtt_bind_vma;
+	ppgtt->debug_dump = gen8_dump_ppgtt;
 
 	if (USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
 		ret = setup_px(ppgtt->base.dev, &ppgtt->pml4);
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 14/18] drm/i915: object size needs to be u64
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (12 preceding siblings ...)
  2015-07-07 15:14 ` [PATCH v4 13/18] drm/i915/gen8: Add ppgtt info and debug_dump Michel Thierry
@ 2015-07-07 15:14 ` Michel Thierry
  2015-07-07 15:27   ` Chris Wilson
  2015-07-07 15:15 ` [PATCH v4 15/18] drm/i915: batch_obj vm offset must " Michel Thierry
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:14 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

In a 48b world, users can try to allocate buffers bigger than 4GB; in
these cases it is important that size is a 64b variable.

Also added a warning for illegal bind with size = 0.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c     | 5 +++--
 drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a0bff41..ebfb789 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3718,7 +3718,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	u32 size, fence_size, fence_alignment, unfenced_alignment;
+	u32 fence_alignment, unfenced_alignment;
+	u64 size, fence_size;
 	u64 start =
 		flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
 	u64 end =
@@ -3777,7 +3778,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 	 * attempt to find space.
 	 */
 	if (size > end) {
-		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%u > %s aperture=%llu\n",
+		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%llu > %s aperture=%llu\n",
 			  ggtt_view ? ggtt_view->type : 0,
 			  size,
 			  flags & PIN_MAPPABLE ? "mappable" : "total",
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 449a245..900bce6 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3312,6 +3312,9 @@ int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
 	if (WARN_ON(flags == 0))
 		return -EINVAL;
 
+	if (WARN_ON(vma->node.size == 0))
+		return -EINVAL;
+
 	bind_flags = 0;
 	if (flags & PIN_GLOBAL)
 		bind_flags |= GLOBAL_BIND;
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 15/18] drm/i915: batch_obj vm offset must be u64
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (13 preceding siblings ...)
  2015-07-07 15:14 ` [PATCH v4 14/18] drm/i915: object size needs to be u64 Michel Thierry
@ 2015-07-07 15:15 ` Michel Thierry
  2015-07-07 15:15 ` [PATCH v4 16/18] drm/i915/userptr: Kill user_size limit check Michel Thierry
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:15 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

Otherwise it can overflow in 48-bit mode, and cause an incorrect
exec_start.

Before commit 5f19e2bff ("drm/i915: Merged the many do_execbuf()
parameters into a structure"), it was already an u64, so it could be
seen as a regression (or as an optimization that looked good at that time).

Cc: John Harrison <john.c.harrison@Intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 91e195a..4a30a73 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1671,7 +1671,7 @@ struct i915_execbuffer_params {
 	struct drm_file                 *file;
 	uint32_t                        dispatch_flags;
 	uint32_t                        args_batch_start_offset;
-	uint32_t                        batch_obj_vm_offset;
+	uint64_t                        batch_obj_vm_offset;
 	struct intel_engine_cs          *ring;
 	struct drm_i915_gem_object      *batch_obj;
 	struct intel_context            *ctx;
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 16/18] drm/i915/userptr: Kill user_size limit check
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (14 preceding siblings ...)
  2015-07-07 15:15 ` [PATCH v4 15/18] drm/i915: batch_obj vm offset must " Michel Thierry
@ 2015-07-07 15:15 ` Michel Thierry
  2015-07-07 15:15 ` [PATCH v4 17/18] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:15 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

GTT was only 32b and its max value is 4GB. In order to allow objects
bigger than 4GB in 48b PPGTT, i915_gem_userptr_ioctl we could check
against max 48b range (1ULL << 48).

But since the check no longer applies, just kill the limit.

v2: Use the default ctx to infer the ppgtt max size (Akash).
v3: Just kill the limit, it was only there for early detection of an
error when used for execbuffer (Chris).

Cc: Akash Goel <akash.goel@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_userptr.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 1f4e5a3..1b66e39 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -788,7 +788,6 @@ static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
 int
 i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_userptr *args = data;
 	struct drm_i915_gem_object *obj;
 	int ret;
@@ -801,9 +800,6 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file
 	if (offset_in_page(args->user_ptr | args->user_size))
 		return -EINVAL;
 
-	if (args->user_size > dev_priv->gtt.base.total)
-		return -E2BIG;
-
 	if (!access_ok(args->flags & I915_USERPTR_READ_ONLY ? VERIFY_READ : VERIFY_WRITE,
 		       (char __user *)(unsigned long)args->user_ptr, args->user_size))
 		return -EFAULT;
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 17/18] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (15 preceding siblings ...)
  2015-07-07 15:15 ` [PATCH v4 16/18] drm/i915/userptr: Kill user_size limit check Michel Thierry
@ 2015-07-07 15:15 ` Michel Thierry
  2015-07-09 16:19   ` Michel Thierry
  2015-07-11 18:51   ` Chris Wilson
  2015-07-07 15:15 ` [PATCH v4 18/18] drm/i915/gen8: Flip the 48b switch Michel Thierry
                   ` (3 subsequent siblings)
  20 siblings, 2 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:15 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

There are some allocations that must be only referenced by 32-bit
offsets. To limit the chances of having the first 4GB already full,
objects not requiring this workaround use DRM_MM_SEARCH_BELOW/
DRM_MM_CREATE_TOP flags

In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
General State Heap (GSH) or Intructions State Heap (ISH) must be in a
32-bit range, because the General State Offset and Instruction State
Offset are limited to 32-bits.

Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if
they can be allocated above the 32-bit address range. To limit the
chances of having the first 4GB already full, objects will use
DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible.

v2: Changed flag logic from neeeds_32b, to supports_48b.
v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel)
v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK
to use last PIN_ defined instead of hard-coded value; use correct limit
check in eb_vma_misplaced. (Chris)
v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  2 ++
 drivers/gpu/drm/i915/i915_gem.c            | 14 ++++++++++++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +++++++++++++
 include/uapi/drm/i915_drm.h                |  3 ++-
 4 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4a30a73..fc88e58 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2772,6 +2772,8 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
 #define PIN_OFFSET_BIAS	(1<<3)
 #define PIN_USER	(1<<4)
 #define PIN_UPDATE	(1<<5)
+#define PIN_ZONE_4G	(1<<6)
+#define PIN_HIGH	(1<<7)
 #define PIN_OFFSET_MASK (~4095)
 int __must_check
 i915_gem_object_pin(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ebfb789..b13900d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3720,6 +3720,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 fence_alignment, unfenced_alignment;
 	u64 size, fence_size;
+	u32 search_flag = DRM_MM_SEARCH_DEFAULT;
+	u32 alloc_flag = DRM_MM_CREATE_DEFAULT;
 	u64 start =
 		flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
 	u64 end =
@@ -3761,6 +3763,14 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
 						   obj->tiling_mode,
 						   false);
 		size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
+
+		if (flags & PIN_HIGH) {
+			search_flag = DRM_MM_SEARCH_BELOW;
+			alloc_flag = DRM_MM_CREATE_TOP;
+		}
+
+		if (flags & PIN_ZONE_4G)
+			end = (1ULL << 32);
 	}
 
 	if (alignment == 0)
@@ -3803,8 +3813,8 @@ search_free:
 						  size, alignment,
 						  obj->cache_level,
 						  start, end,
-						  DRM_MM_SEARCH_DEFAULT,
-						  DRM_MM_CREATE_DEFAULT);
+						  search_flag,
+						  alloc_flag);
 	if (ret) {
 		ret = i915_gem_evict_something(dev, vm, size, alignment,
 					       obj->cache_level,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 83577c6..f2b43a2 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -588,11 +588,20 @@ i915_gem_execbuffer_reserve_vma(struct i915_vma *vma,
 	if (entry->flags & EXEC_OBJECT_NEEDS_GTT)
 		flags |= PIN_GLOBAL;
 
+	/* Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
+	 * limit address to the first 4GBs for unflagged objects.
+	 */
+	flags |= PIN_ZONE_4G;
+	if (entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS)
+		flags &= ~PIN_ZONE_4G;
+
 	if (!drm_mm_node_allocated(&vma->node)) {
 		if (entry->flags & __EXEC_OBJECT_NEEDS_MAP)
 			flags |= PIN_GLOBAL | PIN_MAPPABLE;
 		if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS)
 			flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS;
+		if ((flags & PIN_MAPPABLE) == 0)
+			flags |= PIN_HIGH;
 	}
 
 	ret = i915_gem_object_pin(obj, vma->vm, entry->alignment, flags);
@@ -670,6 +679,10 @@ eb_vma_misplaced(struct i915_vma *vma)
 	if (entry->flags & __EXEC_OBJECT_NEEDS_MAP && !obj->map_and_fenceable)
 		return !only_mappable_for_reloc(entry->flags);
 
+	if (!(entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
+	    (vma->node.start + vma->node.size) >= (1ULL << 32))
+		return true;
+
 	return false;
 }
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index e7c29f1..e4471e8 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -686,7 +686,8 @@ struct drm_i915_gem_exec_object2 {
 #define EXEC_OBJECT_NEEDS_FENCE (1<<0)
 #define EXEC_OBJECT_NEEDS_GTT	(1<<1)
 #define EXEC_OBJECT_WRITE	(1<<2)
-#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_WRITE<<1)
+#define EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1<<3)
+#define __EXEC_OBJECT_UNKNOWN_FLAGS -(EXEC_OBJECT_SUPPORTS_48B_ADDRESS<<1)
 	__u64 flags;
 
 	__u64 rsvd1;
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v4 18/18] drm/i915/gen8: Flip the 48b switch
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (16 preceding siblings ...)
  2015-07-07 15:15 ` [PATCH v4 17/18] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
@ 2015-07-07 15:15 ` Michel Thierry
  2015-07-11  3:56   ` shuang.he
  2015-07-10  9:39 ` [PATCH v4 00/18] 48-bit PPGTT Chris Wilson
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:15 UTC (permalink / raw)
  To: intel-gfx; +Cc: akash.goel

Use 48b addresses if hw supports it (i915.enable_ppgtt=3).

Note, aliasing PPGTT remains 32b only.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 4 ++--
 drivers/gpu/drm/i915/i915_params.c  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 900bce6..097ec5b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -110,7 +110,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 	has_full_ppgtt = INTEL_INFO(dev)->gen >= 7;
 	has_full_64bit_ppgtt = IS_ENABLED(CONFIG_X86_64) &&
 			       (IS_BROADWELL(dev) ||
-				INTEL_INFO(dev)->gen >= 9) && false; /* FIXME: 64b */
+				INTEL_INFO(dev)->gen >= 9);
 
 	if (intel_vgpu_active(dev))
 		has_full_ppgtt = false; /* emulation is too hard */
@@ -148,7 +148,7 @@ static int sanitize_enable_ppgtt(struct drm_device *dev, int enable_ppgtt)
 	}
 
 	if (INTEL_INFO(dev)->gen >= 8 && i915.enable_execlists)
-		return 2;
+		return has_full_64bit_ppgtt ? 3 : 2;
 	else
 		return has_aliasing_ppgtt ? 1 : 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 7983fe4..ccf3eb2 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -110,7 +110,7 @@ MODULE_PARM_DESC(enable_hangcheck,
 module_param_named_unsafe(enable_ppgtt, i915.enable_ppgtt, int, 0400);
 MODULE_PARM_DESC(enable_ppgtt,
 	"Override PPGTT usage. "
-	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full)");
+	"(-1=auto [default], 0=disabled, 1=aliasing, 2=full, 3=full_64b)");
 
 module_param_named(enable_execlists, i915.enable_execlists, int, 0400);
 MODULE_PARM_DESC(enable_execlists,
-- 
2.4.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 14/18] drm/i915: object size needs to be u64
  2015-07-07 15:14 ` [PATCH v4 14/18] drm/i915: object size needs to be u64 Michel Thierry
@ 2015-07-07 15:27   ` Chris Wilson
  2015-07-07 15:44     ` Michel Thierry
  0 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2015-07-07 15:27 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, akash.goel

On Tue, Jul 07, 2015 at 04:14:59PM +0100, Michel Thierry wrote:
> In a 48b world, users can try to allocate buffers bigger than 4GB; in
> these cases it is important that size is a 64b variable.
> 
> Also added a warning for illegal bind with size = 0.
> 
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c     | 5 +++--
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a0bff41..ebfb789 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3718,7 +3718,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	u32 size, fence_size, fence_alignment, unfenced_alignment;
> +	u32 fence_alignment, unfenced_alignment;
> +	u64 size, fence_size;
>  	u64 start =
>  		flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
>  	u64 end =
> @@ -3777,7 +3778,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>  	 * attempt to find space.
>  	 */
>  	if (size > end) {
> -		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%u > %s aperture=%llu\n",
> +		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%llu > %s aperture=%llu\n",
>  			  ggtt_view ? ggtt_view->type : 0,
>  			  size,
>  			  flags & PIN_MAPPABLE ? "mappable" : "total",
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 449a245..900bce6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -3312,6 +3312,9 @@ int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
>  	if (WARN_ON(flags == 0))
>  		return -EINVAL;
>  
> +	if (WARN_ON(vma->node.size == 0))
> +		return -EINVAL;

This is superfluous. We don't allow size=0 object creation, and the test
is better (if at all) at vma_create, but what you mean here is
WARN_ON(!drm_mm_node_allocated()) which seems sensisble. And both of
these would be better as ENODEV so we don't confuse the user when they
get propagated back to userspace.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 14/18] drm/i915: object size needs to be u64
  2015-07-07 15:27   ` Chris Wilson
@ 2015-07-07 15:44     ` Michel Thierry
  2015-07-07 20:08       ` Chris Wilson
  0 siblings, 1 reply; 39+ messages in thread
From: Michel Thierry @ 2015-07-07 15:44 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, akash.goel

On 7/7/2015 4:27 PM, Chris Wilson wrote:
> On Tue, Jul 07, 2015 at 04:14:59PM +0100, Michel Thierry wrote:
>> In a 48b world, users can try to allocate buffers bigger than 4GB; in
>> these cases it is important that size is a 64b variable.
>>
>> Also added a warning for illegal bind with size = 0.
>>
>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gem.c     | 5 +++--
>>   drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
>>   2 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index a0bff41..ebfb789 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -3718,7 +3718,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>>   {
>>   	struct drm_device *dev = obj->base.dev;
>>   	struct drm_i915_private *dev_priv = dev->dev_private;
>> -	u32 size, fence_size, fence_alignment, unfenced_alignment;
>> +	u32 fence_alignment, unfenced_alignment;
>> +	u64 size, fence_size;
>>   	u64 start =
>>   		flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
>>   	u64 end =
>> @@ -3777,7 +3778,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>>   	 * attempt to find space.
>>   	 */
>>   	if (size > end) {
>> -		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%u > %s aperture=%llu\n",
>> +		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%llu > %s aperture=%llu\n",
>>   			  ggtt_view ? ggtt_view->type : 0,
>>   			  size,
>>   			  flags & PIN_MAPPABLE ? "mappable" : "total",
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> index 449a245..900bce6 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> @@ -3312,6 +3312,9 @@ int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
>>   	if (WARN_ON(flags == 0))
>>   		return -EINVAL;
>>
>> +	if (WARN_ON(vma->node.size == 0))
>> +		return -EINVAL;
>
> This is superfluous. We don't allow size=0 object creation, and the test
> is better (if at all) at vma_create, but what you mean here is
> WARN_ON(!drm_mm_node_allocated()) which seems sensisble. And both of
> these would be better as ENODEV so we don't confuse the user when they
> get propagated back to userspace.
> -Chris
>
My idea was to catch the node.size overflow if the variable is 
inadvertently changed back to u32 (which has already happen in other 
places).

I'll change it to use drm_mm_node_allocated and return ENODEV in both 
places.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 14/18] drm/i915: object size needs to be u64
  2015-07-07 15:44     ` Michel Thierry
@ 2015-07-07 20:08       ` Chris Wilson
  2015-07-08 11:22         ` Michel Thierry
  0 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2015-07-07 20:08 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, akash.goel

On Tue, Jul 07, 2015 at 04:44:30PM +0100, Michel Thierry wrote:
> On 7/7/2015 4:27 PM, Chris Wilson wrote:
> >On Tue, Jul 07, 2015 at 04:14:59PM +0100, Michel Thierry wrote:
> >>In a 48b world, users can try to allocate buffers bigger than 4GB; in
> >>these cases it is important that size is a 64b variable.
> >>
> >>Also added a warning for illegal bind with size = 0.
> >>
> >>Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_gem.c     | 5 +++--
> >>  drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
> >>  2 files changed, 6 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>index a0bff41..ebfb789 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>@@ -3718,7 +3718,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> >>  {
> >>  	struct drm_device *dev = obj->base.dev;
> >>  	struct drm_i915_private *dev_priv = dev->dev_private;
> >>-	u32 size, fence_size, fence_alignment, unfenced_alignment;
> >>+	u32 fence_alignment, unfenced_alignment;
> >>+	u64 size, fence_size;
> >>  	u64 start =
> >>  		flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
> >>  	u64 end =
> >>@@ -3777,7 +3778,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> >>  	 * attempt to find space.
> >>  	 */
> >>  	if (size > end) {
> >>-		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%u > %s aperture=%llu\n",
> >>+		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%llu > %s aperture=%llu\n",
> >>  			  ggtt_view ? ggtt_view->type : 0,
> >>  			  size,
> >>  			  flags & PIN_MAPPABLE ? "mappable" : "total",
> >>diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> >>index 449a245..900bce6 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> >>@@ -3312,6 +3312,9 @@ int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> >>  	if (WARN_ON(flags == 0))
> >>  		return -EINVAL;
> >>
> >>+	if (WARN_ON(vma->node.size == 0))
> >>+		return -EINVAL;
> >
> >This is superfluous. We don't allow size=0 object creation, and the test
> >is better (if at all) at vma_create, but what you mean here is
> >WARN_ON(!drm_mm_node_allocated()) which seems sensisble. And both of
> >these would be better as ENODEV so we don't confuse the user when they
> >get propagated back to userspace.
> >-Chris
> >
> My idea was to catch the node.size overflow if the variable is
> inadvertently changed back to u32 (which has already happen in other
> places).

Ok, that didn't come across when I just read node.size == 0 (what are
chances that node.size was exactly 2^32 and then truncated?)

vma->node should be fairly opaque, and if possible we want the checks in
drm_mm.c - if we can think of good tests for that layer.

Certainly drm_mm_reserve_node() probably wants a few sanity checks.
Though most of those should fall out when it can't do the reservation
the user requests.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 14/18] drm/i915: object size needs to be u64
  2015-07-07 20:08       ` Chris Wilson
@ 2015-07-08 11:22         ` Michel Thierry
  2015-07-08 15:22           ` Daniel Vetter
  0 siblings, 1 reply; 39+ messages in thread
From: Michel Thierry @ 2015-07-08 11:22 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, akash.goel

On 7/7/2015 9:08 PM, Chris Wilson wrote:
> On Tue, Jul 07, 2015 at 04:44:30PM +0100, Michel Thierry wrote:
>> On 7/7/2015 4:27 PM, Chris Wilson wrote:
>>> On Tue, Jul 07, 2015 at 04:14:59PM +0100, Michel Thierry wrote:
>>>> In a 48b world, users can try to allocate buffers bigger than 4GB; in
>>>> these cases it is important that size is a 64b variable.
>>>>
>>>> Also added a warning for illegal bind with size = 0.
>>>>
>>>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/i915_gem.c     | 5 +++--
>>>>   drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
>>>>   2 files changed, 6 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>>> index a0bff41..ebfb789 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>> @@ -3718,7 +3718,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>>>>   {
>>>>   	struct drm_device *dev = obj->base.dev;
>>>>   	struct drm_i915_private *dev_priv = dev->dev_private;
>>>> -	u32 size, fence_size, fence_alignment, unfenced_alignment;
>>>> +	u32 fence_alignment, unfenced_alignment;
>>>> +	u64 size, fence_size;
>>>>   	u64 start =
>>>>   		flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
>>>>   	u64 end =
>>>> @@ -3777,7 +3778,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>>>>   	 * attempt to find space.
>>>>   	 */
>>>>   	if (size > end) {
>>>> -		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%u > %s aperture=%llu\n",
>>>> +		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%llu > %s aperture=%llu\n",
>>>>   			  ggtt_view ? ggtt_view->type : 0,
>>>>   			  size,
>>>>   			  flags & PIN_MAPPABLE ? "mappable" : "total",
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>> index 449a245..900bce6 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>> @@ -3312,6 +3312,9 @@ int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
>>>>   	if (WARN_ON(flags == 0))
>>>>   		return -EINVAL;
>>>>
>>>> +	if (WARN_ON(vma->node.size == 0))
>>>> +		return -EINVAL;
>>>
>>> This is superfluous. We don't allow size=0 object creation, and the test
>>> is better (if at all) at vma_create, but what you mean here is
>>> WARN_ON(!drm_mm_node_allocated()) which seems sensisble. And both of
>>> these would be better as ENODEV so we don't confuse the user when they
>>> get propagated back to userspace.
>>> -Chris
>>>
>> My idea was to catch the node.size overflow if the variable is
>> inadvertently changed back to u32 (which has already happen in other
>> places).
>
> Ok, that didn't come across when I just read node.size == 0 (what are
> chances that node.size was exactly 2^32 and then truncated?)
>
Only a test explicitly looking for this kind of issues (I guess). In 
that test, objects bigger than 2^32 were truncated, while objects 
exactly 2^32 were hitting a WARN in the driver; alloc_pages wouldn't do 
anything because node.size == 0, and then insert would complain no pages 
existed.

> vma->node should be fairly opaque, and if possible we want the checks in
> drm_mm.c - if we can think of good tests for that layer.
>
> Certainly drm_mm_reserve_node() probably wants a few sanity checks.
> Though most of those should fall out when it can't do the reservation
> the user requests.
Or change drm_mm_insert_node_in_range_generic() to warn when size==0?


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 14/18] drm/i915: object size needs to be u64
  2015-07-08 11:22         ` Michel Thierry
@ 2015-07-08 15:22           ` Daniel Vetter
  2015-07-08 16:42             ` Michel Thierry
  0 siblings, 1 reply; 39+ messages in thread
From: Daniel Vetter @ 2015-07-08 15:22 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, akash.goel

On Wed, Jul 08, 2015 at 12:22:58PM +0100, Michel Thierry wrote:
> On 7/7/2015 9:08 PM, Chris Wilson wrote:
> >On Tue, Jul 07, 2015 at 04:44:30PM +0100, Michel Thierry wrote:
> >>On 7/7/2015 4:27 PM, Chris Wilson wrote:
> >>>On Tue, Jul 07, 2015 at 04:14:59PM +0100, Michel Thierry wrote:
> >>>>In a 48b world, users can try to allocate buffers bigger than 4GB; in
> >>>>these cases it is important that size is a 64b variable.
> >>>>
> >>>>Also added a warning for illegal bind with size = 0.
> >>>>
> >>>>Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> >>>>---
> >>>>  drivers/gpu/drm/i915/i915_gem.c     | 5 +++--
> >>>>  drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
> >>>>  2 files changed, 6 insertions(+), 2 deletions(-)
> >>>>
> >>>>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>>>index a0bff41..ebfb789 100644
> >>>>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>>>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>>>@@ -3718,7 +3718,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> >>>>  {
> >>>>  	struct drm_device *dev = obj->base.dev;
> >>>>  	struct drm_i915_private *dev_priv = dev->dev_private;
> >>>>-	u32 size, fence_size, fence_alignment, unfenced_alignment;
> >>>>+	u32 fence_alignment, unfenced_alignment;
> >>>>+	u64 size, fence_size;
> >>>>  	u64 start =
> >>>>  		flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
> >>>>  	u64 end =
> >>>>@@ -3777,7 +3778,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> >>>>  	 * attempt to find space.
> >>>>  	 */
> >>>>  	if (size > end) {
> >>>>-		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%u > %s aperture=%llu\n",
> >>>>+		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%llu > %s aperture=%llu\n",
> >>>>  			  ggtt_view ? ggtt_view->type : 0,
> >>>>  			  size,
> >>>>  			  flags & PIN_MAPPABLE ? "mappable" : "total",
> >>>>diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> >>>>index 449a245..900bce6 100644
> >>>>--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> >>>>+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> >>>>@@ -3312,6 +3312,9 @@ int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
> >>>>  	if (WARN_ON(flags == 0))
> >>>>  		return -EINVAL;
> >>>>
> >>>>+	if (WARN_ON(vma->node.size == 0))
> >>>>+		return -EINVAL;
> >>>
> >>>This is superfluous. We don't allow size=0 object creation, and the test
> >>>is better (if at all) at vma_create, but what you mean here is
> >>>WARN_ON(!drm_mm_node_allocated()) which seems sensisble. And both of
> >>>these would be better as ENODEV so we don't confuse the user when they
> >>>get propagated back to userspace.
> >>>-Chris
> >>>
> >>My idea was to catch the node.size overflow if the variable is
> >>inadvertently changed back to u32 (which has already happen in other
> >>places).
> >
> >Ok, that didn't come across when I just read node.size == 0 (what are
> >chances that node.size was exactly 2^32 and then truncated?)
> >
> Only a test explicitly looking for this kind of issues (I guess). In that
> test, objects bigger than 2^32 were truncated, while objects exactly 2^32
> were hitting a WARN in the driver; alloc_pages wouldn't do anything because
> node.size == 0, and then insert would complain no pages existed.
> 
> >vma->node should be fairly opaque, and if possible we want the checks in
> >drm_mm.c - if we can think of good tests for that layer.
> >
> >Certainly drm_mm_reserve_node() probably wants a few sanity checks.
> >Though most of those should fall out when it can't do the reservation
> >the user requests.
> Or change drm_mm_insert_node_in_range_generic() to warn when size==0?

WARN_ON(vma->node.size != obj->base.size) ? Feel free to get the casting
right - I suck at implicit C integer conversion rules ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 14/18] drm/i915: object size needs to be u64
  2015-07-08 15:22           ` Daniel Vetter
@ 2015-07-08 16:42             ` Michel Thierry
  2015-07-08 17:03               ` Chris Wilson
  0 siblings, 1 reply; 39+ messages in thread
From: Michel Thierry @ 2015-07-08 16:42 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, akash.goel

On 7/8/2015 4:22 PM, Daniel Vetter wrote:
> On Wed, Jul 08, 2015 at 12:22:58PM +0100, Michel Thierry wrote:
>> On 7/7/2015 9:08 PM, Chris Wilson wrote:
>>> On Tue, Jul 07, 2015 at 04:44:30PM +0100, Michel Thierry wrote:
>>>> On 7/7/2015 4:27 PM, Chris Wilson wrote:
>>>>> On Tue, Jul 07, 2015 at 04:14:59PM +0100, Michel Thierry wrote:
>>>>>> In a 48b world, users can try to allocate buffers bigger than 4GB; in
>>>>>> these cases it is important that size is a 64b variable.
>>>>>>
>>>>>> Also added a warning for illegal bind with size = 0.
>>>>>>
>>>>>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/i915/i915_gem.c     | 5 +++--
>>>>>>   drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
>>>>>>   2 files changed, 6 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>>>>> index a0bff41..ebfb789 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>>>> @@ -3718,7 +3718,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>>>>>>   {
>>>>>>   	struct drm_device *dev = obj->base.dev;
>>>>>>   	struct drm_i915_private *dev_priv = dev->dev_private;
>>>>>> -	u32 size, fence_size, fence_alignment, unfenced_alignment;
>>>>>> +	u32 fence_alignment, unfenced_alignment;
>>>>>> +	u64 size, fence_size;
>>>>>>   	u64 start =
>>>>>>   		flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
>>>>>>   	u64 end =
>>>>>> @@ -3777,7 +3778,7 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>>>>>>   	 * attempt to find space.
>>>>>>   	 */
>>>>>>   	if (size > end) {
>>>>>> -		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%u > %s aperture=%llu\n",
>>>>>> +		DRM_DEBUG("Attempting to bind an object (view type=%u) larger than the aperture: size=%llu > %s aperture=%llu\n",
>>>>>>   			  ggtt_view ? ggtt_view->type : 0,
>>>>>>   			  size,
>>>>>>   			  flags & PIN_MAPPABLE ? "mappable" : "total",
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>> index 449a245..900bce6 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>> @@ -3312,6 +3312,9 @@ int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
>>>>>>   	if (WARN_ON(flags == 0))
>>>>>>   		return -EINVAL;
>>>>>>
>>>>>> +	if (WARN_ON(vma->node.size == 0))
>>>>>> +		return -EINVAL;
>>>>>
>>>>> This is superfluous. We don't allow size=0 object creation, and the test
>>>>> is better (if at all) at vma_create, but what you mean here is
>>>>> WARN_ON(!drm_mm_node_allocated()) which seems sensisble. And both of
>>>>> these would be better as ENODEV so we don't confuse the user when they
>>>>> get propagated back to userspace.
>>>>> -Chris
>>>>>
>>>> My idea was to catch the node.size overflow if the variable is
>>>> inadvertently changed back to u32 (which has already happen in other
>>>> places).
>>>
>>> Ok, that didn't come across when I just read node.size == 0 (what are
>>> chances that node.size was exactly 2^32 and then truncated?)
>>>
>> Only a test explicitly looking for this kind of issues (I guess). In that
>> test, objects bigger than 2^32 were truncated, while objects exactly 2^32
>> were hitting a WARN in the driver; alloc_pages wouldn't do anything because
>> node.size == 0, and then insert would complain no pages existed.
>>
>>> vma->node should be fairly opaque, and if possible we want the checks in
>>> drm_mm.c - if we can think of good tests for that layer.
>>>
>>> Certainly drm_mm_reserve_node() probably wants a few sanity checks.
>>> Though most of those should fall out when it can't do the reservation
>>> the user requests.
>> Or change drm_mm_insert_node_in_range_generic() to warn when size==0?
>
> WARN_ON(vma->node.size != obj->base.size) ? Feel free to get the casting
> right - I suck at implicit C integer conversion rules ...
> -Daniel
>
Thanks, if there's no objections, I'll change it to:

     if (WARN_ON(vma->node.size != (u64)vma->obj->base.size))
         return -ENODEV;

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 14/18] drm/i915: object size needs to be u64
  2015-07-08 16:42             ` Michel Thierry
@ 2015-07-08 17:03               ` Chris Wilson
  2015-07-13 10:27                 ` Michel Thierry
  0 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2015-07-08 17:03 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, akash.goel

On Wed, Jul 08, 2015 at 05:42:17PM +0100, Michel Thierry wrote:
> >WARN_ON(vma->node.size != obj->base.size) ? Feel free to get the casting
> >right - I suck at implicit C integer conversion rules ...
> >-Daniel
> >
> Thanks, if there's no objections, I'll change it to:
> 
>     if (WARN_ON(vma->node.size != (u64)vma->obj->base.size))
>         return -ENODEV;

Are we still talking about i915_vma_bind()? Then vma->node.size !=
vma->obj.base.size anyway.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 17/18] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  2015-07-07 15:15 ` [PATCH v4 17/18] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
@ 2015-07-09 16:19   ` Michel Thierry
  2015-07-10  9:39     ` Chris Wilson
  2015-07-11 18:51   ` Chris Wilson
  1 sibling, 1 reply; 39+ messages in thread
From: Michel Thierry @ 2015-07-09 16:19 UTC (permalink / raw)
  To: intel-gfx, Chris Wilson; +Cc: Goel, Akash

On 7/7/2015 4:15 PM, Michel Thierry wrote:
> There are some allocations that must be only referenced by 32-bit
> offsets. To limit the chances of having the first 4GB already full,
> objects not requiring this workaround use DRM_MM_SEARCH_BELOW/
> DRM_MM_CREATE_TOP flags
>
> In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
> General State Heap (GSH) or Intructions State Heap (ISH) must be in a
> 32-bit range, because the General State Offset and Instruction State
> Offset are limited to 32-bits.
>
> Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if
> they can be allocated above the 32-bit address range. To limit the
> chances of having the first 4GB already full, objects will use
> DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible.
>
> v2: Changed flag logic from neeeds_32b, to supports_48b.
> v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel)
> v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK
> to use last PIN_ defined instead of hard-coded value; use correct limit
> check in eb_vma_misplaced. (Chris)
> v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris)
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h            |  2 ++
>   drivers/gpu/drm/i915/i915_gem.c            | 14 ++++++++++++--
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +++++++++++++
>   include/uapi/drm/i915_drm.h                |  3 ++-
>   4 files changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4a30a73..fc88e58 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2772,6 +2772,8 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
>   #define PIN_OFFSET_BIAS        (1<<3)
>   #define PIN_USER       (1<<4)
>   #define PIN_UPDATE     (1<<5)
> +#define PIN_ZONE_4G    (1<<6)
> +#define PIN_HIGH       (1<<7)
>   #define PIN_OFFSET_MASK (~4095)
>   int __must_check
>   i915_gem_object_pin(struct drm_i915_gem_object *obj,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index ebfb789..b13900d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3720,6 +3720,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>          struct drm_i915_private *dev_priv = dev->dev_private;
>          u32 fence_alignment, unfenced_alignment;
>          u64 size, fence_size;
> +       u32 search_flag = DRM_MM_SEARCH_DEFAULT;
> +       u32 alloc_flag = DRM_MM_CREATE_DEFAULT;
>          u64 start =
>                  flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
>          u64 end =
> @@ -3761,6 +3763,14 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>                                                     obj->tiling_mode,
>                                                     false);
>                  size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
> +
> +               if (flags & PIN_HIGH) {
> +                       search_flag = DRM_MM_SEARCH_BELOW;
> +                       alloc_flag = DRM_MM_CREATE_TOP;
> +               }
> +
> +               if (flags & PIN_ZONE_4G)
> +                       end = (1ULL << 32);
Hi Chris,
second thoughts on this... would PIN_HIGH & PIN_ZONE_4G be a problem if 
someone mixes a 64-bit kernel with 32-bit userland?
Maybe it's safer to set end = (1ULL << 32) - PAGE_SIZE.

>          }
>
>          if (alignment == 0)

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 17/18] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  2015-07-09 16:19   ` Michel Thierry
@ 2015-07-10  9:39     ` Chris Wilson
  0 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2015-07-10  9:39 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, Goel, Akash

On Thu, Jul 09, 2015 at 05:19:27PM +0100, Michel Thierry wrote:
> On 7/7/2015 4:15 PM, Michel Thierry wrote:
> >There are some allocations that must be only referenced by 32-bit
> >offsets. To limit the chances of having the first 4GB already full,
> >objects not requiring this workaround use DRM_MM_SEARCH_BELOW/
> >DRM_MM_CREATE_TOP flags
> >
> >In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
> >General State Heap (GSH) or Intructions State Heap (ISH) must be in a
> >32-bit range, because the General State Offset and Instruction State
> >Offset are limited to 32-bits.
> >
> >Objects must have EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag to indicate if
> >they can be allocated above the 32-bit address range. To limit the
> >chances of having the first 4GB already full, objects will use
> >DRM_MM_SEARCH_BELOW + DRM_MM_CREATE_TOP flags when possible.
> >
> >v2: Changed flag logic from neeeds_32b, to supports_48b.
> >v3: Moved 48-bit support flag back to exec_object. (Chris, Daniel)
> >v4: Split pin flags into PIN_ZONE_4G and PIN_HIGH; update PIN_OFFSET_MASK
> >to use last PIN_ defined instead of hard-coded value; use correct limit
> >check in eb_vma_misplaced. (Chris)
> >v5: Don't touch PIN_OFFSET_MASK and update workaround comment (Chris)
> >
> >Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v4)
> >Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> >---
> >  drivers/gpu/drm/i915/i915_drv.h            |  2 ++
> >  drivers/gpu/drm/i915/i915_gem.c            | 14 ++++++++++++--
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +++++++++++++
> >  include/uapi/drm/i915_drm.h                |  3 ++-
> >  4 files changed, 29 insertions(+), 3 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >index 4a30a73..fc88e58 100644
> >--- a/drivers/gpu/drm/i915/i915_drv.h
> >+++ b/drivers/gpu/drm/i915/i915_drv.h
> >@@ -2772,6 +2772,8 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
> >  #define PIN_OFFSET_BIAS        (1<<3)
> >  #define PIN_USER       (1<<4)
> >  #define PIN_UPDATE     (1<<5)
> >+#define PIN_ZONE_4G    (1<<6)
> >+#define PIN_HIGH       (1<<7)
> >  #define PIN_OFFSET_MASK (~4095)
> >  int __must_check
> >  i915_gem_object_pin(struct drm_i915_gem_object *obj,
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index ebfb789..b13900d 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -3720,6 +3720,8 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> >         struct drm_i915_private *dev_priv = dev->dev_private;
> >         u32 fence_alignment, unfenced_alignment;
> >         u64 size, fence_size;
> >+       u32 search_flag = DRM_MM_SEARCH_DEFAULT;
> >+       u32 alloc_flag = DRM_MM_CREATE_DEFAULT;
> >         u64 start =
> >                 flags & PIN_OFFSET_BIAS ? flags & PIN_OFFSET_MASK : 0;
> >         u64 end =
> >@@ -3761,6 +3763,14 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
> >                                                    obj->tiling_mode,
> >                                                    false);
> >                 size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
> >+
> >+               if (flags & PIN_HIGH) {
> >+                       search_flag = DRM_MM_SEARCH_BELOW;
> >+                       alloc_flag = DRM_MM_CREATE_TOP;
> >+               }
> >+
> >+               if (flags & PIN_ZONE_4G)
> >+                       end = (1ULL << 32);
> Hi Chris,
> second thoughts on this... would PIN_HIGH & PIN_ZONE_4G be a problem
> if someone mixes a 64-bit kernel with 32-bit userland?
> Maybe it's safer to set end = (1ULL << 32) - PAGE_SIZE.

No, the uapi is and always has been u64. To be paranoid, you would have
to presume anything above INT_MAX is going to be trouble.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 00/18] 48-bit PPGTT
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (17 preceding siblings ...)
  2015-07-07 15:15 ` [PATCH v4 18/18] drm/i915/gen8: Flip the 48b switch Michel Thierry
@ 2015-07-10  9:39 ` Chris Wilson
  2015-07-10 10:24   ` Michel Thierry
  2015-07-11 19:52 ` Chris Wilson
  2015-07-11 20:06 ` Chris Wilson
  20 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2015-07-10  9:39 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, akash.goel

On Tue, Jul 07, 2015 at 04:14:45PM +0100, Michel Thierry wrote:
> These are the rebased patches, after Mika's final ppgtt clean-up series landed
> (it relies in the macros added) and Akash review comments.
> 
> In order expand the GPU address space, a 4th level translation is added, the
> Page Map Level 4 (PML4). This PML4 has 256 PML4 Entries (PML4E), PML4[0-255],
> each pointing to a PDP. All the existing "dynamic alloc ppgtt" functions are
> used, only adding the 4th level changes. I also updated some remaining
> variables that were 32b only.
> 
> There are 2 hardware workarounds needed to allow correct operation with 48b
> addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset). This
> new patchset version includes the comments and suggestions from Chris Wilson.
> A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given object can be
> allocated outside the first 4 PDPs; if not, the end range is forced to 4GB. Also,
> more objects now use the DRM_MM_CREATE_TOP flag. To maintain compatibility, in
> libdrm I added a new drm_intel_bo_emit_reloc_48bit function that will flag
> these objects, while the existing drm_intel_bo_emit_reloc clears it.
> 
> Finally, this feature is only available in BDW and Gen9, requires LRC submission
> mode (execlists) and it can be detected by i915.enable_ppgtt=3.
> 
> Also note that this expanded address space is only available for full PPGTT,
> aliasing PPGTT and Global GTT remain 32-bit.
> Michel Thierry (18):
>   drm/i915: Remove unnecessary gen8_clamp_pd
>   drm/i915/gen8: Make pdp allocation more dynamic
>   drm/i915/gen8: Add PML4 structure
>   drm/i915/gen8: Abstract PDP usage
>   drm/i915/gen8: Add dynamic page trace events
>   drm/i915/gen8: implement alloc/free for 4lvl
>   drm/i915/gen8: Add 4 level switching infrastructure and lrc support
>   drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
>   drm/i915/gen8: Pass sg_iter through pte inserts
>   drm/i915/gen8: Add 4 level support in insert_entries and clear_range
>   drm/i915/gen8: Initialize PDPs
>   drm/i915: Expand error state's address width to 64b
>   drm/i915/gen8: Add ppgtt info and debug_dump
>   drm/i915: object size needs to be u64
>   drm/i915: batch_obj vm offset must be u64
>   drm/i915/userptr: Kill user_size limit check
>   drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
>   drm/i915/gen8: Flip the 48b switch

By the days end, I hope to be in a position to actually test large
apertures on a real machine. Do you have a tree I can use?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 00/18] 48-bit PPGTT
  2015-07-10  9:39 ` [PATCH v4 00/18] 48-bit PPGTT Chris Wilson
@ 2015-07-10 10:24   ` Michel Thierry
  0 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-10 10:24 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, akash.goel

On 7/10/2015 10:39 AM, Chris Wilson wrote:
> On Tue, Jul 07, 2015 at 04:14:45PM +0100, Michel Thierry wrote:
>> These are the rebased patches, after Mika's final ppgtt clean-up series landed
>> (it relies in the macros added) and Akash review comments.
>>
>> In order expand the GPU address space, a 4th level translation is added, the
>> Page Map Level 4 (PML4). This PML4 has 256 PML4 Entries (PML4E), PML4[0-255],
>> each pointing to a PDP. All the existing "dynamic alloc ppgtt" functions are
>> used, only adding the 4th level changes. I also updated some remaining
>> variables that were 32b only.
>>
>> There are 2 hardware workarounds needed to allow correct operation with 48b
>> addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset). This
>> new patchset version includes the comments and suggestions from Chris Wilson.
>> A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given object can be
>> allocated outside the first 4 PDPs; if not, the end range is forced to 4GB. Also,
>> more objects now use the DRM_MM_CREATE_TOP flag. To maintain compatibility, in
>> libdrm I added a new drm_intel_bo_emit_reloc_48bit function that will flag
>> these objects, while the existing drm_intel_bo_emit_reloc clears it.
>>
>> Finally, this feature is only available in BDW and Gen9, requires LRC submission
>> mode (execlists) and it can be detected by i915.enable_ppgtt=3.
>>
>> Also note that this expanded address space is only available for full PPGTT,
>> aliasing PPGTT and Global GTT remain 32-bit.
>> Michel Thierry (18):
>>    drm/i915: Remove unnecessary gen8_clamp_pd
>>    drm/i915/gen8: Make pdp allocation more dynamic
>>    drm/i915/gen8: Add PML4 structure
>>    drm/i915/gen8: Abstract PDP usage
>>    drm/i915/gen8: Add dynamic page trace events
>>    drm/i915/gen8: implement alloc/free for 4lvl
>>    drm/i915/gen8: Add 4 level switching infrastructure and lrc support
>>    drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
>>    drm/i915/gen8: Pass sg_iter through pte inserts
>>    drm/i915/gen8: Add 4 level support in insert_entries and clear_range
>>    drm/i915/gen8: Initialize PDPs
>>    drm/i915: Expand error state's address width to 64b
>>    drm/i915/gen8: Add ppgtt info and debug_dump
>>    drm/i915: object size needs to be u64
>>    drm/i915: batch_obj vm offset must be u64
>>    drm/i915/userptr: Kill user_size limit check
>>    drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
>>    drm/i915/gen8: Flip the 48b switch
>
> By the days end, I hope to be in a position to actually test large
> apertures on a real machine. Do you have a tree I can use?
> -Chris
>
Yes, use github.com/mthierry/drm-intel.git   48b-ppgtt-20150707
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 18/18] drm/i915/gen8: Flip the 48b switch
  2015-07-07 15:15 ` [PATCH v4 18/18] drm/i915/gen8: Flip the 48b switch Michel Thierry
@ 2015-07-11  3:56   ` shuang.he
  0 siblings, 0 replies; 39+ messages in thread
From: shuang.he @ 2015-07-11  3:56 UTC (permalink / raw)
  To: shuang.he, julianx.dumez, christophe.sureau, lei.a.liu,
	intel-gfx, michel.thierry

Tested-By: Intel Graphics QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6746
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
ILK                                  303/303              303/303
SNB              +3                 309/316              312/316
IVB                                  343/343              343/343
BYT                 -2              285/285              283/285
HSW              +13                 367/381              380/381
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
*SNB  igt@kms_mmio_vs_cs_flip@setcrtc_vs_cs_flip      DMESG_WARN(1)      PASS(1)
*SNB  igt@kms_mmio_vs_cs_flip@setplane_vs_cs_flip      DMESG_WARN(1)      PASS(1)
*SNB  igt@pm_rpm@cursor      DMESG_WARN(1)      PASS(1)
*SNB  igt@pm_rpm@cursor-dpms      DMESG_FAIL(1)      FAIL(1)
*BYT  igt@gem_partial_pwrite_pread@reads-uncached      PASS(1)      FAIL(1)
*BYT  igt@gem_tiled_partial_pwrite_pread@reads      PASS(1)      FAIL(1)
*HSW  igt@kms_mmio_vs_cs_flip@setplane_vs_cs_flip      DMESG_WARN(1)      PASS(1)
*HSW  igt@pm_lpsp@non-edp      DMESG_WARN(1)      PASS(1)
*HSW  igt@pm_rpm@debugfs-read      DMESG_WARN(1)      PASS(1)
*HSW  igt@pm_rpm@gem-idle      DMESG_WARN(1)      PASS(1)
*HSW  igt@pm_rpm@gem-mmap-gtt      DMESG_WARN(1)      PASS(1)
*HSW  igt@pm_rpm@gem-pread      DMESG_WARN(1)      PASS(1)
*HSW  igt@pm_rpm@i2c      DMESG_WARN(1)      PASS(1)
*HSW  igt@pm_rpm@modeset-non-lpsp      DMESG_WARN(1)      PASS(1)
*HSW  igt@pm_rpm@modeset-non-lpsp-stress-no-wait      DMESG_WARN(1)      PASS(1)
*HSW  igt@pm_rpm@pci-d3-state      DMESG_WARN(1)      PASS(1)
*HSW  igt@pm_rpm@reg-read-ioctl      DMESG_WARN(1)      PASS(1)
*HSW  igt@pm_rpm@rte      DMESG_WARN(1)      PASS(1)
*HSW  igt@pm_rpm@sysfs-read      DMESG_WARN(1)      PASS(1)
Note: You need to pay more attention to line start with '*'
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 17/18] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset
  2015-07-07 15:15 ` [PATCH v4 17/18] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
  2015-07-09 16:19   ` Michel Thierry
@ 2015-07-11 18:51   ` Chris Wilson
  1 sibling, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2015-07-11 18:51 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, akash.goel

On Tue, Jul 07, 2015 at 04:15:02PM +0100, Michel Thierry wrote:
> @@ -3761,6 +3763,14 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
>  						   obj->tiling_mode,
>  						   false);
>  		size = flags & PIN_MAPPABLE ? fence_size : obj->base.size;
> +
> +		if (flags & PIN_HIGH) {
> +			search_flag = DRM_MM_SEARCH_BELOW;
> +			alloc_flag = DRM_MM_CREATE_TOP;
> +		}
> +
> +		if (flags & PIN_ZONE_4G)
> +			end = (1ULL << 32);
>  	}

This should be applied to both paths, as the PIN_HIGH is applicable to
the ggtt (keeping bits and bobs out of mappable is the plan). And with
PIN_ZONE_4G there is also no need to arbitrary impose restrictions on
applicablity (though we may never have a larger GGTT than 4G!).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 00/18] 48-bit PPGTT
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (18 preceding siblings ...)
  2015-07-10  9:39 ` [PATCH v4 00/18] 48-bit PPGTT Chris Wilson
@ 2015-07-11 19:52 ` Chris Wilson
  2015-07-11 20:06 ` Chris Wilson
  20 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2015-07-11 19:52 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, akash.goel

On Tue, Jul 07, 2015 at 04:14:45PM +0100, Michel Thierry wrote:
> These are the rebased patches, after Mika's final ppgtt clean-up series landed
> (it relies in the macros added) and Akash review comments.
> 
> In order expand the GPU address space, a 4th level translation is added, the
> Page Map Level 4 (PML4). This PML4 has 256 PML4 Entries (PML4E), PML4[0-255],
> each pointing to a PDP. All the existing "dynamic alloc ppgtt" functions are
> used, only adding the 4th level changes. I also updated some remaining
> variables that were 32b only.
> 
> There are 2 hardware workarounds needed to allow correct operation with 48b
> addresses (Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset). This
> new patchset version includes the comments and suggestions from Chris Wilson.
> A flag (EXEC_OBJECT_SUPPORTS_48B_ADDRESS) will indicate if a given object can be
> allocated outside the first 4 PDPs; if not, the end range is forced to 4GB. Also,
> more objects now use the DRM_MM_CREATE_TOP flag. To maintain compatibility, in
> libdrm I added a new drm_intel_bo_emit_reloc_48bit function that will flag
> these objects, while the existing drm_intel_bo_emit_reloc clears it.
> 
> Finally, this feature is only available in BDW and Gen9, requires LRC submission
> mode (execlists) and it can be detected by i915.enable_ppgtt=3.

We should be adding ppgtt size to say get_aperture_ioctl so that we can
allow it change in the future.

It looks like we can switch between 32bit and 48bit contexts on the fly.
As we require opt-in from userspace to even make use of the extended vm,
should we also not create 32bit contexts by default? (i.e. are there any
noticeable regressions from switching to 48bit contexts?)
> 
> Also note that this expanded address space is only available for full PPGTT,
> aliasing PPGTT and Global GTT remain 32-bit.
> Michel Thierry (18):
>   drm/i915: Remove unnecessary gen8_clamp_pd
>   drm/i915/gen8: Make pdp allocation more dynamic
>   drm/i915/gen8: Add PML4 structure
>   drm/i915/gen8: Abstract PDP usage
>   drm/i915/gen8: Add dynamic page trace events
>   drm/i915/gen8: implement alloc/free for 4lvl
>   drm/i915/gen8: Add 4 level switching infrastructure and lrc support
>   drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT
>   drm/i915/gen8: Pass sg_iter through pte inserts

sg_page_iter can be a ratelimiting step in some workloads. Though I hope
with 48bit support, the need for evictions to be reduced and so
insertion to be of less impact. However, do you have any ideas for speeding
up ppgtt_insert?

>   drm/i915/gen8: Add 4 level support in insert_entries and clear_range
>   drm/i915/gen8: Initialize PDPs
>   drm/i915: Expand error state's address width to 64b
>   drm/i915/gen8: Add ppgtt info and debug_dump
>   drm/i915: object size needs to be u64
>   drm/i915: batch_obj vm offset must be u64

Or just track the batch_vma.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 03/18] drm/i915/gen8: Add PML4 structure
  2015-07-07 15:14 ` [PATCH v4 03/18] drm/i915/gen8: Add PML4 structure Michel Thierry
@ 2015-07-11 20:02   ` Chris Wilson
  2015-07-13 14:41     ` Michel Thierry
  0 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2015-07-11 20:02 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, akash.goel

On Tue, Jul 07, 2015 at 04:14:48PM +0100, Michel Thierry wrote:
> Introduces the Page Map Level 4 (PML4), ie. the new top level structure
> of the page tables.
> 
> To facilitate testing, 48b mode will be available on Broadwell and
> GEN9+, when i915.enable_ppgtt = 3.
> 
> Cc: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h     |  7 ++++++-
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 40 ++++++++++++++++++++++++-------------
>  drivers/gpu/drm/i915/i915_gem_gtt.h | 35 ++++++++++++++++++++++++--------
>  3 files changed, 59 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 464b28d..de3a5d1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2497,7 +2497,12 @@ struct drm_i915_cmd_table {
>  #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
>  #define HAS_LOGICAL_RING_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 8)
>  #define USES_PPGTT(dev)		(i915.enable_ppgtt)
> -#define USES_FULL_PPGTT(dev)	(i915.enable_ppgtt == 2)
> +#define USES_FULL_PPGTT(dev)	(i915.enable_ppgtt >= 2)
> +#ifdef CONFIG_X86_64
> +# define USES_FULL_48BIT_PPGTT(dev)	(i915.enable_ppgtt == 3)
> +#else
> +# define USES_FULL_48BIT_PPGTT(dev)	false
> +#endif

This requires an explanation.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 00/18] 48-bit PPGTT
  2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
                   ` (19 preceding siblings ...)
  2015-07-11 19:52 ` Chris Wilson
@ 2015-07-11 20:06 ` Chris Wilson
  20 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2015-07-11 20:06 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, akash.goel

On Tue, Jul 07, 2015 at 04:14:45PM +0100, Michel Thierry wrote:
>  drivers/gpu/drm/i915/i915_debugfs.c        |  18 +-
>  drivers/gpu/drm/i915/i915_drv.h            |  15 +-
>  drivers/gpu/drm/i915/i915_gem.c            |  19 +-
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  13 +
>  drivers/gpu/drm/i915/i915_gem_gtt.c        | 609 ++++++++++++++++++++++++-----
>  drivers/gpu/drm/i915/i915_gem_gtt.h        |  63 ++-
>  drivers/gpu/drm/i915/i915_gem_userptr.c    |   4 -
>  drivers/gpu/drm/i915/i915_gpu_error.c      |  18 +-
>  drivers/gpu/drm/i915/i915_params.c         |   2 +-
>  drivers/gpu/drm/i915/i915_reg.h            |   1 +
>  drivers/gpu/drm/i915/i915_trace.h          |  32 +-
>  drivers/gpu/drm/i915/intel_lrc.c           |  65 ++-
>  include/uapi/drm/i915_drm.h                |   3 +-
>  13 files changed, 694 insertions(+), 168 deletions(-)

There are a few of these:

 /**
  * gen8_ppgtt_alloc_page_directories() - Allocate page directories for VA range.
- * @ppgtt:     Master ppgtt structure.
+ * @vm:                Master vm structure.
  * @pdp:       Page directory pointer for this address range.
  * @start:     Starting virtual address to begin allocations.
  * @length     Size of the allocations.

-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 12/18] drm/i915: Expand error state's address width to 64b
  2015-07-07 15:14 ` [PATCH v4 12/18] drm/i915: Expand error state's address width to 64b Michel Thierry
@ 2015-07-11 20:10   ` Chris Wilson
  0 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2015-07-11 20:10 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, akash.goel

On Tue, Jul 07, 2015 at 04:14:57PM +0100, Michel Thierry wrote:
> v2: For semaphore errors, object is mapped to GGTT and offset will not
> be > 4GB, print only lower 32-bits (Akash)
> 
> Cc: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>

We found reading 0x0000000000200000 to be quite difficult and a severe
usablity regression from 0x02000000. In the past we opted to use
0x00000000 020000000 instead.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 14/18] drm/i915: object size needs to be u64
  2015-07-08 17:03               ` Chris Wilson
@ 2015-07-13 10:27                 ` Michel Thierry
  0 siblings, 0 replies; 39+ messages in thread
From: Michel Thierry @ 2015-07-13 10:27 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, intel-gfx, akash.goel

On 7/8/2015 6:03 PM, Chris Wilson wrote:
> On Wed, Jul 08, 2015 at 05:42:17PM +0100, Michel Thierry wrote:
>>> WARN_ON(vma->node.size != obj->base.size) ? Feel free to get the casting
>>> right - I suck at implicit C integer conversion rules ...
>>> -Daniel
>>>
>> Thanks, if there's no objections, I'll change it to:
>>
>>      if (WARN_ON(vma->node.size != (u64)vma->obj->base.size))
>>          return -ENODEV;
>
> Are we still talking about i915_vma_bind()? Then vma->node.size !=
> vma->obj.base.size anyway.
> -Chris
>
Right, it can be either obj->base.size, fence_size or view_size...

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 03/18] drm/i915/gen8: Add PML4 structure
  2015-07-11 20:02   ` Chris Wilson
@ 2015-07-13 14:41     ` Michel Thierry
  2015-07-13 20:02       ` Chris Wilson
  0 siblings, 1 reply; 39+ messages in thread
From: Michel Thierry @ 2015-07-13 14:41 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, akash.goel

On 7/11/2015 9:02 PM, Chris Wilson wrote:
> On Tue, Jul 07, 2015 at 04:14:48PM +0100, Michel Thierry wrote:
>> Introduces the Page Map Level 4 (PML4), ie. the new top level structure
>> of the page tables.
>>
>> To facilitate testing, 48b mode will be available on Broadwell and
>> GEN9+, when i915.enable_ppgtt = 3.
>>
>> Cc: Akash Goel <akash.goel@intel.com>
>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h     |  7 ++++++-
>>   drivers/gpu/drm/i915/i915_gem_gtt.c | 40 ++++++++++++++++++++++++-------------
>>   drivers/gpu/drm/i915/i915_gem_gtt.h | 35 ++++++++++++++++++++++++--------
>>   3 files changed, 59 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 464b28d..de3a5d1 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2497,7 +2497,12 @@ struct drm_i915_cmd_table {
>>   #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
>>   #define HAS_LOGICAL_RING_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 8)
>>   #define USES_PPGTT(dev)		(i915.enable_ppgtt)
>> -#define USES_FULL_PPGTT(dev)	(i915.enable_ppgtt == 2)
>> +#define USES_FULL_PPGTT(dev)	(i915.enable_ppgtt >= 2)
>> +#ifdef CONFIG_X86_64
>> +# define USES_FULL_48BIT_PPGTT(dev)	(i915.enable_ppgtt == 3)
>> +#else
>> +# define USES_FULL_48BIT_PPGTT(dev)	false
>> +#endif
>
> This requires an explanation.
> -Chris
>
Actually, this is no longer necessary after Mika's changes:

commit c44ef60e437019b8ca1dab8b4d2e8761fd4ce1e9
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Thu Jun 25 18:35:05 2015 +0300

  drm/i915/gtt: Allow >= 4GB sizes for vm.




_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v4 03/18] drm/i915/gen8: Add PML4 structure
  2015-07-13 14:41     ` Michel Thierry
@ 2015-07-13 20:02       ` Chris Wilson
  0 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2015-07-13 20:02 UTC (permalink / raw)
  To: Michel Thierry; +Cc: intel-gfx, akash.goel

On Mon, Jul 13, 2015 at 03:41:38PM +0100, Michel Thierry wrote:
> On 7/11/2015 9:02 PM, Chris Wilson wrote:
> >On Tue, Jul 07, 2015 at 04:14:48PM +0100, Michel Thierry wrote:
> >>Introduces the Page Map Level 4 (PML4), ie. the new top level structure
> >>of the page tables.
> >>
> >>To facilitate testing, 48b mode will be available on Broadwell and
> >>GEN9+, when i915.enable_ppgtt = 3.
> >>
> >>Cc: Akash Goel <akash.goel@intel.com>
> >>Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_drv.h     |  7 ++++++-
> >>  drivers/gpu/drm/i915/i915_gem_gtt.c | 40 ++++++++++++++++++++++++-------------
> >>  drivers/gpu/drm/i915/i915_gem_gtt.h | 35 ++++++++++++++++++++++++--------
> >>  3 files changed, 59 insertions(+), 23 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>index 464b28d..de3a5d1 100644
> >>--- a/drivers/gpu/drm/i915/i915_drv.h
> >>+++ b/drivers/gpu/drm/i915/i915_drv.h
> >>@@ -2497,7 +2497,12 @@ struct drm_i915_cmd_table {
> >>  #define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
> >>  #define HAS_LOGICAL_RING_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 8)
> >>  #define USES_PPGTT(dev)		(i915.enable_ppgtt)
> >>-#define USES_FULL_PPGTT(dev)	(i915.enable_ppgtt == 2)
> >>+#define USES_FULL_PPGTT(dev)	(i915.enable_ppgtt >= 2)
> >>+#ifdef CONFIG_X86_64
> >>+# define USES_FULL_48BIT_PPGTT(dev)	(i915.enable_ppgtt == 3)
> >>+#else
> >>+# define USES_FULL_48BIT_PPGTT(dev)	false
> >>+#endif
> >
> >This requires an explanation.
> >-Chris
> >
> Actually, this is no longer necessary after Mika's changes:
> 
> commit c44ef60e437019b8ca1dab8b4d2e8761fd4ce1e9
> Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Date:   Thu Jun 25 18:35:05 2015 +0300
> 
>  drm/i915/gtt: Allow >= 4GB sizes for vm.

I thought as much, and would have appreciated a
  /* FIXME not all GTT handling code is 32/64bit safe yet */
just to confirm that the problem was our code and not a hardware
restriction.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2015-07-13 20:02 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-07 15:14 [PATCH v4 00/18] 48-bit PPGTT Michel Thierry
2015-07-07 15:14 ` [PATCH v4 01/18] drm/i915: Remove unnecessary gen8_clamp_pd Michel Thierry
2015-07-07 15:14 ` [PATCH v4 02/18] drm/i915/gen8: Make pdp allocation more dynamic Michel Thierry
2015-07-07 15:14 ` [PATCH v4 03/18] drm/i915/gen8: Add PML4 structure Michel Thierry
2015-07-11 20:02   ` Chris Wilson
2015-07-13 14:41     ` Michel Thierry
2015-07-13 20:02       ` Chris Wilson
2015-07-07 15:14 ` [PATCH v4 04/18] drm/i915/gen8: Abstract PDP usage Michel Thierry
2015-07-07 15:14 ` [PATCH v4 05/18] drm/i915/gen8: Add dynamic page trace events Michel Thierry
2015-07-07 15:14 ` [PATCH v4 06/18] drm/i915/gen8: implement alloc/free for 4lvl Michel Thierry
2015-07-07 15:14 ` [PATCH v4 07/18] drm/i915/gen8: Add 4 level switching infrastructure and lrc support Michel Thierry
2015-07-07 15:14 ` [PATCH v4 08/18] drm/i915/gen8: Generalize PTE writing for GEN8 PPGTT Michel Thierry
2015-07-07 15:14 ` [PATCH v4 09/18] drm/i915/gen8: Pass sg_iter through pte inserts Michel Thierry
2015-07-07 15:14 ` [PATCH v4 10/18] drm/i915/gen8: Add 4 level support in insert_entries and clear_range Michel Thierry
2015-07-07 15:14 ` [PATCH v4 11/18] drm/i915/gen8: Initialize PDPs Michel Thierry
2015-07-07 15:14 ` [PATCH v4 12/18] drm/i915: Expand error state's address width to 64b Michel Thierry
2015-07-11 20:10   ` Chris Wilson
2015-07-07 15:14 ` [PATCH v4 13/18] drm/i915/gen8: Add ppgtt info and debug_dump Michel Thierry
2015-07-07 15:14 ` [PATCH v4 14/18] drm/i915: object size needs to be u64 Michel Thierry
2015-07-07 15:27   ` Chris Wilson
2015-07-07 15:44     ` Michel Thierry
2015-07-07 20:08       ` Chris Wilson
2015-07-08 11:22         ` Michel Thierry
2015-07-08 15:22           ` Daniel Vetter
2015-07-08 16:42             ` Michel Thierry
2015-07-08 17:03               ` Chris Wilson
2015-07-13 10:27                 ` Michel Thierry
2015-07-07 15:15 ` [PATCH v4 15/18] drm/i915: batch_obj vm offset must " Michel Thierry
2015-07-07 15:15 ` [PATCH v4 16/18] drm/i915/userptr: Kill user_size limit check Michel Thierry
2015-07-07 15:15 ` [PATCH v4 17/18] drm/i915: Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset Michel Thierry
2015-07-09 16:19   ` Michel Thierry
2015-07-10  9:39     ` Chris Wilson
2015-07-11 18:51   ` Chris Wilson
2015-07-07 15:15 ` [PATCH v4 18/18] drm/i915/gen8: Flip the 48b switch Michel Thierry
2015-07-11  3:56   ` shuang.he
2015-07-10  9:39 ` [PATCH v4 00/18] 48-bit PPGTT Chris Wilson
2015-07-10 10:24   ` Michel Thierry
2015-07-11 19:52 ` Chris Wilson
2015-07-11 20:06 ` Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.